

Big Data, AI, & Genomics

The Big Data, Artificial Intelligence, and Genomics group consists of computational experts in genomics, mathematical modeling, scientific software, big data analytics, and the use of artificial intelligence to understand biological systems.

Overview of Services

Research Services

Extensive experience in both molecular biology and the computational sciences supports the development and application of novel genomic and computational methods for solving biological problems. Members of the Big Data, AI, and Genomics group provide advanced expertise, analytical support, and comprehensive collaborative services to Stowers Institute scientists. Genomics support focuses on the latest approaches for making effective use of the immense amount of data generated. In particular, the Stowers SIMRbase platform provides assembled genomes and related data for various research organisms, plus a toolkit for investigating these resources. Biomathematics support includes complex data analysis, modeling, and numerical and symbolic programming. Researchers can also tap into expertise in biophysics and systems biology, as well as image processing and analysis. Additionally, the team supports many of the scientific software packages used by Stowers researchers.


  • Rosetta Stone Transcript Mapper
  • Gene Search


Software & Computing

  • SIMRbase
  • Planosphere


Team Contact

Sean McKinney

Head of Computational Imaging

Stowers Institute for Medical Research



Sofia Robb

Genomics Scientist

Stowers Institute for Medical Research



Boris Rubinstein

Research Advisor




Chris Seidel

Genomics Scientist

Stowers Institute for Medical Research


Computational analysis, machine learning, genomics

Rapid advances in technology produce vast quantities of data. Handling that data and extracting what matters requires expertise in computation and in machine learning, or artificial intelligence (AI). Whole-genome assembly is a prime example: the data sets are so large that they can only be parsed computationally.


In bioinformatics, BLAST (Basic Local Alignment Search Tool) is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of proteins or the nucleotide sequences of DNA. A BLAST search lets a researcher compare a query sequence with a library or database of sequences and identify the library sequences that resemble the query above a certain threshold. Different BLAST variants are available depending on the type of query sequence. For example, following the discovery of a previously unknown gene in the mouse, a scientist will typically run a BLAST search against the human genome to see whether humans carry a similar gene; BLAST identifies sequences in the human genome that resemble the mouse gene based on sequence similarity.
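The idea of "resembling the query above a certain threshold" can be illustrated with a small sketch. This is not how BLAST itself works: BLAST uses a faster heuristic and statistical scoring, whereas the code below runs an exhaustive Smith-Waterman local alignment on toy sequences. The scoring values (match, mismatch, gap), the sequence names, and the threshold are all assumptions chosen for illustration.

```python
def local_alignment_score(query, subject, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score between two sequences."""
    prev = [0] * (len(subject) + 1)  # previous row of the scoring matrix
    best = 0
    for q in query:
        curr = [0]
        for j, s in enumerate(subject, start=1):
            diag = prev[j - 1] + (match if q == s else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def search(query, library, threshold):
    """Return (name, score) pairs scoring at or above threshold, best first."""
    hits = [(name, local_alignment_score(query, seq)) for name, seq in library.items()]
    hits = [hit for hit in hits if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy library; names and sequences are made up for this example.
library = {
    "seq_a": "TTGACCGTAAGCTT",  # contains the query exactly
    "seq_b": "GGGGGGGGGGGG",    # unrelated
    "seq_c": "ACCGTTAGC",       # one substitution relative to the query
}
hits = search("ACCGTAAGC", library, threshold=10)
print(hits)
```

With these assumed scores, the exact match and the single-substitution sequence pass the threshold, while the unrelated sequence is filtered out.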

Rosetta Stone Transcript Mapper

Find homologous genes across transcriptomes. This software quickly maps IDs to a reference sequence or to IDs in other transcriptomes.

Gene Search

Search for transcripts by ID, experimentally determined tissue/pattern, homologs (best BLAST hits), GO terms, and PFAM protein domain terms. The search fields are cumulative: results must meet all provided criteria.
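As a rough illustration of what "cumulative" means here, the sketch below treats every supplied criterion as an AND condition and ignores any criterion left empty. The `Transcript` record, its field names, the `gene_search` function, and the sample data are hypothetical and do not describe the actual Gene Search implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Transcript:
    """Hypothetical record; field names are assumptions for this example."""
    transcript_id: str
    tissues: set = field(default_factory=set)
    go_terms: set = field(default_factory=set)
    pfam_domains: set = field(default_factory=set)


def gene_search(transcripts, id_contains=None, tissue=None, go_term=None, pfam_domain=None):
    """Keep only transcripts that satisfy every criterion that was provided."""
    results = []
    for t in transcripts:
        if id_contains is not None and id_contains not in t.transcript_id:
            continue
        if tissue is not None and tissue not in t.tissues:
            continue
        if go_term is not None and go_term not in t.go_terms:
            continue
        if pfam_domain is not None and pfam_domain not in t.pfam_domains:
            continue
        results.append(t)
    return results


# Made-up sample data for illustration only.
transcripts = [
    Transcript("tx_0001", {"neural"}, {"GO:0005634"}, {"PF00046"}),
    Transcript("tx_0002", {"neural"}, {"GO:0005737"}, {"PF00046"}),
    Transcript("tx_0003", {"gut"}, {"GO:0005634"}, set()),
]

by_tissue = gene_search(transcripts, tissue="neural")
by_tissue_and_go = gene_search(transcripts, tissue="neural", go_term="GO:0005634")
print([t.transcript_id for t in by_tissue])
print([t.transcript_id for t in by_tissue_and_go])
```

Adding a second criterion can only narrow the result set, which is the behavior the cumulative search fields describe.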

Stowers software


SIMRbase is the home for assembled genomes and genome-related data at the Stowers Institute for Medical Research. It offers tools to search sequences by BLAST, keyword, or GO term, to browse genomes, and to manually curate gene models. SIMRbase currently provides public access to data for Petromyzon marinus (the sea lamprey, including its germline genome assembly); Nothobranchius furzeri (the African turquoise killifish); Scolanthus callimorphus (the worm anemone); and Nematostella vectensis (the starlet sea anemone). For more information or to get help, contact Sofia Robb.



Planosphere is a collection of data and tools from the Sánchez Alvarado lab planarian publications.


Featured Publications

The architecture and operating mechanism of a cnidarian stinging organelle

Karabulut A, McClain M, Rubinstein B, Sabin KZ, McKinney SA, Gibson MC. Nat Commun. 2022;13:3494. doi: 10.1038/s41467-022-31090.

The Planarian Anatomy Ontology: A resource to connect data within and across experimental platforms

Nowotarski SH, Davies EL, Robb SMC, Ross EJ, Matentzoglu N, Doddihal V, Mir M, McClain M, Sánchez Alvarado A. Development. 2021;148:dev196097. doi: 10.1242/dev.196097.

The SAGA core module is critical during Drosophila oogenesis and is broadly recruited to promoters

Soffers JHM, Alcantara SG, Li X, Shao W, Seidel CW, Li H, Zeitlinger J, Abmayr SM, Workman JL. PLoS Genet. 2021;17:e1009668. doi: 10.1371/journal.pgen.1009668.

DNA replication, transcription, and H3K56 acetylation regulate copy number and stability at tandem repeats

Salim D, Bradford WD, Rubinstein B, Gerton JL. G3 (Bethesda). 2021:jkab082. doi: 10.1093/g3journal/jkab082.

Hox genes regulate asexual reproductive behavior and tissue segmentation in adult animals

Arnold CP, Lozano AM, Mann FG, Jr., Nowotarski SH, Haug JO, Lange JJ, Seidel CW, Sánchez Alvarado A. Nat Commun. 2021;12:6706. doi: 10.1038/s41467-021-26986-2.

Changes in regeneration-responsive enhancers shape regenerative capacities in vertebrates

Wang W, Hu CK, Zeng A, Alegre D, Hu D, Gotting K, Ortega Granillo A, Wang Y, Robb S, Schnittker R, Zhang S, Alegre D, Li H, Ross E, Zhang N, Brunet A, Sánchez Alvarado A. Science. 2020;369:eaaz3090. doi: 10.1126/science.aaz3090.
