Press Release

30 April 2025
Stowers Investigator elected to the National Academy of Sciences
Jerry Workman, Ph.D., a pioneer in the field of gene regulation, receives one of the highest honors awarded to scientists worldwide.
News
High-tech tools help Stowers scientists focus on discovery
Technological advances yielding vast amounts of biological data have forever changed the way research is conducted, reported, and shared
By Anissa Orr
Just eight years ago, analyzing research data was a more primitive affair, remembers Hua Li, PhD, a biostatistician in the Computational Biology Core group at the Stowers Institute. At that time, research scientists could still crunch numbers from most experiments on personal computers and use traditional charts and graphs to highlight findings. But technological advances yielding vast amounts of biological data have forever changed the way research is conducted, reported, and shared.
"Now it is often impossible to analyze all that data on your own workstation," Li says. "You need to have a room full of servers and good IT [information technology] support. Data storage and computational skills have become essential for biomedical research."
That's especially true for scientists at the Stowers Institute, who deal heavily in genomics research studying an organism's complete set of DNA (its genome). An estimated 80 percent of data processed by the Computational Biology Core involves sequenced genomic data. Sequencing, figuring out the order of DNA bases in a genome (the As, Cs, Gs, and Ts that make up an organism's genetic code), has become more affordable and accessible for scientists, thanks to high-throughput next-generation sequencing. These technologies also provide scientists with other important forms of genetic information.
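As a toy illustration of what sequenced data looks like, the sketch below (in Python, with invented reads; a real run produces millions of them) tallies the A/C/G/T composition of a handful of short reads:

```python
from collections import Counter

# A few invented short sequencing reads (real runs produce millions).
reads = ["ACGTACGT", "GGCATCCA", "TTGACGTA"]

# Tally how often each DNA base appears across all reads.
base_counts = Counter("".join(reads))
total = sum(base_counts.values())

for base in "ACGT":
    print(f"{base}: {base_counts[base]} ({base_counts[base] / total:.0%})")
```

Analyses at genome scale apply the same idea, position by position, across billions of bases, which is why they quickly outgrow a personal workstation.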
To make sense of all that data, Stowers scientists increasingly rely on sophisticated computing technologies. The Institute backs their efforts by devoting a substantial portion of the scientific operating budget to providing and supporting computing resources.
The result is a culture that embraces creativity and technological innovation. In particular, new advances in scientific software programs and computing techniques and tools are boosting productivity and making it easier for researchers to focus on important scientific questions. Here's a closer look at how Stowers researchers are using tech to drive discovery.
Always adapting in IT
Meeting the technology needs of scientists is a constant challenge in an age when new technologies emerge daily and hardware and software quickly become obsolete, says Mike Newhouse, head of Information Management at the Institute.
"The days of stagnant IT are gone," he says. "Today's approach to information management demands continual, fluid change of programs, hardware, and storage. Our job is to adapt and handle those changes as they come up."
Newhouse joined the Institute in 1997 when he was hired by co-founder James E. Stowers Jr. Stowers had pioneered the application of computing power to investment management at American Century Investments, his renowned investment management firm, and sought to do the same with the Institute's basic research. Newhouse joined as the Institute's sixth staff member and helped build the IT team from the ground up.
Since then, Stowers' information management has grown tremendously, from its humble beginnings in a double-wide trailer with two team members and two computer servers to its current state-of-the-art offices and data center, housing seventeen team members and more than 250 servers. The rise in storage capacity alone astounds, soaring from just 40 gigabytes to 2.3 petabytes (one petabyte is one quadrillion bytes), an increase of nearly 60,000-fold.
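The back-of-envelope arithmetic behind that figure is easy to check:

```python
# Storage growth cited in the article: 40 gigabytes to 2.3 petabytes.
gigabyte = 10**9   # bytes
petabyte = 10**15  # bytes ("one quadrillion bytes")

old_capacity = 40 * gigabyte
new_capacity = 2.3 * petabyte

fold_increase = new_capacity / old_capacity
print(f"{fold_increase:,.0f}-fold")  # 57,500-fold, i.e., nearly 60,000-fold
```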
"Much of our growth is clearly based around the sequencing data and imaging data we collect now," Newhouse says. "The data our researchers are creating in core groups like Molecular Biology (next-generation sequencing) and Microscopy is massive. The growth is increasing exponentially because of the technologies behind it."
To keep up, Newhouse maintains a strong IT infrastructure that supports new technologies and provides investigators with up-to-date tools, including more than 350 software packages. "Giving scientists what they need is a challenge at many scientific institutions stymied by bureaucracy," he says.
"Here there is an attitude of 'Let's get investigators what they need to do science. And let's get it now,'" Newhouse explains.
Visualizing data from all angles
While the Information Management team keeps technology running at the Institute, an array of programmers and analysts helps researchers process, analyze, and visualize data. Many of these adept data handlers can be found in the Institute's Computational Biology Core group, which provides computational support to labs on projects lasting from days to years.
"Piles and piles of sequences don't mean much, and tables of numbers are really hard to look at and interpret," says Programmer Analyst Madelaine Gogol. "But seeing data distilled into a plot or figure allows you to pull meaning from it much more easily. Patterns emerge and help you understand what is going on. As the saying goes, 'A picture is worth a thousand words.'" Gogol and her Computational Biology colleagues create pictures that precisely illustrate complex data, using a variety of software and programming tools and their own custom scripts. The information revealed can be insightful or surprising, and may lead to more questions begging to be explored.
Gogol recently completed a year-long project with Arnob Dutta, PhD, a postdoctoral research associate in the laboratory of Jerry Workman, PhD. Dutta studied how the Swi/Snf chromatin remodeling complex, a group of proteins that work together to change the way DNA is packaged, regulates gene transcription. Gene transcription is the first step of the process by which information encoded in a gene directs the assembly of a protein molecule. Recent studies have found that 20 percent of all cancers carry mutations in the Swi/Snf complex, leading scientists like Dutta to investigate the complex in more detail.
To help Dutta visualize his results, Gogol used programming packages created in R, an open source computing language used for data analysis, to map individual sequence reads to their position in the genome. She then sliced out the regions around the genes, retaining only the desired genetic area. Next, she clustered the genes by comparing the patterns in each row to one another and placing the two closest rows together. Finally, she represented the numerical values with a color gradient to form a graphical image called a heat map.
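Gogol's actual pipeline was built in R; as a rough illustration, the sketch below reproduces the last two steps, ordering gene rows by similarity (a simple greedy stand-in for hierarchical clustering) and mapping values onto a blue-to-red gradient, in Python with invented data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented read-coverage profiles: one row per gene, one column per
# position around the gene (real values come from mapped sequence reads).
profiles = rng.random((6, 20))

# Greedily order rows so each gene sits next to its most similar
# neighbor, a simple stand-in for hierarchical clustering.
order = [0]
remaining = set(range(1, len(profiles)))
while remaining:
    last = profiles[order[-1]]
    nearest = min(remaining, key=lambda i: np.linalg.norm(profiles[i] - last))
    order.append(nearest)
    remaining.remove(nearest)

clustered = profiles[order]

# Map each value onto a color gradient: low values lean blue,
# high values lean red, as in the heat map described in the text.
def to_rgb(value):
    return (value, 0.0, 1.0 - value)  # (red, green, blue), each in [0, 1]

heat_map = np.array([[to_rgb(v) for v in row] for row in clustered])
print(heat_map.shape)  # an RGB image, rows x positions x 3, ready to display
```

Production analyses would typically use true hierarchical clustering and an established plotting library rather than this hand-rolled ordering, but the shape of the computation is the same.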
The final visualized data pops in red and blue. The image gives an immediate global view of gene profiles across different experimental conditions as well as how genes cluster into groups with similar profiles. Dutta used the heat map to understand how a particular component of the chromatin remodeling complex associates with genes under different conditions, with the color gradient representing the degree of association.
"Looking at the colors, you can see that blue is low and red is high, and you immediately get the picture," says Hua Li, Gogol's colleague in Computational Biology. "With numbers it is really hard to see a pattern, but with colors you get it immediately."
Virtually wrapping up the whole package
In the laboratory of Julia Zeitlinger, PhD, Research Specialist Jeff Johnston uses virtual machines both to make sense of their research data and to allow other scientists to reproduce their results. Researchers in Zeitlinger's lab are studying how an organism is able to turn on and off the correct genes during development, using the fruit fly Drosophila as a model system.
"We go through many different versions of a manuscript before settling on one for publication," Johnston explains. "During this time, many of the software packages we use get updated, similar to how the apps on your phone or software programs on your laptop are regularly updated. Because of all these changes, we can use virtual machines to build a clean computational environment with specific versions of all the software we need, and then repeat our analysis to ensure it is reproducible."
A virtual machine is a program on a computer that works as if it is a separate computer inside the main computer and allows users to run multiple operating systems without interference from each other. For example, a virtual machine would allow a Windows program to run on a Mac.
The Zeitlinger team made one of their first virtual machines public in 2013, with the publication of a paper in eLife. The link to the studyās virtual machine contained all the software packages, analysis code, raw data, and processed data used to create the figures and tables in the published manuscript.
"Since the virtual machine is essentially self-contained and frozen in time, it will always be able to reproduce our analysis, even years later when much of the underlying software code becomes obsolete," Johnston says.
Sharing data in this way is important because it advances research and paves the way for future developments in how data is analyzed and shared, he says. In this spirit, Johnston and his colleagues also use literate programming, a form of data analysis that mixes software code with descriptive text. When users click on a file, they see a more detailed description of the programming used to analyze data, a document that reads more like a research "how to" than a string of code.
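The lab's own documents use their tooling of choice; as a generic, invented illustration of the literate style, here is a tiny analysis script in which narrative comments carry the reasoning and the code merely carries it out:

```python
# --- Normalizing read counts across samples (invented toy data) ---
#
# Two samples were sequenced to different depths, so their raw read
# counts are not directly comparable. We scale each sample so its
# counts sum to one million ("counts per million"), a common first
# normalization step.

raw_counts = {
    "sample_A": [120, 480, 400],  # reads per gene, sequenced deeply
    "sample_B": [30, 120, 100],   # same genes, shallower sequencing
}

# After scaling, both samples show the same per-gene profile, telling
# us the apparent differences were sequencing depth, not biology.
counts_per_million = {
    name: [1_000_000 * c / sum(counts) for c in counts]
    for name, counts in raw_counts.items()
}

for name, cpm in counts_per_million.items():
    print(name, [round(x) for x in cpm])
```

Rendered by a literate-programming tool, the prose and the code appear side by side in one readable document, which is what makes the approach so useful for teaching and review.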
"This makes the resulting analysis much more presentable, easier to follow, and more amenable to use as a teaching tool," Johnston says.
What's next?
The past decade has been one of immense change for biomedical research, and continual innovations in technology and genome engineering promise even more change. Itās a future that excites IT experts, analysts, and scientists alike, who look forward to the challenge of using the latest technology to further the Instituteās science.
"My basic goal is to help investigators understand and really see their data as quickly and thoroughly as possible, with the underlying hope that it will tell us something interesting and new about the processes of life," Gogol says. "I hope to contribute in my own small way to the discoveries that researchers are making about these wonderful complex biological systems that are going on daily within and all around us."
Information Management: Left to right, back row: Steve DeGennaro, Andrew Holden, Dustin Dietz, Chad Harvey, David Hahn, Mark Matson, Jay Casillas, Mike Newhouse, Samuel Burns, Dan Stranathan. Front row: Chris Locke, Jenny McGee, David Duerr, Amy Ubben, Jordan Hensley. (Not pictured: Shaun Price and Robert Reece)