UC Berkeley Press Release
Mathematicians, computer scientists play key role in analysis of lab rat genome
BERKELEY – The genome sequence of the common lab rat, announced this week in the journal Nature, would be a mere laundry list of genes if not for three teams of researchers - mostly mathematicians and computer scientists - whose alignment and comparison of rat, mouse and human genomes led to a greater understanding of evolutionary relationships among the three.
Lior Pachter, assistant professor of mathematics at the University of California, Berkeley, and a member of the California Institute for Quantitative Biomedical Research (QB3), led one team that developed computer programs to align the three genomes and compare them. Rats, mice and humans are the only three large vertebrates whose genomes have been determined.
"This was less a competition than a collaboration, because we wanted to make definitive statements about these genomes, such as the neutral rate of evolution," Pachter said. "Estimating the rate of mutation is going to be quite valuable."
The neutral rate of mutation is essentially the background rate of DNA mutation - how many sequence changes would occur over time in the DNA if none of them interfered with the workings of the genes. This rate can be estimated by looking at regions of non-functional or junk DNA, Pachter said. Genes that play critical roles, however, change much more slowly or not at all, because many mutations would incapacitate them. Over generations, the random mutations that are beneficial or benign accumulate, leading eventually to new species.
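As a rough illustration of how a background substitution rate might be estimated from aligned non-functional DNA, the sketch below counts mismatched sites between two aligned sequences and applies the standard Jukes-Cantor correction for multiple changes at the same site. This is an assumption made for illustration, not the method used by the alignment teams; the function name and the toy sequences are hypothetical.

```python
import math


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Estimate substitutions per site between two aligned DNA sequences.

    Illustrative only: uses the Jukes-Cantor model, which assumes all
    base changes are equally likely.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    # Keep only alignment columns where both sequences have a base,
    # skipping gaps ('-') and ambiguous characters.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")

    # Observed fraction of sites that differ.
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        raise ValueError("sequences too divergent for this correction")

    # Jukes-Cantor correction for repeated changes at the same site.
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy example: one difference in four aligned sites.
distance = jukes_cantor_distance("ACGT", "ACGA")
```

Dividing such a distance by the time since two lineages diverged gives a per-site rate, which is the kind of quantity that can then be compared between lineages.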
Pachter was one of more than 220 co-authors, including the two other alignment teams, one from Stanford University and one a collaboration between UC Santa Cruz and Pennsylvania State University. They found that rodents have mutated about three times faster than humans over the past 80 million years or so, and that the rat is mutating slightly faster than the mouse. The rat genome also is larger than the mouse genome, but smaller than the human genome, though all encode about the same number of genes, between 25,000 and 30,000.
"The goal of these papers is to emphasize what we can do with a third vertebrate genome," Pachter said. "We can really start getting at phylogenetic questions you can't get at with only two genomes."
The laboratory rat is a strain of the brown Norway rat (Rattus norvegicus) that appears to have originated in central Asia and followed humans as they spread around the globe. Because of the rat's importance in biomedical research and drug testing, knowledge of its genome will have enormous implications.
"This is an investment that is destined to yield major payoffs in the fight against human disease," said Elias A. Zerhouni, M.D., director of the National Institutes of Health. "For nearly 200 years, the laboratory rat has played a valuable role in efforts to understand human biology and to develop new and better drugs. Now, armed with this sequencing data, a new generation of researchers will be able to greatly improve the utility of rat models and thereby improve human health."
The generation and analysis of the high quality 'draft' sequence, which covers over 90 percent of the brown Norway rat genome, are presented in the April 1 issue of Nature and in an additional 30 manuscripts describing more detailed analyses in the April issue of Genome Research. The results were announced by the Rat Genome Sequencing Project Consortium, led by the Human Genome Sequencing Center at Baylor College of Medicine in Houston, in conjunction with the National Heart, Lung and Blood Institute and the National Human Genome Research Institute.
Pachter and his UC Berkeley colleagues - recent graduate and soon-to-be graduate student Nicolas Bray and postdoctoral researcher Von Bing Yap of the Department of Mathematics; graduate students Colin Dewey and Sourav Chatterji and undergraduate Kushal Chakrabarti of the Department of Electrical Engineering and Computer Science; and graduate student Anat Caspi of the Bioengineering Graduate Group - are co-authors of the Nature paper and authored four of the papers in Genome Research.
Among the major findings reported in the Nature paper is that almost all human genes known to be associated with diseases have counterparts in the rat genome and appear highly conserved through mammalian evolution. A select few families of genes have been expanded in the rat, including smell receptors and genes for dealing with toxins, and these give clues to the distinctive physiology of the species.
The rat data also show that about 40 percent of the modern mammalian genome derives from the last common mammalian ancestor. These 'core' one billion bases encode nearly all the genes and their regulatory signals, accounting for the similarities among mammals.
"The sequencing of the rat genome constitutes another major milestone in our effort to expand our knowledge of the human genome," said Francis S. Collins, M.D., Ph.D., director of the National Human Genome Research Institute. "As we build upon the foundation laid by the Human Genome Project, it's become clear that comparing the human genome with those of other organisms is the most powerful tool available to understand the complex genomic components involved in human health and disease."
UC Berkeley's Pachter and his team came into play after the draft sequence was completed under the leadership of the Baylor sequencing center. Bray and Pachter developed a computer program called MAVID to align multiple genomes, used initially last year for a comparative analysis of the mouse and human genomes. They've shown it can work with large numbers of genomes, such as the many sequences of the AIDS virus or even hundreds of mitochondrial DNA sequences. Chakrabarti and Pachter also created a visualization tool, dubbed K-BROWSER, that will allow biologists to browse multiple vertebrate genomes.
In addition, Yap and Pachter developed a way to search for evolutionary hot spots in the rodent genomes, which should be of biological interest. Dewey and Pachter also collaborated with scientists from Baylor on gene prediction, developing computational software called SLAM to identify novel human genes using both the rat and mouse genomes.
"We discovered a few new genes in the human genome by using the rat, but it doesn't appear as if we've missed an enormous number of genes," Pachter said.
Pachter currently is working on a comparative analysis of the chicken (Gallus gallus) genome, which was recently completed in draft form, and is enthusiastic about the six other vertebrate genomes scheduled to be completed this year: the chimpanzee, three kinds of fish, the frog and the dog. Soon to follow will be the macaque, cow and opossum genomes.
"We are really going from two genomes to ten in one year," Pachter said. "Each of these genomes is yielding distinct insights. The rat was very important because it's a disease model organism. The chicken is at a good evolutionary distance for doing comparative genomics. And while the chimp is going to be very bad in terms of looking for what genes are conserved or for identifying functional DNA, the differences will be very interesting to look at."
Funding for the Rat Genome Sequencing Project was largely provided by the National Heart, Lung and Blood Institute and the National Human Genome Research Institute, with additional private funding provided by the Kleberg Foundation.