AT THE W.M. KECK CENTER,
THE STUDY OF AN ANCIENT MIGRATION
COULD LEAD TO BETTER UNDERSTANDING
OF GENETIC DISEASE
WRITTEN BY ALAN FRIEDMAN
One hypothesis argues that all aboriginal populations descended from one major exodus of migrants 15,000 years ago. Another interpretation makes the case for three waves of migration--the earliest occurring about 30,000 years ago. Still, on one matter both camps had agreed: No more than five core populations from what we now call Mongolia actually made the crossing. The New World, so the story goes, was settled as these distinct groups divided and subdivided.
But evidence uncovered by Robert Ferrell, acting chair of Pitt's Department of Human Genetics and core professor at the W. M. Keck Center for Computational Biology, may throw even that one clear piece of the puzzle back for reconsideration. What's especially interesting here is that Ferrell is not an anthropologist but a geneticist. His tools are not the chisels and brushes used to dislodge bone from earth but the Cray supercomputer. The road map he is following to explore the path of migration is not geographic but biological, written in microscopic strands of DNA.
The Keck Center itself is a foray into a new world. There, traditional biology meets the number-crunching power of computer science and the imagination of artificial intelligence as faculty, researchers, and graduate students from Pitt, CMU, and the Pittsburgh Supercomputing Center collaborate on a wide range of problems, each perplexing in its own way. One research problem might be nonlinear, bringing together a complex web of variables, each variable capable of affecting the others, all exerting an influence over billions of data items. Another might be longitudinal in nature, with researchers looking back over thousands of years' worth of information. The sheer volume of complicated calculations these projects require makes it impractical, if not impossible, to tackle certain research questions, let alone draw any firm conclusions. But in the forum of the Keck Center, biological researchers can appeal to computer scientists to help them through bottlenecks, enlisting the supercomputer to accurately simulate years and years of experimentation--often in minutes.
Public health professor Herb Rosenkranz, for example, is using supercomputers to determine the actual harmfulness of potentially carcinogenic chemicals in the air. (Real-life tests would take years and ungodly amounts of money.) Donald Mattison, dean of the Graduate School of Public Health, is able to study infant mortality and assess useful intervention schemes, despite the fact that he works with databases containing many millions of items. And biological science professor John Rosenberg is able to develop artificial intelligence systems that learn to mimic the strategies and techniques of expert crystallographers who grow macromolecule protein structures for sophisticated research.
In tracing the DNA of the aboriginal settlers of the North and South American continents, Ferrell is actually following the patterns of gene distribution for a gene-based disease, late-onset diabetes. A clear understanding of how the culprit gene is passed along might lead to a new level of preventive measures, treatments, perhaps even a cure.
"Certain Native American populations have among the world's highest rates of the disease," Ferrell explains. "Populations in the Southwest and Mexico have rates that are 10 to 20 times those of neighboring European populations. So there are genes that are at a high frequency in American Indians that predispose them to diabetes." Understanding gene distribution in a historically segregated population--be it a Native American nation, the Amish, or Ashkenazi Jews--can lead to identifying the genes and environmental factors that enable genetically influenced diseases like Tay-Sachs, cystic fibrosis, or bipolar disorder to strike.
For this project, Ferrell has collaborated with Andrew Merriwether (Public Health '93). Merriwether, who did both his dissertation and post-doctoral work with Ferrell (he's since taken a position at the University of Michigan), describes himself as a "bridge scientist," someone who has studied both the biological sciences and the more computationally intense fields of mathematics and statistics. Such researchers can bridge the gap between supercomputer programmers and "wet-lab" scientists such as geneticists. Merriwether can speak both languages well enough to be the go-between, managing the computational facets of a project and ensuring that they serve the biological scientists' needs.
The founding lineage study begins with the premise that no matter where and when the journey began, Native Americans have been carrying the DNA of their ancestors all along. In essence, Ferrell is recreating a forbiddingly large family tree by tracing mitochondrial DNA (mtDNA). "Genes that make up nuclear DNA are donated by both parents and get shuffled by recombination," he explains. "So every individual is a mosaic." Instead, he needed the DNA found in mitochondria, the power plants of cells. "The beautiful thing about mitochondrial DNA is that it's handed down from mother to offspring intact," he points out. Over thousands of generations, it hardly changes. Those rare changes in mtDNA, unlike nuclear DNA, stand out against a motionless background. When a change does occur, it signals a new branch in the family tree.
Ferrell's team began by assembling a vast collection of specimens. Anthropologists donated blood samples they had drawn from members of remote Indian populations throughout the Americas. Ferrell added samples he had collected in Bolivia in the 1970s and then put in deep freeze. Museums and individual archaeologists added the partially fossilized bones, teeth, and hair of 400 ancient inhabitants of the Americas. In all, the collections represented 2,000 Native Americans, Central and South American Indians, and contemporary Mongolians (an important group by virtue of being descendants of the tribes who chose not to migrate all those thousands of years ago).
Once the samples were collected, technicians had to distill the mtDNA of each. The surfaces of bones were wiped with diluted bleach and irradiated with ultraviolet light to remove contaminants. Then they were frozen in liquid nitrogen and reduced to powder, with remnants of DNA being isolated. With blood, white blood cells were isolated, an enzyme was added to digest protein matter, and ethanol was introduced to dry up any remaining fluid. The dregs left in the filter were strands of mtDNA.
But before these swirling spiral ladders (some with as many as 200 million rungs) could be of any use, Ferrell needed to break them down into the planks and beams from which the ladders were made. He used restriction enzymes, the finest saws a geneticist can buy. There were 14 in all, each engineered to locate and detach a specific DNA segment. A pile of pieces was left behind. The more similarity between the number and size of the pieces in any two piles, the greater the chance those ladders were built from wood that came from the same grove--a common ancestor.
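To make the idea of comparing two "piles" concrete, here is a minimal sketch in Python. It is not the procedure Ferrell's team used, which the article does not spell out. It assumes each pile can be written as a list of fragment lengths, and it scores similarity as the fraction of fragments the two piles share (a Dice-style measure chosen for illustration). The function name and the sample lengths are hypothetical.

```python
from collections import Counter


def shared_fragment_fraction(pile_a, pile_b):
    """Fraction of fragments common to two piles (1.0 means identical piles).

    Each pile is a list of fragment lengths. Fragments are matched by
    exact length, which is a simplification made for illustration.
    """
    count_a, count_b = Counter(pile_a), Counter(pile_b)
    total = sum(count_a.values()) + sum(count_b.values())
    if total == 0:
        raise ValueError("both piles are empty")
    # Multiset intersection: a length counts as shared only as many
    # times as it appears in both piles.
    shared = sum((count_a & count_b).values())
    return 2 * shared / total


# Hypothetical fragment lengths, not data from the study.
sample_one = [5000, 3200, 1100]
sample_two = [5000, 3200, 900, 200]
print(shared_fragment_fraction(sample_one, sample_two))
```

In this toy case the second sample looks as if one extra cut split the 1,100-unit piece in two, so the samples share two of their seven fragments and score 4/7. Two identical piles would score 1.0.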
It's easy enough, relatively speaking, to discern the similarity of any two samples. "But when thousands of individuals are studied, the number of possible trees grows larger than the number of molecules in the universe," says Andrew Merriwether. The Keck Center enabled Merriwether to field the advice of expert mathematicians and computer scientists.
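A short sketch shows how quickly the possibilities pile up. It relies on a standard counting result: the number of distinct rooted, two-way-branching trees that can join n labeled samples is the product of the odd numbers up to 2n - 3. Comparing that count with 10^80, a commonly quoted rough estimate for the number of atoms in the observable universe, is an assumption made here for scale and is not a figure from the study.

```python
def rooted_tree_count(n_samples):
    """Number of distinct rooted, bifurcating trees joining n labeled samples.

    Uses the standard result (2n - 3)!! = 1 * 3 * 5 * ... * (2n - 3).
    """
    if n_samples < 1:
        raise ValueError("need at least one sample")
    count = 1
    for odd in range(3, 2 * n_samples - 2, 2):  # 3, 5, ..., 2n - 3
        count *= odd
    return count


print(rooted_tree_count(4))   # 15
print(rooted_tree_count(10))  # 34459425

# Rough yardstick only: 10**80 is an often-quoted estimate, assumed here.
print(rooted_tree_count(60) > 10**80)  # True
```

A few dozen samples are already enough to outrun that yardstick, so with thousands of individuals no computer could examine every tree one by one. The search has to be guided.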
"I was thrilled to collaborate with them," he says, "because they have an utterly different way of looking at problems than anthropologists do." And since looking back in time and across continents in this way requires billions of calculations--far more than enough to drive even the most zealous researchers to snap their pencils in two--the task was assigned to the Cray supercomputer.
For all the Cray's computational strength (in terms of wattage, it uses more power than a small city), it had to be taught how to identify impossible scenarios, then average out the viable choices to create a single and intelligible model. Being more idiot-savant than true genius, the Cray needs a good bit of human guidance. Given the chance, it would never stop dreaming up ways, however farfetched, of relating all of the DNA samples. "The simpler the tree," Merriwether explains, "the fewer forks and branches, the better its approximation of how the migrants truly gave rise to all Native Americans."
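One common way to put "the simpler the tree, the better" into numbers is parsimony: count the fewest changes a tree needs to explain the observed DNA patterns, then prefer the tree with the lowest count. The sketch below uses Fitch's counting method on a tiny made-up example. The article does not say which criterion or software the team used, so treat this as an illustration of the principle only. The sample names, the 0/1 patterns (standing for a cut site being absent or present), and the function names are all hypothetical.

```python
def fitch_changes(tree, states):
    """Return (candidate ancestral states, minimum changes) for one character.

    `tree` is a sample name (a tip) or a pair (left, right) of subtrees.
    `states` maps each sample name to its observed state.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_changes(left, states)
    right_set, right_cost = fitch_changes(right, states)
    common = left_set & right_set
    if common:
        # The two branches can agree, so no extra change is needed here.
        return common, left_cost + right_cost
    # The branches disagree: at least one change happened at this fork.
    return left_set | right_set, left_cost + right_cost + 1


def tree_score(tree, characters):
    """Total minimum changes over all characters. Lower means simpler."""
    return sum(fitch_changes(tree, states)[1] for states in characters)


# Hypothetical data: two characters scored across four samples.
characters = [
    {"A": 1, "B": 1, "C": 0, "D": 0},
    {"A": 1, "B": 1, "C": 1, "D": 0},
]
tree_one = (("A", "B"), ("C", "D"))
tree_two = (("A", "C"), ("B", "D"))
print(tree_score(tree_one, characters))  # 2
print(tree_score(tree_two, characters))  # 3
```

Here the first tree explains the same data with fewer changes, so under this criterion it would be preferred.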
What Ferrell and Merriwether are learning about the ancient migrants takes issue with prevailing theories. For one thing, the study shows broad distribution of migrating populations, suggesting that the New World was settled by a diffusional pattern of migration, not by isolated populations charting paths all their own. "Also, it looks as if a small number of discrete migration events was not how the Americas were settled," adds Merriwether. "The crossing was very gradual, occurring over a long period of time."
As far as the putative five founding groups are concerned, theirs may have been the largest contingents, but stowaways from at least three--if not many more--other Asian populations were tagging along. In studying both modern-day Mongolians and the Yanomami, a tropical rain forest population of 10,000 living on the border of Venezuela and Brazil, Ferrell and Merriwether found three new ancestral patterns. The Yanomami have been in contact with the outside world for only about 50 years, yet among both groups these patterns frequently appear, almost certainly a connection back to the moment when their common ancestors--some filled with wanderlust, others content to sit still--parted ways.
The Yanomami finding also gives new life to the debate concerning when the first brave souls made the crossing. By inserting into a simple algebra equation (no supercomputers need apply) the number of mutations that differentiate Yanomami and Mongolian mtDNA, and the fairly well-established rate at which mutations in mtDNA occur, Merriwether could calculate the approximate date of separation. "But I did not use the data to assign specific dates. I don't believe that this molecular clock is accurate over such a short time span," Merriwether states. Even so, he did hazard to say that the equation posits the migration of the Yanomami's ancestors, if not all early migrants to the New World, closer to the present than we've ever had reason to believe.
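The algebra in question is typically of the following form: divide the fraction of DNA positions at which two groups differ by the rate at which such differences accumulate. The sketch below is a generic molecular-clock calculation, not Merriwether's actual equation. Every number in it is a placeholder chosen for easy arithmetic rather than a value from the study, which, as he notes, did not assign specific dates.

```python
def divergence_time_years(differences, sites_compared, pairwise_rate):
    """Estimate years since two lineages separated under a constant clock.

    differences    -- positions at which the two sequences differ
    sites_compared -- total positions compared
    pairwise_rate  -- assumed divergence per site per year between the two
                      lineages (both branches combined)
    """
    if sites_compared <= 0 or pairwise_rate <= 0:
        raise ValueError("sites_compared and pairwise_rate must be positive")
    divergence = differences / sites_compared
    return divergence / pairwise_rate


# Placeholder inputs only: 10 differences over 1,000 sites and an assumed
# rate of 2e-8 per site per year give roughly 500,000 years.
print(divergence_time_years(10, 1000, 2e-8))
```

Because the estimate scales directly with both the counted differences and the assumed rate, a small error in either one shifts the date proportionally. That is one reason to be cautious about the clock over short time spans.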
While arriving at these anthropological answers is satisfying in its own right, Ferrell never loses sight of his larger mission. "There is a tremendous disparity between populations in the prevalence of disease," he stresses. "We're interested in what determines patterns of a disease. If we want to understand the distribution of diseases in populations throughout the world, we need to first understand the genetic history of those populations. These studies are a tool for helping us understand them."
And so an ancient story continues, this time with researchers like Ferrell and Merriwether listening. Because understanding the distribution patterns of a disease may lead to its prevention and cure, the often tragic tale of the peoples who wondered about what lay at the other side of a bridge may reach a more generous end.
and Call Me in the Morning
But this expertise, born of decades of experience, can be taught to a high-performance computer. Researchers at Pitt's Center for Biomedical Informatics, where Buchanan is a core faculty member, are succeeding in encoding this expert way of thinking so that high-performance computers can apply it to the real-life, down-to-the-wire dilemmas physicians have to face. When puzzled by ambiguous symptoms and inconclusive lab tests, young doctors, for instance, will have a resource at their disposal beyond the advice of their senior colleagues. They'll be able to appeal to their field's most discerning and respected minds, conveniently packaged in diagnostic software.
Pitt researchers are also putting the power of artificial intelligence and high-performance computers to work searching out "designer" therapies for hard-to-treat diseases. Two years ago, for example, Billy Day, associate professor of environmental and occupational health and pharmaceutical sciences and member of the University of Pittsburgh Cancer Institute, set out to find a substance that could stop the spread of breast cancer tumors without otherwise debilitating the women in whom they grow. Day's team taught an artificial intelligence system to sift through 5,000 chemicals and determine which ones showed promise for destroying cancer cells without harming their healthy neighbors. Because the molecular structure of each chemical had to be analyzed in enormous detail, this computational assignment, if carried out by hand in the lab, wouldn't be completed until well into the twenty-second century.
This computational talent search spotted a rising star--discodermolide, a chemical found in a primitive Caribbean sea sponge by researchers at the Harbor Branch Oceanographic Institution. Discodermolide causes breast cancer cells to swell up and tear themselves apart from within. In lab tests, it has been shown to be a hundred times more effective, and potentially safer, than Taxol, a leading chemotherapy agent.
As clinical tools, computers are being taught to serve the people who solicit their advice. They, too, aren't looking into crystal balls or waving wands, but for patients in crisis, their performances are no less magical.--Alan Friedman