I regularly make the distinction between information and knowledge generated from genomic studies. Information is just that: the exploratory identification of genetic diversity. What nucleotide base or allelic variant is where on what chromosome in the genome? The Human Genome Project created a map that, coupled with incredible advances in sequencing technology, has allowed for tremendous genomic exploration and the production of a vast amount of genomic information over the past decade.
Genomic knowledge, in my usage, reflects not simply the acknowledgement of the location and quantity of genetic variation, but some understanding of what that variation is associated with. While we have generated a tremendous amount of genomic information over the past decade (and have used this information to generate evolutionary knowledge about the history of our species), what the variation in the genome actually does, on a biochemical, molecular, physiological, or ecological level, has remained a mystery in most cases. This has prompted commentaries on the disappointment, or lack of progress, in genomic medicine based on genome-wide association studies. What has been lacking is knowledge of how genetic variants are biologically associated with outcomes. In other words, what the genome does.
The simultaneous publication of 30 articles today as part of the long-term ENCODE project is a major step forward in this arena. Ed Yong has a wonderful description of what ENCODE is all about that I recommend in full. For those really interested, ENCODE project coordinator Ewan Birney has an equally informative, though more technical, account of the behind-the-scenes collaborative and publication process associated with the project. The very short description is that ENCODE attempts to identify what kinds of cellular and biochemical processes different parts of the genome are associated with. Ed Yong comes up with a useful analogy:
Think of the human genome as a city. The basic layout, tallest buildings and most famous sights are visible from a distance. That’s where we got to in 2001. Now, we’ve zoomed in. We can see the players that make the city tick: the cleaners and security guards who maintain the buildings, the sewers and power lines connecting distant parts, the police and politicians who oversee the rest. That’s where we are now: a comprehensive 3-D portrait of a dynamic, changing entity, rather than a static, 2-D map.
This is a major step forward. As with the Human Genome Project, the results published by the ENCODE group today may reveal a lot about the evolutionary history and structure of our genome, but their implications for the production of genomic knowledge, not just information, will likely not be immediately apparent.
Eighteen months ago, Nature published a commentary titled “Charting a course for genomic medicine from base pairs to bedside.” In it, the authors produced the following figure demonstrating their sense of how the future timeline of genomic research will play out.
The work of the ENCODE team makes a large step towards filling in the second column in the above figure, moving us closer to generating real knowledge about the genome’s role in shaping human biology.
*****
1. Eric D. Green, Mark S. Guyer & National Human Genome Research Institute (2011). Charting a course for genomic medicine from base pairs to bedside. Nature, 470: 204-213. doi:10.1038/nature09764