Two decades ago, scientists celebrated sequencing the human genome, a milestone that was, in hindsight, incomplete. Most protein-coding genes were mapped, but about 8% of the genome remained elusive until now. Far from 'junk,' these regions hold critical biological insights.
Launched in 1990, the Human Genome Project aimed to decode the entire human genome. Its 'completion' was declared on April 14, 2003. However, roughly 15% of the DNA sequence couldn't be assembled due to technological limits.
Unmapped areas clustered around telomeres—the protective chromosome ends—and centromeres—the dense central regions. By 2013, experts narrowed the gap to just 8%, yet 200 million base pairs, akin to a full chromosome, stayed unresolved.
A consortium led by researchers from the National Human Genome Research Institute, the University of California, Santa Cruz, and the University of Washington has now produced a complete, gapless sequence of a human genome. Their findings appear in the journal Science.
DNA comprises nucleotides, each with a phosphate, a sugar, and one of four bases: adenine, thymine, guanine, or cytosine. The bases pair up (adenine with thymine, guanine with cytosine) to form the rungs of the double helix, and their order encodes our genetic blueprint. We inherit 23 chromosomes from each parent, for 23 pairs in all.
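The pairing rule can be illustrated with a short Python sketch. It assumes the standard pairing of adenine with thymine and guanine with cytosine; the function names and the sample sequence are made up for illustration.

```python
# Standard base pairing: A with T, G with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the partner base for each base, position by position."""
    return "".join(PAIRS[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the opposite strand, read in its own direction."""
    return complement(strand)[::-1]


# Illustrative sequence, not taken from any real genome.
print(complement("GATTACA"))          # CTAATGT
print(reverse_complement("GATTACA"))  # TGTAATC
```

Because each base has exactly one partner, knowing one strand is enough to reconstruct the other, which is why a sequence is normally written as a single string of letters.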
This blueprint resides in nearly every cell, but each cell reads only the genes relevant to it: a skin cell uses the instructions for texture and pigmentation and ignores those for eyes or internal organs.
Sequencing determines the precise order of these base pairs. Past efforts used short-read methods that read a few hundred bases at a time, so the genome had to be reassembled from millions of small fragments. In highly repetitive regions many fragments look alike, which is like assembling a 10-million-piece puzzle of endless blue sky, and numerous gaps remained.
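A toy Python sketch shows why short fragments are hard to place in repetitive DNA. The genome below is invented for illustration (a unique flank, a five-base unit copied six times, another unique flank), and `find_all` is a hypothetical helper rather than part of any sequencing tool. Real assemblers are far more sophisticated, but the ambiguity is the same.

```python
def find_all(genome: str, read: str) -> list[int]:
    """Return every start position where the read matches the genome exactly."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        # Step one base forward so overlapping matches are also found.
        start = genome.find(read, start + 1)
    return positions


# Made-up toy genome: unique flank + repeat unit x 6 + unique flank.
LEFT, UNIT, RIGHT = "ACGTTGCA", "GGAAT", "TCAGCTAG"
GENOME = LEFT + UNIT * 6 + RIGHT

unique_read = GENOME[1:11]    # overlaps the unique left flank
repeat_read = GENOME[13:23]   # lies entirely inside the repeat

print(find_all(GENOME, unique_read))  # [1]
print(find_all(GENOME, repeat_read))  # [8, 13, 18, 23, 28]
```

The read from the flank fits in exactly one place, while the read from inside the repeat fits equally well in five places, so an assembler cannot tell where it belongs or how many copies of the repeat exist.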
Complicating matters, the two chromosomes in each pair, one from each parent, carry very similar sequences, making it hard to tell which fragment belongs to which copy.
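To overcome this, researchers used tissue from a complete hydatidiform mole, which forms when a sperm fertilizes an egg that lacks its own nucleus, so the resulting cells carry only paternal chromosomes. From this tissue, they derived a cell line whose 23 chromosome pairs all come from a single parent, so the two copies in each pair are essentially identical.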
Advanced 'long-read' sequencing, employing lasers, captures 20,000 to one million base pairs per read. These larger pieces can span repetitive stretches from one side to the other, and they have sealed the remaining 'holes' in prior assemblies.
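The advantage of longer pieces can be sketched in Python on a made-up toy sequence. The sketch assumes a read is useful for resolving a repeat only if it reaches unique sequence on both sides. `count_repeat_copies`, the anchors, and the sequence are hypothetical and chosen for illustration; they are not the consortium's actual method.

```python
from typing import Optional


def count_repeat_copies(read: str, left_anchor: str, right_anchor: str,
                        unit: str) -> Optional[int]:
    """Count copies of `unit` between two unique anchors within one read.

    Returns None if the read does not contain both anchors with a clean
    run of repeat units between them.
    """
    start = read.find(left_anchor)
    if start == -1:
        return None
    start += len(left_anchor)
    end = read.find(right_anchor, start)
    if end == -1:
        return None
    middle = read[start:end]
    copies, remainder = divmod(len(middle), len(unit))
    if remainder != 0 or middle != unit * copies:
        return None
    return copies


# Made-up toy genome: unique flank + repeat unit x 6 + unique flank.
LEFT, UNIT, RIGHT = "ACGTTGCA", "GGAAT", "TCAGCTAG"
GENOME = LEFT + UNIT * 6 + RIGHT

short_read = GENOME[10:22]  # entirely inside the repeat
long_read = GENOME[2:44]    # reaches unique sequence on both sides

print(count_repeat_copies(short_read, "TGCA", "TCAG", UNIT))  # None
print(count_repeat_copies(long_read, "TGCA", "TCAG", UNIT))   # 6
```

The short read carries no information about where the repeat starts or ends, whereas the single long read pins down both boundaries and the exact number of copies in between.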
A fully resolved genome unlocks study of hard-to-reach regions, potentially revealing complex mutations linked to diseases.