A glimpse into the unknown – human genome gets a huge update
The first draft of a human pangenome – a collection that aims to eventually represent as many as possible of the DNA sequences found across humans – has been published.
This comes 20 years after the first complete human genome was sequenced and 70 years after the structure of DNA was discovered.
The research, published in Nature, combines genetic material from 47 genetically diverse individuals to provide a more complete picture of the human genome. The next phase will sequence 350 genomes by the middle of next year, to add even more genetic diversity.
Professor Evan Eichler of the University of Washington School of Medicine said during a press conference that a pangenome will eventually represent the diversity of all humanity. “All the [genetic] variations of all the millions of patients which will be sequenced in the future will be mapped against this reference.”
It is the genetic differences between people, the variations rather than the similarities, that are especially important medically. These can include complex structural variations.
Eichler said they found “remarkable patterns of human variation… things that I would not have expected. There is evidence of new mutational mechanisms. We found genes that every human that has been sequenced so far has a different complement of. I think it is extraordinary.”
A genome is the set of DNA instructions that helps each living creature develop and function. Genome sequences differ slightly among individuals. In the case of humans, any two people’s genomes are, on average, more than 99% identical.
The small differences contribute to each person’s uniqueness and can provide insights about their health, helping to diagnose disease, predict outcomes and guide medical treatments. A pangenome is a collection of DNA sequences that reveals genetic variation between individuals.
The original reference human genome is fundamentally limited in its representation of the diversity of the human species, since it consists of genomes from only about 20 people, and most of the reference sequence is from only one person. The pangenome also builds upon the previous reference genome sequence, adding more than 100 million new bases, or “letters” in DNA.
Dr Benedict Paten, from the University of California, said that with just the 47 genomes they can understand tens of thousands of structural variations, which will eventually be very important in diagnosing rare diseases. Previously this was difficult because the reference that genomes were compared against was incomplete.
It is important to remember that the goal is not to represent every variation that exists, because there are hundreds of new mutations in every newborn baby, said Dr Tobias Marschall of Heinrich Heine University. The importance of the pangenome is that every variation can now be analysed.
Eichler said there are about 500 to 600 regions on the genome which are “radically different” between humans. People are both incredibly similar and “different by millions”.
The technique used to do this is called “long-read” sequencing, explained Professor Erik Garrison of the University of Tennessee Health Science Centre.
“The orders of magnitude are enormous in length difference. The long reads are triumphs in modern biology. They allow us to read through repetitive patterns which are very common in higher organisms. That is what is allowing access to the genes that are evolving so rapidly and things which are difficult to see. One can see a single molecule.
“The human pangenome is not just a chance to look into the unknown. It also marks the beginning of a period in which we can see genomic sequence and variation completely.
“Previous genomic studies mostly look at regions that are easy to map short reads to, giving the impression that our pangenome is ‘flat’ and everyone is very similar, differing only by a handful of point mutations.
“Instead, the human pangenome shows that each of us carries bits of DNA that are unusual or unique. These are often parts associated with immune function or environmental interactions that are very important for our health. We will only get a handle on their significance by working with a reference that equally represents people from all genetic backgrounds. That reference must be a pangenome,” he said.
Several of the genomes were from Africa, because of the continent’s rich genetic diversity and because all humans are descended from African ancestors, said Eichler. But much more sampling from Africa is still needed “to have a true human pangenome reference”. More African genomes will be included in future.
It may take decades before individual patients will be able to have their complete genomes sequenced at this level, said Eichler.
Eichler’s own work has focused on autism, and for 70% of the children who visit his clinic, “we cannot explain why they have autism. It is my belief a significant percentage will be explained if we can do complete sequencing on them… The Holy Grail of all of this is to make a difference and find better treatments.” DM