It’s among the smallest of human chromosomes, but the complex structure of the Y chromosome has made it notoriously resistant to efforts to fully decipher it. Now, the first-ever complete sequence of the Y chromosome has been revealed, bringing us one step closer to solving a plethora of unanswered questions.
“Now that we have this 100 percent complete sequence of the Y chromosome, we can identify and explore numerous genetic variations that could be impacting human traits and disease in a way that we weren’t able to do before,” said study co-first author Dylan Taylor, of Johns Hopkins University, in a statement.
The first human genome sequence was unveiled two decades ago, the culmination of a landmark project at the dawn of the millennium. The sequence wasn’t fully complete – there were some gaps left in regions of the genome that were too complex for the technology of the day to handle. It was only in 2022 that the Telomere-to-Telomere (T2T) Consortium published the first gapless human genome sequence, which is now available for all to view.
But still, there was something missing. Much of the sequence of the Y chromosome was proving tricky to read.
“Just a few years ago, half of the human Y chromosome was missing [from the reference] – the challenging, complex satellite areas,” explained co-author Monika Cechova, of the University of California Santa Cruz, in another statement. “Back then we didn’t even know if it could be sequenced, it was so puzzling. This is really a huge shift in what’s possible.”
Why was the Y chromosome so difficult to sequence?
About 30 million of the roughly 62 million letters that make up the genetic code of the Y chromosome are repetitive sequences. Such regions are known as “satellite DNA”.
Imagine that someone printed out and chopped up this article into strips of text, then instructed you to put it back together like a cursed jigsaw. You’d probably be quite annoyed at this interruption to your day, but you could do it. Now imagine that half the article is the same single sentence, repeated over and over. How would you know you were putting the pieces together in the right order?
By now, you’d have probably flipped the table over and gone off for a cup of tea and a lie down. Thankfully, genomics researchers are made of sterner stuff. Over the years, sequencing technology has advanced, and scientists are now also armed with the knowledge gained from filling in the other gaps in the original reference genome.
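For readers who like to see the jigsaw problem in code, here is a minimal sketch. It is a toy illustration, not the method used in the study: the two sequences and the function names `fragment_counts` and `ambiguous_fraction` are invented for this example. It chops a sequence into overlapping fragments and measures what share of them appear more than once, meaning they could be slotted into more than one place.

```python
from collections import Counter


def fragment_counts(text: str, k: int) -> Counter:
    """Count how often each overlapping length-k fragment occurs in text."""
    return Counter(text[i:i + k] for i in range(len(text) - k + 1))


def ambiguous_fraction(text: str, k: int) -> float:
    """Share of fragment positions whose fragment also occurs elsewhere."""
    counts = fragment_counts(text, k)
    total = sum(counts.values())
    if total == 0:
        # The sequence is shorter than k, so there are no fragments at all.
        return 0.0
    repeated = sum(n for n in counts.values() if n > 1)
    return repeated / total


# Hypothetical toy sequences, chosen only to make the contrast visible.
unique_region = "ACGTTGCA"        # every 4-letter fragment occurs once
satellite_like = "GATTACA" * 10   # one short motif repeated back to back

print(ambiguous_fraction(unique_region, 4))
print(ambiguous_fraction(satellite_like, 4))
```

In the repeated sequence, every fragment has several equally valid positions, which is the "same single sentence, repeated over and over" problem. Longer fragments help only once they reach into the unique text on either side of the repeat.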
Why is it important to understand the Y chromosome?
The Y chromosome has traditionally been associated with male sex-specific traits, since it contains genes like SRY – the switch that pushes a developing embryo down the path of male sexual development – and those that control functions like sperm production.
However, it’s important to understand that Y chromosomes are not only present in people assigned male at birth, and someone’s external anatomy is not necessarily a reliable indicator of which chromosomes they have.
Intersex people can have XX, XY, or XXY chromosomes, or even a mixture of XY and XX in different cells within their body, which can lead to a whole host of different physical presentations.
From the new complete sequence of the Y chromosome, researchers have been able to identify 41 previously unknown genes, some of which could be medically relevant. Further research can now explore how these genes might affect the health of people with a Y chromosome.
The human Y chromosome itself is also changing – it’s the most rapidly evolving chromosome among all great apes. Thanks to the sequencing data, scientists will now be in a better position to understand what it means when, for example, two people’s Y chromosomes have wildly different numbers of copies of a particular gene.
There’s also been a lot of speculation around the gradual loss of the Y chromosome, and what this could mean, including the controversial theory that it might disappear altogether. A complete genetic sequence can only help the scientists working to unravel these mysteries.
And on top of all this, there was another unexpected finding. It turns out that human Y chromosome DNA has been turning up repeatedly as a contaminant in studies of bacterial DNA. As many as 5,000 bacterial genome sequences in a commonly used scientific database have now been found to contain sequences that match the Y chromosome. Because the chromosome had not been fully sequenced until now, these bits of DNA were mistaken for bacterial DNA; now, geneticists will be able to clean up the database.
“That was a surprising thing,” said lead author and National Human Genome Research Institute staff scientist Arang Rhie. “People were guessing at it, but no one could prove that this was happening until now.”
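To give a rough sense of how such contamination could be flagged, here is a toy sketch. It is an assumption-laden illustration, not the researchers' pipeline: the function names `kmers` and `shared_kmer_fraction` and the short sequences are invented, and real screening would rely on dedicated alignment tools, far longer sequences, and checks on both DNA strands. The idea is simply to ask what share of the short "words" in a supposedly bacterial sequence also appear in a reference sequence.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(query: str, reference: str, k: int) -> float:
    """Fraction of the query's distinct k-mers that also occur in the reference."""
    query_kmers = kmers(query, k)
    if not query_kmers:
        # The query is shorter than k, so there is nothing to compare.
        return 0.0
    return len(query_kmers & kmers(reference, k)) / len(query_kmers)


# Hypothetical toy data: a stand-in "reference" and two query fragments.
reference = "ACGTTGCAGGATCC"
suspect = "TTGCAGG"      # copied from the reference, so it should match fully
unrelated = "AAAAAAA"    # shares no 4-letter word with the reference

print(shared_kmer_fraction(suspect, reference, 4))
print(shared_kmer_fraction(unrelated, reference, 4))
```

A high shared fraction would mark a sequence for closer inspection. It would not prove contamination on its own, since unrelated genomes can share short stretches by chance.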
Where will this work lead next?
Earlier this year, the first-ever draft human pangenome was published. Going beyond the reference genome, the pangenome is a new resource that seeks to incorporate more of the diversity of the human race. Now that we have the sequence of the Y chromosome at last, this can be added to future versions of the pangenome.
There are so many unanswered questions about the Y chromosome: its function, its potential role in disease, and its evolution. The researchers behind this project are keen to make the data as accessible as possible, available to all those working to understand this enigmatic little piece of the human puzzle.
The study is published in Nature.