Using whole-exome duplex sequencing of 81 semen samples from 57 men ages 24 to 75 (mean age = 53), researchers found a broader increased disease risk for children born to fathers of advanced age than previously appreciated. Specifically, they noted, the fraction of disease-causing mutations increased from about 2% in 30-year-olds to 4.5% in 70-year-olds.
The researchers identified more than 35,000 coding mutations and 40 genes under positive selection, 31 of which were previously unrecognized. Most participants contributed one or two samples. Sperm mutations accumulated at an average estimated rate of 1.67 single-nucleotide substitutions per haploid genome per year, driven primarily by age-related mutational processes SBS1 and SBS5. Overall, sperm carried about 5- to 20-fold fewer mutations than somatic tissues, which indicated a lower baseline mutational burden in the male germline.
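To put the reported accumulation rate in concrete terms, the following is a minimal sketch of the arithmetic, assuming the rate of 1.67 substitutions per haploid genome per year stays constant across adulthood. That linearity is a simplifying assumption made here for illustration, and the function name is hypothetical rather than taken from the study.

```python
# Rate reported in the article: single-nucleotide substitutions
# per haploid sperm genome per year of paternal age.
SNV_RATE_PER_YEAR = 1.67


def expected_extra_snvs(age_from: float, age_to: float,
                        rate: float = SNV_RATE_PER_YEAR) -> float:
    """Expected additional substitutions per haploid genome between two ages.

    Assumes a constant (linear) accumulation rate, which is a simplification.
    """
    if age_to < age_from:
        raise ValueError("age_to must be greater than or equal to age_from")
    return rate * (age_to - age_from)


if __name__ == "__main__":
    # Difference between a 30-year-old and a 70-year-old under this assumption.
    print(round(expected_extra_snvs(30, 70), 1))
```

Under this assumption, sperm from a 70-year-old would carry roughly 67 more single-nucleotide substitutions per haploid genome than sperm from a 30-year-old.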
By middle age and older, an estimated 3% to 5% of sperm carried a likely disease-causing variant—about 2 to 3 times higher than neutral expectations. Of an observed 3.3% “disease-mutation” fraction, roughly one-third arose from neutral background processes, one-third from known driver genes, and one-third from previously uncharacterized candidates. These variants were highly diverse and individually rare: 99.4% were detected in a single sperm cell, and donors had an average of 18.3 distinct disease-fraction variants.
Germline variants overlapped strongly with disease databases and showed approximately 11-fold enrichment among recurrent COSMIC cancer variants and about 66-fold enrichment among recurrent developmental-disorder variants. Positive-selection signals were enriched in RAS–MAPK, WNT, and TGF-β/BMP pathways that regulate spermatogonial proliferation and differentiation. Notably, 30 of the 31 newly identified genes were loss-of-function (LOF) enriched, which indicated that clonal expansion in spermatogonial stem cells is not limited to activating variants. Six genes—KDM5B, MIB1, SMAD6, PRRC2A, NF1, and PTPN11—accounted for more than 20% of the overall selection burden. SMAD4 harbored a germline-specific mutation hotspot associated with Myhre syndrome; the same hotspot is not observed among recurrent somatic cancer mutations.
In the Genome Aggregation Database (gnomAD), the researchers found that SMAD6, MIB1, LZTR1, and SSX1 had more LOF variants than expected, and that MIB1, LZTR1, and SSX1 were among the strongest LOF-enriched outliers in the database. Further, these three genes are flagged in gnomAD "for unexplained LOF enrichment," wrote Raheleh Rahbari, PhD, of the Wellcome Sanger Institute in Hinxton, UK, and colleagues. "Our results suggest that their increased LOF frequency in gnomAD reflects increased input from germline positive selection, with insufficient negative selection to remove them from the population," the authors continued. They added that the excess LOF mutations in MIB1 found in developmental disorder trios could reflect germline selection rather than disease association, because these mutations are more common in population cohorts than expected and do not show phenotype correlation.
Across tissues, the fraction of driver mutations was low in sperm (about 1% to 3%) compared with approximately 30% to 50% in many epithelial tissues (e.g., skin, endometrium, esophagus). This finding suggested that, while selection acts in both settings, its magnitude is more limited in the germline.
Modeling estimated that 0.5% of sperm from a 30-year-old man and 2.6% from a 70-year-old man carry a known driver mutation; observed disease-mutation fractions increased with age and are consistent with the 3% to 5% level in older men. Exome-wide selection strength was dN/dS = 1.07, implying that approximately 6.5% of nonsynonymous mutations confer a clonal growth advantage during spermatogenesis. Age-stratified values rose from 1.01 in men 26 to 42 years old to 1.03 in men 43 to 58 years old and 1.09 in men 59 to 74 years old, though significance was not reached. Sensitivity analyses suggested that about 14% to 43% of the excess nonsynonymous load may represent true positively selected mutations, and the genes with the strongest selection signals are highly expressed in spermatogonial stem cells.
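The link between a dN/dS ratio and the share of nonsynonymous mutations under positive selection can be sketched with the standard excess-fraction relation, (dN/dS − 1) / (dN/dS). The snippet below is an illustrative calculation using the values quoted in this article, not the authors' dNdScv pipeline; the function and variable names are hypothetical, and it works from point estimates only, ignoring confidence intervals.

```python
def selected_fraction(dnds: float) -> float:
    """Fraction of nonsynonymous mutations in excess of neutral expectation.

    Uses the excess-fraction relation (dN/dS - 1) / (dN/dS); ratios at or
    below 1 are clipped to zero because they imply no net positive selection.
    """
    if dnds <= 0:
        raise ValueError("dN/dS must be positive")
    return max(0.0, (dnds - 1.0) / dnds)


# Point estimates quoted in the article (confidence intervals not modeled).
EXOME_WIDE_DNDS = 1.07
AGE_STRATIFIED_DNDS = {"26-42": 1.01, "43-58": 1.03, "59-74": 1.09}

if __name__ == "__main__":
    print(f"exome-wide: {selected_fraction(EXOME_WIDE_DNDS):.1%}")
    for ages, dnds in AGE_STRATIFIED_DNDS.items():
        print(f"ages {ages}: {selected_fraction(dnds):.1%}")
```

For the exome-wide value of 1.07, this gives about 6.5%, matching the figure reported above. The age-stratified outputs should be read cautiously, since the article notes that significance was not reached.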
The team applied NanoSeq, a duplex method with an error rate of less than 5 × 10⁻⁹ per base, to sperm and matched blood from the TwinsUK cohort. They included only samples with more than 1 million sperm/mL to minimize somatic contamination. Selection was quantified with dNdScv, which corrected for CpG methylation and local sequence context. A useful nuance: SBS1 and SBS5 mutations accumulate faster in blood than in sperm (approximately 9 times and 7 times higher rates, respectively), reinforcing the comparatively mutationally quiet state of the germline. Unlike blood (where smoking and alcohol showed significant effects), the authors wrote, BMI, smoking pack-years, and alcohol consumption showed no significant effects on sperm mutation burden, suggesting the male germline may be protected from these exposures.
The authors cautioned that precise estimates of the true driver proportion remain uncertain because individual variants are rare and modeling assumptions vary. Additional limitations include the use of pooled (not single-cell) sequencing and the lack of assessment of structural or noncoding variants, so the landscape described here pertains to SNVs and indels in coding regions. As such, clinical implications are not established. Prospective follow-up is needed to map trajectories of selection and age-related risk.
"These findings reveal that germline selection operates in the broader framework of cellular selection, driven by many of the same genes and mechanisms that shape clonal dynamics in somatic tissues," the authors explained. "However, unlike somatic selection, germline selection affects offspring phenotypes and influences evolutionary trajectories."
The authors' full disclosures can be found in the published research.
Source: Nature