Massively parallel sequencing–based single nucleotide polymorphism profiling produced usable sequencing data in most unidentified human remains samples that met a minimum human DNA threshold, although routine presequencing DNA measures only moderately predicted profile completeness.
In a study published in Forensic Science International: Genetics, researchers analyzed 500 randomly selected, anonymized skeletal samples submitted to Othram Inc for forensic genome sequencing. Human-specific DNA quantity was measured with short and long autosomal quantitative polymerase chain reaction (qPCR) targets, and total DNA was measured fluorometrically. Samples advanced to sequencing only when the short autosomal target measured at least 0.005 ng/µL.
Of the 500 samples, 399 met that threshold and underwent sequencing. Among these sequenced samples, single nucleotide polymorphism (SNP) call rates ranged from 8% to 91%, and 95.7% achieved call rates above 50%. The call rate was defined as the proportion of successfully genotyped loci among 637,469 target markers.
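The call-rate definition above is simple arithmetic, and a minimal sketch can make the scale concrete. The function names below are hypothetical and are not drawn from the study's pipeline; only the 637,469-marker panel size and the 8% lower bound come from the reported results.

```python
# Illustrative sketch only: helper names are assumptions, not the study's code.
TARGET_MARKERS = 637_469  # target SNP panel size reported in the study


def call_rate(genotyped_loci: int, target_markers: int = TARGET_MARKERS) -> float:
    """Return the proportion of target markers that were successfully genotyped."""
    if target_markers <= 0:
        raise ValueError("target_markers must be positive")
    if not 0 <= genotyped_loci <= target_markers:
        raise ValueError("genotyped_loci must lie between 0 and target_markers")
    return genotyped_loci / target_markers


def loci_at_call_rate(rate: float, target_markers: int = TARGET_MARKERS) -> int:
    """Return the approximate number of genotyped loci implied by a call rate."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    return round(rate * target_markers)


if __name__ == "__main__":
    # Lowest reported call rate (8%) on the 637,469-marker panel.
    print(loci_at_call_rate(0.08))
```

Under this definition, even the 8% lower bound of the reported range corresponds to roughly 51,000 genotyped markers.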
The strongest correlate of profile completeness was the ratio of total DNA to short-target human DNA, which the researchers used as an indicator of background DNA burden relative to endogenous human DNA. Estimated human DNA input into library preparation was also positively associated with call rate. By contrast, the degradation index (DI), a metric commonly used to assess suitability for short tandem repeat (STR) typing, showed only a modest overall association with sequencing performance, although DI was the strongest predictor within patella samples.
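The presequencing metrics described above can be sketched as a few derived quantities. This is an illustration under stated assumptions, not the researchers' implementation: the class and method names are hypothetical, the example concentrations are invented, and the degradation index is assumed to follow the common convention of short-target concentration divided by long-target concentration, which the article does not spell out. Only the 0.005 ng/µL threshold comes from the study.

```python
# Illustrative sketch only: names and example values are assumptions.
from dataclasses import dataclass

SHORT_TARGET_THRESHOLD_NG_UL = 0.005  # sequencing threshold reported in the study


@dataclass(frozen=True)
class Quantification:
    total_dna: float     # fluorometric total DNA, ng/µL
    short_target: float  # short autosomal qPCR target, ng/µL
    long_target: float   # long autosomal qPCR target, ng/µL

    def advances_to_sequencing(self) -> bool:
        """Apply the short-target threshold used for triage in the study."""
        return self.short_target >= SHORT_TARGET_THRESHOLD_NG_UL

    def background_ratio(self) -> float:
        """Total DNA relative to short-target human DNA (background burden)."""
        if self.short_target <= 0:
            raise ValueError("short_target must be positive")
        return self.total_dna / self.short_target

    def degradation_index(self) -> float:
        """Assumed convention: short-target divided by long-target concentration."""
        if self.long_target <= 0:
            return float("inf")  # long target undetected
        return self.short_target / self.long_target


if __name__ == "__main__":
    # Hypothetical sample, not taken from the study data.
    sample = Quantification(total_dna=2.0, short_target=0.02, long_target=0.004)
    print(sample.advances_to_sequencing())
    print(sample.background_ratio())
    print(sample.degradation_index())
```

A larger background ratio means that a larger share of the measured DNA is not attributable to the short human target, which is why the researchers treated it as an indicator of background DNA burden.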
Skeletal source appeared to matter most during initial sample triage. Petrous bone had the lowest failure-to-progress rate at 6.5%, followed by metatarsal and tibia samples at 12.5% and femur samples at 15.7%, although several of these estimates were based on relatively small sample sizes. The confidence interval for petrous bone did not overlap with those of several lower-performing skeletal categories, including rib and talus samples. However, among samples that advanced to sequencing, call-rate distributions were similar across major bone categories, with median call rates ranging from 76% to 85%.
Machine-learning models using standard quantitative predictors had moderate performance. The best-performing model, a Random Forest model using quantitative predictors alone, explained 47% of the variability in call rate in a validation set. Correlation and machine-learning analyses included 398 sequenced samples with complete quantification data because one sequenced sample lacked a complete Qubit total DNA measurement. Prediction accuracy declined among lower-performing samples, where models tended to overestimate sequencing success.
The findings suggest that standard quantitative DNA metrics can help guide workflow decisions but cannot reliably predict whether an individual skeletal sample will yield a high-completeness profile. The researchers noted that remaining variability likely reflected unmeasured sample-specific factors, including DNA damage, environmental exposure, inhibitors, and background DNA not fully captured by current laboratory metrics.
The study also highlighted the limits of using call rate alone as a marker of forensic genetic genealogy (FGG) success. Even the lowest-call-rate sample, at 8%, yielded about 51,000 SNPs. Although the researchers noted such profiles are unlikely to support segment-based kinship inference, they may still be useful for direct comparison or close kinship analyses. In addition, 93.2% of sequenced samples achieved call rates of at least 60%, which the researchers said, based on their experience, generally supports upload to the FamilyTreeDNA database.
Several limitations may affect interpretation. The samples likely represented unusually challenging forensic cases because FGG is not yet routinely applied and many submitted samples may already have failed traditional STR workflows. As a result, the approximately 80% progression rate observed in this study may underestimate sequencing success rates in broader casework, and the rate would likely increase if FGG were used as a first-line method. Bone samples also varied in preservation status and environmental exposure, limiting direct comparisons across skeletal sources.
The researchers also noted that the sequencing threshold relied on an 80-base pair qPCR target originally developed for STR workflows. Some samples that failed to meet the threshold may still contain shorter DNA fragments recoverable through massively parallel sequencing approaches, suggesting current triage thresholds could exclude potentially sequenceable samples.
“These findings indicate that qPCR and fluorometric measurements are correlated with SNP call rate but are not sufficient, alone, to reliably predict outlier outcomes or to determine, with high confidence, whether an individual skeletal sample will yield a high-completeness SNP profile,” wrote first author Steven A. Bates of Othram Inc, and colleagues.
Disclosures: Steven A. Bates, Morgan Johnson, Jianye Ge, Kristen Mittelman, and David Mittelman were employees of Othram Inc. Bruce Budowle served as a consultant to Othram Inc.