Objective:
To identify protein-altering variants that are missed when genetic variants are interpreted using only reference transcripts.
Approach:
- Data Integration: Integrated long-read RNA sequencing data from multiple human tissue atlases with population variation from the Genome Aggregation Database, disease-associated variants from the Genome-Wide Association Studies Catalog and ClinVar, protein structure prediction, …
- Validation: Validated selected findings using targeted long-read RNA sequencing, proteomics, enzymatic assays, and cell-based experiments.
Key Findings:
- Exons specific to alternative isoforms harbored a greater burden of genetic variation than reference exons.
- Approximately 80% of alternative transcripts containing disease-associated variants were not represented in current reference genome annotations.
- Computational analyses prioritized multiple missense variants predicted to alter protein structure or stability in alternative isoforms.
Interpretation:
Evaluating genetic variants within tissue-specific alternative transcript isoforms may identify protein-coding consequences that are overlooked when analyses rely exclusively on reference transcripts.
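The idea can be illustrated with a minimal sketch, which is not the pipeline used in the study. It assumes each transcript is reduced to a list of coding exon intervals on a single chromosome, and it flags positions that are coding in an alternative transcript but in no reference transcript. The transcript names, coordinates, and the helper functions `overlaps_coding` and `missed_by_reference` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Transcript:
    name: str
    is_reference: bool
    # Coding exon intervals as 1-based inclusive (start, end) pairs.
    # Hypothetical toy coordinates, not taken from any real annotation.
    coding_exons: Tuple[Tuple[int, int], ...]


def overlaps_coding(position: int, transcript: Transcript) -> bool:
    """Return True if the position lies inside any coding exon of the transcript."""
    return any(start <= position <= end for start, end in transcript.coding_exons)


def missed_by_reference(position: int, transcripts: List[Transcript]) -> List[str]:
    """Name the alternative transcripts in which the position is coding,
    provided that no reference transcript is coding at that position."""
    if any(t.is_reference and overlaps_coding(position, t) for t in transcripts):
        return []  # a reference-only analysis would already see this variant as coding
    return [
        t.name
        for t in transcripts
        if not t.is_reference and overlaps_coding(position, t)
    ]


# Toy gene model: the alternative transcript carries one extra exon (300-350).
transcripts = [
    Transcript("REF_TX", True, ((100, 200), (400, 500))),
    Transcript("ALT_TX_1", False, ((100, 200), (300, 350), (400, 500))),
]

print(missed_by_reference(320, transcripts))  # coding only in the alternative isoform
print(missed_by_reference(150, transcripts))  # coding in the reference transcript too
```

In this toy model, position 320 is intronic in the reference transcript and coding only in the alternative one, so a reference-only annotation would not report a protein-level consequence for a variant there. A real analysis would also need strand, reading frame, and tissue expression, none of which this sketch models.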
Limitations:
- Most variant effects were inferred from computational predictions rather than confirmed experimentally.
- Transcriptomic data were derived primarily from healthy tissues rather than disease-specific samples.
- The analysis focused largely on missense variants, and many alternative transcripts are predicted to be nonfunctional or to undergo nonsense-mediated decay.
Conclusion:
Assessment of both common and rare disease-associated variants in the context of isoform-specific effects will help explain genetic contributions to human disease.
Sources:
This content is an AI-generated, fully rewritten summary based on a published scholarly article. It does not reproduce the original text and is not a substitute for the original publication. Readers are encouraged to consult the source for full context, data, and methodology.