## From HK to LMM using 13K traits
http://sparql-test.genenetwork.org/sparql/
SPARQL input: https://docs.google.com/spreadsheets/d/1Tkl9vA3JlsjtkE7LtQxoOiWXvpboUd5_/edit?gid=541315326#gid=541315326
Results on GO for aging here: https://drive.google.com/drive/folders/1FKcEUiClaUr2YtqW_nqNDeKQuEB3pbXh?usp=drive_link
### 2333 new QTLs from HK to LMM mapping using 13K traits
**Method**: QTL data was loaded from tab-separated format and subjected to systematic preprocessing. Genomic coordinates (chromosome, start position, stop position), LOD scores, and publication years were converted to numeric format with error handling for malformed entries.
QTL interval width was calculated as the difference between stop and start positions (both provided in megabases), yielding width measurements in Mb units.
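The preprocessing described above can be sketched roughly as follows. This is a minimal standard-library illustration, not the original pipeline: the column names follow the aging QTL table further down, the helper names `to_number` and `load_qtls` are hypothetical, and trait `99999` is an invented malformed row used only to show the error handling.

```python
import csv
import io

# Columns converted to numbers, as listed in the method description.
NUMERIC_COLUMNS = ("chr", "start", "stop", "lod", "year")


def to_number(text):
    """Return float(text), or None when the entry is missing or malformed."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def load_qtls(handle):
    """Parse tab-separated QTL rows and add a width_mb field (stop - start, in Mb)."""
    rows = []
    for record in csv.DictReader(handle, delimiter="\t"):
        for column in NUMERIC_COLUMNS:
            record[column] = to_number(record.get(column))
        start, stop = record["start"], record["stop"]
        # Width is undefined when either coordinate failed to parse.
        record["width_mb"] = None if start is None or stop is None else stop - start
        rows.append(record)
    return rows


# Two rows from the aging table plus one invented malformed row (trait 99999).
sample = (
    "trait\tchr\tstart\tstop\tlod\tyear\n"
    "12684\t7\t55.6535\t68.5073\t6.3\t2011\n"
    "19423\t16\t79.1753\t86.3478\t4.1\t2018\n"
    "99999\t7\tNA\t105.552\t4.5\t2015\n"
)
qtls = load_qtls(io.StringIO(sample))
for qtl in qtls:
    print(qtl["trait"], qtl["width_mb"])
```

Malformed entries become `None` rather than raising, so a single bad row does not stop the load; such rows can be counted and dropped before computing summary statistics.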
**Result 1.** **Overview of 2,333 new QTLs**
This figure summarizes the overall quantitative trait locus (QTL) architecture derived from re-mapping mouse phenotypes using a linear mixed model (LMM) approach. It highlights the large-scale improvement in QTL discovery and characterization compared to the previous Haley–Knott (HK) method.

**A.** Most loci cluster near the significance threshold, with a median LOD of 4.3, consistent with moderate-to-strong trait associations.
**B.** Shows the temporal trend of QTL discovery from 1980 to present. The red dashed line marks 2015, the transition point to LMM-based analysis. Post-LMM adoption, there was a substantial rise in QTL identification, indicating improved sensitivity of the mixed-model method.
**C.** Summarizes genomic resolution improvements. Median QTL width is ~1.0 Mb, indicating finer mapping precision under LMM. Most intervals are narrow enough to nominate specific candidate genes: a 1.0 Mb interval typically contains only 10-25 genes.
**D.** Shows that QTLs are not uniformly distributed. Chromosome 7 exhibits the highest density, followed by chromosomes 1, 9, and 14, potentially reflecting biologically relevant loci or gene density patterns.
**E.** Boxplots reveal median and variance in LOD scores across chromosomes. Variability suggests differential statistical power and genetic contribution by chromosome.
**F.** Presents a genome-wide QTL density heatmap (10 Mb bins), highlighting chromosomal “hotspots” of trait associations. The densest clusters occur on chromosomes 6–9 and around the 90–110 Mb region on Chr7.
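A 10 Mb binned density like the one in panel F could be computed along these lines. This is only a sketch: the notes do not say how an interval is assigned to a bin, so binning on the interval midpoint is an assumption, and `bin_counts` is a hypothetical helper name. The input tuples are the six aging QTLs listed later in these notes.

```python
from collections import Counter

BIN_SIZE_MB = 10.0


def bin_counts(qtls, bin_size_mb=BIN_SIZE_MB):
    """Count QTLs per (chromosome, bin index), assigning each QTL by its midpoint."""
    counts = Counter()
    for chrom, start_mb, stop_mb in qtls:
        midpoint = (start_mb + stop_mb) / 2.0
        counts[(chrom, int(midpoint // bin_size_mb))] += 1
    return counts


# (chromosome, start in Mb, stop in Mb) for the six aging QTLs.
aging_qtls = [
    (7, 55.6535, 68.5073),
    (1, 22.8525, 22.8525),
    (7, 105.552, 105.552),
    (16, 79.1753, 86.3478),
    (16, 3.5, 5.88889),
    (6, 125.923, 125.923),
]
density = bin_counts(aging_qtls)
for (chrom, bin_index), n in sorted(density.items()):
    lo = bin_index * BIN_SIZE_MB
    print(f"chr{chrom} {lo:.0f}-{lo + BIN_SIZE_MB:.0f} Mb: {n}")
```

An alternative would be to count every bin an interval overlaps, which inflates the contribution of wide QTLs; the midpoint rule counts each QTL exactly once.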
**Result 2.** **Biological system analyses on 2,333 QTLs**

This figure categorizes the identified QTLs by biological system.
**A.** Quantifies QTLs per system. Central nervous system (CNS), immune/infectious, and metagenomic traits dominate, collectively accounting for >75% of QTLs. Smaller categories (metabolic, musculoskeletal, cardiovascular, aging, etc.) indicate diverse physiological representation.
**B.** Shows the range and median LOD scores per system. All systems meet the significance threshold (LOD = 5).
**C.** Stacked bars display how QTLs distribute across 19 chromosomes. Chromosome 7 is a massive hotspot, dominated by infectious disease and immune system traits.
**D.** Zooms into the 227 "Other" QTLs, revealing 146 minor biological systems, such as aging, reproduction, eye, and mitochondrial systems. Aging-related QTLs (n = 8) form one of these categories and were selected for further analysis.
**Result 3. Focusing on aging/longevity QTLs: biological enrichment (finding genes/pathways in the QTL intervals)**
| trait | chr | start | stop | lod | year | submitter | callret-7 | url | descr |
|--------:|------:|---------:|----------:|------:|-------:|:------------|:------------|:---------------------------------------------------------------------|:------------------------------------------------------------------------------------|
| 12684 | 7 | 55.6535 | 68.5073 | 6.3 | 2011 | robwilliams | Lu L | https://genenetwork.org/show_trait?trait_id=12684&dataset=BXDPublish | Survival |
| 17468 | 1 | 22.8525 | 22.8525 | 4 | 2015 | robwilliams | Auwerx J | https://genenetwork.org/show_trait?trait_id=17468&dataset=BXDPublish | Aging, metabolism |
| 17469 | 7 | 105.552 | 105.552 | 4.5 | 2015 | robwilliams | Auwerx J | https://genenetwork.org/show_trait?trait_id=17469&dataset=BXDPublish | Aging, metabolism |
| 19423 | 16 | 79.1753 | 86.3478 | 4.1 | 2018 | sroy12 | Auwerx J | https://genenetwork.org/show_trait?trait_id=19423&dataset=BXDPublish | Aging, metabolism |
| 19423 | 16 | 3.5 | 5.88889 | 4 | 2018 | sroy12 | Auwerx J | https://genenetwork.org/show_trait?trait_id=19423&dataset=BXDPublish | Aging, metabolism |
| 24754 | 6 | 125.923 | 125.923 | 4.4 | 2024 | sroy12 | Mozhui K | https://genenetwork.org/show_trait?trait_id=24754&dataset=BXDPublish | Aging, metabolism |

This figure presents the results of a genome-wide QTL enrichment study focused on QTLs associated with aging and longevity traits.
**Method**: I extracted the traits matching the terms aging, longevity, and lifespan. Genes located within aging-associated QTL intervals were extracted using the Ensembl BioMart API (pybiomart package), querying the mouse reference genome GRCm38 (mm10). For each QTL region, genomic coordinates were converted from megabases to base pairs, and all genes within the interval boundaries were retrieved.
The following gene attributes were collected: official gene symbols, Ensembl gene IDs, chromosomal positions, gene biotypes (protein-coding, lncRNA, snoRNA, etc.), and functional descriptions. Given the functional nature of Gene Ontology enrichment analysis, protein-coding genes were prioritized for downstream analysis, though all gene biotypes were catalogued for completeness.
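The interval lookup could look roughly like the sketch below. The helper names `qtl_to_bp` and `genes_in_interval` are hypothetical. The archive host is an assumption: GRCm38 is no longer the default mouse assembly in Ensembl, so the query is pointed at the November 2020 archive (release 102), which should be checked against the release actually used. The attribute and filter names are standard BioMart names for the mouse gene dataset.

```python
def qtl_to_bp(chrom, start_mb, stop_mb):
    """Convert a QTL interval from megabases to integer base-pair coordinates."""
    return str(chrom), int(round(start_mb * 1e6)), int(round(stop_mb * 1e6))


def genes_in_interval(chrom, start_mb, stop_mb,
                      host="http://nov2020.archive.ensembl.org"):
    """Return a DataFrame of genes inside one QTL interval.

    Requires the pybiomart package and network access; the import is kept
    inside the function so the coordinate helper works without either.
    The default host is an assumed GRCm38 archive.
    """
    from pybiomart import Dataset

    chrom, start_bp, stop_bp = qtl_to_bp(chrom, start_mb, stop_mb)
    dataset = Dataset(name="mmusculus_gene_ensembl", host=host)
    return dataset.query(
        attributes=[
            "external_gene_name",   # official gene symbol
            "ensembl_gene_id",
            "chromosome_name",
            "start_position",
            "end_position",
            "gene_biotype",         # protein_coding, lncRNA, snoRNA, ...
            "description",
        ],
        filters={
            "chromosome_name": [chrom],
            "start": [start_bp],
            "end": [stop_bp],
        },
    )


# Coordinates for trait 12684 (chr7: 55.6535-68.5073 Mb).
print(qtl_to_bp(7, 55.6535, 68.5073))
```

Calling `genes_in_interval(7, 55.6535, 68.5073)` would then return one row per gene, which can be filtered on the biotype column to keep protein-coding genes. Note that several QTLs in the table have identical start and stop positions, so their query window is a single base pair and can only return a gene that spans that position.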
Enrichment was calculated by comparing the observed gene overlap with each GO term against the expected overlap based on genome-wide gene distributions. Terms with FDR-adjusted p-values below 0.05 were considered statistically significant. Additional metrics computed included gene ratio (proportion of input genes in each term), fold enrichment (observed/expected ratio), and -log₁₀(p-value) for visualization purposes.
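The enrichment metrics above can be illustrated with a small standard-library sketch. The notes do not name the statistical test, so the one-sided hypergeometric test and the Benjamini-Hochberg adjustment used here are assumptions. The background size and the per-term gene counts are invented for illustration and are not results from this analysis; only the input size of 104 protein-coding genes comes from the figure description.

```python
from math import comb, log10


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n input genes are drawn from N background genes, K of them in the term."""
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enrich(terms, n_input, n_background):
    """terms maps a GO term name to (k input genes in term, K background genes in term)."""
    names = list(terms)
    pvals = [hypergeom_pvalue(terms[t][0], n_input, terms[t][1], n_background)
             for t in names]
    fdrs = benjamini_hochberg(pvals)
    results = []
    for name, p, fdr in zip(names, pvals, fdrs):
        k, K = terms[name]
        results.append({
            "term": name,
            "gene_ratio": k / n_input,
            "fold_enrichment": (k / n_input) / (K / n_background),
            "p_value": p,
            "fdr": fdr,
            "neg_log10_p": -log10(p) if p > 0 else float("inf"),
            "significant": fdr < 0.05,
        })
    return results


# Invented counts: 104 input genes, a hypothetical background of 20,000 genes.
toy_terms = {
    "GABA receptor activity": (5, 30),
    "hypothetical broad term": (6, 1000),
}
results = enrich(toy_terms, n_input=104, n_background=20000)
for row in results:
    print(row["term"], round(row["fold_enrichment"], 1), row["significant"])
```

Fold enrichment here is the gene ratio divided by the background ratio, which matches the observed/expected definition in the method text.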
**Results:**
**A.** Normalized density plot showing aging-related QTLs primarily on chromosomes 1, 6, 7, and 16. Chromosomes 7 and 16 show the highest QTL density (~1.0 normalized), suggesting these regions are hotspots for aging-related genetic variants.
**B.** Displays the relationship between QTL width (in megabases) and LOD (Logarithm of Odds) scores. Most QTLs are narrow (0-8 Mb) with LOD scores between 4.0-6.5. One notable outlier at ~12 Mb suggests a broader genomic region associated with aging.
**C–E.** GO terms by ontology: aging QTL genes are mainly enriched in molecular function categories (30 terms). The conspicuous absence of Biological Process terms suggests the genes are functionally specific. GABA and neurotransmitter-related pathways are recurrently enriched, supporting aging relevance.
**D.** Horizontal bars showing statistical significance (-log₁₀ FDR p-value). Top enriched terms include GABA receptor activity and chloride channel complex, highlighting neuronal signaling processes. All terms show strong statistical significance (p-values ~0.003-0.01).
**F.** Bar chart showing functional annotation of aging QTL genes: predominantly lncRNA (230) and snoRNA (217), with 104 protein-coding genes, suggesting extensive regulatory complexity.
**G.** Shows protein-coding gene counts for individual QTL loci.
The color gradient represents LOD score (warmer = higher). Two QTLs (GN_19423) are gene-rich, with 52 protein-coding genes each and high LOD (~6.0). GN_12684 contains 50 genes with medium-high LOD (~5.5).
Some QTLs (GN_24754, GN_17469, GN_17468) are gene-poor (0-1 genes).
**H.** Of the 116 protein-coding genes, I examined 26 with reported links to aging. The panel shows the distribution of these 26 genes. Highlighted genes align with neurotransmission and synaptic plasticity, consistent with the GABAergic theme. The aging-related protein-coding genes are detailed below.
| Gene | Evidence Strength | Summary | Model / Context | Key Sources |
|------|------------------|---------|----------------|-------------|
| Igf1r | Strong | Reduced IGF-1 signaling modulates lifespan across species | Worms→mice; human genetics | Reviews, experimental studies |
| Mef2a | Moderate | Delays endothelial senescence via SIRT1 activation | Vascular endothelium | Aging-US, HAGR |
| Pcsk6 | Moderate | Deficiency promotes cardiomyocyte senescence | Mouse heart | MDPI Genes |
| Selenos | Moderate | Selenoproteins regulate redox/ER stress in aging | Brain, hematopoietic | Frontiers, Blood 2025 |
| Adamts5 | Moderate | Overactivation drives age-related cartilage degeneration | Cartilage, OA | Frontiers Mol Biosci |
| Rbfox1 | Moderate | Decline linked to cognitive aging and amyloid burden | Human brain | JAMA Neurology |
| Trap1 | Moderate | KO mice show reduced age pathologies; mitochondrial role | Mouse | Cell Reports |
| Crebbp | Moderate | Histone acetylation and chromatin aging regulator | Various | Reviews |
| Slx4 | Moderate | DNA repair scaffold; Fanconi anemia premature aging link | DNA repair | Blood, Mol Cell |
| Mgrn1 | Moderate | Proteasome stress, age-dependent translocation | Neuronal | Mol Cell |
| Glis2 | Moderate | Loss causes epithelial senescence, kidney fibrosis | Mouse kidney | Nat Genet, Kidney Int |
| Hmox2 | Weak–Moderate | Redox balance, neuroprotection in aging | Brain | Frontiers 2024 |
| Tfap4 | Weak–Moderate | Regulates Myc–senescence axis | Cancer cells | Nat Cell Death Diff |
| Coro7 | Moderate (evolutionary) | Knockdown extends C. elegans lifespan | C. elegans | Front Genet |
| Ano2 | Weak–Moderate | Olfactory aging marker linked to mortality risk | Olfaction | Front Aging, JAMA |
| Gabra5 | Moderate | Expression declines with age; modulation reverses atrophy | Brain | Cereb Cortex 2020 |
| Ube3a | Moderate | Declines 50–80% with aging; affects proteostasis | Brain | Front Aging Neurosci |
| Chrna7 | Weak–Moderate | KO alters aging transcriptome, cognition | Mouse brain | Age 2010 |
| Tjp1 | Moderate | Tight-junction gene declines with age | Choroid plexus | Fluids & Barriers CNS |
| Chsy1 | Weak–Moderate | Cartilage ECM aging link | Cartilage | OA reviews |
| Apba2 | Weak–Moderate | APP interaction; AD risk | Brain | AD genetics |
| Fan1 | Moderate | DNA repair; DDR-aging intersection | Neuro, DDR | Sci Adv, bioRxiv |
| Trpm1 | Weak–Moderate | Retinal aging transcriptomic change | Retina | Front Mol Neurosci |
| Mcee | Weak | Propionate metabolism, aging metabolome link | Metabolism | Trends Endocrinol Metab |
| Lrrk1 | Moderate | Loss → age-dependent dopaminergic degeneration | Brain | J Neurosci 2022 |
| Aldh1a3 | Moderate | Knockdown accelerates senescence/SASP | Cancer, senescence | Cancers (Basel) 2024 |
**Next steps:**
1. The SPARQL database contains all the SNPs, so we can also query the SNPs that fall under each QTL.
2. Cross-species comparison (mouse QTLs → human disease relevance), synteny?