OpenAge · Healome One 01 / 16

OpenAge.
An open-weight
biological aging clock
validated against
actual death.

"You can't run ML benchmarks
on most of biology.
You can on aging."

Nikhil Yadala · Healome One Inc.

github.com/Healome/OpenAge · OpenAgeAI.com

The same person, the same blood, the same day — different biological ages from different tests — Same person. Same blood. Same day.
Different tests. Different answers.

OpenAge · Healome One · AGI House SF · May 2026

OpenAge 02 / 16

10 years of aging research.
5 years of monthly bloods.
And I still can't tell you my biological age.


OpenAge 03 / 16

The longevity market hit $85B in 2025.
It runs on a measurement layer
that can't pass an ML smoke test.

$85B

Global longevity market, 2025

$120B

Projected by 2030
7% CAGR

$50K

Annual fee for a typical concierge longevity doctor

If your speedometer is broken,
the engine doesn't matter.

And without a working speedometer, you can't tune the engine. You can't design lab-validated protocols, dose interventions for optimal performance, or know which lever moved which marker. The whole optimization layer downstream collapses on top of a broken measurement.

Broken measurement chain: unreliable measurement → uncertain protocol → unknown outcomes — **The broken measurement chain.** Unreliable measurements propagate uncertainty through every downstream decision — the protocol, the supplement stack, the longevity doctor.


OpenAge 04 / 16

Seven modalities measure biological age.

Modality	Examples	Cost / sample	Test–retest	Interpretable?
Epigenetic (DNAm)	Horvath · GrimAge · DunedinPACE · PhenoAge	$300–500+	Poor	No
Proteomic (plasma)	Lehallier · SomaScan · Olink	$200–500	Good	Partial
Transcriptomic	Peters et al.	$$$	Moderate	Partial
Metabolomic	Hertel · MetaboAge	$$$	Moderate	Partial
Blood-clinical (CBC + CMP)	PhenoAge formula · BioAge · OpenAge	$10–50	High	Yes
Imaging	Brain-age · retinal-age	$$$$	Good	Partial
Functional / wearable	WHOOP age · grip-strength	$	High	Yes


OpenAge 05 / 16

Four failure modes.
Each one bigger than the last.

01

Interpretability
"A CpG site going from 0.61 to 0.67 methylated is not a treatment plan."
02

Actionability
"You can't run gradient descent on a thermometer."
03

Individual-level accuracy
"A population instrument designed for averages doesn't help you for n=1."
04

Latency
"You cannot do RL on aging if the reward signal arrives a year late."

The biological age iceberg: single number above water, undisclosed uncertainty below — **The undisclosed uncertainty.** The single number companies disclose sits atop an iceberg of hidden context: error range, confidence interval, hazard ratio, sensitivity to transient states, training data demographics. Each of the four failures lives inside that iceberg.


OpenAge 06 / 16

Ninety minutes of running
drops your biological age
by seven years.

GrimAge2 -7.07 years and FitAge -4.76 years after a single soccer game — **Brooke et al. 2025 · Aging Cell.** 19 professional soccer players. Saliva pre/post a single 90-minute game. GrimAge2: −7.07 yr [−32%], 95% CI [−10.32, −3.71]. FitAge: −4.76 yr [−18%], 95% CI [−7.10, −2.36]. Midfielders: median GrimAge2 dropped ~17.8 yr.

Five biological age tests on the same individual on the same day spanning 16 years — **The reproducibility crisis, in one chart.** Five commercial biological-age tests, same individual, same day. Estimates span 16 years (Test A: 31 yr · Test E: 47 yr). Higgins-Chen 2022 reports up to 9-year deviations between technical replicates of the same DNA sample.

Bryan Johnson's Rejuvenation Olympics ranks people on DunedinPACE. If that clock is as sensitive to acute exercise as GrimAge2 and FitAge were here, the leaderboard could reshuffle depending on whether you jogged before the blood draw.
The clock isn't measuring aging. It's measuring whether you exercised this morning.

Only GrimAge and DunedinPACE have credible cohort-level mortality validation; none are validated for individual-level decisions.

OpenAge 07 / 16

To know if an intervention worked,
you'd need a decade of monthly samples.

Statistical power as a function of measurement count: low-noise reaches 80% at ~32 samples, current clocks need ~125 — **The n=1 power problem.** (a) Statistical power vs. number of measurements. Low-noise clocks (SD = 1 yr, d = 1.0) reach 80% power at ~32 measurements. Current clocks (SD = 4 yr, d = 0.25) require ~125. (b) Required tracking time vs. protocol switch frequency — orders of magnitude apart.

Required measurements formula and clinical trial vs n=1 self-experiment comparison — **Cohen 1988, applied.** With current clocks (d = 0.25) → n ≈ 125 → 10.4 years of monthly testing per intervention. The clinical-trial design (100 people × 1 measurement) collapses to n=1 self-experimentation (1 person × 125 measurements).
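The arithmetic behind that count can be sketched with the usual normal approximation to Cohen's power calculation, n ≈ ((z_{1−α/2} + z_{1−β}) / d)², for a one-sample (within-person) comparison. The function name `required_measurements`, the two-sided α = 0.05, and the monthly cadence are assumptions made for illustration; other conventions (two-arm designs, exact t-based power) give somewhat different counts.

```python
import math
from statistics import NormalDist


def required_measurements(d, alpha=0.05, power=0.80):
    """Approximate n for a two-sided one-sample z-test at effect size d."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = std_normal.inv_cdf(power)           # about 0.84 for 80% power
    return math.ceil(((z_alpha + z_beta) / d) ** 2)


for d in (0.25, 0.5, 1.0):
    n = required_measurements(d)
    print(f"d = {d}: n ~ {n}, about {n / 12:.1f} years of monthly draws")
```

With d = 0.25 this approximation gives roughly 126 measurements, in line with the ~125 quoted above. Because n scales as 1/d², halving the noise cuts the required count by about four.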

Most of us switch protocols every 2–4 weeks.
The measurement is orders of magnitude slower than the experiment.
We're flying completely blind.

Source: Cohen, Statistical Power Analysis (1988)

OpenAge 08 / 16

The standard fix from $10K–$100K/yr longevity doctors
is mathematically wrong.

The fallacy: naive averaging

Take multiple clocks. Average them. Done.

Averaging cuts variance. It does nothing for bias.

If GrimAge over-estimates and your DunedinPACE sample was drawn while you had the flu, you converge to the wrong answer with high confidence.
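A minimal sketch of why averaging fails here, assuming each clock's error splits into a fixed bias plus independent zero-mean noise with the same spread. The bias and noise values are invented for illustration and are not measured properties of any clock.

```python
import math
from statistics import mean


def average_of_clocks(biases, noise_sd):
    """Expected error and standard error of the mean of several clock readings.

    Assumes reading = true age + fixed bias + independent noise (same SD for all).
    """
    k = len(biases)
    expected_error = mean(biases)              # averaging leaves the mean bias intact
    standard_error = noise_sd / math.sqrt(k)   # only the noise shrinks, as 1/sqrt(k)
    return expected_error, standard_error


# Hypothetical clocks that all read 4 years high, each with 4 years of noise.
for k in (1, 5, 25):
    err, se = average_of_clocks([4.0] * k, noise_sd=4.0)
    print(f"{k:>2} clocks: expected error {err:+.1f} yr, standard error {se:.2f} yr")
```

Under these assumptions the standard error falls from 4 yr to 0.8 yr as clocks are added, while the expected error stays at +4 yr: a tighter estimate of the wrong number.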

The fix: RANSAC clique consensus

RANSAC applied to biological age estimation: inlier clique identified, outliers discarded — **RANSAC for your body.** Five clocks. The clique average (34.9 yr) diverges from the naive average (35.3 yr) once outliers are discarded.

The intuition: two watches, one wrist

Two wearables: when concordant, trust the signal. When divergent (80 vs 120 bpm), discard — don't average to 100 — **The wearable principle.** Two watches agree → trust the reading. Two watches diverge (80 vs 120 bpm) → don't average to 100. Discard. Disagreement between instruments that should agree is itself evidence at least one is biased.
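One way to sketch the consensus idea for scalar estimates is shown below. Classic RANSAC samples random subsets to fit a model; with only a handful of numbers, the largest group of mutually agreeing estimates can be found exhaustively, so this sketch does that instead. The function name `clique_consensus`, the tolerance, and the readings are hypothetical and are not the values behind the chart.

```python
from statistics import mean


def clique_consensus(estimates, tol, min_support=2):
    """Average the largest group of estimates that all lie within `tol` of each other.

    Returns (consensus, members); consensus is None when fewer than
    `min_support` estimates agree, i.e. there is nothing to trust.
    """
    xs = sorted(estimates)
    best = []
    lo = 0
    for hi in range(len(xs)):
        # Shrink the window until its spread (max - min) fits the tolerance.
        while xs[hi] - xs[lo] > tol:
            lo += 1
        if hi - lo + 1 > len(best):
            best = xs[lo:hi + 1]
    if len(best) < min_support:
        return None, best
    return mean(best), best


# Five hypothetical clock readings (years): three agree, two are outliers.
readings = [34.0, 47.0, 35.0, 31.0, 36.0]
print("naive average:", mean(readings))
print("clique consensus:", clique_consensus(readings, tol=2.0))

# Two watches reading 80 and 120 bpm: no agreement, so no consensus is returned.
print("two watches:", clique_consensus([80, 120], tol=10))
```

Returning `None` when nothing agrees is a deliberate choice in this sketch: it mirrors the "discard, don't average" rule rather than forcing a number out of instruments that disagree.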

Outlier detection is a solved problem in computer vision.
We just don't apply it to ourselves.


OpenAge 09 / 16

Precision is not accuracy.

It's like averaging five broken compasses, each pointing in a slightly different wrong direction.

You get a precise heading.
Precisely wrong.

Three target diagrams: high precision low accuracy, low precision high accuracy, high precision and high accuracy — **Precision vs. accuracy.** (a) High precision, low accuracy: current aging-clock averaging. (b) Low precision, high accuracy. (c) The goal: high precision *and* high accuracy. Averaging biased instruments keeps the shared bias while shrinking the spread. More tests don't fix it; they make the wrong answer look more certain.

OpenAge · the bias accumulation problem

OpenAge 10 / 16

What a real aging benchmark looks like.

Spec	Almost every published clock	OpenAge
Public training data	✗	✓
Frozen public test split	✗	✓
Ground truth outside the model	✗ (chronological age)	✓ (deaths · CDC linked)
Open weights	✗	✓
Inference traces · audit trail	✗	✓ (notebooks + leaderboard)
Disease-specific stratification	✗	✓ (10 ICD causes)

Public data. Actual ground truth.
The easier setup — and the field still doesn't ship like this.

So we did. This is the rest of the talk.


OpenAge 11 / 16

OpenAge. 21 markers. $30 blood panel.
Validated against 5,805 actual deaths.

  Inputs:    21 markers (16 lab + 5 history)
             CBC + CMP + HbA1c

  Model:     GradientBoostingRegressor
             n_est 4000, depth 8
  Train:     ~50K NHANES records (2003–2020)
  Test:      frozen split (random_state=3454)

  MAE:       5.11 yr     R²:        0.906
  Pearson:   0.952       Concord.:  0.83
  HR / yr:   1.098 [95% CI 1.095–1.100]
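For readers who want to reproduce the three regression metrics on their own predictions, here is a standard-library sketch. It assumes the metrics compare predicted age against chronological age on the held-out split; the toy ages below are made up and are not NHANES records.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the inputs (years)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy example: chronological ages and hypothetical model predictions.
chrono = [30, 40, 50, 60, 70]
predicted = [34, 38, 53, 57, 72]
print(f"MAE {mae(chrono, predicted):.2f} yr, "
      f"Pearson {pearson_r(chrono, predicted):.3f}, "
      f"R2 {r_squared(chrono, predicted):.3f}")
```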

5,805

Linked deaths
in validation

+9.8%

Mortality risk
per yr bio-age

120→21

Pruned features
(plateau at 21)

The aging signal in blood is low-rank.
Trees beat every neural architecture I tried.

Feature importance distribution showing steep decay — **The minimal-clique thesis, visually.** Feature importances across the extended biomarker set before pruning. The steep decay beyond the top 5–6 features is evidence that the aging signal in blood is low-rank: angina history, glycohemoglobin, and arthritis history capture most of it; the rest is noise or redundancy.

Reproducible: random_state=3454 · github.com/Healome/OpenAge

OpenAge 12 / 16

Significant association with every major cause of death.

Disease-specific HRs (univariate Cox PH)

Cause of death	HR / yr	Conc.
Pneumonia / influenza	1.131	0.88
Kidney disease	1.125	0.87
Heart disease	1.114	0.86
Diabetes	1.107	0.84
Alzheimer's	1.095	0.83
Cancer	1.088	0.82
Stroke	1.036	0.64

Stroke is weakest — no BP or AFib in the panel. The model knows what it doesn't know.

OpenAge vs. PhenoAge

Clock	HR/yr	Conc.	Open
PhenoAge (DNAm)	1.071	0.86	Partial
OpenAge	1.098	0.83	Yes

Kaplan-Meier survival curves comparing biological vs chronological age — **Bio-age beats chrono-age as a survival predictor.** The bio-age curve (orange) sits below the chrono-age curve (blue) — OpenAge captures variance in survival time that chronological age does not. 41,823 obs · 5,805 CDC-linked deaths.

PhenoAge wins concordance — trained on a mortality phenotype, optimized for rank-ordering. OpenAge wins per-year HR — what one year of bio-age reduction is worth in mortality terms.
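To make the per-year HR concrete: in a Cox model that is log-linear in bio-age, hazards multiply, so a gap of Δ years corresponds to HR^Δ. The sketch below applies that to the two per-year HRs in the table. It assumes proportional hazards hold across the whole range and describes cohort-level association, not an individual prediction; the function name is illustrative.

```python
def relative_hazard(hr_per_year, delta_years):
    """Hazard multiplier for a bio-age gap of delta_years (log-linear Cox assumption)."""
    return hr_per_year ** delta_years


OPENAGE_HR = 1.098   # per-year HR quoted in the table above
PHENOAGE_HR = 1.071  # per-year HR quoted in the table above

for delta in (-5, 1, 5):
    print(f"{delta:+d} yr: OpenAge x{relative_hazard(OPENAGE_HR, delta):.2f}, "
          f"PhenoAge x{relative_hazard(PHENOAGE_HR, delta):.2f}")
```

Under these assumptions, five extra years of OpenAge bio-age corresponds to roughly a 1.6-fold hazard, and five fewer to roughly 0.63.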

Lab-only sensitivity: heart HR 1.114 → 1.107; cancer HR 1.088 → 1.076

OpenAge 13 / 16

Blood-clinical brings the loop
from a year to a few months.

Kaplan-Meier survival curves for accelerated vs decelerated aging — **The intervention target, made visible.** Accelerated (blue · bio-age ≥ chrono + 5 yr) vs. decelerated (orange · bio-age ≤ chrono − 5 yr). Curves diverge sharply from age 40. Every intervention that reduces bio-age moves an individual from blue toward orange.

From a decade of monthly testing
to a few months.

The closed loop

blood draw → OpenAge + marker decomposition → intervention → 8 weeks → redraw → which markers moved?
measure → intervene → re-measure
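The "which markers moved?" step can be sketched as a plain before/after comparison of two panels. This is only a raw per-marker diff, not OpenAge's marker decomposition (which the slides do not specify), and it ignores assay noise. The marker names and values are hypothetical.

```python
def marker_changes(before, after):
    """Return (marker, absolute change, percent change), largest relative move first.

    Only markers present in both panels with a non-zero baseline are compared.
    """
    changes = []
    for marker in before.keys() & after.keys():
        old, new = before[marker], after[marker]
        if old == 0:
            continue  # percent change is undefined for a zero baseline
        changes.append((marker, new - old, (new - old) / old * 100))
    return sorted(changes, key=lambda c: abs(c[2]), reverse=True)


# Hypothetical panels drawn 8 weeks apart.
baseline = {"hba1c": 5.6, "albumin": 4.4, "creatinine": 0.95}
redraw = {"hba1c": 5.3, "albumin": 4.5, "creatinine": 0.95}

for marker, delta, pct in marker_changes(baseline, redraw):
    print(f"{marker:<12} {delta:+.2f} ({pct:+.1f}%)")
```

A real loop would also need each marker's test-retest variability to decide whether a move is larger than noise.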

OpenAge v2 — in training

Targeting MAE ≈ 1–2 yr (down from 5.11 yr in v1) by training on longitudinal repeats and a richer feature set.

A smaller MAE means a smaller noise floor in the n=1 math: detecting a 1-year change goes from d = 0.25 to d ≈ 0.5–1.0, and the required samples drop from ~125 to ~16–32.

  MAE 5.11 yr → ~1–2 yr
  n = 125     → ~16–32
  decade      → a few months


OpenAge 14 / 16

OpenAge v3.
Eight measurements to know if an intervention worked.

7,000–11,000

Plasma proteins + metabolites per draw
Olink Explore HT · SomaScan v5 · Nightingale · Metabolon

$2K → $200

Per-sample cost trajectory
2022 → 2026

2–4 wks

Intervention shifts the proteomic + metabolomic signature
(half-lives: hours to days)

Same Cohen 1988 power calculation, three measurement layers.
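Read in the other direction, the same power calculation says how clean the measurement must be for a given number of draws. The sketch below inverts the one-sample normal approximation to get the smallest detectable effect size d for n measurements. The two-sided α = 0.05, 80% power, the function names, and the fortnightly cadence in the example are assumptions.

```python
import math
from statistics import NormalDist


def detectable_effect(n, alpha=0.05, power=0.80):
    """Smallest standardized effect d detectable with n measurements (z-approximation)."""
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    return z_sum / math.sqrt(n)


def tracking_weeks(n, weeks_between_draws):
    """Rough calendar time needed to collect n draws at a fixed cadence."""
    return n * weeks_between_draws


for n in (125, 32, 8):
    print(f"n = {n:>3}: detectable d ~ {detectable_effect(n):.2f}")

print("8 fortnightly draws take about", tracking_weeks(8, 2), "weeks")
```

Under this approximation, eight measurements suffice only when the effect is about as large as the measurement noise (d close to 1), which is the bar a proteomic and metabolomic layer would have to clear.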


OpenAge 15 / 16

How do you do RLHF on aging?

The shift

Population statistics → individual closed-loop optimization.

The same shift NLP made when it went from corpus-level perplexity to per-conversation RLHF.

The bottleneck

Not the optimization. The measurement layer.

Plasma proteomics + metabolomics at fortnightly cadence is what closes the RL loop on human aging. Reward signal in weeks, not years.

The blocker isn't the optimization stack — you build that for a living.
The blocker is that the reward signal arrives a year late.

The three-stage evolution of biological age measurement: today, near future, end state — **Three stages.** Today: opaque single numbers ("Your bio age is 34"). Near future: transparent contextualised estimates ("34 ± 4.2 yr"). End state: per-patient titration with clinical validation, real-time feedback, open source. The thing we're building is the bridge between stage 1 and stage 3.


OpenAge 16 / 16

Star the GitHub repo. Start building.
Test the performance of models on your existing data.

For builders

github.com/Healome/OpenAge

Beat the leaderboard. 21 features, frozen test split. Drop in your model. Add a clock — GrimAge, DunedinPACE, your own. Same eval harness.


For users

OpenAgeAI.com

Upload your last blood panel. See your OpenAge, the marker decomposition, and which physiological systems are accelerating.


The OpenAge v3 cohort.

If you're on a peptide stack, hormone protocol, or supplement regimen — and you want to know what's actually working — we're recruiting now.

You stop being n=1 and start being a labeled training example for the proteomic + metabolomic clock.

nikhil@OpenAgeAI.com

OpenAge · Nikhil Yadala · nikhil@OpenAgeAI.com AGI House SF · May 2026
