Every number here is computed, not guessed.
Two gauges sit on top of each compound: how well it's studied, and how the community reads it. This page is the whole machine behind them — the evidence ladder, the quality rubric, and the actual formulas. No vibes, no eyeballing. If you disagree with a score, you can check our working.
From a pile of papers to one honest read
Most of what a search returns isn't about a human who lifts. We screen it down before anything gets scored. Real example — Trenbolone:
That last number is why Trenbolone's Studies gauge lands mid-scale while its animal literature is huge: volume of non-human work never substitutes for human evidence. The formula below is built around exactly that principle.
Two gauges, two different questions
Studies — how studied it is
Confidence in the science. Driven by human-trial evidence and its quality. Animal and in-vitro data count, but are capped. Nothing here is sentiment.
Community — sentiment & popularity
How much the real world talks about it and how clearly opinion leans one way. Heavy discussion plus a clear consensus reads Strong, independent of whether the science exists.
The evidence ladder
Every source is weighted by method, not by how exciting the claim is. Human trials are the only tier that can push a score to the top; the rest is pooled and capped.
Then each study earns a quality score
1–5, on method. A q5 is worth roughly 1.75× a q4 and 28× a q1 — top-tier evidence is rewarded steeply and marginal evidence barely moves the needle (which also makes the score hard to inflate with weak papers).
q1 · 0.05 · Anecdote-grade, tiny or uncontrolled.
q2 · 0.15 · Weak design or very small sample.
q3 · 0.40 · Decent observational / solid animal.
q4 · 0.80 · Good controlled study.
q5 · 1.40 · Large RCT or strong meta-analysis.
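To check the ratios quoted above, here is a minimal sketch that stores the five weights and divides them. The weights come straight from the rubric; the names `QUALITY_WEIGHT` and `weight_ratio` are illustrative, not the site's own code.

```python
# Quality weights from the rubric above, keyed by quality score (q1..q5).
QUALITY_WEIGHT = {1: 0.05, 2: 0.15, 3: 0.40, 4: 0.80, 5: 1.40}


def weight_ratio(q_a: int, q_b: int) -> float:
    """Return how many times more a q_a study counts than a q_b study."""
    return QUALITY_WEIGHT[q_a] / QUALITY_WEIGHT[q_b]


if __name__ == "__main__":
    print(round(weight_ratio(5, 4), 2))  # q5 versus q4
    print(round(weight_ratio(5, 1), 2))  # q5 versus q1
```

The steep top end is what keeps weak papers from inflating a score: it takes 28 q1 studies to match a single q5.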
The Studies formula
Human evidence is summed linearly (it can always raise the score). Everything else is pooled and passed through a saturating cap, so a mountain of animal papers adds a bounded amount and no more. The total is squeezed into 0.10–0.98.
Labels: Thin < 0.40 · Moderate 0.40–0.65 · Strong ≥ 0.66.
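The shape described above can be sketched in a few lines. This is an illustration under stated assumptions, not the published formula: the exponential form of the cap and the constants `HUMAN_SCALE`, `NONHUMAN_CAP`, and `NONHUMAN_RATE` are hypothetical, and the "squeeze" is modeled as a simple clamp. Only the quality weights, the 0.10–0.98 range, and the label thresholds come from this page. Scores between 0.65 and 0.66 are treated as Moderate here.

```python
import math

# Quality weights from the rubric (q1..q5).
QUALITY_WEIGHT = {1: 0.05, 2: 0.15, 3: 0.40, 4: 0.80, 5: 1.40}

# Hypothetical constants, chosen only to make the sketch runnable.
HUMAN_SCALE = 0.10    # score added per unit of human quality weight
NONHUMAN_CAP = 0.25   # the most non-human evidence can ever add
NONHUMAN_RATE = 0.5   # how quickly the non-human pool saturates
FLOOR, CEILING = 0.10, 0.98  # range stated on this page


def studies_score(human_qualities, nonhuman_qualities):
    """Linear human term plus a saturating non-human term, clamped."""
    human = sum(QUALITY_WEIGHT[q] for q in human_qualities)
    pool = sum(QUALITY_WEIGHT[q] for q in nonhuman_qualities)
    # Saturating cap: approaches NONHUMAN_CAP but never exceeds it.
    capped = NONHUMAN_CAP * (1.0 - math.exp(-NONHUMAN_RATE * pool))
    raw = FLOOR + HUMAN_SCALE * human + capped
    return min(CEILING, max(FLOOR, raw))


def studies_label(score):
    """Map a score to the labels listed above."""
    if score < 0.40:
        return "Thin"
    if score < 0.66:
        return "Moderate"
    return "Strong"
```

With these assumed constants, 500 solid animal studies and no human trials still label as Thin, while a handful of good human trials moves the score past that bound. That is the behavior the prose describes, whatever the real constants are.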
The Community formula
Popularity leads: how much real reporting exists (named expert voices + structured cycle logs). A clear consensus — most voices leaning the same way — nudges it up; a genuine split holds it back.
A compound nobody logs sits near the floor. A widely-run compound with a clear community verdict (good or bad) reads Strong. This gauge measures signal, not endorsement: Trenbolone scores high because it's discussed constantly and consistently, even though most of that discussion is cautionary.
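One way to picture "popularity leads, consensus adjusts" is the sketch below. Everything numeric in it is an assumption: the saturating popularity curve, the constants `VOLUME_SCALE` and `CONSENSUS_SWING`, and the way consensus is measured are hypothetical stand-ins, not the site's formula. What it keeps from the text is the structure: volume comes from expert voices plus cycle logs, and the direction of the verdict does not matter, only how lopsided it is.

```python
import math

# Hypothetical constants for illustration only.
VOLUME_SCALE = 50.0     # reports needed before popularity starts to level off
CONSENSUS_SWING = 0.25  # share of the score that depends on consensus


def community_score(expert_voices, cycle_logs, leaning_for, leaning_against):
    """Popularity-led score in [0, 1], held back when opinion is split."""
    volume = expert_voices + cycle_logs
    popularity = 1.0 - math.exp(-volume / VOLUME_SCALE)

    voiced = leaning_for + leaning_against
    # 0.0 for an even split, 1.0 when every voice leans the same way.
    consensus = abs(leaning_for - leaning_against) / voiced if voiced else 0.0

    factor = (1.0 - CONSENSUS_SWING) + CONSENSUS_SWING * consensus
    return popularity * factor
```

Because consensus uses an absolute difference, a compound with 220 cautionary voices and 20 favorable ones scores the same as the reverse, which matches the "signal, not endorsement" reading.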
The parts that are just counting
The tallies under every scorecard are deterministic — no judgment involved.
And the needle is just geometry
A score s ∈ [0,1] becomes an angle. That's the whole trick behind the dial.
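As a sketch of that mapping, assume a half-circle dial that sweeps 180 degrees, with 0 degrees pointing straight up, so s = 0 points left and s = 1 points right. The sweep, the orientation, and the function names are assumptions for illustration; the page only states that a score in [0,1] becomes an angle.

```python
import math


def needle_angle_deg(s, sweep_deg=180.0):
    """Map a score in [0, 1] to degrees from vertical (negative is left)."""
    s = min(1.0, max(0.0, s))  # guard against out-of-range scores
    return (s - 0.5) * sweep_deg


def needle_tip(s, cx, cy, length):
    """Needle tip in screen coordinates, where y grows downward."""
    theta = math.radians(needle_angle_deg(s))
    return (cx + length * math.sin(theta), cy - length * math.cos(theta))
```

For example, `needle_tip(0.5, 100, 100, 80)` places the tip directly above the pivot at (100.0, 20.0), the midpoint of the dial.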
How the verdict gets written
The headline sentence on each page isn't a fourth score. It reconciles the two gauges with the plain yes/no of whether the compound works and with how sharp the risks are.
What we don't pretend
The score is only as clean as the tags
A study mislabeled "human, high-quality" would over-count. We audit for this; where a compound looks over-scored, the fix is re-tagging the data, not fudging the gauge.
Popularity ≠ truth
The Community gauge can read Strong for a compound the science barely covers. That's the point of keeping the two gauges separate instead of averaging them into mush.
No sample-size term (yet)
Where trial sizes aren't reliably captured, a 20-person study and a 2,000-person study can weigh the same within a quality tier. We'd rather admit that than fake precision.
We never invent a number
If the evidence isn't there, the gauge sits low and the page says "we don't know" — instead of manufacturing confidence to fill the space.
Now go read a compound — and check our working.
Every scorecard runs the exact math on this page. If a number looks wrong, tell us where.