Evidence Grading Methodology

How we evaluate and grade supplement research.

Every ingredient-condition claim on our sites receives a letter grade (A, B, C, D, or F) based on the totality of available peer-reviewed evidence. This page explains exactly how we arrive at each grade, so that the methodology is transparent and reproducible.

Grade Definitions

Grade A: Strong Evidence

Multiple high-quality randomized controlled trials (RCTs) or meta-analyses with consistent positive results. Combined sample size is at least 500 participants across 5 or more independent studies.

Example: Melatonin for sleep onset latency — supported by 20+ RCTs with consistent results.

Grade B: Moderate Evidence

At least one well-designed RCT showing positive results, supported by additional studies. Results are mostly consistent across the evidence base, with adequate combined sample sizes.

Example: Magnesium for sleep quality — supported by several RCTs with mostly positive outcomes.

Grade C: Limited Evidence

Preliminary positive findings from small RCTs, observational studies, or mixed results across studies. Evidence is promising but insufficient to draw firm conclusions.

Example: L-theanine for sleep — small positive studies exist but larger trials are needed.

Grade D: Preliminary Evidence

Only in vitro, animal, case report, or pilot study data are available, or human studies exist but their results are inconsistent or inconclusive. More research is needed.

Example: Ashwagandha for hair growth — limited to preclinical and small pilot studies.

Grade F: Negative Evidence

Fewer than 30% of studies show positive effects, with two or more studies available. The weight of evidence suggests the ingredient does not provide the claimed benefit, or may cause harm.

Example: An ingredient where multiple RCTs show no benefit over placebo.

Scoring Algorithm

The evidence grade is calculated from four independent scoring dimensions, each contributing to a cumulative score that maps to a letter grade.

Dimension	Score Range	Description
Study Type Quality	0–4	Highest quality study type: Meta-Analysis (4), RCT (3), CCT/Cohort (2), Observational (1), In Vitro/Review (0)
Consistency	-1 to +1	>70% positive results: +1, <30% positive: -1, otherwise 0
Sample Size	-1 to +1	Total participants: ≥500 (+1), ≥100 (0), <100 (-1)
Study Count	-1 to +1	Number of studies: ≥5 (+1), ≥2 (0), <2 (-1)

Final grade mapping: Score ≥6 → A, ≥4 → B, ≥2 → C, ≥0 → D. The grade is forced to F, regardless of score, when fewer than 30% of studies are positive and at least 2 studies are available.
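The scoring table and grade mapping above can be sketched in Python as follows. This is a minimal illustration, not our production code: the `EvidenceSummary` container, the study-type keys, and the function names are hypothetical. The mapping does not say what happens to cumulative scores below 0, so the sketch assumes they also map to D.

```python
from dataclasses import dataclass

# Study type scores taken from the table above; the key names are illustrative.
STUDY_TYPE_SCORES = {
    "meta_analysis": 4,
    "rct": 3,
    "cct_cohort": 2,
    "observational": 1,
    "in_vitro_review": 0,
}


@dataclass
class EvidenceSummary:
    """Hypothetical container for one ingredient-condition pair."""
    best_study_type: str      # highest-quality study type found
    positive_fraction: float  # share of studies with positive results (0.0 to 1.0)
    total_participants: int   # combined sample size across studies
    study_count: int          # number of independent studies


def consistency_score(positive_fraction: float) -> int:
    if positive_fraction > 0.70:
        return 1
    if positive_fraction < 0.30:
        return -1
    return 0


def sample_size_score(total_participants: int) -> int:
    if total_participants >= 500:
        return 1
    if total_participants >= 100:
        return 0
    return -1


def study_count_score(study_count: int) -> int:
    if study_count >= 5:
        return 1
    if study_count >= 2:
        return 0
    return -1


def total_score(e: EvidenceSummary) -> int:
    """Sum of the four dimensions; ranges from -3 to 7."""
    return (
        STUDY_TYPE_SCORES[e.best_study_type]
        + consistency_score(e.positive_fraction)
        + sample_size_score(e.total_participants)
        + study_count_score(e.study_count)
    )


def grade(e: EvidenceSummary) -> str:
    # Forced F: fewer than 30% positive with at least 2 studies.
    if e.positive_fraction < 0.30 and e.study_count >= 2:
        return "F"
    score = total_score(e)
    if score >= 6:
        return "A"
    if score >= 4:
        return "B"
    if score >= 2:
        return "C"
    # ASSUMPTION: scores below 0 are not covered by the stated mapping;
    # this sketch treats D as the floor.
    return "D"
```

For instance, a pair whose best study is a meta-analysis, with 80% positive results, 1,200 participants, and 8 studies scores 4 + 1 + 1 + 1 = 7 and receives an A under this sketch, while a pair with 20% positive results across 4 studies is forced to F whatever its score.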

Research Process

Systematic Search

Identify relevant research from PubMed

For each ingredient-condition pair, we conduct systematic PubMed searches using MeSH terms and title/abstract keywords. We prioritize randomized controlled trials (RCTs), meta-analyses, and systematic reviews, while also including observational studies and pilot trials for emerging evidence.
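As a rough illustration of such a search, the sketch below builds a PubMed query string that combines the `[MeSH Terms]` and `[Title/Abstract]` field tags, then wraps it in a URL for NCBI's E-utilities ESearch endpoint. The function names and the exact query shape are assumptions made for this example; our actual search strings are more detailed, and a given keyword is not guaranteed to be a valid MeSH heading.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_term(ingredient: str, condition: str) -> str:
    """Match each concept as a MeSH term or as a title/abstract keyword."""
    def either(term: str) -> str:
        return f'("{term}"[MeSH Terms] OR "{term}"[Title/Abstract])'

    return f"{either(ingredient)} AND {either(condition)}"


def build_esearch_url(ingredient: str, condition: str, retmax: int = 100) -> str:
    """Return an ESearch URL; this sketch only builds it and sends no request."""
    params = {
        "db": "pubmed",
        "term": build_pubmed_term(ingredient, condition),
        "retmode": "json",
        "retmax": retmax,
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_pubmed_term("melatonin", "sleep latency"))
    print(build_esearch_url("melatonin", "sleep latency"))
```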

Paper Screening

Filter for relevance and quality

Retrieved papers are screened for relevance to the specific ingredient-condition relationship. We filter by study type (prioritizing interventional over observational), population (human studies preferred), and publication quality (peer-reviewed journals only).
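The screening rules can be pictured with a short sketch like the one below. The `Paper` fields, the set of interventional study types, and the `screen` function are hypothetical names chosen for this example. It treats peer review and relevance as hard filters and treats "human" and "interventional" as ordering preferences, which is one reading of the description above.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    """Hypothetical screening record for one retrieved paper."""
    title: str
    study_type: str      # e.g. "rct", "meta_analysis", "observational"
    is_human: bool       # True for human studies
    peer_reviewed: bool  # True if published in a peer-reviewed journal
    mentions_pair: bool  # True if relevant to the ingredient-condition pair


# ASSUMPTION: which study types count as interventional in this sketch.
INTERVENTIONAL = {"meta_analysis", "rct", "cct"}


def screen(papers: list[Paper]) -> list[Paper]:
    """Drop irrelevant or non-peer-reviewed papers, then rank the rest."""
    kept = [p for p in papers if p.peer_reviewed and p.mentions_pair]
    # False sorts before True, so human and interventional papers come first.
    return sorted(
        kept,
        key=lambda p: (not p.is_human, p.study_type not in INTERVENTIONAL),
    )
```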

PICO Extraction

Extract structured study data

From each included study, we extract structured PICO data: Population (sample size, demographics), Intervention (substance, dosage, duration, form), Comparison (placebo or active comparator), and Outcome (primary endpoint, effect size, statistical significance). AI-assisted extraction is validated against source text.
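A structured PICO record might look like the sketch below. The field names are illustrative rather than our actual schema, and the quote check is only one simple way extraction could be validated against source text: it confirms that a supporting quote attached to the record appears in the paper after normalizing case and whitespace.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PICORecord:
    """Hypothetical structured record for one included study."""
    # Population
    sample_size: int
    demographics: str
    # Intervention
    substance: str
    dosage: str
    duration: str
    form: str
    # Comparison
    comparison: str  # e.g. "placebo" or an active comparator
    # Outcome
    primary_endpoint: str
    effect_size: Optional[float]
    p_value: Optional[float]
    # Passage from the paper that the extraction is based on
    supporting_quote: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line breaks do not cause mismatches."""
    return " ".join(text.lower().split())


def quote_found_in_source(record: PICORecord, source_text: str) -> bool:
    """Return True if the record's supporting quote occurs in the source text."""
    return normalize(record.supporting_quote) in normalize(source_text)
```

A record whose quote cannot be found in the source would be flagged for manual review under this approach.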

Evidence Grading

Calculate algorithmic grade (A-F)

Our grading algorithm scores each ingredient-condition pair based on four dimensions: (1) highest study type quality, (2) consistency of positive results across studies, (3) total combined sample size, and (4) number of independent studies. The final score maps to a letter grade from A (Strong) to F (Negative).

Publication

Review and publish evidence summaries

Generated evidence summaries undergo compliance review for FDA/FTC adherence. All language uses structure/function claims only. Evidence grades are recalculated automatically when new research is added to the database, ensuring grades reflect the most current body of evidence.

Data Sources

PubMed / MEDLINE — The U.S. National Library of Medicine's database of biomedical literature, containing over 36 million citations. Our primary source for identifying relevant clinical trials and systematic reviews.

PubMed Central (PMC) — Full-text archive of biomedical journal articles. We import CC-BY licensed full-text papers via JATS XML, preserving sections, figures, tables, and reference lists for comprehensive analysis.
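To give a sense of what a JATS XML import involves, the sketch below pulls section titles and paragraph text out of an article body using Python's standard `xml.etree.ElementTree`. It is a simplified illustration with a hypothetical `extract_sections` function, not our import pipeline: it ignores figures, tables, and reference lists, and it only reads a title's leading plain text, so titles that start with inline markup come back truncated.

```python
import xml.etree.ElementTree as ET


def extract_sections(jats_xml: str) -> list[tuple[str, list[str]]]:
    """Return (section title, paragraphs) pairs for every <sec> element."""
    root = ET.fromstring(jats_xml)
    sections = []
    for sec in root.iter("sec"):
        title = (sec.findtext("title", default="") or "").strip()
        # itertext() flattens inline markup such as <italic> into plain text.
        paragraphs = [
            " ".join("".join(p.itertext()).split())
            for p in sec.findall("p")
        ]
        sections.append((title, paragraphs))
    return sections


if __name__ == "__main__":
    # Placeholder article written for this example, not a real paper.
    sample = (
        "<article><body>"
        "<sec><title>Methods</title>"
        "<p>Participants took <italic>melatonin</italic> nightly.</p></sec>"
        "</body></article>"
    )
    print(extract_sections(sample))
```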

Semantic Scholar — Allen Institute for AI's academic search engine. Used to enrich our paper database with citation counts and influence scores, helping assess the impact and reach of individual studies.

ClinicalTrials.gov — U.S. National Library of Medicine's registry of clinical studies. Referenced for ongoing trials and registration verification of published study results.

Limitations

Our methodology has known limitations that users should be aware of:

We primarily search PubMed, which may not capture all relevant research (e.g., studies published in non-indexed journals).
AI-assisted data extraction, while validated, may occasionally misinterpret complex study designs.
Our grading algorithm weighs study count and sample size equally, which may not reflect the true importance of each factor for every context.
Evidence grades reflect the current state of research and may change as new studies are published.
Individual responses to supplements vary. A high evidence grade does not guarantee effectiveness for every individual.

FDA Disclaimer: These statements have not been evaluated by the Food and Drug Administration. The products and information on this website are not intended to diagnose, treat, cure, or prevent any disease. The evidence grades presented are based on our analysis of published peer-reviewed research and do not constitute medical advice. Always consult your healthcare provider before starting any supplement regimen.
