How to Run a Beep Test (Step-by-Step Guide)
TL;DR. The beep test needs three things: a 20-meter measured lane, an audio source you can hear from both ends, and someone to confirm whether you made each line in time. Set those three up correctly, follow a 10-minute warm-up, run an even pace from Level 1 to roughly Level 8, then push from there. The protocol below is the one I use for myself and the one I have coached dozens of beginners through over the years. It is the same 20-meter shuttle test Luc Léger and colleagues published in 1982, with the modern small adjustments that experience and 40 years of follow-up validation have made standard.
Most bad beep test results come from sloppy setup, not bad fitness. A short shuttle distance, a slow audio start, or a generous line judge can each inflate your score by a level or two without you noticing. The instructions below assume you want a clean, comparable result you can trust week to week, and a number you can map against published norms rather than against your own previous, loosely set-up tests.
What do you actually need to run a beep test?
Three pieces of equipment and one person. The lane: 20 meters of flat, non-slippery surface. Marking gear: cones or any object that holds a line. Audio: a phone, watch, or speaker that plays the validated 20-meter shuttle protocol. The other person is for line judging, which is harder to do honestly when you are also the one running.
The 20-meter distance is the international standard from the original Léger and Lambert paper in the European Journal of Applied Physiology in 1982, refined by Léger, Mercier, Gadoury, and Lambert in 1988 (Journal of Sports Sciences 6(2):93-101). Almost every published norm or percentile chart you will find assumes 20 meters. A 15-meter shuttle exists for confined gym spaces and uses a different conversion table; Ramsbottom et al. published a 15-meter variant in 1988 (British Journal of Sports Medicine 22:141-144). Do not mix the two if you care about comparing scores over time. Measure the distance with a tape, not by counting tile lines or steps. A 19-meter lane is one of the most common ways tests get artificially inflated, and the bias is large: roughly 0.5 to 1.0 levels of inflation, or 3 to 5 mL/kg/min on the VO2 max conversion.
For audio, any device that plays the protocol cleanly will work. The volume needs to carry across 20 meters in your test environment, which is harder outdoors than in a gym because wind and ambient noise eat the high frequencies. If your testing happens outdoors regularly, run the test from an Apple Watch on the wrist with haptic taps as the primary cue, audio as backup. The tap is the cleanest cue for outdoor testing and has been adopted by several Premier League and AFL strength and conditioning departments since 2022 for outdoor preseason testing where audio drift used to make field testing unreliable.
How do you set up the shuttle distance?
Measure 20.0 meters along a flat surface. Mark the two ends with cones, tape, chalk, or a clear visual line that the runner can see while approaching. Add a 1-meter margin behind each end line so the runner can decelerate and turn without overrunning the marker. The standard published protocol from the Australian Sports Commission’s 2019 testing manual uses 20.0 meters to within plus or minus 10 centimeters.
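The setup numbers above reduce to two quick checks, sketched here in Python. The function names and the choice to work in whole centimeters are illustrative assumptions, not part of any published protocol.

```python
TARGET_LANE_CM = 2000   # 20.0 m lane
TOLERANCE_CM = 10       # plus or minus 10 cm, as quoted above
TURN_MARGIN_CM = 100    # 1 m run-off behind each end line


def lane_within_tolerance(measured_cm: int) -> bool:
    """True if the taped lane length is within 10 cm of 20.0 m."""
    return abs(measured_cm - TARGET_LANE_CM) <= TOLERANCE_CM


def floor_length_needed_cm() -> int:
    """Total straight-line space: the lane plus a run-off margin at each end."""
    return TARGET_LANE_CM + 2 * TURN_MARGIN_CM


print(lane_within_tolerance(1995))   # a lane taped at 19.95 m
print(floor_length_needed_cm())      # space to look for before marking the lane
```

In practice this means looking for about 22 meters of clear floor before you lay the tape, not 20.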
Surface choice matters more than people credit. A wooden gym floor, a clean indoor track surface, or a flat tennis court are all fine. A grass field is fine in dry conditions and a problem in wet ones. A concrete parking lot punishes your knees past Level 10. Avoid any surface that tilts noticeably, because a 1 percent grade either way changes your test result by roughly half a level on a hard effort, per the 2007 Cooper Institute Physical Fitness Assessment Manual.
The line judge stands at one end and watches whether the runner makes it past the line before each beep. Make-or-miss calls should be unambiguous: any part of the foot on or past the line at the moment of the cue counts as a make. A miss followed by a make on the next shuttle does not end the test. Two consecutive misses do. The two-miss rule is the standard codified by the Australian Sports Commission and adopted by most international youth and amateur sport testing batteries; some military versions use a stricter one-miss-with-2-meter-margin rule that ends the test 0.5 to 1.0 levels earlier.
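For a line judge keeping a tally, the two-miss rule can be written down as a small check. This is a minimal sketch that assumes one make-or-miss call is recorded per shuttle; the function name end_of_test is hypothetical.

```python
from typing import Optional, Sequence


def end_of_test(calls: Sequence[bool]) -> Optional[int]:
    """Return the 1-based shuttle number on which the test ends, or None.

    calls holds one line-judge call per shuttle: True = make, False = miss.
    The test ends on the second of two consecutive misses; a single miss
    followed by a make resets the count.
    """
    previous_missed = False
    for number, made in enumerate(calls, start=1):
        if not made and previous_missed:
            return number
        previous_missed = not made
    return None


# One isolated miss on shuttle 2, then two misses in a row on shuttles 4 and 5.
print(end_of_test([True, False, True, False, False]))
```

A stricter one-miss variant would need a different check, so scores from the two rules should not be compared directly.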
How should you warm up before the test?
Ten minutes of structured work, not five minutes of stretching. Five minutes of easy jogging at conversational pace, then four shuttles at progressively harder paces, then 60 seconds of light leg swings and ankle mobility. Skip the long static stretches. Simic, Sarabon, and Markovic published a 2013 meta-analysis in the Scandinavian Journal of Medicine and Science in Sports showing static stretching held for more than 45 seconds reduces sprint and shuttle performance by 1.5 to 4 percent for up to 10 minutes afterward.
The four warm-up shuttles serve a specific purpose: they calibrate your turning technique to the 20-meter lane you actually have, not to whatever you ran last time. Beep test pace is heavily dependent on smooth deceleration, plant, and reacceleration. Without those four shuttles you will burn 5 to 10 percent more energy on the first 5 levels learning the lane, which costs you on the back end of the test. Behm and Chaouachi reviewed this neuromuscular priming question in 2011 (European Journal of Applied Physiology 111:2633-51) and concluded that 3 to 5 sport-specific movement repetitions at progressive intensity is the most efficient warm-up for shuttle and sprint tests. Do not skip them.
What is the right pacing strategy?
Match the cue, do not race it. The single biggest pacing mistake on the beep test is arriving at the line a full second early on Levels 3 to 7, which feels controlled but burns a meaningful chunk of your end-of-test budget. Aim to arrive at the line a quarter-second to half-second before the beep. That is the sweet spot.
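To see what a quarter-second to half-second means at each level, it helps to work out the time between beeps. The sketch below assumes the commonly described speed schedule of 8.5 km/h at Level 1 rising by 0.5 km/h per level; some audio versions use slightly different speeds, so check the recording you actually run. The function names are illustrative.

```python
from typing import Tuple


def level_speed_kmh(level: int) -> float:
    """Nominal speed for a level (assumed: 8.5 km/h at Level 1, +0.5 km/h per level)."""
    if level < 1:
        raise ValueError("level must be 1 or higher")
    return 8.0 + 0.5 * level


def shuttle_seconds(level: int, lane_m: float = 20.0) -> float:
    """Seconds between beeps: lane length divided by speed in meters per second."""
    speed_ms = level_speed_kmh(level) / 3.6
    return lane_m / speed_ms


def arrival_window(level: int, early_min: float = 0.25,
                   early_max: float = 0.5) -> Tuple[float, float]:
    """Shuttle times that put you on the line 0.25 to 0.5 s before the beep."""
    beep = shuttle_seconds(level)
    return (beep - early_max, beep - early_min)


for level in (1, 5, 10):
    low, high = arrival_window(level)
    print(f"Level {level}: beep every {shuttle_seconds(level):.2f} s, "
          f"aim for {low:.2f} to {high:.2f} s per shuttle")
```

Under this assumed schedule the gap between beeps shrinks from roughly eight and a half seconds at Level 1 to about five and a half at Level 10, so the same half-second cushion is a much larger share of each shuttle late in the test.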
The shape of a clean test is roughly this. Levels 1 to 5 should feel ridiculously easy, with you turning around well ahead of the cue. Levels 6 to 9 should feel comfortable but committed, with the gap between your turn and the cue narrowing. Level 10 is where most untrained adults start to struggle. Levels 11 to 13 are where the real fitness test happens. Past Level 14 you are running close to your maximum sustainable pace and the protocol becomes a steady-state speed-endurance challenge. Tomkinson and colleagues mapped these level brackets against international norms across 50 countries and 1.1 million test results between 1981 and 2014 (British Journal of Sports Medicine 51(21):1545-1554, 2017): the global 50th percentile for adult male 20- to 29-year-olds sits at Level 9.2, and the 90th percentile at Level 13.4.
If you go out too fast in the early levels, you will not feel it until Level 9 or 10, and at that point you have no way to recover. The fix is mental: trust the early ease. The hard part is supposed to be late.
When are you actually out?
When you miss two consecutive shuttle lines. Missing one shuttle is a warning. Missing two in a row is the end. Stop running, walk a slow shuttle to cool down, and have the line judge record your final level and shuttle count.
Some protocols (notably some military versions) end the test on a single missed line with a margin of more than 2 meters. The 2-miss rule is the standard for civilian fitness testing and the rule the published norm tables assume. If you stop early you will under-record your level by 0.5 to 1.0 levels compared to the chart you are comparing against. The full level-to-VO2-max table and the Léger formula are in the full beep test level table. The percentile interpretation is in what your beep test score means.
How accurate is the beep test compared to a lab VO2 max?
Within plus or minus 3 to 5 mL/kg/min for trained adults, within plus or minus 5 to 7 mL/kg/min for untrained adults. The original Léger 1988 validation reported a correlation of 0.84 between predicted VO2 max and direct gas-exchange measurement on 188 boys and girls. Mayorga-Vega and colleagues reviewed 24 follow-up validations in 2015 (Journal of Sports Science and Medicine 14(3):536-547) and reported a pooled correlation of 0.78 with a standard error of estimate of 3.8 mL/kg/min.
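One way to keep that error in view is to report an estimate as a range rather than a single number. The sketch below uses the regression equation as it is commonly cited from the 1988 Léger paper, with adults entered at age 18, and the 3.8 mL/kg/min standard error quoted above. Treat the coefficients and the level-to-speed mapping (8.5 km/h at Level 1, +0.5 km/h per level) as assumptions to verify against the paper and the audio version you use; the function names are illustrative.

```python
from typing import Tuple


def leger_vo2max(level: int, age_years: float) -> float:
    """Estimated VO2 max in mL/kg/min from the last completed level.

    Assumed equation (commonly cited form of Leger et al. 1988):
    31.025 + 3.238*S - 3.248*A + 0.1536*S*A
    S = speed in km/h at the final level, A = age in years, capped at 18.
    """
    speed = 8.0 + 0.5 * level          # assumed speed schedule
    age = min(age_years, 18.0)         # adults are entered as age 18
    return 31.025 + 3.238 * speed - 3.248 * age + 0.1536 * speed * age


def estimate_with_band(level: int, age_years: float,
                       see: float = 3.8) -> Tuple[float, float, float]:
    """Return (low, estimate, high) using one standard error either side."""
    estimate = leger_vo2max(level, age_years)
    return (estimate - see, estimate, estimate + see)


low, mid, high = estimate_with_band(level=10, age_years=30)
print(f"Level 10: about {mid:.1f} mL/kg/min (roughly {low:.1f} to {high:.1f})")
```

A band of one standard error is a deliberately modest choice; an individual result can sit further from the lab value than that, especially with poor turning mechanics.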
The beep test is the most-replicated field VO2 max test for shuttle work and the test most international youth fitness batteries default to. Its advantage over continuous-run tests like the Cooper or the 1.5-mile is that the cue forces the pace, which removes the pacing variance that punishes untrained adults on self-paced tests. Its disadvantage is that the turning and deceleration on every shuttle cost roughly 3 to 5 percent of running economy compared to straight-line running, which the conversion formula accounts for on average but not for individuals with poor turning mechanics. The full trade-off against the other four field tests is covered in the beep test alternatives ranked piece, and the Cooper-test version of this same step-by-step setup is in how to run the Cooper test step by step.
How do you handle the audio in noisy environments?
Either get the speaker close to the runner, or switch to a wrist-based protocol with haptic feedback. A bluetooth speaker placed at the midpoint between the two cones gives you the best audio coverage. If the room has 5 or more people running simultaneously, individual headphones synchronized to the same protocol clock are the only reliable option.
Outdoor testing is where audio fails most often. The wrist haptic on the Apple Watch is the cleanest workaround because the cue arrives on your body regardless of ambient noise. The other workaround for outdoor testing is to do less of it: accept that the wind will eat 1 or 2 levels of accuracy and either move indoors or switch to a different test. The case for and against is in is the beep test still valid. If you want a 20-meter shuttle test with built-in active recovery rather than continuous shuttles, the Yo-Yo IR1 protocol is the football-specific alternative covered in how to run the Yo-Yo test.
Frequently asked questions
How long does a full beep test take?
Around 12 to 15 minutes from the first beep to the second consecutive miss for an average adult. Trained athletes hit Level 13 or 14 in roughly 14 to 17 minutes. The full protocol from Level 1 to the last published Level 21 runs about 23 minutes and is almost never completed by amateur runners.
Should I test on an empty stomach?
No. A small carbohydrate snack 60 to 90 minutes before the test is fine. Avoid heavy fat or protein within 2 hours of the test, as both slow gastric emptying and can cause stomach discomfort during the late levels.
Can I run the beep test alone?
Yes, but the line judge is doing real work. Solo testing on the Apple Watch with audio cues is reliable up to about Level 12. Past that, an honest line judge starts to matter because your own judgment of whether you made the line begins to slip with fatigue.
How does the beep test compare to the Cooper or Yo-Yo for VO2 max accuracy?
The beep test forces the pace, the Cooper does not, the Yo-Yo adds active recovery. For untrained adults, the beep test gives the cleanest VO2 max estimate because pacing variance is removed by the cue. For trained team sport athletes, the Yo-Yo IR1 is more sport-specific. For trained runners, the Cooper or 1.5-mile is more familiar.
How often should I repeat the beep test?
Every 6 to 12 weeks during a focused training block, every 6 months for general fitness tracking. Repeating more often than every 4 weeks adds noise without adding signal, because day-to-day variance in sleep, hydration, and motivation moves the result by half a level on its own.
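The half-level figure gives a simple rule of thumb for reading a retest. This is a minimal sketch that assumes scores are logged as decimal levels (for example 9.5 for halfway through Level 9); the function name and the default threshold are illustrative.

```python
def change_exceeds_noise(previous_level: float, current_level: float,
                         noise_levels: float = 0.5) -> bool:
    """True if the change between two tests is larger than the assumed day-to-day noise."""
    return abs(current_level - previous_level) > noise_levels


print(change_exceeds_noise(9.0, 9.5))    # within the half-level noise band
print(change_exceeds_noise(9.0, 10.0))   # a full level, more likely a real change
```

A change that does not clear the threshold is better read as "no clear difference" than as progress or regression.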
Want the test setup taken care of without juggling timers, speakers, or a stopwatch app? Vo2 Maximizer runs the validated 20-meter beep test protocol on your iPhone or Apple Watch, sends haptic taps for every shuttle, judges your line on the watch, and outputs your level and VO2 max in a single screen at the end.
Beep test heart rate data can do more than estimate VO2 max. The beep test for lactate threshold article covers why the same incremental data structure contains the LTHR signal too.


