The honest answer is: not reliably, on current evidence. Applied kinesiology has repeatedly failed to show dependable diagnostic accuracy, including in a double-blind study by Stephan Schwartz and colleagues, which found a clear null. That study is the model Ashta works to: test cleanly, and publish the result either way.
Applied kinesiology, often called muscle testing, claims that changes in muscle strength can reveal information about substances, diagnoses or physiological states. A practitioner presses on a person's arm while they hold a supplement, food sample, vial or written prompt; a "strong" or "weak" response is read as meaningful. That is a testable claim. If muscle testing can diagnose anything, it should work when both practitioner and participant are blinded to the correct answer, when samples are randomised, and when results are checked against an independent reference standard.
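To make "checked against an independent reference standard" concrete, here is a minimal sketch of how blinded calls could be scored. The function name, the boolean encoding and the sample data are illustrative assumptions, not taken from any published protocol.

```python
def diagnostic_accuracy(calls, truth):
    """Score practitioner calls against a reference standard.

    calls, truth: equal-length sequences of booleans, True = "positive".
    (Hypothetical encoding, chosen for illustration only.)
    """
    if len(calls) != len(truth):
        raise ValueError("calls and truth must have the same length")

    pairs = list(zip(calls, truth))
    tp = sum(1 for c, t in pairs if c and t)          # true positives
    tn = sum(1 for c, t in pairs if not c and not t)  # true negatives
    fp = sum(1 for c, t in pairs if c and not t)      # false positives
    fn = sum(1 for c, t in pairs if not c and t)      # false negatives

    return {
        # Share of real positives the method caught.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of real negatives the method correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "accuracy": (tp + tn) / len(pairs) if pairs else float("nan"),
    }


# Made-up example: six blinded trials.
calls = [True, True, False, False, True, False]
truth = [True, False, False, True, True, False]
result = diagnostic_accuracy(calls, truth)
```

A method that works should score well on both sensitivity and specificity under blinding; a method that only guesses will hover near what chance alone produces.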
The important result for Ashta is not a positive one. It is a null. In 2014, Stephan A. Schwartz and colleagues — including the statistician Jessica Utts — published a double-blind, randomised study of applied kinesiology as a diagnostic tool and as a possible nonlocal proximity effect. The study matters because Schwartz is not a casual debunker; he has spent much of his career arguing that anomalous claims deserve careful testing. In this case the careful test did not support the claim: the results did not show applied kinesiology to be a useful or reliable diagnostic tool. That models exactly the standard Ashta adopts — investigate honestly, publish honestly, and do not hide the null.
The broader literature is also largely negative. Reviews of applied and specialised kinesiology have concluded that the evidence is insufficient for diagnostic accuracy or for the validity of the muscle response, and blinded studies of dental-material and allergy testing have failed to support it as a dependable method. This does not mean all manual muscle testing is worthless — in neurology, rehabilitation and orthopaedics it has legitimate uses when it measures strength, range, fatigue or impairment. The controversial claim is the stronger one: that a subtle arm response can diagnose allergies, select supplements or answer health questions through a "body wisdom" channel. That claim needs stronger evidence, and so far it has not earned it.
There are ordinary reasons AK can feel convincing. The practitioner may apply slightly different force without noticing; the client may anticipate the expected answer; both may respond to suggestion, posture, fatigue or the phrasing of the question. Once a result feels meaningful, confirmation bias does the rest. None of that requires fraud — only humans being human. So Ashta's position is plain: we do not assume AK works, we do not assume every practitioner is dishonest, and we treat the diagnostic claim as a hypothesis under test.
A strong protocol would blind the practitioner, blind the participant, randomise target and decoy samples, predefine success criteria, record force objectively where possible, and compare against a factual reference standard, with sham trials and reliability checks. If the method cannot distinguish target from control under those conditions, that is the result, and it is published like any other, under a protocol that was pre-registered before data collection. The null matters because it protects people: diagnostic claims change real decisions, and a method that cannot reliably identify what it claims to identify can lead people to delay diagnosis or spend money on false certainty. Publishing a null is not hostile to inquiry. It is the ethical floor of it.
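Two pieces of that protocol, the randomised schedule and the predefined success criterion, can be sketched in a few lines. This is an illustration under stated assumptions (a two-choice design with a 50% chance rate, a one-sided exact binomial test, and an alpha of 0.05 fixed in advance), not a description of any particular study's analysis. All function names are hypothetical.

```python
import random
from math import comb


def make_schedule(n_trials, seed):
    """Randomly assign "target" or "decoy" to each trial.

    The seed is held by someone who never meets the practitioner or
    the participant, so neither can know the allocation.
    """
    rng = random.Random(seed)
    return [rng.choice(["target", "decoy"]) for _ in range(n_trials)]


def binomial_p_value(hits, n_trials, chance=0.5):
    """One-sided exact P(X >= hits) for X ~ Binomial(n_trials, chance)."""
    return sum(
        comb(n_trials, k) * chance**k * (1 - chance) ** (n_trials - k)
        for k in range(hits, n_trials + 1)
    )


def score(schedule, calls, alpha=0.05):
    """Count correct calls and apply the success criterion fixed in advance."""
    if len(schedule) != len(calls):
        raise ValueError("schedule and calls must have the same length")
    hits = sum(1 for s, c in zip(schedule, calls) if s == c)
    p = binomial_p_value(hits, len(schedule))
    return {"hits": hits, "p_value": p, "beats_chance": p < alpha}


schedule = make_schedule(20, seed=2024)
# In a real session the calls come from the blinded practitioner;
# here every call is "target" purely to exercise the code.
outcome = score(schedule, ["target"] * 20)
```

The point of fixing `alpha` and the number of trials before any data exist is that the analysis cannot be tuned afterwards to rescue a disappointing result. Whatever `beats_chance` turns out to be is what gets reported.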