Research in Crisis: replication, RCTs, and un(der)funding

Licensed psychologist and therapist Luke R. Allen, PhD, discusses his recent research on suicidality following the initiation of gender-affirming HRT for trans youth, including the crisis of replication and the untenable demand for randomized controlled trials.

In May 2025, the Trump administration's Department of Health and Human Services released a report claiming gender-affirming care for youth "lacks robust evidence." Days later, the same administration moved to defund the very research that would strengthen that evidence base. Anti-trans authorities continue to demand better evidence for gender-affirming care, while defunding the research to find that evidence. They call for randomized controlled trials into affirming care, while the HHS report proposes conversion therapy as an alternative.

Meanwhile, psychologist Luke Allen spent an unfunded year deliberately replicating his own 2019 study on hormone therapy and suicidality. His most recent work, "Changes in Suicidality Among Transgender Adolescents Following Hormone Therapy: An Extended Study," expanded his sample from 47 to 432 transgender youth patients. Allen's replication study, published in The Journal of Pediatrics in November 2025, offers both methodological rigor and a window into what happens when scientists must defend their work against bad-faith attacks.

The Replication Nobody Funded

Allen didn't have to do this. He's a licensed psychologist in private practice spanning the majority of US states. There are no promotions waiting, no tenure clock ticking. His 2019 study—though small at 47 patients—was the first clinical investigation using real patient data to examine suicidality before and after hormone therapy. Even skeptics acknowledged it mattered.

But Allen understands the replication crisis. He knows that in science, initial findings often show larger effects that shrink upon replication—not because the original was wrong, but because early studies tend to capture the most dramatic cases. Larger, well-designed replications tell you whether an effect is real and stable.

"Given the replication crisis in the sciences and the well-documented decline effect—where initially large effects often shrink or vanish with replication—larger, better-designed replications are necessary before drawing strong conclusions," Allen wrote on his blog. "If in the long term, years, decades from now, if we're gonna change minds, part of that piece of that puzzle is also continued, consistent research and evidence."

So he spent a year on chart review, data cleaning, and analysis, without any grant funding or major institutional support, driven only by the recognition that his field was under a microscope in ways other medical research isn't.

The new study confirmed the original findings. Suicidality declined significantly following hormone therapy initiation, with a medium effect size (partial η² = 0.075)—smaller than the original large effect, which is exactly what you'd expect in replication research. Among patients who endorsed suicidality at baseline, rates dropped by nearly two-thirds at follow-up. Recent suicide attempts dropped by 84.6%.
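For readers unfamiliar with the statistic, here is a minimal sketch of how partial η² is defined and how a value like 0.075 lands in the "medium" range. The cutoffs used (0.01, 0.06, 0.14) are Cohen's conventional benchmarks, not thresholds taken from Allen's paper, and the function names and the sums of squares in the example are hypothetical, chosen only so the arithmetic comes out to 0.075.

```python
def partial_eta_squared(ss_effect: float, ss_error: float) -> float:
    """Share of variance attributed to the effect, relative to effect plus error."""
    return ss_effect / (ss_effect + ss_error)


def cohen_label(eta_p2: float) -> str:
    """Label an effect using Cohen's conventional benchmarks (an assumption here)."""
    if eta_p2 >= 0.14:
        return "large"
    if eta_p2 >= 0.06:
        return "medium"
    if eta_p2 >= 0.01:
        return "small"
    return "negligible"


# Hypothetical sums of squares, picked only to reproduce the reported value.
example = partial_eta_squared(ss_effect=7.5, ss_error=92.5)
print(round(example, 3), cohen_label(example))
```

The labels are rules of thumb. They help with rough comparison between an original study and its replication, but they are not a substitute for reading the reported estimates themselves.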

But the study revealed something else crucial: most transgender youth in the sample weren't in acute distress. Three-quarters (75.5%) scored zero on suicidality measures at both baseline and follow-up, suggesting stable mental health throughout treatment. And patients who'd received pubertal suppression before hormone therapy showed even lower baseline suicidality than the overall sample.

The RCT Double Standard

Critics routinely dismiss transgender health research for lacking randomized controlled trials, holding it to a standard applied almost nowhere else in medicine. "Most research is low quality evidence, across the board," Allen tells me. Three quarters of medical interventions are based on research that has not involved randomized controlled trials. "I think there's a double standard here, in the literature and the science."

RCTs aren't just impractical for adolescent hormone therapy—they're ethically untenable. "It's a false goal," Allen says. "One, it's not needed. Two, it's not even practical. There are serious ethical concerns."

First, you can't blind patients to treatments that cause visible physical changes—if you give patients in an HRT study a placebo, they are going to realize pretty quickly that they are in the control group. Then, control group retention fails when treatment is accessible elsewhere. Allen points to a decades-old attempt at an RCT for precocious puberty where "every single one" of the control group families dropped out in order to access treatment. People are not going to choose to suffer for science when they can find relief somewhere else.

What critics ignore: transgender health research isn't uniquely limited. In medicine, observational studies are commonly used when randomization is infeasible, and they are frequently considered sufficient to inform clinical guidelines—especially when findings are consistent across multiple studies. We don't have RCTs proving that smoking causes lung cancer, for example; instead, we have decades of observational data showing tremendous consistency. For interventions with larger effect sizes, RCTs become even less necessary.

Allen's study joins approximately 22 other investigations of youth medical interventions, showing tremendous consistency across diverse samples, populations, measurement tools, and methodologies over decades. That consistency, Allen argues, elevates certainty over time—even without RCTs.

"As the field is still early, with limited samples and heterogeneous methods, variation in findings across studies is not unexpected," Allen noted. His study focused specifically on suicidal ideation and behavior—lower base-rate outcomes than broader measures like anxiety or depression used in other research. Minor variations don't invalidate the overall signal. "The findings are stable across time, samples, and research groups," he wrote. "That is what science looks like in the complexity of real life."

When Critics Come Knocking

Hours before my interview with Allen, a letter to the editor appeared in The Journal of Pediatrics criticizing his study. The author, affiliated with the Society for Evidence-Based Gender Medicine (SEGM), argued that Allen's use of the Ask Suicide-Screening Questions (ASQ) as a continuous score rather than a binary screener was "an unvalidated endpoint."

The ASQ consists of four yes/no questions assessing suicidal ideation and behavior. It's validated as a binary tool: any "yes" answer flags elevated suicide risk. Allen summed the responses (ranging 0-4) and analyzed the total as an ordered indicator of suicidality severity—a common practice in psychometric research when measuring unidimensional constructs.

Allen's response methodically defended the decision. Prior studies have modeled ASQ data this way. The approach treats the summed score as an ordered proxy for underlying suicidal ideation, not as an interval-scaled risk measure. And critically, when Allen tested whether excluding the most severe item (recent suicide attempt) would change results, effect sizes remained virtually unchanged.
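To make the scoring dispute concrete, here is a minimal sketch of the three ways of reading the same four yes/no answers: the binary screen, the summed 0-4 score, and the summed score with the most severe item left out. The item labels and function names are hypothetical placeholders, not the ASQ's actual wording, and this is an illustration of the idea rather than Allen's analysis code.

```python
from typing import Dict, Iterable

# Hypothetical labels for the four yes/no items; the real ASQ wording is not reproduced.
ITEMS = ("item_1", "item_2", "item_3", "recent_attempt")


def binary_screen(responses: Dict[str, bool]) -> bool:
    """Validated use: any 'yes' flags elevated risk."""
    return any(responses[item] for item in ITEMS)


def summed_score(responses: Dict[str, bool], exclude: Iterable[str] = ()) -> int:
    """Count of 'yes' answers, read as an ordered indicator of severity."""
    skipped = set(exclude)
    return sum(1 for item in ITEMS if item not in skipped and responses[item])


# One made-up respondent, for illustration only.
example = {"item_1": True, "item_2": True, "item_3": False, "recent_attempt": False}

print(binary_screen(example))                              # True
print(summed_score(example))                               # 2
print(summed_score(example, exclude=["recent_attempt"]))   # 2
```

The binary screen treats one "yes" and four "yes" answers identically, while the summed score keeps that ordering. Rerunning an analysis on the score without the attempt item is the kind of sensitivity check described above.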

"No reasonable modeling decision could obscure the central result," Allen wrote. "Suicidality declined significantly and consistently over time among transgender youth receiving hormone therapy."

This exchange illustrates Allen's broader concern about research weaponization. He describes crafting deliberately long, run-on sentences in the paper to prevent cherry-picking—connecting caveats directly to findings so they can't be separated. He points to other transgender health research where misleading abstracts (claiming "no better outcomes than general population") obscured actual findings (significant pre-to-post improvement). "Part of our responsibility doing ethical research is also trying to minimize or address misuses of our works," Allen said.

The study honestly reports that 4.6% of patients showed increased suicidality scores—but contextualizes this carefully. Some may have had worsening distress before treatment that hormone therapy slowed but didn't reverse. Others may have faced adjustment challenges from visible physical changes in unsupportive environments. Still others may simply have become more comfortable disclosing suicidal thoughts in affirming care settings over time.

Critically, that 4.6% rate is lower than the 6% deterioration rate found in meta-analyses of psychological treatments for adolescent depression. Hormone therapy doesn't eliminate all risk, and no mental health intervention does.

What Quality Evidence Actually Looks Like

The HHS report attacks transgender healthcare research as insufficiently rigorous while proposing conversion therapy—the attempt to change someone's gender identity—as a viable alternative. Conversion therapy also has no RCT evidence supporting effectiveness. It does, however, have extensive evidence suggesting it can cause serious harm.

This isn't about scientific standards. It's about manufacturing doubt to justify restricting care.

Allen's work can't single-handedly overcome political bad faith, but it does something important: it demonstrates what rigorous science looks like when researchers take replication seriously, defend their methodology transparently, and refuse to hide complexity. The study addressed criticisms proactively. And Allen made all of this public—on his blog, in his papers, in interviews—because transparency matters when your field is under siege.

"At the very least, if you're the most anti-trans or highly skeptical person," Allen said, "the takeaway is, we still don't have evidence that people are getting worse, or getting more suicidal." That's the floor—the absolute minimum conclusion Allen's replication supports. The ceiling is much higher: hormone therapy is associated with meaningful reductions in suicidality, this finding replicates across studies and contexts, and most transgender youth receiving comprehensive gender-affirming care show stable or improving mental health.

But Allen conducted this research during a year when the administration was actively cutting funding for the very studies that could strengthen these conclusions further. Their playbook is on full display: demand evidence, then eliminate the means to produce it. Call for higher-quality research, then defund NIH investigations. Claim concern for youth wellbeing, then block studies tracking their health outcomes.

Allen's study is unlikely to convince bad-faith critics. But it does clarify what's actually at stake: not the quality of evidence, but whether evidence will be allowed to exist at all.

Talk about the Science with Parents and Patients

Professional and Lifetime members at Well Beings News have access to our Resource Library, where you will find a Comprehensive Research Guide and a Quick Reference Fact Sheet with everything you need to know about the research supporting gender-affirming care, along with solid plans for how to share this information with your patients and their families.
