Abstract
Lab animal experiments form a unique environment from a statistical perspective. Within the life sciences, researchers working with lab animals have a high degree of control over the experimental design. For instance, all subjects can be chosen to have the same age, a specific genetic profile, the same diet, and identical housing conditions, and siblings can be assigned to different experimental conditions. These are just some of the measures taken to reduce confounding as much as possible. Under most circumstances, it is possible to separate the experimental conditions in such a way that one can directly isolate and measure the relevant outcome variable.
This high level of control results in large effect sizes that are uncommon in human studies. In his pioneering paper, Cohen defined standardized differences starting from 0.8 as large, but it is rare to encounter such small effect sizes in lab animal experiments. In my role as a statistical consultant and as a statistician on the ethical committee, I have reviewed hundreds of power analyses for experiments featuring laboratory animals and I can count on my fingers and toes the number of cases where the standardized difference was that small. These large effect sizes result in small sample sizes, which is where the challenge lies for statisticians.
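The link between effect size and sample size can be made concrete with the textbook normal-approximation formula for comparing two group means, n = 2 (z_{1-alpha/2} + z_{1-beta})^2 / d^2 animals per group. The sketch below is illustrative only: the function name `n_per_group` and the defaults (two-sided alpha of 0.05, 80% power) are assumptions, not values taken from the thesis, and for samples this small the normal approximation understates what a t-test would require.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate animals per group for a two-sided, two-sample comparison.

    d is the standardized difference (difference in means divided by the
    common standard deviation). Uses the normal approximation, so it is
    optimistic for very small samples.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the target power
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)


# Cohen's "large" effect versus larger effects (hypothetical values).
for d in (0.8, 1.5, 2.0):
    print(f"d = {d}: about {n_per_group(d)} animals per group")
```

Under these assumptions, a standardized difference of 0.8 calls for about 25 animals per group, while a difference of 2.0 calls for about 4, which is the regime where variance estimates become unreliable.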
Despite the high level of experimental control, the research domain is still plagued by a reproducibility crisis, most likely with many contributing factors. One such factor is that many lab animal experiments are systematically underpowered. This can partially be explained by the difficulty of getting good variance estimates from small sample sizes, but there are more systematic issues at play.
For one, ethical concerns surrounding the use of animals in research have led to increasing pressure to reduce sample sizes, making it difficult to detect relevant effects. Another issue is that required sample sizes are often calculated based on results published in the literature, which suffers from publication bias. Large effects or effects that appeared larger through random error are more likely to get published, hence leading to an overestimation of common effect sizes. This bias is further aggravated if experiments are commonly underpowered. When researchers are asked for the minimal clinically relevant difference, i.e. the minimal difference they want to be able to detect, the corresponding sample size for a sufficiently powered experiment is often much higher than they anticipated.
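The inflation of published effect sizes can be illustrated with a small simulation. The sketch below is not taken from the thesis: it assumes a known standard deviation of 1, a two-sided z-test at the 5% level, and a crude publication rule in which only significant results are published. The function name `published_effect` and all parameter values are hypothetical.

```python
import random
from math import sqrt
from statistics import mean


def published_effect(true_d, n, n_sims=20000, seed=1):
    """Return (mean published effect, share of experiments published).

    Each simulated experiment has n animals per group with unit variance,
    so the observed mean difference is already on the standardized scale.
    """
    rng = random.Random(seed)
    se = sqrt(2 / n)  # standard error of the difference in group means
    published = []
    for _ in range(n_sims):
        treated = mean(rng.gauss(true_d, 1) for _ in range(n))
        control = mean(rng.gauss(0, 1) for _ in range(n))
        diff = treated - control
        if abs(diff / se) > 1.96:  # assumed rule: only significant results appear
            published.append(diff)
    return mean(published), len(published) / n_sims


# Underpowered versus well-powered design for the same true effect.
for n in (5, 20):
    est, share = published_effect(true_d=1.0, n=n)
    print(f"n = {n}: published mean effect {est:.2f}, share published {share:.2f}")
```

With few animals per group, only experiments that overshoot the true difference reach significance, so the published average exceeds the true value; with a larger group size most experiments are published and the inflation shrinks.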
In this thesis, we propose the use of group sequential designs (GSDs) to address the challenge of underpowered lab animal experiments. GSDs enable researchers to design experiments in which they can stop once a small sample size has been reached, but can further increase the sample size if necessary.
By enabling researchers to stop early when possible, GSDs can reduce the number of animals used in those cases. Therefore, even though the GSD framework might enable researchers to request more animals for their experiments, the average number of animals used in practice is expected to remain stable or even decrease while still allowing for experimental designs with sufficient power. As such, GSDs can be used to meet the need for designs with more statistical power while simultaneously fitting within the 3Rs framework (replace, reduce, refine) promoted within laboratory animal science (LAS). Our contributions in this domain consist of the evaluation of existing methodology, the development of new methodology suitable for extremely small sample sizes, and making this methodology accessible to biomedical researchers. The reasoning for the first two is straightforward: if the methodology is not suitable for the context, it cannot be implemented. The reasoning for the third is that the existence of the methodology is of limited value if it is not used in practice.
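The trade-off between maximum and average sample size can be sketched with a two-stage design that uses the classical Pocock boundary (critical value 2.178 at each of two equally spaced looks, two-sided alpha of 0.05). This is a simplified illustration under stated assumptions, not the methodology developed in the thesis: the standard deviation is treated as known, which is exactly what breaks down for extremely small samples, and the function name `simulate_two_stage` and the effect and stage sizes are hypothetical.

```python
import random
from math import sqrt

POCOCK_Z = 2.178  # Pocock critical value: 2 equally spaced looks, two-sided alpha = 0.05


def simulate_two_stage(true_d, n_stage, n_sims=20000, seed=1):
    """Return (rejection rate, average animals per group) for a two-stage design.

    Each stage adds n_stage animals per group; outcomes have unit variance,
    which is assumed known so that a z-test can be used.
    """
    rng = random.Random(seed)
    rejections = 0
    total_n = 0
    for _ in range(n_sims):
        # Stage 1: treated-minus-control differences for the first batch.
        sum_diff = sum(rng.gauss(true_d, 1) - rng.gauss(0, 1) for _ in range(n_stage))
        z1 = (sum_diff / n_stage) / sqrt(2 / n_stage)
        if abs(z1) > POCOCK_Z:
            rejections += 1
            total_n += n_stage  # stopped early
            continue
        # Stage 2: add a second batch and test on the cumulative data.
        sum_diff += sum(rng.gauss(true_d, 1) - rng.gauss(0, 1) for _ in range(n_stage))
        n = 2 * n_stage
        z2 = (sum_diff / n) / sqrt(2 / n)
        if abs(z2) > POCOCK_Z:
            rejections += 1
        total_n += n
    return rejections / n_sims, total_n / n_sims


type1, _ = simulate_two_stage(true_d=0.0, n_stage=5)
power, avg_n = simulate_two_stage(true_d=1.5, n_stage=5)
print(f"type I error {type1:.3f}, power {power:.2f}, average n per group {avg_n:.1f} (max 10)")
```

In this setting the design may use up to 10 animals per group, but whenever the first look is already convincing the experiment stops at 5, so the average number used under the assumed effect falls below the maximum while the overall type I error stays near 5%.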
Original language | English |
---|---|
Awarding Institution | |
Supervisors/Advisors | |
Award date | 21 Sep 2023 |
Publication status | Published - 21 Sep 2023 |