Replications can help improve the practical relevance of accounting experiments
"I should probably not talk about it because it's an experiment, but I have some concerns about practical relevance"
Nearly every discussant of an experimental accounting paper
As an experimental accounting researcher, nothing is more awkward than discussing practical relevance. Everyone is aware that experiments are less able to speak to naturally-occurring events than other empirical methods. Even if experiments test causal theories that are plausibly generalizable, their results will never apply to the population of interest as well as the results of some other empirical methods do (Bloomfield et al. 2016). For instance, unlike large-scale field and archival data, experimental data from the lab is not naturally occurring, and experimental materials and procedures will never closely match all the accounting events we care about and observe in practice. However, the broader accounting community still very much cares about whether experiments and their results match, or could match, what happens in accounting practice. Thus, while everyone knows practical relevance is not really a fair criterion for experimental research, the practical orientation of the field keeps alive the desire to discuss it.
Besides practical relevance, there is another awkward discussion topic for experimental accounting researchers: replications of previously-published experiments. It is common knowledge that replications are vital to addressing the replication crisis in the economic and social sciences (Baker 2016; Bardi and Zentner 2017; Camerer et al. 2016). Accounting experiments are not exempt from this crisis, and concerns about reproducibility are also rising in the broader accounting community (Hail et al. 2020). However, relatively few experimental studies whose sole purpose is to thoroughly replicate a previously-published accounting experiment have been published (Hartmann 2017). Replicating a previously-published accounting experiment entails more than testing the same or a similar hypothesis as the original. A "proper" replication also involves a thorough and lengthy research process, including but not limited to a detailed pre-registration and following the original experimental materials, procedures, and data generation process to the letter. By publishing relatively few studies whose sole purpose is to replicate prior published work, the accounting community continues to signal that replications are of relatively little value. As such, discussing the replicability of accounting experiments is as awkward as discussing their practical relevance.
Replications → Generalizability → Practical relevance
What if we told you that these two awkward discussion topics, practical relevance and replications, are related? Both are connected through the extent to which experiments generalize. Specifically, if we can replicate experimental results, we have more confidence that they hold more generally. In this way, we can be more confident that our theory is behind the results and that they do not hinge on specific experimental procedures or participant samples. And once we have more confidence that certain experimental results hold generally and that our theory is driving them, those results may also become more relevant for accounting practice. The way replications and practical relevance are connected through generalizability closely resembles a traditional and frequently forgotten norm in the experimental accounting community: "Experiments only generalize through their theory."
One way to boost the practical relevance of experiments is to use a differentiated replication (Lindsay and Ehrenberg 1993). Differentiated replications modestly improve the generalizability of experiments by retesting the same hypotheses with the same experimental procedures and materials as the original experiment, except for one methodological modification. If your results are robust to this modification, you modestly increase the generalizability, and thus the practical relevance, of your experimental results. Fortunately, conducting a differentiated replication of one's own experiment within a single experimental study is already emerging in the experimental accounting literature. For instance, Kachelmeier et al. (2021) recently published a paper in The Accounting Review in which they explicitly report featuring a differentiated replication of their main experiment to improve the generalizability of their previously obtained results.
However, we hope that differentiated replications will not only be used to strengthen the generalizability of our own experiments within each experimental study. Wouldn't it be wonderful if experimentalists started conducting a series of differentiated replications of one seminal, published experimental accounting paper? It would allow us to uncover which insights hold up to alternative operationalizations and the test of time. But what if the results produced by a differentiated replication differ from the original? Isn't that a problem? It does not have to be! An ancillary benefit of differentiated replications is that they can increase our understanding of the theoretical mechanisms at play in the original experiment. Specifically, if we find that a previously-documented causal effect does not hold in a differentiated replication, nothing is lost. It simply teaches us that we may have to revise our thinking about the theory that motivated the original experiment. We still learn something new about the original theoretical process: apparently, it was not precisely what we thought it was. By updating our theoretical framework, we make our theories more applicable in practice. In this way, replications improve the practical relevance of our research.
Practice what you preach
We have already started conducting differentiated replications of previously-published accounting experiments. Our first project, co-authored with Bart Dierynck, features two replications of the experiment reported in Maas, van Rinsum, and Towry (2012). The original study examines managers' costly information acquisition decisions in the context of performance evaluations. Its results reveal that manager-participants were, on average, relatively willing to exert costly effort to acquire information. Our first replication is a "direct replication," meaning that it keeps all experimental materials and procedures as close as possible to the original study. The direct replication's purpose is to test whether the original results pass the test of time and a different sample. The second replication is a differentiated replication, and its purpose is to test whether the results are robust to a subtle change in how managers make costly information acquisition decisions.
The motivation for the differentiated replication was simple. The original study asked manager-participants to make one decision about whether and how much costly information to acquire for performance evaluations. We were curious whether a sequential decision process, in which manager-participants first decide whether to acquire costly information before deciding how much to acquire, would change the results. This subtle change in how manager-participants express their willingness to exert costly effort does not change the underlying choice structure of the experiment; it only changes the way the decision is framed. Depending on whom you ask, either frame could be viewed as more realistic and practically relevant. Do managers in the real world make one simultaneous decision when they consider gathering information, as the original study implicitly assumes, or is this decision sequential, in that they first choose whether to collect additional information before deciding how much effort they are willing to exert for it, including backing out of the decision altogether?
The direct replication nearly perfectly replicates the results of Maas, van Rinsum, and Towry (2012). In light of the replication crisis in other subfields, this is already an important finding! However, it turns out that the subtle operational change featured in the differentiated replication affects the results drastically: a sequential decision process significantly reduced managers' willingness to exert costly effort to acquire information. This casts some doubt on whether managers are always as eager to gather costly information as the original study suggests. On the flip side, we learned something new about the underlying theoretical process, enabling us to consider a more refined theoretical explanation for both the findings in the original study and ours. Notably, the effect of changing how manager-participants make costly information acquisition decisions illustrates that the situation managers find themselves in may be critical in determining to what extent they gather additional costly information. More research is needed to develop a coherent, practically relevant theory about managers' information acquisition decisions in the context of performance evaluations.
Final note
We started writing this post trying to share some ideas about replications offering a meaningful way to resolve awkward discussions about practical relevance in the experimental accounting community. However, in the process, we arrived at a far more important overarching message: if we care so much about practical relevance and really want to appeal to policy-makers and practitioners, why don't we start by doing more reproducible experimental research? Besides offering stronger causal inferences, the biggest gift experimental accounting can offer to other empirical accounting researchers, business, and society is that its results replicate.