Abstract
Persistent and difficult methodological issues, such as those highlighted in Ohlson (2025. Empirical accounting seminars: Elephants in the room. Accounting, Economics, and Law: A Convivium 15, 1–8), undermine confidence in the results of empirical studies. Many of these issues relate to the efforts of researchers to defend the statistical significance of their results. However, the seemingly endless combinations of methodologies that can be employed make statistical significance a somewhat illusory concept. A more productive approach would be to place greater emphasis on the economic significance of empirical results.
Table of Contents
Management and Research by Numbers
Evidence on Empirical Research
Statistical Significance versus Economic Significance
References
Empirical Research in Accounting and Social Sciences: Elephants in the Room
Empirical Accounting Seminars: Elephants in the Room, by James A. Ohlson, https://doi.org/10.1515/ael-2021-0067.
Limits of Empirical Studies in Accounting and Social Sciences: A Constructive Critique from Accounting, Economics and the Law, by Yuri Biondi, https://doi.org/10.1515/ael-2021-0089.
Accounting Research’s “Flat Earth” Problem, by William M. Cready, https://doi.org/10.1515/ael-2021-0045.
Accounting Research as Bayesian Inference to the Best Explanation, by Sanjay Kallapur, https://doi.org/10.1515/ael-2021-0083.
The Elephant in the Room: p-hacking and Accounting Research, by Ian D. Gow, https://doi.org/10.1515/ael-2022-0111.
De-emphasizing Statistical Significance, by Todd Mitton, https://doi.org/10.1515/ael-2022-0100.
Statistical versus Economic Significance in Accounting: A Reality Check, by Jeremy Bertomeu, https://doi.org/10.1515/ael-2023-0002.
Another Way Forward: Comments on Ohlson’s Critique of Empirical Accounting Research, by Matthias Breuer, https://doi.org/10.1515/ael-2022-0093.
Setting Statistical Hurdles for Publishing in Accounting, by Siew Hong Teoh and Yinglei Zhang, https://doi.org/10.1515/ael-2022-0104.
1 Management and Research by Numbers
Corporate managers expend significant energy to meet quarterly earnings targets, such as analyst consensus estimates or earnings from the same quarter in the previous year. To meet these targets, managers might reduce discretionary spending, delay starting new projects, change accounting assumptions, or take any number of other actions. At times, the actions taken might test the boundaries of ethics. Ideally, meeting earnings targets would not take precedence over a more overarching goal of creating economic value, but an earnings target is a measurable, visible, well-defined benchmark that has become a focal point of management performance. Managers acknowledge a willingness to sacrifice economic value to meet earnings targets, in part due to career concerns (Graham et al., 2005).
Empirical researchers likewise expend significant energy to meet particular targets, in this case, the 1 percent, 5 percent, or 10 percent levels of statistical significance. To meet these targets, researchers might experiment with different regression specifications, alter sample selection criteria, modify hypotheses after analyzing the data, or take any number of other actions. At times, the actions taken might test the boundaries of ethics. Ideally, meeting statistical significance targets would not take precedence over a more overarching goal of discovering truth, but statistical significance is a measurable, visible, well-defined benchmark that has become a focal point of academic performance. Researchers may sacrifice more substantive knowledge discovery in order to defend statistical significance, in part due to career concerns.
Of course, researchers are not wrong to perceive that career success is tied to statistical significance. The bias of academic journals toward publishing statistically significant results is well documented, and publication success is integral to promotion and tenure. But overemphasizing statistical significance can be greatly detrimental to the research process. Ohlson (2025) highlights a number of methodological issues that undermine confidence in the results of academic studies. Not surprisingly, most of the issues discussed in Ohlson (2025) relate to the efforts of researchers to defend the statistical significance of their results. However, defending statistical significance has its limits, because rarely are empirical results completely robust. In the great majority of studies, some specifications lead to significant results and others do not.
2 Evidence on Empirical Research
In the process of carrying out empirical studies, researchers make many methodological decisions that impact the magnitude and significance of their findings, and at each decision point lies an opportunity to choose a method that leads to greater statistical significance. In Mitton (2022), I examine over 900 regression analyses published in top finance journals that study dependent variables derived from financial statements, such as profitability, investment, or leverage. I find that researchers employ a seemingly endless variation of methodologies in their empirical studies. For example, one source of variation is that definitions of dependent variables are far from standardized. In the case of profitability regressions, I identify 61 unique measures of profitability that are used as dependent variables, including 26 unique measures of return on assets.[1] Variation in methodology arises from many other decisions as well, including sample selection, outlier treatment, control variable inclusion, variable transformations, and so on. I show that when researchers have discretion over these methodological decisions, it is relatively easy to find specifications in which almost any explanatory variable is a statistically significant determinant of a common dependent variable.
For example, suppose I am using a panel of Compustat data to study firm profitability, and that I randomly generate an explanatory variable and “hypothesize” that the variable is a determinant of profitability. Suppose further that I am allowed discretion over 10 common binary methodological decisions such as retaining outliers or not, excluding financial firms or not, and measuring the size control as assets or sales. In this situation, I find that 90 % of the time there are one or more specifications in which the randomly generated variable is a statistically significant determinant of profitability at the 5 % level. Of course, 10 binary methodological decisions constitute only a small subset of all the methodological decisions that a researcher encounters when carrying out a typical empirical study, so we should be cautious about drawing conclusions about the statistical significance of any individual result.
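The mechanics of this specification-search exercise can be sketched in a few lines of code. The simulation below is an illustration, not a reproduction of Mitton (2022): it uses synthetic data in place of Compustat, six binary decisions rather than ten, and a stylized menu of choices (an industry screen, outlier winsorization, the functional form of the size control, and simple data screens). All variable names and magnitudes are invented for the sketch.

```python
import itertools

import numpy as np


def t_stat(y, X):
    """OLS t-statistic on the first column of X (an intercept is added)."""
    Z = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    dof = len(y) - Z.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(Z.T @ Z)
    return beta[1] / np.sqrt(cov[1, 1])


def significant_specs(rng, n=2000, n_decisions=6):
    """Fraction of specifications in which a *random* regressor clears |t| > 1.96."""
    profit = rng.normal(0.05, 0.10, n)        # synthetic profitability (outcome)
    x = rng.normal(size=n)                    # randomly generated "hypothesis" variable
    size = rng.lognormal(4.0, 1.0, n)         # synthetic size control
    financial = rng.random(n) < 0.2           # synthetic financial-industry flag
    # Fixed data screens, drawn once so each specification is deterministic.
    screens = [rng.random(n) < 0.95 for _ in range(n_decisions - 3)]

    hits = 0
    specs = list(itertools.product([0, 1], repeat=n_decisions))
    for choices in specs:
        keep = np.ones(n, dtype=bool)
        if choices[0]:                        # decision 1: exclude financial firms
            keep &= ~financial
        for d in range(3, n_decisions):       # decisions 4-6: apply a data screen
            if choices[d]:
                keep &= screens[d - 3]
        y = profit[keep]
        if choices[1]:                        # decision 2: winsorize the outcome
            lo, hi = np.percentile(y, [1, 99])
            y = np.clip(y, lo, hi)
        ctrl = np.log(size[keep]) if choices[2] else size[keep]  # decision 3: size form
        t = t_stat(y, np.column_stack([x[keep], ctrl]))
        hits += abs(t) > 1.96
    return hits / len(specs)
```

Repeating `significant_specs` over many fresh random draws estimates how often a purely random "hypothesis" is significant at the 5 % level in at least one specification; the 90 % figure in the text corresponds to ten decisions and a much richer menu of choices than this six-decision toy.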
Variation in methodology can be entirely appropriate, to the extent that researchers make methodological decisions based on theoretical considerations. However, in my survey I find that researchers usually state no reason for their methodological decisions. For example, researchers explain why they choose a particular method of outlier treatment only 6 % of the time, even though outlier treatment usually has a substantial impact on reported results. The lack of explanations raises the possibility that researchers could purposefully use methods that lead to statistical significance. But researchers might not even be fully aware that they are making decisions that favor statistical significance, as it is easy to justify, even subconsciously, methods that confirm a desired outcome. Additionally, even if each research study adheres to a predetermined methodology, statistically significant results can rise to the top (and be published) from among multiple independent studies when each study uses a different method to investigate a similar question (Denton, 1985).
3 Statistical Significance versus Economic Significance
Over the years, researchers in various scientific disciplines have cautioned against overemphasizing statistical significance (e.g., Arrow, 1960; Leamer, 1978; Cohen, 1994; McShane et al., 2019), but concern remains, as evidenced in part by a statement by the American Statistical Association that “A p-value, or statistical significance, does not measure the size of an effect or the importance of a result” (Wasserstein & Lazar, 2016). The alternative to focusing on statistical significance is to put greater emphasis on the economic significance (or practical importance) of empirical results. After all, researchers should have greater interest in the implications of their findings for the real world than in whether a statistically significant effect can be detected (McCloskey and Ziliak, 1996; Harvey, 2017). Nevertheless, empirical papers typically devote substantial space to defending statistical significance, but little space to discussing economic significance. I find that a typical discussion of economic significance involves a sentence or two about the magnitude of the impact of the independent variable. In almost all cases (about 93 % of the time) this short discussion includes a declaration that the results are indeed economically significant, meaningful, relevant, large, or important (Mitton, 2023).
Why is so little effort expended in defending economic significance? Two possible reasons come to mind. First, it is typically much easier, or at least more formulaic, to demonstrate that a coefficient is statistically significant than to provide a convincing argument as to why it is economically significant. Demonstrating economic significance is difficult work, but as Ziliak and McCloskey (2008) put it, “Real science asks you to make real scientific judgments and real scientific arguments within a community of other scientists. It asks you to be quantitatively persuasive, not to be irrelevantly mechanical.” Second, researchers may neglect defending economic significance because there is no line to defend. Unlike with statistical significance, there are no established benchmarks of what constitutes an economically significant result. It is difficult to challenge a declaration of economic significance when there are no accepted ground rules.
So, how can researchers move away from an obsession over statistical significance and toward more meaningful discussions of economic significance? First, newer methods to demonstrate the statistical significance of results across numerous methodologies can help forestall long discussions of statistical significance. These methods, referred to as specification checks (Brodeur et al., 2020) or specification curves (Simonsohn et al., 2020), allow readers to observe, in compact graphical form, how often, and in what situations, a result meets standard levels of significance.[2] Second, statistical significance can be de-emphasized by not highlighting significance in tables (e.g., with asterisks) and by not classifying findings as either significant or insignificant when interpreting results. Some journals—for example, Econometrica, American Economic Review, and Strategic Management Journal—have already banned one or both of these practices. Amrhein et al. (2019), backed by a coalition of 854 scientists from 52 countries, make the case against dichotomizing results as either significant or insignificant. Third, economic significance can be taken more seriously by using appropriate measures of economic significance and by properly interpreting those measures. For example, most studies evaluate economic significance by measuring, in some way, how much the dependent variable changes for a given change in the explanatory variable. Mitton (2023) shows that such measures should be scaled by the standard deviation (not the mean) of the dependent variable. Mitton (2023) also emphasizes that these measures are but a starting point for evaluating economic significance, and that authors should provide benchmarks that allow readers to interpret the magnitude of coefficients. While a complete solution to the overemphasis on statistical significance may be elusive, these suggestions would at least be a few steps in the right direction.
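The scaling point can be made concrete with a toy calculation. The snippet below uses synthetic data (the effect size and noise level are invented for illustration): a one-standard-deviation move in the regressor looks substantial when scaled by the mean of the outcome, but modest when expressed in standard deviations of the outcome, the scaling recommended in Mitton (2023).

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Synthetic data: a regressor x with a genuine but small effect on "profitability" y.
x = rng.normal(size=n)
y = 0.05 + 0.02 * x + rng.normal(0.0, 0.15, n)

# OLS slope of y on x (with an intercept).
beta = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

# Change in y for a one-standard-deviation change in x.
effect_per_sd_x = beta * np.std(x, ddof=1)

# Scaled by the mean of y: the effect reads as a large share of typical profitability.
scaled_by_mean = effect_per_sd_x / np.mean(y)

# Scaled by the standard deviation of y: the same effect moves y by only a
# fraction of one standard deviation.
scaled_by_sd = effect_per_sd_x / np.std(y, ddof=1)
```

With these parameters the mean-scaled number is roughly three times the standard-deviation-scaled number, so the choice of scaling alone can change whether a result sounds "economically significant." The scaled measure is, as the text notes, only a starting point; a benchmark is still needed to interpret its magnitude.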
References
Amrhein, V., Greenland, S., & McShane, B. (2019). Scientists rise up against statistical significance. Nature, 567, 305–307. https://doi.org/10.1038/d41586-019-00857-9
Arrow, K. (1960). Decision theory and the choice of a level of significance for the t-test. In I. Olkin, S. G. Ghurye, W. Hoeffding, W. G. Madow, & H. B. Mann (Eds.), Contributions to probability and statistics (pp. 70–78). Stanford University Press.
Berchicci, L., & King, A. (2022). Corporate sustainability: A model uncertainty analysis of materiality. Journal of Financial Reporting, 7, 43–74. https://doi.org/10.2308/jfr-2021-022
Brodeur, A., Cook, N., & Heyes, A. (2020). A proposed specification check for p-hacking. AEA Papers and Proceedings, 110, 66–69. https://doi.org/10.1257/pandp.20201078
Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997–1003. https://doi.org/10.1037/0003-066x.49.12.997
Denton, F. (1985). Data mining as an industry. The Review of Economics and Statistics, 67, 124–127. https://doi.org/10.2307/1928442
Graham, J., Harvey, C., & Rajgopal, S. (2005). The economic implications of corporate financial reporting. Journal of Accounting and Economics, 40, 3–73. https://doi.org/10.1016/j.jacceco.2005.01.002
Harvey, C. (2017). Presidential address: The scientific outlook in financial economics. The Journal of Finance, 72, 1399–1440. https://doi.org/10.1111/jofi.12530
Leamer, E. (1978). Specification searches: Ad hoc inference with nonexperimental data. John Wiley & Sons.
McCloskey, D., & Ziliak, S. (1996). The standard error of regressions. Journal of Economic Literature, 34, 97–114.
McShane, B., Gal, D., Gelman, A., Robert, C., & Tackett, J. (2019). Abandon statistical significance. The American Statistician, 73, 235–245. https://doi.org/10.1080/00031305.2018.1527253
Mitton, T. (2022). Methodological variation in empirical corporate finance. Review of Financial Studies, 35, 527–575. https://doi.org/10.1093/rfs/hhab030
Mitton, T. (2023). Economic significance in corporate finance. Review of Corporate Finance Studies, forthcoming. https://doi.org/10.1093/rcfs/cfac008
Ohlson, J. (2025). Empirical accounting seminars: Elephants in the room. Accounting, Economics, and Law: A Convivium, 15, 1–8. https://doi.org/10.1515/ael-2021-0067
Simonsohn, U., Simmons, J., & Nelson, L. (2020). Specification curve analysis. Nature Human Behaviour, 4, 1208–1214. https://doi.org/10.1038/s41562-020-0912-z
Wasserstein, R., & Lazar, N. A. (2016). ASA statement on statistical significance and p-values. The American Statistician, 70, 129–133. https://doi.org/10.1080/00031305.2016.1154108
Ziliak, S., & McCloskey, D. (2008). The cult of statistical significance. University of Michigan Press. https://doi.org/10.3998/mpub.186351
© 2023 CONVIVIUM, association loi de 1901