
Accounting Research’s “Flat Earth” Problem

  • William M. Cready
Published/Copyright: September 19, 2022

Abstract

This essay advances the possibly startling notion that as a matter of largely conceptual understanding, the vast majority of accounting research’s empirical efforts concern tests of null hypotheses that are, a priori, false. Figuratively speaking, our journals specialize in providing compelling evidence that ‘the earth is not flat’. It further presents thoughts on the causes and consequences of this state of affairs. Moreover, while other fields also suffer from this ‘flat earth’ condition, it argues that as a field that specializes in issues surrounding the conveyance of useful information, the accounting academy should step up and take leadership on dealing with the problem. The essay closes by discussing how the field might venture forth from this sheltered approach to empirical inference and thereby enhance the discovery content of its empirical inquiries.

JEL Classifications: C12; B40; M40

Table of Contents

  1. Introduction

  2. Hypothesis Considerations

    1. Conceptual Hypotheses

    2. Individualistic Hypotheses

    3. Directional Hypotheses

  3. “Flat Earth” Enablers

    1. The Epsilon Equals Zero Fallacy

    2. The Competing Hypothesis Illusion

    3. The Significance Isn’t Easy Equivalency

    4. The Null-Free NHST Norm

    5. The “It Must Have Tension” Mandate

  4. “Flat Earth” Consequences

    1. Crowds out Useful Content

    2. Supplants Viable Nulls

    3. Sacrifices Descriptive Relevance

    4. Facilitates Misappropriation of Significance

  5. Conclusion

  6. References

Empirical Research in Accounting and Social Sciences: Elephants in the Room

  1. Empirical Accounting Seminars: Elephants in the Room, by James A. Ohlson, https://doi.org/10.1515/ael-2021-0067.

  2. Limits of Empirical Studies in Accounting and Social Sciences: A Constructive Critique from Accounting, Economics and the Law, by Yuri Biondi, https://doi.org/10.1515/ael-2021-0089.

  3. Accounting Research’s “Flat Earth” Problem, by William M. Cready, https://doi.org/10.1515/ael-2021-0045.

  4. Accounting Research as Bayesian Inference to the Best Explanation, by Sanjay Kallapur, https://doi.org/10.1515/ael-2021-0083.

  5. The Elephant in the Room: p-hacking and Accounting Research, by Ian D. Gow, https://doi.org/10.1515/ael-2022-0111.

  6. De-emphasizing Statistical Significance, by Todd Mitton, https://doi.org/10.1515/ael-2022-0100.

  7. Statistical versus Economic Significance in Accounting: A Reality Check, by Jeremy Bertomeu, https://doi.org/10.1515/ael-2023-0002.

  8. Another Way Forward: Comments on Ohlson’s Critique of Empirical Accounting Research, by Matthias Breuer, https://doi.org/10.1515/ael-2022-0093.

  9. Setting Statistical Hurdles for Publishing in Accounting, by Siew Hong Teoh and Yinglei Zhang, https://doi.org/10.1515/ael-2022-0104.

“I’m shocked, shocked to find that gambling is going on in here.” Captain Renault, CASABLANCA

1 Introduction

The inferential value of null hypothesis significance test (NHST) assessment rests in the identification of empirical evidence that is incompatible with tested null hypotheses (Principle 1 of the ASA Statement on Statistical Significance and P-Values, henceforth identified as the ASA Statement).[1] Consequently, the knowledge obtained from such assessment directly stems from the a priori plausibility of the hypotheses so tested. Producing evidence against plausible nulls possibly changes beliefs. Producing evidence against implausible nulls does not. As Leamer (1978) acutely observes, “one, in those circumstances, should trouble himself not with the results of classical hypothesis testing but rather with the question of why he bothered to test an obviously false hypothesis in the first place.” The discussion that follows argues that the testing of “obviously false hypothes(e)s” is widespread in the accounting literature. It discusses key factors contributing to this state of affairs and the adverse inferential quality consequences that flow from it. It also argues that accounting, as an academic discipline that studies the presentation of information useful to decision making, should be particularly concerned about removing such uninformative content from its journals.

This essay takes the view that a hypothesis is false a priori when common sense or prior knowledge enable a reasonable person to refute it without recourse to further scientific investigation. For instance, nowadays at least, the hypothesis that ‘the earth is flat’ is widely accepted as false. While NHST assessment of it can be imaginative, it is also philosophically sterile. The problematic nature of NHST assessment of a priori false hypotheses is hardly original to this essay. In the broader NHST integrity literature, Berkson (1938) identifies a priori false hypothesis assessment as a problem for absence of association nulls shortly after the advent of the NHST paradigm. Cohen (1994), the source of this essay’s title, addresses it at length. His specific concern pertains to tests of hypotheses that effects, differences, associations, and so on equal zero (as in do not exist). He derisively labels such hypotheses (less than null) “nil” hypotheses, arguing that, in social science settings at least, such conjectures are transparently false. Of particular relevance to this essay’s motivation, he further argues that researchers are commonly oblivious to how testing such hypotheses neuters NHST’s inferential relevance.[2]

The issue of a priori false nulls also appears occasionally in accounting articles. However, in line with Cohen’s arguments, the broader discussion accompanying such acknowledgements commonly reveals disturbingly incomplete recognition of how fully ‘flat earth’ hypotheses eviscerate NHST viability. Ball (2013), for instance, in his classic deconstruction of the idea that “‘earnings management’ is rife,” opens with the following statement regarding earnings management non-existence: “Of course earnings management goes on. Agency costs are positive. People have been tried and convicted.” (p. 850) Yet, his conclusion advances “better research designs” (p. 852) as a remedy. If the non-existence null is truly false, then how will better-founded evidence of it being so possibly improve matters? Indeed, to the extent that the problem lies with fundamental misunderstanding about “statistical significance,” wouldn’t such improvements simply make things worse? Even otherwise admirable direct engagements with NHST shortcomings run into trouble here. Dyckman (2016), for instance, observes that “any-no effect-hypothesis is always false in the social sciences” (p. 328), yet elsewhere takes such no effect nulls as viable (e.g., for Bayesian or Meta-Analysis assessment). Stone (2018) asserts that “the null hypothesis can never be true” (p. 108) before proceeding to his main concern: the importance of restoring integrity to rejections of such untrue nulls.[3]

It is important to point out that I broadly agree with the overall tenor of the Ball, Dyckman, and Stone arguments, and, with some reservations, remedies. In fact, I echo a number of them later in this discussion. My difficulty with their presentations is that much of what they argue presumes that the field commonly builds its analyses around viable, not ‘flat earth,’ null hypotheses. What exactly does replicating evidence that ‘the earth is not flat’ achieve? What does triangulating evidence against such a hypothesis validate? Or, for that matter, why would one ever produce posterior odds for the likelihood that ‘the earth is flat’? Similarly, why should the depositing of evidence that does little to refute the proposition that ‘the earth is flat’ into file drawers, or the favoring of research designs and measures that disfavor such a proposition (a.k.a. “p-hacking”), concern us that much? If the tested null is plausible these things matter. If it’s a priori false then the only possible inferential error is to suggest otherwise.

Finally, a central component of accounting’s ‘flat earth’ condition is, in line with Cohen’s (1994) critique, a myopic focus on assessing hypotheses that parameters or parameter differences of interest equal zero. Such assessments, upon achieving “statistical significance” status, do indeed yield data-driven directional location identifications of effects and differences of interest (e.g., that they are positive or negative, bigger or smaller, etc.) in an efficient-for-presentation fashion. The provision of succinct, yet comparatively coarse, descriptive insights, however, is not a basis for bestowing credibility upon tested hypotheses. Such credibility instead stems from a reasoned critical consideration of the conceptual basis for thinking that the hypothesis is possibly true. Yet, such zero parameter value conjectures are rarely even remotely credible. Moreover, producing evidence in support of this lack of credibility is not that difficult. Simply pick out an issue from any journal in the field. Evaluate its empirical contents not through the lens of their effective engagement with other explanations for reported “findings,” but rather critically consider what conditions must hold for each studied effect, relation or difference to be plausibly thought of as being entirely absent from the examined population (i.e., the case for being “nil”). Alternatively, for those disinclined to such an undertaking on their own, simply turn to the one provided in the Appendix. There you will, perhaps shockingly, discover that nil hypotheses such as taxation never deters, things going on at the office never distract, loan officers lack identity, and financial analysis never informs financial decision-making are conjectures deemed worthy of having doubt cast upon them by a leading journal in the field.[4]

2 Hypothesis Considerations

The discussion here broadly engages the misuse of hypothesis testing in the provision of relevant useful understanding of studied phenomena. As such, it addresses hypotheses on a number of salient dimensions. Most obviously, it concerns null hypotheses (i.e., hypotheses that NHST exercises seek to provide evidence against). It also, as discussed in the introduction, addresses null hypotheses in the form of assertions that effects or differences are entirely absent from or between studied groups or populations (i.e., “nil” hypotheses). Several further ways of thinking about hypotheses also feature prominently in its arguments. These concern: (i) conceptual hypotheses; (ii) individualistic hypotheses; and, (iii) directional hypotheses.

2.1 Conceptual Hypotheses

In general, a tested empirical null addresses the joint truth of: (1) some conjecture of (supposedly) direct relevance to a matter of research interest that I henceforth identify as the conceptual hypothesis (e.g., that earnings quality is unaffected by some information production choice, that firms never engage in a proposed form of earnings management, that enforcement of a tax rule does not inhibit taxpayer actions), and, (2) a set of further assumptions made in order to empirically evaluate the targeted conceptual construct (e.g., that abnormal accruals measure earnings quality, that returns are correctly risk-adjusted, that error terms are normally distributed, that empirical models are fully and correctly specified, etc.).[5]

With respect to this second component, the idea that these further necessary-for-empirical-assessment assumptions are never wholly or strictly true also informs the assertion that as-tested null hypotheses, nil or otherwise, are never true. In contrast, the conceptual or conceptually derived component of a NHST exercise may be true or false. Earnings management activity may or may not exceed some predetermined “rifeness” threshold. A certain lady may or may not lack the ability to discern the order in which tea and milk are poured into a cup. The speed of light may (as suggested by quantum mechanics) or may not (per classical physics) exceed 300,000 km/s.[6] This distinction between empirical assessment assumptions and the underlying phenomena of empirical interest matters because before one opts to enter a NHST assessment minefield, one should really consider what its successful navigation possibly accomplishes beyond displaying one’s adeptness at avoiding mines.

2.2 Individualistic Hypotheses

In general, statistical inference concerns estimating population parameters and testing hypotheses about such parameters. As such, it does not directly concern itself with what is going on with each individual subject in a population. For example, one does not infer that the height of every member of a population exceeds six feet simply because the population average exceeds six feet (even if the excess is “statistically significant”). On the other hand, conceptual hypotheses in accounting and other social science fields commonly proceed from subject level understandings of studied phenomena. We advance individualistic conceptual hypotheses that some subjects in a population exhibit some attribute, behavior, characteristic, etc. When the hypothesis development stops at this individualistic “some subjects do it” point two issues arise. First, it necessarily identifies the associated conceptual null as complete absence. That is, it advances the idea of establishing that no subject exhibits “it” as a belief meriting falsification. Outside of the obviously absurd (e.g., that unicorns do not exist) such complete absence conjectures invite disbelief in most social science settings. We are studying humans after all. And, both individually and in groups, some humans do tend to respond to, differ about, or associate with experienced incentives, stimuli, and conditions irrespective of notions of how they should or ought to respond or associate with such things. In the accounting domain, just how likely is it that no firm responds to an earnings management incentive, that taxes never inhibit taxed activities, that experienced signals never elicit subject level responses, etc.?

Second, the population parameters we estimate are not well-suited to individualistic non-existence assessments. For example, even establishing with certainty that a variable’s mean exceeds zero does not preclude the existence of individual members exhibiting negative values for such a variable. In fact, the best evidence against an individualistic non-existence null is not a p-value of 0.00000001, but rather the identification of a population member that exhibits the supposedly non-existent behavior, characteristic, response, etc. (i.e., a non-stochastic counterexample).
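The point can be made concrete with a small simulation (a hypothetical illustration with arbitrary numbers, not drawn from any study discussed here): a population whose mean is overwhelmingly “significantly” positive can still contain a large number of members with negative values, so the individualistic non-existence null is settled by exhibiting counterexamples, not by the p-value on the mean.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical population: the mean response is clearly positive...
population = rng.normal(loc=0.5, scale=1.0, size=100_000)

# ...and a one-sample t-test of "mean = 0" rejects emphatically.
t, p = stats.ttest_1samp(population, popmean=0.0)
print(f"one-sample t-test of mean = 0: p = {p:.2e}")

# Yet many individual members exhibit negative values, so the
# individualistic null "no subject responds negatively" is refuted
# by a single counterexample, something the p-value never addresses.
negatives = int((population < 0).sum())
print(f"members with negative values: {negatives}")
```

A rejection of the mean-zero null and the existence of counter-directional individuals comfortably coexist, which is why population-level p-values are the wrong tool for individualistic non-existence claims.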

2.3 Directional Hypotheses

Tested empirical null hypotheses in the accounting field are typically non-directional. Studies identify an empirical metric or difference related to a conceptual issue or hypothesis of interest and test whether it equals zero or not. Nominally, the alternative hypothesis in such a setting is also non-directional. Positive and negative underlying effects or differences are equally incompatible with a proposed zero point value null hypothesis. That said, many studies do actually have a clear directional conceptual alternative to their tested non-directional nulls in mind. So, an earnings management study might test the nil hypothesis that earnings management doesn’t happen in order to advance its case for a conceptual argument that it should increase, even though there is no basis for thinking that it could decrease as well.[7]

The prevalence of directional alternative hypothesis conceptualizations in the field raises the possibility that one might mitigate accounting’s ‘flat earth’ problem by replacing non-directional nil hypotheses with directional zero-bounded null hypotheses. That is, motivate and test null hypotheses that effects are less than or equal to zero when the study’s advanced alternative is that the effect is positive, or that effects are greater than or equal to zero when the study’s advanced alternative is that the effect is negative. Unfortunately, as is also seen in the appendix examinations, such reframing rarely works. It fails because it does not alter the advanced individualistic directional alternative. Solid reasons still exist for thinking that some subjects in the population exhibit the proposed directional relation. But, the directional individualistic null is that no subject in the population exhibits the alternative directional relation, effect, response, or difference.
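For readers who want to see the mechanics, the sketch below (illustrative numbers only, using SciPy's one-sample t-test) contrasts the two-sided point null with its directional, zero-bounded counterpart. The reframing halves the p-value when the estimate lies in the hypothesized direction, but, as argued above, it leaves the individualistic content of the null untouched.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical sample of firm-level effects with a positive true mean.
effects = rng.normal(loc=0.2, scale=1.0, size=400)

# Two-sided test of the point null H0: mean = 0.
t, p_two = stats.ttest_1samp(effects, popmean=0.0)

# Directional, zero-bounded null H0: mean <= 0 vs H1: mean > 0.
_, p_one = stats.ttest_1samp(effects, popmean=0.0, alternative='greater')

print(f"two-sided p = {p_two:.4f}, one-sided p = {p_one:.4f}")
```

When the t-statistic is positive the one-sided p-value is simply half the two-sided one; the mechanical gain in “significance” says nothing about whether the directional null was ever plausible.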

3 “Flat Earth” Enablers

While the descriptive attractiveness, as opposed to inferential relevance, of establishing which side of zero that an effect generally falls is the prime enabler of accounting’s ‘flat earth’ problem, there is no shortage of supporting players in the matter. Among the most obvious of these are statistical software packages that, entirely unasked, provide their users with nil hypothesis test results for every estimated parameter requested of them. Consequently, users are relentlessly bombarded with the subliminal message that such hypotheses have inherent importance and meaningfulness.[8] Other key drivers of the problem that particularly speak to why flat earth hypothesis assessment is so firmly entrenched in the field include: (i) the epsilon equals zero fallacy; (ii) the competing hypothesis illusion; (iii) the significance isn’t easy equivalency; (iv) the null-free NHST norm; and, (v) the “it must have tension” mandate.

3.1 The Epsilon Equals Zero Fallacy

In mathematical science “epsilon” commonly identifies a variable that takes on the property of possibly being infinitesimally different from zero, while never actually reaching zero. Importantly, entire fields of mathematics (e.g., the calculus) depend quite critically on epsilon not being zero. Similarly, a test of a hypothesis that a parameter value or difference equals or is bounded at zero is not equivalent to a test of a hypothesis that the parameter value is epsilon or within epsilon of zero. If the parameter value is epsilon, then it is not zero and the null hypothesis that it is zero is false. Hence, commonly encountered arguments advancing null hypothesis viability that appeal to notions that deviations from it are likely weak, small, negligible, inconsequential, etc., are unfounded. Given sufficient power, a test that a parameter is zero will reliably return rejection outcomes when the underlying parameter value is epsilon.[9]
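The final sentence is easy to verify by simulation. The sketch below (an illustration with arbitrary parameter values, not tied to any accounting dataset) fixes a substantively negligible true effect and shows that the point null “effect = 0” survives at modest sample sizes only to be rejected once power is high enough.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
epsilon = 0.01   # a substantively negligible, but nonzero, true effect
sigma = 1.0

# The same epsilon-sized effect, tested at two sample sizes.
for n in (1_000, 1_000_000):
    sample = rng.normal(loc=epsilon, scale=sigma, size=n)
    _, p = stats.ttest_1samp(sample, popmean=0.0)
    print(f"n = {n:>9,}: p = {p:.4f}")
# As n grows, the point null 'effect = 0' is reliably rejected even
# though the true effect is only epsilon away from zero: rejection
# reflects power, not substantive importance.
```

The rejection at large n is guaranteed by construction, which is precisely why it carries no inferential news about whether the effect matters.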

3.2 The Competing Hypothesis Illusion

Studies commonly focus their analyses around directional existence alternatives to tested nil hypotheses. In some instances, a study further identifies opposite direction “competing” hypotheses as a means of injecting a priori doubt about things. Rarely, however, is it the case that the truth of such a competitor precludes the existence of the proposed directional effect. So, for instance, the fact that reasons exist for thinking that supplier industry concentration might sometimes decrease cash compensation to executives (see appendix analysis of Carter et al., 2021) does not preclude the advanced idea that it also sometimes increases it. Indeed, from a strict NHST inference perspective of things, all the introduction of such a hypothesis does is provide another reason for viewing the tested non-existence of anything “nil” hypothesis as implausible.[10]

3.3 The Significance Isn’t Easy Equivalency

Obtaining robust rejection outcomes in accounting research inquiries is, in my own experience at least, rather difficult. A statistically insignificant outcome in an examination addressing a seemingly reasonable conjecture regarding an accounting issue of interest is a distressingly common occurrence. Consequently, obtaining a rejection outcome is something easily taken as a badge of accomplishment. It is a mistake, however, to equate difficulty in rejecting null hypotheses with the idea that such hypotheses are truly viable and not ‘flat earth’. Accounting research commonly addresses second order effects with noisy measures (e.g., abnormal accruals measuring earnings quality) and imperfect research designs (e.g., the abundance of ad hoc control variables and ad hoc measurement of such variables). It is hardly surprising that empirical examinations in the field commonly fail to reject nulls, even when such nulls are transparently false.

3.4 The Null-Free NHST Norm

When I started writing the initial draft of this essay the latest available edition of The Accounting Review was March, 2020. This issue consists of 14 articles, 12 of which employ NHST assessment.[11] All make extensive use of the word “significant,” with 10 of the articles invoking it over 25 times (mean usage is 34 times). The word “null,” on the other hand, appears in just four of these articles, once in three of them and twice in one of them. Moreover, three of these five usages motivate two-sided “competing hypothesis” inquiries. The remaining two describe: (1) an expected null outcome non-“finding” (see Cready et al., 2022); and, (2) a rejection outcome obtained in a final robustness assessment. In other words, the literature aggressively embraces the “significance” aspect of NHST, but shuns the “null” aspect that gives rise to such “significance.” Given this out-of-sight/out-of-mind attitude regarding nulls, why is it at all surprising that ‘flat earth’ null hypotheses thrive in our research efforts?

3.5 The “It Must Have Tension” Mandate

Ask any serious accounting scholar today about what an empirical article must have to stand a chance of publication; the answer will reliably center on “tension.” Articles must address or raise an issue about which some perceivable doubt or conflict exists and, via the application of statistical assessment of relevant evidence, provide substantive resolution of such doubt or conflict. This demand for tension pushes research to apply NHST based on the mistaken notion that it divines the “significant” from the “insignificant.” More critically, it leads researchers to be thoughtful about alternative hypotheses (that they wish to be “significant”) but thoughtless about the null hypotheses that NHST actually addresses. Moreover, if the sole objective is NHST rejection, then ceteris paribus, testing a priori false hypotheses surely beats testing ones with a reasonable chance of being true. Thus, researchers readily fall into the trap of resolving fake “tension” by testing “earth is flat” hypotheses.

4 “Flat Earth” Consequences

In and of itself, subjecting ‘flat earth’ hypotheses to NHST assessment, when properly understood as such, is little more than an exercise in redundancy. When they return “statistically significant” rejections, such exercises simply further confirm that the earth truly isn’t flat. When they don’t, they indicate that the chosen research design is possibly not up to the task of discriminating truth from fiction. Consequently, it is tempting to view the testing of such hypotheses as a sort of benign tumor that peacefully exists within the discipline, a tumor perhaps best left undisturbed. After all, while testing such hypotheses doesn’t truly add to empirical understandings of phenomena of interest, neither does it lead researchers astray from what is true about things.[12]

In this section I identify some reasons why, to the contrary, ‘flat earth’ hypothesis testing’s impact on what is learned from our research is far from benign. It is a malignancy that, among other things: (1) wastes scarce article space and researcher resources on developing and conducting pointless analyses; (2) supplants the use of viable null hypotheses; (3) leads researchers to sacrifice descriptive saliency in their design and measurement choices in the pursuit of unassailable evidence of the curvature of the earth; and, (4) opens a wide pathway for misleading “significance” attributions to evidence.

4.1 Crowds out Useful Content

The typical empirical accounting article adopts a near monolithic focus on NHST assessments of data. Putting together a solid convincing empirical design that defensibly identifies a likely obscure effect of interest typically requires a good bit of textual presentation and research design development. Studies further supplement their earth isn’t flat “findings” with clever NHST based robustness and salient cross-sectional validation analyses, requiring still further textual development, presentation, and analysis. The article space required for all of this inferentially useless content leaves little or no space for the pursuit of other inferential approaches to understanding studied phenomena. So it is hardly surprising that notions such as substantive incremental explanatory power assessment (e.g., Ohlson, 2015), Bayesian assessment strategies (e.g., Dyckman, 2016), reverse regression bounding of estimates (e.g., Klepper & Leamer, 1984; Cready et al., 2000), or confidence interval assessments (e.g., Dyckman, 2016) rarely appear in our journals.

A particularly distressing crowding out consequence is the limited space provided to assessing the (“economic”) importance of studied parameter values in the occasional article that chooses to address it. Typically, such efforts proceed by taking their parameter estimates as exact representations of associated unobserved parameter values and attaching some sort of sketchy reasoning as to why such values possess “economic significance.” Such an approach certainly saves space. I’ve seen it done, and done it myself, in as little as two sentences and cannot recall it ever requiring more than a paragraph. Often it is nothing more than a footnote. But, this saved space comes with a cost. Most obviously, it simply drops estimation error out of the picture. Consequently, the discussion blissfully ignores the possibility that a study’s “statistically significant” estimate of a parameter is substantially larger than the value of the parameter so estimated. More broadly, “importance” is a fundamentally elusive construct. One or two sentences can’t possibly provide a clear, thoughtful, or objective perspective about it, particularly with respect to identifying and addressing plausible alternative importance viewpoints.
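The worry that “statistically significant” estimates systematically overstate the parameters they estimate can be illustrated by simulation (hypothetical numbers chosen for illustration; the mechanism is the selection-for-significance effect sometimes called the winner's curse, not any particular study's result). When power is modest, conditioning on significance filters in the studies whose sampling error happened to inflate the estimate.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
true_effect, sigma, n, n_studies = 0.05, 1.0, 1_000, 2_000
se = sigma / np.sqrt(n)

# Simulate many identically designed studies of a small true effect:
# each study's estimate is the true effect plus sampling error.
estimates = rng.normal(loc=true_effect, scale=se, size=n_studies)

# Keep only the estimates that clear the conventional 5% hurdle.
z_crit = stats.norm.ppf(0.975)
significant = estimates[np.abs(estimates) / se > z_crit]

exaggeration = significant.mean() / true_effect
print(f"share significant: {len(significant) / n_studies:.2f}")
print(f"mean significant estimate / true effect: {exaggeration:.2f}x")
```

Because only the luckier draws clear the significance hurdle, the average published-as-significant estimate materially exceeds the true effect, exactly the estimation error that a two-sentence “economic significance” discussion ignores.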

4.2 Supplants Viable Nulls

In a setting where obtaining statistical significance is the prime objective, it readily follows that researchers wish to maximize the likelihood of reporting such significance. Ceteris paribus, given a choice between testing a hypothesis that is surely false and one that is possibly true, what rational researcher would choose to take on the one that might have a chance of being true? When we welcome ‘flat earth’ hypothesis rejection findings with open arms, we destroy incentives for researchers to take on null hypotheses that we might actually learn something from producing evidence against.

4.3 Sacrifices Descriptive Relevance

Researchers make design and measurement choices with their chosen research objective in mind. Therefore, studies seeking to identify existence by testing a non-existence null choose designs and measures targeting existence. With some regularity, they achieve such targeting by discounting or compromising descriptive understanding of studied phenomena. So, we replace continuous variables with their decile ranks, thereby warping natural understanding of independent variable variation and distorting co-variation among sets of independent variables. We delete substantial numbers of observations pertaining to real populations to produce “matched” samples corresponding to fictitious populations where the key redeeming quality is the provision of robust platforms for existence identification as opposed to measuring just what the effect actually looks like in a real population. We control for fixed effects that are in part determined by studied phenomena of interest in order to demonstrate that existence survives such inclusion. And so on. Hence, we sacrifice accurate measurement of phenomena in the pursuit of robust defensible existence identifications of things which, if we gave much a priori thought to them, we would have little doubt about.

One particularly popular route that moves research efforts away from descriptive relevance in the pursuit of certitude with respect to ‘flat earth’ hypothesis falsification is the use of identification-centric designs such as “discontinuities,” “exogenous shocks,” “instrumental variables” and “natural experiments.” The common theme to such approaches is the isolation of a sliver of “clean” exogenous variation in an independent variable of interest. Such clean variation provides a compelling test of the hypothesis that an independent variable of interest never has a causal impact on a studied dependent variable. Such selective variation focused designs, however, have limited relevance for identifying the overall consequence of the independent variable for the studied dependent variable.[13] They fail in this regard because such consequence is determined in good part by the total level of exogenous variation commonly exhibited by a variable, not a sliver of it picked out for examination by these sorts of analyses. Indeed, most such slivers are anything but representative. Rather, they are products of odd, unusual, or trivial happenings, circumstances that have little to do with the level of independent variation commonly exhibited by a variable.

4.4 Facilitates Misappropriation of Significance

Cumming (2012) describes the “slippery slope of significance” as follows: “An effect [that] is found to be statistically significant, is described, ambiguously, as ‘significant,’ and then later discussed as if it had been shown to be ‘important’ or ‘large’.” So, while testing hypotheses that the earth is flat achieves little in the way of inferential insight, it does lead directly to the very top of this slope. So, at best, the widespread entertaining of such hypothesis tests is simply a way for us to fool ourselves about the “importance” of “discovering” evidence about things we already (ought to) know.

5 Conclusion

The ‘flat earth’ concerns raised here are certainly not unique to accounting research endeavors. This general nature of the problem, however, does not alter the fact that flat earthiness truly does severely inhibit knowledge production in the accounting domain. Other fields doing the same thing does not mitigate this consequence in the slightest. Moreover, in my view the relativist “everyone else is speeding too” defense here stands on rather shaky ground. If the appendix analysis is any indication, an indication that is certainly consistent with my own casual observation of publications in the field (inclusive of my own), flat earthiness is a pervasive phenomenon in our empirical endeavors. Outside the confines of Flat Earth Society chapter houses, it is difficult to imagine how things could possibly be much worse.

Another important reason that accounting should take its ‘flat earth’ problem seriously follows from the very nature of accounting practice as we study it. What accounting does is summarize relevant information for use by decision-makers, information which it commonly seeks to faithfully represent. Accounting researchers, in turn, study these reporting practices inclusive of their impacts on affected parties’ choices, decisions, and behavior. In doing so, we are widely concerned with factors that degrade the qualitative content of such information inclusive of systematic misreporting practices. Should the field not be particularly concerned about the integrity and relevance of what our journals publish?

Similarly, is the widespread provision of summarized evidence falsifying a priori false conjectures about things truly where a field that studies the representationally faithful provision of useful imperfect information wants to be? Taken literally, such efforts do not possibly produce new knowledge. Taken as some sort of pathway to certifying identified (alternative hypothesis) insights about things as “statistically significant”, it is deceptive and, to boot, is a markedly limited approach to producing relevant inference about the importance or consequence of studied relations. As Principle 6 of the ASA Statement makes clear, “By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.” The American Accounting Association widely proclaims itself as being “thought leaders.” Wouldn’t “thought leadership” here be taking actions that move the academy away from a status quo in which producing evidence that ‘the earth is not flat’ is the accepted research effort norm?

So, what might the path leading out of the flat earth swamp look like? Well, it most certainly requires a broad understanding that presentations of evidence against a priori false hypotheses lack inferential value. Given such an understanding, it follows that reputable journals in the field should have little interest in publishing studies majoring in such analyses. Indeed, they might go so far as to put in place a policy of desk rejecting such submissions.[14] What might happen then? Well, for one, the April–May 2021 issue of the Journal of Accounting and Economics would be considerably thinner. For another, it would free up space for the provision of something more than fundamentally pointless existence identification content. It would also shut down a cheap, in relative terms at least, pathway to fake statistical significance outcome drama.

As authors generally respond to clearly conveyed messaging about what does and does not get one published, one certain response to a shutdown of the flat earth null charade is a shift to tests of plausible nulls. So, for instance, sterile studies reporting evidence incompatible with no firm ever managing earnings in the face of a plausible incentive to do so would disappear, possibly replaced by studies addressing the more lively (and challenging) nulls of whether the level of earnings management in studied settings is of only inconsequential magnitude. In doing so such studies would need to grapple with just what constitutes a consequential level of earnings management or, for that matter, just how small a level is practically equivalent to its being non-existent.

In general, of course, replacement of non-existence nulls with inconsequentiality nulls is something easily said, but quite difficult to pull off in terms of formal empirical assessment. The appendix analysis, for instance, identifies high-level inconsequentiality replacement candidate nulls for most of the examined articles. That’s not very hard to do. Empirical assessment of these nulls, however, requires numerical inconsequentiality cutoffs. Identifying such cutoff values, commonly known as “minimum practical significant distance” (MPSD) values (see, for instance, Goodman et al., 2019), is generally challenging. In the case of the appendix analysis, for instance, I really don’t know what these cutoffs should look like. I am not an expert in any of the studied areas. I would, however, be most interested in knowing what the authors of these studies think these cutoffs should look like. They are, after all, experts in the studied areas.
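To make the contrast concrete, here is a minimal sketch, in Python, of what testing an inconsequentiality null rather than a nil involves. The effect estimate, standard error, and MPSD value below are entirely hypothetical; the point is only that the test statistic is computed against a cutoff rather than against zero, and that the nil is the least demanding null available.

```python
import math

def norm_sf(z):
    """Standard normal survival function P(Z > z), via erfc."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def consequentiality_test(estimate, se, mpsd, alpha=0.05):
    """One-sided test of the inconsequentiality null H0: effect <= mpsd
    against H1: effect > mpsd (effect presumed positive).

    Setting mpsd = 0 recovers the usual one-sided nil test; any
    positive mpsd makes the null harder to reject."""
    z = (estimate - mpsd) / se
    p = norm_sf(z)
    return z, p, p < alpha

# Hypothetical numbers: a discretionary-accrual effect estimated at
# 0.030 with standard error 0.010, and a posited MPSD of 0.005.
z, p, reject = consequentiality_test(0.030, 0.010, 0.005)
print(f"inconsequentiality null: z = {z:.2f}, p = {p:.4f}, reject = {reject}")

# The same evidence tested against the nil (mpsd = 0) rejects more
# easily -- which is exactly why nil rejections carry so little news.
z0, p0, reject0 = consequentiality_test(0.030, 0.010, 0.0)
print(f"nil null: z = {z0:.2f}, p = {p0:.4f}, reject = {reject0}")
```

The hard part, as the text notes, is not the arithmetic but defending the posited MPSD before seeing the evidence.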

The challenges involved in determining consequentiality cutoffs might also inspire researchers to shift away from hypothesis assessment as the preferred inferential strategy. Faced with the necessity of engaging the question of quantifying consequentiality on a before-the-evidence-is-seen basis, approaching inference as, to paraphrase Leamer (1978), an effort to learn more about unknown models (i.e., “specification searches”) becomes a reasonable, arguably highly preferable, alternative to NHST-driven inference. Hence, for example, instead of testing the null hypothesis that the percentage of firms managing earnings is less than 50% (an admittedly ad hoc dividing line between “rife” and not “rife”), analyses might set out to get the best possible idea of what this earnings management rate is. The case for whether or not it is rife would then rightly center around such evidence-suggested estimates. Evidence suggesting that the rate is, say, under 5% would perhaps be unsupportive of earnings management being rife, while evidence that the rate is clearly higher than 50% might suggest that Ray Ball has it wrong.[15]
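A sketch of this estimation-first alternative, using a standard Wilson score interval and entirely hypothetical counts: rather than testing the 50% dividing line, one reports where the evidence locates the rate and lets the “rife” debate center on that location.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """95% Wilson score interval for a proportion -- an estimation-
    oriented alternative to testing a single ad hoc cutoff."""
    phat = successes / n
    denom = 1.0 + z**2 / n
    center = (phat + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(phat * (1 - phat) / n + z**2 / (4 * n**2))
    return center - half, center + half

# Hypothetical: 120 of 1,000 sampled firm-years classified as
# managing earnings.
lo, hi = wilson_interval(120, 1000)
print(f"point estimate: {120/1000:.1%}, 95% interval: [{lo:.1%}, {hi:.1%}]")
# Here the whole interval sits well below 50%, so the question of
# 'rife' is answered by where the interval lies, not by a p-value.
```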

Moreover, the stochastic nature of the exercise necessarily limits the relevance of point value estimates in such exploring-of-the-unknown approaches to evidence. While they may be “best guesses,” they are also, as exact guesses, reliably wrong. Hence, we might see a good deal more in the way of range identifications addressing the general location of effects of interest suggested by the examined evidence. That is, the use of confidence (compatibility[16]) intervals, credibility intervals, reverse regression based bounds, and engaged presentations of (as opposed to cherry-picking from) effect value ranges determined from plausible alternative empirical specifications would become norms rather than outliers.

When Christopher Columbus set out for India he was not, Washington Irving’s dramatization of things notwithstanding, seeking to disprove the null hypothesis that the earth was flat (Federer, 2021; Wallis, 1992). He and his contemporaries knew otherwise. Rather, he sought to discover a new, hopefully better, route to trade with India, a discovery that eluded him in part because of a poor descriptive understanding of the distances involved. Nevertheless, his identifications of what was out there changed the world. Perhaps, if the accounting research field abandoned its vigorous pursuit of the fake drama of NHST assessment of flat earth hypotheses and instead focused on the often messy business of faithfully representing enhanced understandings of things we know are out there, we might truly change our understanding of our world. Who knows, we might even “earn our keep” (Ohlson, 2021).


Corresponding author: William M. Cready, The University of Texas at Dallas, Richardson, USA, E-mail:

Acknowledgments

I thank Shyam Sunder for encouraging me to write this paper. The paper has benefited considerably from comments provided by Catharine Cready, Judith Hermis, Yuri Biondi, and the anonymous reviewers.

Appendix

“Flat Earth” Null Hypothesis Testing in the April–May 2021 Issue of the Journal of Accounting and Economics.

After acceptance of a prior draft of this essay for presentation at the July 2021 SASE conference, I learned that John Core, an editor of the Journal of Accounting and Economics, would be discussing it. Based on feedback I had received on earlier versions, I anticipated that the absence of empirical support could be a concern to some. Hence, in preparation for the conference it seemed fitting that I put together the analysis that follows, in a somewhat less developed form, as a means of addressing this weakness. It concerns ‘flat earth’ prevalence, circa 2021, in the Journal of Accounting and Economics by means of a detailed textual assessment of the contents of its April–May issue.

The April–May 2021 issue contains 14 articles, 13 of which report NHST based assessments of evidence. Table A1 provides my identification of the central nil hypotheses in these articles along with my underlying reasoning for their being ‘flat earth’. Most frame their analyses around conceptual hypotheses that, even if reconstituted in a directional form, are, in my judgement, ‘flat earth’. For the most part these hypotheses are individualistic: from an alternative hypothesis perspective, the underlying motivation is identifying whether some proposed relation, characteristic, difference, etc., exists in some members of a studied population. Outside of prediction settings, in social science settings the absence-of-existence nulls implied by such conjectures are invariably ‘flat earth’. One article, Balakrishnan et al. (2021), employs nil hypotheses as nothing more than a means of providing directional descriptive inferential insights about matters. But, it then falls into the trap of inferring that an absence of statistical significance implies that the underlying parameter is zero or of inconsequential magnitude (Cready et al., 2022). Statistical significance assessment is a remarkably poor stand-alone approach to obtaining undirected inferential insights from evidence. In this case, minimally, reporting likelihood ratios or confidence intervals would be a far better approach.

Table A1:

Identification and relevant supportive analysis of “flat earth” null hypotheses in articles published in the April–May 2021 issue of the Journal of Accounting and Economics.

Article Key (implied) conceptual null hypothesis(es) Flat earth analysis
Anantharaman & Henderson (2021)
  1. Going concern based valuation of pension obligations, on average, explains exactly the same amount of (incremental) variation in debt values as settlement based valuation.

  2. Going concern based valuation of pension obligations, on average, explains exactly the same amount of (incremental) variation in equity values as settlement based valuation.

The two nils here nominally address explanatory power horse races between going concern and settlement valuation numbers. As such, they are not individualistic, but rather address the on-average relative performances of going concern and settlement based approaches for valuing: (1) debt; (2) equity. As the likelihood that either race ends as an exact tie is essentially zero, one can safely presume that differences between valuation approaches exist. Hence, these nulls are ‘flat earth’. The possibly more productive way to view the hypothesis structure here, however, is as assessing directional rather than existence of difference nulls. Specifically (ignoring the exact tie possibility for expositional convenience), the study seeks to produce evidence against the following two nulls:
  1. Relative to Going Concern valuation, Settlement valuation explains less variation in debt values.

  2. Relative to Settlement valuation, Going Concern valuation explains less variation in equity values.

The possible ‘flat earth’ nature of this view of the exercise emerges when one recognizes that the analysis hinges on the plausibility of the joint truth of two nulls. How plausible is it that settlement based valuation explains less variation in debt values and more variation in equity values? The article itself is silent on this point, focusing instead on how the evidence supports the importance of settlement numbers for debt valuation. Hence, one is left with a test of a null hypothesis pair that there is little reason to think is jointly true.
Armstrong et al. (2021) CEO voluntary share purchases never provide indirect retention benefits.

CEOs never purchase firm shares in order to (successfully) prolong their tenure.
As seen from its title, “Are CEO share purchases more profitable than they appear?,” the article advances the idea that CEOs sometimes purchase shares to inordinately prolong their tenure, thereby providing extra retention derived value to them. The two nils here are that neither of these things ever happens. The article provides no insights about why one should view such non-existence hypotheses as plausible. Nor, can I think of any. The counter-directional alternatives, that such actions generally shorten rather than extend tenure or impose net penalties on CEOs, seem imaginative at best and also stray from the individualistic existence motivation adopted by the article.
Beardsley et al. (2021) Office level Non-Audit Services (apart from client-specific NAS) never distract from the quality of its audits. The key idea here is that things going on at the office, entirely apart from what is going on with the specific client, are sometimes a source of distraction. The storyline is individualistic at the client engagement level. The idea that office level things do not, on occasion, distract is implausible. Shifting matters away from the individual engagement to the overall average impact of office level NAS does little to counteract this implausibility. Such a directional perspective of things requires a basis for thinking that office level NAS (apart from client-provided NAS) commonly enhances audit quality. The article, however, provides no such discussion and I have a hard time seeing a basis for it.
Li et al. (2021) Increased state level taxation of innovation does not adversely impact innovation activity by firms. How plausible is it that taxation (via addback enforcement) never impacts the behavior of those who are taxed? Or, taking the nil as addressing state level taxation efficacy, that state addback tax laws are always impotent. Finally, converting this to a directional assertion about the on average effect of such taxation on R&D requires a rationale for why taxing R&D possibly leads firms to spend more on it. The article provides no such rationale and it certainly lacks any sort of intuitive appeal.
Balakrishnan et al. (2021) This analysis employs NHST as a means of providing descriptive insights regarding analyst cost of equity estimates. Hence, while it tests a bevy of nil hypotheses, these tests are directed at identifying overall relations in the data. The analysis addresses overall relations between analyst cost of equity (CoE) estimates and various other metrics of academic interest. Tested nils include: CoE estimates are unrelated to future realized returns, beta, book-to-market ratio, size, leverage, volatility, profitability, etc. Assuming that CoE estimates have some grounding in the economic reality of firms, the idea that they are entirely unrelated to other measures of such reality is not plausible. Hence, there is little reason for thinking that any of the study’s tested nils are true, including those that the analysis, based on a flawed understanding of statistically insignificant outcomes, mistakenly suggests (Amrhein et al., 2019) are supported by the evidence. Fundamentally, the analysis is using NHST assessments of flat earth hypotheses as a mechanism for directionally locating effects of interest. Such an approach does provide relevant insights about the directional location of underlying parameters of interest suggested by the examined evidence. That is, it has estimation value, as opposed to test of hypothesis value. However, “By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.” (Principle 6 of the ASA Statement) Approaches that “emphasize estimation over testing” (e.g. confidence intervals) provide far more comprehensive descriptive insights than what is obtained from p-values. Such misapplication of NHST also opens the door for thinking that statistical significance of a relation somehow identifies it as having importance or consequence.
Kim & Valentine (2021) No firm’s innovative activity level is impacted by its own or its rivals’ patent disclosure levels. Relevant information (and commonly irrelevant data as well) affects decisions. Why would innovation information be an exception? What is less clear is the directional impact of such information. Indeed, one suspects that such information leads firms to increase innovative activity in some settings and decrease it in others. But, what is the rationale for thinking they will exactly offset? And, if not, then what is the directional null of particular interest here? (Otherwise one is simply engaging in blind specification search under the guise of NHST.)
Bernard et al. (2021) Initial entrant firms never rely upon existent firm leverage ratios in making financing choices. Uncommon, perhaps. But, never? Are financial analysis courses and industry ratio books that much of a waste of time and effort? Furthermore, as negatively relying on information is impossible, opposite-direction storylines are not a thing here.
Herpfer (2021) No banker in the studied population exhibits persistent idiosyncratic preferences in making syndicated loan decisions (i.e. all deviations from a more general bank level lens model specification are purely random);

No banker in the studied population gains information asymmetry attenuating knowledge from an ongoing banker/borrower relationship.
While there is certainly a reasonable basis for thinking that the persistence and asymmetry reduction impacts here are of inconsequential importance, that is not what is tested. There is no sensible reason at all for thinking that they never happen. Importantly, much of the analysis takes the rejection of these hypotheses as a basis for addressing the far more salient issue of how important they are in the broader scheme of things. NHST, however, is not well-suited to such identification/estimation endeavors. A more complete analysis would engage things such as the precision attached to provided estimates, parameter (not meaningless F-statistic) confidence intervals, and likelihood assessments.
Bushman et al. (2021) In making lending decisions loan officers are (layered) lens model + random variation robots with respect to interest spreads and loan covenant designs.

Loan performance does not differ by loan officer (after controlling for many other things).
Individuals have identities. A conjecture that cross-officer differences, inclusive of performance differences, may be inconsequential has plausibility. The idea that they are, for all intents and purposes, clones does not. It certainly doesn’t require an elaborate test statistic distribution simulation (as presented in Table 5 of the article) to confidently identify the implausible nature of such nothing-but-clones-here nulls. In leveling this criticism, I think it pertinent to point out that the article’s thrust is not the existence of systematic cross-officer differences but rather their consequences as measured by statistical significance attainment and how much dependent variable variation they explain. Unfortunately, as Principle #5 of The ASA Statement makes clear, statistical significance is not a solid basis for attributing consequence. And, per Principle #6, p-values do not “provide … good measure(s) of evidence.” Point estimates, on the other hand, have some relevance but they are really only a starting point. More complete consequence assessment must also take into account the uncertainty associated with them.
Kepler (2021) Strategic alliance private information sharing never leads firms to reduce their level of public information disclosure. While rarely might work here, never is a problem. How plausible is it that the existence of a private information channel for delivering an item of information to key interested parties never impacts any firm’s reluctance to disclose it publicly?

Kepler also devotes a paragraph of his motivation to the presentation of a possibly confounding competing hypothesis. He observes that since strategic alliances involve information sharing, involvement in them provides firms with a larger body of information to possibly disclose. Given that there are benefits to such public disclosure, this line of reasoning suggests that alliance involvement could, on average, increase rather than decrease overall disclosure. As this effect offsets rather than precludes the article’s advanced substitution story it does not speak at all to the truth of the existence of a substitution effect that, outside of this lone paragraph, the article focuses on. Rather, it represents a reason why it may be difficult to clearly identify such existence in the examined evidence.
Dey & White (2021) Human capital protection never affects any firm’s takeover provision choices. Rarely, not very much, only a little bit. Sure. But never? Or, if we adopt the article’s advanced directional argument that enhanced human capital protection is one reason firms sometimes adopt such provisions, then what is the reason for thinking that removal of such protection is a factor influencing firms to adopt such provisions?
Heese & Perez-Cavazos (2021) Retaliation costs never deter any employee from engaging in whistleblowing. The focus of the analysis is very clearly on the empirical identification of the existence of a negative relation between whistleblowing costs/consequences and inclination to engage in whistleblowing. Specifically, it assesses how an exogenous decrease in the cost of whistleblowing (in the form of increased unemployment benefits) affects whistleblowing frequencies. The article’s clear attraction is that the production of compelling empirical evidence of such a relation has proven elusive. However, the underlying conceptual hypothesis here is flat earth. There is no basis for thinking that retaliation costs positively impact inclination to engage in whistleblowing, at least in a dominating the relation sense of things. And, there certainly isn’t any reason for thinking that whistleblowing inclinations are entirely unrelated to the potential costs/consequences of engaging in it.
Dong & Young (2021) -Firm earnings announcements never convey information (i.e., information transfer) about firms in other countries.

-Within country within industry information transfer is, on average, less than or equal to the sum of without the country industry information transfer and non-same industry (macro) information transfer effects.
Seriously? (Note, negative information transfer is not a thing. So, there is no opposite direction offset argument in place here.)

Why would a direct comprehensive information source (within country within industry) ever (in expectation) underperform a cobbled together collection of noisy indirect sources addressing the same basic underlying information?
  1. Italicized nulls identify conceptual hypotheses that, to a material degree at least, address estimated parameters as opposed to individualistic hypotheses addressing the complete absence of subjects within a population who exhibit the proposed relation, behavior, response, difference, characteristic, etc.

Table A2 supplements the Table A1 material. For selected Table A1 articles it provides statistical significance information from article-provided tests of identified nil hypotheses. It also advances alternative, non-nil, conceptual null hypotheses as viable replacement candidates for the tested nil hypotheses. These replacement nulls are long on generality and short on specifics in the sense that they do not identify numerical point values to delineate between the consequential and the inconsequential. The latter identifications, however, go beyond my limited abilities. I have no clear idea what level of response to taxation is minimally meaningful, what the materiality threshold is for distraction, or how much (performance) identity is too much for a loan officer. I do know I would like to hear the authors’ thoughts on such matters. I am not at all interested in “no look” statistical significance ‘earth isn’t flat’ declarations. That said, I admit that one setting in which statistical significance outcomes are possibly useful is when they come in borderline. A p-value that just clears the 0.05 threshold (e.g., a t-statistic or Z-statistic in the neighborhood of 2) necessarily implies that a test of a null hypothesis that the underlying effect value is barely different from zero would not produce a rejection outcome. It is implausible that analyses reporting such borderline outcomes will successfully reject whatever inconsequentiality null their authors can come up with. Such failures, of course, do not mean that the underlying effect is truly inconsequential. They simply mean that the evidence was inadequate to the inferential task facing it. The relevant solution will then be the gathering of more/better evidence (see, in particular, the conclusion of Cready et al., 2022).
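The borderline-outcome point can be illustrated with hypothetical numbers: a t-statistic of 2.00 just clears the nil at the usual 1.96 critical value, but shifting the null even slightly away from zero eliminates the rejection.

```python
def rejects(estimate, se, delta, crit=1.96):
    """Does the evidence reject H0: theta <= delta at the usual
    1.96 critical value? (delta = 0 is the nil.)"""
    return (estimate - delta) / se > crit

# Hypothetical borderline result: estimate 0.020, se 0.010, t = 2.00.
est, se = 0.020, 0.010
for delta in [0.0, 0.001, 0.005, 0.010]:
    t = (est - delta) / se
    print(f"H0: theta <= {delta:.3f}: t = {t:.2f}, reject = {rejects(est, se, delta)}")
```

Only the delta = 0 null is rejected; every shifted (inconsequentiality-style) null survives, which is the sense in which borderline significance forecloses any consequentiality claim.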

Table A2:

The path not taken: some non-flat-earth conjectures that could have been nulls.

Article Implied “flat earth” conceptual nulls p-value/Z/t “Evidence against” worthy nulls
Anantharaman & Henderson (2021) Going concern based valuation of pension obligations explains exactly the same amount of variation in debt values as settlement based valuation of such obligations. Z = 1.79 Going concern based valuation of pension obligations explains an inconsequentially greater amount of debt valuation variation than that explained by settlement based valuation.
Armstrong et al. (2021) CEOs never purchase firm shares in order to (successfully) prolong their tenure. t = −4.05 On average, CEOs rarely purchase substantial numbers of firm shares in order to prolong their tenure.
Beardsley et al. (2021) Office level NAS (apart from client-specific NAS) never distracts from the quality of its audits. Z = 2.53 Office level NAS, on average, only inconsequentially distracts from high quality audit production.
Li et al. (2021) State addback taxation of innovation does not adversely impact innovation activity. t = −2.15 Current state level addback taxation levels are a minimal deterrent to firm innovative activities.
Kim & Valentine (2021) No firm’s innovative activity level is impacted by its own or its rivals’ patent disclosure levels. t = 4.94, t = 5.01 AIPA mandated patent disclosures did not have a consequential impact on firm innovative activity levels.
Bernard et al. (2021) Initial entrant firms never rely upon existent firm leverage ratios in making financing choices. t = 2.69 Existing firm leverage choices, on average, are a minor determinant of initial entrant firm financing choices.
Herpfer (2021) In making syndicated loan lending decisions bankers are lens model + random variation robots. p < 0.0001 Client- and banker-specific characteristics, on average, provide little explanatory insight regarding bank lending decisions.
Bushman et al. (2021) In making lending decisions loan officers are lens model + random variation robots. p < 0.0001 In general, loan officer specific factors are an inconsequential determinant of loan decisions and outcomes.
Kepler (2021) In strategic alliance settings, private information channels never substitute for public information channels. t = −4.22 In strategic alliance settings, private information channels, on average, only minimally substitute for public information channels.
Carter et al. (2021) Supplier industry competition never affects pay for performance choices. t = 4.638 Supplier industry competition, on average, has little impact on pay for performance choices.
Dey & White (2021) Human capital protection never affects any firm’s takeover provision choices. t = 2.23–2.95 On average, human capital protection is of inconsequential relevance to firm takeover provision choices.
Heese & Perez-Cavazos (2021) Retaliation costs never deter any employee from engaging in whistleblowing. t = 3.92 Retaliation costs, on average, are not a consequential deterrent to whistleblowing.
Dong & Young (2021) Firm earnings announcements never convey information (i.e., information transfer) about firms in other countries. t = 7.34–7.89 Firm earnings announcement information transfer effects do not, on average, provide economically significant amounts of information about firms in other countries.
Within country within industry information transfer is less than or equal to the sum of without the country industry information and non-same industry (macro) information transfer effects. p = 0.003 (t ∼ 2.75) Firm domestic earnings announcement information transfer effects do not, on average, materially exceed the sum of their without the country same industry and non-same industry information transfer effects.

References

Amrhein, A. S., Greenland, & McShane, B. (2019). Scientist rise up against statistical significance. Nature, 567, 305–307. https://doi.org/10.1038/d41586-019-00857-9.Suche in Google Scholar

Anantharaman, D., & Henderson, D. (2021). Contrasting the information demands of equity- and debt-holders; Evidence from pension liabilities. Journal of Accounting and Economics, 71(2/3), 1–21. https://doi.org/10.1016/j.jacceco.2020.101366.Suche in Google Scholar

Armstrong, C., Blackburne, T., & Quinn, P. (2021). Are CEOs’ purchases more profitable than they appear? Journal of Accounting and Economics, 71(2/3), 1–22. https://doi.org/10.1016/j.jacceco.2020.101378.Suche in Google Scholar

Balakrishnan, K., Shivakumar, L., & Taori, P. (2021). Analysts’ estimates of the cost of equity capital. Journal of Accounting and Economics, 71(2/3), 1–27. https://doi.org/10.1016/j.jacceco.2020.101367.Suche in Google Scholar

Ball, R. (2013). Accounting informs investors and earnings management is rife: Two questionable beliefs. Accounting Horizons, 27(4), 847–853. https://doi.org/10.2308/acch-10366.Suche in Google Scholar

Beardsley, E., Imdieke, A., & Omer, T. (2021). The distraction effect of non-audit services on audit quality. Journal of Accounting and Economics, 71(2/3), 1–20. https://doi.org/10.1016/j.jacceco.2020.101380.Suche in Google Scholar

Berkson, J. (1938). Some difficulties of interpretation encountered in the application of the chi- square test. Journal of the American Statistical Association, 33, 526–542. https://doi.org/10.1080/01621459.1938.10502329.Suche in Google Scholar

Bernard, D., Kaya, D., & Wertz, J. (2021). Entry and capital structure mimicking in concentrated markets: The role of incumbents’ financial disclosures. Journal of Accounting and Economics, 71(2/3), 1–19. https://doi.org/10.1016/j.jacceco.2020.101379.Suche in Google Scholar

Bushman, R., Gao, J., Martin, X., & Pacelli, J. (2021). The influence of loan officers on loan contract design and performance. Journal of Accounting and Economics, 71(2/3), 1–25. https://doi.org/10.1016/j.jacceco.2020.101384.Suche in Google Scholar

Carter, M., Choi, J., & Sedatole, K. (2021). The effect of supplier industry concentration on pay-for-performance incentive intensity. Journal of Accounting and Economics, 71(2/3), 1–24. https://doi.org/10.1016/j.jacceco.2021.101389.Suche in Google Scholar

Cohen, J. (1994). The earth is round (p < 0.05). American Psychologist, 49(2), 997–103. https://doi.org/10.1037/0003-066x.49.12.997.Suche in Google Scholar

Cready, W., He, J., Liu, W., Shao, C., Wang, D., & Zhang, Y. (2022). Is there a confidence interval for that? A critical examination of null outcome reporting in accounting research. Behavioral Research in Accounting, 34(1), 43–72. https://doi.org/10.2308/bria-2020-033.Suche in Google Scholar

Cready, W., Hurtt, D., & Seida, J. (2000). Applying reverse regression techniques in earnings- return analyses. Journal of Accounting and Economics, 30(2), 227–240. https://doi.org/10.1016/s0165-4101(01)00006-4.Suche in Google Scholar

Cumming, G. (2012). Understanding the new statistics: Effect sizes, confidence intervals, and meta-analysis. Routledge.10.4324/9780203807002Suche in Google Scholar

De Long, J., & Lang, K. (1992). Are all economic hypotheses false? Journal of Political Economy, 100(6), 1257–1272. https://doi.org/10.1086/261860.Suche in Google Scholar

Dey, A., & White, J. (2021). Labor mobility and antitakeover provisions. Journal of Accounting and Economics, 71(2/3), 1–24. https://doi.org/10.1016/j.jacceco.2021.101388.Suche in Google Scholar

Dodgson, M., Agoglia, C., Bennett, G., & Cohen, J. (2020). Managing the auditor-client relationship through partner rotations: The experiences of audit firm partners. The Accounting Review, 95(2), 89–111. https://doi.org/10.2308/accr-52556.Suche in Google Scholar

Dong, Yashu, & Young, Danqing (2021). Foreign macroeconomic and industry-related information transfers around earnings announcements: Evidence from U.S.-listed non-U.S. firms. Journal of Accounting and Economics, 71(2/3), 1–23. https://doi.org/10.1016/j.jacceco.2021.1014000165-4101.Suche in Google Scholar

Dyckman, T. (2016). Significance testing: We can do better. Abacus, 52(2), 319–342. https://doi.org/10.1111/abac.12078.Suche in Google Scholar

Federer, W. (2021). https://myemail.constantcontact.com/Columbus–Miscalculation–How-Far-Around-is-the-Round-Earth-.html?soid=1108762609255&aid=eIzMBrYZGTQ.Suche in Google Scholar

Goodman, W., Sprull, S., & Komaroff, E. (2019). A proposed hybrid effect size plus p-value criterion: Empirical evidence supporting its use. The American Statistician, 73(51), 168–185. https://doi.org/10.1080/00031305.2018.1564697.Suche in Google Scholar

Greenland, S. (2018). The unconditional information in P-values, and its reputational interpretation via S-values. Working paper.Suche in Google Scholar

Greenland, S. (2019). Valid P-values behave exactly as they should: Some mislieading criticisms of P-values and their resolution with S-values. The American Statistician, 73(supl), 106–114. https://doi.org/10.1080/00031305.2018.1529625.Suche in Google Scholar

Heese, J., & Perez-Cavazos, G. (2021). The effect of retaliation costs on employee whistleblowing. Journal of Accounting and Economics, 71(2/3), 1–19. https://doi.org/10.1016/j.jacceco.2020.101385.

Herpfer, C. (2021). The role of bankers in the U.S. syndicated loan market. Journal of Accounting and Economics, 71(2/3), 1–26. https://doi.org/10.1016/j.jacceco.2020.101383.

Ioannidis, J. (2005a). Why most published research findings are false. PLoS Medicine, 2(8), e124. https://doi.org/10.1371/journal.pmed.0020124.

Ioannidis, J. (2005b). Contradicted and initially stronger effects in highly cited clinical research. Journal of the American Medical Association, 294(2), 218–228. https://doi.org/10.1001/jama.294.2.218.

Kepler, J. (2021). Private communication among competitors and public disclosure. Journal of Accounting and Economics, 71(2/3), 1–24. https://doi.org/10.1016/j.jacceco.2021.101387.

Kim, J., & Valentine, K. (2021). The innovation consequences of mandatory patent disclosures. Journal of Accounting and Economics, 71(2/3), 1–22. https://doi.org/10.1016/j.jacceco.2020.101381.

Klepper, S., & Leamer, E. (1984). Consistent sets of estimates for regressions with errors in all variables. Econometrica, 52(1), 163–184. https://doi.org/10.2307/1911466.

Leamer, E. (1978). Specification searches: Ad hoc inference with nonexperimental data. Wiley.

Li, Q., Ma, M., & Shevlin, T. (2021). The effect of tax avoidance crackdown on corporate innovation. Journal of Accounting and Economics, 71(2/3), 1–26. https://doi.org/10.1016/j.jacceco.2020.101382.

Ohlson, J. (2015). Accounting research and common sense. Abacus, 51(4), 525–535. https://doi.org/10.1111/abac.12059.

Ohlson, J. (2021). Empirical accounting research: Elephants in the room. Working paper.

Stone, D. (2018). The “New Statistics” and nullifying the null: Twelve actions for improving quantitative accounting research quality and integrity. Accounting Horizons, 32(1), 105–120. https://doi.org/10.2308/acch-51949.

Wallis, H. (1992). What Columbus knew. History Today, 42, 17–23.

Wasserstein, R. L., & Lazar, N. A. (2016). The ASA’s statement on P-values: Context, process, and purpose. The American Statistician, 70(2), 129–133. https://doi.org/10.1080/00031305.2016.1154108.

Published Online: 2022-09-19

© 2022 CONVIVIUM, association loi de 1901
