Make a summary and PowerPoint
A. Prepare a 2-3 page summary of each paper (including the Libby box summary). Focus on summarizing the most salient points of the article.
B. Make PowerPoint slides for each study.
Three articles in the attachments:
Gow, I., Larcker, D., and Reiss, P. (2016). Causal inference in accounting research. Journal of Accounting Research 54 (2): 477–523.
Lennox, C., Francis, J., and Wang, Z. (2012). Selection models in accounting research. The Accounting Review 87 (2): 589–616.
DOI: 10.1111/1475-679X.12116
Journal of Accounting Research
Vol. 54 No. 2 May 2016
Printed in U.S.A.

Causal Inference in Accounting Research

IAN D. GOW,∗ DAVID F. LARCKER,† AND PETER C. REISS†
ABSTRACT
This paper examines the approaches accounting researchers adopt to draw
causal inferences using observational (or nonexperimental) data. The vast
majority of accounting research papers draw causal inferences notwithstanding the well-known difficulties in doing so. While some recent papers seek to
use quasi-experimental methods to improve causal inferences, these methods
also make strong assumptions that are not always fully appreciated. We believe that accounting research would benefit from more in-depth descriptive
research, including a greater focus on the study of causal mechanisms (or
causal pathways) and increased emphasis on the structural modeling of the
phenomena of interest. We argue these changes offer a practical path forward
for rigorous accounting research.
JEL codes: C18; C190; C51; M40; M41
Keywords: Causal inference; accounting research; quasi-experimental
methods; structural modeling
1. Introduction
There is perhaps no more controversial practice in social and biomedical research than drawing inferences from observational data. Despite . . . problems, observational data are widely available in many scientific fields and are routinely used to draw inferences about the causal impact of interventions. The key issue, therefore, is not whether such studies should be done, but how they may be done well. (Berk [1999, p. 95])

∗ Harvard Business School; † Rock Center for Corporate Governance, Stanford Graduate School of Business.
Accepted by Philip Berger. We are grateful to our discussants, Christian Hansen and Miguel Minutti-Meza, and participants at the 2015 JAR Conference for helpful feedback. We also thank seminar participants at London Business School, Karthik Balakrishnan, Robert Kaplan, Christian Leuz, Alexander Ljungqvist, Eugene Soltes, Daniel Taylor, Robert Verrecchia, Charles Wang, and Anastasia Zakolyukina for comments.
Copyright © 2016, University of Chicago on behalf of the Accounting Research Center.
Most empirical research in accounting relies on observational (or nonexperimental) data. This paper evaluates the different approaches accounting researchers adopt to draw causal inferences from observational data.1
Our discussion draws on developments in fields such as statistics, econometrics, and epidemiology. The goal of this paper is to identify areas for
improvement and suggest how empirical accounting research can improve
inferences drawn from observational data.
The importance of causal inference in accounting research is clear from
the research questions that accounting researchers seek to answer. Most
long-standing questions in accounting research are causal: Does conservatism affect the terms of loan contracts? Do higher quality earnings reports
lead to lower information asymmetry? Did International Financial Reporting Standards cause an increase in liquidity in the jurisdictions that adopted
them? Do managerial incentives lead to managerial misstatements in financial reports? Accounting researchers' focus on causal inference
is consistent with the view that “the most interesting research in social science is about questions of cause and effect” (Angrist and Pischke [2008,
p. 3]). Simply documenting descriptive correlations provides little basis for
understanding what would happen should circumstances change, whereas
using data to make inferences that support or refute broader theories could
facilitate these kinds of predictions.
To provide insights into what is actually done in empirical accounting
research, we examined all papers published in three leading accounting
journals in 2014. While accounting researchers are aware of problems that
can arise from the use of observational data to draw causal inferences, we
found that most papers still seek to draw such inferences. Making causal
inferences requires strong assumptions about the causal relations among
variables. For example, estimating the causal effect of X on Y requires
that the researcher has controlled for variables that could confound estimates of such effects. Section 2 provides an overview of causal inference
using causal diagrams as a framework for thinking about the subtle issues
involved. We believe that these diagrams are also very useful for communicating the cause-and-effect logic underlying regression analyses that use observational data. Nonetheless, difficulties identifying, measuring, and controlling for all possible confounding variables have led many to question
causal inferences drawn from observational data.
1 Thus, our focus is on what Bloomfield, Nelson, and Soltes [2016] call "archival studies." Floyd and List [2016] discuss opportunities for researchers to use experiments in accounting research.

Recently, some social scientists have held out hope that better research designs and statistical methods can increase the credibility of causal inferences. For example, Angrist and Pischke [2010] suggest that "empirical microeconomics has experienced a credibility revolution, with a consequent increase in policy relevance and scientific impact." Angrist and
Pischke [2010, p. 26] argue that such “improvement has come mostly
from better research designs, either by virtue of outright experimentation or through the well-founded and careful implementation of quasi-experimental methods." Our survey of research published in 2014 finds 5
studies claiming to study natural experiments (or “exogenous shocks”) and
10 studies using instrumental variables (IVs). Although these numbers suggest that quasi-experimental methods are infrequently used in accounting
research, we believe their use will increase in the future.2
Section 3 evaluates the use of quasi-experimental methods in accounting research. Quasi-experimental methods produce credible estimates of
causal effects only under very strong maintained assumptions about the
model and data. For example, variations in treatments are rarely random,
the list of controls rarely exhaustive, and instruments do not always satisfy the necessary inclusion and exclusion restrictions. We explain some of
these concerns using causal diagrams. In general, it appears that the assumptions required to apply quasi-experimental methods are unlikely to
be satisfied by observational data in most empirical accounting research
settings.
Ultimately, we believe that accounting research needs to recognize the
stringent assumptions that need to be maintained to apply statistical methods to derive estimates of causal effects for observational data. Statistical
methods alone cannot solve the inference issues that arise in observational
data. The second part of the paper (sections 4 and 5) identifies approaches
that can provide a plausible framework for guiding future accounting
research. Specifically:
• There should be an increased emphasis on the study of causal mechanisms, that is, the "pathways" through which claimed causal effects are
propagated. We believe that evidence on the actions and beliefs of individuals and institutions can bolster causal claims based on associations,
even absent compelling estimates of the causal effects. We also suggest
that more careful modeling of phenomena, using structural modeling
or causal diagrams, can help to identify plausible mechanisms that warrant further study.
• Causal diagrams are a useful tool for conveying the key elements of
a structural model and can also act as a middle-level stand-in when
structural modeling of a phenomenon is infeasible.3
2 We use the term “quasi-experimental” methods to refer to those methods that have a plausible claim to “as if” random assignment to treatment conditions. The term “as if” is used by
Dunning [2012] to acknowledge the fact that assignment is not random in such settings, but
is claimed to be as if random assignment had occurred.
3 “Middle-level” here refers to the placement of causal diagrams between relatively informal
verbal reasoning and the rigors of a structural model.
• There should be an increased use of structural modeling methods.
Structural models provide a more complete characterization of the behavior and institutions that underlie a phenomenon of interest. We
acknowledge that while structural models need not be a correct characterization, they have the advantage of making what is assumed explicit. This gives other researchers a rigorous way to assess the model
and understand what would happen if features of the model change.
• There are many important questions in accounting that have not yet
been addressed by formal models. In these settings, it is important to
conduct sophisticated descriptive research aimed at understanding the
phenomena of interest so as to develop clearer cause-and-effect models. In our view, many hypotheses that are tested with observational
data are only loosely tied to the accounting institutions and business
phenomena of interest. We hope that a larger number of richer descriptive studies will provide insights that the theorists can use to build
models that empiricists can actually “take to data.”
2. Causal Inference: An Overview
2.1. CAUSAL INFERENCE IN ACCOUNTING RESEARCH
To get a sense for the importance of causal questions in accounting research, we examined all papers published in 2014 in the Journal of Accounting Research, The Accounting Review, and the Journal of Accounting and Economics. We counted 139 papers, of which 125 are original research papers.
Another 14 papers survey or discuss other papers. We classify each of the
125 research papers into one of the following four categories: “Theoretical” (7), “Experimental” (12), “Field” (3), or “Archival Data” (103). For
our next discussion, we collect the field and archival data papers into a
single “Observational” category.
For each nontheoretical paper, we determine whether the primary or
secondary research questions are “causal.” Often the title reveals a causal
question, with words such as "effect of . . ." or "impact of . . ." (e.g., Clor-Proell and Maines [2014], Cohen et al. [2014]). In other cases, the abstracts
reveal that authors have causal inferences as a goal. For example, de Franco
et al. [2014] inquires “how the tone of sell-side debt analysts’ discussions
about debt-equity conflict events affects the informativeness of debt analysts’
reports in debt markets.”
We recognize that some authors might disagree with our characterizations. For example, a researcher might argue that a paper that claimed that
“theory predicts X is associated Y and, consistent with that theory, we show
X is associated with Y ” is merely a descriptive paper that does not make
causal inferences. However, theories are invariably causal in that they posit
how exogenous variation in certain variables leads to changes in other variables. Further, by stating that "consistent with . . . theory, X is associated with
Y ,” the clear purpose is to argue that the evidence tilts the scale, however
slightly, in the direction of believing the theory is a valid description of the
real world: In other words, a causal inference is drawn.4
Of the 106 original papers using observational data, we coded 91 as
seeking to draw causal inferences.5 Of the remaining empirical papers,
we coded seven papers as having a goal of “description” (including two
of the three field papers). For example, Soltes [2014] uses data collected
from one firm to describe analysts’ private interactions with management.
Understanding how these interactions take place is key to understanding
whether and how they transmit information to the market. We coded five
papers as having a goal of “prediction.” For example, Czerney, Schmidt,
and Thompson [2014] examine whether the inclusion of “explanatory language” in unqualified audit reports can be used to predict the detection
of financial misstatements in the future. We coded three papers as having
a goal of “measurement.” For example, Cready, Kumas, and Subasi [2014]
examine whether inferences about traders based on trade size are reliable
and suggest improvements for the measurement of variables used by accounting researchers.
In summary, we find that most original research papers use observational
data and about 90% of these papers seek to draw causal inferences. The
most common estimation methods used in these studies include ordinary
least-squares (OLS) regression, difference-in-differences (DD) estimates,
and propensity-score matching (PSM). While it is widely understood that
OLS regressions that use observational data produce unbiased estimates of
causal effects only under very strong assumptions, the credibility of these
assumptions is rarely explicitly addressed.6
2.2. CAUSAL INFERENCE: A BRIEF OVERVIEW OF RECENT DEVELOPMENTS
In recent decades, the definition and logic of causality has been revisited by researchers in fields as diverse as epidemiology, sociology, statistics,
and computer science. Rubin [1974] and Holland [1986] formalized ideas from the potential-outcome framework of Neyman [1923], leading to the so-called "Rubin causal model." Other fields have used path analysis, as initially studied by geneticist Sewall Wright (Wright [1921]), as an
organizing framework. In economics and econometrics, early proponents
of structural models were quite clear about how causal statements must be
tied to theoretical economic models. As discussed by Heckman and Pinto [2015], Haavelmo [1943, p. 4] promoted structural models "based on a system of structural equations that define causal relationships among a set of variables." Goldberger [1972, p. 979] promoted a similar notion: "By structural equation models, I refer to stochastic models in which each equation represents a causal link, rather than a mere empirical association . . . Generally speaking the structural parameters do not coincide with coefficients of regressions among observable variables, but the model does impose constraints on those regression coefficients." Goldberger [1972] focuses on linking such approaches to the path analysis of Wright.

4 Papers that seek to estimate a causal effect of X on Y are a subset of papers we classify as causal. A paper that argues that Z is a common cause of X and Y and claims to find evidence of this is still making causal inferences (i.e., Z causes X and Z causes Y). However, we do not find this kind of reasoning to be common in our survey.
5 While we exclude research papers using experimental methods, all these papers also seek to draw causal inferences.
6 There are settings where DD and fixed-effect estimators may deliver causal estimates. For example, if assignment to treatment is random, then it is possible for a DD estimate using pre- and posttreatment data to yield unbiased estimates of causal effects. But, in this case, it is the detailed understanding of the research setting, not the method per se, that makes these estimates credible.
An important point worth emphasizing is that model-based causal reasoning is distinct from statistical reasoning. Suppose we observe data on
x and y and make the strong assumption that we know causality is one-way.
How do we distinguish between whether X causes Y or Y causes X ? Statistics can help us determine whether X and Y are correlated, but correlations
do not establish causality. Only with assumptions about causal relations between X , Y , and other variables (i.e., a theory) can we infer causality. While
theories may be informed by evidence (e.g., prior research may suggest a
given theory is more or less plausible), they also encode our understanding
of causal mechanisms (e.g., barometers do not cause rain).
Computer and decision scientists, as well as researchers in other disciplines, have recently sought to develop an analytical framework for thinking about causal models and their connection to probability statements
(e.g., Pearl [2009a]). Pearl’s framework, which he calls the structural causal
model, uses causal diagrams to describe causal relationships. These diagrams encode causal assumptions and visually communicate how a causal
inference is being drawn from a given research design. Pearl's framework also supplies graphical criteria for deciding what to condition on; given a correctly specified causal diagram, these criteria can be used to verify conditioning strategies, IV designs, and mechanism-based causal inferences.7
We use figure 1 to illustrate the basic ideas of causal diagrams and how
they can be used to facilitate causal inference. Figure 1 depicts three variants of a simple causal graph. Each graph depicts potential relationships
among the three (observable) variables. In each case, we are interested in
understanding how the presence of a variable Z impacts the estimation of
the causal effect of X on Y . The only difference between the three graphs
is the direction of the arrows linking either X and Z , or Y and Z . The
boxes (or “nodes”) represent random variables and the arrows (or “edges”)
connecting boxes represent hypothesized causal relations, with each arrow
pointing from a cause to a variable assumed to be affected by it.
Pearl [2009b] shows that, if we are interested in assessing the causal effect of X on Y, we may be able to do so by conditioning on a set of variables, Z, that satisfies certain criteria. These criteria imply that very different conditioning strategies are needed for each of the causal diagrams (see the appendix for a more formal discussion).

7 While Pearl [2009a, p. 248] defines an instrument in terms of causal diagrams, additional assumptions (e.g., linearity) are often needed to estimate causal effects using an instrument (Angrist, Imbens, and Rubin [1996]).

FIG. 1.—Three basic causal diagrams. (A) Z is a confounder, (B) Z is a mediator, and (C) Z is a collider. [Each panel links a treatment variable (X), an outcome variable (Y), and a "control" (Z).]
While conditioning on variables is much like the standard notion of “controlling for” such variables in a regression, there are critical differences.
First, conditioning means estimating effects for each distinct level of the
set of variables in Z . This nonparametric concept of conditioning on Z is
more demanding than simply including Z as another regressor in a linear
regression model.8 Second, the inclusion of a variable in Z may not be an
appropriate conditioning strategy. Indeed, it can be that the inclusion of Z
results in biased estimates of causal effects.
8 Including variables in a linear regression framework "controls for" them only under strict assumptions, such as linearity in the relations between X, Y, and Z.

Each of the three graphs in figure 1 provides an alternative view of the causal effect of X on Y. Figure 1(A) is straightforward. It shows that we need to condition on Z in order to estimate the causal effect of X on Y. Note that
the notion of “condition on” again is more general than just including Z in
a parametric (linear) model.9 The need to condition on Z arises because Z
is what is known as a confounder.
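To make the confounder case concrete, here is a minimal simulation sketch (ours, not the paper's; the coefficients are arbitrary). Omitting Z biases the OLS coefficient on X, while conditioning on Z recovers the true effect:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
Z = rng.normal(size=n)                        # confounder: causes both X and Y
X = 0.7 * Z + rng.normal(size=n)
Y = 0.5 * X + 0.7 * Z + rng.normal(size=n)    # true causal effect of X on Y is 0.5

ols = lambda A, y: np.linalg.lstsq(A, y, rcond=None)[0]
ones = np.ones(n)
print(ols(np.column_stack([ones, X]), Y)[1])     # omit Z: biased upward (about 0.83 here)
print(ols(np.column_stack([ones, X, Z]), Y)[1])  # condition on Z: about 0.5
```

In a linear setting like this, including Z as a regressor suffices; as noted above, the general notion of conditioning is more demanding.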
Figure 1(B) is a bit different. Here, Z is a mediator of the effect of X on Y. No conditioning is required in this setting to estimate the total effect of X on Y. If we condition on both X and Z, then we obtain a different estimate, one that excludes the indirect effect of X on Y that runs through Z.
Finally, in figure 1(C), we have Z acting as what is referred to as a “collider” variable (Glymour and Greenland [2008], Pearl [2009a]).10 Again,
not only do we not need to condition on Z, but we should not condition on Z to get an estimate of the total effect of X on Y. While in epidemiology "collider bias . . . can be just as severe as confounding" (Glymour and Greenland [2008, p. 186]), collider bias appears to receive less attention in accounting research than confounding. Many intuitive examples of
collider bias involve selection or stratification. Admission to a college could
be a function of combined test scores (T ) and interview performance (I )
exceeding a threshold, that is, T + I ≥ C. Even if T and I are unrelated unconditionally, a regression of T on I conditioned on admission to college
is likely to show a negative relation between these two variables.
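This admissions example is easy to verify by simulation (a minimal sketch under the stated assumptions; the threshold C = 1 is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
T = rng.normal(size=n)            # test score
I = rng.normal(size=n)            # interview performance, independent of T
admitted = (T + I) >= 1.0         # admission conditions on the collider T + I

print(np.corrcoef(T, I)[0, 1])                      # ~0: unrelated unconditionally
print(np.corrcoef(T[admitted], I[admitted])[0, 1])  # clearly negative among admits
```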
2.3. CAUSAL DIAGRAMS: APPLICATIONS IN ACCOUNTING
A typical paper in accounting research will include many variables to
“control for” the potential confounding of causal effects. While many of
these variables should be considered confounders, less attention is given
to explaining why it is reasonable to assume that they are not mediators or
colliders. Such a discussion is important because the inclusion of “controls”
that are mediators or colliders will generally lead to bias.
One paper that does discuss this distinction is Larcker, Richardson, and Tuna [2007], who use a multiple regression (or logistic) model of the form11

$Y = \alpha + \sum_{r=1}^{R} \gamma_r Z_r + \sum_{s=1}^{S} \beta_s X_s + \epsilon. \quad (1)$

Larcker, Richardson, and Tuna [2007, p. 983] suggest that:

One important feature in the structure of Equation 1 is that the governance factors [X] are assumed to have no impact on the controls (and thus no indirect impact on the dependent variable). As a result, this structure may result in conservative estimates for the impact of governance on the dependent variable. Another approach is to only include governance factors as independent variables, or:

$Y = \alpha + \sum_{s=1}^{S} \beta_s X_s + \epsilon. \quad (2)$

The structure in Equation 2 would be appropriate if governance impacts the control variables and both the governance and control variables impact the dependent variable (i.e., the estimated regression coefficients for the governance variables will capture the total effect or the sum of the direct effect and the indirect effect through the controls).

9 Inclusion of Z blocks the "back-door" path from Y to X via Z.
10 The two arrows from X and Y "collide" in Z.
11 We alter the mathematical notation of Larcker, Richardson, and Tuna [2007] to conform to the notation we use here.
But there are some subtle issues here. If some elements of Z are mediators and others are confounders, then both equations will be subject to bias.
Equation (2) will be biased due to omission of confounders, while equation
(1) will be biased due to inclusion of mediating variables. Additionally, the
claim that the estimates are “conservative” is only correct if the indirect effect via mediators is of the same sign as the direct (i.e., unmediated) effect.
If this is not the case, then the relation between the magnitude (and even
the sign) of the direct effect and the indirect effect is unclear.
Additionally, this discussion does not allow for the possibility of colliders.
For example, governance plausibly affects leverage choices, while performance is also likely to affect leverage. If so, “controlling for” leverage might
induce associations between governance and performance even absent a
true relation between these variables.12 While the with-and-without-controls
approach used by Larcker, Richardson, and Tuna [2007] has intuitive appeal, a more robust approach requires careful thinking about the plausible
causal relations between the treatment variables, outcomes of interest, and
candidate control variables.
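The point about mixed controls can also be illustrated with a small simulation sketch (ours; coefficients arbitrary) in which one "control" is a confounder and another is a mediator, so that the estimates from both equation (1) and equation (2) miss the total effect:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
Zc = rng.normal(size=n)                       # confounder: causes X and Y
X = 0.6 * Zc + rng.normal(size=n)             # treatment (e.g., a governance factor)
Zm = 0.5 * X + rng.normal(size=n)             # mediator: caused by X, causes Y
Y = 0.3 * X + 0.4 * Zm + 0.5 * Zc + rng.normal(size=n)
# Total effect of X on Y = 0.3 (direct) + 0.5 * 0.4 (via Zm) = 0.5

ols = lambda A, y: np.linalg.lstsq(A, y, rcond=None)[0]
ones = np.ones(n)
# Equation (1): include both "controls" -> direct effect only (~0.3)
print(ols(np.column_stack([ones, X, Zm, Zc]), Y)[1])
# Equation (2): exclude both -> total effect plus confounder bias (~0.72)
print(ols(np.column_stack([ones, X]), Y)[1])
```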
3. Quasi-Experimental Methods in Accounting Research
While most studies in accounting use regression or matching methods
to condition out confounding variables, a number of studies use quasi-experimental methods that rely on "as if" random assignment to identify
causal effects (Dunning [2012]). Of the 91 papers in our 2014 survey seeking to draw a causal inference from observational data, we classify 14 as
relying on quasi-experimental methods. Despite the low count, we believe
that papers using these methods are considered stronger research contributions, and there seems to be an increasing trend toward the use of quasi-experimental methods. Additionally, a number of papers use methods such
as DD or fixed-effect estimators, which are widely believed to approximate
quasi-experimental methods. This section discusses and evaluates the usefulness of these methods for accounting research.
12 Note that Larcker, Richardson, and Tuna [2007] do not in fact use leverage as a control
when performance is a dependent variable.
3.1. NATURAL EXPERIMENTS
Natural experiments occur when observations are assigned by nature (or
some other force outside the control of the researcher) to treatment and
control groups in a way that is random or “as if” random (Dunning [2012]).
Truly random assignment to treatment and control provides a sound basis
for causal inference, enhancing the appeal of natural experiments for social science research. However, Dunning [2012, p. 3, emphasis added] argues that
this appeal “may provoke conceptual stretching, in which an attractive label is
applied to research designs that only implausibly meet the definitional features of the method.”
Our survey of accounting research in 2014 identified five papers that
exploited either a “natural experiment” or an “exogenous shock” to identify
causal effects.13 An examination of these papers reveals how difficult it is to
find a plausible natural experiment in observational data.
An important difficulty is that most “exogenous shocks” (e.g., Securities
and Exchange Commission (SEC) regulatory changes or court rulings) do
not randomly assign units to treatment and control groups and thus do
not qualify as natural experiments. For example, an early version of Dodd–
Frank contained a provision that would force companies to remove a staggered board structure.14 It is tempting to use this event to assess the valuation consequences of having a staggered board by looking at excess returns for firms with and without a staggered board around the announcement of this Dodd–Frank provision. Although potentially interesting, the
Dodd–Frank “natural experiment” does not randomly assign firms to treatment and control groups. Instead, firms made an endogenous choice about
whether to have a staggered board, and the regulation is potentially forcing firms to change that choice. But, firms might have a variety of margins
through which they can respond to such a requirement, some of which may
have valuation consequences of their own.15 Absent an account of these
margins, an event study that includes a staggered board treatment variable
does not isolate the (pure) effect of staggered boards on valuations.
Another important concern is that the natural experiment may affect treatment assignments in ways that are correlated with unobserved factors that also affect the outcome of
interest. In general, even claims of random assignment to treatment do
not suffice to deliver unbiased estimates of causal effects. An example of a
drug trial can help underscore these points. Suppose we wish to understand
whether a drug lowers blood pressure. Imagine patients in the trial are
drawn from two hospitals. One hospital is randomly selected as the hospital
in which the drug will be administered. The other hospital's patients serve as controls. Suppose, in addition, that we know the patient populations in both hospitals are similar.

13 These are Lo [2014], Aier, Chen, and Pevzner [2014], Kirk and Vincent [2014], Houston et al. [2014], and Hail, Tahoun, and Wang [2014].
14 See Larcker, Ormazabal, and Taylor [2011].
15 For instance, if forced to remove a staggered board, some firms may put in another antitakeover provision.
Most researchers would argue that we have all the ingredients for a successful treatment effect study. In particular, assignment to treatment is random. Now imagine that patients actually have to take the drug for it to have
an effect. In this case, if there are unobserved reasons why some assigned
to treatment opt out, modify the dosage, or stop taking medications for
which there might be interactions, then being assigned to treatment is not
the same as treatment. To take an extreme example, suppose the drug has a
slight negative effect on blood pressure, everyone in fact takes the drug, but
doctors in the hospital where patients are treated tell patients to stop taking
their regular blood pressure medication. In this case, if regular blood pressure medications lower blood pressure more than the new drug, we might
conclude that the new drug actually raises blood pressure! In sum, even
showing that a treatment is randomly assigned does not guarantee that a
regression will uncover the causal effect of interest.
Finally, it is important to carefully consider the choice of explanatory variables in studies that rely on natural experiments. In particular, researchers
sometimes inadvertently use covariates that are affected by the treatment in
their analysis. As noted by Imbens and Rubin [2015, p. 116], including such
posttreatment variables as covariates can undermine the validity of causal
inferences.16
Extending our survey beyond research published in 2014, we find papers with very plausible natural experiments. One such paper is Michels
[2015], who exploits the difference in disclosure requirements for significant events that occur before financial statements are issued. Because the
timing of these events (e.g., fires and natural disasters) relative to balance
sheet dates is plausibly random, the assignment to the disclosure and recognition conditions is plausibly random. Nevertheless, even in this relatively
straightforward setting, Michels [2015] recognizes the possibility of different materiality criteria for disclosed and recognized events, which could
affect the relation between underlying events and observed disclosures.
Michels’ paper takes care to address this concern.17
Another plausible natural experiment is examined in Li and Zhang
[2015, p. 80], who study a regulatory experiment in which the SEC “mandated temporary suspension of short-sale price tests for a set of randomly
selected pilot stocks.” Li and Zhang [2015, p. 79] conjecture “that managers respond to a positive exogenous shock to short selling pressure . . .by
reducing the precision of bad news forecasts.” But if the treatment affects
the properties of these forecasts, and Li and Zhang [2015, p. 79] sought to condition on such properties, they would risk undermining the "natural experiment" aspect of their setting.

16 See the discussion of mediators above.
17 The setting of Michels [2015] plausibly involves a natural experiment. The endogenous nature of the disclosure and reporting responses by firms to these events, which is what is observable to the researcher, makes drawing causal inferences about the effect of recognition versus disclosure problematic.
When true natural experiments can be found, they are an excellent setting for drawing causal inferences from observational data. Unfortunately,
credible natural experiments are very rare. Certainly researchers should
exploit these natural experiments when they occur (e.g., Li and Zhang
[2015], Michels [2015]), but care also is needed when doing so.
3.2. INSTRUMENTAL VARIABLES
Angrist and Pischke [2008, p. 114] describe IVs as “the most powerful
weapon in the arsenal” of econometric tools. Accounting researchers have
long used IVs to address concerns about endogeneity (Larcker and Rusticus
[2010], Lennox, Francis, and Wang [2012]) and continue to do so. Our
survey of research published in 2014 identifies 10 papers using IVs.18 Much
has been written on the challenges for researchers using IVs as the basis for
causal inference (e.g., Roberts and Whited [2013]), and it is useful to use
this background to evaluate the application of this approach in accounting
research.
3.2.1. Evaluating IVs Requires Careful Theoretical Causal (Not Statistical) Reasoning. With respect to accounting research, Larcker and Rusticus [2010]
lament that “some researchers consider the choice of IVs to be a purely
statistical exercise with little real economic foundation” and call for “accounting researchers . . .to be much more rigorous in selecting and justifying their instrumental variables.” Angrist and Pischke [2008, p. 117] argue
that “good instruments come from a combination of institutional knowledge and ideas about the process determining the variable of interest.” One
study that illustrates this is Angrist [1990]. In that setting, the draft lottery
is well understood as random and the process of mapping from the lottery
to draft eligibility is well understood. Furthermore, there are good reasons
to believe that the draft lottery does not affect anything else directly except
for draft eligibility.19
Note that simply arguing that the only effect of an instrument on the
outcome variable of interest is via the treatment of interest does not suffice
to establish the exclusion restriction. Even if the claim that Z only affects Y
via its effect on X is true, the researcher also needs to argue that variation
in the instrument (Z ) is “as if” random. For example, suppose that the only
effect of Z on Y occurs via X , but Z is a function of a variable W that is also
associated with Y. In this case, IV estimates of the effect of X on Y will be biased. Thus, a researcher should also account for the sources of variation in the chosen instrument and why these are not expected to be associated with variation in the outcome variable.20

18 These are Cannon [2014], Cohen et al. [2014], Kim, Mauldin, and Patro [2014], Vermeer, Edmonds, and Asthana [2014], Fox, Luna, and Schaur [2014], Guedhami, Pittman, and Saffar [2014], Houston et al. [2014], de Franco et al. [2014], Erkens, Subramanyam, and Zhang [2014], and Correia [2014].
19 Though some have questioned the exclusion restriction even in this case, arguing that the outcome of the draft lottery may have caused some, for example, to move to Canada (see Imbens and Rubin [2015]).
Unfortunately, there are few (if any) accounting variables that meet the
requirement that they randomly assign observations to treatments, and do
not affect the outcome of interest outside of effects on the treatment variable. Sometimes researchers turn to lagged values of endogenous variables
or industry averages as instruments, but these too are subject to criticism.21
3.2.2. There Are No Simple (Statistical) Tests for the Validity of Instruments.
Some accounting researchers appear to believe that statistical tests can resolve the question of whether their instrument is “valid.” Indeed, many studies choose to test the validity of their IVs using statistical tests (see Larcker
and Rusticus [2010]). But such tests of instruments are of dubious value.
Consider, for example, the following simulation of a setting where $X$ does not cause $y$, but we nevertheless estimate the regression $y = X\beta + \epsilon$. That is, we estimate a regression model where $\beta = 0$. To make matters interesting, suppose $\rho(X, \epsilon) > 0$ (i.e., $X$ is correlated with the error). Clearly, if we estimated the equation by OLS, we would conclude that there is a (positive) relationship between $X$ and $y$. Suppose that, after being told that $X$ is "endogenous," we found three instruments: $z_1$, $z_2$, and $z_3$. Unbeknown to us, the three instruments were determined as follows: $z_1 = X + \eta_1$, $z_2 = \eta_2$, and $z_3 = \eta_3$, with $\eta_1, \eta_2, \eta_3 \sim N(0, \sigma_\eta^2)$ and independent. That is, $z_1$ is $X$ plus noise (e.g., industry averages or lagged values of $X$ would seem to approximate $z_1$), while $z_2$ and $z_3$ are random noise (many variables could be candidates here).
Assuming that $X$ and $\epsilon$ are bivariate-normally distributed with variance of 1 and $\rho(X, \epsilon) = 0.2$, and $\sigma_\eta = 0.03$, we performed 1,000 IV regression simulations with 1,000 firm-level observations in each case. Both OLS and IV coefficients are close, with the IV-estimated coefficient averaging 0.201. The IV coefficient estimates are statistically significant at the 5% level 100% of the time.22 Based on a test statistic of 30, which easily exceeds the thresholds suggested by Stock, Wright, and Yogo [2002], the null hypothesis of weak instruments is rejected 100% of the time. The Sargan [1958] test of overidentifying restrictions fails to reject a null hypothesis of valid instruments (at the 5% level) 95.7% of the time.
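The following sketch reproduces the flavor of this simulation (it is our reconstruction, not the authors' code; two-stage least squares and the Sargan statistic are computed by hand with numpy):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, n_sims, rho, sigma_eta = 1000, 1000, 0.2, 0.03

betas, sargan_ok = [], []
for _ in range(n_sims):
    # X and eps bivariate normal, unit variances, corr 0.2; true beta = 0, so y = eps
    X, eps = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n).T
    y = eps
    # Spurious instruments: z1 is X plus noise; z2 and z3 are pure noise
    z1 = X + rng.normal(0, sigma_eta, n)
    z2, z3 = rng.normal(0, sigma_eta, n), rng.normal(0, sigma_eta, n)
    Z = np.column_stack([np.ones(n), z1, z2, z3])
    W = np.column_stack([np.ones(n), X])
    # 2SLS: regress y on the projection of [1, X] onto the instrument space
    b = np.linalg.lstsq(Z @ np.linalg.lstsq(Z, W, rcond=None)[0], y, rcond=None)[0]
    betas.append(b[1])
    # Sargan statistic: n * R^2 from regressing the IV residuals on the instruments
    u = y - W @ b
    u_fit = Z @ np.linalg.lstsq(Z, u, rcond=None)[0]
    r2 = 1 - ((u - u_fit) ** 2).sum() / ((u - u.mean()) ** 2).sum()
    sargan_ok.append(n * r2 < stats.chi2.ppf(0.95, df=2))  # df = 3 instruments - 1 regressor

print(np.mean(betas))      # close to rho = 0.2, far from the true beta = 0
print(np.mean(sargan_ok))  # the spurious instruments "pass" most of the time
```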
This example illustrates why it is that no statistical test allows the researcher to verify that their instruments satisfy the exclusion restriction.23
20 In the case of Angrist [1990], this was plausibly satisfied using a lottery for assignment of
Z to subjects.
21 See Reiss and Wolak [2007] for a discussion regarding the implausibility of general claims
that industry averages are valid instruments.
22 Note that this coefficient is close to $\rho(X, \epsilon) = 0.2$, which is to be expected, given how the data were generated.
23 This is a corollary of the “causal reasoning is not statistical reasoning” point made above.
Obviously, causal inferences based on such IVs are completely inappropriate. Yet, as this example shows, it is quite possible for completely spurious instruments to deliver bad inferences while easily passing tests for weak instruments and tests of overidentifying restrictions.
3.2.3. Causal Diagrams Can Clarify Causal Reasoning. To illustrate the
application of causal diagrams to the evaluation of IVs, we consider
Armstrong, Gow, and Larcker [2013]. Armstrong, Gow, and Larcker study
the effect of shareholder voting (Shareholder support_t) on future executive compensation (Comp_{t+1}). Because of the plausible existence of unobserved confounding variables that affect both future compensation and shareholder support, a simple regression of Comp_{t+1} on Shareholder support_t and controls would not allow Armstrong, Gow, and Larcker [2013] to obtain an unbiased or consistent estimate of the causal relation. Among other analyses, Armstrong, Gow, and Larcker [2013] use an IV to estimate the causal relation of interest. Armstrong, Gow, and Larcker [2013] claim that their instrument is valid. Their reasoning is represented graphically in figure 2. By conditioning on Comp_{t−1} and using Institutional Shareholder Services (ISS) recommendations as an instrument, Armstrong, Gow, and Larcker [2013] argue that they can identify a consistent estimate of the causal effect of shareholder voting on Comp_{t+1}, even though there is an unobserved confounder, namely determinants of future compensation observed by shareholders, but not the researcher.24
While the authors note that “validity of this instrument depends on
ISS recommendations not having an influence on future compensation
decisions conditional on shareholder support (i.e., firms listen to their
shareholders, with ISS having only an indirect impact on corporate policies through its influence on shareholders’ voting decisions),” they are
unable to test the assumption (Armstrong, Gow, and Larcker [2013,
p. 912]). Unfortunately, this assumption seems inconsistent with the findings of Gow et al. [2013], who provide evidence that firms calibrate compensation plans (i.e., factors that directly affect Comp_{t+1}) to comply with ISS's policies so as to get a favorable recommendation from ISS. As depicted in figure 2(B), this implies a path from ISS recommendation_t to Comp_{t+1} that does not pass through Shareholder support_t, suggesting that the instrument of Armstrong, Gow, and Larcker [2013, p. 912] is not valid.25
24 In figure 2, we depict the unobservability of this variable (to the researcher) by putting it in a dashed box. Note that we have omitted the controls included by Armstrong, Gow, and Larcker [2013] for simplicity, though a good causal analysis would consider these carefully.
25 Armstrong, Gow, and Larcker [2013] recognize the possibility that the instrument they use is not valid and conduct sensitivity analysis to examine the robustness of their result to violation of the exclusion restriction assumptions. This analysis suggests that their estimate is highly sensitive to violation of this assumption.

FIG. 2.—Identifying effects of shareholder support on compensation. (A) Causal diagram for Armstrong, Gow, and Larcker [2013] and (B) alternative causal diagram for Armstrong, Gow, and Larcker [2013]. [Both panels link Comp_{t−1}, shareholder-observable determinants of compensation_{t+1}, ISS recommendation_t, Shareholder support_t, and Comp_{t+1}; panel (B) adds ISS policy and the design of the proposed compensation plan.]

3.2.4. IVs in Accounting Research: An Evaluation. A review of IV applications in our 2014 survey suggests that accounting researchers have paid
little heed to the suggestions and warnings of Larcker and Rusticus [2010],
Lennox, Francis, and Wang [2012], and Roberts and Whited [2013]. This is
perhaps not surprising, as most studies do not have a theoretical model that
can explain why a variable can naturally be excluded from the equation of
interest but still matter. Thus, while instruments work in theory, in practice
there remains a substantial burden of proof on researchers to justify the assumptions underlying IV estimators.
3.3. REGRESSION DISCONTINUITY DESIGNS
Recently, regression discontinuity (RD) designs have attracted the interest of accounting researchers, as a number of phenomena of interest to
accounting researchers involve discontinuities. For example, whether an
executive compensation plan is approved is a discontinuous function of
shareholder support (e.g., Armstrong, Gow, and Larcker [2013]) and
whether a firm initially had to comply with provisions of the Sarbanes–
Oxley Act was a discontinuous function of market float (Iliev [2010]).
In discussing the recent “flurry of research” using RD designs in other
fields, Lee and Lemieux [2010, p. 282] point out that they "require seemingly mild assumptions compared to those needed for other nonexperimental approaches . . . and that causal inferences from RD designs are
potentially more credible than those from typical ‘natural experiment’
strategies.” While RD designs make relatively mild assumptions, in practice
these assumptions may be violated. In particular, manipulation of the running variable (or the variable that determines whether an observation is assigned to a treatment) may occur and researchers should carefully examine
their data for this possibility (see, e.g., Listokin [2008], McCrary [2008]).
Another issue with RD designs is that the causal effect estimated is a local estimate (i.e., it relates to observations close to the discontinuity). This
effect may be very different from the effect at points away from the discontinuity. For example, in designating a public float of $75 million, the SEC
may have reasoned that at that point the benefits of Sarbanes–Oxley were
approximately equal to the fixed costs of complying with the law. If true,
we would expect to see an estimate of approximately zero effect, even if
there were substantial benefits of the law for shareholders of firms having a
public float well above the threshold.
Another critical choice is the bandwidth used in estimation (i.e., in
effect how much weight is given to observations according to their distance
from the cutoff). We encourage researchers using RD designs to employ
methods that exist to estimate optimal bandwidths and the resulting estimates of causal effects (e.g., Imbens and Kalyanaraman [2012]).
Finally, one strength of RD designs is that the estimated relation is often
effectively univariate and easily plotted. As suggested by Lee and Lemieux
[2010], it is highly desirable for researchers to plot both underlying data
and fitted regression functions around the discontinuity. This plot will enable readers to evaluate the strength of the results. If there is a substantive
impact associated with the treatment, this should be obvious from a plot of
the actual data and the associated fitted function.
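For illustration only, a local linear RD estimate and the kind of plot Lee and Lemieux [2010] recommend might be sketched as follows (simulated data; the bandwidth here is arbitrary rather than optimally estimated):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 5000
r = rng.uniform(-1, 1, n)                 # running variable, cutoff at 0
D = (r >= 0).astype(float)                # treatment switches on at the cutoff
y = 1.0 + 0.8 * r + 0.5 * D + rng.normal(0, 0.3, n)   # true local effect = 0.5

h = 0.25                                  # bandwidth: in practice, estimate it optimally
m = np.abs(r) < h
X = np.column_stack([np.ones(m.sum()), r[m], D[m], r[m] * D[m]])  # separate slopes each side
beta = np.linalg.lstsq(X, y[m], rcond=None)[0]
print(beta[2])                            # estimated jump at the cutoff, ~0.5

# Plot binned means so readers can see the discontinuity in the raw data
edges = np.linspace(-1, 1, 41)
mids = 0.5 * (edges[:-1] + edges[1:])
means = [y[(r >= lo) & (r < hi)].mean() for lo, hi in zip(edges[:-1], edges[1:])]
plt.scatter(mids, means)
plt.axvline(0.0, linestyle="--")
plt.show()
```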
3.4. OTHER METHODS
3.4.1. Difference-in-Differences and Fixed-Effect Estimators. Accounting researchers have come to view some statistical methods as requiring fewer
assumptions and thus being less subject to problems when it comes to drawing causal inferences. Angrist and Pischke [2010, p. 12] include so-called
“DD estimators” on their list of such quasi-experimental methods, along
with “IV and RD methods.”26 Enthusiasm for DD designs perhaps stems
from a belief that these are "quasi-experimental" methods in the same sense as the other two approaches cited by Angrist and Pischke [2010, p. 12]. But the essential feature that IVs and RD methods rely on is the "as if" random treatment assignment mechanism. If treatment assignment is driven by unobserved confounding variables, then DD and fixed-effect estimates of causal effects will be biased and inconsistent. As few settings in accounting satisfy random treatment assignment, there is a heavy burden on researchers using DD or fixed-effect estimators to explain why they believe these methods allow them to recover unbiased or consistent estimates of causal effects.

26 Because Angrist and Pischke [2008, p. 228] argue that "DD is a version of fixed effects estimation," we discuss these methods together.
Proponents of DD methods argue that they rely on the relatively innocuous assumption of “parallel trends.” But it is far from clear that this assumption is actually a mild one. First, it is a highly parametric assumption: parallel trends might hold for levels of a variable, but that does not mean they
would hold for log-transformations of the variable. Second, many variables
of interest to accounting researchers are mean-reverting, which is inconsistent with parallel trends when treatment and control observations differ in
pretreatment outcomes. Third, as DD studies typically rely on some kind of
quasi-natural experiment, the existence of pretreatment differences raises
questions about the claimed “as if” random assignment to treatment and
control. For example, the frequently cited study of Kelly and Ljungqvist
[2012] uses supposedly random shocks to brokerage coverage and a DD
design. But the existence of a 0.039 difference in spreads between treatment and matched control firms suggests that the assignment was far from
random.27
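The second point, that mean reversion undermines parallel trends, can be seen in a small sketch (ours; a mean-reverting outcome with no treatment effect and selection on pretreatment levels):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
y0 = rng.normal(size=n)                   # pretreatment outcome
treated = y0 > 0.5                        # selection on pretreatment levels, not "as if" random
y1 = 0.5 * y0 + rng.normal(size=n)        # mean-reverting outcome; true treatment effect is 0

dd = (y1 - y0)[treated].mean() - (y1 - y0)[~treated].mean()
print(dd)   # strongly negative: mean reversion masquerades as a treatment effect
```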
The causal interpretation of regressions that use fixed effects to control
for unobservable differences in observations also can be problematic, particularly when there are heterogeneities in treatment effects. If the true
effect is positive for some units (e.g., firms) and negative for others, then,
depending on the composition of the sample, the sign of the effect can
be positive, negative, or indistinguishable from zero. Additionally, if units
self-select into a binary treatment for the entire sample period, then a fixedeffect estimator will not use these observations in estimating the effect, even
though these might plausibly be the observations with the greatest treatment effect.
Heterogeneity in effects is not the only problem that fixed-effect strategies cannot necessarily handle. In particular, when there are complex relations between unobservables and treatments, as is likely to be the case in
many accounting research settings, it is unclear what a fixed-effect strategy
would produce. If time-invariant heterogeneity is correlated with potential
outcomes, then fixed-effect estimators can have greater bias than estimators
that omit fixed effects.
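To illustrate the earlier heterogeneity point with a sketch (ours; effects of +1 or −1 assigned at random purely for illustration): with two periods, the within estimator reduces to a difference in differences, and offsetting unit-level effects can average to nothing:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
alpha = rng.normal(size=n)                       # time-invariant unit effects
switch = rng.random(n) < 0.5                     # units treated in period 2 only
tau = np.where(rng.random(n) < 0.5, 1.0, -1.0)   # heterogeneous effects: +1 or -1

y1 = alpha + rng.normal(size=n)                  # period 1: no one treated
y2 = alpha + tau * switch + rng.normal(size=n)   # period 2: switchers treated

fe = (y2 - y1)[switch].mean() - (y2 - y1)[~switch].mean()
print(fe)   # ~0, even though every treated unit has an effect of +1 or -1
```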
27 See Kelly and Ljungqvist [2012, p. 1388, table 2]. This pretreatment difference is material,
given the estimated treatment effect of 0.020. Perhaps recognizing this issue, the subsequent
paper by Balakrishnan et al. [2014] matches on pretreatment values.
In our view, accounting researchers need to be much more careful using and interpreting fixed-effect estimators. In particular, researchers need
to clearly demonstrate how their fixed-effect estimates are related to the
causal effect of interest, particularly when that effect could differ across
observations.
3.4.2. Propensity-Score Matching. Another method that has become popular in accounting research is PSM. Regression methods can be viewed as
making model-based adjustments to address confounding variables. Stuart
and Rubin [2007, p. 157] argue that:
[M]atching methods are preferable to these model-based adjustments for
two key reasons. First, matching methods do not use the outcome values
in the design of the study and thus preclude the selection of a particular
design to yield a desired result. Second, when there are large differences
in the covariate distributions between the groups, standard model-based
adjustments rely heavily on extrapolation and model-based assumptions.
Matching methods highlight these differences and also provide a way to
limit reliance on the inherently untestable modeling assumptions and the
consequential sensitivity to those assumptions.
For these reasons, PSM methods can prove useful when faced with observational data. However, PSM does not provide “the closest archival approximation to a true random experiment” and does not represent “the most appropriate and rigorous research design for testing the effects of an ex ante
treatment” (Kirk and Vincent [2014, p. 1429]). Rosenbaum [2009, pp. 73–
75] points out that matching is “a fairly mechanical task,” and when assignment to treatment is driven by unobservable variables, PSM-based estimates
may be biased as much as regression estimates. We agree with Minutti-Meza
[2014], who argues that “matching does not necessarily eliminate the endogeneity problem resulting from unobservable variables driving [treatment]
and [outcomes].”
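For concreteness, a minimal PSM sketch (ours; it assumes treatment depends only on observed covariates, which is precisely the assumption that matching cannot verify):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 3))                         # observed covariates
p = 1 / (1 + np.exp(-(X @ np.array([0.8, -0.5, 0.3]))))
d = rng.binomial(1, p)                              # treatment depends on observables only
y = X.sum(axis=1) + 0.5 * d + rng.normal(size=n)    # true treatment effect = 0.5

ps = LogisticRegression().fit(X, d).predict_proba(X)[:, 1]
nn = NearestNeighbors(n_neighbors=1).fit(ps[d == 0].reshape(-1, 1))
_, idx = nn.kneighbors(ps[d == 1].reshape(-1, 1))   # nearest-neighbor match on the score
att = (y[d == 1] - y[d == 0][idx.ravel()]).mean()
print(att)   # ~0.5 here; with unobserved drivers of d, the same estimator would be biased
```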
3.5. QUASI-EXPERIMENTAL METHODS: AN EVALUATION
We agree that the revolution in econometric methods for causal inference represents an opportunity for accounting researchers. However, the
assumptions required for these methods to deliver credible estimates of
causal effects are unlikely to be met in many applications that rely on observational data. In this regard, we echo the observation in Leuz and Wysocki
[2016, p. 29] that “finding valid instruments to implement selection models
and IV regressions is very difficult.”
Given the dominance of causal questions and observational data in accounting research, and the difficulty researchers will face in applying quasiexperimental methods in accounting research, our appraisal may seem
disappointing. Yet, these methods can be used in certain settings. In what
follows, we offer some alternative paths that accounting researchers might
consider going forward.
4. Causal Mechanisms, Causal Inference, and Descriptive Studies
In the first part of this paper, we have argued that, while causal inference
is the goal of most accounting research, it is extremely difficult to find settings and statistical methods that can produce credible estimates of causal
effects. Does this mean accounting researchers must give up making causal
statements? We believe the answer is no. There are viable paths forward.
The objective of the second part of this paper is to discuss these paths.
The first path we discuss is an increased focus on causal mechanisms. Accounting research is not alone in its reliance on observational data with the
goal of drawing causal inferences. It is, therefore, natural to look to other
fields using observational data to identify causal mechanisms and ultimately
to draw causal inferences. Epidemiology and medicine are two fields that
are often singled out in this regard. In what follows, we briefly provide examples and highlight the features of the examples that enhanced the credibility of the inferences drawn. A key implication of this discussion is that
accounting researchers need to identify clearly and rigorously the causal
mechanism that is producing their results.
4.1. JOHN SNOW AND CHOLERA
A widely cited case of successful causal inference is John Snow’s work on
cholera. As there are many excellent accounts of Snow’s work, we will focus on the barest details. As discussed in Freedman [2009, p. 339], “John
Snow was a physician in Victorian London. In 1854, he demonstrated that
cholera was an infectious disease, which could be prevented by cleaning
up the water supply. The demonstration took advantage of a natural experiment. A large area of London was served by two water companies.
The Southwark and Vauxhall company distributed contaminated water, and
households served by it had a death rate ‘between eight and nine times as
great as in the houses supplied by the Lambeth company,’ which supplied
relatively pure water.” But there was much more to Snow’s work than the
use of a convenient natural experiment. First, Snow’s reasoning (much of
which was surely done before “the arduous task of data collection” began)
was about the mechanism through which cholera spread. Existing theory
suggested “odors generated by decaying organic material.” Snow reasoned
qualitatively that such a mechanism was implausible. Instead, drawing on
his medical knowledge and the facts at hand, Snow conjectured that “a living organism enters the body, as a contaminant of water or food, multiplies
in the body, and creates the symptoms of the disease. Many copies of the
organism are expelled with the dejecta, contaminate water or food, then
infect other victims” (Freedman [2009, p. 342]).
With a hypothesis at hand, Snow then needed to collect data to prove it.
His data collection involved a house-to-house survey in the area surrounding the Broad Street pump. As part
of his data collection, Snow needed to account for anomalous cases (such
as the brewery workers who drank beer, not water). It is important to note
that this qualitative reasoning and diligent data collection were critical elements in establishing (to a modern reader) the “as if” random nature of
the treatment assignment mechanism provided by the Broad Street pump.
Snow’s deliberate methods contrast with a shortcut approach, which would
have been to argue that in his data he had a natural experiment.
Another important feature of this example is that widespread acceptance
of Snow’s hypothesis did not occur until compelling evidence of the precise causal mechanism was provided. “However, widespread acceptance was
achieved only when Robert Koch isolated the causal agent (Vibrio cholerae, a
comma-shaped bacillus) during the Indian epidemic of 1883” (Freedman
[2009, p. 342]). Only once persuasive evidence of a plausible mechanism
was provided (i.e., direct observation of microorganisms now known to
cause the disease) did Snow’s ideas become widely accepted.
We expect the same might be true in the accounting discipline if researchers carefully articulate the assumed causal mechanism for their observations. It is, of course, necessary for researchers to show that the proposed mechanism is actually consistent with behavior in the institutional
setting being examined. As we discuss below, detailed descriptive studies of
institutional phenomena provide an important part of the information to
evaluate the proposed mechanism.
4.2. SMOKING AND HEART DISEASE
A more recent illustration of plausible causal inference is discussed by
Gillies [2011]. Gillies focuses on the paper by Doll and Peto [1976], which
studies the mortality rates of male doctors between 1951 and 1971. The
data of Doll and Peto [1976] showed “a striking correlation between smoking and lung cancer” (Gillies [2011, p. 111]). Gillies [2011] argues that
“this correlation was accepted at the time by most researchers (if not quite
all!) as establishing a causal link between smoking and lung cancer.” Indeed
Doll and Peto [1976, p. 1535] themselves say explicitly that “the excess mortality from cancer of the lung in cigarette smokers is caused by cigarette
smoking.” In contrast, while Doll and Peto [1976] had highly statistically
significant evidence of an association between smoking and heart disease,
they were cautious about drawing inferences of a direct causal explanation
for the association. Doll and Peto [1976, p. 1528] point out that “to say that
these conditions were related to smoking does not necessarily imply that
smoking caused . . .them. The relation may have been secondary in that
smoking was associated with some other factor, such as alcohol consumption or a feature of the personality, that caused the disease.”
Gillies [2011] then discusses extensive research into atherosclerosis between 1979 and 1989 and concludes that “by the end of the 1980s,
it was established that the oxidation of LDL was an important step in
the process which led to atherosclerotic plaques.” Later research provided “compelling evidence” that smoking causes oxidative modification of
CAUSAL INFERENCE IN ACCOUNTING RESEARCH
497
biologic components in humans.28 Gillies [2011, p. 120] points out that this
evidence alone did not establish a confirmed mechanism linking smoking
with heart disease, because the required oxidation needs to occur in the
artery wall, not in the blood stream, and it fell to later research to establish this missing piece.29 Thus, through a process involving multiple studies
over two decades, a plausible set of causal mechanisms between smoking
and atherosclerosis was established.
Gillies [2011] avers that the process by which a causal link between smoking and atherosclerosis was established illustrates the “Russo–Williamson
thesis.” Russo and Williamson [2007, p. 159] suggest that “mechanisms allow us to generalize a causal relation: while an appropriate dependence
in the sample data can warrant a causal claim ‘C causes E in the sample
population,’ a plausible mechanism or theoretical connection is required
to warrant the more general claim ‘C causes E .’ Conversely, mechanisms
also impose negative constraints: if there is no plausible mechanism from
C to E , then any correlation is likely to be spurious. Thus mechanisms can
be used to differentiate between causal models that are underdetermined
by probabilistic evidence alone.”
The Russo–Williamson thesis was arguably also at work in the case of
Snow and cholera, where the establishment of a mechanism (i.e., Vibrio
cholerae) was essential before the causal explanation offered by Snow was
widely accepted. It also appears in the case of smoking and lung cancer,
which was initially conjectured based on correlations, prior to a direct biological explanation being offered.30
4.3 CAUSAL MECHANISMS IN ACCOUNTING RESEARCH
Our view is that accounting researchers can learn from fields such as
epidemiology, medicine, and political science.31 These fields grapple with
observational data and eventually draw inferences that are causal. While
randomized controlled trials are a gold standard of sorts in epidemiology,
in many cases it is unfeasible or unethical to use such trials. For example, in
political science, it is not possible to randomly assign countries to treatment
conditions such as democracy or socialism. Nevertheless, these fields have
often been able to draw plausible causal inferences by establishing clear
mechanisms, or causal pathways, from putative causes to putative effects.
28 This evidence is much higher levels of a new measure (levels of F -isoprostanes in blood
2
samples) of the relevant oxidation in the body due to smoking. This conclusion was greatly
strengthened by the finding that levels of F2 -isoprostanes in the smokers “fell significantly after
two weeks of abstinence from smoking” (Morrow et al. [1995, pp. 1201 and 1202).
29 “Smoking produced oxidative stress. This increased the adhesion of leukocytes to the
. . .artery, which in turn accelerated the formation of atherosclerotic plaques” (Gillies [2011,
p. 123]).
30 The persuasive force of Snow’s natural experiment, coming decades before the work by
Neyman [1923] and Fisher [1935], might be considered greater today.
31 In this regard, we echo the suggestion by Leuz and Wysocki [2016] that it “might be
useful for regulators, policy makers and academics to study the experience in medicine.”
498
I. D. GOW, D. F. LARCKER, AND P. C. REISS
One paper that has a fairly compelling identification strategy is Brown,
Stice and White [2015], which examines “the influence of mobile communication on local information flow and local investor activity using the enforcement of state-wide distracted driving restrictions.” The authors find
that “these restrictions . . .inhibit local information flow and . . .the market
activity of stocks headquartered in enforcement states.” Miller and Skinner
[2015, p. 229] suggest that “given the authors’ setting and research design,
it is difficult to imagine a story under which the types of reverse causality
or correlated omitted variables explanations that we normally worry about
in disclosure research are at play.” However, notwithstanding the apparent
robustness of the research design, the results would be much more compelling if there were more detailed evidence regarding the precise causal
mechanism through which the estimated effect occurs and the authors appear to go to lengths to provide such an account.32 For example, evidence
of trading activity by local investors while driving prior to, but not after, the
implementation of distracted driving restrictions would add considerable
support to conclusions in Brown, Stice and White [2015].33
As another example, many published papers have suggested that managers adopt conditional conservatism as a reporting strategy to obtain benefits such as reduced debt costs. However, as Beyer et al. [2010, p. 317]
point out, an ex ante commitment to such a reporting strategy “requires a
mechanism that allows managers to credibly commit to withholding good
news or to commit to an accounting information system that implements
a higher degree of verification for gains than for losses,” yet research has
only recently begun to focus on the mechanisms through which such commitments are made (e.g., Erkens, Subramanyam, and Zhang [2014]).
It is very clear that we need a much better understanding of the precise causal mechanisms for important accounting research questions. A
clear discussion of these mechanisms will enable reviewers and readers to
see what is being assumed and assess the reasonableness of the theoretical
causal mechanisms.
32 Brown, Stice and White [2015, pp. 277 and 278] “argue that constraints on mobile communication while driving could impede or delay the collection and diffusion of local stock
information across local individuals. Anecdotal evidence suggests that some individuals use
car commutes as opportune times to gather and disseminate stock information via mobile
devices. For instance, some commuters use mobile devices to collect and pass on stock information either electronically or by word of mouth to other individuals within their social
network. Drivers also use mobile devices to wirelessly check stock positions and prices in realtime, stream the latest financial news, or listen to earnings calls.”
33 Note that the authors disclaim reliance on trading while driving: “our conjectures do not
depend on the presumption that local investors are driving when they execute stock trades
. . .[as] we expect such behavior to be uncommon.” However, even if not necessary, given the
small effect size documented in the paper (approximately 1% decrease in volume), a small
amount of such activity could be sufficient to provide a convincing account in support of their
results.
CAUSAL INFERENCE IN ACCOUNTING RESEARCH
499
4.4 DESCRIPTIVE STUDIES
Accounting is an applied discipline and it would seem that most empirical research studies should be solidly grounded in the details of how institutions operate. These descriptions can form a basis for identifying and justifying causal mechanisms for explaining empirical results. Unfortunately,
there are very few studies published in top accounting journals that focus
on providing detailed descriptions of institutions in accounting research
settings. Part of this likely reflects the perception that research that pursues causal questions (i.e., tests of theories) is more highly prized, and thus
more likely to be published in top accounting journals.34 We believe that accounting research can benefit substantially from more in-depth descriptive
research. As we discuss below, this type of research is essential to improve
our understanding of causal mechanisms and develop structural models.35
One reason to value descriptive research is that it can uncover realistic
structures and mechanisms that would be exceedingly difficult to arrive at
from basic economic theory or the simple intuition of the researcher. In the
compensation area, the early research by Lewellyn [1968] and the more
recent work by Frydman and Saks [2010] are also essentially descriptive
studies that caused researchers to explore why certain patterns of remuneration arrangements are used, revised, or eliminated over time. These types
of data motivate researchers to frame research studies that have the potential to uncover the causal mechanisms that produce these institutional
observations.
A good example in the accounting literature is the study by Healy [1985].
Using proxy statement disclosures and conversations with actual executives
and consultants, Healy [1985] studies the bonus contracts of 94 large U.S.
companies and identifies a common structure of these bonus plans, including the existence of caps and floors. The paper also suggests hypotheses
worth investigating regarding the effects of these plan features on accounting decisions. It seems highly unlikely that a model derived from fundamental economic theory would arrive at these plan features found in his
data.
Another example is work by Smith and Warner [1979], Kalay [1982], and
many others who look at debt covenant provisions. Institutional knowledge
34 At one point, the Journal of Accounting Research published papers in a section entitled
“Capsules and Comments.” The editor at the time (Nicholas Dopuch) would seem to place
a paper into this section if it “did not fit” as a main article, but examined new institutional
data or ideas. Such a journal section might have provided a credible signal of a willingness to
publish descriptive studies of institutionally interesting settings.
35 There are many “classic” descriptive studies that have had a major impact on subsequent
theoretical and empirical research in organizational behavior and strategy (e.g., Cyert, Simon,
and Trow [1956], Mintzberg [1973], Bower [1986]). Cyert, Simon, and Trow [1956] argue that
“a realistic description and theory of the decision-making process are of central importance to
business administration and organization theory. Moreover, it is extremely doubtful whether
. . .economics does in fact provide a realistic account of decision-making in large organizations
operating in a complex world.”
500
I. D. GOW, D. F. LARCKER, AND P. C. REISS
about debt covenants has generated hypotheses about managerial wealth
and accounting manipulation. Moreover, descriptive statistics regarding
covenants also provided Dichev and Skinner [2002] with the data to show
that leverage is not a valid proxy for “closeness to covenant.” This is an
important finding because the empirical literature to this point simply assumed that leverage was a reliable and valid proxy for potential covenant
violations. An in-depth examination of actual debt covenants and an understanding of how covenant violations are dealt with by financial institutions would have substantially improved much of the research on how
debt covenants influence firm behavior (i.e., so-called “positive theory”
research).
In the corporate governance area, the descriptive data on board of director interlocks in Brandeis [1913], U.S. Federal Trade Commission [1951],
and U.S. Congress Senate Committee on Governmental Affairs and Ribicoff [1978] provided novel descriptive insights into the structure of boards
of directors. These and other similar studies had an important impact
on starting the large literature on how boards of directors function. Similarly, the initial collection of equity ownership by executives, directors, and
large shareholders by the Securities and Exchange Commission [1936] enabled researchers to understand the extent to which ownership is separated
from control, and examine the implications of the classic Berle and Means
[1932] hypotheses regarding economic activity.
Descriptive data on antitakeover provisions collected by the Investor Responsibility Research Center (IRRC) have provided the basis for a considerable amount of research on the market for corporate control. Gompers,
Ishii, and Metrick [2003], Bebchuk, Cohen, and Ferrell [2009], and many
others use these data to form and test a multitude of research questions
related to corporate governance. Perhaps more importantly, Daines and
Klausner [2001] provided an institutionally grounded examination of how
these specific antitakeover provisions actually work from a legal perspective
(which contrasts with conjectures made by researchers in other disciplines).
The Daines and Klausner [2001] analysis provides a good example of how
descriptive data combined with institutional and legal knowledge can provide appropriate insights into the workings of corporate governance.
The descriptive disclosure data compiled by the Association for Investment Management and Research (AIMR) have had a similar impact on
financial accounting research. These ratings reflect the assessments of analysts specializing in specific industries as to the informativeness of disclosures made by firms. The ratings data have provided a variety of useful
information about differences in disclosure practices across firms, industries, and time. We suspect that these statistics were instrumental in motivating Lang and Lundholm [1993, p. 6], Healy, Hutton, and Palepu [1999],
and many others. They provided new insights into whether firm disclosure
is associated with performance, consensus among investors, stock liquidity, and other important outcome variables. In related work, Groysberg,
Healy, and Maber [2011] provide an informative analysis of how analysts are
CAUSAL INFERENCE IN ACCOUNTING RESEARCH
501
compensated using descriptive proprietary data and statistical analyses to
uncover the fundamental features of the reward system.
Recently published research suggests an increased recognition of the
value of descriptive research. Soltes [2014] examines the interactions between sell-side analysts and company management in one firm that granted
proprietary access to its data to “offer insights into which analysts privately
meet with management, when analysts privately interact with management,
and why these interactions occur.” By comparing private interaction to
observed interaction between analysts and managers on conference calls,
and highlighting that private interaction with management is an important communication channel for analysts, Soltes [2014] suggests a plausible
mechanism through which information transfers actually occur.
That private communication with management is an important source of
information is confirmed by Brown et al. [2015]. Brown et al. survey and
interview financial analysts to understand how they think about a variety of
issues. Their findings suggest that analysts’ views on earnings quality differ
from those most researchers explore. For instance, analysts do not use the
“red flags” used by academics to identify manipulation. Analysts also generally are not attempting to uncover manipulation and use forecasts to figure out a stock price target. These insights should shape research seeking
to develop hypotheses and models of accounting information and analyst
behavior. Despite the dearth of descriptive research in top accounting journals, we believe that our discipline can benefit substantially from this style
of research. An interesting question is what makes a descriptive study an
important contribution that should be published in a top journal. An obvious required attribute is that the descriptive study examines an interesting institutional question where researchers care about understanding the
phenomenon producing the observations. Stated differently, would anyone
change their research agenda or their (causal) interpretations of prior work
if provided with these descriptive results?
The descriptive research needs to be neutral and unbiased in terms of
data collection and interpretations. If expert opinions are used, can we be
assured that the opinions are not biased because of their business dealings?
Data collected using surveys or interviews by consulting firms may provide
great descriptive data, but researchers need to be convinced that the data
are not confounded by selection bias or other sampling concerns.
The research should also provide deep insight into the causal mechanisms underlying observed institutional data. There may well be alternative mechanisms suggested by the research, and these alternatives may be a
function of nuances and contextual variables for the setting. Provided the
researcher is clear that their aim is description and not the last word on
causality, the presence of several alternative explanations should not detract from the insight of the descriptive work.
Obviously, the evaluation of descriptive research is somewhat subjective,
but the evaluation of more traditional accounting research is similarly subjective. As a discipline, we do not have much recent experience assessing
502
I. D. GOW, D. F. LARCKER, AND P. C. REISS
descriptive research, and we are unfamiliar with recent advances in descriptive methods, such as nonparametric regression. However, given the
possibility that descriptive research can help us begin to think about causal
mechanisms, it should be encouraged and accepted in the top accounting
journals.
5. Structural Modeling
5.1 STRUCTURAL MODELING: AN OVERVIEW
In sections 2 and 3, we suggested that researchers minimally consider
using diagrams to communicate the basis for their causal inferences, and
in section 4, we suggested that researchers be more precise in describing
how their data permit causal inferences. This section explores a formal approach to developing a causal model, namely, the “structural” approach.
Structural models are empirical models that are derived from theoretical
models of behavior. The term structural model originated with economists
and statisticians working at the Cowles Foundation in the 1940s and 1950s.
The earliest structural models used economic models of consumer and producer behavior to derive demand and supply equations. By adding an equilibrium condition, such as quantities demanded equal quantities supplied,
economists obtained a set of mathematical equations that could be used
to understand movements in observed prices and quantities. A question
then arose as to whether economists could reverse-engineer this modeling
process and use observed prices and quantities to recover the underlying
demand and supply relations. The models made it clear that the empiricist
could only recover estimates of the unobserved demand and supply equations if certain exogenous (IV) variables were available.
The impact of these early models on empirical work in economics encouraged other social scientists to begin using theoretical models to interpret data. Structural models have found widest application in situations
where causality is an issue, such as the determinants of educational choices,
voting, contraception, addiction, and financing decisions. Other applications of structural models are discussed in Reiss and Wolak [2007] and Reiss
[2011].
A structural empirical model comprises a theoretical model of the phenomenon of interest and a stochastic model that links the theoretical
model to the observed data. The theoretical model minimally describes
who makes decisions, the objectives of decision-makers, and constraints on
their behavior. In developing and analyzing the theoretical model, the researcher decides what conditions (variables) matter and what is endogenous and exogenous. While the theoretical model typically draws on economic principles, it could also be derived from behavioral theories in other
fields, such as psychology and sociology.36
36 Some researchers refer to any mathematical model fit to data as a structural model.
For instance, one might assume that the number of restatements in an industry follows a
CAUSAL INFERENCE IN ACCOUNTING RESEARCH
503
Structural models offer a number of benefits for empirical researchers.
First, structural modeling is a process that forces a researcher to make explicit assumptions about what determines behavior and outcomes (i.e., the
causal mechanism). Second, structural models make it clear what data are
needed to identify unobserved parameters and random variables, such as
coefficients of risk aversion. Third, structural models provide a foundation
for estimation and inference. Finally, structural models facilitate counterfactual analyses, such as what might happen under conditions not observed
in the data. To illustrate these benefits, as well as some of their limitations,
we next explore an accounting application.
5.2 STRUCTURAL MODELS IN ACCOUNTING: AN ILLUSTRATION
This section develops a model of managerial incentives to misstate accounting information. This topic has been the focus of many papers in
recent years (see the review in Armstrong, Jagolinzer, and Larcker [2010]).
The key question in this literature is whether certain kinds of managerial
incentives increase the tendency for managers to misstate (or attempt to
misstate) financial information. A number of papers hypothesize that tying
managers’ compensation to the information that they provide will increase
their desire to misstate that information. However, some researchers suggest that, by aligning the long-term interests of shareholders and managers,
certain kinds of incentives could actually reduce misstatements (Burns and
Kedia [2006]).
Efendi, Srivastava, and Swanson [2007] illustrate a fairly typical descriptive empirical paper in this literature. Efendi, Srivastava, and Swanson
[2007, p. 687] estimate a logistic regression with an indicator for restatements as the dependent variable and measures of CEO incentives as independent variables of interest, along with controls for firm size, financial
structure, and corporate governance proxies.37
A key assumption implicit in much of this literature is that restatements are a good proxy for actual misstatements (e.g., Efendi, Srivastava, and
Swanson [2007], Armstrong, Jagolinzer, and Larcker [2010]). This assumption is made because, in practice, accounting researchers only observe misstatements that are detected and corrected by external monitors after the
financial statements were issued. Examples of these external monitors include whistleblowers, regulators, media, and others (e.g., Dyck, Morse, and
Zingales [2010]). For simplicity, we refer to the actions of these external monitors collectively as “subsequent investigations.” If subsequent
Poisson process and then fit the parameters of the Poisson model using industry-level data on
restatements. We do not view such models as structural because they lack specific behavioral
or institutional components that permit a causal inference. We would classify this approach as
descriptive or statistical modeling.
37 Efendi, Srivastava, and Swanson [2007] also employ a case–control design that involves
matching firms with restatements with firms without. We do not focus on that aspect of their
research design in our discussion here.
504
I. D. GOW, D. F. LARCKER, AND P. C. REISS
investigations are perfect and detect all misstatements, then there is a oneto-one correspondence between misstatements and restatements.38 Realistically, these subsequent investigations are not perfect, meaning that we
need to recognize the difference between misstatements and restatements
when estimating the effect of managerial incentives on misstatements.
In the following analysis, we consider two alternative models of the causal
mechanism linking managerial incentives to accounting restatements. Each
model explicitly considers the incentives of the manager and the role of the
external auditor. The two models, however, lead to different conclusions
about how CEO incentives affect restatements. These differences permit us
to illustrate the value of having a theoretical model that can interpret competing empirical estimates, as well as the difficulty of interpreting estimates
in the absence of such models.
5.2.1. Model 1: A Nonstrategic Auditor Model. We assume that firm misstatements are deliberate and are made by a single agent, whom we refer to as
the “CEO.” The CEO is assumed to be rational in the sense that he or she
trades off private expected benefits and costs of misstatements when deciding whether to misstate. Specifically, suppose that the CEO receives a
benefit of B ∗ from the successful manipulation of earnings (i.e., a misstatement that is not detected either by the firm’s auditors before a report is
released or by subsequent investigations).
Besides the CEO, we assume that the firm’s auditors independently
detect and correct attempted misstatements at a constant rate p A and
that the (conditional) probability of subsequent investigations catching
a misstatement is p I . Given these assumptions, the probability of a misstatement getting past the firm’s auditor and subsequent investigations is
(1 − p A ) × (1 − p I ). The CEO’s expected benefit from a successful misstatement is then
B ∗ = (1 − p I ) × (1 − p A ) × B,
where B is a gross benefit to the manager from a misstatement.
Assume the CEO must exert a fixed cost of effort CM in order to misstate
performance. Combining this cost with the manager’s expected benefits
from of misstatement gives
Misstate
if (1 − p I ) × (1 − p A ) × B − CM ≥ 0
∗
yM
=
(3)
Don’t misstate, otherwise.
This (structural) inequality describes the unobserved misstatement process.
In general, researchers will not observe the structural parameters of interest: B, CM , p A , or p I .
To complete the structural model and recover these parameters, the
researcher must add assumptions that relate the parameters to the data
38 There will still be a difference between attempted misstatements and actual misstatements
due to the external auditor correcting some attempted misstatements.
CAUSAL INFERENCE IN ACCOUNTING RESEARCH
505
available. Suppose we only observe a (zero-one) indicator variable y for restatements. These restatements are the result of three decisions:
1) The manager misstates (or not).
2) The firm auditor detects and corrects an attempted misstatement (or
not).
3) A subsequent investigation detects a misstatement and a restatement
occurs (or not).
Mathematically, this sequence can be modeled as
∗
y = I (Restate) = I (y M
≥ 0) × (1 − I (y A∗ ≥ 0)) × I (y I∗ ≥ 0),
(4)
where I (·) is a zero-one indicator function equaling 1 when the condition
∗
in parentheses is true. The unobserved variables y M
, y A∗ , and y I∗ reflect the
criteria that underlie the CEO’s, firm’s auditor’s, and subsequent investigators’ decisions. Note that equation (4) uses (1 − I (y A∗ ≥ 0)), an indicator
for the firm’s auditor missing the misstatement.
Equation (4) somewhat resembles a traditional binary discrete choice
model. The easiest way to see this is to take expectations (from the researcher’s standpoint). Assuming the decision variables are independent,
∗
≥ 0) × (1 − I (y A∗ ≥ 0)) × I (y I∗ ≥ 0)
E (y ) = E I (y M
= Pr(Misstate) × Pr(Auditor Misses) × Pr(Investigation Finds)
= β ∗ × (1 − p A ) × p I = Pr(Restate),
(5)
∗
where β is the (researcher’s) forecasted probability that a misstatement
occurs, or, from equation (3),
β ∗ = Pr ( (1 − p A )(1 − p I )B − CM ≥ 0 ) .
(6)
At this point, the theory has delivered a structure for relating the unobserved probability of a misstatement, β ∗ , to the potentially estimable probability of a restatement. Now, we face a familiar structural modeling problem,
which is that the model does not anticipate all the reasons why, in practice,
these probabilities might vary across firm accounting statements. For example, the theory so far does not point to reasons why CEOs might differ
in their benefits and costs of misstatements. To move theoretical relations
closer to the data, researchers typically allow parts of the model to depend
on differentiating variables. Often the specifications of these dependencies
are ad hoc. Empiricists are willing to do this, however, because they believe
that it is important to account for practical aspects of the application that
the theory does not recognize.
To illustrate this approach, and following suggestions of what might matter from the accounting research literature, suppose the CEO’s unobserved
costs and benefits vary as follows:
B = b 0 + b 1 EQUITY + XB β
CM = m 0 + m 1 SALARY + XC γ + ξ ,
(7)
506
I. D. GOW, D. F. LARCKER, AND P. C. REISS
where EQUITY is the fraction of a CEO’s total pay that is stock-based compensation, the XB are other observable factors that impact the manager’s
benefits from misstatements, SALARY is the CEO’s annual base salary, and
the XC are observable factors impacting the CEO’s perceived costs of misstatements.39 The EQUITY variable is intended to capture the idea that
the more a CEO is rewarded for performance, the greater will be his or
her incentive to misstate results so as to increase (perceived) performance.
Thus, we would expect the unknown coefficient b 1 to be positive if providing more equity incentives increases the tendency of the CEO to misstate
earnings, but expect b 1 < 0 if it reduces that tendency. Similarly, we include
the variable SALARY as a driver of the cost of making misstatements, with
the idea that a CEO caught misstating might lose his or her job, including
salary (and other benefits). Thus, we would expect the unknown coefficient
m 1 also to be positive. For now, we leave the other X variables unnamed.
We have no strong theoretical reason for the assumption of linearity. Its
motivation is practical, as it facilitates estimation of the model unknowns
(as we will shortly see).40
With these assumptions, the probability of a restatement becomes
Pr(Restate) = θ0 Pr θ1 + θ2 EQUITY + θ3 SALARY ≥ ξ .
(8)
The new θ parameters are functions of the underlying incentive parameters as follows: θ0 = (1 − p A ) × p I , θ1 = (1 − p A )(1 − p I )b 0 − m 0 , θ2 =
(1 − p A )(1 − p I )b 1 , and θ3 = −m 1 . Apart from the scalar multiple θ0 , which
can be absorbed into the probability statement (and thus is not identified),
this probability model has the form of a familiar binary choice model (e.g.,
a probit or logit). Thus, the value of the structure imposed so far is that
it can motivate the application of a familiar statistical model as in Efendi,
Srivastava, and Swanson [2007], as well as explain how the estimated coefficients are potentially connected to quantities that impact the probability
of a misstatement.
5.2.2. Estimating the Nonstrategic Auditor Model. To illustrate how to estimate this structural model, we simulated a data set containing 10,000 firmyear observations on whether or not financial results were restated.41
For verisimilitude, we simulated variables that have been used to model
restatements. RESTATE is a zero-one indicator variable for whether a firm
39 For expositional purposes, we assume away X and X in our analysis.
B
C
40 Another key variable in the above model is the unobserved cost ξ . While it makes sense
to say that the researcher cannot measure all misstatement costs, why not also allow for unobserved benefits as well? The answer here is that adding an unobserved benefit would not really
add to the model, as it is the net difference that the model is trying to capture. The sense in
which it could matter is if we thought we observed the probabilities p A and p I . In this case,
we might be able to distinguish between the cost and benefit unobservables based on their
variances.
41 The parameter values used to generate the data are a = 0.5, a = 3.5, a = 3.5, m =
0
1
2
0
7, m 1 = 1.5, b 0 = 20, b 1 = 10, p 0 = 0.75, v0 = 0.05, p I = 0.45, and r 0 = 60. For those interested, the data are available at http://web.stanford.edu/ preiss/Data page.html.
CAUSAL INFERENCE IN ACCOUNTING RESEARCH
507
TABLE 1
Descriptive Statistics
Variable
Sample Mean (SE)
RESTATE
0.099
(0.30)
1.06
(0.27)
0.45
(0.26)
0.75
(0.43)
0.09
(0.08)
1.49
(0.50)
0.31
(0.46)
SALARY
EQUITY
BIG4
FINDIRECT
SEG
INT
RESTATE is a zero-one indicator for whether a sample firm made a restatement in a particular year.
SALARY is the CEO’s annual base salary (in millions of $). EQUITY is the fraction of a CEO’s total pay
that is equity-based compensation. BIG4 is a zero-one indicator for whether the firm uses a Big 4 auditor.
FINDIRECT is the fraction of the board of directors with a professional finance or accounting background.
INT is a zero-one indicator for whether the firm derives most of its revenue outside the United States. SEG
is the firm’s number of two-digit SIC business segments.
restated (RESTATE = 1) their financial results in a given year. The variable
BIG4 also is a zero-one indicator for whether the firm’s auditor is one of the
four largest U.S. accounting firms. It is included in the specifications because Big 4 auditing firms might have more accounting expertise and this
expertise might make them more likely to catch misstatements. Similarly,
the corporate governance literature suggests that board oversight from directors with accounting or finance backgrounds reduces the likelihood of
misstatements. We proxy this possibility with FINDIREC, the percentage
of directors who have professional accounting or finance backgrounds. Finally, the variables INT and SEG are included to capture the complexity
and costs of audits. Specifically, INT is a zero-one indicator for whether the
firm does a majority of its business outside the United States. We assume
that international companies have higher auditing costs. Similarly, SEG is a
count of the firm’s business segments. We assume that more segments likely
will increase the costs of auditing.
Table 1 reports descriptive statistics for our sample and table 2 reports
the results of logit regressions in which the dependent variable is the restatement indicator variable. These specifications parallel prior descriptive
statistical models that correlate restatements with other variables that might
impact misstatements. The table contains both a simple specification containing an intercept along with the two CEO compensation variables, and
a more intricate specification involving the other variables in the data set.
For each specification, we report the estimated coefficients of the logit and
the corresponding marginal effects evaluated at the sample means of the
exogenous variables.
508
I. D. GOW, D. F. LARCKER, AND P. C. REISS
TABLE 2
Logit Regression Results
Specification 1
Coefficient
Intercept
SALARY
EQUITY
BIG4
FINDIRECT
INT
SEG
Coefficient
(SE)
−2.278
(0.141)
0.280
(0.120)
−0.504
(0.130)
Marginal Effect
(SE)
0.025
(0.011)
−0.045
(0.011)
Specification 2
Coefficient
(SE)
Marginal Effect
(SE)
−3.498
(0.198)
0.326
(0.121)
−0.503
(0.131)
0.135
(0.080)
−0.239
(0.408)
0.548
(0.069)
0.578
(0.069)
0.028
(0.010)
−0.043
(0.011)
0.011
(0.006)
−0.020
(0.034)
0.051
(0.007)
0.049
(0.006)
This table presents results from logistic regressions of RESTATE, a zero-one indicator for whether the
firm made a restatement in a particular year, on a proxy for managerial incentives and controls. The controls
are as follows: SALARY is the CEO’s annual base salary (in millions of $), EQUITY is the fraction of a CEO’s
total pay that is equity-based compensation, BIG4 is a zero-one indicator for whether the firm uses a Big
4 auditor, FINDIRECT is the fraction of the board of directors with a professional finance or accounting
background, INT is a zero-one indicator for whether the firm derives most of its revenue outside the United
States, and SEG is the firm’s number of business segments.
The results for the pay coefficients in both specifications run counter to
those the previous accounting literature might predict and counter to those
predicted by the structural model that assumes the benefit coefficient on
equity pay, b 1 , is greater than zero. Specifically, more base pay is associated
with more restatements, while more equity-based compensation is associated with fewer restatements.
Besides the intercepts and the EQUITY and SALARY coefficients, the
only other coefficients that are statistically significant are those on INT and
SEG. While we can say (descriptively) that INT and SEG are associated with
higher restatement rates, unless we take a position on how they enter XC or
XB , it is difficult to interpret whether these signs make sense.
The question we now address is what to make of the fact that the coefficients on EQUITY seem inconsistent with our informal arguments and with
the prediction from our structural model that assumes b 1 > 0. One possible
interpretation of this finding is that our beliefs about the effects of incentives on misstatements were wrong. Another possibility is that the measures
we employ and the functional forms assumed are incorrect, which leads to
spurious results. Yet another possibility is that our theory of misstatements
is incorrect. It is this last possibility that we consider now.
5.2.3. Model 2: A Strategic Auditor Model. A key weakness of the previous
model is that it ignores the incentives of the external auditor. According
to PCAOB guidance in Auditing Standard No. 12, assessment of the risk of
CAUSAL INFERENCE IN ACCOUNTING RESEARCH
509
material misstatement should take into account “incentive compensation
arrangements.” Similarly, Auditing Standard No. 8 suggests that audit effort
should increase if risk is higher. To make the model richer in a manner
consistent with these institutional details, we assume that auditors trade off
the costs of audit effort against the reputational losses they might incur
should they miss a managerial misstatement that is subsequently detected.42
In the previous model, the firm’s auditor impacted the manager’s misstatement benefits through p A (which is assumed to be constant). Suppose
that p A is in fact a choice variable for the firm’s auditor. To make matters
simple, suppose that the auditor detects manipulation with probability p AH
if they exert high effort and, otherwise, they detect manipulation with the
lower probability p AL . Let the cost of high effort be a fixed cost CA > 0.
Without loss of generality, suppose the cost of low effort is zero. When deciding whether to audit with high or low effort, the auditor perceives a cost
to its reputation, CR , because of not detecting a misstatement that is caught
by subsequent investigations. This structure implies that the total cost of
high effort to the auditor is CA + (1 − p AH ) × p I × CR or the cost of high
effort plus the expected cost of missing a misstatement that is subsequently
caught with probability p I . The total expected cost of low effort is similarly
equal to (1 − p AL ) × p I × CR .
To complete this new model, we need to make an (equilibrium) assumption about how the CEO and firm auditor interact. Following the literature,
we assume that the two simultaneously and independently make decisions,
and their strategies form a Nash equilibrium. That is, we assume the players’ strategies are such that they optimize their objectives taking the actions
of the other players as fixed. This means that, in a Nash equilibrium, the
players are taking actions that they cannot unilaterally improve upon.
In this type of auditing game, the Nash equilibrium has the CEO and
the auditor playing mixed (randomized) strategies. That is, the auditor will
independently exert high effort with probability α ∗ and the CEO independently misstates with probability β ∗ . These probabilities are such that each
party has no incentive to change strategies. That is,
1) the CEO is indifferent between misstating and not misstating, or
(1 − p A∗ )(1 − p I )B − CM = 0,
(9)
where p A∗ = α ∗ p AH + (1 − α ∗ )p AL is the equilibrium probability a misstatement is detected; and
2) the auditor is indifferent between exerting high and low effort, or
β ∗ (1 − p AH )p I CR + CA = β ∗ (1 − p AL )p I CR .
42 Here we have in mind the findings by Dyck, Morse, and Zingales [2010], who show that
many egregious forms of misstatements are detected subsequently by employees, directors,
regulators, and the media.
I. D. GOW, D. F. LARCKER, AND P. C. REISS
510
Solving these two equations for the equilibrium probabilities α ∗ and β ∗
yields
(1 − p AL )(1 − p I )B − CM
,
(1 − p I )(p AH − p AL )B
CA
β∗ =
.
(10)
(p AH − p AL )p I CR
From these equations, we can calculate the equilibrium probability of a
restatement43
α∗ =
Pr(Restate) = Pr(Misstate) × Pr(Auditor Misses) × Pr(Investigation Finds)
= β ∗ × (1 − p A∗ ) × p I .
(11)
This equation illustrates how the probability of a restatement is related to
the unobserved frequency of misstatements. In particular, if we knew the
frequency with which auditors and subsequent investigations caught misstatements, we could easily link the two. Otherwise, we would have to estimate these probabilities (or make assumptions about them).
Substituting the equilibrium strategies (10) into (11) yields
CA CM (1 − p AL )
.
(12)
(p AH − p AL )(1 − p I )CR B
Now we are in a position to use the theory to help interpret the conflicting
logistic regression results in table 3.
Equation (12) shows that the presence of a strategic external auditor
changes how the CEO’s incentives impact the probability of a restatement.44 Partial derivatives of equation (12) show that the restatement probability is:
Pr(Restate) =
r Decreasing in the benefit B that the CEO enjoys from misstatement;
r Increasing in the personal cost of manipulation CM incurred by the
CEO;
r Decreasing in the reputational cost CR incurred by the external auditor;
r Increasing in the cost of high effort CA incurred by the external auditor.
Thus, in contrast to the model with a non…
Delivering a high-quality product at a reasonable price is not enough anymore.
That’s why we have developed 5 beneficial guarantees that will make your experience with our service enjoyable, easy, and safe.
You have to be 100% sure of the quality of your product to give a money-back guarantee. This describes us perfectly. Make sure that this guarantee is totally transparent.
Read moreEach paper is composed from scratch, according to your instructions. It is then checked by our plagiarism-detection software. There is no gap where plagiarism could squeeze in.
Read moreThanks to our free revisions, there is no way for you to be unsatisfied. We will work on your paper until you are completely happy with the result.
Read moreYour email is safe, as we store it according to international data protection rules. Your bank details are secure, as we use only reliable payment systems.
Read moreBy sending us your money, you buy the service we provide. Check out our terms and conditions if you prefer business talks to be laid out in official language.
Read more