Defining and using evidence in conservation practice

There is growing interest in evidence‐based conservation, yet there are no widely accepted standard definitions of evidence, let alone guidance on how to use it in the context of conservation and natural resource management practice. In this paper, we first draw on insights of evidence‐based practice from different disciplines to define evidence as being the “relevant information used to assess one or more hypotheses related to a question of interest.” We then construct a typology of different kinds of information, hypotheses, and evidence and show how these different types can be used in different steps of conservation practice. In particular, we distinguish between specific evidence used to assess project hypotheses and generic evidence used to assess generic hypotheses. We next build on this typology to develop a decision tree to support practitioners in how to appropriately use available specific and generic evidence in a given conservation situation. Finally, we conclude with a discussion of how to better promote and enable evidence‐based conservation in both projects and across the discipline of conservation. Our hope is that by understanding and using evidence better, conservation can both become more effective and attract increased support from society.


| INTRODUCTION
There is growing interest in evidence-based practice in biodiversity conservation and natural resource management (Keene & Pullin, 2011; Pullin & Knight, 2001; Sutherland, Pullin, Dolman, & Knight, 2004). This concept was first developed in medicine (Sackett, Rosenberg, Gray, Haynes, & Richardson, 1996; TRIP, 2018), and has taken root in other action-oriented disciplines such as education (Davies, 1999; IES, 2018), social work (Nutley, Walter, & Davies, 2009), policing (Sherman, 2015), and ecological risk assessment (Suter, 2016). In evidence-based practice, rather than merely relying on personal experience or anecdote, practitioners make decisions and take actions that are informed by systematic and critical analyses of both their own and the world's previous experiences. Practitioners also ideally document their results and contribute their findings back to the evidence base.
If we collectively are going to practice evidence-based conservation, it would be helpful to have a widely accepted definition of evidence as well as standard guidance on how to use evidence in conservation practice. These are not straightforward tasks. The term evidence is currently used in many different ways to refer to many different things (Table 1). But if all these things are evidence, then it will be hard for a discipline to develop standard guidance on how to use it. As the Stanford Encyclopedia of Philosophy states, "it is far from obvious that any one thing could play all of the diverse roles that evidence has at various times been expected to play" (Kelly, 2016).
Furthermore, conservation cannot adopt evidence frameworks, tools, and guidance from other fields wholesale, since the needed and available evidence as well as the standards for evidence quality vary vastly across disciplines. In relatively mature disciplines like medicine or education, case situations are generally well-defined and relatively homogenous and there are many controlled studies of interventions conducted by a large cadre of clinical researchers. These fields thus have a relatively high standard for the quality of evidence of intervention effectiveness. But this higher standard may not yet be appropriate for a discipline like conservation in which practitioners typically work in relatively complex and messy situations, and in which there are sparse records of case results, let alone controlled studies of intervention effectiveness (Sutherland, Dicks, Ockendon, Petrovan, & Smith, 2018). We thus need to assess and use evidence in the context of the range of current and future practices in conservation.
To this end, in this paper, we use examples of conservation practice along with insights on evidence-based practice from different disciplines to define and construct a typology of evidence in conservation practice. We then build on this typology to develop a decision tree to support practitioners in how to appropriately use available evidence in a given conservation situation. We conclude with a discussion about incorporating evidence into conservation practice at both project and discipline levels that are based around a theory of change. Our hope is that by understanding and using evidence better, conservation can both become more effective and attract increased support from society.

| EXAMPLES OF CONSERVATION PRACTICE
For the purposes of this paper, we assume that the practice of conservation (which includes natural resource management) is a process in which a defined group of practitioners first agrees on desired outcomes with regard to a given situation of interest and then takes one or more actions to achieve these outcomes. This process, which can be applied to projects at any spatial or temporal scale (Salafsky et al., 2008; Salafsky, Margoluis, Redford, & Robinson, 2002), is described and supported by various planning and decision-support frameworks (Schwartz et al., 2017) and ideally incorporates principles of adaptive management (sensu CMP, 2013; Holling, 1978) as necessary. We are thus interested in the evidence that a project team needs to make the various decisions involved in iteratively going through this conservation process. Supporting Information Figures S1 and S2 provide examples of a typical situation analysis (Margoluis, Stem, Salafsky, & Brown, 2009) and theory of change (Margoluis et al., 2013) for a fictitious conservation project that is the basis for the conservation examples used in this paper.

| DEFINITION AND TYPOLOGY OF EVIDENCE IN CONSERVATION PRACTICE
Most disciplines define evidence in relation to a question, proposition, or claim about the situation of interest. For example, the Stanford Encyclopedia of Philosophy states that "one's evidence consists of the totality of propositions that one knows" (Kelly, 2016). The U.S. Office of Management and Budget officially defines evidence as "the available body of facts or information indicating whether a belief or proposition is true or valid" (U.S. Office of Management and Budget, 2017). In medicine, Sackett et al. (1996) state "evidence based medicine is not restricted to randomised trials and meta-analyses. It involves tracking down the best external evidence with which to answer our clinical questions." In education, evidence is defined in relation to "an answerable question about education" (Davies, 1999). In the legal realm, the U.S. Federal Rules of Evidence that govern the information that can be used to draw inferences about "facts in issue" in U.S. Federal Courts state that "evidence is relevant if it has any tendency to make a fact more or less probable than it would be without the evidence" (U.S. Supreme Court, 2015). And in ecological risk assessment, evidence is defined as "information that informs inferences regarding a condition, cause, prediction, or outcome" (Suter, 2016). In a scientific or adaptive management context, questions or propositions about a situation of interest are often formally stated as hypotheses. We can thus define:
• Evidence-Relevant information used to assess one or more hypotheses related to a question of interest.
In order to operationalize this definition, we need to develop a more detailed typology of each of these highlighted terms in the context of conservation practice. This typology is based on a review of evidence related concepts and terms in various disciplines (Table S1).

| Types of information
The basic concept of information can be understood as a hierarchy of data, information, knowledge, and wisdom represented by the classic Data, Information, Knowledge, Wisdom (DIKW) Pyramid (Figure S3) (attributed to Ackoff, 1989). Using this hierarchy, we can then define the following sources of evidence (Dicks, Walsh, & Sutherland, 2014; Glover, Izzo, Odato, & Wang, 2006; Haynes, 2006):
• Basic data-Raw observations about the situation of interest. These might include details about the conservation targets, threats, stakeholders, actions and/or other basic data for evidence-based practice.
• Primary studies-Documentation of specific research or adaptive management efforts that describe the research question, situation, method, results, and conclusions of each case. These "pieces of evidence" (Suter, 2016) can range from peer-reviewed scientific publications of randomized controlled trials to grey-literature case studies or informal field notes. These studies are the core information for evidence-based practice.
• Evidence syntheses/decision-support systems-Analyses of a set of primary studies about a specific question. These range from formal systematic reviews and maps to subject-wide evidence syntheses to more informal summaries of available evidence (see typology in Cook, Nichols, Webb, Fuller, & Richards, 2017). This category also includes decision-support systems that summarize evidence and make it available to practitioners when making decisions. These can range from simple decision trees, to more sophisticated searchable online information technologies and decision-support software (Schwartz et al., 2017), to traditional knowledge systems employed by indigenous peoples. These syntheses and systems contain the knowledge for evidence-based practice.
• Theory/Principles-Articulations of known evidence-based principles for a given discipline. These can range from rules of thumb to codified guidance and principles. These principles ideally encapsulate the wisdom of evidence-based practice.
Some experts consider knowledge and especially wisdom to be derived from the evidence found in data and information, thus excluding decision-support systems and theory from their formal definition of evidence. However, we find it more practical to compile all these sources of evidence to create:
• Evidence base-The body of all data, studies, syntheses/systems, and theory being used as evidence for a particular set of hypotheses (Suter, 2016).
In the end, the collective weight of an overall evidence base is a function of the weight of the individual sources and the manner in which they were assembled, screened, and assessed (CEE, 2018; Cook et al., 2017; Suter, 2016). Different disciplines have developed different protocols and criteria for searching for and weighing individual sources and then aggregating them into the overall body of evidence (Table S1). In more established disciplines such as medicine or education, it is generally assumed that sources of evidence are replicated studies and so the criteria used to weight evidence focus primarily on the quality and size of the studies. In environmental work and other newer disciplines, however, sources of evidence can range from one-off case studies of single interventions to systematic reviews covering many cases. Weighting protocols in these cases involve some variant of summarizing the reliability of each source, assessing the strength (both direction and magnitude) of the findings from each source, determining the relevance of the source to the hypothesis of interest, and finally combining these parameters per Figure 1 to produce an assessment of the degree of support for a hypothesis from the overall evidence base (Norton, Cormier, & Suter, 2014; Suter, 2016). The evidence from weighted sources can be synthesized quantitatively (e.g., meta-analysis), qualitatively (e.g., through expert-based Delphi reviews), or in narrative form (CEE, 2018). This synthesis of studies that have been conducted at different times and in different places enables researchers to examine potentially confounding variables or interacting factors that vary over time and space to explore and understand the reasons for heterogeneity in outcomes.
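The weighting logic described above can be illustrated with a minimal sketch. It treats the evidence base as a balance: each relevant source contributes its reliability multiplied by its signed strength, and the collective weight is the net sum. The numeric reliability scale, the `Source` class, the `collective_weight` function, and the example sources are all hypothetical assumptions made for illustration; the weighting protocols cited above do not prescribe this particular formula.

```python
from dataclasses import dataclass

# Assumed numeric scale for the categorical reliability levels
# (VH = Very High, H = High, M = Medium, L = Low). Illustrative only.
RELIABILITY = {"VH": 4, "H": 3, "M": 2, "L": 1}


@dataclass
class Source:
    name: str
    reliability: str  # "VH", "H", "M", or "L"
    strength: float   # signed: -1.0 (strongly refutes) to +1.0 (strongly supports)
    relevant: bool    # does this source belong on the balance for this hypothesis?


def collective_weight(sources):
    """Net balance of all relevant sources: sum of reliability x signed strength."""
    return sum(RELIABILITY[s.reliability] * s.strength for s in sources if s.relevant)


# Hypothetical evidence base for a single hypothesis.
sources = [
    Source("systematic review", "VH", 0.5, True),              # weakly supports
    Source("informal field notes", "L", -1.0, True),           # strongly refutes
    Source("study from an unrelated system", "H", 1.0, False), # excluded as irrelevant
]
print(collective_weight(sources))  # positive values favor the hypothesis
```

In this sketch an irrelevant source is dropped before weighing, so a highly reliable but irrelevant study cannot tip the balance.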

| Types of hypotheses
By our definition above, evidence is used to assess one or more hypotheses related to a question of interest. In a scientific or adaptive management context, this assessment typically employs one of two approaches:
• Popperian approach-A hypothesis must be expressed as a falsifiable statement which can be "rejected" by assessing available evidence (Popper, 1959). Often, the falsifiable statement takes the form of a null hypothesis, so that rejecting the null hypothesis constitutes support for the original hypothesis directly pertaining to the project team's question of interest. This approach does not necessarily imply a statistical test. Traditional (frequentist) statistical tests are a subset of Popperian hypothesis assessment that involve using observations about a sample of a population to make inferences about the population as a whole.
• Bayesian approach-A hypothesis is expressed as a prior probability distribution (often shortened to prior) which is then transformed into a posterior probability distribution (posterior) that either "moves away from" or "stays at" the prior as available evidence is incorporated into the analysis (Wade, 2000).
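As a minimal sketch of the Bayesian approach, the example below updates a Beta prior with binomial observations, a standard conjugate update. The scenario (the proportion of predated nests that show signs of rats), the counts, and the function names `update_beta` and `beta_mean` are hypothetical and chosen only for illustration.

```python
def update_beta(prior_a, prior_b, successes, failures):
    """Conjugate update: a Beta(a, b) prior plus binomial data gives a Beta posterior."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Uniform prior: no initial opinion about the proportion of predated nests
# that show signs of rats (hypothetical quantity).
prior = (1.0, 1.0)

# Hypothetical field data: 7 of 10 predated nests show signs of rats.
posterior = update_beta(*prior, successes=7, failures=3)

print(beta_mean(*prior), beta_mean(*posterior))
```

With these counts the posterior mean moves away from the prior mean of 0.5 toward the observed proportion; with no new observations the posterior stays at the prior.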
In this paper, we use the term "hypotheses assessment" to refer to both Popperian and Bayesian approaches, although our example hypotheses are phrased in the Bayesian fashion as it is generally more compatible with how most conservation practitioners think.
FIGURE 1 Schematic representation of criteria for weighing evidence. We can think of "weighing the evidence" as literally putting all the sources of evidence used to evaluate a given hypothesis on a balance. Reliability (aka Quality) is represented by the weight of each individual source (here categorically described as VH = Very High, H = High, M = Medium, L = Low) regardless of its conclusion. Direction refers to whether a source is placed on the positive (supports) or negative (refutes) side of the balance or in the middle (mixed). Strength refers to how far from the center point the source is put on either side (strong-weak). Relevance refers to whether the source of evidence even belongs on this particular balance for this particular hypothesis being evaluated. The Collective Weight of the Evidence Base is the net balance of all sources.
Assembling and using evidence to assess hypotheses requires a well-formulated hypothesis statement (U.S. Agency for International Development, 2018). For example, contrast the first and second hypotheses in each pair:
H1a. Seabirds are successfully nesting in Eastern Bay.
H1b. There are at least 100 breeding-pairs of ruby-crested puffins that fledged an average of at least 1 chick during each of the last 5 breeding seasons in Eastern Bay.
H2a. If ecotourists demand "green" practices, this will result in their adoption.
H2b. If more than 25% of likely ecotourists demand seabird friendly practices, most boat operators will voluntarily install rat barriers.
In both cases, the second hypothesis is better formulated because it is more specific and measurable and therefore easier to assess. This is the same principle that lies behind project management guidance that encourages practitioners to formulate specific and measurable (SMART) goals and objectives which are in effect hypotheses about change needed within a system to achieve a desired impact (CMP, 2013).
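To illustrate why a specific and measurable hypothesis is easier to assess, the sketch below checks H1b against monitoring data. It assumes one reading of H1b (every one of the last five seasons must have at least 100 breeding pairs and an average of at least one fledged chick per pair); the function name `assess_h1b`, the data layout, and the counts are hypothetical. The vaguer H1a offers no comparable thresholds to test.

```python
def assess_h1b(seasons, min_pairs=100, min_chicks_per_pair=1.0, n_seasons=5):
    """Assess H1b from (breeding_pairs, fledged_chicks) tuples, oldest first.

    Returns True or False when enough seasons are available, and None when
    there is a lack of evidence (too few seasons to assess the hypothesis).
    """
    recent = seasons[-n_seasons:]
    if len(recent) < n_seasons:
        return None  # lack of evidence, which is not the same as refuting evidence
    return all(
        pairs >= min_pairs and chicks / pairs >= min_chicks_per_pair
        for pairs, chicks in recent
    )


# Hypothetical monitoring records for Eastern Bay.
records = [(120, 130), (110, 115), (105, 110), (125, 140), (118, 120)]
print(assess_h1b(records))
```

Returning `None` for insufficient data keeps a lack of evidence distinct from negative evidence.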
There are different types of hypotheses about any given system. Some common ones include:
• Univariate hypotheses-Claims about one factor in the system:
Ia. Presence (or absence) of a factor-Factor X is present in the system. Or, Factor Y was historically present in the system.
Ib. Status (or change in status) of a factor-Factor X has Status A.
• Bi- or multivariate hypotheses-Claims about the relationship between two or more factors in the system:
IIa. Association between two (or more) factors-Factor X increases when Factor Y increases, and vice versa.
IIb. Causation between two (or more) factors-A specific change in Factor X causes a corresponding positive or negative change in Factor Y. Or, as Factor X varies, it causes a corresponding linear or nonlinear set of changes in Factor Y. Or, Factor W contributes two thirds of the change and Factor X one third of the change in Factor Y.
Note that in specifying multivariate hypotheses, it is important to take into account both interactive effects between factors and the potentially confounding effects of other factors that are not explicitly considered in the stated hypothesis. For example, a project team might find that its original hypothesis:
H3a. Rats are the primary cause of seabird nest predation
is not accurate and has to be replaced by a new expanded version:
H3b. Rats and cats are the primary causes of seabird nest predation.
Finally, we can also define:
• Specific (project) hypothesis-A proposition about a specific case situation. For example, rats are the primary cause of seabird nest predation on all Eastern Bay Islands. Or, an outreach campaign to 25% of likely ecotourists to Eastern Bay can pressure most boat operators to install rat barriers if combined with appropriate policy incentives.
• Generic hypothesis-A proposition about a generic situation that is often a composite of many specific case situations. For example, rats are a primary cause of seabird nest predation on islands. Or, outreach campaigns will change target audience attitudes and behaviors.
This distinction between specific and generic hypotheses is important because, as discussed below in more detail, conservation project teams ultimately need to assess specific hypotheses about their situation of interest, but most evidence is about analogous generic hypotheses. In medicine, education and other disciplines in which the situations of interest are more homogenous, this distinction may be less important, although in medicine, this may change with the advent of personalized medicine based on individual patient genomes.
There are many different types of hypotheses relevant to conservation practice, each requiring different types of evidence (Table 2). Some of these hypotheses may be related to understanding the situation such as the status of target species or the cause of a threat. Other hypotheses are related to the effectiveness of an action or the conditions under which a given action might be effective. A large part of the "art" of evidence-based conservation practice thus involves understanding the system well enough to figure out the right set of hypotheses to consider and the sequence in which these need to be assessed (U.S. Agency for International Development, 2018). For example, the team may need to first confirm that rats are present on the island, then that they are at least a partial cause of seabird nest predation, and finally that poisoning might be an effective action to take to remove the rats given local rainfall patterns. The shared mental models found in situation analyses and theories of change become important tools to help project teams go through this hypotheses generation process, especially in the complex and dynamic systems with many interacting factors that are typical of many conservation situations.
It is also important to note that some conservation questions may not be answerable with evidence. For example, the question of whether rats have an inherent right to be on an island is a values question that cannot be answered with data. At best, evidence might be used to establish that rats were not historically present on Eastern Bay islands, which in turn might inform the outcomes that the project team chooses to set for their work.

| Types of evidence
Assessing different types of hypotheses may require different types of evidence which can be categorized across four dimensions (Figure 1).

| Direction of effect (sign)
• Supporting (positive) evidence builds the case for a hypothesis (i.e., rejects a Popperian null hypothesis or enables staying at a Bayesian prior).
• Refuting (negative) evidence reduces the case for a hypothesis (i.e., fails to reject a Popperian null hypothesis or moves away from a Bayesian prior).
It is vital to distinguish between "negative evidence" that strongly or weakly refutes the case for a hypothesis versus a "lack of evidence" for a hypothesis one way or the other; often an assessment of "no evidence" refers to the latter (CEE, 2018). Similarly, "mixed evidence" refers to an evidence base in which there is a blend of positive and negative evidence.

| Strength of effect (magnitude)
• Strong evidence convincingly supports or refutes a hypothesis. For example, a research study that shows either a strong positive effect of poison controlling rat populations, or a strong negative effect definitively showing the poison does not work.
• Weak evidence only somewhat supports or refutes a hypothesis.
Note that we are explicitly not including reliability or relevance in this definition of relative strength.

| Reliability (quality or internal validity)
• More reliable evidence comes from a higher quality source or evidence base and thus has higher internal validity.
• Less reliable evidence comes from a lower quality source or evidence base and thus has lower internal validity.

| Relevance (external validity)
• More relevant evidence addresses the specific or generic hypothesis in question and matches key enabling conditions (parameters of the situation of interest that may affect the hypothesis such as the local rainfall patterns or the government's land-tenure policies).
• Less relevant evidence does not address the hypothesis in question and/or does not match key enabling conditions.
Three additional ways of categorizing evidence can also be helpful:

| Direct vs. circumstantial evidence
• Direct evidence sufficiently assesses a hypothesis, positively or negatively, without need for any additional evidence or inference.
• Circumstantial evidence needs to be combined with additional evidence and/or inference to fully assess a hypothesis; it helps build a case to support or refute a hypothesis.
There is a blurry line at best between sufficient circumstantial and direct evidence (L. LaRue, personal communication, August 2018).

| Specific vs. generic evidence
• Specific (project) evidence is the "local" information about a specific hypothesis about a particular situation. For example, observations that show rats are present on all islands in Eastern Bay. Or, project data collected to evaluate whether a particular outreach campaign changed tourist awareness in Eastern Bay.
• Generic (external) evidence is the information the world knows about a generic version of a hypothesis. It is often derived from consideration of specific case studies through what David Hume and Karl Popper term inductive reasoning (Hume, 1748; Popper, 1959). Popper's induction fallacy (Popper, 1959) states that generic evidence can only provide insights into the specific hypothesis that then need to be confirmed locally. For example, systematic reviews showing that poison is effective to control rat populations on small islands are only circumstantial evidence to support a specific hypothesis about rat control at a specific project site (see Table 2, Row G).
In particular, it is essential to understand how critical enabling conditions vary between various specific sites as these often determine whether generic evidence has external validity and thus can be applied locally.
There are different types of both specific (Table 2, Column 2) and generic evidence (Table 2, Column 3) that could be used to either strengthen or weaken the case for each specific hypothesis of interest. The challenge for any project team is thus to make sure they have access to and are drawing on the full complement of both project evidence and external evidence as appropriate.

| Observational vs. experimental evidence
• Observational evidence addresses a hypothesis based on an assessment of one or more real-world situations; its validity depends on the expertise, skills and reliability of the observer(s), underlying sampling design (single point, cross-sectional and/or before-after or time series), size of the sample, and statistical analysis used for statistical control.
• Experimental evidence addresses a hypothesis based on a comparison of different situations; its validity depends on the expertise, skills and reliability of the experimenter, underlying experimental design, size of the sample, and statistical analysis used. Active experiments involve artificially manipulating a situation while passive or quasi-experiments make use of naturally occurring situations (e.g., four islands, two of which have rats on them and two which do not) (Holling, 1978).
Finally, when using evidence to assess a hypothesis, there are two additional considerations:
• Burden of proof-This concept describes how certain a team needs to be about the evidence used to make the case for a hypothesis. The specific burden of proof is situational and depends on the nature of the claim being made (as David Hume (1748) first pointed out, "extraordinary claims require extraordinary evidence"), the consequences of the decision, and the relative risks of action versus inaction (see Salafsky & Redford, 2013 for more details).
• Observer bias and reliability-Much of the effort behind formal data collection and analysis techniques involves trying to identify and mitigate the uncertainty introduced by various forms of observer bias (CEE, 2018). There is an entirely different level of uncertainty that can be introduced by unreliable observers who are either incompetent or have some motivation to falsify results (e.g., introducing and then "finding" endangered species in a pond to promote its protection, or exaggerating the outcomes of a project to obtain funding). To this end, a substantial percentage of evidence introduced in criminal trials involves establishing the reliability of observers as well as the chain of custody regarding physical evidence (L. LaRue, personal communication, August 2018). For the purposes of this work, however, we generally assume that members of the project team are reasonably competent and honest.

| GUIDANCE FOR USING EVIDENCE IN CONSERVATION PRACTICE
Using evidence in conservation practice is not necessarily a straightforward task. For example, if a project team is interested in the question of whether rats are the cause of nest predation at a given project site, it is probably more valuable to have a few local observations of rat-eaten egg shells than many controlled experimental studies from the other side of the world. On the other hand, if the team is exploring a new rat eradication technique, evidence from studies around the world of this technique might be very helpful. In this section we build on the definitions and typology in the previous section to develop a decision tree to guide practitioners in how to appropriately use available evidence in a given conservation situation. We then apply this decision tree to two examples.

| Proposed decision tree for using evidence
We present a description of our proposed decision tree in the context of one type of hypothesis about assessing a potential conservation action (Figure 2). However, this decision tree could easily be modified to support any type of project hypothesis. Note that each of the decisions in this process could be made via quick high-level assessments, or more systematic and detailed calculations. The starting point for this decision tree requires a project team to have a proposed action with clear outcomes and an explicit theory of change (TOC). If there is not agreement on outcomes, then the project team needs to develop them.
Step 1 of the decision tree involves a project team developing a well-formulated specific hypothesis (or set of hypotheses) about the action in the context of the situation of interest.
Step 2 then involves reviewing all available local project evidence and determining to what degree this local evidence base supports or refutes the case for this specific hypothesis, resulting in a determination of Initial Confidence in Specific Hypothesis (Figure 3a). If the team is very confident that either the hypothesis holds for the project's conditions, or conversely, that it is unlikely to be true, then the team is done. If the project team is less confident or needs more information about their specific hypothesis, Step 3 involves ensuring that this action is sufficiently critical and urgent to warrant additional research effort; this step is a "circuit-breaker" to remind teams that not all decisions necessarily require extensive work to find external evidence.
Step 4 involves the project team compiling and using available generic evidence that assesses generic versions of their hypothesis. In some situations, there may already be existing evidence syntheses such as systematic reviews and maps (e.g., CEE, 2018), subject-wide evidence syntheses (e.g., Sutherland & Wordley, 2018), or other evidence synthesis projects (e.g., Mongabay, 2017), completed by specialists who have the skills and training to do this work while minimizing potential bias. In other situations, however, it may be necessary for the team to do its own search, assembly, screening, and weighting of available primary evidence studies. There are a range of techniques available for each of these tasks, the choice of which typically involves trading off potential bias for cost (CEE, 2018; Suter, 2016; Table S2). Weighting individual sources involves combining various criteria to determine the summary weight of each source, the application of which results in a determination of the Weight of a Given Source of generic evidence (Figure 3b). All available sources are then in turn rolled up to arrive at the Collective Weight of the Generic Evidence Base (Figure 1).
Step 4 is then completed by determining to what degree this generic evidence base supports or refutes the case for the generic hypothesis, resulting in a determination of the Overall Support for Generic Hypothesis from Generic Evidence Base (Figure 3c). If the available external evidence base clearly refutes the hypothesis, or if it is not clear, then the team is done.
Finally, in cases where the external evidence base convincingly or potentially supports the generic hypothesis, Step 5 involves determining whether the cases in the generic evidence base are sufficiently similar to the local project that their evidence can inform the specific project hypothesis or, more technically, whether there is external validity (CEE, 2018). This external validation is often done qualitatively. As CEE (2018) states, "appraisal of study relevance can be a more subjective exercise than appraisal of study reliability." This results in a determination of Final Confidence in Specific Hypothesis which can then be translated into a recommendation of what conservation action to take (Figure 3d).
As shown in the far right-hand side of Figure 2, if the team is "Very confident" that the available evidence base supports their hypothesis, they can implement the action at scale and only monitor implementation. If the team is "Confident, but" not completely sure, then they can implement the action at scale, but should probably invest a bit more in monitoring effectiveness. If the team members "Need more info" they should consider alternative actions to achieve their desired outcome, but if none exist, they may wish to pilot this action using an adaptive management approach, especially if the conservation situation urgently demands action. If the situation is not urgent, the team could also wait for additional external evidence to be generated by other projects. Finally, if the team determines the hypothesis is "Unlikely true" then they should consider alternative actions and if no better candidates exist, they should probably triage this work.
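The steps and recommendations described above can be sketched as two small functions. This is an illustrative reading of the text, not a reproduction of Figures 2 and 3: the label strings, the function names `assess_action` and `recommend`, and in particular the mappings from Step 4 and Step 5 outcomes to a final confidence level are assumptions, since the text defers those details to the figures.

```python
VERY, BUT, MORE, UNLIKELY = (
    "very confident", "confident, but", "need more info", "unlikely true")


def assess_action(initial, worth_more_research, generic_support=None, similar_cases=None):
    """Walk Steps 2-5 and return a final confidence label.

    initial: confidence after reviewing local project evidence (Step 2)
    worth_more_research: is the action critical and urgent enough? (Step 3)
    generic_support: "convincing", "potential", "unclear", or "refutes" (Step 4)
    similar_cases: are the generic cases similar enough to the project? (Step 5)
    """
    if initial in (VERY, UNLIKELY):          # Step 2: local evidence settles it
        return initial
    if not worth_more_research:              # Step 3: the "circuit-breaker"
        return initial
    if generic_support == "refutes":         # Step 4 (assumed mapping)
        return UNLIKELY
    if generic_support in (None, "unclear"): # Step 4 (assumed mapping)
        return MORE
    if not similar_cases:                    # Step 5: no external validity (assumed)
        return MORE
    return VERY if generic_support == "convincing" else BUT


def recommend(confidence, alternatives_exist=False, urgent=False):
    """Translate a final confidence label into the action described in the text."""
    if confidence == VERY:
        return "implement at scale; monitor implementation"
    if confidence == BUT:
        return "implement at scale; monitor effectiveness"
    if confidence == MORE:
        if alternatives_exist:
            return "consider alternative actions"
        return "pilot with adaptive management" if urgent else "wait for more external evidence"
    if confidence == UNLIKELY:
        return "consider alternative actions" if alternatives_exist else "triage"
    raise ValueError(f"unknown confidence label: {confidence!r}")


final = assess_action(MORE, True, generic_support="potential", similar_cases=True)
print(final, "->", recommend(final))
```

Keeping the assessment and the recommendation separate mirrors the text, where the final confidence is determined first and only then translated into an action.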
FIGURE 2 Decision tree for using evidence in assessing a potential conservation action. This decision tree helps guide practitioners in using evidence to assess a specific hypothesis about a potential conservation action (Row G in Table 2). This decision tree could easily be adapted to apply to other types of conservation hypotheses as well, such as the status of a threat or target or the assumed causal relationship between two factors in a conceptual model. TOC = theory of change; AM = adaptive management approach. See text for description.
FIGURE 3 Guides in support of steps in decision tree. (a) Determination of project evidence support for a specific hypothesis. This chart combines the type of project evidence available with the degree that this project evidence base collectively supports or refutes the case for the specific hypothesis, to arrive at the Initial Confidence in Specific Hypothesis. (b) Weighting a source of evidence. This chart contains independent criteria that can potentially be used to weight the relative importance of a given source of evidence. Generic relevance ensures that the source applies ...

4.2 | Examples of using this decision tree

4.2.1 | Example 1. Sufficient local evidence to take immediate action
The Eastern Bay project team wants to know if rats have recolonized one of the islands in the Bay:
Step 1. The team formulates a specific hypothesis: Rats are present on the island.
Step 2. The team reviews available project evidence: Fresh rat droppings have been sighted on the island. The team is thus "very confident" that this local evidence supports their specific hypothesis.
End. The team thus goes directly to taking action to deal with the rat recolonization.

4.2.2 | Example 2. Mixed generic evidence
One strategy that the Eastern Bay project team is considering involves empowering local women to help in marine resource management:
Step 1. The team formulates a specific hypothesis: Promoting women's involvement in marine resource management councils will lead to more sustainable resource management.
Step 2. The team reviews available local project evidence: To date, women have not been involved in marine resource management councils. The team concludes they "Need more info."
Step 3. The team determines this is a critical hypothesis for their project.
Step 4. The team reviews the literature. They find a systematic evidence map (Leisher et al., 2016) that concludes "For India and Nepal, there is strong and clear evidence of the importance of including women in forest management groups for better resource governance and conservation outcomes. Outside of India and Nepal, there are substantial gaps in the evidence base..." The team determines using Figure 3c that this is a "High" rating for the collective weight of the generic evidence base crossed with "Strongly supports (+ +)" generic hypothesis, which leads them to a rating of "Potentially supports (+)" generic hypothesis.
Step 5. The team, using Figure 3d, concludes that because their project takes place in a very different cultural context and in a marine rather than forest setting, there is "less relevance" of the generic evidence to their project conditions. The team members thus still "need more info" in terms of their final confidence in their specific hypothesis.
End. The team thus decides to pilot this action using an adaptive management approach and to share their findings with other similar projects in the region to see if they can collectively develop enough evidence to fill the hole in the evidence base.

5 | DISCUSSION
5.1 | Incorporating use of evidence into conservation projects
As described above, conservation occurs as projects at all scales go through an iterative management process, supported by various planning and decision-support frameworks (CMP, 2013; Cook et al., 2016; Schwartz et al., 2017). Although many of these frameworks at least implicitly support evidence-based practice, there are several steps that can be taken to more explicitly incorporate evidence. There are a number of places in the conservation process in which evidence can be used to inform conservation decisions (Table 2), including:
• Informing stakeholder determination of appropriate project scope and targets;
• Determining the presence/absence or the status of key factors such as conservation targets, biophysical factors, direct threats, and contributing factors in a situation analysis;
• Determining the associative or causal relationships between key factors in a situation analysis or TOC including interactive and confounding factors;
• Setting desired outcomes for key factors such as targets or threat reduction results; and
• Deciding which action or set of actions to invest in to achieve desired outcomes.
A large part of the "art" of evidence-based conservation involves the project team understanding the complexity of their system of interest well enough to determine the right set of specific and generic hypotheses to consider, as well as the right sequence and the level of detail to which they need to be assessed. Once these key hypotheses have been determined, the project team needs to explicitly or implicitly follow a decision tree (e.g., Figure 2) in order to appropriately assemble and use project and/or generic evidence to support their decisions. The key is not to be paralyzed by a lack of evidence, but rather to use evidence where it is needed and available, and then to make sure to document the type of evidence used to make the decision as well as the source of that evidence. It is perfectly acceptable under an adaptive management approach to make a decision based on "a rough guess" or "expert knowledge" as long as it is clear that this is how the team arrived at this decision. And of course, it is also important for the project team to contribute their findings to the broader global evidence base.
FIGURE 3 (continued) ... to the generic hypothesis of interest. Reliability speaks to the quality of the source. We explicitly do not include strength (magnitude of effect) as a criterion in this chart because this dimension is added to the analysis in Figure 3c. The Weight of a Given Source could be a high-level estimated integration of the criteria across the chart, or it could be calculated either as a (weighted) average of numerical scores assigned to each criterion or via a rule-based algorithm. (c) Determination of generic evidence base support for a generic hypothesis. This chart combines the collective weight of the generic evidence base with the direction and strength of its collective support for the generic hypothesis to arrive at the overall Support for Generic Hypothesis from the Generic Evidence Base. The collective weight of an evidence base is determined by the weight of its component sources (per Figure 3b) and one's confidence in the levels of bias introduced by the search and eligibility screening protocols employed to assemble the evidence base (CEE, 2018). This collective weight could be a high-level estimate, or it could be more systematically calculated. The direction and strength of support for the hypothesis is a weighted assessment of the distribution of positive, mixed and negative sources in the evidence base (per Figure 1). (d) Determination of final confidence in specific hypothesis. This chart combines the rating emerging from Figure 3c with a determination of the relevance of the sources in the generic evidence base to the specific project hypothesis and key enabling conditions (e.g., Was the poison tested with this rat species? In similar wet conditions?). It results in a rating of the Final Confidence in Specific Hypothesis.
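The Figure 3b caption notes that the Weight of a Given Source could be calculated as a (weighted) average of numerical scores assigned to each criterion. A minimal sketch of that calculation follows. The function name `source_weight`, the 1 to 3 scoring scale, and the example criterion weights are assumptions for illustration only; the paper does not prescribe a scale or specific weights.

```python
def source_weight(scores, weights=None):
    """Return the weighted average of per-criterion scores for one source.

    scores  -- dict mapping criterion name to a numeric score
               (a 1-3 scale is assumed here purely for illustration)
    weights -- optional dict mapping criterion name to its relative
               importance; all criteria count equally when omitted
    """
    if not scores:
        raise ValueError("at least one criterion score is required")
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("criterion weights must sum to a positive number")
    return sum(scores[name] * weights[name] for name in scores) / total


# Hypothetical scores for a single source on two criteria named in the caption.
example_scores = {"generic_relevance": 3, "reliability": 2}

# Equal weighting of the two criteria.
unweighted = source_weight(example_scores)

# Hypothetical weighting in which reliability counts three times as much.
weighted = source_weight(example_scores, {"generic_relevance": 1, "reliability": 3})
```

A team that prefers the rule-based option mentioned in the caption could replace the average with explicit rules (for example, capping the result when reliability is low); the averaging form is shown only because it is the simpler of the two to state.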
To enable evidence-based conservation, it is essential to build it into the frameworks and tools that conservation practitioners use. As one example, the CMP will build basic guidance into the next version of the Open Standards for the Practice of Conservation that more explicitly defines and supports evidence-based practice, including the need to develop specific hypotheses, assess them as appropriate, and document the sources of evidence. Support for evidence-based practice can also be built into key tools used to implement conservation projects; for example, Figure S4 shows recently developed features of Miradi Software that support evidence-based conservation.

5.2 | Incorporating evidence use and generation into conservation as a discipline
In relatively mature disciplines such as medicine or education, there has been a great deal of progress in incorporating evidence into professional practice. These disciplines have many factors supporting evidence-based practice, including relatively well-defined case situations such as treating a disease in patients or improving student math skills, generally accepted and quantifiable metrics of success, a vast cadre of clinical researchers who focus on studying the effectiveness of clinical practice, and the cultural, political, and financial support to make evidence-based practice happen. Even with these advantages, however, adoption of evidence-based practice is far from the universal norm. In conservation, by contrast, we are still trying to work out basic understandings of complex and messy case situations, develop appropriate metrics of success, build the capacity of clinical researchers, and raise the needed support to make all of this happen. Given these challenges, it is perhaps not surprising that evidence-based practice is currently less developed in conservation.
That said, conservation also has a great advantage in that we can learn from the experiences of other disciplines to systematically develop an approach to evidence-based practice that makes sense in the complex and messy context of conservation projects. Ultimately, if conservation as a discipline is going to become more evidence-based, then we collectively need to improve how evidence is generated, accessed, and ultimately used by practitioners along a shared theory of change (Figure 4). Our collective challenge going forward will be to implement this theory of change.
FIGURE 4 High-level theory of change for promoting evidence-based conservation (EBC) across the discipline. The three large boxes in the center of the chain represent how evidence is generated, distributed and used to improve conservation projects and programs. The large box at the bottom of the diagram contains the enabling conditions that have to be in place to promote evidence-based practice. As shown by the large box in the lower right corner, there is also an inherent assumption that if we can collectively use evidence to show enhanced effectiveness, we will be able to attract increased support for conservation from society. Finally, the hexagons represent high level strategies that key actors could collectively undertake to help implement this chain. Legend: Ultimate Outcome; Intermediate Outcome; Intermediate Result; Action (Strategy); Blue Text = Enabling Condition; Green Text = Action by Others.