Matching expert range maps with species distribution model predictions

Abstract Species’ range maps based on expert opinion are a critical resource for conservation planning. Expert maps are usually accompanied by species descriptions that specify sources of internal range heterogeneity, such as habitat associations, but these are rarely considered when using expert maps for analyses. We developed a quantitative metric (expert score) to evaluate the agreement between an expert map and a habitat probability surface obtained from a species distribution model. This method rewards both the avoidance of unsuitable sites and the inclusion of suitable sites in the expert map. We obtained expert maps of 330 butterfly species from each of 2 widely used North American sources (Glassberg [1999, 2001] and Scott [1986]) and computed species‐wise expert scores for each. Overall, the Glassberg maps secured higher expert scores than Scott (0.61 and 0.41, respectively) due to the specific rules (e.g., Glassberg only included regions where the species was known to reproduce whereas Scott included all areas a species expanded to each year) they used to include or exclude areas from ranges. The predictive performance of expert maps was almost always hampered by the inclusion of unsuitable sites, rather than by exclusion of suitable sites (deviance outside of expert maps was extremely low). Map topology was the primary predictor of expert performance rather than any factor related to species characteristics such as mobility. Given the heterogeneity and discontinuity of suitable landscapes, expert maps drawn with more detail are more likely to agree with species distribution models and thus minimize both commission and omission errors.


Introduction
Data on species' distributions are critical to conservation planning, predicting responses to climate change, and public health (Parmesan 1996, 2006; Dawson et al. 2011; Mainali et al. 2015; Merow et al. 2017). Historically, a key source of such distributional data has been expert-drawn range maps, which set boundaries on species' likely occurrences. Expert maps, which are developed for individual species based on a combination of distribution data and the collected experience and knowledge of naturalists, scholars, and others, delineate the geographical region in which a species is believed to occur (Hurlbert & Jetz 2007; Merow et al. 2017). As such, expert maps, which are available for thousands of species across diverse taxa, predict the binary state of species distributions as occupied or unoccupied, usually with a fairly coarse grain. The utility of these maps ultimately depends on accuracy, but defining accuracy is difficult because the true distribution of a species cannot be known. When drawing maps, experts generally delineate a single region that includes the entire species range. This is done because any occupied areas falling outside of the delineated range clearly diminish map credibility (Hurlbert & White 2005; Hurlbert & Jetz 2007; Merow et al. 2017). Because of this emphasis on avoiding omission errors (false negatives), expert maps appear particularly good for delineating range edges beyond which a species is unlikely to occur (Jetz et al. 2012; Domisch et al. 2016). For birds, the boundaries of expert maps were reasonably accurate at 100-200 km resolution (Hurlbert & White 2005; Hurlbert & Jetz 2007; Merow et al. 2017), but predicted many false presences at finer resolutions (Hurlbert & White 2005; Hurlbert & Jetz 2007).
Expert maps are often used to identify biodiversity hotspots for conservation (Hurlbert & Jetz 2007) or estimate species richness (Hurlbert & White 2005). In such cases, multiple expert maps are stacked to obtain multispecies measures. Yet, expert maps have traditionally been developed specifically to accompany individual species accounts. Indeed, most expert-drawn maps are supplemented by written species accounts that include ecological trait data such as habitat or elevational associations. This information, which is not included in the maps, makes it possible for readers to infer the internal heterogeneity of plotted species ranges.
Species distribution models (SDMs) provide an alternative to expert maps. Typically, SDMs are based on occurrence data and then interpolated and extrapolated to account for areas in which species may not have been seen, but are likely to occur due to a combination of environmental variables that correlate with known occurrences (Guisan & Thuiller 2005;Elith & Leathwick 2009). An SDM usually features fine spatial resolution and therefore captures fine-grained heterogeneity of species ranges in ways that expert maps rarely do. Recent efforts leverage both expert maps and point data to improve SDMs (Domisch et al. 2016;Merow et al. 2017).
Because much conservation activity hinges on species ranges, metrics must be developed for assessing the accuracy of maps, whether expert-drawn or implemented from statistical models. One way to judge the accuracy of expert maps is to compare them with each other, but we are not aware of any researchers who have done this. One challenge to this approach is that with no independent reason to prefer one over the other it becomes difficult to ascertain what conclusions to draw about the relative value of competing maps. A neutral benchmark provides a means of determining map quality. Although SDMs have several shortcomings (Kramer-Schadt et al. 2013;Yackulic et al. 2013;Gomes et al. 2018), they are transparent, repeatable, and optimizable for particular purposes (e.g., by adjusting weights on commission vs. omission errors). We developed a metric that compares competing expert maps against carefully trained SDMs. Although we used SDMs as a benchmark for comparison, we do not suggest they represent truth. Instead, we devised an analytical way to use these SDMs as a neutral arbiter to evaluate range information when multiple expert-drawn maps are available and a framework for understanding and judging map value. We explored this analytical framework with North-American butterfly maps.

Map Development
We obtained digitized expert ranges of American butterfly species from 2 sources. James Scott (1986) published range maps after studying butterflies for 25 years. He reviewed several hundred references and consulted with over 100 experts. In drawing these ranges, Scott reported summer (where adult butterflies may be seen, even if they do not reproduce) and winter ranges (where the species are known to overwinter). The union of a species' summer and winter ranges constituted its expert range here. We obtained digitized ranges for 541 species from Scott (1986) via the Map of Life project. Hereafter, this set of expert maps is referred to as Scott.
The second expert source comprised maps originally published in two books by Jeffrey Glassberg (1999, 2001) and subsequently updated by Glassberg. Based on published and unpublished literature, Glassberg created draft ranges by including places where butterflies regularly fly and produce at least 1 brood before dying back. Glassberg mapped the regions with 1, 2, or >2 broods. Thus, unlike in Scott, areas into which a species expanded its summer range but in which it did not reproduce were not included in the range map; they were instead denoted as strays and excluded from our analysis. The union of all these brood regions constituted the species range and represented areas where a species was known to reproduce, even if it did not overwinter. These draft ranges were then reviewed by 85 experts before Glassberg finalized the ranges. In 2014, we obtained digitized expert ranges of 659 butterfly species from this source, hereafter referred to as Glassberg.
To provide a baseline for our analyses of expert maps, we sought occurrence records for American butterfly species and located 478,200 occurrence records from the North American Butterfly Association, 46,904 records from Butterflies and Moths of North America, and 137,431 records from the Global Biodiversity Information Facility. To define the study area for each species, we first created an alpha hull (α = 8°), a generalization of the convex hull (Burgman & Fox 2003), around a species' occurrence records to eliminate highly distant, isolated records. Second, we created a convex hull representing the spatial union of the alpha hull based on species occurrences and the two expert maps for that species. Third, we clipped the resulting convex hull by the boundary of the land mass of the contiguous United States to obtain the study area (denoted S) of the species. Defined in this way, the study area for a species represented the spatial union of all relevant data sets.
We developed SDMs for each butterfly species with the inhomogeneous Poisson point process (IPP) distribution approach (e.g., Warton & Shepherd 2010). Ultimately, we analyzed ranges for 330 species, after excluding species based on how much of their range was included in our main study area or because of conflicting taxonomy. We also ran the analysis separately for species whose ranges were 100% within our primary study area. See Supporting Information for details and rationale for SDM method and which species were included in analyses.

Measuring Agreement Among Maps
We assumed each expert map was generated from a binary process, such that, within any subregion of the study area, the species was either present (y = 1) or absent (y = 0). To determine how closely the expert maps agreed with the SDM predictions, we relied on the notion of a scoring rule for evaluating the accuracy of predictions of binary events (Table 1 in Gneiting & Raftery 2007). Because the IPP implies a Bernoulli distribution for occurrences, we used a proper scoring rule to appropriately match our SDM with the procedures used to evaluate the predictive accuracy of the expert maps (Gneiting & Raftery 2007; Cressie & Wikle 2015; Hefley & Hooten 2016). For a scoring rule, we used

$$l(y_j, p_j) = -2\,[\,y_j \log(p_j) + (1 - y_j)\log(1 - p_j)\,], \qquad (1)$$

where $y_j$ is equal to 1 if a map indicates that the $j$th small geographic area $A_j$ is occupied and 0 if the map indicates $A_j$ is unoccupied, and $p_j$ is the probability of occurrence in $A_j$. For a given species, the overall score for the study area is calculated by summing each $l(y_j, p_j)$ across $J$ nonoverlapping $A_j$'s that fully partition the study area $S$. The expert's score is calculated as $\sum_{j=1}^{J} l(y_j, p_j)$ for the set $S = \cup_{j=1}^{J} A_j$. Other types of scoring rules could be used instead of Eq. 1. For example, area under the curve (AUC) is widely used in ecology, but this scoring rule is not proper (i.e., it is possible to find better AUC scores when the estimated probabilities of occurrence differ from the true probabilities of occurrence) (Byrne 2016). Proper scoring rules like Eq. 1 ensure that the best value of the scoring rule is achieved when the estimated probabilities of occurrence match the true values (Byrne 2016).

Conservation Biology
Volume 34, No. 5, 2020

Table 1. Potential predictors of expert score, a quantitative metric developed to evaluate the agreement between an expert map and a habitat probability surface obtained from a species distribution model, and predictors of expert agreement, the fraction of the union of 2 expert maps that is common.

Predictor and measure: Definition and level of categorical variable

Attributes of expert opinion maps
  Polsby-Popper index (range 0 to 1): Ratio of the area of the map to the area of a circle whose circumference is equal to the perimeter of the map (Cox 1927)
  Convex hull score (range 0 to 1): Ratio of the area of a map to the area of the minimum convex polygon that encloses the map
  Detailed edge (range 0 to ∞): Ratio of the area of a map to its edge length
Attributes of occurrence-based maps
  Number of occurrence points: Total number of occurrence records in the study area (as defined in Methods)
  Moran's I (range -1 to 1): Spatial heterogeneity of occurrences in the study area; score of the map is 1 for perfect clustering of similar values, 0 for perfect randomness, and -1 for perfect clustering of dissimilar values
  Average density of occurrence points (count/10,000 km²)
Butterfly life-history traits
  Mobility: Local, migratory, or mass migration
  Habitat breadth: Generalist, associated with many specific ecotypes (e.g., fields, meadows, prairies, and pastures), although they may have particular canopy requirements (e.g., no canopy rather than closed forest); specialist, associated with specific ecotypes (e.g., tall-grass prairies); narrow, narrow but not specialized habitat associations
  Host plant
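The three map-geometry measures defined in Table 1 can be illustrated with a minimal Python sketch. It assumes a range map given as a single simple polygon in planar (projected) coordinates; the function names and the toy polygons are hypothetical and are not taken from the expertscore R package.

```python
import math

def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0

def polygon_perimeter(vertices):
    """Total edge length of a closed polygon."""
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))

def convex_hull(points):
    """Minimum convex polygon (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def map_geometry(vertices):
    """Polsby-Popper index, convex hull score, and detailed edge of a map."""
    area = polygon_area(vertices)
    perimeter = polygon_perimeter(vertices)
    hull_area = polygon_area(convex_hull(vertices))
    return {
        # area of map / area of circle with circumference equal to perimeter
        "polsby_popper": 4.0 * math.pi * area / perimeter ** 2,
        # area of map / area of enclosing minimum convex polygon
        "convex_hull_score": area / hull_area,
        # area of map / edge length
        "detailed_edge": area / perimeter,
    }

# Toy maps (hypothetical): a compact square and an L-shaped range with a concavity
square = map_geometry([(0, 0), (1, 0), (1, 1), (0, 1)])
l_shape = map_geometry([(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)])
```

In this toy comparison the L-shaped map has a lower convex hull score and a lower Polsby-Popper index than the square, reflecting its concavity. Real range maps with multiple parts or geographic coordinates would need a GIS library and an equal-area projection.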
We used the results of the SDM as an independent arbiter of truth without assuming that it necessarily outperforms either expert map. We restricted our models to regions in which occurrence is highly probable, and this greatly lessened the typical SDM problem of finding the true area of occurrence.
Although useful, the score produced by Eq. 1 depends on how the $A_j$'s are chosen to partition the study area $S$. That is, the map's score, $\sum_{j=1}^{J} l(y_j, p_j)$, depends on grid resolution, which is arbitrary. We set the grid resolution at 10 arc minutes. Because the SDM provides predictions in continuous geographic space, we needed to define the continuous analog of the occupancy probability as

$$p = 1 - e^{-\bar{\lambda}}, \qquad (2)$$

where $\bar{\lambda} = \int_{A} \lambda(s)\,ds$, $s$ is a vector of coordinates, and $\lambda(s)$ is an IPP estimated intensity function, the log of which is specified as a linear combination of location-specific covariates of species distribution (Supporting Information). Likewise, a map is a binary process that exists in continuous geographic space, such that at any point, the map indicates that the species is present, $y(s) = 1$, or absent, $y(s) = 0$. With spatially continuous specifications of $y(s)$ and $p(s)$, the map's score is calculated as

$$E\{l[y(s), p(s)]\} = \int_{S} l[y(s), p(s)]\,ds, \qquad (3)$$

which we refer to as the deviance. This deviance measures how close each expert's map is to the intensity function from the SDM built from the occurrence records for that species. Because the measure in Eq. 3 is defined in continuous geographic space, it is no longer sensitive to an arbitrary choice of how the study area $S$ is partitioned. In practice, the integral in Eq. 3 is evaluated using a numerical quadrature approximation (Givens & Hoeting 2012). Furthermore, $p(s)$ is a function of $\lambda(s)$, which must be estimated from the occurrence records. Thus, the deviance is estimated by plugging in the estimated value $\hat{\lambda}(s)$ in place of $\lambda(s)$.
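The link between a log-linear intensity, its integral over an area, and an occupancy probability can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the standard Poisson-process result that the probability of at least one occurrence in an area equals 1 minus the exponential of the negative integrated intensity, it uses a simple midpoint rule in place of whatever quadrature scheme was actually used, and the names `intensity`, `integrated_intensity`, and `occupancy_probability` are hypothetical.

```python
import math

def intensity(s, beta, covariates):
    """Log-linear intensity: log(lambda(s)) is a linear combination of covariates."""
    x = covariates(s)
    return math.exp(sum(b * xi for b, xi in zip(beta, x)))

def integrated_intensity(bounds, beta, covariates, n=50):
    """Midpoint-rule quadrature of lambda(s) over a rectangular area."""
    (x0, x1), (y0, y1) = bounds
    dx = (x1 - x0) / n
    dy = (y1 - y0) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            s = (x0 + (i + 0.5) * dx, y0 + (j + 0.5) * dy)
            total += intensity(s, beta, covariates) * dx * dy
    return total

def occupancy_probability(lam_bar):
    """Assumed Poisson-process occupancy probability: 1 - exp(-lam_bar)."""
    return -math.expm1(-lam_bar)

# Toy case (hypothetical): intercept-only model with constant intensity 2 on a unit square
lam_bar = integrated_intensity(((0.0, 1.0), (0.0, 1.0)), [math.log(2.0)], lambda s: [1.0], n=10)
p_occ = occupancy_probability(lam_bar)
```

With a constant intensity the quadrature is exact up to rounding, which makes the toy case a convenient check; spatially varying covariates would call for a finer grid or an adaptive rule.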
Because the deviance in Eq. 3 is a relative measure, we needed a null map to characterize the predictive value of expert maps. We defined the null map for a species as that which covers the entire study area $S$ (Fig. 1a). We propose a deviance-explained metric as

$$\text{expert score} = 1 - \frac{E\{l[y(s), p(s)]\}}{E\{l[z(s), p(s)]\}}, \qquad (4)$$

where $E\{l[y(s), p(s)]\}$ is the deviance given the expert's map (i.e., $y(s)$) and $E\{l[z(s), p(s)]\}$ is the deviance given the null map (i.e., $z(s)$). The expert score has an interpretation similar to the familiar coefficient of determination from simple linear regression or the more general pseudocoefficient of determination for generalized linear models. For example, when the expert score equals 0, the expert map has predictive accuracy equal to that of the null map. A higher expert score indicates a closer match of the expert map with highly suitable sites. Expert score can be negative when an expert map has less predictive accuracy than the null map. The deviance in Eq. 3 can be decomposed to represent the contribution to the deviance score due to omission and commission errors. More specifically, for any expert map, the study area $S$ can be partitioned into two disjoint sets, $S = I \cup O$, where $I$ is the area inside the expert's map and $O$ is the area outside the expert's map but within $S$. With this partition, the integral in Eq. 3 is

$$\int_{S} l[y(s), p(s)]\,ds = \int_{I} l[y(s), p(s)]\,ds + \int_{O} l[y(s), p(s)]\,ds. \qquad (5)$$

The first term on the right side of Eq. 5 is the deviance inside the expert map, which, when scaled by the null deviance, quantifies commission error (error of predicting unsuitable landscape as part of the expert map). Similarly, the second term on the right side of Eq. 5 is the deviance outside the expert map, which, when scaled by the null deviance, quantifies omission error (error of predicting suitable landscape outside the expert map). Thus, the decomposition in Eq. 5 results in a natural interpretation of the expert score as [1 - (scaled deviance inside + scaled deviance outside)].
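A gridded approximation of the expert score and its inside and outside components can be sketched in Python. The sketch assumes the per-cell loss is the Bernoulli deviance (logarithmic score), a proper scoring rule for binary events, and replaces the continuous integral with a weighted sum over cells; the function names, the clipping constant `eps`, and the toy probabilities are illustrative assumptions rather than the expertscore package's interface.

```python
import math

def bernoulli_deviance(y, p):
    """Assumed per-cell loss: -2 * Bernoulli log-likelihood of y given p."""
    return -2.0 * (y * math.log(p) + (1 - y) * math.log(1.0 - p))

def expert_score(expert_map, prob, areas=None, eps=1e-9):
    """Deviance-explained score of a binary map against SDM probabilities.

    expert_map: 0/1 values per grid cell (1 = inside the expert range)
    prob:       SDM occurrence probability per grid cell
    areas:      optional cell areas used as quadrature weights
    """
    if areas is None:
        areas = [1.0] * len(expert_map)
    # Keep probabilities away from 0 and 1 so the logarithms stay finite
    prob = [min(max(p, eps), 1.0 - eps) for p in prob]

    # Null map: every cell of the study area is treated as occupied
    null = sum(w * bernoulli_deviance(1, p) for w, p in zip(areas, prob))
    inside = sum(
        w * bernoulli_deviance(1, p)
        for w, y, p in zip(areas, expert_map, prob) if y == 1
    )
    outside = sum(
        w * bernoulli_deviance(0, p)
        for w, y, p in zip(areas, expert_map, prob) if y == 0
    )
    return {
        "score": 1.0 - (inside + outside) / null,
        "scaled_inside": inside / null,    # commission component
        "scaled_outside": outside / null,  # omission component
    }

# Toy study area of four cells (hypothetical probabilities)
prob = [0.9, 0.8, 0.1, 0.05]
good = expert_score([1, 1, 0, 0], prob)   # map hugs the suitable cells
null = expert_score([1, 1, 1, 1], prob)   # map covers the whole study area
poor = expert_score([0, 0, 1, 1], prob)   # map covers only unsuitable cells
```

A map identical to the null map scores 0 by construction, a map that covers only unsuitable cells scores below 0, and the two scaled components always sum to 1 minus the score.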
Glassberg and Scott used different ecological phenomena to define their ranges. Therefore, we did not expect them to match exactly even if they both could be assumed to represent truth as they defined it. Nevertheless, we believed it was useful to directly compare the overlap in the two maps, especially because expert maps are often used without regard to the specific rules used to generate them. For each species, we quantified agreement between the two expert maps as

$$\text{expert agreement} = \frac{\text{area of the intersection}}{\text{area of the union}}. \qquad (6)$$

This measure scales from 0 (complete disagreement between corresponding expert maps) to 1 (complete agreement). Next, we used three groups of explanatory variables to predict expert score (defined in Eq. 4) in each of the 2 sources of expert maps and to predict expert agreement between the 2 expert maps. Specifically, we considered expert map geometry (3 predictors), occurrence records geometry (3 predictors), and life-history and ecological traits of butterfly species (11 predictors) (details in Table 1). Predictors related to expert map attributes and to occurrence records were calculated directly from the expert maps and the occurrence data, respectively. Predictors related to life-history and ecological traits were compiled from Scott (1986), Opler and Malikul (1992), Bird (1995), Glassberg (1999), Opler (1999), Daniels (2003), and Bouseman et al. (2006).
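For rasterized maps, the agreement measure in Eq. 6 reduces to a ratio of cell counts. The Python sketch below assumes both maps have been rasterized to the same grid of equal-area cells identified by hashable cell IDs; the function name and the toy cell sets are hypothetical.

```python
def expert_agreement(cells_a, cells_b):
    """Area of intersection divided by area of union, for equal-area cells."""
    a, b = set(cells_a), set(cells_b)
    union = a | b
    if not union:
        raise ValueError("both maps are empty; agreement is undefined")
    return len(a & b) / len(union)

# Toy maps (hypothetical): one range covers four cells, the other covers three
range_a = {1, 2, 3, 4}
range_b = {3, 4, 5}
agreement = expert_agreement(range_a, range_b)
identical = expert_agreement(range_a, range_a)
disjoint = expert_agreement({1, 2}, {3, 4})
```

On a grid defined in arc minutes the cells are not equal in area, so each cell would need to be weighted by its area instead of being counted once.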
We used multimodel inference to quantify the utility of these three groups of variables as significant predictors of expert score and, separately, of expert agreement. For each of the 3 groups of predictors, we exhaustively examined the performance of all possible main effects models built from the combinations of predictors in that group. For example, we had 7 main effects models for expert map geometry (three models with individual predictors plus three bivariate models plus a model with all three predictors). Each of these models was assigned a probability based on the Akaike information criterion corrected for small sample sizes (AICc), such that the summed probability of the models was 1 (Buckland et al. 1997; Calcagno & de Mazancourt 2010). A predictor was deemed important if the sum of the probabilities of all models that included that predictor was ≥0.8 (Calcagno & de Mazancourt 2010). We then created a linear model with the important predictors from the map geometry group.
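The weighting step can be sketched in Python under the assumption that model probabilities are the usual Akaike weights, exp(-ΔAICc/2) normalized to sum to 1. The AICc values and predictor labels below are invented for illustration only; the analysis itself was run in R (the package list includes glmulti).

```python
import math

def n_main_effects_models(k):
    """Number of nonempty subsets of k predictors."""
    return 2 ** k - 1

def akaike_weights(aicc):
    """Model probabilities from AICc values (assumed Akaike-weight form)."""
    best = min(aicc.values())
    relative = {m: math.exp(-0.5 * (v - best)) for m, v in aicc.items()}
    total = sum(relative.values())
    return {m: r / total for m, r in relative.items()}

def predictor_importance(weights):
    """Sum of the weights of all models that include each predictor."""
    importance = {}
    for model, w in weights.items():
        for predictor in model:
            importance[predictor] = importance.get(predictor, 0.0) + w
    return importance

# Hypothetical AICc values for the 7 main effects models of 3 map-geometry predictors
aicc = {
    ("hull",): 100.0,
    ("edge",): 110.0,
    ("pp",): 120.0,
    ("hull", "edge"): 95.0,
    ("hull", "pp"): 101.0,
    ("edge", "pp"): 111.0,
    ("hull", "edge", "pp"): 96.5,
}
weights = akaike_weights(aicc)
importance = predictor_importance(weights)
important = sorted(p for p, w in importance.items() if w >= 0.8)
```

The count 2^k - 1 reproduces the 7 models for a group of 3 predictors and the 2047 models for the 11 life-history and ecological traits.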
We repeated the method for the groups of predictors related to occurrence records (7 models) and life-history and ecological traits (2047 models) and identified important predictors in each of these groups. Eventually, we created models with important predictors from more than one group. For each of these models, we report goodness of fit as adjusted R². Data analyses and plotting were performed in R x64 3.5.1 (R Project for Statistical Computing) with the following libraries: gbm (for the main analysis of SDM), raster, maptools, maps, rgdal, geosphere, rgeos, scales, alphahull, sp, rgbif, plyr, mandeR, spatialEco, Hmisc, glmulti, magrittr, DT, htmlwidgets, ggplot2, officer, flextable, jtools, venneuler, and cvAUC. We developed an R package called expertscore for computing the metric we developed; the library can be downloaded from https://github.com/kpmainali/expertscore.
Caption fragment: ... Fig. 2a,b). Overall model fits, which draw on all groups of predictors, are in Fig. 2a.

Distributional Data and Species Distribution Models
Spatial overlap between the two sets of expert maps and between each set of expert maps and the occurrence records varied by species (Fig. 1a). Modeling the distribution of a species based on its occurrence records and environmental covariates yielded a probability surface for the occurrence of that species in its specific study area (Fig. 1b illustrates this process for one species). Usually, these surfaces included very low probabilities toward the edges of the study area, but exceptions occurred when land and water boundaries truncated the range or when occurrence records were clustered near an artificial (geopolitical) truncation boundary. Species distribution models, when evaluated by AUC score, performed very well. Following the rule of thumb that an AUC score of 0.8-0.9 implies a good model and >0.9 implies an excellent model (Araújo et al. 2005), >98% of species scored excellent and the rest scored good.

Expert Scores and Predictors
When considering all species, expert score for Glassberg was substantially higher than for Scott (median = 0.61 vs. 0.41, respectively) (Fig. 1c). On a pairwise basis, expert score for Glassberg exceeded that for Scott for 86% of species (Fig. 1d). Both scores decreased with increasing expert range size (Supporting Information). Several predictors related to attributes of map geometry, attributes of occurrence records, and butterfly traits were significant predictors of expert score for both the Glassberg and Scott data sets (Fig. 2a,b & Table 2). Attributes of the maps themselves constituted the strongest predictors of expert score (adjusted R² = 0.57 for Glassberg and 0.48 for Scott). Glassberg and Scott maps were also similar in that predictors related to occurrence records and to species ecology or life-history traits explained much less variance in expert score than attributes of map geometry and in that the variance explained by these 2 groups of predictors added little to models that already included attributes of map geometry (Fig. 2a).
Figure 3 (caption fragment): ... Glassberg and Scott, and (c) expert agreement predicted separately based on covariates related to occurrence geometry and life history (2 groups of covariates merged to predict expert agreement; circle interpretation as in Fig. 2). Bivariate plots of expert agreement with each of the significant predictors related to points geometry and to ecology and life history are in Supporting Information.
... than was the Polsby-Popper index (Fig. 2c-h) (definitions in Table 1). Results were broadly similar when we considered only those species with nontruncated ranges (Supporting Information). Standardized coefficients of significant predictors are in Supporting Information. The degree to which expert score was predicted by each of the map geometries under the simple linear regression framework is summarized in Supporting Information.

Expert Agreement and Predictors
Across all species, Scott range size was consistently greater than the corresponding Glassberg range size (for 87% of species) (Fig. 3a). Expert agreement ranged from almost complete agreement to complete disagreement (Fig. 3b). Across all species, the 2 expert maps shared an average of 58% of the total range (median = 62%). Expert agreement increased monotonically as a function of range area for each of the sets of expert maps (not shown). However, we excluded expert range area and other map attributes as predictors of expert agreement to avoid issues of circularity. Predictors related to attributes of occurrences and to ecology and life history explained 30% and 29% of the variance in expert agreement, respectively (Fig. 3c & Table 2); collectively, they explained 44% of the variance. Scott and Glassberg expert maps agreed more for those species represented by a larger number of occurrence records but agreed less for species whose occurrences featured greater heterogeneity in spatial distribution (Supporting Information).
Point density exhibited a weak negative relationship with expert agreement, and several ecological and life-history traits were also weakly associated with expert agreement (Supporting Information).
Restricting these same analyses to those species with nontruncated ranges, we found 45% of variance among maps explained by attributes of occurrences (vs. 30% for all species), 58% explained by ecology and life-history traits (vs. 29% for all species), and 64% explained by both groups of predictors (vs. 44% for all species) (Supporting Information).

Deviance in Expert Maps
Expert maps performed well in predicting unsuitable landscape outside the expert-drawn map boundaries. This was true for both Glassberg and Scott maps and for the vast majority of species, as indicated by very low deviance outside scores irrespective of expert score (Fig. 4a,b). Consequently, the overall performance of the expert maps was almost exclusively determined by the deviance inside (i.e., from predictions concerning unsuitable areas inside the map).
Deviance inside was lower for Glassberg than for Scott for 88% of species (Fig. 4c). This indicated that Glassberg more robustly predicted suitable sites than Scott. In contrast, deviance outside was higher for Glassberg than for Scott for 80% of species (Fig. 4d), indicating the Glassberg maps had more omission errors than the Scott maps. However, omission error contributed little to overall model performance (Fig. 4a,b).

Discussion
Expert opinions, especially opinions that represent the collective consensus of many experts, are indispensable components of knowledge. Such opinions are especially useful when more objective knowledge is incomplete, as is true for species distributions. Different experts draw maps with different intentions, which can result in surprising deviations from each other in terms of range boundaries. Only rarely are alternative expert maps available for comparison. We contrasted alternative expert maps for diverse species via comparisons to SDMs that value reducing errors of commission as well as omission.
Expert maps are often drawn specifically to reduce omission errors and deal with commission errors by having accompanying text describing where species are most likely to be seen within the drawn range boundaries. Consequently, expert range maps make substantial commission errors, often including large amounts of uninhabited land within the range boundaries. For example, species of birds, on average, occur in only 40% of surveyed sites (Hurlbert & White 2005) or about half of the 0.25° grid cells (Hurlbert & Jetz 2007). Our analyses of expert maps echoed these findings when we treated the SDMs as the reference point (Fig. 4a,b). Our efforts feature two advances. First, we addressed errors arising from the inclusion of unsuitable sites in the expert maps as well as errors arising from the exclusion of highly suitable sites from the expert map. Second, rather than evaluating the expert maps against presence-absence grids based on occurrence records (Hurlbert & White 2005; Hurlbert & Jetz 2007), we evaluated expert maps relative to gridded probability surfaces derived from SDMs. This alternative may be especially advantageous when, as here, highly efficient SDMs are available that can detect habitat and likely occupied habitats in unsampled locations.

For both the Scott and Glassberg expert map sets, we observed substantial interspecific differences in agreement with the SDMs (expert score, Fig. 1b). Thus, the same expert method can yield very different predictions relative to occurrence-based models when applied across diverse species. Overall, the Glassberg maps provided better matches to the SDM maps for 86% of species (Fig. 1c), indicating a consistent consequence of the different approaches the 2 authors took when delineating expert maps. For instance, Glassberg only included regions in his range where the species was known to reproduce even if it does not overwinter, whereas Scott included areas that the species expanded to each year, even if it did not reproduce. Thus, not surprisingly, the Glassberg range was smaller than the corresponding Scott range for the vast majority of the species (Fig. 3a). This suggests the higher fraction of false positives that occurs in large expert ranges reduces the degree to which those maps will agree with the predictions of SDMs.
Indeed, by contrasting the performance of expert maps inside and outside expert-drawn map boundaries, we found that for almost all species the predictive performance of the expert maps was penalized by inclusion of unsuitable sites within the map, rather than by exclusion of suitable sites outside of the map (Fig. 4a,b). This reflects the strategies used in drawing the expert maps: experts routinely delineate the boundaries of a species' range and then allow their accompanying species description to provide guidance on internal heterogeneity. Overall, the two sets of expert maps differ greatly in the extent to which they include false positives inside the map boundaries. By including nonbreeding ranges, maps that are overly generous in space may have reduced utility (Fig. 1c) if they include unsuitable sites inside (Fig. 4c) because they stretch to include more strays or vagrants (Fig. 4d).
Expert maps vary in important ways. Among birds, which have some of the most complete distributional information (Hurlbert & Jetz 2007), expert maps are a reasonable approximation of species' range at 100-200 km spatial scale (Hurlbert & White 2005; Hurlbert & Jetz 2007; Merow et al. 2017). For many other taxonomic groups, whose distributions are less well known than those of birds, the spatial accuracy of expert maps is largely unknown. Even for birds, expert maps tend to include false presences at finer resolutions (Hurlbert & Jetz 2007), limiting the utility of the expert maps for understanding ecological processes, conservation planning, disease risk assessment, and similar applications unless the accompanying text is also taken into account. However, when used for analytical purposes, usually only the range map is considered. The framework we developed offers a way to explore the congruence between expert maps and SDMs, which are very different approaches for understanding species ranges. An emphasis on minimizing omission errors more strongly than commission errors shifts an SDM output from detailed edges and heterogeneous, disjunct patches of highly suitable areas to the smooth, blob-like area commonly reported in expert maps.
One caveat of using a probability surface to evaluate expert credibility is that SDM outputs cannot possibly account for all dispersal barriers, biotic interactions (Soberón 2007), and environmental dependencies. These omissions could result in overestimation of species distribution, whereas experts would be expected to know about such geographical and biological constraints (Domisch et al. 2016), thereby yielding more realistic species range maps. If such omissions were important, they would drive deviance outside scores upward. However, we found near-0 deviance outside (Fig. 4a,b). In contrast, SDMs may not capture internal range heterogeneity due to spatial biases in available data (Kramer-Schadt et al. 2013; Yackulic et al. 2013). This important problem can be at least partially addressed if species occurrences are well sampled across multiple environmental gradients. Such sampling reduces bias along the variables used to predict occurrence patterns, even if the sampling is geographically biased. A comprehensive review of the best practices in SDM is beyond the scope of this article (see contemporary literature, including Elith & Leathwick 2009 and Araújo et al. 2019).

Lessons for the Use of Expert Maps
Organisms typically tolerate environmental conditions across a continuous range, with upper and lower critical limits on either side of an optimum (Miller & Stillman 2012). However, continuity in environmental ranges need not map onto continuity in geographic space. Consequently, mapped ranges may feature porosities reflecting unsuitable localities within a larger suitable region (Hurlbert & White 2005); tortuous range edges that result in range concavities and increased detailed edge measures; and disjunct suitable areas. Here, agreement between expert maps and SDMs decreased with increases in 3 expert map traits: convex hull score, detailed edge, and Polsby-Popper score (Fig. 2d-i) (definitions in Table 1). Collectively, these 3 measures characterize the geometric shape of the maps: were the maps elongate versus compact, convex or featuring concavities, and drawn with much or little boundary detail. Once these differences in range geometry were accounted for, life-history and ecological traits explained almost no further variability, which was a surprising result (Fig. 2).
Because expert maps are generally presented with accompanying information about habitat, elevational, or other physical environmental preferences, actual internal heterogeneity is routinely absent from the maps. Increased accessibility to relevant environmental layers may allow future expert maps to be combined within geographic information systems to minimize both omission and commission errors. Effectively, this would lead to expert maps that feature porosities or are split into separate spatial units to eliminate unsuitable landscape. None of the maps we used featured porosities, nor did they allow a reliable counting of separate spatial units.
Conservation practitioners are far more likely to use existing expert maps than to develop new ones. As we show, not all expert maps are drawn based on the same criteria; thus, expert maps are not always comparable (either for different species by the same group of experts or for the same species by different groups of experts). Practitioners must be especially careful to understand how expert maps were developed and what assumptions were presented by the authors. We found expert maps with small convex hull scores and small area to perimeter ratios were more likely to match predictions from species distribution models (Fig. 2). It remains to be seen if this is more generally true.
In conclusion, range maps are abstractions of experts' belief about species distributions. Those abstractions depend on spatial scale and thus may introduce errors especially when they are digitized and stacked for uses beyond which they were intended such as estimating local species richness (Hurlbert & Jetz 2007). However, expert range maps are widely used in macroecological and conservation analyses (e.g., Shriner et al. 2006) precisely because they are so readily available. It is clear, however, that expert maps should be used with caution and with clear attention to the assumptions originally used to draw them. In recent analyses, expert range maps were used in 69% of 85 studies of species richness (Hawkins et al. 2003), even though they predicted roughly as many false occupancies as true occupancies (Hurlbert & White 2005;Hurlbert & Jetz 2007) and overestimated the spatial pattern of species richness (Hurlbert & Jetz 2007). Our analyses suggest that expert range maps will provide the best matches to SDMs when they are drawn to reduce errors of commission and when they feature heightened values of concavity and detailed edge. When available, we predict that other map characteristics, such as porosity and number of spatially isolated units, will also contribute to the agreement between SDMs and expert maps. Likewise, SDMs could leverage the value of expert maps (Jetz et al. 2012) if the maps are used to limit the boundaries of potential space (Fig. 1a), thus, helping to reduce SDM commission errors from the outset.