Early View e14376
METHOD
Open Access

Systematic conservation prioritization with the prioritizr R package

Jeffrey O. Hanson (Corresponding Author)

Centre for Biodiversity and Conservation Science, University of Queensland, St Lucia, Queensland, Australia

Department of Biology, Carleton University, Ottawa, Ontario, Canada

Correspondence: Jeffrey O. Hanson, Centre for Biodiversity and Conservation Science, University of Queensland, St Lucia QLD 4072, Australia. Email: [email protected]

Richard Schuster

Department of Biology, Carleton University, Ottawa, Ontario, Canada

Nature Conservancy of Canada, Toronto, Ontario, Canada

Matthew Strimas-Mackey

Cornell Laboratory of Ornithology, Cornell University, Ithaca, New York, USA

Nina Morrell

Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, British Columbia, Canada

Brandon P. M. Edwards

Department of Biology, Carleton University, Ottawa, Ontario, Canada

Peter Arcese

Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, British Columbia, Canada

Joseph R. Bennett

Department of Biology, Carleton University, Ottawa, Ontario, Canada

Hugh P. Possingham

Centre for Biodiversity and Conservation Science, University of Queensland, St Lucia, Queensland, Australia

First published: 13 September 2024

Article impact statement: The prioritizr R package is a decision support tool for systematic conservation planning that generates optimal prioritizations.

Abstract

Plans for expanding protected area systems (prioritizations) need to fulfill conservation objectives. They also need to account for other factors, such as economic feasibility and anthropogenic land-use requirements. Although prioritizations are often generated with decision support tools, most tools have limitations that hinder their use for decision-making. We outlined how the prioritizr R package (https://prioritizr.net) can be used for systematic conservation prioritization. This decision support tool provides a flexible interface to build conservation planning problems. It can leverage a variety of commercial (e.g., Gurobi) and open-source (e.g., CBC and SYMPHONY) exact algorithm solvers to identify optimal solutions in a short period. It is also compatible with a variety of spatially explicit (e.g., ESRI Shapefile, GeoTIFF) and nonspatial tabular (e.g., Microsoft Excel Spreadsheet) data formats. Additionally, it provides functionality for evaluating prioritizations, such as assessing the relative importance of different places selected by a prioritization. To showcase the prioritizr R package, we applied it to a case study based in Washington state (United States) for which we developed a prioritization to improve protected area coverage of native avifauna. We accounted for land acquisition costs, existing protected areas, places that might not be suitable for protected area establishment, and spatial fragmentation. We also conducted a benchmark analysis to examine the performance of different solvers. The prioritization identified 12,400 km² of priority areas for increasing the percentage of species’ distributions covered by protected areas. Although open-source and commercial solvers were able to quickly solve large-scale conservation planning problems, commercial solvers were required for complex, large-scale problems. The prioritizr R package is available on the Comprehensive R Archive Network (CRAN). In addition to reserve selection, it can inform habitat restoration, connectivity enhancement, and ecosystem service provisioning. It has been used in numerous conservation planning exercises to inform best practices and aid real-world decision-making.

Priorización de la conservación sistemática con el paquete prioritizr R

Resumen

Los planes para expandir los sistemas de áreas protegidas (priorizaciones) necesitan cumplir con los objetivos de conservación. También necesitan considerar otros factores, como la viabilidad económica y los requerimientos para el uso antropogénico del suelo. Aunque con frecuencia las priorizaciones se generan con herramientas de apoyo para decidir, la mayoría de estas herramientas tienen limitantes que complican su uso en la toma de decisiones. Esbozamos cómo el paquete prioritizr R (https://prioritizr.net) puede usarse para la priorización de la conservación sistemática. Esta herramienta de apoyo para decidir proporciona una interfaz flexible para construir problemas de la planeación de la conservación. También puede sacar provecho de una variedad de solucionadores exactos de algoritmos comerciales (p. ej.: Gurobi) y de fuentes abiertas (p. ej.: CBC y SYMPHONY) para identificar soluciones óptimas en un periodo breve. La herramienta también es compatible con una variedad de formatos de datos tabulares con espacialidad explícita (p. ej.: ESRI Shapefile, GeoTIFF) y sin espacialidad (p. ej.: hojas de cálculo de Microsoft Excel). Además, proporciona la funcionalidad para evaluar las priorizaciones, como el análisis de la importancia relativa de los diferentes lugares seleccionados por una priorización.

Para mostrar la funcionalidad del paquete prioritizr R, lo aplicamos a un estudio de caso en el Estado de Washington, Estado Unidos, para el cual desarrollamos una priorización para mejorar la cobertura del área protegida de la avifauna nativa. Consideramos los costos de adquisición de tierras, las áreas protegidas existentes y la fragmentación espacial. También realizamos un análisis comparativo para examinar el desempeño de los diferentes solucionadores. La priorización identificó 12,400 km2 de áreas prioritarias para incrementar el porcentaje de la distribución de especies cubiertas por las áreas protegidas. Aunque los solucionadores comerciales y de fuente abierta lograron resolver rápidamente los problemas de conservación a gran escala, sólo los comerciales fueron requeridos para los problemas complejos de gran escala. El paquete prioritizr R está disponible en el Comprehensive R Archive Network (CRAN). Además de seleccionar las reservas, el paquete puede informar la restauración de hábitat, la mejora de la conectividad y el suministro de servicios ambientales. El paquete se ha usado en varios ejercicios para informar las mejores prácticas y ayudar a la toma de decisiones en el mundo real.

INTRODUCTION

Protected areas are essential for nature conservation (Ferraro & Pattanayak, 2006; Ribas et al., 2020; Watson et al., 2014). They safeguard species from anthropogenic impacts and provide sanctuaries for ecological and evolutionary processes (Klein et al., 2009; Rosauer et al., 2017; Williams et al., 2020). They also set the stage for direct management interventions, such as habitat restoration, pest eradication, and species reintroduction projects (Burrows et al., 2003; Innes et al., 1999; Omeja et al., 2016). Because resources are limited, plans for expanding protected area systems (termed prioritizations) need to strategically allocate resources to achieve conservation objectives (Margules & Pressey, 2000). As a consequence, there has been increasing interest in optimizing prioritizations to better inform conservation decision-making (Beyer et al., 2016; Rodrigues et al., 2000; Watts et al., 2009).

Decision support tools play a critical role in conservation planning (Sarkar et al., 2006). The most commonly used conservation planning tools include Marxan and Zonation (Ball et al., 2009; Moilanen et al., 2022). These tools have helped inform protected area systems across the planet, with examples in Australia (Fernandes et al., 2005), Finland (Kareksela et al., 2020), Mexico (Álvarez-Romero et al., 2013), and South Africa (Cowling et al., 2003). Such tools use ecological, economic, and social data—such as species’ geographic ranges (Butchart et al., 2015), ecosystem boundaries (Klein et al., 2009), land acquisition costs (Rodewald et al., 2019), and land-use data (Pinto et al., 2019)—along with optimization algorithms to generate prioritizations. They can also account for connectivity during the optimization process (Beger et al., 2010; Moilanen & Wintle, 2007). Furthermore, by accounting for multiple management zones, decision support tools can inform landscape-level planning by prioritizing specific places for the implementation of particular policies or management actions (Watts et al., 2009).

Conservation decision support tools should be flexible, accountable, and efficient (Rodrigues et al., 2000). They should be flexible, so that they can be applied to a broad range of conservation planning scenarios (Sarkar et al., 2006). Although different scenarios might involve different design criteria, decision support tools should, ideally, have the flexibility to accommodate a broad range of objectives and constraints (Rodrigues et al., 2000). Decision support tools should also be accountable (Rodrigues et al., 2000), so that conservation planners can easily understand how and why particular priority areas are identified (Sarkar et al., 2006). Additionally, they should generate optimal prioritizations that achieve conservation objectives with maximal cost-efficiency (Rodrigues & Gaston, 2002a). Although some decision support tools rely on algorithms that can produce suboptimal prioritizations (e.g., simulated annealing and heuristic algorithms; Rodrigues et al., 2000; Schuster et al., 2020), decision support tools should guarantee—given the limitations of available data—that prioritizations are cost-efficient.

We considered how the prioritizr R package can be used in systematic conservation prioritization (https://prioritizr.net). The package offers a flexible interface to build conservation planning problems and solve them with exact algorithms to generate optimal prioritizations. We detailed its usage for conservation planning exercises. We applied it to a case study based in Washington state (United States). We also conducted a benchmark analysis to examine run times across different problem formulations, sizes, and solvers. Code and data for the case study and benchmark analyses are available online (Hanson et al., 2024). Finally, we considered its advantages, limitations, and previous applications.

METHODS

Software description

The prioritizr R package (hereafter, the package) is implemented as an open-source software package to extend the R statistical computing environment (R Core Team, 2020). Within a systematic conservation planning exercise (Margules & Pressey, 2000), identifying priority areas involves the following four steps: obtaining and preparing input data, building conservation planning problems, solving these problems to generate prioritizations, and evaluating the prioritizations. These steps are often revisited as new data become available, stakeholder requirements evolve, and preliminary analyses reveal additional factors for consideration. We considered each of these steps in detail below (see https://prioritizr.net for additional information).

Systematic conservation planning exercises require spatially explicit data (Margules & Pressey, 2000). Briefly, they require data to delineate places for consideration (termed planning units) (e.g., point localities, grid cells, property boundaries, or ecosystem boundaries). Planning units should include places within existing protected areas to account for existing conservation efforts and potential places for protected area establishment. Planning units may also include places that—albeit not necessarily available for protected area establishment (e.g., urban areas)—are important for correctly assessing conservation benefits (e.g., calculating the percentage of a species’ geographic range represented by priority areas). Planning exercises also require spatial data for biotic elements of conservation interest (termed features) (e.g., species, ecosystem types, ecosystem services). They require management data, such as cost (or a surrogate thereof) of implementing conservation actions within planning units (Rodewald et al., 2019), as well as the current protection status of planning units. If exercises involve multiple management zones (or actions), then they also require the cost of allocating each planning unit to each zone and the expected amount of each feature in each planning unit when allocated to each zone (Watts et al., 2009). Furthermore, exercises often use data to parametrize connectivity for guiding reserve selection (e.g., river connectivity; Lin et al., 2020).

The package can use a variety of input data formats. Briefly, planning unit and feature data can be supplied in spatially explicit vector (e.g., ESRI shapefile), raster (e.g., GeoTIFF), and nonspatial tabular (e.g., Microsoft Excel Spreadsheet) formats with purpose-built R packages (e.g., sf, terra, readxl R packages) (Hijmans, 2023; Pebesma, 2018; Wickham & Bryan, 2019). These formats can be used for exercises involving a single management zone or multiple zones. Although tabular data require prior processing, such operations are automated for spatially explicit data formats (via exactextractr R package) (Baston, 2020). Matrices are used to specify spatial relationships (e.g., shared boundary lengths) and connectivity between planning units (e.g., gene flow). To help with applying such matrices, functions are provided to automate their creation for spatially explicit planning unit data (e.g., the boundary_matrix() and connectivity_matrix() functions). Additionally, to provide interoperability, data prepared for Marxan can be imported directly (with the marxan_problem() function).
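The data preparation step can be sketched in R as follows. This is a minimal illustration, not the case study workflow: it assumes prioritizr version 8.0.0 or later (where simulated data are accessed with get_sim_pu_polygons() and get_sim_features()), and the file paths shown in the comments are hypothetical placeholders.

```r
# Sketch of preparing input data (assumes prioritizr >= 8.0.0 is installed)
library(prioritizr)

# In a real exercise, data would be imported from files, for example:
#   pu <- sf::read_sf("planning_units.shp")     # hypothetical path
#   features <- terra::rast("species_maps.tif") # hypothetical path
# Here, simulated data bundled with the package are used instead.
pu <- get_sim_pu_polygons()    # sf object with one row per planning unit
features <- get_sim_features() # SpatRaster with one layer per feature

# Create a sparse matrix of shared boundary lengths between planning units
bm <- boundary_matrix(pu)

# The matrix has one row and one column per planning unit
dim(bm)
```

Because the matrix is built directly from the spatial data, the same object can later be reused when adding boundary penalties or summarizing the boundary length of a solution.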

Systematic conservation planning exercises involve mathematical optimization (Rodrigues et al., 2000). To achieve this, conservation planners use the package to specify all the data, settings, and design criteria for generating prioritizations. As such, the package provides an accountable method for conservation planning. Specifically, a new conservation planning problem is created with the planning unit, cost, and feature data (with the problem() function), which is then customized by adding components via functions (details below). These functions serve as building blocks to specify how the planning unit, cost, and feature data—along with additional data (e.g., connectivity data, existing protected area data)—combine to formulate an optimization problem. Since most of these functions are compatible with each other and generalize to multiple management zones, they can be added together in many different combinations—producing a highly flexible interface that can be applied to a variety of scenarios.

The functions for customizing problems are organized into the following roles: objectives, penalties, targets, constraints, decisions, solvers, and portfolios. Each role serves a distinct purpose, and functions are named according to their role (see below). For some of the roles, only a single function can be used when building a problem (i.e., objectives, targets, decisions, solvers, portfolios). For other roles, multiple functions belonging to the same role can be used when building a problem (i.e., penalties and constraints).

Objective functions specify the primary metric that is maximized (or minimized) during optimization. For example, such functions can be added to minimize representation shortfalls given a budget, which is useful for improving representation of features under limited funding (with the add_min_shortfall_objective() function [based on Jung et al., 2021]). They can also be added to maximize phylogenetic diversity given a budget, which is useful for incorporating evolutionary processes into prioritizations (with the add_max_phylo_div_objective() function [based on Rodrigues & Gaston, 2002b]). Additionally, similar to Marxan, they can be added to minimize total cost—while ensuring that representation targets are met for all features (see below for details)—which is useful for generating prioritizations that fully achieve conservation goals (with the add_min_set_objective() function [based on Beyer et al., 2016]).

Penalty functions specify additional criteria for penalizing (or weighting) prioritizations. They generally use a penalty parameter to balance their relative importance with the primary objective (similar to the boundary length modifier used by Marxan). For example, such functions can be added to penalize spatial fragmentation (with the add_boundary_penalties() function [based on Beyer et al., 2016]) and low connectivity (with the add_connectivity_penalties() function [based on Beger et al., 2010]).

Target functions specify threshold coverage values for feature representation (e.g., Butchart et al., 2015). They are typically used to specify a minimum level of coverage for each feature. For example, these functions can specify targets as a proportion of the total amount of each feature across the study area (with the add_relative_targets() function). For fine-grained control, such functions can be specified in tabular format (with the add_manual_targets() function). Most of the objective functions require target functions (e.g., the add_min_set_objective() or add_min_shortfall_objective() function).

Constraint functions ensure that prioritizations exhibit specific characteristics. For example, such functions can ensure that certain planning units are selected (with the add_locked_in_constraints() function) or not selected by prioritizations (with the add_locked_out_constraints() function). These functions can also be used to promote connectivity, such as ensuring that priority areas each have a particular number of neighbors (with the add_neighbors_constraints() function) or form contiguous networks (with the add_contiguity_constraints() function [based on Billionnet, 2013]).

Decision functions specify the type of values in prioritizations. For example, such functions can specify that prioritizations should contain binary values that are either zero or one (similar to Marxan; with the add_binary_decisions() function) or contain continuous values from zero to one (with the add_proportion_decisions() function).

Solver functions specify the software and settings for optimization (e.g., optimality gap, maximum run time). The following solvers are currently supported: Gurobi (Gurobi Optimization LLC, 2020), IBM CPLEX (IBM, 2019), CBC (COIN-OR Branch and Cut; Forrest & Lougee-Heimer, 2005), HiGHS (Huangfu & Hall, 2018), and SYMPHONY (Ralphs & Güzelsoy, 2005). The SYMPHONY solver can be accessed via the Rsymphony or lpsymphony R package (Harter et al., 2017; Kim, 2020). We recommend the commercial Gurobi or IBM CPLEX solvers where possible because they have the best performance (with the add_gurobi_solver() and add_cplex_solver() functions, respectively) (Koch et al., 2011). If neither of those solvers is available, we recommend the open-source CBC or HiGHS solver (with the add_cbc_solver() or add_highs_solver() function, respectively).

Portfolio functions specify approaches for generating multiple prioritizations. Such functionality can be useful for facilitating stakeholder discussions and exploring trade-offs among multiple near-optimal solutions (Ardron et al., 2010). For example, such functions can generate a set of prioritizations within a specified optimality gap (with the add_gap_portfolio() function). These functions are optional. If a portfolio function is not specified, then a default portfolio is automatically applied to generate a single solution.
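The roles described above can be combined as in the following sketch. It is an illustrative example built on the simulated data bundled with the package (assuming prioritizr version 8.0.0 or later), not the formulation used in the case study. The 20% target and 10% gap are arbitrary choices for illustration, and add_default_solver() assumes that at least one supported solver is installed.

```r
# Sketch of building a problem from functions in different roles
# (assumes prioritizr >= 8.0.0; simulated data, not the case study data)
library(prioritizr)

pu <- get_sim_pu_raster()               # planning unit costs
features <- get_sim_features()          # feature distributions
locked_in <- get_sim_locked_in_raster() # simulated existing protected areas

p <-
  problem(pu, features) %>%                       # data
  add_min_set_objective() %>%                     # objective
  add_relative_targets(0.2) %>%                   # targets (20% per feature)
  add_locked_in_constraints(locked_in) %>%        # constraints
  add_binary_decisions() %>%                      # decisions
  add_default_solver(gap = 0.1, verbose = FALSE)  # solver (best one installed)

# Print a summary of the problem and its components
print(p)
```

No portfolio function is added here, so the default portfolio would yield a single solution when the problem is solved.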

Systematic conservation planning exercises use algorithms to identify priority areas (Margules & Pressey, 2000). After building a conservation planning problem, the package compiles a mixed integer programing problem and solves it with an exact algorithm solver to generate a prioritization. These solvers provide numerous advantages (Rodrigues et al., 2000). First, they are mathematically guaranteed to identify optimal prioritizations (Rodrigues & Gaston, 2002a; Underhill, 1994). This is unlike Marxan and Zonation, which use simulated annealing and iterative heuristic algorithms (respectively) that do not provide such mathematical guarantees (Pressey et al., 1996). If optimality is not strictly required, exact algorithm solvers can identify near-optimal prioritizations within a specified optimality gap (e.g., 10% from optimality). Second, both open-source and commercial solvers can generate near-optimal solutions for moderately sized prioritizations relatively quickly (Beyer et al., 2016; Schuster et al., 2020). For example, Schuster et al. (2020) demonstrate that the prioritizr R package, when applied with the Gurobi and SYMPHONY solvers, has superior performance to Marxan. Third, the package can solve large-scale problems that comprise hundreds of thousands of planning units (Hanson et al., 2020). Although such problems exceed recommended limits for other tools (e.g., Marxan, per Ardron et al., 2010), the package is able to solve such problems to near-optimality (Schuster et al., 2020). Fourth, exact algorithm solvers are steadily improving over time (Achterberg & Wunderling, 2013), and leveraging these solvers grants conservation planners access to advances in optimization.
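The effect of the optimality gap can be explored with a small sketch such as the one below. It uses simulated data bundled with the package (assuming prioritizr version 8.0.0 or later and at least one installed solver), so the run times and costs bear no relation to the case study. For a minimum set problem, a solution obtained with a relaxed gap should cost at least as much as the optimal solution.

```r
# Sketch comparing an optimal solution with a near-optimal solution
# (assumes prioritizr >= 8.0.0 and an installed solver; simulated data)
library(prioritizr)

pu <- get_sim_pu_raster()
features <- get_sim_features()

# Build a minimum set problem without a solver
p <-
  problem(pu, features) %>%
  add_min_set_objective() %>%
  add_relative_targets(0.2) %>%
  add_binary_decisions()

# Solve to optimality (gap = 0) and to within 10% of optimality (gap = 0.1)
p_opt <- p %>% add_default_solver(gap = 0, verbose = FALSE)
p_gap <- p %>% add_default_solver(gap = 0.1, verbose = FALSE)
s_opt <- solve(p_opt)
s_gap <- solve(p_gap)

# Compare the total cost of the two prioritizations
cost_opt <- eval_cost_summary(p, s_opt)$cost
cost_gap <- eval_cost_summary(p, s_gap)$cost
c(optimal = cost_opt, near_optimal = cost_gap)
```

On a problem this small the two solutions may well be identical; the trade-off between run time and solution quality only becomes noticeable for larger or more complex problems.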

Systematic conservation planning exercises involve evaluating prioritizations (Margules & Pressey, 2000). To help with this, the package provides methods to summarize the performance of prioritizations. For example, it can report how well features are represented (with the eval_feature_representation_summary() function), the total cost (with the eval_cost_summary() function), and the number of selected planning units (with the eval_n_summary() function) associated with a prioritization. It can also report the total boundary (perimeter) length associated with a prioritization (with the eval_boundary_summary() function) and connectivity information (with the eval_connectivity_summary() function).
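As a hedged sketch of this evaluation step, the following code solves a small problem based on the simulated data bundled with the package (assuming prioritizr version 8.0.0 or later and at least one installed solver) and then applies the summary functions named above. The problem settings are illustrative only.

```r
# Sketch of evaluating a prioritization
# (assumes prioritizr >= 8.0.0 and an installed solver; simulated data)
library(prioritizr)

pu <- get_sim_pu_raster()
features <- get_sim_features()

p <-
  problem(pu, features) %>%
  add_min_set_objective() %>%
  add_relative_targets(0.2) %>%
  add_binary_decisions() %>%
  add_default_solver(gap = 0.1, verbose = FALSE)
s <- solve(p)

# Number of selected planning units and their total cost
n_selected <- eval_n_summary(p, s)
total_cost <- eval_cost_summary(p, s)

# Amount and proportion of each feature held by the prioritization
representation <- eval_feature_representation_summary(p, s)

# Total boundary (perimeter) length of the prioritization
boundary <- eval_boundary_summary(p, s)

print(representation)
```

Each function returns a table, so the results can be combined or exported with standard R tools.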

Figure 1. Annotated example showing how the prioritizr R package was used to identify priority areas for the case study of Washington state (United States) (insets, input data sets, priority areas identified by prioritization, and relative importance scores for the priority areas). This example details computational procedures for loading packages, importing data, building a conservation planning problem, solving the problem to generate a prioritization, and evaluating the prioritization. These procedures are expressed as code in the R programing language. Code comments (green) provide explanations for the computational procedures and descriptions of the input data sets and resulting outputs.

Systematic conservation planning exercises involve characterizing the relative importance of priority areas within a prioritization (Margules & Pressey, 2000). This information is useful for scheduling implementation and reaching consensus among stakeholders (Pressey, 1999). To provide this information, solutions can be evaluated with a replacement cost metric (with the eval_replacement_importance() function [based on Cabeza & Moilanen, 2006]), irreplaceability metric (with the eval_ferrier_importance() function [based on Ferrier et al., 2000]), and rarity-weighted richness metric (with the eval_rare_richness_importance() function [based on Williams et al., 1996]). Although the replacement cost metric provides the most intuitive measure (Cabeza & Moilanen, 2006), it has a very high computational burden. As such, we recommend the replacement cost metric for small problems (< 0,000 planning units) and the irreplaceability metric for larger problems.
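The importance metrics can be computed as in the sketch below, which again relies on simulated data bundled with the package (assuming prioritizr version 8.0.0 or later and at least one installed solver). Only the two computationally cheaper metrics are run here; the replacement cost metric follows the same calling pattern with eval_replacement_importance().

```r
# Sketch of calculating relative importance scores
# (assumes prioritizr >= 8.0.0 and an installed solver; simulated data)
library(prioritizr)

pu <- get_sim_pu_raster()
features <- get_sim_features()

p <-
  problem(pu, features) %>%
  add_min_set_objective() %>%
  add_relative_targets(0.2) %>%
  add_binary_decisions() %>%
  add_default_solver(gap = 0.1, verbose = FALSE)
s <- solve(p)

# Irreplaceability scores following Ferrier et al. (2000): one layer per
# feature plus a layer with the total score for each planning unit
imp_ferrier <- eval_ferrier_importance(p, s)

# Rarity-weighted richness scores following Williams et al. (1996)
imp_rwr <- eval_rare_richness_importance(p, s)

# Map the total irreplaceability score
terra::plot(imp_ferrier[["total"]])
```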

Case study

We used a case study to showcase the prioritizr R package. Our case study was based in Washington state (United States). We aimed to identify priority areas for new protected areas to improve conservation of native avifauna. The methodology was based on Hanson et al. (2022). All analyses were completed with the R statistical computing environment (version 4.2.2). Spatial data processing was completed with the fasterize (Ross, 2022), sf, and terra packages. Prioritization analyses were performed with the prioritizr R package (Hanson et al., 2021).

We used a spatial grid across the study area to define the planning units (4 × 4 km resolution, n = 10,757). We then obtained land valuation data from Nolte (2020a, 2020b). Although such data were derived from property sales (Nolte, 2020b), they may not fully reflect current market values (e.g., due to changing opportunity costs). To account for existing conservation efforts, we obtained protected area boundaries (U.S. Geological Survey Gap Analysis Project 2022) and excluded protected areas not managed for biodiversity. We then overlaid the planning units with the protected areas and subsequently treated planning units with at least 50% coverage as covered by existing protected areas. To account for places that might not be suitable for protected area establishment, we obtained land-use data to characterize urban areas and the boundaries of Tribal Reservation and Trust Lands (Commission for Environmental Cooperation, 2020; Washington State Department of Transportation, 2017). We then identified the dominant land use in each planning unit and subsequently treated planning units dominated by urban areas as covered by urban areas (following Hanson et al., 2022). We also overlaid the Tribal Reservation and Trust Lands with the planning units and subsequently treated planning units with at least 1% coverage as covered by such lands (following Hanson et al., 2022).

We obtained 396 species distribution maps for 258 bird species (4 × 4 km resolution) from Hanson et al. (2022) (Appendix S1). Briefly, these maps were originally derived from high-resolution data for 2018 from the eBird Status and Trends data set (Fink, Auer, Johnston, Ruiz-Gutierrez, et al., 2020; Fink, Auer, Johnston, Strimas-Mackey, et al., 2020). To help account for migratory patterns, we used separate maps for the species’ breeding and nonbreeding distributions. Since 47 species did not have separate maps for their seasonal distributions, we accounted for these species with maps of their full annual distribution. The species distribution maps were subsequently resampled to match the planning units.

We generated a prioritization to identify priority areas for protected area establishment (Figure 1). To achieve this, we defined planning unit costs with the land valuation data and features with the species distribution maps. We added an objective to specify that the prioritization should aim to minimize shortfalls for the species representation targets (see below), subject to a given budget (see Jung et al., 2021 for details). To guide expansion of conservation efforts, we calculated the total land value of the existing protected area system (i.e., US $3013.56 million) and set a total budget for the prioritization (including costs of existing protected areas) by increasing the current total land value by 30% (i.e., US $3917.63 million). We then added boundary penalties to reduce spatial fragmentation of the solution (with a penalty value of 1 × 10⁻⁵, which was derived from preliminary analyses). We then added representation targets to ensure that each feature had at least 20% of its total distribution covered by the prioritization. We also added locked-in constraints for planning units covered by protected areas and locked-out constraints for planning units covered by urban areas and Tribal Reservation and Trust Lands. After building the problem, we solved it to within 10% of optimality with the Gurobi optimization software. We also evaluated the relative importance of planning units in the solution following Ferrier et al. (2000).
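The formulation described above can be sketched as follows. The first part reproduces the budget arithmetic reported in the text. The second part builds an analogous problem on the simulated data bundled with the package (assuming prioritizr version 8.0.0 or later and at least one installed solver); it is not the case study code, which is available from Hanson et al. (2024). The simulated budget and the reuse of the 1 × 10⁻⁵ penalty value are illustrative assumptions, because that penalty was tuned for the case study data.

```r
# Sketch of a minimum shortfall formulation similar to the case study
# (assumes prioritizr >= 8.0.0 and an installed solver; simulated data)
library(prioritizr)

# Budget arithmetic reported in the text (US$ million)
current_value <- 3013.56            # land value of existing protected areas
case_budget <- current_value * 1.3  # increase by 30%
round(case_budget, 2)

# Analogous budget for the simulated data: 130% of the locked-in cost
pu <- get_sim_pu_polygons()
features <- get_sim_features()
sim_budget <- 1.3 * sum(pu$cost[as.logical(pu$locked_in)], na.rm = TRUE)

p <-
  problem(pu, features, cost_column = "cost") %>%
  add_min_shortfall_objective(budget = sim_budget) %>%
  add_relative_targets(0.2) %>%
  add_boundary_penalties(penalty = 1e-5) %>%   # illustrative value
  add_locked_in_constraints("locked_in") %>%
  add_locked_out_constraints("locked_out") %>%
  add_binary_decisions() %>%
  add_default_solver(gap = 0.1, verbose = FALSE)
s <- solve(p)

# Total cost of the prioritization, which should not exceed the budget
total_cost <- eval_cost_summary(p, s[, "solution_1"])$cost
c(budget = sim_budget, cost = total_cost)
```

Setting the budget relative to the cost of the locked-in planning units mirrors the case study, where the budget included the value of existing protected areas, and guarantees that the locked-in units are affordable.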

Benchmark analyses

We conducted a benchmark analysis to examine the performance of solvers for generating prioritizations. We aimed to provide insight into the run times of commercial and open-source solvers when applied to various different problem formulations and sizes. These analyses were completed with the R statistical computing environment (version 4.1.2). Spatial data processing was completed with the terra R package, and prioritizations were generated with the prioritizr R package.

We used the case study data set as the basis for the benchmark analysis. To examine a range of different problem sizes, we created four different versions of the case study data set by resampling it to four different spatial resolutions (i.e., 1 × 1-, 2 × 2-, 4 × 4-, and 8 × 8-km resolution). Thus, we produced four conservation planning data sets that comprised 172,112, 43,028, 10,757, and 2680 planning units, respectively; all data sets comprised 396 features. To ensure that the resampling process did not artificially introduce duplicate values into the resulting data sets, we used bilinear interpolation to resample data. Although the case study data set contained data for existing protected areas and places that are not suitable for protected area establishment, we did not use these data for the benchmark analysis. This is because including such data would substantially reduce run times and produce misleading results.
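The relationship between spatial resolution and problem size can be illustrated with a small sketch. The toy raster and template below are hypothetical and only demonstrate bilinear resampling with the terra R package; they are not the authors' processing workflow. Halving the cell width roughly quadruples the number of planning units, which matches the counts reported above.

```r
# Sketch of bilinear resampling and its effect on problem size
# (toy raster for illustration; not the case study data)
library(terra)

# Fine-resolution raster: 8 x 8 cells, each 1 x 1 unit
fine <- rast(nrows = 8, ncols = 8, xmin = 0, xmax = 8, ymin = 0, ymax = 8)
values(fine) <- seq_len(ncell(fine))

# Coarse template over the same extent: 4 x 4 cells, each 2 x 2 units
template <- rast(nrows = 4, ncols = 4, xmin = 0, xmax = 8, ymin = 0, ymax = 8)

# Resample with bilinear interpolation
coarse <- resample(fine, template, method = "bilinear")

# Doubling the cell width reduces the number of cells by a factor of four
ncell(fine) / ncell(coarse)

# The same ratio holds for the planning unit counts reported in the text
reported <- c(`1 km` = 172112, `2 km` = 43028, `4 km` = 10757, `8 km` = 2680)
reported[-length(reported)] / reported[-1]
```

The ratio for the coarsest data set is only approximately four, presumably because cells along the study area boundary are handled differently at each resolution.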

We performed the benchmark analysis by generating prioritizations based on different combinations of problem formulations, data set sizes, and solvers. Specifically, we generated a total of 120 prioritizations based on different combinations of the four different spatial resolutions of the case study, five different solvers, two different problem formulations, and with three replicates per combination. The five solvers used to generate the prioritizations were Gurobi 10.0.1, IBM CPLEX 22.1.1.0, CBC 2.10.3, HiGHS 1.2.2, and SYMPHONY 5.6.16 via the Rsymphony R package. The two problem formulations used to generate the prioritizations were the minimum set formulation and the minimum shortfall formulation, with a budget representing 10% of the total planning unit costs. Similar to the case study, prioritizations were generated with species representation targets of 20%. All prioritizations were generated with the prioritizr R package 8.0.2 with a 10% optimality gap and restricted to a single CPU core for optimization. We did not fix the random number generator state for the solvers when running the benchmarks. These analyses were completed on a server with an AMD EPYC Processor (with IBPB) (3.0 GHz) with 32 CPU cores and 126 GB RAM. To complete our analyses within a feasible period, each solver was run with a maximum time limit of one week (168 h).
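The benchmark design can be enumerated with base R to confirm the number of runs. This is a sketch of the factorial design described above, with column names chosen for illustration rather than taken from the benchmark code.

```r
# Sketch of the full factorial benchmark design (base R only)
design <- expand.grid(
  resolution_km = c(1, 2, 4, 8),
  solver = c("Gurobi", "IBM CPLEX", "CBC", "HiGHS", "SYMPHONY"),
  formulation = c("minimum set", "minimum shortfall"),
  replicate = 1:3,
  stringsAsFactors = FALSE
)

# Total number of prioritizations: 4 x 5 x 2 x 3
nrow(design)

# Number of runs per solver
table(design$solver)
```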

RESULTS

Case study

The prioritization identified 12,400 km² of priority areas for protected area establishment (Figure 2a,b). These priority areas expanded the existing protected area system from 8880 km² to a total of 21,280 km² and cost approximately US $3911.83 million. When added to the existing protected area system, the priority areas increased the percentage of the species’ distributions covered by protected areas from, on average, 2.62% (ranging from 0% to 17.02% per species) to 11.34% (ranging from 0% to 45.24% per species) (Figure 2c). In the eastern side of the study area, most of the priority areas served to expand the boundaries of existing protected areas, link up existing protected areas, and create entirely new large protected areas. This result is likely because the eastern side of the study area had the lowest costs (Figure 1), enabling the optimization process to improve species’ representation and reduce spatial fragmentation for relatively little cost. In the western side of the study area, most of the priority areas served to create new small protected areas. Although far fewer planning units were selected in the western side than the eastern side (Figure 2a), the western planning units had much higher irreplaceability scores (Figure 2b). This result suggests that these new areas, despite encompassing relatively small areas and having relatively large acquisition costs, are critical for improving overall species’ representation by protected areas. In addition to protected area establishment, other management actions (such as reintroductions, pest management, or restoration) might be needed for species’ recovery.

Figure 2. In Washington state (United States), (a) the existing protected area system and priority areas identified by the prioritization, (b) relative importance scores associated with planning units selected by the prioritization, and (c) percentage of each species’ spatial distribution secured by existing protected areas and the prioritization (i.e., existing protected areas and priority areas combined) (dashed line, representation target). For species associated with multiple seasonal stages, data show the minimum percentage secured.

Benchmark analyses

The benchmark analysis demonstrated that the prioritizr R package could generate prioritizations relatively quickly for large-scale problems by leveraging exact algorithm solvers (Figure 3). The commercial Gurobi and IBM CPLEX solvers had the shortest run times, followed by the open-source CBC and HiGHS solvers. The SYMPHONY solver (via the Rsymphony R package) had the longest run times. The Gurobi and IBM CPLEX solvers each solved the largest version of the conservation planning data set (comprising 172,112 planning units) within 17 min. Although these benchmarks suggest the Gurobi solver takes slightly longer than some other solvers for small-scale minimum set problems, previous benchmarks show that the Gurobi solver can solve more complex versions of the minimum set problem, such as when including boundary penalties, in a much shorter period (Hanson et al., 2023). These findings show that commercial and open-source solvers can be readily applied to moderately sized conservation planning exercises (e.g., comprising 10,000 planning units) and that commercial solvers may be especially useful for complex large-scale exercises (e.g., comprising more than 100,000 planning units).

Figure 3. For Washington state (United States), timings for generating prioritizations with 5 different exact algorithm solvers based on (a) the minimum set formulation and (b) the minimum shortfall formulation (bars, median run times; error bars, minimum and maximum run times; timed out, runs not completed within the 1-week time limit; timings are on a $\log_{10}$ scale).

DISCUSSION

The prioritizr R package advances the science and practice of conservation planning. First, the package provides a flexible framework for generating prioritizations based on multiple problem formulations. As such, conservation scientists can use the same tool to generate and compare prioritizations based on different problem formulations and ensure that differences between prioritizations are not confounded by differences in solver algorithms (e.g., Buenafe et al., 2023). Second, the package can generate prioritizations within a predetermined optimality gap. This is important because it allows conservation scientists to test hypotheses by comparing prioritizations based on different data sets or problem formulations while ensuring that differences between prioritizations are not confounded by differences in optimality (e.g., Rodewald et al., 2019). It also means that conservation practitioners can be confident in the quality of prioritizations, which assuages concerns that running optimization procedures for a longer period would substantially alter results. Third, the package can generate prioritizations relatively quickly from data in a variety of formats. This can help conservation scientists and practitioners rapidly iterate on the design of prioritizations.
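To make the notion of an optimality gap concrete, the following minimal Python sketch checks whether a solution falls within a given relative gap. It assumes the relative gap definition documented for Gurobi's MIPGap parameter, |bound − objective| / |objective|; other solvers may define the gap slightly differently. The function names and the cost values are hypothetical and are not taken from the benchmark.

```python
def relative_gap(objective_value: float, best_bound: float) -> float:
    """Relative gap between a solution's objective value and the best known bound.

    Assumes the definition |bound - objective| / |objective|; solvers differ
    in the exact formula they use.
    """
    if objective_value == 0:
        return 0.0 if best_bound == 0 else float("inf")
    return abs(objective_value - best_bound) / abs(objective_value)


def within_gap(objective_value: float, best_bound: float, gap: float = 0.10) -> bool:
    """Return True if the solution is within the requested relative gap."""
    return relative_gap(objective_value, best_bound) <= gap


# Hypothetical minimum set problem: solution cost and lower bound on the
# cost of the optimal solution (both in US$ millions).
print(within_gap(100.0, 92.0))  # True (gap of about 8%)
print(within_gap(100.0, 85.0))  # False (gap of about 15%)
```

Under this definition, a solver asked for a 10% gap can stop as soon as the best solution found is provably this close to the bound, so running it for longer cannot improve the objective by more than the remaining gap.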

The package complements existing decision support tools. For instance, Marxan generates a diverse portfolio of prioritizations based on a single conservation planning data set. This may be desirable for managers who need a range of potential options and do not require optimality. However, because the prioritizr R package generates individual prioritizations much faster than Marxan (Schuster et al., 2020), it can also provide a range of options that more directly account for the uncertainty in input parameters. This is because the prioritizr R package can quickly generate a portfolio based on prioritizations produced from multiple data sets, such as different climate change scenarios (Buenafe et al., 2023) or different samples from a Bayesian posterior distribution (Rosauer et al., 2017). As another example, although the prioritizr R package provides several approaches to account for connectivity, Zonation has much more advanced methodologies to account for connectivity (see Lehtomäki & Moilanen, 2013). As such, Zonation may present a better option for conservation planning exercises involving species that have highly particular connectivity requirements.

The package has limitations. First, it is not as well suited for scheduling conservation actions over multiple periods as other approaches, such as partially observable Markov decision processes (Chadès et al., 2017). Second, the package requires a commercial solver (Gurobi or IBM CPLEX) for complex large-scale problems (e.g., 500,000 planning units and boundary penalties). Although academics can freely obtain special licenses for these solvers, conservation planners working in governmental or nongovernmental organizations will need to weigh the potential benefits of these solvers against license costs. Indeed, because these commercial solvers can identify cost-effective prioritizations, potentially saving millions of dollars (Schuster et al., 2020), their license costs might only comprise a small proportion of a project budget. Third, the package does not provide functionality for obtaining and preparing data for analysis. Instead, geographic information system (GIS) tools (e.g., ESRI ArcMap and QGIS; ESRI, 2020; QGIS Development Team, 2020) and purpose-built R packages are available for this task (Bivand & Nowosad, 2023). Fourth, errors and biases present in input data (e.g., cost data, species’ distribution data) will affect prioritizations generated by the package (Nolte, 2020b). Fifth, data with very small (e.g., $< 1 \times 10^{-6}$) or very large (e.g., $> 1 \times 10^{6}$) nonzero values can cause numerical issues that degrade performance (see Gurobi Optimization LLC, 2017, for details). Although the package contains automated checks for such issues, input data may need to be rescaled (e.g., converting data from square meters to square kilometers). Finally, because it may be challenging to fully encapsulate all the trade-offs and preferences for a stakeholder group within a single prioritization, the package can help facilitate stakeholder discussions by providing a range of prioritizations based on different trade-off scenarios.
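The rescaling advice in the fifth limitation can be illustrated with a small check. This minimal Python sketch is not the package's automated check: the function names are hypothetical, the thresholds simply mirror the example magnitudes given in the text, and the planning unit areas are invented for illustration.

```python
def needs_rescaling(values, small=1e-6, large=1e6):
    """Return True if any nonzero value has a magnitude below `small` or above `large`."""
    return any(v != 0 and (abs(v) < small or abs(v) > large) for v in values)


def square_meters_to_square_kilometers(values):
    """Convert areas from square meters to square kilometers."""
    return [v / 1_000_000 for v in values]


# Hypothetical planning unit areas expressed in square meters.
areas_m2 = [0.0, 2_500_000.0, 40_000_000.0]
areas_km2 = square_meters_to_square_kilometers(areas_m2)

print(needs_rescaling(areas_m2))   # True
print(needs_rescaling(areas_km2))  # False
print(areas_km2)                   # [0.0, 2.5, 40.0]
```

Changing the units leaves the relative sizes of the planning units unchanged while bringing the values into a range that is less likely to cause numerical issues.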

The prioritizr R package was first released to the Comprehensive R Archive Network (CRAN) in 2016. Since then, it has been used to identify priority areas (Scriven et al., 2020; Visalli et al., 2020), inform best practices (Domisch et al., 2019; Rodewald et al., 2019), and develop new frameworks for novel data sets (Buenafe et al., 2023; Xuereb et al., 2020). It has also been used to help inform decision-making for governmental organizations, such as the Government of Montserrat (Flower et al., 2020), the Scottish Government (G. Campbell, personal communication, 2023), the U.S. Geological Survey (Leopold et al., 2021), and the governments of the Maldives and the Federated States of Micronesia (J. Flower, personal communication, 2023). It has been used by conservation organizations too, such as the Nature Conservancy of Canada (https://ncc.carleton.ca) and the UN Biodiversity Lab (UNDP, 2022). Across terrestrial, marine, and freshwater realms, it has been used to identify priority areas for conserving amphibian, bird, mammal, reptile, fish, insect, and vascular plant species (Jung et al., 2021; Rosauer et al., 2017; Scriven et al., 2020; Visalli et al., 2020). It has been applied at the global scale, involving tens of thousands of species and hundreds of thousands of planning units (Hanson et al., 2020), and at finer scales to provide detailed management recommendations (Flower et al., 2020; Tack et al., 2019). In addition to protected area establishment, it has also been used to identify priority areas for restoring habitats (Bryant et al., 2020), restoring connectivity (Lin et al., 2020), and provisioning ecosystem services (Neugarten et al., 2024; Williams et al., 2020). For more information, we encourage readers to visit the package website for examples, tutorials, and workshop materials (https://prioritizr.net). We believe that it is a useful tool for conservation scientists and practitioners engaged in protecting biodiversity.

ACKNOWLEDGMENTS

We thank Z. Stone for graphic design assistance. This work was supported by an NSERC discovery grant to P.A., Werner and Hilde Hesse, and the North Pacific Landscape Conservation Cooperative. J.O.H. was supported by Environment and Climate Change Canada (ECCC), and R.S. was supported by a Liber Ero Fellowship and ECCC. B.P.M.E. and J.R.B. were supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and ECCC.