Using social media records to inform conservation planning
Article impact statement: Integrating biodiversity data sets from social media sources could substantially improve understanding of the natural world.
Abstract
Citizen science plays a crucial role in helping monitor biodiversity and inform conservation. With the widespread use of smartphones, many people share biodiversity information on social media, but this information is still not widely used in conservation. Focusing on Bangladesh, a tropical megadiverse and mega-populated country, we examined the importance of social media records in conservation decision-making. We collated species distribution records for birds and butterflies from Facebook and Global Biodiversity Information Facility (GBIF), grouped them into GBIF-only and combined GBIF and Facebook data, and investigated the differences in identifying critical conservation areas. Adding Facebook data to GBIF data improved the accuracy of systematic conservation planning assessments by identifying additional important conservation areas in the northwest, southeast, and central parts of Bangladesh, extending priority conservation areas by 4,000–10,000 km2. Community efforts are needed to drive the implementation of the ambitious Kunming–Montreal Global Biodiversity Framework targets, especially in megadiverse tropical countries with a lack of reliable and up-to-date species distribution data. We highlight that conservation planning can be enhanced by including available data gathered from social media platforms.
Registros de las redes sociales para guiar la planeación de la conservación
Resumen
La ciencia ciudadana es importante para monitorear la biodiversidad e informar la conservación. Con el creciente uso de los teléfonos inteligentes, muchas personas comparten información de la biodiversidad en redes sociales, pero todavía no se usa ampliamente en la conservación. Analizamos la importancia de los registros de las redes sociales para las decisiones de conservación enfocados en Bangladesh, un país tropical megadiverso y mega poblado. Cotejamos los registros de distribución de especies de aves y mariposas en Facebook y Global Biodiversity Information Facility (GBIF), las agrupamos en datos sólo de GBIF o datos combinados de Facebook y GBIF e investigamos las diferencias en la identificación de las áreas de conservación críticas. La combinación de los datos de Facebook con los de GBIF mejoró la precisión de las evaluaciones de la planeación de la conservación sistemática al identificar otras áreas importantes de conservación en el noroeste, sureste y centro de Bangladesh, extendiendo así las áreas prioritarias de conservación en unos 4,000-10,000 km2. Se requieren esfuerzos comunitarios para impulsar la implementación de los objetivos ambiciosos del Marco Global de Biodiversidad Kunming-Montreal, especialmente en países tropicales que carecen de datos confiables y actuales sobre la distribución de las especies. Destacamos que la planeación de la conservación puede mejorarse si se incluye información tomada de las redes sociales.
利用社交媒体记录为保护规划提供信息
【摘要】 公民科学在帮助监测生物多样性和提供保护信息方面发挥着至关重要的作用。随着智能手机的广泛使用, 许多人会在社交媒体上分享生物多样性信息, 但这些信息仍未被广泛用于保护。本研究聚焦于孟加拉国这个生物多样性丰富、人口众多的热带国家, 探究了社交媒体记录在保护决策中的重要性。我们整理了来自Facebook和全球生物多样性信息网络(GBIF)的鸟类和蝴蝶物种分布记录, 将其分为纯GBIF数据和GBIF-Facebook整合数据, 并分析了其在确定重点保护区域方面的差异。我们发现, 在GBIF数据的基础上增加Facebook数据, 可以在孟加拉国西北部、东南部和中部地区确定额外的重要保护区域, 将优先保护区扩大了4000-10000平方公里, 提高了系统性保护规划评估的准确性。推动实施雄心勃勃的昆明-蒙特利尔全球生物多样性框架目标需要社会各界的努力, 尤其是在生物多样性高但缺乏最新物种分布可靠数据的热带国家。最后, 本研究强调了将来自社交媒体平台的可用数据纳入保护规划, 可以提升保护规划的效果。【翻译: 胡怡思; 审校: 聂永刚】
INTRODUCTION
Earth's biodiversity is unevenly distributed (Pimm et al., 2014). Despite occupying <2% of Earth's surface, the tropics contain about 50% of global biodiversity, much of which resides in humid forests (Collen et al., 2008). Most tropical countries have high human population densities, substantial socioeconomic disadvantages, and high dependence on forests (Lewis et al., 2015; Newton et al., 2020). In many tropical countries, forests are overexploited or rapidly being converted to agricultural and urban land (Bradshaw et al., 2009; Chowdhury, Alam, Chowdhury, et al., 2021; Chowdhury, Alam, Labi, et al., 2021; Symes et al., 2018). These multifaceted human pressures pose an ongoing existential risk to tropical biodiversity (Malhi et al., 2014).
Protected areas (PAs) are the main tool to safeguard biodiversity from human pressures (Mukul & Rashid, 2017; Watson et al., 2014). They play crucial roles in protecting species and populations from extinction (Chowdhury, Jennions, et al., 2022; Maxwell et al., 2020), and their management can include sustainable land use. The Kunming–Montreal Global Biodiversity Framework (CBD, 2022) includes an ambitious target of expanding the coverage of PAs and other effective area-based conservation measures to 30% of terrestrial and marine areas by 2030, emphasizing area-based conservation approaches as a key means to maintain species and ecosystem functions. The effectiveness of this approach largely depends on maximizing biodiversity protection in PAs, requiring detailed records of the distribution of species. Although such data are often available for Europe and North America, tropical taxa are typically less well sampled (Chowdhury, Aich, et al., 2023; Di Marco et al., 2017; Troudet et al., 2017).
Citizen science is playing a vital role in reducing global biodiversity knowledge gaps (Callaghan et al., 2021, 2022; Chandler et al., 2017; Di Minin et al., 2015; Pocock et al., 2019), and, even in Europe, around 80–90% of biodiversity observational records are collected by dedicated volunteers (Schmeller et al., 2009). Amateur (and professional) naturalists are increasingly taking advantage of expanded internet coverage and the photographic capacity of mobile devices to share their observations online (Andrachuk et al., 2019; Chowdhury, Ahmed, et al., 2023; Chowdhury, Aich, et al., 2023; Marcenò et al., 2021; O'Neill et al., 2023). Consequently, the amount of biodiversity data from citizen science in the Global Biodiversity Information Facility (GBIF) is sharply increasing, although its data are biased toward Europe and North America (Hughes et al., 2021). Due to the increasing popularity of social media (e.g., Facebook, Flickr), millions of people post photographs that contain biodiversity information (Chowdhury, Ahmed, et al., 2023; Chowdhury, Aich, et al., 2023; Toivonen et al., 2019). If these biodiversity observation records can also be captured and mobilized, this could enhance existing knowledge of tropical species distributions and vastly improve conservation assessments (Chowdhury, Aich, et al., 2023; Jarić et al., 2020; Toivonen et al., 2019). Conservation science has so far used social media data in only a limited number of ways: mapping ecosystem services; promoting conservation through marketing and education; monitoring and managing species and ecosystems; and facilitating conservation communication (Di Minin et al., 2015). We focused on a tropical, mega-populated country, Bangladesh, to test whether social media data can directly contribute to conservation decision-making.
Bangladesh is part of the Indo-Burma and Indo-Malayan biodiversity hotspots (Chowdhury, Alam, Labi, et al., 2021; Chowdhury, Fuller, et al., 2023) and is home to many globally charismatic species, including the royal Bengal tiger (Panthera tigris tigris), spoon-billed sandpiper (Calidris pygmaea), and Ganges River dolphin (Platanista gangetica). About 25% of assessed species in Bangladesh are threatened with extinction (IUCN Bangladesh, 2015), and ongoing climate change is significantly affecting the distribution of many species (Chowdhury, 2023). As in many other tropical countries, biodiversity data from Bangladesh are scarce in GBIF (0.0001% of total GBIF records). However, there is an active community of amateur photographers whose images, posted on social media platforms such as Facebook, often contain biodiversity information (Chowdhury, 2023; Chowdhury, Aich, et al., 2023; Sbragaglia et al., 2021, 2023). A recent study captured 7,096 records of butterflies from Bangladesh posted on Facebook, compared with 205 observations on GBIF (Chowdhury, Alam, Chowdhury, et al., 2021).
We considered whether biodiversity distribution records from social media can improve conservation assessments. To test this, we used Bangladesh as a case study and examined whether social media records can inform conservation planning and decision-making. We aimed to demonstrate how social media data can complement and expand existing biodiversity data and, consequently, contribute to real-world conservation planning.
METHODS
Data
We compiled a comprehensive checklist of birds and butterflies of Bangladesh from the most recent national red list data book (871 species total, 566 bird species, 305 butterfly species) (IUCN Bangladesh, 2015). We collected climatic data from the WorldClim database (http://www.worldclim.com/version2) at the finest resolution (0.693 km2; 833 × 833 m). We downloaded the distribution of the current PAs in Bangladesh (UNEP-WCMC, 2021) with the wdpar R package (Hanson, 2022), and the land-cover data came from Copernicus Global Land Service (Buchhorn et al., 2020). The land-cover layer contained 7 major classes: shrublands, herbaceous vegetation, herbaceous wetlands, permanent water bodies, built areas, forests, and croplands.
For the species data, we used 2 approaches. First, we downloaded spatial distribution records for the birds and butterflies of Bangladesh from GBIF with the rgbif package (Chamberlain et al., 2022) in R 4.0.4 (R Core Team, 2021). The GBIF is the largest global biodiversity data infrastructure network, and it compiles occurrence records from various sources—from museum specimens to citizen science records (Heberling et al., 2021). To avoid repetition, we did not collect data from other biodiversity repositories that provide data to GBIF (e.g., iNaturalist).
Second, we collected species distribution records from Facebook from our previous work (Chowdhury, Aich, et al., 2022), following the method described by Chowdhury, Ahmed, et al. (2023). These records were obtained by searching for species distribution records in 2 Facebook groups: Birds Bangladesh (https://www.facebook.com/groups/2403154788) and Butterfly Bangladesh (https://www.facebook.com/groups/488719627817749). In each group, we explored data by species common name obtained from IUCN Bangladesh (2015), double-checked the identification in each photograph, and extracted the species details (taxonomic information, location, date, and photographer). Afterwards, for each observation, we georeferenced the location with Google Maps (https://maps.google.com/). We excluded pictures if the identification was either incomplete (not up to the species level) or erroneously identified in Facebook (and we could not identify the image correctly), if the photograph did not allow clear taxonomic identification, or if the location was unspecified or could not be accurately determined. Some photographers may have shared the same photographs in Facebook and citizen science applications that were then eventually deposited in GBIF; this could have caused duplication. To address this problem, before running the conservation prioritization, we cleaned and spatially thinned the data (see “Data cleaning”).
Although other social media channels can be reliable sources of biodiversity data (Toivonen et al., 2019), we considered only Facebook because it is among the most popular social media channels for photographers in Bangladesh, and the locality information is typically much vaguer in other social media channels (e.g., Twitter, Instagram). When sharing biodiversity photographs in the 2 Facebook groups we used, photographers are required to follow the group rule that the location information must be specified so that group members can evaluate the records (Chowdhury, Ahmed, et al., 2023; Chowdhury, Aich, et al., 2023; Chowdhury, Alam, Labi, et al., 2021).
Data cleaning
We cleaned GBIF data with the CoordinateCleaner R package (Zizka et al., 2019). We removed duplicate records, records with precision uncertainty over 10 km, imprecise coordinates (zero coordinates, integers, records in oceans), and invalid coordinates (specified locality was incompatible with the coordinates given).
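The filters listed above can be illustrated with a minimal Python sketch. It is not the CoordinateCleaner implementation the study used: the `Record` type, the field names, and the rule that flags a record only when both coordinates are whole numbers are assumptions made for illustration. The ocean and locality-consistency checks need spatial reference layers and are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical occurrence record; field names are illustrative."""
    species: str
    lon: float
    lat: float
    uncertainty_m: float  # reported coordinate uncertainty in metres


def is_clean(rec, max_uncertainty_m=10_000):
    """Return True if a record passes the simple coordinate checks."""
    if rec.uncertainty_m > max_uncertainty_m:
        return False  # uncertainty over 10 km
    if rec.lon == 0 and rec.lat == 0:
        return False  # zero coordinates
    if float(rec.lon).is_integer() and float(rec.lat).is_integer():
        return False  # whole-degree coordinates (assumed rounding)
    return True


def clean(records):
    """Drop flagged records, then exact duplicates (order preserved)."""
    seen = set()
    kept = []
    for rec in records:
        if is_clean(rec) and rec not in seen:
            seen.add(rec)
            kept.append(rec)
    return kept


# Made-up records, one per failure mode plus one valid duplicate pair.
records = [
    Record("species_a", 90.41, 23.81, 500),
    Record("species_a", 90.41, 23.81, 500),     # duplicate
    Record("species_a", 0.0, 0.0, 100),         # zero coordinates
    Record("species_a", 91.0, 24.0, 100),       # integer coordinates
    Record("species_a", 91.87, 24.89, 25_000),  # uncertainty > 10 km
]
cleaned = clean(records)
```

Only the first record survives in this toy input; each of the others trips exactly one filter.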
To address sampling bias, we spatially thinned the combined data with the spThin R package (Aiello-Lammens et al., 2015) and considered a single occurrence record at 0.693 km2 (833 × 833 m) resolution for each species. We followed the same process for both species occurrence data (GBIF-only) and combined data (Facebook and GBIF [hereafter combined data]).
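Keeping a single record per species per grid cell can be sketched as follows. This is a grid-based simplification and not the spThin algorithm itself. It assumes coordinates are already in metres in a projected coordinate system; the function name and the choice to keep the first record encountered are illustrative assumptions.

```python
import math

CELL_M = 833.0  # grid cell side length in metres (from the text)


def thin_to_grid(records, cell_m=CELL_M):
    """Keep the first record per (species, grid cell).

    `records` is an iterable of (species, x, y) tuples with x and y
    in metres in a projected coordinate system (an assumption here).
    """
    kept = {}
    for species, x, y in records:
        cell = (math.floor(x / cell_m), math.floor(y / cell_m))
        kept.setdefault((species, cell), (species, x, y))
    return list(kept.values())


# Made-up coordinates for illustration.
records = [
    ("species_a", 100.0, 100.0),
    ("species_a", 500.0, 700.0),  # same cell as the first record
    ("species_a", 900.0, 100.0),  # neighbouring cell
    ("species_b", 120.0, 130.0),  # same cell, different species
]
thinned = thin_to_grid(records)
```

Because thinning is applied per species, two species recorded in the same cell each keep their record.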
We checked collinearity among the WorldClim variables and removed highly correlated (r > 0.75) variables (Appendix S1). In this way, we removed 11 of the 19 climatic variables and kept the following 8 variables for the analyses: annual mean temperature, isothermality, mean temperature of the driest quarter, mean temperature of the coldest quarter, precipitation of the driest month, precipitation seasonality, precipitation of the driest quarter, and precipitation of the warmest quarter. Details on WorldClim variables are available from https://www.worldclim.org/data/bioclim.html.
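One common way to apply such a correlation cutoff is a greedy filter, sketched below in Python. The source does not state how the variable to drop from each correlated pair was chosen. The order-dependent rule here, the use of the absolute value of r, and the variable names and values are therefore all assumptions.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def drop_collinear(variables, threshold=0.75):
    """Keep a variable only if |r| <= threshold with all kept so far.

    `variables` maps a name to its sampled values; earlier entries
    take priority (an assumed tie-breaking rule).
    """
    kept = []
    for name, values in variables.items():
        if all(abs(pearson(values, variables[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Made-up values: var_b is almost a rescaled copy of var_a.
variables = {
    "var_a": [24.0, 25.0, 26.0, 27.0, 28.0],
    "var_b": [48.1, 50.0, 52.2, 54.0, 56.1],
    "var_c": [90.0, 70.0, 95.0, 60.0, 85.0],
}
selected = drop_collinear(variables)
```

In this toy input, `var_b` is dropped because it is almost perfectly correlated with `var_a`, while `var_c` is retained.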
Cleaning PA
We used the wdpar R package (Hanson, 2022) to clean the PA data following a globally accepted method (Butchart et al., 2015). Namely, we reprojected the data into an equal-area coordinate system (World Behrmann; ESRI: 54017), excluded UNESCO biosphere reserves and sites with unknown or proposed status, created buffers around PAs denoted as point localities, and expanded them to their reported extent. We rasterized the protected boundaries at 0.693 km2 (833 × 833 m) resolution with the fasterize R package (Ross, 2020). The cleaned PA data set resulted in boundaries for 42 PAs.
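As a small worked example of expanding a point locality to its reported extent, the sketch below computes the buffer radius for a PA recorded only as a point. It assumes the buffer is a circle whose area equals the reported area. The function name and the example area are hypothetical.

```python
import math


def buffer_radius_km(reported_area_km2):
    """Radius (km) of a circle whose area equals the reported PA area."""
    if reported_area_km2 <= 0:
        raise ValueError("reported area must be positive")
    return math.sqrt(reported_area_km2 / math.pi)


radius = buffer_radius_km(50.0)  # hypothetical 50 km2 protected area
```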
Habitat suitability maps
We fitted Maxent species distribution models to generate habitat suitability maps with the ENMeval package in R (Muscarella et al., 2014). We ran the model separately for GBIF-only and combined data sets.
We fitted species distribution models for each species with 9 predictor variables (8 climatic and 1 land cover), 10-fold cross-validation, and 5,000 randomly generated background records at 0.693 km2 resolution. Because Bangladesh is a small country, we used 5,000 instead of the typical 10,000 background records. We generated folds by overlaying the presence and background records with a spatial grid to control the effects of sampling bias and spatial autocorrelation on model performance. We assigned the records to grid cells and then randomly assigned grid cells to particular folds (Muscarella et al., 2014). To further improve model performance, we performed a calibration procedure by fitting the model under different combinations of parameters. Specifically, we fitted the model under 6 feature class combinations (L, LQ, H, LQH, LQHP, and LQHPT, where L is linear, Q is quadratic, H is hinge, P is product, and T is threshold) and 8 different regularization multipliers (0.5–4 at 0.5 intervals), giving 48 candidate settings per species. Feature classes allow Maxent to develop composite models that fit the data well, whereas regularization multiplier values control model overfitting (Muscarella et al., 2014).
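The calibration grid and the block-wise fold assignment described above can be sketched in Python. This illustrates the logic only and is not the ENMeval code path. The function name, block size, random seed, and example points are assumptions.

```python
import itertools
import random

FEATURE_CLASSES = ["L", "LQ", "H", "LQH", "LQHP", "LQHPT"]
REG_MULTIPLIERS = [0.5 * i for i in range(1, 9)]  # 0.5, 1.0, ..., 4.0

# Every candidate model setting: 6 x 8 = 48 combinations.
TUNING_GRID = list(itertools.product(FEATURE_CLASSES, REG_MULTIPLIERS))


def block_of(x, y, block_size):
    """Index of the grid block containing the point (x, y)."""
    return (int(x // block_size), int(y // block_size))


def assign_spatial_folds(points, block_size, k=10, seed=0):
    """Assign each (x, y) point to a fold through its grid block.

    All points in one block share a fold, so nearby records are not
    split between training and testing data.
    """
    blocks = sorted({block_of(x, y, block_size) for x, y in points})
    rng = random.Random(seed)
    fold_of_block = {block: rng.randrange(k) for block in blocks}
    return [fold_of_block[block_of(x, y, block_size)] for x, y in points]


# Made-up projected coordinates; the first two share a block.
points = [(10.0, 10.0), (12.0, 14.0), (250.0, 30.0), (260.0, 900.0)]
folds = assign_spatial_folds(points, block_size=100.0)
```

Each of the 48 settings in `TUNING_GRID` would be evaluated on the same folds, so that differences in performance reflect the settings and not the data split.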
We evaluated the models with the AUC (area under the receiver operating characteristic [ROC] curve) and chose the best model with the highest AUC score. After identifying the best model for each species, we used them to generate maps of continuous habitat suitability across the study area. We then applied thresholds to convert the continuous values into binary values, resulting in maps that denoted the presence or absence of suitable habitat conditions. The threshold values were specified by maximizing the sum of the sensitivity and specificity statistics (Liu et al., 2016). Because the best models all had an AUC >0.7 (mean = 0.92), we were confident that they were suitable to address the aims of our study. We also checked the suitability distribution of the predicted maps, and the prediction was good, with low omission and commission errors.
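The thresholding rule can be made concrete with a short Python sketch. It assumes that specificity is computed against background records (the study had no true absences) and that candidate thresholds are the observed suitability scores. The function name and the scores are made up for illustration.

```python
def max_sss_threshold(presence_scores, background_scores):
    """Return the threshold maximizing sensitivity + specificity.

    A cell counts as suitable when its score is >= the threshold.
    Ties are resolved in favour of the lowest threshold.
    """
    best_threshold, best_sum = None, -1.0
    for t in sorted(set(presence_scores) | set(background_scores)):
        sensitivity = sum(s >= t for s in presence_scores) / len(presence_scores)
        specificity = sum(s < t for s in background_scores) / len(background_scores)
        if sensitivity + specificity > best_sum:
            best_threshold, best_sum = t, sensitivity + specificity
    return best_threshold


# Made-up suitability scores.
presence = [0.9, 0.8, 0.7, 0.4]
background = [0.1, 0.2, 0.3, 0.6]
threshold = max_sss_threshold(presence, background)

# Convert continuous suitability to a binary presence/absence map.
binary = [int(s >= threshold) for s in presence + background]
```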
We used the binary habitat suitability maps for subsequent analysis. We had 470 (GBIF-only data) and 698 (combined data) species in our final analyses. We extracted built areas from the land-cover map and removed suitable habitats in these areas for each species for both the GBIF-only and the combined data sets.
PA coverage
To evaluate the extent to which existing PAs in Bangladesh overlapped with biodiversity, we overlaid the species’ binary habitat suitability maps with the PA data and measured the percentage of suitable habitats occurring in existing PAs. Afterwards, we compared the percent level of coverage with a target threshold (termed representation target). We set the targets following a modified version of standard practices for global analysis (Butchart et al., 2015). We set the target at 100% for species with a distribution of <1,000 km2 and 10% for those with 148,460 km2 (country area). For the intermediate values, we log-linearly interpolated the targets with the prioritizr R package (Hanson et al., 2022).
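The target rule can be written out as a small function. The sketch below is a Python illustration of log-linear interpolation between the 2 anchor points given above, not a call into prioritizr. The function and parameter names are assumptions.

```python
import math


def representation_target(range_km2, small_km2=1_000.0, large_km2=148_460.0,
                          small_target=1.0, large_target=0.1):
    """Representation target (proportion of suitable habitat).

    Ranges at or below `small_km2` get `small_target`, ranges at or
    above `large_km2` get `large_target`, and values in between are
    interpolated linearly on the logarithm of the range size.
    """
    if range_km2 <= small_km2:
        return small_target
    if range_km2 >= large_km2:
        return large_target
    fraction = (math.log(range_km2) - math.log(small_km2)) / (
        math.log(large_km2) - math.log(small_km2))
    return small_target + fraction * (large_target - small_target)


# Hypothetical range sizes in km2.
for area in (500, 1_000, 12_184, 148_460):
    print(area, round(representation_target(area), 3))
```

Under this rule, a species whose suitable habitat equals the geometric mean of the 2 anchor areas (about 12,184 km2) receives a target halfway between 100% and 10%.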
Spatial conservation prioritization
We identified priority areas that most efficiently fill shortfalls in the existing PA system based on GBIF-only data and combined data. For this, we generated a single prioritization based on the minimum set formulation of the reserve selection problem, where the grid cells were used as planning units (Schuster et al., 2020). We generated these prioritizations with the species’ binary habitat suitability maps and the representation targets we used in the previous step to assess the performance of existing PAs. As such, the prioritization was constrained to meet the representation targets for every species assessed. To account for opportunity costs associated with implementing conservation areas, we considered the human footprint index (Venter et al., 2018) as a proxy for cost data (Butchart et al., 2015). Additionally, to ensure that priority areas complement existing PAs, existing PAs were locked in. These analyses were completed with an optimality gap of 10% with the prioritizr R package (Hanson et al., 2022) and Gurobi (version 8.1.0; Gurobi Optimization, LLC, 2021). We used Gurobi because this is the fastest way to generate prioritization (see Schuster et al. [2020] for details). After generating the prioritization, we overlaid it with land-cover data to facilitate interpretation.
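To make the minimum set formulation concrete, the sketch below solves a toy instance by brute force in Python: choose the cheapest set of planning units that meets every species target while keeping existing PAs locked in. It is an illustration only. The study solved the full problem with prioritizr and Gurobi to a 10% optimality gap, and all names, costs, and habitat values here are hypothetical.

```python
from itertools import combinations


def minimum_set(costs, habitat, targets, locked_in=()):
    """Brute-force minimum-set reserve selection for a toy problem.

    costs:     cost of each planning unit
    habitat:   habitat[s][i] = amount of species s's habitat in unit i
    targets:   amount of habitat required for each species
    locked_in: units that must be selected (e.g., existing PAs)
    Returns (selected units, total cost), or None if infeasible.
    """
    units = range(len(costs))
    free = [i for i in units if i not in locked_in]
    best = None
    for size in range(len(free) + 1):
        for extra in combinations(free, size):
            chosen = set(locked_in) | set(extra)
            meets_targets = all(
                sum(row[i] for i in chosen) >= target
                for row, target in zip(habitat, targets))
            cost = sum(costs[i] for i in chosen)
            if meets_targets and (best is None or cost < best[1]):
                best = (sorted(chosen), cost)
    return best


costs = [4, 1, 3, 2]      # hypothetical human-footprint costs
habitat = [[1, 1, 0, 0],  # species A occupies units 0 and 1
           [0, 0, 1, 1]]  # species B occupies units 2 and 3
targets = [1, 1]          # each species needs one unit of habitat
solution = minimum_set(costs, habitat, targets, locked_in=(0,))
```

Exhaustive search is feasible only for a handful of units. With grid cells as planning units, the problem needs an integer programming solver, which is why the analysis relied on Gurobi.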
To identify the most important priority areas in the analysis, we ran an irreplaceability analysis for each planning unit selected in a solution with the prioritizr R package (Hanson et al., 2022). While running the irreplaceability analysis, to quantify the importance of planning units, we used the Ferrier score (Ferrier et al., 2000).
To test other scenarios, we ran the spatial prioritization and irreplaceability analysis in 3 more combinations: only with the species suitability maps (excluding cost layers); only with species for which we obtained suitability maps based on both GBIF-only and combined data approaches without a cost layer (also known as biodiversity-only approach); and only with species for which we obtained habitat maps based on both GBIF-only and combined data approaches with the human footprint index cost layer.
Both GBIF data (GBIF, 2022) and Facebook data (Chowdhury, Aich, et al., 2022) are publicly available. All the R scripts are available in the following public GitHub repository: https://github.com/ShawanChowdhury/SocialMedia_ConservationPlanning.
RESULTS
Data distribution
Our cleaned combined data set included 47,077 georeferenced records for 472 species of birds (41,476 records) and 226 species of butterflies (5,601 records). We obtained 49% of the records from GBIF (n = 22,885), covering 540 species (428 birds and 112 butterflies), and 51% of the records from Facebook, which added 158 species absent from GBIF (44 birds and 114 butterflies). The contribution of Facebook data varied substantially across species and taxa (Figure 1a; Appendix S2). For butterflies, the average number of occurrence records per species rose from 2 to 25 after including Facebook data, and GBIF data never represented more than 21% of species’ records (Figure 1a). For birds, the inclusion of Facebook records raised the average number of occurrence records per species from 48 to 88. Although there were no butterfly species with GBIF-only distribution data, there were 18 bird species for which we obtained data only from GBIF (Figure 1a).
Habitat suitability maps
The overall model performance was good with both data sets. After including Facebook data, the average AUC score only increased from 0.92 to 0.93. However, using GBIF-only data led to 228 species (33%) not being included in the modeling due to a limited number of spatial observation records. This was especially true for butterflies (161 species, 71% of butterflies), less so for birds (67 species, 14% of birds) (Figure 1b).
Spatial conservation prioritization
With GBIF-only data, the spatial prioritization process identified that 37.45% of the country's area was required for birds (55,600.51 km2) and 28.12% for butterflies (41,746.95 km2) to meet the target conservation coverage (see METHODS). After adding Facebook data, the prioritization process identified 40.14% of the area for birds (3,987.81 km2 increase) and 34.46% of the area for butterflies (9,410.66 km2 increase). When we ran the prioritization process without the cost layer, the increase in identified important conservation areas after adding Facebook data was similar for birds (4,106.94 km2) and slightly lower for butterflies (7,637.95 km2) (Appendix S3).
For birds, with GBIF-only data, the prioritization process missed many important areas in the north, northeast, and southeast compared with the combined data. However, when considering the current land-cover patterns across Bangladesh (Figure 2a), there were no substantial differences in the proportion of land-cover type selection. For butterflies, priority areas identified using GBIF-only data also missed many areas in the northwest, southeast, and central parts of Bangladesh. However, similar to birds, there were few differences in land-cover types between the 2 schemes. In both GBIF-only and combined data, the proportion of priority areas was highest for croplands and forests and lowest for shrublands and herbaceous vegetation (Figure 2a–c; Appendix S4).
When testing whether the prioritization process identified more areas simply due to the inclusion of more species, we found that, compared with the GBIF-only data, the number of important conservation areas identified by the prioritization process was slightly higher for birds and lower for butterflies (Figure 2b,c; Appendix S5). The result was similar when identifying important conservation areas based only on species common to both data sets (Appendix S6).
Given our definition of a cost surface based on the human footprint (HFP) index, the priority areas were primarily distributed in places with a low level of anthropogenic impact (see METHODS) based on both GBIF-only and combined data for birds and butterflies. However, with the inclusion of Facebook data, the mean and median HFP index of the priority areas increased slightly (Appendix S7). For butterflies, the mean HFP index of the priority areas increased from 12.75 to 13.84 and the median from 11.63 to 14.25. For birds, the mean increased from 13.66 to 14 and the median from 14.26 to 15.79.
Irreplaceability score
Most priority areas had relatively low irreplaceability, but the scores increased after adding Facebook data. For birds, 25% of priority areas had a score >0.00135 with combined data, compared with >0.00125 with GBIF-only data (Figure 3a,b). For butterflies, 25% of priority areas had an irreplaceability score >0.0006 with combined data, compared with >0.00016 with the GBIF-only data (Figure 3c,d). The results were broadly similar when we ran the irreplaceability analysis without a cost layer (Appendix S8), with species common to both GBIF-only and combined data sets and with the HFP index (Appendix S9), and with species common to both GBIF-only and combined data sets but without the HFP index (Appendix S10).
Although additional Facebook data resulted in little difference in identifying the most crucial priority areas (top 10%) for birds (Figure 3a,b), we obtained marked differences for butterflies (Figure 3c,d). With the addition of Facebook records, for birds, irreplaceability scores increased in central, northwestern, and southeastern Bangladesh (Figure 3a,b), whereas, for butterflies, the irreplaceability scores increased substantially in the northeast and east (Figure 3c,d).
DISCUSSION
Bangladesh, like many tropical countries, is highly biodiverse. Yet, knowledge of most of its species’ distribution is limited (Chowdhury, Aich, et al., 2023; Chowdhury, Fuller, et al., 2023; IUCN Bangladesh, 2015). The ubiquitous availability of digital phones and cameras creates abundant opportunities for people in less-represented countries to post their biodiversity photographs on social media. We found that data obtained from social media had a significant capacity to inform important conservation decision-making (priority areas identified increased by 4,000–10,000 km2). Our prioritization scheme that included the Facebook data identified more areas, especially due to the inclusion of more species in the analysis. The number of identified conservation priority areas increased sharply after adding Facebook data, their locations changed, and there were marked differences in the most irreplaceable areas (between schemes with and without these data), especially for butterflies.
The increasing popularity of citizen science has greatly improved our understanding of species distributions in recent years. There has been a 12-fold increase in biodiversity data in GBIF since 2007 (Heberling et al., 2021), albeit mostly from the Global North (Hughes et al., 2021; Slade & Rui Ong, 2023). In countries like Bangladesh, with a lack of natural history museums and systematic monitoring schemes, citizen-derived data can play an important role in boosting the volume of biodiversity data. When we included Facebook data, our occurrence records doubled. Remarkably, data for about half of the butterfly species (114 of 226) were available only from Facebook. During our initial data collation for birds in GBIF, we did not collect their original data source; however, based on a random check, the majority came from eBird, and the contribution of museum data was negligible. Bangladesh has many active eBird users, but there are not many butterfly enthusiasts that use specialized butterfly citizen science applications, which probably caused the difference in the number of records between birds and butterflies (41,476 vs. 5,601 in the combined data set, respectively [Chowdhury, Aich, et al., 2022]).
By including observation records from Facebook, our systematic conservation planning approaches identified many new important areas in the northeastern and southeastern parts of Bangladesh. Although these areas are home to many charismatic species and lie in biodiversity hotspots, their importance for biodiversity conservation remains unnoticed. These areas are occupied mainly by indigenous communities and are distant from metropolitan areas, and most residents are unfamiliar with citizen science applications and dedicated biodiversity monitoring schemes (Chowdhury, Fuller, et al., 2023). Many people living in these areas are, however, Facebook users. Moreover, wildlife photographers from other parts of Bangladesh often visit these regions and share their photographs on Facebook. Therefore, our results highlight the great utility of combining biodiversity repositories and social media data for conservation monitoring and planning, across scales, especially in less-monitored regions (Kelling et al., 2019).
Our results should be interpreted with caution. Citizen science data are highly spatially biased and largely centered around major cities, which might have an impact on our results. However, we followed a range of approaches to control the survey bias (e.g., spatial thinning) and model prediction (e.g., the checkerboard2 evaluation method to control biased sampling). Besides, the scope of our analyses does not include guiding PA expansion in Bangladesh; rather, we carried out an academic exercise to assess the possible role of diversified biodiversity data sources in spatial conservation prioritization. PA planning requires the consideration of complex sociopolitical constraints, in addition to knowledge of biodiversity distribution. Still, our results can be used by decision-makers in Bangladesh to identify new areas for biodiversity conservation within Bangladesh's forest coverage map developed by the Bangladesh Forest Department. Furthermore, this knowledge can support future PA planning and expansion in situations where large amounts of biodiversity data might not (yet) be available from international repositories such as GBIF, as is the case for Bangladesh.
Although social media can play an important role in supporting biodiversity conservation assessments, there remain considerable challenges to capturing and collating such data. First, although biodiversity data can be harvested from different social media channels (Flickr, Twitter), we used only Facebook in our study because Facebook groups in Bangladesh are regularly monitored by group moderators, unlike many other social media platforms. Second, capturing biodiversity data from Facebook is very time-consuming, taking about 380 hours to harvest data for all 680 species in our study (Chowdhury, Ahmed, et al., 2023; Chowdhury, Aich, et al., 2022, 2023). Third, Facebook photographs do not contain specific geolocation information, resulting in frequent coordinate uncertainty when georeferencing. Finally, accurate species identification from Facebook photographs requires high-quality pictures and a high level of taxonomic expertise.
Taking photographs rich in taxonomic information is difficult, and many species consequently remain unidentified. To enhance semistructured monitoring in citizen science (Kelling et al., 2019), Facebook group moderators could help train recorders, and photos could have automated GPS records attached. In addition, citizen science records could be enhanced using novel technologies, such as camera traps and artificial intelligence for automated image recognition (van Klink et al., 2022). Furthermore, platforms more narrowly dedicated to recording biodiversity data (e.g., eButterfly, Flora Incognita, iNaturalist) could be used to augment Facebook (and other social media) data. In turn, information from these dedicated sources could be used to develop and train deep-learning image classification and identification models (Jarić et al., 2020), especially for lesser-known tropical species. Overall, promoting the importance of citizen science in biodiversity conservation and the broader availability of digital apps can generate extensive data from remote areas. Moreover, citizen science can also heighten awareness of biodiversity and help engender a sense of social responsibility or social license for conservation (Kelly et al., 2019).
Our understanding of tropical biodiversity remains limited. Yet, with the increasing popularity of mobile phones and social media platforms, millions of users habitually share valuable biodiversity information through photographs. Such information, if carefully harvested and collated, could significantly decrease the Wallacean shortfall (Hortal et al., 2015). Biodiversity monitoring needs a culture of integration (Kühl et al., 2020; Chowdhury, Fuller et al., 2023), and it is important to align different data sources. With the addition of biodiversity data collected from Facebook (or similar sources), knowledge of many range-restricted species can be significantly improved and inform more effective conservation. Although our study focused on Bangladesh, its methods could be applied to many tropical developing countries with sufficiently good internet penetration and an active culture of social media use. The Kunming–Montreal Global Biodiversity Framework prioritizes area-based conservation approaches, placing a premium on rapidly improving knowledge of species distributions. Combining data from multiple repositories, including social media, should thus be a priority to improve the quality of large-scale conservation planning. In short, if the limitations of capturing, cleaning, and collating biodiversity information from social media platforms can be overcome, there is an enormous potential for improving biodiversity conservation globally and especially in tropical megadiverse countries.
ACKNOWLEDGMENTS
We sincerely thank all contributors of data to Facebook and GBIF employed in analyses in this manuscript. S.C. and A.B. gratefully acknowledge the support of the German Centre for Integrative Biodiversity Research (iDiv) and the sMon project funded by the German Research Foundation (DFG-FZT 118, 202548816). R.A.C. acknowledges personal funding from the Academy of Finland (#348352) and the KONE Foundation (#202101976). V.S. is supported by a Ramón y Cajal research fellowship (RYC2021-033065-I) granted by the Spanish Ministry of Science and Innovation.
Open access publishing facilitated by The University of Queensland, as part of the Wiley - The University of Queensland agreement via the Council of Australian University Librarians.