Inventory statistics meet big data: Complications for estimating numbers of species

Biodiversity Institute, University of Kansas, Lawrence, Kansas, United States
Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, KS, United States
Centro de Agroecología y Ambiente, Benemérita Universidad Autónoma de Puebla, Puebla, Puebla, Mexico
DOI
10.7287/peerj.preprints.27965v1
Subject Areas
Biodiversity, Conservation Biology
Keywords
Chao estimator, Data quality, Species richness, Virtual biotas
Copyright
© 2019 Khalighifar et al.
Licence
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
Cite this article
Khalighifar A, Jiménez L, Nuñez-Penichet C, Freeman B, Ingenloff K, Jiménez-García D, Peterson AT. 2019. Inventory statistics meet big data: Complications for estimating numbers of species. PeerJ Preprints 7:e27965v1

Abstract

We point out complications inherent in biodiversity inventory metrics when they are applied to large-scale datasets. As sampling accumulates, the number of samples in which each species is detected saturates, such that the crucial counts of species detected only rarely approach zero. Even rare errors can then come to dominate species richness estimates, creating upward biases in estimates of species numbers. We document the problem via simulations of sampling from virtual biotas, illustrate its potential using a large empirical dataset (bird records from Cape May, New Jersey, USA), and outline the circumstances under which these problems may be expected to emerge.
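The mechanism described above can be illustrated with a minimal sketch. It assumes the incidence-based Chao2 estimator, with the usual bias-corrected form when no species is detected in exactly two samples; the abstract does not say which Chao variant the authors use. The function name `chao2`, the toy biota of 100 species, and the 10 erroneous records are all hypothetical values chosen for illustration, not data from the paper.

```python
def chao2(incidence_counts, m):
    """Chao2 richness estimate from per-species incidence counts.

    incidence_counts: for each species, the number of samples in which
                      it was detected (zeros are ignored).
    m:                total number of samples.
    Assumption: the classic form is used when Q2 > 0, and the
    bias-corrected form when Q2 == 0.
    """
    counts = [c for c in incidence_counts if c > 0]
    s_obs = len(counts)                      # species observed
    q1 = sum(1 for c in counts if c == 1)    # detected in exactly 1 sample
    q2 = sum(1 for c in counts if c == 2)    # detected in exactly 2 samples
    factor = (m - 1) / m
    if q2 > 0:
        return s_obs + factor * q1 * q1 / (2 * q2)
    return s_obs + factor * q1 * (q1 - 1) / 2


# Hypothetical saturated inventory: 100 real species, each detected in
# 500 of 1000 samples, so there are no singletons or doubletons.
m = 1000
clean = [500] * 100
clean_estimate = chao2(clean, m)

# The same inventory plus 10 erroneous records (e.g. misidentifications),
# each appearing as a "species" detected in a single sample.
with_errors = clean + [1] * 10
error_estimate = chao2(with_errors, m)
```

In this toy case the clean inventory yields an estimate equal to the 100 species observed, whereas the 10 erroneous singletons raise the estimate well beyond the 110 "species" recorded, because the estimator treats every singleton as evidence of further undetected species.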

Author Comment

This is a submission to PeerJ for review.

Supplemental Information