Methods

LifeWatch ERIC

This page documents the Python re-implementation of Soroye et al. 2020’s analysis pipeline. The full code lives in soroye_port/ in the repository (Zenodo concept DOI 10.5281/zenodo.19756173).

The five-script pipeline

Each Python script is a port of one of the R scripts in Soroye’s Figshare release (Soroye et al., 2020).

| Script | R original | Purpose |
| --- | --- | --- |
| 01_clean_data.py (Phase 2) | 1_Cleandata_and_makeMCPs.R | Continental Bombus cleaning |
| 01_clean_data_iberia.py (Phase 3) | adapted | Iberian GBIF cleaning |
| 02_presence_absence.py | 2_CalcSpeciesPr_Rich.R | 100 km equal-area presence/absence rasters |
| 03_sampling_continent.py | 3_CalcSamplingEffort_Cont.R | Per-cell sampling effort + continent |
| 04_climate_tei_pei.py | climate-position helpers | Per-species TEI / PEI |
| 05_regression.py | 5_binomialGLMM4Presence.R (bambi/MCMC) | Mixed-effects GLMM, MCMC |
| 05b_regression_statsmodels.py | fallback | Variational Bayes mixed-effects GLMM |

All scripts 02–05 read an OUT_SUBDIR environment variable that controls whether they operate on Phase 2 (outputs/) or Phase 3 (outputs_iberia/) intermediates. Only the 01_clean_*.py script differs between phases.
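As a minimal sketch of how a script might pick up this setting: the helper name `resolve_output_dir` and the fallback to `outputs` when the variable is unset are assumptions for illustration, not taken from the repository.

```python
import os
from pathlib import Path


def resolve_output_dir(base=Path("."), default="outputs"):
    """Return the directory holding intermediates for the selected phase.

    OUT_SUBDIR is the variable named in the text; the default value and
    this helper itself are hypothetical.
    """
    subdir = os.environ.get("OUT_SUBDIR", default)
    return Path(base) / subdir


# Select the Phase 3 (Iberian) intermediates for this process.
os.environ["OUT_SUBDIR"] = "outputs_iberia"
phase3_dir = resolve_output_dir()
```

On the command line the same selection would typically be made by setting `OUT_SUBDIR` before invoking a script, so the scripts themselves stay identical between phases.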

The 100 km cylindrical equal-area grid

Following Soroye 2020:

Continent assignment: lon < −25 ⇒ North America, otherwise Europe (Iberia is naturally classified as Europe).
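The continent rule can be sketched as a one-line function. The name `assign_continent` is hypothetical; only the −25° longitude threshold and the two labels come from the text.

```python
def assign_continent(lon: float) -> str:
    """Classify a cell by longitude in decimal degrees (west is negative)."""
    # Strictly west of 25°W is North America; everything else is Europe.
    return "North America" if lon < -25.0 else "Europe"


madrid = assign_continent(-3.7)     # an Iberian longitude
toronto = assign_continent(-79.4)   # a North American longitude
```

A cell lying exactly on −25° falls on the Europe side because the comparison is strict.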

Periods and seasons

Thermal Exposure Index (TEI)

The original paper’s TEI is the rate at which monthly maximum temperatures exceed the species-specific historical maximum. Mathematically, for species s in cell c:

By linearity of CPI, this is equivalent to computing CPI from the period-mean temperatures rather than averaging per-year CPI values. The Python port uses the linearised form for speed; the result is identical to within floating-point precision.
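The equivalence can be checked numerically. The sketch below assumes CPI is an affine rescaling of temperature between species-specific lower and upper limits; that functional form, the function name `cpi`, and all numbers are illustrative assumptions rather than values from the pipeline.

```python
from statistics import fmean


def cpi(temp, t_min, t_max):
    """Assumed affine climate position: 0 at t_min, 1 at t_max."""
    return (temp - t_min) / (t_max - t_min)


# Made-up yearly temperatures for one species in one cell (degrees C).
yearly_temps = [21.5, 23.0, 22.1, 24.4]
t_min, t_max = 5.0, 30.0  # made-up species limits

# Per-year CPI averaged over the period.
mean_of_cpi = fmean(cpi(t, t_min, t_max) for t in yearly_temps)

# Linearised form: CPI of the period-mean temperature.
cpi_of_mean = cpi(fmean(yearly_temps), t_min, t_max)
```

The two quantities agree up to rounding error for any affine index; the identity would not hold for an index that clips or thresholds temperatures before averaging.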

PEI is the analogous Precipitation Exposure Index.

Mixed-effects logistic GLMM

The model fitted on the species × cell observation matrix is:

extinction ~ continent
           + sc_sampling
           + sc_TEI_bs + sc_TEI_delta + sc_TEI_bs:sc_TEI_delta
           + sc_PEI_bs + sc_PEI_delta + sc_PEI_bs:sc_PEI_delta
           + sc_TEI_bs:sc_PEI_bs
           + sc_TEI_delta:sc_PEI_delta
           + (1 | species)

sc_* denotes z-score standardisation (Bessel-corrected, i.e. ddof=1, matching R’s scale()). Soroye’s R code uses MCMCglmm with full Markov-chain Monte Carlo. This work uses statsmodels.BinomialBayesMixedGLM with variational Bayes, a fast approximation that preserves point estimates and signs but typically underestimates credible-interval widths by 10–20 %.
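A minimal sketch of the standardisation step, assuming NumPy is available; the helper name `zscore` and the sample values are hypothetical.

```python
import numpy as np


def zscore(values):
    """Centre and scale with the sample standard deviation (ddof=1)."""
    x = np.asarray(values, dtype=float)
    # ddof=1 applies Bessel's correction, as R's scale() does;
    # NumPy's default of ddof=0 would give slightly larger z-scores.
    return (x - x.mean()) / x.std(ddof=1)


z = zscore([2.0, 4.0, 6.0, 8.0])
```

After this transform the sample standard deviation of `z` is 1, whereas `np.std(z)` with its default `ddof=0` is below 1, which is the mismatch described under the v0.2.0 fixes.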

Definitions of the response variable

Following the R code (5_binomialGLMM4Presence.R):

Three discrepancies between an early simplified port and Soroye’s R code were caught and fixed in v0.2.0:

  1. Sampling effort: the samp column must sum across all 6 period-season rasters (stackApply(... rep(1, 6), sum) in the R), not baseline only.

  2. z-scoring: numpy.std defaults to ddof=0; R’s scale() uses Bessel-corrected ddof=1. The Python port now matches R.

  3. Extinction definition: persistence (bs=1, rc=1) AND colonisation (bs=0, rc=1) are both extinction = 0. An early version only counted persistence as 0, which inflated extinction rate from 51 % to 60 %.
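The corrected response coding from item 3 can be sketched as follows. The function name `code_extinction` is hypothetical, and returning `None` for a species absent in both periods is an assumption, since the text does not say how that case is handled.

```python
def code_extinction(bs: int, rc: int):
    """Map baseline (bs) and recent (rc) presence flags to the response.

    Returns 0 for persistence or colonisation, 1 for local extinction,
    and None when the species was absent in both periods (assumed to be
    excluded from the model matrix).
    """
    if rc == 1:
        return 0      # present recently: persistence or colonisation
    if bs == 1:
        return 1      # present at baseline, absent recently
    return None       # never observed in this cell


codes = [code_extinction(bs, rc) for bs, rc in [(1, 1), (0, 1), (1, 0), (0, 0)]]
```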

GBIF data acquisition (Phase 3 only)

The Iberian Bombus dataset is fetched via the GBIF Occurrence Download API in notebooks/01b_gbif_download_doi.py:

The script submits the download, polls until completion, and saves the DOI + metadata to data/gbif_bombus_iberia_metadata.json. Re-running it mints a new DOI rather than re-downloading the original; the existing DOI remains valid for citation.
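The submit-poll-save flow can be sketched generically. Here `fetch_status` is a hypothetical callable standing in for whatever request returns the download's metadata, and the status strings are assumed to follow the GBIF download API's values; this is not the code in the repository script.

```python
import json
import time
from pathlib import Path


def wait_for_download(fetch_status, poll_seconds=30.0, max_polls=240):
    """Poll until the download succeeds, fails, or the poll budget runs out.

    fetch_status: hypothetical callable returning a dict with a "status" key.
    """
    for _ in range(max_polls):
        meta = fetch_status()
        status = meta.get("status")
        if status == "SUCCEEDED":
            return meta
        if status in {"FAILED", "KILLED", "CANCELLED"}:
            raise RuntimeError(f"download ended with status {status}")
        time.sleep(poll_seconds)  # still preparing or running
    raise TimeoutError("download did not finish within the poll budget")


def save_metadata(meta, path):
    """Write the DOI and other metadata to a JSON file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(meta, indent=2))
```

Injecting `fetch_status` keeps the polling logic testable without network access or GBIF credentials; the returned dict can then be passed to `save_metadata` with the metadata path named above.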

Reproducibility

References
  1. Fouilloux, A. (2026). WeatherXBiodiversity: Soroye et al. (2020) Replication for Iberian Bombus. Zenodo. 10.5281/ZENODO.19756173
  2. Soroye, P., Newbold, T., & Kerr, J. T. (2020). Climate change contributes to widespread declines among bumble bees across continents - DATA REPOSITORY. figshare. 10.6084/M9.FIGSHARE.9956471
  3. GBIF.org User. (2026). Occurrence Download. The Global Biodiversity Information Facility. 10.15468/DL.3FRMSQ