Incorporating plant community structure in species distribution modelling: a species co-occurrence based composite approach
Species distribution modelling (SDM) with remotely sensed (RS) imagery is widely used in ecological studies and conservation planning, but its performance is frequently limited by factors including small plant size, small numbers of observations, and scattered distribution patterns. The focus of my thesis was to develop and evaluate alternative SDM methodologies to deal with such challenges. I used occurrence records of nine endemic species from the Athabasca Sand Dunes in northern Saskatchewan to assess five modelling algorithms, including modern regression and machine learning techniques, and to understand how species distribution characteristics influence model prediction accuracy. All modelling algorithms performed robustly (AUC > 0.5), with the best performance in most cases from generalized linear models (GLM). The analysis of threshold selection for presence-absence classification showed that actively selecting the optimum threshold is preferable to the standard high-threshold approach, because the latter can deliver predictions that are inconsistent with observed patterns of occurrence frequency. The composite-SDM framework was developed using small-scale plant occurrence data and UAV imagery from Kernen Prairie, a remnant fescue prairie in Saskatoon, Saskatchewan. The evaluation of the five algorithms clearly showed that each method was capable of handling species ranging from low to high frequency, with strong GLM performance irrespective of the species distribution pattern. It is important to highlight that, although GLM is computationally efficient, the method does not compromise accuracy for simplicity. Including plant community structure through image clustering methods produced similar accuracy patterns, indicating limited advantages of using high-resolution images. The study also found that prediction accuracy for high-frequency species can decline to levels as low as those expected for low-frequency species.
Higher prediction confidence was often observed for low-frequency species when the species occurred in a habitat that was visually and spectrally distinct from its surroundings. This pattern contrasts with species that are widespread across different grassland habitats, where distinct spectral signatures were lacking. The study provides substantial evidence that optimal algorithmic performance is tied to a balanced number of presences and absences in the data. The co-occurrence analysis also revealed that significant co-occurrence patterns are most common at moderate species occurrence frequencies. The research does not indicate any consistent accuracy changes between the baseline direct-reflectance models and the composite-SDM framework. Although accuracy changes with the composite-SDM framework were marginal, the method can influence the associated type I and type II error rates of the classification.
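As an illustration of the threshold comparison described above, the following minimal Python sketch contrasts a fixed cut-off with a data-driven one. It is not the procedure used in the thesis: the 0.5 cut-off, the use of the true skill statistic (sensitivity + specificity - 1) as the selection criterion, the toy data, and all function names are assumptions made for the example.

```python
# Hypothetical sketch: fixed versus data-driven presence-absence thresholds.
# Criterion (true skill statistic), data, and names are illustrative assumptions.

def confusion_counts(y_true, scores, threshold):
    """Return (tp, fp, tn, fn) when scores >= threshold are called presences."""
    tp = fp = tn = fn = 0
    for observed, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and observed == 1:
            tp += 1
        elif predicted == 1 and observed == 0:
            fp += 1  # type I error: predicted presence, observed absence
        elif predicted == 0 and observed == 0:
            tn += 1
        else:
            fn += 1  # type II error: predicted absence, observed presence
    return tp, fp, tn, fn


def true_skill_statistic(y_true, scores, threshold):
    """Sensitivity + specificity - 1 at the given threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, scores, threshold)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity + specificity - 1.0


def best_threshold(y_true, scores):
    """Pick the observed score that maximises the true skill statistic."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: true_skill_statistic(y_true, scores, t))


# Toy data for a low-frequency species: 3 presences among 10 plots,
# with predicted probabilities that mostly stay below 0.5.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.60]

fixed = 0.5
selected = best_threshold(y_true, scores)

print("fixed threshold:", fixed, confusion_counts(y_true, scores, fixed))
print("selected threshold:", selected, confusion_counts(y_true, scores, selected))
```

In this toy case the fixed cut-off misses most presences, while the selected threshold trades a small increase in false presences for fewer false absences, which is the kind of shift in error rates the abstract refers to.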
species distribution modelling, remote sensing, co-occurrence, habitat distribution modelling, generalized linear modelling, Kernen Prairie, Athabasca Sand Dunes, Landsat
Doctor of Philosophy (Ph.D.)