APPLICATION OF A SENSOR NETWORK, MACHINE LEARNING AND SENSITIVITY ANALYSIS IN WATER QUALITY ASSESSMENT IN A PILOT SCALE PIT LAKE IN THE ATHABASCA OIL SANDS REGION
Date
2024-10-08
Authors
ORCID
0000-0002-3922-7363
Type
Thesis
Degree Level
Doctoral
Abstract
A pilot scale pit lake in the Athabasca oil sands (AOS) region, Lake Miwasin (LM), was created in 2017-2018 by Suncor Energy Inc. to test a new way of reclaiming waste materials from the oil sands extraction process. This pit lake incorporates a tailings treatment technique known as permanent aquatic storage structure (PASS), which involves an inline coagulation and flocculation process to accelerate tailings dewatering and to improve the quality of the released water. Moreover, it utilizes reclamation technology that includes closure landform, pit lake design, and water management. Altogether, LM secures treated fluid tailings (TFT) by capping them with a blend of oil sands process water (OSPW, expressed from the treated tailings) and runoff from the surrounding landscape. For LM to be deemed successful, it must eventually mimic a natural lake, supporting a diverse aquatic ecosystem at closure. Research on LM is crucial to assess performance and to inform future pit lake design in the AOS region. To assess the status and trends in water quality of this artificial system, this study measured key water quality parameters at different locations and depths of the lake over several years using a combination of a wireless sensor network (WSN) and manual sampling. Particular emphasis was placed on how physicochemical processes interacted to influence turbidity, conductivity, and ammonium at the sediment-water interface (SWI) in the early stages of the lake's development.
High-resolution sensor data identified seasonal stratification in the water column, typically near the SWI. Increased electrical conductivity (EC) near the SWI during water column stratification indicated expression of pore water with elevated salt content as the bottom tailings progressively consolidated. Turbidity spikes often coincided with rainy days rather than with high wind speeds, suggesting that bottom sediment was not being resuspended by high winds during the open water season. Chemical parameters (trace metals and ions) were assessed individually using Canadian Council of Ministers of the Environment (CCME) guidelines for the long-term protection of aquatic life, with individual trace metal concentrations posing low risk to aquatic organisms. Cumulatively, however, trace metal concentrations were estimated to pose a risk of toxicity to aquatic life, but only under a highly conservative approach. Cumulative risk in both the mid-littoral and limnetic zones decreased over the monitoring period, indicating improving environmental quality as this system ages.
General water chemistry (major ions) of this pilot scale pit lake was compared with that of surrounding natural water bodies to assess similarities and differences. Furthermore, this study identified distinctive chemical signatures and water types based on lake water chemistry, and evaluated different water quality indices based on major ions, metals/metalloids, combinations of ions, and general parameters. In general, surface water from LM is slightly alkaline with elevated total dissolved solids (TDS) and differs chemically from surrounding natural water bodies. Lake Miwasin’s water chemistry is more typical of a Na-Cl water type, whereas surrounding water bodies reflect weathering of carbonates and silicates, with a Ca-HCO3 water type. Because LM is not a natural lake, this study suggests that evaporation is responsible for the concentrations and ratios of the various ions in the overlying water, which are likely a function of the tailings pore water and the recycle water used for processing the ore. Evaluation based on different water quality indices (metal pollution index, weighted arithmetic water quality index (WA-WQI), and CCME water quality index) is an important approach to assess the suitability of the resulting water for different potential uses. Results show that WA-WQI values for both key monitoring zones (mid-littoral and limnetic) declined from 2019 to 2022 and across seasons in LM (i.e., water quality improved), and they highlight the importance of monitoring specific parameters in LM, such as EC, TDS, alkalinity, and NH3.
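As a rough illustration of how a weighted arithmetic index of this kind is commonly computed, the sketch below uses the widely cited form WQI = sum(W_i * Q_i) / sum(W_i), with quality rating Q_i = 100 * (V_i - V0_i) / (S_i - V0_i) and unit weight W_i inversely proportional to the permissible limit S_i. This is an assumption about the general formulation, not necessarily the exact variant applied in the thesis. The function name `wa_wqi`, the parameter limits, and the measured values are hypothetical placeholders, not CCME guidelines or LM data.

```python
def wa_wqi(measured, standards, ideals=None):
    """Weighted arithmetic WQI: sum(W_i * Q_i) / sum(W_i).

    measured  -- dict of parameter name -> observed value V_i
    standards -- dict of parameter name -> permissible limit S_i
    ideals    -- optional dict of parameter name -> ideal value V0_i
                 (assumed 0 when not given, e.g. 7.0 is often used for pH)
    """
    ideals = ideals or {}
    # Proportionality constant so that the unit weights W_i = K / S_i sum to 1.
    k = 1.0 / sum(1.0 / standards[name] for name in measured)
    weighted_sum = 0.0
    weight_total = 0.0
    for name, value in measured.items():
        limit = standards[name]
        ideal = ideals.get(name, 0.0)
        # Quality rating: 0 at the ideal value, 100 at the permissible limit.
        rating = 100.0 * (value - ideal) / (limit - ideal)
        weight = k / limit
        weighted_sum += weight * rating
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical placeholder inputs (NOT guideline values or thesis data).
example_standards = {"EC": 1000.0, "TDS": 500.0, "pH": 8.5}
example_ideals = {"pH": 7.0}
example_measured = {"EC": 500.0, "TDS": 250.0, "pH": 7.75}

print(wa_wqi(example_measured, example_standards, example_ideals))
```

Under this formulation a lower index value corresponds to better water quality, which is consistent with the declining WA-WQI values being read as an improvement.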
During the early stages of LM’s development, there was limited understanding of the relative influence of the underlying biogeochemical processes in the system. Machine learning (ML) models are particularly well suited for predicting key water quality variables, such as ammonium (NH4+), chlorophyll-a (Chl-a), pH, and dissolved oxygen (DO), in scenarios where considerable data are available through a WSN framework but process understanding is limited. A suite of ML models was used in this study to predict the trajectory of key water quality variables in LM. Six tree-based ML models, comprising a basic decision tree (DT) and five bagging and boosting ensemble models, were developed along with conventional connectionist (artificial neural network, ANN) and kernel-based (support vector regression, SVR) ML models. Multiple linear regression (LR) provided a baseline benchmark for model comparison. The results showed that the ensemble bagging and boosting tree-based ML models outperformed the conventional ANN, SVR, and DT models in predicting NH4+ and Chl-a. Hyperparameter tuning substantially improved the testing performance of the ANN and SVR models but yielded only marginal gains for the ensemble tree-based models. This suggests that tree-based ML models with default settings are easy to set up and are suitable for predicting key water quality variables in LM, without the need to tune more complex ML models.
When support is needed for high-stakes decisions, such as those concerning environmental systems, ML can struggle to provide explainable predictive solutions. Here, Variogram Analysis of Response Surfaces (VARS), a method rooted in formal sensitivity analysis (SA), was extended to detect the primary controls on ML model outputs. Unlike typical eXplainable artificial intelligence (XAI) methods, VARS handles the complex multivariate distributional properties of input-output data commonly observed in environmental systems. This approach was applied to a suite of ML models developed for predicting future concentrations of various water quality variables in LM, and the explanatory robustness of the ML models was evaluated by checking the dispersion of SA results across 30 replicates. A critical finding was that subtle alterations in the design of an ML model, such as variations in the random seed for initialization, functional class, hyperparameters, or data splitting, can lead to entirely different interpretations of how the outputs depend on the explanatory inputs. As a result, the same ML model can yield different explanations across the 30 replicates. Further, models based on different ML families (decision trees, connectionists, or kernels) appear to focus on different aspects of the information provided by the data, despite displaying similar levels of predictive power. Moreover, this research explored the power of another XAI technique, the SHapley Additive exPlanations (SHAP) method, which is based on game theory and widely used by the computer science community. The disciplines of SA and XAI are both designed to elucidate the specific contributions of input instances, and thereby uncover how features collectively influence model outputs across all input instances. This work further investigated how the contributions of features to model outputs vary with the parameter settings of the ML model functional classes.
Overall, the SA-based approach was better suited and more computationally efficient for situations where connectionist ML and deep learning models are applied, and where tree-based modeling is limited or not deemed to be preferable.
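The replicate-based robustness check described above can be pictured with a small sketch. It assumes that each replicate (for example, a different random seed or data split) has already produced one importance or sensitivity score per input feature, from VARS, SHAP, or any other method, and it summarizes how much those scores disperse. The function name `importance_dispersion`, the feature names, and the three replicate values are hypothetical placeholders and not results from the thesis, which used 30 replicates.

```python
from statistics import mean, pstdev


def importance_dispersion(replicates):
    """Summarize the spread of feature-importance scores across replicates.

    replicates -- list of dicts, one per replicate, mapping
                  feature name -> importance (or sensitivity) score
    Returns a dict mapping feature name -> summary statistics.
    """
    features = sorted(replicates[0])
    n = len(replicates)
    summary = {}
    for feature in features:
        values = [rep[feature] for rep in replicates]
        avg = mean(values)
        spread = pstdev(values)  # population standard deviation
        # Share of replicates in which this feature is ranked most important.
        top_count = sum(1 for rep in replicates if max(rep, key=rep.get) == feature)
        summary[feature] = {
            "mean": avg,
            "std": spread,
            "cv": spread / avg if avg else float("inf"),
            "top_share": top_count / n,
        }
    return summary


# Hypothetical scores from three replicates (placeholders only).
example_replicates = [
    {"temperature": 0.6, "pH": 0.4},
    {"temperature": 0.4, "pH": 0.6},
    {"temperature": 0.6, "pH": 0.4},
]

for name, stats in importance_dispersion(example_replicates).items():
    print(name, stats)
```

A large coefficient of variation, or a top-ranked feature that changes between replicates, would be one simple sign that the explanation is not robust to the model design choices listed above.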
Thorough water quality assessment and ML-based sensitivity analysis of LM surface water are imperative for gaining insight into the potential character and trajectories of water quality in this and future pit lakes, and for informing design considerations and the application of this novel reclamation approach in the AOS region. In particular, data from WSNs combined with explainable ML-based modeling can provide valuable support for ecological, environmental, and water resources management in systems with limited process knowledge.
Keywords
oil sands process-affected water (OSPW), water quality assessment, machine learning modeling, sensitivity analysis
Degree
Doctor of Philosophy (Ph.D.)
Department
Toxicology Centre
Program
Toxicology