Investigation of Machine Learning Techniques to Determine Informative Wavelengths for Noninvasive Glucose Monitoring






The trend towards noninvasive blood glucose monitoring, which aims to reduce nerve damage and infection-related mortality among diabetic patients, has led to the advent of near-infrared (NIR) based devices. Because the absorption peaks of glucose overlap with those of other molecules, many wavelengths are potentially correlated with glucose concentration, and a suitable combination of spectral information across a range of wavelengths is needed to determine the glucose concentration effectively and robustly. This work investigates the use of dimensionality reduction and support vector machines (SVMs) as core algorithms to develop an automated and computationally efficient system that calibrates the relation between spectral wavelengths and glucose concentration, while facilitating the selection of informative wavelengths for accurate glucose monitoring. Evaluations performed on two datasets, containing the short-wave infrared (SWIR) absorbance of glucose solutions in distilled water, demonstrated that wrapper methods of feature selection can be highly effective for SVM-based glucose monitoring models. With the developed wrapper methods, the classification approach achieved a training accuracy of up to 91.53%, a testing accuracy of up to 91%, and an F1 score of up to 90.97%, while the regression approach reduced the standard error of cross-validation (SECV) to 45.12 mg/dL and the standard error of prediction (SEP) to 39.08 mg/dL. Furthermore, filter methods of feature selection were found to offer a trade-off between speed and performance for the proposed models when used in combination with wrapper methods. If time is an important constraint, filter methods should be added to the system, since doing so can speed up feature selection by up to 17 times and training by up to 9 times, with only a slight drop in performance.
Because wavelengths can be treated as either discrete or continuous, different assumptions about the continuity of wavelengths, together with the corresponding choice of evaluation metrics under a classification or regression approach, were investigated and found to affect the ability of dimensionality reduction techniques to extract information. The proposed system consists of three phases, envisioned as three interacting modules: data acquisition, a training pipeline, and a testing pipeline. The training module is in turn composed of four major steps: preprocessing, dimensionality reduction, hyperparameter tuning, and prediction (with SVMs). For the experimental datasets used in this investigation, the computational results obtained with the proposed model suggest that the most informative wavelengths for noninvasive glucose monitoring fall in the ranges of 1300-1600 nm and 1800-2400 nm or 2000-2600 nm.
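
The four training steps described above (preprocessing, dimensionality reduction, hyperparameter tuning, and SVM prediction) can be illustrated with a minimal sketch of the regression approach. Everything in it is an assumption made for illustration: the abstract does not name a library, so scikit-learn is used here; the spectra are synthetic, with a glucose signal planted in an arbitrary band, and are not the thesis datasets; the function names, the univariate F-test filter, the forward-selection wrapper, and the parameter grid are hypothetical choices. SECV and SEP are approximated as root-mean-square errors, since the exact formulas are not given in the abstract.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, SequentialFeatureSelector, f_regression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_synthetic_spectra(n_samples=120, seed=0):
    """Hypothetical stand-in for SWIR absorbance data (NOT the thesis datasets)."""
    rng = np.random.default_rng(seed)
    wavelengths = np.arange(1000, 2601, 50)        # nm, 33 channels
    glucose = rng.uniform(50, 400, n_samples)      # mg/dL
    X = rng.normal(0.0, 0.05, (n_samples, wavelengths.size))
    # Assumption: plant a glucose-dependent signal in an arbitrarily chosen band.
    band = (wavelengths >= 1300) & (wavelengths <= 1600)
    X[:, band] += 0.002 * glucose[:, None]
    return wavelengths, X, glucose


def run_sketch(n_filter=12, n_wrapper=4):
    wavelengths, X, y = make_synthetic_spectra()
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0
    )

    pipeline = Pipeline([
        # Step 1: preprocessing (standardize each wavelength channel).
        ("scale", StandardScaler()),
        # Step 2a: cheap filter method that discards weakly related channels.
        ("filter", SelectKBest(score_func=f_regression, k=n_filter)),
        # Step 2b: wrapper method, forward selection scored by a linear SVR.
        ("wrapper", SequentialFeatureSelector(
            SVR(kernel="linear", C=100.0),
            n_features_to_select=n_wrapper,
            direction="forward",
            cv=3,
        )),
        # Step 4: prediction with an SVM regressor.
        ("svr", SVR(kernel="rbf")),
    ])

    # Step 3: hyperparameter tuning by cross-validated grid search.
    search = GridSearchCV(
        pipeline,
        param_grid={"svr__C": [10, 100, 1000]},
        scoring="neg_root_mean_squared_error",
        cv=3,
    )
    search.fit(X_train, y_train)

    # Map the two selection masks back to wavelengths.
    best = search.best_estimator_
    kept_by_filter = best.named_steps["filter"].get_support()
    kept_by_wrapper = best.named_steps["wrapper"].get_support()
    selected = wavelengths[kept_by_filter][kept_by_wrapper]

    # Assumed definitions: SECV and SEP taken as RMSE on the CV folds / test set.
    secv = float(-search.best_score_)
    residuals = y_test - search.predict(X_test)
    sep = float(np.sqrt(np.mean(residuals ** 2)))
    return selected, secv, sep


if __name__ == "__main__":
    selected, secv, sep = run_sketch()
    print("Selected wavelengths (nm):", selected)
    print(f"SECV ~ {secv:.2f} mg/dL, SEP ~ {sep:.2f} mg/dL")
```

Placing the filter ahead of the wrapper reflects the speed trade-off noted in the abstract: the wrapper refits an SVM for every candidate wavelength, so shrinking the candidate set first reduces the number of fits. The speed-up factors reported in the abstract come from the thesis experiments and are not reproduced by this toy example.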



Machine Learning, Noninvasive Glucose Monitoring, Informative Wavelengths



Master of Science (M.Sc.)


Biomedical Engineering


Biomedical Engineering

