Penalized Regression Methods for Modelling Rare Events Data with Application to Occupational Injury Study
Date
2019-09-20
Type
Thesis
Degree Level
Masters
Abstract
Occupational injuries are a serious public health concern for workers around the world.
Among all occupational injuries reported to the Workers' Compensation Board of Saskatchewan
(WCB-SK) from 2007-2016, 177 (0.06%) out of 280,704 injury claims were fatal. Although
fatal work-related injuries are relatively rare, they have a tremendous impact on workers, their families, and a company's overall productivity, hiring and training costs, and insurance premiums. To help inform the prevention of fatal claims, this study identified factors that increase
the probability of a fatal injury claim in Saskatchewan.
WCB Saskatchewan's administrative occupational injury claims data from 2007-2016 were
used to extract fatal and non-fatal occupational events. Potential covariates included worker
characteristics (age, gender, occupation) and incident characteristics (source of injury, cause
of injury, part of body). Because fatalities were rare in this study, conventional logistic
regression with multiple categorical covariates and over 40 parameters yielded biased
parameter estimates. Penalized logistic regression methods, namely a bias-correction method
(Firth's method) and two model selection methods (lasso and elastic net), were
compared to identify an optimal modelling strategy for estimating odds ratios (OR) and
95% confidence intervals (CI) for a WCB claim being fatal (vs. non-fatal).
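For reference, the three penalties compared here are usually written as penalized log-likelihoods of the following form. These are the standard textbook forms and are an assumption about notation; the abstract does not state the exact parametrization used in the thesis. Here \ell(\beta) is the logistic log-likelihood, I(\beta) is the Fisher information matrix, \lambda \ge 0 is a tuning parameter, and \alpha \in [0, 1] is the elastic net mixing weight.

```latex
\begin{align*}
\text{Firth:} \quad & \ell^{*}(\beta) = \ell(\beta) + \tfrac{1}{2}\log\left|I(\beta)\right| \\
\text{Lasso:} \quad & \ell_{\lambda}(\beta) = \ell(\beta) - \lambda \sum_{j=1}^{p} |\beta_j| \\
\text{Elastic net:} \quad & \ell_{\lambda,\alpha}(\beta) = \ell(\beta)
  - \lambda \Bigl[ \alpha \sum_{j=1}^{p} |\beta_j|
  + \tfrac{1-\alpha}{2} \sum_{j=1}^{p} \beta_j^{2} \Bigr]
\end{align*}
```

Firth's penalty shrinks estimates to reduce small-sample bias without removing variables, whereas the lasso and elastic net penalties can set coefficients exactly to zero and so act as variable selectors.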
Based on the best-fitting model, i.e., Firth's logistic regression on the variables selected
by the elastic net method, the odds of a claim being fatal were 5.5 (95% CI: 2.77, 12.46) times
higher among men than women and 6.59 (95% CI: 3.59, 12.20) times higher for seniors
aged 65-85 than for those aged 14-24. The odds of a claim being fatal among
those who work in primary industry were 2.85 (95% CI: 1.07, 9.39) times higher than among those working
in social sciences. The odds of an injury being fatal were 51 (95% CI:
10.38, 505.38) times higher when machinery was the source than when chemical products were the source.
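To illustrate how an odds ratio can be obtained from a Firth-type fit when events are rare, the following is a minimal sketch in pure Python. It is not the thesis's code: the function names `solve` and `firth_logistic` and the toy data are hypothetical, and it uses Fisher scoring on Firth's modified score without step-halving or confidence intervals. The toy data have no events in the reference group, a situation in which ordinary maximum likelihood has no finite estimate.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def firth_logistic(X, y, max_iter=200, tol=1e-10):
    """Firth-penalized logistic regression (hypothetical helper).

    X is a list of rows that already include an intercept column;
    y is a list of 0/1 outcomes. Returns the coefficient list.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(max_iter):
        eta = [sum(b * v for b, v in zip(beta, row)) for row in X]
        pi = [1.0 / (1.0 + math.exp(-e)) for e in eta]
        w = [q * (1.0 - q) for q in pi]
        # Fisher information X' W X
        info = [[sum(w[i] * X[i][a] * X[i][b] for i in range(n))
                 for b in range(p)] for a in range(p)]
        score = [0.0] * p
        for i, row in enumerate(X):
            # leverage h_i = w_i * x_i' (X'WX)^{-1} x_i
            h = w[i] * sum(r * s for r, s in zip(row, solve(info, row)))
            # Firth's modified residual
            resid = y[i] - pi[i] + h * (0.5 - pi[i])
            for a in range(p):
                score[a] += resid * row[a]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Toy data (assumed): 20 reference claims with 0 events,
# 20 exposed claims with 5 events.
X = [[1.0, 0.0]] * 20 + [[1.0, 1.0]] * 20
y = [0] * 20 + [1] * 5 + [0] * 15
beta = firth_logistic(X, y)
odds_ratio = math.exp(beta[1])  # finite despite the empty cell
```

For a single binary covariate, the Firth estimate coincides with the estimate obtained by adding 0.5 to each cell of the 2x2 table, which gives a convenient check on the sketch. Intervals like those reported above would need an additional step, such as profiling the penalized likelihood, which is not shown here.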
Male workers are at higher risk of a claim being fatal (vs. non-fatal). With respect to
age, the analysis showed that middle-aged workers are at lower risk than
young workers. The risk of a claim being fatal
increased sharply as age increased from 45 to 85. The primary industry sector and machinery account for
a disproportionate share of fatal claims. This knowledge can improve workplace safety by supporting learning from past incidents, identification of significant risk factors, and implementation of targeted
prevention strategies. Through the development of effective interventions, we hope to prevent
fatal injuries in Saskatchewan.
Keywords
Penalized regression methods, Occupational injuries
Degree
Master of Science (M.Sc.)
Department
School of Public Health
Program
Biostatistics