Modeling the Risk of Hip Fracture among Residents in the Long Term Care Facilities in British Columbia, Canada: Impact of Misspecification of the Correlation Structure on the Parameter Estimates
In practice, survival data are often grouped into clusters, such as clinical sites or geographical regions. This clustering induces correlation among individuals within each cluster, known as within-cluster correlation. For instance, in our motivating example, the elderly residents of each long term care facility (LTCF) are likely to come from nearby areas, with similar quality of life and access to similar health care. As such, individuals sharing the same hidden features may be correlated with each other. The shared frailty model is therefore often used to account for the correlation among individuals from the same cluster. In some applications, when the survival data are collected over geographical regions, random effects corresponding to regions in closer proximity to each other might also be similar in magnitude, due to underlying environmental characteristics. A shared spatial frailty model can therefore be adopted to model the spatial correlation among the clusters; such models are often implemented using Bayesian Markov chain Monte Carlo (MCMC) methods. This approach comes at the price of slow mixing rates and heavy computational cost, which may render it impractical for data-intensive applications. In this thesis, motivated by the computational challenges encountered in modelling spatial correlation in a real application involving large-scale survival data, we used simulations to assess the efficiency loss in parameter estimates when residual spatial correlation is present but the fitted model includes only a spatially uncorrelated random effect term.
Our simulation study indicates that the shared frailty model with only the spatially correlated random effect term may not be sufficient to capture the total residual variation, whereas the simpler model with only the spatially uncorrelated random effect term performs surprisingly well in estimating the model parameters compared with the true model containing both the spatially correlated and uncorrelated random effect terms. As such, using the shared frailty model with an independent frailty term should be reliable for estimating the effects of covariates, especially when the percentage of censoring is not high and the number of clusters is large. Such a model is also advantageous because it can be implemented easily and efficiently in standard statistical software. This is not to say that the shared frailty model with an independent frailty term should be preferred over the spatial frailty model in all cases. Indeed, when the primary goal of inference is predicting the hazard for a specific covariate group, additional care is needed, because the estimate of the scale parameter of the Weibull distribution is biased when the correlation structure is misspecified.
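As an illustration of the kind of clustered survival data discussed above, the following minimal sketch generates Weibull survival times with a shared, spatially uncorrelated frailty term per cluster. It is not the simulation design used in the thesis: the function name, the normal distribution for the log-frailty, the single binary covariate, the unit baseline scale, and administrative censoring at a fixed time are all assumptions made for illustration only.

```python
import math
import random


def simulate_shared_frailty(n_clusters, n_per_cluster, beta, rho, sigma,
                            censor_time, seed=0):
    """Simulate clustered Weibull survival data with a shared frailty.

    Assumed (hypothetical) model: the hazard for subject j in cluster i is
        h_ij(t) = rho * t**(rho - 1) * exp(beta * x_ij + w_i),
    where w_i ~ Normal(0, sigma^2) is shared by all subjects in cluster i
    and is independent across clusters (no spatial correlation).
    """
    rng = random.Random(seed)
    rows = []
    for cluster in range(n_clusters):
        # One frailty draw per cluster induces within-cluster correlation.
        w = rng.gauss(0.0, sigma)
        for _ in range(n_per_cluster):
            x = rng.choice([0, 1])          # illustrative binary covariate
            u = 1.0 - rng.random()          # uniform on (0, 1]
            rate = math.exp(beta * x + w)
            # Inverse-transform sampling from S(t) = exp(-rate * t**rho).
            t = (-math.log(u) / rate) ** (1.0 / rho)
            event = int(t <= censor_time)   # 1 = event observed, 0 = censored
            rows.append((cluster, x, min(t, censor_time), event))
    return rows


if __name__ == "__main__":
    data = simulate_shared_frailty(n_clusters=50, n_per_cluster=20,
                                   beta=0.5, rho=1.5, sigma=0.4,
                                   censor_time=2.0, seed=42)
    censored = sum(1 for row in data if row[3] == 0) / len(data)
    print(f"subjects: {len(data)}, censored fraction: {censored:.2f}")
```

Varying `censor_time` and `n_clusters` in such a sketch changes the censoring percentage and the number of clusters, the two factors the abstract identifies as affecting the reliability of the independent-frailty model. A spatially correlated term would require an additional cluster-level effect with a neighbourhood structure, which is not shown here.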
Regression, Random-effect term, Likelihood function, Bias, Bayesian Statistics, MCMC
Master of Science (M.Sc.)
School of Public Health