
Empirical Investigation of Randomized Quantile Residuals for Diagnosis of Non-Normal Regression Models

Date

2016-10-03

Type

Thesis

Degree Level

Masters

Abstract

Traditional diagnostic tools for generalized linear models (GLMs), such as deviance and Pearson residuals, are often used to examine the goodness of fit of GLMs. In normal linear regression, these two residuals coincide and are normally distributed; in non-normal regression models such as logistic or Poisson regression, however, the residuals are far from normal and align along nearly parallel curves, one for each distinct response value, which poses great challenges for visual inspection. As a result, residual plots for models of discrete outcome variables convey very little meaningful information and are of limited practical use. Randomized quantile residuals were proposed in the literature to circumvent these problems with traditional residuals when modeling discrete outcomes. However, this approach has not gained the awareness and attention it deserves. Therefore, in this thesis, we theoretically justify the normality of randomized quantile residuals and compare their performance with that of the traditional Pearson and deviance residuals through a set of simulation studies. Our simulation studies demonstrate the normality of randomized quantile residuals when the fitted model is true. Further, we show that randomized quantile residuals are able to detect many kinds of model inadequacy. For instance, the linearity assumption for a covariate effect in a GLM can be examined by visually checking plots of randomized quantile residuals against the predicted values or the covariates. Randomized quantile residuals can also be used to detect overdispersion and zero-inflation, two commonly occurring features of count data. We advocate examining the normality of randomized quantile residuals as a unifying way of assessing goodness of fit for regression models, especially those for discrete outcomes.
We also demonstrate this approach in a real application studying the independent association between air pollution and daily influenza incidence in Beijing, China.
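
As an illustration of the idea described in the abstract, the following Python sketch computes randomized quantile residuals for a Poisson model. It assumes the usual definition from the literature: for a discrete response y with model distribution function F, draw u uniformly between F(y - 1) and F(y) and take the standard normal quantile of u. The function names, the simulated covariate, and the coefficients 0.5 and 1.0 are hypothetical, and the true means are used in place of fitted values, so this is a sketch under stated assumptions and not the thesis's own code or simulation design.

```python
import math
import random
from statistics import NormalDist


def poisson_cdf(k: int, mu: float) -> float:
    """Return P(Y <= k) for Y ~ Poisson(mu); 0.0 when k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)  # P(Y = 0)
    total = term
    for i in range(1, k + 1):
        term *= mu / i    # P(Y = i) from P(Y = i - 1)
        total += term
    return min(total, 1.0)


def randomized_quantile_residual(y: int, mu: float, rng: random.Random) -> float:
    """Residual for one count y under a Poisson(mu) model (assumed definition)."""
    lower = poisson_cdf(y - 1, mu)
    upper = poisson_cdf(y, mu)
    u = rng.uniform(lower, upper)
    # Clamp away from 0 and 1 so the normal quantile is finite.
    eps = 1e-12
    u = min(max(u, eps), 1.0 - eps)
    return NormalDist().inv_cdf(u)


def sample_poisson(mu: float, rng: random.Random) -> int:
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-mu)
    k, p = 0, 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= limit:
            return k - 1


def simulate_residuals(n: int, seed: int = 0) -> list:
    """Simulate counts with log-mean 0.5 + 1.0 * x (hypothetical coefficients)
    and return their residuals under the true model."""
    rng = random.Random(seed)
    residuals = []
    for _ in range(n):
        x = rng.random()                  # hypothetical covariate on [0, 1)
        mu = math.exp(0.5 + 1.0 * x)      # true mean stands in for a fitted value
        y = sample_poisson(mu, rng)
        residuals.append(randomized_quantile_residual(y, mu, rng))
    return residuals


if __name__ == "__main__":
    res = simulate_residuals(5000, seed=1)
    mean = sum(res) / len(res)
    sd = math.sqrt(sum((r - mean) ** 2 for r in res) / (len(res) - 1))
    print(f"mean = {mean:.3f}, sd = {sd:.3f}")
```

When the model is correct, these residuals should have a mean near 0 and a standard deviation near 1. Plotting them against x or against the predicted means is the kind of visual check the abstract refers to.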

Keywords

Regression, GLM, Residual, Pearson, Deviance, Randomized Quantile, ZIP

Degree

Master of Science (M.Sc.)

Department

Mathematics and Statistics

Program

Mathematics
