STANDARD REGRESSION VERSUS MULTILEVEL MODELING OF MULTISTAGE COMPLEX SURVEY DATA
Date
2012-02-23
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
ORCID
Type
Degree Level
Doctoral
Abstract
Complex surveys based on multistage design are commonly used to collect large population data. Stratification, clustering and unequal probability of the selection of individuals are the complexities of complex survey design. Statistical techniques such as the multilevel modeling – scaled weights technique and the standard regression – robust variance estimation technique are used to analyze the complex survey data. Both statistical techniques take into account the complexities of complex survey data but the ways are different.
This thesis compares the performance of the multilevel modeling – scaled weights and the standard regression – robust variance estimation technique based on analysis of the cross-sectional and the longitudinal complex survey data. Performance of these two techniques was examined by Monte Carlo simulation based on cross-sectional complex survey design.
A stratified, multistage probability sample design was used to select samples for the cross-sectional Canadian Heart Health Surveys (CHHS) conducted in ten Canadian provinces and for the longitudinal National Population Health Survey (NPHS).
Both statistical techniques (the multilevel modeling – scaled weights and the standard regression – robust variance estimation technique) were utilized to analyze CHHS and NPHS data sets. The outcome of interest was based on the question “Do you have any of the following long-term conditions that have been diagnosed by a health professional? – Diabetes”.
For the cross-sectional CHHS, the results obtained from the proposed two statistical techniques were not consistent. However, the results based on analysis of the longitudinal NPHS data indicated that the performance of the standard regression – robust variance estimation technique might be better than the multilevel modeling – scaled weight technique for analyzing longitudinal complex survey data. Finally, in order to arrive at a definitive conclusion, a Monte Carlo simulation was used to compare the performance of the multilevel modeling – scaled weights and the standard regression – robust variance estimation techniques . In the Monte Carlo simulation study, the data were generated randomly based on the Canadian Heart Health Survey data for Saskatchewan province. The total 100 and 1000 number of simulated data sets were generated and the sample size for each simulated data set was 1,731. The results of this Monte Carlo simulation study indicated that the performance of the multilevel modeling – scaled weights technique and the standard regression – robust variance estimation technique were comparable to analyze the cross-sectional complex survey data.
To conclude, both statistical techniques yield similar results when used to analyze the cross-sectional complex survey data, however standard regression-robust variance estimation technique might be preferred because it fully accounts for stratification, clustering and unequal probability of selection.
Description
Keywords
Multilevel, Standard regression, bootstrap, weight, complex survey, scaling of weight, stratification, clustering,Monte Carlo,robust variance
Citation
Degree
Doctor of Philosophy (Ph.D.)
Department
Community Health and Epidemiology
Program
Biostatistics