Learning for Contingency Tables and Survival Data using Imprecise Probabilities
Bayesian inference is a method of statistical inference in which all forms of uncertainty are expressed in terms of probability. Classical Bayesian inference has some limitations. One arises when we have little or no information about the experiment; another arises when we face computational or time limitations. A further difficulty arises when experts supply conflicting prior information, so that no single prior distribution can be agreed upon, which results in less precise posterior probabilities. Because of these limitations, an imprecise Bayesian approach is adopted. Upper and lower posterior expectations are computed in order to calculate the degree of imprecision of the log-odds ratio. This is implemented for two-way contingency tables and then generalized to three-way tables using different families of prior distributions, which is the core of this work. Survival data including right-censored observations are generated and converted to a sequence of 2 x 2 tables (together forming a three-way contingency table), with one 2 x 2 table built at each observed death time. Here, we assume that only one death happens at each time, so there are no ties. To implement imprecise Bayesian inference, two choices of imprecise priors are considered: a set of four Normal priors and a set of four Beta priors. Each is used with a non-central hypergeometric likelihood to update the posterior families, and the degree of imprecision is then calculated for both cases. The approach is applied to real Ovarian Cancer Survival data, where upper and lower posterior expectations are estimated in order to calculate the degree of imprecision. We conduct simulation studies to sample from the posterior distribution and estimate the log-odds ratio using upper and lower posterior expectations.
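The conversion of right-censored survival data into one 2 x 2 table per observed death time can be illustrated with a minimal sketch. The function name, the table layout (rows are groups, columns are deaths and survivors among those at risk), and the toy data are assumptions made for illustration only; they are not taken from this work, which carries out its analysis in R.

```python
from typing import List, Sequence, Tuple

# A 2 x 2 table: ((deaths_g0, survivors_g0), (deaths_g1, survivors_g1)).
Table = Tuple[Tuple[int, int], Tuple[int, int]]


def tables_at_death_times(
    times: Sequence[float],
    events: Sequence[int],   # 1 = death observed, 0 = right-censored
    groups: Sequence[int],   # 0 or 1 (two groups, hypothetical coding)
) -> List[Tuple[float, Table]]:
    """Build one 2 x 2 table at each observed death time.

    Assumes exactly one death per death time (no ties), as stated in the text.
    A subject is counted as at risk at time t if its recorded time is >= t.
    """
    records = sorted(zip(times, events, groups))
    tables = []
    for t, event, g in records:
        if event != 1:
            continue  # censored times do not generate a table
        at_risk = [0, 0]
        for s, _, h in records:
            if s >= t:
                at_risk[h] += 1
        deaths = [0, 0]
        deaths[g] = 1  # the single death at time t
        table = (
            (deaths[0], at_risk[0] - deaths[0]),
            (deaths[1], at_risk[1] - deaths[1]),
        )
        tables.append((t, table))
    return tables


# Toy data (invented): six subjects, two of them censored.
toy_times = [2, 3, 5, 7, 8, 11]
toy_events = [1, 0, 1, 1, 0, 1]
toy_groups = [0, 1, 1, 0, 1, 0]
toy_tables = tables_at_death_times(toy_times, toy_events, toy_groups)
```

Censored subjects never produce a table of their own; they only leave the risk set, which is why heavier censoring yields fewer tables and less information per table.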
In the case of three-way contingency tables, updating a set of priors to a set of posteriors is done sequentially at each table by running MCMC in JAGS from R via the rjags and runjags packages. Four factors (sample size, censoring rate, true parameter, and balancing rate) are also studied to see how they affect the degree of imprecision under the two choices of imprecise priors. A fractional factorial design of 27 runs is constructed to see which of these four factors are most significant. For each of these 27 combinations, upper and lower posterior expectations and the degree of imprecision of the log-odds ratio are calculated. The findings show that the smallest degree of imprecision occurs at the combination with a large sample size (n = 200) and a small number of censored times. In contrast, the largest degree of imprecision is observed at the combination with a small sample size (n = 40) and a large number of censored times. These conclusions are supported by ANOVA results showing that the main effects of the four factors are significant. The overall conclusion is that having more information (more data) leads to less uncertainty about the parameter of interest.
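A 27-run fractional factorial design for four factors can be sketched as follows. The sketch assumes three levels per factor and the common generator D = A + B + C (mod 3), which gives a one-third fraction of the full 81-run design; the abstract does not state the generator or all of the level values actually used, so levels are shown only in coded form (0, 1, 2) and the function name is hypothetical.

```python
from itertools import product
from typing import List, Tuple


def fractional_factorial_3_4_1() -> List[Tuple[int, int, int, int]]:
    """Return the 27 runs of a 3^(4-1) design in coded levels 0, 1, 2.

    The first three factors take every combination of levels; the fourth is
    set by the assumed generator D = (A + B + C) mod 3.
    """
    runs = []
    for a, b, c in product(range(3), repeat=3):
        d = (a + b + c) % 3
        runs.append((a, b, c, d))
    return runs


# Columns could stand for sample size, censoring rate, true parameter and
# balancing rate; mapping coded levels to real values is left open here.
design = fractional_factorial_3_4_1()
```

In this construction each level of each factor appears in exactly 9 of the 27 runs, so the main effects can be compared on a balanced footing.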
Bayesian inference, Contingency tables, Survival Data, and Imprecise probability.
Doctor of Philosophy (Ph.D.)
Mathematics and Statistics