Microarray analysis using pattern discovery

dc.contributor.advisor: Kusalik, Anthony J. (Tony) [en_US]
dc.contributor.committeeMember: Neufeld, Eric [en_US]
dc.contributor.committeeMember: DeCoteau, John [en_US]
dc.contributor.committeeMember: Daley, Mark [en_US]
dc.contributor.committeeMember: Soteros, Chris [en_US]
dc.creator: Bainbridge, Matthew Neil [en_US]
dc.date.accessioned: 2004-12-10T09:13:53Z [en_US]
dc.date.accessioned: 2013-01-04T05:10:06Z
dc.date.available: 2004-12-10T08:00:00Z [en_US]
dc.date.available: 2013-01-04T05:10:06Z
dc.date.created: 2004-11 [en_US]
dc.date.issued: 2004-11-05 [en_US]
dc.date.submitted: November 2004 [en_US]
dc.description.abstract: Analysis of gene expression microarray data has traditionally been conducted using hierarchical clustering. However, such analysis has many known disadvantages, and pattern discovery (PD) has been proposed as an alternative technique. In this work, three related PD algorithms (Teiresias, Splash and Genes@Work) were benchmarked for time and memory efficiency on a small yeast cell-cycle data set. Teiresias was found to be the fastest and the best overall program. However, Splash was more memory efficient. This work also investigated the performance of four methods of discretizing microarray data: sign-of-the-derivative, K-means, pre-set value, and Genes@Work stratification. The first three methods were evaluated on their tendency to group biologically related genes together. On a yeast cell-cycle data set, the sign-of-the-derivative method yielded the most biologically significant patterns, followed by the pre-set value and K-means methods. K-means, pre-set value, and Genes@Work were also compared on their ability to classify tissue samples from diffuse large B-cell lymphoma (DLBCL) into two subtypes determined by standard techniques. The Genes@Work stratification method produced the best patterns for discriminating between the two subtypes of lymphoma. However, the results from the second-best method, K-means, call into question the accuracy of the classification by the standard technique. Finally, a number of recommendations for improvement of pattern discovery algorithms and discretization techniques are made. [en_US]
dc.identifier.uri: http://hdl.handle.net/10388/etd-12102004-091353 [en_US]
dc.language.iso: en_US [en_US]
dc.subject: data mining [en_US]
dc.subject: patterns [en_US]
dc.subject: pattern discovery [en_US]
dc.subject: microarray [en_US]
dc.subject: bioinformatics [en_US]
dc.title: Microarray analysis using pattern discovery [en_US]
dc.type.genre: Thesis [en_US]
dc.type.material: text [en_US]
thesis.degree.department: Computer Science [en_US]
thesis.degree.discipline: Computer Science [en_US]
thesis.degree.grantor: University of Saskatchewan [en_US]
thesis.degree.level: Masters [en_US]
thesis.degree.name: Master of Science (M.Sc.) [en_US]

Files

Original bundle
Name: Thesis.3o.pdf
Size: 1.11 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 905 B
Format: Plain Text
Description: