Microarray analysis using pattern discovery

Bainbridge, Matthew Neil

dc.contributor.advisor	Kusalik, Anthony J. (Tony)	en_US
dc.contributor.committeeMember	Neufeld, Eric	en_US
dc.contributor.committeeMember	DeCoteau, John	en_US
dc.contributor.committeeMember	Daley, Mark	en_US
dc.contributor.committeeMember	Soteros, Chris	en_US
dc.creator	Bainbridge, Matthew Neil	en_US
dc.date.accessioned	2004-12-10T09:13:53Z	en_US
dc.date.accessioned	2013-01-04T05:10:06Z
dc.date.available	2004-12-10T08:00:00Z	en_US
dc.date.available	2013-01-04T05:10:06Z
dc.date.created	2004-11	en_US
dc.date.issued	2004-11-05	en_US
dc.date.submitted	November 2004	en_US
dc.description.abstract	Analysis of gene expression microarray data has traditionally been conducted using hierarchical clustering. However, such analysis has many known disadvantages, and pattern discovery (PD) has been proposed as an alternative technique. In this work, three related PD algorithms (Teiresias, Splash, and Genes@Work) were benchmarked for time and memory efficiency on a small yeast cell-cycle data set. Teiresias was found to be the fastest and the best program overall; however, Splash was more memory efficient. This work also investigated the performance of four methods of discretizing microarray data: sign-of-the-derivative, K-means, pre-set value, and Genes@Work stratification. The first three methods were evaluated on their predisposition to group together biologically related genes. On a yeast cell-cycle data set, the sign-of-the-derivative method yielded the most biologically significant patterns, followed by the pre-set value and K-means methods. K-means, pre-set value, and Genes@Work stratification were also compared on their ability to classify tissue samples from diffuse large B-cell lymphoma (DLBCL) into two subtypes determined by standard techniques. The Genes@Work stratification method produced the best patterns for discriminating between the two subtypes of lymphoma. However, the results from the second-best method, K-means, call into question the accuracy of the classification by the standard technique. Finally, a number of recommendations for improvement of pattern discovery algorithms and discretization techniques are made.	en_US
dc.identifier.uri	http://hdl.handle.net/10388/etd-12102004-091353	en_US
dc.language.iso	en_US	en_US
dc.subject	data mining	en_US
dc.subject	patterns	en_US
dc.subject	pattern discovery	en_US
dc.subject	microarray	en_US
dc.subject	bioinformatics	en_US
dc.title	Microarray analysis using pattern discovery	en_US
dc.type.genre	Thesis	en_US
dc.type.material	text	en_US
thesis.degree.department	Computer Science	en_US
thesis.degree.discipline	Computer Science	en_US
thesis.degree.grantor	University of Saskatchewan	en_US
thesis.degree.level	Masters	en_US
thesis.degree.name	Master of Science (M.Sc.)	en_US

Files

Original bundle

Name: Thesis.3o.pdf
Size: 1.11 MB
Format: Adobe Portable Document Format

Collections

Graduate Theses and Dissertations