A Machine Learning Generalization of LSI-OR

Oraji, Rahim

dc.contributor.advisor	Spiteri, Raymond
dc.contributor.committeeMember	Eramian, Mark
dc.contributor.committeeMember	Stanley, Kevin
dc.contributor.committeeMember	Mago, Vijay
dc.creator	Oraji, Rahim
dc.date.accessioned	2016-07-04T20:34:10Z
dc.date.available	2016-07-04T20:34:10Z
dc.date.created	2016-05
dc.date.issued	2016-07-04
dc.date.submitted	May 2016
dc.date.updated	2016-07-04T20:34:10Z
dc.description.abstract	The Level of Service Inventory-Ontario Revision (LSI-OR) is a risk/need assessment tool used to classify, manage, and treat the offender population so that offenders receive supportive services consistent with their custodial needs. This thesis adopts a machine learning approach employing the Naive Bayes technique as an alternative to the LSI-OR. The study was conducted on a group of 72,725 offenders of different races, comprising males (82.62%) and females (17.38%). Participants were monitored for two years to collect recidivism information. A basic analysis of the dataset revealed that 1) 83.18% of the population used a unique pattern to answer the 43 LSI-OR items, 2) the total LSI-OR scores in the entire population, and also in the male and female populations, followed two beta distribution functions, one for each recidivism class, and 3) the recidivism rate was approximated by a normal distribution function. It was shown that the Naive Bayes classifier can be considered an extended LSI-OR classifier that accepts multiple continuous and discrete features as input. In other words, the Naive Bayes classifier provides a simple framework for studying the effect of distinct features on classification efficiency and accuracy. The results of running the Naive Bayes classifier with various input features revealed that the Naive Bayes classifier performed better than the LSI-OR. However, there was no obvious trend in the accuracies predicted by the two models to indicate the superiority of one model over the other. The only feature whose value could be treated as a continuous variable was the LSI-OR score. Many models were created based on continuous and discrete LSI-OR scores, producing either the same performance and mean accuracy or slightly better results. The dataset contained many features that are never used by the LSI-OR assessment, for instance, the offence severity.
A model was built at each index of offence severity based on LSI-OR scores and the 43 LSI-OR items as input features. The results of the experiment indicate that considering the 43 LSI-OR items gives more stable results in terms of accuracy than the LSI-OR scores.
dc.format.mimetype	application/pdf
dc.identifier.uri	http://hdl.handle.net/10388/7318
dc.subject	LSI-OR
dc.subject	Naive Bayes
dc.subject	machine learning
dc.subject	level of service inventory
dc.subject	classifier
dc.subject	algorithm
dc.title	A Machine Learning Generalization of LSI-OR
dc.type	Thesis
dc.type.material	text
thesis.degree.department	Computer Science
thesis.degree.discipline	Computer Science
thesis.degree.grantor	University of Saskatchewan
thesis.degree.level	Masters
thesis.degree.name	Master of Science (M.Sc.)

Files

Original bundle

Name: ORAJI-THESIS-2016.pdf
Size: 1.72 MB
Format: Adobe Portable Document Format

Collections

Graduate Theses and Dissertations