A Machine Learning Generalization of LSI-OR

dc.contributor.advisor: Spiteri, Raymond
dc.contributor.committeeMember: Eramian, Mark
dc.contributor.committeeMember: Stanley, Kevin
dc.contributor.committeeMember: Mago, Vijay
dc.creator: Oraji, Rahim
dc.date.accessioned: 2016-07-04T20:34:10Z
dc.date.available: 2016-07-04T20:34:10Z
dc.date.created: 2016-05
dc.date.issued: 2016-07-04
dc.date.submitted: May 2016
dc.date.updated: 2016-07-04T20:34:10Z
dc.description.abstract: The Level of Service Inventory-Ontario Revision (LSI-OR) is a risk/need assessment tool used to classify, manage, and treat the offender population so that offenders receive supportive services consistent with their custodial needs. This thesis adopts a machine learning approach, employing the Naive Bayes technique as an alternative to the LSI-OR. The study was conducted on a group of 72,725 offenders of different races, comprising males (82.62%) and females (17.38%). Participants were monitored for two years to collect recidivism information. A basic analysis of the dataset revealed that 1) 83.18% of the population used a unique pattern to answer the 43 LSI-OR items, 2) the total LSI-OR scores in the entire population, and also in the male and female subpopulations, followed two beta distributions, one for each recidivism class, and 3) the recidivism rate was approximated by a normal distribution. It was shown that the Naive Bayes classifier can be considered an extended LSI-OR classifier that accepts multiple continuous and discrete features as input. In other words, the Naive Bayes classifier provides a simple framework for studying the effect of distinct features on classification efficiency and accuracy. Running the Naive Bayes classifier with various input features showed that it performed better than the LSI-OR. However, there was no obvious trend in the accuracies predicted by the two models to indicate the superiority of one over the other. The only feature whose value could be treated as a continuous variable was the LSI-OR score. Many models were created based on continuous and discrete LSI-OR scores; they produced either the same performance and mean accuracy or slightly better results. The dataset contained many features that are never used by the LSI-OR assessment, for instance, offence severity. A model was built at each index of offence severity, using the LSI-OR scores and the 43 LSI-OR items as input features. The results of this experiment indicate that using the 43 LSI-OR items gives more stable accuracy than using the LSI-OR scores.
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/10388/7318
dc.subject: LSI-OR
dc.subject: Naive Bayes
dc.subject: machine learning
dc.subject: level of service inventory
dc.subject: classifier
dc.subject: algorithm
dc.title: A Machine Learning Generalization of LSI-OR
dc.type: Thesis
dc.type.material: text
thesis.degree.department: Computer Science
thesis.degree.discipline: Computer Science
thesis.degree.grantor: University of Saskatchewan
thesis.degree.level: Masters
thesis.degree.name: Master of Science (M.Sc.)

Files

Original bundle
Name: ORAJI-THESIS-2016.pdf
Size: 1.72 MB
Format: Adobe Portable Document Format
License bundle
Name: LICENSE.txt
Size: 2.27 KB
Format: Plain Text