A Machine Learning Generalization of LSI-OR

Oraji, Rahim

Files

ORAJI-THESIS-2016.pdf (1.72 MB)

Date

2016-07-04

Authors

Oraji, Rahim

Type

Thesis

Degree Level

Masters

Abstract

The Level of Service Inventory-Ontario Revision (LSI-OR) is a risk/need assessment tool used to classify, manage, and treat the offender population so that offenders receive supportive services consistent with their custodial needs. This thesis adopts a machine learning approach, employing the Naive Bayes technique as an alternative to the LSI-OR. The study was conducted on a group of 72,725 offenders of different races, comprising males (82.62%) and females (17.38%). Participants were monitored for two years to collect recidivism information. A basic analysis of the dataset revealed that 1) 83.18% of the population used a unique pattern to answer the 43 LSI-OR items, 2) the total LSI-OR scores in the entire population, and also in the male and female populations, followed two beta distribution functions, one for each recidivism class, and 3) the recidivism rate was approximated by a normal distribution function. It was shown that the Naive Bayes classifier can be considered an extended LSI-OR classifier that accepts multiple continuous and discrete features as input. In other words, the Naive Bayes classifier provides a simple framework for studying the effect of distinct features on classification efficiency and accuracy. Running the Naive Bayes classifier with various input features showed that it performed better than the LSI-OR. However, there was no obvious trend in the accuracies predicted by the two models to indicate the superiority of one model over the other. The only feature whose value could be treated as a continuous variable was the LSI-OR score. Many models were created based on continuous and discrete LSI-OR scores, producing either the same performance and mean accuracy or slightly better. The dataset contained many features that are never used by the LSI-OR assessment, for instance, the offence severity.
A model was built at each index of offence severity, using the LSI-OR scores and the 43 LSI-OR items as input features. The results of the experiment indicate that using the 43 LSI-OR items gives more stable results in terms of accuracy than using the LSI-OR scores.
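
As an illustration only, the following minimal Python sketch shows how a Naive Bayes classifier can combine discrete item responses with one continuous score. It is not the implementation used in the thesis. The function names (`fit`, `log_posterior`, `predict`), the three binary items, and the toy data are all hypothetical. The continuous score is modelled with a Gaussian purely for simplicity, whereas the abstract reports beta-distributed scores.

```python
import math

def fit(rows, labels, n_binary):
    """Estimate, per class, a prior, Bernoulli item rates, and a Gaussian for the score."""
    model = {}
    for c in set(labels):
        members = [r for r, y in zip(rows, labels) if y == c]
        n = len(members)
        # Laplace-smoothed probability that each binary item equals 1.
        rates = [(sum(r[j] for r in members) + 1) / (n + 2) for j in range(n_binary)]
        # The continuous score is stored after the binary items (assumed layout).
        scores = [r[n_binary] for r in members]
        mean = sum(scores) / n
        var = sum((s - mean) ** 2 for s in scores) / n + 1e-6  # guard against zero variance
        model[c] = {"prior": n / len(rows), "rates": rates, "mean": mean, "var": var}
    return model

def log_posterior(model, row, n_binary):
    """Unnormalized log posterior per class, treating features as conditionally independent."""
    out = {}
    for c, p in model.items():
        lp = math.log(p["prior"])
        for j in range(n_binary):
            rate = p["rates"][j]
            lp += math.log(rate if row[j] == 1 else 1 - rate)
        score = row[n_binary]
        lp += -0.5 * math.log(2 * math.pi * p["var"]) - (score - p["mean"]) ** 2 / (2 * p["var"])
        out[c] = lp
    return out

def predict(model, row, n_binary):
    """Return the class with the highest log posterior."""
    lp = log_posterior(model, row, n_binary)
    return max(lp, key=lp.get)

# Made-up toy data: three binary items followed by a total score.
# Label 1 stands for "recidivist", label 0 for "non-recidivist".
rows = [
    [0, 0, 1, 5.0],
    [0, 1, 0, 7.0],
    [0, 0, 0, 4.0],
    [1, 1, 1, 22.0],
    [1, 0, 1, 19.0],
    [1, 1, 0, 25.0],
]
labels = [0, 0, 0, 1, 1, 1]
model = fit(rows, labels, n_binary=3)
```

Calling `predict(model, [1, 1, 1, 24.0], 3)` then classifies a new, equally hypothetical record. Dropping the score column, or the item columns, from the rows gives a simple way to compare feature sets in the spirit of the experiments described above.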

Keywords

LSI-OR, Naive Bayes, machine learning, level of service inventory, classifier, algorithm

Degree

Master of Science (M.Sc.)

Department

Computer Science

Program

Computer Science

Advisor

Spiteri, Raymond

Committee

Eramian, Mark; Stanley, Kevin; Mago, Vijay

URI

http://hdl.handle.net/10388/7318

Collections

Graduate Theses and Dissertations
