University of Saskatchewan HARVEST

      A Comparison of Machine Learning Techniques to Classify Tweets relevant to People impacted by Dementia and COVID-19

      AZIZI-THESIS-2022.pdf (3.810Mb)
      Date
      2022-11-18
      Author
      Azizi, Mehrnoosh
      ORCID
      0000-0002-4337-4630
      Type
      Thesis
      Degree Level
      Masters
      Abstract
      Dementia has emerged as one of today's biggest healthcare challenges due to the increasing demand for medical, social, and institutional care. Moreover, the COVID-19 pandemic has had a unique impact on people with dementia, who are at increased risk of contracting COVID-19 and of experiencing more severe symptoms and disease consequences. This highlights the importance of focusing on the issues faced by people living with dementia. Modern technologies, including social media, can help psychologists analyze people’s experiences and take necessary measures. However, a principal problem for psychologists is that they must process huge amounts of data, and not all of the data can be analyzed because much of it is irrelevant. The data therefore need to be labeled manually by one or several researchers, which is tedious, time-consuming, and potentially costly because of the human effort involved. Thus, improvements to existing methodologies are needed to enable psychologists to make better use of the data and understand the impacts of COVID-19 on people with dementia. One practical way to perform such a task (e.g., automatic labeling) is to use Machine Learning (ML) algorithms, which save time and effort. To this end, this study compares various ML algorithms for classifying tweets relevant to dementia and COVID-19 in order to help psychologists examine the impacts of COVID-19 on people living with dementia. Three datasets are used: (i) 5,058 tweets on COVID-19 and dementia extracted from Twitter from February 15 to September 7, 2020, used to train, evaluate, and compare different models; (ii) 6,240 tweets from September 8, 2020 to December 8, 2021, used to test the best model; and (iii) 1,289 tweets related to Canada’s Alzheimer’s Awareness Month from January 1 to January 31, 2022, used to retrain and test the best model.
      In the first step, to choose the best machine learning model, several classification models are trained and compared in terms of classification performance: logistic regression, Gaussian naïve Bayes classifier, multinomial naïve Bayes classifier, support vector classifier, decision tree classifier, K-nearest neighbor classifier, random forest classifier, AdaBoost classifier, XGBoost classifier, BERT classifier, and ALBERT classifier. According to the classification results, the ALBERT model outperformed all other models in the comparison, showing the least overfitting and the highest accuracy, AUC, and F1-score. In the second step, the ALBERT model is tested on the second dataset (a completely unseen dataset) and achieves an accuracy of 80% in classifying tweets as relevant or irrelevant to people impacted by dementia and COVID-19. Finally, to show that the ALBERT model can be used efficiently in future studies of people impacted by dementia and COVID-19, the model is trained on 10% of the third dataset and tested on the remaining 90%, reaching an accuracy of 88%.
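The abstract compares models by accuracy, AUC, and F1-score and describes a 10%/90% train/test split of the third dataset. The thesis code is not reproduced on this page, so the following is only a minimal standard-library sketch of how those three metrics and such a split could be computed. All function names (`confusion_counts`, `accuracy`, `f1_score`, `roc_auc`, `split_10_90`) are hypothetical, and the labels and scores are made-up toy values, not data or results from the thesis.

```python
import random


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 = relevant tweet."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of tweets whose predicted label matches the true label."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def split_10_90(items, seed=0):
    """Shuffle reproducibly, then keep about 10% for training, 90% for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = max(1, round(0.10 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Toy example (assumed values): true labels and a model's relevance scores.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold chosen arbitrarily

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)

# Splitting 1,289 placeholder items mirrors the size of the third dataset.
train_part, test_part = split_10_90(range(1289))
```

On the toy values, `acc` and `f1` describe the thresholded predictions, while `auc` depends only on how the scores rank relevant tweets above irrelevant ones. This is why a comparison across models typically reports all three rather than accuracy alone.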
      Degree
      Master of Science (M.Sc.)
      Department
      Computer Science
      Program
      Computer Science
      Supervisor
      Spiteri, Raymond
      Committee
      Vassileva, Julita; Klarkowski, Madison
      Copyright Date
      2022
      URI
      https://hdl.handle.net/10388/14317
      Subject
      Dementia, COVID-19, logistic regression, Gaussian naïve Bayes classifier, multinomial naïve Bayes classifier, support vector classifier, decision tree classifier, K-nearest neighbor classifier, random forest classifier, AdaBoost classifier, XGBoost classifier, BERT classifier, ALBERT classifier
      Collections
      • Graduate Theses and Dissertations
