University of Saskatchewan HARVEST

      Predicting content manipulations by open web proxies

      NEZHADIAN-THESIS-2022.pdf (580.6Kb)
      Date
      2022-05-20
      Author
      Nezhadian, Zahra
      Type
      Thesis
      Degree Level
      Masters
      Abstract
      The need for anonymity and privacy has given rise to open web proxies, which act as gateways relaying traffic between web servers and their clients and allow users to reach content that would otherwise be inaccessible. As the open web proxy ecosystem continues to grow, a growing number of studies point out the extent of content alteration on the Internet. The alterations applied by proxies include both benign and malicious modifications, such as adding crypto-mining scripts or injecting ads. While some modifications, such as ad injection, can be prevented using blocker tools, scripts added to JavaScript files cannot be detected by any antivirus or blocker tool. The widespread use of proxies and their malicious behaviour motivated us to study the feasibility of predicting these manipulations, so that a proxy for daily use can be chosen carefully. While previous studies focused on detecting and analysing content manipulation by proxies, we present a novel approach for predicting the types of content alteration that open proxies might silently apply. The approach also allows us to predict whether an open proxy will inject any extra file. These predictions indicate changes without first fetching the data through a proxy. The dataset used in this work was created, as the initial step of the study, by collecting the website content of 1028 domains fetched through 1293 proxies. We then derived 13 types of content modification through a detailed analysis of the manipulations in the collected content, and used the detected modification types to form our dataset for prediction analysis. This research allows us to accurately predict proxy behaviour on a particular website, enabling us to distinguish malicious from benign proxies and to select a proxy cautiously. The study predicted the type of content modification with 92% accuracy and the injection of extra files with 99% accuracy. It also reveals an important observation: the majority of proxies manipulate website content based on technical information about the website and its web server.
      Degree
      Master of Science (M.Sc.)
      Department
      Computer Science
      Program
      Computer Science
      Supervisor
      Stakhanova, Natalia
      Committee
      Codabux, Zadia; Wahid, Khan; McQuillan, Ian
      Copyright Date
      May 2022
      URI
      https://hdl.handle.net/10388/13971
      Subject
      Proxy
      Machine learning
      modification
      manipulation
      injection
      open proxy
      Web
      Collections
      • Graduate Theses and Dissertations
      © University of Saskatchewan