Show simple item record

dc.contributor.advisor: Roy, Chanchal K.
dc.contributor.advisor: Schneider, Kevin A.
dc.creator: Ferdous, Rayhan 1992-
dc.date.accessioned: 2019-03-12T04:21:21Z
dc.date.available: 2019-03-12T04:21:21Z
dc.date.created: 2019-02
dc.date.issued: 2019-03-11
dc.date.submitted: February 2019
dc.identifier.uri: http://hdl.handle.net/10388/11902
dc.description.abstract: Workflow provenance is a crucial part of a workflow system because it enables data lineage analysis, error tracking, workflow monitoring, usage pattern discovery, and so on. Integrating provenance into a workflow system, or modifying a workflow system to capture or analyze different provenance information, is burdensome and requires extensive development because provenance mechanisms rely heavily on the modelling, architecture, and design of the workflow system. Various tools and technologies exist for logging events in a software system. Unfortunately, these logging tools and technologies are not designed for capturing and analyzing provenance information. Workflow provenance is not only about logging, but also about retrieving workflow-related information from logs. In this work, we propose a taxonomy of provenance questions and, guided by these questions, create a workflow programming model, 'ProvMod', with a supporting run-time library that provides automated provenance and log analysis for any workflow system. The design and provenance mechanism of ProvMod are based on recommendations from prominent research, and ProvMod is easy to integrate into any workflow system. ProvMod offers Neo4j graph database support to manage semi-structured, heterogeneous JSON logs. The log structure is adaptable to any NoSQL technology. For each provenance question in our taxonomy, ProvMod provides the answer with data visualization using Neo4j and the ELK Stack. Besides analyzing performance from various angles, we demonstrate the ease of integration by integrating ProvMod with Apache Taverna, and we evaluate the usability of ProvMod by engaging users. Finally, we present two Software Engineering research cases (clone detection and architecture extraction) where ProvMod and our taxonomy of provenance questions can be applied to discover meaningful insights.
dc.format.mimetype: application/pdf
dc.subject: Scientific workflow, provenance, log analytics, automated logging, programming model, graph analysis, provenance questions, classification, taxonomy, data visualization, software engineering, software architecture.
dc.title: Workflow Provenance: from Modeling to Reporting
dc.type: Thesis
dc.date.updated: 2019-03-12T04:21:21Z
thesis.degree.department: Computer Science
thesis.degree.discipline: Computer Science
thesis.degree.grantor: University of Saskatchewan
thesis.degree.level: Masters
thesis.degree.name: Master of Science (M.Sc.)
dc.type.material: text
dc.contributor.committeeMember: Khan, Shahedul
dc.contributor.committeeMember: Deters, Ralph
dc.contributor.committeeMember: Keil, Mark
dc.creator.orcid: 0000-0002-5937-0925

