Enhancing Computational Methods for Strain Typing and Separating Strains of Mycoplasma bovis in Mixed Culture
Waldner, Matthew J
No existing program allows a user to isolate strain-specific sequences within a complex assembly of mixed bacterial strains without the bias introduced by a reference assembly. The tools that do exist each have a specialized focus, such as isolating small haplotype differences within strains, or rely on reference genomes that may bias the sequences. To fill this gap, we have developed a tool called the Separator of Strain Inherent Sequences (SepSIS), which extracts sequences specific to each bacterial strain from the de novo assembly graph created by the SPAdes assembler. SepSIS is accompanied by a set of pre-processing scripts that together form the "SepSIS pipeline". The scripts are available at https://github.com/MatthewWaldner/sepsis. The SepSIS pipeline provides two functions, each accepting a particular form of input data. The pipeline was designed for use with Illumina MiSeq paired-read data, but in theory any read dataset compatible with SPAdes could be used with SepSIS. The first function of the SepSIS pipeline accepts reads obtained from non-clonal bacterial isolates as input. It then attempts to isolate the complete strain-specific sequences using the relative coverage levels of strain-specific subsequences in the assembly graph. It is marginally successful at this task. The second function of the SepSIS pipeline accepts reads from independently cultured isolates and mixes them in silico before assembly. After assembly, SepSIS analyzes the contiguous sequences using meta-information describing their strain of origin to produce lists of sequences specific to each strain. These sequences can then be studied and contrasted further. The second function of SepSIS was used to perform two primary investigations. The first investigation identifies unique sequences from sets of isolates, where each set was hypothesized to consist entirely of copies of a single strain.
This investigation analyzed 10 sets of 5 independently sequenced isolates of Mycoplasma bovis, with all the isolates in a set originating from a single culture spread on a growth plate. Despite originating from a single culture, many of the isolates were found to have unique sequences; therefore, these isolates likely each represent an individual strain. The second investigation was based on mixing two or more strains with contrasting phenotypic features, allowing the second function of SepSIS to be applied to isolating sequences potentially responsible for each phenotype. By running multiple mixes with the same contrasting phenotypic combinations, the intersection of sequences common to a phenotype can be identified. This type of investigation was performed on 29 pairs of Mycoplasma bovis lung and stifle joint isolates, with each pair originating from a single animal. Infection location was treated as a phenotype, and sequences unique to each infection location were isolated and identified. The sequences with the strongest correlation to phenotype were variants of Mycoplasma bovis insertion sequences, or were from genes for variable surface lipoproteins and HAD-family hydrolases. The results show that SepSIS is useful when provided with reads sequenced from independently cultured isolates along with meta-information.
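The cross-mix comparison described in the abstract (finding sequences that are specific to the same phenotype in several independent mixes) can be illustrated with a minimal Python sketch. This is not the SepSIS implementation. It assumes each mix has already been reduced to a mapping from a phenotype label to a set of comparable sequence identifiers; the function names, the `"lung"` and `"joint"` labels, and the `seqA`-style identifiers are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Dict, List, Set

# Hypothetical input format (an assumption, not SepSIS output): one dict per
# in silico mix, mapping a phenotype label to the set of sequence identifiers
# that were specific to the strain showing that phenotype.
Mix = Dict[str, Set[str]]


def common_to_phenotype(mixes: List[Mix], phenotype: str) -> Set[str]:
    """Return identifiers that were phenotype-specific in every mix."""
    sets = [mix.get(phenotype, set()) for mix in mixes]
    return set.intersection(*sets) if sets else set()


def phenotype_support(mixes: List[Mix], phenotype: str) -> Counter:
    """Count, per identifier, the number of mixes where it was phenotype-specific."""
    counts: Counter = Counter()
    for mix in mixes:
        counts.update(mix.get(phenotype, set()))
    return counts


# Toy data standing in for three lung/joint isolate pairs.
mixes = [
    {"lung": {"seqA", "seqB"}, "joint": {"seqX"}},
    {"lung": {"seqA", "seqC"}, "joint": {"seqX", "seqY"}},
    {"lung": {"seqA"}, "joint": {"seqY"}},
]

lung_common = common_to_phenotype(mixes, "lung")
joint_common = common_to_phenotype(mixes, "joint")
lung_support = phenotype_support(mixes, "lung")
```

A strict intersection becomes empty as soon as a single mix lacks a sequence, so the per-identifier count is a looser way to rank candidates by how often they track a phenotype.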
Degree: Master of Science (M.Sc.)
Supervisor: Kusalik, Anthony; Jelinski, Murray
Committee: Links, Matthew; Hill, Janet; Van Kessel, Andrew
Copyright Date: September 2020