Profile-based comparison of Pseudomonas genomes reveals signatures of pathogenicity

Nicole E. Wheeler¹, Honour McCann², Paul P. Gardner¹
¹ School of Biological Sciences, University of Canterbury, Christchurch.
² New Zealand Institute for Advanced Study, Massey University, Auckland.

Acknowledgements
PG funded by Rutherford Fellowship. NW supported by UC PhD Scholarship.

Introduction
Searching for genetic variation indicative of pathogenicity in Pseudomonas genomes

The Pseudomonas genus of bacteria contains members capable of infecting a wide range of plants, including Psa (Pseudomonas syringae pv. actinidiae), which had a dramatic impact on New Zealand's kiwifruit industry following an outbreak in 2010. However, the genus also contains members that live in association with, or in the presence of, plants without causing disease. We examined 27 Pseudomonas isolates from a variety of species to determine whether any genes showed significantly different functional potential in pathogenic isolates compared to non-pathogenic isolates.

Method
A profile-based approach to measuring the significance of genetic variation

Profile HMMs can be used to score sequence variation according to how likely it is to affect the functioning of a gene. The models capture the frequency of each amino acid at each position in the protein, as well as the frequency of indels. Comparing a sequence to a model yields a score that indicates how well the sequence matches the model.

This score can be used to differentiate functionally significant mutations from functionally neutral variation. The accompanying figure shows the ability of the method to separate neutral variation from deleterious mutations in the E. coli LacI protein based on score.

[Figure: Distribution of scores for LacI protein mutants. x-axis: delta-bitscore; y-axis: density; groups: functional, loss of function.]

Results
Identification of several key genes that differentiate between pathogens and non-pathogens
Using this scoring method, we identified genes shared by the Pseudomonas isolates in our test group that showed a significantly different score distribution in pathogens compared to non-pathogens.

Using the delta-bitscore approach, we identified three genes that showed significantly different score distributions in the non-pathogenic isolates compared to the pathogenic isolates. Several genes offer promising discriminatory power on their own and, when combined, perfectly discriminate between pathogenic and non-pathogenic isolates within the set of isolates we sampled.

[Figure: Score distributions for 4-aminobutyrate aminotransferase. x-axis: bitscore; groups: pathogenic, non-pathogenic.]
[Figure: Score distributions for outer membrane efflux protein. x-axis: bitscore; groups: pathogenic, non-pathogenic.]
[Figure: Score distributions for bacterioferritin. x-axis: bitscore; groups: pathogenic, non-pathogenic.]

For more information on the method:
[Diagram: sequences from Species 1 and Species 2 scored against a profile HMM give score 1 (S1) and score 2 (S2); S1 - S2 = delta-bitscore. Image: Eric Nawrocki]
[Image: thekiwifruitclaim.org]

Potential application
Using delta-bitscore for the classification of organisms

[Decision tree: the root asks "Gene A score > 350?"; each branch then asks "Gene B score > 300?", followed by "Gene C score > 115?". Each leaf is labelled P (pathogen) or NP (non-pathogen).]
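The delta-bitscore (S1 - S2) and the threshold-based decision tree described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation. The function names, the dictionary keys (`gene_a`, `gene_b`, `gene_c`) and the example scores are hypothetical. The thresholds 350, 300 and 115 come from the poster, but the mapping from tree leaves to P/NP labels cannot be recovered from the text, so `PLACEHOLDER_LEAVES` is an arbitrary stand-in that a user would replace with labels learned from their own isolates.

```python
from itertools import product
from typing import Dict, Mapping, Tuple

# Thresholds as shown in the poster's decision tree (Gene A, Gene B, Gene C).
THRESHOLDS: Dict[str, float] = {"gene_a": 350.0, "gene_b": 300.0, "gene_c": 115.0}


def delta_bitscore(score_1: float, score_2: float) -> float:
    """Difference between two bitscores obtained against the same profile HMM.

    score_1 and score_2 are the scores of the Species 1 and Species 2
    sequences; the result is S1 - S2.
    """
    return score_1 - score_2


def threshold_pattern(
    scores: Mapping[str, float],
    thresholds: Mapping[str, float] = THRESHOLDS,
) -> Tuple[bool, ...]:
    """Answer each 'score > threshold?' question in tree order."""
    return tuple(scores[gene] > cutoff for gene, cutoff in thresholds.items())


# ASSUMPTION: placeholder leaf labels only. The poster's actual leaf
# assignment is not recoverable from the extracted text. Here a leaf is
# labelled "NP" when at least two of the three tests pass, purely so the
# example has something to look up.
PLACEHOLDER_LEAVES: Dict[Tuple[bool, ...], str] = {
    pattern: ("NP" if sum(pattern) >= 2 else "P")
    for pattern in product((False, True), repeat=3)
}


def classify(
    scores: Mapping[str, float],
    leaf_labels: Mapping[Tuple[bool, ...], str] = PLACEHOLDER_LEAVES,
    thresholds: Mapping[str, float] = THRESHOLDS,
) -> str:
    """Walk the three-level tree and return the label stored at the leaf."""
    return leaf_labels[threshold_pattern(scores, thresholds)]


if __name__ == "__main__":
    # Hypothetical bitscores for one isolate (not data from the poster).
    isolate = {"gene_a": 360.0, "gene_b": 290.0, "gene_c": 120.0}
    print(delta_bitscore(310.0, 295.5))
    print(threshold_pattern(isolate))
    print(classify(isolate))
```

Keeping the leaf labels in a lookup table separate from the thresholds means the same three tests can be reused with whatever P/NP assignment a given set of isolates supports.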