Evolution and Adaptation of Australian Bordetella pertussis

Author: Safarchi, Azadeh

Publication Date: 2016

DOI: https://doi.org/10.26190/unsworks/2906

License: https://creativecommons.org/licenses/by-nc-nd/3.0/au/
Link to license to see what you are allowed to do with this resource.

Downloaded from http://hdl.handle.net/1959.4/55504 in https://unsworks.unsw.edu.au on 2022-08-02

Evolution and Adaptation of Australian

Bordetella pertussis

Azadeh Safarchi

A thesis submitted in fulfilment of the requirements for the degree of

Doctor of Philosophy (Microbiology and Immunology)

School of Biotechnology and Biomolecular Sciences

Faculty of Science

The University of New South Wales

2016

THE UNIVERSITY OF NEW SOUTH WALES

Thesis/Dissertation Sheet

Surname or Family name: Safarchi

First name: Azadeh

Other name/s: -

Abbreviation for degree as given in the University calendar: PhD

School: School of Biotechnology and Biomolecular Sciences

Faculty: Faculty of Science

Title: Evolution and Adaptation of Australian Bordetella pertussis

Abstract

The resurgence of pertussis, also known as whooping cough, has been reported worldwide, including in Australia. Strain variation and pathogen adaptation have been reported in many countries in response to the acellular pertussis vaccine. Most recently, Single Nucleotide Polymorphism (SNP) typing separated Australian B. pertussis isolates into five clusters, known as SNP clusters I to V, with cluster I strains, also known as the ptxP3 strains, currently predominant. Whole genome sequencing was used to investigate the microevolution of 22 B. pertussis isolates from the latest Australian pertussis epidemic (2008-2012), all of which belonged to SNP profile 13 of cluster I. Ten of the 22 isolates were pertactin (Prn) negative, with three different mechanisms of inactivation. Five Australian pre-epidemic isolates, all Prn positive, were also included in the analyses. Five SNPs differentiated epidemic isolates from pre-epidemic isolates. Phylogenetic analysis separated the 22 epidemic isolates into five lineages, EL1 to EL5. There was spatial and temporal clustering of the isolates analysed. However, some isolates from different localities and times of isolation were grouped together, suggesting clonal spread of B. pertussis across Australia. Similarly, one of the seven isolates with the same prn gene inactivation was separated from the remaining six, suggesting independent evolution of Prn negative strains.

The overall genomic diversity and molecular evolution among major Australian clones were then investigated using Illumina and PacBio sequencing. The results confirmed the previous SNP clusters and showed ongoing genome reduction, with the deletion of two large regions of difference including BP0910-BP0934 in all clusters and BP1947-BP1968 specific to cluster I. Our findings also revealed the role of progressive gene loss, frameshift indels, new insertion elements and genome rearrangements, which may have contributed to the evolution and divergence of B. pertussis isolates.

Lastly, a mixed infection competition assay in a mouse model was used to determine the differential fitness between Prn negative and Prn positive strains representing the predominant cluster I, as well as between cluster I and cluster II strains. A novel tagged-primer Illumina sequencing method was used to determine the proportion of each isolate in the DNA extracted from the lungs and trachea of control and ACV-immunised mice. The results revealed that cluster I strains colonised the mouse respiratory tract better regardless of immunisation status, and that Prn negative strains had better fitness in ACV-immunised mice.

Declaration relating to disposition of project thesis/dissertation

I hereby grant to the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or in part in the University libraries in all forms of media, now or hereafter known, subject to the provisions of the Copyright Act 1968. I retain all property rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.

I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstracts International (this is applicable to doctoral theses only).

Azadeh Safarchi

Signature

……………………………

Witness

21 September 2015

Date

The University recognises that there may be exceptional circumstances requiring restrictions on copying or conditions on use. Requests for restriction for a period of up to 2 years must be made in writing. Requests for a longer period of restriction may be considered in exceptional circumstances and require the approval of the Dean of Graduate Research.

FOR OFFICE USE ONLY

Date of completion of requirements for Award:

THIS SHEET IS TO BE GLUED TO THE INSIDE FRONT COVER OF THE THESIS

Originality statement

‘I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.’

Signed Azadeh Safarchi

Date 21 September 2015

Acknowledgment

First and foremost, I would like to thank A/Prof Ruiting Lan for taking me on as a PhD student and for his guidance and support for the duration of my candidature. I also appreciate all the support, computer prowess and expertise of Dr. Sophie Octavia, my co-supervisor and dearest friend.

Many thanks to all the students and friends in Lab 301B and my friends in office 301C, especially my dearest friend Vikneswari Mahendran, who gave me the opportunity to share and experience many memorable years in Australia. I am also very grateful to Laurence Luu for all his help in proofreading my thesis.

I would also like to acknowledge and thank my parents for all the support and encouragement they have provided me and for teaching me to work hard for the things that I aspire to achieve.

Last but definitely not least, I express my sincerest gratitude to my beloved husband, Amir Abas Hayati, for his endless love and continuous encouragement, his shoulders to cry on, his hands for help and support, and most of all for taking up all the responsibilities as well as the pressure of working and living in Australia during my study abroad and throughout this entire journey. Without him, I would have struggled to find the inspiration and motivation to complete my dissertation. This accomplishment would not have been possible without you. Dearest Amir Abas, I am truly thankful for having you in my life and I would like to dedicate this thesis to you with all my love and respect.

Publications and Presentations

Publications

Safarchi A., Octavia S., Luu L. D. W., Tay C. Y., Sintchenko V., Wood N., Marshall H., McIntyre P., Lan R., “Pertactin negative Bordetella pertussis demonstrates higher fitness under vaccine selection pressure in a mixed infection model” (published, Vaccine 2015 Nov 17;33(46):6277-81).

Safarchi A., Octavia S., Luu L. D. W., Tay C. Y., Sintchenko V., Wood N., Marshall H., McIntyre P., Lan R., “The differential fitness of Bordetella pertussis belonging to two major clusters in in vivo competition assay” (published, available online 27 January 2016 in Journal of Infection, 2016).

Safarchi A., Octavia S., Kaur S., Sintchenko V., Gilbert G. L., Wood N., McIntyre P., Marshall H., Keil A. D., Lan R., “Genomic dissection of Australian Bordetella pertussis isolates from the 2008-2012 epidemic” (manuscript in revision).

Poster presentations

Safarchi A., et al., 2014, “Comparative genomics of major Australian B. pertussis clones”, Microbiology after the genomics revolution: Genome 2014, Institut Pasteur, Paris, France.

Safarchi A., et al., 2015, “A Genomic Portrait of Australian Pertussis Epidemic in 2008”, Australian Society for Microbiology, Annual Conference in Canberra, Australia.

Abstract

Whooping cough, or pertussis, is a highly infectious respiratory disease in humans mainly caused by Bordetella pertussis. The success of worldwide vaccination against pertussis with the whole cell vaccine (WCV), introduced in the 1950s, led to a dramatic decrease in the morbidity and mortality associated with pertussis around the world. However, due to concerns over the reactogenic side effects of WCV, the acellular vaccine (ACV) was developed in the 1980s and was introduced in many developed countries from the 1990s. In Australia, ACV replaced WCV initially as a booster dose in 1997 and for all scheduled doses by 2000. Despite high vaccination coverage, there has been a re-emergence of pertussis worldwide, including in Australia.

Molecular epidemiological data suggest that the resurgence of pertussis in populations with high vaccine coverage is associated with genomic adaptations of B. pertussis to vaccine selection pressure. Genetic analysis by Single Nucleotide Polymorphism (SNP) typing divided Australian B. pertussis isolates from the 1970s to the present into five major clusters, referred to as SNP clusters I to V. It was also shown that after the introduction of ACV, the majority of isolates belonged to cluster I (carrying ptxP3/prn2), which replaced cluster II (carrying ptxP1/prn3). The emergence and expansion of non-pertactin expressing (Prn negative) isolates was also observed.

In Chapter 3, Illumina whole genome sequencing (WGS) was used to study the microevolution and genomic diversity of 22 B. pertussis isolates from the latest Australian epidemic (2008-2012). This included 10 Prn negative isolates with three different modes of inactivation (1 IS481F, 7 IS481R and 2 IS1002). Five pre-epidemic isolates were also sequenced for comparison. Five single nucleotide polymorphisms (SNPs) were common to the epidemic isolates and differentiated them from pre-epidemic isolates. The epidemic isolates could be further divided into five lineages, and spatial and temporal clustering was also revealed. Of the seven isolates not expressing Prn due to IS481R insertion, six were grouped together, suggesting clonal expansion, while one was derived independently. These findings also suggest that SNPs play an important role in the adaptation and microevolution of the 2008-2012 epidemic B. pertussis isolates.

In Chapter 4, two genome sequencing platforms, Illumina and PacBio, were used to investigate the gene content and genomic diversity of nine Australian B. pertussis isolates representing clusters I to V. There were 426 SNPs amongst the isolates, and phylogenetic analysis separated them into their respective clusters, thus confirming the relationship of the clusters revealed by a previous study using SNP typing. Non-synonymous SNPs, frameshift indels, new insertion sequences and gene losses were found to be the key factors driving the genomic diversity of B. pertussis strains. Eleven functional genes were converted to pseudogenes and six pseudogenes reverted to functional genes due to small frameshift indels. Deletion of two large regions of difference, BP0910A–BP0934 in all clusters and BP1947–BP1968 in cluster I, showed the ongoing genome reduction that has also been observed in B. pertussis strains from other countries. Multiple genome rearrangements, including translocations, inversions and a combination of both, were also observed, with 17 hotspot regions for rearrangement breakpoints. Genome rearrangements may affect the expression of the genes and phenotypes of the strains.

Recent studies showed that B. pertussis strains that do not express pertactin (Prn), a key antigenic component of the ACV, have emerged and become prevalent. In Chapter 5, in vivo competition assays in mice immunised with ACV and in naïve (control) mice were used to compare the proportion of colonisation by recent clinical Prn positive and Prn negative B. pertussis strains from Australia. The Prn negative strain colonised the respiratory tract more effectively than the Prn positive strain in immunised mice, out-competing the Prn positive strain by day 3 of infection. However, in control mice, the Prn positive strain out-competed the Prn negative strain. The finding that Prn negative strains possess a greater ability to colonise ACV-immunised mice is consistent with reports of a selective advantage for these strains in ACV-immunised humans.

In Chapter 6, an in vivo competition assay was carried out to compare the differential fitness of two Australian B. pertussis strains belonging to SNP clusters I and II, respectively. The cluster I strain colonised better than the cluster II strain in both naïve and immunised mice from day 3 post-infection in the lungs and trachea. The results suggest that cluster I strains have better fitness in the population regardless of the immunisation status of the host, which may have contributed to their predominance. It was also shown that ACV still enhances bacterial clearance from the mouse respiratory tract despite the antigenic mismatch in the cluster I strain.

The findings from this thesis contribute to an enhanced understanding of the evolution and adaptation of the currently predominant Australian B. pertussis at the genomic level. In addition, the findings form the basis for future studies on the fitness of B. pertussis strains under the pressure of different vaccines and will help to develop new vaccination strategies.

List of Abbreviations

ACT Adenylate cyclase toxin

ACV Acellular vaccine

AUC Area under Curve

BALB/C Bagg Albino laboratory-bred (inbred research mouse strain)

bp base pair

Bvg Bordetella virulence gene

CGH Comparative Genome Hybridisation

CFU Colony Forming Unit

DNT Dermonecrotic toxin

EDTA Ethylenediaminetetraacetic acid

ELISA Enzyme Linked Immunosorbent Assay

EL Epidemic lineage

FHA Filamentous haemagglutinin

Fim Fimbriae

Indels Small insertions and deletions

IS Insertion sequence

min minute

kb kilobase

LCB Locally Collinear Blocks

MLST Multilocus Sequence Typing

MLVA Multilocus VNTR Analysis

MT MLVA type

OD Optical Density

ORF Open reading frame

PacBio Pacific Biosciences

PCR Polymerase Chain Reaction

PFGE Pulsed Field Gel Electrophoresis

Prn Pertactin

Ptx Pertussis toxin

RD Regions of Difference

SMRT Single Molecule Real-Time

SNP Single Nucleotide Polymorphism

SP SNP profile

SS Stainer-Scholte

TBS Tris-buffered Saline

TcfA Tracheal colonisation factor

TCT Tracheal cytotoxin

UC Unclustered

VNTR Variable Number Tandem Repeat

WCV Whole cell vaccine

WGS Whole genome sequencing

WHO World Health Organisation

Table of contents

CHAPTER 1. LITERATURE REVIEW
1.1 THE GENUS BORDETELLA AND ITS CHARACTERISTICS
1.2 BORDETELLA PERTUSSIS
1.3 VIRULENCE FACTORS IN BORDETELLA PERTUSSIS
1.3.1 Adhesins
1.3.2 Toxins
1.3.3 Additional toxins and virulence factors
1.3.4 BvgAS and the regulation of virulence
1.4 PERTUSSIS – THE DISEASE
1.4.1 Definition and clinical symptoms
1.4.2 Diagnosis of pertussis
1.4.3 Treatment
1.5 PERTUSSIS VACCINES
1.5.1 Whole cell vaccines
1.5.2 Acellular vaccines
1.5.3 Efficacy and protection from pertussis infection by acellular vaccine
1.5.4 Immunity responses to infection and vaccines
1.5.5 Vaccination schedules
1.6 EPIDEMIOLOGY OF PERTUSSIS
1.6.1 Pertussis around the globe
1.6.2 Pertussis in Australia
1.6.3 Multiple causes of pertussis re-emergence
1.7 MOLECULAR EPIDEMIOLOGY AND EVOLUTION OF B. PERTUSSIS
1.7.1 The genomic content of B. pertussis
1.7.2 Adaptation and evolution of B. pertussis
1.7.3 Genotyping tools for epidemiologic studies
1.7.4 Genomic adaptation and evolution of B. pertussis strains in Australia
1.7.5 Animal models to study B. pertussis infection
1.8 AIMS AND MOTIVATIONS

CHAPTER 2. MATERIALS AND METHODS
2.1 MATERIALS
2.1.1 Bacterial Strain
2.1.2 Mouse strain
2.2 METHODS
2.2.1 Culturing of B. pertussis
2.3 DNA EXTRACTION
2.4 POLYMERASE CHAIN REACTION (PCR)
2.5 AGAROSE GEL ELECTROPHORESIS
2.6 PCR PRODUCT PURIFICATION
2.7 COLONY FORMING UNIT (CFU) COUNT
2.8 WHOLE GENOME SEQUENCING
2.9 GENERAL PROCEDURES PERFORMED IN MOUSE MODEL STUDIES
2.9.1 Mouse housing and monitoring
2.9.2 Anaesthetic methods used for sedation
2.9.3 Mouse blood collection

CHAPTER 3. GENOMIC DISSECTION OF AUSTRALIAN BORDETELLA PERTUSSIS ISOLATES FROM THE 2008-2012 EPIDEMIC
3.1 INTRODUCTION
3.2 AIMS AND MOTIVATION
3.3 MATERIALS AND METHODS
3.3.1 Bacterial Strains
3.3.2 DNA sequencing and Assembly
3.3.3 SNP Identification
3.3.4 Insertion sequence elements analysis
3.3.5 Phylogenetic analysis
3.3.6 Reference genomes
3.4 RESULTS
3.4.1 Selection and sequencing of epidemic isolates
3.4.2 Polymorphisms in SP13 isolates
3.4.3 Phylogenetic relationships
3.4.4 Potential adaptive SNPs of epidemic SP13 B. pertussis isolates
3.4.5 Indels
3.4.6 Insertion Sequence elements
3.4.7 Gene loss
3.5 DISCUSSION
3.6 CONCLUSION

CHAPTER 4. COMPARATIVE GENOMICS OF MAJOR AUSTRALIAN BORDETELLA PERTUSSIS CLONES
4.1 INTRODUCTION
4.2 AIMS AND MOTIVATIONS
4.3 MATERIALS AND METHODS
4.3.1 Bacterial strains
4.3.2 DNA extraction and quality control
4.3.3 Illumina sequencing and assembly
4.3.4 PacBio sequencing and assembly
4.3.5 SNPs, indels, insertion Sequence and detection of gene losses
Genome rearrangements
4.3.6 Phylogenetic analysis
4.4 RESULTS
4.4.1 Selection and sequencing of representative isolates of different SNP clusters from Australia
4.4.2 Single Nucleotide Polymorphisms in Different Clusters
4.4.3 Phylogenetic Relationships of the isolates
4.4.4 Potential Adaptive SNPs in Different isolates/Clusters
4.4.5 Indels
4.4.6 Insertion Sequences
4.4.7 Gene Loss
4.4.8 Genome rearrangement
4.5 DISCUSSION
4.6 CONCLUSION

CHAPTER 5. FITNESS OF PERTACTIN NEGATIVE BORDETELLA PERTUSSIS IN A MIXED INFECTION MODEL
5.1 INTRODUCTION
5.2 AIMS AND MOTIVATION
5.3 MATERIAL AND METHODS
5.3.1 B. pertussis clinical strains
5.3.2 in vitro growth curve determination
5.3.3 The mouse model of B. pertussis infection
5.3.4 Differentiation of the two B. pertussis isolates in the mixed infection in lungs and trachea
5.3.5 Statistical analysis
5.4 RESULTS
5.4.1 in vitro growth rate of the isolates used in this study
5.4.2 Bacterial clearance in immunised mice infected with Prn positive and negative isolates
5.4.3 Competitive fitness of Prn negative B. pertussis in the mixed infection in vivo study
5.5 DISCUSSION

CHAPTER 6. THE DIFFERENTIAL FITNESS OF BORDETELLA PERTUSSIS BELONGING TO TWO MAJOR CLUSTERS IN IN VIVO COMPETITION ASSAY
6.1 INTRODUCTION
6.2 AIMS AND MOTIVATION
6.3 MATERIALS AND METHODS
6.3.1 B. pertussis clinical strains
6.3.2 in vitro growth curve determination
6.3.3 The mouse model of B. pertussis infection
6.3.4 Differentiation of the two B. pertussis isolates in the mixed infection in lungs and trachea
6.3.5 Statistical analysis
6.4 RESULTS
6.4.1 in vitro growth rate of the isolates used in this study
6.4.2 Bacterial clearance in immunised mice infected with the mixed infection of cluster I and cluster II isolates
6.4.3 Competitive fitness of cluster I B. pertussis in the mixed infection in vivo study
6.5 DISCUSSION
6.6 CONCLUSION

CHAPTER 7. GENERAL DISCUSSION
7.1 MICROEVOLUTION OF CURRENT EPIDEMIC B. PERTUSSIS ISOLATES
7.1.1 A genomic portrait of the 2008-2012 Australian epidemic
7.1.2 Diversification of epidemic SP13 through random mutations, adaptive changes, indels and insertion sequence transposition
7.1.3 Independent evolution of Prn negative isolates
7.2 COMPARATIVE GENOMIC INVESTIGATION OF MAJOR AUSTRALIAN B. PERTUSSIS CLONES
7.2.1 Comparative genomic variation of current circulating cluster I B. pertussis strains with other clusters in Australia
7.2.2 Ongoing genome reduction in B. pertussis through large indels
7.2.3 Genetic diversities driven by transposition and genome rearrangements
7.3 THE COMPARATIVE FITNESS OF EPIDEMIC B. PERTUSSIS STRAINS IN VIVO IN THE MOUSE MODEL
7.3.1 Development of a mixed infection model and a new method to perform mixed bacterial competition assay
7.3.2 The better fitness of Prn negative strains under the pressure of ACV selection
7.3.3 Better fitness of cluster I strains in both immunised and unimmunised hosts
7.4 FUTURE WORK
7.5 CONCLUSION

CHAPTER 8. REFERENCES ............................................................................................. 165

APPENDIX 1: LIST OF SNPS DETECTED IN SP13 B. PERTUSSIS ISOLATES USING

ILLUMINA WHOLE GENOME SEQUENCING .................................................................. 195

APPENDIX 2: LIST OF SNPS DETECTED IN MAJOR AUSTRALIAN CLONE .............. 202

APPENDIX 3: GENES AFFECTED BY 300 BP MORE DELETION ................................... 211


List of Figures

Figure 1.3-1: Adhesin factors in B. pertussis and their structures. A) Pertactin, B) filamentous hemagglutinin and C) fimbriae. Figure was adapted from Melvin et al. [65]

Figure 1.3-2: Schematic pictures of pertussis toxin. (A) The holotoxin viewed perpendicular to the five-fold axis of the B-oligomer. (B) The B-oligomer viewed along the five-fold axis from the surface opposite to S1 with the position of the five-fold axis indicated by an asterisk. Subunits are colour-coded as follows: S1, green; S2, turquoise; S3, purple; S4, red; S5, yellow. β-strands (three or more residues) are shown as arrows, and α-helices as spirals. (Figure was adapted from Stein et al. [97])

Figure 1.3-3: The BvgAS master regulatory system. A) The structure of BvgAS and the activation pathway that occurs by auto-phosphorylation of conserved histidine (H) in the histidine kinase domain. B) Phosphorylated BvgA (BvgA-P) dimerises and activates the expression of virulence-associated genes (which are subdivided into class 1 and class 2 genes) and represses the expression of virulence-repressed genes (which are class 4 genes). Figure was adapted from Melvin et al. [65]

Figure 1.5-1: Trend of multiple changes in pertussis vaccination schedule in Australia. (Adapted from Campbell et al. [204])

Figure 1.6-1: Incidence rate of pertussis from 1995-2012 in Australia. Notification rates were separated according to age group. The graph was obtained from Pillsbury et al. [219].

Figure 1.7-1: Antigenic shift in the B. pertussis population resulting in increased antigenic mismatch between vaccine strains in use and circulating strains. (Figure adapted from van der Ark et al. [267])

Figure 1.7-2: Trends of the four major clusters of B. pertussis in Australia. Four major clusters (I–IV) in Australia were divided into three periods: WCV (prior to 1997), transition from WCV to ACV (1997–1999), and ACV (2000 onwards). Percentage (y axis) of a given cluster of the total number of isolates for that period is shown (adapted from Octavia et al. [306])

Figure 3.4-1: The functional categories of the single nucleotide polymorphisms observed in Australian Bordetella pertussis SP13 isolates.

Figure 3.4-2: Minimum evolutionary tree of 27 Bordetella pertussis SP13 isolates based on 305 single nucleotide polymorphisms (SNPs). The number on the internal and terminal branches corresponds to the number of SNPs supporting each branch. Epidemic isolates grouped into 5 epidemic lineages (EL).

Figure 4.4-1: Minimum evolutionary tree of 10 B. pertussis isolates from different clusters based on 426 SNPs. The number on the internal and terminal branches corresponds to the number of SNPs supporting each branch. Isolation year, cluster information and SNP profile numbers are shown in brackets.

Figure 4.4-2: Genes affected by partial or complete deletion in B. pertussis isolates analysed in this study. Hybrid assemblies were blasted against B. pertussis Tohama I genome and regions with 300 bp or more deleted were detected and analysed. Two large loci were deleted in all isolates and BP1947.

Figure 4.4-3: The proportion of each functional category for deleted genes.

Figure 4.4-4: Pairwise genome comparison of B. pertussis isolates analysed in this study. Regions of homology between a pair of genomes are indicated by lines; red for the same direction and blue for reverse direction (inversion). Translocation is also apparent when the lines are crossing over.

Figure 4.5-1: Genomic diversity in different clusters based on the results of SNP typing as minimum evolutionary tree, deleted and new IS elements, indels and deleted regions. Details for each isolate, including name, year of isolation, cluster and SNP profile, were based on the Octavia et al. [306] results.

Figure 5.4-1: Growth curve of two Prn positive and Prn negative B. pertussis isolates from cluster I using optical density. Doubling time was also estimated based on the CFU count results and no significant difference was observed. The error bars have been calculated and shown in Figure 5.4-1 which may affect the doubling time calculation.

Figure 5.4-2: Colonisation of Bordetella pertussis in naïve and ACV immunised mice infected with a mixture of Prn negative and Prn positive B. pertussis. Log10 CFU was calculated based on the number of colonies in groups of 3 mice for each time point. A) Lungs, B) trachea and C) area under the curve for lungs (P = 0.0034) and trachea (P = 0.02) of naïve and ACV immunised mice. * denotes significant difference (P < 0.05) in bacterial clearance.

Figure 5.4-3: The proportion of Prn negative isolate in A) lungs and B) trachea of immunised and control mice at different time points. Significant difference is found in all time points post-infection (P > 0.05).

Figure 6.4-1: Growth curve of two B. pertussis isolates from cluster I and II using optical density. Doubling time was also calculated based on the CFU count results and no significant difference was observed.

Figure 6.4-2: Colonisation curve for naïve and ACV immunised mice infected with a mixture of cluster I (ptxP3, prn2) and cluster II (ptxP1, prn3) B. pertussis. Log10 CFU was calculated based on the number of colonies in groups of 3 mice for each time point. A) Lungs; B) trachea, * significant difference (P < 0.05) in bacterial clearance was found in 3 days post-infection in lungs and 7 days in trachea; C) area under the curve for lungs (P = 0.00003) and trachea (P = 0.00002) of immunised and control mice in the mixture infection.

Figure 6.4-3: The proportion of cluster I (ptxP3, prn2) strain in A) lungs and B) trachea of immunised and control mice. Significant differences were found in day 14 of post-infection in lungs. * (P < 0.05)


List of Tables

Table 1.1-1: Properties of the Bordetella species

Table 2.1-1: B. pertussis isolates selected from major clusters and used in Chapters 4 and 6.

Table 2.1-2: Details of 28 B. pertussis isolates from cluster I including SNP profile 13 and 16 (all ptxP3 isolates). Prn negative isolates and the cause of Prn disruption are mentioned.

Table 3.3-1: Details of SP13 isolates used in this study.

Table 3.4-1: Quality of assembly for each SP13 B. pertussis isolate based on velvetg.

Table 3.4-2: Common SNPs found in all 2008-2012 SP13 isolates when compared with B. pertussis Tohama I.

Table 3.4-3: Single nucleotide polymorphisms unique to epidemic lineages.

Table 3.4-4: Frameshift indels in SP13 isolates.

Table 3.4-5: List of non-frameshift and intergenic indels in SP13 isolates

Table 3.4-6: New IS elements which were found in SP13 isolates. There was no unique IS for 2008-2012 epidemic isolates and only one IS located in BP2327 was common for EL4.

Table 4.3-1: B. pertussis isolates used for whole genome sequencing.

Table 4.4-1: Quality of assembly for each isolate, Illumina sequencing.

Table 4.4-2: Quality of assembly for each isolate, PacBio sequencing.

Table 4.4-3: The number of SNPs observed in B. pertussis isolates from different clusters.

Table 4.4-4: SNPs located in genes regulated by the Bvg system.

Table 4.4-5: SNPs detected using SAMtools and progressiveMauve when compared with B. pertussis CS as the reference genome.

Table 4.4-6: Fixed SNPs for one or more clusters based on the phylogenetic tree.

Table 4.4-7: Possible unique SNPs for each cluster

Table 4.4-8: Indels found in B. pertussis isolates belonging to different clusters.

Table 4.4-9: General information about the presence and deletion of IS481 identified in B. pertussis isolates that were analysed in this study.

Table 4.4-10: New IS elements found in the different isolates.

Table 4.4-11: The number and percentage of genes affected by partial or complete deletions in each isolate

Table 4.4-12: Potential rearrangements in the B. pertussis isolates analysed in this study

Table 5.3-1: Primers designed for this study.

Table 6.3-1: Selected SNPs for this study and the designed primers used for sequencing

Chapter 1


Chapter 1. Literature review

1.1 The genus Bordetella and its characteristics

The genus Bordetella, within the family Alcaligenaceae, was named in honour of Jules Bordet, who identified the microorganism from a patient with whooping cough in 1906. Within the Bordetella genus, there are nine species (Table 1.1-1), of which Bordetella pertussis and Bordetella parapertussis are well known to cause whooping cough in humans [1]. B. pertussis, B. bronchiseptica and B. parapertussis are grouped in a cluster named the B. bronchiseptica cluster [2]. Different methods, including DNA-DNA hybridisation, multilocus enzyme electrophoresis (MLEE) and comparative sequence analysis of multiple genes, including the 16S and 23S rRNA genes, the genes for the beta subunits of RNA polymerase (RpoB) and gyrase (GyrB), and virulence genes, have demonstrated the limited genetic diversity separating these three Bordetella species [2-6].

B. bronchiseptica, the oldest member of the genus Bordetella, has been isolated from a broad range of mammals with respiratory tract infections, including monkeys, rabbits, swine, dogs and horses [7].

B. parapertussis, which is closely related to B. pertussis, can cause pertussis-like infections in humans. These infections are less severe than pertussis in terms of duration and the severity of symptoms [8, 9]. In fact, several studies have indicated that between 5 and 35% of all pertussis cases reported in European countries or in the US are caused by B. parapertussis [10-12].

Other Bordetella species have also been isolated from humans, including B. holmesii, B. petrii, B. ansorpii and B. trematum, and two from birds, B. hinzii and B. avium [1].

B. avium was first isolated from young turkeys with upper respiratory tract infection.

B. avium infections can have major economic impacts on the poultry industry [13-15].

Isolation of B. avium from human cases of respiratory disease has been recently

reported in patients with cystic fibrosis, and chronic obstructive pulmonary disease

[16, 17].


Table 1.1-1: Properties of the Bordetella species a

| Feature | B. pertussis | B. parapertussis | B. bronchiseptica | B. holmesii | B. hinzii | B. avium | B. trematum | B. petrii | B. ansorpii |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Host | Humans | Humans, sheep | Mammals | Humans? | Birds, humans | Birds, reptiles | Humans | Environment, diverse hosts | Human |
| Disease | Whooping cough | Mild whooping cough | Various respiratory diseases (e.g. atrophic rhinitis in piglets, kennel cough in dogs etc.) | Septicaemia, respiratory illness | Septicaemia in patients with underlying disease; asymptomatic; respiratory infection in birds | Respiratory infection (turkey coryza) | Wounds, ear infection | Bone degenerative disease in man | Epidermal cyst, septicaemia |
| Site of isolation in humans | Respiratory tract | Respiratory tract | Respiratory tract, blood | Respiratory tract, blood | Respiratory tract, blood | – | Wounds, ear | | |
| G+C content (mol%) | 66–68 | 66–68 | 66–68 | 61.5–62.3 | 65–67 | 62 | 64–65 | 62–67 | 63–65 |
| Genome size (kbp) | 3880–4060 | >4400 | >5300 | >3800 | >4700 | >3700 | ND | >5200 | |

a: Table was adapted from Gross [18].


B. holmesii was initially isolated from a patient with septicaemia [19]. Soon after, it was identified that B. holmesii can cause both pertussis-like illnesses and invasive infections such as meningitis, bacteraemia, endocarditis, arthritis and pneumonia in both healthy and immunocompromised individuals [20]. Compared to B. pertussis infections, respiratory infections caused by B. holmesii are commonly milder [21]. Recent epidemiological studies showed that B. holmesii has been isolated in different countries during pertussis outbreaks [22-25]. Because IS481 is present in both B. holmesii and B. pertussis, it is difficult to differentiate these two species in pertussis infections by diagnostic PCR targeting IS481.

B. holmesii and B. pertussis have 99.5% similarity in their 16S rRNA sequences [26]. However, cellular fatty acid profiles and genomic analysis have shown that B. holmesii is more closely related to B. hinzii and B. avium than to other Bordetella species [26, 27]. A new member of the genus Bordetella, B. petrii, was initially isolated from environmental samples. There have also been recent reports of B. petrii being isolated from humans with different infections; however, its pathogenicity remains unclear [1, 28]. The B. petrii genome contains seven large genomic islands, most of which encode metabolic factors that degrade aromatic compounds and enable it to survive in difficult ecological niches [18, 29].

There is limited information about the remaining three Bordetella species, B. trematum, B. hinzii and B. ansorpii. B. hinzii has been isolated from immunocompromised humans, the respiratory tract of poultry, mice kept in experimental facilities, and the blood culture of rodents [30-32]. B. trematum has been isolated from humans with chronic diabetes and ear or wound infections and, recently, from patients with bacteraemia [33-35]. B. trematum was also isolated from the rumen of native Korean cattle [36]. B. hinzii has a closer genomic relationship with B. avium and B. petrii than with other Bordetella species. There are also a greater number of genes associated with membrane transport activity in B. hinzii compared to B. trematum [37]. There is currently limited information about B. ansorpii, as it has only recently been isolated from two immunocompromised patients [28]. Analysis of its 16S rRNA sequences has confirmed that it belongs to the genus Bordetella and is closely related to B. petrii and B. hinzii [18]. Interestingly, however, B. ansorpii is capable of anaerobic growth, while most other Bordetella spp. are strictly aerobic [38].

Bordetella species are small, Gram-negative, non-spore-forming coccobacilli measuring 0.2-0.5 × 0.5-2.0 μm. With the exception of B. petrii and B. ansorpii, Bordetella species are strictly aerobic and grow well at 35 to 37°C [6]. On plates, colonies are smooth, convex and pearl shaped. They can agglutinate erythrocytes from a variety of mammals, and the colonies are surrounded by a zone of haemolysis on Bordet-Gengou (BG) agar (Becton Dickinson) supplemented with 7% defibrinated horse blood.

Bordetella species have a high G+C content of 65 to 68 mol% and can be differentiated by the presence or absence of specific insertion sequence (IS) elements and by virulence gene expression [6, 39]. The hosts and characteristics of each species are listed in Table 1.1-1.
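As a brief illustration of how a G+C content value such as those quoted above is obtained from sequence data, the following Python sketch computes the percentage of G and C bases in a DNA sequence. The function name gc_content and the 20-base example fragment are hypothetical and are not taken from any Bordetella genome. Ignoring ambiguous bases such as N is an assumption made here for simplicity.

```python
def gc_content(sequence: str) -> float:
    """Return the G+C content of a DNA sequence as a percentage (mol%)."""
    seq = sequence.upper()
    # Count only unambiguous bases; N, gaps and other symbols are skipped
    # (an assumption of this sketch, not a fixed convention).
    counted = [base for base in seq if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Hypothetical 20-base fragment used purely for illustration.
example = "GCCGGCATCGCCGACGGCTA"
print(f"{gc_content(example):.1f} mol% G+C")
```

For a whole genome, the same calculation would be applied to the full assembled sequence rather than to a short fragment.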

1.2 Bordetella pertussis

B. pertussis is the most well-studied species of Bordetella and is the main causative agent of whooping cough. B. pertussis has been shown to have recently evolved from a B. bronchiseptica-like ancestor 0.3-2.5 million years ago [40]. B. pertussis and B. parapertussis evolved independently at different time points from B. bronchiseptica and are clones of B. bronchiseptica [6, 40]. A recent study analysing 343 isolates suggested that B. pertussis has evolved within the last 500 years [41]. During the evolution of B. pertussis from its ancestors, some genes were deleted or inactivated as it became a strict human pathogen [42]. Despite the close genomic relationship, B. pertussis is the only Bordetella species that can produce pertussis toxin (Ptx). Both B. parapertussis and B. bronchiseptica have lost the ability to express Ptx due to mutations within the promoter region [3].


1.3 Virulence factors in Bordetella pertussis

1.3.1 Adhesins

One of the main requirements for pathogenicity involves the firm attachment of

bacteria to the target cells in the host. Adhesins are cell surface molecules that

enable pathogens to attach to host cells and contribute to the initial step in

establishing infection [43]. The ability of Bordetella species to attach to the

respiratory tract is a result of the wide range of adhesin molecules present. Due to

the wide range of adhesins, some adhesins may play more important roles as

virulence factors than others. Here we discuss the major adhesins present in B.

pertussis that contribute to the pathogenicity of the bacteria.

1.3.1.1 Pertactin

Pertactin (Prn), first named P.69, is one of the many adhesins in B. pertussis that mediate the attachment of the bacterium to mammalian cells during pertussis infection [44]. Pertactin is an outer membrane protein that belongs to the autotransporter family and is expressed on the bacterial surface. It is encoded by the BvgAS-dependent gene prn and is highly polymorphic [45, 46]. Polymorphism in the prn gene usually occurs in two regions of the gene, named region 1 and region 2, which are amino acid repeat regions. Region 1 is located near the RGD motif, which is proposed to have a role in adherence to host receptors [44, 45]. Like other autotransporters, Prn is produced as a precursor of 91, 93 and 92.5 kDa in B. pertussis, B. parapertussis and B. bronchiseptica, respectively, and then undergoes autoproteolytic processing, resulting in a mature pertactin of 69, 70 and 68 kDa, respectively, which is located on the outer membrane of the bacterial cell [47-50].
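The precursor and mature masses quoted above imply how much of each Prn precursor is removed during autoproteolytic processing. The short Python sketch below does this subtraction for the three species. The masses are those given in the text. The dictionary and function names are hypothetical, and treating the removed portion as a simple mass difference is a simplifying assumption.

```python
# Precursor and mature Prn masses in kDa, as quoted in the text above.
PRN_MASSES_KDA = {
    "B. pertussis": (91.0, 69.0),
    "B. parapertussis": (93.0, 70.0),
    "B. bronchiseptica": (92.5, 68.0),
}


def prn_mass_removed(masses=PRN_MASSES_KDA):
    """Return the mass (kDa) lost between precursor and mature Prn per species."""
    return {
        species: precursor - mature
        for species, (precursor, mature) in masses.items()
    }


for species, removed in prn_mass_removed().items():
    print(f"{species}: {removed:.1f} kDa removed during processing")
```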

There is conflicting evidence about the role of Prn in adhesion. While some studies have demonstrated the role of Prn in attachment to CHO and HeLa cells [51], others have indicated that B. pertussis colonisation of the mouse lungs is not affected in Prn mutants [52]. However, in B. bronchiseptica, the 68 kDa Prn promotes persistence in the lower respiratory tract of mice and resistance to neutrophil-mediated clearance [53-55]. Several studies have also shown that Prn is important for immunity against disease [56-58] and that the levels of anti-Prn antibodies correlated with protection [59, 60]. Recently, the epitope for CD4+ T-cells in Prn was found to evoke strong cytokine responses after infection or immunisation of mice and was associated with CD4+ immunity in humans [61].

1.3.1.2 Filamentous Haemagglutinin

Filamentous haemagglutinin (FHA) is one of the major adhesin proteins in the genus Bordetella. It is a large rod-shaped protein that is synthesised as a 370 kDa precursor and undergoes processing at both the N and C termini, by peptidase and SphB1 respectively, to produce the mature ~250 kDa FHA. FHA is translocated across the cytoplasmic membrane by the Sec translocation system and across the outer membrane by FhaC [62]. The fha locus comprises at least 3 genes, fhaA, fhaB and fhaC, of which fhaB controls the production and assembly of FHA. There are also other Bvg-regulated genes encoding FhaB-like proteins, including fhaL (Fha-like large) and fhaS (Fha-like small), which are both expressed and may be involved in host-pathogen interactions [42]. The reason why B. pertussis expresses all three genes is still unclear. However, since FHA is a key virulence factor in B. pertussis, the Fha-like genes may act as back-up copies, and these genes, expressed at low levels, might act as reservoirs for homologous recombination with the main fha gene to increase genetic diversity [63].

FHA helps B. pertussis to bind to a broad range of target cells, such as epithelial cells and macrophages, using the RGD (Arg-Gly-Asp) motif and a carbohydrate recognition domain. However, recent studies have shown that the mature C-terminal domain is more important than the RGD motif, as there were no significant differences in bacterial adherence to host cells when the RGD motif was changed to an REA (Arg-Ala-Asp) motif [64]. Furthermore, FHA is vital for the progression of infection from the upper to the lower respiratory tract [65], and it was shown to mediate the initial bacterial colonisation of the trachea, but not of the lungs, in mice. FHA also suppresses innate immune responses in mouse model studies and can downregulate pro-inflammatory cytokine production, resulting in decreased inflammation and increased bacterial persistence. However, FHA has a different role in humans. Human studies have shown that FHA increases pro-inflammatory and pro-apoptotic responses in monocyte-like cells, monocyte-derived macrophages and bronchial epithelial cells [66].

Interestingly, new findings showed that the presence of FHA correlated with increases in the average membrane rigidity. They also showed that FHA contributes to biofilm structures in human and mouse respiratory tracts after infection with B. pertussis. This was thought to help the pathogen attach to host cells and increase survival in response to the immune system [67, 68].

Figure 1.3-1: Adhesin factors in B. pertussis and their structures. A) Pertactin, B) filamentous hemagglutinin and C) fimbriae. Figure was adapted from Melvin et al. [65]

1.3.1.3 Fimbriae

Type I pili, also known as fimbriae, are produced by almost all Bordetella species and are composed of either of two major subunits, Fim2 or Fim3, encoded by the fim2 and fim3 genes respectively. The fimBCD operon, located between the fhaB and fhaC genes, is an additional gene cluster that encodes the putative chaperone (FimB), usher (FimC) and tip adhesin (FimD) required for the export and assembly of fimbrial subunits [69-71].


In B. pertussis and B. bronchiseptica, fimD encodes a 40 kDa minor subunit that is attached to the tip of the assembled major subunits. It appears to be necessary for pathogen colonisation of the nasopharynx of rats and mice and is also involved in the attachment of the major fimbrial subunit to human monocytes [72, 73]. There are also three pseudogenes, fimX, fimA and fimN, which are homologous to fim2 and fim3. fimX is expressed at very low levels, and the other two are not expressed in B. pertussis due to deletions in the gene or promoter, although their products are secreted in other Bordetella species such as B. bronchiseptica and B. parapertussis [74-76].

Although the expression of Fim subunits is governed by the BvgAS system, recent studies have demonstrated that Bvg-regulated promoters behave differently from each other under in vivo and in vitro conditions, with the promoter of Fim2 being stronger in the in vivo study [77]. In addition, it seems that both Fim2 and Fim3 are expressed in vivo [78].

Fim2 and Fim3 are closely related in molecular weight but can be differentiated serologically. B. pertussis may express one or both types of fimbriae and can undergo phase variation [79, 80]. Both Fim2 (22.5 kDa) and Fim3 (22 kDa) are monomers and contain two regions with heparin-binding activity that might be involved in binding to the extracellular matrix of respiratory epithelial cells [81, 82].

Fimbriae are required for the adhesion, colonisation and persistence of B. pertussis

in the respiratory tract particularly for adherence to tracheal ciliated respiratory

epithelium. It has also been suggested that Fim may be important for humoral

immune responses by interacting with epithelial cells, monocytes and macrophages

[83] and for the suppression of the initial inflammatory response during infection

[65].

1.3.1.4 Tracheal colonisation factor

Tracheal colonisation factor is encoded by the tcfA gene as a 90 kDa precursor and

undergoes autoproteolytic processing to form a 60 kDa surface associated proline-

rich protein that contains an RGD motif like Prn, FHA and BrkA proteins [84]. It is


exclusively expressed in B. pertussis and appears to be specifically involved in the

colonisation of the trachea in mouse model studies [85].

1.3.2 Toxins

1.3.2.1 Pertussis toxin

Ptx was one of the first virulence factors identified and is composed of five different subunits (S1-S5) encoded by the ptx operon. Despite the presence of the ptx operon in B. parapertussis and B. bronchiseptica, Ptx is only produced and secreted by B. pertussis [39]. Pertussis toxin is an ADP-ribosylating AB5 protein. It consists of a catalytic A subunit encoded by ptxA and five B subunits, which are involved in membrane binding or transport and are encoded by ptxB-E (Figure 1.3-2) [65, 86]. The holotoxin is first assembled in the periplasm, and the 105 kDa hexameric toxin is then secreted by the type IV secretion system (T4SS) encoded by the ptl locus. The ptl locus comprises nine genes, which are located immediately downstream of the ptx operon [87-89]. The pyramid shape of Ptx (Figure 1.3-2) is formed by the S1 (PtxA) subunit, which acts as an ADP-ribosyltransferase, sitting on top of a triangular base formed by the B oligomer. The B oligomer consists of two molecules of S4 and one molecule each of S2, S3 and S5 and is responsible for binding the toxin to the target cells [81, 90, 91]. Evidence has shown that the S1 subunit may have a greater role in Ptx assembly by serving as the main nucleation site for the B oligomer and that it can be localised to the outer membrane of the bacterium independently [92].
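The AB5 stoichiometry described above can be checked with a small tally. The Python sketch below counts the subunits in the A component and the B oligomer as stated in the text, and confirms that five B subunits plus one A subunit give a hexamer. The variable and function names are hypothetical, and no subunit masses are assumed.

```python
from collections import Counter

# Subunit composition as described in the text: one catalytic S1 (A subunit)
# and a B oligomer of two S4 plus one each of S2, S3 and S5.
A_SUBUNIT = Counter({"S1": 1})
B_OLIGOMER = Counter({"S2": 1, "S3": 1, "S4": 2, "S5": 1})


def holotoxin_composition():
    """Combine the A subunit and the B oligomer into the full holotoxin."""
    return A_SUBUNIT + B_OLIGOMER


composition = holotoxin_composition()
total_subunits = sum(composition.values())
print(dict(composition), "total subunits:", total_subunits)
```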

A promoter region (ptxP) of about 170 bp upstream of the ptx operon positively regulates the expression of pertussis toxin [93]. For transcription, two tandem 20-bp repeats (-157 to -117) upstream of the ptx operon act as the binding sites for BvgA dimers on ptxP and are required for Ptx expression [94].

It was thought that Ptx was responsible for pertussis infections in the host and that the disease is a toxin-mediated infection [95, 96]. However, despite the importance of Ptx in pathogenesis, it has been demonstrated that pertussis infections are a result of the coordinated functions of many different virulence factors [65].


Figure 1.3-2: Schematic pictures of pertussis toxin. (A) The holotoxin viewed perpendicular to the five-fold axis of the B-oligomer. (B) The B-oligomer viewed along the five-fold axis from the surface opposite to S1 with the position of the five-fold axis indicated by an asterisk. Subunits are colour-coded as follows: S1, green; S2, turquoise; S3, purple; S4, red; S5, yellow. β-strands (three or more residues) are shown as arrows, and α-helices as spirals. (Figure was adapted from Stein et al. [97])

Ptx can adhere and bind to multiple receptors including oligosaccharide receptors

that are present in various eukaryotic proteins and have been identified in a broad

range of cell types. Ptx can mediate the attachment of the bacterium to ciliated

epithelial cells [65, 91, 98].

It was demonstrated that biological responses mediated by Ptx are caused by at least two different signalling pathways, comprising Gi/o protein-dependent and -independent effects of Ptx action [81, 99]. In the first pathway, target proteins such as the α subunit of heterotrimeric Gi/o proteins are ADP-ribosylated by the A subunit following the binding of Ptx to host cells in various tissues, including β cells of pancreatic islets, adipocytes, macrophages and lymphocytes. This leads to a wide range of downstream effects including histamine sensitisation, dysregulation of the immune system and prolonged airway inflammatory responses [65, 81, 100]. In the Gi/o protein-independent pathway, the B oligomer performs the main role by binding to specific surface proteins on host cells, such as components of T cell receptors and Toll-like receptors (TLR) 2 and 4. This leads to the induction of biological responses [81, 86, 90, 99].


1.3.2.2 Adenylate cyclase toxin

Another important toxin in B. pertussis is adenylate cyclase toxin (ACT) which is a

member of the RTX (repeats in toxin) toxin family. It is encoded by the cyaA gene

which has a high degree of homology to ACT in E. coli. This toxin is expressed by

all Bordetella species that can cause infections in mammals [65, 86]. ACT, a 200-

kDa polypeptide, is secreted by a cyaBDE-encoded type I secretion system and

contains two functionally separate domains: a catalytic (N-terminal) domain

and a haemolytic-binding (C-terminal) domain [81]. The C-terminal domain is

composed of a hydrophobic channel domain and calcium binding repeats. This

domain mediates binding and internalisation of the toxin into the host cells. The

N-terminal domain has adenylate cyclase activity and a calmodulin-binding site,

and converts ATP to cyclic AMP (cAMP) [101, 102].

Studies have shown that most of the ACT is localised on the surface of the

bacterium by the interaction with FHA and can mediate inhibition of phagocytosis

in neutrophils and macrophages by binding with high affinity to complement

receptor 3 (CR3) [103-106]. This enables bacteria to resist neutrophil-mediated

clearance since ACT-deficient B. pertussis and B. bronchiseptica are cleared faster

than wild type bacteria in mouse model studies [107, 108]. ACT can also suppress

immune responses of T cells and dendritic cells [65, 109].

1.3.3 Additional toxins and virulence factors

The Bordetella serum-resistance to killing protein (BrkA) is an autotransporter

protein which is expressed as a 103 kDa precursor. During secretion, the precursor

is processed to a 73 kDa surface associated N-terminal passenger domain and a 30

kDa outer membrane C-terminal domain [110]. It contains two RGD (Arg-Gly-Asp)

motifs and two potential binding sites for sulphated glycoconjugates. Like

pertactin, its secretion is governed by the BvgAS system and after secretion, it

remains tightly associated with the bacterial surface [71, 111]. It was shown that

BrkA mediates resistance to killing by serum and protects Bordetella against

antimicrobial peptides from the host [112, 113].


Tracheal cytotoxin (TCT) is another virulence factor identified in Bordetella. It is a

disaccharide-tetrapeptide monomer of peptidoglycan and is produced during cell

wall remodelling. TCT is the only known BvgAS-independent virulence factor which

B. pertussis secretes in large amounts into the extracellular environment [65].

The role of TCT in pertussis pathogenesis in humans is still unclear. However, in

mouse models and cell culture studies, it was shown that TCT can stimulate the

production of proinflammatory cytokines and nitric oxide resulting in the

destruction and extrusion of ciliated cells from the epithelial surface. TCT can also

cause mitochondrial bloating and disruption of tight junctions [114, 115].

Dermonecrotic toxin (DNT) is a heat-labile, single-polypeptide, 160 kDa AB toxin

with an N-terminal receptor-binding domain and a C-terminal enzymatic domain

[116, 117]. DNT is encoded by the dnt gene under positive regulation by BvgAS

[118]. Purified DNT induces localised necrotic lesions in mice when injected

intradermally and is lethal in low doses in intravenous injections [119]. DNT

induces dramatic morphological changes in osteogenic cells from a spindle shape to

a spherical form with many blebs, and can also stimulate DNA replication. DNT

also contributes to the ability of B. bronchiseptica to induce turbinate atrophy and

lung pathology in swine [120-122].

1.3.4 BvgAS and the regulation of virulence

Genetic analysis revealed that in the three classical Bordetella species, a ~5 kb

locus known as the Bordetella virulence gene (bvg) locus encodes a two-component

signal transduction system consisting of BvgA and BvgS. This system regulates the

expression of nearly all pertussis virulence genes in Bordetella [123]. The first

component, BvgA, is a 23 kDa cytoplasmic protein with a DNA-binding helix-turn-

helix domain at the C-terminus and a receiver domain at the N-terminus. The second

component, BvgS, is a 135 kDa protein, which is a polydomain periplasmic

histidine kinase sensory protein that responds to external signals [124, 125].


BvgAS governs the expression of a broad range of genes in response to

environmental changes including virulence genes, genes encoding surface and

secreted proteins, enzymes, factors required for survival outside the mammalian host,

and even the bvgAS locus itself [65].

There are two major phase variations under the regulatory control of the BvgAS

system (Figure 1.3-3). The virulent phase (Bvg+) occurs when Bvg is activated and

is important during infection in the host, while the avirulent phase (Bvg-) is important

for survival outside the host. There is also the intermediate phase (Bvg-i) that

occurs during the switch from Bvg- to Bvg+ in the host or in the presence of low

concentrations of chemical modulators such as nicotinic acid and MgSO4 or growth

at low temperatures [65, 126].

There are three major categories of genes that are regulated by the BvgAS system.

The first category is called virulence-activated genes and includes those that

are up-regulated with maximal expression only in the virulent phase (Bvg+) or in

both the Bvg+ and Bvg-i phases. Most of the bacterial virulence genes are in this

group including the ptx-ptl operon (that encodes Ptx and its transport system),

cyaA-E (which encodes ACT), the bsc operon encoding a type III secretion system

(T3SS), sphB1, prn, tcfA, fhaB and fhaC, vag8 (encoding autotransporter), brkA,

and ompQ (encoding outer membrane porin protein, OmpQ) [127]. One gene, bipA,

which encodes an unknown outer membrane protein, is classified in the second

category, in which expression is upregulated only during the intermediate phase in B.

bronchiseptica. However, in B. pertussis it was shown to be upregulated in both

Bvg+ and Bvg-i phases [127]. There is also a third group of genes which are

down-regulated in the virulent phase, known as virulence repressed genes.

Virulence repressed genes include genes that are required for flagella synthesis and

motility. These genes are maximally expressed in the Bvg- phase.
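
The three gene categories described above can be summarised as a small lookup from gene class to the phases in which that class is maximally expressed. The following is only a minimal sketch based on the description in this section; the names PHASES, GENE_CLASSES and classes_expressed are hypothetical and are not taken from any cited study or software.

```python
# Illustrative sketch only: Bvg-regulated gene categories as described in the text.
# PHASES, GENE_CLASSES and classes_expressed are hypothetical names.

PHASES = ("Bvg+", "Bvg-i", "Bvg-")

# Phases in which each category is maximally expressed, following the text.
GENE_CLASSES = {
    "virulence-activated (Bvg+ only)": {"Bvg+"},
    "virulence-activated (Bvg+ and Bvg-i)": {"Bvg+", "Bvg-i"},
    "intermediate-phase (bipA in B. bronchiseptica)": {"Bvg-i"},
    "virulence-repressed": {"Bvg-"},
}


def classes_expressed(phase):
    """Return the gene categories maximally expressed in the given Bvg phase."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    return sorted(name for name, phases in GENE_CLASSES.items() if phase in phases)


if __name__ == "__main__":
    for phase in PHASES:
        print(phase, "->", classes_expressed(phase))
```

The sketch deliberately does not assign individual genes to the two virulence-activated subgroups, since the text lists them together.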

During the active phase, phosphorylated BvgA (BvgA~P) binds to recognition

sequences located in the promoters of virulence-activated genes (vags) and leads to the maximum expression of

these genes [128]. It also upregulates the expression of the bvgR gene located

downstream of the bvgAS locus resulting in the expression of the BvgR regulator


protein. BvgR is a 32 kDa protein that represses the expression of virulence

repressed genes.

The role of Bvg- is still unclear in B. pertussis infections since some studies

showed no Bvg- phase in vivo [129, 130]. Bvg- might be expressed only in

external environments under nutrient-limiting conditions or low temperatures. In

vitro studies have shown that Bvg- is induced by decreasing the temperature below

26 °C and increasing the concentration of chemicals like nicotinic acid, MgCl2 or

sulphate ions [131]. Boulanger et al. demonstrated that in the presence of MgSO4

in in vitro cultures, BvgA~P is not detectable and the bacteria switch to the Bvg-

phase [132].


Figure 1.3-3: The BvgAS master regulatory system. (A) The structure of BvgAS and the activation

pathway that occurs by auto-phosphorylation of a conserved histidine (H) in the histidine kinase

domain. (B) Phosphorylated BvgA (BvgA~P) dimerises and activates the expression of

virulence-associated genes (which are subdivided into class 1 and class 2 genes) and represses the expression of

virulence-repressed genes (which are class 4 genes). Figure was adapted from Melvin et al. [65].

1.4 Pertussis – the disease

1.4.1 Definition and clinical symptoms

Pertussis is a human respiratory tract infection caused by B. pertussis and is

particularly severe in infants less than 6 months old. Inhalation of respiratory

droplets and aerosols, without direct contact between infected individuals and naïve hosts, is

the main mechanism of pathogen transmission [133, 134]. As few as 140 bacteria

are sufficient to cause disease in naive infants [135].

The incubation time after initial infection varies from 4 to 21 days with two ranges

of classical or atypical symptoms. Classical pertussis can be characterised by three

stages of disease including catarrhal, paroxysmal and convalescent. The catarrhal

stage lasts 7-14 days after the initial incubation period with nonspecific

symptoms common to a broad range of upper respiratory tract infections.


Rhinorrhoea, malaise, mild cough and possibly low-grade fever may be present

[136]. In adults with partial immunity from past infections or vaccinations, this

stage may be mild or asymptomatic which can lead to pathogen transmission to

naïve infants and cause severe pertussis.

The paroxysmal stage usually occurs in the second week of infection and may last

for 2-8 weeks. The severity of cough increases, and the decline in lung

volumes can lead to the inspiratory “whoop”, especially in infants and children with

smaller tracheas. Although most countries follow the clinical case definitions

documented by WHO to record the infection, at least one additional symptom is

required by some countries to record the infection as pertussis [137]. Other

symptoms, including eye bulging, vomiting and syncope, have been reported in young

infants during the paroxysmal stage. The final convalescent stage of pertussis

coincides with a decrease in the severity and frequency of coughing but may take a

long period to resolve or it may result in secondary respiratory infections [137,

138].

The clinical presentation of pertussis may vary in patients and can be affected by

parameters such as age, previous infection or vaccination, time between previous

immunisation caused by natural infection or vaccination and the current infection

and co-infection [139]. Pertussis symptoms in adults and adolescents may be

limited to mild cough, post-tussive emesis and exhaustion [136].

Aside from typical symptoms, pertussis can have mild to severe atypical symptoms.

Atypical pertussis (absence of whoop) is usually characterised by a milder cough of

shorter duration. It is usually reported in adults and adolescents in countries with high

vaccine coverage and can be misdiagnosed as other respiratory infections unless it

is distinguished by molecular or serology laboratory tests [140]. In infants and

children, severe complications including seizures, encephalopathy, cerebral hypoxia

leading to brain damage, pulmonary hypertension and rectal prolapse have been

documented, while in adults, complications include hearing loss, inguinal hernia,

urinary incontinence, pneumonia, pneumothorax and carotid artery dissection

[139].


1.4.2 Diagnosis of pertussis

Diagnosis of pertussis by the WHO case definition is based on two weeks or more

of paroxysmal cough as the clinical symptom, together with one or more positive laboratory

tests: culture positive for B. pertussis or PCR (direct tests) and/or an increased titre

of IgG or IgA to Ptx, FHA or Fim2/3 (indirect test). However, in other countries the

criteria to define an infection as pertussis vary in both clinical and laboratory

diagnosis [141]. In Australia, pertussis is diagnosed based on 2 weeks of

continuous cough together with one additional symptom such as

paroxysm, inspiratory whoop or post-tussive vomiting [142].
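
The two case definitions summarised above can be written as simple decision rules. The sketch below is only an illustration of the logic as paraphrased in this section, not a reproduction of the official WHO or Australian surveillance definitions; the function names and parameters are hypothetical.

```python
# Illustrative sketch only: the case definitions as paraphrased in this section.
# Function and parameter names are hypothetical; official definitions contain
# further detail that is not modelled here.


def meets_who_definition(weeks_paroxysmal_cough, culture_positive=False,
                         pcr_positive=False, serology_positive=False):
    """Two or more weeks of paroxysmal cough plus at least one positive laboratory test."""
    has_lab_evidence = culture_positive or pcr_positive or serology_positive
    return weeks_paroxysmal_cough >= 2 and has_lab_evidence


def meets_australian_definition(weeks_cough, paroxysm=False,
                                inspiratory_whoop=False,
                                post_tussive_vomiting=False):
    """Two weeks of continuous cough plus at least one additional listed symptom."""
    has_extra_symptom = paroxysm or inspiratory_whoop or post_tussive_vomiting
    return weeks_cough >= 2 and has_extra_symptom


if __name__ == "__main__":
    # A three-week paroxysmal cough with a positive PCR result.
    print(meets_who_definition(3, pcr_positive=True))
    # A two-week cough with no additional symptom.
    print(meets_australian_definition(2))
```

Under these rules, a patient with a long cough but no laboratory confirmation would not satisfy the WHO rule, although the same patient could satisfy the Australian rule if an additional symptom were present.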

1.4.2.1 Culture

Identification of B. pertussis by culture is a routine method for infection diagnosis

in many countries and reference laboratories. Despite being highly specific, the

sensitivity of culture depends on the time and site from which the sample was

collected. For reliable diagnosis, many standard procedures recommend that samples

be taken within the first two weeks from the onset of coughing and collected from

the surface of ciliated epithelial cells of the upper respiratory tract, in particular

the nasopharynx [141]. After this time, due to antibiotic therapy or immunity from the

disease or immunisation, the sensitivity of culture falls, and 3 weeks after the

onset of coughing, the sensitivity is less than 1-3% [142]. Other factors like the type

and method of specimen collection and the use of cotton or rayon swab can also

affect sensitivity [142]. Both Bordet–Gengou (BG) agar and Regan–Lowe (RL) agar

(the latter also known as charcoal agar), containing sheep or horse blood, are widely used for

culturing Bordetellae.

Since culturing is a time-consuming and labour intensive process, many diagnostic

laboratories have replaced it with new molecular methods. However, culture is still

useful for bacterial typing and phenotypic characterisations such as antibiotic

sensitivity.


1.4.2.2 Serology

Serology is usually useful for later stages of infection in older children and adults

especially in the presence of clinical symptoms and negative culture or PCR result.

Enzyme-linked immunosorbent assay (ELISA) is used to measure the level of

specific IgG/IgA against pathogen antigens in the serum of patients. While

pertussis toxin antibodies are specific for B. pertussis, other antibodies against

antigens like FHA, Prn and Fim2/3 can be measured as well. In routine diagnosis, it

was recommended to only check the level of anti-Ptx antibodies [143].

Since serology cannot distinguish whether the antibodies measured in sera are the

result of infection or immunisation, it is suggested that two samples be taken

3 to 4 weeks apart to demonstrate an increase in antibody levels.

However, a single high level of IgG to Ptx can also be used as a positive result for

acute pertussis infection [142].

1.4.2.3 Polymerase Chain Reaction (PCR)

Development of molecular techniques has allowed reference laboratories to use

PCR-based methods for diagnostic investigations [142]. In many developed

countries, routine culture methods have been replaced by PCR-based diagnostic

methods. However, a worldwide standard protocol has not been produced and

different protocols are used by different laboratories. As a result, PCR sensitivity

and specificity may vary between different reference laboratories. PCR is 2 to 3

fold more sensitive than culture in patients with 3 weeks or more of coughing and

antibiotic therapy [142]. A study in 2013 showed that different PCR methods

including real-time PCR, are used in 24 national reference laboratories in European

countries and in the identification survey, all were able to identify B. pertussis but

there were misidentifications of other Bordetella species like B. parapertussis and

B. holmesii [144].

PCR-based diagnostic methods have been developed since 1990 using primers that

target repetitive DNA sequences of B. pertussis [145]. It was shown that IS481

primers have high sensitivity for B. pertussis. However, in recent years it has been

determined that IS481 is also present in the B. holmesii genome, which may lead


to misidentification [141]. The presence of IS481 in the genome of B. holmesii has

prompted a primer redesign to differentiate between the two Bordetella species.

Currently, a combination of primers targeting IS481, IS1001, the IS1001-like element (hIS1001) specific

for B. holmesii, and the ptx promoter can be used for the detection of different

Bordetella species. However, this method depends on the circulating species in the

area of study [144, 146]. In 2012, in the United States, the CDC designed a

multiplex real-time PCR (RT-PCR) assay including primers that targeted IS481, pIS1001 for

detecting B. parapertussis and hIS1001 for B. holmesii. They also included a single

RT-PCR that targeted ptxA for B. pertussis confirmation since it is not expressed by

B. holmesii. The results were published in 2015 and showed better harmonisation

between laboratories as well as increased sensitivity (96%) and specificity (95%) of the

PCR method for pertussis diagnostics [147].
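
The way the four targets described above combine into a species call can be illustrated with a short decision function. This is a simplified sketch based only on the target specificities stated in this section (IS481 shared by B. pertussis and B. holmesii, pIS1001 for B. parapertussis, hIS1001 for B. holmesii, ptxA for B. pertussis confirmation). It is not the published assay algorithm, which also relies on cycle-threshold cut-offs that are not modelled here; the function name and return strings are hypothetical.

```python
# Illustrative sketch only: combining PCR target results into a species call,
# using the target specificities stated in the text. Not the published CDC
# interpretation algorithm; cycle-threshold cut-offs are not modelled.


def interpret_targets(is481, pis1001, his1001, ptxa):
    """Return a list of presumptive calls from four boolean target results."""
    calls = []
    if is481 and ptxa:
        # IS481 alone is ambiguous; ptxA is used here to confirm B. pertussis.
        calls.append("B. pertussis")
    if pis1001:
        calls.append("B. parapertussis")
    if his1001:
        calls.append("B. holmesii")
    if calls:
        return calls
    if is481 or ptxa:
        # A single positive target that cannot be assigned to a species.
        return ["indeterminate"]
    return ["no target detected"]


if __name__ == "__main__":
    print(interpret_targets(is481=True, pis1001=False, his1001=False, ptxa=True))
    print(interpret_targets(is481=True, pis1001=False, his1001=True, ptxa=False))
```

Returning a list rather than a single value allows co-detections, for example B. pertussis together with B. parapertussis, to be reported.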

1.4.3 Treatment

Antibiotic therapy is a recommended treatment for pertussis and is effective in

eliminating the pathogen especially in the early stages of infection. It was shown

that antibiotic administration during the catarrhal phase may decrease the severity

of symptoms and duration of the infection and promote bacterial clearance from the

upper respiratory tract [148]. However, in patients with more than 3 weeks of

symptoms, coughing will continue even after bacterial clearance. Therefore, it is

suggested that antibiotics are administered during the first 3 weeks of infection. The

recommended antibiotics are macrolides: azithromycin (for 5 days), erythromycin

(for 14 days) or clarithromycin (for 7 days) to promote bacterial clearance and

prevent secondary infections. However, they do not reduce the clinical symptoms

of the disease after week 3 of infection even with bacterial clearance from the

nasopharynx. It was also shown that symptomatic treatments were not effective for

reducing coughs during the infection [149].

1.5 Pertussis vaccines

Whooping cough is an important example of a highly contagious respiratory

disease that can be controlled by effective worldwide immunisation programs.

Isolation of the causative agent of pertussis by Bordet and Gengou led to the


manufacture of different types of pertussis vaccines to immunise the population,

particularly for infants. Vaccinations caused a dramatic reduction in pertussis

notifications around the globe. There are two available types of vaccines, the whole

cell vaccine (WCV) and the acellular vaccine (ACV).

1.5.1 Whole cell vaccines

The first pertussis vaccine was introduced in the 1930s and was composed of

suspended B. pertussis cells (10 × 10⁹/mL) in phenolysed saline as a whole cell

vaccine (WCV). It was usually produced using formalin inactivated whole cells of

B. pertussis in many countries and was manufactured in combination with other

vaccines like diphtheria and tetanus toxoids as DTP. However, the efficacies and

immune responses to various B. pertussis strains differed from one country to

another [150]. Studies with whole cell vaccines in the United States during the 1940s

and 1990s showed that the efficacies of different WCVs ranged from 53% to 91%

based on the vaccine combination and the method of production [150, 151].

Mass vaccination with WCV caused a dramatic reduction in the morbidity and

mortality of pertussis. Once successful immunisation had brought pertussis

rates under control globally, concerns turned to side effects, since vaccination with WCV in

infants and adults occasionally resulted in reactions ranging from high fever to

local and systemic reactions and even possible brain damage or death [152, 153].

These side effects caused many to refuse the WCV and resulted in an increase in

pertussis cases. This led to the development of a new type of vaccine that was less

reactogenic and only included some of the major antigens of B. pertussis.

Many countries still use WCV to immunise the population against whooping

cough. However, in many developed countries, the WCV has been substituted with

the ACV [141, 154, 155].

1.5.2 Acellular vaccines

Mass immunisation with ACV was first implemented in Japan in 1981 with a

two-component vaccine containing Ptx and FHA [156]. A number of acellular vaccines

ranging from one component to five components have been produced and used in


different countries for immunisation. The monocomponent ACV contains only one

antigen, Ptx, and is currently used in Denmark and some parts of Sweden [157],

while the three-component ACV, which includes Ptx, Prn and FHA, is used in

many countries worldwide and is the most common ACV used in combination with other

vaccines as a multivalent vaccine. There is also a five-component vaccine that

includes the same antigens as the three-component ACV plus Fim2 and

Fim3. The latter is prescribed mostly in European countries, Canada and the United

States [158, 159].

Ptx is a major virulence factor in B. pertussis and is present in all ACVs as the core

component. However, the amount of toxin in ACVs varies from 40 µg in the

monocomponent ACV to 2.5 µg in the 5-component ACV [160].

FHA, Prn and Fim2/3 were added to the multicomponent ACV due to the crucial

role of adhesin factors in the pathogenesis of B. pertussis and the immunogenicity

of these adhesin factors. Studies have shown that FHA or Prn-specific antibody

secreting cells in the respiratory tract of mice were detected during immunisation

with these antigens and this suggested that increased local antibody responses by

these cells during infection helped promote the pulmonary clearance of B. pertussis

[161, 162]. It was also revealed that the immune response against Prn in mice was

correlated with anti-Prn antibodies and facilitated bacterial clearance [163]. Data

from clinical trials showed that ACVs containing Prn were more effective than the

one or two component (Ptx and FHA) ACV [164].

The addition of Fim2 and Fim3 to ACVs also showed that they were more effective

than other types of vaccines and can be used for general immunisation or as a

booster [164, 165].

At present, ACVs in different formulations are available and used in various

countries. Most of the ACVs contain PtxA2, Prn1 and FhaB1, including vaccines

manufactured by GlaxoSmithKline [166]. Generally, B. pertussis strains Tohama I

(fim2-1, fim3-1, prn1, ptxA2) and 10536 (fim2-1, fim3-1, prn7, ptxA4) are used

for ACV production, with the exception of some Latin American vaccine strains


with fim2-2. In the Netherlands the strain used for vaccine harbours fim2-2 and

fim3-1 [167, 168].

1.5.3 Efficacy and protection from pertussis infection by acellular vaccine

After the development and introduction of ACV, many countries replaced WCV

with ACV based on studies which showed that the ACV was as protective as the

WCV but produced less reactogenic responses in infants and children.

Various studies and clinical trials have demonstrated the safety and

immunogenicity of one component to multi-component ACVs in infants, children,

adolescents and adults [157, 164, 165, 169-174]. However, some studies showed

that the ACV was less effective in comparison to WCV [175, 176]. Furthermore, it

was shown that ACV-induced immunity waned faster and failed to adequately

protect against pertussis, as reflected in the increase in pertussis outbreaks in recent years

[176, 177]. A study carried out by Warfel et al. in 2014 demonstrated that in the

non-human primate baboon model, the ACV was not able to prevent infection and

pathogen transmission although it could protect against severe pertussis [178]. It was

also observed that B. pertussis was still able to colonise immunised baboons but

was cleared faster than from unimmunised baboons. ACV-vaccinated

baboons were still able to transmit B. pertussis to naïve

baboons.

Several factors may affect the efficacy of ACV, of which the most important is the

number of antigens in the ACVs [176, 179]. Zhang et al. in 2012 reviewed six

recent double-blind efficacy trials to study the efficacy and safety of different

types of ACVs in children and found that the efficacy of one- and two-component

ACVs was lower (59% to 75%, respectively) than that of the three or more component

vaccines (84 to 85%) in preventing typical whooping cough in children [180].
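
For readers unfamiliar with how efficacy percentages of this kind are obtained, the sketch below shows the standard attack-rate calculation, in which efficacy is the relative reduction in the attack rate among vaccinated compared with unvaccinated subjects. This is an assumption about the general approach; the trials reviewed in [180] may have used other estimators or case definitions. The function name and the numbers in the example are hypothetical and are not data from any cited trial.

```python
# Illustrative sketch only: standard attack-rate formula for vaccine efficacy,
# VE = (1 - ARV / ARU) * 100, where ARV and ARU are the attack rates in the
# vaccinated and unvaccinated groups. The example numbers are hypothetical.


def vaccine_efficacy(cases_vaccinated, n_vaccinated,
                     cases_unvaccinated, n_unvaccinated):
    """Return vaccine efficacy as a percentage."""
    if n_vaccinated <= 0 or n_unvaccinated <= 0:
        raise ValueError("group sizes must be positive")
    attack_rate_vaccinated = cases_vaccinated / n_vaccinated
    attack_rate_unvaccinated = cases_unvaccinated / n_unvaccinated
    if attack_rate_unvaccinated == 0:
        raise ValueError("efficacy is undefined without cases in the unvaccinated group")
    return 100.0 * (1.0 - attack_rate_vaccinated / attack_rate_unvaccinated)


if __name__ == "__main__":
    # Hypothetical trial: 15 cases among 1000 vaccinated, 100 among 1000 unvaccinated.
    print(round(vaccine_efficacy(15, 1000, 100, 1000), 1))
```

Because the result depends on what counts as a case, the same trial can yield different efficacy values under different case definitions.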

The concentration and balance of antigens in the formulation of ACV is another

important factor which may affect its efficacy. Two studies found that antibodies to

Prn and Fim are more important for efficacy than antibodies against FHA [56, 58].

In fact, it was claimed that there is no significant basis for including FHA in the


ACV, since both types of vaccine with and without FHA have the same protective

activity [181]. Clinical trials carried out on one or two component ACVs showed

that they are as effective as multicomponent ACVs and can also provide enough

protection to prevent pertussis due to the higher concentrations of antigens in the

vaccine formulation [157, 182, 183].

The method of detoxification for Ptx and the production of ACV can also affect

vaccine efficacy. Chemical detoxification of Ptx by formaldehyde or glutaraldehyde

during production can remove up to 80% of surface epitopes and reduce the

immunogenicity of Ptx [177]. It was shown that genetic detoxification by

substituting 2 amino acids in the S1 subunit could increase the potential

immunogenicity of Ptx [184] and enhanced bacterial clearance in mice [185].

The effect of an adjuvant on the vaccine efficacy was demonstrated with a study

that showed aluminium phosphate produced greater disruptive effects than

aluminium hydroxide on the antigens used in the ACV particularly Prn and this was

correlated with less efficacy in mice [186]. Animal model studies have also

suggested that the replacement of alum with alternative adjuvants like toll-like

receptor agonists can increase the efficacy of ACV by promoting greater cell

mediated immune responses [187, 188].

Vaccine priming can also influence protection. Recent data indicated that the

decline in immunity in adults who had ACV primary and booster vaccinations

during childhood was faster than in adults who were primed or fully immunised with

the WCV [189-191]. It was concluded that the first dose, the order of vaccination

and the vaccine type appear to be important for full protection against the disease.

Finally, other factors that can influence the results of clinical trials are the

study method and the case definition of pertussis [139,

176, 177]. Cherry [176] argued that vaccine efficacy was inflated by “observer

bias” even within completely blinded trials since the case definition and infection

symptoms in the vaccinated and control groups were different.


1.5.4 Immunity responses to infection and vaccines

Infection and immunisation with WCV or ACV have been found to induce different

humoral and cell mediated immune responses in the host against whole B. pertussis

or its antigens [192].

Innate immunity plays an important role against pathogens during infection in

mouse model studies. In the presence of antigens, T-helper 1 (Th1) cells mediate the

phagocytosis of bacteria in the body. Infiltration of macrophages, dendritic cells,

neutrophils and natural killer cells in the mouse lungs has been observed during

the first 2-3 weeks of infection, which suggests their limited role in bacterial

clearance from the respiratory tract [193, 194]. It was demonstrated in mouse

studies that the secretion of interferon-γ (IFN-γ) by CD4 T cells (Th1) and

interleukin 17 (IL-17) by CD4 T cells (Th17) promotes the killing activity of

macrophages and neutrophils which leads to clearance of intracellular bacteria

[187, 195, 196]. It was also demonstrated that immunisation with WCV in mice can

increase the phagocytosis of macrophages due to the production of nitric oxide

[197].

While during infection with B. pertussis or immunisation with WCV only Th1 cells

are activated, immunisation with ACV can activate either Th2 or a mixed Th1/Th2

response in the body. Despite the fact that Th2 cells mediate antibody production

by B cells, in a recent mouse model study, Th2 cells do not induce protection

against B. pertussis infection after immunisation with ACV [196]. Both WCV and

ACV are able to induce Th17. However, in mouse studies, IFN-γ-secreting Th1

cells are the key cells for protection with WCV while Th17 cells were more

important for protection with ACV [196]. It was demonstrated that IL-12 is

produced by macrophages after natural infection with B. pertussis or immunisation

with WCV and increased the efficacy of the vaccine. However, this phenomenon

was not observed after ACV immunisation. This suggests that IL-12 could be added to

the ACV as an adjuvant to enhance the ACV efficacy [198].

In B-cell-mediated immunity, specific IgA, IgG and IgM antibodies to B. pertussis

are produced after natural infection or immunisation. These antibodies prevent


bacterial adhesion to the epithelial cells and mediate bacterial clearance. While

WCV induces the IgG2a/IgG2c subclasses, which are important in opsonisation and

complement fixation, ACV was unable to induce them and only promoted the

production of IgG1 and IgM [187, 199]. Many studies showed that ACV failed to

induce IgA, which might decrease the vaccine efficiency during the early stages of

infection [187]. Hendrikx et al. in 2011 compared the levels of IgA for different

antigens and revealed that anti-Prn and anti-FHA IgA were increased in children

after infection or vaccination with ACV or WCV. However, the increase in

anti-Ptx IgA was significantly higher in infected children than in WCV-vaccinated

children, and anti-Ptx IgA was not increased after ACV immunisation [200]. IgA

was detected in mucosal tissues and promoted phagocytosis of bacteria by human

polymorphonuclear leukocytes in the early stages of infection and helped control

the upper respiratory pathogen invasion [201, 202].

1.5.5 Vaccination schedules

In Australia, vaccination against pertussis started in 1953 with the whole cell

pertussis vaccine used in combination with diphtheria and tetanus toxoids for

immunisation [203]. Since then, the vaccination schedule has been changed

multiple times (Figure 1.5-1) [204].

Priming immunisation consisted of three doses of the trivalent vaccine, DTPw,

given at 3, 4 and 5 months, with a booster at 15-18 months, from 1953 for decades. The

booster dose at 15-18 months was removed in 1978. In 1982, the schedule for

primary immunisation was changed to doses at 2, 4 and 6 months of age.

Increased pertussis cases led to the re-introduction of the 18 month vaccination

booster in 1985 and another booster was added to the schedule at 4 years of age

from 1994 [203, 204].


Figure 1.5-1: Multiple changes in the pertussis vaccination schedule in Australia.

(Adapted from Campbell et al. [204].)

Due to the high rate of side effects, the gradual replacement of the WCV with ACV

started in 1997, and from 2000 the acellular pertussis vaccine, mainly the three-component

vaccine, was the only vaccine used for both primary and booster immunisation.

The changes in booster vaccination time continued recently with the replacement of

the 15-18 month booster injection with a booster at 15-17 years of age [204]. The

current priming vaccination schedule for pertussis according to the National

Immunisation Program in Australia includes three doses of ACV at 2, 4 and 6

months of age with boosters at 4-5 and 12-17 years of age [205]. The 18-month

booster was reintroduced in Australia in 2015, based on the 10th edition of the Australian

Immunisation Handbook as updated in June 2015.

Similar primary immunisation schedules to Australia have been used in many

countries. However, in the case of booster immunisation, the recommended time

and number of vaccinations vary considerably [206]. For instance, in the United

States and Brazil, primary vaccinations are the same as in Australia, but booster

immunisations are administered at 15-18 months, 4 and 12 years in the United

States while Brazil does not provide booster vaccines [142, 207].

1.6 Epidemiology of pertussis

1.6.1 Pertussis around the globe

Before the introduction of vaccines, pertussis was one of the most important

respiratory infectious diseases in infants worldwide that frequently caused death. It

is still life threatening in unimmunised infants. In the United States before the

immunisation program, pertussis caused 1 death per every 10 cases and in 1934

peaked at over 265,000 cases. This dropped below 100,000 in 1948 after

immunisation with WCV was introduced [208]. Although pertussis cases decreased

in many countries following the development and introduction of WCV, based on

the WHO reports, pertussis is still a public health concern in both developing

countries and those with high vaccine coverage. Based on the WHO reports in

2008, there were 16 million cases of pertussis globally, of which 95% were in

developing countries and about 195,000 children died from the disease [209]. However, the

true number of cases may be higher due to the lack of records in some developing

countries and differences in the clinical definition of the disease. In most countries,

epidemics occur at intervals of 3-5 years, possibly due to an increased number

of susceptible individuals in the population. Pertussis rates are higher in infants and

children than in adults and adolescents [210].

After successful immunisation programs against pertussis led to a dramatic decline

in the morbidity and mortality of pertussis cases, concerns have been raised in

countries with high vaccination coverage due to the increased incidence of pertussis

cases since the 1990s. Increased pertussis notification rates, especially in adults and

adolescents, have been well documented and reported in many countries including

the United States, Australia, Canada and many European countries [205, 211-214].

The 2010 outbreak in California, USA, involved 9477 patients, mostly children

under the age of 3 months, and 2520 cases were reported in the pertussis epidemic

in the American state of Washington during 2011-2012, mostly in children

under 1 year old [215, 216]. Reports from the European Centre for Disease

Prevention and Control (ECDC) and from the Surveillance Network for vaccine

preventable diseases (EUVAC-NET) showed high pertussis incidence rates in

European countries, with 43,482 cases during 2003-2007 and 20,591 cases in 2009,

with the highest incidence rates in Norway, Estonia, the Netherlands and Poland

[216, 217]. The same trend of increase in pertussis notification and hospitalisation

was reported in Latin American, African and Asian countries [218]. In Australia, the

notification rate increased from 47 in 2006 to 107 in 2012, with the highest rate

of 173 cases per 100,000 population in 2011 [219].

Interestingly, since the pertussis resurgence, it was revealed that the incidence rate

has shifted from infants to older children and adults [220, 221]. The re-emergence

of pertussis highlighted the significance of pertussis infections in adults and the risk

of transmission to naive infants. Furthermore, it also highlighted the inability of the

ACV to prevent infection and transmission in the population [136, 178].

1.6.2 Pertussis in Australia

Epidemic cycles of pertussis have occurred every 3-5 years in Australia. However,

after the introduction of WCV in Australia during the 1950s, pertussis cases have

decreased and the infection was well controlled for many years, with low

notification rates between 1949 and 1979 [222]. Despite high vaccination coverage,

Australia still experiences epidemics every 3-5 years, with the first recorded

in 1993, followed by 1997, 2004 and the latest in 2008 (Figure 1.6-1) [219,

222-224]. Quinn et al. reported that the highest notifications from 1995-2005 were in

infants under 6 months. However, there was also a steady increase in the number of

adult pertussis cases. This increase raised concerns that adults may play an

important role in the transmission of the infection [223]. It was also demonstrated that

parents and siblings with cough might be the main source of infection to infants

[225].

The last pertussis epidemic in Australia occurred from 2008 to 2012 in all states and

territories. Despite the fact that vaccination coverage with ACV was high in

Australia with over 95% coverage in infants with 3 doses of vaccination at 24

months, 173 pertussis cases per 100,000 were reported in 2011 compared to 23 cases

per 100,000 in 2007 [219]. There was also a 2.8 times higher notification rate in the

2008-2012 epidemic compared to the last epidemic [219]. Although the most severe

disease was still reported in infants less than 6 months, increased notification rates

were documented in children from 0 months to 9 years old, with the highest in

children aged 5-9 years [219]. This shift might be due to waning immunity as a

result of changes in vaccination strategies. In 2003, the vaccination at 18 months

of age was removed from the National Immunisation Program [226].

However, the updated 10th edition of the Australian Immunisation Handbook

in 2015 recommends that children receive the 18-month booster

due to waning pertussis immunity following receipt of the primary schedule.

Figure 1.6-1: Incidence rate of pertussis from 1995-2012 in Australia. Notification rates were

separated according to age group. The graph was obtained from Pillsbury et al. [219].

1.6.3 Multiple causes of pertussis re-emergence

Pertussis resurgence brought attention to the factors that might contribute to the

rapid global increase in the pertussis incidence rate. Many believe that a

combination of multiple factors is responsible for the re-emergence of this disease.

The most important factors can be grouped into three categories including pertussis

definition and diagnostic factors, immunity and vaccine related factors, and

pathogen related factors.

1.6.3.1 Changes to reporting practices due to awareness and diagnostics

Case definition plays an important role in epidemiological studies and vaccine

efficacy trials. Pertussis case definition differs between countries and areas and is

supplemented with laboratory and epidemiological data [137]. Variations in case

definition and laboratory methods can influence the notification rates and results

reported by different countries. Previously, pertussis was recognised as a disease in

children and was diagnosed based on clinical symptoms and bacterial cultures.

However, this is no longer an accurate definition for the infection after increases in

pertussis cases were observed in adults and adolescents with mild symptoms and the

development of new laboratory diagnostic methods [227]. Based on the WHO case

definition that is accepted by many countries including Australia, a new universal

case definition was required due to variations in serum serology results, symptoms

between different age groups, endemic or outbreak situations and misdiagnosis in

different countries [137].

The development of new diagnostic methods including ELISA for serological

diagnosis and molecular methods such as PCR assists in the fast and reliable

diagnosis of pertussis. B. pertussis DNA detection by PCR has been shown to be a

sensitive and accurate laboratory method and is widely used in many diagnostic

laboratories. This led to a dramatic increase in the awareness of pertussis [228,

229]. Over-reliance on PCR results as the sole laboratory method to confirm

pertussis without checking the clinical history of patients may lead to an increase in

false positive cases [230] and therefore overestimate the pertussis resurgence.

Increased awareness of the disease by clinicians and faster molecular methods can

promote increased detection of the pathogen in other age groups not previously

identified. Furthermore, the sole use of PCR without bacterial culture could cause

misdiagnosis. Both of these factors may affect epidemiological data analysis and

increase the number of pertussis notifications. It was suggested that these factors

magnified the pertussis outbreak reports in Canada [231]. Moreover, a study also

showed that in a group of 6-14 year old children with 2 weeks of persistent cough,

the sensitivity and specificity of the WHO case definition can be varied by

increasing the number of clinical findings or changing the laboratory diagnostic

methods [232].

1.6.3.2 Vaccine efficacy and waning immunity in the immunised population

Pertussis resurgence and the increase in the number of adult cases have heightened

concerns that the ACV may not be as effective as previously shown in clinical

trials, or that immunity wanes after vaccination. After the

development of the ACV, numerous vaccine trials showed that the ACV was as

effective as the WCV in protecting the host against pertussis. This led to the

introduction of new vaccination strategies using ACV in many developed countries

[164, 165, 184]. Zhang et al. demonstrated that in a Cochrane systematic review of 6

randomised controlled trials, multi-component (3 or more) ACVs are 85% effective in

preventing typical whooping cough and 71-78% effective in preventing mild

pertussis disease in children [180]. Although clinical trials mostly studied the

vaccine efficacy in children, Ward et al. showed that ACV was also protective

against symptomatic pertussis infection in adults and adolescents [233]. However,

recent studies in the baboon non-human primate model revealed that ACV can

prevent serious infection in the host but was unable to prevent pathogen transmission

in the population [178].

Another concern about ACV is the duration of immunity primed by vaccination

particularly in adults and adolescents who can be a source of pathogen transmission

in the population. Although it is difficult to measure the duration of immunity

against pertussis, waning ACV protection has been reported in the population.

Wendelboe et al. analysed published data of clinical trials and demonstrated that

while immunity acquired after natural infection can last between 4-20 years,

immunity primed by vaccination wanes after 4-12 years [234]. Recent studies

showed that ACV-induced protection was less than previously estimated and that the risk of

infection can increase by 42% per year after the fifth dose of ACV [190]. Sheridan et

al. showed that during the 2009-2010 pertussis epidemic in

Queensland, Australia, ACV-induced immunity was protective in the first year of priming;

however, this protection waned the following year [235]. There has also been some

evidence which showed longer immunity induced by WCVs compared to ACVs

[236, 237]. Therefore, it can be suggested that switching from WCVs to ACVs

might be one of the factors responsible for the pertussis resurgence in highly

vaccinated populations [189, 238, 239] and furthermore, new strategies should be

considered to improve protection against pertussis [240-242].
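
To give a feel for what a 42% yearly increase in risk [190] means over time, the short Python sketch below compounds that figure year on year. It assumes the increase is constant and multiplicative, which is our simplification and not a result taken from the cited study; the function name and constant are illustrative only.

```python
# Illustrative arithmetic only: compounds the reported 42% yearly increase
# in risk after the fifth ACV dose [190]. The constant, multiplicative
# increase is an assumption made here for illustration.
YEARLY_INCREASE = 0.42


def relative_risk_after(years: int, yearly_increase: float = YEARLY_INCREASE) -> float:
    """Risk relative to the first year after the fifth dose (baseline = 1.0)."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return (1.0 + yearly_increase) ** years


if __name__ == "__main__":
    for years in range(6):
        print(f"{years} years after baseline: {relative_risk_after(years):.2f}x")
```

Under this assumption the relative risk roughly doubles after two years, which is consistent with the short-lived protection described above.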

Different factors may be responsible for waning protection after vaccination. The

most important factors are the large gap between priming and booster vaccination,

the quality of vaccines that can be affected by adjuvants, and the number and type

of antigens used for immunisation [167]. These factors need to be considered for

future designs of more efficacious vaccines which can induce longer protection or

changing vaccination strategies with existing vaccines to maintain the level of

immunity in the population and decrease pertussis transmission to infants. The

development of either vaccines with new antigens or new strategies with current

vaccines to prime immunity in parents and pregnant women are recommended for

preventing infection in infants and children [154, 243, 244].

1.6.3.3 Bacterial adaptation to vaccine induced selection pressure

There is evidence that under the selection pressure of either WCV or ACV,

genomic adaptation occurred in B. pertussis. It has been highlighted that bacterial

evolution is one major cause of pertussis resurgence. Bacterial adaptation is

discussed in detail below.

1.7 Molecular epidemiology and evolution of B. pertussis

Different mechanisms are used by microorganisms to evolve and adapt in response

to environmental changes. Gene acquisition by horizontal gene transfer (HGT),

gene duplication by amplification, genome decay, gene deletion or inactivation and

point mutations are the major mechanisms for large scale genome alterations and

emergence of bacterial pathogens [245-247]. In this section, the genomic diversity

and adaptation of B. pertussis by different mechanisms including antigenic shift,

single nucleotide polymorphisms, insertion elements and gene loss will be

discussed.

1.7.1 The genomic content of B. pertussis

One strain from each of the three closely related Bordetella species was

completely sequenced in 2003 [42]. The complete genome of B. pertussis strain

Tohama I was sequenced and compared with B. parapertussis strain 12822 and B.

bronchiseptica strain RB50 [42]. The B. pertussis Tohama I genome harbours 3816

genes, three rRNA operons and 51 tRNAs. The genome of B. pertussis contains a

high number of pseudogenes (9.4%) and insertion sequence (IS) elements. For

instance, there are 261 IS elements in B. pertussis of which 238 are IS481 [42]. The

results of the comparative genome analysis revealed that these two human restricted

species, B. pertussis and B. parapertussis, diverged from a B. bronchiseptica

ancestor and suggested that evolution and adaptation to the single host was the

result of genome inactivation and reduction [42]. B. pertussis Tohama I, isolated in

Japan in the 1950s, was the first pertussis strain to be completely sequenced. However,

this strain was shown to be not representative of current B. pertussis strains [248].

Therefore, other B. pertussis strains including B. pertussis strain CS and B.

pertussis strain 18323 were also sequenced [249]. B. pertussis strain CS is widely

used as a vaccine strain for production of ACV in China and was isolated from an

infant in 1954 [250], while B. pertussis strain 18323 was isolated in 1946 in the US

and is used for the mouse potency test [3]. Genome comparison of these three strains

revealed that there are two large segments in CS and 18323 which were also found

in B. parapertussis and B. bronchiseptica but not in Tohama I [249, 250]. All three

strains sequenced were isolated before or around the 1950s. A number of complete

genomes from currently circulating strains are now available [251-253].

1.7.2 Adaptation and evolution of B. pertussis

Studies around the globe, particularly in countries where resurgence was reported,

showed that B. pertussis has undergone genomic divergence. Variations in genes

encoding antigens used for ACV, single nucleotide polymorphisms, genome

reduction and mobilisation of the IS elements in genes were found as the main

mechanisms driving the evolution of the B. pertussis population [254].

1.7.2.1 Variation in the virulence associated genes

One of the major mechanisms in B. pertussis evolution is genetic polymorphisms of

genes associated with virulence and genes encoding proteins that induce an immune

response in the host. To investigate the effect of vaccination on bacterial adaptation,

allelic variation in genes encoding ACV components, Ptx, Prn, Fim2 and Fim3,

has been examined in many studies [255-257].

Polymorphisms within Ptx were found mostly in ptxA, which encodes the major

toxic protein of Ptx, and were caused by point mutations. Of the eight ptxA

alleles identified, five (ptxA1, ptxA2, ptxA4, ptxA5 and ptxA8) encode

polymorphic proteins and the other three, ptxA3, ptxA6 and ptxA7, produce the

same protein as ptxA1 [254]. Strains that were used for WCV and ACV production

mostly carry ptxA2 or ptxA4 [166, 258].

During the pre-vaccination era, the most predominant alleles in most countries were

ptxA2 or ptxA4. However, since vaccination programs started, they were replaced

by the non-vaccine type allele, ptxA1 [230]. In Finland and the Netherlands, strains

harbouring the ptxA1 allele were first detected a decade after the introduction of WCV

and were then subsequently found in other countries such as China, Australia,

Europe and the US [166, 259-262]. The cause of allelic divergence was thought to

be partly due to vaccine driven pathogen adaptation. A study in European countries

showed that all ACV and WCV vaccine strains were ptxA2 strains while all clinical

strains isolated between 1998 and 2012 carried the ptxA1 allele [159]. In the United

Kingdom, three strains were used for WCV production of which two carried ptxA1

and the other was ptxA2. However, the strains currently circulating also harbour

ptxA1 [255, 263].

Variations in other Ptx subunits have not been reported widely. Two ptxC alleles,

ptxC1 and ptxC2 differ by a silent mutation and each one is predominant in

different countries. A study revealed that in European countries, ptxC1 was

predominant in Finland and Sweden while ptxC2 was predominant in Germany, the

Netherlands and France. Divergence of ptxC was not thought to be related to

vaccine pressures as the mutations were silent [166]. In the UK, ptxC2 has replaced

ptxC1 in recent years [255].

Prn has the highest level of variation in the currently circulating B. pertussis strains.

Variations in the prn gene were found to be in the short repeat regions, region 1 and

region 2, that comprise repeat units of five and three amino acids, respectively. Variations within

these regions can cause structural changes during translation which can affect the

binding capacity of the mature protein [60]. Different mechanisms, including SNPs

and small insertions or deletions leading to changes in the number of amino acids,

were found to be responsible for gene variations [254]. A high number of

polymorphisms was observed in region 1 of prn, which is near the Arg-Gly-Asp

motif. This region is implicated in adhesion and plays an important role in the

immunity against Prn [60]. The differences between the alleles are caused by the

insertion or deletion of repeating units of five amino acids (GGXXP) [230].

Currently, thirteen prn alleles have been identified, of which prn1 and prn7 are the

alleles present in 10 strains that are mainly used for either WCV or ACV

production. prn1, prn2 and prn3 are the predominant alleles in the population and this

appears to be a result of vaccination [167]. Strains with prn2 are predominant in

all countries with high vaccine coverage particularly in countries where ACV has

been used. In European countries, the frequency of prn2 ranges from 75% in France

to 95% in the Netherlands and the strains with prn1 were isolated from patients with

no vaccination history [166]. In most countries, the expansion of prn2 strains

occurred after ACV introduction although the first prn2 strains were isolated in the

1980s prior to the introduction of ACVs [45, 256, 260]. The frequency of prn2

strains in 12 European countries with ACV vaccination dramatically increased to

99% in 2012 while in Poland, where WCV is used for immunisation, there was

only a slight increase in prn2 strains to 56% in 2006 [159]. Further support for the

effect of ACV on the prn shift was the fact that in countries with low vaccine

coverage like Senegal, the predominant isolates still harbour prn1 [264]. In China,

where WCV was replaced by ACV in 2012, the predominant prn type during 1997

to 2005 was prn1 and only 16% of isolates were prn2 which was first detected in

2000 [262, 265]. It was shown that Prn type-specific antibodies were produced in

immunised or infected individuals [266]. Since the Prn type in ACV is Prn1, the

antibodies induced during immunisation might not be as protective against Prn2

producing strains.

Although strains carrying prn3 were also isolated in different countries, their

frequencies were not as consistent as prn2 strains in different countries. prn3 strains

started to expand after the introduction of ACV but were replaced by prn2 soon

after. In Denmark, prn3 isolates declined from 20% to 9% after 1997 [261]. In the

12 European countries, the frequency of prn3 strains declined from 10% in

1998-2001 to 4% in 2002-2006 and reached its lowest level of 1% in 2007-2012 [159].

Two types of fimbrial genes, fim2 and fim3 have also been found in B. pertussis.

Two fim2 alleles, fim2-1 and fim2-2, and four fim3 alleles have been identified so

far. fim2 alleles are identified by a single amino acid substitution and fim3 alleles

are distinguished by two amino acid substitutions. However, fim3-4 and fim3-1

differ only by a silent mutation [79, 167]. All vaccine strains used for the

production of either the WCV or ACV carried fim2-1 and fim3-1 alleles with the

exception of a Dutch vaccine strain which harbours fim2-2 and fim3-1 [167].

Figure 1.7-1: Antigenic shift in the B. pertussis population resulting in increased antigenic

mismatch between vaccine strains in use and circulating strains. (Figure adapted from van der

Ark et al. [267])

Mass vaccinations with WCV led to an antigenic shift in fimbrial serotypes from

Fim3 or Fim2/Fim3 to Fim2 in many countries. However, in recent years after the

introduction of ACV, strains with Fim3 increased and are the predominant Fim

serotype in countries with high vaccination coverage [268, 269]. The best example

of the effect of vaccination on serotype shift was observed in Sweden where Fim3

strains were replaced by Fim2 during 1979 to 1995 when vaccination ceased. After

the re-introduction of ACV in 1996 in Sweden, Fim3 strains began to expand in the

population to become the predominant serotype and constitute 96% of all strains

isolated in 2003 [270]. The same trend for Fim3 expansion after the introduction of

ACV was also observed in the United Kingdom and 11 other countries in Europe,

of which 86% of isolates in 2012 expressed Fim3 [159, 271].

In addition to the serotype shift of Fim in the bacterial population, variation in fim

alleles was also observed. The shift from fim2-1 to fim2-2 and from fim3-1 to

fim3-2 alleles was reported in the United Kingdom, Russia, Canada, the United States

and Europe [255, 260, 269, 272, 273]. In the United Kingdom, strains carrying

fim2-2 were observed since 1982 and increased to 20% of the bacterial population

[255]. In Russia, fim2-2 and fim3-2 rose to 89% and 64% of the population

respectively [272]. In Finland, 77% of strains isolated during 1992 to 2006 were

fim3-2 [258] and children infected by Fim3 expressing strains were hospitalised

more [274]. In Canada the number of strains harbouring fim3-2 increased to 33% in

2002 [275]. However, the predominant strains still contained the vaccine type allele,

fim3-1 [257]. Strains with fim3-2 alleles increased from 50% in 2001 to

59% in 2006 but decreased to 42% in the strains collected between 2007 and 2012 in

12 European countries with ACV vaccinations [159].

Six alleles have been identified so far for the tcfA gene. Alleles are differentiated by

point mutations, deletions or insertions in the gene. All vaccine strains in different

countries used for WCV or ACV production carry tcfA2 [258]. In the UK, 89% of

all clinical strains collected from 1992 to 2002 harboured tcfA2 and no allelic shifts

were observed between pre and post vaccination [255]. Van Amersfoorth et al. also

showed that tcfA2 is predominant in five other European countries ranging from

80% in France to 100% in Finland [166]. The same trend was also reported in other

countries like Japan, Denmark, the Netherlands and the United States [257, 261,

276, 277].

Limited polymorphisms in other virulence genes such as vag8, bapC and fha have

also been reported with no variations between vaccine and clinical strains [255,

277].

1.7.2.2 Single Nucleotide Polymorphism (SNP)

Point mutation is one of the main mechanisms for genetic variation in the

population and plays an important role in adaptation and evolution [278]. Point

mutation includes deletion or insertion of a single nucleotide, or substitution of one

nucleotide with another, which is called a single nucleotide polymorphism (SNP). SNPs in the

genome may play a key role in the evolution of the pathogen and some SNPs may

confer a selective advantage for the pathogen [279].

Many studies have investigated these small changes in the genome of B. pertussis

during the pre and post vaccination era. Early studies of SNP identification carried

out by partial DNA sequencing focused on virulence genes, in particular, genes

encoding antigens used in ACV production. Whole genome sequencing allowed

identification of SNPs across the whole genome.

B. pertussis is a monomorphic pathogen. van Loo et al. in 2002 found only 1 SNP

per 1.3 kb in genes coding for surface proteins from strains isolated within the last

60 years [276]. In 2008, microarray-based comparative genomic sequencing

showed 1 SNP per 20 kb based on analysing 9.4% of the B. pertussis genome [280].

The most recent study sequenced 343 B. pertussis strains collected from 19 countries from

1920 to 2010 and showed a mean density of 0.0013 SNPs/bp and an estimated

mutation rate of 2.24 × 10⁻⁷ per site per year [41].
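
The densities quoted above use different units (SNPs per kb, kb per SNP, SNPs per bp), so a small Python sketch may help put them on one scale. The genome size of about 4.1 Mbp is taken from section 1.7.2.3; applying the per-site rate uniformly across the whole genome is our simplifying assumption, and the function names are illustrative. The three studies sampled different loci and strain collections, so the converted figures are not directly comparable.

```python
# Unit conversions for the SNP densities and mutation rate quoted in the text.
# GENOME_SIZE_BP is the approximate figure from section 1.7.2.3; treating the
# per-site rate as uniform over the genome is an assumption for illustration.
GENOME_SIZE_BP = 4_100_000
MUTATION_RATE_PER_SITE_PER_YEAR = 2.24e-7


def density_per_bp(kb_per_snp: float) -> float:
    """Convert '1 SNP per N kb' into SNPs per base pair."""
    return 1.0 / (kb_per_snp * 1000.0)


def snps_per_kb(density_bp: float) -> float:
    """Convert SNPs per base pair into SNPs per kilobase."""
    return density_bp * 1000.0


def substitutions_per_genome_per_year(rate: float, genome_size_bp: int) -> float:
    """Expected substitutions per genome per year for a uniform per-site rate."""
    return rate * genome_size_bp


if __name__ == "__main__":
    reported = {
        "surface protein genes [276]": density_per_bp(1.3),
        "microarray, 9.4% of genome [280]": density_per_bp(20.0),
        "343 genomes [41]": 0.0013,
    }
    for label, density in reported.items():
        print(f"{label}: {snps_per_kb(density):.3f} SNPs/kb")
    per_year = substitutions_per_genome_per_year(
        MUTATION_RATE_PER_SITE_PER_YEAR, GENOME_SIZE_BP
    )
    print(f"about {per_year:.2f} substitutions per genome per year")
```

Under these assumptions the rate corresponds to a little under one substitution per genome per year, which fits the description of B. pertussis as a monomorphic pathogen.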

The current list of SNPs identified in clinical B. pertussis strains can be affected by

the number, year and location of isolates collected. Polymorphisms that were found

in the genes coding for antigenic proteins caused antigenic shifts in the B. pertussis

population. Another important group of polymorphisms reported to change the

temporal trend of current global circulating strains was the SNPs detected in

the promoter region of the Ptx operon (ptxP). ptxP regulates the expression of Ptx

through a BvgA binding site. To date, 19 alleles of ptxP have been identified, of

which two alleles, ptxP1 and ptxP3, are predominant worldwide. ptxP1 and ptxP3

differ by a single nucleotide change located at -65 of the ptx operon [254].

Although strains with the ptxP3 allele were identified in the 1980s and became

dominant in many countries before the introduction of the ACV, widespread

distribution of ptxP3 strains was documented in countries with high vaccine coverage

including Canada, Finland, France, Japan, the US and the Netherlands during the last

decade [159, 260, 273]. In the Netherlands, the rise of ptxP3 isolates was associated

with the resurgence of pertussis in terms of increasing notification rates and

hospitalisations. ptxP3 strains were shown to produce 1.6 times the amount of Ptx

as non-ptxP3 strains in vitro [281, 282]. A study in 2015 compared strains isolated

from different countries which used either the ACV or WCV during 1998 to 2012

and showed that the predominant strains in European countries which used ACV

contained ptxP3/ptxA1/prn2 alleles. This was in contrast to countries like Poland

and China where WCV was used for vaccinations, with the predominant strains still

harbouring ptxP1/prn1 alleles [159]. However, in Argentina where WCV and ACV

are used for primary and booster immunisations respectively, the predominant

strains harbour ptxP3/prn2 alleles [168].

Strains with ptxP3 alleles are usually associated with prn2, fim2-2 and fim3-1/fim3-2

alleles and were shown to group on a distinct branch on the B. pertussis genome

tree of over 300 isolates with unique SNPs in other loci [41, 283]. It seems that

ptxP3 B. pertussis strains expanded rapidly after the introduction of ACVs

worldwide [275, 282, 284]. The study by King et al. showed higher expression of

some virulence genes in ptxP3 strains compared to ptxP1 strains. This suggested that

ptxP3 may be more virulent [284]. Increased colonisation of wild type ptxP3 strains

in the lungs and trachea of mice also suggested a better fitness of ptxP3 strains in a

mouse model [284]. High polymorphisms were also reported in other loci including

cysB, encoding a LysR-like transcriptional regulator that plays a role in sulphur

metabolism and promoters of bvg and fhaB genes [41].

1.7.2.3 Genome reduction

Genome reduction plays a key role in the evolution and adaptation of

microorganisms as the deletion of genes can increase the virulence of bacterial

pathogens [279]. It was demonstrated that the genome size of B. pertussis (~4.1

million base pairs) is smaller than that of B. parapertussis and B. bronchiseptica, with

almost 1.3 Mbp removed from the genome during divergence of the Bordetella

species [42, 250]. It seems that this progressive gene loss pattern has continued

in currently circulating B. pertussis strains [283, 285].

The majority of genes that were variably present amongst different Bordetella

species are clustered together in locations around the genomes and are defined as

regions of differences (RDs). Genome investigation by King et al. showed that two

RDs relative to the Tohama I reference genome were deleted in recent

Dutch isolates [286]. There are two major deletions found in B. pertussis strains.

The first deletion was BP910A-BP934 which occurred after the introduction of

WCV while the second deletion coincided with the introduction of ACV [285].

Different studies showed that there was a definitive relationship between the loss of

BP1948-BP1962 and the presence of ptxP3 alleles in current circulating strains

around the world [285-288].

1.7.2.4 Insertion Elements (IS)

A high number of insertion elements have been found in the genome of B. pertussis

and ISs play a key role in creating diversity within the pathogen. Genome

rearrangement, gene reduction and gene inactivation are associated with ISs [254].

Three different ISs were identified in B. pertussis, of which IS481 with 238 copies

in the Tohama I genome is the most important IS. The loss of RDs in B. pertussis

strains was mostly associated with IS481 [42, 286]. Other ISs include

IS1002 (6 copies) and IS1663 (17 copies).

The number of ISs in current circulating strains varies between isolates [287].

However, a recent study [289] showed that the IS481 copy number remains

unchanged in recent B. pertussis isolates. IS elements may lead to functional

changes in the clinical isolates in terms of genotype and phenotype. The best

example of the effect of IS in current circulating strains was the detection of strains

that do not express Prn in countries with high ACV coverage. Strains which do not

express Prn, mainly due to IS481 disruption in the prn gene, were first found in

France in 2009 and then in other countries including the US, Japan, Europe,

Australia and Canada [290-294]. Around 32% of strains collected from 1997 to

2009 in Japan and over 50% of strains collected in 2012 in the United States were

Prn negative [291, 293]. In countries where WCV was still used like Poland, China,

Russia and Senegal, no Prn negative strains have been detected [262, 264, 295].

Since there are no differences between patients infected with Prn negative strains

compared to Prn positive strains [296, 297], it appears that prn gene inactivation is

a selective advantage for the pathogen to adapt against ACVs which contain Prn in

their formulations [298].

1.7.3 Genotyping tools for epidemiologic studies

The ability to differentiate and genotype B. pertussis is critical for surveillance of

the disease as well as for studying evolutionary relationships. A variety of methods

have been used including: 1) phenotypic tests like metabolic activity testing,

reaction to antisera and resistance to antibiotics and 2) genotypic tests which show

variations in genome structure or individual genes [279]. Reviewed below are some

of the genotyping methods used in understanding B. pertussis epidemiology, including

Multilocus Variable Number Tandem Repeats analysis (MLVA), Pulsed-Field Gel

Electrophoresis (PFGE), Multilocus sequence typing (MLST) and SNP typing. Each

method has its own advantages and disadvantages and is used for different

epidemiological analyses.

1.7.3.1 Multilocus sequence typing

MLST is a method of comparing housekeeping genes to characterise bacterial

isolates. The combination of alleles from different housekeeping genes constitutes a

sequence type (ST) [299]. However, variation in the housekeeping genes of B. pertussis is limited. MLST typing of B. pertussis therefore does not follow the strict definition of MLST using housekeeping genes only, and virulence genes are also included in the typing. One study that used the traditional MLST of 7 housekeeping genes found only 3 sequence types [40].
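
The ST definition above can be illustrated with a short sketch. The isolate names and allele numbers are hypothetical placeholders, not data from the cited studies, and the numbering of STs in order of first appearance is an assumption made for illustration only. The one idea taken from the text is that each distinct combination of alleles across the typed loci receives its own ST.

```python
from typing import Dict, Tuple


def assign_sequence_types(profiles: Dict[str, Tuple[int, ...]]) -> Dict[str, int]:
    """Give every distinct allele combination its own sequence type (ST) number."""
    st_by_profile: Dict[Tuple[int, ...], int] = {}
    st_by_isolate: Dict[str, int] = {}
    for isolate, alleles in profiles.items():
        # A new allele combination gets the next unused ST number.
        if alleles not in st_by_profile:
            st_by_profile[alleles] = len(st_by_profile) + 1
        st_by_isolate[isolate] = st_by_profile[alleles]
    return st_by_isolate


# Hypothetical allele numbers at seven loci (illustration only).
example_profiles = {
    "isolate_A": (1, 1, 2, 1, 1, 1, 1),
    "isolate_B": (1, 1, 2, 1, 3, 1, 1),
    "isolate_C": (1, 1, 2, 1, 1, 1, 1),
}
example_sts = assign_sequence_types(example_profiles)
```

Isolates A and C share every allele and so fall into the same ST, while isolate B differs at one locus and forms a second ST.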

MLST analysis of isolates collected from European countries, Japan and the United

States was carried out by using 15 virulence associated genes including genes

encoding vaccine antigens (ptxS1-5, prn, fhaB, fim2 and fim3) and surface proteins

(ompP, ompQ, tcfA, brkA, vag8 and bipA). Results showed limited variation

between isolates collected from different countries and no variations were found in

brkA, fim3, ompP, ptxS2, ptxS4 and ptxS5 [277].

Further MLST investigations using the same virulence genes and 2 new virulence

genes, cyaA and tcfA confirmed the limited variations in selected genes of isolates

collected in the United Kingdom with MLST5, ptxA1-ptxC2-tcfA2, as the predominant type [255]. The predominant MLST type in the United States from 2006 to 2009 was prn2-ptxP3-ptxS1-fim3-2 [260].

1.7.3.2 Pulsed-field Gel Electrophoresis typing

PFGE is one of the most commonly used methods for subtyping B. pertussis. It is

based on the analysis of DNA fragments after the genome has been cut with

restriction enzymes that target specific sequences. The fragments are then

visualised by gel electrophoresis and the isolates can be categorised by different

patterns of fragmentation. European laboratories have standardised PFGE for B. pertussis typing [300].

PFGE analysis showed that most of the current European B. pertussis strains are grouped as cluster IV, which consists of three subgroups: IVα, IVβ and IVγ. The

most predominant profiles in the European strains are BpSR11 which belongs to

IVβ followed by BpSR10 (IVα), BpSR3 (IVβ), BpSR5 (IVα) and BpSR12 (IVγ).

All of the PFGE profiles mentioned harbour the allele combinations of ptxP3, prn2,

ptxC2 and tcfA2. Moreover, isolates from BpSR11, BpSR15 and BpSR12 had the

fim3-2 allele while BpSR10 and BpSR3 had fim3-1 [301].

In the United States, where the number of Prn negative strains has increased dramatically, the predominant PFGE type among Prn positive strains is CDC013 and the predominant types among Prn negative isolates are CDC002 and CDC273 [293]. In China, the predominant PFGE type of isolates collected in 2012-2013 was BpFINR9 with prn1-fim3-1-ptxP1 alleles [302].

1.7.3.3 Multilocus Variable Number Tandem Repeats analysis typing

MLVA is a PCR-based typing method used to identify naturally occurring

variations in the numbers of short tandem repetitive sequences, named VNTR, in

the genome. Multiple VNTR loci are characterised either by sequencing or by fragment analysis, and the combination of alleles produces a unique MLVA type (MT) [303]. While there are 13 VNTRs in the B. pertussis genome, only six (VNTR1, 3a, 3b, 4, 5 and 6) have been regularly used for typing isolates; other loci such as VNTR2 showed too little variation to be included in further analysis [304].

MLVA using six VNTRs on B. pertussis isolates collected over 40 years from different countries, including Australia, identified six predominant clones circulating globally, of which MT27 was the predominant type, followed by MT29, in Australia and other countries. However, some MTs were restricted to specific regions, such as MT186 in Japan [305]. MLVA typing of isolates from the United States showed the same results [260]. In China, however, the MLVA type of strains collected in 2012-2013 (MT55) differed from that of other countries [302].

1.7.4 Genomic adaptation and evolution of B. pertussis strains in Australia

Australia has experienced pertussis resurgence since the 1990s with outbreaks

occurring every 3 to 5 years. A major question was whether the introduction of

ACV contributed to the resurgence. Clonal replacement was observed based on

several molecular epidemiological studies [41, 159, 283]. The B. pertussis strains

circulating after the introduction of ACV in 1997 were different from those before

ACV introduction [305-307]. There are two types of ACV used in Australia for

vaccination. The predominant ACV is a three-component vaccine (GlaxoSmithKline, Melbourne, Australia) that includes Ptx, Prn and FHA, and the less frequently used ACV is a five-component vaccine (Sanofi, Adacel) that contains two more antigens, Fim2 and Fim3. Both vaccines were produced from

Tohama I which carries ptxA2/prn1/fhaB1/fim2-1/fim3-1 alleles [306].

Using MLVA, an analysis of over 200 Australian isolates collected over four decades identified four predominant MLVA types: MT27 (13.5%), comprising one isolate from 1973 and others from the 1990s to 2008; MT29 (21.6%), observed since 1972; MT70 (21.2%), observed from 1996 to 2005, mostly since the introduction of ACV in 1997; and MT64 (9.1%), observed from 1989 to 2002. The

trends in prevalence, over time, of the four commonest MTs, were analysed in three

time periods, based on type of vaccine(s) used at the time: WCV (prior to 1997),

the transition period of both WCV and ACV (1997 to 1999) and ACV only (2000

onwards). MT64 was found to be steady over the three periods. MT29 decreased

while MT27 and MT70 increased. Typing of genes encoding ACV antigens showed

that the use of ACV may have driven the antigenic changes of two MTs (MT27 and

MT70) predominant in Australia [305].

As MLVA was less useful at resolving evolutionary relationships, Octavia et al.

used SNP typing to separate 316 isolates into 42 SNP profiles by using 65 SNPs

[306]. Isolates were grouped into six distinct clusters based on SNP profiles, ranging from cluster I to cluster VI. The majority of Australian isolates were grouped into four clusters (I to IV), which were differentiated from cluster VI that contained Tohama I. The majority of recent Australian isolates had the genotype profile of ptxA1/prn2/fim3-1 or fim3-2 and were grouped into cluster I

which increased in frequency after the introduction of ACV (Figure 1.7-2).

Figure 1.7-2: Trends of the four major clusters of B. pertussis in Australia. Four major clusters

(I–IV) in Australia were divided into three periods: WCV (prior to 1997), transition from WCV to

ACV (1997–1999), and ACV (2000 onwards). Percentage (y axis) of a given cluster of the total

number of isolates for that period is shown (adapted from Octavia et al. [306])

Similar to the expansion of ptxP3 strains in countries with high ACV coverage, further investigations of clinical B. pertussis strains collected during the 2008-2010 pertussis epidemic in Australia revealed that the proportion of strains harbouring ptxP3/prn2 alleles (cluster I) had increased dramatically to 86%, which suggests that the increase may have occurred due to the pressure of ACV-induced immunity [259, 308].

Prn negative B. pertussis isolates have been reported in European countries, Japan,

North America and many other countries [291, 294, 309, 310]. Several countries

have reported a rapid rise of Prn negative strains. In the United States, more than 50% of strains collected in 2012 were Prn negative and this increased to 85% by the end of 2013 [293, 297]. There may be a direct relationship between ACV use and the increase in Prn negative strains in the US, as infections with Prn negative strains were significantly more frequent in ACV-vaccinated patients. During the latest Australian pertussis epidemic, the proportion of Prn-deficient isolates also increased dramatically, from 5% in 2008 to 78% in 2012, the majority of which had lost their ability to express Prn through insertion of IS481 in the prn gene [311].

1.7.5 Animal models to study B. pertussis infection

Animal models are good tools to experimentally investigate immunity induction,

vaccine efficacy and pathogenic mechanisms with live pathogens under different

conditions [312]. However, for respiratory infections like pertussis, good animal

models reflecting symptoms of infection in humans, particularly coughing, are

difficult to find. Various animal species have been used for investigation of

pertussis infections including mouse, rat, piglet, rabbit, and non-human primates

[267]. Each animal model allows the study of one or more parameters such as pathogenicity, host response to infection, immunogenicity and effectiveness of candidate or registered vaccines, but each model also has its own limitations [267].

The best animal models to investigate pathogenicity of B. pertussis are nonhuman

primates as classical whooping cough or pertussis coughing and human-like

immune responses can only be seen in these animals. Moreover, the longevity of

these animals helps researchers model transmission and the duration of protection

over time [178, 267]. On the other hand, ethical considerations and research costs can limit the wider use of nonhuman primates.

One of the most accessible and common models to study pertussis is the mouse model. The relatively easy access to wild-type, knockout and transgenic mouse strains has made it a widely used model for in vivo studies [65]. However, in

mouse models, investigating transmission may be challenging due to the lack of

coughing. Since some immune response similarities between humans and the

mouse model have been found, it can be widely used to study the importance of

virulence factors and the role of various immune responses in vivo in response to

either natural infection or pertussis vaccines [65, 313]. The adherence of bacteria

to ciliated trachea and macrophage invasion in lungs also makes the mouse a good

animal model to investigate vaccine efficacy [194].

Either the intranasal or the aerosol method can be used to induce infection in mice. Intranasal challenge is simpler than the aerosol method in terms of delivering the bacteria to the respiratory tract and the time of bacterial exposure. However, the intranasal method, which involves dropping a solution of bacteria on the nares of the mouse and waiting for inhalation, may require light anaesthesia, which can affect the study [194].

Mouse model studies have been used widely to investigate ACV efficacy and antigenic shifts in recent B. pertussis strains [314-316]. The model has been shown to have good sensitivity to detect strain differences: van Gent et al. [317] showed differences between PtxA1/Prn1 B. pertussis strains and PtxA1/Prn2/Prn3 strains in the colonisation of the lungs, and Hegerle et al. [316] showed differences between Prn negative and Prn positive strains in the ACV-immunised mouse model.

1.8 Aims and motivations

As discussed above, B. pertussis adaptation, including antigenic shift, has been observed in many countries with high ACV coverage, including Australia.

Therefore this thesis has focused on the microevolution and fitness of the current

circulating B. pertussis strains in Australia which were compared with strains that

had been predominant in the past to gain a further understanding of the evolution of

B. pertussis.

Previous SNP typing grouped B. pertussis strains currently circulating in Australia

within a single cluster, named cluster I, and separated them from strains which were

predominant prior to the introduction of ACV [259]. It was documented that during

the latest pertussis outbreak in Australia, the majority of strains belonged to SNP profile 13 (SP13) of cluster I. It was also shown that Prn negative strains increased from 5% in 2008 to 78% in 2012 [311]. Therefore the first aim of this thesis (chapter 3) was to investigate the microevolution of SP13 strains, including Prn positive and Prn negative B. pertussis isolated from 2008 to 2012 in Australia.

Australian B. pertussis isolates from the 1970s to the present have been divided into four major clusters (I-IV). Cluster I strains in Australia have the same antigen alleles as strains that are globally circulating and are known as ptxP3 strains [259]. ptxP3 strains have been shown to carry a new allele of the pertussis toxin promoter, ptxP,

and produce more pertussis toxin [281]. Therefore the second aim of this thesis

with results presented in chapter 4 was to investigate the genomic evolution of

different clusters using two different methods of whole genome sequencing,

Illumina and PacBio sequencing, to reveal the genomic diversity of different

clusters including small indels, IS elements and genome rearrangement in addition

to SNPs.

Only a few studies have investigated vaccine selection pressure in vivo, and most of these assessed ACV efficacy in terms of bacterial clearance and strain fitness in single-strain infections using the mouse model. Therefore the third aim of this thesis was to develop a mixed infection mouse model to determine the relative in vivo fitness of Prn negative strains over Prn positive strains (chapter 5) and of cluster I strains over cluster II strains (chapter 6) in immunised and naïve environments.

Chapter 2. Materials and methods

2.1 Materials

2.1.1 Bacterial Strain

A total of 35 B. pertussis strains collected in Australia from different SNP clusters

were used in this study for whole genome sequencing and mouse studies. Details of

the isolates including the strain number, date of isolation and state of origin, SNP

profile and genotype profile can be found in Tables 2.1-1 and 2.1-2. The genotypes, SNP profiles and Prn expression were determined by Octavia et al. [259, 306] and Lam et al. [311].

Table 2.1-1: B. pertussis isolates selected from major clusters and used in chapters 4 and 6.

Lab. No.  Year  Country-State  Cluster  SP^  ptxA  ptxP  prn  fim3  Original name
L506      2004  Australia      IV       30   1     1     1    1
L568      1954  Japan          VI       36   2     1     1    1     NCTC13251 (Tohama I)
L580      2007  Australia      VI       27   2     1     1    1
L706      2002  Australia      III      19   1     1     1    1
L1191     2009  Australia-NSW  II       37   1     1     3    1
L1204     2009  Australia-NSW  UC       18   1     1     1    1
L1415     2009  Australia-NSW  UC       11   1     1     2    1

SP^ = SNP profile
UC = Unclustered isolates

Table 2.1-2: Details of 28 B. pertussis isolates from cluster I including SNP profile 13 and 16

(all ptxP3 isolates). Prn negative isolates and the cause of prn disruption are mentioned.

Lab. No.  Year  State      SP^  MLVA type  fim3  Prn exp.  Cause of prn silencing
L524      1997  Australia  13   27         1
L462      1999  Australia  13   27         1
L475      2000  Australia  13   27         1
L482      2001  Australia  13   27         1
L490      2002  Australia  13   27         1
L1037     2008  NSW        13   27         1
L1042     2008  NSW        13   27         1
L1380     2008  WA         13   27         1
L1391     2008  WA         13   3 (214)    1
L1214     2009  NSW        13   27         1
L1216     2009  NSW        13   27         1
L1419     2009  NSW        13   27         1
L1361     2009  WA         13   3 (214)    1     NEG       IS481R
L1382     2009  WA         13   27         1
L1421     2010  NSW        13   19 (114)   1     NEG       IS1002
L1423     2010  NSW        13   19 (114)   1
L1432     2010  NSW        16   22 (114)   2     NEG       IS481R
L1376     2010  WA         13   27         1
L1397     2010  WA         13   27         1
L1493     2011  NSW        13   27         1     NEG       IS481R
L1507     2011  NSW        13   27         1     NEG       IS481R
L1756     2011  WA         13   27         1     NEG       IS481R
L1770     2011  WA         13   27         1
L1658     2012  NSW        13   27         1     NEG       IS1002
L1661     2012  NSW        13   27         1     NEG       IS481R
L1663     2012  NSW        13   27         1     NEG       IS481F
L1779     2012  WA         13   27         1     NEG       IS481R
L1780     2012  WA         13   27         1     NEG       IS481R

SP^ = SNP profile

2.1.2 Mouse strain

Female BALB/c mice (2-3 weeks old) were used in the in vivo studies in chapters 5 and 6. Mice were purchased from the Animal Resource Centre located in Perth,

WA, and were transferred to the Biological Research Centre (BRC) in UNSW at

least one week before experiments were started.

2.2 Methods

2.2.1 Culturing of B. pertussis

The routine plate culture of B. pertussis was performed using a medium first

described by Bordet and Gengou in 1906 [318]. Bacterial isolates were cultured on

commercial Bordet-Gengou agar (BG) (Becton Dickinson) supplemented with 7%

defibrinated horse blood (Oxoid) and 1.8% glycerol at 37°C for 3-5 days from the

frozen stock. Cultures were observed for haemolysis to select colonies in the Bvg+

phase for further subculturing on BG agar or charcoal agar supplemented with 7

and 10% defibrinated horse blood respectively and 40 mg/L cephalexin (Oxoid).

For broth cultures, a loopful of Bvg+ colonies from each isolate was inoculated into

20 millilitres of Stainer-Scholte (SS) broth for overnight culture (24 hours) at 37°C

with shaking (180 rpm). The broth was also supplemented with 1% Heptakis ((2,3,6-tri-O-methyl)-β-cyclodextrin) (MP Biomedicals) and 1x SS supplement to increase the growth of B. pertussis, based on Hulbart et al. [318]. Since B. pertussis

grows slowly in broth, a loopful of inoculum rather than a single colony was used.

If OD adjustment was required for growth curve determination studies, after 24

hours incubation, the OD600 was recorded and then adjusted to 0.05 in 20 millilitres

of fresh SS broth and samples were then taken at 12 hr intervals for 48 hr.
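
The OD adjustment described above is a simple dilution, and a minimal sketch of the arithmetic is given here. It assumes that the 20 ml refers to the final culture volume and that OD600 scales linearly with cell density over the range used; neither point is stated in the text, and the function name and the measured OD value are hypothetical.

```python
def inoculum_volume_ml(measured_od600: float,
                       target_od600: float = 0.05,
                       final_volume_ml: float = 20.0) -> float:
    """Volume of overnight culture needed to reach the target OD600 (C1*V1 = C2*V2)."""
    if measured_od600 <= target_od600:
        raise ValueError("Overnight culture is not denser than the target OD600.")
    return target_od600 * final_volume_ml / measured_od600


# Hypothetical overnight reading of OD600 = 1.0.
culture_ml = inoculum_volume_ml(1.0)
fresh_broth_ml = 20.0 - culture_ml
```

With the hypothetical reading above, 1 ml of overnight culture would be made up to 20 ml with 19 ml of fresh SS broth.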

2.3 DNA extraction

For DNA extraction, a pure plate culture from a single colony of B. pertussis was

collected and dissolved in 250 µl of premixed buffer containing 50 mM Tris-HCl and 0.5 M EDTA buffer (with a 40:30 ratio). The mixture was centrifuged at 14000

rpm (18630 x g) for 30 sec and the supernatant was removed. The cell pellet was

then washed again by resuspending it in 250 µl of Tris/EDTA mixed buffer and

centrifuged. Next, the washed pellet was resuspended in 250 µl premixed buffer

and incubated at 37°C for 20 minutes. Fresh lysozyme (10 µl of 20 mg/ml) was

added and the mixture was further incubated for 20 min to lyse the cells. For

protein digestion, 5 µl of 20 ng/mL Proteinase K (Sigma) and 50 µl of 10% SDS (Promega) were added to the solution and mixed well by pipetting before incubating it overnight in a water bath at 37°C. Protein digestion was complete when the solution became clear. After overnight incubation, 1 µl of 20 mg/mL RNase (Sigma) was added to the mixture, which was incubated at 65°C for 15 min to degrade RNA.

The contents were then transferred to a gel phase dividing tube (Eppendorf) and

250 µl of phenol:chloroform:isoamyl alcohol (with a 25:24:1 ratio) was added. The tube was then shaken gently for 3-5 min and centrifuged at 14000 rpm (18640 x g) for 4 min. After repeating this step and centrifuging for another 4 min, 250 µl of chloroform:isoamyl alcohol (24:1) was added and the tube was centrifuged again at 14000 rpm (18640 x g) for 4 min.
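
The rpm and x g values quoted in this section are related through the rotor radius. The sketch below uses the standard conversion RCF = 1.118 x 10^-5 x r x rpm^2, with r in centimetres. The rotor is not named in the text, so the 8.5 cm radius is an assumption, chosen only because it approximately reproduces the quoted figures.

```python
def rcf_from_rpm(rpm: float, rotor_radius_cm: float) -> float:
    """Relative centrifugal force (x g) from rotor speed and radius in cm."""
    return 1.118e-5 * rotor_radius_cm * rpm ** 2


ASSUMED_RADIUS_CM = 8.5  # assumption: not stated in the text

rcf_at_14000_rpm = rcf_from_rpm(14000, ASSUMED_RADIUS_CM)
```

Under this assumed radius, 14000 rpm works out to a little over 18600 x g, close to the value given in the text.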

The supernatant which contained the DNA was then transferred into two volumes

of cold 100% ethanol for DNA precipitation. The DNA was spooled using a glass

hook and rinsed in 70% ethanol. The purified DNA was then transferred into 100

µl of TE buffer and heated at 60°C for 5 min on a heating block to evaporate any

residual ethanol. The DNA was then stored at 4°C in the fridge.

If the DNA was to be used for whole genome sequencing, further purification was

required to obtain high quality DNA. High quality DNA was defined as having an

OD 260/280 and 260/230 ratio of 1.8-2 and a concentration of at least 200 ng/µl.

For purification, 2 volumes of cold 100% ethanol were added to the DNA for overnight precipitation, followed by centrifugation at 14000 rpm (18640 x g) for 30 min. The supernatant was removed, 200 µl of 70% ethanol was added and the tube was centrifuged for 20 min at 14000 rpm (18640 x g). The purified DNA was then dissolved in autoclaved and 0.2 µm filtered MilliQ water. Nanodrop® and Qubit® were then

used to check the quality and concentration of the DNA while agarose gel

electrophoresis was used to check for DNA degradation.
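
The acceptance criteria for sequencing-grade DNA given above can be written as a simple check. This is a sketch only: the function and parameter names are hypothetical, and treating the 1.8-2 range as inclusive at both ends is an assumption.

```python
def is_high_quality_dna(ratio_260_280: float,
                        ratio_260_230: float,
                        concentration_ng_per_ul: float) -> bool:
    """Apply the stated criteria: both ratios within 1.8-2 and at least 200 ng/µl."""
    ratios_ok = all(1.8 <= ratio <= 2.0 for ratio in (ratio_260_280, ratio_260_230))
    return ratios_ok and concentration_ng_per_ul >= 200.0


# Hypothetical readings for three preparations.
sample_passes = is_high_quality_dna(1.9, 2.0, 250.0)
sample_low_ratio = is_high_quality_dna(1.6, 1.9, 300.0)
sample_too_dilute = is_high_quality_dna(1.9, 1.9, 150.0)
```

Only the first hypothetical preparation meets all three criteria; the others would need further purification or concentration.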

2.4 Polymerase chain reaction (PCR)

Each PCR mix consisted of approximately 30 ng of DNA template, 10 µl of 5X

MyTaq Reaction buffer (Bioline) which contains the dNTPs and MgCl2, 1 µl of

each of the forward and reverse primers (approximately 15 pmol) and 0.25 µl (1

unit) of MyTaq HS DNA polymerase (Bioline). Autoclaved and 0.2µm filtered

MilliQ water was added to reach a total reaction volume of 50 µl.
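
The volume of water needed to bring each reaction to 50 µl follows from the component volumes listed above. In the sketch below the template volume is a parameter, because the text gives the template as a mass (approximately 30 ng) rather than a volume; the 1 µl used in the example is hypothetical, as is the function name.

```python
def pcr_water_volume_ul(template_volume_ul: float,
                        total_volume_ul: float = 50.0,
                        buffer_ul: float = 10.0,
                        primer_each_ul: float = 1.0,
                        polymerase_ul: float = 0.25) -> float:
    """Water needed to top the reaction up to the total volume."""
    used = buffer_ul + 2 * primer_each_ul + polymerase_ul + template_volume_ul
    water = total_volume_ul - used
    if water < 0:
        raise ValueError("Components exceed the total reaction volume.")
    return water


# Hypothetical case: the ~30 ng of template is delivered in 1 µl.
water_ul = pcr_water_volume_ul(template_volume_ul=1.0)
```

For the hypothetical 1 µl of template, the remaining volume to be made up with water is 36.75 µl.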

PCR cycling conditions were as follows: initial denaturation at 95°C for 1 min,

followed by 25 or 35 cycles of 95°C for 15 sec, 60°C for 30 sec, 72°C for 30 sec

and a final extension step at 72°C for 5 min.

2.5 Agarose Gel Electrophoresis

PCR products and extracted DNA were separated and visualised on a 1-2% agarose

gel in 1x TBE buffer for 30 min at 80-90 volts, to confirm the presence and

approximate size of the amplified products and the presence of any unwanted RNA

or degradation in the extracted DNA.

Once the gel had set, DNA or PCR products mixed with 1x loading dye were loaded into each well. After 30 min of running, the gel was stained in GelRed® (Biotium) for 10-15 min before visualisation under ultraviolet transillumination (Gel Doc 2000, BioRad). Each gel also contained 2 µl of a 1 or 10 kb molecular size standard (HyperLadder I or II, Bioline) from which the size of PCR products or DNA could be estimated.

2.6 PCR product purification

PCR products were precipitated using the ethanol/sodium acetate precipitation method. In brief, amplified DNA was transferred to a microcentrifuge tube together with 2 volumes of 80% ethanol and 1 volume of 3 M sodium acetate. The mixture was then vortexed and left at room temperature for 30 min or in the freezer overnight, depending on the required purification quality. The DNA was then pelleted by centrifugation at maximum speed (18640 x g) for 25 min. The supernatant was removed and the pellet was dried at 65°C on a heating block. The precipitated product was then resuspended in 10 µl of autoclaved and 0.2 µm filtered MilliQ water.

2.7 Colony Forming Unit (CFU) count

CFU count was performed to determine the doubling time of isolates analysed in Chapters 5 and 6. Briefly, a loopful of Bvg+ colonies was inoculated in 20 ml of SS medium supplemented with 1% Heptakis and SS supplement (1x) and incubated at 37°C with shaking (180 rpm). After overnight incubation, the OD600 was adjusted to 0.05 and samples were taken at 12 h intervals for 48 h. CFU were then estimated by plating samples diluted with 1x PBS on BG agar. The CFU count was performed in triplicate. The CFU count was then used to calculate the doubling time (DT) using the following equation:

\[
DT = \sum \left\{ \left( \frac{\log_{10} \mathrm{CFU}_{\text{time point 2}} - \log_{10} \mathrm{CFU}_{\text{time point 1}}}{0.301} \right) \times \text{time point 2} \right\}
\]

2.8 Whole genome sequencing

Illumina sequencing was performed by The Ramaciotti Centre at the University of New South Wales and the PacBio sequencing was done commercially at the Yale University sequencing centre. The main difference between the Illumina and PacBio platforms is the length of reads generated. PacBio reads are much longer. However, the cost is also much higher and thus only 9 isolates were sequenced using PacBio in this study.

2.9 General procedures performed in mouse model studies

Details of methods used for animal studies are presented in the relevant chapters. Here, only the common procedures are described. All animal work was performed at the UNSW Biological Resource Centre (BRC). All procedures involving animals were conducted under the University of New South Wales Animal Care and Ethics Committee approval number 12/137A.

2.9.1 Mouse housing and monitoring

All the procedures were performed according to the Australian code for the care and use of animals for scientific purposes, edition 8, 2013. Safety cabinets and

antiseptic procedures were used to prevent any contamination and reduce the

hazards for both human and animals.

All mice were housed in cages which had lids covered by HEPA filters. All cages

were ventilated individually to protect both the animals and humans. To follow

housing guidelines, all cages were changed every two weeks with new HEPA

filters. Mice were acclimatised for one week after arrival at the BRC and monitored

before experiments were started.

Mice were monitored daily after immunisation and induction of infection to record

any changes in their behaviour and health as well as to check that there was an

adequate supply of food and water. The weight of each mouse was recorded on the

day of intranasal challenge and again before they were euthanized.

2.9.2 Anaesthetic methods used for sedation

Two methods, inhalation and injection, were used to sedate the mice for

immunisation and intranasal challenge during the in vivo experiment.

2.9.2.1 Inhalation

Inhalation was performed using an anaesthetic machine (The Stinger, AAS®) with the oxygen flowmeter set to a maximum of 2 L/min (usually 0.9-1 L/min) at 1.0 bar. Also, 2-3% isoflurane (ISOTHESIA®) was used for induction and 1.5% was

used for maintenance. Mice were placed in the induction chamber connected to the

anaesthetic machine for 1 minute. As the mice became lightly sedated,

subcutaneous immunisation was performed with either diluted ACV for the

immunised group or 1x PBS for the control group. This procedure was repeated

two weeks later for the second dose of immunisation. Details of the vaccine and

immunisation dose used were described in the relevant chapters.

2.9.2.2 Injection

Two weeks after the second immunisation, the mice were lightly sedated for

intranasal challenge. To prepare the stock, 210 µl of 2 mg/ml ketamine (Cenvet®) and 105 µl of 2 mg/ml xylazine (Cenvet®) were diluted with 315 µl of 1x PBS in a sterile 1.5 ml Eppendorf tube for the IP (intraperitoneal) injection using a 0.5 ml insulin syringe. Each mouse received 30 µl of the prepared solution containing 50 mg/kg ketamine and 25 mg/kg xylazine. The use of S8 drugs had appropriate approval and all guidelines were adhered to.

To euthanize mice for blood and lung collection, the solution was prepared with 240 µl of ketamine, 80 µl of xylazine and 360 µl of 1x PBS. Each

mouse received 120 µl of the solution containing 200 mg/kg ketamine and 100

mg/kg xylazine.

2.9.3 Mouse blood collection

Cardiac puncture was used for blood collection and was performed after the mice

were euthanized. 29-gauge insulin syringes (BD®) were used for blood collection. The mice were laid on their backs and the needle was inserted below the sternum, slightly towards the left and directed towards the heart. Approximately 500 to 750 µl of blood was collected into a Serum Gel Z/1.1 micro tube (Sarstedt®), which contained a clotting activator. The micro tubes containing the blood were then centrifuged for 10 min at 8000 rpm (6080 x g) and the sera were collected and stored in a -80°C freezer for future studies.

Chapter 3. Genomic dissection of Australian Bordetella pertussis

isolates from the 2008-2012 epidemic

3.1 Introduction

Bordetella pertussis, a small Gram-negative bacterium, is the causative agent of the

respiratory infection, pertussis. The infection can be life threatening in infants and

young children. Introduction of whole cell vaccine (WCV) during the 1950s

significantly reduced the morbidity and mortality of pertussis globally. However,

due to the side effect profiles of WCV, acellular vaccine (ACV) was developed in

the 1980s [111]. In Australia, ACV replaced the WCV initially as a booster in 1997

and then for all scheduled doses by 2000 [259]. Australia predominantly uses 3-component ACVs from GlaxoSmithKline containing detoxified pertussis toxin (Ptx), pertactin (Prn) and filamentous haemagglutinin (FHA). The 5-component ACV from Sanofi-Aventis, which additionally contains 2 types of fimbriae (Fim2 and Fim3), has also been used in Australia [305].

Re-emergence of pertussis has been reported in many countries with high

vaccination coverage, including the US, Canada, Japan, European countries and

Australia [46, 167, 218, 219, 231, 261]. Despite high pertussis vaccine uptake in Australia (91-92%), pertussis incidence is still the highest amongst all vaccine-preventable diseases. The latest pertussis epidemic started in 2008 and reached its peak in 2011 [219, 319]. In the state of New South Wales (NSW), the pertussis notification and hospitalisation rates were 2.7 and 3.9 times higher, respectively, than the previous five-year average. In addition, there was a significant increase in notification and hospitalisation rates for infants aged less than one year [319].

Strain variation and pathogen adaptation have been reported in different countries, as evidenced by polymorphisms in several virulence-associated genes and their promoters, including those encoding ACV antigens: the ptx genes and their promoter ptxP, prn, fhaB and fim [41, 167, 251]. Recent studies have reported the emergence and

increasing circulation of isolates that do not express Prn, a key component of the

ACV, in several countries [290, 291, 309, 310, 320, 321]. Comparative genomics

of pre-vaccination and modern B. pertussis strains showed that single nucleotide polymorphisms (SNPs) in important genes, including virulence-associated genes or regulatory genes, may have helped the organism to survive under vaccine selection pressure [280, 282, 283].

Previously, SNP typing classified 208 Australian B. pertussis isolates collected

since the 1970s into six SNP clusters [306]. An increase in prevalence of SNP

cluster I was documented after the introduction of ACV with SNP profile (SP) 13

found to be the most prevalent in the latest Australian pertussis epidemic [306].

Moreover, pertussis notification rates were high in NSW and Western Australia (WA), two widely separated Australian states [306, 322].

3.2 Aims and Motivation

SNP typing of 194 B. pertussis isolates collected from 2008-2010 revealed that

44% and 60% of NSW and WA isolates respectively belonged to SP13 [306].

A subsequent study showed that Prn negative isolates emerged in 2008 and increased to 78% in 2012 [322]. Therefore, the aims of this study were to use next-generation genome sequencing to:

a) investigate the genetic diversity and evolutionary characteristics of B. pertussis isolates associated with the recent Australian pertussis epidemic of 2008-2012;

b) determine the origin and relationships of the 2008-2012 epidemic Prn negative isolates.

3.3 Materials and Methods

3.3.1 Bacterial Strains

A total of 27 B. pertussis SP13 isolates were selected based on year and state of

isolation and the inactivation mechanism of prn gene. Amongst the 27 selected B.

pertussis SP13 isolates, 22 represented isolates from the Australian 2008-2012

epidemic and are referred to henceforth as epidemic isolates. The remaining five

isolates represented isolates prior to the 2008-2012 epidemic and are referred to as

pre-epidemic isolates (Table 3.3-1). Bacterial isolates were maintained on Bordet-Gengou agar (Oxoid) supplemented with 7% horse blood (Oxoid) at 37°C for 3-5 days. Clinical isolates had been passaged five times or fewer since their first isolation for diagnostic purposes. Genomic DNA was extracted and purified

from pure culture using the phenol-chloroform method as described previously

[323].

Table 3.3-1: Details of SP13 isolates used in this study.

State    Year of isolation  Lab No.  SNP profile  Cluster  ptxA  fim3  prn  PRN phenotype
NSW      2008               L1037    SP13         I        1     1     2    +
NSW      2008               L1042    SP13         I        1     1     2    +
NSW      2009               L1214    SP13         I        1     1     2    +
NSW      2009               L1216    SP13         I        1     1     2    +
NSW      2009               L1419    SP13         I        1     1     2    +
NSW      2010               L1421    SP13         I        1     1     2    prn-::IS1002
NSW      2010               L1423    SP13         I        1     1     2    +
NSW      2011               L1493    SP13         I        1     1          prn-::IS481R
NSW      2011               L1507    SP13         I        1     1          prn-::IS481R
NSW      2012               L1658    SP13         I        1     1          prn-::IS1002
NSW      2012               L1661    SP13         I        1     1          prn-::IS481R
NSW      2012               L1663    SP13         I        1     1          prn-::IS481F
WA       2008               L1380    SP13         I        1     1     2    +
WA       2008               L1391    SP13         I        1     1     2    +
WA       2009               L1361    SP13         I        1     1     2    prn-::IS481R
WA       2009               L1382    SP13         I        1     1     2    +
WA       2010               L1376    SP13         I        1     1     2    +
WA       2010               L1397    SP13         I        1     1     2    +
WA       2011               L1756    SP13         I        1     1          prn-::IS481R
WA       2011               L1770    SP13         I        1     1          +
WA       2012               L1779    SP13         I        1     1          prn-::IS481R
WA       2012               L1780    SP13         I        1     1          prn-::IS481R
Unknown  1997               L524     SP13         I        1     1     2    +
Unknown  1999               L462     SP13         I        1     1     2    +
Unknown  2000               L475     SP13         I        1     1     2    +
Unknown  2001               L482     SP13         I        1     1     2    +
Unknown  2002               L490     SP13         I        1     1     2    +


3.3.2 DNA sequencing and Assembly

Paired-end DNA libraries with an insert size of 250 bp were constructed using the NexteraXT kit (Illumina) and sequenced on the MiSeq (Illumina). Genome sequencing was done in a multiplex of 24. De novo assembly was performed for all sequencing data using VelvetOptimiser [324] to combine reads into contigs. These contigs were compared to the reference genomes using the progressiveMauve program (version 2.3.1) [325]. Some contigs could not be aligned against B. pertussis Tohama I, the main reference genome, and were therefore treated as unique. The identity of these contigs was confirmed using BLASTn to determine their homologues.

3.3.3 SNP Identification

SNP detection was performed as described previously [326]. Briefly, reads were mapped against the reference genomes using Burrows-Wheeler Alignment (BWA) tools (version 0.7.5) [327]. SNPs were identified using SAMtools (version 0.1.19) [328]. Any SNPs with a quality score of less than 20 were removed. Only SNPs and indels supported by ten or more reads and by ≥70% of the reads at that position were selected and further confirmed in progressiveMauve alignments. The filtered SNPs from mapping were compared to the SNPs exported by progressiveMauve, and the SNPs common to both were selected for further analysis.
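The filtering criteria above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the Variant record, the function names and the toy values are hypothetical, and the 70% criterion is assumed to mean the fraction of reads covering the site that support the variant.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    position: int    # 1-based position on the reference genome
    quality: float   # variant quality score reported by the caller
    depth: int       # total number of reads covering the site
    alt_reads: int   # number of reads supporting the variant allele


def passes_filters(v, min_quality=20.0, min_alt_reads=10, min_alt_fraction=0.70):
    """Return True if a variant meets the three thresholds described in the text."""
    if v.quality < min_quality:
        return False
    if v.alt_reads < min_alt_reads:
        return False
    if v.depth == 0:
        return False
    return v.alt_reads / v.depth >= min_alt_fraction


def confirmed_positions(mapping_calls, alignment_positions):
    """Keep filtered mapping calls that the genome alignment also reports."""
    kept = {v.position for v in mapping_calls if passes_filters(v)}
    return sorted(kept & set(alignment_positions))


# Hypothetical calls, one per filtering outcome.
calls = [
    Variant(1000, 45.0, 40, 38),  # passes all filters
    Variant(2000, 15.0, 40, 38),  # quality below 20
    Variant(3000, 60.0, 12, 8),   # fewer than ten supporting reads
    Variant(4000, 60.0, 50, 30),  # only 60% of reads support the variant
    Variant(5000, 60.0, 20, 14),  # exactly 70%, passes
]
print(confirmed_positions(calls, [1000, 3000, 5000, 6000]))  # [1000, 5000]
```

Taking the intersection of the two call sets favours specificity: a SNP must be seen both by read mapping and in the assembly-based alignment before it is analysed further.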

3.3.4 Insertion sequence elements analysis

The distribution of insertion sequence (IS) elements in the SP13 isolates was investigated using a custom script. Briefly, the reads were mapped onto the IS481 and IS1002 sequences, and reads with partial matches to the IS sequences (hybrid reads) were captured. The hybrid reads were cut into two segments, one that matched the non-IS sequence and another that matched the IS. These segments were then used to identify the insertion site and the direction of the IS. B. pertussis strain Tohama I was also re-sequenced to confirm that the script could locate the IS positions on the genome from reads.
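The hybrid-read idea can be sketched as follows. This is an illustrative simplification, not the custom script used in this study: it uses exact string matching, ignores reverse-complement reads (so it does not determine IS direction), and the function names and toy sequences are hypothetical.

```python
def split_hybrid_read(read, is_seq, min_overlap=8):
    """Split a read spanning an IS junction into (flank, is_part, side).

    side is "left" when the read runs from flanking DNA into the start of
    the IS, and "right" when it runs from the end of the IS into flanking
    DNA. Returns None if no junction of at least min_overlap bases is found.
    """
    # Try the longest overlaps first; the flank must keep at least one base.
    for k in range(min(len(read), len(is_seq)) - 1, min_overlap - 1, -1):
        if read.endswith(is_seq[:k]):
            return read[:-k], read[-k:], "left"
        if read.startswith(is_seq[-k:]):
            return read[k:], read[:k], "right"
    return None


def insertion_site(flank, side, reference):
    """Return the 0-based reference coordinate of the junction, or None."""
    idx = reference.find(flank)
    if idx == -1:
        return None
    # A left flank ends at the junction; a right flank starts at it.
    return idx + len(flank) if side == "left" else idx


# Toy sequences (hypothetical); the IS is inserted after reference base 12.
reference = "ATGCGTACCTGAAGTCCGATTGACCA"
toy_is = "CAGGTCACCGTTAGACGTCTTGGA"
read_left = reference[4:12] + toy_is[:10]    # flank followed by IS start
read_right = toy_is[-9:] + reference[12:20]  # IS end followed by flank

for read in (read_left, read_right):
    flank, is_part, side = split_hybrid_read(read, toy_is)
    print(side, insertion_site(flank, side, reference))
```

Both junction reads point to the same coordinate in this toy case; a real implementation would also have to handle sequencing errors, both strands and any target-site duplication.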

3.3.5 Phylogenetic analysis

Phylogenetic analysis was conducted using MEGA (version 5) [329]. The minimum evolution algorithm was applied based on the nucleotide Maximum Composite Likelihood analysis of all positions. Bootstrap analysis was based on 1000 replicates. B. pertussis strain Tohama I was used as the outgroup.

The mutation rate and ancestral node date for the SP13 epidemic lineage were estimated using Bayesian analysis in the BEAST (version 1.7.5) package [330]. Analyses were run using the variable sites within the SP13 lineage and their isolation dates as the number of days before the present, under a GTR model of evolution, with all combinations of constant, expansion, logistic and extended skyline population size models and strict, relaxed exponential and relaxed log-normal clock models. For each combination, three independent Markov chains were run for 10 million generations each, with parameter values sampled every 1000 generations. Chains were manually checked for reasonable effective sample size (ESS) values and for convergence between the three replicate chains using Tracer (version 1.5). Tracer was also used to identify a suitable burn-in period to remove from the beginning of each chain and to assess which model best fit the data using Bayes factors. The strict clock and extended Bayesian skyline population stepwise models were found to be the most appropriate, so this combination of models was used for further analyses. Chains were combined and downsampled to every 10,000 generations using LogCombiner with 10% of burn-in removed. A maximum clade credibility tree was computed with TreeAnnotator.
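The size of this model comparison follows directly from the numbers given above. The short sketch below only does that bookkeeping; the variable names are illustrative, and the retained sample counts are approximate because they depend on how the first and last states of each chain are counted.

```python
from itertools import product

population_models = ["constant", "expansion", "logistic", "extended skyline"]
clock_models = ["strict", "relaxed exponential", "relaxed log-normal"]
replicates = 3               # independent chains per model combination
chain_length = 10_000_000    # generations per chain
sample_every = 1_000         # sampling interval during the run
resample_every = 10_000      # interval after downsampling
burn_in_percent = 10

combinations = list(product(population_models, clock_models))
total_chains = len(combinations) * replicates
samples_per_chain = chain_length // sample_every

# Post-processing of the selected model only (three replicate chains).
burn_in_states = chain_length * burn_in_percent // 100
kept_per_chain = (chain_length - burn_in_states) // resample_every
combined_samples = kept_per_chain * replicates

print(len(combinations), total_chains, samples_per_chain)  # 12 36 10000
print(kept_per_chain, combined_samples)                    # 900 2700
```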

3.3.6 Reference genomes

The genome of B. pertussis strain Tohama I (accession number BX470248) was used as a reference so that the data would be comparable with previous genomic studies, all of which used Tohama I as a reference. Some studies have revealed genomic regions that are present in other B. pertussis strains but absent from Tohama I [248, 249]. Therefore, the fully sequenced genome of B. pertussis strain CS [250] was also used to investigate the genomic diversity within these regions of difference.


3.4 Results

3.4.1 Selection and sequencing of epidemic isolates

In this study, whole genome sequencing was used to investigate the microevolution and genomic diversity of 22 epidemic B. pertussis isolates. The isolates were collected from two geographically separated states of Australia: NSW, the most populous eastern state, and WA, the western state. All isolates belonged to SP13, the predominant SNP profile causing the epidemic. We selected 12 isolates from NSW and 10 from WA, isolated during 2008-2012, with at least two isolates per state per year. Ten of these were Prn negative, with different mechanisms of inactivation: one prn::IS481F, seven prn::IS481R and two prn::IS1002 disruptions, where F and R denote insertion of the IS into prn in the forward and reverse orientation, respectively, relative to the IS-encoded tnpA gene. Strains inactivated by IS481R were therefore predominant. Additionally, five Australian pre-epidemic SP13 isolates from 1997-2002 were selected for comparison. The use of pre-epidemic SP13 isolates facilitates identification of genetic changes present in all epidemic strains.

The average number of reads generated per genome was 2,274,658. Isolate L1214 had the minimum number of reads since it was sequenced only once, whereas the other isolates were sequenced twice and the data from the two runs were combined. The average coverage depth for all genomes was 48.35, with the lowest coverage of 33 for L1423. The percentage of reads matching the reference strain Tohama I chromosome, determined using Qualimap version 2 [331], ranged from 96.23% to 98.93%. The number of contigs ranged from 640 to 3184, with 999 on average (Table 3.4-1).


Table 3.4-1: Quality of assembly for each SP13 B. pertussis isolate based on velvetg

Lab No.  No. of reads  % mapped to B. pertussis Tohama I  No. of contigs  Contig coverage
L462     2,836,139     98.66%                             640             63.98
L475     2,426,161     98.60%                             803             48.44
L482     2,339,608     98.67%                             854             56.63
L490     2,561,808     98.60%                             895             51.22
L524     2,159,056     98.36%                             846             48.75
L1037    2,687,681     98.49%                             860             45.33
L1042    2,414,796     98.66%                             856             56.43
L1214    789,486       98.79%                             1203            22.1
L1216    1,675,412     96.23%                             1035            36.97
L1361    2,347,134     98.64%                             944             49.39
L1376    2,595,615     98.88%                             819             61.45
L1380    2,330,394     98.75%                             980             51.98
L1382    2,675,364     98.69%                             823             49.39
L1391    2,534,162     98.82%                             977             43.44
L1397    1,596,243     98.65%                             861             46.39
L1419    1,898,632     98.64%                             1022            43.47
L1421    2,135,262     98.66%                             782             50.7
L1423    4,253,462     98.11%                             1077            30
L1493    2,458,187     98.67%                             944             33.2
L1507    1,958,656     98.82%                             949             50.2
L1658    2,835,695     98.67%                             1083            48
L1661    1,770,538     98.73%                             825             48.65
L1663    2,255,754     98.79%                             995             50.59
L1756    2,915,695     98.93%                             941             60.96
L1770    2,094,247     98.87%                             917             51.52
L1779    1,995,540     98.91%                             879             54.28
L1780    2,876,837     98.89%                             3184            52

3.4.2 Polymorphisms in SP13 isolates

Two approaches, de novo assembly and read mapping to the reference Tohama I genome, were used for SNP calling. A total of 305 SNPs (mutations in one or more of the 27 SP13 isolates) were detected, 44 of which (14.14%) were located in intergenic regions (Appendix 1). More than half of the SNPs (188 or 61%) were common to all SP13 isolates, which suggests that these SNPs were either unique to Tohama I or appeared before the divergence of SP13. Of the 133 SNPs that were variably present in SP13 isolates, 32 were also found in pre-epidemic SP13 isolates and only five were shared by all epidemic SP13 isolates (Table 3.4-2).


Table 3.4-2: Common SNPs found in all 2008-2012 SP13 isolates when compared with B. pertussis Tohama I.

Position in genome  Ref.  SNP  Amino acid change  Locus          Gene   Position in gene  Product                                          Functional category
136138              G     A    Promoter           BP0137-BP0138                           Transposase for IS481/Pseudogenes                Phage related/Pseudogene
136140              T     C    Promoter           BP0137-BP0138                           Transposase for IS481/Pseudogenes                Phage related/Pseudogene
223961              G     A    V->I               BP0216         sphB1  361               Autotransporter subtilisin-like protease         Virulence-associated genes
838886              G     A                       BP0814         -      586               Probable LysR-family transcriptional regulator   Regulation
3783560             A     C    Y->S               BP3570         -      323               30S ribosomal protein S8                         Ribosome constituents


Of the 133 SNPs, 102 SNPs were located within genes, 56 of which were non-synonymous, while the remaining 15 SNPs were found in intergenic regions. The SNP density varied from 0.335 per kb to 0.014 per kb, with phage-related or transposon-related genes, energy metabolism genes, amino acid biosynthesis genes and virulence-associated genes carrying more SNPs than the chromosomal average (Figure 3.4-1). Fourteen of the 102 genic SNPs (13.72%) were in genes regulated by the bvg regulon, most of which were virulence-associated genes (six), and four of the 14 were non-synonymous SNPs. Additionally, 12 SNPs were located in putative promoter regions, which may affect the transcription of downstream genes.

Figure 3.4-1: The functional categories of the single nucleotide polymorphisms observed in

Australian Bordetella pertussis SP13 isolates.

The reads were also mapped against the regions that are absent in Tohama I but present in the Chinese CS strain, which is used in China for vaccine production. Two SNPs, located in BPTD_0398 encoding a hypothetical protein and BPTD_0394 encoding an IclR family transcriptional regulator, were found in all SP13 isolates; both were reported in a recent study by Xu et al. [302]. There was also one SNP in BPTD_0392, encoding a hypothetical protein, in L1770 and L1756.

[Figure 3.4-1 plot. y-axis: Density (SNP/bp), from 0 to 0.0004; x-axis: Functional categories.]


3.4.3 Phylogenetic relationships

The identified SNPs were used to generate a minimum evolution tree to illustrate

the genetic relationship of the 22 epidemic SP13 isolates and the five pre-epidemic

SP13 isolates (Figure 3.4-2). The genome sequence of B. pertussis strain Tohama I

was used as an outgroup to root the tree. All Australian 2008-2012 epidemic

isolates were clearly differentiated from pre-epidemic isolates by five SNPs (Table

3.4-2).

The epidemic isolates can be divided into five epidemic lineages (EL1 to EL5). Each lineage was supported by more than one SNP, with three, four, two, seven and three SNPs supporting EL1, EL2, EL3, EL4 and EL5, respectively (Table 3.4-3). Some isolates were geographically clustered: EL4 was an NSW cluster with five isolates from 2008-2012, while EL5 was a WA lineage with two isolates from 2008 and 2009. The other three lineages contained isolates from both states.
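SNPs that support a lineage are those carried by every isolate in that lineage and by no isolate outside it. A minimal sketch of that selection is given below, assuming SNP calls are stored as a mapping from genome position to the set of isolates carrying the SNP; the function name, isolate names and positions are hypothetical and do not come from this study.

```python
def group_specific_snps(snp_carriers, group, all_isolates):
    """Return positions of SNPs present in every member of group and nowhere else."""
    group = set(group)
    outside = set(all_isolates) - group
    return sorted(
        position
        for position, carriers in snp_carriers.items()
        if group <= carriers and not (carriers & outside)
    )


# Hypothetical toy data: five isolates and four SNP positions.
isolates = ["iso_A", "iso_B", "iso_C", "iso_D", "iso_E"]
snp_carriers = {
    100: {"iso_A", "iso_B"},           # specific to the group {A, B}
    200: {"iso_A", "iso_B", "iso_C"},  # shared beyond {A, B}
    300: {"iso_A"},                    # carried by one isolate only
    400: {"iso_C", "iso_D", "iso_E"},  # specific to the group {C, D, E}
}

print(group_specific_snps(snp_carriers, ["iso_A", "iso_B"], isolates))           # [100]
print(group_specific_snps(snp_carriers, ["iso_C", "iso_D", "iso_E"], isolates))  # [400]
```

The same selection, applied with the epidemic isolates as the group and the pre-epidemic isolates outside it, is the kind of comparison that yields SNPs distinguishing the two sets.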

Figure 3.4-2: Minimum evolution tree of 27 Bordetella pertussis SP13 isolates based on 305 single nucleotide polymorphisms (SNPs). The numbers on the internal and terminal branches correspond to the number of SNPs supporting each branch. Epidemic isolates grouped into five epidemic lineages (EL).

[Figure 3.4-2 tree. Tip labels, in the order shown:
L1493 (NSW 2011 prn::IS481R)
L1507 (NSW 2011 prn::IS481R)
L1661 (NSW 2012 prn::IS481R)
L1756 (WA 2011 prn::IS481R)
L1779 (WA 2012 prn::IS481R)
L1780 (WA 2012 prn::IS481R)
L1770 (WA 2011 Prn+)
L1421 (NSW 2010 prn::IS1002)
L1658 (NSW 2012 prn::IS1002)
L1397 (WA 2010 Prn+)
L1376 (WA 2010 Prn+)
L1380 (WA 2008 Prn+)
L1382 (WA 2009 Prn+)
L1216 (NSW 2009 Prn+)
L1037 (NSW 2008 Prn+)
L1419 (NSW 2009 Prn+)
L1663 (NSW 2012 prn::IS481F)
L1423 (NSW 2010 Prn+)
L1214 (NSW 2009 Prn+)
L1042 (NSW 2008 Prn+)
L1391 (WA 2008 Prn+)
L1361 (WA 2009 prn::IS481R)
L475 (2000 Prn+)
L462 (1999 Prn+)
L482 (2001 Prn+)
L524 (1997 Prn+)
L490 (2002 Prn+)
Bordetella pertussis strain Tohama I
Lineage labels: EL1, EL2, EL3, EL4, EL5.]


The ten Prn negative isolates were distributed across four ELs (EL1, EL2, EL4 and EL5). EL1 contained only Prn negative isolates, all with an IS481R insertion as the mode of inactivation. The isolates were from both NSW and WA and from 2011 to 2012, suggesting that a single Prn negative strain spread across Australia over the two years (Figure 3.4-2). EL2 contained the two isolates, L1421 and L1658, that lost Prn expression due to an IS1002 insertion. Both were isolated in NSW but in different years, indicating that a single Prn negative strain persisted between those years. EL4 contained the only Prn negative isolate due to IS481F included in this study, which is likely to have arisen from an ancestral Prn positive strain from NSW, as this cluster consisted of only NSW isolates.

EL5 contained one (L1361) of the seven Prn negative isolates with an IS481R insertion. This isolate has an independent origin, as it was separated from the other six isolates, which clustered in EL1. L1361 grouped with a Prn positive isolate from the same state from 2008, suggesting that this Prn negative isolate arose locally.

Using the strict clock and extended Bayesian skyline population stepwise models, the date of the most recent common ancestor (MRCA) of the epidemic SP13 isolates was estimated to be around late 2002 (95% confidence interval, late 1999 to mid 2007). Using the core genome size of 3,485,846 bp [332], the mutation rate was determined to be 3.81 × 10^-7 substitutions per site per year (95% CI 2.28 × 10^-7 to 5.46 × 10^-7).
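As a rough worked conversion, the sketch below scales the reported rate by the core genome size. It assumes the rate is expressed per site per year (the usual reading when a genome size is used to scale the estimate); the variable names are illustrative only.

```python
core_genome_bp = 3_485_846
rate_per_site = 3.81e-7          # substitutions per site per year (assumed unit)
ci_per_site = (2.28e-7, 5.46e-7)

# Expected number of substitutions across the core genome in one year.
rate_per_genome = rate_per_site * core_genome_bp
ci_per_genome = tuple(bound * core_genome_bp for bound in ci_per_site)

print(round(rate_per_genome, 2))                   # 1.33
print([round(bound, 2) for bound in ci_per_genome])  # [0.79, 1.9]
```

Under that assumption the estimate corresponds to roughly one to two new substitutions per core genome per year.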


Table 3.4-3: Single nucleotide polymorphisms unique to epidemic lineages.

Lineage  Position on Tohama I  Locus       Gene   Product                                                        Category
EL1      1162706               BP1110      sphB3  Serine protease                                                Virulence-associated genes
EL1      2244138               BP2120      gor    Glutathione reductase                                          Central/intermediary metabolism
EL1      3758055               BP3546      -      Putative branched-chain amino acid transport system protein    Transport/binding proteins
EL2      259371                BP0250      -      Tripartite tricarboxylate transporter family receptor          Cell surface
EL2      1288344               BP1226             Putative exported protein (pseudogene)                         Pseudogenes (phase-variable)
EL2      1358703               BP1285      livJ   Leu/Ile/Val-binding protein                                    Receptor family ligand binding
EL2      3060100               Intergenic
EL3      1692984               BP1610             Putative autotransporter (pseudogene)                          Pseudogenes
EL3      2657330               BP2509      dapB   Dihydrodipicolinate reductase                                  Amino acid biosynthesis
EL4      2570056               BP2427             Threonine synthase (pseudogene)                                Pseudogenes
EL4      2570058               BP2427             Threonine synthase (pseudogene)                                Pseudogenes
EL4      2681216               BP252A             Putative exported protein (pseudogene)                         Pseudogenes
EL4      3347400               BP3139P^           Putative oxidoreductase                                        Miscellaneous
EL4      3485064               BP3267      apaG   Conserved hypothetical protein                                 Conserved hypothetical
EL4      3555632               BP3333      pykA   Pyruvate kinase                                                Energy metabolism
EL4      3927872               BP3718      -      Branched-chain amino acid transport ATP-binding protein        Transport/binding proteins
EL5      2191410               BP2072      -      Putative lipoprotein                                           Cell surface
EL5      2514582               BP2375P            Conserved hypothetical protein                                 Pseudogenes
EL5      3388793               BP3177      -      Putative methylaspartate ammonia-lyase                         Energy metabolism

^promoter


3.4.4 Potential adaptive SNPs of epidemic SP13 B. pertussis isolates

SNP data revealed five SNPs that separated the 2008-2012 epidemic isolates from pre-epidemic isolates (Table 3.4-2). Two SNPs were located in intergenic regions while the remaining three were located in genes. The two intergenic SNPs were located between BP0137, encoding a transposase for IS481, and BP0138, a pseudogene, and were thus likely to be neutral. Two of the three genic SNPs were non-synonymous, one in a virulence-associated gene (BP0216, sphB1) and the other in a gene encoding a transport/binding protein (BP3570). These two changes may be adaptive. One silent SNP was found in a regulatory gene (BP0814). The three genic SNPs have also been reported by Bart et al. [41] in some ptxP3 strains isolated from different countries, suggesting that the epidemic lineage may have spread globally. All three genic SNPs were also found in the Finnish isolates collected in 1999 and sequenced recently [302]. However, only the SNP in BP0814 was also found in the genomes from the UK 2012 outbreak [251].

3.4.5 Indels

A total of 45 indels were found in SP13 isolates, of which 20 were common to all 27 isolates (Tables 3.4-4 and 3.4-5). Thirty-two indels were located in genes, 15 of which were in pseudogenes and 7 in bvg regulated genes. There were 27 frameshift indels, 16 of which resulted from a one base pair indel. Six genes were affected by two base pair indels and one gene each was affected by indels of 4, 5, 7, 8, 31 and 39 base pairs (Table 3.4-4). Five annotated pseudogenes, BP0880, BP2000, BP2738, BP2899 and BP3762, appeared to produce a complete coding sequence, suggesting reversion to functional genes. These changes were common to all SP13 isolates except for that in BP2738. In contrast, BP2232, BP2595, BP2928, BP2946 and BP3465 had a stop codon resulting in proteins that were >20% shorter, and these genes were therefore considered pseudogenes [326]. Three of these genes were affected in all SP13 isolates (Table 3.4-4).

Six indels, located in BP0967 (cysU), BP1624 (kpsT), BP2141, BP1186, BP3258 and BP3580, were each a multiple of 3 bases in length and thus were not likely to affect the reading frame. Thirteen indels were located in intergenic regions, some of which were in promoters, including the promoters of the fim2 and fim3 genes, where such changes are known to affect expression of these genes [76, 80]. Thirteen indels were located in homopolymeric tracts in genic regions and are likely to be subject to phase variation, including one in BP2738 (bapC), which is known to be phase variable. Two indels, one each in BP0880 and BP2946, were previously reported to be present in ptxP3 strains [283]. None of the indels reported by Sealey et al. [251] to be specific to the UK ptxP3 strains was observed in the SP13 isolates used in this study.

There were no indels specific to the 2008-2012 epidemic isolates. There were two, two, one and three indels specific to lineages EL1 to EL4, respectively. EL5 had no lineage-specific indel. The two indels specific to EL1 were located in BP1186 and BP2018, with the former encoding a putative cytoplasmic protein, aldolase, and the latter encoding a transposase. The indel in BP1186 was a deletion of 3 bases and did not disrupt the reading frame. Of the two indels specific to EL2, one was located in the intergenic region between BP1052 and BP1053 while the other was located in a bvg regulated pseudogene, BP2738 (bapC), encoding an outer membrane protein. bapC is phase variable [333] and it has recently been shown that BapC is an adhesin factor [334]. EL2 consisted of two Prn negative isolates, and the adhesin role of BapC may become more important in the absence of Prn. The other indel, located in the poly-C region of bapC, was found in only one isolate. The only EL3-specific indel was located in the promoter region of BP1418, encoding methionine aminopeptidase. The three EL4-specific indels were all located in intergenic regions, including the promoters of BP0191 and BP1568 (fim3) and the intergenic region upstream of BP2086.

Considering that some of the genes disrupted by indels were either regulated by the bvg regulon or encoded cell surface proteins, it is likely that these indels affect the virulence of the strains.
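The distinction between frameshift and in-frame indels rests on whether the indel length is a multiple of three. A minimal sketch of that rule follows, using a few indel lengths listed in Tables 3.4-4 and 3.4-5; the function name is illustrative, and the check ignores other effects such as the loss of a start or stop codon.

```python
def is_frameshift(indel_length):
    """An indel within a coding sequence shifts the frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


# Locus -> indel length in bases, taken from Tables 3.4-4 and 3.4-5.
indel_lengths = {
    "BP2000": 1,
    "BP3090": 4,
    "BP2928": 7,
    "BP0880": 8,
    "BP2946": 31,
    "BP1186": 3,
    "BP3580": 3,
    "BP0967": 6,
}

for locus, length in indel_lengths.items():
    effect = "frameshift" if is_frameshift(length) else "in-frame"
    print(locus, length, effect)
```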


Table 3.4-4: Frameshift indels in SP13 isolates.

^ D = Deletion, I = Insertion

Columns, left to right: Position in genome; I/D (No. of bases)^; Homopolymeric tract (+ where marked); Locus; Position in gene; presence (+) or absence (-) of the indel in each isolate, in the order L490, L524, L482, L475, L462, L1391, L1361, L1042, L1214, L1423, L1663, L1419, L1216, L1037, L1382, L1380, L1376, L1397, L1658, L1421, L1770, L1493, L1507, L1661, L1780, L1756, L1779; Functional category; Protein; Effect of mutation; bvg activated (+ where marked).

167272 I(2) BP0167 1324 - + + + - + + - - - - + - - - + - + + + - - + + + - + pseudogenes C-terminal region of a probable ABC transporter Towards end

167280 I(1) BP0167 1333 - + + + - + + - - - - + - - - + - + + + - - + + + - + pseudogenes C-terminal region of a probable ABC transporter Towards end

365461 D(1) BP0364 969 + + + + + + + + + + + + + + + + + + + + + + + + + + + cell surface putative exported protein Towards end

584199 D(2) BP0576 932 - - - + - - - - - - - - - - - - - - - - - - - - - - - conserved hypothetical conserved hypothetical protein Towards end

919013 D(8) BP0880 2237 + + + + + + + + + + + + + + + + + + + + + + + + + + + pseudogenes putative exported protein Revert?

1022899 I(1) BP0985 438 - - - - - - - - - - - - - - - - - - - - - - - - - + - protection responses acriflavine resistance protein B No difference?

1026536 D(2) BP0986 899 - - - - - - - - - - - - - - - - - - - - + - - - - - - cell surface probable outer membrane lipoprotein No difference?

1080117 D(1) BP1037 1069 + + + + + + + + + + + + + + + + + + + + + + + + + + + pseudogenes putative apolipoprotein N-acyltransferase No difference? +

1184203 I(2) BP1123 1765 + + + + + + + + + + + + + + + + + + + + + + + + + + + cell surface putative membrane protein Towards end

1426476 D(1) + BP1351 8 + + + + + + + + + + + + + + + + + + + + + + + + + + + cell surface putative exported protein Revert?

1464175 I(1) BP1386 262 + + + + + + + + + + + + + + + + + + + + + + + + + + + pseudogenes methyl-accepting chemotaxis protein No difference?

1464181 I(1) BP1386 268 + + + + + + + + + + + + + + + + + + + + + + + + + + + pseudogenes methyl-accepting chemotaxis protein No difference?

2111898 I(1) BP2000 90 + + + + + + + + + + + + + + + + + + + + + + + + + + + pseudogenes putative membrane protein Revert?

2132148 I(1) BP2018 2 - - - - - - - - - - - - - - - - - - - - + + + + + + + phage-related or transposon-related transposase No difference?

2361165 D(1) BP2232 469 + + + + + + + + + + + + + + + + + + + + + + + + + + + unknown hypothetical protein Pseudogene +

2747341 D(1) + BP2595 237 - - - - - + - - - - - - - - - - - - - - - - - - - - - pseudogenes putative peptidase Pseudogene

2747351 D(1) BP2595 238 - - - - - - + - - - - - - - - - - - - - - - - - - - - pseudogenes putative peptidase duplicated

2906305 D(2) BP2738 1972 - - - - - - - - - - - - - - - - - + + + - - - - - - - pseudogenes putative autotransporter No difference? +

2907628 D(1) + BP2738 2337 - - - - - - - - - + - - - - - - - - - - - - - - - - - pseudogenes putative autotransporter Revert? +

3074606 I(2) BP2899 141 + + + + + + + + + + + + + + + + + + + + + + + + + + + pseudogenes NAD(P) transhydrogenase subunit beta Revert?

3117315 D(7) BP2928 828 + + + + + + + + + + + + + + + + + + + + + + + + + + + transport/binding proteins putative membrane transport protein Pseudogene +

3134430 D(31) BP2946 543 + + + + + + + + + + + + + + + + + + + + + + + + + + + regulation putative arsR-family transcriptional regulator Pseudogene

3291152 I(4) BP3090 545 + + + + + + + + + + + + + + + + + + + + + + + + + + + conserved hypothetical conserved hypothetical protein Towards end

3436723 I(1) + BP3224 779 + + + + + + + + + + + + + + + + - + + + + - + + + + + energy metabolism putative cytochrome oxidase No difference?

3678676 D(1) BP3465 275 + + - - - - - - - - - - - - - - - - - - - - - - - - - conserved hypothetical conserved hypothetical protein Pseudogene

3974877 D(1) + BP3762 521 + + + + + + + + + + + + + + + + + + + + + + + + + + + pseudogenes probable ATP-binding ABC transporter protein Revert?

Pre-epidemic EL5 EL4 EL3 EL2 EL1


Table 3.4-5: List of non-frameshift and intergenic indels in SP13 isolates.

^ D = Deletion, I = Insertion

Columns, left to right: Position in genome; I/D (No. of bases)^; Homopolymeric tract (+ where marked); Locus; Position in gene; presence (+) or absence (-) of the indel in each isolate, in the order L490, L524, L482, L475, L462, L1391, L1361, L1042, L1214, L1423, L1663, L1419, L1216, L1037, L1382, L1380, L1376, L1397, L1658, L1421, L1770, L1493, L1507, L1661, L1780, L1756, L1779; Functional category; Protein.

Non-frameshift

1005224 D(6) BP0967 111 + + + + + + + + + + + + + + + + + + + + + + + + + + + pseudogenes sulfate transport system permease protein

1250494 D(3) BP1186 643 - - - - - - - - - - - - - - - - - - - - - + + + + + + miscellaneous putative aldolase

1707618 D(6) BP1624 671 + + + + + + + + + + + + + + + + + + + + + + + + + + + transport/binding proteins polysialic acid transport ATP-binding protein

2265235 I(6) BP2141 83 + + + + + + + + + + + + + + + + + + + + + + + + + + + cell surface putative exported protein

3476167 I(3) BP3258 1417 + + + + + + + + + + + + - + + + + + + + + + + + + + + pseudogenes putative exported protein

3793608 I(3) BP3580 269 + + + + + + + + + + + + + + + + + + + + + + + + + + + macromolecule degradation intracellular PHB depolymerase

Intergenic

193055 D(5) BP0191P - - - - - - - + + + + + - - - - - - - - - - - - - - - cell surface putative exported protein

617175 I(2) BP0611-0612P + + + + + - - + + - + + + + + + - + + - + - + - - + + phage-related or transposon-related transposase

1087350 I(1) + intergenic + + + + + + + + + + + + + + + + + + + + + + + + + + +

1096454 I(1) + intergenic - - - - - - - - - - - - - - - - - + + + - - - - - - -

1176540 D(6) + fim2 P + + + + + + + + + - - + + + + + + + + + + + + + + + + pathogenicity serotype 2 fimbrial subunit precursor

1493544 D(2) BP1418P - - - - - - - - - - - - + + + + + - - - - - - - - - - macromolecule degradation methionine aminopeptidase

1647526 D(2) + fim3 P - - - - - - - + - + + + - - - - - - - - - - - - - - - pathogenicity serotype 3 fimbrial subunit precursor

1858025 I(1) + intergenic - - - - - - - - - - - - - + - - - - - - - - - - - - -

2208358 D(1) + intergenic - - - - - - - + + + + + - - - - - - - - - - - - - - -

3041130 I(1or2) + BP2862P + + + + + + + + + + - + + + + + + + + + + - - - - - - unknown hypothetical protein

3140663 D(1) + BP2952P - - - - + - - - - - - - - - - - - - + - - - - - - - - central/intermediary metabolism serine hydroxymethyltransferase

3846122 I(1) BP3642 + + + + + + + + + + + + + + + + + + + + + + + + + + + unknown hypothetical protein

Pre-epidemic EL5 EL4 EL3 EL2 EL1


3.4.6 Insertion Sequence elements

IS elements play an important role in the genome evolution of B. pertussis, as there are more than 200 IS copies in the genome. Parkhill et al. reported 238 copies of IS481, 6 of IS1002 and 17 of IS1663 in the Tohama I genome [335]. A total of 25 IS insertions that were absent from the Tohama I genome were detected in one or more SP13 isolates, six of which were common to all SP13 isolates (Table 3.4-6). There was no IS insertion site unique to the SP13 epidemic isolates. Thirteen IS insertions were unique to a single isolate. All new IS insertion sites except those in the prn gene were due to IS481R. Among the six insertions found at the same site in more than one isolate but not in all, multiple parallel insertions have occurred in different lineages. Most isolates with the same IS insertion were not grouped together, suggesting that either the site is a hotspot for IS insertion or disruption of the gene is advantageous. However, a common insertion in the gene BP2327 was found in all four isolates of EL4 (Table 3.4-6).


Table 3.4-6: New IS elements found in SP13 isolates. There was no IS unique to the 2008-2012 epidemic isolates and only one IS, located in BP2327, was common to EL4.

^ = state and year of isolation, Prn expression status

Columns (locus tag of each insertion site): BP0276, BP0326, BP0344-IG-BP0345, BP0551-IG-BP0552, BP0764-IG-BP0765, BP0935, BP0976-IG, BP0983, BP0985, BP1054, BP1209, BP1395, BP1399, BP1442-IG-BP1443, BP1560-IG-BP1561, BP1912, BP1987, IG-BP2327, IG-BP2468, BP2497, BP2609, BP2764, BP2839-IG-BP2840, BP2907, BP3440.

Rows (isolates, in the order shown): L1493 (NSW11-)^, L1507 (NSW11-), L1661 (NSW12-), L1756 (WA11-), L1779 (WA12-), L1780 (WA12-), L1770 (WA11+), L1421 (NSW10-), L1658 (NSW12-), L1397 (WA10+), L1376 (WA10+), L1380 (WA08+), L1382 (WA09+), L1216 (NSW09+), L1037 (NSW08+), L1419 (NSW09+), L1663 (NSW12-), L1423 (NSW10+), L1214 (NSW09+), L1042 (NSW08+), L1391 (WA08+), L1361 (WA09-), L475 (2000+), L462 (99+), L482 (01+), L524 (97+), L490 (02+).

Row groups (epidemic lineage): EL1, EL2, EL3, EL4, EL5, Pre-epidemic.

[Presence/absence matrix not reproduced.]


3.4.7 Gene loss

Several studies have investigated gene loss in B. pertussis [41, 42, 283, 285, 287, 336]. Currently circulating strains appear to have lost a large 24 kb region (BP1948-BP1966). The contigs of the SP13 isolates were compared to the reference B. pertussis strain Tohama I using progressiveMauve [325] to identify deletions in epidemic isolates. Three loci known as regions of difference (RD), BP0910A-BP0937 (RD3), BP1134-BP1141 (RD5) and BP1947-BP1968 (RD10), were deleted in all SP13 isolates. These deletions have been reported by others, with RD10 deleted in all ptxP3 strains [285, 336, 337].


3.5 Discussion

The 2008-2012 pertussis epidemic was the most severe pertussis epidemic Australia has experienced since the introduction of the ACV in 1997. The epidemic started in 2008 and lasted for five years, peaking at nearly 40,000 cases in 2011. In this study, whole genome sequencing was used to investigate the microevolution and genomic diversity of 22 epidemic B. pertussis isolates, including ten Prn negative isolates with three different modes of inactivation (IS481F, IS481R and IS1002). The genetic changes detected in the epidemic isolates included SNPs, small indels, IS insertions and gene losses, indicating continuing evolution of B. pertussis. However, only five SNPs differentiated the epidemic isolates from pre-epidemic isolates, and no other changes, such as indels, were common to all epidemic isolates, suggesting that SNPs played an important role in the adaptation and microevolution of the 2008-2012 Australian epidemic B. pertussis. Three SNPs common to the epidemic isolates are likely to have a functional effect on adaptation, as discussed below, and may be the subject of future functional studies.

The non-synonymous SNP in BP0216 (sphB1) found in all epidemic isolates is of particular interest. sphB1 is a virulence-associated gene positively regulated by BvgAS [338] and encodes the autotransporter SphB1, a 109 kDa exported protein belonging to the superfamily of subtilisin-like serine proteases [339]. Unlike other proteases, which mainly proteolyse host proteins, its function is the proteolytic maturation of Fha, the major adhesin in B. pertussis [340, 341]. Secretion of the 230 kDa Fha requires maturation of its 367 kDa precursor through removal of the ~130 kDa C-terminal domain, which is carried out by SphB1 [341]. It has been shown that the release of Fha depends on its maturation and that SphB1 is required for its normal maturation [342]. In SphB1-deficient mutants, the ability to colonise the mouse respiratory tract was significantly affected [341]. SphB1 is a 931 amino acid linear protein consisting of five major regions (I-V) and two main domains. The subtilisin-like activity of the protein is located in region III, with a putative Asp184-His221-Ser412 catalytic triad [340]. The SNP found in sphB1 changed amino acid 121 from valine to isoleucine, which is located before the catalytic site of the protein. Although the change was conservative, with both being hydrophobic amino acids, it may affect the function of the protein as it is located just outside region III. Five SNPs were detected in sphB1 in a global set of B. pertussis isolates in the study by Bart et al. [41], including the SNP present in the 2008-2012 epidemic isolates in this study.

The other non-synonymous SNP found in all epidemic isolates was located in the

locus BP3570 encoding a branched-chain amino acid ATP Binding Cassette (ABC)

transporter. ABC transporters are relatively specific for their own particular

substrate and can transfer small or large molecules with different hydrophobicity

[343]. It is a cytoplasmic membrane transport/binding protein and consists of two

major domains which are similar to two subfamilies of ABC transporters in other

bacteria such as LivM (N-terminal) and LivG (C-terminal) in Escherichia coli [344,

345]. The SNP changed the hydrophilic tyrosine into another polar amino

acid, serine, and may affect the function of the protein as it is located in the first

domain of the protein, which is involved in branched-chain amino acid transport.

A non-synonymous SNP in the gor gene (BP2120) unique to lineage EL1 may also

be adaptive. The non-synonymous SNP resulted in an amino acid change from

tyrosine to histidine, both of which are polar amino acids. gor encodes glutathione

reductase, which reversibly catalyses the reduction of glutathione disulfide to two

glutathione molecules. It has been shown that a high concentration of reduced glutathione in the

respiratory tract lining fluid acts as an important defence against oxidative

damage [346]. Stenson et al. studied the influence of cysteine-containing

compounds on transcription, assembly and secretion of Ptx in B. pertussis and

found that Ptx secretion is promoted efficiently in the in vivo study by reduced

glutathione [347]. There is a possible indirect interplay between glutathione

reductase and Ptx secretion.

The genomic tree of the 22 epidemic isolates revealed 5 epidemic lineages,

showing both spatial and temporal clustering of isolates. Two lineages contained

isolates from one state only; all 5 isolates in EL4 and both isolates in EL5 were

from NSW and WA respectively. However, the other 3 lineages contained isolates

from both states indicating interstate spread of the lineage during the epidemic and

highlighting the rapid spread of this respiratory pathogen. As the sample numbers

were small, the two lineages seen in one state only may also have spread to the other

states. BEAST analysis dated the MRCA of the epidemic isolates back to late 2002,

giving a few years for B. pertussis to diversify and spread geographically before

causing an epidemic. Globally there is no evidence of geographic restriction of B.

pertussis as ptxP3 strains have spread across the world [41].

Temporal clustering was also evident. One pair of isolates (L1493 and L1507)

collected in 2011 from NSW was identical. There is a need to sequence more

isolates to obtain a better picture of the spatial and temporal clustering and the

relative frequency of the different epidemic lineages. Alternatively the lineage

specific SNPs detected can be used as markers for SNP typing to determine the

spread and expansion of the epidemic lineages.

In the 2008-2012 epidemic, Prn negative B. pertussis strains emerged at the start of

the epidemic and increased to more than 80% by 2012. Multiple mechanisms of Prn

non-expression were found with the insertion of IS481R as the main mechanism of

gene disruption [311]. Therefore, this study analysed the genomes of ten Prn

negative isolates: seven with prn inactivated by IS481R, one by IS481F and two by

IS1002. Phylogenetic analysis demonstrated that Prn negative isolates expanded

independently at different time points. The two isolates with IS1002 mediated prn

inactivation were closely related with four SNP differences and isolated two years

apart from NSW, demonstrating a clonal spread. Six of the seven isolates with

IS481R mediated prn inactivation were grouped together in EL1 as a single origin

and spread across NSW and WA in 2011 and 2012. However, one isolate with IS481R

prn inactivation isolated in 2008 from WA showed a separate origin, and thus the

disruption of the prn gene, even by the same mechanism, can occur independently.

The results also confirm that Prn negative strains can arise multiple times from

different lineages. Further investigation with a larger number of Prn negative

strains, in particular those inactivated by IS481 would need to be carried out to

determine the extent of independent inactivations. Considering reports of Prn

negative isolates from many countries, there must be numerous IS mediated prn

inactivations, displaying the remarkable role IS has played in the adaptive evolution

of B. pertussis under selection pressure of the vaccine.

The prolonged epidemic allowed mutations to accumulate, and

sequencing of isolates from different years revealed that the number of SNPs

accumulated is quite small. Bart et al. [41] previously found the mutation rate of B.

pertussis to be 2.24 × 10⁻⁷ per site per year, which equates to about 1 SNP per genome

per year. Our calculation showed that the mutation rate was 3.38 × 10⁻⁷ per site per year, which was

1.5 times faster than that derived from the global B. pertussis data. As seen from

Figure 3.4-2, most of the closest related isolates from different years differed by

one or two SNPs which was consistent with the mutation rate estimate. However, it

is interesting to note that the study by Xu et al. [302] reports that the mutation rate

may vary among different B. pertussis lineages.
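
The conversion between a per-site rate and SNPs per genome per year can be checked with a few lines of arithmetic. The sketch below is illustrative only: the genome length of about 4.09 Mb (roughly the size of the Tohama I chromosome) and the function name are assumptions and are not values reported in this chapter.

```python
# Convert per-site substitution rates into expected SNPs per genome per year.
# GENOME_LENGTH_BP is an assumed, approximate value used for illustration only.
GENOME_LENGTH_BP = 4_090_000


def snps_per_genome_per_year(rate_per_site_per_year, genome_length_bp=GENOME_LENGTH_BP):
    """Expected number of SNPs accumulated per genome per year."""
    return rate_per_site_per_year * genome_length_bp


global_rate = 2.24e-7    # per site per year, Bart et al. [41]
epidemic_rate = 3.38e-7  # per site per year, estimate from this study

global_snps = snps_per_genome_per_year(global_rate)
epidemic_snps = snps_per_genome_per_year(epidemic_rate)
fold_difference = epidemic_rate / global_rate

print(f"global data:       {global_snps:.2f} SNPs per genome per year")
print(f"epidemic isolates: {epidemic_snps:.2f} SNPs per genome per year")
print(f"fold difference:   {fold_difference:.2f}")
```

Under the assumed genome length, the two rates correspond to roughly 0.9 and 1.4 SNPs per genome per year, and their ratio is about 1.5, in line with the figures quoted above.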

3.6 Conclusion

The comparative genomic analysis of the Australian 2008-2012 epidemic SP13

isolates provided new insights into the evolution of the epidemic B. pertussis in

Australia. Our findings indicate that small changes in genome structure including

SNPs may have contributed to the expansion of the epidemic clone in Australia,

with five SNPs unique to the epidemic lineage. The epidemic SP13 isolates can be

divided into five lineages (EL1-EL5). EL4 and EL5 were only found in NSW and

WA, respectively, while the remaining three lineages were found in both states. All

except one Prn negative isolate with IS481R disruption belong to EL1 as a single

origin and spread to two states. The data also showed that the same mechanism of

inactivation of the prn gene can occur independently.

Chapter 4. Comparative genomics of major Australian Bordetella

pertussis clones

4.1 Introduction

Different methods have been used to type Bordetella pertussis strains to determine

their evolutionary relationships. Multilocus Variable Number of Tandem Repeat

Analysis (MLVA) and Single Nucleotide Polymorphism (SNP) typing have been

used widely to differentiate B. pertussis isolates in Australia and worldwide [282,

301, 305, 348]. Kurniawan et al. separated over 300 B. pertussis isolates from

different countries including 207 Australian isolates into 66 MLVA types. Two of

the MLVA types, MT27 and MT29, were predominant globally [305]. Octavia et

al. also typed 65 SNPs and separated the isolates into 42 SNP profiles [306].

Phylogenetic analysis grouped the SNP profiles into six clusters, cluster I to cluster

VI [306]. The reference genome Tohama I belonged to cluster VI while the

majority of recent Australian isolates belonged to clusters I to IV. It was also

identified that recent Australian isolates were descendants from a single pre-

vaccine lineage. Cluster I appeared to be a major clone with a worldwide

distribution. Typing of genes encoding the acellular pertussis vaccine (ACV)

antigens, pertussis toxin subunit A (ptxA), pertactin (prn), filamentous

haemagglutinin (fhaB) and fimbrial genes (fim2 and fim3) revealed the emergence

and increased incidence of non-ACV alleles present in clusters I and IV.

van Gent et al. also used a set of 85 SNPs, different from those used by Octavia et al.,

to type Dutch B. pertussis isolates collected from 1949 to 2010 and showed that

they can be differentiated into seven allele types, I to VII. The most recent B.

pertussis isolates were grouped into two allele types, VI and VII with the antigenic

profile ptxP3-ptxA1-prn2/3-fim3-1 and ptxP3-ptxA1-prn2/3-fim3-2 respectively.

These allele types also expanded after the introduction of ACV in the Netherlands

[282]. Thus, by association, the SNP cluster I defined in Octavia et al. corresponds

to allele types VI and VII which are also better known as the ptxP3 strains.

Worldwide epidemiological studies showed that ptxP3 strains expanded mostly

after the introduction of the ACV and became predominant in countries with ACV-

induced immunity against pertussis [41, 282]. Strains with ptxP3 alleles were found

to have a 1.6 fold increase in Ptx production and their emergence and spread were

associated with higher hospitalisation rates in the Netherlands [281].

With the advancement of whole genome sequencing (WGS), genomic studies have

provided a clearer picture of changes within the genomes as well as a better

understanding of bacterial adaptation and evolution. The latest WGS of 343 global

B. pertussis isolates, isolated between 1920 and 2010 showed rapid strain flow

between countries. Furthermore, the study also showed that the genes encoding

virulence associated or surface-exposed proteins were involved in B. pertussis

adaptation [41].

Currently, most genomic studies using Illumina data have focused on identifying

SNPs which are easier to detect than indels or other genetic changes. This is due to

the short reads that are generated from Illumina sequencing for genome assembly.

Short reads make regions containing repeat sequences difficult to assemble, which

affects genome assembly as there are over 200 ISs in the B. pertussis

genome. Longer reads that sequence through these repeat regions greatly increase

the contig sizes and can even be used to obtain complete genome assembly. The

Pacific Biosciences (PacBio) sequencing technology is based on single molecule

real time (SMRT) sequencing which provides longer read lengths (average read

length is 6 kb for P6-C4 chemistry) [349, 350]. Longer reads facilitate the

assembly of larger contigs and the detection of indels and genome rearrangements.

However, PacBio sequencing also has a much higher cost per base in comparison

to Illumina sequencing.

4.2 Aims and Motivations

Australian B. pertussis from the 1970s to the present has been divided into four

major clusters (I-IV) [259]. Cluster I strains contain the same antigen gene alleles

as strains that are circulating globally and are also known as ptxP3 strains [259].

The genomic diversity of different clusters including SNPs, indels, IS insertions

and genome rearrangement has not yet been studied. The overall aim of this chapter

was to investigate the genomic evolution of different clusters using two different

methods of whole genome sequencing, Illumina and PacBio sequencing and

specifically to:

a. Investigate the genetic diversity and evolution of major Australian B.

pertussis clones

b. Determine the genetic basis of cluster I predominance in Australia

c. Use two major next generation sequencing methods, Illumina and PacBio

sequencing, to investigate major genome structural changes in different

clusters.

d. Investigate the genomic variations in the regions that are not present in

Tohama I by using a different B. pertussis reference strain, CS.

4.3 Materials and Methods

4.3.1 Bacterial strains

Six isolates from cluster II to V and 3 isolates from cluster I from two major SNP

profiles (SP) including SP13 and SP16, listed in Table 4.3-1, were selected for

sequencing using both Illumina and PacBio sequencing technologies. We selected

clinical isolates that were all isolated after 2000 when WCV was replaced

completely by ACV in Australia.

Table 4.3-1: B. pertussis isolates used for whole genome sequencing.

Lab. No.  Year  Country-State  Cluster  SP^  ptxA  ptxP  prn  fim3
L506      2004  Australia      IV       30   1     1     1    1
L580      2007  Australia      VI       27   2     1     1    1
L706      2002  Australia      III      19   1     1     1    1
L1191     2009  Australia-NSW  II       37   1     1     3    1
L1204     2009  Australia-NSW  UC*      18   1     1     1    1
L1415     2009  Australia-NSW  UC       11   1     1     2    1
L1432$    2010  Australia-NSW  I        16   1     3     2    2
L1376     2010  Australia-WA   I        13   1     3     2    1
L1756$    2011  Australia-WA   I        13   1     3     2    1

^ SP, SNP Profile, * UC Un-clustered, $ Prn negative isolate

4.3.2 DNA extraction and quality control

Genomic DNA was extracted and purified from pure culture using the phenol-

chloroform method as described previously in section 2.3. For Illumina sequencing,

the DNA concentration was optimised to contain at least 20 ng/µl with 260/280

and 260/230 ratios of 1.8 to 2. For PacBio sequencing, high

quality DNA with 260/280 and 260/230 ratios of 1.8 to 2 was used.
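
The acceptance criteria above can be expressed as a simple check. The following is a minimal sketch, assuming that each sample record holds a concentration and two absorbance ratios; the class, the field names and the sample readings are hypothetical and are not measurements from this study.

```python
from dataclasses import dataclass


@dataclass
class DnaSample:
    name: str
    conc_ng_per_ul: float
    a260_280: float  # absorbance ratio used as a protein contamination indicator
    a260_230: float  # absorbance ratio used as a solvent/salt contamination indicator


def passes_qc(sample, min_conc=20.0, ratio_range=(1.8, 2.0)):
    """True if the sample meets the concentration and purity thresholds."""
    low, high = ratio_range
    return (
        sample.conc_ng_per_ul >= min_conc
        and low <= sample.a260_280 <= high
        and low <= sample.a260_230 <= high
    )


# Hypothetical readings, for illustration only.
samples = [
    DnaSample("sample_A", 35.0, 1.85, 1.95),
    DnaSample("sample_B", 12.0, 1.90, 1.90),  # concentration too low
    DnaSample("sample_C", 40.0, 1.60, 2.00),  # first ratio out of range
]
passed = [s.name for s in samples if passes_qc(s)]
print(passed)
```

Only the concentration threshold was specified for Illumina sequencing in the text; for PacBio sequencing the `min_conc` argument would need to be set to whatever amount the sequencing provider requires.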

4.3.3 Illumina sequencing and assembly

Illumina sequencing and assembly were performed as described in Chapter 3, section

3.3.2.

4.3.4 PacBio sequencing and assembly

PacBio sequencing was performed using the PacBio RS system with 2 SMRT cells

per isolate at the Yale Center for Genome Analysis (YCGA) in the Yale School of

Medicine in the United States. Hybrid assemblies of Illumina and PacBio reads

were done using two different assemblers. The first was

the PBcR assembler pipeline. This approach first improves the accuracy of

PacBio reads by using Illumina reads to recall an accurate consensus, followed by the

assembly of these corrected PacBio reads. The second was

the PBJelly assembler pipeline, where PacBio reads were aligned to Illumina-based

assemblies, thereby filling or reducing as many captured gaps in the Illumina

assemblies as possible to produce upgraded draft genomes. Default parameters were used for

both assemblers. PBJelly performed better, yielding longer contigs

and better coverage, and thus it was used for assembling all genomes sequenced in

this study.

4.3.5 SNPs, indels, insertion sequences and detection of gene losses

The methods for analysing Illumina reads to identify SNPs, small indels and the

dynamics of IS elements were described in chapter 3 (3.3.3 and 3.3.4). For IS

analyses, PacBio assemblies were also used to manually confirm the results using

progressiveMauve [351]. Similarly, progressiveMauve was also used to identify ≥1

kb size regions that were deleted in the genomes when compared to Tohama I as

potential gene losses in these genomes.

Genome rearrangements

Genome rearrangements were either due to translocation or inversion. To detect the

presence of genome rearrangements, the backbone files generated by

progressiveMauve were analysed. Regions were regarded as having undergone

translocation if the locally collinear blocks (LCB) produced by progressiveMauve

were not in the same order. Regions were regarded as being inverted if the

backbone files showed that the LCBs were in the reversed order (designated as

negative values in the backbone files). Only potentially translocated and inverted

regions from contiguous contigs were considered. Pairwise comparisons were also

performed to illustrate the presence of homologous regions in the B. pertussis

strains analysed in this study that could be affected by translocation and inversion

events using ACT [352].
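
The rules described above can be sketched in code. The example below is not the script used in this study; it assumes a tab-delimited backbone layout with a header row and one left/right coordinate pair per genome, where negative coordinates mark a block in reverse orientation and zeros mark a block that is absent from a genome. The function names and the toy coordinates are hypothetical.

```python
import csv
import io


def parse_backbone(text):
    """Return a list of blocks; each block maps genome index -> (left, right)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    n_genomes = len(header) // 2
    blocks = []
    for row in reader:
        if not row:
            continue
        values = [int(v) for v in row]
        blocks.append({g: (values[2 * g], values[2 * g + 1]) for g in range(n_genomes)})
    return blocks


def compare_to_reference(blocks, ref=0, query=1):
    """Flag blocks shared by ref and query as inverted and/or out of order."""
    shared = [b for b in blocks if b[ref][0] != 0 and b[query][0] != 0]
    # Order the blocks as they appear along the reference genome.
    shared.sort(key=lambda b: abs(b[ref][0]))
    # Rank of each block along the query genome.
    query_order = sorted(range(len(shared)), key=lambda i: abs(shared[i][query][0]))
    rank = {block_idx: r for r, block_idx in enumerate(query_order)}
    report = []
    for i, b in enumerate(shared):
        inverted = (b[ref][0] < 0) != (b[query][0] < 0)
        moved = rank[i] != i
        report.append({"ref": b[ref], "query": b[query], "inverted": inverted, "moved": moved})
    return report


# Toy two-genome backbone: block 2 is inverted in place, blocks 3-5 change order.
backbone_text = (
    "seq0_leftend\tseq0_rightend\tseq1_leftend\tseq1_rightend\n"
    "1\t1000\t1\t1000\n"
    "1001\t2000\t-1001\t-2000\n"
    "2001\t3000\t4001\t5000\n"
    "3001\t4000\t2001\t3000\n"
    "4001\t5000\t3001\t4000\n"
)
report = compare_to_reference(parse_backbone(backbone_text))
for entry in report:
    print(entry)
```

The `moved` flag is deliberately crude: it marks every block whose rank differs between the two genomes, so a single translocated block also flags the blocks it passes over, and an inversion that spans several blocks is reported as both inverted and moved. Candidate regions would still need to be restricted to contiguous contigs and inspected manually, as described above.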

4.3.6 Phylogenetic analysis

Phylogenetic analysis was conducted using MEGA (version 5) [329]. The

minimum evolution method was applied to all nucleotide positions.

Bootstrap analysis was based on 1000

replicates. B. pertussis strain Tohama I was used as the outgroup.

4.4 Results

4.4.1 Selection and sequencing of representative isolates of different SNP

clusters from Australia

To provide a genomic insight into the SNP clusters as defined by Octavia et al. and

to the genomic diversity of the major B. pertussis clones circulating in Australia,

we selected nine isolates for sequencing including three isolates from the currently

circulating cluster I and one isolate each to represent SNP clusters II, III, IV and V.

We also selected two isolates from the SNP profiles which did not belong to any

clusters (unclustered). For cluster I, two isolates were from SP13 (one Prn positive

and one Prn negative) and one was from SP16 (Prn negative) to better represent the

diversity within the current circulating major clone. SP13 was the predominant

SNP profile in the Australian 2008-2012 epidemic while SP16 was the SP which

diverged latest [306]. Both Prn negative isolates were due to inactivation by

IS481R which was the predominant disruption mechanism in current Prn negative

strains.

For Illumina sequencing, the average number of reads generated per genome was

2,027,977. The average coverage depth for all genomes was 38.83, with the lowest

coverage of 20 for isolate L706. The percentage match of reads to the reference

strain Tohama I chromosome ranged from 93.93% to 98.93% using Qualimap

version 2 [331]. The number of contigs ranged from 381 to 3,422, with an average

of 1,112 (Table 4.4-1).

For PacBio sequencing, 2 SMRT cells per genome were used. The quality of

assembly was checked using QUAST version 3 [353]. The average number of

contigs was 110 and ranged from 33 for L1756 to 217 for L1432. The average

length of the largest contigs and the average total length were 326,027 bp and 4.13 Mb

respectively (Table 4.4-2).

Table 4.4-1: Quality of assembly for each isolate- Illumina sequencing.

Isolate (cluster)  No. of reads  % match to Tohama I  No. of contigs  Contig fold coverage
L506 (IV)          1,619,035     98.62%               464             40.7
L580 (V)           2,181,604     98.83%               562             48.9
L706 (III)         744,938       98.54%               504             20.07
L1191 (II)         898,477       93.93%               381             28.07
L1204 (UC^)        1,965,736     97.07%               382             43.7
L1415 (UC)         1,928,391     98.48%               2,534           22.83
L1432 (I)          3,402,302     98.39%               3,422           22.8
L1376 (I)          2,595,615     98.88%               819             61.45
L1756 (I)          2,915,695     98.93%               941             60.96
Average            2,027,977     98%                  1,112           38.83

^= Un clustered

Table 4.4-2: Quality of assembly for each isolate- PacBio sequencing.

Isolate      No. of contigs  Total length (bp)  Largest contig (bp)  N50 (bp)*
L506 (IV)    153             4,123,588          163,881              39,998
L580 (V)     150             4,098,589          174,850              47,117
L706 (III)   172             4,300,440          214,204              41,172
L1191 (II)   66              4,133,993          657,379              125,320
L1204 (UC)   66              4,096,203          336,414              120,839
L1415 (UC)   66              4,139,824          563,086              115,200
L1432 (I)    217             4,028,512          107,502              28,257
L1376 (I)    71              4,101,544          307,323              242,945
L1756 (I)    33              4,120,661          409,603              256,981
Average      110             4,127,039          326,027              113,092

*N50 is the length for which the collection of all contigs of that length or longer covers at least

half an assembly
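
The N50 definition above can be illustrated with a short function. This is a minimal sketch of the standard calculation and is not the QUAST implementation; the contig lengths in the example are made up.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Toy assembly of five contigs totalling 300 bp: 100 + 80 = 180 >= 150.
example_n50 = n50([100, 80, 60, 40, 20])
print(example_n50)
```

Because N50 depends only on the sorted contig lengths, assemblies with a similar total length but fewer, longer contigs (such as L1756 compared with L1432 in Table 4.4-2) have a much higher N50.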

4.4.2 Single Nucleotide Polymorphisms in Different Clusters

Only Illumina data were used for SNP discovery as PacBio data generally have a

higher incorrect call rate for SNPs [349]. Two approaches, de novo assembly and

reads mapping to the reference Tohama I genome, were used for SNP calling. A

total of 426 SNPs (mutation in one or more of the 9 isolates) were detected, and

classified into 3 categories: synonymous (sSNP), non-synonymous (nsSNP) and

intergenic (IG) (Table 4.4-3).
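
The distinction between sSNPs and nsSNPs follows from translating the affected codon before and after the substitution. The sketch below illustrates this with the standard genetic code; it is not the annotation pipeline used in this study, and the example codons are hypothetical rather than the actual codons of any gene discussed here. Intergenic SNPs fall outside coding sequences and are not handled by this function.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_snp(ref_codon, position_in_codon, alt_base):
    """Classify a coding substitution; returns (category, ref_aa, alt_aa)."""
    alt_codon = (
        ref_codon[:position_in_codon] + alt_base + ref_codon[position_in_codon + 1:]
    )
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    category = "sSNP" if ref_aa == alt_aa else "nsSNP"
    return category, ref_aa, alt_aa


# Hypothetical codons for illustration.
print(classify_snp("GTC", 0, "A"))  # first-position change, valine to isoleucine
print(classify_snp("GTC", 2, "T"))  # third-position change, valine unchanged
print(classify_snp("TGG", 2, "A"))  # tryptophan to stop codon ("*")
```

A substitution that creates a stop codon is reported here as an nsSNP with `*` as the new residue, which corresponds to the kind of change that turns a gene into a pseudogene.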

Most of the SNPs were located within genes (364 SNPs, 85%) with 222 being

nsSNPs. In all clusters except clusters II and IV, the number of nsSNPs was greater

than sSNPs with the highest in cluster III. The ratio of nsSNPs to sSNPs ranged

between 0.85 for cluster II to 3.2 for cluster III. A general pattern observed was that

the majority of nsSNPs were in cell surface proteins and pseudogenes with 35 and

25 respectively. The ratio for cluster I strains was also high (2.38). Out of the 364

SNPs found in genes, the highest number of SNPs (53, 15%) was found in genes

coding cell surface proteins and the second highest was for genes coding transport

binding protein with 12%. Twenty three SNPs (6%) were in genes associated with

pathogenicity. Sixty five (15%) SNPs were in genes regulated by the Bvg

system (Table 4.4-4), of which 20 (30%) were found in all clusters, including one in

the bvgS gene, and may explain the divergence of clinical isolates from Tohama I,

which is the main strain used for WCV production. The isolate L580 (cluster V)

had the most SNPs in Bvg regulated genes, followed by L506 (cluster IV) with six

SNPs. Cluster I strains had nine SNPs, with two, located in ptxC and bscI, in

common for all three isolates. Of a total of 19 SNPs that were located in

virulence associated genes, ten were located in genes whose products

are included in the ACV; interestingly, four were relevant to FHA (in

fhaB, fhaS, fhaL and sphB1) and four were relevant to Fim (in fim2, fim3 (two SNPs)

and fimD). Seven of these SNPs were nsSNPs.

Table 4.4-3: The number of SNPs observed in B. pertussis isolates from different clusters

Cluster                    Isolate              SNP  IG^ No. (%)  sSNP* No. (%)  nsSNP# No. (%)  Ratio of nsSNP/sSNP
I                          L1432, L1376, L1756  53   9 (17%)      13 (25%)       31 (58%)        2.38
II                         L1191                30   4 (13%)      14 (47%)       12 (40%)        0.85
III                        L706                 24   3 (13%)      5 (21%)        16 (67%)        3.2
IV                         L506                 34   4 (12%)      16 (47%)       14 (41%)        0.87
V                          L580                 46   8 (17%)      13 (28%)       23 (50%)        1.76
UC$ (SP18)                 L1204                36   3 (8%)       10 (28%)       23 (64%)        2.3
UC (SP11)                  L1415                30   4 (13%)      12 (40%)       14 (47%)        1.16
Common in 2 to 8 isolates  Multiple             35   7 (20%)      12 (34%)       16 (46%)        1.33
Common in all isolates                          138  18 (13%)     47 (34%)       73 (53%)        1.55

^= Intergenic SNPs, *= synonymous SNPs, #= Non-synonymous SNPs, $= Un clustered
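
The ratio column of Table 4.4-3 can be recomputed directly from the sSNP and nsSNP counts. The sketch below uses the per-cluster counts from the table; the dictionary layout and variable names are illustrative choices.

```python
# (sSNP count, nsSNP count) per cluster, taken from Table 4.4-3.
snp_counts = {
    "I": (13, 31),
    "II": (14, 12),
    "III": (5, 16),
    "IV": (16, 14),
    "V": (13, 23),
    "UC (SP18)": (10, 23),
    "UC (SP11)": (12, 14),
}

ratios = {cluster: ns / s for cluster, (s, ns) in snp_counts.items()}
highest = max(ratios, key=ratios.get)
lowest = min(ratios, key=ratios.get)

for cluster, ratio in ratios.items():
    print(f"{cluster}: nsSNP/sSNP = {ratio:.3f}")
print("highest:", highest, "lowest:", lowest)
```

The recomputed values agree with the table to within 0.01; some of the tabulated ratios appear to have been truncated rather than rounded to two decimal places.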

Many of the 426 SNPs belonged to a single isolate (Table 4.4-3). Thirty

five SNPs were shared by two or more isolates/clusters. Additionally, 16 SNPs

were shared by the three cluster I isolates. One hundred and thirty eight (32%)

SNPs were common to all nine isolates (Table 4.4-3).

A total of 65 SNPs were found in Bvg-regulated genes of which 37 were nsSNP.

Thirteen SNPs were common to all isolates, one for cluster I to IV which was

located in ptxA, one for cluster I located in bscI, one in SP13 isolates which was

found in sphB1 and one located in BP2935 that was found from clusters I to III

(Table 4.4-4).

Table 4.4-4: SNPs located in genes regulated by the Bvg system.

Columns: Site; SNP type^; Gene name; Gene ID; Isolates; Tohama I; L580 (V); L506 (IV); L1204 (UC); L706 (III); L1415 (UC); L1191 (II); L1432 (I); L1376 (I); L1756 (I); Functional categories; Bvg activated*

220937 s ppc BP0215 L706-end G G G G A A A A A A energy metabolism +

223961 ns sphB1 BP0216 L1376-end G G G G G G G G A A pathogenicity +

224066 ns sphB1 BP0216 L1432 G G G G G G G A G G pathogenicity +

543199 ns - BP0534 L580 C T C C C C C C C C miscellaneous +

564430 ns - BP0558 L1432 T T T T T T T C T T transport/binding proteins -

634038 BP0626 L580 G A G G G G G G G G miscellaneous -

647204 ns - BP0640 L580 C T C C C C C C C C miscellaneous -

1074219 cHeR BP1031 L506 A A G A A A A A A A pseudogenes -

1080079 Int BP1037 L580 G A G G G G G G G G pseudogenes +

1080686 cutE BP1037 Tohama I T C C C C C C C C C pseudogenes +

1098918 s prn BP1054 L1191 T T T T T T C T T T pathogenicity +

1101501 s cysG BP1055 L506 G G A G G G G G G G cofactor biosynthesis +

1117897 ns pstS BP1071 Tohama I A G G G G G G G G G central/intermediary metabolism -

1162706 s sphB3 BP1110 L1756 C C C C C C C C C T pathogenicity -

1175956 ns fim2 BP1119 L506 C C T C C C C C C C pathogenicity +

1264962 ns tcfA BP1201 L1191 G G G G G G A G G G pathogenicity +

1448026 s flgM BP1371 L1191 C C C C C C T C C C regulation -

1470281 s fliM BP1394 L1376-end C C C C C C C C T T cell processes -

1565468 ns smoM BP1487 L1204 A A A G A A A A A A transport/binding proteins +

1565529 ns smoM BP1487 Tohama I G A A A A A A A A A transport/binding proteins +

1647688 s fim3 BP1568 L1204 C C C T C C C C C C pathogenicity +

1647861 ns fim3 BP1568 L1432 C C C C C C C A C C pathogenicity +

1652706 s glnH BP1573 Tohama I T C C C C C C C C C transport/binding proteins +

1702741 s - BP1619 L1415 C C C C C T C C C C unknown -

1703727 ns - BP1619 L580 C T C C C C C C C C unknown -

1861522 ns - BP1771 Tohama I A G G G G G G G G G conserved hypothetical -

1880238 BP1790 Tohama I G T T T T T T T T T pseudogenes +

1965604 ns bvgS BP1877 Tohama I T C C C N C C C C C pathogenicity +

1971172 ns fhaB BP1879 L1191 A A A A A A G A A A pathogenicity +

1984103 ns fimD BP1883 Tohama I T C C C C C C C C C pathogenicity +

2104868 s - BP1992 L1204 G G G A G G G G G G cell surface -

2188173 s - BP2068 L1204 C C C T C C C C C C cell surface -

2214506 ns - BP2091 L1415 G G G G G A G G G G small molecule degradation -

2258534 ns - BP2134 L1415 C C C C C T C C C C conserved hypothetical -

2271462 s - BP2149 L1415 C C C C C T C C C C regulation -

2353874 s alr BP2228 Tohama I G A A A A A A A A A cell surface +

2355443 s - BP2229 L580 G A G G G G G G G G transport/binding proteins +

2356411 ns - BP2229 Tohama I T C C C C C C C C C transport/binding proteins +

2356417 ns - BP2229 Tohama I T C C C C C C C C C transport/binding proteins +

2357290 ns - BP2229 L706 C C C C T C C C C C transport/binding proteins +

2359892 ns - BP2231 Tohama I A G G G G G G G G G cell surface +

2363842 s bscC BP2235 L580 C T C C C C C C C C pathogenicity +

2374322 ns bscI BP2249 cluster I T T T T T T T C C C pathogenicity +

2384626 s bscD BP2262 L1204 C C C T C C C C C C pathogenicity +

2492505 s - BP2352 Tohama I A G G G G G G G G G transport/binding proteins -

2605624 s bcr BP2462 L706 T T T T C T T T T T transport/binding proteins +

2817932 s - BP2662 L580 C A C C C C C C C C miscellaneous -

2826237 ns fhaS BP2667 L1432 G G G G G G G A G G pathogenicity +

2921561 ns - BP2751 Tohama I T G G G G G G G G G cell surface +

3023386 BP2846 Tohama I G A A A A A A A A A pseudogenes -

3092624 ns fhaL BP2907 L1756 C C C C C C C C C T pathogenicity +

3123720 ns - BP2935 L1204-end C C C A A A A A A A regulation +

3404606 s - BP3192 Tohama I A G G G G G G G G G transport/binding proteins -

3587622 s - BP3371 Tohama I A G G G G G G G G G conserved hypothetical +

3591185 BP3379 L506,1415-end T T C T T C C C C C unknown +

3596789 ns - BP3385 L580 T G T T T T T T T T conserved hypothetical +

3596916 s - BP3385 Tohama I A G G G G G G G G G conserved hypothetical +

3610438 s - BP3402 L506 C C T C C C C C C C cell surface +

3642513 ns - BP3434 L580 G T G G G G G G G G cell surface -

3950021 s ctaD BP3743 Tohama I A G G G G G G G G G energy metabolism -

3959407 s - BP3750 L580 C T C C C C C C C C small molecule degradation -

3988941 ns ptxA BP3783 L506-end G G A A A A A A A A pathogenicity +

3989239 ns ptxB BP3784 Tohama I G A A A A A A A A A pathogenicity +

3991376 s ptxC BP3787 cluster I C C C C C C C T T T pathogenicity +

4063054 s katA BP3852 L506 C C T C C C C C C C protection responses +

^: (s) = synonymous, (ns) = non-synonymous; *: (+) = Bvg upregulated, (-) = Bvg down regulated

We also used the B. pertussis strain CS as a reference to identify SNPs located in

regions that are not present in the Tohama I reference genome. We only used the

two regions that are not present in Tohama I, which contain 39 genes. The same

SNP discovery methods were used. A total of 10 SNPs were detected in one or

more isolates, of which two were common to all and 8 were unique for some

isolates (Table 4.4-5). Interestingly, all SNPs were located in genes. Five SNPs

were non synonymous.

Only two SNPs followed the phylogenetic tree: one SNP in BPTD_0398,

coding for a hypothetical protein, was common to all isolates, and one SNP in

BPTD_0394, coding for an IclR family transcriptional regulator, was found in all isolates except

L580 (cluster V). No unique SNPs were found for cluster I strains.

Table 4.4-5: SNPs detected using SAMtools and progressiveMauve when compared with B. pertussis CS as the reference genome.

Columns: Position in CS; Gene; AA change; B. pertussis CS; L580 (V); L506 (IV); L1204 (UC); L706 (III); L1415 (UC); L1191 (II); L1432 (I); L1376 (I); L1756 (I); Protein

397797 BPTD_0392 G -> S C C C C C C C C C T hypothetical protein

398362 BPTD_0392 G G G G G G G A G G hypothetical protein

400613* BPTD_0394 A -> T C C T T T T T T T T IclR family transcriptional regulator

404970* BPTD_0398 T G G G G G G G G G hypothetical protein

405700 BPTD_0399 T -> S G G G C G G G G G G enoyl-CoA hydratase/isomerase family protein

411832 BPTD_0404 V -> M C C C C T C C C C C hypothetical protein

3084803 BPTD_2836 G A G G G G G G G G high-affinity branched-chain amino acid for ATP-binding

3087517 BPTD_2839 G A G G G G G G G G branched chain amino acid ABC transporter permease

3088495 BPTD_2841 P->R G G C G G G G G G G acetyl-CoA synthetase

3098279 BPTD_2851 G G A G G G G G G G putative amidase

*: Found in study by Xu et al. [302]

4.4.3 Phylogenetic Relationships of the isolates

A phylogenetic tree was constructed from the 426 SNPs using the

minimum evolution method (Figure 4.4-1). Tohama I was used as an

outgroup. Strain CS was not included for phylogenetic analysis as a full

comparison of its genome was not performed in this study and its phylogenetic

relationship to Tohama I is known based on previous studies [249]. The

phylogenetic tree was similar to that reported by Octavia et al. where B. pertussis

isolates were separated into six major clusters [306]. Octavia et al. found that the

clusters showed a step-wise evolution. This is now confirmed using genome data

with multiple SNPs supporting the branching order of the SNP clusters previously

defined. The previously unclustered isolate L1415 shared no common SNPs with

SNP cluster I or II strains and thus remains “unclustered”. L1204, also an

unclustered isolate, showed the closest relationship with cluster III with one shared

SNP.

The SNPs in the internal branches can be considered as fixed SNPs during the step-

wise evolution of the clusters. For example the eight SNPs on the internal branch,

separating isolates from clusters I to IV, were fixed SNPs as they were shared by

clusters I to IV, although caution must be exercised as only one isolate per cluster

was used. The 46 fixed SNPs are listed in Table 4.4-6 and include SNPs with

important genetic changes that marked the divergence of the clusters: the antigenic

shift from ptxA2 to ptxA1 in clusters I-IV and prn1 to prn2/3 in clusters I-II. In

addition, five of the eight SNPs fixed during the divergence of clusters I-IV were in

genes encoding three cell surface proteins, PtxA, and a regulatory protein. The ptxA

change was observed previously and was hypothesised to be associated with the

introduction of WCV. Five of the nine SNPs fixed in clusters I-III were in genes

involved in transport/binding and macromolecular synthesis of which one, BP1014,

became a pseudogene due to the SNP resulting in a stop codon. Three of the 12

SNPs fixed in cluster I-II were also in genes involved in transport/binding and

macromolecular synthesis. Some of the 16 SNPs shared by the three cluster I

isolates are also on their way to becoming fixed.
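
Which isolates carry a fixed SNP can be read off the allele rows of Table 4.4-6 by comparing each isolate with the outgroup. The sketch below assumes the column order of that table (Tohama I first, then L580 through L1756) and treats the Tohama I allele as ancestral, which is a simplifying assumption; the function name is hypothetical.

```python
# Column order of the allele calls in Table 4.4-6.
ISOLATES = [
    "Tohama I", "L580", "L506", "L1204", "L706",
    "L1415", "L1191", "L1432", "L1376", "L1756",
]


def derived_carriers(alleles, isolates=ISOLATES, outgroup="Tohama I"):
    """Isolates whose allele differs from the outgroup; missing calls ('N') are skipped."""
    calls = dict(zip(isolates, alleles.split()))
    ancestral = calls[outgroup]
    return [name for name, base in calls.items() if base not in (ancestral, "N")]


ptxA_row = "G G A A A A A A A A"  # position 3988941 (ptxA, M->I)
pitA_row = "C C C T T T T T T T"  # position 1059382 (pitA, W->STOP)

ptxA_carriers = derived_carriers(ptxA_row)
pitA_carriers = derived_carriers(pitA_row)
print(ptxA_carriers)
print(pitA_carriers)
```

The ptxA change is carried by every isolate except L580, consistent with fixation on the branch leading to clusters I-IV, whereas the pitA stop codon is additionally absent from L506, consistent with fixation one step later.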

Figure 4.4-1. Minimum Evolutionary Tree of 10 B. pertussis isolates from different clusters

based on 426 SNPs. The number on the internal and terminal branches corresponds to the number

of SNPs supporting each branch. Isolation year, cluster information and SNP profile numbers are

shown in brackets.

[Tree image. Tip labels: L1432 (2010’I;16), L1756 (2011’I;13), L1376 (2010’I;13), L1191 (2009’II;37), L1415 (2009’UC;11), L706 (2002’III;19), L1204 (2009’UC;18), L506 (2004’IV;30), L580 (2007’VI;27), Tohama I (1954’VI;36)]


Table 4.4-6: Fixed SNPs for one or more clusters based on the phylogenetic tree.

Position; Locus; Gene; AA change*; Cluster; Tohama I; L580 (V); L506 (IV); L1204 (UN); L706 (III); L1415 (UN); L1191 (II); L1432 (I); L1376 (I); L1756 (I); Functional categories

864189 BP0833 - D-> N I-IV G G A A N A A A A A pseudogenes

939561 BP0904 pphA F->L I-IV T T C C C C C C C C regulation

1806314 BP1722 - T->M I-IV C C T T T T T T T T conserved hypothetical

2185065 BP2064 - V->A I-IV T T C C C C C C C C pseudogenes

2392797 BP2271 - G->S I-IV G G A A A A A A A A cell surface

2736088 BP2585 - I-IV G G A A A A A A A A cell surface

3938341 BP3728† rkpK A->T I-IV C C T T T T T T T T cell surface

3988941 BP3783 ptxA M->I I-IV G G A A A A A A A A pathogenicity

700292 BP0684 - I-III G G G T N T T T T T pseudogenes

1059382 BP1014 pitA W->STOP I-III C C C T T T T T T T transport/binding proteins

1931433 intergenic - I-III A A A G G G G G G G

2194756 BP2075 - I-III G G G A A A A A A A transport/binding proteins

2488085 BP2348 - I-III T T T C C C C C C C transport/binding proteins

3123720 BP2935† - A->S I-III C C C A A A A A A A regulation

3239465 intergenic BPr09P I-III T T T C C C C C C C

3260282 BP3059 cca I-III C C C T T T T T T T macromolecule synthesis/modification

3846833 BP3642 rpoA I-III G G G A A A A A A A macromolecule synthesis/modification

220937 BP0215† ppc I-III G G G G A A A A A A energy metabolism

50044 intergenic - I-II C C C C C G G G G G

511992 intergenic BP0499P/ BP0500P I-II A A A A A G G G G G

514171 intergenic - I-II G G G G G A A A A A


517207 BP0505 - I-II G G G G G T T T T T phage-related or transposon-related

525420 BP0518 - D->E I-II G G G G N C C C C C conserved hypothetical

883816 BP0854 nuoN L->F I-II C C C C C T T T T T energy metabolism

1137841 BP1090 - T->K I-II C C C C C A A A A A conserved hypothetical

1290405 BP1227 radA I-II G G G G G A A A A A macromolecule synthesis/modification

1827556 BP1741 - D->N I-II G G G G G A A A A A pseudogenes

4068047 BP3857 - G->S I-II C C C C C T T T T T miscellaneous

4068650 BP3858 - G->S I-II C C C C C T T T T T transport/binding proteins

4071996 BP3861 - I-II G G G G G A A A A A transport/binding proteins

36857 intergenic BP0032P/ BP0033P I A A A A A A A G G G

185405 BP0184 - Q I G G G G N G G A A A cell surface

196307 BP0194 - V->A I T T T T N T T C C C transport/binding proteins

299559 BP0292 - A->T I C C C C C C C T T T pseudogenes

518837 BP0507 - E I T T T T T T T C C C unknown

694521 BP0678 prfA K I A A A A N A A G G G macromolecule synthesis/modification

1077844 intergenic - I C C C C C C C T T T

1331840 BP1261 - A->T I G G G G N G G A A A pseudogenes

1547488 BP1471 - N->S I A A A A A A A G G G conserved hypothetical

1795894 intergenic - I C C C N C N C A A A

2374322 BP2249† bscI Y->C I T T T T T T T C C C pathogenicity

2651008 BP2502 - T->I I G G G G G G G A A A unknown

3263622 intergenic BP3062P I A A A A A A A C C C

3840411 intergenic rpsH I G G G G G G G A A A ribosome constituents

3988168 ptxP3 BP3783P I G G G G G G G A A A

3991376 BP3787† ptxC I C C C C C C C T T T pathogenicity

*AA: amino acid, †Bvg activated


4.4.4 Potential Adaptive SNPs in Different Isolates/Clusters

Non-synonymous SNPs (nsSNPs) may play a role in adaptation. The nsSNPs are listed in

Appendix 2, which also lists other potential adaptive changes in different

clusters. It should be noted that only one isolate was selected from each of the

different clusters except for cluster I and thus only provides a very limited view of

variations within a cluster. There were 24 to 46 SNPs in different clusters. To

differentiate the unique SNPs for each isolate from the SNPs defining clusters, we

compared them with the SNPs identified in the WGS of 343 B. pertussis isolates

collected worldwide [41]. SNPs were filtered to be fixed for each cluster based on

the high frequency in B. pertussis isolates. For clusters II to V, an average of five

SNPs of internal branches was detected for each cluster. However, for cluster I, two

SNPs were shared with all ptxP3 strains, two for SP16 and one for SP13 strains. Of

interest is the nsSNP located in tcfA for cluster II (prn3 strain), which was also

found in the majority of prn3 strains in the study by Bart et al. [41] and in a few strains of other prn

types that were grouped with prn3 strains in the phylogenetic analysis. Two

nsSNPs, in fim3 and prpB, were found for the SP16 isolate L1432 and were fixed SNPs

involved in the antigenic shift from fim3-1 to fim3-2. Other SNPs that were present in

fewer isolates might be unique SNPs for Australian clusters.

Intergenic SNPs such as those found in ptxP promoter may also be adaptive. From

a total of 62 intergenic SNPs, 59 were in promoter regions of genes, of which, 11

were located in phage-related or transposon-related genes and eight in genes

encoding cell surface proteins. Three SNPs were located in the promoter region of

virulence associated genes including ptx for all cluster I strains, fhaL for L1415 and

tcfA for all isolates.

We observed 16 SNPs which were specific for cluster I or ptxP3 strains in this

study (Table 4.4-6). Of these, the most important were in BP0678 (peptide chain

release factor 1), BP2249 (type III secretion protein (bscI)), BP3630 (30S

ribosomal protein S8 (rpsH)), ptx promoter and BP3787 (ptxC). Other SNPs were

located in intergenic regions or in genes coding for hypothetical proteins. One SNP

was located in an intergenic region (position 1077844) and was identified only here, appearing unique

to our set of isolates. One SNP (in BP2249) was reported previously when 316

worldwide isolates were categorised into 6 major clusters using 65 SNPs [306].


Table 4.4-7: Possible unique SNPs for each cluster

Isolate(s); Cluster; SP^; No. of SNPs for each isolate (cluster or SP); No. of SNPs found in Bart et al. study [41]; Intergenic; sSNP*; nsSNP#; Genes with nsSNP

L580 V 27 46 45** 1 3 1 BP3385

L506 IV 30 34 24 0 0 4 BP0099,BP1421,BP2611,BP2694

L706 III 19 24 22 2 3 4 BP0435,BP1276,BP1666,BP2229

L1191 II 37 30 12 1 2 2 BP1201,BP1291

L1204 UC$ 18 36 25 0 1 3 BP2014,BP2538,BP3474

L1415 UC 11 30 24 0 4 1 BP1498

L1432 I 16 11 9 0 2 2 fim3,prpB

L1376,L1756 I 13 9 7 0 0 1 BP0658

^ = SNP profile, * = synonymous SNP, # = non-synonymous SNP, $ = unclustered

**: The majority of the SNPs found were previously reported in Bart et al. study with a small number of SNPs unique to Australian isolates.


4.4.5 Indels

Indels less than 50 bp were identified by mapping reads against B. pertussis

Tohama I using Burrows-Wheeler Alignment (BWA) tools (version 0.7.5) and

SAMtools (version 0.1.19) [328] as described for indel identification in Chapter 3.

Indels were then confirmed using contigs from mixed assembly of Illumina and

PacBio data. A total of 68 indels were found in all isolates, of which 19 were

located in intergenic regions and 49 were in genic regions (Table 4.4-8). Of the 49

indels located in genes, 25 were due to deletions and 24 were due to insertion of

one or more bases. Thirteen were in pseudogenes suggesting further degradation of

these pseudogenes.

Indels present in more than one isolate followed the SNP-based

phylogenetic relationships. A total of 15 genic indels were identified in

all nine isolates, while another four genic indels followed the SNP phylogenetic

tree. One indel in BP2928, coding for a putative membrane transport protein, was

common to all clusters except cluster V. Another indel located in BP0364,

encoding a hypothetical protein, was identified in seven isolates, from L1204 through the

cluster I to III strains, but was absent in clusters IV and V. The final two indels were

frameshift indels that were unique to cluster I (ptxP3) strains and located in

BP0880 and BP2946. Bart et al. in 2010 also reported these indels present in ptxP3

strains when they were compared with pre-ACV vaccination strains [283].

For unique indels, L580, a cluster V isolate, had the largest number

(11), followed by L506 (cluster IV) and L706 (cluster III) with 5 indels each. L1204 had 4

indels and three indels each were found in L1415, L1432 and L1756. Only one

indel was found in L1191 and L1376.

Nineteen indels were located in intergenic regions, of which 16 were in

promoter regions. Five were identified in all isolates, including four in

the promoter regions of BP0612, BP1044, fim2 and rpoA. One indel,

located in the promoter region of glyA, which codes for a serine hydroxymethyltransferase,

was present in all clusters except cluster V. Eleven indels were

unique to a single isolate with four found only in L580. One indel was located in


the promoter region of fim3 and was found in SP13 isolates which belonged to EL4

(Chapter 3).

4.4.5.1 Effect of indels on the function of genes

There were 37 frameshift indels which occurred in different genes. Twenty-four

genes were affected by one base pair indel. Five genes were affected by two base

pair indels and one gene each was affected by 4, 5, 7, 8, 10, 11 and 31 base pair

indels, respectively. Twelve indels were in a multiple of three bases and did not

affect the reading frame of the genes involved. Ten frameshift indels were common

to all isolates including six pseudogenes and two genes encoding functions within

the cell surface category. Eight indels were also common to multiple isolates.

Altogether, 19 indels were unique to different individual isolates with 7 unique

indels present in L580 (cluster V).
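The distinction between frameshift and non-frameshift indels follows from the triplet nature of the genetic code: an indel shifts the reading frame unless its length is a multiple of three. A minimal sketch of this rule is shown below; the function name `is_frameshift` is hypothetical, and the sizes are those mentioned in the text and in the non-frameshift rows of Table 4.4-8.

```python
def is_frameshift(indel_size_bp):
    """An indel inside a coding region shifts the reading frame
    unless its length is a multiple of three."""
    return indel_size_bp % 3 != 0


# Indel sizes (bp) reported for the frameshift indels in the text.
frameshift_sizes = [1, 2, 4, 5, 7, 8, 10, 11, 31]

# Indel sizes (bp) appearing in the non-frameshift rows of Table 4.4-8.
in_frame_sizes = [3, 6, 9, 12]

for size in frameshift_sizes + in_frame_sizes:
    label = "frameshift" if is_frameshift(size) else "in-frame"
    print(f"{size:>2} bp: {label}")
```

This only considers indel length; whether a given frameshift actually truncates or extends the protein depends on where the next stop codon falls in the new frame.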

Most pseudogenes were formed by small frameshift indels and early gene

termination. These changes resulted in proteins that were >20% shorter and were

considered to be pseudogenes [354]. Eleven genes were converted to pseudogenes,

of which 4 encoded cell surface proteins. BP2232, which encodes a hypothetical

protein, had a one base pair deletion that was identified in all isolates. This suggests

that only Tohama I possessed a functional copy of the gene. Two genes, BP2928

and BP2946, which code for a putative membrane transport protein and a putative

ArsR-family transcriptional regulator respectively, followed the phylogenetic tree.

The former had a common 7 bp deletion present in all clusters except cluster V

while the latter was only altered in cluster I strains with a 31 bp deletion. There

were also two genes, BP0986 and BP1366, in L1432 and L1191 respectively, with

no stop codon at the end due to 2 bp and 10 bp frameshift indels. These indels

resulted in larger proteins being produced, which may affect their functions.

Indels which resulted in a reversion of pseudogenes to functional genes were also

investigated. Of 11 pseudogenes, six were likely to have reverted to

functional genes. Of these six, four genes (BP1351, BP2000, pntB and

BP3762) were found in all isolates, which suggests that they might only be

pseudogenes in Tohama I. The other two genes were BP0880, which may have reverted

in all cluster I strains, and bapC in L1415 (unclustered isolate). BP0880


codes for a putative exported protein that may function as a metal dependent

phosphohydrolase, and was converted to a pseudogene in Tohama I by a small

frameshift in codon 793 which terminated gene translation. An 8 bp frameshift

deletion near the same position was likely to have changed the stop codon to an

amino acid codon therefore reverting it back to a functional gene.

bapC is a Bvg regulated pseudogene which codes for a putative outer membrane

protein in B. pertussis Tohama I. However, recent studies showed that bapC is an

adhesin factor and that it may also be important in serum resistance with a role

similar to BrkA by interfering with the classical pathway of complement activity

[355, 356]. Based on our findings, it seems that a one bp deletion in the

homopolymeric region of this gene has reverted it to a functional gene, with the protein

being produced in L1415. We also identified two frameshift indels in bapC in the

SP13 study; one of which was specific for the EL2 lineage (Chapter 3).

Homopolymeric regions are likely to be subjected to phase variations in bacteria.

There were 19 indels found within the homopolymeric regions of genes or

intergenic regions. Eight indels were located in homopolymeric genic regions

including 2 within Bvg regulated virulence genes, BP2399 which codes for a

putative transcriptional regulator in L580 and bapC in L1415, as discussed above.

One indel within the homopolymeric region in BP1351, which encodes a cell surface

protein, was found in all isolates. Other indels within homopolymeric regions were

unique to some isolates. Interestingly, one indel located in the

homopolymeric region of BP3224, which encodes a putative cytochrome oxidase, was

only found in Prn negative cluster I isolates, L1756 and L1432.
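As a rough illustration of how an indel can be checked against homopolymeric tracts, the sketch below scans a sequence for runs of a single base and tests whether a position falls inside one. The minimum tract length of 6 bp, the toy sequence and the function names are assumptions made for this example; they are not the criteria or data used in this study.

```python
import re


def homopolymer_tracts(seq, min_len=6):
    """Return (start, end, base) for every run of a single base that is
    at least min_len long. Coordinates are 0-based and end-exclusive.

    min_len=6 is an arbitrary illustrative threshold.
    """
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.end(), m.group(1))
            for m in pattern.finditer(seq.upper())]


def indel_in_tract(pos, tracts):
    """True if the 0-based position lies inside any tract."""
    return any(start <= pos < end for start, end, _ in tracts)


# Toy sequence: an 8 bp C-stretch and a 7 bp G-stretch.
toy = "ATG" + "C" * 8 + "GTTA" + "G" * 7 + "A"
tracts = homopolymer_tracts(toy)
print(tracts)
print(indel_in_tract(5, tracts), indel_in_tract(12, tracts))
```

In practice the coordinates would come from the reference genome and the indel calls, and the threshold would need to be chosen to match the definition of a homopolymeric tract used in the analysis.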

The majority of intergenic indels (11, 61%) were located in homopolymeric

regions. Of these 11 intergenic indels located in the homopolymeric regions, two

were in the intergenic regions of fim2P and BP1044P, and were common to all

isolates. The C-stretches in the fim2 and fim3 promoter regions are known to

affect the expression of fim2 and fim3 [76, 80].


For the regions absent in Tohama I but present in CS, only one indel was found. A

single base deletion in the BPTD_2837 that codes for an amino acid transport ATP-

binding protein resulted in a pseudogene and this was also found in all 9 isolates.


Table 4.4-8: Indels found in B. pertussis isolates belonging to different clusters.

Position; in/del; Size (bp); Homopolymeric tract; Locus; Gene; L580 (V); L506 (IV); L1204 (UC*); L706 (III); L1415 (UC); L1191 (II); L1432 (I, SP16); L1376 (I, SP13); L1756 (I, SP13); Functional category; Effect of mutation

Frameshift

63605 I 1 + BP0064 - - - + - - - - - conserved hypothetical Pseudogene

167280 I 1 BP0167 + - + + + + + - - pseudogenes No change

195374 I 1 BP0192 + - - - - - - - - phage-related or transposon-related No change

266677 I 1 BP0256 - - - - - - + - - phage-related or transposon-related No change

365461 D 1 BP0364 - - + + + + + + + cell surface No change

919013 D 8 BP0880 - - - - - - + + + pseudogenes Revert?

1022899 I 1 BP0985 acrB - - - - - - - - + protection responses Pseudogene

1026380 I 2 BP0986 cusC - - - - - - + - - cell surface no end stop codon

1080117 D 1 BP1037† lnt + + + + + + + + + pseudogenes No change

1184203 I 2 BP1123 + + + + + + + + + cell surface No change

1193543 I 1 BP1129 - - - - - - + - - regulation No change

1417558 I 2 BP1344 - - - - + - - - - pathogenicity No change

1426476 D 1 + BP1351 + + + + + + + + + pseudogene Revert?

1442608 D 10 BP1879† fhlB - - - - - + - - - pathogenicity Pseudogene

1464175 I 1 BP1386 + + + + + + + + + pseudogenes no end stop codon

1464181 I 1 BP1386 + + + + + + + + + pseudogenes No change

1733295 D 1 BP1645 - - - + - - - - - miscellaneous No change


1767149 D 1 + BP1682 + - - - - - - - - cell surface Pseudogene

2111898 I 1 BP2000 + + + + + + + + + pseudogenes Revert?

2132148 I 1 BP2018 - - - - - - - - + phage-related or transposon-related No change

2241646 I 11 BP2029 + + - + + + + - - phage-related or transposon-related Pseudogene

2299222 I 1 BP2180 - - + - - - - - - conserved hypothetical No change

2304558 I 1 BP2184 rpoD + - - - - - - - - macromolecule synthesis/modification No change

2361165 D 1 BP2232† + + + + + + + + + unknown Pseudogene

2540683 D 1 + BP2399† + - - - - - - - - regulation No change

2747342 D 1 + BP2595† - - + - + - - - - pseudogenes No change

2907628 D 1 + BP2738 bapC - - - - + - - - - pseudogenes Revert?

2929896 D 2 BP2755 - - - + - - - - - cell surface Pseudogene

3074606 I 2 BP2899 pntB + + + + + + + + + pseudogenes Revert?

3117315 D 7 BP2928† - + + + + + + + + transport/binding proteins Pseudogene

3134430 D 31 BP2946 - - - - - - + + + regulation Pseudogene

3162196 D 1 BP2975 + - - - - - - - - regulation Pseudogene

3291152 I 4 BP3090 + + + + + + + + + conserved hypothetical No change

3436723 I 1 + BP3224 - - - - - - + - + energy metabolism No change

3535025 I 5 BP3315 copA + - - - - - - - - adaptation No change

3974877 D 1 + BP3762 + + + + + + + + + pseudogenes Revert?

3987411 D 1 BP3782† + - - - - - - - - cell surface Pseudogene

Non-Frameshift

264560 D 3 BP0254 + - - - - - - - - cell surface


1005224 D 6 BP0967† cysU + + + + + + + + + pseudogenes

1085817 D 6 BP1042 - + - - - - - - - pseudogenes

1250494 D 3 BP1186 - - - - - - - - + miscellaneous

1707618 D 6 BP1624 kpsT + + + + + + + + + transport/binding proteins

1916554 D 6 BP1826 - + - - - - - - - transport/binding proteins

2114835 D 9 BP2003† - - - + - - - - - cell surface

2265235 I 6 BP2141 + + + + + + + + + cell surface

2361991 I 3 BP2232† - + - - - - - - - unknown

2877338 D 12 BP2712 - + - - - - - - - cell surface

3476167 I 3 BP3258 + + + + + + + + + pseudogenes

3793608 I 3 BP3580† + + + + + + + + + macromolecule degradation

Intergenic

154997 D 1 + BP0155P serA + - - - - - - - - amino acid biosynthesis

932632 D 1 + BP0897P + - - - - - - - - phage-related or transposon-related

1087350 I 1 + BP1045P + + + + + + + + + phage-related or transposon-related

1176540 D 6 + BP1119† fim2 + + + + + + + + + pathogenicity

1272424 D 1 + BP1208P - - + - - - - - - cell surface

1647526 D 2 + BP1568 fim3 + + + - + + - - -

2134111 I 1 + BP2020P† + - - - - - - - - unknown

2606083 I 1 + BP2463P fauA - - - - + - - - - pathogenicity

3041130 I 1\2 + BP2863-BP2862P + + + + + - + + - unknown

3140663 D 1 + BP2952P glyA - + + + + + + + + central/intermediary metabolism


3645032 D 1 + BP3437P - - + - - - - - - conserved hypothetical

94462 I 1 BP0096P - - - + - - - - - cell surface

444906 I 6 Intergenic + - - - - - - - -

215507 D 1 BP0210P - + - - - - - - - phage-related or transposon-related

617175 I 2 BP0612P + + + + + + + + + cell surface

1493544 D 2 BP1418P map - - - - - - - + - macromolecule degradation

2996743 D 4 BP2819P - - + - - - - - - phage-related or transposon-related

3462288 D 39 Intergenic + + + + + + + + +

3846122 I 1 BP3642P rpoA + + + + + + + + + macromolecule synthesis/modification

*UC= Un-clustered, †=Bvg regulated


4.4.6 Insertion Sequences

There are three different insertion sequences in B. pertussis including IS481,

IS1002 and IS1663, of which IS481 is the major IS in B. pertussis strains. The

number of IS481 varies in different B. pertussis strains from 230 in B. pertussis

strain Tohama I to 239 in B. pertussis strain 18323 [249]. The hybrid PacBio and

Illumina assemblies were blasted against Tohama I to detect the presence and

absence of IS elements identified for B. pertussis Tohama I.

4.4.6.1 Deleted ISs

From a total of 230 IS481 elements detected in B. pertussis Tohama I [249], 213

ISs were detected as deleted in one or more strains, with an average of 67.8 per isolate and

ranging from 144 in L1432 to 26 in L1756 (Table 4.4-9). Almost all of the IS

elements in the genome of B. pertussis Tohama I are known as transposases and

categorised as phage-related or transposon-related genes, and some as

pseudogenes. Therefore, deletions of transposases will be discussed in detail in the

gene loss section (4.4.7).

Table 4.4-9: General information about the presence and deletion of IS481 identified in B.

pertussis isolates that were analysed in this study.

Isolate; Cluster; Deleted; Unique deletion for each isolate; New IS; Unique new IS for each strain

L580 V 117 14 10 4

L506 IV 99 12 11 3

L1204 UC^ 46 3 7 1

L706 III 44 0 10 3

L1191 II 52 3 9 2

L1415 UC 33 2 9 2

L1376 I 49 4 6 0

L1756 I 26 0 7 0

L1432 I 144 20 9 1

Average 67.8 6.4 8.6 1.7

^= Unclustered
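The summary row of Table 4.4-9 can be reproduced from the per-isolate counts. The short sketch below does this for the "Deleted" column only, using the values copied from the table; the variable names are hypothetical.

```python
from statistics import mean

# Number of Tohama I IS481 copies detected as deleted in each isolate
# ("Deleted" column of Table 4.4-9).
deleted_is481 = {
    "L580": 117, "L506": 99, "L1204": 46, "L706": 44, "L1191": 52,
    "L1415": 33, "L1376": 49, "L1756": 26, "L1432": 144,
}

average = round(mean(deleted_is481.values()), 1)
most = max(deleted_is481, key=deleted_is481.get)
fewest = min(deleted_is481, key=deleted_is481.get)

print(f"average deleted IS481 per isolate: {average}")
print(f"most deletions: {most} ({deleted_is481[most]})")
print(f"fewest deletions: {fewest} ({deleted_is481[fewest]})")
```

The same calculation applied to the other columns gives the remaining entries of the "Average" row, allowing for rounding.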


4.4.6.2 New IS elements

All three types of ISs were searched. All newly identified IS insertion sites, except

for one IS1002 insertion in BP0692, were due to IS481. A total of 27 new IS

insertions were detected in one or more isolates, with an average of 8.6 and ranging

from 11 in L506 to 6 in L1376 (Table 4.4-10). Five IS insertions, one each in

BP0551, BP0935, BP1442, BP1987 and BP2839, were common to all clusters

(Table 4.4-10). One IS located in BP0976 was common to clusters I to IV. The

known IS insertions in the two Prn negative isolates of cluster I, L1432 and L1756,

were confirmed. A total of 16 ISs were unique to one isolate each, including 4

in L580 (cluster V), followed by 3 ISs in each of L506 and L706, 2 in each of

L1191 and L1415 and one IS in each of L1204 and L1432.

The genes disrupted encompassed 9 functional categories including pathogenicity

(4), cell surface (4), energy metabolism (4), miscellaneous (4), conserved hypothetical

proteins (3), transport/binding proteins (2), macromolecule synthesis (2),

regulation (2) and unknown (1).


Table 4.4-10: New IS elements found in the different isolates.

^: UC = unclustered, * = SNP profile

LocusTag; GeneName; L580 (V); L506 (IV); L1204 (UC)^; L706 (III); L1415 (UC); L1191 (II); L1432 (I, SP*16); L1376 (I, SP13); L1756 (I, SP13); Functional category; Protein

BP0199 ## conserved hypothetical conserved hypothetical protein

BP0276 cytB ## ## energy metabolism cytochrome B

BP0499-IG-BP0500 34 pathogenicity hypothetical protein

BP0551-IG-BP0552 fmt-def 52 32 49 30 41 61 64 79 46 macromolecule synthesis/modification methionyl-tRNA formyltransferase

BP0692 ## energy metabolism cytochrome C

BP0764 cyaX 81 pathogenicity probable LysR-family transcriptional regulator

BP0935 74 39 62 31 46 87 72 95 67 cell surface putative exported protein

BP0976-IG 40 54 21 43 72 67 85 25 conserved hypothetical conserved hypothetical protein

BP1054 prn ## ## pathogenicity pertactin precursor

BP1123 77 cell surface putative membrane protein

BP1442-IG smpB- 76 56 75 21 54 74 92 86 55 miscellaneous-conserved hypothetical SsrA-binding protein

BP1513 ## energy metabolism formate dehydrogenase

BP1767 phg 58 pathogenicity autotransporter

BP1987 ## ## 99 62 90 ## ## ## ## regulation probable MarR-family transcriptional regulator

BP2225 52 miscellaneous putative phospholipase

BP2369 acnA ## energy metabolism probable methyl-cis-aconitic acid hydratase

BP2713 92 60 ## miscellaneous putative hydrolase

IG-BP2803 43 cell surface putative integral membrane protein

IG-BP2827 41 58 unknown hypothetical protein

BP2839-IG 86 50 61 30 64 84 ## ## 62 cell surface exported protein, conserved

BP2857 73 regulation putative LysR-family transcriptional regulator

BP3116 89 macromolecule synthesis/modification putative modification methylase

BP3160 ## transport/binding proteins putative ABC transporter permease protein

BP3304 61 miscellaneous putative enoyl-CoA hydratase

BP3370 ## conserved hypothetical conserved hypothetical protein

BP3764 ## transport/binding proteins putative ABC-2 type transporter protein


4.4.7 Gene Loss

To investigate gene deletions and regions of differences in each cluster, the hybrid

assemblies were blasted to the genome reference B. pertussis Tohama I and regions

with a 300 bp or more deletion were detected and analysed. Of the 3816 genes in

Tohama I [332], 402 genes (10%) were detected as partially or completely deleted

in one or more of the isolates (Appendix 3). The average number of genes affected

per isolate was 134.55 genes, ranging from 213 genes in L580 from cluster V to 76

genes in L1415. Of the 3285 core genes, which are present in most

B. pertussis strains and seem to be important for bacterial metabolism [249, 332],

290 (8.8%) genes were deleted in one or more isolates. Of the 402 genes deleted,

112 were genes defined as variable genes or regions of differences in B. pertussis.

Thus, the majority of genes deleted were core genes ranging from 147 for L1432 to

29 in L1756, both from cluster I.
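The deletion screen described above can be sketched as an interval problem: given the reference intervals covered by alignments of an assembly, report uncovered stretches of at least 300 bp and the genes that overlap them. This is a simplified illustration under assumed inputs, not the procedure actually run in this study; the function names, gene names and coordinates are hypothetical.

```python
def deleted_regions(ref_length, aligned, min_len=300):
    """Return reference intervals (start, end), 0-based and end-exclusive,
    of at least min_len bp that are not covered by any alignment."""
    gaps = []
    cursor = 0  # first reference position not yet covered
    for start, end in sorted(aligned):
        if start - cursor >= min_len:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if ref_length - cursor >= min_len:
        gaps.append((cursor, ref_length))
    return gaps


def genes_affected(genes, gaps):
    """Return names of genes that overlap any deleted region,
    i.e. genes that are partially or completely deleted."""
    return sorted(name for name, (g_start, g_end) in genes.items()
                  if any(g_start < end and start < g_end
                         for start, end in gaps))


# Toy example: a 5 kb reference with hypothetical alignment blocks.
aligned_blocks = [(0, 1000), (1200, 2500), (3000, 5000)]
toy_genes = {"geneA": (900, 1150), "geneB": (2400, 2600),
             "geneC": (2700, 2900), "geneD": (3100, 3400)}

gaps = deleted_regions(5000, aligned_blocks)
print(gaps)
print(genes_affected(toy_genes, gaps))
```

In this toy case the 200 bp gap between the first two blocks is ignored because it is below the 300 bp threshold, while the 500 bp gap is reported and flags one partially and one completely deleted gene.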

Table 4.4-11: The number and percentage of genes affected by partial or complete deletions in

each isolate

Isolate; Cluster; No. of genes deleted; Unique deletions for each isolate; Core genes*; Variable genes; Ratio core/variable genes

L580 V 213 40 145(68.08%) 68(31.92%) 2.1

L506 IV 172 48 115(66.86%) 57(33.14%) 2.0

L1204 UC^ 108 5 52(48.15%) 56(51.85%) 0.9

L706 III 115 24 73(63.48%) 42(36.52%) 1.7

L1415 UC^ 76 5 39(51.32%) 37(48.68%) 1.1

L1191 II 118 10 58(49.15%) 60(50.85%) 1.0

L1376 I 118 8 54(45.76%) 64(54.24%) 0.8

L1756 I 79 0 29(36.71%) 50(63.29%) 0.6

L1432 I 212 28 147(69.34%) 65(30.66%) 2.3

^= Unclustered

*core genes are stable genes found to be present in most B. pertussis isolates
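The percentages and ratios in Table 4.4-11 follow directly from the core and variable gene counts. A minimal sketch of the arithmetic, using three rows copied from the table (the function name `summarise` is hypothetical):

```python
def summarise(core, variable):
    """Return (total deleted genes, % core genes, core/variable ratio)."""
    total = core + variable
    return total, round(100 * core / total, 2), round(core / variable, 1)


# (core genes, variable genes) for three isolates from Table 4.4-11.
rows = {"L580": (145, 68), "L1756": (29, 50), "L1432": (147, 65)}

for isolate, (core, variable) in rows.items():
    total, pct_core, ratio = summarise(core, variable)
    print(f"{isolate}: {total} genes, {pct_core}% core, ratio {ratio}")
```

A ratio above 1 means that more core genes than variable genes were affected in that isolate, as seen for L580 and L1432, whereas L1756 lost mostly variable genes.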


Gene deletion events also followed the phylogenetic tree. Of the 402 genes, 44

genes were affected by partial or complete deletions in all isolates, including two

large loci of 24 kb (BP0910-BP0934) and 8.6 kb (BP1134-BP1142) (Figure 4.4-2).

Two genes, BP1080 and BP2087, both coding for transposases, were only

deleted in L580, L506 and L1204. Three genes, BP2587 coding for ribonucleotide

biosynthesis, BP3406 and BP3407, both coding for transposases were deleted in all

isolates except L1415 and cluster I strains. BP2577 which also encodes a

transposase was deleted in all except cluster I strains. One gene, BP1810, was

deleted in cluster I to cluster IV. It has been reported by others that BP1947 to

BP1968 have been deleted in all ptxP3 strains [285, 336, 337]. This was also found

in our study. However, five genes in this RD from BP1955-BP1960 were also

found to be deleted in other clusters.


Figure 4.4-2: Genes affected by partial or complete deletion in B. pertussis isolates analysed in

this study. Hybrid assemblies were blasted against B. pertussis Tohama I genome and regions with

300 bp or more deleted were detected and analysed. Two large loci were deleted in all isolates and

BP1947.

Loci labelled in the figure: BP0910-BP0934, BP1134-BP1142 and BP1948-BP1968.

L5

80

(V

, S

P2

7)

L7

06

(II

I, S

P1

9)

L1

19

1 (

II,

SP

37

)

L1

41

5 (

SP

11

)

L1

37

6 (

I, S

P1

3)

L1

75

6 (

I, S

P1

3)

L1

43

2 (

I, S

P1

6)

L5

06

(IV

, S

P3

0)

L1

20

4 (

SP

18

)

Chapter 4


Genes involved in pathogenicity and housekeeping functions such as molecular metabolism seem to be conserved, with only 4 to 6 deletions out of a total of 402 deleted genes. In contrast, phage-related or transposon-related genes (194), pseudogenes (59), cell surface genes (37), conserved hypothetical proteins (19) and genes of miscellaneous function (17) were the most frequently deleted. Four genes involved in pathogenicity were deleted, of which bfrH (BP1138) and bfrI (BP1962), each encoding a putative ferric siderophore receptor, were deleted in all clusters and in cluster I strains, respectively. Also of interest was the partial deletion of fhaL in L1706 and of bfrG, coding for a putative TonB-dependent receptor, in L580. The proportion of deleted genes in each functional category is shown in Figure 4.4-3 for each isolate. Phage-related or transposon-related genes make up the highest proportion in all isolates, ranging from 63% in L1432 to 32.9% in L1756. Pseudogenes, ranging from 17.8% in L1376 to 12% in L706, are the second most frequently deleted functional category.
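The pooled counts quoted above can be turned into shares of the 402 deleted genes with a few lines of arithmetic. The following is a minimal sketch, not the analysis used in the thesis; the dictionary and function names are illustrative, and only the five categories whose counts are stated in the text are included. These pooled shares are not the same quantity as the per-isolate percentages plotted in Figure 4.4-3.

```python
# Counts of deleted genes per functional category, as quoted in the text.
# Only categories with an explicit count are listed (names are illustrative).
deleted_by_category = {
    "phage-related or transposon-related": 194,
    "pseudogenes": 59,
    "cell surface": 37,
    "conserved hypothetical": 19,
    "miscellaneous": 17,
}
TOTAL_DELETED = 402  # total number of deleted genes reported in the text


def category_shares(counts, total):
    """Return each category's share of all deleted genes, in percent."""
    return {name: 100.0 * n / total for name, n in counts.items()}


shares = category_shares(deleted_by_category, TOTAL_DELETED)
for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:40s} {pct:5.1f}%")
```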


Figure 4.4-3: The proportion of each functional category for deleted genes

[Y-axis: category of genes deleted (%), 0% to 100%. X-axis: L580 (V, SP27), L506 (IV, SP30), L1204 (SP18), L706 (III, SP19), L1415 (SP11), L1191 (II, SP37), L1376 (I, SP13), L1756 (I, SP13), L1432 (I, SP16). Legend: adaptation; amino acid biosynthesis; cell processes; cell surface; central/intermediary metabolism; cofactor biosynthesis; conserved hypothetical; energy metabolism; macromolecule synthesis/modification; miscellaneous; pathogenicity; phage-related or transposon-related; pseudogenes; regulation; ribonucleotide biosynthesis; small molecule degradation; transport/binding proteins; unknown.]


The regions that were uniquely present in the B. pertussis strain CS were also

analysed for deletions. Three genes were deleted in all isolates including two

encoding a transposase (BPTD_0395 and BPTD_0407) and one encoding an

integrase, BPTD_0397. The deletion of BPTD_0395 and BPTD_0397 was also

reported by Xu et al. in isolates collected from China in 2007 and Finland in 1999,

respectively [302].

4.4.8 Genome rearrangement

The PacBio sequencing reads enabled the assembly of larger contigs and facilitated the identification of genome rearrangements. An overview of the rearrangements in each genome relative to the earlier diverged strain in the tree order is presented in Figure 4.4-4, with the top panel being Tohama I. Extensive rearrangements can be seen in the figure. To analyse the rearrangements in greater detail, contigs for each genome were aligned separately to the reference Tohama I genome using progressiveMauve in order to identify potential segments that were translocated and/or inverted. Only segments of 2 kb or greater were considered. A total of 151 translocation and/or inversion events were found in the nine isolates (Table 4.4-12), with an average of 17 events per genome and a range of 9 to 25 events. The three types of event occurred at similar frequencies: 58 inversions, 40 translocations, and 53 combined inversions and translocations. There were 17 translocation/inversion breakpoints shared by more than one isolate, which suggests that these are hotspots for genome rearrangement. Note that all of these events were detected in comparison to Tohama I. Therefore, some events may have occurred only once and then been maintained through vertical inheritance. Due to time constraints, such events were not investigated further, as no bioinformatics tools are available to identify them and manual examination is required.
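The filtering and counting step described above can be illustrated with a short sketch. It assumes each aligned segment has already been judged as inverted and/or translocated relative to Tohama I (for example, by inspecting the progressiveMauve output); the `Segment` class, its fields and the example coordinates are hypothetical and are not taken from the thesis. Only the 2 kb threshold and the three event categories come from the text.

```python
from collections import Counter
from dataclasses import dataclass

MIN_SEGMENT_BP = 2000  # only segments of 2 kb or greater are considered


@dataclass(frozen=True)
class Segment:
    """One aligned segment relative to the reference (hypothetical record)."""
    start: int          # 1-based start coordinate on the reference
    end: int            # inclusive end coordinate on the reference
    inverted: bool      # segment lies in reverse orientation
    translocated: bool  # segment lies at a different position

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def event_type(seg):
    """Name the rearrangement type, or None for a collinear segment."""
    if seg.inverted and seg.translocated:
        return "translocation & inversion"
    if seg.inverted:
        return "inversion"
    if seg.translocated:
        return "translocation"
    return None


def tally_events(segments, min_bp=MIN_SEGMENT_BP):
    """Count rearrangement events among segments that pass the size filter."""
    counts = Counter()
    for seg in segments:
        kind = event_type(seg)
        if kind is not None and seg.length >= min_bp:
            counts[kind] += 1
    return counts


# Made-up segments for illustration only.
example = [
    Segment(1, 5000, inverted=True, translocated=False),       # 5 kb inversion, kept
    Segment(10000, 11499, inverted=False, translocated=True),  # 1.5 kb, filtered out
    Segment(20000, 26000, inverted=True, translocated=True),   # kept, both types
    Segment(30000, 90000, inverted=False, translocated=False), # collinear, no event
]
print(tally_events(example))
```

Because every event is scored against Tohama I, a tally of this kind cannot by itself distinguish an event that arose once and was inherited from one that arose independently in several isolates.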


Figure 4.4-4. Pairwise genome comparison of B. pertussis isolates analysed in this study. Regions of homology between a pair of genomes are indicated by lines: red for the same direction and blue for the reverse direction (inversion). Translocation is apparent where the lines cross over.

[Panels, top to bottom: Tohama I (Cluster VI), L580 (Cluster V), L506 (Cluster IV), L1204 (SP18), L706 (Cluster III), L1191 (Cluster II), L1415 (SP11), L1376 (Cluster I), L1756 (Cluster I), L1432 (Cluster I).]


Table 4.4-12: Potential rearrangements in the B. pertussis isolates analysed in this study

Isolate  Cluster  No. of translocations  No. of inversions  No. of translocation & inversion  Total no. of events
L506     IV       7                      3                  4                                 14
L580     V        2                      5                  6                                 13
L706     III      4                      13                 8                                 25
L1191    II       3                      8                  11                                22
L1204    UC^      7                      9                  7                                 23
L1376    I        6                      5                  4                                 15
L1415    UC^      7                      6                  6                                 19
L1432    I        0                      6                  3                                 9
L1756    I        4                      3                  4                                 11
Total             40                     58                 53                                151

^UC: unclustered
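The totals, mean and range quoted in the text can be checked directly against the rows of Table 4.4-12. This is a small verification sketch using the table's values; the variable names are illustrative.

```python
from statistics import mean

# (isolate, translocations, inversions, translocation & inversion),
# copied from Table 4.4-12.
rows = [
    ("L506", 7, 3, 4),
    ("L580", 2, 5, 6),
    ("L706", 4, 13, 8),
    ("L1191", 3, 8, 11),
    ("L1204", 7, 9, 7),
    ("L1376", 6, 5, 4),
    ("L1415", 7, 6, 6),
    ("L1432", 0, 6, 3),
    ("L1756", 4, 3, 4),
]

# Events per isolate (row sums) and per event type (column sums).
per_isolate = {name: t + i + b for name, t, i, b in rows}
column_totals = [sum(col) for col in zip(*(r[1:] for r in rows))]
grand_total = sum(per_isolate.values())
mean_events = mean(list(per_isolate.values()))

print("Column totals (translocation, inversion, both):", column_totals)
print("Grand total:", grand_total)
print("Mean events per genome:", round(mean_events, 1))
print("Range:", min(per_isolate.values()), "to", max(per_isolate.values()))
```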


4.5 Discussion

In Australia, only WCV was used prior to 1997. After a transition period of 3 years (1997-1999), during which both WCV and ACV were used and ACV was initially used for boosters only, WCV was replaced completely by ACV. Australian B. pertussis isolates have been grouped into five major clusters, I to V, of which cluster I strains, known as ptxP3 strains, are currently predominant [259, 306, 308]. In this study, two platforms for whole genome sequencing, PacBio and Illumina, were used to study the genomic diversity and evolution of the major Australian clones. Three main forces generate genomic diversity: gene gain, gene loss and gene changes due to small indels or SNPs, all of which have been observed to be important in the evolution of B. pertussis strains. In this study, WGS was used to reveal the genomic diversity between clusters and the relationship of the SNP clusters, as shown in Figure 4.4-1.

Phylogenetic analysis based on 426 SNPs confirmed the previous findings of Octavia et al. [306] and separated the isolates into six major clusters, with ptxP3 strains in cluster I defined by 16 unique SNPs. Bart et al. compared whole genome sequences of globally circulating strains [41]. Of the 16 cluster I-specific SNPs, 14 were also reported in their study, which supports the hypothesis that ptxP3 strains have spread globally in recent years [282, 283, 306, 308, 348, 357]. In their study, two ptxP3-specific SNPs located in fim3 and prpB were reported [283]. While the mutation in fim3 has also been reported by Octavia et al. for cluster I isolates belonging to SNP profiles SP14 and SP16 [306], the mutation in prpB was found only in the SP16 isolate L1432.

A recent comparative genomic study of 343 B. pertussis isolates collected worldwide showed that some of these SNPs were not identified in all ptxP3 strains, and that some of the SNPs identified in our study, including those located in BP0194, BP0507, BP1471, BP2249 and BP3787, can be found in non-ptxP3 strains including ptxP1, ptxP5 and ptxP18 [41, 358]. The only SNP found to be specific for all ptxP3 strains globally was the one located in the pseudogene BP0292. Two intergenic SNPs were found in this study for the first time in ptxP3 strains. These two SNPs might be specific to Australian cluster I, since we also found them in all SP13 isolates in Chapter 3. The first intergenic SNP was at position 1,077,844 in B. pertussis Tohama I, between BP1035, coding for a transposase for IS1663, and BPt13; the second was located in the promoter region of BP1711, coding for an IS481 transposase.

B. pertussis strain Tohama I was originally isolated in Japan in the 1950s and was widely used as the vaccine strain in many countries. It has also been used as a reference strain in many in vivo and in vitro comparison studies and in genomic and proteomic investigations. However, at least two large fragments found in B. pertussis strain CS, a Chinese pertussis vaccine strain, comprising 21 genes (BPTD_0387 to BPTD_0407) and 18 genes (BPTD_2835 to BPTD_2852), are not present in Tohama I [250]. These regions are present in B. parapertussis and B. bronchiseptica as RD11, RD13 and RD14 [248]. It was suggested that these RDs were deleted in Tohama I. Therefore, we investigated the polymorphisms in the RDs missing from Tohama I. Only two of the 10 SNPs found were also reported by a recent study investigating the genomic variation in 40 B. pertussis isolates collected from 1960 to 2010 in the Netherlands, Finland and China using B. pertussis strain CS as the reference genome [302]. The first SNP was located in BPTD_0394 and was found in all clusters except cluster V (L580), suggesting that this SNP arose quite early but after cluster V diverged. The SNP located in BPTD_0398 was found in all our isolates and was also reported for all 40 B. pertussis isolates, suggesting that this SNP is unique to B. pertussis strain CS [302].

Figure 4.5-1: Genomic diversity in the different clusters based on the results of SNP typing, shown as a minimum evolution tree together with deleted and new IS elements, indels and deleted regions. Details for each isolate, including name, year of isolation, cluster and SNP profile, are based on the results of Octavia et al. [306].

[Figure labels. Rows (isolate; year, cluster; SNP profile): L1432 (2010, I; 16), L1756 (2011, I; 13), L1376 (2010, I; 13), L1191 (2009, II; 37), L1415 (2009, UC; 11), L706 (2002, III; 19), L1204 (2009, UC; 18), L506 (2004, IV; 30), L580 (2007, V; 27), Tohama I (1954, VI; 36). Legend: Deleted IS, New IS, Indels, Deleted Loci. Column labels, in order: BP0202, BP0210, BP0839, BP0910, BP1048, BP1080, BP1134, BP1142, BP1717, BP1809, BP1810, BP2087, BP2577, BP3186, BP3406, BP3407, BP3408, BP3509, BP3519, BP3520, BP0551-IG, BP0976-IG, BP1054, BP2610, BP2839-IG, BP0364, BP0880, BP1037, BP1123, BP1351, BP1386, BP1386, BP2000, BP2232, BP2899, BP2928, BP2946, BP3090, BP3224, BP3762, BP0967, BP1624, BP2141, BP3258, BP3580, BP1045P, BP1119, BP0612P, intergenic, BP3642P, BP0910A-BP0934, BP1134-BP1142, BP3510, BP1948-BP1954, BP1961-BP1968.]

Gene loss is a form of genetic diversity within clonal pathogenic bacterial species such as Mycobacterium tuberculosis and Yersinia pestis [359]. As reported by others, no gene acquisition was found in this study, and currently circulating B. pertussis strains have smaller genomes than older strains [285, 332]. Initial studies found no particular patterns in gene reduction and could not draw any relationship between gene loss and changes to antigens. Brinig et al., who studied genome content variation in a broad range of B. pertussis isolates from different countries, found no particular patterns [360]. In later studies on currently circulating B. pertussis strains, a correlation was found between gene loss and specific PFGE profiles, SNP type clusters and antigen alleles [285, 332, 337]. It was revealed that the deletion of BP1948 to BP1966 can be linked to the same RD loss in Finnish isolates collected since 1999 that belonged to PFGE group IVβ [337]. Two studies by King et al., in 2008 and 2010, likewise showed that the loss of BP1946 to BP1966 was correlated with the current epidemic strains known as ptxP3 strains [336] and concluded that the genome content of B. pertussis was correlated with ptxP type [332]. Further analysis of Australian B. pertussis isolates collected before and after ACV introduction in 1997 showed that the same region was deleted in cluster I strains, which also possess the same ptxP3 allele as currently circulating strains worldwide [285]. However, our study showed that, even within this region, some genes (BP1955-BP1960) were deleted in other clusters as well. It seems that initially only five genes were deleted and that, during recent evolution, a larger region surrounding the initial deletion was also deleted in current strains. Except for the deletion of BP1947 to BP1966, which follows the phylogenetic tree, other deletions showed no particular pattern and, even within cluster I strains, the deletions occurred independently. The effects of these deletions on phenotypic variation or pathogenicity between different strains need to be investigated.

It has been suggested that transposons are key elements in rapid adaptation, affecting genome structure, organisation and function, especially in populations with low genetic diversity [361]. IS elements may lead to functional changes in clinical isolates in terms of genotype and phenotype and participate in genome streamlining or trimming by facilitating DNA deletions [362]. The number of ISs in currently circulating strains varies between isolates [287], although a recent study [289] showed that the IS481 copy number remains unchanged in recent B. pertussis isolates. Nevertheless, the progressive deletion of transposases in all clusters and some new IS insertions suggest that IS elements and transposases play an important role in the genomic diversity of B. pertussis strains.

Another novel phenomenon documented in this study was the effect of indels on the function of affected genes. Twenty-four indels followed the phylogenetic relationship, six reverted pseudogenes to functional genes and another six were located in promoters, including the fim2 promoter. Pseudogenes make up 9% (344 genes) of the B. pertussis Tohama I genome content, and our findings showed that 59 (17%) of them were deleted in one or more isolates, of which 19 were transposases. Frameshift indels converted 14 functional genes to pseudogenes and reverted six pseudogenes to functional genes, including the Bvg-regulated gene bapC. This finding suggests that the deletion of pseudogenes in B. pertussis strains occurs much faster than the production of functional genes, and that frameshift indels may facilitate the deletion of functional genes by converting them to pseudogenes.

Another mechanism which can drive the genetic diversity of B. pertussis is rearrangement. While specific regions were not fully investigated in this study, preliminary analyses showed that B. pertussis genomes were dynamic, with apparent genome rearrangements through either inversion or translocation. For most isolates analysed in this study, the numbers of inversion and translocation events were similar. Genome rearrangements have been shown to be important in other human-specific pathogens such as Salmonella enterica serovar Typhi and Yersinia pestis [363, 364]. It was hypothesised that the restricted lifestyle of host-specific pathogens contributes to frequent chromosomal rearrangements [365]. The genome rearrangements in S. Typhi were driven by rRNA operons, whereas in Y. pestis the rearrangements were likely to be driven by transposases or IS elements. The latter also appears to be the case with B. pertussis.

In the future, it would be interesting to determine if the potential genetic rearrangements identified in this study were random or phylogenetically constrained.

4.6 Conclusion

Genome sequencing is a powerful tool to identify both large and small changes in the genome, which may help explain the genetic basis for variation between different epidemiological strains. In our study, two whole genome sequencing platforms, Illumina and PacBio, were used to study the micro- and long-term evolution of B. pertussis isolates belonging to different clusters. Point mutations, IS element insertions, indels and gene reduction were found to be the main mechanisms driving B. pertussis evolution [283, 285, 332, 360]. Comparative genomic analyses of B. pertussis clinical isolates provided new information on the evolutionary trends in the major Australian B. pertussis clones and on the effects of IS elements, indels (particularly frameshift indels) and gene loss.

Of a total of 426 SNPs, 16 were unique to cluster I strains. Apart from the SNP located in the promoter of ptx, two more SNPs were located in the virulence-associated genes bcsI (BP2249) and ptxC (BP3787). Although some of these SNPs were reported in non-ptxP3 strains, it seems that in Australia all ptxP3 strains expanded from a single clone.

It was also demonstrated in this study that since B. pertussis Tohama I is not a

representative of currently circulating B. pertussis strains, using other B. pertussis

strains as a reference can be useful to gain an insight into the regions that are not

present in Tohama I. Our findings also suggested that transposases and pseudogenes

are two important factors that mediate genomic diversity within B. pertussis strains.


Chapter 5. Fitness of Pertactin negative Bordetella pertussis in a

mixed infection model

5.1 Introduction

Genotyping studies using single nucleotide polymorphisms (SNPs) as molecular

markers have shown that the predominant strains currently circulating in Australia

belonged to cluster I, which is associated with the ptxP3 allele [306, 366]. In

Australia, a prolonged pertussis epidemic occurred in different regions from 2008

to 2012 [219, 319]. We demonstrated that isolates that did not produce Prn

emerged and expanded during the 2008-2012 epidemic [311], providing evidence

of vaccine-driven evolution. Prn negative isolates have also been reported in many

countries with high vaccination coverage including the European Union, Japan, the

USA and Canada [290, 291, 293, 297, 309]. Martin et al. [297] reported that

vaccinated individuals have a 2-fold higher probability of infection by Prn negative

B. pertussis, suggesting that Prn negative B. pertussis strains have an advantage in

the immunised host compared with Prn positive strains.

Prn is an autotransporter protein [65] that promotes adhesion to tracheal epithelial

cells [48]. A recent study by van Gent et al. showed that the prn knock-out mutant

had a decreased ability to colonise the trachea and lungs in mice, but

complementation of the prn knock-out mutant restored this ability [317]. In B.

bronchiseptica, Prn also plays a role in resistance to neutrophil-mediated clearance,

promoting persistence in the lower respiratory tract [54]. The emergence of B.

pertussis strains with inactivation of Prn raises questions of the effect of this on

virulence and the possibility that the presence of other virulence factors may have

rendered Prn dispensable under selection pressure.

5.2 Aims and motivation

In Chapter 3, the microevolution of predominant SP13 isolates from the 2008-2012 pertussis epidemic, including Prn negative isolates, was investigated, and the results revealed that the Prn negative isolates mostly grouped together according to the prn disruption mechanism. Since epidemiological data showed that, within cluster I, the population of Prn negative B. pertussis strains has increased dramatically in recent years, we hypothesised that Prn negative B. pertussis could survive better in hosts immunised with the ACV. Therefore, the aims of this study were:

a) To establish a competition assay for B. pertussis infection in the mouse

model to investigate the effect of ACV vaccination on the fitness of Prn

negative strains.

b) To study the efficacy of ACV on the bacterial clearance of recently

collected isolates.


5.3 Material and methods

5.3.1 B. pertussis clinical strains

One Prn negative isolate (L1756) and one Prn positive isolate (L1423), both isolated from patients in Australia during the 2008-2012 pertussis epidemic, were selected for this study. The non-production of Prn in L1756 was due to an IS481 insertion in the prn gene [311], and our genome data showed that the two isolates differ by 18 SNPs (Appendix 1). L1423 and L1756 were isolated in 2010 and 2011, respectively. Both isolates shared the same genotype: SP13, prn2, fim3-1 and ptxP3. Both expressed Fha and Ptx, as shown previously for all Australian isolates [311].

5.3.2 in vitro growth curve determination

Isolates were grown on Bordet-Gengou (BG) agar (Becton Dickinson) supplemented with 7% defibrinated horse blood and glycerol for 5 days at 37°C. A loopful of Bvg+ (haemolytic) colonies with homogeneous morphology from each isolate was inoculated into 20 ml of Stainer-Scholte (SS) liquid medium supplemented with 1% Heptakis ((2,3,6-tri-O-methyl)-β-cyclodextrin) and SS supplement (1x), and incubated at 37°C with shaking (180 rpm). After overnight incubation, the OD600 was adjusted to 0.05, samples were taken at 12 h intervals for 48 h, and colony forming units (CFU) were estimated by plating samples diluted in 1x phosphate-buffered saline (PBS) on BG agar. The growth curve determination was done in triplicate.
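The thesis does not state how doubling time was derived from the CFU counts. One common approach, sketched here as an assumption, is to fit a straight line to ln(CFU) against time over the exponential phase and take ln(2) divided by the slope. The function name and the example data are hypothetical.

```python
import math


def doubling_time(times_h, cfu_per_ml):
    """Estimate doubling time (hours) from CFU counts in exponential phase.

    Fits ln(CFU) = a + k * t by ordinary least squares and returns ln(2) / k.
    """
    if len(times_h) != len(cfu_per_ml) or len(times_h) < 2:
        raise ValueError("need at least two paired time/CFU observations")
    logs = [math.log(c) for c in cfu_per_ml]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    growth_rate = sxy / sxx  # k, per hour
    return math.log(2) / growth_rate


# Synthetic culture that doubles exactly every 6 h, sampled every 12 h
# as in the protocol above (values are made up for illustration).
times = [0, 12, 24, 36, 48]
counts = [1e6 * 2 ** (t / 6) for t in times]
print(round(doubling_time(times, counts), 2))
```

Real cultures leave exponential phase well before 48 h, so in practice only the time points on the linear part of the ln(CFU) curve would be passed to such a function.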

5.3.3 The mouse model of B. pertussis infection

All procedures involving animals were conducted under the University of New South Wales Animal Care and Ethics Committee approval number 12/137A. Bacterial colonisation and clearance experiments were conducted using 3-4 week old naïve or vaccinated female BALB/c mice. Briefly, groups of 3 mice (vaccinated mice) were injected subcutaneously with 1/50th of a human dose of the commercial ACV Infanrix Penta® HiB (GlaxoSmithKline Biologicals), containing 25 µg Ptx, 25 µg FHA and 8 µg Prn per human dose, and a second dose was administered 2 weeks later. A 1/50th human dose was also used in the study by de Gouw et al. [127]. Control animals were injected with 1x PBS. Two weeks after the last immunisation, mice were challenged with a 2-5 x 10^6 CFU mixture of Prn positive and Prn negative isolates (L1423 and L1756, respectively) in a 1:1 ratio under ketamine (200 mg/kg)/xylazine (100 mg/kg) sedation. Groups of 3 mice were sacrificed at day 0 (2 hours after bacterial challenge) and at 3, 7, 14 and 21 days post-infection. Trachea and lungs were collected aseptically and homogenised separately in 1.6 mL and 600 mL SS broth, respectively. Serial 10-fold dilutions of the homogenised trachea and lungs were grown on charcoal blood agar containing cephalexin (Oxoid) for CFU determination.

5.3.4 Differentiation of the two B. pertussis isolates in the mixed infection in

lungs and trachea

A total of 700 µL each of homogenised lungs and trachea was used for DNA

extraction using Accuprep® Genomic DNA extraction Kit (Bioneer). To determine

the proportion of the two isolates from the mixture, we used targeted PCR and

Illumina deep sequencing to detect 2 SNPs that differentiated the two isolates.

Nested PCR was used to amplify the targeted regions using outer primers covering

the SNP site to increase PCR specificity. The inner primers were designed to

include one SNP for each isolate and a 50 bp custom Illumina adaptor sequence

was added at the end of both forward and reverse primers (Table 5.3-1). The final

targeted PCR products were ~ 500 bp and the SNP was located in the middle of the

PCR fragments. Targeted PCR products were purified and mixed for sequencing

using Illumina MiSeq 250 bp single end reads. Reads were mapped against B.

pertussis Tohama I genome using Burrows-Wheeler Alignment (BWA) tools

(version 0.7.5) [327]. The number of reads for each unique SNP was extracted and

used to calculate the proportion of Prn negative and Prn positive isolates.
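The final step, turning per-SNP read counts into a proportion of the Prn negative isolate, can be sketched as follows. This is an illustration under stated assumptions rather than the script used in the thesis: it assumes the reads at each of the two SNP sites have already been assigned to L1756 (Prn negative) or L1423 (Prn positive), and it averages the two per-site fractions, in line with the "average percentage ... for each set of primers" described in the statistical analysis. Function names and read counts are hypothetical.

```python
def prn_negative_fraction(neg_reads, pos_reads):
    """Fraction of reads at one SNP site that support the Prn negative isolate."""
    total = neg_reads + pos_reads
    if total == 0:
        raise ValueError("no reads cover this SNP site")
    return neg_reads / total


def mean_prn_negative_percent(site_counts):
    """Average the per-site fractions and express the result as a percentage.

    site_counts: one (neg_reads, pos_reads) pair per primer set.
    """
    fractions = [prn_negative_fraction(n, p) for n, p in site_counts]
    return 100.0 * sum(fractions) / len(fractions)


# Made-up read counts for the two SNP sites (BP3333 and BP2120 amplicons).
sample = [(900, 100), (940, 60)]
print(round(mean_prn_negative_percent(sample), 1))
```

Averaging the two sites gives each primer set equal weight regardless of read depth; pooling the reads before dividing would instead weight the deeper amplicon more heavily.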


5.3.5 Statistical analysis

The significance of ACV-induced bacterial clearance from lungs and trachea was determined by t-test at each time point, comparing the log10 (CFU/lungs or trachea) of the control group of mice with that of the immunised group. Significant differences in the proportion of the Prn negative isolate in the lungs and trachea of immunised mice relative to the control group were determined by t-test of the average percentage of the Prn negative isolate, calculated from the number of reads for each set of primers. P < 0.05 was considered statistically significant. Area under the curve (AUC) was used to determine differences in clearance and was analysed using GraphPad Prism version 6.04 for Windows (GraphPad Software, La Jolla, California, USA).
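For readers without access to GraphPad Prism, the AUC of a clearance curve can be approximated with the trapezoidal rule. The sketch below is an assumption about the general approach and does not reproduce Prism's exact options (such as baseline handling); the log10 CFU values are invented for illustration, while the sampling days follow the protocol above.

```python
def trapezoid_auc(days, values):
    """Area under a curve sampled at the given days, by the trapezoidal rule."""
    if len(days) != len(values) or len(days) < 2:
        raise ValueError("need at least two paired day/value observations")
    area = 0.0
    for i in range(1, len(days)):
        width = days[i] - days[i - 1]
        area += width * (values[i] + values[i - 1]) / 2.0
    return area


# Sampling days from the protocol; log10 CFU values are hypothetical.
days = [0, 3, 7, 14, 21]
control_log_cfu = [6.0, 6.0, 5.0, 4.0, 2.0]
immunised_log_cfu = [6.0, 5.0, 3.0, 1.0, 0.0]

print("Control AUC:  ", trapezoid_auc(days, control_log_cfu))
print("Immunised AUC:", trapezoid_auc(days, immunised_log_cfu))
```

A lower AUC for the immunised group corresponds to faster clearance over the 21-day period; comparing groups statistically would require one AUC per animal or a pooled variance estimate, which this sketch does not attempt.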


Table 5.3-1: Primers designed for this study.

Isolate  SNP in genome  Gene ID  Protein product        SNP    SNP in gene
L1423    3555632        BP3333   Pyruvate kinase        T...C  984
L1756    2244138        BP2120   Glutathione reductase  T…C    988
L1756                   BP1054   Pertactin              IS481

Isolate  Gene ID  Primer type    PCR size (bp)  Illumina adapter (5'-3')            Primer sequence (5'-3')
L1423    BP3333   Nested-For.    977            -                                   CGTTCACATTCCAAGGAG
L1423    BP3333   Nested-Rev.                   -                                   GCGAAGGGCACCGTGTAG
L1423    BP3333   Illumina-For.  352            TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG   CCTGGGCGTGGAAGTGGG
L1423    BP3333   Illumina-Rev.                 GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG  GCGATGGTGCGAATGCGT
L1756    BP2120   Nested-For.    1049           -                                   AACGACGCCTTCTTCCTG
L1756    BP2120   Nested-Rev.                   -                                   CCGCATCCCCTACCAGCC
L1756    BP2120   Illumina-For.  453            TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG   AAACCGATTGCGTCTTCT
L1756    BP2120   Illumina-Rev.                 GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG  GGGTCTTTTCCTGGCTCT
L1756    BP1054   AF-For (a)     1800           -                                   GCCAATGTCACGGTCCAA
L1756    BP1054   PRN8-Rev                      -                                   AGGGTAAAGGTCGCCGCGCT

a: Fry et al. (2001) [263]


5.4 Results

5.4.1 in vitro growth rate of the isolates used in this study

As intrinsic differences in growth rate may affect the in vivo competition outcome, the doubling time was estimated separately for the Prn negative isolate (L1756) and the Prn positive isolate (L1423) based on CFU counts from 3 separate experiments. The doubling times of L1756 and L1423 were 5.64 ± 0.51 h and 6.55 ± 0.43 h, respectively, a difference that was not statistically significant (P = 0.79).

Figure 5.4-1: Growth curves of the Prn positive and Prn negative B. pertussis isolates from cluster I, measured by optical density. Doubling time was also estimated from the CFU counts and no significant difference was observed. Error bars are shown; this variation may affect the doubling time calculation.

[Axes: x, Time (h): 0, 12, 24, 36, 48; y, OD600: 0.0 to 2.0. Series: L1423 (Prn positive), L1756 (Prn negative).]


5.4.2 Bacterial clearance in immunised mice infected with Prn positive and

negative isolates

Groups of three mice for each time point were infected with a mix of L1756 and

L1423 in 1:1 ratio. Bacterial clearance was determined at 2 h, 3, 7, 14 and 21 days

post-infection using CFU counts. For both lungs and trachea, the clearance was

faster in immunised mice than naïve mice (Figure 5.4-2 A and B). In mice

examined two weeks after the second vaccine dose, the difference in clearance from

the lungs between immunised mice and naïve mice reached statistical significance

(P = 0.0012). In tracheal tissue, there was no time point at which we could

demonstrate a statistically significant difference. However, when we calculated the

AUC for colonisation at different time points, the AUC of immunised mice was

significantly lower in both lungs (P = 0.0034) and trachea (P = 0.02) suggesting

faster clearance (Figure 5.4-2C).


Figure 5.4-2: Colonisation of Bordetella pertussis in naïve and ACV immunised mice infected with a mixture of Prn negative and Prn positive B. pertussis. Log10 CFU was calculated based on the number of colonies in groups of 3 mice for each time point. A) Lungs, B) Trachea and C) Area under the curve for lungs (P = 0.0034) and trachea (P = 0.02) of naïve and ACV immunised mice. * denotes a significant difference (P < 0.05) in bacterial clearance.

[Panels A and B: x-axis, days post infection (3, 7, 14, 21); y-axis, Log10(CFU/lungs) and Log10(CFU/trachea), 0 to 8; series: Immunised, Control. Panel C: y-axis, area under curve, 0 to 150; groups: Lungs, Trachea; series: Immunised, Control.]


5.4.3 Competitive fitness of Prn negative B. pertussis in the mixed infection in

vivo study

The proportions of the Prn negative and Prn positive isolates in the mixed infection were determined at different time points. PCR products covering the two SNP sites were amplified from the total DNA extracted from the lungs and trachea, then sequenced on the Illumina platform, and the reads supporting each SNP were counted.

As shown in Figure 5.4-3A, the proportion of the Prn negative isolate, L1756,

significantly increased in immunised mice from day 3 (P < 0.05) and reached 93%

by day 14. In contrast, the Prn producing isolate, L1423, in the control group

outgrew the Prn negative isolate, L1756, by day 3 (P = 0.003) and increased to 90%

by day 14 (P < 0.05). In the trachea (Figure 5.4-3B), the same trend was observed.

For the immunised mice, a statistically significant difference in proportional

colonisation was first observed at day 3 and by day 14, the Prn negative isolate

predominated at 92%. In contrast, statistically significant differences for the control

group were first observed at day 3 and the Prn positive isolate constituted 74% of

the total by day 14. We did not determine the proportion of the two isolates for the

day 21 time point as the PCR amplified only a weak product, likely due to little

bacterial DNA in the total DNA templates as a result of bacterial clearance. As

shown above, we only recovered a small number of CFUs from the lungs.


Figure 5.4-3: The proportion of the Prn negative isolate in A) Lungs and B) Trachea of immunised and control mice at different time points. Significant differences were found at all time points post-infection (P < 0.05).

[Figure 5.4-3 panels: A) lungs and B) trachea, each plotting % of Prn negative isolate (0 to 100) against days post infection (3, 7, 14) for control and immunised mice.]


5.5 Discussion

Increasing numbers of Prn negative strains were identified in the 2008-2012

pertussis epidemic in Australia [311]. In this study we investigated the effect of

ACV in a mouse model by examining the comparative fitness of the Prn negative

strain in immunised and unimmunised mice. We carried out a competition assay by inducing a mixed infection in mice with Prn negative and Prn positive isolates. The assay demonstrated that the Prn negative isolate was relatively fitter in ACV-immunised mice, whereas the Prn positive isolate was fitter in unimmunised mice. Our results are consistent with the hypothesis that there is a selective advantage for the loss of Prn in hosts immunised with ACVs containing Prn. Some data from efficacy studies have suggested that ACVs containing Prn were superior to those without Prn [367]

but there is no consensus regarding the contribution of individual antigens in ACVs

to protection [368]. Interestingly, in the unimmunised host, the Prn negative isolate

was less fit than the Prn positive isolate, suggesting that Prn plays an important role

in pathogenesis in the unimmunised host but could be dispensable under vaccine

selection pressure. Prn has been shown to facilitate adhesion and resistance of B.

pertussis to neutrophil-mediated clearance [54, 81]. Previous studies have indicated

that Prn knockouts were less efficient in colonising the mouse lungs [317]. Our

results confirmed these observations.

Our findings of higher fitness of Prn negative isolate in the immunised mouse are

consistent with those of Hegerle et al. [316] who found that Prn negative strains are

fitter in immunised mice using B. pertussis strains isolated in France. However, our

study design differed in several respects. First, we used a much smaller dose of vaccine (1/50th of a human dose versus 1/4th of a human dose) and second, we employed a mixed infection model, as opposed to one strain only. We also

examined the competitive fitness in tracheal tissue, which has not been previously

explored.

It should be noted that the isolates used in this study were clinical isolates that are

not isogenic and contain polymorphisms at other loci that may affect the fitness of

the isolates. We have sequenced genomes of these isolates and identified 18 SNP


differences and other genetic changes (Safarchi et al. unpublished data). However,

the effects found were consistent with the previously described roles of Prn. We

also sequenced a colony of each isolate recovered from immunised mice after 21

days following infection and found no genomic variations in either isolate during

the infection (data not shown).

Comparison of B. pertussis strains in the mouse model has mostly been conducted

separately rather than as a mixed infection. This was because typing colonies to

distinguish strains in mixed infection is laborious and the accuracy of

quantification is also affected by the number of colonies typed. Polak et al. in 2015

[369] performed a mixed infection assay in mice to study the fitness of clinical B.

pertussis isolates currently circulating in Poland where WCV was used. They used

pulsed field gel electrophoresis (PFGE) to type up to 30 colonies at each time point

to differentiate the isolates with different PFGE profiles [369]. In this study, we

took advantage of the next generation sequencing and SNP typing to distinguish

isolates used for infection. The mixed infection model is likely to be more

reproducible and offers an important advantage of eliminating variations in the

inoculation of different mice.

We used 1/50th of a human vaccine dose as proposed by de Gouw et al. [127], while most studies used 1/4th of a human dose [314, 316, 370]. The smaller dosage seems to have affected the speed of bacterial clearance. In the lungs, we observed a slight increase in colonisation at day 3, after which CFU declined from day 7 with significant reductions at day 14, while the studies using 1/4th of a human dose generally showed that the CFU declined much sooner [314, 316, 370]. A slower rate of clearance is

likely to better represent vaccine-induced immune responses in the context of

natural infection and enable higher sensitivity for detecting differences in clearance

between isolates.

In this study, we also examined bacterial colonisation of the trachea in ACV-

immunised mice which showed a different pattern of clearance to that of the lungs.

In the trachea, bacterial colonisation started to decline from day 3 post-infection in

immunised mice without any increase in growth, unlike colonisation of the lungs in

immunised mice, where growth occurred and peaked at day 3. However, in the


unimmunised group, there was an increase in CFU count until day 7 and a decline

thereafter.

In conclusion, the Prn negative B. pertussis displayed significantly higher fitness

than Prn producing strains in the mixed infection model in terms of their ability to

colonise the lungs and trachea of mice immunised with ACV. This observation

provides an explanation for the emergence and expansion of Prn negative strains

under ACV induced immunity. Our findings also indicate that the clearance of B.

pertussis from mouse respiratory tract is enhanced even by a much lower dose of

ACV than previously used in other mouse model studies [314, 316, 370]. In future

studies, it would be interesting to determine the relative fitness of Prn negative

strains in the WCV environment pertinent to countries where WCV is used.


Chapter 6. The differential fitness of Bordetella pertussis belonging

to two major clusters in in vivo competition assay

6.1 Introduction

Evidence for the adaptive evolution of B. pertussis has been observed in the

polymorphisms of genes which encode the antigens used in the vaccine. These

genes include the prn gene, which has 13 alleles, and the ptx promoter (ptxP), with 17 alleles. The currently circulating strains contain alleles that differ from those carried

by strain(s) used in the whole cell vaccine (WCV) or acellular vaccine (ACV).

Vaccination strains carry the ptxA2/ptxP1/prn1 alleles [230, 254]. The allelic shift

from prn1 to prn2 or prn3 and ptxP1 to ptxP3 was first reported in the 1980s [256,

281]. ptxP3 strains currently predominate in Europe and North America where the

ACV has been widely used for immunisation [159, 260, 273].

Previously, Octavia et al. characterised 208 Australian B. pertussis isolates

collected since the 1970s by SNP typing and typing of genes encoding antigens in

the ACV (prn, ptxA, fim2, fim3 and fhaB) to classify them into four major SNP

clusters. The study found that SNP cluster I was predominantly isolated during the ACV period while SNP cluster II was mainly isolated during the WCV period. SNP cluster II strains were found to carry the vaccine-type ptxP1 allele together with prn1 or prn3, whereas SNP cluster I strains were found to have switched to the non-vaccine ptxP3 and prn2 alleles. Cluster I strains in Australia have the same genotype as ptxP3 strains found in other countries [259, 308].

It was shown that Prn type-specific antibodies were produced in immunised or

infected individuals [266]. Since the Prn type in ACV is Prn1, the antibodies

induced during immunisation might not be as protective against Prn2 or Prn3

producing strains.


6.2 Aims and motivation

We have previously documented an increase in the prevalence of cluster I (ptxP3,

prn2) strains and a decrease in cluster II (ptxP1, prn3) strains in Australia after the

introduction of the ACV [306]. Cluster I started to expand during the first few

years after the introduction of ACV in Australia and during the latest epidemic,

they accounted for the majority of strains isolated with only a small number from

cluster II [259, 307].

We hypothesised that cluster I strains survive better in hosts immunised with the ACV. Therefore, the aims of this study were to:

a. Determine the in vitro behaviour of the different clusters in relation to their growth and doubling times in individual experiments.

b. Investigate the effect of the ACV on the fitness of cluster I strains in the mouse model using a mixed infection of recently isolated cluster I (ptxP3/prn2) and II (ptxP1/prn3) B. pertussis strains from Australia.


6.3 Materials and Methods

6.3.1 B. pertussis clinical strains

Two isolates with different antigenic profiles from cluster I (L1423: fim3-1, prn2,

ptxA1, ptxP3) and cluster II (L1191: fim3-1, prn3, ptxA1, ptxP1) isolated in the

2008-2012 Australian pertussis epidemic were selected. Our genome data showed

that these two isolates differed by 61 SNPs (Appendix, Tables A1 and A2).

6.3.2 in vitro growth curve determination

L1423 and L1191 were grown on Bordet-Gengou (BG) agar (Becton Dickinson)

supplemented with 7% defibrinated horse blood at 37°C for 5 days. A loopful of

Bvg+ (haemolytic) colonies from each isolate was then inoculated into 20 ml of

Stainer-Scholte (SS) broth for overnight culture. The broth was supplemented with

1% Heptakis ((2,3,6-tri-O-methyl)-β-cyclodextrin) and SS supplement, then

incubated at 37°C with shaking (180 rpm). The OD600 was then adjusted to 0.05 and

samples were taken at 12 hr intervals for 48 hr. CFU counts were also performed by plating samples diluted with 1x phosphate buffered saline (PBS) on BG agar. The experiment was performed in triplicate.
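The doubling time reported later in this chapter can be derived from the growth curve in more than one way, and the text does not state which was used. The following is a minimal sketch of one common approach, assuming exponential growth over the sampled window: fit a straight line to ln(CFU) against time and divide ln(2) by the slope. The function name and the CFU values are hypothetical and are not data from this study.

```python
import math

def doubling_time(times_hr, counts):
    """Least-squares fit of ln(count) against time; returns ln(2) / slope in hours."""
    logs = [math.log(c) for c in counts]
    n = len(times_hr)
    mean_t = sum(times_hr) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_hr)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_hr, logs))
    slope = sxy / sxx  # growth rate per hour
    return math.log(2) / slope

# Hypothetical CFU/ml values that quadruple every 12 hr,
# which corresponds to a doubling time of 6 hr.
times = [0, 12, 24]
cfu = [1e6, 4e6, 1.6e7]
print(round(doubling_time(times, cfu), 2))
```

Only time points within the exponential phase should be passed to such a fit; including lag or stationary phase samples would inflate the estimate.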

6.3.3 The mouse model of B. pertussis infection

Groups of 3 mice (3-4 weeks old) were subcutaneously injected with 1/50th of a human dose of the commercial ACV Infanrix Penta® HiB (GlaxoSmithKline Biologicals) containing 25 µg Ptx, 25 µg Fha and 8 µg Prn per human dose, and a second dose was administered 2 weeks later. Naïve mice were injected with 1x PBS as the control group. Two weeks after the booster was administered, the mice were given an intraperitoneal (IP) injection of ketamine (50 mg/kg)/xylazine (25 mg/kg) for light sedation and then challenged with a 2-5 x 10^6 CFU mixture of cluster I and cluster II isolates (L1423 and L1191, respectively) in a 1:1 ratio.

Groups of 3 mice were sacrificed at 0 day (2 h), 3 days, 7 days, 14 days and 21

days post-infection. Trachea and lungs were collected and homogenised separately

in SS broth. Serial 10-fold dilutions of the homogenised trachea and lungs were

grown on charcoal blood agar containing cephalexin (Oxoid) for CFU

determination.
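The conversion from colony counts on a dilution plate to log10 CFU per organ is simple arithmetic and can be sketched as follows. The plated volume and homogenate volume are not given in the text, so the values used here are assumptions for illustration only, as are the function names.

```python
import math

def cfu_per_organ(colonies, dilution_exponent, plated_volume_ml, homogenate_volume_ml):
    """Back-calculate total CFU in an organ homogenate.

    colonies: colonies counted on the plate of the 10^-dilution_exponent dilution.
    """
    cfu_per_ml = colonies * (10 ** dilution_exponent) / plated_volume_ml
    return cfu_per_ml * homogenate_volume_ml

def log10_cfu(cfu):
    """Log-transform a CFU value for plotting and t-tests."""
    return math.log10(cfu)

# Hypothetical example: 45 colonies on the 10^-3 plate, 0.1 ml plated,
# organ homogenised in 1 ml (both volumes are assumed, not from the text).
total = cfu_per_organ(45, 3, 0.1, 1.0)
print(round(log10_cfu(total), 2))
```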


6.3.4 Differentiation of the two B. pertussis isolates in the mixed infection in

lungs and trachea

A total of 700 µl of homogenised lungs and trachea were used for DNA extraction

using the Accuprep® Genomic DNA extraction Kit (Bioneer). To determine the

final proportion of each cluster from the extracted DNA, targeted PCR followed by

Illumina deep sequencing was used to detect two SNPs that differentiated these two

isolates. Nested PCR was first used to amplify the targeted regions using outer

primers that covered the SNP site to increase PCR specificity. The inner primers

were designed to include one SNP for each isolate and a 50 bp custom Illumina

adaptor sequence was added at the end of both the forward and reverse primers.

The inner primers were then used in a second PCR (Table 6.3-1). The final PCR

products were ~ 500 bp and the SNP was located in the middle of the PCR

fragments. PCR products were purified and mixed for sequencing using MiSeq 250

bp single end reads. Reads were mapped against the B. pertussis Tohama I genome using the Burrows-Wheeler Aligner (BWA) (version 0.7.5) [327]. The number

of reads for each unique SNP was extracted and used to calculate the proportion of

cluster I and cluster II isolates.
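The proportion calculation described above can be illustrated with a short sketch. It assumes that, for each of the two SNP sites, the reads carrying the cluster I allele and the reads carrying the cluster II allele have already been counted from the alignments; the function names and read counts are hypothetical.

```python
def cluster_proportion(reads_cluster1, reads_cluster2):
    """Fraction of reads at one SNP site that support the cluster I allele."""
    total = reads_cluster1 + reads_cluster2
    if total == 0:
        raise ValueError("no reads cover this SNP site")
    return reads_cluster1 / total

def mean_proportion(site_counts):
    """Average the cluster I fraction over the SNP sites (one per primer set).

    site_counts: list of (cluster I reads, cluster II reads) tuples.
    """
    fractions = [cluster_proportion(a, b) for a, b in site_counts]
    return sum(fractions) / len(fractions)

# Hypothetical read counts for the two SNP sites in one tissue sample.
sample = [(7600, 2400), (7400, 2600)]
print(mean_proportion(sample))
```

Averaging the two sites mirrors the statistical analysis in 6.3.5, which uses the average percentage of cluster I across the primer sets.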

6.3.5 Statistical analysis

The significance of ACV-induced bacterial clearance from the lungs and trachea was determined by a t-test at each time point, comparing the log10 (CFU/lungs or trachea) of the control group of mice against that of the immunised group. Area under the curve (AUC) was used to determine the overall difference in clearance. Significant differences in the proportion of cluster I in the lungs and trachea of immunised mice against the control group were determined by ANOVA of the average percentage of cluster I calculated based on the number of reads for each set of primers. P < 0.05 was

considered statistically significant. GraphPad Prism version 6.04 for Windows

(GraphPad Software, La Jolla California USA) was used for all calculations.
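As an illustration of the AUC comparison, the sketch below integrates a colonisation curve with the trapezoidal rule. This is a common way to compute AUC, but the exact settings used in GraphPad Prism (for example the baseline) are not stated in the text, so this should be read as an approximation under that assumption. The log10 CFU values are hypothetical.

```python
def area_under_curve(days, values):
    """Trapezoidal-rule area under a curve sampled at the given days."""
    if len(days) != len(values) or len(days) < 2:
        raise ValueError("need at least two matching time points")
    area = 0.0
    for i in range(1, len(days)):
        width = days[i] - days[i - 1]
        area += width * (values[i] + values[i - 1]) / 2
    return area

# Sampling days from the protocol; mean log10 CFU values are hypothetical.
days = [0, 3, 7, 14, 21]
control = [5.0, 6.0, 6.0, 4.0, 2.0]
immunised = [5.0, 5.0, 4.0, 2.0, 1.0]
print(area_under_curve(days, control))    # 96.5
print(area_under_curve(days, immunised))  # 64.5
```

A lower AUC for the immunised group corresponds to a lower cumulative bacterial load over the course of the infection.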


Table 6.3-1: Selected SNPs for this study and the designed primers used for sequencing

Isolate | SNP in genome | Gene ID | Product | SNP | SNP in gene | Primer type | PCR size (bp) | Illumina adapters 5'-3' | Primer sequence
L1423 | 3840411 | BP3630 | 30S Ribosomal protein | G...A | 150 | Nested-For. | 656 | - | CGAAGACCGACGAAGAAC
 | | | | | | Nested-Rev. | | - | TGCTGGCTTCAACACCCT
 | | | | | | Illumina-For. | 459 | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG | GCTGGTAGGAGAACGAAT
 | | | | | | Illumina-Rev. | | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG | TCGCCGCCCACGCCATTG
L1191 | 30665 | BP0027 | MaoC family protein | G...A | 198 | Nested-For. | 805 | - | GCATCGTAATCTCGCTGG
 | | | | | | Nested-Rev. | | - | ACACTCTCGGTCAACGGG
 | | | | | | Illumina-For. | 493 | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG | GTCCACACTCTCGGTCAACG
 | | | | | | Illumina-Rev. | | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG | CGCCGTACTTCTCGTCCTTC


6.4 Results

6.4.1 in vitro growth rate of the isolates used in this study

Differences in the growth rate between isolates may affect the in vivo competition

result. Therefore, we first determined the in vitro growth rate of the cluster I isolate

(L1423) and the cluster II isolate (L1191) based on CFU counts and the OD600

measured in 3 separate experiments (Figure 6.4-1). The doubling times of L1191 and L1423 were 6.53 ± 0.51 hr and 6.55 ± 0.43 hr, respectively. The difference in

doubling time was not statistically significant (P > 0.05).

Figure 6.4-1: Growth curve of two B. pertussis isolates from cluster I and II using optical

density. Doubling time was also calculated based on the CFU count results and no significant

difference was observed.

6.4.2 Bacterial clearance in immunised mice infected with the mixed

infection of cluster I and cluster II isolates

For each time point, groups of 3 mice were infected with a mix of L1191 and

L1423 in a 1:1 ratio. The bacterial clearance was determined at 2 hr, 3, 7, 14 and 21

days post-infection using CFU counts. For both lungs and trachea, the clearance

was faster in immunised mice than in naïve mice (Figure 6.4-2A and B). The CFU data showed that when the mice were primed with ACV, the log10 (CFU/lungs) was already significantly lower (P = 0.005) at 3 days post-infection compared to the control group.

[Figure 6.4-1 plot: OD600 (0.0 to 2.0) against time in hours (12 to 72) for L1191 (cluster II) and L1423 (cluster I).]


In addition, the bacterial load in the trachea was also significantly lower (P = 0.005) from 7 days post-infection in vaccinated mice. To further demonstrate the overall difference in clearance, the AUC of colonisation across the time points was calculated for the lungs and trachea. In both tissues, the AUC of immunised mice was significantly lower than the AUC of naïve mice (Figure 6.4-2C), suggesting faster clearance.

Figure 6.4-2. Colonisation curve for naïve and ACV immunised mice infected with a mixture of cluster I (ptxP3, prn2) and cluster II (ptxP1, prn3) B. pertussis. Log10 CFU was calculated based on the number of colonies in groups of 3 mice for each time point. A) Lungs; B) Trachea; * significant difference (P < 0.05) in bacterial clearance was found at 3 days post-infection in lungs and 7 days in trachea; C) Area under the curve for lungs (P = 0.00003) and trachea (P = 0.00002) of immunised and control mice in the mixed infection.


6.4.3 Competitive fitness of cluster I B. pertussis in the mixed infection in vivo

study

The proportion of the two isolates was determined in the mixed infection at

different time points using two SNPs. The PCR products covering the two SNP

sites were amplified from the total DNA extracted from the lungs and trachea. The

PCR products were sequenced using the Illumina platform and reads supporting

each SNP were detected.

As shown in Figure 6.4-3, the cluster I isolate, L1423, was the predominant strain in

both naïve and immunised mice from day 3. The proportion of the cluster I isolate

L1423 significantly increased from day 3 in naïve mice and reached 76% by day

14. The proportion of L1423 increased to 64% by day 14 in immunised mice

although this increase was not statistically significant when compared to days 3 and

7.

In the trachea, the same trend was observed where a significant difference was

observed in the immunised mice from day 7 and continued to day 14 as compared

to day 0. The cluster I isolate peaked (74%) at day 7 and declined slightly to 63% at day 14, while for the control group, a statistically significant difference was observed from day 3 and the cluster I isolate peaked at 85% by day 14. The proportion of the cluster I isolate was significantly higher at day 14 in naïve mice compared to immunised mice.


Figure 6.4-3. The proportion of cluster I (ptxP3, prn2) strain in A) Lungs and B) Trachea of immunised and control mice. Significant differences were found at day 14 post-infection in lungs. * (P < 0.05)


6.5 Discussion

Australia replaced the WCV with ACV for all doses of immunisation in 2000. After

the introduction of ACV, the number of cluster II strains declined from 33% to 10%

while cluster I strains (prn2, ptxP3) dramatically increased during that period of

time [306]. Cluster I strains accounted for 86% of strains isolated in the recent 2008-2012 pertussis epidemic. The rise of cluster I strains was hypothesised to be due to the selection pressure of the ACV [259, 308]. This study tested this hypothesis. A competition assay using a mixed infection of two isolates from cluster I (L1423) and cluster II (L1191) in the mouse model showed that the cluster I isolate outcompeted the cluster II isolate in the ACV immunised host environment.

Surprisingly, the cluster I isolate also outcompeted the cluster II isolate in the naïve host environment, suggesting that cluster I has evolved to have better fitness regardless of the host environment. Our findings were consistent with a previous report by King et al., who compared the colonisation difference between Dutch ptxP3 and ptxP1 strains which were equivalent to our cluster I and cluster II strains,

respectively. They also found better colonisation of wild-type ptxP3 strains in the

lungs and trachea of naïve mice when assessed at day 4 of the infection.

Furthermore, they found that the advantage of ptxP3 strains also depended on their

genetic background. Therefore, this suggests that the increased fitness was not just

due to the simple advantage provided by the ptxP3 allele resulting in increased Ptx

production [281].

Cluster I (ptxP3) strains have become the predominant strains in developed

countries where ACV is used. The ACV may have helped the spread of the disease

as it affords poorer overall protection against B. pertussis infection as compared to

WCV. However, this study showed that cluster I strains had higher overall fitness

regardless of the host’s immunisation status. This has important implications for

global pertussis epidemiology. Cluster I may also have spread to and become

predominant in developing countries where WCV is still used, although in

countries such as China and Poland, ptxP3 strains were less frequently isolated

[371, 372]. It may also explain the discrepancy of epidemiological data where no


correlation was observed between the introduction of ACV and the increase of ptxP3 (cluster I) strains in the US, whereas the correlation seen in Australia was coincidental.

Our findings did not invalidate the suggestion that the genetic changes that occurred in cluster I strains may have provided a selective advantage in an ACV immunised environment. Cluster I strains carry 2 important changes, ptxP3 and prn2. Isogenic mutants in the Tohama I background showed a synergistic effect, with the double mutant colonising better than the wild type Tohama I strain in immunised mice

[373].

Our previous study showed that the non-production of pertactin by cluster I strains also affected their colonisation fitness (Chapter 5), as the Prn deficient isolate showed reduced fitness in naïve mice but better fitness in immunised mice. However,

the effects of different prn alleles are less clear. Polak et al. showed that strains

carrying ptxA1/ptxP1/prn2 colonised better than ptxA1/ptxP1/prn1 or

ptxA1/ptxP1/prn3 strains at day 14 post-challenge in mice, suggesting better fitness

of strains with prn2 allele [369]. In contrast, van Gent et al. showed in naïve mice

that the prn1 strain colonised significantly better than prn2 or prn3 strains

suggesting better fitness of prn1 strains [317].

It should be noted that our study was a comparison of two wild type isolates which differed at loci other than the ptxP and prn alleles and were therefore not isogenic. Variations in other virulence loci may also be important for colonisation. We have sequenced these isolates. There were 61 SNP differences and other genetic changes between the two isolates, of which 16 SNPs were unique to all cluster I strains including L1423 (Chapter 4, Table 4.4-6). Apart from SNPs in the ptx promoter, two other SNPs

were identified in virulence associated genes (bcsI and ptxC) that are under the Bvg

regulatory system. Furthermore, in the cluster I isolate used in this study, L1423,

there were two more genes with SNPs including sphB1 and fliM which are also

under the Bvg regulation. The former was a common SNP found in all 22 SP13

isolates that we sequenced previously while the latter was common to 20 SP13

isolates collected after 2008 (Appendix 1). Our findings are also consistent with


other studies which showed that the genomic content of currently circulating B.

pertussis strains has been reduced compared to other clusters (discussed in 4.4.7).

One or more of these changes may provide a selective advantage to cluster I strains

for increased colonisation of the host.

6.6 Conclusion

In conclusion, our results showed that the predominant cluster I strains, which expanded recently, have significantly better fitness for colonisation of the lungs and

trachea in mice regardless of the immunisation status of the host. Nevertheless,

ACV immunisation enhanced the bacterial clearance of the SNP cluster I strain

from the mouse respiratory tract, thereby demonstrating the importance of

vaccinations in the prevention of B. pertussis infection in humans from a mouse

model perspective.


Chapter 7. General Discussion

The increase in global pertussis notification rates in recent years has led to a focus on the evolution and adaptation of Bordetella pertussis and the potential effect of vaccine selective pressure on re-emergence. In Australia, an increase in pertussis

epidemics in recent years has also been observed. Previous studies have revealed

that small mutations and antigenic gene inactivations and disruptions have altered

the genomic content of the Australian B. pertussis population. Single nucleotide

polymorphism (SNP) typing grouped Australian B. pertussis isolates collected over

40 years into different clusters with the current circulating isolates belonging to

cluster I [306]. Later studies also revealed that cluster I strains contained an altered

pertussis toxin promoter allele, ptxP3. In addition, cluster I also contained pertactin

(Prn) negative B. pertussis isolates [308, 311]. Based on these previous findings,

this thesis investigated the genomic changes and in vivo fitness of Australian

epidemic B. pertussis strains in immunised hosts. This thesis contributed to an

increased understanding of the evolution and adaptation of current predominant

Australian B. pertussis cluster I strains by:

a) Investigating the microevolution of predominant cluster I B. pertussis

isolates during the 2008-2012 epidemic in Australia.

b) Determining the adaptation and evolution of major Australian B. pertussis

clones with a particular focus on cluster I strains.

c) Developing a mixed infection competition assay in the mouse model to

examine the fitness of Prn negative B. pertussis in vivo in an ACV

immunised environment.

d) Examining the effect of acellular vaccine (ACV) pressure on the fitness of cluster I strains over cluster II strains in the mouse model study.

From these four points, several themes relating to B. pertussis were identified and

are discussed below.


7.1 Microevolution of current epidemic B. pertussis isolates

7.1.1 A genomic portrait of the 2008-2012 Australian epidemic

A previous study by Octavia et al. showed that isolates belonging to SNP profile

(SP) 13 in cluster I were the most predominant type in the 2008-2012 pertussis

epidemic in Australia [259]. This thesis project (Chapter 3) sequenced 22 epidemic

SP13 isolates and the results indicated that the epidemic SP13 clone in Australia

was separated from isolates belonging to the pre-epidemic era by 5 unique SNPs.

The results also showed that epidemic SP13 isolates underwent clonal expansion

across the country with multiple lineages. Using BEAST, the epidemic SP13 isolates were dated back to 2002 (95% confidence interval of late 1999 to mid 2007), suggesting that the strains had been circulating in Australia for some time prior to causing the epidemics.

The epidemic isolates were further divided into five distinct lineages, EL1 to EL5.

Spatial and temporal clustering of isolates was found. Isolates in two lineages, EL4

and EL5, were grouped together by regions of isolation from New South Wales

(NSW) and Western Australia (WA), respectively. However, the other three

lineages contained isolates from both states indicating an interstate spread of the

lineage during the epidemic. Temporal clustering was also evident. One pair of

isolates (L1493 and L1507) collected in 2011 from NSW was identical and hence

was a possible local transmission.

7.1.2 Diversification of epidemic SP13 through random mutations, adaptive

changes, indels and insertion sequence transposition

Small changes including SNPs and small indels may have contributed to the

expansion of the epidemic clones in Australia, with five SNPs unique to the

epidemic lineage. Three of these SNPs were likely to be neutral, with two located in intergenic regions and one being a synonymous SNP. However, two SNPs located

in genes were non-synonymous with one found in a virulence associated gene

(BP0216 - sphB1) and the other in a gene encoding a transport binding protein,

BP3570. These two changes may be adaptive. The three genic SNPs have also been

reported in some ptxP3 strains which were isolated from different countries by Bart

et al. [41], suggesting that the epidemic lineage may have spread globally.


Apart from the SNPs present in all epidemic isolates that may be adaptive, there

were other virulence-associated genes with non-synonymous SNPs present in the

epidemic isolates. This indicates a selection pressure that is driving their evolution.

This finding is consistent with the study by Seeley et al., which sequenced 100 UK B.

pertussis isolates collected from 1920 to 2012 and showed that virulence genes

have a higher substitution rate [251]. This was also observed in Finnish and

Chinese isolates as well as in other global isolates [41, 302]. Thus, SNPs were

likely to have played an important role in the adaptation of B. pertussis.

Interestingly, both Bart et al. [41] and our study found the mutation rate of B. pertussis to be 2.24 x 10^-7 to 3.38 x 10^-7 per site per year, which equates to about 1-2

SNPs per genome per year. However, the study by Xu et al. [302] reported that the

mutation rate may vary among different B. pertussis lineages.
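The conversion from a per-site rate to SNPs per genome per year is a single multiplication, sketched below. The genome size of roughly 4.09 Mb for B. pertussis Tohama I is an approximate figure supplied here as an assumption and is not stated in this section.

```python
# Approximate B. pertussis Tohama I genome size in bp (assumed, about 4.09 Mb).
GENOME_SIZE_BP = 4_090_000

def snps_per_genome_per_year(rate_per_site_per_year, genome_size_bp=GENOME_SIZE_BP):
    """Expected number of new SNPs per genome per year for a given per-site rate."""
    return rate_per_site_per_year * genome_size_bp

# The two rates quoted in the text.
low = snps_per_genome_per_year(2.24e-7)
high = snps_per_genome_per_year(3.38e-7)
print(f"{low:.2f} to {high:.2f} SNPs per genome per year")
```

Under this assumed genome size, the two quoted rates correspond to roughly 0.9 and 1.4 SNPs per genome per year, which is in line with the figure of about 1-2 given in the text.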

The presence of indels and new insertion sequence (IS) insertion sites did not

follow the phylogenetic pattern. Most of them appeared to be random occurrences

in different isolates. There were no unique indels or IS for epidemic SP13 isolates.

However, there were some specific indels for different epidemic lineages including

one frameshift indel located in bapC (BP2738) for EL2 and two located in

intergenic regions for EL4 (Table 3.4-5). Indels may result in pseudogenisation of

functional genes (BP2232, BP2595, BP2928, BP2946 and BP3465). However,

there were some indels located in pseudogenes, which either left them as pseudogenes (with further degradation) or potentially reverted them to functional genes (BP0880, BP2000,

BP2738, BP2899 and BP3762). Also of interest are indels located in the

homopolymeric tracts in genic regions that are likely to have an effect on phase

variation including the indel located in bapC, which encodes an outer membrane

protein that is upregulated by the Bvg system.

7.1.3 Independent evolution of Prn negative isolates

Prn negative B. pertussis strains emerged at the start of the Australian 2008-2012

epidemic and increased to over 80% by the end of the epidemic. It was found that

while there were multiple mechanisms that caused the non-expression of Prn, IS481


was the most common one [311]. In Chapter 3, a total of 10 Prn negative isolates

(seven prn::IS481R, one prn::IS481F and two prn::IS1002) were sequenced. It was

found that Prn negative strains may have expanded independently at different time

points. The two prn::IS1002 isolates only had four SNP differences between them

but were isolated two years apart suggesting a clonal spread. Six prn::IS481R

isolates were grouped together in EL1 as a single origin and had spread across

NSW and WA in 2011 and 2012. However, one prn::IS481R isolate collected in

2008 from WA showed a separate origin. This is the first study to show that

although the isolates had the same genotype (SP13) and mechanism of prn gene

disruption, they may have evolved independently. The results also confirmed that

Prn negative strains can arise multiple times from different lineages. This highlights

the importance of having a larger number of Prn negative strains, in particular those

inactivated by IS481 to be sequenced, to determine the extent of independent

inactivations. This study also revealed the important role of IS in the adaptive

evolution of B. pertussis.

7.2 Comparative genomic investigation of major Australian B.

pertussis clones

7.2.1 Comparative genomic variation of current circulating cluster I B.

pertussis strains with other clusters in Australia

Pacific Biosciences technology provides a new data type and facilitate de novo

genome assembly and genome finishing with high accuracy to overcome some

limitations of current next generation sequencing platforms by providing

significantly longer reads, single molecule sequencing, low composition bias and

an error profile that is orthogonal to other platforms [374, 375]. Unlike Illumina,

there is no PCR amplification step during library preparation in PacBio

sequencing, which avoids a common source of base composition bias [374, 376]. In

sequencing microbial genomes with high GC content using the PacBio platform, read

coverage was not affected by GC content, while noticeable GC bias was

observed with Illumina and Ion Torrent sequencing [349, 374]. PacBio sequencing

also facilitates the assembly of larger contigs and the detection of indels and

genome rearrangements by providing longer reads, while Illumina sequencing

produces a larger number of contigs due to the complexity of the B. pertussis genome,

with its high number of insertion sequences.

In Chapter 4, two sequencing platforms, Illumina and PacBio, were used to

compare the genome content of major Australian B. pertussis clones and understand

their genetic diversities and evolutionary trends. Octavia et al. separated Australian

B. pertussis isolates into five major clusters including the current circulating ptxP3

strains grouped as cluster I [306]. The phylogenetic analysis based on 426 SNPs

identified in this study confirmed the previous clustering of strains [306].

The emergence and expansion of ptxP3 strains reported globally is intriguing and

recent genomic studies have been done to understand the reasons behind this

expansion [41, 260, 282, 305]. Genetic variations and adaptations were observed in

the ptxP3 strains, including point mutations and indels in some genic regions [283].

The correlation between ptxP3 strains and higher hospitalisation was shown only in

the study carried out by Mooi et al. in 2009 [281]. Furthermore, the recent study by

Clarke et al. in Australia showed that a high proportion of young infants (<3

months) had more severe pertussis if they were infected with ptxP3 strains [377].

The findings of Chapter 4 showed that cluster I ptxP3 strains, which were the most

predominant strains in Australia, were separated by 16 unique SNPs from other

Australian clusters. Apart from the ptxP allele, other point mutations, particularly those

in virulence associated genes, might also affect the fitness of cluster I isolates

compared to isolates from other clusters. Some of the identified SNPs were also

reported in ptxP3 strains worldwide by Bart et al. [41]. Two intergenic SNPs

located in the promoter regions of BP1035 and BP1711 have not been

reported previously in ptxP3 strains and appeared to be unique to Australian

clones. It should be noted that all the strains used in the Bart et al. study were

collected before 2010 while most of the cluster I strains used in this study were

collected after 2010. Therefore, some SNPs found only in our study may be from

recent mutation events.

7.2.2 Ongoing genome reduction in B. pertussis through large indels

There was no gene acquisition found in this study. On the other hand, this study

uncovered far more extensive deletions in different clusters, with over 402 deletions

greater than 300 bp, suggesting that the B. pertussis genome is quite unstable,

possibly due to the presence of large numbers of IS elements. IS elements are possibly a

primary mechanism mediating the deletions. Interestingly, the number of genes

deleted varied between isolates representing different clusters as well as the three

isolates belonging to cluster I. Only the loss of two regions, BP0911-BP0937 and

BP1948-BP1968 appeared to correspond with the SNP-based evolution. The former

was lost in all clusters (I-V) and the latter was lost exclusively in cluster I. The

results were consistent with previous findings which also showed the same deletion

patterns in ptxP3 strains [284, 285, 288, 332]. However, some of the genes inside

the latter region, including BP1955-BP1959, were also deleted in other clusters, which has

not been reported previously.

Ongoing gene loss in B. pertussis may assist the bacteria to survive better in

immunised hosts and optimise its function. The loss of gene content is a dynamic

and ongoing process and was not specific for countries as it was observed in

isolates collected globally [287, 332, 337]. It seems that during the gene loss

process, unnecessary genes were removed from the genome since most of the

deleted genes were transposases (48%) or pseudogenes (14%). However, the loss of

genes coding for cell surface proteins (9% of total deletions) may also be beneficial

to the pathogen by decreasing the chance of recognition by the human immune

system.

Small deletions may also affect gene function but the number was relatively small.

There were only 12 events leading to pseudogenisation of functional genes in the

isolates analysed, including a frameshift indel in BP2946 specific for cluster I and

another one in BP2928 which was found in all isolates from clusters I to IV that

were analysed in this study. There were also some pseudogenes that were likely to

have reverted back to functional genes including BP0880 which was found in all

cluster I isolates.

7.2.3 Genetic diversities driven by transposition and genome rearrangements

It was suggested that transposons are key elements in the rapid adaptation of B.

pertussis by affecting its genome structure, organisation and function, especially in

a population with low genetic diversity [361]. IS elements may lead to functional

changes in the clinical isolates in terms of genotype and phenotype and may also

participate in the streamlining or trimming of the genome by facilitating DNA

deletions [362]. The number of IS in current circulating strains varies between

isolates [287]. In contrast, a recent study [289] showed that the IS481 copy number

remains unchanged in the recent B. pertussis isolates. However, the progressive

deletion of transposases in all clusters and some new IS insertions suggest an

important role for IS elements and transposases in the genomic diversity of B.

pertussis strains.

In this study, it was demonstrated that IS elements may play a role in chromosomal

rearrangements via translocations and inversions (Chapter 4) like those observed in

the Y. pestis genome [364]. Belcher and Preston [378] hypothesised that genome

rearrangements generate diversity among B. pertussis and they used the

complete genome sequences of multiple B. pertussis strains to illustrate the

presence of genome rearrangements. However, the extent of these rearrangements

was not investigated. Only a few studies have reported the presence of genome

rearrangements including some inversions in the current B. pertussis strains [253,

378]. Here, a total of 151 translocation and/or inversion events were found in the

nine isolates analysed in Chapter 4, with an average of 17 events per genome,

ranging from 9 to 25 events. The different types of events occurred at similar frequencies, with 58

inversions, 40 translocations, and 53 combined inversion and translocation events. There were 17

translocation/inversion breakpoints shared by more than one isolate, suggesting that

these are hotspots for genome rearrangements. However, due to time constraints,

further detailed analyses were not performed to confirm the breakpoints and the

nature of genes affected by genome rearrangements.
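
The event counts quoted above can be cross-checked with simple arithmetic. The short Python sketch below uses only the figures given in the text (58, 40 and 53 events across nine isolates). The variable names are illustrative and do not come from the analysis pipeline.

```python
# Event counts as reported in the text for the nine isolates analysed in Chapter 4.
events = {
    "inversion": 58,
    "translocation": 40,
    "inversion and translocation": 53,
}
n_isolates = 9

total_events = sum(events.values())
mean_per_genome = total_events / n_isolates

print(f"total events: {total_events}")
print(f"mean per genome: {mean_per_genome:.1f}")
for kind, count in events.items():
    print(f"{kind}: {count / total_events:.0%} of events")
```

The three categories sum to the reported total of 151 events, and the mean per genome rounds to the reported average of 17.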

7.3 The comparative fitness of epidemic B. pertussis strains in vivo in

the mouse model

7.3.1 Development of a mixed infection model and a new method to perform

mixed bacterial competition assay

Previous studies investigating the fitness of different isolates with competition

assays were based on colony typing [369] which was time consuming and less

accurate since only a limited number of colonies could be selected for typing. In

Chapter 5, a new method was developed to measure the relative proportion of two

isolates in mixed infections, based on tagged sequencing of the whole DNA pool

extracted from the lungs and trachea. The extracted DNA was used for Illumina

sequencing with tagged primers to identify isolates based on their SNP variation.

This method reduced the time required relative to colony typing and increased the

accuracy of frequency determination, since several hundred reads were used to

determine the proportion of the isolates. Furthermore, mixed infection assays can

reduce variations in biological and technical replicates. However, the potential

disadvantage of this assay is cross-complementation: functions or virulence factors

supplied by one isolate can mask strain differences if the mixed isolates complement

each other.
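
The principle of estimating isolate proportions from reads can be sketched as follows. This is a minimal Python illustration and not the pipeline used in Chapter 5. It assumes that each read covers a single discriminating SNP at a known offset, and the read sequences, offset and alleles are hypothetical. The binomial standard error is included only to show why several hundred reads give a tighter estimate than a small number of typed colonies (20 is an arbitrary figure chosen for comparison).

```python
from collections import Counter
from math import sqrt


def isolate_proportion(reads, snp_index, allele_a, allele_b):
    """Return (proportion of isolate A, number of informative reads)."""
    counts = Counter(
        read[snp_index] for read in reads if len(read) > snp_index
    )
    n_a, n_b = counts[allele_a], counts[allele_b]
    informative = n_a + n_b  # reads with any other base are ignored
    if informative == 0:
        raise ValueError("no informative reads at the SNP position")
    return n_a / informative, informative


def standard_error(p, n):
    """Binomial standard error of a proportion p estimated from n observations."""
    return sqrt(p * (1 - p) / n)


# Hypothetical reads: isolate A carries 'T' and isolate B carries 'C' at index 3.
reads = ["ACGTAC"] * 300 + ["ACGCAC"] * 100 + ["ACGNAC"] * 5
p_a, n = isolate_proportion(reads, snp_index=3, allele_a="T", allele_b="C")

print(f"isolate A proportion: {p_a:.2f} from {n} informative reads")
print(f"standard error with {n} reads: {standard_error(p_a, n):.3f}")
print(f"standard error with 20 colonies: {standard_error(p_a, 20):.3f}")
```

In practice, sequencing errors and uneven amplification would also need to be considered, which this sketch does not attempt.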

7.3.2 The better fitness of Prn negative strains under the pressure of ACV

selection

Prn negative strains have dramatically increased during the last pertussis epidemic

from 5% in 2008 to around 80% in 2012 [311]. In Chapter 5, an in vivo mixed

infection competition assay was carried out in the mouse model to investigate the

fitness of Prn negative cluster I strains under ACV selection pressure. There was a

prior study which showed that Prn negative strains isolated from France colonised

better in mice immunised with ACV [316]. However, that study did not use a

competition assay but instead compared the fitness of Prn negative and positive B. pertussis

strains in separate mice [316].

In this thesis, mixed infection of mice demonstrated that Prn negative strains had

significantly better colonisation in both the lungs and trachea of ACV-immunised

mice as compared to naïve mice. This finding provides an explanation for the

dramatic expansion of Prn negative strains under ACV induced immunity as Prn is

one of the major antigens in the three or more component ACVs. Our findings also

showed that a lower dose of ACV than that used in previous studies can

induce bacterial clearance from the respiratory tract of mice, indicating that

vaccination is still important in protecting the host against whooping cough [314,

316, 370].

7.3.3 Better fitness of cluster I strains in both immunised and unimmunised

hosts

An increase in the prevalence of cluster I and a decrease in cluster II strains in

Australia after the introduction of ACV was documented previously [306] and it

was suggested that this changeover of SNP clusters might be due to the selective

pressure of ACV-induced immunity. To test this hypothesis, a mixed infection

competition assay was performed in the mouse model using one isolate each from

cluster I and cluster II as presented in Chapter 6. The in vivo fitness of the cluster I

isolate under both immunised and non-immunised conditions was

tested using the same method established in Chapter 5.

The findings of Chapter 6 revealed that the cluster I isolate colonised significantly

better in the respiratory tracts of mice for both naïve and ACV-immunised groups

than the cluster II isolate. These in vivo results suggest that SNP cluster I was fitter

than SNP cluster II regardless of the vaccine immunity and thus the observed

increase in SNP cluster I may not be due to selection pressure from the ACV. Better

colonisation of the wild-type ptxP3 strains in the lungs and trachea of naïve mice

when assessed at day 4 of the infection was reported by King et al. who compared

the colonisation difference between Dutch ptxP3 and ptxP1 strains which were

equivalent to our cluster I and cluster II strains, respectively. They also

demonstrated that the genetic background of ptxP3 strains may contribute to their

widespread distribution. Therefore, this suggests that the increased fitness was not

just due to the simple advantage provided by the ptxP3 allele resulting in increased

Ptx production [281]. Nevertheless, the significant bacterial clearance of both

clusters from the respiratory tracts of the immunised mice emphasises the

importance of vaccination as an important strategy in preventing B. pertussis

infections in humans since the same bacterial inoculum was used to induce

infection in both naïve and immunised mice.

If SNP cluster I (ptxP3) strains are generally fitter, one would expect SNP cluster I

ptxP3 strains to sweep through the whole world. However, many countries/regions

using WCV were not dominated by SNP cluster I ptxP3 strains [371]. The

fitness difference between SNP cluster I and other clusters should also be

investigated using the mouse model under a WCV immunised environment.

7.3.4 Caveat of comparative in vivo studies of different B. pertussis strains

Only a single isolate from each group was used in the in vivo studies. There is a possibility that the

results were only applicable to the pair of isolates rather than to the two

populations. However, the isolates used for the in vivo studies were sequenced and

there were no other apparent genetic changes in either isolate that may adversely

affect the results as the changes in the known virulence associated genes are

representative of the clusters. There were also no unique SNPs that divided Prn

negative and Prn positive isolates as they did not represent two distinct populations.

Nevertheless, the Prn negative and Prn positive strains compared were

non-isogenic, so there are potentially other genetic factors involved. The same caveat

exists in all prior non-isogenic comparative studies such as that of Hegerle et al.

[316].

7.4 Future work

The results from this thesis have built the foundation and framework for further

investigation on the evolution of B. pertussis and the selection pressures of

vaccinations. Chapters 3 and 4 showed the ongoing evolution and adaptation of

current circulating B. pertussis strains in Australia. Further WGS from a wider

representation of isolates from around Australia with more Prn negative isolates

from the 2008-2012 epidemic as well as prospective isolates are needed. This will

enhance our understanding of the evolutionary patterns seen in this study and can

also be used to observe how epidemics can spread and expand across the country.

As pertussis epidemics are known to recur every 3-5 years, the next epidemic has

already shown signs of emergence with notification rates in 2015 increased to

more than 56% in comparison to 2013 based on the health reports from the

governments in Victoria and NSW. It is crucial to understand past epidemics in

order to predict and prevent future epidemics.

The genomic changes revealed also form the basis for future genetic studies to

examine the functional effect of the SNP changes in virulence associated genes.

The effect of the genomic changes may also be studied using transcriptomic and

proteomic tools to increase our understanding of the overall functional effect of

these genetic changes and their role in B. pertussis adaptation.

PacBio sequencing facilitated the assembly of longer contigs to detect genomic

structural changes and revealed that there were frequent rearrangements of the

genomic content in B. pertussis. It is known that genome rearrangement can affect

gene expression in other organisms [364, 379, 380]. Further studies would be

needed to investigate the potential effects of such rearrangement on bacterial

pathogenicity.

Although mixed infection by two different strains at the same time rarely occurs

in patients suffering from pertussis, the mixed infection competition assay

developed in this study using tagged Illumina sequencing provides a sensitive and

convenient model to test isolate differences in a variety of vaccination

environments and can also be used to test various hypotheses. For example,

competition between SNP cluster I and other SNP clusters, or between Prn negative

and Prn positive strains, in a WCV environment would be of interest, for instance to find out

whether Prn negative strains have better fitness under WCV pressure. These

findings will be significant in informing vaccination strategies and policy making in

countries where WCV is still in use as well as to address possible adverse

consequences of switching to ACV. It will also be interesting to test the

competitive fitness of different clusters or Prn negative strains in a live vaccine

immunised environment. A live attenuated pertussis vaccine has been designed and

has completed a Phase I clinical trial. A single dose nasal vaccine, named BPZE1,

has been developed. It contains a live attenuated B. pertussis strain which was

genetically inactivated by deletion of the dermonecrotic toxin gene. The ptx gene is

modified to encode an inactive toxin and the production of tracheal cytotoxin is

reduced by overexpression of an ampG gene [381, 382]. The vaccine can mimic

natural infection in hosts and induce long term immunity against whole cell while

reducing the side effects of endogenous immunogenic compounds [383, 384]. The

vaccine provides long term protection in mice and the Phase I clinical study in

humans showed the safety of this vaccine [384, 385]. It would also be useful to

perform competition assays of the current circulating strains under the immunity of

this newly developed vaccine to determine whether the live vaccine offers a better

protection against the SNP cluster I and Prn negative strains.

7.5 Conclusion

This thesis revealed the genomic diversity inside SNP cluster I as well as between

cluster I and other SNP clusters. It also provided a snapshot of the ongoing

evolution and adaptation of B. pertussis strains currently circulating in Australia.

The genomic changes revealed provide clues to the potential advantage of SNP

cluster I (ptxP3) strains that has allowed them to spread across the world, leading to the

resurgence of pertussis globally. The genetic changes uncovered also allow further

experimental analysis to understand the effect of genetic changes on the pathogenicity of

B. pertussis. Based on in vivo competition, B. pertussis has adapted as a result of

vaccine selection. The non-production of Prn in cluster I strains has provided them with

a major advantage in evading the host immunity induced by ACV. The findings of

this thesis have formed the basis of future studies for the fitness of B. pertussis

strains under the pressures of different vaccines and can help inform the

development of new vaccination strategies.

Chapter 8. References

[1] Guiso N, et al. "Other Bordetellas, lessons for and from pertussis vaccines".

Expert Rev Vaccines. 2014;13:1125-33.

[2] Kloos WE, et al. "Deoxyribonucleotide Sequence Relationships Among

Bordetella Species". Int J Syst Bacteriol. 1981;31:173-6.

[3] Arico B, et al. "Evolutionary relationships in the genus Bordetella". Mol

Microbiol. 1987;1:301-8.

[4] Muller M, et al. "Nucleotide sequences of the 23S rRNA genes from Bordetella

pertussis, B. parapertussis, B. bronchiseptica and B. avium, and their implications

for phylogenetic analysis". Nucleic Acids Res. 1993;21:3320.

[5] Musser JM, et al. "Genetic diversity and relationships in populations of

Bordetella spp". J Bacteriol. 1986;166:230-7.

[6] Gerlach G, et al. "Evolutionary trends in the genus Bordetella". Microbes

Infect. 2001;3:61-72.

[7] Goodnow RA. "Biology of Bordetella bronchiseptica". Microbiol Rev.

1980;44:722-38.

[8] Khelef N, et al. "Bordetella pertussis and Bordetella parapertussis: two

immunologically distinct species". Infect Immun. 1993;61:486-90.

[9] Mastrantonio P, et al. "Bordetella parapertussis infection in children:

epidemiology, clinical symptoms, and molecular characteristics of isolates". J Clin

Microbiol. 1998;36:999-1002.

[10] Mertsola J. "Mixed outbreak of Bordetella pertussis and Bordetella

parapertussis infection in Finland". Eur J Clin Microbiol. 1985;4:123-8.

[11] Cherry JD, et al. "Patterns of Bordetella parapertussis respiratory illnesses:

2008-2010". Clin Infect Dis. 2012;54:534-7.

[12] Liese JG, et al. "Clinical and epidemiological picture of B pertussis and B

parapertussis infections after introduction of acellular pertussis vaccines". Arch

Dis Child. 2003;88:684-7.

[13] Hinz KH, et al. "[Occurrence of Bordetella avium sp. nov. and Bordetella

bronchiseptica in birds]". Berl Munch Tierarztl Wochenschr. 1985;98:369-73.

[14] Raffel TR, et al. "Prevalence of Bordetella avium infection in selected wild

and domesticated birds in the eastern USA". J Wildl Dis. 2002;38:40-6.

[15] Filion R, et al. "[Respiratory infection in the turkey caused by a bacterium

related to Bordetella bronchiseptica]". Can J Comp Med Vet Sci. 1967;31:129-34.

[16] Spilker T, et al. "Identification of Bordetella spp. in respiratory specimens

from individuals with cystic fibrosis". Clin Microbiol Infect. 2008;14:504-6.

[17] Harrington AT, et al. "Isolation of Bordetella avium and novel Bordetella

strain from patients with respiratory disease". Emerg Infect Dis. 2009;15:72-4.

[18] Gross R, et al. "The missing link: Bordetella petrii is endowed with both the

metabolic versatility of environmental bacteria and virulence traits of pathogenic

Bordetellae". BMC Genomics. 2008;9:449.

[19] Weyant RS, et al. "Bordetella holmesii sp. nov., a new gram-negative species

associated with septicemia". J Clin Microbiol. 1995;33:1-7.

[20] Pittet LF, et al. "Bordetella holmesii infection: current knowledge and a vision

for future research". Expert Rev Anti Infect Ther. 2015;13:965-71.

[21] Pittet LF, et al. "Bordetella holmesii: an under-recognised Bordetella species".

Lancet Infect Dis. 2014;14:510-9.

[22] Njamkepo E, et al. "Significant finding of Bordetella holmesii DNA in

nasopharyngeal samples from French patients with suspected pertussis". J Clin

Microbiol. 2011;49:4347-8.

[23] Bottero D, et al. "Bordetella holmesii in children suspected of pertussis in

Argentina". Epidemiol Infect. 2013;141:714-7.

[24] Burgos-Rivera B, et al. "An evaluation of the level of agreement in Bordetella

species identification in three United States laboratories during a period of

increased pertussis". J Clin Microbiol. 2015.

[25] Kamiya H, et al. "Transmission of Bordetella holmesii during pertussis

outbreak, Japan". Emerg Infect Dis. 2012;18:1166-9.

[26] Diavatopoulos DA, et al. "Characterization of a highly conserved island in the

otherwise divergent Bordetella holmesii and Bordetella pertussis genomes". J

Bacteriol. 2006;188:8385-94.

[27] Harvill ET, et al. "Genome Sequences of Nine Bordetella holmesii Strains

Isolated in the United States". Genome Announc. 2014;2.

[28] Srigley JA, et al. "Bordetella Species Other than Bordetella pertussis".

Clinical Microbiology Newsletter. 2015;37:61-5.

[29] Lechner M, et al. "Genomic island excisions in Bordetella petrii". BMC

Microbiol. 2009;9:141.

[30] Vandamme P, et al. "Bordetella hinzii sp. nov., isolated from poultry and

humans". Int J Syst Bacteriol. 1995;45:37-45.

[31] Jiyipong T, et al. "Bordetella hinzii in rodents, Southeast Asia". Emerg Infect

Dis. 2013;19:502-3.

[32] Hayashimoto N, et al. "Prevalence of Bordetella hinzii in mice in experimental

facilities in Japan". Res Vet Sci. 2012;93:624-6.

[33] Almagro-Molto M, et al. "Bordetella trematum in chronic ulcers: report on

two cases and review of the literature". Infection. 2015.

[34] Saksena R, et al. "Bordetella trematum bacteremia in an infant: A cause to

look for". Indian J Med Microbiol. 2015;33:305-7.

[35] Halim I, et al. "[Isolation of Bordetella trematum from bacteremia]". Ann Biol

Clin (Paris). 2014;72:612-4.

[36] Chang DH, et al. "Draft Genome Sequence of Bordetella trematum Strain

HR18". Genome Announc. 2015;3.

[37] Shah NR, et al. "Draft Genome Sequences of Bordetella hinzii and Bordetella

trematum". Genome Announc. 2013;1.

[38] Ko KS, et al. "New species of Bordetella, Bordetella ansorpii sp. nov., isolated

from the purulent exudate of an epidermal cyst". J Clin Microbiol. 2005;43:2516-9.

[39] Arico B, et al. "Bordetella parapertussis and Bordetella bronchiseptica

contain transcriptionally silent pertussis toxin genes". J Bacteriol. 1987;169:2847-53.

[40] Diavatopoulos DA, et al. "Bordetella pertussis, the causative agent of

whooping cough, evolved from a distinct, human-associated lineage of B.

bronchiseptica". PLoS Pathog. 2005;1:e45.

[41] Bart MJ, et al. "Global population structure and evolution of Bordetella

pertussis and their relationship with vaccination". MBio. 2014;5:e01074.

[42] Parkhill J, et al. "Comparative analysis of the genome sequences of Bordetella

pertussis, Bordetella parapertussis and Bordetella bronchiseptica". Nat Genet.

2003;35:32-40.

[43] Chaudhuri R, et al. "Prediction of virulence factors using bioinformatics

approaches". Methods Mol Biol. 2014;1184:389-400.

[44] Leininger E, et al. "Comparative roles of the Arg-Gly-Asp sequence present in

the Bordetella pertussis adhesins pertactin and filamentous hemagglutinin". Infect

Immun. 1992;60:2380-5.

[45] Mooi FR, et al. "Polymorphism in the Bordetella pertussis virulence factors

P.69/pertactin and pertussis toxin in The Netherlands: temporal trends and

evidence for vaccine-driven evolution". Infect Immun. 1998;66:670-5.

[46] Godfroid F, et al. "Are vaccination programs and isolate polymorphism linked

to pertussis re-emergence?". Expert Rev Vaccines. 2005;4:757-78.

[47] Charles IG, et al. "Molecular cloning and characterization of protective outer

membrane protein P.69 from Bordetella pertussis". Proc Natl Acad Sci U S A.

1989;86:3554-8.

[48] Leininger E, et al. "Pertactin, an Arg-Gly-Asp-containing Bordetella pertussis

surface protein that promotes adherence of mammalian cells". Proc Natl Acad Sci

U S A. 1991;88:345-9.

[49] Li J, et al. "Cloning, nucleotide sequence and heterologous expression of the

protective outer-membrane protein P.68 pertactin from Bordetella bronchiseptica".

J Gen Microbiol. 1992;138 Pt 8:1697-705.

[50] Li LJ, et al. "P.70 pertactin, an outer-membrane protein from Bordetella

parapertussis: cloning, nucleotide sequence and surface expression in Escherichia

coli". Mol Microbiol. 1991;5:409-17.

[51] Everest P, et al. "Role of the Bordetella pertussis P.69/pertactin protein and

the P.69/pertactin RGD motif in the adherence to and invasion of mammalian

cells". Microbiology. 1996;142 Pt 11:3261-8.

[52] Khelef N, et al. "Characterization of murine lung inflammation after infection

with parental Bordetella pertussis and mutants deficient in adhesins or toxins".

Infect Immun. 1994;62:2893-900.

[53] Stefanelli P, et al. "A natural pertactin deficient strain of Bordetella pertussis

shows improved entry in human monocyte-derived dendritic cells". New Microbiol.

2009;32:159-66.

[54] Inatsuka CS, et al. "Pertactin is required for Bordetella species to resist

neutrophil-mediated clearance". Infect Immun. 2010;78:2901-9.

[55] Nicholson TL, et al. "Contribution of Bordetella bronchiseptica filamentous

hemagglutinin and pertactin to respiratory disease in swine". Infect Immun.

2009;77:2136-46.

[56] Cherry JD, et al. "A search for serologic correlates of immunity to Bordetella

pertussis cough illnesses". Vaccine. 1998;16:1901-6.

[57] Denoel P, et al. "Comparison of acellular pertussis vaccines-induced immunity

against infection due to Bordetella pertussis variant isolates in a mouse model".

Vaccine. 2005;23:5333-41.

[58] Storsaeter J, et al. "Levels of anti-pertussis antibodies related to protection

after household exposure to Bordetella pertussis". Vaccine. 1998;16:1907-16.

[59] Hijnen M, et al. "The Bordetella pertussis virulence factor P.69 pertactin

retains its immunological properties after overproduction in Escherichia coli".

Protein Expr Purif. 2005;41:106-12.

[60] King AJ, et al. "Role of the polymorphic region 1 of the Bordetella pertussis

protein pertactin in immunity". Microbiology. 2001;147:2885-95.

[61] Stenger RM, et al. "Immunodominance in mouse and human CD4+ T-cell

responses specific for the Bordetella pertussis virulence factor P.69 pertactin".

Infect Immun. 2009;77:896-903.

[62] Noel CR, et al. "The prodomain of the Bordetella two-partner secretion

pathway protein FhaB remains intracellular yet affects the conformation of the

mature C-terminal domain". Mol Microbiol. 2012;86:988-1006.

[63] Antoine R, et al. "New virulence-activated and virulence-repressed genes

identified by systematic gene inactivation and generation of transcriptional fusions

in Bordetella pertussis". J Bacteriol. 2000;182:5902-5.

[64] Julio SM, et al. "Natural-host animal models indicate functional

interchangeability between the filamentous haemagglutinins of Bordetella pertussis

and Bordetella bronchiseptica and reveal a role for the mature C-terminal domain,

but not the RGD motif, during infection". Mol Microbiol. 2009;71:1574-90.

[65] Melvin JA, et al. "Bordetella pertussis pathogenesis: current and future

challenges". Nat Rev Microbiol. 2014;12:274-88.

[66] Abramson T, et al. "Proinflammatory and proapoptotic activities associated

with Bordetella pertussis filamentous hemagglutinin". Infect Immun.

2001;69:2650-8.

[67] Arnal L, et al. "Adhesin contribution to nanomechanical properties of the

virulent Bordetella pertussis envelope". Langmuir. 2012;28:7461-9.

[68] Serra DO, et al. "FHA-mediated cell-substrate and cell-cell adhesions are

critical for Bordetella pertussis biofilm formation on abiotic surfaces and in the

mouse nose and the trachea". PLoS One. 2011;6:e28811.

[69] Willems RJ, et al. "Characterization of a Bordetella pertussis fimbrial gene

cluster which is located directly downstream of the filamentous haemagglutinin

gene". Mol Microbiol. 1992;6:2661-71.

[70] Willems RJ, et al. "Isolation of a putative fimbrial adhesin from Bordetella

pertussis and the identification of its gene". Mol Microbiol. 1993;9:623-34.

[71] Locht C. "Molecular aspects of Bordetella pertussis pathogenesis". Int

Microbiol. 1999;2:137-44.

[72] Geuijen CA, et al. "Role of the Bordetella pertussis minor fimbrial subunit,

FimD, in colonization of the mouse respiratory tract". Infect Immun.

1997;65:4222-8.

[73] Hazenbos WL, et al. "Bordetella pertussis fimbriae bind to human monocytes

via the minor fimbrial subunit FimD". J Infect Dis. 1995;171:924-9.

[74] Kania SA, et al. "Characterization of fimN, a new Bordetella bronchiseptica

major fimbrial subunit gene". Gene. 2000;256:149-55.

[75] Boschwitz JS, et al. "Bordetella bronchiseptica expresses the fimbrial

structural subunit gene fimA". J Bacteriol. 1997;179:7882-5.

[76] Riboli B, et al. "Expression of Bordetella pertussis fimbrial (fim) genes in

Bordetella bronchiseptica: fimX is expressed at a low level and vir-regulated".

Microb Pathog. 1991;10:393-403.

[77] Chen Q, et al. "Strong inhibition of fimbrial 3 subunit gene transcription by a

novel downstream repressive element in Bordetella pertussis". Mol Microbiol.

2014;93:748-58.

[78] Hallander H, et al. "Antibody responses to Bordetella pertussis Fim2 or Fim3

following immunization with a whole-cell, two-component, or five-component

acellular pertussis vaccine and following pertussis disease in children in Sweden in

1997 and 2007". Clin Vaccine Immunol. 2014;21:165-73.

[79] Mooi FR, et al. "Characterization of fimbrial subunits from Bordetella

species". Microb Pathog. 1987;2:473-84.

[80] Willems R, et al. "Fimbrial phase variation in Bordetella pertussis: a novel

mechanism for transcriptional regulation". EMBO J. 1990;9:2803-9.

[81] Fedele G, et al. "The virulence factors of Bordetella pertussis: talented

modulators of host immune response". Arch Immunol Ther Exp (Warsz).

2013;61:445-57.

[82] Geuijen CA, et al. "Identification and characterization of heparin binding

regions of the Fim2 subunit of Bordetella pertussis". Infect Immun. 1998;66:2256-63.

[83] Mattoo S, et al. "Role of Bordetella bronchiseptica fimbriae in tracheal

colonization and development of a humoral immune response". Infect Immun.

2000;68:2024-33.

[84] Kerr JR, et al. "Bordetella pertussis infection: pathogenesis, diagnosis,

management, and the role of protective immunity". Eur J Clin Microbiol Infect Dis.

2000;19:77-88.

[85] Finn TM, et al. "Tracheal colonization factor: a Bordetella pertussis secreted

virulence determinant". Mol Microbiol. 1995;16:625-34.

[86] Carbonetti NH. "Pertussis toxin and adenylate cyclase toxin: key virulence

factors of Bordetella pertussis and cell biology tools". Future Microbiol.

2010;5:455-69.

[87] Weiss AA, et al. "Molecular characterization of an operon required for

pertussis toxin secretion". Proc Natl Acad Sci U S A. 1993;90:2970-4.

[88] Covacci A, et al. "Pertussis toxin export requires accessory genes located

downstream from the pertussis toxin operon". Mol Microbiol. 1993;8:429-34.

[89] Verma A, et al. "Requirements for assembly of PtlH with the pertussis toxin

transporter apparatus of Bordetella pertussis". Infect Immun. 2007;75:2297-306.

[90] Locht C, et al. "The ins and outs of pertussis toxin". FEBS J. 2011;278:4668-82.

[91] Simon NC, et al. "Novel bacterial ADP-ribosylating toxins: structure and

function". Nat Rev Microbiol. 2014;12:599-611.

[92] Farizo KM, et al. "Membrane localization of the S1 subunit of pertussis toxin

in Bordetella pertussis and implications for pertussis toxin secretion". Infect

Immun. 2002;70:1193-201.

[93] Nicosia A, et al. "Promoter of the pertussis toxin operon and production of

pertussis toxin". J Bacteriol. 1987;169:2843-6.

[94] Boucher PE, et al. "Synergistic binding of RNA polymerase and BvgA

phosphate to the pertussis toxin promoter of Bordetella pertussis". J Bacteriol.

1995;177:6486-91.

[95] Pittman M. "The concept of pertussis as a toxin-mediated disease". Pediatr

Infect Dis. 1984;3:467-86.

[96] Robbins JB, et al. "Primum non nocere: a pharmacologically inert pertussis

toxoid alone should be the next pertussis vaccine". Pediatr Infect Dis J.

1993;12:795-807.

[97] Stein PE, et al. "The crystal structure of pertussis toxin". Structure. 1994;2:45-57.

[98] Tuomanen E, et al. "Filamentous hemagglutinin and pertussis toxin promote

adherence of Bordetella pertussis to cilia". Dev Biol Stand. 1985;61:197-204.

[99] Mangmool S, et al. "G(i/o) protein-dependent and -independent actions of

Pertussis Toxin (PTX)". Toxins (Basel). 2011;3:884-99.

[100] Connelly CE, et al. "Pertussis toxin exacerbates and prolongs airway

inflammatory responses during Bordetella pertussis infection". Infect Immun.

2012;80:4317-32.

[101] Sakamoto H, et al. "Bordetella pertussis adenylate cyclase toxin. Structural

and functional independence of the catalytic and hemolytic activities". J Biol Chem.

1992;267:13598-602.

[102] Ladant D, et al. "Characterization of the calmodulin-binding and of the

catalytic domains of Bordetella pertussis adenylate cyclase". J Biol Chem.

1989;264:4015-20.

[103] Zaretzky FR, et al. "Mechanism of association of adenylate cyclase toxin with

the surface of Bordetella pertussis: a role for toxin-filamentous haemagglutinin

interaction". Mol Microbiol. 2002;45:1589-98.

[104] Guermonprez P, et al. "The adenylate cyclase toxin of Bordetella pertussis

binds to target cells via the alpha(M)beta(2) integrin (CD11b/CD18)". J Exp Med.

2001;193:1035-44.

[105] Cerny O, et al. "Bordetella pertussis Adenylate Cyclase Toxin Blocks

Induction of Bactericidal Nitric Oxide in Macrophages through cAMP-Dependent

Activation of the SHP-1 Phosphatase". J Immunol. 2015.

[106] Hewlett EL, et al. "Pertussis pathogenesis--what we know and what we don't

know". J Infect Dis. 2014;209:982-5.

[107] Harvill ET, et al. "Probing the function of Bordetella bronchiseptica

adenylate cyclase toxin by manipulating host immunity". Infect Immun.

1999;67:1493-500.

[108] Henderson MW, et al. "Contribution of Bordetella filamentous hemagglutinin

and adenylate cyclase toxin to suppression and evasion of interleukin-17-mediated

inflammation". Infect Immun. 2012;80:2061-75.

[109] Adkins I, et al. "Bordetella adenylate cyclase toxin differentially modulates

toll-like receptor-stimulated activation, migration and T cell stimulatory capacity

of dendritic cells". PLoS One. 2014;9:e104064.

[110] Oliver DC, et al. "Identification of secretion determinants of the Bordetella

pertussis BrkA autotransporter". J Bacteriol. 2003;185:489-95.

[111] Mattoo S, et al. "Molecular pathogenesis, epidemiology, and clinical

manifestations of respiratory infections due to Bordetella pertussis and other

Bordetella subspecies". Clin Microbiol Rev. 2005;18:326-82.

[112] Fernandez RC, et al. "Cloning and sequencing of a Bordetella pertussis

serum resistance locus". Infect Immun. 1994;62:4727-38.

[113] Oliver DC, et al. "Antibodies to BrkA augment killing of Bordetella

pertussis". Vaccine. 2001;20:235-41.

[114] Magalhaes JG, et al. "Murine Nod1 but not its human orthologue mediates

innate immune detection of tracheal cytotoxin". EMBO Rep. 2005;6:1201-7.

[115] Heiss LN, et al. "Nitric oxide mediates Bordetella pertussis tracheal cytotoxin

damage to the respiratory epithelium". Infect Agents Dis. 1993;2:173-7.

[116] Fukui-Miyazaki A, et al. "Bordetella dermonecrotic toxin binds to target cells

via the N-terminal 30 amino acids". Microbiol Immunol. 2011;55:154-9.

[117] Matsuzawa T, et al. "Bordetella dermonecrotic toxin undergoes proteolytic

processing to be translocated from a dynamin-related endosome into the cytoplasm

in an acidification-independent manner". J Biol Chem. 2004;279:2866-72.

[118] Walker KE, et al. "Characterization of the dermonecrotic toxin in members of

the genus Bordetella". Infect Immun. 1994;62:3817-28.

[119] Cowell JL, et al. "Intracellular localization of the dermonecrotic toxin of

Bordetella pertussis". Infect Immun. 1979;25:896-901.

[120] Horiguchi Y, et al. "Effects of Bordetella bronchiseptica dermonecrotic toxin

on the structure and function of osteoblastic clone MC3T3-e1 cells". Infect Immun.

1991;59:1112-6.

[121] Horiguchi Y, et al. "Stimulation of DNA synthesis in osteoblast-like MC3T3-E1

cells by Bordetella bronchiseptica dermonecrotic toxin". Infect Immun.

1993;61:3611-5.

[122] Brockmeier SL, et al. "Role of the dermonecrotic toxin of Bordetella

bronchiseptica in the pathogenesis of respiratory disease in swine". Infect Immun.

2002;70:481-90.

[123] Decker KB, et al. "The Bordetella pertussis model of exquisite gene control

by the global transcription factor BvgA". Microbiology. 2012;158:1665-76.

[124] Karimova G, et al. "Characterization of DNA binding sites for the BvgA

protein of Bordetella pertussis". J Bacteriol. 1997;179:3790-2.

[125] Beier D, et al. "The BvgS/BvgA phosphorelay system of pathogenic

Bordetellae: structure, function and evolution". Adv Exp Med Biol. 2008;631:149-60.

[126] Vergara-Irigaray N, et al. "Evaluation of the role of the Bvg intermediate

phase in Bordetella pertussis during experimental respiratory infection". Infect

Immun. 2005;73:748-60.

[127] de Gouw D, et al. "Proteomics-identified Bvg-activated autotransporters

protect against Bordetella pertussis in a mouse model". PLoS One.

2014;9:e105011.

[128] Jones AM, et al. "Role of BvgA phosphorylation and DNA binding affinity in

control of Bvg-mediated phenotypic phase transition in Bordetella pertussis". Mol

Microbiol. 2005;58:700-13.

[129] Byrd MS, et al. "An improved recombination-based in vivo expression

technology-like reporter system reveals differential cyaA gene activation in

Bordetella species". Infect Immun. 2013;81:1295-305.

[130] Nicholson TL, et al. "Phenotypic modulation of the virulent Bvg phase is not

required for pathogenesis and transmission of Bordetella bronchiseptica in swine".

Infect Immun. 2012;80:1025-36.

[131] Prugnola A, et al. "Response of the bvg regulon of Bordetella pertussis to

different temperatures and short-term temperature shifts". Microbiology. 1995;141 (Pt 10):2529-34.

[132] Boulanger A, et al. "In vivo phosphorylation dynamics of the Bordetella

pertussis virulence-controlling response regulator BvgA". Mol Microbiol.

2013;88:156-72.

[133] Warfel JM, et al. "Airborne transmission of Bordetella pertussis". J Infect

Dis. 2012;206:902-6.

[134] Warfel JM, et al. "Nonhuman primate model of pertussis". Infect Immun.

2012;80:1530-6.

[135] MacDonald H, et al. "Experimental pertussis". J Infect Dis.

1933;53:328-30.

[136] Paisley RD, et al. "Whooping cough in adults: an update on a reemerging

infection". Am J Med. 2012;125:141-3.

[137] Cherry JD, et al. "Clinical definitions of pertussis: Summary of a Global

Pertussis Initiative roundtable meeting, February 2011". Clin Infect Dis.

2012;54:1756-64.

[138] Cornia PB, et al. "Does this coughing adolescent or adult patient have

pertussis?". JAMA. 2010;304:890-6.

[139] Crowcroft NS, et al. "Recent developments in pertussis". Lancet.

2006;367:1926-36.

[140] Cherry JD, et al. "Defining pertussis epidemiology: clinical, microbiologic

and serologic perspectives". Pediatr Infect Dis J. 2005;24:S25-34.

[141] Zouari A, et al. "The diagnosis of pertussis: which method to choose?". Crit

Rev Microbiol. 2012;38:111-21.

[142] Wood N, et al. "Pertussis: review of epidemiology, diagnosis, management

and prevention". Paediatr Respir Rev. 2008;9:201-11; quiz 11-2.

[143] Guiso N, et al. "What to do and what not to do in serological diagnosis of

pertussis: recommendations from EU reference laboratories". Eur J Clin Microbiol

Infect Dis. 2011;30:307-12.

[144] Dalby T, et al. "Evaluation of PCR methods for the diagnosis of pertussis by

the European surveillance network for vaccine-preventable diseases

(EUVAC.NET)". Eur J Clin Microbiol Infect Dis. 2013;32:1285-9.

[145] Glare EM, et al. "Analysis of a repetitive DNA sequence from Bordetella

pertussis and its application to the diagnosis of pertussis using the polymerase

chain reaction". J Clin Microbiol. 1990;28:1982-7.

[146] Pittet LF, et al. "Diagnosis of whooping cough in Switzerland: differentiating

Bordetella pertussis from Bordetella holmesii by polymerase chain reaction". PLoS

One. 2014;9:e88936.

[147] Williams MM, et al. "Harmonization of Bordetella pertussis real-time PCR

diagnostics in the United States in 2012". J Clin Microbiol. 2015;53:118-23.

[148] Dierig A, et al. "Antibiotic treatment of pertussis: are 7 days really

sufficient?". Pediatr Infect Dis J. 2015;34:444-5.

[149] Wang K, et al. "Symptomatic treatment of the cough in whooping cough".

Cochrane Database Syst Rev. 2014;9:CD003257.

[150] Cherry JD. "Historical review of pertussis and the classical vaccine". J Infect

Dis. 1996;174 Suppl 3:S259-63.

[151] Onorato IM, et al. "Efficacy of whole-cell pertussis vaccine in preschool

children in the United States". JAMA. 1992;267:2745-9.

[152] Miller DL, et al. "Pertussis immunisation and serious acute neurological

illness in children". Br Med J (Clin Res Ed). 1981;282:1595-9.

[153] Geier DA, et al. "An evaluation of the effects of thimerosal on

neurodevelopmental disorders reported following DTP and Hib vaccines in

comparison to DTPH vaccine in the United States". J Toxicol Environ Health A.

2006;69:1481-95.

[154] Chiappini E, et al. "Pertussis re-emergence in the post-vaccination era".

BMC Infect Dis. 2013;13:151.

[155] Klein NP, et al. "Post-marketing safety evaluation of a tetanus toxoid,

reduced diphtheria toxoid and 3-component acellular pertussis vaccine

administered to a cohort of adolescents in a United States health maintenance

organization". Pediatr Infect Dis J. 2010;29:613-7.

[156] Sato Y, et al. "Development of acellular pertussis vaccines". Biologicals.

1999;27:61-9.

[157] Thierry-Carstensen B, et al. "Experience with monocomponent acellular

pertussis combination vaccines for infants, children, adolescents and adults--a

review of safety, immunogenicity, efficacy and effectiveness studies and 15 years of

field experience". Vaccine. 2013;31:5178-91.

[158] Meade BD, et al. "Possible options for new pertussis vaccines". J Infect Dis.

2014;209 Suppl 1:S24-7.

[159] van Gent M, et al. "Analysis of Bordetella pertussis clinical isolates

circulating in European countries during the period 1998-2012". Eur J Clin

Microbiol Infect Dis. 2015;34:821-30.

[160] Munoz FM. "Pertussis in infants, children, and adolescents: diagnosis,

treatment, and prevention". Semin Pediatr Infect Dis. 2006;17:14-9.

[161] Roberts M, et al. "Protection of mice against respiratory Bordetella pertussis

infection by intranasal immunization with P.69 and FHA". Vaccine. 1993;11:866-72.

[162] Cahill ES, et al. "Immune responses and protection against Bordetella

pertussis infection after intranasal immunization of mice with filamentous

haemagglutinin in solution or incorporated in biodegradable microparticles".

Vaccine. 1995;13:455-62.

[163] Roberts M, et al. "Recombinant P.69/pertactin: immunogenicity and

protection of mice against Bordetella pertussis infection". Vaccine. 1992;10:43-8.

[164] Gustafsson L, et al. "A controlled trial of a two-component acellular, a five-component

acellular, and a whole-cell pertussis vaccine". N Engl J Med.

1996;334:349-55.

[165] Olin P, et al. "Randomised controlled trial of two-component, three-component,

and five-component acellular pertussis vaccines compared with whole-cell

pertussis vaccine. Ad Hoc Group for the Study of Pertussis Vaccines". Lancet.

1997;350:1569-77.

[166] van Amersfoorth SC, et al. "Analysis of Bordetella pertussis populations in

European countries with different vaccination policies". J Clin Microbiol.

2005;43:2837-43.

[167] Mooi FR, et al. "Pertussis resurgence: waning immunity and pathogen

adaptation - two sides of the same coin". Epidemiol Infect. 2014;142:685-94.

[168] Bottero D, et al. "Genotypic and phenotypic characterization of Bordetella

pertussis strains used in different vaccine formulations in Latin America". J Appl

Microbiol. 2012;112:1266-76.

[169] Miller E. "Overview of recent clinical trials of acellular pertussis vaccines".

Biologicals. 1999;27:79-86.

[170] Halperin SA, et al. "Adult formulation of a five component acellular pertussis

vaccine combined with diphtheria and tetanus toxoids and inactivated poliovirus

vaccine is safe and immunogenic in adolescents and adults". Pediatr Infect Dis J.

2000;19:276-83.

[171] Belloni C, et al. "Immunogenicity of a three-component acellular pertussis

vaccine administered at birth". Pediatrics. 2003;111:1042-5.

[172] Tomovici A, et al. "Humoral immunity 10 years after booster immunization

with an adolescent and adult formulation combined tetanus, diphtheria, and

5-component acellular pertussis vaccine". Vaccine. 2012;30:2647-53.

[173] Aoyama T, et al. "Efficacy and immunogenicity of acellular pertussis vaccine

by manufacturer and patient age". Am J Dis Child. 1989;143:655-9.

[174] Thollot F, et al. "A randomized study to evaluate the immunogenicity and

safety of a heptavalent diphtheria, tetanus, pertussis, hepatitis B, poliomyelitis,

haemophilus influenzae b, and meningococcal serogroup C combination vaccine

administered to infants at 2, 4 and 12 months of age". Pediatr Infect Dis J.

2014;33:1246-54.

[175] Canthaboo C, et al. "Investigation of an aerosol challenge model as

alternative to the intracerebral mouse protection test for potency assay of whole

cell pertussis vaccines". Biologicals. 2000;28:241-6.

[176] Cherry JD. "Epidemic Pertussis and Acellular Pertussis Vaccine Failure in

the 21st Century". Pediatrics. 2015.

[177] Poolman JT. "Shortcomings of pertussis vaccines: why we need a third

generation vaccine". Expert Rev Vaccines. 2014;13:1159-62.

[178] Warfel JM, et al. "Acellular pertussis vaccines protect against disease but fail

to prevent infection and transmission in a nonhuman primate model". Proc Natl

Acad Sci U S A. 2014;111:787-92.

[179] Poolman JT. "Foreword. Pertussis vaccines". Expert Rev Vaccines.

2014;13:1067-9.

[180] Zhang L, et al. "Acellular vaccines for preventing whooping cough in

children". Cochrane Database Syst Rev. 2012;3:CD001478.

[181] Robbins JB, et al. "Pertussis vaccine: a critique". Pediatr Infect Dis J.

2009;28:237-41.

[182] Taranger J, et al. "Protection against pertussis with a monocomponent

pertussis toxoid vaccine". Biologicals. 1999;27:89.

[183] Hviid A, et al. "Impact of routine vaccination with a pertussis toxoid vaccine

in Denmark". Vaccine. 2004;22:3530-4.

[184] Greco D, et al. "A controlled trial of two acellular vaccines and one whole-cell

vaccine against pertussis. Progetto Pertosse Working Group". N Engl J Med.

1996;334:341-8.

[185] van den Berg BM, et al. "Protection and humoral immune responses against

Bordetella pertussis infection in mice immunized with acellular or cellular

pertussis immunogens". Vaccine. 2000;19:1118-28.

[186] Denoel P, et al. "Effects of adsorption of acellular pertussis antigens onto

different aluminium salts on the protective activity in an intranasal murine model of

Bordetella pertussis infection". Vaccine. 2002;20:2551-5.

[187] Allen AC, et al. "Improved pertussis vaccines based on adjuvants that induce

cell-mediated immunity". Expert Rev Vaccines. 2014;13:1253-64.

[188] Dunne A, et al. "A novel TLR2 agonist from Bordetella pertussis is a potent

adjuvant that promotes protective immunity with an acellular pertussis vaccine".

Mucosal Immunol. 2015;8:607-17.

[189] Sheridan SL, et al. "Number and order of whole cell pertussis vaccines in

infancy and disease protection". JAMA. 2012;308:454-6.

[190] Klein NP, et al. "Waning protection after fifth dose of acellular pertussis

vaccine in children". N Engl J Med. 2012;367:1012-9.

[191] Liko J, et al. "Priming with whole-cell versus acellular pertussis vaccine". N

Engl J Med. 2013;368:581-2.

[192] Edwards KM, et al. "Immune responses to pertussis vaccines and disease". J

Infect Dis. 2014;209 Suppl 1:S10-5.

[193] Mills KH, et al. "Cell-mediated immunity to Bordetella pertussis: role of Th1

cells in bacterial clearance in a murine respiratory infection model". Infect Immun.

1993;61:399-410.

[194] Mills KH, et al. "Mouse and pig models for studies of natural and vaccine-induced

immunity to Bordetella pertussis". J Infect Dis. 2014;209 Suppl 1:S16-9.

[195] Barbic J, et al. "Role of gamma interferon in natural clearance of Bordetella

pertussis infection". Infect Immun. 1997;65:4904-8.

[196] Ross PJ, et al. "Relative contribution of Th1 and Th17 cells in adaptive

immunity to Bordetella pertussis: towards the rational design of an improved

acellular pertussis vaccine". PLoS Pathog. 2013;9:e1003264.

[197] Xing DK, et al. "Nitric oxide induction in murine macrophages and spleen

cells by whole-cell Bordetella pertussis vaccine". Vaccine. 1998;16:16-23.

[198] Mahon BP, et al. "Interleukin-12 is produced by macrophages in response to

live or killed Bordetella pertussis and enhances the efficacy of an acellular

pertussis vaccine by promoting induction of Th1 cells". Infect Immun.

1996;64:5295-301.

[199] Barnard A, et al. "Th1/Th2 cell dichotomy in acquired immunity to Bordetella

pertussis: variables in the in vivo priming and in vitro cytokine detection

techniques affect the classification of T-cell subsets as Th1, Th2 or Th0".

Immunology. 1996;87:372-80.

[200] Hendrikx LH, et al. "Serum IgA responses against pertussis proteins in

infected and Dutch wP or aP vaccinated children: an additional role in pertussis

diagnostics". PLoS One. 2011;6:e27681.

[201] Hellwig SM, et al. "Immunoglobulin A-mediated protection against

Bordetella pertussis infection". Infect Immun. 2001;69:4846-50.

[202] Wolfe DN, et al. "Comparative role of immunoglobulin A in protective

immunity against the Bordetellae". Infect Immun. 2007;75:4416-22.

[203] Royle J, et al. "Fifty years of immunisation in Australia (1964-2014): the

increasing opportunity to prevent diseases". J Paediatr Child Health. 2015;51:16-20.

[204] Campbell P, et al. "Increased population prevalence of low pertussis toxin

antibody levels in young children preceding a record pertussis epidemic in

Australia". PLoS One. 2012;7:e35874.

[205] Quinn HE, et al. "The seroepidemiology of pertussis in NSW: fluctuating

immunity profiles related to changes in vaccination schedules". N S W Public

Health Bull. 2011;22:224-9.

[206] Zepp F, et al. "Rationale for pertussis booster vaccination throughout life in

Europe". Lancet Infect Dis. 2011;11:557-70.

[207] Shah PD, et al. "What parents and adolescent boys want in school

vaccination programs in the United States". J Adolesc Health. 2014;54:421-7.

[208] Clark TA. "Changing pertussis epidemiology: everything old is new again". J

Infect Dis. 2014;209:978-81.

[209] "Pertussis vaccines: WHO position paper". Wkly Epidemiol Rec.

2010;85:385-400.

[210] Moore DM, et al. "Patterns of susceptibility in an outbreak of Bordetella

pertussis: evidence from a community-based study". Can J Infect Dis. 2002;13:305-10.

[211] Skowronski DM, et al. "The changing age and seasonal profile of pertussis in

Canada". J Infect Dis. 2002;185:1448-53.

[212] Tanaka M, et al. "Trends in pertussis among infants in the United States,

1980-1999". JAMA. 2003;290:2968-75.

[213] de Melker HE, et al. "Reemergence of pertussis in the highly vaccinated

population of the Netherlands: observations on surveillance data". Emerg Infect

Dis. 2000;6:348-57.

[214] Baron S, et al. "Epidemiology of pertussis in French hospitals in 1993 and

1994: thirty years after a routine use of vaccination". Pediatr Infect Dis J.

1998;17:412-8.

[215] Winter K, et al. "California pertussis epidemic, 2010". J Pediatr.

2012;161:1091-6.

[216] Gabutti G, et al. "Pertussis: a review of disease epidemiology worldwide and

in Italy". Int J Environ Res Public Health. 2012;9:4626-38.

[217] EUVAC.NET. "Pertussis surveillance report 2003–2007".

www.euvac.net/graphics/euvac/pdf/pertussis2.pdf. 2011.

[218] Tan T, et al. "Pertussis Across the Globe: Recent Epidemiologic Trends From

2000-2013". Pediatr Infect Dis J. 2015.

[219] Pillsbury A, et al. "Australian vaccine preventable disease epidemiological

review series: pertussis, 2006-2012". Commun Dis Intell Q Rep. 2014;38:E179-94.

[220] He Q, et al. "Whooping cough caused by Bordetella pertussis and Bordetella

parapertussis in an immunized population". JAMA. 1998;280:635-7.

[221] Gilberg S, et al. "Evidence of Bordetella pertussis infection in adults

presenting with persistent cough in a French area with very high whole-cell vaccine

coverage". J Infect Dis. 2002;186:415-8.

[222] Scheil W, et al. "Pertussis in South Australia 1893 to 1996". Commun Dis

Intell. 1998;22:76-80.

[223] Quinn HE, et al. "Pertussis epidemiology in Australia over the decade 1995-2005--trends

by region and age group". Commun Dis Intell Q Rep. 2007;31:205-15.

[224] Andrews R, et al. "Pertussis notifications in Australia, 1991 to 1997".

Commun Dis Intell. 1997;21:145-8.

[225] Elliott E, et al. "National study of infants hospitalized with pertussis in the

acellular vaccine era". Pediatr Infect Dis J. 2004;23:246-52.

[226] Quinn HE. "Pertussis control in Australia--the current state of play".

Commun Dis Intell Q Rep. 2014;38:E177-8.

[227] Hewlett EL, et al. "Clinical practice. Pertussis--not just for kids". N Engl J

Med. 2005;352:1215-22.

[228] Loeffelholz MJ, et al. "Comparison of PCR, culture, and direct fluorescent-antibody

testing for detection of Bordetella pertussis". J Clin Microbiol.

1999;37:2872-6.

[229] Farrell DJ, et al. "Rapid-cycle PCR method to detect Bordetella pertussis that

fulfills all consensus recommendations for use of PCR in diagnosis of pertussis". J

Clin Microbiol. 2000;38:4499-502.

[230] He Q, et al. "Factors contributing to pertussis resurgence". Future Microbiol.

2008;3:329-39.

[231] Fisman DN, et al. "Pertussis resurgence in Toronto, Canada: a population-based

study including test-incidence feedback modeling". BMC Public Health.

2011;11:694.

[232] Ghanaie RM, et al. "Sensitivity and specificity of the World Health

Organization pertussis clinical case definition". Int J Infect Dis. 2010;14:e1072-5.

[233] Ward JI, et al. "Efficacy of an acellular pertussis vaccine among adolescents

and adults". N Engl J Med. 2005;353:1555-63.

[234] Wendelboe AM, et al. "Duration of immunity against pertussis after natural

infection or vaccination". Pediatr Infect Dis J. 2005;24:S58-61.

[235] Sheridan SL, et al. "Acellular pertussis vaccine effectiveness for children

during the 2009-2010 pertussis epidemic in Queensland". Med J Aust.

2014;200:334-8.

[236] Witt MA, et al. "Unexpectedly limited durability of immunity following

acellular pertussis vaccination in preadolescents in a North American outbreak".

Clin Infect Dis. 2012;54:1730-5.

[237] Witt MA, et al. "Reduced risk of pertussis among persons ever vaccinated

with whole cell pertussis vaccine compared to recipients of acellular pertussis

vaccines in a large US cohort". Clin Infect Dis. 2013;56:1248-54.

[238] Gustafsson L, et al. "Long-term follow-up of Swedish children vaccinated

with acellular pertussis vaccines at 3, 5, and 12 months of age indicates the need

for a booster dose at 5 to 7 years of age". Pediatrics. 2006;118:978-84.

[239] Sheridan SL, et al. "Waning vaccine immunity in teenagers primed with

whole cell and acellular pertussis vaccine: recent epidemiology". Expert Rev

Vaccines. 2014;13:1081-106.

[240] Moraga-Llop FA, et al. "[Pertussis vaccine. Reemergence of the disease and

new vaccination strategies]". Enferm Infecc Microbiol Clin. 2015;33:190-6.

[241] Suryadevara M, et al. "Prevention of pertussis through adult vaccination".

Hum Vaccin Immunother. 2015:0.

[242] Moraga-Llop FA, et al. "[Pertussis in fully vaccinated infants and children.

Are new vaccination strategies required?]". Enferm Infecc Microbiol Clin.

2014;32:236-41.

[243] Jongerius I, et al. "Complement evasion by Bordetella pertussis: implications

for improving current vaccines". J Mol Med (Berl). 2015;93:395-402.

[244] Forsyth K, et al. "Strategies to decrease pertussis transmission to infants".

Pediatrics. 2015;135:e1475-82.

[245] Ahmed N, et al. "Genomic fluidity and pathogenic bacteria: applications in

diagnostics, epidemiology and intervention". Nat Rev Microbiol. 2008;6:387-94.

[246] Ochman H, et al. "Genes lost and genes found: evolution of bacterial

pathogenesis and symbiosis". Science. 2001;292:1096-9.

[247] Smith J. "The social evolution of bacterial pathogenesis". Proc Biol Sci.

2001;268:61-9.

[248] Caro V, et al. "Is the Sequenced Bordetella pertussis strain Tohama I

representative of the species?". J Clin Microbiol. 2008;46:2125-8.

[249] Park J, et al. "Comparative genomics of the classical Bordetella subspecies:

the evolution and exchange of virulence-associated diversity amongst closely

related pathogens". BMC Genomics. 2012;13:545.

[250] Zhang S, et al. "Complete genome sequence of Bordetella pertussis CS, a

Chinese pertussis vaccine strain". J Bacteriol. 2011;193:4017-8.

[251] Sealey KL, et al. "Genomic analysis of isolates from the United Kingdom

2012 pertussis outbreak reveals that vaccine antigen genes are unusually fast

evolving". J Infect Dis. 2015;212:294-301.

[252] Harvill ET, et al. "Genome Sequences of 28 Bordetella pertussis U.S.

Outbreak Strains Dating from 2010 to 2012". Genome Announc. 2013;1.

[253] Bart MJ, et al. "Complete Genome Sequences of Bordetella pertussis Isolates

B1917 and B1920, Representing Two Predominant Global Lineages". Genome

Announc. 2014;2.

[254] Mooi FR. "Bordetella pertussis and vaccination: the persistence of a

genetically monomorphic pathogen". Infect Genet Evol. 2010;10:36-49.

[255] Packard ER, et al. "Sequence variation and conservation in virulence-related

genes of Bordetella pertussis isolates from the UK". J Med Microbiol. 2004;53:355-65.

[256] Mooi FR, et al. "Variation in the Bordetella pertussis virulence factors

pertussis toxin and pertactin in vaccine strains and clinical isolates in Finland".

Infect Immun. 1999;67:3133-4.

[257] Han HJ, et al. "Antigenic variation in Bordetella pertussis isolates recovered

from adults and children in Japan". Vaccine. 2008;26:1530-4.

[258] Kallonen T, et al. "Bordetella pertussis strain variation and evolution

postvaccination". Expert Rev Vaccines. 2009;8:863-75.

[259] Octavia S, et al. "Newly emerging clones of Bordetella pertussis carrying

prn2 and ptxP3 alleles implicated in Australian pertussis epidemic in 2008-2010". J

Infect Dis. 2012;205:1220-4.

[260] Schmidtke AJ, et al. "Population diversity among Bordetella pertussis

isolates, United States, 1935-2009". Emerg Infect Dis. 2012;18:1248-55.

[261] Petersen RF, et al. "Temporal trends in Bordetella pertussis populations,

Denmark, 1949-2010". Emerg Infect Dis. 2012;18:767-74.

[262] Zhang L, et al. "Effect of vaccination on Bordetella pertussis strains, China".

Emerg Infect Dis. 2010;16:1695-701.

[263] Fry NK, et al. "Genotypic variation in the Bordetella pertussis virulence

factors pertactin and pertussis toxin in historical and recent clinical isolates in the

United Kingdom". Infect Immun. 2001;69:5520-8.

[264] Njamkepo E, et al. "Genomic analysis and comparison of Bordetella

pertussis isolates circulating in low and high vaccine coverage areas". Microbes

Infect. 2008;10:1582-6.

[265] Xu Y, et al. "Characterization of co-purified acellular pertussis vaccines".

Hum Vaccin Immunother. 2015;11:421-7.

[266] He Q, et al. "Bordetella pertussis protein pertactin induces type-specific

antibodies: one possible explanation for the emergence of antigenic variants?". J

Infect Dis. 2003;187:1200-5.

[267] van der Ark AA, et al. "Resurgence of pertussis calls for re-evaluation of

pertussis animal models". Expert Rev Vaccines. 2012;11:1121-37.

[268] Elomaa A, et al. "Population dynamics of Bordetella pertussis in Finland and

Sweden, neighbouring countries with different vaccination histories". Vaccine.

2007;25:918-26.

[269] Kallonen T, et al. "Differences in the genomic content of Bordetella pertussis

isolates before and after introduction of pertussis vaccines in four European

countries". Infect Genet Evol. 2011;11:2034-42.

[270] Hallander HO, et al. "Shifts of Bordetella pertussis variants in Sweden from

1970 to 2003, during three periods marked by different vaccination programs". J

Clin Microbiol. 2005;43:2856-65.

[271] Litt DJ, et al. "Changes in genetic diversity of the Bordetella pertussis

population in the United Kingdom between 1920 and 2006 reflect vaccination

coverage and emergence of a single dominant clonal type". J Clin Microbiol.

2009;47:680-8.

[272] Borisova O, et al. "Antigenic divergence between Bordetella pertussis clinical

isolates from Moscow, Russia, and vaccine strains". Clin Vaccine Immunol.

2007;14:234-8.

[273] Simmonds K, et al. "Dominance of two genotypes of Bordetella pertussis

during a period of increased pertussis activity in Alberta, Canada: January to

August 2012". Int J Infect Dis. 2014;29:223-5.

[274] Elomaa A, et al. "Pertussis before and after the introduction of acellular

pertussis vaccines in Finland". Vaccine. 2009;27:5443-9.

[275] Tsang RS, et al. "Polymorphisms of the fimbria fim3 gene of Bordetella

pertussis strains isolated in Canada". J Clin Microbiol. 2004;42:5364-7.

[276] van Loo IH, et al. "Temporal trends in the population structure of Bordetella

pertussis during 1949-1996 in a highly vaccinated population". J Infect Dis.

1999;179:915-23.

[277] van Loo IH, et al. "Multilocus sequence typing of Bordetella pertussis based

on surface protein genes". J Clin Microbiol. 2002;40:1994-2001.

[278] Rosenberg SM. "Evolving responsively: adaptive mutation". Nat Rev Genet.

2001;2:504-15.

[279] Bryant J, et al. "Developing insights into the mechanisms of evolution of

bacterial pathogens from whole-genome sequences". Future Microbiol.

2012;7:1283-96.

[280] Maharjan RP, et al. "Genome-wide analysis of single nucleotide

polymorphisms in Bordetella pertussis using comparative genomic sequencing".

Res Microbiol. 2008;159:602-8.

[281] Mooi FR, et al. "Bordetella pertussis strains with increased toxin production

associated with pertussis resurgence". Emerg Infect Dis. 2009;15:1206-13.

[282] van Gent M, et al. "Small mutations in Bordetella pertussis are associated

with selective sweeps". PLoS One. 2012;7:e46407.

[283] Bart MJ, et al. "Comparative genomics of prevaccination and modern

Bordetella pertussis strains". BMC Genomics. 2010;11:627.

[284] King AJ, et al. "Genome-wide gene expression analysis of Bordetella

pertussis isolates associated with a resurgence in pertussis: elucidation of factors

involved in the increased fitness of epidemic strains". PLoS One. 2013;8:e66150.

[285] Lam C, et al. "Investigating genome reduction of Bordetella pertussis using a

multiplex PCR-based reverse line blot assay (mPCR/RLB)". BMC Res Notes.

2014;7:727.

[286] King AJ, et al. "Comparative genomic profiling of Dutch clinical Bordetella

pertussis isolates using DNA microarrays: identification of genes absent from

epidemic strains". BMC Genomics. 2008;9:311.

[287] Bouchez V, et al. "Genomic content of Bordetella pertussis clinical isolates

circulating in areas of intensive children vaccination". PLoS One. 2008;3:e2437.

[288] Caro V, et al. "Temporal analysis of French Bordetella pertussis isolates by

comparative whole-genome hybridization". Microbes Infect. 2006;8:2228-35.

[289] Tizolova A, et al. "Insertion sequences shared by Bordetella species and

implications for the biological diagnosis of pertussis syndrome". Eur J Clin

Microbiol Infect Dis. 2013;32:89-96.

[290] Bouchez V, et al. "First report and detailed characterization of B. pertussis

isolates not expressing Pertussis Toxin or Pertactin". Vaccine. 2009;27:6034-41.

[291] Otsuka N, et al. "Prevalence and genetic characterization of pertactin-

deficient Bordetella pertussis in Japan". PLoS One. 2012;7:e31985.

[292] Hegerle N, et al. "Evolution of French Bordetella pertussis and Bordetella

parapertussis isolates: increase of Bordetellae not expressing pertactin". Clin

Microbiol Infect. 2012;18:E340-6.

[293] Pawloski LC, et al. "Prevalence and molecular characterization of pertactin-

deficient Bordetella pertussis in the United States". Clin Vaccine Immunol.

2014;21:119-25.

[294] Tsang RS, et al. "Pertactin-negative Bordetella pertussis strains in Canada:

characterization of a dozen isolates based on a survey of 224 samples collected in

different parts of the country over the last 20 years". Int J Infect Dis. 2014;28:65-9.

[295] Kurova N, et al. "Monitoring of Bordetella isolates circulating in Saint

Petersburg, Russia between 2001 and 2009". Res Microbiol. 2010;161:810-5.

[296] Bodilis H, et al. "Virulence of pertactin-negative Bordetella pertussis isolates

from infants, France". Emerg Infect Dis. 2013;19:471-4.

[297] Martin SW, et al. "Pertactin-negative Bordetella pertussis strains: evidence

for a possible selective advantage". Clin Infect Dis. 2015;60:223-7.

[298] Mooi FR, et al. "The pertussis problem: classical epidemiology and strain

characterization should go hand in hand". J Pediatr (Rio J). 2015.

[299] Maiden MC, et al. "Multilocus sequence typing: a portable approach to the

identification of clones within populations of pathogenic microorganisms". Proc

Natl Acad Sci U S A. 1998;95:3140-5.

[300] Mooi FR, et al. "Epidemiological typing of Bordetella pertussis isolates:

recommendations for a standard methodology". Eur J Clin Microbiol Infect Dis.

2000;19:174-81.

[301] Advani A, et al. "Pulsed-field gel electrophoresis analysis of Bordetella

pertussis isolates circulating in Europe from 1998 to 2009". J Clin Microbiol.

2013;51:422-8.

[302] Xu Y, et al. "Whole-genome sequencing reveals the effect of vaccination on

the evolution of Bordetella pertussis". Sci Rep. 2015;5:12888.

[303] Levinson G, et al. "Slipped-strand mispairing: a major mechanism for DNA

sequence evolution". Mol Biol Evol. 1987;4:203-21.

[304] Schouls LM, et al. "Multiple-locus variable-number tandem repeat analysis

of Dutch Bordetella pertussis strains reveals rapid genetic changes with clonal

expansion during the late 1990s". J Bacteriol. 2004;186:5496-505.

[305] Kurniawan J, et al. "Bordetella pertussis clones identified by multilocus

variable-number tandem-repeat analysis". Emerg Infect Dis. 2010;16:297-300.

[306] Octavia S, et al. "Insight into evolution of Bordetella pertussis from

comparative genomic analysis: evidence of vaccine-driven selection". Mol Biol

Evol. 2011;28:707-15.

[307] Poynten M, et al. "Temporal trends in circulating Bordetella pertussis strains

in Australia". Epidemiol Infect. 2004;132:185-93.

[308] Lam C, et al. "Selection and emergence of pertussis toxin promoter ptxP3

allele in the evolution of Bordetella pertussis". Infect Genet Evol. 2012;12:492-5.

[309] Barkoff AM, et al. "Appearance of Bordetella pertussis strains not expressing

the vaccine antigen pertactin in Finland". Clin Vaccine Immunol. 2012;19:1703-4.

[310] Zeddeman A, et al. "Investigations into the emergence of pertactin-deficient

Bordetella pertussis isolates in six European countries, 1996 to 2012". Euro

Surveill. 2014;19.

[311] Lam C, et al. "Rapid increase in pertactin-deficient Bordetella pertussis

isolates, Australia". Emerg Infect Dis. 2014;20:626-33.

[312] Yellowlees A, et al. "Estimating vaccine efficacy using animal efficacy data".

Eur J Pharmacol. 2015;759:63-8.

[313] Mills KH, et al. "A respiratory challenge model for infection with Bordetella

pertussis: application in the assessment of pertussis vaccine potency and in

defining the mechanism of protective immunity". Dev Biol Stand. 1998;95:31-41.

[314] Boursaux-Eude C, et al. "Intranasal murine model of Bordetella pertussis

infection: II. Sequence variation and protection induced by a tricomponent

acellular vaccine". Vaccine. 1999;17:2651-60.

[315] Watanabe M, et al. "Efficacy of pertussis components in an acellular vaccine,

as assessed in a murine model of respiratory infection and a murine intracerebral

challenge model". Vaccine. 2002;20:1429-34.

[316] Hegerle N, et al. "Pertactin deficient Bordetella pertussis present a better

fitness in mice immunized with an acellular pertussis vaccine". Vaccine.

2014;32:6597-600.

[317] van Gent M, et al. "Studies on Prn variation in the mouse model and

comparison with epidemiological data". PLoS One. 2011;6:e18014.

[318] Hulbert RR, et al. "Laboratory Maintenance of Bordetella pertussis". Curr

Protoc Microbiol. 2009;Chapter 4:Unit 4B 1.

[319] Spokes PJ, et al. "Review of the 2008-2009 pertussis epidemic in NSW:

notifications and hospitalisations". N S W Public Health Bull. 2010;21:167-73.

[320] Queenan AM, et al. "Pertactin-negative variants of Bordetella pertussis in

the United States". N Engl J Med. 2013;368:583-4.

[321] Quinlan T, et al. "Pertactin-negative variants of Bordetella pertussis in New

York State: a retrospective analysis, 2004-2013". Mol Cell Probes. 2014;28:138-40.

[322] Lam C, et al. "Rapid Increase in Pertactin-deficient Bordetella pertussis

Isolates, Australia". Emerg Infect Dis. 2014;20:626-33.

[323] Octavia S, et al. "Frequent recombination and low level of clonality within

Salmonella enterica subspecies I". Microbiology. 2006;152:1099-108.

[324] Zerbino DR, et al. "Velvet: Algorithms for de novo short read assembly using

de Bruijn graphs". Genome Res. 2008;18:821-9.

[325] Darling AE, et al. "progressiveMauve: Multiple Genome Alignment with

Gene Gain, Loss and Rearrangement". PLoS One. 2010;5:e11147.

[326] Pang S, et al. "Genomic diversity and adaptation of Salmonella enterica

serovar Typhimurium from analysis of six genomes of different phage types". BMC

Genomics. 2013;14:718.

[327] Li H, et al. "Fast and accurate short read alignment with Burrows-Wheeler

transform". Bioinformatics. 2009;25:1754-60.

[328] Li H, et al. "The Sequence Alignment/Map format and SAMtools".

Bioinformatics. 2009;25:2078-9.

[329] Tamura K, et al. "MEGA5: molecular evolutionary genetics analysis using

maximum likelihood, evolutionary distance, and maximum parsimony methods".

Mol Biol Evol. 2011;28:2731-9.

[330] Drummond AJ, et al. "Bayesian phylogenetics with BEAUti and the BEAST

1.7". Mol Biol Evol. 2012;29:1969-73.

[331] Garcia-Alcalde F, et al. "Qualimap: evaluating next-generation sequencing

alignment data". Bioinformatics. 2012;28:2678-9.

[332] King AJ, et al. "Changes in the genomic content of circulating Bordetella

pertussis strains isolated from the Netherlands, Sweden, Japan and Australia:

adaptive evolution or drift?". BMC Genomics. 2010;11:64.

[333] Gogol EB, et al. "Phase variation and microevolution at homopolymeric

tracts in Bordetella pertussis". BMC Genomics. 2007;8:122.

[334] Bokhari H, et al. "BapC autotransporter protein of Bordetella pertussis is an

adhesion factor". J Basic Microbiol. 2012;52:390-6.

[335] Parkhill J, et al. "Comparative analysis of the genome sequences of

Bordetella pertussis, Bordetella parapertussis and Bordetella bronchiseptica".

Nat Genet. 2003;35:32-40.

[336] King AJ, et al. "Comparative genomic profiling of Dutch clinical Bordetella

pertussis isolates using DNA microarrays: identification of genes absent from

epidemic strains". BMC Genomics. 2008;9:311.

[337] Heikkinen E, et al. "Comparative genomics of Bordetella pertussis reveals

progressive gene loss in Finnish strains". PLoS One. 2007;2:e904.

[338] Antoine R, et al. "New virulence-activated and virulence-repressed genes

identified by systematic gene inactivation and generation of transcriptional fusions

in Bordetella pertussis". J Bacteriol. 2000;182:5902-5.

[339] Siezen RJ, et al. "Subtilases: the superfamily of subtilisin-like serine

proteases". Protein Sci. 1997;6:501-23.

[340] Coutte L, et al. "Subtilisin-like autotransporter serves as maturation protease

in a bacterial secretion pathway". EMBO J. 2001;20:5040-8.

[341] Coutte L, et al. "Role of adhesin release for mucosal colonization by a

bacterial pathogen". J Exp Med. 2003;197:735-42.

[342] Mazar J, et al. "Topology and maturation of filamentous haemagglutinin

suggest a new model for two-partner secretion". Mol Microbiol.

2006;62:641-54.

[343] Higgins CF. "ABC transporters: physiology, structure and mechanism - an

overview". Res Microbiol. 2001;152:205-10.

[344] Saurin W, et al. "Bacterial Binding Protein-Dependent Permeases -

Characterization of Distinctive Signatures for Functionally Related Integral

Cytoplasmic Membrane-Proteins". Mol Microbiol. 1994;12:993-1004.

[345] Linton KJ, et al. "The Escherichia coli ATP-binding cassette (ABC) proteins".

Mol Microbiol. 1998;28:5-13.

[346] Kelly FJ. "Gluthathione: in defence of the lung". Food Chem Toxicol.

1999;37:963-6.

[347] Stenson TH, et al. "Reduced glutathione is required for pertussis toxin

secretion by Bordetella pertussis". Infect Immun. 2003;71:1316-20.

[348] van Gent M, et al. "SNP-based typing: a useful tool to study Bordetella

pertussis populations". PLoS One. 2011;6:e20340.

[349] Quail MA, et al. "A tale of three next generation sequencing platforms:

comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers".

BMC Genomics. 2012;13:341.

[350] Eid J, et al. "Real-time DNA sequencing from single polymerase molecules".

Science. 2009;323:133-8.

[351] Darling AE, et al. "progressiveMauve: multiple genome alignment with gene

gain, loss and rearrangement". PLoS One. 2010;5:e11147.

[352] Carver TJ, et al. "ACT: the Artemis Comparison Tool". Bioinformatics.

2005;21:3422-3.

[353] Gurevich A, et al. "QUAST: quality assessment tool for genome assemblies".

Bioinformatics. 2013;29:1072-5.

[354] Lerat E, et al. "Recognizing the pseudogenes in bacterial genomes". Nucleic

Acids Res. 2005;33:3125-32.

[355] Noofeli M, et al. "BapC autotransporter protein is a virulence determinant of

Bordetella pertussis". Microb Pathog. 2011;51:169-77.

[356] Bokhari H, et al. "BapC autotransporter protein of Bordetella pertussis is an

adhesion factor". J Basic Microbiol. 2012;52:390-6.

[357] Fennelly NK, et al. "Bordetella pertussis expresses a functional type III

secretion system that subverts protective innate and adaptive immune responses".

Infect Immun. 2008;76:1257-66.

[358] Zeddeman A, et al. "Studying Bordetella pertussis populations by use of

SNPeX, a simple high-throughput single nucleotide polymorphism typing method".

J Clin Microbiol. 2015;53:838-46.

[359] Bolotin E, et al. "Gene Loss Dominates As a Source of Genetic Variation

within Clonal Pathogenic Bacterial Species". Genome Biol Evol. 2015;7:2173-87.

[360] Brinig MM, et al. "Significant gene order and expression differences in

Bordetella pertussis despite limited gene content variation". J Bacteriol.

2006;188:2375-82.

[361] Stapley J, et al. "Transposable elements as agents of rapid adaptation may

explain the genetic paradox of invasive species". Mol Ecol. 2015;24:2241-52.

[362] Siguier P, et al. "Everyman's Guide to Bacterial Insertion Sequences".

Microbiol Spectr. 2015;3:MDNA3-0030-2014.

[363] Liu GR, et al. "Genome plasticity and ori-ter rebalancing in Salmonella

typhi". Mol Biol Evol. 2006;23:365-71.

[364] Darling AE, et al. "Dynamics of genome rearrangement in bacterial

populations". PLoS Genet. 2008;4:e1000128.

[365] Matthews TD, et al. "Chromosomal rearrangements in Salmonella enterica

serovar Typhi strains isolated from asymptomatic human carriers". MBio.

2011;2:e00060-11.

[366] Lam C, et al. "Selection and emergence of pertussis toxin promoter ptxP3

allele in the evolution of Bordetella pertussis". Infect Genet Evol. 2012;12:492-5.

[367] Poolman JT, et al. "Acellular pertussis vaccines and the role of pertactin and

fimbriae". Expert Rev Vaccines. 2007;6:47-56.

[368] WHO. "Pertussis vaccines: WHO position paper – August 2015". Weekly

Epidemiological Record. 2015. p. 433-60.

[369] Polak M, et al. "Colonization of Bordetella pertussis clinical isolates that

differ by pulsed field gel electrophoresis types in the lungs of naive mice or mice

immunized with the whole-cell pertussis vaccine used in Poland". Arch Immunol

Ther Exp (Warsz). 2015;63:155-60.

[370] Guiso N, et al. "Intranasal murine model of Bordetella pertussis infection. I.

Prediction of protection in human infants by acellular vaccines". Vaccine.

1999;17:2366-76.

[371] Wang Z, et al. "Bordetella pertussis Isolates Circulating in China Where

Whole Cell Vaccines Have Been Used for 50 Years". Clin Infect Dis.

2015;61:1028-9.

[372] Mosiej E, et al. "Strain variation among Bordetella pertussis isolates

circulating in Poland after 50 years of whole-cell pertussis vaccine use". J Clin

Microbiol. 2011;49:1452-7.

[373] Komatsu E, et al. "Synergic effect of genotype changes in pertussis toxin and

pertactin on adaptation to an acellular pertussis vaccine in the murine intranasal

challenge model". Clin Vaccine Immunol. 2010;17:807-12.

[374] Carneiro MO, et al. "Pacific biosciences sequencing technology for

genotyping and variation discovery in human data". BMC Genomics. 2012;13:375.

[375] Chin CS, et al. "Nonhybrid, finished microbial genome assemblies from long-

read SMRT sequencing data". Nat Methods. 2013;10:563-9.

[376] Fichot EB, et al. "Microbial phylogenetic profiling with the Pacific

Biosciences sequencing platform". Microbiome. 2013;1:10.

[377] Clarke M, et al. "The relationship between Bordetella pertussis genotype and

clinical severity in Australian children with pertussis". J Infect. 2015.

[378] Belcher T, et al. "Bordetella pertussis evolution in the (functional) genomics

era". Pathog Dis. 2015.

[379] Sousa C, et al. "Modulation of gene expression through chromosomal

positioning in Escherichia coli". Microbiology. 1997;143(Pt 6):2071-8.

[380] Couturier E, et al. "Replication-associated gene dosage effects shape the

genomes of fast-growing bacteria but only for transcription and translation genes".

Mol Microbiol. 2006;59:1506-18.

[381] Mielcarek N, et al. "Attenuated Bordetella pertussis: new live vaccines for

intranasal immunisation". Vaccine. 2006;24 Suppl 2:S2-54-5.

[382] Mielcarek N, et al. "Live attenuated B. pertussis as a single-dose nasal

vaccine against whooping cough". PLoS Pathog. 2006;2:e65.

[383] Locht C, et al. "Live attenuated vaccines against pertussis". Expert Rev

Vaccines. 2014;13:1147-58.

[384] Skerry CM, et al. "A live, attenuated Bordetella pertussis vaccine provides

long-term protection against virulent challenge in a murine model". Clin Vaccine

Immunol. 2011;18:187-93.

[385] Thorstensson R, et al. "A phase I clinical study of a live attenuated Bordetella

pertussis vaccine--BPZE1; a single centre, double-blind, placebo-controlled, dose-

escalating study of BPZE1 given intranasally to healthy adult male volunteers".

PLoS One. 2014;9:e83449.

Appendix

Appendix 1: List of SNPs detected in SP13 B. pertussis isolates using Illumina whole-genome sequencing

Position in genome | Reference | SNP | Amino acid change | Locus name | Gene name | Strand | Position in gene | Tohama I | L524 | L490 | L482 | L475 | L462 | L1391 | L1361 | L1214 | L1042 | L1423 | L1663 | L1419 | L1216 | L1037 | L1382 | L1380 | L1376 | L1658 | L1397 | L1421 | L1770 | L1661 | L1493 | L1507 | L1756 | L1779 | L1780 | Product | Functional category | Subcellular localisation | bvg activated/repressed

10764 T C BP0013 rplJ + 252 T C C C C C C C C C C C C C C C C C C C C C C C C C C C 50S ribosomal protein L10 Ribosome constituents Cytoplasmic

32626 C T BP0029 - + 412 C T T C C C C C C C C C C C C C C C C C C C C C C C C C Putative enoyl-coa hydratase Miscellaneous Cytoplasmic

36857 A G Intergenic A G G G G G G G G G G G G G G G G G G G G G G G G G G G

37390 A G BP0033 glyQ + 378 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Glycyl-trna synthetase alpha chain Macromolecule synthesis/modification Cytoplasmic

42578 A G E->G BP0038 gloA + 296 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Lactoylglutathione lyase Central/intermediary metabolism Cytoplasmic

50044 C G Intergenic C G G G G G G G G G G G G G G G G G G G G G G G G G G G

52491 A G BP0051P A G G G G G G G G G G G G G G G G G G G G G G G G G G G

65397 C T A->T BP0066 - - 909 C C C C C C C C C C C C C C C C C C C C C C C T T C C C Oxidoreductase Miscellaneous Cytoplasmic

74629 G A A->V BP0076 ampD - 272 G G G G G G G G G G G A G G G G G G G G G G G G G G G G N-acetyl-anhydromuramyl-L-alanine amidase Macromolecule degradation

108923 T C BP0111 T C C C C C C C N C C C C C C C C C C C C C C C C C C C Putative membrane protein (pseudogene) Pseudogenes

136138 G A Intergenic G G G G G G A N N N A A N N N A A A A N N N N A A N N N

136140 T C Intergenic T T T T T T C N C N C C N N N C C C C N C N N C C N N N

147540 C T BP0146 C T T C C C C C C C C C C C C C C C C C C C C C C C C C Putative hydroxylase (Pseudogene) Pseudogenes

182366 G C G->A BP0182 - + 1034 G C C C C C C C N C C C C C C C C C C C C C C C C C C C Putative iron sulfur binding protein Miscellaneous Cytoplasmic Membrane

185405 G A BP0184 - + 1080 G A A A A A A A N A A A A A A A A A A A A A A A A A A A Putative exported protein Cell surface Outer Membrane

193157 C T BP0191 - + 60 C T T T T T T T T T T T T T T T T T T T T T T T T T T T Putative exported protein Cell surface Unknown

196307 T C V->A BP0194 + 194 T C C C C C C C N C C C C N C C C C C C C C C C C C C C Probable metal transporter Transport/binding proteins Cytoplasmic Membrane

214663 A G S->P BP0208 - - 1395 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Putative oxidoreductase Miscellaneous Cytoplasmic

215582 T C BP0210P T C C C N N N N N N C C C N N C C C C C C N C C C C C C

220937 G A BP0215 ppc + 870 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Phosphoenolpyruvate carboxylase Energy metabolism Cytoplasmic Membrane +

223961 G A V->I BP0216 sphB1 + 361 G G G G G G A A A A A A A A A A A A A A A A A A A A A A Autotransporter subtilisin-like protease Virulence-associated genes Unknown +

259371 T C V->A BP0250 - + 305 T T T T T T T T T T T T T T T T T T C C C T T T T T T T Tripartite tricarboxylate transporter family receptor Cell surface Periplasmic

285033 T C I->T BP0280 degQ + 35 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Protease Macromolecule degradation Periplasmic

289550 G A P->S BP0284 - - 786 G G G G G G G G G G G A G G G G G G G G G G G G G G G G Probable extracellular solute-binding protein, family 5 Transport/binding proteins Periplasmic

299559 C T BP0292 C T T T T T T T T T T T T T T T T T T T T T T T T T T T Conserved hypothetical protein (Pseudogene) Pseudogenes

364006 C T A->V BP0363 - + 461 C C C C C C C C N C C C C C C C C C C C C C C C C T C C Probable extracellular solute-binding protein Transport/binding proteins Cytoplasmic

372931 A G BP0371 gatB - 1314 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Glutamyl-trna(GLN) amidotransferase subunit B Macromolecule synthesis/modification Cytoplasmic

405322 G C P->A BP0405 - - 138 G C C C C C C C C C C C C C C C C C C C C C C C C C C C Putative membrane protein Cell surface Cytoplasmic Membrane

417496 A C D->A BP0416 - + 881 A C C C C C C C C C C C C C C C C C C C C C C C C C C C Conserved hypothetical protein Conserved hypothetical Cytoplasmic

430935 C T A->T BP0429 - - 831 C C C C C C C C C C C C C C C C C C C C C C C C C C C T Delta-1-pyrroline-5-carboxylate dehydrogenase precursor Amino acid biosynthesis Cytoplasmic

444764 T C BP0440 T C C C C C C C C C C C C C C C C C C C C C C C C C C C N-terminal region of isovaleryl-coa dehydrogenase (Pseudogene) Pseudogenes

446550 C T E->K BP0443 - - 30 C C C C C C C C C C C T C C C C C C C C C C C C C C C C Transposase for IS481 element Phage-related or transposon-related

461359 C T BP0455 - + 963 C C C C C C C C C C C C C C C C C C T C C C C C C C C C Putative membrane protein Cell surface Cytoplasmic Membrane

479535 G A BP0467 - - 1729 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Acetolactate synthase large subunit Amino acid biosynthesis Cytoplasmic

483048 C G BP0472 C C C C C C C C C C C C C G C C C C C C C C C C C C C C Conserved hypothetical protein Pseudogenes

486005 G A P->L BP0475 rne - 650 G A A A A A A A N A A A A A A A A A A A A A A A A A A A Ribonuclease E Macromolecule synthesis/modification Cytoplasmic

492443 C T E->K BP0481 - - 30 C C C T C C C C C C C C C C C C C C C C C C C C C C C C Transposase for IS481 element Phage-related or transposon-related

506100 G A Intergenic G G G G G A G G G G G G G G G G G G G G G G G G G G G G

511992 A G BP0499P/BP0500P A G G G G G G G G G G G G G G G G G G G G G G G G G G G

514171 G A Intergenic G A A A A A A A A A A A A A A A A A A A A A A A A A A A

514994 A G BP0501 - 800 A A A A A A A A A A A A A A A A A A A A A G A A A A A A C-terminal region of two component sensor kinase (partial) Regulation +

517207 G T BP0505 - - 253 G T T T T T T T T T T T T T T T T T T T T T T T T T T T Phage-related conserved hypothetical protein Phage-related or transposon-related Unknown

518837 T C BP0507 - - 310 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Hypothetical protein Unknown Cytoplasmic

525420 G C D->E BP0518 - - 211 G C C C C C C C N C C C C C C C C C C C C C C C C C C C Conserved hypothetical protein Conserved hypothetical Cytoplasmic

532009 C T BP0529P C T T C C C C C C C C C C C C C C C C C C C C C C C C C

560519 C T BP0553 - - 889 C T T T T T T T T T T T T T T T T T T T T T T T T T T T Conserved hypothetical protein Conserved hypothetical Cytoplasmic

560753 T C BP0553 - - 1123 T C C C C C C C N C C C C C C C C C C C C C C C C C C C Conserved hypothetical protein Conserved hypothetical Cytoplasmic

567403 G A BP0560 G G G G G G G G G G G G G G G G G G G G G G G G G G G A Conserved hypothetical protein (Pseudogene) Pseudogenes

Pre-epidemic EL5 EL4 EL3 EL2 EL1

601187 C T BP0595 - + 258 C C C C C T C C C C C C C C C C C C C C C C C C C C C C ABC transporter permease Transport/binding proteins

645417 C T BP0636 - + 978 C C C C C T C C N C C C C C C C C C C C C C C C C C C C Conserved hypothetical protein Miscellaneous Cytoplasmic

662740 T C BP0654 - - 361 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Tripartite tricarboxylate transporter family receptor Cell surface Periplasmic

667028 G A Q->STOP BP0658 - - 1188 G G G G A A A A A A A A A A A A A A A A A A A A A A A A Conserved hypothetical protein Miscellaneous Cytoplasmic

670446 C T BP0661 - - 808 C C C C T C C C C C C C C C C C C C C C C C C C C C C C Putative acetyl-coa synthetase Small molecule degradation Cytoplasmic -

694521 A G BP0678 prfA + 6 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Peptide chain release factor 1 Macromolecule synthesis/modification Cytoplasmic

700292 G T BP0684 G T T T T T T T N T T T T T T T T T T T T T T T T T T T Molybdopterin dehydrogenase (Pseudogene) Pseudogenes (phase-variable) -

710575 C T BP0694 rpoP - 262 C C C C C C C C C C C C C C C C C C C C C T C C C C C C Nitrogen regulatory IIA protein Regulation

713901 A C BP0700 - - 199 A C C C C C C C C C C C C C C C C C C C C C C C C C C C Probable hydrolase Miscellaneous Cytoplasmic

733144 C T V->I BP0721 - - 336 C T T T T T T T T T T T T T T T T T T T T T T T T T T T Probable aminotransferase Central/intermediary metabolism Cytoplasmic

835998 G A Intergenic G A A A A A A A A A A A A A A A A A A A A A A A A A A A

838886 G A BP0814 - - 586 G G G G G G A A A A A A A A A A A A A A A A A A A A A A Probable lysr-family transcriptional regulator Regulation Cytoplasmic

848023 C T BP0821 hyuB - 706 C T T T T T T T N T T T T T T T T T T T T T T T T T T T Hydantoin utilization protein B Amino acid biosynthesis Cytoplasmic

864189 G A BP0833 G A A A A A A A N A A A A A A A A A A A A A A A A A A A Conserved hypothetical protein (Pseudogene) Pseudogenes (phase-variable)

876478 T C BP0847 nuoG + 1158 T C C C C C C C C C C C C C C C C C C C C C C C C C C C NADH-ubiquinone oxidoreductase, 75 kda subunit Energy metabolism Cytoplasmic

876902 G A G->S BP0847 nuoG + 1582 G G G G G G G G G G G G G G G G G G G G G A G G G G G G NADH-ubiquinone oxidoreductase, 75 kda subunit Energy metabolism Cytoplasmic

883816 C T L->F BP0854 nuoN + 76 C T T T T T T T T T T T T T T T T T T T T T T T T T T T NADH-ubiquinone oxidoreductase, chain N Energy metabolism Cytoplasmic Membrane

896887 T C Intergenic T C C C C C C C C C C C C C C C C C C C C C C C C C C C

911103 C T Intergenic C C C C C C C C C C C C C C C C C T C C C C C C C C C C

921144 C T R->C BP0882 - + 1273 C C C C C C C C C C C C C C C C C C C C C C C C C C T C Hypothetical protein Conserved hypothetical

925252 T C BP0887 - + 738 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Conserved hypothetical protein Conserved hypothetical Periplasmic

926293 T C BP0888 - + 711 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Gntr-family transcriptional regulator Regulation Cytoplasmic

939561 T C F->L BP0904 pphA + 151 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Serine/threonine protein phosphatase 1 Regulation Unknown

946457 C T BP0909 dinP + 526 C C C C C C C C C C C C C C C C C C C C T C C C C C C C DNA-damage-inducible protein p Macromolecule synthesis/modification Cytoplasmic

996242 C T Intergenic C C C C C C C C C T C C C C C C C C C C C C C C C C C C

997017 G A G->E BP0958 cysM + 740 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Cysteine synthase B Amino acid biosynthesis Cytoplasmic

1013535 T C T->A BP0974 - - 522 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Putative membrane protein Cell surface Cytoplasmic Membrane

1021587 G T P->H BP0983 - - 851 G G G G G G G G G G G G G G G G G G G G G G G G G G T G Lysr-family transcriptional regulator Regulation Cytoplasmic

1026116 C T P->L BP0986 cusC + 479 C T T T T T T T T T T T T T T T T T T T T T T T T T T T Probable outer membrane lipoprotein Cell surface Outer Membrane

1026809 C T A->V BP0986 cusC + 1172 C C C C C C T C C C C C C C C C C C C C C C C C C C C C Probable outer membrane lipoprotein Cell surface Outer Membrane

1059382 C T W->STOP BP1014 pitA - 298 C T T T T T T T T T T T T T T T T T T T T T T T T T T T Probable phosphate transporter Transport/binding proteins Cytoplasmic Membrane

1063386 A G Intergenic A G G G G G G G G G G G G G G G G G G G G G G G G G G G

1077844 C T Intergenic C T T N T N T N T N T N T T N N N T T T T T T T T N T T

1080686 T C BP1037 cutE T C C C C C C C N C C C C C C C C C C C C C C C C C C C Putative apolipoprotein N-acyltransferase (Pseudogene) Pseudogenes +

1082960 G A BP1040 - - 289 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Phoh-like protein Miscellaneous Cytoplasmic

1117897 A G S->P BP1071 pstS - 513 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Phosphate-binding periplasmic protein precursor Central/intermediary metabolism Periplasmic -

1137841 C A T->K BP1090 - + 431 C A A A A A A A N A A A A A A A A A A A A A A A A A A A Selenoprotein O-like protein [Collimonas fungivorans Ter331] Conserved hypothetical Cytoplasmic

1148861 G A E->K BP1099 - + 415 G G G G G G G G G G G G G G G G G G G G G G G G G A G G Long-chain fatty-acid--coa ligase Fatty acid metabolism cytoplasm

1153367 G A P->L BP1103 - - 476 G G G G G G G G N G G G G G G A G G G G G G G G G G G G Probable short chain dehydrogenase Miscellaneous Cytoplasmic

1162706 C T BP1110 sphB3 + 2043 C C C C C C C C C C C C C C C C C C C C C C T T T T T T Serine protease Virulence-associated genes Unknown -

1234866 G C BP1168 - - 148 G C C C C C C C N C C C C C C C C C C C C C C C C C C C Lysr-family transcriptional regulator Regulation Cytoplasmic Membrane

1239143 C T BP1172 C C C C C C C C T T T C C C C C C C C C C C C C C C C C Putative membrane protein (pseudogene) Pseudogenes

1264340 T C Intergenic T C C C C C C C C C C C C C C C C C C C C C C C C C C C

1275983 T C N->D BP1211 - - 147 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Putative exported protein Cell surface Unknown

1288344 C A R->L BP1226 - C C C C C C C C N C C C C C C C C C A A A C C C C C C C Putative exported protein (pseudogene) Pseudogenes (phase-variable) -

1290405 G A BP1227 radA + 225 G A A A A A A A A A A A A A A A A A A A A A A A A A A A DNA repair protein Macromolecule synthesis/modification Unknown

1327960 C T G->D BP1260 glnE - 308 C T T T T T T T T T T T T T T T T T T T T T T T T T T T Glutamate-ammonia-ligase adenylyltransferase Central/intermediary metabolism Unknown

1331840 G A BP1261 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Conserved hypothetical protein (Pseudogene) Pseudogenes

Strain groups: Pre-epidemic, EL5, EL4, EL3, EL2, EL1

Columns: Position in genome, Reference, SNP, Amino acid change, Locus name, Gene name, Strand, Position in gene, Tohama I, L524, L490, L482, L475, L462, L1391, L1361, L1214, L1042, L1423, L1663, L1419, L1216, L1037, L1382, L1380, L1376, L1658, L1397, L1421, L1770, L1661, L1493, L1507, L1756, L1779, L1780, Product, Functional category, Subcellular localisation, bvg activated/repressed

1353438 C T BP1280 proC - 562 C T T T T T T T N T T T T T T T T T T T T T T T T T T T Pyrroline-5-carboxylate reductase Amino acid biosynthesis Cytoplasmic

1358703 G T D->E BP1285 livJ - 289 G G G G G G G G G G G G G G G G G G T T T G G G G G G G Leu/Ile/Val-binding protein Receptor family ligand binding 

1364326 C T BP1293 lldD - 160 C C C C C C C C C C C C C C C T C C C C C C C C C C C C Putative L-lactate dehydrogenase Energy metabolism Cytoplasmic

1381251 C G A->P BP1314 - - 264 C G G G G G G G N G G G G G G G G G G G G G G G G G G G Putative exported protein Cell surface Unknown

1400964 C T BP1329 - - 1324 C C C C C C C C C C C C C C C C C C T T T T T T T T T T Alpha-glucosidase Macromolecule synthesis/modification Cytoplasmic

1407552 A G S->P BP1334 - - 456 A A A A A A A A A A A A A A A G A A A A A A A A A A A A Conserved hypothetical protein Conserved hypothetical Cytoplasmic

1430754 A G BP1355 - - 310 A G G G G G G G N G G G G G G G G G G G G G G G G G G G Probable laci-family transcriptional regulator Regulation Cytoplasmic

1470281 C T BP1394 fliM - 904 C C C C C C C C T T T T T T T T T T T T T T T T T T T T Flagellar motor switch protein flim Cell processes Cytoplasmic Membrane -

1500981 C T A->V BP1426 - + 527 C C C T C C C C N C C C C C C C C C C C C C C C C C C C Putative membrane protein Cell surface Cytoplasmic Membrane

1518410 T C V->A BP1443 - + 116 T T T T T T T T T T T T T T T T T T T T T T T T T T C T Hypothetical protein Unknown

1519199 G T Intergenic G T T T T T T T T T T T T T T T T T T T T T T T T T T T

1527810 G A BP1453 carB + 636 G G G A G G G G G G G G G G G G G G G G G G G G G G G G Carbamoyl-phosphate synthase large chain Amino acid biosynthesis Unknown

1527995 C T A->V BP1453 carB + 821 C C C C C C C C C C C C C C C C C C T T T T T T T T T T Carbamoyl-phosphate synthase large chain Amino acid biosynthesis Unknown

1534343 A G V->A BP1458 - - 221 A G G A A A A A A A A A A A A A A A A A A A A A A A A A Probable permease component of ABC transporter Transport/binding proteins Cytoplasmic Membrane

1547488 A G N->S BP1471 - + 5 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Conserved hypothetical protein Conserved hypothetical Unknown

1565529 G A R->K BP1487 smoM + 527 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Putative periplasmic solute-binding protein Transport/binding proteins Unknown +

1578615 C T BP1500 - + 228 C C C C T C C C C C C C C C C C C C C C C C C C C C C C Putative iia component of sugar transport PTS system Transport/binding proteins Unknown

1613367 T G W->G BP1539 - + 604 T G G G G G G G G G G G G G G G G G G G G G G G G G G G Probable LysR-family transcriptional regulator Regulation Cytoplasmic

1615665 C T A->V BP1542 - + 425 C T T T T T T T T T T T T T T T T T T T T T T T T T T T Conserved hypothetical protein Conserved hypothetical Cytoplasmic

1619514 C T P->S BP1546 - + 367 C C T C C C C C N C C C C C C C C C C C C C C C C C C C Hypothetical protein Unknown Cytoplasmic

1623068 T C BP1547 - + 2515 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Conserved hypothetical protein Conserved hypothetical Cytoplasmic

1626880 G C Intergenic G C C C C C C C N C C C N N C C C C C C C C C C C C C C

1649623 T C N->D BP1570 dapA - 180 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Dihydrodipicolinate synthase Amino acid biosynthesis Cytoplasmic

1652706 T C BP1573 glnH + 333 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Glutamine-binding periplasmic protein precursor Transport/binding proteins Periplasmic +

1666929 A G BP1589 - - 1168 A A A A A A A A N A A A A A A A A A A A A G A A A A A A Conserved hypothetical protein Conserved hypothetical Cytoplasmic

1668734 T C W->R BP1591 - + 427 T T T T C T T T N T T T T T T T T T T T T T T T T T T T Methyl-accepting chemotaxis protein Cell processes Cytoplasmic Membrane

1670288 T G BP1592 - - 412 T G G G G G G G G G G G G G G G G G G G G G G G G G G G Conserved hypothetical protein Conserved hypothetical Cytoplasmic Membrane

1692984 C T BP1610 C C C C C C C C C C C C C T T T T T C C C C C C C C C C Putative autotransporter (Pseudogene) Pseudogenes

1711565 A G Y->C BP1627 vipC + 548 A A A A A A A A N A A A A A A A A A A A A G A A A A A A Capsular polysaccharide biosynthesis protein Cell surface Cytoplasmic -

1727091 T C BP1639 T C C N N C C C N C C N C N C N C C C N C C N C N C C C Conserved hypothetical protein (pseudogene) Pseudogenes

1732032 T C V->A BP1644 - + 206 T T T C T T T T T T T T T T T T T T T T T T T T T T T T Tripartite tricarboxylate transporter family receptor Cell surface Unknown

1740455 T C S->G BP1654 wcbQ - 81 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Putative capsular polysaccharide biosynthesis protein Cell surface Cytoplasmic Membrane

1748290 C G P->A BP1660 sphB2 + 829 C G G N N G G N N G N G N N N G G G G N N N N N N G N G Autotransporter Virulence-associated genes Unknown

1748294 A G D->G BP1660 sphB2 + 833 A G G N N G G N N G N G N N N G G G G N N N N N G G N G Autotransporter Virulence-associated genes Unknown

1750584 T C D->G BP1662 - - 74 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Putative DNA-binding protein Miscellaneous Unknown

1777467 T C BP1691 phaA + 2043 T C C C C C C C N C C C C C C C C C C C C C C C C C C C Putative ph adaptation potassium efflux protein Adaptation Cytoplasmic Membrane

1785460 C T A->V BP1701 ldcA + 920 C C C C C C C C C C C C C C C C C C T C C C C C C C C C Muramoyltetrapeptide carboxypeptidase Amino acid biosynthesis

1804767 T C BP1720 pcm + 1 T C C C C C C C N C C C C C C C C C C C C C C C C C C C Protein-L-isoaspartate O-methyltransferase Macromolecule synthesis/modification Cytoplasmic

1806314 C T T->M BP1722 - + 11 C T T T T T T T T T T T T T T T T T T T T T T T T T T T PolB-like 3'-5' exonuclease (in Achromobacter xylosoxidans) Conserved hypothetical Cytoplasmic

1824667 C T BP1740 cphA - 1057 C C C C C C C C C C C C C T T T T C C C C C C C C C C C Cyanophycin synthetase Central/intermediary metabolism Cytoplasmic

1827556 G A BP1741 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Putative ABC transporter (Pseudogene) Pseudogenes

1854401 C T A->T BP1762 - - 789 C T T T T T T T T T T T T T T T T T T T T T T T T T T T Putative adenine-specific methylase Miscellaneous Cytoplasmic

1861522 A G M->V BP1771 - + 196 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Conserved hypothetical protein Conserved hypothetical Unknown -

1877985 C T BP1787 C C C C C C C C C C C C C T T C C C C C C C C C C C C C Putative acyl-coa dehydrogenase (Pseudogene) Pseudogenes (phase-variable)

1878644 C T BP1787 C C C C C C C C C C C C C C C T C C C C C C C C C C C C Putative acyl-coa dehydrogenase (Pseudogene) Pseudogenes (phase-variable)

1880238 G T BP1790 G T T T T T T T T T T T T T T T T T T T T T T T T T T T Putative exported protein (Pseudogene) Pseudogenes (phase-variable) +

1885417 G T A->E BP1795 tyrB - 497 G T T T T T T T T T T T T T T T T T T T T T T T T T T T Aromatic-amino-acid aminotransferase Amino acid biosynthesis Cytoplasmic

1898882 T C BP1810P T C C C C C C C C C C C C C C C C C C C C C C C C C C C


1923638 C T BP1832 C C C C C C C C C C C T C C C C C C C C C C C C C C C C Putative general secretion pathway protein Pseudogenes

1931433 A G Intergenic A G G G G G G G G G G G G G G G G G G G G G G G G G G G

1932067 A G D->G BP1840 rimM + 179 A G G G G G G G G G G G G G G G G G G G G G G G G G G G 16S rRNA processing protein Macromolecule synthesis/modification Cytoplasmic

1954336 A G BP1863 - - 205 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Putative membrane protein Cell surface Cytoplasmic Membrane

1965604 T C K->E BP1877 bvgS - 1605 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Virulence sensor protein Virulence-associated genes Cytoplasmic Membrane +

1984103 T C F->S BP1883 fimD + 356 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Fimbrial adhesin Virulence-associated genes Extracellular +

1990399 C A BP1888 C C C A C C C C C C C C C C C C C C C C C C C C C C C C Putative membrane protein (Pseudogene) Pseudogenes -

2010367 T C Intergenic T T T T T T T T T T T C T T T T T T T T T T T T T T T T

2025071 T A H->Q BP1922 - + 753 T A A A A A A A A A A A A A A A A A A A A A A A A A A A Putative exported protein Cell surface Unknown

2080049 T C BP1973 - + 108 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Putative membrane protein Cell surface Cytoplasmic Membrane

2090647 G A Intergenic G A A G G G G G N G G G G G G G G G G G G G G G G G G G

2098621 C T A->V BP1986 - + 1664 C C C C T C C C N C C C C C C C C C C C C C C C C C C C Putative ABC transporter, ATP-binding protein Transport/binding proteins Cytoplasmic Membrane

2102498 A G BP1989 - - 1405 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Putative exported protein Cell surface Outer Membrane

2122031 A G Y->H BP2011 - - 270 A A A A A A A A N A A A A A A G A A A A A A A A A A A A Conserved hypothetical protein Conserved hypothetical Cytoplasmic

2141172 T A L->Q BP2028 - + 47 T A A A A A A A A A A A A A A A A A A A A A A A A A A A Putative exported protein Cell surface Periplasmic

2143606 G A BP2030 - - 652 G G G G G G G G G G G G G G G G G G A A A A A A A A A A Putative lysr-family transcriptional activator Regulation Cytoplasmic

2182840 G A A->T BP2062 - + 1246 G A A G G G G G G G G G G G G G G G G G G G G G G G G G Putative malonyl-coa decarboxylase Fatty acid metabolism Unknown

2185065 T C BP2064 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Putative malonyl-coa synthetase (pseudogene) Pseudogenes

2191410 C A BP2072 - + 456 C C C C C C A A C C C C C C C C C C C C C C C C C C C C Putative lipoprotein Cell surface Unknown

2194756 G A BP2075 - + 1293 G N A N A A A A N A A N A N A A A A A A A A A A A A A A Putative efflux system inner membrane protein Transport/binding proteins Cytoplasmic Membrane

2198097 A G N->S BP2076 - + 3221 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Putative efflux system transmembrane protein Transport/binding proteins Cytoplasmic Membrane

2213234 G A Intergenic G G G G G G G G G G G G G G A G G G G G G G G G G G G G

2221410 A G L->P BP2099 - - 779 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Putative thiolase Miscellaneous Unknown

2244138 T C Y->H BP2120 gor + 988 T T T T T T T T T T T T T T T T T T T T T T C C C C C C Glutathione reductase Central/intermediary metabolism Cytoplasmic

2244171 G A A->T BP2120 gor + 1021 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Glutathione reductase Central/intermediary metabolism Cytoplasmic

2258898 C T E->K BP2135 - - 30 C N T T T T T T T T T T T N T T T T T T T T T T T T T T Transposase for IS481 element Phage-related or transposon-related

2264505 G A A->V BP2140 - - 449 G A G G G G G G G G G G G G G G G G G G G G G G G G G G Putative membrane protein Cell surface Cytoplasmic Membrane

2265313 T C V->A BP2141 - + 158 T C C C C C C C N C C C C N C C C C C C C C C C C C C C Putative exported protein Cell surface Unknown

2321060 C T G->S BP2199 ispG - 372 C C C C C C C C C C C C C C C C C C C C C C C C C C C T 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase Central/intermediary metabolism Cytoplasmic

2325787 C T A->V BP2203 valS + 1370 C C C C T C C C C C C C C C C C C C C C C C C C C C C C Valyl-tRNA synthetase Miscellaneous

2350480 A C V->G BP2224 bapA - 2543 A A A A A A A A A A A C A A A A A A A A A A A A A A A A Putative autotransporter Virulence-associated genes Outer Membrane

2353874 G A BP2228 alr + 111 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Alanine racemase, catabolic Cell surface Cytoplasmic +

2356411 T C N->D BP2229 - - 1488 T C C C C C C C N C C C C C C C C C C C C C C C C C C C Putative inner membrane transport protein Transport/binding proteins Cytoplasmic Membrane +

2356417 T C K->E BP2229 - - 1494 T C C C C C C C N C C C C C C C C C C C C C C C C C C C Putative inner membrane transport protein Transport/binding proteins Cytoplasmic Membrane +

2359892 A G H->R BP2231 - + 425 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Putative exported protein Cell surface Unknown +

2374322 T C Y->C BP2249 bscI - 68 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Putative type III secretion protein Virulence-associated genes Unknown +

2377238 G T BP2253 bopD - 337 G G G G G G G G N G G G G G G G G G G G G T G G G G G G Putative outer protein D Virulence-associated genes Unknown +

2390341 T A F->I BP2268 - + 1465 T A A A A A A A N A A A A A A A A A A A A A A A A A A A Methyl-accepting chemotaxis protein Cell processes Cytoplasmic Membrane

2392797 G A G->S BP2271 - + 484 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Putative regulatory lipoprotein Cell surface Unknown

2393637 G C T->S BP2272 - - 5 G C C C C C C C C C C C C C C C C C C C C C C C C C C C Transposase for IS481 element Phage-related or transposon-related

2398173 C A R->L BP2276 - - 1163 C A A A A A A A N A A A A A A A A A A A A A A A A A A A Putative ABC transport permease Transport/binding proteins Cytoplasmic Membrane

2433015 G A L->F BP2307 - - 1146 G G G G G G G G G G G G G G G G G G G G G A G G G G G G Conserved ATP-binding protein Miscellaneous Unknown

2440188 G A BP2315 vag8 - 1408 G G G A G G G G G G G G G G G G G G G G G G G G G G G G Autotransporter Virulence-associated genes Outer Membrane +

2440728 G A BP2315 vag8 - 1948 G G G G G G G G G G G G G G G G G G A G G G G G G G G G Autotransporter Virulence-associated genes Outer Membrane +

2471200 C A G->C BP2334 - - 1395 C A A A A A A A A A A A A A A A A A A A A A A A A A A A Putative ATP-dependent helicase Macromolecule synthesis/modification Cytoplasmic

2488085 T C BP2348 - - 1027 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Putative polyamine transport protein Transport/binding proteins Periplasmic

2492505 A G BP2352 - + 405 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Putative periplasmic substrate-binding transport protein Transport/binding proteins Periplasmic -

2497937 C A C->F BP2358 gltA - 692 C A A A A A A A A A A A A A A A A A A A A A A A A A A A Citrate synthase Energy metabolism Cytoplasmic


2508400 C T BP2369 acnA + 1497 C C C T C C C C C C C C C C C C C C C C C C C C C C C C Aconitate hydratase Energy metabolism -

2514582 G A Intergenic G G G G G G A A G G G G G G G G G G G G G G G G G G G G

2558163 C A V->L BP2416 - - 579 C A A A A A A A A A A A A A A A A A A A A A A A A A A A Putative lysr-family transcriptional regulator Regulation Cytoplasmic

2570056 C G BP2427 C C C C C C C C G G G G G C C C C C C C C C C C C C C C

2570058 C A BP2427 C C C C C C C C A A A A A C C C C C C C C C C C C C C C Threonine synthase (pseudogene) Pseudogenes

2581557 T C BP2439 fabF - 313 T C C C C C C C C C C C C C C C C C C C C C C C C C C C 3-oxoacyl-[acyl-carrier-protein] synthase II Fatty acid metabolism Cytoplasmic Membrane

2601959 C T BP2459 alcD + 330 C C C C C C C C C C C C C T C C C C C C C C C C C C C C Hypothetical protein Adaptation Unknown

2628946 T C I->T BP2482 kdpC + 245 T C C C C C C C N C C C C N C C C C C C C C C C C C C C Potassium-transporting atpase C chain Transport/binding proteins Cytoplasmic Membrane

2632911 C A Q->H BP2485 - - 85 C N A A A A A A A A A A A A N A A A A A A A A A A A A A Transposase for IS481 element Phage-related or transposon-related

2651008 G A T->I BP2502 - - 50 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Hypothetical protein Unknown Unknown

2657330 A C T->P BP2509 dapB + 13 A A A A A A A A A A A A A C C C C C A A A A A A A A A A Dihydrodipicolinate reductase Amino acid biosynthesis Cytoplasmic

2662594 T G I->L BP2513 - - 801 T G G G G G G G G G G G G G G G G G G G G G G G G G G G Tripartite tricarboxylate transporter family receptor Cell surface Periplasmic

2681216 G A BP2528A G G G G G G G G N A A A A G G G G G G G G G G G G G G G Putative exported protein (pseudogene) Pseudogenes

2731525 G T Intergenic G T T T T T T T T T T T T T T T T T T T T T T T T T T T

2736088 G A BP2585 - - 52 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Deda-family integral membrane protein Cell surface Cytoplasmic Membrane

2790710 G C P->A BP2633 - - 1026 G C C C C C C C N C C C C C C C C C C C C C C C C C C C Possible exonuclease Macromolecule synthesis/modification Cytoplasmic

2839717 C T Intergenic C C C T C C C C C C C C C C C C C C C C C C C C C C C C

2849677 A G Intergenic A G G G G G G G G G G G G G G G G G G G G G G G G G G G

2854457 T C BP2689A T C C C C C C C N C C C C C C C C C C C C C C C C C C C Transposase fragment for IS1663 Pseudogenes

2879950 G A BP2713 - + 246 G A A A A A A A N A A A A A A A A A A A A A A A A A A A Putative hydrolase Miscellaneous Unknown

2881290 C T A->V BP2715 - + 14 C T T T T T T T T T T T T T T T T T T T T T T T T T T T Ahpc/TSA-family protein Miscellaneous Unknown

2883883 G C A->G BP2717 - - 551 G G G G G G G G N G G G G G G G G G G C G G G G G G G G Hypothetical protein Miscellaneous

2908426 C A Intergenic C C C C C C C C C C C C C C C C C C C C C C C C C C A C

2921561 T G S->R BP2751 - - 210 T G G G G G G G G G G G G G G G G G G G G G G G G G G G Putative membrane protein Cell surface Unknown +

2939308 G A A->V BP2762 xseA - 140 G G G G G G G G N G G G G G G G G G G G G G A G G G G G Exodeoxyribonuclease large subunit Macromolecule degradation Cytoplasmic

2963939 C T R->C BP2788 - + 1420 C T T T T T T T T T T T T T T T T T T T T T T T T T T T Hypothetical protein Miscellaneous Cytoplasmic

2973877 A C BP2799 - - 523 A C C C C C C C N C C C C C C C C C C C C C C C C C C C Probable geranyltranstransferase Central/intermediary metabolism Cytoplasmic

2998921 C T A->V BP2820 - + 1022 C T T T T T T T T T T T T T T T T T T T T T T T T T T T Putative alcohol dehydrogenase Central/intermediary metabolism Cytoplasmic

3008968 A G BP2833 A G G G G G G G G G G G G G G G G G G G G G G G G G G G

3010783 C A BP2834 C C C C C C C C C C C C C C A C C C C C C C C C C C C C Transposase for IS481 element Pseudogenes

3015966 G A R->H BP2839 - + 1016 G G G G G G G G G G G G G G G G G G G G G A G G G G G G Exported protein, conserved Cell surface Unknown

3023386 G A BP2846 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Putative exported protein (pseudogene) Pseudogenes -

3027623 G T Intergenic G G G G T G G G G G G G G G G G G G G G G G G G G G G G

3027750 T C BP2851 - + 66 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Outer membrane porin protein precursor Transport/binding proteins Outer Membrane

3030524 T C BP2853 - - 526 T C C C C C C C N C C C C C C C C C C C C C C C C C C C Probable short chain dehydrogenase Central/intermediary metabolism Cytoplasmic

3034945 A G BP2858 tyrB - 34 A G G G G G G G N G G G G G G G G G G G G G G G G G G G Aromatic-amino-acid aminotransferase Amino acid biosynthesis Cytoplasmic

3039407 C G V->L BP2861 - - 12 C G G G G G G G N N G G G G G G G G G N G G G G G G G G Transposase for IS481 element Phage-related or transposon-related

3039408 T A BP2861 - - 13 T A A A A A A A N N A A A A A A N A A N A A A A A A A A Transposase for IS481 element Phage-related or transposon-related

3039409 G T T->K BP2861 - - 14 G T N T T T T N N N N T T T N N N T T N N T T T T N T N Transposase for IS481 element Phage-related or transposon-related

3060100 G A Intergenic G G G G G G G G N G G G G G G G G G A A A G G G G G G G

3060324 T C BP2883 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Putative NADH:flavin oxidoreductase (pseudogene) Pseudogenes

3062326 A G V->A BP2885 - - 938 A A A A A A A A A A A A A A A A A A A G A A A A A A A A Hypothetical protein Miscellaneous

3070796 T C Intergenic T C C C C C C C C C C C C C C C C C C C C C C C C C C C

3076011 C T BP2900 - - 70 C C C T C C C C C C C C C C C C C C C C C C C C C C C C Putative transcriptional regulator Regulation Cytoplasmic

3092624 C T R->C BP2907 fhaL + 6760 C C C C C C C C C C C C C C C C C C C C C T T T T T T T Adhesin Virulence-associated genes Outer Membrane +

3122367 T C T->A BP2933 cyoA - 897 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Putative ubiquinol oxidase polypeptide II Energy metabolism Cytoplasmic Membrane

3123720 C A A->S BP2935 - - 576 C A A A A A A A A A A A A A A A A A A A A A A A A A A A Putative two component system, histidine kinase Regulation Cytoplasmic Membrane +

3125619 G A Intergenic G A A A A A A A A A A A A A A A A A A A A A A A A A A A


3125620 A G Intergenic A G G G G G G G G G G G G G G G G G G G G G G G G G G G

3161770 C T Intergenic C T T T T T T T T T T T T T T T T T T T T T T T T T T T

3181172 C A BP2988 dltB - 109 C A A A A A A A A A A A A A A A A A A A A A A A A A A A Putative protein involved in the transfer of D-alanine into teichoic acids Cell surface Cytoplasmic Membrane

3193746 A G T->A BP3002 - + 976 A G G G G G G G N G G G G G G G G G G G G G G G G G G G Conserved hypothetical protein Conserved hypothetical Cytoplasmic

3239465 T C Intergenic T C C C C C C C C C C C C C C C C C C C C C C C C C C C

3241138 T G BP3040 bllY + 879 T G G G G G G G G G G G G G G G G G G G G G G G G G G G Putative hemolysin Virulence-associated genes Cytoplasmic

3260282 C T BP3059 cca - 655 C T T T T T T T T T T T T T T T T T T T T T T T T T T T tRNA nucleotidyltransferase Macromolecule synthesis/modification Cytoplasmic

3263622 A C BP3062P A C C C C C C C C C C C C C C C C C C C C C C C C C C C Transport/binding proteins Cytoplasmic membrane

3270608 C T D->N BP3068 acyH - 1002 C C C C C C C C C C C C C C C C T C C C C C C C C C C C Adenosylhomocysteinase Central/intermediary metabolism Cytoplasmic

3284826 C T BP3082 - - 568 C C C C C T C C N C C C C C C C C C C C C C C C C C C C Putative transport protein ATP-binding Cell surface Cytoplasmic Membrane

3292800 G A V->M BP3092 - + 376 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Putative phospholipase D protein Macromolecule synthesis/modification Cytoplasmic Membrane

3312605 G A BP3113 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Putative DNA helicase (Pseudogene) Pseudogenes

3313138 G A BP3113 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Putative DNA helicase (Pseudogene) Pseudogenes

3322457 A G V->A BP3117 - - 2489 A G G A A A A A A A A A A A A A A A A A A A A A A A A A Putative restriction endonuclease Macromolecule synthesis/modification Cytoplasmic

3326160 G A BP3121 - - 130 G G G G G G G G G G G G G A A A G G G G G G G G G G G G Conserved hypothetical protein Conserved hypothetical Cytoplasmic

3347400 G A Intergenic G G G G G G G G N A A A A G G G G G G G G G G G G G G G

3347952 C T BP3140 - - 514 C T T T T T T T N T T T T T T T T T T T T T T T T T T T Hypothetical protein Unknown Cytoplasmic

3370669 A G Intergenic A G G G G G G G N G G G G G G G G G G G G G G G G G G G

3388793 G A BP3177 - + 1056 G G G G G G A A G G G G G G G G G G G G G G G G G G G G Putative methylaspartate ammonia-lyase Energy metabolism Cytoplasmic

3401673 A C L->V BP3189 - - 450 A C C C C C C C C C C C C C C C C C C C C C C C C C C C Putative coenzyme A ligase Miscellaneous Cytoplasmic

3404606 A G BP3192 - - 226 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Putative ABC transporter permease protein Transport/binding proteins Cytoplasmic Membrane -

3410596 C G P->R BP3198 - + 578 C G G G G G G G N G G G G G G G G G G G G G G G G G G G Enoyl-coa hydratase/isomerase Miscellaneous Cytoplasmic

3421372 G A BP3208 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Conserved hypothetical protein (Pseudogene) Pseudogenes

3427748 C T Intergenic C T T T T T T T T T T T T T T T T T T T T T T T T T T T

3436869 A G BP3224 - - 925 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Putative cytochrome oxidase Energy metabolism Cytoplasmic Membrane

3436938 C T BP3224 - - 994 C C C C C C C C C C C C C T T T T T T T T T T T T T T T Putative cytochrome oxidase Energy metabolism Cytoplasmic Membrane

3449710 G A T->I BP3237 - - 605 G G G G G G G G G G G G G G G G G G G G G G G G G A G G Peptide ABC transporter substrate-binding protein Periplasmic

3485064 G A BP3267 apaG + 306 G G G G G G G G A A A A A G G G G G G G G G G G G G G G Conserved hypothetical protein Conserved hypothetical Unknown

3523618 C T G->R BP3303 - - 699 C C C C C C C C N T T T T T T T T T T T T T T T T T T T Conserved hypothetical protein Conserved hypothetical Cytoplasmic Membrane

3555632 C T BP3333 pykA + 984 C C C C C C C C T T T T T C C C C C C C C C C C C C C C Pyruvate kinase Energy metabolism Cytoplasmic

3587622 A G BP3371 - - 127 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Phage-related conserved hypothetical protein Conserved hypothetical Unknown +

3591185 T C BP3379 T N C C C C C C C C C C C C C C C C C C C C C C C C C C Phage-related conserved hypothetical protein (pseudogene) Unknown +

3596916 A G BP3385 - - 418 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Phage-related conserved hypothetical protein Conserved hypothetical Unknown +

3619407 T C Intergenic T C C C C C C C C C C C C C C C C C C C C C C C C C C C

3645025 T C Intergenic T C C C C C C C C C C C C C C C C C C C C C C C C C C C

3684691 T C H->R BP3472 - - 77 T T T T T T T T T T T T T T T T T C T T T T T T T T T T Hypothetical protein

3753863 T G BP3541 - - 622 T G G G G G G N N G G G G N G G G N G N G N G N G G N G Lysr family regulatory protein Regulation Cytoplasmic

3757929 A G T->A BP3546 - + 289 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Conserved hypothetical protein Conserved hypothetical Cytoplasmic

3758055 G A V->I BP3546 - + 415 G G G G G G G G G G G G G G G G G G G G G G A A A A A A Putative branched-chain amino acid transport system protein Transport/binding proteins Cytoplasmic Membrane

3783560 A C Y->S BP3570 - + 323 A A A A A A C C C C C C C C C C C C C C C C C C C C C C 30S ribosomal protein S8 Ribosome constituents Cytoplasmic

3840411 G A BP3630 rpsH + 150 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Preprotein translocase secy subunit Transport/binding proteins Cytoplasmic Membrane

3843034 A G T->A BP3636 secY + 256 A G G G G G G G G G G G G G G G G G G G G G G G G G G G DNA-directed RNA polymerase alpha chain Macromolecule synthesis/modification Cytoplasmic

3846833 G A BP3642 rpoA + 681 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Macromolecule synthesis/modification Cytoplasmic

3890915 A G BP3681 glcC A G G G G G G G G G G G G G G G G G G G G G G G G G G G

3927872 G A BP3718 - - 1054 G G G G G G G G N A A A A G G G G G G G G G G G G G G G Branched-chain amino acid transport ATP-binding protein Transport/binding proteins Cytoplasmic Membrane

3933893 G A BP3724 - + 777 G G G G G G G G G G G G G G G G G G G G G A A A A A A A Conserved hypothetical protein Conserved hypothetical Cytoplasmic

3938341 C T A->T BP3728 rkpK - 1146 C T T T T T T T T T T T T T T T T T T T T T T T T T T T Putative UDP-glucose 6-dehydrogenase Cell surface Unknown

3950021 A G BP3743 ctaD - 838 A G G G G G G G G G G G G G G G G G G G G G G G G G G G Cytochrome c oxidase polypeptide I Energy metabolism Cytoplasmic Membrane -


3957159 C T Intergenic C T T T T T T T T T T T T T T T T T T T T T T T T T T T

3988168 G A BP3681P G A A A A A A A A A A A A A A A A A A A A A A A A A A A

3988941 G A M->I BP3783 ptxA + 684 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Pertussis toxin subunit 1 precursor Virulence-associated genes Extracellular +

3989239 G A G->S BP3784 ptxB + 133 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Pertussis toxin subunit 2 precursor Virulence-associated genes Unknown +

3991376 C T BP3787 ptxC + 681 C T T T T T T T T T T T T T T T T T T T T T T T T T T T Pertussis toxin subunit 3 precursor Virulence-associated genes Unknown +

3998698 G A BP3795 - + 1092 G G G G G A G G G G G G G G G G G G G G G G G G G G G G Putative bacterial secretion system protein Transport/binding proteins Cytoplasmic Membrane +

4007734 T C BP3803 - + 624 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Putative transport system permease protein Transport/binding proteins Cytoplasmic Membrane

4009421 G C L->V BP3806 - - 12 G G G G G C G G G G G G G G G G G G G G G G G G G G G G Transposase for IS481 element Phage-related or transposon-related

4009422 A T N->K BP3806 - - 13 A A A A A T A A A A A A A A A A A A A A A A A A A A A A Transposase for IS481 element Phage-related or transposon-related

4009423 T G N->T BP3806 - - 14 T T T T T G T T T T T T T T T T T T T T T T T T T T T T Transposase for IS481 element Phage-related or transposon-related

4015848 G A E->K BP3811 - + 922 G A N A N N A N A A A A A A N N A N A A A A A A A A A N Transposase for IS481 element Phage-related or transposon-related

4044275 T C BP3835 T C C C C C C C C C C C C C C C C C C C C C C C C C C C Putative pyruvate ferredoxin/flavodoxin oxidoreductase (pseudogene) Pseudogenes (phase-variable)

4056201 T C F->L BP3845 - + 685 T T T T T T T T T T T T T T T T T C T T T T T T T T T T Nitroreductase family protein

4068047 C T G->S BP3857 - - 789 C T T T T T T T T T T T T T T T T T T T T T T T T T T T Putative hydrolase Miscellaneous Unknown

4068650 C T G->S BP3858 - - 135 C T T T T T T T T T T T T T T T T T T T T T T T T T T T Putative transport ATP-binding protein Transport/binding proteins Cytoplasmic Membrane

4071996 G A BP3861 - - 523 G A A A A A A A A A A A A A A A A A A A A A A A A A A A Putative transport system permease protein Transport/binding proteins Cytoplasmic Membrane

4080201 C T BP3867 - + 651 C C C C C C C C C C C T C C C C C C C C C C C C C C C C Tripartite tricarboxylate transporter family receptor Cell surface Periplasmic

Pre-epidemic EL5 EL4 EL3 EL2 EL1

Appendix 2: List of SNPs detected in major Australian clone

Position in genome | Reference | SNP | Amino acid change | Gene ID | Gene name | Strand | Position in gene | Tohama | L580 | L506 | L1204 | L706 | L1415 | L1191 | L1432 | L1376 | L1756 | Functional category | Product | bvg activated/repressed

Cluster/SP VII V IV UC III UC II I/16 I/13 I/13

10764 T C I->I BP0013 rplJ + 252 T C C C C C C C C C ribosome constituents 50S ribosomal protein L10

27312 T C K->E BP0023 - - 30 T T T C T T T T T T phage-related or transposon-related transposase

30665 C T S->S BP0027 - - 301 C C C C C C T C C C miscellaneous MaoC family protein

36857 A G Intergenic A A A A A A A G G G

37390 A G L->L BP0033 glyQ + 378 A G G G G G G G G G macromolecule synthesis/modification glycyl-tRNA synthetase alpha chain

42578 A G E->G BP0038 gloA + 296 A G G G G G G G G G central/intermediary metabolism lactoylglutathione lyase

44585 G A E->K BP0041 - + 922 G G G A G G G G G G phage-related or transposon-related transposase

50044 C G Intergenic C C C C C G G G G G

52491 A G BP0051P A G G G G G G G G G

57391 G T T->K BP0057 - - 887 G G G T G G G G G G transport/binding proteins amino-acid ABC transporter binding protein precursor

96864 G A A->T BP0099 - + 655 G G A G G G G G G G cell surface putative integral membrane protein

108923 T C R->G BP0111 - T C C C N C C C C C pseudogenes putative membrane protein (pseudogene)

115955 T C N->N BP0118 - + 1011 T C T T T T T T T T phage-related or transposon-related transposase for IS1663

136138 G A Intergenic G G G G G G G G A A

136140 T C Intergenic T T T T T T T T C C

148092 G C R->G BP0148 - - 156 G G G G C G G G G G cell surface putative exported protein

165925 G A Intergenic G G G G G A G G G G

182366 G C G->A BP0182 - + 1034 G C C C N C C C C C miscellaneous putative iron sulfur binding protein

185405 G A Q->Q BP0184 - + 1080 G G G G N G G A A A cell surface putative exported protein

189456 C T A->T BP0187 - - 351 C C T C C C C C C C miscellaneous putative lyase

193157 C T A->A BP0191 - + 60 C T T T N T T T T T cell surface putative exported protein

196307 T C V->A BP0194 - + 194 T T T T N T T C C C transport/binding proteins probable metal transporter

197248 A C Intergenic A A A A C A A A A A

211831 G A H->H BP0207 - - 106 G G A G G G G G G G miscellaneous putative aldehyde dehydrogenase

214663 A G S->P BP0208 - - 1395 A G G G G G G G G G miscellaneous putative oxidoreductase

215582 T C BP0210P T C C T N N T C C C

220937 G A L->L BP0215 ppc + 870 G G G G A A A A A A energy metabolism phosphoenolpyruvate carboxylase +

223961 G A V->I BP0216 sphB1 + 361 G G G G G G G G A A pathogenicity autotransporter subtilisin-like protease +

224066 G A A->T BP0216 sphB1 + 466 G G G G G G G A G G pathogenicity autotransporter subtilisin-like protease +

240862 G A R->C BP0234 - - 369 G G G A G G G G G G transport/binding proteins putative solute-binding periplasmic protein

245515 C T P->S BP0237 - + 295 C C C C T C C C C C cell surface putative exported protein

246987 A G D->D BP0239 pcnB - 304 A A G A A A A A A A macromolecule synthesis/modification poly(A) polymerase

285033 T C I->T BP0280 degQ + 35 T C C C C C C C C C macromolecule degradation protease

290978 G A T->T BP0285 - - 631 G A G G N G G G G G small molecule degradation putative acetyl esterase

294510 C T I->I BP0289 ilvD + 108 C C C C C T C C C C amino acid biosynthesis dihydroxy-acid dehydratase

298934 C T BP0292 - C T C C N C C C C C pseudogenes conserved hypothetical protein (Pseudogene)

299559 C T A->T BP0292 - C C C C C C C T T T pseudogenes conserved hypothetical protein (Pseudogene)

316175 G A E->E BP0311 - + 261 G G G G G A G G G G miscellaneous probable acid-coenzyme A ligase

364006 C T A->V BP0363 - + 461 C C C C N C C C C T transport/binding proteins probable extracellular solute-binding protein

365493 C T E->K BP0365 - - 30 C C T T C C C C C C phage-related or transposon-related transposase

372931 A G L->L BP0371 gatB - 1314 A G G G G G G G G G macromolecule synthesis/modification glutamyl-tRNA(GLN) amidotransferase subunit B

391046 C T E->K BP0392 - - 30 C C C C C T C C C C phage-related or transposon-related transposase

405322 G C P->A BP0405 - - 138 G C C C C C C C C C cell surface putative membrane protein

417496 A C D->A BP0416 - + 881 A C C C C C C C C C conserved hypothetical conserved hypothetical protein

440310 C T G->S BP0435 - - 999 C C C C T C C C C C miscellaneous putative dehydratase/racemase

444764 T C S->G BP0440 - T C C C C C C C C C pseudogenes N-terminal region of a probable acyl-CoA dehydrogenase (Pseudogene)

469234 C A A->A BP0460 topB + 24 C C C A C C C C C C macromolecule synthesis/modification DNA topoisomerase iii

470332 G A K->K BP0460 topB + 1122 G G G A G G G G G G macromolecule synthesis/modification DNA topoisomerase iii

479535 G A D->D BP0467 - - 1729 G A A A A A A A A A amino acid biosynthesis acetolactate synthase

486005 G A P->L BP0475 rne - 650 G N A A N A A A A A macromolecule synthesis/modification ribonuclease E

486528 T C T->A BP0475 rne - 1173 T C T T T T T T T T macromolecule synthesis/modification ribonuclease E

498602 C T S->S BP0487 - - 157 C C C C C T C C C C macromolecule degradation probable amidase

511992 A G BP0499P / BP0500P A A A A A G G G G G

512178 C T Intergenic C T C C C C C C C C

514171 G A Intergenic G G G G G A A A A A

517207 G T A->A BP0505 - - 253 G G G G G T T T T T phage-related or transposon-related phage-related protein

518837 T C E->E BP0507 - - 310 T T T T T T T C C C unknown hypothetical protein

522069 C T E->K BP0514 - - 30 C C C C C T C C C C phage-related or transposon-related transposase

525420 G C D->E BP0518 - - 211 G G G G N C C C C C conserved hypothetical conserved hypothetical protein

543199 C T S->L BP0534 - + 17 C T C C C C C C C C miscellaneous probable enoyl-CoA hydratase/isomerase +

560519 C T V->V BP0553 - - 889 C T T T T T T T T T conserved hypothetical conserved hypothetical protein

560753 T C G->G BP0553 - - 1123 T C C C C C C C C C conserved hypothetical

564430 T C T->A BP0558 - - 39 T T T T T T T C T T transport/binding proteins amino acid-binding periplasmic protein -

580238 T C S->S BP0573 trkA + 489 T C T T T T T T T T transport/binding proteins Trk system potassium uptake protein

634038 G A R->STOP BP0626 - G A G G G G G G G G miscellaneous probable 2-oxo acid dehydrogenases acyltransferase -

647204 C T T->I BP0640 - + 416 C T C C C C C C C C miscellaneous probable acyl-CoA dehydrogenase -

650814 G A A->V BP0643 - - 704 G A G G G G G G G G regulation LysR-family transcriptional regulator

654224 G A E->K BP0646 - + 922 G G G G G G G A G G phage-related or transposon-related transposase

661351 G A R->R BP0653 - - 343 G G A G G G G G G G miscellaneous probable acyl-CoA dehydrogenase

662740 T C S->S BP0654 - - 361 T C C C C C C C C C cell surface putative exported protein

664877 C T L->L BP0656 - - 181 C C C T N C C C C C conserved hypothetical conserved hypothetical protein

667028 G A Q->STOP BP0658 - - 1188 G G G G G G G G A A miscellaneous putative flavin-binding monooxygenase

674998 T C V->A BP0665 - + 8 T T T T T T C T T T central/intermediary metabolism thymidine diphosphoglucose 4,6-dehydratase

694521 A G K->K BP0678 prfA + 6 A A A A N A A G G G macromolecule synthesis/modification peptide chain release factor 1

700292 G T BP0684 + G G G T N T T T T T pseudogenes molybdopterin dehydrogenase (Pseudogene)

712815 C T A->T BP0698 - - 372 C C C C T C C C C C cell surface putative exported protein

713901 A C P->P BP0700 - - 199 A C C C C C C C C C miscellaneous probable hydrolase

733144 C T V->I BP0721 - - 336 C T T T T T T T T T central/intermediary metabolism probable aminotransferase

803112 G C S->C BP0780 - - 764 G G G G C G G G G G cell surface putative membrane protein

815265 C T G->G BP0792 pssA + 591 C C C C T C C C C C macromolecule synthesis/modification phosphatidylserine synthase

819382 A G D->G BP0796 - + 449 A A A A G A A A A A cell surface putative lipoprotein

819654 G T T->N BP0797 - - 5 G G G G T G G G G G phage-related or transposon-related transposase

825985 C G Intergenic C C C C C C C G C C

835998 G A Intergenic G A A A A A A A A A

838886 G A V->V BP0814 - - 586 G G G G G G G G A A regulation probable LysR-family transcriptional regulator

848023 C T V->V BP0821 hyuB - 706 C T T T N T T T T T amino acid biosynthesis hydantoin utilization protein B

864189 G A D->N BP0833 + G G A A N A A A A A pseudogenes conserved hypothetical protein (Pseudogene)

873082 T G I->S BP0844 nuoD + 938 T G T T T T T T T T energy metabolism respiratory-chain NADH dehydrogenase, 49 kDa subunit

876478 T C D->D BP0847 nuoG + 1158 T C C C C C C C C C energy metabolism NADH-ubiquinone oxidoreductase, 75 kDa subunit

883816 C T L->F BP0854 nuoN + 76 C C C C C T T T T T energy metabolism NADH-ubiquinone oxidoreductase, chain N

896887 T C Intergenic T C C C C C C C C C

911103 C T Intergenic C C C C C C C C T C

911937 C T A->A BP0876 - - 325 C C C C C C T C C C cell surface putative lipoprotein

925252 T C R->R BP0887 - + 738 T C C C C C C C C C conserved hypothetical conserved hypothetical protein

926293 T C I->I BP0888 - + 711 T C C C C C C C C C regulation GntR-family transcriptional regulator

939561 T C F->L BP0904 pphA + 151 T T C C N C C C C C regulation serine/threonine protein phosphatase 1

997017 G A G->E BP0958 cysM + 740 G A A A A A A A A A amino acid biosynthesis cysteine synthase B

1013535 T C T->A BP0974 - - 522 T C C C C C C C C C cell surface putative membrane protein

1023059 G A A->T BP0985 acrB + 598 G G G A G G G G G G protection responses acriflavine resistance protein B

1026116 C T P->L BP0986 cusC + 479 C T T T N T T T T T cell surface probable outer membrane lipoprotein

1059382 C T W->STOP BP1014 pitA - 298 C C C T T T T T T T transport/binding proteins probable phosphate transporter

1063386 A G Intergenic A G G G G G G G G G

1074219 A G N->S BP1031 cheR + A A G A A A A A A A pseudogenes chemotaxis protein methyltransferase (Pseudogene) -

1077844 C T Intergenic C C C C C C C T T T

1078477 C T Intergenic C T C C C C C C C C

1080079 G A BP1037 cutE G A G G G G G G G G pseudogenes putative apolipoprotein N-acyltransferase (Pseudogene) +

1080686 T C S->G BP1037 cutE - T C C C N C C C C C pseudogenes +

1082960 G A T->T BP1040 - - 289 G A A A A A A A A A miscellaneous PhoH-like protein

1098918 T C G->G BP1054 prn + 940 T T T T T T C T T T pathogenicity pertactin precursor +

1101501 G A L->L BP1055 cysG - 612 G G A G G G G G G G cofactor biosynthesis siroheme synthase +

1117897 A G S->P BP1071 pstS - 513 A G G G G G G G G G central/intermediary metabolism phosphate-binding periplasmic protein precursor -

1137841 C A T->K BP1090 - + 431 C C C C C A A A A A conserved hypothetical conserved hypothetical protein

1146523 G A A->A BP1097 livK + 12 G G G G G G A G G G transport/binding proteins putative amino acids binding protein

1148861 G A E->K BP1099 - + 415 G G G G G G G G G A miscellaneous putative long-chain fatty-acid--CoA ligase

1162706 C T A->A BP1110 sphB3 + 2043 C C C C C C C C C T pathogenicity serine protease -

1175956 C T R->K BP1119 fim2 - 104 C C T C C C C C C C pathogenicity serotype 2 fimbrial subunit precursor +

1178802 G A A->V BP1120 maeB - 2147 G G G G A G G G G G central/intermediary metabolism NADP-dependent malic enzyme

1179067 G A Intergenic G G G G G G A G G G

1193575 C T E->K BP1130 - - 30 C C T C C C C C C C phage-related or transposon-related transposase

1219546 G A G->E BP1154 - + 437 G G G A G G G G G G cell surface putative membrane protein

1231675 G A P->P BP1165 - + 903 G N A N G G N G G G transport/binding proteins sodium/solute symporter

1234866 G C P->P BP1168 - - 148 G N C N C C T C C C regulation LysR-family transcriptional regulator

1264340 T C Intergenic T C C C C C C C C C

1264962 G A G->D BP1201 tcfA + 527 G G G G N G A G G G pathogenicity tracheal colonization factor precursor +

1267400 G C L->L BP1203 - + 72 G C G G G G G G G G conserved hypothetical conserved hypothetical protein

1275983 T C N->D BP1211 - - 147 T C C C C C C C C C cell surface putative exported protein

1290405 G A G->G BP1227 radA + 225 G G G G G A A A A A macromolecule synthesis/modification DNA repair protein

1320894 C T S->S BP1254 polA + 1833 C C T C C C C C C C macromolecule synthesis/modification DNA polymerase I

1327960 C T G->D BP1260 glnE - 308 C T T T T T T T T T central/intermediary metabolism glutamate-ammonia-ligase adenylyltransferase

1331840 G A A->T BP1261 + G G G G N G G A A A pseudogenes conserved hypothetical protein (Pseudogene)

1336896 C T Y->Y BP1264 parE + 657 C T C C C C C C C C macromolecule synthesis/modification topoisomerase IV subunit B

1349361 G A P->L BP1276 livH - 527 G G G G A G G G G G transport/binding proteins high-affinity branched-chain amino acid transport system, permease protein

1353438 C T T->T BP1280 proC - 562 C T T T N T T T T T amino acid biosynthesis pyrroline-5-carboxylate reductase

1362787 G A S->N BP1291 - + 110 G G G G N G A G G G conserved hypothetical conserved hypothetical protein

1376000 C T Intergenic C T C C C C C C C C

1381251 C G A->P BP1314 - - 264 C G G G N G G G G G cell surface putative exported protein

1400964 C T T->T BP1329 - - 1324 C C C C C C C C C T macromolecule synthesis/modification alpha-glucosidase

1417770 C T A->A BP1345 - - 214 C C C C C C T C C C conserved hypothetical conserved hypothetical protein

1430661 G A V->V BP1355 - - 217 G A G G N G G G G G regulation probable LacI-family transcriptional regulator

1430754 A G T->T BP1355 - - 310 A G G G N G G G G G regulation

1448026 C T V->V BP1371 flgM - 100 C C C C C C T C C C regulation negative regulator of flagellin synthesis -

1470281 C T P->P BP1394 fliM - 904 C C C C C C C C T T cell processes flagellar motor switch protein FliM -

1471293 C T A->T BP1398 fliK - C T C C C C C C C C pseudogenes flagellar hook-length control protein (Pseudogene)

1489004 C T G->S BP1416 dsbB - 249 C C C T C C C C C C cell processes disulfide bond formation protein B

1496253 G A G->D BP1421 pyrH + 134 G G A G G G G G G G central/intermediary metabolism uridylate kinase

1503340 A C Y->S BP1427 - + 1499 A A A C A A A A A A cell surface probable surface antigen

1510867 C T T->M BP1436 ppsA + 38 C C C C C C C T C C central/intermediary metabolism phosphoenolpyruvate synthase

1512941 G A L->L BP1436 ppsA + 2112 G G G G A G G G G G central/intermediary metabolism

1519199 G T Intergenic G T T T T T T T T T

1527995 C T A->V BP1453 carB + 821 C C C C N C C C C T amino acid biosynthesis carbamoyl-phosphate synthase large chain

1529560 A G S->G BP1453 carB + 2386 A G A A A A A A A A amino acid biosynthesis

1535275 C A G->G BP1458 - - 1153 C C C C A C C C C C transport/binding proteins probable permease component of ABC transporter

1547488 A G N->S BP1471 - + 5 A A A A A A A G G G conserved hypothetical conserved hypothetical protein

1565468 A G T->A BP1487 smoM + 466 A A A G A A A A A A transport/binding proteins putative periplasmic solute-binding protein +

1565529 G A R->K BP1487 smoM + 527 G A A A A A A A A A transport/binding proteins +

1576898 G A A->T BP1498 infC + 109 G G G G G A G G G G macromolecule synthesis/modification translation initiation factor IF-3

1586150 G A G->S BP1509 - + 292 G G G G G G A G G G transport/binding proteins putative inner membrane component of binding-protein-dependent transport system

1594376 C T A->A BP1518 - + 399 C C T C N C C C C C conserved hypothetical conserved hypothetical protein

1613367 T G W->G BP1539 - + 604 T G G G G G G G G G regulation probable LysR-family transcriptional regulator

1615665 C T A->V BP1542 - + 425 C T T T T T T T T T conserved hypothetical conserved hypothetical protein

1623068 T C L->L BP1547 - + 2515 T C C C C C C C C C conserved hypothetical conserved hypothetical protein

1626880 G C Intergenic G C N C N C C C C C

1635721 G C Intergenic G G C G G G G G G G

1637246 C T A->T BP1559 - C T C C N C C C C C pseudogenes conserved hypothetical protein (Pseudogene)

1647688 C T T->T BP1568 fim3 + 87 C C C T C C C C C C pathogenicity serotype 3 fimbrial subunit precursor +

1647861 C A A->E BP1568 fim3 + 260 C C C C C C C A C C pathogenicity +

1649623 T C N->D BP1570 dapA - 180 T C C C C C C C C C amino acid biosynthesis dihydrodipicolinate synthase

1652706 T C Y->Y BP1573 glnH + 333 T C C C C C C C C C transport/binding proteins glutamine-binding periplasmic protein precursor +

1670288 T G G->G BP1592 - - 412 T G G G N G G G G G conserved hypothetical conserved hypothetical protein

1692984 C T T->N BP1610 + C C C C C C C C T C pseudogenes putative autotransporter (Pseudogene)

1702741 C T L->L BP1619 - - 316 C C C C N T C C C C unknown hypothetical protein -

1703727 C T A->T BP1619 - - 1302 C T C C C C C C C C unknown -

1722597 G A L->L BP1636 - - 744 G G G G G G A G G G miscellaneous putative nonspecific acid phosphatase precursor

1727091 T C T->A BP1639A - T N N N N N C N C C pseudogenes conserved hypothetical protein (pseudogene)

1735822 G A G->D BP1649 livG + 41 G G A G G G G G G G transport/binding proteins high-affinity branched-chain amino acid transport, ATP-binding protein

1740455 T C S->G BP1654 wcbQ - 81 T C C C C C C C C C cell surface putative capsular polysaccharide biosynthesis protein

1748290 C G P->A BP1660 sphB2 + 829 C N N G N N G N G G pathogenicity autotransporter

1748294 A G D->G BP1660 sphB2 + 833 A N N G N N G N G G pathogenicity

1750584 T C D->G BP1662 - - 74 T C C C C C C C C C miscellaneous putative DNA-binding protein

1751011 T C Intergenic T T T T T T C T T T

1753077 T C D->G BP1666 - - 239 T T T T C T T T T T regulation probable LysR-family transcriptional regulator

1777467 T C I->I BP1691 phaA + 2043 T C C C N C C C C C adaptation putative pH adaptation potassium efflux protein

1795894 C A Intergenic C C C N C N C A A A

1804767 T C L->L BP1720 pcm + 1 T C C C N C C C C C macromolecule synthesis/modification protein-L-isoaspartate O-methyltransferase

1806314 C T T->M BP1722 - + 11 C C T T T T T T T T conserved hypothetical conserved hypothetical protein

1807043 C A P->Q BP1722 - + 740 C C C C C A C C C C conserved hypothetical

1827556 G A BP1741 + G G G G G A A A A A pseudogenes putative ABC transporter (Pseudogene)

1849334 G A R->R BP1760 - - 616 G G G G G G A G G G cell surface putative exported protein

1854401 C T A->T BP1762 - - 789 C T T T T T T T T T miscellaneous putative adenine-specific methylase

1861522 A G M->V BP1771 - + 196 A G G G G G G G G G conserved hypothetical conserved hypothetical protein -

1863168 C T Intergenic C T C C C C C C C C

1870555 G A Intergenic G A G G G G G G G G

1880238 G T V->A BP1790 + G T T T T T T T T T pseudogenes putative exported protein (Pseudogene) +

1885417 G T A->E BP1795 tyrB - 497 G T T T T T T T T T amino acid biosynthesis aromatic-amino-acid aminotransferase

1898882 T C BP1810P T T C T N T T C C C

1909804 G A Intergenic G G G G G G A G G G

1910285 G T D->E BP1820 - - 478 G T G G G G G G G G regulation probable LysR-family transcriptional regulator

1931433 A G Intergenic A A A G G G G G G G

1932067 A G D->G BP1840 rimM + 179 A G G G G G G G G G macromolecule synthesis/modification 16S rRNA processing protein

1939398 G A A->A BP1848 holB + 438 G G G G G G G A G G macromolecule synthesis/modification DNA polymerase III, delta' subunit

1940472 G T R->L BP1849 - + 410 G T G G G G G G G G conserved hypothetical conserved hypothetical protein

1941864 A G Intergenic A G A A A A A A A A

1953740 C T R->H BP1862 - - 896 C C C C N T C C C C cell surface putative membrane protein

1954336 A G N->N BP1863 - - 205 A G G G G G G G G G cell surface putative membrane protein

1965604 T C K->E BP1877 bvgS - 1605 T C C C N C C C C C pathogenicity virulence sensor protein +

1971172 A G S->G BP1879 fhaB + 2392 A A A A A A G A A A pathogenicity filamentous hemagglutinin/adhesin +

1984103 T C F->S BP1883 fimD + 356 T C C C C C C C C C pathogenicity fimbrial adhesin +

2018809 G A Intergenic G A G G G G G G G G

2025071 T A H->Q BP1922 - + 753 T A A A A A A A A A cell surface putative exported protein

2053262 G A G->D BP1949 - + 1361 G A G G G G G N N N transport/binding proteins putative permease component of branched-chain amino acid transport system

2056120 G A BP1952 - G A G G N G G N N N pseudogenes putative cytochrome (Pseudogene)

2068598 G A G->G BP1962 bfrI - 733 G G G G G A G N N N pathogenicity putative ferrisiderophore receptor

2080049 T C G->G BP1973 - + 108 T C C C C C C C C C cell surface putative membrane protein

2092988 C T R->R BP1983 - + 708 C C T C C C C C C C transport/binding proteins putative extracellular solute-binding protein

2102498 A G I->I BP1989 - - 1405 A G G G G G G G G G cell surface putative exported protein

2104868 G A L->L BP1992 - + 444 G G G A G G G G G G cell surface putative membrane protein -

2105395 G A R->H BP1994 - + 41 G G G G A G G G G G cell surface putative exported protein

2123502 C T V->V BP2012 - - 613 C C C T C C C C C C conserved hypothetical conserved hypothetical protein

2125641 C T D->N BP2014 acnA - 495 C C C T C C C C C C energy metabolism putative aconitate hydratase

2126590 G A G->G BP2014 acnA - 1444 G G G G G G A G G G energy metabolism

2141172 T A L->Q BP2028 - + 47 T A A A N A A A A A cell surface putative exported protein

2143606 G A D->D BP2030 - - 652 G G G G N G G G G A regulation putative LysR-family transcriptional activator

2148119 C T Intergenic C C C C N C T C C C

2185065 T C V->A BP2064 + T T C C C C C C C C pseudogenes putative malonyl-CoA synthetase (pseudogene)

2185909 G A G->S BP2066 - + 466 G G G G A G G G G G cell surface putative exported protein

2188173 C T L->L BP2068 - + 931 C C C T C C C C C C cell surface putative exported protein -

2194756 G A E->E BP2075 - + 1293 G G G A N N A N A A transport/binding proteins putative efflux system inner membrane protein

2198097 A G N->S BP2076 - + 3221 A G G G G G G G G G transport/binding proteins putative efflux system transmembrane protein

2208379 G A Intergenic G G G A G G G G G G

2213448 C A Intergenic C C C C A C C C C C

2214506 G A D->N BP2091 - + 1006 G G G G G A G G G G small molecule degradation putative dioxygenase hydroxylase component -

2221410 A G L->P BP2099 - - 779 A G G G N G G G G G miscellaneous putative thiolase

2240988 A G T->A BP2117 + A G A A A A A A A A pseudogenes putative transcriptional regulator (pseudogene)

2244138 T C Y->H BP2120 gor + 988 T T T T T T T T T C central/intermediary metabolism putative glutathione reductase

2244171 G A A->T BP2120 gor + 1021 G A A A A A A A A A central/intermediary metabolism

2258534 C T T->I BP2134 - + 32 C C C C C T C C C C conserved hypothetical conserved hypothetical protein -

2258898 C T E->K BP2135 - - 30 C T T N T T T T T T phage-related or transposon-related transposase

2265313 T C V->A BP2141 - + 158 T C C C C C C C C C cell surface putative exported protein

2271462 C T L->L BP2149 - - 355 C C C C C T C C C C regulation putative araC-family transcriptional regulator -

2273462 A G Intergenic A A G A A A A A A A

2280702 C T P->S BP2158 - + 1504 C C C T N C C C C C regulation putative sigma-54-dependent transcriptional regulator

2318532 G A S->L BP2196 - - 992 G N G G N G G A G G miscellaneous putative quinoprotein

2325787 C T A->V BP2203 valS + 1370 C C C C C C C C C C macromolecule synthesis/modification valyl-tRNA synthetase

2353874 G A V->V BP2228 alr + 111 G A A A A A A A A A cell surface alanine racemase, catabolic +

2355443 G A A->A BP2229 - - 520 G A G G G G G G G G transport/binding proteins putative inner membrane transport protein +

2356411 T C N->D BP2229 - - 1488 T C C C C C C C C C transport/binding proteins +

2356417 T C K->E BP2229 - - 1494 T C C C C C C C C C transport/binding proteins +

2357290 C T A->T BP2229 - - 2367 C C C C T C C C C C transport/binding proteins +

2359892 A G H->R BP2231 - + 425 A G G G G G G G G G cell surface putative exported protein +

2363842 C T L->L BP2235 bscC - 127 C T C C N C C C C C pathogenicity putative type III secretion protein +

2374322 T C Y->C BP2249 bscI - 68 T T T T T T T C C C pathogenicity putative type III secretion protein +

2384626 C T S->S BP2262 bscD + 813 C C C T C C C C C C pathogenicity putative type III secretion protein +

2390341 T A F->I BP2268 - + 1465 T A A A N A A A A A cell processes methyl-accepting chemotaxis protein

2392797 G A G->S BP2271 - + 484 G G A A A A A A A A cell surface putative regulatory lipoprotein

2393637 G C T->S BP2272 - - 5 G C C C C C C C C C phage-related or transposon-related transposase

2398173 C A R->L BP2276 - - 1163 C A A A N A A A A A transport/binding proteins putative ABC transport permease

2433515 C T Intergenic C T C C C C C C C C

2459954 G A G->E BP2327 - + 209 G G G A N G G G G G cell surface putative outer membrane protein

2461126 C T P->S BP2327 - + 1381 C T C C C C C C C C cell surface

2471200 C A G->C BP2334 - - 1395 C A A A A A A A A A macromolecule synthesis/modification putative ATP-dependent helicase

2480916 C T Intergenic C T C C C C C C C C

2488085 T C Q->Q BP2348 - - 1027 T T T C C C C C C C transport/binding proteins putative polyamine transport protein

2492505 A G R->R BP2352 - + 405 A G G G G G G G G G transport/binding proteins putative periplasmic substrate-binding transport protein -

2497937 C A C->F BP2358 gltA - 692 C A A A A A A A A A energy metabolism citrate synthase

2505238 T C V->A BP2366 prpB + T T T T T T T C T T energy metabolism

2513786 G A A->T BP2373 - + 1045 G G A G G G G G G G conserved hypothetical conserved hypothetical protein

2552921 C T K->K BP2411 - - 226 C C C C C T C C C C conserved hypothetical conserved hypothetical protein

2558163 C A V->L BP2416 - - 579 C A A A A A A A A A regulation putative LysR-family transcriptional regulator

2581557 T C P->P BP2439 fabF - 313 T C C C C C C C C C fatty acid metabolism 3-oxoacyl-[acyl-carrier-protein] synthase II

2594947 A C Intergenic A A C A A A A A A A

2605624 T C H->H BP2462 bcr + 822 T T T T C T T T T T transport/binding proteins putative drug resistance translocase +

2628946 T C I->T BP2482 kdpC + 245 T C C C N C C C C C transport/binding proteins potassium-transporting ATPase C chain

2632911 C A Q->H BP2485 - - 85 C A A N N A A A A A phage-related or transposon-related transposase

2636646 C T V->M BP2488 icd - 201 C C C C C C C T C C energy metabolism isocitrate dehydrogenase [NADP]

2637133 G A I->I BP2488 icd - 688 G G G G G G A G G G energy metabolism

2651008 G A T->I BP2502 - - 50 G G G G G G G A A A unknown hypothetical protein

2657330 A C T->P BP2509 dapB + 13 A A A A A A A A C A amino acid biosynthesis dihydrodipicolinate reductase

2662594 T G I->L BP2513 - - 801 T G G G G G G G G G cell surface putative exported protein

2673941 G A I->I BP2523 - - 868 G G G G G A G G G G cell surface putative exported protein

2689908 C T V->I BP2538 - - 462 C C C T C C C C C C cell surface integral membrane protein

2706518 A G V->A BP2552 - - 833 A A A G A A A A A A transport/binding proteins ABC transport protein, solute-binding protein

2710464 G A P->S BP2557 - - 333 G G G A G G G G G G miscellaneous conserved hypothetical protein

2731525 G T Intergenic G T T T T T T T T T

2736088 G A V->V BP2585 - - 52 G G A A A A A A A A cell surface DedA-family integral membrane protein

2764889 C T A->V BP2611 + C C T C N C C C C C pseudogenes probable transcriptional regulator (pseudogene)

2790710 G C P->A BP2633 - - 1026 G C C C N C C C C C macromolecule synthesis/modification possible exonuclease

2791416 C T R->H BP2634 - C T C C C C C C C C pseudogenes D-amino acid dehydrogenase small subunit (pseudogene)

2794313 C T C->C BP2637 - + 660 C C T C C C C C C C cell surface putative lipoprotein

2812546 C T G->D BP2654 - - 635 C C T C N C C C C C amino acid biosynthesis probable dihydrodipicolinate synthase

2817932 C A P->P BP2662 - - 130 C A C C C C C C C C miscellaneous putative aldolase -

2826237 G A H->Y BP2667 fhaS - 4179 G G G G G G G A G G pathogenicity adhesin +

2849677 A G Intergenic A G G G G G G G G G

2854457 T C BP2689A + T C C C N C C C C C pseudogenes transposase fragment for IS1663 (pseudogene)

2859827 T C V->A BP2694 - + 197 T T C T T T T T T T regulation LysR-family transcriptional regulator

2879950 G A L->L BP2713 - + 246 G A N A N N A A A A miscellaneous putative hydrolase

2881031 G A Intergenic G G G A G G G G G G

2881290 C T A->V BP2715 - + 14 C T T T T T T T T T miscellaneous AhpC/TSA-family protein

2891938 A G K->E BP2724 - + 922 A G G G G N G N G G phage-related or transposon-related transposase

2911372 C T R->H BP2744 - - 83 C C C C C C T C C C transport/binding proteins putative ABC transport protein, ATP-binding component

2916858 G A A->V BP2749 putA - 158 G A G G G G G G G G small molecule degradation bifunctional proline oxidoreductase/transcriptional repressor

2921561 T G S->R BP2751 - - 210 T G G G G G G G G G cell surface putative membrane protein +

2931736 G A R->R BP2755 - - 2833 G G G G G G A G G G cell surface putative exported protein

2937272 C T W->STOP BP2760 - - 269 C C C C N T C C C C transport/binding proteins putative chloride-channel protein

2963939 C T R->C BP2788 - + 1420 C T T T T T T T T T miscellaneous hypothetical protein

2973877 A C P->P BP2799 - - 523 A C C C C C C C C C central/intermediary metabolism probable geranyltranstransferase

2996653 C T Intergenic C C C C C T C C C C

2998921 C T A->V BP2820 - + 1022 C T T T T T T T T T central/intermediary metabolism putative alcohol dehydrogenase

3008968 A G S->G BP2833 + A G G G G G G G G G pseudogenes putative membrane protein (pseudogene)

3023386 G A A->V BP2846 - G A A A A A A A A A pseudogenes putative exported protein (pseudogene) -

3027750 T C A->A BP2851 - + 66 T C C C C C C C C C transport/binding proteins outer membrane porin protein precursor

3030524 T C A->A BP2853 - - 526 T C C C N C C C C C central/intermediary metabolism probable short chain dehydrogenase

3033520 G T I->I BP2856 - - 577 G G T G N G G G G G conserved hypothetical conserved hypothetical protein

3034945 A G A->A BP2858 tyrB - 34 A G N G N G G G G G amino acid biosynthesis aromatic-amino-acid aminotransferase

3039407 C G V->L BP2861 - - 12 C N G G N G G G G G phage-related or transposon-related transposase

3039408 T A T->T BP2861 - - 13 T N A A N A A A A A phage-related or transposon-related

3039409 G T T->K BP2861 - - 14 G N N N N N G N T N phage-related or transposon-related

3060324 T C K->E BP2883 T C C C N C C C C C pseudogenes putative NADH:flavin oxidoreductase (pseudogene)

3070796 T C Intergenic T C C C C C C C C C

3085804 G T Intergenic G G G G G T G G G G

3092624 C T R->C BP2907 fhaL + 6760 C C C C C C C C C T pathogenicity adhesin +

3122367 T C T->A BP2933 cyoA - 897 T C C C C C C C C C energy metabolism putative ubiquinol oxidase polypeptide II

3123720 C A A->S BP2935 - - 576 C C C A N A A A A A regulation putative two component system, histidine kinase +

3125619 G A Intergenic G A A A A A A A A A

3125620 A G Intergenic A G G G G G G G G G

3161770 C T Intergenic C T T T T T T T T T

3181172 C A A->A BP2988 dltB - 109 C A A A A A A A A A cell surface putative protein involved in the transfer of D-alanine into teichoic acids

3193746 A G T->A BP3002 - + 976 A G G G N G G G G G conserved hypothetical conserved hypothetical protein

3221186 C T D->N BP3027 murE - 720 C C C T C C C C C C cell surface Possible murein precursor biosynthesis bifunctional protein

3239465 T C Intergenic T T T C C C C C C C

3241138 T G L->L BP3040 bllY + 879 T G G G G G G G G G pathogenicity putative hemolysin

3243304 G A A->V BP3043 ilvD - 965 G G G G G A G G G G amino acid biosynthesis dihydroxy-acid dehydratase

3253933 G A W->STOP BP3053 - + 678 G G G G G G A G G G miscellaneous putative oxidoreductase

3260282 C T L->L BP3059 cca - 655 C C C T T T T T T T macromolecule synthesis/modification tRNA nucleotidyltransferase

3263622 A C BP 3062P A A A A A A A C C C

3269016 G A R->R BP3066 - - 658 G G G G G A G G G G amino acid biosynthesis putative methylenetetrahydrofolate reductase

3292800 G A V->M BP3092 - + 376 G A A A A A A A A A macromolecule synthesis/modification putative phospholipase D protein

3295857 A G I->T BP3095 modB - 29 A A A A A A G A A A transport/binding proteins molybdate-binding periplasmic protein precursor

3297373 G A R->W BP3096 - - 654 G G G A G G G G G G miscellaneous putative hydrolase

3312605 G A BP3113 - G A N A A A A A A A pseudogenes putative DNA helicase (Pseudogene)

3313138 G A R->C BP3113 - G A N A A A A A A A pseudogenes

3316475 G A L->F BP3115 - - 687 G A G G G G G G G G conserved hypothetical conserved hypothetical protein

3320775 C T V->M BP3117 - - 807 C C T C C C C C C C macromolecule synthesis/modification putative restriction endonuclease

3333242 C T A->T BP3128 - - 1104 C T C C C C C C C C conserved hypothetical conserved hypothetical protein

3344486 C T R->H BP3137 - - 602 C C C T N C C C C C regulation putative two-component system sensor protein

3347952 C T V->V BP3140 - - 514 C T T T N T T T T T unknown hypothetical protein

3352297 C T G->D BP3143 - - 1007 C C C C C T C C C C cell surface putative glycosyltransferase

3356562 C T V->V BP3146 - - 1876 C C C C T C C C C C amino acid biosynthesis probable asparagine synthase

3370669 A G Intergenic A G G G N G G G G G

3401673 A C L->V BP3189 - - 450 A C C C C C C C C C miscellaneous putative coenzyme A ligase

3404606 A G R->R BP3192 - - 226 A G G G G G G G G G transport/binding proteins putative ABC transporter permease protein -

3409165 C T Intergenic C C C C T C C C C C

3410596 C G P->R BP3198 - + 578 C G G G N G G G G G miscellaneous enoyl-CoA hydratase/isomerase family protein

3421372 G A BP3208 + G A A A A A A A A A pseudogenes conserved hypothetical protein (Pseudogene)

3423550 G A V->V BP3211 rnhA - 250 G G G A G G G G G G macromolecule degradation ribonuclease HI

3427748 C T Intergenic C T T T T T T T T T

3428952 T C N->N BP3216 - + 1011 T T C T T T T T T T phage-related or transposon-related transposase for IS1663

3436869 A G G->G BP3224 - - 925 A G G G G G G G G G energy metabolism putative cytochrome oxidase

3436938 C T P->P BP3224 - - 994 C C C C C C C C T T energy metabolism

3444600 G A P->L BP3232 - G G G G A G G G G G pseudogenes conserved hypothetical protein (Pseudogene)

3445127 G A A->V BP3233 - G G G G A G G G G G pseudogenes putative ABC transport ATP-binding subunit (pseudogene)

3449710 G A T->I BP3237 - - 605 G G G G G G G G G A transport/binding proteins putative binding-protein-dependent transport protein (periplasmic)

3523618 C T G->R BP3303 - - 699 C C C C N C C C T T conserved hypothetical conserved hypothetical protein

3524391 G A A->T BP3304 - + 520 G G G G G G A G G G miscellaneous putative enoyl-CoA hydratase

3540631 G A E->E BP3320 - + 69 G G A G G G G G G G conserved hypothetical conserved hypothetical protein

3552998 C T I->I BP3331 ksgA + 405 C C T C N C C C C C ribosome constituents dimethyladenosine transferase

3565492 G A T->I BP3342 - - 260 G G G G G A G G G G cell surface putative peptidoglycan-associated lipoprotein

3574080 T G Intergenic T T T G T T T T T T

3587622 A G I->I BP3371 - - 127 A G G G G G G G G G conserved hypothetical hypothetical protein +

3589126 G A C->C BP3375 - - 220 G G G G G A G G G G unknown hypothetical protein

3591185 T C L->P BP3379 - T T C T T C C C C C unknown hypothetical protein +

3596789 T G N->H BP3385 - - 291 T G T T T T T T T T conserved hypothetical conserved hypothetical protein +

3596916 A G F->F BP3385 - - 418 A G G G G G G G G G conserved hypothetical +

3610438 C T A->A BP3402 - - 931 C C T C C C C C C C cell surface putative exported protein +

3617316 A T T->T BP3408 + 828 A A A A A A T A A A phage-related or transposon-related transposase

3617317 G C V->L BP3408 + 939 G G G G G G C G G G phage-related or transposon-related

3619039 A G L->P BP3410 - - 887 A G A A A A A A A A cell surface putative inner membrane protein

3619407 T C Intergenic T C C C C C C C C C

3642513 G T A->E BP3434 - - 590 G T G G G G G G G G cell surface putative exported protein -

3645025 T C Intergenic T C C N C C C C C C

3660076 C T E->K BP3451 - - 30 C C C C T C C C C C phage-related or transposon-related transposase

3663218 G A R->R BP3453 - - 1291 G G G G G G A G G G miscellaneous putative thiamine pyrophosphate enzyme

3684691 T C H->R BP3472 - - 77 T T T T T T T T C T conserved hypothetical conserved hypothetical protein

3686708 G A S->L BP3474 - G G G A G G G G G G pseudogenes conserved hypothetical protein (pseudogene)

3700321 G A A->V BP3491 ndh - 314 G G A G G G G G G G energy metabolism putative NADH dehydrogenase

3715976 G A V->V BP3504 - + 1044 G G G A G G G G G G conserved hypothetical conserved hypothetical protein

3717828 A G D->G BP3505 - + 920 A A A G A A A A A A phage-related or transposon-related transposase

3726811 G T T->T BP3517 - + 555 G G G G N T G G G G cell surface putative membrane protein

3736133 C A Intergenic C A C C C C C C C C

3743082 G A E->E BP3531 tonB + 177 G G G G N A G G G G adaptation siderophore-mediated iron transport protein

3752406 C G G->A BP3539 argD - 851 C G C C C C C C C C amino acid biosynthesis putative acetylornithine aminotransferase

3753863 T G I->I BP3541 - - 622 T G N N N N G G N G regulation LysR family regulatory protein

3757929 A G T->A BP3546 - + 289 A G G G G G G G G G conserved hypothetical conserved hypothetical protein

3758055 G A V->I BP3546 - + 415 G G G G G G G G G A conserved hypothetical

3783560 A C Y->S BP3570 - + 323 A A A A A A A A C C transport/binding proteins putative branched-chain amino acid transport system protein

3790524 G A T->T BP3576 - - 253 G G G G N G A G G G transport/binding proteins ABC transporter ATP-binding protein

3807642 G A E->K BP3593 - + 922 G G G G G G A G G G phage-related or transposon-related transposase

3835466 C A T->T BP3623 - - 55 C C A C N C C C C C miscellaneous putative hydrolase

3840411 G A S->S BP3630 rpsH + 150 G G G G G G G A A A ribosome constituents 30S ribosomal protein S8

3843034 A G T->A BP3636 secY + 256 A G G G G G G G G G transport/binding proteins preprotein translocase SecY subunit

3846833 G A V->V BP3642 rpoA + 681 G G G A A A A A A A macromolecule synthesis/modification DNA-directed RNA polymerase alpha chain

3852058 G A A->V BP3648 hemB - 416 G G G A G G G G G G cofactor biosynthesis putative delta-aminolevulinic acid dehydratase

3859153 G A Intergenic G G G G N A G G G G

3870338 C T Intergenic C C T C C C C C C C

3874037 A G D->G BP3665 + A A A A A A G A A A pseudogenes putative aldolase (Pseudogene)

3890915 A G Q->R BP3681 glcC + A G G G G G G G G G pseudogenes GntR-family transcriptional regulator (pseudogene)

3933893 G A E->E BP3724 - + 777 G G G G G G G G G A conserved hypothetical conserved hypothetical protein

3938341 C T A->T BP3728 rkpK - 1146 C C T T T T T T T T cell surface putative UDP-glucose 6-dehydrogenase

3950021 A G F->F BP3743 ctaD - 838 A G G G G G G G G G energy metabolism cytochrome c oxidase polypeptide I -

3957159 C T Intergenic C T T T T T T T T T

3959407 C T P->P BP3750 - - 436 C T C C C C C C C C small molecule degradation putative esterase -

3988168 G A ptxP3 BP3681 P G G G G G G G A A A

3988941 G A M->I BP3783 ptxA + 684 G G A A A A A A A A pathogenicity pertussis toxin subunit 1 precursor +

3989239 G A G->S BP3784 ptxB + 133 G A A A A A A A A A pathogenicity pertussis toxin subunit 2 precursor +

3991376 C T C->C BP3787 ptxC + 681 C C C C C C C T T T pathogenicity pertussis toxin subunit 3 precursor +

4001555 G A V->M BP3798 - + 574 G G G G N A G G G G regulation AraC family regulatory protein

4007734 T C R->R BP3803 - + 624 T C C C C C C C C C transport/binding proteins putative transport system permease protein

4015848 G A E->K BP3811 - + 922 G N A A N N A N N A phage-related or transposon-related transposase

4015977 C T R->Q BP3812 - - 104 C C C T C C C C C C cell surface putative outer membrane efflux protein

4018757 C T G->D BP3813 - - 1394 C C C C C T C C C C cell surface AcrB/AcrD/AcrF family protein

4036428 C G V->L BP3829 - - 186 C C C C C G C C C C transport/binding proteins putative amino acid ABC transporter permease protein

4044275 T C BP3835 - T C C C C C C C C C pseudogenes putative pyruvate ferredoxin/flavodoxin oxidoreductase (pseudogene)

4056201 T C F->L BP3845 - + 685 T T T T T T T T C T miscellaneous nitroreductase family protein

4063054 C T G->G BP3852 katA + 1140 C C T C C C C C C C protection responses catalase +

4068047 C T G->S BP3857 - - 789 C C C C C T T T T T miscellaneous putative hydrolase

4068650 C T G->S BP3858 - - 135 C C C C C T T T T T transport/binding proteins putative transport ATP-binding protein

4071996 G A I->I BP3861 - - 523 G G G G G A A A A A transport/binding proteins putative transport system permease protein

Appendix 3: Genes affected by deletions of 300 bp or more

Locus-ID; Flanking; Size; Gene name; L580 (V, SP27); L506 (IV, SP30); L1204 (SP18); L706 (III, SP19); L1191 (II, SP37); L1415 (SP11); L1376 (I, SP13); L1756 (I, SP13); L1432 (I, SP16); Functional category; Protein

BP0031 34127-36075 1949 - a n n n a n n n n phage-related or transposon-related transposase

BP0049 49643-51591 1949 - n a n n n n n n n phage-related or transposon-related transposase

BP0058 56999-58947 1949 - n n n n n n n n phage-related or transposon-related transposase

BP0059 58051-59601 1551 - n n n n n a n n n pseudogenes regulatory protein (Pseudogene)

BP0071 68587-70535 1949 - n a n n n n n n n phage-related or transposon-related transposase

BP0080 77136-79084 1949 - a a n n n n n n n phage-related or transposon-related transposase

BP0137 134659-136607 1949 - a n n n n n n n n phage-related or transposon-related transposase

BP0166 164396-166344 1949 - a a n n n n n n a phage-related or transposon-related transposase

BP0175 173802-175750 1949 - n n n n n n n n a phage-related or transposon-related transposase

BP0192 193930-195878 1949 - a n n n n n n n n phage-related or transposon-related transposase

BP0202 205160-207108 1949 - a a a a a a a a a phage-related or transposon-related transposase

BP0202A 206105-207141 1037 - a a a a a a a n a pseudogenes transposase (pseudogene)

BP0203 206230-208178 1949 - a a a a a a a n a phage-related or transposon-related transposase

BP0210 215113-217061 1949 - a a a a a a a a a phage-related or transposon-related transposase

BP0211 216162-218110 1949 - a a n a n n n n a phage-related or transposon-related transposase

BP0228 234980-236928 1949 - a n n n n n n n a phage-related or transposon-related transposase

BP0256 265233-267181 1949 - n n n n n a n n a phage-related or transposon-related transposase

BP0281 285633-287581 1949 - n n n n n n n n a phage-related or transposon-related transposase

BP0294 299592-301150 1559 - n a n a n n n n n macromolecule synthesis/modification putative 5'(3')-deoxyribonucleotidase

BP0295 300212-302160 1949 - a a n a n n n n a phage-related or transposon-related transposase

BP0296 301354-302612 1259 - a n n a n n n n a unknown hypothetical protein

BP0297 301730-303678 1949 - a n n a a n n n a phage-related or transposon-related transposase

BP0327 330044-331992 1949 - n n n n n n a n n phage-related or transposon-related transposase

BP0355 354578-356526 1949 - n a n n n n n n n phage-related or transposon-related transposase

BP0392 390518-392466 1949 - n n n n n n n n a phage-related or transposon-related transposase

BP0401 399280-401255 1976 - n a n n n n n n n pseudogenes transposase (Pseudogene)

BP0424 425547-427495 1949 - n n n a a n n n a phage-related or transposon-related transposase

BP0439 442986-444934 1949 - n n a n n n n n a phage-related or transposon-related transposase

BP0443 446022-447970 1949 - n a n n n n n n a phage-related or transposon-related transposase

BP0473 482647-484595 1949 - n n n n n n n n a phage-related or transposon-related transposase

BP0481 491915-493863 1949 - n n n n n n n n a phage-related or transposon-related transposase

BP0496 508720-510668 1949 - a n n n n n n n a phage-related or transposon-related transposase

BP0514 521541-523489 1949 - a a n n n a n n n phage-related or transposon-related transposase

BP0515 522590-523830 1241 - n n n n n a n n n unknown hypothetical protein

BP0516 522999-524638 1640 - n n n n n a n n n cell surface putative exported protein

BP0517 523651-525599 1949 - a n n n n a n n a phage-related or transposon-related transposase

BP0537 544771-546719 1949 - a a n n n n n n a phage-related or transposon-related transposase

BP0540 548184-550132 1949 - n a n n n n n n a phage-related or transposon-related transposase

BP0565 570658-572606 1949 - a a n n a n n a a phage-related or transposon-related transposase

BP0611 615603-617551 1949 - n n n n n n n n a phage-related or transposon-related transposase

BP0645 651755-653703 1949 - a a a a a n a a a phage-related or transposon-related transposase

BP0646 652804-654752 1949 - a a n n a n a a a phage-related or transposon-related transposase

BP0676 690983-692931 1949 - n n a a n n n n a phage-related or transposon-related transposase

BP0677 692656-694928 2273 hemA n a n n n n n n n cofactor biosynthesis glutamyl-tRNA reductase

BP0678 694017-696097 2081 prfA n a n n n n n n n macromolecule synthesis/modification peptide chain release factor 1

BP0679 695110-696920 1811 hemK n a n n n n n n n cofactor biosynthesis heme biosynthesis protein

BP0680 695959-697283 1325 - n a n n n n n n n conserved hypothetical conserved hypothetical protein

BP0681 696290-697848 1559 - n a n n n n n n n miscellaneous probable flavoprotein

BP0682 697091-699081 1991 - n a n n n n n n n cell surface putative exported protein

BP0683 698102-700095 1994 - n a n n n n n n n small molecule degradation 4,5-dihydroxyphthalate decarboxylase

BP0684 699113-700981 1869 - n a n n n n n n n pseudogenes molybdopterin dehydrogenase (Pseudogene)

BP0684A 699980-701487 1508 - n a n n n n n n n energy metabolism probable 2Fe-2S ferredoxin

BP0685 700482-703861 3380 - n a n n n n n n n miscellaneous probable dehydrogenase/oxidase

BP0686 702904-704624 1721 - n a n n n n n n n pseudogenes conserved hypothetical protein (Pseudogene)

BP0688 703731-705679 1949 - a a n n n n n n a phage-related or transposon-related transposase

BP0704 718070-720018 1949 - n a n n n n n n n phage-related or transposon-related transposase

BP0729 741967-743915 1949 - n n a n n n n n n phage-related or transposon-related transposase

BP0733 746126-748074 1949 - n n n n a n n n a phage-related or transposon-related transposase

BP0739 752077-754025 1949 - n a n n n n n n a phage-related or transposon-related transposase

BP0739A 753124-754160 1037 - n a n n n n n n n pseudogenes transposase (pseudogene)

BP0740 753249-756442 3194 - n a n n n n n n n unknown hypothetical protein

BP0741 755517-756943 1427 - n a n n n n n n n regulation putative transcriptional regulator

BP0742 756047-757992 1946 - a a n n n n n n a pseudogenes transposase (Pseudogene)

BP0756 771022-774139 3118 - n a a n n n n n n pseudogenes putative exported protein (Pseudogene)

BP0756A 771288-773233 1946 - n a a n n n n n n pseudogenes transposase (Pseudogene)

BP0786 808588-810536 1949 - n n n n a n n n n phage-related or transposon-related transposase

BP0797 819151-821099 1949 - n n n n n n n n a phage-related or transposon-related transposase

BP0839 867464-869412 1949 - n n n n n n n a a phage-related or transposon-related transposase

BP0871 906749-908697 1949 - n a n n n n n n n phage-related or transposon-related transposase

BP0876 911114-912792 1679 - n n n a n n n n n cell surface putative lipoprotein

BP0877 912076-913790 1715 - n n n a n n n n n conserved hypothetical conserved hypothetical protein

BP0878 912807-915106 2300 - n n n a n n n n n small molecule degradation putative phospholipase

BP0891 928455-930403 1949 - a a n n n n a n n phage-related or transposon-related transposase

BP0897 932159-934107 1949 - a n n n n n a n a phage-related or transposon-related transposase

BP0910 946617-948565 1949 - a a a a a a a a a phage-related or transposon-related transposase

BP0910A 947565-948640 1076 - a a a a a a a a a pseudogenes N-terminal region of a putative decarboxylase (pseudogene)

BP0911 947655-949204 1550 - a a a a a a a a a miscellaneous putative decarboxylase

BP0912 948380-950377 1998 - a a a a a a a a a pseudogenes LysR-family transcriptional regulator (Pseudogene)

BP0913 949484-951456 1973 - a a a a a a a a a cell surface putative exported protein

BP0914 950478-952348 1871 - a a a a a a a a a transport/binding proteins probable inner membrane component of binding-protein-dependent transport system

BP0915 951347-953172 1826 - a a a a a a a a a transport/binding proteins probable inner membrane component of binding-protein-dependent transport system

BP0916 952198-953975 1778 - a a a a a a a a a pseudogenes putative ATP-binding protein of a transporter (Pseudogene)

BP0918 953077-955496 2420 - a a a a a a a a a conserved hypothetical conserved hypothetical protein

BP0919 954603-957055 2453 gabD a a a a a a a a a small molecule degradation putative succinate-semialdehyde dehydrogenase [NADP+]

BP0920 956107-958508 2402 - a a a a a a a a a cell surface putative exported protein

BP0921 957531-959713 2183 citB a a a a a a a a a small molecule degradation citrate utilization protein B

BP0922 958758-960931 2174 - a a a a a a a a a conserved hypothetical conserved hypothetical protein

BP0923 960104-961917 1814 - a a a a a a a a a conserved hypothetical conserved hypothetical protein

BP0924 960961-962855 1895 - a a a a a a a a a regulation putative transcriptional regulator

BP0925 961912-963770 1859 - a a a a a a a a a miscellaneous putative fumarylacetoacetate-family hydrolase

BP0926 962850-964921 2072 - a a a a a a a a a conserved hypothetical conserved hypothetical protein

BP0927 964049-966021 1973 - a a a a a a a a a cell surface putative exported protein

BP0928 965162-967149 1988 - a a a a a a a a a regulation LysR-type transcriptional regulator

BP0929 966176-967749 1574 - a a a a a a a a a cell surface putative membrane protein

BP0930 966845-969501 2657 - a a a a a a a a a miscellaneous putative CoA ligase

BP0931 968515-970496 1982 - a a a a a a a a a cell surface putative exported protein

BP0932 969503-970923 1421 - a a a a a a a a a conserved hypothetical conserved hypothetical protein

BP0933 969929-972115 2187 - a a a a a a a a a pseudogenes conserved hypothetical protein (Pseudogene)

BP0934 971172-972529 1358 - a a a a a a a a a unknown hypothetical protein

BP0938 974375-976323 1949 - a n n n n n n n a phage-related or transposon-related transposase

BP0978 1016298-1018342 2045 - n n n n a n n n a phage-related or transposon-related transposase for IS1663

BP1006 1052932-1054571 1640 - a n a n n n n n n miscellaneous probable glutathione S-transferase

BP1007 1053860-1055808 1949 - a n n n n n a n a pseudogenes transposase (Pseudogene)

BP1008 1053677-1056059 2383 - a n n n n n a n a pseudogenes hypothetical protein (Pseudogene)

BP1020 1063021-1064969 1949 - n n n n n n n n a phage-related or transposon-related transposase

BP1035 1076190-1078204 2015 - n n n n n n n n a phage-related or transposon-related transposase for IS1663

BP1044 1085557-1087571 2015 - n n n n n n n n a phage-related or transposon-related transposase for IS1663

BP1048 1090326-1092274 1949 - a a a a a a a a a phage-related or transposon-related transposase

BP1064 1108731-1112017 3287 maeB a n a a a n n a a central/intermediary metabolism NADP-dependent malic enzyme

BP1065 1111069-1113227 2159 - n n a n n n n n n pseudogenes conserved hypothetical protein (Pseudogene)

BP1066 1112226-1113688 1463 - n n a n n n n n n conserved hypothetical conserved hypothetical protein

BP1067 1112717-1114665 1949 - n n a n n n n n a phage-related or transposon-related transposase

BP1080 1128922-1130870 1949 - a a a n n n n n n phage-related or transposon-related transposase

BP1086 1133655-1135603 1949 - a n n n n a n n a phage-related or transposon-related transposase

BP1093 1142288-1144236 1949 - a n n n n n n n a phage-related or transposon-related transposase

BP1113 1168239-1172751 4513 - n n n n n n n n a pseudogenes putative competence protein (Pseudogene)

BP1114 1169957-1171905 1949 - n n n n n n n n a pseudogenes transposase (Pseudogene)

BP1118 1174305-1176253 1949 - a n n n n a a n n phage-related or transposon-related transposase

BP1130 1193047-1194995 1949 - a a n n n n n n a phage-related or transposon-related transposase

BP1134 1196355-1198303 1949 - a a a a a a a a a phage-related or transposon-related transposase

BP1135 1195884-1198765 2882 ssiD a a a a a a a a a pseudogenes alpha-ketoglutarate-dependent taurine dioxygenase (Pseudogene)

BP1136 1198027-1199558 1532 fecI a a a a a a a a a macromolecule synthesis/modification probable RNA polymerase sigma factor FecI

BP1137 1198553-1200513 1961 fecR a a a a a a a a a regulation putative signal transduction protein

BP1138 1199609-1203084 3476 bfrH a a a a a a a a a pathogenicity putative ferric siderophore receptor

BP1139 1202203-1203509 1307 - a a a a a a a a a adaptation putative iron uptake protein

BP1140 1202508-1205149 2642 - a a a a a a a a a pseudogenes putative iron uptake protein (Pseudogene)

BP1141 1204156-1205480 1325 - a a a a a a a a a adaptation putative iron uptake protein

BP1142 1204631-1206579 1949 - a a a a a a a a a phage-related or transposon-related transposase

BP1157 1221932-1223880 1949 - n n a n a n a n n phage-related or transposon-related transposase

BP1158 1222879-1224287 1409 - a n a n a n a n n miscellaneous putative dioxygenase

BP1159 1223397-1225249 1853 - a n a n a n a n n small molecule degradation putative 2-pyrone-4,6-dicarboxylic acid hydrolase

BP1160 1224244-1226204 1961 - a n a n a n a n n cell surface putative lipoprotein

BP1161 1225252-1227374 2123 - a n a n a n a n n miscellaneous putative racemase

BP1162 1226488-1228358 1871 - a n a n a n a n n regulation probable LysR-family transcriptional regulator

BP1163 1227485-1229262 1778 - a n a n a n a n n miscellaneous probable short-chain dehydrogenase

BP1164 1228418-1230837 2420 - a n a n a n a n n cell surface putative membrane protein

BP1165 1230274-1232969 2696 - a n a n a n a n n transport/binding proteins sodium/solute symporter

BP1165A 1232000-1233120 1121 - a n a n a n a n n cell surface putative membrane protein

BP1166 1232198-1234137 1940 ldcA a n a n a n a n n cell surface putative muramoyltetrapeptide carboxypeptidase

BP1167 1233466-1235210 1745 - a n a n a n a n n miscellaneous putative aldolase

BP1168 1234220-1236123 1904 - a n a n a n a n n regulation LysR-family transcriptional regulator

BP1169 1235268-1237183 1916 - a n a n a n a n n miscellaneous putative oxidoreductase

BP1170 1236225-1238176 1952 - a n a n a n n n n cell surface putative exported protein

BP1171 1237203-1238779 1577 - a n a n a n n n n conserved hypothetical conserved hypothetical protein

BP1172 1237796-1240378 2583 - a n a n a n n n n pseudogenes putative membrane protein (Pseudogene)

BP1174 1239455-1241061 1607 - a n a n a n n n n unknown hypothetical protein

BP1175 1240141-1241492 1352 - a n a n a n n n n cell surface putative exported protein

BP1176 1240578-1242565 1988 aruE a n a n a n n n n small molecule degradation putative succinylglutamate desuccinylase

BP1177 1241770-1243718 1949 - n n a n a n n n n phage-related or transposon-related transposase

BP1199 1261116-1263064 1949 - n n n a n a a n a phage-related or transposon-related transposase

BP1200 1260675-1264551 3877 bapB n n n a n a a n a pseudogenes adhesin (pseudogene)

BP1268 1341857-1343805 1949 - a a n n n n n n a phage-related or transposon-related transposase

BP1308 1376220-1378168 1949 - a a n n n n n n n phage-related or transposon-related transposase

BP1332 1405286-1407234 1949 - n n a n n n n n a phage-related or transposon-related transposase

BP1337 1409393-1411341 1949 - a n n n a n n n a phage-related or transposon-related transposase

BP1338 1408539-1411532 2994 - a n n n a n n n a pseudogenes putative membrane protein (Pseudogene)

BP1339 1410531-1412731 2201 - n n n n a n n n n small molecule degradation probable phospholipase

BP1361 1436348-1438296 1949 - a n n n n n n n a phage-related or transposon-related transposase

BP1365 1439946-1441960 2015 - a n n n n n n n a phage-related or transposon-related transposase for IS1663

BP1384 1459675-1462493 2819 cheD a n n a n n n n n cell processes methyl-accepting chemotaxis protein I

BP1385 1461616-1464233 2618 cheM n n n a n n n n n cell processes methyl-accepting chemotaxis protein II

BP1388 1465071-1467085 2015 - a n n n n n n n a phage-related or transposon-related transposase for IS1663

BP1397 1470834-1472782 1949 - a a n n n n n n n phage-related or transposon-related transposase

BP1398 1470606-1473852 3247 flaE a a n n n n n n n pseudogenes flagellar hook-length control protein (Pseudogene)

BP1407 1479708-1481981 2274 - n n n n n n n n a pseudogenes conserved hypothetical protein (Pseudogene)

BP1438 1513605-1515028 1424 - n a n n n n n n n cell surface putative membrane protein

BP1450 1523273-1525221 1949 - a n n a a n n n a phage-related or transposon-related transposase

BP1459 1535497-1537445 1949 - n a n n n n a n a phage-related or transposon-related transposase

BP1488 1565810-1567437 1628 - a a n a n a a a n cell surface putative membrane protein

BP1489 1566442-1569107 2666 - a a n a n a a a n cell surface putative membrane protein

BP1490 1568124-1569778 1655 trpF a a n a n a n n n amino acid biosynthesis N-(5'-phosphoribosyl)anthranilate isomerase

BP1491 1569260-1571208 1949 - a a n a a n a a a phage-related or transposon-related transposase

BP1492 1570242-1572337 2096 - n n n a n n n n n unknown hypothetical protein

BP1493 1571336-1573284 1949 - a n n a n n n n a phage-related or transposon-related transposase

BP1511 1586897-1588845 1949 - a n n n n n n n a phage-related or transposon-related transposase

BP1544 1615893-1617841 1949 - n a n n n n n n a phage-related or transposon-related transposase

BP1552 1630156-1632104 1949 - a a n a n n n n a phage-related or transposon-related transposase

BP1553 1631176-1632938 1763 - n n n a n n n n n unknown hypothetical protein

BP1555 1632640-1634489 1850 - n n n a n n n n n cell surface putative membrane protein

BP1556 1633551-1635175 1625 - n n n a n n n n n conserved hypothetical conserved hypothetical protein

BP1557 1634174-1636122 1949 - a n a a n n n n a phage-related or transposon-related transposase

BP1572 1650743-1652691 1949 - a a n n n n n n n phage-related or transposon-related transposase

BP1594 1672048-1673996 1949 - n n a n n n n n a phage-related or transposon-related transposase

BP1602 1681271-1683219 1949 - a n n n n n n n n phage-related or transposon-related transposase

BP1637 1722392-1724736 2345 - n n n a n n n n n cell surface putative exported protein

BP1638 1723735-1726247 2513 - n n n a n n n n n conserved hypothetical conserved hypothetical protein

BP1647 1734148-1736096 1949 - a n n n n n n n n phage-related or transposon-related transposase

BP1653 1738929-1740877 1949 - a a a n n n n n a phage-related or transposon-related transposase

BP1654 1739876-1742430 2555 wcbQ n a a n n n n n n cell surface putative capsular polysaccharide biosynthesis protein

BP1655 1741425-1743196 1772 wcbP n a a n n n n n n miscellaneous putative oxidoreductase

BP1656 1742257-1744205 1949 - n a a n n n n n a phage-related or transposon-related transposase

BP1676 1762187-1764575 2389 - n a n n n n n n n pseudogenes probable sulfatase (Pseudogene)

BP1677 1763574-1765000 1427 - n a n n n n n n n pseudogenes hypothetical protein (Pseudogene)

BP1678 1763999-1765947 1949 - n a n n n n n n n phage-related or transposon-related transposase

BP1679 1765052-1766244 1193 - n a n n n n n n n pseudogenes Putative LysR-family transcriptional regulator (Pseudogene)

BP1689 1772817-1774765 1949 - n a n n n n n n a phage-related or transposon-related transposase

BP1697 1781064-1783012 1949 - a a n n n n n n a phage-related or transposon-related transposase

BP1698 1782091-1783598 1508 - n a n n n n n n n cell surface putative exported protein

BP1717 1801327-1803341 2015 - n n n n n n n a a phage-related or transposon-related transposase for IS1663

BP1735 1818314-1820262 1949 - n a n n n n n n a phage-related or transposon-related transposase

BP1748 1832766-1834714 1949 - n n n n n n n n a phage-related or transposon-related transposase

BP1757 1843079-1845027 1949 - a n n n n n n n n cell surface putative exported protein

BP1792 1881135-1883083 1949 - a n n n n n n n a phage-related or transposon-related transposase

BP1793 1880892-1884414 3523 - a n n n n n n n a pseudogenes putative tracheal colonization factor (Pseudogene)

BP1807 1895279-1897227 1949 - a n n n n n n n a phage-related or transposon-related transposase

BP1808 1896354-1898161 1808 paaG a n n n n n n n a small molecule degradation probable enoyl-CoA hydratase

BP1809 1897364-1899312 1949 - a a a a a a a a a phage-related or transposon-related transposase

BP1810 1898413-1900361 1949 - n a a a a a a a a phage-related or transposon-related transposase

BP1811 1899613-1901634 2022 kdgT n n n n n n n n a pseudogenes 2-keto-3-deoxygluconate permease (Pseudogene)

BP1812 1900637-1902852 2216 - n n n n n n n n a conserved hypothetical conserved hypothetical protein

BP1844 1934564-1936512 1949 - a n n n a n n n n phage-related or transposon-related transposase

BP1866 1956111-1958059 1949 - n n n n n n a n a phage-related or transposon-related transposase

BP1897 1999180-2001127 1948 - a a n n n n n n a pseudogenes transposase (Pseudogene)

BP1911 2013242-2015190 1949 - n n n a n n n n a phage-related or transposon-related transposase

BP1912 2014322-2016285 1964 - n n n a n n n n n conserved hypothetical conserved hypothetical protein

BP1914 2017046-2019060 2015 - n n n n a n n n n phage-related or transposon-related transposase for IS1663

BP1947 2049106-2051054 1949 - n n n a n n a a a phage-related or transposon-related transposase

BP1948 2050103-2052309 2207 - n n n n n n a a a transport/binding proteins branched-chain amino acid-binding protein

BP1949 2051403-2054350 2948 - n n n n n n a a a transport/binding proteins putative permease component of branched-chain amino acid transport system

BP1950 2053349-2055123 1775 - n n n n n n a a a transport/binding proteins putative ATP-binding component of branched-chain amino acid ABC transporter

BP1951 2054122-2055890 1769 - n n n n n n a a a transport/binding proteins putative ATP-binding component of ABC transporter

BP1952 2055047-2058337 3291 - n n n a n n a a a pseudogenes putative cytochrome (Pseudogene)

BP1953 2057336-2058855 1520 - n n n n n n a a a miscellaneous probable oxidoreductase

BP1954 2057934-2060083 2150 - n n n n n n a a a miscellaneous putative monooxygenase

BP1955 2059386-2061136 1751 maiA n a n a n n a a a central/intermediary metabolism maleate cis-trans isomerase

BP1956 2060158-2061975 1818 - n a n a a n a a a pseudogenes probable alpha/beta hydrolase (Pseudogene)

BP1957 2061005-2063034 2030 - a a n a a n a a a conserved hypothetical conserved hypothetical protein

BP1958 2062052-2063667 1616 - a a n a a n a a a miscellaneous putative isochorismatase

BP1959 2062861-2064875 2015 - a a n a a n a a a phage-related or transposon-related transposase for IS1663

BP1960 2063991-2066892 2902 - n n n n a n a a a pseudogenes probable aldehyde dehydrogenase (Pseudogene)

BP1961 2066016-2068354 2339 - n n n n n n a a a energy metabolism putative flavocytochrome

BP1962 2067367-2070473 3107 bfrI n n n n n n a a a pathogenicity putative ferrisiderophore receptor

BP1963 2069609-2071134 1526 - n n n n n n a a a pseudogenes putative transcriptional regulator (Pseudogene)

BP1965 2070230-2072193 1964 - n n n n n n a a a cell surface putative exported protein

BP1966 2071212-2073775 2564 - n n n n n n a a a pseudogenes putative sulfatase (Pseudogene)

BP1968 2072901-2075167 2267 - n n n n n n a a a pseudogenes fusion between a transposase and a C-terminal portion of a regulatory protein (Pseudogene)

BP2018 2131648-2133596 1949 - a a n n n n n n n phage-related or transposon-related transposase

BP2029 2141144-2143092 1949 - n n n n n n a n a phage-related or transposon-related transposase

BP2034 2146342-2148302 1961 - n n n a n n n n n regulation putative LysR-family transcriptional regulator

BP2048 2166311-2168259 1949 - n n n a n n n n a phage-related or transposon-related transposase

BP2054 2172985-2174933 1949 - n a n n n n n n a phage-related or transposon-related transposase

BP2087 2208733-2210681 1949 - a a a n n n n n n phage-related or transposon-related transposase

BP2104 2227523-2229471 1949 - n n a n a n a n a phage-related or transposon-related transposase

BP2105 2228695-2230643 1949 - a n a n a a a n a phage-related or transposon-related transposase

BP2118 2240934-2243080 2147 - a a n n a a a n a pseudogenes transposase (Pseudogene)

BP2121 2244147-2246161 2015 - n n n n a n n n a phage-related or transposon-related transposase for IS1663

BP2135 2258370-2260318 1949 - a a n n n n n n a phage-related or transposon-related transposase

BP2136 2259461-2260872 1412 - a a n n n n n n a pseudogenes putative acetyltransferase (pseudogene)

BP2137 2259973-2261921 1949 - a a n n n n n n a phage-related or transposon-related transposase

BP2159 2280562-2282510 1949 - a n n n n n n n n pseudogenes transposase (Pseudogene)

BP2181 2299638-2301586 1949 - a a n n n n n n a phage-related or transposon-related transposase

BP2207 2329863-2331811 1949 - a n n n n n n n n phage-related or transposon-related transposase

BP2214 2337286-2339234 1949 - n n n n n n a n a phage-related or transposon-related transposase

BP2221 2345961-2347909 1949 - n n a n n n n n n phage-related or transposon-related transposase

BP2266 2386026-2387974 1949 - n n n n n a n n n phage-related or transposon-related transposase

BP2272 2393134-2395082 1949 - a n n n n n n n n phage-related or transposon-related transposase

BP2293 2414826-2417398 2573 - a n n n n n n n n transport/binding proteins putative solute-binding transport protein (periplasmic)

BP2294 2416508-2418522 2015 - a n n n n n n n n transport/binding proteins putative integral membrane transport protein

BP2295 2417521-2419358 1838 - a n n n n n n n n transport/binding proteins putative integral membrane transport protein

BP2296 2418365-2421000 2636 - a n n n n n n n n transport/binding proteins putative ABC transport ATP-binding subunit

BP2297 2419999-2421947 1949 - a a n n n n n n a phage-related or transposon-related transposase

BP2316 2447508-2449456 1949 - a a a a n n a n n phage-related or transposon-related transposase

BP2355 2494579-2496527 1949 - n n a n n n n n n phage-related or transposon-related transposase

BP2390 2527890-2529838 1949 - a n n n n n n n n phage-related or transposon-related transposase

BP2415 2556139-2558087 1949 - a a n n n n n n n phage-related or transposon-related transposase

BP2427 2568401-2571548 3148 - a n n n n n n n n pseudogenes transposase/threonine synthase (pseudogene)

BP2427A 2568607-2570552 1946 - a n n n n n n n n pseudogenes transposase for IS481 element (pseudogene)

BP2453 2593400-2595348 1949 - n n n n n n n n a phage-related or transposon-related transposase

BP2475 2619019-2620703 1685 - a n n n n n n n n miscellaneous hypothetical protein

BP2476 2619702-2621032 1331 - a n n n n n n n n transport/binding proteins putative membrane transport protein

BP2477 2620150-2622098 1949 - a a n a n n n a a phage-related or transposon-related transposase

BP2478 2621097-2624161 3065 - n n n a n n n n n transport/binding proteins putative membrane transport protein

BP2483 2628868-2632580 3713 kdpD n a n n n n n n n regulation Two component sensor protein

BP2484 2631624-2633317 1694 kdpE n a a n n n n n n regulation two component system transcriptional regulatory protein

BP2485 2632328-2634276 1949 - n a a n n n n n a phage-related or transposon-related transposase

BP2492 2638768-2640716 1949 - n a a n a n n n a phage-related or transposon-related transposase

BP2524 2673531-2675479 1949 - a n n n n n n n a phage-related or transposon-related transposase

BP2568 2719420-2721368 1949 - n a a n n n n n n phage-related or transposon-related transposase

BP2577 2728385-2730333 1949 - a a a a a a n n n phage-related or transposon-related transposase

BP2578 2729530-2730977 1448 dnaS a a a a a n n n n ribonucleotide biosynthesis deoxyuridine 5'-triphosphate nucleotidohydrolase

BP2579 2729976-2731924 1949 - n a a a a n n n a phage-related or transposon-related transposase

BP2582 2732985-2734933 1949 - a a n n n n n n a phage-related or transposon-related transposase

BP2587 2737316-2739264 1949 - a n n n n n n n a phage-related or transposon-related transposase

BP2608 2759138-2761086 1949 - n n n n n n n n a phage-related or transposon-related transposase

BP2672 2837066-2839014 1949 - a n n n n n a n a phage-related or transposon-related transposase

BP2673 2838013-2839961 1949 - a n n n n n a n a phage-related or transposon-related transposase

BP2679 2844336-2846284 1949 - a n n n n n n n a phage-related or transposon-related transposase

BP2704 2869212-2871160 1949 - n n n a n n a n n phage-related or transposon-related transposase

BP2705 2870297-2872401 2105 - n n n a n n n n n cofactor biosynthesis putative molybdenum-binding protein

BP2721 2886358-2888372 2015 - n n n n a n n n a phage-related or transposon-related transposase for IS1663

BP2724 2890518-2892466 1949 - n a a n a n n n a phage-related or transposon-related transposase

BP2733 2899494-2901442 1949 - n a n n n n n n a phage-related or transposon-related transposase

BP2734 2900543-2903042 2500 - n a n n n n n n n pseudogenes putative chelatase (Pseudogene)

BP2763 2940229-2942177 1949 - a n n n n n n n n phage-related or transposon-related transposase

BP2775 2950286-2953322 3037 - a n n n n a n n n pseudogenes putative exported protein (pseudogene)

BP2776 2950826-2952774 1949 - a n n n n a n n n phage-related or transposon-related transposase

BP2781 2955329-2957277 1949 - n a n n n n n n n phage-related or transposon-related transposase

BP2803 2976266-2978274 2009 - n n n a n n n n n cell surface putative integral membrane protein

BP2805 2978572-2980520 1949 - n n n n n n a n n pseudogenes transposase (Pseudogene)

BP2819 2996376-2998324 1949 - n a n n a n n n n phage-related or transposon-related transposase

BP2820 2997401-2999553 2153 - n a n n a n n n n central/intermediary metabolism putative alcohol dehydrogenase

BP2821 2998612-3000560 1949 - a a n n a n n n a phage-related or transposon-related transposase

BP2834 3009338-3011574 2237 - n a n n n n n n n pseudogenes transposase (Pseudogene)

BP2845 3020941-3022889 1949 - n n n n n n a n n phage-related or transposon-related transposase

BP2846 3020696-3023928 3233 - n n n n n n a n n pseudogenes putative exported protein (pseudogene)

BP2848 3023855-3025803 1949 - a n n n n n a n n phage-related or transposon-related transposase

BP2873 3049779-3051766 1988 - n n n a n n n n n cell surface putative exported protein

BP2874 3051067-3052766 1700 ribC n n n a n n n n n cofactor biosynthesis riboflavin synthase alpha chain

BP2875 3051807-3054178 2372 - n n n a n n n n n central/intermediary metabolism amidase

BP2876 3053555-3055503 1949 - a n a a n n n n a pseudogenes transposase (Pseudogene)

BP2877 3053219-3056249 3031 - a n a a n n n n a pseudogenes putative exported protein (pseudogene)

BP2884 3059943-3061891 1949 - a a n a n n a n a phage-related or transposon-related transposase

BP2885 3060890-3062964 2075 - n n n a n n n n n cell surface putative exported protein

BP2886 3062011-3063761 1751 - n n n a n n n n n cell surface putative lipoprotein

BP2907 3085366-3098954 13589 fhaL n n n a n n n n n pathogenicity adhesin

BP2912 3102204-3104152 1949 - n a n n n n n n n phage-related or transposon-related transposase

BP2913 3103253-3105054 1802 - n n n a n n n n n unknown hypothetical protein

BP2914 3104053-3105578 1526 - n n n a n n n n n macromolecule synthesis/modification putative RNA polymerase sigma factor

BP2915 3104631-3105997 1367 - n n n a n n n n n cell surface putative exported protein

BP2920 3107697-3109924 2228 - a n n n n n n n n transport/binding proteins putative chromate transport protein

BP2921 3108929-3111396 2468 - a n n n n n n n n cell surface putative exported protein

BP2922 3110410-3113606 3197 bfrG a n n n n n n n n pathogenicity putative TonB-dependent receptor

BP2923 3112771-3114206 1436 - a n n n n n n n n cell surface putative lipoprotein

BP2924 3113510-3114804 1295 - a n n n n n n n n cell surface putative exported protein

BP2925 3113803-3115685 1883 - a n n n n n n n n conserved hypothetical conserved hypothetical protein

BP2926 3114702-3116455 1754 - a n n n n n n n n conserved hypothetical conserved hypothetical protein

BP2927 3115461-3116956 1496 - a n n n n n n n n cell surface putative integral membrane protein

BP2928 3115990-3118418 2429 - a n n n n n n n n transport/binding proteins putative membrane transport protein

BP2929 3117559-3118904 1346 - a n n n n n n n n regulation putative regulatory protein

BP2930 3118027-3119372 1346 cyoD a n n n n n n n n energy metabolism cytochrome O ubiquinol oxidase

BP2931 3118374-3119983 1610 cyoC a n n n n n n n n energy metabolism cytochrome O ubiquinol oxidase

BP2932 3118989-3121966 2978 cyoB a n n n n n n n n energy metabolism ubiquinol oxidase polypeptide I

BP2933 3120972-3122884 1913 cyoA a n n n n n n n n energy metabolism putative ubiquinol oxidase polypeptide II

BP2934 3122095-3123647 1553 - a n n n n n n n n regulation putative two component system response regulator

BP2935 3122646-3124855 2210 - a n n n n n n n n regulation putative two component system, histidine kinase

BP2936 3124052-3126057 2006 - a n n n n n n n n cell surface putative exported protein

BP2937 3125198-3127062 1865 - a n n n n n n n n conserved hypothetical conserved hypothetical protein

BP2938 3126180-3127704 1525 - a n n n n n n n n pseudogenes conserved hypothetical protein (pseudogene)

BP2947 3134216-3136164 1949 - a n n n n n n n n phage-related or transposon-related transposase

BP2955 3142497-3144445 1949 - a n n n n n a n n phage-related or transposon-related transposase

BP2976 3162078-3164026 1949 - n n n n n n a n n phage-related or transposon-related transposase

BP3005 3196051-3197999 1949 - n a n n n n n n a phage-related or transposon-related transposase

BP3045 3244412-3247229 2818 - a n n n n a a n a pseudogenes conserved hypothetical protein (Pseudogene)

BP3046 3245145-3247093 1949 - a n n n n a a n a phage-related or transposon-related transposase

BP3049 3247544-3249492 1949 - a n n n n n n a n phage-related or transposon-related transposase

BP3055 3254031-3255979 1949 - n a n n n n n n a phage-related or transposon-related transposase

BP3091 3290759-3292707 1949 - a n n n n n n n a phage-related or transposon-related transposase

BP3109 3307248-3309685 2438 - n a n n n n n n n conserved hypothetical conserved hypothetical protein

BP3110 3308820-3310657 1838 - n a n n n n n n n regulation probable MerR-family transcriptional regulator

BP3111 3309829-3311777 1949 - n a n n n n n n a phage-related or transposon-related transposase

BP3112 3310783-3312482 1700 - n a n n n n n n n pseudogenes conserved hypothetical protein (Pseudogene)

BP3113 3311563-3314639 3077 - n a n n n n n n n pseudogenes putative DNA helicase (Pseudogene)

BP3114 3314156-3316104 1949 - n a n n n n n n a phage-related or transposon-related transposase

BP3164 3377010-3378958 1949 - n a n n n n n n n phage-related or transposon-related transposase

BP3185 3396872-3398820 1949 - a a a a a a n a a phage-related or transposon-related transposase

BP3186 3397921-3399869 1949 - a a a a a a a a a phage-related or transposon-related transposase

BP3203 3413819-3415767 1949 - n a n n n n n n a phage-related or transposon-related transposase

BP3210 3421855-3423803 1949 - n n n n n n n n a phage-related or transposon-related transposase

BP3216 3427443-3429457 2015 - n n n n n n n n a phage-related or transposon-related transposase for IS1663

BP3217 3428567-3429957 1391 - n n n n n n n n a conserved hypothetical conserved hypothetical protein

BP3218 3429097-3430775 1679 - n n n n n n n n a cell surface putative membrane protein

BP3219 3429782-3431778 1997 - n n n n n n n n a cell surface putative exported protein

BP3220 3430849-3432797 1949 - n n n n n n a n a phage-related or transposon-related transposase

BP3230 3441975-3443989 2015 - n n n n n n n n a phage-related or transposon-related transposase for IS1663

BP3243 3460736-3462750 2015 - n n n n a n n n n phage-related or transposon-related transposase for IS1663

BP3257 3474619-3476567 1949 - n n n n n a n n a phage-related or transposon-related transposase

BP3260 3477323-3479310 1988 - a n n n n n n n a pseudogenes transposase (Pseudogene)

BP3272 3488997-3490945 1949 - n a n n n n n n a phage-related or transposon-related transposase

BP3294 3512224-3514172 1949 - n n n n n n n n a phage-related or transposon-related transposase

BP3311 3530293-3532241 1949 - a a n a a a a n n phage-related or transposon-related transposase

BP3312 3531584-3533532 1949 - a a n a a a a n a phage-related or transposon-related transposase

BP3313 3532633-3534581 1949 - a a n a a a a n n phage-related or transposon-related transposase

BP3336 3558986-3560934 1949 - n n n n n n n n a phage-related or transposon-related transposase

BP3378 3589879-3591827 1949 - n n n n n n n n a pseudogenes transposase (Pseudogene)

BP3379 3589214-3592259 3046 - n n n n n n n n a unknown hypothetical protein

BP3386 3597125-3599073 1949 - a a n n n n n n a phage-related or transposon-related transposase

BP3392 3600961-3602909 1949 - a n n n n n n n n phage-related or transposon-related transposase

BP3406 3613814-3615762 1949 - a a a a a n n n n phage-related or transposon-related transposase

BP3407 3613781-3616778 2998 - a a a a a n n n n pseudogenes transposase for IS1002 (pseudogene)

BP3408 3615879-3617827 1949 - a a a a a a a a a phage-related or transposon-related transposase

BP3436 3643559-3645507 1949 - a n n n n n n n a phage-related or transposon-related transposase

BP3445 3655782-3657361 1580 - n n n n a n n n n miscellaneous putative NUDIX hydrolase

BP3446 3656360-3658269 1910 - n n n n a n n n n regulation LysR family transcriptional regulator

BP3447 3657387-3658786 1400 - n n n n a n n n n cell surface putative membrane protein

BP3448 3657806-3659406 1601 - n n n n a n n n n unknown hypothetical protein

BP3449 3658466-3659847 1382 - n n n n a n n n n cell surface putative membrane protein

BP3450 3659294-3660546 1253 - n n n n a n n n n pseudogenes C-terminal region of a putative membrane protein (Pseudogene)

BP3451 3659548-3661496 1949 - a a n n a n n n a phage-related or transposon-related transposase

BP3456 3665399-3668685 3287 maeB n a a a n a n n n central/intermediary metabolism NADP-dependent malic enzyme

BP3509 3720338-3722708 2371 - a a a a a a a a a phage-related or transposon-related transposase

BP3510 3720590-3722538 1949 - a a a a a a a a a pseudogenes hypothetical protein (pseudogene)

BP3519 3727785-3729733 1949 - a a a a a a a a a phage-related or transposon-related transposase

BP3520 3728834-3730782 1949 - a a a a a a a a a phage-related or transposon-related transposase

BP3548 3758991-3760939 1949 - n a a n n n n n a phage-related or transposon-related transposase

BP3607 3823149-3825097 1949 - n n n n n a n n a phage-related or transposon-related transposase

BP3688 3897050-3899091 2042 - a n n n n n n n a pseudogenes transposase (Pseudogene)

BP3698 3905684-3907632 1949 - a n n n n n n n a phage-related or transposon-related transposase

BP3726 3934692-3936640 1949 - a n n n n n n n a phage-related or transposon-related transposase

BP3806 4008911-4010859 1949 - a n n n n n n n n phage-related or transposon-related transposase

BP3810 4013379-4015327 1949 - a n n n n n n a n phage-related or transposon-related transposase

BP3811 4014428-4016376 1949 - a a n n a n n a n phage-related or transposon-related transposase

BP3851 4060343-4062291 1949 - a a n n n n n n a phage-related or transposon-related transposase