Analysing diversity in microbial communities from metagenomics to

Meta-population analysis: applications
in microbial ecology
Challenges for in situ studies of soil microbial communities
• Structure-function relationships: high molecular diversity and its relationship to functional diversity
• 10^4 to 10^6 different species per gram of soil
• Dormant groups represented as spores and small cells
• Unknown timescale for metatranscriptomics
• Need to target proteins, particularly enzymes in the soil matrix involved in biodegradation
• Develop sensitive meta-omic approaches
Questions:
1. Metagenomic versus metagenic?
2. How deep to sequence?
3. Which pipelines to use?
4. Do we enlist bioinformaticians?
5. What level of replication is appropriate: 3, 5, 10, 20?
6. Do we need specialist databases for specific functional
analyses?
Chitin degradation: chitinases, domain multiplicity
and shuffling
• Family 18 (F18) proteins consist of discrete domains connected by linkers, allowing them to rearrange, function, and evolve independently (Gilkes et al. 1991; Warren 1996; Henrissat and Davies 2000).
• Two or more catalytic domains can be present in the same protein and can be from different F18 classes and subgroups (Kawase et al. 2006; Suzuki et al. 1999).
Properties of family 18 and 19 chitinases
Multiplicity of genes
Synergy of proteins
Property | Family 18 | Family 19
Catalysis model | Substrate assisted | General acid-base
Mechanism | Retention mechanism | Inversion mechanism
Position of anomeric oxygen at C1 | Equatorial (β) | Axial (α)
Exochitinase or endochitinase | Exo- and endo- | Endo-
Inhibitor(s) | Allosamidin | Amidines, amidrazones and nojiritetrazoles
Chitinase multiplicity
• Family 18 chitinases are common in Actinobacteria, Bacillales,
the Clostridiales classes of Firmicutes, the Burkholderiales class
of the Betaproteobacteria and in all classes of
Gammaproteobacteria (Karlsson and Stenlid 2009).
– Highest number (10): Streptomyces coelicolor A3(2) (Kawase et al. 2006)
• Alteromonas sp. strain O-7 has four specialised chitinases
– When combined at an optimal ratio, chitinolytic activity was much greater than the summed activity of the individual chitinases, demonstrating synergy (Orikoshi et al. 2005)
Shot-gun sequencing a soil metagenome
• Soil metagenome: MiSeq, 15 M reads, 7 Gb of data, annotated using GenDB.
• Coverage of 29.6 Mb, with 10,839 putative genes annotated plus 8,777 needing further attention.
• Preliminary Pfam searches yielded sequences or part
sequences with the Pfam domains 51 enzymes
The metagenomic library was used to create a translated
trypsin digested database for metaproteomics
Plasmid clone library and phenotypic screening
• Hit rate ~10^-4 for chitinase activity
• Some novelty
• Labour intensive
• Captured gene
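As a rough illustration of why a hit rate of about 10^-4 makes phenotypic screening labour intensive, the sketch below estimates how many clones would need to be screened to see at least one positive. It assumes every clone is an independent trial with the same hit probability, which real libraries only approximate. The function name `clones_needed` and the 95% target are illustrative choices, not taken from the slides.

```python
import math


def clones_needed(hit_rate: float, confidence: float = 0.95) -> int:
    """Smallest number of clones n such that P(at least one hit) >= confidence.

    Assumes independent clones, each positive with probability hit_rate
    (a simplifying assumption for illustration only).
    """
    if not (0.0 < hit_rate < 1.0) or not (0.0 < confidence < 1.0):
        raise ValueError("hit_rate and confidence must lie strictly between 0 and 1")
    # P(no hit in n clones) = (1 - hit_rate) ** n, which must be <= 1 - confidence.
    n = math.log(1.0 - confidence) / math.log(1.0 - hit_rate)
    return math.ceil(n)


if __name__ == "__main__":
    # Hit rate quoted on the slide: ~10^-4.
    print(clones_needed(1e-4))
```

Under these assumptions the answer is on the order of 30,000 clones for a 95% chance of a single hit, which is consistent with the "labour intensive" remark.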
Metagenomic expression screening chitin amended
sandy soil
MUF-diNAG activity assays for fosmids from test soil microcosms amended with 1% α-chitin (D7)
[Figure: plate readings from the MUF-diNAG assay for the D7 and G9 fosmids. Most wells read between roughly 7,700 and 15,000, a second group reads about 400 to 700, and a few wells read between about 110,000 and 200,000. The plate layout could not be recovered from the extracted values.]
Metaproteomics
Soil metaproteome: vigorous vortexing in 50 ml of cell lysis buffer with freshly made DTT, centrifugation, aggressive total extraction with methanol and TCA; soil pellet re-extracted three times.
Three-phase soil metaexoproteome extraction: a gentle enzyme extraction with a buffer and metal chelator, dialysis to remove salt, and two-stage concentration by ultrafiltration.
Gel slices LC-ESI-MS/MS: in house (triple quadrupole Quantiva + Orbitrap Fusion, Alex Jones)
2D-LC Velos LTQ-Orbitrap analysis: Nathan VerBerkmoes, Berg Diagnostics
Protein extraction from soil
In summary, the three-phase soil exoproteome extraction comprises a gentle
enzyme extraction with a buffer and metal chelator, dialysis to remove salt,
and two-stage concentration by ultrafiltration
Johnson-Rollings et al ISME J 2014
Proteins and peptides from mass spectrometric analysis: Cuban soil
Proteins identified from >1 peptide: 61
Proteins identified from 1 peptide: 169
Total peptides identified: 1502 (47%)
Unassigned peptides: 1682 (53%)
Total peptides: 3184
Extracts were tryptically digested and submitted to LC-ESI-MS/MS (gel-based and solution-based approaches) and to LC-ESI-MSE (a quantitative, label-free nanoLC-MSE approach).
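The peptide totals on this slide can be cross-checked with a few lines of arithmetic. The sketch below assumes that 1502 is the number of identified peptides and 1682 the number of unassigned peptides, which is how the figures add up to the stated total of 3184. The function name `peptide_summary` is hypothetical.

```python
def peptide_summary(identified: int, unassigned: int) -> dict:
    """Return the total and the rounded percentage of each peptide class."""
    total = identified + unassigned
    return {
        "total": total,
        "identified_pct": round(100.0 * identified / total),
        "unassigned_pct": round(100.0 * unassigned / total),
    }


if __name__ == "__main__":
    # Values read from the slide (assignment to categories is an assumption).
    print(peptide_summary(identified=1502, unassigned=1682))
    # -> {'total': 3184, 'identified_pct': 47, 'unassigned_pct': 53}
```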
Superkingdoms by % of significant identified peptides
Bacteria: 96%
Eukaryota: 4%
Note: this includes only protein hits with >1 unique peptide.
Note: the next chart is the same as the previous pie chart but split up by phylum.
Cluster of Orthologous Groups (COG) categories for chitin-amended Cuban soil metaexoproteome
Unknown function, 15%
Carbohydrate transport and metabolism, 13%
Coenzyme transport and metabolism, 3%
General function, 8%
Inorganic ion transport and metabolism, 16%
Amino acid transport and metabolism, Signal transduction, 18%
Other, 9%
Energy production and conversion, 2%
Secondary metabolites biosynthesis, transport and catabolism, 2%
Amino acid transport and metabolism, 21%
Amino acid transport and metabolism,
Carbohydrate transport and metabolism, 2%
http://img.jgi.doe.gov/
Examples of previous work on secretomes or
exoproteomes
• Several pathogenic bacteria, e.g. Erwinia chrysanthemi: 25 different proteins including cellulases, proteases, flagellin and intracellular proteins (Gohar et al., 2005); and Bacillus cereus: 46 proteins including degradative enzymes and toxins such as proteases, phospholipases, haemolysins and enterotoxins (Kazemi-Pour et al., 2004)
• Non-pathogenic bacteria, e.g. Ruegeria pomeroyi: 60 different proteins, many of them ABC- and TRAP-related transporters (Christie-Oleza and Armengaud, 2010)
• Cellulolytic thermophile Thermobifida fusca, grown in cellulose and/or lignin and iTRAQ (high-throughput isobaric tag) labelled for relative and absolute quantification: 55 proteins including cellulases, proteases, peroxidases and transporter proteins (Adav et al., 2010).
[Figure: taxonomic tree, split based on superkingdom, phylum, class, order, and organism. Legible group labels include Bacteria, Eukaryota, Actinobacteria, Proteobacteria, Alphaproteobacteria, Gammaproteobacteria, Actinomycetales, Rhizobiales, Rhodobacterales, Oceanospirillales, Chlorophyta and Cnidaria. Legible organism labels include Actinosynnema mirum DSM 43827, Nocardiopsis dassonvillei subsp. dassonvillei DSM 43111, Salinispora tropica CNB-440, Hahella chejuensis KCTC 2396, Ruegeria pomeroyi DSS-3, Agrobacterium tumefaciens str. C58, Agrobacterium vitis S4, Roseovarius sp. 217, Roseobacter sp. MED193, Aurantimonas manganoxydans SI85-9A1, Pseudovibrio sp. JE062, Labrenzia alexandrii DFL-11, Hoeflea phototrophica DFL-43, Sinorhizobium meliloti 1021, Sinorhizobium medicae WSM419, Sinorhizobium fredii NGR234, Mesorhizobium loti MAFF303099, Mesorhizobium sp. BNC1, Rhizobium etli GR56, Rhizobium etli CFN 42 and Methylocella silvestris BL2. The remaining labels were not recoverable.]
Some proteins from Cuban soil metaexoproteome
Organism
Protein
general L-amino acid ABC
transporter, substrate-binding
Rhizobium etli GR56
protein
hypothetical protein
Roseovarius sp. 217
ROS217_05884
general L-amino acid ABC
Agrobacterium radiobacter K84 transporter
predicted phosphate ABC
transporter, substrate-binding
Sinorhizobium fredii NGR234 protein
Agrobacterium vitis S4
Thermomonospora curvata
DSM 43183
Nocardiopsis dassonvillei
subsp. dassonvillei DSM 43111
Thermobispora bispora DSM
43833
hypothetical protein Avi_9075
Unknown
Superoxide dismutase
class III aminotransferase
aminotransferase class-III
aminotransferase class-III
COG
Description
ABC-type amino acid transport/signal
Amino acid transport and transduction systems, periplasmic
metabolism
component/domain
Amino acid transport and
metabolism
Amino acid transport and
metabolism
Inorganic ion transport
and metabolism
Carbohydrate transport
and metabolism
Amino acid transport and
metabolism
Coenzyme transport and
metabolism
Coenzyme transport and
metabolism
Inorganic ion transport
and metabolism
ABC-type sugar transport system,
periplasmic component
4-aminobutyrate aminotransferase and
related aminotransferases
Adenosylmethionine-8-amino-7oxononanoate aminotransferase
Adenosylmethionine-8-amino-7oxononanoate aminotransferase
Organism: Nocardiopsis dassonvillei subsp. dassonvillei DSM 43111; Protein: dihydrolipoamide dehydrogenase; COG: Energy production and conversion; Description: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide dehydrogenase (E3) component, and related enzymes
Organism: Actinosynnema mirum DSM 43827; Protein: amidohydrolase; COG: General function prediction; Description: Metal-dependent amidase/aminoacylase/carboxypeptidase
Organism: Nocardiopsis dassonvillei subsp. dassonvillei DSM 43111; Protein: Glycosyl hydrolase Family 18 (GH18) chitinase II; COG: Carbohydrate transport and metabolism
Visual representation of the distribution
of primary COG assignments of
proteins retrieved from
α- and β-chitin amended soil.
*denotes periplasmic
component for
abbreviation purposes
Chitinases detected
• Nocardiopsis dassonvillei subsp. dassonvillei DSM
43111 glycoside hydrolase family 18 (GH18) chitinase
II; cellulose-binding family II; Fibronectin type III
domain protein
Also detected
• Microbulbifer hydrolyticus ChiC GH18 class II
endochitinase with chitin/cellulose binding domain
• Kitasatospora setae putative chitinase precursor class
II GH18 chitinase
A visual summary of the assigned bacterial community
structure, recovered metaexoproteome community, and GH18
chi gene taxonomic matches for the combined α- and β-chitin
amended soil
Metaproteomics
Soil metaproteome: vigorous vortexing in 50 ml of cell lysis buffer with freshly made DTT, centrifugation, aggressive total extraction with methanol and TCA; soil pellet re-extracted three times.
Three-phase soil metaexoproteome extraction: a gentle enzyme extraction with a buffer and metal chelator, dialysis to remove salt, and two-stage concentration by ultrafiltration.
Gel slices LC-ESI-MS/MS: in house
2D-LC Velos LTQ-Orbitrap analysis: Nathan VerBerkmoes, New England Labs
Proteins in soil
Three-phase soil metaexoproteome extraction comprised a gentle enzyme extraction with a buffer and metal chelator, dialysis to remove salt, and two-stage concentration by ultrafiltration.
Soil metaproteome: vigorous vortexing in 50 ml of cell lysis buffer with freshly made DTT, centrifugation, aggressive total extraction with methanol and TCA; soil pellet re-extracted three times.
soil
metaexoproteome
Gram positive
Gram negative
Metaproteome combined with metaexoproteome of a soil
Metaproteome
metaexoproteome
Total metaproteome chitinase hits
Sample | Run | Protein ID | Phylum | Class | Family | Genus | Species
TS a TP | 1/2 | Chitinase a | Actinobacteria | Actinobacteria | Frankineae | Acidothermus | cellulolyticus
TS a TP | 1/2 | Chitinase b | Actinobacteria | Actinobacteria | Frankineae | Acidothermus | cellulolyticus
TS a TP | 1 | Chitinase c | Actinobacteria | Actinobacteria | Pseudonocardiaceae | Amycolatopsis | mediterranei
TS a TP | 1 | Chitobiase d | Proteobacteria | Betaproteobacteria | Burkholderiaceae | Burkholderia | pseudomallei
TS a TP | 1 | Chitinase e | Proteobacteria | Betaproteobacteria | Burkholderiaceae | Burkholderia | pseudomallei
TS a TP | 1/2* | Chitinase f | Actinobacteria | Actinobacteria | Streptomycetaceae | Streptomyces | coelicolor A3(2)
TS a TP | 1 | Chitinase g | Firmicutes | Clostridia | Syntrophomonadaceae | Syntrophomonas | wolfei
TS a TP | 1 | Chitinase h | Uncultured Bacterium | - | - | - | -
TS b XP | 1/2 | Chitosanase i | Firmicutes | Bacilli | Listeriaceae | Listeria | grayi
TS b TP | 2* | Chitinase j | Actinobacteria | Actinobacteria | Streptomycetaceae | Streptomyces | avermitilis
TS b TP | 2* | Chitinase k | Actinobacteria | Actinobacteria | Streptomycetaceae | Streptomyces | avermitilis
TS b TP | 2 | Chitinase l | Actinobacteria | Actinobacteria | Streptomycetaceae | Streptomyces | bingchenggensis
TS b TP | 2* | Chitinase m | Actinobacteria | Actinobacteria | Streptomycetaceae | Streptomyces | griseus
TS b TP | 2 | Chitinase n | Actinobacteria | Actinobacteria | Streptomycetaceae | Streptomyces | sp. SPB74
TS b TP | 2 | Chitinase o | Actinobacteria | Actinobacteria | Streptomycetaceae | Streptomyces | sviceus
TS b TP | 2 | Chitinase p | Actinobacteria | Actinobacteria | Streptomycetaceae | Streptomyces | thermoviolaceus
Community changes due
to baiting with chitin
16S rRNA gene
sequencing
16S rRNA library versus the metaproteome
[Figure: scatter plot of 16S instances per genus against protein instances per genus (0 to 3500) for the Exo B, Whole A and Whole B datasets. Labelled genera: Rhodanobacter, Nocardia, Burkholderia, Kitasatospora, Streptomyces, Rhodoplanes, Pseudomonas, Bradyrhizobium, Rhodopseudomonas, Afipia, Massilia, Yersinia, Salmonella, Shigella and Escherichia.]
Conclusions
• Protein extraction from soil is now a realistic approach for studying microbial function
• Community analysis overemphasises biomass rather than activity
• Initial metaexoproteome studies give promising insight into key biodegraders and transport proteins
• Some ‘rare’ actinobacteria are highly active in soil
A metagenomic approach to drug discovery
Metagenome/Amplicon
sequencing
Allows the study of diversity of soils.
Could be used for a method of
prospecting the bioactive potential of
various soils
Functional metagenomics
The extraction of total community
DNA and cloning it into a vector to
capture genes of interest
Will allow capture of biosynthetic
gene clusters
Antarctica: believed to be a potential biodiversity hotspot (Yergeau et al. 2007), with unique soil chemistry when compared to the surrounding area (Chong et al. in press).
Situated on the Antarctic Peninsula, 1000 km from South America, isolated by the Antarctic Circumpolar Current and the prevailing wind direction from the continental interior.
Metagenomic library
454 sequencing reads
Antarctic
metagenome:
gene ontology by
Function
MG-RAST
1 Gb
Pearce et al 2013
Antarctic metagenome genus prevalence
(- = level not given)
Phylum | Class | Order | Family | Genus | Count
Actinobacteria | Actinobacteria | Actinomycetales | unclassified_Actinomycetales | unclassified_Actinomycetales | 13132
Proteobacteria | Gammaproteobacteria | Alteromonadales | Alteromonadaceae | Shewanella | 12650
Bacteroidetes | Bacteroidetes | Bacteroidales | Prevotellaceae | Prevotella | 1082
Proteobacteria | Gammaproteobacteria | Enterobacteriales | Enterobacteriaceae | unclassified_Enterobacteriaceae | 551
Proteobacteria | Gammaproteobacteria | Enterobacteriales | Enterobacteriaceae | Escherichia | 320
unclassified_Bacteria | - | - | - | unclassified_Bacteria | 207
Proteobacteria | Gammaproteobacteria | Enterobacteriales | Enterobacteriaceae | Salmonella | 180
Proteobacteria | Gammaproteobacteria | Xanthomonadales | Xanthomonadaceae | unclassified_Xanthomonadaceae | 85
Proteobacteria | Gammaproteobacteria | Oceanospirillales | unclassified_Oceanospirillales | unclassified_Oceanospirillales | 80
Proteobacteria | Gammaproteobacteria | unclassified_Gammaproteobacteria | - | unclassified_Gammaproteobacteria | 48
Proteobacteria | Betaproteobacteria | Burkholderiales | unclassified_Burkholderiales | unclassified_Burkholderiales | 37
Proteobacteria | Betaproteobacteria | unclassified_Betaproteobacteria | - | unclassified_Betaproteobacteria | 33
Proteobacteria | Betaproteobacteria | Rhodocyclales | Rhodocyclaceae | Dechloromonas | 28
Proteobacteria | unclassified_Proteobacteria | - | - | unclassified_Proteobacteria | 22
Proteobacteria | Gammaproteobacteria | Enterobacteriales | Enterobacteriaceae | Pantoea | 20
Proteobacteria | Alphaproteobacteria | Rhodobacterales | Rhodobacteraceae | unclassified_Rhodobacteraceae | 16
Firmicutes | Bacilli | Lactobacillales | Leuconostocaceae | Leuconostoc | 11
Bacteroidetes | Sphingobacteria | Sphingobacteriales | Crenotrichaceae | Chitinophaga | 10
Proteobacteria | Gammaproteobacteria | Enterobacteriales | Enterobacteriaceae | Shigella | 10
Cyanobacteria | Cyanobacteria | Deferribacterales | unclassified_Deferribacterales | unclassified_Deferribacterales | 9
Firmicutes | Bacilli | Lactobacillales | Streptococcaceae | Lactococcus | 9
Proteobacteria | Deltaproteobacteria | unclassified_Deltaproteobacteria | - | unclassified_Deltaproteobacteria | 7
Firmicutes | Bacilli | Bacillales | unclassified_Bacillales | unclassified_Bacillales | 6
Verrucomicrobia | Verrucomicrobiae | Verrucomicrobiales | Verrucomicrobiaceae | Verrucomicrobium | 6
TM7 | TM7 | - | - | Genera_incertae_sedis_TM7 | 5
Planctomycetes | Planctomycetacia | Planctomycetales | Planctomycetaceae | unclassified_Planctomycetaceae | 5
Proteobacteria | Alphaproteobacteria | Sphingomonadales | Sphingomonadaceae | unclassified_Sphingomonadaceae | 5
Proteobacteria | Deltaproteobacteria | Myxococcales | unclassified_Myxococcales | unclassified_Myxococcales | 5
Proteobacteria | Gammaproteobacteria | Alteromonadales | Alteromonadaceae | unclassified_Alteromonadaceae | 5
Detection of antibiotic gene clusters PCR screening
BACs and fosmids
Peptides
Polyketides
NRPS-nonribosomal peptide synthetase
PKS-polyketide synthase
[Figure: conserved peptide synthetase motifs along a 0 to 600 scale: KAGGA, SGTTGXPK, TGD, KIRGXRIE, NGK and LGGXS. Below it, the minimal PKS (ORF 1 to ORF 3: KSα, KSβ and ACP, act I) with the ketoreductase (KR, act III) and the aromatase/cyclase genes (act VII, act IV), compared across the act, gra, fren, gris and tcm clusters.]
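The conserved peptide synthetase motifs shown on this slide are the kind of signature that PCR primers and in silico screens target. As a minimal sketch, the code below scans a protein sequence for the motifs that are legible on the slide, treating X as "any residue". The motif list is copied from the slide; the function names and the example sequence are made up for illustration and do not come from any real NRPS.

```python
import re

# Motifs as written on the slide; X stands for any amino acid.
MOTIFS = ["KAGGA", "SGTTGXPK", "KIRGXRIE", "NGK", "LGGXS"]


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Translate a motif into a regular expression, mapping X to a wildcard."""
    return re.compile("".join("." if c == "X" else re.escape(c) for c in motif))


def find_motifs(sequence: str, motifs=MOTIFS) -> dict:
    """Return the 0-based start positions of each motif in the sequence."""
    seq = sequence.upper()
    return {m: [hit.start() for hit in motif_to_regex(m).finditer(seq)] for m in motifs}


# Hypothetical toy sequence assembled so that each motif occurs once.
EXAMPLE = "ML" + "KAGGA" + "YVP" + "SGTTGKPK" + "GVAA" + "KIRGFRIE" + "L" + "NGK" + "VD" + "LGGHS" + "L"

if __name__ == "__main__":
    for motif, positions in find_motifs(EXAMPLE).items():
        print(motif, positions)
```

A real screen would work on translated reads or clone sequences and would usually allow mismatches; exact matching is used here only to keep the sketch short.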
Diversity of NRPS: library screening
[Figure: phylogenetic tree of NRPS sequences from the Antarctic library and the 3 European soils (clones labelled Druridge, Athens and Cockle), with markers from GenBank in bold. The tree was constructed using the neighbour-joining method; the numbers beside the branches indicate the percentage bootstrap value of 1000 replicates. The scale bar indicates 10% nucleotide dissimilarity. Highlighted: nrps2-1 of S. avermitilis MA-4680 with Antarctic clones 78 and 104.]
Gene clusters showing some similarity
ST1P6A4, ~30 kb
Similarity 99 %
Delftia acidovorans
SPH-1, ~59 kb
Delftibactin (NRP siderophore interacting with gold, produced by Delftia acidovorans)
Analysis of biosynthetic gene diversity
None of the soils differed significantly in alpha diversity for 16S, 16S Actino, NRPS or PKS sequences using several measures of diversity (Chao1, PD, Observed).
The soils differed significantly in beta-diversity, clustering significantly away from one another.
Panels: 16S, Actino, PKS, NRPS
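Two of the alpha-diversity measures named above, observed richness and Chao1, can be computed directly from a vector of per-OTU counts. The sketch below uses the bias-corrected form of Chao1; the slides do not say which variant or software was used, so this is an assumption, and the example counts are invented. PD (phylogenetic diversity) needs a tree and is not sketched here.

```python
def observed_richness(counts) -> int:
    """Number of OTUs (or sequence clusters) seen at least once."""
    return sum(1 for c in counts if c > 0)


def chao1(counts) -> float:
    """Bias-corrected Chao1 estimate: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 is the number of singletons and F2 the number of doubletons.
    """
    s_obs = observed_richness(counts)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


if __name__ == "__main__":
    # Hypothetical counts for one soil sample: 7 observed OTUs,
    # 3 singletons and 2 doubletons.
    example = [10, 5, 2, 2, 1, 1, 1, 0]
    print(observed_richness(example), chao1(example))
```

Chao1 adds an estimate of unseen OTUs based on how many were seen only once or twice, so it is never smaller than the observed richness.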
Relationships between function and phylogeny: Procrustes
PKS and NRPS diversity correlated with one another (Mantel P < 0.001).
Both functional genes correlated significantly with the phylogeny present in the soils. Analysis of taxa revealed that the diversity of Actinobacteria, Proteobacteria and Bacteroidetes was the main driver.
Panels: PKS and NRPS; NRPS and Actinobacteria; PKS and Actinobacteria
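The Mantel result quoted above compares two distance matrices built on the same samples. As a minimal sketch of the idea, the code below implements a one-sided permutation Mantel test with a Pearson statistic in plain Python. The slides do not state which software or settings were used, so the function names, the number of permutations and the toy matrices are all assumptions for illustration. Procrustes analysis is a separate technique and is not sketched here.

```python
import math
import random


def upper_triangle(matrix, order):
    """Flatten the upper triangle of a symmetric matrix after reordering samples."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (r, p): observed correlation and one-sided permutation p-value."""
    identity = list(range(len(dist_a)))
    a = upper_triangle(dist_a, identity)
    r_obs = pearson(a, upper_triangle(dist_b, identity))
    rng = random.Random(seed)
    at_least = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute sample labels of the second matrix
        if pearson(a, upper_triangle(dist_b, order)) >= r_obs:
            at_least += 1
    return r_obs, at_least / (permutations + 1)


# Hypothetical beta-diversity distances between five soils.
TOY_PKS = [
    [0.0, 0.2, 0.6, 0.7, 0.8],
    [0.2, 0.0, 0.5, 0.6, 0.7],
    [0.6, 0.5, 0.0, 0.3, 0.4],
    [0.7, 0.6, 0.3, 0.0, 0.2],
    [0.8, 0.7, 0.4, 0.2, 0.0],
]
TOY_NRPS = [
    [0.0, 0.3, 0.5, 0.8, 0.9],
    [0.3, 0.0, 0.4, 0.7, 0.8],
    [0.5, 0.4, 0.0, 0.4, 0.5],
    [0.8, 0.7, 0.4, 0.0, 0.1],
    [0.9, 0.8, 0.5, 0.1, 0.0],
]

if __name__ == "__main__":
    r, p = mantel(TOY_PKS, TOY_NRPS)
    print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

Sample labels are permuted rather than individual distances because the entries of a distance matrix are not independent of one another. With only a handful of samples the smallest attainable p-value is limited by the number of distinct permutations.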
Network analysis: Cytoscape showing significant β-diversity
Conclusion: beta-diversity of key taxa (Actinobacteria) corresponds to beta-diversity of secondary metabolites
outgroup
Mycobacteria:
Pathogen ecology
Fast growing
mycobacteria
Order Actinomycetales,
Family
Mycobacteriaceae,
genus Mycobacterium
include ‘atypical’ or
‘nontuberculous’
mycobacteria or MOTT
Slow growing
mycobacteria
Neighbour-joining Phylogenetic Tree
(16S
Why use 16S rRNA for identity?
[Figure: bar chart of classification accuracy (0% to 100%) at each taxonomic rank (Kingdom, Phylum, Class, Order, Family, Genus, Species) for Blast, RDP and Uclust; Actinomycetales.]
Oligotyping
• Perform and analyse many sequence reads
• Entropy analysis of variable sites in sequences
• Identification of short hypervariable region of 16S rRNA
• More detailed phylogenetic structure of community
• Important connection between microbial community structure and
environmental dataset
Shannon entropy analysis
\[ H(X) = -\sum_{i=0}^{n} p(x_i) \log_b p(x_i) \]
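In oligotyping, the entropy above is evaluated column by column over aligned reads, so that the most variable positions can be picked out. The sketch below is a minimal illustration of that step, assuming equal-length aligned reads and log base 2; the function names and the four toy reads are hypothetical and not taken from the WIKW data.

```python
import math
from collections import Counter


def column_entropy(column, base: float = 2.0) -> float:
    """Shannon entropy of the characters observed in one alignment column."""
    counts = Counter(column)
    total = sum(counts.values())
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log(p, base)
    return entropy


def alignment_entropy(reads, base: float = 2.0) -> list:
    """Entropy of every column of a set of equal-length aligned reads."""
    if len({len(r) for r in reads}) > 1:
        raise ValueError("reads must be aligned to the same length")
    return [column_entropy(col, base) for col in zip(*reads)]


if __name__ == "__main__":
    # Hypothetical aligned reads: columns 0 and 1 are conserved,
    # columns 2 and 3 are split evenly between two bases.
    toy_reads = ["ACGT", "ACGA", "ACTA", "ACTT"]
    print(alignment_entropy(toy_reads))
```

Conserved columns score zero and the variable columns score highest, which is the property used to choose the positions that define oligotypes.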
Oligotyping: tetramers give insufficient separation
[Figure: oligotype proportions for WIKW (water sample). The two largest oligotypes account for 63.9% and 19.3%; the others shown are 4.7%, 3.5%, 3.1%, 1.3%, 1.1%, 0.8%, 0.6% and 0.4%. Oligotypes are labelled as tetramers such as CGGG, TATA, TAAA, TATT and TAGA; the mapping of labels to proportions was not recoverable.]
Matching oligotypes to species
[Figure: Blast ID results for WIKW (water sample). The largest groups are 63.92% and 19.27%, followed by 4.70%, 3.52%, 3.10%, 1.34% and 1.05%; all remaining groups are below 1%. Species labels were not recoverable.]
Prevalence of SG Mycobacterium species in relation to sample type
OTU network
Clustering analysis
bTB and TB in Tanzania - qPCR
Faecal shedding: cattle herds, RD4 scar assay (bTB); goat herds, RD4 scar assay
Household dust: RD9 assay (TB)
Prevalence: qPCR detection of M. bovis vs amplicon sequencing
[Figure: M. bovis cell copies per gram or per ml (log scale, 1.00E+00 to 1.00E+04) by RD4 qPCR detection, shown with M. tuberculosis complex pyrosequencing reads, for samples Ba8, Bu3, Bu4, Ga1, Ga3 and Wo5.]
2.38% (1/42) of soils were positive for M. bovis
11.90% (5/42) of water samples were positive for M. bovis
The data suggest that water sources are a potential reservoir of M. bovis infection
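The two prevalence figures above rest on 42 samples each, so it can help to look at the uncertainty around them. The sketch below recomputes the percentages and adds a Wilson score interval; the interval is not part of the original analysis, and the choice of a 95% level (z = 1.96) and the function name are assumptions made for illustration.

```python
import math


def wilson_interval(positives: int, n: int, z: float = 1.96):
    """Wilson score interval for a proportion (default z gives about 95%)."""
    if n <= 0 or not 0 <= positives <= n:
        raise ValueError("need n > 0 and 0 <= positives <= n")
    p = positives / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    # Counts quoted on the slide: 1/42 soils and 5/42 water samples positive.
    for label, positives in (("soil", 1), ("water", 5)):
        low, high = wilson_interval(positives, 42)
        print(f"{label}: {100 * positives / 42:.2f}% "
              f"(interval {100 * low:.1f}% to {100 * high:.1f}%)")
```

The Wilson interval is used here rather than the simpler normal approximation because it behaves better when the number of positives is very small.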
Human-Environment-Livestock-Interface
Targeting specific groups within a metagenomic community
[Figure: bar chart of number of sequences (0 to 20000) assigned to other genera, other Mycobacterium species and slow-growing Mycobacterium species. Axis labels: 454 pyrosequencing, MiSeq, metagenome, group-specific primers (APTK primer) and universal 16S rRNA primers. Annotations on the chart: 0.3~1%, 60% and "?".]
Significant challenges:
1. Depth of metagenome still a limitation
2. Plasmid/fosmid/BAC metagenomic clone libraries
with PCR/expression screening enable activity
studies
3. Better BACs needed for large library construction
4. Still major issues with annotation
5. More full meta-omic comparative analyses will
improve understanding of biases
6. Reverse genetic tools needed for exploitation of
metaproteomes
7. Amplicon seq and qPCR needed to support
metagenome data
ACKNOWLEDGEMENTS Co-workers
Ashley Johnson-Rollings metaproteomics
Helena Wright
Nathan VerBerkmoes
Vicky Hibberd
Berg Diagnostics,
Graziana Masciandaro,
Greg Amos
IES, Italy.
Chira Borsett
Paris Laskaris gene cluster analyses
Nikos Kyratsous
Orin Courtenay pathogen ecology
Rudovick Kazwala, Goodluck Paul, Joseph
Emma Travis
Malakalinga (Tanzania) Woutrina Millar (UCD), Phillip
Phillip James
Hopewell (UCSF), Glyn Hewinson, Jason Sawyer,
Will Gaze
Dez Delahay (AHVLA)
David Porter
Frank Sweeney
Archer Hung
Hayley King
Andrew Murphy