Toward Complete Bacterial Genome Sequencing Through the Combined Use of Multiple Next-Generation Sequencing Platforms
Journal of Microbiology and Biotechnology. 2016. Jan, 26(1): 207-212
Copyright © 2016, The Korean Society For Microbiology And Biotechnology
  • Received: July 15, 2015
  • Accepted: October 14, 2015
  • Published: January 28, 2016
About the Authors
Haeyoung Jeong
Biosystems and Bioengineering Program, Korea University of Science and Technology (UST), Daejeon 34113, Republic of Korea
Dae-Hee Lee
Biosystems and Bioengineering Program, Korea University of Science and Technology (UST), Daejeon 34113, Republic of Korea
Choong-Min Ryu
Biosystems and Bioengineering Program, Korea University of Science and Technology (UST), Daejeon 34113, Republic of Korea
Seung-Hwan Park
Biosystems and Bioengineering Program, Korea University of Science and Technology (UST), Daejeon 34113, Republic of Korea

PacBio’s long-read sequencing technologies can be used to assemble complete bacterial genomes with recently developed non-hybrid assemblers, without second-generation, high-quality short reads. However, standardized procedures that take into account multiple pre-existing second-generation sequencing platforms are scarce. Building on Illumina HiSeq and Ion Torrent PGM genome sequencing results from previous studies, we generated additional sequencing data, including data from the PacBio RS II platform, and applied various bioinformatics tools to obtain complete genome assemblies for five bacterial strains. The hierarchical genome assembly process (HGAP) non-hybrid assembler produced nearly complete assemblies at a moderate coverage of ~75x, but different versions of HGAP produced incompatible results that required post-processing. Data from the other two platforms further improved the PacBio assembly through scaffolding and final error correction.
(A) LASTZ alignment [10] between two versions of Shigella boydii HGAP assemblies. The upper panel shows a dot plot and the lower panel shows alignment blocks. The major contig from the old version of HGAP is shown on the horizontal axis. The plots were generated using Geneious Pro R8. (B) MUMmer whole-genome alignments [15] of two versions of Paenibacillus sp. HS311 HGAP assemblies (left, old version; right, new version) with the complete genome sequence of P. polymyxa CR1 (upper panel), and cumulative GC skew plots calculated as (G-C)/(G+C) with a window size of 5 kb (lower panel). (C) Ion Torrent PGM mate-pair reads were mapped onto Pseudomonas syringae pv. syringae HGAP contigs and visualized using Consed [8]; the results indicate that the four contigs are arranged in a single scaffold. The light green plot shows the read depth. Multiple copies of ribosomal RNA genes, marked by the thick arrows at the bottom, caused mate reads to align at a longer span (○). RNA genes at the ends of adjacent contigs, shown as filled arrows of the same color, were used to join them, resulting in two contigs.
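The cumulative GC skew mentioned in panel B can be illustrated with a minimal Python sketch. It computes (G-C)/(G+C) per window and accumulates the values along the sequence. The function names are hypothetical, and the use of consecutive non-overlapping windows and a skew of 0.0 for windows without G or C are assumptions; the caption specifies only the formula and the 5 kb window size, so this is not necessarily how the authors' plots were produced.

```python
from typing import List


def gc_skew_windows(seq: str, window: int = 5000) -> List[float]:
    """Return (G-C)/(G+C) for consecutive, non-overlapping windows.

    Non-overlapping windows are an assumption; the source gives only
    the formula and a window size of 5 kb.
    """
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # Skew is undefined when a window has no G or C; use 0.0 (assumption).
        skews.append((g - c) / (g + c) if (g + c) else 0.0)
    return skews


def cumulative_gc_skew(seq: str, window: int = 5000) -> List[float]:
    """Return the running sum of the per-window GC skew values."""
    total = 0.0
    cumulative = []
    for skew in gc_skew_windows(seq, window):
        total += skew
        cumulative.append(total)
    return cumulative


if __name__ == "__main__":
    # Toy sequence: a G-rich half followed by a C-rich half.
    toy = "GGGC" * 2 + "CCCG" * 2
    print(cumulative_gc_skew(toy, window=4))  # [0.5, 1.0, 0.5, 0.0]
```

In a typical circular bacterial chromosome, the cumulative curve tends to reach its minimum near the replication origin and its maximum near the terminus, so a correctly assembled contig is expected to show a single smooth peak and valley, whereas abrupt extra turns can hint at a misassembly.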
This work was supported by the KRIBB Research Initiative Program, Ministry of Science, ICT, and Future Planning, and by the Next-Generation BioGreen 21 Program (SSAC Grant No. PJ009524) funded by the RDA (to C.M.R.), Republic of Korea.
[1] Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 19: 455-477. DOI: 10.1089/cmb.2012.0021
[2] Barthelson R, McFarlin AJ, Rounsley SD, Young S. 2011. Plantagora: modeling whole genome sequencing and assembly of plant genomes. PLoS One 6: e28436. DOI: 10.1371/journal.pone.0028436
[3] Boetzer M, Pirovano W. 2012. Toward almost closed genomes with GapFiller. Genome Biol. 13: R56. DOI: 10.1186/gb-2012-13-6-r56
[4] Charneski CA, Honti F, Bryant JM, Hurst LD, Feil EJ. 2011. Atypical AT skew in Firmicute genomes results from selection and not from mutation. PLoS Genet. 7: e1002283. DOI: 10.1371/journal.pgen.1002283
[5] Chin CS, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C. 2013. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat. Methods 10: 563-569. DOI: 10.1038/nmeth.2474
[6] Coil D, Jospin G, Darling AE. 2015. A5-miseq: an updated pipeline to assemble microbial genomes from Illumina MiSeq data. Bioinformatics 31: 587-589. DOI: 10.1093/bioinformatics/btu661
[7] English AC, Richards S, Han Y, Wang M, Vee V, Qu J. 2012. Mind the gap: upgrading genomes with Pacific Biosciences RS long-read sequencing technology. PLoS One 7: e47768. DOI: 10.1371/journal.pone.0047768
[8] Gordon D, Green P. 2013. Consed: a graphical editor for next-generation sequencing. Bioinformatics 29: 2936-2937. DOI: 10.1093/bioinformatics/btt515
[9] Gurevich A, Saveliev V, Vyahhi N, Tesler G. 2013. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29: 1072-1075. DOI: 10.1093/bioinformatics/btt086
[10] Harris RS. 2007. Improved pairwise alignment of genomic DNA. PhD thesis. Pennsylvania State University.
[11] Jeong H, Kloepper JW, Ryu C-M. 2015. Genome sequences of Pseudomonas amygdali pv. tabaci strain ATCC 11528 and pv. lachrymans strain 98A-744. Genome Announc. 3: e00683-15.
[12] Kamada M, Hase S, Sato K, Toyoda A, Fujiyama A, Sakakibara Y. 2014. Whole genome complete resequencing of Bacillus subtilis natto by combining long reads with high-quality short reads. PLoS One 9: e109999. DOI: 10.1371/journal.pone.0109999
[13] Koren S, Phillippy AM. 2015. One chromosome, one contig: complete microbial genomes from long-read sequencing and assembly. Curr. Opin. Microbiol. 23: 110-120. DOI: 10.1016/j.mib.2014.11.014
[14] Koren S, Schatz MC, Walenz BP, Martin J, Howard JT, Ganapathy G. 2012. Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nat. Biotechnol. 30: 693-700. DOI: 10.1038/nbt.2280
[15] Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, Antonescu C, Salzberg SL. 2004. Versatile and open software for comparing large genomes. Genome Biol. 5: R12. DOI: 10.1186/gb-2004-5-2-r12
[16] Liao YC, Lin SH, Lin HH. 2015. Completing bacterial genome assemblies: strategy and performance comparisons. Sci. Rep. 5: 8747. DOI: 10.1038/srep08747
[17] Lobry JR, Louarn JM. 2003. Polarisation of prokaryotic chromosomes. Curr. Opin. Microbiol. 6: 101-108. DOI: 10.1016/S1369-5274(03)00024-9
[18] Park N, Shirley L, Gu Y, Keane TM, Swerdlow H, Quail MA. 2013. An improved approach to mate-paired library preparation for Illumina sequencing. Methods Next Gener. Seq. 1: 10-20.
[19] Park S-H, Choi S-K, Park S-Y, Jeon JH, Kim HR, Jeong J, Kim YT. 2015. Novel Paenibacillus sp. and the method for yield increase of potato using the same. Republic of Korea patent application 10-1498155.
[20] Park YS, Jeong H, Sim YM, Yi HS, Ryu CM. 2014. Genome sequence and comparative genome analysis of Pseudomonas syringae pv. syringae type strain ATCC 19310. J. Microbiol. Biotechnol. 24: 563-567. DOI: 10.4014/jmb.1312.12082
[21] Ribeiro FJ, Przybylski D, Yin S, Sharpe T, Gnerre S, Abouelleil A. 2012. Finished bacterial genomes from shotgun sequence data. Genome Res. 22: 2270-2277. DOI: 10.1101/gr.141515.112
[22] Zerbino DR, Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18: 821-829. DOI: 10.1101/gr.074492.107