Metagenome Assembled Genome of a Putative Phage from Young Cheddar Curds
Disciplines
Bacteriology | Bioinformatics | Biology | Biotechnology | Cell and Developmental Biology | Food Microbiology | Genetics | Genomics | Molecular Genetics | Population Biology | Virology
Abstract (300 words maximum)
Next-generation sequencing (NGS) has opened new opportunities for characterizing microbiomes, especially those of fermented foods. The goal of this study was to identify new phages in young Cheddar curds, which serve as a base for surface-ripened cheeses, in order to better understand the associated microbiome. Data from BioProject PRJEB15423 were deposited in the NCBI Sequence Read Archive (SRA) by Teagasc Food Research Centre in Ireland after sequencing metagenomes of young Cheddar curds and surface-ripened cheeses with Illumina NextSeq 500. One run from that experiment, ERR2212267, was selected for our study to identify phages that were unreported in the original publication. ERR2212267 contains sequence data from day 0 of the unsmeared Cheddar cheese curd, which served as a control in the original study. First, metaviralSPAdes carried out de novo assembly of reads into contigs and scaffolds, which potentially contain metagenome-assembled genomes (MAGs) of phages. Contigs and scaffolds were then screened with geNomad for mobile genetic elements, including plasmids and viruses. Taxonomic assignment of viral genomes was also carried out by geNomad. One scaffold generated by metaviralSPAdes is 22,322 nucleotides in length with a G + C content of 36%. metaviralSPAdes reported a coverage of 56.58 for this scaffold. geNomad predicted this scaffold to be a member of the class Caudoviricetes. A BLASTN similarity search against nt_viruses identified Lactococcus phage BIM BV-114 as the most closely related entry, with 93.84% identity; this phage was originally isolated from cheese brine and is also a member of Caudoviricetes. Among the 45 open reading frames (ORFs) predicted by PHANOTATE, as implemented in the pharokka pipeline, the most noteworthy functional categories are nucleotide metabolism with five ORFs, lysis with three ORFs, and phage head/packaging with five ORFs. A large terminase subunit of 526 amino acids was also predicted by pharokka. CheckV reported that the assembly is 100% complete, with direct terminal repeats (DTR) present. CheckV also reported that the minimum information about an uncultivated virus genome (MIUVIG) quality is “High-Quality.” This MAG is therefore considered a complete genome. In conclusion, this study potentially discovered a novel phage from young Cheddar curds that infects the genus Lactococcus.
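Two of the scaffold-level quantities mentioned above, G + C content and the presence of direct terminal repeats, can be illustrated with a minimal Python sketch. This is not the method used by metaviralSPAdes or CheckV, whose internals are not described here; the function names, the toy sequence, and the 20-nucleotide minimum repeat length are assumptions chosen for illustration only.

```python
def gc_content(seq: str) -> float:
    """Return the G + C content of a nucleotide sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        return 0.0
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def direct_terminal_repeat(seq: str, min_len: int = 20) -> int:
    """Return the length of the longest exact repeat shared by both ends.

    A direct terminal repeat is modelled here as a prefix that is identical
    to the suffix of the same length. Returns 0 if no repeat of at least
    min_len nucleotides is found. min_len=20 is an illustrative assumption.
    """
    seq = seq.upper()
    max_len = len(seq) // 2  # the two repeat copies must not overlap
    for k in range(max_len, min_len - 1, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


if __name__ == "__main__":
    # Hypothetical toy sequence: a 10-nt repeat flanking a 10-nt core.
    toy = "ATGCATGCAT" + "TTTTTAAAAA" + "ATGCATGCAT"
    print(f"G + C content: {gc_content(toy):.1f}%")
    print(f"DTR length: {direct_terminal_repeat(toy, min_len=5)}")
```

On a real scaffold, the sequence would be read from the assembler's FASTA output instead of the toy string. An exact-match scan like this is only a rough stand-in, since dedicated tools may tolerate mismatches or apply other criteria.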
Academic department under which the project should be listed
CSM - Molecular and Cellular Biology
Primary Investigator (PI) Name
Tsai-Tien Tseng