
Earth Biogenome Project (EBP) Insect Genome Assembly

The Illinois Innovation Network and the Discovery Partners Institute funded a pilot project to assemble the genomes of agriculturally relevant insects in Illinois for which little or no genomic data are available. High-quality DNA was isolated using protocols optimized for small, difficult samples. The pilot project implemented a novel use of Tell-Seq linked-read libraries for two purposes: genome size estimation and linked-read scaffolding. Genomes were assembled using PacBio HiFi reads, Tell-Seq reads, and Dovetail Omni-C reads for chromosome-scale scaffolding.

Eight high-quality genomes were assembled from non-model organisms with contig N50 > 1 Mb and scaffold N50 > 5 Mb, including a genome for the order Neuroptera that will soon be only the second publicly available for that order. Species were confidently identified using the mitochondrial genomes assembled from the HiFi reads. Potential endosymbionts and pathogens were identified, as was novel prey information from predator species. Genomes were annotated using the BRAKER2 pipeline, generating a rich set of novel data to mine.
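For readers unfamiliar with the N50 thresholds quoted above, the following is a minimal sketch of how an N50 can be computed from a list of contig or scaffold lengths. The function name `n50` and the example lengths are illustrative assumptions, not values or code from this project.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths in bp (not project data).
contigs = [4_000_000, 2_500_000, 1_200_000, 800_000, 500_000]
print(n50(contigs))  # prints 2500000
```

Under this definition, a contig N50 > 1 Mb means that at least half of the assembled bases lie in contigs longer than 1 Mb.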

The Workflow








De novo genome assembly using HiFi reads

These are the steps:

  1. Generate raw assembly with hifiasm

  2. Purge duplicate contigs

  3. Scaffold using TellSeq reads

  4. Scaffold using Omni-C reads

  5. Fill gaps

  6. Mask repeats and low complexity regions

  7. Predict gene and protein function

  8. Identify and annotate mitochondrial DNA

  9. Assess genome completeness with Merqury

  10. Identify contaminants and artifacts in genome

De novo genome assembly using CLR reads

  1. Generate raw assembly with Redbean

  2. Base-correct assembly with Arrow

  3. Purge duplicate contigs

  4. Pilon polishing

  5. Scaffold using TellSeq reads

  6. Scaffold using Omni-C reads

  7. Fill gaps

  8. Mask repeats and low complexity regions

  9. Predict gene and protein function

  10. FreeBayes polishing

  11. Identify and annotate mitochondrial DNA

  12. Assess genome completeness with Merqury

  13. Identify contaminants and artifacts in genome
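The two step lists above share most of their stages. As a rough illustration, the sketch below encodes both lists as plain Python data and reports which steps appear in only one workflow. The variable and function names (`HIFI_STEPS`, `CLR_STEPS`, `extra_steps`) are hypothetical, and the step labels are copied from the lists rather than taken from any pipeline configuration.

```python
# Step labels transcribed from the two workflow lists above.
HIFI_STEPS = [
    "Generate raw assembly with hifiasm",
    "Purge duplicate contigs",
    "Scaffold using TellSeq reads",
    "Scaffold using Omni-C reads",
    "Fill gaps",
    "Mask repeats and low complexity regions",
    "Predict gene and protein function",
    "Identify and annotate mitochondrial DNA",
    "Assess genome completeness with Merqury",
    "Identify contaminants and artifacts in genome",
]

CLR_STEPS = [
    "Generate raw assembly with Redbean",
    "Base-correct assembly with Arrow",
    "Purge duplicate contigs",
    "Pilon polishing",
    "Scaffold using TellSeq reads",
    "Scaffold using Omni-C reads",
    "Fill gaps",
    "Mask repeats and low complexity regions",
    "Predict gene and protein function",
    "FreeBayes polishing",
    "Identify and annotate mitochondrial DNA",
    "Assess genome completeness with Merqury",
    "Identify contaminants and artifacts in genome",
]


def extra_steps(workflow, baseline):
    """Return the steps of `workflow` that are absent from `baseline`, in order."""
    baseline_set = set(baseline)
    return [step for step in workflow if step not in baseline_set]


clr_only = extra_steps(CLR_STEPS, HIFI_STEPS)
hifi_only = extra_steps(HIFI_STEPS, CLR_STEPS)

print("CLR-only steps:", clr_only)
print("HiFi-only steps:", hifi_only)
```

Read this way, the CLR workflow differs from the HiFi workflow in its assembler (Redbean instead of hifiasm) and in three additional correction and polishing steps (Arrow, Pilon, FreeBayes).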

Our sponsors:

To learn more about the Earth Biogenome Project, please use this link: https://www.earthbiogenome.org/