
U.S. Department of Energy Office of Biological and Environmental Research

BER Research Highlights

Improving the Reliability of Metagenomic Sequencing Data
Published: June 07, 2012
Posted: August 21, 2012

Natural microbial communities usually comprise a large variety of species, and knowing a community's composition is important for addressing DOE energy and environmental missions. Sequencing the community's combined genome (the "metagenome") is now the best way to characterize these communities, but making sense of the data requires accurately accounting for all of the experimental and instrumental errors in the process. Until now, instrumental errors have been routinely estimated, but sample collection and preparation errors have not. As part of the DOE Systems Biology Knowledgebase project, researchers at Argonne National Laboratory have developed an open-source program called DRISEE (duplicate read inferred sequencing error estimation) that accounts for both types of errors. DRISEE identifies errors that could be due to sample collection, to intermediary DNA processing techniques, or to the instruments themselves. Using DRISEE, the authors reproduce known error rates from a set of standard data. They then apply the method to show that many factors, including read length and sample preparation, can contribute to sequencing errors. Although the method has so far been applied only to 454 and Illumina sequencing, it will provide valuable assistance to scientists trying to assemble genomes from metagenomic data by helping them determine whether a discrepancy in the sequence data is a true error that should be disregarded or a natural sequence variation that should be included.
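The general idea behind a duplicate-read error estimate can be illustrated with a minimal Python sketch. This is not the DRISEE implementation. It assumes that reads sharing an identical prefix are duplicates of one original template, so that disagreements among them after the prefix reflect error rather than biological variation. The function name `estimate_error_rate` and the parameters `prefix_len` and `min_bin_size` are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def estimate_error_rate(reads, prefix_len=8, min_bin_size=3):
    """Toy duplicate-read error estimate (illustrative; not DRISEE itself)."""
    # Group reads that share an identical prefix. Each group is treated as
    # candidate duplicates of a single template (an assumption of this sketch).
    bins = defaultdict(list)
    for read in reads:
        if len(read) > prefix_len:
            bins[read[:prefix_len]].append(read)

    mismatches = 0
    compared = 0
    for group in bins.values():
        if len(group) < min_bin_size:
            continue  # too few copies to form a trustworthy consensus
        # Compare only positions that every read in the group covers.
        shared_len = min(len(r) for r in group)
        for pos in range(prefix_len, shared_len):
            column = [r[pos] for r in group]
            # The most common base at this position serves as the consensus.
            _, consensus_count = Counter(column).most_common(1)[0]
            mismatches += len(column) - consensus_count
            compared += len(column)

    # Fraction of post-prefix bases that disagree with their group consensus;
    # None when no group was large enough to evaluate.
    return mismatches / compared if compared else None


if __name__ == "__main__":
    example_reads = [
        "ACGTACGTAAAA",
        "ACGTACGTAAAA",
        "ACGTACGTAATA",  # one base differs from the other two copies
        "TTTTGGGGCCCC",  # singleton: ignored, no duplicates to compare
    ]
    print(estimate_error_rate(example_reads))
```

In this toy input, one of the twelve post-prefix bases in the three-read group disagrees with the consensus, so the estimate is 1/12. A real tool would also need to handle quality scores, alignment of reads with insertions or deletions, and far larger inputs.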

Reference: Keegan, K. P., W. L. Trimble, J. Wilkening, A. Wilke, T. Harrison, M. D'Souza, and F. Meyer. 2012. "A Platform-Independent Method for Detecting Errors in Metagenomic Sequencing Data: DRISEE," PLoS Computational Biology 8(6), e1002541. DOI: 10.1371/journal.pcbi.1002541.

Contact: Susan Gregurick, SC-23.2, (301) 903-7672
Topic Areas:

  • Research Area: Genomic Analysis and Systems Biology
  • Research Area: Microbes and Communities
  • Research Area: Sustainable Biofuels and Bioproducts
  • Research Area: Computational Biology, Bioinformatics, Modeling
  • Cross-Cutting: Scientific Computing and SciDAC

Division: SC-33.2 Biological Systems Science Division, BER


