U.S. Department of Energy Office of Biological and Environmental Research

BER Research Highlights

Improving the Reliability of Metagenomic Sequencing Data
Published: June 07, 2012
Posted: August 21, 2012

Natural microbial communities usually comprise a large variety of species. Knowing a community's composition is important for addressing DOE energy and environmental missions. Sequencing the community's combined genome (the "metagenome") is now the best way to characterize these communities, but making sense of the data requires accurately accounting for all of the experimental and instrumental errors in the process. Until now, instrumental errors have been routinely estimated, but sample collection and preparation errors have not. As part of the DOE Systems Biology Knowledgebase project, researchers at Argonne National Laboratory have developed an open-source program called DRISEE (duplicate read inferred sequencing error estimation) to account for both types of errors. DRISEE identifies errors that could be due to sample collection, intermediary DNA processing techniques, or the instruments themselves. Using DRISEE, the authors reproduce known error rates from a set of standard data. They then apply the method to show that many factors, including read length and sample preparation, can contribute to sequencing errors. Although the method so far applies only to 454 and Illumina sequencing, it will provide valuable assistance to scientists trying to assemble genomes from metagenomic data by helping them determine whether a sequence difference is a true error that should be disregarded or a natural sequence variation that should be included.
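The idea of inferring error from duplicate reads can be illustrated with a minimal sketch. This is not the DRISEE implementation: it assumes, for illustration only, that reads sharing an identical prefix are duplicates of one original molecule, builds a per-position consensus for each group, and counts disagreements with that consensus as errors. The function name, `prefix_len`, and `min_bin_size` are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_error_rate(reads, prefix_len=8, min_bin_size=3):
    """Return mismatches per compared base, or None if nothing was compared.

    Illustrative assumption: reads with an identical prefix are treated as
    duplicates, so differences after the prefix are counted as errors.
    """
    # Bin reads by their leading prefix_len bases.
    bins = defaultdict(list)
    for read in reads:
        if len(read) > prefix_len:
            bins[read[:prefix_len]].append(read)

    mismatches = 0
    compared = 0
    for group in bins.values():
        # Small bins give an unreliable consensus, so skip them.
        if len(group) < min_bin_size:
            continue
        # Compare positions after the prefix, up to the shortest read.
        shortest = min(len(r) for r in group)
        for pos in range(prefix_len, shortest):
            column = [r[pos] for r in group]
            # The most common base at this position is the consensus.
            _, consensus_count = Counter(column).most_common(1)[0]
            mismatches += len(column) - consensus_count
            compared += len(column)

    return mismatches / compared if compared else None


# Made-up reads: three share a prefix, one differs at the last base.
example_reads = [
    "ACGTACGTAAAA",
    "ACGTACGTAAAA",
    "ACGTACGTAAAT",
    "TTTTTTTTCCCC",
]
print(estimate_error_rate(example_reads))
```

Because the estimate comes from disagreement among presumed duplicates rather than from instrument quality scores, it reflects errors from any step that happens after the duplicates arise, which is the distinction the highlight draws between instrumental and sample-preparation errors.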

Reference: Keegan, K. P., W. L. Trimble, J. Wilkening, A. Wilke, T. Harrison, M. D'Souza, and F. Meyer. 2012. "A Platform-Independent Method for Detecting Errors in Metagenomic Sequencing Data: DRISEE," PLoS Computational Biology 8(6), e1002541. DOI: 10.1371/journal.pcbi.1002541.

Contact: Susan Gregurick, SC-23.2, (301) 903-7672
Topic Areas:

  • Research Area: Genomic Analysis and Systems Biology
  • Research Area: Microbes and Communities
  • Research Area: Sustainable Biofuels and Bioproducts
  • Research Area: Computational Biology, Bioinformatics, Modeling
  • Cross-Cutting: Scientific Computing and SciDAC

Division: SC-23.2 Biological Systems Science Division, BER


BER supports basic research and scientific user facilities to advance DOE missions in energy and environment.
