U.S. Department of Energy Office of Biological and Environmental Research

BER Research Highlights


Faster, Bigger, Stronger: Genome Database Improvements
Published: October 27, 2013
Posted: February 12, 2014

The Department of Energy’s Joint Genome Institute (DOE JGI) maintains the Integrated Microbial Genomes (IMG) data warehouse, which contains a rich collection of genomes from all three domains of life. IMG/M provides a similar collection of partially assembled genome reads from microbial communities (metagenomes). IMG was introduced in 2005, and both systems have grown and improved since the last published report in 2012. The recent upgrades address the increase in genome sequences and give users more options. They are described in a pair of reports in the Jan. 1, 2014, issue of Nucleic Acids Research.

The late 2013 version of IMG contains more than 16,000 genome datasets with more than 42 million protein-coding genes. Most (nearly 12,000) are bacterial, archaeal, and eukaryotic genomes. The number of genomes is more than three times what it was two years earlier. IMG also includes thousands of viral genomes, plasmids that did not come from a specific microbial genome sequencing project, and hundreds of genome fragments. Also in late 2013, IMG/M contained 3,328 metagenome datasets from 460 metagenome studies, with more than 19.5 billion protein-coding genes.

Both systems have enhanced analysis tools for publicly available datasets. The latest version of IMG includes tools for recording and analyzing single-cell genomes, RNA sequencing data, and gene clusters coding for synthesis of complex organic molecules (biosynthetic clusters).

Both systems are continually being improved to keep up with recent advances in genomics. Planned additions include pangenomic data and analysis tools in IMG and metaproteomics datasets (protein samples collected from environmental sources) in IMG/M. A pangenome comprises the core genes common to all individuals in a species as well as the variant genes that enable some individuals to adapt to different environments.

References: Markowitz, V. M., et al. 2013. “IMG 4 Version of the Integrated Microbial Genomes Comparative Analysis System,” Nucleic Acids Research 42(D1), D560–67. DOI:10.1093/nar/gkt963.

Markowitz, V. M., et al. 2013. “IMG/M 4 Version of the Integrated Metagenome Comparative Analysis System,” Nucleic Acids Research 42(D1), D568–73. DOI:10.1093/nar/gkt919.

Related Links:
IMG website: https://img.jgi.doe.gov/
IMG/M website: https://img.jgi.doe.gov/cgi-bin/m/main.cgi

Contact: Dan Drell, SC-23.2, (301) 903-4742
Topic Areas:

  • Research Area: Genomic Analysis and Systems Biology
  • Research Area: Microbes and Communities
  • Research Area: DOE Joint Genome Institute (JGI)
  • Research Area: Computational Biology, Bioinformatics, Modeling

Division: SC-23.2 Biological Systems Science Division, BER

 

