How the Human Microbiome Project Works

The Project: Swabbing Hundreds of Humans

Now that we know what we're studying, let's jump into the Human Microbiome Project itself. Funded through the U.S. National Institutes of Health, the first phase was slated to run five years and included collecting samples from the human microbiome and developing a reference set of microbial genome sequences (the goal was 3,000 sequences). The second phase (2013-2015) involves developing a catalog of microbiome data sets to help the scientific community study disease and health. Researchers in this phase also have the broader, more challenging task of investigating the connection between a person's microbial community and various diseases and conditions.

Why wasn't this done earlier? The technology didn't exist. Scientists had no real way to study microbiome residents because many of them couldn't be grown or isolated in a laboratory setting. But with advances in DNA sequencing technology, scientists became able to isolate genetic material directly from microbial communities without needing to cultivate them [source: NIH]. And so the project began in 2007 with a budget of $170 million and a fairly straightforward task: take samples from a large enough number of healthy people to establish a baseline, or framework, of what a human microbiome entails.

The first phase started with recruiting volunteers who were considered "healthy." That wasn't easy: 600 subjects between the ages of 18 and 40 were brought in, but after rigorous examinations (for things like cavities and yeast infections as well as overall health), more than half were rejected. In the end, 242 subjects from Houston, Texas, and St. Louis, Mo., met the criteria. They were the lucky ones who got swabbed in multiple places at multiple times and then had their microbiomes sequenced by more than 200 scientists at 80 different institutions [source: Kolata].

Each man was sampled in 15 places and each woman in 18 (to account for the microbial environment of the vagina), up to three times over two years. The samples included areas of the gut (taken from the stool), the nose, and multiple places in the mouth and on the skin.

Over the course of the study, more than 11,000 samples were obtained, and scientists sequenced part of each sample's genetic material (the gene for 16S ribosomal RNA, a standard marker for telling bacteria apart) to identify the microbes and estimate the size of each population [source: Baylor College of Medicine]. Thus far, 800 of the samples have undergone whole-genome sequencing [source: Baylor College of Medicine]. The project has generated a whopping 3.5 trillion bytes of data, a thousand times more than the Human Genome Project produced [source: Baylor College of Medicine].
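As a rough sanity check on those numbers, the sampling plan can be turned into simple arithmetic. The sketch below uses only the figures quoted in this article (242 subjects, 15 sites per man, 18 per woman, up to three visits). The article doesn't give the split between men and women, so the even split here is purely a hypothetical assumption, and the function name is made up for illustration.

```python
# Figures quoted in the article.
SUBJECTS = 242
SITES_PER_MAN = 15
SITES_PER_WOMAN = 18
MAX_VISITS = 3


def max_samples(men: int, women: int) -> int:
    """Upper bound on samples if everyone were swabbed at every site on every visit."""
    return (men * SITES_PER_MAN + women * SITES_PER_WOMAN) * MAX_VISITS


# Two extremes plus a hypothetical even split (the real split is not given here).
all_men = max_samples(SUBJECTS, 0)
all_women = max_samples(0, SUBJECTS)
even_split = max_samples(SUBJECTS // 2, SUBJECTS - SUBJECTS // 2)

print(f"If all subjects were men:   {all_men:,} samples at most")
print(f"If all subjects were women: {all_women:,} samples at most")
print(f"With a 121/121 split:       {even_split:,} samples at most")
```

The possible maximum lands between 10,890 and 13,068 samples, so a reported total of "more than 11,000" suggests that most volunteers came back for most of their visits.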

NIH Common Fund financing supported the project through 2013, after which funding shifted to 16 participating NIH institutes [source: Mole]. Scientists are still establishing stricter sampling protocols and attempting to find a base of technical support and resources.