
Visualizing Microbiome Data: Choropleth Style

A Pipette and an Open Mind

Recently I’ve needed to visualize spatial changes in my microbiome data that are easily interpretable by other people. The best solution I’ve come across is simply projecting the data onto a drawing of my organism. Like this:

Zostera marina high resolution
Alpha diversity of samples from pieces of a seagrass plant projected onto a drawing of that plant.

I’ve used SitePainter to produce these in the past, and in some ways it’s great. I got the figure I wanted and I’ve received a lot of positive feedback about it. The only problem is it has a steep learning curve and is the most dysfunctional GUI I’ve ever come across (generating the figure above took several days of my time, the first time). So when it became necessary to produce over forty more images like it I decided to search for a better way.

My first instinct was that there should be an easy way…


ASM Highlights

The Seagrass Team (Hannah, Jenna & I) hit up ASM this year, which was in New Orleans. In case you missed ASM, Jonathan made the effort to compile all the tweets together (#ASM2015), and the results can be found here.

However, in case you don’t want to wade through thousands of tweets, I’ve included some brief highlights:

NCBI’s Targeted Loci Blast (MOLE-BLAST)

  • Using MOLE-BLAST you can blast specific 16S or ITS sequences against NCBI curated databases for those marker genes. This seems like it could be really useful if you want to identify bacterial or fungal taxonomy using a marker gene approach. Also, MOLE-BLAST appears to use a tree-based approach to help you find the nearest neighbor for taxonomy assignment.

Carl Zimmer’s Talk on Microbiomes and the Hyperbolome

The Session that kept Redefining the Tree of Life (aka Unearthing the Dark Matter of Microbial Metabolism and Diversity)

  • Unfortunately, I missed this session, but apparently both Brett Baker (@archaeal) and Laura Hug (@LAHug_) shook things up
  • Brett Baker brought Thorarchaeota to the party, a monophyletic group that looks like it branches between Lokiarchaeota and Eukaryotes
  • Then Laura Hug showed up with Woesearchaeota and Pacearchaeota; not sure where they fit in, but cool

Contributions to “Extreme” Microbiology by Female Scientists Session

  • The whole session was awesome (and extreme), but Emmie de Wit gave a heartfelt (and tear producing) talk on being a first responder to the Ebola outbreak in Liberia
  • Also, I now really want to go to the Bonneville Salt Flats that Betsy Kleba talked about

Honorable mentions: John Zehr (talk on open ocean nitrogen-fixing symbionts), David Baltrus (talk on fungal endophytes with bacterial symbionts in their hyphae), Michael Wagner (talk on syntrophy between nitrite and ammonia oxidizers and alternative substrates for nitrification), Tom Marshburn (a real-life astronaut!), Tom Sharpton (A Shotgun Metagenome Annotation Pipeline (ShotMAP))

Hannah and I presented posters at the meeting; her poster (left, #1237) and my poster (right, #1236) can be seen below:

20150601_162116 20150601_162124

Of course, Team Seagrass wasn’t the only group from the Eisen lab presenting at ASM. David Coil (@davidacoil), Srijak Bhatnagar (@srijakbhatnagar) and Megan Krusor (@MKrusor) also presented posters this year! See tweets with photos below.

Some might say that the real highlight of ASM was New Orleans itself; it was my first time there and I really enjoyed being immersed in the culture (the music!), city and food (especially the food). Below is a picture of me touching the Mississippi River (my first time!). Unfortunately, the water was too murky to search for any seagrass or seagrass relatives.

11391124_10205818157466724_2465408434093120848_n

Potomac Samples: not what we expected, but still some interesting connections

Last summer I went on a sampling expedition to the Chesapeake Bay for some SAV (submerged aquatic vegetation) collection. I came back with leaf and root samples from the Potomac River from a few different SAV species. Ideally, we thought the microbial community would correlate with the salinity gradient across the sites or the host species.

Neither of those patterns is discernible in this data set as far as the beta diversity plots are concerned, but I found some other interesting things while sifting through these plots. For instance, the samples cluster based on the site location (P1-P4). The communities at P1 and P3 look really similar and P4 is within the tail end of their cluster, while P2 is totally different.

location

For reference, here’s a map of our sites:

Screen Shot 2015-06-05 at 11.21.11 AM
Whitehead, Andrew et al. “Genomic Mechanisms of Evolved Physiological Plasticity in Killifish Distributed along an Environmental Salinity Gradient.” Proceedings of the National Academy of Sciences of the United States of America 108.15 (2011): 6193–6198. PMC. Web. 5 June 2015.

Although all the sites were visited, I only found SAV species at P1-4. There are some useful patterns in the water and soil chemistry data (courtesy of Greg Mayer from Texas Tech University) that show the same correlation pattern as the site locations (as expected since each location has its own distinct chemistry data). Some of the chemistry data shows different patterns from site location, so I’ll have to sift through those next and see what looks relevant. I also ran a core microbiome script for each site, but haven’t looked at the output yet.

In addition, the leaf and root samples are pretty distinct:

sampletype

The alpha diversity graphs are a whole ‘nother beast that I’m going to explore some other time. That’s all for now, but I feel that there are some interesting lines of investigation to pursue and more scripts to run.

A first look at the first ZEN run

PNAs (blockers for mitochondrial and chloroplast amplification)

  • More organelles were filtered from the leaves than the roots, sediment, or water.

    Sample type   Average % loss
    Leaf          26.5518607264
    Root          10.6381529905
    Sediment       5.1730348426
    Water         13.0591363787

  • using PNAs did not result in an overall loss of reads

    Treatment   # filtered reads
    PNA         389011
    no PNA      411059

  • in progress: do we see more chimeras with PNA use?
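
A per-sample-type average like the one above could be computed along these lines. This is only a minimal sketch, not the script used for this run: the `samples` records and their read counts are made up for illustration, and "% loss" is assumed to mean the share of a sample's reads removed as mitochondrial or chloroplast sequence.

```python
from collections import defaultdict

# Hypothetical records: (sample type, reads before organelle filtering, reads after).
samples = [
    ("Leaf", 1000, 700),
    ("Leaf", 2000, 1600),
    ("Root", 1500, 1350),
    ("Water", 800, 720),
]

def percent_loss(before, after):
    """Percentage of reads removed by filtering."""
    return 100.0 * (before - after) / before

def average_loss_by_type(records):
    """Mean per-sample percent loss, grouped by sample type."""
    losses = defaultdict(list)
    for sample_type, before, after in records:
        losses[sample_type].append(percent_loss(before, after))
    return {t: sum(v) / len(v) for t, v in losses.items()}

for sample_type, loss in sorted(average_loss_by_type(samples).items()):
    print(f"{sample_type}: {loss:.1f}%")
```

Averaging per-sample percentages (rather than pooling all reads first) keeps one very deep sample from dominating its group, though either choice could be reasonable here.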

Run summary statistics

  • 533 samples, including kit controls and PNA tests
  • reads per sample (after filtration) ranged from 5 to 206,661
  • the kit control with the most reads had 6549
  • 306 (57.4%) of the samples had fewer than 6549 reads; we’ll call these low-abundance samples
  • breakdown of low-abundance samples by type

    Leaf          Water         Roots       Sediment
    126 (41.1%)   109 (35.6%)   32 (8.8%)   33 (10.8%)

  • in progress: not sure, I published this post by accident
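
The low-abundance cutoff above is simply the read count of the deepest kit control. A minimal sketch of that rule follows; the sample names, control names, and most read counts are made up, and only the 6549 threshold and the 5 and 206,661 extremes come from the run.

```python
# Hypothetical read counts per sample after filtering; names are illustrative.
read_counts = {
    "leaf_01": 5,
    "water_01": 6548,
    "root_01": 6549,
    "sediment_01": 206661,
}
# Hypothetical kit controls; the largest one sets the cutoff (6549 in this run).
kit_controls = {"kit_A": 1200, "kit_B": 6549}

def low_abundance(samples, controls):
    """Return sample names with fewer reads than the deepest kit control."""
    cutoff = max(controls.values())
    return sorted(name for name, n in samples.items() if n < cutoff)

flagged = low_abundance(read_counts, kit_controls)
print(flagged)
print(f"{100 * len(flagged) / len(read_counts):.1f}% low-abundance")
```

Note the strict comparison: a sample with exactly as many reads as the control is not flagged, matching "fewer than 6549 reads".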

Early foray into CARD-FISH imaging of seagrass leaf microbes

In addition to sequencing samples to see which bacteria types are present, we are also interested in imaging bacterial populations to gain understanding of how different microbe communities are spatially distributed. For example, one set of bacteria may live on the root tip, while another set may prefer to colonize the rhizome surface. Leaf-associated microbe groups tend to be different from root-associated microbes, although overlap has been reported in other plant species. Additionally, one type of microbe may be dependent on compounds released from another type, in which case you might expect to see distinct types of microbes that are clumped closely together. Alternately, competitive or antagonistic microbe groups would probably be located far apart from each other. The spatial scale of “close” or “far” and “clumpiness” is more easily identified with imaging. At the very least, imaging distinct microbe populations is a first step in identifying dependencies and preferences of microbe types within microbiome communities.

The imaging method I used is called “catalyzed reporter deposition fluorescence in situ hybridization,” also known as CARD-FISH. The main idea is that you attach a fluorophore to a nucleic acid sequence of your choosing. The sequence should be unique to your microbe population of interest.

Here is a very rough outline of the protocol steps for CARD-FISH:
1. probe design: make an oligonucleotide (aka “probe”) that will bind your microbe’s sequence, and make sure the oligo has HRP conjugation on one end, because that is what enables CARD.
2. hybridize: fix and permeabilize your sample so your probe can get into the microbes and access their nucleic acids.
3. amplification: incubate your hybridized samples with a fluorophore that recognizes your probe’s HRP binding site.

There are a lot of parameters to deal with, but in theory the number of distinct microbe populations that can be imaged with CARD-FISH is limited by the number of different fluorophores and the number of unique genes at your disposal. In practice, the number of distinct microbe populations you can image is more likely to be limited by the tolerance of your sample to the CARD-FISH protocol, which varies based on your microbial targets and can take days if you’re trying more than one probe.

Anyway, in my case, as a first pass, I just tried one probe, eub338, which targets the 16S rRNA gene that all eubacteria should have. Probe details linked here.

I used DAPI as a control stain, to verify that my probe and fluorophore were truly binding nucleic acids and not getting stuck on other things. DAPI reliably binds nucleic acids, so I would expect any valid CARD-FISH signal to colocalize with DAPI. It’s not the most rigorous control stain, but it’s a start.

First, here is just the DAPI image of a seagrass leaf. The large square-ish cells are plant cells.
JPG_seagrass_leaf1_40x_dapi

Here is the same DAPI image with the eub338 signal overlaid in green.
JPG_seagrass_leaf1_40x_dapi_eub

As you can see, not all DAPI signal colocalizes with eub338, which is fine; those are probably nucleic acids that did not contain a bacterial 16S rRNA sequence. They could be archaea, fungi, eubacteria that my probe could not get into, or something else. Happily, the eub338 signal that does colocalize with DAPI is our bacterial signal. This means the CARD-FISH protocol accessed at least some of the eubacteria, and it means that next we can follow up with more targeted probes for subpopulations of eubacteria that looked interesting based on the sequencing results.
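
One way to put a number on "colocalizes" is to threshold both channels and ask what fraction of eub338-positive pixels are also DAPI-positive. The sketch below is only an illustration on tiny made-up intensity grids; the thresholds, the `colocalized_fraction` helper, and the pixel values are all assumptions, not part of the imaging workflow described here.

```python
def colocalized_fraction(probe, dapi, probe_threshold, dapi_threshold):
    """Fraction of probe-positive pixels that are also DAPI-positive.

    probe and dapi are equally sized 2D lists of pixel intensities.
    Returns None when no pixel passes the probe threshold.
    """
    probe_positive = 0
    both_positive = 0
    for probe_row, dapi_row in zip(probe, dapi):
        for p, d in zip(probe_row, dapi_row):
            if p >= probe_threshold:
                probe_positive += 1
                if d >= dapi_threshold:
                    both_positive += 1
    if probe_positive == 0:
        return None
    return both_positive / probe_positive

# Made-up 3x3 intensity grids standing in for the two image channels.
eub338_channel = [
    [0, 200, 0],
    [0, 180, 0],
    [150, 0, 0],
]
dapi_channel = [
    [220, 210, 0],
    [0, 190, 0],
    [0, 0, 230],
]

fraction = colocalized_fraction(eub338_channel, dapi_channel, 100, 100)
print(f"{fraction:.2f} of eub338-positive pixels overlap DAPI")
```

A low fraction would suggest the probe or fluorophore is sticking to something other than nucleic acids, which is the failure the DAPI control is meant to catch.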

Introducing Biogeography 2

It’s bigger, it’s better, it’s Biogeography 2!

About a year ago I started an Intra-plant biogeography project. Limited in scope, this project’s primary aim was to determine how much variation there was in the microbial communities across a single plant in “high resolution.” The goal was to determine whether it mattered where our ZEN collaborators cut their samples from along the roots and leaves.

The general project was this: Cut a plant into about 50 strategically chosen pieces and look at the community variation across the surface.

We got some really interesting results which I presented in a poster at the 2014 Lake Arrowhead Microbial Genomics Conference.

One thing that always bothered me about these results was that they were for only one plant. I didn’t know if the cool patterns I was seeing were normal or a fluke. That’s where Biogeography 2 comes in: it’s a continuation of the first project but with more replicates (five, to be precise), all collected at the same time and from the same place. In the coming weeks I’ll be processing these samples and updating you about the progress.

This week’s update:

This week I was finally able to mutilate, er, dissect the plants, and now we can begin extracting DNA from the samples. Here are some pictures of the plants prior to dissection.

DSC_0082

For plants that withstand daily tidal forces, seagrasses are surprisingly delicate when taken out of water. When they dry out, they crumble, so I try to section them as fast as possible to prevent drying.

DSC_0073

Sample preparation includes painstakingly disentangling these roots from each other and from the shoots without breaking them (about a 2-hour process per plant).

more IPython notebook troubleshooting

No word from the QIIME forum about my problem. So, I asked twitter for help.

So far, everyone thinks it has something to do with my path, BUT 1) macqiime sets the PYTHONPATH variable, and 2) the package it’s looking for exists in both macqiime python and anaconda.

OK, so it’s fixed now. There may be another way to deal with it, but here is what I did: I installed IPython notebook into the macqiime python folder (using get-pip.py to install pip, then pip install ipython[notebook]), and then commented out the line in my .bash_profile that points to the anaconda version of ipython.

Marisano James actually did all of the work for me, so I asked him to summarize:

“When anaconda was installed, it added a path to its own ipython in the .bash_profile. Then, no matter what python was running, it would wind up using the anaconda version of ipython, which didn’t have the same settings as the system Python. I wound up renaming the anaconda folder (so it could no longer be found), and then commenting out the added line in ~/.bash_profile. Just commenting out the line in the ~/.bash_profile is sufficient, but I didn’t know anaconda’s ipython was being called until I effectively removed its folder. If you run into this problem, be sure to open a new terminal after commenting out the offending anaconda ipython line so it will be able to use the updated PATH.”
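
A quick way to see which interpreter and which ipython are actually being picked up is sketched below. It assumes Python 3.3 or later, since `shutil.which` is not available in the Python 2.7 that macqiime ships; running `which ipython` in the terminal gives the same answer there. The `describe_environment` helper is just an illustrative name.

```python
import shutil
import sys

def describe_environment():
    """Report which Python is running and which ipython the PATH resolves to."""
    return {
        "python": sys.executable,
        "ipython_on_path": shutil.which("ipython"),  # None if not found
        "sys_path_head": sys.path[:3],
    }

for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

If the ipython path points into the anaconda folder while the Python executable lives under /macqiime, that mismatch is the situation described above.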

installing macqiime

Within the span of 1 week, I set up my new super-powerful Mac Pro, we got all of the ZEN sequence data back, and QIIME version 1.9 is live! I also posted my IPython notebook for a basic QIIME analysis: http://jennomics.github.io/QIIMEbyJennomics/

Quite a confluence of events…

Anyway… I’m christening my new machine with QIIME.

Notes on macqiime install:

1. I went through the installation instructions here, including the optional add-ons, with no glitches:

http://www.wernerlab.org/software/macqiime/macqiime-installation#install

2. I ignored AmpliconNoise because I do not use 454 data.

3. I could not get Topiary Explorer to work. At first, there was a problem with the security, but I figured out how to add exceptions, but then it still didn’t work, and the error message said: “Unable to launch application.” Then, I clicked on the Details button, and I think this describes the problem:

Caused by: java.net.URISyntaxException: Relative path in absolute URI: file://topiaryexplorer1.0.jar

But, I’m not sure how to fix it, so I decided to move on and come back to Topiary Explorer when/if I need it.

4. In the bit about installing R, I noticed this:

Please note that even if you installed R and these libraries previously for MacQIIME 1.8.0, you should still upgrade to/install the latest version of R, 3.1.2, and re-install all these R packages to get everything working.

And, that’s how I learned that QIIME 1.9 was out there. BUT, it doesn’t look like macqiime has been updated, so it installed QIIME 1.8 instead. Maybe that’s because it’s a “release candidate” at this point? Anyway, I’ll have to go back and update QIIME somehow. Macqiime appears to be working. See below for the output of print_qiime_config.py -t

System information
==================
Platform:    darwin
Python version:    2.7.3 (default, Dec 19 2012, 09:12:08)  [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
Python executable:    /macqiime/bin/python

Dependency versions
===================
PyCogent version:    1.5.3
NumPy version:    1.7.1
matplotlib version:    1.1.0
biom-format version:    1.3.1
qcli version:    0.1.0
QIIME library version:    1.8.0
QIIME script version:    1.8.0
PyNAST version (if installed):    1.2.2
Emperor version:    0.9.3
RDP Classifier version (if installed):    rdp_classifier-2.2.jar
Java version (if installed):    Not installed.

QIIME config values
===================
blastmat_dir:    None
sc_queue:    all.q
topiaryexplorer_project_dir:    None
pynast_template_alignment_fp:    /macqiime/greengenes/core_set_aligned.fasta.imputed
cluster_jobs_fp:    /macqiime/QIIME/bin/start_parallel_jobs.py
pynast_template_alignment_blastdb:    None
assign_taxonomy_reference_seqs_fp:    /macqiime/greengenes/gg_13_8_otus/rep_set/97_otus.fasta
torque_queue:    friendlyq
template_alignment_lanemask_fp:    /macqiime/greengenes/lanemask_in_1s_and_0s
jobs_to_start:    1
cloud_environment:    False
qiime_scripts_dir:    /macqiime/QIIME/bin/
denoiser_min_per_core:    50
working_dir:    /tmp/
python_exe_fp:    /macqiime/bin/python
temp_dir:    /tmp/
blastall_fp:    blastall
seconds_to_sleep:    60
assign_taxonomy_id_to_taxonomy_fp:    /macqiime/greengenes/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt
....F..............................
======================================================================
FAIL: test_ampliconnoise_install (__main__.QIIMEDependencyFull)
AmpliconNoise install looks sane.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/macqiime/QIIME/bin/print_qiime_config.py", line 392, in test_ampliconnoise_install
    "$PYRO_LOOKUP_FILE variable is not set. See %s for help." % url)
AssertionError: $PYRO_LOOKUP_FILE variable is not set. See http://qiime.org/install/install.html#ampliconnoise-install-notes for help.

----------------------------------------------------------------------
Ran 35 tests in 0.456s

FAILED (failures=1)
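
The single failure above comes from the AmpliconNoise check, and the assertion message points at an unset $PYRO_LOOKUP_FILE variable, which fits with AmpliconNoise having been skipped in step 2. A minimal sketch of checking that same variable by hand follows; it mirrors the assertion message only, not the actual test body in print_qiime_config.py, and `ampliconnoise_configured` is a hypothetical helper name.

```python
import os

def ampliconnoise_configured(environ=os.environ):
    """True if PYRO_LOOKUP_FILE is set to a non-empty value."""
    return bool(environ.get("PYRO_LOOKUP_FILE"))

print("PYRO_LOOKUP_FILE set:", ampliconnoise_configured())
```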

Postdoc Position in Theoretical Community Ecology

The Seagrass Microbiome Project is looking for a Postdoc!

Postdoctoral Position in Microbial Ecology and Evolution

Jessica Green at the University of Oregon (http://pages.uoregon.edu/green/) is currently seeking a postdoctoral researcher to collaborate on the Seagrass Microbiome Project (http://seagrassmicrobiome.org).   Applicants should have a Ph.D. in a biological, computational, mathematical, or statistical field and strong writing skills.  The ideal candidate will have experience developing and applying models to understand the ecology, evolution, and/or function of complex systems.  Experience in the analysis of environmental sequence data is highly desirable, but not required.

The successful candidate will have the opportunity to creatively and independently tackle one or more of the science questions outlined in the Seagrass Microbiome Project grant proposal (https://seagrassmicrobiome.org/2014-grant-proposal/), funded by the Gordon and Betty Moore Foundation. The successful candidate will interact regularly with team members Jonathan Eisen (http://phylogenomics.wordpress.com), Jay Stachowicz (http://www-eve.ucdavis.edu/stachowicz/stachowicz.shtml), and Jenna Lang (http://jennomics.com/) at the University of California, Davis through weekly tele-conferencing and also through regular visits to the UC Davis campus. At the University of Oregon, the candidate will benefit from ongoing microbiome research programs including the Microbial Ecology and Theory of Animals Center for Systems Biology (http://meta.uoregon.edu/) and the Biology and Built Environment Center (http://biobe.uoregon.edu/).

The position is available for 1 year with the possibility for renewal depending on performance.  The start date is flexible.  Please email questions regarding the position to Jessica Green (jlgreen@uoregon.edu).

To apply

A complete application will consist of the following materials:

(1) a brief cover letter explaining your background and career interests

(2) CV (including publications)

(3) names and contact information for three references

Submit materials to ie2jobs@uoregon.edu.  Subject: Posting 14431

To ensure consideration, please submit applications by March 10, 2015, but the position will remain open until filled.

Women and minorities encouraged to apply.  We invite applications from qualified candidates who share our commitment to diversity.

The University of Oregon is an equal opportunity, affirmative action institution committed to cultural diversity and compliance with the ADA. The University encourages all qualified individuals to apply, and does not discriminate on the basis of any protected status, including veteran and disability status.

Zen Project Update!

You might be wondering, “Whatever happened to that Zen Project stuff?”

ZEN (Zostera Experimental Network) is a global partnership of seagrass researchers studying Zostera marina. Over the summer and late last spring we sent 24 sampling kits to researchers across the world. Except for a few sites (international sample collection is hard) we’ve received nearly all of the kits with samples intact and we’re well on our way to finishing extracting DNA from each of them.

20140618_140930
The first kit we received, from North Carolina.

Here’s a map of all of the samples we’ve received so far (it will be updated in real time as we receive more samples).

We have also had a couple of experimental challenges that are worth noting:

Collection strategies: As with any carefully planned field experiment, things go wrong once you’re out in the field. It seems like some of our collection techniques are harder to carry out than we had originally thought. The filtering process is not always clear, and it’s often hard to fully submerge roots and leaves in the Zymo storage buffer we provided.

Extraction: It turns out that despite its great capacity for sample preservation, the Zymo buffer does not play nicely with our DNA extraction method. We’ve dealt with this by modifying the extraction protocol, but we still get lower yields per sample than we had hoped.

In spite of these challenges we’ve still been overwhelmingly successful:

Received kits: Nearly all of the kits are back and being extracted!

Wonderful Collaborators: Many of our partners have graciously offered to help collect extra seagrass species for other aspects of the project. (Thank you!)