
Marine Fungi Workshop

I just returned from a marine fungi workshop organized by Amy Gladfelter and supported by the Gordon and Betty Moore Foundation. The workshop ran from May 7-9 at the Marine Biological Laboratory in Woods Hole, MA. This was actually my second trip to Woods Hole; my first was in the summer of 2015 to attend the Microbial Diversity course (about which I wrote a cheesy poem).

The workshop started with everyone giving 5-minute lightning talks about their research. It was my first time presenting my research ideas to people outside of UC Davis, and even though it was only a 5-minute presentation, I was scared to death. I am pretty sure I was literally shaking in the moments leading up to my talk, and my imposter syndrome was yelling at me to run far, far away so that the real mycologists (doubly scary since they were mostly professors) wouldn’t know they’d invited an eco-evolutionary microbiologist / bioinformagician into their midst. I can’t really remember anything that happened in those 5 minutes, but I walked away feeling like I had crushed it (take that, imposter syndrome).

After the talks, we discussed what we thought were some big issues in marine mycology as a group before breaking up into 4 smaller groups with the goal of drafting white papers on these issues.

The 4 smaller group topics were:

  1. Who is out there? Identification and isolation of fungi from different parts of the marine environment
  2. How can marine fungi be studied? Establishing model systems to discover new biology
  3. What are fungi doing to influence the geochemical cycle of the ocean? Establishing the function of fungi in chemical cycling and contributions to climate
  4. How are fungi interacting with and shaping the marine biosphere? Identification of fungal interactions across scales of life in the ocean

Some of the dominant themes that emerged from these conversations were: (1) a desire to inform both scientists and non-scientists of the presence of fungi in the ocean; (2) a need to quantify and communicate the importance of the roles of marine fungi in the ocean; (3) the unclear definition of marine fungi and whether or not this definition includes facultative marine fungi, transient terrestrial fungi or freshwater / brackish fungi; (4) our current lack of understanding of the genetic, phylogenetic, functional and ecological diversity of marine fungi and the spatial scales at which they exist in the marine environment; (5) the lack of standardized protocols for the study of fungi more generally and a need for improved / expanded databases for fungal sequence data that potentially incorporate phylogeny.

I got to meet a bunch of awesome people from a variety of fields (including systematics, cell biology, genetics, chemistry, bioinformatics, etc.), some of whom I had heard a lot about or seen before on Twitter and others who were completely new to me! I only wish the workshop had been 1-2 days longer to allow more networking and collaborative discussion. Despite the jam-packed schedule, we somehow managed to fit in a boat trip on one of the MBL’s collection vessels, the Gemma.

Throughout the conference, I realized a few things: (1) I should probably be going to and giving talks at more conferences; (2) networking skills are extremely important; (3) I need to learn more about fungal taxonomy and systematics; (4) I am now super excited to look at and incorporate fungi in some of my other non-seagrass projects; (5) working on my computer on a bus is not a good idea and makes me extremely motion sick.

Found some microbes in Woods Hole, but no marine fungi 😦

This workshop served as a breath of fresh air for me and helped renew my excitement for analyzing my seagrass-associated fungal ITS data. It also gave me a few cool ideas of things to do moving forward. I am extremely grateful that I had the opportunity to attend and that the Moore Foundation was able to bring us all together. I can’t wait for the next marine fungi meet-up!

Fungal ITS Taxonomy Problem: SOLVED (for now)

The past couple of weeks (maybe months?) I’ve been struggling to analyze some fungal ITS data that we have for our Edge Effects side project. No one in our lab really specializes in fungal barcoding (or fungal anything), so we became sheep and followed the mainstream path. Using the ITS1F and ITS2 primers, we amplified the ITS region, which lies between the small and large subunit ribosomal RNA genes and was, to our knowledge, the “chosen one” for fungal barcoding. ITS appears to serve its purpose in terms of detailed classification (family/genus taxonomic levels), but it is definitely not a perfect barcode. For one, ITS reads cannot be aligned (perhaps due to too much variation between reads: insertions, deletions, length variation, etc.), which makes the reads useless by themselves for phylogenetic approaches.

Before this particular dataset fell into my hands, it was in Jenna’s, and when issues with the ITS dataset arose, she turned to Twitter for answers (part 1 and part 2). The conclusion: because we want to do phylogenetic analyses, our future fungal work will most likely not use ITS, as ultimately we care more about phylogeny than taxonomy.

That is great – but we still have our ITS dataset, what do we do with it?

I essentially did what they do in this tutorial, which I of course found after figuring out what to do from scratch. I used the UNITE ITS database to cluster my unmerged forward reads into OTUs in QIIME using UCLUST. I also used UCLUST to assign taxonomy (because it was the default option). I then did some basic filtering to remove singletons, mitochondria, chloroplasts and taxa that were unassigned at the kingdom level. This is where things began to go wrong (if they weren’t already wrong to start with).

I summarized my biom table using biom summarize-table and I saw this:

Counts/Sample summary:
Min: 0.0
Max: 838.0
Median: 30.000
Mean: 100.512
Std. dev.: 201.292

What happened to all my sequences?? Better yet, are there even fungi on seagrass? Is what we are seeing the result of low fungal biomass????
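For anyone who wants to see how dropping unassigned OTUs can gut the per-sample counts, here is a minimal Python sketch. The toy table, OTU names and numbers are entirely made up for illustration (they are not from the seagrass dataset), and the summary only loosely mimics the kind of figures that biom summarize-table reports.

```python
from statistics import mean, median, pstdev

# Hypothetical toy OTU table: OTU id -> (kingdom-level label, counts per sample).
# These values are invented for illustration only.
otu_table = {
    "OTU_1": ("Fungi", [12, 0, 30]),
    "OTU_2": ("Unassigned", [2500, 14, 900]),
    "OTU_3": ("Fungi", [3, 0, 45]),
    "OTU_4": ("Unassigned", [800, 0, 1200]),
}


def drop_unassigned(table):
    """Keep only OTUs whose kingdom-level label is not 'Unassigned'."""
    return {
        otu: (label, counts)
        for otu, (label, counts) in table.items()
        if label != "Unassigned"
    }


def counts_per_sample(table):
    """Sum the counts of every remaining OTU within each sample (column)."""
    columns = zip(*(counts for _label, counts in table.values()))
    return [sum(column) for column in columns]


def summarize(totals):
    """Min, max, median, mean and population standard deviation of the totals."""
    return {
        "min": min(totals),
        "max": max(totals),
        "median": median(totals),
        "mean": mean(totals),
        "std_dev": pstdev(totals),  # population std dev is an arbitrary choice here
    }


before = summarize(counts_per_sample(otu_table))
after = summarize(counts_per_sample(drop_unassigned(otu_table)))
print("before filtering:", before)
print("after filtering: ", after)
```

In this toy case most of the reads sit in the unassigned OTUs, so filtering them out leaves one sample with no reads at all, which is the same pattern as the Min: 0.0 in the summary above.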

Let the investigation begin. I decided to look at what my biom table looked like before I filtered out the unassigned reads. This is what I saw.

Min: 14.0
Max: 48889.0
Median: 2847.000
Mean: 6001.653
Std. dev.: 9287.627

Now, that looks a bit better… except that the “unassigned” reads could be anything (seagrass, jellyfish, bacteria, fungi, sponges, etc.). Since we want to do a “fungal” analysis, this just won’t do. So to investigate further, I downloaded NCBI’s nucleotide “nt” database. Approximately 4250 OTUs in my dataset were classified as “Unassigned”, so I pulled these out and BLASTed them locally against the “nt” database to get some idea of their taxonomy. What I found was that my “Unassigned” OTUs were seagrass, jellyfish, bacteria, sponges and lots and lots of uncultured fungi. Of my ~4250 OTUs, ~3250 hit something in the “nt” database and ~700 of those hit something with >70% identity over >70% of the query OTU length. So there are obviously fungi (or fungi-like sequences) in my dataset that aren’t being identified by the method for taxonomic assignment I’ve been using (UCLUST & UNITE).
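The identity and coverage cutoff described above could be applied with something like the following Python sketch. It is not the script actually used for this analysis. It assumes the BLAST results were written as tab-separated text with the columns qseqid, sseqid, pident, qstart, qend and qlen (a custom column order, not the default tabular output), and the example hit lines are hypothetical.

```python
def passes_thresholds(line, min_identity=70.0, min_coverage=70.0):
    """True if one hit exceeds both the identity and query-coverage cutoffs.

    Assumed columns (tab-separated): qseqid, sseqid, pident, qstart, qend, qlen.
    """
    _qseqid, _sseqid, pident, qstart, qend, qlen = line.rstrip("\n").split("\t")
    # Percent of the query covered by this single alignment.
    coverage = 100.0 * (int(qend) - int(qstart) + 1) / int(qlen)
    return float(pident) > min_identity and coverage > min_coverage


def otus_with_good_hits(lines, min_identity=70.0, min_coverage=70.0):
    """Set of query OTU ids that have at least one hit passing both cutoffs."""
    return {
        line.split("\t")[0]
        for line in lines
        if line.strip() and passes_thresholds(line, min_identity, min_coverage)
    }


# Hypothetical hits, invented for illustration.
example_hits = [
    "OTU_1\thitA\t95.0\t1\t200\t220",   # high identity, ~91% coverage
    "OTU_2\thitB\t65.0\t1\t210\t220",   # identity too low
    "OTU_3\thitC\t88.0\t50\t120\t220",  # coverage too low (~32%)
    "OTU_1\thitD\t72.0\t10\t180\t220",  # second passing hit for OTU_1
]

good = otus_with_good_hits(example_hits)
print(sorted(good))
```

Collecting the ids in a set means an OTU with several passing hits is only counted once, which matters when tallying how many OTUs hit something.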

On a whim while writing this blog post about the dreary nature of ITS, I took a second look at the earlier mentioned tutorial. On the surface, it looks identical to what I did with my dataset (reassuring), but I then noticed they were using a mysterious parameter file. Perhaps this parameter file was filled with rainbows, pixie dust and unicorns that could solve all my fungal problems? To investigate further, I downloaded and took a peek at this mythical parameter file. Cue dramatic music. Lo and behold, they are using the “blast” method for taxonomic assignment instead of UCLUST. So I thought, what the heck, I’ll try anything at this point to make this fungal data usable, let’s give it a go. Of course (because this is how my life seems to be going recently), using the “blast” method of taxonomic assignment worked like magic. My new biom table summary (and this is after removing OTUs with “No blast hit”) looks like this:

Min: 7.0
Max: 47199.0
Median: 2344.000
Mean: 5441.093
Std. dev.: 9146.485
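Since the whole problem came down to one line in a parameter file, a quick check like the Python sketch below could have caught it earlier. It assumes the QIIME 1 parameter file layout of one "script_name:option value" entry per line with "#" comments, and it assumes "uclust" is what you get when no method is set (as it was for me). The example file contents are illustrative only and are not copied from the tutorial's file.

```python
def parse_parameters(text):
    """Parse 'script:option value' lines into {script: {option: value}}."""
    params = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        parts = line.split(None, 1)  # split key from value on any whitespace
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if ":" not in key:
            continue  # not a 'script:option' entry, skip it
        script, option = key.split(":", 1)
        params.setdefault(script, {})[option] = value
    return params


def assignment_method(params, default="uclust"):
    """Taxonomy assignment method the file requests, or the assumed default."""
    return params.get("assign_taxonomy", {}).get("assignment_method", default)


# Illustrative file contents (an assumption, not the tutorial's actual file).
example_file = """
# use BLAST instead of the default for taxonomy assignment
assign_taxonomy:assignment_method blast
"""

print(assignment_method(parse_parameters(example_file)))
print(assignment_method(parse_parameters("")))
```

Running a check like this on every parameter file before a long QIIME job is a cheap way to see which defaults are being overridden and which are silently kept.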

According to the log file, using the “blast” method, 4717 sequences were inspected and only 1796 could not be identified. This is a huge improvement over before, when ~4250 were “Unassigned”. I will note here that, upon investigating the blast-assigned taxonomies, I do see a lot of unidentified fungi, so this solution might not work for you if you care about specific taxonomy. I still have to analyze this new biom table, which will be its own hurdle since I can’t use phylogenetic approaches, but at least I have enough truly “fungal” data to analyze now.

Thinking back on all of my struggles, I am so incredibly angry that one silly QIIME parameter was what was keeping me from moving forward. Even before this I was wary of what the default QIIME script options meant for my data, but moving forward I’ll be even more vigilant in my choice of programs and parameters. This entire situation is equal parts ridiculous, embarrassing, frustrating and dumb luck. Perhaps the craziest part is that had I not decided to write a blog post about my problems with ITS, I would never have found the solution to this particular problem. I can’t be the only one to ever have had this issue. Is this some well-kept mycologist secret method to ITS success? My hope is that by writing this blog post, I can save others from weeks (or months) of mental anguish over poor-quality ITS taxonomic classification when the answer is hidden (or not so hidden) away in a silly parameter file.

What the fungi do I do with my ITS library? (Part 2)

Originally posted on May 22, 2014

Previously, I expressed some concern about size variation in my environmental fungal ITS PCR libraries. I’m still concerned about that, but I have an additional concern. The ITS region can’t be aligned, and I’m partial to phylogenetic approaches to pretty much everything. So maybe ITS is not for me?

So, I asked Twitter again…

In summary, I don’t think that I can use ITS given the size variation that I see, and I’m not sure that I want to, given the fact that you cannot align it to do phylogeny-based analyses.

28S (or LSU) is a reasonable alternative to ITS that has two big downsides: 1) the reference database is much smaller than the ITS reference database and 2) it does not provide the fine-scale taxonomic resolution that ITS does.

Rachel Adams referred me to Amend et al., in which they use both. I’ll have to look into this approach…