hmmscan vs. hmmsearch speed: the numerology

From today’s email…

Suppose, for example, you want to search 300 million metagenomic sequence reads, each about 200 nt long, against the Pfam database. What’s the best way to do that with HMMER3? The bottom line: use hmmsearch, not hmmscan. For the numerology of why (and chapter and verse on how hmmsearch and hmmscan scale to large multithreaded and MPI tasks, their limitations, advice on how we do it, and some clues about what’s coming in the future), keep reading…
Continue reading →

Extracting HMMER results to sequence files: Easel miniapplications

The HMMER and Infernal code includes some hidden tools: the Easel library, and its “miniapplications”. Easel is our code library (in the easel subdirectory of both HMMER and Infernal), and the miniapplications (in easel/miniapps) are a set of command line utilities that we use for manipulating sequence data. For example, esl-reformat is a utility for reformatting from one sequence file format to another, and esl-sfetch is a tool for retrieving sequence(s) or subsequence(s) from a large sequence flatfile. These utilities work together with HMMER and Infernal to enable sequence analysis in a flexible, arcane, unix-y command line sort of way.

For example, yesterday someone wrote to ask: suppose I want to extract all the sequences that were hit by a HMMER hmmsearch and save them in a separate file in FASTA format. How do I do that? This is a good example for introducing Easel’s miniapplications.
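To make the task concrete, here is a minimal Python sketch of the idea: pull the hit names out of a tabular hmmsearch output, then keep only the matching records from a FASTA file. It is a stand-in for illustration, not how Easel does it. The function names `hit_names` and `fetch_fasta` are hypothetical, the toy table is abbreviated and made up, and the only format facts assumed are that `--tblout` comment lines start with `#` and that the target name is the first whitespace-delimited field.

```python
def hit_names(tblout_text):
    """Return target names from --tblout style text, in order, without duplicates."""
    names = []
    seen = set()
    for line in tblout_text.splitlines():
        # Skip blank lines and '#' comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        name = line.split()[0]  # first field is the target sequence name
        if name not in seen:
            seen.add(name)
            names.append(name)
    return names


def fetch_fasta(fasta_text, wanted):
    """Return the FASTA records whose name (first word after '>') is in wanted."""
    wanted = set(wanted)
    kept = []
    keep = False
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            keep = bool(fields) and fields[0] in wanted
        if keep:
            kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


# Toy inputs (made up, with most table columns omitted for brevity).
TBL = """\
# target name  accession  query name  accession  E-value  score  bias
seqA           -          myquery     -          1.2e-30  101.5  0.1
seqC           -          myquery     -          3.4e-08   29.7  0.0
#
# Program: hmmsearch
"""

FASTA = ">seqA first\nACGT\n>seqB second\nTTTT\n>seqC third\nGGCC\n"

if __name__ == "__main__":
    print(fetch_fasta(FASTA, hit_names(TBL)), end="")
```

This reads the whole sequence file into memory, which is fine for a toy but not for a large flatfile; retrieving records from a large file is the job esl-sfetch is designed for.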
Continue reading →

HMMER3 at your (web) service

Over at hmmer.janelia.org, you’ll notice a significant change on the right side of the page. See the “Search” button? You no longer have to use HMMER at the UNIX command line. Thanks to support from the Howard Hughes Medical Institute, and hard work from Rob Finn and Jody Clements here in the skunkworks at HMMER Labs, HMMER searches are now available on interactive web servers.
Continue reading →

Updated user guide

I’ve added some sections to the version of the HMMER User’s Guide that’s linked at hmmer.org. You can download the new Guide from here.

The Guide now briefly describes the HMMER3 acceleration pipeline for profile/sequence comparison, and the methods used to identify domain “envelopes”, maximum expected accuracy alignments, and posterior probabilities of aligned residue confidence. The Guide also now documents the --tblout and --domtblout tabular output formats.

This should help address questions about what all the fiddly columns mean in these outputs, especially in --tblout.

The version of the Guide included in HMMER distro tarballs is still the old one. The updated Guide will appear in distro tarballs for 3.1, which is coming Real Soon Now.

In memoriam.


Human language is a cracked kettle on which we beat out tunes for bears to dance to, when all the time we are longing to move the stars to pity.

— Gustave Flaubert, Madame Bovary

I regret to announce the untimely death Monday morning of Michael Farrar, principal software engineer for the Eddy lab. Michael was a critical part of our lab and an extraordinarily talented colleague. The lab is stunned. He is greatly, greatly missed. This will of course impact the HMMER3 development roadmap and our lab’s plans for the future. I will discuss business at some later time. Our thoughts now are with Michael’s wife and his three children.

HMMER 3.0


Our quest is at an end.

— Monty Python and the Holy Grail

Four years in development, and a year in testing: HMMER3 has reached its first public production release. Do we have time for a beer and a small celebration before we write the manuscripts and move straight on to 3.1 development? No? Thought not.

HMMER3 is available for download as a source code tarball. Over at hmmer.org, there are also links for downloading tarballs that include precompiled binaries for Linux/Intel ia32, Linux/Intel x86_64, and Mac OS X/Intel platforms.

The release notes for 3.0 follow:
Continue reading →