HMMER 3.1 beta test 2 released

The HMMER dev team is happy to announce a bugfix release of HMMER3.1, release 3.1 beta 2, aka 3.1b2. Following Google’s ineffable lead in having perpetual beta test periods, 3.1 has been in beta test now for two years. When we said before that 3.1 will be released reasonably soon… “reasonably soon” continues to be a term of art for the dev team. Did we mention, it’s a stable beta release?

Anyhoo, moving right along. The 3.1b2 code is publicly available as a tarball for download, or from hmmer.org, where you’ll also find precompiled binary releases for Mac and Linux.

The most significant upgrade in 3.1b2 is that the nhmmer program for DNA/DNA comparison now includes a somewhat radical heuristic acceleration technique that gets us about 10x more speed. Travis Wheeler has used an FM-index data structure to accelerate remote homology search in nhmmer. FM-index techniques are well known now in the computational biology community for fast near-exact matching (in read mappers, for example), and there have been some proofs of principle for accelerating Smith/Waterman especially with scoring systems set for close matches; Travis’s code is a full-fledged implementation in production code for remote homology. Travis is still working on it and writing it up. Meanwhile, you can try it out. If you format a DNA database with the new makehmmerdb command, and then use nhmmer --tformat hmmerfm to search the binary FM-index database, you’ll use the new acceleration.
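
For anyone who wants to script the two steps just described, here is a minimal sketch in Python that only assembles the command lines. The file names (genome.fa, genome.fm, query.hmm) and the helper function name are made up for illustration, and the argument order (sequence file then output database for makehmmerdb; query profile then target database for nhmmer) is our reading of the usage, so check each program's -h output before relying on it.

```python
import shlex


def fm_index_search_commands(dna_fasta, fm_db, query_hmm):
    """Build the two command lines for an FM-index accelerated nhmmer search.

    All three arguments are hypothetical file paths supplied by the caller.
    Returns (build_cmd, search_cmd) as argument lists suitable for subprocess.
    """
    # Step 1: format the DNA database into a binary FM-index file.
    build_cmd = ["makehmmerdb", dna_fasta, fm_db]
    # Step 2: search that binary database, telling nhmmer its format.
    search_cmd = ["nhmmer", "--tformat", "hmmerfm", query_hmm, fm_db]
    return build_cmd, search_cmd


build_cmd, search_cmd = fm_index_search_commands(
    "genome.fa", "genome.fm", "query.hmm"
)
print(shlex.join(build_cmd))
print(shlex.join(search_cmd))
```

With HMMER 3.1b2 installed and on your PATH, each list could be handed to subprocess.run(cmd, check=True) in turn. The sketch stops at printing so that it runs even without HMMER present.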

Another significant upgrade is the inclusion of the hmmlogo program, which is essentially a commandline interface for producing the data underlying the Skylign profile logo server (skylign.org).

Also, eight, count ’em eight bugs have been fixed. Of the ones we count, anyway.

Congratulations again to Travis Wheeler, who continues as 3.1’s build master, even though he is now afar in his new mountain lair faculty position at the University of Montana, as the HMMER dev team continues to scatter and flee from Virginia.

The horrible grinding noise you hear is the HMMER4 development code branch. Do not be alarmed. All is well. It will be ready… reasonably soon.

Detailed release notes for 3.1b2 are below the fold.

HMMER 3.1 beta test 1 released

The HMMER dev team is happy to announce an upgrade release of HMMER3, release 3.1. A beta test version of the code is publicly available as a tarball for download, or from hmmer.org, where you’ll also find precompiled binary releases for Mac and Linux.

HMMER 3.1 includes nhmmer and nhmmscan, programs for DNA/DNA homology searches with profile HMMs. nhmmer has already been incorporated in RepeatMasker, in collaboration with Arian Smit and colleagues, and is the software underlying the Dfam database of profiles for mobile DNA elements.

HMMER 3.1 database searches are about twice as fast as HMMER 3.0 was, fulfilling old campaign promises.

HMMER 3.1 includes hmmpgmd, the parallel search daemon underlying HMMER Web Services at hmmer.org.

This code is expected to be stable, but we’re releasing it as a beta test just to be careful. After some time in the wild, we’ll make a release candidate, and if you folks haven’t chewed any of that up too badly, we’ll make the final 3.1 release reasonably soon.

Congratulations to Travis Wheeler, 3.1’s build master — note the TJW on the notes below the fold, not an SRE — the first HMMER release managed by someone besides me (Sean).

Meanwhile… slowly, slowly, HMMER4 takes shape, as the gnomes of HMMER Labs toil sleeplessly on their latest monstrosities. The long awaited return of glocal alignment has been delayed into HMMER4, because the changes required turned out to be, um, quite extensive.

Detailed release notes for 3.1b1 are below the fold.

Join Rob’s HMMER team

Rob Finn’s HMMER web services team is expanding. We’re looking for people to apply to two new positions to help Rob and Jody push forward on some important ideas for our services. We’re pushing in the direction of using more phylogenetic information (species trees) as we compute database homology searches and deliver the results — organizing everything on trees, rather than treating the protein database as a bag of unrelated sequences, as we (the community) have tended to do in the past. We’ll need help on the data visualization side (navigating search results organized on the tree of life), on the computing back end (accelerating our searches by searching representative subsets of complete proteomes, rather than “all” sequences — which will allow us to deliver fully interactive search times, measured in milliseconds), and on collaborative efforts with the primary protein sequence and genome data resources, as we (the community) get our data ecosystem organized around complete annotated genomes, not individual protein sequences. The positions, written in HR-speak, are advertised on HHMI’s web site here and here.

HMMER3 is stubborn

We’ve had a couple of reports of some less-than-intuitive behavior of HMMER3 on poor-scoring sequences. As one correspondent described it, HMMER3 is stubborn. It will refuse to score and align certain low-scoring sequences no matter what options you try to set. It’s probably worth explaining this behavior in public, partly because it’s an opportunity for me to briefly describe the fact that H3 has two processing pipelines: the “acceleration” pipeline, and the “domain postprocessing” pipeline. Only the acceleration pipeline is written up for publication, reasonably well documented, and well controllable by options. The domain postprocessor is ad hoc, not terribly satisfying, not well documented, not easily configurable — and it kicks back a side effect that drops some poor-scoring sequences entirely.