New positions opening in the group

Construction on the new laboratory here at Harvard is starting to look like it might actually get done. They’re projecting our move-in date to be the first week of November. I’ve been meeting with prospective rotation students and postdoc candidates, even though we still have no place to put anyone or anything. It feels like we have a bunch of planes in the air stacked up in holding patterns, waiting for the bulldozers to finish the runway.

I plan to search for three key staff positions this fall. The first open position is our all-important administrative lab coordinator. I’m already sorely missing the lab coordinators we had over the years at Janelia (Margaret Jefferies, Patrice Neville, and Sarah Moorehead), and we don’t even have an open lab here yet. An ad is now out in all the fashionable color supplements (here’s the one at Nature Jobs). If you know of someone who might be interested in the position, help us spread the word!

The other two positions are for scientific software developers, working on our main codebases in HMMER, Infernal, and Easel. We’re going to make a push on the high-performance computing end of things, including parallelization (SIMD vectorization, threading, and MPI), so I want to find a software engineer with experience in C programming in parallel HPC scientific applications. We’re also going to make a push on the visualization and web development end of things, so I want to find a web developer with experience in data visualization who’d also be interested in being a general guru of much of our front-facing stuff, including our web presence, issue tracking, GitHub, and software distribution. I don’t have ads out for these positions yet, but will do soon-ish.

Open faculty position in Harvard FAS Systems Biology

Harvard’s FAS Center for Systems Biology has opened a search for a new tenure-track faculty member at the assistant professor level. Sharad Ramanathan and I are the co-chairs for the search committee.

From the ad:

The Center emphasizes quantitative approaches to fundamental problems in biology. It aims to foster interactions across disciplinary boundaries, housing faculty from a spectrum of academic departments in addition to the Bauer Fellows. Exceptional candidates in any area of quantitative biology will be considered, including those taking computational, theoretical, and/or experimental approaches.

Faculty associated with the Center for Systems Biology have access to facilities and opportunities for collaborative research not only through departments but also through the Bauer Core facilities, the Center for Nanoscale Systems, the Broad Institute, and the Center for Brain Science. The successful candidate will hold an academic appointment in a natural science department such as, but not restricted to, Molecular and Cellular Biology, Organismic and Evolutionary Biology, Physics, Applied Mathematics, or Chemistry and Chemical Biology.

The application web page is here.

HMMER mission control: we are go for launch vehicle separation(s)

HMMER web servers were officially launched today at the EMBL European Bioinformatics Institute (EBI) in Cambridge, UK. You can read an EBI press release here. This marks the completion of the pilot HMMER server project at Janelia Farm and its transition to the EBI. All of this has been led by Rob Finn, now the head of sequence family resources at EBI. A huge thank you goes out to Rob and his team at Janelia (Jody Clements, Bill Arndt, and Ben Miller), to HHMI for funding the pilot project, and to the EBI for agreeing to adopt the pilot project and making it all grown up and respectable.

Today we’ve separated the code development home from the servers. Nonetheless, the two sites point back and forth at each other, so you can download the current HMMER release and documentation from EBI, and you can navigate to EBI’s search pages starting from the code development site. We even think that RESTful URLs for the pilot servers will continue to be forwarded and served properly by the new EBI servers. Let us know if you experience any glitches.

Rob’s team at EBI will run the EBI servers, and the Eddy/Rivas lab will continue to be responsible for the code development site. Because of the terrifyingly sophisticated planning processes we employ in the HMMER project, or maybe it’s just a coincidence, the EBI announcement comes just days before our move to Harvard. Everything HMMER-related at Janelia will now wind down quickly over the next few months. A big change for us. If you’re used to using the Janelia address, switch to using the project’s permanent URL. We’re about to turn out the lights here at Janelia.

HMMER 3.1 beta test 2 released


The HMMER dev team is happy to announce a bugfix release of HMMER3.1, release 3.1 beta 2, aka 3.1b2. Following Google’s ineffable lead in having perpetual beta test periods, 3.1 has been in beta test now for two years. When we said before that 3.1 will be released reasonably soon… “reasonably soon” continues to be a term of art for the dev team. Did we mention, it’s a stable beta release?

Anyhoo, moving right along. The 3.1b2 code is publicly available for download as a tarball, or from the HMMER web site, where you’ll also find precompiled binary releases for Mac and Linux.

The most significant upgrade in 3.1b2 is that the nhmmer program for DNA/DNA comparison now includes a somewhat radical heuristic acceleration technique that gets us about 10x more speed. Travis Wheeler has used an FM-index data structure to accelerate remote homology search in nhmmer. FM-index techniques are well known now in the computational biology community for fast near-exact matching (in read mappers, for example), and there have been some proofs of principle for accelerating Smith/Waterman, especially with scoring systems set for close matches; Travis’s code is a full-fledged implementation in production code for remote homology. Travis is still working on it and writing it up. Meanwhile, you can try it out. If you format a DNA database with the new makehmmerdb command, and then use nhmmer --tformat hmmerfm to search the binary FM-index database, you’ll use the new acceleration.
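For readers who haven’t met an FM-index before, here is a minimal Python sketch of the exact-match backward search that the data structure is best known for. It is emphatically not Travis’s implementation and says nothing about how nhmmer uses the index for remote homology; the function names `build_fm_index` and `find_exact` are made up for illustration, and the naive suffix sort would be hopeless on a real genome.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy FM-index for `text`: suffix array, C table, occurrence table."""
    text += "$"  # sentinel; assumed absent from text and smaller than any letter
    # Naive suffix array construction: fine for a demo, far too slow for genomes.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    # C[c] = number of characters in text that are strictly smaller than c.
    counts = Counter(text)
    C, total = {}, 0
    for c in sorted(counts):
        C[c] = total
        total += counts[c]
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return sa, C, occ


def find_exact(pattern, sa, C, occ):
    """Return sorted start positions of exact matches of `pattern`."""
    lo, hi = 0, len(sa)  # half-open interval of matching suffixes
    for ch in reversed(pattern):  # backward search: last character first
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


sa, C, occ = build_fm_index("GATTACAGATTACA")
print(find_exact("TTAC", sa, C, occ))
```

Each pattern character narrows the interval of candidate suffixes in a single step, so the search cost depends on the pattern length rather than the database size. That property is what makes the structure attractive for fast matching.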

Another significant upgrade is the inclusion of the hmmlogo program, which is essentially a command-line interface for producing the data underlying the Skylign profile logo server.

Also, eight, count ’em eight bugs have been fixed. Of the ones we count, anyway.

Congratulations again to Travis Wheeler, who continues as 3.1’s build master, even though he is now afar in his new mountain lair faculty position at the University of Montana, as the HMMER dev team continues to scatter and flee from Virginia.

The horrible grinding noise you hear is the HMMER4 development code branch. Do not be alarmed. All is well. It will be ready… reasonably soon.

Detailed release notes for 3.1b2 are below the fold.

Cryptic genetic variation in software: hunting a buffered 41 year old bug

In genetics, cryptic genetic variation means that a genome can contain mutations whose phenotypic effects are invisible because they are suppressed or buffered, but under rare conditions they become visible and subject to selection pressure.

In software code, engineers sometimes also face the nightmare of a bug in one routine that has no visible effect because of a compensatory bug elsewhere. You fix the other routine, and suddenly the first routine starts failing for an apparently unrelated reason. Epistasis sucks.
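To make the idea concrete, here is a toy Python sketch of a buffered bug. Everything in it is hypothetical and invented for illustration (the names, the tiny `MODULUS`, the choice of an exponential sampler); it is not the bug described in this post. One routine has a latent failure at an input of exactly zero, and a second routine’s off-by-one error happens to guarantee that zero never arrives.

```python
import math

MODULUS = 8  # tiny on purpose, so the "rare" case is easy to hit


def uniform_buggy(raw):
    """Documented as returning [0, 1), but actually returns (0, 1]: an off-by-one."""
    return (raw + 1) / MODULUS


def uniform_fixed(raw):
    """Matches the documentation: returns a value in [0, 1)."""
    return raw / MODULUS


def exponential_sample(u):
    """Inverse-CDF exponential sampling. Latent bug: math.log fails when u == 0."""
    return -math.log(u)


def run(uniform):
    """Push every possible raw value through the sampler; return the ones that fail."""
    failures = []
    for raw in range(MODULUS):
        try:
            exponential_sample(uniform(raw))
        except ValueError:  # math.log(0.0) raises ValueError
            failures.append(raw)
    return failures


print(run(uniform_buggy))  # the off-by-one hides the latent bug
print(run(uniform_fixed))  # "fixing" the uniform routine exposes it
```

With a realistic modulus the zero case would be rare enough to survive years of testing. That rarity is what makes this kind of bug so unpleasant to track down.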

I’ve just found an example in our code, and traced the origin of the problem back 41 years to the algorithm’s description in a 1973 applied mathematics paper. The algorithm, for sampling from a Gaussian distribution, is used worldwide, because it’s implemented in the venerable RANLIB software library still used in lots of numerical codebases, including GNU Octave. It looks to me like the only reason the code has been working is that a compensatory “mutation” has been selected for in everyone else’s code except mine.
