Open position: lab coordinator

We’re searching for a new HHMI administrative lab coordinator. Michelle Merry is leaving us to take a great new position at a startup company down in Kendall Square; we expect her to be running the place. Her last day is May 14, so we’re hoping to find our new lab coordinator quickly.

It’s a part-time, 20hr/wk position, afternoons each weekday. Though part-time, it qualifies for full HHMI employee benefits. If you’re interested, or if you know of someone who might be interested, more information is at the HHMI job applications website here, and at LinkedIn here, or you can contact me directly.

the first DACA Rhodes Scholar

Congratulations to Jin Kyu Park, who was named last week as the first DACA immigrant to become a Rhodes Scholar. The Rhodes Trust changed its rules because of him. Previously, only US citizens were allowed to apply for the Rhodes. In the spirit of DACA, it seemed like Dreamers should also qualify, and Harvard encouraged him to apply last year to force the issue. Though that application was rejected as non-qualifying, the Rhodes trustees subsequently voted in favor of changing their rules to allow DACA recipients to apply, and he was invited to reapply this year, this time successfully.

I was proud to co-write one of his letters of recommendation both times. Jin, a Molecular and Cellular Biology major, took my course MCB 112 Biological Data Analysis, and he also took Elena Rivas’ course MCB 111 Mathematics in Biology. Elena and I wrote a joint rave review letter to the Rhodes. It started with something like ‘he is one of the most impressive individuals we have ever encountered in our careers’ and ended with something like ‘we hope to have the chance to vote for him for President someday.’ What with his DACA status, the latter would require one of the Articles of the Constitution to be amended, but we didn’t say any of this lightly.

I am writing this at my parents’ kitchen table in rural western Pennsylvania, with my large family gathered for Thanksgiving. I just noticed that Breitbart has picked up the story. Both the Breitbart headline [‘Harvard-Backed Illegal Alien Becomes First DACA Rhodes Scholar’] and the comments are what one might expect. Well, I’m one of the supposedly effete Harvard liberal professors who backed Jin’s application. But I grew up here in rural coal mining country. My values are western Pennsylvanian values.

It’s true that the Harvard faculty skew left and affluent. I’m frequently conscious of being somewhat out of place on the faculty. (I’m sure there’s plenty of us faculty who feel out of place at Harvard for one reason or another.) Not just at Harvard, but amongst science “elites” in general, I have certainly experienced some “liberal professor” caricatures occasionally. I’ve had more than one sleek, affluent, private-school-educated colleague tell me that the problem with this country is that rural working Americans are stupid. This sort of crap pisses me off as much as it would piss off anyone else from a place like Creekside, Pennsylvania.

I was raised to treat people as individuals, with honor and respect. Jin Park is an extraordinarily talented and hard-working student. I believe that regulation of legal immigration is a core function of government (yeah, and I believe in limited government, fiscal responsibility, and a strong military too) but Jin wasn’t responsible for the decision to come here from Korea; he can’t help that his parents brought him to this country illegally when he was 7. He grew up here, pretty much like I grew up here. I was given the opportunity to rise through the system, with some things working unfairly for me (white male) and some things working against me (didn’t grow up affluent, no fancy private school education, will never ever be sleek). Jin’s illegal immigrant status is something that’s working against him, but he’s worked damn hard to succeed. He worked his butt off in my class, and in Elena’s. He’s a nice guy too. He deserves every chance to rise in this country, as much as any of us. If we help him, like we would help anyone as extraordinarily well qualified as him, our country will be better off for it.

Because of where I came from, I consider it one of my jobs on the faculty at Harvard to look out for rural Americans with backgrounds like mine; and because of my values, I also consider it one of my jobs to look out for stars like Jin.

mistakes were made

“No one really knows how the game is played
The art of the trade
How the sausage gets made
We just assume that it happens”


A while back Elena Rivas and I posted a response to a bioRxiv preprint (Tavares et al, 2018) that challenged some of the conclusions in our 2017 Nature Methods paper on R-scape, a method for detecting support for conserved RNA secondary structure in sequence alignments by statistical analysis of base pair correlations. At the Benasque RNA meeting last summer, Zasha Weinberg told us we’d made a mistake in our description of how the Weinberg and Breaker R2R program annotates “covarying” base pairs. We’ve just updated our PDF with a correction, and I added an update section to my July blog post to describe the mistake, and our revision. Thanks, Zasha!

Meanwhile, the Tavares et al. manuscript is still up at bioRxiv, with no response to our comment there, despite the fact that we argue that the manuscript is based on an artifact, one of the more spectacular artifacts I have seen in my career. The manuscript describes a method that finds “covariation” support for RNA base pairs in sequence alignments that have no variation or covariation at all.

I’m told that peer review is broken, and that what science really needs is preprint servers and post-publication review. How’s the preprint and open review culture doing in this example? I bet there are people out there citing Tavares et al. as support for an evolutionarily conserved RNA structure in the HOTAIR lncRNA, because that’s what the title and abstract say. If people are even aware of our response, I bet they see it as a squabble between rival labs, because lncRNAs are controversial. I bet almost nobody has looked at our Figure 1, which shows Tavares’ method calling “statistically significant covariation” on sequence alignments of 100% identical sequences. I dunno that you’ll ever find a crispier example of post-publication peer review.

In my experience, this is just the way science works, and the way it has always worked. It is ultimately the responsibility of authors, not reviewers, to get papers right. No amount of peer review, pre-publication or post-, will ever suffice to make the scientific literature reliable. Scientists vary in their motivations and in how much they care. Reviews help, but only where authors are willing to listen. A pre-publication peer reviewer can recommend rejecting a paper; the authors can send it elsewhere. A post-publication review can point out a problem; the authors don’t have to pay attention. And who has time to sort out the truth, if the authors themselves aren’t going to?

Most of the literature simply gets forgotten. What matters in the long run is the stuff that’s both correct and important. It’s mostly pointless to squabble about the stuff that’s wrong; point it out politely, sure, but move on. You can’t change people’s minds if they don’t want to hear it.

I guess what I want you to know is that I’m on the side of wanting to get our work right. If you ever see something wrong in one of our papers, or our code, or any product of our laboratory or whatever comes out of my mouth, I want to know about it. Mind you, I won’t necessarily be happy about it (I’m thin-skinned; my second grade report card said “DOESN’T TAKE CRITICISM WELL”), but I care deeply about making things right, both in the big picture and in every detail.

So again: thanks, Zasha.




’tis the season

Harvard’s Quantitative Biology Initiative is searching for a new tenure-track assistant professor. This is a broad search; we don’t have any particular focus areas in mind. We are interested in people studying fundamental biological questions using quantitative, computational, theoretical, or experimental methods. The Initiative emphasizes cross-departmental interaction among our life sciences departments (including Molecular & Cellular Biology, MCB; Stem Cell & Regenerative Biology, SCRB; and Organismic & Evolutionary Biology, OEB) and our Physics, Statistics, and Chemistry departments, as well as areas in our School of Engineering, including Computer Science and Applied Math.

Ads are out now in the usual places, including Times Higher Education, Science, and LinkedIn. To apply, see

I’m on the search committee, a member of the MCB and Applied Math departments, and I’m happy to answer questions either here or by email.

I’d like to personally emphasize that a lot of what I hear about Harvard faculty searches isn’t true. I do read your application and your papers; I’m perfectly happy to read a bioRxiv preprint, and I don’t count your publications or your C/N/S papers or your citations at all. I want to recruit people from anywhere, not just from Boston or the Ivy League or the elite coasts; I grew up in rural Western Pennsylvania and I’d be happy to have some more people from coal mining towns here at Harvard, or indeed from anywhere else. Search committees here are taught about implicit bias, and we take steps to reduce implicit biases against women and minority candidates. Harvard does have a tenure track, has for years (unlike when I came up through the system), and junior faculty are supported and mentored. We have on-campus childcare, family-friendly policies, and we work hard to make two-body spousal hires work.

I strongly, strongly encourage you to apply, and not to take yourself out of our candidate pool because you think that Harvard’s not going to look at your application for some cynical elitish reason you might have read on Twitter. Yes, we’re looking for top-flight scientists, but my experience is that many top-flight scientists tend to be pretty uncomfortable with proclaiming (or even realizing) that they’re top-flight scientists. Apply, and tell us the cool science you want to do.

Response to Tavares et al., bioRxiv (2018)

A new bioRxiv preprint from Rafael Tavares, Anna Marie Pyle, and Srinivas Somarowthu challenges conclusions of our 2017 Nature Methods paper where we describe R-scape, a method for detecting support for conserved RNA secondary structure in sequence alignments by statistical analysis of base pair covariations. In our paper, among other things, we showed that the evidence presented by Somarowthu (2015) in support of a putative conserved structure for the HOTAIR lncRNA was not statistically significant, using the same alignment that they had analyzed. The new Tavares paper argues that by changing R-scape’s default statistic to a different one called RAFS, now statistically significant evidence for conserved structure is detected in their HOTAIR alignment and others.

Tavares’ conclusions depend on an assumption that the RAFS statistic is an appropriate measure of RNA base pair covariation, but RAFS was not designed to measure covariation alone. RAFS detects positive signals in common patterns of primary sequence conservation in the absence of any covariation. The problem is severe; Tavares’ analysis reports “significantly covarying base pairs” in 100% identical sequence alignments with no variation or covariation. The base pairs that Tavares et al. identify as significantly covarying actually arise from primary sequence conservation patterns. Their analysis still reports similar numbers of “significantly covarying” base pairs in negative controls in which we permute residues in independent alignment columns to destroy covariation. There remains no significant covariation support for evolutionarily conserved RNA structure in the HOTAIR lncRNA or other lncRNA structures and alignments we have analyzed.
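
To make the negative control concrete, here is a minimal Python sketch of the idea of permuting residues independently within each alignment column. This is not R-scape’s code or the exact procedure in our response; the function names (shuffle_columns, column_counts) and the toy alignment are made up for illustration. Shuffling each column on its own keeps every column’s residue composition (its primary sequence conservation) but breaks any pairing between columns.

```python
import random
from collections import Counter


def shuffle_columns(alignment, seed=None):
    """Permute residues independently within each column of an alignment.

    `alignment` is a list of equal-length strings (one per sequence).
    Hypothetical helper for illustration only, not part of R-scape.
    """
    rng = random.Random(seed)
    # Transpose rows -> columns, shuffle each column on its own,
    # then transpose back to rows.
    columns = [list(col) for col in zip(*alignment)]
    for col in columns:
        rng.shuffle(col)
    return ["".join(row) for row in zip(*columns)]


def column_counts(alignment):
    """Residue counts for each column (the conservation pattern)."""
    return [Counter(col) for col in zip(*alignment)]


# Toy alignment (invented): the first and last columns covary as
# G-C, C-G, A-U, U-A; the middle three columns are fully conserved.
toy = ["GACUC", "CACUG", "AACUU", "UACUA"]
shuffled = shuffle_columns(toy, seed=0)

# Per-column composition is preserved by construction.
print(column_counts(shuffled) == column_counts(toy))
```

An alignment of 100% identical sequences is left unchanged by this shuffle, since every column holds a single residue. A statistic that reports “significant covariation” there, or that reports about as many significant pairs after shuffling as before, is responding to conservation rather than covariation.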

We have posted a PDF of a full response to the Tavares et al. preprint on the lab’s web site.

[Update, 15 Nov 2018: We made a correction, marked in red in the PDF, where we describe how the Weinberg and Breaker R2R program annotates “covarying” base pairs. While it’s true that it only “requires a single compensatory pair substitution to annotate a pair as covarying”, it isn’t true that this is “regardless of the number of sequences or the number of substitutions that are inconsistent with the proposed structure.” We have corrected the latter phrase to read “so long as no more than 10% of the sequences are inconsistent with canonical base pairing of the two positions.” We also added a footnote to say that the Somarowthu (2015) paper customized R2R’s tolerance to allow up to 15% inconsistent base pairs to obtain their HOTAIR results. Thank you to Zasha Weinberg for correcting us.]

HMMER 3.2 release


The glorious master plan was to finish HMMER4 while hoping that HMMER3 stayed stable. Alas, HMMER4 development has been even slower than expected, and bugs and bitrot have accumulated on HMMER3. Here’s a new HMMER 3.2 release to tide us all over. I’m managing HMMER releases again, with Travis Wheeler having moved a while ago to a faculty position at U. Montana.

You can get the HMMER3.2.1 source tarball from here.

Graeme Mitchison

Astronomy began when the Babylonians mapped the heavens. Our descendants will certainly not say that biology began with today’s genome projects, but they may well recognize that a great acceleration in the accumulation of biological knowledge began in our era.

Graeme Mitchison wrote those opening lines of our book Biological Sequence Analysis in Richard Durbin’s parents’ house in London. We four coauthors had borrowed the house for a month to write together, knowing that we had to get Richard out of the Sanger Centre or no progress would be made. The living room looked like a spy ring’s safe house, drapes drawn and full of improvised desks, computers, printer, and papers. We paired off in warring alliances to write, to cook, to argue, and to take long walks on the Hampstead Heath to cool down. At one point over a late dinner and wine, Anders Krogh proposed that one could make a hidden Markov model to recognize each of our writing styles. Richard proposed that mine could be recognized trivially by a high emission probability of the word “simple”. I recall snapping something back. I was struggling to draft our introduction and feeling defensive. At some point Graeme took it from me and in a few strokes replaced my clumsy efforts with the chapter that began with the beautiful lines above.


Fall 2018 MCB112 teaching fellows

I’m looking for four teaching fellows (TFs) for my course MCB112 Biological Data Analysis in the fall 2018 semester. TFs are typically Harvard G2 or G3 students (second- or third-year PhD students, in Harvard-speak), but can be more senior students or even postdocs. I teach the course in Python and Jupyter Notebook, using numpy and pandas, so experience in these things is a plus. Email me if you’re interested, or if you know someone else at Harvard who might be interested, let them know.