the first DACA Rhodes Scholar

Congratulations to Jin Kyu Park, who was named last week as the first DACA immigrant to become a Rhodes Scholar. The Rhodes Trust changed its rules because of him. Previously, only US citizens were allowed to apply for the Rhodes. In the spirit of DACA, it seemed like Dreamers should also qualify, and Harvard encouraged him to apply last year to force the issue. Though that application was rejected as non-qualifying, the Rhodes trustees subsequently voted in favor of changing their rules to allow DACA recipients to apply, and he was invited to reapply this year, this time successfully.

I was proud to co-write one of his letters of recommendation both times. Jin, a Molecular and Cellular Biology major, took my course MCB 112 Biological Data Analysis, and he also took Elena Rivas’ course MCB 111 Mathematics in Biology. Elena and I wrote a joint rave review letter to the Rhodes. It started with something like ‘he is one of the most impressive individuals we have ever encountered in our careers’ and ended with something like ‘we hope to have the chance to vote for him for President someday.’ What with his DACA status, the latter would require one of the Articles of the Constitution to be amended, but we didn’t say any of this lightly.

I am writing this at my parents’ kitchen table in rural western Pennsylvania, with my large family gathered for Thanksgiving. I just noticed that Breitbart has picked up the story. Both the Breitbart headline [‘Harvard-Backed Illegal Alien Becomes First DACA Rhodes Scholar’] and the comments are what one might expect. Well, I’m one of the supposedly effete Harvard liberal professors who backed Jin’s application. But I grew up here in rural coal mining country. My values are western Pennsylvanian values.

It’s true that the Harvard faculty skew left and affluent. I’m frequently conscious of being somewhat out of place on the faculty. (I’m sure there’s plenty of us faculty who feel out of place at Harvard for one reason or another.) Not just at Harvard, but amongst science “elites” in general, I have certainly experienced some “liberal professor” caricatures occasionally. I’ve had more than one sleek, affluent, private-school-educated colleague tell me that the problem with this country is that rural working Americans are stupid. This sort of crap pisses me off as much as it would piss off anyone else from a place like Creekside, Pennsylvania.

I was raised to treat people as individuals, with honor and respect. Jin Park is an extraordinarily talented and hard-working student. I believe that regulation of legal immigration is a core function of government (yeah, and I believe in limited government, fiscal responsibility, and a strong military too) but Jin wasn’t responsible for the decision to come here from Korea; he can’t help that his parents brought him to this country illegally when he was 7. He grew up here, pretty much like I grew up here. I was given the opportunity to rise through the system, with some things working unfairly for me (white male) and some things working against me (didn’t grow up affluent, no fancy private school education, will never ever be sleek). Jin’s illegal immigrant status is something that’s working against him, but he’s worked damn hard to succeed. He worked his butt off in my class, and in Elena’s. He’s a nice guy too. He deserves every chance to rise in this country, as much as any of us. If we help him, like we would help anyone as extraordinarily well qualified as him, our country will be better off for it.

Because of where I came from, I consider it one of my jobs on the faculty at Harvard to look out for rural Americans with backgrounds like mine; and because of my values, I also consider it one of my jobs to look out for stars like Jin.

’tis the season

Harvard’s Quantitative Biology Initiative is searching for a new tenure-track assistant professor. This is a broad search: we don’t have any particular focus areas in mind. We are interested in people studying fundamental biological questions using quantitative, computational, theoretical, or experimental methods. The Initiative emphasizes cross-departmental interaction among our life sciences departments (including Molecular & Cellular Biology, MCB; Stem Cell & Regenerative Biology, SCRB; and Organismic & Evolutionary Biology, OEB) and our Physics, Statistics, and Chemistry departments, as well as areas in our School of Engineering, including Computer Science and Applied Math.

Ads are out now in the usual places, including Times Higher Education, Science, and LinkedIn. To apply, see

I’m on the search committee, a member of the MCB and Applied Math departments, and I’m happy to answer questions either here or by email.

I’d like to personally emphasize that a lot of what I hear about Harvard faculty searches isn’t true. I do read your application and your papers; I’m perfectly happy to read a bioRxiv preprint, and I don’t count your publications or your C/N/S papers or your citations at all. I want to recruit people from anywhere, not just from Boston or the Ivy League or the elite coasts; I grew up in rural Western Pennsylvania and I’d be happy to have some more people from coal mining towns here at Harvard, or indeed from anywhere else. Search committees here are taught about implicit bias, and we take steps to reduce implicit biases against women and minority candidates. Harvard does have a tenure track, has for years (unlike when I came up through the system), and junior faculty are supported and mentored. We have on-campus childcare, family-friendly policies, and we work hard to make two-body spousal hires work.

I strongly, strongly encourage you to apply, and not to take yourself out of our candidate pool because you think that Harvard’s not going to look at your application for some cynical elitish reason you might have read on Twitter. Yes, we’re looking for top-flight scientists, but my experience is that many top-flight scientists tend to be pretty uncomfortable with proclaiming (or even realizing) that they’re top-flight scientists.  Apply, and tell us the cool science you want to do.

from zero to python


Students actually showed up, so we really do have to teach the course. MCB112 Biological Data Analysis is now in its first week.

The tricksiest bit in the first couple weeks is bringing people up to speed in writing Python, for people who’ve never written code before. We trust in the power of trial and error. We give working example scripts that are related to what the students are asked to do on a problem set. Developing code by mutation, descent with modification, and selection: coding for biologists.

Soon we’ll start to lift the training wheels, while trying not to leave people in a “now draw the rest of the damn owl” situation.

When you’re learning to code, with every line you type you’re looking something up. Your concentration is getting broken all over the place as you try to express the Simplest Stupid Thing (Why Don’t You Work gaaaah $%^&#@). If you’re also trying to learn something else at the same time that requires hard thinking – an algorithm, a mathematical equation, a biological analysis approach – really just about the last thing you need is to have your concentration broken every ten seconds because you can’t express yourself. The best way to learn to code isn’t to start by writing scientific code. It’s better to code something fun, something that you’re completely absorbed by, something that isn’t too conceptually difficult. You want to have only the code frustrating you, while the goal pulls you in and keeps you engaged.

But I can’t exactly recommend that students learn to code the way that I did. Sure, go get yourself absorbed in an early Internet massive military-industrial simulation game. Automate your country’s economy, re-invent Dijkstra’s shortest path algorithm to distribute your resources, make an interactive display of your map, reverse engineer the client/server communication interface so you can launch automated attacks… no, this is no way to do a PhD. Even if it does mean you end up knowing C and Perl and understanding dynamic programming, GUI development, and networked computing.

So alas, we’ll try to generate entertainment value in more socially acceptable ways, like sand mouse mysteries in the problem sets, or teasing Lior Pachter. We’ll see if it’s enough. If not, maybe I’ll have to see if the old Empire code still compiles.

Biological Data Analysis

I’m starting to plan a new Harvard course that’ll be called Biological Data Analysis. Biology is going through a culture change. It’s suddenly become a data-rich, computational analysis-heavy science. Are we going to outsource data analysis to bioinformaticians and data science specialists, or are biologists going to analyze their own data? There’s always advantages to specialization, and we need bioinformatics and data science. But I also feel that we are dangerously weak in training biologists to think about their own data. The usual response I get when I talk about it is something like “you can’t expect wet lab biologists to learn how to program”.

What I want to teach in Biological Data Analysis is that writing scripts and using the command line for data analysis is not software engineering, it’s just a simple and essential thing that a wet lab biologist can do, and needs to do. I’m going to teach from the point of view that biologists already have a special advantage in large-scale data analysis: we are trained to expect that we will be screwed by our experiments. We should be treating data analysis the same way. Like doing an experiment on a complicated organism, any given data analysis only gives you a narrow glimpse into a large data set. God only knows what else is going on in the data that you’re not seeing. Like doing experiments, you need to design positive and negative controls to protect yourself from the hundred different ways that nature (and computers) are going to mess with you. Writing scripts that generate positive and negative controls for a data analysis is a powerful and biologically motivated thing to know how to do.
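To make the idea of scripted controls concrete, here is a minimal sketch of a positive control next to a negative one. Everything in it is an illustrative assumption rather than course material: the toy Gaussian "measurements", the function names `simulate_measurements` and `top_genes`, the gene indices, and the size of the planted shift are all made up. The point is only that you plant a signal you know is there, and check that your analysis finds it.

```python
import random
import statistics

def simulate_measurements(n_genes, n_samples, rng, shift=0.0, spiked=()):
    """Hypothetical toy data: unit Gaussian noise per gene,
    plus a known shift added to the genes listed in `spiked`."""
    data = {}
    for gene in range(n_genes):
        offset = shift if gene in spiked else 0.0
        data[gene] = [rng.gauss(offset, 1.0) for _ in range(n_samples)]
    return data

def top_genes(data, k):
    """The 'analysis' being checked: rank genes by mean signal, keep the top k."""
    ranked = sorted(data, key=lambda g: statistics.mean(data[g]), reverse=True)
    return set(ranked[:k])

rng = random.Random(42)      # fixed seed so the control is reproducible
spiked = {3, 17, 58}         # genes where we deliberately plant a signal

# Positive control: the signal is there because we put it there.
positive = simulate_measurements(100, 20, rng, shift=5.0, spiked=spiked)
# Negative control: same procedure, no planted signal.
negative = simulate_measurements(100, 20, rng)

recovered = top_genes(positive, k=3)
print("planted genes recovered:", recovered == spiked)
print("largest mean in negative control:",
      max(statistics.mean(v) for v in negative.values()))
```

If the analysis can't recover a signal this strong, or if it reports something just as exciting in the negative control, that's the computational equivalent of a failed control lane on a gel.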

Once you’re generating negative control data (“here’s what the data would look like if there were no effect to be found”), you’re actually doing statistics, but in an intuitive and motivated way that any biologist can understand. Instead of learning a bunch of incantations and lore about t-tests, you’re forced to think directly about what your null hypothesis is, because you have to make a negative control data set according to that null hypothesis. The “p-value” is directly the probability that you observe a signal at least as strong as your real one in your negative control. This style of simulation-driven analysis is enabled by modern computational power plus the ability to write simple scripts and use the command line. You don’t have to learn statistics per se. You have to learn how to do computational control experiments. I expect that if a biologist learns the simulation-driven style of analysis first, then they’re motivated to go on to learn more serious statistical analysis as they need it… and they’re armed with a powerful way to check analytic results against intuitive simulations.
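Here is a rough sketch of what that can look like in a few lines of Python. The assumptions are mine, for illustration only: the null hypothesis is "the treated/untreated labels don't matter", so a negative control data set is made by shuffling the labels; the "signal" is the absolute difference in group means; and the function name `negative_control_pvalue` and the example numbers are invented. Other null hypotheses would call for different simulations.

```python
import random
import statistics

def mean_difference(a, b):
    """The 'signal': difference between the two group means."""
    return statistics.mean(a) - statistics.mean(b)

def negative_control_pvalue(treated, untreated, n_sims=10000, seed=0):
    """Fraction of negative-control data sets whose signal is at least
    as strong as the one observed in the real data.

    Assumed null hypothesis: group labels are irrelevant, so each
    negative control is the pooled data with labels shuffled."""
    rng = random.Random(seed)
    observed = abs(mean_difference(treated, untreated))
    pooled = list(treated) + list(untreated)
    n = len(treated)
    hits = 0
    for _ in range(n_sims):
        rng.shuffle(pooled)                       # make one negative control
        simulated = abs(mean_difference(pooled[:n], pooled[n:]))
        if simulated >= observed:
            hits += 1
    return hits / n_sims

# Made-up example measurements, not real data.
treated = [10.1, 9.8, 10.5, 10.2, 9.9, 10.4]
untreated = [5.0, 5.2, 4.9, 5.1, 4.8, 5.3]
print(negative_control_pvalue(treated, untreated))
```

Notice that there's no t-test lore anywhere in there. The only statistical decision you had to make was how to generate the negative control, which is exactly the decision about what your null hypothesis is.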

If this makes any sense to you, and if you happen to be a Harvard PhD student graduating this year, and you’re thinking it might be nice to take a year and do some teaching… boy, do I have a deal for you. Harvard has a thing called the College Fellows Program. This is a one-year position (renewable for one more) that focuses on teaching and course development. We’ve just posted an ad looking for a College Fellow to help me develop and teach Biological Data Analysis. Application deadline is April 15. Feel free to contact me directly with questions!


Open faculty position in Harvard FAS Systems Biology

Harvard’s FAS Center for Systems Biology has opened a search for a new tenure-track faculty member at the assistant professor level. Sharad Ramanathan and I are the co-chairs for the search committee.

From the ad:

The Center emphasizes quantitative approaches to fundamental problems in biology. It aims to foster interactions across disciplinary boundaries, housing faculty from a spectrum of academic departments in addition to the Bauer Fellows. Exceptional candidates in any area of quantitative biology will be considered, including those taking computational, theoretical, and/or experimental approaches.

Faculty associated with the Center for Systems Biology have access to facilities and opportunities for collaborative research not only through departments but also through the Bauer Core facilities, the Center for Nanoscale Systems, the Broad Institute, and the Center for Brain Science. The successful candidate will hold an academic appointment in a natural science department such as, but not restricted to, Molecular and Cellular Biology, Organismic and Evolutionary Biology, Physics, Applied Mathematics, or Chemistry and Chemical Biology.

The application web page is here.