snoscan and squid in the 21st century

Back in the 20th century, when Todd Lowe was a happy PhD student in my lab in St. Louis instead of toiling in irons as professor and chair at the Department of Biomolecular Engineering at UCSC, he wrote a program called snoscan for searching the yeast genome sequence for 2′-O-methyl guide C/D snoRNAs (small nucleolar RNAs), using a probabilistic model that combines consensus C/D snoRNA features and a guide sequence complementarity to a predicted target methylation site in ribosomal RNA [Lowe & Eddy, 1999]. Like all the software that’s been developed in our group, snoscan has been freely available ever since. Over 30 years, our software, manuscript, and supplementary materials archive has so far survived several institutional moves (Boulder to MRC-LMB to St. Louis to Janelia Farm to Harvard), several generations of revision control system (nothing to rcs to cvs to subversion to git) and is now hosted in the cloud at eddylab.org, with some of the bigger projects also at GitHub.

It’s one thing to keep software available. It’s entirely another thing to keep it running. Software geeks know the term bit rot: a software package will gradually fall apart and stop running, even if you don’t change a thing in it, because operating systems, computer languages, compilers, and libraries evolve.

We got a bug report on snoscan the other day, that it fails to compile on Redhat Linux. (Adorably, the report also asked “how were you ever able to compile it?“. Well, the trick for compiling 18-year-old code is to use an 18-year-old system.) Indeed it does fail, in part because the modern glibc library puts symbols into the namespace that weren’t in glibc in 1999, causing a C compiler in 2017 to fatally screech about name clashes.

I just spent the day unearthing snoscan-0.9b and whacking it back into compilable and runnable shape, which felt sort of like my college days keeping my decrepit 1972 MG running with a hammer and some baling wire, except that I wasn’t cursing Lucas Electrical every five minutes. For those of you who may be trying to build old software from us — and maybe even for you idealists dreaming of ziplessly, magically reproducible computational experiments across decades — here’s a telegraphic tale of supporting a software package that hasn’t been touched since Todd’s thesis was turned in in 1999, in the form of a few notes on the main issues, followed by a change log for the hammers and baling wire I just used to get our snoscan distribution working again.

Continue reading