A new bioRxiv preprint from Rafael Tavares, Anna Marie Pyle, and Srinivas Somarowthu challenges conclusions of our 2017 Nature Methods paper, in which we describe R-scape, a method for detecting support for conserved RNA secondary structure in sequence alignments by statistical analysis of base pair covariations. In our paper, among other things, we showed that the evidence presented by Somarowthu (2015) in support of a putative conserved structure for the HOTAIR lncRNA was not statistically significant, using the same alignment that they had analyzed. The new Tavares paper argues that when R-scape's default statistic is changed to a different one called RAFS, statistically significant evidence for conserved structure is detected in their HOTAIR alignment and others.
Tavares' conclusions depend on an assumption that the RAFS statistic is an appropriate measure of RNA base pair covariation, but RAFS was not designed to measure covariation alone. RAFS gives positive signals for common patterns of primary sequence conservation, even in the absence of any covariation. The problem is severe: Tavares' analysis reports "significantly covarying base pairs" in 100% identical sequence alignments that have no variation or covariation at all. The base pairs that Tavares et al. identify as significantly covarying actually arise from primary sequence conservation patterns. Their analysis still reports similar numbers of "significantly covarying" base pairs in negative controls in which we permute residues independently within each alignment column to destroy covariation. There remains no significant covariation support for evolutionarily conserved RNA structure in the HOTAIR lncRNA or in the other lncRNA structures and alignments we have analyzed.
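To make the negative control concrete, here is a minimal Python sketch of the idea, not the code used in our analysis. The function names `shuffle_columns` and `mutual_information` are hypothetical, and plain mutual information stands in only as a simple illustrative covariation measure; it is neither R-scape's statistic nor RAFS. Shuffling residues independently within each column keeps every column's composition (the primary sequence conservation) but breaks any pairing between columns.

```python
import random
from collections import Counter
from math import log2


def shuffle_columns(alignment, seed=0):
    """Permute residues independently within each alignment column.

    Per-column residue counts are preserved; correlations between
    columns (covariation) are destroyed on average.
    """
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*alignment)]
    for col in columns:
        rng.shuffle(col)
    # Transpose back from columns to sequences.
    return ["".join(row) for row in zip(*columns)]


def mutual_information(alignment, i, j):
    """Mutual information (bits) between columns i and j.

    Illustrative measure only (an assumption of this sketch).
    """
    n = len(alignment)
    count_i = Counter(seq[i] for seq in alignment)
    count_j = Counter(seq[j] for seq in alignment)
    count_ij = Counter((seq[i], seq[j]) for seq in alignment)
    return sum(
        (c / n) * log2((c / n) / ((count_i[a] / n) * (count_j[b] / n)))
        for (a, b), c in count_ij.items()
    )


# Toy alignment: columns 0 and 3 always form a Watson-Crick pair.
covarying = ["GAAC", "AAAU", "CAAG", "UAAA"]
# A 100% identical alignment: no variation, hence no covariation.
identical = ["GAAC"] * 4

print(mutual_information(covarying, 0, 3))
print(mutual_information(identical, 0, 3))
print(shuffle_columns(covarying, seed=1))
```

A statistic that measures covariation alone should score the identical alignment at zero and should, on average, lose its signal after column shuffling. A statistic that still reports "significant" pairs in either case is responding to something other than covariation.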
We have posted a PDF of a full response to the Tavares et al. preprint on the lab's web site.
[Update, 15 Nov 2018: We made a correction, marked in red in the PDF, where we describe how the Weinberg and Breaker R2R program annotates "covarying" base pairs. While it's true that it only "requires a single compensatory pair substitution to annotate a pair as covarying", it isn't true that this is "regardless of the number of sequences or the number of substitutions that are inconsistent with the proposed structure." We have corrected the latter phrase to read "so long as no more than 10% of the sequences are inconsistent with canonical base pairing of the two positions." We also added a footnote to say that the Somarowthu (2015) paper customized R2R's tolerance to allow up to 15% inconsistent base pairs to obtain their HOTAIR results. Thank you to Zasha Weinberg for correcting us.]