The most exciting phrase to hear in science, the one that heralds new discoveries, is not 'Eureka!' but 'That's funny...' --Isaac Asimov
That's Funny…

Proofing a manuscript

February 6th, 2010 by eric

I recently got the proofs back for an accepted manuscript and had to go about making sure the editors didn’t screw anything up when they typeset the text. One way I’ve done this in the past (idea via Gabrielle!) is to coerce a co-author into helping me read the entire manuscript. Backwards. One conspirator reads the original submitted version and another reads the uncorrected proof, starting from the end, pronouncing every word and every punctuation mark. This is mind-numbingly tedious but it works. The end-to-beginning technique is important because it prevents ‘skimming’ and assuming the content of a sentence without actually verifying it.

Unfortunately, I don’t have a co-author here to waylay, so I’ve used the next best thing (no, not an undergrad): a computer. Below is a Perl script which takes as input a text file with one version of the manuscript and outputs the reversed, punctuated version. I abused Adobe to translate the publisher’s PDF into a text file (email it to pdf2txt@adobe.com), which I then manually edited to remove line numbers, etc. After translating it using the script I had Mac OS X read it to me while I made notes on the original submitted PDF whenever there was an inconsistency. The new text-to-speech voice, Alex, is actually quite good. In hindsight, the better way to do this would be to use my original LaTeX (tex2txt?) or .doc file as the reversed text and to make notes directly on the publisher’s PDF. Oh well, there will be a next time.

How to use the program:

  • save as proofer.pl
  • make the code executable: chmod +x ./proofer.pl
  • and run as: ./proofer.pl [inputfile]
  • the program will output a file [proofed.txt]
  • run: say -v Alex < ./proofed.txt
  • OR open the output file in a text processing program (e.g. TextWrangler) and choose “Services –> Speech –> Start Reading Text”. You can change the voice and reading speed under “Apple –> System Preferences –> Speech”
#!/usr/bin/perl
use strict;
use warnings;

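# Open the output file for writing (three-argument open, clearer error message)
open(FILE, '>', 'proofed.txt') or die "Cannot open proofed.txt: $!";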
my @array;

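# Spell out punctuation so the speech synthesizer pronounces it.
# Multi-word names are written backwards ("than, ,less") because the
# word order is reversed at the end, which turns them the right way round.
while(<>) {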
    s/\,/ ,comma, /g;
    s/\:/ ,colon, /g;
    s/\./ ,period, /g;
    s/\;/ ,semicolon, /g;
    s/\</ than, ,less /g;
    s/\>/ than, ,greater /g;
    s/\?/ mark, ,question /g;
    s/\'/ apostrophe, /g;
    s/\"/ mark, ,quotation /g;
    s/\(/ parenthesis, ,open /g;
    s/\)/ parenthesis, ,closed /g;
    s/\=/ ,equals, /g;
    s/\+/ ,plus, /g;
    s/\-/ ,dash, /g;
    s/\!/ point, ,exclamation /g;
    s/\@/ sign, ,at /g;
    s/\#/ sign, ,pound /g;
    s/\$/ sign, ,dollar /g;
    s/\%/ sign, ,percent /g;
    s/\^/ sign, ,caret /g;
    s/\&/ ,ampersand, /g;
    s/\*/ ,asterisk, /g;
    s/\[/ bracket, ,open /g;
    s/\]/ bracket, ,close /g;
    s/\{/ bracket, ,curly open /g;
    s/\}/ bracket, ,curly close /g;
    s/\~/ ,tilde, /g;
    s/\`/ mark, ,tick /g;
    s/\// slash, ,forward /g;
    s/\\/ slash, ,back /g;
    s/(\r|\f)/\n/g;
    s/ a / ,a, /g;
    s/ the / ,the, /g;
    s/ to / ,to, /g;
    s/ in / ,in, /g;
    s/[^A-Za-z0-9 ,\n]/ character, ,unknown /g;
    # Split the line into words on whitespace and keep a newline marker after each line
    push(@array, split(' ', $_), "\n");
}
print FILE join(' ', reverse @array);

close FILE;
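
To make the reversal trick easier to see, here is a minimal Python sketch of the same idea. It is not a port of the script above: it tokenizes first and reverses afterwards, so the spoken names do not need to be written backwards. The function name `reverse_for_proofing` and the short `SPOKEN` table are made up for illustration and cover only a few punctuation marks.

```python
import re

# Spoken names for a handful of punctuation marks (illustrative subset only).
SPOKEN = {
    ",": "comma",
    ".": "period",
    "?": "question mark",
    "(": "open parenthesis",
    ")": "close parenthesis",
}

WORD = re.compile(r"[A-Za-z0-9]+")


def reverse_for_proofing(text):
    """Return the tokens of `text` in reverse order, with punctuation spelled out."""
    # A token is either a run of ASCII letters/digits or one other non-space character.
    tokens = re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9\s]", text)
    spoken = [
        t if WORD.fullmatch(t) else "," + SPOKEN.get(t, "unknown character") + ","
        for t in tokens
    ]
    # Reverse last, so multi-word names such as "question mark" stay readable.
    return " ".join(reversed(spoken))


print(reverse_for_proofing("Is it done (yet)?"))
```

The commas around each spoken name play the same role as in the Perl version: they make the speech synthesizer pause around the punctuation.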


GUI file diff viewers round-up (for LaTeX)

December 14th, 2009 by eric

I was having a look for graphical file difference viewers in order to get a quick view of the differences between two LaTeX files. One of the things I was looking for was a tool that could handle inversions or rearrangements, places where the text has changed both in content and position within the document. I was expecting to find some tool that could show rearrangements like those we see in genomic sequences, but I found none that were free/open source. I understand that Araxis Merge ($$) can do this.
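
As a rough illustration of what a "linear" diff does with a rearrangement, here is a small Python sketch using the standard library's `difflib`. It is not how any of the viewers below work internally, and `moved_lines` is a hypothetical helper: it flags lines that the diff reports as both removed and added, which is what a moved paragraph looks like to a linear comparison.

```python
import difflib


def moved_lines(old, new):
    """Return lines that a linear diff reports as both removed and added."""
    removed, added = [], []
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # 'delete', 'insert' and 'replace' all contribute to one or both lists.
            removed.extend(old[i1:i2])
            added.extend(new[j1:j2])
    return [line for line in removed if line in added]


# Two versions of a document in which two sections swapped places.
old = ["Introduction", "Methods", "Results", "Discussion"]
new = ["Introduction", "Results", "Methods", "Discussion"]

print(moved_lines(old, new))
```

A swap of two sections shows up as one deletion plus one insertion of the same text rather than as a move. A rearrangement-aware tool would have to pair those up, and also cope with text that was edited while it was being moved, which this sketch does not attempt.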

I previously wrote about using ‘bzr’ for version control with LaTeX files. Some people recommend ‘git’, which is a distributed version control system like ‘bzr’. Others use ‘cvs’ or ‘svn’ but I wouldn’t recommend them because they aren’t distributed, so you don’t have the full repository and can’t commit when you’re working offline (AFAIK).

These are my impressions of the few GUI file diff viewers that I tried.

Meld
A GNOME based program.

  • + Allows line wrapping
  • + Highlights differences at line and word level
  • + In-line editing
  • + Clean, simple interface (hides gaps, though not very elegantly)
  • – Ugly, unchangeable color scheme
  • – Linear mapping (can’t handle rearrangements)

Kompare
A KDE based program.

  • – No line wrapping
  • + Highlights differences at line level and ~word level (may be able to use wdiff? didn’t try)
  • – No in-line editing
  • + Clean, simple interface (hides gaps)
  • + Good color scheme
  • – Linear mapping (can’t handle rearrangements)

TkDiff
A Tk program.

  • – No line wrapping
  • – Only highlights differences at line level by default (may be able to use wdiff? didn’t try)
  • – No in-line edits (opens up simple editor, but not to the right line)
  • – Ugly interface (Tk; leaves explicit gaps)
  • + Good color scheme
  • – Linear mapping (can’t handle rearrangements)

latexdiff
Not the same as the others here, latexdiff compares two files and merges them into a single .tex file which is then rendered in order to show the differences like ‘Track Changes’ in Word or OpenOffice. The output is very nice but the requirement that the document renders properly was a hindrance in one comparison I wanted to do. This would be good for submitting as the “changes” file required by some journals during manuscript revision in which you need to point out every change made between revisions.

  • + Highlights differences at word and higher levels
  • + Easy command line interface: latexdiff oldfile newfile > diff.tex
  • + Good color scheme
  • – Linear mapping (can’t handle rearrangements)

Here are a couple of good discussion threads on this and the related issue of collaboration:

Discussion on Debian Science List
Discussion on Ask Slashdot
Discussion on Academic Productivity
ScribTex: an online collaborative wiki-like LaTeX editor


Molecular modeling of extremophiles?

December 1st, 2009 by eric

Dr. Mikko Karttunen, from the University of Western Ontario, will be speaking at McMaster tomorrow in the Physics and Astronomy seminar. He worked on a recent paper titled Microscopic Mechanism for Cold Denaturation in which they used computer simulations to determine that the highly ordered structure of water at subzero temperatures caused protein denaturation because the shell water molecules were more strongly bound to each other than to the polar hydrogens on the protein exterior. I’m interested in this work from the perspective of life in extremely cold environments. Would the presence of salts inhibit this mechanism of denaturation? Ammonium? How does this change in solvation affect other biochemicals like lipids or polysaccharides like EPS (extracellular polymeric substances) which are thought to aid in the cryoprotection of microorganisms surviving at low temperatures? It might be interesting to re-run these models, doped with 1-7% Na+ + Cl-, the atomic concentrations of salt at seawater salinity and in sea ice brine at -23 deg Celsius.

In another paper from the same lab (Control of Calcium Oxalate Crystal Growth by Face-Specific Adsorption of an Osteopontin Phosphopeptide) they investigate biomineralization and protein-mineral interactions. This work is interesting from the Origins of Life perspective because the earliest biological systems were probably heavily reliant on minerals to perform the catalysis of reactions important to nascent biochemistry. Because one of the problems in identifying a pathway for the origin of life lies in constraining the near infinite number of possible chemical reactions that might take place under given reaction conditions, if it were possible to use molecular models to predict chemical reactions that would be likely to proceed on various mineral surfaces it would help limit the scope of all potential reactions that would need to be investigated in order to derive a ‘most parsimonious path’ from non-life to life.


How well can we predict the future of sea ice? And what do we do when the ice is gone?

November 11th, 2009 by eric

Sea Ice Report — Summary of 2009 Pan-Arctic Sea Ice Outlook

The amount of sea ice in the Arctic grows and shrinks every year as the seasons change. The largest extent is in late winter, after which it melts throughout the summer. The minimum annual sea ice extent generally occurs in mid-September. This year (2009) it occurred around September 16. As of today (November 11) the sea ice has been growing more slowly than usual and is now at a record minimum for this time of year.

The smallest Arctic sea ice extent ever observed was in 2007 but in the past two years it has rebounded. However, there is a long-term decline in September sea ice extent going back at least 30 years, so although this year’s September ice minimum is near that predicted by the 30-year declining trend, it falls well short of the actual September ice minimum in 1979.

Early in this year’s melt season (May) a number of sea ice scientists made predictions about the year’s September minimum. They used a variety of methods, ranging from simple extrapolation of the 10-year decline trend to complicated computer models involving existing sea ice and weather conditions, but all of them predicted smaller minimums than were actually observed. The report shows that we still have a lot to learn about predicting inter-annual variations in complex natural phenomena like weather conditions and sea ice extent. Some factors affecting our ability to make predictions include large inter-annual variability in the historical record, limitations in our ability to interpret satellite data from sea ice covered areas (see this report by Barber et al.), and the long-term decline in ice thickness (and thus ice volume) as thick multi-year ice melts and is replaced by thinner, younger ice.

Nevertheless, it is clear that summer sea ice is an endangered environment in the Arctic. I think we should use this information to re-assess our priorities for research in the polar regions, but it is not clear which direction we should take. Should we focus on learning about the biology and ecosystem functions present today in multi-year ice and summer sea ice, as a method of archiving the environment in the literature so that it will continue to live on even after it is gone? Or should we look to the Antarctic as a model for what to expect in the future of the Arctic, focusing our resources on predicting the responses of ecosystems and food webs to the ongoing transition from large summer sea ice extents to small? I’m not yet sure, but we will have to move fast to make these kinds of decisions because nature is not going to wait for us. By the time my generation retires there may not be any summer ice left to study in the Arctic.


Automated bacteria or virus counts in ImageJ

September 28th, 2009 by eric

THIS IS EXPERIMENTAL, UNPUBLISHED SOFTWARE. USE AT YOUR OWN RISK.

I’ve written a free, open source script for ImageJ (free, open source) to count viruses (or bacteria, but not both at the same time) automatically from JPEG image files. If you have TIFFs you can batch convert them to JPEG using ImageMagick (free, open source) with the following command:

mogrify -format jpg *.tif

You can download the script here as a text file (JVirusCount), or the full source is written below. Opened in ImageJ (after opening any image in the desired directory), it will iteratively adjust the noise threshold and use the “Find Maxima” command to count the number of dots in every image file in the desired directory.

Figure: viruses in Arctic seawater

The output of the script is a tab-delimited text file for each image summarizing the number of dots detected at each threshold, which can be input into a Matlab (not free, closed source) script (MVirusCount) using an external function, regress2lines (free, open source). I haven’t tried using it with Octave (free, open source) but if you get it to work let me know. What I’ve found is that there is a significant change in the slope of this curve during the transition from measuring ‘noise’ to ‘particles’, but that it depends on the quality, brightness, and contrast of the image; MVirusCount determines the intersection of those two lines and the abundance of dots at that point. A program finding the maximum of the first derivative of the curve would likely work just as well or better.
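
The regress2lines function is not reproduced here, so as a rough illustration of the two-line idea (and not of how regress2lines itself works), here is a Python sketch that tries every split point, fits a straight line to each side by least squares, keeps the split with the smallest total squared error, and reports where the two lines cross. The function names are made up for this example, and the curve is synthetic, built from two known lines that meet at a noise value of 10, so it says nothing about real images.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, sum of squared residuals)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def two_line_breakpoint(xs, ys, min_pts=3):
    """Fit two lines around every candidate split and return their best intersection (x, y)."""
    best = None
    for k in range(min_pts, len(xs) - min_pts + 1):
        m1, b1, e1 = fit_line(xs[:k], ys[:k])
        m2, b2, e2 = fit_line(xs[k:], ys[k:])
        if best is None or e1 + e2 < best[0]:
            best = (e1 + e2, m1, b1, m2, b2)
    _, m1, b1, m2, b2 = best
    # Assumes the two fitted slopes differ; parallel lines would never cross.
    x_cross = (b2 - b1) / (m1 - m2)
    return x_cross, m1 * x_cross + b1


# Synthetic log(count) curve: a steep 'noise' line, then a shallow 'particles' line.
noise = list(range(1, 41))
logcount = [8.0 - 0.5 * n if n <= 10 else 3.5 - 0.05 * n for n in noise]

x_cross, y_cross = two_line_breakpoint(noise, logcount)
print("noise threshold at intersection:", x_cross)
print("estimated count:", math.exp(y_cross))
```

Taking the exponential of the y value at the intersection mirrors the Matlab script below, which works on the log of the counts.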

Example graph of output


As seen below, the program has the lowest relative error at high concentrations of viruses, and should probably not be used at concentrations less than 100 viruses per field without further testing. I should also note that the samples shown above are field samples from an extreme environment, so usage in a laboratory setting may be more precise.

computer counting of viruses

JVirusCount

// List the files in the directory of the currently open image
list = getFileList(File.directory);

for (i=0; i<list.length; i++) {
    run("Clear Results");
    run("Set Measurements...", "  decimal=9");

    // Count maxima at each noise tolerance from 1 to 41
    for (noise=1; noise<42; noise++) {
        run("Find Maxima...", "noise=" + noise + " output=Count");
        rows = nResults-1;
        setResult("Noise", rows, noise);
        counts = getResult("Count", rows);
        setResult("logcount", rows, log(counts));
    }

    // Save one results table per image, then move on to the next file
    title = getTitle();
    saveAs("Measurements", File.directory + title +".csv");
    run("Open Next");
}

MVirusCount

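clear all;
d=dir('*jpg.csv');   % one results file per image, as saved by JVirusCount

for k=1:length(d);

  fname=d(k).name;
  counts = dlmread(fname,'\t',1,0);   % skip the header row
  counts(end,:)=[];                   % drop the last row
  counts(:,end)=[];                   % drop the last column

  % from 10:end because it has sigmoidal shape screwing things up
  [m, R, idiv, G] = regress2lines(counts(10:end,3),counts(10:end,4));

  % end points of the two fitted lines, for plotting; as used here, m(1:2) and
  % m(3:4) are the slope and intercept of each line and m(5) is where they meet
  xy = [1, m(5); (m(1)*1+m(2)), (m(1)*m(5)+m(2)); m(5), 40; (m(3)*m(5)+m(4)), (m(3)*40+m(4));];

  store_fname{k,1} = fname;
  store_intercept(k,1) = m(5);
  store_count(k,1) = exp((m(1)*m(5)+m(2)));   % count at the intersection (undo the log)

  %  subplot(1,length(d),k)
  hold off
  plot(counts(:,3),counts(:,4),'bs')
  hold on
  plot(xy(1,:),xy(2,:),'b-')
  plot(xy(3,:),xy(4,:),'r-')
  title({fname});

  drawnow
  pause   % wait for a key press before the next file

end

% append one tab-separated row per image to output.csv
fid = fopen('output.csv','a');
for k=1:length(store_fname);
  fprintf(fid,'%s\t%0.5g\t%0.5g\n', store_fname{k}, store_intercept(k), store_count(k));
  %  sprintf('%s\t%0.5g', store_fname{k}, store_intercept(k))
end
fclose(fid);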
