Notes on an Emacs workflow for academic documents

I've put together a workflow for authoring in plain text using Emacs. These are my notes:

1. Why author in Emacs?

The sociologist Kieran Healy argues that Emacs is a very useful tool if you want to present research results in a professional, clear, and well-organized manner. Frankly, to get to the point where this is true, one must climb a steep learning curve. I think the productivity gains make it worth the effort.

I do a lot of document preparation in Markdown. I can move documents from Markdown to LaTeX, a sophisticated document preparation system favored by many academics, using Pandoc. There is a "Pandoc mode" available for Emacs. Importantly, I can add LaTeX preamble commands to a YAML snippet at the top of each document, like so:

---
title: 'An Example Title.'
author: "Nick Judd"
abstract: "Paper abstract."
date: "February 15, 2016"
classoption: titlepage
header-includes:
- \usepackage{txfonts}
and so on ...

---

I can also add math, which will be typeset the "right way." This is one of the biggest perks of using something that is built on top of TeX. For instance, I can write $a^2 + b^2 = c^2$ and it will be converted from Markdown into LaTeX the same way it is converted using MathJax on this site: \(a^2 + b^2 = c^2\).

Bibliographies and citations

Academic work involves fastidious recordkeeping about external sources. I prefer Zotero because it has two key features:

  1. Its Firefox integration downloads a journal article citation, the PDF (where available), and relevant metadata with one click, so it is very easy to get data in;

  2. It has many options for generating citations in different formats and for exporting entire bibliographies, so it is easy to get data out, both for research purposes and to make sure no software can hold my work hostage.

Several packages integrate Zotero with Pandoc and Emacs. The one that works for me is zotxt. I use zotxt to satisfy another set of criteria:

  1. On-the-fly bibliography management;

  2. On-the-fly citations: I don't want to do any cutting and pasting or hunting around for the right citation key to use;

  3. Relative ease of use.

Professional results

Tools like ggplot2 and stargazer in R or pandas, seaborn, and statsmodels in Python generate beautiful tables and charts. They can also be used to generate raw LaTeX output suitable for integrating directly into a document.

When a document has many tables and figures, dropping generated LaTeX output into Markdown or LaTeX is easy to do and quick to replicate; building the same tables by hand in Word would start to get tedious.
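To make the idea of "raw LaTeX output" concrete, here is a minimal sketch in plain Python. It is a stdlib-only stand-in, not what stargazer or statsmodels actually emit; in practice pandas's `DataFrame.to_latex` covers the same ground. The function name `latex_table` and the numbers in the example are made up for illustration.

```python
def latex_table(header, rows, caption):
    """Render a header and rows as a LaTeX table environment (illustrative stand-in)."""
    def escape(cell):
        # Escape a few LaTeX special characters that commonly appear in labels.
        text = str(cell)
        for char in ("&", "%", "_", "#"):
            text = text.replace(char, "\\" + char)
        return text

    # First column left-aligned (labels), remaining columns right-aligned (numbers).
    column_spec = "l" + "r" * (len(header) - 1)
    lines = [
        r"\begin{table}",
        r"\centering",
        r"\caption{" + escape(caption) + "}",
        r"\begin{tabular}{" + column_spec + "}",
        r"\hline",
        " & ".join(escape(h) for h in header) + r" \\",
        r"\hline",
    ]
    for row in rows:
        lines.append(" & ".join(escape(c) for c in row) + r" \\")
    lines += [r"\hline", r"\end{tabular}", r"\end{table}"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical values, purely for illustration.
    print(latex_table(
        ["Variable", "Estimate", "Std. error"],
        [["Intercept", 1.25, 0.10], ["x_1", -0.40, 0.05]],
        caption="Example regression output",
    ))
```

The printed text can be pasted into a Markdown file as-is; Pandoc passes raw LaTeX through when the output format is LaTeX or PDF.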

2. How-to

Installation

I'll assume you have figured out how to download and install the following:

For some reason my Emacs and Pandoc did not come with citeproc. In Fedora, I installed it using the dnf package manager like this: sudo dnf install pandoc-citeproc.x86_64. You might consider Kieran Healy's Emacs Starter Kit if acquiring everything seems mystifying. (Zotero should be pretty straightforward to get separately.)

Now, the part that is especially non-obvious:

  1. Go to zotxt's Firefox Add-ons page, right-click on "Add to Firefox," and save the file.
  2. With Zotero Standalone open, go to Tools -> Add-ons and select "Install Add-on From File." I found this option by clicking on the crossed hammer and screwdriver at upper right; perhaps your interface is a little different. In the dialog box, select the file you just downloaded.
  3. Again in Zotero Standalone, go to Preferences -> Export and change the "Default Output Format" for QuickCopy to "Easy Citekey." If that option doesn't exist, restart Zotero Standalone and try again; it might need to be restarted for it to pick up on the fact that zotxt is there now.

The Easy Citekey format is Pandoc-friendly; you'll see how it works a little later.

Now you should have all of the tools you need.

The Workflow

  1. Try out the Emacs interactive tutorial if you haven't used Emacs before. The tutorial is comprehensive, assumes no prior knowledge, and is very user-friendly.
  2. Author your work in an Emacs file in Markdown. With Emacs open, you can open a new file by holding down Control and pressing x, then holding down Control and pressing f. Emacs users call these "chords" and would abbreviate that combo like this: C-x C-f. Notice an open "minibuffer" at the bottom of the Emacs screen; you can use this to specify the name of the file you want to open or create. For the basic syntax, see the Markdown syntax guide; for how to use YAML metadata blocks with Pandoc, see the Pandoc manual.
  3. Adding a citation: With Zotero Standalone open and while working in Emacs, place your cursor where you want the citation to go and use this keychord: C-c " k (Control-c, then shift-single-quote for double quotes, then the letter k).
  4. Notice the minibuffer at the bottom of the screen is now active; it wants you to hit Enter, so do that.
  5. In the next prompt in the minibuffer, type the author's last name and the date of the citation you want and hit Enter.
  6. Use the up and down arrow keys to navigate to the correct citation and hit Enter when you've found it. You should now see a citekey in your document beginning with the "@" symbol. For Pandoc to make use of this, the ref must be in brackets ([]). You can additionally add multiple citations by separating them with semicolons (;) so the whole package looks like this: [@mayhew:1974;@fenno:1978].
  7. Generating a document: After saving the Markdown file, open up a terminal, console, or shell window. Navigate to the appropriate directory and run Pandoc from the command line, invoking both zotxt and citeproc like this: pandoc -F pandoc-zotxt -F pandoc-citeproc -s inputfile.md -o outputfile.pdf.
  8. Notice that the bibliography has been inserted at the end of the document by default. You could also output LaTeX and edit that file to customize where your bibliography goes.
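If you rebuild documents often, steps 6 and 7 can be wrapped in a few lines of Python. This is only a sketch under assumptions: the function names (`citation_group`, `pandoc_command`, `render`) are hypothetical, and `render` assumes that pandoc, both filters, and Zotero with zotxt are installed and running as described above.

```python
import subprocess


def citation_group(citekeys):
    """Join citekeys into one bracketed Pandoc citation, e.g. [@a;@b]."""
    return "[" + ";".join("@" + key for key in citekeys) + "]"


def pandoc_command(input_path, output_path):
    """Build the argument list for the Pandoc call shown in step 7."""
    return [
        "pandoc",
        "-F", "pandoc-zotxt",     # the zotxt filter named in step 7
        "-F", "pandoc-citeproc",  # the citeproc filter named in step 7
        "-s",                     # standalone document
        input_path,
        "-o", output_path,
    ]


def render(input_path, output_path):
    """Run Pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(input_path, output_path), check=True)


if __name__ == "__main__":
    # Prints the citation text from step 6 without calling Pandoc.
    print(citation_group(["mayhew:1974", "fenno:1978"]))
```

Calling `render("inputfile.md", "outputfile.pdf")` runs the same command as step 7. Passing the arguments as a list, rather than one shell string, avoids quoting problems with file names that contain spaces.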

Troubleshooting and Customization

Sometimes citeproc won't pick up on a reference because it is a duplicate. You can fix this by specifying a citekey of your own within Zotero. If a reference isn't showing up, open up Zotero and add a citekey as a tag. See the section "Creating a custom citekey" here.
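When a reference goes missing, it can help to list every citekey the Markdown file actually uses, so you know which entries to check in Zotero. The sketch below is a rough helper of my own, not part of zotxt or Pandoc; the regular expression is a simplification of Pandoc's citekey rules and the function name `citekeys_in` is hypothetical.

```python
import re
from collections import Counter

# "@" followed by a key made of letters, digits, "_", ":", "." or "-".
# The lookbehind skips e-mail addresses such as user@example.com.
CITEKEY = re.compile(r"(?<![\w@])@([A-Za-z0-9_][A-Za-z0-9_:.\-]*)")


def citekeys_in(markdown_text):
    """Count how often each citekey appears in a Markdown string."""
    # Drop punctuation that belongs to the sentence, not to the key.
    keys = (match.rstrip(".:-") for match in CITEKEY.findall(markdown_text))
    return Counter(key for key in keys if key)


if __name__ == "__main__":
    sample = "As [@mayhew:1974;@fenno:1978] argue, and again in @fenno:1978."
    for key, count in sorted(citekeys_in(sample).items()):
        print(key, count)
```

To check a real file, read it with `open("inputfile.md", encoding="utf-8").read()` and pass the result to `citekeys_in`.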

The syntax for invoking zotxt from Pandoc is adapted from Warren Knight's recipe.

Improve This Workflow

I'd love to hear about other productivity hacks that beat this procedure for speed or ease of use.