Markdown-Based Scientific and Computational Note Taking with Obsidian

Motivation

Over the last year, being in the Reed Research Group, I have been exposed to new ideas more rapidly than I can manage. During this period, I was filling my laptop with countless .docx, .pdf, .txt, and .html files, scattered semi-sporadically across different cloud and local storage locations.

Finding a good method for organizing my notes and documentation has been very beneficial for my productivity, processing of new ideas, and ultimately staying sane. For me, this has been done through Obsidian.

Obsidian is a free Markdown (.md) based note-taking and organization tool (the app itself is not open source, but your notes are plain .md files), whose strengths include:

  • Connected and relational note organization
  • Rendering code and mathematical (LaTeX) syntax
  • Community-developed plugins
  • Light-weight
  • A nice aesthetic

The fact that all notes are simply .md files is very appealing to me. I don’t need to waste time searching through different local and cloud directories for my notes, and can avoid having OneNote, Notepad, and 6 Word windows open at the same time.

Also, I like having localized storage of the notes and documentation that I am accumulating. I store my Obsidian directory on GitHub, and push-pull copies between my laptop and desktop, giving me the security of having three copies of my precious notes at all times.

I’ve been using Obsidian as my primary note-taking app for a few months now, and it has become central to my workflow. In this post, I show how Obsidian extends the utility of Markdown features, helps to organize topical .md libraries, and generally makes the documentation process more enjoyable. I also try to explain why I believe Obsidian is so effective for researchers or code developers.

Note Taking in Markdown

Markdown (.md) text is not just an efficient and effective way of documenting code (e.g., README.md files); it is also great for writing tutorials (e.g., in Jupyter Notebook), taking notes, and even designing a website (as demonstrated in Andrew’s Blog Post last week).

In my opinion, the beauty is that only a few syntax tricks are necessary for producing a very clean .md document. This is particularly relevant as a programmer, when notes often require mathematical and code syntax.

I am not going to delve into Markdown syntax. For those who have not written in Markdown, I suggest you see the Markdown Documentation here. Instead, I focus on how Obsidian enables a pleasant environment in which to write .md documentation.

Obsidian Key Features

Below, I hit on what I view as the key strengths of Obsidian:

  • Connected and relational note organization
  • Rendering code and mathematical (LaTeX) syntax
  • Community-developed plugins

Relational Note Organization

If you take a quick look at any of the Obsidian forums, you will come across this word: Zettelkasten.

Zettelkasten, meaning "slip-box" in German and also referred to as card file, is a note taking strategy which encourages the use of many highly-specific notes which are connected to one another via references.

Obsidian is designed to support this style of note taking, with the goal of building what its developers refer to as a "second brain". The aim of relational note taking is to store not just information about a single topic but rather a network of connected information.

To this end, Obsidian helps to easily navigate these connections and visualize relationships stored within a Vault (which is just a fancy word for one large folder directory). Below is a screen cap of my note network developed over the last ~4 months. On the left is a visualization of my entire note directory, with a zoomed-in view on the right.

Notice the scenario discovery node, which has direct connections to methodological notes on Logistic Regression, Boosted Trees, PRIM, and literature review notes on Bryant & Lempert 2010, a paper which was influential in advocating for participatory, computer-assisted scenario discovery.

Each of these nodes is then connected to the broader network of notes and documentation.

These relations are easily constructed by linking other pages within the directory, or subheadings within those pages. When writing the notes, you can then click through the links to flip between files in the directory.

Links to Internal Pages and Subheadings

Referencing other pages (.md files) in your library is done with a double square bracket on either side: [[Markdown cheatsheet]]

You can get down to finer resolution and reference specific sub-headings within a page by adding a hash (#) for each heading level to the internal link, such as: [[Markdown cheatsheet#Basics#Bold]]

Tags and Tag Searches

Another tool that helps facilitate relational note taking is the use of #tags. Attach a # to any word within any of your notes, and that word becomes connected to other instances of the word throughout the directory.

Once tags have been created, they can be searched. The following Obsidian syntax generates a live list of occurrences of that tag across your entire vault:

```query
tag: #scenarios
```

Which produces the window:

Rendering Code and Math Syntax

Language-Specific Code Snippets

Obsidian will beautifully stylize code snippets using language-specific formatting, and if you don’t agree, then you can change your style settings.

A block of code for a specific language is written using the triple-backtick syntax:

```language
<Enter code here>
```

The classic HelloWorld() function can be stylistically rendered in Python or C++:

LaTeX

As per usual, the $ characters are used to render LaTeX equations. Single $ characters produce inline equations ($<Enter LaTeX>$), while double $$ characters produce centered equations.

Obsidian is not limited to short LaTeX equations, and has plugins designed to allow for the inclusion of other LaTeX packages or HTML syntax.

$$
\phi_{t,i} = \left\{ \begin{array}{ll}
\phi_{t-1,i} + 1 & \text{if } z_t < \text{limit} \\
0 & \text{otherwise.}
\end{array} \right.
$$

will produce:

Plugins

At the time of writing, Obsidian boasts an impressive 667 community-developed plugins with a wide variety of functionality. A glimpse at the webpage shows plugins that give more control over the visual interface, allow for alternative LaTeX environments, or allow for pages to be exported to various file formats.

Realistically, I have not spent a lot of time working with the plugins. But, if you are someone who likes the idea of a continuously evolving and modifiable documentation environment then you may want to check them out in greater depth.

Conclusion: Why did I write this?

This is not a sponsored post in any way; I just like the app.

When diving into a new project or starting a research program, it is critical to find a way of taking notes, documenting resources, and storing ideas that works for you. Obsidian is one tool which has proven to be effective for me. It may help you.

Best of luck with your learning.

Make LaTeX easier with custom commands

LaTeX is a powerful tool for creating professional looking documents. Its ability to easily format mathematical equations, citations and complex figures makes LaTeX especially useful for developing peer-reviewed journal articles and scientific reports. LaTeX is highly customizable, which allows you to create documents that are not carbon copies of generic Microsoft Word templates.

Using LaTeX does have its drawbacks: instead of simply typing on a page, you construct the document by writing LaTeX code. Once you’ve written your code, a compiler translates it into a finished and formatted document. This can sometimes result in high overhead time for fixing bugs and managing format. But coding a document also has advantages. In addition to the vast array of existing LaTeX packages and commands, you can create your own custom commands that speed up the writing and formatting process. Below I’ll overview the basics of creating custom LaTeX commands and provide some illustrative examples.

Commands with no arguments

If you have an equation or a complex sequence of text that you know you’ll be repeating, you can create a custom command to produce it. For example, if I’m constantly referencing the equation for an Ordinary Least Squares (OLS) estimator, I can make a new command that produces it:

\newcommand{\OLS}{$\hat{\beta}=(X^TX)^{-1}X^Ty$}

There are three parts to defining this command, as shown in the figure below:

  1. Tell LaTeX you are defining a new command by specifying “\newcommand”
  2. Name the command (make sure to include the backslash)
  3. Specify the output of the new command

Example LaTeX code that calls the OLS command:

I can store complex terms using a predefined command: \OLS
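
For context, here is a minimal sketch of a complete document around that call. The second command, \OLSm, is a hypothetical variant (its name is my assumption, not part of the original example): because the original definition opens math mode itself with $...$, it is only safe in running text, whereas wrapping the body in \ensuremath lets the same command be used inside an equation as well.

```latex
\documentclass{article}

% Original definition: switches into math mode itself, so use it in running text only.
\newcommand{\OLS}{$\hat{\beta}=(X^TX)^{-1}X^Ty$}

% Hypothetical variant (the name \OLSm is an assumption for this sketch):
% \ensuremath makes the command safe both in text and inside math mode.
\newcommand{\OLSm}{\ensuremath{\hat{\beta}=(X^TX)^{-1}X^Ty}}

\begin{document}
I can store complex terms using a predefined command: \OLS

The variant also works inside a displayed equation:
\[ \OLSm \]
\end{document}
```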

Compiled output:

Commands with basic arguments

Single argument

You can also define commands that accept arguments. For example, if you want to make commands to assist tracking changes in a document, you can create a command that formats a section of added text so it has the color blue and is bolded:

\newcommand{\addtxt}[1]{{\color{blue} \textbf{#1}}} % Highlight text that has been added

The command defined above accepts one argument (shown as the “[1]”) and calls that argument using “#1”, as highlighted in the figure below:

Example LaTeX code using my command:

Demonstrating my custom commands with arguments: \addtxt{This text has been inserted into this sentence}
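
One detail worth spelling out is that \color is not available in a bare document; it comes from a color package. The sketch below assumes xcolor (the choice of package is my assumption, since the example above does not say which one is loaded) and otherwise reuses the definition unchanged.

```latex
\documentclass{article}
\usepackage{xcolor} % provides \color; assumed here, any package defining \color would do

\newcommand{\addtxt}[1]{{\color{blue} \textbf{#1}}} % Highlight text that has been added

\begin{document}
Demonstrating my custom commands with arguments: \addtxt{This text has been inserted into this sentence}
\end{document}
```

The extra pair of braces inside the definition keeps the color change local, so text after the command returns to the default color.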

Example compiled output:

Multiple arguments

You can also define commands with multiple arguments. For example, you can create a template sentence that provides an update on the timing of a project:

\newcommand{\projReportA}[2]{The project was planned to finish on \textbf{#1}; after reviewing current progress, we have determined that it will likely finish on \textbf{#2}} % Insert the date the project was planned to be completed and the date it will likely be completed

Here, argument #1 is the date when the project was planned on being completed, and argument #2 is the date that the project will likely be completed.

Example use of this command:

Another way you can use an argument:\\ \\
\projReportA{September $9^{th}$}{October $1^{st}$}

Example compiled output:

Commands with optional arguments

The project report command above can be modified to accept a default completion date with an option to include an updated date.

\newcommand{\progReportB}[2][September 9th]{The project was planned to finish on \textbf{#2}, after reviewing current progress, we have determined that it will likely finish on \textbf{#1}}

To create an optional argument, specify the default value of the first argument in a second set of square brackets. Note that in basic LaTeX this only works for a single optional argument, and it must be the first one (#1); for more defaults you can use a package such as xparse.
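
As a sketch of what the xparse route can look like, the command below takes two optional arguments and one mandatory argument. The command name \progReportC, the second optional argument (a reviewer), and its default are all invented for illustration; only the \NewDocumentCommand syntax itself comes from xparse, where O{...} declares an optional argument with a default and m declares a mandatory one.

```latex
\documentclass{article}
\usepackage{xparse} % on LaTeX kernels from 2020-10 onward \NewDocumentCommand is built in

% \progReportC is a hypothetical name used only for this sketch.
% #1: likely finish date (optional, default: September 9th)
% #2: who reviewed progress (optional, default: the team) -- invented for illustration
% #3: planned finish date (mandatory)
\NewDocumentCommand{\progReportC}{O{September 9th} O{the team} m}{%
  The project was planned to finish on \textbf{#3}; after a review by #2,
  we have determined that it will likely finish on \textbf{#1}}

\begin{document}
\progReportC{September 9th}                      % both defaults used

\progReportC[October 1st]{September 9th}         % first optional argument given

\progReportC[October 1st][Alice]{September 9th}  % both optional arguments given
\end{document}
```

Optional arguments are still positional, so the second one can only be supplied if the first one is given too.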

Here’s an example using this new command with the default argument:

Here I'll use the command without the optional argument, so it will print the default: \\
\\
\progReportB{September $9^{th}$}

This will compile to:

Here’s an example with the optional argument specified:

Now I'll add the optional argument, which will be added in place of the default: \\ \\
\progReportB[October $1^{st}$]{September $9^{th}$}

This will compile to:

Concluding thoughts

These simple examples only scratch the surface of what you can do with LaTeX commands. I should also note that while custom commands are useful, LaTeX also contains a large suite of packages with predefined commands that can be easily imported into your document.

Helpful LaTeX resources:

R-Markdown

What Is R-Markdown? Why Are We Interested in It?

A few years ago, a very dear friend of mine told me about R-Markdown. I was working on a report, and he said that I should try this very cool tool. I did so but not immediately. I started with an “if it ain’t broke, don’t fix it” attitude. However, I quickly realized that R-Markdown really is helpful—well, at least in many situations.

What is R-Markdown? It is a script-based text-development platform for preparing high-quality papers and reports. This powerful tool is effective for complicated documents that have various types of diagrams and tables. R-Markdown is an extension of the Markdown language for R. More information about Markdown can be found here.

Personally, I’ve found R-Markdown to be a powerful tool for creating tutorial documents that include figures, tables, blocks of code, and more. R-Markdown can also be very helpful for working on papers; you can have everything in the same place. For example, as you will see in this tutorial, you can generate your figures and tables within documents. Because it is script-based, R-Markdown is reproducible; you will always get the same text format and figure quality. Therefore, if you want to have a professional-looking CV or are working on a paper or report, I suggest giving R-Markdown a try. The tool might become your new best friend.

Install R-Markdown

There are two steps to install R-Markdown:

1- Install and load the “rmarkdown” package:

install.packages("rmarkdown")
library(rmarkdown)

2- To knit to PDF you also need a LaTeX distribution. The “tinytex” package can install a lightweight one (TinyTeX). Install the package first, then use it to install TinyTeX:

install.packages("tinytex")
tinytex::install_tinytex()
library(tinytex)

Create an R-Markdown Document

To create your first R-Markdown document, start by installing R-Markdown. Then, open the “File” menu, and click on “New File.” From the dropdown menu, select “R-Markdown.” Doing so will open an R-Markdown file in your RStudio. The file comes with very simple and informative instructions.

On a side note, I use RStudio, which is a popular and user-friendly integrated development environment (IDE) for R. You can find more information about it (here).

Publish your document

The final format of your output document can be PDF, HTML, or Word. To select your preferred output and generate your final document, click on “Knit,” which opens a dropdown menu. Select the output format (for example, PDF), and RStudio will generate your document.

Components of an R-Markdown Document

R-Markdown documents usually include meta-data, text, and code chunks. The following sections briefly describe the components, and more information can be found on R-Markdown’s website.

Meta-Data

When generating documents, R-Markdown requires some initial information and instructions. These can include general data about the documents—for example, date, title, output format, and author’s name.

Text

Text parts in R-Markdown follow the tradition of other document-markup languages such as LaTeX (see here). However, R-Markdown is easier than LaTeX. Basically, authors use short markup symbols to adjust document formatting. Many details can be listed about R-Markdown’s text-formatting commands, but I am not going to explain them in this short tutorial. These cheat sheets here and here provide enough information to get you started on writing an R-Markdown document. A few examples: a leading # creates a header, and [text](url) creates a hyperlink. You can use $ to insert equations (e.g. $y=ax^{2}+ bx +c$).

Code Chunks

Different types of code chunks can be used in R-Markdown; the types depend on application of the code. You might want to show your code when you develop an instruction. You can also write a code solely for generating a figure, but you may not want to show the code itself. The following is a generated timeline figure of Michael Jackson’s life; you can see the code.

#You need to uncomment ``` lines

#```{r timeline}

# This code chunk generates a timeline of Michael Jackson's life.
# mj_life is an example dataset that ships with the timelineS package.

library(timelineS)
timelineS(mj_life, main = "Life of Michael Jackson", label.cex = 0.7)

#```

Adding Tables

There are different libraries available in R for generating nice-looking tables. Here I use knitr.

#You need to uncomment ``` lines

#```{r table}

library(ggplot2)

library(knitr)
kable(mpg[1:8,])

#```

Adding Figures to Your Document

R-Markdown allows you to generate plots in your documents. For example, you can use ggplot2, which is a powerful figure-creation library in R, to create and insert a plot into your document. See the following. If you add echo = FALSE to the header of your code chunk, the code will not appear in your final PDF file.

#You need to uncomment ``` lines

#```{r ggplot} 

library(ggplot2)

# MPG dataset is already available in ggplot2, I use it to generate the following figure

ggplot(mpg, aes(x=cyl, y=cty)) + geom_boxplot(aes(fill=factor(cyl))) + 
    labs(title="Mileage vs Number of Cylinders", 
         x="Number of Cylinders",
         y="City Mileage",
         fill="Cylinders")  # the fill legend shows cylinder count, not mileage

#```

Embedding figure text into a Latex document

Oftentimes we have to create plots and schematic drawings for our publications. These figures are then included in the final document either as bitmaps (png, jpeg, bmp) or as vectorized images (ps, eps, pdf). Some inconveniences that arise from this process and show up in the final document are:

  • Loss of image quality due to resizing the figure (bitmaps only)
  • Different font type and size from the rest of the text
  • Limited resizing possibility due to text readability
  • No straight-forward method to add equations to the figure

If the document is being created in LaTeX, it is possible to overcome all these inconveniences by exporting your figure into either svg or postscript formats and converting it into pdf+Latex format with Inkscape. This format allows the LaTeX engine to understand and treat figure text as any other text in the document and the lines and curves as a vectorized image.

EXPORTING FIGURE

The process for creating a PDF+LaTeX figure is described below:

1 – Create your figure and save it in either svg or postscript format. Inkscape, Matlab, GNUPlot, and Python are examples of software that can export at least one of these formats. If your figure has any equations, remember to type them in LaTeX format in the figure.

2 – Open your figure with Inkscape, edit it as you see necessary (figure may need to be ungrouped), and save it.

3.0 – If you are comfortable with using a terminal and the figure does not need editing, open a terminal pointing to the folder where the figure is and type the following command (without the $). If this works, you can skip steps 3 and 4 and go straight to step 5. Note that this is the Inkscape 0.9x syntax; in Inkscape 1.x the --export-pdf option was replaced by --export-filename.

$ inkscape fig1.svg --export-pdf fig1.pdf --export-latex

3 – Click on File -> Save As…, select “Portable Document Format (*.pdf)” as the file format, and click on Save.


4 – On the Portable Document Format window that will open, check the option “PDF+LaTeX: Omit text in PDF, and create LaTeX file” and click on OK.


Inkscape will then export two files, both with the same name but one with pdf and the other with pdf_tex extension. The pdf file contains all the drawing, while the pdf_tex contains all the text of the figure and calls the pdf file.

5 – In your LaTeX document, include package graphicx with the command \usepackage{graphicx}.

6 – To include the figure in your document, use \input{your_figure.pdf_tex}. Do not use the conventional \includegraphics command on the pdf file; otherwise you will end up with an error or with a figure with no text. If you want to scale the figure, type \def\svgwidth{240bp} in the line before the \input command, where 240bp is the desired figure width in big points (1 bp = 1/72 inch). Do not use the conventional [scale=0.5] option, which would cause an error. Some instructions are available in the first lines of the pdf_tex file, which can be opened with a regular text editor such as Notepad.
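
Steps 5 and 6 can be sketched as a small complete document. This assumes the exported files are named fig1.pdf and fig1.pdf_tex (as in the terminal command above) and sit in the same folder as the .tex file; the caption text and label name are placeholders of my own.

```latex
\documentclass{article}
\usepackage{graphicx} % required by the generated .pdf_tex file

\begin{document}

\begin{figure}[htbp]
  \centering
  \def\svgwidth{240bp}   % desired figure width; a length such as 0.8\textwidth also works
  \input{fig1.pdf_tex}   % file exported by Inkscape; it pulls in fig1.pdf itself
  \caption{Figure text is typeset by \LaTeX{}, so it matches the document font.}
  \label{fig:pdftex-example}
\end{figure}

\end{document}
```

Because \def\svgwidth sits inside the figure environment, the width setting stays local to this figure and does not leak into later ones.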

Below is a comparison of what the same figure would look like in the document if exported in PDF+LaTeX and png formats. It can be seen that the figure created following the procedure above looks smoother and its text style matches that of the paragraphs above and below, which is more pleasant to the eyes. Also, the text can be marked and searched with any pdf viewer. However, the user should be aware that, since text font size is not affected by the scaling of the figure, some text may end up bigger than boxes that are supposed to contain it, as well as too close to or too far from lines and curves. The former can be clearly seen in the figure below. This, however, can be easily fixed with software such as Inkscape and/or with the editing tips described in the following section.

[Figure: the same plot exported as PDF+LaTeX and as png]

TIPS FOR TEXT MANIPULATION AFTER FIGURE IS EXPORTED

If you notice a typo or poorly positioned text in the figure after it has been exported and inserted in your document, there is an easier way of fixing it than exporting the figure again. If you open the pdf_tex file (your_figure.pdf_tex) with a text editor such as Notepad, you can change any text and its position by changing the parameters of the \put commands inside the \begin{picture}\end{picture} LaTeX environment.

For example, it would be better if the value 1 in the y and x axes of the figures above would show as 1.0, so that its precision is consistent with that of the other values. The same applies to 2 vs. 2.0 in the x axis. This can be fixed by opening file fig1.pdf_tex and replacing lines:

\put(0.106,0.76466667){\makebox(0,0)[rb]{\smash{1}}}%
\put(0.53916667,0.0585){\makebox(0,0)[b]{\smash{1}}}%
\put(0.95833333,0.0585){\makebox(0,0)[b]{\smash{2}}}%

by:

\put(0.106,0.76466667){\makebox(0,0)[rb]{\smash{1.0}}}%
\put(0.53916667,0.0585){\makebox(0,0)[b]{\smash{1.0}}}%
\put(0.95833333,0.0585){\makebox(0,0)[b]{\smash{2.0}}}%

Also, one may think that the labels of both axes are too close to the axes. This can be fixed by replacing lines:

\put(0.02933333,0.434){\rotatebox{90}{\makebox(0,0)[b]{\smash{$x\cdot e^{-x+1}$}}}}%
\put(0.539,0.0135){\makebox(0,0)[b]{\smash{x}}}%

by:

\put(0.0,0.434){\rotatebox{90}{\makebox(0,0)[b]{\smash{$x\cdot e^{-x+1}$}}}}%
\put(0.539,0.0){\makebox(0,0)[b]{\smash{x}}}%

With the modifications described above and resizing the legend box with Inkscape, the figure now would look like this:

[Figure: the plot after the edits described above]

Don’t forget to explore all the editing features of Inkscape. If you export a figure from GNUPlot or Matlab and ungroup it with Inkscape into small pieces, Inkscape will give you the freedom to rearrange and fine-tune your plot.

PDFExtract: Get a list of BibTeX references from a scholarly PDF

So you’ve found a review article with a great list of references that you’d like to include in your own paper/thesis/etc. You could look them up, one-by-one, on Google Scholar, and export the citation format of your choice. (You could also retype them all by hand, but let’s assume you’re savvy enough to use some kind of citation manager).

This is not a great use of your time.

Check out PDFExtract, a Ruby library written by folks at CrossRef. Its goal is to read text from a PDF, identify which sections are “references”, and return this list to the user. As of recently, it has the ability to return a list of references in BibTeX format after resolving the DOIs over the web. When the references in the PDF are identified correctly (about 80-90% of the time in my experience), you’ll now have all the references from that paper to do with as you please—to cite in LaTeX, or import to Zotero, etc.

How to use it

You will need a recent version of Ruby and its gem package manager. Search around for how to do this on your particular OS. As usual, this will be a lot easier on *nix, but I have it working in Cygwin too so don’t despair.

The latest version of PDFExtract (with BibTeX output) is not on the central gem repository yet, but for now you can build and install from source:

git clone https://github.com/CrossRef/pdfextract
cd pdfextract
gem build pdf-extract.gemspec
gem install pdf-extract-0.1.1.gem  # check version number

You should now have a program called pdf-extract available from the command line. Navigate to a directory with a PDF whose references you’d like to extract, and run the following:

pdf-extract extract-bib --resolved_references MyFile.pdf

It will take a minute to start running, and then it will begin listing the references it finds, along with their resolved DOIs from CrossRef’s web API, like so:

Found DOI from Text: 10.1080/00949659708811825 (Score: 5.590546)
Found DOI from Text: 10.1016/j.ress.2011.10.017 (Score: 4.6864557)
Found DOI from Text: 10.1016/j.ssci.2008.05.005 (Score: 0.5093678)
Found DOI from Text: 10.1201/9780203859759.ch246 (Score: 0.6951939)
Found DOI from Text: 10.1016/s0377-2217(96)00156-7 (Score: 5.2922735)
...

Note that not all resolutions are perfect. The score reflects the degree of confidence that the reference extracted from the PDF matches the indicated DOI. Scores below 1.0 will not be included in the final output, as they are probably incorrect.

Go make yourself a coffee while it searches for the rest of the DOIs. Eventually it will move to the second phase of this process, which is to use the DOI to obtain a full BibTeX entry from the web API. Again, this will not be done for DOIs with scores below 1.0.

Found BibTeX from DOI: 10.1080/00949659708811825
Found BibTeX from DOI: 10.1016/j.ress.2011.10.017
Found BibTeX from DOI: 10.1016/s0377-2217(96)00156-7
Found BibTeX from DOI: 10.1016/j.ress.2006.04.015
Found BibTeX from DOI: 10.1111/j.1539-6924.2010.01519.x
Found BibTeX from DOI: 10.1002/9780470316788.fmatter
...

Finish your coffee, check your email, and chuckle at the poor saps out there gathering their references by hand. When the program finishes, look for a file called MyFile.bib—the same filename as the original PDF—in the same directory from which you invoked the pdf-extract command. Open it up in a text editor or reference manager and take a look. Here’s the output from my example:

@article{Archer_1997,
doi = {10.1080/00949659708811825},
url = {http://dx.doi.org/10.1080/00949659708811825},
year = 1997,
month = {May},
publisher = {Informa UK Limited},
volume = {58},
number = {2},
pages = {99-120},
author = {G. E. B. Archer and A. Saltelli and I. M. Sobol},
title = {Sensitivity measures,anova-like Techniques and the use of bootstrap},
journal = {Journal of Statistical Computation and Simulation}
}
@article{Auder_2012,
doi = {10.1016/j.ress.2011.10.017},
url = {http://dx.doi.org/10.1016/j.ress.2011.10.017},
year = 2012,
month = {Nov},
publisher = {Elsevier BV},
volume = {107},
pages = {122-131},
author = {Benjamin Auder and Agn\`es De Crecy and Bertrand Iooss and Michel Marqu\`es},
title = {Screening and metamodeling of computer experiments with functional outputs. Application to thermal$\textendash$hydraulic computations},
journal = {Reliability Engineering \& System Safety}
}

... (and many more!)

A few extra-nice things: (1) it includes all DOIs, which journals sometimes require and are pesky to track down, and (2) it attempts to escape all BibTeX special characters by default. Merge this with your existing library, and be happy! (You could even use this to recover or develop a reference library from your own papers!)
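
To close the loop on citing in LaTeX, here is a minimal sketch that cites the two entries shown above using the keys pdf-extract generated. It assumes MyFile.bib sits next to the .tex file and that you use classic BibTeX; the plain bibliography style is just a placeholder choice.

```latex
\documentclass{article}

\begin{document}
See \cite{Archer_1997} and \cite{Auder_2012}.

\bibliographystyle{plain}
\bibliography{MyFile} % MyFile.bib is the file written by pdf-extract
\end{document}
```

Compile with the usual sequence (pdflatex, then bibtex, then pdflatex twice) so the citations resolve.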

Caveats

  • This works a lot better on journal articles than on longer documents like theses and textbooks. It assumes that the “Reference” section is toward the end, so a chapter-based or footnote-based reference format will cause it to choke.

  • It will not work on non-digital articles—for example, older articles which were scanned and uploaded to a journal archive.

  • Careful with character encoding when you are importing/exporting BibTeX with other applications (like Zotero), or even managing the file yourself. You may want to look for settings in all of your applications that allow you to change the character encoding to UTF-8.

  • Lots of perfectly good references do not have DOIs and thus will not be resolved by the web API. This includes many government agency reports, for example. In general do not expect to magically BibTeXify things other than journal articles and the occasional textbook.

  • Reading a PDF is tricky business—there are some journal formats that just won’t work. You will notice failures based on (1) consistently bad DOI resolution scores, (2) complete failure with an error message from the PDF reader (very hard to trace these), or (3) if your BibTeX file contains bizarre entries at the end. I’ve accidentally “extracted” references about ornithology, for example—just delete these and move on.

Writing a Paper in Markdown Using Pandoc

I’ve struggled up to now with the tension between drafting papers in Word (easy for co-authors to use for marking up revisions) and using LaTeX to prepare them for publication (because Word fights you and actively thwarts your efforts the whole time if you try to make a paper look half-decent.) When I start in Word and switch to LaTeX, there’s an awkward phase in the middle where I have to fix all of my quotation marks and em-dashes, and all of my equations, tables, and citations are completely broken.

Recently I discovered Pandoc, and I think it will streamline the transition quite a lot.  Pandoc is a document converter that converts several input formats to many output formats.  Here’s the list from running pandoc --help:


Input formats:  native, json, markdown, markdown_strict, markdown_phpextra, markdown_github, markdown_mmd, rst,  mediawiki, docbook, textile, html, latex

Output formats: native, json, docx, odt, epub, epub3, fb2, html, html5, s5, slidy, slideous, dzslides, docbook, opendocument, latex, beamer, context, texinfo, man, markdown, markdown_strict, markdown_phpextra, markdown_github, markdown_mmd, plain, rst, mediawiki, textile, rtf, org, asciidoc

That’s a lot of document formats!  In particular, it supports a “native” dialect of Markdown that it does a great job of translating both to LaTeX and to docx (Microsoft Word). Other nifty things you can do include:

  • Converting LaTeX to Word, including your BibTeX citations
  • Making PDFs from html (if you have LaTeX installed)
  • Writing Beamer presentations in Markdown (and exporting the LaTeX sources for the slides)
  • Using BibTeX citations in Markdown

I’m using Pandoc Markdown to draft my next paper, and while it’s not as full-featured as LaTeX for things like internal references, I find that it’s easier to write Word documents in Markdown than it is to write them in Microsoft Word!  To give just one example, it lets you caption figures properly.  Try making a figure in Word, adding a caption, and then moving or deleting the figure. The caption stays put.  Why on earth would I want that to happen? If I delete a figure in Pandoc Markdown, it takes extra effort to leave the caption behind.  In addition, when I switch my output format from Word to LaTeX source, Pandoc makes a figure environment with a \caption{} automatically.
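
As a rough sketch of that last point: for a Markdown image line such as ![A caption](figure.png), the figure environment that Pandoc writes looks approximately like the one below. The exact output varies between Pandoc versions, the surrounding preamble here is hand-written (an assumption, not Pandoc's own template), and figure.png and the caption are placeholders.

```latex
\documentclass{article}
\usepackage{graphicx} % needed for \includegraphics

\begin{document}

% Approximate Pandoc output for:  ![A caption](figure.png)
\begin{figure}
\centering
\includegraphics{figure.png} % placeholder; the image file must exist to compile
\caption{A caption}
\end{figure}

\end{document}
```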

My planned workflow is:

  1. Draft in Pandoc Markdown
  2. Convert Markdown to docx and share with co-authors
  3. Update Markdown sources based on revisions to the Word document
  4. Repeat 1-3 until the paper is mostly done
  5. Convert Markdown to LaTeX
  6. Final revisions and formatting in LaTeX

I’ll follow up to this post as I progress with drafting the paper. Right now I’m enjoying Pandoc Markdown quite a lot, and I highly recommend it.

 

Beginner’s LaTeX Guide

What are TeX and LaTeX?

TeX is a low-level markup and programming language used to typeset documents, created by Donald Knuth. TeX is a powerful typesetting tool, but can be difficult to use because of the time it takes to create custom text formatting macros.  To get around this difficulty, there are programs, like LaTeX, that come with pre-built macros. LaTeX is more user-friendly, but lacks the flexibility of TeX.

Installing a TeX System: MiKTeX

MiKTeX is an implementation of Knuth’s TeX system. You’ll need a TeX system on your computer so the LaTeX commands are recognized by your machine. My decision to personally use MiKTeX is based on its compatibility with the WinEdt software we use in the Reed group. You can also use TeX Live as your TeX system, but I have no experience with that software.  Once you have a TeX system installed on your computer, you can compile LaTeX documents using a command line and text files (saved with the proper file extension). Most people find this difficult, which is why many people use TeX editing software.
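
To make the command-line route concrete, here is about the smallest document that will compile. The file name hello.tex is hypothetical; pdflatex is the compiler command that ships with both MiKTeX and TeX Live.

```latex
% hello.tex  (hypothetical file name)
% Compile from a terminal in the same folder with:  pdflatex hello.tex
\documentclass{article}
\begin{document}
Hello, \LaTeX!
\end{document}
```

Running the command should leave a hello.pdf next to the source file, along with .aux and .log files that can be safely deleted.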

Installing a TeX Editor: WinEdt

The editor that we have a license for in the Reed group is WinEdt. There are also free options such as TeXnicCenter and many, many others. For a fuller discussion of the pros and cons of different editors, see Wikipedia’s article comparing different TeX editors. Once you’ve installed WinEdt, go to Documents -> Current Work (Samples) within the program and compile one of the included sample documents to make sure your software is properly installed and configured.

(Reed Members: Talk to Josh for license information. I believe the Reed license is only valid for WinEdt 5.5, which is not the latest version.)

Learning some Basic Commands

Luckily, there are MANY MANY sources for learning LaTeX commands. A good place to start is Wikipedia’s LaTeX Wikibook. Starting under the tab “Absolute Beginners” will walk you through very simple document creation. Another good place to start is Andre Heck’s short course in LaTeX called Learning LaTeX by Doing. The course has 24 exercises designed to get you familiar with the commands and with typing your own LaTeX documents. If you just want to try these exercises without installing software, you can use Latexlab.org to compile your LaTeX documents online. Once you are familiar with the commands, a good first document of your own is a LaTeX resume/CV. This will give you practice with simple document elements such as tables and lists.
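To make the resume suggestion concrete, here is a bare-bones sketch that uses only standard LaTeX commands. Every name, date, and entry is a placeholder, and the layout is just one of many possible choices.

```latex
% cv.tex: a bare-bones resume sketch; all names and entries are placeholders.
\documentclass[11pt]{article}

\begin{document}

\begin{center}
  {\Large Your Name} \\
  your.email@example.com
\end{center}

\section*{Education}
% A table: left column for dates, right column (a 9 cm paragraph cell) for details.
\begin{tabular}{l p{9cm}}
  20XX--20XX & Ph.D., Field of Study, University Name \\
  20XX--20XX & B.S., Field of Study, University Name \\
\end{tabular}

\section*{Skills}
% A bulleted list.
\begin{itemize}
  \item Programming languages
  \item Typesetting with \LaTeX
\end{itemize}

\end{document}
```

The starred \section* commands produce unnumbered headings, which suits a resume better than numbered sections.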

Some Resources

Winston Chang has written a comprehensive document that compresses most of the major LaTeX commands to two pages: http://www.stdout.org/~winston/latex/latexsheet.pdf.

If you’re interested in using LaTeX to write a Penn State thesis/dissertation, Gary L. Gray and Francesco Costanzo have written a thesis template to use: http://www.esm.psu.edu/psuthesis/

There’s even a LaTeX template that makes your documents look like MS Word!

How to cite packages in R

R is a nice statistical language to use because it is free and provides many useful packages for data analysis.  I just found out about a neat way to have R generate a BibTeX citation for a specific package.  It’s explained here:

http://astrostatistics.psu.edu/su07/R/html/utils/html/citation.html
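In R itself, `citation("packagename")` prints the citation for a package and `toBibtex(citation("packagename"))` prints it as a BibTeX entry. The sketch below shows one way such an entry could be used on the LaTeX side. The entry here is a placeholder standing in for real R output, and the key `examplepkg` and file name `rpackages.bib` are made up for illustration. The entry R prints may come with an empty key, in which case you need to add one before citing it.

```latex
% Placeholder entry: replace it with the real output of
% toBibtex(citation("packagename")) from R, and give it a key.
\begin{filecontents}{rpackages.bib}
@Manual{examplepkg,
  title  = {Placeholder title of an R package},
  author = {Placeholder Author},
  year   = {2000},
  note   = {R package version 0.0}
}
\end{filecontents}

\documentclass{article}
\begin{document}
The analysis used a contributed R package \cite{examplepkg}.

\bibliographystyle{plain}
\bibliography{rpackages}
\end{document}
```

Compiling this takes the usual sequence: pdflatex, then bibtex, then pdflatex twice more so the citation resolves.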

Do you have tips on using R?  If so, edit this post or leave a comment below.

Web-based Free Options for Bibliography Management and LaTeX Editing

I often find myself switching between computers with different operating systems, so I try to use free tools on the web as often as I can. The purpose of this post is to make you aware of two free options that I’ve had success with.

Bibliography Management – Zotero.org

Zotero is a free bibliography management resource that works as a plug-in for Mozilla Firefox, with companion plug-ins for Microsoft Office and Open Office. You edit your citations within Firefox and insert them into documents using the Office plug-ins. You can import and export BibTeX, and Zotero is compatible with the RIS format, so you can move your citations back and forth between Zotero and EndNote. When you sign up for Zotero, it will ask you to create a user account. Your web account serves as an online backup for your citations, as well as a collaborative space. You can create a profile based on your area of expertise, so you can search for users with research interests similar to yours and share your citations with them. (Perhaps this would be a good way to create a Pat Reed Group citation database?)

If this piqued your interest, I recommend checking out the quick start guide which shows some of the cool stuff you can do with Zotero.

My only warning is to make sure you’re running the latest version of Firefox, or you might have compatibility issues with the plug-ins, especially with Word and Open Office. According to the website, there is a beta release of standalone Zotero as well as plug-ins for Safari and Chrome, but I haven’t used any of those options. It is also important to note that there is a 100MB limit for the free Zotero service. I have about 2,000 citations stored online and I’m only using about 1.0MB according to the website, so I imagine that the free service will be sufficient for everyone. It is $20/year for 1GB of Zotero storage.

LaTeX – Latexlab.org

Latexlab.org is a Google Docs based LaTeX editor. You sign in using your Google Docs account, so all your files are stored on your Google profile.  Those familiar with WinEdt or other LaTeX editing software should have no trouble using the LaTeX Lab interface.  You can upload images to your Google Docs account to insert them into your LaTeX document. I’d recommend using this if you’re on the go and need to put together a LaTeX document quickly.

I’ve never tried to compile anything complicated within LaTeX Lab, but it is a good alternative when you need a short, equation-heavy document in a hurry. You can compile different documents together into a project, but I’ve never used that functionality. I would shy away from putting together complicated documents in LaTeX Lab, and I certainly wouldn’t try to write your thesis in it.