Literate Programming, Quarto, and Workflows

HES 505 Fall 2024: Session 5

Carolyn Koehn

For today

  1. Introduce literate programming

  2. Describe pseudocode and its utility for designing an analysis

  3. Introduce Quarto as a means of documenting your work

  4. Practice workflow

Reproducibility

Science is a social process!!

Why Do We Need Reproducibility?

  • Noise!!

  • Confirmation bias

  • Hindsight bias

Munafò et al. 2017. Nat Hum Behav.

Reproducibility and your code

  • Scripts: may make your code reproducible (but not your analysis)

  • Commenting and formatting can help!

```{r}
#| eval: false
## load the packages necessary
library(tidyverse)
## read in the data
landmarks.csv <- read_csv("/Users/mattwilliamson/Google Drive/My Drive/TEACHING/Intro_Spatial_Data_R/Data/2023/assignment01/landmarks_ID.csv")

## How many in each feature class
table(landmarks.csv$MTFCC)
```

Reproducible scripts

  • Comments explain what the code is doing

  • Operations are ordered logically

  • Only relevant commands are presented

  • Useful object and function names

  • Script runs without errors (on your machine and someone else’s)
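
As a rough illustration of the checklist above, here is a minimal sketch of a script that should run on any machine. It assumes base R only (`read.csv()` in place of `read_csv()`), and the folder, file name, and landmark rows are made up for the example; they are not the course data.

```r
## Sketch of a self-contained script (toy data, hypothetical names throughout)

## --- Setup: write a small made-up landmarks file so the script runs anywhere
data_dir <- file.path(tempdir(), "data")   # hypothetical project data folder
dir.create(data_dir, showWarnings = FALSE, recursive = TRUE)
landmarks_path <- file.path(data_dir, "landmarks_toy.csv")

toy_landmarks <- data.frame(
  FULLNAME = c("Toy Park", "Toy School", "Toy Cemetery", "Toy Campus"),
  MTFCC    = c("K2180", "K2543", "K2582", "K2543")   # made-up rows
)
write.csv(toy_landmarks, landmarks_path, row.names = FALSE)

## --- Read in the data using a path built with file.path(), not typed by hand
landmarks <- read.csv(landmarks_path, stringsAsFactors = FALSE)

## --- How many landmarks are in each feature class?
feature_class_counts <- table(landmarks$MTFCC)
print(feature_class_counts)
```

The object names say what they hold (`landmarks`, `feature_class_counts`), and no path is tied to one person's computer. In a real project the setup step would be replaced by a path relative to the project folder.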

Literate Programming

Toward Efficient Reproducible Analyses

  • Scripts can document what you did, but not why you did it!

  • Scripts separate your analysis products from your report/manuscript

What is literate programming?

Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.
— Donald Knuth, CSLI, 1984

What is literate programming?

  • Documentation containing code (not vice versa!)

  • Direct connection between code and explanation

  • Convey meaning to humans rather than just telling the computer what to do!

  • Multiple “scales” possible
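
One way to picture "documentation containing code" at the smallest scale is an R script whose narrative lives in `#'` comments, the convention used by `knitr::spin()`. The sketch below is an assumption-laden illustration: the area values are invented, and the script also runs as ordinary R because the narrative lines are just comments.

```r
#' ## How large is a typical landmark?
#'
#' We summarise with the median rather than the mean because one very
#' large park would otherwise dominate the result.

## Toy areas in hectares (made-up values for illustration only)
landmark_area_ha <- c(1.2, 0.8, 2.5, 310.0, 1.9)

median_area_ha <- median(landmark_area_ha)
mean_area_ha   <- mean(landmark_area_ha)

#' The gap between the two summaries shows why the choice matters.
cat("Median area (ha):", median_area_ha, "\n")
cat("Mean area (ha):", mean_area_ha, "\n")
```

The narrative records why the median was chosen, which the code alone would not tell a reader.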

Why literate programming?

  • Your analysis scripts are computer software

  • Integrate math, figures, code, and narrative in one place

  • Explaining something helps you learn it

Planning an analysis

  • Outline your project

  • Write pseudocode

  • Identify potential packages

  • Borrow (and attribute) code from others (including yourself!)
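
The planning steps above can be sketched as pseudocode written in comments first, then filled in one step at a time. This is only a sketch under stated assumptions: the landmark table, the feature classes of interest, and all object names are hypothetical.

```r
## Pseudocode first: each step is a plain-language comment
## 1. Get the landmark table
## 2. Keep only the feature classes of interest
## 3. Count landmarks per feature class
## 4. Report the most common class

## 1. Get the landmark table (made-up rows standing in for real data)
landmarks <- data.frame(
  MTFCC = c("K2180", "K2543", "K2582", "K2543", "K2582", "K2543"),
  stringsAsFactors = FALSE
)

## 2. Keep only the feature classes of interest (hypothetical choice)
classes_of_interest <- c("K2543", "K2582")
landmarks_subset <- landmarks[landmarks$MTFCC %in% classes_of_interest, , drop = FALSE]

## 3. Count landmarks per feature class
class_counts <- table(landmarks_subset$MTFCC)

## 4. Report the most common class
most_common_class <- names(class_counts)[which.max(class_counts)]
print(most_common_class)
```

Because the pseudocode stays in the script as comments, the outline of the analysis remains readable even after the code is filled in.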