The other day, there was a bit of a Twitter conversation about `qqline` in R.

It made me think: how exactly is the line produced by `qqline` chosen? I seemed to recall that the line was through the first and third quartiles.

An advantage of R is that you can just type the name of the function and see the code:

    # qqline
    function (y, datax = FALSE, distribution = qnorm, probs = c(0.25, 0.75),
        qtype = 7, ...)
    {
        stopifnot(length(probs) == 2, is.function(distribution))
        y <- quantile(y, probs, names = FALSE, type = qtype, na.rm = TRUE)
        x <- distribution(probs)
        if (datax) {
            slope <- diff(x)/diff(y)
            int <- x[1L] - slope * y[1L]
        }
        else {
            slope <- diff(y)/diff(x)
            int <- y[1L] - slope * x[1L]
        }
        abline(int, slope, ...)
    }

I was right: the function takes the 25th and 75th percentiles of the data and of the theoretical distribution, calculates the slope and intercept of the line that goes through those two points, and uses `abline` to draw the line.
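
To check my reading of the code, here is a small sketch that recomputes the line by hand and draws it on top of the one from `qqline`. The data and the seed are made up for illustration, and I'm assuming the default settings (normal distribution, `probs = c(0.25, 0.75)`, `qtype = 7`, `datax = FALSE`).

```r
# Recompute by hand the line that qqline() draws, under its default settings.
set.seed(1)                          # arbitrary seed, only for reproducibility
y <- rnorm(100, mean = 10, sd = 2)   # made-up example data

probs <- c(0.25, 0.75)
yq <- quantile(y, probs, names = FALSE, type = 7, na.rm = TRUE)  # sample quartiles
xq <- qnorm(probs)                                               # theoretical quartiles

slope <- diff(yq) / diff(xq)         # rise over run between the two quartile points
int   <- yq[1L] - slope * xq[1L]     # intercept of the line through those points

qqnorm(y)
qqline(y, col = "red", lwd = 3)      # the line as drawn by R
abline(int, slope, lty = 2)          # the hand-computed line, drawn on top
```

If the two lines coincide, the dashed line should sit exactly on top of the red one.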

Open source means the source is open, so why not take the time to look at it?

Sometimes typing the name of the function doesn’t tell you much:

    # qqnorm
    function (y, ...)
    UseMethod("qqnorm")

In such cases, you could try typing, for example, `qqnorm.default`.
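
If you don't know which methods exist, a few standard tools will list them and fetch the code. This is just a sketch of one way to go about it; the variable name `f` is mine.

```r
# List the S3 methods available for the generic qqnorm()
methods("qqnorm")

# Fetch the default method as a function object, then print its code
f <- getS3method("qqnorm", "default")
f

# getAnywhere() also finds functions that are not exported from their package
getAnywhere("qqnorm.default")
```

`getS3method` and `getAnywhere` are handy when typing the method name directly gives an "object not found" error because the method is hidden inside a package namespace.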

Still, the comments (if there were any) get stripped off, and for long functions, it’s not pretty. So I like to keep a copy of the source code (for example, `R-3.0.1.tar.gz`; extract it with `tar xzf R-3.0.1.tar.gz`). I use `grep` to find the relevant files.

For example,

grep -r 'qqline' R-3.0.1/src/

shows that I should look for `qqline` in

R-3.0.1/src/library/stats/R/qqnorm.R
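
If you'd rather stay inside R, something similar can be sketched with `list.files` and `grepl`. The helper `find_in_source` below is hypothetical (my own name, not part of R), and it assumes the source tree has been extracted into the current working directory.

```r
# Hypothetical helper: return the files under `dir` that contain the
# literal string `pattern` (a rough stand-in for `grep -r -l`).
find_in_source <- function(pattern, dir) {
    files <- list.files(dir, recursive = TRUE, full.names = TRUE)
    hits <- vapply(files, function(f) {
        lines <- readLines(f, warn = FALSE)
        # fixed = TRUE: match the string literally, not as a regular expression
        any(grepl(pattern, lines, fixed = TRUE, useBytes = TRUE))
    }, logical(1))
    unname(files[hits])
}

# Assumes R-3.0.1/ sits in the working directory; gives character(0) otherwise
find_in_source("qqline", "R-3.0.1/src/library/stats/R")
```

For searching the whole source tree, `grep` is going to be a lot faster; this is only meant for poking around a single directory.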

For something like `cor`, you might want to do:

grep -r 'cor <-' R-3.0.1/src

Or maybe:

grep -r 'cor <-' R-3.0.1/src/library/stats/R

But for `cor`, you probably also want to look at the C code, which is in

R-3.0.1/src/library/stats/src/cov.c

You can learn a lot about programming from the source code for R.
