apply vs for

It’s widely understood that, in R programming, one should avoid for loops and always try to use apply-type functions.

But this isn’t entirely true. It may have been true for S-PLUS, back in the day: as I recall, the problem there was that the entire environment from each iteration was retained in memory.

Here’s a simple example, taking the maximum of each row of a large matrix. The for loop comes out faster than apply:

> x <- matrix(rnorm(4000*40000), ncol=4000)

> system.time({
+     mx <- rep(NA, nrow(x))
+     for(i in 1:nrow(x)) mx[i] <- max(x[i,])
+  })
   user  system elapsed 
  3.719   0.446   4.164

> system.time(mx2 <- apply(x, 1, max))
   user  system elapsed 
  5.548   1.783   7.333

There’s a great commentary on this point by Uwe Ligges and John Fox in the May, 2008, issue of R News (see the “R help desk”, starting on page 46, and note that R News is now the R Journal).

Also see the related discussion at Stack Overflow.

They say that apply can be more readable. It can certainly be more compact, but I usually find a for loop to be more readable, perhaps because I’m a C programmer first and an R programmer second.
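To make the readability comparison concrete, here is a small sketch of the same row-maximum computation written three ways. The matrix size, the seed, and the names `mx_for`, `mx_apply`, and `mx_vapply` are chosen purely for illustration, and `vapply` is a base R alternative added here for comparison; it does not appear in the timings above.

```r
# Illustrative sizes only (much smaller than the matrix timed above).
set.seed(1)
x <- matrix(rnorm(200 * 50), ncol = 50)

# 1. for loop, with the result vector allocated up front
mx_for <- numeric(nrow(x))
for (i in seq_len(nrow(x))) mx_for[i] <- max(x[i, ])

# 2. apply over rows (MARGIN = 1)
mx_apply <- apply(x, 1, max)

# 3. vapply over row indices, declaring that each result is a single number
mx_vapply <- vapply(seq_len(nrow(x)), function(i) max(x[i, ]), numeric(1))

# All three approaches give the same values
stopifnot(identical(mx_for, mx_apply), identical(mx_for, mx_vapply))
```

Which of these reads best is largely a matter of background; the point is only that they are interchangeable in what they compute.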

A key point, from Ligges and Fox: “Initialize new objects to full length before the loop, rather than increasing their size within the loop.”
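As a rough sketch of that advice, the two functions below compute the same vector, one by growing the result inside the loop and one by allocating it at full length first. The function names `grow` and `prealloc`, the value of `n`, and the squared-index computation are made up for illustration. Timings will vary by machine and R version, so none are shown.

```r
n <- 1e4

# Growing the result inside the loop: each c() call builds a new, longer vector
grow <- function(n) {
  out <- numeric(0)
  for (i in seq_len(n)) out <- c(out, i^2)
  out
}

# Allocating the full length before the loop, then filling in place
prealloc <- function(n) {
  out <- numeric(n)
  for (i in seq_len(n)) out[i] <- i^2
  out
}

# Same result either way
stopifnot(identical(grow(n), prealloc(n)))

# Compare the cost of the two approaches
system.time(grow(n))
system.time(prealloc(n))
```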


7 Responses to “apply vs for”

  1. Robert Young Says:

    – I usually find a for loop to be more readable, perhaps because I’m a C programmer first and an R programmer second.

    In a nutshell. If one were a relational database zealot or a Prolog coder by default, then the “set”-based approach would be far more natural. And given that user-typed R code is nearly (always?) slower than R’s internal C code, set-based syntax is to be preferred. Better yet, if your data is in a SQL database (and you’ve wisely not eaten of Eve’s NoSQL apple, have you?), do all that munging over there first. It will save a ton of effort and headache.

  2. Alberto Says:

    I guess you forgot the [i] in the mx assignment.
    I mean:
    > system.time({
    + mx <- rep(NA, nrow(x))
    + for(i in 1:nrow(x)) mx[i] <- max(x[i,])
    + })

  3. TszKin Julian Says:

    I also found that a loop is faster than *apply in some cases, especially if the object is very big (or the returned object is big).

    For loops, we can compile the code to speed it up. For *apply, we can run it in parallel to speed it up.

    To me, if the work is computationally intensive, then I would prefer apply. If the data object is large or requires nested looping, then I would prefer a loop.

    Here is an example based on your code:

    [The code posted with this comment was truncated; the complete version appears in the next comment.]

  4. TszKin Julian Says:

    One more point here: lapply would be faster than apply.
    system.time(mx2 <- apply(x, 1, max))
    system.time(mx2 <- lapply(1:nrow(x), function(z) max(x[z,]) ))

    Sorry, the code in my previous comment seems to have been truncated. Let me try again:

    library(compiler)
    library(parallel)

    n <- 40000
    x <- matrix(rnorm(400*n), ncol=400)

    # byte-compiled version of the for loop
    f <- cmpfun(function(x){
    mx <- rep(NA, nrow(x))
    for(i in 1:nrow(x)) mx[i] <- max(x[i,])
    mx  # return the result (a bare for loop returns NULL)
    })

    system.time({
    mx <- rep(NA, nrow(x))
    for(i in 1:nrow(x)) mx[i] <- max(x[i,])
    })

    system.time(mx2 <- apply(x, 1, max))
    system.time(mx2 <- lapply(1:nrow(x), function(z) max(x[z,]) ))

    system.time({ mx3<-f(x)})

    # parallel version, using all available cores
    cl <- makeCluster(detectCores())
    clusterExport(cl,"x")

    system.time(mx2 <- parSapply(cl,1:nrow(x),function(z) max(x[z,]) ))

    stopCluster(cl)

  5. fer_rabanal Says:

    Reblogged this on Easy ML World.

  6. isomorphismes Says:

    Maybe this comes from the R Inferno — “failing to vectorise” is chapter 3, he makes the point about initialising at full size in ch 2 (gluttons), and he pokes fun at “Speaking R with a strong C accent” (or was it the other way around?) Anyway. He’s a popular source. Could be where the conventional wisdom comes from.
