as.character() for rownames()

Rainer pointed out, in response to my post "Row names in data frames: Beware of 1:nrow", that if I’d used rownames(x) <- as.character(1:3) rather than rownames(x) <- 1:3, I wouldn’t have had the problem I’d seen.

> x <- z <- data.frame(id=1:3)
> y <- data.frame(id=4:6)
> rownames(x) <- 1:3
> rownames(y) <- LETTERS[4:6]
> rownames(z) <- as.character(1:3)
> rbind(y,x)
  id
D  4
E  5
F  6
4  1
5  2
6  3
> rbind(y,z)
  id
D  4
E  5
F  6
1  1
2  2
3  3

If you type rownames(x) you see the same result as rownames(z), and is.character(rownames(x)) and is.character(rownames(z)) both return TRUE, but if you look at the "row.names" attribute directly, you see they are different.

> rownames(x)
[1] "1" "2" "3"
> rownames(z)
[1] "1" "2" "3"
> is.character(rownames(x))
[1] TRUE
> is.character(rownames(z))
[1] TRUE
> attr(x, "row.names")
[1] 1 2 3
> attr(z, "row.names")
[1] "1" "2" "3"
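
A quick way to spot the difference without reading the printed attribute by eye is to compare the stored types. This is just a minimal sketch using base R, rebuilding x and z as above:

```r
x <- z <- data.frame(id = 1:3)
rownames(x) <- 1:3                # integer row names
rownames(z) <- as.character(1:3)  # character row names

# rownames() always returns character, so it hides the difference
identical(rownames(x), rownames(z))

# the underlying attribute keeps the type that was assigned
typeof(attr(x, "row.names"))
typeof(attr(z, "row.names"))
```

The identical() call returns TRUE, while the two typeof() calls return "integer" and "character" respectively.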

But why is 1:3 treated so differently from 2:4?

> w <- data.frame(id=1:3)
> rownames(w) <- 2:4
> attr(w, "row.names")
[1] 2 3 4
> rownames(rbind(y,w))
[1] "D" "E" "F" "2" "3" "4"
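
If you want row names kept as they are, whether or not they happen to equal 1:N, one option is to re-store them as character before calling rbind, along the lines of Rainer's suggestion. Here is a minimal sketch; the helper name keep_rownames is hypothetical and not part of base R.

```r
# Hypothetical helper: re-store a data frame's row names as character,
# so that rbind() does not treat them like automatic 1:N row names.
keep_rownames <- function(df) {
  rownames(df) <- as.character(rownames(df))
  df
}

y <- data.frame(id = 4:6)
rownames(y) <- LETTERS[4:6]

x <- data.frame(id = 1:3)
rownames(x) <- 1:3

rownames(rbind(y, keep_rownames(x)))
```

This gives "D" "E" "F" "1" "2" "3", the same row names as rbind(y,z) above.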

2 Responses to “as.character() for rownames()”

  1. Charles Says:

    Also see `.set_row_names` and `.row_names_info`. When the rownames are just 1:N, they are stored in a compressed form as (NA, -N). See the difference below (though the `row.names` attribute will _appear_ the same for both):

    x = y = data.frame(id=1:3)
    rownames(x) = 1:3
    .row_names_info(x)
    .row_names_info(y)

    As for why 1:3 in particular is treated differently from 2:4 in rbind, you can see in `Make.row.names` within `rbind.data.frame` that it special-cases existing rownames equal to 1:N. That should probably be limited to when the rownames are “compressed” as per `.row_names_info` (i.e., not explicitly set by the user) to avoid problems such as yours.
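
To look at the stored form directly, `.row_names_info` also takes a second argument, `type`. According to its help page, 0 returns the internal `row.names` attribute, 2 returns the number of rows, and 1 (the default) returns the number of rows with a negative sign for automatic row names. A minimal sketch, with the output for x left out since I would not rely on it being the same in every R version:

```r
x <- y <- data.frame(id = 1:3)
rownames(x) <- 1:3   # explicitly set by the user

.row_names_info(y)       # negative: automatic row names
.row_names_info(y, 0L)   # internal (compressed) attribute
.row_names_info(y, 2L)   # number of rows

.row_names_info(x)       # compare the sign with y
.row_names_info(x, 0L)
```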
