R - Behavior of subsetting data frame for unique column values
Background: I have a data frame with one column that has duplicate values. I am trying to split the data frame by picking out the rows for each duplicated column value, process them, and spit out a new data frame of the processed rows.
I am amazed at what is going wrong in the following code:
dataset <- structure(list(
  day = structure(1:10, .Label = c("tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday", "tuesday"), class = "factor"),
  variable = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("act1", "act2", "act3", "act4", "act5", "act12", "act19", "act116", "act22", "act6", "act13", "act111", "act117", "act23", "act7", "act14", "act112", "act118", "act24", "act8", "act15", "act113", "act119", "act25", "act9", "act16", "act114", "act20", "act26", "act10", "act17", "act115", "act21", "act27", "act11", "act18"), class = "factor"),
  value = c(67, 65, 40, 79, 106, 90, 57, 59, 2, 12)),
  .Names = c("day", "variable", "value"), row.names = c(NA, 10L), class = "data.frame")

uniq <- unique(dataset$variable)
for (i in 1:length(uniq)) {
  rowsperval <- dataset[dataset$variable == uniq[i], ]
  print(length(rowsperval))
}
I don't understand how the final print statement says the length is 3, when there are 10 records in the data frame with the same value in the variable column.
plyr handles this kind of split-apply-combine problem (split the data set into chunks, operate on each one, and put the results back together).
library("plyr")
ddply(dataset, .(variable), nrow)
As others have said, length() of a data.frame is the number of columns; nrow() is the number of rows.
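A minimal sketch of that difference, using a small made-up data frame (the name `df` and its values are illustrative, not taken from the question):

```r
# Hypothetical toy data frame: 3 columns, 5 rows
df <- data.frame(
  day      = c("mon", "mon", "tue", "tue", "wed"),
  variable = c("act1", "act1", "act1", "act1", "act2"),
  value    = c(1, 2, 3, 4, 5)
)

# Subsetting rows keeps every column
act1_rows <- df[df$variable == "act1", ]

# A data frame is a list of columns, so length() counts columns
print(length(act1_rows))  # 3, same as ncol(act1_rows)

# nrow() counts the rows that matched
print(nrow(act1_rows))    # 4
```

In the loop from the question, swapping `length(rowsperval)` for `nrow(rowsperval)` should report the row count per value instead of the column count.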
> ddply(dataset, .(variable), nrow)
  variable V1
1     act1 10
You can replace nrow with an (anonymous) function that does whatever processing you want.
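As a rough sketch of what such a function could look like, the example below summarises each chunk with a row count and a sum. The data frame `toy` and the output columns `n` and `total` are assumptions made for illustration; the actual processing depends on what you need.

```r
library("plyr")

# Hypothetical toy data, standing in for the question's data frame
toy <- data.frame(
  variable = c("act1", "act1", "act2"),
  value    = c(10, 20, 5)
)

# The anonymous function receives one chunk (a data frame) per value of
# `variable` and returns a one-row data frame of summaries; ddply then
# stacks those rows and adds the grouping column back
result <- ddply(toy, .(variable), function(chunk) {
  data.frame(n = nrow(chunk), total = sum(chunk$value))
})
print(result)
```

Returning a data frame from the function lets each chunk contribute several named columns, rather than the single `V1` column produced when the function returns a bare number.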