R - Create unique ID based on existing ID conditions
I have a column of unique document IDs. Some of the IDs end in q or a:
"702-591|source-871987", "702-591|source-872066", "702-591|source-872336", "702-591|source-872557", "702-591|source-873368", "702-591|source-876216", "702-591|source-907269", "702-591|source-10754a", "702-591|source-10754q", "702-591|source-118603a", "702-591|source-118603q", "702-591|source-119738a"
I want to create a simpler unique ID column (easy enough: table$id <- c(1:nrow(table))). If the existing ID ends in q or a, I want the q/a incorporated into the new ID field. Additionally, if two IDs are linked as a q/a pair, I want the new IDs to share a number, e.g. 1q and 1a. For example, records 8 and 9 are "702-591|source-10754a" and "702-591|source-10754q". Their new IDs would be 8a and 8q, respectively. Records 1-5 would have new IDs of 1-5. Do I need to incorporate a grep command here?
thanks!
This may be a little long, but I think it works. You'll have to install the stringr package to use it.
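require(stringr)

# Split each old ID into the part up to its last digit ("nonqa") and an optional trailing letter ("qa").
df <- data.frame(str_match(tab$old_id, "(.*[[:digit:]]+)([[:alpha:]]?)"))
names(df) <- c("old_id", "nonqa", "qa")
# One row per distinct stem, numbered in order of first appearance.
df2 <- data.frame(nonqa = unique(df$nonqa))
df2$base <- seq_along(df2$nonqa)
# Attach the base number to every record, then append the q/a letter.
df3 <- merge(df, df2)
df3$id <- paste(df3$base, df3$qa, sep = "")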
In the end, you have the "old_id" and "id" columns in the final data frame. I read your table in as "tab", since "table" is a function in R. For anyone else answering the question, here it is:
tab = data.frame(old_id=c("702-591|source-871987", "702-591|source-872066", "702-591|source-872336", "702-591|source-872557", "702-591|source-873368", "702-591|source-876216", "702-591|source-907269", "702-591|source-10754a", "702-591|source-10754q", "702-591|source-118603a", "702-591|source-118603q", "702-591|source-119738a"))
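For comparison, here is a minimal base-R sketch of the same idea that needs no extra package. It assumes a trailing q or a is the only suffix that matters; the variable names stem, suffix and base are illustrative and not part of the answer above. It also keeps the original row order, whereas merge() sorts its result on the key column by default.

```r
# Same sample data as above, kept as character strings.
tab <- data.frame(
  old_id = c("702-591|source-871987", "702-591|source-872066",
             "702-591|source-872336", "702-591|source-872557",
             "702-591|source-873368", "702-591|source-876216",
             "702-591|source-907269", "702-591|source-10754a",
             "702-591|source-10754q", "702-591|source-118603a",
             "702-591|source-118603q", "702-591|source-119738a"),
  stringsAsFactors = FALSE
)

# Drop a trailing "a" or "q" to get the shared stem of each ID.
stem <- sub("[aq]$", "", tab$old_id)

# Whatever follows the stem is the suffix ("" when there is none).
suffix <- substring(tab$old_id, nchar(stem) + 1)

# Number the stems in order of first appearance, so linked q/a
# records get the same number.
base <- match(stem, unique(stem))

# New ID = shared number plus the q/a letter, if any.
tab$id <- paste0(base, suffix)

print(tab$id)
```

On this sample the first seven records get plain numbers, and each q/a pair shares one number followed by its letter. sub() belongs to the same family of regular-expression functions as grep(), so it covers the "grep" part of the question.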