R data table select rows to update column [closed] - r

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 4 years ago.
I am trying to update a column for a selection of rows based on the value of the same column. The column contains the character values '1', '2', ..., '10', '11'. What I want to do is combine the 11 categories into 3. My code looks like this:
library(data.table)
DT <- data.table(col = as.character(1:11))
DT[col %in% c('1','2','3'), col := '3']
DT[col %in% c('4','5','6','7','8'), col := '2']
DT[col %in% c('9','10','11'), col := '1']
Weirdly, the last line doesn't work: '10' and '11' are not updated. When I change c() to list() (code below), it seems to work, but I don't know why this is the case.
DT[col %in% list('9','10','11'), col := '1']
Any help will be much appreciated.
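Sequential in-place updates like the ones above are order-sensitive in general, because a later filter sees values already written by an earlier update. A minimal sketch of an alternative, assuming a named lookup vector (the name `lookup` is hypothetical and not part of the question), recodes all 11 categories in a single pass:

```r
library(data.table)

DT <- data.table(col = as.character(1:11))

# Hypothetical lookup vector: names are the old categories, values the new ones.
lookup <- c(rep("3", 3), rep("2", 5), rep("1", 3))
names(lookup) <- as.character(1:11)

# Index the lookup by the current values, so every row is recoded from its
# original category in one step.
DT[, col := unname(lookup[col])]

print(DT$col)
```

Because each row is mapped from its original value, the result does not depend on the order in which the categories are listed.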

Related

If statement in R, multiple conditions [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 4 years ago.
I wonder how I can write this if statement correctly. I have tried a lot of things, but none of them worked.
b <- matrix(NA,10,10)
for (row in 1:10)
  for (column in 1:10)
    if(!is.na(a[row,column] && a==(1 || 2 || 3))
      b[row,column]==1
    else
      b[row,column]==0
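The problem is here: the is.na( call is never closed, and a==(1 || 2 || 3) compares the whole matrix a with TRUE instead of testing whether the current value is 1, 2, or 3: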
if(!is.na(a[row,column] && a==(1 || 2 || 3))
Assuming that 'a' is a matrix with the same dimensions as 'b', we can do this without any if/else:
+((a %in% 1:3) & !is.na(a))
data
set.seed(24)
a <- matrix(sample(c(1:9, NA), 10*10, replace = TRUE), 10, 10)
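For comparison, here is a minimal sketch of how the explicit loop from the question could be written so that it runs, using the sample data above. It assumes the intent was to assign (`<-`) rather than compare (`==`) inside the branches, and the name `vectorized` is only introduced for the check at the end:

```r
set.seed(24)
a <- matrix(sample(c(1:9, NA), 10 * 10, replace = TRUE), 10, 10)

b <- matrix(NA, 10, 10)
for (row in 1:10) {
  for (column in 1:10) {
    value <- a[row, column]
    # Test the single element: not missing and one of 1, 2, 3.
    if (!is.na(value) && value %in% 1:3) {
      b[row, column] <- 1   # assignment, not the comparison `==`
    } else {
      b[row, column] <- 0
    }
  }
}

# The vectorized one-liner from the answer, for comparison.
vectorized <- +((a %in% 1:3) & !is.na(a))
print(all(b == vectorized))
```

The loop and the one-liner should agree cell by cell; the vectorized form is simply shorter and avoids the nested loops.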

Replace row values by max values in the group [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
Closed 5 years ago.
I have a data frame that looks like this:
a <- c(10,NA,30,40,NA,60,70,80,90,90,80,90,10,40)
b <- c("l","k","l","l","k","l","l","l","k","k","l","l","k","l")
c <- c(1,1,1,2,2,2,2,2,3,3,3,4,4,4)
df <- data.frame(a, b, c)
I want to group the data frame by columns 'b' and 'c', then replace the values in column 'a' with the maximum of each group. For example, the 1st and 2nd values of column 'a' would be replaced by 30. Here is my code:
df %>% group_by(b, c) %>% mutate(a = max(a, na.rm = TRUE))
The other values are replaced by the group maximum, but the NA values are not. I don't know why the mutate function replaces NA with inf. Here is the result I get with my code:
a <- c(30,inf,30,80,inf,80,80,80,90,90,90,90,10,90)
But I want it like this:
a <- c(30,30,30,80,80,80,80,80,90,90,90,90,10,90)
Assuming your data are:
Tuong_df <- data.frame(
c(10,NA,30,40,NA,60,70,80,90,90,80,90,10,40),
c("l","l","l","l","l","l","l","l","k","k","k","k","k","k"),
c(1,1,1,2,2,2,2,2,3,3,3,4,4,4))
names(Tuong_df) <- c("Var1","Var2","Var3")
You have to run the following code:
library(dplyr)
Tuong_df_mod <- Tuong_df %>%
  group_by(Var2, Var3) %>%
  mutate(Var1 = max(Var1, na.rm = TRUE))
In the future, it would be better to post reproducible code.
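As for where the infinite values come from: max(x, na.rm = TRUE) on a group whose values are all NA has nothing left to compare and returns -Inf (with a warning). With the grouping given in the question, rows 2 and 5 each form such a group. The sketch below uses base R's ave() instead of dplyr to keep it self-contained, and `safe_max` is a hypothetical helper, not something from the question or the answer. It only shows the cause and one way to keep NA in those groups; it does not produce the 30 the question expects for row 2, since that row is in a different (b, c) group than rows 1 and 3.

```r
a <- c(10, NA, 30, 40, NA, 60, 70, 80, 90, 90, 80, 90, 10, 40)
b <- c("l", "k", "l", "l", "k", "l", "l", "l", "k", "k", "l", "l", "k", "l")
c <- c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4)

# Hypothetical helper: group maximum that yields NA (not -Inf) when every
# value in the group is missing.
safe_max <- function(x) {
  if (all(is.na(x))) NA_real_ else max(x, na.rm = TRUE)
}

# A group containing only NA shows the behaviour from the question.
only_na <- suppressWarnings(max(NA_real_, na.rm = TRUE))
print(only_na)

# ave() applies the helper within each (b, c) group and returns a vector
# aligned with the original rows.
result <- ave(a, paste(b, c), FUN = safe_max)
print(result)
```

The same helper could presumably be used inside mutate() in place of max(a, na.rm = TRUE).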

dplyr, create a column conditional on presence or absence or text in another column [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
Closed 6 years ago.
I use dplyr. I would like to create a new column called "disease" with a yes or no designation based on another column called Description. If the description is NA, the value in the new column should be "N"; if there is any text in the description, the value in the new column should be "Y". I tried the following code:
data%>%
mutate(disease= ifelse( is.na(Description)),"N", "Y")
There is a really simple solution using data.table:
library(data.table)
setDT(data)[, disease := ifelse( is.na(cyl), "N", "Y")]
We can use base R to do this:
transform(data, disease = c("Y", "N")[is.na(cyl)+1])
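In the question's attempt, the closing parenthesis right after is.na(Description) also closes ifelse(), so "N" and "Y" end up as arguments to mutate() rather than to ifelse(). A minimal base R sketch, assuming a hypothetical data frame with a Description column (the sample values are made up for illustration):

```r
# Hypothetical data: Description is NA when no disease text was recorded.
data <- data.frame(
  id = 1:4,
  Description = c("asthma", NA, "diabetes", NA),
  stringsAsFactors = FALSE
)

# ifelse() takes the test, the value if TRUE, and the value if FALSE
# as three arguments inside one pair of parentheses.
data$disease <- ifelse(is.na(data$Description), "N", "Y")

print(data)
```

The same ifelse(is.na(Description), "N", "Y") call should also work inside mutate().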

How to add a column with constant observation and another variable with consecutive numbers with a character in R [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
Closed 6 years ago.
I want to add a first column of consecutive numbers prefixed with characters to an existing data frame.
I use the following code, but it does not work.
df$VARNAME_ <- paste0('COL', 1:5)(df)
I want it to look like this:
VARNAME_  old_var1  old_var2
COL1      1         2
COL2      1         2
COL3      1         2
COL4      1         2
COL5      1         2
Thanks in advance.
I am sorry for asking a stupid question; I have now figured it out.
The solution is as follows:
actual_df <- data.frame(df)  # convert the matrix to a data frame
actual_df <- cbind(VARNAME_ = paste0('COL', 1:5), actual_df)  # add COL1 to COL5 as the first column
actual_df <- cbind(ROWTYPE_ = 'PROX', actual_df)  # add a constant column in first position; VARNAME_ becomes the second column
df$VARNAME_ = paste0('COL', 1:5)
will work
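The original attempt fails because paste0('COL', 1:5) returns a character vector, and the trailing (df) then tries to call that vector as a function. A self-contained sketch of the cbind() approach, assuming a hypothetical 5-row matrix `m` standing in for the asker's data and using nrow() instead of the hard-coded 5:

```r
# Hypothetical input: a 5 x 2 matrix with the column names from the question.
m <- matrix(c(rep(1, 5), rep(2, 5)), ncol = 2,
            dimnames = list(NULL, c("old_var1", "old_var2")))

actual_df <- data.frame(m)

# Prepend the label column; the row count is taken from the data.
actual_df <- cbind(VARNAME_ = paste0("COL", seq_len(nrow(actual_df))), actual_df)

# Prepend a constant column; the single value is recycled for every row.
actual_df <- cbind(ROWTYPE_ = "PROX", actual_df)

print(actual_df)
```

Note that df$VARNAME_ <- paste0('COL', 1:5) appends the new column at the end, whereas cbind() with the new column listed first places it at the front.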

WeiRd: R does not find value but it's just there [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
Closed 2 years ago.
I am trying to merge two data frames using a variable called hash_id. For some reason R does not recognize the hash_id values in one of the data frames, while it does in the other.
I have checked and I just don't get it. See below how I checked:
> head(df1[46],1) # so I take the first 'hash-id' from df1
# hash_id
# 1 abab123123
> which(df2 == "abab123123", arr.ind=TRUE) # here it shows that row 6847 contains a match
# row col
# [1,] 6847 32
> which(df1 == "abab123123", arr.ind=TRUE) # and here there is NO matching value!
# row col
#
One possibility is trailing or leading spaces in the concerned columns for one of the datasets. You could do:
library(stringr)
df1[, "hash_id"] <- str_trim(df1[,"hash_id"])
df2[, "hash_id"] <- str_trim(df2[, "hash_id"])
which(df1[, "hash_id"]=="abab123123", arr.ind=TRUE)
which(df2[, "hash_id"]=="abab123123", arr.ind=TRUE)
Another way would be to use grepl:
grepl("\\babab123123\\b", df1[,"hash_id"])
grepl("\\babab123123\\b", df2[,"hash_id"])
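To illustrate the whitespace hypothesis, here is a minimal sketch with two hypothetical data frames (the values and the extra columns `x` and `y` are made up). It uses base R's trimws() in place of str_trim() so that no extra package is needed:

```r
# Hypothetical data: the first hash_id in df1 carries a trailing space.
df1 <- data.frame(hash_id = c("abab123123 ", "cdcd456456"), x = 1:2,
                  stringsAsFactors = FALSE)
df2 <- data.frame(hash_id = c("abab123123", "efef789789"), y = 3:4,
                  stringsAsFactors = FALSE)

# Exact matching treats "abab123123 " and "abab123123" as different keys.
matches_before <- nrow(merge(df1, df2, by = "hash_id"))

# Strip leading and trailing whitespace from both key columns, then merge.
df1$hash_id <- trimws(df1$hash_id)
df2$hash_id <- trimws(df2$hash_id)
merged <- merge(df1, df2, by = "hash_id")

print(matches_before)
print(merged)
```

If trimming does not help, comparing nchar() of the two supposedly equal values is a quick way to check for other invisible characters.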
