WeiRd: R does not find value but it's just there [closed] - r

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 2 years ago.
Improve this question
Trying to merge two data frames, using a variable called hash_id. For some reason R does not recognize the hash-id's in one of the data frames, while it does so in the other.
I have checked and I just don't get it. See below how I checked:
> head(df1[46],1) # so I take the first 'hash-id' from df1
# hash_id
# 1 abab123123
> which(df2 == "abab123123", arr.ind=TRUE) # here it shows that row 6847 contains a match
# row col
# [1,] 6847 32`
> which(df1 == "abab123123", arr.ind=TRUE) # and here there is NO matching value!
# row col
#

One possibility is trailing or leading spaces in the concerned columns for one of the datasets. You could do:
library(stringr)
df1[, "hash_id"] <- str_trim(df1[,"hash_id"])
df2[, "hash_id"] <- str_trim(df2[, "hash_id"])
which(df1[, "hash_id"]=="abab123123", arr.ind=TRUE)
which(df2[, "hash_id"]=="abab123123", arr.ind=TRUE)
Another way would be use grep
grepl("\\babab123123\\b", df1[,"hash_id"])
grepl("\\babab123123\\b", df2[,"hash_id"])

Related

adding a column name in R [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 2 months ago.
Improve this question
library(quantmod)
quote<-c("0883.HK","0386.HK","0857.HK")
df1<-data.frame(getQuote(quote)$Last)
Just download the last traded price for some stocks and put those quotes in a data frame, and then what I am trying to do here is to add column names for those quotes like below:
colnames(df1)<-c("883","386","857")
Error in names(x) <- value :
'names' attribute [3] must be the same length as the vector [1]
May I ask what have I done wrong here? Million thanks.
The data you selected is one column with three observations. Therefore you cannot assign it three names as there is only one column
> df1
getQuote.quote..Last
1 10.00
2 3.74
3 3.55
When we apply as.data.frame or data.frame on a vector (after extracting the Last column), it will be a single column with that vector as values of the column. We may need to convert to list and then setting the column names will work
df1 <- as.data.frame.list(getQuote(quote)$Last)

R: how to sample rows with custom frequencies [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 5 years ago.
Improve this question
I have a data frame in R that has two columns, one with last names, the other with the frequency of each last name. I would like to randomly select last names based on the frequency values (0 -> 1).
So far I have tried using the sample function, but it doesn't allow for specific frequencies for each value. Not sure if this is possible :/
df1 <- data.frame(names = c("John","Mary"),freq=c(0.2,0.8))
df1
# names freq
# 1 John 0.2
# 2 Mary 0.8
set.seed(1)
sample100 <- sample(
x = df1$names,
size = 100,
replace=TRUE,
prob=df1$freq)
table(sample100)
# sample100
# John Mary
# 17 83

Replace row values by max values in the group [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 5 years ago.
Improve this question
I have a data frame look like this
a <- c(10,NA,30,40,NA,60,70,80,90,90,80,90,10,40)
b <- c(l,k,l,l,k,l,l,l,k,k,l,l,k,l)
c <- c(1,1,1,2,2,2,2,2,3,3,3,4,4,4)
I want to group data frame by column 'b' and 'c', then replace row values in 'a' column by max value of each group. For example: the 1st and 2nd of the 'a' column would be replaced by 30. Here is my code:
df%>%group_by(b, c)%>%mutate(a = max(a, na.rm = TRUE))
Other values are replaced by max value but not NA. I don't know why mutatefunction rewrite NA by inf. Here is the result I have with my code:
a <- c(30,inf,30,80,inf,80,80,80,90,90,90,90,10,90)
But I want it like this:
a <- c(30,30,30,80,80,80,80,80,90,90,90,90,10,90)
Assuming your data are:
Tuong_df <- data.frame(
c(10,NA,30,40,NA,60,70,80,90,90,80,90,10,40),
c("l","l","l","l","l","l","l","l","k","k","k","k","k","k"),
c(1,1,1,2,2,2,2,2,3,3,3,4,4,4))
names(Tuong_df) <- c("Var1","Var2","Var3")
You have to run the following code:
Tuong_df_mod <- Tuong_df %>%
group_by(Var2,Var3) %>%
mutate(Var1=max(Var1,na.rm=TRUE))
Anyway, for the near future, it should be better if you release reproducible code.

How to add a column with constant observation and another variable with consecutive numbers with a character in R [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 6 years ago.
Improve this question
I want to add a first column with consecutive numbers with characters in a existing data frame.
I use the following code. It does not work.
df$VARNAME_ <- paste0('COL', 1:5)(df)
I want to it look like this.
VARNAME_ old_var1 old_var2
COL1 1 2
COL2 1 2
COL3 1 2
COL4 1 2
COL5 1 2
Thanks in advance.
I am Sorry that I asked a stupid question. And now I figure out.
The solution is as following.
actual_df<-data.frame(df)#transfer matrix a to data frame
actual_df<-cbind(VARNAME_=paste0('COL', 1:5),actual_df) #add COL1~COL5 in the first column
actual_df<-cbind(ROWTYPE_ = 'PROX', actual_df) #Add a variable with constant observations in first column. Now the previous column become second one.
df$VARNAME_ = paste0('COL', 1:5)
will work

R data table select rows to update column [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 4 years ago.
Improve this question
I am trying to update a column for a selection of rows based on the value of the same column. this column contains characters from '1','2',....'10','11'. what I want to do is to combine 11 categories into 3. So my code look like
library(data.table)
DT <- data.table(col = as.character(1:11))
DT[col %in% c('1','2','3'), col := '3']
DT[col %in% c('4','5','6','7','8'), col := '2']
DT[col %in% c('9','10','11'), col := '1']
weirdly, the last line doesn't work. the '10' and '11' are not updated. when I change 'c' to list (below code), it seems to work. but i don't know why this is the case.
DT[col %in% list('9','10','11'), col := '1']
Any help will be much appreciated.

Resources