Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 3 years ago.
I am trying to divide all integers in a column by another integer. I have a database with a column whose values go above 1*10^20. Because of this my plots are way too big. I need to normalize the data to get a better understanding of what is going on. For example, the data that I have:
    [x] [Day] [Amount]
[1]   1     1 23440100
[2]   2     2 41231020
[3]   3     3 32012010
I am using a data.frame for my own data, so here is a data frame for the data above:
x <- c(1,2,3)
day <- c(1,2,3)
Amount <- c(23440100, 41231020, 32012010)
my.data <- data.frame(x, day, Amount)
I tried using another answer, provided here, but that doesn't seem to work.
The code that I tried:
test <- my.data[, 3]/1000
Hope someone can help me out! Cheers, Chester
I guess you are looking for this?
my.data$Amount <- my.data$Amount/1000
such that
> my.data
  x day   Amount
1 1   1 23440.10
2 2   2 41231.02
3 3   3 32012.01
Use mutate from dplyr
Since you're using a data.frame, you can use this simple code:
library(dplyr)
mutated.data <- my.data %>%
  mutate(Amount = Amount / 1000)  # keep Amount as a double; as.integer() would drop the decimals shown below
> mutated.data
  x day   Amount
1 1   1 23440.10
2 2   2 41231.02
3 3   3 32012.01
Hope this helps.
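For readers who want to try both ideas without extra packages, here is a minimal base R sketch. It divides the column by a constant and, as an alternative, rescales it to the range 0 to 1. The min-max step and the names `scaled`, `Amount01` and `too_big` are assumptions for illustration, not something the question asked for.

```r
# Data frame from the question
my.data <- data.frame(
  x = c(1, 2, 3),
  day = c(1, 2, 3),
  Amount = c(23440100, 41231020, 32012010)
)

# Divide by a constant; the result stays a double, so decimals are kept
scaled <- my.data
scaled$Amount <- scaled$Amount / 1000

# Alternative (an assumption about what "normalize" means here):
# min-max rescaling to the range [0, 1]
rng <- range(my.data$Amount)
scaled$Amount01 <- (my.data$Amount - rng[1]) / (rng[2] - rng[1])

# Values near 1e20 do not fit R's 32-bit integer type even after dividing
# by 1000, so as.integer() turns them into NA (with a warning)
too_big <- suppressWarnings(as.integer(1e20 / 1000))
is.na(too_big)
```

Keeping the column as a double avoids that integer limit, which matters if the real data really do go above 1*10^20.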
Closed 5 years ago.
I have a data frame that looks like this:
a <- c(10,NA,30,40,NA,60,70,80,90,90,80,90,10,40)
b <- c("l","k","l","l","k","l","l","l","k","k","l","l","k","l")
c <- c(1,1,1,2,2,2,2,2,3,3,3,4,4,4)
df <- data.frame(a, b, c)
I want to group the data frame by columns 'b' and 'c', then replace the values in column 'a' with the max value of each group. For example: the 1st and 2nd values of column 'a' would be replaced by 30. Here is my code:
df %>% group_by(b, c) %>% mutate(a = max(a, na.rm = TRUE))
The other values are replaced by the max value, but the NAs are not. I don't know why the mutate function replaces NA with -Inf. Here is the result I get with my code:
a <- c(30,-Inf,30,80,-Inf,80,80,80,90,90,90,90,10,90)
But I want it like this:
a <- c(30,30,30,80,80,80,80,80,90,90,90,90,10,90)
Assuming your data are:
Tuong_df <- data.frame(
  c(10,NA,30,40,NA,60,70,80,90,90,80,90,10,40),
  c("l","l","l","l","l","l","l","l","k","k","k","k","k","k"),
  c(1,1,1,2,2,2,2,2,3,3,3,4,4,4))
names(Tuong_df) <- c("Var1","Var2","Var3")
You have to run the following code:
library(dplyr)
Tuong_df_mod <- Tuong_df %>%
  group_by(Var2, Var3) %>%
  mutate(Var1 = max(Var1, na.rm = TRUE))
In future, it would be better to post reproducible code.
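A likely explanation for the infinite values, assuming the grouping vectors are exactly as posted in the question: each NA there sits alone in its own b/c group, so after `na.rm = TRUE` nothing is left and `max()` returns -Inf. The base R sketch below reproduces that and shows one way to get NA instead. `safe_max` and `a_max` are hypothetical names introduced only for this illustration.

```r
# Data as posted in the question (with the letters quoted)
df <- data.frame(
  a = c(10, NA, 30, 40, NA, 60, 70, 80, 90, 90, 80, 90, 10, 40),
  b = c("l", "k", "l", "l", "k", "l", "l", "l", "k", "k", "l", "l", "k", "l"),
  c = c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4)
)

# The group b == "k", c == 1 contains only NA, so max() has nothing to compare
only_na <- df$a[df$b == "k" & df$c == 1]
suppressWarnings(max(only_na, na.rm = TRUE))  # -Inf

# Hypothetical helper: return NA instead of -Inf when a group is all NA
safe_max <- function(x) {
  if (all(is.na(x))) NA_real_ else max(x, na.rm = TRUE)
}

# ave() applies the function within each b/c group and keeps the row order
df$a_max <- ave(df$a, df$b, df$c, FUN = safe_max)
```

With the answer's data the NAs share a group with non-missing values, which is why the same `mutate()` call gives 30 and 80 there.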
Closed 6 years ago.
I want to add a first column of consecutively numbered labels (a character prefix followed by a number) to an existing data frame.
I use the following code. It does not work.
df$VARNAME_ <- paste0('COL', 1:5)(df)
I want it to look like this.
VARNAME_ old_var1 old_var2
COL1            1        2
COL2            1        2
COL3            1        2
COL4            1        2
COL5            1        2
Thanks in advance.
I am sorry that I asked a stupid question. I have now figured it out.
The solution is as follows.
actual_df <- data.frame(df)  # convert the matrix to a data frame
actual_df <- cbind(VARNAME_ = paste0('COL', 1:5), actual_df)  # add COL1~COL5 as the first column
actual_df <- cbind(ROWTYPE_ = 'PROX', actual_df)  # add a constant column in front; VARNAME_ becomes the second column
df$VARNAME_ = paste0('COL', 1:5)
will work
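The original attempt fails because the trailing `(df)` tries to call the character vector returned by `paste0()` as if it were a function. Here is a small sketch of the working assignment, using made-up stand-in data with the column names from the question. Note that `$<-` appends the new column at the end, so it has to be moved to the front explicitly.

```r
# Hypothetical stand-in for the asker's data: 5 rows, two existing columns
df <- data.frame(old_var1 = rep(1, 5), old_var2 = rep(2, 5))

# seq_len(nrow(df)) keeps the number of labels in step with the number of rows
df$VARNAME_ <- paste0("COL", seq_len(nrow(df)))

# New columns are appended last, so reorder to put VARNAME_ first
df <- df[, c("VARNAME_", setdiff(names(df), "VARNAME_"))]
```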
Closed 5 years ago.
I would like to subset a df based on one level in a column, i.e. keep all rows that only contain this unique level within a column.
For this example I want a df that keeps all columns but only the rows that meet the criterion "blue" in column "D", without losing information. It does not matter whether that is done with subset, filter, etc.
A B C D       E
1 2 3 "blue"  8
7 4 6 "red"   5
5 9 1 "green" 2
I have tried the variations of the following script:
newdf = subset(df, D == "blue")
newdf = subset(df, levels(D) == "blue")
This should work:
newdf = df[df$D == "blue", ]
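As a runnable check, here is the same row filter on the example data. It assumes D is a factor whose values are exactly blue, red and green, with no embedded quotes or spaces. If the real column differs, the comparison may match nothing.

```r
# Example data from the question, with D stored as a factor (an assumption)
df <- data.frame(
  A = c(1, 7, 5), B = c(2, 4, 9), C = c(3, 6, 1),
  D = factor(c("blue", "red", "green")),
  E = c(8, 5, 2)
)

# Row-wise comparison: one TRUE/FALSE per row, all columns kept
newdf <- df[df$D == "blue", ]

# Optionally drop the factor levels that no longer occur
newdf$D <- droplevels(newdf$D)

# levels() returns the set of distinct labels, not one value per row,
# so levels(D) == "blue" is not a row filter
levels(df$D)
```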
Closed 2 years ago.
I am trying to merge two data frames using a variable called hash_id. For some reason R does not recognize the hash_id values in one of the data frames, while it does in the other.
I have checked and I just don't get it. See below how I checked:
> head(df1[46],1) # so I take the first 'hash-id' from df1
#      hash_id
# 1 abab123123
> which(df2 == "abab123123", arr.ind=TRUE) # here it shows that row 6847 contains a match
#       row col
# [1,] 6847  32
> which(df1 == "abab123123", arr.ind=TRUE) # and here there is NO matching value!
#      row col
#
One possibility is trailing or leading spaces in the concerned columns for one of the datasets. You could do:
library(stringr)
df1[, "hash_id"] <- str_trim(df1[,"hash_id"])
df2[, "hash_id"] <- str_trim(df2[, "hash_id"])
which(df1[, "hash_id"]=="abab123123", arr.ind=TRUE)
which(df2[, "hash_id"]=="abab123123", arr.ind=TRUE)
Another way would be to use grepl with word boundaries:
grepl("\\babab123123\\b", df1[,"hash_id"])
grepl("\\babab123123\\b", df2[,"hash_id"])
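To see how stray whitespace could produce this symptom, here is a small base R sketch with made-up data. The trailing space in `df1` and the columns `v1` and `v2` are assumptions for illustration. `trimws()` is the base R counterpart of `str_trim()`.

```r
# Hypothetical data: the first key in df1 carries a trailing space
df1 <- data.frame(hash_id = c("abab123123 ", "cdcd456456"), v1 = 1:2,
                  stringsAsFactors = FALSE)
df2 <- data.frame(hash_id = c("abab123123", "cdcd456456"), v2 = c("x", "y"),
                  stringsAsFactors = FALSE)

# nchar() exposes the hidden character: 11 instead of 10
nchar(df1$hash_id)

# Before trimming, only the clean key matches
n_before <- nrow(merge(df1, df2, by = "hash_id"))

# Trim both key columns, then merge again
df1$hash_id <- trimws(df1$hash_id)
df2$hash_id <- trimws(df2$hash_id)
merged <- merge(df1, df2, by = "hash_id")
n_after <- nrow(merged)
```

If the lengths reported by `nchar()` already agree in the real data, whitespace is probably not the cause and something else, such as differing case or encoding, would be worth checking.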