This question already has answers here:
Replace given value in vector
(8 answers)
Closed 4 years ago.
I have a column Apps in dataframe dframe that looks like this:
Apps
1 31
2 12
3 10
4 33
5 -
I need the column to be of type integer rather than character, so I need to replace the "-" in the 5th row with 0:
Apps
1 31
2 12
3 10
4 33
5 0
dframe$Apps[dframe$Apps == "-"] <- "0"
dframe$Apps <- as.integer(dframe$Apps)
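You can do it with ifelse and the tidyverse approach. Convert the column to character first so that the values themselves, not factor level codes, are kept, then convert the result to integer:
library(dplyr)
df %>%
  mutate(Apps = as.integer(ifelse(Apps == "-", "0", as.character(Apps))))
  Apps
1   31
2   12
3   10
4   33
5    0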
Dataset:
df <- read.table(text = " Apps
1 31
2 12
3 10
4 33
5 -", header = TRUE)
dframe$Apps <- as.integer(gsub("-", "0", dframe$Apps, fixed = TRUE))
will give you an integer column as I suspect you want.
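One caveat worth checking, shown here as a minimal sketch: in R versions before 4.0, read.table and data.frame turn text columns into factors by default, and as.integer() on a factor returns the internal level codes rather than the printed numbers. The example assumes a factor column (forced here with stringsAsFactors = TRUE) and goes through as.character() first.

```r
# Assumption: Apps was read as a factor (the default before R 4.0).
dframe <- data.frame(Apps = c("31", "12", "10", "33", "-"),
                     stringsAsFactors = TRUE)

# as.integer() on a factor yields level codes, not the displayed values.
codes <- as.integer(dframe$Apps)

# Convert to character first, replace the dash, then convert to integer.
apps_chr <- as.character(dframe$Apps)
apps_chr[apps_chr == "-"] <- "0"
dframe$Apps <- as.integer(apps_chr)
```

If the column is already character, the as.character() call is a no-op, so the same lines should work in either case.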
This question already has answers here:
R - Filter a vector using a function
(5 answers)
Closed 2 years ago.
Let's say I have the following vector
vec <- rep(1:20,sample(1:5, 20, replace = T))
table(vec)
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
 1  5  4  3  5  2  1  1  3  5  3  2  3  1  5  5  3  1  4  1
I want to only keep the numbers that appear once. Anything that appears more than once, I want to remove it. So in the end I'd like to end up with
1 7 8 14 18 20
At the moment, I'm generating this with the following command
vec2 <- names(which(table(vec) == 1))
vec2
But I'm wondering if there's a better (and more efficient) way of doing this.
We can use %in% to create a logical vector based on the OP's code and keep only the values that appear once
vec[vec %in% vec2]
Or in a single line using ave and length
vec[ave(seq_along(vec), vec, FUN = length) == 1]
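Another option, sketched here with base R's duplicated() on a fixed example vector (the vector is an assumption, chosen so the result is reproducible): flag every element that has a duplicate when scanning from either end, and keep the rest.

```r
# Hypothetical fixed input so the result does not depend on sample().
vec <- c(1, 2, 2, 3, 3, 3, 7, 8, 9, 9)

# duplicated() marks repeats after the first occurrence; scanning from the
# end as well also marks the first occurrence of every repeated value.
is_repeated <- duplicated(vec) | duplicated(vec, fromLast = TRUE)

singles <- vec[!is_repeated]
singles
# [1] 1 7 8
```

Unlike names(which(table(vec) == 1)), this keeps the original type of vec (numeric here) instead of returning character names.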
This question already has answers here:
Multiplying elements of a column in skipping an element after each iteration
(3 answers)
Closed 3 years ago.
x <- as.data.frame(1:5)
with the above data frame I want to create a new column which has a running product, i.e. the first element should be
1*2*3*4*5 = 120 then
2*3*4*5 = 120 then
3*4*5 = 60
and so on.
How can I do this in R?
result should be
> x[,"result"] <- c(120,120,60,20,5)
> x
1:5 result
1 1 120
2 2 120
3 3 60
4 4 20
5 5 5
We can use cumprod
rev(cumprod(rev(x[[1]])))
#[1] 120 120 60 20 5
Or
rev(Reduce(`*`, rev(x[[1]]), accumulate = TRUE))
Also, purrr (loaded with the tidyverse) has a convenient wrapper, accumulate
library(tidyverse)
x %>%
mutate(result = accumulate(`1:5`, `*`, .dir = "backward"))
# 1:5 result
#1 1 120
#2 2 120
#3 3 60
#4 4 20
#5 5 5
To do so while simply adding a new column to your data:
data <- data.frame(list(x = 1:5))
data
x
1 1
2 2
3 3
4 4
5 5
data$prod <- apply(data,1,function(x) prod(x:5))
data
x prod
1 1 120
2 2 120
3 3 60
4 4 20
5 5 5
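The call above hard-codes the upper bound 5 and only works while data has a single column. A minimal sketch that avoids both, assuming the goal is the product of each element with all elements below it in the column, indexes by row position instead:

```r
data <- data.frame(x = 1:5)

# For row i, multiply the values from row i through the last row.
n <- nrow(data)
data$prod <- sapply(seq_len(n), function(i) prod(data$x[i:n]))
```

This recomputes a product for every row, so for long columns rev(cumprod(rev(data$x))) is likely the cheaper choice; both should give the same values.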
This question already has answers here:
Replacing NAs with latest non-NA value
(21 answers)
Closed 4 years ago.
Hi everyone. I want to replace each NA with the non-NA value for the same participant. I tried this, but it returns the original df, and I don't know why.
df = data.frame(block = c('1',NA,NA,'2',NA,NA,'3',NA,NA),
subject = c('31','31','31','32','32','32','33','33','33'))
df[df$subject == 1 & is.na(df$block)] = df[df$subject == 31 &!is.na(df$block)]
# define a for loop from 1 to n subjects
for (i in 1:length(unique(df$subject))) {
# replace the block with NA in block that is not NA for the same participant
df[df$subject == i & is.na(df$block)] = df[df$subject == i & !is.na(df$block)]
}
Here is what I want to get (the original post showed the expected result as an image, which is not preserved here).
Using the dplyr and zoo libraries, I replaced the NA values in the block column with the previous non-NA values:
library(dplyr)
library(zoo)
df2 <- df %>%
do(na.locf(.))
The end result looks as follows:
df2
block subject
1 1 31
2 1 31
3 1 31
4 2 32
5 2 32
6 2 32
7 3 33
8 3 33
9 3 33
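na.locf carries values forward down the whole column, so it relies on each subject's first row being non-NA. A base R sketch that fills within each subject explicitly, assuming every subject has a single distinct non-NA block value (the helper name fill_from_group is made up for illustration):

```r
df <- data.frame(block   = c("1", NA, NA, "2", NA, NA, "3", NA, NA),
                 subject = c("31", "31", "31", "32", "32", "32", "33", "33", "33"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: repeat the group's first non-NA value across the group.
fill_from_group <- function(b) rep(b[!is.na(b)][1], length(b))

# ave() applies the helper to block separately for each subject.
df$block <- ave(df$block, df$subject, FUN = fill_from_group)
```

Because the fill is done per subject, the row order within a subject does not matter here, which is the main difference from the carry-forward approach.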
This question already has answers here:
Calculating cumulative sum for each row
(6 answers)
Closed 5 years ago.
I have 2 columns, "x" and "y" generated with this code:
x = 1:8
y = c(2,7,1,3,5,4,1,2)
data = data.frame(x,y)
It looks like this:
x y
1 2
2 7
3 1
4 3
5 5
6 4
7 1
8 2
Now I want a column "z" that holds the running total of "y", i.e. each row's "y" plus all the previous rows of "y".
x y z
1 2 2
2 7 9
3 1 10
4 3 13
5 5 18
6 4 22
7 1 23
8 2 25
I have tried everything without any luck.
Use cumsum, the cumulative sum function.
data$z <- cumsum(data$y)
Probably not the cleanest way, but this loop is easy to understand and works well:
data$z=NA
for(i in 1:nrow(data)){
if(i==1){
data[i,'z']=data[i,'y']
} else{
data[i,'z']=data[i,'y']+data[i-1,'z']
}
}
This question already has answers here:
R: Differences by group and adding
(3 answers)
Closed 6 years ago.
I have the following dataset:
df <- data.frame (id= c(1,1,1,2,2), time = c(13,14,17,17,17))
id time
1 1 13
2 1 14
3 1 17
4 2 17
5 2 17
and for each id I wish to subtract the previous time from the current time, with 0 for the first row of each id. So my ideal output would be:
#output
id time diff
1 1 13 0
2 1 14 1
3 1 17 3
4 2 17 0
5 2 17 0
What is the most efficient way for that?
Thanks to Zheyuan Li.
This is a great solution:
df$diff <- with(df, ave(time, id, FUN = function (x) c(0, diff(x))))
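To see why the 0 is prepended, here is a small sketch that runs the same idea on the question's data and also looks at the per-id pieces (the variable names pieces and per_id are illustrative only):

```r
df <- data.frame(id = c(1, 1, 1, 2, 2), time = c(13, 14, 17, 17, 17))

# split() shows the time values that ave() hands to FUN for each id.
pieces <- split(df$time, df$id)

# diff(x) is one element shorter than x, so a leading 0 restores the length
# and stands in for each id's first row, which has no previous time.
per_id <- lapply(pieces, function(x) c(0, diff(x)))

df$diff <- with(df, ave(time, id, FUN = function(x) c(0, diff(x))))
df$diff
# [1] 0 1 3 0 0
```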