I have a question regarding the use of paste in R.
a<-c(1,2,3,5,5,6,7,8)
b<-c(2,3,5,6,2,3,6,7)
d<-c(2,8,4,6,3,7,3,5)
df<-data.frame(a,b)
cbind(df,sugar=d)
Using the above code, I got this:
> a b sugar
1 1 2 2
2 2 3 8
3 3 5 4
4 5 6 6
5 5 2 3
6 6 3 7
7 7 6 3
8 8 7 5
However, I wonder why I couldn't get the same result using the paste function:
name<-c("sugar","salt","fat")
cbind(df,paste(name[1])=d)
Any help would be much appreciated!!
If you need to create a new column whose name is stored in an object, try
df[name[1]] <- d
df
# a b sugar
#1 1 2 2
#2 2 3 8
#3 3 5 4
#4 5 6 6
#5 5 2 3
#6 6 3 7
#7 7 6 3
#8 8 7 5
Another option might be to use assign
assign('df', `[[<-`(df, name[1], value=d))
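As for why the cbind call in the question fails: R expects the name on the left of = in a call to be a literal name or string, so an expression such as paste(name[1]) there is a syntax error rather than a computed column name. A minimal sketch, assuming the a, b, d and name objects from the question (err, df2 and df3 are names made up for illustration):

```r
a <- c(1, 2, 3, 5, 5, 6, 7, 8)
b <- c(2, 3, 5, 6, 2, 3, 6, 7)
d <- c(2, 8, 4, 6, 3, 7, 3, 5)
df <- data.frame(a, b)
name <- c("sugar", "salt", "fat")

# The original call does not even parse: argument names cannot be computed
err <- tryCatch(parse(text = "cbind(df, paste(name[1]) = d)"),
                error = function(e) e)
inherits(err, "error")

# Option 1: assign by a computed name with [[<-
df2 <- df
df2[[name[1]]] <- d

# Option 2: name a one-column data frame first, then bind it
df3 <- cbind(df, setNames(data.frame(d), name[1]))

names(df2)
names(df3)
```

Both options leave the original df untouched, which makes it easy to compare the results side by side.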
You want to change the name, so try setNames.
> setNames(cbind(df, d), c(colnames(df),name[1]))
a b sugar
1 1 2 2
2 2 3 8
3 3 5 4
4 5 6 6
5 5 2 3
6 6 3 7
7 7 6 3
8 8 7 5
I have the dataframe below, in which two pairs of rows share the same values in columns A and B: the 3rd and 4th rows (2, 3) and the 7th and 8th rows (4, 6).
master <- data.frame(A=c(1,1,2,2,3,3,4,4,5,5), B=c(1,2,3,3,4,5,6,6,7,8),C=c(5,2,5,7,7,5,7,9,7,8),D=c(1,2,5,3,7,5,9,6,7,0))
A B C D
1 1 1 5 1
2 1 2 2 2
3 2 3 5 5
4 2 3 7 3
5 3 4 7 7
6 3 5 5 5
7 4 6 7 9
8 4 6 9 6
9 5 7 7 7
10 5 8 8 0
I would like to merge each such pair of rows into one, joining the values of C and D with the pipe character |. For example, the 2nd and 3rd lines would become:
A B C D
2 3 2|5 2|5
I think the combined pairs in your example are off by a row. Assuming that's the case, this is what you're looking for: group by the columns whose duplicates you want to collapse, then use summarize_all with paste0 to combine the remaining values with a separator.
library(tidyverse)
master %>% group_by(A,B) %>% summarize_all(~ paste0(., collapse="|"))
A B C D
<dbl> <dbl> <chr> <chr>
1 1 1 5 1
2 1 2 2 2
3 2 3 5|7 5|3
4 3 4 7 7
5 3 5 5 5
6 4 6 7|9 9|6
7 5 7 7 7
8 5 8 8 0
We can do this in base R with aggregate
aggregate(.~ A + B, master, FUN = paste, collapse= '|')
# A B C D
#1 1 1 5 1
#2 1 2 2 2
#3 2 3 5|7 5|3
#4 3 4 7 7
#5 3 5 5 5
#6 4 6 7|9 9|6
#7 5 7 7 7
#8 5 8 8 0
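The same aggregate idea can be wrapped in a small helper so that the key columns and the separator become parameters. This is only a sketch: collapse_duplicates, keys and sep are names made up here, and it assumes every non-key column should be pasted together.

```r
master <- data.frame(A = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
                     B = c(1, 2, 3, 3, 4, 5, 6, 6, 7, 8),
                     C = c(5, 2, 5, 7, 7, 5, 7, 9, 7, 8),
                     D = c(1, 2, 5, 3, 7, 5, 9, 6, 7, 0))

# Hypothetical helper: collapse rows that share the same key values,
# pasting the remaining columns together with `sep`
collapse_duplicates <- function(data, keys, sep = "|") {
  values <- data[setdiff(names(data), keys)]
  aggregate(values, by = data[keys], FUN = paste, collapse = sep)
}

res <- collapse_duplicates(master, keys = c("A", "B"))
res[res$A == 2 & res$B == 3, ]
```

Note that the pasted columns come back as character, even for groups that contain a single row.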
I want to replicate a vector several times, with a different single value left out of each copy, in sequence.
For example, my vector is
value <- 1:7
The first copy omits 1, the second omits 2, and so on. In the end, all the copies are concatenated into one vector.
The intended output looks like
2 3 4 5 6 7 1 3 4 5 6 7 1 2 4 5 6 7 1 2 3 5 6 7 1 2 3 4 6 7 1 2 3 4 5 7 1 2 3 4 5 6
Is there any smart way to do this?
You could use the diagonal matrix to set up a logical vector, using it to remove the appropriate values.
n <- 7
rep(1:n, n)[!diag(n)]
# [1] 2 3 4 5 6 7 1 3 4 5 6 7 1 2 4 5 6 7 1 2 3 5 6 7 1 2 3 4 6 7 1 2 3 4 5
# [36] 7 1 2 3 4 5 6
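To see why this works, it may help to look at a smaller case. The sketch below uses n = 3 and lays the repeated vector out as a matrix purely for inspection (m and mask are illustrative names): each column is one copy of 1:n, and the negated identity matrix is FALSE exactly at the element to drop from that copy.

```r
n <- 3
m <- matrix(rep(1:n, n), nrow = n)  # each column is a copy of 1:n
mask <- !diag(n)                    # TRUE everywhere except on the diagonal
m
mask

# Indexing with the mask drops element i from the i-th copy
res <- rep(1:n, n)[mask]
res
```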
Well, you can certainly do it as a one-liner but I am not sure it qualifies as smart. For example:
x <- 1:7
do.call("c", lapply(as.list(-1:-length(x)), function(a)x[a]))
This simply uses lapply to create a list of copies of x, each with one of its entries deleted, and then concatenates them using c. The do.call function applies its first argument (a function) to its second argument (a list of arguments to the function).
For fun, it's also possible to just use rep:
> n <- 7
> rep(1:n, n)[rep(c(FALSE, rep(TRUE, n)), length.out=n^2)]
[1] 2 3 4 5 6 7 1 3 4 5 6 7 1 2 4 5 6 7 1 2 3 5 6 7 1 2 3 4 6 7 1 2 3 4 5 7 1 2
[39] 3 4 5 6
But lapply is cleaner, I think.
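One thing the lapply idea gives you is that it indexes by position, so it should work for any vector, not just 1:n. A small sketch with made-up values (leave_one_out and via_mask are just illustrative names), using seq_along and unlist in place of as.list and do.call:

```r
x <- c(10, 20, 30)

# For each position i, keep everything except x[i], then flatten the list
leave_one_out <- unlist(lapply(seq_along(x), function(i) x[-i]))
leave_one_out

# The mask idea gives the same result when applied to x itself
via_mask <- rep(x, length(x))[!diag(length(x))]
via_mask
```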
You could also do:
n <- 7
rep(seq(n), n)[-seq(1,n*n,n+1)]
#[1] 2 3 4 5 6 7 1 3 4 5 6 7 1 2 4 5 6 7 1 2 3 5 6 7 1 2 3 4 6 7 1 2 3 4 5 7 1 2 3 4 5 6
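For anyone wanting to convince themselves that the approaches above agree, a quick check along these lines should do (the via_* names are only for illustration):

```r
n <- 7
x <- 1:n

via_diag   <- rep(1:n, n)[!diag(n)]
via_lapply <- do.call("c", lapply(as.list(-1:-length(x)), function(a) x[a]))
via_rep    <- rep(1:n, n)[rep(c(FALSE, rep(TRUE, n)), length.out = n^2)]
via_seq    <- rep(seq(n), n)[-seq(1, n * n, n + 1)]

# Each result should contain n * (n - 1) values
length(via_diag)

# TRUE if every variant matches the diag() version
agree <- all(sapply(list(via_lapply, via_rep, via_seq),
                    function(v) isTRUE(all.equal(v, via_diag))))
agree
```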
My current dataset looks like this:
Order V1
1 7
2 5
3 8
4 5
5 8
6 3
7 4
8 2
1 8
2 6
3 3
4 4
5 5
6 7
7 3
8 6
I want to create a new variable called "V2" based on the variables "Order" and "V1". Within every block of 8 items in "Order", "V2" should be 0 where "Order" equals 1; otherwise, "V2" should take the value of the previous item in "V1".
This is the dataset that I want
Order V1 V2
1 7 0
2 5 7
3 8 5
4 5 8
5 8 5
6 3 8
7 4 3
8 2 4
1 8 0
2 6 8
3 3 6
4 4 3
5 5 4
6 7 5
7 3 7
8 6 3
Since my actual dataset is very large, I'm trying to use a for loop with an if statement to generate "V2", but my code keeps failing. I would appreciate any help with this, and I'm open to other approaches. Thank you!
(Up front: I am assuming that the order of Order is perfectly controlled.)
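You simply need ifelse and lag from dplyr (base R's stats::lag does not shift a plain vector, so call dplyr::lag explicitly):
df <- read.table(text="Order V1
1 7
2 5
3 8
4 5
5 8
6 3
7 4
8 2
1 8
2 6
3 3
4 4
5 5
6 7
7 3
8 6 ", header=TRUE)
df$V2 <- ifelse(df$Order==1, 0, dplyr::lag(df$V1))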
df
# Order V1 V2
# 1 1 7 0
# 2 2 5 7
# 3 3 8 5
# 4 4 5 8
# 5 5 8 5
# 6 6 3 8
# 7 7 4 3
# 8 8 2 4
# 9 1 8 0
# 10 2 6 8
# 11 3 3 6
# 12 4 4 3
# 13 5 5 4
# 14 6 7 5
# 15 7 3 7
# 16 8 6 3
In base R, with the data frame named dat:
with(dat, {V2 <- c(0, head(V1, -1)); V2[Order == 1] <- 0; dat$V2 <- V2; dat})
Order V1 V2
1 1 7 0
2 2 5 7
3 3 8 5
4 4 5 8
5 5 8 5
6 6 3 8
7 7 4 3
8 8 2 4
9 1 8 0
10 2 6 8
11 3 3 6
12 4 4 3
13 5 5 4
14 6 7 5
15 7 3 7
16 8 6 3
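If you would rather make the blocks explicit, one possible base R sketch is to number them with cumsum and shift V1 within each block using ave. The name block is made up here, and the approach assumes that every block starts with Order equal to 1.

```r
dat <- data.frame(Order = rep(1:8, times = 2),
                  V1 = c(7, 5, 8, 5, 8, 3, 4, 2, 8, 6, 3, 4, 5, 7, 3, 6))

# Start a new block every time Order resets to 1
block <- cumsum(dat$Order == 1)

# Within each block, shift V1 down by one and put 0 in the first slot
dat$V2 <- ave(dat$V1, block, FUN = function(v) c(0, head(v, -1)))
dat
```

Because the shift happens per block, no value from the end of one block leaks into the start of the next.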
How can I generate the vector (1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9)?
seq() and rep() don't seem to accept the arguments I would need.
I read the help pages but failed to find a way.
You could try
(1:5) + rep(0:4,each=5)
#[1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
NOTE: (1:5) and 0:4 can be replaced by seq(1,5) and seq(0,4)
Another one:
as.vector(outer(1:5,0:4,"+"))
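Both answers rely on the same idea: a base sequence repeated n times, plus an offset that grows by one per block (in the first answer, R recycles 1:5 to match the longer rep(0:4, each=5)). A parameterised sketch, where shifted_blocks, len and n are names invented for illustration:

```r
# Hypothetical helper: n blocks of length len, block k shifted by k - 1
shifted_blocks <- function(len, n) {
  rep(seq_len(len), times = n) + rep(seq_len(n) - 1L, each = len)
}

v <- shifted_blocks(5, 5)
v

# Should match the outer() version
identical(v, as.vector(outer(1:5, 0:4, "+")))
```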
Apologies if this is posted elsewhere. I searched here and elsewhere and found things that were close but not quite what I needed. After sinking a couple of hours into this, I'm posting!
I need to remove rows from a data set that have duplicate values in value1 within the same id. So in the following data frame I'd only want to remove row 3. I do not want to remove row 10 or row 9. If it makes a difference, in the actual data the values are dates.
I know the solution is probably very simple but I've yet to get it exactly right. Thanks!
x <- data.frame(cbind(id=c(1,2,2,2,3,3,4,5,6,6), value1=c(6,8,8,1,9,5,4,3,8,4), value2=1:10))
> x
id value1 value2
1 1 6 1
2 2 8 2
3 2 8 3
4 2 1 4
5 3 9 5
6 3 5 6
7 4 4 7
8 5 3 8
9 6 8 9
10 6 4 10
I want to end up with:
> x
id value1 value2
1 1 6 1
2 2 8 2
4 2 1 4
5 3 9 5
6 3 5 6
7 4 4 7
8 5 3 8
9 6 8 9
10 6 4 10
Try duplicated:
> x[!duplicated(x[1:2]), ]
id value1 value2
1 1 6 1
2 2 8 2
4 2 1 4
5 3 9 5
6 3 5 6
7 4 4 7
8 5 3 8
9 6 8 9
10 6 4 10
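Selecting the key columns by name rather than by position makes the intent a little clearer. The sketch below also repeats the check with value1 converted to dates, since the question mentions that the real values are dates; the dates are made up from the numbers, and dup, deduped and deduped_dates are illustrative names.

```r
x <- data.frame(id = c(1, 2, 2, 2, 3, 3, 4, 5, 6, 6),
                value1 = c(6, 8, 8, 1, 9, 5, 4, 3, 8, 4),
                value2 = 1:10)

# TRUE for rows whose (id, value1) pair already occurred in an earlier row
dup <- duplicated(x[c("id", "value1")])
which(dup)
deduped <- x[!dup, ]

# Same idea with value1 stored as dates (hypothetical dates built from the numbers)
x$value1 <- as.Date("2020-01-01") + x$value1
deduped_dates <- x[!duplicated(x[c("id", "value1")]), ]
identical(rownames(deduped_dates), rownames(deduped))
```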