R repeating sequence add 1 each repeat

I have a workbook problem for my R class I can't figure out. I need to "write an R command that uses rep() to create a vector with elements 1 2 3 4 2 3 4 5 3 4 5 6 4 5 6 7"
It seems to be a repeating sequence of 1 to 4, repeating 4 times and on each repeat adding 1 to the starting element. I'm very very new to R so I'm stumped. Any help would be appreciated.

We can use rep and add the result to the initial vector (v1 is defined under "data" below)
v1 + rep(0:3, each = length(v1))
#[1] 1 2 3 4 2 3 4 5 3 4 5 6 4 5 6 7
Or using sapply
c(sapply(v1, `+`, 0:3))
Or using outer
c(outer(v1, 0:3, `+`))
data
v1 <- 1:4
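
For readers who want to run this in one piece, here is a minimal self-contained sketch of the same idea with both operands written out using rep(). The names n_rep and res are only illustrative; they are not from the answers above.

```r
# Base pattern and number of repeats (values taken from the question).
v1 <- 1:4
n_rep <- 4

# Repeat the pattern n_rep times, then add 0 to the first copy,
# 1 to the second copy, and so on.
res <- rep(v1, times = n_rep) + rep(seq_len(n_rep) - 1L, each = length(v1))
res
```

Writing rep(v1, times = n_rep) explicitly is optional, since R would recycle v1 anyway, but it makes the two aligned vectors easier to see.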

Another option is to use sequence:
sequence(rep(4, 4), 1:4)
#[1] 1 2 3 4 2 3 4 5 3 4 5 6 4 5 6 7

Related

R Create column that provides grouping number for each distinct group [duplicate]

I need to add a column to my data that contains a number grouping for each distinct combination of other columns. It will likely be more clear with this example:
# Make data
df <- data.frame(x = c(1,1,2,3,4,5,2,3,4,5),
y = c(2, 2,3,4,5,1,3,4,5,1),
value = c(1,2,3,4,5,6,7,8,9,10))
# Print the data
df
x y value
1 1 2 1
2 1 2 2
3 2 3 3
4 3 4 4
5 4 5 5
6 5 1 6
7 2 3 7
8 3 4 8
9 4 5 9
10 5 1 10
I need to add a "Location" column that numbers each unique (or distinct) combination of x and y. Duplicated x and y combinations should all use the same number. In my example there are 5 unique combinations of x and y, so I only have a maximum of 5 Locations. My goal output is this:
x y value Location
1 1 2 1 1
2 1 2 2 1
3 2 3 3 2
4 3 4 4 3
5 4 5 5 4
6 5 1 6 5
7 2 3 7 2
8 3 4 8 3
9 4 5 9 4
10 5 1 10 5
I imagine doing something like this:
df <- df %>%
group_by(x,y) %>%
mutate(Location = ndistinct(x,y)
But this doesn't work. Any help is appreciated!
Thanks!
df %>% mutate(., Location=group_indices(., x,y))
x y value Location
1 1 2 1 1
2 1 2 2 1
3 2 3 3 2
4 3 4 4 3
5 4 5 5 4
6 5 1 6 5
7 2 3 7 2
8 3 4 8 3
9 4 5 9 4
10 5 1 10 5
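
As far as I know, passing grouping columns straight to group_indices() is deprecated from dplyr 1.0.0 onwards, and cur_group_id() is the suggested replacement. The following is a sketch under that assumption (it needs dplyr 1.0.0 or later); df_loc is just an illustrative name.

```r
library(dplyr)

# Same data as in the question.
df <- data.frame(x = c(1, 1, 2, 3, 4, 5, 2, 3, 4, 5),
                 y = c(2, 2, 3, 4, 5, 1, 3, 4, 5, 1),
                 value = 1:10)

# Group by the key columns, label each group with its id, then ungroup.
df_loc <- df %>%
  group_by(x, y) %>%
  mutate(Location = cur_group_id()) %>%
  ungroup()

df_loc
```

The ids follow the sorted order of the groups, which for this data happens to match the order of first appearance.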
See here and here.
Not quite as straightforward as I thought to start with.
Update
To answer OP's question: the dot . is a placeholder for "the object on the left hand side of the pipe" (%>%). Normally you don't need it because, by default, magrittr (the package which defines the pipe) assumes that you want to use the object on the left hand side of the pipe as the first argument to the function on the right hand side of the pipe, and makes the substitution for you. This is very helpful because the tidyverse is designed so that the object on the left hand side of the pipe is always the first argument to the function on the right hand side - so you don't have to use the dot.
If you use functions that don't belong to the tidyverse, you sometimes need the dot to override magrittr's default behaviour.
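
A small sketch of the two behaviours described above, using magrittr directly. The variable names are only for illustration.

```r
library(magrittr)

x <- c(1, 4, 9)

# Default behaviour: the left-hand side becomes the first argument.
a <- x %>% sqrt()             # same as sqrt(x)

# Explicit dot: put the left-hand side somewhere other than first.
b <- 2 %>% rep(x, times = .)  # same as rep(x, times = 2)

a
b
```

In the second call the dot appears as an argument in its own right, so magrittr does not also insert the left-hand side as the first argument.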
I wrote my first version of this answer without testing the code because the solution seemed "obvious". But I did test it afterwards (at the same time as OP reported the error) and found that it didn't work. A quick Google brought me to the github issue in the second link above, and hence to the correct answer.
I don't yet understand why, in this particular case, a tidyverse function doesn't work as I expect. (Other than taking the easy way out and saying that my expectation was wrong!)
In base R we can use:
df$location <- as.numeric(factor(paste(df$x,df$y)))
x y value location
1 1 2 1 1
2 1 2 2 1
3 2 3 3 2
4 3 4 4 3
5 4 5 5 4
6 5 1 6 5
7 2 3 7 2
8 3 4 8 3
9 4 5 9 4
10 5 1 10 5
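
One caveat with the factor approach: the levels are sorted as text, so with larger values a key such as "10 1" would typically sort before "2 3". If you would rather number the combinations in order of first appearance, a possible base R sketch is to match each key against the unique keys (key is an illustrative name):

```r
df <- data.frame(x = c(1, 1, 2, 3, 4, 5, 2, 3, 4, 5),
                 y = c(2, 2, 3, 4, 5, 1, 3, 4, 5, 1),
                 value = 1:10)

# Build one text key per row, then number keys by first appearance.
key <- paste(df$x, df$y)
df$location <- match(key, unique(key))
df
```

For the example data both approaches give the same numbering.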

Change the order of numerically named columns in r

If I have a dataframe like the one below which has numerical column names
example = data.frame(`1`=c(1,8,3,9), `2`=c(3,2,3,3), `3`=c(5,2,5,4), `4`=c(1,2,3,4), `5`=c(2,5,7,8), check.names = FALSE)
Which looks like this:
1 2 3 4 5
1 3 5 1 2
8 2 2 2 5
3 3 5 3 7
9 3 4 4 8
And I want to arrange it so that the column names start with three and proceed through five and back to one, like this:
3 4 5 1 2
5 1 2 1 3
2 2 5 8 2
5 3 7 3 3
4 4 8 9 3
I know how to rearrange the position of a single column in a dataset, but I'm not sure how to do this with more than one column in this particular order.
We can index by column position, concatenating (c) two ranges built with the sequence operator (:)
example[c(3:5, 1:2)]
# 3 4 5 1 2
#1 5 1 2 1 3
#2 2 2 5 8 2
#3 5 3 7 3 3
#4 4 4 8 9 3
As the column names are all numeric, just convert to numeric and use that for ordering
v1 <- as.numeric(names(example))
example[c(v1[3:5], v1[1:2])]
Or simply do
example[c(names(example)[3:5], names(example)[1:2])]
Or another way is with head and tail
example[c(tail(names(example), 3), head(names(example), 2))]
data
example <- data.frame(`1`=c(1,8,3,9), `2`=c(3,2,3,3),
`3`=c(5,2,5,4), `4`=c(1,2,3,4), `5`=c(2,5,7,8), check.names = FALSE)
R will not easily let you create columns with numbers as names. If you do manage to create them (for example with check.names = FALSE), you can use match to find the positions of the column names in the order you want.
example[match(c(3:5, 1:2), names(example))]
# 3 4 5 1 2
#1 5 1 2 1 3
#2 2 2 5 8 2
#3 5 3 7 3 3
#4 4 4 8 9 3
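
If the starting column changes from case to case, the same indexing can be wrapped in a small helper. This is only a sketch: rotate_cols is a hypothetical function name, not something from base R or from the answers above.

```r
example <- data.frame(`1` = c(1, 8, 3, 9), `2` = c(3, 2, 3, 3),
                      `3` = c(5, 2, 5, 4), `4` = c(1, 2, 3, 4),
                      `5` = c(2, 5, 7, 8), check.names = FALSE)

# Hypothetical helper: rotate the columns so the one named `first` leads.
rotate_cols <- function(d, first) {
  k <- match(as.character(first), names(d))  # position of the leading column
  stopifnot(!is.na(k))                       # fail early if it is missing
  d[c(k:ncol(d), seq_len(k - 1))]            # k..n followed by 1..(k - 1)
}

rotated <- rotate_cols(example, 3)
names(rotated)
```

Using seq_len(k - 1) rather than 1:(k - 1) keeps the helper correct when the first column is requested, because seq_len(0) is empty.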

Using seq and rep to create a sequence of 5 integers that go up by 1 on each repetition

I'm trying to create the vector: 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
using rep and seq functions
So far I have this:
rep(seq(1,5),5)
Which yields:
1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5
I cannot for the life of me figure out how to add the +1 incrementally.
I have tried rep(seq(1,5),5,+1) and rep(seq(1,5),5, each +1 and many other variations.
Is a for loop needed?
Using R's feature of recycling vectors:
1:5 + rep(0:4, each = 5)
# [1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
Note that 1:5 gives the same result as seq(1,5).
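
To see what recycling does here, this sketch compares the short form with a version where the shorter vector is stretched by hand. The names short, long, explicit and recycled are illustrative.

```r
short <- 1:5                    # length 5
long  <- rep(0:4, each = 5)     # length 25: 0 0 0 0 0 1 1 1 1 1 ...

# What R does implicitly: repeat `short` until it is as long as `long`.
explicit <- rep(short, length.out = length(long)) + long

# The short form from the answer relies on that implicit repetition.
recycled <- short + long

identical(explicit, recycled)
```

Recycling is silent when the longer length is a multiple of the shorter one, as it is here (25 and 5); otherwise R issues a warning.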
Take a look at sequence:
sequence(nvec = rep(5L, 5L), from = 1:5)
[1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
Note: the from argument was introduced in R 4.0.0. From R News 4.0.0:
sequence() [...] gains arguments [e.g. from] to generate more complex sequences.
This would be a case of divide and conquer. So you basically want to build a sequence of 1-5, 2-6, 3-7, 4-8, 5-9. The pattern here being i-(i+4).
So here's a solution:
unlist(lapply(1:5, function(i) seq(i, i+4)))
You perform your sub-pattern for i = 1-5. The outcome is a list, hence you unlist it, bringing you down to a simple
[1] 1 2 3 4 5 2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9
Edit: Yes, you did need a "loop", but this demonstrates clearly that in R, many loop-operations can be performed with the *apply functions (sapply, lapply, apply, mapply).
In this case, we apply the same function to different values, which is why it is easier to use an *apply function such as lapply. If you have a calculation where you rely on a previous value or row, traditional loops are the way to go.
Here, your function is seq(i, i+4). When it is this simple, we don't bother assigning it to a name, but have instead made a "lambda function" or "anonymous" function. The exact same result could have been achieved by:
sequence_1_to_4 <- function(i) {
  seq(i, i + 4)
}
sequences <- lapply(1:5, sequence_1_to_4)
single_sequence <- unlist(sequences)
In R 4.1.0 or later, you could do this. The \(x) is shorthand for function(x).
c(sapply(1:5, \(x) x:(x+4)))
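
The c() around sapply() is needed because sapply() simplifies five equal-length results into a matrix. A short sketch of the intermediate step, written with function(x) so that it also runs on versions of R before 4.1.0 (m and flat are illustrative names):

```r
# Each call returns 5 values, so sapply() binds them as matrix columns.
m <- sapply(1:5, function(x) x:(x + 4))
dim(m)        # 5 rows, 5 columns

# c() drops the dimensions and reads the matrix column by column.
flat <- c(m)
flat
```

Because matrices are stored column by column, flattening gives the sequences in the order they were generated.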

Order data frame by column and display WITH indices

I have the following R data frame
> df
a
1 3
3 2
4 1
5 3
6 6
7 7
8 2
10 8
I order it by the a column with the order function df[ order(df), ]:
[1] 1 2 2 3 3 6 7 8
This is the result I want, BUT how can I list the whole data frame with the permuted indices?
The only thing that works is the following, but it seems sloppy and I don't really understand what it does:
> df[ order(df), c(1,1) ] # I want this but without the a.1 column!!!!
a a.1
4 1 1
3 2 2
8 2 2
1 3 3
5 3 3
6 6 6
7 7 7
10 8 8
Thanks
If we need the indices as well, use sort with index.return = TRUE
data.frame(sort(df$a, index.return=TRUE))
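
Note that the ix component returned by sort() holds positions in the vector, not the original row names. If the goal is to keep the data frame's own row names, one possible sketch is to order on the column and pass drop = FALSE, which stops a one-column data frame from collapsing to a vector. The data below is rebuilt from the printout in the question; sorted is an illustrative name.

```r
# Rebuild the data frame from the question, including its row names.
df <- data.frame(a = c(3, 2, 1, 3, 6, 7, 2, 8),
                 row.names = c("1", "3", "4", "5", "6", "7", "8", "10"))

# Order the rows by column a; drop = FALSE keeps the data frame structure.
sorted <- df[order(df$a), , drop = FALSE]
sorted
```

This also explains the output in the question: without drop = FALSE, selecting rows from a single-column data frame returns a plain vector, so the row names are lost.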

Keep column names after calling function combn

After looking at another post about column names and the combn function here, consider the same data.frame. We call combn to get all combinations of 2 columns:
foo <- data.frame(x=1:5,y=4:8,z=10:14, w=8:4)
all_comb <- combn(foo,2)
Is there a way to keep column names after the combn call, so that in this case we get "x y" instead of "X1.5 X4.8" as shown below?
comb_df <- data.frame(all_comb[1,1],all_comb[2,1])
print(comb_df)
X1.5 X4.8
1 1 4
2 2 5
3 3 6
4 4 7
5 5 8
I suspect you really want to use expand.grid() instead.
Try this:
head(expand.grid(foo))
x y z w
1 1 4 10 8
2 2 4 10 8
3 3 4 10 8
4 4 4 10 8
5 5 4 10 8
6 1 5 10 8
or
head(expand.grid(foo[, 1:2]))
x y
1 1 4
2 2 4
3 3 4
4 4 4
5 5 4
6 1 5
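
If pairs of columns are what is actually needed, another possible approach is to run combn() over the column names rather than over the data frame, and subset the data frame inside FUN. This is a sketch, not the method from the answer above; pairs is an illustrative name.

```r
foo <- data.frame(x = 1:5, y = 4:8, z = 10:14, w = 8:4)

# Combine the column *names* two at a time; each result is a two-column
# data frame that keeps its original names.
pairs <- combn(names(foo), 2, FUN = function(nm) foo[nm], simplify = FALSE)

length(pairs)
names(pairs[[1]])
pairs[[1]]
```

simplify = FALSE keeps the results as a list of data frames instead of trying to pack them into an array.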
