How can I loop a data matrix in R?

I am trying to loop over a data matrix for each separate ID tag, “1”, “2” and “3” (see my data at the bottom). Ultimately I want to transform the X and Y coordinates into a time series with the ts() function, but first I need to build a loop into the function that returns a time series for each separate ID. The looping itself works perfectly fine when I use the following code on a data frame:
for(i in 1:3){
print(na.omit(xyframe[ID==i,]))
}
Returning the following output:
Timestamp X Y ID
1. 0 -34.012 3.406 1
2. 100 -33.995 3.415 1
3. 200 -33.994 3.427 1
Timestamp X Y ID
4. 0 -34.093 3.476 2
5. 100 -34.145 3.492 2
6. 200 -34.195 3.506 2
Timestamp X Y ID
7. 0 -34.289 3.522 3
8. 100 -34.300 3.520 3
9. 200 -34.303 3.517 3
Yet, when I want to produce a loop in a matrix with the same code:
for(i in 1:3){
print(na.omit(xymatrix[ID==i,]))
}
It returns the following error:
Error in print(na.omit(xymatrix[ID == i, ]) :
(subscript) logical subscript too long
Why does looping over the ID work for the data frame but not for the matrix, and how can I fix it?
I have also read that looping is computationally more expensive than a vectorized approach. Is there a vectorized way to do this?
The data (simplification of the real data):
Timestamp X Y ID
1. 0 -34.012 3.406 1
2. 100 -33.995 3.415 1
3. 200 -33.994 3.427 1
4. 0 -34.093 3.476 2
5. 100 -34.145 3.492 2
6. 200 -34.195 3.506 2
7. 0 -34.289 3.522 3
8. 100 -34.300 3.520 3
9. 200 -34.303 3.517 3

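The form xymatrix[ID==i,] doesn't work for a matrix, because ID is looked up as a free-standing variable rather than as a column of the matrix. Refer to the column explicitly instead: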
for(i in 1:3){ print(na.omit(xymatrix[xymatrix[,'ID'] == i,])) }
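A self-contained sketch of the same idea, using the question's simplified data rebuilt as a numeric matrix. The name xymatrix comes from the question; building it with cbind() is an assumption about how the matrix was made, and rows_by_id and by_id are illustrative names. The second half shows one loop-free variant that splits the row numbers by ID:

```r
# Rebuild the question's simplified data as a numeric matrix (assumed layout)
xymatrix <- cbind(
  Timestamp = rep(c(0, 100, 200), times = 3),
  X  = c(-34.012, -33.995, -33.994, -34.093, -34.145, -34.195,
         -34.289, -34.300, -34.303),
  Y  = c(3.406, 3.415, 3.427, 3.476, 3.492, 3.506, 3.522, 3.520, 3.517),
  ID = rep(1:3, each = 3)
)

# Loop version: the ID column is named explicitly inside the subscript.
# drop = FALSE keeps the result a matrix even if only one row matches.
for (i in 1:3) {
  print(xymatrix[xymatrix[, "ID"] == i, , drop = FALSE])
}

# Loop-free version: split the row numbers by ID, then pull each block of rows
rows_by_id <- split(seq_len(nrow(xymatrix)), xymatrix[, "ID"])
by_id <- lapply(rows_by_id, function(r) xymatrix[r, , drop = FALSE])
```

by_id is a list with one matrix per ID, named "1", "2" and "3".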

In general, if you want to apply a function to a data frame, split by some factor, then you should be using one of the apply family of functions in combination with split.
Here's some reproducible sample data.
n <- 20
some_data <- data.frame(
x = sample(c(1:5, NA), n, replace= TRUE),
y = sample(c(letters[1:5], NA), n, replace= TRUE),
grp = gl(3, 1, length = n)
)
If you want to print out the rows with no missing values, split by each ID level, then you want something like this.
lapply(split(some_data, some_data$grp), na.omit)
or more concisely using the plyr package.
library(plyr)
dlply(some_data, .(grp), na.omit)
Both methods return output like this
# $`1`
# x y grp
# 1 2 d 1
# 4 3 e 1
# 7 3 c 1
# 10 4 a 1
# 13 2 e 1
# 16 3 a 1
# 19 1 d 1
# $`2`
# x y grp
# 2 1 e 2
# 5 3 e 2
# 8 3 b 2
# $`3`
# x y grp
# 6 3 c 3
# 9 5 a 3
# 12 2 c 3
# 15 2 d 3
# 18 4 a 3
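Applied to the original question, the same split-then-apply pattern can build one time series per ID. This is only a sketch under assumptions: the data frame below retypes the question's simplified data, ts_by_id is an illustrative name, and deltat = 100 assumes the timestamps are evenly spaced 100 units apart within each ID.

```r
# The question's simplified data as a data frame
xyframe <- data.frame(
  Timestamp = rep(c(0, 100, 200), times = 3),
  X  = c(-34.012, -33.995, -33.994, -34.093, -34.145, -34.195,
         -34.289, -34.300, -34.303),
  Y  = c(3.406, 3.415, 3.427, 3.476, 3.492, 3.506, 3.522, 3.520, 3.517),
  ID = rep(1:3, each = 3)
)

# One multivariate time series (columns X and Y) per ID.
# Assumption: observations are evenly spaced, 100 time units apart.
ts_by_id <- lapply(split(xyframe, xyframe$ID), function(d) {
  d <- na.omit(d)                                   # drop incomplete rows first
  ts(d[, c("X", "Y")], start = d$Timestamp[1], deltat = 100)
})
```

ts_by_id[["1"]] is then the series for ID 1, and so on.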

Related

R: all combinations of nested list of variable length [duplicate]

I'm not sure if permutations is the correct word for this. Given a set of n vectors (e.g. [1,2], [3,4] and [2,3]), I want to generate every combination that takes one element from each vector, giving an output of
[1,3,2],[1,3,3],[1,4,2],[1,4,3],[2,3,2] etc.
Is there an operation in R that will do this?
This is a useful case for storing the vectors in a list and using do.call() to arrange for an appropriate function call for you. expand.grid() is the standard function you want. But so you don't have to type out or name individual vectors, try:
> l <- list(a = 1:2, b = 3:4, c = 2:3)
> do.call(expand.grid, l)
a b c
1 1 3 2
2 2 3 2
3 1 4 2
4 2 4 2
5 1 3 3
6 2 3 3
7 1 4 3
8 2 4 3
However, for all my cleverness, it turns out that expand.grid() accepts a list:
> expand.grid(l)
a b c
1 1 3 2
2 2 3 2
3 1 4 2
4 2 4 2
5 1 3 3
6 2 3 3
7 1 4 3
8 2 4 3
This is what expand.grid does.
Quoting from the help page: Create a data frame from all combinations of the supplied vectors or factors. The result is a data.frame with a row for each combination.
expand.grid(
c(1, 2),
c(3, 4),
c(2, 3)
)
Var1 Var2 Var3
1 1 3 2
2 2 3 2
3 1 4 2
4 2 4 2
5 1 3 3
6 2 3 3
7 1 4 3
8 2 4 3
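Note that expand.grid() varies its first argument fastest, whereas the output listed in the question ([1,3,2], [1,3,3], [1,4,2], ...) varies the last vector fastest. One possible way to get that ordering, sketched here with the list l from the answer above (g and res are illustrative names), is to expand the reversed list and then put the columns back in their original order:

```r
l <- list(a = 1:2, b = 3:4, c = 2:3)

# Expand the reversed list so that c varies fastest and a slowest
g <- expand.grid(rev(l))

# Restore the original column order a, b, c
res <- g[names(l)]
res
```

The first rows of res are then (1, 3, 2), (1, 3, 3), (1, 4, 2), matching the order asked for.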
As an alternative to expand.grid() you could use rep() to produce the desired combination. Consider the following simplified example using the original data from this question:
a <- c(1,2)
b <- c(3,4)
c <- c(2,3)
To get the expand.grid()-like effect, use rep() on the first vector with a times= argument equal to the product of the lengths of the other vectors (here 4). The middle vector uses a nested rep() with the products of the vector lengths on either side (here 2 and 2). The end vector is handled like the first, but with an each= argument so that the pattern lines up correctly. This is trivial to calculate when every vector has length 2. Example:
#tibble of all combinations of a, b and c
tibble::tibble(
var1 = rep(a, times = 4),
var2 = rep(rep(b, each= 2), times = 2), #nested rep()
var3 = rep(c, each= 4)
)
For an unknown number of input vectors (or unknown vector lengths), we can get all combinations with rep() in a function like this:
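# Produces a tibble of all combinations of the input vectors.
# Assumes at least two input vectors; the native pipe |> requires R >= 4.1.
expand_tibble <- function(...){
  x <- list(...)                       # all input vectors stored here
  l <- lapply(x, length) |> unlist()   # length of each input vector
  t <- length(l)                       # total input vector count
  r <- list()                          # empty list, filled column by column
  for(i in 1:t){
    if(i == 1){                        # first input vector: varies slowest
      first <- l[2:length(l)] |> prod()
      r[[i]] <- rep(x[[i]], each = first)
    }else if(i == t){                  # last input vector: varies fastest
      last <- l[1:(t-1)] |> prod()
      r[[i]] <- rep(x[[i]], times = last)
    }else{                             # all middle input vectors
      m1 <- l[1:(i-1)] |> prod()       # product of lengths to the left
      m2 <- l[(i+1):t] |> prod()       # product of lengths to the right
      # repeat each element once per combination to the right,
      # then repeat the whole pattern once per combination to the left
      r[[i]] <- rep(rep(x[[i]], each = m2), times = m1)
    }
    names(r)[i] <- paste0("var", i)
  }
  tibble::as_tibble(r)
}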
output:
expand_tibble(a,b,c)
var1 var2 var3
<dbl> <dbl> <dbl>
1 1 3 2
2 1 3 3
3 1 4 2
4 1 4 3
5 2 3 2
6 2 3 3
7 2 4 2
8 2 4 3
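With vectors of unequal length, the each= and times= counts are no longer interchangeable, so it is worth checking a rep()-based construction against expand.grid(). The following base-R sketch does that; expand_rep is a hypothetical name, not part of any package, and it lets the first input vary slowest, as in the output above:

```r
# Hypothetical helper: all combinations via rep(), first input varying slowest
expand_rep <- function(...) {
  x <- list(...)
  len <- lengths(x)            # length of each input vector
  n <- length(x)
  cols <- lapply(seq_len(n), function(i) {
    after  <- prod(len[seq_len(n) > i])  # product of lengths to the right (1 if none)
    before <- prod(len[seq_len(n) < i])  # product of lengths to the left (1 if none)
    rep(rep(x[[i]], each = after), times = before)
  })
  names(cols) <- paste0("var", seq_len(n))
  as.data.frame(cols)
}

res <- expand_rep(1:2, 3:4, 5:7)

# Reference: expand.grid() varies its first argument fastest,
# so the inputs are passed in reverse order for comparison
ref <- expand.grid(5:7, 3:4, 1:2)
all(res$var1 == ref$Var3, res$var2 == ref$Var2, res$var3 == ref$Var1)
```

Because prod() of an empty vector is 1, the first and last vectors need no special case in this formulation.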

R How to permute all rows of a data frame such that all possible combinations of rows are returned in a list?

I'm trying to produce all possible row permutations of a data frame (or matrix if that's easier) and have the result returned as a list or array of the data frames/matrices. I've constructed a mock data frame that has the same dimensions as the one I'm working with.
test.df <- as.data.frame(matrix(1:80,nrow=16,ncol=5))
Edit: changed combinations to permutations
v.df <- data.frame(symbol = c("a", "b", "c"), number = c(1,2,3))
v.df
## symbol number
## 1 a 1
## 2 b 2
## 3 c 3
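library(gtools)  # provides permutations()
permutate.rows <- function(df) {
  k <- dim(df)[1]  # number of rows
  # each column of index.df is one ordering of the row indices 1..k
  index.df <- as.data.frame(t(permutations(n = k, r = k, v = 1:k)))
  # return the list directly so that it prints when called at the prompt
  lapply(index.df, function(idx) df[idx, , drop = FALSE])
}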
permutate.rows(v.df)
This gives the list of all permuted data frames:
$V1
symbol number
1 a 1
2 b 2
3 c 3
$V2
symbol number
1 a 1
3 c 3
2 b 2
$V3
symbol number
2 b 2
1 a 1
3 c 3
$V4
symbol number
2 b 2
3 c 3
1 a 1
$V5
symbol number
3 c 3
1 a 1
2 b 2
$V6
symbol number
3 c 3
2 b 2
1 a 1
To apply it to your example, pass your own 16-row data frame instead of this 3-row one.
I shortened the data frame to 5 rows because 16! = 20922789888000 permutations is far too many to enumerate.
library(purrr)
library(combinat)
test.df <- as.data.frame(matrix(1:25,nrow=5,ncol=5))
map(permn(1:nrow(test.df)), function(x) test.df[x,])
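Since all 16! orderings cannot realistically be held in memory, one alternative (a different technique from the exhaustive enumeration above, and only useful if a random subset of permutations is enough) is to draw a fixed number of random row orderings. A minimal base-R sketch; n_draws and random_perms are illustrative names and the seed is arbitrary:

```r
set.seed(42)  # arbitrary seed, only so the draws can be reproduced

test.df <- as.data.frame(matrix(1:80, nrow = 16, ncol = 5))
n_draws <- 5  # hypothetical number of random permutations to draw

# Each element is test.df with its rows in a freshly shuffled order
random_perms <- replicate(
  n_draws,
  test.df[sample(nrow(test.df)), , drop = FALSE],
  simplify = FALSE
)

factorial(16)  # size of the full permutation space, for comparison
```

Random draws are not guaranteed to be distinct, although with 16 rows a repeat is extremely unlikely.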

How to write the remaining data frame in R after randomly subsetting the data

I took a random sample from a data frame. But I don't know how to get the remaining data frame.
df <- data.frame(x=rep(1:3,each=2),y=6:1,z=letters[1:6])
#select 3 random rows
df[sample(nrow(df),3)]
What I want is to get the remaining data frame with the other 3 rows.
sample() draws from R's random number generator, whose state advances on every call, so each run gives a different result. To reproduce a particular draw, either call set.seed() beforehand or save the result in a variable.
To answer your question: put a minus sign (-) in front of the index to get the rest of the data set.
Also, don't forget the comma after indx if you want to select rows (it is missing in your question).
set.seed(1)
indx <- sample(nrow(df), 3)
Your subset
df[indx, ]
# x y z
# 2 1 5 b
# 6 3 1 f
# 3 2 4 c
Remaining data set
df[-indx, ]
# x y z
# 1 1 6 a
# 4 2 3 d
# 5 3 2 e
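A variant of the same idea uses a logical mask instead of negative indices. It is sketched here with the question's df; picked, subset_df and rest_df are illustrative names. One reason to prefer it: if indx ever turns out empty, df[-indx, ] returns zero rows rather than the whole data frame, while the mask still behaves as expected.

```r
df <- data.frame(x = rep(1:3, each = 2), y = 6:1, z = letters[1:6])

set.seed(1)
indx <- sample(nrow(df), 3)

# TRUE for sampled rows, FALSE for the rest
picked <- seq_len(nrow(df)) %in% indx

subset_df <- df[picked, ]   # sampled rows, in their original order
rest_df   <- df[!picked, ]  # remaining rows
```

Unlike df[indx, ], the mask returns the sampled rows in their original row order rather than in the order they were drawn.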
Try:
> df
x y z
1 1 6 a
2 1 5 b
3 2 4 c
4 2 3 d
5 3 2 e
6 3 1 f
>
> df2 = df[sample(nrow(df),3),]
> df2
x y z
5 3 2 e
3 2 4 c
1 1 6 a
> df[!rownames(df) %in% rownames(df2),]
x y z
2 1 5 b
4 2 3 d
6 3 1 f
