Extract a numeric vector from a data frame in R

I have a data.frame like the following example. I want to write a single R function that does two tasks: first, extract the values that are the same in x and y and save them as a numeric vector; second, keep the remaining rows as a data frame.
d = data.frame(x = c(1,7, 2, 9, 11),y=c(6, 7, 8, 9,10))
v = c(7, 9)
w = data.frame(x=c(1, 2, 11), y=c(6, 8, 10))
My desired result is as follows:
> result
$v
[1] 7 9
$w
x y
1 1 6
2 2 8
3 11 10

Maybe with() is what you want?
with(d, list(v = x[x==y] ,w=d[x!=y,]))
$v
[1] 7 9
$w
x y
1 1 6
3 2 8
5 11 10
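Note that the row names of w in this output are 1, 3, 5, while the desired result in the question shows 1, 2, 3. If that matters, the row names can be reset. A minimal sketch, assuming the columns are named x and y as in the question; the function name split_equal is hypothetical and not part of the answer above:

```r
# Hypothetical helper: split a data frame on the rows where x equals y
split_equal <- function(d) {
  same <- d$x == d$y
  w <- d[!same, ]
  rownames(w) <- NULL          # renumber the remaining rows 1..n
  list(v = d$x[same], w = w)
}

d <- data.frame(x = c(1, 7, 2, 9, 11), y = c(6, 7, 8, 9, 10))
result <- split_equal(d)
result$v
# [1] 7 9
rownames(result$w)
# [1] "1" "2" "3"
```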

Something along these lines should work too:
splitdf <- function(df) {
if (ncol(df) != 2) stop("df must have 2 columns")
ind <- do.call("==", df)
list(v = df[ind, 1], w = df[!ind, ])
}
d <- data.frame(x = c(1, 7, 2, 9, 11), y = c(6, 7, 8, 9, 10))
splitdf(d)
## $v
## [1] 7 9
## $w
## x y
## 1 1 6
## 3 2 8
## 5 11 10
df <- data.frame(x = c(1, 7, 2, 9, 11), z = c(7, 8, 10, 9, 12))
splitdf(df)
## $v
## [1] 9
## $w
## x z
## 1 1 7
## 2 7 8
## 3 2 10
## 5 11 12

Related

How to repeat a data list with two vectors in R

I have a list X with two vectors:
X[1]=(1,2,3,5,6,9,7,8)
X[2]=(2,3,4,5,6)
I want to get a new list Y in which each vector is repeated:
Y[1]=(1,2,3,5,6,9,7,8,1,2,3,5,6,9,7,8) - X[1] repeated
Y[2]=(2,3,4,5,6,2,3,4,5,6) - X[2] repeated
I used Y<-rep(X,2) but got:
Y[1]:(1,2,3,5,6,9,7,8)
Y[2]:(2,3,4,5,6)
Y[3]:(1,2,3,5,6,9,7,8)
Y[4]:(2,3,4,5,6)
How do I do this correctly? Many thanks.
Use sapply/lapply:
sapply(X, rep, 2)
#[[1]]
# [1] 1 2 3 5 6 9 7 8 1 2 3 5 6 9 7 8
#[[2]]
# [1] 2 3 4 5 6 2 3 4 5 6
data
X <- list(c(1, 2, 3, 5, 6, 9, 7, 8), c(2, 3, 4, 5, 6))
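One difference between the two functions is worth keeping in mind: sapply simplifies its result when all pieces have the same length, whereas lapply always returns a list. A small sketch of that behaviour; the list X_equal is a made-up example, not data from the question:

```r
# Made-up input where both vectors have the same length
X_equal <- list(c(1, 2, 3), c(4, 5, 6))

s <- sapply(X_equal, rep, 2)   # simplified to a 6 x 2 matrix
l <- lapply(X_equal, rep, 2)   # stays a list of two vectors

is.matrix(s)
# [1] TRUE
is.list(l)
# [1] TRUE
```

So if a list is required regardless of the input, lapply is probably the safer choice.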
You are having problems accessing the list elements - use [[1]] etc.
X <- list( c(1,2,3,5,6,9,7,8),
c(2,3,4,5,6))
Y = list(rep(X[[1]], 2),
rep(X[[2]], 2))
# R > Y
# [[1]]
# [1] 1 2 3 5 6 9 7 8 1 2 3 5 6 9 7 8
#
# [[2]]
# [1] 2 3 4 5 6 2 3 4 5 6
Using map from purrr
library(purrr)
map(X, rep, 2)
data
X <- list(c(1, 2, 3, 5, 6, 9, 7, 8), c(2, 3, 4, 5, 6))

How is it possible to make a matrix of arrays in R?

What I want to have is a matrix in which each element is itself an array.
This array is taken by subsetting a data frame, but the example can be generalized to any array.
I tried with:
My_matrix <- matrix(array(), nrow = NROW, ncol = NCOL)
for (i in 1:NROW){
for(j in 1:NCOL){
My_matrix[i,j] <- df[df$var1 == j & df$var2== i,]$var3
}
}
but I got this error message:
Error in My_matrix[i,j] <- df[df$var1== j & df$var2== i,]$var3 :
number of items to replace is not a multiple of replacement length
How should I define and access each element of the matrix and each element of the contained array?
I think I understand that: (1) the base array is 45x3; (2) each cell has a differently sized matrix; and (3) the sizes are not known a priori. Gotcha. That is not possible with a plain array: an array (matrix) is always perfectly dimensioned, and while you can dynamically change one or more of the dimensions, the change applies to all cells.
Alternative: list-columns.
dat <- data.frame(x=1:3, y=11:13)
dat$z <- lapply(3:5, function(i) matrix(seq_len(i^2), nrow=i))
dat
# x y
# 1 1 11
# 2 2 12
# 3 3 13
# z
# 1 1, 2, 3, 4, 5, 6, 7, 8, 9
# 2 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
# 3 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25
That doesn't look very appealing, but if you want a different presentation, you might consider assigning it as a tibble::tbl_df (available whenever dplyr is loaded as well). (Note that presentation is distinct from storage and accessibility.)
library(tibble)
as_tibble(dat)
# # A tibble: 3 x 3
# x y z
# <int> <int> <list>
# 1 1 11 <int[,3] [3 x 3]>
# 2 2 12 <int[,4] [4 x 4]>
# 3 3 13 <int[,5] [5 x 5]>
Subsetting is consistent:
dat$z[ dat$x == 2 & dat$y == 12 ]
# [[1]]
# [,1] [,2] [,3] [,4]
# [1,] 1 5 9 13
# [2,] 2 6 10 14
# [3,] 3 7 11 15
# [4,] 4 8 12 16
### note that you need an extra [[1]] to get to the real data
m <- dat$z[ dat$x == 2 & dat$y == 12 ][[1]]
m
# [,1] [,2] [,3] [,4]
# [1,] 1 5 9 13
# [2,] 2 6 10 14
# [3,] 3 7 11 15
# [4,] 4 8 12 16
m[3,4]
# [1] 15
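Another base R option, offered here only as a sketch, is a list that carries dimensions (a "list-matrix"). Each cell can then hold a vector of any length, and cells are read and written with [[i, j]]. The example data frame and the names n_row and n_col are assumptions for illustration; only the column names var1, var2 and var3 come from the question:

```r
# Made-up data using the question's column names
df <- data.frame(var1 = c(1, 1, 2, 2, 2),
                 var2 = c(1, 2, 1, 1, 2),
                 var3 = c(10, 20, 30, 40, 50))
n_row <- max(df$var2)
n_col <- max(df$var1)

# A list with dimensions: every cell starts out as NULL
My_matrix <- matrix(vector("list", n_row * n_col), nrow = n_row, ncol = n_col)
for (i in seq_len(n_row)) {
  for (j in seq_len(n_col)) {
    # [[i, j]] stores a whole vector in a single cell
    My_matrix[[i, j]] <- df$var3[df$var1 == j & df$var2 == i]
  }
}

My_matrix[[1, 2]]      # the vector stored in row 1, column 2
# [1] 30 40
My_matrix[[1, 2]][2]   # one element of that vector
# [1] 40
```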

Exchange two elements of a vector in one call

I have a vector c(9,6,3,4,2,1,5,7,8), and I want to switch the elements at index 2 and at index 5 in the vector. However, I don't want to have to create a temporary variable and would like to make the switch in one call. How would I do that?
How about just x[c(i,j)] <- x[c(j,i)]? Similar to replace(...), but perhaps a bit simpler.
swtch <- function(x,i,j) {x[c(i,j)] <- x[c(j,i)]; x}
swtch(c(9,6,3,4,2,1,5,7,8) , 2,5)
# [1] 9 2 3 4 6 1 5 7 8
You could use replace().
x <- c(9, 6, 3, 4, 2, 1, 5, 7, 8)
replace(x, c(2, 5), x[c(5, 2)])
# [1] 9 2 3 4 6 1 5 7 8
And if you don't even want to assign x, you can use
replace(
c(9, 6, 3, 4, 2, 1, 5, 7, 8),
c(2, 5),
c(9, 6, 3, 4, 2, 1, 5, 7, 8)[c(5, 2)]
)
# [1] 9 2 3 4 6 1 5 7 8
but that's a bit silly. You will probably want x assigned to begin with.
If you actually want to do it without creating a temporary copy of the vector, you would need to write a short C function.
library(inline)
swap <- cfunction(c(i = "integer", j = "integer", vec="integer"),"
int *v = INTEGER(vec);
int ii = INTEGER(i)[0]-1, jj = INTEGER(j)[0]-1;
int tmp = v[ii];
v[ii] = v[jj];
v[jj] = tmp;
return R_NilValue;
")
vec <- as.integer(c(9,6,3,4,2,1,5,7,8))
swap(2L, 5L, vec)
vec
# [1] 9 2 3 4 6 1 5 7 8

Remove duplicate and small vectors from list

I have a list of vectors, say:
li <- list( c(1, 2, 3),
c(1, 2, 3, 4),
c(2, 3, 4),
c(5, 6, 7, 8, 9, 10, 11, 12),
numeric(0),
c(5, 6, 7, 8, 9, 10, 11, 12, 13)
)
And I would like to remove all the vectors that are already contained in others (bigger or equal), as well as all the empty vectors.
In this case, I would be left with only the list:
1 2 3 4
5 6 7 8 9 10 11 12 13
Is there any useful function for achieving this?
Thanks in advance
First you should sort the list by vector length, such that in the excision loop it is guaranteed that each lower-index vector is shorter than each higher-index vector, so a one-way setdiff() is all you need.
l <- list(1:3, 1:4, 2:4, 5:12, double(), 5:13 );
ls <- l[order(sapply(l,length))];
i <- 1; while (i <= length(ls)-1) if (length(ls[[i]]) == 0 || any(sapply((i+1):length(ls),function(i2) length(setdiff(ls[[i]],ls[[i2]]))) == 0)) ls[[i]] <- NULL else i <- i+1;
ls;
## [[1]]
## [1] 1 2 3 4
##
## [[2]]
## [1] 5 6 7 8 9 10 11 12 13
Here's a slight alternative, replacing the any(sapply(...)) with a second while-loop. The advantage is that the while-loop can break prematurely if it finds any superset in the remainder of the list.
l <- list(1:3, 1:4, 2:4, 5:12, double(), 5:13 );
ls <- l[order(sapply(l,length))];
i <- 1; while (i <= length(ls)-1) if (length(ls[[i]]) == 0 || { j <- i+1; res <- FALSE; while (j <= length(ls)) if (length(setdiff(ls[[i]],ls[[j]])) == 0) { res <- TRUE; break; } else j <- j+1; res; }) ls[[i]] <- NULL else i <- i+1;
ls;
## [[1]]
## [1] 1 2 3 4
##
## [[2]]
## [1] 5 6 7 8 9 10 11 12 13
x is contained in y if
length(setdiff(x, y)) == 0
You can apply it to each pair of vectors using functions like expand.grid or combn.
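A minimal sketch of that pairwise idea, written with plain loops rather than expand.grid or combn. The helper names is_subset and drop_contained are hypothetical, and the rule of keeping the first of several vectors that hold the same set of values is an assumption about what is wanted:

```r
li <- list(c(1, 2, 3),
           c(1, 2, 3, 4),
           c(2, 3, 4),
           c(5, 6, 7, 8, 9, 10, 11, 12),
           numeric(0),
           c(5, 6, 7, 8, 9, 10, 11, 12, 13))

# x is contained in y if nothing in x is missing from y
is_subset <- function(x, y) length(setdiff(x, y)) == 0

drop_contained <- function(l) {
  l <- l[lengths(l) > 0]                    # remove empty vectors first
  contained <- function(i) {
    for (j in seq_along(l)) {
      if (j == i || !is_subset(l[[i]], l[[j]])) next
      # l[[i]] lies inside l[[j]]; when both hold the same values,
      # only the later one counts as contained, so the first is kept
      if (!is_subset(l[[j]], l[[i]]) || j < i) return(TRUE)
    }
    FALSE
  }
  l[!vapply(seq_along(l), contained, logical(1))]
}

result <- drop_contained(li)
result
# [[1]]
# [1] 1 2 3 4
#
# [[2]]
# [1]  5  6  7  8  9 10 11 12 13
```

This compares every pair, so the work grows quadratically with the number of vectors; for long lists the sorted approach above may be preferable.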

Map numbers to smallest in a vector of numbers in R

Given a vector of numbers, I'd like to map each to the smallest in a separate vector that the number does not exceed. For example:
# Given these
v1 <- 1:10
v2 <- c(2, 5, 11)
# I'd like to return
result <- c(2, 2, 5, 5, 5, 11, 11, 11, 11, 11)
Try
cut(v1, c(0, v2), labels = v2)
[1] 2 2 5 5 5 11 11 11 11 11
Levels: 2 5 11
which can be converted to a numeric vector using as.numeric(as.character(...)).
Another way (thanks for the edit, @Ananda):
v2[findInterval(v1, v2 + 1) + 1]
# [1] 2 2 5 5 5 11 11 11 11 11
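The v2 + 1 shift relies on the values being whole numbers. For non-integer input, one possible variant, sketched here under the assumption that v2 is sorted in increasing order, uses the left.open argument of findInterval so that each interval is open on the left and closed on the right. The input values below are made up for illustration:

```r
v1 <- c(1, 2, 2.5, 5, 7.3, 11, 12)   # made-up values, including non-integers
v2 <- c(2, 5, 11)                     # must be sorted in increasing order

# Intervals are (-Inf, 2], (2, 5], (5, 11]; anything above 11 has no match
result <- v2[findInterval(v1, v2, left.open = TRUE) + 1]
result
# [1]  2  2  5  5 11 11 NA
```

Values larger than every element of v2 come back as NA, which may or may not be the behaviour you want.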
