Multiple operators in a string - r

I have some operators in a list:
[[1]]
[1] "*"
[[2]]
[1] "-"
[[3]]
[1] "+"
[[4]]
[1] "/"
[[5]]
[1] "^"
I want to apply each of these operations to two datasets of the same dimensions, for example dataset1*dataset2, dataset1-dataset2, etc. Is this possible using the strings in the list?

Yes, here is one example:
ops <- list("+", "-")
x <- y <- 1:10
lapply(ops, function(op) eval(parse(text = paste0("x", op, "y"))))
# [[1]]
# [1] 2 4 6 8 10 12 14 16 18 20
#
# [[2]]
# [1] 0 0 0 0 0 0 0 0 0 0
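An alternative that avoids building and parsing code strings is to look each operator up as a function with match.fun() and call it directly. The sketch below assumes two small matrices named dataset1 and dataset2 (made up here for illustration) standing in for the real data:

```r
# Hypothetical stand-ins for the two datasets of equal dimensions
dataset1 <- matrix(1:4, nrow = 2)
dataset2 <- matrix(5:8, nrow = 2)

ops <- list("*", "-", "+", "/", "^")

# match.fun("+") returns the function `+`, which is then called on both datasets
results <- lapply(ops, function(op) match.fun(op)(dataset1, dataset2))
names(results) <- unlist(ops)

results[["-"]]   # element-wise dataset1 - dataset2
```

Because arithmetic operators in R are ordinary functions, this gives the same element-wise results as the eval(parse(...)) version, without evaluating arbitrary text.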

Related

Replacing values in a list based on a condition

I have a list of values called squares and would like to replace all values that are 0 with 40.
I tried:
replace(squares, squares==0, 40)
but the list remains unchanged.
If it is a list, then loop through the list with lapply and use replace on each element:
squares <- lapply(squares, function(x) replace(x, x==0, 40))
squares
#[[1]]
#[1] 40 1 2 3 4 5
#[[2]]
#[1] 1 2 3 4 5 6
#[[3]]
#[1] 40 1 2 3
data
squares <- list(0:5, 1:6, 0:3)
I think for this purpose, you can just treat it as if it were a vector as follows:
squares <- list(2, 4, 6, 0, 8, 0, 10, 20)
squares[squares == 0] <- 40
Output:
[[1]]
[1] 2
[[2]]
[1] 4
[[3]]
[1] 6
[[4]]
[1] 40
[[5]]
[1] 8
[[6]]
[1] 40
[[7]]
[1] 10
[[8]]
[1] 20
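One likely reason the original attempt appeared to do nothing is that replace() returns a modified copy and never changes its argument in place. A minimal sketch, assuming the same flat list of single values as in the answer above:

```r
squares <- list(2, 4, 6, 0, 8, 0, 10, 20)

# replace() returns a new list; squares itself is untouched by this call
modified <- replace(squares, squares == 0, 40)
squares[[4]]    # still 0
modified[[4]]   # 40

# Assign the result back to keep the change
squares <- replace(squares, squares == 0, 40)
```

Note that squares == 0 only works here because every element has length 1; for a list of longer vectors, the lapply approach is the one to use.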

Getting all splits of numeric sequence in R

I'm trying to get all the possible splits of a sequence [1:n] in R. E.g.:
getSplits(0,3)
Should return all possible splits of the sequence 123, in other words (in a list of vectors):
[1] 1
[2] 1 2
[3] 1 2 3
[4] 1 3
[5] 2
[6] 2 3
[7] 3
Now I've created a function which does get to these vectors recursively, but I am having trouble combining them into one flat list as above. My function is:
getSplits <- function(currentDigit, lastDigit, split) {
splits=list();
for (nextDigit in currentDigit: lastDigit)
{
currentSplit <- c(split, c(nextDigit));
print(currentSplit);
if(nextDigit < lastDigit) {
possibleSplits = c(list(currentSplit), getSplits(nextDigit+1, lastDigit, currentSplit));
}else{
possibleSplits = currentSplit;
}
splits <- c(splits, list(possibleSplits));
}
return(splits);
}
Printing each currentSplit shows all the right vectors, but the final returned list (splits) nests them into deeper levels of lists, returning:
[1] 1
[[1]][[2]]
[[1]][[2]][[1]]
[1] 1 2
[[1]][[2]][[2]]
[1] 1 2 3
[[1]][[3]]
[1] 1 3
[[2]]
[[2]][[1]]
[1] 2
[[2]][[2]]
[1] 2 3
[[3]]
[1] 3
For the corresponding function call getSplits(1, 3, c()).
If anyone could help me out on getting this to work the way I described above, it'd be much appreciated!
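The nesting appears to come from wrapping each recursive result in another list() before appending it. One possible fix, sketched below, appends the current vector as a single list element and then concatenates the recursive result with c(), which keeps everything at one level. The default split = c() is an addition for convenience:

```r
getSplits <- function(currentDigit, lastDigit, split = c()) {
  splits <- list()
  for (nextDigit in currentDigit:lastDigit) {
    currentSplit <- c(split, nextDigit)
    # add the current vector as one element of the flat list
    splits <- c(splits, list(currentSplit))
    if (nextDigit < lastDigit) {
      # c() on two lists concatenates them instead of nesting
      splits <- c(splits, getSplits(nextDigit + 1, lastDigit, currentSplit))
    }
  }
  splits
}

res <- getSplits(1, 3)
length(res)   # 7
```

With this version, the vectors come back in the order listed in the question: 1, then 1 2, 1 2 3, 1 3, 2, 2 3, 3.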
character vector output
Try combn:
k <- 3
s <- unlist(lapply(1:k, combn, x = k, toString))
s
## [1] "1" "2" "3" "1, 2" "1, 3" "2, 3" "1, 2, 3"
data frame output
If you would prefer that the output be in the form of a data frame:
read.table(text = s, header = FALSE, sep = ",", fill = TRUE, col.names = 1:k)
giving:
X1 X2 X3
1 1 NA NA
2 2 NA NA
3 3 NA NA
4 1 2 NA
5 1 3 NA
6 2 3 NA
7 1 2 3
list output
or a list:
lapply(s, function(x) scan(textConnection(x), quiet = TRUE, sep = ","))
giving:
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
[[4]]
[1] 1 2
[[5]]
[1] 1 3
[[6]]
[1] 2 3
[[7]]
[1] 1 2 3
Update: Have incorporated improvement mentioned in comments as well as one further simplification and also added data frame and list output.
Here is another approach:
f <- function(nums) lapply(seq_along(nums), function(x) t(combn(nums, m = x)))
f(1:3)
This yields
[[1]]
[,1]
[1,] 1
[2,] 2
[3,] 3
[[2]]
[,1] [,2]
[1,] 1 2
[2,] 1 3
[3,] 2 3
[[3]]
[,1] [,2] [,3]
[1,] 1 2 3
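If a flat list of vectors is wanted rather than one matrix per subset size, combn() can return its combinations as a list via simplify = FALSE. A short sketch, where the function name allSubsets is made up for illustration:

```r
allSubsets <- function(nums) {
  # one list of combinations per subset size m = 1, ..., length(nums)
  bySize <- lapply(seq_along(nums), function(m) combn(nums, m, simplify = FALSE))
  # remove one level of nesting so every element is a plain vector
  unlist(bySize, recursive = FALSE)
}

subsets <- allSubsets(1:3)
length(subsets)   # 7
```

The result is ordered by subset size (all singletons first), which differs from the ordering in the question but contains the same seven vectors.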
The OP is looking for the Power set of c(1,2,3). There are several packages that will quickly get you this in one line. Using the package rje, we have:
library(rje)
powerSet(c(1,2,3))
[[1]]
numeric(0)
[[2]]
[1] 1
[[3]]
[1] 2
[[4]]
[1] 1 2
[[5]]
[1] 3
[[6]]
[1] 1 3
[[7]]
[1] 2 3
[[8]]
[1] 1 2 3
... and with iterpc:
library(iterpc)
getall(iterpc(c(2,1,1,1), 3, labels = 0:3))
[,1] [,2] [,3]
[1,] 0 0 1
[2,] 0 0 2
[3,] 0 0 3
[4,] 0 1 2
[5,] 0 1 3
[6,] 0 2 3
[7,] 1 2 3
More generally,
n <- 3
getall(iterpc(c(n-1,rep(1, n)), n, labels = 0:n)) ## same as above

Remove outliers based on a preceding value

How can I remove outliers using the criterion that a value cannot be more than 2-fold higher than its preceding one?
Here is my try:
x<-c(1,2,6,4,10,20,50,10,2,1)
remove_outliers <- function(x, na.rm = TRUE, ...) {
for(i in 1:length(x))
x < (x[i-1] + 2*x)
x
}
remove_outliers(y)
expected outcome: 1,2,4,10,20,2,1
Thanks!
I think the first 10 should be removed in your data because 10>2*4. Here's a way to do what you want without loops. I'm using the dplyr version of lag.
library(dplyr)
x<-c(1,2,6,4,10,20,50,10,2,1)
x[c(TRUE,na.omit(x<=dplyr::lag(x)*2))]
[1] 1 2 4 20 10 2 1
EDIT
To use this with a data.frame:
df <- data.frame(id=1:10, x=c(1,2,6,4,10,20,50,10,2,1))
df[c(TRUE,na.omit(df$x<=dplyr::lag(df$x,1)*2)),]
id x
1 1 1
2 2 2
4 4 4
6 6 20
8 8 10
9 9 2
10 10 1
A simple sapply:
bool<-sapply(seq_along(1:length(x)),function(i) {ifelse(x[i]<2*x[i-1],FALSE,TRUE)})
bool
[[1]]
logical(0)
[[2]]
[1] TRUE
[[3]]
[1] TRUE
[[4]]
[1] FALSE
[[5]]
[1] TRUE
[[6]]
[1] TRUE
[[7]]
[1] TRUE
[[8]]
[1] FALSE
[[9]]
[1] FALSE
[[10]]
[1] FALSE
resulting in:
x[unlist(bool)]
[1] 1 2 4 10 20 1
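One thing to watch in the sapply version: for i = 1, x[i-1] is x[0], a zero-length vector, so the first result is logical(0) and unlist(bool) ends up one element shorter than x. Indexing with that shorter vector shifts every flag by one position. A base R sketch that keeps the comparison aligned, under the assumption that each value is compared with its immediate predecessor in the original vector:

```r
x <- c(1, 2, 6, 4, 10, 20, 50, 10, 2, 1)

# x[-1] is every value but the first; x[-length(x)] is each one's predecessor
keep <- c(TRUE, x[-1] <= 2 * x[-length(x)])

x[keep]
# [1] 1 2 4 20 10 2 1
```

This reproduces the dplyr::lag result without loading a package. The first element is always kept because it has no predecessor.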

looping through a list of vectors in r

I am trying to loop through a list of vectors and assign values along the way.
I generate 10 vectors like this:
for(i in 1:10){
vecname <- paste('blub',i,sep='')
assign(vecname,vector(mode='numeric',length = my_len))
}
ls() = blub1, blub2 .... blub10
Now I have another vector bla <- 100:109.
What I basically want to do is:
blub1[1] <- bla[1]
blub2[1] <- bla[2]
blub3[1] <- bla[3]
...
blub10[1] <- bla[10]
I am pretty sure there is a more elegant solution to this problem.
Help would be very much appreciated.
Thanks and have a nice day!
Here is how I would do it, following the "R way" of "lists, not for loops":
my_len <- 3
blub <- replicate(10, vector(mode = "numeric", length = my_len), simplify = FALSE)
bla <- 100:109
blub <- Map(function(a, b) {
a[1] <- b
a
}, blub, bla)
# [[1]]
# [1] 100 0 0
#
# [[2]]
# [1] 101 0 0
#
# [[3]]
# [1] 102 0 0
#
# [[4]]
# [1] 103 0 0
#
# [[5]]
# [1] 104 0 0
#
# [[6]]
# [1] 105 0 0
#
# [[7]]
# [1] 106 0 0
#
# [[8]]
# [1] 107 0 0
#
# [[9]]
# [1] 108 0 0
#
# [[10]]
# [1] 109 0 0
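Since all ten vectors have the same length, another option is to store them as the columns of a matrix, which turns the whole assignment into a single line. This is only a sketch of that alternative; the column names are added so that the original blub1 ... blub10 labels remain usable:

```r
my_len <- 3
bla <- 100:109

# one column per vector, all initialised to 0
blub <- matrix(0, nrow = my_len, ncol = length(bla),
               dimnames = list(NULL, paste0("blub", seq_along(bla))))

# set the first element of every vector at once
blub[1, ] <- bla

blub[, "blub3"]
# [1] 102   0   0
```

A list remains the better fit if the vectors may later have different lengths.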

Why use c() to define vector?

c is not the abbreviation of vector in English, so why use c() to define a vector in R?
v1<- c(1,2,3,4,5)
This is a good question, and the answer is kind of odd. "c", believe it or not, stands for "combine", which is what it normally does:
> c(c(1, 2), c(3))
[1] 1 2 3
But it happens that in R, a number is just a vector of length 1:
> 1
[1] 1
So, when you use c() to create a vector, what you are actually doing is combining together a series of 1-length vectors.
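A few quick checks illustrate the point. This is a small sketch that can be pasted into an R session:

```r
v <- c(1, 2, 3)

length(1)       # a bare number already has length 1
is.vector(1)    # and it is a vector

# nesting c() calls changes nothing: the pieces are combined into one flat vector
same <- identical(c(c(1, 2), c(3)), v)
same

# combining different types coerces everything to a common type
mixed <- c(1, "a")
class(mixed)    # "character"
```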
Owen's answer is perfect, but one other thing to note is that c() can concatenate more than just vectors.
> x = list(a = rnorm(5), b = rnorm(7))
> y = list(j = rpois(3, 5), k = rpois(4, 2), l = rbinom(9, 1, .43))
> foo = c(x,y)
> foo
$a
[1] 0.280503895 -0.853393705 0.323137905 1.232253725 -0.007638861
$b
[1] -2.0880857 0.2553389 0.9434817 -1.2318130 -0.7011867 0.3931802 -1.6820880
$j
[1] 5 12 5
$k
[1] 3 1 2 1
$l
[1] 1 0 0 1 0 0 1 1 0
> class(foo)
[1] "list"
Second Example:
> x = 1:10
> y = 3*x+rnorm(length(x))
> z = lm(y ~ x)
> is.vector(z)
[1] FALSE
> foo = c(x, z)
> foo
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
[[4]]
[1] 4
[[5]]
[1] 5
[[6]]
[1] 6
[[7]]
[1] 7
[[8]]
[1] 8
[[9]]
[1] 9
[[10]]
[1] 10
$coefficients
(Intercept) x
0.814087 2.813492
$residuals
1 2 3 4 5 6 7
-0.2477695 -0.3375283 -0.1475338 0.5962695 0.5670256 -0.5226752 0.6265995
8 9 10
0.1017986 -0.4425523 -0.1936342
$effects
(Intercept) x
-51.50810097 25.55480795 -0.05371226 0.66592081 0.61250676 -0.50136423
0.62374031 0.07476915 -0.49375185 -0.26900403
$rank
[1] 2
$fitted.values
1 2 3 4 5 6 7 8
3.627579 6.441071 9.254562 12.068054 14.881546 17.695038 20.508529 23.322021
9 10
26.135513 28.949005
$assign
[1] 0 1
$qr
$qr
(Intercept) x
1 -3.1622777 -17.39252713
2 0.3162278 9.08295106
3 0.3162278 0.15621147
4 0.3162278 0.04611510
5 0.3162278 -0.06398128
6 0.3162278 -0.17407766
7 0.3162278 -0.28417403
8 0.3162278 -0.39427041
9 0.3162278 -0.50436679
10 0.3162278 -0.61446316
attr(,"assign")
[1] 0 1
$qraux
[1] 1.316228 1.266308
$pivot
[1] 1 2
$tol
[1] 1e-07
$rank
[1] 2
attr(,"class")
[1] "qr"
$df.residual
[1] 8
$xlevels
named list()
$call
lm(formula = y ~ x)
$terms
y ~ x
attr(,"variables")
list(y, x)
attr(,"factors")
x
y 0
x 1
attr(,"term.labels")
[1] "x"
attr(,"order")
[1] 1
attr(,"intercept")
[1] 1
attr(,"response")
[1] 1
attr(,".Environment")
<environment: R_GlobalEnv>
attr(,"predvars")
list(y, x)
attr(,"dataClasses")
y x
"numeric" "numeric"
$model
y x
1 3.379809 1
2 6.103542 2
3 9.107029 3
4 12.664324 4
5 15.448571 5
6 17.172362 6
7 21.135129 7
8 23.423820 8
9 25.692961 9
10 28.755370 10

Resources