Slightly embarrassed to ask such a simple question, but I've wasted an hour now and figure it's a 30-second solution. The problem is how to edit an existing object that is passed as an input to a function. I've also played with the super-assignment operator <<- without success.
The example function uses 2 inputs (one for an object and one for its name). I just need a form of this that removes the need for the 'n' input.
m <- c(2,5,3,7,1,3,9,3,5)
dim(m) <- c(3,3)
m
f <- function(x, n) { # where 'n' is the object name of 'x'
x[1,] <- c(1,2,3)
assign(n, x, envir = .GlobalEnv)
}
f(m, 'm')
m
Thanks in advance.
OK, solved. Thanks @Andrie, sorry I misunderstood your reply.
Rookie error :(
f <- function(x) {
x[1,] <- c(1,2,3)
return(x)
}
m <- f(m)
m
You don't need to provide the name as an extra argument; substitute will get that for you. To do things in the scope of the calling function you use eval with parent.frame.
f <- function(x) {
eval(substitute( x[1,] <- c(1,2,3) ), parent.frame())
}
Then,
> m <- c(2,5,3,7,1,3,9,3,5)
> dim(m) <- c(3,3)
> m
     [,1] [,2] [,3]
[1,]    2    7    9
[2,]    5    1    3
[3,]    3    3    5
> f(m)
> m
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    5    1    3
[3,]    3    3    5
That said, modifying the caller's environment is generally a bad idea. It will usually lead to less confusing and less fragile code if you just return the value and re-assign it to m instead:
f <- function (x) {
x[1,] <- c(1,2,3)
x
}
m <- f(m)
However, I have occasionally found eval shenanigans to come in handy when I really needed to change an array in place and avoid an array copy.
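If you do want the function to pick up the variable name by itself, a minimal sketch could combine deparse(substitute(x)) with assign() into parent.frame(). The helper name set_first_row is made up for illustration, and the sketch assumes the caller passes a plain variable name rather than an expression such as m[1:2, ]:

```r
# Hypothetical helper: overwrite the first row of the caller's matrix.
set_first_row <- function(x, values) {
  name <- deparse(substitute(x))            # variable name as typed by the caller
  x[1, ] <- values                          # modifies the local copy only
  assign(name, x, envir = parent.frame())   # write the copy back under that name
  invisible(NULL)
}

m <- matrix(c(2, 5, 3, 7, 1, 3, 9, 3, 5), nrow = 3)
set_first_row(m, c(1, 2, 3))
m[1, ]
```

This still makes a copy of x inside the function, so it is a convenience rather than a way to save memory.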
I am probably missing the point of why you want to do this, but this will do exactly what you want, I think:
m[1,] = c(1,2,3)
The value of m has been changed in the global environment.
I'm just guessing here, but often folks who are writing functions to take the "names" of objects find that R lists can be useful. If you find yourself wanting to manipulate variables based on names, consider using R lists instead. Remember that every member of a list can have a different data type if necessary.
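As a rough sketch of the list idea (the names objs and replace_first_row are invented for this example), the objects can live in one named list and be looked up by name with [[ ]], so no assign() or global lookup is needed:

```r
# Keep related objects in one named list instead of separate global variables.
objs <- list(
  m = matrix(c(2, 5, 3, 7, 1, 3, 9, 3, 5), nrow = 3),
  v = c(10, 20, 30)
)

# Hypothetical helper: change the first row of the element called `name`
# and return the updated list.
replace_first_row <- function(objs, name, values) {
  objs[[name]][1, ] <- values
  objs
}

objs <- replace_first_row(objs, "m", c(1, 2, 3))
objs$m[1, ]
```

The function still returns a value that has to be re-assigned, but the name is now just an ordinary string index.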
m <- matrix(1:4, ncol=2)
l <- list(a=1:3, b='c')
d <- data.frame(a=1:3, b=3:1)
I was wondering if it is possible to make a function that takes a base R object (matrix, vector, list or data.frame, ...) as well as a text that specifies the subset of the object.
f1 <- function(object, subset) {
# object'subset'
}
For instance
f1(m, '[1,1]') #to evaluate m[1,1]
f1(l, '[[1]][2:3]') #l[[1]][2:3]
f1(d, '$a') #d$a
would give us (respectively):
[1] 1
[1] 2 3
[1] 1 2 3
I guess the function needs to somehow glue the two arguments together before evaluating. One could write a kind of interpreter for each bit of the subset text and then (for the matrix example) do something like:
`[`(m, 1, 1)
This would be possible, but I thought there would be an easier, more direct way (my 'glue' above).
Well, one way to go is to use the eval(parse()) approach, passing the object by name as a string, i.e.
f1 <- function(x, text){
eval(parse(text = paste0(x, text)))
}
f1('d', '$a')
#[1] 1 2 3
f1('m', '[1,1]')
#[1] 1
f1('l', '[[1]][2:3]')
#[1] 2 3
f1<-function(object, subset){
return(eval(parse(text=paste0(substitute(object),subset))))
}
> m=matrix(4,2,2)
> l=list(c(1,2,3),c(2,3,4))
> f1(m,'[1,1]')
[1] 4
> f1(l,'[[1]][1:2]')
[1] 1 2
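For comparison, here is a sketch that takes the object itself rather than its name. The function name subset_by_text is made up for this example. It binds the object to the fixed name object in a small list and evaluates the glued text against that list, with baseenv() as the enclosure so that operators such as [ and $ are still found. One consequence, under these assumptions, is that the subset text cannot refer to variables from the workspace:

```r
# Hypothetical helper: evaluate object<subset text> without relying on
# the object's name in the calling environment.
subset_by_text <- function(object, subset) {
  expr <- parse(text = paste0("object", subset))[[1]]  # e.g. object[1,1]
  eval(expr, list(object = object), baseenv())
}

m <- matrix(1:4, ncol = 2)
l <- list(a = 1:3, b = "c")
d <- data.frame(a = 1:3, b = 3:1)

subset_by_text(m, "[1,1]")
subset_by_text(l, "[[1]][2:3]")
subset_by_text(d, "$a")
```

Like any eval(parse()) approach, this runs whatever code is in the string, so it is only suitable for trusted input.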
I have just started learning R and I wrote this code to learn about functions and loops.
squared<-function(x){
m<-c()
for(i in 1:x){
y<-i*i
c(m,y)
}
return (m)
}
squared(5)
NULL
Why does this return NULL? I want the i*i values to be appended to the end of m and a vector to be returned. Can someone please point out what's wrong with this code?
You haven't put anything into m in your loop, because the result of c(m, y) is never assigned back to m and is simply discarded. So m keeps its initial value:
m <- c()
m
# NULL
You can change the function to return the desired values by assigning m in the loop.
squared <- function(x) {
m <- c()
for(i in 1:x) {
y <- i * i
m <- c(m, y)
}
return(m)
}
squared(5)
# [1] 1 4 9 16 25
But this is inefficient: growing m with c() builds a new vector on every iteration, even though we know in advance that the result will have length x (5 here). So we want to allocate the memory first, before looping. This is the better way to use the for() loop.
squared <- function(x) {
m <- vector("integer", x)
for(i in seq_len(x)) {
m[i] <- i * i
}
m
}
squared(5)
# [1] 1 4 9 16 25
Also notice that I have removed return() from the second function. It is not necessary there, because a function returns the value of its last evaluated expression. Whether to keep it in this situation is a matter of personal preference. Sometimes it will be necessary, for example to exit early from inside an if() statement.
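To illustrate the case where return() is needed, here is a small sketch. The function name squared_or_na and the choice to return NA for bad input are assumptions made for the example, not part of the question:

```r
# Hypothetical variant: bail out early on invalid input, otherwise
# fall through to the last expression, which is returned implicitly.
squared_or_na <- function(x) {
  if (!is.numeric(x) || length(x) != 1 || x < 1) {
    return(NA_integer_)   # explicit return() is required to exit here
  }
  i <- seq_len(x)
  i * i                   # last expression, no return() needed
}

squared_or_na(5)
squared_or_na(-1)
```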
I know the question is about looping, but I also must mention that this can be done more efficiently with seven characters using the primitive ^, like this
(1:5)^2
# [1] 1 4 9 16 25
^ is a primitive function, which means the code is written entirely in C, and it will be the most efficient of these three methods.
`^`
# function (e1, e2) .Primitive("^")
Here's a general approach:
# Create empty vector
vec <- c()
for(i in 1:10){
# Inside the loop, make one or more elements to add to vector
new_elements <- i * 3
# Use 'c' to combine the existing vector with the new_elements
vec <- c(vec, new_elements)
}
vec
# [1] 3 6 9 12 15 18 21 24 27 30
If you happen to run out of memory (e.g. if your loop has a lot of iterations or vectors are large), you can try vector preallocation which will be more efficient. That's not usually necessary unless your vectors are particularly large though.
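A possible sketch of the preallocation idea for the case where each iteration produces more than one element: collect the pieces in a preallocated list and flatten once at the end. The names n and pieces are chosen just for this example:

```r
n <- 3
pieces <- vector("list", n)        # one slot per iteration, allocated up front

for (i in seq_len(n)) {
  # Each iteration contributes two elements here
  pieces[[i]] <- c(i, i * 3)
}

vec <- unlist(pieces)              # flatten once, after the loop
vec
```

This avoids rebuilding vec on every pass through the loop.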
I wish to combine equivalent, deeply-nested columns from all elements of a reasonably long list. What I would like to do, though it's not possible in R, is this:
combined.columns <- my.list[[1:length(my.list)]]$my.matrix[,"my.column"]
The only thing I can think of is to manually type out all the elements in cbind() like this:
combined.columns <- cbind(my.list[[1]]$my.matrix[,"my.column"], my.list[[2]]$my.matrix[,"my.column"], . . . )
This answer is pretty close to what I need, but I can't figure out how to make it work for the extra level of nesting.
There must be a more elegant way of doing this, though. Any ideas?
Assuming all your matrices have the same column name you wish to extract, you could use sapply:
set.seed(123)
my.list <- vector("list")
my.list[[1]] <- list(my.matrix = data.frame(A=rnorm(10,sd=0.3), B=rnorm(10,sd=0.3)))
my.list[[2]] <- list(my.matrix = data.frame(C=rnorm(10,sd=0.3), B=rnorm(10,sd=0.3)))
my.list[[3]] <- list(my.matrix = data.frame(D=rnorm(10,sd=0.3), B=rnorm(10,sd=0.3)))
sapply(my.list, FUN = function(x) x$my.matrix[,"B"])
Free data:
myList <- list(list(myMat = matrix(1:10, 2, dimnames=list(NULL, letters[1:5])),
myVec = 1:10),
list(myMat = matrix(10:1, 2, dimnames=list(NULL, letters[1:5])),
myVec = 10:1))
We can get column a of myMat a few different ways. Here's one that uses with.
sapply(myList, with, myMat[,"a"])
# [,1] [,2]
# [1,] 1 10
# [2,] 2 9
This mapply one might be better for a more recursive type problem. It works too and might be faster than sapply.
mapply(function(x, y, z) x[[y]][,z] , myList, "myMat", "a")
# [,1] [,2]
# [1,] 1 10
# [2,] 2 9
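Using the names from the question, the same extraction can be sketched with lapply followed by do.call(cbind, ...), which is close to the manual cbind() call but works for any list length. The small my.list built here is made-up data, only there to keep the example self-contained:

```r
# Made-up data with the structure described in the question
my.list <- list(
  list(my.matrix = matrix(1:4, 2, dimnames = list(NULL, c("my.column", "other")))),
  list(my.matrix = matrix(5:8, 2, dimnames = list(NULL, c("my.column", "other"))))
)

# Pull the nested column out of every element, then bind the pieces side by side
cols <- lapply(my.list, function(el) el$my.matrix[, "my.column"])
combined.columns <- do.call(cbind, cols)
combined.columns
```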
I need to assign a matrix to a custom designed variable in R. So here is the matrix:
A = matrix(c(2,4,3,1,7,5),nrow=2,ncol=3,byrow=TRUE)
and here is the custom designed variable name:
G <- "Pakka"
Here I create the expression now:
G <- paste(G, "<- A")
and now I need to evaluate the expression so that the matrix A is assigned to the variable named Pakka.
eval(parse(G))
However, there is an error given by R saying
Not able to open file name `Pakka <- A`. No file of that name found.
Searching on environment is not giving me any clues. Please help!
The eval(parse(G)) in the question above has to be replaced by eval(parse(text=G)). The first argument of parse() is file, which is why R tried to open a file with that name.
This will solve the problem.
You should really use eval(call()) for this, or delayedAssign.
Using eval(call()) :
"<-" is a special type of function, so we can hold it as an unevaluated call. Then when we're ready to evaluate it, we just wrap it with eval. This was how this type of assignment was designed.
> A <- matrix(c(2,4,3,1,7,5),nrow=2,ncol=3,byrow=TRUE)
> G <- "Pakka"
> e <- call("<-", as.name(G), substitute(A))
A look at e shows that it's exactly what we want to do.
> e
# Pakka <- A
Now we evaluate it, and A is assigned to Pakka.
> eval(e)
> Pakka
# [,1] [,2] [,3]
#[1,] 2 4 3
#[2,] 1 7 5
Using delayedAssign we can create a promise (unevaluated object):
> A <- matrix(c(2,4,3,1,7,5),nrow=2,ncol=3,byrow=TRUE)
> delayedAssign("Pakka", A)
> ls()
[1] "A" "Pakka" ## Pakka is there, but not in memory yet
> Pakka
# [,1] [,2] [,3]
#[1,] 2 4 3
#[2,] 1 7 5
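When the variable name is already available as a string, base R's assign() and get() do the same job without building any expression. A minimal sketch with the objects from the question:

```r
A <- matrix(c(2, 4, 3, 1, 7, 5), nrow = 2, ncol = 3, byrow = TRUE)
G <- "Pakka"

assign(G, A)      # creates a variable called Pakka in the current environment
get(G)            # look the value up again by name
```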
I have a function with two arguments. The first argument takes a vector, and the second argument takes a scalar. I want to apply this function to each row of a matrix, but the function takes a different second argument each time. I tried the following, but it didn't work. I expected to calculate the p.value for each row and then divide the p.value by the row number. I expected the result to be a vector, but I got a matrix instead. This is a pseudo example, but it illustrates my purpose.
> foo = matrix(rnorm(100),ncol=20)
> f = function (x,y) t.test(x[1:10],x[11:20])$p.value/y
> goo = 1:5
> apply(foo,1,f,y=goo)
[,1] [,2] [,3] [,4] [,5]
[1,] 0.9406881 0.6134117 0.5484542 0.11299535 0.20420786
[2,] 0.4703440 0.3067059 0.2742271 0.05649767 0.10210393
[3,] 0.3135627 0.2044706 0.1828181 0.03766512 0.06806929
[4,] 0.2351720 0.1533529 0.1371135 0.02824884 0.05105196
[5,] 0.1881376 0.1226823 0.1096908 0.02259907 0.04084157
The following for loop strategy produces the expected result, except that it would be very slow for the real data.
> res = numeric(5)
> for (i in 1:5){
res[i]=f(foo[i,],i)
}
> res
[1] 0.94068810 0.30670585 0.18281807 0.02824884 0.04084157
Any suggestions would be appreciated!
If your real purpose is like your example, you can vectorize the division:
f <- function(x) t.test(x[1:10], x[11:20])$p.value
apply(foo, 1, f) / goo
Based on the comment, the above is not appropriate.
In the case of the example, you might observe that the diagonal of the returned matrix is the desired result:
f = function (x,y) t.test(x[1:10],x[11:20])$p.value/y
goo = 1:5
diag(apply(foo,1,f,y=goo))
Besides being inefficient in time and space, this suffers from another problem: it is correct for the example only because the operation on y is vectorized. And in that case, the former solution is better. So I suspect that in your actual problem, your operation is not vectorized.
Sometimes a for loop really is the best answer. The apply family of functions are not magical; they are still loops.
Here is an sapply solution. It won't beat the for loop on time (it probably won't lose either), but it doesn't have a high space overhead. The idea is to apply over the row index and use it to extract the row of foo and the element of goo to pass to f:
sapply(seq(nrow(foo)), function(i) f(foo[i,], goo[i]))
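A variant of the same idea, sketched with vapply, which additionally checks that every call returns a single number. The data is regenerated here with a seed only so that the example is self-contained:

```r
set.seed(101)
foo <- matrix(rnorm(100), ncol = 20)
goo <- 1:5
f <- function(x, y) t.test(x[1:10], x[11:20])$p.value / y

# Loop over row indices; numeric(1) declares the expected shape of each result
res <- vapply(seq_len(nrow(foo)),
              function(i) f(foo[i, ], goo[i]),
              numeric(1))
res
```

Using seq_len(nrow(foo)) instead of seq(nrow(foo)) also behaves sensibly if the matrix happens to have zero rows.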
f <- function (x,y) t.test(x[1:10],x[11:20])$p.value/y
f2 <- function(a, b){
tt <- t.test(x = a[1:10], y = a[11:20])$p.value
tt/b
}
f3 <- function() {
res <- numeric(5)
for (i in 1:5){
res[i] <- f(foo[i,],i)
}
res
}
f4 <- function(x) t.test(x[1:10], x[11:20])$p.value
set.seed(101)
foo <- matrix(rnorm(100),ncol=20)
goo <- 1:5
library(rbenchmark)
benchmark(
apply(foo, 1, f4) / goo,
mapply(f,split(foo,row(foo)),goo),
f2(foo,goo),
f3(),
sapply(seq(nrow(foo)), function(i) f(foo[i,], goo[i])),
replications=1000,
columns=c("test","replications","elapsed","relative"))
##                    test replications elapsed relative
## 1 apply(foo, 1, f4)/goo         1000   1.581    5.528
## 3          f2(foo, goo)         1000   0.286    1.000
## 4                  f3()         1000   1.458    5.098
## 2           mapply(...)         1000   1.599    5.591
## 5           sapply(...)         1000   1.486    5.196
The direct division is best (but not actually applicable); for this example there's not much difference between the other solutions, but the for loop is slightly faster than sapply, which is slightly faster than mapply. You should try this on a more realistic example to see how it's going to scale for your problem.