R: Pass data.frame by reference to a function

I pass a data.frame as a parameter to a function that wants to alter the data inside:
x <- data.frame(value=c(1,2,3,4))
f <- function(d){
  for(i in 1:nrow(d)) {
    if(d$value[i] %% 2 == 0){
      d$value[i] <- 0
    }
  }
  print(d)
}
When I execute f(x) I can see how the data.frame inside gets modified:
> f(x)
value
1 1
2 0
3 3
4 0
However, the original data.frame I passed is unmodified:
> x
value
1 1
2 2
3 3
4 4
I have usually worked around this by returning the modified one:
f <- function(d){
  for(i in 1:nrow(d)) {
    if(d$value[i] %% 2 == 0){
      d$value[i] <- 0
    }
  }
  d
}
And then calling the function and reassigning the result:
> x <- f(x)
> x
value
1 1
2 0
3 3
4 0
However, I wonder what the effect of this behaviour is on a very large data.frame: is a new one created for the function call? What is the R-ish way of doing this?
Is there a way to modify the original one without creating another one in memory?

Actually, in R (almost) every modification is performed on a copy of the previous data (copy-on-modify behavior).
So, for example, inside your function, when you do d$value[i] <- 0, some copies are actually created. You usually won't notice this since it's well optimized, but you can trace it with the tracemem function.
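As a rough illustration of the copy-on-modify behaviour described above, the sketch below calls a function that changes one cell and shows that the caller's object is untouched. tracemem is only available when R was built with memory profiling, so the call is guarded with capabilities("profmem"). The names x, y and zero_second are made up for this example.

```r
x <- data.frame(value = c(1, 2, 3, 4))

# Hypothetical helper: changes one cell of its argument and returns the result.
zero_second <- function(d) {
  d$value[2] <- 0   # this modifies a local copy, not the caller's object
  d
}

# tracemem() needs an R build with memory profiling; skip it otherwise.
can_trace <- capabilities("profmem")
if (can_trace) tracemem(x)   # R prints a message whenever x is duplicated
y <- zero_second(x)
if (can_trace) untracemem(x)

x$value   # the original is unchanged
y$value   # only the returned copy has the new value
```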
That being said, if your data.frame is not really big, you can stick with your function returning the modified object, since it's just one more copy after all.
But if your dataset is really big and making a copy every time would be expensive, you can use data.table, which allows in-place modifications, e.g.:
library(data.table)
d <- data.table(value=c(1,2,3,4))
f <- function(d){
  for(i in 1:nrow(d)) {
    if(d$value[i] %% 2 == 0){
      set(d,i,1L,0) # special function of data.table (see also ?`:=` )
    }
  }
  print(d)
}
f(d)
print(d)
# results :
> f(d)
value
1: 1
2: 0
3: 3
4: 0
>
> print(d)
value
1: 1
2: 0
3: 3
4: 0
N.B.
In this specific case, the loop can be replaced with a "vectorized" and more efficient version e.g. :
d[d$value %% 2 == 0,'value'] <- 0
but maybe your real loop code is much more convoluted and cannot be vectorized easily.
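For a plain data.frame, the vectorized idea can be combined with the return-and-reassign pattern from the question. This is only a sketch: zero_even is a hypothetical name, and it assumes a numeric column called value.

```r
# Hypothetical helper: set every even entry of d$value to 0 and return d.
zero_even <- function(d) {
  even <- d$value %% 2 == 0   # logical vector marking the even rows
  d$value[even] <- 0          # one vectorized assignment, no loop
  d
}

x <- data.frame(value = c(1, 2, 3, 4))
x <- zero_even(x)   # reassign, since the function works on a copy
x$value
```

The caller still has to reassign the result, because the function receives a copy just like the loop version does.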

Related

R for loop behavior

Suppose I have the following R loop:
for(i in 1:5){
  print(i)
  i = i + 1
}
This produces the following result:
1
2
3
4
5
Which is weird since I did redefine the index inside the loop.
Why do we see this behavior?
I thought I would see something like the following:
1
3
4
5
6
Assignment to the looping variable i is discarded on the next loop iteration, so i = i + 1 has no effect. If you want to change the value of i within the loop, you can use a while loop.
However, your intended output of 1 3 4 5 6 doesn't make sense for a couple of reasons:
assignment of i, as already mentioned;
if the increment did take effect, it would apply on every iteration (giving 1 3 5), not just once; and
the domain of i, 1:5, is locked in at the first pass of the loop.
Having said that, try this:
i <- 1
lim <- 5
while (i <= lim) {
  if (i == 2) {
    lim <- lim + 1
  } else {
    print(i)
  }
  i <- i + 1
}
# [1] 1
# [1] 3
# [1] 4
# [1] 5
# [1] 6
(My only caution here is that if your increment of i is conditional on anything, there needs to be something else that prevents this from being an infinite loop. Perhaps this caution is unnecessary in your use-case, but it seems relevant to at least mention it.)
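To see the two points about the loop variable and the fixed domain in isolation, here is a small sketch (the variable names seen, passes, n and j are only for illustration). It records what i holds at the start of each iteration and counts how many times a loop runs when its upper bound is changed inside the body.

```r
seen <- integer(0)
for (i in 1:5) {
  seen <- c(seen, i)  # i holds the value taken from 1:5
  i <- i + 1          # overwritten when the next iteration starts
}
seen  # 1 2 3 4 5
i     # 6, the last assignment is visible only once the loop has ended

# The sequence is evaluated once, so changing n inside does not extend the loop.
n <- 3
passes <- 0
for (j in seq_len(n)) {
  n <- 10
  passes <- passes + 1
}
passes  # 3
```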
Here i takes each value of the sequence in turn, and that cannot be changed with i = i + 1. We may use an if condition instead:
for(i in 1:5){
  if(i != 2)
    print(i)
}
-output
[1] 1
[1] 3
[1] 4
[1] 5
Also, if the intention is to subset a vector, why not use a vectorized option:
v1 <- 1:5
v1[v1 != 2]

Executing Basic Addition with for Loop (Trying to Understand how R Interprets for Loops)

I have a vector named jvec that consists of the numbers 1 to 9, and I simply want to use a for loop to add 1 to every number in the vector that is bigger than 3 and print the result once. I've tried this a number of ways and all have failed.
jvec <- c(1:9)
for (x in jvec) {
  if (x > 3) {
    x + 1
  }
}
print(jvec)
(won't work)
This won't work either:
jvec <- c(1:9)
for (x in jvec[c(x)]) {
  if (jvec[c(x)] > 3) {
    jvec[c(x)+1]
    print(jvec)
  }
}
Could someone please explain why neither of these options does the trick, as well as how to do it correctly? Thanks!
You could iterate over the index of jvec and update jvec only if the value is greater than 3.
jvec <- 1:9
for (x in seq_along(jvec)) {
  if (jvec[x] > 3) {
    jvec[x] <- jvec[x] + 1
  }
}
print(jvec)
#[1] 1 2 3 5 6 7 8 9 10
However, you can also do this without a for loop:
jvec <- 1:9
jvec + as.integer(jvec > 3)
#[1] 1 2 3 5 6 7 8 9 10
It is the same as in other programming languages: you have to actually assign the result of your calculation to a variable (to save it).
E.g. this would work (but there are also quicker ways in R):
jvec <- c(1:9)
for (i in 1:length(jvec)) {
  if (jvec[i] > 3) {
    jvec[i] <- jvec[i] + 1
  }
}
print(jvec)
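One of the quicker ways alluded to above is logical indexing, which still assigns the result back into the vector but does so for all matching elements at once. A minimal sketch, where big is just an illustrative name:

```r
jvec <- 1:9
big <- jvec > 3              # TRUE for the elements to change
jvec[big] <- jvec[big] + 1L  # assign the incremented values back into jvec
print(jvec)
#[1] 1 2 3 5 6 7 8 9 10
```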

Printing user given values in Fibonacci sequence in R

I am trying to write a function that prints the values of the Fibonacci sequence that do not exceed a user-given value (n). So input 8 should return the values (1,1,2,3,5,8).
Fib<- function(n){
  v=NULL
  v[1]<-1
  v[2]<-1
  for(i in 3:n){
    v[i]<-v[i-1]+v[i-2]
    while(v[i]<=n){
      print(v)
      break
    }
  }
}
input
Fib(8)
[1] 1 1 2
[1] 1 1 2 3
[1] 1 1 2 3 5
[1] 1 1 2 3 5 8
I would want only the last one printed out.
I also tried it with append(v,v[i]) but haven't got that working so it would return only values below n.
Will appreciate any tips given.
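You can try a recursive approach (if you want, you can modify this to let the limit be an input, but I like it better this way to conserve stack space):
V = c(1,1)
Limit = 10
fib = function(n){
  if(n > Limit){ # stop once the latest term exceeds Limit
    print(V[V <= Limit]) # drop the term that overshot the limit
    return(invisible(NULL))
  }else{
    n = sum(tail(V, 2)) # next term is the sum of the last two
    V <<- c(V,n) # append to the global vector
    return(fib(n))
  }
}
fib(0)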
You can use a single while loop, without break, to achieve this:
Fib<- function(n){
  v=NULL
  v[1]<-1
  v[2]<-1
  i<-2
  while(v[i]<=n)
  {
    i<-i+1
    v[i]<-v[i-1]+v[i-2]
  }
  # the last element computed exceeds n, so drop it before printing
  print(v[-length(v)])
}
This is your desired output:
Fib(8)
[1] 1 1 2 3 5 8
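If the values are needed for further work rather than just printed, a variant that returns the vector may be more convenient. The following is only a sketch under that assumption; fib_upto is a hypothetical name, not something from the answers above.

```r
# Hypothetical helper: Fibonacci numbers (starting 1, 1) that do not exceed n.
fib_upto <- function(n) {
  v <- c(1, 1)
  repeat {
    nxt <- sum(tail(v, 2))  # next term is the sum of the last two
    if (nxt > n) break      # stop before storing a term above n
    v <- c(v, nxt)
  }
  v[v <= n]                 # also handles n < 1, where even the seeds are too big
}

fib_upto(8)
```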

How do I set up a function that displays a set vector only when 3 is entered?

I am trying to write a function that creates a vector that counts up and back based on the number given c(1:n, (n-1):1). When 3 is entered, however, I want the vector to display as 1,1,1,2,2,2,3,3,3 instead of 1,2,3,2,1. I have tried using if(n==3), but I get an error when I try to run it that says "n cannot be found", but I can't quite understand why. Any help is very much appreciated! Here is what I have tried:
vector<-function(n)
c(1:n, (n-1):1)
if(n==3)
c(rep(1,3),rep(2,3),rep(3,3))
Problems
There are several problems with the code in the question:
the { ... } are missing from the function, so only the first line after the function line is actually regarded as part of the function.
a function returns the value of the last statement executed, and in the question that is the if or the body of the if, so the c(1:n, (n-1):1) statement is computed but can never be returned.
also, if n=1 then c(1:n, (n-1):1) gives 1,0,1, which is not likely what you want.
c(rep(1,3),rep(2,3),rep(3,3)) is not wrong in terms of the result it gives, but rep can be used in a more compact manner.
normally x:y is not used in programming because if y < x it unexpectedly gives values descending from x to y. In this case the if statements exclude such a possibility, but you might want to replace the colon with the appropriate seq anyway. The "Alternatives for second leg of last if" section below provides such an alternative.
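The colon pitfall from the list above is easy to reproduce at the console. This short sketch contrasts it with seq_len for n = 1 and n = 0:

```r
n <- 1
c(1:n, (n - 1):1)                   # 1 0 1, because 0:1 counts upwards
c(seq_len(n), rev(seq_len(n - 1)))  # 1

n <- 0
1:n         # 1 0, the colon counts down instead of giving nothing
seq_len(n)  # integer(0)
```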
Solution
Instead try this. It first checks if n is less than 1 and if so returns a zero length vector; otherwise, the remaining if is run with two legs, one leg for the n = 1 or n = 3 case and one leg for the remaining cases.
(If you are willing to only have this work for n > 0 then we could omit the first if. If you are willing to only have this work for n > 1 then we could omit the n==1 part of the condition in the last if too.)
myfun <- function(n) {
  if (n < 1) integer(0)
  else if (n == 1 || n == 3) rep(1:n, each = n)
  else c(1:n, (n-1):1)
}
giving:
myfun(-1)
## integer(0)
myfun(0)
## integer(0)
myfun(1)
## [1] 1
myfun(2)
## [1] 1 2 1
myfun(3)
## [1] 1 1 1 2 2 2 3 3 3
myfun(4)
## [1] 1 2 3 4 3 2 1
Alternatives for first leg of last if
Here are some alternatives for the first leg, shown with n <- 3.
rep(1:n, each = n)
## [1] 1 1 1 2 2 2 3 3 3
c(outer(rep(1, n), 1:n))
## [1] 1 1 1 2 2 2 3 3 3
c(col(diag(n)))
## [1] 1 1 1 2 2 2 3 3 3
Alternatives for second leg of last if
and here are some alternatives for the second leg. The first assumes n > 1 and the others assume n > 0. In the code in the Solution section we handle n=1 in the n=3 leg so any of the following could be used. As the first alternative below does not handle n=1 it relies on the fact that the first leg of the last if handles n=1; however, the remaining alternatives below can handle n=1 correctly so they could be used even if we only have the first leg handle n=3.
c(1:n, (n-1):1) # only works for n > 1
c(seq_len(n), rev(seq_len(n-1)))
pmin(seq(2*n - 1), seq(2*n-1, 1))
n - abs((n-1):-(n-1))
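As a quick sanity check, the three alternatives that are meant to work for n > 0 can be wrapped in functions and compared over a few values of n. The names alt_a, alt_b and alt_c are hypothetical and used only here.

```r
alt_a <- function(n) c(seq_len(n), rev(seq_len(n - 1)))
alt_b <- function(n) pmin(seq(2 * n - 1), seq(2 * n - 1, 1))
alt_c <- function(n) n - abs((n - 1):-(n - 1))

# Compare the three versions element by element for n = 1, ..., 6.
agree <- sapply(1:6, function(n) {
  all(alt_a(n) == alt_b(n)) && all(alt_a(n) == alt_c(n))
})
all(agree)

alt_a(1)  # 1
alt_a(4)  # 1 2 3 4 3 2 1
```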
Try this one; it works:
vector<-function(n)
{
  if(n==3)
    rep(1:3, each=3)
  else
    c(1:n, (n-1):1)
}
I assume you ran the function as a one-liner and it worked, and then you added the conditional statement.
Try this
vector<-function(n){
  m <- c(1:n, (n-1):1)
  if(n==3) m <- c(rep(1,3),rep(2,3),rep(3,3))
  m
}
Another way to do it
Vector2 <- function(n){
  if(n == 3){
    return(c(rep(1,3),rep(2,3),rep(3,3)))
  } else {
    return(c(1:n, (n-1):1))
  }
}

Translate mathematical function into R

I have this mathematical function
I have written R code:
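result <- 0
for (i in seq_along(v)) { # visit every element of v, not just the last index
  result <- result + abs(x-v[i]) # accumulate instead of overwriting
}
return(result)
to compute the function.
However, this does not seem efficient to me. How can I implement this sum with the R sum() function?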
I appreciate your answer!
sum(abs(x-v)) should be enough; there is no need for the for loop, since arithmetic operations in R are vectorized.
# Example
> x <- 5
> v <- 1:10
> abs(x-v)
[1] 4 3 2 1 0 1 2 3 4 5
> sum(abs(x-v))
[1] 25
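To check that the vectorized form matches an explicit loop, the sketch below wraps both in functions. It assumes the intended formula is the sum of |x - v[i]| over all elements of v, which is what the answer computes; abs_dev_loop and abs_dev_vec are hypothetical names.

```r
# Loop version: accumulate |x - v[i]| one element at a time.
abs_dev_loop <- function(x, v) {
  total <- 0
  for (vi in v) {
    total <- total + abs(x - vi)
  }
  total
}

# Vectorized version: abs() and - work on the whole vector at once.
abs_dev_vec <- function(x, v) sum(abs(x - v))

abs_dev_loop(5, 1:10)  # 25
abs_dev_vec(5, 1:10)   # 25
```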

Resources