I'm trying to write a for loop over a dataset. To keep it simple, here is an example:
Two variables, X and Y.
X = 3, 6, 9
Y = 4, 8, 12
I want to make a loop that does this:
(Xi - Yi)^2, so first (3-4)^2, then
(6-8)^2 and so on.
Then, after that is done, multiply by this:
((1/2)/(n*(n-1))).
In this example, it would be:
(3-4)^2 + (6-8)^2 + (9-12)^2 = 1 + 4 + 9 = 14
1/2 / (3*(3-1)) = 0.5 / 6 = 0.0833.
0.0833 * 14 = 1.166.
result <- 0
sum <- rep(NA, n)
for (i in (1:n)) {
for(j in (1:n)) {
sum <- ((gathered$X[i] - gathered$X[j])^2)
}
}
In R you can usually avoid for loops. In your case you can do:
sum((X - Y)^2) * (1/2)/(length(X) * (length(X) - 1))
#[1] 1.166666667
If you do want a for loop, use a single loop, since you need to access X[i] and Y[i] together.
total <- 0  # named total rather than sum, to avoid confusion with base::sum()
n <- length(X)
for (i in seq_len(n)) {
  total <- total + (X[i] - Y[i])^2
}
total * (1/2)/(n*(n-1))
#[1] 1.1667
data
X = c(3, 6, 9)
Y = c(4, 8, 12)
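For reuse, the two approaches above can be wrapped in small functions and checked against each other. This is only a sketch: the function names `scaled_sq_diff_vec` and `scaled_sq_diff_loop` are made up for illustration and do not come from the question.

```r
# Hypothetical helper names; X and Y are the example vectors from the question.
X <- c(3, 6, 9)
Y <- c(4, 8, 12)

# Vectorized version: sum of squared paired differences times (1/2) / (n * (n - 1)).
scaled_sq_diff_vec <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) > 1)
  n <- length(x)
  sum((x - y)^2) * (1 / 2) / (n * (n - 1))
}

# Loop version: a single loop, because x[i] is always paired with y[i].
scaled_sq_diff_loop <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) > 1)
  n <- length(x)
  total <- 0
  for (i in seq_len(n)) {
    total <- total + (x[i] - y[i])^2
  }
  total * (1 / 2) / (n * (n - 1))
}

res_vec <- scaled_sq_diff_vec(X, Y)
res_loop <- scaled_sq_diff_loop(X, Y)
```

Both `res_vec` and `res_loop` should equal 14/12 for the example data, matching the hand calculation in the question.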
How about this? I think outer fits your problem.
CASE 1 ( X-Y )
sum(diag(outer(X,Y,function(X,Y)(X-Y)^2))) *
(1/2)/(length(X) * (length(X) - 1))
1.166667
CASE 2 ( all X and Y calculation )
sum(outer(X,Y,function(X,Y)(X-Y)^2)) *
(1/2)/(length(X) * (length(X) - 1))
15.5
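To see why CASE 1 matches the paired calculation, it may help to look at the matrix that outer builds: entry [i, j] is (X[i] - Y[j])^2, so the diagonal holds exactly the paired terms. A small sketch, with `sq_diff` and `paired_terms` as illustrative names:

```r
X <- c(3, 6, 9)
Y <- c(4, 8, 12)

# sq_diff[i, j] is (X[i] - Y[j])^2 for every combination of i and j.
sq_diff <- outer(X, Y, function(x, y) (x - y)^2)

# The diagonal keeps only the pairs with i == j, i.e. (X[i] - Y[i])^2.
paired_terms <- diag(sq_diff)

n <- length(X)
case1 <- sum(paired_terms) * (1 / 2) / (n * (n - 1))
case2 <- sum(sq_diff) * (1 / 2) / (n * (n - 1))
```

Note that outer computes all n^2 combinations, so for long vectors the plain sum((X - Y)^2) does much less work when only the paired terms are needed.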
I am trying to loop through a matrix but can't find an easy and elegant way to avoid writing many (>10) equations. Can anyone help me, please?
My Matrix looks like this:
and I want to calculate the following:
(0 * 0 * 4/24) + (0 * 1 * 6/24) + (0 * 2 * 3/24) + (1 * 0 * 3/24) + (1 * 1 * 4/24) + (1 * 2 * 4/24)
instead of using
__
By the way, here is my code for the matrix:
vals<- c(4/24, 6/24, 3/24, 3/24, 4/24, 4/24)
x <- c(0,1)
y <- c(0,1,2)
df <- matrix(vals, byrow = TRUE, nrow = 2, ncol = 3,
dimnames = list(x,y))
Instead of calculating each step manually, I think there should be a for-loop method, but I can't figure it out.
A possible solution:
c(x %*% df %*% y)
#> [1] 0.5
Another possible solution, based on outer:
sum(outer(x, y, Vectorize(\(x,y) x*y*df[x+1,y+1])))
#> [1] 0.5
x <- c(0, 1)
y <- c(0, 1, 2)
vals<- c(4/24, 6/24, 3/24, 3/24, 4/24, 4/24)
mat <- matrix(vals, byrow = TRUE, nrow = 2, ncol = 3,
dimnames = list(x,y)) ## not a data frame; don't call it "df"
There is even a better way than a for loop:
sum(tcrossprod(x, y) * mat)
#[1] 0.5
sum((x %o% y) * df)
Explanation:
x %o% y gets the outer product of vectors x and y which is:
#>      [,1] [,2] [,3]
#> [1,]    0    0    0
#> [2,]    0    1    2
Since that has the same dimensions as df, you can multiply the corresponding elements and get the sum: sum((x %o% y) * df)
If you are new to R (as I am), here is the loop approach.
result = 0
for (i in 1:length(x)) {
for (j in 1:length(y)) {
result = result + x[i] * y[j] * df[i, j]
}
}
result
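As a quick sanity check, the approaches from the answers above can be run side by side on the matrix from the question. The sketch below calls the matrix `mat` and uses `seq_along()` instead of `1:length(x)`, a common precaution in case a vector is empty. The names `by_matprod`, `by_outer` and `by_loop` are illustrative.

```r
x <- c(0, 1)
y <- c(0, 1, 2)
vals <- c(4/24, 6/24, 3/24, 3/24, 4/24, 4/24)
mat <- matrix(vals, byrow = TRUE, nrow = 2, ncol = 3, dimnames = list(x, y))

# Matrix-product form: x (as a row vector) times mat times y.
by_matprod <- c(x %*% mat %*% y)

# Outer-product form: weight each cell by x[i] * y[j], then sum.
by_outer <- sum((x %o% y) * mat)

# Explicit double loop over rows (i) and columns (j).
by_loop <- 0
for (i in seq_along(x)) {
  for (j in seq_along(y)) {
    by_loop <- by_loop + x[i] * y[j] * mat[i, j]
  }
}
```

All three values should come out as 0.5, which is the value of the expression written out by hand in the question.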
I need help with code that generates random numbers subject to a constraint.
Specifically, I am trying to simulate random numbers ALFA and BETA from a Normal and a Gamma distribution, respectively, such that ALFA - BETA < 1.
Here is what I have written, but it does not work at all.
set.seed(42)
n <- 0
repeat {
n <- n + 1
a <- rnorm(1, 10, 2)
b <- rgamma(1, 8, 1)
d <- a - b
if (d < 1)
alfa[n] <- a
beta[n] <- b
l = length(alfa)
if (l == 10000) break
}
Due to vectorization, it will be faster to generate the numbers "all at once" rather than in a loop:
set.seed(42)
N = 1e5
a = rnorm(N, 10, 2)
b = rgamma(N, 8, 1)
d = a - b
alfa = a[d < 1]
beta = b[d < 1]
length(alfa)
# [1] 36436
This generated 100,000 candidates, 36,436 of which met your criterion. If you want n samples, try setting N = 4 * n; that will probably generate more than enough, and you can then keep the first n.
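The oversample-then-trim idea can be wrapped in a function that keeps drawing batches until enough pairs have been accepted. This is a sketch under assumptions: the name `sample_constrained` and the `oversample` argument are hypothetical, and the factor of 4 is just the rule of thumb suggested above.

```r
# Hypothetical helper: draw n pairs (alfa, beta) with alfa - beta < 1.
sample_constrained <- function(n, oversample = 4) {
  alfa <- numeric(0)
  beta <- numeric(0)
  while (length(alfa) < n) {
    # Draw a whole batch of candidates at once (vectorized).
    a <- rnorm(oversample * n, mean = 10, sd = 2)
    b <- rgamma(oversample * n, shape = 8, rate = 1)
    keep <- (a - b) < 1
    alfa <- c(alfa, a[keep])
    beta <- c(beta, b[keep])
  }
  # Trim to exactly n accepted pairs.
  data.frame(alfa = alfa[seq_len(n)], beta = beta[seq_len(n)])
}

set.seed(42)
draws <- sample_constrained(1000)
```

The while loop normally runs only once or twice, so the vectors are grown per batch rather than per draw.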
Your loop has 2 problems: (a) you need curly braces to enclose multiple lines after an if statement. (b) you are using n as an attempt counter, but it should be a success counter. As written, your loop will only stop if the 10000th attempt is a success. Move n <- n + 1 inside the if statement to fix:
set.seed(42)
n <- 0
alfa = numeric(0)
beta = numeric(0)
repeat {
a <- rnorm(1, 10, 2)
b <- rgamma(1, 8, 1)
d <- a - b
if (d < 1) {
n <- n + 1
alfa[n] <- a
beta[n] <- b
l = length(alfa)
if (l == 500) break
}
}
But the first way is better... due to "growing" alfa and beta in the loop, and generating numbers one at a time, this method takes longer to generate 500 numbers than the code above takes to generate 30,000.
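If a loop is still preferred, one way to avoid the cost of growing the vectors is to preallocate them to the target size and fill them in place. A sketch, where `target` is an illustrative name for the number of accepted pairs wanted:

```r
set.seed(42)
target <- 500
alfa <- numeric(target)  # preallocated to full size
beta <- numeric(target)
n <- 0                   # number of accepted pairs so far
while (n < target) {
  a <- rnorm(1, 10, 2)
  b <- rgamma(1, 8, 1)
  if (a - b < 1) {
    n <- n + 1
    alfa[n] <- a
    beta[n] <- b
  }
}
```

This still draws one number at a time, so the vectorized version is likely to remain the faster option.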
As @Gregor Thomas commented, your attempt fails because the curly braces enclosing the body of the if statement are missing. If you would like to skip {} for the if control, maybe you can try the code below:
set.seed(42)
r <- list()
repeat {
a <- rnorm(1, 10, 2)
b <- rgamma(1, 8, 1)
d <- a - b
if (d < 1) r[[length(r)+1]] <- cbind(alfa = a, beta = b)
if (length(r) == 100000) break
}
r <- do.call(rbind,r)
such that
> head(r)
          alfa      beta
[1,]  9.787751 12.210648
[2,]  9.810682 14.046190
[3,]  9.874572 11.499204
[4,]  6.473674  8.812951
[5,]  8.720010  8.799160
[6,] 11.409675 10.602608
I would like to write code that generates a 3 x 1 vector y according to the following rule (small numbers are chosen for simplicity):
Here x is a 3 x 1 vector. According to the rule, each update of y needs the sum of all y's.
An attempt to code this with an arbitrary x:
x <- c(2,3,1)
y <- c(0,0,0)
for(i in 1:5){
for(j in 1:3){
y[j] <- x[j] + y[j] + sum(y)
}
}
This code is not appropriate because it computes sum(y) term by term.
The inner loop indicates something like this:
y[1] = x[1] + 0 = 2
y[2] = x[2] + 2 = 5
y[3] = x[3] + 2 + 5 = 8
It is not appropriate because the sum(y) term contains one term for y[1], two terms for y[2], and three terms for y[3]. I think sum(y) should instead be 2 + 5 + 8 = 15 for each of y[1], y[2], y[3], according to the rule given above. Moreover, this procedure should be repeated a certain number of times (here 5, as set by the outer loop). On each pass of the outer loop, sum(y) should be computed only once and then used as the sum(y) term for every j in the inner loop.
How should I code this?
You are over-complicating this. Vectorize the inner-loop away:
> x <- c(2,3,1)
> y <- c(0,0,0)
> for(j in 1:5) y <- x + y + sum(y)
> y
[1] 682 687 677
This approach only computes sum(y) once per iteration, which is what you seem to want. As an added benefit, adding vectors in a single operation is much faster than adding them component-wise in a loop.
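To check that sum(y) is taken once per pass, the intermediate vectors can be stored and inspected. A small sketch: the `history` matrix and the names `n_iter` and `s` are illustrative additions, not part of the answer above.

```r
x <- c(2, 3, 1)
y <- c(0, 0, 0)
n_iter <- 5

# One row per iteration, so each update of y can be inspected afterwards.
history <- matrix(NA_real_, nrow = n_iter, ncol = length(x))
for (k in seq_len(n_iter)) {
  s <- sum(y)      # computed once, from the y of the previous iteration
  y <- x + y + s   # the same s is added to every component
  history[k, ] <- y
}
```

The first two rows of `history` are 2 3 1 and 10 12 8, and the last row is 682 687 677, the same final value as above.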
Maybe this will work
myfun <- function(x, y, i) {
y[i] <- x[i] + sum(y)
if (i < length(x)) {
myfun(x, y, i+1)
} else {
return(y)
}
}
x <- c(2, 3, 1)
y <- rep(0, length(x))
myfun(x, y, 1)
# [1] 2 5 8
x <- c(2, 3, 1, 5)
y <- rep(0, length(x))
myfun(x, y, 1)
# [1] 2 5 8 20
I am trying to calculate the sum of this sequence in R.
The sequence has two inputs (1 and 11 in the case below):
1 + (1 * 2/3) + (1 * 2/3 * 4/5) + (1 * 2/3 * 4/5 * 6/7) + ... + (1 * ... * 10/11)
I think defining my own function is the way to go here.
You could try just using old-fashioned loops here:
sum <- 0
num_terms <- 6
for (i in 1:num_terms) {
y <- 1
if (i > 1) {
for (j in 1:(i-1)) {
y <- y * (j*2 / (j*2 + 1))
}
}
sum <- sum + y
}
You can set num_terms to any value of 1 or more. In this case, I use 6 terms because that is the number of terms in your question.
Someone will probably come along and reduce the entire code snippet above to one line, but in my mind an explicit loop is justified here.
Here is a link to a demo which prints out the values being used in each of the terms, for verification purposes:
Demo
My approach:
# input
start <- 1
n <- 5 # number of terms
end <- start + n*2
result <- start
to_add <- start
for (i in seq(start + 1, end - 1, by = 2)) {  # step by 2 so the factors are 2/3, 4/5, ..., 10/11
  to_add <- to_add * (i / (i + 1))
  result <- result + to_add
}
which gives:
> result
[1] 3.4329
Another base R alternative using cumprod to generate the inner terms is
sum(cumprod(c(1, seq(2, 10, 2)) / c(1, seq(3, 11, 2))))
[1] 3.4329
Here, c(1, seq(2, 10, 2)) / c(1, seq(3, 11, 2)) generates the sequence 1, 2/3, 4/5, 6/7, 8/9, 10/11 and cumprod takes the cumulative product. This result is summed with sum. The returned result is identical to the one in the accepted answer.
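The same idea can be written as a function of the last denominator (11 in the question). This is a sketch: `odd_even_series_sum` and its argument `last_odd` are hypothetical names.

```r
# Sum of 1 + 2/3 + (2/3)(4/5) + ... up to the term ending in (last_odd - 1) / last_odd.
odd_even_series_sum <- function(last_odd) {
  stopifnot(last_odd >= 1, last_odd %% 2 == 1)
  if (last_odd == 1) return(1)                   # only the leading term
  denominators <- seq(3, last_odd, by = 2)       # 3, 5, ..., last_odd
  factors <- (denominators - 1) / denominators   # 2/3, 4/5, ...
  sum(cumprod(c(1, factors)))
}

total <- odd_even_series_sum(11)
```

With `last_odd = 11` this reproduces the six-term sum from the question.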
You can try:
library(tidyverse)
Result <- tibble(a=seq(1, 11, 2)) %>%
mutate(b=lag(a, default = 0)+1) %>%
mutate(Prod=cumprod(b)/cumprod(a)) %>%
mutate(Sum=cumsum(Prod))
Result
# A tibble: 6 x 4
      a     b      Prod      Sum
  <dbl> <dbl>     <dbl>    <dbl>
1     1     1 1.0000000 1.000000
2     3     2 0.6666667 1.666667
3     5     4 0.5333333 2.200000
4     7     6 0.4571429 2.657143
5     9     8 0.4063492 3.063492
6    11    10 0.3694084 3.432900
# and some graphical analysis
Result %>%
ggplot(aes(as.factor(a), Prod, group=1)) +
geom_col(aes(as.factor(a), Sum), alpha=0.4)+
geom_point() +
geom_line()
Given n(1) = 1, n(2) = 5, n(3) = 13, n(4) = 25, ...
I am using a for loop to sum these terms:
1 + (1*4 - 4) + (2*4 - 4) + (3*4 - 4) + ...
This is the function I am using with a for loop:
shapeArea <- function(n) {
terms <- as.numeric(1)
for(i in 1:n){
terms <- append(terms, (i*4 - 4))
}
sum(terms)
}
This works fine (as shown here):
> shapeArea(3)
[1] 13
> shapeArea(2)
[1] 5
> shapeArea(4)
[1] 25
However, I was also wondering how to do this without saving the terms of the series in the numeric vector terms. In other words, is there a way to sum the terms without storing them in a vector first? Or is this already the efficient way to do it?
Thanks
You can change your shapeArea function to a one-liner
shapeArea <- function(num) {
1 + sum(seq(num) * 4) - (4 * num)
}
shapeArea(1)
#[1] 1
shapeArea(2)
#[1] 5
shapeArea(3)
#[1] 13
shapeArea(4)
#[1] 25
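Since 4 * (1 + 2 + ... + n) = 2n(n + 1), the sum collapses to 1 + 2n(n + 1) - 4n = 2n^2 - 2n + 1, so no vector of terms is needed at all. A sketch, with `shapeAreaClosed` as a made-up name:

```r
# Closed form of 1 + sum over i = 1..n of (4 * i - 4).
shapeAreaClosed <- function(n) {
  2 * n^2 - 2 * n + 1
}

areas <- shapeAreaClosed(1:4)
```

Because the formula uses only arithmetic, it also works on a whole vector of n values at once, as in the call above.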