The end goal is to take a data frame and create new columns based on multiplication and addition of prior rows. For example, if my multipliers are 0.1, 0.2, and 0.3 and my update is z + [lag(z) * multiplier], where lag(z) is the already transformed value from the previous row, then I want to take column Z and transform it three times as follows (leaving the first row unchanged):
z <- 1:4*10
df <- data.frame(z)
Z   Z_0.1  Z_0.2  Z_0.3
10  10     10     10
20  21     22     23
30  32.1   34.4   36.9
40  43.21  46.88  51.07
I have been able to get the correct values by manually feeding in the rate and overwriting the existing column:
for (i in 1:nrow(df)) {
if (i ==1)
df[i,1] <- df[i,1]
else
df[i,1] <- df[i,1] + (df[i-1,1] * 0.1)
}
Separately, I can also create column placeholders for the new values:
for (i in seq(0.1, 0.3, by = 0.1)) {
cola <- paste('col', i, sep = "_")
df[[cola]] <- 0
}
However, I cannot seem to combine these loops and get the outcome in the above sample table. I have tried this:
for (i in 1:nrow(df2)) {
for (j in seq(0.1, 0.3, by = 0.1)) {
cola <- paste('col', j, sep = "_")
df[[cola]] <- 0
if (i ==1)
df[[cola]] <- df[i,1]
else
df[[cola]] <- df[i,1] + (df[i-1,1] * j)
}
}
But it fills all the new columns with the same values for the whole column
Z   Z_0.1  Z_0.2  Z_0.3
10  77.02  81.85  86.68
20  77.02  81.85  86.68
30  77.02  81.85  86.68
40  77.02  81.85  86.68
Appreciate any suggestions. I'm not married to for loops if anyone has an alternative suggestion.
Like this maybe?
Z <- 1:4*10
y <- seq(0.1, 0.3, by = 0.1)
df <- data.frame(Z)
for (j in seq_along(y)) {
  col <- paste0('Z_', y[j])
  df[1, col] <- Z[1]  # the first row is carried over unchanged
  for (i in 2:length(Z)) {  # same rows as 1:(length(Z)-1)+1, written plainly
    df[i, col] <- Z[i] + df[i-1, col] * y[j]
  }
}
df
#> Z Z_0.1 Z_0.2 Z_0.3
#> 1 10 10.00 10.00 10.00
#> 2 20 21.00 22.00 23.00
#> 3 30 32.10 34.40 36.90
#> 4 40 43.21 46.88 51.07
Created on 2022-09-09 by the reprex package (v2.0.1)
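Since for loops are not a requirement, the same recurrence can also be written without explicit row indexing. This is only a minimal sketch, assuming base R's Reduce() with accumulate = TRUE is acceptable; the names rates and step are illustrative and not from the original posts.

```r
Z <- 1:4 * 10
rates <- c(0.1, 0.2, 0.3)   # illustrative name for the multipliers
df <- data.frame(Z)

for (r in rates) {
  # new value = current Z + previous (already transformed) value * r
  step <- function(prev, cur) cur + prev * r
  # accumulate = TRUE keeps every intermediate result, starting from Z[1]
  df[[paste0("Z_", r)]] <- Reduce(step, Z, accumulate = TRUE)
}
df
```

Each pass of the loop fills one whole column at once, so no placeholder columns are needed beforehand.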
Related
I want to calculate
$$\frac{\sum_{i}^{n} x_{i}^{2}\,\lambda^{n-i}}{\sum_{i}^{n} \lambda^{n-i}}$$
I managed to write the following in R, but zeros occurred. What am I doing wrong?
n = 10
x = seq(1,n,1);x
lambda = 0.99
mat = 0
for (i in 2:n) {
mat = (lambda^(n-i)*x[i-1]^2) / (lambda^(n-i))
}
mat
The result must be the variance of each entry x_{i}.
Very well, I will try to see if this is what you need, The Red:
n <- 10
x <- seq(1,n,1);x
lambda <- 0.99
mat <- 0
vect_mat <- rep(0, n-1); vect_mat
for (i in 2:n) {
mat <- mat + (lambda^(n-i)*x[i-1]^2) / (lambda^(n-i))
vect_mat[i-1] <- mat
}
mat
vect_mat
After running it, it results in:
> mat
[1] 285
> vect_mat
[1] 1 5 14 30 55 91 140 204 285
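Note that when the ratio is taken term by term, as in the loop above, the factor lambda^(n-i) cancels and what remains is a running sum of squares. If the intended quantity is the ratio of the two sums, a minimal sketch could look like the following. It assumes the sums run over i = 1..n (the question does not state the lower limit), and weighted_mean_sq is a hypothetical helper name.

```r
# Hypothetical helper: sum(lambda^(n-i) * x_i^2) / sum(lambda^(n-i)), i = 1..n (assumed limits)
weighted_mean_sq <- function(x, lambda) {
  n <- length(x)
  w <- lambda^(n - seq_len(n))   # weight lambda^(n-i) for each i
  sum(w * x^2) / sum(w)
}

n <- 10
x <- seq_len(n)
lambda <- 0.99

total <- weighted_mean_sq(x, lambda)
# value after each new observation: the same formula applied to x[1..k]
running <- sapply(seq_len(n), function(k) weighted_mean_sq(x[seq_len(k)], lambda))
total
running
```

With lambda = 1 all weights are equal, so the result reduces to the plain mean of x^2, which is a quick sanity check.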
I have a data frame called df2 that has 1501 data points:
Depth <- seq(0, 1500, by = 1)
Temp <- rev(seq(1, 10, by = 0.006))
D0 <- 0
Dend <- 1000
r <- 2
days <- 100
D <- rep(NA, days+1)
D <- D0
Temp <- T0
for (time in seq_len(steps)){
if (tail(D,1) >= Dend) break
D[time + 1] <- r + D[time]
Temp[time] <- Temp[time]
}
I can't seem to couple Temp with D. With the line Temp[time] <- Temp[time], I just get Temp for every metre down to 1500.
One approach is to simplify things a bit by using seq with by= and length.out=.
Then we can use merge to join the results back to df2. It needs to be a data.frame with names to merge onto, so I changed your cbind to data.frame.
Depth <- seq(0, 1500, by = 1)
Temp <- rev(seq(1, 10, by = 0.006))
df2 <- data.frame(Depth, Temp)
D0 <- 0
days <- 107
r <- 40
Result <- data.frame(Day = 0:days,
Depth =seq(from = D0, by= r ,length.out = days + 1))
Result <- merge(Result,df2,all.x=TRUE)
Result
# Depth Day Temp
#1 0 0 10.00
#2 40 1 9.76
#3 80 2 9.52
#4 120 3 9.28
#5 160 4 9.04
#...
By using all.x=TRUE we will get NA when there is no value in df2 for that Depth.
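If only the temperature at each reached depth is needed, a lookup with match() is another way to couple the two. This is a sketch under the same values as the merge example (r = 40, days = 107); TempAtD is an illustrative name.

```r
Depth <- seq(0, 1500, by = 1)
Temp <- rev(seq(1, 10, by = 0.006))
df2 <- data.frame(Depth, Temp)

D0 <- 0
r <- 40
days <- 107
D <- seq(from = D0, by = r, length.out = days + 1)  # depth reached on each day

# match() returns the row of df2 whose Depth equals each D, or NA if there is none
TempAtD <- df2$Temp[match(D, df2$Depth)]
head(data.frame(Day = 0:days, Depth = D, Temp = TempAtD))
```

As with all.x=TRUE, depths beyond the range of df2 come back as NA rather than being dropped.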
I have hundreds of TimeSeries lines, each corresponding to unique values of a set of parameters. I put all the data in one large dataframe. The data looks like this (containing 270 TimeSeries):
> beginning
TimeSeriesID TimeSeries Par1 Par2 Par3 Par4 Par5
1 1 3936.693 51 0.05 1 1 True
2 1 3936.682 51 0.05 1 1 True
3 1 3945.710 51 0.05 1 1 True
4 1 3937.385 51 0.05 1 1 True
5 1 3938.050 51 0.05 1 1 True
6 1 3939.387 51 0.05 1 1 True
> end
TimeSeriesID TimeSeries Par1 Par2 Par3 Par4 Par5
3600452 270 -16.090 190 0.025 5 5 False
3600453 270 -21.120 190 0.025 5 5 False
3600454 270 -14.545 190 0.025 5 5 False
3600455 270 -23.950 190 0.025 5 5 False
3600456 270 -4.390 190 0.025 5 5 False
3600457 270 -3.180 190 0.025 5 5 False
What I am trying to achieve is for the Shiny app to let the user vary whichever parameters he wants, read that input, and plot every TimeSeries that satisfies those values in one plot. The plot will therefore show a different number of lines depending on the user's input, ranging from one (when all parameters are set to a specified value) to 270 (when no parameters are chosen and all TimeSeries are plotted).
I have had no success so far, so there is nothing I can share that may help solve the problem, although I have spent many days on and off on it. So far I have been trying to use reactivePlot() and to specify the lines by adding geom_line() in ggplot2. Now I am looking into whether the aes() parameter offers a way to achieve what I need. I have also read about converting data into long format with reshape2, but I am not sure that is what I need, since I am working with TimeSeries data.
Thank you in advance.
In the end I went for a base R solution. Not perfect, but suited my needs:
equityplot.IDs <- function()
{
bounds <- c(-6000, 100000) #c(min(sapply(eq.list, min)), max(sapply(eq.list, max)))
colors <- rainbow(length(outputIDs()[[2]]))
j <- 1
indexy <- c(0, 6000)
# Plot
plot(NULL,xlim=indexy,ylim=bounds)
for (i in 1:length(equitieslist))
{
if(i %in% outputIDs()[[2]])
{
profit <- rev(equitieslist[[i]][,1]) #$Profit1)
lines(1:length(profit), profit, col=colors[j])
j <- j + 1
}
}
}
After more experimenting, currently working with this:
ggpokus <- function(n) {
mymin <- function(N = n){
m <- Inf
for (i in 1:N)
{
g <- length(equitieslist[[i]][,1])
if (g < m) {m <- g}
}
return (m)
}
mylength <- mymin()
# t <- paste("qplot(1:", mylength, ", rev(equitieslist[[", 1, "]][,1])[1:", mylength, "], geom = \"line\")", sep = "")
t <- paste("qplot(1:", mylength, ", rev(equitieslist[[", 1, "]][,1])[1:", mylength, "], geom = \"line\", ylim = c(0, 5000))", sep = "")
cols <- rainbow(n)
for (i in 1:n) {
p <- paste("rev(equitieslist[[", i+1, "]][,1])[1:", mylength, "])", sep = "")
c <- paste("\"", cols[i+1], "\"", sep = "") # paste("cols[", i, "]", sep = "")
t <- c(paste(t, " + geom_line(aes(y = ", p,", colour = ", c, ")", sep = ""))
}
# cat(t)
# cat("\n")
return (t)
}
options(expressions=10000)
z <- ggpokus(1619)
eval(parse(text=z))
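For comparison, here is a sketch of an approach that avoids building code as strings: filter the long data frame by the chosen parameters, then let ggplot2 draw one line per TimeSeriesID through the group aesthetic. The toy data, the Time column, and the helper filter_series are assumptions for illustration; they are not part of the original data or code.

```r
# Toy stand-in for the large data frame (made-up values, same column names)
set.seed(1)
toy <- data.frame(
  TimeSeriesID = rep(1:4, each = 5),
  Time         = rep(1:5, times = 4),   # assumed time index, not in the original data
  TimeSeries   = rnorm(20),
  Par1         = rep(c(51, 51, 190, 190), each = 5),
  Par5         = rep(c("True", "False", "True", "False"), each = 5)
)

# Hypothetical helper: keep rows matching every parameter the user chose.
# A NULL entry means "no choice", so that parameter is not filtered on.
filter_series <- function(df, choices) {
  keep <- rep(TRUE, nrow(df))
  for (par in names(choices)) {
    if (!is.null(choices[[par]])) keep <- keep & df[[par]] %in% choices[[par]]
  }
  df[keep, , drop = FALSE]
}

sel <- filter_series(toy, list(Par1 = 51, Par5 = NULL))

# One geom_line() call draws however many series survive the filter
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(sel, aes(x = Time, y = TimeSeries,
                       group = TimeSeriesID, colour = factor(TimeSeriesID))) +
    geom_line()
}
```

In a Shiny app the choices list would presumably be built from the input values inside a reactive expression, with the plot printed from the render function.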
N <- c(1,3,4,6)
a <- c(3,4,5,6)
b <- c(4,5,6,7)
w <- c(5,6,7,6)
dat1 <- data.frame(N,May = a, April = b,June = w)
N May April June
1 1 3 4 5
2 3 4 5 6
3 4 5 6 7
4 6 6 7 6
I need a data frame where each value is the sd of the N value and the corresponding row value:
sd(c(1,3)) sd(c(1,4)) sd(c(1,5)) # for the first row
sd(c(3,4)) sd(c(3,5)) sd(c(3,6)) # for the second, and so on.
Try this:
The data:
Norm <- c(1,3,4,6)
a <- c(3,4,5,6)
b <- c(4,5,6,7)
w <- c(5,6,7,6)
mydata <- data.frame(Norm=Norm,May = a, April = b,June = w)
Solution:
finaldata <- do.call('cbind',lapply(names(mydata)[2:4], function(x) apply(mydata[c("Norm",x)],1,sd)))
I hope it helps.
Piece of advice:
Please refrain from using names like data and norm for your variable names. They can easily conflict with things that are native to R. For example norm is a function in R, and so is data.
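Because each sd here is taken over exactly two numbers, it has a closed form: for a pair (a, b) the sample variance is (a - b)^2 / 2, so sd(c(a, b)) equals |a - b| / sqrt(2). A minimal sketch that uses this instead of a row-wise apply(); sd_pairs is an illustrative name.

```r
Norm <- c(1, 3, 4, 6)
mydata <- data.frame(Norm = Norm,
                     May = c(3, 4, 5, 6),
                     April = c(4, 5, 6, 7),
                     June = c(5, 6, 7, 6))

# sd of a pair (a, b) is |a - b| / sqrt(2); apply it column by column
sd_pairs <- sapply(mydata[-1], function(col) abs(col - mydata$Norm) / sqrt(2))
sd_pairs
```

The result is a matrix with one column per month, which as.data.frame() can convert if a data frame is required.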
I think I got it
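x <- matrix(data = NA, nrow = 4, ncol = 3)
for (j in 1:3) {
  for (i in 1:4) {
    # pair column 1 (N) with column j+1; unlist() turns the one-row data frame into a numeric vector
    x[i, j] <- sd(unlist(dat1[i, c(1, j + 1)]))
  }
}
x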
I want to create a double sliding window in a for loop. An example data set might look like:
a <- structure(list(a = c(0.0961136, 0.1028192, 0.1106424, 0.1106424,
0.117348, 0.117348, 0.117348, 0.122936, 0.1307592, 0.1307592,
0.1318768, 0.1318768, 0.1385824, 0.1385824, 0.1318768, 0.1251712,
0.1251712, 0.1251712, 0.1251712, 0.1251712)), .Names = "a", row.names = c(NA,
-20L), class = "data.frame")
The code I have so far looks like this:
windowSize <- 5
windowStep <- 1
dat <- list()
for (i in seq(from = 1, to = nrow(a), by = windowStep)){
window1 <- a[i:windowSize, ]
window2 <- a[i:windowSize + windowSize, ]
if (median(window1) <= 0.12 && (median(window1) >= 0.08)) {
p <- "True"
} else
p <- "not"
dat[[i]] <- c(p)
}
result <- as.data.frame(do.call(rbind, dat))
This example shows that I require two windows of size 5 (data points), one sliding in front of the other, moving by 1 data point at a time. The example does not use window 2 because it doesn't work (I will need it to work eventually). Using just window1 to calculate the median (in this case) at each step runs, but the output is incorrect. The if statement says that if the median of window 1 is between 0.08 and 0.12 then output "True", else "not".
Output for my for loop =
1 True
2 True
3 True
4 True
5 True
6 True
7 True
8 True
9 True
10 not
11 not
12 not
13 not
14 not
15 not
16 not
17 not
18 not
19 not
20 not
Correct output as checked using rollapply (and obviously can be seen by eye)
rollapply(a, 5, FUN = median, by = 1, by.column = TRUE, partial = TRUE, align = c("left"))
should be:
1 True
2 True
3 True
4 not
5 not
6 not
7 not
8 not
9 not
10 not
11 not
12 not
13 not
14 not
15 not
16 not
17 not
18 not
19 not
20 not
Could the solution remain as a for loop if possible as I have much more to add but need to get this right first. Thanks.
This gets close. Modified from: https://stats.stackexchange.com/questions/3051/mean-of-a-sliding-window-in-r
windowSize <- 10
windowStep <- 1
Threshold <- 0.12
# as.vector() does not turn a data frame into a numeric vector, so extract the column
data <- a$a
slideFunct <- function(data, windowSize, windowStep){
  total <- length(data)
  # start index of every full window
  dataLength <- seq(from=1, to=(total-windowSize+1), by=windowStep)
  result <- character(length(dataLength))
  for(i in seq_along(dataLength)){
    # exactly windowSize points per window
    window <- data[dataLength[i]:(dataLength[i]+windowSize-1)]
    result[i] <- if (median(window) <= Threshold) "True" else "not"
  }
  return(result)
}
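To stay with a plain for loop and both windows, the main thing to change in the question's code is the indexing: i:windowSize always ends at row 5 rather than at i + 4, and i:windowSize + windowSize is parsed as (i:windowSize) + windowSize. A minimal sketch with both windows anchored to i follows. The names lastStart and starts and the layout of the result are illustrative choices, and the loop stops where the second window would run past the data.

```r
a <- data.frame(a = c(0.0961136, 0.1028192, 0.1106424, 0.1106424,
  0.117348, 0.117348, 0.117348, 0.122936, 0.1307592, 0.1307592,
  0.1318768, 0.1318768, 0.1385824, 0.1385824, 0.1318768, 0.1251712,
  0.1251712, 0.1251712, 0.1251712, 0.1251712))

windowSize <- 5
windowStep <- 1

# last start index that still leaves room for two full windows
lastStart <- nrow(a) - 2 * windowSize + 1
starts <- seq(from = 1, to = lastStart, by = windowStep)

dat <- vector("list", length(starts))
for (k in seq_along(starts)) {
  i <- starts[k]
  window1 <- a$a[i:(i + windowSize - 1)]                     # rows i .. i+4
  window2 <- a$a[(i + windowSize):(i + 2 * windowSize - 1)]  # rows i+5 .. i+9
  m1 <- median(window1)
  p <- if (m1 >= 0.08 && m1 <= 0.12) "True" else "not"
  dat[[k]] <- data.frame(start = i, median1 = m1,
                         median2 = median(window2), flag = p)
}
result <- do.call(rbind, dat)
result
```

Keeping both medians in the result makes it easy to compare the two windows in later steps of the loop.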