Using for loops to save a matrix of data in R?

I am trying to find the probability density functions of wind data. Below is how I calculate the scale parameter from the mean wind speeds.
k <- 2
for (i in 1:length(Windmean)) {
  Scale[i] <- as.numeric(Windmean[i] / exp(gammaln(1 + (1/k))))
}
> Scale
[1] 3.913934 3.996000 4.012884 3.925220 3.856707 3.849608 3.820578 3.943110 3.945975 3.842338 3.891791
[12] 3.933083 3.993944 3.907775 3.847120 3.853263 3.917156 4.028956 3.878879 3.753880 3.969074 3.818923
[23] 3.855913 3.993075 3.985828 3.914240 3.854336 3.620460 3.848180 3.843788 3.830617 3.841890 3.879547
[34] 3.904059
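As an aside, the loop above is not strictly needed: arithmetic in R is vectorised, and exp(gammaln(z)) is simply gamma(z). A minimal sketch, assuming gammaln comes from a package such as pracma and Windmean is a plain numeric vector:
k <- 2
# Scale all mean wind speeds at once; gamma(1 + 1/k) equals exp(gammaln(1 + 1/k))
Scale <- Windmean / gamma(1 + 1/k)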
Given these results for the scale parameter, I want to use the formula below to get the wind probability Wind_prob. Here is what I do:
Scale<- cbind(3.913934,3.996000,4.012884,3.925220,3.856707,3.849608,
3.820578,3.943110,3.945975,3.842338,3.891791,3.933083,3.993944,3.907775,
3.847120,3.853263,3.917156,4.028956,3.878879,3.753880,3.969074,3.818923,
3.855913,3.993075,3.985828,3.914240,3.854336,3.620460,3.848180,3.843788,
3.830617,3.841890,3.879547,3.904059) ##Length 34
bins<-cbind(seq(0.5,25,by=0.5)) ##Length 51
bins<-cbind(bins)
shape<-k
for (i in 1:length(bins)) {
  for (o in 1:length(Scale)) {
    Wind_prob[i] <- 0.5 * exp(-1 * (bins[i, 1] / shape)^shape) *
      (shape / as.numeric(Scale[o])) *
      ((bins[i, 1] / as.numeric(Scale[o]))^(shape - 1))
  }
}
I am getting a list of 51 probability values, but I should get a [51 * 34] matrix: 51 probability values for each of the 34 scale parameters. It seems my iterations are not being saved to a matrix. I tried as.matrix and as.array on Wind_prob as well but couldn't get it working. Would anyone be kind enough to point out the changes I need to make in the code? Thanks.

You need to preallocate a matrix and index it with both loop variables:
Scale <- c(3.913934, 3.996000, 4.012884, 3.925220, 3.856707, 3.849608,
           3.820578, 3.943110, 3.945975, 3.842338, 3.891791, 3.933083, 3.993944, 3.907775,
           3.847120, 3.853263, 3.917156, 4.028956, 3.878879, 3.753880, 3.969074, 3.818923,
           3.855913, 3.993075, 3.985828, 3.914240, 3.854336, 3.620460, 3.848180, 3.843788,
           3.830617, 3.841890, 3.879547, 3.904059) # length 34
bins <- seq(0.5, 25, by = 0.5) # length 50
shape <- 2
Wind_prob <- matrix(NA_real_, length(bins), length(Scale))
for (j in seq_along(Scale)) {
  for (i in seq_along(bins)) {
    Wind_prob[i, j] <- 0.5 * exp(-1 * (bins[i] / shape)^shape) *
      (shape / Scale[j]) *
      ((bins[i] / Scale[j])^(shape - 1))
  }
}
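The explicit double loop can also be dropped entirely: outer() evaluates a function over every (bin, scale) pair and returns the length(bins) x length(Scale) matrix directly. A sketch using the same bins, Scale and shape as above (the formula is kept exactly as written in the question, with bins/shape inside the exponential; the standard Weibull density would use bins/Scale there):
# Build the 50 x 34 matrix without loops; outer() applies the function to
# every combination of a bin value and a scale value.
Wind_prob2 <- outer(bins, Scale, function(b, s)
  0.5 * exp(-(b / shape)^shape) * (shape / s) * (b / s)^(shape - 1))
all.equal(Wind_prob, Wind_prob2)  # should be TRUE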

Related

How to fix this R code to get the correct values for 1 to 100?

In the following code, when I use N = 100 I get the answer 5.352954, but when I enter N = 1:100 to get a table of all 100 values, the 100th value shown is 1.123200e-27. Why the discrepancy? I was hoping the table would show the 100th value as 5.352954. Help with fixing this code would be appreciated. Thank you very much.
Task_binom <- function(N, Time, sigma, rho, St, K, put = T) {
  h <- Time / N
  rf <- rho * h
  sd <- sigma * sqrt(h)
  u <- exp(sd)
  d <- 1 / u
  pi <- (1 + rf - d) / (u - d)
  sum <- 0
  if (put == T) {
    for (i in 1:N) {
      helper <- choose(N, i) * pi^i * (1 - pi)^(N - i) * max(K - St * u^i * d^(N - i), 0)
      sum <- helper + sum
    }
    p_0 <- 1 / (1 + rf)^N * sum
    return(p_0)
  } else {
    for (i in 1:N) {
      helper <- choose(N, i) * pi^i * (1 - pi)^(N - i) * max(St * u^i * d^(N - i) - K, 0)
      sum <- helper + sum
    }
    p_0 <- 1 / (1 + rf)^N * sum
    return(p_0)
  }
}
Task_binom(100,0.0833,0.3299,0.0472,134.51,134.51,put=F)
Task_binom(1:100,0.0833,0.3299,0.0472,134.51,134.51,put=F)
This is because the function Task_binom is not vectorized over N. You can do the following:
sapply(1:100,Task_binom,0.0833,0.3299,0.0472,134.51,134.51,put=F)
[1] 6.649708 4.784997 5.798963 5.059390 5.624142 5.158321 5.549569 5.209064 5.508332
[10] 5.239888 5.482184 5.260587 5.464130 5.275442 5.450917 5.286621 5.440830 5.295337
[19] 5.432878 5.302323 5.426447 5.308048 5.421139 5.312824 5.416684 5.316870 5.412892
[28] 5.320340 5.409624 5.323350 5.406780 5.325986 5.404282 5.328313 5.402070 5.330382
[37] 5.400097 5.332234 5.398328 5.333902 5.396732 5.335411 5.395285 5.336783 5.393966
[46] 5.338037 5.392760 5.339186 5.391653 5.340244 5.390633 5.341221 5.389690 5.342125
[55] 5.388815 5.342965 5.388003 5.343747 5.387245 5.344477 5.386537 5.345160 5.385874
[64] 5.345801 5.385252 5.346402 5.384668 5.346969 5.384117 5.347503 5.383597 5.348007
[73] 5.383106 5.348485 5.382641 5.348937 5.382200 5.349366 5.381782 5.349774 5.381384
[82] 5.350161 5.381005 5.350531 5.380645 5.350883 5.380301 5.351219 5.379972 5.351540
[91] 5.379658 5.351848 5.379357 5.352142 5.379069 5.352424 5.378793 5.352695 5.378529
[100] 5.352954
Two quick points to check which might solve your problem:
Your for loop is already specified as 1:N, so if you pass N = 1:100 the loop range becomes 1:(1:100); R then uses only the first element of N and issues a warning, which is not what you want.
There are also a number of places where you use N as a single number in your function (e.g. the calculations of h, helper and p_0) which wouldn't work with N = 1:100.
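If you would rather keep calling the function with a vector of N values, a small wrapper built with Vectorize() (which dispatches to mapply() under the hood) does the same job as the sapply() call above. A sketch:
# Make the N argument vector-friendly; each element of N is passed to the
# original scalar function in turn.
Task_binom_v <- Vectorize(Task_binom, vectorize.args = "N")
Task_binom_v(1:100, 0.0833, 0.3299, 0.0472, 134.51, 134.51, put = FALSE)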

Optimization function gives incorrect results for 2 similar data sets

I have 2 datasets that are not very different from each other. Each dataset has 27 rows of actual and forecast values. When tested against Solver in Excel for minimization of the absolute error, abs(actual - par * forecast), they both give nearly equal values for the parameter par. However, when each of these data sets is passed to the same optimization function that I have written, it only works for one of them. For the other data set the objective is always evaluated as zero (0), with par assigned the upper bound value.
This is definitely incorrect, and I am not able to understand why R is doing this.
Here are the 2 data sets:
test
dateperiod,usage,fittedlevelusage
2019-04-13,16187.24,17257.02
2019-04-14,16410.18,17347.49
2019-04-15,18453.52,17246.88
2019-04-16,18113.1,17929.24
2019-04-17,17712.54,17476.67
2019-04-18,15098.13,17266.89
2019-04-19,13026.76,15298.11
2019-04-20,13689.49,13728.9
2019-04-21,11907.81,14122.88
2019-04-22,13078.29,13291.25
2019-04-23,15823.23,14465.34
2019-04-24,14602.43,15690.12
2019-04-25,12628.7,13806.44
2019-04-26,15064.37,12247.59
2019-04-27,17163.32,16335.43
2019-04-28,17277.18,16967.72
2019-04-29,20093.13,17418.99
2019-04-30,18820.68,18978.9
2019-05-01,18799.63,17610.66
2019-05-02,17783.24,17000.12
2019-05-03,17965.56,17818.84
2019-05-04,16891.25,18002.03
2019-05-05,18665.49,18298.02
2019-05-06,21043.86,19157.41
2019-05-07,22188.93,21092.36
2019-05-08,22358.08,21232.56
2019-05-09,22797.46,22229.69
Optimization result from R
$minimum
[1] 1.018188
$objective
[1] 28031.49
test1
dateperiod,Usage,fittedlevelusage
2019-04-13,16187.24,17248.29
2019-04-14,16410.18,17337.86
2019-04-15,18453.52,17196.25
2019-04-16,18113.10,17896.74
2019-04-17,17712.54,17464.45
2019-04-18,15098.13,17285.82
2019-04-19,13026.76,15277.10
2019-04-20,13689.49,13733.90
2019-04-21,11907.81,14152.27
2019-04-22,13078.29,13337.53
2019-04-23,15823.23,14512.41
2019-04-24,14602.43,15688.68
2019-04-25,12628.70,13808.58
2019-04-26,15064.37,12244.91
2019-04-27,17163.32,16304.28
2019-04-28,17277.18,16956.91
2019-04-29,20093.13,17441.80
2019-04-30,18820.68,18928.29
2019-05-01,18794.10,17573.40
2019-05-02,17779.00,16969.20
2019-05-03,17960.16,17764.47
2019-05-04,16884.77,17952.23
2019-05-05,18658.16,18313.66
2019-05-06,21036.49,19149.12
2019-05-07,22182.11,21103.37
2019-05-08,22335.57,21196.23
2019-05-09,22797.46,22180.51
Optimization result from R
$minimum
[1] 1.499934
$objective
[1] 0
The optimization function used is shown below:
optfn <- function(x) {
  act <- x$usage
  fcst <- x$fittedlevelusage
  fn <- function(par) {
    sum(abs(act - (fcst * par)))
  }
  adjfac <- optimize(fn, c(0.5, 1.5))
  return(adjfac)
}
adjfacresults <- optfn(test)
adjfacresults <- optfn(test1)
Can anyone help identify why R is not doing the same thing for the 2 data sets and outputting the correct results in both cases?
The corresponding results using Excel Solver for the 2 datasets are as follows:
For the 'test' data set:
par value = 1.018236659
objective function value (min): 28031
For the 'test1' data set:
par value = 1.01881062927878
objective function value (min): 28010
That's because the second column of test1 is named Usage, not usage. Therefore act <- x$usage is NULL, and for every value of par the function fn returns sum(abs(NULL - something)) = sum(numeric(0)) = 0. You have to rename this column to usage.
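One way to guard against this kind of silent failure (a sketch, assuming the relevant columns are always some case variant of usage and fittedlevelusage) is to normalise the column names and fail loudly when a column is missing:
optfn <- function(x) {
  names(x) <- tolower(names(x))              # so Usage and usage are treated alike
  act  <- x$usage
  fcst <- x$fittedlevelusage
  stopifnot(!is.null(act), !is.null(fcst))   # stop instead of silently optimising over nothing
  fn <- function(par) sum(abs(act - fcst * par))
  optimize(fn, c(0.5, 1.5))
}
optfn(test1)  # now returns a par value close to the Excel Solver result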

No sign change found error in R but not Excel

I am just trying to understand why, when I find the root of the following equation in Excel, I get a value, whereas in R I get the "no sign change found" error:
(-exp(-i*x)) - x + 1
i = 1 in this case.
I'm plotting a graph where the value of i runs over 1:5. I've done this manually in Excel and got a value of 0.003 when i = 1; here is the graph for all values of i: image 1
When I try to find the root for i = 1 in R, though, I get the error.
This is the code I am using to find the root:
func1 <- function(x) {
  (-exp(-1 * x)) - x + 1
}
root <- uniroot(func1, lower = 0.5, upper = 1, extendInt = "yes")
print(root)
print(root$root)
Plotting the equation when i = 1 gives the following curve: image 2
Looking at the curve, it doesn't seem like f(x) crosses 0, which explains the error; however, I do get a value in Excel.
Any help on this would be really appreciated.
Thanks
This is the best I can offer. It uses a method of derivatives I found at http://rpubs.com/wkmor1/simple-derivatives-in-r, and it will let you get the root via Newton's method.
options(scipen = 999)
f <- function(x) (-exp(-x) - x + 1)
g <- function(x) { }
body(g) <- D(body(f), "x")  # g is now the derivative of f
x <- 19
y <- vector()
y[1] <- x
for (i in 2:500) {
  y <- c(y, y[i - 1] - (f(y[i - 1]) / g(y[i - 1])))
  if (y[i] == y[i - 1]) break
}
y
The output looks like this:
> y
[1] 19.000000000000000 0.999999893546867 0.418023257075328 0.194491909332762
[5] 0.094095681658666 0.046310116577025 0.022976345768161 0.011444180565743
[9] 0.005711176200954 0.002852869974152 0.001425756748278 0.000712708975595
[13] 0.000356312158327 0.000178145499021 0.000089070105497 0.000044534390909
[17] 0.000022267031356 0.000011133470771 0.000005566723944 0.000002783342584
[21] 0.000001391644148 0.000000695821730 0.000000347990248 0.000000173795172
[25] 0.000000086916841 0.000000043487300 0.000000023063442 0.000000013435885
[29] -0.000000003090351 -0.000000003090351
I hope this helps.
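For what it's worth, the slow convergence above is expected: with i = 1 the function f(x) = 1 - x - exp(-x) satisfies f(0) = 0 and f'(0) = 0, so x = 0 is a double root, and because exp(-x) >= 1 - x everywhere, f never changes sign, which is exactly why uniroot complains. If all you need is the root, one hedged alternative is to minimise f(x)^2 instead of bracketing a sign change:
f <- function(x) -exp(-x) - x + 1
# f touches zero only at x = 0 (a double root), so look for the minimum of f^2
# rather than a sign change.
optimize(function(x) f(x)^2, interval = c(-1, 1))$minimum
# approximately 0, in line with the Newton iteration above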

Using for loop variable to access element in array yielding NA in R

I'm using a nested for loop to create a greedy algorithm in R.
z = 0
for (j in 1:length(t))
  for (i in 1:(length(t) - j))
    if ((t[j + i] - t[j]) >= 30) {
      z <- c(z, j + i - 1)
      j <- j + i - 1
      break
    }
z
Where t is a vector such as:
[1] 12.01485 26.94091 33.32458 49.46742 65.07425 76.05700
[7] 87.11043 100.64116 111.72977 125.72649 139.46460 153.67292
[13] 171.46393 184.54244 201.20850 214.05093 224.16196 237.12485
[19] 251.51753 258.45865 273.95466 285.42704 299.01869 312.35587
[25] 326.26289 339.78724 353.81854 363.15847 378.89307 390.66134
[31] 402.22007 412.86049 424.23181 438.50462 448.88005 462.59917
[37] 473.65289 487.20678 499.80053 509.14141 526.03873 540.17209
[43] 550.69941 565.74602 576.06882 589.07297 598.53208 614.20677
[49] 627.44605 648.08346 665.49614 681.46445 691.01806 704.05762
[55] 714.09172 732.04124 745.90960 758.52628 769.80519 779.41537
[61] 788.35732 805.78547 818.75262 832.71196 844.97859 856.08608
[67] 865.72998 875.55945 887.20862 900.00000
The goal for the function is to find the indexes whose differences are as close to 30 as possible and save them in z.
For example, with the vector t provided, I would expect z to be [0, 2, 4, 6, 8, 10,...70]
The functionality is not my concern right now, as I am running into the error:
Error in if ((t[j + i] - t[j]) >= 30) { :
missing value where TRUE/FALSE needed
I'm new to R so I know I'm not utilizing the vectorization that R is known for. I simply want to have 'j' and 'i' as "counter variables" that I can use to access specific elements of vector t, but for a reason unknown to me, the if statement is not yielding a T/F value.
Any suggestions?
I know you want to learn how to use a for loop, but it is difficult to help you because you did not provide a reproducible example. On the other hand, many functions in R are vectorized, meaning that you can often avoid a for loop and achieve the same task more efficiently.
Based on the description in your post, "The goal for the function is to find the indexes whose differences are as close to 30 as possible and save them in z.", I provide the following example that addresses your question without a for loop.
z <- which.min(abs(diff(vec) - 30))
z
# [1] 49
vec[c(z, z + 1)]
# [1] 627.4461 648.0835
Based on the data you provided, the index where the difference between consecutive numbers is closest to 30 is 49. The numbers are 627.4461 and 648.0835.
Data
vec <- c("12.01485 26.94091 33.32458 49.46742 65.07425 76.05700 87.11043
100.64116 111.72977 125.72649 139.46460 153.67292 171.46393
184.54244 201.20850 214.05093 224.16196 237.12485 251.51753
258.45865 273.95466 285.42704 299.01869 312.35587 326.26289
339.78724 353.81854 363.15847 378.89307 390.66134 402.22007
412.86049 424.23181 438.50462 448.88005 462.59917 473.65289
487.20678 499.80053 509.14141 526.03873 540.17209 550.69941
565.74602 576.06882 589.07297 598.53208 614.20677 627.44605
648.08346 665.49614 681.46445 691.01806 704.05762 714.09172
732.04124 745.90960 758.52628 769.80519 779.41537 788.35732
805.78547 818.75262 832.71196 844.97859 856.08608 865.72998
875.55945 887.20862 900.00000")
vec <- strsplit(vec, split = "[[:space:]]+")[[1]]  # split on any whitespace, including line breaks
vec <- as.numeric(grep("[0-9]+\\.[0-9]+", vec, value = TRUE))
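If you do want to keep the loop-based greedy idea, note two things about the original code: when j reaches length(t), the inner range 1:(length(t) - j) becomes 1:0, so t[j + i] indexes past the end of the vector and produces the NA that breaks the if condition; and reassigning j inside the body of a for loop has no effect in R, because the loop variable is reset at the next iteration. A while loop sidesteps both issues. A rough sketch of the intent (a hypothetical helper, run on the vec defined above):
# From each anchor, record the last index still within `gap` of the anchor,
# then restart from that index.
greedy_breaks <- function(t, gap = 30) {
  z <- 0                                  # the question seeds z with 0
  j <- 1
  while (j < length(t)) {
    crossed <- which(t[(j + 1):length(t)] - t[j] >= gap)
    if (length(crossed) == 0) break
    j <- max(j + 1, j + crossed[1] - 1)   # guard against a zero-length step
    z <- c(z, j)
  }
  z
}
greedy_breaks(vec)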

L2 distance between functional data (smoothed curves)

I have used smoothing to create two "functions" fd4 and fd6.
fit6 <- smooth.basis(tid6, zbegfor, fdParobj2)
fd6 <- fit6$fd
I want to measure the L2 distance between them on the interval [0,1], but I haven't been able to find an appropriate way.
||f − g||_2 = sqrt( int_0^1 |f(x) − g(x)|^2 dx )
The best bet has been this one: How to calculate functional L_2 norm using R, but when I use fd6 instead of f <- function(x) x^2, I get the following message:
"Error in fac - fdmat : non-conformable arrays".
I've spent hours trying to find a solution. Please help me!
Now with reproducible code:
library(fda)
# Smoothing of movement pattern without obstacle rescaled to the interval [0,1]
without <- c(22.5050173512478, 22.5038665040295, 22.5171851824298, 22.5368096190746,
22.5770229184757, 22.6709727229898, 22.8195669635573, 23.0285400460222,
23.3240853426905, 23.6895323912605, 24.0905709304813, 24.5674870961964,
25.129085512519, 25.7433521858875, 26.4096817521118, 27.1338935155912,
27.906416101033, 28.7207273157549, 29.5431756517467, 30.3697951466496,
31.2214907341765, 32.0625307132683, 32.8786845916855, 33.671550678219,
34.4449992914392, 35.1852293010227, 35.8866367048324, 36.5650863548079,
37.1776116180247, 37.7706354957587, 38.3082855431959, 38.8044130844639,
39.2471137254193, 39.6193031585418, 39.9685683244076, 40.2345560551869,
40.4394442661545, 40.5712407258558, 40.6905311089523, 40.712419802203,
40.6704560575084, 40.5583379372846, 40.3965425630546, 40.1443139907057,
39.8421899334408, 39.4671160834355, 39.018733225651, 38.5381390971577,
38.035680135599, 37.4625783280288, 36.8649362406917, 36.2320264206665,
35.5599736527209, 34.8983871226943, 34.2058073957721, 33.4893682831911,
32.7568501019309, 32.0241649500974, 31.3036406455137, 30.587636320768,
29.8962657607091, 29.2297665999702, 28.6003939337949, 28.0003531206639,
27.433551463149, 26.9088532545635, 26.4265682839796, 25.974193299003,
25.5553146923473, 25.1701249455904, 24.8107813804098, 24.4776168601955,
24.167582682288, 23.8726502760669, 23.589703789663, 23.3222235336882,
23.0616248799115, 22.8185342685607, 22.6767541125512, 22.6567795841271,
22.6488510112824, 22.6436058079441, 22.6391304188382)
timewithout <- (1:length(without))/length(without) # For scaling
splineBasis = create.bspline.basis(c(0,1), nbasis=25, norder=6) # The basis for smoothing
basis = fdPar(fdobj=splineBasis, Lfdobj=2, lambda=0.00001)
fitwithout <- smooth.basis(timewithout, without, basis) # Smoothing
fdwithout <- fitwithout$fd
# Same but movement is over an obstacle
with <- c(22.4731637093167, 22.4655561889073, 22.4853719755102, 22.4989400065304,
22.5495656349031, 22.666945409755, 22.8368941117498, 23.0846080078369,
23.4160560011242, 23.8285634914224, 24.2923085321078, 24.8297004047422,
25.4884540279408, 26.2107053559, 27.0614232848574, 27.9078055119721,
28.8449720096674, 29.8989669834473, 30.996962022701, 32.1343108758062,
33.3286403418359, 34.6364870430171, 35.9105342483246, 37.1883582665643,
38.467212668323, 39.7381525466373, 41.0395064969214, 42.3095531191294,
43.5708069740233, 44.7881178787717, 45.9965529977777, 47.1643807808923,
48.284786275036, 49.3593991064962, 50.3863035442644, 51.3535489662494,
52.2739716491521, 53.1338828493223, 53.9521101656512, 54.7037562884229,
55.3593092084143, 55.9567618011946, 56.4768579145271, 56.9251919073806,
57.2971965985674, 57.5937987523734, 57.8158626068961, 57.9554856023804,
58.009777126789, 57.9863251605612, 57.8932199088797, 57.6988126618694,
57.4350394069443, 57.1112025796509, 56.7580579506751, 56.2680669960935,
55.6963799946038, 55.0574070566765, 54.3592140352073, 53.6072275005723,
52.7876353306759, 51.9172334605074, 50.9879178368431, 49.9953932631072,
48.9460707853802, 47.8511977258834, 46.6827266395278, 45.4635999409637,
44.2633368255294, 43.0386729762103, 41.7880095105045, 40.4834298069985,
39.1610223705633, 37.9241872458281, 36.7158342529737, 35.5408830466013,
34.4070964101159, 33.307156473109, 32.2514661493348, 31.2475129673168,
30.2990631096187, 29.4096423238141, 28.590173995037, 27.8437368908309,
27.17493959411, 26.5779670740351, 26.0377946174036, 25.5731202027558,
25.1761397934058, 24.8319659155494, 24.5479180062239, 24.2940808334792,
24.09388897537, 23.934861348149, 23.7999923744404, 23.6877461628934,
23.5982309560843, 23.5207597985246, 23.4354446383638, 23.3604065265148,
23.2819126915765, 23.1725048152396, 23.0637455648184, 22.9426779696074,
22.8079176617495, 22.69360227086, 22.6622165457034, 22.6671302753094,
22.66828206305, 22.6703162730529, 22.6715781657376)
timewith <- (1:length(with))/length(with)
fitwith <- smooth.basis(timewith, with, basis) # Smoothing
fdwith <- fitwith$fd
# Plots for understanding
plot(fdwith, col=2) # Smoothed curve for movement over obstacle
plot(fdwithout, col=2, add = TRUE) # Same but no obstacle
# I have to find the L2-distance between these curves
First, one can take advantage of the fact that arithmetic operations work directly on fd objects: fdwith - fdwithout. Second, maybe there is a better way to extract values from fd objects at specific points, but this also works: predict(newdata = 0.5, fdwith - fdwithout). So,
sqrt(integrate(function(x) predict(newdata = x, fdwith - fdwithout)^2, lower = 0, upper = 1)$value)
# [1] 9.592434
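As a cross-check, the fda package also provides inprod(), which integrates the product of two functional data objects over their range (here [0, 1]); the squared L2 norm of the difference is then just the inner product of the difference with itself. A sketch using the same fdwith and fdwithout objects:
# Squared L2 norm via the fda inner product, then take the square root.
fd_diff <- fdwith - fdwithout
sqrt(inprod(fd_diff, fd_diff))
# should agree closely with the integrate() result above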
