Define a variable in a for-loop in R

I have a sequence of numbers ranging from 65 to 60. In each step we decrease by 1/12:
[1] 65.00000 64.91667 64.83333 64.75000 64.66667 64.58333 64.50000 64.41667 64.33333 64.25000 64.16667
[12] 64.08333 64.00000 63.91667 63.83333 63.75000 63.66667 63.58333 63.50000 63.41667 63.33333 63.25000
[23] 63.16667 63.08333 63.00000 62.91667 62.83333 62.75000 62.66667 62.58333 62.50000 62.41667 62.33333
[34] 62.25000 62.16667 62.08333 62.00000 61.91667 61.83333 61.75000 61.66667 61.58333 61.50000 61.41667
[45] 61.33333 61.25000 61.16667 61.08333 61.00000 60.91667 60.83333 60.75000 60.66667 60.58333 60.50000
[56] 60.41667 60.33333 60.25000 60.16667 60.08333 60.00000
For every step above we will compute a function.
So a for-loop is created with the following code:
a <- seq(65, 60, (-1/12))
v <- a*0
w <- a*0
mu10<-0.0067
mu12<-0.5
mu20<-1.03*mu10
mu21<-3
r<-log(1+0.01)
b1<-(-1500)
b2<-25000*0.15*12
c10<-45000*10
c20<-45000*10
h<-1/12
v[1] <- h*(b1+mu10*c10+mu12)
w[1] <- h*(b2+mu20*c20+mu21)
for (t in 2:length(a)) {
  v[t] <- v[t-1] + h*(-r*v[t-1] + b1 + mu10*(c10-v[t-1]) + mu12*(w[t-1]-v[t-1]))
  w[t] <- w[t-1] + h*(-r*w[t-1] + b2 + mu20*(c20-w[t-1]) + mu21*(v[t-1]-w[t-1]))
}
What we wanted to do here was to calculate the value of the function at age 60 by stepping it back in time, so simply to compute v and w at age 60 (the last elements of the vectors).
I have no problem with this code; it does exactly what I want it to do.
The problem is the variable mu10, which I defined above as mu10 <- 0.0067.
It should, however, be defined as: mu10(t) = alpha + beta*exp(gamma*t),
where t should take all the values in the sequence 65.00000, 64.91667, 64.83333, 64.75000, ..., 60.00000.
I want to replace my mu10 with mu10(t), the formula I defined here; mu10(t) shall then replace mu10 in the for-loop.
I want them to look like this:
v[t] <- v[t-1]+h*(-r*v[t-1]+b1+mu10[t]*(c10-v[t-1])+mu12*(w[t-1]-v[t-1]))
The t on the left-hand side should be the same as the t in mu10(t); perhaps obvious, but I just wanted to make that clear.
I have tried a few things, but it does not feel right. I have defined the parameters as:
gamma<-0.044
alpha<-(-0.0073)
beta<-0.0009
I simply need help calculating mu10(t) in a smooth manner so that it can be included in the for-loop.
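One way to do this (a sketch, and only one possible discretisation, not necessarily the intended one): since a already holds the ages, mu10 can be precomputed as a vector of the same length as a and indexed by the loop variable, so that mu10[t] is the rate evaluated at age a[t]:
gamma <- 0.044
alpha <- -0.0073
beta  <- 0.0009
a     <- seq(65, 60, by = -1/12)
mu10  <- alpha + beta*exp(gamma*a)   # vector: mu10[t] is mu10 evaluated at age a[t]
# then, inside the loop:
# v[t] <- v[t-1] + h*(-r*v[t-1] + b1 + mu10[t]*(c10-v[t-1]) + mu12*(w[t-1]-v[t-1]))
Whether mu10[t] or mu10[t-1] is the right index depends on which end of the time step the rate should be taken at; that is a modelling choice rather than an R question.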

Partial Variances at each row of a Matrix

I generated a series of 10,000 random numbers through:
rand_x = rf(10000, 3, 5)
Now I want to produce another series that contains the variance at each point, i.e. the column should look like this:
[variance(first two numbers)]
[variance(first three numbers)]
[variance(first four numbers)]
[variance(first five numbers)]
.
.
.
.
[variance of 10,000 numbers]
I have written the code as:
c( var(rand_x[1:1]) : var(rand_x[1:10000]) )
but I am only getting 157 elements in the column rather than 10,000. Can someone guide me on what I am doing wrong here?
An option is to loop over the indices from 2 to 10000 with sapply, extract the elements of rand_x from position 1 up to the looped index, apply var, and return a vector of variances:
out <- sapply(2:10000, function(i) var(rand_x[1:i]))
Your code creates a sequence incrementing by one, with the variance of the first two elements as the start value and the variance of the whole vector as the limit:
var(rand_x[1:2]):var(rand_x[1:n])
# [1] 0.9026262 1.9026262 2.9026262
## compare:
.9026262:3.33433
# [1] 0.9026262 1.9026262 2.9026262
What you want is to loop over the vector indices, using seq_along, to get the variances of sequences that grow by one element at a time. To see what needs to be done, here is first a (rather slow) for loop:
vars <- numeric() ## initialize numeric vector
for (i in seq_along(rand_x)) {
  vars[i] <- var(rand_x[1:i])
}
vars
vars
# [1] NA 0.9026262 1.4786540 1.2771584 1.7877717 1.6095619
# [7] 1.4483273 1.5653797 1.8121144 1.6192175 1.4821020 3.5005254
# [13] 3.3771453 3.1723564 2.9464537 2.7620001 2.7086317 2.5757641
# [19] 2.4330738 2.4073546 2.4242747 2.3149455 2.3192964 2.2544765
# [25] 3.1333738 3.0343781 3.0354998 2.9230927 2.8226541 2.7258979
# [31] 2.6775278 2.6651541 2.5995346 3.1333880 3.0487177 3.0392603
# [37] 3.0483917 4.0446074 4.0463367 4.0465158 3.9473870 3.8537925
# [43] 3.8461463 3.7848464 3.7505158 3.7048694 3.6953796 3.6605357
# [49] 3.6720684 3.6580296
The first element has to be NA because the variance of a single element is not defined (the denominator n - 1 is zero).
However, the for loop is slow. Since R is vectorized, we would rather use a function from the *apply family, e.g. vapply, which is much faster. In vapply we initialize with numeric(1) (or just 0) because the result of each iteration is of length one.
vars <- vapply(seq_along(rand_x), function(i) var(rand_x[1:i]), numeric(1))
vars
# [1] NA 0.9026262 1.4786540 1.2771584 1.7877717 1.6095619
# [7] 1.4483273 1.5653797 1.8121144 1.6192175 1.4821020 3.5005254
# [13] 3.3771453 3.1723564 2.9464537 2.7620001 2.7086317 2.5757641
# [19] 2.4330738 2.4073546 2.4242747 2.3149455 2.3192964 2.2544765
# [25] 3.1333738 3.0343781 3.0354998 2.9230927 2.8226541 2.7258979
# [31] 2.6775278 2.6651541 2.5995346 3.1333880 3.0487177 3.0392603
# [37] 3.0483917 4.0446074 4.0463367 4.0465158 3.9473870 3.8537925
# [43] 3.8461463 3.7848464 3.7505158 3.7048694 3.6953796 3.6605357
# [49] 3.6720684 3.6580296
Data:
n <- 50
set.seed(42)
rand_x <- rf(n, 3, 5)
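Both sapply and vapply recompute var from scratch at every index, so the whole computation is O(n^2). If all 10,000 running variances need to be fast, a one-pass update (Welford's algorithm) is an alternative; the following is a sketch, not part of the original answers:
running_var <- function(x) {
  n <- length(x)
  vars <- rep(NA_real_, n)   # variance of a single value stays NA
  m  <- x[1]                 # running mean
  m2 <- 0                    # running sum of squared deviations
  for (i in 2:n) {
    delta <- x[i] - m
    m  <- m + delta / i
    m2 <- m2 + delta * (x[i] - m)
    vars[i] <- m2 / (i - 1)
  }
  vars
}
all.equal(running_var(rand_x), vars)   # should be TRUE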

Changing cell values from a raster

I have raster cell values with 5 digits, but I need to get rid of the first one; for instance, if the cell value is "31345" I need to make it "1345".
I am trying to use the calc() function from the raster package to do that, by subtracting different numbers based on the raster cell value (since they are all numbers), like this:
correct.grid <- calc(grid, fun=function(x){ifelse(x < 40000, x-30000,
ifelse(x > 40000 & < 50000, x-40000,
ifelse(x > 50000 & < 60000, x-50000,
ifelse(x > 60000, x-60000, 0)))))})
I guess this is probably a terrible approach to the problem (I am not very good at programming). Still, I ran into an error, I'm guessing because of how I am using ">" and "<" inside the function. Any ideas on how to make these "ifelse"s work, or maybe a smarter approach to the problem?
This is a piece of the unique values in my data if it helps:
> unique(grid)
[1] 30057 30084 30207 30230 30235 30237 30280 30283 30311 30319 30320 30326 30350 30351 30352 30360
[17] 30384 30396 30415 30420 30447 30449 30452 30456 30478 30481 30497 30507 30522 30560 30562 30605
[33] 30606 30612 30638 30639 30645 30654 30657 30658 30662 30665 30678 30682 30701 30707 30714 30736
[49] 30740 30743 30749 30750 30823 30824 30841 30852 30862 30892 30896 30898 30915 30920 30928 30934
[65] 30956 30962 30978 30986 30998 31021 31022 31031 31042 31053 31055 31081 31085 31092 31097 31099
[81] 31114 31115 31122 31126 31129 31130 31131 31141 31157 31168 31171 40019 40026 40075 40197 40217
[97] 50342 50360 50379 50496 50720 50725 50732 50766 50798 50837 51073 51092 51397 53096 53110 53117
[113] 53118 53120 53124 60003 60005 60041 60485 60516 60655 60661 60825 61039 61174 61185 61187 61210
[129] 61221 61224 61227 61259 61287 61289 61290 61295
If you just want to remove the leftmost digit of each value, how about this:
First, let's load a raster object to work with:
library(raster)
# Load a raster object to work with
grid = system.file("external/test.grd", package="raster")
grid = raster(grid)
# Set up values to be whole numbers
values(grid) = round(values(grid)*100)
Now, remove the leftmost digit from every value in the raster:
values(grid) = as.numeric(substr(values(grid), 2, nchar(values(grid))))
Note that a value with one or more zeros after the leftmost digit will get shortened by more than one digit. For example, 60661 will become 661 and 30001 will become 1.
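As an aside on the original attempt: the syntax error comes from writing x > 40000 & < 50000; each comparison needs its own left-hand side, i.e. x > 40000 & x < 50000. And since the values here are all five-digit integers, modular arithmetic removes the leading digit without any string conversion (a sketch, untested on the original raster; the caveat about values like 30001 becoming 1 applies here too):
values(grid) <- values(grid) %% 10000   # 31345 -> 1345, 60661 -> 661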

Large numbers multiplication in R

I want to know how I can multiply very large values in R.
R returns Inf!
For example:
6.350218e+277*2.218789e+215
[1] Inf
Let me clarify the problem a bit more. Consider the following code and the results of the outFunc function:
library(hypergeo)
poch <-function(a,b) gamma(a+b)/gamma(a)
n<-c(37 , 41 , 4 , 9 , 12 , 13 , 2 , 5 , 23 , 73 , 129 , 22 , 121 )
v<-c(90.2, 199.3, 61, 38, 176.3, 293.6, 318.6, 328.7, 328.1, 313.3, 142.4, 92.9, 95.5)
DF<-data.frame(n,v)
outFunc<-function(k,w,r,lam,a,b) {
((((w*lam)^k) * poch(r,k) * poch(a,b) ) * hypergeo(r+k,a+k,a+b+k,-(w*lam)) )/(poch(a+k,b)*factorial(k))
}
and the function returns:
outFunc(DF$n,DF$v,0.2, 1, 3, 1)
[1] 0.002911330+ 0i 0.003047594+ 0i 0.029886646+ 0i 0.013560599+ 0i 0.010160073+ 0i
[6] 0.008928524+ 0i 0.040165795+ 0i 0.019402318+ 0i 0.005336008+ 0i 0.001689114+ 0i
[11] Inf+NaNi 0.005577985+ 0i Inf+NaNi
As can be seen above, outFunc returns Inf+NaNi for the n values 129 and 121.
I checked the code part by part and found that (w*lam)^k * poch(r,k) evaluates to Inf for these n values. I also checked my code against equivalent code in Mathematica, where everything is OK:
in: out[indata[[All, 1]], indata[[All, 2]], 0.2, 1, 3, 1]
out: {0.00291133, 0.00304759, 0.0298866, 0.0135606, 0.0101601, 0.00892852, \
0.0401658, 0.0194023, 0.00533601, 0.00168911, 0.000506457, \
0.00557798, 0.000365445}
Now please let me know how this issue can be solved as simply as it is in Mathematica. Regards.
One option available in base R, which does not require a special library, is to convert the two numbers to logarithms in a common base; adding those together gives the exponent of the result:
> x <- log(6.350218e+277, 10)
> x
[1] 277.8028
> y <- log(2.218789e+215, 10)
> y
[1] 215.3461
> x + y
[1] 493.1489
Since 10^x * 10^y = 10^(x+y), your final answer is 10^493.1489
Note that this solution does not let you actually store numbers that R would normally treat as Inf. Hence, in this example, you still cannot compute 10^493.1489 directly, but you can tease out what the product would be.
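If a human-readable value is wanted, the summed logarithm can be split into a mantissa and a decimal exponent (a small sketch continuing from the x and y above):
e <- x + y            # 493.1489, the base-10 log of the product
10^(e - floor(e))     # mantissa in [1, 10): about 1.409
floor(e)              # decimal exponent: 493
# so the product is roughly 1.409e+493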
First, I'd recommend two useful reads: logarithms and how floating-point values are handled by a computer. These are pertinent because, with some "tricks", you can handle much bigger values than you might think. For instance, your definition of the poch function is problematic: the fraction can be simplified a lot, but a computer evaluates the numerator first, and if it overflows the result is useless. That's why R provides, besides gamma, the lgamma function: it calculates the logarithm of gamma and can handle much bigger arguments. So we calculate the log of each factor in your function and then use exp to restore the intended values. Try this:
# redefine poch to return the logarithm of the Pochhammer symbol
poch <- function(a,b) lgamma(a+b) - lgamma(a)
# redefine outFunc to work on the log scale and exponentiate at the end
outFunc <- function(k,w,r,lam,a,b) {
  exp((k*(log(w)+log(lam)) + poch(r,k) + poch(a,b)) +
      log(hypergeo(r+k,a+k,a+b+k,-(w*lam))) - poch(a+k,b) - lgamma(k+1))
}
#Now we go
outFunc(DF$n,DF$v,0.2, 1, 3, 1)
#[1] 0.0029113299+0i 0.0030475939+0i 0.0298866458+0i 0.0135605995+0i
#[5] 0.0101600732+0i 0.0089285243+0i 0.0401657947+0i 0.0194023182+0i
#[9] 0.0053360084+0i 0.0016891144+0i 0.0005064566+0i 0.0055779850+0i
#[13] 0.0003654449+0i
> library(gmp)
> x<- pow.bigz(6.350218,277)
> y<- pow.bigz(2.218789,215)
> x*y
Big Integer ('bigz') :
[1] 18592826814872791919942226542714580401488894909642693257011204682802122918146288728149155739011270579948954646130492024596687919148494136290260248656581476275790189359808616520170359345612068099238508437236172770752199936303947098513476300142414338199993261924467166943683593371648
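One caution on the gmp example above: pow.bigz(6.350218, 277) computes a 277th power (with the base coerced to an integer), not 6.350218 * 10^277, so the printed big integer is not the original product. For arbitrary-precision floating point, something like the Rmpfr package is closer to what is needed; a sketch, untested here:
library(Rmpfr)   # arbitrary-precision floating point via MPFR
x <- mpfr(6.350218, precBits = 120) * mpfr(10, precBits = 120)^277
y <- mpfr(2.218789, precBits = 120) * mpfr(10, precBits = 120)^215
x * y   # about 1.409e+493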

Scale a series between two points

How do I scale a series such that the first number in the series is 0 and last number is 1. I looked into 'approx', 'scale' but they do not achieve this objective.
# generate series from exponential distr
s = sort(rexp(100))
# scale/interpolate 's' such that it starts at 0 and ends at 1?
# approx(s)
# scale(s)
The scales package has a function that will do this for you: rescale.
library("scales")
rescale(s)
By default, this scales the given range of s onto 0 to 1, but either or both of those can be adjusted. For example, if you wanted it scaled from 0 to 10,
rescale(s, to=c(0,10))
or if you wanted the largest value of s scaled to 1, but 0 (instead of the smallest value of s) scaled to 0, you could use
rescale(s, from=c(0, max(s)))
It's straightforward to create a small function to do this using basic arithmetic:
s = sort(rexp(100))
range01 <- function(x){(x-min(x))/(max(x)-min(x))}
range01(s)
[1] 0.000000000 0.003338782 0.007572326 0.012192201 0.016055006 0.017161145
[7] 0.019949532 0.023839810 0.024421602 0.027197168 0.029889484 0.033039408
[13] 0.033783376 0.038051265 0.045183382 0.049560233 0.056941611 0.057552543
[19] 0.062674982 0.066001242 0.066420884 0.067689067 0.069247825 0.069432174
[25] 0.070136067 0.076340460 0.078709590 0.080393512 0.085591881 0.087540132
[31] 0.090517295 0.091026499 0.091251213 0.099218526 0.103236344 0.105724733
[37] 0.107495340 0.113332392 0.116103438 0.124050331 0.125596034 0.126599323
[43] 0.127154661 0.133392300 0.134258532 0.138253452 0.141933433 0.146748798
[49] 0.147490227 0.149960293 0.153126478 0.154275371 0.167701855 0.170160948
[55] 0.180313542 0.181834891 0.182554291 0.189188137 0.193807559 0.195903010
[61] 0.208902645 0.211308713 0.232942314 0.236135220 0.251950116 0.260816843
[67] 0.284090255 0.284150541 0.288498370 0.295515143 0.299408623 0.301264703
[73] 0.306817872 0.307853369 0.324882091 0.353241217 0.366800517 0.389474449
[79] 0.398838576 0.404266315 0.408936260 0.409198619 0.415165553 0.433960390
[85] 0.440690262 0.458692639 0.464027428 0.474214070 0.517224262 0.538532221
[91] 0.544911543 0.559945121 0.585390414 0.647030109 0.694095422 0.708385079
[97] 0.736486707 0.787250428 0.870874773 1.000000000
Alternatively:
scale(s, center = min(s), scale = diff(range(s)))
(untested)
This has the feature that it attaches the original centering and scaling factors to the output as attributes, so they can be retrieved and used to un-scale the data later (if desired). It has the oddity that it always returns the result as a (column-wise) matrix, even if it was passed a vector; you can use drop(scale(...)) if you want a vector instead of a matrix. This usually doesn't matter, but the matrix format can occasionally cause trouble downstream, in my experience more often with tibbles/in the tidyverse, although I haven't stopped to examine exactly what is going wrong in those cases.
This should do it:
reshape::rescaler.default(s, type = "range")
EDIT
I was curious about the performance of the two methods
> system.time(replicate(100, range01(s)))
user system elapsed
0.56 0.12 0.69
> system.time(replicate(100, reshape::rescaler.default(s, type = "range")))
user system elapsed
0.53 0.18 0.70
Extracting the raw code from reshape::rescaler.default
range02 <- function(x) {
(x - min(x, na.rm=TRUE)) / diff(range(x, na.rm=TRUE))
}
> system.time(replicate(100, range02(s)))
user system elapsed
0.56 0.12 0.68
This yields a similar result.
You can also make use of the caret package, which provides the preProcess function; it is as simple as this:
preProcValues <- preProcess(yourData, method = "range")
dataScaled <- predict(preProcValues, yourData)
More details are in the package help.
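For a single numeric vector like s, a sketch of that route (preProcess expects a data frame, so the vector is wrapped first; this example is not from the original answer):
library(caret)
df <- data.frame(s = s)
pp <- preProcess(df, method = "range")   # "range" rescales to [0, 1] by default
scaled <- predict(pp, df)$s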
I created the following function in R:
ReScale <- function(x, first, last) {(last - first)/(max(x) - min(x)) * (x - min(x)) + first}
Here, first is the start point and last is the end point.
