Weird R behavior with indexing function arrays - r

I'm having some unexpected behaviour in R with function arrays, and I've reduced the problem to a minimal working example:
theory = c(function(p) p)
i = 1
posterior = function(p) theory[[i]](p)
i = 2
posterior(0)
Which gives me an error saying the subscript i is out of bounds.
So I guess that i is somehow being used as a "free" variable in the definition of posterior so it gets updated when I redefine i. Oddly enough, this works:
theory = c(function(p) p)
i = 1
posterior = theory[[i]]
i = 2
posterior(0)
How can I avoid this? Note that not redefining i is not an option, as this stuff is going in a for loop where i is the index.

The reason that this doesn't work is that you redefine i = 2, and then you are out of bounds of theory, which contains a single element. The function is evaluated lazily, so that it only executes theory[[i]] when the function is called, at which point i equals 2.
You can read some more about lazy evaluation here.

Related

How can I optimize the expected value of a function in R?

I have derived a survival function for a system of components (ignore the details of how this system is setup) and I am trying to maximize its expected, or more specifically, maximizing the expected value of the function:
surv_func = function(x,mu) = {(exp(-(x/(mu))^(1/3))*((1-exp(-(4/3)*x^(3/2)))+exp(-(-(4/3)*x^(3/2)))))*exp(-(x/(3-mu))^(1/3))}
and I am supposed (since the pdf including my tasks gives a hint about it) to use the function
optimize()
and the expected value for a function can be computed with
# Computes expected value of a the function "function"
E <- integrate(function, 0, Inf)
but my function depends on x and mu. The expected value could (obviously) be computed if the integral had no mu but instead only depended on x. For those interested, the mu comes from the fact that one of the components has a Weibull-distribution with parameters (1/3,mu) and the 3-mu comes from that has a Weibull-distribution with parameters (1/3,lambda). In the task there is a constraint mu + lambda = 3, so I tought substituting the lambda-parameter in the second Weibull-distribution with lambda = 3 - mu and trying to maximize this problem would yield not only mu, but also lambda.
If I try to, just for the sake of learing about R, compute the expected value using the code below (in the console window), it just gives me the following:
> E <- integrate(surv_func,0,Inf)
Error in (function (x, mu) : argument "mu" is missing, with no default
I am new to R and seem to be a little bit "slow" at learning. How can I approach this problem?

ksmooth function doesn't work with parameters via ellipsis

I am currently working with R due to a course at university, so I am still quite inexperienced.
We use R for exploratory data analysis. In a data analysis we are supposed to apply different regression models to the data and generate the same plots for each. Additionally, we are supposed to play a bit with the parameters for learning purposes. To avoid unattractive 10-20 times copy-pasting I wrote a function that shows the regression function and the parameters for it as an ellipsis (...). In this function I call the passed function with the ellipsis as parameter.
library("astsa")
data_glob <- globtemp
plot.data.and.reg <- function(data, reg.func, ...){
model <- reg.func(...)
par(mfrow = c(1, 2))
plot(data)
abline(model, col = "orange", lwd = 3)
qqnorm(data)
}
This works for the simple lm function, but unfortunately not for the ksmooth function.
When I pass this function I get the error message: "numeric y must be supplied. For density estimation use density()".
plot.data.and.reg(
data_g,
lm,
list(
formula = as.formula("data_glob ~ time(data_glob)"),
data = data_glob
)
)
plot.data.and.reg(
data_glob,
ksmooth,
list(
x = as.numeric(time(data_glob)),
y = as.numeric(data_glob),
kernel = "box",
bandwidth = 0.25
)
)
Thereupon I looked at the source code of ksmooth. It shows that this error message occurs because the check "missing(y)" fails. Apparently a problem occurs because I passed the parameters as an ellipsis and it doesn't seem to "unpack".
For simplicity, I wrote a dummy function to test if I can add this "unpack" myself.
test.wrapper <- function(func, ...){
func(...)
}
test <- function(x, y){
match.call()
if(missing(y))
print("Leider hatte ich Recht")
print(x)
print(y)
}
test.wrapper(test, list(x = 10, y = 20))
Unfortunately I have not found a solution yet.
From Python I know it so that as with kwargs a dictionary can be unpacked with the ** operator. Is there an equivalent in R? Or how to make sure in R that the parameters from the ellipsis are used correctly?
Since it worked with the lm function without errors I also looked again in their source code . Unfortunately, with my little experience in R, I can't see exactly where the essential difference is.
Overall, I would attribute the error to the fact that the ksmooth function is not yet designed for use with an ellipsis, but I am not sure. How would I need to adjust the ksmooth code to make it work with ...?
(For my Uni task, I will resort to the copy-paste (anti) pattern if in doubt. After searching for so long, I would still be interested in the solution and it may be useful in the future).
Thanks a lot for your help!
The closest equivalent of the */** splat in Python is the do.call function.
However, you don’t need this here. The actual issue is that you’re passing the extra arguments as a list rather than individually. Once you flatten the list, it works1:
plot.data.and.reg(
data_glob,
ksmooth,
x = as.numeric(time(data_glob)),
y = as.numeric(data_glob),
kernel = "box",
bandwidth = 0.25
)
I’m actually surprised that it works with a list for lm; that’s not intentional, it’s essentially an accident caused by how lm is currently implemented.
1 I say it “works” because there’s no error and it plots something, but with your example data there’s no visible regression line (abline is inappropriate for the output of ksmooth), and the smoothing parameters do nothing — the result is identical to the unsmoothed input.
To get this to work, use lines instead of abline. And as for the smoothing, for your example data a bandwidth of 10 works fine.

Basic questions about scilab

I am taking a numeric calculus class and we are not required to know any scilab programming except the very basic, which is taught through a booklet, since the class is mostly theoretical. I was reading the booklet and found this scilab code meant to find a root of a function through bissection method.
The problem is, I can't find a way to make it work. I tried to call it with bissecao(x,-1,1,0.1,40) however it didn't work.
The error I got was:
at line 3 of function bissecao ( E:\Downloads\bisseccao3.sce line 3 )
Invalid index.
As I highly doubt that the code itself isn't working, and I tried to search for anything I could spot that seemed wrong, to no avail, I guess I am probably calling it wrong, somehow.
The code is the following:
function p = bissecao(f, a, b, TOL, N)
i = 1
fa = f(a)
while (i <= N)
//iteraction of the bissection
p = a + (b-a)/2
fp = f(p)
//stop condition
if ((fp == 0) | ((b-a)/2 < TOL)) then
return p
end
//bissects the interval
i = i+1
if (fa * fp > 0) then
a = p
fa = fp
else
b = p
end
end
error ('Max number iter. exceded!')
endfunction
Where f is a function(I guess), a and b are the limits of the interval in which we will be iterating, TOL is the tolerance at which the program terminates close to a zero, and N is the maximum number of iteractions.
Any help on how to make this run is greatly appreciated.
Error in bissecao
The only error your bissecao function have is the call to return :
In a function return stops the execution of the function,
[x1,..,xn]=return(a1,..,an) stops the execution of the function and
put the local variables ai in calling environment under names xi.
So you should either call it without any argument (input our output) and the function will exit and return p.
Or you could call y1 = return(p) and the function will exit and p will be stored in y1.
It is better to use the non-arguments form return in functions to avoid changing values of variables in the parent/calling script/functions (possible side-effect).
The argument form is more useful when interactively debugging with pause:
In pause mode, it allows to return to lower level.
[x1,..,xn]=return(a1,..,an) returns to lower level and put the local
variables ai in calling environment under names xi.
Error in calling bissecao
The problem may come by your call: bissecao(x,-1,1,0.1,40) because you didn't defined x. Just fixing this by creating a function solves the problem:
function y=x(t)
y=t+0.3
enfunction
x0=bissecao(x,-1,1,0.1,40) // changed 'return p' to 'return'
disp(x0) // gives -0.3 as expected

Estimate parameters of Frechet distribution using mmedist or fitdist(with mme) error

I'm relatively new in R and I would appreciated if you could take a look at the following code. I'm trying to estimate the shape parameter of the Frechet distribution (or inverse weibull) using mmedist (I tried also the fitdist that calls for mmedist) but it seems that I get the following error :
Error in mmedist(data, distname, start = start, fix.arg = fix.arg, ...) :
the empirical moment function must be defined.
The code that I use is the below:
require(actuar)
library(fitdistrplus)
library(MASS)
#values
n=100
scale = 1
shape=3
# simulate a sample
data_fre = rinvweibull(n, shape, scale)
memp=minvweibull(c(1,2), shape=3, rate=1, scale=1)
# estimating the parameters
para_lm = mmedist(data_fre,"invweibull",start=c(shape=3,scale=1),order=c(1,2),memp = "memp")
Please note that I tried many times en-changing the code in order to see if my mistake was in syntax but I always get the same error.
I'm aware of the paradigm in the documentation. I've tried that as well but with no luck. Please note that in order for the method to work the order of the moment must be smaller than the shape parameter (i.e. shape).
The example is the following:
require(actuar)
#simulate a sample
x4 <- rpareto(1000, 6, 2)
#empirical raw moment
memp <- function(x, order)
ifelse(order == 1, mean(x), sum(x^order)/length(x))
#fit
mmedist(x4, "pareto", order=c(1, 2), memp="memp",
start=c(shape=10, scale=10), lower=1, upper=Inf)
Thank you in advance for any help.
You will need to make non-trivial changes to the source of mmedist -- I recommend that you copy out the code, and make your own function foo_mmedist.
The first change you need to make is on line 94 of mmedist:
if (!exists("memp", mode = "function"))
That line checks whether "memp" is a function that exists, as opposed to whether the argument that you have actually passed exists as a function.
if (!exists(as.character(expression(memp)), mode = "function"))
The second, as I have already noted, relates to the fact that the optim routine actually calls funobj which calls DIFF2, which calls (see line 112) the user-supplied memp function, minvweibull in your case with two arguments -- obs, which resolves to data and order, but since minvweibull does not take data as the first argument, this fails.
This is expected, as the help page tells you:
memp A function implementing empirical moments, raw or centered but
has to be consistent with distr argument. This function must have
two arguments : as a first one the numeric vector of the data and as a
second the order of the moment returned by the function.
How can you fix this? Pass the function moment from the moments package. Here is complete code (assuming that you have made the change above, and created a new function called foo_mmedist):
# values
n = 100
scale = 1
shape = 3
# simulate a sample
data_fre = rinvweibull(n, shape, scale)
# estimating the parameters
para_lm = foo_mmedist(data_fre, "invweibull",
start= c(shape=5,scale=2), order=c(1, 2), memp = moment)
You can check that optimization has occurred as expected:
> para_lm$estimate
shape scale
2.490816 1.004128
Note however, that this actually reduces to a crude way of doing overdetermined method of moments, and am not sure that this is theoretically appropriate.

Rstudio - Error in user-created function - Object not found

First thing's first; my skills in R are somewhat lacking, so there is a chance I may be using something incorrectly in the following. If I go wrong somewhere, please let me know.
I've been having a problem in Rstudio where I try to create 2 functions for formulae, then use nls() to create a model using those, with which I will make a plot. When I try to run the line for creating it, I get an error message saying an object is missing. It is always the last object in the function of the first "formula", in this case, 'p'.
I'll provide my code here then explain what I am trying to do for a little context;
DATA <- read.csv(file.choose(), as.is=T)
formula <- function(m, h, g, p){(2*m)/(m+(sqrt(m^2+1)))*p*g*(h^2/2)}
formula.2 <- function(P, V, g){P*V*g}
m = 0.85
p = 766.42
g = 9.81
P = 0.962
h = DATA$lithothick
V = DATA$Vol
fit.1 <- nls(formula (P, V, g) ~ formula(m, h, g, p), data = DATA)
If I run it how it is shown, I get the error;
Error in (2 * m)/(m + (sqrt(m^2 + 1))) * p : 'p' is missing
However it will show h if I rearrange the objects in the formula to (m,g,p,h)
Error in h^2 : 'h' is missing
Now, what I'm trying to do is this; I have a .csv file with 3 thicknesses (0.002, 0.004, 0.006 meters) and 3 volumes (10, 25, 50 milliliters). I am trying to see how the rates of strength and buoyancy increase (in relation to each other) as the thickness and volume for each object (respectively) increases. I was hoping to come out with a graph showing the upward trend for each property (strength and buoyancy), as I believe them to be unequal (one exponential the other linear). I hope that isn't more confusing than clarifying, but any pointers would be GREATLY appreciated.
You cannot overload functions this way in R, what you can do is provide optional arguments (which is a kind of overload) with syntax function(mandatory, optionnal="")
For what you are trying to do, you have to use formula.2 if you want to use the 3-arguments formula.
A workaround could be to use one function with one optionnal argument and check if this argument has been used. Something like :
formula = function(m, h, g, p="") {
if (is.numeric(p)) {
(2*m)/(m+(sqrt(m^2+1)))*p*g*(h^2/2)
} else {
m*h*g
}
}
This is ugly and a very bad way to do it (your variables do not really mean the same thing from one call to the other) but it works.

Resources