I am trying to use activation functions other than the pre-implemented "logistic" and "tanh" in the R package neuralnet. Specifically, I would like to use rectified linear units (ReLU) f(x) = max{x,0}. Please see my code below.
I believe I can use custom functions if defined by (for example)
custom <- function(x) {x*2}
but if I use max(x,0) instead of x*2, R tells me that 'max is not in the derivatives table', and the same happens for the '>' operator. So I am looking for a sensible workaround, as I would think numerical differentiation of max would not be an issue in this case.
nn <- neuralnet(
  as.formula(paste("X",paste(names(Z[,2:10]), collapse="+"),sep="~")),
  data=Z[,1:10], hidden=5, err.fct="sse",
  act.fct="logistic", rep=1,
  linear.output=TRUE)
Any ideas? I am a bit confused as I didn't think the neuralnet package would do analytical differentiation.
The internals of the neuralnet package will try to differentiate any function provided to act.fct. You can see the source code here.
At line 211 you will find the following code block:
if (is.function(act.fct)) {
  act.deriv.fct <- differentiate(act.fct)
  attr(act.fct, "type") <- "function"
}
The differentiate function is a more elaborate use of the deriv function, which you can also see in the source code above. Therefore, it is currently not possible to provide max(0,x) as act.fct. It would require an exception placed in the code to recognize the ReLU and know its derivative. It would be a great exercise to get the source code, add this in and submit it to the maintainers (but that may be a bit much).
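You can reproduce the underlying error directly with base R's deriv, which differentiate builds on (a minimal illustration; the exact call inside neuralnet differs):
# deriv() does symbolic differentiation from a fixed table of known
# functions, and max() is not one of them
deriv(~ max(0, x), "x")
# Error: Function 'max' is not in the derivatives table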
However, as a sensible workaround, you could use the softplus function, which is a smooth approximation of the ReLU. Your custom function would then look like this:
custom <- function(x) {log(1+exp(x))}
You can view this approximation in R as well:
softplus <- function(x) log(1+exp(x))
relu <- function(x) sapply(x, function(z) max(0,z))
x <- seq(from=-5, to=5, by=0.1)
library(ggplot2)
library(reshape2)
fits <- data.frame(x=x, softplus = softplus(x), relu = relu(x))
long <- melt(fits, id.vars="x")
ggplot(data=long, aes(x=x, y=value, group=variable, colour=variable))+
geom_line(size=1) +
ggtitle("ReLU & Softplus") +
theme(plot.title = element_text(size = 26)) +
theme(legend.title = element_blank()) +
theme(legend.text = element_text(size = 18))
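With that custom softplus defined, you could then pass it to neuralnet via act.fct. A sketch based on the call from your question (untested, all other arguments kept as you had them):
custom <- function(x) {log(1+exp(x))}
nn <- neuralnet(
  as.formula(paste("X",paste(names(Z[,2:10]), collapse="+"),sep="~")),
  data=Z[,1:10], hidden=5, err.fct="sse",
  act.fct=custom,   # softplus as a smooth stand-in for the ReLU
  rep=1, linear.output=TRUE)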
You can approximate the max function with a differentiable function, such as:
custom <- function(x) {x/(1+exp(-2*k*x))}
The constant k determines the accuracy of the approximation: the larger k is, the closer the function gets to max(x,0). Note that k has to be fixed beforehand (or hard-coded into the function), because act.fct only receives a single argument.
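A quick base-R sketch to see how k controls the approximation (the values k = 1 and k = 10 are only for illustration):
relu <- function(x) pmax(0, x)
approx_k <- function(x, k) x/(1+exp(-2*k*x))
x <- seq(-5, 5, by = 0.1)
# plot the ReLU against the approximation for two values of k
matplot(x, cbind(relu(x), approx_k(x, 1), approx_k(x, 10)),
        type = "l", lty = 1, ylab = "f(x)",
        main = "ReLU vs. sigmoid-weighted approximation")
legend("topleft", legend = c("relu", "k = 1", "k = 10"), col = 1:3, lty = 1)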
Other approximations can be derived from equations in section "Analytic approximations": https://en.wikipedia.org/wiki/Heaviside_step_function
A bit belated, but in case anyone else is still looking for an answer, here's how to incorporate the non-approximated ReLU function. This is achieved by loading it from a package.
Note that while you could technically define the ReLU function yourself (with max() or if(x<0) etc.), this wouldn't work in the neuralnet package because it needs a function it can differentiate.
First, load the relu function from the sigmoid package, which is differentiable:
install.packages('sigmoid')
library(sigmoid)
relu(c(-2, 0, 3))  # quick check: should return 0 0 3
Second, insert it in your code:
nn <- neuralnet(
  as.formula(paste("X",paste(names(Z[,2:10]), collapse="+"),sep="~")),
  data=Z[,1:10],
  hidden=5, err.fct="sse",
  act.fct=relu,
  rep=1,
  linear.output=TRUE)
I found this solution in another post, but can't for the life of me remember which one, so credits to unknown.
I'm trying to plot the worm plot of residuals for a model fitted with the gamlss function from the gamlss package. The graph of interest looks like the one below:
Below is the routine I used with the wormplot_gg function from the childsds package (applied to a dataset that ships with R); however, the result produced by this function does not look like the example shown above.
library(ggplot2)
library(gamlss)
library(childsds)
head(Orange)
Dados <- Orange
Model <- gamlss(circumference~age, family=NO,data=Dados); Model
wp(Model)
wormplot_gg(m = Model)
Below are the traditional results via the wp function in the gamlss package.
And finally, we have the results obtained through the wormplot_gg function from the childsds package. However, as already described, it does not have the appearance I am after, that is, the visual structure of the first figure.
Use qqplotr (https://aloy.github.io/qqplotr/index.html) with the detrend = TRUE option:
library(ggplot2)
library(qqplotr)
set.seed(1)
df <- data.frame(z=rnorm(50))
ggplot(df, aes(sample=z)) +
  stat_qq_point(detrend = TRUE) +
  stat_qq_band(detrend = TRUE, color='black', fill=NA, size=0.5)
You can also add geom_hline(yintercept = 0).
Edit:
To use this with a gamlss model, you first have to extract the randomized quantile residuals from the model, which for gamlss is done simply with the residuals function. So you can just do e.g. df <- data.frame(z=residuals(Model)) and then continue with the rest of the code.
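Putting it together for the model from the question (a sketch; Model is the gamlss fit defined earlier):
library(ggplot2)
library(qqplotr)
# randomized quantile residuals of the gamlss fit
df <- data.frame(z = residuals(Model))
ggplot(df, aes(sample = z)) +
  stat_qq_point(detrend = TRUE) +
  stat_qq_band(detrend = TRUE, color = 'black', fill = NA, size = 0.5) +
  geom_hline(yintercept = 0)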
I am currently working with R due to a course at university, so I am still quite inexperienced.
We use R for exploratory data analysis. In a data analysis we are supposed to apply different regression models to the data and generate the same plots for each. Additionally, we are supposed to play a bit with the parameters for learning purposes. To avoid copy-pasting the same code 10-20 times, I wrote a function that takes the regression function and its parameters as an ellipsis (...). Inside this function I call the passed function with the ellipsis as its arguments.
library("astsa")
data_glob <- globtemp
plot.data.and.reg <- function(data, reg.func, ...){
  model <- reg.func(...)
  par(mfrow = c(1, 2))
  plot(data)
  abline(model, col = "orange", lwd = 3)
  qqnorm(data)
}
This works for the simple lm function, but unfortunately not for the ksmooth function.
When I pass this function I get the error message: "numeric y must be supplied. For density estimation use density()".
plot.data.and.reg(
  data_glob,
  lm,
  list(
    formula = as.formula("data_glob ~ time(data_glob)"),
    data = data_glob
  )
)
plot.data.and.reg(
  data_glob,
  ksmooth,
  list(
    x = as.numeric(time(data_glob)),
    y = as.numeric(data_glob),
    kernel = "box",
    bandwidth = 0.25
  )
)
Thereupon I looked at the source code of ksmooth. It shows that this error message occurs because the missing(y) check is triggered. Apparently the problem is that I passed the parameters as a list via the ellipsis, and it doesn't seem to get "unpacked".
For simplicity, I wrote a dummy function to test if I can add this "unpack" myself.
test.wrapper <- function(func, ...){
  func(...)
}
test <- function(x, y){
  match.call()
  if(missing(y))
    print("Unfortunately, I was right")
  print(x)
  print(y)
}
test.wrapper(test, list(x = 10, y = 20))
Unfortunately I have not found a solution yet.
From Python I know that, as with kwargs, a dictionary can be unpacked with the ** operator. Is there an equivalent in R? Or how do I make sure in R that the parameters from the ellipsis are used correctly?
Since it worked with the lm function without errors, I also looked at its source code. Unfortunately, with my little experience in R, I can't see exactly where the essential difference is.
Overall, I would attribute the error to the fact that the ksmooth function is not yet designed for use with an ellipsis, but I am not sure. How would I need to adjust the ksmooth code to make it work with ...?
(For my Uni task, I will resort to the copy-paste (anti) pattern if in doubt. After searching for so long, I would still be interested in the solution and it may be useful in the future).
Thanks a lot for your help!
The closest equivalent of the */** splat in Python is the do.call function.
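For example, this (a small sketch using the ksmooth call from your question):
do.call(ksmooth, list(x = as.numeric(time(data_glob)),
                      y = as.numeric(data_glob),
                      kernel = "box", bandwidth = 0.25))
# ... is equivalent to calling
ksmooth(x = as.numeric(time(data_glob)), y = as.numeric(data_glob),
        kernel = "box", bandwidth = 0.25)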
However, you don’t need this here. The actual issue is that you’re passing the extra arguments as a list rather than individually. Once you flatten the list, it works[1]:
plot.data.and.reg(
  data_glob,
  ksmooth,
  x = as.numeric(time(data_glob)),
  y = as.numeric(data_glob),
  kernel = "box",
  bandwidth = 0.25
)
I’m actually surprised that it works with a list for lm; that’s not intentional, it’s essentially an accident caused by how lm is currently implemented.
[1] I say it “works” because there’s no error and it plots something, but with your example data there’s no visible regression line (abline is inappropriate for the output of ksmooth), and the smoothing parameters do nothing; the result is identical to the unsmoothed input.
To get this to work, use lines instead of abline. And as for the smoothing, for your example data a bandwidth of 10 works fine.
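A sketch of a ksmooth-specific variant of your wrapper (the name plot.data.and.ksmooth is just illustrative), using lines instead of abline and the larger bandwidth:
plot.data.and.ksmooth <- function(data, ...){
  # ksmooth returns a list with components x and y, which lines() can draw
  model <- ksmooth(...)
  par(mfrow = c(1, 2))
  plot(data)
  lines(model, col = "orange", lwd = 3)
  qqnorm(data)
}
plot.data.and.ksmooth(
  data_glob,
  x = as.numeric(time(data_glob)),
  y = as.numeric(data_glob),
  kernel = "box",
  bandwidth = 10
)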
AIM: The aim here is to find a suitable fit that uses step functions of age to describe wage, for the Wage dataset in the ISLR library.
PLAN:
To find a suitable fit, I'll try multiple fits with different cut points. I'll use the glm() function for the fitting. In order to check which fit is the best, I'll use the cv.glm() function (from the boot library) to perform cross-validation on each fitted model.
PROBLEM:
In order to do so, I did the following:
all.cvs = rep(NA, 10)
for (i in 2:10) {
  lm.fit = glm(wage~cut(Wage$age,i), data=Wage)
  all.cvs[i] = cv.glm(Wage, lm.fit, K=10)$delta[2]
}
But this gives an error:
Error in model.frame.default(formula = wage ~ cut(Wage$age, i), data =
list( : variable lengths differ (found for 'cut(Wage$age, i)')
Whereas, when I run the code given below, it runs. (It can be found here.)
all.cvs = rep(NA, 10)
for (i in 2:10) {
  Wage$age.cut = cut(Wage$age, i)
  lm.fit = glm(wage~age.cut, data=Wage)
  all.cvs[i] = cv.glm(Wage, lm.fit, K=10)$delta[2]
}
Hypotheses and Results:
Well, it might be that cut() and glm() simply don't work together. But this works:
glm(wage~cut(age,4),data=Wage)
Question:
So, basically, we're using the cut() function, saving its result in a variable, and then using that variable in the glm() function. But we can't put the cut() call inside glm(), and even that only fails when the code is in a loop.
So, why is the first version of the code not working?
This is confusing. Any help appreciated.
I'm relatively new to R and I would appreciate it if you could take a look at the following code. I'm trying to estimate the shape parameter of the Frechet distribution (or inverse Weibull) using mmedist (I also tried fitdist, which calls mmedist), but I get the following error:
Error in mmedist(data, distname, start = start, fix.arg = fix.arg, ...) :
the empirical moment function must be defined.
The code that I use is the below:
require(actuar)
library(fitdistrplus)
library(MASS)
#values
n=100
scale = 1
shape=3
# simulate a sample
data_fre = rinvweibull(n, shape, scale)
memp=minvweibull(c(1,2), shape=3, rate=1, scale=1)
# estimating the parameters
para_lm = mmedist(data_fre,"invweibull",start=c(shape=3,scale=1),order=c(1,2),memp = "memp")
Please note that I tried changing the code many times to see whether my mistake was in the syntax, but I always get the same error.
I'm aware of the example in the documentation. I've tried that as well, but with no luck. Please note that in order for the method to work, the order of the moment must be smaller than the shape parameter (i.e. shape).
The example is the following:
require(actuar)
#simulate a sample
x4 <- rpareto(1000, 6, 2)
#empirical raw moment
memp <- function(x, order)
  ifelse(order == 1, mean(x), sum(x^order)/length(x))
#fit
mmedist(x4, "pareto", order=c(1, 2), memp="memp",
        start=c(shape=10, scale=10), lower=1, upper=Inf)
Thank you in advance for any help.
You will need to make non-trivial changes to the source of mmedist -- I recommend that you copy out the code, and make your own function foo_mmedist.
The first change you need to make is on line 94 of mmedist:
if (!exists("memp", mode = "function"))
That line checks whether "memp" is a function that exists, as opposed to whether the argument that you have actually passed exists as a function. Change it to:
if (!exists(as.character(expression(memp)), mode = "function"))
The second change relates to the fact that the optim routine actually calls funobj, which calls DIFF2, which in turn (see line 112) calls the user-supplied memp function (minvweibull in your case) with two arguments: obs, which resolves to the data, and order. Since minvweibull does not take the data as its first argument, this fails.
This is expected, as the help page tells you:
memp: A function implementing empirical moments, raw or centered but has to be consistent with distr argument. This function must have two arguments: as a first one the numeric vector of the data and as a second the order of the moment returned by the function.
How can you fix this? Pass the function moment from the moments package. Here is the complete code (assuming that you have made the change above, and created a new function called foo_mmedist):
library(actuar)        # rinvweibull
library(fitdistrplus)  # basis of the modified foo_mmedist
library(moments)       # provides moment()
# values
n = 100
scale = 1
shape = 3
# simulate a sample
data_fre = rinvweibull(n, shape, scale)
# estimating the parameters
para_lm = foo_mmedist(data_fre, "invweibull",
                      start = c(shape=5, scale=2), order = c(1, 2), memp = moment)
You can check that optimization has occurred as expected:
> para_lm$estimate
shape scale
2.490816 1.004128
Note, however, that this actually reduces to a crude way of doing an overdetermined method of moments, and I am not sure that this is theoretically appropriate.
Sorry if this question is trivial, but I'm trying to figure out how to plot a certain type of natural cubic spline (NCS) in R and it's completely eluded me.
In a previous question I learned how to plot the NCS generated by the ns() command in ggplot, but I'm interested in how to plot a slightly different NCS, generated by the smooth.Pspline command in the pspline package. As far as I know this is the only package that automatically selects the proper smoothing penalty by CV for a given dataset.
Ideally I would be able to provide smooth.Pspline as a method to a stat_smooth layer in ggplot2. My current code looks like this:
plot <- ggplot(data_plot, aes(x=age, y=wOBA, color=playerID, group=playerID))
plot <- plot + stat_smooth(method = lm, formula = y~ns(x,4),se=FALSE)
I'd like to replace the "lm" formula with smooth.Pspline's functionality. I did a little bit of googling and found a solution to the very similar B-spline function smooth.spline, written by Hadley. But I haven't been able to adapt this to smooth.Pspline perfectly. Does anyone have experience with this?
Thanks so much!
You simply need to inspect how predict.smooth.Pspline returns the predicted values.
In the internal workings of stat_smooth, predictdf is called to create the smoothed line. predictdf is an internal (non-exported) function of ggplot2 (it is defined here), and it is a standard S3 generic.
sm.spline returns an object of class smooth.Pspline; therefore, for stat_smooth to work you need to create a predictdf method for the class smooth.Pspline.
As such the following will work.
library(pspline)  # provides sm.spline

smP <- function(formula, data, ...){
  M <- model.frame(formula, data)
  sm.spline(x = M[, 2], y = M[, 1])
}

# an S3 method for predictdf (called within stat_smooth)
predictdf.smooth.Pspline <- function(model, xseq, se, level) {
  pred <- predict(model, xseq)
  data.frame(x = xseq, y = c(pred))
}
An example (with a P-spline fitted using mgcv::gam as a comparison). mgcv is awesome and gives great flexibility in fitting methods and smoothing spline choices (although not CV, only GCV/UBRE/REML/ML).
library(ggplot2)
d <- ggplot(mtcars, aes(qsec, wt))
d + geom_point() + stat_smooth(method = smP, se = FALSE, colour = 'red', formula = y~x) +
  stat_smooth(method = 'gam', colour = 'blue', formula = y~s(x, bs='ps'))