I use R-packages (nnet, amore, neuralnet) for designing a neural network. The problem is that I want to use a custom error function. Based on the output from the neural network I have a custom calculation...
It seems this is not possible with any R package? Does anybody know what I can do?
Another possibility is to use a genetic algorithm to optimize the weights of my neural network, but I don't get the desired optimization that way. My network (28 inputs and 9 hidden neurons) is too big to optimize with a genetic algorithm; I get stuck in local optima...
(Maybe the genetic algorithm approach is an option but it would be time consuming to try to achieve a decent solution with it.)
With the neuralnet package you can pass custom activation and error functions, which are differentiated automatically (assuming the function can be differentiated with R's built-in functions). See this question regarding how to implement ReLU as the activation function. The same thing should be possible for the loss by passing your custom error function to the err.fct argument.
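A minimal sketch of that idea, assuming a log-cosh loss as the custom error (the loss, the toy data and the names logcosh_err and toy are illustrative, not from the question). As far as I know, neuralnet expects the error function to take the network output x and the observed response y, and the body must be something base R's deriv can differentiate. The neuralnet call only runs if the package is installed.

```r
# Custom error: log-cosh loss. x = network output, y = observed response.
logcosh_err <- function(x, y) log(cosh(x - y))

# Check that base R's symbolic differentiation can handle the expression.
d_logcosh <- deriv(~ log(cosh(x - y)), "x", function.arg = c("x", "y"))
grad_at <- attr(d_logcosh(0.5, 0), "gradient")[1, "x"]  # analytically tanh(0.5)

if (requireNamespace("neuralnet", quietly = TRUE)) {
  set.seed(1)
  toy <- data.frame(a = runif(50), b = runif(50))  # made-up toy data
  toy$target <- toy$a + toy$b
  fit <- neuralnet::neuralnet(target ~ a + b, data = toy, hidden = 3,
                              err.fct = logcosh_err, linear.output = TRUE)
}
```

If deriv fails on your expression (for example because it uses max or a lookup), this route will probably not work and a gradient-free optimizer or a different framework is needed.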
I'm working in R and trying to get started with neural networks, using the keras package.
I'd like to use a custom loss function for training my NN. It's possible to do this by writing the custom loss function as lossFn <- function(y_true, y_pred) { ... } and passing it to the compile method as model %>% compile(loss = lossFn, ...).
Now in order to use the gradient descent method of training the NN, the loss function needs to be differentiable. I understand that you'd usually accomplish this by restricting yourself to using backend functions in your loss function, e.g.
lossFn <- function(y_true, y_pred) {
K <- backend()
K$mean(K$square(y_true - y_pred), axis = 1L)
}
or something like that.
Now, my problem is that I cannot express my loss function this way; I need to use functions that aren't available in the backend.
So my idea was that I'd work out the gradient myself on paper, and then provide it to compile as another argument, say compile(loss = lossFn, gradient = gradientFn, ...), with gradientFn suitably defined.
The documentation for keras (the R package!) does not indicate that this is possible. At the same time, it does not suggest it's not. And googling has turned up little that is relevant.
So my question is, is it possible?
An addendum: since Google has suggested that there are other training methods for NNs that do not rely on the gradient of the loss function, I should add I'm not too hung up on the specific training method. My ultimate goal isn't to manually supply the gradient of a custom loss function, it's to use a custom loss function to train the NN. The gradient is just a technical obstacle for me right now.
Thanks!
This is certainly possible in Keras. You'll just have to move up the stack a little: implement a train_step method and then call optimizer$apply_gradients().
Chapter 7 in the Deep Learning with R book covers this use case:
https://github.com/t-kalinowski/deep-learning-with-R-2nd-edition-code/blob/9f8b6d08dbb8d6565e4f5396e509aaea3e242b84/ch07.R#L608
Also, this keras guide may be useful, even though it's in Python and you're working in R. (The Python interface is very similar to the R interface).
https://keras.io/guides/writing_a_training_loop_from_scratch/
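To illustrate what such a custom step does conceptually, here is a rough base R sketch (not the Keras API): a hand-written training loop that uses a hand-derived gradient of a custom loss and applies it to the parameters, which is the role optimizer$apply_gradients() plays in Keras. The asymmetric loss, the linear model and all names are assumptions made for the example.

```r
set.seed(42)
n <- 200
X <- cbind(1, runif(n))                        # intercept + one feature
y <- drop(X %*% c(1, 2)) + rnorm(n, sd = 0.1)  # simulated targets

# Hypothetical custom loss: under-predictions cost twice as much.
loss_fn <- function(y_true, y_pred) {
  w <- ifelse(y_true > y_pred, 2, 1)
  mean(w * (y_true - y_pred)^2)
}

# Gradient of loss_fn with respect to y_pred, worked out by hand.
grad_fn <- function(y_true, y_pred) {
  w <- ifelse(y_true > y_pred, 2, 1)
  -2 * w * (y_true - y_pred) / length(y_true)
}

beta <- c(0, 0)   # model parameters
lr <- 0.1         # learning rate
initial_loss <- loss_fn(y, drop(X %*% beta))
for (step in 1:2000) {
  pred <- drop(X %*% beta)
  g_beta <- drop(t(X) %*% grad_fn(y, pred))  # chain rule: dL/dbeta
  beta <- beta - lr * g_beta                 # plain gradient descent update
}
final_loss <- loss_fn(y, drop(X %*% beta))
```

In Keras the same structure applies, except that the predictions come from the model and the update is delegated to the optimizer.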
I'm using the caret package in R to create prediction models for maximum energy demand. I need to use a multilayer perceptron neural network, but I found that caret has two MLP methods, "mlp" and "mlpML". What is the difference between the two?
I have read the description in a book (Advanced R Statistical Programming and Data Models: Analysis, Machine Learning, and Visualization), but it still doesn't answer my question.
Caret has 238 different models available! However, many of them are just different methods to call the same basic algorithm.
Besides mlp there are 9 other methods of calling a multi-layer perceptron, one of which is mlpML. The real difference is only in the parameters of the function call; which model you need depends on your use case and on what you want to adapt about the basic model.
Chances are, if you don't know what mlpML, mlpWeightDecay, etc. do, you are fine just using the basic mlp.
Looking at the official documentation we can see that:
mlp(size) versus mlpML(layer1, layer2, layer3): with the first method you can only tune the size of the single hidden layer, while with the second you can tune the size of each hidden layer individually.
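As a small sketch of what this means in practice, the two tuning grids could look like the following. The grid values are arbitrary, and training_data and y in the commented call are hypothetical names. My understanding is that mlpML drops layers whose size is set to 0, but treat that as an assumption to check against the source.

```r
# mlp exposes a single tuning parameter: the number of hidden units.
grid_mlp <- expand.grid(size = c(3, 5, 7))

# mlpML exposes one tuning parameter per hidden layer.
grid_mlpML <- expand.grid(layer1 = c(5, 10), layer2 = c(0, 5), layer3 = 0)

# With caret and RSNNS installed, a grid would be passed via tuneGrid, e.g.:
# caret::train(y ~ ., data = training_data, method = "mlpML",
#              tuneGrid = grid_mlpML)
```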
Looking at the source code here:
https://github.com/topepo/caret/blob/master/models/files/mlp.R
and here:
https://github.com/topepo/caret/blob/master/models/files/mlpML.R
It seems that the difference is that mlpML allows several hidden layers:
modelInfo <- list(label = "Multi-Layer Perceptron, with multiple layers",
while mlp has one single layer with hidden units.
The official documentation also hints at this difference. In my opinion, it is not particularly useful to have many different models that differ only very slightly, and the documentation does not explain those slight differences well.
Ciao,
I am working with neuralnet in R.
I used to program this kind of stuff using Keras in python so I would expect to be able to set up different activation functions for different layers.
Let me explain. Suppose I want to build a neural net with 2 hidden layers (say with 5 and 4 neurons) and an output between -1 and 1.
I would like to set up RELU or softplus in the hidden layers and tanh in the output layer.
The issue here is that the neuralnet package lets me choose only one activation function, via the act.fct argument:
nn <- neuralnet(data = data, hidden = c(5, 4), act.fct = tanh)
I tried setting the act.fct argument to c(softplus, softplus, tanh), but of course I get an error because the neuralnet function expects only one function for that argument.
Do you know how I can set up the neural net in this way? On the internet I can only find very basic linear neural nets built with this package. If this is not possible, it would mean that this package is almost useless, because it could only build "linear models" (??!)
Thanks a lot,
ciao
ReLU was added in neuralnet 1.44.4 (not on CRAN yet; you could use devtools::install_github("bips-hb/neuralnet")). In this version it's also possible to change the output activation function separately (output.act.fct). However, using different activations for the individual hidden layers is not yet possible.
See also here: https://github.com/bips-hb/neuralnet/issues/18.
On the internet I can only find very basic linear neural nets built with this package. If this is not possible, it would mean that this package is almost useless, because it could only build "linear models" (??!)
No, not only linear models. But note that the package is from the pre-deep learning era (2008) and not made for deep networks. I would also recommend keras (the R package is great) here.
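To make the architecture from the question concrete, here is a base R sketch of a forward pass with softplus in both hidden layers (5 and 4 neurons) and tanh at the output. It uses neither neuralnet nor keras, the weights are random and untrained, and n_inputs is an assumed input width.

```r
softplus <- function(z) log1p(exp(z))

set.seed(1)
n_inputs <- 3  # hypothetical number of input features

# Random, untrained weights: one bias row plus one row per incoming unit.
W1 <- matrix(rnorm((n_inputs + 1) * 5), ncol = 5)
W2 <- matrix(rnorm((5 + 1) * 4), ncol = 4)
W3 <- matrix(rnorm((4 + 1) * 1), ncol = 1)

forward <- function(x) {
  h1 <- softplus(cbind(1, x) %*% W1)   # hidden layer 1: 5 neurons
  h2 <- softplus(cbind(1, h1) %*% W2)  # hidden layer 2: 4 neurons
  tanh(cbind(1, h2) %*% W3)            # output bounded between -1 and 1
}

x <- matrix(rnorm(10 * n_inputs), ncol = n_inputs)
out <- forward(x)
```

Because the hidden activations are nonlinear, the mapping from x to out is not a linear model, whichever package ends up fitting the weights.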
I am using the mlr package in R to run the KNN algorithm. I am using tuneParams to search for the optimal k. When I run tuneParams the output shows the performance for each value of k. How can I save the performance for each k? The TuneResult object only has the optimal performance. I would like to use this to create a graph with the performance as a function of k.
To complete the answer you found yourself:
The best way to access all the settings that have been tried out:
as.data.frame(TuneResult$opt.path)
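A sketch of the full workflow, assuming the built-in iris data, the classif.knn learner, a grid over k = 1 to 10 and mlr's default classification measure (mean misclassification error, stored in the column mmce.test.mean). The object names are illustrative, and the block only runs if mlr is installed.

```r
if (requireNamespace("mlr", quietly = TRUE)) {
  library(mlr)
  task <- makeClassifTask(data = iris, target = "Species")
  ps <- makeParamSet(makeDiscreteParam("k", values = 1:10))
  res <- tuneParams(makeLearner("classif.knn"), task = task,
                    resampling = makeResampleDesc("CV", iters = 3),
                    par.set = ps, control = makeTuneControlGrid(),
                    show.info = FALSE)

  # One row per evaluated value of k, including its performance.
  path_df <- as.data.frame(res$opt.path)
  path_df$k <- as.numeric(as.character(path_df$k))  # discrete params are factors

  plot(path_df$k, path_df$mmce.test.mean, type = "b",
       xlab = "k", ylab = "Mean misclassification error")
}
```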
I’m trying to do something like word spotting on images in R. So far, I’ve been able to put some boundaries around the words with the imager package and its isoblur function:
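library(imager)  # provides load.image, isoblur and highlight
document <- load.image("image.jpg")
plot(document)
document1 <- document < 0.8         # pixset of dark (ink) pixels
plot(document1)
px <- isoblur(document1, 1) > 0.3   # blur the mask, then threshold again
highlight(px)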
[Images: Document, Document1, Result]
The idea is from this work:
https://cran.r-project.org/web/packages/imager/vignettes/pixsets.html
The description of isoblur is not very helpful for understanding the process behind the function, and I am wondering:
What are the calculations behind it?
Is it possible to construct a neural network to achieve the same result, more or less?
What are the calculations behind it?
The isoblur function applies an isotropic Gaussian filter to the image, blurring it and removing some noise.
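The calculation can be sketched in base R as a separable convolution with a normalized Gaussian kernel. This is an illustration of the idea, not imager's actual implementation; the function names, the kernel radius of three standard deviations and the edge padding are my own choices.

```r
# 1D Gaussian kernel, normalized so the weights sum to 1.
gaussian_kernel <- function(sigma, radius = ceiling(3 * sigma)) {
  offsets <- -radius:radius
  k <- exp(-offsets^2 / (2 * sigma^2))
  k / sum(k)
}

# Blur every row of a matrix, repeating the edge pixels as padding.
blur_rows <- function(m, k) {
  r <- (length(k) - 1) / 2
  t(apply(m, 1, function(row) {
    padded <- c(rep(row[1], r), row, rep(row[length(row)], r))
    out <- stats::filter(padded, k, sides = 2)
    as.numeric(out[(r + 1):(r + length(row))])
  }))
}

# Isotropic blur: same kernel along rows, then along columns.
gaussian_blur <- function(m, sigma) {
  k <- gaussian_kernel(sigma)
  t(blur_rows(t(blur_rows(m, k)), k))
}

img <- matrix(0, 11, 11)
img[6, 6] <- 1                  # single bright pixel
blurred <- gaussian_blur(img, sigma = 1)
```

Each output pixel is a weighted average of its neighbours, with weights that fall off with distance, which is why thresholding the blurred mask merges nearby letters into word-sized blobs.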
Is it possible to construct a neural network to achieve the same result, more or less?
Yes, but it is overkill. If most of your pictures are similar to the one shown, you could segment the words by thresholding and then selecting each connected component. For choosing the threshold, take a look at Otsu's method.
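For reference, Otsu's method picks the threshold that maximizes the between-class variance of the intensity histogram. A compact base R sketch (the function name, the bin count and the simulated intensities are assumptions for the example):

```r
otsu_threshold <- function(values, n_bins = 256) {
  breaks <- seq(min(values), max(values), length.out = n_bins + 1)
  h <- hist(values, breaks = breaks, plot = FALSE)
  p <- h$counts / sum(h$counts)   # probability of each bin
  mids <- h$mids
  w0 <- cumsum(p)                 # weight of the class up to each bin
  w1 <- 1 - w0                    # weight of the class above it
  mu_cum <- cumsum(p * mids)
  mu_total <- sum(p * mids)
  between <- (mu_total * w0 - mu_cum)^2 / (w0 * w1)  # between-class variance
  between[!is.finite(between)] <- 0                  # empty class: ignore
  mids[which.max(between)]
}

# Simulated intensities: dark ink around 0.2, bright paper around 0.8.
set.seed(7)
ink <- rnorm(200, mean = 0.2, sd = 0.03)
paper <- rnorm(200, mean = 0.8, sd = 0.03)
thr <- otsu_threshold(c(ink, paper))
```

On an image, something like otsu_threshold(as.vector(document)) could then replace a hand-picked cutoff such as 0.8, assuming the image has a roughly bimodal intensity distribution.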
If you aim to recognize each character, I think it is better to use an already established OCR tool, such as Tesseract. However, your characters seem to be far from the usual handwriting, so you will probably have to train your own classifier (a neural network might be a good choice for this task).