How to initialize weights with the neuralnet package in R?

I am using the neuralnet package in R, but I have a problem when I want to set a certain number of initial weights for my network. I have tried to do it based on the default random weights that were generated, but no luck at all.
This is the part where I should put the initial weights:
weights <- c(-0.3, 0.2,
             0.2, 0.05,
             0.2, -0.1,
             -0.1, 0.2, 0.2)
net <- neuralnet(to ~ x1 + x2, tdata, hidden = 2, threshold = 0.01, constant.weights = weights)
because I assume the weights follow this pattern:
Intercept.to.1layhid1 -5.0556934519949
x1.to.1layhid1 10.9208362719511
x2.to.1layhid1 12.9996270590530
Intercept.to.1layhid2 3.7047601228351
x1.to.1layhid2 -2.5636252939619
x2.to.1layhid2 -2.5759077405754
Intercept.to.to -1.6494794336705
1layhid.1.to.to 1.3502874764968
1layhid.2.to.to 1.6969811621181
but when I apply it, I get the error:
Error in constant.weights != 0
Any help?
Thanks

You are looking for the startweights argument to initialize custom weights. This is in the documentation:
help(neuralnet)
startweights:
a vector containing starting values for the weights.
The weights will not be randomly initialized.
The constant.weights argument is used to specify fixed values for the weights that are excluded from training via the exclude argument.
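For example, a minimal sketch (assuming the to, x1, x2 and tdata from the question; startweights takes the weights in the order the fitted model prints them, i.e. bias/x1/x2 into each hidden unit, then bias/hidden1/hidden2 into the output):
library(neuralnet)

# 9 starting weights for a 2-2-1 network, in print order:
# Intercept/x1/x2 -> 1layhid1, Intercept/x1/x2 -> 1layhid2,
# Intercept/1layhid1/1layhid2 -> to
startw <- c(-0.3, 0.2, 0.2,
            0.05, 0.2, -0.1,
            -0.1, 0.2, 0.2)

net <- neuralnet(to ~ x1 + x2, data = tdata, hidden = 2,
                 threshold = 0.01, startweights = startw)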

Related

Estimation to plot person-item map not feasible because items "have no 0-responses" in data matrix

I am trying to create a person-item map that organizes the questions from a dataset in order of difficulty. I am using the eRm package, and the output should look like the following:
[Person-item map] https://hansjoerg.me/post/2018-04-23-rasch-in-r-tutorial_files/figure-html/unnamed-chunk-3-1.png
As one of the steps before running the function that outputs the map, I have to fit the data set to obtain the object that the plotting function uses to create the actual map, but I am getting an error at that step.
I have already tried to follow and review some documentation that might be useful if you want some extra information:
[Tutorial] https://hansjoerg.me/2018/04/23/rasch-in-r-tutorial/#plots
[Plotting function] https://rdrr.io/rforge/eRm/man/plotPImap.html
[Documentation] https://eeecon.uibk.ac.at/psychoco/2010/slides/Hatzinger.pdf
Now, this is the code that I am using. First, I install and load the respective libraries and the data:
> library(eRm)
> library(ltm)
Loading required package: MASS
Loading required package: msm
Loading required package: polycor
> library(difR)
Then I fit the PCM to generate the object of class Rm, and here is where the error appears.
(The PCM function is used here because it is specific to polytomous data; if I use a different one, the output says that I am not using a dichotomous dataset.)
> res <- PCM(my.data)
Warning:
The following items have no 0-responses:
AUT_10_04 AUN_07_01 AUN_07_02 AUN_09_01 AUN_10_01 AUT_11_01 AUT_17_01
AUT_20_03 CRE_05_02 CRE_07_04 CRE_10_01 CRE_16_02 EFEC_03_07 EFEC_05
EFEC_09_02 EFEC_16_03 EVA_02_01 EVA_07_01 EVA_12_02 EVA_15_06 FLX_04_01
... [rest of items]
Responses are shifted such that lowest category is 0.
Warning:
The following items do not have responses on each category:
EFEC_03_07 LC_07_03 LC_11_05
Estimation may not be feasible. Please check data matrix
I should clarify that the whole dataset ranges from 1 to 5; it is a polytomous Likert dataset.
Finally, when I try to use the plot function it produces no output; the session just keeps running ad infinitum with no response:
>plotPImap(res, sorted=TRUE)
I would like to add the description of that particular function and its arguments:
PCM(X, W, se = TRUE, sum0 = TRUE, etaStart)
#X
Input data matrix or data frame with item responses (starting from 0);
rows represent individuals, columns represent items. Missing values are
inserted as NA.
#W
Design matrix for the PCM. If omitted, the function will compute W
automatically.
#se
If TRUE, the standard errors are computed.
#sum0
If TRUE, the parameters are normed to sum-0 by specifying an appropriate
W.
If FALSE, the first parameter is restricted to 0.
#etaStart
A vector of starting values for the eta parameters can be specified. If
missing, the 0-vector is used.
I do not understand why it is necessary to have scores beginning from 0; I think that is what the warning is trying to say, but I do not quite understand the output.
I would highly appreciate any hint you can provide.
Feel free to ask for any information that could be useful in reaching a solution to this issue.
The problem is not caused by the fact that some items have no 0-responses. The model automatically corrects for this by centering the response scale categories on zero. (You'll notice that the PI-map you linked to is centered on zero. Also, I believe the map you linked to is of dichotomous data; a PI-map of polytomous data should include the scale categories.)
Without being able to see your data, it is impossible to know the exact cause though.
It may be that the model is not converging; that may be what the warning was alluding to: Estimation may not be feasible. Please check data matrix. You could check by entering res at the prompt. If the model was able to converge, you should see something like:
Conditional log-likelihood: -2.23709
Number of iterations: 27
Number of parameters: 8
...
Does your data contain answers with decimal numbers? I ran into the same error and solved it by using the dplyr::dense_rank() function:
df_ranked <- sapply(df_decimal_data, dplyr::dense_rank)
Worked.
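To make that fix runnable end-to-end, a hedged sketch (df_decimal_data is a hypothetical data frame of numeric item responses; dense_rank() maps each item's values to consecutive integers starting at 1, and subtracting 1 makes the lowest category 0, as the PCM documentation above requires):
library(dplyr)
library(eRm)

# Recode each item to consecutive integers, then shift so the lowest category is 0
df_ranked <- sapply(df_decimal_data, dense_rank) - 1

res <- PCM(df_ranked)
plotPImap(res, sorted = TRUE)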

Exclude specific tensors being updated by optimizer in TensorFlow

I have two graphs which I am supposed to train independently, which means I have two different optimizers, but at the same time one of them uses the tensor values of the other graph. As a result, I need to be able to stop specific tensors from being updated while training one of the graphs. I have assigned two different name scopes to my tensors and use this code to control which variables each optimizer updates:
mentor_training_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, "mentor")
train_op_mentor = mnist.training(loss_mentor, FLAGS.learning_rate, mentor_training_vars)
mentee_training_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, "mentee")
train_op_mentee = mnist.training(loss_mentee, FLAGS.learning_rate, mentee_training_vars)
The var_list variable is used as below, in the training function of the mnist module:
def training(loss, learning_rate, var_list):
    # Add a scalar summary for the snapshot loss.
    tf.summary.scalar('loss', loss)
    # Create the gradient descent optimizer with the given learning rate.
    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    # Create a variable to track the global step.
    global_step = tf.Variable(0, name='global_step', trainable=False)
    # Use the optimizer to apply the gradients that minimize the loss
    # (and also increment the global step counter) as a single training step.
    train_op = optimizer.minimize(loss, global_step=global_step, var_list=var_list)
    return train_op
I'm using the var_list argument of the optimizer's minimize method to control which variables are updated by the optimizer.
Right now I'm not sure whether I have done this appropriately, and whether there is any way to check that an optimizer only updates part of a graph.
I would appreciate it if anyone could help me with this issue.
Thanks!
I have had a similar problem and used the same approach as you, i.e. via the var_list argument of the optimizer. I then checked whether the variables not intended for training stayed the same using:
the_var_np = sess.run(tf.get_default_graph().get_tensor_by_name('the_var:0'))
assert np.equal(the_var_np, pretrained_weights['the_var']).all()
pretrained_weights is a dictionary returned by np.load('some_file.npz') which I used to store the pre-trained weights to disk.
Just in case you need that as well, here is how you can override a tensor with a given value:
value = pretrained_weights['the_var']
variable = tf.get_default_graph().get_tensor_by_name('the_var:0')
sess.run(tf.assign(variable, value))

h2o autoencoders high error (h2o.mse)

I am trying to use h2o to create an autoencoder using its deeplearning function. I am feeding a data set of about 4000x50 in size to the deeplearning function (a single hidden layer, hidden = c(200)) and then using h2o.mse to check its error, and I am getting about 0.4, a fairly high value.
Is there any way to reduce that error by changing something in the deeplearning function?
I assume everything is the defaults, except defining a single hidden layer with 200 nodes?
Your first set of things to try are:
Use more epochs (or use less aggressive early stopping criteria)
Use a 2nd hidden layer
Use more nodes in your hidden layer(s)
Get more training data
Note that all of those will increase your training time.
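A minimal sketch of those tweaks in R (assuming train.hex is an H2OFrame holding your 4000x50 data; the parameter values are illustrative, not tuned):
library(h2o)
h2o.init()

ae <- h2o.deeplearning(x = names(train.hex),
                       training_frame = train.hex,
                       autoencoder = TRUE,
                       hidden = c(200, 200),      # second hidden layer / more nodes
                       epochs = 100,              # more than the default of 10
                       stopping_rounds = 5,       # relax early stopping
                       stopping_tolerance = 1e-4)
h2o.mse(ae)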
You can use H2OGridSearch to find the autoencoder model with the smallest MSE.
Below is an example in Python; a rough R equivalent follows it.
import h2o
from h2o.grid.grid_search import H2OGridSearch

def tuneAndTrain(hyperParameters, model, trainDataFrame):
    h2o.init()
    trainData = trainDataFrame.values
    trainDataHex = h2o.H2OFrame(trainData)
    modelGrid = H2OGridSearch(model, hyper_params=hyperParameters)
    modelGrid.train(x=list(range(0, int(len(trainDataFrame.columns)))), training_frame=trainDataHex)
    # Sort ascending so the model with the smallest MSE comes first
    gridperf1 = modelGrid.get_grid(sort_by='mse', decreasing=False)
    bestModel = gridperf1.models[0]
    return bestModel
And you can call the above function to find and train the best model:
from h2o.estimators.deeplearning import H2OAutoEncoderEstimator

hiddenOpt = [[50, 50], [100, 100], [5, 5, 5], [50, 50, 50]]
l2Opt = [1e-4, 1e-2]
hyperParameters = {"hidden": hiddenOpt, "l2": l2Opt}
bestModel = tuneAndTrain(hyperParameters,
                         H2OAutoEncoderEstimator(activation="Tanh", ignore_const_cols=False, epochs=200),
                         dataFrameTrainPreprocessed)
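A rough R equivalent of the Python example (assuming train.hex is your training H2OFrame; the grid id is arbitrary):
library(h2o)
h2o.init()

hyper_params <- list(hidden = list(c(50, 50), c(100, 100), c(5, 5, 5), c(50, 50, 50)),
                     l2 = c(1e-4, 1e-2))

grid <- h2o.grid("deeplearning",
                 grid_id = "ae_grid",
                 x = names(train.hex),
                 training_frame = train.hex,
                 autoencoder = TRUE,
                 activation = "Tanh",
                 ignore_const_cols = FALSE,
                 epochs = 200,
                 hyper_params = hyper_params)

# Sort ascending so the first model has the smallest MSE
gridperf <- h2o.getGrid("ae_grid", sort_by = "mse", decreasing = FALSE)
best_model <- h2o.getModel(gridperf@model_ids[[1]])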

Arguments length error when trying to test model using party package in R

I have divided my data set into two groups:
transactions.train (80% of the data)
transactions.test (20% of the data)
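For reference, a minimal sketch of such a split (the seed is hypothetical and transactions stands for the full data set):
set.seed(123)  # hypothetical seed, for reproducibility
idx <- sample(nrow(transactions), 0.8 * nrow(transactions))
transactions.train <- transactions[idx, ]
transactions.test  <- transactions[-idx, ]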
Then I built the decision tree using the ctree method from the party package as follows:
transactions.Tree <- ctree(dt_formula, data=transactions.train)
And I can successfully apply the predict method to the training set and use the table function to output the result as follows:
table(predict(transactions.Tree), transactions.train$Satisfaction)
But my problem occurs when I try to output the table based on the test set as follows:
testPred <- predict(transactions.Tree, newdata=transactions.test)
table(testPred, transactions.test$Satisfaction)
And the error is as follows:
Error in table(predict(pred = svm.pred, transactions.Tree), transactions.test$Satisfaction) :
all arguments must have the same length
I have done research on similar cases, which suggested omitting any NA values; I did that (as sketched below) without any change in the error.
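A hedged reconstruction of that NA-omission attempt, using the variable names above:
transactions.train <- na.omit(transactions.train)  # drop rows containing NA values
transactions.test <- na.omit(transactions.test)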
Can anyone help me by pointing out what the problem is here?

How to specify additional parameters in R functions?

For fitting Markov switching models with the MSwM package (function msmFit), there is a 'control' argument, which is a list of control parameters.
The syntax of msmFit is:
msmFit(object, k, sw, p, data, family, control)
The 'control' argument is a list that can supply any of the following components:
-trace: A logical value. If it is TRUE, tracing information on the progress of the optimization is produced.
-maxiter: The maximum number of iterations in the EM method. Default is 100.
and so on.
My question is how to specify, for example, 'maxiter'. I tried component(maxiter=50), component-maxiter=50, and component[["maxiter"]]=50. Everything gives an error: "unexpected '='" or other errors connected to the argument.
Format your call something like this:
mod <- msmFit(model, k = 2, sw = rep(TRUE, 8), control = list(maxiter = 10))
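A slightly fuller sketch under the same assumptions (your own model, k and sw), combining the two control components described in the question:
library(MSwM)

ctrl <- list(trace = TRUE,   # print progress of the optimization
             maxiter = 50)   # cap the EM iterations (default is 100)

mod <- msmFit(model, k = 2, sw = rep(TRUE, 8), control = ctrl)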
