Neural network development, simulation result, and explanation in R

I have been developing a back-propagation (BP) neural network for time-series prediction (using the past two years' data to predict the third). I have tried both the AMORE and Neural Network packages in R, and I got output from AMORE. However, the big problem is that every test sample produces exactly the same output regardless of its input values, which makes the predictions useless. I am pasting my code here in the hope that someone can help. I would really appreciate an explanation. Thanks!
library(AMORE)
tbincidence<-c(22.37,52.22,73.99,83.51,74.25,79.10,81.19,81.98,80.76,69.29,78.94,80.04,65.80,58.12,116.71,127.01,115.74,116.08,102.58,101.71,94.25,80.27,89.88,89.37,68.37,79.33,113.04,110.72,101.01,102.88,94.41,97.17,88.01,82.37,84.77,86.93,87.80,69.35,107.51,113.10,103.19,102.27,102.16,98.86,90.24,84.33,88.21,90.38,84.49,76.90,118.43,116.34,107.71,99.42,102.91,92.72,93.49,90.30,83.54,91.20,67.86,95.68,105.28,105.19,96.50,106.80,97.26,95.24,93.15,82.00,78.55,90.18,79.31,66.46,103.72,100.13,96.17,95.35,91.61,87.56,83.84,75.73,82.44,78.33,74.26,73.14,101.19,96.31,93.13,88.79,83.77,85.59,79.45,74.57,82.17,77.72,65.21,91.96,102.80,95.35,97.22,87.91,86.85,85.09,78.52,73.98,77.77,72.60,76.95,66.10,92.61,89.73,90.04,78.49,83.61,78.80,77.42,73.74,74.69,73.99,70.57,65.02,82.18,87.19,78.14,74.37,77.50,71.07,70.20,65.12,62.96,67.99,69.54,55.18,82.69,78.82,74.47,74.67,75.07,69.77,69.25,62.63,64.13,65.34,62.70,60.77,83.40,75.24,73.14,68.46,67.54,69.82,65.03,60.49,64.04,65.24,58.91,67.02)
tbincidence
lagtwo<-c(22.37,52.22,73.99,83.51,74.25,79.10,81.19,81.98,80.76,69.29,78.94,80.04,65.80,58.12,116.71,127.01,115.74,116.08,102.58,101.71,94.25,80.27,89.88,89.37,68.37,79.33,113.04,110.72,101.01,102.88,94.41,97.17,88.01,82.37,84.77,86.93,87.80,69.35,107.51,113.10,103.19,102.27,102.16,98.86,90.24,84.33,88.21,90.38,84.49,76.90,118.43,116.34,107.71,99.42,102.91,92.72,93.49,90.30,83.54,91.20,67.86,95.68,105.28,105.19,96.50,106.80,97.26,95.24,93.15,82.00,78.55,90.18,79.31,66.46,103.72,100.13,96.17,95.35,91.61,87.56,83.84,75.73,82.44,78.33,74.26,73.14,101.19,96.31,93.13,88.79,83.77,85.59,79.45,74.57,82.17,77.72,65.21,91.96,102.80,95.35,97.22,87.91,86.85,85.09,78.52,73.98,77.77,72.60,76.95,66.10,92.61,89.73,90.04,78.49,83.61,78.80,77.42,73.74,74.69,73.99,70.57,65.02,82.18,87.19,78.14,74.37,77.50,71.07,70.20,65.12,62.96,67.99,69.54,55.18,82.69,78.82,74.47,74.67,75.07,69.77,69.25,62.63,64.13,65.34,62.70,60.77,83.40,75.24,73.14,68.46,67.54,69.82,65.03,60.49,64.04,65.24)
lagone<-c(52.22,73.99,83.51,74.25,79.10,81.19,81.98,80.76,69.29,78.94,80.04,65.80,58.12,116.71,127.01,115.74,116.08,102.58,101.71,94.25,80.27,89.88,89.37,68.37,79.33,113.04,110.72,101.01,102.88,94.41,97.17,88.01,82.37,84.77,86.93,87.80,69.35,107.51,113.10,103.19,102.27,102.16,98.86,90.24,84.33,88.21,90.38,84.49,76.90,118.43,116.34,107.71,99.42,102.91,92.72,93.49,90.30,83.54,91.20,67.86,95.68,105.28,105.19,96.50,106.80,97.26,95.24,93.15,82.00,78.55,90.18,79.31,66.46,103.72,100.13,96.17,95.35,91.61,87.56,83.84,75.73,82.44,78.33,74.26,73.14,101.19,96.31,93.13,88.79,83.77,85.59,79.45,74.57,82.17,77.72,65.21,91.96,102.80,95.35,97.22,87.91,86.85,85.09,78.52,73.98,77.77,72.60,76.95,66.10,92.61,89.73,90.04,78.49,83.61,78.80,77.42,73.74,74.69,73.99,70.57,65.02,82.18,87.19,78.14,74.37,77.50,71.07,70.20,65.12,62.96,67.99,69.54,55.18,82.69,78.82,74.47,74.67,75.07,69.77,69.25,62.63,64.13,65.34,62.70,60.77,83.40,75.24,73.14,68.46,67.54,69.82,65.03,60.49,64.04,65.24,58.91)
predict<-cbind(lagtwo,lagone)
predict<-matrix(predict,ncol=2,byrow=F)
predict
output<-c(73.99,83.51,74.25,79.10,81.19,81.98,80.76,69.29,78.94,80.04,65.80,58.12,116.71,127.01,115.74,116.08,102.58,101.71,94.25,80.27,89.88,89.37,68.37,79.33,113.04,110.72,101.01,102.88,94.41,97.17,88.01,82.37,84.77,86.93,87.80,69.35,107.51,113.10,103.19,102.27,102.16,98.86,90.24,84.33,88.21,90.38,84.49,76.90,118.43,116.34,107.71,99.42,102.91,92.72,93.49,90.30,83.54,91.20,67.86,95.68,105.28,105.19,96.50,106.80,97.26,95.24,93.15,82.00,78.55,90.18,79.31,66.46,103.72,100.13,96.17,95.35,91.61,87.56,83.84,75.73,82.44,78.33,74.26,73.14,101.19,96.31,93.13,88.79,83.77,85.59,79.45,74.57,82.17,77.72,65.21,91.96,102.80,95.35,97.22,87.91,86.85,85.09,78.52,73.98,77.77,72.60,76.95,66.10,92.61,89.73,90.04,78.49,83.61,78.80,77.42,73.74,74.69,73.99,70.57,65.02,82.18,87.19,78.14,74.37,77.50,71.07,70.20,65.12,62.96,67.99,69.54,55.18,82.69,78.82,74.47,74.67,75.07,69.77,69.25,62.63,64.13,65.34,62.70,60.77,83.40,75.24,73.14,68.46,67.54,69.82,65.03,60.49,64.04,65.24,58.91,67.02)
output<-matrix(output,ncol=1)
output
trainPmatrix<-predict[1:130,]
trainTmatrix<-output[1:130,]
testPmatrix<-predict[131:156,]
testTmatrix<-output[131:156,]
net<-newff(n.neurons=c(2,5,1),learning.rate.global=1e-3,momentum.global=0.5,error.criterium="LMS",Stao=NA,hidden.layer="tansig",output="purelin",method="ADAPTgdwm")
result<- train(net,trainPmatrix,trainTmatrix,error.criterium="LMS",report=TRUE,show.step=100,n.shows=10)
y<-sim(result$net,testPmatrix)
y
The output would be something like this.
y
[,1]
[1,] 84.81958
[2,] 84.81958
[3,] 84.81957
[4,] 84.81958
[5,] 84.81958
[6,] 84.81958
[7,] 84.81958
[8,] 84.81958
[9,] 84.81958
[10,] 84.81958
[11,] 84.81958
[12,] 84.81958
[13,] 84.81958
[14,] 84.81958
[15,] 84.81958
[16,] 84.81958
[17,] 84.81958
[18,] 84.81958
[19,] 84.81958
[20,] 84.81958
[21,] 84.81958
[22,] 84.81958
[23,] 84.81958
[24,] 84.81958
[25,] 84.81958
[26,] 84.81958
How is that possible?
Thanks a lot for your ideas!

It's been a while since I used R, and I'm not familiar with AMORE. But, are you normalizing your input in any way? It doesn't look like it. You're using the tansig function for the hidden layer, which returns values between -1 and 1. It might work better if your input were in that range. For time series, you might use the percentage change (expressed as between -1 and 1) from one value to the next instead of the raw values, then scaled if necessary. I'm sure there's a function in an R library which can do that for you. Then, you transform the output in reverse fashion to get the predictions.
You'll probably have to keep tuning the network (learning rate, momentum, etc.), but this might be a place to start.
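The scaling and reverse transformation suggested above can be sketched in base R. This is only an illustration under stated assumptions: scale_to_unit and unscale_from_unit are hypothetical helper names (they are not part of AMORE), and the short series holds the first six values of tbincidence from the question. In practice the minimum and maximum would presumably be taken from the training rows only and reused for the test rows.

```r
# Hypothetical helpers (not from AMORE): min-max scaling to [-1, 1] and its inverse.
scale_to_unit <- function(x, lo, hi) 2 * (x - lo) / (hi - lo) - 1
unscale_from_unit <- function(s, lo, hi) (s + 1) / 2 * (hi - lo) + lo

# First six values of tbincidence from the question, used as toy data.
series <- c(22.37, 52.22, 73.99, 83.51, 74.25, 79.10)

# Range used for scaling (assumed here to come from the training data).
lo <- min(series)
hi <- max(series)

scaled   <- scale_to_unit(series, lo, hi)      # values now lie in [-1, 1]
restored <- unscale_from_unit(scaled, lo, hi)  # back on the original scale
```

The network would then be trained on the scaled inputs and targets, and unscale_from_unit would be applied to whatever sim() returns to get predictions in the original units.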

Related

Optimum cut-off values

I have both univariate and multivariate logistic regression models, and I want to find cut-off values with their respective sensitivity and specificity. I want to choose the best cut-off value for each of my univariate and multivariate models.
I tried the following code for the univariate models, but the sensitivity and specificity values come out as decimals. Is there a way to get the cut-off values as whole numbers other than rounding to the nearest integer? I am also not sure how to use the same code to get the cut-off values for the multivariate model.
Thank you in advance for any help!
###Cut off values of Var1
library(pROC)
ok <- multiclass.roc(DATA$Outcome, DATA$Var1)
class_1 <- ok$rocs[[1]]
wants <- cbind(sensitivity = class_1$sensitivities, specificity = class_1$specificities, cutt_off = class_1$thresholds)
wants
I am getting the values:
sensitivity specificity cutt_off
[1,] 1.00000 0.000000 Inf
[2,] 1.00000 0.012346 73.500
[3,] 1.00000 0.024691 72.500
[4,] 1.00000 0.049383 71.500
[5,] 1.00000 0.061728 70.500
[6,] 1.00000 0.135802 69.500
[7,] 1.00000 0.172840 68.500
[8,] 0.94118 0.222222 67.500
[9,] 0.88235 0.283951 66.500
[10,] 0.88235 0.320988 65.750
[11,] 0.88235 0.333333 65.250
[12,] 0.88235 0.432099 64.500
[13,] 0.88235 0.506173 63.500
[14,] 0.82353 0.617284 62.500
[15,] 0.82353 0.629630 61.750
[16,] 0.76471 0.629630 61.250
[17,] 0.76471 0.691358 60.500
[18,] 0.70588 0.753086 59.750
[19,] 0.70588 0.777778 59.250
[20,] 0.70588 0.814815 58.500
[21,] 0.64706 0.827160 57.500
[22,] 0.64706 0.876543 56.500
[23,] 0.64706 0.901235 55.250
[24,] 0.58824 0.913580 54.250
[25,] 0.58824 0.938272 53.900
[26,] 0.52941 0.938272 53.400
[27,] 0.41176 0.938272 52.500
[28,] 0.35294 0.950617 51.835
[29,] 0.29412 0.950617 50.835
[30,] 0.29412 0.962963 49.000
[31,] 0.23529 0.975309 47.500
[32,] 0.17647 0.975309 46.000
[33,] 0.11765 0.987654 44.500
[34,] 0.00000 0.987654 42.500
[35,] 0.00000 1.000000 -Inf
To determine the cut-off values for the multivariate model, I tried the following code, but I am getting errors. My model contains both continuous and categorical variables: Var1, Var2, and Var3 are continuous, and Var4 is categorical and was recoded as 0, 1, and 2.
library(pROC)
ok <- multiclass.roc(DATA$Outcome, DATA$var1 + DATA$Var2 + DATA$Var3 + DATA$Var4)
class_1 <- ok$rocs[[1]]
wants <- cbind(sensitivity = class_1$sensitivities, specificity = class_1$specificities, cutt_off = class_1$thresholds)
wants

Extract contour vertices from a dataframe

Hi guys and thanks in advance for your help.
I have a three-column data frame: two columns with the coordinates of my data (x and y) and one with a brain-activity value (z). Of the 7505 rows, many coordinates have null data that I need to exclude from my statistical analysis.
I'm using the package ImageSCC (https://rdrr.io/github/funstatpackages/ImageSCC/man/) so I need to extract the boundaries or contour of my data, meaning that I need a two-column list of the coordinates that separate brain activity data from null data. This is an example provided by the package:
$Brain.V1
V1 V2
[1,] 0.07781920 0.33867403
[2,] 0.07781920 0.56408840
[3,] 0.07781920 0.65469613
[4,] 0.07968313 0.43812155
[5,] 0.10950606 0.50000000
[6,] 0.11323392 0.25690608
[7,] 0.12068966 0.73425414
[8,] 0.16728798 0.18176796
[9,] 0.19897484 0.83812155
[10,] 0.23625349 0.10441989
[11,] 0.26663278 0.63322031
[12,] 0.28808616 0.46804766
[13,] 0.28939153 0.30397021
[14,] 0.30335508 0.91325967
[15,] 0.31081081 0.04033149
[16,] 0.37862235 0.75747676
[17,] 0.42193552 0.19377116
[18,] 0.42823858 0.02044199
[19,] 0.43196645 0.94419890
[20,] 0.44787179 0.57753089
[21,] 0.45363822 0.38319933
[22,] 0.54193849 0.02554144
[23,] 0.54231688 0.73945984
[24,] 0.55125815 0.92756906
[25,] 0.58275116 0.25043703
[26,] 0.59755088 0.47843551
[27,] 0.63513514 0.89662983
[28,] 0.64072693 0.07616022
[29,] 0.66888720 0.64103671
[30,] 0.71390307 0.35218048
[31,] 0.72087605 0.84138122
[32,] 0.75069897 0.15050829
[33,] 0.78631797 0.50829246
[34,] 0.79543336 0.77508287
[35,] 0.84016775 0.26132597
[36,] 0.85507922 0.69552486
[37,] 0.92845294 0.39287293
[38,] 0.92845294 0.60502762
I have tried the package 'contoureR', but every time I run my code RStudio crashes and restarts the session. This is a sample of my code:
#install.packages("contoureR")
library(contoureR)
x = 1:ncol(df)
y = 1:nrow(df)
z = expand.grid(x=x,y=y)
z$z = apply(z,1,function(xx){df[xx[1],xx[2]]})
z$z[is.nan(z$z)] <- 0
cl = getContourLines(z)
Does anyone have another idea for how I could extract the boundaries of my data?
Thanks in advance.

How to create a 10-year date sequence with a 16-day interval where each year starts on 1st January

The following code creates a date sequence spanning 10 years at 16-day intervals.
library(chron)
seq.dates("01/01/2008","12/31/2017", 16)
Output
[1] 01/01/08 01/17/08 02/02/08 02/18/08 03/05/08 03/21/08 04/06/08 04/22/08 05/08/08
[10] 05/24/08 06/09/08 06/25/08 07/11/08 07/27/08 08/12/08 08/28/08 09/13/08 09/29/08
[19] 10/15/08 10/31/08 11/16/08 12/02/08 12/18/08 **01/03/09** 01/19/09 02/04/09 02/20/09
[28] 03/08/09 03/24/09 04/09/09 04/25/09 05/11/09 ..........
........................
...........................
[208] 01/25/17 02/10/17 02/26/17 03/14/17 03/30/17 04/15/17 05/01/17 05/17/17 06/02/17
[217] 06/18/17 07/04/17 07/20/17 08/05/17 08/21/17 09/06/17 09/22/17 10/08/17 10/24/17
[226] 11/09/17 11/25/17 12/11/17 12/27/17
I want the first entry of every year to be 1st January, rather than the date that falls 16 days after the last entry of the previous year (the bold entry in the example sequence), with the subsequent entries following from it at 16-day intervals.
A long way to do this would be to create the date sequence for each year separately and then merge them into a single vector. Is there a way to do this in a single line of code?
How does this work for you? It uses sapply to pass a vector of yearly starting points to seq.dates, which then builds a shorter sequence for each year. The sapply function simplifies the result to an array where possible.
dates(sapply( seq.dates("01/01/2008", "01/01/2017", by="years") ,
function(x) seq.dates(x, to=x+365, by=16, length=23)))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 01/01/08 01/01/09 01/01/10 01/01/11 01/01/12 01/01/13 01/01/14 01/01/15
[2,] 01/17/08 01/17/09 01/17/10 01/17/11 01/17/12 01/17/13 01/17/14 01/17/15
[3,] 02/02/08 02/02/09 02/02/10 02/02/11 02/02/12 02/02/13 02/02/14 02/02/15
[4,] 02/18/08 02/18/09 02/18/10 02/18/11 02/18/12 02/18/13 02/18/14 02/18/15
[5,] 03/05/08 03/06/09 03/06/10 03/06/11 03/05/12 03/06/13 03/06/14 03/06/15
[6,] 03/21/08 03/22/09 03/22/10 03/22/11 03/21/12 03/22/13 03/22/14 03/22/15
[7,] 04/06/08 04/07/09 04/07/10 04/07/11 04/06/12 04/07/13 04/07/14 04/07/15
[8,] 04/22/08 04/23/09 04/23/10 04/23/11 04/22/12 04/23/13 04/23/14 04/23/15
[9,] 05/08/08 05/09/09 05/09/10 05/09/11 05/08/12 05/09/13 05/09/14 05/09/15
[10,] 05/24/08 05/25/09 05/25/10 05/25/11 05/24/12 05/25/13 05/25/14 05/25/15
[11,] 06/09/08 06/10/09 06/10/10 06/10/11 06/09/12 06/10/13 06/10/14 06/10/15
[12,] 06/25/08 06/26/09 06/26/10 06/26/11 06/25/12 06/26/13 06/26/14 06/26/15
[13,] 07/11/08 07/12/09 07/12/10 07/12/11 07/11/12 07/12/13 07/12/14 07/12/15
[14,] 07/27/08 07/28/09 07/28/10 07/28/11 07/27/12 07/28/13 07/28/14 07/28/15
[15,] 08/12/08 08/13/09 08/13/10 08/13/11 08/12/12 08/13/13 08/13/14 08/13/15
[16,] 08/28/08 08/29/09 08/29/10 08/29/11 08/28/12 08/29/13 08/29/14 08/29/15
[17,] 09/13/08 09/14/09 09/14/10 09/14/11 09/13/12 09/14/13 09/14/14 09/14/15
[18,] 09/29/08 09/30/09 09/30/10 09/30/11 09/29/12 09/30/13 09/30/14 09/30/15
[19,] 10/15/08 10/16/09 10/16/10 10/16/11 10/15/12 10/16/13 10/16/14 10/16/15
[20,] 10/31/08 11/01/09 11/01/10 11/01/11 10/31/12 11/01/13 11/01/14 11/01/15
[21,] 11/16/08 11/17/09 11/17/10 11/17/11 11/16/12 11/17/13 11/17/14 11/17/15
[22,] 12/02/08 12/03/09 12/03/10 12/03/11 12/02/12 12/03/13 12/03/14 12/03/15
[23,] 12/18/08 12/19/09 12/19/10 12/19/11 12/18/12 12/19/13 12/19/14 12/19/15
[,9] [,10]
[1,] 01/01/16 01/01/17
[2,] 01/17/16 01/17/17
[3,] 02/02/16 02/02/17
[4,] 02/18/16 02/18/17
[5,] 03/05/16 03/06/17
[6,] 03/21/16 03/22/17
[7,] 04/06/16 04/07/17
[8,] 04/22/16 04/23/17
[9,] 05/08/16 05/09/17
[10,] 05/24/16 05/25/17
[11,] 06/09/16 06/10/17
[12,] 06/25/16 06/26/17
[13,] 07/11/16 07/12/17
[14,] 07/27/16 07/28/17
[15,] 08/12/16 08/13/17
[16,] 08/28/16 08/29/17
[17,] 09/13/16 09/14/17
[18,] 09/29/16 09/30/17
[19,] 10/15/16 10/16/17
[20,] 10/31/16 11/01/17
[21,] 11/16/16 11/17/17
[22,] 12/02/16 12/03/17
[23,] 12/18/16 12/19/17
I was a bit surprised at this result, since I expected a character matrix, but str shows it is a matrix of chron date elements. You can remove the apparent "matrix" structure (actually "dates" with a dimension attribute) with a call to c:
str(c(dates(sapply( seq.dates("01/01/2008", "01/01/2017", by="years") , function(x) seq.dates(x, to=x+365, by=16, length=23))) ))
'dates' num [1:230] 01/01/08 01/17/08 02/02/08 02/18/08 03/05/08 ...
- attr(*, "format")= chr "m/d/y"
- attr(*, "origin")= num [1:3] 1 1 1970
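For comparison, the same restart-every-year idea can be sketched with base R Date objects instead of chron. This swaps in a different technique from the answer above (seq on Date values rather than seq.dates), and the names years and dates_16 are chosen only for illustration. The count of 23 entries per year is taken from the answer.

```r
# Base R alternative (an assumption, not the chron-based method above):
# build 23 dates per year at 16-day steps, each year restarting on 1 January.
years <- 2008:2017

dates_16 <- do.call(c, lapply(years, function(y) {
  start <- as.Date(sprintf("%d-01-01", y))      # 1 January of year y
  seq(start, by = "16 days", length.out = 23)   # 23 dates, 16 days apart
}))

# dates_16 is a single Date vector; format() gives the m/d/y style used above.
head(format(dates_16, "%m/%d/%y"))
```

Because each year's sequence is generated separately, the entry after 18 December 2008 is 1 January 2009 rather than 3 January 2009.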

Select columns from nested lists in r

I have a list with 50 elements, and each element is a 21x2 matrix. I want to pull every first column so that I will be able to multiply the first column of each 21x2 matrix by another matrix.
Example data:
x<-replicate(50,cbind(rnorm(21,0,1),rnorm(21,1,1)))
x<-lapply(seq(dim(x)[3]), function(i) x[ , , i])
x[[1]]
[,1] [,2]
[1,] -1.00653872 1.2780327
[2,] -0.30442989 -0.6854457
[3,] -1.05715492 -0.3464085
[4,] 0.12005815 1.1885382
[5,] 0.93834177 1.4968285
[6,] 0.85975400 1.3084381
[7,] 0.91980222 -0.1580829
[8,] 0.35785346 1.7679500
[9,] -1.03510124 2.2865753
[10,] -0.74853505 0.5148834
[11,] -1.23582377 0.8514812
[12,] 0.69546075 0.8294420
[13,] 0.08527011 1.7080554
[14,] -0.81635552 0.7492530
[15,] 0.53826428 -0.3058294
[16,] 0.16545497 0.4415540
[17,] -0.27144363 0.8299643
[18,] 0.02851933 1.2673526
[19,] 1.86516449 0.3009744
[20,] -0.46998359 -0.3232826
[21,] -0.60222069 2.3836219
assign <- rep(c(0,1),times=c(10,11))
If I do
x[[1]][,1]*assign
I get what I'm looking for, but I want to be able to do this for all elements of x without a for-loop.
I tried
alt<-lapply(x, `[[`, 1)
but this only gives the first element of the first columns, whereas I want the whole vector.
Any suggestions?
Try using split to split each matrix by columns and take the first one
sapply(x, function(mat) split(mat, col(mat))[1])
You could also try simplify2array
simplify2array(x)[,1,]
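A minimal sketch of the full task from the question (extracting every first column and multiplying it by the 0/1 vector) without a for-loop is shown below. It uses an anonymous function with sapply rather than the split approach above, and it renames the question's assign vector to assign_vec, a name chosen here only to avoid masking base::assign.

```r
set.seed(1)  # only to make the example data reproducible

# Example data as in the question: a list of 50 matrices, each 21 x 2.
x <- replicate(50, cbind(rnorm(21, 0, 1), rnorm(21, 1, 1)), simplify = FALSE)

# 0/1 vector from the question (called `assign` there).
assign_vec <- rep(c(0, 1), times = c(10, 11))

# Pull the whole first column of every matrix; sapply binds them into a 21 x 50 matrix.
first_cols <- sapply(x, function(m) m[, 1])

# Multiply every column element-wise by assign_vec (the vector is recycled per column).
weighted <- first_cols * assign_vec
```

Here weighted[, 1] matches x[[1]][, 1] * assign_vec from the question. Using lapply instead of sapply would keep the results as a list of 50 vectors.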

system is computationally singular error

I am using the fastICA function from the fastICA package in R. With n.comp set to 2 it works fine, but it fails when I set this parameter to 3 or more:
ica<-fastICA(datalist,n.comp=3)
Here datalist is a matrix with 20 rows and 4 columns:
[,1] [,2] [,3] [,4]
[1,] 567.00 324.225 281.0889 538.25
[2,] 557.75 317.500 269.5556 529.15
[3,] 543.75 309.900 264.5778 515.95
[4,] 557.00 316.225 265.0889 528.25
[5,] 538.25 307.750 266.6667 510.95
[6,] 531.25 301.025 250.0222 503.70
[7,] 545.00 311.800 270.9333 517.40
[8,] 550.00 316.925 284.3778 522.65
[9,] 514.75 290.300 235.6000 487.75
[10,] 518.00 293.800 245.1556 491.20
[11,] 553.75 318.125 281.6667 526.00
[12,] 563.50 325.925 297.2667 535.75
[13,] 540.00 303.300 241.1556 511.40
[14,] 546.00 310.350 261.6444 517.90
[15,] 567.25 324.425 281.4889 538.50
[16,] 577.75 330.125 285.2222 548.40
[17,] 560.75 317.425 262.3778 531.60
[18,] 570.00 323.925 272.8222 540.65
[19,] 569.00 324.700 278.8444 540.00
[20,] 565.50 324.150 284.1333 537.00
I am getting this error:
Error in solve.default(w %*% t(w)) :
system is computationally singular: reciprocal condition number = 1.16873e-16
Could you please tell me why I am getting this error and how I can solve it?
In solve(), use a smaller tolerance, like solve(..., tol = 1e-17).
This should be fine here, since the reported reciprocal condition number is 1.16873e-16: that is below the default tolerance (.Machine$double.eps, about 2.2e-16) but above 1e-17.
More info in the help file and this related question.
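The effect of the tol argument can be illustrated on a small toy matrix. This is only a sketch: the matrix m below is made up for the illustration and is unrelated to datalist, and the solve() call that fails in the question sits inside fastICA, so this shows what the argument does rather than how to pass it through fastICA.

```r
# Toy ill-conditioned matrix (hypothetical, not the question's data):
# its reciprocal condition number is about 1e-17.
m <- diag(c(1, 1e-17))

# With the default tolerance (.Machine$double.eps) solve() refuses to invert it.
default_attempt <- try(solve(m), silent = TRUE)
inherits(default_attempt, "try-error")

# With a tolerance below the reciprocal condition number the inverse is returned.
m_inv <- solve(m, tol = 1e-18)
m_inv
```

Lowering the tolerance only suppresses the check; the matrix remains nearly singular, so the resulting inverse can be numerically unreliable.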
