Memory error when applying logistf() - r

I am trying to apply logistf() on a dataframe (dim:11359x139). All the variables are binary. I get the following message:
"Error in logistf.fit(x = x, y = y, weight = weight, offset = offset, firth, :
no memory available".
Even if I use only 20 rows and all 139 predictors of the dataframe, I get the same error. Is it a hardware issue or my fault?

Related

Increasing Max Weights in neural network

I have a very large data set with around 700000 rows and 26 predictor variables. My neural network cannot run on all of it because the default max weights is set to 100. I am trying to use MaxNWts to set it to 10000, but this gives me the following error. What can I do so it runs, even if it takes a bit longer?
trainingData=sample(1:nrow(data),0.7*nrow(data))
predictors=c(1:21,23:27)
myNet=nnet.formula(data[,22]~ data$ï..PatientMRN+data$IsNewToProvider+data$IsOverbooked+data$IsOverride+data$AppointmentDateandTime+data$VisitType+data$DayofWeek+data$DepartmentSpecialty+data$LeadDays,data=data, subset=trainingData, size=7, MaxNWts = 10000)
I forgot to include the error message I get:
Error in nnet.default(x, y, w, ...) :
NA/NaN/Inf in foreign function call (arg 2)
In addition: Warning message:
In nnet.default(x, y, w, ...) : NAs introduced by coercion
Instead of trying to change the max weight, you should first normalize your data column-wise (before splitting into train and test!) so that the maximum value of each column is 1.0. Since all values in the table will then satisfy 0 <= value <= 1.0, your weights won't be huge.
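A minimal sketch of the column-wise scaling described above, using base R only. The data frame and its column names are made up for illustration, and min_max is a hypothetical helper, not part of nnet. It assumes the columns are numeric; factor or character columns would need to be encoded first.

```r
# Hypothetical toy data; the column names are illustrative only
df <- data.frame(lead_days = c(0, 5, 10, 20),
                 visits    = c(1, 3, 2, 5))

# Min-max scale one numeric vector to [0, 1].
# A constant column is mapped to all zeros to avoid dividing by zero.
min_max <- function(v) {
  rng <- range(v, na.rm = TRUE)
  if (rng[1] == rng[2]) return(v - rng[1])
  (v - rng[1]) / (rng[2] - rng[1])
}

# Scale every numeric column, leaving other columns untouched
num_cols <- vapply(df, is.numeric, logical(1))
scaled <- df
scaled[num_cols] <- lapply(df[num_cols], min_max)

scaled
```

After this step every numeric column lies between 0 and 1, and the train/test split can be taken from scaled instead of the raw data.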

argument "x" is missing, with no default in ezANOVA

I am getting a weird problem with ezANOVA. When I try to execute code below it says that some data is missing, but when I look at the data, nothing is missing.
model_acc <- ezANOVA(data = nback_acc_summary[complete.cases(nback_acc_summary),],
                     dv = Stimulus1.ACC,
                     wid = Subject,
                     within = c(ExperimentName, Target),
                     between = Group,
                     type = 3,
                     detailed = T)
When I run these lines I get an error message that says:
Error in ezANOVA_main(data = data, dv = dv, wid = wid, within = within, :
One or more cells is missing data. Try using ezDesign() to check your data.
Then I run
ezDesign(nback_acc_summary)
And get the message:
Error in as.list(c(x, y, row, col)) :
argument "x" is missing, with no default
I am not sure what to change in the code, because I can't figure out what the problem is. I've researched the issue online, and it seems that quite a lot of users have encountered it before, but very few solutions have been posted. I would be grateful for any kind of help.
Thanks!
For an ANOVA model you must have observations in all conditions created by the design of your model.
For example, if ExperimentName, Target, and Group each have two levels, you have 2 x 2 x 2 = 8 conditions, each of which requires multiple observations. On top of that, your model is repeated measures, which means that each Subject within a level of your between factor Group must have observations for all of the within conditions (i.e., ExperimentName x Target = 2 x 2 = 4).
The first error suggests you have fallen short of having enough data in the conditions suggested by your model.
The following should produce a plot to help identify which conditions are missing data:
ezDesign(
    data = nback_acc_summary[complete.cases(nback_acc_summary), ],
    x = Target,
    y = Subject,
    row = ExperimentName,
    col = Group
)
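If a plot is hard to read, the same check can be sketched numerically with base R's table(). The data below is a made-up stand-in for nback_acc_summary (only the column names are borrowed from the question), built so that one subject lacks one within-subject cell.

```r
# Hypothetical long-format data: 3 subjects x 2 x 2 within-subject cells
d <- expand.grid(Subject = factor(1:3),
                 ExperimentName = c("A", "B"),
                 Target = c("yes", "no"))

# Drop one cell for subject 3 to mimic the "missing data" situation
d <- d[!(d$Subject == 3 & d$ExperimentName == "B" & d$Target == "no"), ]

# Number of observations per Subject x ExperimentName x Target cell
cell_counts <- table(d$Subject, d$ExperimentName, d$Target)

# Subjects with at least one empty within-subject cell
incomplete <- names(which(apply(cell_counts == 0, 1, any)))
incomplete
```

Any subject listed in incomplete has at least one empty cell, which is the situation the first error message is pointing at.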

R won't train my data set

I am currently trying to train a model on my data, but can't seem to get R to work as I want.
The data consists of 400 handwritten digits, with 18x18 pixels extracted for each digit. So in total there are 400 x 324 data points as training data.
> class(train_data)
[1] "data.frame"
> str(train_data)
'data.frame': 400 obs. of 324 variables:
The code used for training is this
control = trainControl(method="cv",
                       number = 1,
                       repeats=0,
                       p = 0.9,
                       preProcOptions = list(thresh = 0.8),
                       )
knnFit = train(x=train_data,
               y=factor(testClass[1:400]),
               method ='knn',
               trControl = control,
               preProcess = c('PCA')
               )
The problem is that when I run train, I get an error message that I am not able to decipher.
The error message is:
Error in train.default(x = train_data, y = factor(testClass[1:400]), method = "knn", :
Stopping
In addition: Warning message:
In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
There were missing values in resampled performance measures.
Just judging from the error, one would assume that there are NAs in the training set.
If you run sum(is.na(train_data)), you should be able to see whether there are missing values in your set. If there are, then you could use the same command column-by-column to figure out where they're coming from.
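A small sketch of that check in base R. The train_data below is a tiny made-up stand-in (three columns instead of 324), so the values are illustrative only.

```r
# Hypothetical stand-in for the real 400 x 324 training set
train_data <- data.frame(p1 = c(0.1, NA, 0.3),
                         p2 = c(1, 2, 3),
                         p3 = c(NA, NA, 0))

# Total number of missing values in the whole data frame
total_na <- sum(is.na(train_data))

# Missing values per column, computed in one pass
na_per_col <- colSums(is.na(train_data))

# Keep only the columns that actually contain NAs
na_per_col[na_per_col > 0]
```

colSums(is.na(...)) does the column-by-column count in a single call, which is convenient when there are hundreds of pixel columns.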

How would I run an ANOVA in R on this long form data?

My factors are constraint (high or low), picture type (a,b,c,d), and electrode (29 values).
My dependent variable is the amplitude measured from the electrodes. So it is 2 x 4 x 29. How can I run the ANOVA so that R does not think that each of the electrodes is another measurement but included in the 'electrode' factor?
This is what I have tried so far, but I get an error:
anova1 <- ezANOVA(data=dat, dv=n_400, wid=subject, within=.(constraint, ending, electrode), type="III")
>Error in ezANOVA_main(data = data, dv = dv, wid = wid, within = within, :
One or more cells is missing data. Try using ezDesign() to check your data.

RandomForest error code

I am trying to run a rather simple randomForest. I keep having an error code that does not make any sense to me. See code below.
test.data<-data.frame(read.csv("test.RF.data.csv",header=T))
attach(test.data)
head(test.data)
Depth<-Data1
STemp<-Data2
FPT<-Sr_hr_15
Stage<-stage_feet
Q<-discharge_m3s
V<-vel_ms
Turbidity<-turb_ntu
Day_Night<-day_night
FPT.rf <- randomForest(FPT ~ Depth + STemp + Q + V + Stage + Turbidity + Day_Night, data = test.data,mytry=1,importance=TRUE,na.action=na.omit)
Error in randomForest.default(m, y, ...) : data (x) has 0 rows
In addition: Warning message:
In randomForest.default(m, y, ...) :
The response has five or fewer unique values. Are you sure you want to do regression?
I then check the dimensions to ensure that R does in fact recognize the data:
dim(test.data)
[1] 77 15
This is a subset of the complete data set I ran just to test if I could get it to run since I got the same error with the complete data set.
Why is it telling me data (x) has 0 rows when there clearly are rows?
Thanks
