Mode Constrained likelihood maximisation using alabama package in R - r

I have been trying to minimize the function \sum n_i*log(p_i) where p_i's are the unknown probability parameters, satisfying usual constraints, namely p_i>0 and \sum(p_i)=1.
Further I have constraints of the form p_1<=p_2<=...<=p_(11)>=p_(12)>=...>=p(21). I plan to use the alabama package in R for the above. I wrote out a code -
############## Generating multinomial with specified probability ##################
p=seq(0.01,0.01+9*0.007,0.007)
pr=c(p,(1-2*sum(p)),rev(p))
y=sample(21,1000,prob=pr,replace=T)
############### Obtaining Frequency distribution #################
freq=numeric(21)
for(i in 1:21)
freq[i]=sum(as.numeric(y==i))
############# Negative Log-likelihood #################
fn=function(x)
{
sum=0
for(i in 1:21)
sum=sum+freq[i]*log(x[i])
return(-sum)
}
############# Gradient vector #################
gr=function(x)
{
g=numeric(length(x))
for(i in 1:length(x))
g[i]=freq[i]/x[i]
return(g)
}
############# Equality constraints ################
heq=function(x)
{
h=rep(0,1)
h=sum(x)-1
return(h)
}
heq.jac=function(x)
{
j=matrix(0,1,length(x))
j[1,]=rep(1,length(x))
return(j)
}
############## Inequality constraints ################
hin=function(x)
{
h=rep(0,41)
for(i in 1:21)
{
h[i+20]=x[i]
}
for(i in 1:20)
{
if(i<=10)
h[i]=x[i+1]-x[i]
if(i>10)
h[i]=x[i]-x[i+1]
}
return(h)
}
hin.jac=function(x)
{
j=matrix(0,41,21)
for(i in 1:21)
j[(i+20),i]=1
for(i in 1:20)
{
if(i<=10)
{
j[i,(i+1)]=1
j[i,i]=-1
}
if(i>10)
{
j[i,i]=1
j[i,(i+1)]=-1
}
}
return(j)
}
################ Function call ##################
init=runif(21,0,1)
est=auglag(par=init, fn=fn, gr=gr, heq=heq, heq.jac=heq.jac, hin=hin, hin.jac=hin.jac)
Now the above snippet doesn't work. The algorithm does not converge and instead yields warnings, namely
In log(x[i]) : NaNs produced
Can you please help me figure out where I am going wrong? Thank you.
P.S. I do understand there are some redundant constraints in the code.

Related

need help to improve efficiency of my code in R

I am trying to study different estimators and trying do their simulation. However, my current code is too slow, and I would like to make it more efficient. Below is the relevant excerpt:
covsn<-function(x,y){
v=rep(0,n)
for(i in 1:n)
{
if(n%%2==1){
m0=floor(n/2)
} else{
m0=floor((n+1)/2)
}
x_other=x[-i]
y_other=y[-i]
v[i]=sort((x[i]-x_other)*(y[i]-y_other),decreasing=F)[c(m0)]
}
m1=floor((n+1)/2)
return(sncov=sort(v)[c(m1)])
}
M=matrix(0,p,p)
D=matrix(0,p,p)
for(i in 1:p)
{
for(j in 1:p)
{
M[i,j]=covsn(X[,i],X[,j])
if(i==j){
library(robustbase)
d=1/Sn(X[,i], constant=1.196)
D[i,j]=d
}}}

read data from a file rjags

My data file looks something like this:
list(y=structure(.Data=c(26, 228, 31, ...)), .Dim=c(413,9))
Let's say this file is saved as "data.txt".
If I'm working in 'R2OpenBUGS', it allows me to pass the data as a file with no problem:
mcmc <- bugs(data = "data.txt", inits=...)
But in JAGS, if I pass data as "data.txt", it says: "data must be a list or environment". What's the problem here? Also, if there is no way around it, is there a way I can read the data as list in R?
My model is:
model {
for (i in 1:413) {
for (j in 1:9) {
logy[i,j] <- log(y[i,j])
logy[i,j] ~ dnorm(m[i], s)
}
}
# priors
for (i in 1:413) {
m[i] ~ dgamma(0.001, 0.001)
}
s ~ dgamma(0.001, 0.001)
}
From the JAGS user manual
7.0.4 Data transformations
JAGS allows data transformations, but the syntax is different from BUGS.
BUGS allows you to put a stochastic node twice on the left hand side of a relation, as in this example taken from the manual
for (i in 1:N) {
z[i] <- sqrt(y[i])
z[i] ~ dnorm(mu, tau)
}
This is forbidden in JAGS. You must put data transformations in a separate block of relations preceded by the keyword data:
data {
for (i in 1:N) {
z[i] <- sqrt(y[i])
}
}
model {
for (i in 1:N) {
z[i] ~ dnorm(mu, tau)
}
...
}

R GP Implementation Error

So I am very new to R. I was having some trouble importing data with Mathematica, so I decided to make a switch since R is much better suited to analytics. I'm building a few machine learning techniques to do analysis on the data that I can now import. This is a genetic programming implementation that when finished should do symbolic regression on some data. Other than the errors, the script should be almost complete (I need to program the composition operator, make the division protected, and finish the list of base functions). I had a previous problem while programming the script that was resolved (R Error Genetic Programming Implementation). I've been debugging the script for about a day and I'm all out of ideas.
My error message is:
Error in makeStrName(nextGen) : object 'nextGen' not found
>
> #Print the string versions of the five functions with the lowest RMSE evolved.
> byRMSEList<-sortByRMSE(populationsBestTenStr)
Error: object 'totalTwo' not found
> for(i in 1:5)
+ {
+ byRMSEList[[i]]
+ }
Error: object 'byRMSEList' not found
Here is my script. I am currently using RStudio. Thanks for taking the time to help:
library("datasets")
operators<-list("+","*","-","/","o")
funcs<-list("x","log(x)","sin(x)","cos(x)","tan(x)")
#Allows me to map a name to each element in a numerical list.
makeStrName<-function(listOfItems)
{
for(i in 1:length(listOfItems))
{
names(listOfItems)[i]=paste("x",i,sep="")
}
return(listOfItems)
}
#Allows me to replace each random number in a vector with the corresponding
#function in a list of functions.
mapFuncList<-function(funcList,rndNumVector)
{
for(i in 1:length(funcList))
{
rndNumVector[rndNumVector==i]<-funcList[i]
}
return(rndNumVector)
}
#Will generate a random function from the list of functions and a random sample.
generateOrganism<-function(inputLen,inputSeed, funcList)
{
set.seed(inputSeed)
rnd<-sample(1:length(funcList),inputLen,replace=T)
Org<-mapFuncList(funcList,rnd)
return(Org)
}
#Will generate a series of "Organisms"
genPopulation<-function(popSize,initialSeed,initialSize,functions)
{
population<-list()
for(i in 1:popSize)
{
population <- c(population,generateOrganism(initialSize,initialSeed+i,functions))
}
populationWithNames<-makeStrName(population)
return(populationWithNames)
}
#Turns the population of functions (which are actually strings in "") into
#actual functions. (i.e. changes the mode of the list from string to function).
funCreator<-function(snippet)
{
txt=snippet
function(x)
{
exprs <- parse(text = txt)
eval(exprs)
}
}
#Applies a fitness function to the population. Puts the best organism in
#the hallOfFame.
evalPopulation<-function(populationFuncList, inputData, outputData, populationStringList)
{
#rmse <- sqrt( mean( (sim - obs)^2))
for(i in 1:length(populationStringList))
{
stringFunc<-populationStringList[[i]]
total<-list(mode="numeric",length=length(inputData))
topTenPercentFunctionList<-list()
topTenPercentRMSEList<-list()
topTenPercentStringFunctionList<-list()
tempFunc<-function(x){x}
for(z in 1:length(inputData))
{
total<-c(total,(abs(populationFuncList[[i]](inputData[[z]])-outputData[[z]])))
tempFunc<-populationFuncList[[i]]
}
rmse<-sqrt(mean(total*total))
topTenPercentVal<-length(populationFuncList)*0.1
if(length(topTenPercentFunctionList)<topTenPercentVal||RMSE<min(topTenPercentRMSEList))
{
topTenPercentStringFunctionList<-c(topTenPercentStringFunctionList,stringFunc)
topTenPercentRMSEList<-c(topTenPercentRMSEList, rmse)
topTenPercentFunctionList<-c(topTenPercentFunctionList, tempFunc)
}
}
return(topTenPercentStringFunctionList)
}
#Get random operator
getRndOp<-function(seed)
{
set.seed(seed)
rndOpNum<-sample(1:length(operators),1,replace=T)
operation<-operators[[rndOpNum]]
return(operation)
}
#Mutation Operators
#This attaches a new appendage to an organism
endNodeMutation<-function(strFunc,seed)
{
op<-getRndOp(seed)
strFunc<-c(strFunc,op)
newAppendage<-generateOrganism(1,seed+2,funcs)
strFunc<-c(strFunc,newAppendage)
return(strFunc)
}
#This is a mutation that occurs at a random locaiton in an organism
rndNodeMutation<-function(strFunc,seed,secondSeed)
{
op<-getRndOp(seed)
halfStrFunc<-((length(strFunc))/2)
set.seed(seed)
randomStart<-sample(1:halfStrFunc,1,replace=T)
set.seed(secondSeed)
randomEnd<-2*(sample(1:length(halfStrFunc),1,replace=T))
strFuncUpdate<-substr(strFunc,randomStart,randomEnd)
strFuncUpdate<-c(strFuncUpdate,op)
newAppendage<-generateOrganism(1,seed+2,funcs)
strFuncUpdate<-c(strFuncUpdate,newAppendage)
return(strFuncUpdate)
}
#Crossover Operators
#Crossover operator that attaches otherStrFunc to strFunc at the endpoint of strFunc
crossoverConcatenationOperator<-function(strFunc,otherStrFunc)
{
newStrFunc<-c(strFunc,otherStrFunc)
return(newStrFunc)
}
#Crossover Operation that starts and ends at random points in the concatenation
randomCrossoverOperator<-function(strFunc,otherStrFunc,seed,secondSeed)
{
set.seed(seed)
wholeLength<-(length(strFunc)+length(otherStrFunc))
startRndNum<-sample(1:length(strFunc),1,replace=T)
set.seed(secondSeed)
endRndNum<-sample(length(strFunc):wholeLength,1,replace=T)
concatenatedFunc<-c(strFunc,otherStrFunc)
newFunc<-substr(concatenatedFunc,startRndNum,endRndNum)
return(newFunc)
}
evolve<-function(strFuncList,tenPercentStrFuncList)
{
#Detach the bottom ninety percent to the top ten percent
evolveList<-substr(strFuncList,length(tenPercentStrFuncList),length(strFuncList))
#Get sizes. Will use a random mutation, then random crossover, then
#random mutation, then random crossover at percentages with 0.05,0.45,0.05,0.45
#respectively
size<-length(evolveList)
mutateNum<-0.1*size
crossoverNum<-0.9*size
halfMutateNum<-0.05*size
halfCrossoverNum<-0.45*size
roundedMutateNum<-floor(mutateNum)
roundedCrossoverNum<-floor(crossoverNum)
roundedHalfMutateNum<-floor(halfMutateNum)
roundedHalfCrossoverNum<-floor(halfCrossoverNum)
#Calls the functions for those percentage of organisms in that order
for(i in 1:roundedHalfMutateNum)
{
set.seed(i)
rndOne<-sample(0:1000,1,replace=T)
set.seed(i+10000)
rndTwo<-sample(0:10000,1,replace=T)
newFunc<-rndNodeMutation(evolveList[[i]],rndOne,rndTWo)
evolveList[[i]]<-newFunc
}
for (i in roundedHalfMutateNum:(roundedHalfCrossoverNum+roundedHalfMutateNum))
{
set.seed(i)
rndOne<-sample(0:1000,1,replace=T)
set.seed(i+10000)
rndTwo<-sample(0:10000,1,replace=T)
newFunc<-rndCrossoverOperation(evolveList[[i]],evolveList[[i+1]],rndOne,rndTwo)
firstSubstr<-substr(evolveList,1,i-1)
secondSubstr<-substr(evolveLIst,i+2,length(evolveList))
halfSubstr<-c(firstSubstr,newFunc)
evolveList<-c(halfSubstr,secondSubstr)
}
for(i in (roundedHalfCrossoverNum+roundedHalfMutateNum):(roundedHalfCrossoverNum+roundedMutateNum))
{
set.seed(i)
rndOne<-sample(0:1000,1,replace=T)
set.seed(i+10000)
rndTwo<-sample(0:10000,1,replace=T)
newFunc<-rndNodeMutation(evolveList[[i]],rndOne,rndTWo)
evolveList[[i]]<-newFunc
}
for(i in (roundedHalfCrossoverNum+roundedMutateNum):(roundedCrossoverNum+roundedHalfMutateNum))
{
set.seed(i)
rndOne<-sample(0:1000,1,replace=T)
set.seed(i+10000)
rndTwo<-sample(0:10000,1,replace=T)
newFunc<-rndCrossoverOperation(evolveList[[i]],evolveList[[i+1]],rndOne,rndTwo)
firstSubstr<-substr(evolveList,1,i-1)
secondSubstr<-substr(evolveLIst,i+2,length(evolveList))
halfSubstr<-c(firstSubstr,newFunc)
evolveList<-c(halfSubstr,secondSubstr)
}
}
#Calculates the root mean squared of the functions in a string list.
#Then sorts the list by RMSE.
sortByRMSE<-function(strL)
{
for (z in 1:length(strL))
{
for(i in 1:length(strL))
{
nonStrFuncList<-lapply(strL,function(x){funCreator(x)})
totalTwo<-c(totalTwo,(abs(nonStrFuncList[[z]](inputData[[i]])-outputData[[i]])))
}
rmse<-sqrt(mean(totalTwo*totalTwo))
strFuncsLists<-strL[order(sapply(strL, '[[', rmse))]
}
return(strFuncsLists)
}
#Data, Output Goal
desiredFuncOutput<-list(1,4,9,16,25)
dataForInput<-list(1,2,3,4,5)
#Generate Initial Population
POpulation<-genPopulation(4,1,1,funcs)
POpulationFuncList <- lapply(setNames(POpulation,names(POpulation)),function(x){funCreator(x)})
#Get and save top ten percent in bestDudes
bestDudes<-evalPopulation(POpulationFuncList,dataForInput,desiredFuncOutput,POpulation)
#Evolve the rest
NewBottomNinetyPercent<-evolve(POpulation,bestDudes)
#Concatenate the two to make a new generation
nextGen<-c(bestDudes,NewBottomNinetyPercent)
#Declare lists,
populationsBestTenStr<-list()
populationsFuncList<-list()
#Run ten generations.
for(i in 1:10)
{
nextGen<-makeStrName(nextGen)
populationsFuncList<-lapply(setNames(nextGen,names(nextGen)),function(x){funCreator(x)})
populationsBestTenStr<-evalPopulation(populationsFuncList,dataForInput,desiredFuncOutput,nextGen)
nextGen<-evolve(populations,populationsBestTenStr)
}
#Print the string versions of the five functions with the lowest RMSE evolved.
byRMSEList<-sortByRMSE(populationsBestTenStr)
for(i in 1:5)
{
byRMSEList[[i]]
}
library("datasets")
operators<-list("+","*","-","/","o")
funcs<-list("x","log(x)","sin(x)","cos(x)","tan(x)")
# Fixed:
# evolveLIst inconsistently typed as evolveList
# rndCrossoverOperation inconsistently typed as randomCrossoverOperator
# rndTWo inconsistently typed as rndTwo
# broken substr
# broken condition leading to for(i in 1:0)
# misc. others
#Allows me to map a name to each element in a numerical list.
makeStrName<-function(listOfItems)
{
for(i in 1:length(listOfItems))
{
names(listOfItems)[i]=paste("x",i,sep="")
}
return(listOfItems)
}
#Allows me to replace each random number in a vector with the corresponding
#function in a list of functions.
mapFuncList<-function(funcList,rndNumVector)
{
for(i in 1:length(funcList))
{
rndNumVector[rndNumVector==i]<-funcList[i]
}
return(rndNumVector)
}
#Will generate a random function from the list of functions and a random sample.
generateOrganism<-function(inputLen,inputSeed, funcList)
{
set.seed(inputSeed)
rnd<-sample(1:length(funcList),inputLen,replace=T)
Org<-mapFuncList(funcList,rnd)
return(Org)
}
#Will generate a series of "Organisms"
genPopulation<-function(popSize,initialSeed,initialSize,functions)
{
population<-list()
for(i in 1:popSize)
{
population <- c(population,generateOrganism(initialSize,initialSeed+i,functions))
}
populationWithNames<-makeStrName(population)
return(populationWithNames)
}
#Turns the population of functions (which are actually strings in "") into
#actual functions. (i.e. changes the mode of the list from string to function).
funCreator<-function(snippet)
{
txt=snippet
function(x)
{
exprs <- parse(text = txt)
eval(exprs)
}
}
#Applies a fitness function to the population. Puts the best organism in
#the hallOfFame.
evalPopulation<-function(populationFuncList=POpulationFuncList, inputData=dataForInput, outputData=desiredFuncOutput,
populationStringList=POpulation)
{
#rmse <- sqrt( mean( (sim - obs)^2))
for(i in 1:length(populationStringList))
{
stringFunc<-populationStringList[[i]]
total<-as.numeric(length(inputData))
topTenPercentFunctionList<-list()
topTenPercentRMSEList<-list()
topTenPercentStringFunctionList<-list()
tempFunc<-function(x){x}
for(z in 1:length(inputData))
{
total<-c(total,(abs(populationFuncList[[i]](inputData[[z]])-outputData[[z]])))
tempFunc<-populationFuncList[[i]]
}
rmse<-sqrt(mean(total^2))
topTenPercentVal<-length(populationFuncList)*0.1
if(length(topTenPercentFunctionList)<topTenPercentVal||RMSE<min(topTenPercentRMSEList))
{
topTenPercentStringFunctionList<-c(topTenPercentStringFunctionList,stringFunc)
topTenPercentRMSEList<-c(topTenPercentRMSEList, rmse)
topTenPercentFunctionList<-c(topTenPercentFunctionList, tempFunc)
}
}
return(topTenPercentStringFunctionList)
}
#Get random operator
getRndOp<-function(seed)
{
set.seed(seed)
rndOpNum<-sample(1:length(operators),1,replace=T)
operation<-operators[[rndOpNum]]
return(operation)
}
#Mutation Operators
#This attaches a new appendage to an organism
endNodeMutation<-function(strFunc,seed)
{
op<-getRndOp(seed)
strFunc<-c(strFunc,op)
newAppendage<-generateOrganism(1,seed+2,funcs)
strFunc<-c(strFunc,newAppendage)
return(strFunc)
}
#This is a mutation that occurs at a random locaiton in an organism
rndNodeMutation<-function(strFunc,seed,secondSeed)
{
op<-getRndOp(seed)
halfStrFunc<-((length(strFunc))/2)
set.seed(seed)
randomStart<-sample(1:halfStrFunc,1,replace=T)
set.seed(secondSeed)
randomEnd<-2*(sample(1:length(halfStrFunc),1,replace=T))
strFuncUpdate<-substr(strFunc,randomStart,randomEnd)
strFuncUpdate<-c(strFuncUpdate,op)
newAppendage<-generateOrganism(1,seed+2,funcs)
strFuncUpdate<-c(strFuncUpdate,newAppendage)
return(strFuncUpdate)
}
#Crossover Operators
#Crossover operator that attaches otherStrFunc to strFunc at the endpoint of strFunc
crossoverConcatenationOperator<-function(strFunc,otherStrFunc)
{
newStrFunc<-c(strFunc,otherStrFunc)
return(newStrFunc)
}
#Crossover Operation that starts and ends at random points in the concatenation
rndCrossoverOperation<-function(strFunc,otherStrFunc,seed,secondSeed) # fixed function name
{
set.seed(seed)
wholeLength<-(length(strFunc)+length(otherStrFunc))
startRndNum<-sample(1:length(strFunc),1,replace=T)
set.seed(secondSeed)
endRndNum<-sample(length(strFunc):wholeLength,1,replace=T)
concatenatedFunc<-c(strFunc,otherStrFunc)
newFunc<-substr(concatenatedFunc,startRndNum,endRndNum)
return(newFunc)
}
evolve<-function(strFuncList=POpulation,tenPercentStrFuncList=bestDudes)
{
#Detach the bottom ninety percent to the top ten percent
evolveList<-strFuncList[!strFuncList %in% tenPercentStrFuncList] # fixed broken substring
#Get sizes. Will use a random mutation, then random crossover, then
#random mutation, then random crossover at percentages with 0.05,0.45,0.05,0.45
#respectively
size<-length(evolveList)
mutateNum<-0.1*size
crossoverNum<-0.9*size
halfMutateNum<-0.05*size
halfCrossoverNum<-0.45*size
roundedMutateNum<-floor(mutateNum)
roundedCrossoverNum<-floor(crossoverNum)
roundedHalfMutateNum<-floor(halfMutateNum)
roundedHalfCrossoverNum<-floor(halfCrossoverNum)
#Calls the functions for those percentage of organisms in that order
if(roundedHalfMutateNum < 1) roundedHalfMutateNum <- 1
for(i in 1:roundedHalfMutateNum)
{
set.seed(i)
rndOne<-sample(0:1000,1,replace=T)
set.seed(i+10000)
rndTwo<-sample(0:10000,1,replace=T)
newFunc<-rndNodeMutation(evolveList[[i]],rndOne,rndTwo) # fixed case
evolveList[[i]]<-newFunc
}
for (i in roundedHalfMutateNum:(roundedHalfCrossoverNum+roundedHalfMutateNum))
{
set.seed(i)
rndOne<-sample(0:1000,1,replace=T)
set.seed(i+10000)
rndTwo<-sample(0:10000,1,replace=T)
newFunc<-rndCrossoverOperation(evolveList[[i]],evolveList[[i+1]],rndOne,rndTwo)
firstSubstr<-substr(evolveList,1,i-1)
secondSubstr<-substr(evolveList,i+2,length(evolveList))
halfSubstr<-c(firstSubstr,newFunc)
evolveList<-c(halfSubstr,secondSubstr)
}
for(i in (roundedHalfCrossoverNum+roundedHalfMutateNum):(roundedHalfCrossoverNum+roundedMutateNum))
{
set.seed(i)
rndOne<-sample(0:1000,1,replace=T)
set.seed(i+10000)
rndTwo<-sample(0:10000,1,replace=T)
newFunc<-rndNodeMutation(evolveList[[i]],rndOne,rndTwo)
evolveList[[i]]<-newFunc
}
for(i in (roundedHalfCrossoverNum+roundedMutateNum):(roundedCrossoverNum+roundedHalfMutateNum))
{
set.seed(i)
rndOne<-sample(0:1000,1,replace=T)
set.seed(i+10000)
rndTwo<-sample(0:10000,1,replace=T)
newFunc<-rndCrossoverOperation(evolveList[[i]],evolveList[[i+1]],rndOne,rndTwo)
firstSubstr<-substr(evolveList,1,i-1)
secondSubstr<-substr(evolveList,i+2,length(evolveList))
halfSubstr<-c(firstSubstr,newFunc)
evolveList<-c(halfSubstr,secondSubstr)
}
}
#Calculates the root mean squared of the functions in a string list.
#Then sorts the list by RMSE.
sortByRMSE<-function(strL)
{
for (z in 1:length(strL))
{
for(i in 1:length(strL))
{
nonStrFuncList<-lapply(strL,function(x){funCreator(x)})
totalTwo<-c(totalTwo,(abs(nonStrFuncList[[z]](inputData[[i]])-outputData[[i]])))
}
rmse<-sqrt(mean(totalTwo*totalTwo))
strFuncsLists<-strL[order(sapply(strL, '[[', rmse))]
}
return(strFuncsLists)
}
#Data, Output Goal
desiredFuncOutput<-list(1,4,9,16,25)
dataForInput<-list(1,2,3,4,5)
#Generate Initial Population
POpulation<-genPopulation(4,1,1,funcs)
POpulationFuncList <- lapply(setNames(POpulation,names(POpulation)),function(x){funCreator(x)})
#Get and save top ten percent in bestDudes
bestDudes<-evalPopulation(POpulationFuncList,dataForInput,desiredFuncOutput,POpulation)
#Evolve the rest
NewBottomNinetyPercent<-evolve(POpulation,bestDudes)
#Concatenate the two to make a new generation
nextGen<-c(bestDudes,NewBottomNinetyPercent)
#Declare lists,
populationsBestTenStr<-list()
populationsFuncList<-list()
#Run ten generations.
for(i in 1:10)
{
nextGen<-makeStrName(nextGen)
populationsFuncList<-lapply(setNames(nextGen,names(nextGen)),function(x){funCreator(x)})
populationsBestTenStr<-evalPopulation(populationsFuncList,dataForInput,desiredFuncOutput,nextGen)
nextGen<-evolve(populations,populationsBestTenStr)
}
#Print the string versions of the five functions with the lowest RMSE evolved.
byRMSEList<-sortByRMSE(populationsBestTenStr)
for(i in 1:5)
{
byRMSEList[[i]]
}

R Error Genetic Programming Implementation

So I am brand new to R. I started learning it yesterday, because there's some data that is being very resistant to automatically importing into Mathematica and Python. I'm building a few machine learning techniques to do analysis on the data that I can now import with R. This is a genetic programming implementation that when finished should do symbolic regression on some data. (I have yet to create the mutation or crossover operators, build a legit function list, etc). I get two errors when I run the script:
> Error: attempt to apply non-function
> print(bestDude)
> Error in print(bestDude) : object 'bestDude' not found
This is my code:
library("datasets")
#Allows me to map a name to each element in a numerical list.
makeStrName<-function(listOfItems)
{
for(i in 1:length(listOfItems))
{
names(listOfItems)[i]=paste("x",i,sep="")
}
return(listOfItems)
}
#Allows me to replace each random number in a vector with the corresponding
#function in a list of functions.
mapFuncList<-function(funcList,rndNumVector)
{
for(i in 1:length(funcList))
{
replace(rndNumVector, rndNumVector==i,funcList[[i]])
}
return(rndNumVector)
}
#Will generate a random function from the list of functions and a random sample.
generateOrganism<-function(inputLen,inputSeed, functions)
{
set.seed(inputSeed)
rnd<-sample(1:length(functions),inputLen,replace=T)
Org<-mapFuncList(functions,rnd)
return(Org)
}
#Will generate a series of "Organisms"
genPopulation<-function(popSize,initialSeed,initialSize,functions)
{
population<-list("null")
for(i in 2:popSize)
{
population <- c(population,generateOrganism(initialSize,initialSeed, functions))
initialSeed <- initialSeed+1
}
populationWithNames<-makeStrName(population)
return(populationWithNames)
}
#Turns the population of functions (which are actually strings in "") into
#actual functions. (i.e. changes the mode of the list from string to function).
populationFuncList<-function(Population)
{
Population[[1]]<-"x"
funCreator<-function(snippet)
txt=snippet
function(x)
{
exprs <- parse(text = txt)
eval(exprs)
}
listOfFunctions <- lapply(setNames(Population,names(Population)),function(x){funCreator(x)})
return(listOfFunctions)
}
#Applies a fitness function to the population. Puts the best organism in
#the hallOfFame.
evalPopulation<-function(populationFuncList, inputData,outputData)
{
#rmse <- sqrt( mean( (sim - obs)^2))
hallOfFame<-list(1000000000)
for(i in 1:length(populationFuncList))
{
total<-list()
for(z in 1:length(inputData))
{
total<-c(total,(abs(populationFuncList[[i]](inputData[[z]])-outputData[[z]])))
}
rmse<-sqrt(mean(total*total))
if(rmse<hallOfFame[[1]]) {hallOfFame[[1]]<-rmse}
}
return(hallOfFame)
}
#Function list, input data, output data (data to fit to)
funcs<-list("x","log(x)","sin(x)","cos(x)","tan(x)")
desiredFuncOutput<-list(1,2,3,4,5)
dataForInput<-list(1,2,3,4,5)
#Function calls
POpulation<-genPopulation(4,1,1,funcs)
POpulationFuncList<-populationFuncList(POpulation)
bestDude<-evalPopulation(POpulationFuncList,dataForInput,desiredFuncOutput)
print(bestDude)
The code is now working thanks to Hack-R's suggestions. So here's the finalized code in case someone else runs into a similar trouble.
library("datasets")
#Allows me to map a name to each element in a numerical list.
makeStrName<-function(listOfItems)
{
for(i in 1:length(listOfItems))
{
names(listOfItems)[i]=paste("x",i,sep="")
}
return(listOfItems)
}
#Allows me to replace each random number in a vector with the corresponding
#function in a list of functions.
mapFuncList<-function(funcList,rndNumVector)
{
for(i in 1:length(funcList))
{
rndNumVector[rndNumVector==i]<-funcList[i]
}
return(rndNumVector)
}
#Will generate a random function from the list of functions and a random sample.
generateOrganism<-function(inputLen,inputSeed, functions)
{
set.seed(inputSeed)
rnd<-sample(1:length(functions),inputLen,replace=T)
Org<-mapFuncList(functions,rnd)
return(Org)
}
#Will generate a series of "Organisms"
genPopulation<-function(popSize,initialSeed,initialSize,functions)
{
population<-list()
for(i in 1:popSize)
{
population <- c(population,generateOrganism(initialSize,initialSeed,functions))
initialSeed <- initialSeed+1
}
populationWithNames<-makeStrName(population)
return(populationWithNames)
}
#Turns the population of functions (which are actually strings in "") into
#actual functions. (i.e. changes the mode of the list from string to function).
funCreator<-function(snippet)
{
txt=snippet
function(x)
{
exprs <- parse(text = txt)
eval(exprs)
}
}
#Applies a fitness function to the population. Puts the best organism in
#the hallOfFame.
evalPopulation<-function(populationFuncList, inputData,outputData)
{
#rmse <- sqrt( mean( (sim - obs)^2))
hallOfFame<-list(1000000000)
for(i in 1:length(populationFuncList))
{
total<-vector(mode="numeric",length=length(inputData))
for(z in 1:length(inputData))
{
total<-c(total,(abs(populationFuncList[[i]](inputData[[z]])-outputData[[z]])))
}
rmse<-sqrt(mean(total*total))
if(rmse<hallOfFame[[1]]) {hallOfFame[[1]]<-rmse}
}
return(hallOfFame)
}
#Function list, input data, output data (data to fit to)
funcs<-list("x","log(x)","sin(x)","cos(x)","tan(x)")
desiredFuncOutput<-list(1,2,3,4,5)
dataForInput<-list(1,2,3,4,5)
#Function calls
POpulation<-genPopulation(4,1,1,funcs)
POpulationFuncList <- lapply(setNames(POpulation,names(POpulation)),function(x){funCreator(x)})
bestDude<-evalPopulation(POpulationFuncList,dataForInput,desiredFuncOutput)
print(bestDude)
In your function evalPopulation you're attempting to apply populationFuncList[[i]] as if it were a function, but when you pass in the argument POpulationFuncList to replace the variable populationFuncList it's not a function, it's a list.
I'm not sure what you were trying to do, so I'm not sure which way you want to fix this. If you meant to use a function you should change the name of the object you're referencing to the function and remove it as an argument or at least pass a function in as an argument instead of the list.
OTOH if you meant to use the list POpulationFuncList then you just shouldn't be applying it as if it were a function instead of a list.
On a side note, this probably would be more apparent if you didn't give them such similar names.
Another potential problem is that you seem have non-numeric results in one of your lists:
> populationFuncList(POpulation)
$x1
[1] "x"
$x2
[1] 2
$x3
[1] 1
$x4
[1] 1
You can't take the absolute value of the character "x", so I just wanted to make sure you're aware of this.
A third problem is that you're doing math on a non-numeric data typed object called total. You need to either change the type to numeric or index it appropriately.
Now it's impossible for me to know exactly which of an infinite number of possibilities you should choose to fix this, because I don't know the details of your use case. However, here is one possible solution which you should be able to adapt to the specifics of the use case:
library("datasets")
#Allows me to map a name to each element in a numerical list.
makeStrName<-function(listOfItems)
{
for(i in 1:length(listOfItems))
{
names(listOfItems)[i]=paste("x",i,sep="")
}
return(listOfItems)
}
#Allows me to replace each random number in a vector with the corresponding
#function in a list of functions.
mapFuncList<-function(funcList,rndNumVector)
{
for(i in 1:length(funcList))
{
replace(rndNumVector, rndNumVector==i,funcList[[i]])
}
return(rndNumVector)
}
#Will generate a random function from the list of functions and a random sample.
generateOrganism<-function(inputLen,inputSeed, functions)
{
set.seed(inputSeed)
rnd<-sample(1:length(functions),inputLen,replace=T)
Org<-mapFuncList(functions,rnd)
return(Org)
}
#Will generate a series of "Organisms"
genPopulation<-function(popSize,initialSeed,initialSize,functions)
{
population<-list("null")
for(i in 2:popSize)
{
population <- c(population,generateOrganism(initialSize,initialSeed, functions))
initialSeed <- initialSeed+1
}
populationWithNames<-makeStrName(population)
return(populationWithNames)
}
#Turns the population of functions (which are actually strings in "") into
#actual functions. (i.e. changes the mode of the list from string to function).
populationFuncList<-function(Population)
{
Population[[1]]<-"x"
funCreator<-function(snippet)
txt=snippet
function(x)
{
exprs <- parse(text = txt)
eval(exprs)
}
listOfFunctions <- lapply(setNames(Population,names(Population)),function(x){funCreator(x)})
return(listOfFunctions)
}
#Applies a fitness function to the population. Puts the best organism in
#the hallOfFame.
evalPopulation<-function(myList=myList, dataForInput,desiredFuncOutput)
{
#rmse <- sqrt( mean( (sim - obs)^2))
hallOfFame<-list(1000000000)
for(i in 1:length(populationFuncList))
{
total<-0
for(z in 1:length(dataForInput))
{
total<-c(total,(abs(myList[[i]]+(dataForInput[[z]])-desiredFuncOutput[[z]])))
}
rmse<-sqrt(mean(total*total))
if(rmse<hallOfFame[[1]]) {hallOfFame[[1]]<-rmse}
}
return(hallOfFame)
}
#Function list, input data, output data (data to fit to)
funcs<-list("x","log(x)","sin(x)","cos(x)","tan(x)")
desiredFuncOutput<-list(1,2,3,4,5)
dataForInput<-list(1,2,3,4,5)
#Function calls
POpulation<-genPopulation(4,1,1,funcs)
myList <-populationFuncList(POpulation)[2:4]
bestDude<-evalPopulation(myList,dataForInput,desiredFuncOutput)
print(bestDude)
[[1]]
[1] 1.825742

selection of nlm starting values problem

Need to estimate two parameters using the nlm function;
fit<-nlm(hood2par,c(x01[i],x02[j]),iterlim=300, catch=x[,c(3,4,5)],sp=.5)
where hood2par is a modified logistic
The convergence of nlm depends on the starting values ​​of these parameters. To find such initial values I ​​automatically generate two vectors of starting values
x01 = seq(-10,-20,-0.1)
x02 = seq(0.1,0.9,0.01)
next I create a routine included in a double for() to find the values ​​that lead to the convergence of the function:
for (i in 1:length(x01)) { for (j in 1:length(x02)) {
fit <- NULL
try(fit <- nlm(hood2par, c(x01[i],x02[j]), iterlim = 300, catch = x[,c(3,4,5)],
sp = .5),
silent = TRUE)
stopifnot(is.null(fit))}}
The problem I have is that when I include the previous routine in a function:
FFF <- function(x01, x02, catch){
for (i in 1:length(x01)) {
for (j in 1:length(x02)) {
fit <- NULL
try(fit <- nlm(hood2par, c(x01[i], x02[j]), iterlim = 300,
catch = x[,c(3,4,5)], sp = .5),
silent = TRUE) # does not stop in the case of err
stopifnot(is.null(fit))
}
}
return(fit)
}
I can´t get the 'fit' values from FFF():
> fit.fff<-FFF(x01,x02,catch)
#Error: is.null(fit) is not TRUE
>fit.fff
fit.fff
Error: object 'fit.fff' not found
I used stopifnot(is.null(fit)) to stop the loops when fit is not NULL (as fit is defined as a NULL object before try(...)). Regarding the try code you have shared, I just need this;
res <- try(some_expression)
if(inherits(res, "try-error"))
{
#some code to keep loops running
} else
{
#stop the loops and gather "res"
}
I tried to include the break function in the second argument of the condictional, but it doesn´t run in my R version...Any idea??
When you call FFF, inside the try block if nlm successfully completes, then fit is assigned, and the stopifnot condition is activated, throwing an error.
Wildly guessing, did you mean
stopifnot(!is.null(fit))
For future reference, a standard chunk of code for use with try is
res <- try(some_expression)
if(inherits(res, "try-error"))
{
#some error handling code
} else
{
#normal execution
}

Resources