if and else with vector sum condition not properly working - r

I have a data.frame called sites_sp where I'm trying to run some functions based on if and else statements. sites_sp has the following structure:
structure(list(x = c(-50.1298257841559, -49.9523708108406, -49.8600298829818,
-49.8590735594872, -49.8600022102151, -49.680556540172), y = c(-29.2498490060132,
-29.1594734717135, -29.0700140387022, -28.9795033961473, -28.8900003372153,
-28.8945716273705), ua = c("ua_1", "ua_4", "ua_10", "ua_15",
"ua_21", "ua_23"), occ = c(0, 0, 0, 0, 0, 0), PC1 = c(0.403336553595704,
-0.209623013249306, -2.38969068562858, -1.0875631345167, 0.0424075103800285,
-1.69180948954307), PC2 = c(-3.62346919232857, -4.03856503375702,
-1.46862258765078, -1.77908267718137, -2.0250031837701, -0.952927464794925
), PC3 = c(-0.375601733371977, -0.122982261539736, -0.365818414058142,
-0.111150398019996, 0.287459840686463, 0.034973266100254), PC4 = c(-1.31153262462204,
-0.899941801783298, -1.35652371929479, -1.98693913441246, -1.75393016363327,
-0.788097574287776), PC5 = c(1.42830395246321, 1.55155187773266,
1.33933059031444, 0.0760013457702872, 0.588191290690648, -0.408003273953271
)), row.names = c(NA, 6L), class = "data.frame")
What I'm doing is an if/else statement of the form:
for(s in sp){
if(sum(sites_sp$occ >= 30)){
pa_data <- st_as_sf(sites_sp,
coords = c("x", "y"),
crs = crs(env_terra))
...
} else {
# block of functions for the statement being FALSE
}
}
RELEVANT EDIT: From what I can tell, the function is going directly to the else block even though it should not — since sum(sites_sp$occ) is bigger than 30 for the first s in sp
I can't really understand what's going on. If I try sum(sites_sp$occ) it returns for me a value of 37, implying that the function inside the if block (pa_data <- st_as_sf()...) should run normally. What am I doing wrong here? If more information is needed, please tell me.

Ok, guys...I'm kinda dumb.
The problem is simply here:
if(sum(sites_sp$occ >= 30)){
Should be written as
if(sum(sites_sp$occ) >= 30){
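My condition was inside the sum: sum(sites_sp$occ >= 30) counts how many elements of occ are at least 30 (here 0, which if() treats as FALSE), whereas sum(sites_sp$occ) >= 30 compares the total with 30.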
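A small sketch of the difference, using a made-up occ vector (hypothetical values, chosen only so that the total is 37 as in the question):

```r
# Hypothetical occurrence vector: 37 presences and 5 absences
occ <- c(rep(1, 37), rep(0, 5))

# Condition inside sum(): counts the elements that are >= 30
count_large <- sum(occ >= 30)

# Condition outside sum(): compares the total with 30
total_ok <- sum(occ) >= 30

# if() coerces a number to logical, so a count of 0 behaves like FALSE
branch_wrong <- if (count_large) "if block" else "else block"
branch_right <- if (total_ok) "if block" else "else block"
```

With these values the first form ends up in the else block and the second in the if block, which matches the behaviour described in the question.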

Related

(R) Error in optim - attempt to apply non-function, when function is defined

not sure what I'm doing wrong here. I'm trying to get a cross-validation score for a mixture-of-two-gammas model.
llikGammaMix2 = function(param, x) {
if (any(param < 0) || param["p1"] > 1) {
return(-Inf)
} else {
return(sum(log(
dgamma(x, shape = param["k1"], scale = param["theta1"]) *
param["p1"] + dgamma(x, shape = param["k2"], scale = param["theta2"]) *
1
(1 - param["p1"])
)))
}
}
initialParams = list(
theta1 = 1,
k1 = 1.1,
p1 = 0.5,
theta2 = 10,
k2 = 2
)
for (i in 1:nrow(cichlids)) {
SWS1_training <- cichlids$SWS1 - cichlids$SWS1[i]
SWS1_test <- cichlids$SWS1[i]
MLE_training2 <-
optim(
par = initialParams,
fn = llikGammaMix2,
x = SWS1_training,
control = list(fnscale = -1)
)$par
LL_test2 <-
optim(
par = MLE_training2,
fn = llikGammaMix2,
x = SWS1_test,
control = list(fnscale = -1)
)$value
}
print(LL_test2)
This runs until it gets to the first optim(), then spits out Error in fn(par, ...) : attempt to apply non-function.
My first thought was a silly spelling error somewhere, but that doesn't seem to be the case. Any help is appreciated.
I believe the issue is in the return statement. It's unclear whether you meant to multiply or add the last quantity, (1 - param["p1"]). Since this is a mixture, I'm guessing you meant for it to be multiplied. As written, a stray 1 sits directly before it, so R reads 1(1 - param["p1"]) as a call to a function named 1, which is what produces "attempt to apply non-function":
return(sum(log(dgamma(x, shape = param["k1"], scale = param["theta1"]) *
param["p1"] +
dgamma(x, shape = param["k2"], scale = param["theta2"]) *
(1 - param["p1"])))) ## ISSUE HERE: Is this what you meant?
There could be other issues with the code. I would double check that the function you are optimizing is what you think it ought to be. It's also hard to tell unless you give a reproducible example we might be able to use. Try to clear up the above issue and let us know if there are still problems.
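For reference, a minimal reproducible sketch with the stray 1 removed. Several things here are assumptions rather than part of the question: the data are simulated as a stand-in for cichlids$SWS1, the starting values are given as a named numeric vector instead of a list, and x[-i] is used on the guess that a leave-one-out training set was intended (the question subtracts cichlids$SWS1[i] instead).

```r
# Log-likelihood of a two-component gamma mixture (stray `1` removed)
llikGammaMix2 <- function(param, x) {
  if (any(param < 0) || param["p1"] > 1) {
    return(-Inf)
  }
  sum(log(
    dgamma(x, shape = param["k1"], scale = param["theta1"]) * param["p1"] +
      dgamma(x, shape = param["k2"], scale = param["theta2"]) * (1 - param["p1"])
  ))
}

# Named numeric vector, so that param["k1"] etc. return plain numbers
initialParams <- c(theta1 = 1, k1 = 1.1, p1 = 0.5, theta2 = 10, k2 = 2)

# Simulated stand-in for cichlids$SWS1 (hypothetical data)
set.seed(1)
x <- c(rgamma(20, shape = 1.1, scale = 1), rgamma(20, shape = 2, scale = 10))

# The function now evaluates, and rejects invalid parameters
ll_start <- llikGammaMix2(initialParams, x)
ll_bad <- llikGammaMix2(replace(initialParams, "p1", 1.5), x)

# One leave-one-out fold: x[-i] drops observation i from the training set
i <- 1
fit <- optim(par = initialParams, fn = llikGammaMix2, x = x[-i],
             control = list(fnscale = -1))

# Score the held-out observation with the fitted parameters
ll_test <- llikGammaMix2(fit$par, x[i])
```

Note that the held-out point is scored by evaluating the log-likelihood at the fitted parameters rather than by running optim() a second time; whether that is the cross-validation score you want is for you to check.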

NSGA2 Genetic Algorithm in R

I am working with the NSGA2 implementation in R (library mco).
My NSGA2 code takes forever to run, so I am wondering:
1) Is there a way to limit the precision of the solution values (say, maybe up to 3 decimal places) instead of infinite?
2) How do I set an equality constraint (the examples online all seem to be about >= or <= rather than =)? Not sure if I'm doing it right.
My entire relevant code for reference, for easy tracing: https://docs.google.com/document/d/1xj7OPng11EzLTTtWLdRWMm8zJ9f7q1wsx2nIHdh3RM4/edit?usp=sharing
Relevant sample part of code reproduced here:
VTR = get.hist.quote(instrument = 'VTR',
start="2010-01-01", end = "2015-12-31",
quote = c("AdjClose"),provider = "yahoo",
compress = "d")
ObjFun1 <- function (xh){
f1 <- sum(HSVaR_P(merge(VTR, CMI, SPLS, KSS, DVN, MAT, LOE, KEL, COH, AXP), xh, 0.05, 2))
tempt = merge(VTR, CMI, SPLS, KSS, DVN, MAT, LOE, KEL, COH, AXP)
tempt2 = tempt[(nrow(tempt)-(2*N)):nrow(tempt),]
for (i in 1:nrow(tempt2))
{
for (j in 1:ncol(tempt2))
{
if (is.na(tempt2[i,j]))
{
tempt2[i,j] = 0
}
}
}
f2 <- ((-1)*abs(sum((xh*t(tempt2)))))
c(f1=f1,f2=f2)
}
Constr <- function(xh){
totwt <- (1-sum(-xh))
totwt2 <- (sum(xh)-1)
c(totwt,totwt2)
}
Solution1 <- nsga2(ObjFun1, n.projects, 2,
lower.bounds=rep(0,n.projects), upper.bounds=rep(1,n.projects),
popsize=n.solutions, constraints = Constr, cdim=1,
generations=generations)
The function HSVaR_P returns matrix(x,2*500,1).
Even when I set generations = 1, the code does not seem to finish. Clearly there must be an error somewhere in the code, but I am not entirely sure about the mechanics of the NSGA2 algorithm.
Thanks.
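One part of ObjFun1 that is cheap to speed up, independent of NSGA2 itself, is the nested loop that replaces NA with 0 on every call of the objective. A minimal sketch of the vectorized form, assuming tempt2 behaves like a plain numeric matrix (in the question it is a merged time-series object, so this is untested for that class):

```r
# Hypothetical stand-in for tempt2: a small numeric matrix with missing values
tempt2 <- matrix(c(1.5, NA, 3.0, NA, 5.2, 6.1), nrow = 3)

# Replace every NA in one step instead of looping over rows and columns
tempt2[is.na(tempt2)] <- 0
```

Since the merged data do not depend on xh, the merge and the NA replacement could also be done once outside ObjFun1 rather than on every evaluation.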

Catching the print of the function

I am using the package fda, in particular the function fRegress. This function calls another function named eigchk, which checks whether the coefficient matrix is singular.
Here is the function as the package owners (J. O. Ramsay, Giles Hooker, and Spencer Graves) wrote it.
eigchk <- function(Cmat) {
# check Cmat for singularity
eigval <- eigen(Cmat)$values
ncoef <- length(eigval)
if (eigval[ncoef] < 0) {
neig <- min(length(eigval),10)
cat("\nSmallest eigenvalues:\n")
print(eigval[(ncoef-neig+1):ncoef])
cat("\nLargest eigenvalues:\n")
print(eigval[1:neig])
stop("Negative eigenvalue of coefficient matrix.")
}
if (eigval[ncoef] == 0) stop("Zero eigenvalue of coefficient matrix.")
logcondition <- log10(eigval[1]) - log10(eigval[ncoef])
if (logcondition > 12) {
warning("Near singularity in coefficient matrix.")
cat(paste("\nLog10 Eigenvalues range from\n",
log10(eigval[ncoef])," to ",log10(eigval[1]),"\n"))
}
}
As you can see, the last if condition checks whether logcondition is bigger than 12 and, if so, prints the range of the eigenvalues.
The following code implements the use of regularization with a roughness penalty. The code is taken from the book "Functional data analysis with R and Matlab".
annualprec = log10(apply(daily$precav,2,sum))
tempbasis =create.fourier.basis(c(0,365),65)
tempSmooth=smooth.basis(day.5,daily$tempav,tempbasis)
tempfd =tempSmooth$fd
templist = vector("list",2)
templist[[1]] = rep(1,35)
templist[[2]] = tempfd
conbasis = create.constant.basis(c(0,365))
betalist = vector("list",2)
betalist[[1]] = conbasis
SSE = sum((annualprec - mean(annualprec))^2)
Lcoef = c(0,(2*pi/365)^2,0)
harmaccelLfd = vec2Lfd(Lcoef, c(0,365))
betabasis = create.fourier.basis(c(0, 365), 35)
lambda = 10^12.5
betafdPar = fdPar(betabasis, harmaccelLfd, lambda)
betalist[[2]] = betafdPar
annPrecTemp = fRegress(annualprec, templist, betalist)
betaestlist2 = annPrecTemp$betaestlist
annualprechat2 = annPrecTemp$yhatfdobj
SSE1.2 = sum((annualprec-annualprechat2)^2)
RSQ2 = (SSE - SSE1.2)/SSE
Fratio2 = ((SSE-SSE1.2)/3.7)/(SSE1/30.3)
resid = annualprec - annualprechat2
SigmaE. = sum(resid^2)/(35-annPrecTemp$df)
SigmaE = SigmaE.*diag(rep(1,35))
y2cMap = tempSmooth$y2cMap
stderrList = fRegress.stderr(annPrecTemp, y2cMap, SigmaE)
betafdPar = betaestlist2[[2]]
betafd = betafdPar$fd
betastderrList = stderrList$betastderrlist
betastderrfd = betastderrList[[2]]
As the penalty factor the authors use a certain lambda.
The following code implements the search for the appropriate lambda.
loglam = seq(5,15,0.5)
nlam = length(loglam)
SSE.CV = matrix(0,nlam,1)
for (ilam in 1:nlam) {
lambda = 10^loglam[ilam]
betalisti = betalist
betafdPar2 = betalisti[[2]]
betafdPar2$lambda = lambda
betalisti[[2]] = betafdPar2
fRegi = fRegress.CV(annualprec, templist,
betalisti)
SSE.CV[ilam] = fRegi$SSE.CV
}
By varying loglam and using cross-validation I am supposed to find the best lambda. However, if loglam is too long, or its values push the coefficient matrix toward singularity, I receive the following message:
Log10 Eigenvalues range from
-5.44495317739048 to 6.78194912518214
This is produced by the function eigchk, as I mentioned above.
Now my question is: is there any way to catch this so-called warning? By catch I mean some function or method that alerts me when this has happened, so that I can adjust the values of loglam. Since all I see is this printed message, I have run out of ideas.
Thank you all a lot for your suggestions.
If by "catch the warning" you mean being alerted that there is a potential problem with loglam, then you might want to look at the try and tryCatch functions. With them you can define the behavior you want whenever a warning condition is raised.
If you just want to store the output of the warning (which might be assumed from the question title, but may not be what you want), then try looking into capture.output.
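A minimal sketch of both suggestions in base R. The function noisy_check below is a hypothetical stand-in that mimics only the last branch of eigchk (a warning() followed by a cat()), and the numbers it prints are placeholders; it is not the fda code itself.

```r
# Hypothetical stand-in for the last branch of eigchk
noisy_check <- function(logcondition) {
  if (logcondition > 12) {
    warning("Near singularity in coefficient matrix.")
    cat("\nLog10 Eigenvalues range from\n", -5.4, " to ", 6.8, "\n")
  }
  invisible(logcondition)
}

# tryCatch: stop at the warning and return its message instead
msg <- tryCatch(
  noisy_check(13),
  warning = function(w) conditionMessage(w)
)

# withCallingHandlers: record that a warning happened, silence it,
# and let the call continue; capture.output stores what cat() printed
flagged <- FALSE
printed <- capture.output(
  withCallingHandlers(
    noisy_check(13),
    warning = function(w) {
      flagged <<- TRUE
      invokeRestart("muffleWarning")
    }
  )
)
```

In the loop over loglam, the same pattern could wrap the call to fRegress.CV so that a flagged value of lambda is skipped or logged. tryCatch abandons the call at the first warning, while withCallingHandlers lets it run to completion, so the choice depends on whether the result is still wanted.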

How can I use characters in a function as argument in R?

I want to update a data file in which I assign a weight to a name.
For example :
weight_f = function(Name = 0, Weight = 0){
data$Weight = ifelse(data$Name==Name, Weight, NA)
}
The problem is that I need to have Name as "Name" after ==. I tried pasting " before and after, but it won't work because R won't let me enter """
Very easy fix :)
weight_f = function(Name = 0, Weight = 0){
data$Weight = ifelse(data$Name==deparse(substitute(Name)), Weight, NA)
}
Edit:
Actually, I think I misunderstood what you were asking, and my answer doesn't make any sense because of it (in the case I thought you meant, you would have just used "Name" rather than deparse(substitute(Name)) - same output).
I think probably what you want is toString :
weight_f = function(Name = 0, Weight = 0){
data$Weight = ifelse(data$Name==toString(Name), Weight, NA)
}
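A variant that may be closer to what is wanted, sketched under some assumptions: the data frame is passed in and returned (an assignment to data$Weight inside the function only changes a local copy), and the name is supplied as a quoted string. The data frame people and the argument names are made up for illustration.

```r
# Return a copy of `data` with Weight set for rows whose Name matches `name`
weight_f <- function(data, name, weight) {
  data$Weight <- ifelse(data$Name == name, weight, NA)
  data
}

# Hypothetical example data
people <- data.frame(Name = c("Ann", "Bob", "Cy"), stringsAsFactors = FALSE)

# The name is passed as an ordinary character string
people <- weight_f(people, "Bob", 70)
```

Note that, like the versions above, this overwrites Weight with NA for every non-matching row, so calling it once per name would erase earlier assignments.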

Setting a loop in R

I have already discussed a similar question in the following post:
How to set a for-loop in R
Each file's contents are as follows:
FILE_1.FASTA
>>TTBK2_Hsap ,(CK1/TTBK)
MSGGGEQLDILSVGILVKERWKVLRKIGGGGFGEIYDALDMLTRENVALKVESAQQPKQVLKMEVAVLKKLQGKDHVCRFIGCGRNDRFNYVVMQLQGRNLADLRRSQSRGTFT
FILE_2.FASTA
>>TTBK2_Hsap ,(CK1/TTBK)
MSGGGEQLDILSVGILVKERWKVLRKIGGGGFGEIYDALDMLTRENVALKVESAQQPKQVLKMEVAVLKKLQGKDHVCRFIGCGRNDRFNYVVMQLQGRNLADLRRSQSRGTFT
However, there is another package in R which works like this:
extractAPAAC(x, props = c("Hydrophobicity", "Hydrophilicity"), lambda = 30,
w = 0.05, customprops = NULL)
I tried creating a function to run it over a number of sequence files, and the program looks like this:
read_and_extract <- function(fasta) {
seq <- readFASTA(fasta)[[1]]
return(extractAPAAC(seq, props = c("Hydrophobicity", "Hydrophilicity"), lambda = 30,
w = 0.05, customprops = NULL))
}
setwd("H:\\CC")
fasta_files <- dir(pattern = "[.]fasta$")
aa_comp <- vapply(fasta_files, read_and_extract, rep(pi, 80))
write.csv(aa_comp, file = "C:\\Users\\PAAC.csv")
This program shows an error:
Error: unexpected ',' in "w = 0.05,"
But I have given w = 0.05, which is the default value. Could anyone tell me where the actual problem is?

Resources