I'm trying to write a function to compute sample sizes in R.
The function uses a couple of smaller functions. I'd like to pass arguments into the smaller functions using the dots. Here is my function so far:
log_reg_var<-function(p){
if(p<=0|p>=1) stop('p must be between 0 and 1')
var<-1/(p*(1-p))
return(var)
}
samplesize<-function(method_name, beta, sigma_x, mult_cor, power= 0.8,fpr = 0.05,...){
if(method_name=='linear regression'){
var_func <- lin_reg_var
}
else if(method_name=='logistic regression'){
var_func <- log_reg_var
}
else if(method_name=='cox regression'){
var_func <- cox_reg_var
}
else if(method_name=='poisson regression'){
var_func <- pois_reg_var
}
else{
stop('method_name not recognized. method_name accepts one of: "linear regression",
"logistic regression","cox regression", or "poisson regression"')
}
top = (qnorm(1-fpr/2) + qnorm(power))^2
bottom = (beta*sigma_x)^2*(1-mult_cor)
n = (top/bottom)*var_func(...)
return(ceiling(n))
}
I should be able to do
samplesize(method_name = 'logistic regression',1,1,0,p=0.5)
>>>32
But instead I am thrown the following error:
Error in var_func(...) : argument "p" is missing, with no default
Clearly there is something wrong with me passing p through the dots, but I'm not sure what is wrong.
What is my problem here?
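You need to add the additional parameter p as an argument and pass it into your log_reg_var() function. The underlying cause is partial argument matching: named arguments that come before ... are matched by unique prefix, so p=0.5 is matched to power, the dots stay empty, and var_func(...) is called without p. Declaring p as its own formal argument gives it an exact match, which takes precedence over the partial match. You also have to be careful with some other syntax: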
log_reg_var<-function(p){
if(p<=0|p>=1) stop('p must be between 0 and 1')
var<-1/(p*(1-p))
return(var)
}
# specify that you pass a parameter `p`
samplesize<-function(method_name, beta, sigma_x, mult_cor, power= 0.8,fpr = 0.05, p, ...){
# Initialize `var_func` to a NULL value
var_func = NULL
if(method_name=='linear regression'){
var_func <- lin_reg_var(p)
}
else if(method_name=='logistic regression'){
# pass parameter `p` into log_reg_var since there is no default
var_func <- log_reg_var(p)
}
else if(method_name=='cox regression'){
var_func <- cox_reg_var(p)
}
else if(method_name=='poisson regression'){
var_func <- pois_reg_var(p)
}
else{
stop('method_name not recognized. method_name accepts one of: "linear regression",
"logistic regression","cox regression", or "poisson regression"')
}
top = (qnorm(1-fpr/2) + qnorm(power))^2
bottom = (beta*sigma_x)^2*(1-mult_cor)
n = (top/bottom)*var_func
return(ceiling(n))
}
> samplesize(method_name ='logistic regression', 1, 1, 0, p=0.5)
[1] 32
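If you would rather keep passing p through the dots, here is a minimal sketch of an alternative. It relies on the fact that formal arguments declared after ... are only matched by their full name, so moving power and fpr behind ... stops p from being absorbed by power. Only the logistic branch is reproduced, and the use of switch is an illustrative choice, not part of the original code.

```r
log_reg_var <- function(p) {
  if (p <= 0 || p >= 1) stop("p must be between 0 and 1")
  1 / (p * (1 - p))
}

# power and fpr now come after ..., so they must be spelled out in full
samplesize <- function(method_name, beta, sigma_x, mult_cor, ...,
                       power = 0.8, fpr = 0.05) {
  var_func <- switch(method_name,
    "logistic regression" = log_reg_var,
    stop("method_name not recognized")  # unnamed last entry is the fallback
  )
  top <- (qnorm(1 - fpr / 2) + qnorm(power))^2
  bottom <- (beta * sigma_x)^2 * (1 - mult_cor)
  ceiling((top / bottom) * var_func(...))  # p travels through the dots
}

result <- samplesize("logistic regression", 1, 1, 0, p = 0.5)
result
```

With this signature, power can still be changed, but only by writing power = ... in full.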
I want to write a function with some conditions. If those conditions are TRUE, the corresponding values should be used in the rest of the function. But my problem is that R does not find them even though they are TRUE. Following is my R code:
require(KernSmooth)
require(kerdiest)
mini<-function(y,k,h1=TRUE, h2=TRUE, h3=TRUE,type){
n <- length(y)
x <- seq(min(y) + 0.05, max(y), length = k)
if(h1==TRUE){
h1 <- abs(dpik(y) )
} else { print("DPI is not selected")
}
if(h2==TRUE){
h2<-abs(ALbw(vec_data=y))
} else { print("AL is not selected")
}
if(h3==TRUE){
h3 <-abs(1.06*sd(y)*(n^(-1/5)))
} else { print("NSR is not selected")
}
ftrue<-switch(type,
Exp = dexp(x,(1/mean(x))),
Gamma = dgamma(x,(mean(x)/(var(x)/mean(x))),(var(x)/mean(x)))
)
dpi<-h1-sum(ftrue)
AL<-h2*3
nsr<-h3*4
v<-c("DPI"=dpi, "Altaman"=AL, "NSR"=nsr)
v[which.min(v)]
}#function end
y<-rexp(100,1)
mini(y,200,h1,h2,h3,"exp")
but there is an error:
Error in mini(y, 200, h1, h2, h3, "exp") : object 'h1' not found
I am not getting any clue as to what my mistake is here. Please help me with this problem.
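A minimal sketch of what is probably going on, using a hypothetical stand-in for mini() that keeps only the argument handling (nsr_bandwidth and h3_undefined are names invented for this illustration). In the call above, h1, h2 and h3 are passed as bare names, so R looks for variables with those names in the calling environment. Leaving them out lets the defaults apply. Separately, switch() matches names exactly, so "exp" would not select the Exp branch.

```r
# Hypothetical stand-in for mini(): only the argument handling is kept.
nsr_bandwidth <- function(y, h3 = TRUE) {
  if (isTRUE(h3)) {
    1.06 * sd(y) * length(y)^(-1/5)
  } else {
    message("NSR is not selected")
    NA_real_
  }
}

set.seed(1)
y <- rexp(100, 1)

bw_default <- nsr_bandwidth(y)             # h3 omitted, the default TRUE is used
bw_named   <- nsr_bandwidth(y, h3 = TRUE)  # same thing, spelled out

# Passing a bare name that was never created reproduces the error.
err <- tryCatch(nsr_bandwidth(y, h3_undefined),
                error = function(e) conditionMessage(e))

# switch() is case-sensitive: "exp" matches neither Exp nor Gamma.
no_match <- switch("exp", Exp = 1, Gamma = 2)
is.null(no_match)
```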
Is it possible to do something like this somehow with pairs or a similar function?
var = "" #initialization
panel.pearson <- function(x, y, ...) {
horizontal <- (par("usr")[1] + par("usr")[2]) / 2;
vertical <- (par("usr")[3] + par("usr")[4]) / 2;
cor = cor.test(x,y)
cor.p = cor$p.value
cor.r = cor$estimate
cor.p = round(cor.p, digits = 2)
cor.r = round(cor.r, digits = 2)
stars = ifelse(cor.p < .001, "***", ifelse(cor.p < .01, "** ", ifelse(cor.p < .05, "* ", " ")))
format_r_p = paste(cor.r, stars, sep="")
text(horizontal, vertical, format_r_p, cex=2)
var = c(var, format_r_p)
}
pairs(crime, upper.panel=panel.pearson )
var would output all the format_r_p values.
It’s possible but it’s a really, really bad idea in general: functions should not mutate global state.
So instead, isolate the modification to be local instead of global:
var = ''
pairs(crime, upper.panel = function (x, y, ...) {
result = panel.pearson(x, y, ...)
var <<- c(var, result)
result
})
Now, instead of making panel.pearson modify any global magic variables, we use an anonymous function in the scope of the call to pairs to modify a variable in the scope of the call to pairs, i.e. locally. This assumes panel.pearson ends by returning format_r_p rather than by assigning to var.
To modify this variable from inside the anonymous function, we use <<- instead of the normal assignment.
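Here is a self-contained sketch of the same idea, wrapped in a function so that the collected values never touch the global environment. The function name collect_pairs_labels, the simplified one-star label, and the use of the built-in iris data in place of crime are assumptions made for illustration.

```r
collect_pairs_labels <- function(data) {
  labels <- character(0)                 # local to this call, not global
  panel <- function(x, y, ...) {
    ct <- cor.test(x, y)
    label <- paste0(round(ct$estimate, 2), if (ct$p.value < 0.05) "*" else "")
    usr <- par("usr")                    # panel extents: x1, x2, y1, y2
    text(mean(usr[1:2]), mean(usr[3:4]), label, cex = 2)
    labels <<- c(labels, label)          # updates `labels` in the enclosing function
  }
  pairs(data, upper.panel = panel)
  labels
}

pdf(NULL)                                # null device, so no plot file is written
res <- collect_pairs_labels(iris[1:3])
invisible(dev.off())
res
```

With three columns the upper panel is drawn three times, so res holds three labels.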
I am having a very odd problem in R. The task was to write functions for global and semi-global alignment. I implemented the appropriate algorithms, and they are able to "print out" the correct alignment. However, "returning" the alignment seems to be a problem for the semi-global algorithm.
Below are the functions for both alignments. Each consists of two functions: one computes the score matrix and the other outputs the alignment. As you can see, the output function for semi-global alignment was inspired by the global one, but although it is able to print out the values A and B, it returns NULL when A and B should be returned.
I noticed that when A and B are defined, they also contain a NULL part, which can be seen by printing the structures of A and B at the end. This is also the case in the global alignment but does not seem to be a problem there.
Global Alignment Algorithm
########### GLOBAL ALIGNMENT ALGORITHM ############
GA_score = function(v,w,score.gap=-3,score.match=8,score.mismatch=-5){
v = strsplit(v,split="")[[1]]
w = strsplit(w,split="")[[1]]
S = matrix(0,nrow=(length(v)+1),ncol = (length(w)+1) )
S[1,1] = 0
for(j in 2:dim(S)[2]){
S[1,j] = score.gap*(j-1)
}
for(i in 2:dim(S)[1]){
S[i,1] = score.gap*(i-1)
for(j in 2:dim(S)[2]){
if(v[i-1]==w[j-1]){diag = S[i-1,j-1] + score.match} else {diag = S[i-1,j-1] + score.mismatch}
down = S[i-1,j] + score.gap
right = S[i,j-1] + score.gap
S[i,j] = max(diag,down,right)
}
}
return(S)
}
GA_output = function(v,w,S,score.gap=-3,score.match=8,score.mismatch=-5){
v = strsplit(v,split="")[[1]]
w = strsplit(w,split="")[[1]]
A=c()
B=c()
GA_rec = function(A,B,S,i,j,v,w,score.gap,score.match,score.mismatch){
if (i==1 | j==1){
if(i>1){
for(i1 in seq(i-1,1,-1)){
A = c(v[i1],A)
B = c("-",B)
}
}
if(j>1){
for(j1 in seq(j-1,1,-1)){
A = c("-",A)
B = c(w[j1],B)
}
}
return(list(v=A,w=B))
}
if(v[i-1]==w[j-1] ){diag = score.match} else {diag=score.mismatch}
if (S[i,j] == (S[i-1,j-1] + diag)){
A.temp = c(v[i-1],A)
B.temp = c(w[j-1],B)
GA_rec(A.temp,B.temp,S,i-1,j-1,v,w,score.gap,score.match,score.mismatch)
}
else if (S[i,j] == (S[i-1,j] + score.gap)){
A.temp <- c(v[i-1],A)
B.temp <- c("-",B)
GA_rec(A.temp,B.temp,S,i-1,j,v,w,score.gap,score.match,score.mismatch)
}
else {
A.temp = c("-",A)
B.temp = c(w[j-1],B)
GA_rec(A.temp,B.temp,S,i,j-1,v,w,score.gap,score.match,score.mismatch)
}
}
return( GA_rec(A,B,S,length(v)+1,length(w)+1,v,w,score.gap,score.match,score.mismatch))
}
Semi-Global Alignment Algorithm
########### SEMI GLOBAL ALIGNMENT ALGORITHM ############
SGA_score = function(sequence1,sequence2,score.gap=-1,score.match=1,score.mismatch=-1){
v=sequence2
w=sequence1
v = strsplit(v,split="")[[1]]
w = strsplit(w,split="")[[1]]
S = matrix(0,nrow=length(v)+1,ncol=length(w)+1)
for(i in 1:(length(w)+1)){
for( j in 1:(length(v)+1)){
if (i==1|j==1){S[i,j]=0}
else{
if((i==length(w)+1) | (j==length(v)+1)){
from.top = S[i,j-1]
from.left = S[i-1,j]
}
else{
from.top = max(S[i,j-1]+score.gap) # Max is artifact from max(0,... )
from.left = max(S[i-1,j]+score.gap)
}
if(w[i-1] == v[j-1]){
from.diag = S[i-1,j-1]+score.match
}
else{
from.diag = S[i-1,j-1]+score.mismatch
}
S[i,j] = max(from.top,from.left,from.diag)
}
}
}
return(S)
}
SGA_output = function(v,w,S,score.gap=-1,score.match=1,score.mismatch=-1){
v = strsplit(v,split="")[[1]]
w = strsplit(w,split="")[[1]]
A=c()
B=c()
print(str(A))
print(str(B))
SGA_rec = function(A,B,S,i,j,v,w,score.gap,score.match,score.mismatch){
if (i==1 | j==1){
if(i>1){
for(i1 in seq(i-1,1,-1)){
A = c(v[i1],A)
B = c("-",B)
}
}
if(j>1){
for(j1 in seq(j-1,1,-1)){
A = c("-",A)
B = c(w[j1],B)
}
}
print(A)
print(B)
out = list(v=A,w=B)
#print(out)
print(str(A))
print(str(B))
print(str(out))
return(out)
}
if(v[i-1]==w[j-1] ){diag = score.match} else {diag=score.mismatch}
if (S[i,j] == (S[i-1,j-1] + diag)){
A.temp = c(v[i-1],A)
B.temp = c(w[j-1],B)
SGA_rec(A.temp,B.temp,S,i-1,j-1,v,w,score.gap,score.match,score.mismatch)
}
#####
if ( j==length(w)+1) { # Are we in last row?
score.temp = score.gap
score.gap=0
}
else{score.temp=score.gap}
if(S[i,j] == (S[i-1,j] + score.gap)){
A.temp <- c(v[i-1],A)
B.temp <- c("-",B)
score.gap = score.temp
SGA_rec(A.temp,B.temp,S,i-1,j,v,w,score.gap,score.match,score.mismatch)
}
score.gap=score.temp
####
if(i==length(v)+1){
score.temp=score.gap
score.gap=0
}
else{score.temp=score.gap}
if(S[i,j] == (S[i,j-1] + score.gap)){
A.temp = c("-",A)
B.temp = c(w[j-1],B)
score.gap=score.temp
SGA_rec(A.temp,B.temp,S,i,j-1,v,w,score.gap,score.match,score.mismatch)
}
}
return(SGA_rec(A,B,S,length(v)+1,length(w)+1,v,w,score.gap,score.match,score.mismatch))
}
S1 = SGA_score("ACGTCAT","TCATGCA")
S1
align = SGA_output("ACGTCAT","TCATGCA",S1)
align
I am surprised that the global alignment works but the semi-global one doesn't, even though they both have this NULL part (can someone maybe explain what this is? Does it have something to do with internal objects in a function?) and the semi-global one knows what A and B are.
Any help is greatly appreciated!
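SGA_rec seems to be missing a return value. In GA_rec the branches form a single if / else if / else chain that is the last expression of the function, so whichever branch runs supplies the return value. In SGA_rec the branches are separate if statements, so the results of the earlier recursive calls are discarded, and when the condition of the last if is FALSE the function returns NULL. You need to return() the recursive calls and add an else {return(<something>)} after the last if.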
Illustration:
fun <- function() if (FALSE) 1
x <- fun()
x
#NULL
Read help("NULL") to learn what it means.
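As a slightly longer sketch of the same effect in a recursive function (bad and good are hypothetical names, not taken from the code above): the value of a recursive call is only returned if it is the last expression evaluated or is wrapped in return(). The extra NULL printed by print(str(A)) is unrelated to the bug, since str() prints the structure itself and returns NULL, which print() then shows.

```r
bad <- function(n) {
  if (n == 0) return("done")
  if (n > 0) bad(n - 1)     # value is computed, then thrown away
  if (n > 100) "big"        # last expression; when FALSE the result is NULL
}

good <- function(n) {
  if (n == 0) return("done")
  if (n > 0) return(good(n - 1))  # hand the recursive result back up
  "fallback"                      # explicit value when no branch applies
}

bad_result <- bad(2)
good_result <- good(2)
is.null(bad_result)
good_result
```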
This is my code.
beta1 = function(a,b,t) { beta(a+(1/t),b) }
beta2 = function(a,b,t) { beta(a+(2/t),b) }
eb11 = function(a,b,t) { beta2(a,b,t)/beta(a,b) }
eb12 = function(a,b,t) { (beta1(a,b,t)-beta2(a,b,t))/beta(a,b) }
eb22 = function(a,b,t) { 1 + (beta2(a,b,t)-2*beta1(a,b,t))/beta(a,b) }
eb11r11 = function(a,b,t) { beta2(a,b,t)*beta(a,b)/beta1(a,b,t)^2 }
eb12r12 = function(a,b,t) { (beta1(a,b,t)-beta2(a,b,t))*beta(a,b)/beta1(a,b,t)/(beta(a,b)-beta1(a,b,t)) }
eb22r22 = function(a,b,t) { (beta(a,b)^2 + (beta2(a,b,t)-2*beta1(a,b,t))*beta(a,b))/(beta(a,b)-beta1(a,b,t))^2 }
gbetloglik = function(a,b,t) {
loglik = n1*log(eb11r11(a,b,t)) + n2*log(eb12r12(a,b,t)) + n3*log(eb22r22(a,b,t))
return(-loglik)
}
abt = optim(c(0.5,0.5,1),gbetloglik,lower=c(0.001,0.001,0.001),method="L-BFGS-B")$par
What I'd like to do is to find a,b, and t that maximize 'gbetloglik' function.
But I got this error.
Error in 2/t : 't' is missing
It seems that the third argument of function 'beta2' is missing. When I enter three numbers directly in gbetloglik function, it works well. The problem occurs only in optim() function.
Does anyone have any idea?
It looks like you are misinterpreting the first argument of the optim function. optim treats it as a single parameter vector and passes that whole vector as the first argument of the function being optimized. In your case the three initial values all go to the argument a of gbetloglik, while b and t are left missing. This call will work:
abt = optim(0.5,gbetloglik,lower=c(0.001,0.001,0.001),method="L-BFGS-B", b=0.5, t= 0.5)$par
but it won't optimize across all three arguments; it will only optimize a for the given b and t. To let optim search over all three, gbetloglik has to take a single parameter vector and unpack a, b and t from it. Another option is an external optimization package. Here is an example from nlmrt:
library(nlmrt)  # provides nlxb(); assumes the package is installed
ydat = c(6.308, 6.94, 9.638, 12.866, 17.069, 23.192, 31.443, 37.558, 51.156, 64.948, 77.995, 91.972)
tdat = seq_along(ydat)
start1 = c(b1=1, b2=1, b3=1)
eunsc = y ~ b1/(1+b2*exp(-b3*tt))
anlxb1g =try(nlxb(eunsc, start=start1, trace=FALSE, data=data.frame(y=ydat, tt=tdat)))
print(anlxb1g)
anlxb1g$coefficients
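Returning to the optim call from the question, here is a minimal sketch of the parameter-vector approach. Because n1, n2 and n3 are not given, a hypothetical three-argument objective (obj3) stands in for gbetloglik; the wrapper obj_vec is likewise an assumed name.

```r
# Hypothetical three-argument objective standing in for gbetloglik.
obj3 <- function(a, b, t) (a - 1)^2 + (b - 2)^2 + (t - 3)^2

# optim passes one vector, so unpack it into the three arguments.
obj_vec <- function(par) obj3(par[1], par[2], par[3])

fit <- optim(c(0.5, 0.5, 1), obj_vec,
             lower = rep(0.001, 3), method = "L-BFGS-B")
fit$par
```

The same wrapper pattern should apply to gbetloglik once n1, n2 and n3 are defined.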
Using:
mean (x, trim=0.05)
Removes 5% from each end of the distribution (trim is the fraction of observations dropped from each end), which is fine for symmetrical two-tailed data. But if I have one-tailed or highly asymmetric data I would like to be able to remove just one side of the distribution. Is there a function for this or do I have to write a new one myself? If so, how?
Just create a modified mean.default. First look at mean.default:
mean.default
Then modify it to accept a new argument:
mean.default <-
function (x, trim = 0, na.rm = FALSE, ..., side="both")
{
if (!is.numeric(x) && !is.complex(x) && !is.logical(x)) {
warning("argument is not numeric or logical: returning NA")
return(NA_real_)
}
if (na.rm)
x <- x[!is.na(x)]
if (!is.numeric(trim) || length(trim) != 1L)
stop("'trim' must be numeric of length one")
n <- length(x)
if (trim > 0 && n) {
if (is.complex(x))
stop("trimmed means are not defined for complex data")
if (any(is.na(x)))
return(NA_real_)
if (trim >= 0.5)
return(stats::median(x, na.rm = FALSE))
lo <- if( side=="both" || side=="right" ){ floor(n * trim) + 1 }else{1}
hi <- if( side=="both" || side=="left" ){ n + 1 - (floor(n * trim) + 1 ) }else{ n}
x <- sort.int(x, partial = unique(c(lo, hi)))[lo:hi]
cat(c(length(x), lo , hi) )
}
.Internal(mean(x))
}
I don't know of a function. Something like the following would trim off the upper tail of the distribution before taking the mean.
upper.trim.mean <- function(x,trim) {
x <- sort(x)
mean(x[1:floor(length(x)*(1-trim))])
}
This should account for either side, or both sides for trimming.
trim.side.mean <- function(x, trim, type="both"){
if (type == "both") {
mean(x,trim)}
else if (type == "right") {
x <- sort(x)
mean(x[1:floor(length(x)*(1-trim))])}
else if (type == "left"){
x <- sort(x)
# start just past the floor(n*trim) smallest observations
mean(x[(floor(length(x)*trim)+1):length(x)])}}
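one.sided.trim.mean <- function(x, trim, upper=TRUE) {
# upper=TRUE drops the values at or above the (1-trim) quantile
if(upper) mean(x[x<quantile(x, 1-trim)])
# otherwise drop the values at or below the trim quantile
else mean(x[x>quantile(x, trim)])
}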
I found that the answers posted do not match when checked manually, so I created one of my own. It's long but simple enough to understand. Note that trim here is the number of observations to drop, not a fraction.
get_trim <- function(x,trim,type)
{
x <- sort(x)
ans<-0
if (type=="both")
{
for (i in (trim+1):(length(x)-trim))
{
ans=ans+x[i];
}
print(ans/(length(x)-(2*trim)))
}
else if(type=="left")
{
for (i in (trim+1):(length(x)))
{
ans=ans+x[i];
}
print(ans/(length(x)-trim))
}
else if (type=="right")
{
for (i in 1:(length(x)-trim))
{
ans=ans+x[i];
}
print(ans/(length(x)-trim))
}
}
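A more compact sketch of the same count-based trimming, returning the value instead of printing it. The name trim_count_mean and the argument k are assumptions for illustration; the indexing follows get_trim above.

```r
trim_count_mean <- function(x, k, type = c("both", "left", "right")) {
  type <- match.arg(type)
  x <- sort(x)
  n <- length(x)
  lo <- if (type %in% c("both", "left")) k + 1 else 1   # drop k smallest
  hi <- if (type %in% c("both", "right")) n - k else n  # drop k largest
  mean(x[lo:hi])
}

left_mean  <- trim_count_mean(1:10, 2, "left")   # mean of 3..10
right_mean <- trim_count_mean(1:10, 2, "right")  # mean of 1..8
both_mean  <- trim_count_mean(1:10, 2, "both")   # mean of 3..8
c(left_mean, right_mean, both_mean)
```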