R: how to properly pass a column name to a function as an input

Here is a small example of the dataset I wish to process:
df = setNames(data.frame(matrix(1:100,10)), c("Dis_N1", "Dis_N2", "Dis_N3", "Dis_N4", "Dis_N5", "Dis_N6", "Dis_N7", "Dis_N8", "Dis_N9", "Dis_N10"))
FilterGap = setNames(data.frame(matrix(1:10,1)), c("Dis_N1", "Dis_N2", "Dis_N3", "Dis_N4", "Dis_N5", "Dis_N6", "Dis_N7", "Dis_N8", "Dis_N9", "Dis_N10"))
I have a function (FrcGap, see below) that processes the df dataset based on the values in FilterGap.
The old function (not working):
FrcGap = function(Var){length(na.omit(df$Var[df$Var > FilterGap$Var])) / length(na.omit(df$Var))}
I reviewed other posts and noticed that I need to replace $ with [[ in the function, so I modified the old function into the new one.
The new function (not working):
FrcGap = function(Var){length( na.omit( df[[Var[df$Var > FilterGap$Var]]] ) ) / length( na.omit( df[[Var]] ) )}
I also realized that the new function is hard to read, and it still produces an error.
The error:
> FrcGap("Dis_N1")
Error in .subset2(x, i, exact = exact) : no such index at level 1
Manual procedure (it works):
If I type each column name into the expression by hand, one at a time, it works.
length(na.omit(df$Dis_N1[df$Dis_N1 > FilterGap$Dis_N1])) / length(na.omit(df$Dis_N1))
length(na.omit(df$Dis_N2[df$Dis_N2 > FilterGap$Dis_N2])) / length(na.omit(df$Dis_N2))
length(na.omit(df$Dis_N10[df$Dis_N10 > FilterGap$Dis_N10])) / length(na.omit(df$Dis_N10))
Could you please provide your insights, comments, and suggestions for this type of work in R?
Thanks a lot.

OK, thanks for adding example data. I can get the "old" function working fine if it takes the two vectors as arguments:
FrcGap = function(var1, var2){
  length(na.omit(var1[var1 > var2])) / length(na.omit(var1))
}
If you want to run it on a single set of values you can do this:
FrcGap(df$Dis_N1, FilterGap$Dis_N1)
[1] 0.9
Or, if you want to run it over both data frames in their entirety, you can use mapply:
mapply(FrcGap, df, FilterGap)
 Dis_N1  Dis_N2  Dis_N3  Dis_N4  Dis_N5  Dis_N6  Dis_N7  Dis_N8  Dis_N9 Dis_N10
    0.9     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0
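If you would rather keep the original interface and pass the column name as a string, here is a minimal sketch. It assumes df and FilterGap are the example data from the question and that Var is always a character string. The reason the first attempt fails is that $ does not evaluate its argument: df$Var looks for a column literally named "Var" and returns NULL, whereas df[[Var]] uses the value stored in Var.

```r
# Example data from the question.
cols <- paste0("Dis_N", 1:10)
df <- setNames(data.frame(matrix(1:100, 10)), cols)
FilterGap <- setNames(data.frame(matrix(1:10, 1)), cols)

# Var is a character string such as "Dis_N1".
# Use [[ ]] on both data frames so the value of Var is used as the column name.
FrcGap <- function(Var) {
  x <- df[[Var]]            # the column of df named by Var
  gap <- FilterGap[[Var]]   # the matching threshold in FilterGap
  length(na.omit(x[x > gap])) / length(na.omit(x))
}

FrcGap("Dis_N1")       # single column, same 0.9 as above
sapply(cols, FrcGap)   # all columns, as a named vector
```

The mapply version above is probably still the cleaner choice, since it does not depend on df and FilterGap being global variables.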

Related

IF statement in R function

I am trying to construct a function with an if statement inside it.
In the code below, I basically want to use a different makeContrasts command depending on contr_x. For example, if contr_1 is given as input, the function should use the makeContrasts command in the first if block; if contr_2 is given, the one in the second if block; and so on.
But I am running into an error that says the object contr_1 is not found. I am confused because, in my understanding, contr_1 is just an input name, not an object (it was not previously defined).
I am attaching the function code and the error below; I'll appreciate any insight!
code
run_limma <- function(model_x, support_x, fit_x, editing_x, contr_x, tmp_x){
  message("starting modeling")
  model_x <- model(support_x, model_x)
  message("starting fitting")
  fit_x <- limma_diff(editing_x, model_x, fit_x)
  message("Making contrasts")
  if (contr_x == "contr_1") {
    contr_x <- makeContrasts(diseaseAD - diseaseControl,
                             levels = colnames(coef(fit_x)))
  }
  if (contr_x == "contr_2") {
    contr_x <- makeContrasts(diseaseAD_MCI - diseaseControl,
                             levels = colnames(coef(fit_x)))
  }
  if (contr_x == "contr_3") {
    contr_x <- makeContrasts(diseaseMCI - diseaseControl,
                             levels = colnames(coef(fit_x)))
  }
  if (contr_x == "contr_4") {
    contr_x <- makeContrasts(diseasePD - diseaseControl,
                             levels = colnames(coef(fit_x)))
  }
  message("making tmp file")
  tmp_x <- limma_cont(contr_x, fit_x, tmp_x)
  tmp_x
}
error
run_limma(model_1, support_1, fit_1,editing_1,contr_1,tmp_1)
starting modeling
starting fitting
Making contrasts
Error in run_limma(model_1, support_1, fit_1, editing_1, contr_1, tmp_1) :
object 'contr_1' not found
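A likely cause: in the call, contr_1 is written without quotes, so R treats it as a variable name and tries to look it up as soon as contr_x == "contr_1" is evaluated. Passing the string "contr_1" avoids the lookup. Below is a minimal sketch of selecting by name with switch. The helper pick_contrast is hypothetical and only returns the contrast expression as text, so that it runs without limma.

```r
# Hypothetical helper (not part of limma): maps a contrast *name*, given as a
# string, to the contrast expression used in the question.
pick_contrast <- function(contr_x) {
  switch(contr_x,
    contr_1 = "diseaseAD - diseaseControl",
    contr_2 = "diseaseAD_MCI - diseaseControl",
    contr_3 = "diseaseMCI - diseaseControl",
    contr_4 = "diseasePD - diseaseControl",
    stop("unknown contrast name: ", contr_x)  # reached only if nothing matched
  )
}

pick_contrast("contr_1")   # quoted: a string, so no variable lookup happens
# pick_contrast(contr_1)   # unquoted: R searches for a variable named contr_1
```

Inside run_limma, it may also help to store the contrast matrix in a new variable rather than overwriting contr_x, because the later if conditions would otherwise compare a matrix with a string.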

Unexpected symbol error in R that doesn't match my code

I am coding in RStudio and have a function called saveResults(). It takes:
sce - a Single Cell Experiment object.
opt - a list with five elements
clusterLabels - simple dataframe with two columns
The important thing is that I receive an error stating:
Error: unexpected symbol in:
"saveResults(sce = sce, opt = opt, clusteInputs()
zhengMix"
which doesn't agree at all with the arguments I pass to the function. You can see this on the last line of the code block below: I pass in the proper arguments, but the error suggests that I passed clusteInputs() and zhengMix instead of clusterLabels. I don't have a function called clusteInputs(), and zhengMix appears several lines above.
# Save the clustering data
InstallAndLoadPackagesForSC3Clustering()
opt <- GetOptionInputs()
zhengMix <- FetchzhengMix(opt)
sce <- CreateSingleCellExperiment(zhengMix)
clusterLabels <- getClusterLabels(sce)
opt <- createNewDirectoriesToSaveData(opt)
saveResults <- function(sce, opt, clusterLabels){
  print("Beginning process of saving results...")
  maxClusters = ncol(clusterLabels)/2+1
  for (n in 2:maxClusters){
    savePCAasPDF(sce, opt, numOfClusters = n, clusterLabels)
    saveClusterLabelsAsRDS(clusterLabels, numOfClusters = n, opt)
  }
  saveSilhouetteScores(sce, opt)
  print("Done.")
}
saveResults(sce = sce, opt = opt, clusterLabels = clusterLabels)
Does anyone have an idea what is going on? I'm pretty stuck on this.
This isn't the best solution, but I fixed my own problem by moving the code out of the function; running it at the top level caused no issues.

Why is Julia not taking the correct data in the function?

I started learning how to program in Julia, and I'm writing some pretty simple code, but it's not working as I wish, and I'm lost because I can't find where the error is.
Basically, I have a vector like this one: (1,0,0,1,1), and I wrote two functions that change the entries of the vector.
The first function should set every entry of the vector to 1.
The second function should flip every entry: if the entry is 1, change it to 0, and vice versa.
I have the following code:
function vectorMethodOne(vector1)
    for i = 1:length(vector1)
        if vector1[i] == 0
            vector1[i] = 1
        end
    end
    return vector1
end
function vectorMethodTwo(vector1)
    for i = 1:length(vector1)
        if vector1[i] == 0
            vector1[i] = 1
        elseif vector1[i] == 1
            vector1[i] = 0
        end
    end
    return vector1
end
The problem happens when I run the code like this:
vectorEx = rand(0:1, 5)
println("Original Vector:")
println(string(vectorEx))
println("Vector using method 1:")
vectorM1 = vectorMethodOne(vectorEx)
println(string(vectorM1))
println("Vector using method 2:")
vectorM2 = vectorMethodTwo(vectorEx)
println(string(vectorM2))
The output looks like this:
> Original Vector:
> [1,0,0,1,1]
> Vector using method 1:
> [1,1,1,1,1]
> Vector using method 2:
> [0,0,0,0,0]
But I want the output to look like this:
> Original Vector:
> [1,0,0,1,1]
> Vector using method 1:
> [1,1,1,1,1]
> Vector using method 2:
> [0,1,1,0,0]
If I only run vectorMethodTwo, it works as I want, like this:
vectorEx = rand(0:1, 5)
println("Original Vector:")
println(string(vectorEx))
println("Vector using method 2:")
vectorM2 = vectorMethodTwo(vectorEx)
println(string(vectorM2))
And the output looks like this:
> Original Vector:
> [1,0,0,1,1]
> Vector using method 2:
> [0,1,1,0,0]
I want every function to run on the original vector (1,0,0,1,1), but vectorMethodTwo is running on the modified vector (1,1,1,1,1), and I can't see where the error in my code is.
Let's look at your output:
> Original Vector:
> [1,0,0,1,1]
> Vector using method 1:
> [1,1,1,1,1]
> Vector using method 2:
> [0,0,0,0,0]
Odd. Method 2 looks like method 1 flipped. Let's check:
println(vectorMethodTwo([1,1,1,1,1]))
> [0,0,0,0,0]
Very suspicious! Why could this be happening? Please think about this before moving to the next section.
Your "functions" are mutating the vector. When you execute vector1[i] = 1 in vectorMethodOne, you are changing the contents of the vector1 that was passed in. That vector1 refers to the same memory as vectorEx.
Do not write code that mutates your inputs (unless you name the function accordingly; the Julia convention is a trailing !). Either create a copy of your vector before mutating it, or try an array comprehension:
function vectorMethodOne(vector1)
    return [x == 0 ? 1 : x for x in vector1]
end
function vectorMethodTwo(vector1)
    return [x == 0 ? 1 : x == 1 ? 0 : x for x in vector1]
end
These do not modify the contents of the input vector1 in any way.

Rewrite a function in locked env

Using pvclust::pvclust, I got an error:
Error in solve.default(crossprod(X, X/vv)) :
  Lapack routine dgesv: system is exactly singular: U[2,2] = 0
Calls: ... pvclust.merge -> lapply -> FUN -> msfit -> solve -> solve.default
Execution halted
I don't want the analysis to stop even if crossprod(X, X/vv) is a singular matrix, so I tried to insert an if {...} block into pvclust::msfit that checks whether crossprod(X, X/vv) is singular using matrixcalc::is.singular.matrix and, if so, returns NA and continues.
I saved my.msfit.R, which contains a copy of the original pvclust::msfit with if(!is.singular.matrix(...)) {...}else{...} inserted, and then ran:
methods::insertSource('/myFuncDir/my.msfit.R',package="pvclust",functions='msfit')
But I got the error below:
Error in assign(this, thisObj, envir = envwhere) :
cannot change value of locked binding for 'msfit'
In addition: Warning message:
In methods::insertSource(filename, package = "pvclust", functions = "msfit", :
cannot insert these (not found in source): "msfit"
Is there any solution? Should I send a request to the author of the pvclust package?
== Below added after posting ==
Helpful advice was given in the comments to use try/catch syntax, but I don't think it will solve my problem.
Because my English is poor, I present a toy example that illustrates the situation.
fun.a <- function(a1,a2,a3,a4){
  sum1 <- a1 + a2
  sum2 <- a2 + a3
  sum3 <- a3 + a4
  return(list(sum1,sum2,sum3))
}
fun.a(1,2,3,'Char')
Because computing sum3 raises an error, fun.a(1,2,3,'Char') returns an error.
But I want it to return
List [sum1, sum2, NaN]
If I use tryCatch(..., error=expr), each of sum1 to sum3 (in reality, the solve(...) call in pvclust::msfit) would have to be wrapped.
But fun.a (that is, msfit) is an internal function of a locked package (pvclust).
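For the toy example, here is a minimal sketch of wrapping each step so that a failing sum yields NaN instead of stopping the function. The helper safe_add is hypothetical. For the real case, one option that may work is utils::assignInNamespace, which can replace a function inside a loaded namespace; the commented line at the end assumes a patched function my.msfit already exists in your session.

```r
# Hypothetical helper, not part of any package: returns NaN when the
# addition raises an error.
safe_add <- function(x, y) {
  tryCatch(x + y, error = function(e) NaN)
}

# Toy version of the question's fun.a with each sum wrapped individually.
fun.a <- function(a1, a2, a3, a4) {
  list(safe_add(a1, a2), safe_add(a2, a3), safe_add(a3, a4))
}

str(fun.a(1, 2, 3, "Char"))   # third element is NaN, no error is raised

# Assumption: my.msfit is your patched copy of pvclust::msfit. Left commented
# out because it modifies the loaded pvclust namespace.
# utils::assignInNamespace("msfit", my.msfit, ns = "pvclust")
```

Replacing a function in someone else's namespace is a workaround for your own scripts rather than a clean fix, so asking the package author is still reasonable.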

How to make blocks of unit tests in R?

I have started to write unit tests for my functions in R, using svUnit. I wrote the tests for the functions in one file, then for the ones in another file, and I created a mainTest, where I call all the tests. So my project looks like this:
proj
|-src
| |-functions1 (containing some functions)
| |-functions2 (containing some other functions)
| |-functions3 (containing some more functions)
| |-mainFile (here I call the functions in the files above)
|-tests
  |-functions1Test (containing tests for functions in functions1 file)
  |-functions2Test (containing tests for functions in functions2 file)
  |-functions3Test (containing tests for functions in functions3 file)
  |-mainTest (containing the function that runs all the tests)
A functionsXTest file looks like this:
source('functionsX.R')
test(fun1) <- function(){
  # call the fun1 function and check the result
}
test(fun2) <- function(){
  # call the fun2 function and check the result
}
# ...
functions1Tests <- svSuite(svSuiteList()) # here
The mainTest file looks like this:
library('svUnit')
source('functions1Tests.R')
source('functions2Tests.R')
source('functions3Tests.R')
launchTests <- function(){
  clearLog()
  runTest(functions1Tests)
  runTest(functions2Tests)
  runTest(functions3Tests)
  Log()
}
I thought that the last line of functionsXTest.R groups that file's unit tests in a variable, but it seems to group all the tests defined so far: functions1Tests contains the tests for the functions in functions1.R, while functions2Tests contains the tests for both functions1.R and functions2.R. Is it possible to group all the tests in a file in one variable and then run the tests variable by variable, so that it is easier to find the problematic test?
I have found that if I pass the names of the tests to svSuite, the tests are kept separate. So with:
functions1Tests <- svSuite("fun1", "fun2")
on the last line of the functionsXTest.R file, functions2Tests no longer contains the tests from functions1Test.R.
But now I am wondering whether it is possible to split the tests in the Log by file, because currently it displays something like:
> launchTests()
= A svUnit test suite run in less than 0.1 sec with:
* testfun1 ... OK
* testfun1 ... OK
* testfun2 ... OK
* ...
== testfun1 (in runitfunctions1Test.R) run in less than 0.1 sec: OK
//Pass: 7 Fail: 0 Errors: 0//
== testfun1 (in runitfunctions2Test.R) run in less than 0.1 sec: OK
//Pass: 4 Fail: 0 Errors: 0//
== testfun2 (in runitfunctions1Test.R) run in less than 0.1 sec: OK
//Pass: 10 Fail: 0 Errors: 0//
...

Resources