How to make blocks of unit tests in R?

I have started writing unit tests for my R functions using svUnit. I wrote the tests for the functions in one file, then the tests for the functions in another file, and I created a mainTest file from which I call all the tests. My project looks like this:
proj
|-src
| |-functions1 (containing some functions)
| |-functions2 (containing some other functions)
| |-functions3 (containing some more functions)
| |-mainFile (here I call the functions in the files above)
|-tests
  |-functions1Test (containing tests for functions in functions1 file)
  |-functions2Test (containing tests for functions in functions2 file)
  |-functions3Test (containing tests for functions in functions3 file)
  |-mainTest (containing the function that runs all the tests)
A functionsXTest file looks like this:
source('functionsX.R')
test(fun1) <- function(){
  # call the fun1 function and check the result
}
test(fun2) <- function(){
  # call the fun2 function and check the result
}
# ...
functions1Tests <- svSuite(svSuiteList()) # here
The mainTest file looks like this:
library('svUnit')
source('functions1Tests.R')
source('functions2Tests.R')
source('functions3Tests.R')
launchTests <- function(){
  clearLog()
  runTest(functions1Tests)
  runTest(functions2Tests)
  runTest(functions3Tests)
  Log()
}
I thought that the last line of each functionsXTest.R file grouped that file's unit tests in a variable, but it seems to group all the tests defined so far: functions1Tests contains the tests for the functions in functions1.R, while functions2Tests contains the tests for both functions1.R and functions2.R. Is it possible to group all the tests of one file in a variable and then run the tests for each variable separately, so that it is easier to find the problematic test?

I have found that if I pass the names of the tests to svSuite, the tests are kept separate. With:
functions1Tests <- svSuite("fun1", "fun2")
on the last line of each functionsXTest.R file (listing that file's own tests), functions2Tests no longer contains the tests from functions1Test.R.
But now I am wondering whether it is possible to split the tests by file in the Log, because it currently displays something like:
> launchTests()
= A svUnit test suite run in less than 0.1 sec with:
* testfun1 ... OK
* testfun1 ... OK
* testfun2 ... OK
* ...
== testfun1 (in runitfunctions1Test.R) run in less than 0.1 sec: OK
//Pass: 7 Fail: 0 Errors: 0//
== testfun1 (in runitfunctions2Test.R) run in less than 0.1 sec: OK
//Pass: 4 Fail: 0 Errors: 0//
== testfun2 (in runitfunctions1Test.R) run in less than 0.1 sec: OK
//Pass: 10 Fail: 0 Errors: 0//
...
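The idea of keeping one group of tests per source file and reporting each group separately can be sketched in plain base R. This sketch deliberately does not use svUnit, so it says nothing about how svUnit's own Log can be split. The functions under test (add_one, double_it), the test names, and the run_group helper are all hypothetical stand-ins.

```r
# Hypothetical functions under test (stand-ins for functions1.R / functions2.R).
add_one <- function(x) x + 1
double_it <- function(x) 2 * x

# One named list of test functions per source file.
functions1Tests <- list(
  testAddOne = function() stopifnot(add_one(1) == 2)
)
functions2Tests <- list(
  testDoubleIt = function() stopifnot(double_it(2) == 4),
  testDoubleItFails = function() stopifnot(double_it(2) == 5)  # fails on purpose
)

# Run every test in one group; return "OK" or the error message for each test.
run_group <- function(tests) {
  vapply(tests, function(test_fn) {
    tryCatch({
      test_fn()
      "OK"
    }, error = function(e) paste("FAIL:", conditionMessage(e)))
  }, character(1))
}

# Run the groups one by one and print a separate report section per group.
suites <- list(functions1 = functions1Tests, functions2 = functions2Tests)
results <- lapply(suites, run_group)
for (name in names(results)) {
  cat("==", name, "==\n")
  cat(sprintf("  * %s ... %s\n", names(results[[name]]), results[[name]]), sep = "")
}
```

Because each group is run on its own, a failing test is reported under the file it belongs to.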

Related

R conditional loop, multiple lists

I need help stepping through a second loop in R when a test fails in my first loop. Here's the logic:
1. to start, use config_list[1] from list
2. then download file path_list[1] from list
3. check if file passes test
4. if so, download path_list[1 + 1] file from list and go back to step 3
5. if not, change config to next in list and go back to step 2 for failed file
Here's how far I've gotten:
path_list <- list("path1", "path2", "path3")
config_list <- list("a", "b", "c")
for (con in config_list) {
  con[1] # set initial config
  for (val in path_list) {
    print(paste(val, "downloaded")) # download file
    if (val == "path2"){ # check if file passes some test
      con[1 + 1] # if above test fails change to con[1 + 1]
      print(paste(val, "downloaded")) # download file again with new config ???
    }
    print(val)
  }
}
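One way to read the steps above is a single loop over the paths with an index into the config list that only moves forward when a file fails its test. The sketch below is a minimal version under assumptions: download_file and passes_test are hypothetical placeholders for the real download and check, and the config that finally works is kept for the following files.

```r
path_list <- list("path1", "path2", "path3")
config_list <- list("a", "b", "c")

# Hypothetical stand-ins: replace with the real download and the real test.
download_file <- function(path, config) paste(path, "downloaded with config", config)
passes_test <- function(path, config) !(path == "path2" && config == "a")

cfg <- 1              # index of the config currently in use
used <- character(0)  # config that finally worked for each path

for (path in path_list) {
  repeat {
    if (cfg > length(config_list)) stop("no config left for ", path)
    config <- config_list[[cfg]]
    print(download_file(path, config))     # step 2: download with current config
    if (passes_test(path, config)) break   # steps 3-4: passed, move to next path
    cfg <- cfg + 1                         # step 5: failed, retry with next config
  }
  used <- c(used, config)
}

print(used)  # "a" "b" "b"
```

If every file should start again from the first config, reset cfg to 1 at the top of the for loop.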

R how to properly put column name to function as an input

Here is a small example of dataset I wish to process:
df = setNames(data.frame(matrix(1:100,10)), c("Dis_N1", "Dis_N2", "Dis_N3", "Dis_N4", "Dis_N5", "Dis_N6", "Dis_N7", "Dis_N8", "Dis_N9", "Dis_N10"))
FilterGap = setNames(data.frame(matrix(1:10,1)), c("Dis_N1", "Dis_N2", "Dis_N3", "Dis_N4", "Dis_N5", "Dis_N6", "Dis_N7", "Dis_N8", "Dis_N9", "Dis_N10"))
I have a function (FrcGap, see below) that processes the df dataset based on the values in FilterGap.
The old function (not working):
FrcGap = function(Var){length(na.omit(df$Var[df$Var > FilterGap$Var])) / length(na.omit(df$Var))}
I reviewed other posts and noticed that I need to replace $ with [[ in the function, so I modified the old function as follows.
The new function (not working):
FrcGap = function(Var){length( na.omit( df[[Var[df$Var > FilterGap$Var]]] ) ) / length( na.omit( df[[Var]] ) )}
I also realized that the new function is hard to understand, and it produces errors.
The errors:
> FrcGap("Dis_N1")
Error in .subset2(x, i, exact = exact) : no such index at level 1
Manual procedure (it works):
If I insert the Var ID to the function one by one manually, it actually works.
length(na.omit(df$Dis_N1[df$Dis_N1 > FilterGap$Dis_N1])) / length(na.omit(df$Dis_N1))
length(na.omit(df$Dis_N2[df$Dis_N2 > FilterGap$Dis_N2])) / length(na.omit(df$Dis_N2))
length(na.omit(df$Dis_N10[df$Dis_N10 > FilterGap$Dis_N10])) / length(na.omit(df$Dis_N10))
Could you please provide your insights, comments, and suggestions for this type of work in R?
Thanks a lot.
OK thanks for adding example data, I can get the "old" function working fine.
FrcGap = function(var1, var2){
  length(na.omit(var1[var1 > var2])) / length(na.omit(var1))
}
If you want to run it on a single set of values you can do this:
FrcGap(df$Dis_N1, FilterGap$Dis_N1)
[1] 0.9
Or, if you want to run it over both data frames in their entirety, you can use mapply:
mapply(FrcGap, df, FilterGap)
Dis_N1 Dis_N2 Dis_N3 Dis_N4 Dis_N5 Dis_N6 Dis_N7 Dis_N8 Dis_N9 Dis_N10
0.9 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
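If the column name really has to be passed as a string, as in the original question, `[[` can be applied to both data frames with the same name. The following is a minimal sketch; FrcGapByName and its data and gap arguments are names introduced here for illustration.

```r
df <- setNames(data.frame(matrix(1:100, 10)), paste0("Dis_N", 1:10))
FilterGap <- setNames(data.frame(matrix(1:10, 1)), paste0("Dis_N", 1:10))

# var is a column name given as a string; [[ ]] looks the column up by name,
# whereas df$Var would look for a column literally called "Var".
FrcGapByName <- function(var, data = df, gap = FilterGap) {
  x <- data[[var]]
  threshold <- gap[[var]]
  length(na.omit(x[x > threshold])) / length(na.omit(x))
}

FrcGapByName("Dis_N1")            # 0.9
sapply(names(df), FrcGapByName)   # one fraction per column
```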

Variable outside function is reset every time during recursion

I am a beginner at R, so please pardon me if there is a key programming construct about R that I am not understanding.
I have the following code:
tab_level <- 0
print_tree <- function (node_index) {
  cat (tab_level)
  cat ("\n")
  # Past the tree domain
  if (node_index >= 2^depth ) {
    tab_level <- tab_level - 1
    cat ("\n")
    return()
  }
  # Print the value in the node
  # Tabs
  #cat(node_index)
  for (i in 0:tab_level) {
    cat("\t")
  }
  tab_level <- tab_level + 1
  print_tree(2*node_index)
  print_tree(2*node_index + 1)
}
print_tree (1)
However, when I do this and read the cat outputs, tab_level is 0 every single time, in the output. Am I understanding R incorrectly, and how it works with variable scope? I come from a Java background, if that helps at all, and I'm assuming it works similarly to Java.
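In R, an assignment with `<-` inside a function creates or modifies a variable local to that call, so the global tab_level is never changed. One way to avoid the problem is to pass the indentation level as an argument. The sketch below assumes depth is 2 (the question does not give its value) and prints the node index as the node's value.

```r
# Assignment inside a function does not touch the global variable.
counter <- 0
bump_local <- function() {
  counter <- counter + 1  # creates a local copy; the global counter stays 0
  counter
}
bump_local()  # 1
counter       # 0

depth <- 2  # assumed tree depth, not given in the question

# Carry the indentation level as an argument instead of a global variable.
print_tree <- function(node_index, tab_level = 0) {
  if (node_index >= 2^depth) return(invisible(NULL))  # past the tree domain
  cat(strrep("\t", tab_level), node_index, "\n", sep = "")
  print_tree(2 * node_index, tab_level + 1)
  print_tree(2 * node_index + 1, tab_level + 1)
  invisible(NULL)
}

print_tree(1)
```

Each recursive call receives its own tab_level, so nothing has to be decremented when the recursion returns.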

test_that function in R does not work

library(Rcpp)
cppFunction("
int fib(int n)
{
  if (n < 2)
    return(n);
  return( fib(n-1) + fib(n-2) );
}
")
My task is to write several tests that show whether the function returns correct results.
However, I get the following error messages.
Error during wrapup: Test failed: 'Test cppFunction'
* Not expected: 3 not equal to equals(2)
Modes of target, current: function, numeric
target, current do not match when deparsed.
* Not expected: 5 not equal to equals(5)
Modes of target, current: function, numeric
target, current do not match when deparsed.
* Not expected: 10 not equal to equals(55)
Modes of target, current: function, numeric
target, current do not match when deparsed.
* Not expected: 8 code did not generate an error.
* Not expected: 6 code did not generate an error.
* Not expected: 9 code did not generate an error.
###test that###
library(testthat)
context("Test cppFunction")
##do not know why??
test_that("Test cppFunction",{
  expect_equal(3,equals(2))
  expect_equal(5,equals(5))
  expect_equal(10,equals(55))
  expect_error(8,equals(20))
  expect_error(6,equals(7))
  expect_error(9,equals(25))
})
I cannot figure out why the test does not work.
First of all, you never even call your fib function in the tests. You should have something like
test_that("Test cppFunction",{
  expect_equal(fib(3),2)
  expect_equal(fib(5),5)
  expect_equal(fib(10),55)
})
Also, the usage of expect_error is wrong, since fib as currently implemented is not supposed to produce errors. I suspect that you wanted to test for non-equality, but that does not make much sense: if the function does not return the wrong result you picked, that does not mean the function is right. I would advise just writing more expect_equal tests. If you still want to test for non-equality, write something like
expect_false(fib(10) == 22)
In the end your test should look something like
test_that("Test cppFunction",{
  expect_equal(fib(3),2)
  expect_equal(fib(5),5)
  expect_equal(fib(10),55)
  expect_false(fib(8) == 20)
  expect_false(fib(6) == 7)
  expect_false(fib(9) == 25)
})
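The advice to write more expect_equal tests can be followed compactly by comparing a whole vector of results against known Fibonacci numbers. This is a sketch under one assumption: fib is defined here in pure R as a stand-in for the Rcpp version, so that the example runs without a C++ compiler.

```r
library(testthat)

# Pure-R stand-in for the Rcpp fib(); an assumption made for this sketch only.
fib <- function(n) if (n < 2) n else fib(n - 1) + fib(n - 2)

test_that("fib matches the first Fibonacci numbers", {
  inputs   <- 0:10
  expected <- c(0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55)
  expect_equal(sapply(inputs, fib), expected)
})
```

A single vector comparison covers eleven inputs and fails with a clear message if any value is off.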

R: check sample against ref column and dependingly add sample data to ref dataset

I'm a beginner with R (and coding in general). In January 14 I can hopefully begin and finish an R course, but I would like to learn beforehand. I understand the basics and have used functions like read.table, intersect, cbind, paste and write.table.
But I was only able to achieve part of what I want to do with two input files (shortened samples):
REF.CSV
SNP,Pos,Mut,Hg
M522 L16 S138 PF3493 rs9786714,7173143,G->A,IJKLT-M522
P128 PF5504 rs17250121,20837553,C->T,KLT-M9
M429 P125 rs17306671,14031334,T->A,IJ-M429
M170 PF3715 rs2032597,14847792,A->C,I-M170
M304 Page16 PF4609 rs13447352,22749853,A->C,J-M304
M172 Page28 PF4908 rs2032604,14969634,T->G,J2-M172
L228,7771358,C->T,J2-M172
L212,22711465,T->C,J2a-M410
SAMPLE.CSV
SNP,Chr,Allele1,Allele2
L16,Y,A,A
P128,Y,C,C
M170,Y,A,A
P123,Y,C,C
M304,Y,C,C
M172,Y,T,G
L212,Y,-0,-0
Description of what I would like to do:
A) Check if SAMPLE.SNP is in REF.SNP
B) if YES check SAMPLE.Allele status (first read, second read) vs REF.Mut (Ancestral->Derived)
B1) if both Alleles are the same and match Derived create output "+ Allele1-Allele2"
B2) if both Alleles are the same and match Ancestral create output "- Allele1-Allele2"
B3) if Alleles are not the same check if Allele2 is Derived and create output "+ Allele1-Allele2"
B4) if both Alleles are "-0" create output "? NC"
B5) else create output "? Allele1-Allele2"
B6) if NO create output "? NA"
C) Write REF.CSV + output in new row (Sample) and create OUTPUT file
OUTPUT.CSV (desired result)
SNP,Pos,Mut,Hg,Sample
M522 L16 S138 PF3493 rs9786714,7173143,G->A,IJKLT-M522,+ A-A
P128 PF5504 rs17250121,20837553,C->T,KLT-M9,- C-C
M429 P125 rs17306671,14031334,T->A,IJ-M429,? NA
M170 PF3715 rs2032597,14847792,A->C,I-M170,- A-A
M304 Page16 PF4609 rs13447352,22749853,A->C,J-M304,+ C-C
M172 Page28 PF4908 rs2032604,14969634,T->G,J2-M172,+ T-G
L228,7771358,C->T,J2-M172,? NA
L212,22711465,T->C,J2a-M410,? NC
Functions I have found interesting and tried so far:
Variant 1: A) is done, but I guess it is not possible to do C) with this?
I have not tried to code B) here.
GT <- read.table("SAMPLE.CSV",sep=',',skip=1)[,c(1,3,4)]
REF <- read.table("REF.CSV",sep=',')
rownames(REF) <- REF[,1]
COMMON <- intersect(rownames(GT),rownames(REF))
REF <- REF[COMMON,]
GT <- GT[COMMON,]
GT<-cbind(REF,paste(GT[,2],'-',X[,3],sep=','))
write.table(GT,file='OUTPUT.CSV',quote=F,row.names=F,col.names=F)
Variant 2: This is probably a complete mess, forgive me. I was just trying to build a solution using for loops and if statements, but I probably haven't understood R's syntax and logic.
I was not able to get this to run for A) and C).
I have not tried to code B) here.
GT<-read.table("SAMPLE.CSV",sep=',',skip=1)[,c(1,3,4)]
rownames(GT)<-GT[,1]
REF <- read.table("REF.CSV",sep=',')
rownames(REF)<-REF[,1]
for (i in (nrow(REF))) {
  for (j in (nrow(GT))) {
    if (GT[j,] %in% REF[i,]) {
      ROWC[i,]<-cbind(REF[i,],paste(GT[j,2],"-",GT[j,3],sep=','))
    } else {
      ROWC[i,]<-cbind(REF[i,],"NA",sep=',')
    }
  }
}
write.table(ROWC,file='OUTPUT.CSV',quote=F,row.names=F,col.names=F)
I would be happy if you could indicate which logic or functions would accomplish the task I have described. I will then try to figure it out. Thanks.
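The rules A) to C) can be sketched in base R by treating each REF.SNP entry as a space-separated list of aliases and classifying one REF row at a time. This is one possible reading of the description, not a definitive solution: the helper call_snp and the variable names ref and gt are made up for this sketch, the data is read inline from the shortened samples above, and when several aliases match the sample only the first match is used.

```r
ref <- read.csv(text = "SNP,Pos,Mut,Hg
M522 L16 S138 PF3493 rs9786714,7173143,G->A,IJKLT-M522
P128 PF5504 rs17250121,20837553,C->T,KLT-M9
M429 P125 rs17306671,14031334,T->A,IJ-M429
M170 PF3715 rs2032597,14847792,A->C,I-M170
M304 Page16 PF4609 rs13447352,22749853,A->C,J-M304
M172 Page28 PF4908 rs2032604,14969634,T->G,J2-M172
L228,7771358,C->T,J2-M172
L212,22711465,T->C,J2a-M410", colClasses = "character")

gt <- read.csv(text = "SNP,Chr,Allele1,Allele2
L16,Y,A,A
P128,Y,C,C
M170,Y,A,A
P123,Y,C,C
M304,Y,C,C
M172,Y,T,G
L212,Y,-0,-0", colClasses = "character")

# Classify one REF row against the sample genotypes (rules A, B1-B6).
call_snp <- function(ref_names, mut, gt) {
  aliases <- strsplit(ref_names, " ", fixed = TRUE)[[1]]
  hit <- match(aliases, gt$SNP)
  hit <- hit[!is.na(hit)]
  if (length(hit) == 0) return("? NA")                        # B6: not in sample
  a1 <- gt$Allele1[hit[1]]
  a2 <- gt$Allele2[hit[1]]
  if (a1 == "-0" && a2 == "-0") return("? NC")                # B4: no call
  alleles <- strsplit(mut, "->", fixed = TRUE)[[1]]
  ancestral <- alleles[1]
  derived <- alleles[2]
  geno <- paste(a1, a2, sep = "-")
  if (a1 == a2 && a1 == derived) return(paste("+", geno))     # B1
  if (a1 == a2 && a1 == ancestral) return(paste("-", geno))   # B2
  if (a1 != a2 && a2 == derived) return(paste("+", geno))     # B3
  paste("?", geno)                                            # B5
}

ref$Sample <- mapply(call_snp, ref$SNP, ref$Mut,
                     MoreArgs = list(gt = gt), USE.NAMES = FALSE)

# C) Printed to the console here; pass file = "OUTPUT.CSV" to write to disk.
write.csv(ref, quote = FALSE, row.names = FALSE)
```

With the real files, the two read.csv(text = ...) calls would be replaced by read.csv("REF.CSV", colClasses = "character") and read.csv("SAMPLE.CSV", colClasses = "character").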

Resources