How can I create a function which contains local_mock()? - r

Currently I am working on several UnitTests and my tests also contain mocking.
In order to "pass" on functions part of my function, I am applying the local_mock() function from the testthat package.
However, there are several functions on which the local_mock function needs to be applied.
They have a different structure, but same number of arguments.
Hence, I created a function as following:
.apply_local_mock = function(Function, File){
local_mock(Function = function(Argument1, Argument2, Argument3) {
result = .readCsv(file.path(targetPath, "Input", paste(File))) %>%
return(result)
})
}
An example of two functions to which local_mock will be applied to:
CalcProfitability <- function(IncomeStatement, BalanceSheet, Months){
result = (12/Months) * (IncomeStatement / BalanceSheet)
return(result)
}
CalcDiffDivByPl <- function(BalanceSheetCY, BalanceSheetPY, IncomeStatement){
result = (BalanceSheetCY - BalanceSheetPY) / IncomeStatement
return(result)
}
So the mocking test code looks as following:
.apply_local_mock(CalcProfitability,"CalcProfitability.csv")
.apply_local_mock(CalcDiffDivByPl ,"CalcDiffDivByPl.csv")
But I am receiving the following error:
Error: Function FunctionName not found in environment testthat.
Can anyone please advise how to include local_mock() function in a new function?

Related

Unable to update data in dataframe

i tried updating data in dataframe but its unable to get updating
//Initialize data and dataframe here
user_data=read.csv("train_5.csv")
baskets.df=data.frame(Sequence=character(),
Challenge=character(),
countno=integer(),
stringsAsFactors=FALSE)
/Updating data in dataframe here
for(i in 1:length((user_data)))
{
for(j in i:length(user_data))
{
if(user_data$challenge_sequence[i]==user_data$challenge_sequence[j]&&user_data$challenge[i]==user_data$challenge[j])
{
writedata(user_data$challenge_sequence[i],user_data$challenge[i])
}
}
}
writedata=function( seqnn,challng)
{
#print(seqnn)
#print(challng)
newRow <- data.frame(Sequence=seqnn,Challenge=challng,countno=1)
baskets.df=rbind(baskets.df,newRow)
}
//view data here
View(baskets.df)
I've modified your code to what I believe will work. You haven't provided sample data, so I can't verify that it works the way you want. I'm basing my attempt here on a couple of common novice mistakes that I'll do my best to explain.
Your writedata function was written to be a little loose with it's scope. When you create a new function, what happens in the function technically happens in its own environment. That is, it tries to look for things defined within the function, and then any new objects it creates are created only within that environment. R also has this neat (and sometimes tricky) feature where, if it can't find an object in an environment, it will try to look up to the parent environment.
The impact this has on your writedata function is that when R looks for baskets.df in the function and can't find it, R then turns to the Global Environment, finds baskets.df there, and then uses it in rbind. However, the result of rbind gets saved to a baskets.df in the function environment, and does not update the object of the same name in the global environment.
To address this, I added an argument to writedata that is simply named data. We can then use this argument to pass a data frame to the function's environment and do everything locally. By not making any assignment at the end, we implicitly tell the function to return it's result.
Then, in your loop, instead of simply calling writedata, we assign it's result back to baskets.df to replace the previous result.
for(i in 1:length((user_data)))
{
for(j in i:length(user_data))
{
if(user_data$challenge_sequence[i] == user_data$challenge_sequence[j] &&
user_data$challenge[i] == user_data$challenge[j])
{
baskets.df <- writedata(baskets.df,
user_data$challenge_sequence[i],
user_data$challenge[i])
}
}
}
writedata=function(data, seqnn,challng)
{
#print(seqnn)
#print(challng)
newRow <- data.frame(Sequence = seqnn,
Challenge = challng,
countno = 1)
rbind(data, newRow)
}
I'm not sure what you're programming background is, but your loops will be very slow in R because it's an interpreted language. To get around this, many functions are vectorized (which simply means that you give them more than one data point, and they do the looping inside compiled code where the loops are fast).
With that in mind, here's what I believe will be a much faster implementation of your code
user_data=read.csv("train_5.csv")
# challenge_indices will be a matrix with TRUE at every place "challenge" and "challenge_sequence" is the same
challenge_indices <- outer(user_data$challenge_sequence, user_data$challenge_sequence, "==") &
outer(user_data$challenge, user_data$challenge, "==")
# since you don't want duplicates, get rid of them
challenge_indices[upper.tri(challenge_indices, diag = TRUE)] <- FALSE
# now let's get the indices of interest
index_list <- which(challenge_indices,arr.ind = TRUE)
# now we make the resulting data set all at once
# this is much faster, because it does not require copying the data frame many times - which would be required if you created a new row every time.
baskets.df <- with(user_data, data.frame(
Sequence = challenge_sequence[index_list[,"row"]],
challenge = challenge[index_list[,"row"]]
)

Is it possible to change the value of R6 function ? (Good style OOP programming?)

I'm coming from C++ background, trying to make use of it for R OOP programming with R6 package.
Consider the following typical situation when writing a large OOP code. -
You have a class, in which you have several (possibly many) functions, each of which may also be quite complex and with many lines of code:
# file CTest.R
cTest <- R6Class(
"CTest",
public = list(
z = 10,
fDo1 = function() {
# very long and complex code goes here
self$z <- self$z*2; self$z
},
fDo2 = function() {
# another very long and complex code goes here
print(self)
}
)
) #"CTest"
Naturally, you don't want to put ALL your long and various functions in the same (CTest.R) file - it will become messy and unmanageable.
If you program in C++, normal way to program such code is : first, you declare you functions in .h file, then you create .c files for each you complex function, where you define your function. This makes it possible to do collaborative code writing, including efficient source-control.
So, I've tried to do something similar in R, like: first, declaring a function as in code above, and then, trying to assign the "actual long and complex" code to it later (which later I would put in a separate file CTest-Do1.R):
cTest$f <- function() {
self$z <- self$z*100000; self$z
}
Now I test if it works:
> tt <- cTest$new(); tt; tt$fDo1(); tt
<CTest>
Public:
clone: function (deep = FALSE)
fDo1: function ()
fDo2: function ()
z: 10
[1] 20
<CTest>
Public:
clone: function (deep = FALSE)
fDo1: function ()
fDo2: function ()
z: 20
No, it does not.- As seen from output above, the function has not been changed.
Any advice?
Thanks to Grothendieck's comment above, there's a reasonable workaround to make it work.
Instead of this:
# CTest-Do1_doesnotwork.R
cTest$fDo1 <- function() {
...
}
write this:
# CTest-Do1_works.R
cTest$set(
overwrite = TRUE, "public", "fDo1",
function() {
...
}
)
This code can now be written in separate file, as originally desired.
I still wonder though - Is the above describe way actually the common(best) practice for writing large OOP codes in R community? (looks a bit strange to me).
If not, what is it (beyond just using source()) ? - so that to enable collaborative coding and source control for separate parts (functions) of a class ?
I came here also searching for R6 best practice. One way that I've seen (here) is to define the functions elsewhere as normal R functions and pass in self, private etc as required
cTest<- R6::R6Class("CTest",
public = list(
fDo1 = function()
cTestfDo1(self),
fDo2 = function(x)
cTestfDo2(self, private, x)
))
and else where have
cTestfDo1 <- function(self) {
self$z <- self$z*2; self$z
}
and somewhere else
cTestfDo2 <- function(self, private, x) {
self$z * private$q + x
}
etc
I don't know if it's best practice, or efficient, but the class definition looks neat with it and if the cTestfDo1 functions are not exported then it's relatively neat in the namespace too.

dump() in R not source()able- output contains "..."

I'm trying to use dump() to save the settings of my analysis so I can examine them in a text editor or reload them at a later date.
In my code I'm using the command
dump(ls(), settingsOutput, append=TRUE)
The file defined by `settingsOutput' gets created, but the larger objects and locally defined functions are truncated. Here's an excerpt from such a file. Note these files are generally on the order of a few kb.
createFilePrefix <-
function (runDesc, runID, restartNumber)
{
...
createRunDesc <-
function (genomeName, nGenes, nMix, mixDef, phiFlag)
{
...
datasetID <-
"02"
descriptionPartsList <-
c("genomeNameTest", "nGenesTest", "numMixTest", "mixDefTest",
"phiFlagTest", "runDescTest", "runIDTest", "restartNumberTest"
...
diffTime <-
structure(0.531, units = "hours", class = "difftime")
dissectObjectFileName <-
function (objectFileName)
{
...
divergence <-
0
Just for reference, here's one of the functions defined above
createFilePrefix <- function(runDesc, runID, restartNumber){
paste(runDesc, "_run-", runID, "_restartNumber-", restartNumber, sep="")
}
Right now I'm going back and removing the problematic lines and then loading the files, but I'd prefer to actually have code that works as intended.
Can anyone explain to me why I'm getting this behavior and what to do to fix it?

return list of dataframe from inside a function to apply rbind using do.call()

Below is a trimmed version of my script. I am hoping my function will return me a list per loop iteration, so that I can rbind all the list to form a new data frame, but when I am executing this script, I keep getting the error:
do.call("rbind", listofdfs) : object 'listofdfs' not found
Thank you all for your help.
library(DBI)
library(RPostgreSQL)
drv<- dbDriver("MyDataBase")
con<-dbConnect(drv,dbname="DB_Name",
host="DB_Location",port=number,user="MyName",password= "Password")
dates <- seq(as.Date(as.character(Sys.Date() - 33)), as.Date(as.character(Sys.Date() - 1)), by=1)
my_function<-function(dates){
listofdfs<-list()
for(i in 1:length(dates){
data<-dbGetQuery(con, sprintf("select X,Y,Z from TABLE where date>=date('%s')", dates[i])
data$newColumn<-mean(data$X)
listofdfs[[i]]<-data
}
return(listofdfs)
}
df<-do.call("rbind", listofdfs)
I have a small simplified example to refer, please refer to the dates variable from above
my_list_function<-function(dates){
for(i in 1:length(dates))
{
my_list<-list()
my_list[[i]]<-i
}
return(my_list) }
k<-do.call(rbind,my_list(dates))
View(k)
now running
do.call(rbind,my_list(dates))
returns error could not find function "my_list" and running do.call(rbind,my_list_function(dates)) works but is only giving 33.
Thanks again for help.
listofdfs is a variable that is declared within your function. Therefore it is not defined outside of its body.
but because it is returned by the function, you can access it by calling the function itself:
df<-do.call("rbind", my_function(dates))
Also on to make you small example work:
my_list_function<-function(dates){
my_list<-list()
for(i in 1:length(dates))
{
my_list[[i]]<-i
}
return(my_list)
}
k<-do.call(rbind,my_list_function(dates))

Which library is the pr_DB object defined in?

I am completely new to R.
I am trying to use the dist object with a custom function based on the specification here, but I was unable to pass the custom function directly by name, so I tried to add it using the registry described here, but it appears that I am missing a library.
However, I'm not sure which library I need and cannot find a reference to find the name of the library.
Here's a code sample that I'm trying to run:
library(cluster)
myfun <- function(x,y) {
numDiffs <- 0;
for (i in x) {
if (x[i] != y[i])
numDiffs <- numDiffs + 1;
}
return(numDiffs);
}
summary(pr_DB)
pr_DB$set_entry(FUN = myfun, names = c("myfun", "vectorham"))
pr_DB$get_entry("MYFUN")
Here's the error:
Error in summary(pr_DB) : object 'pr_DB' not found
Execution halted
You need to learn the conventions used by R help pages. That "{proxy}" at the top of the page you linked to is really the answer to your question. The convention for the help page construction is "topic {package_name}".

Resources