How to accumulate the results of readr::read_lines_chunked? - r

I'm using readr::read_lines_chunked in the following way:
if(!require(readr)) install.packages("readr", repos = "http://cran.us.r-project.org")
mytb <- NULL
read_lines_chunked(file="/tmp/huge.xml", chunk_size=10, callback = function(xml, pos) {
  # extract values from xml into tmp
  if (is.null(mytb)) {
    mytb <- as_tibble(tmp)
  } else {
    mytb <- bind_rows(mytb, as_tibble(tmp))
  }
})
But this doesn't work: mytb always ends up being NULL. How do you accumulate the results into a tibble?

I found the solution. This package has a group of callback handlers that wrap the custom handler. So this is how it works:
mytb <- read_lines_chunked(file="/tmp/huge.xml", chunk_size=10, callback = DataFrameCallback$new(function(xml, pos) {
  # extract values from xml into tmp
  as_tibble(tmp)
}))
Note the DataFrameCallback$new(...) wrapper: the callback returns the tibble for each chunk, and the wrapper stitches those tibbles together row-wise (as rbind would) into the final result.
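A minimal, self-contained sketch of the same pattern, assuming a throwaway text file in place of /tmp/huge.xml and a made-up per-chunk summary (the columns n_lines and n_chars are purely illustrative, not part of the original question):

```r
library(readr)
library(tibble)

# Hypothetical input: 25 short lines written to a temporary file
path <- tempfile(fileext = ".txt")
writeLines(sprintf("line %d", 1:25), path)

# Each callback call returns a one-row tibble for its chunk;
# DataFrameCallback collects the per-chunk tibbles and combines them at the end.
result <- read_lines_chunked(
  file = path,
  chunk_size = 10,
  callback = DataFrameCallback$new(function(lines, pos) {
    tibble(n_lines = length(lines), n_chars = sum(nchar(lines)))
  })
)

print(result)
```

With 25 lines and chunk_size = 10 the callback runs three times, so result should have one row per chunk. Nothing has to be assigned from inside the callback, which avoids the scoping problem in the question.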

Related

Capturing ellipsis arguments from within an internal function

I'm trying to extract arguments passed to ... from within an internal function to perform validity check. Since the only purpose of the function is to check ellipsis, I'd like the function to have no parameter and capture the ellipsis from the parent function internally.
Here's a simple example of what I'd like to do:
check_dots <- function() {
# capture ... arguments here
if (rlang::dots_n(...) == 1L && ... == "foo") {
stop()
}
}
(function(...) {
check_dots()
"success"
})("foo", "bar")
I've tried using formals(fun = rlang::caller_fn()) to extract ... arguments without success.
The following, using base R, does what you want:
check_dots = function () {
  call = match.call(definition = sys.function(-1L), call = sys.call(-1L), expand.dots = FALSE)
  if (length(call$...) == 1L && call$...[[1L]] == 'foo') stop('error')
}
‘rlang’ has caller_call as a rough equivalent of match.call, but it’s missing an option to prevent expanding dots, so I don’t know how to do the same as above using ‘rlang’.
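To see the base R approach end to end, here is a small runnable sketch. The wrapper name outer and the error message are assumptions made for illustration. Note that match.call returns the unevaluated argument expressions, so the comparison only succeeds when "foo" is passed as a literal string.

```r
check_dots <- function() {
  # Match the caller's call against the caller's own definition,
  # keeping everything passed via ... together in a single element.
  call <- match.call(
    definition = sys.function(-1L),
    call = sys.call(-1L),
    expand.dots = FALSE
  )
  dots <- call$...
  if (length(dots) == 1L && identical(dots[[1L]], "foo")) {
    stop("a single 'foo' argument is not allowed")
  }
  invisible(TRUE)
}

# Hypothetical caller that delegates validation of ... to check_dots()
outer <- function(...) {
  check_dots()
  "success"
}

ok <- outer("foo", "bar")   # two arguments, so the check passes
err <- tryCatch(outer("foo"), error = function(e) conditionMessage(e))
```

Here ok holds the caller's return value, while err holds the message raised by check_dots for the single "foo" argument.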

Run testthat test in separate R session (how to combine the outcomes)

I need to test package loading operations (for my multiversion package) and know that unloading namespaces and the like is dangerous work. So I want to run every test in a fresh R session. Running my tests in parallel does not meet this requirement, since it reuses worker processes, and these get dirty.
So I thought callr::r would help me out. Unfortunately I am again stuck with the minimally documented reporters it seems.
The following is a minimal example. Placed in file test-mytest.R.
test_that('test 1', {
expect_equal(2+2, 5)
})
reporter_in <- testthat::get_reporter()
# -- 1 --
reporter_out <- callr::r(
function(reporter) {
reporter <- testthat::with_reporter(reporter, {
testthat::test_that("test inside", {
testthat::expect_equal('this', 'wont match')
})
})
},
args = list(reporter = reporter_in),
show = TRUE
)
# -- 2 --
testthat::set_reporter(reporter_out)
# -- 3 --
test_that('test 2', {
expect_equal(2+2, 8)
})
I called this test file using:
# to be able to check the outcome, work with a specific reporter
summary <- testthat::SummaryReporter$new()
testthat::test_file('./tests/testthat/test-mytest.R', reporter = summary)
Which seems to do what I want, but when looking at the results...
> summary$end_reporter()
== Failed ===============================================================================================
-- 1. Failure (test-load_b_pick_last_true.R:5:5): test 1 ------------------------------------------------
2 + 2 (`actual`) not equal to 5 (`expected`).
`actual`: 4
`expected`: 5
== DONE =================================================================================================
...it is only the first test that is returned.
How it works:
An ordinary test is executed.
The reporter, currently in use, is obtained (-- 1 --)
callr::r is used to call a testthat block including a test.
Within the call, I tried using set_reporter, but with_reporter is practically identical.
The callr::r call returns the reporter (tried it with get_reporter(), but with_reporter also returns the reporter (invisibly))
Now the returned reporter seems fine, but when setting it as the actual reporter with set_reporter, it seems that it is not overwriting the actual reporter.
Note that at -- 2 --, the reporter_out contains both test outcomes.
Question
I am not really sure what I expect it to do, but in the end I want the results to be added to the original reporter ((summary or) reporter_in that is, if that is not some kind of copy).
One workaround I can think of would be to move the actual test execution outside of the callr::r call, but gather the testcases inside.
I think it is neat, as long as you can place these helper functions (see the elaborate example) in your package, you can write tests with little overhead.
It doesn't answer how to work with the 'reporter' object though...
Simple example:
test_outcome <- callr::r(
function() {
# devtools::load_all()
list(
check1 = mypackage::sum(5,5), # some imaginary exported functions sum and name.
check2 = mypackage::name()
)
}
)
test_that('My test case', {
expect_equal(test_outcome$check1, 10)
expect_equal(test_outcome$check2, 'Siete')
})
Elaborate example
Note that everything from .add_test to .exp_true is just function definitions. These are better placed in your package, so that they become available when it is loaded with devtools::load_all(); by default, load_all also loads non-exported functions.
test_outcome <- callr::r(
function() {
# devtools::load_all()
# Defining helper functions
tst <- list(desc = 'My first test', tests = list())
.add_test <- function(type, A, B) {
# To show at least something about what is actually tested when returning the result, we can add the actual `.exp_...` call to the test.
call <- as.character(sys.call(-1))
tst$tests[[length(tst$tests) + 1]] <<- list(
type = type, a = A, b = B,
# (I couldn't find a better way to create a nice call string)
call = paste0(call[1], '(', paste0(collapse = ', ', call[2:length(call)]), ')'))
}
.exp_error <- function(expr, exp_msg) {
err_msg <- ''
tryCatch({expr}, error = function(err) {
err_msg <<- err$message
})
.add_test('error', err_msg, exp_msg)
}
.exp_match <- function(expr, regex) {
.add_test('match', expr, regex)
}
.exp_equal <- function(expr, ref) {
.add_test('equal', expr, ref)
}
.exp_false <- function(expr) {
.add_test('false', expr, FALSE)
}
.exp_true <- function(expr) {
.add_test('true', expr, TRUE)
}
# Performing the tests
.exp_match('My name is Siete', 'My name is .*')
.exp_equal(mypackage::sum(5,5), 10) # some imaginary exported functions sum and name.
.exp_match(mypackage::name(), 'Siete')
.exp_false('package:testthat' %in% search())
return(tst)
},
show = TRUE)
# Performing the actual testthat tests:
.run_test_batch <- function(test_outcome) {
test_that(test_outcome$desc, {
for (test in test_outcome$tests) {
# 'test' is a list with the fields 'type', 'a', 'b' and 'call'.
# Where 'type' can contain 'match', 'error', 'true', 'false' or 'equal'.
if (test$type == 'equal') {
with(test, expect_equal(a, b, label = call))
} else if (test$type == 'true') {
expect_true( test$a, label = test$call)
} else if (test$type == 'false') {
expect_false(test$a, label = test$call)
} else if (test$type %in% c('match', 'error')) {
with(test, expect_match(a, b, label = call))
}
}
})
}
.run_test_batch(test_outcome)
When moving the functions to your package, you would also need the following initialization function.
tst <- new.env(parent = emptyenv())
tst$desc = ''
tst$tests = list()
.initialize_test <- function(desc) {
tst$desc = desc
tst$tests = list()
}
It works as follows:
An empty list is created: tst
By calling .exp_... functions, tests are added to that list
The list with tests is returned by the function in callr::r
Then we loop over the list and execute every test

How to return event$data in rstudio/websocket

I am trying to extend websocket::WebSocket with a method that sends some data and returns the message, so that I can assign it to an object. My question is pretty much identical to https://community.rstudio.com/t/capture-streaming-json-over-websocket/16986. Unfortunately, the user there never revealed how they solved it. My idea was to have the onMessage method return event$data, i.e. something like:
my_websocket <- R6::R6Class("My websocket",
inherit = websocket::WebSocket,
public = list(
foo = function(x) {
msg <- super$send(paste("x"))
return(msg)
} )
)
load_websocket <- function(){
ws <- my_websocket$new("ws://foo.local")
ws$onMessage(function(event) {
return(event$data)
})
return(ws)
}
my_ws <- load_websocket()
my_ws$foo("hello") # returns NULL
But after spending a good hour on the WebSocket source code, I am still completely in the dark as to where exactly the callback happens, "R environment wise".
You need to use the super-assignment operator <<-, which is most useful in conjunction with closures to maintain state. Unlike the usual single-arrow assignment (<-), which always works on the current level, the double-arrow operator can modify variables in parent levels.
my_websocket <- R6::R6Class("My websocket",
inherit = websocket::WebSocket,
public = list(
foo = function(x) {
msg <<- super$send(paste("x"))
return(msg)
} )
)
load_websocket <- function(){
ws <- my_websocket$new("ws://foo.local")
ws$onMessage(function(event) {
return(event$data)
})
return(ws)
}
my_ws <- load_websocket()
my_ws$foo("hello")
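The closure idea in the answer can be illustrated without a live connection. The sketch below is not the websocket API: make_collector, on_message and get are hypothetical names, and the event is faked as a plain list with a data field. It assumes, as the NULL result in the question suggests, that a callback's return value never reaches the code that sent the message, so the callback stores the data with <<- instead.

```r
# Hypothetical helper: builds a callback plus an accessor sharing one state
make_collector <- function() {
  messages <- character()
  list(
    on_message = function(event) {
      # <<- modifies 'messages' in the enclosing environment,
      # so the value survives after the callback returns
      messages <<- c(messages, event$data)
      invisible(NULL)
    },
    get = function() messages
  )
}

collector <- make_collector()

# Simulate two incoming events, as an event loop might deliver them
collector$on_message(list(data = "hello"))
collector$on_message(list(data = "world"))

received <- collector$get()
```

With a real connection, a function like collector$on_message would be the one registered through ws$onMessage, and collector$get() would be read once messages have arrived.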

R List functions in file

How do I list all functions of a certain R file doing something like
list = list.all.functions(file.name, alphabetical = TRUE, ...)
where list is a string vector containing the names of the functions in file.name?
The solution from "How to list all the functions and their arguments in an R file?" gives no output for me (since I am not interested in arguments, I opened a new question).
EDIT
File allometry.R starts with
#==========================================================================================#
#==========================================================================================#
# Standing volume of a tree. #
#------------------------------------------------------------------------------------------#
dbh2vol <<- function(hgt,dbh,ipft){
vol = pft$b1Vol[ipft] * hgt * dbh ^ pft$b2Vol[ipft]
return(vol)
}#end function dbh2ca
#==========================================================================================#
#==========================================================================================#
My main looks like
rm(list=ls())
here = "/directory/of/allometry.R/"
setwd(here)
is_function = function (expr) {
if (! is_assign(expr))
return(FALSE)
value = expr[[3]]
is.call(value) && as.character(value[[1]]) == 'function'
}
function_name = function (expr)
as.character(expr[[2]])
is_assign = function (expr)
is.call(expr) && as.character(expr[[1]]) %in% c('=', '<-', 'assign')
file_parsed = parse("allometry.R")
functions = Filter(is_function, file_parsed)
function_names = unlist(Map(function_name, functions))
Probably too late to join the party, but better late than never.
There is a package called NCmisc which has a function to list all functions in a file. It returns a list where the names of the components are the names of the packages they belong to. If there are any functions in the global environment, they will be under the .GlobalEnv list component. Simply load all packages the file uses and then run the following:
all.functions <- list.functions.in.file(
filename = "/path/to/file/my_file.R")
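As a base R alternative, here is a hedged sketch along the lines of the parse-based code in the question. The function name list_functions_in_file and the sample file contents are assumptions. One difference from the question's is_assign helper is that <<- is also treated as an assignment. Since allometry.R defines dbh2vol with <<-, that may be why the original filter returned nothing.

```r
# Hypothetical helper: names of functions assigned at the top level of a file
list_functions_in_file <- function(path, alphabetical = TRUE) {
  exprs <- as.list(parse(file = path, keep.source = FALSE))

  is_function_assignment <- function(expr) {
    is.call(expr) &&
      length(expr) == 3L &&
      is.name(expr[[1L]]) &&
      as.character(expr[[1L]]) %in% c("=", "<-", "<<-") &&
      is.name(expr[[2L]]) &&
      is.call(expr[[3L]]) &&
      identical(expr[[3L]][[1L]], as.name("function"))
  }

  found <- Filter(is_function_assignment, exprs)
  fn_names <- vapply(found, function(e) as.character(e[[2L]]), character(1))
  if (alphabetical) sort(fn_names) else fn_names
}

# Illustrative file mixing the three assignment operators and a non-function
path <- tempfile(fileext = ".R")
writeLines(c(
  "dbh2vol <<- function(hgt, dbh) hgt * dbh",
  "area = function(r) pi * r^2",
  "x <- 42",
  "helper <- function() NULL"
), path)

fn_names <- list_functions_in_file(path)
```

The file is only parsed, never evaluated, so nothing in it is run and no packages need to be loaded first. Functions created in other ways, for example through assign(), are not detected by this sketch.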

Error from Rfacebook 'getPage' with since&until

I am using Rfacebook version 0.6.
When I call getPage with since and until dates as follows, I get the error below. What am I doing wrong, or is there something that needs to be updated in the package itself?
Note: <facebook_page_name>, <my_app_id>, <my_app_secret> are placeholders for illustration, without using the actual values.
Here are the details:
content<-get_fb_data("<facebook_page_name>",since="2016/01/01",until="2016/01/20",condition=2)
get_fb_data<-function(page_name,no_of_records,since_date,until_date,condition)
{
#get data from facebook page
library("Rfacebook")
fb_oauth <- fbOAuth(app_id="<my_app_id>", app_secret="<my_app_secret>",
extended_permissions = FALSE)
if (condition == 1)
{
content<-getPage(page_name, fb_oauth, no_of_records, feed = TRUE)
}
else
{
since_date<-paste(since_date,"00:00:00 IST",sep=" ")
until_date<-paste(until_date,"23:59:59 IST",sep=" ")
from_value<-as.numeric(as.POSIXct(since_date))
to_value<-as.numeric(as.POSIXct(until_date))
content<-getPage(page_name, fb_oauth,
since = from_value,
until = to_value,
feed = TRUE)
}
return(content)
}
Error displayed:
Error in as.Date.numeric(since) : 'origin' must be supplied
Per debug, this is from function as.Date called in getPage.
This should work:
library("Rfacebook")
fb_oauth <- fbOAuth(app_id="<my_app_id>", app_secret="<my_app_secret>",
extended_permissions = FALSE)
get_fb_data <- function(page_name, no_of_records, since_date,
until_date, condition){
if (condition == 1){
content<-getPage(page_name, fb_oauth, no_of_records, feed = TRUE)
} else{
content <- getPage(page_name, fb_oauth,
since = since_date,
until = until_date,
feed = TRUE)
}
content
}
content <- get_fb_data("humansofnewyork",
since_date="2016/01/01",
until_date="2016/01/20",
condition=2)
There is no need to change the date format: as the error shows, getPage calls as.Date on since and until itself, so the date strings can be passed as they are. It is also safer to write else on the same line as the } that closes the if, because at the top level an else on a new line is a syntax error. Finally, don't load the package or create fb_oauth inside the function; there is no reason to repeat that on every call.
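The difference between the two kinds of input can be checked in plain R, without Rfacebook. This is only a sketch of how as.Date treats strings versus epoch seconds. The UTC time zone is an assumption made to keep the numbers reproducible, and it says nothing about what getPage does beyond the as.Date call named in the error.

```r
since_date <- "2016/01/01"

# A date string in this format is parsed directly
parsed <- as.Date(since_date)

# Converting to seconds since the epoch, as the question's code does,
# yields a plain number that carries no date information by itself
seconds <- as.numeric(as.POSIXct("2016-01-01 00:00:00", tz = "UTC"))

# To get back to a Date, the number must first be interpreted
# relative to an explicit origin
back <- as.Date(as.POSIXct(seconds, origin = "1970-01-01", tz = "UTC"))

print(parsed)
print(seconds)
print(back)
```

Passing the strings through unchanged, as in the answer above, sidesteps the round trip entirely.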
