Missing `parse` information inside vignette build - r

Goal
The goal is to create a package that parses R scripts and lists the functions they use (functions from the package itself, as mvbutils does, but also imported ones).
Function
The main function relies on parsing R script with
d<-getParseData(x = parse(text = deparse(x)))
Reproducible code
For example in an interactive R session the output of
x<-test<-function(x){x+1}
d<-getParseData(x = parse(text = deparse(x)))
has the following first few lines:
   line1 col1 line2 col2 id parent          token terminal     text
23     1    1     4    1 23      0           expr    FALSE
1      1    1     1    8  1     23       FUNCTION     TRUE function
2      1   10     1   10  2     23            '('     TRUE        (
3      1   11     1   11  3     23 SYMBOL_FORMALS     TRUE        x
4      1   12     1   12  4     23            ')'     TRUE        )
Error
When building a knitr vignette that contains this chunk, either with Knit HTML from RStudio or with devtools::build_vignettes, the output of the previous chunk of code is NULL. Calling knitr::knit from inside an interactive R session, on the other hand, gives the correct output.
Questions:
Is there a reason for the parser to behave differently inside the knit function/environment, and is there a way to bypass this?
Update
Changing code to:
x<-test<-function(x){x+1}
d<-getParseData(x = parse(text = deparse(x),keep.source = TRUE))
Fixes the issue, but this does not answer the question of why the same function behaves differently.

From the help page ?options:
keep.source:
When TRUE, the source code for functions (newly defined or loaded) is stored internally allowing comments to be kept in the right places. Retrieve the source by printing or using deparse(fn, control = "useSource").
The default is interactive(), i.e., TRUE for interactive use.
When building the vignette, you are running a non-interactive R session, so the source code is discarded in parse().
parse(file = "", n = NULL, text = NULL, prompt = "?",
keep.source = getOption("keep.source"), srcfile,
encoding = "unknown")
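To illustrate the difference, here is a minimal sketch (the function f is only an example) that sets keep.source explicitly in both directions instead of relying on the session type:

```r
f <- function(x) { x + 1 }

# Source references discarded, as happens by default in a non-interactive session
no_src <- getParseData(parse(text = deparse(f), keep.source = FALSE))

# Source references kept, as happens by default in an interactive session
with_src <- getParseData(parse(text = deparse(f), keep.source = TRUE))

is.null(no_src)          # no parse data available
is.data.frame(with_src)  # parse data returned as a data frame
```

Passing keep.source = TRUE explicitly, as in the update to the question, makes the result independent of how the code is run.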

Related

Error Message when Opening a CSV File in R

I tried to open a CSV file in R using RStudio but got this warning message:
In readLines("persons.csv") : incomplete final line found on 'persons.csv'
What is wrong with the file, and how can I fix it?
You can likely ignore this as it probably still worked. Here is an example without a final newline which gives that warning and another one which has the final newline which does not give the warning. Both worked.
cat("a,b\n1,2", file = "test1.csv")
read.csv("test1.csv")
## a b
## 1 1 2
## Warning message:
## In read.table(file = file, header = header, sep = sep, quote = quote, :
## incomplete final line found by readTableHeader on 'test1.csv'
cat("a,b\n1,2\n", file = "test2.csv")
read.csv("test2.csv")
## a b
## 1 1 2
To address this try one of these:
Just ignore it as it probably worked.
Bring the file into a text editor and write it out again. That often eliminates the warning.
Use readr::read_csv. The indicated argument eliminates many messages that are otherwise output by that command.
library(readr)
read_csv("test1.csv", show_col_types = FALSE)
## # A tibble: 1 x 2
## a b
## <dbl> <dbl>
## 1 1 2
Use data.table::fread. It won't give that message.
library(data.table)
fread("test1.csv", data.table = FALSE)
## a b
## 1 1 2
From the Windows cmd line use this (note dot)
echo. >> test1.csv
or under bash (no dot)
echo >> test1.csv
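The same repair can also be done from within R. The following is a minimal sketch; ensure_final_newline is a hypothetical helper written for this example, not a function from base R or any package, and it reads the whole file into memory, so it is meant for files of moderate size.

```r
# Hypothetical helper: append a newline only if the file does not end with one.
ensure_final_newline <- function(path) {
  size <- file.size(path)
  if (is.na(size) || size == 0) return(invisible(FALSE))
  bytes <- readBin(path, what = "raw", n = size)
  if (bytes[length(bytes)] != as.raw(0x0a)) {
    cat("\n", file = path, append = TRUE)
    return(invisible(TRUE))
  }
  invisible(FALSE)
}

path <- tempfile(fileext = ".csv")
cat("a,b\n1,2", file = path)   # no final newline
ensure_final_newline(path)     # appends the missing newline
read.csv(path)                 # reads without the warning
```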

Changing Default dots argument in R

This question is related to this question and this question.
I need to assign a default to the ... argument in a function. I have successfully been able to use the default package to accomplish this for specific arguments. For instance, let's say I want to allow toJSON from the jsonlite package to show more than four digits. The default is 4, but I want to show 10.
library(jsonlite)
library(default)
df <- data.frame(x = 2:5,
y = 2:5 / pi)
df
#> x y
#> 1 2 0.6366198
#> 2 3 0.9549297
#> 3 4 1.2732395
#> 4 5 1.5915494
# show as JSON - defaults to four digits
toJSON(df)
#> [{"x":2,"y":0.6366},{"x":3,"y":0.9549},{"x":4,"y":1.2732},{"x":5,"y":1.5915}]
# use default package to change to 10
default(toJSON) <- list(digits = 10)
toJSON(df)
#> [{"x":2,"y":0.63661977237},{"x":3,"y":0.95492965855},{"x":4,"y":1.2732395447},{"x":5,"y":1.5915494309}]
There is another function called stream_out, which uses toJSON but accepts the digits argument only through ....
> stream_out(df)
{"x":2,"y":0.63662}
{"x":3,"y":0.95493}
{"x":4,"y":1.27324}
{"x":5,"y":1.59155}
Complete! Processed total of 4 rows.
>
> stream_out(df, digits = 10)
{"x":2,"y":0.63661977237}
{"x":3,"y":0.95492965855}
{"x":4,"y":1.2732395447}
{"x":5,"y":1.5915494309}
Complete! Processed total of 4 rows.
So even though I have changed the digits in toJSON, it isn't passed to the ... in stream_out. I cannot change this in the same manner as with toJSON.
> default(stream_out) <- list(digits = 10)
Error: 'digits' is not an argument of this function
This is not strictly a jsonlite question, but that is my use case here. I need to somehow change the ... argument of the stream_out function so that any time it is used, 10 digits are returned, rather than 4. However, any examples that show how to change defaults of ... arguments could probably be used to get to what I need.
Thanks!
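One possible approach, sketched here in base R only, is to define a thin wrapper that supplies the default and forwards everything else through .... The names inner, outer and outer10 are hypothetical stand-ins (inner plays the role of toJSON, outer the role of stream_out); whether this is acceptable for the jsonlite use case is an assumption, since it requires calling the wrapper instead of the original function.

```r
# Hypothetical stand-ins, not jsonlite functions:
# inner() has a digits argument, outer() only forwards ... to it.
inner <- function(x, digits = 4) round(x, digits)
outer <- function(x, ...) inner(x, ...)

# Wrapper that passes digits = 10 unless the caller overrides it.
outer10 <- function(x, ..., digits = 10) outer(x, ..., digits = digits)

outer(pi)                # uses inner()'s default of 4 digits
outer10(pi)              # uses the wrapper's default of 10 digits
outer10(pi, digits = 2)  # an explicit value still wins
```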

how to feed a tibble to spacyr?

Consider this simple example
bogustib <- tibble(doc_id = c(1,2,3),
text = c('bug', 'one love', '838383838'))
# A tibble: 3 x 2
  doc_id text
   <dbl> <chr>
1      1 bug
2      2 one love
3      3 838383838
This tibble is called bogustib because I know spacyr will fail on row 3.
> spacy_parse('838383838', lemma = FALSE, entity = TRUE, nounphrase = TRUE)
Error in `$<-.data.frame`(`*tmp*`, "doc_id", value = "text1") :
replacement has 1 row, data has 0
So, naturally, feeding the tibble to spacyr fails as well:
spacy_parse(bogustib, lemma = FALSE, entity = TRUE, nounphrase = TRUE)
Error in `$<-.data.frame`(`*tmp*`, "doc_id", value = "3") :
replacement has 1 row, data has 0
I think I can avoid this issue by calling spacy_parse row by row.
However, this looks inefficient, and I would like to use the multithread argument of spacyr to speed up the computation over my large tibble.
Is there a better solution?
Thanks!
Actually, it does not happen in my environment. In my environment, the output is like:
library(tidyverse)
library(spacyr)
bogustib <- tibble(doc_id = c(1,2,3),
text = c('bug', 'one love', '838383838'))
spacy_parse(bogustib)
spacy_parse('838383838', lemma = FALSE, entity = TRUE, nounphrase = TRUE)
## No noun phrase found in documents.
## doc_id sentence_id token_id token pos entity
## 1 text1 1 1 838383838 NUM CARDINAL_B
To get this result, I used the latest master on GitHub. However, I was able to reproduce your error when I ran it with the CRAN version of spacyr. I'm sure that I fixed the bug a while ago, but the fix does not seem to be in the CRAN version yet. We will try to update the CRAN release as soon as possible.
In the meantime, you can:
devtools::install_github('quanteda/spacyr')
Or download the repo as a zip file, unzip it, and run:
devtools::install('******')
where '******' is the path to the unzipped repository.

R, getting an invalid argument to unary operator when using order function

I'm essentially doing the exact same thing 3 times, and when adding a new variable I get this error
Error in -emps$EV : invalid argument to unary operator
The code chunk causing this is
evps<-aggregate(EV~player,s1k,mean)
sort2<-evps[order(-evps$EV),]
head(sort2,10)
s1k$EM<-s1k$points-s1k$EV
emps<-aggregate(EM~player,s1k,mean)
sort3<-emps[order(-emps$EV),]
head(sort3,10)
Works like a charm for the first list, but the identical code thereafter causes the error.
This specific line is causing the error
sort3<-emps[order(-emps$EV),]
How can I fix/workaround this?
Full Code
library(RCurl)  # provides getURL()
url <- getURL("https://raw.githubusercontent.com/M-ttM/Basketball/master/class.csv")
shots <- read.csv(text = url)
shots$make<-shots$points>0
shots2<-shots[which(!(shots$player=="Luc Richard Mbah a Moute")),]
fit1<-glm(make~factor(type)+factor(period), data=shots2,family="binomial")
summary(fit1)
shots2$makeodds<-fitted(fit1)
shots2$EV<-shots2$makeodds*ifelse(shots2$type=="3pt",3,2)
shots3<-shots2[which(shots2$y>7),]
locmakes<-data.frame(table(shots3[, c("x", "y")]))
s1k <- shots2[with(shots2, player %in% names(which(table(player)>=1000))), ]
pps<-aggregate(points~player,s1k,mean)
sort<-pps[order(-pps$points),]
head(sort,10)
evps<-aggregate(EV~player,s1k,mean)
sort2<-evps[order(-evps$EV),]
head(sort2,10)
s1k$EM<-s1k$points-s1k$EV
emps<-aggregate(EM~player,s1k,mean)
sort3<-emps[order(-emps$EV),]
head(sort3,10)
The error message seems to occur when trying to order columns including chr type data. A possible workaround is to use the reverse function rev() instead of the minus sign, like so:
column_a = c("a","a","b","b","c","c")
column_b = seq(6)
df = data.frame(column_a, column_b)
df$column_a = as.character(df$column_a)
df[with(df, order(-column_a, column_b)),]
> Error in -column_a : invalid argument to unary operator
df[with(df, order(rev(column_a), column_b)),]
column_a column_b
5 c 5
6 c 6
3 b 3
4 b 4
1 a 1
2 a 2
Let me know if it works in your case.
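Note that rev() reverses the order of the elements rather than the sort direction, so the call above gives a descending result only because this example column happens to be sorted already. A minimal sketch of a descending sort that works for character columns in general, using base R's xtfrm() to obtain a numeric ranking that can be negated (the data frame here is made up for illustration):

```r
# Illustrative data, deliberately not pre-sorted
df <- data.frame(column_a = c("b", "a", "c", "a"),
                 column_b = 1:4,
                 stringsAsFactors = FALSE)

# xtfrm() returns numeric sort keys, so the unary minus is valid
sorted <- df[order(-xtfrm(df$column_a), df$column_b), ]
sorted
```

Here column_a comes out as c, b, a, a, with ties broken by ascending column_b.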
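In the third line below, emps$EV doesn't exist. aggregate(EM~player, ...) returns only the columns player and EM, so emps$EV is NULL, and negating NULL raises the "invalid argument to unary operator" error.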
s1k$EM<-s1k$points-s1k$EV
emps<-aggregate(EM~player,s1k,mean)
sort3<-emps[order(-emps$EV),]
head(sort3,10)
You probably meant
s1k$EM<-s1k$points-s1k$EV
emps<-aggregate(EM~player,s1k,mean)
sort3<-emps[order(-emps$EM),]
head(sort3,10)

could not find function "summary.rbga"

When I try to execute the following command:
cat(summary.rbga(GAmodel))
the output is:
Error in cat(summary.rbga(GAmodel)):
could not find function "summary.rbga"
I'm sure that I loaded the package with the command "library(genalg)", and it works perfectly for other functions.
I'm using version 0.98.1102 on Windows.
The function summary.rbga is in genalg but it's not exported from the package explicitly. It's the special implementation of the summary function for rbga objects. In the example from the help page, you can see how it works
evaluate <- function(string=c()) {
returnVal = 1 / sum(string);
returnVal
}
rbga.results = rbga.bin(size=10, mutationChance=0.01, zeroToOneRatio=0.5,
evalFunc=evaluate)
class(rbga.results)
# [1] "rbga"
summary(rbga.results, echo=TRUE)
# GA Settings
# Type = binary chromosome
# Population size = 200
# Number of Generations = 100
# Elitism = 40
# Mutation Chance = 0.01
#
# Search Domain
# Var 1 = [,]
# Var 0 = [,]
#
# GA Results
# Best Solution : 1 1 1 1 1 1 1 1 1 1
Note that you call summary rather than summary.rbga directly. As long as you pass in an object that has class "rbga" it will work.
You can access the function directly with genalg:::summary.rbga
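The dispatch mechanism can be seen in a minimal sketch with a made-up class. The class name "toy" and its method are hypothetical and have nothing to do with genalg; they only show how summary() finds the method from the object's class.

```r
# Hypothetical S3 method for a made-up class "toy"
summary.toy <- function(object, ...) {
  paste("toy with", length(object$values), "values")
}

obj <- structure(list(values = 1:3), class = "toy")

# Calling the generic dispatches to summary.toy because class(obj) is "toy"
summary(obj)

# getS3method() looks a method up by generic and class name,
# which also works for methods a package registers without exporting
m <- getS3method("summary", "toy")
m(obj)
```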
