R expss - 'could not find function "setalloccol"'

I'm trying to work through some of the examples in this article about table generation with expss - https://cran.r-project.org/web/packages/expss/vignettes/tables-with-labels.html - however, I consistently get the error 'could not find function "setalloccol"' when using the most basic crosstab functions, cro and fre, with two variables:
> cro(df$var1, df$var2)
Error in setalloccol(ans) : could not find function "setalloccol"
I'm using RStudio 1.2.1335 and I've re-installed the packages dplyr, data.table, tidyr and expss itself, but I still get this error with all of these libraries loaded. I've googled the exact error and there is absolutely zilch on it, so I'd appreciate any help...

Try explicitly assigning setalloccol from data.table before running your code, so that the call made inside expss can find it:
setalloccol = data.table::setalloccol  # copy data.table's setalloccol into the global environment
# further calculations
# cro(df$var1, df$var2)

setalloccol is an experimental function in data.table that over-allocates column slots by reference, so that := can add columns without taking a shallow copy. expss looks like a monster library, and I won't load it now to track down your error, but since setalloccol is experimental you should contact the expss developers and file a defect. There is, in fact, already a whole gnarly open bug report on this exact issue: https://github.com/gdemin/expss/issues/42. The data.table author (Matt Dowle) has commented in that bug report. In practice setalloccol works like this:
help(setalloccol)                       # see the (experimental) docs
data.table::truelength(HMR)             # HMR is an existing data.table; truelength() shows its over-allocated column slots
[1] 1035
options(datatable.verbose = TRUE)       # make data.table report what it allocates
data.table::setalloccol(HMR, 2 * 1035)  # grow the allocation by reference
data.table::truelength(HMR)
[1] 2081
But it really isn't necessary for most data.table computations. Try to pore over the expss code and find out why and when they use it. Sorry I am not more helpful.
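If you just want a small, self-contained demonstration of the over-allocation behaviour (HMR above is my own data.table; DT below is a throwaway example table):
library(data.table)
DT <- data.table(x = 1:5, y = letters[1:5])
data.table::truelength(DT)        # column slots currently allocated (always more than ncol(DT))
data.table::setalloccol(DT, 2048) # ask for a larger over-allocation, by reference
data.table::truelength(DT)        # truelength() now reports a larger allocation
DT[, z := x * 2]                  # := can keep adding columns without copying the whole table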

Thanks to rferrisx for the GitHub thread. The post from josie-athens on 3 Nov 2019 seems to fix this issue, though I didn't run R from Bash. So my process was:
Uninstall expss and data.table packages: remove.packages(c('expss','data.table'))
Reinstall above packages: install.packages(c('data.table','expss'))
This seems to bypass the error. I'm not entirely sure why, but hopefully it's helpful for somebody experiencing the same thing.
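For anyone following along, the same steps as one small snippet (with a restart of R between removal and reinstall, which may itself be part of why this works):
remove.packages(c('expss', 'data.table'))
# ... restart R here (Session > Restart R in RStudio) ...
install.packages(c('data.table', 'expss'))
library(expss)           # data.table is loaded as a dependency of expss
# cro(df$var1, df$var2)  # should now run without the setalloccol error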

For what it's worth, I just ran into the same issue and wanted to give my two cents. This seems to be a matter of the order in which you load the packages, since the "expss" package masks several functions of the "data.table" package and vice versa. Try reversing the loading order. At least that solved the issue for me.
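A minimal sketch of what that means in practice (which order works may depend on your setup, so try both; df stands in for your own data frame):
library(data.table)    # load data.table first ...
library(expss)         # ... then expss, so the later-loaded expss masks data.table rather than the reverse
cro(df$var1, df$var2)  # the crosstab should now resolve setalloccol correctly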

Related

Determining if there are unused packages in an R script [duplicate]

As my code evolves from version to version, I'm aware that some of the packages I load are no longer needed: either I've since found better/more appropriate packages for the task at hand, or their purpose was limited to a section of code that I've now phased out.
Is there any easy way to tell which of the loaded packages are actually used in a given script? My header is beginning to get cluttered.
Update 2020-04-13
I've now updated the referenced function to use the abstract syntax tree (AST) instead of using regular expressions as before. This is a much more robust way of approaching the problem (it's still not completely ironclad). This is available from version 0.2.0 of funchir, now on CRAN.
I've just got around to writing a quick-and-dirty function to handle this which I call stale_package_check, and I've added it to my package (funchir).
e.g., if we save the following script as test.R:
library(data.table)
library(iotools)
DT = data.table(a = 1:3)
Then, if we run funchir::stale_package_check('test.R') (from the directory with that script), we'll get:
Functions matched from package data.table: data.table
**No exported functions matched from iotools**
Have you considered using packrat?
packrat::clean() would remove unused packages, for example.
I've written a command-line script to accomplish this task. You can find it in this Github gist. I'm sure there are edge cases that it misses, but it works pretty well, on both R scripts and Rmd files.
My approach is always to close my R script or IDE (i.e. RStudio) and then start it again.
After this I run my function without loading any dependencies/packages beforehand.
This should result in various warning and error messages telling you which functions couldn't be found and executed. This in turn gives you hints about which packages are necessary to load beforehand and which ones you can leave out.
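A rough illustration of that workflow (my_analysis() and the missing package are made-up placeholders):
# 1. Restart R (e.g. Session > Restart R in RStudio), then call the function cold:
my_analysis()
# Error in my_analysis() : could not find function "ggplot"
# 2. So ggplot2 is still genuinely needed; any library() line whose functions never
#    produce such an error is a candidate to drop from the header.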


emacs ess crashes when trying to access help

My emacs/ess session crashes when I try to access help. This happens if I have two packages loaded that provide the same function; for example:
library(lubridate)
library(data.table)
?month
In Rgui, an interface pops up and asks from which package I want the help. Emacs just crashes. A similar issue happens with install.packages, but there is a way to specify the mirror (see: Is there a way to install R packages using emacs?).
Is there a similar trick for help?
Well, there is no foolproof solution for the time being, as nobody really understands why these crashes happen. I assume you are on Windows, right?
There are plans in ESS to completely internalize all the help (and other) calls in order not to depend on R dialogs. Hopefully in the next version.
For the time being, put this into your .Rprofile:
tis <- utils:::index.search                      # grab the internal help-index search function
formals(tis)[["firstOnly"]] <- TRUE              # change its default so only the first match is returned
assignInNamespace("index.search", tis, "utils")  # put the modified copy back into utils
It basically makes the help system pick the first package in which the topic is found. In your case the month help page in the data.table package will be ignored. Not a big deal, as clashing topic names are quite rare anyway.
I found out that loading library(tcltk) solves this problem: the selection menu appears even when help is called from emacs+ESS. I added library(tcltk) to my Rprofile.site and now everything works great, both install.packages() and accessing help when multiple packages provide the same function.
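For reference, that startup change is a single line (the exact Rprofile.site path depends on your R installation; a personal ~/.Rprofile works too):
# In R_HOME/etc/Rprofile.site or in ~/.Rprofile
library(tcltk)   # load Tcl/Tk at startup so the package-selection menu can appear from emacs+ESS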

R: importing data.table package namespace, unexplainable jump in memory consumption

I use data.table package inside my own package and I import data.table namespace in NAMESPACE and DESCRIPTION files.
In one of my functions I use data.table function to convert data.frame into data.table
dt <- data.table(df)
But when I call my function, at the point of calling data.table() memory usage jumps instantly and R just stops responding.
The code within the function works fine when I run it line by line and with low memory consumption.
Also, if I put library(data.table) within my function, everything is fine. I was trying to avoid putting library(data.table) in my function and to declare the dependency instead. However, it seems something goes wrong that way. I am running R 2.14.0 on Mac OS X 10.6.8.
Can anybody explain what could be a reason, and how can I fix that (without using library(data.table) within my function)?
Some random guesses, in no particular order:
Try using the Imports or Depends field in DESCRIPTION only. I don't think you need to import in NAMESPACE as well, but I might be wrong. Why that would explain the memory use, though, I don't know. (There is a rough sketch of the Imports route at the end of this answer.)
What is df? Is it big, or somehow recursive, or strange in some way? Please provide str(df) to tell us something about it, if possible.
Try as.data.table(df), which is faster than data.table(df). But it sounds like your problem is different from that.
Is your function being called repeatedly? I can see why repeatedly converting df to dt would use up memory, but not why just calling library(data.table) first would make that fast.
Try starting R with R --vanilla to ensure no .RData (which may include functions masking data.table's) is being loaded on startup, amongst other things. If you have developed your own package, then some kind of function name conflict, or the order of packages on the search() path, sounds plausible.
Otherwise we'll need more information, please. I don't recall anything similar to this happening to me, or being reported before.
Also, which version of data.table are you using? There is this bug fix in v1.8.1 on R-Forge (not yet on CRAN):
Moved data.table setup code from .onAttach to .onLoad so that it is also run when data.table is simply imported from within a package, fixing #1916 related to missing data.table options.
But if you are using 1.8.0 from CRAN, and are Importing (only) rather than Depending, then I'd expect you to get an error about missing options rather than a jump in memory consumption.
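For what it's worth, here is a rough sketch of the Imports route from the first suggestion above (the file and function names are invented for illustration):
# DESCRIPTION (excerpt)
#   Imports: data.table
#
# NAMESPACE
#   import(data.table)   # or, more narrowly: importFrom(data.table, as.data.table)
#
# R/convert.R, inside your package
convert_to_dt <- function(df) {
  data.table::as.data.table(df)   # the faster conversion suggested above
}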

How to specify the package when looking up a help reference page for a function?

How does one look up the help manual page for a function and specify the package in R? For example, count appears in both seqinr and plyr. If I want to look up count in plyr, what's the command? I've tried a few obvious (but wrong) guesses such as "?plyr::count"
EDIT:
When I do ?count, I get the following message:
Help on topic 'count' was found in the following packages:
Package Library
plyr /Library/Frameworks/R.framework/Versions/2.15/Resources/library
seqinr /Library/Frameworks/R.framework/Versions/2.15/Resources/library
When I do ?plyr::count, I get:
No documentation for 'plyr::count' in specified packages and libraries:
you could try '??plyr::count'
When I do ?plyr:::count, I get:
No documentation for 'plyr:::count' in specified packages and libraries:
you could try '??plyr:::count'
Adding two question marks also gets me a 'no documentation found' error. Looking up help for non-ambiguous functions works fine (e.g. ?plot).
This is with R 2.15.0 on OSX running in emacs + ESS.
Use the package= argument to help:
help("count", package="plyr")
The correct way to do this is:
?plyr::count
?plyr:::count
See ?"?" for details - both examples are shown.
Both work for me with both packages loaded, and even without the package loaded. That raises the question: do you have the packages installed?
You were close; you need three colons, :::
?seqinr:::count # for seqinr
?plyr:::count # for plyr
