How can I switch virtualenv from reticulate in R? - r

In R package reticulate there is a function use_virtualenv but it does not look like I can call it twice with different virtualenvs, second call is always ignored.
Is there a way to deactivate first virtualenv so I can call use_virtualenv("venv2") with the expected behavior?
#initialize
require(reticulate)
virtualenv_create("venv1")
virtualenv_create("venv2")
#call first virtualenv
use_virtualenv("venv1")
py_config() #show venv1 specs
#call second vrtualenv
use_virtualenv("venv2")
py_config() # still show venv1 specs, I want venv2 here
I think unloadNamespace("reticulate") could work but in my case first call is made by another package...

In short: By restarting the R session! You can't switch virtualenv in reticulate once chosen!
I tried it (but chose "venv2" first).
> use_python("venv1", T)
Error in use_python("venv1", T) :
Specified version of python 'venv1' does not exist.
> use_python("~/.virtualenvs/venv1", T)
ERROR: The requested version of Python ('~/.virtualenvs/venv1') cannot
be used, as another version of Python
('/home/josephus/.virtualenvs/venv2/bin/python') has already been
initialized. Please restart the R session if you need to attach
reticulate to a different version of Python.
Error in use_python("~/.virtualenvs/venv1", T) :
failed to initialize requested version of Python
So reticulate messages, that one has to start a new session to choose new virtual environment.
This must apply to use_virtualenv(<xxx>, T) too, though it is not as verbose as use_python(<xxx>, T).

Related

R: Patching a package function and reloading base libraries

Occasionally one wants to patch a function in a package, without recompiling the whole package.
For example, in Emacs ESS, the function install.packages() might get stuck if tcltk is not loaded. One might want to patch install.packages() in order to require tcltk before installation and unload it after the package setup.
A temp() patched version of install.packages() might be:
## Get original args without ending NULL
temp=rev(rev(deparse(args(install.packages)))[-1])
temp=paste(paste(temp, collapse="\n"),
## Add code to load tcltk
"{",
" wasloaded= 'package:tcltk' %in% search()",
" require(tcltk)",
## Add orginal body without braces
paste(rev(rev(deparse(body(install.packages))[-1])[-1]), collapse="\n"),
## Unload tcltk if it was not loaded before by user
" if(!wasloaded) detach('package:tcltk', unload=TRUE)",
"}\n",
sep="\n")
## Eval patched function
temp=eval(parse(text=temp))
# temp
Now we want to replace the original install.packages() and perhaps insert the code in Rprofile.
To this end it is worth nothing that:
getAnywhere("install.packages")
# A single object matching 'install.packages' was found
# It was found in the following places
# package:utils
# namespace:utils
# with value
#
# ... install.packages() source follows (quite lengthy)
That is, the function is stored inside the package/namespace of utils. This environment is sealed and therefore install.packages() should be unlocked before being replaced:
## Override original function
unlockBinding("install.packages", as.environment("package:utils"))
assign("install.packages", temp, envir=as.environment("package:utils"))
unlockBinding("install.packages", asNamespace("utils"))
assign("install.packages", temp, envir=asNamespace("utils"))
rm(temp)
Using getAnywhere() again, we get:
getAnywhere("install.packages")
# A single object matching 'install.packages' was found
# It was found in the following places
# package:utils
# namespace:utils
# with value
#
# ... the *new* install.packages() source follows
It seems that the patched function is placed in the right place.
Unfortunately, running it gives:
Error in install.packages(xxxxx) :
could not find function "getDependencies"
getDependencies() is a function inside the same utils package, but not exported; therefore it is not accessible outside its namespace.
Despite the output of getAnywhere("install.packages"), the patched install.packages() is still misplaced.
The problem is that we need to reload the utils library to obtain the desired effect, which also requires unloading other libraries importing it.
detach("package:stats", unload=TRUE)
detach("package:graphics", unload=TRUE)
detach("package:grDevices", unload=TRUE)
detach("package:utils", unload=TRUE)
library(utils)
install.packages() works now.
Of course, we need to reload the other libraries too. Given the dependencies, using
library(stats)
should reload everything. But there is a problem when reloading the graphics library, at least on Windows:
library(graphics)
# Error in FUN(X[[i]], ...) :
# no such symbol C_contour in package path/to/library/graphics/libs/x64/graphics.dll
Which is the correct way of (re)loading the graphics library?
Patching functions in packages is a low-level operation that should be avoided, because it may break internal assumptions of the execution environment and lead to unpredictable behavior/crashes. If there is a problem with tck/ESS (I didn't try to repeat that) perhaps it should be fixed or there may be a workaround. Particularly changing locked bindings is something to avoid.
If you really wanted to run some code at the start/end of say install.packages, you can use trace. It will do some of the low-level operations mentioned in the question, but the good part is you don't have to worry about fixing this whenever some new internals of R change.
trace(install.packages,
tracer=quote(cat("Starting install.packages\n")),
exit=quote(cat("Ending install packages.\n"))
)
Replace tracer and exit accordingly - maybe exit is not needed anyway, maybe you don't need to unload the package. Still, trace is a very useful tool for debugging.
I am not sure if that will solve your problem - if it would work with ESS - but in general you can also wrap install.packages in a function you define say in your workspace:
install.packages <- function(...) {
cat("Entry.\n")
on.exit(cat("Exit.\n"))
utils::install.packages(...)
}
This is the cleanest option indeed.

TreeTagger in R

I have downloaded TreeTaggerv3.2 for Windows and have configured it per the install.txt. I am trying to use it in R with koRpus package. I have set the kRp.env as -
set.kRp.env(TT.cmd="C:\\TreeTagger\\bin\\tag-english.bat", lang="en",
preset="en", treetagger="manual", format="file",
TT.tknz=TRUE, encoding="UTF-8" )
.My data to be tagged is in a file and trying to use it as treetag("myfile.txt") but it is throwing the error-
Error in matrix(unlist(strsplit(tagged.text, "\t")), ncol = 3, byrow = TRUE, :
'data' must be of a vector type, was 'NULL'
In addition: Warning message:
running command 'C:\windows\system32\cmd.exe /c C:\TreeTagger\bin\tag-english.bat
C:\Users\vivsingh\Desktop\NLP\tree_tag_ex.txt' had status 255
The standalone TreeTagger is working on by windows.Any idea on how it works?
I had the exact same error and warning while trying lemmatization on R word vector following Bernhard Learns blog using windows 7 and R 3.4.1 (x64). The issue was also appearing using textstem package but TreeTagger was running properly in cmd window.
I mixed several answers I found on this post and here is my steps and code running properly:
get into R win_library (~\Documents\R\win-library\3.4\rJava\jri\x64\jri.dll) and copy jri.dll (thanks kravi!) to replace it the parent folder.
close and restart R
library(koRpus)
set.kRp.env(TT.cmd="C:\\TreeTagger\\bin\\tag-english.bat", lang="en", preset="en", treetagger="manual", format="file", TT.tknz=TRUE, encoding="UTF-8")
lemma_tagged <- treetag(lemma_unique$word_clean, treetagger="manual", format="obj", TT.tknz=FALSE , lang="en", TT.options=list(path="c:/TreeTagger", preset="en"))
lemma_tagged_tbl <- tbl_df(lemma_tagged#TT.res)
Hope it helps.
I am posting this answer to keep a record. I also faced the same issue due to incorrect specification of the location of jri.dll on 64-Bit processor and windows 8.1. If we call
set.kRp.env(TT.cmd="manual", lang="en", TT.options=list(path="/path/to/tree-tagger-windows-x.x/TreeTagger", preset="en")) and we follow either of following two steps, we can resolve this error:
While installing R, if we install only 64 Bit version of R, and
specify the proper path for these variables
LD_LIBRARY_PATH = /path/to/rJava/jri
JAVA_HOME = /path/to/jdk1.x.x
java.library.path = /path/to/rJava/jri/jri.dll
CLASSPATH = /path/to/rJava/jri
If we already installed both versions viz. 32 bit and 64 bit of R on your computer then just copy jri.dll from /path/to/rJava/jri/x64/jri.dll and replace at path/to/rJava/jri/jri.dll. Further, we need to set the path of above mentioned four variables.
I've got this issue (very similar I guess) and posted query to GitHub.
https://github.com/unDocUMeantIt/koRpus/issues/7
The current working solution for me for this case was easier than I could expect, just downgrading the koRpus package. This can change with time but this version should remain appropriate.
library("devtools")
install_github("unDocUMeantIt/koRpus", ref="0.06-5")
This package is not Java related they said.
You can face the same error while setting up the korpus environment and getting the result from treetagger. For example, when you use:
tagged.text <- treetag(
"C:/temp/sample_text.txt",
treetagger = "manual",
lang = "en",
TT.options = list(
path = "c:/Treetagger",
preset = "en"
),
doc_id = "sample"
)
You would receive a similar error
Error: Awww, this should not happen: TreeTagger didn't return any useful data.
This can happen if the local TreeTagger setup is incomplete or different from what presets expected.
You should re-run your command with the option 'debug=TRUE'. That will print all relevant configuration.
Look for a line starting with 'sys.tt.call:' and try to execute the full command following it in a command line terminal. Do not close this R session in the meantime, as 'debug=TRUE' will keep temporary files that might be needed.
If running the command after 'sys.tt.call:' does fail, you'll need to fix the TreeTagger setup.
If it does not fail but produce a table with proper results, please contact the author!
Here you need to change the value of treetagger, from
treetagger = "manual"
to
treetagger = "kRp.env"
However, before that remember to set the kRp.env as #Xochitl C. suggested in their answer
set.kRp.env(TT.cmd="C:\\TreeTagger\\bin\\tag-english.bat", lang="en", preset="en", treetagger="manual", format="file", TT.tknz=TRUE, encoding="UTF-8")
Once you do this, you'll get the desired result.

Fatal Error while using Rcpp in RStudio on Windows

I'm trying to use Rcpp on Windows in RStudio. I have R version 3.2.3 and I have installed the Rcpp package. The problem is that I am unable to call any functions defined through the CPP code. I tried the following (picked up from an example online).
body <- '
NumericVector xx(x);
return wrap( std::accumulate( xx.begin(), xx.end(), 0.0));'
add <- cxxfunction(signature(x = "numeric"), body, plugin = "Rcpp")
This gives the following warning, but completes execution successfully.
cygwin warning:
MS-DOS style path detected: C:/R/R-32~1.3/etc/x64/Makeconf
Preferred POSIX equivalent is: /cygdrive/c/R/R-32~1.3/etc/x64/Makeconf
CYGWIN environment variable option "nodosfilewarning" turns off this warning.
Consult the user's guide for more details about POSIX paths:
http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
When I try to use the above function,
x <- 1
y <- 2
res <- add(c(x, y))
I get the following error :
R Session Aborted
R encountered a fatal error.
The session was terminated.
Any suggestions? This same 'Fatal Error' happens for any code that I run with Rcpp.
Try rebuilding locally, starting with Rcpp. This is valid code and will work (and the hundreds of unit tests stress may more than this). Sometimes the compiler or something else changes under you and this sort of thing happens. It is then useful to have an alternative build system -- eg via Travis at GitHub you get Linux for free.
Also, learning about Rcpp Attributes. Your example can be written as
R> library(Rcpp)
R> cppFunction("double adder(std::vector<double> x) { return std::accumulate(x.begin(), x.end(), 0.0); }")
R> adder(c(1,2))
[1] 3
R>
which is simpler. Works of course the same way with Rcpp::NumericVector.

Getting call hierarchy in gperftools for an R package

I can use gperftools to produce a call graph, as in for instance this question.
Now I would like to get a call graph for bind_rows() in the dplyr R package in order to track down this bug.
I compiled both R and dplyr using CPP/CXXFLAGS=-g -fvar-tracking-assignments and LDFLAGS=-lprofiler -lunwind.
When I run the following:
CPUPROFILE="samples.log" R --vanilla <<< "library(dplyr)
ll = lapply(1:1e5, function(x) as.list(setNames(runif(5), letters[1:5])))
print(system.time(bind_rows(ll)))"
pprof --gif /usr/lib/R/bin/exec/R samples.log > out.gif
All I get is:
How can I get the call hierarchy so I know which call in dplyr's bind rows file is the bottleneck?
edit: It seems that the --focus option is what I need here. But how to connect this to RecursiveRelease?
pprof --focus=rbind__impl --gif /usr/lib/R/bin/exec/R samples.log > out.gif
edit: After recompiling Rcpp with -g and linking with -lprofiler, I could get the following: flame.svg, where 8% gets a good stack trace but most of it still doesn't. Could this be because some library is loaded without -lprofiler support?

Determine version of a specific package

How can I get the version number for a specific package?
The obvious way is to get the dictionary with all installed packages, and then filter for the one of interest:
pkgs = Pkg.installed();
pkgs["Datetime"]
Getting the list of all installed packages is very slow though, especially if there are many packages.
EDIT: For Julia version 1.1+
Use the Pkg REPL notation:
] status # Show every installed package version
] status pkgName # Show the specific version of the package
] status pkgName1 pkgName2 # Show the named packages. You can continue the list.
The ] enters the Pkg REPL, so you basically write status ...
So in your case, write after entering the Pkg REPL:
status DataFrame
Or use the object-oriented approach (NB: Here you don't enter the Pkg REPL, i.e. DON'T use ]:
Pkg.status("DataFrame")
EDIT: For Julia version 1.0
Pkg.installed seems to have "regressed" with the new package system. There are no arguments for Pkg.installed. So, the OP's original method seems to be about the best you can do at the moment.
pkgs = Pkg.installed();
pkgs["Datetime"]
EDIT: For Julia version upto 0.6.4
You can pass a string to Pkg.installed. For example:
pkgs = Pkg.installed("JuMP")
I often check available calling arguments with methods. For example:
julia> methods(Pkg.installed)
# 2 methods for generic function "installed":
installed() at pkg/pkg.jl:122
installed(pkg::AbstractString) at pkg/pkg.jl:129
or
julia> Pkg.installed |> methods
# 2 methods for generic function "installed":
installed() at pkg/pkg.jl:122
installed(pkg::AbstractString) at pkg/pkg.jl:129
I would try Pkg.status("PackageName")
This will print out a little blurb giving the package name.
Here is an example
julia> Pkg.status("QuantEcon")
- QuantEcon 0.0.1 master
In Julia 1.1 you can use
(v1.1) pkg> status "name_of_the_package"
to find the version of any package in a given environment.
In order to look of a version of an indirectly included package (e.g. top-level project includes Module A which depends on Module B, where you need to know info about Module B), you have to pull the info either from the Manifest.toml directly, or you have to bring in the Context object in from Pkg.
Below is done with Julia 1.3.1 ... there may be changes to Pkg's internals since then.
using Pkg
ctx = Pkg.Operations.Context()
# Get the version of CSV.jl
version = ctx.env.manifest[UUID("336ed68f-0bac-5ca0-87d4-7b16caf5d00b")].version
if version <= v"0.5.24"
# handle some uniqueness about the specific version of CSV.jl here
end
UPDATE: Or without a UUID and just the package name (thanks #HHFox):
using Pkg
pkg_name = "Observables"
m = Pkg.Operations.Context().env.manifest
v = m[findfirst(v->v.name == pkg_name, m)].version
or to do the same with the Manifest.toml
using Pkg
# given the path to the Manifest.toml file...
manifest_dict = Pkg.TOML.parsefile(manifest_path)
# look for a named package like `CSV`
package_dict = manifest_dict[package_name][1]
#show package_dict
Well this didn't print well in the comment section...
Here is a version that matches the name rather than the UUID
using Pkg
m = Pkg.Operations.Context().env.manifest
v = m[findfirst(v -> v.name == "CSV", m)].version
For packages which are dependencies of the specified packages in the project file one can use status -m <packageName> or shorter st -m <packageName> in package mode (After ]`).
For a full list, just use st -m.
This is an extension to https://stackoverflow.com/a/25641957.

Resources