rPython method python.get returns weird encoding - r

I m trying to use rPython package to pass some arguments into python code and get results back. But for some reason I m getting weird encoding from my python code. Maybe someone has some hints to point me out.
Here is my simple code to test:
require(rPython)
#pass the test word 'audiention' (in ukrainian)
word<-"аудієнція"
python.assign("input", word)
python.exec("input = input.encode('utf-8')")
python.exec("print input") #the output in console is correct at this step: аудієнція
x<-python.get("input")
cat(x) # the output is: 0C4VT=FVO
Does anybody have some suggestions why the output of python.get is encoded weird?
My Sys.getlocale() output is:
Sys.getlocale()
[1] "LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=uk_UA.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=uk_UA.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=uk_UA.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=uk_UA.UTF-8;LC_IDENTIFICATION=C"
Thank you in advance for any hints!

I have recently built a new package based on the original rPython code called SnakeCharmR that addresses this and other problems rPython had.
A quick comparison:
> library(SnakeCharmR)
> py.assign("a", "'")
> py.get("a")
[1] "'"
> py.assign("a", "áéíóú")
> py.get("a")
[1] "áéíóú"
> library(rPython)
> python.assign("a", "'")
File "<string>", line 2
a =' [ "'" ] '
^
SyntaxError: EOL while scanning string literal
> python.assign("a", "áéíóú")
> python.get("a")
[1] "\xe1\xe9\xed\xf3\xfa"
You can install SnakeCharmR like this:
> library(devtools)
> install_github("asieira/SnakeCharmR")
Hope this helps.

Related

rscopus package on R in MacBook - Invalid API key error

I am trying to use the Scopus API for the first time. I have the API key and the institution token. However, I am still getting an error, when I try to use it in R on my Mac. Here is my code:
library(rscopus)
set_api_key(MY_KEY)
hdr=inst_token_header(MY_TOKEN)
key=get_api_key()
print(rscopus::get_api_key(), reveal=TRUE)
have_api_key()
auth_info = process_author_name(last_name="Muschelli", first_name="John", verbose=FALSE)
The error message is:
> library(rscopus)
>
> set_api_key(MY_KEY)
> hdr=inst_token_header(MY_TOKEN)
> key=get_api_key()
> print(rscopus::get_api_key(), reveal=TRUE)
[1] "MY_KEY"
> have_api_key()
[1] TRUE
>
> if (have_api_key()) {
+ auth = elsevier_authenticate(api_key=key)
+ }
HTTP specified is: https://api.elsevier.com/authenticate
Warning message:
In elsevier_authenticate(api_key = key) : Forbidden (HTTP 403).
> auth_info = process_author_name(last_name="Muschelli", first_name="John", verbose=FALSE)
$`service-error`
$`service-error`$status
$`service-error`$status$statusCode
[1] "AUTHENTICATION_ERROR"
$`service-error`$status$statusText
[1] "Invalid API Key: valid apikey credentials required."
Error in get_complete_author_info(...) : Service Error
I tried
if (have_api_key()) {
auth = elsevier_authenticate(api_key=key)
}
but I get the error:
HTTP specified is: https://api.elsevier.com/authenticate
Warning message:
In elsevier_authenticate(api_key = key) : Forbidden (HTTP 403).
I have tried using auth_token_header(MY_TOKEN) instead of inst_token_header(MY_TOKEN) but the code is still not working.
I have also taken the following step in my terminal:
export Elsevier_API=MY_KEY > ~/.bash_profile
source ~/.bash_profile
I am still getting the error. However, the combination of key and institution token work here: https://dev.elsevier.com/scopus.html
Can anyone please help me debug this issue?
Thank You!
I figured out the issue. So, the correct way to query would be to pass the headers argument as well:
auth_info = process_author_name(last_name="Muschelli", first_name="John", verbose=FALSE, headers=hdr)
And now the code will work! :)

R library "XML" doesn't recognize encoding

Problem
I have an XML file that I would like to parse in R. I know that this file is not corrupted because the following Python code seems to work:
>>> import xml.etree.ElementTree as ET
>>> xml_tree = ET.parse(PATH_TO_MY_XML_FILE)
>>> do_my_regular_xml_stuff_that_seems_to_work_no_problem(xml_tree)
Now, when I try to run the following code in R, I get an error message:
> library("XML")
> xml_tree <- XML::xmlParse(PATH_TO_MY_XML_FILE)
Error in nchar(text_repr): invalid multibyte string, element 1
Traceback:
Alright, maybe the parser doesn't recognize the encoding. Luckily this should be specified in a decent XML file. So, I go to my shell and check:
$ head -n1 PATH_TO_MY_XML_FILE
??<?xml version="1.0" encoding="utf-16"?>
Now, I can go back to R and explicitly pass on the encoding, only to face the next error message where I got stuck now:
> library("XML")
> xml_tree <- XML::xmlParse(PATH_TO_MY_XML_FILE, encoding='UTF-16')
Start tag expected, '<' not found
Error: 1: Start tag expected, '<' not found
Traceback:
1. XML::xmlParse(filePath, encoding = "UTF-16")
2. (function (msg, ...)
. {
. if (length(grep("\\\n$", msg)) == 0)
. paste(msg, "\n", sep = "")
. if (immediate)
. cat(msg)
. if (length(msg) == 0) {
. e = simpleError(paste(1:length(messages), messages, sep = ": ",
. collapse = ""))
. class(e) = c(class, class(e))
. stop(e)
. }
. messages <<- c(messages, msg)
. })(character(0))
A last attempt to check (in R) if the file is in fact "UTF-16" encoded yields:
> f <- file(filePath, 'r', encoding = "UTF-16")
> firstLine <- readLines(f, n=1)
> close(f)
> print(line)
[1] "<?xml version=\"1.0\" encoding=\"utf-16\"?>"
Which looks just about right to me.
Question(s)
Does anyone know what is happening here? Is this a bug from the XML library? Is the file maybe not 'UTF-16' encoded, even though it claims it is? What are the two question marks ?? that I see when I print the file into the shell? These question marks don't appear when reading in the file properly...
Is this a bug from the XML library?
I think there could be a bug here. If I generate a valid UTF-16 XML document, which will have an initial byte-order mark:
$ echo '<a>😊</a>' | iconv -t utf-16 >a-utf16.xml
$ xxd a-utf16.xml
00000000: fffe 3c00 6100 3e00 3dd8 0ade 3c00 2f00 ..<.a.>.=...<./.
00000010: 6100 3e00 0a00 a.>...
then I can parse it with:
> XML::xmlParse('a-utf16.xml')
<?xml version="1.0"?>
<a>😊</a>
but not if I specify the encoding:
> XML::xmlParse('a-utf16.xml', encoding='utf-16')
Start tag expected, '<' not found
Error: 1: Start tag expected, '<' not found
Your original problem was when you weren't specifying the encoding. However:
I know that this file is not corrupted because the following Python code seems to work
That's a good hint, but I think you'll find edge cases where that doesn't hold. Try iconv for a second opinion on whether the file is encoded correctly.
For a more specific response, you'll need to post a reproducible XML file,

In R cannot get function from imported python file using reticulate

I would like to import a python file and then use a function within the python file; but it does not work (it only work for source_python). Is it suppose to be this way?
In python file called the_py_module.py includes this code:
def f1():
return "f one"
def f2():
return "f two"
R script
# Trying to import the python file, which appear to work:
reticulate::import("the_py_module")
Gives this output:
Module(the_py_module)
# But when calling the function:
f1()
I get error saying:
Error in f1() : could not find function "f1"
This works using source python script though.
reticulate::source_python("the_py_module.py")
f1()
Try the following approach:
> library(reticulate)
> my_module <- import(the_py_module)
> my_module$f1()
[1] "f one"
or, using your approach
> my_module_2 <- reticulate::import("the_py_module")
> my_module_2$f1()
[1] "f one"

rJava: failing to load awt.events

Right now, I try to implement some event methods of awt.events, however I cannot get it to load, since I always get an ClassNotFoundException error.
> library(rJava)
> .jinit()
[1] 0
> jEvents <- .jnew("java.awt.event")
Error in .jnew("java.awt.event") : java.lang.ClassNotFoundException
Edit:
Even if I try a specific class, I get an error message:
> library(rJava)
> .jinit()
> jEvents <- .jnew("java.awt.event.ActionEvent")
Error in .jnew("java.awt.event.ActionEvent") :
java.lang.NoSuchMethodError: <init>
It seems you are trying to instantiate package instead of class:
https://docs.oracle.com/javase/7/docs/api/java/awt/event/package-summary.html
Maybe you are looking for:
java.awt.event.ActionEvent
You can try this one:
library(rJava)
.jinit()
> EVT <- J("java.awt.event.ActionEvent")
> aEVT <- new(EVT, "StringObject", 1001L, "Hello")
> aEVT
[1] "Java-Object{java.awt.event.ActionEvent[ACTION_PERFORMED,cmd=Hello,when=0,modifiers=] on Str}"
You have to call the constructor with given parameters. Note that ActionEvent doesn't have default constructor.
You can find nice source here:
http://www.deducer.org/pmwiki/pmwiki.php?n=Main.Development#wwjoir

gperftools Error: substr outside of string at /usr/local/bin/pprof line 3618

as suggest by Dirk Eddelbuettel in this talk and this answer I tried to profile compiled R code using gperftools. Here is what I did.
I used Dirks profilingSmall.R as script that I want to profile. I repeat it here:
## R Extensions manual, section 3.2 'Profiling R for speed'
## 'N' reduced to 99 here
suppressMessages(library(MASS))
suppressMessages(library(boot))
storm.fm <- nls(Time ~ b*Viscosity/(Wt - c), stormer, start = c(b=29.401, c=2.2183))
st <- cbind(stormer, fit=fitted(storm.fm))
storm.bf <- function(rs, i) {
st$Time <- st$fit + rs[i]
tmp <- nls(Time ~ (b * Viscosity)/(Wt - c), st, start = coef(storm.fm))
tmp$m$getAllPars()
}
rs <- scale(resid(storm.fm), scale = FALSE) # remove the mean
Rprof("boot.out")
storm.boot <- boot(rs, storm.bf, R = 99) # pretty slow
Rprof(NULL)
To profile it I run the following script
LD_PRELOAD="/usr/lib/libprofiler.so.0"
\CPUPROFILE=sample.log \
Rscript profilingSmall.R
Then I tried to parse the log file using
pprof /usr/bin/R sample.log
This returned the following error
Using local file /usr/bin/R.
Using local file sample.log.
substr outside of string at /usr/local/bin/pprof line 3618.
Use of uninitialized value in string eq at /usr/local/bin/pprof line 3618.
substr outside of string at /usr/local/bin/pprof line 3620.
Use of uninitialized value in string eq at /usr/local/bin/pprof line 3620.
sample.log: header size >= 2**16
sample.log is empty. However, a bunch of sample.log_digit were created that contain information that looks reasonable.
I had the same problem, but realized my problem. I'd done:
export CPUPROFILE=test.prof
export LD_PRELOAD="/usr/local/lib/libprofiler.so"
testprog ...
pprof --web `which testprog` test.prof
If I stopped after running testprog the prof files wasn't empty but after pprof it was. pprof crashed with the substr error.
What I realized later was that by setting and exporting LD_PRELOAD that libprofiler.so was also loaded for pprof, overwriting test.prof.
You just need to ensure LD_PRELOAD is not set when you run pprof.
I'm using gperftools-2.5, and I also encountered the same problem:
[root#localhost ivrserver]# pprof --text ./IvrServer ivr.prof
Using local file ./IvrServer.
Using local file ivr.prof.
substr outside of string at /usr/local/bin/pprof line 3695.
Use of uninitialized value in string eq at /usr/local/bin/pprof line 3695.
substr outside of string at /usr/local/bin/pprof line 3697.
Use of uninitialized value in string eq at /usr/local/bin/pprof line 3697.
ivr.prof: header size >= 2**16
I found this is because the prof file (ivr.prof in my example) is empty.
everytime the profiler start and end, it will create a new prof file, you should use xxx.prof.0 xxx.prof.1 ... to get the right result

Resources