How to fix Perl warning message when loading gdata package?

I've updated Strawberry Perl (64-bit, 5.30.2001) and the gdata package. Now, when loading library(gdata), I always get these warning messages, which appear to be related to Perl.
suppressPackageStartupMessages(library(gdata))
# Warning messages:
# 1: In system(cmd, intern = intern, wait = wait | intern, show.output.on.console = wait, :
# running command 'C:\Windows\system32\cmd.exe /c ftype perl' had status 2
# 2: In system(cmd, intern = intern, wait = wait | intern, show.output.on.console = wait, :
# running command 'C:\Windows\system32\cmd.exe /c ftype perl' had status 2
However, read.xls, the function I need, seems to run well, except that the warning is repeated every time I use it.
read.xls("http://file-examples-com.github.io/uploads/2017/02/file_example_XLS_10.xls")
# trying URL 'http://file-examples-com.github.io/uploads/2017/02/file_example_XLS_10.xls'
# Content type 'application/vnd.ms-excel' length 8704 bytes
# downloaded 8704 bytes
# X0 First.Name Last.Name Gender Country Age Date Id
# 1 1 Dulce Abril Female United States 32 15/10/2017 1562
# 2 2 Mara Hashimoto Female Great Britain 25 16/08/2016 1582
# 3 3 Philip Gent Male France 36 21/05/2015 2587
# 4 4 Kathleen Hanner Female United States 25 15/10/2017 3549
# 5 5 Nereida Magwood Female United States 58 16/08/2016 2468
# 6 6 Gaston Brumm Male United States 24 21/05/2015 2554
# 7 7 Etta Hurn Female Great Britain 56 15/10/2017 3598
# 8 8 Earlean Melgar Female United States 27 16/08/2016 2456
# 9 9 Vincenza Weiland Female United States 40 21/05/2015 6548
# Warning messages:
# 1: In system(cmd, intern = intern, wait = wait | intern, show.output.on.console = wait, :
# running command 'C:\Windows\system32\cmd.exe /c ftype perl' had status 2
# 2: In system(cmd, intern = intern, wait = wait | intern, show.output.on.console = wait, :
# running command 'C:\Windows\system32\cmd.exe /c ftype perl' had status 2
I'm not sure how to deal with this warning because it tells me nothing. I could probably just ignore it and wrap the call in suppressWarnings().
Nevertheless, does anybody know a way to fix this? I couldn't find anything by googling, and I don't know where to start or what is actually wrong.
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] gdata_2.18.0
loaded via a namespace (and not attached):
[1] compiler_4.0.2 tools_4.0.2 gtools_3.8.2

I had the same issue with a freshly installed version of R, gdata and Strawberry Perl. I finally found this answer to a different (but related) question. Adapting the suggestion there, I ran the following in an elevated command prompt:
FTYPE perl=C:\Strawberry\perl\bin\perl.exe %1 %*
This solved the issue for me – however: I am not sure if setting the FTYPE like this might have any unwanted side effects. So be careful.
Update: The command above did suppress the warning "ftype perl' had status 2" for me, but gdata still had issues:
gdata: Unable to load perl libaries needed by read.xls()
gdata: to support 'XLSX' (Excel 2007+) files.
gdata: Run the function 'installXLSXsupport()'
gdata: to automatically download and install the perl
gdata: libaries needed to support Excel XLS and XLSX formats.
However, installXLSXsupport() failed with an unspecific error message.
I then ran
Sys.which("perl")
perl
"C:\\rtools40\\usr\\bin\\perl.exe"
and realized that the Perl version from RTools takes precedence over my Strawberry Perl installation – and apparently gdata does not "like" that Perl version.
Therefore, I decided to give Strawberry Perl precedence over RTools by changing my .Renviron file (usethis::edit_r_environ()):
PATH="${RTOOLS40_HOME}\usr\bin;${PATH}" # old
PATH="${PATH};${RTOOLS40_HOME}\usr\bin" # new
Again, I'm not entirely sure what ramifications this might have, but it fixed gdata for me.
Maybe adjusting the PATH alone would also have done the trick (without the ftype stunt I made first), but I cannot test this anymore.
What I recommend:
Adjust the PATH first.
If gdata still complains about the ftype, set the ftype.
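To try the PATH change for the current R session only, before editing .Renviron, something like the following sketch could be used. The helper name prepend_to_path is made up for illustration, and the Strawberry Perl directory is assumed from the FTYPE command above; adjust it to your installation.

```r
# Hypothetical helper: put `dir` at the front of PATH for this R session only.
prepend_to_path <- function(dir) {
  old <- Sys.getenv("PATH")
  Sys.setenv(PATH = paste(dir, old, sep = .Platform$path.sep))
  invisible(old)  # return the previous value so it can be restored later
}

# Assumed install location (taken from the FTYPE command above).
old_path <- prepend_to_path("C:\\Strawberry\\perl\\bin")

# Check which perl executable R now resolves first.
Sys.which("perl")

# Restore the previous PATH if the change did not help:
# Sys.setenv(PATH = old_path)
```

If Sys.which("perl") then points at Strawberry Perl rather than the RTools copy, reloading gdata in the same session should show whether the warning is gone.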

Related

Why is R making a copy-on-modification after using str?

I was wondering why R is making a copy-on-modification after using str.
I create a matrix. I can change its dim, one element, or even all elements, and no copy is made. But after I call str, R makes a copy during the next modification of the matrix. Why is this happening?
m <- matrix(1:12, 3)
tracemem(m)
#[1] "<0x559df861af28>"
dim(m) <- 4:3
m[1,1] <- 0L
m[] <- 12:1
str(m)
# int [1:4, 1:3] 12 11 10 9 8 7 6 5 4 3 ...
dim(m) <- 3:4 #Here after str a copy is made
#tracemem[0x559df861af28 -> 0x559df838e4a8]:
dim(m) <- 3:4
str(m)
# int [1:3, 1:4] 12 11 10 9 8 7 6 5 4 3 ...
dim(m) <- 3:4 #Here again after str a copy
#tracemem[0x559df838e4a8 -> 0x559df82c9d78]:
I was also wondering why a copy is made when a task callback is registered.
TCB <- addTaskCallback(function(...) TRUE)
m <- matrix(1:12, nrow = 3)
tracemem(m)
#[1] "<0x559dfa79def8>"
dim(m) <- 4:3 #Copy on modification
#tracemem[0x559dfa79def8 -> 0x559dfa8998e8]:
removeTaskCallback(TCB)
#[1] TRUE
dim(m) <- 4:3 #No copy
sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 10 (buster)
Matrix products: default
BLAS: /usr/local/lib/R/lib/libRblas.so
LAPACK: /usr/local/lib/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=de_AT.UTF-8 LC_NUMERIC=C
[3] LC_TIME=de_AT.UTF-8 LC_COLLATE=de_AT.UTF-8
[5] LC_MONETARY=de_AT.UTF-8 LC_MESSAGES=de_AT.UTF-8
[7] LC_PAPER=de_AT.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_AT.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.0.3
This is a follow up question to Is there a way to prevent copy-on-modify when modifying attributes?.
I start R with R --vanilla to have a clean session.

I have asked this question on R-help as suggested by @sam-mason in the comments.
The answer from Luke Tierney solved the issue with str:
As of R 4.0.0 it is in some cases possible to reduce reference counts
internally and so avoid a copy in cases like this. It would be too
costly to try to detect all cases where a count can be dropped, but in
this case we can do better. It turns out that the internals of
pos.to.env were unnecessarily creating an extra reference to the call
environment (here in a call to exists()). This is fixed in r79528.
Thanks.
And related to Task Callback:
It turns out there were some issues with the way calls to the
callbacks were handled. This has been revised in R-devel in r79541.
This example will no longer need to duplicate in R-devel.
Thanks for the report.

Download.file() incongruent with Manual Download

I am working with data from: Environment Canada
I am using download.file() to acquire this data. When I use:
download.file(url = "http://dd.weather.gc.ca/model_gem_global/25km/grib2/lat_lon/00/000/CMC_glb_VGRD_ISBL_1000_latlon.24x.24_2015091100_P000.grib2", destfile = "Local_File.grib2")
GribInfo(grib.file = "Local_File.grib2", file.type = "grib2")
It yields:
$inventory
[1] "" "*** FATAL ERROR: rd_grib2_msg, missing end section ('7777') ***"
[3] ""
attr(,"status")
[1] 8
$grid
[1] "" "*** FATAL ERROR: rd_grib2_msg, missing end section ('7777') ***"
[3] ""
attr(,"status")
[1] 8
Warning messages:
1: running command 'wgrib2 Local_File.grib2 -inv -' had status 8
2: running command 'wgrib2 Local_File.grib2 -grid' had status 8
Whilst a manual download followed by:
GribInfo(grib.file = "CMC_glb_TMP_ISBL_985_latlon.24x.24_2015091100_P000.grib2",file.type = "grib2")
Yields:
$inventory
[1] "1:0:d=2015091100:TMP:985 mb:anl:"
$grid
[1] "1:0:grid_template=0:winds(N/S):" "\tlat-lon grid:(1500 x 751) units 1e-06 input WE:SN output WE:SN res 48"
[3] "\tlat -90.000000 to 90.000000 by 0.240000" "\tlon 180.000000 to 179.760000 by 0.240000 #points=1126500"
I have tried the curl and wget methods within download.file(), but both fail with an error. I can obtain these files using a wget batch file; however, I would prefer to run my entire workflow within R for consistency and ease of use.

As per @Martin Morgan: downloading the file in binary mode (mode = "wb") circumvents this issue. On Windows, the default mode = "w" writes the file in text mode, which can corrupt binary files such as GRIB2. Thanks again, Martin.
download.file(url = "http://dd.weather.gc.ca/model_gem_global/25km/grib2/lat_lon/00/000/CMC_glb_VGRD_ISBL_1000_latlon.24x.24_2015091100_P000.grib2", destfile = "Local_File.grib2", mode = "wb")
GribInfo(grib.file = "Local_File.grib2", file.type = "grib2")
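A quick way to check whether a download arrived intact is to look at its first and last bytes: a GRIB2 message begins with the ASCII characters GRIB and ends with 7777, which is the end section wgrib2 reports as missing above. The helper below is a minimal sketch in base R, not part of any package; its name is hypothetical, and it only checks the outermost markers of the file.

```r
# Hypothetical helper: TRUE if the file starts with "GRIB" and ends with "7777".
looks_like_complete_grib <- function(path) {
  size <- file.size(path)
  if (is.na(size) || size < 8) return(FALSE)
  # Read the whole file as raw bytes so no text translation takes place.
  bytes <- readBin(path, what = "raw", n = size)
  identical(bytes[1:4], charToRaw("GRIB")) &&
    identical(tail(bytes, 4), charToRaw("7777"))
}

# Usage after a download (file name as in the example above):
# looks_like_complete_grib("Local_File.grib2")
```

A FALSE result for a file that downloads fine manually would suggest the transfer mode altered the bytes.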

Fast reading of unicode files in windows in R

I always use fread from the data.table package to read in large tables. But apparently it does not support reading Unicode files on Windows (Windows 7 Professional, to be more precise).
Here is the file I tried to read:
A,B
ą,ž
ū,į
ų,ė
š,ę
It works fine if I read it on Mac OS X, or if I read it with read.csv using the option encoding = "UTF-8". Unfortunately, fread does not have this option.
So are there other fast ways to read Unicode tables on Windows, or should I just use another OS? Or am I missing something obvious?
Here is the output of sessionInfo():
R version 3.1.3 (2015-03-09)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.9.4
loaded via a namespace (and not attached):
[1] chron_2.3-45 plyr_1.8.1 Rcpp_0.11.5 reshape2_1.4.1 stringr_0.6.2
Update: Pasting the output as requested.
> aa<-fread("F:/R/unicode_test2.csv",verbose=TRUE)
Input contains no \n. Taking this to be a filename to open
File opened, filesize is 0.000000 GB.
Memory mapping ... ok
Detected eol as \r\n (CRLF) in that order, the Windows standard.
Positioned on line 1 after skip or autostart
This line is the autostart and not blank so searching up for the last non-blank ... line 1
Detecting sep ... ','
Detected 2 columns. Longest stretch was from line 1 to line 5
Starting data input on line 1 (either column names or first row of data). First 10 characters: Ä„,B
All the fields on line 1 are character fields. Treating as the column names.
Count of eol: 5 (including 1 at the end)
Count of sep: 4
nrow = MIN( nsep [4] / ncol [2] -1, neol [5] - nblank [1] ) = 4
Type codes ( first 5 rows): 44
Type codes: 44 (after applying colClasses and integer64)
Type codes: 44 (after applying drop or select (if supplied)
Allocating 2 column slots (2 - 0 dropped)
Read 4 rows. Exactly what was estimated and allocated up front
0.000s ( 0%) Memory map (rerun may be quicker)
0.000s ( 0%) sep and header detection
0.000s ( 0%) Count rows (wc -l)
0.000s ( 0%) Column type detection (first, middle and last 5 rows)
0.000s ( 0%) Allocation of 4x2 result (xMB) in RAM
0.000s ( 0%) Reading data
0.000s ( 0%) Allocation for type bumps (if any), including gc time if triggered
0.000s ( 0%) Coercing data already read in type bumps (if any)
0.000s ( 0%) Changing na.strings to NA
0.001s Total
> aa
Ä„ B
1: Ä… Å¾
2: Å« Ä¯
3: Å³ Ä—
4: Å¡ Ä™
> aa$A
[1] "Ä…" "Å«" "Å³" "Å¡"
> aa$B
[1] "Å¾" "Ä¯" "Ä—" "Ä™"
> bb <- read.csv("F:/R/unicode_test.csv",encoding="UTF-8",strings=FALSE)
> bb
A B
1 a ž
2 u i
3 u e
4 š e
> bb$B
[1] "ž" "į" "ė" "ę"
> bb$A
[1] "ą" "ū" "ų" "š"
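The garbled fread output above (for example Ä… instead of ą) is what UTF-8 bytes look like when they are interpreted in the Windows-1252 locale reported by sessionInfo(). One possible workaround, sketched here in base R under the assumption that the bytes were read unchanged, is to declare the character columns as UTF-8 after reading. The helper name mark_utf8 is hypothetical, and the sketch is written for a plain data.frame (for example after as.data.frame()).

```r
# Hypothetical helper: declare all character columns (and the column names)
# of a data.frame as UTF-8 without changing the underlying bytes.
mark_utf8 <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  for (j in which(is_chr)) {
    Encoding(df[[j]]) <- "UTF-8"
  }
  Encoding(names(df)) <- "UTF-8"
  df
}

# Small demonstration: the two bytes C4 85 are the UTF-8 encoding of "ą".
demo <- data.frame(A = rawToChar(as.raw(c(0xc4, 0x85))), stringsAsFactors = FALSE)
demo <- mark_utf8(demo)
Encoding(demo$A)
```

This only changes how R labels the strings, so it is cheap even for large tables, but it gives wrong results if the file is not actually UTF-8.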

Ubuntu R ForEach / DoMC not using multiple cores

I have built a function in R, running on a server with Ubuntu 12.04 LTS 64-bit, a 4-core i7 with multithreading, and 6 GB RAM. I installed R using the standard packages:
sudo apt-get install r-base r-recommended r-base-dev
sudo apt-get install r-cran-multicore r-cran-iterators r-cran-foreach r-cran-domc
NB: I also installed foreach and doMC from inside R (which didn't help either), the same way I installed the deldir package:
install.packages(c("deldir"), dependencies = TRUE)
My function runs fine, but it does not use parallel cores (just maxes out 1 of the 8):
library(deldir)
library(foreach)
library(doMC)
registerDoMC(cores=8)
#getDoParWorkers()
#getDoParName()
#getDoParVersion()
# loop through files
inputfiles <- dir(path="/home/geoadmin/data/objects/", pattern='.txt')
for (inputfilenr in 1:length(inputfiles))
{
  # set file variables
  curinputfile = paste("/home/geoadmin/data/objects/",inputfiles[[inputfilenr]], sep = "", collapse = NULL)
  print (curinputfile)
  curoutputfile = paste("/home/geoadmin/data/objects/",substr(inputfiles[[inputfilenr]], start=1, stop=10), '.out', sep = "", collapse = NULL)
  # select the point x/y coordinates into a data frame...
  points <- read.csv(curinputfile, header = TRUE, sep = ",", dec=".", fill = TRUE)
  # set calculation variables, precision on 3 digits only because of the RDW coordinate system
  voro = deldir(points$x, points$y, digits=3, list(ndx=2,ndy=2), rw=c(min(points$x)-abs(min(points$x)-max(points$x)), max(points$x)+abs(min(points$x)-max(points$x)), min(points$y)-abs(min(points$y)-max(points$y)), max(points$y)+abs(min(points$y)-max(points$y))))
  tiles = tile.list(voro)
  poly = array()
  # start loop
  poly <- foreach (i=1:length(tiles), .combine=cbind) %dopar%
  {
    # load tile info
    tile = tiles[[i]]
    # start with EWKB notation
    curpoly = "POLYGON(("
    # add list of coordinates by looping through the points in tile
    for (j in 1:length(tiles[[i]]$x)) { curpoly = sprintf("%s %.6f %.6f,",curpoly,tile$x[[j]],tile$y[[j]]) }
    # then again the first point to close the polygon and end the EWKB notation, adding that to the poly array
    sprintf("%s %.6f %.6f))",curpoly,tile$x[[1]],tile$y[[1]])
  }
  write.csv(t(poly), file = curoutputfile, row.names = FALSE)
}
So the results are good, but no parallelism...
doMC did register correctly:
> getDoParWorkers()
[1] 8
> getDoParName()
[1] "doMC"
> getDoParVersion()
[1] "1.2.5"
If I look at the usage (with top):
top - 01:03:19 up 9 min, 3 users, load average: 1.02, 0.86, 0.45
Tasks: 131 total, 2 running, 127 sleeping, 0 stopped, 2 zombie
Cpu(s): 12.5%us, 0.0%sy, 0.0%ni, 87.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 6104932k total, 1240512k used, 4864420k free, 16656k buffers
Swap: 6283260k total, 0k used, 6283260k free, 141996k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1553 zzzzzzzz 20 0 913m 850m 3716 R 100 14.3 8:22.03 R
So just maxing out one core. Does anyone have any idea what could cause foreach/doMC to not use multiple cores?
> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] doMC_1.2.5 multicore_0.1-7 iterators_1.0.6 foreach_1.4.0
[5] deldir_0.0-19
loaded via a namespace (and not attached):
[1] codetools_0.2-8

To add the likely answer for the question:
Since foreach/doMC does work on this machine (with the standard example), the problem lies in the specific code. Most likely the voro = deldir(...) call takes up the time, not the foreach loop after it. That would mean the deldir package itself needs to be adjusted. Looking at the deldir source, it seems I would need to adapt this snippet:
# Call the master subroutine to do the work:
repeat {
tmp <- .Fortran(
'master',
x=as.double(x),
y=as.double(y),
sort=as.logical(sort),
rw=as.double(rw),
npd=as.integer(npd),
ntot=as.integer(ntot),
nadj=integer(tadj),
madj=as.integer(madj),
ind=integer(npd),
tx=double(npd),
ty=double(npd),
ilist=integer(npd),
eps=as.double(eps),
delsgs=double(tdel),
ndel=as.integer(ndel),
delsum=double(ntdel),
dirsgs=double(tdir),
ndir=as.integer(ndir),
dirsum=double(ntdir),
nerror=integer(1),
PACKAGE='deldir'
)
I'm not sure yet how I could restructure this so that it works with foreach, though...
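One alternative that avoids modifying deldir, assuming the input files are independent of each other, is to parallelize the outer loop over files instead of the inner loop over tiles, so that several deldir calls run at the same time. The sketch below uses mclapply from the base parallel package rather than foreach/doMC. The function names are hypothetical, and process_file is only a placeholder for the read.csv, deldir and write.csv steps from the question.

```r
library(parallel)

# Hypothetical per-file worker. In real use this would hold the body of the
# question's for loop (read.csv, deldir, tile.list, write.csv).
process_file <- function(path) {
  nchar(basename(path))  # placeholder work so the sketch runs on its own
}

# Run the worker over all files, one forked process per core.
process_all <- function(paths, cores = 2L) {
  # mclapply relies on forking; on Windows only mc.cores = 1 is supported.
  if (.Platform$OS.type == "windows") cores <- 1L
  mclapply(paths, process_file, mc.cores = cores)
}

results <- process_all(c("/tmp/a.txt", "/tmp/bb.txt"))
```

With this layout each core works on a whole file, so the expensive deldir call is what gets distributed.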

How to read.table with "Hebrew" column names (in R)?

I am trying to read a .txt file, with Hebrew column names, but without success.
I uploaded an example file to:
http://www.talgalili.com/files/aa.txt
And am trying the command:
read.table("http://www.talgalili.com/files/aa.txt", header = T, sep = "\t")
This returns:
X.....ª X...ª...... X...œ....
1 12 97 6
2 123 354 44
3 6 1 3
Instead of:
אחת שתיים שלוש
12 97 6
123 354 44
6 1 3
My output for:
l10n_info()
Is:
$MBCS
[1] FALSE
$`UTF-8`
[1] FALSE
$`Latin-1`
[1] TRUE
$codepage
[1] 1252
And for:
Sys.getlocale()
Is:
[1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
Can you suggest what I should try or change so that the file loads correctly?
Update:
Trying to use:
read.table("http://www.talgalili.com/files/aa.txt",fileEncoding ="iso8859-8")
Has resulted in:
V1
1 ?
Warning messages:
1: In read.table("http://www.talgalili.com/files/aa.txt", fileEncoding = "iso8859-8") :
invalid input found on input connection 'http://www.talgalili.com/files/aa.txt'
2: In read.table("http://www.talgalili.com/files/aa.txt", fileEncoding = "iso8859-8") :
incomplete final line found by readTableHeader on 'http://www.talgalili.com/files/aa.txt'
While also trying this:
Sys.setlocale("LC_ALL", "en_US.UTF-8")
Or this:
Sys.setlocale("LC_ALL", "en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8")
Gives me this:
[1] ""
Warning message:
In Sys.setlocale("LC_ALL", "en_US.UTF-8") :
OS reports request to set locale to "en_US.UTF-8" cannot be honored
Finally, here is the output of sessionInfo():
R version 2.10.1 (2009-12-14)
i386-pc-mingw32
locale:
[1] LC_COLLATE=English_United States.1255 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_2.10.1
Any suggestion or clarification will be appreciated.
Best,
Tal

I would try passing the fileEncoding parameter to read.table with a value of "iso8859-8".
Use iconvlist() to get an alphabetical list of the supported encodings. As I saw here, Hebrew is covered by part 8 of ISO 8859.

I've tried @George Donats' answer, but couldn't make it work. So I wanted to suggest another possibility for future reference.
I couldn't find the file online, so I've recreated a txt file like yours, using TAB as a separator. You can load it into R with the Hebrew text using a connection, as demonstrated below:
con<-file("aa.txt",open="r",encoding="iso8859-8") ##Open a read-only connection with encoding fit for Hebrew (iso8859-8)
Then you can load it into R with your code, using the con variable as the file input:
data<-read.table(con,sep="\t",header=TRUE)
Inspecting the data variable gives the following results:
str(data)
'data.frame': 3 obs. of 3 variables:
$ אחת : int 6 44 3
$ שתיים: int 97 354 1
$ שלוש : int 12 123 6
> data$אחת
[1] 6 44 3
