download.file() incongruent with manual download

I am working with data from: Environment Canada
I am using download.file() to acquire this data. When I use:
download.file(url="http://dd.weather.gc.ca/model_gem_global/25km/grib2/lat_lon/00/000/CMC_glb_VGRD_ISBL_1000_latlon.24x.24_2015091100_P000.grib2",destfile = "Local_File.grib2")
GribInfo(grib.file = "Local_File.grib2",file.type = "grib2")
It yields:
$inventory
[1] "" "*** FATAL ERROR: rd_grib2_msg, missing end section ('7777') ***"
[3] ""
attr(,"status")
[1] 8
$grid
[1] "" "*** FATAL ERROR: rd_grib2_msg, missing end section ('7777') ***"
[3] ""
attr(,"status")
[1] 8
Warning messages:
1: running command 'wgrib2 Local_File.grib2 -inv -' had status 8
2: running command 'wgrib2 Local_File.grib2 -grid' had status 8
Whilst a manual download followed by:
GribInfo(grib.file = "CMC_glb_TMP_ISBL_985_latlon.24x.24_2015091100_P000.grib2",file.type = "grib2")
Yields:
$inventory
[1] "1:0:d=2015091100:TMP:985 mb:anl:"
$grid
[1] "1:0:grid_template=0:winds(N/S):" "\tlat-lon grid:(1500 x 751) units 1e-06 input WE:SN output WE:SN res 48"
[3] "\tlat -90.000000 to 90.000000 by 0.240000" "\tlon 180.000000 to 179.760000 by 0.240000 #points=1126500"
I have tried the curl and wget methods of download.file(); however, both fail with a non-zero exit status. I can obtain these files with a wget batch file, but I would prefer to run my entire system within R for consistency and ease of use.

As per @Martin Morgan: downloading the file in binary mode (mode="wb") circumvents this issue. Thanks again, Martin.
download.file(url="http://dd.weather.gc.ca/model_gem_global/25km/grib2/lat_lon/00/000/CMC_glb_VGRD_ISBL_1000_latlon.24x.24_2015091100_P000.grib2",destfile = "Local_File.grib2", mode="wb")
GribInfo(grib.file = "Local_File.grib2",file.type = "grib2")
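A minimal sketch of the binary-mode fix with a sanity check added. The helper names ends_with_7777() and fetch_grib() are hypothetical and not part of any package. The check relies on what the wgrib2 error above already says, namely that a GRIB2 file must close with the end section '7777', and it assumes the file is small enough to read into memory.

```r
# Hypothetical helper: TRUE if the last four bytes of the file are "7777",
# the GRIB2 end section that wgrib2 reported as missing.
ends_with_7777 <- function(path) {
  size <- file.size(path)
  if (is.na(size) || size < 4) return(FALSE)
  bytes <- readBin(path, what = "raw", n = size)
  identical(bytes[(size - 3):size], charToRaw("7777"))
}

# Hypothetical wrapper: download in binary mode, then verify the end marker.
fetch_grib <- function(url, destfile) {
  status <- download.file(url = url, destfile = destfile, mode = "wb")
  if (status != 0 || !ends_with_7777(destfile)) {
    stop("download of ", url, " looks incomplete or corrupted")
  }
  invisible(destfile)
}
```

Calling fetch_grib() with the URL above and destfile = "Local_File.grib2" would stop early on a truncated or altered file instead of leaving the failure to GribInfo().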

Related

Rscript called with crontab not finding local packages

I have the following R script ~/test.R :
print(.libPaths())
print(system(command = "whoami",ignore.stderr = TRUE))
library(lubridate)
ymd("2022-09-15")
If I run this script from the terminal with /opt/R/3.6.2/lib64/R/bin/Rscript test.R > test2.log I get the following output:
[1] "/home/domain/username/R/library/3.6.2"
[2] "/applis/R/site-library/x86_64-pc-linux-gnu/3.6.2"
[3] "/opt/R/3.6.2/lib64/R/library"
username@domain
[1] 0
[1] "2022-09-15"
So it's working as intended and I have 3 paths for packages. Now let's run this script with cron :
* * * * * /opt/R/3.6.2/lib64/R/bin/Rscript $HOME/test.R > $HOME/test.log 2>&1
I get this for test.log:
[1] "/opt/R/3.6.2/lib64/R/library"
username@domain
[1] 0
Error in library(lubridate) :
there is no package called ‘lubridate’
Execution halted
So I only have one path for libraries, consequently lubridate is not found, because it's installed in /home/domain/username/R/library/3.6.2. I cannot install packages within /opt/R/3.6.2/lib64/R/library, so I'm looking for a way to add libpaths to crontab.
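One possible approach, offered as a sketch rather than a confirmed fix from this thread: prepend the user library at the top of the script so that it no longer depends on the environment cron provides. The helper name add_lib() is hypothetical; the path is the first entry of the terminal output above. Setting the R_LIBS_USER environment variable in the crontab might achieve the same thing, but that is an assumption that was not tested here.

```r
# Hypothetical helper: put `lib` first on the library search path if it exists.
add_lib <- function(lib) {
  if (dir.exists(lib)) {
    .libPaths(c(lib, .libPaths()))
  }
  invisible(.libPaths())
}

# Path taken from the terminal output in the question.
add_lib("/home/domain/username/R/library/3.6.2")
# library(lubridate) would follow here
```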

How to fix Perl warning message when loading gdata package?

I've updated Strawberry Perl 64-bit 5.30.2001 and the gdata package. Now, when loading library(gdata), I always get these warning messages, which appear to be related to Perl.
suppressPackageStartupMessages(library(gdata))
# Warning messages:
# 1: In system(cmd, intern = intern, wait = wait | intern, show.output.on.console = wait, :
# running command 'C:\Windows\system32\cmd.exe /c ftype perl' had status 2
# 2: In system(cmd, intern = intern, wait = wait | intern, show.output.on.console = wait, :
# running command 'C:\Windows\system32\cmd.exe /c ftype perl' had status 2
However, read.xls, the function I need, seems to run well, except that the warning is repeated every time I use it.
read.xls("http://file-examples-com.github.io/uploads/2017/02/file_example_XLS_10.xls")
# trying URL 'http://file-examples-com.github.io/uploads/2017/02/file_example_XLS_10.xls'
# Content type 'application/vnd.ms-excel' length 8704 bytes
# downloaded 8704 bytes
# X0 First.Name Last.Name Gender Country Age Date Id
# 1 1 Dulce Abril Female United States 32 15/10/2017 1562
# 2 2 Mara Hashimoto Female Great Britain 25 16/08/2016 1582
# 3 3 Philip Gent Male France 36 21/05/2015 2587
# 4 4 Kathleen Hanner Female United States 25 15/10/2017 3549
# 5 5 Nereida Magwood Female United States 58 16/08/2016 2468
# 6 6 Gaston Brumm Male United States 24 21/05/2015 2554
# 7 7 Etta Hurn Female Great Britain 56 15/10/2017 3598
# 8 8 Earlean Melgar Female United States 27 16/08/2016 2456
# 9 9 Vincenza Weiland Female United States 40 21/05/2015 6548
# Warning messages:
# 1: In system(cmd, intern = intern, wait = wait | intern, show.output.on.console = wait, :
# running command 'C:\Windows\system32\cmd.exe /c ftype perl' had status 2
# 2: In system(cmd, intern = intern, wait = wait | intern, show.output.on.console = wait, :
# running command 'C:\Windows\system32\cmd.exe /c ftype perl' had status 2
I'm not sure how to deal with this warning because it tells me nothing. I could probably just ignore it and wrap the call in suppressWarnings().
Nevertheless, does anybody know a way to fix this? I couldn't find anything by googling and don't know where to start and what's actually wrong.
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] gdata_2.18.0
loaded via a namespace (and not attached):
[1] compiler_4.0.2 tools_4.0.2 gtools_3.8.2
I had the same issue with a freshly installed version of R, gdata and Strawberry Perl. I finally found this answer to a different (but related) question. Adapting the suggestion there, I ran the following on an elevated command prompt:
FTYPE perl=C:\Strawberry\perl\bin\perl.exe %1 %*
This solved the issue for me – however: I am not sure if setting the FTYPE like this might have any unwanted side effects. So be careful.
Update: The command above did suppress the "ftype perl had status 2" warning for me, but gdata still had issues:
gdata: Unable to load perl libaries needed by read.xls()
gdata: to support 'XLSX' (Excel 2007+) files.
gdata: Run the function 'installXLSXsupport()'
gdata: to automatically download and install the perl
gdata: libaries needed to support Excel XLS and XLSX formats.
However, installXLSXsupport() failed with an unspecific error message.
I then ran
Sys.which("perl")
perl
"C:\\rtools40\\usr\\bin\\perl.exe"
and realized that the Perl version from RTools takes precedence over my Strawberry Perl installation – and apparently gdata does not "like" that Perl version.
Therefore, I decided to give Strawberry Perl precedence over RTools by changing my .Renviron file (usethis::edit_r_environ()):
PATH="${RTOOLS40_HOME}\usr\bin;${PATH}" # old
PATH="${PATH};${RTOOLS40_HOME}\usr\bin" # new
Again, I'm not entirely sure what ramifications this might have, but it fixed gdata for me.
Maybe adjusting the PATH alone would also have done the trick (without the ftype stunt I made first), but I cannot test this anymore.
What I recommend:
Adjust the PATH first.
If gdata still complains about the ftype, set the ftype.
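For a quick check before editing .Renviron, the same precedence change can be tried inside a single R session. This is only a sketch: prepend_path() is a hypothetical helper, and the Strawberry Perl directory is assumed from the FTYPE command above.

```r
# Hypothetical helper: put `dir` at the front of PATH for this session only.
# Returns the previous PATH invisibly so it can be restored afterwards.
prepend_path <- function(dir) {
  old <- Sys.getenv("PATH")
  Sys.setenv(PATH = paste(dir, old, sep = .Platform$path.sep))
  invisible(old)
}

# Assumed install location, taken from the FTYPE command above:
# old_path <- prepend_path("C:/Strawberry/perl/bin")
# Sys.which("perl")            # should now point at Strawberry Perl
# Sys.setenv(PATH = old_path)  # undo
```

If Sys.which("perl") then reports the Strawberry Perl executable and gdata loads cleanly, making the change permanent in .Renviron is the next step.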

Extraction of POSIXlt component runs fine in R 3.4.4, but errors in R 3.5.0. Why?

1) R version 3.4.4 (2018-03-15)
my.timedate <- as.POSIXlt('2016-01-01 16:00:00')
# print(attributes(my.timedate))
print(my.timedate[['hour']])
[1] 16
2) R version 3.5.0 (2018-04-23)
my.timedate <- as.POSIXlt('2016-01-01 16:00:00')
# print(attributes(my.timedate))
print(my.timedate[['hour']])
Error in FUN(X[[i]], ...) : subscript out of bounds
I think that is a known change in R 3.5.0, where the list elements of a POSIXlt object need to be accessed by unclassing it explicitly. Using R 3.5.0:
edd@rob:~$ docker run --rm -ti r-base:3.5.0 \
R -q -e 'print(unclass(as.POSIXlt("2016-01-01 16:00:00"))[["hour"]])'
> print(unclass(as.POSIXlt("2016-01-01 16:00:00"))[["hour"]])
[1] 16
>
>
edd@rob:~$
whereas with R 3.4.* one does not need the unclass(), as you showed:
edd@rob:~$ docker run --rm -ti r-base:3.4.3 \
R -q -e 'print(as.POSIXlt("2016-01-01 16:00:00")[["hour"]])'
> print(as.POSIXlt("2016-01-01 16:00:00")[["hour"]])
[1] 16
>
>
edd@rob:~$
I don't find a corresponding NEWS file entry though so not entirely sure if it is on purpose...
Edit: As others have noted, the corresponding NEWS entry is the somewhat opaque
* Single components of "POSIXlt" objects can now be extracted and
replaced via [ indexing with 2 indices.
From ?POSIXlt:
As from R 3.5.0, one can extract and replace single components via [ indexing with two indices (see the examples).
The example is a little opaque, but shows the idea:
leapS[1 : 5, "year"]
If you look at the source, though, you can see what's happening:
`[.POSIXlt`
#> function (x, i, j, drop = TRUE)
#> {
#>     if (missing(j)) {
#>         .POSIXlt(lapply(X = unclass(x), FUN = "[", i, drop = drop),
#>             attr(x, "tzone"), oldClass(x))
#>     }
#>     else {
#>         unclass(x)[[j]][i]
#>     }
#> }
#> <bytecode: 0x7fbdb4d24f60>
#> <environment: namespace:base>
It is using i to subset unclass(x), where x is the POSIXlt object. So with R 3.5.0, you use [ and preface the part of the datetime you want with the index of the datetime in the vector:
my.timedate <- as.POSIXlt('2016-01-01 16:00:00')
my.timedate[1, 'hour']
#> [1] 16
as.POSIXlt(seq(my.timedate, by = 'hour', length.out = 10))[2:5, 'hour']
#> [1] 17 18 19 20
Note that $ subsetting still works as usual:
my.timedate$hour
#> [1] 16
See ?DateTimeClasses (same as ?as.POSIXlt):
As from R 3.5.0, one can extract and replace single components via [ indexing with two indices
See also similar description in R NEWS CHANGES IN R 3.5.0.
Thus:
my.timedate[1, "hour"]
# [1] 16
# or leave the i index empty to select a component
# from all date-times in a vector
as.POSIXlt(c('2016-01-01 16:00:00', '2016-01-01 17:00:00'))[ , "hour"]
# [1] 16 17
See also Examples in the help text.
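The answers above show extraction only. As a small sketch, here are the three extraction forms side by side, followed by the replacement form that the NEWS entry mentions. The tz = "UTC" argument is an addition here so that the result does not depend on the local time zone; the two-index lines need R 3.5.0 or later.

```r
x <- as.POSIXlt("2016-01-01 16:00:00", tz = "UTC")

hour_unclass <- unclass(x)[["hour"]]  # works before and after R 3.5.0
hour_dollar  <- x$hour                # works before and after R 3.5.0
hour_index   <- x[1, "hour"]          # two-index form, R >= 3.5.0

# Replacement through the same two-index form (R >= 3.5.0)
x[1, "hour"] <- 18L
```

Code that has to run on both sides of the 3.5.0 change can stick to $ or unclass().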

Strange addTaskCallback work in RStudio

This is the next in my series of "strange" questions.
I found a difference in code execution between the R console and RStudio and can't understand the reason for it. It is also connected with the incorrect behavior of the "track" package in RStudio and R.NET, as I wrote before in "Incorrect work of track package in R.NET".
So, let's look at the example from https://search.r-project.org/library/base/html/taskCallback.html
(I modified it slightly so that the sums are printed correctly in RStudio.)
times <- function(total = 3, str = "Task a") {
  ctr <- 0
  function(expr, value, ok, visible) {
    ctr <<- ctr + 1
    cat(str, ctr, "\n")
    if(ctr == total) {
      cat("handler removing itself\n")
    }
    return(ctr < total)
  }
}
# add the callback that will work for
# 4 top-level tasks and then remove itself.
n <- addTaskCallback(times(4))
# now remove it, assuming it is still first in the list.
removeTaskCallback(n)
## Not run:
# There is no point in running this
# as
addTaskCallback(times(4))
print(sum(1:10))
print(sum(1:10))
print(sum(1:10))
print(sum(1:10))
print(sum(1:10))
## End(Not run)
An output in R console:
>
> # add the callback that will work for
> # 4 top-level tasks and then remove itself.
> n <- addTaskCallback(times(4))
Task a 1
>
> # now remove it, assuming it is still first in the list.
> removeTaskCallback(n)
[1] TRUE
>
> ## Not run:
> # There is no point in running this
> # as
> addTaskCallback(times(4))
1
1
Task a 1
>
> print(sum(1:10))
[1] 55
Task a 2
> print(sum(1:10))
[1] 55
Task a 3
> print(sum(1:10))
[1] 55
Task a 4
handler removing itself
> print(sum(1:10))
[1] 55
> print(sum(1:10))
[1] 55
>
> ## End(Not run)
>
Okay, let's run this in RStudio.
Output:
> source('~/callbackTst.R')
[1] 55
[1] 55
[1] 55
[1] 55
[1] 55
Task a 1
>
A second run gives us this:
> source('~/callbackTst.R')
[1] 55
[1] 55
[1] 55
[1] 55
[1] 55
Task a 2
Task a 1
>
Third:
> source('~/callbackTst.R')
[1] 55
[1] 55
[1] 55
[1] 55
[1] 55
Task a 3
Task a 2
Task a 1
>
and so on.
There is a strange difference between RStudio and the R console, and I don't know why. Could anyone help me? Is it a bug, or is this normal and I am doing something wrong?
Thank you.
P.S. This post is connected with the correct working of the "track" package, because the track.start function contains this piece of code:
assign(".trackingSummaryChanged", FALSE, envir = trackingEnv)
assign(".trackingPid", Sys.getpid(), envir = trackingEnv)
if (!is.element("track.auto.monitor", getTaskCallbackNames()))
addTaskCallback(track.auto.monitor, name = "track.auto.monitor")
return(invisible(NULL))
which, I think, doesn't work correctly in RStudio and R.NET.
P.P.S. I use R 3.2.2 x64, RStudio 0.99.489 and Windows 10 Pro x64. On RRO this problem also exists under R.NET and RStudio
addTaskCallback() will add a callback that's executed when R execution returns to the top level. When you're executing code line-by-line, each statement executed will return control to the top level, and callbacks will execute.
When executed within source(), control isn't returned until the call to source() returns, and so the callback is only run once.
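The growing output on repeated source() runs is presumably a second effect: every run registers one more callback without removing the earlier ones. A sketch of guarding against that by name, using the same is.element() check that the track.start excerpt in the question uses; make_counter() and register_once() are hypothetical helper names.

```r
# Callback factory: counts completed top-level tasks.
make_counter <- function() {
  n <- 0
  function(expr, value, ok, visible) {
    n <<- n + 1
    TRUE  # returning TRUE keeps the callback registered
  }
}

# Hypothetical helper: register `f` only if no callback called `name` exists.
register_once <- function(f, name) {
  if (!is.element(name, getTaskCallbackNames())) {
    addTaskCallback(f, name = name)
  }
  invisible(name)
}

register_once(make_counter(), "count_tasks")
register_once(make_counter(), "count_tasks")  # no-op: name already present
n_registered <- sum(getTaskCallbackNames() == "count_tasks")

# Clean up by name once the callback is no longer needed.
removed <- removeTaskCallback("count_tasks")
```

With this guard, sourcing the script several times leaves a single callback in place rather than one per run.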

.Fortran() returns no result

In R, when I tried the following code
.Fortran("add", x= as.double(2),y= as.double(3))
R returned only the arguments but no result!
$x
[1] 2
$y
[1] 3
add is the only function, a simple one, that I have written in the Fortran source file test.f90:
function add (x,y) result (f_result)
    real:: x,y,f_result
    f_result = x+y
end function add
and I used:
gfortran -shared -o test.dll test.f90
to obtain the test.dll which was loaded into R by
dyn.load("test.dll")
Throughout this process I got no error or warning message, so I cannot figure out where the problem is. I searched a lot but couldn't find a solution. Any help?
By the way, I use Windows 7 (x86), R 3.0.2, and GNU Fortran (GCC) 4.7.0.
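Write a subroutine instead of a function, and use an extra argument to carry the return value. .Fortran() calls subroutines and returns its (possibly modified) arguments as a list, so a function result is never passed back. Declare the arguments as real*8 so that they match as.double():
      subroutine add(x,y,z)
      real*8 x,y,z
      z=x+y
      end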
Compile like this:
$ R CMD SHLIB add.f
> dyn.load("add.so")
> .Fortran("add",as.double(1),as.double(2),as.double(-999))
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
You don't even need to name the arguments, but it helps since you can then get the return value by name:
> .Fortran("add",as.double(1),as.double(2),ans=as.double(-999))$ans
[1] 3
>
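To hide the dummy third argument from callers, the call can be wrapped in an ordinary R function. This is a sketch only: add_fortran() is a hypothetical name, and it assumes the compiled add subroutine has already been loaded with dyn.load() as shown above.

```r
# Hypothetical wrapper around the Fortran subroutine `add`.
add_fortran <- function(x, y) {
  # Fail with a clear message if the shared library was never loaded.
  if (!is.loaded("add", type = "Fortran")) {
    stop("Fortran symbol 'add' is not loaded; call dyn.load() first")
  }
  # `ans` is the placeholder that the subroutine overwrites with x + y.
  .Fortran("add", as.double(x), as.double(y), ans = double(1))$ans
}
```

After dyn.load("add.so"), add_fortran(1, 2) would return the sum directly instead of a list of arguments.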
