unzip() overwrites the file even when using overwrite = FALSE

saveRDS(1, tmp1 <- "test1.rds")
tmp3 <- tempfile(fileext = ".zip")
zip(tmp3, tmp1)
unlink(tmp1)
file.exists(tmp1) # FALSE
unzip(tmp3)
file.exists(tmp1) # TRUE
readRDS(tmp1) # 1
saveRDS(2, tmp1)
readRDS(tmp1) # 2
unzip(tmp3, overwrite = FALSE)
# Warning message:
# In unzip(tmp3, overwrite = FALSE) : not overwriting file './test1.rds
readRDS(tmp1) # 1
unlink(tmp1)
I expected the last readRDS(tmp1) to return 2, because overwrite = FALSE should leave the existing file untouched. Is that right?
Any thoughts?
PS: I'm on Linux CentOS 7, using R version 3.5.2.

I can confirm that this is a bug in R 3.5.2: the same script leaves the file untouched in R 3.6.0. I have only checked on CentOS 7.2.
R 3.6.0
$ /usr/bin/R -f test2.r
R version 3.6.0 (2019-04-26) -- "Planting of a Tree"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
[Previously saved workspace restored]
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
> saveRDS(1, tmp1 <- "test1.rds")
> tmp3 <- tempfile(fileext = ".zip")
> zip(tmp3, tmp1)
adding: test1.rds (deflated 2%)
> unlink(tmp1)
> file.exists(tmp1) # FALSE
[1] FALSE
>
> unzip(tmp3)
> file.exists(tmp1) # TRUE
[1] TRUE
> readRDS(tmp1) # 1
[1] 1
> saveRDS(2, tmp1)
> readRDS(tmp1) # 2
[1] 2
> unzip(tmp3, overwrite = FALSE)
Warning message:
In unzip(tmp3, overwrite = FALSE) : not overwriting file './test1.rds
> readRDS(tmp1) # 1
[1] 2
>
> unlink(tmp1)
>
R 3.5.2
$ R -f test2.r
R version 3.5.2 (2018-12-20) -- "Eggshell Igloo"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
[Previously saved workspace restored]
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
> saveRDS(1, tmp1 <- "test1.rds")
> tmp3 <- tempfile(fileext = ".zip")
> zip(tmp3, tmp1)
adding: test1.rds (deflated 5%)
> unlink(tmp1)
> file.exists(tmp1) # FALSE
[1] FALSE
>
> unzip(tmp3)
> file.exists(tmp1) # TRUE
[1] TRUE
> readRDS(tmp1) # 1
[1] 1
> saveRDS(2, tmp1)
> readRDS(tmp1) # 2
[1] 2
> unzip(tmp3, overwrite = FALSE)
Warning message:
In unzip(tmp3, overwrite = FALSE) : not overwriting file './test1.rds
> readRDS(tmp1) # 1
[1] 1
>
> unlink(tmp1)
>
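If upgrading R is not an option, one possible workaround is to avoid relying on overwrite = FALSE at all: extract into a scratch directory and copy over only the files that do not exist yet. This is a minimal sketch, not an official fix; the function name unzip_keep_existing is made up for illustration, and empty directories in the archive are not recreated.

```r
# Hypothetical helper: extract 'zipfile' without ever replacing existing files.
unzip_keep_existing <- function(zipfile, exdir = ".") {
  # Extract everything into a private staging directory first.
  staging <- tempfile("unzip_staging_")
  dir.create(staging)
  on.exit(unlink(staging, recursive = TRUE), add = TRUE)
  utils::unzip(zipfile, exdir = staging)

  # Relative paths of all extracted files (directories are not listed).
  entries <- list.files(staging, recursive = TRUE, all.files = TRUE)

  copied <- character(0)
  for (entry in entries) {
    target <- file.path(exdir, entry)
    if (file.exists(target)) next  # never touch an existing file
    dir.create(dirname(target), recursive = TRUE, showWarnings = FALSE)
    if (file.copy(file.path(staging, entry), target, overwrite = FALSE)) {
      copied <- c(copied, entry)
    }
  }
  copied  # relative paths of the files that were actually written
}
```

With the script from the question, calling unzip_keep_existing(tmp3) in place of unzip(tmp3, overwrite = FALSE) should leave the modified test1.rds in place, so the last readRDS(tmp1) returns 2.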


renv fails on install.packages() and restore()

In both new (dummy) packages and mature packages with renv already initialized, renv fails on install.packages() and restore().
Note: to avoid tripping the spam filter, I've replaced all instances of 'https://' with 'web.'
Failure on install.packages(), with, e.g., dplyr:
> install.packages("dplyr")
Retrieving 'web.cloud.r-project.org/bin/macosx/contrib/4.2/dplyr_1.1.0.tgz' ...
OK [downloaded 1.5 Mb in 0.4 secs]
Retrieving 'web.cloud.r-project.org/bin/macosx/contrib/4.2/cli_3.6.0.tgz' ...
OK [file is up to date]
[...]
Retrieving 'web.cloud.r-project.org/bin/macosx/contrib/4.2/withr_2.5.0.tgz' ...
OK [file is up to date]
Installing cli [3.6.0] ...
FAILED
Error in if (eval(cond, envir = environment(dot))) return(eval(expr, envir = environment(dot))) :
the condition has length > 1
In addition: Warning message:
In system2(R(), args, stdout = TRUE, stderr = TRUE) :
running command ''/Library/Frameworks/R.framework/Resources/bin/R' CMD config CC 2>&1' had status 71
Failure on renv::restore():
> renv::restore()
The following package(s) will be updated:
# CRAN ===============================
- KernSmooth [2.23-20 -> 2.23-18]
- MASS [7.3-58.2 -> 7.3-53.1]
[...]
- zip [* -> 2.2.2]
# GitHub =============================
- staged.dependencies [* -> remoteproject/staged.dependencies#HEAD]
Do you want to proceed? [y/N]: y
* Querying repositories for available binary packages ... Done!
* Querying repositories for available source packages ... Done!
Retrieving 'web.cran.microsoft.com/snapshot/2021-03-31/src/contrib/boot_1.3-27.tar.gz' ...
OK [file is up to date]
Retrieving 'web.cran.microsoft.com/snapshot/2021-03-31/src/contrib/class_7.3-18.tar.gz' ...
OK [file is up to date]
[...]
Retrieving 'web.cran.microsoft.com/snapshot/2021-03-31/src/contrib/visNetwork_2.0.9.tar.gz' ...
OK [file is up to date]
Installing boot [1.3-27] ...
FAILED
Error in if (eval(cond, envir = environment(dot))) return(eval(expr, envir = environment(dot))) :
the condition has length > 1
In addition: Warning messages:
1: could not retrieve available packages for url 'web.cran.microsoft.com/snapshot/2021-03-31/bin/macosx/contrib/4.2'
2: In system2(R(), args, stdout = TRUE, stderr = TRUE) :
running command ''/Library/Frameworks/R.framework/Resources/bin/R' CMD config CC 2>&1' had status 71
Session info:
> sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur ... 10.16
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
attached base packages:
[1] stats graphics grDevices datasets utils methods base
loaded via a namespace (and not attached):
[1] MASS_7.3-58.2 compiler_4.2.1 tools_4.2.1 renv_0.16.0
Judging from answers here and in other forums, the following might be useful information:
Output from getOption("repos"):
> getOption("repos")
CRAN MRAN
"web.cloud.r-project.org" "web.cran.microsoft.com/snapshot/2021-03-31"
Output from installing directly from utils package:
> utils::install.packages("dplyr")
Installing package into ‘/Users/dlei/Library/Caches/org.R-project.R/R/renv/library/project-01109165/R-4.2/x86_64-apple-darwin17.0’
(as ‘lib’ is unspecified)
also installing the dependencies ‘fansi’, ‘utf8’, ‘pkgconfig’, ‘withr’, ‘cli’, ‘generics’, ‘glue’, ‘lifecycle’, ‘magrittr’, ‘pillar’, ‘R6’, ‘rlang’, ‘tibble’, ‘tidyselect’, ‘vctrs’
Warning: unable to access index for repository web.cran.microsoft.com/snapshot/2021-03-31/bin/macosx/contrib/4.2:
cannot open URL 'web.cran.microsoft.com/snapshot/2021-03-31/bin/macosx/contrib/4.2/PACKAGES'
trying URL 'web.cloud.r-project.org/bin/macosx/contrib/4.2/fansi_1.0.4.tgz'
Content type 'application/x-gzip' length 364195 bytes (355 KB)
==================================================
downloaded 355 KB
trying URL 'web.cloud.r-project.org/bin/macosx/contrib/4.2/utf8_1.2.3.tgz'
Content type 'application/x-gzip' length 196823 bytes (192 KB)
==================================================
downloaded 192 KB
[...]
The downloaded binary packages are in
/var/folders/41/y38m8sw12871hpn_5nl9y88m0000gn/T//RtmpIwcHeu/downloaded_packages
Output from renv::diagnostics():
> renv::diagnostics()
Diagnostics Report [renv 0.16.0]
================================
# Session Info =======================
R version 4.2.1 (2022-06-23)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur ... 10.16
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
attached base packages:
[1] stats graphics grDevices datasets utils methods base
loaded via a namespace (and not attached):
[1] MASS_7.3-58.2 compiler_4.2.1 tools_4.2.1 renv_0.16.0
# Project ============================
Project path: "~/Desktop/Git_Repos/project"
# Status =============================
The following package(s) do not appear to be used in this project:
_
BH [1.75.0-0]
KernSmooth [2.23-18]
[...]
zip [2.2.2]
Use `renv::snapshot()` to remove them from the lockfile.
The following package(s) are recorded in the lockfile, but not installed:
_
DT [0.17]
covr [3.5.1]
[...]
staged.dependencies [remoteproject/staged.dependencies#HEAD]
usethis [2.1.6]
Use `renv::restore()` to install these packages.
The following package(s) are out of sync:
Package Lockfile Version Library Version
R6 2.5.0 2.5.1
cli 3.4.1 3.6.0
[...]
vctrs 0.5.0 0.5.2
Use `renv::snapshot()` to save the state of your library to the lockfile.
Use `renv::restore()` to restore your library from the lockfile.
The following package(s) are used in the project, but are not installed:
project.test
projectdev
Consider installing these packages, and then using `renv::snapshot()`
to record these packages in the lockfile.
# Packages ===========================
Library Source Lockfile Source Path Dependency
BH <NA> <NA> 1.75.0-0 CRAN <NA> <NA>
DT <NA> <NA> 0.17 CRAN <NA> direct
[...]
zip <NA> <NA> 2.2.2 CRAN <NA> <NA>
[1]: /Users/dlei/Library/Caches/org.R-project.R/R/renv/library/project-01109165/R-4.2/x86_64-apple-darwin17.0
[2]: /Users/dlei/Desktop/Git_Repos/project/renv/sandbox/R-4.2/x86_64-apple-darwin17.0/84ba8b13
# ABI ================================
* No ABI conflicts were detected in the set of installed packages.
# User Profile =======================
[no user profile detected]
# Settings ===========================
List of 10
$ bioconductor.version : chr(0)
$ external.libraries : chr(0)
$ ignored.packages : chr(0)
$ package.dependency.fields: chr [1:3] "Imports" "Depends" "LinkingTo"
$ r.version : chr(0)
$ snapshot.type : chr "implicit"
$ use.cache : logi TRUE
$ vcs.ignore.cellar : logi TRUE
$ vcs.ignore.library : logi TRUE
$ vcs.ignore.local : logi TRUE
# Options ============================
List of 9
$ defaultPackages : chr [1:6] "datasets" "utils" "grDevices" "graphics" ...
$ download.file.method : chr "libcurl"
$ download.file.extra : NULL
$ install.packages.compile.from.source: chr "interactive"
$ pkgType : chr "both"
$ repos : Named chr [1:2] "web.cloud.r-project.org" "web.cran.microsoft.com/snapshot/2021-03-31"
..- attr(*, "names")= chr [1:2] "CRAN" "MRAN"
$ renv.consent : logi TRUE
$ renv.project.path : chr "/Users/dlei/Desktop/Git_Repos/project"
$ renv.verbose : logi TRUE
# Environment Variables ==============
HOME = /Users/dlei
LANG = en_CA.UTF-8
MAKE = make
R_LIBS = <NA>
R_LIBS_SITE = /Library/Frameworks/R.framework/Resources/site-library
R_LIBS_USER = /Users/dlei/Library/Caches/org.R-project.R/R/renv/library/project-01109165/R-4.2/x86_64-apple-darwin17.0:/Users/dlei/Desktop/Git_Repos/project/renv/sandbox/R-4.2/x86_64-apple-darwin17.0/84ba8b13
RENV_DEFAULT_R_ENVIRON = <NA>
RENV_DEFAULT_R_ENVIRON_USER = <NA>
RENV_DEFAULT_R_LIBS = <NA>
RENV_DEFAULT_R_LIBS_SITE = /Library/Frameworks/R.framework/Resources/site-library
RENV_DEFAULT_R_LIBS_USER = /Users/dlei/Library/R/x86_64/4.2/library
RENV_DEFAULT_R_PROFILE = <NA>
RENV_DEFAULT_R_PROFILE_USER = <NA>
RENV_PROJECT = /Users/dlei/Desktop/Git_Repos/project
# PATH ===============================
- /usr/local/bin
- /usr/bin
- /bin
- /usr/sbin
- /sbin
- /Library/TeX/texbin
-
- /Applications/RStudio.app/Contents/MacOS/postback
# Cache ==============================
There are a total of 0 package(s) installed in the renv cache.
Cache path: "~/Library/Caches/org.R-project.R/R/renv/cache/v5/R-4.2/x86_64-apple-darwin17.0"
Output of getOption("download.file.method"):
> getOption("download.file.method")
[1] "libcurl"
Output of renv:::renv_download_method():
> renv:::renv_download_method()
[1] "curl"
As per recommendation in the RStudio community forum, I've tried setting renv's download method from curl to libcurl, but get a similar error:
> install.packages("dplyr")
Retrieving 'web.cloud.r-project.org/bin/macosx/contrib/4.2/dplyr_1.1.0.tgz' ...
OK [downloaded 1.5 Mb in 0.3 secs]
Installing dplyr [1.1.0] ...
FAILED
Error in if (eval(cond, envir = environment(dot))) return(eval(expr, envir = environment(dot))) :
the condition has length > 1
In addition: Warning messages:
1: could not retrieve available packages for url 'web.cran.microsoft.com/snapshot/2021-03-31/bin/macosx/contrib/4.2'
2: In system2(R(), args, stdout = TRUE, stderr = TRUE) :
running command ''/Library/Frameworks/R.framework/Resources/bin/R' CMD config CC 2>&1' had status 71
Given this output in the error message:
Error in if (eval(cond, envir = environment(dot))) return(eval(expr, envir = environment(dot))) :
the condition has length > 1
In addition: Warning message:
In system2(R(), args, stdout = TRUE, stderr = TRUE) :
running command ''/Library/Frameworks/R.framework/Resources/bin/R' CMD config CC 2>&1' had status 71
Did you recently update macOS on your machine? This looks like an issue in renv where attempts to query the current compiler could fail.
If so, I believe you can work around this by running:
xcode-select --install
from a terminal.
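To check whether the compiler query is the part that fails, the command from the warning can be run directly from R. This is only a diagnostic sketch: the helper name query_compiler is made up, it merely repeats the 'R CMD config CC' call shown in the warning, and it says nothing about what renv does with the result internally.

```r
# Hypothetical helper: run 'R CMD config CC' and report its output and exit status.
query_compiler <- function() {
  r_bin <- file.path(R.home("bin"), "R")
  out <- suppressWarnings(
    system2(r_bin, c("CMD", "config", "CC"), stdout = TRUE, stderr = TRUE)
  )
  # system2() attaches a "status" attribute only when the exit status is non-zero.
  status <- attr(out, "status")
  list(
    output = as.character(out),
    status = if (is.null(status)) 0L else as.integer(status)
  )
}
```

A non-zero status (71 in the session above) together with more than one line of output would be consistent with the "condition has length > 1" error, since the output is then no longer a single compiler name.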
The failing links were CRAN snapshot URLs with nothing behind them. Explicitly pointing to the CRAN repository by running restore(repos = "https://cloud.r-project.org") solved the issue.
This issue was solved thanks to @nirgrahamuk in the community.rstudio forum (for those interested in the original answer, see here).
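For reference, the fix described above can be wrapped as follows. This is a small sketch assuming the lockfile itself is fine and only the snapshot repository is unreachable; the names cran and restore_from_cran are made up for illustration.

```r
# Assumption: only the snapshot repository is the problem, so restore from CRAN alone.
cran <- c(CRAN = "https://cloud.r-project.org")

# Hypothetical wrapper around renv::restore() with the repository set explicitly.
restore_from_cran <- function(...) {
  if (!requireNamespace("renv", quietly = TRUE)) {
    stop("renv is not installed")
  }
  renv::restore(repos = cran, ...)
}
```

Note that package versions recorded in the lockfile may no longer be available as binaries from the current CRAN, in which case renv may need to install them from source.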

Why would loading a package change the resid function being used?

I understand that resid() is a generic function in R, and which specific residual function is used depends on the object to which resid() is applied, just like print().
However, I noticed that loading a package sometimes changes which specific residual function is used, yielding drastically different residual plots. Could anyone help me understand why that happens?
This is an example from my data:
> #### Showing packages loaded after starting up R ####
> search()
[1] ".GlobalEnv" "tools:rstudio" "package:stats" "package:graphics" "package:grDevices" "package:utils"
[7] "package:datasets" "package:methods" "Autoloads" "package:base"
>
> #### Before loading nlme ####
>
> ## s1 is a gls object, calculated using the nlme package
> s1 <- readRDS("../Data/my_gls.RDS")
> qqnorm(resid(s1, type = "pearson"), main = "before loading nlme")
> qqline(resid(s1, type = "pearson"))
>
> methods(resid)
[1] residuals.default* residuals.glm residuals.HoltWinters* residuals.isoreg* residuals.lm
[6] residuals.nls* residuals.smooth.spline* residuals.tukeyline*
see '?methods' for accessing help and source code
Warning message:
In .S3methods(generic.function, class, envir) :
generic function 'resid' dispatches methods for generic 'residuals'
> sloop::s3_dispatch(resid(s1, type = "pearson"))
resid.gls
=> resid.default
> ## the resid.default is used
[Figure: normal Q-Q plot of the residuals, titled "before loading nlme"]
Then, after loading the nlme package,
> #### After loading nlme ####
>
> library(nlme)
Warning message:
package ‘nlme’ was built under R version 4.1.2
> search()
[1] ".GlobalEnv" "package:nlme" "tools:rstudio" "package:stats" "package:graphics" "package:grDevices"
[7] "package:utils" "package:datasets" "package:methods" "Autoloads" "package:base"
>
> # s2 is the same as s1
> s2 <- readRDS("../Data/my_gls.RDS")
> qqnorm(resid(s2, type = "pearson"), main = "after loading nlme")
> qqline(resid(s2, type = "pearson"))
>
> methods(resid)
[1] residuals.default* residuals.glm residuals.gls* residuals.glsStruct* residuals.gnls*
[6] residuals.gnlsStruct* residuals.HoltWinters* residuals.isoreg* residuals.lm residuals.lme*
[11] residuals.lmeStruct* residuals.lmList* residuals.nlmeStruct* residuals.nls* residuals.smooth.spline*
[16] residuals.tukeyline*
see '?methods' for accessing help and source code
Warning message:
In .S3methods(generic.function, class, envir) :
generic function 'resid' dispatches methods for generic 'residuals'
> sloop::s3_dispatch(resid(s2, type = "pearson"))
=> resid.gls
* resid.default
> # resid.gls is used
[Figure: normal Q-Q plot of the residuals, titled "after loading nlme"]
As sloop::s3_dispatch(resid(s1, type = "pearson")) shows, resid.default is used before the nlme package is loaded, but resid.gls is used after nlme is loaded. Why the change? Is it because resid.gls is not among the methods available to resid() until nlme is loaded, as the first methods(resid) output suggests?
I am using R 4.1.0 and would appreciate any feedback. Thank you.
> version
_
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 4
minor 1.0
year 2021
month 05
day 18
svn rev 80317
language R
version.string R version 4.1.0 (2021-05-18)
nickname Camp Pontaneze
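The output above is consistent with how S3 dispatch works. Methods defined in a package are registered only when that package's namespace is loaded, and readRDS() restores the object with its class attribute but does not load nlme. Until then the generic falls back to the default method, which as far as I know simply returns the stored residuals and ignores type = "pearson". The toy sketch below mimics this with a made-up generic (describe) and a made-up class (toy_gls); it does not use nlme itself.

```r
# Toy generic with a default method, standing in for residuals().
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "default method"

# An object whose class has no registered method yet,
# like a gls object read with readRDS() before nlme is loaded.
obj <- structure(list(), class = "toy_gls")
before <- describe(obj)

# Defining the method plays the role of library(nlme) registering residuals.gls.
describe.toy_gls <- function(x, ...) "toy_gls method"
after <- describe(obj)

print(c(before = before, after = after))
```

In the real case, calling library(nlme) or loadNamespace("nlme") before resid() should be enough to get the gls method.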

\xe8 matching \xf1 in str_detect() and str_replace_all()

I want to process text files in R that contain some characters shown as hexadecimal escapes. When I tried to convert those back into more readable characters, I encountered some behaviour of stringr functions that I did not expect. Specifically, \xe8 apparently matches \xf1:
> library("tidyverse")
> str <- "ni\xf1a"
> str_detect(str, "\xe8")
[1] TRUE
This is inconvenient when I want to convert \xe8 into è and \xf1 into ñ in the same files:
> str %>%
+ str_replace_all("\xe8", "è") %>%
+ str_replace_all("\xf1", "ñ")
[1] "nièa" # I expect niña
Interestingly, gsub() works as I expect:
> str %>%
+ gsub("\xe8", "è", .) %>%
+ gsub("\xf1", "ñ", .)
[1] "niña"
Why does \xe8 match \xf1 in str_detect() and str_replace_all()? Is there a way to avoid it?
Why is the behaviour different between stringr functions and gsub()?
Update
Here is part of the output of devtools::session_info():
> devtools::session_info()
─ Session info ──────────────────────────────────────────────────────────────────
setting value
version R version 4.0.2 (2020-06-22)
os macOS Catalina 10.15.7
system x86_64, darwin17.0
ui RStudio
language (EN)
collate en_GB.UTF-8
ctype en_GB.UTF-8
tz Europe/London
date 2020-09-30
─ Packages ──────────────────────────────────────────────────────────────────────
package * version date lib source
...
stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.0.2)
...
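A likely explanation, offered with some caution: in a UTF-8 session, "\xf1" and "\xe8" are single bytes that do not form valid UTF-8, and a library that has to interpret strings as Unicode may map both to the same substitute character. One way to sidestep this is to convert the text to valid UTF-8 before matching. The sketch below uses base R only and assumes the bytes are Latin-1, which fits the expectation that \xe8 is è and \xf1 is ñ; the variable names are made up.

```r
# The raw input: byte 0xF1 on its own is not valid UTF-8.
raw_text <- "ni\xf1a"
print(validUTF8(raw_text))

# Assumption: the bytes are Latin-1, where 0xF1 is n-tilde and 0xE8 is e-grave.
utf8_text <- iconv(raw_text, from = "latin1", to = "UTF-8")
print(validUTF8(utf8_text))

# Code points after conversion: 110 105 241 97, i.e. n, i, U+00F1, a.
print(utf8ToInt(utf8_text))
```

After the conversion there should be nothing left to replace, and pattern matching operates on proper characters rather than on stray bytes.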

Why did R's sorting change data imported with load() after an upgrade from 3.5.2 to 4.0.0?

Short version: I load() data in a package. A test in the package used to pass; now it fails because the output of sort() changed.
Here is a minimal reproducible example - for details see below:
y <- c("Schaffhausen", "Schwyz", "Seespital", "SRZ")
sort(y)
# OLD 3.5.2 [1] "Schaffhausen" "Schwyz" "Seespital" "SRZ"
# NEW 4.0.0 [1] "SRZ" "Schaffhausen" "Schwyz" "Seespital"
# Update 4.0.2 see comment:
# [1] "Schaffhausen" "Schwyz" "Seespital" "SRZ"
# From jay.sf's comment
sort.int(y, method="radix")
# [1] "SRZ" "Schaffhausen" "Schwyz" "Seespital"
sort.int(y, method="shell")
# [1] "Schaffhausen" "Schwyz" "Seespital" "SRZ"
# From Henrik's comment:
data.table::fsort(y)
# [1] "SRZ" "Schaffhausen" "Schwyz" "Seespital"
The only related reported change I found is
CHANGES IN R 4.0.0
NEW FEATURES
...
When loading data sets via read.table(), data() now uses LC_COLLATE=C to ensure locale-independent results for possible string-to-factor conversions.
But I am not even sure whether this could explain what I see.
As I want to minimize the number of imported packages and would like to understand what is going on, I am not sure how to proceed. Am I missing something?
(A change to a sort.int with method radix would do the job, but still: Why did it change? Is that really better?
I just realized (thanks to Roland) that in my case sort() calls sort.int():
function (x, decreasing = FALSE, na.last = NA, ...)
{
if (is.object(x))
x[order(x, na.last = na.last, decreasing = decreasing)]
else sort.int(x, na.last = na.last, decreasing = decreasing,
...)
}
From ?sort.int:
The "auto" method selects "radix" for short (less than 2^31 elements) numeric vectors, integer vectors, logical vectors and factors; otherwise, "shell".)
And according to the docs, sort.int did not change from 4.0.0 to 4.0.2.
From ?data.table::setorder
data.table always reorders in "C-locale". As a consequence, the
ordering may be different to that obtained by base::order. In English
locales, for example, sorting is case-sensitive in C-locale. Thus,
sorting c("c", "a", "B") returns c("B", "a", "c") in data.table but
c("a", "B", "c") in base::order. Note this makes no difference in most
cases of data; both return identical results on ids where only
upper-case or lower-case letters are present ("AB123" < "AC234" is
true in both), or on country names and other proper nouns which are
consistently capitalized. For example, neither "America" < "Brazil"
nor "america" < "brazil" are affected since the first letter is
consistently capitalized.
Using C-locale makes the behaviour of sorting in data.table more
consistent across sessions and locales. The behaviour of base::order
depends on assumptions about the locale of the R session. In English
locales, "america" < "BRAZIL" is true by default but false if you
either type Sys.setlocale(locale="C") or the R session has been
started in a C locale for you – which can happen on servers/services
since the locale comes from the environment the R session was started
in. By contrast, "america" < "BRAZIL" is always FALSE in data.table
regardless of the way your R session was started.
(Related questions Language dependent sorting with R and Best practice: Should I try to change to UTF-8 as locale or is it safe to leave it as is?)
Details
R.version # old _
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 3
minor 5.2
year 2018
month 12
day 20
svn rev 75870
language R
version.string R version 3.5.2 (2018-12-20)
nickname Eggshell Igloo
y <- c("Schaffhausen", "Schwyz", "Seespital", "SRZ")
sort(y)
# [1] "Schaffhausen" "Schwyz" "Seespital" "SRZ"
stringr::str_sort(y)
# [1] "Schaffhausen" "Schwyz" "Seespital" "SRZ"
stringr::str_sort(y, locale = "C")
# [1] "SRZ" "Schaffhausen" "Schwyz" "Seespital"
# =======
R.version # new after upgrade
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 4
minor 0.0
year 2020
month 04
day 24
svn rev 78286
language R
version.string R version 4.0.0 (2020-04-24)
nickname Arbor Day
y <- c("Schaffhausen", "Schwyz", "Seespital", "SRZ")
sort(y)
# [1] "SRZ" "Schaffhausen" "Schwyz" "Seespital"
stringr::str_sort(y)
# [1] "Schaffhausen" "Schwyz" "Seespital" "SRZ"
stringr::str_sort(y, locale = "C")
#[1] "SRZ" "Schaffhausen" "Schwyz" "Seespital"
# ==== Test with new 4.0.2
R.version
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 4
minor 0.2
year 2020
month 06
day 22
svn rev 78730
language R
version.string R version 4.0.2 (2020-06-22)
nickname Taking Off Again
y <- c("Schaffhausen", "Schwyz", "Seespital", "SRZ")
sort(y)
# [1] "Schaffhausen" "Schwyz" "Seespital" "SRZ"
stringr::str_sort(y)
# [1] "Schaffhausen" "Schwyz" "Seespital" "SRZ"
stringr::str_sort(y, locale = "C")
# [1] "SRZ" "Schaffhausen" "Schwyz" "Seespital"
In summary, as @Roland figured out, it was a bug that was fixed in R version 4.0.1.
From CRAN:
In R 4.0.0, sort.list(x) when is.object(x) was true, e.g., for x <- I(letters), was accidentally using method = "radix". Consequently,
e.g., merge(<data.frame>) was much slower than previously; reported in
PR#17794.
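If a test should not depend on the session locale or on which code path sort() happens to take, one option is to request the C-locale ordering explicitly, as the sort.int(y, method = "radix") call in the question already does. A minimal sketch (the variable name c_order is made up):

```r
y <- c("Schaffhausen", "Schwyz", "Seespital", "SRZ")

# method = "radix" sorts character vectors in the C locale,
# so upper-case "R" sorts before lower-case "c".
c_order <- sort(y, method = "radix")
print(c_order)
# [1] "SRZ"          "Schaffhausen" "Schwyz"       "Seespital"
```

The trade-off is that this is byte order, not dictionary order, so the expected values in the test have to be written accordingly.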

Negative singular value using LAPACK svd

I have a 399 x 399 matrix for which the R svd() function (using LAPACK) gives me a negative singular value! This is not supposed to happen -- has anyone seen this before? It does not happen if I use the LINPACK option, so I guess this is a bug in the LAPACK svd.
ganymede: R --vanilla
R version 2.15.1 (2012-06-22) -- "Roasted Marshmallows"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
R is free software, etc ...
> load('A.dat')
> ls()
[1] "A"
> dim(A)
[1] 399 399
>
> L1 <- svd(A)
> any( L1$d < 0 )
[1] TRUE
> L1$d[1:4]
[1] 80.18833 68.93905 61.62659 57.62883
> L1$d[396:399]
[1] 3.777844e-15 3.582460e-15 3.175665e-15 -6.512578e+00
>
> L2 <- svd(A,LINPACK=TRUE)
> any( L2$d < 0 )
[1] FALSE
> L2$d[1:4]
[1] 80.18833 68.93905 61.62659 57.62883
> L2$d[396:399]
[1] 8.565532e-32 3.254162e-32 3.484425e-47 5.411232e-48
>
Per this bug report, the problem can most likely be fixed by relinking against LAPACK 3.4.1. Upgrading the generic version 3.1.1 seems to be non-trivial, and the procedure depends on the OS.
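Until the library is relinked, it may be worth checking each decomposition before using it. The sketch below is one way to do that in base R; the helper name check_svd is made up, and it only tests the properties that failed in the question (non-negative, descending singular values) plus the reconstruction error.

```r
# Hypothetical helper: basic sanity checks on the result of svd(A).
check_svd <- function(A) {
  s <- svd(A)
  # Rebuild A from its factors; diag(..., nrow = ) also handles a single value.
  recon <- s$u %*% diag(s$d, nrow = length(s$d)) %*% t(s$v)
  list(
    nonnegative = all(s$d >= 0),
    descending  = !is.unsorted(rev(s$d)),
    rel_error   = norm(A - recon, "F") / max(norm(A, "F"), .Machine$double.eps)
  )
}
```

For the matrix in the question, check_svd(A)$nonnegative would presumably be FALSE on the affected build.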
