I have a toy function, foo, that just adds 5 to a variable x. I have a second function, n_foo, that applies foo to a data.table n times. It works like so:
# Load library
library(data.table)
# Dummy function
foo <- function(x){
x + 5
}
# Apply foo n times
n_foo <- function(x, n){
Reduce(function(a, b) foo(a), 1:n, init = x)
}
# Dummy data
dt <- data.table(values = 1:10)
# Run foo 5 times
dt[, test := n_foo(.SD, 5)]
# See results
dt
#> values test
#> 1: 1 26
#> 2: 2 27
#> 3: 3 28
#> 4: 4 29
#> 5: 5 30
#> 6: 6 31
#> 7: 7 32
#> 8: 8 33
#> 9: 9 34
#> 10: 10 35
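For reference, the Reduce call in n_foo ignores its second argument b and just feeds the accumulated value back into foo, so n_foo(x, n) amounts to applying foo n times in a row. Here is a minimal sketch on a plain vector, with no data.table involved. The name n_foo_loop is hypothetical and is introduced only for comparison:

```r
foo <- function(x) x + 5
n_foo <- function(x, n) Reduce(function(a, b) foo(a), 1:n, init = x)

# Hypothetical loop version, only to illustrate what Reduce does here
n_foo_loop <- function(x, n) {
  for (i in seq_len(n)) x <- foo(x)
  x
}

res_reduce <- n_foo(1:10, 5)
res_loop <- n_foo_loop(1:10, 5)

# Both versions add 5 five times, so the results should agree
identical(res_reduce, res_loop)
```

Running n_foo on a plain vector like this is also a cheap way to check whether a problem lies in the function itself or in how it is called from inside dt[...].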
Great! Now, say something was amiss and I wanted to debug n_foo. I'd pull out the trusty debug function.
WARNING: THE FOLLOWING CODE MIGHT CRASH YOUR SESSION.
# Load library
library(data.table)
# Dummy function
foo <- function(x){
x + 5
}
# Apply foo n times
n_foo <- function(x, n){
Reduce(function(a, b) foo(a), 1:n, init = x)
}
# Dummy data
dt <- data.table(values = 1:10)
debug(n_foo)
# Run foo 5 times
dt[, test := n_foo(.SD, 5)]
# See results
dt
produces a fatal error.
Curiously, the session doesn't crash if this code is run using reprex. Why does this code lead to a fatal error?
Edit:
It turns out I can only produce this issue in RStudio and not at the CLI. RStudio tag added accordingly.
R version 4.0.0 (2020-04-24)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.5
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.12.8
loaded via a namespace (and not attached):
[1] compiler_4.0.0 tools_4.0.0
No crash for me, but it does go into debugging:
debugging in: n_foo(.SD, 5)
debug at #1: {
Reduce(function(a, b) foo(a), 1:n, init = x)
}
Browse[2]>
Session info:
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)
other attached packages:
[1] data.table_1.12.8
RStudio 1.3.959
After upgrading to RStudio v. 1.3.959, I could no longer reproduce the error.
I was wondering why R is making a copy-on-modification after using str.
I create a matrix. I can change its dim, one element, or even all elements, and no copy is made. But after I call str, R makes a copy during the next modification of the matrix. Why is this happening?
m <- matrix(1:12, 3)
tracemem(m)
#[1] "<0x559df861af28>"
dim(m) <- 4:3
m[1,1] <- 0L
m[] <- 12:1
str(m)
# int [1:4, 1:3] 12 11 10 9 8 7 6 5 4 3 ...
dim(m) <- 3:4 #Here after str a copy is made
#tracemem[0x559df861af28 -> 0x559df838e4a8]:
dim(m) <- 3:4
str(m)
# int [1:3, 1:4] 12 11 10 9 8 7 6 5 4 3 ...
dim(m) <- 3:4 #Here again after str a copy
#tracemem[0x559df838e4a8 -> 0x559df82c9d78]:
I was also wondering why a copy is made when a task callback is registered.
TCB <- addTaskCallback(function(...) TRUE)
m <- matrix(1:12, nrow = 3)
tracemem(m)
#[1] "<0x559dfa79def8>"
dim(m) <- 4:3 #Copy on modification
#tracemem[0x559dfa79def8 -> 0x559dfa8998e8]:
removeTaskCallback(TCB)
#[1] TRUE
dim(m) <- 4:3 #No copy
sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 10 (buster)
Matrix products: default
BLAS: /usr/local/lib/R/lib/libRblas.so
LAPACK: /usr/local/lib/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=de_AT.UTF-8 LC_NUMERIC=C
[3] LC_TIME=de_AT.UTF-8 LC_COLLATE=de_AT.UTF-8
[5] LC_MONETARY=de_AT.UTF-8 LC_MESSAGES=de_AT.UTF-8
[7] LC_PAPER=de_AT.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_AT.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.0.3
This is a follow-up question to "Is there a way to prevent copy-on-modify when modifying attributes?".
I start R with R --vanilla to have a clean session.
I have asked this question on R-help, as suggested by @sam-mason in the comments.
The answer from Luke Tierney solved the issue with str:
As of R 4.0.0 it is in some cases possible to reduce reference counts
internally and so avoid a copy in cases like this. It would be too
costly to try to detect all cases where a count can be dropped, but it
this case we can do better. It turns out that the internals of
pos.to.env were unnecessarily creating an extra reference to the call
environment (here in a call to exists()). This is fixed in r79528.
Thanks.
And related to Task Callback:
It turns out there were some issues with the way calls to the
callbacks were handled. This has been revised in R-devel in r79541.
This example will no longere need to duplicate in R-devel.
Thanks for the report.
I have a .csv file that I saved with UTF-8 encoding. The data in this file is in Devanagari script. I can see the words correctly when I open the CSV file in Excel:
में
लिए
किया
गया
हैं
नहीं
सिंह
पुलिस
दिया
करने
कहा
रहे
बाद
करें
साथ
रहा
But when I open the file in R, the words are not decoded correctly. The output of print() looks like this:
word
सारे_खतरों_को
जानते_हà¥\u0081à¤\u008f_à¤à¥€
विवेक_ने
टीवी
How can I resolve this? I have tried Sys.setlocale() and read.delim("wordlist.csv", encoding = "UTF-8"), but neither worked.
Too long for comment (sorry, I'm a greenhorn in R):
print( sessionInfo())
library(stringi)
library(magrittr)
x <- read.delim("D:\\bat\\SO\\64497248_devangari.csv", encoding = "UTF-8")
print('=== print(x)')
print(x)
for (line in x){
y <- line %>%
stri_replace_all_regex("<U\\+([[:alnum:]]+)>", "\\\\u$1") %>%
stri_unescape_unicode() %>%
stri_enc_toutf8()
}
print('=== print(y)')
print(y)
print('=== for (i in y) {print(i)}')
for (i in y) {print(i)}
print('=== print(z)')
z <- x['word'] %>%
stri_replace_all_regex("<U\\+([[:alnum:]]+)>", "\\\\u$1") %>%
stri_unescape_unicode() %>%
stri_enc_toutf8()
print(z)
Output (in Rgui.exe console):
> source ( 'D:\\bat\\SO\\64497248.r' )
R version 4.0.1 (2020-06-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)
Matrix products: default
locale:
[1] LC_COLLATE=Czech_Czechia.1250 LC_CTYPE=Czech_Czechia.1250 LC_MONETARY=Czech_Czechia.1250
[4] LC_NUMERIC=C LC_TIME=Czech_Czechia.1250
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.0.1
[1] "=== print(x)"
word
1 <U+092E><U+0947><U+0902>
2 <U+0932><U+093F><U+090F>
3 <U+0915><U+093F><U+092F><U+093E>
4 <U+0917><U+092F><U+093E>
5 <U+0939><U+0948><U+0902>
6 <U+0928><U+0939><U+0940><U+0902>
7 <U+0938><U+093F><U+0902><U+0939>
8 <U+092A><U+0941><U+0932><U+093F><U+0938>
9 <U+0926><U+093F><U+092F><U+093E>
10 <U+0915><U+0930><U+0928><U+0947>
11 <U+0915><U+0939><U+093E>
12 <U+0930><U+0939><U+0947>
13 <U+092C><U+093E><U+0926>
14 <U+0915><U+0930><U+0947><U+0902>
15 <U+0938><U+093E><U+0925>
16 <U+0930><U+0939><U+093E>
[1] "=== print(y)"
[1] "में" "लिए" "किया" "गया" "हैं" "नहीं" "सिंह" "पुलिस" "दिया" "करने" "कहा" "रहे" "बाद" "करें" "साथ" "रहा"
[1] "=== for (i in y) {print(i)}"
[1] "में"
[1] "लिए"
[1] "किया"
[1] "गया"
[1] "हैं"
[1] "नहीं"
[1] "सिंह"
[1] "पुलिस"
[1] "दिया"
[1] "करने"
[1] "कहा"
[1] "रहे"
[1] "बाद"
[1] "करें"
[1] "साथ"
[1] "रहा"
[1] "=== print(z)"
[1] "c(\"में\", \"लिए\", \"किया\", \"गया\", \"हैं\", \"नहीं\", \"सिंह\", \"पुलिस\", \"दिया\", \"करने\", \"कहा\", \"रहे\", \"बाद\", \"करें\", \"साथ\", \"रहा\"\n)"
Warning messages:
1: package ‘magrittr’ was built under R version 4.0.2
2: In stri_replace_all_regex(., "<U\\+([[:alnum:]]+)>", "\\\\u$1") :
argument is not an atomic vector; coercing
>
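A side note on the last warning in the output above: x['word'] is a one-column data frame rather than a character vector, which is why stringi reports "argument is not an atomic vector; coercing" and z comes out as one deparsed string. Below is a minimal sketch of the same replacement applied to the column itself, assuming the strings literally contain <U+XXXX> markers. The escaped vector is hypothetical stand-in data, not the real file:

```r
library(stringi)

# Hypothetical stand-in data: strings that literally contain <U+XXXX> markers
escaped <- c("<U+092E><U+0947><U+0902>", "<U+0932><U+093F><U+090F>")
x <- data.frame(word = escaped, stringsAsFactors = FALSE)

# x[["word"]] (or x$word) is a character vector; x["word"] is a data frame
col <- x[["word"]]

# Turn each <U+XXXX> marker into a \uXXXX escape sequence ...
as_escapes <- stri_replace_all_regex(col, "<U\\+([[:alnum:]]+)>", "\\\\u$1")

# ... and then decode the escape sequences into real characters
words <- stri_unescape_unicode(as_escapes)

# Inspect the code points of the first decoded word
utf8ToInt(words[1])
```

Whether the decoded words then display properly still depends on the console and locale, as the session above shows.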
I am setting up a foreach loop with %dopar%. The function inside should be executed n times. However, no matter what the function is, it is executed only 6 times. There are no errors. How do I make foreach execute all iterations?
I tried changing the number passed to makePSOCKcluster() and the results are the same. Currently I am running makePSOCKcluster(nworkers) with nworkers = 7 (parallel::detectCores() returns 8). I was originally running a complex function inside my loop, but when I substituted something simple like sqrt() I got the same result. I am working on a Mac:
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.14.2
Using package versions:
doParallel_1.0.11 iterators_1.0.10 foreach_1.4.4
suppressPackageStartupMessages( library(foreach) )
suppressPackageStartupMessages( library(doParallel) )
nworkers <- parallel::detectCores() -1
cl <- makePSOCKcluster(nworkers)
registerDoParallel(cl)
epkgs = c("lubridate","dplyr","tidyr","doParallel","foreach", "Rcpp")
efuns = ls(globalenv())
foreach( i= 1:31, packages = epkgs, .export = efuns) %dopar% { sqrt(i)}
The result I get is:
[[1]]
[1] 1
[[2]]
[1] 1.414214
[[3]]
[1] 1.732051
[[4]]
[1] 2
[[5]]
[1] 2.236068
[[6]]
[1] 2.44949
whereas the loop should have returned 31 elements.
Update: I was missing a "." in front of packages, so foreach assumed it was another iterator. Six packages, six results.
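To make the difference concrete, here is a minimal sketch using the sequential %do% operator, so that no cluster is needed. The choice of "stats" for .packages is an assumption made only so that the example runs on any installation:

```r
library(foreach)

epkgs <- c("lubridate", "dplyr", "tidyr", "doParallel", "foreach", "Rcpp")

# Without the leading dot, `packages` is treated as a second iteration
# variable, so the loop stops when the shorter one (6 names) runs out
short <- foreach(i = 1:31, packages = epkgs) %do% sqrt(i)
length(short)

# With the leading dot, `.packages` is a control argument and all
# 31 values of i are processed
full <- foreach(i = 1:31, .packages = "stats") %do% sqrt(i)
length(full)
```

The same applies to %dopar%: any argument without a leading dot is an iteration variable, and the loop length is set by the shortest one.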
How can I plot unicode symbols like the 🚺 WOMENS SYMBOL or the 🚹 MENS SYMBOL, or other symbols from that codeblock? Apart from setting a font family that contains those characters, R hangs on my system* when using the point character pch like that:
plot(0, type="n")
points(1, .5, pch=-0xfffdL)
# works
points(1, -.5, pch=-0x1f6b9L)
# R hangs
As the doc states,
Where supported by the OS, negative values specify a Unicode code
point, so e.g. -0x2642L is a ‘male sign’ and -0x20ACL is the Euro.
*My sessionInfo():
R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] rsconnect_0.3.79 tools_3.2.2
Thanks for help & checking on your system in advance.
Edit: Windows hangs when I use RStudio 0.99.879 with the RStudio graphics device. If I use dev.new(noRStudioGD=T) explicitly, then I get a similar error to the one mentioned in the comments: "Error in plot.xy(xy.coords(x, y), type = type, ...) : invalid input '🚹' in 'utf8towcs'". For now, I'll use the PNG fallback option as mentioned by @42-.
I don't have an answer, but there are some public domain versions in PNG format.
You should be able to shrink these and print them at the desired locations; see "using custom images instead of standard shapes for R line chart markers":
library(png)
library(ggplot2) # needed for ggplot() and annotation_custom() below
img <- readPNG('~/Downloads/mens_room_clip_art_9332/Mens_Room_clip_art_small.png')
str(img)
# num [1:100, 1:100, 1:4] 1 1 1 1 1 1 1 0 0 0 ...
require(grid)
#Loading required package: grid
male <- rasterGrob(img)
img <- readPNG('~/Downloads/ladies_room_clip_art_16926/Ladies_Room_clip_art_small.png')
female <- rasterGrob(img)
df = data.frame(x=rep(1:4,2), y=c(1,1,2,4,6.5,5,5.5,4.8), g=rep(c("s","m"),each=4))
p = ggplot(df, aes(x, y, group=g)) +
geom_line() +
theme_bw()
a=0.2
for (i in rownames(df[df$g=="s",])) {
p = p + annotation_custom(male, df[i,"x"]-a,df[i,"x"]+a,df[i,"y"]-a,df[i,"y"]+a)
}
b=0.2
for (i in rownames(df[df$g=="m",])) {
p = p + annotation_custom(female, df[i,"x"]-b,df[i,"x"]+b,df[i,"y"]-b,df[i,"y"]+b)
}
png();print(p);dev.off()
I also took the images, pasted them into GIMP, and scaled them to 24 pixels:
I see strange behaviour with lmer: when I save the fit into an object, let's say fit0, I can look at the summary (output not shown):
>summary(fit0)
If I save the objects using save.image(), close the session and reopen it again, summary gives me:
>summary(fit0)
Error in diag(vcov(object, use.hessian = use.hessian))
error in evaluating the argument 'x' in selecting a method for function 'diag': Error in object@pp$unsc() : object 'merPredDunsc' not found
If I run the model again, I get the expected summary, but I lose it again if I close the session.
What happens? How can I avoid this Error?
Thanks for help.
Environment and version:
Windows 7
R version 3.1.2 (2014-10-31)
GNU Emacs 24.3.1 (i386-mingw-nt6.1.7601)/ESS
Here is a minimal example:
# j: cluster
# i[j]: i in cluster j
# yi[j] = zi[j] + N(0,1)
# zi[j] = b0j + b1*xi[j]
# b0j = g0 + u0j, u0j ~ N(0,sd0)
# b1 = const
library(lme4)
# Number of clusters (level 2)
N <- 20
# intercept
g0 <- 1
sd0 <- 2
# slope
b1 <- 3
# Number of observations (level 1) for cluster j
nj <- 10
# Vector of clusters indices 1,1...n1,2,2,2,....n2,...N,N,....nN
j <- c(sapply(1:N, function(x) rep(x, nj)))
# Vector of random variable
uj <- c(sapply(1:N, function(x)rep(rnorm(1,0,sd0), nj)))
# Vector of fixed variable
x1 <- rep(runif(nj),N)
# linear combination
z <- g0 + uj + b1*x1
# add error
y <- z + rnorm(N*nj,0,1)
# Put all together
d0 <- data.frame(j, y=y, z=z,x1=x1, uj=uj)
head(d0)
# mixed model
fit0 <- lmer(y ~ x1 + (1|j), data = d0)
vcov(fit0)
summary(fit0)
save.image()
After restarting and loading library lme4:
> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=German_Switzerland.1252 LC_CTYPE=German_Switzerland.1252
[3] LC_MONETARY=German_Switzerland.1252 LC_NUMERIC=C
[5] LC_TIME=German_Switzerland.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lme4_1.1-7 Rcpp_0.11.0 Matrix_1.1-2-2
loaded via a namespace (and not attached):
[1] compiler_3.1.2 grid_3.1.2 lattice_0.20-29 MASS_7.3-35
[5] minqa_1.2.3 nlme_3.1-118 nloptr_1.0.4 splines_3.1.2
[9] tools_3.1.2
>