Two methods of storing plot objects, in a list or under a generated name, are described on the page "Generating names iteratively in R for storing plots", but neither seems to work on my system.
> plist = list()
> plist[[1]] = plot(1:30)
>
> plist
list()
>
> plist[[1]]
Error in plist[[1]] : subscript out of bounds
Second method:
> assign('pp', plot(1:25))
>
> pp
NULL
I am using:
> R.version
_
platform i486-pc-linux-gnu
arch i486
os linux-gnu
system i486, linux-gnu
status
major 3
minor 2.0
year 2015
month 04
day 16
svn rev 68180
language R
version.string R version 3.2.0 (2015-04-16)
nickname Full of Ingredients
Where is the problem?
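Base-graphics plot() draws on the device as a side effect and returns NULL, so there is nothing to store (and assigning NULL to plist[[1]] leaves the list empty). Use recordPlot and replayPlot: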
plot(BOD)
plt <- recordPlot()
plot(0)
replayPlot(plt)
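The same idea extends to the list from the question. The following is only a sketch: the off-screen pdf(NULL) device and the dev.control() call are assumptions added so that it also runs in a non-interactive session, where the display list is not recorded by default.

```r
# Off-screen device so the sketch runs without a display (assumption, not from the question)
pdf(NULL)
dev.control(displaylist = "enable")  # recordPlot() needs the display list to be recorded

plist <- list()
for (i in 1:3) {
  plot(seq_len(10 * i), main = paste("plot", i))
  plist[[i]] <- recordPlot()         # snapshot of the current device contents
}

replayPlot(plist[[2]])               # redraw the second stored plot
invisible(dev.off())
```

Each list element is an object of class "recordedplot"; printing it at the console also replays it.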
Trying to read an SPSS file (.sav format) in R raises:
Error: file is not in any supported SPSS format.
This happens when trying to read the .sav file with foreign and read.spss.
Trying the memisc package and its as.data.set(spss.system.file("my_file")) raises:
Error in spss.readheader(file) : not a sysfile
The file is a very long SPSS file containing over 2 million entries and hundreds of factors. The factors vary: many are categorical ("Yes" / "No" / "Missing" / "None"), some are numerical (IDs etc.), some are labelled with text ("State One" / "State 2" / "State 3") and some are mixed ("1" / "20" / "3732" / "Technical Problem"). Sadly, I can't give you a subset of my data (severe privacy restrictions, and I don't have an SPSS license).
Reading this file in and storing it as a feather file (.fea format) already has worked on another computer - that might have had another version of R installed. I have no way of checking what version that was though...
Currently, I'm working in R version 3.4.4 (2018-03-15) on Windows 10, and use packages memisc_0.99.17.2 and foreign_0.8-71. The file is stored on a server; my R is installed under a user account on the local drive.
This is the code I've tried:
library(foreign)
ws <- "my_workspace_in_local_user"
setwd(ws)
dataDir <- "my_directory_on_the_server_containing_the_file"
# file.path() inserts the path separator that paste0() would omit
fn <- file.path(dataDir, "my_file.sav")
dat <- read.spss(fn, to.data.frame = TRUE)
and
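library(foreign)
ws <- "my_workspace_in_local_user"
setwd(ws)
dataDir <- "my_directory_on_the_server_containing_the_file"
# file.path() inserts the path separator that paste0() would omit
fn <- file.path(dataDir, "my_file.sav")
install.packages("memisc")
library(memisc)
# as.data.set() takes an importer object, not a bare path;
# to.data.frame is an argument of read.spss(), not of memisc
dat <- as.data.set(spss.system.file(fn))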
Does anybody have an idea why this wouldn't work? I'm suspecting it's a problem of which version of R and the packages to use...?
Your first set of code worked for me on macOS 10.15.1 (Catalina) and R 3.6.1 with memisc_0.99.17.2 and foreign_0.8-71.
R version 3.6.1 (2019-07-05) -- "Action of the Toes"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin15.6.0 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
[R.app GUI 1.70 (7684) x86_64-apple-darwin15.6.0]
> require(foreign)
Loading required package: foreign
> dataDir <- "~/Samples/English/"
> fn <- paste0(dataDir, "accidents.sav")
> dat <- read.spss(fn, to.data.frame = TRUE)
> print(dat)
agecat gender accid pop
1 Under 21 Female 57997 198522
2 21-25 Female 57113 203200
3 26-30 Female 54123 200744
4 Under 21 Male 63936 187791
5 21-25 Male 64835 195714
6 26-30 Male 66804 208239
"accidents.sav" is an example data file that ships with IBM SPSS Statistics versions 19.0 through 26.0.
If this code works for you against known data from IBM SPSS, then you can probably rule out your R version and configuration as a cause. Unfortunately that probably means your *.sav file is corrupted in some way.
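One quick way to probe the "corrupted file" hypothesis without an SPSS license is to look at the first bytes of the file. The sketch below rests on an assumption taken from the PSPP documentation of the system file format, namely that a .sav file begins with the four ASCII characters "$FL2" (or "$FL3" for the compressed .zsav variant). The helper name looks_like_sav is hypothetical, and the two temporary files are made up for the demonstration.

```r
# Hypothetical helper: TRUE if the file starts with an SPSS system-file signature.
# The "$FL2"/"$FL3" signatures are an assumption based on the PSPP format notes.
looks_like_sav <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  magic <- readBin(con, what = "raw", n = 4L)   # first four bytes, compared as raw
  identical(magic, charToRaw("$FL2")) || identical(magic, charToRaw("$FL3"))
}

# Demonstration on two throwaway files (not real SPSS data)
good <- tempfile(fileext = ".sav")
bad  <- tempfile(fileext = ".sav")
writeBin(charToRaw("$FL2@(#) made-up header"), good)
writeBin(charToRaw("PK-something-else"), bad)

looks_like_sav(good)
looks_like_sav(bad)
```

If the check fails on the real file, the file may be a different format with a .sav extension (for example a portable file or an archive), or it may have been truncated or altered in transfer.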
I recently updated to the most recent versions of R and RStudio, and suddenly chart.TimeSeries from the PerformanceAnalytics package is not working inside a loop.
For example, if I highlight the code below in RStudio and run it, it executes without errors (which you can confirm by checking that i is 3 after running), but no plots are produced:
library(PerformanceAnalytics)
library(xts)
ts1 <- xts(1:12, order.by = as.Date("2018-05-01") + (-11:0))
i <- 0
for (i in 1:3) chart.TimeSeries(ts1)
However if I replace
for (i in 1:3) chart.TimeSeries(ts1)
with
chart.TimeSeries(ts1)
chart.TimeSeries(ts1)
chart.TimeSeries(ts1)
then 3 plots are produced as expected. Has anyone seen this before, or does anyone have an explanation for it?
Update: the same happens if I use plot.xts (which is what chart.TimeSeries uses under the hood) in place of chart.TimeSeries.
> version
_
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 3
minor 5.0
year 2018
month 04
day 23
svn rev 74626
language R
version.string R version 3.5.0 (2018-04-23)
nickname Joy in Playing
RStudio version 1.1.423, PerformanceAnalytics version 1.5.2, xts version 0.10-2.
I just ran your example and indeed, my result is the same as yours.
I changed
for (i in 1:3) chart.TimeSeries(ts1)
to
for (i in 1:3) print(PerformanceAnalytics::chart.TimeSeries(ts1))
and now all 3 charts show properly in the Plots pane in RStudio (I also use up-to-date versions).
Hope this answers your issue.
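A likely reason the print() call matters: R auto-prints the value of an expression only at top level, not inside a for loop, so a plotting function that returns an object and draws it from its print method shows nothing in a loop unless it is printed explicitly. The sketch below imitates that with a made-up class; lazy_chart, make_chart and draw_count are hypothetical names and no actual plotting takes place.

```r
draw_count <- 0

# The print method stands in for "draw the chart"
print.lazy_chart <- function(x, ...) {
  draw_count <<- draw_count + 1
  invisible(x)
}

# Returns an object instead of drawing immediately
make_chart <- function() structure(list(), class = "lazy_chart")

for (i in 1:3) make_chart()          # values are discarded, print method never runs
after_silent_loop <- draw_count      # 0

for (i in 1:3) print(make_chart())   # explicit print triggers the "drawing"
after_print_loop <- draw_count       # 3
```

The same pattern applies to other graphics that are drawn on print, such as lattice and ggplot2 objects.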
I am trying to merge two large data sets, as I need to create a final training set for my models.
head(TrainWithAppevents_rel4)
  event_id           device_id gender age group phone_brand device_model numbrand nummodel              app_id
6        6 1476664663289716480      M  19  M22-        华为       Mate 7       29      919 4348659952760821248
and
head(app_labels)
               app_id label_id
1 7324884708820028416      251
The first dataset now has unique rows, as I have removed all duplicates from it.
I want my final set to have the columns below:
event_id device_id gender age group phone_brand device_model numbrand nummodel app_id label_id
However, when I try to merge using the call below in R (RStudio session)
TrainWithLabels=merge(x=TrainWithAppevents_rel4,y=app_labels,by="app_id",all.x = TRUE)
I get the following error:
Error: cannot allocate vector of size 512.0 Mb
The error varies if I run it again, but only in the size of the vector.
The sizes of my datasets are as follows:
> dim(TrainWithAppevents_rel4)
[1] 4787796 10
> dim(app_labels)
[1] 459943 2
More information about the machine and R version I use:
> sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
I use an Intel 2.6 GHz x64-based processor with 16 GB RAM, running 64-bit Windows 10.
I have tried the following:
- Reducing the dataset by removing duplicates and unwanted columns; all rows in the first dataset are unique now
- Closing all other applications on my laptop and then running the merge; it still fails
- Executing gc() and then running the merge
I have gone through similar questions on SO for R, but none of them offered a way forward, and none were specific to merges failing on a 64-bit machine.
Can anyone please suggest a solution or a workaround?
Please assume that this is the only machine where I can execute the code, and that running this R script on AWS via Zeppelin is not possible at the moment.
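One thing that may be worth checking before retrying the merge is whether the join multiplies rows: if app_labels holds several label_id values per app_id, the result can be far larger than either input, which would fit an allocation failure. The base R sketch below estimates the row count of a left join without performing it. The helper name expected_rows and the toy data frames are hypothetical; only the column names come from the question.

```r
# Hypothetical helper: number of rows merge(x, y, by = key, all.x = TRUE) would return
expected_rows <- function(x_keys, y_keys) {
  y_levels <- unique(y_keys)
  # how many rows of y carry each distinct key
  per_key <- tabulate(match(y_keys, y_levels), nbins = length(y_levels))
  hits <- per_key[match(x_keys, y_levels)]  # NA where the key is absent from y
  hits[is.na(hits)] <- 1L                   # all.x = TRUE keeps unmatched rows once
  sum(as.numeric(hits))                     # numeric sum avoids integer overflow
}

# Toy data standing in for the real tables
x <- data.frame(app_id = c(1, 1, 2, 3), event_id = 1:4)
y <- data.frame(app_id = c(1, 1, 2), label_id = c(10, 11, 20))

est    <- expected_rows(x$app_id, y$app_id)
actual <- nrow(merge(x, y, by = "app_id", all.x = TRUE))
```

On the real data the call would be expected_rows(TrainWithAppevents_rel4$app_id, app_labels$app_id). If the estimate is many times the 4,787,796 input rows, collapsing the labels to one row per app_id before merging, or merging in chunks, would be the natural next step.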
Does the randomForest package ignore the nodesize parameter? When I predict the terminal nodes for a dataset and check the counts, I see values that are less than the nodesize. I would submit a fix for this myself but the underlying code was written in Fortran. If someone can confirm this behavior I will reach out to the package maintainer and hopefully start a fix.
> library(randomForest)
> set.seed(1)
> rf <- randomForest(mtcars[,-1], mtcars[,1], nodesize = 5)
> nodes <- attr(predict(rf, mtcars[,-1], nodes = TRUE), 'nodes')
# node counts of first tree
> table(nodes[,1])
# first row is the terminal node ID#, second row is the count
 2  6  9 10 11 14 15 16 18 19
 5  3  3  6  4  2  3  1  3  2
Adding system info:
Session info----------------------------------------------------------------
setting value
version R version 3.1.1 (2014-07-10)
system x86_64, mingw32
ui RStudio (0.98.1049)
language (EN)
collate English_United States.1252
tz America/Chicago
Packages--------------------------------------------------------------------
package * version date source
randomForest * 4.6.10 2014-07-17 CRAN (R 3.1.1)
Response from package maintainer:
That parameter behaves as the way that Leo Breiman intended. The bug
is in how the parameter was described. It’s the same as minsplit in
the rpart:::rpart.control() function:
the minimum number of observations that must exist in a node in order
for a split to be attempted.
I will change the description in the help file in the next version to
resolve this confusion.
Best, Andy
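The minsplit semantics the maintainer refers to can be seen directly in rpart. This is a sketch using rpart rather than randomForest, on the assumption that the rpart package is installed (it normally ships with R); the control values are chosen only for illustration.

```r
library(rpart)

# minsplit = 5: a node needs at least 5 observations before a split is attempted.
# minbucket = 1 and cp = 0 remove the other stopping rules for this illustration.
fit <- rpart(mpg ~ ., data = mtcars,
             control = rpart.control(minsplit = 5, minbucket = 1, cp = 0))

is_leaf        <- fit$frame$var == "<leaf>"
leaf_sizes     <- fit$frame$n[is_leaf]    # observations in each terminal node
internal_sizes <- fit$frame$n[!is_leaf]   # observations in each node that was split

sort(leaf_sizes)
min(internal_sizes)
```

Every node that was split holds at least 5 observations, while the terminal nodes are free to be smaller, which is the pattern the node counts in the question show.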
I am trying to get a .csv file from an S3 bucket.
I am using
x = getFile()
from the RAmazonS3 package and converting the result to a character string using
y=rawToChar(x).
I wish to convert y to a data table with the same structure as the CSV file (2 columns); the line separator is \n and the field separator within a line is ",".
Code:
table = rawToChar(getFile("bucket/file.csv"))
Output:
"\"id\",\"name\"\n\"12\",\"Member 12\"\n\"123\",\"Member 123\"\n\"1234\",\"Member 1234\"\n\"12345\....
I would like it to be a 2-column data.table/data.frame of the form:
id name
12 Member 12
123 Member 123
1234 Member 1234
12345 Member 12345
Any suggestions on an efficient way to perform this conversion?
Any other (more useful) suggestions on how to retrieve CSV files from an Amazon S3 bucket into a table in R are more than welcome, of course.
I am using:
platform x86_64-pc-linux-gnu
arch x86_64
os linux-gnu
system x86_64, linux-gnu
Any help in this issue would be appreciated,
Thank you
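For the conversion step, one option is the text argument of base R's read.csv(), which parses CSV content held in a character string. In this sketch the string y is a made-up stand-in for rawToChar(getFile("bucket/file.csv")), shaped like the output shown above.

```r
# Stand-in for rawToChar(getFile("bucket/file.csv")); contents are made up
y <- "\"id\",\"name\"\n\"12\",\"Member 12\"\n\"123\",\"Member 123\"\n"

# read.csv() reads from the string instead of a file when `text` is supplied
df <- read.csv(text = y, stringsAsFactors = FALSE)
str(df)
```

The result is an ordinary data.frame with columns id and name, which can be passed to data.table::as.data.table() if a data.table is preferred.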