I am currently using pdf() to save multiple plots on several pages.
I change page simply by plot.new().
Can I easily get svg() and png() to do the same? Currently only the last plot is saved in the file.
If I cannot have it in the same file, can I have them autogenerate files like: output.png, output2.png.
If you look at the help pages ?png and ?svg you will see that the default file names are "Rplot%03d.png" and "Rplot%03d.svg" respectively. The %03d part of those names means that each time a new plot is created it will automatically open a new file and that part of the file name will be replaced by an incrementing integer. So the first file will be "Rplot001.png" and the next will be "Rplot002.png" etc.
If you don't like the default file name you can create your own and still insert the portion to be replaced by an integer, such as "myplots%02d.png". The % says this is where the number part starts, the 0 is optional, but says to 0 pad the numbers (so you get 01, 02, ... rather than 1, 2, ...), this is generally preferred so that the sorting works out correctly (otherwise you may see the sorting as 1,10,11,2,3,...) and the digit (3 in the default, 2 in my example) is the number of digits, if you will create more than 1,000 plots you should up that to 4, if you know that you will not create 100 then 2 is fine (1 is fine if you know that you will produce fewer than 10). And the d is just an indicator for an integer.
Related
attach.files = c(paste("/users/joesmith/nosection_", currentDate,".csv",sep=""),
paste("/users/joesmith/withsection_", currentDate,".csv",sep=""))
Basically, if I did it like
c("nosection_051418.csv", "withsection_051418.csv")
And I did that manually it would work fine but since I'm automating this to run every day I can't do that.
I'm trying to attach files in an automated email but when I structure it like this, it doesn't work. How can I recreate this so that the character vector accepts it?
I thought your example implied the need for "parallel" inputs to the path stem, the first portion of the file name, and the date portions of those full paths. Consider this illustration of using a 2 item vector and a one item vector (produced by Sys.Date, replacing your "currentdate") to populate the %s positions in that sprintf string (suggested by #Gregor):
sprintf("/users/joesmith/%s_%s.csv", c("nosection", "withsection"), Sys.Date() )
[1] "/users/joesmith/nosection_2018-05-14.csv" "/users/joesmith/withsection_2018-05-14.csv"
I have a folder with tons of txt files from where I have to extract especific data. The problem is that the format of the file has changed once and the position of the data I need to extract has also changed. So I need to deal with files in different format.
To try to make it more clear, in column 4 I have the name of the variable and in 5 I have the value, but sometimes this is in a different row. Is there a way to find the name of the variable (in which row) and then extract its value?
Thanks in advance
EDITING
In some files I will have the data like this:
Column 1-------Column 2.
Device ID------A.
Voltage------- 500.
Current--------28
But in some point in life, there was a change in the software to add another variable and the new file iis like this:
Column 1-------Column 2.
Device ID------A.
Voltage------- 500.
Error------------5.
Current--------28
So I need to deal with these 2 types of data, extracting the same variables which are in different rows.
If these files can't be read with read.table use readLines and then find those lines that start with the keyword you need.
For example:
Sample file 1 (with the dashes included and extra line breaks):
Column 1-------Column 2.
Device ID------A.
Voltage------- 500.
Error------------5.
Current--------28
Sample file2 (with a comma as separator):
Column 1,Column 2.
Device ID,A.
Current,555
Voltage, 500.
Error,5.
For both cases do:
text = readLines(con = file("your filename here"))
curr = text[grepl("^Current", text, ignore.case = T)]
Which returns:
for file 1:
[1] "Current--------28"
for file 2:
[1] "Current,555"
Then use gsub to remove anything that is not a number.
I have a large text file with 40,000+ sections of output. Each output section is ~150 lines. I want one number from each section to put in a vector. The section I want to parse is shown as below.
Min ChiSq_Th: ith_cs ith_rk
-1 1
chisq_th chisq_th_min chisq_th_max ftmp_imv fstp_imv
0.149282D+05 0.200268D+05 0.200268D+05 0.100000D+01 0.100000D+00
I need the number below chisq_th in each sweep. I tried taking every 152nd line but not every sweep is exactly the same. I know R is not the ideal platform for this problem but it is the language I know best.
I am using the following code to read a file with the data.table library:
fread(myfile, header=FALSE, sep=",", skip=100, colClasses=c("character","numeric","NULL","numeric"))
but I get the following error:
The supplied 'sep' was not found on line 80. To read the file as a single character column set sep='\n'.
It says it did not find sep on line 80, however I set skip=100 so it should not pay attention to the first 100 lines.
UPDATE:
I tried with skip=101 and it worked but it skips the first line where the data starts
I am using version 1.9.2 of the data.table package and R version 3.02 64 bit on windows 7
We don't know the version number you're using, but I can make a guess in this case.
Try setting autostart=101.
Note the first paragraph of Details in ?fread :
Once the separator is found on line autostart, the number of columns is determined. Then the file is searched backwards from autostart until a row is found that doesn't have that number of columns. Thus, the first data row is found and any human readable banners are automatically skipped. This feature can be particularly useful for loading a set of files which may not all have consistently sized banners. Setting skip>0 overrides this feature by setting autostart=skip+1 and turning off the search upwards step.
the skip argument has :
If -1 (default) use the procedure described below starting on line autostart to find the first data row. skip>=0 means ignore autostart and take line skip+1 as the first data row (or column names according to header="auto"|TRUE|FALSE as usual). skip="string" searches for "string" in the file (e.g. a substring of the column names row) and starts on that line (inspired by read.xls in package gdata).
and the autostart argument has :
Any line number within the region of machine readable delimited text, by default 30. If the file is shorter or this line is empty (e.g. short files with trailing blank lines) then the last non empty line (with a non empty line above that) is used. This line and the lines above it are used to auto detect sep, sep2 and the number of fields. It's extremely unlikely that autostart should ever need to be changed, we hope.
In your case perhaps the human readable header is much larger than 30 rows, which is why I guess setting autostart=101 might work. No need to use skip.
One motivation is for convenience when a file contains multiple tables. By setting autostart to any row inside the table that you want to pluck out of the file, it'll find the first data row and header row for you automatically, and then read just that table. You don't have to worry about getting the exact line number at the start of data like you do with skip. fread can only read one table currently. It could feasibly return a list of tables from a single file, but that's getting a bit complicated and nobody has asked for that.
I am using the fread function in R for reading files to data.tables objects.
However, when reading the file I'd like to skip lines that start with #, is that possible?
I could not find any mention to that in the documentation.
fread can read from a piped command that filters out such lines, like this:
fread("grep -v '^#' filename")
Not currently, but it's on the list to do.
Are the # lines at the top forming a header which is more than 30 lines long?
If so, that's come up before and the solution is :
fread("filename", autostart=60)
where 60 is chosen to be inside the block of data to be read.
From ?fread :
Once the separator is found on line autostart, the number of columns
is determined. Then the file is searched backwards from autostart
until a row is found that doesn't have that number of columns. Thus,
the first data row is found and any human readable banners are
automatically skipped. This feature can be particularly useful for
loading a set of files which may not all have consistently sized
banners. Setting skip>0 overrides this feature by setting
autostart=skip+1 and turning off the search upwards step.
The default autostart=30 might just need bumping up a bit in your case.
Or maybe skip=n or skip="string" helps :
If -1 (default) use the procedure described below starting on line autostart to find the first data row. skip>=0 means ignore autostart and take line skip+1 as the first data row (or column names according to header="auto"|TRUE|FALSE as usual). skip="string" searches for "string" in the file (e.g. a substring of the column names row) and starts on that line (inspired by read.xls in package gdata).