Why won't `cat` append to a `file` connection? - r

I ran these two code blocks, expecting the same output:
cattest <- file("cattest.txt")
cat("First thing", file = cattest)
cat("Second thing", file = cattest, append = TRUE)
close(cattest)
sink("cattest_sink.txt")
cat("First thing")
cat("Second thing")
sink()
But the resulting cattest.txt contains only "Second thing", whereas the cattest_sink.txt includes what I expected, "First thingSecond thing". Why is the append argument ignored with the file connection?
I'm on 64bit R 3.0.1 on Windows, in case it matters.

Because that's what ?cat says it will do if file is not the name of a file.
append: logical. Only used if the argument 'file' is the name of file
(and not a connection or '"|cmd"'). If 'TRUE' output will be
appended to 'file'; otherwise, it will overwrite the contents
of 'file'.

One way to append text using cat is to open the file connection in append mode (open = "a"):
cattest <- file("cattest.txt")
cat("First thing", file = cattest, fill = TRUE)
close(cattest)
cattest <- file("cattest.txt", open = "a")
cat("Second thing", file = cattest)
close(cattest)
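If both pieces are written in the same session, another option is to open the connection once in write mode and keep it open: consecutive cat calls on an already open connection continue where the previous one stopped. A minimal sketch, using a temporary file (out_path and con are illustrative names, not from the question):

```r
# Sketch: open the connection once and keep it open across cat() calls.
out_path <- tempfile(fileext = ".txt")   # illustrative path, not the asker's file
con <- file(out_path, open = "w")        # opened explicitly, so cat() does not reopen it
cat("First thing", file = con)
cat("Second thing", file = con)          # written right after the first string
close(con)

# warn = FALSE because the file has no trailing newline
readLines(out_path, warn = FALSE)
```

The difference from the original code is that an unopened connection is opened and closed again by every cat call, which is why each call started the file from scratch.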

Related

Converting docx.files to pdf.files with docx2pdf

Not sure what I am doing wrong.
I want to convert multiple .docx files to .pdf files, each into a separate file.
I decided to use the doconv package with the following command:
docx_files <- list.files(pattern=paste0("Protokollnr_"))[39:73]
docx_files %>% length
lapply(1:35, function(x) {
docx2pdf(input = docx_files[[x]],
output = tempfile(fileext = ".pdf"))})
The error message does not say anything specific, only that the file cannot be converted.
Should I have specified the full file path? At the moment I only give the file name in my working directory.
The object "docx_files" contains:
c("Protokollnr_1.docx", "Protokollnr_10.docx", "Protokollnr_11.docx",
"Protokollnr_12.docx", "Protokollnr_13.docx", "Protokollnr_14.docx",
"Protokollnr_15.docx", "Protokollnr_16.docx", "Protokollnr_17.docx",
"Protokollnr_18.docx", "Protokollnr_19.docx", "Protokollnr_2.docx",
"Protokollnr_20.docx", "Protokollnr_21.docx", "Protokollnr_22.docx",
"Protokollnr_23.docx", "Protokollnr_24.docx", "Protokollnr_25.docx",
"Protokollnr_26.docx", "Protokollnr_27.docx", "Protokollnr_28.docx",
"Protokollnr_29.docx", "Protokollnr_3.docx", "Protokollnr_30.docx",
"Protokollnr_31.docx", "Protokollnr_32.docx", "Protokollnr_33.docx",
"Protokollnr_34.docx", "Protokollnr_35.docx", "Protokollnr_4.docx",
"Protokollnr_5.docx", "Protokollnr_6.docx", "Protokollnr_7.docx",
"Protokollnr_8.docx", "Protokollnr_9.docx")
The error message is:
Error in docx2pdf(input = docx_files[[x]], output = tempfile(fileext = ".pdf")) :
could not convert C:/Users/Nadine/OneDrive/Documents/Arbeit_Büro_papa/Protokolle_Sallapulka/fertige_Protokolle/Protokollnr_1.docx
Many thanks,
Nadine
I'd recommend specifying the full file path for both input and output, since the function has the following signature:
docx2pdf(input, output = gsub("\\.docx$", ".pdf", input))
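A minimal sketch of how the full input and output paths could be built in base R before calling docx2pdf. The folder docx_dir and the two empty placeholder files are hypothetical stand-ins for the real documents, and the conversion call itself is left as a comment because it needs the doconv package and its external dependencies:

```r
# Sketch: build absolute input paths and matching .pdf output paths.
docx_dir <- tempfile("protokolle_")      # hypothetical folder, for illustration only
dir.create(docx_dir)
file.create(file.path(docx_dir, c("Protokollnr_1.docx", "Protokollnr_2.docx")))

# full.names = TRUE returns paths including the folder, not just file names
docx_files <- list.files(docx_dir, pattern = "^Protokollnr_.*\\.docx$",
                         full.names = TRUE)
docx_files <- normalizePath(docx_files, winslash = "/")

# Same location and name as the input, with the extension swapped
pdf_files <- sub("\\.docx$", ".pdf", docx_files)

# Conversion step, commented out here because it needs doconv installed:
# for (i in seq_along(docx_files)) {
#   doconv::docx2pdf(input = docx_files[i], output = pdf_files[i])
# }
basename(pdf_files)
```

Writing the PDFs next to the inputs, instead of to tempfile(), also makes it easier to see which files were actually converted.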

How to append suffix to file names in write.csv() in R?

I have many data frames that I write to csv. I don't want to type the ending '_100' into each file name by hand; I would like to specify it once and have every file written with that ending.
write.csv(results_SVM, file = "results_SVM.csv")
write.csv(results_ANN, file = "results_ANN.csv")
write.csv(results_RBF, file = "results_RBF.csv")
I want every file to get the same suffix:
write.csv(results_SVM, file = "results_SVM_100.csv")
write.csv(results_ANN, file = "results_ANN_100.csv")
write.csv(results_RBF, file = "results_RBF_100.csv")
You can build the file name with paste0:
#suf <- "" # no suffix
suf <- "_100" # with _100
write.csv(results_SVM, file = paste0("results_SVM", suf, ".csv"))
write.csv(results_ANN, file = paste0("results_ANN", suf, ".csv"))
write.csv(results_RBF, file = paste0("results_RBF", suf, ".csv"))
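With many data frames, the same idea can be applied in a loop over a named list, so the suffix and the file names are each written only once. This is a sketch under assumptions: the data frames below are placeholders with made-up values, and out_dir is a hypothetical output folder.

```r
# Sketch: write every data frame in a named list to "<name><suffix>.csv".
results <- list(
  results_SVM = data.frame(value = 1),   # placeholder data, not from the question
  results_ANN = data.frame(value = 2),
  results_RBF = data.frame(value = 3)
)
suf <- "_100"
out_dir <- tempdir()                     # hypothetical output folder

# File names are derived from the list names, so they cannot get out of sync
out_files <- file.path(out_dir, paste0(names(results), suf, ".csv"))
for (i in seq_along(results)) {
  write.csv(results[[i]], file = out_files[i], row.names = FALSE)
}
basename(out_files)
```

Deriving the names from the list avoids the copy-and-paste slip where two data frames end up written to the same file.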

Unexpected '/' in environment variables

Every time I restart R, I get a message in the console:
Error: 2:24: unexpected '/'
2: Sys.setenv(BINPREF = C:/
I've tried unsetting BINPREF (Sys.unset...), setting it to an empty string, and using double backslashes (\\) when setting it, but the error persists.
This is what it is currently set to: C:\Rtools\mingw_$(WIN)\bin\
It was set with:
cat('Sys.setenv(BINPREF = "C:/Rtools/mingw_$(WIN)/bin/")',
file = file.path(Sys.getenv("HOME"), ".Rprofile"),
sep = "\n", append = TRUE)
What can I do? Is there any way I can delete BINPREF?
I fixed it with:
cat('Sys.setenv(BINPREF = "C:/Rtools/mingw_$(WIN)/bin/")',
file = file.path(Sys.getenv("HOME"), ".Rprofile"),
sep = "\n", append = FALSE)
Because append = FALSE, this overwrites the existing .Rprofile with a new one that contains only this line.
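Overwriting the whole .Rprofile also discards any other settings stored in it. A more conservative sketch, assuming the only broken lines are the ones that mention BINPREF, removes just those lines. The temporary file here is a stand-in for the real file.path(Sys.getenv("HOME"), ".Rprofile"), and the options() line is an invented example of a setting worth keeping:

```r
# Sketch: drop only the BINPREF lines from a profile file, keeping the rest.
profile <- tempfile("Rprofile_")   # stand-in for the real .Rprofile
writeLines(c("options(digits = 4)",                                  # invented setting
             "Sys.setenv(BINPREF = C:/Rtools/mingw_$(WIN)/bin/)"),  # unquoted, fails to parse
           profile)

lines <- readLines(profile)
keep <- !grepl("BINPREF", lines, fixed = TRUE)   # TRUE for lines without BINPREF
writeLines(lines[keep], profile)

readLines(profile)
```

After cleaning, a correctly quoted Sys.setenv line can be appended again if BINPREF is still needed.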

How to run a VBS script from R, while passing arguments from R to VBS

Let's say I want to run a VBS script from R, and I want to pass a value from R to that script.
For example, in a simple file called 'Msg_Script.vbs', I have the code:
Dim Msg_Text
Msg_Text = "[Insert Text Here]"
MsgBox("Hello " & Msg_Text)
How do I run this script using R, while editing the parameters and/or variables in R? In the above script for instance, how would I edit the value of the Msg_Text variable?
Another way would be to pass the value as an argument to the VBScript.
You'd write the VBS as follows:
Dim Msg_Text
Msg_Text = WScript.Arguments(0)
MsgBox("Hello " & Msg_Text)
And then you'd create a system command in R like this:
system_command <- paste("WScript",
'"Msg_Script.vbs"',
'"World"',
sep = " ")
system(command = system_command,
wait = TRUE)
This approach matches the arguments by position.
If you wanted, you could use named arguments instead. This way, your VBS would look like this:
Dim Msg_Text
Msg_Text = WScript.Arguments.Named.Item("Msg_Text")
MsgBox("Hello " & Msg_Text)
And then you'd create a system command in R like this:
system_command <- paste("WScript",
'"Msg_Script.vbs"',
'/Msg_Text:"World"',
sep = " ")
system(command = system_command,
wait = TRUE)
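If several named arguments have to be passed, the command string can be assembled from an R list. The helper build_vbs_command below is hypothetical (not part of any package); it only builds the string, using the same /name:"value" form as above, and leaves running it to system():

```r
# Sketch: build a WScript command line with named arguments from an R list.
build_vbs_command <- function(script, args) {   # hypothetical helper
  # One /name:"value" token per list element
  named <- paste0("/", names(args), ":", '"', unlist(args), '"')
  paste("WScript", paste0('"', script, '"'), paste(named, collapse = " "))
}

cmd <- build_vbs_command("Msg_Script.vbs", list(Msg_Text = "World"))
cmd

# On Windows, the command would then be run with:
# system(command = cmd, wait = TRUE)
```

Note that this sketch does not escape double quotes inside the values, so it assumes the values themselves contain none.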
Here's a somewhat-hackish solution:
Read the lines from the vbs script into R (using readLines()):
vbs_lines <- readLines(con = "Msg_Script.vbs")
Edit the lines in R by finding and replacing specific text:
updated_vbs_lines <- gsub(x = vbs_lines,
pattern = "[Insert Text Here]",
replacement = "World",
fixed = TRUE)
Create a new VBS script using the updated lines:
writeLines(text = updated_vbs_lines,
con = "Temporary VBS Script.vbs")
Run the script using a system command:
full_temp_script_path <- normalizePath("Temporary VBS Script.vbs")
system_command <- paste0("WScript ", '"', full_temp_script_path, '"')
system(command = system_command,
wait = TRUE)
Delete the new script after you've run it:
file.remove("Temporary VBS Script.vbs")

How can I cut large csv files using any R packages like ff or data.table?

I want to cut large csv files (file size more than RAM size) and use them or save each in disk for later usage. Which R package is best for doing this for large files?
I haven't tried it, but the skip and nrows parameters of read.table or read.csv are worth a try. These are from ?read.table:
skip integer: the number of lines of the data file to skip before
beginning to read data.
nrows integer: the maximum number of rows to read in. Negative and
other invalid values are ignored.
To avoid trouble at the end of the file you need some error handling, because I don't know what happens when the skip value is greater than the number of rows in your big csv.
p.s. I also don't know whether header=TRUE affects skip; you will have to check that too.
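A minimal sketch of chunked reading with nrows. Instead of skip, it keeps one connection open so that each read.csv call continues from the current position in the file. The small temporary file and chunk_rows are illustrative stand-ins for the big csv and a realistic chunk size:

```r
# Sketch: read a csv in fixed-size chunks through one open connection.
csv_path <- tempfile(fileext = ".csv")          # small stand-in for the big file
write.csv(data.frame(id = 1:10, x = (1:10) * 2), csv_path, row.names = FALSE)

chunk_rows <- 4                                 # illustrative chunk size
con <- file(csv_path, open = "r")
header <- strsplit(readLines(con, n = 1), ",")[[1]]
header <- gsub('"', "", header, fixed = TRUE)   # strip the quotes write.csv adds

chunks <- list()
repeat {
  # At end of file read.csv may error or return zero rows; treat both as "done"
  chunk <- tryCatch(
    read.csv(con, header = FALSE, nrows = chunk_rows, col.names = header),
    error = function(e) NULL
  )
  if (is.null(chunk) || nrow(chunk) == 0) break
  chunks[[length(chunks) + 1]] <- chunk         # or process / save the chunk here
  if (nrow(chunk) < chunk_rows) break           # a short chunk is the last one
}
close(con)

sapply(chunks, nrow)
```

In real use each chunk would be processed or written to disk inside the loop rather than collected in a list, otherwise the whole file ends up in memory again.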
The answer given by @berkorbay is OK, and I can confirm that header can be used with skip. However, if your file is really large it gets painfully slow, because each read after the first must skip over all previously read lines.
I had to do something similar and, after wasting quite a bit of time, I wrote a short Perl script that splits the original file into chunks you can read one after the other. It is much faster. I enclose the source here, with some parts translated so that the intent is clear:
#!/usr/bin/perl
system("cls");
print("Fragment .csv file keeping header in each chunk\n") ;
print("\nEnter input file name = ") ;
chomp($entrada = <STDIN>) ;
print("\nEnter maximum number of lines in each fragment = ") ;
chomp($nlineas = <STDIN>) ;
print("\nEnter output file name stem = ") ;
chomp($salida = <STDIN>) ;
open(IN,$entrada) || die "Cannot open input file: $!\n" ;
$cabecera = <IN> ;
$leidas = 0 ;
$fragmento = 1 ;
$fichero = $salida.$fragmento ;
open(OUT,">$fichero") || die "Cannot open output file: $!\n" ;
print OUT $cabecera ;
while(<IN>) {
# start a new fragment once the current one holds $nlineas data lines
if ($leidas >= $nlineas) {
close(OUT) ;
$fragmento++ ;
$fichero = $salida.$fragmento ;
open(OUT,">$fichero") || die "Cannot open output file: $!\n" ;
print OUT $cabecera ;
$leidas = 0;
}
$leidas++ ;
print OUT $_ ;
}
close(OUT) ;
Just save it under any name and execute it. The first line might have to be changed if you have Perl in a different place (and, if you are on Windows, you might have to invoke the script as "perl name-of-script").
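For readers who would rather stay in R, the same splitting idea can be sketched with readLines on an open connection. The helper split_csv and its argument names are hypothetical, and the sketch assumes that no field contains an embedded line break, since it splits on raw lines just like the Perl script:

```r
# Sketch: split a csv into files of at most max_lines data lines each,
# repeating the header line at the top of every fragment.
split_csv <- function(input, stem, max_lines) {   # hypothetical helper
  con <- file(input, open = "r")
  on.exit(close(con))
  header <- readLines(con, n = 1)
  part <- 0
  out_files <- character(0)
  repeat {
    lines <- readLines(con, n = max_lines)        # continues where the last read stopped
    if (length(lines) == 0) break                 # end of file reached
    part <- part + 1
    out <- paste0(stem, part)
    writeLines(c(header, lines), out)
    out_files <- c(out_files, out)
  }
  out_files
}

# Usage on a tiny stand-in file with 5 data lines
src <- tempfile(fileext = ".csv")
writeLines(c("id,x", paste(1:5, 6:10, sep = ",")), src)
parts <- split_csv(src, stem = tempfile("chunk_"), max_lines = 2)
length(parts)
```

Only max_lines lines are held in memory at a time, so this works for files larger than RAM, although it will likely be slower than the Perl version.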
You can use read.csv.ffdf from the ff package, with parameters like these, to read a big file:
library(ff)
a <- read.csv.ffdf(file="big.csv", header=TRUE, VERBOSE=TRUE, first.rows=1000000, next.rows=1000000, colClasses=NA)
Once the big file is read into an ff object, you can subset it into data frames like this:
a[1000:1000000,]
The rest of the code subsets the ff object into chunks and saves each chunk as a csv file:
totalrows <- dim(a)[1]
# Estimate bytes per row from a sample of the first 10,000 rows
row.size <- as.integer(object.size(a[1:10000, ])) / 10000
block.size <- 200000000  # target chunk size in bytes (200 MB)
# rows.block is the number of rows per chunk
rows.block <- ceiling(block.size / row.size)
# nmaps is the number of chunks minus 1, because the loop index starts at 0
nmaps <- floor(totalrows / rows.block)
for (i in 0:nmaps) {
  first <- i * rows.block + 1
  # nothing is left when totalrows is an exact multiple of rows.block
  if (first > totalrows) break
  last <- min((i + 1) * rows.block, totalrows)
  df <- a[first:last, ]
  # process df or save it
  write.csv(df, paste0("M", i + 1, ".csv"))
  # remove df to free memory before the next chunk
  rm(df)
}
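The chunk arithmetic in the loop above can be checked in isolation, without ff or a big file. The helper chunk_ranges is hypothetical; it computes the first and last row of each chunk, which makes the edge case where totalrows is an exact multiple of rows.block easy to inspect:

```r
# Sketch: first and last row index of each chunk of at most rows.block rows.
chunk_ranges <- function(totalrows, rows.block) {   # hypothetical helper
  starts <- seq(1, totalrows, by = rows.block)
  ends <- pmin(starts + rows.block - 1, totalrows)  # the last chunk may be shorter
  data.frame(start = starts, end = ends)
}

chunk_ranges(10, 4)   # three chunks, the last one shorter
chunk_ranges(8, 4)    # exact multiple: two full chunks, no empty third chunk
```

Each row of the result could then be used as a[start:end, ] in the loop.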
Alternatively, you can first load the file into MySQL using dbWriteTable and then use the read.dbi.ffdf function from the ETLUtils package to read it back into R. Consider the function below:
library(RMySQL)    # provides MySQL() and attaches DBI (dbConnect, dbWriteTable)
library(ETLUtils)  # provides read.dbi.ffdf()
read.csv.sql.ffdf <- function(file, name, overwrite = TRUE, header = TRUE, drv = MySQL(),
                              dbname = "new", username = "root", host = "localhost",
                              password = "1234") {
  conn <- dbConnect(drv, user = username, password = password, host = host, dbname = dbname)
  # Load the csv file into a MySQL table called `name`
  dbWriteTable(conn, name, file, header = header, overwrite = overwrite)
  # Drop the table again when the function exits
  on.exit(dbRemoveTable(conn, name))
  command <- paste0("select * from ", name)
  ret <- read.dbi.ffdf(command, dbConnect.args = list(drv = drv, dbname = dbname,
                                                      username = username, password = password))
  return(ret)
}
