Converting .Rd file to plain text - r

I'm trying to convert R documentation files (extension .Rd) into plain text. I am aware that RdUtils contains a tool called Rdconv, but as far as I know it can only be used from the command line. Is there a way to access Rdconv (or a similar conversion tool) from within an R session?

Try
tools::Rd2txt("path/to/file.Rd")
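To keep the converted text instead of printing it, the out argument of tools::Rd2txt can point at a file. A minimal sketch, assuming a tiny made-up Rd file written to a temporary location (the file contents and the variable names rd_file, txt_file and txt are illustrative only):

```r
# Write a tiny, hypothetical Rd file so the example is self-contained.
rd_file <- tempfile(fileext = ".Rd")
writeLines(c(
  "\\name{demo}",
  "\\alias{demo}",
  "\\title{A Demo Topic}",
  "\\description{Shows Rd to text conversion.}"
), rd_file)

# Convert it; 'out' takes a file name (the default "" prints to the console).
txt_file <- tempfile(fileext = ".txt")
tools::Rd2txt(rd_file, out = txt_file,
              options = list(underline_titles = FALSE))

# Read the plain text back as a character vector.
txt <- readLines(txt_file)
```

Setting underline_titles = FALSE turns off the backspace-based underlining of titles, which otherwise shows up as stray characters in a plain text file.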

You may always invoke a system command e.g. with the system2 function:
input <- '~/Projekty/stringi/man/stri_length.Rd'
output <- '/tmp/out.txt'
system2('R', paste('CMD Rdconv -t txt', input, '-o', output))
readLines(output)
## [1] "Count the Number of Characters"
## ...
Make sure that R is on your system's search path. If it is not, replace the first argument of system2() above with the full path to the R executable, e.g. C:\Program Files\R\3.1\bin\R.exe.
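If relying on the search path is a concern, one option is to build the path to the executable from the running session with R.home("bin"). A sketch under that assumption, using a made-up Rd file in a temporary location (r_bin, status and txt are illustrative names):

```r
# Hypothetical demo input, written to a temporary file.
input <- tempfile(fileext = ".Rd")
writeLines(c(
  "\\name{demo}",
  "\\alias{demo}",
  "\\title{A Demo Topic}",
  "\\description{Shows Rd to text conversion.}"
), input)
output <- tempfile(fileext = ".txt")

# R.home("bin") is the directory holding the executables of this R session.
r_bin <- file.path(R.home("bin"), "R")

# Pass the arguments as a vector; shQuote() guards against spaces in paths.
status <- system2(r_bin, c("CMD", "Rdconv", "-t", "txt",
                           shQuote(input), "-o", shQuote(output)))

# An exit status of 0 means the conversion succeeded.
txt <- readLines(output)
```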

Related

Using a file path as an argument to system() to execute C code

I have some C code that transforms some data into a different format. My goal is for the R user to input the file path and then run the executable (compiled from the C code). I have been having some issues with this, however: it does not seem to read the file path properly. Translator accepts one argument: the file path, in the form seen below.
My code: system("Translator C:\\Users\\user\\Documents\\data.csv")
Running this prints the error from my C code, "File not read". I ran the executable directly and it worked just fine, so the problem is not in my C code but in how I am calling it from R.
I have tried several different variations of the above code, such as
system2("Translator", args = "C:\\Users\\user\\Documents\\data.csv")
system(paste("Translator C:\\Users\\user\\Documents\\data.csv", collapse = " "))
However, these have not yielded any success. I believe the issue stems from the fact that R is not reading the path the way I want it to because of the \\. R reads directories with /, I believe. However, fopen in C interprets the directory using \. Is there a way to use \ in R, or is this an issue that should be solved in C?
Thank you.
Give this format a shot:
Basically, capture.output should pass the cat result of the normalizePath function, which is the path in 'native' Windows format, to the system2 command:
system2( command = "Translator", args = capture.output( cat(normalizePath(pathToFile)) ) )
In this case pathToFile can be kept in 'regular' R path format, e.g. "C:/Users/user/Documents/data.csv".
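Since normalizePath already returns a character string, its result can probably be passed to system2 directly. A minimal sketch, assuming a temporary file stands in for data.csv (path_to_file, native_path and args are illustrative names, and the call to the asker's Translator executable is left commented out):

```r
# A path written the usual R way; a temporary file stands in for data.csv.
path_to_file <- tempfile(fileext = ".csv")
writeLines("x,y", path_to_file)

# normalizePath() returns an absolute path as a character string; on Windows
# winslash = "\\" (the default) yields backslash separators.
native_path <- normalizePath(path_to_file, winslash = "\\", mustWork = TRUE)

# shQuote() protects paths that contain spaces.
args <- shQuote(native_path)

# Left commented out because 'Translator' is the asker's own executable:
# system2("Translator", args = args)
```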

R text encoding

The R Data Import/Export Manual says that a good way to guess the encoding of a text file is to use the "file" command-line tool (available in Rtools). How would one use this? I already have the newest version of Rtools installed. Is this something I can do from my R session? Or do I need to open up the command prompt?
In the context of the R Data Import/Export Manual, I interpret it as running file at a command prompt.
However you can invoke a system command with system() function from R. For example if I have a file called mpi.R in the current directory, I can do:
> foo <- system('file mpi.R', intern=TRUE, ignore.stdout=FALSE, ignore.stderr=TRUE, wait=TRUE)
> print(foo)
[1] "mpi.R: ASCII text"
The "command prompt" here refers to a "Terminal" window (OS X or Linux) or "Command Prompt" (Windows). From these, you have access to the command-line file utility, which as the manual states, provides a good description of the type and format of (text) files.
You can also run this straight from R, using the system() function to pass the call to file. For example, on my system, in the current working directory I have three text files:
> list.files(pattern = "*.txt")
[1] "00005802.txt" "googlebooks-eng-all-totalcounts-20120701.txt"
[3] "sentences.txt"
> system("file *.txt")
00005802.txt: Par archive data
googlebooks-eng-all-totalcounts-20120701.txt: ASCII text, with very long lines, with no line terminators
sentences.txt: ASCII English text, with very long lines
It could be that file will call something "plain ASCII" when it contains only ASCII characters, but such a file is also valid UTF-8, since UTF-8 encodes the 128 ASCII characters with the same single bytes.
Also, file is not always right -- for instance, 00005802.txt is in fact UTF-8 encoded text that I converted from a PDF using pdftotext.
Also beware that on most Windows platforms (in particular with R versions before 4.2), you cannot set your system locale to UTF-8 in R. Check the current locale with Sys.getlocale(). (To set it, use Sys.setlocale()).
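The call to file can be wrapped in a small helper that returns only the encoding guess. This is a sketch, assuming the installed file utility supports the --brief and --mime-encoding options (common implementations do, but the one on your system may not); guess_encoding is a made-up name:

```r
# Hypothetical helper: ask the 'file' utility for its encoding guess.
guess_encoding <- function(path) {
  # Return NA when no 'file' utility is on the search path.
  if (!nzchar(Sys.which("file"))) return(NA_character_)
  out <- system2("file",
                 c("--brief", "--mime-encoding", shQuote(path)),
                 stdout = TRUE)
  # First line of output, without surrounding whitespace
  # (NA if 'file' printed nothing).
  trimws(out[1])
}

# Try it on a small temporary file containing only ASCII characters.
tmp <- tempfile(fileext = ".txt")
writeLines("plain ascii text", tmp)
enc <- guess_encoding(tmp)
```

As noted above, the result is a guess and should be checked against what you know about the file.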

Pass R object name as argument in shell

I'm having a little trouble here using the shell command in R. I have a Java JAR file that takes as input a file containing a character vector (1 tweet per line). I'm calling it from the shell function:
shell("java -Xmx500m -jar C:/Users/User/Documents/R/java/ark-tweet-nlp-0.3.2/ark-tweet-nlp-0.3.2.jar --input-format text C:/Users/User/Documents/R/java/ark-tweet-nlp-0.3.2/examples/test.txt",intern=T)
Rather than pull the character vector from a text file external to the R environment, I want to be able to pass a vector that I have preprocessed within R. For example, if the file "test.txt" is imported into R as a character vector called test, I thought I could do this:
shell(paste("java -Xmx500m -jar C:/Users/User/Documents/R/java/ark-tweet-nlp-0.3.2/ark-tweet-nlp-0.3.2.jar --input-format text",test,sep=" "),intern=T)
But the jar file that is being called needs to actually read the file name, not the file contents. My workaround is to write the preprocessed file to my drive and then reimport using the shell script, but that is clunky and will mess up later processing I plan on doing.
Use the system command set to create an environment variable, then read it from java. The shared location will be the environment variable table.
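A different approach from the environment-variable suggestion above is to keep the file-based workaround but hide it behind tempfile(), so nothing has to be managed by hand. A sketch, assuming the jar only needs a readable file path (the two sample tweets and the names tmp, jar and args are illustrative; the java call is left commented out):

```r
# Hypothetical preprocessed character vector, one tweet per element.
test <- c("first tweet", "second tweet")

# Write it to a temporary file that R removes when the session ends.
tmp <- tempfile(fileext = ".txt")
writeLines(test, tmp)

# Build the argument vector with the temporary file in place of test.txt.
jar <- "C:/Users/User/Documents/R/java/ark-tweet-nlp-0.3.2/ark-tweet-nlp-0.3.2.jar"
args <- c("-Xmx500m", "-jar", shQuote(jar),
          "--input-format", "text", shQuote(tmp))

# Left commented out because it needs java and the jar to be installed:
# tagged <- system2("java", args, stdout = TRUE)
```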

How to provide file input from command line arguments in R

I want my R script to accept data from a .csv file. Is there a way to do this from the command prompt?
Just like if I write Rscript myscript.R 20, it passes a value 20 as input. I want to know if specifying the absolute address of the csv file will allow my script to use the data inside the csv file. If not what do I have to do to achieve what I want?
Have a look at ?commandArgs. Minimal example:
#!/usr/bin/Rscript
print(commandArgs(trailingOnly=TRUE))
Run it:
./myscript.R yourcsvfile.csv
[1] "yourcsvfile.csv"
Maybe you will be interested in the getopt package, too.
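To go from the file name to the data, the first trailing argument can be handed to read.csv. A minimal sketch, where read_csv_from_args is a made-up helper name and the script is assumed to be called as Rscript myscript.R yourcsvfile.csv:

```r
#!/usr/bin/env Rscript

# Hypothetical helper: read the CSV named by the first command-line argument.
read_csv_from_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) < 1L) stop("usage: myscript.R <file.csv>")
  read.csv(args[1])
}

# When run via Rscript with a file argument, load the data and summarise it.
if (!interactive() && length(commandArgs(trailingOnly = TRUE)) >= 1L) {
  dat <- read_csv_from_args()
  print(summary(dat))
}
```

Both absolute paths and paths relative to the current working directory should work, since read.csv resolves them like any other file path.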

Print plain text of help file to console

I'd like to be able to write the contents of a help file in R to a file from within R.
The following works from the command-line:
R --slave -e 'library(MASS); help(survey)' > survey.txt
This command writes the help file for the survey data file
--slave hides both the initial prompt and commands entered from the resulting output
-e '...' sends the command to R
> survey.txt writes the output of R to the file survey.txt
However, this does not seem to work:
library(MASS)
sink("survey.txt")
help(survey)
sink()
How can I save the contents of a help file to a file from within R?
Looks like the two functions you would need are tools:::Rd2txt and utils:::.getHelpFile. This prints the help file to the console, but you may need to fiddle with the arguments to get it to write to a file in the way you want.
For example:
hs <- help(survey)
tools:::Rd2txt(utils:::.getHelpFile(as.character(hs)))
Since utils:::.getHelpFile isn't currently exported (Rd2txt is also available as tools::Rd2txt), I would not recommend you rely on it for any production code. It would be better to use it as a guide to create your own stable implementation.
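To write to a file rather than the console, the out argument of Rd2txt can be set to a file name. A sketch along those lines, where help_to_text is a made-up wrapper that still depends on the unexported .getHelpFile:

```r
# Hypothetical wrapper: write the help page for 'topic' to 'file' as text.
help_to_text <- function(topic, file) {
  # help() returns the path(s) of matching help files; take the first match.
  paths <- as.character(utils::help(topic))
  if (length(paths) == 0L) stop("no help found for topic: ", topic)
  # .getHelpFile is an unexported helper and may change between R versions.
  rd <- utils:::.getHelpFile(paths[1])
  # underline_titles = FALSE avoids backspace-based underlining in the file.
  tools::Rd2txt(rd, out = file, options = list(underline_titles = FALSE))
  invisible(file)
}

# Example with a topic from base R, written to a temporary file.
out_file <- help_to_text("mean", tempfile(fileext = ".txt"))
```

For the question's case, calling library(MASS) followed by help_to_text("survey", "survey.txt") should give the equivalent result.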
While Joshua's instructions work perfectly, I stumbled upon another strategy for saving an R help file, so I thought I'd share it. It works on my computer (Ubuntu), where less is the R pager. It essentially just involves saving the file from within less.
help(survey)
Then follow these instructions to save the less buffer to a file,
i.e., type g|$tee survey.txt
g goes to the top of the less buffer if you aren't already there
| pipes the text between the current position and $ (which indicates the end of the buffer) to the shell command tee, which allows standard out to be sent to a file
