how to capture the full previous command run within R?


I've looked at history, savehistory, and sys.call(-1), but nothing appears to capture the full previously executed command when that command spans multiple lines. There's some r-help discussion on the topic, but I couldn't find a direct answer to my question. I just want the entire previously evaluated command captured in a character string. Is there a smart way to do this?
Edit: the purpose of this is for an R package, convey, that depends on another package, survey, and needs a few additional configuration commands run before use. If it looks like the user ran the survey::svydesign function immediately before the convey::convey_prep function, then there is no need to print the warning. But if the user ran survey::svydesign and then edited the svydesign object before running convey::convey_prep, that could cause a mistaken calculation. So if the command prior to convey::convey_prep() did not include the svydesign() function, I just want to print a warning. Otherwise, it's safe to assume that the two functions were used appropriately (one immediately after the other).
I need this to work in both scripts and interactive mode. Thanks.
# succeeds
c( 1 , 2 , 3 , 4 , 5 )
hist_tf <- tempfile()
savehistory( hist_tf )
hist_lines <- readLines( hist_tf )
# this output is what i want
hist_lines[ length( hist_lines ) - 2 ]
# fails
c( 1 , 2 , 3 ,
4 , 5 )
hist_tf <- tempfile()
savehistory( hist_tf )
hist_lines <- readLines( hist_tf )
# this output fails,
# because it does not have the `c( 1 , 2 , 3 ,`
hist_lines[ length( hist_lines ) - 2 ]
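One possible workaround for the interactive case, offered as a heuristic sketch rather than a definitive answer: walk backwards through the history lines and keep prepending lines until the accumulated text parses as complete R code. The helper name last_complete_command is hypothetical, and the history vector below is hard-coded to stand in for the result of readLines( hist_tf ). The approach does not cover scripts, and it can stop too early when the tail of a multi-line command happens to be valid code on its own.

```r
# Hypothetical helper (not part of base R): starting at line `end`, prepend
# earlier lines one at a time until the accumulated text parses without error.
last_complete_command <- function(lines, end = length(lines)) {
  if (end < 1) return(NA_character_)
  for (start in seq(end, 1)) {
    candidate <- paste(lines[start:end], collapse = "\n")
    parsed <- tryCatch(parse(text = candidate), error = function(e) NULL)
    if (!is.null(parsed)) return(candidate)
  }
  NA_character_
}

# Stand-in for readLines( hist_tf ): a multi-line command preceded by another one.
hist_lines <- c("x <- 1", "c( 1 , 2 , 3 ,", "4 , 5 )")
previous_command <- last_complete_command(hist_lines)
cat(previous_command, "\n")
```

In the question's setting, `end` would point at the line just before the savehistory-related lines, for example length( hist_lines ) - 2.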

Related

Accessing R variable in loop

I am trying to access the R variables in a loop in the following way
bes2 = data.frame("id"=c(1,2), "generalElectionVoteW1"=c("Labour","Bla"),
"generalElectionVoteW2"=c("x","t"))
general_names <- c("generalElectionVoteW1", "generalElectionVoteW2")
labour_w = bes2[bes2$general_names[1] == "Labour",]
Which will simply result in an empty vector.
general_names is simply used to keep generalElectionVoteW1, ...W2 and many more saved for easier access in a loop.
However if I access them manually like labour_w = bes2[bes2$generalElectionVoteW1 == "Labour",] it works as desired. Where is my mistake?
bes2:
id generalElectionVoteW1 generalElectionVoteW2
1 1 Labour x
2 2 Bla t
general_names:
"generalElectionVoteW1" "generalElectionVoteW2"
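A likely explanation, sketched under the assumption that the data looks as shown above: the `$` operator does not evaluate its right-hand side, so bes2$general_names looks for a column literally named "general_names", finds none, and returns NULL. Indexing with `[[ ]]` uses the value stored in the variable instead. The name labour_by_wave is made up for this illustration.

```r
bes2 <- data.frame(id = c(1, 2),
                   generalElectionVoteW1 = c("Labour", "Bla"),
                   generalElectionVoteW2 = c("x", "t"))
general_names <- c("generalElectionVoteW1", "generalElectionVoteW2")

# `$` takes the name literally, so this is NULL (no such column).
is.null(bes2$general_names)

# `[[ ]]` evaluates general_names[1] and looks up the column by that name.
labour_w <- bes2[bes2[[general_names[1]]] == "Labour", ]

# The same idea applied to every column listed in general_names.
labour_by_wave <- lapply(general_names,
                         function(col) bes2[bes2[[col]] == "Labour", ])
names(labour_by_wave) <- general_names
```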

How do I avoid a for loop in R that takes too much processing time?

I have a dataset where I need to tokenize the words and find the frequency of each word. I can achieve this with a for loop in R.
InputData <- To_Find_Categories
ShtDesc_Token_all <- ""
ShtDesc_Token <- ""
for(i_ID in 1:nrow(InputData))
#for(i_ID in 1:20)
{
  ShtDesc_Token <- regmatches(InputData$short_description, gregexpr("((?![0-9]+)[A-Za-z0-9]+)",
                              InputData$short_description, perl = TRUE))[[i_ID]]
  ShtDesc_Token_all <- append(ShtDesc_Token_all, ShtDesc_Token)
}
X<- sort(table(unlist(ShtDesc_Token_all)))
write.csv(X, "temp.csv", row.names=FALSE)
But it takes a lot of processing time. I want to avoid the for loop; how can I do this?
The data is in .csv format. Here are some sample records:
data.table::fread("number,parent , short_description
GECTASK0011264, GECHG0036340 , Restore Request
GECTASK0011265, GECHG0036340 , Restore Request
GECTASK0011748, GECHG0038670, lkj
GECTASK0011797 , GECHG0034985 , vm down-grade
GECTASK0011798, GECHG0034985 , vm down-grade
GECTASK0012252 , GECHG0040437 , remove server from load
GECTASK0012253 , GECHG0040437 , remove server from load
GECTASK0012328 , GECHG0034983 , vm down-grade
GECTASK0012329 , GECHG0034983 , vm down-grade")

Try this. You do not need a for loop in this case.
input <- data.table::fread("number,parent , short_description
GECTASK0011264, GECHG0036340 , Restore Request
GECTASK0011265, GECHG0036340 , Restore Request
GECTASK0011748, GECHG0038670, lkj
GECTASK0011797 , GECHG0034985 , vm down-grade
GECTASK0011798, GECHG0034985 , vm down-grade
GECTASK0012252 , GECHG0040437 , remove server from load
GECTASK0012253 , GECHG0040437 , remove server from load
GECTASK0012328 , GECHG0034983 , vm down-grade
GECTASK0012329 , GECHG0034983 , vm down-grade")
tmp <- paste(input$short_description,collapse = " ")
tmp.splt <- stringr::str_split(tmp, pattern= " ")[[1]]
table(tmp.splt)
#> tmp.splt
#> down-grade       from        lkj       load     remove    Request
#>          4          2          1          2          2          2
#>    Restore     server         vm
#>          2          2          4
Created on 2018-08-10 by the reprex package (v0.2.0.9000).
Or
Use this one-liner (from @Onyambu's comment):
sort(table(unlist(strsplit(InputData$short_description,"\\W"))))
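If the original tokenizing pattern should be kept, the loop can also be dropped while reusing the same regmatches/gregexpr call, since both functions already work on the whole column at once. The following is a minimal sketch on a small made-up vector standing in for InputData$short_description; the names pattern, tokens, and freq are illustrative.

```r
# Stand-in for InputData$short_description.
short_description <- c("Restore Request", "Restore Request", "lkj",
                       "vm down-grade", "vm down-grade")

# Same pattern as in the question's loop.
pattern <- "((?![0-9]+)[A-Za-z0-9]+)"

# gregexpr() and regmatches() are vectorized: one call handles every row,
# and unlist() flattens the per-row token lists into a single vector.
tokens <- unlist(regmatches(short_description,
                            gregexpr(pattern, short_description, perl = TRUE)))
freq <- sort(table(tokens))
print(freq)
```

The loop in the question recomputes the matches for the entire column on every iteration and then keeps a single element, which is probably where most of the time goes.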

Missing `parse` information inside vignette build

Goal
The goal is to create a package that parses R scripts and lists functions (from the package - like mvbutils- but also imports).
Function
The main function relies on parsing R script with
d<-getParseData(x = parse(text = deparse(x)))
Reproducible code
For example in an interactive R session the output of
x<-test<-function(x){x+1}
d<-getParseData(x = parse(text = deparse(x)))
Has for first few lines:
   line1 col1 line2 col2 id parent          token terminal     text
23     1    1     4    1 23      0           expr    FALSE
1      1    1     1    8  1     23       FUNCTION     TRUE function
2      1   10     1   10  2     23            '('     TRUE        (
3      1   11     1   11  3     23 SYMBOL_FORMALS     TRUE        x
4      1   12     1   12  4     23            ')'     TRUE        )
Error
When building a vignette with knitr that contains this code, either with Knit HTML from RStudio or with devtools::build_vignettes, the output of the previous chunk of code is NULL. On the other hand, using "knitr::knit" inside an R session gives the correct output.
Questions:
Is there a reason for the parser to behave differently inside the knit function/environment, and is there a way to bypass this?
Update
Changing code to:
x<-test<-function(x){x+1}
d<-getParseData(x = parse(text = deparse(x),keep.source = TRUE))
Fixes the issue, but this does not answer the question of why the same function behaves differently.

From the help page ?options:
keep.source:
When TRUE, the source code for functions (newly defined or loaded) is stored internally allowing comments to be kept in the right places. Retrieve the source by printing or using deparse(fn, control = "useSource").
The default is interactive(), i.e., TRUE for interactive use.
When building the vignette, you are running a non-interactive R session, so the source code is discarded in parse().
parse(file = "", n = NULL, text = NULL, prompt = "?",
keep.source = getOption("keep.source"), srcfile,
encoding = "unknown")
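The effect can be reproduced in any session by setting keep.source explicitly instead of relying on the option. A minimal sketch (the names f, without_src, and with_src are illustrative):

```r
f <- function(x) { x + 1 }

# Without source references there is no parse data to return, so the result is NULL.
without_src <- utils::getParseData(parse(text = deparse(f), keep.source = FALSE))

# With keep.source = TRUE the parse data is available as a data frame.
with_src <- utils::getParseData(parse(text = deparse(f), keep.source = TRUE))

is.null(without_src)
head(with_src)
```

Passing keep.source = TRUE explicitly, as in the update to the question, makes the call behave the same way in interactive and non-interactive sessions.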

How to get the queue number from CONDOR into your R job

I think I have a simple problem because I was looking up and down the internet and couldn't find someone else asking this question:
My university has a Condor set-up. I want to run several repetitions of the same code (e.g. 100 times). My R code has a routine to store the results in a file, i.e.:
write.csv(res, file=paste(paste(paste(format(Sys.time(), '%y%m%d'),'res', queue, sep="_"), sep='/'),'.csv',sep='',collapse=''))
res are my results (a data.frame), I indicate that this file contains the results with 'res' and finally I want to add the queue number of this calculation (otherwise files would be replaced, wouldn't they?). It should look like: 140109_res_1.csv, 140109_res_2.csv, ...
My submit file to condor looks like this:
universe = vanilla
executable = /usr/bin/R
arguments = --vanilla
log = testR.log
error = testR.err
input = run_condor.r
output = testR$(Process).txt
requirements = (opsys == "LINUX") && (arch == "X86_64") && (HAS_R_2_13 =?= True)
request_memory = 1000
should_transfer_files = YES
transfer_executable = FALSE
when_to_transfer_output = ON_EXIT
queue 3
How do I get the 'queue' number into my R code? I tried a simple example with
print(queue)
print(Queue)
But there is no object found called queue or Queue. Any suggestions?
Best wishes,
Marco

Okay, I solved the problem. This is how it goes:
I had to change my submit file. I changed the arguments line to:
arguments = --vanilla --args $(Process)
Now the process number is forwarded to the R code, where you retrieve it with the following line. The value is stored as a character string, so you should convert it to a numeric value (also check whether a number like 10 is passed on as '1' and '0', in which case you should collapse the values first).
run <- commandArgs(TRUE)
Here is an example of the code I let run.
> run <- commandArgs(TRUE)
> run
[1] "0"
> class(run)
[1] "character"
> try(as.numeric(run))
[1] 0
> try(run <- as.numeric(paste(run, collapse='')) )
> try(print(run))
[1] 0
> try(write(run, paste(run,'csv', sep='.')))
You can also find information on how to pass variables/arguments to your code here: http://research.cs.wisc.edu/htcondor/manual/v7.6/condor_submit.html
I hope this helps anyone.
Cheers and thanks for all other commenters!
Marco
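Putting the two pieces together, the process number can be turned into the file name asked for in the question. This is a sketch only: the helper names parse_run_number and result_file are hypothetical, and it assumes the process number arrives as the first trailing argument, as in the modified submit file above.

```r
# Hypothetical helper: read the process number from the trailing arguments.
parse_run_number <- function(args = commandArgs(trailingOnly = TRUE)) {
  run <- suppressWarnings(as.integer(args[1]))
  if (is.na(run)) stop("expected the Condor process number as first argument")
  run
}

# Hypothetical helper: build names such as 140109_res_1.csv.
result_file <- function(run, date = Sys.Date()) {
  paste0(format(date, "%y%m%d"), "_res_", run, ".csv")
}

# Example with explicit values instead of a real Condor job.
run <- parse_run_number("10")
result_file(run, as.Date("2014-01-09"))
```

Inside the job itself, this would be used as write.csv(res, file = result_file(parse_run_number())).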

correct parameters to download file using Amazon s3 API GET requests

I would like to be able to download a .csv file from my Amazon S3 bucket using R.
I have started using the API that is documented here http://docs.amazonwebservices.com/AmazonS3/latest/API/RESTObjectGET.html
I am using the package httr to create the GET request, I just need to work out what the correct parameters are to be able to download the relevant file.
I have set the response-content-type to text/csv as I know it's a .csv file I hope to download, but the response I get is as follows:
Response [https://s3-zone.amazonaws.com/bucket.name/file.name.csv?response-content-type=text%2Fcsv]
Status: 200
Content-type: text/csv
Date and Time,Open,High,Low,Close,Volume
2007/01/01 22:51:00,5683.00,5683.00,5673.00,5673.00,64
2007/01/01 22:52:00,5675.00,5676.00,5674.00,5674.00,17
2007/01/01 22:53:00,5674.00,5674.00,5673.00,5674.00,42
2007/01/01 22:54:00,5675.00,5676.00,5674.00,5676.00,36
2007/01/01 22:55:00,5675.00,5676.00,5675.00,5676.00,18
2007/01/01 22:56:00,5676.00,5677.00,5674.00,5677.00,64
2007/01/01 22:57:00,5678.00,5678.00,5677.00,5677.00,45
2007/01/01 22:58:00,5679.00,5680.00,5678.00,5680.00,30
.../01/01 22:59:00,5679.00,5679.00,5677.00,5678.00,19
No file is downloaded, and the data seems to be in the response itself. I can extract the character string from the response, which represents the data, and I guess with some effort it can be converted into a data.frame as originally desired. But is there a better way of downloading the data straight from the GET command and then using read.csv to read it? I think it is a parameter issue; I'm just not sure what parameters need to be set for the file to be downloaded.
If people suggest converting the string, this is the structure of the string I have. What commands would I need to convert it into a data.frame?
chr "Date and Time,Open,High,Low,Close,Volume\r\n2007/01/01 22:51:00,5683.00,5683.00,5673.00,5673.00,64\r\n2007/01/01 22:52:00,5675."| __truncated__
Thanks
HLM

The answer to your second question:
> chr <- "Date and Time,Open,High,Low,Close,Volume\r\n2007/01/01 22:51:00,5683.00,5683.00,5673.00,5673.00,64\r\n"
> read.csv(text=chr)
Date.and.Time Open High Low Close Volume
1 2007/01/01 22:51:00 5683 5683 5673 5673 64
If you want extra speed for the read.csv, try this:
chr <- "Date and Time,Open,High,Low,Close,Volume\r\n2007/01/01 22:51:00,5683.00,5683.00,5673.00,5673.00,64\r\n"
read.csv(text=chr, colClasses=c("POSIXct", rep("numeric", 5) ) )
Assuming the URL is set up properly (and we have nothing to test this on yet) I'm wondering if you may want to look at the value for GET( ...)$content
Perhaps:
infile <- read.csv(text=GET(...)$content, colClasses=c("POSIXct", rep("numeric", 5) ) )
Edit:
That was not correct because the data comes across as "raw" format. One needs to convert from raw before it will become encoded as text. I did a quick search of Nabble (it must be good for something after all) to find a csv file that was residing on the Web. This is what finally worked:
read.csv(text=rawToChar(
GET(
"http://nseindia.com/content/equities/scripvol/datafiles/16-11-2012-TO-16-11-2012ACCEQN.csv"
)[["content"]] ) )
Symbol Series Date Prev.Close Open.Price High.Price Low.Price Last.Price Close.Price
1 ACC EQ 16-Nov-2012 1404.4 1410.95 1410.95 1369.45 1374.95 1378.1
Average.Price Total.Traded.Quantity Turnover.in.Lacs Deliverable.Qty X..Dly.Qt.to.Traded.Qty
1 1393.62 132921 1852.41 56899 42.81
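The raw-to-text step can be tried without a network connection. In the sketch below, charToRaw() builds a raw vector that stands in for the "content" element of a GET() response; the names body_raw and prices are illustrative, and the sample row is taken from the question.

```r
csv_text <- "Date and Time,Open,High,Low,Close,Volume\r\n2007/01/01 22:51:00,5683.00,5683.00,5673.00,5673.00,64\r\n"

# Stand-in for GET(...)[["content"]], which holds the response body as raw bytes.
body_raw <- charToRaw(csv_text)

# Convert the bytes back to a character string, then parse it as CSV.
prices <- read.csv(text = rawToChar(body_raw))
str(prices)
```

httr's content() function with as = "text" performs a similar conversion on a real response object, so it may be a convenient alternative to calling rawToChar() directly.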

Here's one way:
library(taRifx) # for stack.list
test <- "Date and Time,Open,High,Low,Close,Volume\r\n2007/01/01 22:51:00,5683.00,5683.00,5673.00,5673.00,64\r\n2007/01/01 22:51:00,5683.00,5683.00,5673.00,5673.00,64\r\n"
stack( sapply( strsplit( test, "\\n" )[[1]], strsplit, split="," ) )
[,1] [,2] [,3] [,4] [,5] [,6]
ret "Date and Time" "Open" "High" "Low" "Close" "Volume\r"
new "2007/01/01 22:51:00" "5683.00" "5683.00" "5673.00" "5673.00" "64\r"
new "2007/01/01 22:51:00" "5683.00" "5683.00" "5673.00" "5673.00" "64\r"
Now convert to a data.frame:
testdat <- stack( sapply( strsplit( test, "\\n" )[[1]], strsplit, split="," ) )
rownames(testdat) <- seq(nrow(testdat)) # Because duplicate rownames aren't allowed in data.frames
colnames(testdat) <- testdat[1,]
testdat <- testdat[-1,]
as.data.frame(testdat)
Date and Time Open High Low Close Volume\r
2 2007/01/01 22:51:00 5683.00 5683.00 5673.00 5673.00 64\r
3 2007/01/01 22:51:00 5683.00 5683.00 5673.00 5673.00 64\r
