R: Using source and a loop cycle to run a second script

I am using the source command in one script (script number 1) to run another saved script file (script number 2).
My idea is to use a loop that continually re-reads script number 2, which is modified continually; the loop never stops. But lamentably this doesn't work.
Script number 1 always reads the original script number 2, and the result doesn't change when I change script number 2. I am using Sublime Text to change the script without stopping the loop cycle.
Example script number 1:
repeat {
  source("C:/Users/myPC/Desktop/script.R")
  Sys.sleep(10)
}
Example script number 2 (saved as script.R on my desktop):
repeat {
  print('Checking files')
  Sys.sleep(time=10)
}
This runs OK. But during the loop cycle I make changes to script number 2 (and save the file):
Script number 2 modified:
repeat {
  print('NOW RE-Checking files')
  Sys.sleep(time=10)
}
The result is always the following; the modified script number 2 is never read.
[1] "Checking files"
[1] "Checking files"
[1] "Checking files"
[1] "Checking files"
[1] "Checking files"
[1] "Checking files"

Not knowing what your scripts are really doing, this is a little guesswork, but I'll work with the two things I do know from this:
Checking to see if a file has been changed and then reloading it will work 95-99% or more of the time. Unfortunately, the risk is that whatever is writing the file will still be writing when your script-1 tries to source it: writing contents to a file is never atomic, so you cannot guarantee a complete read without some form of file lock (not supported on all filesystems) or other coordination mechanism.
To remedy this, I suggest that whatever mechanism (sight unseen) is modifying script-2 should write to a temp file and then destructively rename it to script2.R. While writing to a file is not atomic, renaming/moving a file is atomic. With this, when script1.R notices that the file has been updated, it will get to read the whole thing unabated.
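A minimal sketch of that write-then-rename pattern, assuming the writer is also an R process (the file contents here are just an illustration). The rename is atomic on POSIX filesystems when both paths are on the same filesystem, which is why the temp file is created alongside the target; on Windows the overwrite behaviour of a rename is less strongly guaranteed:
tmp <- tempfile(tmpdir = dirname("C:/Users/myPC/Desktop/script.R"))
writeLines("print('NOW RE-Checking files')", tmp)   # write the new version in full
file.rename(tmp, "C:/Users/myPC/Desktop/script.R")  # then swap it in atomically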
I suggest that instead of running code in script2.R, you should define a function that is called (repeatedly) by script-1. That way, script-1 retains control and can check for updates to script2.R and, as necessary, reload it. The function you write in script-2 needs to do one thing and then yield control back to the caller; that is, no repeat or while loops unless they are very well contained/limited and will not take longer to exit than you expect script2.R to be updated. (Recognize that script-2's while/repeat loop will not be interrupted, so plan accordingly.)
(script2.R)
work <- function() {
  print("Checking files")
  Sys.sleep(10)
}
(script1.R)
prevmt <- NA
repeat {
  mt <- file.info("script2.R")$mtime
  if (is.na(prevmt) || prevmt < mt) {
    source("script2.R")
    prevmt <- mt  # remember this version's timestamp so we only re-source on change
  }
  work()
  Sys.sleep(10)
}
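With this split, the only infinite loop lives in script1.R, so a saved change to script2.R is picked up within one ten-second cycle, and the prevmt bookkeeping keeps source() from re-running when nothing has changed.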

Related

If statement for directory starting with specific character

I have a script in which I call R and, depending on the directory I specify, I want it to carry out a different process. One directory starts with L and the other with S. I have numerous directories that start with either L or S, and they all end differently.
I specify the directory in bash and run a script like so:
./script L_dir
or
./script S_dir
So within my R script I have it set up as such:
args <- commandArgs(TRUE)
img_dir <- args[1]
if (img_dir == "^L*") {
  do_process_1
} else {
  do_process_2
}
Everything works fine except that no matter what directory I specify, the process called will always be do_process_2.
I have looked at this question and tried to adapt it but can't get it to work.
After changing my code to
if (grepl("^LM*", img_dir)) {
  do_process_1
} else {
  do_process_2
}
it worked. Be careful if you change it to the above and it still carries out do_process_2. In a regular expression, * means "zero or more of the preceding character", so my original "^L*" matched every string, including STUVLJH (zero L characters at the start still count as a match). "^LM*" works because the L itself is now mandatory: in my case dir_L = LMNOP matches and dir_S = STUVLJH does not.
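A minimal sketch of the difference, using hypothetical directory names like those above:
img_dirs <- c("LMNOP", "STUVLJH")
grepl("^L*", img_dirs)      # TRUE TRUE  -- "*" lets the L match zero times
grepl("^L", img_dirs)       # TRUE FALSE -- the mandatory literal discriminates
startsWith(img_dirs, "L")   # TRUE FALSE -- a regex-free alternative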

AutoIt Scripting for an External CLI Program - eac3to.exe

I am attempting to design a front-end GUI for a CLI program by the name of eac3to.exe. The problem as I see it is that this program sends all of its output to a cmd window. This is giving me no end of trouble because I need to get a lot of this output into a GUI window. This sounds easy enough, but I am beginning to wonder whether I have found one of AutoIt's limitations?
I can use the Run() function with a Windows internal command such as Dir and then get the output into a variable with the AutoIt StdoutRead() function, but I just can't get the output from an external program such as eac3to.exe - it just doesn't seem to work whatever I do! Just for testing purposes I don't even need to get the output to a GUI window: just printing it with ConsoleWrite() is good enough, as this proves that I was able to read it into a variable. So at this stage that's all I need to do - get the text (usually about 10 lines) that has been output to a cmd window by my external CLI program into a variable. Once I can do this the rest will be a lot easier. This is what I have been trying, but it never works:
Global $iPID = Run("C:\VIDEO_EDITING\eac3to\eac3to.exe", "", @SW_SHOW)
Global $ScreenOutput = StdoutRead($iPID)
ConsoleWrite($ScreenOutput & @CRLF)
After running this script all I get from the ConsoleWrite() is a blank line - not the text data that was output as a result of running eac3to.exe (running eac3to without any arguments just lists a screen of help text relating to all the command-line options), and that's what I am trying to get into a variable so that I can put it to use later in the program.
Before I suggest a solution, let me just tell you that AutoIt has one of the best help files out there. Use it.
You are missing the $STDOUT_CHILD option ("Provide a handle to the child's STDOUT stream").
Also, you can't just call Run and immediately call StdoutRead. At what point did you give the app some time to do anything and actually print something back to the console? You need to either use ProcessWaitClose and read the stream then, or read the stream in a loop. The simplest check would be to put a Sleep between the Run and the read and see what happens.
#include <AutoItConstants.au3>
Global $iPID = Run("C:\VIDEO_EDITING\eac3to\eac3to.exe", "", @SW_SHOW, $STDOUT_CHILD)
; Wait until the process has closed, using the PID returned by Run.
ProcessWaitClose($iPID)
; Read the Stdout stream of the PID returned by Run. This can also be done in a
; while loop; look at the example for StderrRead. If the process doesn't end when
; finished, you need to put the read inside a loop.
Local $ScreenOutput = StdoutRead($iPID)
ConsoleWrite($ScreenOutput & @CRLF)

How to create a new output file in R if a file with that name already exists?

I am trying to run an R script file using Windows Task Scheduler every two hours. What I am trying to do is gather some tweets through the Twitter API and run a sentiment analysis that produces two graphs and saves them in a directory. The problem is, when the script runs again it replaces the already existing files with those names in the directory.
As an example, when I used the pdf("file") function, it ran fine the first time, as no file with that name already existed in the directory. The problem is I want the R script to run every other hour, so I need some solution that creates a new file in the directory instead of replacing the existing one - just like what happens when a file is downloaded multiple times with Google Chrome.
I'd just time-stamp the file name (now() below comes from the lubridate package; base R's Sys.time(), used further down, works just as well).
> filename = paste("output-",now(),sep="")
> filename
[1] "output-2014-08-21 16:02:45"
Use any of the standard date-formatting functions to customise to taste - maybe you don't want spaces and colons in your file names (on Windows, colons are not allowed in file names at all):
> filename = paste("output-",format(Sys.time(), "%a-%b-%d-%H-%M-%S-%Y"),sep="")
> filename
[1] "output-Thu-Aug-21-16-03-30-2014"
If you want the behaviour of adding a number to the file name, then something like this:
serialNext <- function(prefix) {
  if (!file.exists(prefix)) return(prefix)
  i <- 1
  repeat {
    f <- paste(prefix, i, sep = ".")
    if (!file.exists(f)) return(f)
    i <- i + 1
  }
}
Usage. First, "foo" doesn't exist, so it returns "foo":
> serialNext("foo")
[1] "foo"
Write a file called "foo":
> cat("fnord",file="foo")
Now it returns "foo.1":
> serialNext("foo")
[1] "foo.1"
Create that, then it returns "foo.2" and so on...
> cat("fnord",file="foo.1")
> serialNext("foo")
[1] "foo.2"
This kind of thing can break if more than one process might be writing a new file, though: if both processes check at the same time, there's a window of opportunity in which neither process sees "foo.2" and both think they can create it. The same thing can happen with timestamps if two processes try to write new files at the same time.
Both these issues can be resolved by generating a random UUID and pasting that onto the filename; otherwise you need something that's atomic at the operating-system level.
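A minimal sketch of the UUID variant, assuming the uuid package is installed:
filename <- paste0("output-", format(Sys.time(), "%Y-%m-%d-%H-%M-%S"),
                   "-", uuid::UUIDgenerate())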
But for a twice-hourly job I reckon a timestamp down to minutes is probably enough.
See ?files for file manipulation functions. You can check if file exists with file.exists, and then either rename the existing file, or create a different name for the new one.
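A minimal sketch of that rename-the-old-file approach (the output.pdf name is hypothetical):
if (file.exists("output.pdf")) {
  # move the previous run's file out of the way, stamped with its modification time
  stamp <- format(file.info("output.pdf")$mtime, "%Y%m%d-%H%M%S")
  file.rename("output.pdf", paste0("output-", stamp, ".pdf"))
}
pdf("output.pdf")  # now safe to create the new file; dev.off() when done, as usual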

Implementation of simple polling of results file

For one of my dissertation's data collection modules, I have implemented a simple polling mechanism. This is needed because I make each data collection request (one of many) as an SQL query, submitted via a Web form, which is simulated by RCurl code. The server processes each request and generates a text file with results at a specific URL (RESULTS_URL in the code below). Regardless of the request, the URL and file name are the same (I cannot change that). Since processing time differs from request to request and some requests may take a significant amount of time, my R code needs to "know" when the results are ready (the file has been re-generated), so that it can retrieve them. The following is my solution for this problem.
POLL_TIME <- 5 # polling timeout in seconds
In function srdaRequestData(), before making data request:
# check and save 'last modified' date and time of the results file
# before submitting data request, to compare with the same after one
# for simple polling of results file in srdaGetData() function
beforeDate <- url.exists(RESULTS_URL, .header=TRUE)["Last-Modified"]
beforeDate <<- strptime(beforeDate, "%a, %d %b %Y %X", tz="GMT")
<making data request is here>
In function srdaGetData(), called after srdaRequestData()
# simple polling of the results file
repeat {
  if (DEBUG) message("Waiting for results ...", appendLF = FALSE)
  afterDate <- url.exists(RESULTS_URL, .header=TRUE)["Last-Modified"]
  afterDate <- strptime(afterDate, "%a, %d %b %Y %X", tz="GMT")
  delta <- difftime(afterDate, beforeDate, units = "secs")
  if (as.numeric(delta) != 0) { # file modified, results are ready
    if (DEBUG) message(" Ready!")
    break
  } else { # no results yet, wait out the timeout and check again
    if (DEBUG) message(".", appendLF = FALSE)
    Sys.sleep(POLL_TIME)
  }
}
<retrieving request's results is here>
The module's main flow/sequence of events is linear, as follows:
Read/update configuration file
Authenticate with the system
Loop through the data requests specified in the configuration file (via lapply()), where the following is performed for each request:
{
...
Make request: srdaRequestData()
...
Retrieve results: srdaGetData()
...
}
The issue with the code above is that it doesn't seem to work as expected: upon making a data request, the code should print "Waiting for results ..." and then, periodically checking the results file for modification (re-generation), print progress dots until the results are ready, at which point it prints the confirmation. The actual behavior, however, is that the code waits a long time (I intentionally made one request long-running), printing nothing, and then apparently retrieves the results and prints both "Waiting for results ..." and " Ready!" at the same time.
It seems to me that it's some kind of synchronization issue, but I can't figure out what exactly. Or, maybe it's something else and I'm somehow missing it. Your advice and help will be much appreciated!
In a comment on the question, I believe MrFlick solved the issue: the polling logic appears to be functional, but the progress messages are out of sync with what is actually happening on the system.
By default, R console output is buffered. This is by design: to speed things up and avoid the distracting flicker that may be associated with frequent messages, etc. We tend to forget this fact, particularly after we've been using R in a very interactive fashion, running various ad-hoc statements at the console (the console buffer is automatically flushed just before returning the > prompt).
It is however possible to get message() output, and console output more generally, in "real time": either explicitly flush the console after each critical output statement with the flush.console() function, or disable buffering at the level of the R GUI (right-click on the console and see the Buffered output (Ctrl+W) item; this is also available in the Misc menu).
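Applied to the question's polling loop, the fix is a one-line addition after each message() call (a sketch reusing the question's own names):
if (DEBUG) message("Waiting for results ...", appendLF = FALSE)
flush.console()  # push the buffered message out now, before the loop blocks in Sys.sleep()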
Here's a toy example of the explicit use of flush.console(). Note the use of cat() rather than message(), as the former doesn't automatically add a CR/LF to the output. message() is useful, however, because its output can be suppressed with suppressMessages() and the like. Also, as shown in the comment, you can cat the "\b" (backspace) character to make the numbers overwrite one another.
CountDown <- function() {
  for (i in 9:1) {
    cat(i)
    # alternatively to cat(i) use: message(i)
    flush.console() # <<<<<<< immediate output to console
    Sys.sleep(1)
    cat(" ") # also try cat("\b") instead ;-)
  }
  cat("... Blast-off\n")
}
The output is the following. What is of course not evident in this print-out is that it took about nine seconds overall, with one number printed every second, before the final "Blast-off"; remove the flush.console() statement and the output will come all at once, when the function terminates (unless the console is unbuffered at the level of the GUI).
CountDown()
9 8 7 6 5 4 3 2 1 ... Blast-off

How to insert text into the middle of a text file in Qt?

I'm writing a program that performs several tests on a hardware unit, and logs both the results of each test and the steps taken to perform the test. The trick is that I want the program to log these results to a text file as they become available, so that if the program crashes the results that had been obtained are not lost, and the log can help debug the crash.
For example, assume a program consisting of two tests. If the program has finished the first test and is working on the second, the log file would look like:
Results:
Test 1 Result A: Passed
Test 1 Result B: 1.5 Volts
Log:
Setting up instruments.
Beginning test 1.
[Steps in test 1]
Finished test 1.
Beginning test 2.
[whatever test 2 steps have been completed]
Once the second test has finished, the log file would look like this:
Results:
Test 1 Result A: Passed
Test 1 Result B: 1.5 Volts
Test 2 Result A: Passed
Test 2 Result B: 2.0 Volts
Log:
Setting up instruments.
Beginning test 1.
[Steps in test 1]
Finished test 1.
Beginning test 2.
[Steps in test 2]
Finished test 2.
All tests complete.
How would I go about doing this? I've been looking at the help files for QFile and QTextStream, but I'm not seeing a way to insert text in the middle of existing text. I don't want to create separate files and merge them at the end because I'd end up with separate files in the event of a crash. I also don't want to write the file from scratch every time a change is made because it seems like there should be a faster, more elegant way of doing this.
QFile::readAll() will read the entire file into a QByteArray. On the QByteArray you can then use insert() to insert text in the middle, and then write it back to the file again.
Or you could use the classic C style, which can modify files in the middle with the help of file pointers (note that this overwrites bytes in place rather than inserting new ones).
As #Roku pointed out, there is no built in way to insert data in a file with a rewrite. However if you know the size of the region, i.e., if the text you want to write has a fixed length, then you can write an empty space in the file and replace it later. Check
this discussion in overwriting part of a file.
I ended up going with the "write the file from scratch" method that I mentioned being hesitant about in my question. The benefit of this technique is that it results in a single file, even in the event of a crash since the log and the results are never placed in different files to begin with. Additionally, rewriting the file only happens when adding new results (an infrequent occurrence), whereas updating the log means simply appending text to the file as usual. I'm still a bit surprised that there isn't a way to have the OS insert text into a file for you.
Oh, and for those of you who absolutely must have this functionality as efficiently as possible, the following might be of use:
http://www.codeproject.com/Articles/17716/Insert-Text-into-Existing-Files-in-C-Without-Temp
You just cannot add more stuff in the middle of a file. I would go with two separate files, one for the results and another for the logs.
