How to keep track of the number of runs of an R script per day

I am looking to write a function in R that keeps track of the number of completed runs of an .R file within any particular day; note that the runs might happen at different times of the day. I did some research on this problem and came across this post (To show how many times user has run the script). So far I have been unable to build on the first answer's code when translating it into R (the main obstacle is replicating the try...except block). I also need to add the restriction that the count is measured only within a single day (exactly from 00:00:00 EST to 24:00:00 EST).
Can someone please offer some help on how to accomplish this goal?

Either I didn't get the problem, or it is a rather easy one: create a temporary file (use Sys.Date() to name it) and store the current run number in it; at the beginning of your .R file, read the temporary file, increment the number, and write the file back.
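For example, a minimal sketch of that idea (the file name run_count_<date>.txt is just an illustrative choice):
# Daily run counter keyed by today's date; a new file is created each day
count_file <- paste0("run_count_", Sys.Date(), ".txt")
run_count <- if (file.exists(count_file)) as.integer(readLines(count_file, n = 1)) else 0L
run_count <- run_count + 1L
writeLines(as.character(run_count), count_file)
message("Run number ", run_count, " today (", Sys.Date(), ")")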

Related

Graphs in JMeter

I need to perform a load test via JMeter, and for this I have created a test plan that uploads files for some time while a JDBC request selects the number of not-yet-processed files.
Could anyone please give me some advice on how to put the resulting values from the JDBC listener into a meaningful graph?
In the graph I would like to have the number of unprocessed files on one axis and the corresponding timestamp on the other.
Many thanks.
You can use the Sample Variables property: add the following line to the user.properties file (it lives in the "bin" folder of your JMeter installation):
sample_variables=your_variable_from_JDBC_request_with_number_of_files
This way, the next time you launch JMeter in command-line non-GUI mode, you will see an extra column in the .jtl results file with the number of not-yet-processed files for each and every Sample Result.
Going forward you can create a custom chart for the number of not-yet-processed files over time; see the Generating customs graphs over time chapter for more details. You will need to change at least the jmeter.reportgenerator.graph.custom_testGraph.property.set_Sample_Variable_Name property value to match the one you set in sample_variables, and amend the chart and axis titles according to your needs.
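As a rough sketch (the property names follow the user manual chapter referenced above; the titles, granularity and variable name are placeholders to adapt), the custom graph configuration in user.properties could look like this:
jmeter.reportgenerator.graph.custom_testGraph.classname=org.apache.jmeter.report.processor.graph.impl.CustomGraphConsumer
jmeter.reportgenerator.graph.custom_testGraph.title=Unprocessed files over time
jmeter.reportgenerator.graph.custom_testGraph.property.set_X_Axis=Elapsed time
jmeter.reportgenerator.graph.custom_testGraph.property.set_Y_Axis=Number of unprocessed files
jmeter.reportgenerator.graph.custom_testGraph.property.set_granularity=60000
jmeter.reportgenerator.graph.custom_testGraph.property.set_Sample_Variable_Name=your_variable_from_JDBC_request_with_number_of_files
jmeter.reportgenerator.graph.custom_testGraph.property.set_Content_Message=Unprocessed files: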

Finding total edit time for R files

I am trying to determine how much time I have spent on a project, which has mainly been done in .R files. I know the file.info function will extract metadata on a file for me, but since I have opened the file several times over several days, I don't know how to use that information to determine the total editing time. Is there a function that finds this information, or a way to go through the file system to find it?
Just a thought: you could maintain a log file to which you write the following from your R script: start time, stop time and R script file name.
You can add simple code to your script that does this. You would then need a separate script that analyses the log and tells you how much time was spent running the script.
For a single user this would work.
Note: this captures script execution time, not the time spent editing the files. The log would still have merit: you would have a record of when you were working on the script, under the assumption that you run your scripts frequently while developing code.
How about using an old-fashioned time sheet to record development time? Tools such as JIRA are well suited to that purpose.
For example, at the start of the script:
logFile <- file("log.txt", open = "a")  # append so earlier entries are not overwritten
writeLines(paste0("Scriptname start: ", Sys.time()), logFile)
close(logFile)
And at the end of the script:
logFile <- file("log.txt", open = "a")
writeLines(paste0("Scriptname stop: ", Sys.time()), logFile)
close(logFile)
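The separate analysis script mentioned above could then be a short sketch along these lines (it assumes log.txt only contains matched start/stop pairs written by the snippets):
log_lines <- readLines("log.txt")
times <- as.POSIXct(sub("^.*: ", "", log_lines))  # timestamps follow the last ": "
starts <- times[grepl("start", log_lines)]
stops  <- times[grepl("stop",  log_lines)]
sum(difftime(stops, starts, units = "mins"))      # total recorded run time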

Get last update time of a gsheet in R without registering

I am working on a shiny app that creates a live report based on the data in a google spreadsheet. I only want to read the whole gsheet again if there are new values in it.
For this, it would be best if I could get the last update time of the gsheet and then decide, based on that, whether I need to read the data in again.
I know that registering the gsheet gives the last update time, but registering takes more time than reading the whole table when there are only a few values in it.
[Picture of the result of the microbenchmark comparison]
Is there a way to get only the time of the last update without registering a gsheet again in R?
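One possible approach (a sketch, assuming the googledrive package; the sheet ID is a placeholder) is to fetch only the file's metadata and compare its modification timestamp before deciding whether to re-read:
library(googledrive)

meta <- drive_get(id = "your-sheet-id")               # metadata only, no cell data
last_update <- meta$drive_resource[[1]]$modifiedTime  # modification timestamp string

if (!exists("cached_update") || !identical(last_update, cached_update)) {
  # only now read the full sheet, e.g. with googlesheets4::read_sheet()
  cached_update <- last_update
}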

Why does everything in my RStudio workspace vanish every time I close it?

Every time I close and reopen RStudio, everything in the panels, including all the data frames, functions, values, etc., vanishes, and some very old objects that I deleted long ago appear. I save the workspace when I close it, but this happens every time. Importing my large dataset and regenerating everything each time takes a lot of time. What can I do?
You can save your workspace and restore it under Tools -> Options -> General.
[Screenshot of the Tools -> Options -> General settings]
In addition you can also use:
save.image(file='Session.RData')
And load it later:
load('Session.RData')
However, generally speaking, some consider it bad practice to keep/save your environment/workspace.

Submit a new script after all parallel jobs in R have completed

I have an R script that creates multiple scripts and submits them simultaneously to a computer cluster. After all of these scripts have completed and the output has been written to the respective folders, I would like to automatically launch another R script that works on those outputs.
I haven't been able to figure out whether there is a way to do this in R: the function wait is not what I want, since the scripts are submitted as different jobs and each of them completes and writes its output file at a different time, whereas I want to run the subsequent script only after all of the outputs appear.
One way I thought of is to count the files that have been created and, once the correct number of output files is there, submit the next script. However, to do this I would have to keep a script running that checks for the presence of the files every now and then, and I am not sure whether this is a good idea, since it probably takes a day or more before the first scripts complete.
Can you please help me find a solution?
Thank you very much for your help
-fra
I think you are looking at this the wrong way:
This is not an R problem at all; R just happens to be the client of your batch job.
It is an issue that the queue / batch processor on your cluster can address.
Worst case, you could just wait/sleep in a shell (or R) script until a 'final condition reached' file has been touched.
Inter-dependencies can be expressed with make too.
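A minimal sketch of that wait/sleep approach in R (the directory name, file pattern, expected count and follow-up script name are all placeholders):
output_dir <- "results"
n_expected <- 10        # number of output files the parallel jobs should produce

repeat {
  done <- length(list.files(output_dir, pattern = "\\.out$"))
  if (done >= n_expected) break
  Sys.sleep(300)        # poll every five minutes; the jobs may run for a day or more
}

source("next_step.R")   # launch the script that works on the outputs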
