Acting on changes to a log file in R as they happen - r

Suppose a log file is being written to disk with one extra line appended to it every so often (by a process I have no control over).
I would like to know a clean way to have an R program "watch" the log file, and process a new line when it is written to the log file.
Any advice would be much appreciated.

You can use file.info to get the modification date of a file: just check every so often and take action if the modification date changes. Keeping track of how many lines have already been read will enable you to use scan or read.table (via their skip argument) to read only the new lines.
You could also delete or move the log file after it is read by your program. The external program will then create a new log file, I assume. Using file.exists you can check if the file has been recreated, and read it when needed. You then add the new data to the already existing data.
I would move the log file to an archive subfolder and read the logfiles as they are created.
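The polling idea above can be sketched as follows. This is a minimal illustration in Python rather than R, and it tracks a byte offset instead of a line count; the names LogFollower, watch and handle_line are hypothetical, and the sketch assumes the log is UTF-8 text that only ever grows by appended lines.

```python
import os
import time


class LogFollower:
    """Remember how far a log file has been read and return only new lines."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # byte position of the first unread byte

    def read_new_lines(self):
        """Return the complete lines appended since the last call."""
        if not os.path.exists(self.path):
            return []
        size = os.path.getsize(self.path)
        if size < self.offset:
            # The file shrank: assume it was truncated or replaced, start over.
            self.offset = 0
        if size == self.offset:
            return []
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
        # Leave a trailing partial line (no newline yet) for the next call.
        end = data.rfind(b"\n") + 1
        self.offset += end
        return data[:end].decode("utf-8", errors="replace").splitlines()


def watch(path, handle_line, interval=1.0):
    """Poll forever, passing each new line to handle_line (a hypothetical callback)."""
    follower = LogFollower(path)
    while True:
        for line in follower.read_new_lines():
            handle_line(line)
        time.sleep(interval)
```

Calling watch("app.log", print) would print each line shortly after it is appended. The same bookkeeping should carry over to R with file.info for the change check and a skip count for the read.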

Related

Automating- Appending two text files to create 1 Excel file daily

I have two files that come in daily to a shared drive. When they are posted, they come in with the current date as part of the file name, for example dataset1_12517.txt and dataset2_12517.txt; the next day they are posted as dataset1_12617.txt and so on. They are pipe-delimited files, if that matters.
I am trying to automate a daily merge of these two files to a single excel file that will be overwritten with each merge (file name remains the same) so my tableau dashboard can read the output without having to make a new connection daily. The tricky part is the file names will change daily, but they follow a specific naming convention.
I have access to R Studio. I have not started writing code yet so looking for a place to start or a better solution.
On a Windows machine, use the copy or xcopy command-line tools. There are several variations on how to do it. The gist of it, though, is that if you supply the right switches, the source file will be appended to the destination file.
I like using xcopy for this. Supply the destination file name and then a list of source files.
This becomes a batch file and you can run it as a scheduled task or on demand.
This is roughly what it would look like. You may need to check the online docs to choose the right parameter switches.
xcopy C:\SRC\souce_2010235.txt newfile.txt /s
As you play with it, you may even try using a wildcard approach.
xcopy C:\SRC\*.txt newfile.txt /s
See Getting XCOPY to concatenate (append) for more details.
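As an alternative to a batch file, here is a minimal Python sketch of the daily merge. It rests on several assumptions that the question does not confirm: the date in the file name is read as an unpadded month, a two-digit day and a two-digit year (so 12517 is taken to mean 25 January 2017), both files start with the same header row, and the output is a CSV file rather than a real Excel workbook, since the standard library cannot write .xlsx. The function names are hypothetical.

```python
import csv
from datetime import date
from pathlib import Path


def dated_name(prefix, day):
    # Assumed convention: unpadded month, two-digit day, two-digit year,
    # so 25 January 2017 gives "dataset1_12517.txt".
    return f"{prefix}_{day.month}{day.day:02d}{day.year % 100:02d}.txt"


def merge_daily(src_dir, out_path, day, prefixes=("dataset1", "dataset2")):
    """Append the day's pipe-delimited files into one CSV, overwriting out_path."""
    src_dir = Path(src_dir)
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        header_written = False
        for prefix in prefixes:
            source = src_dir / dated_name(prefix, day)
            with open(source, newline="", encoding="utf-8") as f:
                reader = csv.reader(f, delimiter="|")
                header = next(reader, None)
                if header is None:
                    continue  # empty file, nothing to append
                if not header_written:
                    writer.writerow(header)  # keep the header only once
                    header_written = True
                writer.writerows(reader)
```

Running merge_daily(shared_drive, "merged.csv", date.today()) from a scheduled task would rewrite the same output file each day, so the dashboard connection does not need to change.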

Mapping Data Type for CSV files

Is it possible to save a mapping file which SSIS can use to decide the data type based on the column names, rather than going and updating it in the Advanced section of the 'Flat File Connection Manager Editor'? Thank you.
This is a common problem faced by every SSIS developer. Whenever you make changes in a flat file connection, you lose all data type mappings and you have to manually edit this by using the advanced editor.
But you can save your life using the following:
Practice #1
When you work with an existing connection, make sure you have the flat file at the reference location of the flat file connection with the same name. If you forget to save it or don't find it, try the second practice.
Practice #2
Follow the steps below before using the SSIS package:
Open package in XML file format.
Find the flat file connection.
Read the file name and path of flat file connection.
Get the output copy of the final output file (usually you can find where SSIS is exporting final output file).
Copy the final output file and rename with the file connection's file name and paste it to the flat file connection location.
Remove all the data from file except the column list (make sure you keep the file format as it is, e.g., CSV or Excel).
Close the XML of the SSIS package & close the package.
Reopen the SSIS package and you saved your life.
This trick works for me in all my cases.

purpose of .RDataTmp temporary file? [R]

What is the purpose of the R temporary file that is created in every directory where a workspace is saved? What data does it contain and is it safe to delete?
That file is a holding file for save.image(): R writes the workspace to it first and renames it to the file argument once the save succeeds. From help(save.image), on its safe argument:
safe - logical. If TRUE, a temporary file is used for creating the saved workspace. The temporary file is renamed to file if the save succeeds. This preserves an existing workspace file if the save fails, but at the cost of using extra disk space during the save.
So the file contains the entire workspace image and it is probably best to just leave it there in case R fails to save the workspace normally.
I'm also guessing that if you see this file, R has already failed to rename it, so you may want to look for the target file (the one named by the file argument) and check its contents before deleting the temporary file.
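The write-then-rename pattern that the help text describes can be illustrated with a short sketch. This is Python, not R's actual implementation; safe_save and the ".tmp" suffix are hypothetical names chosen for the example.

```python
import os
import pickle


def safe_save(obj, path):
    """Write obj to a temporary file, then rename it over path if the write succeeds."""
    tmp_path = path + ".tmp"  # hypothetical temp name; R picks its own
    with open(tmp_path, "wb") as f:
        pickle.dump(obj, f)
        f.flush()
        os.fsync(f.fileno())
    # Reached only when the write raised no error, so an existing file at
    # path stays intact if the save fails part-way.
    os.replace(tmp_path, path)
```

If the write fails, the temporary file is left behind and the previous save is untouched, which matches the situation described above where a leftover temporary file hints at an interrupted save.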

Usage of mmap and reloading changes to the file

I'm using mmap to load a big file with read-only access.
A cron job is expected to overwrite this file once a day with updated content.
My question is: how would my executable re-mmap the updated file to get the updated content?
Do I need to call mmap again? How would my executable know at what time the file was updated?
What's the usual recommended ways and options available with tradeoffs?
If the cron job just opens the file and overwrites the data in place, the new data should be immediately reflected in your mapped memory (with a MAP_SHARED mapping; with MAP_PRIVATE, POSIX leaves it unspecified whether later changes to the file are visible). If the cron job creates a new file, writes the data there, and then calls rename() to move the new file on top of the old, your mapping still refers to the old file, so you need to unmap it, reopen the path, and call mmap again to get the new data. Writers often use the rename approach to avoid data corruption in case of a power failure while rewriting the file.
As for how you get notified, there are several possibilities. The easiest might be to have the cron job just send a signal (e.g. SIGUSR1) to your process. You can then react to the signal and do your work. Otherwise, you could use inotify (on Linux) to monitor your file for writes.
Another option is to periodically poll the file's mtime to detect changes. Personally, I'd avoid that route though, as it seems rather hacky and inelegant.
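Whichever notification route is chosen, the remapping step itself looks much the same. Below is a minimal Python sketch using the standard mmap module; ReloadingMap and its methods are hypothetical names, and the sketch assumes a POSIX system and a non-empty file. The refresh method could be called from a signal handler, after an inotify event, or from a periodic poll.

```python
import mmap
import os


class ReloadingMap:
    """Read-only mapping that is rebuilt when the file at path is replaced or modified."""

    def __init__(self, path):
        self.path = path
        self._map = None
        self._stamp = None
        self.refresh()

    @staticmethod
    def _stamp_of(st):
        # rename() over the path changes the inode; an in-place rewrite
        # changes the modification time and possibly the size.
        return (st.st_ino, st.st_mtime_ns, st.st_size)

    def refresh(self):
        """Remap if the file changed; return True when a new mapping was created."""
        if self._stamp_of(os.stat(self.path)) == self._stamp:
            return False
        with open(self.path, "rb") as f:
            stamp = self._stamp_of(os.fstat(f.fileno()))
            new_map = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        if self._map is not None:
            self._map.close()  # drop the mapping of the old file
        self._map, self._stamp = new_map, stamp
        return True

    def data(self):
        """Return the current mapping; do not keep it across refresh() calls."""
        return self._map
```

Comparing the inode as well as the modification time lets the same check cover both the in-place overwrite and the rename case.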

Can you create a batch file in asp.net?

Can you write code to create multiple batch files in asp.net? Can you create it and also have the code write batch file commands into the batch files? What I want to do is create .bat files in a certain location on the computer (or make it create a folder to put the .bat files into) on a button click and have it write all the commands into the .bat file. For example, I have a web form the user inputs data, pushes a button and based off the .bat files it creates, it creates a new batch file every time the button is clicked. Is this possible?
Yes, it is, but it could have a lot of security consequences, so, be careful.
You create a file using methods in the File class, like File.CreateText.
http://msdn.microsoft.com/en-us/library/system.io.file.createtext.aspx
Yes, this is possible but in general, you will only be able to create these files inside your application's folder.
If you need to create the files outside your application's folder, you need to make sure that the App Pool your app runs under, has permissions to write to that folder.
Since batch files are basically executables, I'd be very careful about using them and I would never, under any circumstance, take free user input to be placed into those files. You may give the user a set of predefined "commands" to choose from to mitigate your exposure, for example.
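The "predefined commands" idea can be sketched as follows. This is an illustration in Python rather than ASP.NET; the whitelist contents, the function name and the file-name rule are all hypothetical, and a real application would also need the folder permissions discussed above.

```python
from pathlib import Path

# Hypothetical whitelist: the user picks a key and never types command text.
ALLOWED_COMMANDS = {
    "list_dir": "dir",
    "show_date": "date /t",
    "show_version": "ver",
}


def write_batch_file(folder, name, choices):
    """Create folder if needed and write a .bat file built from whitelisted choices."""
    unknown = [c for c in choices if c not in ALLOWED_COMMANDS]
    if unknown:
        raise ValueError(f"commands not allowed: {unknown}")
    if not name.isalnum():
        # Reject path separators and other special characters in the file name.
        raise ValueError("file name must be alphanumeric")
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{name}.bat"
    lines = ["@echo off"] + [ALLOWED_COMMANDS[c] for c in choices]
    # newline="" keeps the explicit CRLF line endings unchanged.
    with open(path, "w", newline="", encoding="ascii") as f:
        f.write("\r\n".join(lines) + "\r\n")
    return path
```

Because only dictionary keys reach the function, nothing the user types is ever copied into the batch file.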
