I have two files that arrive daily on a shared drive. Each file name includes the current date, for example dataset1_12517.txt and dataset2_12517.txt; the next day's files will be dataset1_12617.txt and dataset2_12617.txt, and so on. They are pipe-delimited files, if that matters.
I am trying to automate a daily merge of these two files into a single Excel file that is overwritten with each merge (the file name remains the same), so my Tableau dashboard can read the output without my having to make a new connection daily. The tricky part is that the input file names change daily, but they follow a specific naming convention.
I have access to RStudio. I have not started writing code yet, so I am looking for a place to start or a better solution.
On a Windows machine, use the copy or xcopy command lines. There are several variations on how to do it. The gist of it, though, is that if you supply the right switches, the source file will append to the destination file.
I like using xcopy for this. Supply the destination file name and then a list of source files.
This becomes a batch file and you can run it as a scheduled task or on demand.
This is roughly what it would look like. You may need to check the online docs to choose the right parameter switches.
xcopy C:\SRC\souce_2010235.txt newfile.txt /s
As you play with it, you may even try using a wildcard approach.
xcopy C:\SRC\*.txt newfile.txt /s
See Getting XCOPY to concatenate (append) for more details.
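If a scripting language is available, the same daily merge can be sketched in Python. This is only a minimal sketch under assumptions the question does not confirm: the date suffix is the month without a leading zero followed by a two-digit day and a two-digit year (so 25 January 2017 gives 12517), both files share the same columns and each starts with a header row, and a CSV output is acceptable to Tableau in place of an Excel workbook. The names date_suffix and merge_daily_files are hypothetical.

```python
import csv
from datetime import date
from pathlib import Path


def date_suffix(day: date) -> str:
    # Assumed convention: month (no leading zero), 2-digit day, 2-digit year.
    return f"{day.month}{day.day:02d}{day.year % 100:02d}"


def merge_daily_files(src_dir: Path, out_path: Path, day: date) -> int:
    """Stack dataset1 and dataset2 for `day` into one CSV; return rows written."""
    suffix = date_suffix(day)
    rows_written = 0
    # Opening with "w" overwrites the previous day's output, so the name is stable.
    with out_path.open("w", newline="") as out:
        writer = csv.writer(out)
        for file_index, prefix in enumerate(("dataset1", "dataset2")):
            src = src_dir / f"{prefix}_{suffix}.txt"
            with src.open(newline="") as f:
                reader = csv.reader(f, delimiter="|")
                for line_no, row in enumerate(reader):
                    # Assumption: keep the header row from the first file only.
                    if file_index > 0 and line_no == 0:
                        continue
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Run from a scheduled task with date.today() as the day, this would rewrite the same output file each morning. If the two files actually need to be joined on a key rather than stacked, the inner loop would have to change accordingly.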
I have a configure script to set up some paths for my R package during installation. I wish to edit a file based on some conditions. Is there any way to edit a file from within the configure.ac? It would be great if the solution is provided for all operating systems.
Is there any way to edit a file from within the configure.ac?
configure.ac is not executable, but I suppose you mean that you want the configure script generated from it to edit a file. The configure script is a shell script, and you can cause arbitrary shell code to be included in it, more or less just by including that code at the corresponding point in configure.ac.
The question, then, is how you would automate editing a file with a shell script. There is a variety of alternatives, but sed is high on my list. You will find it on every system that can support Autoconf configure scripts, because such scripts use it internally.
On the other hand, this sort of thing is one of the main activities of a configure script, in the form of creating files (especially makefiles, but not limited to those) from templates. You should consider building your target file of interest from a template in this way, instead of making custom-programmed edits to a file packaged in your program distribution. This would involve
setting output variables containing the chosen content for the parts of the file that need to be configured;
designating the target file as one for configure to build; and
providing the template, maybe by taking a complete example file and replacing each variable part with a reference to the appropriate @output_variable@.
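To make the template idea concrete, here is a rough Python illustration of the kind of substitution configure performs when it builds a file from a template: each @NAME@ placeholder is replaced with the value of the output variable NAME. In a real package you would not write this yourself; the generated configure script does it once the variable is declared with AC_SUBST and the target file is listed in AC_CONFIG_FILES. The function name render_template and the variable names in the usage note are hypothetical.

```python
import re

# Placeholders look like @NAME@, where NAME is an identifier.
PLACEHOLDER = re.compile(r"@([A-Za-z_][A-Za-z0-9_]*)@")


def render_template(template: str, values: dict) -> str:
    """Replace every @NAME@ in `template` with values[NAME]."""

    def substitute(match):
        name = match.group(1)
        # Placeholders with no configured value are left untouched.
        return values.get(name, match.group(0))

    return PLACEHOLDER.sub(substitute, template)
```

For example, render_template("R_HOME=@R_HOME@\n", {"R_HOME": "/usr/lib/R"}) yields "R_HOME=/usr/lib/R\n".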
I need to scan through a file that contains some unix shell commands and their output. I need to extract the list of unix commands mentioned throughout the file and write them to a different file. One way to achieve this is to scan the file for a specific list of commands and, if they are present, redirect them to a different file. But this gets difficult as the list keeps growing. Any other ideas along these lines?
TIA
You can get a list of all commands available in bash with compgen. If you want to use a whitelist approach, you could store the output of compgen -ac (aliases and commands) in a file and then check each token in your input file against that list.
More details on the usage of compgen can be found in this answer.
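The whitelist check itself could be sketched as follows in Python, assuming the output of compgen -ac has already been saved to a text file with one name per line. The function names are hypothetical. Because every whitespace-separated token is checked, ordinary words in the command output that happen to match a command name (test, time, and so on) will show up as false positives.

```python
def load_known_commands(compgen_output: str) -> set:
    """Build the whitelist from saved `compgen -ac` output, one name per line."""
    return {line.strip() for line in compgen_output.splitlines() if line.strip()}


def extract_commands(log_text: str, known: set) -> list:
    """Return known command names found in the log, in first-seen order."""
    found = []
    seen = set()
    for line in log_text.splitlines():
        # Check each whitespace-separated token against the whitelist.
        for token in line.split():
            if token in known and token not in seen:
                seen.add(token)
                found.append(token)
    return found
```

Restricting the check to the first token after a prompt marker would cut down on false positives, if the file format makes prompts recognizable.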
Suppose a log file is being written to disk with one extra line appended to it every so often (by a process I have no control over).
I would like to know a clean way to have an R program "watch" the log file, and process a new line when it is written to the log file.
Any advice would be much appreciated.
You can use file.info to get the modification date of a file; just check every so often and take action if the modification date changes. Keeping track of how many lines have already been read will enable you to use scan or read.table to read only the new lines.
You could also delete or move the log file after it is read by your program. The external program will then create a new log file, I assume. Using file.exists you can check if the file has been recreated, and read it when needed. You then add the new data to the already existing data.
I would move the log file to an archive subfolder and read the logfiles as they are created.
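The polling idea from the first answer can be sketched like this. Note that the sketch is in Python rather than R, and that it compares the file size and remembers a byte offset instead of checking the modification date and counting lines; the principle (check periodically, read only what is new) is the same. The class name LogFollower is hypothetical, and UTF-8 encoding of the log is an assumption.

```python
import os


class LogFollower:
    """Return only the complete lines appended since the previous call."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # number of bytes already consumed

    def read_new_lines(self) -> list:
        try:
            size = os.stat(self.path).st_size
        except FileNotFoundError:
            return []  # log not (re)created yet
        if size < self.offset:
            # File shrank: assume it was truncated or replaced, start over.
            self.offset = 0
        if size == self.offset:
            return []
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            chunk = f.read()
        # Consume up to the last newline so a half-written line waits.
        end = chunk.rfind(b"\n") + 1
        self.offset += end
        return chunk[:end].decode("utf-8").splitlines()
```

A watcher would then call read_new_lines() in a loop with a time.sleep() between calls and process whatever lines come back.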
I have tons of files dumped into a few different folders. I've tried organizing them several times; unfortunately, there is no organization structure that consistently makes sense for all of them.
I finally decided to write myself an application that lets me add tags to files, so the organization can be customized to whatever structure actually fits.
I want to avoid orphaned data. If I move or rename a file, my tag application should be told about it so it can update the name in the database. I don't want it to keep tags for files that no longer exist, or to make me re-add tags for files that have only been moved.
Is there a way I can write a callback that will hook into the mv command so that if I rename or move my files, they will invoke the script, which will notify my app, which can update its database?
My app is written in Ruby, but I am willing to play with C if necessary.
If you use Linux, you can use inotify (manpage) to monitor directories for file events. There also appears to be a Ruby interface for inotify.
From Wikipedia:
Some of the events that can be monitored for are:
IN_ACCESS - read of the file
IN_MODIFY - last modification
IN_ATTRIB - attributes of file change
IN_OPEN and IN_CLOSE - open or close of file
IN_MOVED_FROM and IN_MOVED_TO - when the file is moved or renamed
IN_DELETE - a file/directory deleted
IN_CREATE - a file in a watched directory is created
IN_DELETE_SELF - file monitored is deleted
This does not work for Windows (and I think also not for other Unices besides Linux) as inotify does not exist there.
Can you control your users' PATH? Place a script or executable named mv in a directory that comes before the standard mv command in the PATH. Have this script do what you require and then call the standard mv to perform the move.
Alternatively, define an alias in each user's profile that calls your replacement mv command.
Or rename the existing mv command and place a replacement named mv in the same directory; the replacement does what you want and then calls the renamed original.
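Whichever way the replacement gets invoked, its core is the same: perform the move, then tell the tagging application the old and new paths. A minimal sketch of that core in Python (the question's app is in Ruby, so treat this purely as an illustration) might look like the following. The name tracked_move and the notify callback are hypothetical; in practice notify would be whatever updates the tag database.

```python
import shutil
from pathlib import Path
from typing import Callable


def tracked_move(src, dst, notify: Callable[[Path, Path], None]) -> Path:
    """Move src to dst, then report (old_path, new_path) to notify."""
    old_path = Path(src).resolve()
    # shutil.move returns the final destination, including the case
    # where dst is an existing directory and the file lands inside it.
    new_path = Path(shutil.move(str(old_path), str(dst))).resolve()
    # Notify only after the move succeeded, so a failed move
    # never changes the database.
    notify(old_path, new_path)
    return new_path
```

This only catches moves that go through the wrapper. Files moved by a file manager or another program would still need something like the inotify approach from the first answer.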
I'm working on a web application where a user uploads a list of files, which should then be immediately rsynced to a remote server. I have a list of all the local files that need to be rsynced, but they will be mixed in with other files that I do not want rsynced every time. I know rsync will only send the changed files, but this directory structure and contents will grow very large over time and the delay would not be acceptable.
I know that doing a remote rsync, I can specify a list of remote files, i.e...
rsync "host:/path/to/file1 /path/to/file2 /path/to/file3"
... but that does not work once I remove "host:" and try to specify the files locally.
I also know I can use --files-from, but that would require me to create a file ahead of time with a list of files that I want to rsync (and then delete it afterwards). I think it'd be cleaner to just effectively say "rsync these 4 specific files to this remote server", but I can't seem to get that to work.
Is there any way to do what I'm trying to accomplish, or do I have to resort to creating a tmp file with a list in it?
Thanks!
You should be able to list the files similar to the example you gave. I did this on my machine to copy 2 specific files from a directory with many other files present.
rsync test.sql test2.cpp myUser@myHost:path/to/files/synced/
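From a web application, the same command line can be assembled from the list of uploaded files without a temporary file. The following Python sketch only builds the argument list; the function name build_rsync_command is hypothetical, and the destination shown in the usage note is a made-up example. Passing the arguments as a list (rather than one shell string) avoids quoting problems with unusual file names.

```python
def build_rsync_command(files, destination, extra_args=()):
    """Return an argv list: rsync [options] file1 file2 ... destination."""
    files = list(files)
    if not files:
        raise ValueError("no files to sync")
    # All sources come first; the last argument is the destination.
    return ["rsync", *extra_args, *files, destination]
```

The result can be run with subprocess.run(cmd, check=True), for example with cmd = build_rsync_command(["test.sql", "test2.cpp"], "user@example.com:uploads/").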