NiFi: Not able to SFTP files that are generated continuously

I am creating a simple NiFi pipeline to read a file and write the same file to two different locations. Below is the flow of my pipeline:
1) Read the file from server_1 directory_1
2) Copy the file to server_1 directory_2
3) Copy the file to server_2 directory_3
A Python script is continuously generating CSV files in server_1 directory_1. Steps 1 and 2 work, but in step 3 the pipeline writes only old data; to read new data I have to empty the success_sftp queue. Below is a screenshot of the pipeline:
In the third step it shows two behaviours:
1) If no CSV file is present in the input directory and I run the flow, it copies only the first file that arrives (none of the files after that), and then the success_sftp queue is full.
2) If I have CSV files (say 10 files) in the input directory and I run the flow, it copies all 10 files to the output directory and then the queue is full. To write more files I have to empty the queue.
Kindly assist.

Related

WinSCP command line for uploading file from folder named with current date

Our bank just changed the way in which we upload and download files to them. Previously we could log in to a secured website, choose a directory, and upload/download manually. Everything now has to be done through SFTP, using FileZilla or a similar program.
I want to automate SFTP upload process by using WinSCP.
I realize I will need to use the put command line to upload. The file I'm wanting to upload is generated every day and the file name is exactly the same, but the folder being uploaded from changes. The directory structure is as such:
C:\Finance\FY 2021\YYYYMMDD\file.txt
My question is: what would the upload command look like to upload this file on a daily basis? The upload always takes place the same day, so the folder name will always be the current date in the above format.
Can these commands be contained within and run from a batch file rather than creating a batch file that merely points to a scripted txt file to run? Thanks for your help!
A follow-up question for handling the FY YYYY part:
Use WinSCP to upload from a folder with a fiscal year in its name to an SFTP server
WinSCP has %TIMESTAMP% syntax which you can use to refer to the folder with today's timestamp in its name.
And yes, you can specify WinSCP commands directly in the batch file using the /command parameter:
winscp.com /ini=nul /command ^
"open sftp://username:password#ftp.example.com/ -hostkey=""...""" ^
"put ""C:\Finance\FY 2021\%%TIMESTAMP#yyyymmdd%%\file.txt"" ""/remote/path/""" ^
"exit"

Automating- Appending two text files to create 1 Excel file daily

I have two files that come in daily to a shared drive. When they are posted, they come in with the current date as part of the file name, for example dataset1_12517.txt and dataset2_12517.txt; the next day they post as dataset1_12617.txt, and so on. They are pipe-delimited files, if that matters.
I am trying to automate a daily merge of these two files into a single Excel file that will be overwritten with each merge (the file name remains the same), so my Tableau dashboard can read the output without having to make a new connection daily. The tricky part is that the file names change daily, but they follow a specific naming convention.
I have access to R Studio. I have not started writing code yet, so I am looking for a place to start or a better solution.
On a Windows machine, use the copy or xcopy commands. There are several variations on how to do it. The gist of it, though, is that if you supply the right switches, the source file will be appended to the destination file.
I like using xcopy for this. Supply the destination file name and then a list of source files.
This becomes a batch file and you can run it as a scheduled task or on demand.
This is roughly what it would look like. You may need to check the online docs to choose the right parameter switches.
xcopy C:\SRC\source_2010235.txt newfile.txt /s
As you play with it, you may even try using a wildcard approach.
xcopy C:\SRC\*.txt newfile.txt /s
See Getting XCOPY to concatenate (append) for more details.
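Since the poster has R Studio but no code yet, here is one possible starting point for the merge itself, sketched in Python/pandas rather than the batch approach above (the same pattern carries over to R with read.delim and a date-formatted file name). The shared-drive path, the output name, and the exact date format embedded in the file names are assumptions, not taken from the question:

from datetime import date
from pathlib import Path

import pandas as pd

SHARE = Path(r"\\server\shared_drive")   # hypothetical shared-drive path
OUTPUT = SHARE / "merged.xlsx"           # overwritten on every run

# Assumption: the date token is month (no zero padding) + zero-padded day
# + two-digit year, e.g. 1/25/17 -> "12517" as in dataset1_12517.txt.
d = date.today()
token = f"{d.month}{d.day:02d}{d:%y}"

frames = [
    pd.read_csv(SHARE / f"dataset{i}_{token}.txt", sep="|")  # pipe delimited
    for i in (1, 2)
]

# Append the two files and overwrite the single Excel output for Tableau.
pd.concat(frames, ignore_index=True).to_excel(OUTPUT, index=False)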

Symfony2 - Upload, zip & encrypt a file once uploaded in the server

I have been implementing an entity in Symfony 2.2 in order to upload files to my server. I followed successfully the steps listed in
http://symfony.com/doc/current/cookbook/doctrine/file_uploads.html
However, I need to implement an additional feature: saving the file along with the entity, but not the original file, rather a zipped and encrypted version of it, the same as if I had zipped it on the Linux command line and then uploaded the generated zip file. That is, when I select the file in my form, I choose it as normal, but on the server a zip containing that file is stored instead of the file itself. Of course, when downloading I want the zip as well, so the name stored in the table has to be that of the zip file.
I guess it could be accomplished using system calls, allowing PHP to execute a zip command on the file, but I cannot figure out how exactly. Any help?

Acting on changes to a log file in R as they happen

Suppose a log file is being written to disk with one extra line appended to it every so often (by a process I have no control over).
I would like to know a clean way to have an R program "watch" the log file, and process a new line when it is written to the log file.
Any advice would be much appreciated.
You can use file.info to get the modification date of a file; just check every so often and take action if the modification date changes. Keeping track of how many lines have already been read will enable you to use scan or read.table to read only the new lines.
You could also delete or move the log file after it is read by your program. The external program will then create a new log file, I assume. Using file.exists you can check if the file has been recreated, and read it when needed. You then add the new data to the already existing data.
I would move the log file to an archive subfolder and read the logfiles as they are created.
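The polling approach described above (check the modification time, remember how much has already been read, then read only the new part) looks roughly like this. It is sketched in Python purely for illustration; in R the equivalents are file.info, scan, and read.table as mentioned above, and the path and poll interval are placeholders.

import os
import time

LOG_PATH = "app.log"    # placeholder path to the log file
POLL_SECONDS = 2        # how often to check for changes

def process(line):
    print("new line:", line)

def watch(path):
    last_mtime = 0.0
    offset = 0          # byte position already processed
    while True:
        try:
            mtime = os.stat(path).st_mtime
        except FileNotFoundError:
            time.sleep(POLL_SECONDS)
            continue
        if mtime != last_mtime:          # file changed since last check
            last_mtime = mtime
            with open(path) as f:
                f.seek(offset)           # skip what was already read
                for line in f:
                    process(line.rstrip("\n"))
                offset = f.tell()
        time.sleep(POLL_SECONDS)

if __name__ == "__main__":
    watch(LOG_PATH)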

Unix system file tables

I am confused about Unix system file tables.
When two or more processes open a file for reading, does the system file table create separate entries for each process or a single entry?
If a single entry is created for multiple processes opening the same file, will their file offsets also be the same?
If process 1 opens file1.txt for reading and process 2 opens the same file file1.txt for writing, will the system file table create one or two entries?
There are three "system file tables": There is a file descriptor table that maps file descriptors (small integers) to entries in the open file table. Each entry in the open file table contains (among other things) a file offset and a pointer to the in-memory inode table. Here's a picture:
[Diagram of the file descriptor table, open file table, and in-memory inode table. Source: rich from www.cs.ucsb.edu, now on archive.org.]
So there is neither just one file table entry for an open file nor is there just one per process ... there is one per open() call, and it is shared if the file descriptor is dup()ed or fork()ed.
Answering your questions:
When two or more processes open a file for reading, there's an entry in the open file table per open. There is even an entry per open if one process opens the file multiple times.
A single entry is not created in the open file table for different processes opening the same file (but there is just one entry in the in-memory inode table).
If file1.txt is opened twice, in the same or two different processes, there are two different open file table entries (but just one entry in the in-memory inode table).
The same file may be opened simultaneously by several processes, and even by the same process (resulting in several file descriptors for the same file), depending on the file organization and filesystem. Operations on the descriptors, like moving the file pointer or closing a descriptor, are independent (they do not affect other descriptors for the same file). Operations on the file itself (like a write) can be seen via the other descriptors (a subsequent read can read the written data).
This is from the open (system call) wiki page.
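The one-entry-per-open() behaviour is easy to observe from user space. Here is a small sketch using Python's os module, which wraps the same open/dup/lseek calls: two independent open()s get independent offsets, while a dup()ed descriptor shares the offset of the original.

import os

# Create a small test file.
with open("file1.txt", "w") as f:
    f.write("abcdefghij")

# Two independent open() calls -> two open file table entries,
# each with its own offset.
fd_a = os.open("file1.txt", os.O_RDONLY)
fd_b = os.open("file1.txt", os.O_RDONLY)
os.read(fd_a, 4)                       # advances only fd_a's offset
print(os.lseek(fd_a, 0, os.SEEK_CUR))  # 4
print(os.lseek(fd_b, 0, os.SEEK_CUR))  # 0 -> separate entries

# dup() -> two descriptors sharing ONE open file table entry,
# so the offset is shared too.
fd_c = os.dup(fd_a)
os.read(fd_c, 3)                       # advances the shared offset
print(os.lseek(fd_a, 0, os.SEEK_CUR))  # 7 -> shared entry

os.close(fd_a)
os.close(fd_b)
os.close(fd_c)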
