ADF Copy Files and store metadata - azure-cosmosdb

I'm new to ADF but know a few things about copying data from source to destination, like files etc.
My requirement is to copy files from source to destination, and while doing this activity I want to store the current file's metadata (filename, filesize, extension) in Cosmos DB.
Is this possible?
I need some guidelines.
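As a hedged illustration of just the metadata-writing half (in ADF itself the file copy would be a Copy activity, typically with a Get Metadata activity supplying the file details), here is a minimal Python sketch using the azure-cosmos SDK. The account URL, key, database, container, and file path are all placeholders, not values from the question:

import os
from azure.cosmos import CosmosClient

# Placeholder account/key; in ADF you would configure a Cosmos DB sink instead.
client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("filemeta").get_container_client("files")

path = "/data/incoming/report.csv"  # illustrative path of the file being copied
name = os.path.basename(path)
container.upsert_item({
    "id": name,                              # Cosmos DB documents need an "id"
    "filename": name,
    "filesize": os.path.getsize(path),       # size in bytes
    "extension": os.path.splitext(name)[1],  # e.g. ".csv"
})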

Related

Databricks Auto Loader - why is new data not written to the table when the original CSV file is deleted and a new CSV file is uploaded?

I have a question about Auto Loader's writeStream.
I have the following use case:
A few days ago I uploaded 2 CSV files into the Databricks file system, then read them and wrote them to a table with Auto Loader.
Today I found that the files uploaded earlier contained wrong data, so I deleted those old CSV files and uploaded 2 new, correct CSV files.
Then I read and wrote the new files with Auto Loader streaming.
I found that the stream can read the data from the new files successfully, but fails to write it to the table via writeStream.
I then deleted the checkpoint folder with all its subfolders and files, re-created the checkpoint folder, and ran the read and write streams again; this time the data was written to the table successfully.
Question:
Since Auto Loader has detected the new files, why can't it write them to the table successfully until I delete the checkpoint folder and create a new one?
Auto Loader works best when new files are ingested into a directory; overwriting files might give unexpected results. I haven't worked with the option cloudFiles.allowOverwrites set to True yet, but it might help you (see the documentation below).
On the question of readStream detecting the overwritten files but writeStream not writing them: this is because of the checkpoint. The checkpoint is always linked to the writeStream operation. If you do
df = (spark.readStream.format("cloudFiles")
      .option("cloudFiles.format", "csv")  # required: the source file format
      .option("cloudFiles.schemaLocation", "<path_to_checkpoint>")
      .load("filepath"))
display(df)
then you will always view the data of all the files in the directory. If you use writeStream, you need to add .option("checkpointLocation", "<path_to_checkpoint>"). This checkpoint remembers that the (overwritten) files have already been processed, which is why they are only processed again after you delete the checkpoint.
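For completeness, a minimal sketch of the matching writeStream, assuming a Delta table target and a recent Databricks runtime; the table name and checkpoint path are placeholders:

(df.writeStream
   .format("delta")
   .option("checkpointLocation", "<path_to_checkpoint>")  # tracks processed files
   .trigger(availableNow=True)  # process all available files, then stop
   .toTable("target_table"))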
Here is some more documentation about the topic:
https://learn.microsoft.com/en-us/azure/databricks/ingestion/auto-loader/faq#does-auto-loader-process-the-file-again-when-the-file-gets-appended-or-overwritten
cloudFiles.allowOverwrites https://docs.databricks.com/ingestion/auto-loader/options.html#common-auto-loader-options

Attach a CSV file to MariaDB

Is it possible to access a CSV file directly with the CSV storage engine, without going through the trouble of loading it first?
We run a data warehouse where, during load, CSV files are read into a temp table whose content is then inserted into the production fact tables. I wonder if we could skip the load step entirely and "go directly to insert"?
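A hedged sketch of the "point the CSV engine at the file" trick, shown in Python with mysql-connector-python. The database, table, columns, and paths are illustrative, and it assumes the process has filesystem access to the MariaDB datadir; treat it as an outline of the idea, not a production recipe:

import shutil
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="etl",
                               password="<password>", database="staging")
cur = conn.cursor()
# CSV-engine tables allow no indexes and every column must be NOT NULL.
cur.execute("""CREATE TABLE IF NOT EXISTS fact_stage (
                   id INT NOT NULL,
                   amount DECIMAL(10,2) NOT NULL
               ) ENGINE=CSV""")
cur.execute("FLUSH TABLES fact_stage")
# Swap the engine's backing file (<datadir>/<db>/<table>.CSV) for the
# incoming file, then flush again so the server re-reads it.
shutil.copyfile("/data/incoming/facts.csv",
                "/var/lib/mysql/staging/fact_stage.CSV")
cur.execute("FLUSH TABLES fact_stage")
cur.execute("INSERT INTO production.facts SELECT * FROM fact_stage")
conn.commit()
cur.close()
conn.close()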

SQLite: what is the path of my created-database?

I have downloaded the "Precompiled Binaries for Windows" for SQLite from here. I opened the command-line shell and followed this, creating one table named "tbl1".
Now I am searching my disk trying to find the database file which contains tbl1, but I can't find it anywhere.
My question is: after creating a database in SQLite, where is the database file stored? I.e., what is the path of my created database?
I am using Windows 7 and have basic knowledge of SQL and databases.
By default (if no additional path is specified), the database is created in the current working directory.
That is, sqlite3.exe ex1 creates the "ex1" database in the current directory. (Use cd from the same command shell to see what the current working directory is.)
On the other hand, sqlite3.exe C:\databases\ex1 would create the "ex1" database in the "C:\databases" directory.
To create the database on the desktop for the current user, something like sqlite3.exe "%USERPROFILE%\Desktop\ex1" should work. (This uses an environment variable called USERPROFILE which is expanded and used as part of the path.)
The same principle holds for any SQLite connection: the path to the database file is resolved first, either relative (as in the first case) or absolute (as in the second and third).
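The same rule is easy to check from code. A quick illustration with Python's sqlite3 (the filename "ex1" is just an example):

import os
import sqlite3

conn = sqlite3.connect("ex1")   # bare name: file is created in os.getcwd()
print(os.path.abspath("ex1"))   # prints the full path of the database file
conn.close()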
It creates the DB file at the given file path; otherwise, by default, it creates the file in your current working directory. If you can't find it, just search for the .db file inside your current project directory.

Should we store file metadata in the database?

I'm an ASP.NET beginner, currently working on an "upload/download file" project with ASP.NET and VB.NET as the code-behind language (like SkyDrive's web interface).
What I want to ask is about uploading files to the server: must we store the file's path, size, and accessed or created date in a database? As we know, we can use directory listing from System.IO instead.
Thanks for your help.
You definitely want to store the path of the file; you want a way to find the file ;) Maybe later you will have multiple servers, replication, or other fancy things.
For the rest, it depends a bit on the type of website. If it's going to get high traffic, then store the metadata in the database: this will limit the number of IO calls (which are very slow). It will also be a lot easier to handle sorting and queries (sort by date, pull only the read-only files, ...).
The database will also help if you want to show history or statistics.
You can save the file in some directory and save the path of that file in the database, along with its size and created date. Storing the file itself in the DB is a bit difficult, so rather than that, save the file in a directory and save its path in the DB.
You could store the file information in a database to build some extra features, like avoiding duplicate files, because searching the database is much faster: a search over the filesystem always kicks off a recursive traversal.
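A language-agnostic sketch of the "file on disk, metadata in DB" pattern the answers describe, shown in Python for brevity; the table layout and paths are illustrative, not from the question:

import os
import sqlite3
from datetime import datetime, timezone

def register_upload(db_path: str, stored_path: str) -> None:
    # Record the already-saved file's path, size, and upload time.
    st = os.stat(stored_path)
    with sqlite3.connect(db_path) as conn:
        conn.execute("""CREATE TABLE IF NOT EXISTS uploads (
                            path TEXT PRIMARY KEY,
                            size_bytes INTEGER,
                            uploaded_utc TEXT)""")
        conn.execute("INSERT OR REPLACE INTO uploads VALUES (?, ?, ?)",
                     (stored_path, st.st_size,
                      datetime.now(timezone.utc).isoformat()))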

Symfony2 - Upload, zip & encrypt a file once uploaded in the server

I have been implementing an entity in Symfony 2.2 in order to upload files to my server. I successfully followed the steps listed at
http://symfony.com/doc/current/cookbook/doctrine/file_uploads.html
However, I need to implement an additional feature: the file saved along with the entity should not be the original one but a zipped & encrypted version, the same as if I had zipped and encrypted it on the Linux command line and then uploaded the generated zip file. That is, in my form I select the file as normal, but what gets stored on the server is a zip containing that file instead of the file itself; and of course when downloading I want the zip as well, so the name stored in the table has to be that of the zip file.
I guess it could be accomplished with system calls, letting PHP execute a zip command over the file, but I cannot figure out exactly how. Any help?
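A hedged sketch of the shell-out approach the question hints at, written in Python for brevity (the same subprocess idea maps to PHP's exec() or Symfony's Process component). It assumes the Linux zip utility is installed; note that -P uses the weak legacy ZipCrypto scheme, so this is an illustration only, not an endorsement of that cipher:

import subprocess

def zip_and_encrypt(src_path: str, dest_zip: str, password: str) -> None:
    # -j drops the directory part of src_path inside the archive;
    # -P sets the (legacy ZipCrypto) password on the archive.
    subprocess.run(["zip", "-j", "-P", password, dest_zip, src_path],
                   check=True)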
