SSDT 2017 Get Data - flat file source (csv/txt)

There are many examples of using the modern Get Data feature to connect to SQL Server. However, I can't find any examples of importing data from multiple flat files (csv/txt) located in one folder.
How should I make the initial connection to the data source? Should it be a connection to the folder, or to one of the files? And how should I build the query chain (Power Query M)?
It seems that the way I do it in Excel does not work.
I would be grateful for any tips.

Here are two good examples of how to import multiple text files with Power Query in Power BI.
Import all CSV files from a folder with their filenames in Power BI
https://powerpivotpro.com/2016/12/import-csv-files-folder-filenames-power-bi/
Power BI – Combine Multiple CSV files, Ignore Headers and Add FileName as a Column
http://analyticsavenue.com/power-bi-combine-multiple-csv-filesignore-headers-and-add-filename-as-a-column/
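Both posts follow the same pattern: connect to the folder (not to a single file), keep only the files you want, parse each one, then expand everything into one table. A minimal M sketch of that pattern, assuming the files share a single schema (the folder path, delimiter and encoding are placeholders you'd adjust):

let
    // Connect to the folder, not to an individual file.
    Source = Folder.Files("C:\Data\Sales"),
    // Keep only the .csv files.
    Csvs = Table.SelectRows(Source, each Text.Lower([Extension]) = ".csv"),
    // Parse each file's binary Content and promote its first row to headers.
    Parsed = Table.AddColumn(Csvs, "Data",
        each Table.PromoteHeaders(Csv.Document([Content], [Delimiter = ",", Encoding = 65001]))),
    // Keep the file name so each row can be traced back to its source file.
    Kept = Table.SelectColumns(Parsed, {"Name", "Data"}),
    // Expand using the column names of the first file.
    Expanded = Table.ExpandTableColumn(Kept, "Data", Table.ColumnNames(Kept{0}[Data]))
in
    Expanded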
There are several ways to do this with standard SSIS tasks also, but I think the most flexible way is to use a Foreach Loop Container to read all the files in a folder. In the properties of the Foreach Loop Container you specify the folder and a file name pattern (e.g. *.csv) of the files you want to import.
You create a variable to hold the name of the current file, and use that variable in an expression on the Flat File connection manager's ConnectionString property, so the connection points at a different file on each iteration of the loop.
Here's a decent example of how to do this that covers most of the setup, and provides a downloadable example.
Loop through Flat Files in SQL Server Integration Services
https://www.mssqltips.com/sqlservertip/2874/loop-through-flat-files-in-sql-server-integration-services/
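In outline, the wiring looks roughly like this (the folder path and variable name are illustrative); setting DelayValidation to True on the connection manager stops design-time validation from failing before the variable is populated:

Foreach Loop Container -> Collection:
    Enumerator:         Foreach File Enumerator
    Folder:             C:\Data\Sales
    Files:              *.csv
    Retrieve file name: Fully qualified
Foreach Loop Container -> Variable Mappings:
    Index 0:            User::CurrentFileName
Flat File Connection Manager -> Properties:
    DelayValidation:    True
    Expressions -> ConnectionString: @[User::CurrentFileName]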

Related

Trying to import data into R in a way that will allow anyone to access it when opening the markdown file / accessing the HTML knit

I am currently working on a coding project and I am running into trouble with how I should import the data set. We are supposed to read it in such a way that our instructor can access our markdown file and be able to import the data and run the code without changing file paths. I know about using relative file paths to make it accessible to anyone; however, I don't know how to get around the /users/owner part of the file path. Any help would be greatly appreciated, and if you have any further questions feel free to ask.
I've tried changing the working directory to a certain folder that both I and my instructor have named the same thing. However, as I said above, when I use read.csv to import the data frame I am still forced to use the /users/owner file path, which is obviously specific to my computer.
I can understand your supervisor; I request the same from my students. My recommended solution is to put both the data and the R script (or the .Rmd file) in the same folder. Then one does not need to add a path in the read.csv (or similar) call.
If you use RStudio, move to the folder in the Files pane and then use the gear icon and select "Set as Working Directory".
Then send both the script (.R or .Rmd) and the data to the supervisor, ideally as a zip file. The supervisor can then unpack it to an arbitrary folder and just double-click the .R/.Rmd file. The containing folder will then automatically become the working directory.
Other options are:
to use a subfolder for the data, or
to put the data in a publicly readable internet location, e.g. GitHub, and read it directly from there; see the sketch below.
The last option of course requires that the data have a free license.
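A minimal R sketch of these options (the file, folder and URL names are illustrative):

# Option 1: data sits next to the .R/.Rmd file, so a bare file name is enough;
# when the recipient opens the file, its folder becomes the working directory.
df <- read.csv("mydata.csv")

# Option 2: data lives in a subfolder of the same project.
df <- read.csv("data/mydata.csv")

# Option 3: data hosted publicly, read via the "raw" URL of a file on GitHub.
df <- read.csv("https://raw.githubusercontent.com/user/repo/main/mydata.csv")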

TestCase for multiple files

I created one TestCase which I want to use on multiple files in one folder. The TestCase is the same for each file. Is there a possibility to do that in the Execution Section?
If possible, I want to see for every file whether it was successful or not.
Thank you and best regards
I have a similar use case where I have one test case executing several iterations from a file. Currently, I'm leveraging the Tricentis TDM solution to store the data rather than pulling it from a file (or files). You can create multiple repositories to store data, e.g. SQLite, MS SQL and other source systems. Check out https://www.tricentis.com/products/automate-continuous-testing-tosca/test-data-management/

Possible to use .zip file with multiple .csv files?

Is it possible using U-SQL to unzip a ZIP archive containing multiple .csv files and process them?
Each file has a different schema.
So you've got two problems here:
1. Extract from a ZIP file.
2. Deal with the varying contents of the inner files.
To answer your question: is it possible? Yes.
How? You'd need to write a user-defined extractor to do it.
First check out the MSDN extractors page:
https://msdn.microsoft.com/en-us/library/azure/mt621320.aspx
The extractor class needs to inherit from IExtractor, with methods that iterate over the archive contents.
Then, to output each inner file in turn, pass a file name to the extractor so you can define the columns for each dataset.
Source: https://ryansimpson.net/2016/10/15/query-zipfile-adla/
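A minimal sketch of such an extractor; it uses System.IO.Compression where the linked post uses SharpZipLib, and it simply emits each inner file name alongside each raw line, leaving per-schema column splitting to a later step (the namespace and column names are illustrative):

using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using Microsoft.Analytics.Interfaces;

namespace Demo
{
    // A ZIP archive cannot be split, so process the whole file as one unit.
    [SqlUserDefinedExtractor(AtomicFileProcessing = true)]
    public class ZipExtractor : IExtractor
    {
        public override IEnumerable<IRow> Extract(IUnstructuredReader input, IUpdatableRow output)
        {
            using (var archive = new ZipArchive(input.BaseStream, ZipArchiveMode.Read))
            {
                foreach (var entry in archive.Entries)
                {
                    using (var reader = new StreamReader(entry.Open()))
                    {
                        string line;
                        while ((line = reader.ReadLine()) != null)
                        {
                            // Tag every line with the inner file it came from.
                            output.Set<string>("filename", entry.FullName);
                            output.Set<string>("line", line);
                            yield return output.AsReadOnly();
                        }
                    }
                }
            }
        }
    }
}

The EXTRACT statement would then declare those two columns, e.g. EXTRACT filename string, line string FROM "/input/files.zip" USING new Demo.ZipExtractor();, and downstream U-SQL can branch on filename to apply each file's schema.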
Another option would be to use Azure Data Factory to perform the unzip operation in a custom activity and output the CSV contents to ADL Store. This would involve more engineering, though, and an Azure Batch service.
Hope this helps.

Tie R script and QlikSense software together

I require Qlik Sense to create an Excel file when the user selects a set of tuples, and R to automatically pick up this file and run my script on it. My script then creates another CSV file, which I want Qlik Sense to automatically pick up and perform some predefined operations on. Is there any way I can link these two pieces of software together in such a manner?
So to clarify the flowchart is: Qlik gets a large data set -> the user selects a set of rows and creates csv -> My custom R script (picks up this csv automatically) is run on the csv and creates a new csv -> qlik picks it up (automatically) and visually displays the results of the program
Is there any kind of wrapper software to tie them together? Or is it a better idea to perhaps just make some sort of UI that works with R in the background and the user can manually pass the file through the UI? Thanks for the help.
Check out the R extension that has been developed on http://branch.Qlik.com; extensions are being created and added all the time. You will probably need to create an account to access the project, but once you have, the direct link is below.
http://branch.qlik.com/#/project/5677d32d7f70718900987bdd

Parse Excel to BizTalk and save it in database

How do I use BizTalk to disassemble an Excel file, and then save the data in a database?
Can anyone provide detailed steps on how to achieve this, or a link to an existing example?
Wow - this is pretty open ended!
The steps you would generally take are:
1) Generate a flat file schema that represents your Excel file structure. As it's Excel, I'm assuming your file is actually a CSV?
2) Create a custom pipeline that implements a flat file disassembler to convert the CSV to XML.
3) Using the WCF-LOB adapter, generate schemas for the table you want to insert into. You might want to front this with a stored proc. I'm assuming a SQL Server or Oracle database, as you don't say what DB you are using!
4) Map your input XML file to your table/SP schema.
5) Send your insert request to your DB (I'd advise using composite operations or a user-defined table type parameter to avoid looping through your XML and sending it line by line; see the sketch below).
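For step 5, a sketch of the user-defined table type approach on the SQL Server side (all object names are hypothetical); BizTalk would call the proc once with the whole message instead of once per row:

CREATE TYPE dbo.OrderLineType AS TABLE
(
    OrderId  INT,
    Sku      NVARCHAR(50),
    Quantity INT
);
GO

CREATE PROCEDURE dbo.usp_InsertOrderLines
    @Lines dbo.OrderLineType READONLY
AS
BEGIN
    -- One set-based insert instead of a row-by-row loop from the orchestration.
    INSERT INTO dbo.OrderLines (OrderId, Sku, Quantity)
    SELECT OrderId, Sku, Quantity
    FROM @Lines;
END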
This is pretty high-level but frankly you are asking quite a few questions in one go!
HTH
In case it isn't a CSV flat file as teepeeboy assumed, and it is actually an XLS file you want to parse, you will need something like FarPoint Spread for BizTalk. We've successfully used this to parse incoming XLS files attached to e-mails. The other option would be to write your own pipeline component to do it, but that would be a lot of work. Other than that, the steps teepeeboy outlined are what I would do as well.
