Azure Data Factory: How does export of U-SQL variables to next box in a pipeline work? - u-sql

I have a Pipeline in my Azure Data Factory with a,
U-SQL --> ForEach --> Web
..flow setup.
My U-SQL will eventually do a "select" of a single column and I want to call an API for each of the selected rows in the single column.
Can I do this like that?
If yes, How do I get the variable holding the selected query output in the U-SQL script out to the Azure Data Factory?
(so the ForEach can pick it up as a list and send each entry to the Web box, which calls the API and gets the data I need)
Right now, my alternative is to take a U-SQL box that selects the column and exports it to /Temp on ADLS and then have one big Azure Batch C# box that read the file and manually for loops the lines and runs the API call for each line. I am just hoping there would be a prettier way more modular ADF style way of doing the same thing.

U-SQL scripts do not return data, so you are on the right path. Azure Batch adds another layer of complexity that you probably don't need in this case. An ADF Lookup Activity can read Blob Storage and ADLS Gen 1, so your pipeline could do the following:
U-SQL Activity outputs column to blob.
Lookup Activity reads blob.
Foreach Activity loops through Lookup results.
-> internal Web Activity calls API.

Related

Handling Json Objects from SQL server DB at Fastreport

its my first time using fastreport, i am using FR with a asp.net web api, after searching i have decieded to get the data from db using stored proc. and get whatever pass whatever filter im using from the report from the front-end to the api (example like a certain Id of the object i wanna create a report for) and then pass this value to the stored proc. i export to the FR and pass this value to my parameters, i don't know if this is the best way to use FR with my api so i'm open to any suggestions. Now my main problem is that i have json obj in my SQL DB for localization(ex: the Name obj is stored like that {"en":"TestAcc2","fr":"test"}) so how can i convert a certain language from a field like this to show in the report, i have found resources on how to deal with a whole json DB but i only want to deserlizae this field so any help on how i could do it ?

Can you save a result (Given) to a variable in a Gherkin feature file, and then compare the variable with another result (Then)? (Cucumber for Java)

I am new to Cucumber for Java and trying to automate testing of a SpringBoot server backed by a MS SQL Server.
I have an endpoint "Get All Employees".
Writing the traditional feature file, I will have to list all the Employees in the #Then clause.
This is not possible with thousands of employees.
So I just want to get a row count of the Employee table in the database, and then compare with the number of objects returned from the "Get All Employees" endpoint.
Compare
SELECT count(*) from EMPLOYEE
with size of the list returned from
List<Employee> getAllEmployees()
But how does one save the rowcount in a variable in the feature file and then pass it into the stepdefs Java method?
I have not found any way that Gherkin allows this.
After writing a few scenario and feature files, I understood this about Cucumber and fixed the issue.
Gherkin/Cucumber is not a programming language. It is just a specification language. When keywords like Given, Then are reached by the interpreter, the matching methods in Java code are called. So they are just triggers.
These methods are part of a Java glue class. Data is not passed out of the Java class and into the gherkin feature file. The class is instantiated at the beginning and retained until the end. Because of this, it can store state.
So from my example in the question above, the Then response from a Spring endpoint call will be stored in a member variable in the glue class. The next Then invocation to verify the result will call the corresponding glue method which will access the data in the member variable to perform the comparison.
So Gherkin cannot do this, but Java at a lower level in the glue class, can.
You could create a package for example named dataRun (with the corresponding classes in the package) and save there the details during the test via setters.
During the execution of the step "And I get the count of employees from the data base" you set this count via the corresponding setter, during the step "And I get all employees" you set the number via the dedicated setter.
Then during the step "And I verify the number of employees is the same as the one in the data base" you get the two numbers via the getters and compare them.
Btw it's possible to compare the names of the employees (not just the count) if you put them into list and compare the lists.

Is there a way to compare file vs table record with creating new mapping using Informatica?

I'm working on a scenario where I have to compare a data record which is coming from a file with the data from a table as part of validation check before loading the data file into the staging table. I have come up with a couple of possible scenarios which involve something that needs to change within the load mapping, but my team suggested to me to make a change to something that is easy to notice since it is a non-standard approach.
Is there any approach that we can handle within the workflow manager using any of the workflow tasks or session properties?
Create a mapping that will read the file, join data with the table, do the required validation and will write nothing out (use a filter with FALSE condition) and set a variable to 0/1 to indicate if the loading should start.
Next, run the loading session if the validation passed.
This can be improved a bit if you want to store the validation errors in some audit table. Then you don't need a variable - the condition can refer to $PMTargetName#numAffectedRows built-in variable. If it's more then zero - meaning there were some errors - don't start the load.
create a workflow with command line where you need to write a script which will pull the data from the table by using JDBC connections and try to compare with data present in the file and then flag whether to load or not .
based on this command line output you need to go ahead with staging workflow or not..
Use awk commands for comparison of the data , where you ll get flexibility to compare date parts in a column
FYR : http://www.cs.unibo.it/~renzo/doc/awk/nawkA4.pdf

debatch big input flat file into smaller multiple output files with specific count

I have a positional input flat file schema of the following kind.
<Employees>
<Employee>
<Data>
In mapping, I need to extract the strings on position basis to pass on to the target schema.
I have the following conditions -
If Data has 500 records, there should be 5 files of 100 records at the output location.
If Data has 522 records, there should be 6 files (5*100, 1*22 records) at the output location.
I have tried few suggestions from internet like
Setting “Allow Message Breakup At Infix Root” to “Yes” and setting maxoccurs to "100". This doesn't seem to be working. How to Debatch (Split) a Flat File using Flat File Schema ?
I'm also working on a custom receive pipeline component suggested at Split Flat Files into smaller files (on row count) using Custom Pipeline but I'm quite new to this so it's taking some time.
Please let me know if there is any simpler way of doing this, without implementing the custom pipeline component.
I'm following the approach to divide the input flat file into multiple small files as per condition and write at the receive location, then process the files with native flat file dissembler. Please correct me if there is a better approach.
You have two options:
Import the flat file to a SQL table using SSIS.
Parse the input file as one Message, then map to a Composite Operation to insert the records into a SQL table. You could use in Insert Updategram also.
After either 1 or 2, call a Stored Procedure to retrieve the Count and Order of messages you need.
A simple way for a flat file structure without writing custom C# code is to just use a Database table. Just insert the whole file as records into the table, and then have a Receive Location that polls for records in the batch size you want.
Another approach is called the Scatter Gather Pattern, in this case you do set the Occurs to 1 which will debatch into individual records, and you then have an Orchestration that re-assembles it into the batch size you want. You will have to read up about Correlations Sets to do this.

Sql Database Server

Hi i have a sql database server runnin on my desktop. I want to create an asp.net application to detect when new data has been inserted into the database. Is there a command in visual studio to detect when theres new data right away?
Use the timestamp datatype on each column. This will stay identical until a change is made to any column in that row. If you combine this with the rowcount you can be certain if anything has changed in your database. You would need to cache the current timestamps and row count and compare them with the results of a query, you can then find out if there is a change.
So in your answer to:
Is there a command in visual studio to
detect when theres new data right
away?
Yes there is, although its not a command is the timestamp function (not to be confused with anything to do with the time)
Perhaps you need to provide more details to your scenario since constant querying of the database might not be the best way forward.
You can get a row count of your dataset and create a application
IN VB
Dim i as Integer
i=dataset.tables("table").rows.count
in sql backed return a count of a table and create a ASP.Net website to get the count and when count change alerts you
It may be heavier duty than you are looking for, but SQL Notification Services will do what you want. Essentially you execute a query and tell notification services you want to be notified whenever re-running that query would produce different results.
if you are using caching you can make it dependent on sql.
or you can fire email using sql trigger so when ever trigger get fired you will receive an email.
otherwise you will have to check your db again and again for any changes.
if you can provide more details about exact situation , we can provide more specific solution
You can create a webservice and call it using javascript.
here you can find sample how to call webservice using javascript:
function CallWebservice()
{
myWebService.isPrimeNumberWebService.callService(isPrimeNumberResult, "IsPrime",
testValue.value);
setTimeout("CallWebservice()",100);//here set time according to your requirement
}
For timer in javascript:
http://dotnetacademy.blogspot.com/2010/09/timer-in-javascript.html
For webservice in javaScript:
http://www.webreference.com/js/tips/020715.html
How to call webservice in JavaScript for FireFox 3.0

Resources