Import CSV to SQL using schema.ini as validator - asp.net

I'm using schema.ini to validate the data types/columns in my CSV file before loading it into SQL. If a row has a data type mismatch, the row is still imported but the mismatched cell is left blank. Is there a way to stop the user from importing the CSV file if there are any issues, and/or to provide an error report (i.e. which rows have problems)?

The best approach is to check the file for mismatches before loading, but for a large file this is not feasible.
You might need to load it first and then check the loaded data in the table for mismatches. This is much faster than checking the file (you can use a simple T-SQL script to check for NULLs in the table).
If mismatches are found, the user can be notified and the table cleared.
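The load-then-check idea can be sketched as follows. This is only an illustration: Python's standard-library sqlite3 module stands in for SQL Server, and the table name `staging`, its columns, the `row_num` column, and the helper `find_bad_rows` are all hypothetical. It assumes that a mismatched cell arrives in the table as NULL, as described in the question.

```python
import sqlite3

def find_bad_rows(conn, table, columns):
    """Return the row numbers in `table` where any listed column is NULL."""
    # Identifiers cannot be bound as parameters; here they come from trusted code.
    condition = " OR ".join(f"{col} IS NULL" for col in columns)
    query = f"SELECT row_num FROM {table} WHERE {condition} ORDER BY row_num"
    return [row[0] for row in conn.execute(query)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (row_num INTEGER, name TEXT, amount REAL)")

# Simulate an import in which row 2 had a type mismatch and was loaded as NULL.
loaded = [(1, "alice", 10.5), (2, "bob", None), (3, "carol", 7.0)]
conn.executemany("INSERT INTO staging VALUES (?, ?, ?)", loaded)

bad_rows = find_bad_rows(conn, "staging", ["name", "amount"])
if bad_rows:
    # Report the problem rows to the user, then discard the partial import.
    print("Rows with mismatches:", bad_rows)
    conn.execute("DELETE FROM staging")

remaining = conn.execute("SELECT COUNT(*) FROM staging").fetchone()[0]
```

Note that a check like this cannot tell a type mismatch apart from a cell that was genuinely empty in the CSV, so it fits best when the checked columns are required.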

Have a look at the FileHelpers library: http://www.filehelpers.com/
This is a very powerful library for all kinds of imports, including CSV, and it also has a pretty neat error handling part:
Using the Different Error Modes
The FileHelpers library has support for 3 kinds of error handling.
In the standard mode you can catch the exceptions when something fails. This approach is not bad, but you lose some info about the current record and you can't use the records array because it is not assigned.
A more intelligent way is using the ErrorMode.SaveAndContinue of the ErrorManager. Using the engine like this, you have the good records in the records array and the records with errors in the ErrorManager, and you can do whatever you want with them.
Another option is to ignore the errors and continue, as shown in this example:
engine.ErrorManager.ErrorMode = ErrorMode.IgnoreAndContinue;
records = engine.ReadFile(...
In the records array you only have the good records.
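The "save and continue" pattern described above is not specific to FileHelpers. A minimal sketch of the same idea in Python follows; it is not the FileHelpers API, and the `RecordError` class, the `read_save_and_continue` function, and the two-column `name,amount` layout are assumptions made for illustration.

```python
import csv
import io
from dataclasses import dataclass

@dataclass
class RecordError:
    line_number: int   # 1-based line in the file, header included
    raw: list          # the fields exactly as read
    message: str       # why the record was rejected

def read_save_and_continue(text):
    """Parse 'name,amount' rows; keep good records, save bad ones with context."""
    records, errors = [], []
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    for row in reader:
        try:
            name, amount = row          # wrong column count raises ValueError
            records.append((name, float(amount)))  # bad number raises ValueError
        except ValueError as exc:
            errors.append(RecordError(reader.line_num, row, str(exc)))
    return records, errors

sample = "name,amount\nalice,10.5\nbob,abc\ncarol,7\n"
records, errors = read_save_and_continue(sample)
for err in errors:
    print(f"line {err.line_number}: {err.raw} -> {err.message}")
```

Because the errors carry their line numbers, the caller can either reject the whole file when `errors` is non-empty or import `records` and show the user an error report.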

Related

Ignore empty datasets

I am writing a U-SQL script that sometimes ends up with an empty data set.
Today the outputter writes an empty file when that happens. I would like the outputter to write nothing in that case, since otherwise I will flood the ADLS with empty files.
I have tried two things so far:
1. An IF statement. The problem here is that I do a select count(*) from the data set, and I cannot do IF #COUNT > 0 since #count is a data set and the IF statement expects a variable.
2. A custom outputter. But I have noticed that it is not the outputter that writes the file but some other code that runs afterwards: the file gets created after the custom outputter is done.
Does anyone have any guidance?
Thanks in advance!
One approach is to cook your data into a table first: INSERT into the table instead of writing to a file. Empty INSERTs do not cause job failure, nor will they affect performance at runtime or future performance on the table. Let me know if you have other questions!

Having trouble with FIELDPROC on a database (Column Encryption on Iseries)

I used Listing 3 from the article "Field Encryption in DB2 for i" to create a FIELDPROC program, QGPL/MOBHOMEPAS, which should encrypt a variable-length character column.
I compiled the RPGLE program and created a separate database file, DBMLIB/UMAAAP00, as follows:
A R UMAAAF00 TEXT('-
A TEST ENCRYPTION')
A*
A IPIAAA 20A VARLEN(20)
A KYGAAA 11S 2 COLHDG('SALARY')
I then use strsql to alter the table and protect IPIAAA:
ALTER TABLE DBMLIB/UMAAAP00 alter column IPIAAA set FIELDPROC QGPL.MOBHOMEPAS
ALTER COMPLETED FOR TABLE UMAAAP00 IN DBMLIB.
For some reason when I go in to add entries through upddta directly to the file itself and then do a wrkqry to query the file and view them I don't see them as encrypted.
Is this not how it's supposed to work? Is anyone able to assist me with the logic? Ultimately, I'd like to create a simple table from scratch that has a single password column of 20 characters or so that is encrypted.
If the code being utilized for the named FieldProc program QGPL.MOBHOMEPAS was modeled-after [an effective copy of] the source code that was found at the URL from the OP [which BTW includes a position-to request to the comments section... Why?], then that code is implemented using the base-level of the DB2 for IBM i 7.1 SQL FieldProc support, not the next [enhanced] level of support in which the masking feature was added. That is, every invocation other than for function-code=8 will necessarily always be an Encode or a Decode operation for which any masking of the data is unsupported, because changing the data [with that level of support] would corrupt the data in the TABLE.
Note [from http://www.mcpressonline.com/rpg/db2-field-procedures-finally-support-conditional-masking.html] the differences in the coding requirements described for the pre-masking-support [eight parameters] and since-masking-support [nine parameters] as the pre-requisite to have the Run Query (RUNQRY) and Update Data (UPDDTA) features mask the data that is presented to the user:
The new FieldProc Masking support revolves around two main components.
The first component is a new parameter that was added to the parameter
lists that the DB2 engine passes to the FieldProc program on each
decode call. This new parameter controls whether or not the FieldProc
program can return a masked value. There are some DB2 operations—such
as the RGZPFM (Reorganize Physical File Member) command and trigger
processing—that always require the clear-text version of the data to
be returned. The second component is a new special SQLState value
('09501') that is to be returned by the FieldProc program whenever it
is passed a masked value on the encode call. This prevents the masked
value from being encoded, which would result in the original data
value being lost. When this special SQLState value is returned, DB2
will ignore the encoded value that is passed back by the FieldProc
program and instead use the value that's currently stored in the
record image for that column.
For some reason when I go in to add entries through upddta directly to the file itself and then do a wrkqry to query the file and view them I don't see them as encrypted. Is this not how it's supposed to work?
No, that's not how it's supposed to work. The data will be encoded on disk only.
When you view the data it will be decoded automatically by the FIELDPROC program no matter what you're using to view it (WRKQRY [yuck], DFU, STRSQL, whatever). This is how it works regardless of field masking (which is different/additional functionality).
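The encode-on-write, decode-on-read behaviour can be pictured with a toy model. The sketch below is not IBM i code: the `ProtectedColumn` class and its methods are hypothetical, and base64 is used only as a reversible stand-in for real encryption.

```python
import base64

class ProtectedColumn:
    """Toy model of a column with a field procedure attached."""

    def __init__(self):
        self.on_disk = []  # encoded values, as they would sit in the file

    def insert(self, clear_text):
        # Encode call: runs on every write, whatever tool does the writing.
        self.on_disk.append(base64.b64encode(clear_text.encode("utf-8")))

    def select(self, index):
        # Decode call: runs on every read, so callers see clear text.
        return base64.b64decode(self.on_disk[index]).decode("utf-8")

column = ProtectedColumn()
column.insert("hunter2")
print("stored:", column.on_disk[0])   # encoded bytes, not the password
print("read:  ", column.select(0))    # the original value
```

Any reader that goes through `select` sees the clear text, which mirrors why WRKQRY shows unencrypted values even though the stored form is encoded.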

Is there a way to query Oracle DB server name and use in conditional compilation?

I got bit trying to maintain code packages that run on two different Oracle 11gR2 systems when a line of code that had to be changed slipped by me. We develop on one system with a specific data set and then test on another system with a different data set.
The differences aren't tremendous, but include needing to change a single field name in two different queries in two different packages to have the packages run. On one system, we use one field, on the other system... a different one. The databases have the same schema name, object names, and field names, but the hosting system server names are different.
The change is literally as simple as
INSERT INTO PERSON_HISTORY
  ( RECORD_NUMBER,
    UNIQUE_ID,
    SERVICE_INDEX,
    [... 140 more fields... ]
  )
SELECT LOD.ID RECORD_NUMBER ,
  -- for Mgt System, use MD5 instead of FAKE_SSN
  -- Uncomment below, and comment out Dev system statement
  -- MD5 UNIQUE_ID ,
  -- for DEV system, use below
  '00000000000000000000' || LOD.FAKE_SSN UNIQUE_ID ,
  null SERVICE_INDEX ,
  [... 140 more fields... ]
FROM LEGACY_DATE LOD
WHERE (conditions follow)
;
I missed one of the field name changes in one of the queries, and our multi-day run is crap.
For stupid reasons I won't go into, I wind up maintaining all of the code, including having to translate and reprocess developer changes manually between versions, then transfer and update the required changes between systems.
I'm trying to reduce the repetitive input I have to provide to swap out code -- I want to automate this step so I don't overlook it again.
I wanted to implement conditional compilation, pulling the name of the database system from Oracle and having the single line swap automatically -- but Oracle conditional compilation requires a package static constant (boolean in this case). I can't use the sys_context function to populate the value. Or, it doesn't seem to let ME pull data from the sys_context and evaluate it conditionally and assign that to a constant. Oracle isn't having any. DB_DOMAIN, DB_NAME, or SERVER_HOST might work to differentiate the systems, but I can't find a way to USE the information.
An option is to create a global constant that I set manually when I move the code to the other system, but at this point, I have so many steps to do for a transfer that I'm worried that I'd even screw that up. I would like to make this independent of other packages or my own processes.
Is there a good way to do this?
-------- edit
I will try the procedure and try to figure out the view over the weekend. Ultimately, the project will be turned over to a customer who expects to "just run it", so they won't understand what any switches are meant to do, or why I have "special" code in a package. And, they won't need to... I don't even know if they'll look at the comments.
Thank you
As Mat says in the comments, this specific example can be solved with a view; however, there are other ways for more complex situations.
If you're compiling from a filesystem or using any automatic system you can create a separate PL/SQL block/procedure, which you execute in the same session prior to compilation. I'd do something like this:
declare
  l_db varchar2(30) := sys_context('userenv','instance_name');
begin
  if l_db = 'MY_DB' then
    execute immediate 'alter session set plsql_ccflags = ''my_db:true''';
  end if;
end;
/
One important point: this form of conditional compilation does not rely on a "package static constant" but on a session-level setting (PLSQL_CCFLAGS). So, you need to ensure that your compilation flags are named uniquely and set identically across packages and sessions.
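If the pre-compilation step is driven from a deployment script rather than typed by hand, the flag decision can be isolated in one small function. The sketch below only builds the statement text; the function name `ccflags_statement` and its parameters are hypothetical, while the flag name `my_db` and the instance name `MY_DB` are taken from the block above.

```python
def ccflags_statement(instance_name, flagged_instance="MY_DB"):
    """Build the ALTER SESSION statement to run before compiling the packages."""
    value = "true" if instance_name == flagged_instance else "false"
    return f"alter session set plsql_ccflags = 'my_db:{value}'"

statement = ccflags_statement("MY_DB")
print(statement)
```

Unlike the PL/SQL block, this variant sets the flag explicitly to false on every other system instead of leaving it undefined, so each compilation session starts from a known value.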

Teradata set table

I have a SET table in Teradata. When I load duplicate records through Informatica, the session fails because it tries to push duplicate records into the SET table.
I want Informatica to reject duplicate records whenever they are loaded, using either TPT or a relational connection.
Can anyone help me with the properties I need to set?
Do you really need to keep track of what records are rejected due to duplication in the TPT logs? It seems like you are open to suggestions about TPT or relational connections, so I assume you don't really care about TPT level logs.
If this assumption is correct then you can simply put an Aggregator Transformation in the mapping and mark every field as Group By. As expected, this will add a group by clause in the generated query and eliminate duplicates in the source data.
Please try the following:
1. If you use fload or TPT Fast load, the utility will implicitly remove the duplicates, but it can only be used for loading into empty tables.
2. If you are loading data into a non-empty table, place a sorter and de-dupe your data in Informatica.
3. Also try changing the flag stop on error to 0 and the flag Error limit in target to -1.
Please share your results with us.
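What the all-columns Group By or a de-duplicating sorter achieves can be sketched outside Informatica as follows. This is a plain Python illustration, not Informatica or Teradata code, and the function name `drop_duplicate_rows` is hypothetical.

```python
def drop_duplicate_rows(rows):
    """Keep the first occurrence of each row, comparing every column."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # the whole row is the key, like Group By on all fields
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

source_rows = [(1, "a"), (2, "b"), (1, "a"), (3, "c"), (2, "b")]
unique_rows = drop_duplicate_rows(source_rows)
print(unique_rows)
```

This only removes duplicates within the incoming batch; rows that duplicate data already present in the target table still need to be handled by the load itself.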

MS Access CREATE PROCEDURE Or use Access Macro in .NET

I need to be able to run a query such as
SELECT * FROM atable WHERE MyFunc(afield) = "some text"
I've written MyFunc in a VB module but the query results in "Undefined function 'MyFunc' in expression." when executed from .NET
From what I've read so far, functions in Access VB modules aren't available in .NET due to security concerns. There isn't much information on the subject, but this avenue seems like a dead end.
The other possibility is through the CREATE PROCEDURE statement which also has precious little documentation: http://msdn.microsoft.com/en-us/library/bb177892%28v=office.12%29.aspx
The following code does work and creates a query in Access:
CREATE PROCEDURE test AS SELECT * FROM atable
However I need more than just a simple select statement - I need several lines of VB code.
While experimenting with the CREATE PROCEDURE statement, I executed the following code:
CREATE PROCEDURE test AS
Which produced the error "Invalid SQL statement; expected 'DELETE', 'INSERT', 'PROCEDURE', 'SELECT', or 'UPDATE'."
This seems to indicate that there's a SQL 'PROCEDURE' statement, so then I tried
CREATE PROCEDURE TEST AS PROCEDURE
Which resulted in "Syntax error in PROCEDURE clause."
I can't find any information on the SQL 'PROCEDURE' statement - maybe I'm just reading the error message incorrectly and there's no such beast. I've spent some time experimenting with the statement but I can't get any further.
In response to the suggestions to add a field to store the value, I'll expand on my requirements:
I have two scenarios where I need this functionality.
In the first scenario, I needed to enable the user to search on the soundex of a field, and since there's no soundex SQL function in Access I added a field to store the soundex value for every field in every table where the user wants to be able to search for a record that "sounds like" an entered value. I update the soundex value whenever the parent field value changes. It's a fair bit of overhead but I considered it necessary in this instance.
For the second scenario, I want to normalize the spacing of a space-concatenation of field values and optionally strip out user-defined characters. I can come very close to achieving the desired value with a combination of TRIM and REPLACE functions. The value would only differ if three or more spaces appeared between words in the value of one of the fields (an unlikely scenario). It's hard to justify the overhead of an extra field on every field in every table where this functionality is needed. Unless I get specific feedback from users about the issue of extra spaces, I'll stick with the TRIM & REPLACE value.
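The difference between full normalization and the TRIM and REPLACE approximation can be seen in a short sketch. This is Python for illustration only, not the UDF from any of the databases mentioned; the function names and the sample value are hypothetical, and `trim_and_replace` assumes a single REPLACE of two spaces by one.

```python
def normalize(value, strip_chars=""):
    """Remove unwanted characters, then collapse whitespace runs to one space."""
    cleaned = value.translate(str.maketrans("", "", strip_chars))
    # split() with no argument splits on any run of whitespace and trims the ends.
    return " ".join(cleaned.split())

def trim_and_replace(value):
    """Mimic a TRIM followed by one REPLACE of double spaces with a single space."""
    return value.strip().replace("  ", " ")

sample = " John   Q.  Public "
print(repr(normalize(sample, strip_chars=".")))
print(repr(trim_and_replace(sample)))
```

With three spaces after "John", the single REPLACE pass leaves two of them behind, which is the edge case described above.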
My application is database agnostic (or just not very religious... I support 7). I wrote a UDF for each of the other 6 databases that does the space normalization and character stripping much more efficiently than the built-in database functions. It really annoys me that I can write the UDF in Access as a VB macro and use that macro within Access but I can't use it from .NET.
I do need to be able to index on the value, so pulling the entire column(s) into .NET and then performing my calculation won't work.
I think you are running into the ceiling of what Access can do (and trying to go beyond). Access really doesn't have the power to do really complex TSQL statements like you are attempting. However, there are a couple ways to accomplish what you are looking for.
First, if the results of MyFunc don't change often, you could create a function in a module that loops through each record in atable and runs your MyFunc against it. You could either store that data in the table itself (in a new column) or you could build an in-memory dataset that you use for whatever purposes you want.
The second way of doing this is to do the manipulation in .NET since it seems you have the ability to do so. Do the SELECT statement and pull out the data you want from Access (without trying to run MyFunc against it). Then run whatever logic you want against the data and either use it from there or put it back into the Access database.
Why don't you want to create an additional field in atable, with atable.afieldX = MyFunc(atable.afield)? All you need is to run an UPDATE command once.
You should try to write a SQL Server function MyFunc. This way you will be able to run the same query in SQLserver and in Access.
A few useful links to get you started:
MSDN article about user defined functions: http://msdn.microsoft.com/en-us/magazine/cc164062.aspx
SQLServer user defined functions: http://www.sqlteam.com/article/intro-to-user-defined-functions-updated
SQLServer string functions: http://msdn.microsoft.com/en-us/library/ms181984.aspx
What version of JET (now called Ace) are you using?
I mean, it should come as no surprise that if you are going to use some Access VBA code, then you need the VBA library and a copy of MS Access loaded and running.
However, in Access 2010, we now have table triggers and stored procedures. These stored procedures do NOT require VBA and in fact run at the engine level. I have a table trigger and soundex routine here that shows how this works:
http://www.kallal.ca/searchw/WebSoundex.htm
The above means that if Access, or VB.net, or even FoxPro via ODBC modifies a row, the table trigger code will fire, run, and save the soundex value in a column for you. This feature also works if you use the new web publishing feature in Access 2010. So, while the above article is written from the point of view of using Access Web services (available in Office 365 and SharePoint), the above soundex table trigger will also work in a standalone Access and JET (ACE) only application.
