Teradata - Column name on which error occurred

We don't have access to our Teradata PROD and we develop scripts and test in SIT, UAT. When promoted to PROD, occasionally the following errors occur:
Invalid Date/Timestamp
Numeric overflow occurred
Untranslatable character
....
Why doesn't Teradata show the exact column name on which the error occurred?
We then have to go through the script, where around 20 columns are cast from VARCHAR to DATE/TIMESTAMP and around 10 columns are prone to numeric overflow, and check each column individually on the suspicion that it might be the culprit. It would be a great relief if the error simply reported the column name.
I assume that since this has not been implemented so far, it must be hard to do for runtime errors.
However, the ET_ and UV_ error tables do capture some of these errors, I believe (maybe not all of them).
Can you explain why, if it is possible for the ET_/UV_ tables, it can't be implemented for a normal SQL query to show which column the error occurred on?

These runtime errors are associated with an operation on some value, not necessarily with a particular column -- the value could also be the result of an expression.
I imagine that associating every fallible expression in a query with the corresponding part of the original SQL would incur some runtime overhead, and it would certainly require a non-trivial amount of development work. You might want to ask your Teradata representative about this.
The ET/UV tables are maintained by TPT, which handles external data and is more likely to encounter unexpected values.
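Incidentally, the acquisition error table written by a FastLoad-style TPT job does record which field a row was rejected on, which is essentially what you are asking the SQL engine to do. A minimal sketch of inspecting it, with ET_stage_orders as a purely illustrative error-table name:

-- Summarize load rejections by error code and the column that caused them.
SELECT ErrorCode, ErrorFieldName, COUNT(*) AS error_rows
FROM   ET_stage_orders
GROUP  BY ErrorCode, ErrorFieldName
ORDER  BY error_rows DESC;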
If this is a common situation, perhaps you need to cleanse your data. There's usually a way to find the rows that cause the listed errors using built-in SQL functions or UDFs, for example:
Invalid Date/Timestamp - isdate() UDF or SQL
Numeric overflow occurred - comparisons, possibly after cast(... as BIGINT)
Untranslatable character - TRANSLATE_CHK()
(There doesn't appear to be a common way to check if a CAST will succeed.)
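A minimal sketch of such checks, using purely illustrative table/column names (src_table, order_dt_txt, amount_txt, note_txt); the TRYCAST calls assume a Teradata release recent enough to support it (it returns NULL instead of raising an error when the conversion fails):

-- Rows whose VARCHAR date will not convert to DATE
SELECT order_dt_txt
FROM   src_table
WHERE  order_dt_txt IS NOT NULL
  AND  TRYCAST(order_dt_txt AS DATE) IS NULL;

-- Values that are numeric but would overflow a DECIMAL(9,2) target column
SELECT amount_txt
FROM   src_table
WHERE  TRYCAST(amount_txt AS DECIMAL(9,2)) IS NULL
  AND  TRYCAST(amount_txt AS DECIMAL(18,2)) IS NOT NULL;

-- Rows containing characters that cannot be translated to LATIN
SELECT note_txt
FROM   src_table
WHERE  TRANSLATE_CHK(note_txt USING UNICODE_TO_LATIN) <> 0;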

Related

How to recover from "missing docs" in xtdb?

I'm using xtdb in a testing environment with a RocksDB backend. All was well until yesterday, when the system stopped ingesting new data. It tells me that this is because of "missing docs", and gives me the id of the allegedly missing doc, but since it is missing, that doesn't tell me much. I have a specific format for my xt/ids (basically type+guid) and this doesn't match that format, so I don't think this id is one of mine. Calling history on the entity id just gives me an empty vector. I understand the block on updates for consistency reasons, but how to diagnose and recover from this situation (short of trashing the database and starting again)? This would obviously be a massive worry were it to happen in production.
In the general case this "missing docs" error indicates a corrupted document store and the only proper resolution is to manually restore/recover based on a backup of the document store. This almost certainly implies some level of data loss.
However, there was a known bug in the transaction function logic prior to 1.22.0 which could intermittently produce this error (but without any genuine data loss), see https://github.com/xtdb/xtdb/commit/1c30550fb14bd6d09027ff902cb00021bd6e57c4
If you weren't using transaction functions, though, there may be another, as-yet-unknown explanation.

Create table Failed: [100015] Total size of all parcels is greater than the max message size

Could someone explain what does the above error message mean? How can it be fixed?
Thanks
There appear to be two main causes of this error:
Bugs in the client software
The query is too large
Bugs:
Make sure that you have the latest tools installed.
I have seen this error when incompatible versions of different TTU software components are installed, especially CLI. Please install (or re-install) the latest and greatest patches of CLI. -- Steve Fineholtz (SteveF), Teradata employee
The other reference is from the comments to the original post:
Could be the driver. I had a similar issue with JDBC drivers, which went away when I simply switched to a different version. – access_granted
Query is too large:
This is the root of the problem, even if it is caused by the above bugs.
Check the actual size of the SQL query sent to the server; ODBC logs or debug files will usually let you examine the generated SQL.
Some SQL generators attach a character set and collation to each field, which inflates the query length.
You may want to create your own SQL Query from scratch.
Avoid the following in the initial CREATE TABLE, since they can be added afterwards with separate statements:
Indexes
Default Values
Constraints
Non-ASCII characters as Column Names.
Also, remove all whitespace except a single space.
Do not attempt to add data while creating a table unless the total size of the SQL statement is less than 1 MB.
According to the first reference, the maximum query size is 1 MB.
On the extreme side, you can name all of your fields with a single letter (or two letters...) and rename them with ALTER TABLE statements later.
The same goes for types: you can declare every column as CHAR and modify the types later (before any data is added to the table).
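A minimal sketch of that lean-create-then-ALTER approach, with purely illustrative object names and assuming a Teradata release that supports ALTER TABLE ... RENAME:

-- Keep the initial DDL tiny: short names, no defaults, constraints, or indexes.
CREATE TABLE sales_stg (
    a INTEGER,
    b DECIMAL(18,2),
    c DATE
);

-- Flesh the table out with short follow-up statements instead of one huge DDL.
ALTER TABLE sales_stg RENAME a TO transaction_id;
ALTER TABLE sales_stg RENAME b TO transaction_amount;
ALTER TABLE sales_stg RENAME c TO transaction_dt;
CREATE INDEX (transaction_dt) ON sales_stg;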

Oracle Stored Procedure performance

I am facing a performance issue in one of my stored procedures.
Following is the pseudo-code:
CREATE OR REPLACE PROCEDURE SP_GET_EMPLOYEEDETAILS(
    P_EMP_ID IN  NUMBER,
    CUR_OUT  OUT SYS_REFCURSOR)
IS
BEGIN
    OPEN CUR_OUT FOR
        SELECT EMP_NAME, EMAIL, DOB
        FROM   T_EMPLOYEES
        WHERE  EMP_ID = P_EMP_ID;
END;
The above stored procedure takes around 20 seconds to return the result set with let's say P_EMP_ID = 100.
However, if I hard-code employee ID as 100 in the stored procedure, the stored procedure returns the result set in 40 milliseconds.
So, the same stored procedure behaves differently for the same parameter value when the value is hard-coded instead of reading the parameter value.
The table T_EMPLOYEES has around 1 million records and there is an index on the EMP_ID column.
Would appreciate any help regarding this as to how I can improve the performance of this stored procedure or what could be the problem here.
This may be an issue with skewed data distribution and/or incomplete histograms and/or bad system tuning.
The fast version of the query is probably using an index. The slow version is probably doing a full-table-scan.
In order to know which to do, Oracle has to have an idea of the cardinality of the data (in your case, how many results will be returned). If it thinks a lot of results will be returned, it will go straight ahead and do a full-table-scan as it is not worth the overhead of using an index. If it thinks few results will be returned it will use an index to avoid scanning the whole table.
The issues are:
If using a literal value, Oracle knows exactly where to look in the histogram to see how many results would be returned. If using a bind variable, it is more complicated. Certainly, on Oracle 10 it didn't handle this well and just took a guess at the cardinality. On Oracle 11, I am not sure as it can do something called "bind variable peeking" - see SQL Plan Management.
Even if it does know the actual value, if your histogram is not up-to-date, it will get the wrong values.
Even if it works out an accurate guess as to how many results will be returned, you are still dependent on the Oracle system parameters being correct.
For this last point... basically, Oracle has some parameters that tell it how fast it thinks a full-table scan is versus an index look-up. If these are not correct, it may do an FTS even when it is a lot slower. See Burleson.
My experience is that Oracle tends to flip to doing FTS way too early. Ideally, as the result set grows in size there should be a smooth transition in performance at the point where it goes from using an index to using an FTS, but in practice the systems seem to be set up to favour bulk work.
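A minimal diagnostic sketch along those lines, using the object names from the question and standard Oracle packages (DBMS_XPLAN, DBMS_STATS); the specific options shown are just one reasonable choice:

-- Run in the same session, right after executing the slow procedure call,
-- to see which plan the bind-variable cursor actually used (index vs. FTS).
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));

-- Refresh statistics and histograms on the table and its indexes so the
-- optimizer's cardinality estimate for EMP_ID reflects the current data.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,
    tabname    => 'T_EMPLOYEES',
    method_opt => 'FOR ALL COLUMNS SIZE AUTO',
    cascade    => TRUE);
END;
/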

Check if record exists - performance

I have a table that will contain large amounts of data. The purpose of this table is user transactions.
I will be inserting into this table from a web-service, which a third party will be calling, frequently.
The third party will be supplying a reference code (most probably a string).
The requirement here is that I will need to check whether this reference code has already been inserted. If it exists, just return the details and do nothing else. If it doesn't, create the transaction as expected. The reasoning behind this is the possibility of losing communication with the service after the request has been received.
I have some performance concerns with this, as the search will be done on a string value, and also on a large table. Most of the time the transaction will not exist in the database, as this is just a precaution.
I am not asking for code here, but for the best approach for performance.
As your subject line indicates, if you are evaluating EXISTS (SELECT 1 FROM SomeTable WHERE ...), there will not be much of a performance penalty: the inner query never actually materializes that bunch of 1s; the engine only checks whether at least one matching row exists and stops at the first hit.
The other aspect is a non-clustered index on the reference-code field. If the reference code is a fixed-length string, say CHAR(50), the B-tree lookup will also be close to optimal.
I am not sure about your data-consistency requirements, but I expect the normal READ COMMITTED isolation level will do no harm unless you have highly transactional reads and writes.
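A minimal sketch of the check-then-insert pattern, assuming SQL Server and purely illustrative names (dbo.UserTransaction, ReferenceCode, and a procedure parameter @ReferenceCode):

-- Index the reference code so the existence probe is a cheap seek, not a scan.
CREATE UNIQUE NONCLUSTERED INDEX IX_UserTransaction_ReferenceCode
    ON dbo.UserTransaction (ReferenceCode);

-- Inside the stored procedure that receives @ReferenceCode:
IF EXISTS (SELECT 1
           FROM dbo.UserTransaction
           WHERE ReferenceCode = @ReferenceCode)
BEGIN
    -- Already recorded: return the existing details and do nothing else.
    SELECT TransactionId, ReferenceCode, CreatedAt
    FROM   dbo.UserTransaction
    WHERE  ReferenceCode = @ReferenceCode;
END
ELSE
BEGIN
    INSERT INTO dbo.UserTransaction (ReferenceCode, CreatedAt)
    VALUES (@ReferenceCode, SYSUTCDATETIME());
END;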

Why "Error: Subreport could not be shown" for some reports and not others?

I'm using VS2010 and the built-in visual Report Designer to create RDLC templates for rendering reports with sub-reports as PDF files in an ASP.NET application using a ReportViewer control and the .LocalReport member. The code iterates over a set of records, producing one report (with its sub-reports) for each record.
I noticed recently that for a small number of the reports, one of the sub-reports was failing and giving the "Error: Subreport could not be shown" message. What's puzzling me about this case, in contrast to the many posts about this error that I've read (and previous times I've wrestled with it myself), is that it is only occurring for a subset of cases; from what I've seen elsewhere, the problem is usually all-or-nothing -- this error always appears until a solution is found, then this error never appears.
So... what could cause this error for only a subset of records? I can run the offending sub-report directly without errors; I can open the .xsd file and preview the DataSet for the offending records without errors; I can run the query behind the DataSet in SQL Server Mgt Studio without errors... I'm not sure where else to look for the cause(s) of this problem which only appears when I run the report-with-subreports?
I tracked this down to an out-of-date .xsd file (DataSet) -- somewhere along the way a table column string width was increased, but the DataSet was not updated or regenerated, so it still had the old width limit on that element, e.g., <xs:maxLength value="50" /> in the .xsd XML instead of the new width of 125 characters. The error was being thrown for those cases where at least one record in the subreport had a data value (string) in that column that exceeded the old width of 50.
An important clue came from adding a handler for the DataSet's .Selected event; I was already using the .Selecting event to set the sub-report's parameter (to tie it to the parent record), but I couldn't see anything useful when breaking in that event. However, examining the event args variable in the .Selected event, after the selection should have occurred, I found an Exception ("Exception has been thrown by the target of an invocation") with an InnerException ("Failed to enable constraints. One or more rows contain values violating non-null, unique, or foreign-key constraints"). There was also a stack trace which indicated the point of failure was executing Adapter.Fill(dataTable).
While this turned out to be pretty misleading -- I had no such constraints in place on the tables involved in the query behind the DataSet -- it at least got me focusing on the specific records in the subreports. After much fruitless searching for anomalies in the subreport record data in SQL Server Mgt Studio, I eventually started removing the records one by one from one of the offending subreport cases, re-running the report each time to see whether I had fixed the error. Eventually I removed a subreport record and the report worked -- the remaining subreport records appeared!
Now I had a specific sub-report record to examine more closely. By chance (wish I could call it inspired intuition...), I decided to edit that record in the web app instead of looking at it as I had been in SQL Server. One of the fields was flagged with an alert saying the string value was too long! That was a mystery to me for a moment: if the string value was too long, how could it already be saved in the database?! I double-checked the column definition in the table, and found it was longer than what the web-app front-end was trying to enforce. I then realized that the column had been expanded without updating the app UI, and I suspected immediately that the .xsd file also had not been updated... Bingo!
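In hindsight, a quick query against the subreport's source table would have surfaced the offending rows directly. A minimal sketch, assuming SQL Server and purely illustrative table/column names, looking for values longer than the stale 50-character limit still baked into the .xsd:

-- Rows whose value exceeds the old maxLength of 50 that the DataSet still enforces.
SELECT SomeKeyColumn, LEN(SomeTextColumn) AS actual_length
FROM   dbo.SubreportSourceTable
WHERE  LEN(SomeTextColumn) > 50;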
There are probably a number of morals to this story, and it leaves me with a familiar and unwelcome feeling that I'm not doing some things as intelligently as I ought. One moral: always update (or better, and usually simpler, just rebuild) your .xsd DataSet files whenever you change a query or table that they're based on... easier said than remembered, however. The queasy feeling I have is that there must be some way I haven't figured out to avoid building brittle apps, where a column width defined in the database is also separately coded into the UI and/or code-behind to provide user feedback and/or do data validation... suggestions on how to manage that more robustly are welcome!
