Teradata: How to query the data source's name

Is it possible to query the current connection's data source name in teradata? I was hoping to use it to tailor some queries so that they could automatically update certain parts of said queries based on the connection information.
For example, a table in our test environment would have a different suffix than the same table in the production environment. The intent would be to use the same query in both environments, so it would need to be able to discern the current connection data in order to append the appropriate suffix to the table name.
I tried to get this from DBC.SessionInfoV in the LogonSource field, as per this answer, but there does not appear to be any pattern between the logon strings that discerns the test environment from the production one - it looks like the strings are simply random.
Here you can see what information I'm looking to actually pull into my query.
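For reference, my attempt against DBC.SessionInfoV looked roughly like the sketch below. DBC.SessionInfoV and its LogonSource column are documented dictionary objects; filtering on the SESSION built-in is just my assumption about how to restrict the result to the current connection.

-- Sketch: pull the logon string for the current session only
SELECT LogonSource
FROM DBC.SessionInfoV
WHERE SessionNo = SESSION;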

Related

Schema editor collection sampling is missing fields

I am attempting to use the ODBC Schema Editor to connect to several Cosmos DB collections for reporting purposes (using Power BI). While I can successfully generate a schema for one collection, another is not working correctly.
The Document in question includes a request object. Within request there should be multiple fields. When I sample my collection in Schema Editor, the resulting schema is missing any array of objects (or anything that includes an array of objects) that should be included under the request object – they are just not listed in the resulting schema. Several others are properly split out into their own tables, but the tables are always empty when the schema is applied (this is not reflective of the underlying data – I would expect to see things in those tables). Behavior does not change if the same collection is re-sampled.
Here's an example:
JSON selection
Does anyone know how I can get the schema editor to recognize all of my data? I'm not sure what to share that would be helpful but I'm happy to provide more if there's something that would be informative.
EDIT: Unless I'm misunderstanding how to query Cosmos DB, it seems that I'm seeing the issue even if I query the data directly through Data Explorer. In the example below, you can see that if I select c.request.preparedBy, preparedBy has a mail property:
preparedBy
However, if I try to query c.request.preparedBy.mail directly then I see nothing but blanks, which is exactly what appeared in the Schema Editor:
preparedBy.mail
Thinking that maybe there was a limit to how many layers of depth I could query, I tried selecting from request instead of the entire collection. Interestingly, even though I see preparedBy when I select * from request, request.preparedBy again returns nothing but empty braces.
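To make the first two Data Explorer queries concrete, these are roughly what I ran (c is the collection alias; request and preparedBy come from my documents). The first shows preparedBy objects that clearly contain a mail property, while the second returns nothing but blanks:

SELECT c.request.preparedBy FROM c
SELECT c.request.preparedBy.mail FROM c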

How to combine multiple files in BizTalk?

I have multiple flat files (CSV), each with multiple records, and the files are received in random order. I have to combine their records using unique ID fields.
How can I combine them if there is no common unique field across all the files and I don't know which one will be received first?
Here are some example files:
In reality there are 16 files.
There are many more fields and records than in this example.
I would avoid trying to do this purely in XSLT/BizTalk orchestrations/C# code. These are fairly simple flat files. Load them into SQL, and create a view to join your data up.
You can still use BizTalk to pick up and load the files. You can also still use BizTalk to execute the view or procedure that joins the data up and sends your final message.
There are a few questions that might help guide how this would work here:
When do you want to join the data together? What triggers that (a time of day, a certain number of messages received, a certain type of message, a particular record, etc)? How will BizTalk know when it's received enough/the right data to join?
What does a canonical version of this data look like? Does all of the data from all of these files truly get correlated into one entity (e.g. a "Trade" or a "Transfer" etc.)?
I'd probably start by defining my canonical entity, and then work towards getting a "complete" picture of that canonical entity, using SQL for this kind of case.
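As a rough sketch of the SQL side (the staging tables, columns, and join keys below are made up for illustration, since the real file layouts aren't shown), the view might look something like this:

-- Hypothetical staging tables loaded by BizTalk, one per flat file type;
-- join on whatever fields actually correlate your records.
CREATE VIEW dbo.CombinedRecords AS
SELECT f1.TradeId,
       f1.Amount,
       f2.AccountRef,
       f3.SettlementDate
FROM dbo.File1Staging AS f1
JOIN dbo.File2Staging AS f2 ON f2.TradeId = f1.TradeId
LEFT JOIN dbo.File3Staging AS f3 ON f3.AccountRef = f2.AccountRef;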

Prevent SQL Injection when the table name and where clause are variables

I have a situation and need your help.
At the moment, I've built an ASP.NET app using ADO.NET. I'm using CommandText to build a dynamic query, so it has a SQL injection vulnerability.
My CommandText looks like this:
String.Format("SELECT COUNT(*) FROM {0} {1}", tableName, whereClause)
tableName and whereClause are passed in by the developer. As you can see, I cannot use SqlParameters here because I need to pass the entire tableName and whereClause, not just parameter values.
My solution to prevent SQL injection is to use a blacklist to check tableName and whereClause for malicious strings, but I don't know whether that is the best approach in this situation. If it is, can anyone point me to blacklist references or a library?
Without knowing further details, there are several options you have in order to avoid SQL injections attacks or at least to minimize the damage that can be done:
Whitelisting is more secure than blacklisting: Think about whether you really need access to all the tables except the blacklisted ones. If anyone adds tables at a later point in time, he or she might forget to add them to the blacklist.
Maybe you can restrict the access to a specific subset of tables. Ideally, these tables follow a common naming scheme so the table name can be validated against that scheme. If there is no naming scheme, you could also add a list of the tables that can be accessed to the program or the application configuration so you can check whether the table name is contained in this list. If you keep the list in a configuration file, you can expand it without recompiling the application.
If you cannot whitelist the table names, you could at least check whether the supplied table name is present in the database by querying the sys.tables system table (in SQL Server, other DBMS might have similar tables). In this query, you can use parameters so you are safe.
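As a sketch, the existence check itself can be fully parameterized (the parameter names here are arbitrary and would be bound via SqlParameter):

-- The supplied names are passed as parameters, so they cannot inject anything.
SELECT COUNT(*)
FROM sys.tables AS t
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
WHERE t.name = @TableName
  AND s.name = @SchemaName; -- optional, if a schema is supplied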
For SQL Server, you should put the table name in square brackets ("SELECT COUNT(*) FROM [" + tableName + "]"). Square brackets are used to delimit identifiers (also see this link). In order for this to work, you have to check that the tableName variable does not contain a closing square bracket. If the tableName variable might contain a schema identifier (e.g. dbo.MyTable), you'd have to split the parts first and then add the square brackets ([dbo].[MyTable]), as these are separate identifiers (one for the schema, one for the table name).
Validate the contents of the variables very carefully by using regular expressions or similar checks. This is easy for the table name, but very hard for the WHERE clause as you'd basically have to parse the SQL WHERE clause and assert that no dangerous code is contained.
The hardest part is to check the WHERE clause. Also in this respect it would be best if you could limit the options for the user and whitelist the possible WHERE clauses. This means that the user can choose from a range of WHERE clauses that the program knows or builds based upon the user input. These known WHERE clauses could contain parameters and therefore are safe against SQL injection attacks. If you cannot whitelist the WHERE clauses, you'd have to parse the WHERE clause in order to be able to decide whether a certain request is dangerous or not. This would require a large effort (unless you find a library that can do this for you), so I'd try to whitelist as many parts of the dynamic query as possible.
In order to reduce the damage of a successful attack, you should run the query under a specific account that has very limited rights. You'd have to add another connection string to the config-file that uses this account and create the connection with the limited connection string. In SQL Server, you could move the tables that this account is able to access to a specific schema and limit the access to this schema for this account.
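In SQL Server terms, that could look roughly like the following (the schema, table, user, and login names are placeholders):

-- Dedicated schema plus a user that can only read from it
CREATE SCHEMA Reporting;
ALTER SCHEMA Reporting TRANSFER dbo.SomeQueryableTable;
CREATE USER QueryServiceUser FOR LOGIN QueryServiceLogin;
GRANT SELECT ON SCHEMA::Reporting TO QueryServiceUser;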
Protect your service very well against unauthorized access so that only trusted developers can access it. You can do this by using some components in the infrastructure (firewalls, transport-level security etc.) and also by adding a strong user authentication mechanism.
Log each request to the service so that the user and machine can be identified. Notify the users about this logging mechanism so that they know that they will be identified should anything go wrong.
Some final thoughts: even if it seems very easy to provide developers with such an open method for querying data, think about whether it is really necessary. One possible option would be to not have this open access, but instead configure the queries other developers need in a configuration file. Each query gets an identifier and the query text is stored in the file and therefore known beforehand. Still, you are able to add new queries or change existing ones after you have deployed the service. You can allow parameters in the query that the callers specify (maybe a numbered parameter scheme like p1, p2, ...).
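As a sketch, an entry in such a configuration file would map an identifier (say, CustomerCountByRegion) to a fixed, parameterized statement; the table and column names below are invented for illustration:

-- @p1 is supplied by the caller and bound as a parameter by the service
SELECT COUNT(*) FROM dbo.Customer WHERE Region = @p1;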
As you can see from the list above, it is very hard (and in some areas close to impossible) to lock the service down and avoid all kinds of SQL injection attacks once you allow this open access. With an approach as described in the last paragraph you lose some flexibility, but you wouldn't have to worry about SQL injection attacks anymore.

Meta-data from SQLite

Is there any way to query a SQLite database for basic meta data such as:
Last date/time updated
Hash of database to indicate "state"
I am just looking for a simple, infrastructural way to have a script evaluate different databases and take a reasonable point of view on whether they are the same "state" as other databases in a different environment (PROD and DEV for instance).
In my experience, if no update, new record, or other change is made to the SQLite database file, the last modified time of the file doesn't change. So the last modified time should suffice as the time of the last change made to the database.
If two database files with the same state are only accessed for reading, their modified times are always the same.
Similarly, you can compare the file sizes.
You can calculate a hash over the whole file. If you consider the same data in the database to be the same "state" regardless of any differences in the past, then you may want a hash of all the records in the database, which is probably not simple.
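As a very rough sketch of the "hash of all records" idea for a single table (my_table and its columns are placeholders), you could produce a deterministic, ordered dump and hash that output outside SQLite, e.g. with sha256sum:

-- Deterministic dump of one table; hash the query output and compare
SELECT id || '|' || col_a || '|' || col_b
FROM my_table
ORDER BY id;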

Is there a way to find the SQL that updated a particular field at a particular time?

Let's assume that I know when a particular database record was updated. I know that somewhere exists a history of all SQL that's executed, perhaps only accessible by a DBA. If I could access this history, I could SELECT from it where the query text is LIKE '%fieldname%'. While this would pretty much pull up any transactional query containing the field name I am looking for, it's a great start, especially if I can filter the recordset down to a particular date/time range.
I've discovered the dbc.DBQLogTbl view, but it doesn't appear to work as I expect. Is there another view that contains the information I am looking for?
It depends on the level of database query logging (DBQL) that has been enabled by the DBA.
Some DBAs may elect not to log detailed information for tactical queries, so it is best to consult with your DBA team to understand what is being captured. You can also query DBC.DBQLRules to determine what level of logging has been enabled.
The following data dictionary objects will be of particular interest to your question:
DBC.QryLog contains the details about the query with respect to the user, session, application, type of statement, CPU, IO, and other fields associated with a particular query.
DBC.QryLogSQL contains the SQL statements. If a SQL statement exceeds a certain length, it is split across multiple rows, which is denoted by a column in this table. If you join this to the main Query Log table, care must be taken if you are aggregating any metrics in the Query Log table. More often than not, though, if you are joining the Query Log table to the SQL table you are not doing any aggregation.
DBC.QryLogObjects contains the objects used by a particular query and how they were used. This includes tables, columns, and indexes referenced by a particular query.
These tables can be joined together in DBC via QueryID and ProcID. There are a few other tables that capture information about the queries but are beyond the scope of this particular question. You can find out about those in the Teradata Manuals.
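As a starting point, here is a sketch of the kind of join described above (the time window and LIKE filter are placeholders; exact column availability can vary by Teradata release):

-- Find logged SQL that mentions a particular field within a time window
SELECT l.UserName,
       l.StartTime,
       s.SqlTextInfo
FROM DBC.QryLog AS l
JOIN DBC.QryLogSQL AS s
  ON s.ProcID = l.ProcID
 AND s.QueryID = l.QueryID
WHERE l.StartTime BETWEEN TIMESTAMP '2016-01-01 00:00:00'
                      AND TIMESTAMP '2016-01-02 00:00:00'
  AND s.SqlTextInfo LIKE '%fieldname%';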
Check with your DBA team to determine the level of logging being done and where the historical DBQL data is retained. Often DBQL data is moved nightly to a historical database, and there is often a ten-minute delay before data is flushed from cache to the DBC tables. Your DBA team can tell you where to find historical DBQL data.
