Which data type is appropriate to store address details?
How to specify a newline character while inserting data into this column?
I would strongly discourage you from storing your address details in a single column unless it is being provided to you from the source system in this manner.
Address details should be stored in separate atomic fields allowing for multiple address lines, city, state, postal code, and ZIP+4 (United States). This will provide you the most flexibility and make cleansing of address information using a third-party tool much easier should the need arise later.
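A minimal sketch of what those atomic columns might look like (table and column names and sizes are illustrative, with the US-specific ZIP+4 split out):

CREATE TABLE customer_address (
    address_id     INTEGER NOT NULL PRIMARY KEY,
    address_line1  VARCHAR(100),
    address_line2  VARCHAR(100),
    city           VARCHAR(60),
    state_code     CHAR(2),
    postal_code    CHAR(5),   -- 5-digit ZIP
    zip_plus4      CHAR(4)    -- ZIP+4 extension (United States)
);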
-- To store a CR/LF in Teradata
SELECT 'A' || '0D0A'xc || 'B';
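The same hex character literal works in an INSERT (a sketch; the table and column names are made up for illustration):

-- Hypothetical table/column, for illustration only
INSERT INTO address_table (address_text)
VALUES ('123 Main St' || '0D0A'xc || 'Springfield');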
I am trying to create a naming convention for different objects in DynamoDB, such as tables, partition and sort keys, LSIs, GSIs, attributes, etc. I have read a lot of articles and there is no single common way to do it, but I want to learn from real-world examples to choose what will fit our needs best.
The infrastructure I am working on is based on microservices. Along with this, some of our development environments share the same AWS account. Based on this, I ended up with something like this:
Tables: [Environment].[Service Name].[Table Name].ddb-table
GSIs/LSIs: [Environment].[Service Name].[Table Name].[GSI/LSI Name].ddb-[gsi/lsi]
Partition Key: pk ??? (in my understanding, the keys should have abstract names, because the single table stores versatile data in the same key)
Sort Key: sk ??? (in my understanding, the keys should have abstract names, because the single table stores versatile data in the same key)
Attributes: meaningful but as short as possible, since the names are stored with every item in the table
Elements are separated by a dot (.)
Words within a name are separated by dashes (kebab-case) and are lowercase
Tables/GSIs/LSIs are in singular form
Here is an example:
Table: dev.user-service.user-order.ddb-table
LSI: dev.user-service.user-order.lsi1pk.ddb-lsi
GSI: dev.user-service.user-order.gsi1pk.ddb-gsi
What naming conventions do you follow?
Thanks a lot in advance!
My advice:
Use PK and SK as your partition key and sort key.
Don't put table names into code; use Parameter Store. For example, if you ever do a table restore it will be to a new table name, and if you want to send traffic to the new table you won't want to change code.
Thus, don't get too attached to any particular table name. Never have code try to predict a table name; names only need to be consistent to help humans.
Don't put regions in your table names. When you switch to Global Tables they all keep the same name. Awkward!
GSIs can be called GSI1, GSI2, etc. GSI keys are GSI1PK and GSI1SK, etc.
Tag your tables with their name if you ever want to track per-table costs later.
Short yet meaningful attribute names are nice because they reduce storage and can reduce RCU/WCU consumption if you're near the 4 KB (read) or 1 KB (write) boundaries.
Use different accounts for dev, staging, and production. If you also want to put the environment name into table names to help you spot "OMG, I'm in production", that's fine.
If you have lots of attributes as the item payload which aren't used for GSIs or filtering and are always returned together, consider just storing them as a string or binary which gets parsed client side. You can even compress them. It's more efficient and lower latency because it skips the data marshaling.
Is it possible to query the current connection's data source name in Teradata? I was hoping to use it to tailor some queries so that certain parts update automatically based on the connection information.
For example, a table in our test environment would have a different suffix than the same table in the production environment. The intent would be to use the same query in both environments, so it would need to be able to discern the current connection data in order to append the appropriate suffix to the table name.
I tried to get this from DBC.SessionInfoV in the LogonSource field, as per this answer, but there does not appear to be any pattern between the logon strings that discerns the test environment from the production one - it looks like the strings are simply random.
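For reference, this is roughly what I tried (SESSION is the built-in that returns the current session number):

SELECT LogonSource
FROM DBC.SessionInfoV
WHERE SessionNo = SESSION;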
(A screenshot attached here showed the connection information I'm looking to pull into my query.)
I'm currently trying to design a database and I'm not too sure about the best way to approach a dynamically sized array field on one of my objects. My first thought is to use a column on the object to store an array of integers. However, the more I read, the more I think this isn't the best option. As a concrete example, I have a player object that stores 0 to many items, each represented by an integer. What is the best way to represent this?
If that collection of values is atomic, store them together. Meaning, if you always care about the entire group, if you never search for nested values and never sort by nested values, then they should be stored together as a single field value.
If not, they should be stored in a separate table, each value being a row, each assigned the parent ID (a foreign key) of the record in the other table that "owns" them as a group.
For example, a clump of readings from a scientific instrument that are only ever used together as a collection for analysis should be stored together in a field. In contrast, a list of phone numbers for a customer that may often need to be queried for an individual number should probably be broken up into single phone number per row in a related child table.
For more info, search on the term "database normalization".
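For the player/items example from the question, a normalized sketch might look like this (names are illustrative):

CREATE TABLE player (
    player_id  INTEGER PRIMARY KEY
);

CREATE TABLE player_item (
    player_id  INTEGER NOT NULL REFERENCES player (player_id),
    item_id    INTEGER NOT NULL
);

-- All items held by player 7:
SELECT item_id FROM player_item WHERE player_id = 7;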
Some databases support an array as a data type. For example, Postgres allows you to define a column as a one-dimensional array, or even a two-dimensional array.
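For instance, a hypothetical Postgres table using an integer array for the player's items (names are illustrative):

CREATE TABLE player (
    player_id  integer PRIMARY KEY,
    item_ids   integer[]   -- zero-to-many items in one column
);

-- Players holding item 42, using the array "contains" operator:
SELECT player_id FROM player WHERE item_ids @> ARRAY[42];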
If your database does not support an array as a column type, then you have three alternatives:
XML/JSON: Transform your data collection into an XML or JSON document, if your database supports that type. For example, Postgres has basic support for storing, retrieving, and non-indexed searching of XML using XPath. And Postgres offers industry-leading support for JSON as a data type, including indexed support on nested values, with its jsonb data type, where incoming JSON is parsed and stored in an internally defined binary format. This feature addresses one of the main reasons people consider the so-called "NoSQL" systems: storing and searching semi-structured data. (See the jsonb sketch after this list.)
Text: Create a string representation of your data to store as text.
BLOB: Create a binary value to store as a binary large object (BLOB).
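A sketch of the jsonb option in Postgres (names are illustrative; the GIN index makes containment queries on nested values indexable):

CREATE TABLE player_doc (
    player_id  integer PRIMARY KEY,
    payload    jsonb
);

CREATE INDEX player_doc_payload_idx ON player_doc USING GIN (payload);

-- Players whose payload lists item 42:
SELECT player_id FROM player_doc WHERE payload @> '{"items": [42]}';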
I have two different databases that are not connected in any way. In fact, one is a public school database and one is a HUD (housing) database. By law they are not allowed to share names and other specific identifying information. Birthdates and addresses are okay, along with ZIP codes and other more general IDs. The users need to be able to query the other database to get non-specific information, so it would appear that they need to share the same unique ID. I was considering such things as using birthdates and perhaps initials of the name, or perhaps the last 4 digits of the SSN along with the birthdate. The client was thinking of global positioning data, but I'm concerned about apartments next to one another or families moving. Any ideas?
First you need to determine what will be your measure of uniqueness. If two different people in either database can share the same values for your measure of uniqueness, you need to change your strategy. After that, put a constraint on both databases enforcing that these properties (birthdate, SSN) are what make a person record unique.
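A sketch of such a constraint, assuming birthdate plus the last four digits of the SSN is the chosen measure of uniqueness (table and column names are illustrative):

ALTER TABLE person
    ADD CONSTRAINT uq_person_identity UNIQUE (birth_date, ssn_last4);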
I have some EDI messages (X12, HL7, etc.) stored in an Oracle database. I sometimes want to pull out individual fields (e.g. ISA-03). Currently, I have some really ugly SQL. I'd like to create a PL/SQL package to make it easier and was wondering if anybody had already done this.
I imagine something like:
select edi.x12.extract_field(clob_column, 'ISA', 4)
from edi_table;
While I have never stored HL7 messages as-is in a database, it should be possible.
The idea of HL7 (and XML) is that it's a common format for systems to use to transfer information. It was never designed as a "storable" item. Usually I would pull the data out of the warehouse format into a particular HL7 message and send it to the MQHub/eGate for transmission. On the return trip I would do the opposite: extract the fields I'm warehousing and save those. I.e., HL7 should not be stored, so I don't have such a package.
Enough of the lecture. :)
I would suggest a function/procedure per segment, splitting the message into a temp table.
Here's the kind of split/extract you could do in Oracle.
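Something along these lines (an untested sketch, not a finished package; the function name and the '*' element / '~' segment separators are assumptions, and the six-argument REGEXP_SUBSTR needs Oracle 11g or later):

CREATE OR REPLACE FUNCTION extract_field (
    p_msg     IN CLOB,
    p_segment IN VARCHAR2,
    p_field   IN PLS_INTEGER
) RETURN VARCHAR2
IS
    v_seg VARCHAR2(4000);
BEGIN
    -- Grab the first occurrence of the segment: everything from the
    -- segment ID up to (but not including) the '~' segment terminator.
    v_seg := LTRIM(REGEXP_SUBSTR(p_msg, '(^|~)' || p_segment || '\*[^~]*'), '~');
    -- Field 1 is the segment ID itself, so ISA-03 is field 4. The
    -- subexpression form counts empty elements ('**') correctly.
    RETURN REGEXP_SUBSTR(v_seg, '([^*]*)(\*|$)', 1, p_field, NULL, 1);
END extract_field;
/

-- Usage, matching the sketch in the question:
SELECT extract_field(clob_column, 'ISA', 4) FROM edi_table;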