What character encoding are ODBC datasources? - odbc

Specifically, what character encoding does SQLDataSources use?
On my Windows 7 machine (set to New Zealand English) it seems to use CP1252. I can't find any mention of character encodings in the documentation.

It depends on the database you use. For PostgreSQL I run SET client_encoding to <encoding>; after connecting to the database. For Informix there is a Client Encoding option on the Environment tab. For Oracle I use the NLS_LANG environment setting.

I've done some experimentation and determined that data source names are stored as Unicode. SQLDataSources gives you the name converted to the system code page, replacing characters that can't be converted with '?'. This is about as useful as you might expect. The undocumented function SQLDataSourcesW gives the name encoded in UTF-16.
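The lossy conversion described above can be imitated in Python. This is a minimal sketch, not a call into ODBC: the data source name is made up, and CP1252 is assumed as the system code page, as in the question.

```python
# Hypothetical data source name; "\u0101" (a with macron) is not in CP1252.
dsn_name = "M\u0101ori Sales DB"

# Roughly what an ANSI API such as SQLDataSources would hand back when the
# system code page is CP1252: unconvertible characters become '?'.
ansi_bytes = dsn_name.encode("cp1252", errors="replace")
ansi_view = ansi_bytes.decode("cp1252")

# Roughly what a wide-character API such as SQLDataSourcesW would hand back:
# UTF-16 (little-endian on Windows), which round-trips without loss.
wide_bytes = dsn_name.encode("utf-16-le")
wide_view = wide_bytes.decode("utf-16-le")

print(ansi_view)              # the macron character has been replaced
print(wide_view == dsn_name)  # the UTF-16 form preserves the name
```

Once the name has gone through the code page conversion, the original character cannot be recovered, which is why the wide-character variant is the one to call if names may contain such characters.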

Related

What is CCSID='...' on a connection string

I'm working on moving old ASP code to .NET.
In strCon (the connection string for the database), one of the parameters is:
strCon=".....;CCSID=1255;"
I'm not sure what that means; I researched online but didn't find anything.
Can anybody explain what that means?
Per Wikipedia, CCSID means "Coded Character Set Identifier". Which sounds a little like "code page" and Windows has a codepage 1255 for Hebrew. If your application deals with text data that's in Hebrew, this may be the reason for it (but read the next paragraph!).
It may be legacy cruft left over from an old database or driver which handled different encodings via the connection string - it's not a standard parameter in SQL Server connection strings. See https://www.connectionstrings.com/all-sql-server-connection-string-keywords/ and https://msdn.microsoft.com/en-us/library/ms130822.aspx
Try removing that portion of the connection string; it may not be needed. The only way to be sure is to test.
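To see why a declared code page matters for text data, here is a small Python sketch. It assumes, as the answer suggests, that CCSID=1255 corresponds to the Windows-1255 (Hebrew) code page; the sample bytes are invented for illustration and do not come from the question.

```python
# Four bytes as they might arrive from a database that stores Hebrew text
# in code page 1255 (sample data, assumed for illustration).
raw = bytes([0xF9, 0xEC, 0xE5, 0xED])

# Interpreted with the code page named in the connection string.
as_hebrew = raw.decode("cp1255")

# Interpreted with a Western European default instead.
as_western = raw.decode("cp1252")

# ascii() keeps the output printable on any console.
print(ascii(as_hebrew))
print(ascii(as_western))
```

The same bytes decode to Hebrew letters under one code page and to accented Latin letters under the other, so if the application handles Hebrew text, test carefully before dropping the parameter.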

Connecting to oracle 11g on Red hat linux from windows server using asp.net

We have our application developed and tested with SQL Server 2008 R2 using ASP.NET on Windows Server. Now we have a requirement to move the database from Windows to Oracle on Red Hat Linux.
We haven't yet set up the infrastructure to test this. In the meantime, I would like to know if anyone has successfully done this kind of thing. Pointers to any resources would be a great help.
Is changing the connection string the only thing that needs to be done, or is there any specific configuration needed on Linux to allow this?
I will verify this once I get the environment ready, but as a headstart if anyone has any similar experience, do share.
Thanks in advance.
P.S.: For migration of table structure, stored procedures, etc. to Oracle we will be using the SQL Developer tool.
I would like to answer my own question, because migration to Oracle is not that straightforward, but there are some tips that may help anyone migrate to Oracle on Windows or Linux with less headache.
First, the SQL Developer tool does a good job of migrating the SQL Server schema and data to Oracle, including stored procedures, constraints, triggers, etc.
It also does a good job of datatype mapping and provides an option to remap datatypes if required.
Some caveats and precautions:
Oracle limits the length of stored procedure names to about 30 characters. You will need some manual renaming here, because stored procedures or other identifiers with names longer than 30 characters may get truncated during migration.
The other common issue you may face concerns date insertion and formatting. The usual error is "Not a valid month." You can use the following snippet to avoid the headache.
OracleConnection conn = new OracleConnection(oradb); // C#
conn.Open();
OracleGlobalization session = conn.GetSessionInfo();
session.DateFormat = "DD.MM.RR"; // change the format as required here
conn.SetSessionInfo(session);
The most annoying error is a character-to-numeric conversion error (or a related error) when inserting or updating data.
The issue is that when you add parameters to the command object for the SQL Server provider, binding happens by name, but for Oracle.DataAccess the default binding is by position. Here's the post that saved me a lot of headache:
ODP .NET Parameter problem with uint datatype
What you can do is set command.BindByName = true;
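The difference between positional and named binding can be illustrated outside ODP.NET. The sketch below uses Python's built-in sqlite3 module purely as a stand-in, since it supports both styles; the table and values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, qty INTEGER)")

# Positional binding: values are matched to placeholders by order, so a
# caller that supplies them in a different order swaps the columns.
conn.execute("INSERT INTO items (name, qty) VALUES (?, ?)", (3, "widget"))

# Named binding: values are matched by name, so their order does not matter.
conn.execute(
    "INSERT INTO items (name, qty) VALUES (:name, :qty)",
    {"qty": 3, "name": "widget"},
)

by_position, by_name = conn.execute(
    "SELECT name, qty FROM items ORDER BY rowid"
).fetchall()
print(by_position)  # values ended up in the wrong columns
print(by_name)      # values ended up where intended
conn.close()
```

SQLite is lenient about types and quietly stores the swapped values. A stricter database would be expected to reject them, which matches the conversion errors described above when parameters are added in a different order than they appear in the statement.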
When you migrate stored procedures that return data, the Oracle version gets an out parameter of type ref cursor. You need to account for this when constructing the command parameters.
For example:
OracleParameter refp = new Oracle.DataAccess.Client.OracleParameter("cv_1", OracleDbType.RefCursor, ParameterDirection.InputOutput);
command.Parameters.Add(refp);
Also, SQL Server requires stored procedure parameters to be prefixed with "@" and Oracle doesn't. This can easily be handled in your data layer.
Also, since there is no bit datatype in Oracle, number(1) works fine. You may need to convert your bool values to numeric if required.
Hope this helps someone avoid migration headaches. I will post more issues if I encounter them.

ODBC: how to handle Booleans?

Disclaimer: I am a n00b.
It seems like ODBC does not support a BOOLEAN type? Is this true?
If so, what's the standard kludgearound?
Edit: I am using ADO with Delphi on Windows to write the data, but PHP 5 to read it back.
SQL itself has traditionally not supported a boolean type, so ODBC is just reflecting this. As ODBC is intended to provide portability across databases, it is generally better to implement booleans in the database as one of the standard types, such as CHAR(1), containing either 'Y' or 'N', rather than use a vendor specific type.
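A minimal sketch of the CHAR(1) convention described above, written in Python for illustration. The helper names are hypothetical, and accepting lowercase or padded input is an assumption about how forgiving the reader wants to be.

```python
def bool_to_flag(value: bool) -> str:
    """Convert a boolean to the 'Y'/'N' value stored in a CHAR(1) column."""
    return "Y" if value else "N"


def flag_to_bool(flag: str) -> bool:
    """Convert a stored 'Y'/'N' value back to a boolean.

    Whitespace and case are normalized first, because some drivers pad
    CHAR columns; anything other than Y or N is rejected.
    """
    normalized = flag.strip().upper()
    if normalized not in ("Y", "N"):
        raise ValueError(f"expected 'Y' or 'N', got {flag!r}")
    return normalized == "Y"


print(bool_to_flag(True), bool_to_flag(False))
print(flag_to_bool("Y"), flag_to_bool("n "))
```

Keeping the conversion in one place on each side (writer and reader) means the two programs only have to agree on the two stored characters.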
There's SQL_C_BIT, but you need to look up what a given driver uses for each SQL type. For example, MySQL uses SQL_C_CHAR for bool.
I believe it depends on the actual SQL server implementation. You can check the ODBC driver/datasource settings, if you are doing it under Windows -- there might be options such as Bool As Char, or something.

SQLite and Portuguese-br characters

I'm developing an app that requires the storage of Portuguese characters. I was wondering if I need to do any configuration to prepare my SQLite db to store those considered special characters. When I query a db table that contains those characters I get a '?' (without quotes) in their place.
Probably an encoding problem. Is your DB/client using UTF-8?
You should check your DB encoding with PRAGMA encoding;, make sure your client does its job using the same encoding, and verify that the encoding handles those Portuguese characters well.
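The check can be sketched with Python's built-in sqlite3 module. This uses a fresh in-memory database, so it shows the default for a new database rather than the asker's file; the table name and sample text are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Ask SQLite which text encoding the database uses.
encoding = conn.execute("PRAGMA encoding").fetchone()[0]

# Round-trip a Portuguese phrase through a TEXT column.
conn.execute("CREATE TABLE phrases (body TEXT)")
original = "Informa\u00e7\u00e3o e cora\u00e7\u00e3o"
conn.execute("INSERT INTO phrases (body) VALUES (?)", (original,))
stored = conn.execute("SELECT body FROM phrases").fetchone()[0]

print(encoding)
print(stored == original)
conn.close()
```

If the round trip succeeds here but the application still shows '?', the loss is likely happening in the client or display layer rather than in the database file.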

Which code set is /etc/passwd stored in? Can it be UTF-8? What limits are placed on user names?

On a modern Unix or Linux system, how can you tell which code set the /etc/passwd file stores user names in? Are user names allowed to contain accented characters (from the range 0x80..0xFF in, say, ISO 8859-1 or 8859-15)? Can the /etc/passwd file contain UTF-8? Can you tell that it contains UTF-8? What about the plain text of passwords before they are encrypted or hashed?
Clearly, if the usernames and other data are limited to the 0x00..0x7F range (and exclude 0x00 anyway), then there is no difference between UTF-8, 8859-1 or 8859-15; the characters present are all encoded the same.
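That claim is easy to check in Python. The user names below are invented for illustration.

```python
CODECS = ("utf-8", "iso-8859-1", "iso-8859-15")


def encodings_of(text):
    """Return the byte sequence for text under each code set of interest."""
    return {codec: text.encode(codec) for codec in CODECS}


ascii_name = encodings_of("jsmith")        # only 0x01..0x7F characters
accented_name = encodings_of("jos\u00e9")  # ends in e with acute accent

# True when every code set produces the same bytes.
ascii_agrees = len(set(ascii_name.values())) == 1
accented_agrees = len(set(accented_name.values())) == 1

print(ascii_agrees)
print(accented_agrees)
print(accented_name)
```

For the accented name, the two 8859 code sets agree with each other on a single byte, while UTF-8 uses two bytes, so a file containing such names cannot be read correctly without knowing which code set wrote it.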
Also, I'm using /etc/passwd as an abbreviation for something along the lines of "the user identification and authentication database (sometimes termed a directory service) on a Unix-based machine, usually accessed via PAM and sometimes hosted on other machines altogether from the local one, but sometimes still actually a file on the local hard disk, conventionally called /etc/passwd, often supported by /etc/shadow". I'm also assuming that the equivalent questions about the group database (often the /etc/group file) have the same answer.
It's all ASCII. But the password itself is never stored - only the results of the one-way hash. If you're wondering what characters can be in the password itself, it depends on the locale, which will restrict the characters your terminal is able to deal with. See "man locale"
From the BSD man page:
"/etc/passwd ASCII password file..."
As for usernames, I can tell you that Solaris only supports ASCII. I can't speak for other Unix-en.
"Not every object in Solaris 2 and Solaris 7 can have names composed of arbitrary characters. The names of the following objects must be composed of ASCII characters:
* User names, group name, and passwords
* System name ...
"
