Why is the char(10) character at the end of a line lost? [duplicate] - sqlite

This question already has an answer here:
SQLite FireDAC trailing spaces
(1 answer)
Closed 5 years ago.
In my case, the $A character at the end of a line is lost when I read a field in Delphi. I think the problem is in the FireDAC components. I use Delphi 10.1 Berlin and SQLite (I don't know the version). When I run the program below, the message shows 3!=4.
This is the code:
FD := TFDQuery.Create(nil);
FD.Connection := FDConnection1;
FD.ExecSQL('create table t2 (f2 text)');
FD.ExecSQL('insert into t2 values(''123''||char(10))');
FD.Open('select f2, length(f2) as l from t2');
ShowMessage(IntToStr(Length(FD.FieldByName('f2').AsString))+'!='+FD.FieldByName('l').AsString);
The last character, $A, is lost.
Maybe somebody can explain this strange behavior to me.

You need to turn off the TFDQuery.FormatOptions.StrsTrim property:
Controls the removing of trailing spaces from string values and zero bytes from binary values
...
For SQLite, this property is applied to all string columns
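A minimal sketch, using Python's sqlite3 module as a stand-in for the Delphi/FireDAC client, showing that SQLite itself keeps the trailing newline; the trimming happens in the FireDAC client layer (the StrsTrim behavior quoted above), not in the database:

```python
import sqlite3

# Same table and insert as the question, but read back through a client
# that does no string trimming: both lengths come out as 4.
conn = sqlite3.connect(":memory:")
conn.execute("create table t2 (f2 text)")
conn.execute("insert into t2 values('123' || char(10))")
value, length = conn.execute("select f2, length(f2) from t2").fetchone()
print(len(value), length)  # 4 4 - the newline survives in the database
```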

Related

What does the TEXT (5) in CREATE TABLE SQL mean when I use SQLite? [duplicate]

This question already has answers here:
sqlite allows char(p) inputs exceeding length p
(2 answers)
Varchar Length? SQLite in Android
(1 answer)
String data type in sqlite
(2 answers)
No limitation on SQL column data type
(1 answer)
Closed 10 days ago.
Code A creates an SQLite table; it was generated by SQLite Studio 3.4.3.
1: What does the TEXT (5) mean in SQLite? Does it mean that the length of the linkTitle field can't exceed 5 letters? What will happen if I add a record with 20 letters in the linkTitle field?
2: The default value "The title of saved links" of the linkTitle field exceeds 5 letters. What will happen?
Code A
CREATE TABLE myTable (
    id INTEGER PRIMARY KEY ASC AUTOINCREMENT,
    linkTitle TEXT (5) NOT NULL
        DEFAULT [The title of saved links],
    linkSaved TEXT (10) NOT NULL
);
It has no effect; the column type is treated as TEXT. SQLite accommodates much of the SQL used by other databases, hence it accepts the length specifier but ignores it, other than syntactically (omit either parenthesis, or specify a non-numeric length, and a syntax error will be issued).
No length restriction is imposed by the specifier.
If you were to use:-
INSERT INTO myTable (linkTitle,linkSaved) VALUES
('The quick brown fox jumped over the lazy fence','The slow grey elephant could not jump over the fence so crashed though it'),
(100,zeroblob(10)),
(10.1234567,'10.1234567')
;
SELECT * FROM myTable;
The result would be:-
This also demonstrates that you can save any type of data in any type of column (with the exception of an alias of the rowid, e.g. the id column, or the rowid itself), which other databases typically do not allow.
Furthermore, the column type itself is highly flexible: you could specify virtually any type, e.g. the_column ridiculoustype, and SQLite will make the type NUMERIC (the default/fall-through type) in this case (see section 3.1 in the link below for the rules for assigning a type affinity).
You should perhaps have a read of https://www.sqlite.org/datatype3.html
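The points above can be verified directly; here is a small sketch in Python's sqlite3 (the table mirrors Code A, with the same over-length default) showing that the (5) imposes no length limit on either inserted values or the default:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE myTable (
    id INTEGER PRIMARY KEY ASC AUTOINCREMENT,
    linkTitle TEXT (5) NOT NULL DEFAULT 'The title of saved links',
    linkSaved TEXT (10) NOT NULL)""")

# A 46-character value goes into the TEXT (5) column unchanged...
conn.execute("INSERT INTO myTable (linkTitle, linkSaved) VALUES (?, ?)",
             ("The quick brown fox jumped over the lazy fence", "x"))
# ...and the 24-character default is applied without complaint.
conn.execute("INSERT INTO myTable (linkSaved) VALUES ('y')")

rows = conn.execute(
    "SELECT linkTitle, length(linkTitle) FROM myTable ORDER BY id").fetchall()
for title, length in rows:
    print(length, repr(title))
```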

PLSQL: Find invalid characters in a database column (UTF-8)

I have a text column in a table which I need to validate to recognize which records have non UTF-8 characters.
Below is an example record where there are invalid characters.
text = 'PP632485 - Hala A - prace kuchnia Zepelin, wymiana muszli, monta􀄪 tablic i uchwytów na r􀄊czniki, wymiana zamka systemowego'
There are over 3 million records in this table, so I need to validate them all at once and get the rows where this text column has non UTF-8 characters.
I tried below:
instr(text, chr(26)) > 0 - no records get fetched
text LIKE '%ó%' (tried this for a few invalid characters I noticed) - no records get fetched
update <table> set text = replace(text, 'ó', 'ó') - no change seen in text
Is there anything else I can do?
Appreciate your input.
This is Oracle 11.2
The characters you're seeing might be invalid for your data, but they are valid AL32UTF8 characters. Else they would not be displayed correctly. It's up to you to determine what character set contains the correct set of characters.
For example, to check if a string only contains characters in the US7ASCII character set, use the CONVERT function. Any character that cannot be converted into a valid US7ASCII character will be displayed as ?.
The example below first replaces the question marks with string '~~~~~', then converts and then checks for the existence of a question mark in the converted text.
WITH t (c) AS
(SELECT 'PP632485 - Hala A - prace kuchnia Zepelin, wymiana muszli, monta􀄪 tablic i uchwytów na r􀄊czniki, wymiana zamka systemowego' FROM DUAL UNION ALL
SELECT 'Just a bit of normal text' FROM DUAL UNION ALL
SELECT 'Question mark ?' FROM DUAL),
converted_t (c) AS
(
SELECT
CONVERT(
REPLACE(c,'?','~~~~~')
,'US7ASCII','AL32UTF8')
FROM t
)
SELECT CASE WHEN INSTR(c,'?') > 0 THEN 'Invalid' ELSE 'Valid' END as status, c
FROM converted_t
;
Invalid
PP632485 - Hala A - prace kuchnia Zepelin, wymiana muszli, montao??? tablic i uchwyt??w na ro??Sczniki, wymiana zamka systemowego
Valid
Just a bit of normal text
Valid
Question mark ~~~~~
Again, this is just an example - you might need a less restrictive character set.
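The replace-then-convert trick can be sketched outside the database as well. Here is a Python analogue (plain ASCII standing in for US7ASCII, and str.encode's '?' replacement standing in for CONVERT's behavior); the function name is illustrative:

```python
def has_non_ascii(text: str, sentinel: str = "~~~~~") -> bool:
    # Protect genuine question marks first, because the lossy conversion
    # marks unconvertible characters with '?'.
    protected = text.replace("?", sentinel)
    converted = protected.encode("ascii", errors="replace").decode("ascii")
    return "?" in converted

print(has_non_ascii("Just a bit of normal text"))     # False
print(has_non_ascii("Question mark ?"))               # False
print(has_non_ascii("montaż tablic i uchwytów"))      # True
```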
--UPDATE--
With your data: it's up to you to determine how you want to continue. Determine what a good target character set is. Contrary to what I said earlier, it's not mandatory to pass a "from character set" argument to the CONVERT function.
Things you could try:
Check which characters show up as '�' when converting from UTF8 to AL32UTF8
select * from G2178009_2020030114_dinllk
WHERE INSTR(CONVERT(text ,'AL32UTF8','UTF8'),'�') > 0;
Check whether the converted text matches the original text. In this example I'm converting to UTF8 and comparing against the original; rows where the conversion changed the text contain characters that did not survive it.
select * from G2178009_2020030114_dinllk
WHERE
CONVERT(text ,'UTF8') <> text;
This should be enough tools for you to diagnose your data issue.
As shown by previous comments, you can detect the issue in place, but it's difficult to automatically correct in place.
I have used https://pypi.org/project/ftfy/ to correct invalidly encoded characters in large files.
It guesses what the actual UTF8 character should be, and there are some controls on how it does this. For you, the problem is that you have to pull the data out, fix it, and put it back in.
So assuming you can get the data out to the file system to fix it, you can locate files with bad encodings with something like this:
find . -type f | xargs -I {} bash -c "iconv -f utf-8 -t utf-16 {} &>/dev/null || echo {}"
This produces a list of files that potentially need to be processed by ftfy.
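A Python analogue of that find/iconv one-liner, in case the files are easier to reach from a script; it flags files whose bytes are not valid UTF-8 (a sketch, with illustrative names; ftfy would then be used to repair the flagged files):

```python
import tempfile
from pathlib import Path

def find_invalid_utf8(paths):
    """Return the subset of paths whose contents fail to decode as UTF-8."""
    bad = []
    for p in paths:
        try:
            Path(p).read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            bad.append(p)
    return bad

# Demonstration with two throwaway files: one valid UTF-8, one not.
with tempfile.TemporaryDirectory() as d:
    good = Path(d) / "good.txt"
    good.write_bytes("montaż tablic".encode("utf-8"))
    broken = Path(d) / "broken.txt"
    broken.write_bytes(b"monta\xbf tablic")  # lone continuation byte 0xBF
    flagged = find_invalid_utf8([good, broken])
    print([p.name for p in flagged])  # ['broken.txt']
```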

Get the correct Hexadecimal for strange symbol

I have this strange symbol in my PL/SQL Developer client (check the image; it's the symbol between P and B).
In the past, and for a different symbol, I was able to update my DB and remove it by doing this:
update table set ent_name = replace(ent_name, UTL_RAW.CAST_TO_VARCHAR2(HEXTORAW('C29B')), ' ');
The problem is that I don't remember how I translated the symbol I had at that time into C29B.
Can you help me understand how I can translate the current symbol to the HEX format, so I can use the command to remove it from my database?
Thanks
As long as it's in your table, you can use the DUMP function to find it.
Use DUMP to get the byte representation of any data you wish to inspect for weirdness.
A good overview: Oracle / PLSQL: DUMP Function
Here's some text with plain ASCII:
select dump('Dashes-and "smart quotes"') from dual;
Typ=96 Len=25:
68,97,115,104,101,115,45,97,110,100,32,34,115,109,97,114,116,32,113,117,111,116,101,115,34
Now introduce funny characters:
select dump('Dashes—and “smart quotes”') from dual;
Typ=96 Len=31:
68,97,115,104,101,115,226,128,148,97,110,100,32,226,128,156,115,109,97,114,116,32,113,117,111,116,101,115,226,128,157
In this case, the number of bytes increased because my DB is using UTF8. Numbers outside of the valid range for ASCII stand out and can be inspected further.
The ASCIISTR function provides an even more convenient way to see the special characters:
select asciistr('Dashes—and “smart quotes”') from dual;
Dashes\2014and \201Csmart quotes\201D
This one converts non-ASCII characters into backslashed Unicode hex.
The DUMP function takes an additional argument that can be used to format the output in a nice way:
select DUMP('Thumbs 👍', 1017) from dual;
Typ=96 Len=11 CharacterSet=AL32UTF8: T,h,u,m,b,s, ,f0,9f,91,8d
select DUMP('Smiley 😊 Face', 17) from dual;
Typ=96 Len=16: S,m,i,l,e,y, ,f0,9f,98,8a, ,F,a,c,e
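Going the other direction, from a character to the hex string you would feed to HEXTORAW, is just the character's UTF-8 bytes in hex. A quick sketch in Python (the helper name is mine):

```python
def utf8_hex(ch: str) -> str:
    """Hex string of a character's UTF-8 bytes, e.g. for Oracle's HEXTORAW."""
    return ch.encode("utf-8").hex().upper()

print(utf8_hex("\u009b"))  # C29B - the byte pair from the original UPDATE
print(utf8_hex("\u2014"))  # E28094 - em dash, matching DUMP's e2,80,94
print(utf8_hex("\u201c"))  # E2809C - left smart quote
```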

Prevent SQLite query from stripping leading zeros from numeric strings?

In my database, a table contains two columns each containing an 8 digit ASCII code, usually it's just alphanumeric. For example, a row might contain A123B45C in col1 and PQ2R4680 in col2.
I need to have a query/view that outputs a 4 character string calculated as the 2nd+3rd chars of these, concatenated. So in this example the extra column value would be 12Q2.
This is a cut-down version of the SQL I'd like to use, although it won't work as written because of zero stripping / conversion:
select
*,
(substr(col1, 2, 2) || substr(col2, 2, 2)) AS mode
from (nested SQL source query)
where (conditions)
This fails because if a row contains A00B23B4 in col1 and P32R4680 in col2, it will evaluate as 0032 and the query output will contain numeric 32 not 0032. (It's worse if col1 contains P1-2345 or "1.23456" or something like that)
Other questions on preventing zero stripping and string to integer conversion in Sqlite, all relate to data in tables where you can define a column text affinity, or static (quotable) data. In this case I can't do these things. I also can only create queries, not tables, so I can't write to a temp table.
What is the best way to ensure I get a 4 character output in all cases?
I believe your issue is not with substr stripping characters, as substr works as expected, e.g.:-
Then running the query SELECT substr(col1,2,2) || substr(col2,2,2) as mode FROM stripping
results in (as expected):-
Rather, your issue is likely in how you subsequently utilise mode, in which case you may need to use a CAST expression.
For example the following does what is possibly happening :-
SELECT substr(col1,2,2) || substr(col2,2,2) as mode, CAST(substr(col1,2,2) || substr(col2,2,2) AS INTEGER) AS oops FROM stripping
resulting in :-
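The same behavior can be sketched in Python's sqlite3 with the exact values from the question: the concatenation itself yields the text '0032', and only an explicit (or implicit, client-side) CAST turns it into the integer 32:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table stripping (col1 text, col2 text)")
conn.execute("insert into stripping values ('A00B23B4', 'P32R4680')")
mode, oops = conn.execute(
    "select substr(col1,2,2) || substr(col2,2,2), "
    "       cast(substr(col1,2,2) || substr(col2,2,2) as integer) "
    "from stripping").fetchone()
print(repr(mode), oops)  # '0032' 32 - the leading zeros survive until the CAST
```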

SQLITE3 + execute Insert [duplicate]

This question already has answers here:
SQLite parameter substitution problem
(8 answers)
Closed 7 years ago.
Trying to execute an insert with an item coming from a list:
item=u'Sunil Goyal'
c.execute('''INSERT INTO bpersons(person_name) VALUES (?)''',item)
is simple enough, but it returns
Incorrect number of bindings supplied. The current statement uses 1, and there are 11 supplied.
Clearly instead of reading item as one element, it is reading characters. There is no problem with the earlier code which returns this list:
>>> if meta[7]:  # bcoz list could be empty also
...     for item in meta[7]:
...         print item
Sunil Goyal
Rehan Yar Khan
Khan
Kae Capital
Ashish Shankar
Karthik Reddy
Feroze Azeez
len(meta[7])
7
Any idea where I am going wrong?
execute expects an iterable of parameters (documentation). Your unicode string is itself an iterable, so each of its 11 characters is treated as a separate binding. Put it inside a tuple or list so sqlite3 handles it as a single value.
c.execute('''INSERT INTO bpersons(person_name) VALUES (?)''',(item,))
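A runnable sketch of the fix, with the table and names taken from the question (the single-value insert uses a one-element tuple; for several rows, executemany takes one tuple per row):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("create table bpersons (person_name text)")

item = u'Sunil Goyal'
# Wrong: c.execute("... VALUES (?)", item) would treat each of the
# 11 characters as a separate parameter (hence "11 supplied").
c.execute("INSERT INTO bpersons(person_name) VALUES (?)", (item,))

# For a whole list of names, wrap each one in a tuple:
names = [u'Rehan Yar Khan', u'Kae Capital']
c.executemany("INSERT INTO bpersons(person_name) VALUES (?)",
              [(n,) for n in names])

count = c.execute("select count(*) from bpersons").fetchone()[0]
print(count)  # 3
```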
