Line Breaks in MySQL Column - sqlite

I have a MySQL column called description and I have several sentences in this column. For example: "John ran down a hill. He was tired. John went to get water." I would like a line break after each sentence so that it outputs like:
John ran down a hill.
He was tired.
John went to get water.
I'm using a SQLite DB Browser (http://sqlitebrowser.org/). I thought I could do line breaks with: "John ran down a hill. \n" but unfortunately it outputs the \n as well as the "". Can anyone help me with these line breaks? Thanks!

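SQLite string literals don't support backslash escape sequences, so \n is stored as two literal characters. You have to concatenate the newline character some other way, such as with the char(10) function or by typing it in hexadecimal: x'0a'. Remember that character 10 is the line feed (LF), which is the newline on Unix-like systems; Windows uses a carriage return followed by a line feed, i.e. char(13) || char(10).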
Just use a query like this (to update all the rows):
update table_name set "description" =
replace("description", '.', '.' || x'0a')
If you want to edit a specific row add a WHERE clause:
update table_name set "description" =
replace("description", '.', '.' || x'0a')
WHERE "rowid" = 1
The || is the concatenation operator in SQLite.
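For anyone who wants to try this outside DB Browser, here is a minimal sketch using Python's built-in sqlite3 module. The table name notes and the sample row are made up for illustration. It replaces '. ' (period plus space) rather than '.', which is a small variation on the query above so that the next sentence doesn't start with a stray space.

```python
import sqlite3

# In-memory database; the table name "notes" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (description TEXT)")
conn.execute(
    "INSERT INTO notes VALUES (?)",
    ("John ran down a hill. He was tired. John went to get water.",),
)

# Replace "period + space" with "period + line feed".
# char(10) is the line feed; x'0a' would work here as well.
conn.execute(
    "UPDATE notes SET description = replace(description, '. ', '.' || char(10))"
)

description = conn.execute("SELECT description FROM notes").fetchone()[0]
print(description)
```

The final period has no space after it, so no trailing line break is added to the last sentence.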

Related

PLSQL: Find invalid characters in a database column (UTF-8)

I have a text column in a table which I need to validate to recognize which records have non UTF-8 characters.
Below is an example record where there are invalid characters.
text = 'PP632485 - Hala A - prace kuchnia Zepelin, wymiana muszli, monta􀄪 tablic i uchwytów na r􀄊czniki, wymiana zamka systemowego'
There are over 3 million records in this table, so I need to validate them all at once and get the rows where this text column has non UTF-8 characters.
I tried below:
instr(text, chr(26)) > 0 - no records get fetched
text LIKE '%ó%' (tried this for a few invalid characters I noticed) - no records get fetched
update <table> set text = replace(text, 'ó', 'ó') - no change seen in text
Is there anything else I can do?
Appreciate your input.
This is Oracle 11.2
The characters you're seeing might be invalid for your data, but they are valid AL32UTF8 characters. Else they would not be displayed correctly. It's up to you to determine what character set contains the correct set of characters.
For example, to check if a string only contains characters in the US7ASCII character set, use the CONVERT function. Any character that cannot be converted into a valid US7ASCII character will be displayed as ?.
The example below first replaces the question marks with string '~~~~~', then converts and then checks for the existence of a question mark in the converted text.
WITH t (c) AS
(SELECT 'PP632485 - Hala A - prace kuchnia Zepelin, wymiana muszli, monta􀄪 tablic i uchwytów na r􀄊czniki, wymiana zamka systemowego' FROM DUAL UNION ALL
SELECT 'Just a bit of normal text' FROM DUAL UNION ALL
SELECT 'Question mark ?' FROM DUAL),
converted_t (c) AS
(
SELECT
CONVERT(
REPLACE(c,'?','~~~~~')
,'US7ASCII','AL32UTF8')
FROM t
)
SELECT CASE WHEN INSTR(c,'?') > 0 THEN 'Invalid' ELSE 'Valid' END as status, c
FROM converted_t
;
Invalid
PP632485 - Hala A - prace kuchnia Zepelin, wymiana muszli, montao??? tablic i uchwyt??w na ro??Sczniki, wymiana zamka systemowego
Valid
Just a bit of normal text
Valid
Question mark ~~~~~
Again, this is just an example - you might need a less restrictive character set.
--UPDATE--
With your data, it's up to you to determine how you want to continue. Decide what a good target character set is. Contrary to what I said earlier, it's not mandatory to pass the source character set argument to the CONVERT function.
Things you could try:
Check which characters show up as '�' when converting from UTF8 to AL32UTF8:
select * from G2178009_2020030114_dinllk
WHERE INSTR(CONVERT(text ,'AL32UTF8','UTF8'),'�') > 0;
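Check if the converted text matches the original text. In this example I'm converting to UTF8 and comparing against the original text. If the text contains characters that cannot be represented in the target character set, the converted text will differ from the original. The query below returns the rows that convert unchanged; use <> instead of = to list the rows that differ.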
select * from G2178009_2020030114_dinllk
WHERE
CONVERT(text ,'UTF8') = text;
This should be enough tools for you to diagnose your data issue.
As shown by previous comments, you can detect the issue in place, but it's difficult to automatically correct in place.
I have used https://pypi.org/project/ftfy/ to correct invalidly encoded characters in large files.
It guesses what the actual UTF8 character should be, and there are some controls on how it does this. For you, the problem is that you have to pull the data out, fix it, and put it back in.
So assuming you can get the data out to the file system to fix it, you can locate files with bad encodings with something like this:
find . -type f | xargs -I {} bash -c "iconv -f utf-8 -t utf-16 {} &>/dev/null || echo {}"
This produces a list of files that potentially need to be processed by ftfy.
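If you'd rather do the same check in Python once the data is exported, something like the following should work. It's only a sketch: the sample byte strings are invented, and it detects byte sequences that are not valid UTF-8 at all. Text that is valid UTF-8 but was wrongly decoded at some earlier point (mojibake such as 'ó' for 'ó') will pass this check, and that is the case where ftfy's guessing is needed.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical samples standing in for exported column values.
samples = {
    "good": "uchwytów".encode("utf-8"),
    # Latin-1 encodes 'ó' as the single byte 0xF3, which is not valid UTF-8 here.
    "bad": "uchwytów".encode("latin-1"),
}

flagged = [name for name, raw in samples.items() if not is_valid_utf8(raw)]
print(flagged)
```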

Extract Java comments from SQL statement in R

Trying to run a SQL statement in an RStudio environment, but I'm having difficulty extracting Java-style comments from the statement. I cannot edit the SQL statements / comments themselves, so trying a sequence of gsub to remove the unwanted special characters so I'm left with only the SQL statement in the R string.
I'm trying to use gsub to remove the special characters and the comment in between, but struggling to find the right regex to do so (especially one that does not read the division symbol in the SELECT statement as a part of the Java comment).
SELECT
id
, metric
, SUM(numerator)/SUM(denominator) AS rate
/*
This is an example of the comment.
I want to remove this. */
FROM table
WHERE id = 2
You can remove anything between /* and */ using this regex:
gsub(pattern = "/\\*[^*]*\\*/", replacement = "", x = text)
Result:
"SELECT\n id\n, metric\n, SUM(numerator)/SUM(denominator) AS rate\n\nFROM table\nWHERE id = 2"

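One caveat with the [^*]* pattern in the answer above is that it stops matching if the comment itself contains an asterisk. A lazy match across lines avoids that. Here is a minimal sketch of the idea in Python rather than R, since the regex is the interesting part. The variable names are my own, and it assumes no /* appears inside a SQL string literal.

```python
import re

sql = """SELECT
 id
, metric
, SUM(numerator)/SUM(denominator) AS rate
/*
This is an example of the comment.
I want to remove this. */
FROM table
WHERE id = 2"""

# Lazy match from "/*" to the first "*/"; DOTALL lets "." span newlines.
# The division slash is untouched because it is not followed by "*".
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)

cleaned = BLOCK_COMMENT.sub("", sql)
print(cleaned)
```

The same pattern should carry over to gsub with perl = TRUE as "(?s)/\\*.*?\\*/", though I haven't run that against your exact statements.
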
What is wrong with my Printf statement? (SYNTAX)

I am trying to use printf to line up some columns. I am not very familiar with the syntax of the printf command, so I will explain what I want it to do:
printf("%-15s %-15s %25s\n", Fname, Lname, Num)
I want to create 3 columns: one called "Fname", one called "Lname" and one called "Num".
I want the Fname column to be left-aligned and 15 characters wide, and the same for the Lname column. I don't care about alignment in the Num column, but I want it to be slightly wider at 25 characters.
This is my error
Syntax error near unexpected token `"%-15s %-15s %25s\n",'
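You don't need the parentheses around the arguments; in the shell, printf is a command, not a function call.
Also, I guess that Fname, Lname and Num are meant to be variables, so they need a $ prefix. Separate the arguments with spaces rather than commas (a comma would be printed as part of the value), and quote them so that values containing spaces stay in one column.
Correct syntax:
printf "%-15s %-15s %25s\n" "$Fname" "$Lname" "$Num"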

How to do a column name inside of a dynamic where clause? TO_NUMBER(column name)

I am currently trying to create a dynamic SELECT statement where the user can enter a varying number of criteria to search by.
Currently, I have every part of the statement working except for the most important part.
I am attempting to do something like this:
selStmt := 'SELECT column_one, column_2, column_3
FROM nerf';
whereClause := ' WHERE TO_NUMBER('''|| column_one ||''') <= '''|| userInput ||'''';
However, in doing this the WHERE clause of my SELECT statement is not accurate, as shown by my output line:
WHERE TO_NUMBER('') <= '5';
I have tried various solutions with quote marks and I end up with either an ORA-00905 missing identifier error, or an ORA-00911: invalid character error.
At this point I'm not quite sure how to approach this issue.
Thanks in advance for any help.
For some reason, Oracle uses the single quote both to delimit string literals and to escape a single quote inside them, so using '' within a string is an instruction to Oracle to put one literal quote in your string. Example:
'This is a string with a quote here: '' and then it ends normally'
will be represented as
This is a string with a quote here: ' and then it ends normally
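If you want to see the doubling rule in action without an Oracle instance at hand, the sketch below runs the same literal through SQLite from Python. That is an assumption on my part as a stand-in: SQLite follows the same standard SQL rule for doubled single quotes, but it is not Oracle, so treat it as an illustration of the rule only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The doubled '' inside the SQL literal becomes a single ' in the result.
value = conn.execute(
    "SELECT 'This is a string with a quote here: '' and then it ends normally'"
).fetchone()[0]
print(value)
```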
In your example, you are ending the WHERE clause you're building up and then concatenating a PL/SQL variable identifier called column_one:
' WHERE TO_NUMBER('''|| column_one ||''')
...and with a NULL value for the identifier column_one this is represented as
WHERE TO_NUMBER('')
Presumably you want to reference column_one from inside the query, and not from a PL/SQL variable of the same name, so you should remove the quotes around it like so:
whereClause := ' WHERE TO_NUMBER(column_one) <= TO_NUMBER('''|| userInput ||''')';
Escaping strings in Oracle is often infuriating; it helps a lot if you have a good IDE with decent syntax highlighting, like TOAD or Oracle SQL Developer.
This should work:
selStmt := 'SELECT column_one, column_2, column_3 FROM nerf';
whereClause := ' WHERE TO_NUMBER(column_one) <= TO_NUMBER('''|| userInput ||''')';

How to remove carriage returns in a text field in sqlite?

I have an sqlite database with over 400k records. I have just found that some of the text fields have carriage returns in them and I wanted to clean them out. I wanted to copy the structure of the original table and then do something like:
INSERT INTO large_table_copy
SELECT date, other_fields, replace(dirty_text_field,XXX,"")
FROM large_table
Where XXX is whatever the code would be for a carriage return. It's not \n. But I can't find out what it is.
SQLite lets you put line breaks inside string literals, like this:
SELECT replace(dirty_text_field, '
', '');
If you don't like this syntax, you can pass the string as a BLOB: X'0D' for \r or X'0A' for \n (assuming the default UTF-8 encoding).
Edit: Since this answer was originally written, SQLite has added a CHAR function. So you can now write CHAR(13) for \r or CHAR(10) for \n, which will work whether your database is encoded in UTF-8 or UTF-16.
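From @MarkCarter's comment on the question above, to replace each line feed with the literal two-character text \n (SQLite does not interpret backslash escapes in string literals):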
SELECT replace(dirty_text_field, X'0A', '\n');
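To check the CHAR(13) approach before running it over 400k rows, you could try it on a throwaway in-memory database first. This is a minimal sketch using Python's built-in sqlite3 module. The table and column names are borrowed from the question, and the sample row is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE large_table (dirty_text_field TEXT)")
conn.execute(
    "INSERT INTO large_table VALUES (?)",
    ("first line\r\nsecond line\r",),
)

# char(13) is the carriage return; line feeds (char(10)) are left alone.
cleaned = conn.execute(
    "SELECT replace(dirty_text_field, char(13), '') FROM large_table"
).fetchone()[0]
print(repr(cleaned))
```

If the line feeds should go as well, wrap the expression in a second replace(..., char(10), '').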
