Postgres's query to select value in array by index - r

My data are strings like:
'湯姆 is a boy.'
or '梅isagirl.'
or '約翰,is,a,boy.'.
I want to split each string and keep only the Chinese name.
In R, I can use the commands
tmp = strsplit(string, '[A-z% ]')
unlist(lapply(tmp, function(x) x[1]))
and get the Chinese name I want.
But in PostgreSQL,
select regexp_split_to_array(string,'[A-z% ]') from db.table
returns arrays like {'湯姆','','',''},{'梅','','',''},...
and I don't know how to pick an item out of the array.
I tried the command
select regexp_split_to_array(string,'[A-z% ]')[1] from db.table
and got an error.

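I don't think that regexp_split_to_array is the appropriate function for what you are trying to do here. (The error itself comes from the subscript: PostgreSQL requires parentheses around a function call before indexing its result, as in (regexp_split_to_array(string, '[A-z% ]'))[1].) Instead, use regexp_replace to selectively remove all ASCII characters: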
SELECT string, regexp_replace(string, '[[:ascii:]~:;,"]+', '', 'g') AS name
FROM yourTable;
Note that you might have to adjust the set of characters to be removed, depending on what other non-Chinese characters you expect to have in the string column. This answer gives you a general suggestion for how you might proceed here.
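To try the same idea outside the database, here is a minimal Python sketch. It assumes, like the answer above, that "everything that is not the name" can be approximated as "every ASCII character". The helper name strip_ascii and the sample strings are only for illustration.

```python
import re

# Assumption: non-Chinese text is approximated as ASCII (code points 0-127).
ASCII_RUN = re.compile(r"[\x00-\x7f]+")


def strip_ascii(text: str) -> str:
    """Remove every run of ASCII characters and keep whatever remains."""
    return ASCII_RUN.sub("", text)


samples = ["湯姆 is a boy.", "梅isagirl.", "約翰,is,a,boy."]
for sample in samples:
    print(strip_ascii(sample))
```

If the data contains full-width punctuation or other non-ASCII symbols, they would survive this filter and the character set would need to be extended.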

Related

SQLite replace() specified character and next character

I have a database with a Username column.
There are multiple section signs followed by numbers §# that format the name.
I have to make sure all names are unique, but I want to disregard the formatting character pairs.
I was going to use,
SELECT * FROM Users WHERE replace(lower(Username),'§%','') = 'name';
but I realized that would look for the percent sign and not act as a wildcard. I could really use some help.
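Use a combination of INSTR and SUBSTR to isolate the name before comparing it. INSTR searches for a literal string, not a pattern, so look for the section sign alone. This keeps only the text before the first §, so it applies when the formatting codes come after the name:
SELECT *
FROM Users
WHERE LOWER(SUBSTR(Username, 1, INSTR(Username, '§') - 1)) = 'name';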
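If the formatting pairs can appear anywhere in the name, one possible approach is to register a small helper function from application code and call it in the query. The sketch below uses Python's sqlite3 module. The function name strip_codes and the sample usernames are hypothetical, and it assumes each code is a section sign followed by exactly one character.

```python
import re
import sqlite3


def strip_codes(name):
    """Drop every section sign together with the single character after it."""
    return None if name is None else re.sub(r"§.", "", name)


conn = sqlite3.connect(":memory:")
# Register the hypothetical helper as a one-argument SQL function.
conn.create_function("strip_codes", 1, strip_codes)

conn.execute("CREATE TABLE Users (Username TEXT)")
conn.executemany(
    "INSERT INTO Users (Username) VALUES (?)",
    [("§4Na§2me",), ("§1Other",), ("name",)],
)

# Compare the cleaned, lower-cased name instead of the raw column.
rows = conn.execute(
    "SELECT Username FROM Users WHERE lower(strip_codes(Username)) = ?",
    ("name",),
).fetchall()
matches = {row[0] for row in rows}
print(matches)
```

Both "§4Na§2me" and "name" match here, which is the collision a uniqueness check would need to catch.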

SQLite: trim usage

My goal is to extract the domain out of given URL.
For that end I use the following:
select distinct ltrim(rtrim('https://www.youtube.com/watch?v=...', '/'), 'https://')
The result I get is:
www.youtube.com/watch?v=...
While the following is expected:
www.youtube.com
How can the above be achieved?
Note:
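I noticed that the trim function works differently than I expected.
select distinct ltrim('https://www.youtube.com/watch?v...', 'youtu') returns the same string without any change.
Trying to trim only the slash by select ltrim('https://www.youtube.com/watch?v...', '/') returns the same string as well.
Any explanations are welcome.
trim, ltrim, and rtrim treat their second argument as a set of characters, not as a prefix or suffix, and they only remove matching characters at the beginning and/or end of the string. They stop at the first character that is not in the set, which is why both calls above leave a string that starts with 'h' unchanged.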
You'll need substr and instr. (https://www.sqlite.org/lang_corefunc.html)
But the best option is probably to fix this in your code before saving it to the database.
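The sketch below runs both ideas against an in-memory SQLite database through Python's sqlite3 module: how ltrim treats its second argument, and how substr and instr can cut out the host. The sample URLs and the helper name scalar are assumptions for illustration, and the host query assumes the URL contains '//' followed later by a '/'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
url = "https://www.youtube.com/watch?v=abc"  # hypothetical sample URL


def scalar(sql, *params):
    """Run a query that yields one value and return that value."""
    return conn.execute(sql, params).fetchone()[0]


# ltrim removes leading characters from the set {h, t, p, s, :, /},
# so it also eats the "st" at the start of the host name.
over_trimmed = scalar("SELECT ltrim(?, 'https://')", "https://stackoverflow.com")
print(over_trimmed)

# Take everything after '//', then keep the part before the next '/'.
host = scalar(
    """
    SELECT substr(rest, 1, instr(rest, '/') - 1)
    FROM (SELECT substr(u, instr(u, '//') + 2) AS rest
          FROM (SELECT ? AS u))
    """,
    url,
)
print(host)
```

A URL with no path after the host makes instr return 0, so real data would need an extra check for that case.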
In the end I didn't use trim but substr, as suggested.
The following worked:
select replace(substr(substr(<url>, instr(<url>, '//')+2),0,instr(substr(<url>, instr(<url>, '//')+2),'/')),'.','')
select replace(substr(substr(<url>, instr(<url>, '//www.')+6),0,instr(substr(<url>, instr(<url>, '//www.')+6),'/')),'.','')

How to query Unicode characters from SQL Server 2008

With NVARCHAR data type, I store my local language text in a column. I face a problem how to query that value from the database.
ዜናገብርኤልስ is stored value.
I wrote SQL like this
select DivisionName
from t_Et_Divisions
where DivisionName = 'ዜናገብርኤልስ'
select unicode (DivisionName)
from t_Et_Divisions
where DivisionName = 'ዜናገብርኤልስ'
The above didn't work. Does anyone have any ideas how to fix it?
Thanks!
You need to prefix your Unicode string literals with an N:
select DivisionName
from t_Et_Divisions
where DivisionName = N'ዜናገብርኤልስ'
This N prefix tells SQL Server to treat this string literal as a Unicode string and not convert it to a non-Unicode string (as it will if you omit the N prefix).
Update:
I still fail to understand what is not working for you.
I tried setting up a table with an NVARCHAR column, and if I select, I get back that one, exact row match - as expected:
DECLARE @test TABLE (DivisionName NVARCHAR(100))
INSERT INTO @test (DivisionName)
VALUES (N'ዜናገብርኤልስ'), (N'ዜናገብርኤልስ,ኔትዎርክ,ከስተመር ስርቪስ'), (N'ኔትዎርክ,ከስተመር ስርቪስ')
SELECT *
FROM @test
WHERE DivisionName = N'ዜናገብርኤልስ'
This returns exactly one row. What else are you seeing, or what else are you expecting?
Update #2:
Ah, I see: the column contains multiple comma-separated values, which is a horrible design mistake to begin with (it violates first normal form of database design, so don't do it!).
And then you want to select all rows that contain that search term, but only display the search term itself, not the whole DivisionName column? That seems rather pointless, but try this:
select N'ዜናገብርኤልስ'
from t_Et_Divisions
where DivisionName LIKE N'%ዜናገብርኤልስ%'
The LIKE searches for rows that contain that value, and since you already know what you want to display, just put that value into the SELECT list.

SQLite: How to select part of string?

There is table column containing file names: image1.jpg, image12.png, script.php, .htaccess,...
I need to select the file extensions only. I would prefer to do it this way:
SELECT DISTINCT SUBSTR(column,INSTR('.',column)+1) FROM table
but INSTR isn't supported in my version of SQLite.
Is there a way to do it without using the INSTR function?
The query below (tested and verified) selects only the file extensions.
It still works when a filename contains any number of . characters:
select distinct replace(column_name, rtrim(column_name,
replace(column_name, '.', '' ) ), '') from table_name;
column_name is the name of the column that holds the file names (filenames can have multiple .'s)
table_name is the name of your table
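To see why the replace/rtrim combination works, here is a small sketch that runs the same expression through Python's sqlite3 module. The helper name extension and the sample filenames are assumptions for illustration. The inner replace builds the set of all non-dot characters, rtrim strips those from the right until it hits the last dot, and the outer replace removes that remaining prefix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def extension(filename):
    """Apply the replace/rtrim expression from the answer to one filename."""
    sql = """
        SELECT replace(name, rtrim(name, replace(name, '.', '')), '')
        FROM (SELECT ? AS name)
    """
    return conn.execute(sql, (filename,)).fetchone()[0]


for sample in ["image1.jpg", "image12.png", "archive.tar.gz", ".htaccess"]:
    print(sample, "->", extension(sample))
```

Note that the outer replace removes every occurrence of the prefix, not just the leading one, so unusual names in which the prefix repeats inside the extension could give a shorter result than expected.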
Try the ltrim(X, Y) function. This is what the documentation says:
The ltrim(X,Y) function returns a string formed by removing any and all characters that appear in Y from the left side of X.
List all the alphabet as the second argument, something like
SELECT ltrim(column, "abcd...xyz1234567890") From T
That should remove all the characters from the left up to the dot (.). If you need the extension without the dot, use SUBSTR on the result. Of course, this means that filenames may not contain more than one dot.
But I think it is way easier and safer to extract the extension in the code which executes the query.

String Aggregation in sqlite

Does anyone know if string aggregation is possible in SQLite?
If I have an animal column with 5 rows of data, how can I combine them so that the output is in one field?
'dog','cat','rat','mice','mouse' as animals
Thanks
You're looking for something like the following:
select group_concat(animal) from animals;
This will return something like the following:
dog,cat,rat,mice,mouse
If you don't want to use a comma as the separator, you can add your own separator as a second parameter:
select group_concat(animal, '_') from animals;
which will return:
dog_cat_rat_mice_mouse
I think this will be useful:
group_concat(X)
group_concat(X,Y)
The group_concat() function returns a string which is the concatenation of all non-NULL values of X. If parameter Y is present then it is used as the separator between instances of X. A comma (",") is used as the separator if Y is omitted. The order of the concatenated elements is arbitrary.
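The behaviour described above can be checked with a short sketch using Python's sqlite3 module and an in-memory table. The table contents, including the extra NULL row, are assumptions for illustration. The last query adds SQLite's quote() function as one possible way to get the quoted form shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (animal TEXT)")
conn.executemany(
    "INSERT INTO animals (animal) VALUES (?)",
    [("dog",), ("cat",), ("rat",), ("mice",), ("mouse",), (None,)],
)


def scalar(sql):
    """Run a query that yields one value and return that value."""
    return conn.execute(sql).fetchone()[0]


# Default separator is a comma; the NULL row is skipped.
default_sep = scalar("SELECT group_concat(animal) FROM animals")

# A custom separator is passed as the second argument.
custom_sep = scalar("SELECT group_concat(animal, '_') FROM animals")

# quote() wraps each value in single quotes before aggregation.
quoted = scalar(
    "SELECT group_concat(quote(animal)) FROM animals WHERE animal IS NOT NULL"
)

print(default_sep)
print(custom_sep)
print(quoted)
```

Because the order of the concatenated elements is arbitrary, code that depends on the result should not assume the values come back in insertion order.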
