I'm using SQLite to work with a large amount of data (around 100 GB).
I need to look up the values of one column in another table as fast as possible.
For example, I need to take the following values from Table 1:
[COD_1]
C62
K801
And then find them in Table 2:
[COD_2]
C60-C63
K80-K81
My desired result is something like:
[COD_1] [COD_2]
C62 C60-C63
K801 K80-K81
Since I have a lot of data, it is inefficient to do something like:
SELECT *
FROM TABLE_1, TABLE_2
WHERE COD_1 LIKE '%' || COD_2 || '%';
Instead, I was trying to do this:
SELECT *
FROM TABLE_1
WHERE COD_1 IN (SELECT COD_2 FROM TABLE_2);
Of course, this returns nothing because the codes are not exactly the same. Is there a way to search for similar values of one column (something like the LIKE operator) in another table by using IN? Or some other way that doesn't cross join TABLE_1 and TABLE_2?
Thank you!!!
Based on the small data set shown, and my presumed answer to @Shawn's question (K801 is a typo and is meant to be K80 or K81), I assume the following problem description:
Find a row in COD_2 such that the value in COD_1 is between {value1}-{value2} in COD_2; the - being significant and dependable.
I cannot speak to speed, but I would approach it this way:
SELECT value1, value2
from COD_1,COD_2
where value1 between substr(value2,1,instr(value2,'-')-1) and substr(value2,instr(value2,'-')+1)
The thought being: split the value from COD_2 into a "start" and an "end" value.
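A minimal sketch of that split, run through Python's sqlite3 module so it can be tried directly. It assumes the table and column names from the question (TABLE_1.COD_1 and TABLE_2.COD_2) rather than the names in the query above, and it relies on plain text comparison:

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE_1 (COD_1 TEXT);
    CREATE TABLE TABLE_2 (COD_2 TEXT);
    INSERT INTO TABLE_1 VALUES ('C62'), ('K801');
    INSERT INTO TABLE_2 VALUES ('C60-C63'), ('K80-K81');
""")

# Split each range on '-' and keep the codes that sort between both ends.
query = """
    SELECT t1.COD_1, t2.COD_2
    FROM TABLE_1 AS t1
    JOIN TABLE_2 AS t2
      ON t1.COD_1 BETWEEN substr(t2.COD_2, 1, instr(t2.COD_2, '-') - 1)
                      AND substr(t2.COD_2, instr(t2.COD_2, '-') + 1)
    ORDER BY t1.COD_1
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Because the comparison is textual, 'K801' sorts between 'K80' and 'K81' and is matched, but a code such as 'K819' sorts after 'K81' and would be missed, so the upper bound may need adjusting for real data.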
My database has a table with columns genus, species, and inatcode. inatcode is blank for every row where it hasn't been added manually. I imported a new table that contains all the codes and want to create an UPDATE query that copies them to the corresponding rows of the first table. However, because the species column of my first table has additional strings, the match is not perfect and many rows were not updated. table.a.species might look like:
x bimundorum
vesicula (sexgen)
sinuata breviloba
And I want it to match these values in table.b.species:
bimundorum
vesicula
sinuata
I know to use table.a.species LIKE '%table.b.species%' when b is a substring of a, but this is the opposite case and just flipping (shown below) doesn't seem to work. Is there another way to accomplish this in SQLite? The differences between a and b are heterogeneous, but there are only a few cases and I could potentially do multiple queries to account for each.
"UPDATE table.a SET inatcode = table.b.inatcode
FROM table.b
WHERE table.b.genus = table.a.genus
AND table.a.species LIKE '%table.b.species%' "
I am working with SQLite through the DBI package in R, and could make all of this happen in R and reinsert instead. But it seems like I should be able to do this in SQLite.
The modified formatting posted here does what I wanted; I just needed to use the || operator to concatenate the values with the % wildcard so that the LIKE operator compares against the column value instead of a literal string.
"UPDATE table.a SET inatcode = table.b.inatcode
FROM table.b
WHERE table.b.genus = table.a.genus
AND table.a.species LIKE '%' || table.b.species || '%'
AND table.b.species != '' "
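For anyone who wants to try the pattern outside R, here is a rough sketch using Python's sqlite3 module. The table names table_a and table_b, the genus values, and the codes are made up for illustration. It uses a correlated subquery instead of UPDATE ... FROM, since UPDATE ... FROM needs SQLite 3.33.0 or later:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (genus TEXT, species TEXT, inatcode TEXT);
    CREATE TABLE table_b (genus TEXT, species TEXT, inatcode TEXT);
    INSERT INTO table_a VALUES
        ('g1', 'x bimundorum', ''),
        ('g1', 'vesicula (sexgen)', ''),
        ('g2', 'sinuata breviloba', ''),
        ('g2', 'unmatched', '');
    INSERT INTO table_b VALUES
        ('g1', 'bimundorum', 'code-1'),
        ('g1', 'vesicula', 'code-2'),
        ('g2', 'sinuata', 'code-3'),
        ('g2', '', 'code-4');
""")

# Correlated subquery: find a table_b row whose species is a substring of
# the table_a species within the same genus; empty species are skipped.
match = """
    SELECT b.inatcode FROM table_b AS b
    WHERE b.genus = table_a.genus
      AND b.species != ''
      AND table_a.species LIKE '%' || b.species || '%'
    LIMIT 1
"""
# WHERE EXISTS keeps rows without any match untouched.
conn.execute(f"UPDATE table_a SET inatcode = ({match}) WHERE EXISTS ({match})")

codes = dict(conn.execute("SELECT species, inatcode FROM table_a"))
print(codes)
```

Note that any % or _ characters inside table_b.species would act as LIKE wildcards, so this is only safe when the species names are plain text.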
I am pretty sure you guys have a simple and fast solution to my problem, but my SQL skills are limited and I can't figure it out by myself.
So what I have is something like:
Newsgroup: [Name(PK)]
Article: [Id(PK)] [newsgroupName (FK)] [date:Date] [read:Boolean]
What I want is a query that gives me the Name of each Newsgroup along with the count of unread articles, the count of all articles, and the date of the most recent one...
Is this even possible to achieve in a single SELECT query?
Thank you guys in advance!
You can simply use the appropriate aggregation functions:
SELECT newsgroupName,
SUM(NOT read) AS countUnread,
COUNT(*) AS countAll,
MAX(date) AS mostRecentDate
FROM Article
GROUP BY newsgroupName;
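A small sketch to check this against sample data, using Python's sqlite3 module. The rows are invented, and it assumes that read is stored as 0/1 and date as ISO-8601 text so that MAX() orders chronologically:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Article (
        Id INTEGER PRIMARY KEY,
        newsgroupName TEXT,
        date TEXT,      -- ISO-8601 text, assumed for this sketch
        read INTEGER    -- 0 = unread, 1 = read
    );
    INSERT INTO Article (newsgroupName, date, read) VALUES
        ('comp.lang', '2020-01-01', 0),
        ('comp.lang', '2020-03-01', 1),
        ('comp.lang', '2020-02-01', 0),
        ('rec.misc',  '2019-12-31', 1);
""")

# NOT read is 1 for unread rows and 0 for read rows, so SUM counts unread.
query = """
    SELECT newsgroupName,
           SUM(NOT read) AS countUnread,
           COUNT(*)      AS countAll,
           MAX(date)     AS mostRecentDate
    FROM Article
    GROUP BY newsgroupName
    ORDER BY newsgroupName
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Newsgroups without any articles do not show up in this result; listing them too would need a LEFT JOIN starting from the Newsgroup table.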
I guess something like this:
-- The outer query uses SUM, not COUNT: COUNT would only count the rows of the UNION.
SELECT name,
       SUM(countUnread) AS countUnread,
       SUM(countRead) AS countRead,
       MAX(mostRecentDateUnread) AS mostRecentDateUnread,
       MAX(mostRecentDateRead) AS mostRecentDateRead
FROM (
    SELECT newsgroupName AS name, COUNT(newsgroupName) AS countUnread, 0 AS countRead, MAX(date) AS mostRecentDateUnread, NULL AS mostRecentDateRead
    FROM Article
    WHERE read = 0
    GROUP BY newsgroupName, read
    UNION ALL
    SELECT newsgroupName AS name, 0 AS countUnread, COUNT(newsgroupName) AS countRead, NULL AS mostRecentDateUnread, MAX(date) AS mostRecentDateRead
    FROM Article
    WHERE read = 1
    GROUP BY newsgroupName, read
)
GROUP BY name
I haven't tried it, but in theory it could work with some fixes.
I have a SQLite table where one column contains a JSON array containing 0 or more values. Something like this:
id|values
0 |[1,2,3]
1 |[]
2 |[2,3,4]
3 |[2]
What I want to do is "unfold" this into a list of all distinct values contained within the arrays of that column.
To start, I am using the JSON1 extension's json_each function to extract a table of values from a row:
SELECT
    value
FROM
    json_each(
        (
            SELECT
                "values"
            FROM
                my_table
            WHERE
                id == 2
        )
    )
Where I can vary the id (2, above) to select any row in the table.
Now, I am trying to wrap this in a recursive CTE so that I can apply it to each row across the entire table and union the results. As a first step I replicated (roughly) the results from above as follows:
WITH RECURSIVE result AS (
    SELECT null
    UNION ALL
    SELECT
        value
    FROM
        json_each(
            (
                SELECT
                    "values"
                FROM
                    my_table
                WHERE
                    id == 2
            )
        )
)
SELECT * FROM result;
As the next step I had originally planned to make id a variable and increment it (in a similar manner to the first example in the documentation), but I haven't been able to get that to work.
I have gone through the other examples in the documentation, but they are somewhat more complex and I haven't been able to distill those down to see how they might apply to this problem.
Can someone provide a simple example of how to solve this (or a similar problem) with a recursive CTE?
Of course, my goal is to solve the problem with or without CTEs, so I'm also happy to hear if there is a better way...
You do not need a recursive CTE for this.
To call json_each for multiple source rows, use a join:
SELECT t1.id, t2.value
FROM my_table AS t1
JOIN json_each((SELECT "values" FROM my_table WHERE id = t1.id)) AS t2;
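To get the distinct list the question asks for, a sketch along these lines should do, assuming the JSON functions are available in your SQLite build. It passes the column to json_each directly instead of going through the scalar subquery, and uses Python's sqlite3 module only so it can be run as is:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (id INTEGER PRIMARY KEY, "values" TEXT);
    INSERT INTO my_table VALUES
        (0, '[1,2,3]'), (1, '[]'), (2, '[2,3,4]'), (3, '[2]');
""")

# json_each is evaluated once per row of my_table; DISTINCT removes repeats.
# "values" is quoted because VALUES is an SQL keyword.
query = """
    SELECT DISTINCT j.value
    FROM my_table AS t, json_each(t."values") AS j
    ORDER BY j.value
"""
distinct_values = [row[0] for row in conn.execute(query)]
print(distinct_values)
```

Rows with an empty array simply contribute nothing to the join, so they need no special handling.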
Kindly review this simple SQLite fiddle.
If any word in the column ITEM of table TWO matches any word in the column NAME_2 of table ONE, there should be a result(s).
I tried many permutations, yet I am not sure how to form the query to achieve this. Should the GLOB clause be used instead of LIKE, or perhaps some other wildcard pattern?
Many thanks in advance!
As per my comment, I think you can make use of instr as well. For example:
SELECT * FROM Table1 T1, Table2 T2 where instr(T2.NAME_2,T1.ITEM);
The above should return rows where T2.NAME_2 contains the string in T1.ITEM.
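A quick sketch of the instr approach, assuming the layout described in the question (table ONE with column NAME_2, table TWO with column ITEM) and some invented sample rows, run through Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ONE (NAME_2 TEXT);
    CREATE TABLE TWO (ITEM TEXT);
    INSERT INTO ONE VALUES ('red apple pie'), ('green tea');
    INSERT INTO TWO VALUES ('apple'), ('tea'), ('coffee');
""")

# instr(haystack, needle) returns the 1-based position of needle, or 0.
query = """
    SELECT o.NAME_2, t.ITEM
    FROM ONE AS o
    JOIN TWO AS t ON instr(o.NAME_2, t.ITEM) > 0
    ORDER BY o.NAME_2
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Keep in mind that instr is a case-sensitive substring test, not a word match, so 'tea' would also match a name such as 'steam'.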
I need to select v_col1 from table_x; that column gives me a string that I need to put (update) into the same rowid but into a different column (h_col2) in the same table, table_x. Sorry, it seems easy, but I am a beginner.
tabl_x
rowid V_col1, h_col2 etc .....
1 672637263 GVRT1898
2 384738477 GVRT1876
3 263237863 GVRT1832
As in this example, I need to put GVRT1898 (update) in place of 672637263, and I need to go through every row in this table_x and fix it the same way; for the next line, rowid 2 would get GVRT1876 instead of 384738477 :-)
This table has 40000 lines like this and I need to loop over every rowid.
Thanks for your response, Justin. This is a little more complex: I have this string in h_col and need to take only the GVRT number out and put it into v_col, but it's hard because the GVRT number is in various places in the column; see below.
"E_ID"=X:"GVRT1878","RCode"=X:"156000","Month"=d:1,"Activate"=d:5,"Disp_Id"=X:"4673498","Tar"=X:"171758021";
2"E_ID"=X:"561001760","RCode"=X:"156000","Month"=d:1,"Activate"=d:5,"Disp_Id"=X:"GVRT1898","Tar"=X:"171758021";
The h_col column has the value that I want, but in various places: in this 600-byte column it is sometimes at byte 156 and sometimes at byte 287. The only constant is the "GVRT...." prefix. How can I take that string and put it into v_col?
Can you show me how to write such SQL or PL/SQL?
regards & thanks
It sounds like you just want
UPDATE tabl_x
SET h_col2 = v_col1
Of course, if you do something like this, that implies that one of the two columns should be dropped or the data model needs to get fixed. Having two copies of the same data in each row is a bad idea from a normalization standpoint if nothing else.
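For the follow-up case, where the GVRT code sits somewhere inside a longer string, the same single UPDATE still works if the assignment pulls the token out with a pattern. The question is about Oracle, where REGEXP_SUBSTR with a pattern like 'GVRT[0-9]+' would probably be the natural tool. The sketch below only illustrates the idea with SQLite from Python, using a hypothetical helper function extract_gvrt and shortened sample strings:

```python
import re
import sqlite3

def extract_gvrt(text):
    """Return the first GVRT<digits> token in text, or None if absent."""
    found = re.search(r"GVRT\d+", text or "")
    return found.group(0) if found else None

conn = sqlite3.connect(":memory:")
# Register the hypothetical helper so SQL statements can call it.
conn.create_function("extract_gvrt", 1, extract_gvrt)
conn.execute("CREATE TABLE tabl_x (v_col1 TEXT, h_col2 TEXT)")
conn.executemany(
    "INSERT INTO tabl_x VALUES (?, ?)",
    [
        ("672637263", '"E_ID"=X:"GVRT1878","RCode"=X:"156000"'),
        ("384738477", '"E_ID"=X:"561001760","Disp_Id"=X:"GVRT1898"'),
        ("263237863", '"E_ID"=X:"561001760","RCode"=X:"156000"'),
    ],
)

# One set-based UPDATE; rows without a GVRT token are left unchanged.
conn.execute("""
    UPDATE tabl_x
    SET v_col1 = extract_gvrt(h_col2)
    WHERE extract_gvrt(h_col2) IS NOT NULL
""")
result = [row[0] for row in conn.execute("SELECT v_col1 FROM tabl_x ORDER BY rowid")]
print(result)
```

Either way, there is no need to loop over the 40000 rows one rowid at a time; a single UPDATE statement touches every row.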