SQLite query a range based on column value - sqlite

I have a project in SQLite which contains a table named Prefix:
prefix   id
T6A-T6Z  1
YAA-YAZ  2
ZAA-ZAZ  3
7RA-7RZ  4
7TA-7YZ  5
For example, I have the value “T6C”, which falls in the range of the first record, and I need the id of that record. I looked into REGEXP as a possible solution, but from what I read it needs a callback function. This app is being developed in Adobe AIR and I could not find a way to implement the callback.
I also tried the wildcard '_' approach but came up short on that.
Any help would be great.

This is a lousy data format. You should really have the beginning and ending values in separate fields. Oh well. You can do this with string manipulations:
select *
from prefix
where 'T6C' between substr(prefix, 1, 3) and substr(prefix, -3, 3)
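The query above can be tried out with a minimal sketch like the following, which uses Python's built-in sqlite3 module purely as a test harness (the original app runs in Adobe AIR). The helper name find_prefix_id and the in-memory database are assumptions made for illustration; the table contents are the five rows from the question.

```python
import sqlite3


def find_prefix_id(conn, value):
    """Return the id of the row whose prefix range contains value, or None."""
    row = conn.execute(
        "SELECT id FROM prefix "
        "WHERE ? BETWEEN substr(prefix, 1, 3) AND substr(prefix, -3, 3)",
        (value,),
    ).fetchone()
    return row[0] if row else None


# Hypothetical in-memory copy of the table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prefix (prefix TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO prefix VALUES (?, ?)",
    [
        ("T6A-T6Z", 1),
        ("YAA-YAZ", 2),
        ("ZAA-ZAZ", 3),
        ("7RA-7RZ", 4),
        ("7TA-7YZ", 5),
    ],
)

print(find_prefix_id(conn, "T6C"))
```

Because the comparison is a plain string comparison, this only works while every range endpoint is exactly three characters long, as in the sample data.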

If your prefix ranges followed the simple pattern of your first four (1-4) example ids, it would be a simple LIKE query using the first two characters of the beginning of your prefix range:
SELECT id FROM prefix WHERE prefix LIKE 'T6%';
But your fifth (5) example id has a prefix range that extends beyond the -7TZ ending that the convention of the other four would predict. If you have design control of the prefix ranges, then LIKE is another alternative.

Related

Perform operation on all substrings of a string in SQL (MariaDB)

Disclaimer: This is not a database administration or design question. I did not design this database and I do not have rights to change it.
I have a database in which many fields are compound. For example, a single column is used for acre usage for a district. Many districts have one primary crop and the value is a single number, such as 14. Some have two primary crops and it has two numbers separated by a comma like "14,8". Some have three, four, or even five primary crops resulting in a compound value like "14,8,7,4,3".
I am pulling data out of this database for analytical research. Right now, I am pulling columns like that into R, splitting them into 5 values (padding nulls if there aren't 5 values), and performing work on the values. I want to do it in the database itself. I want to split the value on the comma, perform an operation on the resulting values, and then concatenate the result of the operation back into the original column format.
For example, I have a column that is in acres. I want it in square meters. So, I want to take "14,8", temporarily turn it into 14 and 8, multiply each of those by 4046.86, and get "56656.04,32374.88" as my result. What I am currently doing is using regexp_replace. I start with all rows where "acres REGEXP '^[0-9.]+,[0-9.]+,[0-9.]+,[0-9.]+,[0-9.]+$'" for the where clause. That gives me rows with 5 numbers in the field. Then, I can do the first number with "cast(regexp_replace(acres,',.*$','') as float) * 4046.86". I can do each of the 5 using a different regexp_replace. I can concatenate those values back together. Then, I run a query for those with 4 numbers, then 3, then 2, and finally the single number rows.
Is this possible as a single query?
Use a function to parse the string and convert it to the desired result. This will allow you to use a single query for the job.
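As a rough sketch of the logic such a function would carry out, here is the split, multiply, and rejoin step written in Python for illustration (it is not a MariaDB stored function, and the name acres_to_square_meters is hypothetical). It handles any number of comma-separated values, so the one-query-per-count approach is not needed. Rounding to two decimals is an assumption based on the sample output in the question.

```python
ACRE_TO_SQ_M = 4046.86  # conversion factor quoted in the question


def acres_to_square_meters(compound):
    """Convert a compound value such as "14,8" from acres to square meters."""
    parts = compound.split(",")
    # Convert each piece and format it with two decimals (an assumption).
    converted = [f"{float(part) * ACRE_TO_SQ_M:.2f}" for part in parts]
    return ",".join(converted)


print(acres_to_square_meters("14,8"))
```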

Align Text To Row Identified by Number and to ID Matching that Embedded in String

I need to align text in an ALERT STRING column with the row identified by number in an ID ROW column.
Additionally, I need to also align the same ALERT STRING text with the same ID ROW number AND with the ID matching that embedded in a string in the TEXT WITH ID column. (This double-check will sometimes be necessary with the real-world data.)
So far, I've only figured out how to align the ALERT STRING with the ID matching that embedded in the TEXT WITH ID column:
=LOOKUP(2,1/SEARCH(A2,$F$2:$F$11),$G$2:$G$11)
I appreciate any help folks can offer. You can find an editable copy of the workbook here:
https://1drv.ms/x/s!ArQ7Kw6ayNMY2zktTW3pDCbMmJZ_
UPDATE: Nayan provided a solution to the first part of this question (please see answer below). I'm still trying to work out a formula for the column D part of this question, in which the row reference shown in column E is combined with a match of the ID shown in column A with its corresponding value in one of the text strings in column F.
The best I've been able to come up with so far is a formula with a high failure rate:
=INDEX($G$2:$G$11,MATCH(ROW(D2),$E$2:$E$11,MATCH("*"&A2&"*",$F$2:$F$11,0)))
Any help with this part of the question will be greatly appreciated.
ROW([reference])
Returns the row number of a reference
E.g., ROW(B2) returns 2. If no reference is provided, ROW() returns the row number of the cell in which it is called.
VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])
Looks for a value in the leftmost column of a table, and then returns a value in the same row from a column you specify (col_index_num)
By default, the table must be sorted in ascending order.
Try this:
=VLOOKUP(ROW(B2),$E$2:$G$11,3,FALSE)
INDEX(array, row_num, [column_num])
INDEX(reference, row_num, [column_num], [area_num])
Returns a value or reference of the cell at the intersection of a particular row and column, in a given range.
In this case, you have to get row_num with the MATCH function.
MATCH(lookup_value, lookup_array, [match_type])
Returns a relative position of an item in an array that matches a specified value in a specified order.
match_type: 1 (largest value less than or equal to lookup_value), 0 (exact match), -1 (smallest value greater than or equal to lookup_value)
Try this:
=INDEX($G$2:$G$11,MATCH(ROW(B2),$E$2:$E$11,0))
Identify Data with Multiple Criteria Condition using MATCH()
=INDEX($G$2:$G$11,MATCH(1, (ROW(D2) = $E$2:$E$11) * (ISNUMBER(SEARCH(A2, $F$2:$F$11))),0))
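For readers who find the array formula hard to follow, the same two-criteria lookup can be sketched in Python. This only mirrors the logic (row number equals the ID ROW value, and the ID appears somewhere in the TEXT WITH ID string, compared case-insensitively as SEARCH does); the function name match_alert and the sample rows are hypothetical and do not come from the workbook.

```python
def match_alert(current_row, item_id, rows):
    """Return the alert for the first row meeting both criteria, else None.

    rows is a list of (id_row, text_with_id, alert_string) tuples, standing in
    for columns E, F and G of the worksheet.
    """
    for id_row, text_with_id, alert_string in rows:
        same_row = id_row == current_row                      # ROW(D2) = E2:E11
        id_in_text = item_id.lower() in text_with_id.lower()  # SEARCH(A2, F2:F11)
        if same_row and id_in_text:
            return alert_string
    return None  # Excel would show #N/A here


# Hypothetical sample data.
rows = [
    (2, "Order AB12 shipped", "Alert one"),
    (3, "Order CD34 delayed", "Alert two"),
    (3, "Order EF56 lost", "Alert three"),
]
print(match_alert(3, "ef56", rows))
```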
References:
https://exceljet.net/excel-functions/excel-vlookup-function
https://exceljet.net/excel-functions/excel-index-function
https://exceljet.net/formula/index-and-match-with-multiple-criteria
This is the formula I was looking for in column D:
=INDEX($G$2:$G$11,MATCH(ROW(D2)&"*"&A2&"*",INDEX($E$2:$E$11&$F$2:$F$11,),0))
You can see it working here.
Nayan provided a great deal of help with answering this question, so I will mark his answer as the accepted solution.
Syeda Fahima Nazreen provided the example I referenced to figure out the formula shown above.
Reference:
Nested Excel Formula with Two INDEX Functions and a MATCH Function with Multiple Criteria

Can sqlite-utils convert function select two columns?

I'm using sqlite-utils to load a csv into sqlite which will later be served via Datasette. I have two columns, likes and dislikes. I would like to have a third column, quality-score, by adding likes and dislikes together then dividing likes by the total.
The sqlite-utils convert function should be my best bet, but all I see in the documentation is how to select a single column for conversion.
sqlite-utils convert content.db articles headline 'value.upper()'
From the example given, it looks like convert is followed by the db filename, the table name, then the col you want to operate on. Is it possible to simply add another col name or is there a flag for selecting more than one column to operate on? I would be really surprised if this wasn't possible, I just can't find any documentation to support it.
This isn't a perfect answer as it doesn't resolve whether sqlite-utils supports multiple column selection for transforms, but this is how I solved this particular problem.
Since my quality_score column would just be basic math, I was able to make use of sqlite's Generated Columns. I created a file called quality_score.sql that contained:
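ALTER TABLE testtable
-- 1.0 * forces floating-point division; with integer columns, likes / (likes + dislikes) truncates to 0 or 1
ADD COLUMN quality_score GENERATED ALWAYS AS (1.0 * likes / (likes + dislikes));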
and then implemented it by:
$ sqlite3 mydb.db < quality_score.sql
You do need to make sure you are using a compatible version of sqlite, as this only works with version 3.31 or later.
Another consideration is to make sure you are performing math on integers or floats and not text.
I also attempted to create the table with the virtual generated column first and then fill it with my data later, but that didn't work in my case: it threw an error saying that the number of items provided didn't match the number of columns available. So I just stuck with the ALTER operation after the fact.
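A minimal way to check the generated-column approach without touching a real database is sketched below, using Python's built-in sqlite3 module and an in-memory table. The sample rows are made up, and two details are assumptions added here rather than part of the answer: the explicit REAL type with the VIRTUAL keyword, and the 1.0 * factor, which avoids integer division when both columns hold integers. It needs an SQLite library of version 3.31 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testtable (likes INTEGER, dislikes INTEGER)")
# Hypothetical sample rows loaded before the column is added.
conn.executemany("INSERT INTO testtable VALUES (?, ?)", [(3, 1), (0, 5)])

# Add the virtual generated column after the data is in place.
conn.execute(
    "ALTER TABLE testtable ADD COLUMN quality_score REAL "
    "GENERATED ALWAYS AS (1.0 * likes / (likes + dislikes)) VIRTUAL"
)

scores = [
    row[0]
    for row in conn.execute("SELECT quality_score FROM testtable ORDER BY rowid")
]
print(scores)
```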

How to Add Column (script) transform that queries another column for content

I’m looking for a simple expression that puts a ‘1’ in column E if ‘SomeContent’ is contained in column D. I’m doing this in Azure ML Workbench through their Add Column (script) function. Here are some examples they give:
row.ColumnA + row.ColumnB is the same as row["ColumnA"] + row["ColumnB"]
1 if row.ColumnA < 4 else 2
datetime.datetime.now()
float(row.ColumnA) / float(row.ColumnB - 1)
'Bad' if pd.isnull(row.ColumnA) else 'Good'
Any ideas on a 1 line script I could use for this? Thanks
Without really knowing what you want to look for in column 'D', I still think you can find all the information you need in the examples they give.
The script is being wrapped by a function that collects the value you calculate/provide and puts it in the new column. This assignment happens for each row individually. The value could be a static value, an arbitrary calculation, or it could be dependent on the values in the other columns for the specific row.
In the "Hint" section, you can see two different ways of obtaining the values from the other columns:
The current row is referenced using 'row' and then a column qualifier, for example row.colname or row['colname'].
In your case, you obtain the value for column 'D' either by row.D or row['D']
After that, all you need to do is come up with the specific logic for checking whether 'SomeContent' is contained in column 'D' for that specific row. In your case, the '1 line script' would look something like this:
1 if [logic ensuring 'SomeContent' is contained in row.D] else 0
If you need help with the logic, you need to provide more specific examples.
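If a plain substring test is what is wanted, one possible form of that expression is 1 if 'SomeContent' in str(row.D) else 0. The sketch below wraps it in a hypothetical function named flag and uses SimpleNamespace as a stand-in for the row object that Workbench supplies, so it can be run outside the tool. The str() call is an assumption made here so that missing or non-text values do not raise an error.

```python
from types import SimpleNamespace


def flag(row):
    """Return 1 if 'SomeContent' occurs anywhere in column D of this row, else 0."""
    return 1 if 'SomeContent' in str(row.D) else 0


# Stand-in rows; in Workbench, `row` is provided by the transform itself.
print(flag(SimpleNamespace(D="text with SomeContent inside")))
print(flag(SimpleNamespace(D="unrelated text")))
```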
You can read more in the Azure Machine Learning Documentation:
Sample of custom column transforms (Python)
Data Preparations Python extensions
Hope this helps

How do I divide a SQL variable into 2

I have a field in SQL named address, which is 80 characters long.
I want to put this field into two fields, addr1 and addr2, of 40 characters each.
How do I do it?
This is for T-SQL, but it can't be much different for PL/SQL:
declare @yourVar varchar(80)
select substring(@yourVar, 1, 40), substring(@yourVar, 41, 40)
For PL/SQL, it's substr(), so: select substr(addr, 1, 40) as addr1, substr(addr, 41) as addr2 from ...
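Since these substring functions count from 1, the second half has to start at position 41 so that character 40 is not repeated. A quick way to see this is the sketch below, which uses SQLite's substr() through Python's sqlite3 module as a stand-in (an assumption for convenience; SQLite's substr() is 1-based in the same way as the T-SQL and PL/SQL functions above). The 80-character test value is made up.

```python
import sqlite3

# Hypothetical 80-character value: 40 A's followed by 40 B's.
address = "A" * 40 + "B" * 40

conn = sqlite3.connect(":memory:")
addr1, addr2 = conn.execute(
    "SELECT substr(?, 1, 40), substr(?, 41, 40)", (address, address)
).fetchone()

print(addr1)
print(addr2)
```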
I think your schema would be better off if you altered that table to have two columns instead of one. I'd prefer that solution to parsing the current value.
Brute-force chopping the 80-character value at position 40 runs the risk of breaking in the middle of a word. You might want to do the following instead:
Replace all runs of whitespace with a single blank.
Find the last blank at or before position 40.
Place everything before that blank in the first result field.
Place everything after that blank in the second result field.
The exact details of the operations above will depend on what tools are available to you (e.g. SQL only, or reading from one DB and writing to another using a separate program, etc.)
There is the possibility that the 80-character value may be filled in such a way that breaking between "words" will require one of the result values to be more than 40 characters long to avoid truncation.
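The word-aware steps above could be sketched as follows, here in Python on the assumption that the split happens in a separate program rather than in SQL. The function name split_address and the sample address are hypothetical. The sketch looks for the last blank among the first 41 characters, so a first field of exactly 40 characters is still accepted, and it does not truncate the second field, which may therefore come out longer than 40 characters, as noted above.

```python
import re


def split_address(address, width=40):
    """Split address into two fields, breaking at a blank where possible."""
    # Step 1: replace all runs of whitespace with a single blank.
    text = re.sub(r"\s+", " ", address).strip()
    if len(text) <= width:
        return text, ""
    # Step 2: last blank at 0-based index <= width, so the first field
    # never exceeds `width` characters.
    cut = text.rfind(" ", 0, width + 1)
    if cut == -1:
        # No blank available: fall back to a hard cut at `width`.
        return text[:width], text[width:]
    # Steps 3 and 4: everything before the blank, everything after it.
    return text[:cut], text[cut + 1:]


first, second = split_address(
    "123 Main Street   Apartment 4B, Springfield, Illinois 62704"
)
print(first)
print(second)
```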
