I have a column containing user-entered descriptions. These descriptions can be anything, but I need them sorted into a logical order.
The text can be anything, for example:
16 to 26 months
40 to 60 months
Literacy
Mathematics
When I order these in an SQL statement, the text items come back fine. However, any entries beginning with numbers come back in an illogical order,
i.e.
16 to 26 months
will be before
8 to 20 months
I understand why: the comparison goes character by character, so "1" sorts before "8". But I don't know how to alter the SQL statement (using SQLite) to fix the ordering without messing up the entries beginning with text.
When I cast to numeric, the numbers are fine but the items beginning with text go wrong.
Thanks
What you need is to sort the values in "natural order". To achieve this, you will need to implement your own collating sequence; SQLite doesn't provide one for this case.
There are some questions (and answers) regarding this topic here on SO, but they are for other RDBMS. The best I could find in a quick search was this:
http://wiki.ozanh.com/doku.php?id=python:database:sqlite:how_to_natural_sort
You should think about improving your table schema, e.g. splitting the period into separate integer columns (monthsMin, monthsMax) instead of using text, which would make sorting much easier. You can always build a string from these values if necessary.
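For illustration, here is a minimal sketch of such a collating sequence registered through Python's sqlite3 module. The table name entries, the column description, the collation name natsort and the helper names are all made up for this example; the idea is simply to split each string into digit and non-digit runs and compare the digit runs as integers.

```python
import re
import sqlite3


def natural_key(text):
    """Split text into digit and non-digit runs so numbers compare by value."""
    parts = re.split(r"(\d+)", text.lower())
    # Tag every part so an int is never compared with a str:
    # (0, number) sorts before (1, string).
    return [(0, int(p)) if p.isdecimal() else (1, p) for p in parts if p]


def natural_collation(a, b):
    """Collation callback: negative, zero or positive, like a comparator."""
    ka, kb = natural_key(a), natural_key(b)
    return (ka > kb) - (ka < kb)


def sorted_descriptions(rows):
    """Load rows into a throwaway table and return them in natural order."""
    conn = sqlite3.connect(":memory:")
    # "natsort" is a hypothetical name; any identifier works.
    conn.create_collation("natsort", natural_collation)
    conn.execute("CREATE TABLE entries (description TEXT)")
    conn.executemany(
        "INSERT INTO entries (description) VALUES (?)", [(r,) for r in rows]
    )
    cur = conn.execute(
        "SELECT description FROM entries ORDER BY description COLLATE natsort"
    )
    result = [row[0] for row in cur]
    conn.close()
    return result
```

With this in place, "8 to 20 months" sorts before "16 to 26 months", and entries starting with text still sort alphabetically after the numeric ones. Note that the collation only exists on connections that register it, so every client that needs this ordering has to call create_collation itself.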
When using ROracle to fetch data from a database, I am running into an issue trying to fetch large integers (up to 21 digits). The database column has type NUMBER(38,0).
Fetching them through a simple select does not work, the numbers get garbled from the 12th position on.
I can circumvent this by converting them to characters (to_char(COLUMN_NAME)), but this is far from ideal.
A solution from an Oracle forum that converts to binary double (cast(COLUMN_NAME as binary_double)) does not work in my case.
Do you have a hint towards data types to use?
Disclaimer: This is not a database administration or design question. I did not design this database and I do not have rights to change it.
I have a database in which many fields are compound. For example, a single column is used for acre usage for a district. Many districts have one primary crop and the value is a single number, such as 14. Some have two primary crops and it has two numbers separated by a comma like "14,8". Some have three, four, or even five primary crops resulting in a compound value like "14,8,7,4,3".
I am pulling data out of this database for analytical research. Right now, I am pulling columns like that into R, splitting them into 5 values (padding nulls if there aren't 5 values), and performing work on the values. I want to do it in the database itself. I want to split the value on the comma, perform an operation on the resulting values, and then concatenate the result of the operation back into the original column format.
For example, I have a column that is in acres and I want it in square meters. So I want to take "14,8", temporarily turn it into 14 and 8, multiply each of those by 4046.86, and get "56656.04,32374.88" as my result. What I am currently doing is using regexp_replace. I start with all rows where "acres REGEXP '^[0-9.]+,[0-9.]+,[0-9.]+,[0-9.]+,[0-9.]+$'" for the where clause. That gives me rows with 5 numbers in the field. Then I can do the first number with "cast(regexp_replace(acres,',.*$','') as float) * 4046.86". I can do each of the 5 using a different regexp_replace, and I can concatenate those values back together. Then I run a query for those with 4 numbers, then 3, then 2, and finally the single-number rows.
Is this possible as a single query?
Use a function to parse the string and convert it to the desired result. This will allow you to use a single query for the job.
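As a rough sketch of that idea, the example below registers a user-defined function on a SQLite connection from Python. This is only an illustration: the question does not say which database is in use (there you would write a stored function in its own dialect), and the table name districts, the function name scale_list and the two-decimal output format are assumptions made for this example.

```python
import sqlite3

SQ_M_PER_ACRE = 4046.86  # conversion factor quoted in the question


def scale_list(value, factor):
    """Multiply each comma-separated number in `value` by `factor`."""
    if value is None:
        return None
    # Output is formatted with two decimals (an assumption for this sketch).
    return ",".join(f"{float(part) * factor:.2f}" for part in value.split(","))


def convert_acres(values):
    """Convert every compound acres value to square meters in one query."""
    conn = sqlite3.connect(":memory:")
    # Expose the Python function to SQL under a hypothetical name.
    conn.create_function("scale_list", 2, scale_list)
    conn.execute("CREATE TABLE districts (acres TEXT)")
    conn.executemany(
        "INSERT INTO districts (acres) VALUES (?)", [(v,) for v in values]
    )
    cur = conn.execute(
        "SELECT scale_list(acres, ?) FROM districts ORDER BY rowid",
        (SQ_M_PER_ACRE,),
    )
    result = [row[0] for row in cur]
    conn.close()
    return result
```

Because the function handles any number of comma-separated values, the separate queries for 5, 4, 3, 2 and 1 numbers collapse into a single SELECT.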
I have a column of store IDs which all have leading zeroes. For example, 0017 shows rather than 17, and 0876 shows rather than 876.
All store IDs are 4 digits long with these leading zeroes. Is there a way to remove the leading zeroes and leave me with 17 and 876 (as per above)?
I imagine this would involve a REGEXP statement but I haven't been able to successfully create one yet.
Create a calculated field using this formula: REGEXP_REPLACE(Store ID, r'^\D*0*', '')
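To see what that pattern does, here is the same regular expression tried out in Python (the helper name strip_leading_zeros is just for this sketch). The pattern removes any leading non-digit characters and then any run of zeroes from the start of the string.

```python
import re


def strip_leading_zeros(store_id):
    """Drop leading non-digit characters and leading zeroes from an ID."""
    return re.sub(r"^\D*0*", "", store_id)
```

One edge case to be aware of: an ID that consists only of zeroes, such as 0000, comes back as an empty string rather than 0.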
I'm using sqlite-utils to load a CSV into SQLite, which will later be served via Datasette. I have two columns, likes and dislikes. I would like to have a third column, quality_score, computed by adding likes and dislikes together and then dividing likes by the total.
The sqlite-utils convert function should be my best bet, but all I see in the documentation is how to select a single column for conversion.
sqlite-utils convert content.db articles headline 'value.upper()'
From the example given, it looks like convert is followed by the db filename, the table name, then the column you want to operate on. Is it possible to simply add another column name, or is there a flag for selecting more than one column to operate on? I would be really surprised if this weren't possible; I just can't find any documentation to support it.
This isn't a perfect answer as it doesn't resolve whether sqlite-utils supports multiple column selection for transforms, but this is how I solved this particular problem.
Since my quality_score column would just be basic math, I was able to make use of sqlite's Generated Columns. I created a file called quality_score.sql that contained:
-- CAST to REAL so integer likes/dislikes are not divided with integer division
ALTER TABLE testtable
ADD COLUMN quality_score GENERATED ALWAYS AS (CAST(likes AS REAL) / (likes + dislikes));
and then implemented it by:
$ sqlite3 mydb.db < quality_score.sql
You do need to make sure you are using a compatible version of sqlite, as this only works with version 3.31 or later.
Another consideration is to make sure you are performing math on integers or floats and not text.
I also attempted to create the table with the virtual generated column first and then fill it with my data later, but that didn't work in my case: it threw an error saying the number of items provided didn't match the number of columns available. So I just stuck with the ALTER operation after the fact.
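Here is a small self-contained sketch of the same approach driven from Python's sqlite3 module instead of the sqlite3 command line, in case that is easier to try out. It uses an in-memory database with made-up rows, and it adds a CAST to REAL (my addition) so that integer likes and dislikes are not divided with integer division. It needs an SQLite library of version 3.31 or later.

```python
import sqlite3


def add_quality_score(rows):
    """Create a table, add a generated quality_score column, return all rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE testtable (likes INTEGER, dislikes INTEGER)")
    conn.executemany(
        "INSERT INTO testtable (likes, dislikes) VALUES (?, ?)", rows
    )
    # ALTER TABLE can only add VIRTUAL generated columns, which is the default.
    conn.execute(
        "ALTER TABLE testtable ADD COLUMN quality_score "
        "GENERATED ALWAYS AS (CAST(likes AS REAL) / (likes + dislikes))"
    )
    cur = conn.execute(
        "SELECT likes, dislikes, quality_score FROM testtable ORDER BY rowid"
    )
    result = cur.fetchall()
    conn.close()
    return result
```

Rows where likes + dislikes is 0 get NULL (None in Python) as their score, because SQLite returns NULL for division by zero.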
I have a TEXT column called "time" in a table meal and in a table pain, formatted as YYYY-MM-DDTHH:MM. I'm trying to search for other times that are within 12 hours of a given time, but I can't figure out how to do that.
I've tried testing
WHERE pain.time < meal.time + "1:00" AND pain.time > meal.time
but this approach alters the year instead of the hour. I also tried testing the same query adding "0000-00-00T01:00", but it doesn't seem to do anything.
I'm not sure what else to test.
SQLite has no built-in date/time data type, so you have to use either numbers or strings and handle them correctly.
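To do calculations, you have to either do them directly on the numerical value (which might require conversions into a number and back), or use the modifiers of the built-in date/time functions. Note that datetime() returns its result as YYYY-MM-DD HH:MM:SS, with a space instead of the T, so it does not compare correctly against strings in your format; use strftime() to keep the format the same:
... WHERE meal.time BETWEEN strftime('%Y-%m-%dT%H:%M', pain.time, '-12 hours') AND pain.time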
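A quick way to check this kind of query is an in-memory database from Python. The sketch below uses made-up sample times and a hypothetical helper name; it formats the lower bound with strftime() because datetime() returns YYYY-MM-DD HH:MM:SS with a space, which does not compare correctly as a string against values that use the T separator.

```python
import sqlite3


def meals_before_pain(meals, pains):
    """Return (meal time, pain time) pairs where the meal was eaten
    at most 12 hours before the pain entry."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE meal (time TEXT)")
    conn.execute("CREATE TABLE pain (time TEXT)")
    conn.executemany("INSERT INTO meal (time) VALUES (?)", [(t,) for t in meals])
    conn.executemany("INSERT INTO pain (time) VALUES (?)", [(t,) for t in pains])
    cur = conn.execute(
        """
        SELECT meal.time, pain.time
        FROM meal
        JOIN pain
          ON meal.time BETWEEN strftime('%Y-%m-%dT%H:%M', pain.time, '-12 hours')
                           AND pain.time
        ORDER BY pain.time, meal.time
        """
    )
    result = cur.fetchall()
    conn.close()
    return result
```

The string comparison works here only because both sides use the same fixed-width YYYY-MM-DDTHH:MM layout, so lexical order matches chronological order.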