I just imported a huge text file into a table, using the .import command. Everything is OK, except for the fact that it seems to treat clearly numeric values as text. For instance, conditions such as WHERE field > 4 are always met. I did not specify datatypes when I created the table, but this doesn't seem to matter when small tables are created.
Any advice would be welcome. Thanks!
Edit/conclusion: It turns out some of the values in my CSV file were blanks. I ended up solving this by being a bit less lazy and declaring the datatypes explicitly.
The way SQLite handles types is described on this page: http://www.sqlite.org/datatype3.html
In particular:
Under circumstances described below,
the database engine may convert values
between numeric storage classes
(INTEGER and REAL) and TEXT during
query execution.
Section 3.4 (Comparison Example) gives concrete examples that are likely to explain the problem you have. This one is probably the closest match:
-- Because column "a" has text affinity, numeric values on the
-- right-hand side of the comparisons are converted to text before
-- the comparison occurs.
SELECT a < 40, a < 60, a < 600 FROM t1;
0|1|1
To avoid leaving the affinity to be guessed, you can use CAST explicitly (see section 3.2 too):
SQLite may attempt to convert values
between the storage classes INTEGER,
REAL, and/or TEXT before performing a
comparison. Whether or not any
conversions are attempted before the
comparison takes place depends on the
affinity of the operands. Operand
affinity is determined by the
following rules:
An expression that is a simple reference to a column value has the
same affinity as the column. Note that
if X and Y.Z are column names, then +X
and +Y.Z are considered expressions
for the purpose of determining
affinity.
An expression of the form "CAST(expr AS type)" has an affinity
that is the same as a column with a
declared type of "type".
Otherwise, an expression has NONE affinity.
Here is another example:
CREATE TABLE test (value TEXT);
INSERT INTO test VALUES(2);
INSERT INTO test VALUES(123);
INSERT INTO test VALUES(500);
SELECT value, value < 4 FROM test;
2|1
123|1
500|0
It's likely that the CSV import stored the values as text, so the comparison is not numeric.
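To check whether this is what happened, the behaviour can be reproduced with Python's built-in sqlite3 module. This is only a sketch: the table name `imported` and the column name `field` are made up, and it assumes the import stored every value as text in a column with no declared type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: no declared column type, as in the question.
conn.execute("CREATE TABLE imported (field)")
# Bind Python strings so the values are stored as TEXT, like imported CSV values.
conn.executemany("INSERT INTO imported VALUES (?)", [("2",), ("123",), ("500",)])

# Comparing stored text with the integer 4: every row passes the test.
naive = conn.execute(
    "SELECT field, field > 4 FROM imported ORDER BY rowid"
).fetchall()

# An explicit CAST makes the comparison numeric.
fixed = conn.execute(
    "SELECT field, CAST(field AS INTEGER) > 4 FROM imported ORDER BY rowid"
).fetchall()

print(naive)
print(fixed)
```

Running `SELECT typeof(field) FROM imported` in the same way should show whether the values really are stored as text.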
I am a somewhat newbie to SQLite (and KMyMoney). KMyMoney (an open source personal finance manager) allows one-click exporting data into an SQLite database.
On browsing the SQLite database output, I found that the dollar amounts are stored in a table called kmmSplits as several text fields in a strange format based on “value” and “valueFormatted” (see screenshot below). The “value” field is apparently written as a division expression (in text format), which apparently yields the “valueFormatted” field (again in text format). The “valueFormatted” field holds the correct amount, but the problem is that parentheses are used to indicate a negative number instead of a simple minus sign in front of the value. This is apparently an accounting number format, but I don’t know how to parse it into a float value for running calculated SQL queries, etc. The positive values (without parentheses) are no problem to convert to floats.
I’ve tried using CAST to FLOAT, but this does not do the division, nor does it convert parentheses into negative values (see screenshot).
The basic question is: how can I parse a text value containing parentheses in the “valueFormatted” field (accounting money format) into a common number format or, alternatively, how can I turn the division expression in the “value” field into an actual calculation?
Use a CASE expression to check whether valueFormatted is a numeric value inside parentheses and, if it is, multiply the substring starting from the 2nd character by -1 (the closing parenthesis is discarded by SQLite during this implicit type conversion):
SELECT *,
CASE
WHEN valueFormatted LIKE '(%)' THEN (-1) * SUBSTR(valueFormatted, 2)
ELSE valueFormatted
END AS value
FROM kmmSQLite;
Or, replace '(' with '-' and add 0 to convert the result to a number:
SELECT *,
REPLACE(valueFormatted, '(', '-') + 0 AS value
FROM kmmSQLite;
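Both conversions can be tried out with Python's built-in sqlite3 module. The sketch below uses a made-up table `splits_demo` and made-up sample rows; in particular, it assumes the “value” field looks like `numerator/denominator`, which the question describes but does not show. The second expression splits that text at the slash and divides the two parts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the exported table, with assumed sample data.
conn.execute("CREATE TABLE splits_demo (value TEXT, valueFormatted TEXT)")
conn.executemany(
    "INSERT INTO splits_demo VALUES (?, ?)",
    [("-1250/100", "(12.50)"), ("725/100", "7.25")],
)

rows = conn.execute(
    """
    SELECT
        -- '(12.50)' -> '-12.50)' -> -12.5 (the trailing ')' is ignored)
        REPLACE(valueFormatted, '(', '-') + 0 AS from_formatted,
        -- '-1250/100' -> -1250.0 / 100.0
        CAST(SUBSTR(value, 1, INSTR(value, '/') - 1) AS REAL)
            / CAST(SUBSTR(value, INSTR(value, '/') + 1) AS REAL) AS from_fraction
    FROM splits_demo
    ORDER BY rowid
    """
).fetchall()

print(rows)
```

Amounts with thousands separators are not handled here; a value such as '1,234.56' would need the commas removed first.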
I'm using sqlite-utils to load a csv into sqlite which will later be served via Datasette. I have two columns, likes and dislikes. I would like to have a third column, quality-score, by adding likes and dislikes together then dividing likes by the total.
The sqlite-utils convert function should be my best bet, but all I see in the documentation is how to select a single column for conversion.
sqlite-utils convert content.db articles headline 'value.upper()'
From the example given, it looks like convert is followed by the db filename, the table name, then the col you want to operate on. Is it possible to simply add another col name or is there a flag for selecting more than one column to operate on? I would be really surprised if this wasn't possible, I just can't find any documentation to support it.
This isn't a perfect answer as it doesn't resolve whether sqlite-utils supports multiple column selection for transforms, but this is how I solved this particular problem.
Since my quality_score column would just be basic math, I was able to make use of sqlite's Generated Columns. I created a file called quality_score.sql that contained:
ALTER TABLE testtable
ADD COLUMN quality_score GENERATED ALWAYS AS (likes /(likes + dislikes));
and then implemented it by:
$ sqlite3 mydb.db < quality_score.sql
You do need to make sure you are using a compatible version of sqlite, as this only works with version 3.31 or later.
Another consideration is to make sure you are performing the math on integers or floats and not on text. Note also that if both columns hold integers, SQLite performs integer division, so the result is truncated.
I also attempted to create the table with the virtual generated column first and fill it with my data later, but that didn't work in my case: it threw an error saying that the number of items provided didn't match the number of columns available. So I just stuck with the ALTER operation after the fact.
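The ALTER approach can be tried with Python's built-in sqlite3 module, provided the bundled SQLite is version 3.31 or later. This is a sketch with made-up sample rows; the CAST to REAL is my addition, so that the division is not done on integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data, standing in for the imported CSV.
conn.execute("CREATE TABLE testtable (likes INTEGER, dislikes INTEGER)")
conn.executemany("INSERT INTO testtable VALUES (?, ?)", [(3, 1), (0, 0)])

# CAST to REAL so that 3 / 4 gives 0.75 instead of the integer result 0.
conn.execute(
    "ALTER TABLE testtable ADD COLUMN quality_score REAL "
    "GENERATED ALWAYS AS (CAST(likes AS REAL) / (likes + dislikes))"
)

rows = conn.execute(
    "SELECT likes, dislikes, quality_score FROM testtable ORDER BY rowid"
).fetchall()
# Plain integer division, for comparison.
integer_result = conn.execute("SELECT 3 / (3 + 1)").fetchone()[0]

print(rows, integer_result)
```

A row with no likes and no dislikes gets NULL, because SQLite returns NULL for division by zero.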
I'm trying to write a table to an Oracle database using the ROracle package. This works fine, however all of the numeric values are showing the full floating point decimal representation on the database. For instance, 7581.24 shows up as 7581.2399999999998.
Is there a way of specifying the number of digits to be stored after the decimal point when writing the table?
I found a work around using Allan's solution here, but it would be better not to have to change the variable after writing it to the database.
Currently I write the table with code like this:
dbWriteTable(db_connection, "TABLE_NAME", table, overwrite = TRUE)
Thanks in advance.
It's not elegant, but it is arguably good practice to make the types and precisions explicit. I did it with something like:
if (dbExistsTable(con, "TABLE_NAME")) dbRemoveTable(con, "TABLE_NAME")
create_table <- "create table TABLE_NAME(
ID VARCHAR2(100),
VALUE NUMBER(6,2)
)"
dbGetQuery(con, create_table)
ins_str <- "insert into TABLE_NAME values(:1, :2)"
dbGetQuery(con, ins_str, df)
dbCommit(con)
Essentially, it creates the table and specifies the type and precision of each column. Then it fills in the values from the data frame (df) in R. You just have to be careful that the columns match up. If you define an Oracle column with a scale of 2 decimal places, e.g. VALUE NUMBER(3,2), and then push a value from R with more decimals, it will be rounded to that scale (2 in this example). It will not be truncated. So df$value = 3.1415 in R would become VALUE 3.14 in the Oracle table.
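The effect is not specific to R or Oracle and can be illustrated with Python's decimal module. This is only an illustration, not ROracle code: the helper `to_scale` is made up, and using ROUND_HALF_UP is an assumption about how the database rounds.

```python
from decimal import Decimal, ROUND_HALF_UP

# The binary double closest to 7581.24 is not exactly 7581.24,
# which is why the full floating point representation shows extra digits.
is_exact = Decimal(7581.24) == Decimal("7581.24")


def to_scale(x, scale=2):
    """Round x to `scale` decimal places (hypothetical helper)."""
    quantum = Decimal(1).scaleb(-scale)  # scale=2 -> Decimal('0.01')
    return Decimal(repr(x)).quantize(quantum, rounding=ROUND_HALF_UP)


print(is_exact)
print(to_scale(3.1415))
print(to_scale(7581.24))
```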
I want to limit numeric column type to 10 symbols before decimal separator and 4 symbols after decimal separator. I executed the following command:
ALTER TABLE scustdisc ADD COLUMN spec_price numeric(10,4)
The command executed without errors, but I am still able to insert the value 10.123456 into spec_price. I expected an error and the value not to be inserted. Is my ALTER command wrong?
SQLite has a dynamic type system: a declared column type can be virtually any name and has only a limited impact. Each declared type is resolved to one of the affinities TEXT, NUMERIC, INTEGER, REAL or BLOB.
numeric(0,0) through numeric(99999999,99999999), and more, all resolve to NUMERIC.
As such, 10,4 or 4,10 etc. means nothing and makes no difference to SQLite.
With one exception (an INTEGER PRIMARY KEY column, which must hold an integer), and barring constraints, a column may hold any type of value. The column type only comes into play in determining the way the data is stored.
A must-read is Datatypes In SQLite Version 3.
You may also find How flexible/restricive are SQLite column types? useful.
You may be able to resolve this by using a CHECK constraint in CREATE TABLE, or by using a TRIGGER or multiple TRIGGERs.
You could format the number(s) appropriately when they are displayed.
You could use the round(x,y) function (see Core Functions).
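The CHECK constraint option might look like the following sketch, run here through Python's built-in sqlite3 module. The table name `scustdisc_demo` is made up, and the constraint is just one possible way to express "at most 10 digits before and 4 after the decimal separator".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reject values with more than 4 decimal places or more than 10 integer digits.
conn.execute(
    """
    CREATE TABLE scustdisc_demo (
        spec_price NUMERIC(10,4)
            CHECK (spec_price = ROUND(spec_price, 4)
                   AND ABS(spec_price) < 10000000000)
    )
    """
)

conn.execute("INSERT INTO scustdisc_demo VALUES (?)", (10.25,))  # accepted

try:
    conn.execute("INSERT INTO scustdisc_demo VALUES (?)", (10.123456,))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the CHECK constraint failed

stored = [row[0] for row in conn.execute("SELECT spec_price FROM scustdisc_demo")]
print(rejected, stored)
```

If rounding on insert is preferred over rejecting the value, a trigger that applies round(x,4) would be the alternative mentioned above.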
Below is the table I created in an SQLite database:
CREATE TABLE MyData(
Code VARCHAR(20),
Amount DECIMAL(18, 8)
);
then I insert 2 rows into the table.
INSERT INTO MyData
VALUES('A', 1.12345678);
INSERT INTO MyData
VALUES('B', 1234567890.12345678);
After that, execute a SELECT statement,
SELECT * FROM MyData;
SQLite returns the following result:
A|1.12345678
B|1234567890.12346
DECIMAL(18, 8) supposedly means precision=18 and scale=8, so why are some decimal places truncated?
The details of how SQLite stores its data are described here. When you specify the DECIMAL column type, the column gets NUMERIC affinity.
Section 2.0 has the following description about type affinity:
A column with NUMERIC affinity may contain values using all five
storage classes. When text data is inserted into a NUMERIC column, the
storage class of the text is converted to INTEGER or REAL (in order of
preference) if such conversion is lossless and reversible. For
conversions between TEXT and REAL storage classes, SQLite considers
the conversion to be lossless and reversible if the first 15
significant decimal digits of the number are preserved. If the
lossless conversion of TEXT to INTEGER or REAL is not possible then
the value is stored using the TEXT storage class. No attempt is made
to convert NULL or BLOB values.
This indicates that sqlite will attempt conversions between types, and if the first 15 digits of the number can be converted and reversed, the numbers are deemed to be equal. This effectively puts a limit on the available precision with which a number can be stored to 15 significant digits.
The wikipedia article on double precision floating point numbers has additional information which is useful when dealing with floating point numbers.
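The storage class can be inspected with typeof(), here through Python's built-in sqlite3 module. The second half is a possible workaround rather than something from the SQLite documentation: it stores the amount as an integer count of 1e-8 units in a made-up table `MyDataScaled`, which keeps all 18 digits.

```python
import sqlite3
from decimal import Decimal

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyData (Code VARCHAR(20), Amount DECIMAL(18, 8))")
conn.execute("INSERT INTO MyData VALUES ('B', 1234567890.12345678)")

storage_class, amount = conn.execute(
    "SELECT typeof(Amount), Amount FROM MyData"
).fetchone()
# The value comes back as a binary double, which cannot hold all 18 digits.
is_exact = Decimal(amount) == Decimal("1234567890.12345678")

# Workaround sketch: keep the amount as an integer count of 1e-8 units.
conn.execute("CREATE TABLE MyDataScaled (Code TEXT, AmountE8 INTEGER)")
scaled = int(Decimal("1234567890.12345678").scaleb(8))
conn.execute("INSERT INTO MyDataScaled VALUES ('B', ?)", (scaled,))
stored = conn.execute("SELECT AmountE8 FROM MyDataScaled").fetchone()[0]
restored = Decimal(stored).scaleb(-8)

print(storage_class, is_exact, restored)
```

Storing the amount as TEXT would also preserve the digits, at the cost of having to convert it before doing arithmetic in SQL.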