Using Dates in SQLite - sqlite

My tables meal and pain each have a TEXT column called "time", formatted as YYYY-MM-DDTHH:MM. I'm trying to find rows whose times are within 12 hours of a given time, but I can't figure out how to do that.
I've tried testing
WHERE pain.time < meal.time + "1:00" AND pain.time > meal.time
but this approach alters the year instead of the hour. I also tried testing the same query adding "0000-00-00T01:00", but it doesn't seem to do anything.
I'm not sure what else to test.

SQLite has no built-in date/time data type, so you have to use either numbers or strings and handle them correctly.
To do calculations, you have to either do them directly on the numerical value (which might require conversions into a number and back), or use the modifiers of the built-in date/time functions:
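... WHERE meal.time BETWEEN strftime('%Y-%m-%dT%H:%M', pain.time, '-12 hours') AND pain.time
Note that datetime() returns its result as YYYY-MM-DD HH:MM:SS, with a space instead of the T. Because the comparison is done on strings, the lower bound must be formatted exactly like the stored values, which is why strftime() with an explicit format is used here.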
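As a rough illustration of the modifier approach, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are invented for the example, and the explicit strftime() format is an assumption made so that the computed bound has the same YYYY-MM-DDTHH:MM layout as the stored text.

```python
import sqlite3


def meals_before_pain(meal_times, pain_times):
    """Return (meal, pain) pairs where the meal is at most 12 hours before the pain."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE meal (time TEXT)")
    conn.execute("CREATE TABLE pain (time TEXT)")
    conn.executemany("INSERT INTO meal VALUES (?)", [(t,) for t in meal_times])
    conn.executemany("INSERT INTO pain VALUES (?)", [(t,) for t in pain_times])
    query = """
        SELECT meal.time, pain.time
        FROM meal JOIN pain
          ON meal.time BETWEEN strftime('%Y-%m-%dT%H:%M', pain.time, '-12 hours')
                           AND pain.time
        ORDER BY meal.time, pain.time
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Hypothetical sample data: only the 19:30 meal is within 12 hours of the 21:00 pain.
pairs = meals_before_pain(
    ["2023-05-01T08:00", "2023-05-01T19:30", "2023-05-02T07:00"],
    ["2023-05-01T21:00"],
)
print(pairs)
```

With a space-separated lower bound, the 08:00 meal would also match, because the character T sorts after a space in a plain string comparison.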

Related

Pentaho Formula

I am new to Pentaho, so please be gentle.
I am, perhaps naively, wanting to use a Formula to convert a six-character string in the form YYYYMM to the date representing the final day of that month.
I imagine doing this step by step using successive lines of the Formula: checking that the string is of the correct length and, if so:
extracting the year and converting it to integer (with error checking)
extracting the month and converting it to integer (also with error checking)
converting ([year], [month], 1) to a date (the first of the month)
adding a month
subtracting a day
Some of those steps may be combined but, overall, it relies on a succession of steps to achieve a final result.
Formula does not seem to recognise the values achieved along the way though, at least not by enclosing them in square brackets as you do with fields from previous objects in the mapping.
I suppose I could have a series of Formula objects one after the other in the mapping but that seems untidy and inefficient. If a single Formula object cannot have a series of values defined on successive lines, what is the point of even having lines? How do I use a value I have defined on a previous line?
The formula step isn’t the best way to achieve that. The resulting formula will be hard to read and quite cumbersome.
It’s better (and faster) to use a calculator step. A javascript step can also be used, and it will be easier to read, but slower (though that probably won't be a major issue).
So, one way forward is to implement this on a calculator step:
Create a copy of your string field as a Date
Create 2 constant fields: 1 and -1
Add 1 month to the date field
Subtract 1 day from the result
Create a copy of the result as a string.
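To check what the steps above should produce, the same logic (first of the month, plus one month, minus one day) can be sketched outside Pentaho. This is plain Python, not Pentaho code, and the function name last_day_of_month is made up for the illustration.

```python
from datetime import date, timedelta


def last_day_of_month(yyyymm: str) -> date:
    """Convert a 'YYYYMM' string to the last day of that month."""
    if len(yyyymm) != 6 or not yyyymm.isdigit():
        raise ValueError(f"expected six digits (YYYYMM), got {yyyymm!r}")
    year, month = int(yyyymm[:4]), int(yyyymm[4:])
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in {yyyymm!r}")
    # "Add a month": move to the first day of the following month.
    if month == 12:
        first_of_next = date(year + 1, 1, 1)
    else:
        first_of_next = date(year, month + 1, 1)
    # "Subtract a day": step back to the last day of the original month.
    return first_of_next - timedelta(days=1)


print(last_day_of_month("202402"))
```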

Can sqlite-utils convert function select two columns?

I'm using sqlite-utils to load a csv into sqlite which will later be served via Datasette. I have two columns, likes and dislikes. I would like to add a third column, quality_score, computed as likes divided by the sum of likes and dislikes.
The sqlite-utils convert function should be my best bet, but all I see in the documentation is how to select a single column for conversion.
sqlite-utils convert content.db articles headline 'value.upper()'
From the example given, it looks like convert is followed by the db filename, the table name, then the col you want to operate on. Is it possible to simply add another col name or is there a flag for selecting more than one column to operate on? I would be really surprised if this wasn't possible, I just can't find any documentation to support it.
This isn't a perfect answer as it doesn't resolve whether sqlite-utils supports multiple column selection for transforms, but this is how I solved this particular problem.
Since my quality_score column would just be basic math, I was able to make use of sqlite's Generated Columns. I created a file called quality_score.sql that contained:
ALTER TABLE testtable
ADD COLUMN quality_score GENERATED ALWAYS AS (CAST(likes AS REAL) / (likes + dislikes));
and then implemented it by:
$ sqlite3 mydb.db < quality_score.sql
You do need to make sure you are using a compatible version of sqlite, as this only works with version 3.31 or later.
Another consideration is to make sure you are performing math on integers or floats and not text. Also note that SQLite truncates the result when both operands of a division are integers, so at least one operand should be REAL (for example via CAST(likes AS REAL)).
I also tried creating the table with the virtual generated column first and filling it with my data later, but that didn't work in my case: it threw an error saying the number of items provided didn't match the number of columns available. So I stuck with the ALTER operation after the fact.
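For anyone who wants to try the generated-column approach without touching a real database, here is a small sketch using Python's built-in sqlite3 module against an in-memory database. The table contents and the title column are invented sample data, the CAST to REAL is an addition so integer inputs do not truncate, and it assumes the SQLite library bundled with Python is version 3.31 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testtable (title TEXT, likes INTEGER, dislikes INTEGER)")
conn.executemany(
    "INSERT INTO testtable VALUES (?, ?, ?)",
    [("a", 3, 1), ("b", 1, 3), ("c", 0, 0)],  # hypothetical rows
)

# Add a virtual generated column after the data is loaded.
conn.execute(
    """
    ALTER TABLE testtable
    ADD COLUMN quality_score REAL
    GENERATED ALWAYS AS (CAST(likes AS REAL) / (likes + dislikes))
    """
)

# Each row is a (title, quality_score) pair, so dict() maps title to score.
scores = dict(conn.execute("SELECT title, quality_score FROM testtable"))
conn.close()
print(scores)
```

Rows where likes + dislikes is zero come back as NULL (None in Python), since SQLite yields NULL for division by zero.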

Mixed Timed Data

I have a vector that contains time data, but there's a problem: some of the entries are listed as dates (e.g., 10/11/2017), while other entries are listed as dates with time (e.g., 12/15/2016 09:07:17). This is a problem for me, since as.Date() can't recognize the time portion and returns dates in an odd format (0012-01-20), while seemingly turning the date-with-time entries into NAs. Furthermore, using as.POSIXct() doesn't work, since not all entries are a combination of date with time.
I suspect that, since these entries are entered in a consistent format, I could hypothetically use an if function to change the entries in the vector to a consistent format, such as using an if statement to remove time entirely, but I don't know enough about it to get it to work.
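Use the lubridate package. Assuming the data frame is named x and the column holding the dates is named Date:
library(lubridate)
The entries are in month/day/year order, so ymd() or ydm() would misread them. parse_date_time() accepts several candidate orders, which covers both the entries with a time and those without; wrapping the result in as.Date() drops the time portion:
x$newdate <- as.Date(parse_date_time(x$Date, orders = c("mdy HMS", "mdy")))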
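The underlying idea, trying a date-with-time format first and falling back to a date-only format, can be sketched in a few lines. This is Python rather than R, and the helper name parse_mixed and the two format strings are assumptions based on the sample entries in the question.

```python
from datetime import date, datetime

# Candidate formats, most specific first (assumed from the sample entries).
FORMATS = ("%m/%d/%Y %H:%M:%S", "%m/%d/%Y")


def parse_mixed(value: str) -> date:
    """Parse a month/day/year string with or without a time, returning only the date."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date()
        except ValueError:
            continue  # try the next candidate format
    raise ValueError(f"unrecognised date: {value!r}")


parsed = [parse_mixed(v) for v in ["10/11/2017", "12/15/2016 09:07:17"]]
print(parsed)
```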

Date Filter in SSRS 2008

I am working with SSRS 2008. I have a table in my report containing a date/time column (DOB). I have a date/time parameter (MyDate) as well. I am trying to set a filter on my data set like
FormatDateTime(Fields!DOB.Value,2)<=FormatDateTime(Parameters!MyDate.Value,2)
It doesn't filter my table correctly, but if I remove the FormatDateTime function it works fine. I want to understand what the problem is here.
FormatDateTime will return a string, so you're not comparing dates anymore but rather their string representations.
Comparing the dates 02-Feb-2012 and 10-Oct-2012 will give different results than comparing the strings 2/2/2012 and 10/10/2012.
As mentioned in the comment, it looks like you're just trying to remove the time portion from dates?
Something like this should work, i.e. converting the strings back to dates.
CDate(FormatDateTime(Fields!DOB.Value,2)) <= CDate(FormatDateTime(Parameters!MyDate.Value,2))
But this is just one suggestion, there are any number of ways of doing this.
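The difference between comparing dates and comparing their string representations is easy to reproduce outside SSRS. The sketch below is Python, not an SSRS expression, and the short_date helper is a made-up stand-in that assumes the short date format renders as M/D/YYYY.

```python
from datetime import date

dob = date(2012, 2, 2)
my_date = date(2012, 10, 10)


def short_date(d: date) -> str:
    """Hypothetical stand-in for a short date string such as 2/2/2012."""
    return f"{d.month}/{d.day}/{d.year}"


# Compared as dates, 2 Feb 2012 is on or before 10 Oct 2012.
as_dates = dob <= my_date

# Compared as strings, "2/2/2012" sorts after "10/10/2012" because "2" > "1".
as_strings = short_date(dob) <= short_date(my_date)

print(as_dates, as_strings)
```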

Alternative to sqlite OR a better way to handle date / time fields in sqlite

My data tends to be medium to large but never qualifies as "BIG" data. The data is almost always complexly relational. For the purposes I'm talking about here, 10-50 tables with a total size of 1-10 GB. Nothing more. When I deal with data bigger than this, I'll stick it into Postgres or SQL Server.
Overall, I like SQLite, but the data I work with has lots and lots of date / datetime fields and dealing with date fields in SQLite makes my head hurt and when I move data back and forth between R and SQLite, my dates often get mangled.
I am looking for either a file-based alternative to SQLite that is easy to work with from R
OR
Better techniques/packages for moving data in/out of SQLite and R without mangling the dates. My goal is to stop mangling my dates. For example, when I use dbWriteTable from the RSQLite package my dates are usually messed up in a way that makes them impossible to work with.
My primary workstation is running Ubuntu but I work in an office dominated by Windows. If suggesting an alternative to SQLite, +++ for an alternative that works on both platforms (or more).
Use epoch times and dates (days from origin, seconds from origin). The conversion using epochs into R POSIXct or Date is fast (strings are very slow).
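A minimal sketch of the epoch idea, shown in Python with the built-in sqlite3 module rather than R. The events table and its columns are invented for the example; the timestamp is stored as integer seconds since 1970-01-01 UTC and converted back on the way out.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (name TEXT, ts INTEGER)")  # hypothetical table

moment = datetime(2020, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
# Store the timestamp as whole seconds since the Unix epoch.
conn.execute("INSERT INTO events VALUES (?, ?)", ("example", int(moment.timestamp())))

# Read the integer back and rebuild a timezone-aware datetime.
(stored,) = conn.execute("SELECT ts FROM events").fetchone()
restored = datetime.fromtimestamp(stored, tz=timezone.utc)

# SQLite itself can render the integer as text with the 'unixepoch' modifier.
(as_text,) = conn.execute("SELECT datetime(ts, 'unixepoch') FROM events").fetchone()
conn.close()

print(restored, as_text)
```

Because the stored value is a plain integer, there is no text format for either side to misinterpret; the only convention to agree on is the origin and the unit.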
Edit: Another alternative, after re-reading and considering the size of your data:
You could simply save the tables directly in R format, perhaps with a small piece of extra metadata describing the key relationships between tables. You would have to create your own conventions and all, but it's definitely smoother (no impedance mismatches).
Also, I'm personally very partial to the package data.table. It's fast and has a syntax which is pure R but has a nice mapping onto SQL concepts. E.g. in dt[i, j, by=list(...)], i corresponds to "where", j corresponds to "select", and by to "group by", and there are facilities for joins as well, although I wrote infix wrappers around those so it was easier to remember.
I typically do my data processing work exclusively in R (after an initial pull from SQLite), and I find data.table faster and more practical than massive sqldf queries.
http://datatable.r-forge.r-project.org/
sqlite wants to read the data in the standard format "YYYY-MM-DD HH:MM:SS" (you can omit the time part if you don't need it)---I don't know of a way to read arbitrary date strings. This results in a normalized date being stored.
On output, you want to format the date using sqlite functions to whatever your other software needs---check the options of strftime().
For instance, Octave likes the day number since year 0, so if I have a table mydata with a column mydate, I'd do
select julianday(mydate)-1721059.666667 from mydata
The magic number is julianday('0000-01-01T00:00:00-04:00') and compensates for the fact that julianday counts days from noon on November 24, 4714 BC, whereas Octave counts from year 0.
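The offset can be checked directly rather than taken on trust. The sketch below uses Python's built-in sqlite3 module to evaluate the same julianday() expression; the second query is only a sanity check that one day later yields a difference of one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Julian day number of midnight, 1 January of year 0, at UTC-04:00.
(offset,) = conn.execute(
    "SELECT julianday('0000-01-01T00:00:00-04:00')"
).fetchone()

# One calendar day later should be exactly one day past the offset.
(days,) = conn.execute(
    "SELECT julianday('0000-01-02T00:00:00-04:00') - ?", (offset,)
).fetchone()
conn.close()

print(round(offset, 6), round(days, 6))
```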
