I wrote the little bash script below and it works as intended, but I added a couple of comments and newlines for readability, which breaks the code. Removing the comments and newlines should make it a valid script.
### read all measurements from the database and list each value only once
sqlite3 -init /tmp/timeout /tmp/testje.sqlite \
'select distinct measurement from errors order by measurement;' |
### remove the first line of stdout as this is a notification rather than intended output
sed '1d' |
### loop through all found values
while read error; do
### count the number of occurrences in the original table and print that
sqlite3 -init /tmp/timeout /tmp/testje.sqlite \
"select $error,count( measurement ) from errors where measurement = '$error' ;"
done
The result is like this:
134 1
136 1
139 2
159 1
Question: Is it possible with sqlite3 to translate the while-loop to SQL statements? In other words, does sqlite3 support some sort of for-loop to loop through results of a previous query?
Now I know sqlite3 is a very limited database, and chances are that what I want is just too complex for it. I've been searching for it, but I'm really a database nitwit and the hits I get so far are either about a different database or solving an entirely different problem.
The easiest answer (and the one I hope not to get, BTW) is 'sqlite3 does not support loops'.
SQLite does not support loops. Here is the entire language; you'll notice that structured programming is completely absent.
However, that's not to say that you can't get what you want without loops, using sets or some other SQL construct instead. In your case it might be as simple as:
SELECT measurement, COUNT(measurement) FROM errors GROUP BY measurement;
That will give you a list of all measurements in the errors table and a count of how often each one occurs.
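If the ordering of your shell loop's output matters, an ORDER BY can be added as well. Here is a minimal sketch of the single statement that would replace both queries and the loop, assuming the same errors table:
-- one statement in place of the query + shell loop;
-- ORDER BY reproduces the ordering of the original output
SELECT measurement, COUNT(measurement)
FROM errors
GROUP BY measurement
ORDER BY measurement;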
In general, SQL engines are best utilized by expressing your query in a single (sometimes complex) SQL statement, which is submitted to the engine for optimization. In your example you've already codified some decisions about the strategy used to get the data from the database -- it's a tenet of SQL that the engine is better able to make those decisions than the programmer.
Related
I use Telegraf to collect data and then write it to InfluxDB, and I would like to aggregate the data into one row. As far as I know there are a lot of plugins for Telegraf, including plugins to aggregate data. I use the BasicStats Aggregator Plugin for my purpose, but it has unexpected behavior (unexpected for me): it aggregates rows by their tags, but I need aggregation only by timestamp. How can I make this plugin aggregate rows only by their timestamps? For example, I have the following rows:
timestamp=1 tag1=foo field1=3
timestamp=1 tag1=bar field1=1
timestamp=1 tag1=baz field1=4
and for them I would like to get the following aggregated row:
timestamp=1 sum=8
Thank you in advance
I believe you can't aggregate by timestamp with Telegraf's conventional processors or aggregators (like BasicStats) because of InfluxDB's nature: it is a time-series database and it is indexed by time. But you can aggregate the data with an InfluxDB query:
SELECT mean("field1") FROM "statsdemo"."telegraf"."my_metric" WHERE time > now()-5m AND time < now() GROUP BY time(2000ms) FILL(null)
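If you want the sum from your example rather than a mean, a similar query should work. This is only a sketch: it keeps the measurement and retention policy names from above, and the bucket size is an assumption:
-- hypothetical variant: sum field1 per time bucket instead of averaging it
SELECT sum("field1") AS "sum" FROM "statsdemo"."telegraf"."my_metric" WHERE time > now()-5m AND time < now() GROUP BY time(1s) FILL(null)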
Another approach could be using the execd aggregator. Just write a script in bash or your favorite programming language that reads from STDIN, aggregates the data by timestamp, and prints the result to STDOUT following the InfluxDB line protocol. I have never used this aggregator before, but I think it may work.
Does anyone know how to count the number of rows in a SAS table using the x command? I need to achieve this through Unix. I tried wc -l but it gave me a different result than what proc sql count(*) gives me.
Lee has got the right idea here - SAS datasets are stored in a proprietary binary format, in which line breaks are not necessarily row separators, so you cannot use tools like wc to get an accurate row count. Using SAS itself is one option, or you could potentially use other tools like the Python pandas.read_sas function to load the table if you don't have SAS installed on your Unix server.
Writing a script to do this for you is outside the scope of this answer, so have a go at writing one yourself and post a more specific question if you get stuck.
I need to use the cast function with the length of a column in Teradata.
Say I have a table with the following data:
id | name
1  | dhawal
2  | bhaskar
I need to use a cast operation, something like:
select cast(name as CHAR(<length of column>)) from table
How can I do that?
thanks
Dhawal
You have to find the length by looking at the table definition - either manually (show table) or by writing dynamic SQL that queries dbc.ColumnsV.
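A rough sketch of the dynamic-SQL route could start from something like this (the database, table, and column names are placeholders):
-- hypothetical: look up the defined length of a column from the data dictionary
SELECT ColumnName, ColumnLength
FROM dbc.ColumnsV
WHERE DatabaseName = 'mydb'    -- assumed database name
  AND TableName = 'mytable'    -- assumed table name
  AND ColumnName = 'name';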
Update:
You can find the maximum length of the actual data using
select max(length(cast(... as varchar(<large enough value>)))) from TABLE
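For example, for the name column from the question it might look like this (the table name and the varchar size are just placeholders):
-- hypothetical: longest value currently stored in the name column
select max(length(cast(name as varchar(1000)))) from mytable;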
But if this is for FastExport, I think casting as varchar(large-enough-value) and postprocessing to remove the 2-byte length info FastExport includes is a better solution (since exporting a CHAR() will result in a fixed-length output file with lots of spaces in it).
You may know this already, but just in case: Teradata usually recommends switching to TPT instead of the legacy fexp.
I am using tSQLt's AssertResultSetsHaveSameMetaData to compare metadata between two tables. But the problem is that I cannot hardcode the table name, since I am passing the table name as a parameter at runtime. So is there any way to do that?
You use tSQLt.AssertResultSetsHaveSameMetaData by passing two select statements like this:
exec tSQLt.AssertResultSetsHaveSameMetaData
'SELECT TOP 1 * FROM mySchema.ThisTable;'
, 'SELECT TOP 1 * FROM mySchema.ThatTable;';
So it should be quite easy to parameterise the names of the tables you are comparing and build the SELECT statements based on those table name parameters.
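For example, a minimal sketch with the table names held in variables (the names themselves are assumptions):
-- hypothetical: build the two SELECTs from runtime table-name parameters
declare @thisTable nvarchar(257) = N'mySchema.ThisTable';
declare @thatTable nvarchar(257) = N'mySchema.ThatTable';
declare @expected nvarchar(max) = N'SELECT TOP 1 * FROM ' + @thisTable + N';';
declare @actual nvarchar(max) = N'SELECT TOP 1 * FROM ' + @thatTable + N';';
exec tSQLt.AssertResultSetsHaveSameMetaData @expected, @actual;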
However, if you are using the latest version of tSQLt you can also now use tSQLt.AssertEqualsTableSchema to do the same thing. You would use this assertion like this:
exec tSQLt.AssertEqualsTableSchema
'mySchema.ThisTable'
, 'mySchema.ThatTable';
Once again, parameterising the table names would be easy since they are passed to AssertEqualsTableSchema as parameters.
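In that case the sketch is even simpler, since the runtime table names can be passed straight through (again, the names are assumptions):
-- hypothetical: pass the runtime table names directly to the assertion
declare @thisTable nvarchar(257) = N'mySchema.ThisTable';
declare @thatTable nvarchar(257) = N'mySchema.ThatTable';
exec tSQLt.AssertEqualsTableSchema @thisTable, @thatTable;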
If you explain the use case/context and provide sample code to explain what you are trying to do you stand a better chance of getting the help you need.
I'm programmatically fetching a bunch of datasets, many of them having silly names that begin with numbers and contain special characters like minus signs. Because none of the datasets are particularly large, and I wanted the benefit of R making its best guess about data types, I'm (ab)using dplyr to dump these tables into SQLite.
I am using square brackets to escape the horrible table names, but this doesn't seem to work. For example:
data(iris)
foo.db <- src_sqlite("foo.sqlite3", create = TRUE)
copy_to(foo.db, df=iris, name="[14m3-n4m3]")
This results in the error message:
Error in sqliteSendQuery(conn, statement, bind.data) : error in statement: no such table: 14m3-n4m3
This works if I choose a sensible name. However, due to a variety of reasons, I'd really like to keep the cumbersome names. I am also able to create such a badly-named table directly from sqlite:
sqlite> create table [14m3-n4m3](foo,bar,baz);
sqlite> .tables
14m3-n4m3
Without cracking into things too deeply, this looks like dplyr is handling the square brackets in some way that I cannot figure out. My suspicion is that this is a bug, but I wanted to check here first to make sure I wasn't missing something.
EDIT: I forgot to mention the case where I just pass the janky name directly to dplyr. This errors out as follows:
library(dplyr)
data(iris)
foo.db <- src_sqlite("foo.sqlite3", create = TRUE)
copy_to(foo.db, df=iris, name="14M3-N4M3")
Error in sqliteSendQuery(conn, statement, bind.data) :
error in statement: unrecognized token: "14M3"
This is a bug in dplyr. It's still there in the current github master. As #hadley indicates, he has tried to escape things like table names in dplyr to prevent this issue. The current problem you're having arises from a lack of escaping in two functions. Table creation works fine when providing the table name unescaped (and is done with dplyr::db_create_table). However, the insertion of data into the table is done using DBI::dbWriteTable, which doesn't support odd table names. If the table name is provided to this function escaped, it fails to find it in the list of tables (the first error you report). If it is provided unescaped, then the SQL to do the insertion is not syntactically valid.
The second issue comes when the table is updated. The code to get the field names, this time actually in dplyr, again fails to escape the table name because it uses paste0 rather than build_sql.
I've fixed both errors at a fork of dplyr. I've also put in a pull request to #hadley and made a note on the issue https://github.com/hadley/dplyr/issues/926. In the meantime, if you wanted to you could use devtools::install_github("NikNakk/dplyr", ref = "sqlite-escape") and then revert to the master version once it's been fixed.
Incidentally, the correct SQL-99 way to escape table names (and other identifiers) in SQL is with double quotes (see SQL standard to escape column names?). MS Access uses square brackets, while MySQL defaults to backticks. dplyr uses double quotes, per the standard.
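For reference, here is a small sketch of that standard quoting applied to the table name from the question, in plain SQL outside of dplyr:
-- SQL-standard identifier quoting with double quotes (works in sqlite3)
create table "14m3-n4m3" ("foo" text, "bar" text, "baz" text);
select "foo" from "14m3-n4m3";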
Finally, the proposal from #RichardScriven wouldn't work universally. For example, select is a perfectly valid name in R, but is not a syntactically valid table name in SQL. The same would be true for other reserved words.