SQLite GROUP BY

I am using DB Browser for SQLite to extract some interesting data from my database, but I have hit one big problem with the GROUP BY clause.
Even the most basic SELECT I can imagine is not working properly.
(The column is declared as FileName nvarchar(2147483647).)
SELECT FileName FROM TableName WHERE FileName LIKE '%Nieminen%' GROUP BY FileName
gives 5 rows, even though I know there are 9 distinct FileNames containing 'Nieminen' (I've browsed the table).
Can it be that GROUP BY in SQLite compares only the first N (e.g. 10) characters? From my observations it might be true...
Any clues?

Why don't you try the following:
SELECT DISTINCT FileName FROM TableName WHERE FileName LIKE '%Nieminen%'
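If the counts really differ, a quick sanity check is to compare the two forms directly; both of these should return the same number (a sketch against the table from the question):
SELECT COUNT(*) FROM (SELECT DISTINCT FileName FROM TableName WHERE FileName LIKE '%Nieminen%');
SELECT COUNT(*) FROM (SELECT FileName FROM TableName WHERE FileName LIKE '%Nieminen%' GROUP BY FileName);
GROUP BY in SQLite compares the full value, not an N-character prefix, so if the grouped count disagrees with what you see when browsing, the culprit could be a NOCASE collation or invisible characters making visually distinct values compare equal.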

Related

RODBC gives proper row count but yields empty query

Using R-3.5.0 and RODBC v. 1.3-15 on Windows.
I am trying to query data from a remote database. I can connect fine, and if I run a query to count the rows, the answer comes out correctly. But if I remove the count (select count(*)) and actually fetch the data via select *, I get an empty result (with some rather strange headers). Only two of the column names come out correctly; the rest are question marks and a number (as shown below). I can query the data in SQL Developer with no problem.
I include the simplest version of the code below, but I get the same results if I limit to just a few rows or add certain conditions, etc. Sorry I cannot create a reproducible example, but as this is a remote DB and I have no idea what the problem is, I'm not sure how I could even do that.
I can query other tables from different schemas within the same ODBC connection, so I don't think it is that. I have tried with and without the believeNRows and rows_at_time options.
Thank you for any thoughts.
channel <- odbcConnect("mydb", uid="myuser", pwd="mypass", believeNRows=FALSE,rows_at_time = 1)
myquery <- paste("select count(*) from MYSCHEMA.MYTABLE")
sqlQuery(channel, myquery)
COUNT(*)
1 149712361
myquery <- paste("select * from MYSCHEMA.MYTABLE")
sqlQuery(channel, myquery)
[1] ID FMC_IN_ID ? ?.1 ?.2 ?.3 ?.4 ?.5 ?.6 ?.7 ?.8 ?.9 ?.10 ?.11 ?.12 ?.13 ?.14 ?.15
<0 rows> (or 0-length row.names)
I would try the following:
- add a simple limit 100 to your query to see if you can get some data back
- add the believeNRows option to the sqlQuery call -- in my experience it is needed at that level
In case it helps others: the problem was that the database contained an Oracle spatial field (MDSYS.SDO_GEOMETRY). R did not know what to do with it. I assumed it would just be converted to a character, but instead it just got confused. By omitting the spatial field, the query worked fine.
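A sketch of that workaround: name the non-spatial columns explicitly and leave the geometry out. ID and FMC_IN_ID are the two columns visible in the output above; any remaining non-spatial columns would be listed the same way.
select ID, FMC_IN_ID from MYSCHEMA.MYTABLE
-- if the geometry itself is needed, converting it server-side
-- (e.g. with SDO_UTIL.TO_WKTGEOMETRY) may work, though large
-- CLOBs can cause their own issues in RODBC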

Prevent SQLite query from stripping leading zeros from numeric strings?

In my database, a table contains two columns, each holding an 8-character ASCII code, usually alphanumeric. For example, a row might contain A123B45C in col1 and PQ2R4680 in col2.
I need a query/view that outputs a 4-character string calculated as the 2nd and 3rd characters of each, concatenated. So in this example the extra column value would be 12Q2.
This is a cut-down version of the SQL I'd like to use, although it won't work as written because of zero stripping / conversion:
select
*,
(substr(col1, 2, 2) || substr(col2, 2, 2)) AS mode
from (nested SQL source query)
where (conditions)
This fails because if a row contains A00B23B4 in col1 and P32R4680 in col2, it will evaluate to 0032 and the query output will contain the numeric value 32, not 0032. (It's worse if col1 contains P1-2345 or "1.23456" or something like that.)
Other questions on preventing zero stripping and string-to-integer conversion in SQLite all relate to data in tables, where you can define a column with text affinity, or to static (quotable) data. In this case I can't do those things. I also can only create queries, not tables, so I can't write to a temp table.
What is the best way to ensure I get a 4 character output in all cases?
I believe your issue is not with substr stripping characters, as that works as expected. For example, with a table named stripping whose col1/col2 rows hold the sample values from the question (A123B45C / PQ2R4680 and A00B23B4 / P32R4680), running the query
SELECT substr(col1,2,2) || substr(col2,2,2) AS mode FROM stripping
results in 12Q2 and 0032, as expected.
Rather, your issue is likely in how you subsequently use mode, in which case you may need a CAST expression.
For example, the following reproduces what is possibly happening:
SELECT substr(col1,2,2) || substr(col2,2,2) AS mode, CAST(substr(col1,2,2) || substr(col2,2,2) AS INTEGER) AS oops FROM stripping
resulting in mode values 12Q2 and 0032 but oops values 12 and 32: the CAST to INTEGER keeps only the leading numeric prefix and drops the leading zeros.
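A quick self-contained check in any SQLite shell, using literals from the question (no table needed); the second statement is a sketch of pinning the result to TEXT so later numeric contexts cannot strip it:
SELECT substr('A00B23B4', 2, 2) || substr('P32R4680', 2, 2) AS mode;
-- returns '0032'
SELECT CAST(substr('A00B23B4', 2, 2) || substr('P32R4680', 2, 2) AS TEXT) AS mode;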

Adding value to existing database table in RSQLite

I am new to RSQLite.
I have an input document in text format in which values are separated by '|'.
I created a table with the required variables (dummy code as follows)
library(RSQLite)
db <- dbConnect(SQLite(), dbname = "test.sqlite")
dbSendQuery(conn = db,
"CREATE TABLE TABLE1(
MARKS INTEGER,
ROLLNUM INTEGER,
NAME CHAR(25),
DATED DATE)"
)
However, I am stuck on how to import values into the created table.
I cannot use an INSERT INTO ... VALUES command with typed-out literals, as there are thousands of rows and more than 20 columns in the original data file, and it is impossible to manually type in each data point.
Can someone suggest an alternative efficient way to do so?
You are using a scripting language. The whole point of that is literally to avoid manually typing each data point. Sorry.
You have two routes:
1: You have correctly loaded a database connection and created an empty table in your SQLite database. Nice!
To load data into the table, load your text file into R using e.g. df <- read.table('textfile.txt', sep='|') (modify the arguments to fit your text file).
To have a 'dynamic' INSERT statement, you can use placeholders. RSQLite allows both named and positional placeholders. To insert a single row, you can do:
dbSendQuery(db, 'INSERT INTO table1 (MARKS, ROLLNUM, NAME) VALUES (?, ?, ?);', list(1, 16, 'Big fellow'))
You see? The first ? got the value 1, the second ? got 16, and the last ? got the string Big fellow. Also note that you do not enclose placeholders for text in quotation marks (' or ")!
Now, you have thousands of rows. Or just more than one. Either way, you can send in your data frame. dbSendQuery has some requirements: 1) each vector must have the same number of entries (not an issue when providing a data.frame), and 2) you may only submit the same number of vectors as you have placeholders.
I assume your data frame df contains columns mark, roll, and name, corresponding to the table's columns. Then you may run:
dbSendQuery(db, 'INSERT INTO table1 (MARKS, ROLLNUM, NAME) VALUES (:mark, :roll, :name);', df)
This will execute an INSERT statement for each row in df!
TIP! Because an INSERT statement is executed for each row, inserting thousands of rows can take a long time: after each insert, data is written to file and indices are updated. Instead, enclose it in a transaction:
dbBegin(db)
res <- dbSendQuery(db, 'INSERT ...;', df)
dbClearResult(res)
dbCommit(db)
and SQLite will save the data to a journal file, only writing the result when you execute dbCommit(db). Try both methods and compare the speed!
2: Ah, yes. The second way. This can be done entirely in SQLite.
With the SQLite command-line utility (sqlite3 from your command line, not R), you can attach a text file as a table and simply run an INSERT INTO ... SELECT ...; command. Alternatively, read the text file in sqlite3 into a temporary table and run an INSERT INTO ... SELECT ...; as sketched below.
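A minimal sketch of that statement, assuming the pipe-delimited file has been imported into a temporary table whose columns match TABLE1 (the table name staging is an assumption for illustration):
INSERT INTO TABLE1 (MARKS, ROLLNUM, NAME, DATED)
SELECT MARKS, ROLLNUM, NAME, DATED
FROM staging;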
Useful site to remember: http://www.sqlite.com/lang.html
A little late to the party, but DBI provides dbAppendTable() which will write the contents of a dataframe to an SQL table. Column names in the dataframe must match the field names in the database. For your example, the following code would insert the contents of my random dataframe into your newly created table.
library(DBI)
db <- dbConnect(RSQLite::SQLite(), dbname = ":memory:")
dbExecute(db,
"CREATE TABLE TABLE1(
MARKS INTEGER,
ROLLNUM INTEGER,
NAME TEXT
)"
)
df <- data.frame(MARKS = sample(1:100, 10),
ROLLNUM = sample(1:100, 10),
NAME = stringi::stri_rand_strings(10, 10))
dbAppendTable(db, "TABLE1", df)
I don't think there is a nice way to do a large number of inserts directly from R. SQLite does have a bulk insert functionality, but the RSQLite package does not appear to expose it.
From the command line you may try the following:
.separator |
.import your_file.csv your_table
where your_file.csv is the CSV (or pipe-delimited) file containing your data and your_table is the destination table.
See the documentation under CSV Import for more information.

SQL*Loader: how to use SUBSTR in a WHEN clause correctly?

I'm having a problem that I thought to be rather common, but looking it up in the "Oracle Database 10g2 Utilities_b14215.pdf" didn't help, and after that I surfed through numerous threads with no luck so far.
I have a tab-delimited file (x'09') with fields such as name, userid, persnr. The values for the userids begin with either P, R or T, e.g. P2198, P2199, R7288, T1229.
I want to load only the records with userids beginning with P.
Isolating a single record with a controlfile like this works splendidly:
OPTIONS (SKIP=1)
LOAD DATA
INFILE UserlistLoader.dat
APPEND
INTO TABLE Z_USERLIST
WHEN USERID = 'P2198'
FIELDS TERMINATED BY x'09'
TRAILING NULLCOLS
(name, userid, persnr)
But every attempt at using SUBSTR in the when-clause fails.
This:
OPTIONS (SKIP=1)
LOAD DATA
INFILE UserlistLoader.dat
APPEND
INTO TABLE Z_USERLIST
WHEN SUBSTR(USERID, 1, 1) = 'P'
FIELDS TERMINATED BY x'09'
TRAILING NULLCOLS
(name, userid, persnr)
fails with an SQL*Loader-350 syntax error.
This
OPTIONS (SKIP=1)
LOAD DATA
INFILE UserlistLoader.dat
APPEND
INTO TABLE Z_USERLIST
WHEN "SUBSTR(:USERID, 1, 1)" = 'P'
FIELDS TERMINATED BY x'09'
TRAILING NULLCOLS
(name, userid, persnr)
ends in an SQL*Loader-403: Referenced column USERID not present in table Z_USERLIST.
But IT IS PRESENT - as the first example proves. I've found that the column should be preceded by : but that obviously isn't the issue.
What am I doing wrong?
From the SQL*Loader docs, the left-hand side of a WHEN condition can only be a full field name, e.g. USERID, or a position spec, e.g. (3:5).
The docs aren't very clear, though, on what operators are allowed - e.g. can LIKE be used?
USERID LIKE 'P%'
I strongly suspect it can't.
I would load the entire file into a staging table that matches the file layout, then run a procedure that inserts the rows you want from there into the production table. That is a more common way to handle loads with criteria like this without having to edit source data.
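A sketch of that second step; the staging table name Z_USERLIST_STAGE is assumed here, while Z_USERLIST and its fields come from the question:
INSERT INTO Z_USERLIST (name, userid, persnr)
SELECT name, userid, persnr
FROM Z_USERLIST_STAGE
WHERE userid LIKE 'P%';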
If you can preprocess the source file, move the userid to the first field, or copy the first letter of the userid to its own field, and construct the WHEN so that sqlldr looks at the first position (note this will cause sqlldr to return a non-zero exit code, as not all rows meet the WHEN clause criteria):
WHEN (1) = 'P'
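For example, the control file from the question, adjusted under the assumption that the file has been rearranged so userid is the first field:
OPTIONS (SKIP=1)
LOAD DATA
INFILE UserlistLoader.dat
APPEND
INTO TABLE Z_USERLIST
WHEN (1) = 'P'
FIELDS TERMINATED BY x'09'
TRAILING NULLCOLS
(userid, name, persnr)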

SQLite: select rows where a certain column is contained in a given string

I have a table which has a column named "directory" which contains strings like:
c:\mydir1\mysubdir1\
c:\mydir2
j:\myotherdir
...
I would like to do something like:
SELECT * FROM mytable WHERE directory is contained within 'c:\mydir2\something\'
This query should give me as a result:
c:\mydir2
OK, I've just found that SQLite has a function instr that seems to work for my purpose.
Not sure about the performance, though.
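For reference, a minimal sketch of that approach: instr(X, Y) returns the 1-based position of Y within X, or 0 if it is absent (available since SQLite 3.7.15):
SELECT * FROM mytable WHERE instr('c:\mydir2\something\', directory) > 0;
On performance: this is a plain substring test, so it scans the whole table and cannot use an index on directory.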
