In a script that processes a lot of rows in a MySQL server, I use dbSendQuery and fetch to throttle the fetching and processing of results.
When my fetch command retrieves exactly the number of rows available (or left) in the resultset, leaving 0 rows to be fetched, dbHasCompleted returns FALSE whereas I expected it to return TRUE.
query <- "select record_id, name
          from big_table left join another_table using (record_id)
          limit 500"
resultset <- dbSendQuery(con, query)  # con: DB connection
while (!dbHasCompleted(resultset)) {
  input <- fetch(resultset, n = 500)
  print(paste("Rows fetched:", nrow(input)))
  # process input ...
}
I expected this loop to run once, but there is an extra run as after processing, print is called again:
Rows fetched: 500
...
Rows fetched: 0
Apparently, dbHasCompleted(resultset) is FALSE when exactly the number of available rows has been fetched (the same behaviour occurs for n = 1000, 2000 and 3000). With n = 501 in this script, there is no second iteration.
Is this to be expected? Am I doing something wrong?
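For comparison, the same pattern can be sketched with Python's sqlite3 module (names here are mine, not from the question). Generic database cursors behave the same way: the driver cannot know in advance whether more rows remain, so exhaustion is only signalled when a fetch returns an empty batch, which is why a loop testing the batch itself avoids the extra iteration.

```python
import sqlite3

# Illustrative table with exactly 500 rows, mirroring the LIMIT 500 query.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE big_table (record_id INTEGER, name TEXT)")
con.executemany("INSERT INTO big_table VALUES (?, ?)",
                [(i, f"name{i}") for i in range(500)])

cur = con.execute("SELECT record_id, name FROM big_table LIMIT 500")
batches = 0
while True:
    batch = cur.fetchmany(500)   # analogous to fetch(resultset, n = 500)
    if not batch:                # an empty batch is the completion signal
        break
    batches += 1
    # process batch ...
print(batches)  # -> 1: one real batch, no zero-row processing pass
```

The loop condition moves from "has the driver flagged completion?" to "did the last fetch actually return rows?", which sidesteps the off-by-one behaviour entirely.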
Related
I have an R script for data analysis. I run it on 6 different tables from my MySQL database. On 5 of them the script works fine, but on the last table it doesn't work. Here is part of my code:
sql <- ""
#write union select for just one access to database which will optimize code
for (i in 2:length(awq)-1) {
  num <- awq[i] - 1
  sql <- paste(sql, "(SELECT * FROM mytable LIMIT ", num, ",1) UNION ")
}
sql <- paste(sql, "(SELECT * FROM mytable LIMIT ", awq[length(awq)-1], ",1)")
#database query
nb <- dbGetQuery(mydb, sql)
The MySQL table where the script doesn't work has 21,676 rows; my other tables have fewer than 20,000 rows and the script works on them. When it fails, it gives this error:
Error in .local(conn, statement, ...) :
could not run statement: memory exhausted near '1) UNION (SELECT * FROM mytable LIMIT 14107 ,1) UNION (SELECT * FROM mytabl' at line 1
I understand this is a memory problem, but how do I solve it? I don't want to delete rows from my table. Is there another way?
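One alternative that avoids building a huge UNION statement is to run a single plain query and pick out the wanted row positions on the client. A sketch with Python's sqlite3 (table and column names are illustrative, not from the original database; it assumes, as in the question, that awq holds 1-based row positions):

```python
import sqlite3

# Synthetic stand-in for "mytable" with 100 rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE mytable (id INTEGER, val TEXT)")
con.executemany("INSERT INTO mytable VALUES (?, ?)",
                [(i, f"v{i}") for i in range(1, 101)])

awq = [3, 7, 42]                  # wanted 1-based row positions
wanted = set(p - 1 for p in awq)  # convert to 0-based offsets

rows = []
cur = con.execute("SELECT * FROM mytable")  # one statement, no UNION branches
for offset, row in enumerate(cur):
    if offset in wanted:
        rows.append(row)

print([r[0] for r in rows])  # -> [3, 7, 42]
```

The statement text stays a few bytes long no matter how many positions you need, so the server never has to parse thousands of UNION branches.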
Suppose you're trying to do a typical insert-or-update loop with RSQLite. I would expect the following to work:
library(DBI)
testdb <- dbConnect(RSQLite::SQLite(), "test.sqlite")
dbExecute(testdb, "CREATE TABLE spray_count (spray TEXT, count INTEGER)")
urs <- dbSendStatement(testdb, "UPDATE spray_count SET count = count + :count WHERE spray = :spray")
irs <- dbSendStatement(testdb, "INSERT INTO spray_count VALUES (:spray, :count)")
for (i in 1:nrow(InsectSprays)) {
  print(paste("update", i))
  dbBind(urs, InsectSprays[i, ])
  if (!dbGetRowsAffected(urs)) {
    print(paste("insert", i))
    dbBind(irs, InsectSprays[i, ])
  }
}
But it does not:
[1] "update 1"
Error in rsqlite_bind_rows(res@ptr, params) :
  external pointer is not valid
In addition: Warning message:
Closing open result set, pending rows
Basically it seems that you can have only one prepared statement at a time, and creating a second one somehow invalidates the first. Am I missing something or is this a limitation of DBI and/or RSQLite? I have DBI v0.6-1 and RSQLite v1.1-2.
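One way to sidestep the one-open-statement limitation is to collapse the update-then-insert pair into a single SQLite UPSERT (INSERT ... ON CONFLICT ... DO UPDATE). A sketch with Python's sqlite3, as an illustration of the SQL rather than of the DBI API; it requires SQLite >= 3.24 and a unique index on the conflict column:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# PRIMARY KEY on spray gives ON CONFLICT something to detect.
con.execute("CREATE TABLE spray_count (spray TEXT PRIMARY KEY, count INTEGER)")

data = [("A", 10), ("B", 7), ("A", 5)]  # "A" appears twice -> update path
for spray, count in data:
    con.execute(
        "INSERT INTO spray_count VALUES (?, ?) "
        "ON CONFLICT(spray) DO UPDATE SET count = count + excluded.count",
        (spray, count))

print(con.execute(
    "SELECT spray, count FROM spray_count ORDER BY spray").fetchall())
# -> [('A', 15), ('B', 7)]
```

Because the insert-or-update decision happens inside one statement, only a single prepared statement needs to be open at any time.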
I need to count the elements in the result of an SQL query:
db._query('FOR v in visitors FILTER v.ip == "127.0.0.1" return COUNT(v.guid) ')
This request returns the length of each GUID, but I need the total number of GUIDs, for example: 2.
You need to use the result of the query as input for the COUNT function, and then RETURN this result.
You can replace the RETURN value of the actual query by 1 for performance reasons:
RETURN COUNT(FOR v IN visitors FILTER v.ip == "127.0.0.1" RETURN 1)
An updated version (2022):
FOR m IN messages
  FILTER DATE_HOUR(m.date) == 3
  COLLECT WITH COUNT INTO length
  RETURN length
I am trying to fetch a decently large result set (about 1-2M records) using RJDBC using the following
library(RJDBC)
drv <- JDBC("oracle.jdbc.driver.OracleDriver",
            classPath = "../oracle11g/ojdbc6.jar", " ")
con <- dbConnect(drv, "jdbc:oracle:thin:@hostname:1521/servname", "user", "pswd")
data <- dbGetQuery(con, "select * from largeTable where rownum < xxx")
The above works if xxx is less than 32768. Above 32800, I get the following exception
> data <- dbGetQuery(con, "select * from dba_objects where rownum < 32768")
> dim(data)
[1] 32767 15
> data <- dbGetQuery(con, "select * from dba_objects where rownum < 32989")
Error in .jcall(rp, "I", "fetch", stride) :
java.sql.SQLException: Bigger type length than Maximum
In https://cran.r-project.org/web/packages/RJDBC/RJDBC.pdf, I see "fetch retrieves the content of the result set in the form of a data frame. If n is -1 then the current implementation fetches 32k rows first and then (if not sufficient) continues with chunks of 512k rows, appending them." followed by "Note that some databases (like Oracle) don’t support a fetch size of more than 32767."
Sorry for the newbie question but I don't see how I can tell dbGetQuery to fetch the result set in chunks of 32K only. I believe my fetch is dying because it went to fetch 512K records.
Would really appreciate any suggestions. Thanks in advance.
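The general technique is to keep every fetch at or under the driver's cap and accumulate chunks yourself, instead of letting one call pull the whole result set. A sketch of that chunked loop with Python's sqlite3 (the table is synthetic; 32767 is the Oracle cap quoted in the RJDBC documentation above):

```python
import sqlite3

# Synthetic large table: 100,000 rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE big (n INTEGER)")
con.executemany("INSERT INTO big VALUES (?)", [(i,) for i in range(100_000)])

CHUNK = 32767                      # stay under the driver's fetch-size cap
cur = con.execute("SELECT n FROM big")
rows = []
while True:
    chunk = cur.fetchmany(CHUNK)   # never request more than CHUNK rows at once
    if not chunk:
        break
    rows.extend(chunk)
print(len(rows))  # -> 100000
```

In RJDBC terms this corresponds to replacing the single dbGetQuery call with dbSendQuery followed by repeated fetch(res, n = 32767) calls, so the driver is never asked for the problematic 512K block.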
How can I get a single-row result (e.g. in the form of a table/array) back from an SQL statement, using Lua SQLite (LuaSQLite3)? For example, this one:
SELECT * FROM sqlite_master WHERE name ='myTable';
So far I note:
using "nrows"/"rows" it gives an iterator back
using "exec" it doesn't seem to give a result back(?)
Specific questions are then:
Q1 - How to get a single row (say first row) result back?
Q2 - How to get row count? (e.g. num_rows_returned = db:XXXX(sql))
In order to get a single row use the db:first_row method. Like so.
row = db:first_row("SELECT `id` FROM `table`")
print(row.id)
In order to get the row count use the SQL COUNT statement. Like so.
row = db:first_row("SELECT COUNT(`id`) AS count FROM `table`")
print(row.count)
EDIT: Ah, sorry for that. Here are some methods that should work.
You can also use db:nrows. Like so.
rows = db:nrows("SELECT `id` FROM `table`")
row = rows[1]
print(row.id)
We can also modify this to get the number of rows.
rows = db:nrows("SELECT COUNT(`id`) AS count FROM `table`")
row = rows[1]
print(row.count)
Here is a demo of getting the returned count:
> require "lsqlite3"
> db = sqlite3.open":memory:"
> db:exec "create table foo (x,y,z);"
> for x in db:urows "select count(*) from foo" do print(x) end
0
> db:exec "insert into foo values (10,11,12);"
> for x in db:urows "select count(*) from foo" do print(x) end
1
>
Just loop over the iterator you get back from rows (or whichever function you use), but put a break at the end so you iterate only once.
Getting the count is all about using SQL. You compute it with the SELECT statement:
SELECT count(*) FROM ...
This will return one row containing a single value: the number of rows in the query.
This is similar to what I'm using in my project and works well for me.
local query = "SELECT content FROM playerData WHERE name = 'myTable' LIMIT 1"
local queryResultTable = {}
local queryFunction = function(userData, numberOfColumns, columnValues, columnTitles)
  for i = 1, numberOfColumns do
    queryResultTable[columnTitles[i]] = columnValues[i]
  end
end
db:exec(query, queryFunction)
for k, v in pairs(queryResultTable) do
  print(k, v)
end
You can even concatenate values into the query to place inside a generic method/function.
local query = "SELECT * FROM ZQuestionTable WHERE ConceptNumber = "..conceptNumber.." AND QuestionNumber = "..questionNumber.." LIMIT 1"
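Concatenation works, but binding parameters instead avoids quoting mistakes and SQL injection. A sketch with Python's sqlite3 (the table and column names mirror the Lua snippet but are illustrative; LuaSQLite3 offers the analogous db:prepare plus stmt:bind_values):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ZQuestionTable "
            "(ConceptNumber INTEGER, QuestionNumber INTEGER, body TEXT)")
con.execute("INSERT INTO ZQuestionTable VALUES (2, 5, 'What is COUNT(*)?')")

concept_number, question_number = 2, 5
# Placeholders (?) are filled by the driver, so the values are never
# spliced into the SQL text.
row = con.execute(
    "SELECT * FROM ZQuestionTable "
    "WHERE ConceptNumber = ? AND QuestionNumber = ? LIMIT 1",
    (concept_number, question_number)).fetchone()
print(row)  # -> (2, 5, 'What is COUNT(*)?')
```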