How might I get detailed database error messages from dplyr::tbl? - r

I'm using R to plot some data I pull out of a database (the Stack Exchange data dump, to be specific):
dplyr::tbl(serverfault,
dbplyr::sql("
select year(p.CreationDate) year,
avg(p.AnswerCount*1.0) answers_per_question,
sum(iif(ClosedDate is null, 0.0, 100.0))/count(*) close_rate
from Posts p
where PostTypeId = 1
group by year(p.CreationDate)
order by year(p.CreationDate)
"))
The query works fine on SEDE, but I get this error in the R console:
Error: <SQL> 'SELECT *
FROM (
select year(p.CreationDate) year,
avg(p.AnswerCount*1.0) answers_per_question,
sum(iif(ClosedDate is null, 0.0, 100.0))/count(*) close_rate
from Posts p
where PostTypeId = 1
group by year(p.CreationDate)
order by year(p.CreationDate)
) "zzz11"
WHERE (0 = 1)'
nanodbc/nanodbc.cpp:1587: 42000: [FreeTDS][SQL Server]Statement(s) could not be prepared.
I reckoned "Statement(s) could not be prepared." meant that SQL Server didn't like the query for some reason. Unfortunately, it didn't give any hint about what went wrong. After fiddling with the query for a bit, I noticed it was wrapped in a subselect, according to the error message. Copying and executing the full query as constructed by one of the libraries in the chain, SQL Server gave me this more informative error message:
The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP, OFFSET or FOR XML is also specified.
Now the solution is obvious: remove (or comment out) the order by clause. But where is the detailed error message in the R console? I'm using Rstudio, should that matter. If I could get the full exception right next to the code I'm working on, it would help me fix bug a lot quicker. (And just to be clear, I get cryptic errors from dplyr::tbl often and typically use binary search debugging to fix them.)

Related

SQR - how to use FROM [dynamic table name} within BEGIN-SELECT?

I have to create an SQR that generates a list of EEIDs, if there were any changes to the Pension data in the past day. The SQR compiles and works perfectly when I hardcode in the table names.
However, when I tried using variables for the table names, I get a compile error
I've pasted the portion of SQR that I'm trying to fix
When I start using $tableName and $auditTableName as table variables, that's when I get the error and I'm not sure what is going wrong
Can anyone help?
Please and Thank You
!***************************
begin-procedure Process-Main
!***************************
let $tableName = 'PS_PENSION_PLAN'
let $auditTableName = 'PS_AUDIT_PENSION_PLN'
let $dummy-dyn-variable = ''
begin-SELECT DISTINCT
L.EMPLID
L.EMPL_RCD
do someProcName(&L.EMPLID, &L.EMPL_RCD)
FROM [$dummy-dyn-variable]
(
SELECT DISTINCT
PP.EMPLID,
PP.EMPL_RCD,
PP.EFFDT,
'1901-01-01 12:00:00' AS AUDIT_STAMP
FROM [$dummy-dyn-variable] [$tableName] PP
UNION
SELECT DISTINCT
A.EMPLID,
A.EMPL_RCD,
A.EFFDT,
A.AUDIT_STAMP
FROM [$dummy-dyn-variable] [$auditTableName] A
)L
WHERE DATEDIFF(DAY,CAST(L.AUDIT_STAMP AS DATE),SYSDATE) = 1
ORDER BY 1,2
end-SELECT
end-procedure
Edit:
does the UNION have anything to do with this?
I keep receiving is this error:
(SQR 5528) ODBC SQL dbdesc: SQLNumResultCols error 102 in cursor 1:
[Microsoft][ODBC SQL Server Driver][SQL Server]Incorrect syntax near 'FROM'.
(SQR 5528) ODBC SQL dbdesc: SQLNumResultCols error 8180 in cursor 1:
[Microsoft][ODBC SQL Server Driver][SQL Server]Statement(s) could not be prepared.
Edit2:
Ok, initial problem solved with [$dummy-dyn-variable], which led to the next problem with the DO command. I've updated the code above with DO someProcName(param_a, param_b)
I am now getting an error saying:
(SQR 2002) DO arguments do not match procedure's
Weird part, if I remove the dynamic table variables and hardcode the table names in the FROM section, then it compiles properly without errors. This makes me believe that the error is not related to my someProcName (maybe?)
am I missing something here?

MariaDB SphinxSE not accepting weights parameter

If I try this query:
select * FROM sphinx.products where `query` = "test";
it works. But if I try to give it weights it returns an error:
select * FROM sphinx.products where `query` = "test;sort=extended:#weight DESC;weights=3,1,1,1";
Fails with error:
Error in query (1430): There was a problem processing the query on the foreign data source.
Data source error: searchd error: invalid deprecated unordered_weight count 4 (expe
(Error reported by MariaDB gets truncated there, but I believe it says "expecting 0")
And:
select * FROM sphinx.products where `query` = "test;sort=extended:#weight DESC";
Fails with error:
Error in query (1430): There was a problem processing the query on the foreign data source.
Data source error: searchd error: index 'sku_products': sort-by attribute '#weight'
(Again, error returned by SphinxSearch gets truncated by MariaDB)
All the documentation I find about SphinxSE tells me to query the index this way, yet it does not work, but nobody in the Internet seem to have met this error since nobody is asking about this anywhere...
Am I doing something wrong?
Well, the option weights= didn't work, but it accepted fieldweights=sku,90,partnumber,30,barcode,20,name,10.
(I.e., fieldweights=<field1_name>,<field1_weight>,...)
Results came ordered by weight even without specifying sort=extended:#weight DESC, so I dodged both errors and got what I needed.
Hope this helps anyone in the same situation.

Join works in Azure SQL but fails with with R DBI connection: The multi-part identifier could not be found

I have a query that works perfectly in SSMS. But when running the query in R using the DBI package, I receive several multipart identifier errors: The multi-part identifier: "rt.secondary_id" could not be bound, "rt.third_id" could not be bound, and "t2.important" could not be bound.
select t1.[main_id]
,rt.secondary_id
,rt.third_id
,t1.[date_col]
,t2.important
from t1
inner join rt on t1.main_id = rt.main_id
inner join t2 on rt.main_id = t2.main_id
inner join (select t1.main_id, max(t1.date_col) as upload_time from t1 group by t1.main_id) AS ag ON t1.main_id = ag.main_id AND t1.date_col = ag.upload_time
The unique identifier in t1 is the combination of main_id and date_col, and this query finds the most recent entry in t1 for a given main_id.
Not exactly sure if my query is structured in a poor way or this is an R issue. I've tried adding SET NOCOUNT ON to the query based on what I thought might be related issues elsewhere on stackoverflow, but no dice.
I found out what my issue was- silly (but time consuming) mistake on my part... but essentially, I was bringing my SQL query into R via paste(scan(...), collapse = " "). I had a comment in my SQL query, --, which could not be read correctly by R. Deleting the comment OR switching the comment to /* ... */ syntax fixes the problem.

ORA-00907: missing right parenthesis, on Oracle 10 and not on Oracle 11

Why the following query fails on Oracle 10 an not on Oracle 11.
SELECT trunc(DBMS_RANDOM.value(low => 10, high =>50)) from dual;
Oracle 10:
ORA-00907: missing right parenthesis
This answer is a bit speculative, but one possible explanation for the missing right parentheses error is that this error is not really about missing parentheses. Instead, if the API for DBMS_RANDOM.value is different in your version of Oracle 10 vs. Oracle 11 then you could be seeing this error. Try this query instead:
SELECT TRUNC(DBMS_RANDOM.value(10, 50))
FROM dual
If this works, then you will know that the API has changed between Oracle 10 and 11.
Here is a reference which uses the API as I have in my query.
This was a new feature in 11gR1:
Beginning in this release, it is now possible to invoke the function
in a SQL statement. For example, named notation syntax is:
SELECT f(pn=>3, p2=>2, p1=>1) FROM dual
Or, mixed notation is:
SELECT f(1, pn=>3) FROM dual
In previous releases, attempting named or mixed notation resulted in
an error.
So prior to 11g you could only call a PL/SQL function from SQL using positional notation, e.g.
SELECT trunc(DBMS_RANDOM.value(10, 50)) from dual;
... as #TimBiegeleisen has already shown works. The 'missing right parenthesis' error doesn't necessarily mean your parentheses are unbalanced; just that the parser saw something it didn't expect where it thought a parenthesis might go - in this case, at the =>.

Create a stored procedure using RMySQL

Background: I am developing a rscript that pulls data from a mysql database, performs a logistic regression and then inserts the predictions back into the database. I want the entire system to be self contained in the script in case of database failure. This includes all mysql stored procedures that the script depends on to aggregate the data on the backend since these would be deleted in such a database failure.
Question: I'm having trouble creating a stored procedure from an R script. I am running the following:
mySQLDriver <- dbDriver("MySQL")
connect <- dbConnect(mySQLDriver, group = connection)
query <-
"
DROP PROCEDURE IF EXISTS Test.Tester;
DELIMITER //
CREATE PROCEDURE Test.Tester()
BEGIN
/***DO DATA AGGREGATION***/
END //
DELIMITER ;
"
sendQuery <- dbSendQuery(connect, query)
dbClearResult(dbListResults(connect)[[1]])
dbDisconnect(connect)
I however get the following error that seems to involve the DELIMITER change.
Error in .local(conn, statement, ...) :
could not run statement: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'DELIMITER //
CREATE PROCEDURE Test.Tester()
BEGIN
/***DO DATA AGGREGATION***/
EN' at line 2
What I've Done: I have spent quite a bit of time searching for the answer, but have come up with nothing. What am I missing?
Just wanted to follow up on this string of comments. Thank you for your thoughts on this issue. I have a couple Python scripts that need to have this functionality and I began researching the same topic for Python. I found this question that indicates the answer. The question states:
"The DELIMITER command is a MySQL shell client builtin, and it's recognized only by that program (and MySQL Query Browser). It's not necessary to use DELIMITER if you execute SQL statements directly through an API.
The purpose of DELIMITER is to help you avoid ambiguity about the termination of the CREATE FUNCTION statement, when the statement itself can contain semicolon characters. This is important in the shell client, where by default a semicolon terminates an SQL statement. You need to set the statement terminator to some other character in order to submit the body of a function (or trigger or procedure)."
Hence the following code will run in R:
mySQLDriver <- dbDriver("MySQL")
connect <- dbConnect(mySQLDriver, group = connection)
query <-
"
CREATE PROCEDURE Test.Tester()
BEGIN
/***DO DATA AGGREGATION***/
END
"
sendQuery <- dbSendQuery(connect, query)
dbClearResult(dbListResults(connect)[[1]])
dbDisconnect(connect)

Resources