select into temporary table - r

I believe I should be able to do select * into #temptable from othertable (where #temptable does not already exist), but it does not work. Assuming that othertable exists and has valid data, and that #sometemp does not exist:
# conn <- DBI::dbConnect(...)
DBI::dbExecute(conn, "select top 1 * into #sometemp from othertable")
# [1] 1
DBI::dbGetQuery(conn, "select * from #sometemp")
# Error: nanodbc/nanodbc.cpp:1655: 42000: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid object name '#sometemp'. [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Statement(s) could not be prepared.
The non-temporary version works without error:
DBI::dbExecute(conn, "select top 1 * into sometemp from othertable")
# [1] 1
DBI::dbGetQuery(conn, "select * from sometemp")
# ... valid data ...
System info:
conn
# <OdbcConnection> myuser@otherdomain-DATA01
# Database: dbname
# Microsoft SQL Server Version: 13.00.5026
DBI::dbGetQuery(conn, "select @@version")
#
# 1 Microsoft SQL Server 2016 (SP2) (KB4052908) - 13.0.5026.0 (X64) \n\tMar 18 2018 09:11:49 \n\tCopyright (c) Microsoft Corporation\n\tStandard Edition (64-bit) on Windows Server 2016 Standard 10.0 <X64> (Build 14393: )\n
Tested on Win11 and Ubuntu. R-4.1.2, DBI-1.1.2, odbc-1.3.3.
I've seen some comments that suggest "select into ..." isn't for temporary tables, but I've also seen several tutorials demonstrate that it works (for them).
Back-story: this is for a generic accessor function for upserting data: I insert into a temp table, do the upsert, then remove the temp table. I can use a non-temp table, but I think there are valid reasons to use temps when justified, and I want to understand why this doesn't or shouldn't work as intended. Other than switching away from temps, I could try to reconstitute the structure of othertable programmatically, but that is prone to interpretation errors with some column types. I can't just insert into a temp table I create myself, since there are times when the data types are imperfectly mapped (such as when I should use nvarchar(max), and/or when a new column's type is indeterminate because it is all-NA).
Related links:
Insert Data Into Temp Table with Query (2013)
https://www.sqlshack.com/select-into-temp-table-statement-in-sql-server/ (2021)

There are a few different approaches:
Use the immediate argument in your DBI::dbExecute() call:
DBI::dbExecute(conn, "select top 5 * into #local from sometable", immediate=TRUE)
DBI::dbGetQuery(conn, "select * from #local")
Use a global temp table:
DBI::dbExecute(conn, "select top 5 * into ##global from sometable")
DBI::dbGetQuery(conn, "select * from ##global")
Use dplyr/dbplyr:
library(dplyr)  # dbplyr provides the database backend for tbl() and compute()
tt <- tbl(conn, sql("select top 5 * from sometable")) %>% compute()
tt
Also see here: https://github.com/r-dbi/odbc/issues/127
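Combining the first approach with the upsert back-story from the question, a minimal sketch might look like the following. This is an assumption-laden illustration, not a tested recipe: othertable, id, val, and new_rows are placeholders, and whether DBI::dbAppendTable() accepts a "#"-prefixed name depends on your odbc version.
# Stage, merge, clean up -- all on one connection so #stage stays visible.
DBI::dbExecute(conn, "select * into #stage from othertable where 1 = 0", immediate = TRUE)
DBI::dbAppendTable(conn, "#stage", new_rows)  # new_rows: data.frame to upsert
DBI::dbExecute(conn, "
  merge othertable as tgt
  using #stage as src on tgt.id = src.id
  when matched then update set tgt.val = src.val
  when not matched then insert (id, val) values (src.id, src.val);",
  immediate = TRUE)
DBI::dbExecute(conn, "drop table #stage", immediate = TRUE)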

Related

SQL Server ALTER datetime to datetime2 does not work

I am trying to convert a "datetime" variable to "datetime2" format.
# Load libraries
library(DBI)
library(tidyverse)
# Create dataframe
df <- data.frame("myid" = stringi::stri_rand_strings(5, 5),
                 "mydate" = c(Sys.time(), Sys.time() - 1, Sys.time() - 2,
                              Sys.time() - 3, Sys.time() - 4))
# Create SQL table sschema.ttable
DBI::dbWriteTable(conn = connection,
                  name = DBI::Id(schema = "sschema", table = "ttable"),
                  value = df,
                  overwrite = TRUE,
                  append = FALSE)
# Query for variable type in the SQL table
query <- paste0("exec sp_columns ", "ttable")
query <- DBI::dbSendQuery(connection, query)
res <- NULL
res <- DBI::dbFetch(query)
DBI::dbClearResult(query)
view(res)
# Alter mydate to datetime2
query <- DBI::dbSendStatement(conn = connection,
                              statement = "ALTER TABLE sschema.ttable ALTER COLUMN mydate datetime2")
DBI::dbFetch(query)
DBI::dbClearResult(query)
but this leads to the error:
Error: nanodbc/nanodbc.cpp:1617: 00000: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]The UPDATE permission was denied on the object 'ttable', database 'dbo', schema 'sschema'.
'ALTER TABLE sschema.ttablename ALTER COLUMN mydate datetime2'
However, converting another VARCHAR(10) variable in the same table to VARCHAR(100) works fine. Any idea what the problem is? How can I get this working?
I am working with Microsoft SQL Azure version 12, from an RStudio Server session using the DBI library.
To change the data type of a column you must have both the ALTER permission and UPDATE permission on the table.
From the docs:
Adding a column that updates the rows of the table requires UPDATE permission on the table.
ALTER TABLE - permissions
This goes for ALTERing an existing column too, as you can verify like this:
use tempdb
go
revert
go
if exists(select * from sys.database_principals where name = 'fred')
drop user fred
go
drop table if exists tablename
go
create user fred without login
create table tablename(id int, variablename varchar(20))
go
grant select on tablename to fred
--grant update on tablename to fred --uncomment to clear error
grant alter on schema::dbo to fred
execute as user='fred'
ALTER TABLE dbo.tablename ALTER COLUMN variablename datetime2
revert
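Applied to the Azure question above, the likely fix is to have a sufficiently privileged principal grant UPDATE to the connecting user. A minimal sketch (the user name is a placeholder, and the GRANT must be issued by someone with rights to grant on that table):
DBI::dbExecute(connection, "GRANT UPDATE ON sschema.ttable TO myuser")
# after which the original statement should succeed:
DBI::dbExecute(connection, "ALTER TABLE sschema.ttable ALTER COLUMN mydate datetime2")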

Writing a Teradata With Statement in RODBC in R

I have connected Teradata to my R session with RODBC.
Typically I use data <- sqlQuery(conn, "SELECT statement"); however, when I put the following WITH statement in place of the SELECT statement, I get an error.
data <- sqlQuery(conn,
"WITH zzz as (SELECT statement1),
yyy as (SELECT statement2)
SELECT statement3"
Try correcting the mismatched " and ), as below:
data <- sqlQuery(conn,
"WITH zzz as (SELECT statement1),
yyy as (SELECT statement2)
SELECT statement3")

R - RMysql - could not run statement: memory exhausted

I have an R script for data analysis. I tried it on 6 different tables from my MySQL database; on 5 of them the script works fine, but on the last table it does not. Here is part of my code:
sql <- ""
#write union select for just one access to database which will optimize code
for (i in 2:length(awq)-1){
num <- awq[i]-1
sql <- paste(sql, "(SELECT * FROM mytable LIMIT ", num, ",1) UNION ")
}
sql <- paste(sql, "(SELECT * FROM mytable LIMIT ", awq[length(awq)-1], ",1)")
#database query
nb <- dbGetQuery(mydb, sql)
The MySQL table where the script fails has 21,676 rows; my other tables have under 20,000 rows and the script works with them. When it fails, it gives me this error:
Error in .local(conn, statement, ...) :
could not run statement: memory exhausted near '1) UNION (SELECT * FROM mytable LIMIT 14107 ,1) UNION (SELECT * FROM mytabl' at line 1
I understand this is a memory problem, but how can I solve it? I don't want to delete rows from my table. Is there another way?
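One workaround sketch, assuming the failure is the server's parser choking on the very long statement rather than on the data itself (my assumption, not a confirmed diagnosis): send the same one-row selects in smaller batches and bind the pieces together in R.
offsets <- awq - 1  # roughly the offsets the original loop builds
batches <- split(offsets, ceiling(seq_along(offsets) / 500))  # 500 selects per statement
nb <- do.call(rbind, lapply(batches, function(off) {
  sql <- paste0("(SELECT * FROM mytable LIMIT ", off, ",1)", collapse = " UNION ")
  dbGetQuery(mydb, sql)
}))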

Why is sqlQuery from RODBC not always returning the same data when querying an Impala DB?

I'm trying to get some data from an Impala database using the sqlQuery function from the RODBC package. The results I get change from one execution of a query to the next, even for the exact same query.
The data.frame I get doesn't always have the same number of rows:
library("RODBC")
conn <- odbcConnect("Cloudera Impala DSN;host=mydb;port=21050")
df<-sqlQuery(conn, "select * from hydrau.hydr where flight= 'V0051'")
dim(df)
[1] 26600 220
df<-sqlQuery(conn, "select * from hydrau.hydr where flight= 'V0051'")
dim(df)
[1] 142561 220
df<-sqlQuery(conn, "select * from hydrau.hydr where flight= 'V0051'")
dim(df)
[1] 23500 220
This query should in fact return a 142561 x 220 data frame.
On the other hand, the following query always returns the same (correct) result:
sqlQuery(conn, "select count(*) from hydr where flight= 'V0051' ")
count(*)
1 142561
It seems my problem was that Impala didn't have enough memory to perform well.
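Until the memory problem is addressed, a defensive check along these lines (my addition, built from the queries already shown) at least turns the silent truncation into a loud failure:
expected <- sqlQuery(conn, "select count(*) from hydrau.hydr where flight = 'V0051'")[1, 1]
df <- sqlQuery(conn, "select * from hydrau.hydr where flight = 'V0051'")
stopifnot(nrow(df) == expected)  # stop instead of analyzing a partial result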

dbgetquery java.sql.SQLException: Bigger type length than Maximum

I am trying to fetch a decently large result set (about 1-2M records) using RJDBC, with the following:
library(RJDBC)
drv <- JDBC("oracle.jdbc.driver.OracleDriver",
            classPath = "../oracle11g/ojdbc6.jar", " ")
con <- dbConnect(drv, "jdbc:oracle:thin:@hostname:1521/servname", "user", "pswd")
data <- dbGetQuery(con, "select * from largeTable where rownum < xxx")
The above works if xxx is less than 32768. Above 32800, I get the following exception:
> data <- dbGetQuery(con, "select * from dba_objects where rownum < 32768")
> dim(data)
[1] 32767 15
> data <- dbGetQuery(con, "select * from dba_objects where rownum < 32989")
Error in .jcall(rp, "I", "fetch", stride) :
java.sql.SQLException: Bigger type length than Maximum
In https://cran.r-project.org/web/packages/RJDBC/RJDBC.pdf, I see "fetch retrieves the content of the result set in the form of a data frame. If n is -1 then the current implementation fetches 32k rows first and then (if not sufficient) continues with chunks of 512k rows, appending them." followed by "Note that some databases (like Oracle) don’t support a fetch size of more than 32767."
Sorry for the newbie question, but I don't see how I can tell dbGetQuery to fetch the result set in chunks of only 32K. I believe my fetch is dying because it went on to fetch 512K records.
Would really appreciate any suggestions. Thanks in advance.
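One way to keep every fetch at or under the 32767-row limit the docs mention is to drop from dbGetQuery() to dbSendQuery() plus fetch() with an explicit n, binding the chunks yourself. A sketch, assuming the standard DBI/RJDBC fetch() interface:
res <- dbSendQuery(con, "select * from largeTable where rownum < 2000000")
chunks <- list()
repeat {
  chunk <- fetch(res, n = 32767)  # never ask Oracle for more than 32767 rows at once
  if (nrow(chunk) == 0) break
  chunks[[length(chunks) + 1]] <- chunk
}
dbClearResult(res)
data <- do.call(rbind, chunks)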
