R table from SQL weird behavior

I connected R to SQL using the following:
library(dplyr)
library(dbplyr)
library(odbc)
library(RODBC)
library(DBI)
con <- dbConnect(odbc(),
                 Driver = "SQL Server",
                 Server = "srv name",
                 Database = "Warehouse")
I pull in the table I want using
data <- tbl(con, in_schema("prc", "PricingLawOfUniv"))
The following things show me what I expect to see (a 38 X 1000 table of data):
head(data)
colnames(data)
The following things do not behave as I expect:
In the Environment, data is a "List of 2".
View(data) shows a list with "src" and "ops"; each of those is also a list of 2.
Ultimately I want to work with the 38 X 1000 table as a data frame using dplyr. How can I do this? I tried data[1] and data[2], but neither worked. Where is the actual table I want hiding?

You could use DBI::Id to specify the table/schema, and then dbReadTable:
tbl <- DBI::Id(
  schema = "prc",
  table = "PricingLawOfUniv"
)
data <- DBI::dbReadTable(con, tbl)
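Note that the data object you already have is a lazy dbplyr table: that is the "list of 2" (src and ops) you see in the Environment; the rows stay in the database until you ask for them. A minimal sketch of the dplyr route, using collect() to materialise the query as a local data frame:
library(dplyr)
library(dbplyr)
data <- tbl(con, in_schema("prc", "PricingLawOfUniv")) %>%
  collect()
# data is now an ordinary data frame (tibble) you can work with locally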

Related

Delete specific rows in specific table in a SQLite database

I have multiple tables in a SQLite database. I am trying to delete specific rows from one of them using the DBI package. Here is the code:
library(dplyr)
library(DBI)
con <- DBI::dbConnect(RSQLite::SQLite(), dbname = "C:\\DB2.sqlite", password = "password")
DBI::dbWriteTable(con,"data_iris",iris,overwrite=TRUE)
query<-"DELETE FROM data_iris WHERE Species = ?;"
specie<-'setosa'
res <- dbExecute(con,query,params = list(specie))
res
[1] 50
The above code works fine. But why does the following code not work?
query <- 'DELETE FROM ? WHERE Species = ?;'
table_name<-"data_iris"
res <- dbExecute(con,query,params = c(table_name,specie))
#Error: near "?": syntax error
I cannot use the first approach since the table_name changes dynamically (in a Shiny app).
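Parameter placeholders ("?") can only bind values, never identifiers such as table or column names, which is why the second query fails with a syntax error. A sketch of one workaround (assuming the table name comes from your own app state): quote the dynamic identifier with dbQuoteIdentifier() and paste it into the statement, while still binding the value as a parameter:
library(DBI)
table_name <- "data_iris" # changes dynamically in the Shiny app
query <- paste0(
  "DELETE FROM ", dbQuoteIdentifier(con, table_name),
  " WHERE Species = ?;"
)
res <- dbExecute(con, query, params = list(specie))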

Error: BigQuery does not support temporary tables

I'm trying to join tables from two different datasets in the same project. How can I do this?
library(DBI)
library(tidyverse)
library(bigrquery)
con1 <- dbConnect(
  drv = bigrquery::bigquery(),
  project = PROJECT,
  dataset = "dataset_1"
)
con2 <- dbConnect(
  drv = bigrquery::bigquery(),
  project = PROJECT,
  dataset = "dataset_2"
)
A <- con1 %>% tbl("A")
B <- con2 %>% tbl("B")
inner_join(A, B,
           by = "key",
           copy = TRUE) %>%
  collect()
Then I get the error: Error: BigQuery does not support temporary tables
The problem is most likely that you are using two different connections to reach the two tables. When you attempt this, R tries to copy data from one source into a temporary table on the other source, which BigQuery does not allow.
See this question and the copy parameter in this documentation (it's a different package, but the principle is the same).
The solution is to only use a single connection for all your tables. Something like this:
con <- dbConnect(
  drv = bigrquery::bigquery(),
  project = PROJECT,
  dataset = "dataset_1"
)
A <- con %>% tbl("A")
B <- con %>% tbl("B")
inner_join(A, B,
           by = "key") %>%
  collect()
You may need to leave the dataset parameter blank in your connection string, or use in_schema to include the dataset name along with the table when you connect to a remote table. It's hard to be sure without knowing more about the structure of your database(s).
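For example, something along these lines (a sketch, assuming both datasets live in the PROJECT the single connection points at):
library(dbplyr) # for in_schema()
con <- dbConnect(
  drv = bigrquery::bigquery(),
  project = PROJECT
)
A <- tbl(con, in_schema("dataset_1", "A"))
B <- tbl(con, in_schema("dataset_2", "B"))
inner_join(A, B, by = "key") %>%
  collect()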

dbplyr tbl and DBI dbListTables - conflicting results on table presence

Here is my code
library(DBI)
library(dplyr)
con <- dbConnect(odbc::odbc(), some_credentials)
dbListTables(con, table_name = "Table_A")
The above code returns Table_A, indicating the presence of the table. Now I am trying to query Table_A:
df <- as.data.frame(tbl(con, "Table_A"))
and get back:
Error: <SQL> 'SELECT *
FROM "Table_A" AS "zzz18"
WHERE (0 = 1)'
nanodbc/nanodbc.cpp:1587: 42S02: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid object name 'Table_A'.
so dplyr does not see it. How can I reconcile this? I have already double-checked the spelling.
As mentioned, any object (table, stored procedure, function, etc.) residing in a non-default schema requires an explicit reference to the schema. Default schemas include dbo in SQL Server and public in PostgreSQL. Therefore, as the docs indicate, use in_schema in dbplyr, or Id or SQL in DBI:
# dbplyr VERSION
df <- tbl(con, in_schema("myschema", "Table_A"))
# DBI VERSION
t <- Id(schema = "myschema", table = "Table_A")
df <- dbReadTable(con, t)
df <- dbReadTable(con, SQL("myschema.Table_A"))
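If you are still unsure which schema the table actually lives in, you can ask the database catalog directly (a SQL Server sketch):
dbGetQuery(con, "SELECT TABLE_SCHEMA, TABLE_NAME
                 FROM INFORMATION_SCHEMA.TABLES
                 WHERE TABLE_NAME = 'Table_A'")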
Without a reproducible example it is hard to be sure, but I will try my best. I think you should add the dbplyr package, which is often used for connecting to databases.
library(DBI)
library(dbplyr)
library(tidyverse)
con <- dbConnect(odbc::odbc(), some_credentials)
df <- tbl(con, "Table_A") %>%
  collect() # pulls the result into an R data frame you can use with dplyr
Here are some additional resources:
https://cran.r-project.org/web/packages/dbplyr/vignettes/dbplyr.html
Hope that can help!

Update selected rows in an SQLite table in R

I am using the RSQLite package in a Shiny app. I need to be able to dynamically update an SQLite database as users progress through the app. I want to use SQLite's UPDATE syntax to achieve this, but I have come up against a problem when trying to update multiple rows for the same user.
Consider the following code:
# Load libraries
library("RSQLite")
## Path for SQLite db
sqlitePath <- "test.db"
# Create db to store tables
con <- dbConnect(SQLite(),sqlitePath)
## Create toy data
who <- c("jane", "patrick", "samantha", "jane", "patrick", "samantha")
tmp_var_1 <- c(1,2,3, 4, 5, 6)
tmp_var_2 <- c(2,4,6,8,10,12)
# Create original table
users <- data.frame(who = as.character(who), tmp_var_1 = tmp_var_1, tmp_var_2 = tmp_var_2)
users$who <- as.character(users$who)
# Write original table
dbWriteTable(con, "users", users)
# Subset users data
jane <- users[who=="jane",]
patrick <- users[who=="patrick",]
samantha <- users[who=="samantha",]
# Edit Jane's data
jane$tmp_var_1 <- c(99,100)
# Save edits back to SQL (this is where the problem is!)
table <- "users"
db <- dbConnect(SQLite(), sqlitePath)
query <- sprintf(
  "UPDATE %s SET %s = ('%s') WHERE who = %s",
  table,
  paste(names(jane), collapse = ", "),
  paste(jane, collapse = "', '"),
  "'jane'"
)
dbGetQuery(db, query)
## Load data to check update has worked
loadData <- function(table) {
# Connect to the database
db <- dbConnect(SQLite(), sqlitePath)
# Construct the fetching query
query <- sprintf("SELECT * FROM %s", table)
# Submit the fetch query and disconnect
data <- dbGetQuery(db, query)
dbDisconnect(db)
data
}
loadData("users")
Here I am trying to update the entry for Jane so that the values for tmp_var_1 are changed but all other columns remain the same. In response to questions from @zx8754 and @Altons in the comments, the value of query is as follows:
UPDATE users SET who, tmp_var_1, tmp_var_2 = ('c(\"jane\", \"jane\")', 'c(99, 100)', 'c(2, 8)') WHERE who = 'jane'
The problem is almost certainly coming from the way I am specifying the query to RSQLite. When I run dbGetQuery(db, query) I get the following error:
Error in sqliteSendQuery(con, statement, bind.data) :
error in statement: near ",": syntax error
Any suggestions for improvement would be most welcome.
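One way around the syntax error (a sketch, not the only option): bind one value per column with "?" placeholders instead of pasting R vectors into the SQL string. When the params are vectors, DBI binds and runs the statement once per element; this assumes tmp_var_2 still distinguishes Jane's two rows:
db <- dbConnect(SQLite(), sqlitePath)
res <- dbExecute(
  db,
  "UPDATE users SET tmp_var_1 = ? WHERE who = ? AND tmp_var_2 = ?",
  params = list(jane$tmp_var_1, jane$who, jane$tmp_var_2)
)
res # number of rows updated
dbDisconnect(db)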

Filter table from redshift database using R dplyr

I have a table saved in AWS Redshift that has lots of rows, and I want to collect only a subset of them using the "user_id" column. I am trying to use R with the dplyr library to accomplish this (see below).
conn_dplyr <- src_postgres('dev',
                           host = '****',
                           port = ****,
                           user = "****",
                           password = "****")
df <- tbl(conn_dplyr, "redshift_table")
However, when I try to subset over a collection of user ids it fails (see below). Can someone help me understand how I might collect the data table over a collection of user_id elements? The individual calls work, but combining them fails. In this case there are only 2 user ids, but in general it could be hundreds or thousands, so I don't want to do each one individually. Thanks for your help.
df_subset1 <- filter(df, user_id=="2239257806")
df_subset1 <- collect(df_subset1)
df_subset2 <- filter(df, user_id=="22159960")
df_subset2 <- collect(df_subset2)
df_subset_both <- filter(df, user_id==c("2239257806", "22159960"))
df_subset_both <- collect(df_subset_both)
Error in postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver: (could not Retrieve the result : ERROR: operator does not exist: character varying = record
HINT: No operator matches the given name and argument type(s). You may need to add explicit type casts.
)
Try this:
df_subset_both <- filter(df, user_id %in% c("2239257806", "22159960"))
Alternatively, you can add the condition to the query you send to Redshift:
install.packages("RPostgreSQL")
library(RPostgreSQL)
drv <- dbDriver("PostgreSQL")
conn <- dbConnect(drv, host = 'host link', port = '5439', dbname = 'dbname', user = 'xxx', password = 'yyy')
# dbGetQuery (rather than dbSendQuery) also fetches the rows into a data frame
df_subset_both <- dbGetQuery(conn, "select * from my_table where user_id in (2239257806, 22159960)")
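If the set of user ids is dynamic (hundreds or thousands), a sketch of how the dplyr version scales: %in% with a local vector is translated into a SQL IN clause before any rows are pulled:
ids <- c("2239257806", "22159960") # could be built at runtime
df_subset_both <- df %>%
  filter(user_id %in% ids) %>%
  collect()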
