I cannot connect postgresql schema.table with dplyr package - r

I'm trying to connect to PostgreSQL using dplyr functions:
my_db <- src_postgres(dbname = 'mdb1252', user = "diego", password = "pass")
my_db
src: postgres 9.2.5 [postgres@localhost:5432/mdb1252]
tbls: alf, alturas, asociad, atenmed, base, bfa_boys_p_exp, bfa_boys_z_exp,
bfa_girls_p_exp, bfa_girls_z_exp, bres, c21200012012, c212000392011, c212000532011,
c21200062012, c212006222012, c212007352012, c212012112013, c212012242012,
c212012452012, c2222012242012, calles, cap, cap0110, casos_tbc_tr09, casos_tbctr09,
casosvadela, catpo, cbcvl, cie09, cie10, cie103d, cie103dantigua, cie10c, cie9a,
cie9mc, clasiarc, coalc, coddepto, codedades, codest, codlocaerbio, codprov, coheb,
cohec, cohep, cohiv, coho09_20110909_m, coign, combl, comet, comp, comport, conev,
conymad, copri, corci3cod, corci910, cores, corin, cotab, cutoi, cutto, def0307,......
But when I try to connect to a table:
my_tbl <- tbl(my_db, 'def0307')
Error in postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver: (could not Retrieve the result : ERROR: relation "def0307" does not exist
LINE 1: SELECT * FROM "def0307" WHERE 0=1;
^
)
I think the problem is a schema issue, because the SQL should be:
SELECT * FROM mortalidad.def0307
I tried my_tbl <- tbl(my_db, 'mortalidad.def0307') and
my_tbl <- tbl(my_db, c('mortalidad','def0307')), but neither worked.
I'm having a lot of fun working with dplyr. I come from SQL, and I would like to resolve this so I can keep practicing my dplyr skills.
Thanks in advance.

dplyr finally has a solution to this problem as of version 0.7, recently announced by Hadley Wickham. The DBI and dbplyr packages greatly simplify connecting dplyr to PostgreSQL, and dbplyr::in_schema() lets you name the schema and the table separately:
con <- DBI::dbConnect(RPostgreSQL::PostgreSQL(),
                      host = "database.rstudio.com",
                      user = "hadley",
                      password = rstudioapi::askForPassword("Database password"))
tbl <- dplyr::tbl(con, dbplyr::in_schema('mortalidad', 'def0307'))

You might want this, which sets the schema search path when connecting:
db <- src_postgres(dbname = 'mdb1252',
                   user = "diego", password = "pass",
                   options = "-c search_path=mortalidad")

If anybody ends up here with the same problem, here is what works for me (taken from @Diego's comment from Feb 6 '14):
postgre_table <- function (src, schema, table) {
  # Build a plain SQL query against schema.table and wrap it as a tbl
  paste('SELECT * FROM', paste(schema, table, sep = '.')) %>%
    sql() %>% tbl(src = src)
}
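
For anyone wondering why tbl(my_db, 'mortalidad.def0307') fails: the error in the question shows dplyr wrapping the name it receives in double quotes, so a dotted string is presumably treated as a single identifier rather than as schema plus table. The sketch below uses base R only and shows what quoting each part separately looks like. The helper names quote_ident and schema_query are made up for illustration; they are not part of dplyr or DBI.

```r
# Hypothetical helpers (not part of dplyr or DBI): quote one identifier,
# doubling any embedded double quotes as standard SQL requires.
quote_ident <- function(x) {
  paste0('"', gsub('"', '""', x, fixed = TRUE), '"')
}

# Build a SELECT against schema.table with each part quoted on its own.
schema_query <- function(schema, table) {
  paste0("SELECT * FROM ", quote_ident(schema), ".", quote_ident(table))
}

query <- schema_query("mortalidad", "def0307")
cat(query, "\n")
# prints: SELECT * FROM "mortalidad"."def0307"
```

The resulting string could be passed through sql() to tbl() in the same way as in the function above.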

Related

Read Kudu from SparkR

I am unable to find out how to connect to Kudu using SparkR. If I try the following in Scala:
import org.apache.kudu.spark.kudu._
import org.apache.kudu.client._
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions._
// Read kudu table and select data of August 2018
val df = spark.sqlContext.read.options(Map("kudu.master" -> "198.y.x.xyz:7051","kudu.table" -> "table_name")).kudu
df.createOrReplaceTempView("mytable")
it works perfectly. In SparkR I have been trying the following:
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib")))
sc = sparkR.session(master = "local[*]", sparkConfig = list(spark.driver.memory = "2g"), sparkPackages = "org.apache.kudu:kudu-spark2_2.11:1.8.0")
sqlContext <- sparkRSQL.init(sc)
df = read.jdbc(url="198.y.x.xyz:7051",
driver = "jdbc:kudu:sparkdb",
source="jdbc",
tableName = "table_name"
)
I get the following error:
Error in jdbc : java.lang.ClassNotFoundException: jdbc:kudu:sparkdb
Trying the following:
df = read.jdbc(url="jdbc:mysql://198.19.10.103:7051",
tableName = "banglalink_data_table_1"
)
gives:
Error: Error in jdbc : java.sql.SQLException: No suitable driver
I cannot find any help on how to load the correct driver. I think that using the sparkPackages option is correct, as it gives no error. What am I doing wrong?
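
One thing worth noting is that the Scala snippet does not go through JDBC at all: the .kudu method comes from the kudu-spark data source, and the address given is the Kudu master, not a JDBC URL. A possible SparkR counterpart is read.df() with a data source name instead of read.jdbc(). The sketch below is untested against a real cluster. It assumes the source name "org.apache.kudu.spark.kudu" (the package imported in the Scala code) and a SparkR session started with sparkPackages as in the question. kudu_read_args and read_kudu are hypothetical helper names.

```r
# Hypothetical helper: collect the arguments for SparkR::read.df().
# The option names kudu.master and kudu.table come from the Scala snippet.
kudu_read_args <- function(master, table) {
  list(
    source = "org.apache.kudu.spark.kudu",  # assumed data source name
    kudu.master = master,
    kudu.table = table
  )
}

# Hypothetical wrapper: needs an active SparkR session that was started
# with the kudu-spark package (sparkPackages in the question).
read_kudu <- function(master, table) {
  do.call(SparkR::read.df, kudu_read_args(master, table))
}

# Inspect the arguments without touching Spark.
kudu_args <- kudu_read_args("198.y.x.xyz:7051", "table_name")
str(kudu_args)
```

With a session running, df <- read_kudu("198.y.x.xyz:7051", "table_name") followed by createOrReplaceTempView(df, "mytable") would mirror the Scala code.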

Can't get data with dbplyr from shiny-server

I'm trying to get data from SQL Server hosted on AWS.
This code works fine on my local PC, but it doesn't work on shiny-server (Ubuntu).
library(dbplyr)
library(dplyr)
library(DBI)
con <- dbConnect(odbc::odbc(),
driver = "FreeTDS",
server = "aws server",
database = "",
uid = "",
pwd = "")
tbl(con, "shops")
dbGetQuery(con,"SELECT *
FROM shops")
"R version 3.4.2 (2017-09-28)"
packageVersion("dbplyr")
[1] ‘1.2.1.9000’
packageVersion("dplyr")
[1] ‘0.7.4’
packageVersion("DBI")
[1] ‘0.7.15’
I get the following error:
tbl(con, "shops")
Error: <SQL> 'SELECT *
FROM "shops" AS "zzz2"
WHERE (0 = 1)'
nanodbc/nanodbc.cpp:1587: 42000: [FreeTDS][SQL Server]Incorrect syntax near 'shops'.
But dbGetQuery(con,"SELECT * FROM shops") works fine.
Can you explain what's going wrong?
This is most likely because the FreeTDS driver does not return the class that dbplyr expects to see in order to use the MS SQL translation. The workaround is to take the result of class(con) and add the following lines right after you connect, but before calling tbl(). Replace [your class name] with the result of the class(con) call:
sql_translate_env.[your class name] <- dbplyr:::`sql_translate_env.Microsoft SQL Server`
sql_select.[your class name] <- dbplyr:::`sql_select.Microsoft SQL Server`

How to connect DB2 from R?

We have installed the Data Studio 4.1.0.0 client to access data stored in DB2. We have installed DB2 11.1 (64-bit) on our PC, which runs 64-bit Windows 7.
I need to connect to the DB2 data from 64-bit R.
We tried the following:
library (RODBC)
driver.name <- "{IBM DB2 ODBC DRIVER}"
db.name <- "SBXSHRD"
host.name <- "XX.XXX.X.XX"
port <- "60012"
user.name <- "X20XX4"
pwd <- "SXXXXX01"
#Connection String
con.text <- paste ("DRIVER =", driver.name,
                   "; Database =", db.name,
                   "; Hostname =", host.name,
                   "; Port =", port,
                   "; PROTOCOL = TCPIP",
                   "; UID =", user.name,
                   "; PWD =", pwd, sep = "")
#Connect to DB2
con1 <- odbcDriverConnect (con.text)
top <- sqlQuery (con1,
               "SELECT *
               FROM ODS_CANALES_LINK.VW_OP_D_TRANSACCIONCANAL
               where CODMES_PROC = 201708
               FETCH FIRST 3 ROW ONLY
               ",
               errors = FALSE)
But I get the following result in R:
> con1 <- odbcDriverConnect(con.text)
Warning messages:
1: In odbcDriverConnect(con.text) :
[RODBC] ERROR: state IM004, code 0, message [Microsoft][ODBC Driver Manager] Driver's SQLAllocHandle on SQL_HANDLE_ENV failed
2: In odbcDriverConnect(con.text) : ODBC connection failed
Here are the details of our DB2 installation and a screenshot of what we are doing in R:
[image: DB2 installation details]
[image: screenshot of the R session]
RJDBC works quite well. On one occasion, however, after a complete rebuild of the Docker image, all my result sets came back with changed column names, because RJDBC had switched from the JDBC method getColumnName to getColumnLabel.
https://github.com/s-/RJDBC/commit/7f1c1eec25ed90ec5ed71141189b816e2a3c2657
library(RJDBC)
CONSTR <- "jdbc:db2://hostname:446/database"
jcc <- JDBC("com.ibm.db2.jcc.DB2Driver", "db2jcc4.jar")
connect <- function() {
  dbConnect(jcc, CONSTR, user = "scott", password = "tiger")
}
dept <- function() {
  con <- connect()
  sql <- "SELECT DEPTNO, DEPTNAME FROM DSN8710.dept"
  rs <- dbSendQuery(con, sql)
  x <- dbFetch(rs)
  dbClearResult(rs)
  # change column names, because the names are not stable!
  names(x) <- c('DEPTNO', 'DEPTNAME')
  dbDisconnect(con)
  x
}
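
Separately from the RJDBC route, note that the paste() call in the question uses sep = "" with pieces such as "DRIVER =" and "; Database =", so the resulting string has a space before each equals sign ("DRIVER ={IBM DB2 ODBC DRIVER}; Database =SBXSHRD; ..."). Whether that matters depends on the driver manager, and it may well be unrelated to the IM004 error, but a connection string without the padding rules it out. The following base R sketch builds one. odbc_conn_string is a hypothetical helper, not an RODBC function; the values are the masked ones from the question.

```r
# Hypothetical helper (not part of RODBC): join named values into
# "KEY=value;KEY=value" with no padding around "=" or ";".
odbc_conn_string <- function(...) {
  parts <- list(...)
  paste0(names(parts), "=", unlist(parts), collapse = ";")
}

con.text <- odbc_conn_string(
  DRIVER   = "{IBM DB2 ODBC DRIVER}",
  Database = "SBXSHRD",
  Hostname = "XX.XXX.X.XX",
  Port     = "60012",
  PROTOCOL = "TCPIP",
  UID      = "X20XX4",
  PWD      = "SXXXXX01"
)
cat(con.text, "\n")
```

The result would be passed to odbcDriverConnect(con.text) exactly as in the question.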

Connecting to Hive JDBC through R Class Error

I am trying to connect to hive using R JDBC library. My code looks like this:
library('DBI')
library('rJava')
library('RJDBC')
hadoop.class.path = list.files(path=c('/usr/hdp/hadoop/'), pattern='jar', full.names=T);
hadoop.lib.path = list.files(path=c('/usr/hdp/hadoop/lib/'), pattern='jar', full.names=T);
hive.class.path = list.files(path=c('/usr/hdp/hive/lib/'), pattern='jar', full.names=T);
mapred.class.path = list.files(path=c('/usr/hdp/hadoop-mapreduce'), pattern='jar', full.names=T);
cp = c(hadoop.class.path, hadoop.lib.path, hive.class.path, mapred.class.path, '/usr/hdp/hadoop-mapreduce/hadoop-mapreduce-client-core.jar')
.jinit(classpath=cp)
drv <- JDBC('org.apache.hive.jdbc.HiveDriver', '/usr/hdp/hive/lib/hive-jdbc.jar')
con <- dbConnect(drv, 'jdbc:hive2://my.cluster.net:10000/default;principal=hive/my.cluster.net@domain.com', 'hive', 'hive')
But when I run it, I get the following error:
java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.security.SecurityUtil
However, I checked my /usr/hdp/hadoop/hadoop-commons.jar and found that the class org.apache.hadoop.security.SecurityUtil is there. So what else could be causing this error?

RODBC: able to connect to db but can't find table object

I am trying to connect to a SQLite database using RODBC in R. RODBC is able to connect to the database, but it cannot list the tables in the database: sqlTables returns "0 rows". The database has 20 tables.
System: R 3.1.2, Windows 7, Rstudio
Code snippet
> library(RODBC)
> odbcGetInfo(bbdb1)
DBMS_Name
"SQLite"
DBMS_Ver
"3.8.6"
Driver_ODBC_Ver
"03.00"
Data_Source_Name
"bbdb1"
Driver_Name
"sqlite3odbc.dll"
Driver_Ver
"0.999"
ODBC_Ver
"03.80.0000"
Server_Name
"C:\\Users\\shals\\Documents\\R in a nutshell\\nutshell\\data\\bb1"
> sqlListTables(bbdb1)
Error: could not find function "sqlListTables"
> sqlTables(bbdb1)
[1] TABLE_CAT TABLE_SCHEM TABLE_NAME TABLE_TYPE REMARKS
<0 rows> (or 0-length row.names)
> sqlPrimaryKeys(bbdb1,func,errors=FALSE,as.is=TRUE,catalog=NULL,schema=NULL)
Error in sqlPrimaryKeys(bbdb1, func, errors = FALSE, as.is = TRUE, catalog = NULL, :
object 'func' not found
Can anyone please explain why sqlTables returns 0 rows when there are 20 tables in the database?
I changed the connection call as shown below, after which the code worked fine.
bbdb1 <- odbcConnect(dsn="bbdb",believeNRows = FALSE,rows_at_time = 1)

Resources