How to connect to MySQL database in Julia - julia

I am new to Julia having used R for the last few years and I am struggling with my first task which is connecting to my AWS MySQL database.
I have followed many online tutorials but I get the same message no matter what I do.
Everything was installed yesterday so it should all be the current version.
julia-version = Version 1.5.2
Here is the code:
Pkg.add(PackageSpec(url="https://github.com/JuliaComputing/MySQL.jl"))
Pkg.add(PackageSpec(url="https://github.com/JuliaDB/DBI.jl"))
using MySQL
con = MySQL.connect("ec2blah.eu-west-2.compute.amazonaws.com", "name", "password", db = "database")
When I run this I get the following error:
UndefVarError: connect not defined
getproperty(::Module, ::Symbol) at Base.jl:26
top-level scope at data_prep.jl:18
Thanks

As per the documentation, you should probably do something like this:
using Pkg
Pkg.add("MySQL") # No need for a full PackageSpec here
Pkg.add("DBInterface")
using MySQL
using DBInterface
conn = DBInterface.connect(MySQL.Connection, "ec2blah.eu-west-2.compute.amazonaws.com",
"name", "password", db = "database")

Related

Connecting to Azure Databricks from R using jdbc and sparklyr

I'm trying to connect my on-premise R environment to an Azure Databricks backend using sparklyr and jdbc. I need to perform operations in databricks and then collect the results locally. Some limitations:
No RStudio available, only a terminal
No databricks-connect. Only odbc or jdbc.
The configuration with odbc + dplyr is working, but it seems too complicated, so I would like to use jdbc and sparklyr. Also, if I use RJDBC it works, but it would be great to have the tidyverse available for data manipulation. For that reason I would like to use sparklyr.
I've the jar file for Databricks (DatabricksJDBC42.jar) in my current directory. I downloaded it from: https://www.databricks.com/spark/jdbc-drivers-download. This is what I got so far:
library(sparklyr)
config <- spark_config()
config$`sparklyr.shell.driver-class-path` <- "./DatabricksJDBC42.jar"
# something in the configuration should be wrong
sc <- spark_connect(master = "https://adb-xxxx.azuredatabricks.net/?o=xxxx",
method = "databricks",
config = config)
spark_read_jdbc(sc, "table",
options = list(
url = "jdbc:databricks://adb-{URL}.azuredatabricks.net:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/{ORG_ID}/{CLUSTER_ID};AuthMech=3;UID=token;PWD={PERSONAL_ACCESS_TOKEN}",
dbtable = "table",
driver = "com.databricks.client.jdbc.Driver"))
This is the error:
Error: java.lang.IllegalArgumentException: invalid method toDF for object 17/org.apache.spark.sql.DataFrameReader fields 0 selected 0
My intuition is that the sc might not be not working. Maybe a problem in the master parameter?
PS: this is the solution that works via RJDBC
databricks_jdbc <- function(address, port, organization, cluster, token) {
location <- Sys.getenv("DATABRICKS_JAR")
driver <- RJDBC::JDBC(driverClass = "com.databricks.client.jdbc.Driver",
classPath = location)
con <- DBI::dbConnect(driver, sprintf("jdbc:databricks://%s:%s/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/%s/%s;AuthMech=3;UID=token;PWD=%s", address, port, organization, cluster, token))
con
}
DATABRICKS_JAR is an environment variable with the path "./DatabricksJDBC42.jar"
Then I can use DBI::dbSendQuery(), etc.
Thanks,
I Tried multiple configurations for master. So far I know that jdbc for the string "jdbc:databricks:..." is working. The JDBC connection works as shown in the code of the PS section.
Configure R studios with azure databricks -> go to cluster -> app -> set up azure Rstudio .
For information refer this third party link it has detail information about connecting azure databricks with R
Alternative approach in python:
Code:
Server_name = "vamsisql.database.windows.net"
Database = "<database_name"
Port = "1433"
user_name = "<user_name>"
Password = "<password"
jdbc_Url = "jdbc:sqlserver://{0}:{1};database={2}".format(Server_name, Port,Database)
conProp = {
"user" : user_name,
"password" : Password,
"driver" : "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}
df = spark.read.jdbc(url=jdbc_Url, table="<table_name>", properties=conProp)
display(df)
Output:

Problems with connecting to local postgres database from R through RPostgreSQL

I have to following code
drv <- RPostgreSQL::PostgreSQL()
con <- DBI::dbConnect(drv, dbname = 'dbname', user = 'user',
host = 'host.name', port = 5432, password = 'password')
When I run it on server (Ubuntu server 16.04 with latest updates) running the database I get the following error:
Error in .valueClassTest(ans, "data.frame", "dbGetQuery") :
invalid value from generic function ‘dbGetQuery’, class “NULL”, expected “data.frame”
But when I run R from commandline with sudo, it works, when I run it from different my laptop connecting to the DB on the server it also works. So it shouldn't be connection problem. I am thinking about problem with access rights to some libraries/executables/configs on the system? Any help will be appreciated.
When I run the dbConnect multiple times and it ends with the error, when I run drv_info <- RPostgreSQL::dbGetInfo(drv), I still get multiple connectionIds in the drv_info:
drv_info <- RPostgreSQL::dbGetInfo(drv)
> drv_info
$drvName
[1] "PostgreSQL"
$connectionIds
$connectionIds[[1]]
<PostgreSQLConnection>
$connectionIds[[2]]
<PostgreSQLConnection>
$fetch_default_rec
[1] 500
$managerId
<PostgreSQLDriver>
$length
[1] 16
$num_con
[1] 2
$counter
[1] 2
Found a source of confusion, but not necessarily the root problem. (I was assuming RPostgres, while you are using RPostgreSQL (github mirror).)
If you check the source, you'll find that the method is calling postgresqlNewConnection, which includes a call to dbGetQuery. The problem you're seeing is that your call to dbConnect is failing (my guess is at line 100) and returns something unexpected, but postgresqlNewConnection continues.
I see three options for you:
try calling dbConnect(..., forceISOdate=FALSE) to bypass that one call to dbGetQuery (note that this doesn't fix the connection problem, but it at least will not give you the unexpected query error on connection attempt);
raise an issue for the package maintainers; or
switch to using RPostgres, still DBI-based and actively developed (it looks like RPostgreSQL has not had significant commits in the last few years, not sure if that's a sign of good code stability or development stagnation)
One lesson you may take from this is that you should check the value returned from dbConnect; if is.null(con), something is wrong. (Again, this does not solve the connection problem.)

how to read data from Cassandra (DBeaver) to R

I am using Cassandra CQL- system in DBeaver database tool. I want to connect this cassandra to R to read data. Unfortunately the connection takes more time (i waited for more than 2 hours) with RCassandra package. but it does not seem to get connected at all and still loading. Does anyone has any idea on this?
the code as follows:
library(RCassandra)
rc <- RC.connect(host ="********", port = 9042)
RC.login(rc, username = "*****", password = "******")
after this step RC.login, it is still loading for more than 2 hours.
I have also tried using RJDBC package like posted here : How to read data from Cassandra with R?.
library(RJDBC)
drv <- JDBC("org.apache.cassandra.cql.jdbc.CassandraDriver",
list.files("C:/Program Files/DBeaver/jre/lib",
pattern="jar$",full.names=T))
But this throws error
Error in .jfindClass(as.character(driverClass)[1]) : class not found
None of the answers are working for me from the above link.I am using latest R version 3.4.0 (2017-04-21) and New version of DBeaver : 4.0.4.
For your first approach, which I am less familiar with, should you not have a line that sets the use of the connection?
such as:
library(RCassandra)
c <- RC.connect(host ="52.0.15.195", port = 9042)
RC.login(c, username = "*****", password = "******")
RC.use(c, "some_db")
Did you check logs that you are not getting some silent error while connecting?
For your second approach, your R program is not seeing a driver in a classpath for Java (JMV).
See this entry for help how to fix it.

ODBC Teradata connection Julia

I am trying to connect to Teradata using ODBC in Julia using the following but I am getting an error:
co = ODBC.connect(dsn = "DBS", uid = "me", pwd = "xxx")
Error:
>ArgumentError: function connect does not accept
I've installed the ODBC package and if I run ODBC.listdsns() I can see a list of available DSNs, one of which I am using in my connection string. I've tried a few other variants that are listed in the documentation but I get the same error message.
I have been using the above connection string in R so I'm confident that I can connect with ODBC but I only installed Julia over the weekend so I am still getting to grips with it.
Any help would be much appreciated.

Connect R and Teradata using JDBC

I´m trying to connect R and Teradata using RJDBC.
I´ve found this link that has an example using mysql, but i´m nos sure how to do the same with teradata.
library(RJDBC)
drv <- JDBC("com.mysql.jdbc.Driver",
"/etc/jdbc/mysql-connector-java-3.1.14-bin.jar",
identifier.quote="`")
conn <- dbConnect(drv, "jdbc:mysql://localhost/test", "user", "pwd")
I´ve downloaded this driver:
http://downloads.teradata.com/download/connectivity/jdbc-driver
But i´m not sure where i should reference the directory.
I know there is a teradataR package out there, but i don´t know if it really works with the R 3.0.0.
For the time being i´m just interesting in pulling data out of the database. Something as simple as SELECT * FROM table. The problem is RODBC is very slow...
Are there other options for doing this task?
Using the R Console, enter the following steps below to make a Teradata connection:
drv = JDBC("com.teradata.jdbc.TeraDriver","ClasspathForTeradataJDBCDriverFiles")
Example:
drv = JDBC("com.teradata.jdbc.TeraDriver","c:\\terajdbc\\terajdbc4.jar;c:\\terajdbc\\tdgssconfig.jar")
NOTE: A path on a UNIX machine would use single forward slashes to separate its components and a colon between files.
conn = dbConnect(drv,"jdbc:teradata://DatabaseServerName/ParameterName=Value","User","Password")
Example:
conn = dbConnect(drv,"jdbc:teradata://jdbc1410ek1.labs.teradata.com/TMODE=ANSI,LOGMECH=LDAP","guestldap","passLDAP01")
NOTE: Connection parameters are optional. The first ParameterName is separated from the DatabaseServerName by a forward slash character.
dbGetQuery(conn,"SQLquery")
Example:
dbGetQuery(conn,"select ldap from dbc.sessioninfov where sessionno=session")

Resources