Here is my function:
getSQL <- function(server="server name", database="database name", Uid="user name", Pwd="password", Query){
  conlink <- paste('driver={SQL Server};server=', server,';database=',database,';Uid=', Uid,
                   ';Pwd=', Pwd,';Encrypt=True;TrustServerCertificate=False', sep="")
  conn <- odbcDriverConnect(conlink)
  dat <- sqlQuery(channel= conn, Query, stringsAsFactors = F)
  odbcCloseAll()
  return(dat)
}
When I call the function using
query.cut = "SELECT [measurename]
,[OrgType]
,[year_session]
,[Star]
,[cutvalue]
,[Date]
,[File]
FROM [database name].[dbo].[DST_Merged_Cutpoint]
ORDER BY [year_session] DESC
"
getSQL(Query=query.cut)
I get this error:
Error in sqlQuery(conn, Query, stringsAsFactors = F) :
first argument is not an open RODBC channel
In addition: Warning messages:
1: In odbcDriverConnect(conlink) :
[RODBC] ERROR: state 28000, code 18456, message [Microsoft][ODBC SQL Server Driver][SQL Server]Login failed for user ' insightm8'.
2: In odbcDriverConnect(conlink) :
[RODBC] ERROR: state 01S00, code 0, message [Microsoft][ODBC SQL Server Driver]Invalid connection string attribute
3: In odbcDriverConnect(conlink) :
Error in sqlQuery(conn, Query, stringsAsFactors = F) :
first argument is not an open RODBC channel
How can I fix these errors? Thanks in advance
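Take care not to add spaces to the Uid value. The error message shows a leading space in the user name:
Server]Login failed for user ' insightm8'.
Reproducing this on a SQL Server connection produces the same error.
Note that paste with sep="" already inserts no separator, so the stray space has to come from the value passed as Uid; trimws(Uid) would strip it. You can also use paste0 instead of paste, which needs no sep argument:
conlink <- paste0('driver={SQL Server};server=', server,';database=',database,';Uid=', Uid,
';Pwd=', Pwd,';Encrypt=True;TrustServerCertificate=False')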
Related
Airflow-pandas-read-sql-query to dataframe
I am trying to connect to a local SQL Server to get data from a table and process it with pandas operations, but I can't figure out how to pass the SELECT query results to a DataFrame.
The code below works to clear the data in the table:
```
sql_command = """ DELETE FROM [TestDB].[dbo].[PythonTestData] """
t3 = MsSqlOperator( task_id = 'run_test_proc',
                    mssql_conn_id = 'mssql_local',
                    sql = sql_command,
                    dag = dag,
                    database = 'TestDB',
                    autocommit = True)
```
The intended pandas code is:
```
query = 'SELECT * FROM [ClientData] '#where product_name='''+i+''''''
df = pd.read_sql(query, conn)
pn_list = df['ClientID'].tolist()
#print("The original pn_list is : " + str(pn_list))
for i in pn_list:
    varw= str(i)
    queryw = 'SELECT * FROM [ClientData] where [ClientID]='''+varw+''
    dfw = pd.read_sql(queryw, conn)
    dfw = dfw.applymap(str)
    cols=['product_id','product_name','brand_id']
    x=dfw.values.tolist()
    x=x[0]
    ClientID=x[0]
    Name=x[1]
    Org=x[2]
    Email=x[3]
    #print('Name :'+Name+' ,'+'Org :'+Org+' ,'+'Email :'+Email+' ,'+'ClientID :'+ClientID)
    salesData_qry= 'SELECT * FROM [TestDB].[dbo].[SalesData] where [ClientID]='''+ClientID+''
    salesData_df= pd.read_sql(salesData_qry, conn)
    salesData_df['year1'] = salesData_df['Order Date'].dt.strftime('%Y')
    salesData_df['OrderMonth'] = salesData_df['Order Date'].dt.strftime('%b')
    filename ='Daily_Campaign_Report_'+Name+'_'+Org+'_'+datetime.now().strftime("%Y%m%d_%H%M%S")
    p = Path('C:/Users/user/Documents/WorkingData/')
    salesData_df.to_csv(Path(p, filename + '.csv'))
```
Please point me to the correct approach, as I am new to Airflow.
I'm not entirely clear on how you generate the query, but to get a DataFrame from MS SQL you need to use MsSqlHook:
from airflow.operators.python import PythonOperator
from airflow.providers.microsoft.mssql.hooks.mssql import MsSqlHook

def mssql_func(**kwargs):
    # MsSqlHook reads the connection ID from the mssql_conn_id keyword
    hook = MsSqlHook(mssql_conn_id='mssql_local')
    df = hook.get_pandas_df(sql="YOUR_QUERY")
    # do whatever you need on the df

run_this = PythonOperator(
    task_id='mssql_task',
    python_callable=mssql_func,
    dag=dag
)
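The per-client loop in the question builds its SQL by concatenating the ClientID into the query string. With a DB-API connection the value can be passed as a bound parameter instead, which avoids quoting problems. Below is a minimal sketch of that idea only: it uses the standard-library sqlite3 module as a stand-in for SQL Server (an assumption, so it runs anywhere), the table and column names are borrowed from the question, and the sample rows and the helper sales_for_client are made up for illustration.

```python
import sqlite3

# Stand-in database: sqlite3 replaces SQL Server here purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ClientData (ClientID TEXT, Name TEXT, Org TEXT, Email TEXT)")
conn.execute("CREATE TABLE SalesData (ClientID TEXT, Amount REAL)")
# Hypothetical sample rows, not taken from the question.
conn.executemany(
    "INSERT INTO ClientData VALUES (?, ?, ?, ?)",
    [("1", "Ann", "Acme", "ann@example.com"),
     ("2", "Bob", "Initech", "bob@example.com")],
)
conn.executemany(
    "INSERT INTO SalesData VALUES (?, ?)",
    [("1", 10.0), ("1", 5.5), ("2", 7.0)],
)


def sales_for_client(connection, client_id):
    """Return the SalesData rows for one client, with the ID passed as a bound parameter."""
    cursor = connection.execute(
        "SELECT ClientID, Amount FROM SalesData WHERE ClientID = ? ORDER BY Amount",
        (client_id,),
    )
    return cursor.fetchall()


# One query for the client list, then one parameterized query per client.
client_ids = [row[0] for row in conn.execute(
    "SELECT ClientID FROM ClientData ORDER BY ClientID")]
sales_by_client = {cid: sales_for_client(conn, cid) for cid in client_ids}
```

In the Airflow task the same pattern should carry over to the hook, since get_pandas_df accepts a parameters argument alongside sql. The placeholder style depends on the driver: sqlite3 uses ?, while pymssql (the driver in the traceback further down) uses %s as far as I know.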
This is the code I am using for the DAG:
def mssql_func(**kwargs):
    conn = MsSqlHook.get_connection(conn_id="mssql_local")
    hook = conn.get_hook()
    df = hook.get_pandas_df(sql="SELECT * FROM [TestDB].[dbo].[ClientData]")
    #do whatever you need on the df
    print(df)

run_this = PythonOperator(
    task_id='mssql_task',
    python_callable=mssql_func,
    dag=dag
)
Error Log
[2021-01-12 16:07:15,114] {providers_manager.py:159} WARNING - The provider for package 'apache-airflow-providers-imap' could not be registered from because providers for that package name have already been registered
[2021-01-12 16:07:15,618] {base.py:65} INFO - Using connection to: id: mssql_local. Host: localhost, Port: 1433, Schema: dbo, Login: sa, Password: XXXXXXXX, extra: None
[2021-01-12 16:07:15,626] {taskinstance.py:1396} ERROR - (18456, b"Login failed for user 'sa'.DB-Lib error message 20018, severity 14:\nGeneral SQL Server error: Check messages from the SQL Server\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (localhost)\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (localhost)\n")
Traceback (most recent call last):
File "src/pymssql.pyx", line 636, in pymssql.connect
File "src/_mssql.pyx", line 1964, in _mssql.connect
File "src/_mssql.pyx", line 682, in _mssql.MSSQLConnection.__init__
File "src/_mssql.pyx", line 1690, in _mssql.maybe_raise_MSSQLDatabaseException
_mssql.MSSQLDatabaseException: (18456, b"Login failed for user 'sa'.DB-Lib error message 20018, severity 14:\nGeneral SQL Server error: Check messages from the SQL Server\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (localhost)\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (localhost)\n")
Here is my connection string
library(RODBC)
odbconnection <-
odbcDriverConnect("Driver=SQLServer;Server=some_address; Database=some_db;Uid=some_username; Pwd=some_password")
it returns:
Warning messages:
1: In odbcDriverConnect("Driver=SQLServer;Server=some_address; Database=some_db;Uid=some_username; Pwd=some_password") :
[RODBC] ERROR: state 08001, code 0, message [RStudio][SqlServer] Problem establishing connection to the server.
2: In odbcDriverConnect("Driver=SQLServer;Server=some_address; Database=some_db;Uid=some_username; Pwd=some_password") :
ODBC connection failed
How can I troubleshoot this error: [RODBC] ERROR: state 08001, code 0,?
Try with the DBI package and host:port/service_name such as
library(DBI)
con <- dbConnect(drv, 'user', 'pass', 'some_address:someport/some_db')
dbGetQuery(con,"select count(*) from some_db")
I'm trying to get data from AWS SQL Server.
This code works fine from my local PC, but it doesn't work from shiny-server (Ubuntu).
library(dbplyr)
library(dplyr)
library(DBI)
con <- dbConnect(odbc::odbc(),
driver = "FreeTDS",
server = "aws server",
database = "",
uid = "",
pwd = "")
tbl(con, "shops")
dbGetQuery(con,"SELECT *
FROM shops")
"R version 3.4.2 (2017-09-28)"
packageVersion("dbplyr")
[1] ‘1.2.1.9000’
packageVersion("dplyr")
[1] ‘0.7.4’
packageVersion("DBI")
[1] ‘0.7.15’
I get the following error:
tbl(con, "shops")
Error: <SQL> 'SELECT *
FROM "shops" AS "zzz2"
WHERE (0 = 1)'
nanodbc/nanodbc.cpp:1587: 42000: [FreeTDS][SQL Server]Incorrect syntax near 'shops'.
But dbGetQuery(con,"SELECT * FROM shops") works fine.
Can you explain what's going wrong?
This is most likely because the FreeTDS driver does not return the class that dbplyr expects to see in order to use the MS SQL translation. The workaround is to take the result of class(con) and add the following lines right after you connect, but before calling tbl(). Replace [your class name] with the result of the class(con) call:
sql_translate_env.[your class name] <- dbplyr:::`sql_translate_env.Microsoft SQL Server`
sql_select.[your class name]<- dbplyr:::`sql_select.Microsoft SQL Server`
We have installed Data Studio 4.1.0.0 Client to access the data that is stored in DB2. We have installed DB2 11.1 64bit on our PC which has a Windows 7 64 bit.
I need to connect to the DB2 data from 64bit R.
We tried the following
library (RODBC)
driver.name <- "{IBM DB2 ODBC DRIVER}"
db.name <- "SBXSHRD"
host.name <- "XX.XXX.X.XX"
port <- "60012"
user.name <- "X20XX4"
pwd <- "SXXXXX01"
#Connection String
con.text <- paste ("DRIVER =", driver.name,
"; Database =", db.name,
"; Hostname =", host.name,
"; Port =", port,
"; PROTOCOL = TCPIP",
"; UID =", user.name,
"; PWD =", pwd, sep = "")
#Connect to DB2
con1 <- odbcDriverConnect (con.text)
top <- sqlQuery (con1,
"SELECT *
FROM ODS_CANALES_LINK.VW_OP_D_TRANSACCIONCANAL
where CODMES_PROC = 201708
FETCH FIRST 3 ROW ONLY
",
errors = FALSE)
But I get the following result in R:
> con1 <- odbcDriverConnect(con.text)
Warning messages:
1: In odbcDriverConnect(con.text) :
[RODBC] ERROR: state IM004, code 0, message [Microsoft][Administrador de controladores ODBC] Error de SQLAllocHandle del controlador en SQL_HANDLE_ENV
2: In odbcDriverConnect(con.text) : ODBC connection failed
here a detail of the DB2 that we have and a snapshot of what we are doing in R
RJDBC works quite well. One caveat: on one occasion, after a complete rebuild of the Docker image, all my result sets came back with changed column names, because RJDBC switched from the JDBC function getColumnName to getColumnLabel.
https://github.com/s-/RJDBC/commit/7f1c1eec25ed90ec5ed71141189b816e2a3c2657
library(RJDBC)
CONSTR <- "jdbc:db2://hostname:446/database"
jcc = JDBC("com.ibm.db2.jcc.DB2Driver", "db2jcc4.jar")
connect <- function() {
  dbConnect(jcc, CONSTR, user="scott", password="tiger")
}
dept <- function() {
  con <- connect()
  sql <- "SELECT DEPTNO, DEPTNAME FROM DSN8710.dept"
  rs <- dbSendQuery(con, sql)
  x <- dbFetch(rs)
  dbClearResult(rs)
  # change column names, because the names are not stable!
  names(x) <- c('DEPTNO', 'DEPTNAME')
  dbDisconnect(con)
  x
}
I am trying to connect to Oracle from R using RODBC without using tnsnames.ora.
I have tried the following strings, but none of them work.
> con.text <- paste0("Driver={OracleODBC-11g};Dbq=//oracle.server:1527/database.pdw.prod;Uid=user;Pwd=pswd;")
> con.text <- paste0("Driver={OracleODBC-11g}; ",
"CONNECTSTRING=(DESCRIPTION=(ADDRESS= (PROTOCOL = TCP)(HOST = oracle.server)(PORT = 1527))(CONNECT_DATA=(SERVICE_NAME = database.pdw.prod))); uid=user;pwd=pswd;")
> con.text <- paste0("Driver=", "OracleODBC-11g"
, ";Server=", "oracle.server"
, ";Database=", "database.pdw.prod"
, ";Uid=", "user"
, ";Pwd=", "pwd", ";")
> con.text <- paste0("Driver=", "OracleODBC-11g"
, ";Server=", "oracle.server"
, ";CONNECTSTRING=" , "(DESCRIPTION=(ADDRESS= (PROTOCOL = TCP)(HOST = oracle.server)(PORT = 1527))(CONNECT_DATA=(SERVICE_NAME = database.pdw.prod)))"
, ";Database=", "database.pdw.prod"
, ";Uid=", "user"
, ";Pwd=", "pswd", ";")
> con1 <- odbcDriverConnect(connection = con.text)
But for all these strings I get one of the following errors:
Warning messages:
1: In odbcDriverConnect(connection = con.text) :
[RODBC] ERROR: state HY000, code 12162, message [unixODBC][Oracle][ODBC][Ora]ORA-12162: TNS:net service name is incorrectly specified
2: In odbcDriverConnect(connection = con.text) : ODBC connection failed
OR
1: In odbcDriverConnect(connection = con.text) :
[RODBC] ERROR: state IM002, code 0, message [unixODBC][Driver Manager]Data source name not found, and no default driver specified
The correct syntax you are looking for is:
Conex <- odbcDriverConnect("DRIVER=Oracle en OraClient11g_home2;UID=USERNAME;PWD=PASSWORD;DBQ=//HOSTNAME:PORT/ORACLE_SID;",
believeNRows = FALSE)
For example:
Conex <- odbcDriverConnect("DRIVER=Oracle en OraClient11g_home2;UID=John;PWD=Deere;DBQ=//fcoracleserver.youdomain:1521/TestEnvironment;",
believeNRows = FALSE)
The hard part is finding the name of the driver; as you can see, mine is in Spanish.
What I did was first create an ODBC connection using C:\Windows\System32\odbcad32.exe; there you can check the exact name of your Oracle or SQL Server driver.
Once you have created the connection, you can call odbcDataSources() in R to see that connection and find out the driver name. That's really the hard part.
Hope it helps!