When I read a table from a PostgreSQL database, I get garbled Polish letters. The database has encoding UTF8 and collation Polish_Poland.1250.
Code:
library(RPostgreSQL)
# make connection
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, host="localhost", port=xxxx, user="user", dbname="myBase")
# read tables and assign to variables
table <- dbReadTable(con, c("public","table"))
Results:
"BYDGOSZCZ" "wĹ‚ocĹ‚awski" "WROCĹAW"
"gostyĹ„ski" "nowotarski" "MOGILEĹSKI" "myĹ›liborski"
Should be:
"BYDGOSZCZ" "włocławski" "WROCŁAW"
"gostyński" "nowotarski" "MOGILEŃSKI" "myśliborski"
I found the solution! This line should be added after the connection:
postgresqlpqExec(con, "SET client_encoding = 'windows-1250'")
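For completeness, a minimal sketch of the full sequence with the fix applied (host, port, and names are the placeholders from the question):
library(RPostgreSQL)
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, host = "localhost", port = xxxx, user = "user", dbname = "myBase")
# force the client encoding before reading, so Polish characters decode correctly
postgresqlpqExec(con, "SET client_encoding = 'windows-1250'")
table <- dbReadTable(con, c("public", "table"))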
I have a problem importing data from EDB Postgres into R (with RPostgres). I am using this code:
library(RPostgres)
conn <- dbConnect(Postgres(), dbname = 'xx', host = 'xx', port = 'xx',
user = 'xx', password = 'xx',
options="-c search_path=xx")
a=dbGetQuery(conn, "SELECT * FROM xx")
After importing the data into R, everything is OK except the columns with dates, which become corrupted. A few examples:
postgres value: 2009-11-02 after importing in R: 0231-08-11
postgres value: 2009-08-12 after importing in R: 1206-01-15
The data type in Postgres for this column is date. If I read the date column with the query:
".. to_char(column_with_date, 'DD-MM-YYYY') as column_with_date .."
then I get correct data in R, but the data type in R is character, not Date.
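If the to_char workaround is used, a minimal sketch of converting that character column back to Date in R (the column name follows the query above, the table name follows the question):
a <- dbGetQuery(conn, "SELECT to_char(column_with_date, 'DD-MM-YYYY') AS column_with_date FROM xx")
# parse the text back into R's Date class, matching the DD-MM-YYYY format
a$column_with_date <- as.Date(a$column_with_date, format = "%d-%m-%Y")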
I tried importing the same data from a MySQL database into R (with RMySQL), and in that case the data are not corrupted. Code used:
library(RMySQL)
conn <- dbConnect(MySQL(), user = "xx", password = "xx", dbname = "xx", host = "xx" )
a <- dbGetQuery(conn, "SELECT * FROM xx")
Thank you
I have a table in a Postgres database with some columns and values. I have imported this table into local memory, performed some computations on these columns, and now have a data frame with new values. I want this updated data frame to be placed back in the database, in the same table.
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, host = "*****", port = "****",
dbname = "sai", user = "sai", password = "password")
saix_account_demo <- dbReadTable(con = con, name = "saix_account")
...
dbWriteTable(con, name = "saix_account", value = saix_account_demo,
             row.names = FALSE, append = TRUE)
I performed dbWriteTable() with append = TRUE and overwrite = FALSE, but I am facing an error saying the primary key constraint is violated. I understand the problem: I was trying to insert data instead of updating it.
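One common workaround, sketched under the assumption that saix_account has a key column named id and an updated column named some_column (both names are illustrative): write the updated rows to a staging table, then update the target table from it.
# write the computed rows to a staging table (overwriting any previous run)
dbWriteTable(con, name = "saix_account_staging", value = saix_account_demo,
             row.names = FALSE, overwrite = TRUE)
# update the real table from the staging table, matching on the primary key
dbGetQuery(con, "
  UPDATE saix_account AS t
  SET some_column = s.some_column
  FROM saix_account_staging AS s
  WHERE t.id = s.id")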
Using R, I tried to insert a data frame. My script looks like the one below:
con <- dbConnect(RMySQL::MySQL(), username = "xxxxxx", password = "xxxxxx",host = "127.0.0.1", dbname = "xxxxx")
dbWriteTable(conn = con, name = 'table', value = as.data.frame(df),
             append = TRUE, row.names = FALSE)
dbDisconnect(con)
When the script hits the line below:
dbWriteTable(conn = con, name = 'table', value = as.data.frame(df),
             append = TRUE, row.names = FALSE)
I got the error below:
Error in .local(conn, statement, ...) : could not run statement: Invalid utf8 character string: 'M'
I am not sure why this error occurred; this part of the script has run fine on another machine.
Please advise
You should create a proper connection; only then can you insert a data frame into your DB. For the connection to work, the username, password, host name, and database name must be correct. The code is the same, but I removed some parameters.
try this:
mydb = dbConnect(MySQL(), user='root', password='password', dbname='my_database', host='localhost')
As an example, I insert the iris data into my_database:
data(iris)
dbWriteTable(mydb, name='db', value=iris)
This inserts the iris data frame as a table named db in my_database.
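If the connection is fine and the "Invalid utf8 character string" error persists, it is often caused by non-UTF-8 text in the data frame itself. A hedged sketch of converting character columns to UTF-8 before writing (df as in the question):
# convert every character column to UTF-8 before the insert
df[] <- lapply(df, function(x) if (is.character(x)) enc2utf8(x) else x)
dbWriteTable(conn = con, name = 'table', value = as.data.frame(df),
             append = TRUE, row.names = FALSE)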
RJDBC connects to Hive fine and also reads data from Hive, but it does not write data to Hive using dbWriteTable. See below:
options(java.parameters = "-Xmx8g")
library(DBI)
library(rJava)
library(RJDBC)
cp <- c(list.files("/tmp/R_hive_libs/cloudera_hive_jars", pattern = "[.]jar",
                   full.names = TRUE, recursive = TRUE),
        list.files("/tmp/R_hive_libs/R_hadoop_libs", pattern = "[.]jar",
                   full.names = TRUE, recursive = TRUE),
        list.files("/tmp/R_hive_libs/R_hadoop_libs/lib", pattern = "[.]jar",
                   full.names = TRUE, recursive = TRUE),
        recursive = TRUE)
drv <- JDBC(driverClass = "com.cloudera.hive.jdbc4.HS2Driver", classPath=cp)
conn <- dbConnect(drv, "jdbc:hive2://XXXXXX:10000/default", "user", "password")
show_databases <- dbGetQuery(conn, "show databases")
List_of_Tables <- dbListTables(conn)
data1 <- dbGetQuery(conn, "select * from XXX.xxx limit 10000")
data_to_write_back_to_hive <- data.frame(aggregate(data1$xxx.xxx, by=list(Month=data1$xxx.cmp_created_timestamp_month), FUN=sum))
data_to_write_back_to_hive[[2]] <-c(10,20)
colnames(data_to_write_back_to_hive) <- c("Month", "Energy")
dbWriteTable(conn, "xxxx.checking", data_to_write_back_to_hive)
How do I write data back to Hive? It gives the error below:
Error in .local(conn, statement, ...) : execute JDBC update query failed in dbSendUpdate ([Simba]HiveJDBCDriver ERROR
processing query/statement. Error Code: 40000, SQL state:
TStatus(statusCode:ERROR_STATUS,
infoMessages:[*org.apache.hive.service.cli.HiveSQLException:Error
while compiling statement: FAILED: ParseException line 1:36 mismatched
input 'PRECISION' expecting ) near 'DOUBLE' in create table
statement:28:27,
org.apache.hive.service.cli.operation.Operation:toSQLException:Operation.java:326,
org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:102,
org.apache.hive.service.cli.operation.SQLOperation:runInternal:SQLOperation.java:171,
org.apache.hive.service.cli.operation.Operation:run:Operation.java:268,
org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementInternal:HiveSessionImpl.java:410,
org.apache.hive.service.cli.session.HiveSessionImpl:executeStatement:HiveSessionImpl.java:391,
sun.reflect.GeneratedMethodAccessor56:invoke::-1,
sun.reflect.DelegatingMeth
This question comes up a fair bit. I think the short answer is that you can't do what you want at present: the DBI/RJDBC drivers don't generate syntactically correct HiveQL. Here the generated CREATE TABLE statement uses DOUBLE PRECISION, which Hive's parser rejects.
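A hedged workaround sketch: create the table yourself with valid HiveQL via RJDBC's dbSendUpdate, then insert the rows explicitly (table and column names follow the example above; INSERT ... VALUES requires Hive 0.14+, and the column types are assumptions):
# create the target table with Hive-compatible types
dbSendUpdate(conn, "CREATE TABLE xxxx.checking (Month STRING, Energy DOUBLE)")
# insert the aggregated rows one by one
for (i in seq_len(nrow(data_to_write_back_to_hive))) {
  dbSendUpdate(conn, sprintf("INSERT INTO TABLE xxxx.checking VALUES ('%s', %f)",
                             as.character(data_to_write_back_to_hive$Month[i]),
                             data_to_write_back_to_hive$Energy[i]))
}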
How do I update data in a PostgreSQL database through R with new data?
I've tried
dbGetQuery(con,"UPDATE table SET column1=:1,column2=:2, column3=:3
where id=:4", data=Rdata[,c("column1", "column3", "column3","id")])
I also tried with the colons replaced with $ but that didn't work either. I keep getting:
Error in postgresqlExecStatement(conn, statement, ...) :
unused argument(s)
I figured it out using:
library(RPostgreSQL)
library(foreach)
library(doMC)

update <- function(i) {
  # open a fresh connection in each worker; connections can't be shared across forked processes
  drv <- dbDriver("PostgreSQL")
  con <- dbConnect(drv, dbname = "db_name", host = "localhost", port = "5432",
                   user = "chris", password = "password")
  txt <- paste("UPDATE data SET column_one=", data$column_one[i],
               ", column_two=", data$column_two[i], " where id=", data$id[i])
  dbGetQuery(con, txt)
  dbDisconnect(con)
}

registerDoMC()
foreach(i = 1:length(data$column_one), .inorder = FALSE,
        .packages = "RPostgreSQL") %dopar% {
  update(i)
}
At least RODBC has a specific function for this, sqlUpdate:
sqlUpdate updates the table where the rows already exist. Data frame dat should contain columns with names that map to (some of) the columns in the table.
See http://cran.r-project.org/web/packages/RODBC/RODBC.pdf
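A hedged sketch of sqlUpdate usage (the DSN, table name, and key column are illustrative):
library(RODBC)
ch <- odbcConnect("my_dsn", uid = "user", pwd = "password")
# dat must contain the key column plus the columns to update
sqlUpdate(ch, dat, tablename = "data", index = "id")
odbcClose(ch)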