There seem to be plenty of similar issues and answers to them, yet so far haven't found a solution that would work for me.
Issue in short: special characters (scandic letters, ohm-symbol, Celcius degree-symbol, etc.) get all scrambled up when application is run on Linux Shiny server. Data shown on Shiny Dashboard is queried from Oracle database. The column name in these examples is "NAME" and type is VARCHAR2. When I run similar code on Linux server in R or on my local Windows RStudio, all characters look fine.
What I've tried so far: Characters started to look fine in Linux R after placing NLS_LANG to NLS_LANG=AMERICAN_AMERICA.AL32UTF8 in /etc/environment. I figured these are correct NLS_LANG settings by running SELECT * FROM V$NLS_PARAMETERS and SELECT * from NLS_SESSION_PARAMETERS in Linux's R. Though this didn't fix the issue on the Shiny Server side.
Also I've played around with the dbConnect encoding-parameter, with no luck.
Somewhat reproducible example: (sorry I can't gain access to my Oracle server ;-) )
library(ROracle)
ORAdrv <- dbDriver("Oracle", unicode_as_utf8 = TRUE, ora.attributes = TRUE) #doesn't matter if I have these two latter attributes or not
ORAconnect.string <- paste(
"(DESCRIPTION=",
"(ADDRESS=(PROTOCOL=tcp)(HOST=xx.xx.xx.xx)(PORT=xxxx))",
"(CONNECT_DATA=(SID=...)))", sep = ""
)
query2 <- ("select NAME, DATA_FIELD from TABLE where DATA_FIELD in ('ID7018789', 'ID7025838', 'ID7021380')")
ORAcon <- dbConnect(ORAdrv, username = "...", password = "...", dbname = ORAconnect.string, encoding = "UTF-8") #doesn't matter if encoding is defined or not
res <- dbSendQuery(ORAcon, query2, 'set character set "utf8"') #doesn't matter if the last attribute is defined or not
df <- fetch(res)
dbDisconnect(ORAcon)
print(df)
What the end result looks like:
If I run the code in Linux R, the result is what is expected (Ohm, Celcius symbols and scandic characters look good):
If I run the same code and render the dataframe as datatable on ShinyServer app, the result is like this: (Ohm and Celcius symbols are replaced with question mark, scandic characters äö -> ao)
Any help on getting the encoding correctly on Shiny Server application side is highly appreciated =)
I was able to solve it finally. If someone else is struggling with this, cast the column to nvarchar already in the query.
in my case, to_nchar(NAME) did the trick.
https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions187.htm
Related
So, I've been struggling with this for a while now and can't seem to google my way out of it. I'm trying to read a .sql file into R, I always do that to avoid putting 100+ lines of sql in my R scripts. I usually do this:
library(tidyverse)
library(DBI)
con <- dbConnect(<CONNECTION ARGUMENTS>)
query <- read_file("path/to/script.sql")
df <- as_tibble(dbGetQuery(con, query))
dbDisconnect(con)
However, this time my sql script has some Spanish characters in it. Say something like this:
select tree_id, tree
from forest.trees
where species = 'árbol'
When I read this script into R and make the query it just doesn't return anything, but if I copy and paste the sql script into an R string it works! So it seems that the problem is in the line where I read the script into R.
I tried changing the string's encoding in a couple of ways:
# none of these work
query <- read_file("path/to/script.sql")
Encoding(query) <- "latin1"
query <- readLines("path/to/script.sql", encoding = "latin1")
query <- paste0(query, collapse = " ")
Unfortunately I don't have a public database to offer to anyone reading this. I'm connecting to a postgreSQL 11 database.
--- UPDATE ----
I'm on a windows 10 machine, with US locale.
When I use the read_file function the contents of query look ok, the Spanish characters print out like they should, but when I pass it to dbGetQuery it just doesn't fetch anything.
I tried forcing encoding "latin1" because I found online that Spanish characters tend to fix in R when doing that. When doing this, the Spanish characters print out wrong, so I didn't expected it to work, and it didn't.
The character values in my database have 'utf-8' encoding.
Just to be completely clear, all my attempts to read the .sql script haven't worked, however this does work:
library(tidyverse)
library(DBI)
con <- dbConnect(<CONNECTION ARGUMENTS>)
query <- "select tree_id, tree from forest.trees where species = 'árbol'"
# df actually has results
df <- as_tibble(dbGetQuery(con, query))
dbDisconnect(con)
The encoding statement is telling R how to interpret the filename, not its contents. Try this instead:
filetext <- readLines(file("path/to/script.sql", encoding = "latin1"))
See this answer for more details:R: can't read unicode text files even when specifying the encoding
So after some time to think about it, I wondered why the solution proposed by MrFlick didn't work. I checked the encoding of the file created by this chunk:
query <- "select tree_id, tree from forest.trees where species = 'árbol'"
write_lines(query, "test.sql")
After checking what encoding did test.sql have, it turned out it was ANSI, but it didn't look right. So I manually changed my original script.sql encoding to ANSI. After that it worked totally fine.
This solution however, didn't work when I cloned my repo on an ubuntu environment. In ubuntu there was no problem with the original 'utf-8' encoding.
Hope this helps anyone dealing with this in windows.
So I am currently working with a connecting to an Access database. I am able to get connected to the Access DB which is located on my local system. This is actually connected to a SharePoint list. I would love to automate the process handling this SharePoint list with an R and Access combo! What I want to be able to do actually pretty basic, I want to introduce new data via a .csv which is processed for the relevant content and then compared to the current Access DB and finally the new information uploaded from r to Access.
I've learned that you need to pair the bit version of your Windows OS, Office version, and R version. So I am x64 on all of the above. This allowed me to connect to the Access DB. You also need the 'Microsoft Access Database Engine 2016 Redistributable' which is essentially the driver for the connection.
So what I have so far is:
library(odbc)
library(DBI)
file_path <- "C:/user/Documents/R Projects/...pathtofile.../filename.accdb"
accdb_con <- dbConnect(drv = odbc(), .connection_string = paste0("Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=",file_path,";"))
access.db <- dbReadTable(accdb_con, "sNPS Deep Dives")
That now connects!
I then read in a .csv of new information
new.df <- read.csv("C:/user/Documents/R projects/...pathtofile.csv", header=T, stringsAsFactors=FALSE, na.strings=c("","NA"))
an example of the data set might just look something like this:
date <- c("15/10/2018","15/10/2018", "16/10/2018", "12/11/2018", "07/09/2018")
score <- c("6", "10", "7", "10", "9")
group <- c("a","b", "b", "a", "b")
CaseID <- c("301", "302", "303", "304", "305")
new.df <- data.frame(date,score,group,CaseID)
new.df$date <- as.character(new.df$date)
new.df$score <- as.numeric(new.df$score)
new.df$group <- as.character(new.df$group)
new.df$CaseID <- as.numeric(new.df$CaseID)
Notably there are more columns in the Access DB that people will fill in by hand with further information.
and I process it to be ready go into the Access DB.
probably not that interesting...
Then I compare the the new data against the Access DB as such:
library(dplyr)
new <- anti_join(new.df, access.db, by= "Case.ID")
Now I've tried:
dbWriteTable(access.db.copy, new, append = TRUE)
dbAppendTable(access.db.copy, new)
I don't seem to be able to get this to go anywhere
I am getting an error:
Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘dbWriteTable’ for signature ‘"ACCESS", "data.frame", "missing"’
I've seen plenty of posts in which people are having trouble connecting to an Access DB but I haven't seen anything about writing new data into that database.
I know this isn't quite a reproducible example but it seems like a difficult problem to recreate since it's a connection problem between different tools. I would be happy to provide example sets that might make this easier
I would appreciate any direction you all can provide.
Thanks!
Edit:
It appears that Bing Sun was right, I was missing an argument. So it appears that we need something more like:
dbWriteTable(access.db.copy, "Name of table",new, append = TRUE)
Which produces the error:
Error in result_insert_dataframe(rs#ptr, values) :
nanodbc/nanodbc.cpp:1944: HY104: [Microsoft][ODBC Microsoft Access Driver]Invalid precision value
I wonder if this may something that is an error from Access about a file type?
now if I use the append I don't get an error I get a 0 for output
dbAppendTable(access.db.copy, "Name of table", new, append= TRUE)
With output:
[1] 0
But I don't see any of the new values when I check the Access file.
I know it's years later, but hopefully this will help someone else with this issue since you're right CrayCrayTown, there aren't very many posts covering this issue.
I've run into this problem repeatedly when dealing with R and MS Access. The solution that I've come up with is pretty "hacky" but it accomplishes what's trying to be done...just not very eloquently.
The way I do this is with a combo of RODBC and DBI packages.
First, I open a connection to the DB with RODBC, and use that connection to write my data to the DB as an intermediary table:
chan <- RODBC::odbcDriverConnection(connection = "/path/to/database.accdb")
RODBC::sqlSave(channel = chan,
dat = df,
tablename = "tbl_intermediary",
rownames = FALSE,
append = FALSE)
RODBC::odbcClose(chan)
rm(chan)
Make sure to close the RODBC connection, I also destroy it for good measure, because why not? I use RODBC for the intermediary table because it supports batch insert statements. I know that the same thing can, in theory, be done with DBI with DBI::dbAppendTable()(but we wouldn't be on this post if that worked how we had hoped). I tried this in a previous SO question here, but it didn't solve my problem. I also don't know how big my intermediary tables could get in the future. Hopefully by the time they get too big we'll be in a different DBMS.
Next, I reopen the connection, this time with DBI, and send a statement to the DB to write those data from the intermediary table to the final resting place for those data, and then drop the intermediary table.
con <- DBI::dbConnect(odbc::odbc(), .connection_string = "/path/to/database.accdb")
DBI::dbSendStatement(
conn = con,
statement = 'UPDATE
tbl_intermediary INNER JOIN final_tbl ON tbl_intermediary.SampleID = final_tbl.sampleNumber
SET
final_tbl.field1 = [tbl_intermediary].[field1],
final_tbl.notes = IIf(Nz([tbl_intermediary].[Notes],"")="",[final_tbl].[notes],[final_tbl].[notes] & "; Newest Notes: " & [tbl_intermediary].[Notes]);'
)
DBI::dbSendStatement(
conn = con,
statement = 'DROP TABLE tbl_intermediary;'
DBI::dbDisconnect(con)
rm(con)
)
The main reason why I chose this method is because some of the SQL I use with Access also has some VBA in it. When I send the SQL-VBA hybrid string with RODBC, I get assorted errors in the IIF() and Nz() functions (see example above). From the RODBC CRAN docs the query argument for the sqlQuery() function is strictly assumed to be a valid SQL statement. So, RODBC has no clue how to interpret the IIf() and Nz() MS Access functions. I think this also has to do with how the ODBC driver handles communication as well (please, someone correct me if I'm wrong about this).
As I understand it, DBI::dbSendStatment() however lets the database engine you're working with interpret how to use the statement argument you provide. In the situation above, the VBA is executed exactly how I would expect if it were run in Access directly. As per the DBI docs, for interactive use you'll generally want to use dbGetQuery or dbExecute.
I'm currently building a shiny application that needs to be translated in different languages. I have the whole structure but I'm struggling into getting values such as "Validació" that contain accents.
The structure I've followed is the following:
I have a dictionary that is simply a csv with the translation where
there's a key and then each language. The structure of this dictionary is the following:
key, cat, en
"selecció", "selecció", "Selection"
"Diferències","Diferències", "Differences"
"Descarregar","Descarregar", "Download"
"Diagnòstics","Diagnòstics", "Diagnoses"
I have a script that once the dictionary.csv is modified, generates a .bin file that later will be loaded in the code.
In strings.R I have all the strings that will appear on the code and I use a function to translate the current language to the one I want. The function is the following:
Code:
tr <- function(text){
sapply(text, function(s) translation[[s]][["cat"]], USE.NAMES=F)
}
When I translate something, since I am doing in another file, I assign it to another variable something like:
str_seleccio <- tr('Selecció)
The problem I'm facing is that for example if we translate 'Selecció' would be according to this function, tr('Selecció') and provides a correct answer if I execute it in the RStudio terminal but when I do it in the Shiny application, appears to me as a NULL. If the word I translate has no accents such as "Hello", tr("Hello") provides me a correct answer in the Shiny application and I can see it throught the code.
So mainly tr(word) gets the correct value but when assigning it "loses the value" so I'm a bit lost how to do it.
I know that you can do something like Encoding(str_seleccio) <- "UTF-8" but in this case is not working. In case of plain words it used to do but since when I asssign it, gets NULL is not working.
Any idea? Any suggestion? What I would like is to add something to tr function
The main idea comes from this repository that if you can take a look is the simplest version you can do, but (s)he has problem with utf-8 also.
https://github.com/chrislad/multilingualShinyApp
As in http://shiny.rstudio.com/articles/unicode.html suggested (re)save all files with UTF-8 encoding.
Additionaly change within updateTranslation.R:
translationContent <- read.delim("dictionary.csv", header = TRUE, sep = "\t", as.is = TRUE)
to:
translationContent <- read.delim("dictionary.csv", header = TRUE, sep = "\t", as.is = TRUE, fileEncoding = "UTF-8").
Warning, when you (re)save ui.R, your "c-cedilla" might get destroyed. Just re-insert it, in case it happens.
Happy easter :)
I am connecting to an Oracle database from R using ROracle. The problem is for every special utf-8 character it returns a question mark. Some Chinese values returns a solid string of question marks. I believe this is relevant because I haven't found any other question on this site (or others) that answers this for the package ROracle.
Some questions that were the most promising include an answer for MySQL: Fetching UTF-8 text from MySQL in R returns "????" but I was unable to make this work for ROracle. This site also provided some useful information https://docs.oracle.com/cd/E17952_01/mysql-5.5-en/charset-connection.html Before I was using RODBC and was easily able to configure the uft-8 encoding.
Here is some sample code... I am sorry that unless you have an Oracle database with utf-8 characters it may be impossible to duplicate... I also changed the host number and the sid for data privacy reasons...
library(ROracle)
drv <- dbDriver("Oracle")
# Create the connection string
host <- "10.00.000.86"
port <- 1521
sid <- "f110"
connect.string <- paste(
"(DESCRIPTION=",
"(ADDRESS=(PROTOCOL=tcp)(HOST=", host, ")(PORT=", port, "))",
"(CONNECT_DATA=(SID=", sid, ")))", sep = "")
con <- dbConnect(drv, username = "XXXXXXXXX",
password = "xxxxxxxxx",dbname=connect.string)
my.table <- dbReadTable(con, "DASH_D_PROJECT_INFO")
my.table[40, 1:3]
PROJECT_ID DATE_INPUT PROJECT_NAME
211625 2012-07-01 ??????, ?????????????????? ????? ??????, 1869?1917 [????? 3]
Any help is appreciated. I have read the entire documentation of the ROracle packages, and it seemed to have a solution for writing utf-8 characters, but not for reading them.
Okay after several weeks I found my own answer. I hope that it will be of value to someone else.
My question is largely answered by how Oracle stores the data. If you want UTF-8 characteristics preserverd you need the column in the table to be an NVARCHAR not just a varchar. At that point regular data pulling and encoding will work in R as expected. I was looking for the error in the wrong place.
I also want to mention one hang up on how to write utf-8 data from R to Oracle with utf-8
In writing files I had some that would not convert to UTF-8 in the following manner. So I did the step in too parts and wrote them in two steps to an oracle table. The results worked perfectly.
Encoding(my.data1$Project.Name) <- "UTF-8"
my.data1.1 <- my.data1[Encoding(my.data1$Project.Name) == "UTF-8", ]
my.data1.2 <- my.data1[Encoding(my.data1$Project.Name) != "UTF-8", ]
attr(my.data1.1$Project.Name, "ora.encoding") <- "UTF-8"
If you found this insightful give it an up vote so more can find it.
Anyone run in to his problem? I have a shiny app that uses SQL files to import data from MS SQL server using the RODBC package. I've narrowed down the problem to here in the server.R file:
ch <- odbcConnect(dsn = xxxxxx)
iQry <- readChar("LeaderDashInd.sql", file.info("LeaderDashInd.sql")$size, T)
oQry <- readChar("LeaderDashOrg.sql", file.info("LeaderDashOrg.sql")$size, T)
iDat <- sqlQuery(channel = ch, query = iQry, stringsAsFactors = F)
oDat <- sqlQuery(channel = ch, query = oQry, stringsAsFactors = F)
odbcClose(ch)
# PREPROCESSING --------------------------
cy <- max(iDat$CampYear)
The app stops at the last line above and gives... Error in iDat$CampYear : $ operator is invalid for atomic vectors. I know this chunk is the problem because when I have the same app run off imported csv files, it works.
A couple things to note:
This code runs first in the server.R file outside the shinyServer function.
The app runs fine when launched from my workstation through R Studio. It only stops working when run on our Shiny Server installation.
Shiny packages are up to date and shiny server is a recent install.
Any thoughts?
FYI... changing the SQL query to a stored procedure and then calling the stored procedure with RODBC::sqlQuery fixed the problem. The $ operator invalid error was coming up because the query was not running and returning an empty vector.
Still doesn't explain why the stand alone query using readChar doesn't work in Shiny Server but hey... its a fix.