How to insert nested values from R into BigQuery

I would like to insert data into BQ via R. When I have a normal table, everything is OK. The problem begins when I have to insert a table which contains a map (nested/record repeated) column.
The column is defined like this:
I use the bigrquery package and DBI like this:
dbWriteTable(
  con,
  "database.table",
  table,
  overwrite = FALSE,
  append = TRUE,
  row.names = FALSE
)
How should I define the customerdata column in R to insert it into BigQuery? I've tried JSON and a list, but neither worked, although my JSON or list may also have been written incorrectly. :)
I know the example is not reproducible, but it is hardly possible to create one here, or at least I have no idea how to do it.
Do you have any idea how to do this?
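One possible direction (a sketch under assumptions, not a confirmed answer; the sub-field names key and value below are made up) is to build customerdata as a list-column of data frames and upload it with bigrquery's bq_table_upload(), which in recent bigrquery versions serializes list-columns as repeated RECORD values:
library(bigrquery)

tbl <- data.frame(id = c(1L, 2L))
# Each list element holds the repeated records for the matching parent row.
tbl$customerdata <- list(
  data.frame(key = c("a", "b"), value = c("x", "y")),
  data.frame(key = "c", value = "z")
)

bq_table_upload(
  bq_table("my-project", "database", "table"),
  values = tbl,
  create_disposition = "CREATE_NEVER",
  write_disposition = "WRITE_APPEND"
)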

Related

Error Appending Data to Existing Empty Table on BigQuery Using R

I created an empty table from the BigQuery GUI with the schema for table_name. Later I'm trying to append data to the existing empty table from R using the bigrquery package.
I have tried the code below:
upload_job <- insert_upload_job(project = "project_id",
                                dataset = "dataset_id",
                                table = "table_name",
                                values = values_table,
                                write_disposition = "WRITE_APPEND")
wait_for(upload_job)
But it is throwing an error saying:
Provided Schema does not match Table. Field alpha has changed mode from REQUIRED to NULLABLE [invalid]
My table doesn't have any NULL or NA values in the mentioned column, and the data types in the schema match exactly the data types of values_table.
I also tried uploading directly from R without creating the schema first. In that case the mode is automatically converted to NULLABLE, which is not what I'm looking for.
I also tried changing write_disposition = "WRITE_TRUNCATE", which also converts the mode to NULLABLE.
I also looked at this and this, which didn't really help me.
Can someone explain what is happening behind the scenes and what is the best way to upload data without recreating the schema?
Note: there was an obvious typo earlier; wirte_disposition has been edited to write_disposition.
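One thing worth trying (a sketch under assumptions, not a confirmed fix from this thread; the field names and types are placeholders) is to pass an explicit schema so bigrquery does not infer every field as NULLABLE:
library(bigrquery)

# Hypothetical schema: declare the modes yourself instead of letting
# bigrquery infer NULLABLE for every field.
fields <- bq_fields(list(
  bq_field("alpha", "STRING", mode = "REQUIRED"),
  bq_field("beta", "INTEGER", mode = "REQUIRED")
))

bq_table_upload(
  bq_table("project_id", "dataset_id", "table_name"),
  values = values_table,
  fields = fields,
  create_disposition = "CREATE_NEVER",
  write_disposition = "WRITE_APPEND"
)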

Using RODBC sqlQuery in R to import a long string, but R truncates the string; how to get around this?

I'm using the R library RODBC to import the results of a SQL stored procedure and save them in a data frame, then export that data frame using write.table to write it to an XML file (the result from SQL is an XML output).
However, R is truncating the string (the imported XML results from SQL).
I've tried to find a function or an option to expand the size/length of an R data frame cell but didn't find any.
I also tried using sqlQuery directly in the write.table statement to avoid the intermediate data frame, but the imported data from SQL is still truncated.
Does anyone have any suggestions or an answer that could help me?
Here is my code:
#library & starting the sql connection
library(RODBC)
my_conn<-odbcDriverConnect('Driver={SQL Server};server=sql2014;database=my_conn;trusted_connection=TRUE')
#Create a folder and a path to save my output
x <- "C:/Users/ATM1/Documents/R/CSV/test"
dir.create(x, showWarnings=FALSE)
setwd(x)
Mpath <- getwd()
#importing the data from sql store procedure output
xmlcode1 <- sqlQuery(my_conn, "exec dbo.p_webDefCreate 'SA25'", stringsAsFactors = FALSE, as.is = TRUE)
#writing to a file
write.table(xmlcode1, file=paste0(Mpath,"/SA5b_def.xml"), quote = FALSE, col.names = FALSE, row.names = FALSE)
What I get is plain text that is not the full output.
The code below is how I check the current length of my string (stri_length() comes from the stringi package):
stri_length(xmlcode1)
[1] 65534
I had a similar issue with our project: the data coming from the db was getting truncated to 257 characters, and I could not really get around it. Eventually I converted the column definition on the db table from varchar(max) to varchar(8000) and I got all the characters back. I did not mind changing the table definition.
In your case you can perhaps convert the column type in your proc output to varchar with some defined size, if possible; a sketch of this follows below.
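A minimal sketch of that suggestion, assuming you can edit the procedure (xml_output is a hypothetical column name): inside dbo.p_webDefCreate, wrap the returned column in a CAST so the driver sees a bounded width, e.g. SELECT CAST(xml_output AS VARCHAR(8000)) AS xml_output. The R side can then stay as it was:
library(RODBC)

my_conn <- odbcDriverConnect('Driver={SQL Server};server=sql2014;database=my_conn;trusted_connection=TRUE')

# With the CAST applied inside the procedure, the full string comes through.
xmlcode1 <- sqlQuery(my_conn, "exec dbo.p_webDefCreate 'SA25'",
                     stringsAsFactors = FALSE, as.is = TRUE)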
I am using PostgreSQL but experienced the same issue of truncation upon importing into R with the RODBC package. I used Michael Kassa's solution with a slight change: I set the data type to text, which can store a string of unlimited length per postgresqltutorial. This worked for me.
The TEXT data type can store a string with unlimited length.
varchar() also worked for me.
If you do not specify the n integer for the VARCHAR data type, it behaves like the TEXT data type. The performance of VARCHAR (without the size n) and TEXT is the same.

Get all rows that DBI::dbWriteTable has just written

I want to use dbWriteTable() of R's DBI package to write data into a database. Usually, the respective tables are already present, so I use the argument append = TRUE. How do I get which rows were added to the table by dbWriteTable()? Most of the tables have certain columns with UNIQUE values, so a SELECT will work (see below for a simple example). However, this is not true for all of them, or only several columns together are UNIQUE, making the SELECT more complicated. In addition, I would like to put the writing and querying into a function, so I would prefer a consistent approach for all cases.
I mainly need this to get the PRIMARY KEYs added by the database and to allow a user to quickly see what was added. If it matters, my database is PostgreSQL and I would like to use the odbc package for the connection.
I have something like this in mind, however, I am looking for a more general solution:
library(DBI)
con <- dbConnect(odbc::odbc(), dsn = "database")
dbWriteTable(con,
             name = "site",
             value = data.frame(name = c("abcd", "efgh")),
             append = TRUE)
dbGetQuery(con,
           "SELECT * FROM site WHERE name IN ('abcd', 'efgh');")

Persistence of data frames (R objects) into a database

I have a database table, let's say Table 1. Table 1 has 10 columns, let's assume:
column1, column2, column3, column4, column5, column6, column7, column8, column9, column10...
I have a data frame such as:
sample_frame <- data.frame(column1 = 1, column2 = 2, column3 = 3, column4 = 4)
I wish to persist the data frame, i.e. sample_frame, into my database table, i.e. Table 1.
Presently I am using the ROracle package to write into the database. The code I am using is as follows:
library(ROracle)
dbWriteTable(con, name = "Table 1", value = sample_frame, row.names = FALSE,
             overwrite = FALSE, append = TRUE, schema = "sample_schema")
I have created the connection object using dbConnect(). As far as the integrity and null constraints of Table 1 are concerned, I have taken care of them. When I try to write into the table using dbWriteTable(), the following error is thrown:
"ORA-00947: not enough values"
Can someone correct the method I am using, or provide an alternative method of inserting selected columns (the non-nullable columns) into Table 1 while leaving the other columns empty? I am using R 2.15.3.
As I mentioned in my comment, you are getting this error because you are creating sample_frame with fewer columns than the table has... Try this (if your actual table in the database has the same column names):
sample_frame <- data.frame(column1 = 1, column2 = 2, column3 = 3, column4 = 4,
                           column5 = 5, column6 = 6, column7 = 7, column8 = 8,
                           column9 = 9, column10 = 10)
library(ROracle)
dbWriteTable(con, name = "Table 1", value = sample_frame, row.names = FALSE,
             overwrite = FALSE, append = TRUE, schema = "sample_schema")
Update
Considering your new requirement, I would suggest you prepare a query that names only the columns you want to fill and run it with dbSendQuery(); a sketch follows below.
qry <- "..."  # your query
dbSendQuery(con, qry)
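A minimal sketch of that approach, assuming the real table is named TABLE1 (an Oracle identifier containing a space would need double quotes) and that only the four non-nullable columns need values; the remaining columns are left NULL. ROracle binds the data frame columns to the :1 to :4 placeholders:
library(ROracle)

qry <- "INSERT INTO TABLE1 (column1, column2, column3, column4)
        VALUES (:1, :2, :3, :4)"

# dbSendQuery() executes the statement once per row of sample_frame.
res <- dbSendQuery(con, qry, data = sample_frame)
dbCommit(con)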

Replacing a MySQL table using R code

I am trying to export data from an R data frame into a MySQL table named X using the sqlSave function of RODBC. But, as per my process, I need to replace the data in table X every time. Can anyone let me know whether I can do that inside R, either by dropping the table and then creating a new one with sqlSave, or whether there is a way to replace table X with new data?
Thanks!!!
If you need to overwrite the table:
The simplest way is to use RMySQL and dbWriteTable().
dbWriteTable(connection, "MyTable", your_data_frame, overwrite = TRUE, row.names = FALSE)
PS: if the table doesn't exist, it will be created on the first call.
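A minimal, self-contained sketch with hypothetical connection details; each run replaces the contents of table X with the current data frame:
library(RMySQL)

con <- dbConnect(MySQL(), host = "localhost", dbname = "mydb",
                 user = "user", password = "password")

# overwrite = TRUE drops and recreates X, then writes the new rows.
dbWriteTable(con, "X", my_data_frame, overwrite = TRUE, row.names = FALSE)

dbDisconnect(con)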
