Error Appending Data to Existing Empty Table on BigQuery Using R

I created an empty table from the Big Query GUI with the schema for table_name. Later I'm trying to append data to the existing empty table from R using the bigrquery package.
I have tried the code below:
upload_job <- insert_upload_job(project = "project_id",
                                dataset = "dataset_id",
                                table = "table_name",
                                values = values_table,
                                write_disposition = "WRITE_APPEND")
wait_for(upload_job)
But it throws an error saying:
Provided Schema does not match Table. Field alpha has changed mode from REQUIRED to NULLABLE [invalid]
My table doesn't have any NULL or NA in the mentioned column, and the data types in the schema match the data types of values_table exactly.
I also tried uploading directly from R without creating the schema first. When I do that, the mode is automatically converted to NULLABLE, which is not what I'm looking for.
I also tried changing write_disposition = "WRITE_TRUNCATE", which likewise converts the mode to NULLABLE.
I also looked at this and this, which didn't really help me.
Can someone explain what is happening behind the scenes, and what is the best way to upload the data without recreating the schema?
Note: There was an obvious typo. Earlier it was wirte_disposition; I edited it to write_disposition.
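What seems to be happening behind the scenes: when values is a plain data frame, bigrquery derives a schema from it and marks every field NULLABLE, and since relaxing a field from REQUIRED to NULLABLE counts as a schema change, BigQuery rejects the append even though the data contains no NULLs. One way around it is to pass the schema explicitly; a minimal sketch using the newer bigrquery API, where the field list is an assumption you would fill in to mirror the real table (alpha's STRING type is a placeholder):

library(bigrquery)

tbl <- bq_table("project_id", "dataset_id", "table_name")

# Spell the schema out so the REQUIRED modes survive the upload;
# list every field of the table, alpha is shown as an example
schema <- bq_fields(list(
  bq_field("alpha", "STRING", mode = "REQUIRED")
))

bq_table_upload(tbl,
                values = values_table,
                fields = schema,
                create_disposition = "CREATE_NEVER",  # the table already exists
                write_disposition = "WRITE_APPEND")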

Related

How to insert nested values from R into Big Query

I would like to insert data into BQ via R. When I have a normal table, everything is OK. The problem begins when I have to insert a table that contains a map (nested/record repeated) column.
The column is defined like this:
I use the bigrquery package and DBI, like this:
dbWriteTable(
  con,
  "database.table",
  table,
  overwrite = FALSE,
  append = TRUE,
  row.names = FALSE
)
How should I define the customerdata column in R to insert it into Big Query? I've tried JSON and a list, but neither worked, though I may also have written the JSON or list incorrectly :)
I know the example is not reproducible, but I don't see how to create one for this case.
Do you have any idea how to do this?
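One approach that may work, sketched under assumptions since the column definition isn't shown: if customerdata is a REPEATED RECORD of key/value pairs, bigrquery's bq_table_upload() can serialize an R list-column of data frames into it, which DBI::dbWriteTable() does not handle well. The project/dataset/table names and the key/value fields below are placeholders:

library(bigrquery)

# Hypothetical rows: each element of the list-column becomes that
# row's repeated records
table <- data.frame(id = c(1, 2))
table$customerdata <- list(
  data.frame(key = c("a", "b"), value = c("x", "y")),
  data.frame(key = "c", value = "z")
)

bq_table_upload(bq_table("project", "dataset", "table"),
                values = table,
                write_disposition = "WRITE_APPEND")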

Create database in Azure Databricks using R dataframe

I have my final output in an R dataframe and need to write it to a database in Azure Databricks. Can someone help me with the syntax? I used this code:
require(SparkR)
data1 <- createDataFrame(output)
write.df(data1, path = "dbfs:/datainput/sample_dataset.parquet",
         source = "parquet", mode = "overwrite")
This code runs without error, but I don't see the database in the datainput folder (mentioned in the path). Is there some other way to do it?
I believe you are looking for the saveAsTable function. write.df only saves the data to the file system; it does not register the data as a table.
require(SparkR)
data1 <- createDataFrame(output)
saveAsTable(data1, tableName = "default.sample_table", source = "parquet", mode = "overwrite")
In the above code, default is an existing database name, under which a new table named sample_table will be created. If you write just sample_table instead of default.sample_table, it will be saved in the default database.
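To confirm where the table landed, you can list the tables in that database from the same session; a quick check, nothing more:

require(SparkR)
tableNames("default")                           # should now include "sample_table"
head(sql("SELECT * FROM default.sample_table"))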

sparklyr::spark_write_jdbc Does not Accept Spark Dataframe?

I am working within Databricks, trying to use the sparklyr function spark_write_jdbc to write a dataframe to a SQL Server table. The server name, driver, etc. are correct and work, as I successfully used sparklyr::spark_read_jdbc() earlier in the code.
Per the documentation (here), spark_write_jdbc should accept a Spark Dataframe.
I used SparkR::createDataFrame() to convert the dataframe I was working with to a Spark dataframe.
Here is the relevant code:
events_long_test <- SparkR::createDataFrame(events_long, schema = NULL,
                                            samplingRatio = 1, numPartitions = NULL)
sparklyr::spark_write_jdbc(events_long_test,
                           name = "who_status_long_test",
                           options = list(url = url,
                                          user = user,
                                          driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
                                          password = pw,
                                          dbtable = "who_status_long_test"))
However, when I run this, it gives me the following error:
Error in UseMethod("spark_write_jdbc") :
  no applicable method for 'spark_write_jdbc' applied to an object of class "SparkDataFrame"
I have searched around and cannot find other people asking about this error. Why would it say this function cannot work with a Spark Dataframe, when the documentation says it does?
Any help is appreciated.
What is in events_long? The syntax is correct; make sure your connection properties in options are correct, and make sure that events_long_test is a Spark dataframe, not a table.
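For what it's worth, the error message itself hints at the likely cause: SparkR::createDataFrame() returns a SparkR SparkDataFrame, while sparklyr::spark_write_jdbc() is an S3 generic with methods only for sparklyr's own classes (such as tbl_spark), so method dispatch fails; the two packages do not share objects. A minimal sketch of the sparklyr-only route, assuming events_long is a local R data frame and the connection details are placeholders:

library(sparklyr)

sc <- spark_connect(method = "databricks")  # assumed; reuse your existing connection

# copy_to() returns a tbl_spark, the class spark_write_jdbc() dispatches on
events_long_test <- copy_to(sc, events_long, overwrite = TRUE)

spark_write_jdbc(events_long_test,
                 name = "who_status_long_test",
                 options = list(url = url,
                                user = user,
                                password = pw,
                                driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
                                dbtable = "who_status_long_test"))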

R dbReadTable returns no such table error

I ran this bit of code yesterday successfully, but today I am getting a 'no such table' error. I am trying to pull data from a table (named tblLatjamInputs) in an SQLite database into R using DBI::dbReadTable(), but it is acting as though the table does not exist.
Using SQLiteStudio, and separately the command line, I can see that the table does indeed exist and that there are data in it.
Here is the code, along with the error I'm getting.
setwd("D:/hsm/databases")
db <- dbConnect(SQLite(), conn = "lookup_and_tracking.sqlite")
tblName <- "tblLatjamInputs"
df.full <- dbReadTable(db, tblName)
Error in result_create(conn#ptr, statement) : no such table: tblLatjamInputs
I got the same error when the tblName line is changed to this: tblName <- dbQuoteIdentifier(db, "tblLatjamInputs")
dbListTables(db) returns character(0), and dbListFields(db, "lkpSpecies") (a different table in the db) returns the no such table error as well.
I checked that there are no spaces around the table name in the database. I also tried to pull data from other tables (to see if it was just an issue with this table), but I got the same error. I have tried disconnecting and reconnecting to the database multiple times, including disconnecting from the db, closing SQLiteStudio and the command line, and then reopening. I also tried closing everything, including R, reloading the project, and starting again from scratch. I also tried connecting to a different database altogether with the same results (R makes the connection, but can't seem to find any tables). I'm totally baffled because, as I mentioned, all this works fine in the command line, and I did this yesterday with the same database, table, and lines of code, and it worked fine.
Use
db <- dbConnect(SQLite(), "lookup_and_tracking.sqlite")
The problem is that the file name parameter is not named conn=; it's dbname=, and its default is "", which creates a new, empty database.
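With the argument named correctly, the connection opens the existing file instead of creating an empty one, and the reads behave again; a quick check:

library(DBI)
library(RSQLite)

db <- dbConnect(SQLite(), dbname = "lookup_and_tracking.sqlite")
dbListTables(db)                       # should now list tblLatjamInputs
df.full <- dbReadTable(db, "tblLatjamInputs")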

Persistence of data frames(R objects) into Database

I have a database table, let's say Table 1. Table 1 has 10 columns, let's assume:
column1,column2,column3,column4,column5,column6,column7,column8,column9,column10...
I have a data frame:
sample_frame <- data.frame(column1 = 1, column2 = 2, column3 = 3, column4 = 4)
I wish to persist the data frame, i.e. sample_frame, into my database table, i.e. Table 1.
Presently I am using the ROracle package to write into the database. The code I am using is as follows:
library(ROracle)
dbWriteTable(con, name = "Table 1", value = sample_frame, row.names = FALSE,
             overwrite = FALSE, append = TRUE, schema = "sample_schema")
I have created the connection object using dbConnect(). As far as the integrity and null constraints of Table 1 are concerned, I have taken care of them. When I try to write into the table using dbWriteTable(), the following error is thrown:
"ORA-00947: not enough values"
Can someone correct the method I am using, or provide an alternative method of inserting selective columns (the non-nullable ones) into Table 1 while leaving the other columns empty? I am using R 2.15.3.
As I mentioned in my comment, you are getting this error because you are creating sample_frame with fewer columns than the table has. Try this (if your actual table in the database has the same column names):
sample_frame <- data.frame(column1 = 1, column2 = 2, column3 = 3, column4 = 4,
                           column5 = 5, column6 = 6, column7 = 7, column8 = 8,
                           column9 = 9, column10 = 10)
library(ROracle)
dbWriteTable(con, name = "Table 1", value = sample_frame, row.names = FALSE,
             overwrite = FALSE, append = TRUE, schema = "sample_schema")
Update
Considering your new requirement, I would suggest you prepare an INSERT statement that names only the columns you want to fill, and run it with dbSendQuery(), for example:
qry <- 'INSERT INTO "Table 1" (column1, column2, column3, column4)
        VALUES (:1, :2, :3, :4)'  # bind values are taken from the data frame columns
dbSendQuery(con, qry, data = sample_frame)
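If your session does not autocommit (with ROracle you generally manage the transaction yourself), remember to commit the insert afterwards:

dbCommit(con)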
