Pentaho, R Executor insert into DB

Is it possible to run an R script with Pentaho but, instead of exporting the result as a CSV file, insert the result directly into a table in a DB?

Using the Community Edition of Pentaho, you could use a Script Executor step to run a shell script on your OS that does all the work, including inserting into the database. That is not really Pentaho-specific: all the work is done by the shell script, and Pentaho is only used to trigger its execution (a minimal R sketch of such a script appears at the end of this answer).
There is also a very old plugin on GitHub (I don't know whether it works with modern versions of Pentaho and R) that executes R code within Pentaho and then passes the data stream on to "normal" steps, such as Table Output, to insert the data into a table.
The developers' instructions for configuring that plugin are here:
http://dekarlab.de/wp/?p=5
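If you go the shell-script route described above, the R script it launches can write straight to the database itself. A minimal sketch, assuming the DBI and RPostgres packages and a PostgreSQL target; the table name and connection details are placeholders, and you would swap in the driver for your own database (e.g. RMariaDB or odbc):
library(DBI)
# Any R computation producing the result you would otherwise export as CSV
result <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
# Placeholder connection details -- adjust for your environment
con <- dbConnect(RPostgres::Postgres(), host = "dbhost", dbname = "mydb",
                 user = "etl_user", password = Sys.getenv("DB_PASSWORD"))
dbWriteTable(con, "iris_summary", result, overwrite = TRUE)  # create or replace the target table
dbDisconnect(con)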

Related

AzerothCore : Import the update of database

Hello, I wanted to ask: to import the .sql updates (after a git pull), do I have to assemble and merge them with the bash file (app/db_assembler), or is it OK if I just launch worldserver.exe and it will do it?
Thanks
Short answer
No, the worldserver process will NOT update your database.
You need to use the DB-assembler bash script, as the instructions say.
More details
This is different from TrinityCore, where updating the database is a feature of the worldserver process.
In AzerothCore this task is the responsibility of an external script, written in bash: the DB-assembler.
The advantages of having an external script do this task instead of the worldserver are:
You don't need to compile and run the worldserver if you only need to create the database (useful when using or developing tools that only need the DBs)
The DB assembler is able to generate a unique SQL update file per each DB (by merging all the single SQL update files), which can be useful for debugging or development purposes
In general, it is better to delegate different tasks to different software components, instead of having a monolith that does everything.
You can also write your own merge script and apply the result manually, or just merge with db_assembler.sh and then apply manually.
Otherwise, refer to Francesco's answer.

Error: oauth_listener() needs an interactive environment

I'm using Shiny and R to create a little web app that pulls data from Google BigQuery and spits it out onto the page. It uses the bigrquery package.
When running the script from inside R (source("x.R")) everything runs fine; however, when using Rscript x.R I get the error. I'm trying to set up cron to run the script automatically.
There is a .httr-oauth file in the directory of the script.
This question is similar to another one I answered where I suggested using a Google Service Account for server-to-server authentication using the googleAuthR and bigQueryR packages for R. Please refer to that answer (via the above link) for details including an example R script.
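For reference, a minimal non-interactive sketch. It uses bigrquery's own service-account authentication rather than the googleAuthR/bigQueryR pair mentioned above; the key path, project id and query are placeholders:
library(bigrquery)
# Authenticate with a service-account JSON key instead of the interactive OAuth flow,
# so the script can run under Rscript/cron without a browser.
bq_auth(path = "/path/to/service-account-key.json")
tb <- bq_project_query("my-gcp-project", "SELECT name, value FROM `my_dataset.my_table`")
result <- bq_table_download(tb)  # regular data frame for the rest of the script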
In the end I've decided to just use Python to pull data from BigQuery and use it in R.

Execute R script from SSIS Package

I want to execute R code from an SSIS package. How can I add a data control step that executes R code? SSIS supports only VB.NET and ASP.NET.
SSIS has many data transformations available, but R is very friendly when it comes to data manipulation.
I want to run R code from SSIS scripts or in some other way. Basically, I'm trying to integrate R into an ETL process.
I want to extract (E) data from a CSV file,
transform (T) it in R, and load (L) it into a Microsoft database.
Is it possible to get this workflow done in an SSIS package by executing an R script using SSIS data control items? Thanks!
Here are a couple of ways you could integrate R into your ETL process.
Crude, fast and dirty - an Execute Process Task in the Control Flow. This would be similar to calling Rscript from the command line. You would likely do your transformation, save the result to a file on disk, and pass that filename out of your Execute Process Task so you can feed it into a Data Flow task. The upside is that you keep your R clean and separate from your C#/VB. (A rough sketch of such a script appears after this list.)
Integrated via Rdotnet - You could use the RDotNet library (I believe, haven't tried to integrate it). You would need to register the DLLs in the GAC, and then you can either work with .NET objects in your SSIS scripts or call R scripts directly.
Integrated in SQL Server 2016 - Microsoft has added R support via extended stored procedures. You call the R script via stored proc and use a sql query for input data and can store the output. See more detail here. This would mean utilizing an Execute SQL task in SSIS.
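To illustrate the first option above, the R script invoked by the Execute Process Task could look something like the sketch below; the file arguments and the column names used in the transformation are placeholders:
# transform.R -- called from SSIS as: Rscript transform.R input.csv output.csv
# Reads the extracted CSV, applies a transformation, writes the result for the Data Flow task.
args <- commandArgs(trailingOnly = TRUE)
input_file <- args[1]
output_file <- args[2]
df <- read.csv(input_file, stringsAsFactors = FALSE)
df$total <- df$quantity * df$unit_price  # placeholder transformation
write.csv(df, output_file, row.names = FALSE)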
I hope this helps you or someone else. Since you want data processing, you might bring your dataset into a CSV file (through a Data Flow task), then execute your script with the "Rscript" command (it can be run as a command with the Execute Process Task). Inside the script you load the dataset into a data frame (for example with read.csv()), do all the math/calculations you need, write the data or calculation results to a CSV file, and read it back in from SSIS.
It is not an elegant solution, but it works :), at least until Microsoft integrates R as a control/data flow process.
CYA
PS: here is how to execute files from the command line: Run R script from command line

R dataset connection to tableau

Tableau recently added R connectivity in release 8.1. I want to know if there is any way I can pull an entire table created in R into Tableau, or an .rds object which contains the dataset?
There is a tutorial on the Tableau website for this and a blog post on r-bloggers which discusses it. The tutorial has a number of comments, and one of them (in early December, I think) asks how to get an .rds file in. You need to start Rserve and then execute a script on it to get your data.
Sorry I can't be more help, as I only looked into it briefly and put it on the back burner, but if you get stuck they seem to come back quickly if you post a comment on the page:
http://www.tableausoftware.com/about/blog/2013/10/tableau-81-and-r-25327
Just pointing out that the Tableau Data Extract API might be useful here, even if the current version of R integration doesn't yet meet your needs. (Note: that link is to the version 8.1 docs released in late 2013, so look for the latest version to see what functionality they've added since.)
If what you want to do is to manipulate data in R and then send a table of data to Tableau for visualization, you could first try the simple step of exporting the data from R as a CSV file and then visualizing that data in Tableau. I know that's not sexy, but it's always good to make sure you've got a way to get the output result you need before investing time in optimizing the process.
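For the CSV step, assuming your table is in a data frame called dataset (the file name is a placeholder), that can be as simple as:
write.csv(dataset, "dataset_for_tableau.csv", row.names = FALSE)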
If that gets the effect you want, but you just want to automate more of the steps, then take a look at the Tableau Data Extract API. You could use that library to generate a Tableau Data Extract instead of a CSV file. If you have something in production that needs updates, then you could presumably create a python script or JVM program to read your RDS file periodically and generate a revised extract.
Let us assume your data.frame/tibble etc. (say, a dataset object) is ready in R/RStudio and you want to connect it with Tableau.
1. In RStudio (or R terminal), execute the following steps:
install.packages("Rserve")
library(Rserve)
Rserve() ##This gets the R connection service up and running
2. Now go to Tableau (I am using 10.3.2):
Help > Settings and Performance > Manage External Service Connection
Enter localhost in the Server field and click on Test Connection.
You have now established a connection between R and Tableau.
3. Come back to RStudio. Now we need an .rdata file that will contain our R object(s), in this case dataset. This is the R object that we want to use in Tableau. Enter this in the R console:
save(dataset, file="objectName.rdata")
4. Switch to Tableau now.
Connect To a File > Statistical File
Go to the working directory where the newly created objectName.rdata resides. From the drop-down list of file types, select R files (*.rdata, *.rda) and select your object. This will open the object you created in R in Tableau. Alternatively, you can drag and drop your object directly onto Tableau's workspace.

Load sqlite database into Postgres

I have been developing locally for some time and am now pushing everything to production. Of course I was also adding data to the development server without thinking that I hadn't reconfigured it to be Postgres.
Now I have a SQLite DB whose information I need to be on a remote VPS, in a Postgres DB there.
I have tried dumping to a .sql file but am getting a lot of syntax complaints from Postgres. What's the best way to do this?
For pretty much any conversion between two databases the options are:
Do a schema-only dump from the source database. Hand-convert it and load it into the target database. Then do a data only dump from the source DB in the most compatible form of SQL dump it offers. Try loading that into the target DB. When you hit problems, script transformations to the dump using sed/awk/perl/whatever and try again. Repeat until it loads and the results match.
Like (1), hand-convert the schema. Then write a script in your preferred language that connects to both databases, SELECTs from one, and INSERTs into the other, possibly with some transformations of data types and representations (a minimal R sketch of this approach appears after this list).
Use an ETL tool like Talend or Pentaho to connect to both databases and convert between them. ETL tools are like a "somebody else already wrote it" version of (2), but they can take some learning.
Hope that you can find a pre-written conversion tool. Heroku has one called sequel that will work for SQLite -> PostgreSQL; I don't know whether it is available outside Heroku or whether it can work without the rest of the Heroku infrastructure and code.
After any of those, some post-transfer steps, like using setval() to initialize sequences, are typically required.
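For option (2), here is a minimal R sketch of such a copy script, assuming the DBI, RSQLite and RPostgres packages, that the hand-converted schema already exists on the Postgres side, and placeholder file names and connection details:
library(DBI)
# Placeholder connection details -- adjust paths, credentials and host.
src <- dbConnect(RSQLite::SQLite(), "development.sqlite3")
dst <- dbConnect(RPostgres::Postgres(), host = "my-vps", dbname = "production",
                 user = "appuser", password = Sys.getenv("PGPASSWORD"))
for (tbl in dbListTables(src)) {
  data <- dbReadTable(src, tbl)   # pull the whole table into a data frame
  # ...apply any type/representation fixes here (dates, booleans, etc.)...
  dbWriteTable(dst, tbl, data, append = TRUE)  # assumes the table already exists in Postgres
}
dbDisconnect(src)
dbDisconnect(dst)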
Heroku's database conversion tool is called sequel. Here are the ruby gems you need:
gem install sequel
gem install sqlite3
gem install pg
Then this worked for me for a sqlite database file named 'tweets.db' in the current working directory:
sequel -C sqlite://tweets.db postgres://pgusername:pgpassword@localhost/pgdatabasename
PostgreSQL supports "foreign data wrappers", which allow you to directly access any data source through the DB, including SQLite, even up to automatically importing the schema. You can then use CREATE TABLE localtbl AS (SELECT * FROM remotetbl) to get your data into actual PG storage.
https://wiki.postgresql.org/wiki/Foreign_data_wrappers
https://github.com/pgspider/sqlite_fdw
