I have an experiment (exp) that is published as a web service (exp [Predictive Exp.]) in Azure Machine Learning Studio. The data used by this experiment was pushed from R using the AzureML package:
library(AzureML)
ws <- workspace(
  id = 'xxxxxxxxx',
  auth = 'xxxxxxxxxxx'
)
upload.dataset(data_for_azure, ws, "data_for_azure")
The above worked, but let's say I want to update the dataset (same schema, just more rows added).
I tried this, but it does not work:
delete.datasets(ws, "data_for_azure")
refresh(ws, what = c("everything", "data_for_azure", "exp", "exp [Predictive Exp.]"))
I get an error stating the following:
Error: AzureML returns error code:
HTTP status code : 409
Unable to delete dataset due to lingering dependants
I went through the documentation, and I know that a simple refresh is not possible (the same name cannot be reused). The only alternative I see is to delete the web service and perform everything again.
Any solution would be greatly appreciated!
From the R documentation:
The AzureML API does not support uploads for replacing datasets with
new data by re-using a name. If you need to do this, first delete the
dataset from the AzureML Studio interface, then upload a new version.
Now, I think this is particular to the R SDK, as the Python SDK and the AzureML Studio UI let you upload a new dataset. I will check in with the R team about this.
I would recommend uploading it as a new dataset with a new name, and then replacing the dataset in your experiment with this new dataset. Sorry this seems roundabout, but I think it is the easier option.
Alternatively, you can upload the new version using AzureML Studio: go to +NEW, Dataset, select your file, and tick the checkbox that says this is an existing dataset. The filename should be the same.
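If you prefer to stay in R, a minimal sketch of the rename-and-replace approach could look like the following, assuming a versioned name such as "data_for_azure_v2" (the name is illustrative, not required by the API):
library(AzureML)
ws <- workspace(id = 'xxxxxxxxx', auth = 'xxxxxxxxxxx')
# Upload the updated data under a fresh name; the service will not
# overwrite a dataset that still has dependants.
upload.dataset(data_for_azure, ws, "data_for_azure_v2")
# After repointing the experiment to the new dataset in Studio,
# the old dataset can be deleted once nothing depends on it.
delete.datasets(ws, "data_for_azure")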
Can we create multiple sessions in the R language? I am not using RStudio or a Shiny app. I am just integrating R and Python using the pip module pypeR. I cannot find any R code showing how to create a session. My requirement is to create a new session for every R request coming from Python,
because whenever there are two requests to R at the same time my server gets blocked.
I have used the pypeR module to integrate R with Python.
import pyper
r = pyper.R(RCMD="/usr/bin/R")
path2script = "/path/to/my/script.R"  # path of my script file
output = r.run("source('" + path2script + "')")
I am a newbie in R, as I have worked more in Python. I have searched a lot about this issue, and it seems to me that R works only in a single session when called as a script through Python.
Any help would be appreciated.
I have two databases in Azure, each with 5 tables. I can perform data wrangling inside Azure using Kusto, but I would prefer using RStudio. I wish to connect R to Azure so that I can run a script in R and return results without importing the actual datasets.
Please help, where do I start? I have zero knowledge of such connections.
Thank you in advance.
Assuming you have already installed R and RStudio, please follow the steps below:
Open the ODBC Data Sources tool from the Start menu and add a User Data Source under the 'User DSN' tab. Follow through the next buttons until you finish, and test the connection.
Go to RStudio and create a new connection; you should see the Data Source added in the step above. You should then see the connection and its tables listed under the Azure SQL Database you connected to.
Run a query like the one below in the Console:
dbGetQuery(con, "SELECT * from dbo.xxxx")
You should be able to see the result set accordingly. You can play with queries in the way you want.
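For reference, here is a minimal sketch of opening the same connection from code rather than from the RStudio connection pane, using the DBI and odbc packages; the DSN name and credentials below are placeholders for whatever you configured in the ODBC setup:
library(DBI)
# Connect through the User DSN created in the ODBC Data Sources tool
con <- dbConnect(
  odbc::odbc(),
  dsn = "AzureSqlDsn",   # hypothetical DSN name from the setup above
  uid = "your_username", # placeholder credentials
  pwd = "your_password"
)
result <- dbGetQuery(con, "SELECT * from dbo.xxxx")
dbDisconnect(con)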
I'm trying to connect to Spark from an RStudio instance on IBM Watson Studio but I'm getting the following error.
No encoding supplied: defaulting to UTF-8. Error in force(code) :
Failed during initialize_connection: attempt to use zero-length
variable name
Log: /tmp/Rtmpdee7QC/file1b33141066_spark.log
---- Output Log ----
hummingbird kernel
http://localhost:8081/apsrstudio/agent/v1/kernel/hb-connect ; Time
Diff :1.31352798938751
{"code": "import sparklyr._"} ; Time Diff :0.00552034378051758
Here's the code I'm using to create the connection:
kernels <- load_spark_kernels()
sc <- spark_connect(config = kernels[1])
Any help would be highly appreciated!
I was able to fix this issue! It seems I was missing a project access token. Project access tokens can be created manually on the Settings page of your project, as described here. From the link shared above:
Create an access token on the Settings page of your project. Only project admins can create access tokens. The access token can have viewer or editor access permissions. Only editors can inject the token into a notebook.
After adding a project access token, I could connect to Spark using the code provided in the question with no problems.
kernels <- load_spark_kernels()
sc <- spark_connect(config = kernels[1])
If you are using IBM Watson Studio on Cloud and using RStudio in it, you should use list_spark_kernels() to list the kernels.
kernels <- list_spark_kernels()
Then use spark_connect() to connect to it.
One more thing: do not upgrade sparklyr, and if you did, uninstall the upgrade.
The sparklyr that RStudio on Watson Studio Cloud ships with is customized to be able to connect to the Spark service from IBM Cloud.
sc <- spark_connect(config = kernels[1])
Uninstalling or removing your version of sparklyr will load the original (customized) sparklyr.
Hope it helps.
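For completeness, a minimal sketch of removing a user-installed upgrade so the bundled, customized build is loaded again (this assumes the upgrade went into your user library, which is all remove.packages touches):
# Remove the user-installed upgrade; the customized sparklyr bundled
# with RStudio on Watson Studio Cloud stays in place.
remove.packages("sparklyr")
# Restart the R session, then reconnect with the bundled version.
library(sparklyr)
kernels <- list_spark_kernels()
sc <- spark_connect(config = kernels[1])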
I've been working with Sonar 4.5, although I would like to migrate to version 5.3. During my tests I've noticed that the Issues Report is not generated when the analysis type is publish (it needs to be preview), although we need to store the analysis in the DB on each Sonar run.
My question is: is there a way to force it to generate the Issues Report even when running in publish mode? (I suspect this limitation is related to the fact that publish mode takes longer to finish.)
PS: I need the Issues Report in order to extract metrics from it and publish them in TeamCity.
This is not possible anymore. If you want to retrieve information from SonarQube, you should use the standard public web services (WS) to achieve this.
Once the analysis has been sent to the server, you can check <work_dir>/report-task.txt to find out which URL to call in order to know when the report has been processed (the URL is given by the ceTaskUrl property in this file).
Once the call to this WS returns the SUCCESS status, you can query the /api/resources WS (using the key of the project given by the projectKey property) to get any information you want about the project that has just been analysed.
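As an illustration only, here is a rough R sketch of that flow using the httr package; the response shape of the task WS, the server URL, and the metric key are assumptions, so check them against the WS documentation of your SonarQube version:
library(httr)
# Parse the properties the scanner writes after analysis.
lines <- readLines("<work_dir>/report-task.txt")
ce_task_url <- sub("^ceTaskUrl=", "", grep("^ceTaskUrl=", lines, value = TRUE))
project_key <- sub("^projectKey=", "", grep("^projectKey=", lines, value = TRUE))
# Poll the compute-engine task until the report has been processed.
repeat {
  task <- content(GET(ce_task_url))
  if (task$task$status %in% c("SUCCESS", "FAILED", "CANCELED")) break
  Sys.sleep(2)
}
# Query the public WS for the project that was just analysed
# ("violations" is an example metric key).
res <- GET("http://your-sonar-host/api/resources",
           query = list(resource = project_key, metrics = "violations"))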
Recently Tableau added R connectivity in their 8.1 release. I want to know if there is any way I can pull an entire table created in R into Tableau, or an .rds object which contains the dataset.
There is a tutorial on the Tableau website for this, and a blog post on r-bloggers that discusses it. The tutorial has a number of comments, and one of them (in early Dec, I think) asks how to get an .rds file in. You need to start Rserve and then execute a script on it to get your data.
Sorry I can't be more help, as I only looked into it briefly and put it on the back-burner, but if you get stuck they seem to come back quickly if you post a comment on the page:
http://www.tableausoftware.com/about/blog/2013/10/tableau-81-and-r-25327
Just pointing out that the Tableau Data Extract API might be useful here, even if the current version of the R integration doesn't yet meet your needs. (Note that the link is to the version 8.1 docs released in late 2013, so look for the latest version to see what functionality has been added since.)
If what you want to do is manipulate data in R and then send a table of data to Tableau for visualization, you could first try the simple step of exporting the data from R as a CSV file and then visualizing that data in Tableau. I know that's not sexy, but it's always good to make sure you have a way to get the output you need before investing time in optimizing the process.
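For instance, a one-line sketch of that export, assuming your result sits in a data frame called dataset (the name is illustrative):
# Write the R result out as CSV; open this file from Tableau
# via its text-file connector.
write.csv(dataset, "dataset_for_tableau.csv", row.names = FALSE)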
If that gets the effect you want but you just want to automate more of the steps, then take a look at the Tableau Data Extract API. You could use that library to generate a Tableau Data Extract instead of a CSV file. If you have something in production that needs updates, you could presumably create a Python script or JVM program to read your .rds file periodically and generate a revised extract.
Let us assume your data.frame/tibble etc. (say, a dataset object) is ready in R/RStudio and you want to connect it with Tableau.
1. In RStudio (or R terminal), execute the following steps:
install.packages("Rserve")
library(Rserve)
Rserve()  # This gets the R connection service up and running
2. Now go to Tableau (I am using 10.3.2):
Help > Settings and Performance > Manage External Service Connection
Enter localhost in the Server field and click on Test Connection.
You have now established a connection between R and Tableau.
3. Come back to RStudio. Now we need a .rdata file that will contain our R object(s), in this case dataset. This is the R object that we want to use in Tableau. Enter this in the R console:
save(dataset, file="objectName.rdata")
4. Switch to Tableau now.
Connect To a File > Statistical File
Go to your working directory where the newly created objectName.rdata resides. From the drop-down list of file types, select R files (*.rdata, *.rda) and select your object. This will open the object you created in R in Tableau. Alternatively, you can drag and drop your object directly into Tableau's workspace.