Get data from Workspace in Azure ML Studio - r

I have a problem connecting to a workspace in Azure ML Studio. I am using the azuremlsdk library, but it doesn't work.
My code looks like this:
library(azuremlsdk)
workspace_name = 'workspace_name'
subscription_id = 'subscription_id'
resource_group = 'resource_group'
ws <- get_workspace(name = workspace_name, subscription_id = subscription_id, resource_group = resource_group)
dataSet <- get_dataset_by_name(ws, name = "registration_name", version = "latest")
After I run that code I get an error:
I have no idea what is wrong. I tried to do it in Python and it works fine with the same parameters; the code:
from azureml.core import Workspace, Dataset
subscription_id = 'subscription_id'
resource_group = 'resource_group'
workspace_name = 'workspace_name'
workspace = Workspace(subscription_id, resource_group, workspace_name)
dataset = Dataset.get_by_name(workspace, name='registration_name')
Any idea how I can fix this bug?

I tried to reproduce the issue and got a solution. Follow the procedure below to access the dataset contents from the workspace.
# Make sure remotes and the SDK are installed before loading azuremlsdk
if (!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes")
remotes::install_cran('azuremlsdk', repos = 'http://cran.us.r-project.org', INSTALL_opts = c("--no-multiarch"))
library(azuremlsdk)
# 'authentication' must be an authentication object created beforehand
ws <- get_workspace(name = workspace_name, subscription_id = subscription_id, resource_group = resource_group, auth = authentication)
print(ws)
dataSet <- get_dataset_by_name(ws, name = "registration_name", version = "latest")
dataSet
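The answer's get_workspace() call passes an auth object that is never defined above. A minimal sketch of creating one with azuremlsdk's interactive login helper might look like this (the tenant ID, workspace name, subscription ID and resource group are placeholders; service_principal_authentication() is the non-interactive alternative):

```r
library(azuremlsdk)

# Interactive browser login; supply your Azure AD tenant ID
# ("tenant_id" here is a placeholder value).
authentication <- interactive_login_authentication(tenant_id = "tenant_id")

ws <- get_workspace(name = "workspace_name",
                    subscription_id = "subscription_id",
                    resource_group = "resource_group",
                    auth = authentication)
```

This is a sketch under the assumption that interactive login is acceptable in your environment; automated pipelines usually use a service principal instead.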

Related

Writing to database in parallel in R

I am trying to write a table that is a processed subset of a global data variable. In a normal for loop this piece of code works fine, but when I try to do it in parallel it raises an error.
Here is my code:
library(doParallel)
library(foreach)
library(odbc)
library(data.table)
nc <- detectCores() - 1
cs <- makeCluster(nc)
registerDoParallel(cs)
con <- dbConnect(odbc(),driver = 'SQL Server',server = 'localserver',database = 'mydb', encoding = 'utf-8',timeout = 20)
range_to <- 1e6
set.seed(1)
random_df <- data.table(a = rnorm(n = range_to,mean = 2,sd = 1),
b = runif(n = range_to,min = 1,max = 300))
foreach(i=1:1000,.packages = c('odbc','data.table')) %dopar% {
subk <- random_df[i,]
subk <- subk**2
odbc::dbWriteTable(conn = con,name = 'parallel_test',value = subk,row.names = FALSE,append = TRUE)
}
This code raises the following error:
Error in {: task 1 failed - "unable to find an inherited method for function 'dbWriteTable' for signature '"Microsoft SQL Server", "character", "data.table"'"
As I said before, in a normal for loop it works fine.
Thanks in advance.
I solved the issue by changing only the way the connection object is created:
parallel::clusterEvalQ(cs, {
  library(odbc)
  con <- dbConnect(odbc(), driver = 'SQL Server', server = 'localserver', database = 'mydb', encoding = 'utf-8', timeout = 20)
})
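For completeness, here is a sketch of the whole job with that fix applied (assuming the same local SQL Server settings as above). The key point is that DBI connection objects do not survive serialization to worker processes, so each worker must open its own connection in its own global environment instead of receiving the master's:

```r
library(doParallel)
library(foreach)
library(odbc)
library(data.table)

nc <- detectCores() - 1
cs <- makeCluster(nc)
registerDoParallel(cs)

# Open one connection per worker; connections cannot be
# serialized and shipped from the master process.
parallel::clusterEvalQ(cs, {
  library(odbc)
  con <- dbConnect(odbc(), driver = 'SQL Server', server = 'localserver',
                   database = 'mydb', encoding = 'utf-8', timeout = 20)
})

range_to <- 1e6
set.seed(1)
random_df <- data.table(a = rnorm(range_to, mean = 2, sd = 1),
                        b = runif(range_to, min = 1, max = 300))

# .noexport prevents any master-side `con` from being shipped;
# each worker resolves `con` to its own connection.
foreach(i = 1:1000, .packages = c('odbc', 'data.table'),
        .noexport = "con") %dopar% {
  subk <- random_df[i, ]^2
  dbWriteTable(con, name = 'parallel_test', value = subk,
               row.names = FALSE, append = TRUE)
}

stopCluster(cs)
```

This cannot run without the database, so treat it as a template rather than a verified script.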

Error while using tm1r to send dataset. How to debug this issue?

I am currently developing a database-integrated forecasting tool for a customer, mainly using R and TM1 Perspectives. To connect R with TM1 I use tm1r. Data import from TM1 to R works fine.
However, when I try to write the calculated forecast back from R to TM1, I run into problems.
I will give you some reprex data so you can have a look at the output. If I use "tm1_send_data" instead of "tm1_send_dataset", it works fine too.
The latter function gives me this error:
"Error in if (is.character(txt) && length(txt) == 1 && nchar(txt, type = "bytes") < :
missing value where TRUE/FALSE needed"
I have no clue what this is supposed to mean! I tried some formatting of the data types, without any effect.
library(tm1r)
# data
values <- data.frame(fake_values =
c(105,147,159,232,312,337,285,188,257,10,98,27)
)
date_stamps <- c("2021001","2021002","2021003","2021004","2021005","2021006","2021007","2021008","2021009","2021010","2021011","2021012")
rownames(values) = date_stamps
# Send dataset to TM1
con_obj <- tm1_connection("localhost", "8840", "test_admin", "")
tm1_send_dataset(
con_obj,
valueset = values, cube = "pvl_FORECAST_HILFSWÜRFEL",
rowdim = "PVL_test_Zeit", coldim = "pvl_Produkt",
titledim1 = "DATENART", titleel1 = "FC",
titledim2 = "Version", titleel2 = 'Version_Bearbeitung',
titledim3 = "FC-Scheibe", titleel3 = "ML_FC_2021",
titledim4 = "PVL_test_Kunde", titleel4 = "MGR_domestic_D",
titledim5 = "PVL_test_Measure", titleel5 = "Menge_EA"
)

biomaRt getBM not working due to getNodeSet {XML} error

I'm running the following code in R and it is not working. getBM seems not to work for any arguments. Am I doing something wrong?
library(biomaRt)
ensembl <- useEnsembl(biomart = "genes")
ensembl <- useEnsembl(biomart = "ensembl",
dataset = "hsapiens_gene_ensembl",
mirror = "useast")
affyids <- c("202763_at","209310_s_at","207500_at")
getBM(attributes = c('affy_hg_u133_plus_2', 'entrezgene_id'),
filters = 'affy_hg_u133_plus_2',
values = affyids,
mart = ensembl)
The error I'm getting is
Error in getNodeSet(html, path = "//div[@class='plain-box float-right archive-box']")[[1]] : subscript out of bounds
I tried this both in R version 3.6.3 and 4.1.
Ensembl is temporarily unavailable according to https://www.ensembl.org/info/.
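While the main site is down, one workaround is to fall back to one of the Ensembl mirrors that useEnsembl() supports (mirror values include "useast" and "asia"). A hedged sketch:

```r
library(biomaRt)

# Try the default Ensembl host first; if that fails
# (e.g. the site is temporarily unavailable), fall back to a mirror.
ensembl <- tryCatch(
  useEnsembl(biomart = "ensembl", dataset = "hsapiens_gene_ensembl"),
  error = function(e) {
    useEnsembl(biomart = "ensembl",
               dataset = "hsapiens_gene_ensembl",
               mirror = "asia")
  }
)
```

This only helps when a mirror is up; if all Ensembl hosts are down, the query has to wait.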

Conceptual Structure doesn't accept clust

I use the latest version of the package and I try to run this example:
library(bibliometrix)
download.file("https://www.bibliometrix.org/datasets/joi.zip", destfile = temp<-tempfile())
M <- convert2df(readLines(unz(temp, "joi.txt")), dbsource="isi",format="plaintext")
CS <- conceptualStructure(M, method="MCA", field="ID", minDegree=10, clust=5, stemming=FALSE, labelsize=8,documents=20)
but I receive this error:
Error in conceptualStructure(M, method = "MCA", field = "ID", minDegree = 10, :
unused argument (clust = 5)
What should I change?
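Since the installed version evidently does not accept a clust argument, a quick generic check (plain base-R debugging, not specific to bibliometrix) is to inspect the signature your installed copy actually exports and compare the package version against the current release:

```r
library(bibliometrix)

# Show which arguments the installed conceptualStructure() accepts
args(conceptualStructure)

# Check the installed package version
packageVersion("bibliometrix")
```

If the argument is missing from the printed signature, updating the package (or adapting the call to the arguments it does list) is the way forward.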

Lexical Error while running featuretoolsR

In an effort to test how featuretools works, I installed featuretoolsR through RStudio, and installed numpy and featuretools in Python.
However, on trying to create an entityset, the following error comes up:
# Libs
library(featuretoolsR)
library(magrittr)
# Create some mock data
set_1 <- data.frame(key = 1:100, value = sample(letters, 100, T))
set_2 <- data.frame(key = 1:100, value = sample(LETTERS, 100, T))
# Create entityset
es <- as_entityset(set_1, index = "key", entity_id = "set_1", id = "demo")
Error: lexical error: invalid char in json text.
WARNING: The conda.compat modul
(right here) ------^
Kindly help in diagnosing and providing a solution to the same.
The same warning happened to me after updating to conda version 4.6.11. I think the problem is caused by the print statement at the end of the compat.py script. I know this is not a great fix, but I accessed the compat.py file and removed the print statement:
print("WARNING: The conda.compat module is deprecated and will be removed in a future release.", file=sys.stderr)
The file should be located here: \Anaconda3\pkgs\conda-4.6.11-py37_0\Lib\site-packages\conda
I hope it helps.
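A less invasive alternative to editing compat.py (an assumption based on the warning text, which says the module is deprecated and will be removed) is to upgrade conda itself, since later releases no longer ship the deprecated module:

```shell
# Upgrade conda in the base environment
conda update -n base -c defaults conda
```

After upgrading, reinstalling is unnecessary; the warning simply stops being printed, so the JSON that featuretoolsR parses is no longer polluted.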
