h2o mojo predict in R Shiny - r

I think I have exhausted the entire internet looking for an example of, or an answer to, my question about using an H2O MOJO model to predict within R Shiny. We have created a number of models and wish to predict scores in an R Shiny front end where users enter values. However, the prediction code below fails with this error:
Warning: Error in checkForRemoteErrors: 6 nodes produced errors; first
error: No method asJSON S3 class: H2OFrame
dataInput <- dfName
dataInput <- toJSON(dataInput)
rawPred <- as.data.frame(h2o.predict_json(model= "folder/mojo_model.zip", json = dataInput, genmodelpath = "folder/h2o-genmodel.jar"))
Can anyone help with some pointers?
Thanks,
Siobhan

This is not a Shiny issue. The error indicates that you're trying to use toJSON() on an H2OFrame (instead of an R data.frame), which will not work because the jsonlite library does not support that.
Instead you can convert the H2OFrame to a data.frame using:
dataInput <- toJSON(as.data.frame(dataInput))
I can't guarantee that toJSON() will generate the correct input for h2o.predict_json() since I have not tried that, so you will have to try it out yourself. Note that the only way this may work is if this is a 1-row data.frame because the h2o.predict_json() function expects a single row of data, encoded as JSON. If you're trying to score multiple records, you'd have to loop over the rows. If for some reason toJSON() doesn't give you the right format, then you can use a function I wrote in this post here to create the JSON string from a data.frame manually.
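As a rough sketch of the "loop over the rows" idea, the snippet below builds one JSON object string per row using base R only. The helper name row_to_json and the example columns are hypothetical, and it does not escape quotes inside values. Whether h2o.predict_json() accepts these exact strings on your platform is an assumption I have not verified.

```r
# Hypothetical helper: encode row i of a data.frame as a single JSON object.
row_to_json <- function(df, i) {
  fields <- vapply(names(df), function(nm) {
    value <- df[[nm]][i]
    # Numbers stay bare; everything else is wrapped in double quotes.
    rendered <- if (is.numeric(value)) {
      as.character(value)
    } else {
      paste0('"', as.character(value), '"')
    }
    paste0('"', nm, '":', rendered)
  }, character(1))
  paste0("{", paste(fields, collapse = ","), "}")
}

# Illustrative data only; column names are made up.
df <- data.frame(V1 = c(1, 2),
                 REGION = c("LONDON & SE", "NORTH"),
                 stringsAsFactors = FALSE)

# One JSON string per row; each one would be passed separately as json =.
json_rows <- vapply(seq_len(nrow(df)), function(i) row_to_json(df, i), character(1))
print(json_rows)
```

Each element of json_rows could then be scored in its own h2o.predict_json() call and the results combined afterwards.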
There is a ticket open to create a better version of h2o.predict_json() that will support making predictions from a MOJO on data frames (with multiple rows) without having to convert to JSON first. This will make it so you can avoid dealing with JSON altogether.
An alternative is to use an H2O binary model instead of a MOJO, along with the standard predict() function. The only requirement here is that the model must be loaded into the H2O cluster's memory.

The following works now, using the JSON formatting from the first two lines and single quotes around values that contain spaces.
df <- data.frame(V1 = 1, V2 = 1, CMPNY_EL_IND = 1, UW_REGION_NAME = "'LONDON & SE'")
dfstr <- sapply(1:ncol(df), function(i) paste(paste0('\"', names(df)[i], '\"'), df[1,i], sep = ':'))
json <- paste0('{', paste0(dfstr, collapse = ','), '}')
dataPredict <- as.data.frame(h2o.predict_json(model = "D:\\GBM_model_0_CMP.zip", json = json, genmodelpath = "D:\\h2o-genmodel.jar", labels = TRUE))

Related

How to import a dynamodb table attribute into a dataframe in R and plot the dataframe using ggplot

I am trying to import a DynamoDB table into R using the paws.database library. I can retrieve the required attribute into R with a scan operation. However, the imported data is a nested list, i.e. in [[ ]] form. I want to convert the imported DynamoDB attribute into a data frame so that I can later plot it with ggplot. I have tried options such as
df <- ldply(list_a, data.frame),
data.frame(matrix(unlist(list_a), nrow=length(list_a),
byrow=TRUE),stringsAsFactors=FALSE),
as.data.frame(do.call(cbind,list_a))
so far, but I was unable to convert the data into a proper data frame. The final error I get in ggplot is "Error: data must be a data frame, or other object coercible by fortify(), not a list".
Could anyone please help?
See this similar issue.
I'm also using paws. Here's what I did to work with a small DynamoDB table:
library(data.table)  # provides rbindlist()
dyna <- paws::dynamodb()
Table <- dyna$scan("table_name")
newtable <- rbindlist(Table$Items, fill = TRUE)
Then I create a new dataframe by using unlist() on each column of newtable.
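A minimal base R sketch of the unlist() step, assuming each scanned item is a named list whose attributes are wrapped in a type descriptor such as list(S = "a") or list(N = "1.5"). The mock items and the helper name flatten_item are made up for illustration; the real shape of Table$Items may differ.

```r
# Mock of the nested list a scan might return (assumed shape, not real data).
items <- list(
  list(id = list(S = "a"), value = list(N = "1.5")),
  list(id = list(S = "b"), value = list(N = "2.0")),
  list(id = list(S = "c"), value = list(N = "3.25"))
)

# Hypothetical helper: unlist() strips the type wrapper from each attribute,
# leaving a one-row data frame per item.
flatten_item <- function(item) {
  as.data.frame(lapply(item, function(x) unname(unlist(x))),
                stringsAsFactors = FALSE)
}

# Stack the one-row data frames into a single data frame.
df <- do.call(rbind, lapply(items, flatten_item))

# Numeric attributes are strings in this mock, so convert them explicitly.
df$value <- as.numeric(df$value)

str(df)
```

Because df is a plain data.frame, it can be passed to ggplot() directly.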

Analyze code in dataframe with Tidycode in R

I am trying to take R code, stored in cells of the content column of a dataframe, and analyze the functions used by applying the Tidycode package. However, I first need to convert the data to a Matahari tibble before applying an unnest_calls() function.
Here is the data:
data <- read.csv("https://github.com/making-data-science-count/TidyTuesday-Analysis/raw/master/db-tmp/cleaned%20database.csv")
I have tried doing this in a number of different ways, including extracting each row (in the content column) as an R file and then reading it back in with Tidycode calls, for example:
tmp <- data$content2[1]
writeLines(tmp, "tmp.R") #I've also used save() and write()
rfile<-tidycode::read_rfiles("tmp.R")
But, I keep getting errors such as: "Error in parse(text = x) : <text>:1:14: unexpected symbol
1: library(here)library"
Ultimately, what I would like to do is analyze the different types of code per file, and keep that linked with the other data in the data dataframe, such as date and username.
Any help would be greatly appreciated!

How to export an R dataframe to a Power Query table

I'm using an R script within Power Query to do some data transformations and return a scaled table.
My R code is like this:
# 'dataset'
It does seem odd that this fails to return anything. A quick search online turned up a three-minute YouTube video that uses the same method you are using. Digging further for a source, one comes across the Microsoft documentation, which gives a possible reason for the issue.
When preparing and running an R script in Power BI Desktop, there are a few limitations:
Only data frames are imported, so make sure the data you want to import to Power BI is represented in a data frame
Columns that are typed as Complex and Vector are not imported, and are replaced with error values in the created table
These seem like the most obvious reasons. Assuming there are no complex columns in your dataset, I believe the former is the likely cause. A quick recreation of your dataset shows that the scale function turns your dataset into a matrix. cbind preserves this, so output is of class matrix and not data.frame.
>dataset <- as.data.frame(abs(matrix(rnorm(1000),ncol=4)))
>class(dataset)
[1] "data.frame"
>library(dplyr)
>df_normal <- log(dataset + 1) %>%
> select(c(2:4)) %>%
> scale
>class(df_normal)
[1] "matrix"
>df_normal <- cbind(dataset[,1], df_normal)
>output <- df_normal
>class(output)
[1] "matrix"
A simple fix would then seem to be adding output <- as.data.frame(output), which is in line with the Power BI documentation. The script may also need a return-like statement at the end; adding a final line that simply states output should take care of that.
Edit
For clarification, I believe the following edited version of your script should return the expected data:
# 'dataset' holds the input data for this script
library(dplyr)
df_normal <- log(dataset+1) %>%
select(c(2:4)) %>%
scale
df_normal <-cbind(dataset[,c(1)], df_normal)
output <- as.data.frame(df_normal)
#output ##This line might be needed without the first comment
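To check the class change outside Power Query, here is a small base R sketch (no dplyr) on made-up random data. It only confirms that scale() yields a matrix and that as.data.frame() converts the combined result back; it does not test the Power BI import itself.

```r
set.seed(1)
# Made-up stand-in for the Power Query 'dataset' input.
dataset <- as.data.frame(abs(matrix(rnorm(40), ncol = 4)))

# scale() returns a matrix even when given a data frame.
scaled <- scale(log(dataset[, 2:4] + 1))
scaled_is_matrix <- is.matrix(scaled)

# cbind() keeps the matrix class; as.data.frame() converts it back.
combined <- cbind(dataset[, 1], scaled)
output <- as.data.frame(combined)

print(scaled_is_matrix)
print(class(output))
```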

Why am I getting different output from the Alteryx R tool

I am using the Alteryx R Tool to sign an Amazon HTTP request. To do so, I need the hmac function that is included in the digest package.
I'm using a text input tool that includes the key and a datestamp.
Key= "foo"
datestamp= "20120215"
Here's the issue. When I run the following script:
the.data <- read.Alteryx("1", mode="data.frame")
write.Alteryx(base64encode(hmac(the.data$key,the.data$datestamp,algo="sha256",raw = TRUE)),1)
I get an incorrect result when compared to when I run the following:
write.Alteryx(base64encode(hmac("foo","20120215",algo="sha256",raw = TRUE)),1)
The difference is that when I hardcode the values for the key and object I get the correct result, but if I use the variables from the R data frame I get incorrect output.
Does the data frame alter the data in some way? Has anyone come across this when working with the R Tool in Alteryx?
Thanks for your input.
The issue appears to be that when creating the data frame, your character variables are converted to factors. The way to fix this with the data.frame constructor function is
the.data <- data.frame(Key="foo", datestamp="20120215", stringsAsFactors=FALSE)
I haven't used read.Alteryx but I assume it has a similar way of achieving this.
Alternatively, if your data frame has already been created, you can convert the factors back into character:
write.Alteryx(base64encode(hmac(
as.character(the.data$Key),
as.character(the.data$datestamp),
algo="sha256",raw = TRUE)),1)
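The factor conversion can be seen in isolation with base R. This sketch sets stringsAsFactors = TRUE explicitly, because that was the data.frame() default before R 4.0.0. It does not call hmac or read.Alteryx, so how those functions treat a factor is not shown here.

```r
# Reproduce the old default behaviour explicitly.
the.data <- data.frame(Key = "foo", datestamp = "20120215",
                       stringsAsFactors = TRUE)

# The columns are factors, not character strings.
key_is_factor <- is.factor(the.data$Key)

# A factor is stored as integer level codes, so a function that looks at
# the underlying values may see 1 instead of "20120215".
level_code <- as.integer(the.data$datestamp)

# as.character() recovers the original text.
key_chr <- as.character(the.data$Key)
stamp_chr <- as.character(the.data$datestamp)

print(c(key_chr, stamp_chr))
```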

How to save and load the output of seqefsub() in TraMineR

I have a long dataset on which I want to experiment with different settings for the seqefsub() function, and depending on the settings, one run can take a relatively long time. Therefore I want the computer to calculate all the different variations so that I can evaluate the results later and possibly use them for further processing.
My problem is when I save the results in a file and load them, the structure of the data appears to be broken. As a result I cannot use the TraMineR functions on this data after I load it, hence I need to reproduce all the calculations every single time after closing R.
Saving to the workspace with RStudio (.RData) gives the same error. Saving to binary format gives the same error.
This is what the sequence list looks like in RStudio before saving:
And after loading:
This is the code I used for this example:
library(TraMineR)
data(actcal.tse)
seqe <- seqecreate(actcal.tse[1:100, ])
fsub <- seqefsub(seqe, minSupport = 0.1)
save(fsub, file="fsub.rda")
rm(fsub)
load("fsub.rda")
Details of my system:
x86_64-pc-linux-gnu (Ubuntu 14.04 LTS)
R version 3.2.0 (2015-04-16)
RStudio Version 0.98.1103
TraMineR stable version 1.8-9 (Built: 2015-04-22)
If you check the value returned by seqefsub(), it is a subseqelist object. This kind of object contains other objects, listed in the docs as:
seqe: The list of sequences in which the subsequences were searched (a seqelist event sequence object).
subseq: A list of subsequences (a seqelist event sequence object).
data: A data frame containing details (support, frequency, ...) about the subsequences
And others. To save the results, I converted the data I needed into plain vectors and built a data frame from them before saving it.
library(TraMineR)
data(actcal.tse)
seqe <- seqecreate(actcal.tse[1:100, ])
fsub <- seqefsub(seqe, minSupport = 0.1)
# Keep only the data I need
# (explore the other objects to get what you need)
# Get the Support column from data (which is a data frame)
support <- fsub$data$Support
# subseq has a class that cannot be converted to a data frame;
# it stores the subsequences found, so I convert them to strings
sequences <- as.character(fsub$subseq)
#Builds the data frame
result <- data.frame(sequences, support)
# Save it in the home directory
save(result, file="~/result.rda")
rm(result)
load('~/result.rda')
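Once the results are in a plain data frame, a round trip through disk can be checked like this. The sketch uses saveRDS()/readRDS() instead of save()/load(), which is my substitution rather than part of the answer above, and the sequence strings and support values are invented placeholders, not TraMineR output.

```r
# Placeholder results; in practice these come from fsub as shown above.
result <- data.frame(sequences = c("(A)", "(A)-(B)"),
                     support = c(0.6, 0.3),
                     stringsAsFactors = FALSE)

# Write to a temporary file and read it back.
path <- tempfile(fileext = ".rds")
saveRDS(result, path)
restored <- readRDS(path)

# Compare the restored copy with the original.
round_trip_ok <- isTRUE(all.equal(result, restored))
print(round_trip_ok)
```

Unlike load(), readRDS() returns the object, so it can be assigned to any name.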
I hope it still helps you.
