I am working on the Kaggle Digit Recognizer problem. When I tried the given code, I got this error:
Error in eval(expr, envir, enclos) : could not find function "eval"
library(ggplot2)
library(proto)
library(readr)
train <- data.frame(read_csv("../input/train.csv"))
labels <- train[,1]
features <- train[,-1]
rowsToPlot <- sample(1:nrow(train), 49)
rowToMatrix <- function(row) {
  intensity <- as.numeric(row) / max(as.numeric(row))
  return(t(matrix(rgb(intensity, intensity, intensity), 28, 28)))
}
geom_digit <- function(digits, labels) {
  GeomRasterDigit$new(geom_params = list(digits = digits), stat = "identity",
                      position = "identity", data = NULL, inherit.aes = TRUE)
}
I get the error when I run the following segment:
GeomRasterDigit <- proto(ggplot2:::GeomRaster, expr = {
  draw_groups <- function(., data, scales, coordinates, digits, ...) {
    bounds <- coord_transform(coordinates,
                              data.frame(x = c(-Inf, Inf), y = c(-Inf, Inf)),
                              scales)
    x_rng <- range(bounds$x, na.rm = TRUE)
    y_rng <- range(bounds$y, na.rm = TRUE)
    rasterGrob(as.raster(rowToMatrix(digits[data$rows, ])),
               x_rng[1], y_rng[1], diff(x_rng), diff(y_rng),
               default.units = "native", just = c("left", "bottom"),
               interpolate = FALSE)
  }
})
Link to the complete code:
https://www.kaggle.com/benhamner/digit-recognizer/example-handwritten-digits/code
Take a look at the latest ggplot2 code on GitHub: ggproto now replaces proto, among other changes (the drawing method is draw_panel, and coordinates are transformed through the coord object rather than coord_transform()). Rewritten along those lines, the code below should work:
GeomRasterDigit <- ggproto("GeomRasterDigit", ggplot2:::GeomRaster,
  draw_panel = function(data, panel_scales, coord, digits, ...) {
    bounds <- coord$transform(data.frame(x = c(-Inf, Inf), y = c(-Inf, Inf)),
                              panel_scales)
    x_rng <- range(bounds$x, na.rm = TRUE)
    y_rng <- range(bounds$y, na.rm = TRUE)
    rasterGrob(as.raster(rowToMatrix(digits[data$rows, ])),
               x_rng[1], y_rng[1], diff(x_rng), diff(y_rng),
               default.units = "native", just = c("left", "bottom"),
               interpolate = FALSE)
  }
)
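Note that the layer constructor changes as well: with ggproto, geoms are wired up through layer() rather than $new(), so geom_digit would become something along these lines (a sketch only, not tested against the rest of the script):
geom_digit <- function(digits) {
  # build a layer around the ggproto geom; digits is forwarded as a geom parameter
  layer(geom = GeomRasterDigit, stat = "identity", position = "identity",
        data = NULL, inherit.aes = TRUE, params = list(digits = digits))
}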
There is a vignette about ggproto that is a good read.
I tried running this code and I am getting this error message:
"Must subset columns with a valid subscript vector. Can't convert from double to integer due to loss of precision." Could someone help me either fix the call or convert the argument so that the function recognizes the data frame columns appropriately?
data1 <- wins.df(data1, data1$q, wins.limits = c(.01, .99), append.wins.label = FALSE, verbose = TRUE)
Here is the function:
wins.df <- function(X,
                    var,
                    wins.limits = c(.01, .99),
                    append.wins.label = TRUE,
                    verbose = TRUE) {
  Y <- X
  x <- X[, var]
  x.w <- wins(x, wins.limits)
  var.w <- var
  if (append.wins.label)
    var.w <- paste(var, ".w", sep = "")
  Y[, var.w] <- x.w
  if (verbose) {
    print(summary(Y[, var.w]))
    print(summary(X[, var]))
  }
  return(Y)
}
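For what it's worth, that error usually comes from subsetting a tibble's columns with a double vector of values rather than with a column name or an integer position. In the call above, data1$q passes the column's values into var, so X[, var] tries to subset columns by those doubles. A minimal illustration with hypothetical data (I'm guessing at the intent here):
library(tibble)

d <- tibble(q = rnorm(5), r = runif(5))

# Subsetting columns with the column's *values* (a double vector) raises the
# "Can't convert from double to integer due to loss of precision" error:
# d[, d$q]

# Subsetting with the column *name* (or an integer position) works:
d[, "q"]
If that is the cause, calling wins.df(data1, "q", ...) instead of wins.df(data1, data1$q, ...) may be what was intended.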
This question already exists:
R: Convert "grob" (graphical object) to "ggplot" [duplicate]
I am working with the R programming language. I am trying to convert a "grob" object into a "ggplot" object (the goal is eventually to convert the ggplot object into a "plotly" object).
I am looking for the simplest way to convert a "grob" to a "ggplot". The computer I am using does not have a USB port or an internet connection; it only has R with some preloaded libraries (e.g. ggplot2, ggpubr).
In my example, I generated some data, ran a statistical model ("random forest"), and plotted the results using "compressed" axes (t-SNE). The code below can be copy/pasted into R, and the resulting plot ("final_plot") is the object that I want to convert to "ggplot":
library(cluster)
library(Rtsne)
library(dplyr)
library(randomForest)
library(caret)
library(ggplot2)
library(plotly)
#PART 1 : Create Data
#generate 4 random variables : response_variable ~ var_1 , var_2, var_3
var_1 <- rnorm(10000,1,4)
var_2<-rnorm(10000,10,5)
var_3 <- sample( LETTERS[1:4], 10000, replace=TRUE, prob=c(0.1, 0.2, 0.65, 0.05) )
response_variable <- sample( LETTERS[1:2], 10000, replace=TRUE, prob=c(0.4, 0.6) )
#put them into a data frame called "f"
f <- data.frame(var_1, var_2, var_3, response_variable)
#declare var_3 and response_variable as factors
f$response_variable = as.factor(f$response_variable)
f$var_3 = as.factor(f$var_3)
#create id
f$ID <- seq_along(f[,1])
#PART 2: random forest
#split data into train set and test set
index = createDataPartition(f$response_variable, p=0.7, list = FALSE)
train = f[index,]
test = f[-index,]
#create random forest statistical model
rf = randomForest(response_variable ~ var_1 + var_2 + var_3, data=train, ntree=20, mtry=2)
#have the model predict the test set
pred = predict(rf, test, type = "prob")
labels = as.factor(ifelse(pred[,2]>0.5, "A", "B"))
confusionMatrix(labels, test$response_variable)
#PART 3: Visualize in 2D (source: https://dpmartin42.github.io/posts/r/cluster-mixed-types)
gower_dist <- daisy(test[, -c(4,5)],
metric = "gower")
gower_mat <- as.matrix(gower_dist)
labels = data.frame(labels)
labels$ID = test$ID
tsne_obj <- Rtsne(gower_dist, is_distance = TRUE)
tsne_data <- tsne_obj$Y %>%
data.frame() %>%
setNames(c("X", "Y")) %>%
mutate(cluster = factor(labels$labels),
name = labels$ID)
plot = ggplot(aes(x = X, y = Y), data = tsne_data) +
geom_point(aes(color = labels$labels))
plotly_plot = ggplotly(plot)
a = tsne_obj$Y
a = data.frame(a)
data = a
data$class = labels$labels
decisionplot <- function(model, data, class = NULL, predict_type = "class",
                         resolution = 100, showgrid = TRUE, ...) {
  if (!is.null(class)) cl <- data[, class] else cl <- 1
  data <- data[, 1:2]
  k <- length(unique(cl))

  plot(data, col = as.integer(cl) + 1L, pch = as.integer(cl) + 1L, ...)

  # make grid
  r <- sapply(data, range, na.rm = TRUE)
  xs <- seq(r[1, 1], r[2, 1], length.out = resolution)
  ys <- seq(r[1, 2], r[2, 2], length.out = resolution)
  g <- cbind(rep(xs, each = resolution), rep(ys, times = resolution))
  colnames(g) <- colnames(r)
  g <- as.data.frame(g)

  ### guess how to get class labels from predict
  ### (unfortunately not very consistent between models)
  p <- predict(model, g, type = predict_type)
  if (is.list(p)) p <- p$class
  p <- as.factor(p)

  if (showgrid) points(g, col = as.integer(p) + 1L, pch = ".")

  z <- matrix(as.integer(p), nrow = resolution, byrow = TRUE)
  contour(xs, ys, z, add = TRUE, drawlabels = FALSE,
          lwd = 2, levels = (1:(k - 1)) + .5)
  invisible(z)
}
model <- randomForest(class ~ ., data=data, mtry=2, ntrees=500)
#this is the final plot
final_plot = decisionplot(model, data, class = "class", main = "rf (1)")
From here, I am trying to convert this object ("final_plot") into a ggplot object:
library(ggpubr)
final = ggpubr::as_ggplot(final_plot)
But this gives me the following error:
Error in gList(...) : only 'grobs' allowed in "gList"
From here, I eventually would have wanted to use this command to convert the ggplot into a plotly object:
plotly_plot = ggplotly(final)
Does anyone know of a straightforward way to convert "final_plot" into a ggplot object (and then into a plotly object)? I don't have the ggplotify library.
Thanks
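One observation that may help: decisionplot() draws with base graphics (plot, points, contour) and only returns the prediction matrix invisibly, so final_plot is not a grob at all, which is why gList() complains. Without ggplotify, one possible route is to rebuild the same picture directly in ggplot2 by predicting the random forest over the same grid; this is a sketch under that assumption (the X1/X2 column names come from the data.frame() defaults used above):
library(ggplot2)
library(plotly)

resolution <- 100
r <- sapply(data[, 1:2], range, na.rm = TRUE)
grid <- expand.grid(X1 = seq(r[1, 1], r[2, 1], length.out = resolution),
                    X2 = seq(r[1, 2], r[2, 2], length.out = resolution))
grid$class <- predict(model, grid)  # predicted class for each grid point

gg_final <- ggplot() +
  geom_point(data = grid, aes(X1, X2, color = class),
             shape = 15, size = 0.3, alpha = 0.2) +    # shaded decision regions
  geom_point(data = data, aes(X1, X2, color = class)) + # the t-SNE points
  ggtitle("rf (1)")

plotly_final <- ggplotly(gg_final)
This is not the original decisionplot() output (no contour lines), just an approximation that yields a genuine ggplot object.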
library(RSSL)
set.seed(1)
df <- generateSlicedCookie(1000,expected=FALSE) %>%
add_missinglabels_mar(Class~.,0.98)
class_erlr <- EntropyRegularizedLogisticRegression(Class ~., df, lambda=0.01,lambda_entropy = 100)
In the EntropyRegularizedLogisticRegression function from the RSSL package, the example in the documentation passes the formula Class ~ . as the input. Looking at the source code, these are the parameters of the function:
function (X, y, X_u = NULL, lambda = 0, lambda_entropy = 1, intercept = TRUE,
init = NA, scale = FALSE, x_center = FALSE)
I tried manually defining what X, y, X_u are based on the df I generated. But running the following gives me an error with the optimization:
y <- df$Class
X <- df[, -1]
ids <- which(is.na(y))
X_u <- X[ids, ]
class_erlr_manual <- EntropyRegularizedLogisticRegression(X = X, y = y, X_u = X_u, lambda=0.01,lambda_entropy = 100)
The error reads:
Error in optim(w, fn = loss_erlr, gr = grad_erlr, X, y, X_u, lambda = lambda, :
initial value in 'vmmin' is not finite
Why does changing the formula input Class ~ . into X = X, y = y, X_u = X_u result in an error? Can anyone point me to where in the source code the formula input is being used?
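I cannot verify this against the RSSL internals, but if I recall the documentation correctly, the matrix interface expects X and y to hold only the labeled rows, with the unlabeled rows passed separately as X_u; keeping NA labels inside y would plausibly make the initial loss non-finite. A sketch of what I would try under that assumption:
library(RSSL)

labeled <- !is.na(df$Class)

X   <- as.matrix(df[labeled, setdiff(names(df), "Class")])    # labeled features
y   <- df$Class[labeled]                                      # labels of labeled rows only
X_u <- as.matrix(df[!labeled, setdiff(names(df), "Class")])   # unlabeled features

class_erlr_manual <- EntropyRegularizedLogisticRegression(
  X = X, y = y, X_u = X_u,
  lambda = 0.01, lambda_entropy = 100
)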
When I try to create a scatterplot matrix, I get this error:
Error in grid.Call.graphics(C_downviewport, name$name, strict) :
  Viewport 'plot_01.panel.1.1.off.vp' was not found
How can I fix it?
varNum <- function(x) {
  val <- 1:ncol(x)
  names(val) <- colnames(x)
  return(val)
}
varNum(house)
Bedroom SquareFeet Followers VisitingTime TotalPrice UnitPrice
1 2 3 4 5 6
District Location
7 8
house1 <- house[,c(7,1:6)]
offDiag <- function(x, y, ...) {
  panel.grid(h = -1, v = -1, ...)
  panel.hexbinplot(x, y, xbins = 15, ..., border = gray(.7),
                   trans = function(x) x^.5)
  # panel.loess(x, y, ..., lwd = 2, col = 'red')
}
onDiag <- function(x, ...) {
  yrng <- current.panel.limits()$ylim
  d <- density(x, na.rm = TRUE)
  d$y <- with(d, yrng[1] + 0.95 * diff(yrng) * y / max(y))
  panel.lines(d, col = rgb(.83, .66, 1), lwd = 2)
  diag.panel.splom(x, ...)
}
splom(house1,as.matrix = TRUE,
xlab = '',main = "Beijing Housing Variables",
pscale = 0, varname.cex = 0.8,axis.text.cex = 0.6,
axis.text.col = "purple",axis.text.font = 2,
axis.line.tck = .5,
panel = offDiag,
diag.panel = onDiag
)
Error in grid.Call.graphics(C_downviewport, name$name, strict) :
Viewport 'plot_01.panel.1.1.off.vp' was not found
Try installing the ellipse package. You do not need to load it as a library, only install it.
install.packages("ellipse")
I had the same issue. In my case it was caused by passing the splom argument col = mydataframe$somevariable, where somevariable was a categorical variable of strings. Specifying it as col = as.numeric(as.factor(mydataframe$somevariable)) fixed the issue. To anyone faced with this error message in the future: try removing optional arguments to identify what might be wrong.
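A tiny illustration of that recoding, with a hypothetical data frame (I have not confirmed that a string col= is the only way to trigger this particular viewport error):
library(lattice)

d <- data.frame(a = rnorm(50), b = rnorm(50), c = rnorm(50),
                grp = sample(c("low", "high"), 50, replace = TRUE))

# splom(d[, 1:3], col = d$grp)                        # string grouping variable
splom(d[, 1:3], col = as.numeric(as.factor(d$grp)))   # recoded to integers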
I am trying to reproduce an example from N.D. Lewis, Neural Networks for Time Series Forecasting with R. If I include the device argument, I get the error:
Error in mx.opt.sgd(...) :
unused argument (device = list(device = "cpu", device_id = 0, device_typeid = 1))
In addition: Warning message:
In mx.model.select.layout.train(X, y) :
Auto detect layout of input matrix, use rowmajor..
If I remove this parameter, I still get this warning:
Warning message:
In mx.model.select.layout.train(X, y) :
Auto detect layout of input matrix, use rowmajor..
The code is:
library(zoo)
library(quantmod)
library(mxnet)
# data
data("ecoli", package = "tscount")
data <- ecoli$cases
data <- as.zoo(ts(data, start = c(2001, 1), end = c(2013, 20), frequency = 52))
xorig <- do.call(cbind, lapply((1:4), function(x) as.zoo(Lag(data, k = x))))
xorig <- cbind(xorig, data)
xorig <- xorig[-(1:4), ]
# normalization
range_data <- function(x) {
(x - min(x))/(max(x) - min(x))
}
xnorm <- data.matrix(xorig)
xnorm <- range_data(xnorm)
# test/train
y <- xnorm[, 5]
x <- xnorm[, -5]
n_train <- 600
x_train <- x[(1:n_train), ]
y_train <- y[(1:n_train)]
x_test <- x[-(1:n_train), ]
y_test <- y[-(1:n_train)]
# mxnet:
mx.set.seed(2018)
model1 <- mx.mlp(x_train,
y_train,
hidden_node = c(10, 2),
out_node = 1,
activation = "sigmoid",
out_activation = "rmse",
num.round = 100,
array.batch.size = 20,
learning.rate = 0.07,
momentum = 0.9
#, device = mx.cpu()
)
pred1_train <- predict(model1, x_train, ctx = mx.cpu())
How can I fix this?
Regarding the second warning message: MXNet is trying to detect whether the input matrix is row- or column-major based on its shape: https://github.com/apache/incubator-mxnet/blob/424143ac47ab3a38ae8aedaeb3319379887de0bc/R-package/R/model.R#L329
For the unused argument device = mx.cpu(), should the argument name be corrected to ctx instead of device?
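If so, the adjusted call would look roughly like this (untested sketch; ctx is the context argument that mx.mlp forwards to the training routine):
model1 <- mx.mlp(x_train,
                 y_train,
                 hidden_node = c(10, 2),
                 out_node = 1,
                 activation = "sigmoid",
                 out_activation = "rmse",
                 num.round = 100,
                 array.batch.size = 20,
                 learning.rate = 0.07,
                 momentum = 0.9,
                 ctx = mx.cpu())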