Show data and name when hovering over a point in a chart with ggplotly - r

I built everything in ggplot, and it was all working well. Now I need the chart to show data when I hover over a data point: in this example, the model (to identify the point) plus disp and wt (the data on the axes).
To do this I mapped shape to the model column (using the same shape for every model, since I do not actually want different shapes) and asked ggplot not to show shape in the legend. Then I converted the chart to plotly. I succeeded in showing the data when I hover over the circles, but now the legend shows colors and shapes separated by a comma...
I did not want to rebuild it from scratch in plotly, as I have no experience with plotly, and this is part of a much larger shiny project in which the chart automatically adjusts the axis scales and adds trend lines, among other things (not included here for simplicity) that I do not know how to do in plotly.
Many thanks in advance. I have tried a million ways over a couple of days now and did not succeed.
# choose mtcars data and add rowname as column as I want to link it to shapes in ggplot
data1 <- mtcars
data1$model <- rownames(mtcars)
# convert cyl to character, because charting it as numeric gave: Error: Continuous value supplied to discrete scale
data1$cyl <- as.character(data1$cyl)
# linking colors with cylinders and shapes with models
ccolor <- c("#E57373","purple","green")
cylin <- c(6,4,8)
# I actually do not want shapes to be different, only want to show data of model when I point the data point.
models <- data1$model
sshapes <- rep(16,length(models))
# I am going to chart, do not want legend to show shape
library(ggplot2)
graff <- ggplot(data1, aes(x = disp, y = wt, shape = model, col = cyl)) +
  geom_point(size = 1) +
  ylab("eje y") + xlab("eje x") +
  scale_color_manual(values = ccolor, breaks = cylin) +
  scale_shape_manual(values = sshapes, breaks = models) +
  guides(shape = "none") # do not want shapes to show in legend
graff
# chart is fine, but when converting to ggplotly, I am having trouble with the legend
library(plotly)
graffPP <- ggplotly(graff)
graffPP
The legend is not the same as it was in ggplot.
I succeeded in showing the model and the axis data when I hover over a data point in the chart, but now I am having problems with the legend.

To the best of my knowledge there is no easy out-of-the-box solution to achieve your desired result.
In pure plotly you could achieve it by assigning legend groups (legendgroup), which to my knowledge is not available through ggplotly. However, you can assign the legend groups manually by manipulating the plotly object returned by ggplotly.
Adapting my answer on this post to your case, you could achieve your desired result like so:
library(plotly)
p <- ggplot(data1, aes(x = disp, y = wt, shape = model, col = cyl)) +
  geom_point(size = 1) +
  ylab("eje y") +
  xlab("eje x") +
  scale_color_manual(values = ccolor, breaks = cylin) +
  scale_shape_manual(values = sshapes, breaks = models) +
  guides(shape = "none")
gp <- ggplotly(p = p)
# Get the names of the legend entries
df <- data.frame(id = seq_along(gp$x$data), legend_entries = unlist(lapply(gp$x$data, `[[`, "name")))
# Extract the group identifier, i.e. the number of cylinders from the legend entries
df$legend_group <- gsub("^\\((\\d+).*?\\)", "\\1", df$legend_entries)
# Add an indicator for the first entry per group
df$is_first <- !duplicated(df$legend_group)
for (i in df$id) {
  # Is the layer the first entry of the group?
  is_first <- df$is_first[[i]]
  # Assign the group identifier to the name and legendgroup arguments
  gp$x$data[[i]]$name <- df$legend_group[[i]]
  gp$x$data[[i]]$legendgroup <- gp$x$data[[i]]$name
  # Show the legend only for the first layer of the group
  if (!is_first) gp$x$data[[i]]$showlegend <- FALSE
}
gp
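If the only reason for mapping shape to model is the hover text, a lighter route may be to skip the shape mapping and pass the model through the unofficial text aesthetic, which ggplotly can show via its tooltip argument. With no shape mapping, the legend should only contain the cylinder groups. A minimal sketch, assuming the intended colour mapping is 6 = "#E57373", 4 = purple, 8 = green (the names ccolor_named, p_text and gp_text are mine, not from the question):

```r
library(ggplot2)
library(plotly)

data1 <- mtcars
data1$model <- rownames(mtcars)
data1$cyl <- as.character(data1$cyl)

# Assumption: this is the colour-to-cylinder mapping the question intends
ccolor_named <- c("6" = "#E57373", "4" = "purple", "8" = "green")

# `text` is not a ggplot2 aesthetic; it is only carried along for ggplotly()
p_text <- ggplot(data1, aes(x = disp, y = wt, col = cyl, text = model)) +
  geom_point(size = 1) +
  ylab("eje y") + xlab("eje x") +
  scale_color_manual(values = ccolor_named)

# Show the model name plus the x and y values on hover
gp_text <- ggplotly(p_text, tooltip = c("text", "x", "y"))
gp_text
```

Whether this fits depends on the rest of the shiny project; the legendgroup approach above keeps the original ggplot untouched.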

Related

How does one control the appearance (e.g. line size, line type, colour) of mqgam plots produced using plot.mgamViz from the "mgcViz" package?

I am using quantile regression in R with the qgam package and visualising them using the mgcViz package, but I am struggling to understand how to control the appearance of the plots. The package effectively turns gams (in my case mqgams) into ggplots.
Simple reprex:
egfit <- mqgam(data = iris,
Sepal.Length ~ s(Petal.Length),
qu = c(0.25,0.5,0.75))
plot.mgamViz(getViz(egfit))
I am able to control things that can be added, for example the axis labels and theme of the plot, but I'm struggling to affect things that would normally be addressed in the aes() or geom_x() functions.
How would I control the thickness of the line? If this were a normal geom_smooth() or geom_line() I'd simply put size = 1 inside of the geoms, but I cannot see how I'd do so here.
How can I control the linetype of these lines? The "id" is continuous and one cannot supply a linetype to a continuous scale. If this were a normal plot I would convert "id" to a character, but I can't see a way of doing so with the plot.mgamViz function.
How can I supply a new colour scale? It seems as though if I provide it with a new colour scale it invents new ID values to put on the legend that don't correlate to the actual "id" values, e.g.
plot.mgamViz(getViz(egfit)) + scale_colour_viridis_c()
I fully expect this to be relatively simple and that I'm missing something obvious, and I imagine the answers to all three of these subquestions are very similar to one another. Thanks in advance.
You need to extract your ggplot element using this:
p1 <- plot.mgamViz(getViz(egfit))
p <- p1$plots[[1]]$ggObj
Then convert id to a factor:
p$data$id <- as.factor(p$data$id)
Now you can play with ggplot elements as you prefer:
library(mgcViz)
egfit <- mqgam(data = iris,
Sepal.Length ~ s(Petal.Length),
qu = c(0.25,0.5,0.75))
p1 <- plot.mgamViz(getViz(egfit))
# Extract the ggplot object and convert id to a factor
p <- p1$plots[[1]]$ggObj
p$data$id <- as.factor(p$data$id)
# Changing ggplot attributes
p <- p +
  geom_line(linetype = 3, size = 1) +
  scale_color_brewer(palette = "Set1") +
  labs(x = "Petal Length", y = "s(Petal Length)", color = "My ID labels:") +
  theme_classic(14) +
  theme(legend.position = "bottom")
p
Here is the generated plot:
Hope it is useful!

adapt plot code to make a ggplot

I have the following data
[1] 0.09733344 0.17540020 0.14168188 0.54093074 0.78151039 0.28068527
[7] 1.96164429 0.33743328 0.05200734 0.09103039 0.28842044 0.09240131
[13] 0.09143535 0.38142022 0.11700952
from which I did Bayesian inference and made a plot with the following code
f_theta<-function(theta,Data){
(theta^length(Data) )*exp(-theta*sum(Data))}
theta<-seq(1,20,length=100)
a=b=0.001
plot(theta,dgamma(theta,a,b),type="l",col="red",
ylim=c(0,2),tck=-0.01,cex.lab=0.8,cex.axis=0.8)
lines(theta,dgamma(theta,length(Data)+a,sum(Data)+b),col="green",lty=1)
lines(theta,f_theta(theta,Data=Data),lty=1,col="blue")
legend('topright',legend=c("Prior","Post","Likelihood")
,col=c("red","green","blue"),lty=1,bty="n",cex=0.8)
But I've seen the following graph
which has code
# ggplot2 examples
library(ggplot2)
# create factors with value labels
mtcars$gear <- factor(mtcars$gear,levels=c(3,4,5),
labels=c("3gears","4gears","5gears"))
mtcars$am <- factor(mtcars$am,levels=c(0,1),
labels=c("Automatic","Manual"))
mtcars$cyl <- factor(mtcars$cyl,levels=c(4,6,8),
labels=c("4cyl","6cyl","8cyl"))
# Kernel density plots for mpg
# grouped by number of gears (indicated by color)
qplot(mpg, data=mtcars, geom="density", fill=gear, alpha=I(.5),
main="Distribution of Gas Milage", xlab="Miles Per Gallon",
ylab="Density")
but I'm not very familiar with the ggplot library and its graphs, and I would like some help adapting my code to make a graph similar to the last one.
ggplot() assumes that your data are in a particular format (sometimes called "long", but the author of ggplot() dislikes that description), so let's start by putting them into that format:
Data2 = data.frame(
theta = rep(theta, 3),
WhichDistribution = c(rep("Prior",length(theta)), rep("Post",length(theta)), rep("Likelihood",length(theta))),
Density = c(dgamma(theta,a,b), dgamma(theta,length(Data)+a,sum(Data)+b), f_theta(theta,Data=Data))
)
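For reference, the three curves stacked into Data2 are linked by conjugacy. The following is a sketch of that relationship, assuming the observations are modelled as i.i.d. Exponential(θ) (which is what f_theta corresponds to) and that dgamma is called with shape and rate, as in the question's code; the symbols n and x_i are my notation for length(Data) and the elements of Data:

```latex
\begin{align*}
L(\theta) &= \prod_{i=1}^{n} \theta e^{-\theta x_i}
           = \theta^{n} \exp\Big(-\theta \sum_{i=1}^{n} x_i\Big)
           && \text{(likelihood, \texttt{f\_theta})} \\
\pi(\theta) &\propto \theta^{a-1} e^{-b\theta}
           && \text{(prior, } \mathrm{Gamma}(a, b)\text{)} \\
\pi(\theta \mid x) &\propto L(\theta)\,\pi(\theta)
           = \theta^{n+a-1} \exp\Big(-\theta \Big(b + \sum_{i=1}^{n} x_i\Big)\Big)
           && \text{(posterior, } \mathrm{Gamma}\big(n+a,\ \textstyle\sum_i x_i + b\big)\text{)}
\end{align*}
```

Note that the likelihood is not normalised in θ, so its vertical scale is not directly comparable with the two densities when they share one axis.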
Then we can construct a ggplot() command. ggplot() needs data, aesthetics, and a geometry. Your data will be the data frame just constructed. The aesthetics refer generally to how the qualities of the data will impact the graph (what is on axes, what determines groups, etc.), and the geometry is the kind of plot (not a great wording, sorry).
ggplot(Data2, aes(x=theta, y=Density, group=WhichDistribution, color=WhichDistribution, fill=WhichDistribution))+
# position="identity" in order to not stack the densities
geom_area(alpha=.2, position="identity") +
# gets rid of the title on the legend
theme(legend.title = element_blank())+
# make the horizontal axis label pretty
scale_x_continuous(expression(theta))
You can change alpha to adjust transparency. If you want the horizontal axis to not go all the way to 20, change it in scale_x_continuous():
ggplot(Data2, aes(x=theta, y=Density, group=WhichDistribution, color=WhichDistribution, fill=WhichDistribution))+
# position="identity" in order to not stack the densities
geom_area(alpha=.2, position="identity") +
# gets rid of the title on the legend
theme(legend.title = element_blank())+
# make the horizontal axis label pretty
scale_x_continuous(expression(theta), limits=c(0,7))
qplot() is a quick plotting function that seems to mostly get in the way for people trying to learn the ggplot() language, so you might want to avoid it.

ggplot geom_histogram color by factor not working properly

When I color my stacked histogram according to a factor column, all the bars have a "green" roof. I want the top of each bar to be the same color as the bar itself. The figure below shows clearly what is wrong: every bar has a "green" horizontal line at the top.
Here is a dummy data set :
BodyLength <- rnorm(100, mean = 50, sd = 3)
vector <- c("80","10","5","5")
colors <- c("black","blue","red","green")
color <- rep(colors,vector)
data <- data.frame(BodyLength,color)
And the program I used to generate the plot below :
plot <- ggplot(data = data, aes(x=data$BodyLength, color = factor(data$color), fill=I("transparent")))
plot <- plot + geom_histogram()
plot <- plot + scale_colour_manual(values = c("Black","blue","red","green"))
Also, since the data column itself contains color names, is there any way to avoid specifying them again in scale_color_manual? Can ggplot identify them from the data itself? But I would really like help with the first problem right now... Thanks.
Here is a quick way to get your colors to scale_colour_manual without writing out a vector:
data <- data.frame(BodyLength,color)
data$color<- factor(data$color)
and then later,
scale_colour_manual(values = levels(data$color))
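Another option, assuming every value in the column is a valid R colour name, may be scale_colour_identity(), which uses the data values directly as colours. A small sketch with the question's dummy data (set.seed, the numeric times vector and the names p_identity and used_colours are my additions for reproducibility and illustration):

```r
library(ggplot2)

set.seed(1)
BodyLength <- rnorm(100, mean = 50, sd = 3)
colors <- c("black", "blue", "red", "green")
color <- rep(colors, times = c(80, 10, 5, 5))
data <- data.frame(BodyLength, color)

# The colour column is used as-is, so no manual scale is needed
p_identity <- ggplot(data, aes(x = BodyLength, colour = color)) +
  geom_histogram(fill = "transparent", position = "identity", bins = 30) +
  scale_colour_identity()

# Inspect which outline colours ggplot actually resolved
used_colours <- unique(ggplot_build(p_identity)$data[[1]]$colour)
p_identity
```

By default an identity scale draws no legend, which may or may not be what you want here.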
Now, with respect to your first problem, I don't know exactly why your bars have green roofs. However, you may want to look at some different options for the position argument in geom_histogram, such as
plot + geom_histogram(position="identity")
...or position="dodge". The identity option is closer to what you want, but since green is the last outline drawn, it overwrites the previous colors.
I like density plots better for these problems myself.
ggplot(data=data, aes(x=BodyLength, color=color)) + geom_density()
ggplot(data=data, aes(x=BodyLength, fill=color)) + geom_density(alpha=.3)

log-scaled density plot: ggplot2 and freqpoly, but with points instead of lines

What I really want to do is plot a histogram with the y-axis on a log scale. Obviously this is a problem with the ggplot2 geom_histogram, since the bottom of each bar is at zero, and the log of that gives you trouble.
My workaround is to use the freqpoly geom, and that more-or less does the job. The following code works just fine:
ggplot(zcoorddist) +
geom_freqpoly(aes(x=zcoord,y=..density..),binwidth = 0.001) +
scale_y_continuous(trans = 'log10')
The issue is that at the edges of my data, I get a couple of garish vertical lines that really throw you off visually when combining a bunch of these freqpoly curves in one plot. What I'd like to be able to do is use points at every vertex of the freqpoly curve, and no lines connecting them. Is there a way to do this easily?
The easiest way to get the desired plot is to just recast your data. Then you can use geom_point. Since you don't provide an example, I used the standard example for geom_histogram to show this:
# load packages
require(ggplot2)
require(reshape)
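# get data (since ggplot2 2.0.0 the movies data set lives in the ggplot2movies package)
data(movies, package = "ggplot2movies")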
movies <- movies[, c("title", "rating")]
# here's the equivalent of your plot
ggplot(movies) + geom_freqpoly(aes(x=rating, y=..density..), binwidth=.001) +
scale_y_continuous(trans = 'log10')
# recast the data
df1 <- recast(movies, value~., measure.var="rating")
names(df1) <- c("rating", "number")
# alternative way to recast data
df2 <- as.data.frame(table(movies$rating))
names(df2) <- c("rating", "number")
df2$rating <- as.numeric(as.character(df2$rating))
# plot
p <- ggplot(df1, aes(x=rating)) + scale_y_continuous(trans="log10", name="density")
# with lines
p + geom_linerange(aes(ymax=number, ymin=.9))
# only points
p + geom_point(aes(y=number))
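With a reasonably recent ggplot2 (3.3.0 or later, for after_stat()), it may also be possible to skip the recasting and let stat_bin feed geom_point directly, which gives one point per freqpoly vertex and no connecting lines. A hedged sketch using the built-in faithful data as a stand-in for zcoorddist (the data set, the binwidth and the names p_points and binned are placeholders of mine):

```r
library(ggplot2)

# Same binning as geom_freqpoly(), but drawn as points instead of a line
p_points <- ggplot(faithful, aes(x = eruptions)) +
  geom_point(aes(y = after_stat(density)), stat = "bin", binwidth = 0.25) +
  scale_y_continuous(trans = "log10")

# The binned values that end up on the plot, one row per bin
binned <- ggplot_build(p_points)$data[[1]]
p_points
```

Empty bins have density zero, which has no finite log, so ggplot2 drops them with a warning rather than drawing the vertical lines seen at the edges of the freqpoly.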

Plotting error while using ggplot faceting function in R

I am trying to compare my observed and modeled data sets for two stations. One station is called station "red" and the other is called "blue". I was able to create the facets, but when I tried to add two series in one facet, only one facet got updated while the other didn't.
This means for blue only one series is plotted and for red two series are plotted.
The code I used is as follows:
# install.packages("RCurl", dependencies = TRUE)
require(RCurl)
out <- postForm("https://dl.dropbox.com/s/ainioj2nn47sis4/watersurf1.csv?dl=1", format="csv")
watersurf <- read.csv(textConnection(out))
watersurf[1:100,]
watersurf$coupleid <- factor(rep(unlist(by(watersurf$id,watersurf$group1,
function(x) {ave(as.numeric(unique(x)),FUN=seq_along)}
)),each=6239))
p <- ggplot(data=watersurf,aes(x=time,y=data,group=id))+geom_line(aes(linetype=group1),size=1)+facet_wrap(~coupleid)
p
Is it also possible to add a third series in the graph but of unequal length (i.e not same interval)?
The output is
I followed the example on this page to create the graphs.
http://www.ats.ucla.edu/stat/r/faq/growth.htm
Is this what you are looking for?
ggplot(data = watersurf, aes(x = time, y = data)) +
  geom_line(aes(linetype = group1, colour = group1), size = 0.2) +
  facet_wrap(~ id)
