Where to properly position ggplotly tooltip in ggplot? - r

When I add a text line to a geom_line plot, the line disappears.
library(tidyverse)
library("lubridate")
library(plotly)
library("RColorBrewer")
library(htmlwidgets)
library("reprex")
activity <- c("N", "FB", "N", "N", "N", "FA", "N", "FA", "N", "FA", "N", "N", "N", "N", "N", "FA", "N", "N", "N", "N", "FA", "N", "N", "FA", "FA")
activity_date <- as.Date(c(NA, "2022-04-19", "2022-05-01", "2022-05-01", "2022-05-06", "2022-05-06", "2022-05-07", "2022-05-07", "2022-05-09", "2022-05-09", "2022-05-10", "2022-05-13", "2022-05-14", "2022-05-14", "2022-05-14", "2022-05-15", "2022-05-15", "2022-05-15", "2022-05-15", "2022-05-15", "2022-05-16", "2022-05-16", "2022-05-16", "2022-05-16", "2022-05-16"))
fcrawl_cum <- c(0L, 1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 8L)
clutch_cum <- c(1L, 1L, 2L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 7L, 8L, 9L, 10L, 11L, 11L, 12L, 13L, 14L, 15L, 15L, 16L, 17L, 17L, 17L)
turtle_activity_gtm <- tibble(activity, activity_date, fcrawl_cum, clutch_cum)
the_pal <- RColorBrewer::brewer.pal(n = 8,"Dark2") #Set color palette.
myplot2 <-
ggplot() +
geom_line(data = turtle_activity_gtm,
aes(x=activity_date, y=fcrawl_cum,
text = paste("Date: ", as.Date(activity_date),
"<br>Total: ", fcrawl_cum)),
na.rm = TRUE,
linetype = "111111",
linewidth = 1.5, color = the_pal[6]) +
geom_line(data = turtle_activity_gtm,
aes(x=activity_date, y=clutch_cum),
na.rm = TRUE,
linewidth = 1.5,
color = the_pal[7]) +
labs(title = "myplot2")
myplot2
ggplotly(myplot2)
ggplotly(myplot2, tooltip = c("text"))
If I use ggplotly(myplot2), the line with the added text aesthetic is still missing. However, the data points for the missing line still appear. If I use ggplotly with the added tooltip, ggplotly(myplot2, tooltip = c("text")), the tooltip is missing for the line without the text aesthetic, while the tooltip for the other line is exactly as written in the text aesthetic.
I would show some of the plots, but I am not allowed to yet; my reputation is too low.
How can I do this properly so that both lines show with the added tooltip? I eventually want both lines to have their own text aesthetic. This is a very simplified chart. Once I get past this problem, I plan to add many more items to this chart using the full data set.
Thanks,
Jeff

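When adding the text aesthetic to geom_line you have to set the group aesthetic explicitly, e.g. group = 1, to tell ggplot2 that all observations belong to one group (which for simplicity I called 1). Otherwise every distinct text value is treated as its own group, and no line can be drawn through groups that contain a single point: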
library(tidyverse)
library(plotly)
myplot2 <-
ggplot() +
geom_line(
data = turtle_activity_gtm,
aes(
x = activity_date, y = fcrawl_cum, group = 1,
text = paste(
"Date: ", as.Date(activity_date),
"<br>Total: ", fcrawl_cum
)
),
na.rm = TRUE,
linetype = "111111",
linewidth = 1.5, color = the_pal[6]
) +
geom_line(
data = turtle_activity_gtm,
aes(x = activity_date, y = clutch_cum),
na.rm = TRUE,
linewidth = 1.5,
color = the_pal[7]
) +
labs(title = "myplot2")
#> Warning in geom_line(data = turtle_activity_gtm, aes(x = activity_date, :
#> Ignoring unknown aesthetics: text
ggplotly(myplot2, tooltip = c("text"))
EDIT: To the best of my knowledge there is only one text attribute, so to give the second line its own tooltip, specify text (together with group = 1) in the second geom_line the same way as in the first, and keep tooltip = c("text").
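As a rough sketch of that idea, the example below gives each geom_line its own text aesthetic together with group = 1. The small data frame `toy` and the tooltip wording are made up for illustration and only stand in for turtle_activity_gtm.

```r
library(ggplot2)
library(plotly)

# Made-up stand-in for turtle_activity_gtm (illustration only)
toy <- data.frame(
  activity_date = as.Date("2022-05-01") + 0:4,
  fcrawl_cum = c(1L, 2L, 2L, 3L, 4L),
  clutch_cum = c(1L, 1L, 2L, 3L, 3L)
)

p <- ggplot(toy, aes(x = activity_date)) +
  # group = 1 keeps the points connected even though text differs per row
  geom_line(aes(y = fcrawl_cum, group = 1,
                text = paste("Date: ", activity_date,
                             "<br>fcrawl_cum: ", fcrawl_cum))) +
  geom_line(aes(y = clutch_cum, group = 1,
                text = paste("Date: ", activity_date,
                             "<br>clutch_cum: ", clutch_cum)))

# ggplot2 warns about the unknown "text" aesthetic; ggplotly() still uses it
w <- ggplotly(p, tooltip = "text")
w
```

Each line should then show its own tooltip on hover; colors and line types can be set per layer as in the code above.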
A more ggplot2-like approach to create your chart would be to first reshape your data to long format. Doing so lets you create the plot with just one geom_line, but requires mapping the series name on both the color and the group aesthetics and setting your colors via scale_color_manual. Note that doing so will automatically add a legend to your plot:
turtle_activity_gtm_long <- turtle_activity_gtm %>%
tidyr::pivot_longer(c(fcrawl_cum, clutch_cum))
ggplot() +
geom_line(
data = turtle_activity_gtm_long,
aes(
x = activity_date, y = value,
color = name, group = name,
text = paste(
"Date: ", as.Date(activity_date),
"<br>Total: ", value
)
),
na.rm = TRUE,
linewidth = 1.5
) +
scale_color_manual(values = c(clutch_cum = the_pal[[7]], fcrawl_cum = the_pal[[6]])) +
labs(title = "myplot2")
ggplotly(tooltip = c("text"))
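To see what the reshaping step produces, here is a minimal sketch on a made-up two-row data frame (`toy` is hypothetical, not the asker's data). With its default arguments, pivot_longer stores the former column names in `name` and the values in `value`, which is why the plot code above maps those two columns.

```r
library(tidyr)

# Made-up stand-in for turtle_activity_gtm (illustration only)
toy <- data.frame(
  activity_date = as.Date("2022-05-01") + 0:1,
  fcrawl_cum = c(1L, 2L),
  clutch_cum = c(1L, 1L)
)

# Each cumulative column becomes rows: its name goes to `name`,
# its values go to `value`
toy_long <- pivot_longer(toy, c(fcrawl_cum, clutch_cum))
toy_long
```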

Related

Add Count Labels on Top of Barchart in base R

I have a barplot to which I'm looking to add count labels on top of each bars. Can someone tell me how to do that in base R and NOT using ggplot2?
Since I saw your last question I have a bit more detail than others.
Example data
df <- structure(list(ID = c(140L, 620L, 868L, 1120L, 2313L), DemAffl = c(10L,
4L, 5L, 10L, 11L), DemAge = c(76L, 49L, 70L, 65L, 68L), DemCluster = c(16L,
35L, 27L, 51L, 4L), DemClusterGroup = c("C", "D", "D", "F", "A"
), DemGender = c("U", "U", "F", "M", "F"), DemReg = c("Midlands",
"Midlands", "Midlands", "Midlands", "Midlands"), DemTVReg = c("Wales & West",
"Wales & West", "Wales & West", "Midlands", "Midlands"), PromClass = c("Gold",
"Gold", "Silver", "Tin", "Tin"), PromSpend = c(16000, 6000, 0.02,
0.01, 0.01), PromTime = c(4L, 5L, 8L, 7L, 8L), TargetBuy = c(0L,
0L, 1L, 1L, 0L), TargetAmt = c(0L, 0L, 1L, 1L, 0L)), row.names = c(NA,
5L), class = "data.frame")
To make the plot
counts <- table(df$TargetBuy)
Here you will need to extend the y-axis limit, because otherwise the top label won't show
b <- barplot(counts, main= "number of yes/no", xlab = "response", ylab = "number of occurrences", ylim=c(0,4))
To add the labels you need to add text() to the plot
text(x= b, y=counts,pos = 3, label = counts, cex = 0.8, col = "red")
so the full thing will look like this
counts <- table(df$TargetBuy)
b <- barplot(counts, main= "number of yes/no", xlab = "response", ylab = "number of occurrences", ylim=c(0,4))
text(x= b, y=counts,pos = 3, label = counts, cex = 0.8, col = "red")
This produces a bar plot with a count label on top of each bar. Note that I changed the y-axis limit to 4. If it were set to 3, the label 3 above the first bar would not show.
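Instead of hard-coding the upper limit, the headroom can be derived from the counts. This is only a sketch: the 20% margin is an arbitrary assumption, and `target_buy` simply repeats the TargetBuy values from the example data.

```r
# Same TargetBuy values as in the example data above
target_buy <- c(0L, 0L, 1L, 1L, 0L)
counts <- table(target_buy)

# Leave 20% headroom above the tallest bar (arbitrary choice)
y_top <- max(counts) * 1.2

b <- barplot(counts, main = "number of yes/no", xlab = "response",
             ylab = "number of occurrences", ylim = c(0, y_top))
# barplot() returns the bar midpoints, used here as x positions for the labels
text(x = b, y = counts, labels = counts, pos = 3, cex = 0.8, col = "red")
```

If the labels are still clipped on a small device, a larger margin factor may be needed.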

ggplot functions to replicate plots

I'm trying to replicate the theme of these graphs using ggplot. I searched online and found a few articles that discuss changing the colors of two variables in a scatterplot, so I tried the following:
d1<-read.csv("./data/games.csv")
p.1<-ggplot(d1, aes(x=cream_rating, y=charcoal_rating)) +
geom_point(aes(color = cream_rating))
p.1 + ggtitle("Rating of Cream vs Charcoal") +
xlab("rating of cream") + ylab("rating of charcoal")+ theme(plot.title = element_text(hjust = 0.5)) + scale_color_manual(
values=c("orange", "green"))
I get this error:
ERROR while rich displaying an object: Error: Continuous value supplied to discrete scale
Traceback:
1. FUN(X[[i]], ...)
2. tryCatch(withCallingHandlers({
. if (!mime %in% names(repr::mime2repr))
. stop("No repr_* for mimetype ", mime, " in repr::mime2repr")
. rpr <- repr::mime2repr[[mime]](obj)
. if (is.null(rpr))
. return(NULL)
. prepare_content(is.raw(rpr), rpr)
. }, error = error_handler), error = outer_handler)
3. tryCatchList(expr, classes, parentenv, handlers)
4. tryCatchOne(expr, names, parentenv, handlers[[1L]])
5. doTryCatch(return(expr), name, parentenv, handler)
6. withCallingHandlers({
. if (!mime %in% names(repr::mime2repr))
. stop("No repr_* for mimetype ", mime, " in repr::mime2repr")
. rpr <- repr::mime2repr[[mime]](obj)
. if (is.null(rpr))
. return(NULL)
. prepare_content(is.raw(rpr), rpr)
. }, error = error_handler)
7. repr::mime2repr[[mime]](obj)
8. repr_text.default(obj)
9. paste(capture.output(print(obj)), collapse = "\n")
10. capture.output(print(obj))
11. evalVis(expr)
12. withVisible(eval(expr, pf))
13. eval(expr, pf)
14. eval(expr, pf)
15. print(obj)
16. print.ggplot(obj)
17. ggplot_build(x)
18. ggplot_build.ggplot(x)
19. lapply(data, scales_train_df, scales = npscales)
20. FUN(X[[i]], ...)
21. lapply(scales$scales, function(scale) scale$train_df(df = df))
22. FUN(X[[i]], ...)
23. scale$train_df(df = df)
24. f(..., self = self)
25. self$train(df[[aesthetic]])
26. f(..., self = self)
27. self$range$train(x, drop = self$drop, na.rm = !self$na.translate)
28. f(..., self = self)
29. scales::train_discrete(x, self$range, drop = drop, na.rm = na.rm)
30. stop("Continuous value supplied to discrete scale", call. = FALSE)
I'm apparently using the wrong function. Which one should I use, and how do I get the cross lines in the middle?
structure(list(rated = c(FALSE, TRUE, TRUE, TRUE, TRUE, FALSE,
TRUE, FALSE, TRUE, TRUE), turns = c(13L, 16L, 61L, 61L, 95L,
5L, 33L, 9L, 66L, 119L), victory_status = structure(c(3L, 4L,
2L, 2L, 2L, 1L, 4L, 4L, 4L, 2L), .Label = c("draw", "mate", "outoftime",
"resign"), class = "factor"), winner = structure(c(2L, 1L, 2L,
2L, 2L, 3L, 2L, 1L, 1L, 2L), .Label = c("charcoal", "cream",
"draw"), class = "factor"), increment_code = structure(c(3L,
7L, 7L, 5L, 6L, 1L, 1L, 4L, 2L, 1L), .Label = c("10+0", "15+0",
"15+2", "15+30", "20+0", "30+3", "5+10"), class = "factor"),
cream_rating = c(1500L, 1322L, 1496L, 1439L, 1523L, 1250L,
1520L, 1413L, 1439L, 1381L), charcoal_rating = c(1191L, 1261L,
1500L, 1454L, 1469L, 1002L, 1423L, 2108L, 1392L, 1209L)), row.names = c(NA,
10L), class = "data.frame")
This is what I want to achieve:
I tried Stefan's suggestion (which was a great help) with some modifications:
d1<-read.csv("./data/games.csv")
ggplot(d1, aes(x=cream_rating, y=charcoal_rating)) +
# Map winner on color. Add some transparency in case of overplotting
geom_point(aes(color = winner), alpha = 0.2) +
# Add the cross: Add geom_point layers with one variable fixed at its mean
geom_point(aes(x = mean(cream_rating), color = winner), alpha = 0.2) +
geom_point(aes(y = mean(charcoal_rating), color = winner), alpha = 0.2) +
scale_shape_manual(values=c(16, 17)) +
# "draw"s should be dropped and removed from the title
scale_color_manual(values = c(cream = "seagreen4", charcoal = "chocolate3", draw = NA)) +
ggtitle("Rating of Cream vs Charcoal") +
xlab("rating of cream") + ylab("rating of charcoal") + theme_bw() + theme(plot.title = element_text(hjust = 0.5))
I want to filter out "draw" from the plot. Also, when I change the dot shapes to triangles and circles, they don't seem to change. In addition, I get this error:
Warning message:
“Removed 950 rows containing missing values (geom_point).”
Warning message:
“Removed 950 rows containing missing values (geom_point).”
Warning message:
“Removed 950 rows containing missing values (geom_point).”
One more thing I noticed: I get a double cross instead of one!
This is my output:
The issue is that you mapped a continuous variable (cream_rating) on a discrete color scale (scale_color_manual).
As the plots in your images show, there are only two colors, i.e. we need a discrete variable. As your data is about ratings, my guess is that to achieve the plots you have to map winner on color. One question remains: what about draws? In my code below I set the color for draws to NA, i.e. draws are dropped, but you can change that as you like.
From the image I also guess that some transparency was used to tackle overplotting. This can be achieved via the alpha argument, which I set to 0.6.
Concerning the cross appearing in your plot: it is hard to tell, but my guess is that the data was "replicated" twice, each time fixing one of the rating variables at its mean value. If this guess is correct, we can get the cross via two additional geom_point layers.
library(ggplot2)
d1 <- structure(list(rated = c(FALSE, TRUE, TRUE, TRUE, TRUE, FALSE,
TRUE, FALSE, TRUE, TRUE), turns = c(13L, 16L, 61L, 61L, 95L,
5L, 33L, 9L, 66L, 119L), victory_status = structure(c(3L, 4L,
2L, 2L, 2L, 1L, 4L, 4L, 4L, 2L), .Label = c("draw", "mate", "outoftime",
"resign"), class = "factor"), winner = structure(c(2L, 1L, 2L,
2L, 2L, 3L, 2L, 1L, 1L, 2L), .Label = c("charcoal", "cream",
"draw"), class = "factor"), increment_code = structure(c(3L,
7L, 7L, 5L, 6L, 1L, 1L, 4L, 2L, 1L), .Label = c("10+0", "15+0",
"15+2", "15+30", "20+0", "30+3", "5+10"), class = "factor"),
cream_rating = c(1500L, 1322L, 1496L, 1439L, 1523L, 1250L,
1520L, 1413L, 1439L, 1381L), charcoal_rating = c(1191L, 1261L,
1500L, 1454L, 1469L, 1002L, 1423L, 2108L, 1392L, 1209L)), row.names = c(NA,
10L), class = "data.frame")
ggplot(d1, aes(x=cream_rating, y=charcoal_rating)) +
# Map winner on color. Add some transparency in case of overplotting
geom_point(aes(color = winner), alpha = 0.6) +
# Just a guess to add the cross: Add geom_point layers with one variable fixed at its mean
geom_point(aes(x = mean(cream_rating), color = winner), alpha = 0.6) +
geom_point(aes(y = mean(charcoal_rating), color = winner), alpha = 0.6) +
# Should "draw"s be colored or dropped?
scale_color_manual(values = c(cream = "green", charcoal = "orange", draw = NA)) +
ggtitle("Rating of Cream vs Charcoal") +
xlab("rating of cream") + ylab("rating of charcoal")+ theme(plot.title = element_text(hjust = 0.5))
EDIT
The shapes don't show up because you did not map winner on the shape aesthetic.
The "errors" are simply warnings, which arise because we set the color for draws to NA; these are the rows which ggplot removes. To get rid of the draws, simply filter your dataset before plotting:
library(ggplot2)
library(dplyr)
d1 %>%
filter(winner != "draw") %>%
ggplot(aes(x=cream_rating, y=charcoal_rating, color = winner, shape = winner)) +
# Map winner on color. Add some transparency in case of overplotting
geom_point(alpha = 0.6, na.rm = TRUE) +
# Just a guess to add the cross: Add geom_point layers with one variable fixed at its mean
geom_point(aes(x = mean(cream_rating)), alpha = 0.6) +
geom_point(aes(y = mean(charcoal_rating)), alpha = 0.6) +
# Draws were filtered out above, so only two colors are needed
scale_color_manual(values = c(cream = "green", charcoal = "orange")) +
scale_shape_manual(values = c(cream = 16, charcoal = 17)) +
ggtitle("Rating of Cream vs Charcoal") +
xlab("rating of cream") + ylab("rating of charcoal")+ theme(plot.title = element_text(hjust = 0.5))
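The diagnosis above (a continuous variable mapped to a discrete scale) can be checked before plotting by inspecting the column types. The three-row data frame `d` below is made up for illustration; the droplevels() step is an optional tidy-up after filtering, not something the plot code above depends on.

```r
# Made-up three-row sample with the same column types as the games data
d <- data.frame(
  cream_rating = c(1500L, 1322L, 1250L),
  winner = factor(c("cream", "charcoal", "draw"))
)

is.numeric(d$cream_rating)  # continuous: not suitable for scale_color_manual()
is.factor(d$winner)         # discrete: suitable for scale_color_manual()

# Filtering rows keeps the unused "draw" level; droplevels() removes it
d_no_draw <- droplevels(d[d$winner != "draw", ])
levels(d_no_draw$winner)
```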

labeling each data point

I have data like this
df<-structure(list(X = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 10L,
9L, 11L, 12L, 8L), .Label = c("A", "B", "C", "D", "E", "F", "GG",
"IR", "MM", "TT", "TTA", "UD"), class = "factor"), X_t = c(3.7066,
3.6373, 3.2693, 2.5626, 2.4144, 2.2868, 2.1238, 1.8671, 1.7627,
1.4636, 1.4195, 1.0159), NEW = structure(c(8L, 7L, 9L, 1L, 2L,
3L, 4L, 5L, 6L, 10L, 11L, 12L), .Label = c("12-Jan", "14-Jan",
"16-Jan", "19-Jan", "25-Jan", "28-Jan", "4-Jan", "Feb-38", "Feb-48",
"Jan-39", "Jan-41", "Jan-66"), class = "factor")), class = "data.frame", row.names = c(NA,
-12L))
I am trying to put a label on each dot, but I get a warning.
Here is how I plot it:
ggplot(data=df)+
geom_point(aes(X_t, X,size =X_t,colour =X_t,label = NEW))
I also want to merge the two legends into one because they are redundant. If you have any tips, please let me know.
Use geom_text for text (e.g., labels):
ggplot(data=df, aes(X_t, X)) +
geom_point(aes(size = X_t, colour = X_t)) +
geom_text(aes(label = NEW), nudge_y = 0.5) +
guides(color = guide_legend(), size = guide_legend())
Aesthetics you specify in the ggplot() call will be inherited by subsequent layers (geoms). So by putting the x and y aesthetics in ggplot(), we don't have to specify them again.
As for the legend question, see this answer for details. To combine color and size legends we use guide_legend.
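One way to convince yourself that a layer inherits the aesthetics from ggplot() is to inspect its computed data with layer_data(). The sketch below uses a made-up data frame `toy`; the geom_text layer only declares label, yet its data also contains x and y.

```r
library(ggplot2)

# Made-up data (illustration only)
toy <- data.frame(x = c(1, 2, 3), y = c("A", "B", "C"), lab = c("a", "b", "c"))

p <- ggplot(toy, aes(x, y)) +
  geom_point() +
  geom_text(aes(label = lab), nudge_y = 0.3)

# Computed data of the second layer (geom_text): x and y were inherited
text_layer <- layer_data(p, 2)
names(text_layer)
```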

Bubble chart without axis with labels in R

I have the following data frame:
> dput(df)
structure(list(text = structure(c(9L, 10L, 1L, 7L, 5L, 12L, 1L,
11L, 5L, 8L, 2L, 13L, 2L, 5L, NA, 6L, 13L, 4L, NA, 5L, 4L, 3L
), .Label = c("add ", "change ", "clarify", "correct", "correct ",
"delete", "embed", "follow", "name ", "remove", "remove ", "specifiy ",
"update"), class = "factor"), ID = c(1052330L, 915045L, 931207L,
572099L, 926845L, 510057L, 927946L, 490640L, 928498L, 893872L,
956074L, 627059L, 508649L, 508657L, 1009304L, 493138L, 955579L,
144052L, 1011166L, 151059L, 930992L, 913074L)), .Names = c("text",
"ID"), class = "data.frame", row.names = c(NA, -22L))
I would like a bubble chart for my df in which each circle is labeled with a verb from the text column and with the number of IDs related to that verb. This is the code I have for the circles, but I don't know how to do the labeling:
library(packcircles)
library(ggplot2)
packing <- circleProgressiveLayout(df)
dat.gg <- circleLayoutVertices(packing)
ggplot(data = dat.gg) +
geom_polygon(aes(x, y, group = id, fill = factor(id)), colour = "black", show.legend = FALSE) +
scale_y_reverse() +
coord_equal()
Create a data.frame for your labels with the appropriate x and y coordinates and use geom_text:
library(packcircles)
library(ggplot2)
packing <- circleProgressiveLayout(df)
dat.gg <- circleLayoutVertices(packing)
cbind(df, packing) -> new_df
ggplot(data = dat.gg) +geom_polygon(aes(x, y, group = id, fill = factor(id)), colour = "black",show.legend = FALSE) +
scale_y_reverse() +coord_equal() +geom_text(data = new_df, aes(x, y,label = text))
To label each circle with both the text and the ID, you can do:
new_df$text2 <- paste0(new_df$text,"\n",new_df$ID)
ggplot(data = dat.gg) +geom_polygon(aes(x, y, group = id, fill = factor(id)), colour = "black",show.legend = FALSE) +
scale_y_reverse() +coord_equal() +geom_text(data = new_df, aes(x, y,label = text2))
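The question also asks for the number of IDs per verb, which the code above does not compute. A minimal base R sketch of that counting step follows; it assumes that trailing whitespace in the verbs ("add " vs. "add") should be ignored and that rows with a missing verb are dropped. The vector `verbs` is a made-up excerpt, not the full column.

```r
# Made-up excerpt of the text column, including trailing spaces and an NA
verbs <- c("name ", "remove", "add ", "add ", NA, "correct", "correct ")

# Strip surrounding whitespace so that "correct" and "correct " are one verb
verbs_clean <- trimws(verbs)

# table() drops NA by default; the result has columns `verb` and `Freq`
counts <- as.data.frame(table(verb = verbs_clean), stringsAsFactors = FALSE)
counts
```

The resulting counts could then serve as bubble sizes or be pasted into the labels, in the same way as text2 above.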

R line graphs, values outside plot area

I have 300 variables (columns) measured at 10 timepoints (rows). For each variable at any given timepoint I have values for temperatures A and F.
Attached is a sample of the data frame
structure(list(Timepoint = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L,
5L, 5L, 6L, 6L, 7L, 7L, 8L, 8L, 9L, 9L, 13L, 13L, 25L, 25L),
Temperature = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("A",
"F"), class = "factor"), Concentration.of.chylomicrons = c(1.29e-11,
1.25e-11, 1.02e-11, 1.1e-11, 1.08e-11, 1.3e-11, 1.28e-11,
1.26e-11, 1.06e-11, 1.32e-11, 8.85e-12, 1.21e-11, 8.83e-12,
1.08e-11, 1.35e-11, 1.12e-11, 8.99e-12, 1.08e-11, 9.55e-12,
1.04e-11, 0, 1.01e-11), Total.lipids = c(0.00268, 0.0026,
0.00208, 0.00225, 0.00222, 0.0027, 0.00268, 0.0026, 0.00219,
0.00273, 0.0018, 0.00247, 0.00179, 0.00221, 0.00276, 0.00229,
0.00182, 0.00222, 0.00195, 0.00212, 0, 0.00204), Phospholipids = c(0.000224,
0.000223, 0.000145, 0.00016, 0.000157, 0.000211, 0.00023,
0.000211, 0.000165, 0.000224, 0.000109, 0.00018, 0.000113,
0.000163, 0.000175, 0.000177, 0.000122, 0.000173, 0.000127,
0.000156, 0, 0.000138)), .Names = c("Timepoint", "Temperature",
"Concentration.of.chylomicrons", "Total.lipids", "Phospholipids"
), class = "data.frame", row.names = c(NA, -22L))
I would like to draw a line graph to show how each variable varies with time. On this line graph I would like both the A and F lines to be drawn. I have successfully managed to write the loop code for this.
# subset based on temperatures A and F
a_df <- subset(df, Temperature == "A")
f_df <- subset(df, Temperature == "F")
# loop from columns 3:x
for (i in 3:x) {
plot(a_df[, 1],
a_df[, i],
type = "l",
ylab = colnames(a_df[i]),
xlab = "Timepoint",
lwd = 2,
col = "blue")
lines(f_df[, 1],
f_df[, i],
type = "l",
lwd = 2,
col = "red")
legend("bottomleft",
col = c("blue", "red"),
legend = c("Temperature A", "Temperature F"),
lwd = 2,
y.intersp = 0.5,
bty = "n")
}
However, for certain variables some points fall outside the plot area (image attached below).
Please click on this link for image
How can I make sure that this loop produces graphs with all points visible? I'm sure there is a quick way to fix this. Can anyone help?
I have tried the following line, which was kindly suggested:
ylim = c(min(f_df[,-1] ,max(f_df[,-1]),
I get the following error message
for (i in 3:229) {
+ plot(a_df[, 1],
+ a_df[, i],
+ type = "b",
+ ylim = c(min(f_df[,-1] ,max(f_df[,-1]),
+ ylab = colnames(f_df[i]),
+ main = colnames(f_df[i]),
+ xlab = "Timepoint",
+ lwd = 2,
+ col = "red")
+ lines(f_df[, 1],
Error: unexpected symbol in:
" col = "red")
lines"
f_df[, i],
Error: unexpected ',' in " f_df[, i],"
type = "b",
Error: unexpected ',' in " type = "b","
lwd = 2,
Error: unexpected ',' in " lwd = 2,"
col = "blue")
Error: unexpected ')' in " col = "blue")"
legend("bottomleft",
+ col = c("red", "blue"),
+ legend = c("Ambient", "Fridge"),
+ lwd = 2,
+ y.intersp = 0.5,
+ bty = "n")
Error in strwidth(legend, units = "user", cex = cex, font = text.font) :
plot.new has not been called yet
}
Error: unexpected '}' in "}"
Lakmal
To recap in an answer: setting ylim solves the issue.
# loop from columns 3:x
for (i in 3:x) {
plot(a_df[, 1],
a_df[, i],
type = "l",
ylab = colnames(a_df[i]),
xlab = "Timepoint",
ylim = c(min(df[, -(1:2)]), max(df[, -(1:2)])), # exclude Timepoint and the Temperature factor
lwd = 2,
col = "blue")
...
This sets the same plot boundaries for every plot, which is better if you want to compare the plots, but has the downside that the plot area might be considerably larger than your data.
# loop from columns 3:x
for (i in 3:x) {
plot(a_df[, 1],
a_df[, i],
type = "l",
ylab = colnames(a_df[i]),
xlab = "Timepoint",
ylim = c(min(df[,i]) ,max(df[,i])),
lwd = 2,
col = "blue")
...
This sets new boundaries for each plot, which is worse for comparison but reduces unnecessary empty plot space. I've replaced min(a_df[, i], f_df[, i]) with min(df[, i]) since they should be identical.
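A compact variant of the per-plot limits uses range(), which returns the minimum and maximum in one call and can skip missing values. The sketch below runs on a made-up three-timepoint data frame `df_toy` with a single measured column; the column layout mirrors the sample data but the values are only illustrative.

```r
# Made-up data with the same layout as the sample (illustration only)
df_toy <- data.frame(
  Timepoint = rep(1:3, each = 2),
  Temperature = factor(rep(c("A", "F"), times = 3)),
  Total.lipids = c(0.00268, 0.0026, 0.00208, 0.00225, 0, 0.0027)
)
a_df <- subset(df_toy, Temperature == "A")
f_df <- subset(df_toy, Temperature == "F")

for (i in 3:ncol(df_toy)) {
  # range() gives c(min, max) over both series, ignoring missing values
  y_lim <- range(a_df[[i]], f_df[[i]], na.rm = TRUE)
  plot(a_df$Timepoint, a_df[[i]], type = "l", ylim = y_lim,
       xlab = "Timepoint", ylab = colnames(df_toy)[i], lwd = 2, col = "blue")
  lines(f_df$Timepoint, f_df[[i]], lwd = 2, col = "red")
}
```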

Resources