ggplot2 - labeling maxima and minima in a facet wrap - r

I want to track seven observations' performance on seven assessments using sparklines, so I thought I could just melt my data frame and then do a facet wrap by observation in ggplot. Now I really need to label the maximum and minimum in each facet. Is this possible in my current setup, or do I need to graph each facet separately and add indicators via annotate()? I'm sorry if this is a very rookie question. I am very new to R.
ggplot(test, aes(x = variable, y = value, group = 1)) +
  facet_wrap("student", nrow = 7) +
  geom_point() +
  geom_line() +
  mytheme

It is possible. You have to add the max to your data frame, or make a new one to hold them and then add it to the plot. Can't reproduce, so the following code may contain errors, but something like this:
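# One row per student holding that student's maximum value
maxd <- aggregate(test$value, list(student = test$student), max)
names(maxd)[length(maxd)] <- "maxvalue"
# Add the labels as an extra layer on top of the plot from the question
ggplot(test, aes(x = variable, y = value, group = 1)) +
  facet_wrap("student", nrow = 7) +
  geom_point() +
  geom_line() +
  geom_text(data = maxd, aes(label = maxvalue, x = X0, y = Y0))
# substitute X0, Y0 with your desired position of text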
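A fuller sketch of the same idea that labels both the maximum and the minimum in each facet. The data frame here is made up to stand in for the melted `test` data (the column names follow the question; the values and the `extremes` helper are my own assumptions), so treat it as a starting point rather than a drop-in solution:

```r
library(ggplot2)

# Hypothetical long-format data standing in for the melted `test` data frame
set.seed(1)
test <- data.frame(
  student  = rep(paste0("s", 1:3), each = 7),
  variable = factor(rep(paste0("a", 1:7), times = 3)),
  value    = round(runif(21, 50, 100))
)

# For each student, keep the first row holding the maximum and the first
# row holding the minimum (hypothetical helper data frame)
extremes <- do.call(rbind, lapply(split(test, test$student), function(d) {
  rbind(d[which.max(d$value), ], d[which.min(d$value), ])
}))

# Because `extremes` keeps the student column, each label lands in its own facet
p <- ggplot(test, aes(x = variable, y = value, group = 1)) +
  facet_wrap(~ student, ncol = 1) +
  geom_line() +
  geom_point() +
  geom_text(data = extremes, aes(label = value), vjust = -0.8)
```

Printing `p` draws the plot. Since the label rows are taken straight from the data, their x and y positions come for free and only the vertical nudge (`vjust`) needs tuning.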

Related

Plotting lines between two points in ggplot2

I'm looking for a way to represent a vector coming off of a point given angle and magnitude in ggplot. I've calculated what the endpoint of these vectors should be, but can't figure out a way to plot this properly in ggplot2. In short, given an observation with (X,Y,vec.x,vec.y), how can I plot a line from (X,Y) to (vec.x,vec.y) that does not show (vec.x,vec.y)?
My first instinct was to use geom_line, but this seems to rely on connecting different observations, so I would need to separate each observation into two observations, one with the original point and one with the vector endpoint. However, this seems fairly messy and like there should be a cleaner way to achieve this. Furthermore, this would make it complicated to show the original points but hide the vector points, as they would be plotted within the same geom_point call.
Here's a sample dataset in the form I'm talking about:
library(tidyverse)  # provides tibble(), %>% and ggplot2
test <- tibble(
  x = c(1,2,3,4,5),
  y = c(5,4,3,2,1),
  vec.x = c(1.5,2.5,3.5,4.5,5.5),
  vec.y = c(4,3,2,1,0)
)
test %>%
  ggplot() +
  geom_point(aes(x=x,y=y),color='red') +
  geom_point(aes(x=vec.x,y=vec.y),color='blue')
What I'm hoping to achieve is this, but without the blue dots:
Any thoughts? Apologies if this is a duplicated issue. I did some Googling and was unable to find a similar question for ggplot.
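# geom_segment() draws one segment per row from (x, y) to (xend, yend),
# so the endpoints do not need a geom_point() layer of their own.
test %>%
  ggplot() +
  geom_point(aes(x = x, y = y), color = 'red') +
  geom_segment(
    aes(x = x, y = y, xend = vec.x, yend = vec.y),
    arrow = arrow(length = unit(0.03, units = "npc")),
    linewidth = 1  # use size = 1 on ggplot2 versions before 3.4.0
  )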
Reference: https://ggplot2.tidyverse.org/reference/geom_segment.html
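If you are starting from an angle and a magnitude rather than precomputed endpoints, something like the following should work. This is only a sketch: the `angle` (in radians) and `magnitude` columns and the `vecs` data frame are hypothetical, since the question only shows the endpoints.

```r
library(ggplot2)

# Hypothetical inputs: start points plus an angle (radians) and a magnitude
vecs <- data.frame(
  x = c(1, 2, 3),
  y = c(3, 2, 1),
  angle = c(0, pi / 2, pi),
  magnitude = c(1, 0.5, 2)
)

# Endpoint of each vector from basic trigonometry
vecs$vec.x <- vecs$x + vecs$magnitude * cos(vecs$angle)
vecs$vec.y <- vecs$y + vecs$magnitude * sin(vecs$angle)

# Only the start points get a point layer; the endpoints are just arrow tips
p <- ggplot(vecs) +
  geom_point(aes(x = x, y = y), color = "red") +
  geom_segment(aes(x = x, y = y, xend = vec.x, yend = vec.y),
               arrow = arrow(length = unit(0.03, "npc")))
```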

How to plot density of points in one dimension with different factors in ggplot2

I am attempting to place individual points on a plot using ggplot2, however as there are many points, it is difficult to gauge how densely packed the points are. Here, there are two factors being compared against a continuous variable, and I want to change the color of the points to reflect how closely packed they are with their neighbors. I am using the geom_point function in ggplot2 to plot the points, but I don't know how to feed it the right information on color.
Here is the code I am using:
library(ggplot2)
s1 = rnorm(1000, 1, 10)
s2 = rnorm(1000, 1, 10)
# One task label per observation: 1000 rows for each of the two tasks
data = data.frame(task_number = as.factor(rep(1:2, each = 1000)),
                  S = c(s1, s2))
ggplot(data, aes(x = task_number, y = S)) + geom_point()
Which generates this plot:
However, I want it to look more like this image, but with one dimension rather than two (which I borrowed from this website: https://slowkow.com/notes/ggplot2-color-by-density/):
How do I change the colors of the first plot so it resembles that of the second plot?
I think the tricky thing about this is you want to show the original values, and evaluate the density at those values. I borrowed ideas from here to achieve that.
library(dplyr)
data = data %>%
  group_by(task_number) %>%
  # Use approxfun to interpolate the density back to
  # the original points
  mutate(dens = approxfun(density(S))(S))
ggplot(data, aes(x = task_number, y = S, colour = dens)) +
  geom_point() +
  scale_colour_viridis_c()
Result:
One could, of course, come up with a measure of proximity to neighbouring values for each value. However, wouldn't adjusting the transparency achieve basically the same goal (gauging how densely packed the points are)?
geom_point(alpha=0.03)
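As a complete example of the transparency approach, here is a minimal sketch with simulated data standing in for the question's data frame (the `sim` name and the seed are my own; the alpha value is the one suggested above and will likely need tuning for other sample sizes):

```r
library(ggplot2)

set.seed(42)
# Simulated stand-in: two tasks with 1000 draws each
sim <- data.frame(
  task_number = factor(rep(1:2, each = 1000)),
  S = rnorm(2000, mean = 1, sd = 10)
)

# Overlapping points stack their opacity, so dense regions look darker
p <- ggplot(sim, aes(x = task_number, y = S)) +
  geom_point(alpha = 0.03)
```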

R - Time series data with ggplot2

I have a time series dataset in which the x-axis is a list of events in reverse chronological order such that an observation will have an x value that looks like "n-1" or "n-2" all the way down to 1.
I'd like to make a line graph using ggplot that creates a smooth, continuous line that connects all of the points, but it seems when I try to input my data, the x-axis is extremely wonky.
The code I am currently using is
library(ggplot2)
theoretical = data.frame(PA = c("n-1", "n-2", "n-3"),
predictive_value = c(100, 99, 98));
p = ggplot(data=theoretical, aes(x=PA, y=predictive_value)) + geom_line();
p = p + scale_x_discrete(labels=paste("n-", 1:3, sep=""));
The fitted line and grid partitions that would normally appear using ggplot are replaced by no line and wayyy too many partitions.
When you use geom_line() with a factor on at least one axis, you need to specify a group aesthetic, in this case a constant.
p = ggplot(data=theoretical, aes(x=PA, y=predictive_value, group = 1)) + geom_line()
p = p + scale_x_discrete(labels=paste("n-", 1:3, sep=""))
p
If you want to get rid of the minor grid lines you can add
theme(panel.grid.minor = element_blank())
to your graph.
Note that it can be a little risky, scale-wise, to use factors on one axis like this. It may work better to use a typical continuous scale, and just relabel the points 1, 2, and 3 with "n-1", "n-2", and "n-3".
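A rough sketch of that continuous-scale alternative, assuming the events can be numbered 1 to 3 (the `step` column is a name I made up; the values are the ones from the question):

```r
library(ggplot2)

# Same values as in the question, but with a numeric position for each event
theoretical <- data.frame(
  step = 1:3,
  predictive_value = c(100, 99, 98)
)

# A numeric x needs no group aesthetic; only the axis labels are replaced
p <- ggplot(theoretical, aes(x = step, y = predictive_value)) +
  geom_line() +
  scale_x_continuous(breaks = theoretical$step,
                     labels = paste0("n-", theoretical$step)) +
  theme(panel.grid.minor = element_blank())
```

With a numeric x, the spacing between points reflects the actual distance between events, which a discrete scale would not preserve.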

ggplot2: how to overlay 2 plots when using stat_summary

I am totally new to R, so maybe the answer to this question is trivial, but I couldn't find any solution after searching the net for days.
I am using ggplot2 to create graphs containing the mean of my samples with the confidence interval in a ribbon (I can't post the pic, but something like this: S1).
I have a data frame (df) with time in the first column and the values of the variable measured in the other columns (each column is a replicate of the measurement).
I do the following:
mdf<-melt(df, id='time', variable_name="samples")
p <- ggplot(data=mdf, aes(x=time, y=value)) +
geom_point(size=1,colour="red")
stat_sum_df <- function(fun, geom="crossbar", ...) {
  # Pass any extra arguments through to stat_summary()
  stat_summary(fun.data=fun, geom=geom, colour="red", ...)
}
p + stat_sum_df("mean_cl_normal", geom = "smooth")
and I get the graph I have shown at the beginning.
My question is: if I have two different data frames, each one with a different variable, measured in the same sample at the same time, how can I plot the two graphs in the same plot? Everything I have tried ends up doing the statistics on both sets of data together or on just one of them, but not on each of them. Is it possible just to overlay the plots?
And a second small question: is it possible to change the colour of the ribbon?
Thanks!
something like this:
library(ggplot2)
a <- data.frame(x=rep(c(1,2,3,5,7,10,15,20), 5),
y=rnorm(40, sd=2) + rep(c(4,3.5,3,2.5,2,1.5,1,0.5), 5),
g = rep(c('a', 'b'), each = 20))
ggplot(a, aes(x=x,y=y, group = g, colour = g)) +
geom_point(aes(colour = g)) +
geom_smooth(aes(fill = g))
I'd suggest reading up on the basics of ggplot2. Check ?ggplot2 for help, as well as the help topics available here, particularly on how the group aesthetic may be manipulated.
You'll find the discussion group at Google Groups useful, and you may want to join it. Also, QuickR has a lot of examples of ggplot graphs, as does, obviously, Stack Overflow.
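For the two-data-frame case specifically, one option is to stack the data frames with an identifying column and map it to colour and fill, so that the summary is computed per data set. The sketch below uses made-up data (`mdf1`, `mdf2` and the `dataset` column are hypothetical names) and ggplot2's built-in mean_se (mean plus or minus one standard error) rather than mean_cl_normal, which needs the Hmisc package. It also assumes ggplot2 3.3.0 or later for the `fun` argument.

```r
library(ggplot2)

set.seed(1)
# Hypothetical long-format data frames, one per measured variable
mdf1 <- data.frame(time = rep(1:5, each = 4), value = rnorm(20, mean = 10))
mdf2 <- data.frame(time = rep(1:5, each = 4), value = rnorm(20, mean = 12))

# Tag each data frame, then stack them into one
mdf1$dataset <- "first"
mdf2$dataset <- "second"
both <- rbind(mdf1, mdf2)

# Mapping colour and fill to `dataset` splits the summary by data set
p <- ggplot(both, aes(x = time, y = value, colour = dataset, fill = dataset)) +
  geom_point(size = 1) +
  stat_summary(fun.data = mean_se, geom = "ribbon", alpha = 0.3, colour = NA) +
  stat_summary(fun = mean, geom = "line")
```

The ribbon colour follows the fill aesthetic, so adding something like scale_fill_manual(values = c("red", "blue")) would change it.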

How to make an overall boxplot alongside factors in R?

I am trying to create a boxplot that shows all of the factors of a variable, along with sample size, and at the end of the plot I also want an overall boxplot that combines all of the values into one. I am using the following code to do everything except making the overall plot:
library(ggplot2)
library(plyr)
xlabels <- ddply(extract8, .(Fuel), summarize, xlabels = paste(unique(Fuel), '\n(n = ', length(Fuel),')'))
ggplot(extract8, aes(x = Fuel, y = Exfiltration.Fraction.Percentage))+geom_boxplot()+
stat_boxplot(geom='errorbar', linetype=1) +
geom_boxplot(fill="pink") + geom_hline(yintercept = 0.4) +
scale_x_discrete(labels = xlabels[['xlabels']]) + ggtitle("Exfiltration Fraction (%) by Fuel Type")
Not sure on how to proceed regarding adding a boxplot that combines all of the factors into one.
This is certainly not the most elegant way to solve it, but it works:
Copy your dataset into a new object.
Within the new object, replace the content of the variable containing the factors with the label you would like, for instance, "Total".
Use rbind to attach the old and new objects together and assign the result to the new object.
In ggplot replace the old object by the new object.
I had the same issue, couldn't find an answer and proceeded this way.
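The steps above could look roughly like this. The data frame is a made-up stand-in for `extract8` (same column names as in the question, random values), and the final releveling, which puts "Total" at the end of the axis, is my own addition:

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for `extract8`
extract8 <- data.frame(
  Fuel = rep(c("Gas", "Oil", "Wood"), times = c(10, 15, 5)),
  Exfiltration.Fraction.Percentage = runif(30)
)

# Copy the data and overwrite the grouping variable with a single label
overall <- extract8
overall$Fuel <- "Total"

# Stack the original rows and the relabelled copy
combined <- rbind(extract8, overall)

# Order the levels so that "Total" is drawn last
fuel_levels <- c(sort(unique(as.character(extract8$Fuel))), "Total")
combined$Fuel <- factor(combined$Fuel, levels = fuel_levels)

p <- ggplot(combined, aes(x = Fuel, y = Exfiltration.Fraction.Percentage)) +
  geom_boxplot(fill = "pink")
```

Note that every observation now appears twice in `combined`, so any sample-size labels should be computed with that in mind.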
