How to stop ggplot line plot adding fill - r

I am producing a ggplot which looks at a curve in a dataset. When I build the plot, ggplot is automatically adding fill to data which is on the negative side of the x axis. Script and plot shown below.
ggplot(df, aes(x = Var1, y = Var2)) +
geom_line() +
geom_vline(xintercept = 0) +
geom_hline(yintercept = Var2[1])
Using base R, I am able to get the plot shown below which is how it should look.
plot(x = df$Var1, y = df$Var2, type = "l",
xlab = "Var1", ylab = "Var2")
abline(v = 0)
abline(h = df$Var2[1])
If anyone could help identify why I might be getting the automatic fill and how I could make it stop, I would be very appreciative. I would like to make this work in ggplot so I can later animate the line as it is a time series that can be used to compare between other datasets from the same source.
Can add data if necessary. Data set is 1561 obs long however. Thanks in advance.

I guess you should try
ggplot(df, aes(x = Var1, y = Var2)) +
geom_path() +
geom_vline(xintercept = 0) +
geom_hline(yintercept = Var2[1])
instead. The geom_line()-function connects the points in order of the variable on the x-axis.
Take a look at this example
dt <- data.frame(
x = c(seq(-pi/2,3*pi,0.001),seq(-pi/2,3*pi,0.001)),
y = c(sin(seq(-pi/2,3*pi,0.001)), cos(seq(-pi/2,3*pi,0.001)))
)
ggplot(dt, aes(x,y)) + geom_line()
The two points with x-coordinate -pi/2 will be connected first, creating a vertical black line. Next x = -pi/2 + 0.001 will be processed and so on. The x values will be processed in order.
Therefore you should use geom_path() to get the desired result
dt <- data.frame(
x = c(seq(-pi/2,3*pi,0.001),seq(-pi/2,3*pi,0.001)),
y = c(sin(seq(-pi/2,3*pi,0.001)), cos(seq(-pi/2,3*pi,0.001)))
)
ggplot(dt, aes(x,y)) + geom_path()

Related

ggplot2: how to add sample numbers to density plot?

I am trying to generate a (grouped) density plot labelled with sample sizes.
Sample data:
set.seed(100)
df <- data.frame(ab.class = c(rep("A", 200), rep("B", 200)),
val = c(rnorm(200, 0, 1), rnorm(200, 1, 1)))
The unlabelled density plot is generated and looks as follows:
ggplot(df, aes(x = val, group = ab.class)) +
geom_density(aes(fill = ab.class), alpha = 0.4)
What I want to do is add text labels somewhere near the peak of each density, showing the number of samples in each group. However, I cannot find the right combination of options to summarise the data in this way.
I tried to adapt the code suggested in this answer to a similar question on boxplots: https://stackoverflow.com/a/15720769/1836013
n_fun <- function(x){
return(data.frame(y = max(x), label = paste0("n = ",length(x))))
}
ggplot(df, aes(x = val, group = ab.class)) +
geom_density(aes(fill = ab.class), alpha = 0.4) +
stat_summary(geom = "text", fun.data = n_fun)
However, this fails with Error: stat_summary requires the following missing aesthetics: y.
I also tried adding y = ..density.. within aes() for each of the geom_density() and stat_summary() layers, and in the ggplot() object itself... none of which solved the problem.
I know this could be achieved by manually adding labels for each group, but I was hoping for a solution that generalises, and e.g. allows the label colour to be set via aes() to match the densities.
Where am I going wrong?
The y in the return of fun.data is not the aes. stat_summary complains that he cannot find y, which should be specificed in global settings at ggplot(df, aes(x = val, group = ab.class, y = or stat_summary(aes(y = if global setting of y is not available. The fun.data compute where to display point/text/... at each x based on y given in the data through aes. (I am not sure whether I have made this clear. Not a native English speaker).
Even if you have specified y through aes, you won't get desired results because stat_summary compute a y at each x.
However, you can add text to desired positions by geom_text or annotate:
# save the plot as p
p <- ggplot(df, aes(x = val, group = ab.class)) +
geom_density(aes(fill = ab.class), alpha = 0.4)
# build the data displayed on the plot.
p.data <- ggplot_build(p)$data[[1]]
# Note that column 'scaled' is used for plotting
# so we extract the max density row for each group
p.text <- lapply(split(p.data, f = p.data$group), function(df){
df[which.max(df$scaled), ]
})
p.text <- do.call(rbind, p.text) # we can also get p.text with dplyr.
# now add the text layer to the plot
p + annotate('text', x = p.text$x, y = p.text$y,
label = sprintf('n = %d', p.text$n), vjust = 0)

How to get a really periodic polar surface plot with ggplot

Sample data:
mydata="theta,rho,value
0,0.8400000,0.0000000
40,0.8400000,0.4938922
80,0.8400000,0.7581434
120,0.8400000,0.6675656
160,0.8400000,0.2616592
200,0.8400000,-0.2616592
240,0.8400000,-0.6675656
280,0.8400000,-0.7581434
320,0.8400000,-0.4938922
360,0.8400000,0.0000000
0,0.8577778,0.0000000
40,0.8577778,0.5152213
80,0.8577778,0.7908852
120,0.8577778,0.6963957
160,0.8577778,0.2729566
200,0.8577778,-0.2729566
240,0.8577778,-0.6963957
280,0.8577778,-0.7908852
320,0.8577778,-0.5152213
360,0.8577778,0.0000000
0,0.8755556,0.0000000
40,0.8755556,0.5367990
80,0.8755556,0.8240077
120,0.8755556,0.7255612
160,0.8755556,0.2843886
200,0.8755556,-0.2843886
240,0.8755556,-0.7255612
280,0.8755556,-0.8240077
320,0.8755556,-0.5367990
360,0.8755556,0.0000000
0,0.8933333,0.0000000
40,0.8933333,0.5588192
80,0.8933333,0.8578097
120,0.8933333,0.7553246
160,0.8933333,0.2960542
200,0.8933333,-0.2960542
240,0.8933333,-0.7553246
280,0.8933333,-0.8578097
320,0.8933333,-0.5588192
360,0.8933333,0.0000000
0,0.9111111,0.0000000
40,0.9111111,0.5812822
80,0.9111111,0.8922910
120,0.9111111,0.7856862
160,0.9111111,0.3079544
200,0.9111111,-0.3079544
240,0.9111111,-0.7856862
280,0.9111111,-0.8922910
320,0.9111111,-0.5812822
360,0.9111111,0.0000000
0,0.9288889,0.0000000
40,0.9288889,0.6041876
80,0.9288889,0.9274519
120,0.9288889,0.8166465
160,0.9288889,0.3200901
200,0.9288889,-0.3200901
240,0.9288889,-0.8166465
280,0.9288889,-0.9274519
320,0.9288889,-0.6041876
360,0.9288889,0.0000000
0,0.9466667,0.0000000
40,0.9466667,0.6275358
80,0.9466667,0.9632921
120,0.9466667,0.8482046
160,0.9466667,0.3324593
200,0.9466667,-0.3324593
240,0.9466667,-0.8482046
280,0.9466667,-0.9632921
320,0.9466667,-0.6275358
360,0.9466667,0.0000000
0,0.9644444,0.0000000
40,0.9644444,0.6512897
80,0.9644444,0.9997554
120,0.9644444,0.8803115
160,0.9644444,0.3450427
200,0.9644444,-0.3450427
240,0.9644444,-0.8803115
280,0.9644444,-0.9997554
320,0.9644444,-0.6512897
360,0.9644444,0.0000000
0,0.9822222,0.0000000
40,0.9822222,0.6751215
80,0.9822222,1.0363380
120,0.9822222,0.9125230
160,0.9822222,0.3576658
200,0.9822222,-0.3576658
240,0.9822222,-0.9125230
280,0.9822222,-1.0363380
320,0.9822222,-0.6751215
360,0.9822222,0.0000000
0,1.0000000,0.0000000
40,1.0000000,0.6989533
80,1.0000000,1.0729200
120,1.0000000,0.9447346
160,1.0000000,0.3702890
200,1.0000000,-0.3702890
240,1.0000000,-0.9447346
280,1.0000000,-1.0729200
320,1.0000000,-0.6989533
360,1.0000000,0.0000000"
read in a data frame:
foobar <- read.csv(text = mydata)
You can check (if you really want to!) that the data are periodic in the theta direction, i.e., for each given rho, the point at theta=0 and theta=360 are precisely the same. I would like to plot a nice polar surface plot, in other words an annulus colored according to value. I tried the following:
library(viridis) # just because I very much like viridis: if you don't want to install it, just comment this line and uncomment the scale_fill_distiller line
library(ggplot2)
p <- ggplot(data = foobar, aes(x = theta, y = rho, fill = value)) +
geom_tile() +
coord_polar(theta = "x") +
scale_x_continuous(breaks = seq(0, 360, by = 45), limits=c(0,360)) +
scale_y_continuous(limits = c(0, 1)) +
# scale_fill_distiller(palette = "Oranges")
scale_fill_viridis(option = "plasma")
I'm getting:
Yuck! Why the nasty hole in the annulus? If I generate a foobar data frame with more rows (more theta and rho values) the hole gets smaller. This isn't a viable solutione, both because computing data at more rho/theta values is costly and time-consuming, and both because even with 100x100=10^4 rows I still get a hole. Also, with a bigger dataframe, ggplot takes forever to render the plot: the combination of geom_tile and coord_polar is incredibly inefficient. Isn't there a way to get a nice-looking polar plot without unnecessarily wasting memory & CPU time?
Edit: all value of data for theta=360 were removed (repeat from the values of theta=0)
ggplot(data = foobar, aes(x = theta, y = rho, fill = value)) +
geom_tile() +
coord_polar(theta = "x",start=-pi/9) +
scale_y_continuous(limits = c(0, 1))+
scale_x_continuous(breaks = seq(0, 360, by = 45))
I just removed limits from scale_x_continuous
That gives me:

Adjusting axis in ggtern ternary plots

I am trying to generate a ternary plot using ggtern.
My data ranges from 0 - 1000 for x, y,and z variables. I wondered if it is possible to extend the axis length above 100 to represent my data.
#Nevrome is on the right path, your points will still be plotted as 'compositions', ie, concentrations sum to unity, but you can change the labels of the axes, to indicate a range from 0 to 1000.
library(ggtern)
set.seed(1)
df = data.frame(x = runif(10)*1000,
y = runif(10)*1000,
z = runif(10)*1000)
breaks = seq(0,1,by=0.2)
ggtern(data = df, aes(x, y, z)) +
geom_point() +
limit_tern(breaks=breaks,labels=1000*breaks)
I think there is no direct solution to do this with ggtern. But an easy workaround could look like this:
library(ggtern)
df = data.frame(x = runif(50)*1000,
y = runif(50)*1000,
z = runif(50)*1000,
Group = as.factor(round(runif(50,1,2))))
ggtern() +
geom_point(data = df, aes(x/10, y/10, z/10, color = Group)) +
labs(x="X", y="Y", z="Z", title="Title") +
scale_T_continuous(breaks = seq(0,1,0.2), labels = 1000*seq(0,1,0.2)) +
scale_L_continuous(breaks = seq(0,1,0.2), labels = 1000*seq(0,1,0.2)) +
scale_R_continuous(breaks = seq(0,1,0.2), labels = 1000*seq(0,1,0.2))

How to underline text in a plot title or label? (ggplot2)

Please pardon my ignorance if this is a simple question, but I can't seem to figure out how to underline any part of a plot title. I'm using ggplot2.
The best I could find was
annotate("segment") done by hand, and I have created a toy plot to illustrate its method.
df <- data.frame(x = 1:10, y = 1:10)
rngx <- 0.5 * range(df$x)[2] # store mid-point of plot based on x-axis value
rngy <- 0.5 * range(df$y)[2] # stores mid-point of y-axis for use in ggplot
ggplot(df, aes(x = x, y = y)) +
geom_point() +
ggtitle("Oh how I wish for ..." ) +
ggplot2::annotate("text", x = rngx, y = max(df$y) + 1, label = "underlining!", color = "red") +
# create underline:
ggplot2::annotate("segment", x = rngx-0.8, xend = rngx + 0.8, y= 10.1, yend=10.1)
uses bquote(underline() with base R
pertains to lines over and under nodes on a graph
uses plotmath and offers a workaround, but it didn't help
Try this:
ggplot(df, aes(x = x, y = y)) + geom_point() +
ggtitle(expression(paste("Oh how I wish for ", underline(underlining))))
Alternatively, as BondedDust points out in the comments, you can avoid the paste() call entirely, but watch out for the for:
ggplot(df, aes(x = x, y = y)) + geom_point() +
ggtitle(expression(Oh~how~I~wish~'for'~underline(underlining)))
Or another, even shorter approach suggested by baptiste that doesn't use expression, paste(), or the many tildes:
ggplot(df, aes(x = x, y = y)) + geom_point() +
ggtitle(~"Oh how I wish for "*underline(underlining))

R: prevent break in line showing time series data using ggplot geom_line

Using ggplot2 I want to draw a line that changes colour after a certain date. I expected this to be be simple, but I get a break in the line at the point the colour changes. Initially I thought this was a problem with group (as per this question; this other question also looked relevant but wasn't quite what I needed). Having messed around with the group aesthetic for 30 minutes I can't fix it so if anybody can point out the obvious mistake...
Code:
require(ggplot2)
set.seed(1111)
mydf <- data.frame(mydate = seq(as.Date('2013-01-01'), by = 'day', length.out = 10),
y = runif(10, 100, 200))
mydf$cond <- ifelse(mydf$mydate > '2013-01-05', "red", "blue")
ggplot(mydf, aes(x = mydate, y = y, colour = cond)) +
geom_line() +
scale_colour_identity(mydf$cond) +
theme()
If you set group=1, then 1 will be used as the group value for all data points, and the line will join up.
ggplot(mydf, aes(x = mydate, y = y, colour = cond, group=1)) +
geom_line() +
scale_colour_identity(mydf$cond) +
theme()

Resources