I am trying to plot regression lines for my 3 y variables against my 1 x variable.
library(ggplot2)
library(tidyverse)
Sika_deer<-read.csv("C:/Users/Lau/Desktop/Sikadeer.csv", sep = ";",header = T)
Plot<-ggplot(Sika_deer, aes(x=Year)) +
geom_point(aes(y=Females, color="Females")) +
geom_point(aes(y=Young, color="Youngs")) +
geom_point(aes(y=Males, color="Males")
)+ facet_wrap(~District, scales = ("free_y"))+
labs(x = "Year", y = "Number of culled animals")
I tried using geom_smooth, but I keep receiving this error:
geom_smooth() using formula 'y ~ x'
Error: stat_smooth requires the following missing aesthetics: y
I am not sure what I am doing wrong here...
Thank you all for the attention and help!
P.S. Sorry if I made some mistakes posting my question; it's my first time asking for help on an online platform.
It would be best to reshape your data into long format to avoid the need for repeated calls to geom_point and geom_smooth, but the following should work for you:
library(ggplot2)
library(tidyverse)
Sika_deer <- read.csv("C:/Users/Lau/Desktop/Sikadeer.csv", sep = ";", header = TRUE)
Plot <- ggplot(Sika_deer, aes(x = Year)) +
geom_point(aes(y = Females, color = "Females")) +
geom_point(aes(y = Young, color = "Youngs")) +
geom_point(aes(y = Males, color = "Males")) +
geom_smooth(aes(y = Females, color = "Females"), se = FALSE) +
geom_smooth(aes(y = Young, color = "Youngs"), se = FALSE) +
geom_smooth(aes(y = Males, color = "Males"), se = FALSE) +
facet_wrap(~District, scales = ("free_y")) +
labs(x = "Year", y = "Number of culled animals")
If this does not work for you, please edit your question to include a sample of your data by typing dput(Sika_deer) into the console and pasting the result into your question.
I agree about transforming to long data before trying this plot, then you can pass the color variable into aes at the top and the subsequent layers will just inherit it. Since I don't have your data to confirm an answer, I'm showing an example with the iris dataset but it will be the same with yours.
library(tidyverse)
iris %>%
pivot_longer(-c(Species, Sepal.Length), names_to = "attribute") %>%
ggplot(aes(x = Sepal.Length, y = value, color = Species)) +
geom_point() +
geom_smooth() +
facet_wrap(facets = "attribute", scales = "free_y")
With your data I think you could try:
Sika_deer %>%
pivot_longer(-c(District, Year), names_to = "category") %>%
ggplot(aes(x = Year, y = value, color = category)) +
geom_point() +
geom_smooth() +
facet_wrap(facets = "District", scales = "free_y") +
labs(x = "Year", y = "Number of culled animals")
But if you share the output of dput(Sika_deer) in your question, we can be sure.
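As a rough sketch of the long-format approach on data that can actually be run: the data frame below is invented (random counts, two made-up districts) and only mimics the column names used in the question, so treat it as an assumption about what Sika_deer looks like. The original error presumably arose because y was mapped only inside each geom_point() call, leaving a bare geom_smooth() with no y to inherit. Note also that geom_smooth() defaults to a loess curve for small datasets; method = "lm" is what gives a straight regression line.

```r
library(ggplot2)
library(tidyr)

# Hypothetical stand-in for Sika_deer: the real data were not posted
set.seed(1)
deer <- data.frame(
  Year     = rep(2010:2019, times = 2),
  District = rep(c("A", "B"), each = 10),
  Females  = rpois(20, 30),
  Young    = rpois(20, 15),
  Males    = rpois(20, 25)
)

# One row per Year/District/category, so a single y column serves every layer
deer_long <- pivot_longer(deer, c(Females, Young, Males),
                          names_to = "category", values_to = "count")

# y and color are set once at the top and inherited by both layers
p <- ggplot(deer_long, aes(x = Year, y = count, color = category)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~District, scales = "free_y") +
  labs(x = "Year", y = "Number of culled animals")

print(p)
```

Naming the three columns explicitly in pivot_longer() avoids accidentally pivoting any other columns the real file might contain.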
I am plotting a distribution of two variables on a single histogram. I am interested in highlighting each distribution's mean value on that graph through a dotted line or something similar (but hopefully something that matches the color present already in the aes section of the code).
How would I do that?
This is my code so far.
hist_plot <- ggplot(data, aes(x= value, fill= type, color = type)) +
geom_histogram(position="identity", alpha=0.2) +
labs( x = "Value", y = "Count", fill = "Type", title = "Title") +
guides(color = FALSE)
Also, is there any way to show the count of n for each type on this graph?
I've made some reproducible code that might help you with your problem.
library(tidyverse)
# Generate some random data
df <- data.frame(value = c(runif(50, 0.5, 1), runif(50, 1, 1.5)),
type = c(rep("type1", 50), rep("type2", 50)))
# Calculate means from df
stats <- df %>% group_by(type) %>% summarise(mean = mean(value),
n = n())
# Make the ggplot
ggplot(df, aes(x= value, fill= type, color = type)) +
geom_histogram(position="identity", alpha=0.2) +
labs(x = "Value", y = "Count", fill = "Type", title = "Title") +
guides(color = "none") +
geom_vline(data = stats, aes(xintercept = mean, color = type), size = 2) +
geom_text(data = stats, aes(x = mean, y = max(df$value), label = n),
size = 10,
color = "black")
If things go as intended, you'll end up with a histogram in which each type's mean is marked by a vertical line in the matching color and labelled with its n.
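A possible variation, offered as a sketch rather than the answerer's method: the question asked for a dotted or dashed line, and the n label can be pinned to the top of the panel with y = Inf so it does not depend on the count scale. The data are the same kind of random example as above, and the label text "n = ..." is my own choice.

```r
library(ggplot2)
library(dplyr)

# Random example data, as in the answer above
set.seed(42)
df <- data.frame(value = c(runif(50, 0.5, 1), runif(50, 1, 1.5)),
                 type  = rep(c("type1", "type2"), each = 50))

# One row per type with its mean and sample size
stats <- df %>%
  group_by(type) %>%
  summarise(mean = mean(value), n = n())

p <- ggplot(df, aes(x = value, fill = type, color = type)) +
  geom_histogram(position = "identity", alpha = 0.2, bins = 30) +
  # Dashed mean line, colored by the same variable as the histogram
  geom_vline(data = stats, aes(xintercept = mean, color = type),
             linetype = "dashed") +
  # y = Inf places the label at the top edge; vjust nudges it inside the panel
  geom_text(data = stats,
            aes(x = mean, y = Inf, label = paste0("n = ", n)),
            vjust = 1.5, color = "black", show.legend = FALSE) +
  labs(x = "Value", y = "Count", fill = "Type", title = "Title") +
  guides(color = "none")

print(p)
```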
I am plotting 2 sets of data on the same plot using ggplot. I have specified the colour for each data set, but there is no legend that comes out when the dot plot is generated.
What can I do to manually add a legend?
# Create an index to hold values of m from 1 to 100
m_index <- (1:100)
data_frame_50 <- data.frame(prob_max_abs_cor_50)
data_frame_20 <- data.frame(prob_max_abs_cor_20)
library(ggplot2)
plot1 <- ggplot(data_frame_50, mapping = aes(x = m_index,
y = prob_max_abs_cor_50),
colour = 'red') +
geom_point() +
ggplot(data_frame_20, mapping = aes(x = m_index,
y = prob_max_abs_cor_20),
colour = 'blue') +
geom_point()
plot1 + labs(x = " Values of m ",
y = " Maximum Absolute Correlation ",
title = "Dot plot of probability")
First, I would suggest neatening your ggplot code a little. This is equivalent to what your posted code is trying to do:
ggplot() +
geom_point(data = data_frame_50, aes(x = m_index, y = prob_max_abs_cor_50),
colour = 'red') +
geom_point(data = data_frame_20, aes(x = m_index, y = prob_max_abs_cor_20),
colour = 'blue') +
labs(x = " Values of m ", y = " Maximum Absolute Correlation ",
title = "Dot plot of probability")
You won't get a legend here, because the colours are fixed values set outside aes() rather than mapped to a variable, and you are plotting separate datasets with only one category in each. You need a single dataset with a column grouping your data (i.e. 20 or 50). Using some example data, this is the equivalent of what you are plotting, and ggplot won't provide a legend:
ggplot() +
geom_point(data = iris, aes(x = Sepal.Length, y = Petal.Width), colour = 'red') +
geom_point(data = iris, aes(x = Sepal.Length, y = Petal.Length), colour = 'blue')
If you want to colour by category, include a colour argument inside the aes call;
ggplot() +
geom_point(data = iris, aes(x = Sepal.Length, y = Petal.Width,
colour = factor(Species)))
Have a look at the iris dataset to get a sense of how you need to shape your data. It's hard to give precise advice, because you haven't provided an idea of what your data look like, but something like this might work;
df.20 <- data.frame("m" = 1:100, "Group" = 20, "Numbers" = prob_max_abs_cor_20)
df.50 <- data.frame("m" = 1:100, "Group" = 50, "Numbers" = prob_max_abs_cor_50)
df.All <- rbind(df.20, df.50)
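To finish the thought, here is one way the combined data frame might be plotted so that a legend appears. The two prob_max_abs_cor vectors are random placeholders, since the real values were not posted, and the blue/red assignment simply mirrors the colours in the question.

```r
library(ggplot2)

# Placeholder vectors: the real prob_max_abs_cor_* values are unknown
set.seed(1)
prob_max_abs_cor_20 <- runif(100)
prob_max_abs_cor_50 <- runif(100)

df.20 <- data.frame(m = 1:100, Group = 20, Numbers = prob_max_abs_cor_20)
df.50 <- data.frame(m = 1:100, Group = 50, Numbers = prob_max_abs_cor_50)
df.All <- rbind(df.20, df.50)

# Mapping colour to the grouping column inside aes() is what creates the legend
p <- ggplot(df.All, aes(x = m, y = Numbers, colour = factor(Group))) +
  geom_point() +
  scale_colour_manual(values = c("20" = "blue", "50" = "red"), name = "Group") +
  labs(x = "Values of m", y = "Maximum Absolute Correlation",
       title = "Dot plot of probability")

print(p)
```

scale_colour_manual() is optional; without it ggplot picks default colours but still draws the legend.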
I am having issues trying to name a set of plots created with the facet_wrap feature. I am specifically trying to wrap the titles onto multiple lines. I have looked into this issue extensively within stack overflow and cannot find the error that I am generating. The code is below. a2$variable is a column of character strings (for grouping purposes), a2$ma_3 and a2$ma_12 are moving averages that I am trying to plot. The error that is generated is:
Error in as.integer(n) :
cannot coerce type 'closure' to vector of type 'integer'
p1=a2 %>%
ggplot(aes(x = date, color = variable)) +
geom_line(aes(y = ma_12), color = "aquamarine3", alpha = 0.5,size=.7) +
geom_line(aes(y = ma_3), color = "gray40", alpha = 0.5,size=.7) +
facet_wrap(~ variable, ncol = 3, scale = "free_y",label_wrap_gen(width=10))
Thanks in advance.
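You're close. To modify the facet_wrap labels, pass the labelling function to the labeller argument by name. In your call it was unnamed, so it was matched positionally to nrow, which is why R could not coerce a closure to an integer: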
library(tibble)
library(ggplot2)
mtcars %>%
rownames_to_column() %>%
head() %>%
ggplot(aes(x = mpg, color = cyl)) +
geom_point(aes(y = wt), color = "aquamarine3", alpha = 0.5,size=5) +
geom_point(aes(y = qsec), color = "gray40", alpha = 0.5,size=5) +
facet_wrap(~ rowname, ncol = 3, scale = "free_y",
labeller = label_wrap_gen(width = 10))
I'd suggest formatting the variable before you send it to ggplot, like this:
library(tidyverse)
mtcars %>%
rownames_to_column() %>%
head() %>%
mutate(carname = stringr::str_wrap(rowname, 10)) %>%
ggplot(aes(x = mpg, color = cyl)) +
geom_point(aes(y = wt), color = "aquamarine3", alpha = 0.5,size=5) +
geom_point(aes(y = qsec), color = "gray40", alpha = 0.5,size=5) +
facet_wrap(~ carname, ncol = 3, scale = "free_y")
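For anyone wondering what the pre-formatting step actually does to the labels, here is a small illustration on a few mtcars row names (the choice of names is arbitrary). str_wrap() inserts newline characters at word boundaries, and facet_wrap then prints each label on multiple lines.

```r
library(stringr)

# A few car names taken from the mtcars row names
labels <- c("Mazda RX4", "Hornet Sportabout", "Hornet 4 Drive")

# Wrap at a target width of 10 characters; single words are never split
wrapped <- str_wrap(labels, width = 10)

# cat() shows the line breaks as they would appear in the facet strips
cat(wrapped, sep = "\n---\n")
```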
I am using ggplot and geoms to show my data, but the plot sidebar area just shows a gray box with the x and y axis correctly labeled.
The code which made the plot:
ggplot(Wc, aes(y = popsafe, x = rnground)) +
geom_jitter(aes(col = me)) +
geom_smooth(method = "lm", se = FALSE, col = "black")
Looks like your dataset is empty. We don't know what your dataset contains, so here is an example with the built-in iris dataset. First a proper plot, using the same geoms and mappings you use:
library(ggplot2)
ggplot(iris, aes(y = Sepal.Length, x = Sepal.Width)) +
geom_jitter(aes(col = Species)) +
geom_smooth(method = "lm", se = FALSE, col = "black")
Now I remove all the data from the dataset and replot:
library(dplyr)
iris_empty <- filter(iris, Sepal.Length < 0)
ggplot(iris_empty, aes(y = Sepal.Length, x = Sepal.Width)) +
geom_jitter(aes(col = Species)) +
geom_smooth(method = "lm", se = FALSE, col = "black")
A simple head(Wc) would confirm whether your dataset actually contains any data.
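If you want a slightly more explicit check than eyeballing head(), something like the following might help. has_plottable_rows() is a hypothetical helper written for this answer, not part of ggplot2; it reports whether a data frame has at least one row with non-missing values in the columns you plan to plot.

```r
# Hypothetical helper: TRUE if at least one row is complete for the given columns
has_plottable_rows <- function(data, cols) {
  nrow(data) > 0 && any(stats::complete.cases(data[, cols, drop = FALSE]))
}

# Same empty subset as in the example above, built with base R
iris_empty <- iris[iris$Sepal.Length < 0, ]

nrow(iris)        # number of rows in the full dataset
nrow(iris_empty)  # number of rows after filtering everything out

has_plottable_rows(iris, c("Sepal.Length", "Sepal.Width"))
has_plottable_rows(iris_empty, c("Sepal.Length", "Sepal.Width"))
```

For your own data you would call it with Wc and the columns popsafe and rnground.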
I'm making a plot in which I have a 3x3 grid obtained from facet_wrap. Eight out of nine plots use geom_violin while the remaining plot is made using geom_bar. After finding some helpful answers here on the site, I got this all working. The problem that I have is that when I use fill = "white", color = "black" for my bar chart, it draws these lines inside the bars.
Here is some example code and figures.
library(tidyverse)
n <- 100
tib <- tibble(value = c(rnorm(n, mean = 100, sd = 10), rbinom(n, size = 1, prob = (1:4)/4)),
variable = rep(c("IQ", "Sex"), each = n),
year = factor(rep(2012:2015, n/2)))
ggplot(tib, aes(x = year, y = value)) +
facet_wrap(~variable, scales = "free_y") +
geom_violin(data = filter(tib, variable == "IQ")) +
geom_bar(data = filter(tib, variable == "Sex"), stat = "identity",
color = "black", fill = "white")
Now to my question: how do I get rid of these lines inside the bars? I just want it to be white with black borders. I've been experimenting a lot with various configurations, and I can manage to get rid of the lines but at the expense of screwing the facet up. I'm fairly certain it's got to do with the stat, but I'm at a loss trying to fix it. Any suggestions?
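I would suggest summarizing the data within the barplot. With stat = "identity", each row of the data is drawn as its own stacked rectangle with a black border, so the lines you see are the borders between rows; summing to one row per year leaves a single outlined bar: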
ggplot(tib, aes(x = year, y = value)) +
facet_wrap(~variable, scales = "free_y") +
geom_violin(data = filter(tib, variable == "IQ")) +
geom_bar(data = tib %>%
group_by(year,variable) %>%
summarise(value=sum(value)) %>%
filter(variable == "Sex"),
stat = "identity",
color = "black",
fill = "white")
I'm not sure this is a good way to represent the data, with the y-axes of the different panels representing very different things, but accept that your example might not match your actual use case. Making separate plots and then using gridExtra::grid.arrange, or cowplot::plot_grid is probably a better solution.
But if you want to do this:
ggplot(tib, aes(x = year, y = value)) +
facet_wrap(~variable, scales = "free_y") +
geom_violin(data = filter(tib, variable == "IQ")) +
geom_col(data = filter(tib, variable == "Sex") %>%
group_by(year, variable) %>%
summarise(value = sum(value)),
fill = "white", colour = "black")
This uses geom_col rather than geom_bar, so there is no need to set stat = "identity".
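To see why summarising removes the inner lines, it may help to count the rows that reach the bar layer before and after. This sketch rebuilds the example data from the question (with a seed added for reproducibility, which is my addition) and compares the two.

```r
library(dplyr)
library(ggplot2)

# Example data from the question; set.seed() is added here for reproducibility
set.seed(1)
n <- 100
tib <- tibble(value = c(rnorm(n, mean = 100, sd = 10),
                        rbinom(n, size = 1, prob = (1:4)/4)),
              variable = rep(c("IQ", "Sex"), each = n),
              year = factor(rep(2012:2015, n/2)))

# Unsummarised: every row becomes its own bordered rectangle in the stack
sex_rows <- filter(tib, variable == "Sex")
nrow(sex_rows)

# Summarised: one row, and therefore one outlined bar, per year
sex_totals <- sex_rows %>%
  group_by(year, variable) %>%
  summarise(value = sum(value), .groups = "drop")
nrow(sex_totals)

p <- ggplot(tib, aes(x = year, y = value)) +
  facet_wrap(~variable, scales = "free_y") +
  geom_violin(data = filter(tib, variable == "IQ")) +
  geom_col(data = sex_totals, fill = "white", colour = "black")

print(p)
```

Keeping the variable column in the summary matters, because facet_wrap needs it to place the bars in the right panel.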