For a Difference in Difference method, I have come up with the following data:
df <- structure(list(Class = structure(c(1L, 1L, 2L, 2L), levels = c("PovCon",
"PovDeCon"), class = "factor"), After_2015 = structure(c(1L,
2L, 1L, 2L), levels = c("Before 2015", "After 2015"), class = "factor"),
mean_VLP = c(16.5314094033954, 25.3785125225305, 22.4646340695607,
19.5147929056452), se_duration = c(3.72103200892531, 8.17273164333138,
4.03966402631034, 2.56248212580638), upper = c(23.824632140889,
41.39706654346, 30.382375561129, 24.5372578722257), lower = c(9.23818666590181,
9.35995850160102, 14.5468925779924, 14.4923279390647)), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -4L), groups = structure(list(
Class = structure(1:2, levels = c("PovCon", "PovDeCon"), class = "factor"),
.rows = structure(list(1:2, 3:4), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -2L), .drop = TRUE))
For the graphical presentation, I used the following code:
ggplot(df, aes(x = After_2015,
y = mean_VLP,
color = Class)) +
geom_pointrange(aes(ymin = lower, ymax = upper), size = 1) +
geom_line(aes(group = Class))
Now, as per the requirement of a journal, I would need to have everything in black and white, no color!
Hence, I would ideally like two different shapes for the two classes, and different linetypes connecting the two data points of each class.
I used the following code to change the lines:
ggplot(df, aes(x = After_2015,
y = mean_VLP,
shape = Class,
linetype = Class)) +
geom_pointrange(aes(ymin = lower, ymax = upper), size = 1) +
geom_line(aes(group = Class))
How do I change the shapes to upward and downward triangles?
Please help, and thank you for your time.
To get different shapes, map Class to the shape aesthetic:
library(ggplot2)
base <- ggplot(df, aes(
x = After_2015,
y = mean_VLP,
linetype = Class
)) +
geom_pointrange(aes(ymin = lower, ymax = upper, shape = Class), size = 1) +
geom_line(aes(group = Class))
base
UPDATE: To get upward and downward triangles, you could use a manual scale with Unicode characters as the shapes (https://www.compart.com/en/unicode/block/U+25A0):
base + scale_shape_manual(values = c("\u25B2", "\u25BC"))
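If the Unicode glyphs do not render in the fonts the journal uses, the built-in fillable point shapes may be an alternative. This is a minimal sketch, assuming rounded copies of the values from the question and that solid black triangles are acceptable. Shapes 24 and 25 are ggplot2's upward and downward fillable triangles; the name p_bw is only illustrative.

```r
library(ggplot2)

# Rounded copy of the summary values from the question (grouping dropped)
df <- data.frame(
  Class = factor(rep(c("PovCon", "PovDeCon"), each = 2)),
  After_2015 = factor(rep(c("Before 2015", "After 2015"), times = 2),
                      levels = c("Before 2015", "After 2015")),
  mean_VLP = c(16.53, 25.38, 22.46, 19.51),
  lower = c(9.24, 9.36, 14.55, 14.49),
  upper = c(23.82, 41.40, 30.38, 24.54)
)

p_bw <- ggplot(df, aes(x = After_2015, y = mean_VLP, linetype = Class)) +
  # fill = "black" makes the fillable triangles solid
  geom_pointrange(aes(ymin = lower, ymax = upper, shape = Class),
                  fill = "black", size = 1) +
  geom_line(aes(group = Class)) +
  # 24 = triangle pointing up, 25 = triangle pointing down
  scale_shape_manual(values = c(24, 25)) +
  theme_bw()
p_bw
```

Because the two ranges overlap at each x position, adding position = position_dodge(width = 0.2) to both layers might make the plot easier to read.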
I am plotting a linear mixed effects model using ggplot2 in R. I keep receiving this error with regards to including the mean rating per trial.
Error: Continuous value supplied to discrete scale
I have localized the problem to geom_point as geom_line and geom_ribbon work just fine. Here is the code I am currently using
p2 <- ggplot(td_mean_pref_plot_groups, aes(x, td_mean_pref_plot_groups$predicted, col = group)) +
geom_line(size=1.5) +
scale_color_manual(values = c("blue","red")) +
geom_ribbon(aes(ymin=conf.low,ymax=conf.high, fill=group),alpha=.2,colour=NA) +
scale_fill_manual(values = c("blue","red")) +
geom_point(data=summStats_td_mean,aes(trial,mean, col = condition),size=2) +
scale_color_manual(values = c("blue","red")) +
theme_bw() +
xlab('Trial') +
ylab('Prediction Error') +
ylim(1,2.2) +
ggtitle('TD learning about TD vs. TD \n learning about ASD') +
theme(text=element_text(size=20),
plot.title = element_text(hjust = 0.5),
panel.border = element_blank())
p2
geom_point reads the data below
structure(list(condition = c(1L, 1L, 1L, 1L, 1L, 1L), trial = 1:6,
n = c(80L, 93L, 92L, 94L, 94L, 94L), mean = c(1.225, 1.39784946236559,
1.25, 1.40425531914894, 1.24468085106383, 1.29787234042553
), sd = c(1.01849976541573, 1.08487411558084, 1.00137268424261,
1.11025666834073, 1.00199983058361, 1.09573746202196)), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -6L), groups = structure(list(
condition = 1L, .rows = structure(list(1:6), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L), .drop = TRUE))
As you can see, the options for condition are 1 or 2, but R seems to be reading those as continuous. I have used this code many times before and never run into this issue, so I'm not sure why it's suddenly acting up. Thank you!
As mentioned in the comments, condition in your data is an integer, so the scale is continuous. You can easily change this by calling as.factor() on condition:
library(ggplot2)
p1 = ggplot()+
geom_point(data= df, aes(trial, mean, col = condition))+
labs(subtitle = "Continuous scale")
p2 = ggplot()+
geom_point(data= df, aes(trial, mean, col = as.factor(condition)))+
scale_color_manual(values = c("blue","red"))+
labs(subtitle = "Discrete scale")
library(ggpubr)
ggarrange(p1,p2)
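The conversion can also be done once in the data, so every layer that uses it gets a discrete scale. A minimal sketch, assuming the first three rows of the summary data from the question (means rounded); the names summ and p_disc are illustrative only.

```r
library(ggplot2)

# First three rows of the summary data from the question (means rounded)
summ <- data.frame(
  condition = c(1L, 1L, 1L),
  trial = 1:3,
  mean = c(1.225, 1.398, 1.250)
)

# Convert once, declaring both levels even though only condition 1 appears here
summ$condition <- factor(summ$condition, levels = 1:2)

p_disc <- ggplot(summ, aes(trial, mean, colour = condition)) +
  geom_point(size = 2) +
  # Named values tie each colour to a level regardless of level order
  scale_colour_manual(values = c("1" = "blue", "2" = "red"))
p_disc
```

Naming the colour values is a design choice: it keeps the mapping stable if a level is missing from a subset of the data.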
I am trying to display all month values on my x-axis, which is formatted as a "yearmon" variable.
My data are structured as follows:
Print data example
dput(collective_action_monthly[1:4, ])
output:
structure(list(collective_action = structure(c(2L, 2L, 2L, 2L
), .Label = c("0", "1"), class = "factor"), treatment_details = c("pre",
"pre", "pre", "pre"), month_year = structure(c(2011.41666666667,
2011.75, 2011.83333333333, 2011.91666666667), class = "yearmon"),
n = c(22L, 55L, 15L, 207L), collective_action_percentage = c(0.0124223602484472,
0.031055900621118, 0.00846979107848673, 0.116883116883117
), am = structure(c(2L, 2L, 2L, 2L), .Label = c("post", "pre"
), class = "factor")), class = c("grouped_df", "tbl_df",
"tbl", "data.frame"), row.names = c(NA, -4L), groups = structure(list(
treatment_details = "pre", .rows = structure(list(1:4), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -1L), .drop = TRUE))
This is my code to visualize the trend using bar graphs by month:
ggplot(data = collective_action_monthly, aes(x = month_year, y = collective_action_percentage)) +
geom_bar(stat = "identity", position=position_dodge()) +
scale_fill_grey() +
ylab("percentage") +
theme(text=element_text(size=10)) +
theme(plot.title = element_text(size = 10, face = "bold")) +
scale_y_continuous(labels = percent_format(accuracy = 1)) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5)) +
theme_bw()
which produces:
However, rather than only showing three months in the x-axis, I would like to show all months. I also tried adding "scale_x_continuous(labels = 0:14, breaks = 0:14) " to the code above, but it still does not display months:
Ideally, I would like to produce a graph as the one below, but with months instead of years.
The zoo package includes scale_x_yearmon, so you can do:
library(zoo)
library(ggplot2)
ggplot(data = collective_action_monthly,
aes(x = month_year, y = collective_action_percentage)) +
geom_col(position = position_dodge(preserve = "single")) +
scale_y_continuous(labels = scales::percent_format(accuracy = 1),
name = "percentage") +
scale_x_yearmon(breaks = seq(2011.25, 2012, 1/12),
limits = c(2011.25, 2012)) +
theme_bw(base_size = 10) +
theme(plot.title = element_text(size = 10, face = "bold"),
axis.text.x = element_text(angle = 90, vjust = 0.5))
ggplot doesn't have a yearmon scale built in--looks like the zoo package does, but it doesn't have a convenient way to specify "breaks every month"--so I would suggest converting to Date class and using scale_x_date. I've deleted most of your theme stuff to make the changes I've made more obvious (the theming didn't seem relevant to the issue).
ggplot(data = collective_action_monthly, aes(x = as.Date(month_year), y = collective_action_percentage)) +
geom_bar(stat = "identity", position=position_dodge()) +
scale_fill_grey() +
scale_x_date(date_breaks = "1 month", date_labels = "%b %Y") +
ylab("percentage") +
theme_bw()
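To see what the as.Date() conversion does before plotting, it can be checked on its own. A small sketch, assuming the four yearmon values from the question's sample data and that the zoo package is loaded (it supplies the as.Date method for yearmon); ym and d are illustrative names.

```r
library(zoo)

# The four yearmon values from the question's sample data
ym <- as.yearmon(c(2011.41666666667, 2011.75,
                   2011.83333333333, 2011.91666666667))

# as.Date() on a yearmon returns the first day of that month by default
d <- as.Date(ym)
d

# The same format string that is passed to scale_x_date(date_labels = ...)
format(d, "%b %Y")
```

Each month becomes a single Date, which is why scale_x_date(date_breaks = "1 month") can then place one tick per month.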
This is probably an easy question for ggplot2 experts: I want to use my own colors rather than the default colors. How to achieve that?
Here's a snippet of the data:
df <- structure(list(start = c(0, 251, 1976, 5127, 5717, 6783), end = c(251,
1976, 5127, 5717, 6783, 6830), minute = c(0L, 0L, 0L, 0L, 0L,
0L), AOI = c("*", "A", "*", "*", "A", "*"), AOI_col = c("blue",
"red", "blue", "blue", "red", "blue")), row.names = c(NA, -6L
), groups = structure(list(minute = 0L, .rows = structure(list(
1:6), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr",
"list"))), row.names = 1L, class = c("tbl_df", "tbl", "data.frame"
), .drop = TRUE), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
))
The colors I wish to plot are in column AOI_col. Here's the code so far:
library(ggplot2)
ggplot(df5, aes(x = start, xend = end,
y = minute + scale(as.numeric(as.factor(AOI)))/10,
yend = minute + scale(as.numeric(as.factor(AOI)))/10,
color = AOI)) +
geom_segment(size = 2) +
scale_y_reverse(breaks = 0:53, labels = paste0(0:53, "min"), name = NULL) +
labs(title = "Gaze activity Speaker C F01") +
theme(axis.title.x.bottom = element_blank())
I've tried using aes(fill = AOI_col) and scale_color_manual(values = AOI_col) but to no avail. Help is appreciated!
You can add a color adjustment in the geom_segment():
ggplot(df, aes(x = start, xend = end,
y = minute + scale(as.numeric(as.factor(AOI)))/10,
yend = minute + scale(as.numeric(as.factor(AOI)))/10)) +
geom_segment(size = 2, color = df$AOI_col) +
scale_y_reverse(breaks = 0:53, labels = paste0(0:53, "min"), name = NULL) +
labs(title = "Gaze activity Speaker C F01") +
theme(axis.title.x.bottom = element_blank())
Or again the complete example with the answer from @user438383:
df <-
structure(
list(
start = c(0, 251, 1976, 5127, 5717, 6783),
end = c(251,
1976, 5127, 5717, 6783, 6830),
minute = c(0L, 0L, 0L, 0L, 0L,
0L),
AOI = c("*", "A", "*", "*", "A", "*"),
AOI_col = c("blue",
"red", "blue", "blue", "red", "blue")
),
row.names = c(NA,-6L),
groups = structure(
list(
minute = 0L,
.rows = structure(
list(1:6),
ptype = integer(0),
class = c("vctrs_list_of", "vctrs_vctr",
"list")
)
),
row.names = 1L,
class = c("tbl_df", "tbl", "data.frame"),
.drop = TRUE
),
class = c("grouped_df", "tbl_df", "tbl", "data.frame")
)
library(ggplot2)
ggplot(df,
aes(
x = start,
xend = end,
y = minute + scale(as.numeric(as.factor(AOI))) / 10,
yend = minute + scale(as.numeric(as.factor(AOI))) / 10,
color = AOI
)) +
geom_segment(size = 2) +
scale_y_reverse(breaks = 0:53,
labels = paste0(0:53, "min"),
name = NULL) +
labs(title = "Gaze activity Speaker C F01") +
theme(axis.title.x.bottom = element_blank()) +
scale_colour_manual(values = unique(df$AOI_col))
I think the ideal thing to do is make AOI_col a factor and sort it alphabetically, then assign the colours to be the unique values of that column:
df5$AOI_col = factor(df5$AOI_col, levels = sort(unique(df$AOI_col)))
ggplot(df5, aes(x = start, xend = end,
y = minute + scale(as.numeric(as.factor(AOI)))/10,
yend = minute + scale(as.numeric(as.factor(AOI)))/10,
color = AOI)) +
geom_segment(size = 2) +
scale_y_reverse(breaks = 0:53, labels = paste0(0:53, "min"), name = NULL) +
labs(title = "Gaze activity Speaker C F01") +
theme(axis.title.x.bottom = element_blank()) +
scale_colour_manual(values = unique(df$AOI_col))
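Relying on unique() works only while the order of the colours happens to match the order of the AOI levels. Two more explicit variants are sketched below, assuming a plain data-frame copy of the question's sample and a simplified y position (the AOI itself); gaze, pal, p_named and p_identity are illustrative names. The first builds a named colour vector from the data, the second uses scale_colour_identity() to take the colour column literally.

```r
library(ggplot2)

# Plain data-frame copy of the question's sample (grouping attributes dropped)
gaze <- data.frame(
  start = c(0, 251, 1976, 5127, 5717, 6783),
  end = c(251, 1976, 5127, 5717, 6783, 6830),
  AOI = c("*", "A", "*", "*", "A", "*"),
  AOI_col = c("blue", "red", "blue", "blue", "red", "blue")
)

# Variant 1: one colour per AOI, looked up by name rather than by position
pal <- with(unique(gaze[c("AOI", "AOI_col")]), setNames(AOI_col, AOI))

p_named <- ggplot(gaze, aes(x = start, xend = end, y = AOI, yend = AOI,
                            colour = AOI)) +
  geom_segment(linewidth = 2) +  # assumes ggplot2 >= 3.4; older versions use size = 2
  scale_colour_manual(values = pal)

# Variant 2: use the values in AOI_col directly as colours
p_identity <- ggplot(gaze, aes(x = start, xend = end, y = AOI, yend = AOI,
                               colour = AOI_col)) +
  geom_segment(linewidth = 2) +
  scale_colour_identity()

p_named
p_identity
```

The identity scale draws no legend by default, so the named-vector variant is probably the better fit when a legend is needed.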
I am trying to use the "group" argument in aes() in ggplot2, and I am not sure why it is not working as I currently have it.
I would like an image that groups my "maskalthalf" variable in the way that this image uses "sex" (found here).
This is what my graph currently looks like.
This is the code I have so far.
ggplot(groups, aes(x = message, y = mean, group = factor(maskalthalf))) +
geom_bar(stat = "identity", width = 0.5, fill = "#003900") +
geom_text(aes(label = round(mean, digits = 1), vjust = -2)) +
geom_errorbar(aes(ymin = mean - se, ymax = mean + se), width = .2, position = position_dodge(.9)) +
labs(title = "Evaluations of Personal and General Convincingness") +
ylab("Rating") +
xlab("Personal evaluation or general evaluation") +
ylim(0, 8)
This is a sketch of what I am aiming for:
Data:
structure(list(maskalthalf = c("High", "High", "Low", "Low"),
message = c("General", "Personal", "General", "Personal"),
mean = c(4.79090909090909, 6.38181818181818, 4.69879518072289,
4.8433734939759), se = c(0.144452868727642, 0.104112130946133,
0.149182255019704, 0.180996951567937)), row.names = c(NA,
-4L), groups = structure(list(maskalthalf = c("High", "Low"),
.rows = structure(list(1:2, 3:4), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = 1:2, class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
The image in your first example uses facets to group by variable. So you could try that:
ggplot(df1, aes(x = message, y = mean)) +
geom_col(width = 0.5, fill = "#003900") +
geom_text(aes(label = round(mean, digits = 1), vjust = -2)) +
geom_errorbar(aes(ymin = mean - se, ymax = mean + se), width = .2, position = position_dodge(.9)) +
labs(title = "Evaluations of Personal and General Convincingness") +
ylab("Rating") +
xlab("Personal evaluation or general evaluation") +
ylim(0, 8) +
facet_wrap(~maskalthalf)
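If side-by-side bars within one panel are preferred over facets, mapping the grouping variable to fill and dodging every layer by the same amount is another option. A rough sketch, assuming rounded copies of the values from the question; ratings, dodge and p_dodge are illustrative names, and the grey fill is an arbitrary choice.

```r
library(ggplot2)

# Summary values from the question, rounded
ratings <- data.frame(
  maskalthalf = c("High", "High", "Low", "Low"),
  message = c("General", "Personal", "General", "Personal"),
  mean = c(4.79, 6.38, 4.70, 4.84),
  se = c(0.144, 0.104, 0.149, 0.181)
)

# One shared dodge so bars, error bars and labels line up
dodge <- position_dodge(width = 0.9)

p_dodge <- ggplot(ratings, aes(x = message, y = mean, fill = maskalthalf)) +
  geom_col(position = dodge) +
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se),
                width = 0.2, position = dodge) +
  # Place each label just above its error bar
  geom_text(aes(label = round(mean, 1), y = mean + se),
            vjust = -0.5, position = dodge) +
  scale_fill_grey(start = 0.4, end = 0.8) +
  labs(title = "Evaluations of Personal and General Convincingness",
       x = "Personal evaluation or general evaluation", y = "Rating") +
  ylim(0, 8)
p_dodge
```

Mapping fill to maskalthalf is what creates the groups here; group alone would not separate the bars without a dodge position.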
I am attempting to make a series of plots using the same code with unique coral species databases.
Databases
data_1 <- structure(list(Site_long = structure(c(1L, 1L, 2L, 2L), .Label = c("Hanauma Bay",
"Waikiki"), class = "factor"), Shelter = structure(c(1L, 2L,
1L, 2L), .Label = c("Low", "High"), class = c("ordered", "factor"
)), mean = c(1.19986885018767, 2.15593884020962, 0.369605100791602,
0.31005865611133), sd = c(2.5618758944073, 3.67786619671933,
1.0285671157698, 0.674643037178562), lower = c(0.631321215232725,
1.33972360808602, 0.141339007832154, 0.160337623931733), upper = c(1.76841648514261,
2.97215407233321, 0.59787119375105, 0.459779688290928), sample_size = c(78L,
78L, 78L, 78L)), row.names = c(NA, -4L), groups = structure(list(
Site_long = structure(1:2, .Label = c("Hanauma Bay", "Waikiki"
), class = "factor"), .rows = structure(list(1:2, 3:4), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = 1:2, class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
data_2 <- structure(list(Site_long = structure(c(2L, 2L, 1L, 1L), .Label = c("Hanauma Bay",
"Waikiki"), class = "factor"), Shelter = structure(c(1L, 2L,
1L, 2L), .Label = c("Low", "High"), class = c("ordered", "factor"
)), mean = c(0.695203162997812, 0.838720069947102, 0.76957780057238,
0.771070502382599), sd = c(1.17117437618039, 1.02766824928792,
1.43499288333539, 1.28634022958585), lower = c(0.435288768568787,
0.610653459098997, 0.451115141323908, 0.485597776371556), upper = c(0.955117557426838,
1.06678668079521, 1.08804045982085, 1.05654322839364), sample_size = c(78L,
78L, 78L, 78L)), row.names = c(NA, -4L), groups = structure(list(
Site_long = structure(1:2, .Label = c("Hanauma Bay", "Waikiki"
), class = "factor"), .rows = structure(list(3:4, 1:2), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = 1:2, class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
When I run my code on the first species database (data_1), the barplots and associated error bar annotations render correctly. Notice I also made a new variable "data", the same object name that is reused later for species 2. To keep this plot for a composite of several plots later, I named it "species_1_plot" to save it to the global environment.
Code for Species 1 Plot
data <- data_1
mult_compare_recruitment <- c("A", "A", "A", "A")
data <- data[c(3, 4, 1, 2),]
data$Shelter <- factor(data$Shelter, levels = c("Low", "High"))
# reorder summary dataframe for plotting
position <- c("Waikiki", "Hanauma Bay")
# ggplot2 barplot position with Waikiki (Low-High Shelter) and Hanauma Bay
recruitment_plot_3 <- ggplot(data = data, aes(fill=Shelter, y=mean, x=Site_long)) +
geom_bar(position = "dodge", stat="identity", width = .8) +
scale_x_discrete(limits = position) +
geom_errorbar(aes(ymin = lower, ymax = upper), position = position_dodge(.8), width = .1) +
geom_text(aes(label = mult_compare_recruitment, y = data$upper), vjust = -.5, position = position_dodge(width = 0.8), size = 4) +
scale_fill_grey(name = "Shelter", start = .8, end = .2) +
labs(x = "Site", y = expression(paste("Coral recruitment per m"^"2"))) +
theme_classic(base_size = 14.5) +
theme(text = element_text(size = 18), axis.title.x = element_blank(),
legend.position = "none", axis.text.y = element_text(angle = 90))
species_1_plot <- recruitment_plot_3
species_1_plot
In order to create my next plot, I run the same code on a different species database (data_2) while once again assigning the new database to the object "data". Once again, I saved the new plot "species_2_plot" to the global environment.
Code for Species 2 Plot
data <- data_2
mult_compare_recruitment <- c("A", "A", "B", "B")
data <- data[c(3, 4, 1, 2),]
data$Shelter <- factor(data$Shelter, levels = c("Low", "High"))
# reorder summary dataframe for plotting
position <- c("Waikiki", "Hanauma Bay")
# ggplot2 barplot position with Waikiki (Low-High Shelter) and Hanauma Bay
recruitment_plot_3 <- ggplot(data = data, aes(fill=Shelter, y=mean, x=Site_long)) +
geom_bar(position = "dodge", stat="identity", width = .8) +
scale_x_discrete(limits = position) +
geom_errorbar(aes(ymin = lower, ymax = upper), position = position_dodge(.8), width = .1) +
geom_text(aes(label = mult_compare_recruitment, y = data$upper), vjust = -.5, position = position_dodge(width = 0.8), size = 4) +
scale_fill_grey(name = "Shelter", start = .8, end = .2) +
labs(x = "Site", y = expression(paste("Coral recruitment per m"^"2"))) +
theme_classic(base_size = 14.5) +
theme(text = element_text(size = 18), axis.title.x = element_blank(),
legend.position = "none", axis.text.y = element_text(angle = 90))
species_2_plot <- recruitment_plot_3
species_2_plot
The problem is, when I plot the first species plot again (species_1_plot), the data are correct (bars), but the height of text annotations and their letter values are not correct. They are in fact the values from species_2_plot.
species_1_plot
I saved each plot to the global environment with a unique name, knowing this would be an issue. Despite this, geom_text() seems to be using data from the second plot (code that is in the global environment), even though the actual data (bars) in the plot are correct (from species_1_plot). My understanding was that naming a plot as an object (species_1_plot and species_2_plot) is akin to saving the plot, and therefore prevents any changes to the plot and annotations unless specified. Is there any way to prevent this from happening without specifically naming the databases (data_1 and data_2)? All input is appreciated. Thanks in advance!
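I would suggest wrapping the code in a function. ggplot2 evaluates the expressions inside aes() only when the plot is drawn, and mult_compare_recruitment and data$upper are not columns of the plot data, so they are looked up in the environment where the plot was created. Both plots were built in the global environment, so printing species_1_plot after data and mult_compare_recruitment have been overwritten picks up the species 2 values, while the bars stay correct because they use the data stored in the plot. A function has its own internal environment, so the values each plot was built with stay attached to it. I have made a function with parameters for the data, the comparison letters and the site order, and I display the outputs. Fill them in the same way you defined those variables in your code. Here is the code, using the data you shared: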
library(ggplot2)
#Function
myplotfunc <- function(x,y,z)
{
data <- x
mult_compare_recruitment <- y
data <- data[c(3, 4, 1, 2),]
data$Shelter <- factor(data$Shelter, levels = c("Low", "High"))
# reorder summary dataframe for plotting
position <- z
# ggplot2 barplot position with Waikiki (Low-High Shelter) and Hanauma Bay
plot <- ggplot(data = data, aes(fill=Shelter, y=mean, x=Site_long)) +
geom_bar(position = "dodge", stat="identity", width = .8) +
scale_x_discrete(limits = position) +
geom_errorbar(aes(ymin = lower, ymax = upper), position = position_dodge(.8), width = .1) +
geom_text(aes(label = mult_compare_recruitment, y = data$upper), vjust = -.5, position = position_dodge(width = 0.8), size = 4) +
scale_fill_grey(name = "Shelter", start = .8, end = .2) +
labs(x = "Site", y = expression(paste("Coral recruitment per m"^"2"))) +
theme_classic(base_size = 14.5) +
theme(text = element_text(size = 18), axis.title.x = element_blank(),
legend.position = "none", axis.text.y = element_text(angle = 90))
return(plot)
}
#Code
o1 <- myplotfunc(x=data_1,y=c("A", "A", "A", "A"),z=c("Waikiki", "Hanauma Bay"))
o2 <- myplotfunc(x=data_2,y=c("A", "A", "B", "B"),z=c("Waikiki", "Hanauma Bay"))
Outputs:
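A complementary way to avoid the problem is to keep everything that aes() refers to inside the data frame, so nothing is looked up in the global environment when the plot is drawn. A minimal sketch, assuming rounded means and upper bounds from data_1; d1, sig_label and dodge are hypothetical names introduced for illustration.

```r
library(ggplot2)

# Means and upper bounds for species 1, rounded from the question's data_1
d1 <- data.frame(
  Site_long = c("Hanauma Bay", "Hanauma Bay", "Waikiki", "Waikiki"),
  Shelter = factor(c("Low", "High", "Low", "High"), levels = c("Low", "High")),
  mean = c(1.20, 2.16, 0.37, 0.31),
  upper = c(1.77, 2.97, 0.60, 0.46)
)

# Store the annotation letters as a column instead of a free-standing vector
d1$sig_label <- c("A", "A", "A", "A")

dodge <- position_dodge(width = 0.8)

species_1_plot <- ggplot(d1, aes(x = Site_long, y = mean, fill = Shelter)) +
  geom_col(position = dodge, width = 0.8) +
  # label and y are columns of d1, so they travel with the plot object
  geom_text(aes(label = sig_label, y = upper), vjust = -0.5,
            position = dodge, size = 4) +
  scale_x_discrete(limits = c("Waikiki", "Hanauma Bay")) +
  scale_fill_grey(name = "Shelter", start = 0.8, end = 0.2)

# A global vector with the same name does not leak into the plot,
# because columns of the plot data take precedence inside aes()
sig_label <- c("B", "B", "B", "B")
species_1_plot
```

With the letters stored as a column, reusing the name data for the next species should no longer change plots that were created earlier.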