I want to show only every second label on the x-axis of the plot.
A simplified code example follows; its output is in Fig. 1, where four dates are shown but #2 and #4 should be skipped.
# https://stackoverflow.com/a/6638722/54964
require(ggplot2)
my.dates = as.Date(c("2011-07-22","2011-07-23",
"2011-07-24","2011-07-28","2011-07-29"))
my.vals = c(5,6,8,7,3)
my.data <- data.frame(date =my.dates, vals = my.vals)
plot(my.dates, my.vals)
p <- ggplot(data = my.data, aes(date,vals))+ geom_line(size = 1.5)
Expected output: skip the second and fourth dates.
Actual code
In the actual code, because of the rev(Vars) logic, I cannot apply as.Date to the values in each category. The variable molten has a column Dates.
p <- ggplot(molten, aes(x = rev(Vars), y = value)) +
geom_bar(aes(fill=variable), stat = "identity", position="dodge") +
facet_wrap( ~ variable, scales="free") +
scale_x_discrete("Column name dates", labels = rev(Dates))
Expected output: skip #2,#4, ... values in each category.
I thought of changing scale_x_discrete to scale_x_continuous and passing a break sequence, breaks = seq(1, length(Dates), 2), to scale_x_continuous, but it fails with the following error.
Error: `breaks` and `labels` must have the same length
Proposal based on Juan's comments
Code
ggplot(data = my.data, aes(as.numeric(date), vals)) +
geom_line(size = 1.5) +
scale_x_continuous(breaks = pretty(as.numeric(rev(my.data$date)), n = 5))
Output
Error: Discrete value supplied to continuous scale
Testing EricWatt's proposal on the actual code
Code proposal
p <- ggplot(molten, aes(x = rev(Vars), y = value)) +
geom_bar(aes(fill=variable), stat = "identity", position="dodge") +
facet_wrap( ~ variable, scales="free") +
scale_x_discrete("My dates", breaks = Dates[seq(1, length(Dates), by = 2)], labels = rev(Dates))
Output
Error: `breaks` and `labels` must have the same length
If you use scale_x_discrete("My dates", breaks = Dates[seq(1, length(Dates), by = 2)]) without labels, you get an x-axis without any labels, i.e. a blank axis.
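A possible explanation for both failures, sketched on made-up data (vars, dates and demo below are hypothetical stand-ins for Vars, Dates and molten, which are not shown here): on a discrete scale, breaks have to be values that actually occur in the x variable, and labels needs exactly one entry per break. If x is built from Vars and Vars holds different values than Dates, breaks taken from Dates match nothing, which would give the blank axis; passing all of Dates as labels against half as many breaks gives the length error. Subsetting both with the same index avoids both problems:

```r
library(ggplot2)

# Hypothetical stand-ins for the question's Vars and Dates
vars  <- c("X1", "X2", "X3", "X4", "X5")
dates <- c("22.07.2011", "23.07.2011", "24.07.2011", "28.07.2011", "29.07.2011")
demo  <- data.frame(var = factor(vars, levels = vars), value = c(5, 6, 8, 7, 3))

# Keep every second position; subset breaks and labels with the same index
idx         <- seq(1, length(vars), by = 2)
keep_breaks <- vars[idx]    # values that exist on the x-axis
keep_labels <- dates[idx]   # exactly one label per break

p <- ggplot(demo, aes(x = var, y = value)) +
  geom_bar(stat = "identity") +
  scale_x_discrete("My dates", breaks = keep_breaks, labels = keep_labels)
```

Printing p should draw all five bars but label only the first, third and fifth. If the x values are reversed as in the question, the same index would have to be applied to the reversed vectors.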
Fig. 1 Output of the simplified code example
Fig. 2 Output of EricWatt's first proposal
OS: Debian 9
R: 3.4.0
This works with your simplified example. Without your molten data.frame it's hard to check it against your more complicated plot.
ggplot(data = my.data, aes(date, vals)) +
geom_line(size = 1.5) +
scale_x_date(breaks = my.data$date[seq(1, length(my.data$date), by = 2)])
Basically, use scale_x_date which will likely handle any strange date to numeric conversions for you.
My eventual solution for the actual code, motivated by the other linked thread and EricWatt's answer:
# Test data of actual data here # https://stackoverflow.com/q/45130082/54964
ggplot(data = molten, aes(x = as.Date(Time.data, format = "%d.%m.%Y"), y = value)) +
geom_bar(aes(fill = variable), stat = "identity", position = "dodge") +
facet_wrap( ~ variable, scales="free") +
theme_bw() + # a complete theme: it must come before theme(), otherwise it overrides the axis text settings
theme(axis.text.x = element_text(angle = 90, hjust=1),
text = element_text(size=10)) +
scale_x_date(date_breaks = "2 days", date_labels = "%d.%m.%Y")
I am wondering how to add data labels to a ggplot showing the true value of the data points when the x-axis is in log scale.
I have this data:
date <- c("4/3/2021", "4/7/2021","4/10/2021","4/12/2021","4/13/2021","4/13/2021")
amount <- c(105.00, 96.32, 89.00, 80.84, 121.82, 159.38)
address <- c("A","B","C","D","E","F")
df <- data.frame(date, amount, address)
And I plot it in ggplot2:
plot <- ggplot(df, aes(x = log(amount))) +
geom_histogram(binwidth = 1)
plot + theme_minimal() + geom_text(label = amount)
... but I get the error
"Error: geom_text requires the following missing aesthetics: y"
I have 2 questions as a result:
Why am I getting this error with geom_histogram? Shouldn't it assume to use count as the y value?
Will this successfully show the true values of the data points from the 'amount' column despite the plot's log scale x-axis?
Perhaps like this?
ggplot(df, aes(x = log(amount), y = ..count.., label = ..count..)) +
geom_histogram(binwidth = 1) +
stat_bin(geom = "text", binwidth = 1, vjust = -0.5) +
theme_minimal()
ggplot2 layers do not (at least in any situations I can think of) take the summary calculations of other layers, so I think the simplest thing would be to replicate the calculation using stat_bin(geom = "text"...
Or perhaps simpler, you could pre-calculate the numbers:
library(dplyr)
df %>%
count(log_amt = round(log(amount))) %>%
ggplot(aes(log_amt, n, label = n)) +
geom_col(width = 1) +
geom_text(vjust = -0.5)
EDIT -- to show buckets without the log transform we could use:
df %>%
count(log_amt = round(log(amount))) %>%
ggplot(aes(log_amt, n, label = n)) +
geom_col(width = 0.5) +
geom_text(vjust = -0.5) +
# log() is the natural log, so invert it with exp() rather than 10^
scale_x_continuous(labels = ~scales::comma(exp(.)),
minor_breaks = NULL)
I would like to change the y axis label (or main title would also be fine) of a ggplot to reflect the column name being iterated over within an apply function.
Here is some sample data and my working apply function:
trial_df <- data.frame("Patient" = c(1,1,2,2,3,3,4,4),
"Outcome" = c("NED", "NED", "NED", "NED", "Relapse","Relapse","Relapse","Relapse"),
"Time_Point" = c("Baseline", "Week3", "Baseline", "Week3","Baseline", "Week3","Baseline", "Week3"),
"CD4_Param" = c(50.8,53.1,20.3,18.1,30.8,24.5,35.2,31.0),
"CD8_Param" = c(5.3,9.7,4.4,4.3,3.1,3.2,5.6,5.3),
"CD3_Param" = c(11.6,16.6,5.0,5.1,14.3,7.1,5.9,8.1))
apply(trial_df[,4:length(trial_df)], 2, function(i) ggplot(data = trial_df, aes_string(x = "Time_Point", y = i )) +
facet_wrap(~Outcome) +
geom_boxplot(alpha = 0.1) +
geom_point(aes(color = `Outcome`, fill = `Outcome`)) +
geom_path(aes(group = `Patient`, color = `Outcome`)) +
theme_minimal() +
ggpubr::stat_compare_means( method = "wilcox.test") +
scale_fill_manual(values=c("blue", "red")) +
scale_color_manual(values=c("blue", "red")))
Example plot output
This creates 3 graphs as expected, however the y axis just says "y". I would like this to display the column name for the column in that iteration. It would also be fine to add a main title with this information, as I just need to know which graph corresponds to which column.
Here are things I have already tried adding to the ggplot code above based on some similar questions I found, but all of them give me the error "non-numeric argument to binary operator":
ggtitle(paste(i))
labs(y = i)
labs(y = as.character(i))
Any help or resources I may have missed would be greatly appreciated, thanks!
For reasons I cannot figure out, the following gives what you want, but only for one graph:
apply(trial_df[,4:length(trial_df)], 2, function(i) ggplot(data = trial_df, aes_string(x = "Time_Point", y = i )) +
facet_wrap(~Outcome) +
geom_boxplot(alpha = 0.1) +
geom_point(aes(color = `Outcome`, fill = `Outcome`)) +
geom_path(aes(group = `Patient`, color = `Outcome`)) +
theme_minimal() +
ggpubr::stat_compare_means( method = "wilcox.test") +
scale_fill_manual(values=c("blue", "red")) +
scale_color_manual(values=c("blue", "red"))+
labs(y=colnames(trial_df)[i]))
Gives these:
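One way around this, as a sketch rather than a tested drop-in: apply() over the columns hands the function the column values, so the column name is not available inside it. Iterating over the names with lapply() keeps the name at hand for both the mapping and the label. The example below assumes a ggplot2 version that supports the .data pronoun (3.0.0 or later), uses a cut-down copy of the sample data called trial_demo, and leaves out the ggpubr layer to stay self-contained:

```r
library(ggplot2)

# Cut-down copy of the sample data (hypothetical name trial_demo)
trial_demo <- data.frame(
  Patient    = c(1, 1, 2, 2, 3, 3, 4, 4),
  Outcome    = rep(c("NED", "Relapse"), each = 4),
  Time_Point = rep(c("Baseline", "Week3"), times = 4),
  CD4_Param  = c(50.8, 53.1, 20.3, 18.1, 30.8, 24.5, 35.2, 31.0),
  CD8_Param  = c(5.3, 9.7, 4.4, 4.3, 3.1, 3.2, 5.6, 5.3)
)

param_cols <- names(trial_demo)[4:ncol(trial_demo)]

# Iterate over column names (strings), not over the column values
plots <- lapply(param_cols, function(col) {
  ggplot(trial_demo, aes(x = Time_Point, y = .data[[col]])) +
    facet_wrap(~Outcome) +
    geom_boxplot(alpha = 0.1) +
    geom_point(aes(color = Outcome)) +
    geom_path(aes(group = Patient, color = Outcome)) +
    theme_minimal() +
    scale_color_manual(values = c("blue", "red")) +
    labs(y = col, title = col)   # the name is a plain string here
})
names(plots) <- param_cols
```

plots[["CD4_Param"]] then prints the plot for that column, with the column name as both y-axis label and title.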
I need to plot hourly data for different days using ggplot, and here is my dataset:
The data consists of hourly observations, and I want to plot each day's observation into one separate line.
Here is my code
xbj1 = bj[c(1:24),c(1,6)]
xbj2 = bj[c(24:47),c(1,6)]
xbj3 = bj[c(48:71),c(1,6)]
ggplot()+
geom_line(data = xbj1,aes(x = Date, y= Value), colour="blue") +
geom_line(data = xbj2,aes(x = Date, y= Value), colour = "grey") +
geom_line(data = xbj3,aes(x = Date, y= Value), colour = "green") +
xlab('Hour') +
ylab('PM2.5')
Please advise on this.
I'll make some fake data (I won't try to transcribe yours) first:
set.seed(2)
x <- data.frame(
Date = rep(Sys.Date() + 0:1, each = 24),
# Year, Month, Day ... are not used here
Hour = rep(0:23, times = 2),
Value = sample(1e2, size = 48, replace = TRUE)
)
This is a straight-forward ggplot2 plot:
library(ggplot2)
ggplot(x) +
geom_line(aes(Hour, Value, color = as.factor(Date))) +
scale_color_discrete(name = "Date")
# Alternatively, one panel per day:
ggplot(x) +
geom_line(aes(Hour, Value)) +
facet_grid(Date ~ .)
I highly recommend you find good tutorials for ggplot2, such as http://www.cookbook-r.com/Graphs/. Others exist, many quite good.
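The plots above rely on having separate Date and Hour columns. If the real data only has a single timestamp column (an assumption, since the dataset is not shown here), both can be derived first. A minimal sketch with hypothetical names bj_demo and Timestamp:

```r
library(ggplot2)

# Hypothetical data: 48 hourly readings stored in one timestamp column
bj_demo <- data.frame(
  Timestamp = seq(as.POSIXct("2021-01-01 00:00:00", tz = "UTC"),
                  by = "hour", length.out = 48),
  Value = rep(c(10, 20), each = 24)
)

# Split the timestamp into a calendar day and an hour of day
bj_demo$Day  <- as.Date(bj_demo$Timestamp, tz = "UTC")
bj_demo$Hour <- as.integer(format(bj_demo$Timestamp, "%H", tz = "UTC"))

# One line per day, with hour of day on the x-axis
p <- ggplot(bj_demo, aes(Hour, Value, color = factor(Day))) +
  geom_line() +
  labs(x = "Hour", y = "PM2.5", color = "Date")
```

This avoids slicing the data frame by row ranges, so it keeps working when a day has missing hours.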
In R with ggplot, I want to create a spaghetti plot (2 quantitative variables) grouped by a third variable to specify line color. Secondly, I want to aggregate that grouping variable with the line type or width.
Here's an example using the airquality dataset. I want the line's color to represent the month, and the summer months to have a different line width from non-summer months.
First, I created an indicator variable for the aggregated groups:
airquality$Summer <- with(airquality, ifelse(Month >= 6 & Month < 9, 1, 0))
I would like something like this, but with differing line widths:
However, this fails:
library(ggplot2)
ggplot(data = airquality, aes(x=Wind, y = Temp, color = as.factor(Month), group = Summer)) +
geom_point() +
geom_line(linetype = as.factor(Summer))
This also fails (specifying airquality$Summer):
ggplot(data = airquality, aes(x=Wind, y = Temp,
color = as.factor(Month), group = airquality$Summer)) +
geom_point() +
geom_line(linetype = as.factor(airquality$Summer))
I attempted this solution, but get another error:
lty <- setNames(c(0, 1), levels(airquality$Summer))
ggplot(data = airquality, aes(x=Wind, y = Temp,
color = as.factor(Month), group = airquality$Summer)) +
geom_point() +
geom_line(linetype = as.factor(airquality$Summer)) +
scale_linetype_manual(values = lty)
Any ideas?
EDIT:
My actual data show very clear trends, and I want to differentiate the top line from all the others below. My goal is to convince people they should make more than just the minimum payment on their student loans:
You just need to change the group to Month and put linetype in aes:
ggplot(data = airquality, aes(x=Wind, y = Temp, color = as.factor(Month), group = Month)) +
geom_point() +
geom_line(aes(linetype = factor(Summer)))
If you want to specify the linetype you can use a few methods. Here is one way:
lineT <- c("solid", "dotdash")
names(lineT) <- c("1","0")
ggplot(data = airquality, aes(x=Wind, y = Temp, color = as.factor(Month))) +
geom_point() +
geom_line(aes(linetype = factor(Summer))) +
scale_linetype_manual(values = lineT)
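Since the question also asks about line width, the same pattern should carry over to that aesthetic. The sketch below assumes ggplot2 3.4.0 or later, where the aesthetic is called linewidth; on older versions, size and scale_size_manual() play that role. The copy aq and the "summer"/"other" coding are illustrative choices, not part of the original code:

```r
library(ggplot2)

# Work on a copy and code the indicator as readable labels
aq <- airquality
aq$Summer <- ifelse(aq$Month >= 6 & aq$Month < 9, "summer", "other")

# Assumption: ggplot2 >= 3.4.0; on older versions use `size` and
# scale_size_manual() in place of `linewidth` and scale_linewidth_manual()
p <- ggplot(aq, aes(x = Wind, y = Temp, color = factor(Month), group = Month)) +
  geom_point() +
  geom_line(aes(linewidth = Summer)) +
  scale_linewidth_manual(values = c(summer = 1.5, other = 0.5)) +
  labs(color = "Month", linewidth = "Season")
```

As with linetype, the key point is that the variable is mapped inside aes() while group stays on Month, so each month remains its own line.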
I'm learning to use ggplot2 and am looking for the smallest ggplot2 code that reproduces the base::plot result below. I've tried a few things and they all ended up being horrendously long, so I'm looking for the smallest expression and ideally would like to have the dates on the x-axis (which are not there in the plot below).
df = data.frame(date = c(20121201, 20121220, 20130101, 20130115, 20130201),
val = c(10, 5, 8, 20, 4))
plot(cumsum(rowsum(df$val, df$date)), type = "l")
Try this:
ggplot(df, aes(x=1:5, y=cumsum(val))) + geom_line() + geom_point()
Just remove geom_point() if you don't want it.
Edit: Since you need the dates as x-axis labels, you can still plot with x = 1:5 and set the labels on the scale. Because x is numeric here, use scale_x_continuous with explicit breaks (scale_x_discrete may leave the axis unlabelled in recent ggplot2 versions). Taking df:
ggplot(data = df, aes(x = 1:5, y = cumsum(val))) + geom_line() +
geom_point() + theme(axis.text.x = element_text(angle=90, hjust = 1)) +
scale_x_continuous(breaks = seq_along(df$date), labels = as.character(df$date)) + xlab("Date")
Since you say you'll have more than 1 val for "date", you can aggregate them first using plyr, for example.
require(plyr)
dd <- ddply(df, .(date), summarise, val = sum(val))
Then you can proceed with the same command, using dd instead of df and replacing x = 1:5 with x = seq_len(nrow(dd)).
After a couple of years, I've settled on doing:
ggplot(df, aes(as.Date(as.character(date), '%Y%m%d'), cumsum(val))) + geom_line()
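If a date can occur more than once, the values probably need to be summed per date before accumulating, as rowsum() does in the base plot. A sketch using only base R for that step (df_demo, dd and cum are illustrative names):

```r
library(ggplot2)

df_demo <- data.frame(date = c(20121201, 20121220, 20130101, 20130115, 20130201),
                      val  = c(10, 5, 8, 20, 4))

# Sum val per date first (mirrors rowsum() in the base plot), then accumulate
dd <- aggregate(val ~ date, data = df_demo, FUN = sum)
dd$date <- as.Date(as.character(dd$date), format = "%Y%m%d")
dd$cum  <- cumsum(dd$val)

p <- ggplot(dd, aes(date, cum)) + geom_line()
```

With a real Date column, ggplot2 picks a date scale by itself, so the axis is spaced by calendar time instead of by row index.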
Jan Boyer seems to have found a more concise solution to this problem in this question, which I have shortened a bit and combined with Prradep's answer to provide a (hopefully) up-to-date answer:
ggplot(data = df,
aes(x=date)) +
geom_col(aes(y=value)) +
geom_line(aes(x = date, y = cumsum((value))/5, group = 1), inherit.aes = FALSE) +
ylab("Value") +
theme(axis.text.x = element_text(angle=90, hjust = 1))
Note that date is a character column here, not a Date, and that value is already aggregated per date, as suggested by Prradep in his answer above.