I am trying to figure out why all of my data points are grouped up on the y-axis of my graph (see image below). I am trying to do multiple plots with different variables. How can I separate the values on the y-axis? Also is it possible to show the dates every quarter on the x-axis instead of showing it every year?
I am fairly new to R, so your help is much appreciated.
Code:
library(ggplot2)
library(xts)
beta<-as.data.frame(beta)
beta[,"Date"]<- as.Date(beta[,"Date"])
beta<- xts(beta,order.by=beta[,"Date"])
autoplot(beta,facets=Series~.)+ geom_point() + theme_bw()
Data set:
Size Value
2013-01-01 0.032715590 -0.729988962
2013-02-01 0.029004454 -0.720470432
2013-03-01 -0.005376306 -0.774927763
2013-04-01 -0.065253538 -0.832884912
2013-05-01 -0.132726778 -0.805900000
2013-06-01 -0.094694083 -0.693202747
2013-07-01 -0.067636417 -0.540439590
2013-08-01 -0.080754396 -0.523916099
2013-09-01 -0.046787938 -0.633682670
2013-10-01 -0.039442980 -0.527533014
2013-11-01 0.007652725 -0.602841925
2013-12-01 0.012766257 -0.562559325
2014-01-01 0.005465503 -0.590979360
2014-02-01 0.033734341 -0.500183338
2014-03-01 0.036242236 -0.458877891
2014-04-01 0.085039855 -0.370762659
2014-05-01 0.120012885 -0.361754453
2014-06-01 0.146198534 -0.291407100
2014-07-01 0.147598628 -0.393385963
2014-08-01 0.173900895 -0.384568303
Image:
This type of problem is generally a matter of reshaping the data.frame from wide to long format, so that each series can be mapped to its own facet.
library(tidyverse)
beta %>%
mutate(Date = as.Date(Date)) %>%
pivot_longer(
cols = c(Size, Value),
names_to = "Variable",
values_to = "Values"
) %>%
ggplot(aes(Date, Values, color = Variable)) +
geom_point() +
geom_line() +
scale_x_date(date_breaks = "3 month", date_labels = "%b %Y") +
facet_grid(Variable ~ .) +
theme_bw() +
theme(axis.text.x=element_text(angle=60, hjust=1))
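If the goal is for each variable to get its own y-axis range, one option is to let the facets use independent y scales. The following is only a sketch of that idea: it reuses the first four rows of the question's data, and the names `beta_long` and `p` are placeholders introduced here. By default `facet_grid()` shares one y scale across panels; `scales = "free_y"` lets each panel pick its own range.

```r
library(tidyr)
library(ggplot2)

# First four rows of the question's data set
beta <- read.table(text = "
Date Size Value
2013-01-01 0.032715590 -0.729988962
2013-02-01 0.029004454 -0.720470432
2013-03-01 -0.005376306 -0.774927763
2013-04-01 -0.065253538 -0.832884912
", header = TRUE)
beta$Date <- as.Date(beta$Date)

# Wide to long: one row per (Date, Variable) pair
beta_long <- pivot_longer(beta, cols = c(Size, Value),
                          names_to = "Variable", values_to = "Values")

# scales = "free_y" gives each facet its own y-axis range
p <- ggplot(beta_long, aes(Date, Values, color = Variable)) +
  geom_point() +
  geom_line() +
  scale_x_date(date_breaks = "3 months", date_labels = "%b %Y") +
  facet_grid(Variable ~ ., scales = "free_y") +
  theme_bw()
```

Printing `p` draws the plot. With four rows and two variables, `beta_long` has eight rows.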
Data
beta <- read.table(text = "
Date Size Value
2013-01-01 0.032715590 -0.729988962
2013-02-01 0.029004454 -0.720470432
2013-03-01 -0.005376306 -0.774927763
2013-04-01 -0.065253538 -0.832884912
2013-05-01 -0.132726778 -0.805900000
2013-06-01 -0.094694083 -0.693202747
2013-07-01 -0.067636417 -0.540439590
2013-08-01 -0.080754396 -0.523916099
2013-09-01 -0.046787938 -0.633682670
2013-10-01 -0.039442980 -0.527533014
2013-11-01 0.007652725 -0.602841925
2013-12-01 0.012766257 -0.562559325
2014-01-01 0.005465503 -0.590979360
2014-02-01 0.033734341 -0.500183338
2014-03-01 0.036242236 -0.458877891
2014-04-01 0.085039855 -0.370762659
2014-05-01 0.120012885 -0.361754453
2014-06-01 0.146198534 -0.291407100
2014-07-01 0.147598628 -0.393385963
2014-08-01 0.173900895 -0.384568303
", header = TRUE)
Related
I have a facet plot that I need to place a rectangle in or highlight 3 specific facets. Facets 5, 6, and 10. See Below:
I found some code using "geom_rect" that seems like it should work, but the rectangle does not show up and I do not get any error message. Here is the code:
weekly_TS_PDF<- ggplot(TS_stack, aes(x= TS_log, y = TS_depth, color= sentiment)) +
scale_y_reverse(limits= c(16,2), breaks= seq(16,2)) +
geom_rect(data = data.frame(Week = 5), aes(xmin = -65, xmax = -55, ymin = 1, ymax = 16), alpha = .3, fill="grey", inherit.aes = F) +
geom_point() + facet_grid(.~ Week) + geom_hline(data = week_avg_15E, aes(yintercept = x), linetype = "solid") +
ylab("Target Depth (m)") + xlab("Mean Target Strength (dB)") + ggtitle("Mean TS by Depth by Week (12 hour resolution)") +
guides(color=guide_legend("Year"))
Reprex data:
X TS_depth Group.1 x TS_log Date_time AMPM Week sentiment
1 1 9.593093 2020-12-01 18:00:00 5.390264e-07 -62.68390 2020-12-01 18:00:00 PM 5 Year 1
2 2 9.550032 2020-12-02 06:00:00 4.022841e-07 -63.95467 2020-12-02 06:00:00 AM 6 Year 1
3 3 9.677069 2020-12-02 18:00:00 6.277191e-07 -62.02235 2020-12-02 18:00:00 PM 7 Year 1
4 4 9.679256 2020-12-03 06:00:00 3.501608e-07 -64.55732 2020-12-03 06:00:00 AM 8 Year 1
5 5 9.606380 2020-12-03 18:00:00 6.698625e-07 -61.74014 2020-12-03 18:00:00 PM 9 Year 1
6 6 9.548408 2020-12-04 06:00:00 4.464622e-07 -63.50215 2020-12-04 06:00:00 AM 10 Year 1
I just need to highlight or put a rectangle in facets 5,6, and 10. Any help is appreciated.
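A possible approach, offered as a sketch rather than a confirmed fix: one likely reason the rectangle disappears is that `ymin = 1` lies outside the scale limits `c(16, 2)`, and continuous scales by default turn out-of-limits values into `NA`. Using `-Inf`/`Inf` for the rectangle bounds avoids that, and giving `geom_rect()` a data frame with one row per facet to highlight restricts it to those panels. The data frame `highlight` is a name introduced here, and `TS_stack` is rebuilt from the reprex rows above.

```r
library(ggplot2)

# Stand-in for TS_stack, built from the reprex rows in the question
TS_stack <- data.frame(
  TS_depth  = c(9.593093, 9.550032, 9.677069, 9.679256, 9.606380, 9.548408),
  TS_log    = c(-62.68390, -63.95467, -62.02235, -64.55732, -61.74014, -63.50215),
  Week      = c(5, 6, 7, 8, 9, 10),
  sentiment = "Year 1"
)

# One row per facet that should be highlighted (hypothetical helper data frame)
highlight <- data.frame(Week = c(5, 6, 10))

p <- ggplot(TS_stack, aes(x = TS_log, y = TS_depth, color = sentiment)) +
  # Infinite bounds fill the whole panel and are not dropped by the scale limits
  geom_rect(data = highlight,
            aes(xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf),
            alpha = 0.3, fill = "grey", inherit.aes = FALSE) +
  geom_point() +
  scale_y_reverse(limits = c(16, 2), breaks = seq(16, 2)) +
  facet_grid(. ~ Week) +
  guides(color = guide_legend("Year"))
```

Because `highlight` contains the faceting variable `Week`, the rectangle is drawn only in the panels for weeks 5, 6, and 10.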
I'm trying to extract only the day and the month from as.POSIXct entries in a dataframe to overlay multiple years of data from the same months in a ggplot.
I have the data as time-series objects ts.
data.ts<-read.zoo(data, format = "%Y-%m-%d")
ts<-SMA(data.ts[,2], n=10)
df<-data.frame(date=as.POSIXct(time(ts)), value=ts)
ggplot(df, aes(x=date, y=value,
group=factor(year(date)), colour=factor(year(date)))) +
geom_line() +
labs(x="Month", colour="Year") +
theme_classic()
Now, obviously if I only use "date" in aes, it'll plot the normal time-series as a consecutive sequence across the years. If I do "day(date)", it'll group by day on the x-axis. How do I pull out day AND month from the date? I only found yearmon(). If I try as.Date(df$date, format="%d %m"), it's not doing anything and if I show the results of the command, it would still include the year.
data:
> data
Date V1
1 2017-02-04 113.26240
2 2017-02-05 113.89059
3 2017-02-06 114.82531
4 2017-02-07 115.63410
5 2017-02-08 113.68569
6 2017-02-09 115.72382
7 2017-02-10 114.48750
8 2017-02-11 114.32556
9 2017-02-12 113.77024
10 2017-02-13 113.17396
11 2017-02-14 111.96292
12 2017-02-15 113.20875
13 2017-02-16 115.79344
14 2017-02-17 114.51451
15 2017-02-18 113.83330
16 2017-02-19 114.13128
17 2017-02-20 113.43267
18 2017-02-21 115.85417
19 2017-02-22 114.13271
20 2017-02-23 113.65309
21 2017-02-24 115.69795
22 2017-02-25 115.37587
23 2017-02-26 114.64885
24 2017-02-27 115.05736
25 2017-02-28 116.25590
If I create a new column with only day and month
df$day<-format(df$date, "%m/%d")
ggplot(df, aes(x=day, y=value,
group=factor(year(date)), colour=factor(year(date)))) +
geom_line() +
labs(x="Month", colour="Year") +
theme_classic()
I get such a graph for the two years.
I want it to look like this, only with daily data instead of monthly.
ggplot: Multiple years on same plot by month
You are almost there. As you want to overlay day and month based on every year, we need a continuous variable. "Day of the year" does the trick for you.
data <-data.frame(Date=c(Sys.Date()-7,Sys.Date()-372,Sys.Date()-6,Sys.Date()-371,
Sys.Date()-5,Sys.Date()-370,Sys.Date()-4,Sys.Date()-369,
Sys.Date()-3,Sys.Date()-368),V1=c(113.23,123.23,121.44,111.98,113.5,114.57,113.44, 121.23, 122.23, 110.33))
data$year = format(as.Date(data$Date), "%Y")
data$Date = as.numeric(format(as.Date(data$Date), "%j"))
ggplot(data = data, mapping = aes(x = Date, y = V1, shape = year, color = year)) +
  geom_point() +
  geom_line() +
  theme_bw()
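A variation on the same idea, sketched under some assumptions: instead of the numeric day of the year, every observation can be moved into one common dummy year, so the x-axis stays a `Date` and can be labelled with day and month. The year 2000 is an arbitrary choice made here (it is a leap year, so 29 February stays valid). The values are illustrative ones reused from the answer above, and the column name `day_month` is introduced for this sketch.

```r
library(ggplot2)

# Illustrative values reused from the answer above, on fixed dates in two years
df <- data.frame(
  date  = as.Date(c("2017-02-04", "2017-02-05", "2017-02-06",
                    "2018-02-04", "2018-02-05", "2018-02-06")),
  value = c(113.23, 121.44, 113.5, 123.23, 111.98, 114.57)
)

df$year <- format(df$date, "%Y")

# Same day and month, but all in the dummy year 2000
df$day_month <- as.Date(format(df$date, "2000-%m-%d"))

p <- ggplot(df, aes(x = day_month, y = value, colour = year, group = year)) +
  geom_line() +
  scale_x_date(date_labels = "%d %b") +  # show day and month only
  labs(x = "Day", colour = "Year") +
  theme_classic()
```

Each year is then drawn as its own line over a shared day-and-month axis.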
I have the following data:
DATE TIME 15 16 17 18 20 22 23 24
30/08/2018 08:00:00 130.4905899 15.44164769 948.9185939 6211.837354 1071.730556 10.08920076 7.793301031 4.290571724
30/08/2018 11:00:00 125.7301547 18.87143833 991.0009783 6304.471569 1082.126629 10.80475415 7.857773565 4.434150761
30/08/2018 14:00:00 153.3779662 17.63335938 949.1741247 6209.524186 1079.102756 10.68438383 8.326058855 4.265092761
30/08/2018 17:00:00 132.3256891 15.8961511 917.0452991 6123.402395 1081.166439 10.41007094 7.856372445 4.19841642
30/08/2018 20:00:00 130.6405835 15.28122651 917.0229181 6135.679239 1084.589394 10.70688202 7.741277402 4.236844143
31/08/2018 08:00:00 124.1484465 17.14357927 948.9060481 6126.791479 1085.907147 10.76713085 7.810187162 4.356138132
03/09/2018 08:00:00 161.0455657 17.10409992 881.4517913 5839.73355 1073.585164 9.925269955 7.987206082 4.301307752
10/09/2018 08:00:00 165.645823 16.45928764 860.4285647 5781.612679 1073.439013 10.01297791 7.983672272 4.257314139
Column 1 is the date, column two is the time. I have completed the following code but just keep getting errors.
Any help?
bc <- read.csv("CSV name", header = T)
bc$DATE = as.POSIXct(strptime(bc$DATE, format="%d-%m-%Y"))
bc$TIME = as.POSIXct(bc$TIME, format="%H-%M-%S"))
library(ggplot2); library(tidyr)
df_tidy <- df %>%
mutate(TIME = row.names %>% as.integer) %>%
select(-row.names) %>%
gather(series, value, -TIME)
ggplot(df_tidy, aes(TIME, value, group = series)) +
geom_line() +
facet_wrap(~series, scales = "free_y")
Any help is much appreciated.
A possible solution, assuming your data.frame is named df.
It creates a column named timestamp, of class POSIXct:
library(tidyverse)
df %>%
  mutate(timestamp = as.POSIXct(
    paste0(as.character(DATE), " ", as.character(TIME)),
    format = "%d/%m/%Y %H:%M:%S"
  ))
# DATE TIME X15 X16 X17 X18 X20 X22 X23 X24 timestamp
# 1 30/08/2018 08:00:00 130.4906 15.44165 948.9186 6211.837 1071.731 10.08920 7.793301 4.290572 2018-08-30 08:00:00
# 2 30/08/2018 11:00:00 125.7302 18.87144 991.0010 6304.472 1082.127 10.80475 7.857774 4.434151 2018-08-30 11:00:00
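To connect this back to the faceted line plot attempted in the question, the combined timestamp can be used as the x variable after reshaping to long format. This is a minimal sketch using three rows and two of the measurement columns from the question. The `tz = "UTC"` argument is an assumption added here so the result does not depend on the local time zone, and `pivot_longer()` stands in for the `gather()` call in the question.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Three rows and two measurement columns from the question's data;
# read.table() renames the numeric headers 15 and 16 to X15 and X16
df <- read.table(text = "
DATE TIME 15 16
30/08/2018 08:00:00 130.4905899 15.44164769
30/08/2018 11:00:00 125.7301547 18.87143833
30/08/2018 14:00:00 153.3779662 17.63335938
", header = TRUE)

df_tidy <- df %>%
  # Combine date and time into one POSIXct column (tz is an assumption)
  mutate(timestamp = as.POSIXct(paste(DATE, TIME),
                                format = "%d/%m/%Y %H:%M:%S", tz = "UTC")) %>%
  select(-DATE, -TIME) %>%
  # Long format: one row per timestamp and series
  pivot_longer(-timestamp, names_to = "series", values_to = "value")

p <- ggplot(df_tidy, aes(timestamp, value, group = series)) +
  geom_line() +
  facet_wrap(~series, scales = "free_y")
```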
I have two dataframes tur_e and tur_w. Below you can see the data frame:
tur_e:
Time_f turbidity_E
1 2014-12-12 00:00:00 87
2 2014-12-12 00:15:00 87
3 2014-12-12 00:30:00 91
4 2014-12-12 00:45:00 84
5 2014-12-12 01:00:00 92
6 2014-12-12 01:15:00 89
tur_w:
Time_f turbidity_w
47 2015-06-04 11:45:00 8.4
48 2015-06-04 12:00:00 10.5
49 2015-06-04 12:15:00 9.2
50 2015-06-04 12:30:00 9.1
51 2015-06-04 12:45:00 8.7
52 2015-06-04 13:00:00 8.4
I then create a single data frame combining turbidity_E and turbidity_w. I join on the timestamp column (Time_f) and use melt to reshape the data:
dplr <- left_join(tur_e, tur_w, by=c("Time_f"))
dt.df <- melt(dplr, measure.vars = c("turbidity_E", "turbidity_w"))
I plotted a series of box plots over time. The code is below:
dt.df%>% mutate(Time_f = ymd_hms(Time_f)) %>%
ggplot(aes(x = cut(Time_f, breaks="month"), y = value)) +
geom_boxplot(outlier.size = 0.3) + facet_wrap(~variable, ncol=1)+labs(x = "time")
I obtain the following graph:
I would like to reduce the number of dates that appear on my x-axis. I added this line of code:
scale_x_date(breaks = date_breaks("6 months"),labels = date_format("%b"))
I got the following error:
Error: Invalid input: date_trans works with objects of class Date only
I tried a lot of different solutions but none of them worked. Any help would be appreciated! Thanks!
Two things. First, you need to use scale_x_datetime (you don't have only dates, but also time!). Secondly, when you cut x, it actually just becomes a factor, losing any sense of time altogether. If you want a boxplot of each month, you can group by that cut instead:
dt.df %>% mutate(Time_f = lubridate::ymd_hms(Time_f)) %>%
ggplot(aes(x = Time_f, y = value, group = cut(Time_f, breaks="month"))) +
geom_boxplot(outlier.size = 0.3) +
facet_wrap(~variable, ncol = 1) +
labs(x = "time") +
scale_x_datetime(date_breaks = '1 month')
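The second point can be checked directly in base R. The snippet below is a small illustration using three timestamps from the question's tables. The `tz = "UTC"` argument is an assumption added so the result does not depend on the local time zone, and the object names are placeholders.

```r
# Three timestamps taken from the question's tables
times <- as.POSIXct(c("2014-12-12 00:00:00",
                      "2014-12-12 00:15:00",
                      "2015-06-04 11:45:00"), tz = "UTC")

# cut() bins the timestamps by calendar month
months <- cut(times, breaks = "month")

class(times)    # a date-time class (POSIXct)
class(months)   # a factor: the binned values are only labels
levels(months)  # the month labels that cut() created
```

Since `months` is a factor, a date or date-time scale has nothing continuous to work with, which is why the answer maps `Time_f` itself to x and uses the cut only for grouping.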
I have a data set that is something like this:
start_date end_date outcome
1 2014-07-18 2014-08-20 TRUE
2 2014-08-04 2014-09-23 TRUE
3 2014-08-01 2014-09-03 TRUE
4 2014-08-01 2014-09-03 TRUE
5 2014-12-10 2014-12-10 TRUE
6 2014-10-11 2014-11-07 TRUE
7 2015-04-27 2015-05-20 TRUE
8 2014-11-22 2014-12-25 TRUE
9 2015-03-24 2015-04-26 TRUE
10 2015-03-12 2015-04-10 FALSE
11 2014-05-29 2014-06-28 FALSE
12 2015-03-19 2015-04-20 TRUE
13 2015-03-25 2015-04-26 TRUE
14 2015-03-25 2015-04-26 TRUE
15 2014-07-09 2014-08-10 TRUE
16 2015-03-26 2015-04-26 TRUE
17 2014-07-09 2014-08-10 TRUE
18 2015-03-30 2015-04-28 TRUE
19 2014-03-13 2014-04-13 TRUE
20 2015-04-01 2015-04-29 TRUE
I want to plot a histogram where each bar corresponds to a month and it contains the proportion of FALSE / ALL = (FALSE + TRUE) in that month.
What is the easiest way to do this in R preferably using ggplot?
Here is one way. There may be better ways to do this, but I will leave what I tried. The main job was to create a new data frame for the graphic. Using your data above, I first converted factors to date objects. If you already have date objects in your data, you do not need this step. Then, I summarised your data for start_date and end_date using count(). I bound the two data frames and did a further calculation to get the proportion of FALSE for each month.
library(zoo)
library(dplyr)
library(ggplot2)
library(lubridate)
library(scales) # provides date_format() and date_breaks() used below
mutate_each(mydf, funs(as.POSIXct(., format = "%Y-%m-%d")), -outcome) %>%
mutate_each(funs(paste(year(.),"-",month(.), sep = "")), vars = -outcome) -> foo1;
count(foo1, start_date, outcome) %>% rename(date = start_date) -> foo2;
count(foo1, end_date, outcome) %>%
rename(date = end_date) %>%
bind_rows(foo2) %>%
group_by(date, outcome) %>%
summarize(total = sum(n)) %>%
summarize(prop = sum(total[outcome == FALSE]) / sum(total)) %>%
mutate(date = as.Date(as.yearmon(date))) -> foo3
ggplot(data = foo3, aes(x = date, y = prop)) +
geom_bar(stat = "identity") +
scale_x_date(labels = date_format("%Y-%m"), breaks = date_breaks("month")) +
theme(axis.text.x = element_text(angle = 90, vjust = 1))
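For comparison, here is a shorter sketch with current dplyr verbs. It is a simplification and not equivalent to the code above: it assigns each row to the month of its `start_date` only, whereas the answer counts both start and end dates. It uses five rows from the question (rows 1, 10, 11, 12, and 13), and the names `monthly` and `prop_false` are introduced here.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Rows 1, 10, 11, 12 and 13 of the question's data (start_date and outcome only)
mydf <- data.frame(
  start_date = as.Date(c("2014-07-18", "2015-03-12", "2014-05-29",
                         "2015-03-19", "2015-03-25")),
  outcome    = c(TRUE, FALSE, FALSE, TRUE, TRUE)
)

monthly <- mydf %>%
  # First day of the month each row starts in
  mutate(month = floor_date(start_date, unit = "month")) %>%
  group_by(month) %>%
  # mean() of a logical vector is the proportion of TRUE, so !outcome gives FALSE / ALL
  summarise(prop_false = mean(!outcome), .groups = "drop")

p <- ggplot(monthly, aes(x = month, y = prop_false)) +
  geom_col() +
  scale_x_date(date_breaks = "1 month", date_labels = "%Y-%m") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
```

With these five rows, March 2015 has one FALSE out of three rows, so its bar has height 1/3.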