Re-ordering the legend items in ggplot2 [closed] - r

Closed 9 years ago.
I have a table with following data
marks cut but xut
1 49 51 67
2 53 47 76
3 54 46 67
4 54 46 56
5 55 45 65
6 55 45 75
7 55 45 45
8 55 45 33
9 55 45 43
10 56 45 53
11 56 45 23
12 56 44 78
13 56 44 45
When I plot the graph, the legend is ordered cut, but, xut. I want it ordered xut, but, cut instead, i.e. I want to re-order the legend items and present them in the order I need.
Below is the code I have implemented:
install.packages("plyr")
install.packages("ggplot2")
install.packages("reshape2")
library("plyr")
library("reshape2")
library("ggplot2")
data=read.csv("data.csv")
attach(data)
data$marks <- factor(data$marks, levels
= data$marks[order(data$cut)])
c.data=melt(data, id.var="marks")
n.data = ddply(c.data,.(marks), transform, pos = cumsum(value) - 0.5*value)
n.data <- transform(n.data,variable = factor(levels = c("xut", "but", "cut")))
plot = ggplot(d.data, aes(x = marks, y = value)) + geom_bar(stat = "identity",mapping = aes(x = value, fill = variable)) + scale_y_continuous( breaks=seq(0,100, by = 10))+geom_text(aes(label = value, y = pos), size = 3, face="bold", colour="white") + scale_fill_manual(values=c("455555","333333","335566")) + theme(axis.line = element_line(),axis.text.x=element_text(angle=60,hjust=1,colour="white"),axis.text.y=element_text(colour="white"),axis.title.x = element_blank(),axis.title.y = element_blank(),panel.background = element_blank(),axis.ticks=element_blank()) + labs(fill="")+coord_cartesian(ylim=c(0,100)) + theme(legend.position = "bottom", legend.direction = "horizontal")

Reorder the levels of the variable with the groups in the long or melted version of the data. For example, using your data
foo <- read.table(text="marks cut but xut
1 49 51 67
2 53 47 76
3 54 46 67
4 54 46 56
5 55 45 65
6 55 45 75
7 55 45 45
8 55 45 33
9 55 45 43
10 56 45 53
11 56 45 23
12 56 44 78
13 56 44 45", header = TRUE)
Melt it into a suitable format
require(reshape2)
require(ggplot2)
bar <- melt(foo, id = "marks")
> head(bar)
marks variable value
1 1 cut 49
2 2 cut 53
3 3 cut 54
4 4 cut 54
5 5 cut 55
6 6 cut 55
Then set the levels on the variable factor containing the group labels
bar <- transform(bar,
variable = factor(variable, levels = c("xut", "but", "cut")))
Then plot
ggplot(bar) + geom_bar(mapping = aes(x = value, fill = variable))
As you don't show any plotting code I'm guessing what your actual plot code looks like, but as the above shows, at least the ordering is what you want...
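To make the mechanism concrete, here is a minimal sketch on toy data (the values and the use of geom_col are assumptions for illustration, not the asker's actual plot). It shows that the legend follows the level order of the fill factor, so setting the levels explicitly is what changes the legend.

```r
library(ggplot2)

# Toy long-format data; column names mirror the melted data above
bar <- data.frame(
  marks = rep(1:2, times = 3),
  variable = rep(c("cut", "but", "xut"), each = 2),
  value = c(49, 53, 51, 47, 67, 76)
)

# Built from a character column, the default level order is alphabetical
default_levels <- levels(factor(bar$variable))

# Explicit level order: the fill legend is drawn in this order
bar$variable <- factor(bar$variable, levels = c("xut", "but", "cut"))

p <- ggplot(bar, aes(x = factor(marks), y = value, fill = variable)) +
  geom_col()
```

Note that melt() orders the levels by column order rather than alphabetically, which is why the original legend read cut, but, xut.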

Related

Adding text in one of the four facets [duplicate]

I want to add a few texts in one facet out of four facets in my ggplot.
I am using the annotate function to add text, but it draws the text at the given location (x, y) in every facet. Because the variables have different y ranges in each facet, the text does not land at the desired location.
Please let me know what should be done. Thanks.
library(dplyr)
library(tidyr)
library(ggplot2)
df%>%
select(Date, Ca, Na, K, Mg)%>%
gather(var,value,-Date)%>%
ggplot(aes(as.Date(Date), value))+
geom_point()+
theme_bw()+
facet_wrap(~var,scales = 'free_y',ncol = 1)+
ylab(" (ppm) (ppm)
(ppm) (ppm)")+
facet_wrap(~var,scales = 'free_y',ncol = 1, strip.position = "right")+
geom_vline(aes(xintercept = as.Date("2021-04-28")), col = "red")+
geom_vline(aes(xintercept = as.Date("2021-04-28")), col = "red")+
geom_vline(aes(xintercept = as.Date("2021-04-29")), col = "red")+
theme(axis.title = element_text(face="bold"))+
theme(axis.text = element_text(face="bold"))+
xlab('Date')+
theme(axis.title.x = element_text(margin = margin(t = 10)))+
theme(axis.title.y = element_text(margin = margin(r = 10)))+
annotate("text", label = "E1", x = as.Date("2021-04-28"), y = 2.8)
This is the code I am using. I want to label the xintercept lines E1, E2, E3 (from left to right) at the top of the plot, i.e. above the first facet (variable Ca). Any suggestions?
Here is a part of my data:
df <- read.table(text = "
Date Ca K Mg Na
2/18/2021 1 25 21 19
2/22/2021 2 26 22 20
2/26/2021 3 27 23 21
3/4/2021 4 28 5 22
3/6/2021 5 29 6 8
3/10/2021 6 30 7 9
3/13/2021 7 31 8 10
3/17/2021 8 32 9 11
3/20/2021 9 33 10 12
3/23/2021 10 34 11 13
3/27/2021 11 35 12 14
3/31/2021 12 36 13 15
4/3/2021 13 37 14 16
4/7/2021 14 38 15 17
4/10/2021 15 39 16 18
4/13/2021 16 40 17 19
4/16/2021 17 41 18 20
4/19/2021 8 42 19 21
4/22/2021 9 43 20 22
4/26/2021 0 44 21 23
4/28/2021 1 45 22 24
4/28/2021 2 46 23 25
4/28/2021 3 47 24 26
4/28/2021 5 48 25 27
4/29/2021 6 49 26 28
5/4/2021 7 50 27 29
5/7/2021 8 51 28 30
5/8/2021 9 1 29 31
5/10/2021 1 2 30 32
5/29/2021 3 17 43 45
5/31/2021 6 18 44 46
6/1/2021 4 19 45 47
6/2/2021 8 20 46 48
6/3/2021 2 21 47 49
6/7/2021 3 22 48 50
6/10/2021 5 23 49 51
6/14/2021 3 5 50 1
6/18/2021 1 6 51 2
", header = TRUE)
Prepare the data before plotting, and make a separate data frame for the text annotation:
dfplot <- df %>%
select(Date, Ca, Na, K, Mg) %>%
#convert to date class before plotting
mutate(Date = as.Date(Date, "%m/%d/%Y")) %>%
#using pivot instead of gather. gather is superseded.
#gather(var, value, -Date)
pivot_longer(cols = 2:5, names_to = "grp", values_to = "ppm")
dftext <- data.frame(grp = "Ca", # we want text to show up only on "Ca" facet.
ppm = max(dfplot[ dfplot$grp == "Ca", "ppm" ]),
Date = as.Date(c("2021-04-27", "2021-04-28", "2021-04-29")),
label = c("E1", "E2", "E3"))
After cleaning up your code, we can use geom_text with dftext:
ggplot(dfplot, aes(Date, ppm)) +
geom_point() +
facet_wrap(~grp, scales = 'free_y',ncol = 1, strip.position = "right") +
geom_vline(xintercept = dftext$Date, col = "red") +
geom_text(aes(x = Date, y = ppm, label = label), data = dftext, nudge_y = -2)
Try using ggrepel library to avoid label overlap, replace geom_text with one of these:
#geom_text_repel(aes(x = Date, y = ppm, label = label), data = dftext)
#geom_label_repel(aes(x = Date, y = ppm, label = label), data = dftext)
After cleaning up the code and seeing the plot, I think this post is a duplicate of Annotation on only the first facet of ggplot in R? .
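The reason this works can be seen in a stripped-down sketch (toy data; the names pts and lab are hypothetical). A layer whose data frame contains the faceting variable is only drawn in the panels whose value it matches, whereas annotate() carries no faceting variable and is therefore repeated in every panel.

```r
library(ggplot2)

# Toy data with two facets
pts <- data.frame(
  Date = as.Date(rep(c("2021-04-26", "2021-04-28", "2021-04-29"), times = 2)),
  grp = rep(c("Ca", "K"), each = 3),
  ppm = c(0, 1, 6, 44, 45, 49)
)

# One row per label; grp = "Ca" restricts the label to the Ca facet
lab <- data.frame(
  Date = as.Date("2021-04-28"),
  grp = "Ca",
  ppm = 6,
  label = "E1"
)

p <- ggplot(pts, aes(Date, ppm)) +
  geom_point() +
  facet_wrap(~grp, scales = "free_y", ncol = 1) +
  geom_text(aes(label = label), data = lab)

# Panels in which the text layer (layer 2) is actually drawn
label_panels <- unique(layer_data(p, 2)$PANEL)
```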

How can I stop geom_point from removing rows in order to create a map

My intention is to plot several locations for which I have the longitude and the latitude onto a map (as simple dots). The locations are distributed across Uganda.
print(locations)
Latitude Longitude
1 0.482980 30.212160
2 0.647717 30.315984
3 0.44735 30.18063
4 0.58416316 30.2066327
5 0.60012 30.19998
6 0.433483 30.20179
7 0.625317 30.224837
8 0.654277 30.251667
9 0.387517 30.197475
10 0.607402 30.292068
11 0.770128 30.403456
12 0.767266 30.414246
13 0.777873 30.389111
14 0.631774 30.290356
15 0.734015 30.279161
16 0.722133 30.277941
17 0.66322994 30.22795225
18 0.66900827 30.21357739
19 0.450372 30.197764
20 0.493699 30.250891
21 0.479716 30.180958
22 0.483242 30.284576
23 0.645044 30.321270
24 0.602389 30.275637
25 0.868827 30.465939
26 0.631194 30.263565
27 0.631576 30.263855
28 0.413701 30.247934
29 0.67135 30.2675
30 0.492360 30.223620
31 0.81481 30.39311
32 0.396665 30.26309
33 0.666170 30.308960
34 0.610067 30.306058
35 0.677144 30.196810
36 0.677144 30.196810
37 0.555555 30.231681
38 0.63874 30.231691
39 0.512953 30.207603
40 0.442291 30.279173
41 0.575658 30.310231
42 0.423129 30.211289
43 0.623838 30.256925
44 0.639643 30.341620
45 0.653550 30.170428
46 0.752630 30.401040
47 0.478544 30.191938
48 0.48114 30.198471
49 0.679820 30.259800
50 0.581293 30.158619
51 0.730410 30.376620
52 0.504059 30.178556
53 0.587441 30.310364
54 0.588072 30.277877
55 0.70893233 30.19008103
56 0.81699 30.41799
57 0.609300 30.271613
58 0.595226 30.315580
59 0.459029 30.277659
60 0.727873 30.216385
61 0.647722 30.217760
62 0.690064 30.193881
63 0.512339 30.140107
64 0.649181 30.302570
65 0.649881 30.303974
66 0.649736 30.302481
67 0.722082 30.226063
68 0.463480 30.203050
69 0.692930 30.281880
70 0.652864 30.229106
71 0.491520 30.233780
72 0.778370 30.415920
73 0.682090 30.276460
74 0.564670 30.148920
75 0.655588 30.243047
76 0.647717 30.315984
77 0.518769 30.159384
78 0.683070 30.339650
79 0.662980 30.253890
80 0.591899 30.145857
81 0.699690 30.344650
82 0.441030 30.177240
83 0.612202 30.213022
84 0.472530 30.236980
85 0.473722 30.165020
86 0.499181 30.159485
87 0.6598021 30.29158
88 0.6601362 30.29119
89 0.48386 30.23142
90 0.679470 30.282190
91 0.685860 30.271070
92 0.528797 30.171251
93 0.514863 30.243976
94 0.603612 30.258705
95 0.484708 30.142588
96 0.523857 30.233239
97 0.395356 30.215351
98 0.612247 30.269341
99 0.55878815 30.17702095
100 0.747630 30.384240
101 0.538778 30.326353
102 0.554198 30.299815
103 0.504410 30.298260
104 0.418705 30.259747
105 0.669850 30.324100
106 0.654277 30.251667
107 0.460830 30.214070
108 0.378725 30.216429
Here is what I managed to do so far:
locations$Latitude=as.numeric(levels(locations$Latitude))[locations$Latitude]
locations$Longitude=as.numeric(levels(locations$Longitude))[locations$Longitude]
uganda <- raster::getData('GADM', country='UGA', level=1)
ggplot() +
geom_polygon(data = uganda,
aes(x = long, y = lat, group = group),
colour = "grey10", fill = "#fff7bc") +
geom_point(data = locations,
aes(x = Longitude, y = Latitude)) +
coord_map() +
theme_bw() +
xlab("Longitude") + ylab("Latitude")
As you can see by executing the code above, the map of Uganda is loaded from the GADM database and displayed correctly. However, I get the following warning message:
Warning:
Removed 108 rows containing missing values (geom_point).
I read in another post (Explain ggplot2 warning: "Removed k rows containing missing values") that this warning might be caused by erroneous axis ranges. I am not familiar with plotting geographic data and GADM maps, though, so I was not able to adjust the ranges (I guess this would be done in the geom_polygon part). Can somebody help me, please?
I am not sure why you are running your first part of the code:
locations$Latitude=as.numeric(levels(locations$Latitude))[locations$Latitude]
locations$Longitude=as.numeric(levels(locations$Longitude))[locations$Longitude]
If you don't run that part, there won't be any NA anymore. So if you run the following code, it should work:
library(tidyverse)
library(raster)
uganda <- raster::getData('GADM', country='UGA', level=1)
ggplot() +
geom_polygon(data = uganda,
aes(x = long, y = lat, group = group),
colour = "grey10", fill = "#fff7bc") +
geom_point(data = locations,
aes(x = Longitude, y = Latitude)) +
coord_map() +
theme_bw() +
xlab("Longitude") + ylab("Latitude")
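A likely explanation for the NAs, sketched under the assumption that the coordinate columns were read in as character (or numeric) rather than as factors: the as.numeric(levels(x))[x] idiom is only meaningful for factors. On anything else, levels(x) is NULL, so indexing the empty result yields NA. The vectors below are illustrative, not the asker's data frame.

```r
# The idiom works as intended on a factor ...
f <- factor(c("0.482980", "0.647717"))
from_factor <- as.numeric(levels(f))[f]

# ... but on a character vector levels() is NULL, so every lookup is NA
ch <- c("0.482980", "0.647717")
from_character <- as.numeric(levels(ch))[ch]

# A conversion that is safe for both factor and character input
safe_factor <- as.numeric(as.character(f))
safe_character <- as.numeric(as.character(ch))
```

If a conversion is needed at all, as.numeric(as.character(x)) behaves the same whether the column is a factor or a character vector.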

ggplot2 for a newbie multiple columns grouped in a bar chart? [duplicate]

I have the following data
Input Rtime Rcost Rsolutions Btime Bcost
1 12 proc. 1 36 614425 40 36
2 15 proc. 1 51 534037 50 51
3 18-proc 5 62 1843820 66 66
4 20-proc 4 68 1645581 104400 73
5 20-proc(l) 4 64 1658509 14400 65
6 21-proc 10 78 3923623 453600 82
I want to create a grouped bar chart from this data such that x-axis contains Input field (as groups) and y axis represent the log scale for the Rtime and Btime fields (the two bars).
All solutions/examples I checked online had similar data put into a three-column layout. I do not know how to use the data I have to generate the grouped bar chart, or whether there is a way to convert this data (manually converting is not an option because it is a huge file with a lot of rows) into an R- and ggplot-compatible data format.
Edit :
Graph generated using gncs solution
As requested, a ggplot2 solution that also uses reshape2:
library(reshape2)
library(ggplot2)
df <- read.table(text = " Input Rtime Rcost Rsolutions Btime Bcost
1 12-proc. 1 36 614425 40 36
2 15-proc. 1 51 534037 50 51
3 18-proc 5 62 1843820 66 66
4 20-proc 4 68 1645581 104400 73
5 20-proc(l) 4 64 1658509 14400 65
6 21-proc 10 78 3923623 453600 82",header = TRUE,sep = "")
dfm <- melt(df[,c('Input','Rtime','Btime')],id.vars = 1)
ggplot(dfm,aes(x = Input,y = value)) +
geom_bar(aes(fill = variable),stat = "identity",position = "dodge") +
scale_y_log10()
Note a style difference here: since log10(1) = 0, ggplot2 treats a value of 1 as a bar of zero height and doesn't plot anything, whereas barplot plots a little stub (which in my opinion is a little misleading).
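A quick check of which bars are affected, using the Rtime values from the table above (a small illustration, not part of the original answer):

```r
# Rtime column from the example data
rtime <- c(1, 1, 5, 4, 4, 10)

# Bar heights on the log10 scale; a value of 1 maps to height 0
log_height <- log10(rtime)
zero_height <- log_height == 0
```

Here the first two inputs have Rtime = 1, so those are the bars that appear to be missing.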
I think I understand the problem and this is what I would suggest (short run - option):
data <- read.table("data.txt", header=TRUE)
subset <- t(data.frame(data$Rtime, data$Btime))
barplot(subset, legend = c("Rtime", "Btime"), names.arg=data$Input, log="y", beside=TRUE)
Is that what you want? It is kind of dirty, but it does the job.
Update: code corrected.
As requested, a ggplot2 solution that also uses pivot_longer() https://tidyr.tidyverse.org/reference/pivot_longer.html to transform the data into a format that geom_bar() can easily plot.
library(tidyr)  # pivot_longer() lives in tidyr, not dplyr
library(ggplot2)
df <- read.table(text = " Input Rtime Rcost Rsolutions Btime Bcost
1 12-proc. 1 36 614425 40 36
2 15-proc. 1 51 534037 50 51
3 18-proc 5 62 1843820 66 66
4 20-proc 4 68 1645581 104400 73
5 20-proc(l) 4 64 1658509 14400 65
6 21-proc 10 78 3923623 453600 82",
header = TRUE,sep = "")
dfm <- pivot_longer(df, -Input, names_to="variable", values_to="value")
## pivot_longer takes the input data frame, excludes the Input field from the transformation, turns the remaining column names into the variable "variable" (often called the "key"), and assigns the values to the variable "value".
ggplot(dfm,aes(x = Input,y = value)) +
geom_bar(aes(fill = variable),stat = "identity",position = "dodge") +
scale_y_log10()
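One caveat, sketched below on a shortened copy of the data (an assumption for brevity): pivot_longer(df, -Input) reshapes every column except Input, so the cost and solution columns would also become bars. To get only the Rtime and Btime bars the question asks for, the columns can be selected first, mirroring the melt() call in the reshape2 answer.

```r
library(tidyr)

# Shortened copy of the example data (three rows, four columns)
df <- data.frame(
  Input = c("12-proc.", "15-proc.", "18-proc"),
  Rtime = c(1, 1, 5),
  Rcost = c(36, 51, 62),
  Btime = c(40, 50, 66)
)

# Keep only the columns to plot, then reshape everything except Input
dfm <- pivot_longer(
  df[, c("Input", "Rtime", "Btime")],
  cols = -Input,
  names_to = "variable",
  values_to = "value"
)
```

The resulting dfm has one row per Input and variable, which is the three-column layout that geom_bar with position = "dodge" expects.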
joran's answer helped me a lot, but I had to use stat="identity" in the ggplot statement like that:
ggplot(dfm, aes(x = Input,y = value)) +
geom_bar(aes(fill = variable), position = "dodge", stat="identity") +
scale_y_log10()
My version of R is 3.2.2 and ggplot2 version 1.0.1
Thanks.

Remove link between time series and add minor date tick on x_axis in ggplot

I was trying to plot a time series composed of weekly averages. Here is the plot that I have obtained:
[weekly averages A]
[1]: https://i.stack.imgur.com/XMGMs.png
As you can see, the time series does not cover every year completely, so where I have no data ggplot links two subsequent years. I think I have to group the data in some way, but I do not understand how. Here is the code:
df4 <- data.frame(df$Date, df$A)
colnames(df4)<- c("date","A")
df4$date <- as.Date(df4$date,"%Y/%m/%d")
df4$week_day <- as.numeric(format(df4$date, format='%w'))
df4$endofweek <- df4$date + (6 - df4$week_day)
week_aveA <- df4 %>%
group_by(endofweek) %>%
summarise_all(list(mean=mean), na.rm=TRUE) %>%
na.omit()
g1 = ggplot() +
geom_step(data=week_aveA, aes(group = 1, x = (endofweek), y = (A_mean)), colour="gray25") +
scale_y_continuous(expand = c(0, 0), limits = c(0, 2500)) +
scale_x_date(breaks="year", labels=date_format("%Y")) +
labs(y = expression(A~ ~index),
x = NULL) +
theme(axis.text.x = element_text(size=10),
axis.title = element_text(size=10))
Here is an extract (the first years) of the dataset:
endofweek date_mean A_mean week_day_mean
1 20/03/2010 17/03/2010 939,2533437 3
2 27/03/2010 24/03/2010 867,3620121 3
3 03/04/2010 31/03/2010 1426,791222 3
4 10/04/2010 07/04/2010 358,5698314 3
5 17/04/2010 13/04/2010 301,1815352 2
6 24/04/2010 21/04/2010 273,4922895 3,333333333
7 01/05/2010 28/04/2010 128,5989633 3
8 08/05/2010 05/05/2010 447,8858881 3
9 15/05/2010 12/05/2010 387,9828891 3
10 22/05/2010 19/05/2010 138,0770986 3
11 29/05/2010 26/05/2010 370,2147933 3
12 05/06/2010 02/06/2010 139,0451791 3
13 12/06/2010 09/06/2010 217,1286356 3
14 19/06/2010 16/06/2010 72,36972411 3
15 26/06/2010 23/06/2010 282,2911902 3
16 03/07/2010 30/06/2010 324,3215936 3
17 10/07/2010 07/07/2010 210,568691 3
18 17/07/2010 14/07/2010 91,76930829 3
19 24/07/2010 21/07/2010 36,4211218 3,666666667
20 31/07/2010 28/07/2010 37,53981103 3
21 07/08/2010 04/08/2010 91,33282642 3
22 14/08/2010 11/08/2010 28,38587352 3
23 21/08/2010 18/08/2010 58,72836406 3
24 28/08/2010 24/08/2010 102,1050612 2,5
25 04/09/2010 02/09/2010 13,45357513 4,5
26 11/09/2010 08/09/2010 51,24017212 3
27 18/09/2010 15/09/2010 159,7395663 3
28 25/09/2010 21/09/2010 62,71136678 2
29 02/04/2011 31/03/2011 1484,661164 4
30 09/04/2011 06/04/2011 656,1827964 3
31 16/04/2011 13/04/2011 315,3097313 3
32 23/04/2011 20/04/2011 293,2904042 3
33 30/04/2011 26/04/2011 255,7517519 2,4
34 07/05/2011 04/05/2011 360,7035289 3
35 14/05/2011 11/05/2011 342,0902797 3
36 21/05/2011 18/05/2011 386,1380421 3
37 28/05/2011 24/05/2011 418,9624807 2,833333333
38 04/06/2011 01/06/2011 112,7568 3
39 11/06/2011 08/06/2011 85,17855619 3,2
40 18/06/2011 15/06/2011 351,8714638 3
41 25/06/2011 22/06/2011 139,7936898 3
42 02/07/2011 29/06/2011 68,57716191 3,6
43 09/07/2011 06/07/2011 62,31823822 3
44 16/07/2011 13/07/2011 80,7328917 3
45 23/07/2011 20/07/2011 114,9475331 3
46 30/07/2011 27/07/2011 90,13118758 3
47 06/08/2011 03/08/2011 43,29372258 3
48 13/08/2011 10/08/2011 49,39935204 3
49 20/08/2011 16/08/2011 133,746822 2
50 03/09/2011 31/08/2011 76,03928942 3
51 10/09/2011 05/09/2011 27,99834637 1
52 24/03/2012 23/03/2012 366,2625797 5,5
53 31/03/2012 28/03/2012 878,8535513 3
54 07/04/2012 04/04/2012 1029,909052 3
55 14/04/2012 11/04/2012 892,9163416 3
56 21/04/2012 18/04/2012 534,8278693 3
57 28/04/2012 25/04/2012 255,1177585 3
58 05/05/2012 02/05/2012 564,5280546 3
59 12/05/2012 09/05/2012 767,5018168 3
60 19/05/2012 16/05/2012 516,2680148 3
61 26/05/2012 23/05/2012 241,2113073 3
62 02/06/2012 30/05/2012 863,6123397 3
63 09/06/2012 06/06/2012 201,2019288 3
64 16/06/2012 13/06/2012 222,9955486 3
65 23/06/2012 20/06/2012 91,14166632 3
66 30/06/2012 27/06/2012 26,93145693 3
67 07/07/2012 04/07/2012 67,32183278 3
68 14/07/2012 11/07/2012 46,25297513 3
69 21/07/2012 18/07/2012 81,34359825 3,666666667
70 28/07/2012 25/07/2012 49,59130851 3
71 04/08/2012 01/08/2012 44,13438077 3
72 11/08/2012 08/08/2012 30,15773151 3
73 18/08/2012 15/08/2012 57,47256772 3
74 25/08/2012 22/08/2012 31,9109555 3
75 01/09/2012 29/08/2012 52,71058484 3
76 08/09/2012 04/09/2012 24,52495229 2
77 06/04/2013 01/04/2013 1344,388042 1,5
78 13/04/2013 10/04/2013 1304,838687 3
79 20/04/2013 17/04/2013 892,620141 3
80 27/04/2013 24/04/2013 400,1720434 3
81 04/05/2013 01/05/2013 424,8473083 3
82 11/05/2013 08/05/2013 269,2380208 3
83 18/05/2013 15/05/2013 238,9993749 3
84 25/05/2013 22/05/2013 128,4096151 3
85 01/06/2013 29/05/2013 158,5576121 3
86 08/06/2013 05/06/2013 175,2036942 3
87 15/06/2013 12/06/2013 79,20250839 3
88 22/06/2013 19/06/2013 126,9065428 3
89 29/06/2013 26/06/2013 133,7480108 3
90 06/07/2013 03/07/2013 218,0092943 3
91 13/07/2013 10/07/2013 54,08460936 3
92 20/07/2013 17/07/2013 91,54285041 3
93 27/07/2013 24/07/2013 44,64567928 3
94 03/08/2013 31/07/2013 229,5067999 3
95 10/08/2013 07/08/2013 49,70729373 3
96 17/08/2013 14/08/2013 53,38618335 3
97 24/08/2013 21/08/2013 217,2800997 3
98 31/08/2013 28/08/2013 49,43590136 3
99 07/09/2013 04/09/2013 64,88783029 3
100 14/09/2013 11/09/2013 11,04300773 3
So in the end I have one main question: how can I eliminate the connection between the years? And an aesthetic question: how can I add minor ticks on the x-axis? At least one every 6 months, just to make the plot easier to read.
Thanks in advance for any suggestion!
Edit
This is the code I tried with the suggestion; maybe I mistyped some part of it.
library(tidyverse)
library(dplyr)
library(lubridate)
df4 <- data.frame(df$Date, df$A)
colnames(df4)<- c("date","A")
df4$date <- as.Date(df4$date,"%Y/%m/%d")
df4$week_day <- as.numeric(format(df4$date, format='%w'))
df4$endofweek <- df4$date + (6 - df4$week_day)
week_aveA <- df4 %>%
group_by(endofweek) %>%
summarise_all(list(mean=mean), na.rm=TRUE) %>%
na.omit()
week_aveA$endofweek <- as.Date(week_aveA$endofweek,"%d/%m/%Y")
week_aveA$A_mean <- as.numeric(gsub(",", ".", week_aveA$A_mean))
week_aveA$week_day_mean <- as.numeric(gsub(",", ".", week_aveA$week_day_mean))
week_aveA$year <- format(week_aveA$endofweek, "%Y")
library(ggplot2)
library(methods)
library(scales)
mylabel <- function(x) {
ifelse(grepl("-07-01$", x), "", format(x, "%Y"))
}
ggplot() +
geom_step(data=week_aveA, aes(x = endofweek, y = A_mean, group = year), colour="gray25") +
scale_y_continuous(expand = c(0, 0), limits = c(0, 2500)) +
scale_x_date(breaks="6 month", labels = mylabel) +
labs(y = expression(A~ ~index),
x = NULL) +
theme(axis.text.x = element_text(size=10),
axis.title = element_text(size=10))
You have to group by year:
Add a variable with the year to your dataset
Map the year variable on the group aesthetic
For the ticks, increase the number of breaks. If you want only ticks but no labels, you can use a custom function to get rid of the unwanted labels, e.g. my approach below sets the breaks to "6 month" but replaces the mid-year labels with an empty string:
week_aveA$endofweek <- as.Date(week_aveA$endofweek,"%d/%m/%Y")
week_aveA$A_mean <- as.numeric(gsub(",", ".", week_aveA$A_mean))
week_aveA$week_day_mean <- as.numeric(gsub(",", ".", week_aveA$week_day_mean))
week_aveA$year <- format(week_aveA$endofweek, "%Y")
library(ggplot2)
mylabel <- function(x) {
ifelse(grepl("-07-01$", x), "", format(x, "%Y"))
}
ggplot() +
geom_step(data=week_aveA, aes(x = endofweek, y = A_mean, group = year), colour="gray25") +
scale_y_continuous(expand = c(0, 0), limits = c(0, 2500)) +
scale_x_date(breaks="6 month", labels = mylabel) +
labs(y = expression(A~ ~index),
x = NULL) +
theme(axis.text.x = element_text(size=10),
axis.title = element_text(size=10))
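The two ideas can be checked in isolation on a toy series (the data frame toy and the explicit break vector are assumptions made for this sketch; the answer itself passes the string "6 month" as breaks). Mapping year to the group aesthetic yields one path per year, and the label function blanks the 1 July breaks while keeping their ticks.

```r
library(ggplot2)

# Hypothetical weekly series with a gap between two seasons
toy <- data.frame(
  endofweek = as.Date(c("2010-03-20", "2010-03-27", "2010-04-03",
                        "2011-04-02", "2011-04-09", "2011-04-16")),
  A_mean = c(939, 867, 1427, 1485, 656, 315)
)
toy$year <- format(toy$endofweek, "%Y")

# Blank out the mid-year labels but keep the tick marks
mylabel <- function(x) {
  ifelse(grepl("-07-01$", x), "", format(x, "%Y"))
}

# Explicit half-yearly breaks so they fall on 1 January and 1 July
half_years <- seq(as.Date("2010-01-01"), as.Date("2012-01-01"), by = "6 months")

p <- ggplot(toy, aes(x = endofweek, y = A_mean, group = year)) +
  geom_step(colour = "gray25") +
  scale_x_date(breaks = half_years, labels = mylabel)

# One group per year means no line is drawn across the gap
n_paths <- length(unique(layer_data(p)$group))
example_labels <- mylabel(as.Date(c("2010-01-01", "2010-07-01")))
```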

