How can I create an ordered bar plot in ggplot2 with both positive and negative values? Here is the data:
down -11
down -10
down -9
down -6
up 6
up 6
up 6
up 6
up 7
up 7
up 8
up 8
up 8
up 8
up 8
up 8
up 8
up 10
up 10
up 11
up 11
up 12
up 14
up 14
up 21
up 21
up 24
I have tried this code:
ggplot(GO, aes(x = d1, y = order(d2), fill = factor(d1))) +
geom_bar(stat = "identity", position = "identity", width = 0.6)
This does not work.
I would like the bars to be ordered by value. Can anybody suggest some code?
Please check out my answer to a similar question. Put your vector in the order you want, then add + scale_y_discrete(limits = yourOrderedData), and the plot should follow that order.
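The suggestion above can be sketched roughly as follows. This is only an illustration under some assumptions: the column names d1 and d2 come from the question, the id column and the bar_order vector are hypothetical helpers, and scale_x_discrete is used instead of scale_y_discrete because in a vertical bar plot the discrete axis is x.

```r
library(ggplot2)

# Values from the question: d1 is the direction label, d2 the value.
GO <- data.frame(
  d1 = c(rep("down", 4), rep("up", 23)),
  d2 = c(-11, -10, -9, -6, 6, 6, 6, 6, 7, 7, 8, 8, 8, 8, 8, 8, 8,
         10, 10, 11, 11, 12, 14, 14, 21, 21, 24)
)

# Hypothetical id column so that every row gets its own bar.
GO$id <- as.character(seq_len(nrow(GO)))

# The ids in the order the bars should appear (ascending by value).
bar_order <- GO$id[order(GO$d2)]

p <- ggplot(GO, aes(x = id, y = d2, fill = factor(d1))) +
  geom_col(width = 0.6) +
  scale_x_discrete(limits = bar_order)  # fixes the bar order
p
```

Note that y is mapped to d2 itself, not to order(d2): order() returns row positions, not values, so plotting it would hide the negative bars.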
I have a dataset that I would like to visualize with barplot(). My question is: why do some labels not show when added with text(), and how can I solve this?
For example, this is my table:
table(test$Freq)
2 3 4 5 6 7 8 9 10 11 12 14 16 44
6338 2544 1072 394 102 29 11 9 5 2 3 1 1 1
And the following barplot will miss the first label:
xx <- barplot(table(test$Freq))
text(x = xx, y = test$Freq, label = test$Freq, pos = 3, cex = 0.8, col = "red")
It looks like the text is being plotted outside of your graph.
Try adjusting the ylim value when you call barplot. This should solve your problem.
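A minimal sketch of that fix, assuming the counts from the table in the question are stored in a named vector (the name counts and the 10% headroom factor are my own choices, not from the original post):

```r
# Counts copied from the table in the question.
counts <- c(6338, 2544, 1072, 394, 102, 29, 11, 9, 5, 2, 3, 1, 1, 1)
names(counts) <- c(2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 44)

# barplot() returns the x midpoints of the bars; keep them for text().
# Extending ylim leaves headroom above the tallest bar for its label.
xx <- barplot(counts, ylim = c(0, max(counts) * 1.1))

# pos = 3 draws each label just above its bar.
text(x = xx, y = counts, labels = counts, pos = 3, cex = 0.8, col = "red")
```

Without the extra headroom, the label of the tallest bar (the first one here) falls outside the plotting region and is clipped.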
I have data in table form that looks like this in R:
V1 V2
1 19 -1539
2 7 -1507
3 3 -1446
4 7 -1427
5 8 -1401
6 2 -422
7 22 4178
8 5 4277
9 10 4303
10 18 4431
....200 million more lines to go
I would like to draw a density plot of the values in the second column for each label in the first column (i.e. each label has one density curve, all on the same graph), but I don't know how. Any suggestions?
If I understood the question correctly, this would end up looking somewhat like a density heatmap (considering there are 200 million observations in total and V1 varies over a fairly wide range).
For that I would try ggplot and stat_binhex:
df <- read.table(text="V1 V2
1 19 -1539
2 7 -1507
3 3 -1446
4 7 -1427
5 8 -1401
6 2 -422
7 22 4178
8 5 4277
9 10 4303
10 18 4431")
library(ggplot2)
ggplot(data = df, aes(V1, V2)) +
  stat_binhex() +
  scale_fill_gradient(low = "red", high = "steelblue") +
  scale_y_continuous() +
  theme_bw()
stat_binhex should work well with large data and has several parameters that help with presentation, such as bins and binwidth (see ?stat_binhex).
OK, I figured it out myself.
ggplot(data, aes(x=V2, color=V1)) + geom_density(aes(group=V1))
should be able to do that.
However, there are two things I need to make sure of first for it to run:
V1 is a factor
V2 is a numerical value
The data was not read in by read.table the way I wanted, so I had to do the following before using ggplot:
data$V1 = as.factor(data$V1)
data$V2 = as.numeric(as.character(data$V2))
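Putting those steps together, a small sketch might look like this. The data here is simulated purely for illustration (three made-up labels with 50 values each, with V2 deliberately stored as text to mimic a badly typed import); only the conversion and the plotting call follow the answer.

```r
library(ggplot2)

# Simulated stand-in for the real table (an assumption, not the asker's data).
set.seed(1)
data <- data.frame(
  V1 = rep(c(3, 7, 19), each = 50),
  V2 = as.character(round(rnorm(150, mean = rep(c(-1500, 0, 4000), each = 50),
                                sd = 300))),
  stringsAsFactors = FALSE
)

# V1 must be a factor so each label gets its own curve and color;
# V2 must be numeric so a density can be estimated.
data$V1 <- as.factor(data$V1)
data$V2 <- as.numeric(as.character(data$V2))

p <- ggplot(data, aes(x = V2, color = V1)) +
  geom_density(aes(group = V1))
p
```

The as.character() step matters when V2 was read in as a factor: calling as.numeric() on a factor directly returns the internal level codes rather than the original numbers.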
I have a dataset of about 1000 records. Here is a sample of it:
Var1 Freq
3 Abhay Jadhav 22
4 Abhijit Rana 8
5 Abhinav Sahu 24
6 Abhishek Chaudhary 22
7 Abhishek Dutt 7
8 Abhishek Gautam 7
9 Abhishek Mishra 13
10 Abhishek Mukherjee 23
11 Abhishek Nair 22
12 Abhishek Panigrahi 15
13 Abhishek Tiwari 21
14 Abzal Ayub 5
15 Adhiraj Banerjee 7
I want to plot the number of Var1 items that fall within each range of Freq (1..5, 6..10, 11.., and so on).
For example:
1..5 => 3 Var1 items
6..10 => 10 Var1 items
I would like to use ggplot for this.
I tried a normal plot with breaks but was not impressed, and I intend to use ggplot only.
I am fine with a histogram, a bar plot, or any better option.
I think this is what you're looking for:
df$group <- cut(df$Freq, breaks = seq(0, max(df$Freq) + 4, by = 5), include.lowest = TRUE)
ggplot(df, aes(x = group)) + geom_bar()
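To see what the binning does, here is a small sketch using the 13 sample rows from the question. The drop = FALSE argument is my own addition (not part of the answer) so that empty ranges still appear on the axis.

```r
library(ggplot2)

# Freq values of the 13 sample rows shown in the question.
df <- data.frame(
  Freq = c(22, 8, 24, 22, 7, 7, 13, 23, 22, 15, 21, 5, 7)
)

# Bin Freq into ranges of width 5: [0,5], (5,10], (10,15], ...
df$group <- cut(df$Freq,
                breaks = seq(0, max(df$Freq) + 4, by = 5),
                include.lowest = TRUE)

# Number of records per range.
table(df$group)

# geom_bar() counts the rows in each group; drop = FALSE keeps empty bins.
p <- ggplot(df, aes(x = group)) +
  geom_bar() +
  scale_x_discrete(drop = FALSE)
p
```

For the sample rows, the (15,20] range is empty, which is why keeping unused levels on the axis can be useful.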
I have a 12x13 matrix that looks like this:
monat beob werex_00 werex_11 werex_22 werex_33 werex_44 werex_55 werex_66 werex_77 werex_88 werex_99 Min Max
1 22.4930171 9.1418697 8.1558828 8.0312839 10.013298 8.8922567 9.395811 10.7933080 6.5136136 8.721697 10.279974 0.108381 59.65309
2 25.1414834 13.5886794 9.1694683 10.8709352 13.021066 10.3316655 10.579970 17.0555902 7.5915886 11.035921 13.366310 0.924013 66.94970
3 33.8286673 16.3800292 10.0202342 11.3072626 17.674761 16.1370288 15.018551 15.3331395 12.6856599 15.479521 13.929905 -0.794309 78.78572
4 22.0579421 11.9930633 8.4899130 8.2304118 12.987301 7.8763578 8.554007 12.4956321 9.4723508 7.057423 7.688662 -10.496481 49.01380
5 2.5535161 -2.4503375 -4.2354520 -3.6309377 -2.969866 -4.5876993 -5.383716 -3.2612018 -5.2054387 -2.780719 -4.359513 -19.579135 32.54282
6 -2.4405826 -8.8534136 -9.4666674 -7.4249244 -7.820072 -9.1485440 -8.546798 -7.8179739 -7.4222923 -10.978398 -12.644807 -22.821617 18.62139
7 -2.2580848 -6.7569968 -8.3390114 -8.8757506 -8.248305 -8.4171552 -7.760800 -5.7471163 -8.7864075 -6.239596 -8.870658 -22.933219 20.84375
8 -0.3448858 -5.6683742 -5.0467756 -5.7201820 -2.800106 -5.9640095 -5.011171 -3.3557601 -2.8967683 -4.407761 -6.146411 -17.042893 17.86556
9 3.3963303 0.4305926 -0.8554308 -0.9985536 -1.184610 -0.5520555 0.347758 -0.3838614 -0.2199835 -1.174712 -1.630363 -8.533647 19.66163
10 5.1839209 1.6050281 1.1578316 1.8503193 2.327975 1.6633771 1.557532 1.5563157 2.2776155 1.667714 1.333829 -4.686715 31.17342
11 9.2551418 4.4810518 2.9992301 4.9848408 3.824927 4.2413024 3.939119 5.4256008 3.5804488 4.965302 3.790589 -1.615777 43.90991
12 18.2233848 7.7648233 6.3344735 7.3477135 6.573620 7.1884950 7.428654 7.3119002 6.9405167 7.663072 8.342437 0.014096 62.83760
These are time series of a certain value. In the next step I plot them with ggplot(). To do that, I used melt() to reshape the matrix for plotting:
R1_Grundwasserneubildung_Rg1Rg2_Monat_mean_druckreif <- melt(R1_Grundwasserneubildung_Rg1Rg2_Monat_mean, na.rm = FALSE, id.vars="monat")
The data now looks like this:
Monat Projektion value
1 1 beob 22.4930171
2 2 beob 25.1414834
3 3 beob 33.8286673
4 4 beob 22.0579421
5 5 beob 2.5535161
6 6 beob -2.4405826
7 7 beob -2.2580848
8 8 beob -0.3448858
9 9 beob 3.3963303
10 10 beob 5.1839209
11 11 beob 9.2551418
12 12 beob 18.2233848
13 1 werex_00 9.1418697
14 2 werex_00 13.5886794
15 3 werex_00 16.3800292
16 4 werex_00 11.9930633
17 5 werex_00 -2.4503375
18 6 werex_00 -8.8534136
19 7 werex_00 -6.7569968
20 8 werex_00 -5.6683742
21 9 werex_00 0.4305926
22 10 werex_00 1.6050281
23 11 werex_00 4.4810518
24 12 werex_00 7.7648233
25 1 werex_11 8.1558828
... ... ... ...
I also gave the melted data new column names (as already seen above):
names(R1_Grundwasserneubildung_Rg1Rg2_Monat_mean_druckreif)<-c("Monat","Projektion","value")
The next step defines some custom colors for the plot:
Projektionen_Farben<-c("#000000","#00EEEE","#EEAD0E","#006400","#BDB76B","#EE7600","#68228B","#8B0000","#1E90FF","#EE6363","#556B2F","#D6D6D6","#D6D6D6")
Now I plot the melted data:
ggplot(R1_Grundwasserneubildung_Rg1Rg2_Monat_mean_druckreif,
       aes(x=Monat,y=value,color=Projektion,group=Projektion)) +
  geom_line(size=0.8) +
  xlab("Monat") +
  ylab("Grundwasserneubildung [mm/Monat]") +
  ggtitle("Grundwasserneubildung") +
  theme_bw() +
  scale_x_continuous(breaks = c(1,2,3,4,5,6,7,8,9,10,11,12),
                     labels = c("Jan","Feb","Mär","Apr","Mai","Jun","Jul","Aug","Sep","Okt","Nov","Dez")) +
  theme(axis.title=element_text(size=15,vjust = 0.3, face="bold"),
        title=element_text(size=15,vjust = 1.5,face="bold")) +
  scale_colour_manual(values = Projektionen_Farben)
Sorry, but I haven't got enough reputation to post a pic of the plot.
Now I want to fill/shade the space between the Max line and the Min line with, let's say, a light grey (alpha=.3). I have tried geom_ribbon() but haven't found the right way to define x, ymin and ymax. Does anyone know a way to fill the space between these two lines?
Use your original (wide) data frame for geom_ribbon() and provide the columns Min and Max as ymin and ymax.
+ geom_ribbon(data=R1_Grundwasserneubildung_Rg1Rg2_Monat_mean,
              aes(x=monat,ymin=Min,ymax=Max),
              inherit.aes=FALSE,alpha=0.3,color="grey30")
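As a self-contained illustration of this approach, the sketch below uses only the monat, beob, Min and Max columns from the question. The short names wide and long and the fill color are assumptions chosen for readability, not the asker's object names.

```r
library(ggplot2)
library(reshape2)

# Subset of the question's data: observed series plus the Min/Max envelope.
wide <- data.frame(
  monat = 1:12,
  beob = c(22.4930171, 25.1414834, 33.8286673, 22.0579421, 2.5535161,
           -2.4405826, -2.2580848, -0.3448858, 3.3963303, 5.1839209,
           9.2551418, 18.2233848),
  Min = c(0.108381, 0.924013, -0.794309, -10.496481, -19.579135, -22.821617,
          -22.933219, -17.042893, -8.533647, -4.686715, -1.615777, 0.014096),
  Max = c(59.65309, 66.94970, 78.78572, 49.01380, 32.54282, 18.62139,
          20.84375, 17.86556, 19.66163, 31.17342, 43.90991, 62.83760)
)

# Long format for the line layer (only beob is melted in this sketch).
long <- melt(wide, id.vars = "monat", measure.vars = "beob")
names(long) <- c("Monat", "Projektion", "value")

p <- ggplot(long, aes(x = Monat, y = value,
                      color = Projektion, group = Projektion)) +
  # The ribbon reads Min/Max from the wide data; inherit.aes = FALSE stops it
  # from looking for the y, color and group columns of the long data.
  geom_ribbon(data = wide, aes(x = monat, ymin = Min, ymax = Max),
              inherit.aes = FALSE, fill = "grey70", alpha = 0.3) +
  geom_line()
p
```

Adding the ribbon before geom_line() draws the shaded band underneath the lines rather than on top of them.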
I'm trying to create a stacked bar graph using ggplot2. My data, in its wide form, looks like this. The numbers in each cell are the frequency of responses.
activity yes no dontknow
Social events 27 3 3
Academic skills workshops 23 5 8
Summer research 22 7 7
Research fellowship 20 6 9
Travel grants 18 8 7
Resume preparation 17 4 12
RAs 14 11 8
Faculty preparation 13 8 11
Job interview skills 11 9 12
Preparation of manuscripts 10 8 14
Courses in other campuses 5 11 15
Teaching fellowships 4 14 16
TAs 3 15 15
Access to labs in other campuses 3 11 18
Interdisciplinary research 2 11 18
Interdepartamental projects 1 12 19
I melted this table using reshape2:
melted.data <- melt(wide.data,id.vars=c("activity"),measure.vars=c("yes","no","dontknow"),variable.name="haveused",value.name="responses")
That's as far as I can get. I want to create a stacked bar chart with activities on the x axis, frequency of responses on the y axis, and each bar showing the distribution of the yes, no and dontknow responses.
I've tried
ggplot(melted.data,aes(x=activity,y=responses))+geom_bar(aes(fill=haveused))
but I'm afraid that's not the right solution.
Any help is much appreciated.
You haven't said what it is that's not right about your solution. But some issues that could be construed as problems, and one possible solution for each, are:
The x axis tick mark labels run into each other. SOLUTION - rotate the tick mark labels;
The order in which the labels (and their corresponding bars) appear is not the same as the order in the original data frame. SOLUTION - reorder the levels of the factor 'activity';
To position text inside the bars, set the vjust parameter in position_stack to 0.5.
The following might be a start.
# Load required packages
library(ggplot2)
library(reshape2)
# Read in data
df = read.table(text = "
activity yes no dontknow
Social.events 27 3 3
Academic.skills.workshops 23 5 8
Summer.research 22 7 7
Research.fellowship 20 6 9
Travel.grants 18 8 7
Resume.preparation 17 4 12
RAs 14 11 8
Faculty.preparation 13 8 11
Job.interview.skills 11 9 12
Preparation.of.manuscripts 10 8 14
Courses.in.other.campuses 5 11 15
Teaching.fellowships 4 14 16
TAs 3 15 15
Access.to.labs.in.other.campuses 3 11 18
Interdisciplinary.research 2 11 18
Interdepartamental.projects 1 12 19", header = TRUE, sep = "")
# Melt the data frame
dfm = melt(df, id.vars=c("activity"), measure.vars=c("yes","no","dontknow"),
           variable.name="haveused", value.name="responses")
# Reorder the levels of activity
dfm$activity = factor(dfm$activity, levels = df$activity)
# Draw the plot
ggplot(dfm, aes(x = activity, y = responses, group = haveused)) +
  geom_col(aes(fill=haveused)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.25)) +
  geom_text(aes(label = responses), position = position_stack(vjust = .5), size = 3) # labels inside the bar segments