How to fill/shade space between timelines, plotted with ggplot? - r

I have a 12x13 matrix that looks like this:
monat beob werex_00 werex_11 werex_22 werex_33 werex_44 werex_55 werex_66 werex_77 werex_88 werex_99 Min Max
1 22.4930171 9.1418697 8.1558828 8.0312839 10.013298 8.8922567 9.395811 10.7933080 6.5136136 8.721697 10.279974 0.108381 59.65309
2 25.1414834 13.5886794 9.1694683 10.8709352 13.021066 10.3316655 10.579970 17.0555902 7.5915886 11.035921 13.366310 0.924013 66.94970
3 33.8286673 16.3800292 10.0202342 11.3072626 17.674761 16.1370288 15.018551 15.3331395 12.6856599 15.479521 13.929905 -0.794309 78.78572
4 22.0579421 11.9930633 8.4899130 8.2304118 12.987301 7.8763578 8.554007 12.4956321 9.4723508 7.057423 7.688662 -10.496481 49.01380
5 2.5535161 -2.4503375 -4.2354520 -3.6309377 -2.969866 -4.5876993 -5.383716 -3.2612018 -5.2054387 -2.780719 -4.359513 -19.579135 32.54282
6 -2.4405826 -8.8534136 -9.4666674 -7.4249244 -7.820072 -9.1485440 -8.546798 -7.8179739 -7.4222923 -10.978398 -12.644807 -22.821617 18.62139
7 -2.2580848 -6.7569968 -8.3390114 -8.8757506 -8.248305 -8.4171552 -7.760800 -5.7471163 -8.7864075 -6.239596 -8.870658 -22.933219 20.84375
8 -0.3448858 -5.6683742 -5.0467756 -5.7201820 -2.800106 -5.9640095 -5.011171 -3.3557601 -2.8967683 -4.407761 -6.146411 -17.042893 17.86556
9 3.3963303 0.4305926 -0.8554308 -0.9985536 -1.184610 -0.5520555 0.347758 -0.3838614 -0.2199835 -1.174712 -1.630363 -8.533647 19.66163
10 5.1839209 1.6050281 1.1578316 1.8503193 2.327975 1.6633771 1.557532 1.5563157 2.2776155 1.667714 1.333829 -4.686715 31.17342
11 9.2551418 4.4810518 2.9992301 4.9848408 3.824927 4.2413024 3.939119 5.4256008 3.5804488 4.965302 3.790589 -1.615777 43.90991
12 18.2233848 7.7648233 6.3344735 7.3477135 6.573620 7.1884950 7.428654 7.3119002 6.9405167 7.663072 8.342437 0.014096 62.83760
These are time series of a certain value. In the next step I plot them with ggplot(). To do that, I used melt() to reshape the matrix for plotting:
R1_Grundwasserneubildung_Rg1Rg2_Monat_mean_druckreif <- melt(R1_Grundwasserneubildung_Rg1Rg2_Monat_mean, na.rm = FALSE, id.vars="monat")
The melted data now looks like this:
Monat Projektion value
1 1 beob 22.4930171
2 2 beob 25.1414834
3 3 beob 33.8286673
4 4 beob 22.0579421
5 5 beob 2.5535161
6 6 beob -2.4405826
7 7 beob -2.2580848
8 8 beob -0.3448858
9 9 beob 3.3963303
10 10 beob 5.1839209
11 11 beob 9.2551418
12 12 beob 18.2233848
13 1 werex_00 9.1418697
14 2 werex_00 13.5886794
15 3 werex_00 16.3800292
16 4 werex_00 11.9930633
17 5 werex_00 -2.4503375
18 6 werex_00 -8.8534136
19 7 werex_00 -6.7569968
20 8 werex_00 -5.6683742
21 9 werex_00 0.4305926
22 10 werex_00 1.6050281
23 11 werex_00 4.4810518
24 12 werex_00 7.7648233
25 1 werex_11 8.1558828
... ... ... ...
I also added some new names for the melted data (as already seen above):
names(R1_Grundwasserneubildung_Rg1Rg2_Monat_mean_druckreif)<-c("Monat","Projektion","value")
Next step defines some custom colors for the plot:
Projektionen_Farben<-c("#000000","#00EEEE","#EEAD0E","#006400","#BDB76B","#EE7600","#68228B","#8B0000","#1E90FF","#EE6363","#556B2F","#D6D6D6","#D6D6D6")
Now I plot the melted data:
ggplot(R1_Grundwasserneubildung_Rg1Rg2_Monat_mean_druckreif,
aes(x=Monat,y=value,color=Projektion,group=Projektion)) +
geom_line(size=0.8) +
xlab("Monat") +
ylab("Grundwasserneubildung [mm/Monat]") +
ggtitle("Grundwasserneubildung") +
theme_bw() +
scale_x_continuous(breaks = c(1,2,3,4,5,6,7,8,9,10,11,12),
labels = c("Jan","Feb","Mär","Apr","Mai","Jun","Jul","Aug","Sep","Okt","Nov","Dez")) +
theme(axis.title=element_text(size=15,vjust = 0.3, face="bold"),
title=element_text(size=15,vjust = 1.5,face="bold")) +
scale_colour_manual(values = Projektionen_Farben)
Sorry, but I haven't got enough reputation to post a pic of the plot.
Now I want to fill/shade the space between the Max line and the Min line with, let's say, a light grey (alpha=.3). I have tried geom_ribbon() but haven't found the right way to define x, ymin and ymax. Does someone know a way to fill the space between these two lines?

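Use your original (wide) data frame as the data for geom_ribbon() and map the columns Min and Max to ymin and ymax. Setting inherit.aes=FALSE stops the ribbon from inheriting the y, color and group mappings of the main plot, which refer to columns that do not exist in the wide data.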
+ geom_ribbon(data=R1_Grundwasserneubildung_Rg1Rg2_Monat_mean,
aes(x=monat,ymin=Min,ymax=Max),
inherit.aes=FALSE,alpha=0.3,color="grey30")
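To make the idea above easier to try out, here is a minimal self-contained sketch. It assumes a small stand-in data frame (called wide here, a hypothetical name) with rounded values from the first four rows of the question, and it builds the long format by hand instead of calling melt(). It also uses fill rather than color for the shading, which is my own choice, not part of the answer above.

```r
library(ggplot2)

# Assumption: a small stand-in for the wide table (rounded values, 4 months only).
wide <- data.frame(
  monat = 1:4,
  beob  = c(22.5, 25.1, 33.8, 22.1),
  Min   = c(0.1, 0.9, -0.8, -10.5),
  Max   = c(59.7, 66.9, 78.8, 49.0)
)

# Long format for the lines, built with base R instead of melt().
long <- data.frame(
  Monat      = rep(wide$monat, times = 3),
  Projektion = rep(c("beob", "Min", "Max"), each = nrow(wide)),
  value      = c(wide$beob, wide$Min, wide$Max)
)

# The ribbon layer gets its own data and aesthetics; the lines use the long data.
p <- ggplot(long, aes(x = Monat, y = value, color = Projektion, group = Projektion)) +
  geom_ribbon(data = wide, aes(x = monat, ymin = Min, ymax = Max),
              inherit.aes = FALSE, fill = "grey70", alpha = 0.3) +
  geom_line()

print(p)
```

Adding the ribbon before geom_line() keeps the shading underneath the lines.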

Related

Scatterplot - adding equation and r square value

I am a newbie at R. I want to plot two variables and show the regression line together with the boxplots. I can display everything except the R-squared value and the regression equation.
Below is my script for drawing the graph:
library (car)
scatterplot(FIRST_S2A_NDVI, MEAN_DRONE_NDVI,
main = "NDVI Value from Sentinel and Drone",
xlab = "NDVI Value from Sentinel",
ylab = "NDVI Value from Drone",
pch = 15, col = "black",
regLine = list(col="green"), smooth = FALSE)
The figure is like this.
Now, the final touch is to add the equation and the R-squared value to my figure. What script do I need to write? I tried the script from "Add regression line equation and R^2 on graph" but still have no idea how to show them.
Thanks for reading, and hopefully for helping me with this.
P.S.
The content of my data:
OBJECTID SAMPLE_GRID FIRST_S2A_NDVI MEAN_DRONE_NDVI
1 1 1 0.6411405 0.8676092
2 2 2 0.4335293 0.5697814
3 3 3 0.7350439 0.7321858
4 4 4 0.7268013 0.8271566
5 5 5 0.3638939 0.5682631
6 6 6 0.1953890 0.3168246
7 7 7 0.4841993 0.7380627
8 8 8 0.4137447 0.3239288
9 9 9 0.8219178 0.8676065
10 10 10 0.2647872 0.2296441
11 11 11 0.8126657 0.8519964
12 12 12 0.2648504 0.2465738
13 13 13 0.5992035 0.8016030
14 14 14 0.2420299 0.3933670
15 15 15 0.5059137 0.7593807
16 16 16 0.7713419 0.8026068
17 17 17 0.3762540 0.5941540
18 18 18 0.5876435 0.7763927
19 19 19 0.2491609 0.5095306
20 20 20 0.3213648 0.4456958
21 21 21 0.2101466 0.1960858
22 22 22 0.3749034 0.4956361
23 23 23 0.5712630 0.7350484
24 24 24 0.8444895 0.8577550
25 25 25 0.3331450 0.4390229
26 26 26 0.1851611 0.4573663
27 27 27 0.4914998 0.2750837
28 28 28 0.7121390 0.7780228
To add the equation and the R-squared value to your current plot, you can fit a model with the y and x variables, format the equation as text, and write it onto the plot with the mtext function.
# Fit the linear model (response ~ predictor)
m <- lm(MEAN_DRONE_NDVI~FIRST_S2A_NDVI)
# Build the equation label: slope first, then the signed intercept
eq <- paste0("y = ",round(coef(m)[2],3),"x ",
ifelse(coef(m)[1]<0,round(coef(m)[1],3),
paste("+",round(coef(m)[1],3))))
# Write both labels just inside the top edge of the current plot
mtext(eq, side=3, line=-1)
mtext(paste0("R^2 = ",round(summary(m)$r.squared,3)), side=3, line=-3)
You can change the variables in the model, and you can change the position of the text with the 2nd and 3rd arguments of mtext (side and line).
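For anyone who wants to run this without the full data set, here is a small sketch in base R. It assumes a data frame named ndvi (a hypothetical name) holding only the first six rows of the posted data, and it uses plot() and abline() instead of car::scatterplot() so that no extra package is needed.

```r
# Assumption: first six rows of the posted data as a small stand-in.
ndvi <- data.frame(
  FIRST_S2A_NDVI  = c(0.6411405, 0.4335293, 0.7350439, 0.7268013, 0.3638939, 0.1953890),
  MEAN_DRONE_NDVI = c(0.8676092, 0.5697814, 0.7321858, 0.8271566, 0.5682631, 0.3168246)
)

# Fit the regression and extract the pieces needed for the labels.
m  <- lm(MEAN_DRONE_NDVI ~ FIRST_S2A_NDVI, data = ndvi)
b  <- coef(m)                 # b[1] = intercept, b[2] = slope
r2 <- summary(m)$r.squared

# Format the labels; the sign of the intercept is written explicitly.
eq       <- sprintf("y = %.3fx %s %.3f", b[2], ifelse(b[1] < 0, "-", "+"), abs(b[1]))
r2_label <- sprintf("R^2 = %.3f", r2)

# Draw the scatterplot, the fitted line, and both labels.
plot(MEAN_DRONE_NDVI ~ FIRST_S2A_NDVI, data = ndvi, pch = 15)
abline(m, col = "green")
mtext(eq, side = 3, line = -1)
mtext(r2_label, side = 3, line = -3)
```

Negative line values place the text inside the plot region; larger magnitudes move it further down from the top edge.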

How to order both positive and negative values in ggplot

How can I create an ordered bar plot in ggplot2 with both positive and negative values? Here is the data:
down -11
down -10
down -9
down -6
up 6
up 6
up 6
up 6
up 7
up 7
up 8
up 8
up 8
up 8
up 8
up 8
up 8
up 10
up 10
up 11
up 11
up 12
up 14
up 14
up 21
up 21
up 24
I have tried this code:
ggplot(GO, aes(x = d1, y = order(d2), fill = factor(d1))) +
geom_bar(stat = "identity", position = "identity", width = 0.6)
This is not working.
I would like to order the plot. Can anybody please suggest some code.
Please check out my answer to a similar question. Put your vector in the order you want and then add + scale_y_discrete(limits = yourOrderedData); the plot should then follow that order.
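As a different route from the scale limits suggested above, here is a sketch that orders the bars with reorder(). It assumes the question's columns are called d1 and d2 (taken from the question's code), that every row should get its own bar, and it adds a helper column id, which is my own hypothetical name and not part of the original data.

```r
library(ggplot2)

# Data from the question; d1 is the direction, d2 the value.
GO <- data.frame(
  d1 = c(rep("down", 4), rep("up", 23)),
  d2 = c(-11, -10, -9, -6,
         6, 6, 6, 6, 7, 7, 8, 8, 8, 8, 8, 8, 8,
         10, 10, 11, 11, 12, 14, 14, 21, 21, 24)
)

# Hypothetical helper column: one id per row, so every value gets its own bar.
GO$id <- factor(seq_len(nrow(GO)))

# reorder() sorts the factor levels of id by d2, which fixes the bar order.
p <- ggplot(GO, aes(x = reorder(id, d2), y = d2, fill = d1)) +
  geom_col(width = 0.6) +
  xlab("row (ordered by value)")

print(p)
```

Using reorder(id, -d2) instead would sort the bars from largest to smallest.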

ggplot2 is plotting a line strangely

I am trying to plot the time series x_t = A + (-1)^t B.
To do this I am using the following code. The problem is that the resulting ggplot looks wrong.
require (ggplot2)
set.seed(42)
N<-2
A<-sample(1:20,N)
B<-rnorm(N)
X<-c(A+B,A-B)
dat<-sapply(1:N,function(n) X[rep(c(n,N+n),20)],simplify=FALSE)
dat<-data.frame(t=rep(1:20,N),w=rep(A,each=20),val=do.call(c,dat))
ggplot(data=dat,aes(x=t, y=val, color=factor(w)))+
geom_line()+facet_grid(w~.,scales = "free")
Looking at the head of dat, everything looks right:
> head(dat)
t w val
1 1 12 10.5533
2 2 12 13.4467
3 3 12 10.5533
4 4 12 13.4467
5 5 12 10.5533
6 6 12 13.4467
So the lower (blue) line should only have values 10.5533 and 13.4467. But it also takes different values. What is wrong in my code?
Thanks in advance for any help
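You really should be more careful before asserting that something is "wrong". The way you are creating dat, val has 80 elements (each of the N series contributes 40) while t and w have only 40, so data.frame() recycles t and w and every combination of w and t ends up with two different values. The rows are also not ordered by dat$t, so head(...) does not display the extra values. Sorting by w and t does: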
head(dat[order(dat$w,dat$t),],10)
# t w val
# 21 1 18 18.43530
# 61 1 18 18.36313
# 22 2 18 19.56470
# 62 2 18 17.63687
# 23 3 18 18.43530
# 63 3 18 18.36313
# 24 4 18 19.56470
# 64 4 18 17.63687
# 25 5 18 18.43530
# 65 5 18 18.36313
Note the row numbers.
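One way to avoid the duplicated rows is to build the data frame so that every combination of w and t has exactly one value. The sketch below is only one possible construction, not the asker's code: it computes x_t = A + (-1)^t B directly from the t and w columns, so all three columns have the same length (20 * N) and nothing is recycled.

```r
set.seed(42)
N <- 2
A <- sample(1:20, N)   # one level per series
B <- rnorm(N)          # one amplitude per series

# One row per (series, time point): 20 * N rows in total.
dat <- data.frame(
  t = rep(1:20, times = N),
  w = rep(A, each = 20)
)

# x_t = A + (-1)^t * B, with B repeated to line up with w.
dat$val <- dat$w + (-1)^dat$t * rep(B, each = 20)

# Each series should now take exactly two distinct values.
tapply(dat$val, dat$w, function(v) length(unique(v)))
```

The ggplot call from the question can then be used on this dat unchanged.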

Merge values of a factor column

Column data$form contains 170 unique values (numbers from 1 to ~800).
I would like to merge some of the values (e.g. with a radius/step of 10).
I need to do this in order to use:
colors = rainbow(length(unique(data$form)))
In a plot and provide a better visual result.
Thank you in advance for your help.
You can use %/% to group them, mean to combine them, and normalize to scale them.
# if you want specifically 20 groups:
groups <- sort(form) %/% (800/20)
x <- c(by(sort(form), groups, mean))
x <- normalize(x, TRUE) * 19 + 1
0 1 2 3 4
1.000000 1.971781 2.957476 4.103704 4.948560
5 6 7 8 9
5.950617 7.175309 7.996914 8.953086 9.952263
10 11 12 13 14
10.800705 11.901235 12.888889 13.772291 14.888889
15 16 17 18 19
15.927984 16.864198 17.918519 18.860082 20.000000
You could also use cut. If you use the argument labels=FALSE, you get an integer value:
form <- runif(170, min=1,max=800)
> cut(form, breaks=20)
[1] (518,558] (280,320] (240,280] (121,160] (757,797]
[6] (160,200] (320,359] (598,638] (80.8,121] (359,399]
[11] (121,160] (200,240] ...
20 Levels: (1.18,41] (41,80.8] (80.8,121] (121,160] (160,200] (200,240] (240,280] (280,320] (320,359] (359,399] (399,439] ... (757,797]
> cut(form, breaks=20, labels=FALSE)
[1] 14 8 7 4 20 5 9 16 3 10 4 6 5 18 18 6 2 12
[19] 2 19 13 11 13 11 14 12 17 5 ...
As a side note, I would encourage you to reconsider plotting with rainbow colours, as they distort the reading of the data, cf. Rainbow Color Map (Still) Considered Harmful.
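To connect both suggestions back to the colour vector from the question, here is a short base R sketch. It assumes simulated values standing in for data$form (the real column is not shown), and it keeps rainbow() only because the question uses it; any palette function that returns 20 colours could be swapped in.

```r
# Assumption: simulated stand-in for data$form (170 values between 1 and 800).
set.seed(1)
form <- runif(170, min = 1, max = 800)

# Option 1: 20 equal-width bins via cut(); labels = FALSE returns bin numbers 1..20.
bin <- cut(form, breaks = 20, labels = FALSE)

# Option 2: fixed step of 10 via integer division; values 10..19.99 share bin 1, etc.
step_bin <- form %/% 10

# One colour per bin instead of one colour per unique value.
palette20 <- rainbow(20)
cols <- palette20[bin]

table(bin)   # how many values fall into each bin
```

The vector cols has one entry per observation, so it can be passed directly as the col argument of a base plot.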

How to create a stacked bar chart from summarized data in ggplot2

I'm trying to create a stacked bar graph using ggplot2. My data, in its wide form, looks like this. The numbers in each cell are the frequencies of responses.
activity yes no dontknow
Social events 27 3 3
Academic skills workshops 23 5 8
Summer research 22 7 7
Research fellowship 20 6 9
Travel grants 18 8 7
Resume preparation 17 4 12
RAs 14 11 8
Faculty preparation 13 8 11
Job interview skills 11 9 12
Preparation of manuscripts 10 8 14
Courses in other campuses 5 11 15
Teaching fellowships 4 14 16
TAs 3 15 15
Access to labs in other campuses 3 11 18
Interdisciplinary research 2 11 18
Interdepartamental projects 1 12 19
I melted this table using reshape2:
melted.data <- melt(wide.data,id.vars=c("activity"),measure.vars=c("yes","no","dontknow"),variable.name="haveused",value.name="responses")
That's as far as I can get. I want to create a stacked bar chart with the activities on the x axis, the frequency of responses on the y axis, and each bar showing the distribution of the yes, no and dontknow answers.
I've tried
ggplot(melted.data,aes(x=activity,y=responses))+geom_bar(aes(fill=haveused))
but I'm afraid that's not the right solution
Any help is much appreciated.
You haven't said what it is that's not right about your solution. But some issues that could be construed as problems, and one possible solution for each, are:
The x axis tick mark labels run into each other. SOLUTION - rotate the tick mark labels;
The order in which the labels (and their corresponding bars) appear is not the same as the order in the original data frame. SOLUTION - reorder the levels of the factor 'activity';
The counts are not shown inside the bars. SOLUTION - add text labels and set the vjust parameter in position_stack to 0.5.
The following might be a start.
# Load required packages
library(ggplot2)
library(reshape2)
# Read in data
df = read.table(text = "
activity yes no dontknow
Social.events 27 3 3
Academic.skills.workshops 23 5 8
Summer.research 22 7 7
Research.fellowship 20 6 9
Travel.grants 18 8 7
Resume.preparation 17 4 12
RAs 14 11 8
Faculty.preparation 13 8 11
Job.interview.skills 11 9 12
Preparation.of.manuscripts 10 8 14
Courses.in.other.campuses 5 11 15
Teaching.fellowships 4 14 16
TAs 3 15 15
Access.to.labs.in.other.campuses 3 11 18
Interdisciplinary.research 2 11 18
Interdepartamental.projects 1 12 19", header = TRUE, sep = "")
# Melt the data frame
dfm = melt(df, id.vars=c("activity"), measure.vars=c("yes","no","dontknow"),
variable.name="haveused", value.name="responses")
# Reorder the levels of activity
dfm$activity = factor(dfm$activity, levels = df$activity)
# Draw the plot
ggplot(dfm, aes(x = activity, y = responses, group = haveused)) +
geom_col(aes(fill=haveused)) +
theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.25)) +
geom_text(aes(label = responses), position = position_stack(vjust = .5), size = 3) # labels inside the bar segments
