ggplot doesn't show the second geom_line() in my plot - r

My df:
p1 p2 p3 x y
0 3000 14 0.0 0.026500
20 3000 14 11.0 0.054000
30 3000 14 17.9 0.057000
60 3000 14 49.3 0.064000
80 3000 14 77.4 0.063000
60 3500 14 45.3 0.061000
60 4000 14 41.4 0.058300
60 4400 14 43.7 0.073600
60 3500 9 41.7 0.060556
60 3500 18 46.7 0.060700
60 3500 21 49.2 0.059900
This is the result of a "one parameter at a time" experimental design, i.e., one where the parameters p1, p2 and p3 were changed one at a time (definitely not the best kind of DOE, but that's what I got). For each observation, two variables are measured, x and y. I would like to plot a line connecting all points of the p1 study (the first 5 rows), a line connecting all points of the p2 study (rows 4 and 6:8) and a third line connecting the points of the p3 study (rows 6 and 9:11). I tried with
ggplot(df, aes(x = x, y = y, color = p2)) +
geom_point( aes(shape = p3)) +
geom_line() +
geom_line(data = filter(df, p1 == "60" & p3 == "14"), aes(x = x, y = y))
The red and the green line correspond to the p1 and p3 study, but ggplot doesn't plot the line corresponding to the p2. How can I manage to plot it? In practice, I need either a geom_path or a geom_line connecting the triangle symbols in the center of the screen (x coordinate between 40 and 50).
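One likely cause, assuming p2 is stored as a factor or character column (the filter() call above compares against strings): the second geom_line() still inherits color = p2, so every p2 level forms its own group containing a single point, and no line can be drawn. A minimal sketch of one way around this is to tag the rows of each study explicitly and group the lines by that tag. The `study` column and the `studies` list are hypothetical names introduced here; the row indices are the ones listed in the question.

```r
library(ggplot2)

df <- data.frame(
  p1 = c(0, 20, 30, 60, 80, 60, 60, 60, 60, 60, 60),
  p2 = c(3000, 3000, 3000, 3000, 3000, 3500, 4000, 4400, 3500, 3500, 3500),
  p3 = c(14, 14, 14, 14, 14, 14, 14, 14, 9, 18, 21),
  x  = c(0.0, 11.0, 17.9, 49.3, 77.4, 45.3, 41.4, 43.7, 41.7, 46.7, 49.2),
  y  = c(0.026500, 0.054000, 0.057000, 0.064000, 0.063000, 0.061000,
         0.058300, 0.073600, 0.060556, 0.060700, 0.059900)
)

# Row indices of each study as given in the question; rows 4 and 6 are shared.
studies <- list(p1 = 1:5, p2 = c(4, 6:8), p3 = c(6, 9:11))

# One copy of the rows per study, tagged with a hypothetical `study` column.
lines_df <- do.call(rbind, lapply(names(studies), function(s) {
  cbind(df[studies[[s]], ], study = s)
}))

# Lines are grouped by study; points keep the colour/shape coding by p2 and p3.
p <- ggplot(df, aes(x = x, y = y)) +
  geom_line(data = lines_df, aes(group = study, linetype = study)) +
  geom_point(aes(color = factor(p2), shape = factor(p3)), size = 3)
```

geom_line() connects the points of each study in order of x; swapping in geom_path() would connect them in row order instead. A smaller change to the original code, under the same assumption, would be to add aes(group = 1) to the second geom_line() so that the filtered rows form a single group.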

Related

time series aesthetics with ggplot2

Hello, I have tried to graph the following time series:
fecha importaciones
1 Ene\n1994 171.0
2 Feb\n1994 170.7
3 Mar\n1994 183.7
4 Abr\n1994 214.6
5 May\n1994 227.2
6 Jun\n1994 221.1
7 Jul\n1994 216.4
8 Ago\n1994 235.3
9 Sep\n1994 227.0
10 Oct\n1994 216.0
11 Nov\n1994 221.5
12 Dic\n1994 270.9
13 Ene\n1995 250.4
14 Feb\n1995 259.6
15 Mar\n1995 258.2
16 Abr\n1995 232.9
17 May\n1995 335.0
18 Jun\n1995 295.2
19 Jul\n1995 302.5
20 Ago\n1995 283.3
21 Sep\n1995 264.4
22 Oct\n1995 277.6
23 Nov\n1995 289.1
24 Dic\n1995 280.5
25 Ene\n1996 252.4
26 Feb\n1996 250.1
.
.
.
320 Ago\n2020 794.6
321 Sep\n2020 938.2
322 Oct\n2020 966.3
323 Nov\n2020 958.9
324 Dic\n2020 1059.2
325 Ene\n2021 1056.2
326 Feb\n2021 982.5
I was able to graph it in Office Calc, but I am trying to plot it in R with ggplot:
ggplot(datos, aes(x = fecha, y = importaciones)) +
geom_line(size = 1) +
scale_color_manual(values=c("#00AFBB", "#E7B800"))+
theme_minimal()
I have tried everything I could think of, but the plot does not come out right. Could someone guide me?
Change the x-axis to date class.
library(ggplot2)
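# dmy() reads month names in the current LC_TIME locale; the abbreviations here are
# Spanish, so a Spanish locale may be needed (dmy() accepts a locale argument)
datos$fecha <- lubridate::dmy(paste0(1, datos$fecha))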
ggplot(datos, aes(x = fecha, y = importaciones, group = 1)) +
geom_line(size = 1) +
scale_color_manual(values=c("#00AFBB", "#E7B800"))+
theme_minimal()
You can use scale_x_date to change the breaks and display format of dates on x-axis.
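As a rough illustration of that last point, the sketch below sets the breaks and the label format with scale_x_date. It assumes fecha is already of class Date and uses only the twelve 1994 values from the question; the break spacing and label format are arbitrary choices.

```r
library(ggplot2)

# The twelve 1994 values from the question, with fecha already converted to Date.
datos <- data.frame(
  fecha = seq(as.Date("1994-01-01"), by = "month", length.out = 12),
  importaciones = c(171.0, 170.7, 183.7, 214.6, 227.2, 221.1,
                    216.4, 235.3, 227.0, 216.0, 221.5, 270.9)
)

p <- ggplot(datos, aes(x = fecha, y = importaciones)) +
  geom_line() +
  # one break every three months, labelled as abbreviated month plus year
  scale_x_date(date_breaks = "3 months", date_labels = "%b %Y") +
  theme_minimal()
```

For the full 1994 to 2021 series, something like date_breaks = "2 years" with date_labels = "%Y" would probably be easier to read.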

Barplot overlay with geom line

here is the data example:
S P C P_int C_int
10 20 164 72 64
20 550 709 92 89
30 142 192 97 96
40 45 61 99 98
50 12 20 99 99
60 5 6 99 99
70 2 2 99 99
80 4 1 99 99
90 1 0 10 99
100 0 1 10 99
Let's say I have a data frame called df. The aim is to have a bar chart using the variables P and C, with a line chart overlaid showing the sum of the variables P_int and C_int. Currently I have these lines of code to create the bar chart:
final <- df %>% tidyr::gather(type, value, c(`P`, `C`))
ggplot(final, aes(S))+
geom_bar(aes(y=value, fill=type), stat="identity", position="dodge")
The thing I can't figure out is how to plot the sum of the variables P_int and C_int as a line chart overlaid on the above plot with a second y-axis. I would appreciate any help.
Do you need something like this?
library(ggplot2)
library(dplyr)
ggplot(final, aes(S))+
geom_bar(aes(y=value, fill=type), stat="identity", position="dodge") +
  # P_int and C_int appear twice per S in `final` (once per type), so take the sum from df
  geom_line(data = df %>% mutate(total = P_int + C_int),
            aes(y = total), color = 'blue') +
scale_y_continuous(sec.axis = sec_axis(~./1)) +
theme_classic()
I have kept the scale of the secondary y-axis the same as the primary y-axis since they are in the same range, but you might need to adjust it according to your real data.
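If the two ranges do differ, one common approach is to multiply the line values by a factor and undo that factor in sec_axis. The sketch below assumes the sample data from the question; the factor `k` and the `bars` data frame are hypothetical names, and stretching the line to the height of the tallest bar is just one possible choice.

```r
library(ggplot2)

df <- data.frame(
  S = seq(10, 100, by = 10),
  P = c(20, 550, 142, 45, 12, 5, 2, 4, 1, 0),
  C = c(164, 709, 192, 61, 20, 6, 2, 1, 0, 1),
  P_int = c(72, 92, 97, 99, 99, 99, 99, 99, 10, 10),
  C_int = c(64, 89, 96, 98, 99, 99, 99, 99, 99, 99)
)
df$total <- df$P_int + df$C_int

# Hypothetical scaling factor: stretch the line to the height of the tallest bar.
k <- max(df$P, df$C) / max(df$total)

# Long format for the bars (same shape as `final` in the question).
bars <- data.frame(
  S = rep(df$S, 2),
  type = rep(c("P", "C"), each = nrow(df)),
  value = c(df$P, df$C)
)

p <- ggplot(bars, aes(S)) +
  geom_col(aes(y = value, fill = type), position = "dodge") +
  # draw the line on the primary scale, multiplied by k ...
  geom_line(data = df, aes(y = total * k), color = "blue") +
  # ... and divide by k again so the secondary axis shows the original units
  scale_y_continuous(sec.axis = sec_axis(~ . / k, name = "P_int + C_int"))
```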

geom_bar labeling for melted data / stacked barplot

I have a problem with drawing stacked barplot with ggplot. My data looks like this:
timeInterval TotalWilling TotalAccepted SimID
1 16 12 Sim1
1 23 23 Sim2
1 63 60 Sim3
1 69 60 Sim4
1 61 60 Sim5
1 60 54 Sim6
2 16 8 Sim1
2 23 21 Sim2
2 63 52 Sim3
2 69 64 Sim4
2 61 45 Sim5
2 60 32 Sim6
3 16 14 Sim1
3 23 11 Sim2
3 63 59 Sim3
3 69 69 Sim4
3 61 28 Sim5
3 60 36 Sim6
I would like to draw a stacked barplot for each simID over a timeInterval, and Willing and Accepted should be stacked. I achieved the barplot with the following simple code:
dat <- read.csv("myDat.csv")
meltedDat <- melt(dat,id.vars = c("SimID", "timeInterval"))
ggplot(meltedDat, aes(timeInterval, value, fill = variable)) + facet_wrap(~ SimID) +
geom_bar(stat="identity", position = "stack")
I get the following graph:
My problem is that I would like to put a percentage on each segment of each stack: Willing/(Willing+Accepted) on the Willing segment and Accepted/(Accepted+Willing) on the Accepted segment, so that I can see what share is willing and what share is accepted (for example, 45 on the red part and 55 on the blue part of a stack). I cannot seem to achieve this kind of labeling.
Any hint is appreciated.
Adapted from "Showing data values on stacked bar chart in ggplot2":
library(reshape2)
library(plyr)
library(ggplot2)
meltedDat <- melt(dat, id.vars = c("SimID", "timeInterval"))
# share of each variable within its bar (one bar per timeInterval/SimID)
meltedDat <- ddply(meltedDat, .(timeInterval, SimID), transform, normvalue = value / sum(value))
meltedDat$valuestr <- sprintf("%.2f%%", meltedDat$normvalue * 100)
# position_stack(vjust = 0.5) centers each label inside its segment
ggplot(meltedDat, aes(timeInterval, value, fill = variable)) + facet_wrap(~ SimID) + geom_bar(stat = "identity", position = "stack") + geom_text(aes(label = valuestr), position = position_stack(vjust = 0.5), size = 2)
Also, it looks like you may have some of your variables coded as factors.

ggplot create map with arrows

I have a data frame like this
id lon lat
1 A -69.5 -58.5
2 A -69.5 -58.5
3 A -69.5 -57.5
4 A -68.5 -57.5
5 A -68.5 -57.5
6 A -68.5 -57.5
7 A -66.5 -57.5
8 A -68.5 -56.5
9 A -68.5 -56.5
10 A -67.5 -56.5
11 A -65.5 -56.5
12 A -65.5 -56.5
13 A -65.5 -55.5
14 A -62.5 -54.5
15 B -177 -52.5
16 B -178 -50.5
17 B -179 -48.5
18 B 179 -47.5
19 B 178 -46.5
20 B 177 -46.5
and I want to produce a map of the positions of A and B, linked by oriented lines. However, when an id crosses the Pacific (lon=-180 -> lon=+180), I get an arrow crossing the whole figure, as shown below.
This is the code I am using
worldmap = map_data("world")
ggplot(test, aes(x = lon, y=lat, colour = factor(id))) +
geom_polygon(data=worldmap,center=180,aes(x=long, y=lat, group=group), fill="black",colour="black") +
xlab("") +ylab("")+theme(axis.text=element_blank(),axis.ticks=element_blank())+ theme(panel.background = element_rect(fill = 'white', colour = 'black') ,panel.grid.major = element_blank(),panel.grid.minor = element_blank())+
geom_path(size =2,arrow = arrow(angle=30,length = unit(0.6, "inches")))
How can I fix it?
Thanks
I guess that depends on what you think the "right" thing to do is. I decided to break up the paths that cross the globe into two segments by adding points at the edge of the map, and then creating a "sequence" indicator so ggplot knows which points to connect. Here's the transformation for your sample data:
test2 <- do.call(rbind, lapply(split(test, test$id), function(x) {
  # start a new sequence wherever longitude jumps across the +/-180 edge
  cp <- cumsum(c(FALSE, abs(diff(x$lon)) > 250))
  xx <- split(x, cp)
  xx <- Map(cbind, xx, seq = seq_along(xx))
  Reduce(function(a, b) {
    lasta <- a[nrow(a), ]
    firstb <- b[1, ]
    # move the two added points onto the map edge
    lasta$lon <- 180 * sign(lasta$lon)
    firstb$lon <- 180 * sign(firstb$lon)
    # mean() needs a single vector; mean(a, b) would treat b as the trim argument
    lasta$lat <- mean(c(lasta$lat, firstb$lat))
    firstb$lat <- lasta$lat
    rbind(a, lasta, firstb, b)
  }, xx)
}))
tail(test2)
# id lon lat seq
# B.17 B -179 -48.5 1
# B.171 B -180 -48.0 1
# B.18 B 180 -48.0 2
# B.181 B 179 -47.5 2
# B.19 B 178 -46.5 2
# B.20 B 177 -46.5 2
Here you can see that we've broken the B line up into two sequences. Then, if we use a group aesthetic
geom_path(aes(group=interaction(id, seq)), ...)
then ggplot will only connect those points that are in the same id/seq group. This will prevent the line from going across the whole map. However, because we are drawing two lines for that id rather than one, there's no way to turn off the arrowhead for just one of the segments. You might want to find another way to indicate start/end.

Add points ggplot

Hi, I have many data frames like this
id oldid yr mo dy lon lat
1 01206295 Aberfeldy 1885 3 22 -127.1 -31.78
2 05670001 05670005 1885 3 22 -4.38 49.15
3 06279 06279 1885 3 22 -123.5 37.5
4 106251 06323 1885 3 22 178.5 19.5
5 58FFF3618 58FFF3618 1885 3 22 -0.73 69.73
6 Achille.F Achille.F 1885 3 22 -35.62 -2.98
stored in different files (myfiles), and I am trying to plot the (lon, lat) points for each of them, with the colour chosen according to the id value. So far I am doing it like this:
for (i in 1:length(myfiles)){
colnames(myfilesContent[[i]]) <-c("id","oldid","yr","mo","dy","lon","lat")
p <- ggplot() + geom_polygon(data=world_map,aes(x=long, y=lat,group=group))
myfilesContent[[i]]$lon <- as.numeric(myfilesContent[[i]]$lon)
myfilesContent[[i]]$lat <- as.numeric(myfilesContent[[i]]$lat)
p <- p + geom_point(data=myfilesContent[[i]], aes(x=lon, y=lat, fill=as.factor(id)), size = 4, shape = 21, show.legend=FALSE)
print(p)
}
However, I am not sure that an id that appears in several files will be assigned the same colour in each plot.
Many thanks
You can make sure the levels for all your id columns are the same. First, get a master list of all the IDs from all the data.frames
# as.character() makes this work whether the id column is a factor or a character vector
allids <- unique(unlist(lapply(myfilesContent, function(x) as.character(x[,1]))))
Then make sure all the ID columns share these levels
myfilesContent <- lapply(myfilesContent, function(x) {
  x[,1] <- factor(x[,1], levels=allids)
  x
})
If they have the same levels, they should get the same colors.
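One caveat worth checking: discrete scales in ggplot2 drop unused factor levels by default, so a plot that contains only some of the ids can still assign its colours differently. A minimal sketch of keeping the mapping fixed with drop = FALSE follows; the data frames `a` and `b` and the helper `plot_ids` are hypothetical stand-ins for the elements of myfilesContent and the plotting loop.

```r
library(ggplot2)

# Two hypothetical data frames standing in for elements of myfilesContent.
a <- data.frame(id = c("A", "B"), lon = c(0, 10), lat = c(0, 10))
b <- data.frame(id = c("B", "C"), lon = c(5, 15), lat = c(5, 15))

# Shared levels, as in the answer above.
allids <- c("A", "B", "C")
a$id <- factor(a$id, levels = allids)
b$id <- factor(b$id, levels = allids)

plot_ids <- function(d) {
  ggplot(d, aes(lon, lat, fill = id)) +
    geom_point(shape = 21, size = 4) +
    # drop = FALSE keeps unused levels in the scale, so the palette is
    # always built for all of allids and each id keeps its colour
    scale_fill_discrete(drop = FALSE)
}

# Fill colours actually used in each plot (same row order as the input data).
fill_a <- layer_data(plot_ids(a), 1)$fill
fill_b <- layer_data(plot_ids(b), 1)$fill
```

Here the colour of "B" in fill_a (second element) should match the colour of "B" in fill_b (first element).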
