Connect the red points with a line in ggplot - r

Please help me.
I made a plot comprising some red and blue points using ggplot.
Now I want to connect the red points to each other with a line and connect the blue points to each other with another line.
This is my code:
m <- as.factor(c(7,"12 PCA", 21, "24 PCA", "31 PCA", 38, 70))
## Then we plot the points
ggplot(pH, aes(x= m, y=All))+ ylim(60,100)+
scale_x_discrete(limits=c(7,"12 PCA", 21, "24 PCA", "31 PCA", 38, 70))+
geom_point(data=pH, aes(y=All), colour = 'red', size =1)+
geom_point(data=pH, aes(y=Test), colour = 'blue', size=1)
And this is my plot
How can I do that?
Thanks

I think it's generally best to not work with independent vectors of data when possible, instead placing it in a single frame. In this case, one column will be used to indicate which "group" the dots belong to.
dat <- data.frame(m=c(m,m), All=c(94,95,96,95,94,95,96, 74,67,74,67,68,73,74), grp=c(rep("red",7), rep("blue",7)))
dat
#         m All  grp
# 1       7  94  red
# 2  12 PCA  95  red
# 3      21  96  red
# 4  24 PCA  95  red
# 5  31 PCA  94  red
# 6      38  95  red
# 7      70  96  red
# 8       7  74 blue
# 9  12 PCA  67 blue
# 10     21  74 blue
# 11 24 PCA  67 blue
# 12 31 PCA  68 blue
# 13     38  73 blue
# 14     70  74 blue
Plot code:
library(ggplot2)
ggplot(dat, aes(m, All, group=grp, color=grp)) +
geom_point() +
geom_line() +
scale_color_manual(values = c(blue = "blue", red = "red"))
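One thing to watch: a factor built with as.factor() sorts its levels alphabetically, so "12 PCA" would come before "7" on the x axis unless the order is set explicitly (the question did this with scale_x_discrete(limits = ...)). Below is a minimal self-contained sketch of the same plot with the level order fixed in the data. The `All` values are the illustrative ones from the data frame above, not the asker's real pH data, and the name `lvls` is my own.

```r
library(ggplot2)

# Desired left-to-right order of the x axis (taken from the question).
lvls <- c("7", "12 PCA", "21", "24 PCA", "31 PCA", "38", "70")

# Illustrative values copied from the answer's data frame, not real measurements.
dat <- data.frame(
  m   = factor(rep(lvls, 2), levels = lvls),  # explicit levels keep the axis order
  All = c(94, 95, 96, 95, 94, 95, 96, 74, 67, 74, 67, 68, 73, 74),
  grp = rep(c("red", "blue"), each = 7)
)

p <- ggplot(dat, aes(m, All, group = grp, colour = grp)) +
  geom_point() +
  geom_line() +  # one line per value of grp
  scale_colour_manual(values = c(blue = "blue", red = "red"))
```

Printing `p` draws the plot. The `group = grp` mapping is what tells geom_line() to connect the points within each colour when x is a factor.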

Related

Barplot overlay with geom line

here is the data example:
S P C P_int C_int
10 20 164 72 64
20 550 709 92 89
30 142 192 97 96
40 45 61 99 98
50 12 20 99 99
60 5 6 99 99
70 2 2 99 99
80 4 1 99 99
90 1 0 10 99
100 0 1 10 99
Let's say I have a data frame called df. The aim is to have a bar chart using variables P and C, with a line chart overlaid showing the sum of variables P_int and C_int. Currently I have these lines of code to create the bar chart:
final <- df %>% tidyr::gather(type, value, c(`P`, `C`))
ggplot(final, aes(S))+
geom_bar(aes(y=value, fill=type), stat="identity", position="dodge")
The thing I can't figure out is how to plot the sum of variables P_int and C_int as a line chart overlaid on the above plot with a second Y axis. I would appreciate any help.
Do you need something like this?
library(ggplot2)
library(dplyr)
ggplot(final, aes(S))+
geom_bar(aes(y=value, fill=type), stat="identity", position="dodge") +
geom_line(data = df %>%
# use df (one row per S); in final every S appears twice, which would double the sum
mutate(total = P_int + C_int),
aes(y = total), color = 'blue') +
scale_y_continuous(sec.axis = sec_axis(~./1)) +
theme_classic()
I have kept the scale of the secondary y-axis the same as the primary y-axis since they are in the same range, but you might need to adjust it according to your real data.
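If the two series are not in the same range, the usual trick is to multiply the line by a factor so it fits the primary axis and to divide by the same factor in sec_axis() so the labels still show the original values. Here is a rough sketch using the data from the question. The factor `k` and the helper frame `line_dat` are names I made up for illustration, and the choice of `k` (matching the two maxima) is only one possible convention.

```r
library(ggplot2)

# Data copied from the question.
df <- data.frame(
  S     = seq(10, 100, by = 10),
  P     = c(20, 550, 142, 45, 12, 5, 2, 4, 1, 0),
  C     = c(164, 709, 192, 61, 20, 6, 2, 1, 0, 1),
  P_int = c(72, 92, 97, 99, 99, 99, 99, 99, 10, 10),
  C_int = c(64, 89, 96, 98, 99, 99, 99, 99, 99, 99)
)

# Long format for the bars, built with base R to avoid extra dependencies.
final <- data.frame(
  S     = rep(df$S, 2),
  type  = rep(c("P", "C"), each = nrow(df)),
  value = c(df$P, df$C)
)

# One total per S for the line.
line_dat <- data.frame(S = df$S, total = df$P_int + df$C_int)

# Hypothetical scaling factor: stretches the line to the height of the tallest bar.
k <- max(final$value) / max(line_dat$total)

p <- ggplot(final, aes(S)) +
  geom_col(aes(y = value, fill = type), position = "dodge") +
  geom_line(data = line_dat, aes(y = total * k), colour = "blue") +
  # The secondary axis undoes the scaling, so its labels show the original totals.
  scale_y_continuous(sec.axis = sec_axis(~ . / k, name = "P_int + C_int")) +
  theme_classic()
```

Note that ggplot2 only supports secondary axes that are a one-to-one transformation of the primary axis, which is why the line has to be rescaled by hand.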

How to set the speed of the animation in R? [duplicate]

I have set up an animation using gganimate in R. The animation is just too fast. How do I set the speed of the frames?
Also, each point disappears when the next one is shown. Is it possible to show the points one by one so that at the end I have a full scatter plot?
How can I fix this?
This is the test data and my code so far:
#Test data
ID Weight Color Time
A 27 Red 1
A 11 Red 2
A 37 Red 3
A 49 Red 4
A 10 Red 5
A 25 Blue 6
A 49 Blue 7
A 20 Blue 8
A 21 Blue 9
A 36 Blue 10
A 24 Green 11
A 32 Green 12
A 47 Green 13
A 35 Green 14
A 24 Green 15
A 49 Yellow 16
A 42 Yellow 17
A 39 Yellow 18
A 22 Yellow 19
A 47 Yellow 20
#R code
library(plyr)
library(tidyr)
library(dplyr)
library(ggplot2)
library(gganimate)
p <- ggplot(dataset, aes(x=Color, y=Weight)) + geom_point()
order <- as.numeric(dataset$Time)
p +
transition_time(order) +
labs(title = "TIME: {frame_time}") + enter_fade()
Possible duplicate of: Control speed of a gganimation
In any case, I think this does what you want, if you play around with the fps argument a little:
p <- ggplot(dataset, aes(x=Color, y=Weight)) + geom_point()
order <- as.numeric(dataset$Time)
gif <- p +
transition_time(order) +
labs(title = "TIME: {frame_time}") + enter_fade()
animate(gif, fps = 2)
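For the second part of the question (keeping earlier points on screen), gganimate has shadow_mark(), which leaves past data visible as the animation advances. The following is a sketch under the assumption that the test data from the question is stored in a data frame called `dataset`. I dropped enter_fade() to keep it short, and the `nframes` value is just my choice to get one frame per time step.

```r
library(ggplot2)
library(gganimate)

# Test data copied from the question.
dataset <- data.frame(
  ID     = "A",
  Weight = c(27, 11, 37, 49, 10, 25, 49, 20, 21, 36,
             24, 32, 47, 35, 24, 49, 42, 39, 22, 47),
  Color  = rep(c("Red", "Blue", "Green", "Yellow"), each = 5),
  Time   = 1:20
)

anim <- ggplot(dataset, aes(x = Color, y = Weight)) +
  geom_point() +
  transition_time(Time) +
  shadow_mark() +  # leave earlier points on screen
  labs(title = "TIME: {frame_time}")

# 20 frames at 2 frames per second gives roughly a 10 second animation.
# Rendering needs a renderer package such as gifski, so the call is left commented out.
# animate(anim, nframes = 20, fps = 2)
```

The total duration is nframes / fps, so the speed can be lowered either by reducing fps or by raising nframes.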

ggplot facets: show annotated text in selected facets

I want to create a 2 by 2 faceted plot with a vertical line shared by the four facets. However, because the facets on top have the same date information as the facets at the bottom, I only want to have the vline annotated twice: in this case in the two facets at the bottom.
I looked, among other places, here, which does not work for me. (In addition, I doubt whether this is still valid code today.) I also looked here. I also looked up how to influence the font size in geom_text: according to the help pages this is size. In the case below it doesn't work out well.
This is my code:
library(ggplot2)
library(tidyr)
my_df <- read.table(header = TRUE, text =
"Date AM_PM First_Second Systolic Diastolic Pulse
01/12/2017 AM 1 134 83 68
01/12/2017 PM 1 129 84 76
02/12/2017 AM 1 144 88 56
02/12/2017 AM 2 148 93 65
02/12/2017 PM 1 131 85 59
02/12/2017 PM 2 129 83 58
03/12/2017 AM 1 153 90 62
03/12/2017 AM 2 143 92 59
03/12/2017 PM 1 139 89 56
03/12/2017 PM 2 141 86 56
04/12/2017 AM 1 140 87 58
04/12/2017 AM 2 135 85 55
04/12/2017 PM 1 140 89 67
04/12/2017 PM 2 128 88 69
05/12/2017 AM 1 134 99 67
05/12/2017 AM 2 128 90 63
05/12/2017 PM 1 136 88 63
05/12/2017 PM 2 123 83 61
")
# setting the classes right
my_df$Date <- as.Date(as.character(my_df$Date), format = "%d/%m/%Y")
my_df$First_Second <- as.factor(my_df$First_Second)
# to tidy format
my_df2 <- gather(data = my_df, key = Measure, value = Value,
-c(Date, AM_PM, First_Second), factor_key = TRUE)
# Measures in 1 facet, facets split over AM_PM and First_Second
## add annotations column for geom_text
my_df2$Annotations <- rep("", 54)
my_df2$Annotations[c(4,6)] <- "Start"
p2 <- ggplot(data = my_df2) +
ggtitle("Blood Pressure and Pulse as a function of AM/PM,\n Repetition, and date") +
geom_line(aes(x = Date, y = Value, col= Measure, group = Measure), size = 1.) +
geom_point(aes(x = Date, y = Value, col= Measure, group = Measure), size= 1.5) +
facet_grid(First_Second ~ AM_PM) +
geom_vline(aes(xintercept = as.Date("2017/12/02")), linetype = "dashed",
colour = "darkgray") +
theme(axis.text.x=element_text(angle = -90))
p2
yields this graph:
This is the basic plot from which I start. Now we try to annotate it.
p2 + annotate(geom="text", x = as.Date("2017/12/02"), y= 110, label="start", size= 3)
yielding this plot:
This plot has the problem that the annotation occurs 4 times, while we only want it in the bottom parts of the graph.
Now we use geom_text, which will use the "Annotations" column in our data frame, in line with this SO question. Be careful: the column added to the data frame must be present when you create "p2" the first time (that is why we added the column above).
p2 + geom_text(aes(x=as.Date("2017/12/02"), y=100, label = Annotations, size = .6))
yielding this plot:
Yes, we succeeded in getting the annotation only in the bottom two parts of the graph. But the font is too big ( ... and ugly) and when we try to correct it with size, two things are interesting: (1) the font size is not changed (although you would expect that from the help pages) and (2) a legend is added.
I have been clicking around a lot and have been unable to solve this after hours and hours. Any help would be appreciated.
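A possible way out, offered as a sketch and not a tested answer on the real data: the legend and the unchanged font size both come from putting size inside aes(), where it is treated as a mapped variable and rescaled by a size scale. Set outside aes(), it is a fixed text size. To label only some facets, geom_text can be given its own small data frame that contains the faceting columns for just those facets. The stand-in data `toy` and the frame `labels_df` below are made up for illustration. Only the column names follow my_df2.

```r
library(ggplot2)

# Small stand-in data set with the same faceting columns as my_df2 (values are made up).
toy <- expand.grid(
  Date         = as.Date("2017-12-01") + 0:4,
  AM_PM        = c("AM", "PM"),
  First_Second = factor(1:2)
)
toy$Value <- 100 + seq_len(nrow(toy))

# One row per facet that should carry a label: only the bottom row (First_Second == 2).
labels_df <- data.frame(
  Date         = as.Date("2017-12-02"),
  Value        = 110,
  AM_PM        = c("AM", "PM"),
  First_Second = factor(2, levels = 1:2),
  label        = "Start"
)

p <- ggplot(toy, aes(Date, Value)) +
  geom_line() +
  geom_vline(aes(xintercept = as.Date("2017-12-02")),
             linetype = "dashed", colour = "darkgray") +
  # size is set outside aes(), so it is a fixed text size and adds no legend
  geom_text(data = labels_df, aes(label = label), size = 3) +
  facet_grid(First_Second ~ AM_PM)
```

Because `labels_df` has rows only for First_Second == 2, the text is drawn only in the two bottom facets, while the vertical line still appears in all four.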

geom_bar labeling for melted data / stacked barplot

I have a problem with drawing stacked barplot with ggplot. My data looks like this:
timeInterval TotalWilling TotalAccepted SimID
1 16 12 Sim1
1 23 23 Sim2
1 63 60 Sim3
1 69 60 Sim4
1 61 60 Sim5
1 60 54 Sim6
2 16 8 Sim1
2 23 21 Sim2
2 63 52 Sim3
2 69 64 Sim4
2 61 45 Sim5
2 60 32 Sim6
3 16 14 Sim1
3 23 11 Sim2
3 63 59 Sim3
3 69 69 Sim4
3 61 28 Sim5
3 60 36 Sim6
I would like to draw a stacked barplot for each simID over a timeInterval, and Willing and Accepted should be stacked. I achieved the barplot with the following simple code:
dat <- read.csv("myDat.csv")
meltedDat <- melt(dat,id.vars = c("SimID", "timeInterval"))
ggplot(meltedDat, aes(timeInterval, value, fill = variable)) + facet_wrap(~ SimID) +
geom_bar(stat="identity", position = "stack")
I get the following graph:
My problem is that I would like to put percentages on each stack. That is, the label for the Willing part should be Willing/(Willing+Accepted) and the label for the Accepted part should be Accepted/(Accepted+Willing), so that I can see what percentage is willing and what percentage is accepted, for example 45 on the red part of a stack and 55 on the blue part. I cannot seem to achieve this kind of labeling.
Any hint is appreciated.
Adapted from Showing data values on stacked bar chart in ggplot2:
library(reshape2) # melt()
library(plyr) # ddply()
library(ggplot2)
meltedDat <- melt(dat,id.vars = c("SimID", "timeInterval"))
# share of each variable within its bar (same timeInterval and SimID)
meltedDat <- ddply(meltedDat, .(timeInterval, SimID), transform, normvalue = value / sum(value))
meltedDat$valuestr <- sprintf("%.2f%%", meltedDat$normvalue*100)
# position_stack(vjust = 0.5) centres each label in its own segment
ggplot(meltedDat, aes(timeInterval, value, fill = variable)) + facet_wrap(~ SimID) + geom_bar(stat="identity", position = "stack") + geom_text(aes(label=valuestr), position = position_stack(vjust = 0.5), size=2)
Also, it looks like you may have some of your variables coded as factors.
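To make the percentage idea concrete without myDat.csv, here is a small self-contained sketch. It uses the Sim1 to Sim3 rows for time intervals 1 and 2 from the question, builds the long format with base R in place of melt(), and computes each segment's share of its bar with ave(). The column name `share` and the helper `bar_total` are my own.

```r
library(ggplot2)

# Subset of the data from the question (Sim1 to Sim3, time intervals 1 and 2).
dat <- data.frame(
  timeInterval  = rep(1:2, each = 3),
  TotalWilling  = c(16, 23, 63, 16, 23, 63),
  TotalAccepted = c(12, 23, 60, 8, 21, 52),
  SimID         = rep(c("Sim1", "Sim2", "Sim3"), times = 2)
)

# Long format built by hand (same shape as the melted data for these columns).
meltedDat <- data.frame(
  SimID        = rep(dat$SimID, 2),
  timeInterval = rep(dat$timeInterval, 2),
  variable     = rep(c("TotalWilling", "TotalAccepted"), each = nrow(dat)),
  value        = c(dat$TotalWilling, dat$TotalAccepted)
)

# Share of each segment within its bar (same SimID and timeInterval).
bar_total <- ave(meltedDat$value, meltedDat$SimID, meltedDat$timeInterval, FUN = sum)
meltedDat$share    <- meltedDat$value / bar_total
meltedDat$valuestr <- sprintf("%.1f%%", 100 * meltedDat$share)

p <- ggplot(meltedDat, aes(timeInterval, value, fill = variable)) +
  geom_col(position = "stack") +
  # position_stack(vjust = 0.5) centres each label inside its own segment
  geom_text(aes(label = valuestr), position = position_stack(vjust = 0.5), size = 2) +
  facet_wrap(~ SimID)
```

For Sim1 at time interval 1 this gives 16 / (16 + 12), so the Willing segment is labelled "57.1%".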

ggplot2 stacked area line charts producing odd lines and holes

I have a data set that's structured as follows:
year color toyota honda ford
2011 blue 66 75 13
2011 red 75 91 62
2011 green 65 26 57
2012 blue 64 23 10
2012 red 84 8 62
2012 green 67 21 62
2013 blue 31 74 49
2013 red 48 43 35
2013 green 57 62 74
2014 blue 59 100 32
2014 red 72 47 67
2014 green 97 24 70
2015 blue 31 0 79
2015 red 60 35 74
2015 green 51 2 28
(My actual data, presented in the chart images below, is much larger and has 100s of "colors" but I'm simplifying here so you can merely understand the structure.)
I am trying to make a stacked area line chart that shows how many cars of each color are produced over time for a specific company. (i.e. each company has its own chart in which x axis = years, y axis = cars produced).
I run this code:
qplot(year, toyota, data = dataName, fill = color, group = color, geom= "area", position = "stack")
+ geom_area() + theme(legend.position = "none")
However, every company's chart has issues. There are seemingly random cut-out holes as well as lines that cut across the top of the layers.
company1_chart
company2_chart
I'm confused why this is happening or even possible (especially the holes... won't the data stack down?) Would it help if I made the companies long rather than wide in the data structure?
Even with 0 values, you should not have those errors. I took your data and added 0's in the honda column sporadically.
The code (using ggplot2)
library(ggplot2)
df <- read.csv("cartest.csv", header = TRUE)
ggplot(data=df,aes(x=year,y=h,fill=color)) +
geom_area() +
ggtitle("car test")
If you are importing your data as a CSV or TSV and your data columns are numeric, you should not have this issue. If a column was imported as character, you can convert it using:
df$h <- as.numeric(df$h)
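Another thing that may be worth checking, and this is an assumption on my part since the full data is not shown: stacked areas can show gaps or crossing lines when some colors have no row for some years, because the stack is then built from a different set of groups at different x positions. A sketch of filling in the missing year/color combinations with 0 before plotting follows. The data is a cut-down version of the question's toyota column with one row removed on purpose, and the names `cars`, `grid` and `full` are made up.

```r
library(ggplot2)

# Cut-down data in the question's layout, with one row deliberately missing
# (no "red" row for 2012) to mimic an incomplete year/color grid.
cars <- data.frame(
  year   = c(2011, 2011, 2011, 2012, 2012, 2013, 2013, 2013),
  color  = c("blue", "red", "green", "blue", "green", "blue", "red", "green"),
  toyota = c(66, 75, 65, 64, 67, 31, 48, 57)
)

# Build the full year x color grid and fill the gaps with 0.
grid <- expand.grid(year = unique(cars$year), color = unique(cars$color),
                    stringsAsFactors = FALSE)
full <- merge(grid, cars, all.x = TRUE)  # joins on the shared columns year and color
full$toyota[is.na(full$toyota)] <- 0
full <- full[order(full$color, full$year), ]

p <- ggplot(full, aes(year, toyota, fill = color, group = color)) +
  geom_area(position = "stack") +
  theme(legend.position = "none")
```

If the number of rows in your data differs from (number of years) x (number of colors), the grid is incomplete and a step like this may help.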

Resources