I have some count variables that I want to draw as bar plots sharing the same y-axis, but I have no grouping variable. Something like the following plot:
B <- 25
iter_M1
[1] 5 13 14 11 7 8 10 14 10 5 7 13 10 12 4 5 9 6 5 12 8 8 7 11 9
max_M1 <- max(iter_M1)
count_M1 <- integer(max_M1)
for(i in 1:max_M1)
{
for(j in 1:B)
{
if(iter_M1[j] == i)
count_M1[i] = count_M1[i] +1
}
}
count_M1
[1] 0 0 0 1 4 1 3 3 2 3 2 2 2 2
df <- data.frame(x = 1:max_M1, y = count_M1)
library(ggplot2)
p_M1 <- ggplot(data=df, aes(x=x, y=y)) + geom_bar(stat="identity")
p_M1
This results in a plot like this
and another similar variable
iter_M2
[1] 3 1 3 2 6 3 4 4 3 7 4 2 2 3 4 3 4 4 1 3 7 3 2 4 2
max_M2 <- max( iter_M2)
count_M2 <- integer(max_M2)
for(i in 1:max_M2)
{
for(j in 1:B)
{
if(iter_M2[j] == i)
count_M2[i] = count_M2[i] +1
}
}
count_M2
[1] 2 5 8 7 0 1 2
df1 <- data.frame(x1 = 1:max_M2, y1 = count_M2)
p_M2 <- ggplot(data=df1, aes(x=x1, y=y1)) +
geom_bar(stat="identity")
p_M2
which results in a second plot as
and similar variables like these... How can I plot this data side by side? Also, the way I've generated the data currently, there is no common y-axis across the plots. Are there any suggestions for generating such a plot, or for putting the dataset in another format to achieve the required plot?
As suggested in the comments, making a factor (class) is the easiest way, since it lets you facet the plot.
But you seem to want only a shared y-axis, which is achievable with the scale limits. For example, build a vector of limits from the maximum count across both variables and then use it in each plot.
ylimits <- c(0, max(c(count_M1, count_M2)))
p_M1 + ylim(ylimits)
p_M2 + ylim(ylimits)
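For completeness, here is a minimal sketch of the faceting route mentioned above. The names counts, method and p_both are my own and not from the question, and tabulate() is swapped in for the nested counting loops (it counts how often each of 1..max occurs). facet_wrap() keeps the y-axis fixed across panels by default, as far as I know.

```r
library(ggplot2)

iter_M1 <- c(5, 13, 14, 11, 7, 8, 10, 14, 10, 5, 7, 13, 10,
             12, 4, 5, 9, 6, 5, 12, 8, 8, 7, 11, 9)
iter_M2 <- c(3, 1, 3, 2, 6, 3, 4, 4, 3, 7, 4, 2, 2,
             3, 4, 3, 4, 4, 1, 3, 7, 3, 2, 4, 2)

# One long data frame; 'method' is the grouping factor the question lacked
# (hypothetical column name, chosen for this sketch)
counts <- rbind(
  data.frame(method = "M1", x = seq_len(max(iter_M1)), y = tabulate(iter_M1)),
  data.frame(method = "M2", x = seq_len(max(iter_M2)), y = tabulate(iter_M2))
)
counts$method <- factor(counts$method)

# One panel per method, side by side, sharing the y-axis
p_both <- ggplot(counts, aes(x = x, y = y)) +
  geom_bar(stat = "identity") +
  facet_wrap(~ method)
p_both
```

If each panel should keep its own x range, facet_wrap(~ method, scales = "free_x") is one option to try.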
I have the following dataframe t :
name type total
a 1 20
a 1 20
a 3 20
a 2 20
a 3 20
b 1 25
b 2 25
c 5 35
c 5 35
c 6 35
c 1 35
The total is identical for all entries with the same name.
I want to plot a stacked barplot with type on the x-axis and the count of name, normalized by the total, on the y-axis.
I plotted the non-normalized plot with the following:
ggplot(t, aes(type,fill= name))+geom_bar() + geom_bar(position="fill")
How can I plot the normalized barplot? That is, for type = 1 the y-axis value would be 2/20 for a, 1/25 for b, and 1/35 for c.
My attempt, which did not work:
ggplot(t, aes(type, ..count../t$total[1],fill= name))+geom_bar() + geom_bar(position="fill")
Read in the data
d <- read.table(header = TRUE, text =
'name type total
a 1 20
a 1 20
a 3 20
a 2 20
a 3 20
b 1 25
b 2 25
c 5 35
c 5 35
c 6 35
c 1 35')
It's a bad idea to call it t, since that is the name of the transpose function.
Calculate the fractions
library(dplyr)
d2 <- d %>%
group_by(name, type) %>%
summarize(frac = n() / first(total))
This is much easier to do using the dplyr package.
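If you would rather not depend on dplyr, the same fractions can be sketched in base R. This is only an illustration under my own naming (the helper column n and the result counts are not from the question): it counts rows per name/type pair with aggregate() and divides by that name's total.

```r
d <- read.table(header = TRUE, text = "name type total
a 1 20
a 1 20
a 3 20
a 2 20
a 3 20
b 1 25
b 2 25
c 5 35
c 5 35
c 6 35
c 1 35")

# Helper column of ones so that summing it gives the row count per group
d$n <- 1
counts <- aggregate(n ~ name + type + total, data = d, FUN = sum)

# Normalize each count by the total belonging to that name
counts$frac <- counts$n / counts$total
counts[order(counts$name, counts$type), ]
```

For type 1 this gives 2/20 for a, 1/25 for b and 1/35 for c, which matches the values asked for in the question.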
Make the plot
ggplot(d2, aes(type, frac, fill = name)) +
geom_bar(stat = 'identity')
Result
I have the following data frame
Value Phase
22 1
23 1
40 1
19 2
17 2
16 2
12 3
13 3
14 3
9 4
7 4
6 4
I want to see how the sum of Value changes from one phase to the next. The Phase column can range from 1 to 5. Going from phase 1 to 2 to 3 and so on, I want to see whether the sum of Value for each phase increases or decreases. I want to use the base plotting system. How can I plot the graph so that the change between phases is clear?
Here is how to do a line + scatter plot of the sums of Value for each value in Phase. First you need to aggregate the data by Phase. I'm providing both a base R solution (as you requested) and a ggplot solution.
df <- read.table(text = "Value Phase
22 1
23 1
40 1
19 2
17 2
16 2
12 3
13 3
14 3
9 4
7 4
6 4", header = TRUE)
sums <- aggregate(Value ~ Phase, df, sum, na.rm = TRUE)
png("sums.png", height = 540, width = 540)
plot(sums$Phase, sums$Value, xlab = "Phase", ylab = "Sum of Value")
lines(sums$Phase, sums$Value, type = "l")
dev.off()
# ggplot method
require(ggplot2)
ggplot(sums, aes(x = Phase, y = Value)) + geom_point() + geom_line()
ggsave("sums-ggplot.png")
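Since the question is really about whether the sum goes up or down between phases, it may also help to look at the differences directly. The following is a rough base R sketch, not part of the original answer; the name change and the "1->2" style labels are my own choices.

```r
df <- data.frame(
  Value = c(22, 23, 40, 19, 17, 16, 12, 13, 14, 9, 7, 6),
  Phase = rep(1:4, each = 3)
)
sums <- aggregate(Value ~ Phase, df, sum)

# Difference between each phase's sum and the previous phase's sum;
# negative values mean a decrease
change <- diff(sums$Value)
names(change) <- paste(head(sums$Phase, -1), tail(sums$Phase, -1), sep = "->")
change

# Bars below zero are decreases, bars above zero are increases
barplot(change, xlab = "Phase transition", ylab = "Change in sum of Value")
```

With the sample data every transition is a decrease, so all bars point downward.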
I'm trying to plot three data series in a single plot. The X and Y coordinates of each series are in separate columns in my data frame:
X1 Y1 X2 Y2 X3 Y3
1 0 1 0 2 0 3
2 1 2 1 3 1 4
3 2 3 2 4 2 5
4 3 4 3 5 3 6
5 4 5 4 6 4 7
6 5 6 5 7 5 8
7 6 7 6 8 6 9
8 0 0 7 9 7 8
9 0 0 8 8 0 0
10 0 0 9 7 0 0
Since the trailing (0,0) data points of each series are invalid, only this subset of points should eventually be plotted:
   X1 Y1 X2 Y2 X3 Y3
1   0  1  0  2  0  3
2   1  2  1  3  1  4
3   2  3  2  4  2  5
4   3  4  3  5  3  6
5   4  5  4  6  4  7
6   5  6  5  7  5  8
7   6  7  6  8  6  9
8         7  9  7  8
9         8  8
10        9  7
Additionally, the X-axis of the first series should be inverted:
Even without cleaning up the data frame first, I struggled to plot the column pairs as individual series in ggplot2 (see 'legend').
require(ggplot2)
report <- function(df){
plot = ggplot(data=df, aes(x=-X1, y=Y1, size=3)) + #inverted X-axis of series 1
layer(geom="point") +
geom_point(aes(X2, Y2, colour="red", size=2)) +
geom_point(aes(X3, Y3, colour="blue", size=1)) +
xlab("X") + ylab("Y")
print(plot)
}
X1 = c(0,1,2,3,4,5,6,0,0,0)
Y1 = c(1,2,3,4,5,6,7,0,0,0)
X2 = c(0,1,2,3,4,5,6,7,8,9)
Y2 = c(2,3,4,5,6,7,8,9,8,7)
X3 = c(0,1,2,3,4,5,6,7,0,0)
Y3 = c(3,4,5,6,7,8,9,8,0,0)
df <- data.frame(X1,Y1,X2,Y2,X3,Y3)
colnames(df) <- c("X1","Y1","X2","Y2","X3","Y3")
report(df)
What would be the best way to get rid of the invalid (0,0) data points in each series, and how should I plot them properly?
I think you actually want to transform your data.frame in order to make your ggplot call more concise. Here is an updated version that plots your data correctly, using the dplyr package to transform the data.
In response to the comment requesting more information on dplyr: it provides the %>% operator, which simply passes the value on its left into the function on its right as the first argument. This allows for much more readable R code. The mutate function adds the Series variable, set manually from the knowledge of which points belong to which series. Then the filter function removes the (0,0) points which you indicated were not wanted. You can inspect df after these operations to see the final output. Hopefully this helps in interpreting the code below. See the dplyr documentation for further details.
library(dplyr)
df <- rbind.data.frame(
data.frame(X=-X1, Y=Y1),
data.frame(X=X2, Y=Y2),
data.frame(X=X3, Y=Y3))
df <- df %>%
mutate(Series=rep(c('S1', 'S2', 'S3'), each=10)) %>%
filter(!(X == 0 & Y == 0))
png('foo.png')
ggplot(df) + geom_point(aes(x=X, y=Y, color=Series, size=Series))
dev.off()
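For readers who do not want to install dplyr, the reshaping step above can be approximated in base R. This is a sketch under my own naming (the data frame long is not from the original answer); subset() plays the role of filter(), and the Series label is attached while stacking rather than with mutate().

```r
X1 <- c(0, 1, 2, 3, 4, 5, 6, 0, 0, 0)
Y1 <- c(1, 2, 3, 4, 5, 6, 7, 0, 0, 0)
X2 <- c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
Y2 <- c(2, 3, 4, 5, 6, 7, 8, 9, 8, 7)
X3 <- c(0, 1, 2, 3, 4, 5, 6, 7, 0, 0)
Y3 <- c(3, 4, 5, 6, 7, 8, 9, 8, 0, 0)

# Stack the three X/Y column pairs into one long data frame
long <- rbind(
  data.frame(Series = "S1", X = -X1, Y = Y1),  # series 1 with inverted X
  data.frame(Series = "S2", X = X2, Y = Y2),
  data.frame(Series = "S3", X = X3, Y = Y3)
)

# Drop the invalid (0, 0) padding points
long <- subset(long, !(X == 0 & Y == 0))

# Number of valid points left in each series
table(long$Series)
```

The resulting long can be passed to the same ggplot call in place of df.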
Also, if you want to manually set the values of color and size, as well as add the lines as in your ideal example plot, here is a more complex ggplot command:
ggplot(df, aes(x=X, y=Y, color=Series, size=Series)) +
geom_point() + geom_line(size=1) + theme_bw() +
scale_color_manual(values=c('black', 'red', 'blue')) +
scale_size_manual(values=seq(4,2,-1))
It's my first day learning R and ggplot. I've followed some tutorials and would like to make plots like the one generated by the following command:
qplot(age, circumference, data = Orange, geom = c("point", "line"), colour = Tree)
It looks like the figure on this page:
http://www.r-bloggers.com/quick-introduction-to-ggplot2/
I had a handmade test data file I created, which looks like this:
site temp humidity
1 1 1 3
2 1 2 4.5
3 1 12 8
4 1 14 10
5 2 1 5
6 2 3 9
7 2 4 6
8 2 8 7
but when I try to read and plot it with:
test <- read.table('test.data')
qplot(temp, humidity, data = test, color=site, geom = c("point", "line"))
the lines on the plot aren't separate series, but link together:
http://imgur.com/weRaX
What am I doing wrong?
Thanks.
You need to tell ggplot2 how to group the data into separate lines. It's not a mind reader! ;)
dat <- read.table(text = " site temp humidity
1 1 1 3
2 1 2 4.5
3 1 12 8
4 1 14 10
5 2 1 5
6 2 3 9
7 2 4 6
8 2 8 7",sep = "",header = TRUE)
qplot(temp, humidity, data = dat, group = site,color=site, geom = c("point", "line"))
Note that you probably also wanted to do color = factor(site) in order to force a discrete color scale, rather than a continuous one.
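As far as I know, qplot() has been deprecated in newer ggplot2 releases (3.4.0 onward), so here is a sketch of the same plot written with ggplot(). It converts site to a factor up front, which gives both the discrete color scale and the grouping; the object name p is my own.

```r
library(ggplot2)

dat <- read.table(text = "site temp humidity
1 1 1 3
2 1 2 4.5
3 1 12 8
4 1 14 10
5 2 1 5
6 2 3 9
7 2 4 6
8 2 8 7", header = TRUE)

# Treat site as a category rather than a number
dat$site <- factor(dat$site)

# group = site draws one line per site; colour = site gives a discrete legend
p <- ggplot(dat, aes(x = temp, y = humidity, colour = site, group = site)) +
  geom_point() +
  geom_line()
p
```

With site stored as a factor, the explicit group = site is usually redundant, but spelling it out makes the intent clear.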