histogram: printing group labels [duplicate] - r

This question already has an answer here:
How to change panel labels and x-axis sublabels in a lattice bwplot
(1 answer)
Closed 8 years ago.
I am using the following histogram command to visualize the features of a labeled dataset that has binary labels (0 or 1).
require(lattice)
data <- data.frame(num_child=1:10,label=rep(0:1,each=5))
histogram( ~ data$num_child | data$label ,xlab="Number of children")
I get a pair of histogram plots, as expected, with the x-axis labeled "Number of children" and the y-axis labeled "Percent of Total". However, the label on top of both plots is "data$label" rather than the value of the group label. The histogram command takes xlab and ylab as parameters, but does not seem to have a parameter for the group label. How can I get the group label (i.e. "0" and "1") to be printed?

Looks like the easiest solution is to change your grouping to a factor:
histogram( ~ data$num_child | as.factor(data$label),xlab="Number of children")
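A slightly fuller sketch of the same idea, assuming the toy data from the question: lattice treats a numeric conditioning variable as a shingle, so converting it to a factor first makes each strip show its level. The data frame name df and the custom strip text ("label 0", "label 1") are illustrative choices, not part of the original answer.

```r
library(lattice)

# Same toy data as in the question, stored under the (assumed) name df
df <- data.frame(num_child = 1:10, label = rep(0:1, each = 5))

# Convert the grouping column to a factor so the strips show "0" and "1"
df$label <- factor(df$label)

# Passing data = df keeps "data$..." out of the formula
p <- histogram(~ num_child | label, data = df, xlab = "Number of children")
print(p)

# Optional: replace the strip text with custom wording (illustrative labels)
p2 <- histogram(~ num_child | label, data = df,
                xlab = "Number of children",
                strip = strip.custom(factor.levels = c("label 0", "label 1")))
print(p2)
```

The factor conversion alone is enough for the strips to read "0" and "1"; strip.custom() is only needed if different wording is wanted.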

Related

Creating ggplot scatterplot with two subsets [duplicate]

This question already has answers here:
Plot with multiple lines in different colors using ggplot2
(1 answer)
ggplot combining two plots from different data.frames
(3 answers)
Multiple plots with variable color in R ggplot
(1 answer)
Closed 6 months ago.
I have 2 dataframes, "hot", and "cold" that I want to plot on the same ggplot scatterplot, in 2 different colours.
I want the "temperature" variable on the x-axis and "depth" variable on the y-axis.
I then want both dataframes plotted on this graph, with "hot" shown by red points and "cold" shown by blue points.
I have already plotted them individually, but can't figure out how to combine them into one plot. Example of the code used for the individual plots:
ggplot(hot, aes(x=temp, y=depth)) +
geom_point(size=0.1) +
labs(title="Depth Profile of Temperature in the Warm Months",x="Temperature (degC)", y = "Depth (m)") +
scale_y_reverse()
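One common way to do this, sketched here under assumptions: tag each data frame with its origin, stack the two with rbind(), and map the tag to colour. The hot and cold data frames below are made-up stand-ins with temp and depth columns, and the season column name is hypothetical.

```r
library(ggplot2)

# Made-up stand-ins for the asker's data frames (assumed columns: temp, depth)
hot  <- data.frame(temp = c(18, 16, 12), depth = c(5, 20, 60))
cold <- data.frame(temp = c(8, 7, 6),   depth = c(5, 20, 60))

# Tag each data frame with its origin, then stack them into one
hot$season  <- "hot"
cold$season <- "cold"
both <- rbind(hot, cold)

# Map the tag to colour and pick red/blue explicitly
p <- ggplot(both, aes(x = temp, y = depth, colour = season)) +
  geom_point(size = 0.1) +
  scale_colour_manual(values = c(hot = "red", cold = "blue")) +
  labs(x = "Temperature (degC)", y = "Depth (m)", colour = "Data set") +
  scale_y_reverse()
print(p)
```

An alternative that avoids stacking is to give each geom_point() layer its own data argument and a fixed colour, at the cost of losing the automatic legend.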

Multiple line on one Chart in R [duplicate]

This question already has answers here:
Plotting two variables as lines using ggplot2 on the same graph
(5 answers)
Closed 3 years ago.
I am trying to plot multiple lines on one chart.
They all have the same x-axis in months, but different observations for the y-axis.
I have tried writing this code but I keep getting an error.
Can someone point me towards what I am doing wrong?
"test3" is the name of my data set, "oil_1" is the first y observation, "oil_2" is the second, and "Months" is the x-axis.
ggplot(test3, + aes(x = Months)) +
geom_line(aes(y=oil_1),colour="blue")+
geom_line(aes(y=oil_2),colour="red") +
ylab(label="Production")+
xlab("Months")
You have an extra "+" before the aes() call inside ggplot(); that could be the problem.
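As a sketch of that fix, here is the same plot with the stray "+" removed. The test3 data frame is invented for illustration; only the column names Months, oil_1 and oil_2 come from the question.

```r
library(ggplot2)

# Invented data using the column names from the question
test3 <- data.frame(
  Months = 1:6,
  oil_1  = c(10, 12, 11, 14, 15, 13),
  oil_2  = c(8, 9, 12, 11, 13, 16)
)

# aes() goes directly inside ggplot(), with no "+" in front of it
p <- ggplot(test3, aes(x = Months)) +
  geom_line(aes(y = oil_1), colour = "blue") +
  geom_line(aes(y = oil_2), colour = "red") +
  ylab("Production") +
  xlab("Months")
print(p)
```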

How to use errorbars and correct dodging in a barplot with uneven groups in ggplot2? [duplicate]

This question already has answers here:
Consistent width for geom_bar in the event of missing data
(3 answers)
Consistent width of boxplots if missing data by group?
(1 answer)
Closed 3 years ago.
I'm trying to make a barplot in ggplot with three "people" on the x-axis and a proportion on the y. Two of the people will have two bars and the third person will have three. I've successfully grouped the bars correctly, but am having trouble getting error bars to dodge correctly.
I've tried dodging the error bars the same way I did with the bars themselves, but they don't dodge wide enough; they're all clustered toward the center of the group.
library(ggplot2)
#building a fake dataset
person <- c("Person 1", "Person 1", "Person 2", "Person 2",
            "Person 3", "Person 3", "Person 3")
prop <- as.numeric(c("0.1","0.64","0.43","0.84","0.9","0.23","0.26"))
class <- c("a","b","a","b","a","b","c")
err <- as.numeric(c("0.01","0.01","0.03","0.08","0.01","0.08","0.03"))
dat <- as.data.frame(person)
dat$prop <- prop
dat$class <- class
dat$err <- err
#plotting
p = ggplot(data = dat, aes(x = person, y = prop, fill = class)) +
geom_bar(stat="identity",position=position_dodge(0.9,preserve='single'),
colour="black",lwd=1)+
geom_errorbar(stat='identity',aes(ymin=prop-err, ymax=prop+err),
position=position_dodge(0.9),width=0.15, size=1)+
labs(x="person", y="proportion",fill="class")
p
As you can see if you run that code, the error bars plot correctly onto person 3's bars, but neither of the others. Do I need to add rows to the dataframe to account for the missing data? Or is it an issue with my coding?
Thanks!
I believe you also need to add the preserve = "single" argument to position_dodge() in your geom_errorbar() call.
The code
ggplot(data=dat, aes(x=person, y=prop,fill=class)) +
geom_bar(stat="identity",position=position_dodge(0.9,preserve='single'),colour="black",lwd=1) +
geom_errorbar(stat='identity',aes(ymin=prop-err, ymax=prop+err), position=position_dodge(0.9, preserve = "single"),width=0.15, size=1) +
labs(x="person", y="proportion",fill="class")
gives a plot in which each error bar sits on its own bar.

Revisiting R+ggplot+geom_bar+scale_x_continuous+limits: leftmost and rightmost bars not showing on plot

Please don't tag this as a duplicate of "R+ggplot+geom_bar+scale_x_continuous+limits: leftmost and rightmost bars not showing on plot": some people commented that the example there was too long/convoluted/weird, so here is a simpler example that reproduces the problem. If a moderator thinks it is a good idea, I will delete the original (longer) question.
I am trying to create a function that draws a stacked bar plot of some yearly measures. The function takes as parameters the data and the min and max year I want to plot. The problem is that for some combinations of years the bars get weird.
Here is the code; it defines the function, creates a simple simulated dataset and creates four plots with different parameters. The resulting images are below.
library(ggplot2)
library(plyr)
# Plot either all data or select by name.
doPlot <- function(data,minYear,maxYear) {
title = paste("Bob's Performance ",minYear,"-",maxYear)
# Aggregate quantity by year and category
byYear <- aggregate(Quantity ~ Year+Category, data, sum)
# Get coordinates for numbers in stacked bars
byYear = ddply(byYear, "Year", mutate, label_y = cumsum(Quantity))
g <- ggplot(byYear, aes(x=Year,y=Quantity))
g <- g + geom_bar(stat="identity",aes(fill=Category), colour="black") +
ggtitle(title) +
scale_fill_discrete("Category",labels=c("Sheep","Cactus","Chicken"),drop=FALSE,c=45, l=80)+
scale_x_continuous(name="Year", limits=c(minYear,maxYear), breaks=seq(minYear,maxYear,1)) +
geom_text(aes(label=Quantity,y=label_y), vjust=1.3,size=6)
print(g)
}
consts = paste('"Category","Year","Name","Quantity"\n',
'CACTUS,1997,Bob,45\n',
'CHICKEN,1997,Bob,6\n',
'SHEEP,1998,Bob,2\n',
'SHEEP,1999,Bob,4\n',
'SHEEP,2005,Bob,5\n',sep = "")
data <- read.csv(text=consts,header = TRUE)
data$Category <- factor(data$Category, levels = c("SHEEP", "CACTUS", "CHICKEN"))
# This works OK
doPlot(data,1996,2006)
# This doesn't: the leftmost and rightmost bars disappear
doPlot(data,1997,2005)
# This doesn't: the left bar disappears; it seems it was not plotted at all
doPlot(data,1998,2000)
# This is weird: why does the bar width span over 5 years?
doPlot(data,1999,2011)
The first plot is OK since the data is all inside the years range:
In the second plot the years range is exactly the same as the range of years in the data. The leftmost and rightmost bars are not plotted, but the numbers are.
In the third plot the year range is very narrow -- again leftmost and/or rightmost bars are not plotted. There's a hint here that the bar width could not be fitted in the plot -- see the width for 1999!
In the fourth plot the year range is wider, but again the leftmost and/or rightmost bars are not plotted, and the one bar that is plotted covers several years.
I can make the plot sort of work by always using an extended range of years, but this is bugging me. I guess I didn't specify something that controls the bar widths, but what?
I noticed that there are similar problems with the leftmost and rightmost bars, e.g. "In ggplot2 - how to ensure geom_errorbar displays bar limits for all points when controlling x-axis with xlim()", and the solutions are similar, but I believe there ought to be a better way.
I must point out that using
scale_x_continuous(name="Year", breaks=seq(minYear,maxYear,1)) +
coord_cartesian(xlim=c(minYear,maxYear)) +
instead of
scale_x_continuous(name="Year", limits=c(minYear,maxYear),breaks=seq(minYear,maxYear,1)) +
solves the "bar over several years" issue of the fourth plot, but causes parts of the leftmost/rightmost bars to be plotted:
thanks
Rafael
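A possible explanation, based on documented ggplot2 behaviour rather than anything confirmed in this thread: limits set on scale_x_continuous() drop data that fall outside the range, and a bar centered on minYear extends to minYear minus half its width, so it is removed. The default bar width is 90% of the resolution of the data that remain, which would account for the very wide bar in the 1999 to 2011 plot. The sketch below fixes the bar width explicitly and zooms with coord_cartesian() padded by half a bar on each side. The names doPlotPadded, barWidth and perf are hypothetical, and the text labels from the original function are left out to keep it short.

```r
library(ggplot2)

# Same simulated data as in the question, stored under the (assumed) name perf
consts <- paste('"Category","Year","Name","Quantity"\n',
                'CACTUS,1997,Bob,45\n',
                'CHICKEN,1997,Bob,6\n',
                'SHEEP,1998,Bob,2\n',
                'SHEEP,1999,Bob,4\n',
                'SHEEP,2005,Bob,5\n', sep = "")
perf <- read.csv(text = consts, header = TRUE)
perf$Category <- factor(perf$Category, levels = c("SHEEP", "CACTUS", "CHICKEN"))

# Hypothetical variant of doPlot(): explicit bar width plus a padded zoom window
doPlotPadded <- function(data, minYear, maxYear, barWidth = 0.9) {
  byYear <- aggregate(Quantity ~ Year + Category, data, sum)
  ggplot(byYear, aes(x = Year, y = Quantity, fill = Category)) +
    # A fixed width stops the bars from scaling with the gaps between years
    geom_col(width = barWidth, colour = "black") +
    # No limits here, so no bars are dropped by the scale
    scale_x_continuous(name = "Year", breaks = seq(minYear, maxYear, 1)) +
    # Zoom only, leaving room for half a bar beyond the first and last year
    coord_cartesian(xlim = c(minYear - barWidth / 2, maxYear + barWidth / 2))
}

g1 <- doPlotPadded(perf, 1997, 2005)
g2 <- doPlotPadded(perf, 1999, 2011)
print(g1)
print(g2)
```

Because coord_cartesian() only zooms, bars for years outside the requested window are clipped from view rather than removed from the data.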

Grouped Bar Plot Species and plots

I'd like to make a grouped barplot that has two groups, one named Exotic Species and the other Native Species, and compare them across the plots in which they are found. Three columns are therefore involved in the graph. Y would be "Species Richness", the number of species that are either native or exotic. X would be the "Plot name". How do I write the code for the bar graph I described above? If you Google "European Parliament Elections R grouped barplot" (the orange and purple plot), that's what I want.
If I understand your question correctly, here are two solutions:
1. using barplot: let's say mtcars is the data frame
# Grouped Bar Plot
counts <- table(mtcars$vs, mtcars$gear)
barplot(counts, main="Car Distribution by Gears and VS",
xlab="Number of Gears", col=c("darkblue","red"),
legend = rownames(counts), beside=TRUE)
2. using lattice
library(lattice)
barchart(Species~Reason,data=Reasonstats,groups=Catergory,
scales=list(x=list(rot=90,cex=0.8)))
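Applied to the species question itself, the base barplot approach might look like the sketch below. The richness data frame, its column names (plot, group, n_species) and all the numbers are invented for illustration; the real data would need one row per plot and group.

```r
# Invented data: species richness per plot for each group
richness <- data.frame(
  plot      = rep(c("Plot A", "Plot B", "Plot C"), each = 2),
  group     = rep(c("Exotic", "Native"), times = 3),
  n_species = c(3, 12, 5, 9, 2, 14)
)

# Reshape to a group-by-plot table, the layout barplot(beside = TRUE) expects
counts <- xtabs(n_species ~ group + plot, data = richness)

# One cluster of bars per plot, one bar per group within each cluster
barplot(counts, beside = TRUE,
        main = "Species Richness by Plot",
        xlab = "Plot name", ylab = "Species Richness",
        col = c("orange", "purple"),
        legend.text = rownames(counts))
```

If the raw data list individual species rather than counts, table() on the group and plot columns would build the same kind of matrix.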
