I have a data frame with (to simplify) judges, movies, and ratings (ratings are on a 1 star to 5 star scale):
d = data.frame(judge=c("alice","bob","alice"), movie=c("toy story", "inception", "inception"), rating=c(1,3,5))
I want to create a bar chart where the x-axis is the number of stars and the height of each bar is the number of ratings with that star.
If I do
ggplot(d, aes(rating)) + geom_bar()
this works fine, except that the bars aren't centered over each rating and the width of each bar isn't ideal.
If I do
ggplot(d, aes(factor(rating))) + geom_bar()
the order of the number of stars gets messed up on the x-axis. (On my Mac, at least; for some reason, the default ordering works on a Windows machine.) Here's what it looks like:
I tried
ggplot(d, aes(factor(rating, ordered=T, levels=-3:3))) + geom_bar()
but this doesn't seem to help.
How can I get my bar chart to look like the above picture, but with the correct ordering on the x-axis?
I'm not sure your sample data frame is representative of the images you put up. You mentioned your ratings are on a 1-5 scale, but your images show a -3 to 3 scale. With that said, I think this should get you going in the right direction:
Sample data:
d = data.frame(judge = sample(c("alice", "bob", "tony"), 100, replace = TRUE),
               movie = sample(c("toy story", "inception", "a league of their own"), 100, replace = TRUE),
               rating = sample(1:5, 100, replace = TRUE))
You were closest with this:
ggplot(d, aes(rating)) + geom_bar()
and by adjusting the default binwidth in geom_bar we can make the bar widths more appropriate, while treating rating as a factor centers each bar over its label:
ggplot(d, aes(x = factor(rating))) + geom_bar(binwidth = 1)
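For what it's worth, newer ggplot2 releases (2.0.0 onward, as far as I know) no longer accept binwidth in geom_bar(), so the calls here may need adjusting on a current install. Here is a minimal sketch of the same idea under that assumption: build the factor with explicit levels, which pins the x-axis order whatever the platform default is, and set the bar width with width. The seed, the rating_f column name and the 0.9 width are my own choices for illustration and are not from the question.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
d <- data.frame(rating = sample(1:5, 100, replace = TRUE))

# Explicit levels fix the order of the categories on the x-axis
d$rating_f <- factor(d$rating, levels = 1:5)

p <- ggplot(d, aes(x = rating_f)) +
  geom_bar(width = 0.9) +            # width is relative to one category slot
  scale_x_discrete(drop = FALSE) +   # keep star values that have no ratings
  labs(x = "Stars", y = "Number of ratings")

print(p)
```

If the real data uses a -3 to 3 scale, the same pattern should work with levels = -3:3.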
If you wanted to incorporate one of the other variables in the chart such as the movie, you can use fill:
ggplot(d, aes(x = factor(rating), fill = factor(movie))) + geom_bar(binwidth = 1)
It may make more sense to put the movies on the x axis and fill with the rating if you have a small number of movies to compare:
ggplot(d, aes(x = factor(movie), fill = factor(rating))) + geom_bar(binwidth = 1)
If this doesn't get you on your way, put up a more representative example of your dataset. I wasn't able to recreate the ordering problems, but that could be due to a difference in the sample data you posted and the data you are analyzing.
The ggplot website is also a great reference: http://had.co.nz/ggplot2/geom_bar.html
I have some data that I can't share as a reproducible example; each row is a customer, a year, and the quantity ordered. I need a histogram of these quantities, split by year:
ggplot(cust.year %>%
         mutate(yearly.cases = ceiling(yearly.cases)),
       aes(x = yearly.cases)) +
  geom_histogram(binwidth = 1, fill = "black") +
  xlab("Yearly Cases") +
  facet_wrap("year")
The plot output shows the data like this:
I noticed that some low values seem to be missing, and when I click zoom they show up:
In Shiny, and when saved, the plot comes out as the version with the missing bars. What's going on? I've tried adjusting xlim to no avail.
EDIT: A more extreme example...
I am trying to plot a line graph with multiple lines, grouped by a categorical value (a factor). Based on what I have done in the past and what I can find online, the easiest way to do this is to assign the categorical value to the group aesthetic, but this isn't working for me: I only get one line on the graph. I am 100% sure I am doing something super silly, but I can't for the life of me work it out. Thanks in advance :)
#dummy data for example
test <- data.frame(x = sample(seq(as.Date('2015/01/01'), as.Date('2020/01/01'), by="day"), 20),
y = sample(10:300, 10),
Origin_Station = as.factor(rep(1, 10)),
Neighbour_station = as.factor(rep(1:5, each = 20)))
#plot - what I want to see is a line for each of the 5 Neighbour_station categories (1:5) but what I get is just one line
ggplot(test, aes(x=x, y=y, group = Neighbour_station))+
geom_line()
I have also tried this:
ggplot(test, aes(x=x, y=y, group = factor(Neighbour_station), colour = Neighbour_station))+
geom_line()
Hi Rhetta also from Aus here, big ups Australian useRs:
library(ggplot2)
ggplot(test, aes(x = x, y = y, group = Neighbour_station, colour = Neighbour_station))+
geom_line()
Note that you can't see distinct lines because your data is exactly the same for each factor level (Neighbour_station 1:5), so the five lines are drawn on top of each other.
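To make the recycling visible, here is a small sketch that rebuilds the question's data, checks that every station gets the same y values, and then plots a variant where they differ. The seed, the same_y check and the test2 data frame are my additions for illustration only.

```r
library(ggplot2)

set.seed(42)  # assumption: seed added so the sketch is reproducible
dates <- seq(as.Date("2015-01-01"), as.Date("2020-01-01"), by = "day")

# The question's data: x (20 values) and y (10 values) are recycled to 100 rows
test <- data.frame(
  x = sample(dates, 20),
  y = sample(10:300, 10),
  Origin_Station = as.factor(rep(1, 10)),
  Neighbour_station = as.factor(rep(1:5, each = 20))
)

# Compare each station's y values with those of the first station
y_by_station <- split(test$y, test$Neighbour_station)
same_y <- all(vapply(y_by_station, identical, logical(1), y_by_station[[1]]))

# Hypothetical variant: 100 distinct y values, so the stations no longer overlap
test2 <- test
test2$y <- sample(10:300, 100)

p <- ggplot(test2, aes(x = x, y = y,
                       group = Neighbour_station,
                       colour = Neighbour_station)) +
  geom_line()

print(p)
```

With real data the grouping call from the answer should be enough; the variant is only there to show that the overlap comes from the dummy data and not from the group aesthetic.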
Please, I'm having a really hard time doing something that is probably quite simple. I read various posts here but couldn't find anything similar to what I need.
I have the following dataframe:
sector <- c("tech", "energy", "retail", "gaming")
curr_sales <- c(10, 18, 15, 7)
avg_sales <- c(8.2, 20.1, 25.0, 4.1)
df <- data.frame(sector, curr_sales, avg_sales)
df$sector <- as.character(df$sector)
My initial goal was to create a plot with horizontal bars, with the sector on the y-axis, the current sales (curr_sales) on the x-axis, and the bars sorted by current sales.
The following code so far helps to achieve this goal:
ggplot(df, aes(x = reorder(sector, curr_sales), y = curr_sales)) +
geom_bar(stat = "identity") +
coord_flip()
Goal: at this point, I need a way to display the average sales value for each sector (that is, for each horizontal bar). I was hoping to achieve this without a second bar for each sector, using instead a marker or a line that makes it easy to see where the average sales sit relative to the current sales value.
I couldn't find any similar example and any suggestions would be much appreciated.
Thanks
It sounds like you might be able to do this with two geom_bar layers, each with a different y aesthetic. Something like:
ggplot(df) +
geom_bar(stat = "identity", aes(x = reorder(sector, curr_sales), y = curr_sales, fill = sector)) +
geom_bar(stat = "identity", aes(x = reorder(sector, curr_sales), y = avg_sales), alpha=0, color='black') +
coord_flip() +
scale_fill_manual(values=c("energy"="red", "gaming"="blue", "retail"="orange", "tech"="green"))
And you could play with the second bar to get the exact effect you are looking for (in my example it is transparent with a black outline). This example also has colors.
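If a marker is preferred over a second bar, one option (a sketch, not the only way) is to draw the average as a zero-height error bar, which shows up as a short line across each bar once the coordinates are flipped. The grey fill, the 0.6 width and the axis labels are my own choices for illustration.

```r
library(ggplot2)

df <- data.frame(
  sector     = c("tech", "energy", "retail", "gaming"),
  curr_sales = c(10, 18, 15, 7),
  avg_sales  = c(8.2, 20.1, 25.0, 4.1)
)

p <- ggplot(df, aes(x = reorder(sector, curr_sales))) +
  # one bar per sector for the current sales
  geom_col(aes(y = curr_sales), fill = "grey60") +
  # ymin == ymax collapses the error bar into a single line at avg_sales
  geom_errorbar(aes(ymin = avg_sales, ymax = avg_sales),
                width = 0.6, colour = "black") +
  coord_flip() +
  labs(x = "Sector", y = "Sales")

print(p)
```

Swapping geom_errorbar() for geom_point(aes(y = avg_sales)) would give a dot instead of a line.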
Say I'm measuring 10 personality traits and I know the population baseline. I would like to create a chart for individual test-takers to show them their individual percentile ranking on each trait. Thus, the numbers go from 1 (percentile) to 99 (percentile). Given that a 50 is perfectly average, I'd like the graph to show bars going to the left or right from 50 as the origin line. In bar graphs in ggplot, it seems that the origin line defaults to 0. Is there a way to change the origin line to be at 50?
Here's some fake data and default graphing:
df <- data.frame(
names = LETTERS[1:10],
factor = round(rnorm(10, mean = 50, sd = 20), 1)
)
library(ggplot2)
ggplot(data = df, aes(x=names, y=factor)) +
geom_bar(stat="identity") +
coord_flip()
Picking up on @nongkrong's comment, here's some code that will do what I think you want while relabeling the ticks to match the original range and relabeling the axis to avoid showing the math:
library(ggplot2)
ggplot(data = df, aes(x=names, y=factor - 50)) +
geom_bar(stat="identity") +
scale_y_continuous(breaks=seq(-50,50,10), labels=seq(0,100,10)) + ylab("Percentile") +
coord_flip()
This post was really helpful for me - thanks @ulfelder and @nongkrong. However, I wanted to re-use the code on different data without having to manually adjust the tick labels to fit the new data. To do this in a way that retained ggplot's tick placement, I defined a tiny function and passed it to the labels argument:
fix.labels <- function(x){
x + 50
}
ggplot(data = df, aes(x=names, y=factor - 50)) +
geom_bar(stat="identity") +
scale_y_continuous(labels = fix.labels) + ylab("Percentile") +
coord_flip()
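If the origin might not always be 50, the same trick can be parameterised. This is only a sketch: shift_labels is a hypothetical helper name of mine, and I've renamed the factor column to score and added a seed for reproducibility.

```r
library(ggplot2)

# Hypothetical helper: returns a labelling function that adds the origin back
shift_labels <- function(origin) {
  function(x) x + origin
}

set.seed(123)  # assumption: seed added so the fake data is reproducible
df <- data.frame(
  names = LETTERS[1:10],
  score = round(rnorm(10, mean = 50, sd = 20), 1)
)

origin <- 50

p <- ggplot(df, aes(x = names, y = score - origin)) +
  geom_col() +                                          # bars start at 0, i.e. at the origin
  scale_y_continuous(labels = shift_labels(origin)) +   # ticks show the original values
  ylab("Percentile") +
  coord_flip()

print(p)
```

The bars are still drawn from 0 on the shifted scale; only the tick labels are moved back, which is the same design as the fix.labels version.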
So I'm trying to use geom_bar in ggplot2, and all of the cases that I see of people demonstrating it online are of comparative frequencies of certain things. The chart that I'm trying to do is the stacked bar graph like this one
However, I want to do it from a vector of values. That is, let's say I have the vector
v=c(1,2,3,4)
Instead of 4 even bars, which is what I understand I would get, I'd like a stack of 4 bars where the top one is 1 unit tall, and the next one down is 2 units tall (etc.). Is this possible in R?
Edit: Here is the code that I've used for my graph. It's yielding a normal bar graph, not the stacked version that I'm looking for:
ggplot(data = v, aes(x = factor(x), y = y)) + geom_bar(aes(fill = factor(y)),stat = 'identity')
I think you can start from this:
v=data.frame(x="My Stacked Bar", y=c(1,2,3,4))
ggplot(data = v, aes(x = factor(x), y = y))+
geom_bar(aes(fill=factor(y)), stat="identity")
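One thing to watch with fill = factor(y): if the vector contains repeated values, those segments share a colour. A small variation, sketched under my own naming (stack_df, piece and height are not from the question), fills by position in the vector instead:

```r
library(ggplot2)

v <- c(1, 2, 3, 4)

# One row per segment; piece identifies the position in v, not its value
stack_df <- data.frame(
  bar    = "My Stacked Bar",
  piece  = factor(seq_along(v)),
  height = v
)

p <- ggplot(stack_df, aes(x = bar, y = height, fill = piece)) +
  geom_col()   # default position is "stack", so segments pile up in one bar

print(p)

# The stacked bar should reach the sum of the vector
total_height <- max(layer_data(p)$ymax)
```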