ggplot2 geom_bar fill aesthetic - r

I have the following code to graph contracts in different countries.
Country <- CCOM$Principal.Place.of.Performance.Country.Name
Val <- CCOM$Action_Absolute_Value
split <- CCOM$Contract.Category
ggplot(CCOM, aes(x = Country, y = Val, fill = levels(split))) +
geom_bar(stat = "identity")
I want a simple stacked bar chart with the bars colored by contract category, which is the variable "split" (i.e., CCOM$Contract.Category).
However, when I run the code it produces the graph below:
Why won't ggplot separate the spending into three distinct blocks? Why do I get color sections scattered throughout the chart? I have tried using factor(split) and levels(split), but neither seems to work. Maybe I am putting it in the wrong position.

Ah, I just realized what was going on. You seem scared to modify your data frame; don't be! Creating external vectors for ggplot is asking for trouble. Rather than creating Country and Val as loose vectors, add them as columns to your data:
CCOM$Country <- CCOM$Principal.Place.of.Performance.Country.Name
CCOM$Val <- CCOM$Action_Absolute_Value
Then your plot is nice and straightforward, and you don't have to worry about order or anything else.
ggplot(CCOM, aes(x = Country, y = Val, fill = Contract.Category)) +
geom_bar(stat = "identity")
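To see why fill = levels(split) in the question goes wrong, it may help to compare what levels() returns with the column itself. This is only a sketch with a small made-up data frame; the name demo and its values are assumptions, not the asker's data.

```r
# Made-up stand-in for CCOM; the values are purely illustrative.
demo <- data.frame(
  Country = c("A", "A", "B", "B", "B"),
  Val = c(10, 20, 5, 15, 25),
  Contract.Category = factor(c("Goods", "Services", "Goods", "R&D", "Services"))
)

split <- demo$Contract.Category

# levels() returns each distinct category once, not one value per row.
category_levels <- levels(split)
print(category_levels)

# The column itself has one entry per row, which is what fill needs.
print(length(category_levels))  # number of distinct categories
print(length(split))            # number of rows in the data
```

Because levels(split) has only one entry per category, it cannot describe the category of each row; mapping the Contract.Category column directly avoids that mismatch.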

As you suggest, order provides a solution:
ggplot(CCOM[order(CCOM$Contract.Category), ], aes(x = Country, y = Val, fill = Contract.Category)) +
  geom_bar(stat = "identity")
I have a similar example where I use the equivalent of Contract.Category as fill, and it still requires the reordering.

Related

what are these gray lines inside the bars of my ggplot bargraph?

I wanted to create a graph to show which continent has the highest total CO2 emissions per capita over the years, so I coded this:
barchart <- ggplot(data = worldContinents,
                   aes(x = Continent,
                       y = `Per.Capita`,
                       colour = Continent)) +
  geom_bar(stat = "identity")
barchart
This is my dataframe:
Geometry is just for some geographical mapping later.
I checked which(is.na(worldContinents$Per.Capita)) to see whether there were NA values but it returned nothing
What's causing these gray lines?
How do I get rid of them?
These are the gray lines inside the bar graph
Thank you
You have a couple of issues here. First of all, I'm guessing you want the fill to be mapped to the continent, not color, which only controls the color of the bars' outlines.
Secondly, there are multiple values for each continent in your data, so they are simply stacking on top of each other. This is the reason for the lines in your bars, and is probably not what you want. If you want the average value per capita in each continent, you either need to summarise your data beforehand or use stat_summary like this:
barchart <- ggplot(data = worldContinents,
aes(x = Continent,
y = `Per.Capita`,
fill = Continent)) +
stat_summary(geom = "col", fun = mean, width = 0.7,
col = "gray50") +
theme_light(base_size = 20) +
scale_fill_brewer(palette = "Spectral")
barchart
Data used
Obviously, we don't have your data, so I used a modified version of the gapminder data set to match your own data structure:
worldContinents <- gapminder::gapminder
worldContinents$Per.Capita <- worldContinents$gdpPercap
worldContinents$Continent <- worldContinents$continent
worldContinents <- worldContinents[worldContinents$year == 2007,]
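The other option mentioned above, summarising the data beforehand, could look roughly like the sketch below. It uses a tiny made-up data frame (emissions_demo and continent_means are hypothetical names) and assumes the mean is the summary you want.

```r
# Made-up data with several rows per continent (illustrative values only).
emissions_demo <- data.frame(
  Continent = c("Asia", "Asia", "Europe", "Europe", "Africa"),
  Per.Capita = c(2, 4, 6, 10, 1)
)

# Collapse to one row per continent: the mean of Per.Capita.
continent_means <- aggregate(Per.Capita ~ Continent,
                             data = emissions_demo, FUN = mean)
print(continent_means)

# With one row per bar there is nothing to stack inside a bar.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  barchart <- ggplot(continent_means,
                     aes(x = Continent, y = Per.Capita, fill = Continent)) +
    geom_col()
  # print(barchart) draws the plot
}
```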

how to plot percentage instead of count, in facet_grid graph?

I am having a hard time plotting percentage instead of count when using facet_grid.
I have the following DF (this is an example, my DF is much longer):
Gu <- c("1", "0", "0", "0", "1", "0")
variable <- c("THR", "Screw removal", "THR", "THR", "THR", "Screw removal")
value <- c("0", "1", "0", "1", "0", "0")
df2 <- data.frame(Gu, variable, value)
and I am trying to plot the "1" values for each specific variable (either THR or Screw removal), splitting the graph by "Gu" (facet grid).
I managed to plot the count, but I can't seem to calculate the percentage (I need the percentage within each variable only, not out of the whole DF).
This is my code:
ggplot(data = df2, aes(x = variable, y = value, fill = variable)) +
  geom_bar(stat = "identity") +
  facet_grid(~ Gu,
             labeller = labeller(Gu = c('0' = "Nondisplaced fracture",
                                        '1' = "Displaced fracture"))) +
  scale_fill_discrete(name = "Revision", labels = c("THR", "SCREW"))
and this is what I plotted:
I searched this website and the web and couldn't find an answer...
any help will do!
thanks
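One possible way to get percentages is sketched here. It assumes the wanted figure is the share of "1" values within each combination of Gu and variable; the names value_num, pct and percent are hypothetical.

```r
Gu <- c("1", "0", "0", "0", "1", "0")
variable <- c("THR", "Screw removal", "THR", "THR", "THR", "Screw removal")
value <- c("0", "1", "0", "1", "0", "0")
df2 <- data.frame(Gu, variable, value)

# value is stored as text, so convert it to a number before averaging.
df2$value_num <- as.numeric(as.character(df2$value))

# The mean of a 0/1 column times 100 is the percentage of "1" values per group.
pct <- aggregate(value_num ~ Gu + variable, data = df2,
                 FUN = function(v) 100 * mean(v))
names(pct)[names(pct) == "value_num"] <- "percent"
print(pct)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(pct, aes(x = variable, y = percent, fill = variable)) +
    geom_col() +
    facet_grid(~ Gu)
  # print(p) draws the plot
}
```

If the percentage should instead be taken within each variable regardless of Gu, the grouping formula would need to change accordingly.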

I want to create a double layered pie / donut chart in R

I am trying to create a double-layered pie chart. Here is my data:
winner <- c("White" , "draw" , "Black")
lowrated_scotch1 <- c(0.56617647, 0.04411765, 0.38970588) #winrate for Whites,Draws,Blacks in chess
highrated_scotch1 <- c(0.50000000, 0.03676471, 0.46323529)
To give more context, I'm trying to visualize the difference in win rate between White wins, draws and Black wins for high-rated and low-rated players in chess for the Scotch opening, using data I already managed to gather.
This is what I have in mind (image taken from Google Images):
layered pie chart example.
This is my code :
multi_layer_scotch <- data.frame(winner = c("White", "draw", "Black"),
                                 Y = c(highrated_scotch1),
                                 X = c(lowrated_scotch1))
ggplot(multi_layer_scotch, aes(x = X, y = Y, fill = winner))+
geom_bar(stat = "identity")+
scale_fill_manual(values = c("#769656","#baca44","#eeeed2"))+
coord_polar(theta="y")+
theme_void()
and this is what I'm getting as output:
my marvelous not complete graph
As you can see, the graph isn't layered the way I want. The 3 layers from my plot should be assembled into one layer (representing the low-rated players) stacked with another layer (representing the high-rated players).
I tried to follow the solution given in this post, but I couldn't manage it myself; I felt it was a little incomplete: Multi-level Pie Chart in R
I'd be glad if you could help me with this! Thanks in advance.
Did you mean something like this?
library(reshape2)  # melt() is assumed here to come from reshape2
df1 <- melt(multi_layer_scotch)
ggplot(df1, aes(x = variable, y = value, fill = winner)) +
  geom_bar(stat = "identity") +
  coord_polar(theta = "y")
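For readers without a reshaping package, the long layout that the melt() call is used to obtain can also be built by hand. This is a minimal sketch; the names scotch_long, rating and winrate are hypothetical.

```r
winner <- c("White", "draw", "Black")
lowrated_scotch1 <- c(0.56617647, 0.04411765, 0.38970588)
highrated_scotch1 <- c(0.50000000, 0.03676471, 0.46323529)

# One row per (rating group, winner) pair: the "long" layout.
scotch_long <- data.frame(
  winner = rep(winner, times = 2),
  rating = rep(c("lowrated", "highrated"), each = 3),
  winrate = c(lowrated_scotch1, highrated_scotch1)
)
print(scotch_long)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Each rating value on x becomes one ring under coord_polar(theta = "y").
  p <- ggplot(scotch_long, aes(x = rating, y = winrate, fill = winner)) +
    geom_col() +
    coord_polar(theta = "y")
  # print(p) draws the plot
}
```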

Overlapping data on columns in a ggplot facet grid

Thanks in advance for humoring a complete newbie to R. I'm working with some data from the GSS for an online class, and I've created a ggplot facet grid. I'm sure I've done this a super awkward, long way, but I'm trying to get these data points to not overlap each other, but be centered on the columns.
Here's what I've got so far:
I've created a new dataset from the GSS with the variables 'conpress', 'sex', and 'news', which refer to confidence in the press, gender and how often someone reads a newspaper. I wanted to get the percentages, not the counts, which is why I did the ..count.. stuff.
gss_press_full <- gss %>% select(conpress, news, sex)
gss_press_clean <- na.omit(gss_press_full)
ggplot(gss_press_clean, aes(x = conpress, y = (..count..)/sum(..count..), fill = sex)) +
  geom_bar(aes(y = (..count..)/sum(..count..)), position = position_dodge()) +
  facet_grid(news ~ .) +
  geom_text(aes(y = (..count..)/sum(..count..), label = round((..count..)/sum(..count..), 2)),
            stat = "count", vjust = -0.25) +
  labs(title = "Newspaper readership and Press confidence", y = "Percent", x = "Levels of confidence in the Press")
I have been googling for far too long and can't seem to find a way to adjust these labels atop the columns. It seems to be especially tricky since my y variable is being calculated in my ggplot creation, but again, like a complete novice, that was how I cobbled my way to the output. If someone has help on how to streamline this process, I'd appreciate that too!
(I hope I've included enough code to be helpful!)
Again, thanks for any help!
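A possible sketch for centering the labels, using a small made-up data frame in place of the GSS data (press_demo and press_pct are hypothetical names). It assumes the overlap comes from the text layer not being dodged the same way as the bars, and it precomputes the proportions, here as a share of all rows, before plotting.

```r
# Made-up stand-in for gss_press_clean; the values are illustrative only.
press_demo <- data.frame(
  conpress = c("A Great Deal", "A Great Deal", "Hardly Any",
               "Hardly Any", "Hardly Any", "A Great Deal"),
  sex = c("Male", "Female", "Male", "Female", "Female", "Male"),
  news = c("Everyday", "Everyday", "Everyday", "Never", "Never", "Never")
)

# Count rows per combination, then turn the counts into a share of all rows.
press_pct <- as.data.frame(table(press_demo), responseName = "n")
press_pct <- press_pct[press_pct$n > 0, ]
press_pct$prop <- press_pct$n / sum(press_pct$n)
print(press_pct)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Use the same dodge for bars and labels so each label sits over its bar.
  dodge <- position_dodge(width = 0.9)
  p <- ggplot(press_pct, aes(x = conpress, y = prop, fill = sex)) +
    geom_col(position = dodge) +
    geom_text(aes(label = round(prop, 2)), position = dodge, vjust = -0.25) +
    facet_grid(news ~ .)
  # print(p) draws the plot
}
```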

Draw line between points with groups in ggplot

I have a time series in which each point has a time, a value and a group it belongs to. I am trying to plot it with time on the x axis and value on the y axis, with the line colored according to the group.
I tried using geom_path and geom_line, but they end up connecting points only to other points within the same group. I found that when I use a continuous variable for the groups I get a normal line; however, when I use a factor or a categorical variable, I get the linking problem.
Here is a reproducible example that is what I would like:
df = data.frame(time = c(1,2,3,4,5,6,7,8,9,10), value = c(5,4,9,3,8,2,5,8,7,1), group = c(1,2,2,2,1,1,2,2,2,2))
ggplot(df, aes(time, value, color = group)) + geom_line()
And here is a reproducible example that is what I have:
df = data.frame(time = c(1,2,3,4,5,6,7,8,9,10), value = c(5,4,9,3,8,2,5,8,7,1), group = c("apple","pear","pear","pear","apple","apple","pear","pear","pear","pear"))
ggplot(df, aes(time, value, color = group)) + geom_line()
So the first example works well, but 1/ it requires a few extra lines to change the legend to the labels I want, and 2/ out of curiosity, I would like to know if I missed something.
Is there any option in ggplot I could use to have the behavior I expect, or is it an internal constraint?
As pointed out by Richard Telford and Carles Sans Fuentes, adding group = 1 within the ggplot aesthetic does the job. So the code should be:
ggplot(df, aes(time, value, color = group, group = 1)) + geom_line()
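A sketch of why this works, assuming ggplot2 is installed: by default a discrete color variable also defines the groups, so each label gets its own line, whereas group = 1 puts every point in a single group while the color still varies along the line. The check with layer_data() and the names n_groups_default and n_groups_single are only for illustration.

```r
df <- data.frame(
  time = 1:10,
  value = c(5, 4, 9, 3, 8, 2, 5, 8, 7, 1),
  group = c("apple", "pear", "pear", "pear", "apple",
            "apple", "pear", "pear", "pear", "pear")
)

n_groups_default <- NA
n_groups_single <- NA

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Grouping is derived from the discrete color variable.
  p_default <- ggplot(df, aes(time, value, color = group)) + geom_line()
  # Grouping is forced to a single group; color is still mapped per point.
  p_single <- ggplot(df, aes(time, value, color = group, group = 1)) + geom_line()

  # Count the distinct groups ggplot2 computed for each plot.
  n_groups_default <- length(unique(layer_data(p_default)$group))
  n_groups_single <- length(unique(layer_data(p_single)$group))
  print(c(default = n_groups_default, single = n_groups_single))
}
```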

Resources