I have a question about the plotting package Gadfly for the Julia language. Suppose I have a DataFrame that looks like this:
DataFrame:(DrinkType, Country, Month, Sales)
Coke, UK, April, 500
Coke, US, April, 500
Coke, UK, March, 400
Coke, US, March, 700
I want to generate a bar chart that shows, for each DrinkType, one bar per country (UK or US) placed side by side in a dodge style, with red for one country and green for the other. Within each country's bar, March and April should be stacked, with March drawn in a lighter (semi-transparent) shade of that country's color: light red and light green.
The result should show 2 bars for each drink, something like:
Coke: bar1, stacked US(400 March (light red), 500 April (red))
bar2, stacked UK(700 March (light green), 500 April (green))
So far, I have come up with this:
t0 = plot(result, y=:Drink, x=:x1, color=:Country, Theme(minor_label_font_size= 12pt, bar_spacing = -0.15cm), Geom.bar(position=:dodge, orientation=:horizontal)
,Scale.color_discrete_manual(Green,Orange),
Guide.title("Overall Drink Trends") , Guide.ylabel(nothing),Guide.xlabel(nothing))
This will generate the 4 bars individually...
If I understand the question right, you can generate the plot you're looking for (minus the change in alpha value for different months) with
using DataFrames, Gadfly
# Build the example data in one constructor call; the older
# `data[:Col] = ...` assignment no longer works in current DataFrames.jl.
data = DataFrame(
    DrinkType = ["Coke","Coke","Coke","Coke","Pepsi","Pepsi","Pepsi","Pepsi"],
    Country = ["UK", "US", "UK", "US","UK", "US", "UK", "US"],
    Month = ["April", "April", "March", "March","April", "April", "March", "March"],
    Sales = [500,500,400,700,340,120,990,620],
)
data
If you also want to show another dimension, one way is to use a subplot grid, like
plot(data, xgroup="DrinkType", x="Month", y="Sales", color="Country", Geom.subplot_grid(Geom.bar(position=:dodge)),Scale.color_discrete_manual("red","green"))
or
plot(data, xgroup="Country", x="Month", y="Sales", color="DrinkType", Geom.subplot_grid(Geom.bar(position=:stack)),Scale.color_discrete_manual("red","green"))
or similar, although at that point bar charts may not be your best option.
Related
I am trying to make a plot of GDP vs CO2 emissions globally. Two countries have values far larger than all the others, so I am trying to separate them with facet_wrap: one graph for the two outlier countries and one graph for the rest of the data.
My code thus far is
ggplot(CO2_GDP, aes(x= GDP, y=value)) +
geom_point(size=1)+
labs(title = "GDP and CO2 Emissions", y= "CO2 Emissions in Tons", x= "GDP in Billions of USD") +
facet_wrap(~country_name==c("China", "United States"))
This gives me one graph with all of the countries including China and the United States and another graph of just China and United States. I need to find a way to remove China and United States from the first graph but have just that data on the second graph.
I thought that putting the comma between China and United States in the last line would remove them from the first graph and show them only on the second, but that's not the case. As you can see in this image, the data on the "TRUE" graph still appears on the "FALSE" graph, and it's not supposed to.
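One possible explanation, sketched on made-up data: `==` compares element by element and recycles `c("China", "United States")` down the column, so only some of the China and United States rows come out TRUE. `%in%` tests every row against the whole set. The data frame below is a hypothetical stand-in for CO2_GDP, and the `panel` column and its labels are names invented for this illustration.

```r
# Hypothetical stand-in for CO2_GDP; the numbers are invented.
CO2_GDP <- data.frame(
  country_name = c("China", "China", "United States", "United States",
                   "France", "France"),
  GDP   = c(100, 110, 200, 210, 30, 32),
  value = c(90, 95, 60, 61, 5, 6)
)

# `==` recycles the two names: row 1 is compared with "China", row 2 with
# "United States", row 3 with "China" again, and so on.
recycled <- CO2_GDP$country_name == c("China", "United States")

# `%in%` checks each row against both names.
CO2_GDP$outlier <- CO2_GDP$country_name %in% c("China", "United States")

# Readable facet labels (the label text is an assumption).
CO2_GDP$panel <- ifelse(CO2_GDP$outlier,
                        "China and United States",
                        "All other countries")

# Plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(CO2_GDP, aes(x = GDP, y = value)) +
    geom_point(size = 1) +
    labs(title = "GDP and CO2 Emissions",
         y = "CO2 Emissions in Tons", x = "GDP in Billions of USD") +
    facet_wrap(~panel, scales = "free")  # free scales so outliers don't squash the rest
}
```

With `%in%`, each country lands in exactly one panel. `scales = "free"` is optional, but it lets the panel without the outliers use its own axis range.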
I'm trying to make a bar plot in base R, so not with ggplot, that is grouped and reversed, and I found a lot of similar questions and answers but it seems none of them works for me.
My dataset is about the Eurovision Song Contest 2007. Here is the link to it: https://www.kaggle.com/datagraver/eurovision-song-contest-scores-19752019
and this is the code for cleaning and getting the database I'm working with:
cela_baza <- read.csv("eurovision_song_contest_1975_2019.csv", stringsAsFactors = FALSE)
evro2007_pocetna<-cela_baza[cela_baza$Year=='2007',]
evro2007_aggr<-aggregate(evro2007_pocetna$Points~evro2007_pocetna$From.country+
evro2007_pocetna$To.country,FUN=mean)
colnames(evro2007_aggr) <- c('From country', 'To country','Points')
evro2007<-evro2007_aggr[!(evro2007_aggr$`From country`==evro2007_aggr$`To country`),]
nrow(subset(evro2007, evro2007$Points== 0 ))
evro2007_zero<- subset(evro2007, evro2007$Points> 0 )
What I need is a bar plot with the number of points on the x-axis and the countries that participated in the competition on the y-axis. Each country has three grouped bars of different colors: the first shows how many points that country gave to Serbia (winner), the second the points given to Ukraine (2nd place), and the third the points given to Russia (3rd place). So the plot is grouped and reversed. I found code for that, but my problem is that not all of the participating countries gave points to these three countries, so I always get errors.
Code for ggplot will work too. I can't install it on my old PC, but I will ask someone to run it for me as long as I have the code. Thanks in advance for all the help!
It's possible to do what you're describing, but the resulting plot is horrible because of the number of countries on the x axis (42, with three bars each), and the limitations of base R's barplot.
Here's how we can get the data in the correct format:
winners <- evro2007[evro2007$`To country` == "Ukraine" |
evro2007$`To country` == "Russia" |
evro2007$`To country` == "Serbia",]
self <- data.frame(`From country` = c("Serbia", "Ukraine", "Russia"),
`To country` = c("Serbia", "Ukraine", "Russia"),
Points = c(0, 0, 0), stringsAsFactors = FALSE)
names(self) <- names(winners)
winners <- rbind(winners, self)
winners <- winners[order(winners$`From country`, winners$`To country`),]
However, the base R barplot looks like this:
barplot(Points ~ `To country` + `From country`,
data = winners, beside = TRUE, cex.names = 0.3)
The countries are illegible, and the plot difficult to interpret.
Whereas, using ggplot:
library(ggplot2)
# Fix the order of the three recipient countries: 1st, 2nd, 3rd place
winners$`To country` <- factor(winners$`To country`,
levels = c("Serbia", "Ukraine", "Russia"))
ggplot(winners, aes(`To country`, Points, fill = `To country`)) +
geom_col() +
facet_wrap(.~`From country`) +
theme(axis.text.x = element_blank())
We get this:
I am plotting a time series bar chart with a measure for different categories. When I plot the time series bar chart, the width of the bars fills over many dates so that the neighbouring bars touch, even if they are a month apart, but this means that it is unclear which date that bar corresponds to. How do I change the code so that the bars only appear over the date in the underlying dataframe?
I have successfully plotted another time series bar chart with exactly the same ggplot code but different underlying data and so it is unclear to me why this is happening with this particular dataframe.
In this following example, I use a dataframe with only one category for simplicity in highlighting the issue:
data <- data.frame(a = c(as.Date("2019-05-30"), as.Date("2019-06-19")), b = c("FX FORWARD", "FX FORWARD"), c = c(29.2, 74.7))
colnames(data ) <- c("Expiration Date", "Security Type", "Exposure $M")
plot <- ggplot(data , aes(x=`Expiration Date`, y=`Exposure $M`, fill=`Security Type`)) +
geom_bar(stat="identity") + scale_x_date(labels = scales::date_format("%d-%b"), date_breaks = "3 day")
I expected the bars to appear only above the day in which they are stored in the dataframe and not as it is shown in the chart, i.e. $29.2 above 31st May 2019 only and not spreading from 23rd May to 8th June; same for the second data point. Can anyone advise how I may correct this in my code?
Thanks in advance for any help, I've tried looking all over for a solution.
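A possible explanation, as a sketch rather than a confirmed diagnosis: when no width is given, geom_bar uses 90% of the resolution of the data, meaning the smallest gap between distinct x values. With only two dates 20 days apart, that gives bars about 18 days wide, which roughly matches the spread described above. On a date axis the width is measured in days, so `width = 1` should keep each bar over its own date. The variable names below are made up for the illustration.

```r
# The two expiration dates from the example data frame.
dates <- as.Date(c("2019-05-30", "2019-06-19"))

# Smallest gap between distinct dates, in days.
gap_days <- min(diff(sort(as.numeric(dates))))

# Default bar width: 90% of that gap.
default_width <- 0.9 * gap_days

# Each bar therefore extends half of that width to either side of its date.
first_bar_range <- dates[1] + c(-1, 1) * default_width / 2

data <- data.frame(a = dates,
                   b = c("FX FORWARD", "FX FORWARD"),
                   c = c(29.2, 74.7))
colnames(data) <- c("Expiration Date", "Security Type", "Exposure $M")

# Plot only if ggplot2 (and scales, used for the labels) are installed.
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("scales", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data, aes(x = `Expiration Date`, y = `Exposure $M`,
                        fill = `Security Type`)) +
    geom_bar(stat = "identity", width = 1) +  # one day wide
    scale_x_date(labels = scales::date_format("%d-%b"), date_breaks = "3 day")
}
```

This would also explain why the same code looked fine on another data frame: if that one had observations on adjacent or nearby dates, the default width would have been much smaller.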
I want to create a waterfall chart with several groups where all the groups start at 0.
This is my code:
library(dplyr)    # %>%, mutate(), lag()
library(ggplot2)
gdp <- data.frame("Country"=rep(c("China", "USA"), each=2),
"Type"=rep(c("GDP2013", "GDP2014"), 2),
"Cnt"= c(16220, 3560, 34030, -10570))
gdp <- gdp %>%
mutate(start=Cnt,
start=lag(start),
end=ifelse(Type=="GDP2013", Cnt, start+Cnt),
start=ifelse(Type=="GDP2013", 0, start),
amount=end-start,
id=rep(1:2, each=2))
gdp %>%
ggplot(aes(fill=Type)) +
geom_rect(stat="identity", aes(x=Country,
xmin=id-0.25,
xmax=id+0.25,
ymin=start,
ymax=end))
The two bar types should be ordered next to each other per group and USA GDP2014 should start at the height of USA GDP2013 but end 10570 lower.
I know that I could do this with facet_wrap, but I want no separation between groups (e.g. facets).
geom_rect takes a position parameter.
I believe position='dodge' does what you require if I understand your question correctly.
More info: https://ggplot2.tidyverse.org/reference/position_dodge.html
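If position='dodge' does not behave as hoped with geom_rect (I have not verified it against the code in the question), a different technique is to skip position adjustments and shift the rectangles by hand. This is a minimal base R sketch of that manual dodge, not the approach suggested above. The offset of 0.2 and the `xmin`/`xmax` columns are choices made for this illustration.

```r
gdp <- data.frame(
  Country = rep(c("China", "USA"), each = 2),
  Type    = rep(c("GDP2013", "GDP2014"), 2),
  Cnt     = c(16220, 3560, 34030, -10570)
)

is_base <- gdp$Type == "GDP2013"

# Value of the previous row; only used on the GDP2014 rows.
prev <- c(NA, head(gdp$Cnt, -1))

# GDP2013 bars start at 0; GDP2014 bars start where GDP2013 ended.
gdp$start <- ifelse(is_base, 0, prev)
gdp$end   <- ifelse(is_base, gdp$Cnt, prev + gdp$Cnt)

# Numeric position per country (China = 1, USA = 2), then shift each
# Type left or right so the two bars sit next to each other.
gdp$id   <- as.integer(factor(gdp$Country))
offset   <- ifelse(is_base, -0.2, 0.2)
gdp$xmin <- gdp$id + offset - 0.2
gdp$xmax <- gdp$id + offset + 0.2

# Plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(gdp, aes(xmin = xmin, xmax = xmax,
                       ymin = start, ymax = end, fill = Type)) +
    geom_rect() +
    scale_x_continuous(breaks = 1:2, labels = c("China", "USA"))
}
```

Because the x positions are plain numbers here, the country names are put back on the axis through `scale_x_continuous`.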
Good evening, can anybody help with plotting. I have some data like this:
DF <- data.frame(
country_dest=c("Russia", "Germany", "Ukraine" ,"Kazakhstan",
"United States", "Italy", "Israel", "Belarus"),
SumTotal=c(7076562,2509617,1032325,680137,540630,359030,229186,217623)
)
It is not a big deal to plot this as 8 separate bars, but I am wondering whether it is possible to make a plot with 3 bars: the first bar with the data for Russia (for example), the second a stacked bar of Germany, Ukraine, Kazakhstan, US and Italy, with a legend to show which segment is which, and the third a stacked bar of Belarus and Israel.
On the internet I found a solution that creates a new DF with 0 values, but I didn't quite understand it.
Thank you in advance!
Well, you will need to add grouping information to your data. Then it becomes straightforward. Here's one strategy
#define groups
grp <-c("Russia"=1, "Germany"=2, "Ukraine"=2,"Kazakhstan"=2,
"United States"=2, "Italy"=2, "Israel"=3, "Belarus"=3)
DF$grp<-grp[as.character(DF$country_dest)]
#ensure plotting order
DF$country_dest <- factor(DF$country_dest, levels=DF$country_dest)
#draw plot
ggplot(DF, aes(x=grp, y=SumTotal, fill=country_dest)) +
geom_bar(stat="identity")
This will give you
You might wish to give your groups a more descriptive label.
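For example, the numeric group could be turned into a labelled factor. This is a minimal sketch. The label text and the `grp_label` column name are invented for the illustration, so pick whatever suits your data.

```r
DF <- data.frame(
  country_dest = c("Russia", "Germany", "Ukraine", "Kazakhstan",
                   "United States", "Italy", "Israel", "Belarus"),
  SumTotal = c(7076562, 2509617, 1032325, 680137, 540630, 359030, 229186, 217623)
)

# Same grouping as above.
grp <- c("Russia" = 1, "Germany" = 2, "Ukraine" = 2, "Kazakhstan" = 2,
         "United States" = 2, "Italy" = 2, "Israel" = 3, "Belarus" = 3)
DF$grp <- unname(grp[as.character(DF$country_dest)])

# Descriptive labels for the three bars (hypothetical wording).
DF$grp_label <- factor(DF$grp, levels = 1:3,
                       labels = c("Russia",
                                  "Germany to Italy",
                                  "Israel and Belarus"))

# Keep the stacking order as given in the data.
DF$country_dest <- factor(DF$country_dest, levels = DF$country_dest)

# Plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(DF, aes(x = grp_label, y = SumTotal, fill = country_dest)) +
    geom_bar(stat = "identity")
}
```

Using a factor for the x aesthetic also makes the axis discrete, so the bars get named ticks instead of 1, 2, 3.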