I found how to estimate the historical Variance Decomposition for VAR models in R in the below link
Historical Variance Error Decompotision Daniel Ryback
Daniel Ryback presents the result in an excel plot, but I wanted to prepare it with ggplot so I created some lines to get it, nevertheless, the plot I got in ggplot is very different to the one showed by Daniel in Excel. I replicated in excel and got the same result than Daniel so it seems there is an error in the way I am preparing the ggplot. Does anyone have a suggestion to arrive to the excel result?
See below my code
library(vars)
library(ggplot2)
library(reshape2)
this code is run after runing the code developed by Daniel Ryback in the link above to define the HD function
data(Canada)
ab<-VAR(Canada, p = 2, type = "both")
HD <- VARhd(Estimation=ab)
HD[,,1]
ex <- HD[,,1]
ex1 <- as.data.frame(ex) # transforming the HD matrix as data frame #
ex2 <- ex1[3:84,1:4] # taking our the first 2 rows as they are N/As #
colnames(ex2) <- c("Emplyment", "Productivity", "Real Wages", "Unemplyment") # renaming columns #
ex2$Period <- 1:nrow(ex2) # creating an id column #
col_id <- grep("Period", names(ex2)) # setting the new variable as id #
ex3 <- ex2[, c(col_id, (1:ncol(ex2))[-col_id])] # moving id variable to the first column #
molten.ex <- melt(ex3, id = "Period") # melting the data frame #
ggplot(molten.ex, aes(x = Period, y = value, fill = variable)) +
geom_bar(stat = "identity") +
guides(fill = guide_legend(reverse = TRUE))
ggplot version
Excel version
The difference is that ggplot2 is ordering the variable factor and plotting it in a different order than excel. If you reorder the factor before plotting it will put 'unemployment' at the bottom and 'employment' at the top, as in excel:
molten.ex$variable <- factor(molten.ex$variable, levels = c("Unemployment",
"Real Wages",
"Productivity",
"Employment"))
ggplot(molten.ex, aes(x = Period, y = value, fill = variable)) +
geom_bar(stat = "identity", width = 0.6) +
guides(fill = guide_legend(reverse = TRUE)) +
# Making the R plot look more like excel for comparison...
scale_y_continuous(limits = c(-6,8), breaks = seq(-6,8, by = 2)) +
scale_fill_manual(name = NULL,
values = c(Unemployment = "#FFc000", # yellow
`Real Wages` = "#A4A4A4", # grey
Productivity = "#EC7C30", # orange
Employment = "#5E99CE")) + # blue
theme(rect = element_blank(),
panel.grid.major.y = element_line(colour = "#DADADA"),
legend.position = "bottom",
axis.ticks = element_blank(),
axis.title = element_blank(),
legend.key.size = unit(3, "mm"))
Giving:
To roughly match the excel graph in Daniel Ryback's post:
I have a dataframe named myKrige_new contains some latitude-longutude wise interpolated values. You can download from HERE. I have plotted this values on a particular area of a country map using ggplot2 package in R and I got this plot
But I want the legend(colourbar) of my plot would be like the following legend.
In my dataset here, the range of the data (pred) is 72 to 257. But I want my legend would show the value 0 to 200 because of comparing reason with other plot though there no value under 72 here .
So, I want to use 20 different colour like above legend that means last box of legend would contain colour regarding value greater than 200. I have used scale_fill_gradientn function but it didn't work. I have spend days to find some option to do it in R, didn't get success. Any kind of help will be highly appreciable.
R code :
library(scales)
library(ggplot2)
myKrige_new <- read.csv ("myKrige_new.csv")
range(myKrige_new$LON)
range(myKrige_new$LAT)
#Original skorea data transformed the same was as myKrige_new
skorea1 <- getData("GADM", country= "KOR", level=1)
skorea1 <- fortify(skorea1)
myKorea1 <- data.frame(skorea1)
###############
ggplot()+
theme_minimal() +
#SOLUTION 1:
#geom_tile(data = myKrige_new, aes(x= LON, y= LAT, fill = pred)) +
#SOLUTION 2: Uncomment the line(s) below:
#geom_point(data = myKrige_new, aes(x= LON, y= LAT, fill = pred),
#shape=22, size=8, colour=NA)+
#Solution 3
stat_summary_2d(data=myKrige_new, aes(x = LON, y = LAT, z = pred),bins = 30,
binwidth = c(0.05,0.05)) +
scale_fill_gradientn(colours=c("white","blue","green","yellow","red"),
values=rescale(c(0,50,100,150,200)),
guide="colorbar", name = "PM10 Conc")+
geom_map(data= myKorea1, map= myKorea1, aes(x=long,y=lat,map_id=id,group=group),
fill=NA, colour="black") +
coord_cartesian(xlim= c(126.6, 127.2), ylim= c(37.2 ,37.7)) +
labs(title= "PM10 Concentration in Seoul Area at South Korea",
x="", y= "")+
theme(legend.position = "bottom")+
guides(fill = guide_colourbar(barwidth = 27, barheight = NULL,
title.position = "bottom", title.hjust = 0.5))
Here is a working solution:
library(scales)
library(ggplot2)
library(raster) # needed for the `getData` function
library(dplyr) # needed for the `mutate` funtion
myKrige_new <- read.csv("~/Downloads/myKrige_new.csv")[-1]
range(myKrige_new$LON)
range(myKrige_new$LAT)
# Original skorea data transformed the same was as myKrige_new
skorea1 <- getData("GADM", country= "KOR", level=1)
skorea1 <- fortify(skorea1)
myKorea1 <- data.frame(skorea1)
# the range of pred goes above 200 (max = 257)
summary(myKrige_new$pred)
ggplot() +
theme_minimal() +
stat_summary_2d(data = mutate(myKrige_new,
pred = ifelse(pred > 200, 200, pred)),
aes(x = LON, y = LAT, z = pred),
bins = 30,
binwidth = c(0.05,0.05)) +
scale_fill_gradientn(colours=c("white","blue","green","yellow","red"),
values=rescale(c(0,50,100,150,200)),
name = expression(paste(PM[10], group("[",paste(mu,g/m^3), "]"))),
limits = c(0,200),
breaks = seq(0,200, 20),
guide = guide_colorbar(nbin = 20,
barwidth = 27,
title.position = "bottom",
title.hjust = 0.5,
raster = FALSE,
ticks = FALSE)) +
geom_map(data= myKorea1,
map= myKorea1,
aes(x=long,y=lat,map_id=id,group=group),
fill=NA,
colour="black") +
coord_equal(xlim= c(126.6, 127.2),
ylim= c(37.2 ,37.7)) +
scale_y_continuous(expand = c(0,0)) +
scale_x_continuous(expand = c(0,0)) +
labs(title = "PM10 Concentration in Seoul Area at South Korea",
x = "",
y = "") +
theme(legend.position = "bottom")
I added limits = c(0,200) and breaks = seq(0, 200, 20) to scale_fill_gradientn as well as nbin = 20 to guide_colorbar, this last change is optional because the default nbin is 20, but in your case you actually need 20. Also, adding limits means values outside the range are plotted in grey50 so I had to transform pred values above 200 to 200 to avoid that; the interpretation of red color is now 200+.
One more thing, the option raster in guide_colorbar changes the colorbar from a raster object to a set of rectangles achieving the look you were looking for.
Finally, I changed the coordinate system from cartesian to equal because you are plotting a map.
Here is the result hope it helps:
Update: added a expand argument to scale_y_continuous and scale_x_continuous as requested by OP
Map Data: InputSpatialData
Yield Data: InputYieldData
Results_using viewport():
EDIT: Results using "multiplot" function as suggested by #rawr (see comment below). I do love the new results, especially that the map is bigger. Nonetheless, the boxplot seems misaligned with the map plot still. Is there a more systematic way to control for centering and placement?
My Question: Is there a way to control for the size of the boxplot plot to make it close in size and centered with the map plot above it?
FullCode:
## Loading packages
library(rgdal)
library(plyr)
library(maps)
library(maptools)
library(mapdata)
library(ggplot2)
library(RColorBrewer)
library(foreign)
library(sp)
library(ggsubplot)
library(reshape)
library(gridExtra)
## get.centroids: function to extract polygon ID and centroid from shapefile
get.centroids = function(x){
poly = wmap#polygons[[x]]
ID = poly#ID
centroid = as.numeric(poly#labpt)
return(c(id=ID, long=centroid[1], lat=centroid[2]))
}
## read input files (shapefile and .csv file)
wmap <- readOGR(dsn=".", layer="ne_110m_admin_0_countries")
wyield <- read.csv(file = "F:/Purdue University/RA_Position/PhD_ResearchandDissert/PhD_Draft/GTAP-CGE/GTAP_Sims&Rests/NewFiles/RMaps_GTAP/AllWorldCountries_CCShocksGTAP.csv", header=TRUE, sep=",", na.string="NA", dec=".", strip.white=TRUE)
wyield$ID_1 <- substr(wyield$ID_1,3,10) # Eliminate the ID_1 column
## re-order the shapefile
wyield <- cbind(id=rownames(wmap#data),wyield)
## Build table of labels for annotation (legend).
labs <- do.call(rbind,lapply(1:17,get.centroids)) # Call the polygon ID and centroid from shapefile
labs <- merge(labs,wyield[,c("id","ID_1","name_long")],by="id") # merging the "labs" data with the spatial data
labs[,2:3] <- sapply(labs[,2:3],function(x){as.numeric(as.character(x))})
labs$sort <- as.numeric(as.character(labs$ID_1))
labs <- cbind(labs, name_code = paste(as.character(labs[,4]), as.character(labs[,5])))
labs <- labs[order(labs$sort),]
## Dataframe for boxplot plot
boxplot.df <- wyield[c("ID_1","name_long","A1B","A1BLow","A1F","A1T","A2","B1","B1Low","B2")]
boxplot.df[1] <- sapply(boxplot.df[1], as.numeric)
boxplot.df <- boxplot.df[order(boxplot.df$ID_1),]
boxplot.df <- cbind(boxplot.df, name_code = paste(as.character(boxplot.df[,1]), as.character(boxplot.df[,2])))
boxplot.df <- melt(boxplot.df, id=c("ID_1","name_long","name_code"))
boxplot.df <- transform(boxplot.df,name_code=factor(name_code,levels=unique(name_code)))
## Define new theme for map
## I have found this function on the website
theme_map <- function (base_size = 14, base_family = "serif") {
# Select a predefined theme for tweaking features
theme_bw(base_size = base_size, base_family = base_family) %+replace%
theme(
axis.line=element_blank(),
axis.text.x=element_text(size=rel(1.2), color="grey"),
axis.text.y=element_text(size=rel(1.2), color="grey"),
axis.ticks=element_blank(),
axis.ticks.length=unit(0.3, "lines"),
axis.ticks.margin=unit(0.5, "lines"),
axis.title.x=element_text(size=rel(1.2), color="grey"),
axis.title.y=element_text(size=rel(1.2), color="grey"),
legend.background=element_rect(fill="white", colour=NA),
legend.key=element_rect(colour="white"),
legend.key.size=unit(1.3, "lines"),
legend.position="right",
legend.text=element_text(size=rel(1.3)),
legend.title=element_text(size=rel(1.4), face="bold", hjust=0),
panel.border=element_blank(),
panel.grid.minor=element_blank(),
plot.title=element_text(size=rel(1.8), face="bold", hjust=0.5, vjust=2),
plot.margin=unit(c(0.5,0.5,0.5,0.5), "lines")
)}
## Transform shapefile to dataframe and merge with yield data
wmap_df <- fortify(wmap)
wmap_df <- merge(wmap_df,wyield, by="id") # merge the spatial data and the yield data
## Plot map
mapy <- ggplot(wmap_df, aes(long,lat, group=group))
mapy <- mapy + geom_polygon(aes(fill=AVG))
mapy <- mapy + geom_path(data=wmap_df, aes(long,lat, group=group, fill=A1BLow), color="white", size=0.4)
mapy <- mapy + labs(title="Average yield impacts (in %) across SRES scenarios ") + scale_fill_gradient2(name = "%Change in yield",low = "red3",mid = "snow2",high = "darkgreen")
mapy <- mapy + coord_equal() + theme_map()
mapy <- mapy + geom_text(data=labs, aes(x=long, y=lat, label=ID_1, group=ID_1), size=6, family="serif")
mapy
## Plot boxplot
boxploty <- ggplot(data=boxplot.df, aes(factor(name_code),value)) +
geom_boxplot(stat="boxplot",
position="dodge",
fill="grey",
outlier.colour = "blue",
outlier.shape = 16,
outlier.size = 4) +
labs(title="Distribution of yield impacts (in %) by GTAP region", y="Yield (% Change)") + theme_bw() + coord_flip() +
stat_summary(fun.y = "mean", geom = "point", shape=21, size= 4, color= "red") +
theme(plot.title = element_text(size = 26,
hjust = 0.5,
vjust = 1,
face = 'bold',
family="serif")) +
theme(axis.text.x = element_text(colour = 'black',
size = 18,
hjust = 0.5,
vjust = 1,
family="serif"),
axis.title.x = element_text(size = 14,
hjust = 0.5,
vjust = 0,
face = 'bold',
family="serif")) +
theme(axis.text.y = element_text(colour = 'black',
size = 18,
hjust = 0,
vjust = 0.5,
family="serif"),
axis.title.y = element_blank())
boxploty
## I found this code on the website, and tried to tweak it to achieve my desired
result, but failed
# Plot objects using widths and height and respect to fix aspect ratios
grid.newpage()
pushViewport( viewport( layout = grid.layout( 2 , 1 , widths = unit( c( 1 ) , "npc" ) ,
heights = unit( c( 0.45 ) , "npc" ) ,
respect = matrix(rep(2,1),2) ) ) )
print( mapy, vp = viewport( layout.pos.row = 1, layout.pos.col = 1 ) )
print( boxploty, vp = viewport( layout.pos.row = 2, layout.pos.col = 1 ) )
upViewport(0)
vp3 <- viewport( width = unit(0.5,"npc") , x = 0.9 , y = 0.5)
pushViewport(vp3)
#grid.draw( legend )
popViewport()
Is this close to what you had in mind?
Code:
library(rgdal)
library(ggplot2)
library(RColorBrewer)
library(reshape)
library(gridExtra)
setwd("<directory with all your files...>")
get.centroids = function(x){ # extract centroids from polygon with given ID
poly = wmap#polygons[[x]]
ID = poly#ID
centroid = as.numeric(poly#labpt)
return(c(id=ID, c.long=centroid[1], c.lat=centroid[2]))
}
wmap <- readOGR(dsn=".", layer="ne_110m_admin_0_countries")
wyield <- read.csv(file = "AllWorldCountries_CCShocksGTAP.csv", header=TRUE)
wyield <- transform(wyield, ID_1 = substr(ID_1,3,10)) #strip leading "TR"
# wmap#data and wyield have common, unique field: name
wdata <- data.frame(id=rownames(wmap#data),name=wmap#data$name)
wdata <- merge(wdata,wyield, by="name")
labs <- do.call(rbind,lapply(1:17,get.centroids)) # extract polygon IDs and centroids from shapefile
wdata <- merge(wdata,labs,by="id")
wdata[c("c.lat","c.long")] <- sapply(wdata[c("c.lat","c.long")],function(x) as.numeric(as.character(x)))
wmap.df <- fortify(wmap) # data frame for world map
wmap.df <- merge(wmap.df,wdata,by="id") # merge data to fill polygons
palette <- brewer.pal(11,"Spectral") # ColorBrewewr.org spectral palette, 11 colors
ggmap <- ggplot(wmap.df, aes(x=long, y=lat, group=group))
ggmap <- ggmap + geom_polygon(aes(fill=AVG))
ggmap <- ggmap + geom_path(colour="grey50", size=.1)
ggmap <- ggmap + geom_text(aes(x=c.long, y=c.lat, label=ID_1),size=3)
ggmap <- ggmap + scale_fill_gradientn(name="% Change",colours=rev(palette))
ggmap <- ggmap + theme(plot.title=element_text(face="bold"),legend.position="left")
ggmap <- ggmap + coord_fixed()
ggmap <- ggmap + labs(x="",y="",title="Average Yield Impacts across SRES Scenarios (% Change)")
ggmap <- ggmap + theme(plot.margin=unit(c(0,0.03,0,0.05),units="npc"))
ggmap
box.df <- wdata[order(as.numeric(wdata$ID_1)),] # order by ID_1
box.df$label <- with(box.df, paste0(name_long," [",ID_1,"]")) # create labels for boxplot
box.df <- melt(box.df,id.vars="label",measure.vars=c("A1B","A1BLow","A1F","A1T","A2","B1","B1Low","B2"))
box.df$label <- factor(box.df$label,levels=unique(box.df$label)) # need this so orderin is maintained in ggplot
ggbox <- ggplot(box.df,aes(x=label, y=value))
ggbox <- ggbox + geom_boxplot(fill="grey", outlier.colour = "blue", outlier.shape = 16, outlier.size = 4)
ggbox <- ggbox + stat_summary(fun.y=mean, geom="point", shape=21, size= 4, color= "red")
ggbox <- ggbox + coord_flip()
ggbox <- ggbox + labs(x="", y="% Change", title="Distribution of Yield Impacts by GTAP region")
ggbox <- ggbox + theme(plot.title=element_text(face="bold"), axis.text=element_text(color="black"))
ggbox <- ggbox + theme(plot.margin=unit(c(0,0.03,0,0.0),units="npc"))
ggbox
grid.newpage()
pushViewport(viewport(layout=grid.layout(2,1,heights=c(0.40,0.60))))
print(ggmap, vp=viewport(layout.pos.row=1,layout.pos.col=1))
print(ggbox, vp=viewport(layout.pos.row=2,layout.pos.col=1))
Explanation:
The last 4 lines of code do most of the work in arranging the layout. I create a viewport layout with 2 viewports arranged as 2 rows in 1 column. The upper viewport is 40% of the height of the grid, the lower viewport is 60% of the height. Then, in the ggplot calls I create a right margin of 3% of the plot width for both the map and he boxplot, and a left margin for the map so that the map and the boxplot are aligned on the left. There's a fair amount of tweaking to get everything lined up, but these are the parameters to play with. You should also know that, since we use coord_fixed() in the map, if you change the overall size of the plot (by resizing the plot window, for example), the map's width will change..
Finally, your code to create the choropleth map is a little dicey...
## re-order the shapefile
wyield <- cbind(id=rownames(wmap#data),wyield)
This does not reorder the shapefile. All you are doing here is prepending the wmap#data rownames to your wyield data. This works if the rows in wyield are in the same order as the polygons in wmap - a very dangerous assumption. If they are not, then you will get a map, but the coloring will be incorrect and unless you study the output very carefully, it is likely to be missed. So the code above creates an association between polygon ID and region name, merges the wyield data based on name, and then merges that into wmp.df based on polygon id.
wdata <- data.frame(id=rownames(wmap#data),name=wmap#data$name)
wdata <- merge(wdata,wyield, by="name")
...
wmap.df <- fortify(wmap) # data frame for world map
wmap.df <- merge(wmap.df,wdata,by="id") # merge data to fill polygons
The graph I'm currently trying to make falls a little between two stools. I want to make a histogram that is composed of stacked and labelled boxes. Here's an example of exactly the sort of thing I'm talking about, taken from a recent article in the New York Times:
http://farm8.staticflickr.com/7109/7026409819_1d2aaacd0a.jpg
Is it possible to achieve this using ggplot2?
To amplify the question somewhat, so far what I have is:
dfr <- data.frame(
name = LETTERS[1:26],
percent = rnorm(26, mean=15)
)
ggplot(dfr, aes(x=percent, fill=name)) + geom_bar() +
stat_bin(geom="text", aes(label=name))
...which I'm clearly doing all wrong. Ultimately what I'd ideally like is something along the lines of the manually-modified graph below, with (say) letters A to M filled one shade and N to Z filled another.
http://farm8.staticflickr.com/7116/7026536711_4df9a1aa12.jpg
Here you go!
set.seed(3421)
# added type to mimick which candidate is supported
dfr <- data.frame(
name = LETTERS[1:26],
percent = rnorm(26, mean=15),
type = sample(c("A", "B"), 26, replace = TRUE)
)
# easier to prepare data in advance. uses two ideas
# 1. calculate histogram bins (quite flexible)
# 2. calculate frequencies and label positions
dfr <- transform(dfr, perc_bin = cut(percent, 5))
dfr <- ddply(dfr, .(perc_bin), mutate,
freq = length(name), pos = cumsum(freq) - 0.5*freq)
# start plotting. key steps are
# 1. plot bars, filled by type and grouped by name
# 2. plot labels using name at position pos
# 3. get rid of grid, border, background, y axis text and lables
ggplot(dfr, aes(x = perc_bin)) +
geom_bar(aes(y = freq, group = name, fill = type), colour = 'gray',
show_guide = F) +
geom_text(aes(y = pos, label = name), colour = 'white') +
scale_fill_manual(values = c('red', 'orange')) +
theme_bw() + xlab("") + ylab("") +
opts(panel.grid.major = theme_blank(), panel.grid.minor = theme_blank(),
axis.ticks = theme_blank(), panel.border = theme_blank(),
axis.text.y = theme_blank())