I have been using R Markdown for some time. Recently I generated PDF output from an R Markdown document in which I used ggplot() from ggplot2 inside sapply() to draw charts. The charts rendered fine, but a bunch of other details were also printed in the PDF.
Multiple lines starting with ##, with details on panel, plots, data, color, and size, were printed. I am not sure how to exclude them from my output. Is this related to R Markdown or to sapply? I tried various chunk options such as echo=FALSE, but had no success.
sapply(unique(turnDatC$vintage), function(x) {
  tmpflt <- filter(turnDatC, vintage == x)
  sapply(unique(tmpflt$net_coupon), function(y) {
    df1 <- filter(tmpflt, net_coupon == y) %>%
      select(product, as_of_date, vintage, net_coupon, cltv_bkt, variable, value) %>%
      group_by(product, vintage, net_coupon, cltv_bkt, variable) %>%
      filter(length(value) > 1) %>%
      ungroup()
    if (nrow(df1) > 1) {
      print(x)
      print(y)
      plot(df1 %>%
        ggplot(aes(x = as_of_date, y = value, color = variable, group = variable)) +
        geom_line() +
        geom_point() +
        scale_x_date(breaks = date_breaks("months"), labels = date_format("%Y/%m")) +
        xlab("TimeSeries") +
        ylab("CPR") +
        facet_grid(cltv_bkt ~ .) +
        theme(legend.position = "bottom", legend.title = element_blank(), axis.text.x = element_text(angle = 90, hjust = 1)) +
        ggtitle(paste0("Vintage - ", x, " Coupon - ", y, " CLTV across Time Series")))
    }
  })
})
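A likely cause, offered as a hedged sketch rather than a confirmed diagnosis: sapply() collects whatever the inner function returns, and R auto-prints that collected result at the top level of a chunk, which knitr renders as lines prefixed with ##. The chunk option echo=FALSE only hides the source code, not printed results. The toy function draw_one below is hypothetical and stands in for the plotting code; it uses base R only.

```r
# Hypothetical stand-in for the plotting function: it is called for its
# side effect (here a message, in the question a plot), not its return value.
draw_one <- function(x) {
  message("drawing ", x)  # side effect, analogous to drawing a chart
  list(panel = x)         # stand-in for the object the plotting call returns
}

# Auto-printed at top level: the collected return values appear in the output.
sapply(1:2, draw_one, simplify = FALSE)

# Wrapped in invisible(): same side effects, nothing is auto-printed.
invisible(sapply(1:2, draw_one, simplify = FALSE))

# A plain for loop never collects or prints the return values.
for (x in 1:2) draw_one(x)
```

Note that the explicit print(x) and print(y) calls in the original code would still produce ## lines; those would have to be removed, or hidden with the knitr chunk option results='hide'.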
I am working with the CHOIRBM package (https://github.com/emcramer/CHOIRBM) and made a body map showing the prevalence of pain throughout the body. I want to alter the thickness of the lines on the body map itself, but the package doesn't appear to have that option. The package is built on top of ggplot2, however, so I was wondering whether there is some ggplot2 command I'm not aware of that will do this.
Here is the code. Note that id= region of the body, Percent= percent of pain, and group= front or back of body. The package requires things to be labeled this way. And this is what the map looks like:
painF %>% filter(value == 1) %>% select(id, Percent, group) %>%
plot_female_choirbm(value="Percent") +
scale_fill_gradient(low="white", high="red", limits=c(0,10)) +
theme(legend.position = "bottom") +
labs(fill = "Prevalence of pain (%)")+
ggtitle(" " , subtitle="Female")+
theme(plot.subtitle=element_text(face="bold", hjust=0.5))
There doesn't seem to be a direct way to do this, but you can do it indirectly since it is built ultimately out of geom_polygon. First store the plot:
p <- plot_female_choirbm(values, "value") +
scale_fill_gradient(low="white", high="red", limits=c(0,10)) +
theme(legend.position = "bottom") +
labs(fill = "Prevalence of pain (%)")+
ggtitle(" " , subtitle="Female")+
theme(plot.subtitle=element_text(face="bold", hjust=0.5))
p
Now you overwrite the size aesthetic of the polygon layer:
p$plot_env$p$layers[[1]]$aes_params$size <- 0.5
p
Note that the polygons may no longer overlap perfectly.
Data generated from GitHub example
library(CHOIRBM)
# generate some random example data
set.seed(123)
ids <- as.character(c(seq.int(101, 136, 1), seq.int(201, 238, 1)))
values <- data.frame(
id = ids
, value = runif(length(ids), 0, 10)
, ucolors = rainbow(length(ids))
, group = ifelse(as.numeric(ids) < 200, "Front", "Back")
)
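The layer-editing idea from the answer above can be tried on a plain ggplot2 polygon, without CHOIRBM. This is a minimal sketch under two assumptions: ggplot2 3.4.0 or later is installed, where the outline width of polygons is the linewidth aesthetic (older versions call it size, as in the answer), and the names square and p_demo are made up for illustration. Which of the two names a CHOIRBM layer stores depends on the installed versions, so inspect aes_params before overwriting anything.

```r
library(ggplot2)

# A single square drawn with a thin outline.
square <- data.frame(x = c(0, 1, 1, 0), y = c(0, 0, 1, 1))
p_demo <- ggplot(square, aes(x, y)) +
  geom_polygon(fill = "white", colour = "black", linewidth = 0.2)

# Constant (non-mapped) aesthetics given in the geom call are stored in the
# layer's aes_params; listing the names shows what can be overwritten.
names(p_demo$layers[[1]]$aes_params)

# Overwrite the stored outline width after the plot has been built up.
p_demo$layers[[1]]$aes_params$linewidth <- 2
p_demo
```

Layers are reference objects, so editing a layer in place also affects any other plot object that shares the same layer.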
I am trying to fetch some code from webpage https://fgeerolf.com/data/oecd/ULC_QUA.html#labour_income_share_(real_ulc)_(total_economy) and replicate the code, but I encountered the error message "function and object not found".
The code given is:
ULC_QUA %>%
filter(MEASURE == "IXOBTE",
SUBJECT == "ULQBBU99",
LOCATION == "DEU") %>%
left_join(ULC_QUA_var$SECTOR, by = c("SECTOR" = "id")) %>%
rename(SECTOR_desc = label) %>%
year_to_date %>%
arrange(SECTOR_desc) %>%
ggplot() +
geom_line(aes(x = date, y = obsValue, color = SECTOR_desc, linetype = SECTOR_desc)) +
scale_color_manual(values = viridis(9)[1:8]) +
theme_minimal() +
scale_x_date(breaks = seq(1920, 2025, 2) %>% paste0("-01-01") %>% as.Date,
labels = date_format("%y")) +
theme(legend.position = c(0.8, 0.3),
legend.title = element_blank()) +
scale_y_continuous(breaks = seq(0, 200, 10)) +
ylab("Labour Income Share (Real ULC) (Total Economy)") + xlab("")
And I got
Error in year_to_date(.) : could not find function "year_to_date"
Eventually I want to generate the following plot:
First of all, I think I need to read the original data from the source, but I don't know where it is located or how to import it. Is there any way I can replicate the plot without further information?
Any help would be much appreciated!
I am interested in visualizing two plots together in the same page. One plot shows Usage of Drugs over time and the other plot shows Occurrence of Resistance over time. The Y-axis scales are very different for both so I think displaying them in separate graphs is useful.
I use the following code to generate the two graphs for each of 676 ids (elements in my data) and save them into two separate PDFs. This is not helpful for comparing how usage and resistance vary over time for a single id. Instead, I would like to generate one PDF in which each page shows both the resistance and the usage over time for the same id. So the goal is a PDF with 676 pages, one per id, each page displaying usage and resistance for that id.
I know this can be done using grid.arrange from gridExtra but not sure how to use it in a loop and with lapply.
###Resistance
Plot_list1 =list()
#this is the loop
for (i in J0$id){
temp1 <- J0%>%
filter(id==i)%>%
ggplot(aes(x = Year , y = Rest)) +
geom_line()+
geom_point()+
scale_x_continuous(breaks=c(2008, 2009,2010,2011,2012,2013,2014,2015,2016,2017,2018))+
theme(axis.text.x = element_text(angle = 90))+
theme(legend.position = "none") +
ggtitle(i)
Plot_list1[[i]] <- temp1
}
##saving the loop in pdf
pdf("Resistance.pdf")
invisible(lapply(Plot_list1, print))
dev.off()
###Usage
Plot_list2 =list()
#this is the loop
for (i in J0$id){
temp2 <- J0%>%
filter(id==i)%>%
ggplot(aes(x = Year , y = DUL0)) +
geom_line()+
geom_point()+
scale_x_continuous(breaks=c(2008, 2009,2010,2011,2012,2013,2014,2015,2016,2017,2018))+
theme(axis.text.x = element_text(angle = 90))+
theme(legend.position = "none") +
ggtitle(i)
Plot_list2[[i]] <- temp2
}
##saving the loop in pdf
pdf("UsageDUL0.pdf")
invisible(lapply(Plot_list2, print))
dev.off()
Here's a terse walk-through.
Step 1, generate fake data, plot it individually into gg1 and gg2, then combine them using patchwork. This could easily (and perhaps arguably should) be broken into multiple stages, but it's small enough to use just one lapply.
library(ggplot2)
library(patchwork)
set.seed(42)
allgg <- lapply(1:3, function(ind) {
dat <- mtcars[sample(NROW(mtcars), 10),]
gg1 <- ggplot(dat, aes(disp, mpg)) + geom_point(color = "red") + labs(title = paste("Page ", ind), subtitle = "mpg ~ disp")
gg2 <- ggplot(dat, aes(qsec, drat)) + geom_point(color = "blue") + labs(subtitle = "drat ~ qsec")
gg1 / gg2
})
Start the pdf file, plot them all, then close the device.
pdf("quux.pdf", onefile = TRUE, width = 6, height = 6)
for (pg in allgg) print(pg)
dev.off()
For me, this produces a 6x6 inch PDF with three pages, one stacked pair of plots per page.
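Applied to the question's setup, the same pattern might look like the sketch below. The J0 data frame here is fabricated stand-in data with the column names from the question (id, Year, Rest, DUL0), three ids instead of 676, and the output path is a temporary file chosen for illustration.

```r
library(ggplot2)
library(patchwork)

# Fabricated stand-in for J0: three ids with yearly values for both measures.
set.seed(1)
J0 <- expand.grid(id = c("A", "B", "C"), Year = 2008:2018,
                  stringsAsFactors = FALSE)
J0$Rest <- runif(nrow(J0))
J0$DUL0 <- runif(nrow(J0), 0, 500)

# One stacked page (usage on top, resistance below) per unique id.
pages <- lapply(unique(J0$id), function(i) {
  d <- J0[J0$id == i, ]
  usage <- ggplot(d, aes(Year, DUL0)) +
    geom_line() + geom_point() +
    scale_x_continuous(breaks = 2008:2018) +
    ggtitle(i)
  resistance <- ggplot(d, aes(Year, Rest)) +
    geom_line() + geom_point() +
    scale_x_continuous(breaks = 2008:2018)
  usage / resistance
})

# Write every page into a single multi-page PDF.
out_file <- file.path(tempdir(), "usage_resistance.pdf")
pdf(out_file, onefile = TRUE, width = 6, height = 6)
for (pg in pages) print(pg)
invisible(dev.off())
```

Iterating over unique(J0$id) rather than J0$id builds each page once, and keeping the two y-axes in separate panels preserves their very different scales.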
I am trying to organize several dozens of plots using ggarrange, so I have setup a loop in which I save each plot in a list. Each plot differs from each other with different data, title, etc. Everything works perfectly until I try to use geom_text to place some text inside the plot. When the plots are saved in the list, each plot inherits the geom_text from the last plot in the list. I don't know how to avoid this.
my.list = vector("list", length = 2)
dt = data.table(x = c(1, 100, 100000), y = c(1, 100, 100000))
plotname = c('first', 'second')
for (i in 1:length(my.list)) {
  # each + sits at the end of a line so that R parses the chain as one expression
  my.list[[i]] = ggplot(data = dt, aes(x = x, y = y)) +
    geom_point(size = 1.5, aes(color = c('red'))) +
    labs(x = NULL, y = NULL) +
    scale_color_manual(values = 'red') +
    theme_bw() +
    theme(panel.background = element_rect(fill = 'light grey', colour = 'black'), legend.position = "none") +
    geom_text(inherit.aes = FALSE, aes(x = 500, y = 100000, label = paste0('NRMSE:', i))) +
    ggtitle(paste0(plotname[i])) +
    coord_equal() +
    geom_abline(slope = 1) +
    scale_y_log10(breaks = c(1, 10, 100, 1000, 10000, 100000), limits = c(1, 100000)) +
    scale_x_log10(breaks = c(1, 10, 100, 1000, 10000, 1000000), limits = c(1, 100000)) +
    labs(x = NULL, y = NULL) +
    theme_bw() +
    theme(panel.background = element_rect(fill = 'light grey', colour = 'black'), legend.position = "none")
}
After this I do:
plotosave=ggarrange(plotlist=my.list)
Using lapply instead of a for loop works fine:
my.list <- lapply(1:2, function(i) {
ggplot(data = dt, aes(x = x, y = y )) +
geom_point(size=1.5) +
labs(x=NULL, y=NULL) +
theme_bw() +
theme(panel.background = element_rect(fill='light grey', colour='black'),
legend.position = "none") +
geom_text(inherit.aes=FALSE,aes(x=50000, y=100000,
label=paste0('NRMSE:',i))) +
ggtitle(paste0(plotname[i]))
})
ggarrange(plotlist = my.list)
Note: the issue is not with ggarrange.
Roland:
The plot is built when you print the ggplot object. Anything that is not part of the data passed will be taken from the enclosing environment at exactly that time point. If you use the iterator of a for loop in the plot, it has its last value then (or any value you change it to later on). lapply avoids the issue because of the behaviour explained in the Note in its documentation.
Related post:
the problem is that ggplot() waits until you print the plot to resolve the variables in the aes() command.
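The behaviour described in these two quotes can be reproduced without ggplot2. The sketch below uses plain closures as a stand-in for aes(): each function looks up i only when it is finally called, just as the plot resolves i only when it is printed. The names fs and gs are made up for illustration, and the lapply result assumes R 3.2.0 or later, where lapply forces the argument of each call.

```r
# for loop: every closure refers to the same i in the calling environment,
# so by the time they are called they all see the loop's final value.
fs <- list()
for (i in 1:2) fs[[i]] <- function() i
fs[[1]]()  # returns 2, not 1

# lapply: each call of the anonymous function has its own environment
# with its own i, so each closure keeps the value it was created with.
gs <- lapply(1:2, function(i) function() i)
gs[[1]]()  # returns 1
```

Passing the value outside aes() has the same effect as the lapply version, because the expression is then evaluated immediately instead of when the plot is drawn.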
I don't exactly know why this occurs but if you remove aes from geom_text it works.
library(ggplot2)
my.list = vector("list", length = 2)
dt = data.table::data.table(x=c(1,100,100000),y=c(1,100,100000))
plotname = c('first','second')
for (i in 1:length(my.list)) {
my.list[[i]]= ggplot(data = dt, aes(x = x, y = y )) +
geom_point(size=1.5) +
labs(x=NULL, y=NULL) +
theme_bw() +
theme(panel.background = element_rect(fill='light grey', colour='black'),
legend.position = "none") +
geom_text(x=50000, y=100000, label=paste0('NRMSE:',i)) +
ggtitle(paste0(plotname[i]))
}
plotosave = ggpubr::ggarrange(plotlist=my.list)
I have a script that used to produce a facetted plot with strip text on multiple lines. But this does not work anymore. Below is a MWE where the strip text should be parsed from, e.g. "bold(A)\nreally~long~extra" to:
A
really long extra
The second line is cut off as you can see via the debug function. I even increased the margins but to no avail...
Any ideas what is the issue?
library(ggplot2)
library(dplyr)
exmpl = data.frame(a = 1:100,
                   b = rep(1:5, 20),
                   f = factor(rep(LETTERS[1:5], each = 20))) %>%
  as_tibble() %>%
  mutate(f2 = paste0("bold(", f, ")\nreally~long~extra"))
ggplot(exmpl, aes(x = b, y = a)) +
  facet_grid(. ~ f2, labeller = label_parsed) +
  geom_point() +
  theme(strip.text.x = element_text(size = 10, hjust = 0, margin = margin(.5, 0, .5, 0, "cm"), debug = TRUE))
EDIT:
And while we are at it, I only came up with this workaround because my previous solution of using label_bquote() does not work anymore. Please have a look at this other question, maybe you can help me with this, too?
Not sure whether this works for you, but one way to achieve the desired result is to use the ggtext package, which allows you to style your facet labels using HTML and CSS. To this end, ggtext introduces a new theme element, element_markdown. Try this:
library(ggplot2)
library(dplyr)
exmpl = data.frame(a = 1:100,
b = rep(1:5, 20),
f = factor(rep(LETTERS[1:5], each = 20))) %>%
as_tibble() %>%
mutate(f2 = paste0("<b>", f, "</b><br>", "really long extra"))
ggplot(exmpl, aes(x = b, y = a)) +
facet_grid(. ~ f2) +
geom_point() +
theme(strip.text.x = ggtext::element_markdown(size = 10, hjust = 0))
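If staying with label_parsed is preferred, another route, not part of the answer above and offered only as a sketch, is plotmath's atop(), which stacks one expression over another instead of relying on a newline inside the expression string. The name p_atop is made up for illustration.

```r
library(ggplot2)

exmpl <- data.frame(a = 1:100,
                    b = rep(1:5, 20),
                    f = factor(rep(LETTERS[1:5], each = 20)))

# Build a plotmath expression string such as
# "atop(bold(A), really~long~extra)" for every row.
exmpl$f2 <- paste0("atop(bold(", exmpl$f, "), really~long~extra)")

p_atop <- ggplot(exmpl, aes(x = b, y = a)) +
  facet_grid(. ~ f2, labeller = label_parsed) +
  geom_point()
p_atop
```

One trade-off: atop() centres the two lines relative to each other, so it cannot reproduce the left-aligned layout that hjust = 0 gives with element_markdown.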
And for the second question in your former post a solution might look like so:
mylabel <- function(x) {
mutate(x, Species = paste0(letters[Species], " <i>", Species, "</i>"))
}
p <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) + geom_point()
p + facet_grid(. ~ Species, labeller = mylabel) +
theme(strip.text.x = ggtext::element_markdown())