plot labels overlap - how to expand scale of graph sheet? - r

Setup/Problem:
I have created a simple scatter plot using ggplot2 library and the qplot() function in RStudio.
Issue:
The issue is that the labels overlap when I create the plot.
Question:
Is there a simple way to expand the graph sheet to stop the plot labels overlapping?
Is there a simple way to stop the labels being cut off by the edge of the graph?
I do not want to remove labels. My sense would be to expand the sheet size but I cannot seem to find a way to do that. Any help would be much appreciated.
Research so far
I have investigated the wordcloud library as an alternative but hit the same issue.
I have investigated using the scale_x_continuous(expand = c(.3, .3)) command which does allow me to expand the sheet to address the edge issue but I am looking to see if a better solution exists.
I have read through the ggplot2 manual pages but have failed to find a clean solution. I felt it is time to ask for some help and a few pointers to a solution. If I find a solution I will post it.
Example Output (Data File below)
Code
library(ggplot2)
library(grid)
td3 <- read.csv("td3.csv")
p <-qplot(X,Y, xaxs = "i", yaxs = "r", las = 1, data=td3, shape=as.factor(Type), label=Identifier, asp = 1)
p <- p + scale_x_continuous(expand = c(.3, .3))
p + geom_text(aes(colour=factor(Type)), angle = 30, size=4, hjust=-0.1, panel.margin = unit(50, "lines"))
Test Data
Identifier,X,Y,,Type
1st Reference Long Title,5,280,,Super fit
2nd Reference Long Title,1,60,,fit
3rd Reference Long Title,1,60,,fit
4th Reference Long Title,3,100,,fit
5th Reference Long Title,1,14,,unfit
6th Reference Long Title,1,48,,fit
7th Reference Long Title,1,48,,fit
8th Reference Long Title,10,80,,fit
9th Reference Long Title,1,24,,unfit
10th Reference Long Title,1,80,,fit
11th Reference Long Title,1,36,,unfit
12th Reference Long Title,1,10,,unfit
13th Reference Long Title,3,60,,fit
14th Reference Long Title,3,120,,fit
15th Reference Long Title,3,80,,fit
16th Reference Long Title,10,400,,Super fit
17th Reference Long Title,5,360,,Super fit
18th Reference Long Title,2,5,,unfit

You could "increase the size of the canvas" before issuing the plotting command by opening a graphics device with larger dimensions; see for example ?png or ?jpeg.
Alternatively you could use ggsave, see R plot: size and resolution
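As a minimal sketch of the ggsave route: the data frame below is a hand-typed four-row subset of the test data above, and the 12 x 8 inch canvas and the output file name are arbitrary choices made only for illustration. It uses ggplot() rather than qplot(), which is an assumption about what the asker would accept.

```r
library(ggplot2)

# Hand-typed subset of the question's test data (illustrative only)
td3 <- data.frame(
  Identifier = paste(c("1st", "2nd", "4th", "8th"), "Reference Long Title"),
  X = c(5, 1, 3, 10),
  Y = c(280, 60, 100, 80),
  Type = c("Super fit", "fit", "fit", "fit")
)

p <- ggplot(td3, aes(x = X, y = Y, label = Identifier, colour = Type)) +
  geom_point(aes(shape = Type)) +
  geom_text(angle = 30, size = 4, hjust = -0.1) +
  scale_x_continuous(expand = c(0.3, 0.3))  # same padding as in the question

# A larger canvas gives the long labels more room; the size is a guess to tune
out_file <- file.path(tempdir(), "labels_wide.png")
ggsave(out_file, plot = p, width = 12, height = 8, units = "in", dpi = 150)
```

Because the text size is fixed in points, enlarging width and height spreads the points apart while the labels stay the same size, which is what reduces the overlap.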

Related

Remove polygon lines in tmap - or possibly dissolve polygons?

I'm making a map of flood zones that ends up with additional lines I don't want. Here is the code:
tmap_mode("view")
tm_shape(floodplot) +
tm_polygons(
col = "Risk Level",
alpha = 0.9,
palette = colors
) +
tm_layout(
title = "What is the risk of flooding in Robeson County?",
legend.position = c("right", "top"),
legend.text.size = 12
) +
tm_style(style = "classic")
And here is the map: [image]
The horizontal and vertical lines are the census tracts, but I don't want them on this map. The floodplot data has a geometry column as well as a SHAPE_Leng and SHAPE_Area column, which I assume are causing the issue.
I tried using some arguments in tm_lines, but that didn't work. It doesn't actually seem like any of the lines after the polygons layer are changing the map, but I'm fine with that if I can get rid of the census tract lines.
As a temporary solution I set lwd = 0, but that of course removes all the lines. The person who showed me the data said something about dissolving the polygons, but I'm not sure how to do that.
Thank you!
You should try adding the function tm_grid() (see documentation) to your map code and adjusting its arguments. It is hard to come up with a solution if you don't share the data, but probably adding:
+ tm_grid(alpha = 1)
or:
+ tm_grid(lines = FALSE)
should work.
As for the dissolving: while it is difficult to make 100% certain without having a look at your data I suppose this should be doable via a dplyr::summarise() call; something along the lines of
library(dplyr)
library(sf)
floodplot <- floodplot %>%
group_by(`Risk Level`) %>%
summarise()
You will find the technique explained here: https://www.jla-data.net/eng/merging-geometry-of-sf-objects-in-r/
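Since the floodplot data was not shared, here is a small self-contained sketch of the same dissolve on made-up geometry. The square "tracts", the helper sq() and the column name risk are hypothetical stand-ins (the real column is `Risk Level` and would need backticks).

```r
library(sf)
library(dplyr)

# Hypothetical helper: a closed unit square whose lower-left corner is (x0, 0)
sq <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

# Two touching "High" squares and one separate "Low" square
tracts <- st_sf(
  data.frame(risk = c("High", "High", "Low")),
  geometry = st_sfc(sq(0), sq(1), sq(3))
)

# summarise() on a grouped sf object unions the geometries within each group,
# so the shared edge between the two "High" squares disappears
dissolved <- tracts %>%
  group_by(risk) %>%
  summarise()

nrow(dissolved)  # 2 rows, one per risk level
```

The dissolved object can then be passed to tm_shape() in place of the original, so that only the outer boundary of each risk level is drawn.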

Issues with axis labeling on boxplots in R

Hopefully this is a quick fix. I am trying to make boxplots of nutrient river concentrations using R code written by the person previously in my position (I am not so experienced with R, and we use it only for this). The issue is that the axes of the output boxplot have multiple overlapping text labels, some of which seem to come from another part of the code which I thought did not dictate axis labels. The original code is shown below (the working directory is already set and the csv files are imported, and I know that works), and the resulting boxplot is in 1.
Edit: Code below
png(filename="./TP & TN Plotting/Plots/TN Concentration/Historical TN Concentrations Zoomed to Medians.png",
width=10,
height=4,
unit="in",
res=600)
par(mar=c(5,5,3,1),
cex=.75)
tnhistconc<-(boxplot(Conc_ppb[Year!=2009 & Year !=2010]~Year[Year!=2009 & Year !=2010],
data=TNhist))
boxplot(Conc_ppb[Year!=2009 & Year !=2010]~Year[Year!=2009 & Year !=2010],
data=TNhist,
ylim=c(0,3000),
xaxt="n"),
at=c(1:3,5:8,10:(length(tnhistconc$n)+2)))
axis.break(axis=1,breakpos=c(4),style="slash")
axis.break(axis=1,breakpos=c(9),style="slash")
text(c(1:3,5:8,10:(length(tnhistconc$n)+2)),
-50,
paste("n=",
tnhistconc$n),
cex=0.8)
title(ylab="TN Concentration (ppb)",
xlab="Year")
title(main=paste("Historical (1998 - 2000), (2005 - 2008) + UMass (2012 -
",max(TNhist$Year),") TN Concentration"))
dev.off()
I made the edit of adding ,xlab="",ylab="" after xlab="Year" towards the bottom, since this fixed this issue in other similar sections of boxplot code (except it seems I needed to add it to a different part of those sections, see 2 - also tried it after xaxt ="n" as in 2 and got the same result). It fixes the overlapping text issue, but the axis labels are still not what I want them to be ("Year" and "TN Concentration (ppb)"), and this is shown in 3.
So, does anyone potentially know of a simple fix that might get rid of these unwanted labels and replace them with the correct ones? Am I missing something basic? The same original code seemed to work fine in the past before I was doing this (for 2018 data), and the spreadsheets the data is being imported from are the same, same setup and everything. Many thanks in advance!
Edit: I have a sample dataset which is just the last 2 years of data. See here: https://docs.google.com/spreadsheets/d/10oo9w-IzXkLWdY10A9gHYhDH67MeSibBpc2q67L6o88/edit?usp=sharing
1: Original code result
2: How this code fixed other similar issues
3: Partially fixed result based on edit
I don't know if it is a typo, but you have an extra parenthesis after xaxt = "n".
Maybe you can try something like this:
library(plotrix)  # provides axis.break()
png(filename="./TP & TN Plotting/Plots/TN Concentration/Historical TN Concentrations Zoomed to Medians.png",
width=10, height=4, unit="in", res=600)
par(mar=c(5,5,3,1), cex=.75)
tnhistconc<-(boxplot(Conc_ppb[Year!=2009 & Year !=2010]~Year[Year!=2009 & Year !=2010], data=TNhist))
boxplot(Conc_ppb[Year!=2009 & Year !=2010]~Year[Year!=2009 & Year !=2010],
data=TNhist,
ylim=c(0,3000),
xaxt="n", ylab = "", xlab = "",
at=c(1:3,5:8,10:(length(tnhistconc$n)+2)))
axis.break(axis=1,breakpos=c(4),style="slash")
axis.break(axis=1,breakpos=c(9),style="slash")
text(c(1:3,5:8,10:(length(tnhistconc$n)+2)),
-50,
paste("n=",
tnhistconc$n),
cex=0.8)
title(ylab="TN Concentration (ppb)",
xlab="Year",
main=paste("Historical (1998 - 2000), (2005 - 2008) + UMass (2012 -
",max(TNhist$Year),") TN Concentration"))
dev.off()
xaxt = "n" suppresses the default x axis (the break marks are then added by axis.break). xlab = "" and ylab = "" suppress the default x and y axis titles, which are set later by title.
Hopefully this works.
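To see the mechanism in isolation, here is a minimal sketch using R's built-in ToothGrowth data instead of the TNhist data (the dataset and label text are stand-ins chosen for illustration). As far as I can tell, recent R versions label the axes from the formula terms by default, which would explain why long subsetting expressions end up printed over the titles.

```r
# Suppress the default axis titles in the boxplot() call itself ...
b <- boxplot(len ~ dose, data = ToothGrowth,
             xlab = "", ylab = "")

# ... then add the intended titles afterwards, so nothing is drawn twice
title(xlab = "Dose (mg/day)", ylab = "Tooth length")

b$n  # group sizes, the same component used for the "n=" annotations above
```

If xlab = "" and ylab = "" are passed to title() instead of boxplot(), the default labels are still drawn by boxplot() and the titles from title() are blanked, which matches the partially fixed result described in the question.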
EDIT: Using ggplot2
Your dataframe is already in long format, so it can be plotted with ggplot2 in a few lines. Here your dataset is named df:
library(ggplot2)
ggplot(df, aes(x = as.factor(Year), y = Conc_ppb))+
geom_boxplot()+
labs(x = "Year", y = "TN Concentration (ppb)",
title = paste("Historical (1998 - 2000), (2005 - 2008) + UMass (2012 -
",max(df$Year),") TN Concentration"))

Ggplot does not show plots in sourced function

I've been trying to draw two plots using R's ggplot library in RStudio. Problem is, when I draw two within one function, only the last one displays (in RStudio's "plots" view) and the first one disappears. Even worse, when I run ggsave() after each plot - which saves them to a file - neither of them appear (but the files save as expected). However, I want to view what I've saved in the plots as I was able to before.
Is there a way I can both display what I'll be plotting in RStudio's plots view and also save them? Moreover, when the plots are not being saved, why does the display problem happen when there's more than one plot? (i.e. why does it show the last one but not the ones before?)
The code with the plotting parts are below. I've removed some parts because they seem unnecessary (but can add them if they are indeed relevant).
HHIplot = ggplot(pergame)
# some ggplot geoms and misc. here
ggsave(paste("HHI Index of all games,",year,"Finals.png"),
path = plotpath, width = 6, height = 4)
HHIAvePlot = ggplot(AveHHI, aes(x = AveHHI$n_brokers))
# some ggplot geoms and misc. here
ggsave(paste("Average HHI Index of all games,",year,"Finals.png"),
path = plotpath, width = 6, height = 4)
I've already taken a look here and here but neither have helped. Adding a print(HHIplot) or print(HHIAvePlot) after the ggsave() lines has not displayed the plot.
Many thanks in advance.
Update 1: The solution suggested below didn't work, although it works for the answer's sample code. I passed the ggplot objects to .GlobalEnv and print() gives me an empty gray box on the plot area (which I imagine is an empty ggplot object with no layers). I think the issue might lie in some of the layers or manipulators I have used, so I've brought the full code for one ggplot object below. Any thoughts? (Note: I've tried putting the assign() line in all possible locations in relation to ggsave() and ggplot().)
HHIplot = ggplot(pergame)
HHIplot +
geom_point(aes(x = pergame$n_brokers, y = pergame$HHI)) +
scale_y_continuous(limits = c(0,10000)) +
scale_x_discrete(breaks = gameSizes) +
labs(title = paste("HHI Index of all games,",year,"Finals"),
x = "Game Size", y = "Herfindahl-Hirschman Index") +
theme(text = element_text(size=15),axis.text.x = element_text(angle = 0, hjust = 1))
assign("HHIplot",HHIplot, envir = .GlobalEnv)
ggsave(paste("HHI Index of all games,",year,"Finals.png"),
path = plotpath, width = 6, height = 4)
I'll preface this by saying that the following is bad practice. It's considered bad practice to break a programming language's scoping rules for something as trivial as this, but here's how it's done anyway.
So within the body of your function you'll create both plots and put them into variables. Then you'll use ggsave() to write them out. Finally, you'll use assign() to push the variables to the global scope.
library(ggplot2)
myFun <- function() {
#some sample data that you should be passing into the function via arguments
df <- data.frame(x=1:10, y1=1:10, y2=10:1)
p1 <- ggplot(df, aes(x=x, y=y1))+geom_point()
p2 <- ggplot(df, aes(x=x, y=y2))+geom_point()
ggsave('p1.jpg', p1)
ggsave('p2.jpg', p2)
assign('p1', p1, envir=.GlobalEnv)
assign('p2', p2, envir=.GlobalEnv)
return()
}
Now, when you run myFun() it will write out your two plots to .jpg files, and also drop the plots into your global environment so that you can just run p1 or p2 on the console and they'll appear in RStudio's Plot pane.
ONCE AGAIN, THIS IS BAD PRACTICE
Good practice would be to not worry about the fact that they're not popping up in RStudio. They wrote out to files, and you know they did, so go look at them there.
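If you do want the plots in the Plots pane without touching the global environment, one alternative sketch is to print() each plot explicitly inside the function and return the plots to the caller. The function name make_plots, the sample data and the file names below are hypothetical. Two things this relies on: ggplot objects are only drawn automatically at the top level of the console, and the result of `plot + layers` has to be assigned back to a variable, otherwise the variable still holds the empty ggplot() object (which would explain the empty gray box in Update 1).

```r
library(ggplot2)

make_plots <- function(df, out_dir = tempdir()) {
  # Assign the full layered plot; `p + geom_point()` alone would be discarded
  p1 <- ggplot(df, aes(x = x, y = y1)) + geom_point()
  p2 <- ggplot(df, aes(x = x, y = y2)) + geom_point()

  # Inside a function nothing is auto-printed, so draw each plot explicitly
  print(p1)
  print(p2)

  # Pass the plot object to ggsave() rather than relying on the last plot drawn
  ggsave(file.path(out_dir, "p1.png"), plot = p1, width = 6, height = 4)
  ggsave(file.path(out_dir, "p2.png"), plot = p2, width = 6, height = 4)

  # Return the plots so the caller can redraw them later
  invisible(list(p1 = p1, p2 = p2))
}

plots <- make_plots(data.frame(x = 1:10, y1 = 1:10, y2 = 10:1))
```

In RStudio both plots should then be available through the Plots pane's back and forward arrows, and plots$p1 can be printed again at any time.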

Stacked bar in R

I have a table exported in csv from PostgreSQL and I'd like to create a stacked bar graph in R. It's my first project in R.
Here's my data and what I want to do:
It's the quality of the feeder bus service for a certain provider in the area. For each user of the train, we assign a service quality based on the synchronization between the bus and the train at the train stations, and calculate the percentage of users who have an ideal or very good service, a correct service, a deficient service or no service at all (linked to that question in gis.stackexchange)
So, I'd like to use my first column as my x-axis labels and my headers as my categories. The data is already normalized to 100% for each row.
In Excel, it's a couple of clicks, and I wouldn't mind typing a couple of lines of code since it's the final result of an already quite long plpgsql script... I'd prefer to continue to code instead of moving to Excel (I also have dozens of those to do).
So, I tried to create a stacked bar using the examples in Nathan Yau's "Visualize This" and the book "R in Action" and wasn't quite successful. Normally, their examples use data that they aggregate with R and use that. Mine is already aggregated.
So, I've finally come up with something that works in R:
but I had to transform my data quite a bit:
I had to transpose my table and remove my now-row (ex-column) identifier.
Here's my code:
# load libraries
library(ggplot2)
library(reshape2)
# load data
stl <- read.csv("D:/TEMP/rabat/_stl_rabattement_stats_mtl.csv", sep=";", header=TRUE)
# reshape for plotting
stl_matrix <- as.matrix(stl)
# make a quick plot
barplot(stl_matrix, border=NA, space=0.1, ylim=c(0, 100), xlab="Trains", ylab="%",
main="Qualité du rabattement, STL", las = 3)
Is there any way that I could use my original csv and have the same result?
I'm a little lost here...
Thanks!!!!
Try the ggplot2 and reshape2 libraries. You should be able to get the chart you want with:
stl$train_order <- as.numeric(rownames(stl))
stl.r <- melt(stl, id.vars = c("train_no", "train_order"))
stl.r$train_no <- factor(
stl.r$train_no,
levels = stl$train_no[order(stl$train_order)])
ggplot(stl.r, aes(x = factor(train_no), y = value, fill = variable)) + geom_bar(stat = 'identity')
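Since the original csv is not available, here is a self-contained sketch of the same melt-then-plot approach on made-up numbers. The column names (train_no, ideal, correct, deficient, none) and the values are assumptions standing in for the real file.

```r
library(ggplot2)
library(reshape2)

# Made-up wide table: one row per train, one column per service category,
# each row already summing to 100
stl <- data.frame(
  train_no  = c("T1", "T2", "T3"),
  ideal     = c(50, 40, 30),
  correct   = c(30, 30, 30),
  deficient = c(15, 20, 25),
  none      = c(5, 10, 15)
)

# Wide to long: melt() names the new columns "variable" and "value"
stl_long <- melt(stl, id.vars = "train_no")

# stat = "identity" uses the values as bar heights instead of counting rows
p <- ggplot(stl_long, aes(x = train_no, y = value, fill = variable)) +
  geom_bar(stat = "identity") +
  labs(x = "Trains", y = "%")
print(p)
```

With this layout the first column of the csv becomes the x axis and each header becomes one fill category, so no manual transposing is needed.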
It appears that you transposed the matrix manually. This can be done in R with the t() function.
Add the following line after the as.matrix(stl) line:
stl_matrix <- t(stl_matrix)

Text not appearing on XTS plot

I'm having trouble adding some text to a plot of time series data in R using xts. I've produced a simple example of the problem.
My text() command seems to do nothing, whereas I can add points to the plot. I've tried to keep the code simple by using defaults where possible.
require(quantmod)
# fetch the data and plot it using default options
getSymbols('MKS.L')
plot(MKS.L$MKS.L.Close)
# try to add text - doesn't appear
text(as.Date('2012-01-01'),y=500,"wobble", cex=4)
# add a point - this does appear
testPos <- xts(600, as.Date('2012-01-01'))
points( testPos, pch = 3, cex = 4, col = "red" )
Any help appreciated - I'm pretty new to R and I've spent hours on this!
Not a direct answer, but the plot.xts function that comes with the xts package is not fully developed.
You're much better off using plot.zoo or plot.xts from the xtsExtra package (which was written as a Google Summer of Code project with the intention being to roll it into the xts package)
Either of these will work:
plot(as.zoo(MKS.L$MKS.L.Close))
text(as.Date('2012-01-01'),y=500,"wobble", cex=4)
#install.packages("xtsExtra", repos="http://r-forge.r-project.org")
xtsExtra::plot.xts(MKS.L$MKS.L.Close)
text(as.Date('2012-01-01'),y=500,"wobble", cex=4)
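For a version of the plot.zoo route that runs without downloading MKS.L, here is a small sketch on a synthetic series. The random-walk prices, the label position and the text size are made up for illustration.

```r
library(xts)  # loading xts also attaches zoo

set.seed(1)
dates  <- seq(as.Date("2012-01-01"), by = "day", length.out = 100)
prices <- xts(500 + cumsum(rnorm(100)), order.by = dates)  # synthetic series

# plot.zoo draws in ordinary base-graphics coordinates (Date on the x axis),
# so text() and points() can be positioned with a Date and a y value
plot(as.zoo(prices), xlab = "Date", ylab = "Close")
text(as.Date("2012-02-15"), y = max(prices), labels = "wobble", cex = 1.5)
points(as.Date("2012-02-15"), min(prices), pch = 3, cex = 2, col = "red")
```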
