How to show plotted data with big value differences?


I have a data frame, car_crashes, that I am plotting with ggplot. It contains three different series, as seen below,
but because Average of Cars is so large, the other values barely show at all, since they are in the range of 100. If I remove the Average of Cars series, the plot looks like this.
Is there a way to show all the data in one plot so that I can at least see the num of crashes line?
The code I used is below:
carcrashes_figure <- ggplot()+geom_area(aes(YEAR_WW,AverageofCars,group = 1,colour = 'Average of cars'),car_crashes,fill = "dodgerblue1",alpha = 0.4)+
geom_line(aes(YEAR_WW,averageofcars,group = 1,linetype ='num of crashes'),car_crashes,fill = "dodgerblue3",colour = "dodgerblue3",size = 1.6) +
geom_line(aes(car_crashes$YEAR_WW,constantline,group = 1, size = 'constant line' ),car_crashes1,fill = "green4",colour = "green4")+
theme_bw() +
theme(axis.text.x = element_text(angle=70, vjust=0.6, face = 'bold'))+
theme(axis.text.y = element_text(angle=0, vjust=0.2, face = 'bold'))+
scale_colour_manual('', values = "dodgerblue1")+
scale_size_manual('',values = 1.4)+
scale_linetype_manual('',values = 1)+
scale_y_continuous()+
theme(legend.text = element_text(size = 8, colour = "black", angle = 0))
carcrashes_figure

I agree with @Jim Quirk's idea of using a separate y-axis. As far as I know, ggplot2 isn't very good at this, so I used base graphics.
# making example ts_data
set.seed(1); data <- matrix(c(rnorm(21, 1000, 100), rnorm(21, 53, 10), rep(53, 21)), ncol=3)
ts_data <- ts(data, start = 1980, frequency = 1)
par(mar=c(4, 4.2, 1.5, 4.2)) # enlarge a right margin
# plot(ts_data[,1]) # check y-range
plot(ts_data[,2:3], plot.type = "single", ylab="num of crashes & constant line",
col=c(2,3), ylim=c(35,100), lwd=2) # draw "num of crashes" and "constant line"
par(usr = c(par("usr")[1:2], 490, 1310)) # set the second y coordinates
axis(4) # write it on the right side
polygon(x = c(1980:2000, rev(1980:2000)), y = c(ts_data[,1], rep(0,21)),
col="#0000FF20", border = "blue") # paint "Average of cars"
mtext(side=4, "Average of cars", line=2.5)
legend("topright",paste(c("num of crashes","constant line","Average of cars")),
pt.cex=c(0,0,3), lty=c(1,1,0), pch=15, cex=0.9, col=c(2, 3, "#0000FF20"), bty="n",
inset=c(0.02,-0.02), y.intersp=1.5)
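For comparison, ggplot2 can draw a second y-axis as long as it is a fixed transformation of the first. The following is a minimal sketch on simulated data, not the asker's real car_crashes data: the data frame df, its column names, and the scale factor k are all assumptions made for illustration. The large series is divided by k before plotting, and sec_axis() multiplies the tick values back by k on the right-hand axis.

```r
library(ggplot2)

# Simulated stand-in for car_crashes (hypothetical column names).
set.seed(1)
df <- data.frame(
  year     = 1980:2000,
  avg_cars = rnorm(21, 1000, 100),
  crashes  = rnorm(21, 53, 10)
)

# Hypothetical scale factor that brings avg_cars into the range of crashes.
k <- 10

p <- ggplot(df, aes(x = year)) +
  # Large series, rescaled so it shares the left axis with the small one.
  geom_area(aes(y = avg_cars / k), fill = "dodgerblue1", alpha = 0.4) +
  geom_line(aes(y = crashes), colour = "dodgerblue3") +
  geom_hline(yintercept = 53, colour = "green4") +
  scale_y_continuous(
    name = "num of crashes",
    # Right axis undoes the rescaling, so it reads in original units.
    sec.axis = sec_axis(~ . * k, name = "Average of cars")
  ) +
  theme_bw()
p
```

The value of k has to be chosen by hand; a ratio of the two series' maxima is one plausible starting point.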

Related

How do I add the degree symbol and letters to each value along the x- and y-axis of a graph

So I am trying to add the degree symbol and some letters to the axis values of my graph to make them look like longitude and latitudes.
My current graph:
Want to make the axis look like this graph (with e.g., 90°N etc.)
This is the code I am using to generate my current graph:
image.plot(lon_baseline_temp, lat_baseline_temp, dat_baseline_temp,
col=rev(brewer.pal(11,"RdBu")), xlab="",
ylab="",
main="Global surface temperature (Baseline)", sub="Year 1970 ~ 1999", font.sub=2,
legend.lab="K", legend.line=2.5, legend.mar=7,
xaxp=c(-180, 180, 6), yaxp=c(-90, 90, 6), las=1)
title(ylab = expression(paste("Latitude "(degree))), line = 2, cex.lab = 1)
title(xlab = expression(paste("Longitude "(degree))), line = 2.5, cex.lab = 1)
minor.tick(nx = 5, ny = 5, tick.ratio = 0.5)
map(database = 'world', add = T, lwd=1.5)
I would really appreciate any help on this soon, thank you very much!

I can't use your data, but I think you just need to specify your labels as follows:
library(ggplot2)
# some example plot
g <- ggplot() + geom_point(aes(50,50)) + ylim(0,100) + xlim(0,100) + labs(y = "Latitude",x = "Longitude")
# plot it
g
# add a new y scale with specific labels (this replaces the scale set by ylim())
# and store the result, so that printing g shows the new labels
g <- g + scale_y_continuous(breaks = c(0,25,50,75,100),
limits = c(0,100),
labels = c(expression(0~degree),
expression(25~degree),
expression(50~degree),
expression(75~degree),
expression(100~degree)
)
)
# plot again, now with degree labels on the y axis
g
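Since the question uses base graphics rather than ggplot2, the same idea can be sketched with axis(). This is a rough illustration under assumptions: lat_label() and lon_label() are hypothetical helpers, not part of any package, and the empty plot() frame only stands in for the image.plot() call, which was not tested here. The default axes are suppressed and then redrawn with custom labels.

```r
# Hypothetical helpers: format numeric degrees as e.g. "90°N" or "120°W".
lat_label <- function(x) {
  paste0(abs(x), "\u00B0", ifelse(x > 0, "N", ifelse(x < 0, "S", "")))
}
lon_label <- function(x) {
  paste0(abs(x), "\u00B0", ifelse(x > 0, "E", ifelse(x < 0, "W", "")))
}

lon_at <- seq(-180, 180, by = 60)
lat_at <- seq(-90, 90, by = 30)

# Empty frame standing in for the map, with the default axes suppressed.
plot(NA, xlim = c(-180, 180), ylim = c(-90, 90),
     xlab = "", ylab = "", xaxt = "n", yaxt = "n", xaxs = "i", yaxs = "i")

# Redraw both axes with hemisphere-aware labels.
axis(1, at = lon_at, labels = lon_label(lon_at))
axis(2, at = lat_at, labels = lat_label(lat_at), las = 1)
```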

Create bar plot with logarithmic scale in R

I am trying to create a bar plot with a logarithmic scale as my data varies from 3.92 to 65700.
This is the code I have used so far:
beach <- c(PlasticsBlue=3.92, PlasticsGrey=65700, FoamsOrange=17.9, FoamsWhite=51300, RopesGreen=9.71, RopesGreen=3140)
beach
par(mar = c(10, 5, 10, 5))
barplot(beach, names.arg=c("Plastics/Blue", "Plastics/Grey", "Foams/Orange", "Foams/White", "Ropes/Green", "Ropes/Green"), col=c("red2", "slateblue4", "red2", "slateblue4", "red2", "slateblue4", "red2"), legend.text = c("Lowest", "Highest"), args.legend=list(cex=0.75,x="topright"), ylim=c(1,100000), log = ("y"), las=2, ylab = expression("mg g"^-1))
This gave me this graph.
It is exactly what I'm looking for, except that with the log scale the next tick mark would be 1000000, which is far too large, so the y axis is currently only numbered up to 10000 and does not cover my largest values. Is there any way to have the y axis numbered up to 100000 while still using the log scale? This seemed to work when I first made the graph in Excel (see graph2).
Thanks in advance, Alistair

You can always get what you want if you are willing to fiddle with the details in R. In this case it is easier to bypass R's helpful log axis and construct your own:
options(scipen=8)
out <- barplot(log10(beach), names.arg=c("Plastics/Blue", "Plastics/Grey", "Foams/Orange",
"Foams/White", "Ropes/Green", "Ropes/Green"), col=c("red2", "slateblue4", "red2",
"slateblue4", "red2", "slateblue4", "red2"), legend.text = c("Lowest", "Highest"),
args.legend=list(cex=0.75,x="topright"), ylim=c(0, 5), las=2, yaxt="n",
ylab = expression("mg g"^-1))
yval <- c(1, 10, 100, 1000, 10000, 100000)
ypos <- log10(yval)
axis(2, ypos, yval, las=1)
text(out, log10(beach), beach, pos=3, xpd=NA)
The first line just keeps R from switching to scientific notation for the 100000 value. The barplot call differs in that we convert the raw data with log10(), set ylim based on the log10 values, and suppress the y-axis. Then we create a vector of the values we want to label on the y axis and compute their log10 positions. Finally, we draw the axis. The last line uses the value out returned by barplot, which holds the positions of the bars on the x axis, so we can print the values on top of the bars.
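The tick values above were typed in by hand. As a small sketch, they could also be derived from the data by rounding the log10 range outward to whole decades. The helper name log_ticks() is made up for this example.

```r
# Hypothetical helper: powers of ten that cover the range of x.
log_ticks <- function(x) {
  lo <- floor(log10(min(x)))    # decade at or below the smallest value
  hi <- ceiling(log10(max(x)))  # decade at or above the largest value
  10^(lo:hi)
}

beach <- c(3.92, 65700, 17.9, 51300, 9.71, 3140)
yval <- log_ticks(beach)  # tick labels
ypos <- log10(yval)       # tick positions on the log10 scale
yval
```

The resulting yval and ypos can be passed to axis(2, ypos, yval, las=1) exactly as in the answer, and range(ypos) gives a matching ylim.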

Using ggplot2 and company could look like:
library(dplyr)
library(ggplot2)
library(tibble)
library(scales)
beach <- c(PlasticsBlue = 3.92, PlasticsGrey = 65700, FoamsOrange = 17.9, FoamsWhite = 51300, RopesGreen = 9.71, RopesGreen = 3140) %>%
enframe() %>%
mutate(colorID = rep(c('Lowest', 'Highest'), 3))
plot <- beach %>%
ggplot(aes(x = seq_along(value), y = value, label = value, fill = colorID)) +
geom_col() +
scale_y_continuous(trans = "log10", labels = label_number(), breaks = c(1, 10, 100, 1000, 10000, 100000)) +
scale_x_continuous(labels = beach$name, breaks = seq_len(nrow(beach))) +
geom_text(vjust = -1) +
theme_minimal() +
theme(panel.background = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.line.y = element_line(colour = 'black'),
legend.position = 'right',
legend.title = element_blank()) +
labs(y = expression("mg g"^-1),
x = 'Category/Sample colour') +
scale_fill_manual(values = rep(c('slateblue4', 'red2'), 3))
This gives us:

Is there a way to use R to break chart axis and break linear regression line?

I'm trying to figure out how to modify a scatter-plot that contains two groups of data along a continuum separated by a large gap. The graph needs a break on the x-axis as well as on the regression line.
This R code using the ggplot2 library accurately presents the data, but is unsightly due to the vast amount of empty space on the graph. Pearson's correlation is -0.1380438.
library(ggplot2)
p <- ggplot(, aes(x = dis, y = result[, 1])) + geom_point(shape = 1) +
xlab("X-axis") +
ylab("Y-axis") + geom_smooth(color = "red", method = "lm", se = F) + theme_classic()
p + theme(plot.title = element_text(hjust = 0.5, size = 14))
This R code uses gap.plot to produce the breaks needed, but the regression line doesn't contain a break and doesn't reflect the slope properly. As you can see, the slope of the regression line isn't as sharp as the graph above and there needs to be a visible distinction in the slope of the line between those disparate groups.
library(plotrix)
gap.plot(
x = dis,
y = result[, 1],
gap = c(700, 4700),
gap.axis = "x",
xlab = "X-Axis",
ylab = "Y-Axis",
xtics = seq(0, 5575, by = 200)
)
abline(v = seq(700, 733) , col = "white")
abline(lm(result[, 1] ~ dis), col = "red", lwd = 2)
axis.break(1, 716, style = "slash")
Using MS Paint, I created an approximation of what the graph should look like. Notice the break marks at the top as well as the discontinuity in the regression line between the two groups.

One solution is to plot the regression line in two pieces, using ablineclip to limit what's plotted each time. (This is similar to @tung's suggestion, although it's clear that you want the appearance of a single graph rather than the appearance of facets.) Here's how that would work:
library(plotrix)
# Simulate some data that looks roughly like the original graph.
dis = c(rnorm(100, 300, 50), rnorm(100, 5000, 100))
result = c(rnorm(100, 0.6, 0.1), rnorm(100, 0.5, 0.1))
# Store the location of the gap so we can refer to it later.
x.axis.gap = c(700, 4700)
# gap.plot() works internally by shifting the points beyond the gap to the left
# by the width of the gap, and then adjusting the axis labels accordingly. We'll
# re-compute the second half of the regression line in the same way; these are
# the new values for the x-axis.
dis.alt = dis - diff(x.axis.gap)
# Plot (same as before).
gap.plot(
x = dis,
y = result,
gap = x.axis.gap,
gap.axis = "x",
xlab = "X-Axis",
ylab = "Y-Axis",
xtics = seq(0, 5575, by = 200)
)
abline(v = seq(700, 733), col = "white")
axis.break(1, 716, style = "slash")
# Add regression line in two pieces: from 0 to the start of the gap, and from
# the end of the gap to infinity.
ablineclip(lm(result ~ dis), col = "red", lwd = 2, x2 = x.axis.gap[1])
ablineclip(lm(result ~ dis.alt), col = "red", lwd = 2, x1 = x.axis.gap[1] + 33)
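Refitting on shifted x values works because subtracting a constant from the predictor leaves the slope unchanged and only moves the intercept. The sketch below checks this on simulated data like that in the answer. The value of shift is an illustrative constant only, and the fit names are made up for the example.

```r
set.seed(42)
dis    <- c(rnorm(100, 300, 50), rnorm(100, 5000, 100))
result <- c(rnorm(100, 0.6, 0.1), rnorm(100, 0.5, 0.1))

shift <- 4000  # illustrative constant, not taken from the original post

fit       <- lm(result ~ dis)             # fit on the original x values
fit_shift <- lm(result ~ I(dis - shift))  # fit on the shifted x values

slope <- coef(fit)[["dis"]]

# The slope is identical; the intercept moves by slope * shift.
c(original = slope, shifted = coef(fit_shift)[[2]])
c(expected = coef(fit)[[1]] + slope * shift, shifted = coef(fit_shift)[[1]])
```

So a line drawn from the shifted fit passes through the shifted points at the same heights as the original line passes through the original points.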

How to assign dates on x-axis to the barplot in R?

I have a data set with multiple dates that I would like to plot using the barplot function in R. The data covers two different periods, so I want to show the respective dates on the x-axis for ease of comparison. Here is my code so far. A_Date holds the dates for dataset A, while B_Date holds the dates for dataset B.
A= runif(24, min = 25, max = 45)
B=runif(24, min = 35, max = 100)
DF=rbind(A,B)
A_Date= as.data.frame(seq(as.Date("1987-01-01"), to= as.Date("1988-12-31"),by="months"))
names(A_Date)= "Dates"
A_Date$year=as.numeric(format(A_Date$Dates, "%Y"))
A_Date$month=as.numeric(format(A_Date$Dates, "%m"))
A_Date=A_Date[,-1]
A_Date = as.character(paste(month.abb[A_Date$month], A_Date$year, sep = "_" ))
B_Date= as.data.frame(seq(as.Date("2010-01-01"), to= as.Date("2011-12-31"),by="months"))
names(B_Date)= "Dates"
B_Date$year=as.numeric(format(B_Date$Dates, "%Y"))
B_Date$month=as.numeric(format(B_Date$Dates, "%m"))
B_Date=B_Date[,-1]
B_Date = as.character(paste(month.abb[B_Date$month], B_Date$year, sep = "_" ))
barplot(DF, beside = T, col = c("red","darkblue"), legend.text =c("1987-88", "2010-11"), args.legend =list(x="topleft", cex = 1.2, bty="n", x.intersp=0.2),
ylab = "Precipitation (mm)", cex.axis = 1.2, cex.lab=1.5)
Also, I would like to have an x-axis line (just like the line on the y-axis).
Thank you

barplot() also returns a matrix of bar coordinates, which we can capture by assignment, here with b <-. Now we can draw an axis with ticks in the right places. To keep the plot from becoming too crowded, we can merge the redundant month information and put the different years on separate mtext lines. I've used the built-in month.abb here.
b <- barplot(DF, beside=T, col=c("red","darkblue"),
legend.text=c("1987-88", "2010-11"),
args.legend=list(x="topleft", cex=1.2, bty="n", x.intersp=0.2),
ylab="Precipitation (mm)", cex.axis=1.2, cex.lab=1.5, ylim=c(0, 130))
axis(1, at=b[1, ], labels=FALSE)
axis(1, at=b[2, ], labels=FALSE)
mtext(rep(c(1987, 1988), each=12), 1, 1, at=b[1, ], cex=.8, las=2)
mtext(rep(c(2010, 2011), each=12), 1, 1, at=b[2, ], cex=.8, las=2)
mtext(rep(month.abb, 2), 1, 3, at=colMeans(b), las=2)
 
If you'd also like to close the gap between y and x axis, you could add this line:
abline(h=0, cex=1.3)
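To make the returned coordinate matrix concrete, here is a tiny sketch with made-up numbers (toy is a hypothetical 2 x 3 matrix, not the precipitation data). With beside = TRUE, barplot() returns one row per series and one column per group, so b[1, ] and b[2, ] are the bar midpoints of each series and colMeans(b) gives the centre of each group.

```r
# Toy data: 2 series (rows) and 3 groups (columns); values are made up.
toy <- rbind(A = c(30, 40, 35), B = c(60, 80, 70))

# Capture the bar midpoints returned by barplot().
b <- barplot(toy, beside = TRUE, col = c("red", "darkblue"))

dim(b)       # rows = series, columns = groups
b[1, ]       # x positions of the bars of series A
colMeans(b)  # x position of the centre of each group
```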

I feel like it's going to be hard to fit all 4 dates into one spot on the axis. Here is the best I could come up with. I also rearranged your data so it fits all in one dataframe and used ggplot2.
library(tidyverse)
new_df <- tibble(precip = runif(48, rep(c(25, 35), each = 24), rep(c(45, 100), each = 24)),
dates = c(seq(as.Date("1987-01-01"), as.Date("1988-12-31"), by = "months"),
seq(as.Date("2010-01-01"), as.Date("2011-12-31"), by = "months")),
group = ifelse(lubridate::year(dates) %in% c(1987,1988), "1987-88", "2010-11"),
month = lubridate::month(dates))
ggplot(new_df, aes(x = month, y = precip, fill = group)) +
geom_bar(stat = 'identity', position = position_dodge()) +
scale_x_continuous(labels = paste0(1:12, "/1987 - 1988", "\n", 1:12, "/2010 - 2011"),
breaks = 1:12) +
scale_fill_manual(values = c("red", "navy")) +
theme_classic() +
theme(legend.title = element_blank(),
axis.text = element_text(size = 10))

How to plot a formula with a given range?

I am looking to plot the following:
L<-((2*pi*h*c^2)/l^5)*((1/(exp((h*c)/(l*k*T)-1))))
all variables except l are constant:
T<-6000
h<-6.626070040*10^-34
c<-2.99792458*10^8
k<-1.38064852*10^-23
l has a range of 20*10^-9 to 2000*10^-9.
I have tried l<-seq(20*10^-9,2000*10^-9,by=1*10^-9), however this does not give me the results I expect.
Is there a simple solution for this in R, or do I have to try in another language?
Thank you.

Looking at the spectral radiance equation on its Wikipedia page, it seems that your formula is a bit off. Your formula multiplies by an additional pi (not sure if intended), and the -1 is inside the exp instead of outside:
L <- ((2*pi*h*c^2)/l^5)*((1/(exp((h*c)/(l*k*T)-1))))
Below is the corrected formula. Also notice I have converted it into a function with parameter l since this is a variable:
T <- 6000 # Absolute temperature
h <- 6.626070040*10^-34 # Planck's constant
c <- 2.99792458*10^8 # Speed of light in the medium
k <- 1.38064852*10^-23 # Boltzmann constant
L <- function(l){((2*h*c^2)/l^5)*((1/(exp((h*c)/(l*k*T))-1)))}
# Plotting
plot(L, xlim = c(20*10^-9,2000*10^-9),
xlab = "Wavelength (nm)",
ylab = bquote("Spectral Radiance" ~(kW*sr^-1*m^-2*nm^-1)),
main = "Planck's Law",
xaxt = "n", yaxt = "n")
xtick <- seq(20*10^-9, 2000*10^-9,by=220*10^-9)
ytick <- seq(0, 4*10^13,by=5*10^12)
axis(side=1, at=xtick, labels = (1*10^9)*seq(20*10^-9,2000*10^-9,by=220*10^-9))
axis(side=2, at=ytick, labels = (1*10^-12)*seq(0, 4*10^13,by=5*10^12))
The plot above is not bad, but I think we can do better with ggplot2:
h <- 6.626070040*10^-34 # Planck's constant
c <- 2.99792458*10^8 # Speed of light in the medium
k <- 1.38064852*10^-23 # Boltzmann constant
L2 <- function(l, T){((2*h*c^2)/l^5)*((1/(exp((h*c)/(l*k*T))-1)))} # Planck's Law
classical_L <- function(l, T){(2*c*k*T)/l^4} # Rayleigh-Jeans Law
library(ggplot2)
ggplot(data.frame(l = c(20*10^-9,2000*10^-9)), aes(l)) +
geom_rect(aes(xmin=390*10^-9, xmax=700*10^-9, ymin=0, ymax=Inf),
alpha = 0.3, fill = "lightblue") +
stat_function(fun=L2, color = "red", size = 1, args = list(T = 3000)) +
stat_function(fun=L2, color = "green", size = 1, args = list(T = 4000)) +
stat_function(fun=L2, color = "blue", size = 1, args = list(T = 5000)) +
stat_function(fun=L2, color = "purple", size = 1, args = list(T = 6000)) +
stat_function(fun=classical_L, color = "black", size = 1, args = list(T = 5000)) +
theme_bw() +
scale_x_continuous(breaks = seq(20*10^-9, 2000*10^-9,by=220*10^-9),
labels = (1*10^9)*seq(20*10^-9,2000*10^-9,by=220*10^-9),
sec.axis = dup_axis(labels = (1*10^6)*seq(20*10^-9,2000*10^-9,by=220*10^-9),
name = "Wavelength (\U003BCm)")) +
scale_y_continuous(breaks = seq(0, 4*10^13,by=5*10^12),
labels = (1*10^-12)*seq(0, 4*10^13,by=5*10^12),
limits = c(0, 3.5*10^13)) +
labs(title = "Black Body Radiation described by Planck's Law",
x = "Wavelength (nm)",
y = expression("Spectral Radiance" ~(kWsr^-1*m^-2*nm^-1)),
caption = expression(''^'\U02020' ~'Spectral Radiance described by Rayleigh-Jeans Law, which demonstrates the ultraviolet catastrophe.')) +
annotate("text",
x = c(640*10^-9, 640*10^-9, 640*10^-9, 640*10^-9,
150*10^-9, (((700-390)/2)+390)*10^-9, 1340*10^-9),
y = c(2*10^12, 5*10^12, 14*10^12, 31*10^12,
35*10^12, 35*10^12, 35*10^12),
label = c("3000 K", "4000 K", "5000 K", "6000 K",
"UV", "VISIBLE", "INFRARED"),
color = c(rep("black", 4), "purple", "blue", "red"),
alpha = c(rep(1, 4), rep(0.6, 3)),
size = 4.5) +
annotate("text", x = 1350*10^-9, y = 23*10^12,
label = deparse(bquote("Classical theory (5000 K)"^"\U02020")),
color = "black", parse = TRUE)
Notes:
I created L2 by also making absolute temperature T a variable
For each T, I plot the function L2 using different colors for representation. I've also added a classical_L function to demonstrate classical theory of spectral radiance
geom_rect creates the light blue shaded area for "VISIBLE" light wavelength range
scale_x_continuous sets the breaks of the x axis, while labels sets the axis tick labels. Notice I have multiplied the seq by (1*10^9) to convert the units to nanometer (nm). A second x-axis is added to display the micrometer scale
Analogously, scale_y_continuous sets the breaks and tick labels for y axis. Here I multiplied by (1*10^-12) or (1*10^(-3-9)) to convert from watts (W) to kilowatts (kW), and from inverse meter (m^-1) to inverse nanometer (nm^-1)
bquote displays superscripts correctly in the y axis label
annotate sets the coordinates and text for curve labels. I've also added the labels for "UV", "VISIBLE" and "INFRARED" light wavelengths
ggplot2
Plot from wikipedia:
Image source: https://upload.wikimedia.org/wikipedia/commons/thumb/1/19/Black_body.svg/600px-Black_body.svg.png
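Returning to the seq() attempt in the question: the corrected formula is already vectorised, so it can also be evaluated directly on a wavelength grid. The sketch below does that and uses the location of the maximum as a rough sanity check against Wien's displacement law, which puts the peak near 483 nm at 6000 K. The names planck, c0, temp, and peak_nm are choices made for this example (c0 and temp avoid masking R's c() and T).

```r
h  <- 6.626070040e-34  # Planck's constant (J s)
c0 <- 2.99792458e8     # speed of light (m/s); named c0 to avoid masking c()
k  <- 1.38064852e-23   # Boltzmann constant (J/K)

# Planck's law in wavelength form; vectorised over l (metres).
planck <- function(l, temp) {
  (2 * h * c0^2 / l^5) / (exp(h * c0 / (l * k * temp)) - 1)
}

l <- seq(20e-9, 2000e-9, by = 1e-9)  # 20 nm to 2000 nm in 1 nm steps
L <- planck(l, temp = 6000)

# Wavelength (in nm) at which the radiance is largest on this grid.
peak_nm <- l[which.max(L)] * 1e9
peak_nm

# Quick look at the curve, with the x axis in nm.
plot(l * 1e9, L, type = "l", xlab = "Wavelength (nm)",
     ylab = "Spectral radiance")
```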