Bar plot with a few extreme values - r

Consider the following vector:
vec <- c(-0.137042293280008 ,-0.0085530023889108 ,7.696986350237e-05 ,9.85275557252565e-05 ,0.000246261331270769 ,-0.0013658222244989 ,0.00117046787783182 ,-0.000423648394606887 ,-0.000112607126438433 ,0.00212185051472275 ,-0.000110104526782098)
names(vec) <- paste("var", 1:length(vec), sep = " ")
I would like to plot vec using a bar plot in R. However, as you can see, there are one or two values that are extreme compared to the rest of the vector. When the bar plot is drawn, the small values barely show on the graph.
par(xaxs='i',yaxs='i', mai = c(0.5,2,0.5,1.5))
bp2 <- barplot(vec, horiz = TRUE, col = "lightblue4", border = "lightblue4", yaxt = 'n', cex.axis = 0.7)
axis(2, at = bp2, labels = names(vec), tick = FALSE, las = 2, cex.axis = 0.7)
Is there a better way to display the chart? For example, is there a way to split (break) the x-axis? The graph below is an (unrelated) example, but it shows how the y-axis is split in that case so that all values show on the graph.
P.S: Plotting with a log-scale is not an option in my case, as some of the vector values are negative.
Thank you!

You need gap.barplot from the plotrix package. Take a look at this:
library(plotrix)
gap.barplot(vec,gap=c(-0.12,-0.04),xlab="Index",ytics=c(-0.04,-0.02,0),
ylab="",main="Barplot with gap", horiz=TRUE)
Modify the gap and ytics arguments to get the desired look for your plot.
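If you would rather not pick the gap by eye, one option is to derive it from the data. The following is only a sketch of that idea, not part of plotrix: it places the gap inside the widest empty stretch between consecutive sorted values, and the 10% padding is an arbitrary assumption.

```r
# vec as in the question (values rounded here for brevity)
vec <- c(-0.13704, -0.00855, 0.00008, 0.00010, 0.00025, -0.00137,
         0.00117, -0.00042, -0.00011, 0.00212, -0.00011)

s   <- sort(vec)
i   <- which.max(diff(s))          # position of the widest empty stretch
pad <- 0.1 * (s[i + 1] - s[i])     # margin so that no bar ends inside the gap
gap <- c(s[i] + pad, s[i + 1] - pad)
gap
# The result can then be passed on as gap = gap in the gap.barplot call above,
# with ytics chosen to lie outside the gap.
```

For this vector the widest stretch lies between the extreme value and the second-smallest one, so the computed gap ends up in the same region as the hand-picked c(-0.12, -0.04).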

Related

R barplots: specify intervals of date-based x-axis

I've been producing different sets of charts, all in R base. I have a problem though with barplots. I've formatted the x-axis to show the dates by year, however, many years show up several times. I would like each year to only show up once.
Here's my example code:
library(quantmod)
start <- as.Date("01/01/2010", "%d/%m/%Y")
#Download FRED data
tickers <- c("WTISPLC", "DCOILBRENTEU")
fred <- lapply(tickers, function(sym) {na.omit(getSymbols(sym, src="FRED", auto.assign=FALSE, return.class = "zoo"))})
df <- do.call(merge, fred)
#Subset for start date
df <- subset(df, index(df)>=start)
#Create bar plot
par(mar = c(5,5,5,5))
barplot(df[,2], names.arg=format(index(df), "%Y"), ann=FALSE, bty="n", tck=-0, col=1:1, border=NA, space=0); title(main="Example chart", ylab="y-axis")
This example should be reproducible and show clearly what I mean. Now, I've been researching how to add a separate x-axis and how to define that axis. So, I've tried to add the following code:
#Plot bars but without x-axis
barplot(df[,2], names.arg=format(index(df), "%Y"), ann=FALSE, bty="n", tck=-0, xaxt="n", col=1:1, border=NA, space=0); title(main="Example chart", ylab="y-axis")
# Set x-axis parameters
x_min <- min(index(df))
x_max <- max(index(df))
xf="%Y"
#Add x-axis
axis.Date(1, at=seq(as.Date(x_min), x_max, "years"), format=xf, las=1, tck=-0)
This does not give me an error message, but it also does absolutely nothing in terms of drawing an x-axis.
Please do not provide a solution for ggplot. Even though I like ggplot, these barplots are part of a bigger project for me, all using R base and I would not like to introduce ggplot into this project now.
Thanks!
If you are not limited to barplot, you may use the following very simple solution, which uses plot.zoo behind the scenes:
# only use what you want, and avoid multiple plots
df2 <- df[ , 2]
# use plot.zoo's functionality
plot(df2, main = "Example Chart", ylab = "y-axis", xlab = "")
This yields the following plot:
I know it is not a barplot, but I don't see what a barplot would add here. Please let me know whether this is what you want.
Edit 1
If you do want to use barplot you may use the following code:
### get index of ts in year format
index_y <- format(index(df), "%Y")
### logical vector with true if it is the start of a new year
index_u <- !duplicated(index_y)
### index of start of new year for tick marks
at_tick <- which(index_u)
### label of start of new year
labels <- index_y[index_u]
### draw barplot without X-axis, and store in bp
### bp (bar midpoints) is used to set the ticks right with the axis function
bp <- barplot(df[,2], xaxt = "n", ylab= "y-axis")
axis(side = 1, at = bp[at_tick] , labels = labels)
yielding the following plot:
Please let me know, whether this is what you want.
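The same idea can be tried without downloading anything. The sketch below uses a made-up monthly series in place of the FRED data (the dates and values are assumptions for illustration only) and labels the first bar of each year:

```r
# Synthetic monthly series standing in for the FRED data (assumption)
dates  <- seq(as.Date("2010-01-01"), as.Date("2012-12-01"), by = "month")
values <- seq_along(dates)

years    <- format(dates, "%Y")
first_of <- which(!duplicated(years))   # index of the first bar of each year

# bp holds the bar midpoints, which are the x positions for the ticks
bp <- barplot(values, xaxt = "n", ylab = "y-axis")
axis(side = 1, at = bp[first_of], labels = years[first_of])
```

Here first_of is 1, 13, 25, so each year is labelled exactly once, at its first bar.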
Edit 2
We need to take two bits of information into account when explaining why the ticks and labels group together at the left-hand side.
(1) In barplot, space defines the amount of space before each bar (as a fraction of the average bar width). For a plain vector of heights, as in our case, it defaults to 0.2 (see ?barplot for details). In the illustration below, we use spaces of 0.0, 0.5, and 2.0.
(2) barplot returns a numeric vector with the midpoints of the bars drawn (again, see the help pages for more detailed info). We can use these midpoints to add information to the graph, as we do in the following excerpt: after storing the result of barplot in bp, we use bp to set the ticks: axis(... at = bp[at_tick] ... ).
When we add space, the location of the bar midpoints changes. So, when we want to use the bar midpoints after adding space, we need to be sure we have the right information. Simply stated, use the vector returned by the barplot call in which you added space. If you don't, the graph will be messed up. In the example below, if you continue to use the bar midpoints of the call with space = 0 while increasing space, the ticks and labels will group at the left-hand side.
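The shift in midpoints can also be checked numerically, without drawing anything. This is a small sketch on five dummy bars of width 1 (the heights are an assumption for illustration); plot = FALSE makes barplot return the midpoints only:

```r
h <- rep(1, 5)

# plot = FALSE returns the bar midpoints without opening a graphics device
mid_zero <- as.numeric(barplot(h, space = 0,   plot = FALSE))
mid_half <- as.numeric(barplot(h, space = 0.5, plot = FALSE))

mid_zero   # 0.5 1.5 2.5 3.5 4.5
mid_half   # 1.0 2.5 4.0 5.5 7.0
```

With bars of width 1, the i-th midpoint sits at i * (1 + space) - 0.5, so midpoints computed with space = 0 fall further and further to the left of the actual bars as space grows.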
Below, I illustrate this with your data limited to 3 months in 2017.
In the top layer, 3 barplots are drawn with space equal to 0.0, 0.5 and 2.0. The information used to calculate the location of ticks and labels is recalculated and saved at every plot.
In the bottom layer, the same 3 barplots are drawn, but the information used to draw the ticks and labels is only created with the first plot (space=0.0)
# Subset for NEW start for illustration of space and bp
start2 <- as.Date("01/10/2017", "%d/%m/%Y")
df2 <- subset(df, index(df)>=start2)
### get index of ts in month format, define ticks and labels
index_y2 <- format(index(df2), "%m")
at_tick2 <- which(!duplicated(index_y2))
labels2 <- index_y2[!duplicated(index_y2)]
par(mfrow = c(2,3))
bp2 <- barplot(df2[,2], xaxt = "n", ylab= "y-axis", space= 0.0, main ="Space = 0.0")
axis(side = 1, at = bp2[at_tick2] , labels = labels2)
bp2 <- barplot(df2[,2], xaxt = "n", ylab= "y-axis", space= 0.5, main ="Space = 0.5")
axis(side = 1, at = bp2[at_tick2] , labels = labels2)
bp2 <- barplot(df2[,2], xaxt = "n", ylab= "y-axis", space= 2.0, main ="Space = 2.0")
axis(side = 1, at = bp2[at_tick2] , labels = labels2)
### the lower layer
bp2 <- barplot(df2[,2], xaxt = "n", ylab= "y-axis", space= 0.0, main ="Space = 0.0")
axis(side = 1, at = bp2[at_tick2] , labels = labels2)
barplot(df2[,2], xaxt = "n", ylab= "y-axis", space= 0.5, main ="Space = 0.5")
axis(side = 1, at = bp2[at_tick2] , labels = labels2)
barplot(df2[,2], xaxt = "n", ylab= "y-axis", space= 2.0, main ="Space = 2.0")
axis(side = 1, at = bp2[at_tick2] , labels = labels2)
par(mfrow = c(1,1))
Have a look here:
Top layer: bp recalculated every time
Bottom layer: bp space=0 reused
Cutting and pasting the commands in your console may illustrate the effects better than the pic above.
I hope this helps.
You could use the axis function, I used match to obtain the indices of the dates on the axis:
space=1
#Plot bars but without x-axis
barplot(df[,2], names.arg=format(index(df), "%Y"), ann=FALSE, bty="n", tck=-0, xaxt="n",
col=1:1, border=NA, space=space); title(main="Example chart", ylab="y-axis")
# Set x-axis parameters
x_min <- min(index(df))
x_max <- max(index(df))
#Add x-axis
axis(1, at=match(seq(as.Date(x_min), x_max, "years"),index(df))*(1+space),
labels = format(seq(as.Date(x_min), x_max, "years"),"%Y"),lwd=0)
Hope this helps!

Plotting in R using plot function

I am trying to plot a few graphs using loops. Here are the details.
First, I have a function which calculates the y-variable (called effect, for the vertical axis):
effect<- function (x, y){
exp(-0.35*log(x)
+0.17*log(y)
-0.36*sqrt(log(x)*log(y)/100))
}
Now I run the following code and use par(new = TRUE) to draw the lines in the same graph. I use axes = FALSE and xlab = "" to get a plot without axes or labels, so that the labels are not redrawn on every iteration of the loop, which looks ugly.
for (levels in seq(exp(8), exp(10), length.out = 5)){
x = seq(exp(1),exp(10), length.out = 20)
prc= effect(levels,x)
plot(x, prc,xlim = c(0,max(x)*1.05), ylim=c(0.0,0.3),
type="o", xlab = "",ylab = "", pch = 16,
col = "dark blue", lwd = 2, cex = 1, axes = F)
label = as.integer(levels) #x variable
text(max(x)*1.03,max(prc), label )
par(new=TRUE)
}
Finally, I repeat the plot command, this time using the xlab and ylab options:
plot(x, prc, xlab = "X-label", ylab = "effect",
xlim = c(0,max(x)*1.05), ylim = c(0,0.3),
type="l", col ='blue')
I have several other plots along similar lines, using complex equations. I have two questions:
Is there a better option to get the same plot with smoother lines?
Is there an easier way, with fewer lines of code, to achieve the same result and place the text (levels) for each line on the right, on a white background?
I found working with the plot function tedious and time-consuming, so I finally used ggplot2 for the plot. There is plenty of help available online, which I used.
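For anyone who wants to stay in base graphics, here is a rough sketch of one way to address both questions. It is not the ggplot2 solution mentioned above. It assumes a denser x grid (200 points) for smoother curves, draws the axes once and adds each curve with lines() instead of par(new = TRUE), and paints a white rectangle behind each label. The label position and box padding are arbitrary choices.

```r
effect <- function(x, y) {
  exp(-0.35 * log(x) + 0.17 * log(y) - 0.36 * sqrt(log(x) * log(y) / 100))
}

x    <- seq(exp(1), exp(10), length.out = 200)   # denser grid, smoother curves
lvls <- seq(exp(8), exp(10), length.out = 5)

# Draw axes and labels once, leaving room on the right for the text
plot(NA, xlim = c(0, max(x) * 1.15), ylim = c(0, 0.3),
     xlab = "X-label", ylab = "effect")

for (lv in lvls) {
  prc <- effect(lv, x)
  lines(x, prc, col = "darkblue", lwd = 2)

  lab <- as.character(as.integer(lv))
  xc  <- max(x) * 1.08          # label position, right of the curve's end
  yc  <- prc[length(prc)]
  w   <- strwidth(lab)
  h   <- strheight(lab)
  # white box behind the label, sized from the label's own extent
  rect(xc - 0.6 * w, yc - 0.6 * h, xc + 0.6 * w, yc + 0.6 * h,
       col = "white", border = NA)
  text(xc, yc, lab)
}
```

Because the axes are drawn only once, nothing is overprinted, and the final duplicate plot() call is no longer needed.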

Twosided Barplot in R with different data

I was wondering if it's possible to get a two-sided barplot (e.g. Two sided bar plot ordered by date) that shows Data A above and Data B below for each x-value.
Data A would be for example the age of a person and Data B the size of the same person. The problem with this and the main difference to the examples above: A and B have obviously totally different units/ylims.
Example:
X = c("Anna","Manuel","Laura","Jeanne") # Name of the Person
A = c(12,18,22,10) # Age in years
B = c(112,186,165,120) # Size in cm
Any ideas how to solve this? I don't mind a horizontal or a vertical solution.
Thank you very much!
Here's code that gets you a solid draft of what I think you want using barplot from base R. I'm just making one series negative for the plotting, then manually setting the labels in axis to reference the original (positive) values. You have to make a choice about how to scale the two series so the comparison is still informative. I did that here by dividing height in cm by 10, which produces a range similar to the range for years.
# plot the first series, but manually set the range of the y-axis to set up the
# plotting of the other series. Set axes = FALSE so you can get the y-axis
# with labels you want in a later step.
barplot(A, ylim = c(-25, 25), axes = FALSE)
# plot the second series, making whatever transformations you need as you go. Use
# add = TRUE to add it to the first plot; use names.arg to get X as labels; and
# repeat axes = FALSE so you don't get an axis here, either.
barplot(-B/10, add = TRUE, names.arg = X, axes = FALSE)
# add a line for the x-axis if you want one
abline(h = 0)
# now add a y-axis with labels that makes sense. I set lwd = 0 so you just
# get the labels, no line.
axis(2, lwd = 0, tick = FALSE, at = seq(-20,20,5),
labels = c(rev(seq(0,200,50)), seq(5,20,5)), las = 2)
# now add y-axis labels
mtext("age (years)", 2, line = 3, at = 12.5)
mtext("height (cm)", 2, line = 3, at = -12.5)
Result with par(mai = c(0.5, 1, 0.25, 0.25)):
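The factor of 10 above was chosen by hand. As a sketch of how the scaling and the axis labels could be derived from the data instead, the variant below scales B so that its largest bar is as long as the largest bar of A. That scaling rule and the names k, top and bottom are my assumptions, not part of the answer above.

```r
X <- c("Anna", "Manuel", "Laura", "Jeanne")  # name of the person
A <- c(12, 18, 22, 10)                       # age in years
B <- c(112, 186, 165, 120)                   # size in cm

k      <- max(B) / max(A)      # cm per plotted unit
top    <- pretty(c(0, A))      # tick positions for A, in years
bottom <- pretty(c(0, B))      # tick positions for B, in cm

barplot(A, ylim = c(-max(bottom) / k, max(top)), axes = FALSE)
barplot(-B / k, add = TRUE, names.arg = X, axes = FALSE)
abline(h = 0)

# upper half labelled in years, lower half in cm (drop the duplicate zero)
axis(2, at = top, labels = top, las = 2)
axis(2, at = -bottom[-1] / k, labels = bottom[-1], las = 2)
```

Any other scaling works the same way, as long as the tick positions of the lower half are divided by the same factor as the bars.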

R horizontal barplot with axis labels split between two axis

I have a horizontal barplot, with zero in the middle of the x-axis and would like the name for each bar to appear on the same side as the bar itself. The code I am using is:
abun<-data$av.slope
species<-data$Species
cols <- c("blue", "red")[(abun > 0)+1]
barplot(abun, main="Predicted change in abundance", horiz=TRUE,
xlim=c(-0.04,0.08), col=cols, names.arg=species, las=1, cex.names=0.6)
I have tried creating two separate axes. The names then appear on the desired side for each bar, but they are not level with the appropriate bar. I will try to upload an image of the barplot. I am still very new to R, so apologies if I am missing something basic!
barplot1- names in correct position but all on one axis
barplot2- names on both sides of plot but not in line with appropriate bar
We can accomplish this using mtext:
generate data
Since you didn't include your data in the question I generated my own dummy data set. If you post a dput of your data, we could adapt this solution to your data.
set.seed(123)
df1 <- data.frame(x = rnorm(20),
y = LETTERS[1:20])
df1$colour <- ifelse(df1$x < 0, 'blue', 'red')
make plot
bp <- barplot(df1$x, col = df1$colour, horiz = T)
mtext(side = ifelse(df1$x < 0, 2, 4),
text = df1$y,
las = 1,
at = bp,
line = 1)

Plot() command to move x-axis on top of the plot

I'm trying to move the x-axis labeling and tick marks above the plot on top. Here's my code.
ucsplot <- plot(ucs, depth, main= "Depth vs. UCS", xlab = "UCS (psi)", ylab="Depth (ft)", type="l", col="blue", xlim=c(0, max(dfplot[,3]+5000)), ylim = rev(range(depth)))
ucsplot
How do I get the x-axis labeling and tick marks to appear only on top, instead of the bottom? Also, how do I get the title to not sit right on top of the numbers right above the tick marks? Also, how do I get the chart to start not offset a little bit to the right? As in the zero and starting numbers are in the corners of the plot and not offset.
The OP seems to want a plot with the x-axis at the top. Since no data was provided, the solution is shown with a sample data frame:
df <- data.frame(a = 1:10, b = 41:50)
plot(a ~ b, data = df, axes = FALSE, xlab = NA, ylab = NA)
axis(side = 2, las = 1)
axis(side = 3, las = 1)
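The other two points in the question (the title sitting on the tick labels, and the plot starting slightly offset from the corner) are not covered by the snippet above. The following sketch, on the same sample data frame, shows one possible way to handle them: xaxs = "i" and yaxs = "i" remove the default padding around the data range, a larger top margin makes room, and the line argument moves the axis label and title outward. The margin and line values are guesses that may need adjusting.

```r
df <- data.frame(a = 1:10, b = 41:50)

# "i" axis style: axis limits match the data range exactly (no 4% padding)
par(mar = c(2, 4, 6, 2), xaxs = "i", yaxs = "i")

# reversed y-axis, as in the question's ylim = rev(range(depth))
plot(a ~ b, data = df, type = "l", col = "blue", axes = FALSE,
     xlab = "", ylab = "a", ylim = rev(range(df$a)))
axis(side = 2, las = 1)
axis(side = 3)                      # x-axis on top only
box()

mtext("b", side = 3, line = 2.5)    # axis label above the tick labels
title(main = "a vs. b", line = 4.5) # title above the axis label
```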

Resources