I'm trying to give the two axes of my biplot exactly the same scale (i.e., 1 cm on the vertical axis must represent the same distance as 1 cm on the horizontal axis). How can I do that with fviz_pca? Or is there a better PCA package for this?
My code:
fviz_pca_ind(res.pca,
col.ind = groups, # color by groups
palette = c("#00AFBB", "#FC4E07"),
addEllipses = TRUE, # Concentration ellipses
ellipse.type = "convex",
legend.title = "Groups",
repel = TRUE,
ggtheme = theme(axis.text = element_text(size = 16), axis.title = element_text(size = 16)))
Because the fviz functions are built on ggplot2, you just need to add the axis limits at the end of the function call, like this:
fviz_pca_biplot(scaptotrigona.pca, [...]) + xlim(-1, 1) + ylim(-1, 1)
Remember that the values inside xlim() and ylim() must be the same for both axes; if you use xlim(-1, 5) together with ylim(-1, 1), for example, it won't work.
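Matching limits only give a true 1:1 scale when the plotting panel itself is square. Since fviz functions return ggplot objects, ggplot2's coord_fixed() can be added instead to force one data unit to have the same physical length on both axes. A minimal sketch, assuming the factoextra and ggplot2 packages are installed and using the built-in iris data as a stand-in for the asker's data:

```r
library(factoextra)
library(ggplot2)

# Stand-in data: PCA on the four numeric iris columns (not the asker's data)
res.pca <- prcomp(iris[, 1:4], scale. = TRUE)

# coord_fixed(ratio = 1) makes one unit on y as long as one unit on x,
# whatever the size or shape of the graphics device
p <- fviz_pca_ind(res.pca, geom = "point") + coord_fixed(ratio = 1)
print(p)
```

xlim() and ylim() can still be added on top of this if specific limits are wanted.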
I am currently working with bfastSpatial and am trying to plot breakpoint values with a year-based colour legend. I am aware of the changeMonth function for plotting monthly breakpoints (http://www.loicdutrieux.net/bfastSpatial/); however, I am trying to achieve an outcome similar to the one in Morrison et al. (2019) https://www.mdpi.com/2072-4292/10/7/1075
Any assistance would be appreciated.
If you want to round the breakpoint dates down to integer years, you can use floor, as the dates are in decimal years. Next, to make a plot similar to the one you showed, you can use the tmap package. Since you did not attach any data to your question, I used the tura data included in the bfastSpatial package.
library(bfastSpatial)
library(tmap)
# Load tura data
data(tura)
# Perform bfast analysis
bfm <- bfmSpatial(tura, start=c(2009, 1), order=1)
# Extract the first band (breakpoints)
change <- bfm[[1]]
# As breakpoint dates are in decimal years,
# you can use floor to round them down to the nearest integer
change <- floor(change)
# Set shape as change, the object to plot
tm_shape(change) +
# Plot it as raster and set the palette, number of categories,
# style (categorical) and title of the legend.
tm_raster(palette = "Spectral",
n = 5,
style = "cat",
title = "Year") +
# Set the legend's position and eliminate the comma used by default for
# separating thousands values. Add background color and transparency
tm_layout(legend.position = c("right", "bottom"),
legend.format=list(fun=function(x) formatC(x, digits=0, format="d")),
legend.bg.color = "white",
legend.bg.alpha = 0.7) +
# Add scale bar, set position and other arguments
tm_scale_bar(breaks = c(0,0.5,1),
position = c("right", "top"),
bg.color = "white",
bg.alpha = 0.7) +
# Add north arrow with additional parameters
tm_compass(type = "arrow",
position = c("left", "top"),
bg.color = "white",
bg.alpha = 0.7)
The obtained plot:
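To see what the floor step does to decimal-year breakpoints, here is a small sketch on a made-up vector (the values are hypothetical, not taken from tura). Pixels without a detected break stay NA:

```r
# Hypothetical breakpoint dates in decimal years (not taken from tura)
bp <- c(2009.05, 2009.97, 2010.50, 2011.00, NA)

# floor() drops the fractional part, so every date within a year
# collapses to that year; NA (no break detected) is preserved
yr <- floor(bp)

# Count how many breakpoints fall in each year
table(yr, useNA = "ifany")
```

The same call works on a raster layer, as in the answer above, because floor is applied cell by cell.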
I'm trying to fit Variance-Gamma distribution to empirical data of 1-minute logarithmic returns. In order to visualize the results I plotted together 2 histograms: empirical and theoretical.
(a is the vector of empirical data)
SP_hist <- hist(a,
col = "lightblue",
freq = FALSE,
breaks = seq(min(a), max(a), length.out = 141),
border = "white",
main = "",
xlab = "Value",
xlim = c(-0.001, 0.001))
hist(VG_sim_rescaled,
freq = FALSE,
breaks = seq(min(VG_sim_rescaled), max(VG_sim_rescaled), length.out = 141),
xlab = "Value",
main = "",
col = "orange",
add = TRUE)
(empirical histogram-blue, theoretical histogram-orange)
However, after plotting the 2 histograms together, I started wondering about 2 things:
1. In both histograms I set freq = FALSE, so I expected the y-axis to be in the range (0, 1). In the actual picture, the values on the y-axis exceed 3,000. How can this happen, and how can I fix it?
2. I need to change the bin width (the width of the buckets) and the density per unit length of the x-axis. How can I do this?
Thank you for your help.
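freq = FALSE means that the total area of the histogram is normalized to one. Because your x-axis spans a very small range (about 10^(-4)), the y-values must be quite large for the area (bin width times bar height, summed over all bins) to equal one. These y-values are densities, not probabilities, so they are not bounded by 1.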
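The normalization can be checked numerically. The following sketch uses simulated data (an assumption, standing in for the asker's returns) and sums bin width times density over all bins:

```r
set.seed(1)

# Simulated stand-in for small-magnitude returns (not the asker's data)
x <- rnorm(1000, mean = 0, sd = 1e-4)

# Compute the histogram without drawing it
h <- hist(x, breaks = 50, plot = FALSE)

# Area = sum over bins of (bin width * density)
area <- sum(h$density * diff(h$breaks))
area            # equals 1 up to floating-point error

# The tallest density is far above 1 because the bins are very narrow
max(h$density)
```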
The only reliable way to fix the number of bins is to pass a vector of break points to the breaks argument. The argument also accepts a single number, but hist treats it only as a suggestion: the break points are set to "pretty" values, so the resulting number of bins may differ. Thus try the following:
bins <- 6 # number of cells
breaks <- seq(min(x), max(x), length.out = bins + 1) # bins + 1 break points
hist(x, freq=FALSE, breaks=breaks)
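To illustrate the difference between a single number and an explicit vector of break points, here is a short sketch on simulated data (an assumption, not the asker's data):

```r
set.seed(42)
x <- rnorm(200)  # simulated data, for illustration only

# A single number is only a suggestion; hist picks "pretty" break points,
# so the number of bins may not be 6
h_suggest <- hist(x, breaks = 6, plot = FALSE)
length(h_suggest$counts)

# An explicit vector of 7 break points always yields exactly 6 bins
h_exact <- hist(x, breaks = seq(min(x), max(x), length.out = 7), plot = FALSE)
length(h_exact$counts)
```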
I was wondering if it's possible to get a two sided barplot (e.g. Two sided bar plot ordered by date) that shows above Data A and below Data B of each X-Value.
Data A would be for example the age of a person and Data B the size of the same person. The problem with this and the main difference to the examples above: A and B have obviously totally different units/ylims.
Example:
X = c("Anna","Manuel","Laura","Jeanne") # Name of the Person
A = c(12,18,22,10) # Age in years
B = c(112,186,165,120) # Size in cm
Any ideas how to solve this? I don't mind a horizontal or a vertical solution.
Thank you very much!
Here's code that gets you a solid draft of what I think you want using barplot from base R. I'm just making one series negative for the plotting, then manually setting the labels in axis to reference the original (positive) values. You have to make a choice about how to scale the two series so the comparison is still informative. I did that here by dividing height in cm by 10, which produces a range similar to the range for years.
# plot the first series, but manually set the range of the y-axis to set up the
# plotting of the other series. Set axes = FALSE so you can get the y-axis
# with labels you want in a later step.
barplot(A, ylim = c(-25, 25), axes = FALSE)
# plot the second series, making whatever transformations you need as you go. Use
# add = TRUE to add it to the first plot; use names.arg to get X as labels; and
# repeat axes = FALSE so you don't get an axis here, either.
barplot(-B/10, add = TRUE, names.arg = X, axes = FALSE)
# add a line for the x-axis if you want one
abline(h = 0)
# now add a y-axis with labels that makes sense. I set lwd = 0 so you just
# get the labels, no line.
axis(2, lwd = 0, tick = FALSE, at = seq(-20,20,5),
labels = c(rev(seq(0,200,50)), seq(5,20,5)), las = 2)
# now add y-axis labels
mtext("age (years)", 2, line = 3, at = 12.5)
mtext("height (cm)", 2, line = 3, at = -12.5)
Result with par(mai = c(0.5, 1, 0.25, 0.25)):
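The divisor of 10 was picked by eye. As a variation, the scaling factor and the tick labels can be derived from the data. This is only a sketch of one possible choice: the factor k and the helper vectors are names introduced here for illustration, not part of the answer above.

```r
X <- c("Anna", "Manuel", "Laura", "Jeanne")  # name of the person
A <- c(12, 18, 22, 10)                       # age in years
B <- c(112, 186, 165, 120)                   # height in cm

# Hypothetical scaling choice: make the tallest B bar as long as the tallest A bar
k <- max(B) / max(A)
B_scaled <- -B / k

lim <- 1.1 * max(A)  # symmetric y-range with a little headroom
barplot(A, ylim = c(-lim, lim), axes = FALSE)
barplot(B_scaled, add = TRUE, names.arg = X, axes = FALSE)
abline(h = 0)

# Tick positions in original units, kept within the range of the data
ticks_A <- pretty(c(0, max(A)))
ticks_A <- ticks_A[ticks_A <= max(A)]
ticks_B <- pretty(c(0, max(B)))
ticks_B <- ticks_B[ticks_B > 0 & ticks_B <= max(B)]

# Place the B ticks at their scaled (negative) positions, label with real values
at_vals  <- c(-rev(ticks_B) / k, ticks_A)
lab_vals <- c(rev(ticks_B), ticks_A)
axis(2, at = at_vals, labels = lab_vals, las = 2, lwd = 0)
```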
I am creating a number of heatmaps in R, but I am having problems when it comes to keeping the colour scale consistent across graphs.
I find that the colours are scaled within each graph. Is there a way to make the colours consistent across graphs, i.e., so that the colour difference between a value of 0.4 and 0.5 is always the same?
Code Example:
set.seed(123)
d1 = matrix(rnorm(9, mean = 0.2, sd = 0.1), ncol = 3)
d2 = matrix(rnorm(9, mean = 0.8, sd = 0.1), ncol = 3)
mat = list(d1, d2)
for(m in mat)
heatmap(m, Rowv = NA ,Colv = NA)
You'll note in the example that cell (2,3) in the first graph has a similar colour to cell (1,3) in the second, despite the values being ~0.8 apart.
Here's a way to do it with ggplot2, if you're open to not using base graphics:
library(reshape2)
library(ggplot2)
# Set common limits for color scale
limits = range(unlist(mat))
Here's the code for two separate graphs. The last line of code for each graph ensures that they use the same z limits for setting the colors:
ggplot(melt(mat[[1]]), aes(Var1, Var2, fill=value)) +
geom_tile() +
scale_fill_continuous(limits=limits)
ggplot(melt(mat[[2]]), aes(Var1, Var2, fill=value)) +
geom_tile() +
scale_fill_continuous(limits=limits)
Another option is to plot both heatmaps in a single graph using facetting, which automatically ensures both graphs are on the same color scale:
ggplot(melt(mat), aes(Var1, Var2, fill=value)) +
geom_tile() +
facet_grid(. ~ L1)
I've used the default colors here, but for either approach you can set the color scale to be anything you wish. For example:
ggplot(melt(mat), aes(Var1, Var2, fill=value)) +
geom_tile() +
facet_grid(. ~ L1) +
scale_fill_gradient(low="red", high="green")
You could use the image function directly (heatmap uses image), though it will require some extra formatting to match the output of heatmap. You can use zlim to set the color range. Quoting from the ?image page:
the minimum and maximum z values for which colors should be plotted,
defaulting to the range of the finite values of z. Each of the given
colors will be used to color an equispaced interval of this range. The
midpoints of the intervals cover the range, so that values just
outside the range will be plotted.
# define zlim min and max for all the plots
minz = Reduce(min, mat)
maxz = Reduce(max, mat)
for(m in mat) {
image( m, zlim = c(minz, maxz), col = heat.colors(20))
}
To get closer to the formatting produced by heatmap, you can just reuse some code from the heatmap function:
for(m in mat) {
labCol = dim(m)[2]
labRow = dim(m)[1]
image(seq_len(labCol), seq_len(labRow), m, zlim = c(minz, maxz),
col = heat.colors(20), axes = FALSE, xlab = "", ylab = "",
xlim = 0.5 + c(0, labCol), ylim = 0.5 + c(0, labRow))
axis(1, 1L:labCol, labels = seq_len(labCol), las = 2, line = -0.5, tick = 0)
axis(4, 1L:labRow, labels = seq_len(labRow), las = 2, line = -0.5, tick = 0)
}
Using the breaks argument to image is another option. It allows more flexibility than zlim in setting the breakpoints for colors. Quoting from the help page, breaks is
a set of finite numeric breakpoints for the colours: must have one
more breakpoint than colour and be in increasing order. Unsorted
vectors will be sorted, with a warning.
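A minimal sketch of the breaks approach, reusing the example data from the question. The number of colours (20) and the equally spaced break points are choices made here for illustration:

```r
set.seed(123)
d1 <- matrix(rnorm(9, mean = 0.2, sd = 0.1), ncol = 3)
d2 <- matrix(rnorm(9, mean = 0.8, sd = 0.1), ncol = 3)
mat <- list(d1, d2)

# One shared set of break points spanning all matrices
rng   <- range(unlist(mat))
n_col <- 20
brks  <- seq(rng[1], rng[2], length.out = n_col + 1)  # one more break than colours

# Every plot uses the same breaks, so a given value always gets the same colour
for (m in mat) {
  image(m, breaks = brks, col = heat.colors(n_col))
}
```

Unequally spaced break points can be supplied in the same way, for example to give more colour resolution to a range of particular interest.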
Using this example:
x<-mtcars;
barplot(x$mpg);
you get a bar plot whose y-axis runs from 0 to 30.
My question is: how can you adjust it so that the y-axis runs from 10 to 30, with a split at the bottom indicating that there was data below the cut-off?
Specifically, I want to do this in base R using only the barplot function and not functions from plotrix (unlike the suggestions already provided). Is this possible?
This is not recommended. It is generally considered bad practice to chop off the bottoms of bars. However, if you look at ?barplot, it has a ylim argument which can be combined with xpd = FALSE (which turns on "clipping") to chop off the bottom of the bars.
barplot(mtcars$mpg, ylim = c(10, 30), xpd = FALSE)
Also note that you should be careful here. I followed your question and used 10 and 30 as the y-bounds, but the maximum mpg is 33.9, so I also clipped the tops of the 4 bars that have values > 30.
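If the tops of the bars should not be clipped, the upper limit can be derived from the data rather than hard-coded. A small sketch, where rounding up to the next multiple of 5 is just one possible choice:

```r
# Round the maximum up to the next multiple of 5 so no bar is cut off at the top
upper <- ceiling(max(mtcars$mpg) / 5) * 5

# Lower bound stays at 10 as in the question; xpd = FALSE clips the bar bottoms
barplot(mtcars$mpg, ylim = c(10, upper), xpd = FALSE)
```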
The only way I know of to make a "split" in an axis is using plotrix. So, based on
"Specifically, I want to do this in base R using only the barplot function and not functions from plotrix (unlike the suggestions already provided). Is this possible?"
the answer is "no, this is not possible" in the sense that I think you mean. plotrix certainly does it, and it uses base R functions, so you could do it however they do it, but then you might as well use plotrix.
You can plot on top of your barplot, perhaps a horizontal dashed line (like below) could help indicate that you're breaking the commonly accepted rules of what barplots should be:
abline(h = 10.2, col = "white", lwd = 2, lty = 2)
The resulting image is below:
Edit: You could use segments to spoof an axis break, something like this:
barplot(mtcars$mpg, ylim = c(10, 30), xpd = FALSE)
xbase = -1.5
xoff = 0.5
ybase = c(10.3, 10.7)
yoff = 0
segments(x0 = xbase - xoff, x1 = xbase + xoff,
y0 = ybase - yoff, y1 = ybase + yoff, xpd = TRUE, lwd = 2)
abline(h = mean(ybase), lwd = 2, lty = 2, col = "white")
As-is, this is pretty fragile: xbase was adjusted by hand, as it will depend on the range of your data. You could switch the barplot to xaxs = "i" and set xbase = 0 for more predictability, but why not just use plotrix, which has already done all this work for you?!
ggplot: In comments you said you don't like the look of ggplot. This is easily customized, e.g.:
library(ggplot2)
x$id <- seq_len(nrow(x)) # bar index for the x-axis; mtcars has no id column
ggplot(x, aes(y = mpg, x = id)) +
geom_bar(stat = "identity", color = "black", fill = "gray80", width = 0.8) +
theme_classic()
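For completeness, the truncated y-axis from the base R version can be approximated in ggplot2 with coord_cartesian(), which zooms the view without dropping any data. This is a sketch rather than part of the answer above; the data frame df and the limits c(10, 35) are choices made here, and the same caveat about chopping off the bottoms of bars applies:

```r
library(ggplot2)

# Self-contained data: one bar per car, indexed by row number
df <- data.frame(id = seq_len(nrow(mtcars)), mpg = mtcars$mpg)

p <- ggplot(df, aes(x = id, y = mpg)) +
  geom_col(color = "black", fill = "gray80", width = 0.8) +
  # Zoom the y-axis to roughly 10-35; bars are clipped by the view, not removed
  coord_cartesian(ylim = c(10, 35)) +
  theme_classic()
print(p)
```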