Adjust space between axis ticks in lattice - r

I created a simple Dotplot() using this data:
d <- data.frame(emot=rep(c("happy","angry"),each=2),
exp=rep(c("exp","non-exp"),2), accuracy=c(0.477,0.587,0.659,0.736),
Lo=c(0.4508,0.564,0.641,0.719), Hi=c(0.504,0.611,0.677,0.753))
and the code below:
library(Hmisc)
Dotplot(emot ~ Cbind(accuracy, Lo, Hi), groups=exp, data=d,
pch=c(1,16), aspect = "xy", par.settings = list(dot.line=list(col=0)))
What I want to do is to DECREASE the distance between y-axis ticks and decrease the distance between plot elements as well - so that happy/angry horizontal error lines will get closer to each other. I know I could probably achieve that by playing with scales=list(...) parameters (not sure how yet), but I would have to define labels again, etc. Is there a quicker way to do it? It seems like such a simple thing to solve, but I'm stuck.

Despite the fact that Hmisc ::Dotplot is using lattice, just adding a ylim argument seems to do the trick.You can figure out the default scale since those two values were factors with underlying 1/2 values:
Dotplot(emot ~ Cbind(accuracy, Lo, Hi), groups=exp, data=d, ylim=c(0,3),
pch=c(1,16), aspect = "xy", par.settings = list(dot.line=list(col=0)))

Related

Is there a way to rescale the axes of a plot produced by plot.clusterlm (R)?

I have run cluster analysis on some time series data using permuco in R. (Permutes the labels of control/treatment conditions and calculates the F statistic as to how likely it is that these time clusters of significant differences occurred by chance.)
So far so good.
I have produced a number of plots using the inbuilt function plot.clusterlm that comes with this package. However, the data come from different groups, and the F values on the y axis get rescaled in each plot, i.e. the values and ticks are reset depending on how strong the effects are.
This is problematic, because the different plots based on different cluster analyses are not visually comparable.
I would like to rescale the y axis, so that all clusters are visualised along the same F values (0-10 for example).
I haven't been able to do that, and I was wondering if there is a way to pass any additional functions into the plot.clusterlm to do this.
This is the usage of the function, but I don't see a way to rescale the y axis. (Although rescaling the x axis is possible by manipulating the nbbaselinepts & nbptsperunit, but that's not what I want...)
plot(x, effect = "all", type = "statistic",
multcomp = "clustermass", alternative = "two.sided",
enhanced_stat = FALSE, nbbaselinepts = 0, nbptsperunit = 1, ...)
If you have any ideas on this, please let me know.
Thank you!
Thanks for using permuco! I opened an issue on GitHub to have a solution for implementing these features. You can expect changes in further releases of permuco.
However, the plot() method shows the F statistic which is not a good measure of effect size. A better measure of effect size is the partial-eta square which is implemented in the afex package
In the base R plotting device axes are altered like this:
x<-1:10; y=x*x
# Simple graph
plot(x, y)
# Enlarge the scale
plot(x, y, xlim=c(1,15), ylim=c(1,150))
# Log scale
plot(x, y, log="y")
This is an example from STHDA where you can find many helpful tutorials.

Axis breaks in ggplot histogram in R [duplicate]

I have data that is mostly centered in a small range (1-10) but there is a significant number of points (say, 10%) which are in (10-1000). I would like to plot a histogram for this data that will focus on (1-10) but will also show the (10-1000) data. Something like a log-scale for th histogram.
Yes, i know this means not all bins are of equal size
A simple hist(x) gives
while hist(x,breaks=c(0,1,1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,2,3,4,5,7.5,10,15,20,50,100,200,500,1000,10000))) gives
none of which is what I want.
update
following the answers here I now produce something that is almost exactly what I want (I went with a continuous plot instead of bar-histogram):
breaks <- c(0,1,1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,2,4,8)
ggplot(t,aes(x)) + geom_histogram(colour="darkblue", size=1, fill="blue") + scale_x_log10('true size/predicted size', breaks = breaks, labels = breaks)![alt text][3]
the only problem is that I'd like to match between the scale and the actual bars plotted. There two options for doing that : the one is simply use the actual margins of the plotted bars (how?) then get "ugly" x-axis labels like 1.1754,1.2985 etc. The other, which I prefer, is to control the actual bins margins used so they will match the breaks.
Log scale histograms are easier with ggplot than with base graphics. Try something like
library(ggplot2)
dfr <- data.frame(x = rlnorm(100, sdlog = 3))
ggplot(dfr, aes(x)) + geom_histogram() + scale_x_log10()
If you are desperate for base graphics, you need to plot a log-scale histogram without axes, then manually add the axes afterwards.
h <- hist(log10(dfr$x), axes = FALSE)
Axis(side = 2)
Axis(at = h$breaks, labels = 10^h$breaks, side = 1)
For completeness, the lattice solution would be
library(lattice)
histogram(~x, dfr, scales = list(x = list(log = TRUE)))
AN EXPLANATION OF WHY LOG VALUES ARE NEEDED IN THE BASE CASE:
If you plot the data with no log-transformation, then most of the data are clumped into bars at the left.
hist(dfr$x)
The hist function ignores the log argument (because it interferes with the calculation of breaks), so this doesn't work.
hist(dfr$x, log = "y")
Neither does this.
par(xlog = TRUE)
hist(dfr$x)
That means that we need to log transform the data before we draw the plot.
hist(log10(dfr$x))
Unfortunately, this messes up the axes, which brings us to workaround above.
Using ggplot2 seems like the most easy option. If you want more control over your axes and your breaks, you can do something like the following :
EDIT : new code provided
x <- c(rexp(1000,0.5)+0.5,rexp(100,0.5)*100)
breaks<- c(0,0.1,0.2,0.5,1,2,5,10,20,50,100,200,500,1000,10000)
major <- c(0.1,1,10,100,1000,10000)
H <- hist(log10(x),plot=F)
plot(H$mids,H$counts,type="n",
xaxt="n",
xlab="X",ylab="Counts",
main="Histogram of X",
bg="lightgrey"
)
abline(v=log10(breaks),col="lightgrey",lty=2)
abline(v=log10(major),col="lightgrey")
abline(h=pretty(H$counts),col="lightgrey")
plot(H,add=T,freq=T,col="blue")
#Position of ticks
at <- log10(breaks)
#Creation X axis
axis(1,at=at,labels=10^at)
This is as close as I can get to the ggplot2. Putting the background grey is not that straightforward, but doable if you define a rectangle with the size of your plot screen and put the background as grey.
Check all the functions I used, and also ?par. It will allow you to build your own graphs. Hope this helps.
A dynamic graph would also help in this plot. Use the manipulate package from Rstudio to do a dynamic ranged histogram:
library(manipulate)
data_dist <- table(data)
manipulate(barplot(data_dist[x:y]), x = slider(1,length(data_dist)), y = slider(10, length(data_dist)))
Then you will be able to use sliders to see the particular distribution in a dynamically selected range like this:

How to get shaded background in xyplot in R?

using xyplot from the lattice package, I plot a time series over a number of years. I would to add a shaded area for some of these years to indicate that this time period was "special" (e.g. war).
Please apologize if this is trivial, but I could not figure out how to do that, so I would be happy if someone could help me out, or at least point me in the right direction. I think my main problem is that I don't really know how to approach this problem. I am still relatively new to R, and to lattice in particular.
Here a minimal example:
xyplot( rnorm(100) ~ 1:100, type="l", col="black")
In the corresponding plot, I would like the color of the background (from say x-values of 45 until 65) from the bottom to the top of the plotting area be shaded in, say, light grey.
Note that solutions that I have found so far use base graphics and the polygon-function, but there the intention is to shade the area under or above a curve, which is different from what I would like to do. I don't "just" want to shade the area below my line, or above my line. Instead I would like to shade the entire background for a given time interval.
If anyone could help me out here, I would be very grateful!
See ?panel.xblocks in the latticeExtra package:
library(latticeExtra)
x <- 1:100
xyplot( rnorm(100) ~ x, type="l", col="black") +
layer_(panel.xblocks(x, x > 20, col = "lightgrey"))
Try this:
xyplot(
rnorm(100) ~ 1:100, type="l", col="black",
panel=function (x,y,...){
panel.rect(xleft=45, xright=65,ybottom=-3, ytop=3,col="grey")
panel.xyplot(x,y,...)
}
)
The panel.rect() function controls the rectangle and is the lattice equivalent of the rect() function. It has a variety of settings that you may find useful. It is called first and then the xyplot() is put on top of it. You many need to play around a little to get your ybottom and ytop parameters to look as you like them.
trellis.focus("panel", 1, 1)
grid.rect(x =.55, , y=.5, w = .2, height=6,
gp = gpar(fill = "light grey"))
trellis.unfocus()
This differs from #JohnPaul's solution in a couple of ways (and I think his answer is better). This uses the center of the desired X-band for placement in "native coordinates" and calculates the width as 'range(xlim)/range(band)' and it modifies an existing plot. the grid.rect function is the grid packages lower level function that is used by panel.rect. I sometimes find this useful when integrating lattice panels inside the xyplot system defeats me.

Formatting and manipulating a plot from the R package "hexbin"

I generate a plot using the package hexbin:
# install.packages("hexbin", dependencies=T)
library(hexbin)
set.seed(1234)
x <- rnorm(1e6)
y <- rnorm(1e6)
hbin <- hexbin(
x = x
, y = y
, xbin = 50
, xlab = expression(alpha)
, ylab = expression(beta)
)
## Using plot method for hexbin objects:
plot(hbin, style = "nested.lattice")
abline(h=0)
This seems to generate an S4 object (hbin), which I then plot using plot.
Now I'd like to add a horizontal line to that plot using abline, but unfortunately this gives the error:
plot.new has not yet been called
I have also no idea, how I can manipulate e.g. the position of the axis labels (alpha and beta are within the numbers), change the position of the legend, etc.
I'm familiar with OOP, but so far I could not find out how plot() handles the object (does it call certain methods of the object?) and how I can manipulate the resulting plot.
Why can't I simply draw a line onto the plot?
How can I manipulate axis labels?
Use lattice version of hex bin - hexbinplot(). With panel you can add your line, and with style you can choose different ways of visualizing hexagons. Check help for hexbinplot for more.
library(hexbin)
library(lattice)
x <- rnorm(1e6)
y <- rnorm(1e6)
hexbinplot(x ~ y, aspect = 1, bins=50,
xlab = expression(alpha), ylab = expression(beta),
style = "nested.centroids",
panel = function(...) {
panel.hexbinplot(...)
panel.abline(h=0)
})
hexbin uses grid graphics, not base. There is a similar function, grid.abline, which can draw lines on plots by specifying a slope and intercept, but the co-ordinate system used is confusing:
grid.abline(325,0)
gets approximately what you want, but the intercept here was found by eye.
You will have more luck using ggplot2:
library(ggplot2)
ggplot(data,aes(x=alpha,y=beta)) + geom_hex(bins=10) + geom_hline(yintercept=0.5)
I had a lot of trouble finding a lot of basic plot adjustments (axis ranges, labels, etc.) with the hexbin library but I figured out how to export the points into any other plotting function:
hxb<-hexbin(x=c(-15,-15,75,75),
y=c(-15,-15,75,75),
xbins=12)
hxb#xcm #gives the x co-ordinates of each hex tile
hxb#ycm #gives the y co-ordinates of each hex tile
hxb#count #gives the cell size for each hex tile
points(x=hxb#xcm, y=hxb#ycm, pch=hxb#count)
You can just feed these three vectors into any plotting tool you normally use.. there is the usual tweaking of size scaling, etc. but it's far better than the stubborn hexplot function. The problem I found with the ggplot2 stat_binhex is that I couldn't get the hexes to be different sizes... just different colors.
if you really want hexagons, plotrix has a hexagon drawing function that i think is fine.

How can I plot a histogram of a long-tailed data using R?

I have data that is mostly centered in a small range (1-10) but there is a significant number of points (say, 10%) which are in (10-1000). I would like to plot a histogram for this data that will focus on (1-10) but will also show the (10-1000) data. Something like a log-scale for th histogram.
Yes, i know this means not all bins are of equal size
A simple hist(x) gives
while hist(x,breaks=c(0,1,1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,2,3,4,5,7.5,10,15,20,50,100,200,500,1000,10000))) gives
none of which is what I want.
update
following the answers here I now produce something that is almost exactly what I want (I went with a continuous plot instead of bar-histogram):
breaks <- c(0,1,1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,2,4,8)
ggplot(t,aes(x)) + geom_histogram(colour="darkblue", size=1, fill="blue") + scale_x_log10('true size/predicted size', breaks = breaks, labels = breaks)![alt text][3]
the only problem is that I'd like to match between the scale and the actual bars plotted. There two options for doing that : the one is simply use the actual margins of the plotted bars (how?) then get "ugly" x-axis labels like 1.1754,1.2985 etc. The other, which I prefer, is to control the actual bins margins used so they will match the breaks.
Log scale histograms are easier with ggplot than with base graphics. Try something like
library(ggplot2)
dfr <- data.frame(x = rlnorm(100, sdlog = 3))
ggplot(dfr, aes(x)) + geom_histogram() + scale_x_log10()
If you are desperate for base graphics, you need to plot a log-scale histogram without axes, then manually add the axes afterwards.
h <- hist(log10(dfr$x), axes = FALSE)
Axis(side = 2)
Axis(at = h$breaks, labels = 10^h$breaks, side = 1)
For completeness, the lattice solution would be
library(lattice)
histogram(~x, dfr, scales = list(x = list(log = TRUE)))
AN EXPLANATION OF WHY LOG VALUES ARE NEEDED IN THE BASE CASE:
If you plot the data with no log-transformation, then most of the data are clumped into bars at the left.
hist(dfr$x)
The hist function ignores the log argument (because it interferes with the calculation of breaks), so this doesn't work.
hist(dfr$x, log = "y")
Neither does this.
par(xlog = TRUE)
hist(dfr$x)
That means that we need to log transform the data before we draw the plot.
hist(log10(dfr$x))
Unfortunately, this messes up the axes, which brings us to workaround above.
Using ggplot2 seems like the most easy option. If you want more control over your axes and your breaks, you can do something like the following :
EDIT : new code provided
x <- c(rexp(1000,0.5)+0.5,rexp(100,0.5)*100)
breaks<- c(0,0.1,0.2,0.5,1,2,5,10,20,50,100,200,500,1000,10000)
major <- c(0.1,1,10,100,1000,10000)
H <- hist(log10(x),plot=F)
plot(H$mids,H$counts,type="n",
xaxt="n",
xlab="X",ylab="Counts",
main="Histogram of X",
bg="lightgrey"
)
abline(v=log10(breaks),col="lightgrey",lty=2)
abline(v=log10(major),col="lightgrey")
abline(h=pretty(H$counts),col="lightgrey")
plot(H,add=T,freq=T,col="blue")
#Position of ticks
at <- log10(breaks)
#Creation X axis
axis(1,at=at,labels=10^at)
This is as close as I can get to the ggplot2. Putting the background grey is not that straightforward, but doable if you define a rectangle with the size of your plot screen and put the background as grey.
Check all the functions I used, and also ?par. It will allow you to build your own graphs. Hope this helps.
A dynamic graph would also help in this plot. Use the manipulate package from Rstudio to do a dynamic ranged histogram:
library(manipulate)
data_dist <- table(data)
manipulate(barplot(data_dist[x:y]), x = slider(1,length(data_dist)), y = slider(10, length(data_dist)))
Then you will be able to use sliders to see the particular distribution in a dynamically selected range like this:

Resources