How do I plot tick marks inside a lattice plot - r

How do I plot tick marks inside of a lattice plot, as opposed to outside? Setting "tcl = 0.5" works for a normal (base) plot, but not for my lattice dotplot.
Example of what I'd like to replicate in Lattice:
Thanks!

Tick length is controlled by the tck= argument, which needs to be passed in to higher-level plotting functions via their scales= argument. Setting tck=0 suppresses drawing of the ticks, while setting it to a negative number causes ticks to be drawn inside of the plot:
library(lattice)
xyplot(1:10 ~ 1:10,
       scales = list(y = list(tck = 0),
                     x = list(tck = c(-1, 0))))
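Since the question was about a dotplot, here is a minimal sketch of the same idea applied to dotplot(). It assumes the barley dataset that ships with lattice as stand-in data. A length-2 tck sets the left/bottom and right/top tick lengths separately.

```r
library(lattice)

# Illustrative data: the 'barley' dataset bundled with lattice.
# In lattice a negative tck draws the ticks inward; the second element (0)
# suppresses the ticks on the right/top sides.
p <- dotplot(variety ~ yield, data = barley,
             scales = list(tck = c(-1, 0)))
print(p)
```

Putting tck at the top level of scales applies it to both axes. Nest it under x = list(...) or y = list(...) to treat the axes differently.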

Related

Change the y limits (especially the minimum) with VlnPlot

I would like to draw a violin plot from my single-cell data.
I am using this function:
VlnPlot(object, features, cols = NULL, pt.size = 0.1)
But I would like to change the y axis to 3000-10000 instead of 0-70000.
The documentation only offers a way to change the y maximum, not the minimum.
Does someone have an idea how to do it?
The VlnPlot function in the Seurat R package uses ggplot2 to draw the violin plot. This means we can modify the y-axis using scale_y_continuous.
In your case, to change the y-axis of your violin plot to 3000-10000, we would write:
VlnPlot(object, features, cols = NULL, pt.size = 0.1) + scale_y_continuous(limits = c(3000,10000))
To show you a reproducible example, we can draw a violin plot using the pbmc_small dataset from the Seurat package:
VlnPlot(pbmc_small, "CD3E")
The plot above has a default y-axis running from 0 to around 6.3. Here is how the same plot looks after altering the y-axis with scale_y_continuous to zoom in between 0 and 3:
VlnPlot(pbmc_small, "CD3E") + scale_y_continuous(limits = c(0,3))
Many of the other visualizations in the Seurat package also use ggplot2, so you can make all types of cosmetic changes to them using various ggplot2 commands (themes, axis labels, colors, etc.)
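One caveat worth checking when restricting the axis this way: in ggplot2, limits set on a scale drop the values outside the range before the violin density is computed, whereas coord_cartesian() only zooms the view. The sketch below illustrates the difference with plain ggplot2 and the built-in mtcars data rather than Seurat. The data and the limits are illustrative assumptions, not part of the original answer.

```r
library(ggplot2)

# Stand-in violin plot on built-in data (not a Seurat object)
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) + geom_violin()

# Scale limits: rows with mpg outside 15-30 are removed before the
# density is estimated, so the violin shapes themselves can change.
p_limits <- p + scale_y_continuous(limits = c(15, 30))

# Coordinate limits: the densities use all the data and the view is
# simply cropped to 15-30.
p_zoom <- p + coord_cartesian(ylim = c(15, 30))
```

Printing p_limits and p_zoom side by side shows whether the distinction matters for a given dataset. Since VlnPlot returns ggplot2-based objects, adding coord_cartesian() in the same way should be possible, but that is untested here.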

Axis breaks in ggplot histogram in R [duplicate]

I have data that is mostly centered in a small range (1-10), but a significant number of points (say, 10%) lie in (10-1000). I would like to plot a histogram for this data that focuses on (1-10) but also shows the (10-1000) data. Something like a log scale for the histogram.
Yes, I know this means not all bins are of equal size.
Neither a simple hist(x) nor hist(x,breaks=c(0,1,1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,2,3,4,5,7.5,10,15,20,50,100,200,500,1000,10000)) gives what I want.
Update:
Following the answers here, I now produce something that is almost exactly what I want (I went with a continuous plot instead of a bar histogram):
breaks <- c(0,1,1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,2,4,8)
ggplot(t,aes(x)) + geom_histogram(colour="darkblue", size=1, fill="blue") + scale_x_log10('true size/predicted size', breaks = breaks, labels = breaks)
The only remaining problem is that I'd like the axis scale to match the bars actually plotted. There are two options for doing that: one is to use the actual margins of the plotted bars (how?) and accept "ugly" x-axis labels like 1.1754, 1.2985, etc. The other, which I prefer, is to control the bin margins used so that they match the breaks.
Log scale histograms are easier with ggplot than with base graphics. Try something like
library(ggplot2)
dfr <- data.frame(x = rlnorm(100, sdlog = 3))
ggplot(dfr, aes(x)) + geom_histogram() + scale_x_log10()
If you are desperate for base graphics, you need to plot a log-scale histogram without axes, then manually add the axes afterwards.
h <- hist(log10(dfr$x), axes = FALSE)
Axis(side = 2)
Axis(at = h$breaks, labels = 10^h$breaks, side = 1)
For completeness, the lattice solution would be
library(lattice)
histogram(~x, dfr, scales = list(x = list(log = TRUE)))
AN EXPLANATION OF WHY LOG VALUES ARE NEEDED IN THE BASE CASE:
If you plot the data with no log-transformation, then most of the data are clumped into bars at the left.
hist(dfr$x)
The hist function ignores the log argument (because it interferes with the calculation of breaks), so this doesn't work.
hist(dfr$x, log = "y")
Neither does this.
par(xlog = TRUE)
hist(dfr$x)
That means that we need to log transform the data before we draw the plot.
hist(log10(dfr$x))
Unfortunately, this messes up the axes, which brings us to the workaround above.
Using ggplot2 seems like the easiest option. If you want more control over your axes and your breaks, you can do something like the following:
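To make the bin margins coincide with the axis breaks, as the question asks, one option in base graphics is to choose the bin edges on the log10 scale yourself and reuse the same vector for the tick positions. This is a minimal sketch under assumed settings: the simulated data and the half-decade bin width are illustrative choices.

```r
set.seed(1)                      # reproducible illustrative data
x <- rlnorm(200, sdlog = 2)

# Bin edges on the log10 scale, every half decade, covering all of x
log_edges <- seq(floor(log10(min(x))), ceiling(log10(max(x))), by = 0.5)

# Bin the log-transformed data on exactly those edges, drawing no axes yet
h <- hist(log10(x), breaks = log_edges, axes = FALSE,
          xlab = "x (log scale)", main = "Bins aligned with axis breaks")

axis(side = 2)
# Label each bin edge with its value on the original scale
axis(side = 1, at = log_edges, labels = signif(10^log_edges, 2))
```

Because the ticks are drawn at the same positions that were passed as breaks, every label sits on a bar boundary.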
EDIT : new code provided
x <- c(rexp(1000,0.5)+0.5,rexp(100,0.5)*100)
breaks<- c(0,0.1,0.2,0.5,1,2,5,10,20,50,100,200,500,1000,10000)
major <- c(0.1,1,10,100,1000,10000)
H <- hist(log10(x),plot=FALSE)
plot(H$mids,H$counts,type="n",
xaxt="n",
xlab="X",ylab="Counts",
main="Histogram of X",
bg="lightgrey"
)
abline(v=log10(breaks),col="lightgrey",lty=2)
abline(v=log10(major),col="lightgrey")
abline(h=pretty(H$counts),col="lightgrey")
plot(H,add=TRUE,freq=TRUE,col="blue")
#Position of ticks
at <- log10(breaks)
#Creation X axis
axis(1,at=at,labels=10^at)
This is as close as I can get to the ggplot2 look. Making the background grey is not that straightforward, but it is doable: draw a rectangle the size of your plot region and fill it with grey.
Check all the functions I used, and also ?par. It will allow you to build your own graphs. Hope this helps.
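The grey background mentioned above can be sketched with rect() and the plot-region coordinates from par("usr"). This is one possible way to do it rather than the answerer's own code. The seed, colours and axis limits are assumptions made for the illustration.

```r
set.seed(42)                                   # illustrative data only
x <- c(rexp(1000, 0.5) + 0.5, rexp(100, 0.5) * 100)
H <- hist(log10(x), plot = FALSE)

# Empty plot that fixes the coordinate system without drawing any data
plot(H$mids, H$counts, type = "n", xaxt = "n",
     xlim = range(H$breaks), ylim = c(0, max(H$counts)),
     xlab = "X", ylab = "Counts", main = "Histogram of X")

# par("usr") holds the plot-region limits as c(x1, x2, y1, y2)
usr <- par("usr")
rect(usr[1], usr[3], usr[2], usr[4], col = "lightgrey", border = NA)

abline(h = pretty(H$counts), col = "white")    # grid lines over the grey
plot(H, add = TRUE, col = "blue")              # bars on top of the background
axis(1, at = H$breaks, labels = signif(10^H$breaks, 2))
box()
```

The rectangle has to be drawn before the bars, otherwise it would cover them.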
A dynamic graph would also help with this plot. Use the manipulate package from RStudio to draw a histogram over a dynamically chosen range:
library(manipulate)
data_dist <- table(data)
manipulate(barplot(data_dist[x:y]), x = slider(1,length(data_dist)), y = slider(10, length(data_dist)))
Then you will be able to use the sliders to view the distribution within a dynamically selected range.

Base Plot, correctly defining axis [duplicate]

How can I change the spacing of tick marks on the axis of a plot?
What parameters should I use with base plot or with rgl?
There are at least two ways to achieve this in base graphics (my examples are for the x-axis, but they work the same for the y-axis):
Use par(xaxp = c(x1, x2, n)) or plot(..., xaxp = c(x1, x2, n)) to define the position (x1 & x2) of the extreme tick marks and the number of intervals between the tick marks (n). Accordingly, n+1 is the number of tick marks drawn. (This works only if you use no logarithmic scale, for the behavior with logarithmic scales see ?par.)
You can suppress the drawing of the axis altogether and add the tick marks later with axis().
To suppress the drawing of the axis use plot(... , xaxt = "n").
Then call axis() with side, at, and labels: axis(side = 1, at = v1, labels = v2). With side referring to the side of the axis (1 = x-axis, 2 = y-axis), v1 being a vector containing the position of the ticks (e.g., c(1, 3, 5) if your axis ranges from 0 to 6 and you want three marks), and v2 a vector containing the labels for the specified tick marks (must be of same length as v1, e.g., c("group a", "group b", "group c")). See ?axis and my updated answer to a post on stats.stackexchange for an example of this method.
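As a small illustration of the first approach, the sketch below sets xaxp directly in plot(). The data are made up for the example. axTicks() is used only to show which positions the setting corresponds to.

```r
x <- 0:10
y <- x^2

# xaxp = c(x1, x2, n): outermost ticks at x1 and x2, n intervals between them
plot(x, y, xaxp = c(0, 10, 5))

# Tick positions implied by that setting on a linear axis: 0, 2, 4, 6, 8, 10
tick_pos <- axTicks(side = 1, axp = c(0, 10, 5))
```

With n = 5 intervals there are n + 1 = 6 ticks, matching the description above.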
With base graphics, the easiest way is to stop the plotting functions from drawing axes and then draw them yourself.
plot(1:10, 1:10, axes = FALSE)
axis(side = 1, at = c(1,5,10))
axis(side = 2, at = c(1,3,7,10))
box()
I have a data set with Time as the x-axis and Intensity as the y-axis. I first need to remove all the default axes, keeping only the axis titles:
plot(Time,Intensity,axes=FALSE)
Then I rebuild the plot's elements with:
box() # draw a box around the plot region
axis(labels=NA,side=1,tck=-0.015,at=c(seq(from=0,to=1000,by=100))) # labels = NA suppresses the numbers but keeps the tick marks; tck sets the tick length (negative values point outward)
axis(labels=NA,side=2,tck=-0.015)
axis(lwd=0,side=1,line=-0.4,at=c(seq(from=0,to=1000,by=100))) # lwd = 0 suppresses the axis line and ticks, so this call adds only the numbers; line = -0.4 moves them closer to the axis
axis(lwd=0,line=-0.4,side=2,las=1) # las = 1 draws the number labels horizontally instead of vertically
So, at = c(...) specifies the positions of the tick marks. Here I'd like to put the marks at 0, 100, 200, ..., 1000. seq(from = ..., to = ..., by = ...) lets me choose the limits and the increment.
And if you don't want R to add decimals or zeros, you can stop it from drawing the x axis, the y axis, or both using xaxt and yaxt. Then you can add your own ticks and labels:
plot(x, y, xaxt="n") # suppress the x axis (use yaxt="n" for the y axis)
axis(1, at=c(1, 5, 10), labels=c("First", "Second", "Third")) # side 1 = x axis, side 2 = y axis
I just discovered the Hmisc package:
Contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, importing and annotating datasets, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of R objects to LaTeX and html code, and recoding variables.
library(Hmisc)
plot(...)
minor.tick(nx=10, ny=10) # split each interval between major ticks into 10 with unlabeled minor ticks

Moving tick marks to inside of axis in raw.means.plot2

I'm formatting figures for a journal that requires that tick marks be inside the axis. For my first plot using base graphics, I've found that I can use this:
attach(mtcars)
plot(wt, mpg, tck = 0.02)
My second plot, however, uses raw.means.plot2 from the plotrix package, and I can't figure out how to get the tick marks to move to the inside of the x axis. I've looked through the plotrix documentation, but I can't seem to find anything on this. For example, I'd like to move the tick marks to the inside of both the x and y axes on this plot, but tck=0.02 only moves the tick marks on the y axis.
mtcars$ID <- seq(1,32)
library('plotrix')
raw.means.plot2(mtcars, col.id="ID",col.offset="cyl", col.x="am", col.value="mpg", tck=0.02)
If passing inside the function doesn't work, the next thing to try is a call to par before plotting.
library('plotrix')
mtcars$ID <- seq(1,32)
par(tck = .02)
raw.means.plot2(mtcars, col.id="ID",col.offset="cyl", col.x="am", col.value="mpg")
And note that after plotting par('tck') is still 0.02, so you can use
op <- par(no.readonly = TRUE)
par(stuff)
plot(stuff)
par(op)
to return to the previous state. Alternatively, graphics.off() closes all plots and resets the settings to their defaults.
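A concrete version of the save-and-restore pattern, sketched with a plain base plot instead of raw.means.plot2 so that it runs without plotrix. par() returns the previous values of whatever it changes, which is enough to undo a single setting.

```r
# par() returns the old value(s) of the parameters it sets
op <- par(tck = 0.02)          # ticks now point inward on both axes
tck_during <- par("tck")       # 0.02 while the setting is active

plot(mtcars$wt, mtcars$mpg)    # drawn with inward ticks

par(op)                        # restore the previous setting
tck_after <- par("tck")        # back to the default, which is NA
```

Restoring with par(op) keeps the change from leaking into later plots on the same device.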
Turn off the xaxis and then add one manually
raw.means.plot2(mtcars, col.id="ID",col.offset="cyl", col.x="am", col.value="mpg", xaxis = FALSE, tck = 0.02)
axis(1, tck = 0.02)
