How to plot a zero-order function on a scatterplot? - r

I would like to make a plot from pairs of x-y values, but instead of the usual straight-line connection between the dots, I would like them to be connected only by horizontal and vertical lines (zero-order fit). Is this possible in R?

You can make a step plot (I think this is what you're after?). This is done with the type="s" option to plot. Sort the data by x first so that the steps run from left to right.
set.seed(0)
dat <- data.frame(x=sample(10), y=sample(10))
plot(dat[order(dat$x),], type="s")
points(dat, pch=16, col="steelblue")
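A minimal sketch of the two step styles that base graphics offers, using small made-up data (the vectors x and y and the object f are illustrative and not part of the original answer). With type="s" the line moves horizontally first and then vertically; with type="S" it moves the other way around. stepfun() from the stats package gives the same zero-order hold as a function that can be evaluated anywhere.

```r
# Hypothetical example data, already sorted by x
x <- c(1, 2, 3, 4, 5)
y <- c(2, 5, 3, 6, 4)

op <- par(mfrow = c(1, 2))
plot(x, y, type = "s", main = 'type = "s"')  # horizontal first, then vertical
points(x, y, pch = 16)
plot(x, y, type = "S", main = 'type = "S"')  # vertical first, then horizontal
points(x, y, pch = 16)
par(op)

# The same zero-order hold as a function: y[i] holds on [x[i], x[i+1])
f <- stepfun(x[-1], y)
f(2.5)  # value held from x = 2
```

Which style is appropriate depends on whether each y value should apply from its x onward (type="s") or up to its x (type="S").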

Related

rgl 3D scatterplot - controlling size of spheres from 4th dimension (bubble plot)

I am working on a 3D scatter plot using the rgl package in R, with multiple colors for different series. I was wondering if there is a way to plot a 4th dimension by controlling the size of the spheres.
I know it's possible with plotly ("bubble plot"): https://plot.ly/r/3d-scatter-plots/, but plotly starts to flicker when dealing with lots of data points. Can the same result be achieved using rgl?
set.seed(101)
dd <- data.frame(x=rnorm(100),y=rnorm(100),z=rnorm(100),
c=rnorm(100),s=rnorm(100))
Scaling function (I tweaked to keep the values strictly in (0,1), don't know if that's really necessary):
ss <- function(x) scale(x,center=min(x)-0.01,scale=diff(range(x))+0.02)
library(rgl)
Define colours (there may be a better way to do this ...)
cvec <- apply(colorRamp(c("red","blue"))(ss(dd$c))/255,1,
function(x) rgb(x[1],x[2],x[3]))
The picture (need type="s" to get spheres)
with(dd,plot3d(x,y,z,type="s",radius=ss(s), col=cvec))
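If spheres scaled to (0,1) come out too large or too small for the data range, one option is to map the fourth variable onto an explicit radius interval. The sketch below assumes a hypothetical helper rescale_to() and arbitrary bounds of 0.05 and 0.3; neither comes from the original answer. The rgl call is guarded so that the snippet also runs where rgl is not installed or no display is available.

```r
# Hypothetical helper: linearly map a numeric vector onto [lo, hi]
rescale_to <- function(v, lo = 0.05, hi = 0.3) {
  lo + (v - min(v)) / diff(range(v)) * (hi - lo)
}

set.seed(101)
dd <- data.frame(x = rnorm(100), y = rnorm(100), z = rnorm(100), s = rnorm(100))
r <- rescale_to(dd$s)  # one radius per point, all strictly positive

# Draw only in an interactive session with rgl installed
if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  rgl::plot3d(dd$x, dd$y, dd$z, type = "s", radius = r, col = "steelblue")
}
```

The radius is in data units, so sensible bounds depend on the spread of x, y and z.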

R plot and barplot how to fix ylim not alike?

I am trying to use base R to plot a time series both as a bar plot and as an ordinary line plot. I want to write a flexible function that draws such plots without axes and then adds a common axis manually.
Now I am hampered by a strange problem: the same ylim values result in different axes. Consider the following example:
data(presidents)
# shorten this series a bit
pw <- window(presidents,start=c(1965))
barplot(t(pw),ylim = c(0,80))
par(new=T)
plot(pw,ylim = c(0,80),col="blue",lwd=3)
I intentionally draw the y-axes from both plots here to show that they are not the same. I know I can achieve the intended result by plotting the bar plot first and then adding lines using the x and y args of lines.
But I am looking for a flexible solution that lets you add lines to bar plots the way you add lines to points or other line plots. So is there a way to make sure the y-axes are the same?
EDIT: also adding the usr parameter to par doesn't help me here.
par(new=T,usr = par("usr"))
Add yaxs="i" to your lineplot. Like this:
plot(pw,ylim = c(0,80),col="blue",lwd=3, yaxs="i")
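R starts bar plots at exactly y=0, while line plots do not: barplot() uses yaxs="i" internally, whereas plot() defaults to yaxs="r", which pads the axis range by 4% at each end. The padding ensures that a line lying at y=0 stays visible rather than coinciding with the x-axis line.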
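The effect of yaxs can be checked directly by inspecting par("usr") after each plot. This is a small sketch with placeholder data (1:10); the variable names usr_r and usr_i are illustrative.

```r
# Default yaxs = "r": the requested ylim is extended by 4% at each end
plot(1:10, ylim = c(0, 80), type = "n")
usr_r <- par("usr")[3:4]
usr_r  # -3.2 83.2

# yaxs = "i": the axis limits are exactly the requested ylim
plot(1:10, ylim = c(0, 80), type = "n", yaxs = "i")
usr_i <- par("usr")[3:4]
usr_i  # 0 80
```

With yaxs="i" the line plot uses the same vertical coordinate system that barplot sets up for ylim = c(0,80), so the two y-axes coincide.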

Is it possible to use more than two characters as points in a plot

I am trying to plot points in a plot where each dot is represented by a number. However, it seems that the points can only be one character long, as you can see in the plot produced by the code below:
set.seed(1); plot(rnorm(15), pch=paste(1:15))
I wonder if there is any workaround for this. Thanks.
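set.seed(1); y <- rnorm(15)
plot(y, type='n')  # set up the axes without drawing any points
text(x=1:15, y=y, labels=1:15)  # draw multi-character labels at the data coordinates
Another option is the grid-based lattice package, for example:
library(lattice)
dat <- data.frame(x=1:15,y=rnorm(15))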
xyplot(y~x,data=dat,
panel=function(x,y,...){
panel.xyplot(x,y,...)
panel.text(x,y,label=round(rnorm(15),2),adj=2,col='red')})

How to plot density plots with proportions on the y-axis?

I am using the sm package in R to draw a density plot of several variables with different sample sizes, like this:
var1 <- density(vars1[,1])
var2 <- density(vars2[,1])
var3 <- density(vars3[,1])
pdf(file="density.pdf",width=8.5,height=8)
plot(var1,col="BLUE")
par(new=T)
plot(var2,axes=FALSE,col="RED")
par(new=T)
plot(var3,axes=FALSE,col="GREEN")
dev.off()
The problem I'm having, is that I want the y-axis to show the proportions so I can compare the different variables with each other in a more meaningful way. The maxima of all three density plots are now exactly the same, and I'm pretty sure that they wouldn't be if the y-axis showed proportions. Any suggestions? Many thanks!
Edit:
I just learned that I should not plot on top of an existing plot, so now the plotting part of the code looks like this:
pdf(file="density.pdf",width=8.5,height=8)
plot(var1,col="BLUE")
lines(var2,col="RED")
lines(var3,col="GREEN")
dev.off()
The maxima of those lines however are now very much in line with the sample size differences. Is there a way to put the proportions on the y-axis for all three variables, so the area under the curve is equal for all three variables? Many thanks!
Don't plot on top of an existing plot, because the axes may be different. Instead, use lines() to plot the second and third densities after plotting the first. If necessary, adjust the ylim parameter in plot() so that they all fit.
An example showing that sample size ought not to matter:
set.seed(1)
D1 <- density(rnorm(1000))
D2 <- density(rnorm(10000))
D3 <- density(rnorm(100000))
plot(D1$x,D1$y,type='l',col="red",ylim=c(0,.45))
lines(D2$x,D2$y,lty=2,col="blue")
lines(D3$x,D3$y,lty=3,col="green")
You could make tim's solution a little more flexible by not hard-coding in the limits.
plot(D1$x,D1$y,type='l',col="red",ylim=c(0, max(sapply(list(D1, D2, D3),
function(x) {max(x$y)}))))
This would also cater for Vincent's point that the density functions are not necessarily constrained in their range.
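As a numerical check that each density estimate already encloses an area of about 1 regardless of sample size, the curve can be integrated with the trapezoid rule. This is a sketch; area_under() is a hypothetical helper and the sample sizes are arbitrary.

```r
# Hypothetical helper: trapezoid-rule area under a density object
area_under <- function(d) {
  sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
}

set.seed(1)
sizes <- c(100, 1000, 10000)
areas <- sapply(sizes, function(n) area_under(density(rnorm(n))))
round(areas, 3)  # each value should be close to 1
```

So the y-axis of a density plot is already comparable across samples; differences in peak height reflect the shape of each distribution and the bandwidth chosen, not the number of observations.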

Problem with axis limits when plotting curve over histogram [duplicate]

This question already has an answer here:
How To Avoid Density Curve Getting Cut Off In Plot
(1 answer)
Closed 6 years ago.
newbie here. I have a script to create graphs that has a bit that goes something like this:
png(Test.png)
ht=hist(step[i],20)
curve(insert_function_here,add=TRUE)
I essentially want to plot the curve of a distribution over a histogram. My problem is that the axis limits are apparently set by the histogram instead of the curve, so the curve sometimes goes beyond the y-axis limits. I have played with par("usr"), to no avail. Is there any way to set the axis limits based on the maximum of either the histogram or the curve (or, alternatively, of the curve only)? In case this changes anything, this needs to be done within a for loop where multiple such graphs are plotted and within a series of subplots (par("mfrow")).
Inspired by other answers, this is what I ended up doing:
curve(insert_function_here)
boundsc=par("usr")
ht=hist(A[,1],20,plot=FALSE)
par(usr=c(boundsc[1:2],0,max(boundsc[4],max(ht$counts))))
plot(ht,add=TRUE)
It fixes the bounds based on the highest of either the curve or the histogram.
You could compute mx <- max(curve_vector, ht$counts) and set ylim=c(0, mx), but I rather doubt the code looks like that, since [] is not a proper parameter-passing idiom and step is not an R plotting function but a model selection function. So I am guessing this is code from Matlab or some other language. In R, try this:
set.seed(123)
png("Test.png")
ht=hist(rpois(20,1), plot=FALSE, breaks=0:10-0.1)
# better to offset to include discrete counts that would otherwise be at boundaries
plot(round(ht$breaks), dpois( round(ht$breaks), # plot a Poisson density
mean(ht$counts*round(ht$breaks[-length(ht$breaks)]))),
ylim=c(0, max(ht$density)+.1) , type="l")
plot(ht, freq=FALSE, add=TRUE) # plot the histogram
dev.off()
You could plot the curve first, then compute the histogram with plot=FALSE, and use the plot function on the histogram object with add=TRUE to add it to the plot.
Even better would be to calculate the highest y-value of the curve (there may be shortcuts to do this depending on the nature of the curve) and the highest bar in the histogram, and give the larger of the two to the ylim argument when plotting the histogram.
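A minimal sketch of that approach, assuming a normal sample and a fitted normal curve as stand-ins for the asker's data and function (x, grid, cv and ymax are illustrative names). The histogram is computed without plotting, the curve is evaluated on a grid, and the larger of the two maxima sets ylim.

```r
set.seed(42)
x <- rnorm(200)  # hypothetical sample

# Compute the histogram without drawing it
ht <- hist(x, breaks = 20, plot = FALSE)

# Evaluate the curve over the range covered by the histogram
grid <- seq(min(ht$breaks), max(ht$breaks), length.out = 200)
cv <- dnorm(grid, mean = mean(x), sd = sd(x))

# Upper limit that fits both the tallest bar and the curve's peak
ymax <- max(ht$density, cv)

plot(ht, freq = FALSE, ylim = c(0, ymax))
lines(grid, cv, col = "red")
```

Because ymax is recomputed from each data set, the same lines can sit inside a for loop with par(mfrow=...) without any panel clipping its curve.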
