ggplot2 ecdf behaviour seems odd - r

Consider the following series and cumulative plot:
x=c(0,0,0,0.5,10,1500)
qplot(x,geom='step',stat='ecdf')
This produces a graph that starts left of zero, as if x had negative values. On the right, the line continues past 1500 after reaching 100%, as if there were x values larger than 1500.
I get what I expect when doing the whole thing manually:
xs=sort(x)
qplot(xs,1:length(xs)/length(xs),geom='step')
But this seems to defeat the whole purpose of the stat='ecdf' shortcut.
What am I missing?

By default, stat_ecdf pads the endpoints by max(0.08 * diff(rx), median(diff(xvals))). In my answer to "In R ggplot2, include stat_ecdf() endpoints (0,0) and (1,1)" I give a way of working around this, but it might be a little drastic, depending on your use case.
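To see how much padding that formula produces for the data in the question, here is a minimal base R sketch. It assumes that xvals holds the distinct x values and that rx is their range; the answer does not define either name, so treat both as assumptions.

```r
# Data from the question
x <- c(0, 0, 0, 0.5, 10, 1500)

# Assumption: xvals are the distinct x values, rx is their range
xvals <- sort(unique(x))
rx <- range(xvals)

# Padding formula quoted in the answer
pad <- max(0.08 * diff(rx), median(diff(xvals)))

# Where the padded step line would start and end under these assumptions
padded_range <- c(rx[1] - pad, rx[2] + pad)

print(pad)
print(padded_range)
```

Under these assumptions the 0.08 * diff(rx) term dominates because of the single large value 1500, which would explain why the line visibly extends on both sides of the data.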

Related

Error in pie3D (plotrix) in r

Let's make some data:
dat <- data.frame(art=c("Ål", "Gedde", "Brosme"), sum=c(708,3797,1385))
When I plot this as a 3D pie chart, like this:
library(plotrix)
pie3D(dat$sum,labels=dat$art,explode=0.1, main="Arter")
This happens:
How can I prevent the red sector from being drawn below the green one?
Sometimes it helps to read the help:
Due to the somewhat primitive method used to draw sectors, a sector that extends beyond both pi/2 and 3*pi/2 radians in either direction may not display properly. Setting start to pi/2 will often fix this, but the user may have to adjust start and the order of sectors in extreme cases
pie3D(dat$sum,labels=dat$art,explode=0.03, start=pi/2, main="Arter")
(also, the explode=0.03 looks nicer imho)

Axis automatically changed?

I have a plot where all lines lie in the negative range, and when I plot it the y axis is flipped: larger negative values go up instead of down, unlike in a normal plot. Currently I have the following plot:
I want the y axis the other way round, so that larger negative values go down rather than up. I hope it is clear what I mean.
How can I achieve this in R?
My code with my specific data is just using the normal plot() function.
As Ben Bolker pointed out, I had set the ylim range the wrong way round. I set it like
ylim=c(-0.05,-1)
but
ylim=c(-1,-0.05)
does what I want!
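The effect of the ylim order can be checked with a small sketch. The data below are made up for illustration; par("usr") reports the plot region's coordinate limits, so comparing its third and fourth entries shows which way the y axis runs.

```r
# Hypothetical data: all y values are negative
x <- 1:10
y <- -seq(0.1, 1, length.out = 10)

# ylim as c(lower, upper): -1 ends up at the bottom of the plot
plot(x, y, type = "l", ylim = c(-1, -0.05))
usr_normal <- par("usr")

# ylim as c(upper, lower): the axis is flipped, -1 ends up at the top
plot(x, y, type = "l", ylim = c(-0.05, -1))
usr_flipped <- par("usr")
```

In usr_normal the bottom limit is smaller than the top limit, while in usr_flipped the order is reversed.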

Drawing circles in R

I'm using the plotrix package to draw circles.
And I don't get what is wrong with my code... :-(
I have three points. The first point (1,1) should be the center of the circle. The following two points (1,4) and (4,1) have the same distance/radius to the center.
So the circle in the plot should go through these points, right?
And I don't know why the circle looks wrong. Is there an explanation?
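library(plotrix)  # provides draw.circle()
# Center of the circle and two points that should lie on it
p1 <- c(1,1)
p2 <- c(4,1)
p3 <- c(1,4)
# Radius: Euclidean distance from the center to p2
r <- sqrt(sum((p1-p2)^2))
plot(x=c(p1[1], p2[1], p3[1]),
     y=c(p1[2], p2[2], p3[2]),
     ylim=c(-5,5), xlim=c(-5,5))
draw.circle(x=p1[1], y=p1[2], radius=r)
# Grid lines to make the distances easier to read
abline(v=-5:5, col="#0000FF66")
abline(h=-5:5, col="#0000FF66")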
Take a look at the produced output here
As @Baptiste says above, you can use plot(...,asp=1). This will only work if your x and y ranges happen to be the same, though (because it sets the physical aspect ratio of your plot to 1). Otherwise, you probably want to use the eqscplot function from the MASS package. A similar issue arises whenever you try to do careful plots of geometric objects, e.g. "Drawing non-intersecting circles".
This plot is produced by substituting MASS::eqscplot for plot in your code above:
Note that depending on the details of what R thinks about your monitor configuration etc., the circle may look a bit squashed (even though it goes through the points) when you plot in R's graphics window -- it did for me -- but should look OK in the graphical output.
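As a rough illustration of the asp=1 suggestion, the sketch below redraws the example in base graphics only. The circle is traced with cos() and sin() instead of plotrix's draw.circle(), a swap made here just to keep the example self-contained; the names theta, circle_x and circle_y are introduced for illustration.

```r
# Points from the question: p1 is the center, p2 and p3 lie on the circle
p1 <- c(1, 1)
p2 <- c(4, 1)
p3 <- c(1, 4)
r <- sqrt(sum((p1 - p2)^2))

# Trace the circle as a closed sequence of points around p1
theta <- seq(0, 2 * pi, length.out = 200)
circle_x <- p1[1] + r * cos(theta)
circle_y <- p1[2] + r * sin(theta)

# asp = 1 asks for one x unit and one y unit to have the same length
plot(rbind(p1, p2, p3), xlim = c(-5, 5), ylim = c(-5, 5), asp = 1,
     xlab = "x", ylab = "y")
lines(circle_x, circle_y)
abline(v = -5:5, h = -5:5, col = "#0000FF66")
```

With equal x and y ranges, as here, the traced circle should pass through (4,1) and (1,4).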

R plot axes don't meet and data extends beyond them

I have a VERY basic plot in R, and I'd like to solve two issues. Here is the code which produces the plot:
plot(o,n,bty="n",pch=21,cex=1.5,bg="gray",xlab="y",ylab="x",lwd=2)
And, here's the plot
There are two unwanted behaviors of this plot that I'm trying to fix. And I don't know how to do either one (nor do I understand why R doesn't do these things already...)
The X and Y axes do not meet. There is a gap near the origin in this plot. I want to remove that. The axes should touch, just like in any other graph.
The data extends past the axes in both the X and Y directions. This clearly is unwanted. How can I fix this without having to manually make my own axes? It seems like there should be something more intuitive here.
Use bty="l".
You may also want to use something like:
xlim=c(0.02, 0.24), ylim=c(0.02, 0.24)
if you don't like the default limits of your two axes.
In general, check out ?par for guidance on both of these and many other options.
Try leaving out bty="n", or replace it with bty="L" if you really do not want a box with edges at the top and on the right.
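Putting the two suggestions together might look like the sketch below. The vectors o and n are not given in the question, so the data here are invented, and the limits are derived from the data with range() plus a small hypothetical margin, not the hard-coded values above.

```r
# Hypothetical stand-ins for the question's o and n
set.seed(1)
o <- runif(20, 0.03, 0.23)
n <- o + rnorm(20, sd = 0.01)

# Common limits that cover both variables, with a small margin (assumed)
lims <- range(c(o, n)) + c(-0.01, 0.01)

# bty = "l" draws only the left and bottom edges, so the axes meet
plot(o, n, bty = "l", pch = 21, cex = 1.5, bg = "gray",
     xlab = "o", ylab = "n", lwd = 2, xlim = lims, ylim = lims)
```

Because the limits are computed from the data, every point falls inside the drawn axes.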

How to avoid overplotting (for points) using base-graph?

I am in the process of finishing the graphs for a paper and decided (after a discussion on stats.stackoverflow), in order to convey as much information as possible, to create the following graph, which presents the means in the foreground and the raw data in the background:
However, one problem remains, and that is overplotting. For example, the marked point looks like it reflects one data point, but in fact 5 data points exist with the same value at that place.
Therefore, I would like to know if there is a way to deal with overplotting in base graphics using points as the function.
It would be ideal if, e.g., the respective points got darker or thicker.
Doing it manually is not an option (too many graphs and points like this). Furthermore, I do not want to learn ggplot2 just to deal with this single problem (one reason is that I tend to like dual axes, which are not supported in ggplot2).
Update: I wrote a function which automatically creates the above graphs and avoids overplotting by adding vertical or horizontal jitter (or both): check it out!
This function is now available as raw.means.plot and raw.means.plot2 in the plotrix package (on CRAN).
The standard approach is to add some noise to the data before plotting. R has a function jitter() which does exactly that. You could use it to add the necessary noise to the coordinates in your plot, e.g.:
X <- rep(1:10,10)
Z <- as.factor(sample(letters[1:10],100,replace=TRUE))
plot(jitter(as.numeric(Z),factor=0.2),X,xaxt="n")
axis(1,at=1:10,labels=levels(Z))
Besides jittering, another good approach is alpha blending, which you can obtain (on the graphics devices supporting it) by giving an alpha value as the fourth color component. I provided an example for 'overplotting' of two histograms in this SO question.
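A minimal sketch of alpha blending with points in base graphics follows. The data are made up so that one location holds five identical points; the alpha value of 0.2 is an arbitrary choice for illustration.

```r
# Hypothetical data: five points at x = 1, one at x = 2, two at x = 3
X <- rep(1:3, times = c(5, 1, 2))
Y <- rep(1, length(X))

# Semi-transparent black: the fourth argument of rgb() is alpha
col_alpha <- rgb(0, 0, 0, alpha = 0.2)

# Stacked points add up, so locations with more points look darker
plot(X, Y, pch = 16, cex = 3, col = col_alpha)
```

On a device that supports transparency, the point at x = 1 should appear clearly darker than the one at x = 2.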
One additional idea for the general problem of showing the number of points is a rug plot (the rug function). It places small tick marks along the margin that show how many points contribute (still use jittering or alpha blending for ties). This lets the actual points show their true rather than jittered values, while the rug indicates which parts of the plot have more values.
For the example plot direct jittering or alpha blending is probably best, but in some other cases the rug plot can be useful.
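For the cases where a rug is useful, a sketch along these lines may help. The data are invented so that five points coincide; only the rug ticks are jittered, while the points themselves stay at their true positions.

```r
# Hypothetical data: five identical points at (2, 3) plus a few others
set.seed(42)
x <- c(rep(2, 5), 4, 6, 6, 9)
y <- c(rep(3, 5), 1, 5, 5, 2)

# Points are drawn at their true coordinates
plot(x, y, pch = 16)

# Jittered ticks in the margins reveal how many points share a value
rug_x <- jitter(x)
rug_y <- jitter(y)
rug(rug_x, side = 1)
rug(rug_y, side = 2)
```

The cluster of ticks near x = 2 and y = 3 signals that the single visible dot there stands for several observations.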
You may also use sunflowerplot, although it would be hard to apply here. I would use alpha blending, as Dirk suggested.

Resources