Removal of points from plot - r

I'm trying to plot 4 lines from 4 different y-axis variables vs. the same x-axis variable on one graph. I am currently using:
plot(df$x1, df$y,
lines(smooth.spline(df$x1, df$y), col="red"),
main="Decrease over 10 years",
ylim = c(0, 10),
xlim = c(0,10),
xlab="Years",
ylab="Percentage",
pch="",
las = 1
)
points(df$x2, df$y,
lines(smooth.spline(df$x2, df$y), col="blue"))
points(df$x3, df$y,
lines(smooth.spline(df$x3, df$y), col="green"))
points(df$x4, df$y,
lines(smooth.spline(df$x4, df$y), col="black"))
However, when I plot this I obtain the 4 desired curves, but also "o" points along the x-axis. Is there a way to remove these points (they don't refer to the data)? I've tried using the pch="" option, but this does not remove the points.
Thanks,
Mark

I tried to reproduce your problem with
set.seed(123)
df<-data.frame(
x1=sort(rnorm(25, 5, 2)),
x2=sort(rnorm(25, 5, 2)),
x3=sort(rnorm(25, 5, 2)),
x4=sort(rnorm(25, 5, 2))
)
df<-transform(df, y=x1*2-1+rnorm(25))
plot(df$x1, df$y, lines(smooth.spline(df$x1, df$y), col="red"))
points(df$x2, df$y, lines(smooth.spline(df$x2, df$y), col="blue"))
points(df$x3, df$y, lines(smooth.spline(df$x3, df$y), col="green"))
points(df$x4, df$y, lines(smooth.spline(df$x4, df$y), col="black"))
But I didn't see anything unusual on the plot. Can you explain how your data differs from the sample I generated?
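For comparison, here is a minimal sketch of one common way to draw only the curves: set up empty axes with type="n" and then add each spline with its own lines() call, instead of passing lines() as an argument to plot() or points(). It reuses the simulated df from above, so whether it removes the stray points in your data is an assumption. The names cols and fits are made up for this example.

```r
# Simulated data, same recipe as in the answer above
set.seed(123)
df <- data.frame(
  x1 = sort(rnorm(25, 5, 2)),
  x2 = sort(rnorm(25, 5, 2)),
  x3 = sort(rnorm(25, 5, 2)),
  x4 = sort(rnorm(25, 5, 2))
)
df$y <- df$x1 * 2 - 1 + rnorm(25)

# One colour per x variable (hypothetical helper vector)
cols <- c(x1 = "red", x2 = "blue", x3 = "green", x4 = "black")

# Fit one smoothing spline per x variable
fits <- lapply(names(cols), function(v) smooth.spline(df[[v]], df$y))

# type = "n" sets up the axes without drawing any point symbols
plot(df$x1, df$y, type = "n",
     xlim = range(unlist(df[names(cols)])), ylim = range(df$y),
     xlab = "x", ylab = "y", las = 1)

# Add the four curves, each as a separate call
for (i in seq_along(fits)) {
  lines(fits[[i]], col = cols[[i]])
}
```

Because nothing is drawn by plot() itself, pch no longer matters here.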

Related

Increase margin of figure to include long labels

Essentially, I want to make sure that all my labels on the x-axis are non-overlapping and that the figure margins are long enough to see vertical labels.
par(mar=c(180, 70, 2, 2.1))
oldfont <- par(font=3)
table(new$Tag)
barplot(table(new$Tag),x,las=2,cex.lab=100)
Please find bar plot image here!
Are you looking for something more than just changing the margins, text size etc.?
Readability can be improved a bit by censoring out the single-counts and truncating the names.
set.seed(1)
words <- sapply(
sample(3:25, 50, replace=TRUE),
function(x) {
paste(sample(c(letters), x, replace=TRUE), collapse="")
}
)
strtrunc <- function(x, l, r="…") {
trunc <- nchar(x) > l
x[trunc] <- paste0(strtrim(x[trunc], l), r)
x
}
samp <- sample(1:50, 500, replace=TRUE)
samp.t <- round(1.2^table(samp))
samp.t[sample(1:50, 20)] <- 1
names(samp.t) <- words
dev.new(width=10, height=5)
par(mar=c(10, 4, 3, 0.5), mgp=c(0, 0.8, -0.5), cex=0.9)
b <- barplot(samp.t, xaxt="n", space=0.5, col=1)
axis(1, at=b, labels=names(samp.t), las=2, tick=FALSE, cex.axis=0.8)
mtext("All counts", line=1, cex=1.5)
#barplot with logarithmic y-axis, truncated names and no single-counts
samp.ts <- samp.t[samp.t != 1]
names(samp.ts) <- strtrunc(names(samp.ts), 15)
dev.new(width=10, height=5)
par(mar=c(10, 4, 3, 0.5), mgp=c(0, 0.8, -0.5), cex=0.9)
b <- barplot(samp.ts, xaxt="n", space=0.5, col=1, log="y")
axis(1, at=b, labels=names(samp.ts), las=2, tick=FALSE, cex.axis=1.2)
mtext("Counts > 1", line=1, cex=1.5)
Bar plots with more than 20 or so named categories generally don't work well, so you might be better off finding a different way to visualize your data. A histogram or density plot might be an option, if it makes sense for your data. Otherwise, breaking the bar plot up into smaller sections, maybe by sensible groups, might be another.
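As a rough sketch of the "smaller sections" idea, the counts can be split into fixed-size chunks and each chunk drawn in its own panel. The data, the tag names and chunk_size below are all invented for illustration; grouping by something meaningful in your data would probably read better than an arbitrary chunk size.

```r
# Made-up counts for 50 named categories
set.seed(1)
counts <- setNames(sample(1:40, 50, replace = TRUE),
                   sprintf("tag_%02d", 1:50))

# Split into consecutive chunks of at most chunk_size bars (hypothetical value)
chunk_size <- 25
groups <- split(counts, ceiling(seq_along(counts) / chunk_size))

# One panel per chunk, with a shared y-range so the panels are comparable
op <- par(mfrow = c(length(groups), 1), mar = c(6, 4, 1, 0.5))
for (g in groups) {
  barplot(g, las = 2, cex.names = 0.8, ylim = c(0, max(counts)))
}
par(op)
```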

combine histogram with scatter plot in R

I am trying to produce a single plot that combines a histogram and a scatter plot using a secondary axis. Here is some example data:
#generate example data
set.seed(1)
a <- rnorm(200,mean=500,sd=35)
data <- data.frame(a = a,
b = rnorm(200, mean=10, sd=2),
c = c(rep(1,100), rep(0,100)))
# produce a histogram of data$a
hist(a, prob=TRUE, col="grey")
#add a density line
lines(density(a), col="blue", lwd=2)
#scatter plot
plot(data$a,data$b,col=ifelse(data$c==1,"red","black"))
What I want to do is to combine the histogram and scatter plot together. This implies my x-axis will be data$a, my primary y-axis is the frequency/density for the histogram and my secondary y-axis is data$b.
Maybe something like this...
# produce a histogram of data$a
hist(a, prob=TRUE, col="grey")
#add a density line
lines(density(a), col="blue", lwd=2)
par(new = TRUE)
#scatter plot
plot(data$a,data$b,col=ifelse(data$c==1,"red","black"),
axes = FALSE, ylab = "", xlab = "")
axis(side = 4, at = seq(4, 14, by = 2))
There's a good blog post on this here: http://www.r-bloggers.com/r-single-plot-with-two-different-y-axes/.
Basically, as the blog describes, you need to do:
par(new = TRUE)
plot(data$a,data$b,col=ifelse(data$c==1,"red","black"), axes = F, xlab = NA, ylab = NA)
axis(side = 4)
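One thing to watch with par(new = TRUE) is that hist() and plot() each pick their own x-range by default, so the points may not line up with the bars. A minimal sketch, assuming you are happy to take the range from the histogram breaks and pass the same xlim to both calls (the variable names grp and xlim are just for this example):

```r
# Example data as in the question
set.seed(1)
a <- rnorm(200, mean = 500, sd = 35)
b <- rnorm(200, mean = 10, sd = 2)
grp <- rep(c(1, 0), each = 100)

# Compute the histogram without drawing it, to get a common x-range
h <- hist(a, plot = FALSE)
xlim <- range(h$breaks)

# Leave room on the right for the secondary axis
op <- par(mar = c(5, 4, 4, 4) + 0.1)
hist(a, freq = FALSE, col = "grey", xlim = xlim)
lines(density(a), col = "blue", lwd = 2)

# Overlay the scatter plot on the same x-range, without its own axes
par(new = TRUE)
plot(a, b, col = ifelse(grp == 1, "red", "black"),
     xlim = xlim, axes = FALSE, xlab = "", ylab = "")
axis(side = 4)
mtext("b", side = 4, line = 2.5)
par(op)
```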

misplaced label on scatter plot data

I am quite new to R and was wondering if anyone could help with this problem:
I am trying to graph a set of data. I use plot to plot the scatter data and use text to add labels to the values. However the last label is misplaced on the graph and I can't figure out why. Below is the code:
#specify the dataset
x<-c(1:10)
#find p: the percentile of each data in the dataset
y=quantile(x, probs=seq(0,1,0.1), na.rm=FALSE, type=5)
#print the values of p
y
#plot p against x
plot(y, tck=0.02, main="Percentile Graph of Dataset D", xlab="Data of the dataset", ylab="Percentile", xlim=c(0, 11), ylim=c(0, 11), pch=10, seq(1, 11, 1), col="blue", las=1, cex.lab=0.9, cex.axis=0.9, cex.main=0.9)
#change the x-axis scale
axis(1, seq(1, 11, 1), tck=0.02)
#draw disconnected line segments
abline(h = 1:11, v = 1:11, col = "#EDEDED")
#Add data labels to the graph
text(y, x, labels= (y), cex=0.6, pos=1, col="red")
Your probs sequence returns 11 quantiles, but x has only 10 values. In text(y, x, ...), x supplies the vertical coordinates, so R recycles it and the 11th label is drawn at a height of x[1] = 1. How to fix this depends upon what you are trying to do. Perhaps in your probs sequence you want seq(0, 1, length.out = 10)?
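The length mismatch is easy to check directly. The sketch below compares the two probs sequences and then plots and labels with vectors of equal length; whether 10 evenly spaced probabilities is what you actually want is an assumption, and y10 is a name made up for this example.

```r
x <- 1:10

# 0, 0.1, ..., 1 is 11 probabilities, so 11 quantiles come back
y <- quantile(x, probs = seq(0, 1, 0.1), type = 5)
length(x)  # 10
length(y)  # 11

# Asking for exactly as many probabilities as there are x values
y10 <- quantile(x, probs = seq(0, 1, length.out = 10), type = 5)
length(y10)  # 10

# Points and labels now use the same coordinates, so nothing is recycled
plot(x, y10, pch = 10, col = "blue", las = 1,
     xlab = "Data of the dataset", ylab = "Percentile")
text(x, y10, labels = round(y10, 2), cex = 0.6, pos = 1, col = "red")
```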

Plot two time series with different y-axes: one as a dot plot (or a bar plot) and the other as a line

I have two time series of data, each with a different range of values. I would like to plot one as a dotplot and the other as a line over the dotplot. (I would settle for a decent-looking barplot and a line over the barplot, but my preference is a dotplot.)
#make some data
require(lubridate)
require(ggplot2)
x1 <- sample(1990:2010, 10, replace=F)
x1 <- paste(x1, "-01-01", sep="")
x1 <- as.Date(x1)
y1 <- sample(1:10, 10, replace=T)
data1 <- cbind.data.frame(x1, y1)
year <- sample(1990:2010, 10, replace=F)
month <- sample(1:9, 10, replace=T)
day <- sample(1:28, 10, replace=T)
x2 <- paste(year, month, day, sep="-")
x2 <- as.Date(x2)
y2 <- sample(100:200, 10, replace=T)
data2 <- cbind.data.frame(x2, y2)
data2 <- data2[with(data2, order(x2)), ]
# frequency data for dot plot
x3 <- sample(1990:2010, 25, replace=T)
data3 <- as.data.frame(x3)
I can make a dotplot or barplot with one data set in ggplot:
ggplot() + geom_dotplot(data=data3, aes(x=x3))
ggplot() + geom_bar(data=data1, aes(x=x1, y=y1), stat="identity")
But I can't overlay the second data set because ggplot doesn't permit a second y-axis.
I can't figure out how to plot a time series using barplot().
I can plot the first set of data as an "h" type plot, using plot(), and add the second set of data as a line, but I can't make the bars any thicker because each one corresponds to a single day over a stretch of many years, and I think it's ugly.
plot(data1$x1, data1$y1, type="h")
par(new = T)
plot(data2$x2, data2$y2, type="l", axes=F, xlab=NA, ylab=NA)
axis(side=4)
Any ideas? My only remaining idea is to make two separate plots and overlay them in a graphics program. :/
An easy workaround is to follow your base plotting instinct and beef up lwd for type='h'. Be sure to set lend=1 to prevent rounded lines:
par(mar=c(5, 4, 2, 5) + 0.1)
plot(data1, type='h', lwd=20, lend=1, las=1, xlab='Date', col='gray',
xlim=range(data1$x1, data2$x2))
par(new=TRUE)
plot(data2, axes=FALSE, type='o', pch=20, xlab='', ylab='', lwd=2,
xlim=range(data1$x1, data2$x2))
axis(4, las=1)
mtext('y2', 4, 3.5)
I removed the original answer.
To answer your question about making a dot plot, you can rearrange your data so that you can use the base plotting function. An example:
use the chron package for plotting:
library(chron)
dummy data:
count.data <- data.frame("dates" = c("1/27/2000", "3/27/2000", "6/27/2000", "10/27/2000"), "counts" = c(3, 10, 5, 1), stringsAsFactors = F)
replicate the dates in a list:
rep.dates <- sapply(1:nrow(count.data), function(x) rep(count.data$dates[x], count.data$counts[x]))
turn the counts into a sequence:
seq.counts <- sapply(1:nrow(count.data), function(x) seq(1, count.data$counts[x], 1))
plot it up:
plot(as.chron(rep.dates[[1]]), seq.counts[[1]], xlim = c(as.chron("1/1/2000"), as.chron("12/31/2000")),
ylim = c(0, 20), pch = 20, cex = 2)
for(i in 2:length(rep.dates)){
points(as.chron(rep.dates[[i]]), seq.counts[[i]], pch = 20, cex = 2)
}
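The same stacked-dot layout can also be sketched without a loop, using rep() for the x positions and sequence() for the stack heights. This version swaps chron for base Date objects, which is my substitution rather than part of the answer above; dot.x and dot.y are names invented for the example.

```r
# Same dummy data, with the dates stored as base R Date objects
count.data <- data.frame(
  dates  = as.Date(c("2000-01-27", "2000-03-27", "2000-06-27", "2000-10-27")),
  counts = c(3, 10, 5, 1)
)

# Repeat each date once per count, and number the dots within each date
dot.x <- rep(count.data$dates, times = count.data$counts)
dot.y <- sequence(count.data$counts)

# One call draws every dot
plot(dot.x, dot.y, pch = 20, cex = 2,
     xlim = as.Date(c("2000-01-01", "2000-12-31")), ylim = c(0, 20),
     xlab = "Date", ylab = "Count")
```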

Controlling z labels in contourplot

I am trying to control how many z labels should be written in my contour plot plotted with contourplot() from the lattice library.
I have 30 contour lines but I only want the first 5 to be labelled. I tried a bunch of things like
contourplot(z ~ x+y, data=d3, cuts=30, font=3, xlab="x axis", ylab="y axis", scales=list(at=seq(2,10,by=2)))
contourplot(z ~ x+y, data=d3, cuts=30, font=3, xlab="x axis", ylab="y axis", at=seq(2,10,by=2))
but nothing works.
Also, is it possible to plot two contourplot() on the same graph? I tried
contourplot(z ~ x+y, data=d3, cuts=30)
par(new=T)
contourplot(z ~ x+y, data=d3, cuts=20)
but it's not working.
Thanks!
Here is my take:
library(lattice)
x <- rep(seq(-1.5,1.5,length=50),50)
y <- rep(seq(-1.5,1.5,length=50),rep(50,50))
z <- exp(-(x^2+y^2+x*y))
# here is default plot
lp1 <- contourplot(z~x*y)
# here is an enhanced one
my.panel <- function(at, labels, ...) {
# draw odd and even contour lines with or without labels
panel.contourplot(..., at=at[seq(1, length(at), 2)], col="blue", lty=2)
panel.contourplot(..., at=at[seq(2, length(at), 2)], col="red",
labels=as.character(at[seq(2, length(at), 2)]))
}
lp2 <- contourplot(z~x*y, panel=my.panel, at=seq(0.2, 0.8, by=0.2))
lp3 <- update(lp2, at=seq(0.2,0.8,by=0.1))
lp4 <- update(lp3, lwd=2, label.style="align")
library(gridExtra)
grid.arrange(lp1, lp2, lp3, lp4)
You can adapt the custom panel function to best suit your needs (e.g. other scale for leveling the z-axis, color, etc.).
You can pass labels as a character vector and blank out the trailing entries with rep("", n). Using the example you offered in an earlier question about contour plots:
x = seq(0, 10, by = 0.5)
y = seq(0, 10, by = 0.5)
z <- outer(x, y)
d3 <- expand.grid(x=x,y=y); d3$z <- as.vector(z)
contourplot(z~x+y, data=d3)
# labeled '5'-'90'
contourplot(z~x+y, data=d3,
at=seq(5,90, by=5),
labels=c(seq(5,25, by=5),rep("", 16) ),
main="Labels only at the first 5 contour lines")
# contourplot seems to ignore 'extra' labels
# c() will coerce the 'numeric' elements to 'character' if any others are 'character'
?contourplot # and follow the link in the info about labels to ?panel.levelplot
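Rather than relying on extra labels being ignored, the label vector can be built to match the number of contour levels exactly. A small sketch along the same lines, where lev, n_labelled and labs are names made up for this example:

```r
library(lattice)

# Same grid as above: z is the product of x and y
x <- seq(0, 10, by = 0.5)
y <- seq(0, 10, by = 0.5)
d3 <- expand.grid(x = x, y = y)
d3$z <- d3$x * d3$y

# Contour levels, and how many of them should carry a label
lev <- seq(5, 90, by = 5)
n_labelled <- 5

# Keep the first n_labelled labels and blank out the rest
labs <- c(as.character(lev[seq_len(n_labelled)]),
          rep("", length(lev) - n_labelled))

p <- contourplot(z ~ x + y, data = d3, at = lev, labels = labs,
                 main = "Labels only at the first 5 contour lines")
print(p)
```

Computing the number of blanks from length(lev) means the two vectors stay in step if you change the levels later.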
