Misplaced label on scatter plot data in R

I am quite new to R and was wondering if anyone could help with this problem:
I am trying to graph a set of data. I use plot to plot the scatter data and use text to add labels to the values. However the last label is misplaced on the graph and I can't figure out why. Below is the code:
#specify the dataset
x<-c(1:10)
#find p: the percentile of each data in the dataset
y=quantile(x, probs=seq(0,1,0.1), na.rm=FALSE, type=5)
#print the values of p
y
#plot p against x
plot(y, tck=0.02, main="Percentile Graph of Dataset D", xlab="Data of the dataset", ylab="Percentile", xlim=c(0, 11), ylim=c(0, 11), pch=10, seq(1, 11, 1), col="blue", las=1, cex.lab=0.9, cex.axis=0.9, cex.main=0.9)
#change the x-axis scale
axis(1, seq(1, 11, 1), tck=0.02)
#draw disconnected line segments
abline(h = 1:11, v = 1:11, col = "#EDEDED")
#Add data labels to the graph
text(y, x, labels= (y), cex=0.6, pos=1, col="red")

Your probs request, seq(0, 1, 0.1), returns 11 values, but you only have 10 x values. In text(y, x, ...), R therefore recycles the shorter vector x, which supplies the vertical coordinates, and the 11th label is plotted at y = 1 when you add the text. How to fix this depends on what you are trying to do. Perhaps in your probs sequence you want seq(0, 1, length.out = 10)?
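A quick length check makes the mismatch visible. This is only a minimal sketch using the question's data; the name y10 is introduced here purely for illustration, and whether 10 probabilities is the right choice depends on what the plot is meant to show.

```r
# Data and quantile call from the question
x <- 1:10
y <- quantile(x, probs = seq(0, 1, 0.1), na.rm = FALSE, type = 5)

length(x)  # number of data values
length(y)  # number of quantiles: one per element of seq(0, 1, 0.1)

# One possible fix: request exactly as many probabilities as there are data values
y10 <- quantile(x, probs = seq(0, 1, length.out = length(x)), type = 5)
length(y10)
```

With vectors of equal length, text(y10, x, labels = y10) no longer needs to recycle anything.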

Related

R: Colour points on a map based on their value and add legend

I have a dataset containing longitude, latitude and a value column showing humidity, about 300 rows in length. Each point shows the humidity for a different location. I would like to plot all of them on a map and colour them according to their value (such as in gradient colours) and add a legend. It is a bit similar to the question here, but I can't get it to work. The code is basically there but only the colouring and displaying it properly in a legend does not really work. The points represent a line in Africa and the humidity values have been originally extracted from a raster dataset and they contain several digits. I created some sample data to illustrate where I am stuck.
library("maps")
library("raster")
# create sample data
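lon <- seq(from=35.6, to=43.2, by=0.2)
# give lat and humidity the same length as lon; otherwise data.frame() fails with differing row counts
lat <- seq(from=10.5, by=0.2, length.out=length(lon))
humidity <- runif(length(lon), min=9.6, max=13.5)
data <- data.frame(lon,lat, humidity)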
colfunc<-colorRampPalette(c("dodgerblue2","khaki","orangered")) # create colours
map('world', xlim = c(20, 80), ylim = c(5, 30), lwd=0.5, col = "grey95", fill = T, interior = FALSE)
title("specific humidity along line")
map.axes()
points(data$lon, data$lat, cex=.5, pch=19, col=colfunc(100))
legend("topleft",title="q (g/kg)",legend=c(11,11.5,12,12.5,13),col =colfunc(100), pch=20)
The resulting plot looks like this:
Something is clearly wrong with the legend. I would like to have a few points shown in the legend with the corresponding colour and value, or even use a nice colourbar. I am not sure why the colour in the legend is just blue. I also suspect that the line of points is not coloured according to the actual values and is just displaying the whole colour gradient. Thanks for any suggestions!
UPDATE with code from Alex:
n <- 10
colfunc<-colorRampPalette(c("dodgerblue2","khaki","orangered")) # create colours
mycol <- function(x, myrange, n=10) round( 1+(x-myrange[1])/diff(myrange) * (n-1))
map('world', xlim = c(20, 80), ylim = c(5, 30), lwd=0.5, col = "grey95", fill = T, interior = FALSE)
title("specific humidity along line")
map.axes()
points(data$lon, data$lat, cex=.5, pch=19, col=colfunc(n)[mycol(humidity, range(humidity), n)])
mylist <- c(10,11,11.5,12,12.5,13)
legend("topleft",title="q (g/kg)",legend=mylist,col = colfunc(n)[mycol(mylist,range(humidity), n)], pch=20)
That generates this plot:
The points overlap and it is hard to see their overall values. Is there any way to colour the points according to a defined range using the colour ramp, such as "red" for values 10 to 11, "green" for 11 to 12 and so on?
You may have mixed up a few things. In your code, you are plotting points with a colour that is only a result of the order of the points (the first point gets the first colour in the list, etc.). The colour doesn't depend on the value.
Now in a colour gradient for humidity values in the full range 0:100 you will, frankly, not see any difference between values 11 and 13. You need a lot more contrast.
So you should first do
mycol <- function(x, myrange, n=100) round( 1+(x-myrange[1])/diff(myrange) * (n-1))
Now mycol(x, range(humidity), n) will return an integer that is 1 for the minimum value and n for the maximum.
n=100
points(data$lon, data$lat, cex=.5, pch=19, col=colfunc(n)[mycol(humidity, range(humidity), n)])
mylist <- c(11,11.5,12,12.5,13)
legend("topleft",title="q (g/kg)",legend=mylist,col = colfunc(n)[mycol(mylist,range(humidity), n)], pch=20)
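The mapping from value to colour index can be checked without drawing anything. The sketch below uses a handful of made-up humidity values (an assumption for illustration, not the asker's data); idx and cols are names introduced here.

```r
# Colour ramp and index function as in the answer above
colfunc <- colorRampPalette(c("dodgerblue2", "khaki", "orangered"))
mycol <- function(x, myrange, n = 100) round(1 + (x - myrange[1]) / diff(myrange) * (n - 1))

humidity <- c(9.6, 10.4, 11.2, 12.9, 13.5)  # hypothetical values
n <- 100

idx <- mycol(humidity, range(humidity), n)  # 1 for the minimum, n for the maximum
cols <- colfunc(n)[idx]                     # one colour per value
idx
cols
```

Legend values outside range(humidity) would give indices below 1 or above n, which do not correspond to a colour in colfunc(n), so it is probably safest to pick legend values inside the data range.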
You can use seq() to build the legend values:
library("maps")
library("raster")
n <- 4 # number in legend
# create sample data
lon <- seq(from=35.6, to=43.2, by=0.2)
lat <- seq(from=10.5, to=22.2, by=0.2)
humidity <- runif(39, min=9.6, max=20)
data <- data.frame(lon, lat = lat[1:39], humidity) # name the column so data$lat matches exactly
colfunc<-colorRampPalette(c("dodgerblue2","khaki","orangered")) # create colours
map('world', xlim = c(20, 80), ylim = c(5, 30), lwd=0.5, col = "grey95", fill = T, interior = FALSE)
title("specific humidity along line")
map.axes()
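# colour each point by its humidity value: cut() bins the values into 100 equal-width classes
points(data$lon, data$lat, cex=.5, pch=18, col=colfunc(100)[cut(data$humidity, breaks=100, labels=FALSE)])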
legend("topleft",title="q (g/kg)",legend=round(seq(min(humidity),max(humidity),length.out = n),0),col =colfunc(n), pch=20)
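For the follow-up about colouring by fixed ranges (10 to 11, 11 to 12 and so on), one option is to bin the values with cut() and use one colour per class. This is a hedged sketch: the break points, the seed, and the names breaks, classes, class_cols and point_cols are assumptions chosen for illustration.

```r
set.seed(42)
humidity <- runif(39, min = 9.6, max = 13.5)   # sample values in the question's range

breaks <- c(9, 10, 11, 12, 13, 14)             # hypothetical class boundaries
classes <- cut(humidity, breaks = breaks)      # factor with one level per range

colfunc <- colorRampPalette(c("dodgerblue2", "khaki", "orangered"))
class_cols <- colfunc(nlevels(classes))        # one colour per class
point_cols <- class_cols[as.integer(classes)]  # colour of each point, by its class

table(classes)

# With the map already drawn, these could then be used as:
# points(data$lon, data$lat, pch = 19, cex = .5, col = point_cols)
# legend("topleft", title = "q (g/kg)", legend = levels(classes), col = class_cols, pch = 20)
```

Because points and legend share class_cols, the legend entries match the plotted colours by construction.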

Points Scale in R barplot [duplicate]

I'm having trouble with a multi-axis barplot. I have a graph with bars and dots sharing the same x-axis. The point is that I have to show them on different scales.
While I can show both (bars and dots) correctly, the problem comes when I try to set different scales on the left and right axes. I don't know how to change the additional axis scale, or how to bind the red dots to the right axis and the bars to the left one.
This is my code and what I get:
labels <- value
mp <- barplot(height = churn, main = title, ylab = "% churn", space = 0, ylim = c(0,5))
text(mp, par("usr")[3], labels = labels, srt = 45, adj = c(1.1,1.1), xpd = TRUE, cex=.9)
# Population dots
points(popul, col="red", bg="red", pch=21, cex=1.5)
# Churn Mean
media <- mean(churn)
abline(h=media, col = "black", lty=2)
# Population scale
axis(side = 4, col= "red")
ylim= c(0,50)
ylim= c(0,5)
What I want is to have the left (grey) axis at ylim=c(0,5) with the bars bound to that axis, and the right (red) axis at ylim=c(0,50) with the dots bound to that axis.
The goal is to represent bars and points in the same graph with different axes.
I hope I explained myself successfully.
Thanks for your assistance!
Here is a toy example. The only "trick" is to store the x locations of the bar centers and the limits of the x axis when creating the barplot, so that you can overlay a plot with the same x axis and add your points over the centers of the bars. The xaxs = "i" in the call to plot.window indicates to use the exact values given rather than expanding by a constant (the default behavior).
set.seed(1234)
dat1 <- sample(10, 5)
dat2 <- sample(50, 5)
par(mar = c(2, 4, 2, 4))
cntrs <- barplot(dat1)
xlim0 <- par()$usr[1:2]
par(new = TRUE)
plot.new()
plot.window(xlim = xlim0, ylim = c(0, 50), xaxs = "i")
points(dat2 ~ cntrs, col = "darkred")
axis(side = 4, col = "darkred")
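A different way to get the same effect, shown here only as a sketch and not as the approach above, is to rescale the second series onto the bars' y scale and relabel the right axis in the original units. It reuses the toy data; scale_factor and ticks are names introduced for illustration, and the right-axis range of 0 to 50 is an assumption taken from the question.

```r
set.seed(1234)
dat1 <- sample(10, 5)   # bar heights, read on the left axis
dat2 <- sample(50, 5)   # point values, read on the right axis

# Map the assumed right-axis range 0..50 onto 0..max(dat1)
scale_factor <- max(dat1) / 50

par(mar = c(2, 4, 2, 4))
cntrs <- barplot(dat1)                      # returns the x positions of the bar centers
points(cntrs, dat2 * scale_factor, col = "darkred", pch = 19)

# Right axis: ticks placed in bar units, labelled in the points' original units
ticks <- pretty(c(0, 50))
axis(side = 4, at = ticks * scale_factor, labels = ticks, col = "darkred")
```

This avoids par(new = TRUE) at the cost of doing the scaling by hand, so which version is preferable is largely a matter of taste.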

How to do 3D bar plot in R

I would like to produce this kind of graph:
However, I don't know how to do it using R. I was wondering if someone knew a solution to do it in R?
I would use the package rgl.
library(rgl)
# load your data
X= c(1:6)
Y=seq(10,70, 10)
Z=c(-70, -50, -30, -20, -10, 10)
# create an empty plot with the good dimensions
plot3d(1,1,1, type='n', xlim=c(min(X),max(X)),
ylim=c(min(Y),max(Y)),
zlim=c(min(Z),max(Z)),
xlab="", ylab="", zlab="", axe=F )
# draw your Y bars
for(i in X){ segments3d(x = rep(X[i],2), y = c(0,Y[i]), z=0, lwd=6, col="purple")}
# do the same for the Z bars
plot3d(X,0,Z, add=T, axe=F, typ="n")
for(i in X){segments3d(x = rep(X[i],2), y = 0, z= c(0,Z[i]), lwd=6, col="blue" )}
# draw your axis
axes3d()
mtext3d(text = "Time (days)", edge = "y+", line =3, col=1 )
mtext3d(text = "Change %", edge = "z++", line = 5, col=1 )
However, I have found that the width of the bars seems to be restricted to 6 (the lwd value), which could be a limitation. The plot looks better when you have more data.
Hope it could help.

Plot two time series with different y-axes: one as a dot plot (or a bar plot) and the other as a line

I have two time series of data, each with a different range of values. I would like to plot one as a dotplot and the other as a line over the dotplot. (I would settle for a decent-looking barplot and a line over the barplot, but my preference is a dotplot.)
#make some data
require(lubridate)
require(ggplot2)
x1 <- sample(1990:2010, 10, replace=F)
x1 <- paste(x1, "-01-01", sep="")
x1 <- as.Date(x1)
y1 <- sample(1:10, 10, replace=T)
data1 <- cbind.data.frame(x1, y1)
year <- sample(1990:2010, 10, replace=F)
month <- sample(1:9, 10, replace=T)
day <- sample(1:28, 10, replace=T)
x2 <- paste(year, month, day, sep="-")
x2 <- as.Date(x2)
y2 <- sample(100:200, 10, replace=T)
data2 <- cbind.data.frame(x2, y2)
data2 <- data2[with(data2, order(x2)), ]
# frequency data for dot plot
x3 <- sample(1990:2010, 25, replace=T)
data3 <- as.data.frame(x3) # separate name so data2 is not overwritten
I can make a dotplot or barplot with one data set in ggplot:
ggplot() + geom_dotplot(data=data3, aes(x=x3))
ggplot() + geom_bar(data=data1, aes(x=x1, y=y1), stat="identity")
But I can't overlay the second data set because ggplot doesn't permit a second y-axis.
I can't figure out how to plot a time series using barplot().
I can plot the first set of data as an "h" type plot, using plot(), and add the second set of data as a line, but I can't make the bars any thicker because each one corresponds to a single day over a stretch of many years, and I think it's ugly.
plot(data1$x1, data1$y1, type="h")
par(new = T)
plot(data2$x2, data2$y2, type="l", axes=F, xlab=NA, ylab=NA)
axis(side=4)
Any ideas? My only remaining idea is to make two separate plots and overlay them in a graphics program. :/
An easy workaround is to follow your base plotting instinct and beef up lwd for type='h'. Be sure to set lend=1 to prevent rounded lines:
par(mar=c(5, 4, 2, 5) + 0.1)
plot(data1, type='h', lwd=20, lend=1, las=1, xlab='Date', col='gray',
xlim=range(data1$x1, data2$x2))
par(new=TRUE)
plot(data2, axes=FALSE, type='o', pch=20, xlab='', ylab='', lwd=2,
xlim=range(data1$x1, data2$x2))
axis(4, las=1)
mtext('y2', 4, 3.5)
I removed the original answer.
To answer your question about making a dot plot, you can rearrange your data so that you can use the base plotting function. An example:
use the chron package for plotting:
library(chron)
dummy data:
count.data <- data.frame("dates" = c("1/27/2000", "3/27/2000", "6/27/2000", "10/27/2000"), "counts" = c(3, 10, 5, 1), stringsAsFactors = F)
replicate the dates in a list:
rep.dates <- sapply(1:nrow(count.data), function(x) rep(count.data$dates[x], count.data$counts[x]))
turn the counts into a sequence:
seq.counts <- sapply(1:nrow(count.data), function(x) seq(1, count.data$counts[x], 1))
plot it up:
plot(as.chron(rep.dates[[1]]), seq.counts[[1]], xlim = c(as.chron("1/1/2000"), as.chron("12/31/2000")),
ylim = c(0, 20), pch = 20, cex = 2)
for(i in 2:length(rep.dates)){
points(as.chron(rep.dates[[i]]), seq.counts[[i]], pch = 20, cex = 2)
}
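The same stacked-dot layout can be built without a loop by expanding the counts with rep() and sequence(). This is a sketch under two assumptions: it swaps chron for base R Date objects, and the names dot.x and dot.y are introduced here for illustration.

```r
# Same dummy counts as above, with the dates stored as base R Date objects
count.data <- data.frame(
  dates  = as.Date(c("2000-01-27", "2000-03-27", "2000-06-27", "2000-10-27")),
  counts = c(3, 10, 5, 1)
)

# One x value per dot: each date repeated 'counts' times
dot.x <- rep(count.data$dates, times = count.data$counts)
# Matching y values: 1, 2, ..., count for each date
dot.y <- sequence(count.data$counts)

plot(dot.x, dot.y, pch = 20, cex = 2, ylim = c(0, 20),
     xlim = as.Date(c("2000-01-01", "2000-12-31")),
     xlab = "Date", ylab = "Count")
```

A second series could then be overlaid with par(new = TRUE) and axis(4), as in the other answer.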

Combining 2 datasets in a single plot in R

I have two columns of data, f.delta and g.delta that I would like to produce a scatter plot of in R.
Here is how I am doing it.
plot(f.delta~x, pch=20, col="blue")
points(g.delta~x, pch=20, col="red")
The problem is this: the values of f.delta vary from 0 to -7; the values of g.delta vary from 0 to 10.
When the plot is drawn, the y axis extends from 1 to -7. So while all the f.delta points are visible, any g.delta point that has y>1 is cut-off from view.
How do I stop R from automatically setting the y-axis limits from the data values? I have tried, unsuccessfully, various combinations of yaxt, yaxp and ylims.
Any suggestion will be greatly appreciated.
Thanks,
Anjan
In addition to Gavin's excellent answer, I also thought I'd mention that another common idiom in these cases is to create an empty plot with the correct limits and then to fill it in using points, lines, etc.
Using Gavin's example data:
with(df,plot(range(x),range(f.delta,g.delta),type = "n"))
points(f.delta~x, data = df, pch=20, col="blue")
points(g.delta~x, data = df, pch=20, col="red")
The type = "n" causes plot to create only the empty plotting window, based on the range of x and y values we've supplied. Then we use points for both columns on this existing plot.
You need to tell R what the limits of the data are and pass that as argument ylim to plot() (note the argument is ylim not ylims!). Here is an example:
set.seed(1)
df <- data.frame(f.delta = runif(10, min = -7, max = 0),
g.delta = runif(10, min = 0, max = 10),
x = rnorm(10))
ylim <- with(df, range(f.delta, g.delta)) ## compute y axis limits
plot(f.delta ~ x, data = df, pch = 20, col = "blue", ylim = ylim)
points(g.delta ~ x, data = df, pch = 20, col = "red")
This produces a plot in which both sets of points fall within the y-axis limits.
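As a further option, not taken from either answer, matplot() plots several columns against a common x and works out shared axis limits itself. The sketch below rebuilds the example data so that it runs on its own; the axis labels are assumptions.

```r
set.seed(1)
df <- data.frame(f.delta = runif(10, min = -7, max = 0),
                 g.delta = runif(10, min = 0, max = 10),
                 x = rnorm(10))

# Limits covering both columns, computed by hand for comparison
ylim <- range(df$f.delta, df$g.delta)
ylim

# matplot() draws each column of the matrix as its own series on shared axes
matplot(df$x, as.matrix(df[, c("f.delta", "g.delta")]),
        pch = 20, col = c("blue", "red"), xlab = "x", ylab = "delta")
```

Passing ylim explicitly to plot(), as in the answers above, remains the more direct fix when the two series are drawn with separate calls.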

Resources