How to fix overlapping issue - r

plot(USArrests$Murder, USArrests$UrbanPop,
xlab="murder", ylab="% urban population", pch=20, col="grey",
ylim=c(20, 100), xlim=c(0, 20))
text(USArrests$Murder, USArrests$UrbanPop, labels=rownames(USArrests),
cex=0.7, pos=3)
I tried everything: reducing the font size with cex, changing the label positions, and adjusting ylim and xlim to fit. I also tried changing the margins, which didn't really help, so I dropped that. At this point I don't know how to do this with base R tools. I know the ggplot method, which is way easier, but I want to know whether the same task can be done with base plot() and text().

To find neighbors that are too close to each other, you could run a kmeans() cluster analysis on the data. It's quite a hack, though!
First, subset your data.
dat <- USArrests[c("Murder", "UrbanPop")]
Set a seed. Play around with that. Different seeds => different results.
set.seed(42)
Analyze clusters with kmeans(). The centers option sets the number of clusters; play around with that.
dat$cl <- kmeans(dat, centers=10, nstart=5)$cluster
Now split the data and assign alternating pos numbers for positioning later in the text() command.
l <- split(dat, dat$cl)
l <- lapply(l, function(x) within(x, {
if (nrow(x) == 1)
pos <- 2 # for those with just one observation in cluster
else
pos <- as.numeric(as.character(factor((1:nrow(x)) %% 2, labels=c(2, 4))))
}))
Assemble.
dat <- do.call(rbind, unname(l))
Now plot into a png with a somewhat high resolution, I chose 800x800.
png("plot.png", 800, 800, "px")
plot(dat$Murder, dat$UrbanPop, xlab="murder", ylab="% urban population",
pch=20, col="grey", ylim=c(20, 100), xlim=c(0, 20))
# the sapply assigns the text position according to `pos` column
sapply(c(4, 2), function(x)
with(dat[dat$pos == x, ],
text(Murder, UrbanPop, labels=rownames(dat[dat$pos == x, ]),
cex=0.7, pos=x)))
dev.off()
Which gives me:
I'm sure you can optimize this further.
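One possible tidy-up, as a rough sketch: the same alternating left/right placement can be computed with ave() instead of split()/lapply()/rbind(), so the rows keep their original order and text() is called once with a vector pos. The names idx and size are made up for this illustration.

```r
# Sketch: alternate label sides within each kmeans cluster, no split()/rbind().
dat <- USArrests[c("Murder", "UrbanPop")]
set.seed(42)
cl <- kmeans(dat, centers = 10, nstart = 5)$cluster

idx  <- ave(seq_along(cl), cl, FUN = seq_along)  # running index within each cluster
size <- ave(seq_along(cl), cl, FUN = length)     # cluster size, repeated per point
pos  <- ifelse(idx %% 2 == 1, 4, 2)              # odd -> right (4), even -> left (2)
pos[size == 1] <- 2                              # singletons go left, as in the answer

plot(dat$Murder, dat$UrbanPop, xlab = "murder", ylab = "% urban population",
     pch = 20, col = "grey", ylim = c(20, 100), xlim = c(0, 20))
# pos is recycled point by point, so one call labels every state
text(dat$Murder, dat$UrbanPop, labels = rownames(dat), cex = 0.7, pos = pos)
```

As with the original, changing the seed or centers changes which labels end up on which side.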

Related

Plotting a chessboard with no external libraries

I'd appreciate some help with this problem, which I've spent hours trying to solve.
I have to plot a chessboard with no external libraries (using only the default graphical functions in R).
My attempt works for the black squares, but I get stuck when I have to filter and paint the white squares:
plot(c(1:9),c(1:9),type="n")
for (i in 1:8){
rect(i,1:9,i+1,9,col="black",border="white")
}
I could do it manually in this way, but I know there's a simpler way:
plot(c(1:9),c(1:9),type="n")
rect(1, 2, 2, 1,col="black",border="white")
rect(4, 1, 3, 2,col="black",border="white")
rect(6, 1, 5, 2,col="black",border="white")
rect(7, 1, 8, 2,col="black",border="white")
(...)
I've tried adding a function to filter even numbers inside the loop, but it doesn't seem to work for me.
I would appreciate any suggestion!
Use image and just repeat 0:1 over and over. Then you can mess with the limits a bit to make it fit nice.
image(matrix(1:0, 9, 9), col=0:1, xlim=c(-.05,.93), ylim=c(-.05,.93))
Just change the col= argument in your solution as shown. Also note that c(1:9) can be written as just 1:9 :
plot(1:9, 1:9, type = "n")
for (i in 1:8) {
col <- if (i %% 2) c("white", "black") else c("black", "white")
rect(i, 1:9, i+1, 9, col = col, border = "white")
}
Remembering Jeremy Kun's post on Set (https://jeremykun.com/2018/03/25/a-parlor-trick-for-set/) helped me figure out the hard part (for me) of this question. I realized that diagonals on the board (what bishops move on) have a constant color, so their y-intercept (where they hit the y-axis) uniquely determines their color, and adjacent intercepts have different colors. For a square at (x, y), the y-intercept (since the slope is 1) is at y - x. Since parity is the same for addition as for subtraction, and I'm never sure which mod functions (in which languages) may give a negative result, I use "(x+y) %% 2".
b <- matrix(nrow=8,ncol=8) # basic board
colorindex <- (col(b)+row(b))%%2 # parity of the Y-intercept
# for each square
colors <- c("red", "white")[colorindex+1] # choose colors
side <- 1/8 # side of one square
ux <- col(b)*side # upper x values
lx <- ux-side # lower x values
uy <- row(b)*side # upper y
ly <- uy-side # lower y
plot.new() # initialize R graphics
plot.window(xlim=c(0, 1), ylim=c(0, 1), asp=1) # asp is set here; rect() does not accept it
rect(lx, ly, ux, uy, col=colors) # draw the board
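A quick check of the parity argument, as a small sketch: in R, %% takes the sign of the divisor, so with a divisor of 2 the result is never negative and both forms give the same colouring. The names g, by_sum and by_diff are made up for this illustration.

```r
# Sketch: compare the two parity formulas on all 64 squares.
g <- expand.grid(x = 1:8, y = 1:8)  # every (x, y) square of the board
by_sum  <- (g$x + g$y) %% 2         # parity used in the answer above
by_diff <- (g$y - g$x) %% 2         # parity of the y-intercept of the diagonal
print(identical(by_sum, by_diff))   # do the two formulas agree everywhere?
print(-3 %% 2)                      # 1 in R: the result follows the sign of the divisor
```

So in R either form works; (x+y) %% 2 is simply the one that carries over safely to languages where the remainder can be negative.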

Adding lines to graph created using plotrix library

I have created a stacked histogram using the multhist function in the plotrix library, but I am unable to add a straight line to this histogram. Code that I would normally use doesn't seem to work in this setting.
Here's an example. I am trying to add the mean and standard errors of the overall distribution as simple vertical lines on the histogram, but these do not work properly. What am I doing wrong?
library(plotrix)
test1<-rnorm(30,0)
test2<-rnorm(30,0)
test3<-rnorm(30,0)
forstats<-c(test1,test2,test3)
mn<-mean(forstats)
se<-std.error(forstats)
together<-list(test1,test2,test3)
multhist(together, col=c(7,4,2), space=c(0,0), beside=FALSE,right=FALSE)
abline(v=mn)
abline(v=mn+se)
abline(v=mn-se)
multhist uses barplot, so, as @BenBolker mentions here, the x-axis corresponds to bin index. It's a bit tricky to convert between native coordinates and bin index units, so I've put together another function for stacked histograms (for frequencies, anyway):
histstack <- function(x, breaks, col=rainbow(length(x)), ...) {
col <- rev(col)
if (length(breaks)==1) {
rng <- range(pretty(range(x)))
breaks <- seq(rng[1], rng[2], length.out=breaks)
}
h <- lapply(x, hist, plot=FALSE, breaks=breaks)
cumcounts <- apply(sapply(h, '[[', 'counts'), 1, cumsum)
for(i in seq_along(h)) {
h[[i]]$counts <- cumcounts[nrow(cumcounts) - i + 1, ]
}
max_cnt <- max(sapply(h, '[[', 'counts'))
plot(h[[1]], xlim=range(sapply(h, '[', 'breaks')), yaxt='n',
ylim=c(0, max(pretty(max_cnt))), col=col[1], ...)
sapply(seq_along(h)[-1], function(i) plot(h[[i]], col=col[i], add=TRUE, ...))
axis(2, at=pretty(c(0, max_cnt)), labels=pretty(c(0, max_cnt)), ...)
}
And here it is:
histstack(together, seq(-3, 3, 0.5), col=c(7, 4, 2), main='',
las=1, xlab='', ylab='')
abline(v=c(mn, mn+se, mn-se), lwd=2)
IMO the x-axis labelling is probably more appropriate than that of multhist, since multhist implies that counts relate to the mid-bin values, whereas above it's clear that the x-axis ticks delineate the bins.
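To illustrate why abline(v=mn) lands in the wrong place on a barplot-based histogram, here is a rough sketch using plain barplot() rather than multhist itself. It assumes equal-width bins and space = 0, so bar i spans [i-1, i] on the x-axis; to_bar is a hypothetical helper that maps a data value onto that axis.

```r
# Sketch: barplot() x-coordinates are bar positions, not data values.
set.seed(1)
x <- rnorm(90)
breaks <- seq(-4, 4, by = 0.5)
counts <- hist(x, breaks = breaks, plot = FALSE)$counts

mids <- barplot(counts, space = 0)  # returns bar midpoints: 0.5, 1.5, ...

# hypothetical helper: data value -> barplot x-coordinate (equal-width bins only)
to_bar <- function(v) (v - breaks[1]) / diff(breaks)[1]
abline(v = to_bar(mean(x)), lwd = 2)  # mean drawn at the matching bar position
```

With other space or width settings the mapping changes, which is part of why a histogram drawn in native coordinates is easier to annotate.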

Heatmap like plot with Lattice

I cannot figure out how lattice's levelplot works. I have played with it for some time, but could not find a reasonable solution.
Sample data:
Data <- data.frame(x=seq(0,20,1),y=runif(21,0,1))
Data.mat <- data.matrix(Data)
Plot with levelplot:
library(lattice)
rgb.palette <- colorRampPalette(c("darkgreen","yellow", "red"), space = "rgb")
levelplot(Data.mat, main="", xlab="Time", ylab="", col.regions=rgb.palette(100),
cuts=100, at=seq(0,1,0.1), ylim=c(0,2), scales=list(y=list(at=NULL)))
This is the outcome:
Since I do not understand how levelplot really works, I cannot make it work. What I would like is for the colour strips to fill the whole window at the corresponding x (Time).
Alternative solution with other method.
Basically, I'm trying here to plot the increasing risk over time, where the red is the highest risk = 1. I would like to visualize the sequence of possible increase or clustering risk over time.
From ?levelplot we're told that if the first argument is a matrix then "'x' provides the
'z' vector described above, while its rows and columns are
interpreted as the 'x' and 'y' vectors respectively.", so
> m = Data.mat[, 2, drop=FALSE]
> dim(m)
[1] 21 1
> levelplot(m)
plots a levelplot with 21 columns and 1 row, where the levels are determined by the values in m. The formula interface might look like
> df <- data.frame(x=1, y=1:21, z=runif(21))
> levelplot(z ~ y + x, df)
(these approaches do not quite result in the same image).
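Putting the matrix interface together with the palette from the question, a sketch along these lines may get close to the strips the question asks for. The choices of aspect = "fill", row.values = Data$x, the 101 break points, and hiding the y-axis are assumptions made for this illustration, not something taken from the question.

```r
# Sketch: one strip per time point, filling the whole panel.
library(lattice)
set.seed(1)
Data <- data.frame(x = seq(0, 20, 1), y = runif(21, 0, 1))
m <- as.matrix(Data$y)  # 21 x 1 matrix: rows become the x-axis (Time)

rgb.palette <- colorRampPalette(c("darkgreen", "yellow", "red"), space = "rgb")
p <- levelplot(m, row.values = Data$x, aspect = "fill",
               col.regions = rgb.palette(100),
               at = seq(0, 1, length.out = 101),  # 100 colour intervals on [0, 1]
               xlab = "Time", ylab = "",
               scales = list(y = list(draw = FALSE)))
print(p)  # lattice objects must be printed explicitly in scripts
```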
Unfortunately I don't know much about lattice, but I noted your "Alternative solution with other method", so may I suggest another possibility:
library(plotrix)
color2D.matplot(t(Data[ , 2]), show.legend = TRUE, extremes = c("yellow", "red"))
Heaps of things to do to make it prettier. Still, a start. Of course it is important to consider the breaks in your time variable. In this very simple attempt, regular intervals are implicitly assumed, which happens to be the case in your example.
Update
Following the advice in the 'Details' section in ?color2D.matplot: "The user will have to adjust the plot device dimensions to get regular squares or hexagons, especially when the matrix is not square". Well, well, quite an ugly solution.
windows(width = 10, height = 2.5) # Windows only; dev.new(width = 10, height = 2.5) is portable
par(mar = c(5.1, 4.1, 0, 2.1)) # set margins after opening the device so they apply to it
color2D.matplot(t(Data[ , 2]),
show.legend = TRUE,
axes = TRUE,
xlab = "",
ylab = "",
extremes = c("yellow", "red"))

R plotting frequency distribution

I know that we normally do it this way:
x=c(rep(0.3,100),rep(0.5,700))
plot(table(x))
However, we can only get a few dots or vertical lines in the graph.
What should I do if I want 100 dots above 0.3 and 700 dots above 0.5?
Something like this?
x <- c(rep(.3,100), rep(.5, 700))
y <- c(seq(0,1, length.out=100), seq(0,1,length.out=700))
plot(x,y)
edit: (following OP's comment)
In that case, something like this should work.
x <- rep(seq(1, 10)/10, seq(100, 1000, by=100))
x.t <- as.matrix(table(x))
y <- unlist(apply(x.t, 1, function(x) seq(1,x)))
plot(x,y)
You can play with the linetype and linewidth settings...
plot(table(x),lty=3,lwd=0.5)
For smaller numbers (counts) you can use stripchart with method="stack" like this:
stripchart(c(rep(0.3,10),rep(0.5,70)), pch=19, method="stack", ylim=c(0,100))
But stripchart does not work for 700 dots.
Edit:
The dots() function from the package TeachingDemos is probably what you want:
require(TeachingDemos)
dots(x)
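If you'd rather stay in base R, a minimal sketch of the same stacked-dots idea is to give each observation its running count within its value, using sequence() on the table counts. The names counts and y are made up for this illustration.

```r
# Sketch: stack one dot per observation above each distinct value.
x <- c(rep(0.3, 100), rep(0.5, 700))
counts <- as.vector(table(x))  # 100 and 700, in sorted order of the values
y <- sequence(counts)          # 1..100 followed by 1..700
plot(sort(x), y, pch = ".", xlab = "x", ylab = "count")
```

Sorting x keeps the values aligned with the order of the table counts.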

Reproduce frequency matrix plot

I have a plot that I would like to recreate within R. Here is the plot:
From: Boring, E. G. (1941). Statistical frequencies as dynamic equilibria. Psychological Review, 48(4), 279.
This is a little above my paygrade (abilities) hence asking here. Boring states:
On the first occasion A can occur only 'never' (0) or 'always' (1). On the second occasion the frequencies are 0, 1/2, or 1; on the third 0, 1/3, 2/3, or 1 etc, etc.
Obviously, you don't have to worry about labels etc. Just a hint to generate the data and how to plot would be great. ;) I have no clue how to even start...
Here is an example:
library(plyr)
ps <- ldply(1:36, function(i)data.frame(s=0:i, n=i))
plot.new()
plot.window(c(1,36), c(0,1))
apply(ps, 1, function(x){
s<-x[1]; n<-x[2];
lines(c(n, n+1, n, n+1), c(s/n, s/(n+1), s/n, (s+1)/(n+1)), type="o")})
axis(1)
axis(2)
ps represents all points. Each point has two children.
So draw lines from each point to the children.
A solution using base graphics:
x <- 1:36
boring <- function(x, n=1)n/(x+n-1)
plot(x, boring(x), type="l", xlim=c(0, 36), ylim=c(0, 1))
for(i in 1:36){
lines(tail(x, 36-i+1), head(boring(x, i), 36-i+1), type="o", cex=0.5)
lines(tail(x, 36-i+1), 1-head(boring(x, i), 36-i+1), type="o", cex=0.5)
}
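For comparison, here is a sketch of the "each point has two children" idea in base graphics only, with vectorized segments() instead of plyr and apply(). It assumes the lattice of points is (n, s/n) for s = 0..n; the variable names and axis labels are made up for this illustration.

```r
# Sketch: all points (n, s/n) and the two segments from each point to its children.
N <- 36
n <- rep(1:N, times = 2:(N + 1))  # occasion n, repeated once per possible s
s <- sequence(2:(N + 1)) - 1      # successes 0..n on occasion n

plot(n, s / n, pch = 20, cex = 0.5,
     xlab = "occasion", ylab = "relative frequency")

keep <- n < N  # points on the last occasion have no children
segments(n[keep], (s / n)[keep], n[keep] + 1, (s / (n + 1))[keep])        # A does not occur
segments(n[keep], (s / n)[keep], n[keep] + 1, ((s + 1) / (n + 1))[keep])  # A occurs
```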

Resources