Generate Random Data in the Form of a Letter - r

I couldn't find anything on this task while Googling, but I can't imagine no one has thought of doing this. Is there a way to generate random 2D data in the form of a letter of choice? So basically a function letter_random_data(letter) that would output x and y coordinates (within some boundaries) that, together with some noise, form the chosen letter.

Here's one way to do it: Draw an image containing the letter (or text, more generally). Read the image into an array, and use it to accept or reject points drawn randomly in the box holding the image.
For example,
library(png)
getTextImage <- function(text) {
filename <- tempfile()
png(filename = filename)
plot.new()
cex <- 1
repeat {
if (strwidth(text, cex = 2*cex) > 1) break
if (strheight(text, cex = 2*cex) > 1) break
cex <- 2*cex
}
text(0.5, 0.5, text, cex = cex)
dev.off()
image <- readPNG(filename)
unlink(filename) # clean up file
if (length(dim(image)) == 3)
image <- image[,,1] # just keep one channel
image
}
randomText <- function(n, text) {
image <- getTextImage(text)
nx <- dim(image)[1]
ny <- dim(image)[2]
hits <- 0
x <- y <- numeric(n)
while (hits < n) {
tryx <- runif(1)
tryy <- runif(1)
keep <- image[round((nx-1)*tryx + 1), round((ny-1)*tryy + 1)] == 0
if (keep) {
hits <- hits + 1
# Need to rotate so it looks good
x[hits] <- tryy
y[hits] <- 1 - tryx
}
}
cbind(x, y)
}
plot(randomText(1000, "Hello"))
This produces the following plot:
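The accept/reject step can also be done in batches rather than one point at a time. Below is a minimal base R sketch of that idea, assuming the letter is already available as a logical matrix (here a crude hand-made "L" instead of a rendered PNG); sampleFromMask is a hypothetical helper name, not part of the answer above.

```r
# Hypothetical helper: draw n points uniformly from the TRUE cells of a
# logical mask, proposing points in batches and rejecting the misses.
sampleFromMask <- function(n, mask) {
  stopifnot(is.logical(mask), any(mask))
  nr <- nrow(mask); nc <- ncol(mask)
  x <- y <- numeric(0)
  while (length(x) < n) {
    u <- runif(2 * n)                          # proposed x in [0, 1]
    v <- runif(2 * n)                          # proposed y in [0, 1]
    row <- pmin(floor((1 - v) * nr) + 1, nr)   # matrix row 1 is the top of the image
    col <- pmin(floor(u * nc) + 1, nc)
    keep <- mask[cbind(row, col)]              # accept only points that hit the mask
    x <- c(x, u[keep]); y <- c(y, v[keep])
  }
  cbind(x = x[seq_len(n)], y = y[seq_len(n)])
}

# A crude 10 x 10 mask in the shape of the letter "L"
mask <- matrix(FALSE, 10, 10)
mask[, 1:2] <- TRUE     # vertical stroke: two leftmost columns
mask[9:10, ] <- TRUE    # horizontal stroke: two bottom rows

set.seed(1)
pts <- sampleFromMask(500, mask)
plot(pts, asp = 1)
```

The same function should work with the matrix returned by getTextImage() if it is first converted with something like image == 0, at the cost of the orientation handling being done by the row/column mapping rather than by rotating afterwards.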

Related

ggarrange and ggplot inside a for loop

Here is a simple example in which I generate 5 standard normal samples, each one with its own p value (just for the sake of the demonstration). I save them in a 4x50x5 array called X.
After that, I want to save 5 plots with 4 histograms each in a .pdf, and the following code does the job
pvec <- 2^(2:5)
n <- pvec/2
j <- 5
size <- 50
X <- array(rep(NA, length(pvec)*size*j), dim=c(length(pvec), size, j)) # size (not the undefined reps) gives the 4x50x5 array
for (k in 1:length(pvec)){
for (i in 1:j){
X[k,,i] <- rnorm(size)
}
}
pdf("grafic.pdf")
par(mfrow=c(2,2))
for (w in 1:j){
for (k in 1:length(pvec)){
hist(X[k,,w], freq = F, col = 'lightgreen',main = paste("p = ", pvec[k], ",n =", n[k]))
curve(dnorm(x,mean=0,sd=1), add=TRUE,col="blue")
}
}
dev.off()
Obtaining, for example
Let's say that I now want to do this with ggplot. I have to use ggarrange in place of par(mfrow). But ggarrange takes a plot list as an argument, so inside the for loop I should have something like
graphlist <- NULL
for (w in 1:j){
for (k in 1:length(pvec)){
graphlist[k,,w] <- ggplot(data=data.frame(X), aes(x=X[k,,w])) +
geom_histogram()
}
}
ggarrange(plotlist = graphlist, ncol = 2, nrow = 2)
But of course this doesn't work. How can I do things like this, where I need to save the plots made by ggplot2 and then combine them with ggarrange? Thanks
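One possible approach, sketched here under the assumption that ggarrange comes from the ggpubr package: store the plots in an ordinary list (a list cannot be indexed like a three-dimensional array), give each plot its own small data frame so it does not depend on the loop variables later, and print one arranged page per value of w. The output file name and the bins value are illustrative choices.

```r
library(ggplot2)
library(ggpubr)   # assumed source of ggarrange()

pvec <- 2^(2:5)
j <- 5
size <- 50
set.seed(1)
X <- array(rnorm(length(pvec) * size * j), dim = c(length(pvec), size, j))

out <- file.path(tempdir(), "grafic_gg.pdf")   # illustrative file name
pdf(out)
for (w in seq_len(j)) {
  # Build the four histograms for this page as a plain list of ggplot objects
  plots <- lapply(seq_along(pvec), function(k) {
    d <- data.frame(value = X[k, , w])   # data is stored inside the plot
    ggplot(d, aes(x = value)) +
      geom_histogram(aes(y = after_stat(density)), bins = 15,
                     fill = "lightgreen", colour = "black") +
      stat_function(fun = dnorm, colour = "blue") +
      ggtitle(paste0("p = ", pvec[k], ", n = ", pvec[k] / 2))
  })
  # One 2 x 2 page per w; print() is needed inside a loop
  print(ggarrange(plotlist = plots, ncol = 2, nrow = 2))
}
dev.off()
```

Building the data frame inside the function avoids the common pitfall where aes(x = X[k,,w]) is only evaluated when the plot is drawn, by which time k and w have changed.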

Segmenting rings i.e. non-full objects in R (in EBImage or other)

I am relying on edge detection (as opposed to colour detection) to extract features from blood cells. The original image looks like:
I am using the R EBImage package to run a sobel + low pass filter to get to something like this:
library(EBImage)
library(data.table)
img <- readImage("6hr-007-DIC.tif")
#plot(img)
#print(img, short = T)
# 1. define filter for edge detection
hfilt <- matrix(c(1, 2, 1, 0, 0, 0, -1, -2, -1), nrow = 3) # sobel
# rotate horizontal filter to obtain vertical filter
vfilt <- t(hfilt)
# get horizontal and vertical edges
imgH <- filter2(img, hfilt, boundary="replicate")
imgV <- filter2(img, vfilt, boundary="replicate")
# combine edge pixel data to get overall edge data
hdata <- imageData(imgH)
vdata <- imageData(imgV)
edata <- sqrt(hdata^2 + vdata^2)
# transform edge data to image
imgE <- Image(edata)
#print(display(combine(img, imgH, imgV, imgE), method = "raster", all = T))
display(imgE, method = "raster", all = T)
# 2. Enhance edges with low pass filter
hfilt <- matrix(c(1, 1, 1, 1, 1, 1, 1, 1, 1), nrow = 3) # low pass
# rotate horizontal filter to obtain vertical filter
vfilt <- t(hfilt)
# get horizontal and vertical edges
imgH <- filter2(imgE, hfilt, boundary="replicate")
imgV <- filter2(imgE, vfilt, boundary="replicate")
# combine edge pixel data to get overall edge data
hdata <- imageData(imgH)
vdata <- imageData(imgV)
edata <- sqrt(hdata^2 + vdata^2)
# transform edge data to image
imgE <- Image(edata)
plot(imgE)
I would like to know if there are any methods to fill in the holes in the large rings (blood cells) so they are solid bodies a bit like:
(obviously this is not the same image but imagine that last image only started out with edges.)
I would then like to use something like computeFeatures() method from the EBImage package (which as far as I'm aware only works on solid bodies)
EDIT: A little more code to extract the interior of objects with "connections" to the border. The additional code includes defining the convex hull of the segmented cells and creating a filled mask.
The short answer is that fillHull and floodFill may be helpful for filling cells that have well defined borders.
The longer (edited) answer below suggests an approach with floodFill that might be useful. You did a great job extracting information from the low contrast DIC images, but even more image processing, such as "flat-field correction" for noisy DIC images, might be helpful. The principle is described in this Wikipedia page, but a simple implementation does wonders. The coding solution suggested here requires user interaction to select cells, which is not a very robust approach. Still, perhaps more image processing combined with code to locate cells could work. In the end, the interiors of the cells are segmented and available for analysis with computeFeatures.
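For readers unfamiliar with flat-field correction, here is a minimal base R sketch of one common form of it, assuming a separately acquired flat-field frame (and optionally a dark frame) is available as a numeric matrix. The function name flatFieldCorrect and the synthetic data are illustrative only; this is not necessarily the implementation the answer had in mind.

```r
# Hypothetical helper: corrected = (raw - dark) * mean(flat - dark) / (flat - dark)
flatFieldCorrect <- function(raw, flat, dark = 0) {
  gain <- flat - dark
  stopifnot(all(gain > 0))           # avoid division by zero or negative gain
  (raw - dark) * mean(gain) / gain
}

# Synthetic example: a uniform scene (value 0.5) seen under an
# illumination gradient that runs from left (0.5) to right (1.5)
illum <- matrix(seq(0.5, 1.5, length.out = 5), nrow = 4, ncol = 5, byrow = TRUE)
raw   <- 0.5 * illum                 # what the camera would record
flat  <- illum                       # image of an empty, evenly bright field

corrected <- flatFieldCorrect(raw, flat)
range(corrected)                     # the gradient is removed
```

When no flat-field frame exists, a heavily blurred copy of the image is sometimes used as a stand-in for the illumination estimate; that variant is an assumption here and would need checking against the actual images.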
The code starts with the thresholded image (having trimmed the edges and converted to binary).
# Set up plots for 96 dpi images
library(EBImage)
dm <- dim(img2)/96
dev.new(width = dm[1], height = dm[2])
# Low pass filter with gblur and make binary
xb <- gblur(img2, 3)
xt <- thresh(xb, offset = 0.0001)
plot(xt) # thresh.jpg
# dev.print(jpeg, "thresh.jpg", width = dm[1], unit = "in", res = 96)
# Keep only "large" objects
xm <- bwlabel(xt)
FS <- computeFeatures.shape(xm)
sel <- which(FS[,"s.area"] < 800)
xe <- rmObjects(xm, sel)
# Make binary again and plot
xe <- thresh(xe)
plot(xe) # trimmed.jpg
# dev.print(jpeg, "trimmed.jpg", width = dm[1], unit = "in", res = 96)
# Choose cells with intact interiors
# This is done by hand here but with more pre-processing, it may be
# possible to have the image suitable for more automated analysis...
pp <- locator(type = "p", pch = 3, col = 2) # marked.jpg
# dev.print(jpeg, "marked.jpg", width = dm[1], unit = "in", res = 96)
# Fill interior of each cell with a unique integer
myCol <- seq_along(pp$x) + 1
xf1 <- floodFill(xe, do.call(rbind, pp), col = myCol)
# Discard original objects from threshold (value = 1) and see
cells1 <- rmObjects(xf1, 1)
plot(colorLabels(cells1))
# dev.print(jpeg, "cells1.jpg", width = dm[1], unit = "in", res = 96)
I need to introduce algorithms to connect integer points between vertices and fill a convex polygon. The code here implements Bresenham's algorithm and uses a simplistic polygon filling routine that works only for convex (simple) polygons.
#
# Bresenham's balanced integer line drawing algorithm
#
bresenham <- function(x, y = NULL, close = TRUE)
{
# accept any coordinate structure
v <- xy.coords(x = x, y = y, recycle = TRUE, setLab = FALSE)
if (!all(is.finite(v$x), is.finite(v$y)))
stop("finite coordinates required")
v[1:2] <- lapply(v[1:2], round) # Bresenham's algorithm IS for integers
nx <- length(v$x)
if (nx == 1) return(list(x = v$x, y = v$y)) # just one point
if (nx > 2 && close == TRUE) { # close polygon by replicating 1st point
v$x <- c(v$x, v$x[1])
v$y <- c(v$y, v$y[1])
nx <- nx + 1
}
# collect result in 'ans', starting with 1st point
ans <- lapply(v[1:2], "[", 1)
# process all vertices in pairs
for (i in seq.int(nx - 1)) {
x <- v$x[i] # coordinates updated in x, y
y <- v$y[i]
x.end <- v$x[i + 1]
y.end <- v$y[i + 1]
dx <- abs(x.end - x); dy <- -abs(y.end - y)
sx <- ifelse(x < x.end, 1, -1)
sy <- ifelse(y < y.end, 1, -1)
err <- dx + dy
# process one segment
while(!(isTRUE(all.equal(x, x.end)) && isTRUE(all.equal(y, y.end)))) {
e2 <- 2 * err
if (e2 >= dy) { # increment x
err <- err + dy
x <- x + sx
}
if (e2 <= dx) { # increment y
err <- err + dx
y <- y + sy
}
ans$x <- c(ans$x, x)
ans$y <- c(ans$y, y)
}
}
# remove duplicated points (typically 1st and last)
dups <- duplicated(do.call(cbind, ans), MARGIN = 1)
return(lapply(ans, "[", !dups))
}
And a simple routine to find interior points of a simple polygon.
#
# Return x,y integer coordinates of the interior of a CONVEX polygon
#
cPolyFill <- function(x, y = NULL)
{
p <- xy.coords(x, y = y, recycle = TRUE, setLab = FALSE)
p[1:2] <- lapply(p[1:2], round)
nx <- length(p$x)
if (any(!is.finite(p$x), !is.finite(p$y)))
stop("finite coordinates are needed")
yc <- seq.int(min(p$y), max(p$y))
xlist <- lapply(yc, function(y) sort(seq.int(min(p$x[p$y == y]), max(p$x[p$y == y]))))
ylist <- Map(rep, yc, lengths(xlist))
ans <- cbind(x = unlist(xlist), y = unlist(ylist))
return(ans)
}
Now these can be used along with ocontour() and chull() to create and fill a convex hull around each segmented cell. This "fixes" those cells with intrusions.
# Create convex hull mask
oc <- ocontour(cells1) # for all points along perimeter
oc <- lapply(oc, function(v) v + 1) # off-by-one flaw in ocontour
sel <- lapply(oc, chull) # find points that define convex hull
xh <- Map(function(v, i) rbind(v[i,]), oc, sel) # new vertices for convex hull
oc2 <- lapply(xh, bresenham) # perimeter points along convex hull
# Collect interior coordinates and fill
coords <- lapply(oc2, cPolyFill)
cells2 <- Image(0, dim = dim(cells1)) # blank image for mask
for(i in seq_along(coords))
cells2[coords[[i]]] <- i
xf2 <- xe # early binary mask
for (i in seq_along(coords))
xf2[coords[[i]]] <- i
# Compare before and after
img <- combine(colorLabels(xf1), colorLabels(cells1),
colorLabels(xf2), colorLabels(cells2))
plot(img, all = T, nx = 2)
labs <- c("xf1", "cells1", "xf2", "cells2")
ix <- c(0, 1, 0, 1)
iy <- c(0, 0, 1, 1)
text(dm[1]*96*(ix + 0.05), 96*dm[2]*(iy + 0.05), labels = labs,
col = "white", adj = c(0.05,1))
# dev.print(jpeg, "final.jpg", width = dm[1], unit = "in", res = 96)
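As a small standalone illustration of the chull() step used above, the following base R sketch finds the convex hull of a handful of made-up points. The coordinates are invented for the example; chull() returns the indices of the points that lie on the hull.

```r
# Four corners of a square plus two interior points (illustrative data)
pts <- rbind(c(0, 0), c(4, 0), c(4, 4), c(0, 4), c(2, 2), c(1, 3))

hull <- chull(pts)        # indices of the hull vertices, in order around the hull
hullPts <- pts[hull, ]    # the vertices themselves

plot(pts, asp = 1, pch = 16)
polygon(hullPts, border = "red")   # interior points fall inside the outline
```

Only the corner points are returned, which is why the hull vertices then need to be connected (here with bresenham) before the interior can be filled.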

How to clip an isosurface to a ball?

Consider the Togliatti implicit surface. I want to clip it to the ball centered at the origin with radius 4.8. One solution, with the misc3d package, is to use the mask argument of the computeContour3d function, which restricts the computation to the points satisfying x^2+y^2+z^2 < 4.8^2:
library(misc3d)
# Togliatti surface equation: f(x,y,z) = 0
f <- function(x,y,z){
w <- 1
64*(x-w)*
(x^4-4*x^3*w-10*x^2*y^2-4*x^2*w^2+16*x*w^3-20*x*y^2*w+5*y^4+16*w^4-20*y^2*w^2) -
5*sqrt(5-sqrt(5))*(2*z-sqrt(5-sqrt(5))*w)*(4*(x^2+y^2-z^2)+(1+3*sqrt(5))*w^2)^2
}
# make grid
nx <- 220; ny <- 220; nz <- 220
x <- seq(-5, 5, length=nx)
y <- seq(-5, 5, length=ny)
z <- seq(-4, 4, length=nz)
g <- expand.grid(x=x, y=y, z=z)
# calculate voxel
voxel <- array(with(g, f(x,y,z)), dim = c(nx,ny,nz))
# mask: keep points satisfying x^2+y^2+z^2 < 4.8^2, in order to
# clip the surface to the ball of radius 4.8
mask <- array(with(g, x^2+y^2+z^2 < 4.8^2), dim = c(nx,ny,nz))
# compute isosurface
surf <- computeContour3d(voxel, maxvol=max(voxel), level=0, mask=mask, x=x, y=y, z=z)
# draw isosurface
drawScene.rgl(makeTriangles(surf, smooth=TRUE))
But the borders of the resulting surface are irregular:
How to get regular, smooth borders?
The solution I found resorts to spherical coordinates. It consists of defining the function f in terms of spherical coordinates (ρ, θ, ϕ), computing the isosurface with ρ running from 0 to the desired radius, and then transforming the result to Cartesian coordinates:
# Togliatti surface equation with spherical coordinates
f <- function(ρ, θ, ϕ){
w <- 1
x <- ρ*cos(θ)*sin(ϕ)
y <- ρ*sin(θ)*sin(ϕ)
z <- ρ*cos(ϕ)
64*(x-w)*
(x^4-4*x^3*w-10*x^2*y^2-4*x^2*w^2+16*x*w^3-20*x*y^2*w+5*y^4+16*w^4-20*y^2*w^2) -
5*sqrt(5-sqrt(5))*(2*z-sqrt(5-sqrt(5))*w)*(4*(x^2+y^2-z^2)+(1+3*sqrt(5))*w^2)^2
}
# make grid
nρ <- 300; nθ <- 400; nϕ <- 300
ρ <- seq(0, 4.8, length = nρ) # ρ runs from 0 to the desired radius
θ <- seq(0, 2*pi, length = nθ)
ϕ <- seq(0, pi, length = nϕ)
g <- expand.grid(ρ=ρ, θ=θ, ϕ=ϕ)
# calculate voxel
voxel <- array(with(g, f(ρ,θ,ϕ)), dim = c(nρ,nθ,nϕ))
# calculate isosurface
surf <- computeContour3d(voxel, maxvol=max(voxel), level=0, x=ρ, y=θ, z=ϕ)
# transform to Cartesian coordinates
surf <- t(apply(surf, 1, function(rtp){
ρ <- rtp[1]; θ <- rtp[2]; ϕ <- rtp[3]
c(
ρ*cos(θ)*sin(ϕ),
ρ*sin(θ)*sin(ϕ),
ρ*cos(ϕ)
)
}))
# draw isosurface
drawScene.rgl(makeTriangles(surf, smooth=TRUE, color = "violetred"))
Now the resulting surface has regular, smooth borders:
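The row-by-row apply() conversion above can be slow for a large vertex matrix. A vectorized alternative might look like the following sketch, which uses the same convention (θ is the azimuth, ϕ the polar angle); sph2cart is a hypothetical helper name and the three test points are made up.

```r
# Hypothetical helper: convert an n x 3 matrix of (rho, theta, phi) rows
# to Cartesian coordinates, operating on whole columns at once.
sph2cart <- function(rtp) {
  rho <- rtp[, 1]; theta <- rtp[, 2]; phi <- rtp[, 3]
  cbind(x = rho * cos(theta) * sin(phi),
        y = rho * sin(theta) * sin(phi),
        z = rho * cos(phi))
}

# Three points that should land on the x, y and z axes respectively
pts <- rbind(c(1, 0,    pi/2),
             c(2, pi/2, pi/2),
             c(3, 0,    0))
xyz <- sph2cart(pts)
round(xyz, 10)
```

With this helper, the conversion step would reduce to something like surf <- sph2cart(surf).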
Your solution is excellent for the problem you stated, because spherical coordinates are so natural for that boundary. However, here is a more general solution that would work for other smooth boundaries.
The idea is to allow input of a boundary function, and cull points where its value is too large or too small. In your case it would be the squared distance from the origin, and you would want to cull points where the value is bigger than 4.8^2. But sometimes the triangles being drawn to make the smooth surface should only be partially culled: one point would be kept and two deleted, or two kept and one deleted. Culling the whole triangle is what leads to the jagged edges in your original plot.
To fix this, the points can be modified. If only one is supposed to be kept, then the other two points can be shrunk towards it until they lie on an approximation to the boundary. If two are supposed to be kept you want the shape to be a quadrilateral, so you would build that out of two triangles.
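As a rough illustration of the shrinking step on a single edge, here is a sketch with made-up vertex coordinates. The new vertex is placed where linear interpolation of the boundary function along the edge reaches zero, which only approximates the true boundary when that function is not linear.

```r
bound <- 4.8^2
good <- c(4, 0, 0)   # inside the ball: this vertex is kept
bad  <- c(6, 0, 0)   # outside the ball: this vertex is moved

# Signed values of the boundary function (squared distance from the origin),
# negated so that they are positive on the side to keep
goodval <- -(sum(good^2) - bound)   #   7.04
badval  <- -(sum(bad^2)  - bound)   # -12.96

# Fraction of the way from good to bad at which the interpolated value is zero
alpha <- goodval / (goodval - badval)
newvert <- (1 - alpha) * good + alpha * bad
newvert   # x = 4.704, close to but not exactly on the sphere at x = 4.8
```

The small gap between 4.704 and 4.8 is the approximation error of interpolating a quadratic function linearly; it shrinks as the triangles get smaller.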
This function does that, assuming the input surf is the output of computeContour3d:
boundSurface <- function(surf, boundFn, bound = 0, greater = TRUE) {
# surf is n x 3: each row is a point, consecutive triplets of rows are triangles
values <- matrix(boundFn(surf) - bound, 3)
# values is 3 x (m = n/3): each column holds the boundFn values at the
# three vertices of one triangle
if (!greater)
values <- -values
keep <- values >= 0
# counts is m vector counting number of points to keep in each triangle
counts <- apply(keep, 2, sum)
# result is initialized to an empty array
result <- matrix(nrow = 0, ncol = 3)
# singles is set to all the rows of surf where exactly one
# point in the triangle is kept, say s x 3
singles <- surf[rep(counts == 1, each = 3),]
if (length(singles)) {
# singleValues is a subset of values where only one vertex is kept
singleValues <- values[, counts == 1, drop = FALSE] # stay a matrix even with a single triangle
singleIndex <- 3*col(singleValues) + 1:3 - 3
# good is the index of the vertex to keep, bad are those to fix
good <- apply(singleValues, 2, function(col) which(col >= 0))
bad <- apply(singleValues, 2, function(col) which(col < 0))
for (j in 1:ncol(singleValues)) {
goodval <- singleValues[good[j], j]
for (i in 1:2) {
badval <- singleValues[bad[i,j], j]
alpha <- goodval/(goodval - badval)
singles[singleIndex[bad[i,j], j], ] <-
(1-alpha)*singles[singleIndex[good[j], j],] +
alpha *singles[singleIndex[bad[i,j], j],]
}
}
result <- rbind(result, singles)
}
doubles <- surf[rep(counts == 2, each = 3),]
if (length(doubles)) {
# doubleValues is a subset of values where two vertices are kept
doubleValues <- values[, counts == 2, drop = FALSE] # stay a matrix even with a single triangle
doubleIndex <- 3*col(doubleValues) + 1:3 - 3
doubles2 <- doubles
# good is the index of the vertex to keep, bad are those to fix
good <- apply(doubleValues, 2, function(col) which(col >= 0))
bad <- apply(doubleValues, 2, function(col) which(col < 0))
newvert <- matrix(NA, 2, 3)
for (j in 1:ncol(doubleValues)) {
badval <- doubleValues[bad[j], j]
for (i in 1:2) {
goodval <- doubleValues[good[i,j], j]
alpha <- goodval/(goodval - badval)
newvert[i,] <-
(1-alpha)*doubles[doubleIndex[good[i,j], j],] +
alpha *doubles[doubleIndex[bad[j], j],]
}
doubles[doubleIndex[bad[j], j],] <- newvert[1,]
doubles2[doubleIndex[good[1,j], j],] <- newvert[1,]
doubles2[doubleIndex[bad[j], j],] <- newvert[2,]
}
result <- rbind(result, doubles, doubles2)
}
# Finally add all the rows of surf where the whole
# triangle is kept
rbind(result, surf[rep(counts == 3, each = 3),])
}
You would use it after computeContour3d and before makeTriangles, e.g.
fn <- function(x) {
apply(x^2, 1, sum)
}
drawScene.rgl(makeTriangles(boundSurface(surf, fn, bound = 4.8^2,
greater = FALSE),
smooth = TRUE))
Here's the output I see:
It's not quite as good as yours, but it would work for many different boundary functions.
Edited to add: Version 0.100.26 of rgl now has a function clipMesh3d which incorporates these ideas.

R: Draw a polygon with conditional colour

I want to colour the area under a curve. The area with y > 0 should be red, the area with y < 0 should be green.
x <- c(1:4)
y <- c(0,1,-1,2,rep(0,4))
plot(y[1:4],type="l")
abline(h=0)
Using ifelse() does not work:
polygon(c(x,rev(x)),y,col=ifelse(y>0,"red","green"))
What I achieved so far is the following:
polygon(c(x,rev(x)),y,col="green")
polygon(c(x,rev(x)),ifelse(y>0,y,0),col="red")
But then the red area is too large. Do you have any ideas how to get the desired result?
If you want two different colors, you need two different polygons. You can either call polygon multiple times, or you can add NA values in your x and y vectors to indicate a new polygon. R will not automatically calculate the intersection for you. You must do that yourself. Here's how you could draw that with different colors.
x <- c(1,2,2.5,NA,2.5,3,4)
y <- c(0,1,0,NA,0,-1,0)
#calculate color based on most extreme y value
g <- cumsum(is.na(x))
gc <- ifelse(tapply(y, g,
function(x) x[which.max(abs(x))])>0,
"red","green")
plot(c(1, 4),c(-1,1), type = "n")
polygon(x, y, col = gc)
abline(h=0)
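The crossing points themselves (such as x = 2.5 above) can be found by linear interpolation between neighbouring points of opposite sign. A minimal base R sketch using the data from the question:

```r
x <- 1:4
y <- c(0, 1, -1, 2)

# Segments (i, i + 1) whose end points lie strictly on opposite sides of y = 0
i <- which(y[-length(y)] * y[-1] < 0)

# x value at which each of those segments crosses y = 0
x0 <- x[i] - y[i] * (x[i + 1] - x[i]) / (y[i + 1] - y[i])
x0   # 2.5 and 3.333...
```

Inserting these points (with y = 0) into the x and y vectors, separated by NA where a new polygon should start, gives one polygon per region above or below the axis.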
In the more general case, it might not be as easy to split a polygon into different regions. There seems to be some support for this type of operation in GIS packages, where this type of thing is more common. However, I've put together a somewhat general case that may work for simple polygons.
First, I define a closure that will define a cutting line. The function will take a slope and y-intercept for a line and will return the functions we need to cut a polygon.
getSplitLine <- function(m=1, b=0) {
force(m); force(b)
classify <- function(x,y) {
y >= m*x + b
}
intercepts <- function(x,y, class=classify(x,y)) {
w <- which(diff(class)!=0)
m2 <- (y[w+1]-y[w])/(x[w+1]-x[w])
b2 <- y[w] - m2*x[w]
ix <- (b2-b)/(m-m2)
iy <- ix*m + b
data.frame(x=ix,y=iy,idx=w+.5, dir=((rank(ix, ties="first")+1) %/% 2) %% 2 +1)
}
plot <- function(...) {
abline(b,m,...)
}
list(
intercepts=intercepts,
classify=classify,
plot=plot
)
}
Now we will define a function to actually split a polygon using the splitter we've just defined.
splitPolygon <- function(x, y, splitter) {
addnullrow <- function(x) if (!all(is.na(x[nrow(x),]))) rbind(x, NA) else x
rollup <- function(x,i=1) rbind(x[(i+1):nrow(x),], x[1:i,])
idx <- cumsum(is.na(x) | is.na(y))
polys <- split(data.frame(x=x,y=y)[!is.na(x),], idx[!is.na(x)])
r <- lapply(polys, function(P) {
x <- P$x; y<-P$y
side <- splitter$classify(x, y)
if(side[1] != side[length(side)]) {
ints <- splitter$intercepts(c(x,x[1]), c(y, y[1]), c(side, side[1]))
} else {
ints <- splitter$intercepts(x, y, side)
}
sideps <- lapply(unique(side), function(ss) {
pts <- data.frame(x=x[side==ss], y=y[side==ss],
idx=seq_along(x)[side==ss], dir=0)
mm <- rbind(pts, ints)
mm <- mm[order(mm$idx), ]
br <- cumsum(mm$dir!=0 & c(0,head(mm$dir,-1))!=0 &
c(0,diff(mm$idx))>1)
if (length(unique(br))>1) {
mm<-rollup(mm, sum(br==br[1]))
}
br <- cumsum(c(FALSE,abs(diff(mm$dir*mm$dir))==3))
do.call(rbind, lapply(split(mm, br), addnullrow))
})
pss<-rep(unique(side), sapply(sideps, nrow))
ps<-do.call(rbind, lapply(sideps, addnullrow))[,c("x","y")]
attr(ps, "side")<-pss
ps
})
pss<-unname(unlist(lapply(r, attr, "side")))
src <- rep(seq_along(r), sapply(r, nrow))
r <- do.call(rbind, r)
attr(r, "source")<-src
attr(r, "side")<-pss
r
}
The input is just the values of x and y as you would pass to polygon along with the cutter. It will return a data.frame with x and y values that can be used with polygon.
For example
x <- c(1,2,2.5,NA,2.5,3,4)
y <- c(1,-2,2,NA,-1,2,-2)
sl<-getSplitLine(0,0)
plot(range(x, na.rm=T),range(y, na.rm=T), type = "n")
p <- splitPolygon(x,y,sl)
g <- cumsum(c(F, is.na(head(p$y,-1))))
gc <- ifelse(attr(p,"side")[is.na(p$y)],
"red","green")
polygon(p, col=gc)
sl$plot(lty=2, col="grey")
This should work for simple concave polygons as well as with sloped lines. Here's another example
x <- c(1,2,3,4,5,4,3,2)
y <- c(-2,2,1,2,-2,.5,-.5,.5)
sl<-getSplitLine(.5,-1.25)
plot(range(x, na.rm=T),range(y, na.rm=T), type = "n")
p <- splitPolygon(x,y,sl)
g <- cumsum(c(F, is.na(head(p$y,-1))))
gc <- ifelse(attr(p,"side")[is.na(p$y)],
"red","green")
polygon(p, col=gc)
sl$plot(lty=2, col="grey")
Right now things can get a bit messy when a vertex of the polygon falls directly on the splitting line. I may try to correct that in the future.
A faster, but less accurate, solution is to split the data frame into a list according to a grouping variable (e.g. above = red and below = blue). This is a pretty nice workaround for fairly large datasets (I would say > 100 elements). For smaller datasets some discontinuity may be visible:
x <- 1:100
y1 <- sin(1:100/10)*0.8
y2 <- sin(1:100/10)*1.2
plot(x, y2, type='l')
lines(x, y1, col='red')
df <- data.frame(x=x, y1=y1, y2=y2)
df$pos_neg <- ifelse(df$y2-df$y1>0,1,-1) # above (1) or below (-1) average
# create the number for chunks to be split into lists:
df$chunk <- c(1,cumsum(abs(diff(df$pos_neg)))/2+1) # first element needs to be added
df$colors <- ifelse(df$pos_neg>0, "red","blue") # colors to be used for filling the polygons
# create lists to be plotted:
l <- split(df, df$chunk) # we should get 4 sub-lists
# each chunk has a single colour, so pass only its first element; invisible() hides the list of NULLs
invisible(lapply(l, function(d) polygon(c(d$x,rev(d$x)),c(d$y2,rev(d$y1)),col=d$colors[1])))
As I said, for smaller datasets some discontinuity may be visible if sharp changes occur between positive and negative areas, but if a horizontal line separates the two, or more elements are plotted, this effect is negligible:
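The chunk numbering can also be derived with rle(), which does not rely on the grouping variable being coded as 1 and -1. A minimal sketch on a made-up sign vector (pos_neg here is illustrative data, not the data frame column above):

```r
pos_neg <- c(1, 1, -1, -1, -1, 1)

# Run-length encoding: one entry per run of identical values
r <- rle(pos_neg)

# Number the runs and expand back to one label per element
chunk <- rep(seq_along(r$lengths), r$lengths)
chunk

# Indices of the elements belonging to each chunk
split(seq_along(pos_neg), chunk)
```

For this vector the result agrees with the cumsum(abs(diff(...))) formula used above.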

change specific word color in wordcloud

I would like to build a word cloud with R (I have done so with the package wordcloud) and then color specific words a certain color. Currently the behavior of the function is to color words according to frequency (which can be useful) but word size already does this so I'd want to use color for additional meaning.
Any idea on how to color specific words in wordcloud? (If there's another wordcloud function in R I'm unaware of I'm more than willing to go that route.)
A mock example and my attempt (I tried to treat the color argument in the same manner I would in a regular plot from the plot function):
library(wordcloud)
x <- paste(rep("how do keep the two words as one chunk in the word cloud", 3),
collapse = " ")
X <- data.frame(table(strsplit(x, " ")))
COL <- ifelse(X$Var1 %in% c("word", "cloud", "words"), "red", "black")
wordcloud(X$Var1, X$Freq, color=COL)
EDIT: I wanted to add that the new version of wordcloud (Jan 10, 2010; version 2.0) [thank you Ian Fellows & David Robinson] now has this feature, along with some other terrific additions.
Here is the code to accomplish the original goal within wordcloud:
wordcloud(X$Var1, X$Freq, colors=COL, ordered.colors=TRUE, random.color=FALSE)
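If more than one highlight colour is wanted, the colour vector can be built from a named lookup instead of a single ifelse() test. A minimal base R sketch; the words and the highlight mapping are made up for illustration:

```r
words <- c("how", "do", "word", "cloud", "chunk")

# Hypothetical lookup: names are the words to highlight, values their colours
highlight <- c(word = "red", cloud = "red", chunk = "blue")

# Use the lookup colour where the word is listed, black otherwise
COL <- ifelse(words %in% names(highlight), highlight[words], "black")
COL
```

The resulting vector has one colour per word, in the same order as words, which is the form that the ordered.colors = TRUE call above expects.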
EDIT: As described in the comments, the feature described below has now been added to the wordcloud library.
My approach was to take the R function's code and customize it. It required changing only a few lines, and can now take either a single color or a vector of colors of the same length as words.
library(wordcloud)
colored.wordcloud <- function(words,freq,scale=c(4,.5),min.freq=3,max.words=Inf,random.order=TRUE,random.color=FALSE,
rot.per=.1,colors="black",ordered.colors=FALSE,use.r.layout=FALSE,...) {
tails <- "g|j|p|q|y"
last <- 1
nc<- length(colors)
if (ordered.colors) {
if (length(colors) != 1 && length(colors) != length(words)) {
stop(paste("Length of colors does not match length of words",
"vector"))
}
}
overlap <- function(x1, y1, sw1, sh1) {
if(!use.r.layout)
return(.overlap(x1,y1,sw1,sh1,boxes))
s <- 0
if (length(boxes) == 0)
return(FALSE)
for (i in c(last,1:length(boxes))) {
bnds <- boxes[[i]]
x2 <- bnds[1]
y2 <- bnds[2]
sw2 <- bnds[3]
sh2 <- bnds[4]
if (x1 < x2)
overlap <- x1 + sw1 > x2-s
else
overlap <- x2 + sw2 > x1-s
if (y1 < y2)
overlap <- overlap && (y1 + sh1 > y2-s)
else
overlap <- overlap && (y2 + sh2 > y1-s)
if(overlap){
last <<- i
return(TRUE)
}
}
FALSE
}
ord <- rank(-freq, ties.method = "random")
words <- words[ord<=max.words]
freq <- freq[ord<=max.words]
if (ordered.colors) {
colors <- colors[ord<=max.words]
}
if(random.order)
ord <- sample.int(length(words))
else
ord <- order(freq,decreasing=TRUE)
words <- words[ord]
freq <- freq[ord]
# compute the filter once, before freq itself is shortened, so that
# words, freq and colors stay aligned
keep <- freq>=min.freq
if (ordered.colors) {
colors <- colors[ord][keep]
}
words <- words[keep]
freq <- freq[keep]
thetaStep <- .1
rStep <- .05
plot.new()
op <- par("mar")
par(mar=c(0,0,0,0))
plot.window(c(0,1),c(0,1),asp=1)
normedFreq <- freq/max(freq)
size <- (scale[1]-scale[2])*normedFreq + scale[2]
boxes <- list()
for(i in 1:length(words)){
rotWord <- runif(1)<rot.per
r <-0
theta <- runif(1,0,2*pi)
x1<-.5
y1<-.5
wid <- strwidth(words[i],cex=size[i],...)
ht <- strheight(words[i],cex=size[i],...)
#mind your ps and qs
if(grepl(tails,words[i]))
ht <- ht + ht*.2
if(rotWord){
tmp <- ht
ht <- wid
wid <- tmp
}
isOverlaped <- TRUE
while(isOverlaped){
if(!overlap(x1-.5*wid,y1-.5*ht,wid,ht) &&
x1-.5*wid>0 && y1-.5*ht>0 &&
x1+.5*wid<1 && y1+.5*ht<1){
if (!random.color) {
if (ordered.colors) {
cc <- colors[i]
}
else {
cc <- ceiling(nc*normedFreq[i])
cc <- colors[cc]
}
} else {
cc <- colors[sample(1:nc,1)]
}
text(x1,y1,words[i],cex=size[i],offset=0,srt=rotWord*90,
col=cc,...)
#rect(x1-.5*wid,y1-.5*ht,x1+.5*wid,y1+.5*ht)
boxes[[length(boxes)+1]] <- c(x1-.5*wid,y1-.5*ht,wid,ht)
isOverlaped <- FALSE
}else{
if(r>sqrt(.5)){
warning(paste(words[i],
"could not be fit on page. It will not be plotted."))
isOverlaped <- FALSE
}
theta <- theta+thetaStep
r <- r + rStep*thetaStep/(2*pi)
x1 <- .5+r*cos(theta)
y1 <- .5+r*sin(theta)
}
}
}
par(mar=op)
invisible()
}
Some code to try it out:
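colors <- c("blue", "red", "orange", "green")
# ordered.colors=TRUE so that each word is drawn in the colour at the same position
colored.wordcloud(colors, c(10, 5, 3, 9), colors=colors, ordered.colors=TRUE)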
