Adding transparent circles of defined radius to existing plot in R

I have a data.frame with X and Y coordinate values. The X axis is position information and the Y axis is log ratio values. The points are colored by log ratio (green > 0.25, -0.25 < grey < 0.25, and red < -0.25). The orange dashed horizontal lines mark log2 values of 0.58, 0, and -1.
A circular binary segmentation algorithm segments changes in log ratio, indicated by horizontal blue lines. In the attached image one can see several segments, most of them close to a log2 of 0. Close to the left end of the figure is a small blue segment with a log2 value close to 0.58, and at the right edge of the plot there is a much smaller blue segment (almost invisible because of the surrounding red points) with a log2 value close to -1. I have the x and y coordinates of these blue segments in another data.frame. I want to achieve the following:
1) Add circles bounding the blue segments with log2 < -0.70 or log2 > 0.50. This helps in identifying small segments that could otherwise be missed.
2) Fill these circles with a transparent color, using alpha values, so that the blue segment stays visible.
3) Base the size of each circle on the width of its blue segment.
I am also open to other ideas for highlighting the blue segments with log2 < -0.70 or log2 > 0.50. Maybe I should suppress plotting the points (green and red) where these blue segments are found. I am using R to make this plot. I appreciate the help.
This was the code used. There are two data.frames. X contains Chr.No, Chr.Start, Chr.End and Log2. Y is similar, but has different column names such as loc.start and loc.end, and seg.mean values instead of Log2.
for (i in 1:25) { # Plot each chromosome separately
plot(X[which(X$Chr.No ==i),"Chr.Start"], X[which(X$Chr.No ==i),"Log2"], ylim=c(-4.0,4.0), col=X[which(X$Chr.No ==i),"Color"], pch=16, cex=0.4, ylab="Log2", xlab="Genomic Position", main= paste("KCL:180522_SS", "chromosome", i, sep=" "))
abline(h=c(-1,0,log2(3/2)), lty=2, col="chocolate")
xleft = Y[which(Y$Chr.No ==i),"loc.start"] # Left limit of the blue horizontal line
xright = Y[which(Y$Chr.No ==i),"loc.end"] # Right limit of the blue horizontal line
ybottom= Y[which(Y$Chr.No ==i),"seg.mean"] - 0.010 # Adding thickness to the "seg.mean"
ytop = Y[which(Y$Chr.No ==i),"seg.mean"] + 0.010 # Adding thickness to the "seg.mean"
rect(xleft=xleft, ybottom=ybottom, xright=xright, ytop=ytop, col="blue", border="blue")
}
#Dwin Yes, "Color" is a vector of "lightgreen", "grey" and "red". These are the colors of the pch=16 points in plot(x,y). I do not want to modify the pch=16 points. The horizontal "blue" line segments are added by 'rect', and they span many pch=16 points. As you can see there are many "blue" segments, some very short and some long, that differ in their log2 values. These are what I want to bound with a filled transparent circle. Not all "blue" segments, but only those whose log2 value lies outside the ±0.25 band. In this figure the smaller "blue" segments are close to the edges of the plot, and since they are difficult to spot, I want to highlight them with a filled circle around them. Please let me know if I am still not clear. Thanks

(Deleted incorrect method based on guess about the manner in which the blue points (which were really segments) were being constructed.)
Edit: With the new information I would suggest drawing ordinary "points", i.e., open circles, at the x-vector formed by (xleft+xright)/2 and the y-vector given by ytop (which should be nearly the same as ybottom), restricted to the ytop values that meet your criteria. You would make a logical vector to select the matching elements of each of these vectors. So:
selvec <- ytop < -0.70 | ytop > 0.5
points ( x= (xleft[selvec]+xright[selvec])/2, y= ytop[selvec], cex =1.5, col="blue")
You could also use transparency if you used the rgb() function to create a color with transparency:
points ( x= (xleft[selvec]+xright[selvec])/2, y= ytop[selvec], cex = 2, col=rgb(0, 0, 1, 0.3) )
This should give you transparent circles if your output device supports transparency.
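The question also asks for circles sized by the width of each segment. One possible sketch, assuming base graphics: symbols() with inches = FALSE reads the radii in x-axis units, so half the segment width can be used directly. The table `seg` below is made-up illustration data standing in for one chromosome of Y, not the asker's actual values.

```r
# Hypothetical segments for one chromosome (stand-in for the asker's Y)
seg <- data.frame(
  loc.start = c(1e6, 2.0e7, 6.0e7, 9.5e7),
  loc.end   = c(5e6, 5.5e7, 9.0e7, 9.8e7),
  seg.mean  = c(0.58, 0.02, -0.05, -1.00)
)

# Select segments outside the band of interest
sel    <- seg$seg.mean < -0.70 | seg$seg.mean > 0.50
mid    <- (seg$loc.start + seg$loc.end) / 2   # circle centers on the x axis
radius <- (seg$loc.end - seg$loc.start) / 2   # half the segment width, in x units

# Empty plot with the same y range as in the question
plot(c(0, 1e8), c(-4, 4), type = "n",
     xlab = "Genomic Position", ylab = "Log2")
rect(seg$loc.start, seg$seg.mean - 0.01, seg$loc.end, seg$seg.mean + 0.01,
     col = "blue", border = "blue")

# Filled, semi-transparent circles around the selected segments
symbols(mid[sel], seg$seg.mean[sel], circles = radius[sel],
        inches = FALSE, add = TRUE,
        fg = "blue", bg = rgb(0, 0, 1, 0.3))
```

Because the radius follows the segment width, very short segments give very small circles; if the aim is to make tiny segments easier to spot, a minimum radius such as pmax(radius, some_floor) may be worth adding, where some_floor is a value you would choose for your own x range.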

Related

Finding xy coordinates of shelves in a store floorplan in r

I'm working on the following: I have a store layout, example see below (cannot add the real thing for GDPR reasons but the example should do the trick) on which I have xy coordinates from visitors (anonymous of course)
I already placed a grid on the picture so I can see which route they take in the store. That works fine. origin is bottom left and x & y are scaled from 0-100.
So far so good. Now next step is identifying the coordinates of the shelves, rectangles in the picture. Is there a way to do this without having to do this manually? The real store layout contains more than 900 shelves or am I pushing out the boat too far?
The output I'm looking for is a dataframe that contains a shelve ID and the coordinates for the corners. Idea is to create some heatmaps in the store to see that there are blind spots, hotspots, ...
The second analysis needs also the integer points. The idea is to create vectors of visitor points so we get a direction to which they are looking. By using the scope of what a human being can see I would give percentages of "seen" the products based on intersection with integer points.
thx!
JL
One approach is to perform clustering on the black pixels of the image. The clusters are then the shelves. If the shelves are axis parallel you can find the rectangles by just taking min/max in each direction. This works quite well:
Sample code (I converted the image to PNG as it is easier to read than gif):
library(png)
library(dbscan)
library(tidyverse)
library(RColorBrewer)
img <- readPNG("G18JU.png")
is_black <-
img %>%
apply(c(1, 2), sum) %>% #sum all color channels
{. < 2.5} %>% # we assume black if the sum is lower than 2.5 (max value is 3)
which(arr.ind=TRUE) # the indices of the black pixels
clust <- dbscan(is_black, 2) # identify clusters
rects <-
as_tibble(is_black) %>% # as.tibble() is deprecated in favour of as_tibble()
mutate(cluster = clust$cluster) %>% # add cluster information
group_by(cluster) %>%
## find corner points of rectangles normalized to [0, 1]
## (image rows count from the top, so the largest row index is the bottom edge)
summarise(xleft = min(col) / dim(img)[2],
ybottom = 1 - max(row) / dim(img)[1],
xright = max(col) / dim(img)[2],
ytop = 1 - min(row) / dim(img)[1])
## plot the image and the rectangles
plot(c(0, 1), c(0, 1), type="n")
rasterImage(img, 0, 0, 1, 1)
for (i in seq_len(nrow(rects))) {
rect(rects$xleft[i], rects$ybottom[i], rects$xright[i], rects$ytop[i],
border = brewer.pal(nrow(rects), "Paired")[i], lwd = 2)
}
Of course this approach also detects other black lines as "rectangles" (e.g. the black border). But I guess you can easily create a "clean" image.
Edit: extend method to find shelves that share a black line
To extend the method such that it can separate shelves that share a black line:
First, identify the rectangles in the way outlined above.
Then, extract each rectangle from the image and compute the row means. This gives you a 1d image (= line) for each rectangle. In this line apply thresholding and clustering as before. The clusters are now the black line segments, and the mean of each cluster corresponds to a vertical line shared by two shelves.
To find horizontal shared lines, the same procedure can be applied, but with column means instead of row means.
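One reading of the extension above, sketched in base R: average each pixel column over its rows to get a 1-D profile, threshold it, and group consecutive dark positions. The mean position of each group is then a candidate vertical line. The matrix `patch` is a synthetic stand-in for a rectangle cropped from the image, and grouping consecutive indices replaces the dbscan step only to keep the sketch free of dependencies.

```r
# Synthetic grayscale patch (1 = white, 0 = black): two shelves side by side
patch <- matrix(1, nrow = 20, ncol = 30)
patch[c(1, 20), ] <- 0   # top and bottom border
patch[, c(1, 30)] <- 0   # left and right border
patch[, 15:16]    <- 0   # shared vertical line, two pixels wide

# Average over rows: one brightness value per pixel column
col_profile <- colMeans(patch)

# Threshold: columns that are mostly black
dark_cols <- which(col_profile < 0.5)

# Group consecutive column indices into runs (1-D "clusters")
run_id <- cumsum(c(1, diff(dark_cols) != 1))

# Mean position of each run = estimated position of a vertical line
line_pos <- as.numeric(tapply(dark_cols, run_id, mean))
line_pos
```

The first and last positions are the outer border of the patch; anything in between is a line shared by two neighbouring shelves. Using rowMeans(patch) in place of colMeans(patch) gives the corresponding horizontal lines.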

R raster recognizing black color raster image

The code below produces two boxes on my image. I am planning to analyze pixels within those boxes further.
I want to add a condition: if there is a black (or similar, such as grey) pixel along an edge of a box, then don't proceed. How can I specify such a condition?
In the example below, I don't want to proceed with the red square, as it has black pixels at its top right-hand corner, while I would like to proceed with the green square, as it doesn't have a black pixel along its edge.
library(raster)
r1 <- brick(system.file("external/rlogo.grd", package="raster"))
x <- crop(r1, extent(0,50,0,50))
plotRGB(x)
plot(extent(c(0,20,0,20)), lwd=2, col="red", add=TRUE)
plot(extent(c(21,35,0,10)), lwd=2, col="Green", add=TRUE)
This is not very well defined, because in this case a color is made up of RGB values. But here is a general solution that you could adapt. I 'flatten' the three channels to a single one by taking a weighted average, and then test whether the smallest value along the boundary falls below a threshold (white is 255, 255, 255 in RGB; black is 0, 0, 0).
proceed <- function(f, e, threshold) {
lns <- as(as(e, 'SpatialPolygons'), 'SpatialLines')
v <- unlist(extract(f, lns))
min(v, na.rm=TRUE) >= threshold # TRUE only if no boundary pixel is darker than the threshold
}
# flat <- mean(x) # not sophisticated see
# http://stackoverflow.com/questions/687261/converting-rgb-to-grayscale-intensity
flat <- sum(x * c(0.2989, 0.5870, 0.1140))
proceed(flat, extent(c(0,20,0,20)), 100)
proceed(flat, extent(c(21,35,0,10)), 100)
(much improved after seeing jbaums' solution; which is now gone)
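The same edge test can be sketched without the raster package, on a plain matrix of gray values. The matrix `gray`, the helper `edge_ok`, and the row and column ranges below are all hypothetical and only illustrate the logic: collect the pixels on the four sides of a box and check the darkest one against a threshold.

```r
# Hypothetical helper: TRUE if no pixel on the box boundary is darker than threshold
edge_ok <- function(m, rows, cols, threshold) {
  edge <- c(m[rows[1], cols],             # top side
            m[rows[length(rows)], cols],  # bottom side
            m[rows, cols[1]],             # left side
            m[rows, cols[length(cols)]])  # right side
  min(edge, na.rm = TRUE) >= threshold
}

# Synthetic 50 x 50 white image with a single dark pixel
gray <- matrix(255, nrow = 50, ncol = 50)
gray[5, 20] <- 30

edge_ok(gray, rows = 1:20,  cols = 1:20,  threshold = 100)  # dark pixel lies on the right side
edge_ok(gray, rows = 30:40, cols = 21:35, threshold = 100)  # boundary is all white
```

The first call returns FALSE and the second TRUE, mirroring the red and green boxes in the question.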

how to change the size, color of points in a scatter plot in R

You can find the example data below.
I want to color the points higher than 0 in one color and those lower than 0 in another. Is there any way to know which points they are? Ideally I want to add a boundary at 1 and at -1, show the points higher than 1 in one color and print their names next to them, and do the same for the points lower than -1 in another color.
The following suggestion from a comment did not help, since it colors points red seemingly at random:
x=(1:990)
cl = 1*(z>0) + 2*(z<=0)
cx = 1*(z>0) + 1.2*(z<=0)
plot(y~x, col=cl, cex=cx)
I don't want to generate red and black points around zero.
I want to detect those points higher and lower than 1 and -1 respectively.
I also want to plot them in different color and different size
Generate some data around 0:
d<-rnorm(1000,0,1)
To get the points higher than 0:
d[d>0]
To identify the index of points higher than 0:
which(d>0)
Plot points above 0 in green below 0 in red. Also, points above 0 will be a different size than points below 0:
s <- character(length(d))
s[d>0] <- "green"
s[d<0] <- "red"
# s[d > -0.5 & d < 0.5] <- "black" # to color points between 0.5 and -0.5 black
plot(d, col=s) # color effect only
sz <- numeric(length(d))
sz[d>0] <- 4 # I'm giving points greater than 0 a size of 4
sz[d<0] <- 1
plot(d, col=s, cex=sz) # size and color effect
Now, you also mention points above and below 1 and -1, respectively. You should be able to follow the code above to do what you want.
To add labels to points meeting a certain condition (e.g. greater than or less than 0.2 and -0.2, respectively), you can use the text function:
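text(which(abs(d) > .2), d[abs(d) > .2], labels = which(abs(d) > .2), cex = 0.5, pos=3) # without labels=, text() would number the labels 1, 2, 3, ... instead of using the point indices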
pos = 3 means to put the label above the point, and the cex argument to text is for adjusting the label size.
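Putting the pieces above together for the ±1 case from the question, a minimal sketch could look like this. The point names ("p1", "p2", ...) are invented for illustration, since the original data are not shown.

```r
set.seed(42)
d <- rnorm(200)
names(d) <- paste0("p", seq_along(d))  # hypothetical point names

# Color and size depend on which side of the +/-1 band a point falls
cols <- ifelse(d > 1, "green", ifelse(d < -1, "red", "grey"))
sz   <- ifelse(abs(d) > 1, 1.5, 0.6)

plot(d, col = cols, cex = sz, pch = 16)
abline(h = c(-1, 1), lty = 2)          # the two boundaries

# Indices of the points outside the band, labelled with their names
out <- which(abs(d) > 1)
text(out, d[out], labels = names(d)[out], cex = 0.6, pos = 3)
```

`out` holds the positions of the points beyond the boundaries, so names(d)[out] answers the "which points are they" part directly.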
As the comments mentioned, there are many ways of doing this. Assuming that you are using the plot() function, here's a simple way of doing what you want. The key is to understand the arguments of plot(). The color of the points is determined by col, their size by cex, and so forth. These should all be vectors of the same length as y (otherwise the recycling rule is used). See ?plot.
N = 999 # I don't care how many obs you have
y = rnorm(N)
# vector of colors (1 = black for y>0, 2 = red for y<=0)
cl = 1*(y>0) + 2*(y<=0)
# vector of point sizes relative to default (1 for y>0, 1.2 y<=0)
cx = 1*(y>0) + 1.2*(y<=0)
plot(y, col=cl, cex=cx)
Edit:
I tried to give a general example (eg, coloring points by a third variable), but OP insists he had 2 variables. Well, just rename z by say x.
Edit:
# last edit I make
set.seed(1)
y = rnorm(N)
cl = rep(1, length(y))
cl[y > 0.5] = 2
cl[y < -0.5] = 3
plot(y, col=cl)
And here's what it gives:

How do I interpret the output of corrplot?

The corrplot package provides some neat plots and documentation with examples.
But I don't understand the output. I can see that if you have a matrix A_ij, you can plot it as an arrangement of n by n square tiles, where the color of tile ij corresponds to the value of A_ij. But some examples appear to have more dimensions:
Here we can guess that color shows the correlation coefficient, and orientation of the ellipse is negative/positive correlation. What is the eccentricity?
The documentation for method says:
the visualization method of correlation matrix to be used. Currently, it supports seven methods, named "circle" (default), "square", "ellipse", "number", "pie", "shade" and "color". See examples for details.
The areas of circles or squares show the absolute value of corresponding correlation coefficients. Method "pie" and "shade" came from Michael Friendly’s job (with some adjustment about the shade added on), and "ellipse" came from D.J. Murdoch and E.D. Chow’s job, see in section References.
So we know that the area, for circles and squares, should show the coefficient. What about the other dimensions, and other methods?
There is only one dimension shown by the plot.
Michael Friendly, in Corrgrams: Exploratory displays for correlation matrices (the corrplot documentation confusingly refers to this as his "job"), says:
In the shaded row, each cell is shaded blue or red depending on the sign of the correlation, and with the intensity of color scaled 0–100% in proportion to the magnitude of the correlation. (Such scaled colors are easily computed using RGB coding from red, (1, 0, 0), through white (1, 1, 1), to blue (0, 0, 1). For simplicity, we ignore the non-linearities of color reproduction and perception, but note that these are easily accommodated in the color mapping function.) White diagonal lines are added so that the direction of the correlation may still be discerned in black and white. This bipolar scale of color was chosen to leave correlations near 0 empty (white), and to make positive and negative values of equal magnitude approximately equally intensely shaded. Gray scale and other color schemes are implemented in our software (Section 6), but not illustrated here.
The bar and circular symbols also use the same scaled colors, but fill an area proportional to the absolute value of the correlation. For the bars, negative values are filled from the bottom, positive values from the top. The circles are filled clockwise for positive values, anti-clockwise for negative values. The ellipses have their eccentricity parametrically scaled to the correlation value (Murdoch and Chow, 1996). Perceptually, they have the property of becoming visually less prominent as the magnitude of the correlation increases, in contrast to the other glyphs.
(emphasis mine)
"Murdoch and Chow, 1996" is a publication describing the equation for drawing the ellipses (A Graphical Display of Large Correlation Matrices). The ellipses are apparently meant to be caricatures of bivariate normal distributions:
So in conclusion, the only dimension shown is always the correlation coefficient (or the value of A_ij, to use the question's terminology) itself. The multiple apparent dimensions are redundant.
I think the plot is quite self explanatory. On the right hand side you have the scale which is colored from red (negative correlation) to blue (positive correlation). The color follows a gradient according to the strength of the correlation.
If the ellipse leans towards the right, it is again positive correlation and if it leans to the left, it is negative correlation.
The diffusion around a line (which denotes perfect correlation, for example mpg ~ mpg) creates an ellipse. You will have a more diffused ellipse for lower strengths of the correlation. This is typically how a weakly correlated relationship will look in a scatterplot. These I think are caricatures, however.
Here is some code from the corrplot function responsible for drawing ellipses. I am not going to attempt to explain this (because it is a part of a larger system). I wanted to show that the logic is all there if you'd like to deep dive into it:
if (method == "ellipse" & plotCI == "n") {
ell.dat <- function(rho, length = 99) {
k <- seq(0, 2 * pi, length = length)
x <- cos(k + acos(rho)/2)/2
y <- cos(k - acos(rho)/2)/2
return(cbind(rbind(x, y), c(NA, NA)))
}
ELL.dat <- lapply(DAT, ell.dat)
ELL.dat2 <- 0.85 * matrix(unlist(ELL.dat), ncol = 2,
byrow = TRUE)
ELL.dat2 <- ELL.dat2 + Pos[rep(1:length(DAT), each = 100),
]
polygon(ELL.dat2, border = col.border, col = col.fill)
}
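To see how the ellipse shape depends on the correlation, the parametrization from the excerpt above can be evaluated on its own. This is only a sketch: `ellipse_xy` copies the x and y formulas from `ell.dat` (without the NA separator), while `axis_ratio` is a hypothetical helper added here to measure how narrow the resulting ellipse is.

```r
# Same x/y formulas as ell.dat in the corrplot excerpt
ellipse_xy <- function(rho, n = 401) {
  k <- seq(0, 2 * pi, length.out = n)
  cbind(x = cos(k + acos(rho) / 2) / 2,
        y = cos(k - acos(rho) / 2) / 2)
}

# Hypothetical helper: ratio of the short axis to the long axis (1 = circle)
axis_ratio <- function(rho) {
  p <- ellipse_xy(rho)
  along  <- (p[, "x"] + p[, "y"]) / sqrt(2)  # extent along the 45-degree diagonal
  across <- (p[, "x"] - p[, "y"]) / sqrt(2)  # extent along the other diagonal
  r <- max(abs(across)) / max(abs(along))
  min(r, 1 / r)
}

ratios <- sapply(c(0, 0.5, 0.8, -0.8), axis_ratio)
ratios

# Draw two of the ellipses for comparison
plot(ellipse_xy(0.8), type = "l", asp = 1, xlim = c(-0.5, 0.5), ylim = c(-0.5, 0.5))
lines(ellipse_xy(-0.3), lty = 2)
```

For this parametrization the ratio works out to sqrt((1 - |rho|) / (1 + |rho|)), so a correlation of 0 gives a circle and the ellipse narrows towards a line as |rho| approaches 1. The sign of rho only decides which diagonal the long axis follows, which matches the conclusion that the glyph encodes a single quantity.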

R: Counting points on a grid of rectangles:

I have a grid of rectangles, whose coordinates are stored in the variable say, 'gridPoints' as shown below:
gridData.Grid=GridTopology(c(min(data$LATITUDE),min(data$LONGITUDE)),c(0.005,0.005),c(32,32));
gridPoints = as.data.frame(coordinates(gridData.Grid))[1:1000,];
names(gridPoints) = c("LATITUDE","LONGITUDE");
plot(gridPoints,col=4);
points(data,col=2);
When plotted, these are the black points in the image,
Now, I have another data set of points called say , 'data', which when plotted are the blue points above.
I would want a count of how many blue points fall within each rectangle in the grid. Each rectangle can be represented by the center of the rectangle, along with the corresponding count of blue points within it in the output. Also, if the blue point lies on any of the sides of the rectangle, it can be considered as lying within the rectangle while making the count. The plot has the blue and black points looking like circles, but they are just standard points/coordinates and hence, much smaller than the circles. In a special case, the rectangle can also be a square.
Try this,
x <- seq(0,10,by=2)
y <- seq(0, 30, by=10)
grid <- expand.grid(x, y)
N <- 100
points <- cbind(runif(N, 0, 10), runif(N, 0, 30))
plot(grid, t="n", xaxs="i", yaxs="i")
points(points, col="blue", pch="+")
abline(v=x, h=y)
binxy <- data.frame(x=findInterval(points[,1], x),
y=findInterval(points[,2], y))
(results <- table(binxy))
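d <- as.data.frame.table(results, stringsAsFactors = FALSE)
# bin indices come back as labels ("1", "2", ...); convert them to integers
# so that an empty bin cannot shift the lookup into the bin centers
xx <- x[-length(x)] + 0.5*diff(x)
d$x <- xx[as.integer(d$x)]
yy <- y[-length(y)] + 0.5*diff(y)
d$y <- yy[as.integer(d$y)]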
with(d, text(x, y, label=Freq))
A more general approach (may be overkill for this case, but if you generalize to arbitrary polygons it will still work) is to use the over function in the sp package. This will find which polygon each point is contained in (then you can count them up).
You will need to do some conversions up front (to spatial objects) but this method will work with more complicated polygons than rectangles.
If all the rectangles are exactly the same size, then you could use k nearest neighbor techniques using the centers of the rectangles, see the knn and knn1 functions in the class package.
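The nearest-center idea can be sketched in base R without the class package. For a regular grid of identical rectangles, the closest center is always the center of the containing rectangle, so assigning each point to its nearest center and tabulating gives the counts. The grid and the random points below are illustration data, and the explicit distance computation stands in for knn1.

```r
set.seed(1)

# Grid breaks and the centers of the resulting 5 x 3 rectangles
xb <- seq(0, 10, by = 2)
yb <- seq(0, 30, by = 10)
centers <- expand.grid(x = xb[-length(xb)] + 1,   # half of the x spacing
                       y = yb[-length(yb)] + 5)   # half of the y spacing

# Random points to count
pts <- cbind(runif(100, 0, 10), runif(100, 0, 30))

# Index of the nearest center for every point (squared Euclidean distance)
nearest <- apply(pts, 1, function(p) {
  which.min((centers$x - p[1])^2 + (centers$y - p[2])^2)
})

# One row per rectangle: center coordinates plus number of points inside
centers$count <- tabulate(nearest, nbins = nrow(centers))
centers
```

The result has one row per rectangle, identified by its center, which is the output format asked for in the question. A point lying exactly on a shared side is equidistant from two centers; which.min then picks the first of them.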
