I am trying to color in from the top of the graph (y=5) down to the line in the original plot by creating a polygon and filling it in. I'm screwing up the point generation somehow. Can someone explain what's wrong here? (I didn't mean to fill in the triangle.)
half_instances<-c(0,5,2)
Ts<-c(1,2,3)
xpairs<-c(Ts, rep(5,length(half_instances)))
ypairs<-c(Ts,half_instances)
xpairs #1 2 3 5 5 5
ypairs #1 2 3 0 5 2
plot(Ts,half_instances,type="l")
polygon(xpairs,ypairs)
accidental output:
You've mixed up your x and y values: the 5s need to go into the vector of y coordinates.
half_instances<-c(0,5,2)
Ts<-c(1,2,3)
xpairs <- c(Ts, rev(Ts))
xpairs # 1 2 3 3 2 1 = original x-values from left to right for the bottom half, then go back from right to left by using the reverse of the original x-values
ypairs <- c(half_instances, rep(5, length(half_instances)))
ypairs # 0 5 2 5 5 5 = original y-values for bottom half, then fill up with 5's for the top half
plot(Ts, half_instances,type="l")
polygon(xpairs, ypairs, col="red")
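The same construction can be wrapped in a small helper. This is only a sketch: `fill_above` and its `top` argument are names made up for illustration, and it assumes the x-values are already sorted from left to right.

```r
# Hypothetical helper: polygon between a line and a horizontal top edge.
fill_above <- function(x, y, top) {
  list(
    x = c(x, rev(x)),              # left to right along the line, then back
    y = c(y, rep(top, length(y)))  # the line itself, then the flat top edge
  )
}

Ts <- c(1, 2, 3)
half_instances <- c(0, 5, 2)
pg <- fill_above(Ts, half_instances, top = 5)

plot(Ts, half_instances, type = "l")
polygon(pg$x, pg$y, col = "red")
```

Walking the x-values forward and then backward is what keeps the polygon outline from crossing itself.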
Because you have points at coordinate x = 5, you need to modify xlim if you want to see the whole polygon:
half_instances<-c(0,5,2)
Ts<-c(1,2,3)
xpairs<-c(Ts, rep(5,length(half_instances)))
ypairs<-c(half_instances,Ts)
xpairs #1 2 3 5 5 5
ypairs #0 5 2 1 2 3
plot(Ts,half_instances,type="l",xlim=c(1,5))
polygon(xpairs,ypairs)
I'm not entirely sure what you're trying to do, but hopefully the following code helps:
half_instances<-c(0,5,2)
Ts<-c(1,2,3)
xpairs<-c(Ts, rep(5,length(half_instances)))
ypairs<-c(Ts,half_instances)
xpairs #1 2 3 5 5 5
ypairs #1 2 3 0 5 2
points <- cbind(Ts, half_instances)
# Set up basic plot
plot(points, type="l")
# Create the outside polygon...
maxX <- max(points[, 1])
minX <- min(points[, 1])
maxY <- max(points[, 2])
minY <- min(points[, 2])
borderPoints <- matrix(c(minX,minY, minX,maxY, maxX,maxY), ncol=2, byrow=TRUE)
linePoints <- points[nrow(points):1, ]
outside <- rbind(borderPoints, linePoints)
# ...and plot it in blue
polygon(outside, border=NA, col='blue')
# Create the inside polygon and plot it in red
inside <- rbind(points, points[1, ])
polygon(inside, col='red', border=NA)
# Redraw the initial line if you want
lines(points, col='black', lwd=2)
I have an empty graph and need to plot the graph based on the convex hull with inner vertices.
My attempt is:
library(igraph)
set.seed(45)
n = 10
g <- graph.empty(n)
xy <- cbind(runif(n), runif(n))
vp <- convex_hull(xy)$resverts + 1
#[1] 8 10 7 2 1
## convert node_list to edge_list
plot(g, layout=xy)
Expected result in right figure.
Question: how do I convert a node list to an edge list in igraph?
You can use add_edges along with embed:
g2 <- g %>%
add_edges(c(t(embed(vp, 2)), vp[1], vp[length(vp)])) %>%
as.undirected()
and plot(g2, layout = xy) in turn gives
convex_hull does not output a node list in the same sense that an igraph object has a node list. In this case, vp is the sequence of vertex indices along the hull, so to create an edge list you just need each vertex to point to the next vertex in the sequence. This can be accomplished with dplyr::lead, using the first vertex as the default so that the last vertex connects back to the start and closes the circuit.
data.frame(
from = vp,
to = dplyr::lead(vp, 1, default = vp[1])
)
#> from to
#> 1 8 10
#> 2 10 7
#> 3 7 2
#> 4 2 1
#> 5 1 8
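If you would rather not depend on dplyr, the same shift can be sketched in base R. The `vp` values below are copied from the output shown in the question; `edge_vec` is a name introduced here for illustration. It holds the pairs flattened into one vector (from, to, from, to, ...), which I believe is the form add_edges expects.

```r
# Hull vertices as printed in the question
vp <- c(8, 10, 7, 2, 1)

# Shift the sequence by one and wrap the first vertex around to the end
edges <- data.frame(from = vp, to = c(vp[-1], vp[1]))

# Flatten row by row: from1, to1, from2, to2, ...
edge_vec <- as.vector(t(as.matrix(edges)))
edge_vec
```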
Try this.
## create graph.
vids <- as.character(c(8, 10, 7, 2, 1))
g <- make_graph(c(), length(vids))
V(g)$name <- vids
## and connect the dots.
g2 <- g + path(c(vids, vids[1]))
g2
Use this example data to see what I mean
tag <- as.character(c(1,2,3,4,5,6,7,8,9,10))
species <- c("A","A","A","A","B","B","B","C","C","D")
size <- c(0.10,0.20,0.25,0.30,0.30,0.15,0.15,0.20,0.15,0.15)
radius <- (size*40)
x <- c(9,4,25,14,28,19,9,22,10,2)
y <- c(36,7,15,16,22,24,39,20,34,9)
data <- data.frame(tag, species, size, radius, x, y)
library(ggplot2)
# Plot the points using qplot (from ggplot2, which is part of the tidyverse)
qplot(x, y, data = data) +
geom_point(aes(colour = species, size = size))
Now that you can see the plot, what I want to do is for each individual “species A” point, I’d like to identify the largest point within a radius of size*40.
For example, in the bottom left of the plot you can see that species A (tag 2) would produce a radius large enough to contain the close species D point.
However, the species A point on the far right-hand-side of the plot (tag 3) would produce a radius large enough to contain both of the close species B and species C points, in which case I’d want some sort of output that identifies the largest individual within the species A radius.
I’d like to know what I can run (if anything) on this data set to find the largest “within radius” point for each species A point and get an output like this:
Species A point ---- Largest point within radius
Species A tag 1 ----- Species C tag 9
Species A tag 2 ----- Species D tag 10
Species A tag 3 ----- Species B tag 5
Species A tag 4 ----- Species C tag 8
I've used spatstat and the CTFS package to make some plots in the past, but I can't figure out how to "find largest neighbor within radius". Perhaps I can tackle this in ArcMap? Also, this is just a small example dataset. Realistically I will want to find the "largest neighbor within radius" for thousands of points.
Any help or feedback would be greatly appreciated.
The following finds, for every point, the largest point of a different species (reported as a species and tag pair) that lies within that point's radius.
all_df <- data  # avoid working on a variable called `data`
res_df <- data.frame()
for (j in 1:nrow(all_df)) {
  # keep only points of a different species
  df <- subset(all_df, species != all_df$species[j])
  # indices of points within the radius of point j
  ind <- which((df$x - all_df$x[j])^2 + (df$y - all_df$y[j])^2 < all_df$radius[j]^2)
  # largest `size` among those points
  max_size <- max(df$size[ind])
  # all positions within `ind` that share max_size
  max_inds <- which(df$size[ind] == max_size)
  # pick the last one if there is more than one max_size
  new_ind <- ind[max_inds[length(max_inds)]]
  # collect the result in a data.frame
  res_df <- rbind(res_df, data.frame(org_sp = all_df$species[j],
                                     org_tag = all_df$tag[j],
                                     res_sp = df$species[new_ind],
                                     res_tag = df$tag[new_ind]))
}
res_df
# org_sp org_tag res_sp res_tag
# 1 A 1 C 9
# 2 A 2 D 10
# 3 A 3 B 5
# 4 A 4 C 8
# 5 B 5 A 3
# 6 B 6 C 8
# 7 B 7 C 9
# 8 C 8 B 5
# 9 C 9 B 7
# 10 D 10 A 2
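The loop above assumes every point has at least one neighbour of another species inside its radius. As a rough sketch of one way to guard against that, the per-point logic can be moved into a function that returns NA when nothing is in range. `largest_neighbour` and `pts` are hypothetical names, and the data is the example from the question.

```r
pts <- data.frame(
  tag     = as.character(1:10),
  species = c("A","A","A","A","B","B","B","C","C","D"),
  size    = c(0.10,0.20,0.25,0.30,0.30,0.15,0.15,0.20,0.15,0.15),
  x       = c(9,4,25,14,28,19,9,22,10,2),
  y       = c(36,7,15,16,22,24,39,20,34,9),
  stringsAsFactors = FALSE
)
pts$radius <- pts$size * 40

# Hypothetical helper: largest other-species point within the radius of row j
largest_neighbour <- function(df, j) {
  other <- df[df$species != df$species[j], ]
  d2    <- (other$x - df$x[j])^2 + (other$y - df$y[j])^2
  near  <- other[d2 < df$radius[j]^2, ]
  if (nrow(near) == 0) {
    # nothing in range: report NA instead of failing
    return(data.frame(org_tag = df$tag[j], res_sp = NA_character_,
                      res_tag = NA_character_, stringsAsFactors = FALSE))
  }
  # on ties keep the last candidate, matching the loop above
  best <- near[max(which(near$size == max(near$size))), ]
  data.frame(org_tag = df$tag[j], res_sp = best$species,
             res_tag = best$tag, stringsAsFactors = FALSE)
}

# Only the species A points, as asked in the question
res <- do.call(rbind, lapply(which(pts$species == "A"), largest_neighbour, df = pts))
res
```

For thousands of points this is still quadratic in the number of rows; a spatial index would be the next step, but that is beyond this sketch.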
I have two data.frames called outlier and data.
outlier just holds the row numbers that need to be coloured.
data has 1000 rows.
It has two columns called x and y.
If a row number exists in outlier, I want the dot in the plot to be red, otherwise black.
plot(data$x, data$y, col=ifelse(??,"red","black"))
Something should be in ?? .
Hi this way works for me using ifelse, let me know what you think:
outlier <- sample(1:100, 50)
data <- data.frame(x = 1:100, y = rnorm(n = 100))
plot(
data[ ,1], data[ ,2]
,col = ifelse(row.names(data) %in% outlier, "red", "blue")
,type = "h"
)
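A variant of the same idea that matches on row position rather than row names, and uses the red/black colours from the question. This is a sketch with made-up outlier positions. If outlier is a data frame, as in the question, you would pass its row-number column instead of the vector used here.

```r
set.seed(1)
data <- data.frame(x = 1:100, y = rnorm(100))
outlier <- c(3, 17, 42)  # hypothetical row numbers

# TRUE for rows whose position appears in `outlier`
is_outlier <- seq_len(nrow(data)) %in% outlier
cols <- ifelse(is_outlier, "red", "black")

plot(data$x, data$y, col = cols, pch = 19)
```

Matching on position keeps working when the row names are no longer 1..n, for example after subsetting.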
I think this can be accomplished by creating a new color column in your data frame:
data$color <- "black"
Then set the outliers to a different value:
data[outlier,"color"] <- "red"
I don't have your exact data, but I think I got something similar to what you wanted using the following:
outlier <- c(1, 2, 7, 9)
data <- data.frame(x=c(1,2,3,4,5,6,7,8,9,10),
y=c(1,2,3,4,5,6,7,8,9,10))
data$color <- "black"
data[outlier,"color"] <- "red"
data
x y color
1 1 1 red
2 2 2 red
3 3 3 black
4 4 4 black
5 5 5 black
6 6 6 black
7 7 7 red
8 8 8 black
9 9 9 red
10 10 10 black
Finally plot using the new value in data:
plot(data$x, data$y, col=data$color)
Results in:
I have a data set that looks like this:
samp.data <- structure(list(Track = c(1,1,1,1,1,1,1,1,2,2,2),
Base = c("A","C","B","A","D","D","C","A","A","B","B"),
Length = c(1,1,1,1,2,3,1,1,1,1,1)),
.Names = c("Track", "Base", "Length"), class = "data.frame",row.names = c(NA, 11L))
# Track Base Length
# 1 1 A 1
# 2 1 C 1
# 3 1 B 1
# 4 1 A 1
# 5 1 D 2
# 6 1 D 3
# 7 1 C 1
# 8 1 A 1
# 9 2 A 1
# 10 2 B 1
# 11 2 B 1
I am trying to plot an unordered stacked bar, with Tracks on the x axis and Length on the y axis. In other words, the bar graph wouldn't group the A bases together and plot it as one length of 1+1+1+1=4. It would plot each base in order. First it would plot the A base of length 1 in Track 1, C base of length 1 above that, B base of length 1 above that, A base of length 1 above that, D base of length 2 above that, and so on.
Below is a crude ASCII diagram of what I am trying to describe:
   |  C
L  |  Y
e  |  Y        Key
n  |  R        A = Red
g  |  B  B     B = Blue
t  |  B  G     C = Green
h  |  R  R     D = Yellow
   ----------
      2  1
      Track
Sorry if the explanation is a little confusing. Thank you for your help!
Edit: This question is different from the possible duplicate, because I would like to ungroup the stacked sections.
Just use geom_bar(stat='identity'), set your x to Track and your y to Length, and it all works out.
Note: I converted your Base to a factor (which makes sense), as well as your Track. That also makes sense to me, but if you wish to keep it numeric that's fine; you may then wish to add + scale_x_discrete() so that your tracks show up as whole numbers on the x axis.
samp.data$Base <- factor(samp.data$Base)
samp.data$Track <- factor(samp.data$Track)
ggplot(samp.data, aes(x=Track, y=Length, fill=Base)) +
geom_bar(stat='identity') +
scale_fill_manual(values=c('red', 'blue', 'green', 'yellow'))
The last line sets the colours as you please.
If you wish to reverse the x axis order (so that your track 2 appears first), do + scale_x_reverse().
I do not know what you mean by "ungroup the base" in your question, but if you want an outline around each "chunk" of DNA, add colour="black" inside geom_bar. For example, in track 1 there is a D of length 2 immediately followed by a D of length 3, so they are drawn as one big D of length 5; adding colour="black" outlines the 2-chunk separately from the 3-chunk, though they keep the same fill colour.
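In case the stacking order in your ggplot2 version does not follow the row order, here is a base R sketch that computes each segment's lower and upper bound explicitly and draws it with rect(). The column names `ymin` and `ymax` and the bar half-width of 0.4 are choices made for this illustration only.

```r
samp.data <- data.frame(
  Track  = c(1,1,1,1,1,1,1,1,2,2,2),
  Base   = c("A","C","B","A","D","D","C","A","A","B","B"),
  Length = c(1,1,1,1,2,3,1,1,1,1,1)
)

# Running total of Length within each track, in row order
samp.data$ymax <- ave(samp.data$Length, samp.data$Track, FUN = cumsum)
samp.data$ymin <- samp.data$ymax - samp.data$Length

base_cols <- c(A = "red", B = "blue", C = "green", D = "yellow")

# Empty plot, then one rectangle per row
plot(NULL, xlim = c(0.5, 2.5), ylim = c(0, max(samp.data$ymax)),
     xlab = "Track", ylab = "Length", xaxt = "n")
axis(1, at = unique(samp.data$Track))
rect(samp.data$Track - 0.4, samp.data$ymin,
     samp.data$Track + 0.4, samp.data$ymax,
     col = base_cols[as.character(samp.data$Base)], border = "black")
```

Because every row gets its own rectangle, repeated bases such as the two A segments at the bottom and top of track 1 stay where they occur in the data.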
Using R package pheatmap to draw heatmaps. Is there a way to assign a color to NAs in the input matrix? It seems NA gets colored in white by default.
E.g.:
library(pheatmap)
m<- matrix(c(1:100), nrow= 10)
m[1,1]<- NA
m[10,10]<- NA
pheatmap(m, cluster_rows=FALSE, cluster_cols=FALSE)
Thanks
It is possible, but requires some hacking.
First of all let's see how pheatmap draws a heatmap. You can check that just by typing pheatmap in the console and scrolling through the output, or alternatively using edit(pheatmap).
You will find that colours are mapped using
mat = scale_colours(mat, col = color, breaks = breaks)
The scale_colours function seems to be an internal function of the pheatmap package, but we can check the source code using
getAnywhere(scale_colours)
Which gives
function (mat, col = rainbow(10), breaks = NA)
{
mat = as.matrix(mat)
return(matrix(scale_vec_colours(as.vector(mat), col = col,
breaks = breaks), nrow(mat), ncol(mat), dimnames = list(rownames(mat),
colnames(mat))))
}
Now we need to check scale_vec_colours, that turns out to be:
function (x, col = rainbow(10), breaks = NA)
{
return(col[as.numeric(cut(x, breaks = breaks, include.lowest = T))])
}
So, essentially, pheatmap is using cut to decide which colours to use.
Let's try and see what cut does if there are NAs around:
as.numeric(cut(c(1:100, NA, NA), seq(0, 100, 10)))
[1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3
[29] 3 3 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6
[57] 6 6 6 6 7 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 8 8 9 9 9 9
[85] 9 9 9 9 9 9 10 10 10 10 10 10 10 10 10 10 NA NA
It returns NA! So, here's your issue!
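To see the mechanism outside pheatmap, here is a minimal sketch of the same cut-based lookup on a short vector, with the NA colours filled in afterwards. The values in `x`, the palette and the "grey90" fill are illustrative choices, not what pheatmap does internally for NAs.

```r
x      <- c(5, 55, NA, 100)
breaks <- seq(0, 100, 10)
pal    <- rainbow(10)

# Same lookup as scale_vec_colours: bin index -> palette entry
cols <- pal[as.numeric(cut(x, breaks = breaks, include.lowest = TRUE))]
is.na(cols)   # the NA input has no bin, so it has no colour

# Give the missing entries an explicit colour
cols[is.na(cols)] <- "grey90"
cols
```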
Now, how do we get around it?
The easiest thing is to let pheatmap draw the heatmap, then overplot the NA values as we like.
Looking again at the pheatmap function you'll see it uses the grid package for plotting (see also this question: R - How do I add lines and text to pheatmap?)
So you can use grid.rect to add rectangles to the NA positions.
What I would do is find the coordinates of the heatmap border by trial and error, then work from there to plot the rectangles.
For instance:
library(pheatmap)
library(grid)  # grid.rect() and gpar() come from the grid package
m<- matrix(c(1:100), nrow= 10)
m[1,1]<- NA
m[10,10]<- NA
hmap <- pheatmap(m, cluster_rows=FALSE, cluster_cols=FALSE)
# These values were found by trial and error
# They WILL be different on your system and will vary when you change
# the size of the output, you may want to take that into account.
min.x <- 0.005
min.y <- 0.01
max.x <- 0.968
max.y <- 0.990
width <- 0.095
height <- 0.095
coord.x <- seq(min.x, max.x-width, length.out=ncol(m))
coord.y <- seq(max.y-height, min.y, length.out=nrow(m))
# x runs over columns and y over rows, so the matrix is indexed as m[y, x]
for (x in seq_along(coord.x))
{
  for (y in seq_along(coord.y))
  {
    if (is.na(m[y, x]))
      grid.rect(coord.x[x], coord.y[y], just=c("left", "bottom"),
                width, height, gp = gpar(fill = "green"))
  }
}
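The double loop can also be avoided by asking for the NA positions directly with which(..., arr.ind = TRUE). The sketch below reuses the trial-and-error layout values from above (which will differ on your system) and adds one extra, asymmetric NA at m[2, 5] purely to show that rows and columns are handled separately. `na_cells`, `rect_x` and `rect_y` are names introduced here.

```r
m <- matrix(1:100, nrow = 10)
m[1, 1]   <- NA
m[10, 10] <- NA
m[2, 5]   <- NA  # extra NA, for illustration only

# Layout values found by trial and error (system dependent)
min.x <- 0.005; max.x <- 0.968; width  <- 0.095
min.y <- 0.01;  max.y <- 0.990; height <- 0.095
coord.x <- seq(min.x, max.x - width, length.out = ncol(m))
coord.y <- seq(max.y - height, min.y, length.out = nrow(m))

# One row per NA cell, with columns "row" and "col"
na_cells <- which(is.na(m), arr.ind = TRUE)
rect_x <- coord.x[na_cells[, "col"]]
rect_y <- coord.y[na_cells[, "row"]]

# After drawing the heatmap, all rectangles can go in one call, e.g.:
# grid::grid.rect(rect_x, rect_y, width, height, just = c("left", "bottom"),
#                 gp = grid::gpar(fill = "green"))
```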
A better solution would be to hack the code of pheatmap using the edit function and have it deal with NAs as you wish...
Actually, the question is easy now. The current pheatmap function has incorporated a parameter for assigning a color to "NA", na_col. Example:
na_col = "grey90"
You can assign a colour to NAs by using the development version of pheatmap from GitHub. You can install it using devtools:
#this part loads the dev pheatmap package from github
if (!require("devtools")) {
install.packages("devtools", dependencies = TRUE)
library(devtools)
}
install_github("raivokolde/pheatmap")
Now you can use the parameter "na_col" in the pheatmap function:
pheatmap(..., na_col = "grey", ...)
(edit)
Don't forget to load it afterwards. Once it is installed, you can treat it as any other installed package.
If you don't mind using heatmap.2 from gplots instead, there's a convenient na.color argument. Taking the example data m from above:
library(gplots)
heatmap.2(m, Rowv = FALSE, Colv = FALSE, trace = "none", na.color = "Green")
If you want the NAs to be grey, you can simply force the "NA" as double.
m[is.na(m)] <- as.double("NA")
pheatmap(m, cluster_rows=FALSE, cluster_cols=FALSE)