Heatmap in R with raw values

I have this dataframe:
df <- data.frame(PatientID = c("3454","345","5","348","567","79"),
clas1 = c(1, 0, 5, NA, NA, 4),
clas2 = c(4, 1, 0, 3, 1, 0),
clas3 = c(1, NA, 0, 5, 5, 5), stringsAsFactors = F)
I would like to create a heatmap with the patient ID on the x axis and clas1, clas2 and clas3 on the y axis. The values represented in the heatmap would be the raw value of each "clas". Here is a drawing of what I would like:
I apologise that I don't have more colours available to represent this, but it is only an example and any colour scale could be used.
An important point is that I would like to distinguish between zeros and NAs, so ideally NAs would have their own colour or appear in white (empty).
I hope this is understandable enough, but if you have any questions, just ask.
Many thanks!

df <- data.frame(PatientID = c("3454","345","5","348","567","79"),
clas1 = c(1, 0, 5, NA, NA, 4),
clas2 = c(4, 1, 0, 3, 1, 0),
clas3 = c(1, NA, 0, 5, 5, 5), stringsAsFactors = F)
library(tidyverse)
df %>% pivot_longer(!PatientID) %>%
ggplot(aes(x= PatientID, y = name, fill = value)) +
geom_tile()
Created on 2021-05-25 by the reprex package (v2.0.0)
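To address the zero-versus-NA requirement from the question, here is a minimal sketch that extends the answer above. It assumes tidyr and ggplot2 are installed. Setting `na.value` on a continuous fill scale draws missing cells in white, while zeros keep the low end of the gradient. The colours, the tile borders and the `geom_text()` labels are illustrative choices, not part of the original answer.

```r
library(tidyr)
library(ggplot2)

df <- data.frame(PatientID = c("3454", "345", "5", "348", "567", "79"),
                 clas1 = c(1, 0, 5, NA, NA, 4),
                 clas2 = c(4, 1, 0, 3, 1, 0),
                 clas3 = c(1, NA, 0, 5, 5, 5),
                 stringsAsFactors = FALSE)

# Long format: one row per patient/class pair
long <- pivot_longer(df, !PatientID)

# Keep the patients in their original order instead of alphabetical order
long$PatientID <- factor(long$PatientID, levels = df$PatientID)

p <- ggplot(long, aes(x = PatientID, y = name, fill = value)) +
  geom_tile(colour = "grey70") +                 # thin border so white NA tiles stay visible
  geom_text(aes(label = value), na.rm = TRUE) +  # print the raw value; NA cells stay empty
  scale_fill_gradient(low = "lightyellow", high = "red", na.value = "white")
print(p)
```

Because zero maps to the lowest gradient colour and NA to `na.value`, the two are visually distinct as long as those two colours differ.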

Here is a base R option with `heatmap`:
heatmap(t(`row.names<-`(as.matrix(df[-1]), df$PatientID)))
# Which is like
# x <- as.matrix(df[-1])
# row.names(x) <- df$PatientID
# heatmap(t(x))
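Note that `heatmap()` by default reorders rows and columns by clustering and scales each row, so the colours no longer reflect the raw values. A minimal sketch for showing the raw values in their original order turns both off. The palette is an arbitrary choice for illustration.

```r
df <- data.frame(PatientID = c("3454", "345", "5", "348", "567", "79"),
                 clas1 = c(1, 0, 5, NA, NA, 4),
                 clas2 = c(4, 1, 0, 3, 1, 0),
                 clas3 = c(1, NA, 0, 5, 5, 5),
                 stringsAsFactors = FALSE)

# Classes in rows, patients in columns
m <- t(as.matrix(df[-1]))
colnames(m) <- df$PatientID

heatmap(m,
        Rowv = NA, Colv = NA,   # no dendrograms, no reordering
        scale = "none",         # colour the raw values, not row-scaled values
        col = hcl.colors(6, "YlOrRd", rev = TRUE))
```

With this call, NA cells are simply not painted, so they show the device background rather than a palette colour.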

Preparing the data
I'll give four options. In all four you need to assign the row names and remove the ID column, i.e.:
df <- data.frame(PatientID = c("3454","345","5","348","567","79"),
clas1 = c(1, 0, 5, NA, NA, 4),
clas2 = c(4, 1, 0, 3, 1, 0),
clas3 = c(1, NA, 0, 5, 5, 5), stringsAsFactors = F)
rownames(df) <- df$PatientID
df$PatientID <- NULL
df
The output is:
> df
clas1 clas2 clas3
3454 1 4 1
345 0 1 NA
5 5 0 0
348 NA 3 5
567 NA 1 5
79 4 0 5
Base R
With base R (decent output):
heatmap(as.matrix(df))
gplots
With gplots (a bit ugly, but many more parameters to control):
library(gplots)
heatmap.2(as.matrix(df))
heatmaply
With heatmaply you have nicer defaults to use for the dendrograms (it also organizes them in a more "optimal" way).
You can learn more about the package here.
Static
Static heatmap with heatmaply (better defaults, IMHO)
library(heatmaply)
ggheatmap(df)
Now with colored dendrograms
library(heatmaply)
ggheatmap(df, k_row = 3, k_col = 2)
With no dendrogram:
library(heatmaply)
ggheatmap(df, dendrogram = F)
Interactive
Interactive heatmap with heatmaply (hover tooltip, and the ability to zoom - it's interactive!):
library(heatmaply)
heatmaply(df)
And anything you can do with the static ggheatmap you can also do with the interactive heatmaply version.

Here is another option:
df <- data.frame(PatientID = c("3454","345","5","348","567","79"),
clas1 = c(1, 0, 5, NA, NA, 4),
clas2 = c(4, 1, 0, 3, 1, 0),
clas3 = c(1, NA, 0, 5, 5, 5), stringsAsFactors = F)
library(tidyverse)
# named vector of colours for the heatmap
cols <- c("0" = "white",
"1" = "green",
"2" = "orange",
"3" = "yellow",
"4" = "pink",
"5" = "black",
"99" = "grey")
labels_legend <- c("0" = "0",
"1" = "1",
"2" = "2",
"3" = "3",
"4" = "4",
"5" = "5",
"99" = "NA")
df1 <- df %>%
pivot_longer(
cols = starts_with("clas"),
names_to = "names",
values_to = "values"
) %>%
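mutate(PatientID = factor(PatientID, levels = c("3454", "345", "5", "348", "567", "79")),
# recode NA as 99 so that it picks up its own colour ("grey") and legend label ("NA")
values = replace_na(values, 99))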
ggplot(
df1,
aes(factor(PatientID), factor(names))) +
geom_tile(aes(fill= factor(values))) +
# geom_text(aes(label = values), size = 5, color = "black") + # text in tiles
scale_fill_manual(
values = cols,
breaks = c("0", "1", "2", "3", "4", "5", "99"),
labels = labels_legend,
aesthetics = c("colour", "fill"),
drop = FALSE
) +
scale_y_discrete(limits=rev) +
coord_equal() +
theme(line = element_blank(),
title = element_blank()) +
theme(legend.direction = "horizontal", legend.position = "bottom")

Related

Get row columns by group for geom_col in ggplot

I am trying to calculate row percentages by demographics of various score levels--in my data, that would be what % of white people (or % of black people, or % male, or % who have education level 2, and so on) have a score of 0 (or 1, 2, or 3)--and then use that to create a big plot.
So in my example data below, 8.33% of race == 1 (which is white) have a score of 0, 25% have a score of 1, 25% have a score of 2, and 41.67% have a score of 3.
Then the ultimate end goal would be to get some type of bar plot where the 4 levels of 'score' are across the x axis, and the various comparisons of demographics run down the y axis. Something that looks visually sort of like this, but with the levels of 'score' across the top instead of education levels.
I already have some code to make the actual figure, which I've done in other instances but with externally/already-calculated percentages:
ggplot(data, aes(x = percent, y = category, fill = group)) +
geom_col(orientation = "y", width = .9) +
facet_grid(group~score_var,
scales = "free_y", space = "free_y") +
labs(title = "Demographic breakdown of 'Score'") +
theme_bw()
I am just struggling to figure out the best way to calculate these row percentages, presumably using group_by() and summarize and then storing or configuring them in a way that they can be plotted. Thank you.
d <- structure(list(race = c(1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1,
1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1, 1, 2, 2,
3, 3), gender = c(0, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1,
0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1
), education = c(1, 3, 3, 2, 1, 3, 2, 3, 4, 4, 2, 3, 3, 2, 3,
4, 1, 3, 1, 3, 3, 2, 1, 3, 2, 3, 4, 4, 2, 3, 3, 2, 3, 4, 1, 3
), score = c(1, 2, 2, 1, 2, 3, 3, 2, 0, 0, 1, 2, 1, 3, 0, 0,
3, 3, 3, 3, 3, 3, 3, 3, 2, 1, 2, 3, 1, 3, 3, 0, 1, 2, 2, 0)), row.names = c(NA,
-36L), class = c("tbl_df", "tbl", "data.frame"))
This may get you started:
library(dplyr)
library(ggplot2)
prop <- d %>%
mutate(race = factor(race, levels = c(1, 2, 3), labels = c("White", "Black", "Others"))) %>%
group_by(race) %>%
mutate(race_n = n()) %>%
group_by(race, score) %>%
summarise(percent = round(100*n()/race_n[1], 1))
prop %>%
ggplot(aes(x = percent, y = score, fill = race)) +
geom_col(orientation = "y", width = .9) +
geom_text(aes(label = percent), hjust = 1)+
facet_grid(~race) +
labs(title = "Demographic breakdown of 'Score'") +
theme_bw()
Edit
To put all characteristics together, you can draw one bigger graph:
library(tidyr) # for pivot_longer()
df <- d %>% mutate(
gender = factor(2-gender),
race = factor(race),
education = factor(education)) %>%
pivot_longer(!score, names_to = "character", values_to = "levels")
df %>% group_by(character, levels) %>%
mutate(group_n = n()) %>%
group_by(character, levels, score) %>%
summarise(percent = round(100*n()/group_n[1], 1)) %>%
ggplot(aes(x = percent, y = score, fill = character)) +
geom_col(orientation = "y", width = .9) +
geom_text(aes(label = percent), hjust = 1)+
facet_grid(character ~ levels) +
labs(title = "Demographic breakdown of 'Score'") +
theme_bw()
Please note: I have changed the coding for gender.
Taking inspiration from @Zhiqiang Wang's excellent first pass at this, I finally figured out a solution. I still need to change the order of the labels (to put the education levels in order, and move the race variables to the top of the figure), but this is basically what I was envisioning.
d_test <- d %>% mutate(
gender = factor(2-gender),
race = factor(race),
education = factor(education)) %>%
pivot_longer(!score, names_to = "group", values_to = "levels")
d_test <- d_test %>% group_by(group, levels) %>%
mutate(group_n = n()) %>%
group_by(group, levels, score) %>%
summarise(percent = round(100*n()/group_n[1], 1))
d_test <- d_test %>%
mutate(var = case_when(group == "gender" & levels == 1 ~ "female",
group == "gender" & levels == 2 ~ "male",
group == "race" & levels == 1 ~ "white",
group == "race" & levels == 2 ~ "black",
group == "race" & levels == 3 ~ "hispanic",
group == "education" & levels == 1 ~ "dropout HS",
group == "education" & levels == 2 ~ "grad HS",
group == "education" & levels == 3 ~ "some coll",
group == "education" & levels == 4 ~ "grad coll"))
ggplot(d_test, aes(x = percent, y = var, fill = group)) +
geom_col(orientation = "y", width = .9) +
facet_grid(group ~ score,
scales = "free_y", space = "free_y") +
labs(title = "Demographic breakdown of 'Score'",
y = "",
x = "Percent") +
theme_minimal() +
theme(legend.position = "none",
strip.text.y = element_blank())
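As a cross-check on the row percentages discussed in this thread, here is a minimal base R sketch. It uses only the race and score columns of the example data; the object names `counts`, `pct` and `pct_long` are made up for illustration. `prop.table()` with `margin = 1` divides each row of the contingency table by its row total.

```r
# race and score columns from the example data in the question
race  <- rep(c(1, 1, 2, 2, 3, 3), times = 6)
score <- c(1, 2, 2, 1, 2, 3, 3, 2, 0, 0, 1, 2, 1, 3, 0, 0,
           3, 3, 3, 3, 3, 3, 3, 3, 2, 1, 2, 3, 1, 3, 3, 0, 1, 2, 2, 0)

# Contingency table: one row per race, one column per score
counts <- table(race = race, score = score)

# Row percentages: each row is divided by its own total
pct <- round(100 * prop.table(counts, margin = 1), 2)
print(pct)

# Long format (race, score, percent), ready for plotting
pct_long <- as.data.frame(pct, responseName = "percent")
```

The first row of `pct` reproduces the figures quoted in the question for race == 1 (8.33, 25, 25 and 41.67).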

Alluvial diagram with varying aesthetic / colors over one flow in R based on ggplot2

I have created the following alluvial diagram in R as follows:
df <- data.frame(Variable = c("X1", "X2", "X3", "X4", "X5", "X6"),
Pearson1 = c(6, 3, 2, 5, 4, 1),
Spearman1 = c(6, 5, 1, 2, 3, 4),
Kendall1 = c(6, 5, 1, 2, 3, 4),
Pearson2 = c(6, 5, 1, 2, 3, 4),
Spearman2 = c(6, 5, 1, 2, 4, 3),
Kendall2 = c(6, 5, 1, 2, 3, 4))
df$freq <- 1
library(alluvial)
alluvial(df[1:7], freq = df$freq, cex = 0.7, col = "red")
which results in
How can I give some specific lines a colour other than red, e.g. X1 from Variable to Pearson1, and then again from Kendall1 to Spearman2, and X3 in all states? I see that I can't do that with alluvial(). How can I recreate the above alluvial diagram with another function?
ggalluvial allows for varying aesthetics over one "flow" (or alluvium). The documentation provides a trick: use geom_flow with stat = "alluvium" and specify lode.guidance = "frontback".
The actual aesthetic (color) will need to be added to the data. geom_flow and geom_stratum will require different columns for the aesthetic, (try what happens when you use the same for both). I am passing the color directly and using scale_identity, but you can of course also use random values and then define your colors with scale_manual.
library(ggalluvial)
#> Loading required package: ggplot2
library(tidyverse)
df <- data.frame(Variable = c("X1", "X2", "X3", "X4", "X5", "X6"),
Pearson1 = c(6, 3, 2, 5, 4, 1),
Spearman1 = c(6, 5, 1, 2, 3, 4),
Kendall1 = c(6, 5, 1, 2, 3, 4),
Pearson2 = c(6, 5, 1, 2, 3, 4),
Spearman2 = c(6, 5, 1, 2, 4, 3),
Kendall2 = c(6, 5, 1, 2, 3, 4))
df_long <-
df %>%
## reshape your data in order to bring it to the right shape
mutate(across(everything(), as.character)) %>%
rownames_to_column("ID") %>%
pivot_longer(-ID) %>%
## correct order of your x
mutate(
name = factor(name, levels = names(df)),
## now hard code where you want to change the color.
## lodes need a different highlighting than your strata
## there are of course many ways to add this information, I am using case_when here
## you could also create separate vectors and add them to your data frame
highlight_lode = case_when(
ID == 3 ~ "blue",
ID == 1 & name %in% c("Variable", "Kendall1", "Pearson2") ~ "orange",
TRUE ~ "red"
),
highlight_stratum = case_when(
ID == 3 ~ "blue",
ID == 1 & name %in% c(
"Variable", "Pearson1", "Kendall1", "Pearson2",
"Spearman2"
) ~ "orange",
TRUE ~ "red"
)
)
ggplot(df_long,
## now use different color aesthetics in geom_flow and geom_stratum
aes(x = name, stratum = value, alluvium = ID, label = value)) +
## I took this trick with lode guidance from the documentation - this allows varying aesthetics over one flow.
geom_flow(aes(fill = highlight_lode), stat = "alluvium", lode.guidance = "frontback", color = "darkgray") +
geom_stratum(aes(fill = highlight_stratum)) +
geom_text(stat = "stratum") +
## as I have named the colors directly, it is appropriate to use scale_identity
scale_fill_identity()
#> Warning: Using the `size` aesthetic in this geom was deprecated in ggplot2 3.4.0.
#> ℹ Please use `linewidth` in the `default_aes` field and elsewhere instead.
Created on 2023-01-29 with reprex v2.0.2

R: non-numeric arguments to binary operators

I am working with the R programming language. I am trying to make a "parallel coordinates plot" using some fake data:
library(MASS)
a = rnorm(100, 10, 10)
b = rnorm(100, 10, 5)
c = rnorm(100, 5, 10)
d = cbind(a, b, c)
parcoord(d[, c(3, 1, 2)], col = 1 + (0:149) %/% 50)
However, a problem arises when I try to mix numeric and factor variables together:
group <- sample( LETTERS[1:4], 100, replace=TRUE, prob=c(0.25, 0.25, 0.25, 0.25) )
d = cbind(a, b, group)
parcoord(d[, c(3, 1, 2)], col = 1 + (0:149) %/% 50)
Error in x - min(x, na.rm = TRUE): non-numeric argument to binary operator
I am just curious. Can this problem be resolved? Or is it simply impossible to make such a plot using numeric and factor variables together?
I saw a previous stackoverflow post over here where a similar plot is made using numeric and factor variables: How to plot parallel coordinates with multiple categorical variables in R
However, I am using a computer with no USB port or internet access - I have a pre-installed version of R with limited libraries (I have plotly, ggplot2, dplyr, MASS ... I don't have ggally or tidyverse) and was looking for a way to do this only with the parcoord() function.
Does anyone have any ideas if this can be done?
Thanks
One option is to label rows of the matrix using a factor and use that on the plot, e.g.
library(MASS)
set.seed(300)
par(xpd=TRUE)
par(mar=c(4, 4, 4, 6))
a = rnorm(12, 10, 10)
b = rnorm(12, 10, 5)
c = rnorm(12, 5, 10)
group <- sample(c("#FF9289", "#FF8AFF", "#00DB98", "#00CBFF"),
12, replace=TRUE)
d = cbind(a, b, c)
rownames(d) <- group
parcoord(d[, c(3, 1, 2)], col = group)
title(main = "Plot", xlab = "Variable", ylab = "Values")
axis(side = 2, at = seq(0, 1, 0.1),
tick = TRUE, las = 1)
legend(3.05, 1, legend = c("A", "B", "C", "D"), lty = 1,
col = c("#FF9289", "#FF8AFF", "#00DB98", "#00CBFF"))
EDIT
Thanks for the additional explanation. What you want does make sense, but unfortunately it doesn't look like it will work as I expected. I tried to make a plot using an ordered factor as the middle variable (per https://pasteboard.co/JKK4AUD.jpg) but got the same error ("non-numeric argument to binary operator").
One way I thought of doing it is to recode the factor as a number (e.g. "Var_1" -> 0.8, "Var_2" -> 0.6) as below:
library(MASS)
set.seed(123)
par(xpd=TRUE)
par(mar=c(4, 4, 4, 6))
a = rnorm(12, 10, 10)
b = c(rep("Var_1", 3),
rep("Var_2", 3),
rep("Var_3", 3),
rep("Var_4", 3))
c = rnorm(12, 5, 10)
group <- c(rep("#FF9289", 3),
rep("#FF8AFF", 3),
rep("#00DB98", 3),
rep("#00CBFF", 3))
d = data.frame("A" = a,
"Factor" = b,
"C" = c,
"Group" = group)
d$Factor <- sapply(d$Factor, switch,
"Var_1" = 0.8,
"Var_2" = 0.6,
"Var_3" = 0.4,
"Var_4" = 0.2)
parcoord(d[, c(1, 2, 3)], col = group)
title(main = "Plot", xlab = "Variable", ylab = "Values")
axis(side = 2, at = seq(0, 1, 0.1),
tick = TRUE, las = 1)
legend(3.05, 1, legend = c("A", "B", "C", "D"), lty = 1,
col = c("#FF9289", "#FF8AFF", "#00DB98", "#00CBFF"))
mtext(text = "Var 1", side = 1, adj = 0.6, padj = -30)
mtext(text = "Var 3", side = 1, adj = 0.6, padj = -12)
mtext(text = "Var 2", side = 1, adj = 0.6, padj = -21)
mtext(text = "Var 4", side = 1, adj = 0.6, padj = -3)
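A more general variant of the recoding idea is sketched below. It assumes that equally spaced integer codes are acceptable positions for the factor levels. `data.matrix()` replaces each factor column with its internal integer codes, so `parcoord()` receives a purely numeric matrix. The column names and colours are reused from the example above for illustration.

```r
library(MASS)
set.seed(123)

d <- data.frame(A = rnorm(12, 10, 10),
                Factor = factor(rep(c("Var_1", "Var_2", "Var_3", "Var_4"), each = 3)),
                C = rnorm(12, 5, 10))

# Factors become their integer codes (Var_1 = 1, ..., Var_4 = 4)
num <- data.matrix(d)

# One colour per factor level; indexing by a factor uses its integer codes
group_cols <- c("#FF9289", "#FF8AFF", "#00DB98", "#00CBFF")[d$Factor]

parcoord(num, col = group_cols)
```

Since `parcoord()` rescales every column to a common range, the four levels end up evenly spaced along the middle axis.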

How to share a data.frame in R?

I need to provide a data frame for a MWE. It is the result of other complex operations on tons of data not directly related to the point of the question.
To keep the example simple and lean, is there a way to convert the data frame into an R command that creates it?
For instance, something like:
yadf <- structure(list(x = c(0, 1, 2, 3, 4),
y = c(0, 1, 4, 9, 16)),
.Names = c('x', 'y'),
row.names = c('0', '1', '2', '3', '4'),
class = 'data.frame')
ggplot(yadf, aes(x, y)) + geom_line()
As @docendo-discimus pointed out, dput is the command that converts the data frame into an R command:
> dput(yadf)
structure(list(x = c(0, 1, 2, 3, 4), y = c(0, 1, 4, 9, 16)), .Names = c("x",
"y"), row.names = c("0", "1", "2", "3", "4"), class = "data.frame")

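To illustrate the dput() answer above, here is a minimal round-trip sketch in base R. The names `txt` and `rebuilt` are made up for this example. The text printed by `dput()` is captured, parsed back, and compared with the original data frame.

```r
yadf <- data.frame(x = 0:4, y = (0:4)^2)

# Capture what dput() would print to the console
txt <- capture.output(dput(yadf))
cat(txt, sep = "\n")

# Evaluating that text recreates the data frame
rebuilt <- eval(parse(text = txt))
all.equal(yadf, rebuilt)

# For a large data frame, sharing only the first rows is usually enough
dput(head(yadf, 3))
```
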
connecting line like tree in r

I have the following type of data for a human family:
indvidual <- c("John", "Kris", "Peter", "King", "Marry", "Renu", "Kim", "Ken", "Lu")
Parent1 <- c( NA, NA, "John", "John", "John", NA, "Peter", NA, NA)
Parent2 <- c( NA, NA, "Kris", "Kris", "Renu", NA, "Lu", NA, NA)
X <- c( 2, 3, 2, 3, 4, 5, 1.5, 1, 1)
Y <- c( 3, 3, 2, 2, 2, 3, 1, 3, 2)
pchsize <- c( 4.5, 4.3, 9.2, 6.2, 3.2, 6.4, 2.1, 1.9, 8)
fillcol <- c( 8.5, 8.3, 1.2, 3.2, 8.2, 2.4, 2.6, 6.1, 3.2)
myd <- data.frame (indvidual, Parent1, Parent2, X, Y, pchsize,fillcol)
indvidual Parent1 Parent2 X Y pchsize fillcol
1 John <NA> <NA> 2.0 3 4.5 8.5
2 Kris <NA> <NA> 3.0 3 4.3 8.3
3 Peter John Kris 2.0 2 9.2 1.2
4 King John Kris 3.0 2 6.2 3.2
5 Marry John Renu 4.0 2 3.2 8.2
6 Renu <NA> <NA> 5.0 3 6.4 2.4
7 Kim Peter Lu 1.5 1 2.1 2.6
8 Ken <NA> <NA> 1.0 3 1.9 6.1
9 Lu <NA> <NA> 1.0 2 8.0 3.2
I want to plot something like the following: each individual's point is connected to its parents (preferably with different line colours for Parent1 and Parent2). The point size and point fill should also be scaled by the variables pchsize and fillcol. The plot outline is:
Here is my progress in ggplot2:
require(ggplot2)
ggplot(data=myd, aes(X, Y,fill = fillcol)) +
geom_point(aes(size = pchsize, fill = fillcol), pch = "O") +
geom_text(aes (label = indvidual, vjust=1.25))
Unsolved issues: drawing the connecting lines, and making the points large and filled at the same time.
Here is a ggplot2 solution:
library(ggplot2)
individual <- c("John", "Kris", "Peter", "King", "Marry", "Renu", "Kim", "Ken", "Lu")
Parent1 <- c( NA, NA, "John", "John", "John", NA, "Peter", NA, NA)
Parent2 <- c( NA, NA, "Kris", "Kris", "Renu", NA, "Lu", NA, NA)
X <- c( 2, 3, 2, 3, 4, 5, 1.5, 1, 1)
Y <- c( 3, 3, 2, 2, 2, 3, 1, 3, 2)
pchsize <- c( 4.5, 4.3, 9.2, 6.2, 3.2, 6.4, 2.1, 1.9, 8)
fillcol <- c( 8.5, 8.3, 1.2, 3.2, 8.2, 2.4, 2.6, 6.1, 3.2)
myd <- data.frame (individual, Parent1, Parent2, X, Y, pchsize,fillcol)
SegmentParent1 <- merge(
myd[, c("individual", "X", "Y")],
myd[!is.na(myd$Parent1), c("Parent1", "X", "Y")],
by.x = "individual", by.y = "Parent1")
SegmentParent2 <- merge(
myd[, c("individual", "X", "Y")],
myd[!is.na(myd$Parent2), c("Parent2", "X", "Y")],
by.x = "individual", by.y = "Parent2")
Segments <- rbind(SegmentParent1, SegmentParent2)
ggplot(data=myd, aes(X, Y)) +
geom_segment(data = Segments, aes(x = X.x, xend = X.y, y = Y.x, yend = Y.y)) +
geom_point(aes(size = pchsize, colour = fillcol)) +
geom_text(aes(label = individual), vjust = 0.5, colour = "red", fontface = 2) +
scale_x_continuous("", expand = c(0, 0.6), breaks = NULL) +
scale_y_continuous("", expand = c(0, 0.4), breaks = NULL) +
scale_size(range = c(20, 40)) +
theme_bw()
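The parent lookup behind the segments can also be sketched with `match()`, which additionally keeps track of which parent each link belongs to, so the two parents can get different line colours as the question asked. This is a base R illustration under the question's data. The helper `make_edges()` and the column names of `edges` are hypothetical, not part of the answer above.

```r
individual <- c("John", "Kris", "Peter", "King", "Marry", "Renu", "Kim", "Ken", "Lu")
Parent1 <- c(NA, NA, "John", "John", "John", NA, "Peter", NA, NA)
Parent2 <- c(NA, NA, "Kris", "Kris", "Renu", NA, "Lu", NA, NA)
X <- c(2, 3, 2, 3, 4, 5, 1.5, 1, 1)
Y <- c(3, 3, 2, 2, 2, 3, 1, 3, 2)

# Hypothetical helper: one row per child-parent link.
# match() returns the row index of each parent, or NA when there is none.
make_edges <- function(parent, which_parent) {
  idx  <- match(parent, individual)
  keep <- !is.na(idx)
  data.frame(child = individual[keep], parent = parent[keep],
             x = X[keep], y = Y[keep],                  # child position
             xend = X[idx[keep]], yend = Y[idx[keep]],  # parent position
             which_parent = which_parent)
}
edges <- rbind(make_edges(Parent1, "Parent1"), make_edges(Parent2, "Parent2"))

# Draw the links first, then the points and labels on top
plot(X, Y, type = "n", axes = FALSE, xlab = "", ylab = "")
segments(edges$x, edges$y, edges$xend, edges$yend, lwd = 2,
         col = ifelse(edges$which_parent == "Parent1", "blue", "red"))
points(X, Y, pch = 21, bg = "grey80", cex = 2)
text(X, Y, individual, pos = 1)
```

The same `edges` data frame could be passed to `geom_segment()` with `colour = which_parent` mapped inside `aes()`.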
Here is a solution just using plot(), text(), and arrows(). The for loop is a bit cluttered, but will work for larger data sets and it should be easy to play with the plot and arrows:
plot(myd$X,myd$Y, col='white', type="p", main="", ylab="", xlab="",
axes = FALSE, ylim = c(min(myd$Y)*.8, max(myd$Y)*1.2),
xlim = c(min(myd$X)*.8, max(myd$X)*1.2))
child = data.frame()
child = myd[!is.na(myd$Parent1),]
DArrows = matrix(0,nrow(child),4);
MArrows = matrix(0,nrow(child),4);
for (n in 1:nrow(child)){
d<-child[n,];
c1<-myd$indvidual==as.character(d$Parent1);
b1<-myd[t(c1)];
c2<-myd$indvidual==as.character(d$Parent2);
b2<-myd[t(c2)];
DArrows[n, 1]=as.double(d$X)
DArrows[n, 2]=as.double(d$Y)
DArrows[n, 3]=as.double(b1[4])
DArrows[n, 4]=as.double(b1[5])
MArrows[n, 1]=as.double(d$X)
MArrows[n, 2]=as.double(d$Y)
MArrows[n, 3]=as.double(b2[4])
MArrows[n, 4]=as.double(b2[5])
}
arrows(DArrows[,3],DArrows[,4],DArrows[,1],DArrows[,2],lwd= 2, col = "blue",length=".1")
arrows(MArrows[,3],MArrows[,4],MArrows[,1],MArrows[,2],lwd=2, col = "red",length=".1")
par(new=TRUE)
plot(myd$X,myd$Y,type = "p", main = "", ylab = "", xlab = "",cex = myd$pchsize,
axes = FALSE, pch = 21, ylim = c(min(myd$Y)*.8, max(myd$Y)*1.2),
xlim=c(min(myd$X)*.8, max(myd$X)*1.2), bg = myd$fillcol,fg = 'black')
text(1.12*myd$X, .85*myd$Y, myd$indvidual)
arrows((DArrows[,3]+DArrows[,1])/2, (DArrows[,4]+DArrows[,2])/2,
DArrows[,1], DArrows[,2], lwd = 2, col = "blue", length = ".1")
arrows((MArrows[,3]+MArrows[,1])/2, (MArrows[,4]+MArrows[,2])/2,
MArrows[,1], MArrows[,2], lwd = 2, col = "red", length = ".1")
One thing that jumped out at me was to treat this as a network. R has many packages for plotting these.
Here's a very simple solution:
First, I used your parent list to make a sociomatrix - you can generally input networks using edge lists as well - here I put 1 for the first parental relationship and 2 for the second.
psmat <- rbind(c(0, 0, 1, 1, 1, 0, 0, 0, 0),
c(0, 0, 2, 2, 0, 0, 0, 0, 0),
c(0, 0, 0, 0, 0, 0, 1, 0, 0),
rep(0, 9),
rep(0, 9),
c(0, 0, 0, 0, 2, 0, 0, 0, 0),
rep(0, 9),
rep(0, 9),
c(0, 0, 0, 0, 0, 0, 2, 0, 0))
Then, using the network package I just hit:
require(network)
plot(network(psmat), coord = cbind(X, Y), vertex.cex = pchsize,
vertex.col = fillcol, label = indvidual, edge.col = psmat)
This isn't terribly pretty in itself, but I think gives you all the basic elements you wanted.
For the colors, I believe the decimal places are just rounded - I wasn't sure what to do with those.
I know I've seen people plot networks in ggplot, so that might give you a better result.
Edit:
So here's a really messy way of turning your data into a network object directly - someone else might be able to fix it. Additionally, I add an edge attribute (named 'P' for parental status) and give the first set a value of 1 and the second set a value of 2. This can be used when plotting to set the colors.
P1 <- match(Parent1, indvidual)
e1 <- cbind(P1, 1:9); e1 <- na.omit(e1); attr(e1, 'na.action') <- NULL
P2 <- match(Parent2, indvidual)
e2 <- cbind(P2, 1:9); e2 <- na.omit(e2); attr(e2, 'na.action') <- NULL
en1 <- network.initialize(9)
add.edges(en1, e1[,1], e1[,2])
set.edge.attribute(en1, 'P', 1)
add.edges(en1, e2[,1], e2[,2], names.eval = 'P', vals.eval = 2)
plot(en1, coord = cbind(X, Y), vertex.cex = pchsize,
vertex.col = fillcol, label = indvidual, edge.col = 'P')
An alternative solution using igraph:
library(igraph)
mm<-data.frame(dest=c(as.character(myd$Parent1),as.character(myd$Parent2)))
mm$orig<-myd$individual
g<-graph.edgelist(as.matrix(mm[!is.na(mm$dest),]))
rownames(myd)<-as.character(myd[,1])
l<-as.matrix(myd[V(g)$name,4:5])
plot(g,layout=l,vertex.color=myd[V(g)$name,6],vertex.size=myd[V(g)$name,6])
Just play a bit with the colors and sizes!
