Add points to splines - r

I wish to add points directly on top of the curved spline.
The code here does not work because geom_point places the dots as if the lines were straight. See points #2, #3. I've tried using stat_bspline2 with geom = "point" without success.
Help is much appreciated.
library(tidyverse)
library(ggforce)
data <- tibble (
x = c(10, 15, 17, 17, 20, 22, 22, 23, 25, 25, 27, 29),
y = c(5, 7, 4, 4, 0, 5, 5, 6, 5, 5, 4, 5.5),
g = c("A", "A", "A", "B", "B", "B", "C", "C", "C", "D","D","D"),
pt = c(1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1)
)
data <- data %>%
mutate(pt_x = ifelse(pt == 1, x, NA),
pt_y = ifelse(pt == 1, y, NA))
ggplot(data) +
stat_bspline2(aes(x=x, y=y, color = ..group.., group = g), size = 4, n = 300, geom = "bspline0") +
scale_color_gradientn(colours = c("red", "pink", "green", "white"), guide = F) +
geom_point(aes(pt_x, pt_y), size = 7)

Related

ggbetweenstats: logarithmic y axis removes grouped analysis from plot

I am conducting a Kruskal-Wallis test to determine whether a measurement differs significantly between three groups. I use ggbetweenstats to determine between which groups there is a statistically significant difference.
Here is the code for sample data and the plot:
sampledata <- structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20), group = c(1, 2, 3, 1, 2, 3,
1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2), measurement = c(0,
1, 200, 30, 1000, 6000, 1, 0, 0, 10000, 20000, 700, 65, 1, 8,
11000, 13000, 7000, 500, 3000)), class = "data.frame", row.names = c(NA,
20L))
library(ggstatsplot)
library(ggplot2)
ggbetweenstats(
data = sampledata,
x = group,
y = measurement,
type = "nonparametric",
plot.type = "box",
pairwise.comparisons = TRUE,
pairwise.display = "all",
centrality.plotting = FALSE,
bf.message = FALSE
)
You can see the results from the Kruskal-Wallis test at the top of the plot, as well as the grouped analysis in the plot itself. Now I want to change the y axis to a logarithmic scale:
ggbetweenstats(
data = sampledata,
x = group,
y = measurement,
type = "nonparametric",
plot.type = "box",
pairwise.comparisons = TRUE,
pairwise.display = "all",
centrality.plotting = FALSE,
bf.message = FALSE
) +
ggplot2::scale_y_continuous(trans=scales::pseudo_log_trans(sigma = 1, base = exp(1)), limits = c(0,25000), breaks = c(0,1,10,100,1000,10000)
)
However, this removes the grouped analysis. I have tried different scaling solutions and browsed SO for a solution but couldn't find anything. Thank you for your help!
It seems that the y_position parameter in the geom_signif component is not affected by the y axis transformation. You will need to pass the log values of the desired bracket heights manually. In theory, you can pass these via the ggsignif.args parameter, but it seems that in the latest version of ggstatsplot this isn't possible because the y_position is hard-coded.
One way round this is to store the plot, then change the y positions after the fact. Here's a full reprex with the latest versions of ggplot2, ggstatsplot and their dependencies (at the time of writing):
sampledata <- structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20), group = c(1, 2, 3, 1, 2, 3,
1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2), measurement = c(0,
1, 200, 30, 1000, 6000, 1, 0, 0, 10000, 20000, 700, 65, 1, 8,
11000, 13000, 7000, 500, 3000)), class = "data.frame", row.names = c(NA,
20L))
library(ggstatsplot)
library(ggplot2)
library(scales)
p <- ggbetweenstats(
data = sampledata,
x = group,
y = measurement,
type = "nonparametric",
plot.type = "box",
pairwise.comparisons = TRUE,
pairwise.display = "all",
centrality.plotting = FALSE,
bf.message = FALSE
) + scale_y_continuous(trans = pseudo_log_trans(sigma = 1, base = exp(1)),
limits = c(0, exp(13)),
breaks = c(0, 10^(0:5)),
labels = comma)
#> Scale for y is already present.
#> Adding another scale for y, which will replace the existing scale.
i <- which(sapply(p$layers, function(x) inherits(x$geom, "GeomSignif")))
p$layers[[i]]$stat_params$y_position <- c(10, 10.8, 11.6)
p
Created on 2023-01-15 with reprex v2.0.2
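To choose bracket heights that correspond to particular values on the original measurement scale, one option is to run those values through the same transformation that the axis uses. The sketch below assumes the pseudo-log transformation from the reprex above; the raw heights (20000, 50000, 120000) are made-up values for illustration and do not come from the answer.

```r
library(scales)

# Same transformation as the y axis in the reprex above
tr <- pseudo_log_trans(sigma = 1, base = exp(1))

# Hypothetical bracket heights on the original measurement scale
raw_heights <- c(20000, 50000, 120000)

# Heights on the transformed scale, usable as y_position values
y_pos <- tr$transform(raw_heights)
y_pos

# Going the other way: where the hard-coded positions (10, 10.8, 11.6)
# sit on the original measurement scale
tr$inverse(c(10, 10.8, 11.6))
```

The resulting vector could then be assigned to p$layers[[i]]$stat_params$y_position in place of the hand-picked numbers.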

R markdown: Reduce the space between two plots in pdf output document

Aim: R Markdown: to construct one DIN A4 PDF page with a rectangle on the top left side and two plots.
Problem: After drawing the rectangle, the next plot is far away, with a large white space in between.
Desired output: The heatmap should appear immediately after the rectangle, perhaps with one or two blank lines in between.
I guess the problem is the drawing of the rectangle. This is where I need some help. Thank you.
---
output:
pdf_document
documentclass: article
classoption: a4paper
geometry: margin=1cm
subparagraph: yes
header-includes: |
\usepackage{titlesec}
\titlespacing{\title}{0pt}{\parskip}{-\parskip}
title: "Example of Title to Body Text"
subtitle: Subtitle Places Here
---
\vspace{-5truemm}
\pagenumbering{gobble}
#``` {r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
library(draw)
library(ggplot2)
library(dplyr)
# ```
#```{r rectangle}
drawBox(x =2, y = 3.5, width = 2.5, height = 1)
#```
#```{r heatmap}
df <- data.frame(
test_id = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4),
test_nr = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5,
1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 1, 1, 1, 1, 2, 2, 2, 2),
region = c("A", "B", "C", "D", "A", "B", "C", "D", "A", "B", "C", "D", "A",
"B", "C", "D", "A", "B", "C", "D", "A", "B", "C", "D", "A", "B",
"C", "D", "A", "B", "C", "D", "A", "B", "C", "D", "A", "B", "C", "D"),
test_value = c(3, 1, 2, 2, 2, 1, 2, 2, 3, 2, 2, 3, 2, 1, 2, 2, 1, 2, 3,
4, 2, 1, 1, 2, 1, 1, 1, 1, 2, 2, 3, 2, 2, 2, 99, 99, 3, 3, 3, 3)
)
# named vector for heatmap
cols <- c("1" = "green",
"2" = "darkgreen",
"3" = "orange",
"4" = "red",
"99" = "black")
labels_legend <- c("1" = "very good",
"2" = "good",
"3" = "not so good",
"4" = "bad",
"99" = "NA")
df <- df %>%
filter(test_id==1)
ggplot(
df,
aes(region, test_nr)) +
geom_tile(aes(fill= factor (test_value))) +
geom_text(aes(label = test_value), size = 10, color = "white") + # text in tiles
scale_colour_manual(
values = cols,
breaks = c("1", "2", "3", "4", "99"),
labels = labels_legend,
aesthetics = c("colour", "fill")
) +
theme(text = element_text(size = 14)) + # this will change all text size
labs(title = "Test (Individual heatmap)", x = "Region", y = "Event") +
labs(fill = "Test") +
coord_fixed(ratio=1, clip="on") +
theme(axis.text.y = element_text(face = "bold", size = 12)) +
theme(axis.text.x = element_text(angle = 0, face = "bold", size = 12)) +
theme(axis.line = element_line(colour = "darkblue",
size = 1, linetype = "solid")
)
# ```
## Information
You can use the subfigure environment to display multiple plots side by side, though you may not want to place the rectangle under the same main caption as the heatmap.
---
output:
pdf_document:
extra_dependencies: "subfig"
documentclass: article
classoption: a4paper
geometry: margin=1cm
subparagraph: yes
header-includes: |
\usepackage{titlesec}
\titlespacing{\title}{0pt}{\parskip}{-\parskip}
title: "Example of Title to Body Text"
subtitle: Subtitle Places Here
---
\vspace{-5truemm}
\pagenumbering{gobble}
``` {r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
library(draw)
library(ggplot2)
library(dplyr)
```
```{r rectangle}
drawBox(x =2, y = 3.5, width = 2.5, height = 1)
```
```{r heatmap-data}
df <- data.frame(
test_id = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4),
test_nr = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5,
1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 1, 1, 1, 1, 2, 2, 2, 2),
region = c("A", "B", "C", "D", "A", "B", "C", "D", "A", "B", "C", "D", "A",
"B", "C", "D", "A", "B", "C", "D", "A", "B", "C", "D", "A", "B",
"C", "D", "A", "B", "C", "D", "A", "B", "C", "D", "A", "B", "C", "D"),
test_value = c(3, 1, 2, 2, 2, 1, 2, 2, 3, 2, 2, 3, 2, 1, 2, 2, 1, 2, 3,
4, 2, 1, 1, 2, 1, 1, 1, 1, 2, 2, 3, 2, 2, 2, 99, 99, 3, 3, 3, 3)
)
# named vector for heatmap
cols <- c("1" = "green",
"2" = "darkgreen",
"3" = "orange",
"4" = "red",
"99" = "black")
labels_legend <- c("1" = "very good",
"2" = "good",
"3" = "not so good",
"4" = "bad",
"99" = "NA")
df <- df %>%
filter(test_id==1)
```
```{r heatmap, fig.show="hold", fig.cap='Rectangle and Heatmap', fig.subcap=c('LEFT', 'RIGHT'), out.width='50%', fig.align = "center"}
drawBox(x =2, y = 3.5, width = 2.5, height = 1)
ggplot(
df,
aes(region, test_nr)
) +
geom_tile(aes(fill= factor (test_value))) +
geom_text(aes(label = test_value), size = 10, color = "white") + # text in tiles
scale_colour_manual(
values = cols,
breaks = c("1", "2", "3", "4", "99"),
labels = labels_legend,
aesthetics = c("colour", "fill")
) +
theme(text = element_text(size = 14)) + # this will change all text size
labs(title = "Test (Individual heatmap)", x = "Region", y = "Event") +
labs(fill = "Test") +
coord_fixed(ratio=1, clip="on") +
theme(axis.text.y = element_text(face = "bold", size = 12)) +
theme(axis.text.x = element_text(angle = 0, face = "bold", size = 12)) +
theme(
axis.line = element_line(
colour = "darkblue",
size = 1, linetype = "solid"
)
)
```
## Information

How to obtain the tree from igraph object in R?

I have a random directed weighted graph gg, it has the next structure:
gg <-
structure(list(10, TRUE, c(0, 0, 1, 2, 2, 5, 5, 6, 6, 6, 6, 9,
9, 9, 9, 9), c(6, 9, 3, 0, 5, 3, 7, 1, 3, 5, 8, 2, 4, 6, 7, 8
), c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15), c(3,
7, 11, 2, 5, 8, 12, 4, 9, 0, 13, 6, 14, 10, 15, 1), c(0, 2, 3,
5, 5, 5, 7, 11, 11, 11, 16), c(0, 1, 2, 3, 6, 7, 9, 11, 13, 15,
16), list(c(1, 0, 1), structure(list(), .Names = character(0)),
structure(list(name = c("C", "D", "I", "J", "K", "N", "O",
"Q", "S", "T"), color = c("yellow", "red", "red", "red",
"red", "red", "green", "red", "red", "green")), .Names = c("name",
"color")), structure(list(weight = c(0.5, 0.5, 1, 0.333333333333333,
0.333333333333333, 0.333333333333333, 0.333333333333333,
0.25, 0.25, 0.25, 0.25, 0.2, 0.2, 0.2, 0.2, 0.2)), .Names = "weight")),
<environment>), class = "igraph")
I need to find all walks from the root (yellow node) to the leaves (red nodes). Leaves are defined by (a) edge direction and (b) distance: a leaf must be exactly two edges away from the root.
In my case, the root is C and leaves should be D, J, N, S, I, K, Q.
I tried to define the (a) condition only.
root <- "C"
leaves = which(degree(gg, v = V(gg), mode = "out")==0, useNames = T)
leaves
# J K Q S
# 4 5 8 9
plot(gg, layout = layout.reingold.tilford(gg, root=root),
edge.arrow.size=0.2, edge.curved=T,
edge.label = round(E(gg)$weight,2))
Question: How do I define condition (b) and add the nodes D, N, I, K to the set of leaves?
Here's one way to do it: use shortest_paths to get all the vertices that are exactly two edges from the root node.
two.edges.from.root = unlist(sapply(shortest_paths(gg,
from = as.numeric(V(gg)["C"]),
mode = "out")$vpath,
function(x) { if(length(x) == 3) { x[3] } }))
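An alternative to inspecting the paths themselves is to compute edge-count distances from the root with igraph's distances() and keep the vertices at distance 2. The sketch below rebuilds the question's edge list by hand (vertex names only, no weights or colors), since the printed structure cannot be pasted back into R as is. With the original weighted graph, weights = NA tells distances() to count edges rather than sum the weights.

```r
library(igraph)

# Edge list transcribed from the question's graph (names only)
edges <- data.frame(
  from = c("C", "C", "D", "I", "I", "N", "N", "O", "O", "O", "O",
           "T", "T", "T", "T", "T"),
  to   = c("O", "T", "J", "C", "N", "J", "Q", "D", "J", "N", "S",
           "I", "K", "O", "Q", "S")
)
g <- graph_from_data_frame(edges, directed = TRUE)

# Number of edges on the shortest directed path from the root to each vertex
d <- distances(g, v = "C", mode = "out", weights = NA)

# Vertices exactly two edges away from the root
two_from_root <- sort(colnames(d)[d[1, ] == 2])
two_from_root
```

For this edge list the result contains D, I, J, K, N, Q and S, which is the set of leaves the question asks for.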

plot (ggplot ?) smooth + color area between 2 curves

I have a question for you please :
My data :
Nb_obs <- as.vector(c( 2, 0, 6, 2, 7, 1, 8, 0, 2, 1, 1, 3, 11, 5, 9, 6, 4, 0, 7, 9))
Nb_obst <- as.vector(c(31, 35, 35, 35, 39, 39, 39, 39, 39, 41, 41, 42, 43, 43, 45, 45, 47, 48, 51, 51))
inf20 <- as.vector(c(2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 3, 4, 4, 3, 5, 4))
sup20 <- as.vector(c(3, 4, 4, 4, 5, 4, 4, 5, 4, 4, 5, 5, 5, 6, 5, 6, 6, 5, 7, 6))
inf40 <- as.vector(c(1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 3, 3, 3, 4, 3))
sup40 <- as.vector(c(4, 5, 5, 5, 6, 5, 5, 6, 5, 5, 6, 6, 6, 7, 6, 7, 7, 7, 9, 7))
inf60 <- as.vector(c(1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 2))
sup60 <- as.vector(c(5, 6, 6, 6, 8, 7, 7, 7, 7, 7, 7, 7, 8, 9, 8, 9, 9, 9, 11, 9))
inf90 <- as.vector(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1))
sup90 <- as.vector(c(10, 11, 11, 11, 15, 13, 13, 14, 12, 13, 13, 13, 14, 17, 15, 17, 17, 16, 21, 18))
data <- cbind.data.frame(Nb_obs, Nb_obst, inf20, sup20, inf40, sup40, inf60 , sup60, inf90 , sup90)
My plot :
plot(data$Nb_obst, data$Nb_obs, type = "n", xlab = "Number obst", ylab = "number obs", ylim = c(0, 25))
lines(data$Nb_obst, data$inf20, col = "dark red")
lines(data$Nb_obst, data$sup20, col = "dark red")
lines(data$Nb_obst, data$inf40, col = "red")
lines(data$Nb_obst, data$sup40, col = "red")
lines(data$Nb_obst, data$inf60, col = "dark orange")
lines(data$Nb_obst, data$sup60, col = "dark orange")
lines(data$Nb_obst, data$inf90, col = "yellow")
lines(data$Nb_obst, data$sup90, col = "yellow")
My question :
There are two things I'd like to do (and so I think it could be done by ggplot):
As in the plot above, the "inf" and "sup" columns are the lower and upper limits of my model's confidence intervals (IC) at 20%, 40%, 60%, and 90%. I would first like to smooth each curve, and then color the area between the two curves of the same interval: for example, the area between data$inf90 and data$sup90 in yellow, the area between data$inf60 and data$sup60 in orange, etc. I would also like to superimpose these colored areas and add a proper legend.
Thanks for your help !
Cool question since I had to give myself a crash course in using LOESS for ribbons!
First thing I'm doing is getting the data into a long shape, since that's what ggplot will expect, and since your data has some characteristics that are kind of hidden within values. For example, if you gather into a long shape and have, say a column key, with a value of "inf20" and another of "sup20", those hold more information than you currently have access to, i.e. the measure type is either "inf" or "sup", and the level is 20. You can extract that information out of that column to get columns of measure types ("inf" or "sup") and levels (20, 40, 60, or 90), then map aesthetics onto those variables.
So here I'm getting the data into a long shape, then using spread to make columns of inf and sup, because those will become ymin and ymax for the ribbons. I made level a factor and reversed its levels, because I wanted to change the order of the ribbons being drawn such that the narrow one would come up last and be drawn on top.
library(tidyverse)
data_long <- data %>%
as_tibble() %>%
gather(key = key, value = value, -Nb_obs, -Nb_obst) %>%
mutate(measure = str_extract(key, "\\D+")) %>%
mutate(level = str_extract(key, "\\d+")) %>%
select(-key) %>%
group_by(level, measure) %>%
mutate(row = row_number()) %>%
spread(key = measure, value = value) %>%
ungroup() %>%
mutate(level = as.factor(level) %>% fct_rev())
head(data_long)
#> # A tibble: 6 x 6
#> Nb_obs Nb_obst level row inf sup
#> <dbl> <dbl> <fct> <int> <dbl> <dbl>
#> 1 0 35 20 2 2 4
#> 2 0 35 40 2 2 5
#> 3 0 35 60 2 1 6
#> 4 0 35 90 2 0 11
#> 5 0 39 20 8 3 5
#> 6 0 39 40 8 2 6
ggplot(data_long, aes(x = Nb_obst, ymin = inf, ymax = sup, fill = level)) +
geom_ribbon(alpha = 0.6) +
scale_fill_manual(values = c("20" = "darkred", "40" = "red",
"60" = "darkorange", "90" = "yellow")) +
theme_light()
But it still has the issue of being jagged, so for each level I predicted smoothed values of both inf and sup versus Nb_obst using loess. group_by and do yield a nested data frame, and unnest pulls it back out into a workable form. Feel free to adjust the span parameter, as well as other loess.control parameters that I know very little about.
data_smooth <- data_long %>%
group_by(level) %>%
do(Nb_obst = .$Nb_obst,
inf_smooth = predict(loess(.$inf ~ .$Nb_obst, span = 0.35), .$Nb_obst),
sup_smooth = predict(loess(.$sup ~ .$Nb_obst, span = 0.35), .$Nb_obst)) %>%
unnest()
head(data_smooth)
#> # A tibble: 6 x 4
#> level Nb_obst inf_smooth sup_smooth
#> <fct> <dbl> <dbl> <dbl>
#> 1 90 35 0 11.
#> 2 90 39 0 13.4
#> 3 90 48 0.526 16.7
#> 4 90 39 0 13.4
#> 5 90 41 0 13
#> 6 90 41 0 13
ggplot(data_smooth, aes(x = Nb_obst, ymin = inf_smooth, ymax = sup_smooth, fill = level)) +
geom_ribbon(alpha = 0.6) +
scale_fill_manual(values = c("20" = "darkred", "40" = "red",
"60" = "darkorange", "90" = "yellow")) +
theme_light()
Created on 2018-05-26 by the reprex package (v0.2.0).
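The reshaping step in this answer uses gather() and spread(), which tidyr has since superseded with pivot_longer() and pivot_wider(). A minimal sketch of the same idea with the newer functions follows; the toy data frame is made up and only mimics the question's column naming scheme (inf20, sup20, ...).

```r
library(dplyr)
library(tidyr)

# Small made-up data frame with the same column naming scheme as the question
toy <- data.frame(
  Nb_obst = c(31, 35, 39),
  inf20 = c(2, 2, 3), sup20 = c(3, 4, 5),
  inf90 = c(0, 0, 0), sup90 = c(10, 11, 15)
)

toy_long <- toy %>%
  mutate(row = row_number()) %>%
  # Split names such as "inf20" into measure = "inf" and level = "20"
  pivot_longer(
    cols = -c(Nb_obst, row),
    names_to = c("measure", "level"),
    names_pattern = "(\\D+)(\\d+)"
  ) %>%
  # One column per measure, ready to map to ymin and ymax in geom_ribbon
  pivot_wider(names_from = measure, values_from = value) %>%
  mutate(level = factor(level))

toy_long
```

The names_pattern argument does the work of the two str_extract() calls, so no intermediate key column is needed.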
This produces the plot with shaded areas using base R graphics.
The trick is to pair the x values with the y values.
plot(data$Nb_obst, data$Nb_obs, type = "n", xlab = "Number obst", ylab = "number obs", ylim = c(0, 25))
lines(data$Nb_obst, data$inf20, col = "dark red")
lines(data$Nb_obst, data$sup20, col = "dark red")
lines(data$Nb_obst, data$inf40, col = "red")
lines(data$Nb_obst, data$sup40, col = "red")
lines(data$Nb_obst, data$inf60, col = "dark orange")
lines(data$Nb_obst, data$sup60, col = "dark orange")
lines(data$Nb_obst, data$inf90, col = "yellow")
lines(data$Nb_obst, data$sup90, col = "yellow")
with(data, polygon(c(Nb_obst, rev(Nb_obst)), c(inf90, rev(sup90)), col = "yellow"))
with(data, polygon(c(Nb_obst, rev(Nb_obst)), c(inf60, rev(sup60)), col = "dark orange"))
with(data, polygon(c(Nb_obst, rev(Nb_obst)), c(inf40, rev(sup40)), col = "red"))
with(data, polygon(c(Nb_obst, rev(Nb_obst)), c(inf20, rev(sup20)), col = "dark red"))
The code for a ggplot graph is a bit longer. There is a function geom_ribbon perfect for this.
g <- ggplot(data)
g + geom_ribbon(aes(x = Nb_obst, ymin = sup60, ymax = sup90), fill = "yellow") +
geom_ribbon(aes(x = Nb_obst, ymin = sup40, ymax = sup60), fill = "dark orange") +
geom_ribbon(aes(x = Nb_obst, ymin = sup20, ymax = sup40), fill = "red") +
geom_ribbon(aes(x = Nb_obst, ymin = inf20, ymax = sup20), fill = "dark red") +
geom_ribbon(aes(x = Nb_obst, ymin = inf40, ymax = inf20), fill = "red") +
geom_ribbon(aes(x = Nb_obst, ymin = inf60, ymax = inf40), fill = "dark orange") +
geom_ribbon(aes(x = Nb_obst, ymin = inf90, ymax = inf60), fill = "yellow")
Data.
I will redo your dataset, simplifying its creation. You don't need as.vector and if you are creating a data.frame there is no need for the data.frame method of cbind, data.frame(.) is enough.
Nb_obs <- c( 2, 0, 6, 2, 7, 1, 8, 0, 2, 1, 1, 3, 11, 5, 9, 6, 4, 0, 7, 9)
Nb_obst <- c(31, 35, 35, 35, 39, 39, 39, 39, 39, 41, 41, 42, 43, 43, 45, 45, 47, 48, 51, 51)
inf20 <- c(2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 3, 4, 4, 3, 5, 4)
sup20 <- c(3, 4, 4, 4, 5, 4, 4, 5, 4, 4, 5, 5, 5, 6, 5, 6, 6, 5, 7, 6)
inf40 <- c(1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 3, 3, 3, 4, 3)
sup40 <- c(4, 5, 5, 5, 6, 5, 5, 6, 5, 5, 6, 6, 6, 7, 6, 7, 7, 7, 9, 7)
inf60 <- c(1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 2)
sup60 <- c(5, 6, 6, 6, 8, 7, 7, 7, 7, 7, 7, 7, 8, 9, 8, 9, 9, 9, 11, 9)
inf90 <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1)
sup90 <- c(10, 11, 11, 11, 15, 13, 13, 14, 12, 13, 13, 13, 14, 17, 15, 17, 17, 16, 21, 18)
data <- data.frame(Nb_obs, Nb_obst, inf20, sup20, inf40, sup40, inf60 , sup60, inf90 , sup90)

How to mirror the outer positions with the variable with R

I have a data frame:
tes <- data.frame(x = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
y = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
d = c(10, 20, 30, 100, 11, 12, 403, 43, 21))
They look like this on the plot
ggplot(aes(x = x, y = y), data = tes) + geom_point(aes(color = factor(d)), size = 5)
I'd like to "mirror" the outer rows in this data to obtain the following data and plot:
tes1 <- data.frame(x = c(0, 0, 0, 0,0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4),
y = c(0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4),
d = c(10, 10, 20, 30, 30, 10, 10, 20, 30, 30, 100, 100, 11, 12, 12, 403, 403, 43, 21, 21, 403, 403, 43, 21, 21))
ggplot(aes(x = x, y = y), data = tes1) + geom_point(aes(color = factor(d)), size = 4)
Does this do what you're after?
Explanation: We first convert tes into a flattened table with ftable(xtabs(...)). We then replicate the first and last columns and the first and last rows, give the result new column and row names to reflect the extra "flanking" rows and columns, and finally convert back to a long data frame with data.frame(as.table(...)).
# Convert to table then matrix
m <- ftable(xtabs(d ~ x + y, data = tes));
class(m) <- "matrix";
# Replicate first and last column/row by binding to the beginning
# and end, respectively of the matrix
m <- cbind(m[, 1], m, m[, ncol(m)]);
m <- rbind(m[1, ], m, m[nrow(m), ]);
# Set column/row names
rownames(m) <- seq(min(tes$x) - 1, max(tes$x) + 1);
colnames(m) <- seq(min(tes$y) - 1, max(tes$y) + 1);
# Convert back to long dataframe
tes.ext <- data.frame(as.table(m));
colnames(tes.ext) <- colnames(tes);
# Plot
ggplot(aes(x = x, y = y), data = tes.ext) + geom_point(aes(color = factor(d)), size = 5)
Data
tes <- data.frame(x = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
y = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
d = c(10, 20, 30, 100, 11, 12, 403, 43, 21))
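The same padding can also be sketched without going through a table, by building the enlarged grid directly and clamping each coordinate to the original range before looking up d. This is an alternative formulation rather than the method of the answer above; the names grid, cx and cy are introduced here for illustration only.

```r
tes <- data.frame(x = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                  y = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                  d = c(10, 20, 30, 100, 11, 12, 403, 43, 21))

# Enlarged grid: one extra row/column on every side (y varies fastest)
grid <- expand.grid(y = seq(min(tes$y) - 1, max(tes$y) + 1),
                    x = seq(min(tes$x) - 1, max(tes$x) + 1))[, c("x", "y")]

# Clamp each coordinate to the original range, so that outer cells
# point at their nearest original neighbour
cx <- pmin(pmax(grid$x, min(tes$x)), max(tes$x))
cy <- pmin(pmax(grid$y, min(tes$y)), max(tes$y))

# Look up d for the clamped coordinates
grid$d <- tes$d[match(paste(cx, cy), paste(tes$x, tes$y))]
grid
```

Here x and y stay numeric, so the result can be passed to ggplot() in the same way as tes1 in the question.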
