How to label individual points with vectors in R

I'm pretty new to coding and R in general. I'm trying to figure out how to create plots both as individual points as well as vectors. I should be getting the same result for both options, but I can't seem to figure out how to correlate the labels for the points when using vectors.
Here's the table I was given
Here's my code for the individual plotting and the plot
plot(
x = NULL,
xlim = c(0, 8),
ylim = c(0, 10),
main = "Problem 3a- Individual Points Function",
xlab = "x",
ylab = "y",
las = 1
)
text( 0.6, 7.5, "A" )
points( 1, 7, pch = 19, cex = 3, col = "navy" )
text( 3.4, 2.5, "B" )
points( 4, 3, pch = 15, cex = 6, col = "blueviolet" )
text( 5.6, 4.0, "C" )
points( 6, 5, pch = 17, cex = 4, col = "firebrick2" )
text( 1.6, 1.5, "D" )
points( 2, 2, pch = 18, cex = 5, col = "cyan3" )
text( 6.8, 3.5, "E" )
points( 7, 4, pch = 16, cex = 2, col = "seagreen3" )
Here's my code for the vector method, with the plot:
plot(
x = NULL,
xlim = c(0, 8),
ylim = c(0, 10),
main = "Problem 3b- Vector Points Function",
xlab = "x",
ylab = "y",
las = 1
)
points(
x = c(1, 4, 6, 2, 7),
y = c(7, 3, 5, 2, 4),
pch = c(19, 15, 17, 18, 16),
cex = c(3, 6, 4, 5, 2),
col = c("navy", "blueviolet", "firebrick2", "cyan3", "seagreen3"),
)
I can't figure out how to label the points in the vector version and place each label at specific coordinates. I've tried Text = ("A", "B", etc.) and also making it a vector, text = c("A", etc.), but I keep getting errors. Any advice and resources would be appreciated.

You can use the text() function as shown below. I added the xDisp variable so the horizontal position of the labels is easy to adjust (if needed, you can add a yDisp variable for the vertical position as well).
xDisp = -0.5
plot(
x = NULL,
xlim = c(0, 8),
ylim = c(0, 10),
main = "Problem 3b- Vector Points Function",
xlab = "x",
ylab = "y",
las = 1
)
points(
x = c(1, 4, 6, 2, 7),
y = c(7, 3, 5, 2, 4),
pch = c(19, 15, 17, 18, 16),
cex = c(3, 6, 4, 5, 2),
col = c("navy", "blueviolet", "firebrick2", "cyan3", "seagreen3")
)
text(
x = c(1, 4, 6, 2, 7) + xDisp,
y = c(7, 3, 5, 2, 4),
labels = c("A", "B", "C", "D", "E")
)
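A slightly more compact variant, offered as a sketch rather than the only way to do it: store the coordinates once and reuse them for both points() and text(). The names px, py, and lab are made up for this illustration. text() also accepts a pos argument (1 = below, 2 = left, 3 = above, 4 = right) together with offset, measured in character widths, which can stand in for the manual xDisp shift.

```r
# Coordinates and labels stored once (px, py, lab are illustrative names)
px  <- c(1, 4, 6, 2, 7)
py  <- c(7, 3, 5, 2, 4)
lab <- c("A", "B", "C", "D", "E")

# Empty plotting region with the same limits as above
plot(x = NULL, xlim = c(0, 8), ylim = c(0, 10),
     xlab = "x", ylab = "y", las = 1)

# One vectorised call draws all five points
points(px, py,
       pch = c(19, 15, 17, 18, 16),
       cex = c(3, 6, 4, 5, 2),
       col = c("navy", "blueviolet", "firebrick2", "cyan3", "seagreen3"))

# pos = 2 places each label to the left of its point;
# offset is the distance in character widths
text(px, py, labels = lab, pos = 2, offset = 1.5)
```

Because every argument is a vector of length five, the i-th label always goes with the i-th point. Larger symbols may need a larger offset.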

Related

ggbetweenstats: logarithmic y axis removes grouped analysis from plot

I am conducting a Kruskal-Wallis test to check for statistically significant differences in a measurement between three groups. I use ggbetweenstats to determine which pairs of groups differ significantly.
Here is the code for sample data and the plot:
sampledata <- structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20), group = c(1, 2, 3, 1, 2, 3,
1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2), measurement = c(0,
1, 200, 30, 1000, 6000, 1, 0, 0, 10000, 20000, 700, 65, 1, 8,
11000, 13000, 7000, 500, 3000)), class = "data.frame", row.names = c(NA,
20L))
library(ggstatsplot)
library(ggplot2)
ggbetweenstats(
data = sampledata,
x = group,
y = measurement,
type = "nonparametric",
plot.type = "box",
pairwise.comparisons = TRUE,
pairwise.display = "all",
centrality.plotting = FALSE,
bf.message = FALSE
)
You can see the results of the Kruskal-Wallis test at the top of the plot, as well as the pairwise group comparisons inside the plot. Now I want to change the y axis to a logarithmic scale:
ggbetweenstats(
data = sampledata,
x = group,
y = measurement,
type = "nonparametric",
plot.type = "box",
pairwise.comparisons = TRUE,
pairwise.display = "all",
centrality.plotting = FALSE,
bf.message = FALSE
) +
ggplot2::scale_y_continuous(trans=scales::pseudo_log_trans(sigma = 1, base = exp(1)), limits = c(0,25000), breaks = c(0,1,10,100,1000,10000)
)
However, this removes the grouped analysis. I have tried different scaling solutions and browsed SO for a solution but couldn't find anything. Thank you for your help!
It seems that the y_position parameter in the geom_signif component is not affected by the y axis transformation. You will need to pass the log values of the desired bracket heights manually. In theory, you can pass these via the ggsignif.args parameter, but it seems that in the latest version of ggstatsplot this isn't possible because the y_position is hard-coded.
One way around this is to store the plot and then change the y positions after the fact. Here's a full reprex with the latest versions of ggplot2, ggstatsplot and their dependencies (at the time of writing):
sampledata <- structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20), group = c(1, 2, 3, 1, 2, 3,
1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2), measurement = c(0,
1, 200, 30, 1000, 6000, 1, 0, 0, 10000, 20000, 700, 65, 1, 8,
11000, 13000, 7000, 500, 3000)), class = "data.frame", row.names = c(NA,
20L))
library(ggstatsplot)
library(ggplot2)
library(scales)
p <- ggbetweenstats(
data = sampledata,
x = group,
y = measurement,
type = "nonparametric",
plot.type = "box",
pairwise.comparisons = TRUE,
pairwise.display = "all",
centrality.plotting = FALSE,
bf.message = FALSE
) + scale_y_continuous(trans = pseudo_log_trans(sigma = 1, base = exp(1)),
limits = c(0, exp(13)),
breaks = c(0, 10^(0:5)),
labels = comma)
#> Scale for y is already present.
#> Adding another scale for y, which will replace the existing scale.
i <- which(sapply(p$layers, function(x) inherits(x$geom, "GeomSignif")))
p$layers[[i]]$stat_params$y_position <- c(10, 10.8, 11.6)
p
Created on 2023-01-15 with reprex v2.0.2
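To see where values such as 10, 10.8 and 11.6 can come from, here is a small sketch that pushes bracket heights through the same pseudo-log transformation as the axis. The raw heights in heights_raw are hypothetical numbers chosen for illustration, not values taken from ggstatsplot.

```r
library(scales)

# Same transformation object as the one passed to scale_y_continuous()
tr <- pseudo_log_trans(sigma = 1, base = exp(1))

# Hypothetical bracket heights on the original measurement scale
heights_raw <- c(22000, 49000, 109000)

# Corresponding positions on the transformed scale,
# i.e. the kind of values y_position expects here
heights_trans <- tr$transform(heights_raw)
round(heights_trans, 1)
```

With sigma = 1 and base e the transformation is asinh(x / 2), which is close to log(x) for large x, so these heights land near 10, 10.8 and 11.6.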

Plotly node placement in R in a Sankey Diagram

Okay, I can't figure out what I'm doing wrong here.
I want the nodes to be positioned green, yellow, red in descending order. I'm trying to create a number of them, so I don't want to have to position the nodes by hand in Viewer.
I've updated R, and plotly, and everything else I can think of. Through trial and error I think I have the right side in the correct order, but the left side still bedevils me.
fig <- plot_ly(type = 'sankey',
orientation = 'h',
arrangement = 'snap',
node = list(label = c("Low", "Moderate", "High", "-4.9%", "+0%", "+4.9%"),
color = c('green', 'yellow', 'red', 'green', 'yellow', 'red'),
y = c(0, .1, .5, 0, .1, .5),
x = c(0, 0, 0, 1, 1, 1),
pad = 10,
thickness = 20,
line = list(color = 'black',
width = .5)
),
link = list(source = c(0, 0, 0, 1, 1, 1, 2, 2, 2),
target = c(3, 4, 5, 3, 4, 5, 3, 4, 5),
value = c(17,7, 8, 5, 1,10, 5, 8,42)))
fig <- fig %>%
layout()
fig
Edit: To be more specific about my question, I don't understand how the x and y coordinates work. The effect of changing those parameters seems to be very unpredictable, and I can't suss out how they work.
According to this open issue, node.x and node.y arguments for manual positions can't be equal to 0. Changing the 0 values to 0.001 in your code fixes the ordering. It seems that if any 0 values are present, the arguments are ignored with a silent error. I have been digging into this recently and opened a related issue about the documentation and general problems with overriding the node order.
fig <- plot_ly(type = 'sankey',
orientation = 'h',
arrangement = 'snap',
node = list(label = c("Low", "Moderate", "High", "-4.9%", "+0%", "+4.9%"),
color = c('green', 'yellow', 'red', 'green', 'yellow', 'red'),
y = c(0.001, .1, .5, 0.001, .1, .5),
x = c(0.001, 0.001, 0.001, 1, 1, 1),
pad = 10,
thickness = 20,
line = list(color = 'black',
width = .5)
),
link = list(source = c(0, 0, 0, 1, 1, 1, 2, 2, 2),
target = c(3, 4, 5, 3, 4, 5, 3, 4, 5),
value = c(17,7, 8, 5, 1,10, 5, 8,42)))
fig <- fig %>%
layout()
fig
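Since the question mentions building several of these diagrams, the zero replacement can be wrapped in a small helper. This is only a sketch: nudge_zero and its eps argument are hypothetical names, and 0.001 is simply the value used above.

```r
# Hypothetical helper: replace exact zeros with a small positive value
# so that manual node positions contain no 0 entries
nudge_zero <- function(pos, eps = 0.001) {
  ifelse(pos == 0, eps, pos)
}

node_x <- nudge_zero(c(0, 0, 0, 1, 1, 1))
node_y <- nudge_zero(c(0, 0.1, 0.5, 0, 0.1, 0.5))

node_x
node_y
```

The results can then be passed as x = node_x and y = node_y inside the node list of plot_ly().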

Add points to splines

I wish to add points directly on top of the curved spline.
The code here does not work because geom_point places the dots as if the lines were straight. See points #2, #3. I've tried using stat_bspline2 with geom = "point" without success.
Help is much appreciated.
library(tidyverse)
library(ggforce)
data <- tibble (
x = c(10, 15, 17, 17, 20, 22, 22, 23, 25, 25, 27, 29),
y = c(5, 7, 4, 4, 0, 5, 5, 6, 5, 5, 4, 5.5),
g = c("A", "A", "A", "B", "B", "B", "C", "C", "C", "D","D","D"),
pt = c(1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1)
)
data <- data %>%
mutate(pt_x = ifelse(pt == 1, x, NA),
pt_y = ifelse(pt == 1, y, NA))
ggplot(data) +
stat_bspline2(aes(x=x, y=y, color = ..group.., group = g), size = 4, n = 300, geom = "bspline0") +
scale_color_gradientn(colours = c("red", "pink", "green", "white"), guide = F) +
geom_point(aes(pt_x, pt_y), size = 7)

Grey Background in R When Using qcc (quality control charts) Plot

I'm having a problem where my graph is always on a light grey background which looks awful in LaTeX. I've tried using par(bg=NA), par(bg="white") which is what everyone suggests but that literally does nothing...
Here's the code:
# install.packages('qcc')
library(qcc)
nonconforming <- c(3, 4, 6, 5, 2, 8, 9, 4, 2, 6, 4, 8, 0, 7, 20, 6, 1, 5, 7)
samplesize <- rep(50, 19)
control <- qcc(nonconforming, type = "p", samplesize, plot = FALSE)
warn.limits <- limits.p(control$center, control$std.dev, control$sizes, 2)
par(mar = c(5, 3, 1, 3), bg = "blue")
plot(control, restore.par = FALSE, title = "P Chart for Medical Insurance Claims",
xlab = "Day", ylab = "Proportion Defective")
abline(h = warn.limits, lty = 3, col = "blue")
v2 <- c("LWL", "UWL") # the labels for warn.limits
mtext(side = 4, text = v2, at = warn.limits, col = "blue", las = 2)
Check out ?qcc.options() -- specifically, the bg.margin option. The following will change your plot to have a lightgreen background (note: probably not a good choice for LaTeX, but it illustrates the point):
library(qcc)
nonconforming <- c(3, 4, 6, 5, 2, 8, 9, 4, 2, 6, 4, 8, 0, 7, 20, 6, 1, 5, 7)
samplesize <- rep(50, 19)
old <- qcc.options() # save the original options
qcc.options(bg.margin = "lightgreen")
par(mar = c(5, 3, 1, 3))
control <- qcc(nonconforming, type = "p", samplesize, plot = FALSE)
warn.limits <- limits.p(control$center, control$std.dev, control$sizes, 2)
plot(control, restore.par = FALSE, title = "P Chart for Medical Insurance Claims",
xlab = "Day", ylab = "Proportion Defective")
abline(h = warn.limits, lty = 3, col = "blue")
v2 <- c("LWL", "UWL") # the labels for warn.limits
mtext(side = 4, text = v2, at = warn.limits, col = "blue", las = 2)
qcc.options(old) # reset the old options
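For a LaTeX document the same option can be set to white instead. A minimal sketch, assuming a qcc version whose base-graphics plot method honours bg.margin (as the code above does); the shortened chart title is only for illustration.

```r
library(qcc)

nonconforming <- c(3, 4, 6, 5, 2, 8, 9, 4, 2, 6, 4, 8, 0, 7, 20, 6, 1, 5, 7)

old <- qcc.options()              # save the current options
qcc.options(bg.margin = "white")  # white margin around the chart

control <- qcc(nonconforming, type = "p", sizes = rep(50, 19), plot = FALSE)
plot(control, title = "P Chart", xlab = "Day", ylab = "Proportion Defective")

qcc.options(old)                  # restore the saved options
```

Saving and restoring the options keeps the change from leaking into later charts in the same session.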

How to scale points in R plot?

I have 40 pairs of birds, with each male and female scored for their colour. The colour score is a categorical variable (range of 1 to 9). I would like to plot the frequency of each male/female colour combination. I have created a 'table' with the number of each combination (1/1, 1/2, 1/3, ... 9/7, 9/8, 9/9), then converted it to a vector called 'Colour_Count'. I would like to use 'Colour_Count' for the 'cex' parameter in 'plot' to scale the size of each combination of colours. This does not work because of the order in which the data is read from the table. How do I create a vector with the frequency of each colour combination to scale my plot points?
See data and code below:
## Dataset pairs of males and females and their colour classes
Pair_Colours <- structure(list(Male = c(7, 6, 4, 6, 8, 8, 5, 6, 6, 8, 6, 6, 5,
7, 9, 5, 8, 7, 5, 5, 4, 6, 7, 7, 3, 6, 5, 4, 7, 4, 3, 9, 4, 4,
4, 4, 9, 6, 6, 6), Female = c(9, 8, 8, 9, 3, 6, 8, 5, 8, 9, 7,
3, 6, 5, 8, 9, 7, 3, 6, 4, 4, 4, 8, 8, 6, 7, 4, 2, 8, 9, 5, 6,
8, 8, 4, 4, 5, 9, 7, 8)), .Names = c("Male", "Female"), class = "data.frame", row.names = c(NA,
40L))
Pair_Colours[] <- as.data.frame(lapply(Pair_Colours, factor, levels=1:9))
## table of pair colour values (colours 1 to 9 - categorical variable)
table(Pair_Colours$Male, Pair_Colours$Female)
Colour_Count <- as.vector(table(Pair_Colours$Male, Pair_Colours$Female)) #<- the problem occurs here
## plot results to visually look for possible assortative mating by colour
op<-par(mfrow=c(1,1), oma=c(2,4,0,0), mar=c(4,5,1,2), pty = "s")
plot(1,1, xlim = c(1, 9), ylim = c(1, 9), type="n", xaxt = "n", yaxt = "n", las=1, bty="n", cex.lab = 1.75, cex.axis = 1.5, main = NULL, xlab = "Male Colour", ylab = "Female Colour", pty = "s")
axis(1, at = seq(1, 9, by = 1), labels = T, cex.lab = 1.5, cex.axis = 1.5, tick = TRUE, tck = -0.015, lwd = 1.25, lwd.ticks = 1.25)
axis(2, at = seq(1, 9, by = 1), labels = T, cex.lab = 1.5, cex.axis = 1.5, tick = TRUE, tck = -0.015, lwd = 1.25, lwd.ticks = 1.25, las =2)
points(Pair_Colours$Male, Pair_Colours$Female, pch = 21, cex = Colour_Count, bg = "darkgray", col = "black", lwd = 1)
You can summarise your data with the ddply() function from the plyr package and then use the resulting data frame to plot your data. The counts are in column V1 of the new data frame.
library(plyr)
df<-ddply(Pair_Colours,.(Male,Female),nrow)
df
  Male Female V1
1    3      5  1
2    3      6  1
3    4      2  1
4    4      4  3
points(df$Male, df$Female, pch = 21, cex = df$V1,
bg = "darkgray", col = "black", lwd = 1)
UPDATE - solution using aggregate
Another possibility is to use the aggregate() function. First, add a new column N that contains only the value 1. Then use aggregate() to sum the N values for each Male and Female combination.
Pair_Colours$N<-1
aggregate(N~Male+Female,data=Pair_Colours,FUN=sum)
  Male Female N
1    4      2 1
2    6      3 1
3    7      3 1
4    8      3 1
5    4      4 3
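A base R alternative, offered as a sketch and not as part of the answer above: as.data.frame() on a table gives one row per combination with its count in a Freq column. It also shows why the original attempt failed. as.vector(table(...)) returns one entry per table cell (81 cells for a 9 x 9 table), while points() was given the 40 raw pairs, so sizes and points did not line up. The name counts is made up for this example.

```r
Male   <- c(7, 6, 4, 6, 8, 8, 5, 6, 6, 8, 6, 6, 5, 7, 9, 5, 8, 7, 5, 5,
            4, 6, 7, 7, 3, 6, 5, 4, 7, 4, 3, 9, 4, 4, 4, 4, 9, 6, 6, 6)
Female <- c(9, 8, 8, 9, 3, 6, 8, 5, 8, 9, 7, 3, 6, 5, 8, 9, 7, 3, 6, 4,
            4, 4, 8, 8, 6, 7, 4, 2, 8, 9, 5, 6, 8, 8, 4, 4, 5, 9, 7, 8)

# One row per Male/Female combination, with the count in column Freq
counts <- as.data.frame(table(Male = Male, Female = Female))
counts <- counts[counts$Freq > 0, ]   # drop combinations that never occur

# table() turns the categories into factors; convert them back to numbers
counts$Male   <- as.numeric(as.character(counts$Male))
counts$Female <- as.numeric(as.character(counts$Female))

# Each combination is drawn once, scaled by how often it occurs
plot(counts$Male, counts$Female, xlim = c(1, 9), ylim = c(1, 9),
     pch = 21, cex = counts$Freq, bg = "darkgray",
     xlab = "Male Colour", ylab = "Female Colour")
```

Here the coordinates and the sizes come from the same rows of counts, so they cannot get out of step.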
