Boxplot disappears when adjusting y lim - r

I want the y axis of my boxplot to go from 0 to 20,000, but when I add the argument ylim = c(0,20000), the entire boxplot disappears.
Here is my code:
bp.gender <- boxplot(
music_data.frame$Income ~ music_data.frame$Gender,
xlab = "Gender",
ylab = "Income",
main = "Income distribution for Gender",
col = "red",
ylim = c(0,20000)
)
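The question does not show the data, so the cause cannot be confirmed. One common reason for an empty panel is that every value lies outside the requested ylim. The sketch below uses made-up data (the `income` and `gender` vectors are hypothetical stand-ins for the columns of music_data.frame) to show how one might check the type and range before fixing the limits:

```r
# Hypothetical stand-in data; the real music_data.frame is not shown in the question.
set.seed(1)
income <- round(runif(50, min = 30000, max = 90000))
gender <- sample(c("Female", "Male"), 50, replace = TRUE)

# Check the type and the range before fixing the axis limits.
is.numeric(income)
income_range <- range(income, na.rm = TRUE)
income_range

# Count the observations that fall inside the requested limits.
requested_ylim <- c(0, 20000)
n_inside <- sum(income >= requested_ylim[1] & income <= requested_ylim[2],
                na.rm = TRUE)
n_inside  # 0 here, so boxplot(..., ylim = requested_ylim) would show no boxes

# Limits derived from the data keep the boxes visible.
boxplot(income ~ gender, ylim = c(0, max(income_range)),
        xlab = "Gender", ylab = "Income")
```

If is.numeric() returns FALSE for the real column (for example because it was read in as text), that would be another thing to rule out before adjusting ylim.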

Related

RDA triplot in R- plot only numeric explanatory variables as arrows; factors as centroids

I ran a distance-based RDA using capscale() in the vegan library in R and I am trying to plot my results as a custom triplot. I only want numeric or continuous explanatory variables to be plotted as arrows/vectors. Currently, both factors and numeric explanatory variables are being plotted with arrows, and I want to remove arrows for factors (site and year) and plot centroids for these instead.
dbRDA=capscale(species ~ canopy+gmpatch+site+year+Condition(pair), data=env, dist="bray")
To plot I extracted % explained by the first 2 axes as well as scores (coordinates in RDA space)
perc <- round(100*(summary(spe.rda.signif)$cont$importance[2, 1:2]), 2)
sc_si <- scores(spe.rda.signif, display="sites", choices=c(1,2), scaling=1)
sc_sp <- scores(spe.rda.signif, display="species", choices=c(1,2), scaling=1)
sc_bp <- scores(spe.rda.signif, display="bp", choices=c(1, 2), scaling=1)
I then set up a blank plot with scaling, axes, and labels
dbRDAplot<-plot(spe.rda.signif,
scaling = 1, # set scaling type
type = "none", # this excludes the plotting of any points from the results
frame = FALSE,
# set axis limits
xlim = c(-1,1),
ylim = c(-1,1),
# label the plot (title, and axes)
main = "Triplot db-RDA - scaling 1",
xlab = paste0("db-RDA1 (", perc[1], "%)"),
ylab = paste0("db-RDA2 (", perc[2], "%)"))
Created a legend and added points for site scores and text for species
pchh <- c(2, 17, 1, 19)
ccols <- c("black", "red", "black", "red")
legend("topleft", c("2016 MC", "2016 SP", "2018 MC", "2018 SP"), pch = pchh[unique(as.numeric(as.factor(env$siteyr)))], pt.bg = ccols[unique(as.factor(env$siteyr))], bty = "n")
points(sc_si,
pch = pchh[as.numeric(as.factor(env$siteyr))], # set shape
col = ccols[as.factor(env$siteyr)], # outline colour
bg = ccols[as.factor(env$siteyr)], # fill colour
cex = 1.2) # size
text(sc_sp, # use sc_sp + c(0.02, 0.08) to adjust text coordinates and avoid overlap with points
labels = rownames(sc_sp),
col = "black",
font = 1, # plain text (font = 2 would be bold)
cex = 0.7)
Here is where I add arrows for explanatory variables, but I want to be selective and do so for the numeric variables only (canopy and gmpatch). I want to plot the variables site and year as centroids instead, but I am unsure how to do this. Note that both are already stored as factors.
arrows(0,0, # start them from (0,0)
sc_bp[,1], sc_bp[,2], # end them at the score value
col = "red",
lwd = 2)
text(x = sc_bp[,1] -0.1, # adjust text coordinate to avoid overlap with arrow tip
y = sc_bp[,2] - 0.03,
labels = rownames(sc_bp),
col = "red",
cex = 1,
font = 1)
@JariOksanen, thank you for your answer. I was able to fix the problem with the following:
text(dbRDA, choices = c(1, 2),"cn", arrow=FALSE, length=0.05, col="red", cex=0.8, xpd=TRUE)
text(dbRDA, display = "bp", labels = c("canopy", "gmpatch"), choices = c(1, 2),scaling = "species", arrow=TRUE, select = c("canopy", "gmpatch"), col="red", cex=0.8, xpd = TRUE)

Adding a legend to a barplot at the top left

barcols <- c("green","red","purple")
barcols
barplot(table(gender$Alert.Level, gender$Gender),las=1, beside= TRUE, ylab= "Frequency", xlab="gender", cex.names=0.7, main="Frequency of violations by gender" , col=barcols)
How would you go about adding a legend to the top left, where green = alert level 1, red = alert level 2, and purple = alert level 3?
Here are two plots, one with a vertical legend and the other with a horizontal legend.
Note that the y-axis limits are extended (ylim = c(0, 17)) so that the legend does not overplot the bars.
set.seed(2022)
al <- c("alert level 1", "alert level 2", "alert level 3")
al <- factor(al, levels = al)
gender <- data.frame(Alert.Level = sample(al, 100, TRUE),
Gender = sample(c("Female", "Male", "Unknown"), 100, TRUE))
barcols <- c("green","red","purple")
barplot(
table(gender$Alert.Level, gender$Gender),
las=1, beside = TRUE,
#
ylim = c(0, 17),
#
ylab= "Frequency", xlab="gender",
cex.names=0.7,
main="Frequency of violations by gender",
col = barcols
)
lgd <- sort(unique(gender$Alert.Level))
legend("topleft", legend = lgd, fill = barcols)
barplot(
table(gender$Alert.Level, gender$Gender),
las=1, beside= TRUE,
#
ylim = c(0, 17),
#
ylab= "Frequency", xlab="gender",
cex.names=0.7,
main="Frequency of violations by gender",
col = barcols
)
legend("topleft", legend = lgd, fill = barcols, horiz = TRUE)
Created on 2022-03-23 by the reprex package (v2.0.1)
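As an alternative sketch, barplot() can draw the legend itself through its legend.text and args.legend arguments, which avoids a separate legend() call. The `counts` matrix below is made up for illustration and stands in for table(gender$Alert.Level, gender$Gender); the 1.4 headroom factor is likewise an assumption:

```r
# Made-up counts standing in for table(gender$Alert.Level, gender$Gender).
counts <- matrix(
  c(12, 9, 11,
    10, 14, 8,
    13, 12, 11),
  nrow = 3,
  dimnames = list(
    c("alert level 1", "alert level 2", "alert level 3"),
    c("Female", "Male", "Unknown")
  )
)
barcols <- c("green", "red", "purple")

mids <- barplot(
  counts,
  beside = TRUE, las = 1,
  col = barcols,
  ylim = c(0, max(counts) * 1.4),          # headroom so the legend clears the bars
  ylab = "Frequency", xlab = "gender",
  main = "Frequency of violations by gender",
  legend.text = TRUE,                       # use the row names as legend labels
  args.legend = list(x = "topleft", bty = "n")
)
```

With legend.text = TRUE the labels come from the row names of the matrix, so the legend order follows the bar colours without any manual matching.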

Do we need to call dev.off() after creating a pdf file?

When I call dev.off(), my PDF gets created, but I get the following message: "null device 1".
I don't get any warning when I remove dev.off(), and my PDF still gets created, so why do I need to call dev.off()?
plot(x # independent variable (population_density)
, y # dependent variable (case_fatality_rate)
, main = "ScatterPlot - Case Fatality Rate vs Population Density Per Square Mile" # chart title
, xlab = "Population Density Per Square Mile" # x-axis label
, ylab = "Case Fatality Rate" # y-axis label
, pch = 19 # point shape (filled circle)
, frame = TRUE # surround chart with a frame
, xlim = c(0, 1200), ylim = c(0, 3)
)
model <- lm(y ~ x, data = dataset) # compute the linear model
abline(model, col = "blue") # draw the model as a blue line
hist(y # dependent variable (case_fatality_rate)
, main = "Histogram - Case Fatality Rate Frequency" # chart title
, xlab = "Case Fatality Rate",
ylab = "Frequency",
col = "#f0ffff",
breaks = 15,
freq = FALSE,
prob = TRUE,
xlim = c(0.5,2.5),
ylim = c(0.0,2.0)
)
lines(density(y, adjust=1.2), col="blue", lwd=2)
grid(nx = NA, ny = NULL,
lty = 1, col = "gray", lwd = 1)
dev.off()
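On the message itself: "null device 1" is not a warning. dev.off() returns the name and number of the device that becomes current after closing, and the console prints that return value. Closing the device is also the point at which the file is finalized; without it, the file may exist but stay incomplete until the R session ends. A minimal sketch, assuming a temporary output path and the built-in `cars` data (both are stand-ins, not taken from the question):

```r
out_file <- tempfile(fileext = ".pdf")  # hypothetical output path

pdf(out_file)                 # open a PDF graphics device
plot(cars$speed, cars$dist, pch = 19,
     xlab = "Speed", ylab = "Stopping distance")
closed <- dev.off()           # close the device; the file is finalized here

closed                        # named integer for the device that is now current
file.exists(out_file)
file.size(out_file) > 0

# To close the device without printing the return value:
# invisible(dev.off())
```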

How can I make the circles of my plot smaller in R?

This is the code I used:
resources <- read.csv("https://raw.githubusercontent.com/umbertomig/intro-prob-stat-FGV/master/datasets/resources.csv")
res <- subset(resources, select = c("cty_name", "year", "regime",
"oil", "logGDPcp", "illit"))
resNoNA <- na.omit(res)
resNoNAS <- scale(resNoNA[, 3:6])
colMeans(resNoNA[, 3:6])
apply(resNoNA[, 3:6], 2, sd)
cluster2 <- kmeans(resNoNAS, centers = 2)
table(cluster2$cluster)
## this gives standardized answer, which is hard to interpret
cluster2$centers
## better to subset the original data and then compute means
g1 <- resNoNA[cluster2$cluster == 1, ]
colMeans(g1[, 3:6])
g2 <- resNoNA[cluster2$cluster == 2, ]
colMeans(g2[, 3:6])
plot(x = resNoNA$logGDPcp, y = resNoNA$illit, main = "Illiteracy v GDP",
xlab = "GDP per Capita", ylab = "Illiteracy",
col = cluster2$cluster, cex = resNoNA$oil)
But I want to make the circles smaller so that they fit within the limits of the graph.
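You control the circle size with cex= here. Because cex is a multiplier of the default symbol size, dividing resNoNA$oil by a constant shrinks every circle proportionally. Compare the original call with two scaled versions: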
plot(x = resNoNA$logGDPcp, y = resNoNA$illit, main = "Illiteracy v GDP",
xlab = "GDP per Capita", ylab = "Illiteracy",
col = cluster2$cluster, cex = resNoNA$oil)
plot(x = resNoNA$logGDPcp, y = resNoNA$illit, main = "Illiteracy v GDP",
xlab = "GDP per Capita", ylab = "Illiteracy",
col = cluster2$cluster, cex = resNoNA$oil/3)
plot(x = resNoNA$logGDPcp, y = resNoNA$illit, main = "Illiteracy v GDP",
xlab = "GDP per Capita", ylab = "Illiteracy",
col = cluster2$cluster, cex = resNoNA$oil/5)
Keep in mind, however, that if the plot is produced by an automated report generator (e.g., rmarkdown, shiny), you may need to approach the problem from the other direction: adjust the dimensions of the plot, or update xlim and ylim so that the circles fit.
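Dividing by a constant works when the values happen to sit in a convenient range. A sketch of a more general approach is to rescale the variable linearly onto a chosen range of symbol sizes. The helper name `rescale_cex`, the target range 0.5 to 3, and the `oil` values are assumptions for illustration and are not part of the original answer:

```r
# Hypothetical helper: map a numeric vector linearly onto [min_cex, max_cex].
rescale_cex <- function(v, min_cex = 0.5, max_cex = 3) {
  rng <- range(v, na.rm = TRUE)
  if (rng[1] == rng[2]) {
    # Constant input: fall back to a single mid-sized symbol.
    return(rep((min_cex + max_cex) / 2, length(v)))
  }
  min_cex + (v - rng[1]) / (rng[2] - rng[1]) * (max_cex - min_cex)
}

# Made-up values standing in for resNoNA$oil.
oil <- c(0, 2.5, 10, 40, 85)
sizes <- rescale_cex(oil)
range(sizes)

plot(seq_along(oil), oil, cex = sizes, xlab = "Index", ylab = "oil")
```

In the plot call from the question this would be used as cex = rescale_cex(resNoNA$oil), which keeps the relative ordering of the circles while bounding the largest one.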

How to add legend in a 3D scatterplot

I have a 3D scatter plot that looks like this
and the code associated with it is as follows
nr = c(114,114,1820,100,100)
acc = c(70.00,45.00,98.89,82.00,74.90)
ti = c(25.00,87.50,0.25,41.40,51.30)
label = c(1, 2, 3, 4, 5)
data = data.frame(nr, acc, ti, label)
library(scatterplot3d)
scatterplot3d(data$nr, data$acc, data$ti, main = "3D Plot - Requirements, Accuracy & Time", xlab = "Number of requirements", ylab = "Accuracy", zlab = "Time", pch = data$label, angle = 45)
Now, I want to add a legend to the bottom right to indicate what those symbols mean
tech <- c('BPL','W','RT','S','WSM')
For instance, the triangle stands for BPL, + for RT and so on
You can try this:
library(scatterplot3d)
# define a plot
s3d <-scatterplot3d(data$nr, data$acc, data$ti, main = "3D Plot - Requirements, Accuracy & Time",
xlab = "Number of requirements", ylab = "Accuracy", zlab = "Time", pch = data$label, angle = 45)
# add a legend at the bottom right, as asked; a keyword position needs no explicit coordinates
legend("bottomright", pch = data$label,
# here you define the labels in the legend
legend = c('BPL','W','RT','S','WSM'), cex = 1.1
)
