How do I modify labels produced by scales package? - r

So I'm making pyramid visualizations. I'm using scale_y_continuous(labels = scales::label_number_si(accuracy = 0.1)) to produce the labels. However, I want to get rid of the negative sign on the female section of the graph.
I think the best way to keep the SI suffixes but remove the negative sign is to modify the labels output by label_number_si. However, labels = abs(label_number_si()) gives the following error: Error in abs: non-numeric argument to mathematical function
Any insight is appreciated.
EDIT: For a reproducible example, use demo_continuous(c(-1e10, 1e10), labels = label_number_si()). The labels should look as they do below, EXCEPT that negative numbers should not have a "-" in front.

I bet there's a simpler way to do this but I haven't figured it out yet.
Here's an example that replicates your question's result using the normal scales::label_number_si:
ggplot(data = data.frame(x = 1000*c(-5:-1, 1:5),
type = rep(1:2, each = 5))) +
geom_col(aes(x,abs(x),fill = type), orientation = "y") +
scale_x_continuous(labels = scales::label_number_si()) +
facet_wrap(~type, scales = "free_x")
We could make a custom version of scales::label_number_si which makes them absolute values in the last step. To make this, I used command-click (Mac OS X) on the function name to see the underlying function's code, and then just pasted that into a new function definition with minor modifications.
label_number_si_abs <- function (accuracy = 1, unit = NULL, sep = NULL, ...)
{
sep <- if (is.null(unit))
""
else " "
function(x) {
breaks <- c(0, 10^c(K = 3, M = 6, B = 9, T = 12))
n_suffix <- cut(abs(x), breaks = c(unname(breaks), Inf),
labels = c(names(breaks)), right = FALSE)
n_suffix[is.na(n_suffix)] <- ""
suffix <- paste0(sep, n_suffix, unit)
scale <- 1/breaks[n_suffix]
scale[which(scale %in% c(Inf, NA))] <- 1
scales::number(abs(x), accuracy = accuracy, scale = unname(scale),
suffix = suffix, ...)
}
}
We can then swap in the custom function to get absolute-value labels:
ggplot(data = data.frame(x = 1000*c(-5:-1, 1:5),
type = rep(1:2, each = 5))) +
geom_col(aes(x,abs(x),fill = type), orientation = "y") +
scale_x_continuous(labels = label_number_si_abs()) +
facet_wrap(~type, scales = "free_x")
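As for the error in the question: label_number_si() returns a function, not numbers, so abs() has nothing numeric to work on. A shorter alternative to copying the source is to wrap the labeller so that it only ever receives absolute values. This is a minimal sketch; abs_labels and plain are made-up names, and plain is just a base-R stand-in so the block runs without extra packages. Newer scales releases deprecate label_number_si() in favour of label_number(scale_cut = cut_short_scale()), and the wrapper should work with either.

```r
# Wrap any labelling function so it only ever sees absolute values.
# (abs_labels is a hypothetical helper, not part of scales.)
abs_labels <- function(labeller) {
  force(labeller)
  function(x) labeller(abs(x))
}

# Base-R stand-in labeller, used only to demonstrate the behaviour.
plain <- function(x) format(x, big.mark = ",", trim = TRUE)
out <- abs_labels(plain)(c(-5000, 0, 5000))
print(out)

# In the pyramid plot this would be used as, for example:
# scale_y_continuous(labels = abs_labels(scales::label_number_si(accuracy = 0.1)))
```

The breaks themselves stay negative, so the bars still point the right way; only the text of the labels changes.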

Related

How to add tick marks on a plot that is not from plot() in R

I use a R package, SetMethods, to get the fsQCA results of panel data. In the package, it uses cluster.plot() function to generate a plot.
However, I have a hard time getting the x-axis of the graph to show the number of units as tick marks. For example, I want it to show 10, 20, 30, ..., 140 on the x-axis, so that I can see how many units have a consistency score lower than a certain point.
Is there any method to add tick marks on a plot that is not generated by plot() function? Thanks in advance.
Here I use the dataset in the package as an example.
install.packages("SetMethods")
library(SetMethods)
data("PAYF")
PS <- minimize(data = PAYF,
outcome = "HL",
conditions = c("HE","GG","AH","HI","HW"),
incl.cut = 0.9,
n.cut = 2,
include = "?",
details = TRUE,
show.cases = TRUE)
PS
# Perform cluster diagnostics:
CB <- cluster(data = PAYF,
results = PS,
outcome = "HL",
unit_id = "COUNTRY",
cluster_id = "REGION",
necessity=FALSE,
wicons = FALSE)
CB
# Plot pooled, between, and within consistencies:
cluster.plot(cluster.res = CB,
labs = TRUE,
size = 8,
angle = 6,
wicons = TRUE)
Finally, I get a graph as follows.
If you look inside the cluster.plot function definition (in RStudio, press F2 while the pointer is on it) you will see that it uses ggplot2 under the hood. However, it doesn't return the ggplot2 objects; it just prints them one after another. Because of this, it's not really possible to modify the output afterwards in any convenient manner.
But you can always copy the function code and rewrite it for your own need. The part that prints the final plot in your case is
CTw <- list()
ticklabw = unique(as.character(cluster.res$unit_ids))
xtickw <- seq(1, length(ticklabw), by = 1)
if (class(cluster.res) == "clusterminimize") {
for (i in 1:length(cluster.res$output)) {
CTw[[i]] <- cluster.res$output[[i]]$WICONS
dtw <- data.frame(x = xtickw, y = CTw[[i]])
dtw <- dtw[order(dtw$y), ]
dtw$xr <- reorder(dtw$x, 1 - dtw$y)
pw <- ggplot(dtw, aes(y = dtw[, 2], x = dtw[,
3])) + geom_point() + ylim(0, 1) + theme_classic(base_size = 16) +
geom_hline(yintercept = cluster.res$output[[i]]$POCOS) +
labs(title = names(cluster.res$output[i]),
x = "Units", y = "Consistency") + theme(axis.text.x = element_blank())
suppressWarnings(print(pw))
}
}
You can modify the ggplot2 construction part to something like this (packages ggplot2 and dplyr need to be loaded):
pw <-
dtw %>%
mutate(x_ind = as.numeric(xr)) %>%
ggplot(aes(x_ind, y)) +
geom_point() +
ylim(0, 1) +
theme_classic(base_size = 16) +
geom_hline(yintercept = cluster.res$output[[i]]$POCOS) +
scale_x_continuous(breaks = seq(from = 0, to = 140, by = 10)) +
labs(title = names(cluster.res$output[i]),
x = "Units", y = "Consistency")
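The key change is that x becomes a number (the unit's position when sorted by consistency) instead of a factor, which is what lets scale_x_continuous() place ticks every 10 units. Here is a self-contained sketch of that idea on made-up data; the data frame below is a toy stand-in for one element of cluster.res$output, not the PAYF results, and rank() is used in place of the reorder() plus as.numeric() steps.

```r
library(ggplot2)

# Toy stand-in: 140 units with made-up within-consistency scores.
set.seed(42)
dtw <- data.frame(x = 1:140, y = runif(140))

# Position of each unit when sorted from highest to lowest consistency.
dtw$x_ind <- rank(-dtw$y, ties.method = "first")

pw <- ggplot(dtw, aes(x_ind, y)) +
  geom_point() +
  ylim(0, 1) +
  theme_classic(base_size = 16) +
  # A numeric x axis accepts explicit break positions.
  scale_x_continuous(breaks = seq(from = 0, to = 140, by = 10)) +
  labs(x = "Units", y = "Consistency")
pw
```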

Superscripts within ggplot2's axis text

I would like to create a graph that has superscripts on the axis instead of displaying unformatted numbers using ggplot2. I know that there are a lot of answers which change the axis label, but not the axis text. I am not trying to change the label of the graph, but the text on the axis.
Example:
x<-c('2^-5','2^-3','2^-1','2^1','2^2','2^3','2^5','2^7','2^9','2^11','2^13')
y<-c('2^-5','2^-3','2^-1','2^1','2^2','2^3','2^5','2^7','2^9','2^11','2^13')
df<-data.frame(x,y)
p<-ggplot()+
geom_point(data=df,aes(x=x,y=y),size=4)
p
So I would like the x-axis to display the same numbers, but as superscripts and without the caret.
EDIT:
An approach that relies mostly on base R (dplyr is only used for the pipe and mutate_all):
df %>%
mutate_all(as.character)->new_df
res<-unlist(Map(function(x) eval(parse(text=x)),new_df$x))#replace with y for y
to_use<-unlist(lapply(res,as.expression))
split_text<-strsplit(gsub("\\^"," ",names(to_use))," ")
join_1<-as.numeric(sapply(split_text,"[[",1)) #tidyr::separate might help, less robust for numeric(I think)
join_2<-as.numeric(sapply(split_text,"[[",2))
to_use_1<-sapply(seq_along(join_1),function(x) parse(text=paste(join_1[x],"^",
join_2[x])))
The above can be reduced to fewer steps; I posted the stepwise approach I took. The result for x only (the same can be done for y):
new_df %>%
ggplot()+
geom_point(aes(x=x,y=y),size=4)+
scale_x_discrete(breaks=df$x,labels=to_use_1)#replace with y and scale_y_discrete for y
Plot:
Original and erroneous answer:
I have deviated from standard tidyverse practice by using $. You could replace it with . and it might work, although in this case it's not really important, since the focus is on the labels:
library(dplyr)
df %>%
mutate(new_x=gsub("\\^"," ",x),
new_y=gsub("\\^"," ",y))->new_df
new_df %>%
ggplot()+
geom_point(aes(x=x,y=y),size=4)+
scale_x_discrete(breaks=x,labels=new_df$new_x)+
scale_y_discrete(breaks=y,labels=new_df$new_y)
This can be done with functions scale_x_log2 and scale_y_log2 that can be found in GitHub package jrnoldmisc.
First, install the package.
devtools::install_github("jrnold/rubbish")
Then, coerce the variables to numeric. I will work with a copy of the original dataframe.
df1 <- df
df1[] <- lapply(df1, function(x){
x <- as.character(x)
sapply(x, function(.x)eval(parse(text = .x)))
})
Now, graph it.
library(jrnoldmisc)
library(ggplot2)
library(MASS)
library(scales)
a <- ggplot(df1, aes(x = x, y = y, size = 4)) +
geom_point(show.legend = FALSE) +
scale_x_log2(limits = c(0.01, NA),
labels = trans_format("log2", math_format(2^.x)),
breaks = trans_breaks("log2", function(x) 2^x, n = 10)) +
scale_y_log2(limits = c(0.01, NA),
labels = trans_format("log2", math_format(2^.x)),
breaks = trans_breaks("log2", function(x) 2^x, n = 10))
a + annotation_logticks(base = 2)
Edit.
Following the discussion in the comments, here are two other ways that were seen to give different axis labels.
Axis labels at every tick mark: set limits = c(1.01, NA) and the function argument n = 11, an odd number.
Axis labels on odd-numbered exponents: keep limits = c(0.01, NA) and change to function(x) 2^(x - 1), n = 11.
Just the instructions, no plots.
The first:
a <- ggplot(df1, aes(x = x, y = y, size = 4)) +
geom_point(show.legend = FALSE) +
scale_x_log2(limits = c(1.01, NA),
labels = trans_format("log2", math_format(2^.x)),
breaks = trans_breaks("log2", function(x) 2^(x), n = 11)) +
scale_y_log2(limits = c(1.01, NA),
labels = trans_format("log2", math_format(2^.x)),
breaks = trans_breaks("log2", function(x) 2^(x), n = 11))
a + annotation_logticks(base = 2)
And the second.
a <- ggplot(df1, aes(x = x, y = y, size = 4)) +
geom_point(show.legend = FALSE) +
scale_x_log2(limits = c(0.01, NA),
labels = trans_format("log2", math_format(2^.x)),
breaks = trans_breaks("log2", function(x) 2^(x - 1), n = 11)) +
scale_y_log2(limits = c(0.01, NA),
labels = trans_format("log2", math_format(2^.x)),
breaks = trans_breaks("log2", function(x) 2^(x - 1), n = 11))
a + annotation_logticks(base = 2)
You can provide a function to the labels argument of the scale_x_*** and scale_y_*** functions to generate labels with superscripts (or other formatting). See examples below.
library(jrnoldmisc)
library(ggplot2)
df<-data.frame(x=2^seq(-5,5,2),
y=2^seq(-5,5,2))
ggplot(df) +
geom_point(aes(x=x,y=y),size=2) +
scale_x_log2(breaks=2^seq(-5,5,2),
labels=function(x) parse(text=paste("2^",round(log2(x),2))))
ggplot(df) +
geom_point(aes(x=x,y=y),size=2) +
scale_x_continuous(breaks=c(2^-5, 2^seq(1,5,2)),
labels=function(x) parse(text=paste("2^",round(log2(x),2))))
ggplot(df) +
geom_point(aes(x=x,y=y),size=2) +
scale_x_log10(breaks=10^seq(-1,1,1),
labels=function(x) parse(text=paste("10^",round(log10(x),2))))
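If the axis values stay as the character strings from the question, a lighter-weight variant of the same idea is to parse each string as a plotmath expression, so that "2^-5" is drawn as 2 with a superscript. A minimal sketch, assuming only ggplot2; the helper name to_plotmath is made up for illustration, and the factor levels are set explicitly so the axis keeps the original order rather than sorting alphabetically.

```r
library(ggplot2)

vals <- c('2^-5', '2^-3', '2^-1', '2^1', '2^2', '2^3',
          '2^5', '2^7', '2^9', '2^11', '2^13')

# Keep the strings as ordered factor levels so the discrete axis follows them.
df <- data.frame(x = factor(vals, levels = vals),
                 y = factor(vals, levels = vals))

# Turn label strings such as "2^-5" into plotmath expressions.
to_plotmath <- function(labels) parse(text = labels)

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point(size = 4) +
  scale_x_discrete(labels = to_plotmath) +
  scale_y_discrete(labels = to_plotmath)
p
```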

Pass a variable-length string as an aesthetic to ggplot2 based on number of columns

I'd like to be able to build a string based on the number of columns in my matrix and pass that to ggplot as an aesthetic. This doesn't seem to be covered by the aes_string() function. The reason I want this is that I'm using the ggalluvial package but the intricacies matter less than the principle. My code looks like this:
library(ggplot2)
library(ggalluvial)
my_alluvial_plot <- function(scores, n_groups = 5) {
score_names <- names(scores)
scr_mat <- data.matrix(scores)
n_cols <- ncol(scores)
# create ntiles of scores so that flow can be seen between groups
ranks <- apply(scr_mat, 2, function(x) {
rk <- dplyr::ntile(x, n_groups)
return(as.factor(rk))
})
to_plot <- data.frame(ranks)
# build the string for the aes() function
a_string <- ""
for (i in 1:n_cols) {
a_string <- paste0(a_string, "axis", i, " = to_plot[, ", i, "],")
}
# remove final comma
a_string <- substr(a_string, 1, nchar(a_string) - 1)
ggplot(to_plot,
aes(eval(a_string))) +
geom_alluvium(aes(fill = to_plot[, n_cols], width = 1/12)) +
geom_stratum(width = 1/12, fill = "black", color = "grey") +
scale_x_continuous(breaks = 1:n_cols, labels = score_names) +
scale_fill_brewer(type = "qual", palette = "Set1")
}
df <- data.frame(col1 = runif(10),
col2 = runif(10),
col3 = rnorm(10),
col4 = rnorm(10))
my_alluvial_plot(df)
This produces a blank plot with the following error:
Warning: Ignoring unknown aesthetics: width
Error: Discrete value supplied to continuous scale
Basically, I want to build an alluvial plot that can support an arbitrary number of columns, so the ggplot code as it's evaluated would end up being like
ggplot(to_plot,
aes(axis1 = data[, 1], axis2 = data[, 2], axis3 = data[, 3], ...))
But neither eval() nor parse() produces anything sensible, and aes_string() runs into the same problem. Is there any way to do this systematically?
The reason you can't run parse() or eval() on a string like "axis1 = col1, axis2 = col2" is that such a string is not valid R code by itself. But the entire ggplot call? That can be parsed!
If you rework the plot call like this, it produces the alluvial plot just fine:
gg_string <- paste0("ggplot(to_plot,
aes(", a_string, ")) +
geom_alluvium(aes(fill = to_plot[, n_cols], width = 1/12)) +
geom_stratum(width = 1/12, fill = 'black', color = 'grey') +
scale_x_continuous(breaks = 1:n_cols, labels = score_names) +
scale_fill_brewer(type = 'qual', palette = 'Set1')")
eval(parse(text = gg_string))
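An alternative that avoids eval(parse()) altogether is to build the mapping as a named list of symbols and splice it into aes() with !!!. This is a sketch under the assumption of ggplot2 3.0.0 or later (where aes() supports tidy evaluation); axis_mapping is a hypothetical helper name, and the three-column data frame is a toy stand-in for to_plot.

```r
library(ggplot2)

# Build aes(axis1 = <col 1>, axis2 = <col 2>, ...) for however many
# columns the data frame has. (axis_mapping is a made-up helper.)
axis_mapping <- function(data) {
  axes <- rlang::syms(names(data))
  names(axes) <- paste0("axis", seq_along(axes))
  aes(!!!axes)
}

# Toy stand-in for the ranked scores.
to_plot <- data.frame(col1 = factor(c(1, 2)),
                      col2 = factor(c(2, 1)),
                      col3 = factor(c(1, 1)))

mapping <- axis_mapping(to_plot)
print(names(mapping))

# Inside my_alluvial_plot() the plot call would then start with:
# ggplot(to_plot, axis_mapping(to_plot)) + geom_alluvium(...) + ...
```

Because the mapping refers to columns by name, it keeps working if the data are later filtered or reordered, which indexing with to_plot[, i] inside aes() does not guarantee.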

Need help on ggplot in R, I am getting an error: 'argument "x" is missing, with no default'

I am trying to create a Lowry plot in R but am having difficulty debugging the errors returned. I am using the following code to create the plot:
library(ggplot2)
library(reshape)
m_xylene_data <- data.frame(
Parameter = c(
"BW", "CRE", "DS", "KM", "MPY", "Pba", "Pfaa",
"Plia", "Prpda", "Pspda", "QCC", "QfaC", "QliC",
"QPC", "QspdC", "Rurine", "Vfac", "VliC", "Vmax"),
"Main Effect" = c(
1.03E-01, 9.91E-02, 9.18E-07, 3.42E-02, 9.27E-3, 2.82E-2, 2.58E-05,
1.37E-05, 5.73E-4, 2.76E-3, 6.77E-3, 8.67E-05, 1.30E-02,
1.19E-01, 4.75E-04, 5.25E-01, 2.07E-04, 1.73E-03, 1.08E-03),
Interaction = c(
1.49E-02, 1.43E-02, 1.25E-04, 6.84E-03, 3.25E-03, 7.67E-03, 8.34E-05,
1.17E-04, 2.04E-04, 7.64E-04, 2.84E-03, 8.72E-05, 2.37E-03,
2.61E-02, 6.68E-04, 4.57E-02, 1.32E-04, 6.96E-04, 6.55E-04
)
)
fortify_lowry_data <- function(data,
param_var = "Parameter",
main_var = "Main.Effect",
inter_var = "Interaction")
{
#Convert wide to long format
mdata <- melt(data, id.vars = param_var)
#Order columns by main effect and reorder parameter levels
o <- order(data[, main_var], decreasing = TRUE)
data <- data[o, ]
data[, param_var] <- factor(
data[, param_var], levels = data[, param_var]
)
#Force main effect, interaction to be numeric
data[, main_var] <- as.numeric(data[, main_var])
data[, inter_var] <- as.numeric(data[, inter_var])
#total effect is main effect + interaction
data$.total.effect <- rowSums(data[, c(main_var, inter_var)])
#Get cumulative totals for the ribbon
data$.cumulative.main.effect <- cumsum(data[, main_var])
data$.cumulative.total.effect <- cumsum(data$.total.effect)
#A quirk of ggplot2 means we need x coords of bars
data$.numeric.param <- as.numeric(data[, param_var])
#The other upper bound
#.maximum = 1 - main effects not included
data$.maximum <- c(1 - rev(cumsum(rev(data[, main_var])))[-1], 1)
data$.valid.ymax <- with(data,
pmin(.maximum, .cumulative.total.effect)
)
mdata[, param_var] <- factor(
mdata[, param_var], levels = data[, param_var]
)
list(data = data, mdata = mdata)
}
lowry_plot <- function(data,
param_var = "Parameter",
main_var = "Main.Effect",
inter_var = "Interaction",
x_lab = "Parameters",
y_lab = "Total Effects (= Main Effects + Interactions)",
ribbon_alpha = 0.5,
x_text_angle = 25)
{
#Fortify data and dump contents into plot function environment
data_list <- fortify_lowry_data(data, param_var, main_var, inter_var)
list2env(data_list, envir = sys.frame(sys.nframe()))
p <- ggplot(data) +
geom_bar(aes_string(x = param_var, y = "value", fill = "variable"),
data = mdata) +
geom_ribbon(
aes(x = .numeric.param, ymin = .cumulative.main.effect, ymax =
.valid.ymax),
data = data,
alpha = ribbon_alpha) +
xlab(x_lab) +
ylab(y_lab) +
scale_y_continuous(labels = "percent") +
theme(axis.text.x = text(angle = x_text_angle, hjust = 1)) +
scale_fill_grey(end = 0.5) +
theme(legend.position = "top",
legend.title =blank(),
legend.direction = "horizontal"
)
p
}
m_xylene_lowry <- lowry_plot(m_xylene_data)
When I run the code, it is giving me the following error:
Error: argument "x" is missing, with no default
It is not specific enough for me to know what the issue is. What is causing the error to be displayed and how can I make error statements more verbose?
It seems there is more than one faulty element in your code, beyond the one that throws this error. In my experience it always helps to first check whether the code works as expected before putting it into a function. The plotting part below should work:
p <- ggplot(data) + # no need to give data here if you overwrite it below anyway, but it does not affect the outcome...
# geom_bar does the counting but does not take y-value. Use geom_col:
geom_col(aes_string(x = param_var, y = "value", fill = "variable"),
data = mdata,
position = position_stack(reverse = TRUE)) +
geom_ribbon(
aes(x = .numeric.param, ymin = .cumulative.main.effect, ymax =
.valid.ymax),
data = data,
alpha = ribbon_alpha) +
xlab(x_lab) +
ylab(y_lab) +
# use scales::percent_format():
scale_y_continuous(labels = scales::percent_format()) +
# text is not an element you can use here, use element_text():
theme(axis.text.x = element_text(angle = x_text_angle, hjust = 1)) +
scale_fill_grey(end = 0.5) +
# use element_blank(), not just blank()
theme(legend.position = "top",
legend.title = element_blank(),
legend.direction = "horizontal"
)
This at least plots something, but I'm not sure whether it is what you expect it to do. It would help if you could show the desired output.
Edit:
Added position = position_stack(reverse = TRUE) to order according to sample plot.
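As for where the message itself comes from: the most likely culprit is text(angle = ...) inside theme(). Since that is not element_text(), R resolves text to graphics::text(), whose first argument x is required. The base-R sketch below reproduces the message in isolation; it is an illustration of that reading, not a trace of the original session.

```r
# Call graphics::text() the way the original theme() line does and
# capture the error message instead of stopping.
msg <- tryCatch(
  graphics::text(angle = 25, hjust = 1),
  error = function(e) conditionMessage(e)
)
print(msg)

# For more detail after an uncaught error at the console:
#   traceback()               # shows the chain of calls that led to the error
#   options(error = recover)  # lets you browse each frame interactively
```

Running traceback() immediately after the failing lowry_plot() call should show text() in the stack, which points straight at the theme() line.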

ggplot2: how to reduce the number of items in a legend

I have the following function:
gg.barplots <- function(inp, order, xlab.strg, ylab.strg) {
require(RColorBrewer)
require(ggplot2)
require(reshape2)
arg <- c(expression(hat(p)[M]), expression(hat(p)[C]))
p <- order
col <- c(colorRampPalette(brewer.pal(9,'Blues')[2:9])(p+2),
colorRampPalette(brewer.pal(9,'Oranges')[2:9])(p+2))
lab <- c(0:p, paste(">",p,sep=""))
freq.mat <- data.frame(labels = lab, inp)
names(freq.mat) <- c("x", "Magnitude-only", "Complex-valued")
freq.mat$x <- factor(freq.mat$x, levels = c(levels(freq.mat$x)[-1],levels(freq.mat$x)[1]))
## force the orders to be as we want them to appear, using the factor function with levels specified.
freq.df <- melt(data = freq.mat, id.vars = 1, measure.vars = 2:3)
fill.vars <- paste(rep(names(freq.mat)[-1], times = p), rep(freq.mat$x, each = 2), sep = ":")
fill.vars <- factor(fill.vars, levels = fill.vars)
freq.df <- data.frame(fill.vars, freq.df[rep(c(0,p+2), times = p + 2) + rep(1:(p + 2), each = 2), ])
ggplot(data=freq.df, aes(x = x, y = value, fill = fill.vars)) +
geom_bar(stat="identity", position=position_dodge(), colour = "black") +
scale_fill_manual(values = col[rep(c(0,p+2), times = p + 2) + rep(1:(p + 2), each = 2)]) +
theme_bw() +
xlab(arg) +
ylab(ylab.strg) +
xlab(xlab.strg) +
ylab(ylab.strg)
}
which gives me the following (two dodged barplots) as in the following example:
dput(out.AR2$AR.rate)
structure(c(0.25178, 0.06735, 0.64564, 0.03523, 0.04396, 0.0027,
0.90415, 0.04919), .Dim = c(4L, 2L), .Dimnames = list(c("0",
"1", "2", ">2"), NULL))
and calling the function:
gg.barplots(inp = out.AR2$AR.rate, order = 2, xlab.strg = "AR order", ylab.strg = "Proportions")
which results in the following figure:
Now I feel that (even ignoring the inherent ugliness of the current legend in this plot) the whole legend is not necessary. I think two colors (say, the mid-value of the Oranges scale and the mid-value of the Blues scale) should be enough to represent the important parts of the plot. The remainder (the AR orders in the legend) is already there in the figure.
My question: how do I make a legend which has only these two colors, with the words Complex-valued and Magnitude-only associated with them? I have tried several things and I am a bit lost, sorry.
Your function is a little messy - you could probably split it into two functions, one to clean and one to plot.
Anyways, the easiest way to get what you want is to use the breaks argument to scale_fill_manual. This allows you to choose only those levels you want in the legend:
gg.barplots <- function(inp, order, xlab.strg, ylab.strg) {
require(RColorBrewer)
require(ggplot2)
require(reshape2)
arg <- c(expression(hat(p)[M]), expression(hat(p)[C]))
p <- order
col <- c(colorRampPalette(brewer.pal(9,'Blues')[2:9])(p+2),
colorRampPalette(brewer.pal(9,'Oranges')[2:9])(p+2))
lab <- c(0:p, paste(">",p,sep=""))
freq.mat <- data.frame(labels = lab, inp)
names(freq.mat) <- c("x", "Magnitude-only", "Complex-valued")
freq.mat$x <- factor(freq.mat$x, levels = c(levels(freq.mat$x)[-1],levels(freq.mat$x)[1]))
## force the orders to be as we want them to appear, using the factor function with levels specified.
freq.df <- melt(data = freq.mat, id.vars = 1, measure.vars = 2:3)
fill.vars <- paste(rep(names(freq.mat)[-1], times = p), rep(freq.mat$x, each = 2), sep = ":")
fill.vars <- factor(fill.vars, levels = fill.vars)
freq.df <- data.frame(fill.vars, freq.df[rep(c(0,p+2), times = p + 2) + rep(1:(p + 2), each = 2), ])
ggplot(data=freq.df, aes(x = x, y = value, fill = fill.vars)) +
geom_bar(stat="identity", position=position_dodge(), colour = "black") +
scale_fill_manual(values = col[rep(c(0,p+2), times = p + 2) + rep(1:(p + 2), each = 2)], breaks = c("Magnitude-only:2", "Complex-valued:2")) +
theme_bw() +
xlab(arg) +
ylab(ylab.strg) +
xlab(xlab.strg) +
ylab(ylab.strg)
}
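With breaks alone, the two legend entries still read "Magnitude-only:2" and "Complex-valued:2". Adding a matching labels argument should give the plain words asked for in the question. Below is a stripped-down sketch on made-up proportions (not the out.AR2 data), with simple two-color ramps standing in for the RColorBrewer palettes.

```r
library(ggplot2)

types  <- c("Magnitude-only", "Complex-valued")
orders <- c("0", "1", "2")

# Toy data: one bar per type and order, values invented for illustration.
toy <- data.frame(
  x     = rep(orders, each = 2),
  grp   = paste(rep(types, times = 3), rep(orders, each = 2), sep = ":"),
  value = c(0.25, 0.04, 0.07, 0.01, 0.68, 0.95)
)
toy$grp <- factor(toy$grp, levels = toy$grp)

# One shade per fill level, named so colors match levels explicitly.
cols <- c(colorRampPalette(c("lightblue", "darkblue"))(3),
          colorRampPalette(c("wheat", "darkorange"))(3))
names(cols) <- c(paste(types[1], orders, sep = ":"),
                 paste(types[2], orders, sep = ":"))

p <- ggplot(toy, aes(x = x, y = value, fill = grp)) +
  geom_col(position = position_dodge(), colour = "black") +
  scale_fill_manual(values = cols,
                    breaks = c("Magnitude-only:2", "Complex-valued:2"),
                    labels = types,   # text shown next to the two keys
                    name   = NULL) +  # drop the legend title
  theme_bw()
p
```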
