I've currently got a barplot that has a few basic parameters. However, I'm looking to try and convert this into ggplot. The extra parameters don't matter too much; the main problem that I'm having is that I'm trying to plot the sum of various columns, but I'm unable to transpose it correctly as t(data) doesn't seem to work. Here's what I've got so far:
## Subset of indicators
indicators <- clean_data[c(8, 12, 14:23)]
## Get sum of columns
indicator_sums <- colSums(indicators, na.rm = TRUE)
### Transpose for ggplot
(empty)
## Make bar plot
barplot(indicator_sums, ylim=range(pretty(c(0, indicator_sums))), cex.axis=0.75,cex.lab=0.8, cex.names=0.7, col='magenta', las=2, ylab = 'Offences Recorded Using Indicator')
You may try
library(dplyr)
library(reshape2)
dummy <- data.frame(
A = c(1:20),
B = rnorm(20, 10, 4),
C = runif(20, 19,30),
D = sample(c(10:40),20, replace = T)
)
barplot(colSums(dummy))
dummy %>%
colSums %>%
melt %>%
rownames_to_column %>%
ggplot(aes(x = rowname, y = value)) +
geom_col()
I have created a qqplot (with quantiles of beta distribution) from a dataset including two groups. To visualize, which points belong to which group, I would like to color them. I have tried the following:
res <- beta.mle(data$values) #estimate parameters of beta distribution
qqplot(qbeta(ppoints(500),res$param[1], res$param[2]),data$values,
col = data$group,
ylab = "Quantiles of data",
xlab = "Quantiles of Beta Distribution")
the result is shown here:
I have seen solutions specifying a "col" vector for qqnorm, hover this seems to not work with qqplot, as simply half the points is colored in either color, regardless of group. Is there a way to fix this?
A simulated some data just to shown how to add color in ggplot
Libraries
library(tidyverse)
# install.packages("Rfast")
Data
#Simulating data from beta distribution
x <- rbeta(n = 1000,shape1 = .5,shape2 = .5)
#Estimating parameters
res <- Rfast::beta.mle(x)
data <-
tibble(
simulated_data = sort(x),
quantile_data = qbeta(ppoints(length(x)),res$param[1], res$param[2])
) %>%
#Creating a group variable using quartiles
mutate(group = cut(x = simulated_data,
quantile(simulated_data,seq(0,1,.25)),
include.lowest = T))
Code
data %>%
# Adding group variable as color
ggplot(aes( x = quantile_data, y = simulated_data, col = group))+
geom_point()
Output
For those who are wondering, how to work with pre-defined groups, this is the code that worked for me:
library(tidyverse)
library(Rfast)
res <- beta.mle(x)
# make sure groups are not numerrical
# (else color skale might turn out continuous)
g <- plyr::mapvalues(g, c("1", "2"), c("Group1", "Group2"))
data <-
tibble(
my_data = sort(x),
quantile_data = qbeta(ppoints(length(x)),res$param[1], res$param[2]),
group = g[order(x)]
)
data %>%
# Adding group variable as color
ggplot(aes( x = quantile_data, y = my_data, col = group))+
geom_point()
result
I have measurements from several groups which I would like to plot as violin plots:
set.seed(1)
df <- data.frame(val = c(runif(100,1,5),runif(100,1,5),rep(0,100)),
group = c(rep("A",100),rep("B",100),rep("C",100)))
Using R's ggplot2:
library(ggplot2)
ggplot(data = df, aes(x = group, y = val, color = group)) + geom_violin()
I get:
But when I try to get the equivalent with R's plotly using:
library(plotly)
plot_ly(x = df$group, y = df$val, split = df$group, type = 'violin', box = list(visible = F), points = F, showlegend = T, color = df$group)
I get:
Where group "C" gets an inflated/artificial violin.
Any idea how to deal with this and not by using ggplotly?
I did not find a way to fix the behaviour of plotly (probably worth making a bug report for this). A workaround would be to filter your data to only draw violin plots on groups whose range is greater than zero. If you also need to show where the other groups are, you can use a boxplot for these.
To demonstrate, I use library(data.table) for the filtering stage. You could use dplyr or base versions of the same procedure if you prefer:
setDT(df)[, toplot := diff(range(val)) > 0, group]
Now we can plot the groups using different trace styles depending on whether they should have violins or not
plot_ly() %>%
add_trace(data = df[(toplot)], x = ~group, y = ~val, split = ~group,
type = 'violin', box = list(visible = F), points = F) %>%
add_boxplot(data = df[(!toplot)], x = ~group, y = ~val, split = ~group)
I am trying to plot multiple box plots as a single graph. The data is where I have done a wilcoxon test. It should be like this
I have four/five questions and I want to plot the respondent score for two sets as a box plot. This should be done for all questions (Two groups for each question).
I am thinking of using ggplot2. My data is like
q1o <- c(4,4,5,4,4,4,4,5,4,5,4,4,5,4,4,4,5,5,5,5,5,5,5,5,5,3,4,4,3,4)
q1s <- c(5,4,4,5,5,5,5,5,4,5,4,4,5,4,5,5,5,5,5,5,5,5,5,5,5,5,4,5,4,4)
q2o <- c(3,3,3,4,3,4,4,3,3,3,4,4,3,4,3,3,4,3,3,3,3,4,4,4,4,3,3,3,3,4)
q2s <- c(5,4,4,5,5,5,5,5,4,5,4,4,5,4,5,5,5,5,5,5,5,5,5,5,5,5,4,3,4,4)
....
....
q1 means question 1 and q2 means question 2. I also want to know how to align these stacked box plots based on my need. Like one row or two rows.
This should get you started:
Unfortunately you don't provide a minimal example with sample data, so I will generate some random sample data.
# Generate sample data
set.seed(2017);
df <- cbind.data.frame(
value = rnorm(1000),
Label = sample(c("Good", "Bad"), 1000, replace = T),
variable = sample(paste0("F", 5:11), 1000, replace = T));
# ggplot
library(tidyverse);
df %>%
mutate(variable = factor(variable, levels = paste0("F", 5:11))) %>%
ggplot(aes(variable, value, fill = Label)) +
geom_boxplot(position=position_dodge()) +
facet_wrap(~ variable, ncol = 3, scale = "free");
You can specify the number of columns and rows in your 2d panel layout through arguments ncol and nrow, respectively, of facet_wrap. Many more details and examples can be found if you follow ?geom_boxplot and ?facet_wrap.
Update 1
A boxplot based on your sample data doesn't make too much sense, because your data are not continuous. But ignoring that, you could do the following:
df <- data.frame(
q1o = c(4,4,5,4,4,4,4,5,4,5,4,4,5,4,4,4,5,5,5,5,5,5,5,5,5,3,4,4,3,4),
q1s = c(5,4,4,5,5,5,5,5,4,5,4,4,5,4,5,5,5,5,5,5,5,5,5,5,5,5,4,5,4,4),
q2o = c(3,3,3,4,3,4,4,3,3,3,4,4,3,4,3,3,4,3,3,3,3,4,4,4,4,3,3,3,3,4),
q2s = c(5,4,4,5,5,5,5,5,4,5,4,4,5,4,5,5,5,5,5,5,5,5,5,5,5,5,4,3,4,4));
df %>%
gather(key, value, 1:4) %>%
mutate(
variable = ifelse(grepl("q1", key), "F1", "F2"),
Label = ifelse(grepl("o$", key), "Bad", "Good")) %>%
ggplot(aes(variable, value, fill = Label)) +
geom_boxplot(position = position_dodge()) +
facet_wrap(~ variable, ncol = 3, scale = "free");
Update 2
One way of visualising discrete data would be in a mosaicplot.
mosaicplot(table(df2));
The plot shows the count of value (as filled rectangles) per Variable per Label. See ?mosaicplot for details.