I currently have a base R barplot with a few basic parameters, and I'd like to convert it to ggplot. The extra parameters don't matter much; the main problem is that I'm trying to plot the sums of several columns, but I can't get the data into the right shape, as t(data) doesn't seem to work. Here's what I've got so far:
## Subset of indicators
indicators <- clean_data[c(8, 12, 14:23)]
## Get sum of columns
indicator_sums <- colSums(indicators, na.rm = TRUE)
### Transpose for ggplot
(empty)
## Make bar plot
barplot(indicator_sums, ylim=range(pretty(c(0, indicator_sums))), cex.axis=0.75,cex.lab=0.8, cex.names=0.7, col='magenta', las=2, ylab = 'Offences Recorded Using Indicator')
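You may try:
library(dplyr)
library(reshape2)
library(tibble)   # for rownames_to_column()
library(ggplot2)
dummy <- data.frame(
  A = c(1:20),
  B = rnorm(20, 10, 4),
  C = runif(20, 19, 30),
  D = sample(c(10:40), 20, replace = TRUE)
)
# Base R version
barplot(colSums(dummy))
# ggplot2 version: melt() turns the named vector into a data frame
dummy %>%
  colSums %>%
  melt %>%
  rownames_to_column %>%
  ggplot(aes(x = rowname, y = value)) +
  geom_col()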
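The same idea can be sketched without reshape2: ggplot wants a data frame with one row per bar, so rather than transposing, the names and values of the colSums() result can be put into two columns. The `indicators` data frame below is a hypothetical stand-in for the asker's clean_data subset, and the column names `indicator` and `total` are chosen only for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the indicator columns in the question
indicators <- data.frame(
  theft    = c(1, 0, 1, NA, 1),
  burglary = c(0, 0, 1, 1, NA),
  fraud    = c(1, 1, 1, 0, 1)
)

indicator_sums <- colSums(indicators, na.rm = TRUE)

# One row per bar: the vector names become one column, the sums another
sums_df <- data.frame(
  indicator = names(indicator_sums),
  total     = unname(indicator_sums)
)

p <- ggplot(sums_df, aes(x = indicator, y = total)) +
  geom_col(fill = "magenta") +
  labs(x = NULL, y = "Offences Recorded Using Indicator") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
p
```

The rotated axis text roughly mirrors las=2 from the original barplot() call.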
I am trying to plot two series with different scales on the same plot using the dygraphs library in R.
dygraph(data.frame(x = 1:10, y = runif(10),y2=runif(10)*100)) %>%
  dyAxis("y", valueRange = c(0, 1.5)) %>%
  dyAxis(runif(10)*100,name="y2", valueRange = c(0, 100)) %>%
  dyEvent(2, label = "test") %>%
  dyAnnotation(5, text = "A")
However, the plot does not fit the series with the larger scale, and I cannot figure out how to align the two axes. I suspect the independentTicks option of dyAxis() does the trick, but I cannot find how to use it in the documentation. Any help would be appreciated.
One way could be to assign the column with the higher values to the secondary axis using the dySeries function (see https://rstudio.github.io/dygraphs/gallery-axis-options.html):
library(dygraphs)
library(dplyr)
df = data.frame(x = 1:10, y = runif(10),y2=runif(10)*100)
y2 <- df %>%
  pull(y2)
names(y2) <- df$x
dygraph(df) %>%
  dySeries("y2", axis = 'y2')
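Building on this, here is a minimal sketch of how the two axis ranges from the question might be set once y2 sits on the secondary axis. It assumes that dyAxis() takes the axis name ("y" or "y2") rather than a data vector, and that independentTicks = TRUE on y2 gives that axis its own tick marks, as described in the dygraphs documentation; the seed is added only for reproducibility.

```r
library(dygraphs)

set.seed(1)
df <- data.frame(x = 1:10, y = runif(10), y2 = runif(10) * 100)

g <- dygraph(df)
g <- dySeries(g, "y2", axis = "y2")            # move y2 to the secondary axis
g <- dyAxis(g, "y",  valueRange = c(0, 1.5))   # range for the primary axis
g <- dyAxis(g, "y2", valueRange = c(0, 100),   # range for the secondary axis
            independentTicks = TRUE)
g
```

The calls are written without pipes so the example depends on dygraphs alone; chaining them with %>% should be equivalent.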
I need to plot the ECDFs of all columns of a data frame in a single plot and also set an x limit on the axis.
The function that I wrote:
library(lattice)
library(latticeExtra)
ecdf_plot <- function(data){
  # Drop columns with only NA's
  data <- data[, colSums(is.na(data)) != nrow(data)]
  data$P_key <- NULL
  ecdfplot(~ S12, data=data, auto.key=list(space='right'))
}
Problem:
The function above only plots the ECDF for column S12, but I need it for all columns in the data frame. I know I can write S12 + S13 + ..., but the source data changes and we don't know in advance how many columns the data frame will have, or which ones. Is there a better way to do this? Also, is it possible to set the x limit of the combined plot to a single range, like xlim(0,100)?
I think this task would be easier using ggplot. It would be very easy to set the limits as required, customise the appearance, etc.
The function would look like this:
library(dplyr)
library(tidyr)
library(ggplot2)
ecdf_plot <- function(data) {
  # Drop columns that contain only NAs, then reshape to long format
  data[, colSums(is.na(data)) != nrow(data)] %>%
    pivot_longer(everything()) %>%
    group_by(name) %>%
    arrange(value, .by_group = TRUE) %>%
    # The ECDF at the i-th smallest of n values is i / n
    mutate(ecdf = seq_len(n()) / n()) %>%
    ggplot(aes(x = value, y = ecdf, colour = name)) +
    xlim(0, 100) +
    geom_step() +
    theme_bw()
}
Now let's test it on a random data frame:
set.seed(69)
df <- data.frame(unif = runif(100, 0, 100), norm = rnorm(100, 50, 25))
ecdf_plot(df)
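As an alternative sketch, ggplot2's stat_ecdf() can compute the step function itself, so the cumulative proportions need not be built by hand. The version below reshapes with base R's stack() (which names its output columns `values` and `ind`) and zooms with coord_cartesian(), which does not discard points outside 0 to 100 before the ECDF is computed. The name `long` is chosen only for illustration.

```r
library(ggplot2)

set.seed(69)
df <- data.frame(unif = runif(100, 0, 100), norm = rnorm(100, 50, 25))

# Long format: one row per (column, value) pair, with NA values dropped
long <- stack(df)
long <- long[!is.na(long$values), ]

p <- ggplot(long, aes(x = values, colour = ind)) +
  stat_ecdf(geom = "step") +
  coord_cartesian(xlim = c(0, 100)) +  # zoom only; all data still used
  theme_bw()
p
```

With stat_ecdf(), a scale limit such as xlim(0, 100) would remove out-of-range values before the ECDF is computed and so change the curve, which is why coord_cartesian() is used here.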
My question is about expanding R plotly's grouped violin plot to a case with more than two groups.
Taking the data that are used in the grouped violin plot example code and adding a third level to df$sex:
library(dplyr)
set.seed(1)
df <- read.csv("https://raw.githubusercontent.com/plotly/datasets/master/violin_data.csv")
df <- df %>%
  rbind(df[sample(nrow(df), 100, replace = FALSE), ] %>%
          dplyr::mutate(sex = "undefined",
                        day = sample(df$day, 100, replace = FALSE),
                        day = sample(df$day, 100, replace = FALSE)))
df$sex <- factor(df$sex)
Trying to plot this with:
plotly::plot_ly(x = df$day, y = df$total_bill, type = 'violin', split = df$sex, color = df$sex)
I get the violins of each of the sexes centered rather than split:
And this remains the case if I switch split = df$sex to name = df$sex.
But if I change type = 'violin' to type = 'bar' I do get df$sex split:
Any idea how to get this to work for the type = 'violin' case?
I want to add a summary table to a plot made with ggplot. I am using annotation_custom to add a previously created table.
My problem is that the table shows a varying number of decimals.
As an example I am using the mtcars dataset, and my code is the following:
rm(list=ls()) #Clear environment console
data(mtcars)
head(mtcars)
library(dplyr)
library(tidyr)
library(ggplot2)
library(gridExtra)
table <- mtcars %>% # Summary table to be overlaid on the plot
  select(wt) %>%
  summarise(
    Obs = length(mtcars$wt),
    Q05 = quantile(mtcars$wt, prob = 0.05),
    Mean = mean(mtcars$wt),
    Med = median(mtcars$wt),
    Q95 = quantile(mtcars$wt, prob = 0.95),
    SD = sd(mtcars$wt))
dens <- ggplot(mtcars) + # Create example density plot for the wt variable
  geom_density(data = mtcars, aes(mtcars$wt)) +
  labs(title = "Density plot")
plot(dens)
dens1 <- dens + # Overlay the summary table on the density plot
  annotation_custom(tableGrob(t(table),
                              cols = c("WT"),
                              rows = c("Obs", "Q-05", "Mean", "Median", "Q-95", "S.D."),
                              theme = ttheme_default(base_size = 11)),
                    xmin = 4.5, xmax = 5, ymin = 0.2, ymax = 0.5)
print(dens1)
Running the code above gives the following picture:
[Image: density plot]
I would like to fix the number of displayed decimals at 2.
I already tried adding sprintf:
annotation_custom(tableGrob(t(sprintf("%0.2f",table)),
but got the following error:
Error in sprintf("%0.2f", table_pet) :
(list) object cannot be coerced to type 'double'
I have been searching without any luck. Any idea how I can do this?
Thank you in advance
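grid.table leaves the formatting up to you, so convert the numeric columns to formatted strings before drawing the table:
library(dplyr)
library(gridExtra)
d = data.frame(x = "pi", y = pi)
d2 = d %>% mutate_if(is.numeric, ~sprintf("%.3f",.))
grid.table(d2)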
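Applied to the mtcars summary from the question, the same idea might look like the sketch below. It assumes the error arises because sprintf() expects a vector, not a data frame, so the statistics are kept in a plain numeric vector and formatted before the table grob is built. The names `stats`, `labels`, `tab` and `grob` are chosen only for illustration.

```r
library(gridExtra)

# Summary statistics for mtcars$wt as a plain named numeric vector
stats <- c(
  Obs    = length(mtcars$wt),
  "Q-05" = unname(quantile(mtcars$wt, probs = 0.05)),
  Mean   = mean(mtcars$wt),
  Median = median(mtcars$wt),
  "Q-95" = unname(quantile(mtcars$wt, probs = 0.95)),
  "S.D." = sd(mtcars$wt)
)

# Format as text: the count as an integer, everything else with 2 decimals
labels <- ifelse(names(stats) == "Obs",
                 sprintf("%d", as.integer(stats)),
                 sprintf("%.2f", stats))

# One-column character matrix with row and column headers
tab  <- matrix(labels, ncol = 1, dimnames = list(names(stats), "WT"))
grob <- tableGrob(tab, theme = ttheme_default(base_size = 11))
```

The resulting grob could then be passed to annotation_custom(grob, xmin = 4.5, xmax = 5, ymin = 0.2, ymax = 0.5) in place of the tableGrob() call in the question.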
I could not find a solution to the following problem:
I have two numeric variables. I take the sum of both and want to draw a bar plot of the relative frequency of that sum, and also indicate the proportion of its components within each bar (i.e. the mean proportion of one variable as part of the sum).
Example: I have v1 = number of questions and v2 = number of answers. Each observation can have x questions and y answers, and therefore x+y interactions.
Example code:
df <- data.frame(matrix(ncol = 2, nrow = 5))
x <- c("questions", "answers")
colnames(df) <- x
df$questions <- c(1,2,3,1,2)
df$answers <- c(2,3,4,2,3)
df$interactionsum <- df$questions + df$answers
ggplot(df, aes(x = interactionsum)) +
  geom_bar(aes(y = (..count..)/sum(..count..))) +
  ylab("Relative frequencies") +
  xlab("Sum of interactions")
In this data setting, one third of the first bar would be questions (mean proportion) and two thirds answers (mean proportion). How can I achieve this type of grouping with ggplot2?
Thank you in advance!
df <- data.frame(matrix(ncol = 2, nrow = 5))
x <- c("questions", "answers")
colnames(df) <- x
df$questions <- c(1,2,3,1,2)
df$answers <- c(2,3,4,2,3)
df$interactionsum <- df$questions + df$answers
require(dplyr)
require(tidyr)
require(ggplot2)
df <- df %>%
  group_by(interactionsum) %>%
  summarize(questions = mean(questions) / mean(interactionsum),
            answers = mean(answers) / mean(interactionsum),
            n = n() / nrow(df)) %>%
  mutate(interactionsum = as.factor(interactionsum)) %>%
  gather("key", "means", questions, answers)
ggplot(df, aes(x = interactionsum, y = means * n, fill = key)) +
  geom_bar(stat = "identity")
For each possible interaction sum, we compute the mean of its questions variable and the mean of its answers variable, each as a proportion of the sum, along with the relative frequency n of that sum. Then we gather them (using tidyr) into the long data format favoured by ggplot, and plot means * n in a stacked bar using the "identity" statistic, since the values already reflect the frequencies.
I also turned the interaction sum into a factor to improve how the end result looks.
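To make the arithmetic concrete, here is a small base R check of the segment heights for the example data. It is only a sketch of the calculation, not a replacement for the plotting code, and the names `sums`, `rel_freq`, `q_share` and `heights` are chosen for illustration.

```r
df <- data.frame(questions = c(1, 2, 3, 1, 2),
                 answers   = c(2, 3, 4, 2, 3))
df$interactionsum <- df$questions + df$answers

sums <- sort(unique(df$interactionsum))                    # 3, 5, 7

# Relative frequency of each sum (total height of each bar)
rel_freq <- as.vector(table(df$interactionsum)) / nrow(df)

# Mean share of questions within each sum
q_share <- as.vector(tapply(df$questions, df$interactionsum, mean)) / sums

# Heights of the two stacked segments per bar
heights <- rbind(questions = q_share * rel_freq,
                 answers   = (1 - q_share) * rel_freq)
colnames(heights) <- sums
round(heights, 3)
```

For the first bar (sum = 3), two of the five observations fall there, so the bar is 0.4 high; one third of it (about 0.133) is questions and two thirds (about 0.267) is answers.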
# example data
df = data.frame(questions = c(1,2,3,1,2),
answers = c(2,3,4,2,3))
df$interactionsum <- df$questions + df$answers
library(tidyverse)
df %>%
  group_by(interactionsum) %>%
  summarise_all(sum) %>%
  gather(x, y, -interactionsum) %>%
  group_by(interactionsum) %>%
  mutate(y = y / sum(y)) %>%
  ggplot(aes(interactionsum, y, fill = x)) +
  geom_bar(stat = "identity")