How to increase the radius of a circle using coord_polar() - r

I have a doughnut-style plot that I draw with ggplot2. The code was shared by @Jonspring.
data.frame(
stringsAsFactors = FALSE,
Tenure.Type = c("Tenure_A","Tenure_B",
"Tenure_C","Tenure_D","Tenure_E"),
In.Poverty = c(45786L, 98453L, 34954L, 29586L, 74854L),
Not.in.Poverty = c(784733L, 359584L, 385884L, 948434L, 385869L)
) -> Poverty
library(tidyverse)
Poverty %>%
pivot_longer(-Tenure.Type) %>%
uncount(round(value/1000)) %>%
ggplot(aes(1, name, color = Tenure.Type)) +
geom_jitter() +
coord_polar()
This is what I got (plot not shown).
I was wondering if there is any way to increase the size/surface area of the outer ring while keeping the inner ring as it is. Thanks.
I tried using the arguments inside coord_polar() but I can't get it to work.
Note: in the plot, each dot represents 1,000 observations. Is there a way to have each dot in the outer ring represent 10,000 observations while each dot in the inner ring represents 1,000 observations? Thanks.

You can split those 2 categories between separate jitter layers and play around with jitter heights to achieve something like this:
library(tidyverse)
Poverty %>%
pivot_longer(-Tenure.Type) %>%
uncount(round(value/1000)) %>%
ggplot(aes(1, name, color = Tenure.Type, shape = name)) +
geom_jitter(data = ~filter(.x, name == "Not.in.Poverty" ), height = .6) +
geom_jitter(data = ~filter(.x, name == "In.Poverty" ), height = .35) +
scale_shape_manual(
name = "Scale",
values = c("Not.in.Poverty" = 15, "In.Poverty" = 19),
labels = c("Not.in.Poverty" = "n=10000", "In.Poverty" = "n=1000")
)+
coord_polar()
Input data:
Poverty <- data.frame(
stringsAsFactors = FALSE,
Tenure.Type = c("Tenure_A","Tenure_B",
"Tenure_C","Tenure_D","Tenure_E"),
In.Poverty = c(45786L, 98453L, 34954L, 29586L, 74854L),
Not.in.Poverty = c(784733L, 359584L, 385884L, 948434L, 385869L)
)
Created on 2023-01-31 with reprex v2.0.2
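The shape legend in the answer labels the outer ring "n=10000", but uncount() still divides every count by 1000. The following is a minimal sketch of one way to make the data match that legend. It assumes a helper column, here called per_dot (a name that is not in the original), holding the number of observations that one dot stands for:

```r
library(dplyr)
library(tidyr)
library(ggplot2)

Poverty <- data.frame(
  stringsAsFactors = FALSE,
  Tenure.Type = c("Tenure_A", "Tenure_B", "Tenure_C", "Tenure_D", "Tenure_E"),
  In.Poverty = c(45786L, 98453L, 34954L, 29586L, 74854L),
  Not.in.Poverty = c(784733L, 359584L, 385884L, 948434L, 385869L)
)

dots <- Poverty %>%
  pivot_longer(-Tenure.Type) %>%
  # per_dot is a hypothetical helper: observations represented by one dot
  mutate(per_dot = if_else(name == "Not.in.Poverty", 10000, 1000)) %>%
  # repeat each row once per dot it should produce
  uncount(round(value / per_dot))

p <- ggplot(dots, aes(1, name, color = Tenure.Type, shape = name)) +
  geom_jitter(data = ~filter(.x, name == "Not.in.Poverty"), height = .6) +
  geom_jitter(data = ~filter(.x, name == "In.Poverty"), height = .35) +
  scale_shape_manual(
    name = "Scale",
    values = c("Not.in.Poverty" = 15, "In.Poverty" = 19),
    labels = c("Not.in.Poverty" = "n=10000", "In.Poverty" = "n=1000")
  ) +
  coord_polar()
p
```

With far fewer dots in the outer ring, the jitter heights will probably need retuning by eye.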

Related

Specification curve "choices" plot using ggplot2

I have a small dataset of estimates from many regressions of an outcome variable on a main treatment variable and then various sets of control variables (in fact, all possible combinations of those control variables). The table of estimates is as follows:
df <-
structure(list(control_set = c("cen21_hindu_pct", "cen83_urban_pct",
"cen21_hindu_pct + cen83_urban_pct", "NONE"), xest = c(0.0124513609978549,
0.00427174623249021, 0.006447506098051, 0.0137107176362076),
xest_conf_low = c(0.00750677700140716, -0.00436301983024899,
-0.0013089334064237, 0.00925185534519074), xest_conf_high = c(0.0173959449943027,
0.0129065122952294, 0.0142039456025257, 0.0181695799272245
)), row.names = c(NA, -4L), class = c("tbl_df", "tbl", "data.frame"
))
I want to make the two plots for the classic "specification curve analysis." The top plot is simply the set of estimates ordered by the magnitude of the estimate on the main treatment variable (no issue here):
df %>%
arrange(xest) %>%
mutate(specifications = 1:nrow(.)) %>%
ggplot(aes(x = specifications, y = xest, ymin = xest_conf_low, ymax = xest_conf_high)) +
geom_pointrange(alpha = 0.1, size = 0.6, fatten = 1) +
labs(x = "", y = "Estimate\n") +
theme_bw()
My problem is with the aligned plot underneath that describes the control-set choices. Directly underneath each coefficient dot and whisker from the plot just made I want a plot that indicates the set of corresponding control variables that were included in that model (i.e. the list of controls in the control_set column in the df data frame row). So the plot I need in this example would look just like this:
This is a (failed) sketch of what I tried to get there, by modifying the earlier estimation dataset in long form, but I couldn't get multiple ticks to show vertically: (Note, this bit of code won't run)
# forplot %>%
# arrange(xest) %>%
# mutate(specifications = 1:nrow(.)) %>%
# mutate(value = "|") %>%
# ggplot(aes(specifications, term)) +
# geom_text(aes(label = value)) +
# scale_color_manual(values = c("lightblue")) +
# labs(x = "\nSpecification number", y = "") +
# theme_bw()
How can I use ggplot2 to make the plot-figure shown above from the information in the data frame, df?
If we store your first plot as a, the choices panel can be built as b and stacked underneath it with patchwork:
library(patchwork)
b <- tibble(specifications = c(1,2,2,3),
control_set = rep(c("cen83_urban_pct", "cen21_hindu_pct"), each = 2)) %>%
ggplot(aes(specifications, control_set)) +
geom_text(aes(label = "|"), size = 5) +
coord_cartesian(xlim = c(1,4)) +
labs(x = NULL, y = NULL) +
theme_bw()+
theme(axis.ticks = element_blank(),
axis.text.x = element_blank())
a/b + plot_layout(heights = c(3,1))
If you want to generate the key automatically, you might use something like this:
library(dplyr)
library(tidyr) # separate_rows() comes from tidyr
df %>%
arrange(xest) %>% # same ordering as the estimates plot
mutate(specifications = row_number()) %>%
select(specifications, control_set) %>%
separate_rows(control_set, sep = "\\+") %>%
mutate(control_set = trimws(control_set)) %>% # strip the spaces left around "+"
...
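One way the automatic key could be completed is sketched below. This is an assumption about the elided steps, not the answerer's actual code: it sorts by xest so the specification numbers match the estimates plot, splits the control sets on "+", and drops the "NONE" row so that specification gets no tick.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Estimates from the question, rounded for brevity
df <- tibble(
  control_set = c("cen21_hindu_pct", "cen83_urban_pct",
                  "cen21_hindu_pct + cen83_urban_pct", "NONE"),
  xest = c(0.01245, 0.00427, 0.00645, 0.01371)
)

key <- df %>%
  arrange(xest) %>%                          # order used by the top plot
  mutate(specifications = row_number()) %>%
  select(specifications, control_set) %>%
  # one row per control; the regex also swallows spaces around "+"
  separate_rows(control_set, sep = "\\s*\\+\\s*") %>%
  filter(control_set != "NONE")              # no tick for the empty control set

b <- ggplot(key, aes(specifications, control_set)) +
  geom_text(label = "|", size = 5) +
  coord_cartesian(xlim = c(1, nrow(df))) +   # keep the x range of the top plot
  labs(x = NULL, y = NULL) +
  theme_bw() +
  theme(axis.ticks = element_blank(),
        axis.text.x = element_blank())
b
```

The resulting b can then be combined with a through patchwork exactly as in the hand-built version.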
If you want to relabel the numbers in the y-axis with the control_set labels you can add
+ scale_y_continuous(breaks = df$xest, labels = df$control_set)

Using different data for positioning and display of labels in plots

TL;DR: with plot labels using geom_label etc., is it possible to use different data for calculating positions (with position_stack or similar functions) than for displaying the labels themselves? Or, less generally, is it possible to subset the label data after positions have been calculated?
I have some time series data for many different subjects. Observations took place at multiple time points, which are the same for each subject. I would like to plot this data as a stacked area plot, where the height of a subject's curve at each time point corresponds to the observed value for that subject at that time point. Crucially, I also need to add labels to identify each subject.
However, the trivial solution of adding one label at each observation makes the plot unreadable, so I would like to limit the displayed labels to the "most important" subjects (the ones that have the highest peak), as well as only display a label at the respective peak. This subsetting of the labels themselves is not a problem either, but I cannot figure out how to then position the (subset of) labels correctly so they match with the stacked area chart.
Here is some example code, which should work out of the box with tidyverse installed, to illustrate my issue. First, we generate some data which has the same structure as mine:
library(tidyverse)
set.seed(0)
# Generate some data
num_subjects = 50
num_timepoints = 10
labels = paste(sample(words, num_subjects), sample(fruit, num_subjects), sep = "_")
col_names = c("name", paste0("timepoint_", c(1:num_timepoints)))
df = bind_rows(map(labels,
~c(., cumsum(rnorm(num_timepoints))) %>%
set_names(col_names))) %>%
pivot_longer(starts_with("timepoint_"), names_to = "timepoint", names_prefix = "timepoint_") %>%
mutate(across(all_of(c("timepoint", "value")), as.numeric)) %>%
mutate(value = if_else(value < 0, 0, value)) %>%
group_by(name) %>% mutate(peak = max(value)) %>% ungroup()
Now, we can trivially make a simple stacked area plot without labels:
# Plot (without labels)
ggplot(df,
mapping = aes(x = factor(timepoint), y = value, group = name, fill = factor(peak))) +
geom_area(show.legend = FALSE, position = "stack", colour = "gray25") +
scale_fill_viridis_d()
Plot without labels (it appears that I currently cannot embed images, which is very unfortunate as they are extremely illustrative here...)
It is also not too hard to add non-specific labels to this data. They can easily be made to appear at the correct position — so the center of the label is at the middle of the area for each time point and subject — using position_stack:
# Plot (all labels, positions are correct but the plot is basically unreadable)
ggplot(df,
mapping = aes(x = factor(timepoint), y = value, group = name, fill = factor(peak))) +
geom_area(show.legend = FALSE, position = "stack", colour = "gray25") +
geom_label(mapping = aes(label = name), show.legend = FALSE, position = position_stack(vjust = 0.5)) +
scale_fill_viridis_d()
Plot with a label at each observation
However, as noted before, the labels almost entirely obscure the plot itself. So my approach would be to only show labels at the peaks, and only for the 10 subjects with the highest peaks:
# Plot (only show labels at the peak for the 10 highest peaks, readable but positions are wrong)
max_labels = 10 # how many labels to show
df_labels = df %>%
group_by(name) %>% slice_max(value, n = 1) %>% ungroup() %>%
slice_max(value, n = max_labels)
ggplot(df,
mapping = aes(x = factor(timepoint), y = value, group = name, fill = factor(peak))) +
geom_area(show.legend = FALSE, position = "stack", colour = "gray25") +
geom_label(data = df_labels, mapping = aes(label = name), show.legend = FALSE, position = position_stack(vjust = 0.5)) +
scale_fill_viridis_d()
Plot with only a subset of labels
This code also works fine, but it is apparent that the labels no longer show up at the correct positions, but are instead too low on the plot, especially for the subjects which would otherwise be higher up. (The only subject where the position is correct is work_eggplant.) This makes perfect sense, as the data used for calculation of position_stack are now only a subset of the original data, so the observations which would receive no labels are not considered when stacking. This can be illustrated by zeroing out all the observations which would not receive a label:
df_zeroed = anti_join(df %>% mutate(value = 0),
df_labels,
by = c("name", "timepoint")) %>% bind_rows(df_labels)
ggplot(df_zeroed,
mapping = aes(x = factor(timepoint), y = value, group = name, fill = factor(peak))) +
geom_area(show.legend = FALSE, position = "stack", colour = "gray25") +
geom_label(data = df_labels, mapping = aes(label = name), show.legend = FALSE, position = position_stack(vjust = 0.5)) +
scale_fill_viridis_d()
Plot with unlabeled observations zeroed out
So now my question is, how can this problem be solved? Is there a way to use the original data for the positioning, but the subset data for the actual display of the labels?
Maybe this is what you are looking for. To achieve the desired result you could:
- use the whole dataset for plotting the labels to get the right positions,
- use an empty string "" for the non-desired labels,
- set the fill and color of non-desired labels to "transparent".
# Plot (only show labels at the peak for the 10 highest peaks; positions are now correct)
max_labels = 10 # how many labels to show
df_labels = df %>%
group_by(name) %>%
slice_max(value, n = 1) %>%
ungroup() %>%
slice_max(value, n = max_labels) %>%
mutate(label = name)
df1 <- df %>%
left_join(df_labels) %>%
replace_na(list(label = ""))
#> Joining, by = c("name", "timepoint", "value", "peak")
ggplot(df1,
mapping = aes(x = factor(timepoint), y = value, group = name, fill = as.character(peak))) +
geom_area(show.legend = FALSE, position = "stack", colour = "gray25") +
geom_label(mapping = aes(
label = label,
fill = ifelse(label != "", as.character(peak), NA_character_),
color = ifelse(label != "", "black", NA_character_)),
show.legend = FALSE, position = position_stack(vjust = 0.5)) +
scale_fill_viridis_d(na.value = "transparent") +
scale_color_manual(values = c("black" = "black"), na.value = "transparent")
EDIT: If you want the fill colors to correspond to the value of peak, then
a simple solution would be to map peak on fill instead of factor(peak) and use fill = ifelse(label != "", peak, NA_real_) in geom_label. However, in that case you have to switch to a continuous fill scale.
As I guess that you had a good reason to use a discrete scale, another option would be to make peak an ordered factor. This approach, however, is not that simple. To make it work I first reorder factor(peak) according to peak, add an additional NA level, and use an auxiliary variable peak1 to fill the labels. However, as we now have two different variables mapped on fill, I would suggest using a second fill scale via ggnewscale::new_scale_fill to achieve the desired result:
library(tidyverse)
set.seed(0)
#cumsum(rnorm(num_timepoints)) * 3
# Generate some data
num_subjects = 50
num_timepoints = 10
labels = paste(sample(words, num_subjects), sample(fruit, num_subjects), sep = "_")
col_names = c("name", paste0("timepoint_", c(1:num_timepoints)))
df = bind_rows(map(labels,
~c(., cumsum(rnorm(num_timepoints)) * 3) %>%
set_names(col_names))) %>%
pivot_longer(starts_with("timepoint_"), names_to = "timepoint", names_prefix = "timepoint_") %>%
mutate(across(all_of(c("timepoint", "value")), as.numeric)) %>%
mutate(value = if_else(value < 0, 0, value)) %>%
group_by(name) %>% mutate(peak = max(value)) %>% ungroup()
# Plot (only show labels at the peak for the 10 highest peaks; positions are now correct)
max_labels = 10 # how many labels to show
df_labels = df %>%
group_by(name) %>%
slice_max(value, n = 1) %>%
ungroup() %>%
slice_max(value, n = max_labels) %>%
mutate(label = name)
df1 <- df %>%
left_join(df_labels) %>%
replace_na(list(label = ""))
#> Joining, by = c("name", "timepoint", "value", "peak")
df2 <- df1 %>%
mutate(
# Make ordered factor
peak = fct_reorder(factor(peak), peak),
# Add NA level to peak
peak = fct_expand(peak, NA),
# Auxiliary variable to set the fill to NA for non-desired labels
peak1 = if_else(label != "", peak, factor(NA)))
ggplot(df2, mapping = aes(x = factor(timepoint), y = value, group = name, fill = peak)) +
geom_area(show.legend = TRUE, position = "stack", colour = "gray25") +
scale_fill_viridis_d(na.value = "transparent") +
# Add a second fill scale
ggnewscale::new_scale_fill() +
geom_label(mapping = aes(
label = label,
fill = peak1,
color = ifelse(label != "", "black", NA_character_)),
show.legend = FALSE, position = position_stack(vjust = 0.5)) +
scale_fill_viridis_d(na.value = "transparent") +
scale_color_manual(values = c("black" = "black"), na.value = "transparent")
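A different technique from the one above, sketched here on toy data (all names and values below are invented for illustration): compute the stacked positions by hand from the full data, then subset and draw the labels with position = "identity". It assumes that position_stack() puts the first group on top, which is the default in recent ggplot2 versions as far as I know, so the cumulative sum runs from the last group upwards.

```r
library(dplyr)
library(ggplot2)

# Toy data: 3 subjects observed at 3 time points
toy <- tibble(
  name = rep(c("a", "b", "c"), each = 3),
  timepoint = rep(1:3, times = 3),
  value = c(1, 2, 1, 2, 1, 3, 1, 1, 1)
)

stacked <- toy %>%
  # group index in the sorted order of the group values
  mutate(grp = as.integer(factor(name))) %>%
  group_by(timepoint) %>%
  arrange(desc(grp), .by_group = TRUE) %>%   # last group sits at the bottom
  mutate(ymax = cumsum(value),
         ymin = ymax - value,
         ymid = (ymin + ymax) / 2) %>%       # equivalent of vjust = 0.5
  ungroup()

# Subset AFTER the positions are known: one label per subject, at its peak
labels_subset <- stacked %>%
  group_by(name) %>%
  slice_max(value, n = 1, with_ties = FALSE) %>%
  ungroup()

p <- ggplot(stacked, aes(x = timepoint, y = value, group = name, fill = name)) +
  geom_area(position = "stack", colour = "gray25") +
  geom_label(data = labels_subset,
             aes(y = ymid, label = name),
             position = "identity", show.legend = FALSE)
p
```

The trade-off is that the stacking rule is duplicated outside ggplot2, so it has to be kept in sync with whatever position the area layer uses.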

Faceting a proportional half-area plot made with ggforce?

I am using ggforce to create a plot like this.
My goal is to facet this type of plot.
For background on how the chart was made, check out update 3 on this question. The only modification that I have made was adding a geom_segment between the x axis and the Y value positions.
The reason why I believe faceting this graph is either difficult, or even impossible, is because continuous value x coordinates are used to determine where the geom_arc_bar is positioned in space.
My only idea for getting this to work has been supplying each "characteristic" that I want to facet with a set of x coordinates (1,2,3). Initially, as I will demonstrate in my code, I worked with a set of highly curated data. Ideally, I would like to scale this to a dataset with many variables.
In the example graph that I have provided, the Y value is from table8, filtered for rows with "DFT". The area of the half-circles is proportional to the values of DDFS and FDFS from table9. Ideally, I would like to be able to create a function allowing for the easy creation of these graphs, with perhaps 3 parameters, the data for the y value, and for both half circles.
Here is my data.
Here is the code that I have written thus far.
For making a single plot
library(tidyverse)
library(ggforce) # geom_arc_bar()
library(ggpubr) # theme_pubr()
#Filter desired Age and Measurement
table9 %>%
filter(Age == "6-11" & Measurement != 'DFS' ) %>%
select( SurveyYear, Total , Measurement ) %>%
arrange(SurveyYear) %>%
dplyr::rename(Percent = Total) -> table9
#Do the same for table 8.
table8 %>%
filter(Age == "6-11" & Measurement != "DS" & Measurement != "FS") %>%
select(SurveyYear, Total) %>%
dplyr::rename(Y = Total)-> table8
table8 <- table8 %>%
bind_rows(table8) %>%
arrange(Y) %>%
add_column(start = rep(c(-pi/2, pi/2), 3), x = c(1,1,2,2,3,3))
table8_9 <- bind_cols(table8,table9) %>%
select(-SurveyYear1)
#Create the plot
ggplot(table8_9) + geom_segment( aes(x=x, xend=x, y=0, yend=Y), size = 0.5, linetype="solid") +
geom_arc_bar(aes(x0 = x, y0 = Y, r0 = 0, r = sqrt((Percent*2)/pi)/20,
start = start, end = start + pi, fill = Measurement),
color = "black") +
guides(fill = guide_legend(title = "Measurement", reverse = FALSE)) +
xlab("Survey Year") + ylab("Mean dfs") + coord_fixed() + theme_pubr() +
scale_y_continuous(expand = c(0, 0), limits = c(0, 5.5)) +
scale_x_continuous(breaks = 1:3, labels = paste0(c("1988-1994", "1999-2004", "2011-2014"))) +
scale_fill_discrete(labels = c("ds/dfs", "fs/dfs")) -> lolliPlot
lolliPlot
Attempt at many plots
#Filter for "DFS"
table8 <- table8 %>%
filter(Measurement=="DFS")
#Duplicate DF vertically, and add column specifying the start point for the arcs.
table8 <- table8 %>%
bind_rows(table8) %>%
add_column(start = rep(c(-pi/2, pi/2), length(.$SurveyYear)/2), x = rep(x = c(1,2,3),length(.$SurveyYear)/3)) %>%
arrange(Age, x)
#Bind the two tables together, removing all of the characteristic columns from table 8.
table8_9 <- bind_cols(table8,table9) %>%
select(-Age1, -SurveyYear1, -Measurement) %>%
gather(key = Variable, value = Y, -x,-start,-Age, -SurveyYear, -Measurement1, -Total1, -Male1, -Female1, -'White, non-Hispanic1', -'Black, non-hispanic1', -'Mexican American1', -'Less than 100% FPG1', -'100-199% FPG1', -'Greater than 200% FPG1')
This is where I get stuck. I can't figure out a way to format the data so that I can facet the graph. If anybody has any ideas or advice, I would greatly appreciate it.
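A minimal sketch of a data layout that does facet, using invented toy values rather than the real tables (the "12-19" age group and all numbers are made up for illustration). The idea is one row per facet value, x position, and half-circle; facet_wrap() then splits on the facet column while the continuous x coordinates 1 to 3 are reused in every panel.

```r
library(ggplot2)
library(ggforce)

# Invented toy values: one row per Age group x survey period (x) x half-circle
toy <- data.frame(
  Age = rep(c("6-11", "12-19"), each = 6),
  x = rep(rep(1:3, each = 2), times = 2),
  Y = rep(c(3.5, 3.0, 2.5, 2.0, 1.8, 1.5), each = 2),
  Measurement = rep(c("DDFS", "FDFS"), times = 6),
  Percent = c(60, 40, 50, 50, 45, 55, 30, 70, 25, 75, 20, 80),
  start = rep(c(-pi / 2, pi / 2), times = 6)
)

p <- ggplot(toy) +
  geom_segment(aes(x = x, xend = x, y = 0, yend = Y)) +
  geom_arc_bar(aes(x0 = x, y0 = Y, r0 = 0,
                   r = sqrt(Percent * 2 / pi) / 20,  # same radius formula as above
                   start = start, end = start + pi,
                   fill = Measurement),
               color = "black") +
  scale_x_continuous(breaks = 1:3,
                     labels = c("1988-1994", "1999-2004", "2011-2014")) +
  coord_fixed() +            # keeps the half-circles round in every panel
  facet_wrap(~Age)
p
```

Reshaping the real tables into this form would mean keeping Age (or whichever characteristic is faceted) as a column instead of filtering on it.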

R: PCA ggplot Error "arguments imply differing number of rows"

I have a dataset:
https://docs.google.com/spreadsheets/d/1ZgyRQ2uTw-MjjkJgWCIiZ1vpnxKmF3o15a5awndttgo/edit?usp=sharing
that I'm trying to apply PCA to, aiming for a graph like the one in this post:
https://stats.stackexchange.com/questions/61215/how-to-interpret-this-pca-biplot-coming-from-a-survey-of-what-areas-people-are-i
However, an error doesn't seem to go away:
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names =
TRUE, :
arguments imply differing number of rows: 0, 1006
Below is my code; I have trouble finding the source of the error and would appreciate help tracking it down. Any hints?
The goal is to produce a PCA graph grouped by levels of Happiness.in.life. I modified the original code to fit my dataset. Originally, the group was determined by Genders, which has 2 levels. What I'm attempting to do is build a graph based on the 5 levels of Happiness.in.life. However, it seems I can't use the old code as-is...
Thanks!
library(magrittr)
library(dplyr)
library(tidyr)
df <- happiness_reduced %>% dplyr::select(Happiness.in.life:Internet.usage, Happiness.in.life)
head(df)
vars_on_hap <- df %>% dplyr::select(-Happiness.in.life)
head(vars_on_hap)
group<-df$Happiness.in.life
fit <- prcomp(vars_on_hap)
pcData <- data.frame(fit$x)
vPCs <- fit$rotation[, c("PC1", "PC2")] %>% as.data.frame()
multiple <- min(
(max(pcData[,"PC1"]) - min(pcData[,"PC1"]))/(max(vPCs[,"PC1"])-
min(vPCs[,"PC1"])),
(max(pcData[,"PC2"]) - min(pcData[,"PC2"]))/(max(vPCs[,"PC2"])-
min(vPCs[,"PC2"]))
)
ggplot(pcData, aes(x=PC1, y=PC2)) +
geom_point(aes(colour=groups)) +
coord_equal() +
geom_text(data=vPCs,
aes(x = fit$rotation[, "PC1"]*multiple*0.82,
y = fit$rotation[,"PC2"]*multiple*0.82,
label=rownames(fit$rotation)),
size = 2, vjust=1, color="black") +
geom_segment(data=vPCs,
aes(x = 0,
y = 0,
xend = fit$rotation[,"PC1"]*multiple*0.8,
yend = fit$rotation[,"PC2"]*multiple*0.8),
arrow = arrow(length = unit(.2, 'cm')),
color = "grey30")
Here is an approach on how to plot the result of PCA in ggplot2:
library(tidyverse)
library(ggrepel)
It is often a good idea to scale the variables prior to PCA (though not in all cases, for instance when they are all in the same units):
hapiness %>% #this is the data from google drive. In the future try not to post such links on SO because they tend to become unusable after some time has passed
select(-Happiness.in.life) %>%
prcomp(center = TRUE, scale. = TRUE) -> fit
Now we can proceed to plotting the fit:
fit$x %>% #coordinates of the points are in x element
as.data.frame()%>% #convert matrix to data frame
select(PC1, PC2) %>% #select the first two PC
bind_cols(hapiness = as.factor(hapiness$Happiness.in.life)) %>% #add the coloring variable
ggplot() +
geom_point(aes(x = PC1, y = PC2, colour = hapiness)) + #plot points and color
geom_segment(data = fit$rotation %>% #data we want plotted by geom_segment is in rotation element
as.data.frame()%>%
select(PC1, PC2) %>%
rownames_to_column(), #get to row names so you can label after
aes(x = 0, y = 0, xend = PC1 * 7, yend = PC2* 7, group = rowname), #I scaled the rotation by 7 so it fits in the plot nicely
arrow = arrow(angle = 20, type = "closed", ends = "last",length = unit(0.2,"cm")),
color = "grey30") +
geom_text_repel(data = fit$rotation %>%
as.data.frame()%>%
select(PC1, PC2) %>%
rownames_to_column(),
aes(x = PC1*7,
y = PC2*7,
label = rowname)) +
coord_equal(ratio = fit$sdev[2]^2 / fit$sdev[1]^2) + #I like setting the ratio to the ratio of eigen values
xlab(paste("PC1", round(fit$sdev[1]^2/ sum(fit$sdev^2) *100, 2), "%")) +
ylab(paste("PC2", round(fit$sdev[2]^2/ sum(fit$sdev^2) *100, 2), "%")) +
theme_bw()
Look at all them happy people on the left (well, it is hard to notice because of the colors used; I suggest using the jco palette from the ggpubr library, get_palette('jco', 5), i.e. scale_color_manual(values = get_palette('jco', 5))).
Quite a similar plot can be achieved with the ggord library:
library(ggord)
ggord(fit, grp_in = as.factor(hapiness$Happiness.in.life),
size = 1, ellipse = F, ext = 1.2, vec_ext = 5)
The major difference is that ggord uses equal scaling for the axes. Also, I scaled the rotation by 5 instead of 7 as in the first plot.
As you can see I do not like many intermediate data frames.
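On the original error itself: the question's code stores the grouping variable as group but maps colour = groups, and no library(ggplot2) call is shown. A likely explanation (an assumption, since the data is not available here) is that groups resolves to something other than a column of length 1006, which would produce the "differing number of rows: 0, 1006" message. A minimal sketch of the same plot with the grouping variable stored as a column of the plotted data, using the built-in iris data as a stand-in:

```r
library(ggplot2)

# iris stands in for the survey data; Species plays the role of Happiness.in.life
fit <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

pcData <- data.frame(fit$x)
pcData$group <- factor(iris$Species)   # keep the grouping variable in the data frame

p <- ggplot(pcData, aes(x = PC1, y = PC2)) +
  geom_point(aes(colour = group)) +    # same name as the column defined above
  coord_equal()
p
```

Keeping the grouping variable inside the data frame avoids depending on whatever object of that name happens to exist in the global environment.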

Highchart: Can I use a different variable as the data labels?

I'm trying to build a column chart through highchart in r studio. I've converted the values to % as I want the graph to show %, but I want the data labels to show the value, is there a way of doing this?
My data set has a column with the values for London and the percentages for London, I want the Y axis of the graph to show the % while the data labels show the value.
This is my current code:
hc <- highchart() %>%
hc_title(text= "Gender - London")%>%
hc_colors('#71599b') %>%
hc_yAxis(max = 0.7) %>%
hc_xAxis(categories = Sex$Gender) %>%
hc_add_series(name = "London", type = "column",
data = Sex$LON_PERC, dataLabels = list(enabled=TRUE, format={Sex$London}) )
So, I've put Sex$LON_PERC (% in London) as the data to plot while Sex$London is the data labels.
But this code puts all the values of London in each data label.
Edit:
This is the data I'm trying to plot, LON_PERC on the Y Axis, Gender on the X axis and London as the Data Labels
Gender London LON_PERC
Declined 5 0.000351247
Female 8230 0.578152441
Male 4640 0.325957148
No Data 1360 0.095539164
I am rather uncomfortable working with the highcharter package, as it requires a commercial license, which I do not have.
The result you want can be reached with the following, rather straightforward, code using base R or ggplot2 functionality, both of which are free. I will show this with two code fragments below.
### your data
Sex <- read.table(header = TRUE, text =
"Gender London LON_PERC
Declined 5 0.000351247
Female 8230 0.578152441
Male 4640 0.325957148
'No Data' 1360 0.095539164
")
A solution using base R
The barplot function returns a vector (when beside is FALSE) with the coordinates of the midpoints of all the bars drawn (if beside is TRUE, it is a matrix). This gives us the x-coordinates for placing text above the bars; the bar heights we already have in the data we plot.
# Draw the barplot and store result in `mp`
mp <- barplot(Sex$LON_PERC, # height of the bar
names.arg = Sex$Gender, # x-axis labels
ylim = c(0, 0.7), # limits of y-axis
col = '#71599b', # your color
main = "Gender - London") # main title
# add text to the barplot using the stored values
text(x = mp, # middle of the bars
y = Sex$LON_PERC, # height of the bars
labels = Sex$London, # text to display
adj = c(.5, -1.5)) # adjust horizontally and vertically
This yields the following plot:
A solution based on ggplot
library(ggplot2)
ggplot(aes(x = Gender, y = LON_PERC), data = Sex) +
geom_bar(stat = "identity", width = .60, fill = "#71599b" ) +
geom_text(aes(label = London),
position = position_dodge(width = .9),
vjust = -.3, size = 3, hjust = "center") +
theme_minimal() +
scale_y_continuous(limits = c(0, 0.7),
breaks = seq(0.0, 0.7, by = 0.1),
minor_breaks = NULL) +
labs(title = "Gender - London") +
theme(axis.title.y = element_blank(), axis.title.x = element_blank())
yielding the following plot:
In both cases, a lot of characteristics may be adapted to your needs/wishes.
I hope you benefit from these examples, even though it is not made with highcharter.
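For completeness, a hedged sketch of how the same thing might be done in highcharter itself. It assumes that hchart() with hcaes() passes every column of the data frame to each point, so that a Highcharts format string such as "{point.London}" can reference the raw count; this is a common pattern, but treat it as an assumption and check it against your highcharter version.

```r
library(highcharter)

Sex <- data.frame(
  Gender = c("Declined", "Female", "Male", "No Data"),
  London = c(5, 8230, 4640, 1360),
  LON_PERC = c(0.000351247, 0.578152441, 0.325957148, 0.095539164)
)

hc <- hchart(Sex, "column",
             hcaes(x = Gender, y = LON_PERC),   # bars are drawn from the percentages
             name = "London", color = "#71599b",
             # labels read the London column of each point instead of y
             dataLabels = list(enabled = TRUE, format = "{point.London}")) %>%
  hc_title(text = "Gender - London") %>%
  hc_yAxis(max = 0.7)
hc
```

In the original attempt, format = {Sex$London} passed the whole vector as the format, which is consistent with every label showing all the London values.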
I've found a workaround.
So, I can add in a "tooltip" that appears when I hover over the column/bar.
Firstly, a function is needed:
library(highcharter)
myhc_add_series_labels_values <- function (hc, labels, values, text, colors = NULL, ...)
{
assertthat::assert_that(is.highchart(hc), is.numeric(values),
length(labels) == length(values))
# one row per point: name, y value, and the extra text shown in the tooltip
df <- dplyr::tibble(name = labels, y = values, text = text)
if (!is.null(colors)) {
assertthat::assert_that(length(labels) == length(colors))
df <- dplyr::mutate(df, color = colors)
}
ds <- list_parse(df)
hc <- hc %>% hc_add_series(data = ds, ...)
hc
}
and then when creating the highchart this function needs to be called.
The data looks as follows:
Sex <- read.table(header = TRUE, text =
"Gender London LON_PERC
Declined 5 0.000351247
Female 8230 0.578152441
Male 4640 0.325957148
'No Data' 1360 0.095539164
")
Then the code to generate the highchart is:
Gender<- highchart() %>%
hc_xAxis(categories = Sex$Gender, labels=list(rotation=0))%>%
myhc_add_series_labels_values(labels = Sex$Gender,values=Sex$LON_PERC, text=Sex$London, type="column")%>%
hc_tooltip(crosshairs=TRUE, borderWidth=5, sort=TRUE, shared=TRUE, table=TRUE,pointFormat=paste('<br>%: {point.y}%<br>#: {point.text}'))%>%
hc_legend()
This gives the output below (image not shown).
Then, when I hover over each column/bar, it gives me the % information and the number information.
