How to add "labels" and "value" arguments with highcharter - r

I want to specify which columns to use as the label and the value in a pie chart.
The problem is that hc_add_series_labels_values(), which accepts these two arguments, gives me no output; it seems to be deprecated.
hc_add_series() seems to pick the two columns automatically, depending on their order, type, and so on.
This package is not well documented; I couldn't find what I need.
Thanks
In my example I want to use the name2 column as the label
and high as the value. How can I do that?
library(dplyr)
library(highcharter)
library(stringr)  # provides the `fruit` vector and str_length()
n <- 5
set.seed(123)
colors <- c("#d35400", "#2980b9", "#2ecc71", "#f1c40f", "#2c3e50", "#7f8c8d")
colors2 <- c("#000004", "#3B0F70", "#8C2981", "#DE4968", "#FE9F6D", "#FCFDBF")
df <- data.frame(x = seq_len(n) - 1) %>%
  mutate(
    y = 10 + x + 10 * sin(x),
    y = round(y, 1),
    z = (x*y) - median(x*y),
    e = 10 * abs(rnorm(length(x))) + 2,
    e = round(e, 1),
    low = y - e,
    high = y + e,
    value = y,
    name = sample(fruit[str_length(fruit) <= 5], size = n),
    color = rep(colors, length.out = n),
    segmentColor = rep(colors2, length.out = n)
  )
df$name2 <- c("mos", "ok", "kk", "jji", "hufg")
## x y z e low high value name color segmentColor
## 1 0 10.0 -25.6 7.6 2.4 17.6 10.0 plum #d35400 #000004
## 2 1 19.4 -6.2 4.3 15.1 23.7 19.4 lemon #2980b9 #3B0F70
## 3 2 21.1 16.6 17.6 3.5 38.7 21.1 mango #2ecc71 #8C2981
## 4 3 14.4 17.6 2.7 11.7 17.1 14.4 pear #f1c40f #DE4968
## 5 4 6.4 0.0 3.3 3.1 9.7 6.4 apple #2c3e50 #FE9F6D
highchart() %>%
  hc_chart(type = "pie") %>%
  hc_add_series(df, name = "Fruit Consumption", showInLegend = FALSE)

For people who have the same problem:
This package works much like ggplot2. The hchart() function does the job when the column mapping is passed through hcaes():
hchart(df, type = "pie", hcaes(name2, high))
Output: [pie chart image]
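
The positional call hcaes(name2, high) maps the first column to x and the second to y. A minimal sketch of the same chart with the mapping spelled out by name is below. It assumes the highcharter package is installed and rebuilds only the two columns the chart needs, using the name2 and high values shown in the question; pie_df is a name introduced here for illustration.

```r
# Only the two columns the pie chart needs; values copied from the question.
pie_df <- data.frame(
  name2 = c("mos", "ok", "kk", "jji", "hufg"),
  high  = c(17.6, 23.7, 38.7, 17.1, 9.7)
)

# Guarded so the script still runs where highcharter is not installed.
if (requireNamespace("highcharter", quietly = TRUE)) {
  library(highcharter)
  # name = slice label, y = slice size
  p <- hchart(pie_df, type = "pie", hcaes(name = name2, y = high))
  # print(p) displays the widget in an interactive session
}
```

Naming the aesthetics makes it obvious which column drives the labels and which drives the slice sizes, independent of column order in the data frame.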

Related

Make ggplot2 graph based on a matrix

I am trying to draw ten lines with ggplot2. The data come from a 15*10 matrix, and I want line 1 to use the first column of the matrix, line 2 the second column, and so on. My code looks like this, but the output is really weird.
The code is:
prop_df <- data.frame(prop_combined)
colnames(prop_df) <- c('Q1', 'Q2', 'Q3', 'Q4', 'Q5', 'Q6', 'Q7', 'Q8', 'Qinf', 'Qhyb')
prop_df <- stack(as.data.frame(prop_combined))
prop_df$x <- (rep(seq_len(nrow(prop_combined)), ncol(prop_combined))-1)/10
ggplot(data = prop_df, aes(x = x, y = values), group = ind) +
  geom_line() +
  geom_point() +
  labs(color='Gamma') +
  ylim(0, 80) +
  xlab(TeX("$tau$")) +
  ylab("Power")
The prop_df looks like:
values ind x
1 8.6 Q1 0.0
2 10.7 Q1 0.1
3 11.8 Q1 0.2
4 11.2 Q1 0.3
5 13.0 Q1 0.4
6 15.6 Q1 0.5
7 21.4 Q1 0.6
8 25.0 Q1 0.7
9 28.8 Q1 0.8
10 34.2 Q1 0.9
11 39.5 Q1 1.0
12 48.2 Q1 1.1
13 55.2 Q1 1.2
14 61.6 Q1 1.3
15 67.2 Q1 1.4
16 71.7 Q1 1.5
17 8.8 Q2 0.0
18 11.0 Q2 0.1
19 10.7 Q2 0.2
And the output is: [plot image]
In addition to @Freguglia's suggestions, I made some changes to produce a clearer plot. I loaded the Cairo package so that the Greek letter in the x-axis label can be rendered.
Sample code:
library(ggplot2)
library(Cairo) # to define the greek letters
prop_df <- data.frame(prop_df)
#colnames(prop_df) <- c('Q1', 'Q2', 'Q3', 'Q4', 'Q5', 'Q6', 'Q7', 'Q8', 'Qinf', 'Qhyb') # for this example you don't need this line
#prop_df <- stack(as.data.frame(prop_combined)) # this was already
#prop_df$x <- (rep(seq_len(nrow(prop_combined)), ncol(prop_combined))-1)/10
prop_df$values=factor(prop_df$values, levels=c("8.6","8.8", "11", "10.7", "11.2", "11.8", "13", "15.6" ,"21.4" ,"25", "28.8", "34.2", "39.5", "48.2", "55.2", "61.6", "67.2", "71.7"))
prop_df$x=factor(prop_df$x, levels=c("0", "0.1","0.2","0.3","0.4","0.5","0.6","0.7","0.8","0.9","1","1.1","1.2","1.3","1.4","1.5"))
ggplot(data = prop_df, aes(x = x, y = values, group = ind)) +
  geom_line(lwd=2) +
  geom_point(aes(color=ind), size=6) +
  labs( y="Power",color="Conditions", x=paste(expression("\u03A4"))) +
  theme_minimal()
Plot: [image]
Sample data:
values=c(8.6,10.7, 11.8, 11.2, 13.00,15.6,21.4, 25.00, 28.8,34.2, 39.5, 48.2, 55.2,61.6,67.2, 71.7,8.8, 11.00, 10.7)
x=c(0, 0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1,1.1,1.2,1.3,1.4,1.5, 0.00, 0.1, 0.2)
ind=c("Q1","Q1","Q1","Q1","Q1","Q1","Q1","Q1","Q1","Q1","Q1","Q1","Q1","Q1","Q1","Q1","Q2","Q2","Q2")
prop_df<-cbind(values, ind,x)
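
The reshaping step can be checked on a small matrix before plotting. The sketch below keeps both axes numeric and puts the grouping inside aes(); it is one possible arrangement rather than the answer's exact method. The matrix m is a hypothetical stand-in for prop_combined, filled with the first three Q1 and Q2 values from the question, and ggplot2 is assumed to be installed.

```r
# Hypothetical 3 x 2 stand-in for prop_combined (values from the question).
m <- cbind(Q1 = c(8.6, 10.7, 11.8), Q2 = c(8.8, 11.0, 10.7))

# stack() turns each column into a block of rows: `values` plus a factor `ind`.
long <- stack(as.data.frame(m))
# Row index within each column, rescaled to 0, 0.1, 0.2, ...
long$x <- (rep(seq_len(nrow(m)), ncol(m)) - 1) / 10

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # group and colour sit inside aes(), so each matrix column becomes its own line
  p <- ggplot(long, aes(x = x, y = values, group = ind, colour = ind)) +
    geom_line() +
    geom_point() +
    labs(x = "\u03C4", y = "Power", colour = "Gamma")
  # print(p) draws the plot
}
```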

ggplot boxplot with mean and confidence interval by group

I'd like to make a boxplot that shows the mean instead of the median. Moreover, I would like the whiskers to stop at the 5% (lower) and 95% (upper) quantiles. Here is the code:
ggplot(data, aes(x=Cement, y=Mean_Gap, fill=Material)) +
  geom_boxplot(fatten = NULL,aes(fill=Material), position=position_dodge(.9)) +
  xlab("Cement") + ylab("Mean cement layer thickness") +
  stat_summary(fun=mean, geom="point", aes(group=Material), position=position_dodge(.9),color="black")
I'd like to change the geom to errorbar, but this doesn't work. I tried middle = mean(Mean_Gap), but that doesn't work either. I tried ymin = quantile(y, 0.05), but nothing changed. Can anyone help me?
The standard boxplot using ggplot, with fill mapped to Material: [image]
Here is how you can create the boxplot using custom parameters for the box and whiskers. It is the solution shown by @lukeA in stackoverflow.com/a/34529614/6288065, extended to show how to draw several boxes by group.
The R built-in data set called "ToothGrowth" is similar to your data structure so I will use that as an example. We will plot the length of tooth growth (len) for each vitamin C supplement group (supp), separated/filled by dosage level (dose).
# "ToothGrowth" at a glance
head(ToothGrowth)
# len supp dose
#1 4.2 VC 0.5
#2 11.5 VC 0.5
#3 7.3 VC 0.5
#4 5.8 VC 0.5
#5 6.4 VC 0.5
#6 10.0 VC 0.5
library(dplyr)
library(ggplot2)  # needed for the plot further down
# recreate the data structure with specific "len" coordinates to plot for each group
df <- ToothGrowth %>%
  group_by(supp, dose) %>%
  summarise(
    y0 = quantile(len, 0.05),
    y25 = quantile(len, 0.25),
    y50 = mean(len),
    y75 = quantile(len, 0.75),
    y100 = quantile(len, 0.95))
df
## A tibble: 6 x 7
## Groups: supp [2]
# supp dose y0 y25 y50 y75 y100
# <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 OJ 0.5 8.74 9.7 13.2 16.2 19.7
#2 OJ 1 16.8 20.3 22.7 25.6 26.9
#3 OJ 2 22.7 24.6 26.1 27.1 30.2
#4 VC 0.5 4.65 5.95 7.98 10.9 11.4
#5 VC 1 14.0 15.3 16.8 17.3 20.8
#6 VC 2 19.8 23.4 26.1 28.8 33.3
# boxplot using the mean for the middle line and the 5%/95% quantiles for the whiskers
ggplot(df, aes(supp, fill = as.factor(dose))) +
  geom_boxplot(
    aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),
    stat = "identity"
  ) +
  labs(y = "len", title = "Boxplot with Mean Middle Line") +
  theme(plot.title = element_text(hjust = 0.5))
In the figure above, the boxplot on the left is the standard boxplot with regular median line and regular min/max whiskers. The boxplot on the right uses the mean middle line and 5%/95% quantile whiskers.
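
Another way to get the same five statistics is to compute them inside the plot: stat_summary() accepts a function that returns ymin, lower, middle, upper and ymax for each group. The following is a minimal sketch of that alternative, assuming ggplot2 is installed; mean_box is a name chosen here for illustration.

```r
# Box statistics with the mean as the middle line and 5%/95% whiskers.
mean_box <- function(v) {
  q <- quantile(v, c(0.05, 0.25, 0.75, 0.95), names = FALSE)
  data.frame(ymin = q[1], lower = q[2], middle = mean(v),
             upper = q[3], ymax = q[4])
}

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(ToothGrowth, aes(supp, len, fill = factor(dose))) +
    stat_summary(fun.data = mean_box, geom = "boxplot",
                 position = position_dodge(width = 0.9)) +
    labs(fill = "dose")
  # print(p) draws one box per supp/dose combination
}
```

This keeps the raw data in the plot call, so no separate summary table has to be maintained when the grouping changes.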

Interpolation (akima) omits part of data when x/y contains duplicate elements

I am making a function which receives three vectors, interpolates them using akima and plots them using plot_ly(). Although the general code works, I am encountering issues with scaling of the z-matrix that interp() outputs.
Let me give you an example:
x is a non-NA numeric containing some duplicate values.
y is a non-NA numeric containing some duplicate values.
z is a non-NA continuous vector
Some summary statistics:
> unique(x)
[1] 60 48 36 32 18 24 30 15 12 28 21 19 54 20 16 27 10 39 14 17 9 6 50 8 13
> range(x)
[1] 6 60
> unique(y)
[1] 10.00 10.50 13.50 12.50 14.00 12.00 11.00 9.00 11.50 9.25 13.00 10.25 6.50 6.75 8.25 9.50
[17] 8.00 8.85 9.75 7.90 7.00 8.60 8.75 7.50 8.90 8.50 7.49 7.40 5.50 7.60 7.25 8.35
[33] 6.00 5.00 7.75 7.35 6.30 4.50 5.75 8.40 5.60 5.90 7.74 9.90 6.20 5.80
> range(y)
[1] 4.5 14.0
> head(z)
[1] 2.877272 3.267328 3.175478 3.843326 4.809792 2.827825
> range(z)
[1] 2.316529 28.147808
I implement the baseline function below:
labs = list(x = 'x', y = 'y', z = 'z')
mat = interp(x, y, z, duplicate = 'mean', extrap = T, xo = sort(unique(x)))
plot_ly(x = mat$x, y = mat$y, z = mat$z, type = 'surface') %>%
  layout(title = title,
         scene = list(xaxis = list(title = labs$x),
                      yaxis = list(title = labs$y),
                      zaxis = list(title = labs$z)))
When I run this, the output is the following: [image]
The issue is that a portion of the data is not covered in this picture. For instance, there is a sizeable data portion around x > 50, y < 11 that is omitted by the interpolation (and hence not plotted).
> length(x[x > 50])
[1] 304
> length(y[x > 50 & y < 11])
[1] 290
> length(z[x > 50 & y < 11])
[1] 290
I suspected that this has to do with the duplicate x values. Hence, I configured the xo argument in interp() such that:
mat = interp(x, y, z, duplicate = 'mean', xo = sort(unique(x)), decreasing = T)
In which case the previously omitted region is partially plotted. It looks like the following: [image]
Nonetheless, the x and y axes still do not correspond to their respective data ranges (despite data availability). Bottom line: How do I tweak the function such that the surface always extends the full range of x and y?
Best
It turns out that the error arose in the call to plot_ly(). The z-matrix cannot be passed straight from interp() to plot_ly(), because the axes end up swapped in the graph: interp() returns z with rows indexed by x and columns indexed by y, whereas a plotly surface expects rows indexed by y and columns indexed by x. Hence, the interpolated z-matrix needs to be transposed.
If you use these two functions in combination, make sure to transform z as shown below:
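library(akima)
library(plotly)
mat = interp(x, y, z, duplicate = 'mean')
x = mat$x
y = mat$y
# reshape so that rows follow y and columns follow x (same as t(mat$z))
z = matrix(mat$z, nrow = length(mat$y), byrow = TRUE)
# name the arguments: the first positional argument of plot_ly() is `data`
plot_ly(x = x, y = y, z = z, type = 'surface')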
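
Why the reshaping works can be seen without akima or plotly. The sketch below builds a small matrix laid out the way interp() lays out z (rows follow x, columns follow y) and checks that the matrix(..., byrow = TRUE) call from the answer is the same as a plain transpose. The grid values and the function used to fill the matrix are made up for illustration.

```r
# Toy grid: 4 x-values and 3 y-values (illustrative numbers only).
gx <- c(6, 20, 40, 60)
gy <- c(4.5, 9, 14)

# Same layout as interp()$z: z_xy[i, j] belongs to (gx[i], gy[j]).
z_xy <- outer(gx, gy, function(a, b) a + 10 * b)

# The reshaping used in the answer ...
z_yx <- matrix(z_xy, nrow = length(gy), byrow = TRUE)

# ... is a transpose: rows now follow y and columns follow x,
# which is the orientation a plotly surface expects.
dim(z_yx)
identical(z_yx, t(z_xy))
```

Writing t(mat$z) directly is therefore an equivalent and shorter way to express the same step.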

R animated plotly: line graph not plotting line

I am trying to create an animated plotly line graph in R using my own data. The animation works when I use markers as the 'mode'; however, when I change the mode to 'lines' or 'plines', nothing shows on the graph.
Any suggestions?
Data:
CH4
X FIRST SECOND
1 1 23.9 71.9
2 2 2.9 23.7
3 3 85.7 6.0
4 4 1.2 94.0
5 5 1.1 66.8
6 6 1.5 99.9
Code:
plot_ly(CH4, x=~X, y=~FIRST, name="FIRST",
        hoverinfo = 'text',
        text = ~paste('Test Round: ', CH4$X, '<br>',
                      'Concentration: ', CH4$SECOND),
        type = "scatter", mode = "plines", frame=~frame) %>%
  add_trace(x=~X, y = ~SECOND, name="SECOND", mode = 'plines') %>%
  layout(yaxis = list(title = "CH4 Concentrations"),
         xaxis = list(title = "Test Round"))
You can find a solution here.
Starting from your data, you need to generate an accumulated dataframe, using for example the accumulate_by function defined below.
dts <- read.table(text="
X FIRST SECOND
1 1 23.9 71.9
2 2 2.9 23.7
3 3 85.7 6.0
4 4 1.2 94.0
5 5 1.1 66.8
6 6 1.5 99.9
", header=T)
library(plotly)
library(lazyeval)
library(dplyr)
accumulate_by <- function(dat, var) {
  var <- f_eval(var, dat)
  lvls <- plotly:::getLevels(var)
  dats <- lapply(seq_along(lvls), function(x) {
    cbind(dat[var %in% lvls[seq(1, x)], ], frame = lvls[[x]])
  })
  bind_rows(dats)
}
CH4 <- dts %>% accumulate_by(~X)
head(CH4, 10)
# X FIRST SECOND frame
# 1 1 23.9 71.9 1
# 2 1 23.9 71.9 2
# 3 2 2.9 23.7 2
# 4 1 23.9 71.9 3
# 5 2 2.9 23.7 3
# 6 3 85.7 6.0 3
# 7 1 23.9 71.9 4
# 8 2 2.9 23.7 4
# 9 3 85.7 6.0 4
# 10 4 1.2 94.0 4
With the accumulated data frame, the animation draws the lines (note that the valid plotly mode is 'lines'; 'plines' is not a recognised mode):
plot_ly(CH4, x=~X, y=~FIRST, frame=~frame,
        name='FIRST', hoverinfo = 'text',
        text = ~paste('Test Round: ', CH4$X, '<br>',
                      'Concentration: ', CH4$SECOND),
        type = 'scatter', mode = 'lines') %>%
  add_trace(x=~X, y = ~SECOND, name='SECOND', mode = 'lines') %>%
  layout(yaxis = list(title = 'CH4 Concentrations'),
         xaxis = list(title = 'Test Round'))
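
accumulate_by relies on lazyeval and on an unexported plotly helper. The idea itself needs neither: frame k simply contains every row whose X is at most the k-th distinct value. A base R sketch under that assumption follows; accumulate_frames is a hypothetical name, and it takes the column name as a string rather than a formula.

```r
# Six rows from the question.
dts <- data.frame(
  X      = 1:6,
  FIRST  = c(23.9, 2.9, 85.7, 1.2, 1.1, 1.5),
  SECOND = c(71.9, 23.7, 6.0, 94.0, 66.8, 99.9)
)

# For each distinct value of `var`, keep all rows up to that value
# and tag them with the frame they belong to.
accumulate_frames <- function(dat, var) {
  lvls <- sort(unique(dat[[var]]))
  parts <- lapply(lvls, function(l) {
    cbind(dat[dat[[var]] <= l, , drop = FALSE], frame = l)
  })
  out <- do.call(rbind, parts)
  rownames(out) <- NULL
  out
}

acc <- accumulate_frames(dts, "X")
head(acc, 6)
```

The result has the same X, FIRST, SECOND and frame columns as the accumulated CH4 above, so it can be passed to plot_ly() with frame = ~frame in the same way.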

How do I prevent x labels from overlapping my bars in a barplot?

Here is the code:
barplot(colMeans(sample_data, na.rm = TRUE),
        las = 1,
        main = "Main Title",
        xlab = "Variable",
        ylab = "How Characteristic",
        col = rainbow(20),
        cex.names = 0.9,
        horiz = FALSE)
A sample data set is available here:
https://github.com/akaEmma/public_data/blob/master/sample_data.csv
Or you can type some of it in yourself. These are the variable names:
Love of Chocolate,Asian Knowledge,Stable Cleanliness,Love of People,Attention,Ethics,Aggression,Swimming,Style Points,Felinity
And here are some of the data that go with the names:
8.67 9 6.25 7.33 6.33 5 6.67 5 5.25
8 3 6 6.67 8 7 7.67 4.5 5.25
7.33 7.5 5.75 8.67 8.67 8 5.33 2.5 3
8 6.5 6 6.33 8.33 5.33 5.67 6 6.5
6 5.5 5.25 5.33 5 4.67 4 4 3.5
7.67 7 6 4.67 7.33 5.67 7.67 5 3.75
8.67 8 7.5 5.67 7.33 5 8.33 7 7.75
If I use the code above I get the following (ignore the periods; they aren't important): [image]
If I create a larger plot (like fill my screen with it) I get this: [image]
(ignore the missing label; I accidentally left it off and it's supposed to be "Felinity," whatever that is)
This sort of bar chart is for a PowerPoint on a huge screen, so I can go very small with the labels.
Here is what I want: I want clean pretty labels, one per bar, and since this is a wish list, I want the labels to adjust their own sizes so that they are small enough to fit one per bar, and I want them to be at the right vertical point so that they do not overlap with the bars. Any ideas?
Go crazy. I want beautiful bar charts and I have to make a lot of them, so twiddling for each one is simply not an option. This has to work every time with data files of this type regardless of the length of the variable names.
Thanks!
Please note that theme_set and theme_tufte are ggplot2-specific functions.
Using ggplot2 you can do something like this
df <- read.csv("https://raw.githubusercontent.com/akaEmma/public_data/master/sample_data.csv")
library(tidyverse)
library(ggthemes)
df %>%
  gather(key, value) %>%
  group_by(key) %>%
  summarise(mean.value = mean(value, na.rm = TRUE)) %>%
  # order the bars by decreasing mean
  mutate(key = factor(key, levels = key[rev(order(mean.value))])) %>%
  ggplot(aes(key, mean.value, fill = as.numeric(key))) +
  geom_col() +
  theme_tufte() +
  # guide = "none" hides the legend (guide = FALSE is deprecated in recent ggplot2)
  scale_fill_gradientn(colours = rainbow(5), guide = "none") +
  theme(axis.text.x = element_text(size = 6)) +
  labs(x = "", y = "How characteristic")
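
For base graphics, one option that avoids per-chart tweaking is to wrap long names onto several lines before passing them to barplot(). The sketch below is one way to do that, not a recipe verified against every data file: wrap_labels is a hypothetical helper, the width of 10 characters and the bottom margin are arbitrary choices, and the bar heights are placeholder numbers rather than the asker's data.

```r
# Turn "Love.of.Chocolate" into "Love of\nChocolate" so labels stack vertically.
wrap_labels <- function(x, width = 10) {
  x <- gsub(".", " ", x, fixed = TRUE)
  vapply(x, function(s) paste(strwrap(s, width = width), collapse = "\n"),
         character(1), USE.NAMES = FALSE)
}

# Placeholder means; in practice use colMeans(sample_data, na.rm = TRUE).
means <- c(Love.of.Chocolate = 7, Stable.Cleanliness = 6, Felinity = 4)

old_par <- par(mar = c(6, 4, 4, 2))  # extra bottom margin for two-line labels
barplot(means,
        names.arg = wrap_labels(names(means)),
        main = "Main Title",
        ylab = "How Characteristic",
        col = rainbow(length(means)),
        cex.names = 0.9)
par(old_par)
```

Because the wrapping depends only on the names, the same call can be reused across data files; very long single words are never split, so cex.names may still need lowering for those.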
