I am looking at a dataset from tidytuesday, available here:
video_games <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-07-30/video_games.csv")
I wrote this code to create a horizontal bar plot, ranked in descending order.
video_games %>%
top_n(10, metascore) %>%
arrange(desc(metascore)) %>%
plot_ly(x = ~metascore, y = ~fct_reorder(game, metascore),
type = "bar") %>%
layout(xaxis = list(title = "Metascore"),
yaxis = list(title = ""))
I want to reuse the code with multiple variables without copying and pasting, so I created a function with two arguments for the variables I want to plot. (I left out the layout section. If there is a way to automatically relabel the plot inside the function, that would be cool.)
video_games_ranking_plot <- function(A, B) {
top_n(10, A) %>%
arrange(desc(A)) %>%
plot_ly(x = ~A, y = ~fct_reorder(B, A),
type = "bar")
}
When I run the function
video_games %>%
video_games_ranking_plot(metascore, game)
... I get the error message Error in video_games_ranking_plot(., metascore, game) :
unused argument (game)
Does anyone know why?
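The immediate cause of the error is that the pipe passes video_games as the first argument, so your two-parameter function receives three arguments, hence "unused argument (game)". Beyond that, you are passing the same bare column names metascore and game to very different parts of your custom function, each of which expects a different kind of argument: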
top_n(10, metascore)
arrange(desc(metascore))
plot_ly(x = ~metascore, y = ~fct_reorder(game, metascore))
The fact that you are also passing columns as arguments through a pipe can pose further challenges. I haven't found the time to build a complete solution, but hopefully this will help you on your way:
Plot:
Code:
library(dplyr)
library(forcats)
library(plotly)
# get data
video_games <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-07-30/video_games.csv")
data <- video_games
# custom function
video_games_ranking_plot <- function(data, topn, col_top, col_ord){
# select and arrange data
df <- data %>% top_n(topn, {{col_top}}) %>% arrange(desc({{col_ord}}))
# recover the column names as strings
col_top_name <- deparse(substitute(col_top))
col_ord_name <- deparse(substitute(col_ord))
df2<- df[c(col_top_name, col_ord_name)]
# build plotly plot
p <- plot_ly(x = df2[[col_top_name]], y = df2[[col_ord_name]], type = "bar")
}
plt <- video_games_ranking_plot(data=video_games, topn=5, metascore, game)
plt
There's still an issue with the ~fct_reorder(game, metascore) part.
I had to raise a question myself to even get this far. Take a look at the answer from user Ronak Shah to the post How to pass a dataframe column as an argument in a function using piping? to learn more on how to pass arguments to piping functions.
I hope this helps!
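As a possible way to finish the function, including the automatic relabelling asked about in the question, here is a minimal sketch. It assumes dplyr 1.0.0 or later (for slice_max and the {{ }} operator); the function name ranking_plot, its parameter names, and the intermediate columns value and label are made up for illustration. Renaming the selected columns to fixed names first sidesteps the problem of using arbitrary columns inside plotly's ~ formulas.

```r
library(dplyr)
library(forcats)
library(plotly)

# Hypothetical helper: ranks `data` by `value_col` and plots the top `n` rows.
ranking_plot <- function(data, value_col, label_col, n = 10) {
  # Capture the column name as text so it can be reused as the axis title.
  value_name <- rlang::as_label(rlang::enquo(value_col))

  ranked <- data %>%
    slice_max({{ value_col }}, n = n) %>%
    # Copy the chosen columns to fixed names so the plotly formulas stay simple.
    transmute(
      value = {{ value_col }},
      label = fct_reorder({{ label_col }}, {{ value_col }})
    )

  plot_ly(ranked, x = ~value, y = ~label, type = "bar") %>%
    layout(xaxis = list(title = value_name), yaxis = list(title = ""))
}

# Toy data standing in for video_games.
toy <- data.frame(game = c("a", "b", "c"), metascore = c(90, 70, 80))
p <- toy %>% ranking_plot(metascore, game, n = 2)
```

With the real data, the call would be video_games %>% ranking_plot(metascore, game).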
library(highcharter)
library(dplyr)
library(viridisLite)
library(forecast)
library(treemap)
data("Groceries", package = "arules")
dfitems <- tbl_df(Groceries@itemInfo)
set.seed(10)
dfitemsg <- dfitems %>%
mutate(category = gsub(" ", "-", level1),
subcategory = gsub(" ", "-", level2)) %>%
group_by(category, subcategory) %>%
summarise(sales = n() ^ 3 ) %>%
ungroup() %>%
sample_n(31)
dfitemsg %>%
hctreemap2(group_vars = c("category","subcategory"),
size_var = "sales")%>%
hc_tooltip(pointFormat = "<b>{point.name}</b>:<br>
Pop: {point.value:,.0f}<br>
GNI: {point.colorValue:,.0f}")
the error is the following
Error in hctreemap2(., group_vars = c("category", "subcategory"), size_var = "sales") : Treemap data uses same label at multiple levels.
I tried everything and it doesn't work out, could someone with experience explain to me what is happening?
When I tried your code, it also stated that the function was deprecated and to use data_to_hierarchical. Although, it's never quite that simple, right? I tried multiple ways to get hctreemap2 to work, but wasn't able to discern that issue. From there I turned to the package recommended data_to_hierarchical. Now that worked without an issue--once I figured out the right type, which in hindsight seemed kind-of obvious.
That being said, this is what I've got:
data_to_hierarchical(data = dfitemsg,
group_vars = c(category,subcategory),
size_var = sales) %>%
hchart(type = "treemap") %>%
hc_tooltip(pointFormat = "<b>{point.name}</b>:<br>
Pop: {point.value:,.0f}<br>
GNI: {point.colorValue:,.0f}")
You didn't actually designate a color, so the GNI comes up blank.
Let me know if you run into any issues.
Based on your comment:
I have not found a way to change the color to density, which is what both hctreemap2 and treemap appear to do. The function data_to_hierarchical codes the colors to the first grouping variable, that is, the level 1 variable.
Inadvertently, I did figure out why the function hctreemap2 would not work. It checks whether any category label is the same as a subcategory label. I didn't go through all of the data, but I know there is a perfumery category with a perfumery subcategory. I don't understand why that's a hard stop. If that is a problem for this call, why wouldn't data_to_hierarchical be looking for this issue as well?
So, I changed the function. First, I called the function itself.
x = hctreemap2
Then I selected it from the environment pane. Alternatively, you can code View(x).
This view is read-only, but it's easier to read than the console. I copied the function and assigned it to its original name with changes. I removed two pieces of the code, which changed nothing structurally speaking to how the chart is created.
I removed the first line of code in the function:
.Deprecated("data_to_hierarchical")
and this code (about a third of the way down)
if (data %>% select(!!!group_syms) %>% map(unique) %>% unlist() %>%
anyDuplicated()) {
stop("Treemap data uses same label at multiple levels.")
}
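To see which labels trip that check, the same test can be reproduced in base R. This is only a sketch on made-up data: the toy data frame below is an assumption standing in for dfitemsg, and lapply replaces the purrr::map call from the package code.

```r
# Toy stand-in for dfitemsg: "perfumery" appears at both levels.
toy <- data.frame(
  category    = c("perfumery", "drinks", "drinks"),
  subcategory = c("perfumery", "beer", "wine"),
  stringsAsFactors = FALSE
)

# Same idea as the package check: pool the unique labels of every level
# and ask whether any label occurs more than once in the pooled vector.
pooled <- unlist(lapply(toy[c("category", "subcategory")], unique))
has_clash <- anyDuplicated(pooled) > 0

# List the offending labels explicitly.
clashing <- intersect(unique(toy$category), unique(toy$subcategory))
```

Here has_clash is TRUE and clashing contains "perfumery", which is the situation that makes hctreemap2 stop.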
This left me to recreate the function with this code:
hctreemap2 <- function (data, group_vars, size_var, color_var = NULL, ...)
{
assertthat::assert_that(is.data.frame(data))
assertthat::assert_that(is.character(group_vars))
assertthat::assert_that(is.character(size_var))
if (!is.null(color_var))
assertthat::assert_that(is.character(color_var))
group_syms <- rlang::syms(group_vars)
size_sym <- rlang::sym(size_var)
color_sym <- rlang::sym(ifelse(is.null(color_var), size_var, color_var))
data <- data %>% mutate_at(group_vars, as.character)
name_cell <- function(..., depth) paste0(list(...),
seq_len(depth),
collapse = "")
data_at_depth <- function(depth) {
data %>%
group_by(!!!group_syms) %>%
summarise(value = sum(!!size_sym), colorValue = sum(!!color_sym)) %>%
ungroup() %>%
mutate(name = !!group_syms[[depth]], level = depth) %>%
mutate_at(group_vars, as.character()) %>% {
if (depth == 1) {
mutate(., id = paste0(name, 1))
}
else {
mutate(.,
parent = pmap_chr(list(!!!group_syms[seq_len(depth) - 1]),
name_cell, depth = depth - 1),
id = paste0(parent, name, depth))
}
}
}
treemap_df <- seq_along(group_vars) %>% map(data_at_depth) %>% bind_rows()
data_list <- treemap_df %>% highcharter::list_parse() %>%
purrr::map(~.[!is.na(.)])
colorVals <- treemap_df %>%
filter(level == length(group_vars)) %>% pull(colorValue)
highchart() %>%
hc_add_series(data = data_list, type = "treemap",
allowDrillToNode = TRUE, ...) %>%
hc_colorAxis(min = min(colorVals), max = max(colorVals), enabled = TRUE)
}
Now your code, as originally written, will work. Doing this does not change the highcharter package itself, so if you think you'll use the function again, save its code as well. You will also need the purrr library. Since you already load dplyr (where most conflicts, if any, occur), you could just load tidyverse, which attaches several libraries at once, including both dplyr and purrr.
This is what it will look like with set.seed(10):
If you drill down on the largest block:
It looks odd to me, but I'm guessing that's what you were looking for to begin with.
I built a Shiny app where I create some plot from hist() and density() objects, both saved in a list into an .RDS file from another script file. So, in shiny I only read the .RDS and make the plot.
Everything is working now, except that I cannot find how to change the height of the highchart plot when using the hchart() function. The way my code is built, I cannot work with pipes ("%>%"), because I am using hchart inside a purrr::map() call.
To explain better I created a small example, that follows.
# Example of how the objects are structured
list <-
list(df1 = list(Sepal.Length = hist(iris$Sepal.Length, plot = FALSE)),
df2 = list(Sepal.Length = density(iris$Sepal.Length)))
# Example of a plot built with hchart function
list[['df2']]['Sepal.Length'] %>%
purrr::map(hchart, showInLegend = FALSE)
# Example of what does not work
list[['df2']]['Sepal.Length'] %>%
purrr::map(hchart, showInLegend = FALSE, height = 200)
Actually, I also would like to change more options of the chart, like colors, for example. But I am not finding a way with this solution I found.
Thanks in advance.
Wlademir.
I can see 2 main ways to do what you need (not sure why you can't use the pipe):
Option 1
Create a function to process every data and add the options inside that function:
get_hc <- function(d) {
hchart(d, showInLegend = FALSE) %>%
hc_size(height= 200) %>%
hc_title(text = "Purrr rocks")
}
Then:
list_of_charts <- list[['df2']]['Sepal.Length'] %>%
purrr::map(get_hc)
Option 2
You can call purrr::map successively:
list_of_charts <- list[['df2']]['Sepal.Length'] %>%
purrr::map(hchart, showInLegend = FALSE)
# change height
list_of_charts <- purrr::map(list_of_charts, hc_size, height = 200)
# change title
list_of_charts <- purrr::map(list_of_charts, hc_title, text = "Purrr rocks")
Or you can chain the successive purrr::map calls with %>%:
list_of_charts <- list[['df2']]['Sepal.Length'] %>%
purrr::map(hchart, showInLegend = FALSE) %>%
purrr::map(hc_size, height = 200) %>%
purrr::map(hc_title, text = "Purrr rocks")
I am using highcharter for my graphs. I have several courses and need to draw the same graphs for each of them. I am trying to loop over the courses with a for loop and plot the graphs inside the loop.
The problem is that nothing is plotted inside the loop, although the same code works fine outside it.
for(i in unique(my_data$CourseID)){
courseData<-subset(my_data, my_data$CourseID== i)
#Course by Majer
byMajer <- table(courseData$MajerName)
#barplot(byMajer, main="Students by Majer", xlab="Majers")
byM <- aggregate(courseData$MajerName, by=list(courseData$MajerName), FUN=length)
hc <- highchart(debug = TRUE) %>%
hc_title(text = courseData$CourseName[1]) %>%
hc_chart(type = "column") %>%
hc_xAxis(categories = byM$Group.1) %>%
hc_add_series(data = byM$x)
#this is not working. it shows nothing.
hc
#but if I try to have this, it works, I will use below variables outside the loop
assign(paste("hc",i,sep=""), hc)
}
#Below graphs showing in the output. I need to get rid of them and use the one inside the loop.
hc1; hc2; hc3; hc4; hc5; hc6; hc7; hc8; hc9; hc10; hc11; hc12; hc13; hc14; hc15; hc16; hc17; hc18
Answer to the problem is here: https://stackoverflow.com/a/4716380/4046096
In short: Automatic printing is turned off in a loop, so you need to explicitly print something.
Instead of hc use print(hc).
I do not have your data, but this code works fine:
for(i in c(1,2,3,4,5)){
hc <- highchart(debug = TRUE) %>%
hc_chart(type = "column") %>%
hc_add_series(data = list(1, i, i * 2, 5))
print(hc)
}
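The effect is not specific to highcharter. A small base R sketch (the variable names are only illustrative) shows that a bare expression inside a for loop is evaluated but never auto-printed, while print() always produces output:

```r
# Capture whatever each loop writes to the console.
implicit <- capture.output(for (i in 1:3) i)         # bare value, no print()
explicit <- capture.output(for (i in 1:3) print(i))  # explicit print()

length(implicit)  # nothing was printed by the first loop
explicit          # one printed line per iteration
```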
I've got a set of functions that I'm trying to work with and I'm struggling to figure out why the assignment isn't working. Here are the functions I'm using:
new_timeline <- function() {
timeline = structure(list(), class="timeline")
timeline$title <- list("text" = list("headline" = NULL, "text" = NULL),
"start_date" = list("year" = NULL, "month" = NULL, "day" = NULL),
"end_date" = list("year" = NULL, "month" = NULL, "day" = NULL))
return(timeline)
}
.add_date <- function(self, date, time_type) {
valid_date <- stringr::str_detect(date, "^[0-9]{4}(-[0-9]{1,2}){0,2}$")
if (!valid_date) {
stringr::str_interp("Your ${time_type} date does not appear to be formatted correctly. It must be of the form 'yyyy-mm-dd'. Only the year is required.") %>% stop()
}
date_elements <- date %>% as.character() %>% stringr::str_split(" ") %>% unlist()
date <- date_elements[1] %>% stringr::str_split("-") %>% unlist()
stringr::str_interp("self$title$${time_type}_date$year <- date[1]") %>% parse(text = .) %>% eval()
if (!is.na(date[2])) stringr::str_interp("self$title$${time_type}_date$month <- date[2]") %>% parse(text = .) %>% eval()
if (!is.na(date[3])) stringr::str_interp("self$title$${time_type}_date$day <- date[3]") %>% parse(text = .) %>% eval()
return(self)
}
edit_title <- function(self, headline = NULL, text = NULL, start_date = NULL, end_date = NULL) {
if (class(self) != "timeline") stop("The object passed must be a timeline object.")
if (is.null(headline) && is.null(self$title$text$headline)) stop("Headline cannot be empty when adding a new title.")
if (!is.null(headline)) self$title$text$headline <- headline
if (!is.null(text)) self$title$text$text <- text
if (!is.null(start_date)) self <- .add_date(self, date = start_date, time_type = "start")
if (!is.null(end_date)) self <- .add_date(self, date = end_date, time_type = "end")
return(self)
}
EDIT: The above code has been severely reduced per a request in the comments. The code is still sufficient to reproduce the error.
I know that's a bit long-winded, so I apologize. The first function establishes a new timeline object. The third function allows us to change the title of the timeline object and the second function is a helper function that handles dates. The code would be used like this:
library(magrittr)
#devtools::install_github("hadley/stringr")
library(stringr)
tl <- new_timeline()
tl <- tl %>% edit_title(headline = "My Timeline", text = "Example", start_date = "2015-10-18")
The code runs with no errors, but when I call tl$title$start_date$year, it comes back as NULL. Using an answer I got in this previous question I asked, I tried to set envir = globalenv() within the eval function. When I do that, the function returns an error saying that object self cannot be found.
So I'm under the impression that self is held in the parent.frame(). So I add both of these to a list: envir = list(globalenv(), parent.frame()). This causes the function to run without error, but there's still no assignment.
Where am I going wrong? Thanks in advance!
As mentioned in the comments, I think you could probably do away with all of the code parsing and just pass variables in [[ for your assignments. Anyway, when you use the pipe operator a bunch of function wrapping happens so determining how many frames to go back is painful. Here are a couple solutions modifying the .add_date function.
You already found one, using <<-, since it searches back through the parent environments until it finds the variable (or doesn't, and then assigns it in the global environment).
Another would be just storing the function environment() and passing that to eval.
A third would be counting how many frames deep you go, and using sys.frame to tell eval which environment to look in.
.add_date <- function(self, date, time_type) {
valid_date <- stringr::str_detect(date, "^[0-9]{4}(-[0-9]{1,2}){0,2}$")
if (!valid_date) {
stringr::str_interp("Your ${time_type} date does not appear to be formatted correctly. It must be of the form 'yyyy-mm-dd'. Only the year is required.") %>% stop()
}
## Examining environments
e <- environment() # current env
efirst <- sys.nframe() # frame number
print(paste("Currently in frame", efirst))
envs <- stringr::str_interp("${date}") %>% parse(text=.) %>% {.; sys.frames()} # list of frames
elast <- stringr::str_interp("${date}") %>% parse(text=.) %>% {.; sys.nframe()} # number of last
print(paste("Went", elast, "frames deep."))
## Go back this many frames in eval
goback <- efirst-elast
date_elements <- date %>% as.character() %>% stringr::str_split(" ") %>% unlist()
date <- date_elements[1] %>% stringr::str_split("-") %>% unlist()
## Solution 1: use sys.frame
stringr::str_interp("self$title$${time_type}_date$year <- date[1]") %>%
parse(text = .) %>% eval(envir=sys.frame(goback))
## Solution 2: use environment defined in function
if (!is.na(date[2])) stringr::str_interp("self$title$${time_type}_date$month <- date[2]") %>%
parse(text = .) %>% eval(envir=e)
return(self)
}
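For the suggestion at the start of this answer, dropping the code parsing and indexing with [[ instead, a minimal base R sketch could look like the following. The name add_date_sketch and the cut-down timeline object are assumptions made for illustration, and the date validation from the original function is left out to keep it short.

```r
# Cut-down stand-in for the object returned by new_timeline().
tl <- structure(
  list(title = list(
    start_date = list(year = NULL, month = NULL, day = NULL),
    end_date   = list(year = NULL, month = NULL, day = NULL)
  )),
  class = "timeline"
)

# Hypothetical replacement for .add_date: builds the element name as a string
# and assigns through [[ ]], so no parse()/eval() and no environment lookup.
add_date_sketch <- function(self, date, time_type) {
  slot  <- paste0(time_type, "_date")
  parts <- strsplit(as.character(date), "-", fixed = TRUE)[[1]]
  self$title[[slot]]$year <- parts[1]
  if (!is.na(parts[2])) self$title[[slot]]$month <- parts[2]
  if (!is.na(parts[3])) self$title[[slot]]$day   <- parts[3]
  self
}

tl <- add_date_sketch(tl, "2015-10-18", "start")
tl$title$start_date$year
```

Because the assignment happens on the function's own copy of self and that copy is returned, the result does not depend on how many frames the pipe adds.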
Does anyone know why range in scale_numeric in ggvis does not work correctly?
library(ggvis)
mtcars %>%
ggvis(~wt,~hp) %>%
layer_points() %>%
scale_numeric("x", range = c(2,3))
Update
When I use domain = c(2,3), this is the result:
Next update
OK, after using domain = c(2,3) with clamp = T, the result is better, but it is still not the expected outcome.
Thanks to @NicE and @jazzurro I figured it out. One more thing is needed when the plot contains just one point: I add not only scale_numeric("y",...) but also scale_numeric("x",...), because without it the plot does not look right.
df <- mtcars[mtcars$wt>2.4 & mtcars$wt<2.5,]
df %>%
ggvis(~wt,~hp) %>%
layer_points() %>%
# try with and without scale_numeric("x",...), and see what happens
# scale_numeric("x", domain = c(2,3), clamp = T, nice = F) %>%
scale_numeric("y", domain = c(50,100), clamp = T, nice = F)