Is there a way to paginate by group in trelliscopejs R package? - r

I would like the user to be able to page through the plots in a trelliscopejs object by group. In the following example, the group I would like to paginate by is "continent". As written, the display always shows 10 plots per page: the first 5 plots are from Africa, the next 3 from the Americas, and the final 2 from Asia. Instead, I would like each page to show only one continent by default (the user could then sort and filter as needed). The 5 Africa plots would appear on the first page; the user would then click next to view the 3 Americas plots, click next again to view the 3 Asia plots, and so on.
library(trelliscopejs)
library(tidyverse)
library(gapminder)
library(purrr)
library(plotly)
# reduced gapminder dataset to use for this example
set.seed(123)
data <- gapminder %>%
filter(country %in% sample(levels(country),15)) %>%
nest(data = !one_of(c("country", "continent")))
# add a plot column with map_plot
pops <- data %>% mutate(
panel = map_plot(data, function(x) {
plot_ly(data = x, x = ~year, y = ~pop,
type = "scatter", mode = "markers")
}))
# generate trelliscope object
pops %>%
arrange(continent, country) %>%
trelliscope(name = "populations", nrow = 2, ncol = 5)
Would it be possible to paginate by group in this way?
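To make the requested behaviour concrete, here is a minimal base R sketch of the two page assignments. It does not use any trelliscopejs API; the panel counts are hypothetical and only mirror the first three continents described in the question.

```r
# Hypothetical panel counts for the first three continents in the example
plots <- data.frame(
  continent = rep(c("Africa", "Americas", "Asia"), times = c(5, 3, 3)),
  stringsAsFactors = FALSE
)

# Current behaviour: fixed pages of nrow * ncol = 10 panels
plots$page_fixed <- (seq_len(nrow(plots)) - 1) %/% 10 + 1

# Desired behaviour: a new page starts whenever the continent changes
plots$page_by_group <- match(plots$continent, unique(plots$continent))

# Cross-tabulate continents against the desired page numbers
table(plots$continent, plots$page_by_group)
```

Under the fixed layout the third Asia panel spills onto page 2, whereas the grouped assignment gives each continent its own page. A continent with more panels than fit on one page would still need some further rule, which this sketch leaves out.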

Related

Why is my bar chart not showing all the data?

I am working on a music streaming project, and I am trying to get the top 15 global streams in 2020 and turn them into an interactive graph.
The top 15 song names display correctly as a data frame, but they fail to show up in a bar graph. Where did I go wrong? The chart did render after I flipped it to horizontal, but the data seem a bit off.
It looks like this as a vertical bar graph:
The horizontal bar graph looks like this, but the data seem incorrect:
Here is the code I have:
library("dplyr")
library("ggplot2")
# load the .csv into R studio, you can do this 1 of 2 ways
#read.csv("the name of the .csv you downloaded from kaggle")
spotiify_origional <- read.csv("charts.csv")
spotiify_origional <- read.csv("https://raw.githubusercontent.com/info201a-au2022/project-group-1-section-aa/main/data/charts.csv")
View(spotiify_origional)
# filters down the data
# removes the track id, explicit, and duration columns
spotify_modify <- spotiify_origional %>%
select(name, country, date, position, streams, artists, genres = artist_genres)
# returns only the data from 2022
# this is the data set you should use in the project
spotify_2022 <- spotify_modify %>%
filter(date >= "2022-01-01") %>%
arrange(date) %>%
group_by(date)
# use write.csv() to turn the new dataset into a .csv file
# template: write.csv(your_data_frame, "Path to export the DataFrame\\File Name.csv", row.names = FALSE)
write.csv(spotify_2022, "/Users/oliviasapp/Documents/info201/project-group-1-section-aa/data/spotify_2022.csv" , row.names = FALSE)
# then I pushed the spotify_2022.csv to the GitHub repo
View(spotiify_origional)
spotify_2022_global <- spotify_modify %>%
filter(date >= "2022-01-01") %>%
filter(country == "global") %>%
arrange(date) %>%
group_by(streams)
View(spotify_2022_global)
top_15 <- spotify_2022_global[order(spotify_2022_global$streams, decreasing = TRUE), ]
top_15 <- top_15[1:15,]
top_15$streams <- as.numeric(top_15$streams)
View(top_15)
col_chart <- ggplot(data = top_15) +
geom_col(mapping = aes(x = name, y = streams)) +
ggtitle("Top 15 Songs Daily Streamed Globally") +
theme(plot.title = element_text(hjust = 0.5))
col_chart <- col_chart + coord_cartesian(ylim = c(999000,1000000)) + coord_flip()
col_chart
Thank you so much! Any suggestions will hugely help!
top_15 <- spotify_2022_global[order(spotify_2022_global$streams, decreasing = TRUE), ]
This code sorts in decreasing order, but the streams data here is still of character type, so numbers like 999975 will be "higher" than 1M, which is why your data looks weird. One song had two weeks just under 1M which is why it shows up with ~2M.
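A small illustration of the character-versus-numeric ordering described above, using made-up stream counts rather than the real data:

```r
# Hypothetical values, stored as text the way the answer describes the streams column
streams_chr <- c("1000500", "999975", "750000")

# Character sort compares the strings character by character,
# so a value starting with "9" outranks one starting with "1"
chr_order <- sort(streams_chr, decreasing = TRUE)

# Numeric sort after conversion ranks by magnitude
num_order <- sort(as.numeric(streams_chr), decreasing = TRUE)

print(chr_order)
print(num_order)
```

The character sort puts "999975" ahead of "1000500", which is the effect that makes the top-15 selection look wrong.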
If you use this instead, you'll get something closer to what you intended:
top_15 <- spotify_2022_global[order(as.numeric(spotify_2022_global$streams), decreasing = TRUE), ]
However, this is finding the highest song-weeks, not the highest songs, so in this case all 15 highest song-weeks were one song.
I'd suggest you group_by(name) and then summarize to get total streams by song, filter top 15, and then make name an ordered factor, e.g. with forcats::fct_reorder.
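The suggestion above (group by song, total the streams, keep the top 15, order the names) might look like the following sketch. The toy data frame is a hypothetical stand-in for spotify_2022_global, and total_streams is a column name introduced here for illustration.

```r
library(dplyr)
library(forcats)
library(ggplot2)

# Hypothetical stand-in for spotify_2022_global: one row per song-week,
# with streams stored as character, as in the question
toy <- data.frame(
  name = c("Song A", "Song A", "Song B", "Song C", "Song C"),
  streams = c("999975", "1000500", "750000", "400000", "450000"),
  stringsAsFactors = FALSE
)

top_songs <- toy %>%
  mutate(streams = as.numeric(streams)) %>%            # convert before summing
  group_by(name) %>%
  summarise(total_streams = sum(streams), .groups = "drop") %>%
  slice_max(total_streams, n = 15, with_ties = FALSE) %>%
  mutate(name = fct_reorder(name, total_streams))      # order bars by total

# Horizontal bar chart with one bar per song
p <- ggplot(top_songs, aes(x = name, y = total_streams)) +
  geom_col() +
  coord_flip()
```

Because name is now a factor ordered by total_streams, the bars are drawn in order of total streams instead of alphabetically.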

R: Creative visualization in RStudio

I am at the final stages of a project where I have been comparing the appraisal price vs. the sold price of different properties. The complete code for data collection and tidying is below.
At this stage I am looking at different ways to visualize my data. However, I am quite new to this, so my question is whether anyone has any "new" or special ways of visualizing data that they find useful and intuitive. I have given a couple of examples of what I am able to visualize now using ggplot.
Additionally: right now my visualizations plot all 1275 observations every time. I would also like to visualize the mean and median of the Percentage, Sold and Tax variables, which I am most interested in; for example, the mean value of the Percentage column for each year.
Appreciate any help!
Complete code:
#Step 1: Load needed library
library(tidyverse)
library(rvest)
library(jsonlite)
library(stringi)
library(dplyr)
library(data.table)
library(ggplot2)
#Step 2: Access the URL of where the data is located
url <- "https://www.forsvarsbygg.no/ListApi/ListContent/78635/SoldEstates/0/10/"
#Step 3: Direct JSON as format of data in URL
data <- jsonlite::fromJSON(url, flatten = TRUE)
#Step 4: Access all items in API
totalItems <- data$TotalNumberOfItems
#Step 5: Summarize all data from API
allData <- paste0('https://www.forsvarsbygg.no/ListApi/ListContent/78635/SoldEstates/0/', totalItems,'/') %>%
jsonlite::fromJSON(., flatten = TRUE) %>%
.[1] %>%
as.data.frame() %>%
rename_with(~str_replace(., "ListItems.", ""), everything())
#Step 6: removing columns not needed
allData <- allData[, -c(1,4,8,9,11,12,13,14,15)]
#Step 7: remove whitespace and change to numeric in columns SoldAmount and Tax
#https://stackoverflow.com/questions/71440696/r-warning-argument-is-not-an-atomic-vector-when-attempting-to-remove-whites/71440806#71440806
allData[c("Tax", "SoldAmount")] <- lapply(allData[c("Tax", "SoldAmount")], function(z) as.numeric(gsub(" ", "", z)))
#Step 8: Remove rows where any numeric value is NA
#https://stackoverflow.com/questions/4862178/remove-rows-with-all-or-some-nas-missing-values-in-data-frame
# if_all() replaces the deprecated use of across() inside filter()
alldata <- allData %>%
filter(if_all(where(is.numeric),
~ !is.na(.)))
#Step 9: Remove values below 10000 NOK on SoldAmount and Tax.
# Note: any_vars() keeps a row when at least one numeric column exceeds 10000
alldata <- alldata %>%
filter_all(any_vars(is.numeric(.) & . > 10000))
#Step 10: Calculate percentage change between tax and sold amount and create new column with percent change
#df %>% mutate(Percentage = number/sum(number))
alldata_Percent <- alldata %>% mutate(Percentage = (SoldAmount-Tax)/Tax)
Visualization
# Plot Percentage difference based on County
ggplot(data=alldata_Percent,mapping = aes(x = Percentage, y = County)) +
geom_point(size = 1.5)
#Plot County with both Date and Percentage difference
theme_set(new = ggthemes::theme_economist())
p <- ggplot(data = alldata_Percent,
mapping = aes(x = Date, y = Percentage, colour = County)) +
geom_line(na.rm = TRUE) +
geom_point(na.rm = TRUE)
p
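For the mean and median per year mentioned in the question, one possible starting point is to summarise before plotting. This is only a sketch: the toy data frame stands in for alldata_Percent, it assumes Date is (or has been converted to) a Date object, and Year, mean_pct and median_pct are names introduced here for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for alldata_Percent; the real Date column may need
# parsing (for example with as.Date) before the year can be extracted
toy <- data.frame(
  Date = as.Date(c("2019-03-01", "2019-09-15", "2020-02-10", "2020-11-30")),
  Percentage = c(0.10, 0.30, -0.05, 0.25)
)

# One row per year with the mean and median of Percentage
yearly <- toy %>%
  mutate(Year = as.integer(format(Date, "%Y"))) %>%
  group_by(Year) %>%
  summarise(mean_pct = mean(Percentage),
            median_pct = median(Percentage),
            .groups = "drop")

# Bar chart of the yearly mean; swap in median_pct for the median
p_yearly <- ggplot(yearly, aes(x = factor(Year), y = mean_pct)) +
  geom_col() +
  labs(x = "Year", y = "Mean percentage difference")
```

The same pattern should carry over to SoldAmount and Tax, or to grouping by County instead of year.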

How to Create Senate Roll Call Voting Map in R

I am looking to ultimately render a map in Leaflet that shows the roll call voting results for a specific vote from the Senate. This obviously involves coloring a state polygon based on the unique combination of the Senator's party affiliation and how they voted (2 senators per state). The problem I have is developing a workflow to color code a state (here I am using a simple sf dataframe of the US states) in this manner. The idea would be to "stripe" the state in two different colors based on each of the Senator's party affiliation and vote type.
Below is a workflow that has already been created for viewing roll call voting results by congressional districts (not what I want, I want to do this for Senate voting), but I figured this would be a starting point or baseline for hoping to create a similar map for a roll call vote from the Senate. This code can be found at https://www.r-bloggers.com/2020/09/mapping-congressional-roll-calls/. The only thing different is that I provided a function that I found on another website to directly read a congressional district shapefile from the website where they are housed courtesy of the UCLA Political Science Department:
# Workflow for mapping congressional district roll call voting results
library(Rvoteview)
library(tidyverse)
devtools::install_github("jaytimm/wnomadds")
library(wnomadds)
library(sf)
library(tigris)
# Function to download the congressional district shapefile for a Congress of your choice.
get_congress_map <- function(cong=113) {
tmp_file <- tempfile()
tmp_dir <- tempdir()
zp <- sprintf("http://cdmaps.polisci.ucla.edu/shp/districts%03i.zip",cong)
download.file(zp, tmp_file)
unzip(zipfile = tmp_file, exdir = tmp_dir)
fpath <- paste(tmp_dir, sprintf("districtShapes/districts%03i.shp",cong), sep = "/")
st_read(fpath)
}
# Get the shapefile for the 89th congress
cd89 <- get_congress_map(cong = 89)
options(tigris_use_cache = TRUE, tigris_class = "sf")
# List the FIPS for US territories (and Alaska and Hawaii) that we won't include in maps.
nonx <- c('78', '69', '66', '72', '60', '15', '02')
# Create a simple states dataframe
states <- tigris::states(cb = TRUE) %>%
data.frame() %>%
select(STATEFP, STUSPS) %>%
rename(state_abbrev = STUSPS)
# Join the congressional districts shapefile with the simple states dataframe we
# created above.
cd_sf <- cd89 %>%
mutate(STATEFP = substr(ID, 2, 3),
district_code = as.numeric(substr(ID, 11, 12))) %>%
left_join(states, by = "STATEFP") %>%
filter(!STATEFP %in% nonx) %>%
select(STATEFP, state_abbrev, district_code)
# Download rollcall data from the Voteview database. Here for the Voting
# Rights Act of 1965
vra <- Rvoteview::voteview_search('("VOTING RIGHTS ACT OF 1965") AND (congress:89)
AND (chamber:house)') %>%
filter( date == '1965-07-09') %>%
janitor::clean_names()
votes <- Rvoteview::voteview_download(vra$id)
names(votes) <- gsub('\\.', '_', names(votes))
# Restructure the roll call voting data stored in votes
big_votes <- votes$legis_long_dynamic %>%
left_join(votes$votes_long, by = c("id", "icpsr")) %>%
filter(!grepl('POTUS', cqlabel)) %>%
group_by(state_abbrev) %>%
mutate(n = length(district_code)) %>%
ungroup() %>%
mutate(avote = case_when(vote %in% c(1:3) ~ 'Yea',
vote %in% c(4:6) ~ 'Nay',
vote %in% c(7:9) ~ 'Not Voting'),
party_code = case_when(party_code == 100 ~ 'Dem',
party_code == 200 ~ 'Rep' ),
Party_Member_Vote = paste0(party_code, ': ', avote),
## fix at-large --
district_code = ifelse(district_code %in% c(98, 99), 0, district_code),
district_code = ifelse(n == 1 & district_code == 1, 0, district_code),
district_code = as.integer(district_code)) %>%
select(-n)
#Members who represent historical “at-large” districts are
##assigned 99, 98, or 1 in various circumstances. Per VoteView.
# Make the Party_Member_Vote variable a factor and change the order of its levels.
big_votes$Party_Member_Vote <- factor(big_votes$Party_Member_Vote)
big_votes$Party_Member_Vote <-
factor(big_votes$Party_Member_Vote,
levels(big_votes$Party_Member_Vote)[c(3,6,1,4,2,5)])
# Join the roll call voting data with the shapefile and plot.
cd_sf_w_rolls <- cd_sf %>%
left_join(big_votes, by = c("state_abbrev", "district_code"))
main1 <- cd_sf_w_rolls %>%
ggplot() +
geom_sf(aes(fill = Party_Member_Vote),
color = 'white',
size = .25) +
wnomadds::scale_fill_rollcall() +
theme_minimal() +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.title.y=element_blank(),
axis.text.y=element_blank(),
legend.position = 'none') # +
main1 + ggtitle(vra$short_description)
This is fine for mapping congressional districts on roll call votes by the house. I am trying to figure out a way to reproduce a similar map for senate roll call votes. So, I started with the same workflow and am not sure how to proceed further or if it is even possible:
# Now I want to make a similar map for the senators of each state, not
# the representatives.
# I want to include Hawaii and Alaska in my Senate maps, so remove those FIPS
# from the vector.
non_states <- c('78', '69', '66', '72', '60', '11')
# No congressional district shapefile is therefore needed. So, here I just make
# a simple sf dataframe for the US States. Set the coordinate reference system to
# 4326 (World Geodetic System 1984) because I want to render the map in Leaflet
# and that's the reference system Leaflet uses.
states_Senate <- tigris::states(cb = TRUE) %>%
st_as_sf(crs = 4326) %>%
select(STATEFP, STUSPS, geometry) %>%
filter(!STATEFP %in% non_states) %>%
rename(state_abbrev = STUSPS)
# Query a roll call vote in the Voteview database. Any vote will work, here
# a vote related to marketing of non-prescription drugs in the 116th congress in the
# Senate now, not the House.
vra2 <- Rvoteview::voteview_search('("A bill to amend the Federal Food, Drug, and Cosmetic Act")
AND (congress:116) AND (chamber:senate)') %>%
janitor::clean_names()
votes2 <- Rvoteview::voteview_download(vra2$id)
names(votes2) <- gsub('\\.', '_', names(votes2))
# Restructure the roll call voting data stored in votes2
big_votes2 <- votes2$legis_long_dynamic %>%
left_join(votes2$votes_long, by = c("id", "icpsr")) %>%
filter(!grepl('POTUS', cqlabel)) %>%
mutate(avote = case_when(vote %in% c(1:3) ~ 'Yea',
vote %in% c(4:6) ~ 'Nay',
vote %in% c(7:9) ~ 'Not Voting'),
party_code = case_when(party_code == 100 ~ 'Dem',
party_code == 200 ~ 'Rep' ),
Party_Member_Vote = paste0(party_code, ': ', avote))
# Now I have a dataframe, big_votes2 that has 2 rows for each state. I need to figure
# out how to color the polygons for each state based on the unique combination of
# party affiliation and vote cast.
# Make Party_Member_Vote a factor like the congressional district workflow above,
# join big_votes2 with states_Senate sf dataframe, and plot........finishing this
# workflow and making a Senate map is essentially my question.
My hope is to make a final map that looks similar to the following (found at https://voteview.com/rollcall/RS1160389), which is the resulting roll call vote for the example query I provide in the script of my workflow for creating a Senate map directly above (the roll call vote about non-prescription drugs). This is probably done in JavaScript, maybe D3, but I am working on an R Shiny app looking at roll call voting, so I am strictly looking to do this in R.
Here the state polygons are "striped" by each senator's party and how they voted. If the senators in a state are both from one party and vote in unison, the state is a solid color reflecting this. The color palette is based on the voteview_pal provided in the wnomadds package. The colors in this palette don't include senators who consider themselves independent, but I can update the palette if there is a solution to creating the striping pattern within the state polygons. In my use of R I can't think of a way to accomplish this, since color fills are based on the unique levels of a factor variable and here we have two rows per state, as the dataframe is being created in this workflow. Additionally, I've never seen a pattern or stripe fill in ggplot that could accomplish this, even if the dataframe were arranged so that there was only 1 row/observation per state. If this is even possible, I would want to render this in Leaflet, but if the basic concepts can be accomplished by plotting the sf object in ggplot, I would gladly start there. Any help would be appreciated.
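As a partial step toward the one-row-per-state arrangement mentioned above, the two senator rows could be collapsed into a single combined label per state. This sketch does not produce stripes; it only gives each state one factor level to fill by. The toy data frame is a hypothetical stand-in for big_votes2, and State_Vote is a name introduced here.

```r
library(dplyr)

# Hypothetical stand-in for big_votes2: two rows (senators) per state
toy_votes <- data.frame(
  state_abbrev = c("AL", "AL", "AK", "AK"),
  Party_Member_Vote = c("Rep: Yea", "Dem: Nay", "Rep: Yea", "Rep: Yea"),
  stringsAsFactors = FALSE
)

state_votes <- toy_votes %>%
  group_by(state_abbrev) %>%
  summarise(
    # unique() collapses a unanimous delegation to one label;
    # sort() makes the combined label independent of row order
    State_Vote = paste(sort(unique(Party_Member_Vote)), collapse = " / "),
    .groups = "drop"
  ) %>%
  mutate(State_Vote = factor(State_Vote))

print(state_votes)
```

The result has one row per state and could be joined to states_Senate by state_abbrev and mapped to a fill colour. A split delegation then gets its own solid colour rather than a striped fill, so this is a fallback and not the Voteview look.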

R highcharter - Two barchart in same plot with different X-axis

I am trying to do the following:
I have two datasets about my company. The first one has, say, the top 20 growing sellers. The second one has the bottom 20 losing sellers. So, it's something like this:
growing_seller <- c("a","b","c","d","e","f","g","h","i","h")
sales_yoy_growing <- c(100000,90000,75000,50000,37500,21000,15000,12000,10000,8000)
top_growing <- data.frame(growing_seller,sales_yoy_growing)
losing_seller <- c("i","j","k","l","m","n","o","p","q","r")
sales_yoy_losing <- c(-90000,-75000,-50000,-37500,-21000,-15000,-12000,-10000,-8000,-5000)
bottom_losing <- data.frame(losing_seller,sales_yoy_losing)
I am trying to plot both charts in the same plot using DIFFERENT categories, corresponding to the sellers' name. So what I have so far is this:
library(highcharter)
growing_seller <- c("a","b","c","d","e","f","g","h","i","h")
sales_yoy_growing <- c(100000,90000,75000,50000,37500,21000,15000,12000,10000,8000)
top_growing <- data.frame(growing_seller,sales_yoy_growing)
losing_seller <- c("i","j","k","l","m","n","o","p","q","r")
sales_yoy_losing <- c(-90000,-75000,-50000,-37500,-21000,-15000,-12000,-10000,-8000,-5000)
bottom_losing <- data.frame(losing_seller,sales_yoy_losing)
highchart() %>%
hc_add_series(
data = top_growing$sales_yoy_growing,
type = "column",
grouping = FALSE
) %>%
hc_add_series(
data = bottom_losing$sales_yoy_losing,
type = "column"
)
This is what I want to achieve graphically: Chart example
Now, I would like to have a different category array for each independent x-axis: something like the possibility of having "two hc_xAxis" controls, where I could specify for each plotted series its own categories.
My final aim is then to see the seller's name when I hover over each of the columns.
Hope I was clear enough :)
Thanks
Highcharts displays the point's name in the tooltip by default. You just need to supply a name value for each point in your data.
You can do it this way:
top_growing <- data.frame(name = growing_seller, y = sales_yoy_growing)
This is the whole code:
library(highcharter)
growing_seller <- c("a","b","c","d","e","f","g","h","i","h")
sales_yoy_growing <- c(100000,90000,75000,50000,37500,21000,15000,12000,10000,8000)
top_growing <- data.frame(name = growing_seller, y = sales_yoy_growing)
losing_seller <- c("i","j","k","l","m","n","o","p","q","r")
sales_yoy_losing <- c(-90000,-75000,-50000,-37500,-21000,-15000,-12000,-10000,-8000,-5000)
bottom_losing <- data.frame(name = losing_seller, y = sales_yoy_losing)
highchart() %>%
hc_add_series(
data = top_growing,
type = "column",
grouping = FALSE
) %>%
hc_add_series(
data = bottom_losing,
type = "column"
)

Add layer underneath interactive layer with ggvis

I want to use the tooltip function within ggvis to create hover text for specific points along a curve. I can get the plot to form, but the text in the hover field won't show up. This occurs when I try to add a background layer that should not be considered part of the interactive part of the visualization. Below is some code illustrating:
library(ggvis)
# one-compartment oral concentration curve
comp1.oral <- function(ka,ke,v,f,dose,time){
(ka * dose * f)/ (v*(ka-ke)) * (exp(-ke * time) - exp(-ka*time))
}
time <- 0:200 # time points to create curve
tp <- 6 # number of times to sample
tmax <- max(time)
#generically choosing tp points to sample at
tnew <- exp(seq(0,log(tmax),length=(tp)))
#computing the concentration (y value)
y <- comp1.oral(.1,.03,4,1,100,tnew)
#creating dataframe with values
# PK and ECG should be in the hover text
d1 <- data.frame(
Conc= y,
Time=tnew,
PK = 1:tp,
ECG= "No"
)
# creating a column with the text to appear in the hover box
d1$long <- paste0("PK: ",d1$PK,"<br>","ECG: ",d1$ECG,"<br>")
#creating another data frame to input the time-conc curve as a background layer
d2 <- data.frame(
x=time,
y=comp1.oral(.1,.03,4,1,100,time)
)
The code below will form the plot I want but without the hover text.
d1 %>%
ggvis(x = ~Time, y=~Conc) %>%
layer_points(size.hover:=200) %>%
layer_paths(~x,~y,data=d2) %>%
add_tooltip(function(d1){
if (!is.null(d1$Time)) paste0("PK:", "<br>ECG:", "<br>Time: ", as.character(round(d1$Time)), " minutes post-dose")
}, "hover")
I would like to get the other values from d1$long into the hover text box. I tried adding it in a way similar to the rotten tomatoes shiny example, but it wouldn't work.
I tried the following, but it can't seem to find the additional text in the variable d1$long:
d1 %>%
ggvis(x = ~Time, y=~Conc, key := ~long) %>%
layer_points(size.hover:=200) %>%
layer_paths(~x,~y,data=d2) %>%
add_tooltip(function(d1){
if (!is.null(d1$Time)) paste0(as.character(d1$long),"Time: ", as.character(round(d1$Time)), " minutes post-dose")
}, "hover")
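The problem is that the key should be set in layer_points rather than in ggvis(). A likely reason is that properties set in ggvis() are inherited by every layer, including layer_paths, whose data (d2) has no long column.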
d1 %>%
ggvis(x = ~Time, y=~Conc) %>%
layer_points(size.hover:=200, key := ~long) %>%
layer_paths(~x,~y,data=d2) %>%
add_tooltip(function(d1){
if (!is.null(d1$Time)) paste0(as.character(d1$long),"Time: ", as.character(round(d1$Time)), " minutes post-dose")
}, "hover")
