textplot_wordcloud group label highlight color - r

I'm trying to replicate some quanteda applications from this post. When I replicated their textplot_wordcloud() example on presidential speeches, the group labels in my output did not have a highlight color like the greyish background in the example:
Since textplot_wordcloud() is derived from comparison.cloud(), I went back to the latter's documentation to see whether it has any arguments for setting label highlight colors, but couldn't find any. Is it possible to highlight group labels in textplot_wordcloud() with colors?
The replication code is attached below.
library(quanteda)
data(data_corpus_inaugural)
compDfm <- dfm(corpus_subset(data_corpus_inaugural, President %in% c("Washington", "Jefferson", "Madison")),
               groups = "President", remove = stopwords("english"), removePunct = TRUE)

You are looking at an old example. For current plotting examples, look at the quanteda website.
The function textplot_wordcloud() has been rewritten and now uses only internal quanteda calls, so the reference to wordcloud::comparison.cloud is no longer valid. As a result, you can no longer set the background color of the labels. You can still adjust the color and the size of the labels if you want to:
library(quanteda)
# Package version: 2.0.0
# See https://quanteda.io for tutorials and examples.
corpus_subset(data_corpus_inaugural,
              President %in% c("Washington", "Jefferson", "Madison")) %>%
  dfm(groups = "President", remove = stopwords("english"), remove_punct = TRUE) %>%
  dfm_trim(min_termfreq = 5, verbose = FALSE) %>%
  textplot_wordcloud(comparison = TRUE,
                     labelcolor = "green",
                     labelsize = 2)

Related

Is there a way to have one series name for multiple series without duplicating the series name in the tooltip?

I have the following code chunk:
series_name = "Census"
# Default plot settings
hc = highchart(type="stock") %>%
  hc_add_series(data = census.by.day, hcaes(x = time_ts, y = census, group = area), type = "line", name = series_name) %>%
  hc_legend(enabled=TRUE) %>%
  hc_tooltip(crosshairs = TRUE, backgroundColor = "#FCFFC5",shared = TRUE, borderWidth = 1, split = FALSE,
             pointFormat = '<b>{series.name}</b><br>{point.area}: {point.y:.0f} <br>') %>%
  hc_xAxis(title=list(text="Date Hour")) %>%
  hc_chart(backgroundColor="white")
In this example, the user can select multiple "areas" which in turn will update the hcaes(group = var) to create more than one series for each area. I'm trying to figure out how to keep a single series name for when the user selects more than one area. Right now, the series name is being duplicated because I have it in the tooltip. So the tooltip is repeated for each potential series.
In the example above, I only want to show "Census" once.
I think you could achieve this by using the tooltip.formatter function from Highcharts JS API: https://api.highcharts.com/highstock/tooltip.formatter
Here you can find an article that may be helpful to learn how to work with Highcharts JS API in R: https://www.highcharts.com/blog/tutorials/working-with-highcharts-javascript-syntax-in-r/?fbclid=IwAR3b9X-GsVfGT_QVWFALi0KOJ83XbWoKTK1HQA4459U4NNg0UTEDXG-MGss
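As a rough sketch of the formatter approach, assuming the highcharter package is installed: the data frame census_by_day below is a made-up stand-in for the asker's census.by.day, and the formatter body is only one possible way to print the series name once followed by one line per area. It relies on the tooltip being shared, so that this.points holds one entry per series.

```r
library(highcharter)

# Hypothetical stand-in for census.by.day: two areas observed over three days
census_by_day <- data.frame(
  time_ts = rep(as.Date("2020-01-01") + 0:2, times = 2),
  census  = c(10, 12, 11, 20, 19, 22),
  area    = rep(c("North", "South"), each = 3)
)

series_name <- "Census"

# JavaScript formatter: print the shared series name once, then one line per area.
# `this.points` is available because the tooltip is shared.
tooltip_js <- JS(sprintf(
  "function () {
     var lines = ['<b>%s</b>'];
     this.points.forEach(function (p) {
       lines.push(p.point.area + ': ' + Highcharts.numberFormat(p.y, 0));
     });
     return lines.join('<br>');
   }", series_name))

hc <- highchart(type = "stock") %>%
  hc_add_series(data = census_by_day,
                hcaes(x = time_ts, y = census, group = area),
                type = "line", name = series_name) %>%
  hc_legend(enabled = TRUE) %>%
  hc_tooltip(shared = TRUE, split = FALSE, formatter = tooltip_js)
```

If the header never changes, a lighter option may be to keep pointFormat for the per-area lines and move the bold name into the tooltip's headerFormat option instead of writing a formatter.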

Utilizing background colors in R Shiny datatable to color cells

Hi I'm working with an R Shiny dashboard of mine using NBA data to showcase the top 15 highest scorers from the previous day's games. I have a simple table setup showing the points scored and the shooting efficiency, but it's pretty bland and I think it would be useful to use background color on certain cells to do things like highlight standout performances like someone getting their season high PTS, or a player having a high efficiency night relative to their average for the season.
Below I've pasted screenshots of what the table looks like & the final columns I want it to have, but I also posted a screenshot of the additional variables I've already calculated in the data frame that I want to color by. I don't want to include those extra variables in the actual table, but I want to use them to background color the cells, if that makes sense. The data frame is called top_15_yesterday and I posted the R Shiny output code as well.
output$top_15 <- DT::renderDataTable(
  datatable(top_15_yesterday, rownames = FALSE,
            options = list(searching = FALSE, pageLength = 15,
                           lengthChange = FALSE, info = FALSE, paging = FALSE)) %>%
    formatCurrency(6, currency = "$", interval = 3, mark = ",", digits = 0) %>%
    formatPercentage(4, digits = 1))
For example, if a player's ppg_difference is greater than 20 then their PTS cell should be colored as green, any ts_difference above .20 should be colored as green, and if a player got his season high (PTS = season_high) then their PTS cell should be colored as purple. And on the other hand, if a player's ts_difference is say .10 below their average, color their TS% cell as red.
That's just a general idea of the kind of coloring scheme I want to use, but I have no idea how to implement that into datatable. The examples I've seen don't have the setup I have where the variables I want to color by are already in the data frame. If anyone was able to follow along and have any advice on how to implement these background colors I'd appreciate it !
In your options, make the columns you'd like to reference invisible. Column indices are zero-based in JavaScript, so the first column is 0.
options = list(searching = FALSE,
               pageLength = 15,
               lengthChange = FALSE,
               info = FALSE,
               paging = FALSE,
               columnDefs = list(list(targets = c(0, 7, 8, 9, 10, 11),
                                      visible = FALSE)))
Then close the datatable() call and pipe the result into formatStyle():
formatStyle(columns = 'theColumn',
            valueColumns = 'theReferenceColumn',
            backgroundColor = styleEqual(levels = c(...),   # a vector with the levels
                                         values = c(...)))  # a vector with the background colors
For numeric thresholds, use styleInterval(cuts, values) in place of styleEqual().
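To make the steps above concrete, here is a minimal sketch assuming the DT package is installed. The data frame top_players and its values are invented for illustration (they are not the asker's top_15_yesterday), and the thresholds are the ones mentioned in the question.

```r
library(DT)

# Hypothetical stand-in for top_15_yesterday with two helper columns
top_players <- data.frame(
  Player         = c("A", "B", "C"),
  PTS            = c(41, 28, 15),
  TS             = c(0.71, 0.48, 0.55),
  ppg_difference = c(22, 3, -4),        # PTS minus season average
  ts_difference  = c(0.21, -0.12, 0.01)
)

dt <- datatable(
  top_players,
  rownames = FALSE,
  options = list(
    searching = FALSE, paging = FALSE, info = FALSE,
    # Hide the helper columns; with rownames = FALSE, Player is column 0
    columnDefs = list(list(targets = c(3, 4), visible = FALSE))
  )
) %>%
  # PTS cell turns green when ppg_difference is above 20
  formatStyle("PTS", valueColumns = "ppg_difference",
              backgroundColor = styleInterval(20, c("white", "green"))) %>%
  # TS cell: red at or below -0.10, green above 0.20, white in between
  formatStyle("TS", valueColumns = "ts_difference",
              backgroundColor = styleInterval(c(-0.10, 0.20),
                                              c("red", "white", "green")))
```

styleInterval() needs one more value than it has cuts. A rule such as "PTS equals season_high" is a comparison between two columns, so one option is to precompute a flag column in the data frame, hide it the same way, and style PTS from it with styleEqual().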

Custom static coloring for Shiny Leaflet polygons

Is there a way to build a custom qualitative color palette that maps category values to color values?
I am trying to build a basic leaflet map in Shiny that colors property parcels (polygons) by their land use (factor). Usually this is simple but I need specific colors for specific categories.
For example, parcels with a land use of 'Commercial' need to be the color '#FF4C4C'. There are about 10 land use categories.
I have tried splitting the data into different layers:
leaflet() %>%
addPolygons(data=parcels[parcels$category == 'Commercial',], fillColor = '#FF4C4C') %>%
addPolygons(data=parcels[parcels$category == 'Residential',], fillColor = '#E9E946')
And so forth but slicing the large SpatialPolygonsDataFrame ten times is slow and consumes a lot of resources. An added problem is that these categories have sub-categories that will need to be shown later, sometimes up to 20 sub-categories, and slicing the spdf 10+20 times won't do.
All of the documentation and stackoverflow questions I have found focus on defining the ranges between two or more colors, but I don't want ranges. I want an exact mapping between factor levels and specific color codes.
I hope there is a simple answer to this. I was hoping I could do something like:
lu_pal <- c('Residential' = '#E9E946', 'Commercial' = '#FF4C4C')
and find a magical function to turn that list into my palette.
parcels$category <- as.factor(parcels$category)
factpal <- colorFactor(c("#FF4C4C", "#E9E946"), parcels$category)
leaflet(parcels) %>%
  addPolygons(stroke = FALSE, smoothFactor = 0.2, fillOpacity = 1,
              color = ~factpal(category))
You could do something like this:
Data creation:
library(leaflet)
library(sp)
library(sf)
Sr1 = Polygon(cbind(c(2,4,4,1,2),c(2,3,5,4,2)))
Sr2 = Polygon(cbind(c(5,4,2,5),c(2,3,2,2)))
Sr3 = Polygon(cbind(c(4,4,5,10,4),c(5,3,2,5,5)))
Sr4 = Polygon(cbind(c(5,6,6,5,5),c(4,4,3,3,4)), hole = TRUE)
Srs1 = Polygons(list(Sr1), "s1")
Srs2 = Polygons(list(Sr2), "s2")
Srs3 = Polygons(list(Sr3, Sr4), "s3/4")
SpP = SpatialPolygons(list(Srs1,Srs2,Srs3), 1:3)
SpF <- st_as_sf(SpP)
SpF$category <- c("Commercial", "Residential", "Residential")
Note: I am switching the SpatialPolygonsDataFrame to a SimpleFeature from the sf package, which is easier and faster to handle/manipulate.
So, you define a matching data.frame with the color for each category. Then you use the merge function and specify which columns to merge on. In this example the sf object SpF has the column category and the matching data frame has the column cat. After merging, the new sf object NewSp also has the column col, which holds the colors, and you just pass those colors to leaflet.
matching = data.frame(
cat = c("Commercial", "Residential"),
col = c("#FF4C4C", "#E9E946")
)
NewSp <- base::merge(SpF, matching, by.x ="category", by.y="cat")
leaflet() %>%
addTiles() %>%
addPolygons(data=NewSp, color=NewSp$col, opacity = 1, fillOpacity = 0.6)
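As a possible alternative to the merge, the named vector from the question can be used directly as a lookup table. The sketch below assumes the leaflet and sf packages are installed; the two square parcels and the helper square() are invented for illustration, and only the color codes come from the question.

```r
library(leaflet)
library(sf)

# Named vector: land-use category -> exact hex color (values from the question)
lu_pal <- c(Residential = "#E9E946", Commercial = "#FF4C4C")

# Hypothetical helper: a unit square with its lower-left corner at (x0, y0)
square <- function(x0, y0) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + 1, y0), c(x0 + 1, y0 + 1),
                        c(x0, y0 + 1), c(x0, y0))))
}

# Two toy parcels, purely illustrative
parcels <- st_sf(
  category = c("Commercial", "Residential"),
  geometry = st_sfc(square(0, 0), square(2, 0), crs = 4326)
)

# Look up each parcel's color by name; as.character() avoids indexing by factor codes
parcel_colors <- unname(lu_pal[as.character(parcels$category)])

m <- leaflet(parcels) %>%
  addPolygons(stroke = FALSE, fillOpacity = 1, fillColor = parcel_colors) %>%
  addLegend(colors = unname(lu_pal), labels = names(lu_pal), title = "Land use")
```

Because the lookup is plain vector indexing, no slicing of the spatial object is needed, and sub-categories can be handled by extending the named vector. leaflet's colorFactor() also accepts a levels argument, which may be worth trying if a palette function is preferred.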

Remove unused GEOID in geo_join

I am attempting to plot profitability on top of counties in Minnesota, Iowa, and Nebraska. Using leaflet and tigris, I have been able to plot ALL counties, whether or not I have data for it. This leaves me with a few counties with colors and the rest labeled as NA. Is there a way for me to remove all NA's from my geo_join data so that it just isn't used ala unused Wisconsin areas? I have tried using fortify, but I can't figure out how to determine what county boundaries I'm looking at when I merge the TIGER boundary lines with my County FIPS file in order to remove them.
Here is what my leaflet currently looks like:
My code to get the map is this:
library(tigris)
library(leaflet)
pal <- colorNumeric(c("yellow","dark red"),county$Construction.Cost,na.color="white")
IA_counties <- counties(state="IA", cb=TRUE, resolution ="20m")
MN_counties <- counties(state="MN",cb=TRUE,resolution="20m")
NE_counties <- counties(state="NE",cb=TRUE,resolution="20m")
IA_merged <- geo_join(IA_counties,county,"GEOID", "GEOID")
MN_merged <- geo_join(MN_counties,county,"GEOID","GEOID")
NE_merged <- geo_join(NE_counties,county,"GEOID","GEOID")
popupIA <- paste0("County Projects: ", as.character(paste('$',formatC(format(round(IA_merged$Construction.Cost, 0), big.mark=',', format = 'f')))))
popupMN <- paste0("County Projects: ", as.character(paste('$',formatC(format(round(MN_merged$Construction.Cost, 0), big.mark=',', format = 'f')))))
popupNE <- paste0("County Projects: ", as.character(paste('$',formatC(format(round(NE_merged$Construction.Cost, 0), big.mark=',', format = 'f')))))
leaflet() %>%
  addProviderTiles("MapQuestOpen.OSM") %>%
  addLegend(pal = pal,
            values = IA_merged$Construction.Cost,
            position = "bottomright",
            title = "County Projects",
            labFormat=labelFormat(prefix="$")) %>%
  addCircles(lng=yup2$lon, lat=yup2$lat,weight=.75,fillOpacity=0.01,color="red",
             radius = 96560) %>%
  addCircles(lng=yup2$lon, lat=yup2$lat,weight=.75,fillOpacity=0.01,color="blue",
             radius = 193121) %>%
  addPolygons(data = IA_counties,
              fillColor = ~pal(IA_merged$Construction.Cost),
              layerId=1,
              fillOpacity = .25,
              weight = 0.05,
              popup = popupIA)%>%
  addPolygons(data=MN_counties,
              fillColor=~pal(MN_merged$Construction.Cost),
              fillOpacity=0.25,
              weight=0.05,
              popup = popupMN) %>%
  addPolygons(data=NE_counties,
              fillColor=~pal(NE_merged$Construction.Cost),
              fillOpacity=0.25,
              weight=0.05,
              popup = popupNE)
I apologize for not including reproducible data, but if needed, please ask. I'm hoping that this is more of a simple na.color= formula solution. The map looks "okay" as of now, but I'd like it if it's possible to not have to make the fillOpacity so light so the NA counties don't stand out.
Thanks for any and all help and please, let me know if you have any questions!
I'm the creator of the tigris package. Thanks so much for using it! In the development version of tigris on GitHub (https://github.com/walkerke/tigris), I've added an option to geo_join to accommodate inner joins, which would remove the unmatched data entirely from the resultant spatial data frame (if this is what you are looking for). You can also supply a common merge column name as a named argument to the new by parameter if you want. For example:
IA_merged <- geo_join(IA_counties, county, by = "GEOID", how = "inner")
should work. I'm still testing but I'll probably submit this update to CRAN in January.
So, embarrassingly, the answer to this question was as simple as I had hoped. I tweaked the following na.color code and it worked exactly as I wanted.
pal <- colorNumeric(c("yellow","dark red"),county$Construction.Cost,na.color="transparent")

How to set different colors for different points in rPlot?

I want to set a different color for every point in rPlot. I expected it to be rPlot(V2~V1, data=data, type="point", color=color), where color is a vector like c("#6B7BFDFF", "#A7E7BFFF", "#13A0BBFF", ...), but this doesn't work. What is the correct syntax? Thank you.
By the way can I find the documentation of rCharts somewhere? I only see examples on the project's website.
Documentation is still a work in progress but closer than ever. If you choose to go with polycharts (make sure you are aware of paid commercial licensing), then these examples might be helpful. Here is another StackOverflow question on the same topic. I made a quick example below.
library(rCharts)
data(iris)
colnames(iris) <- sapply(colnames(iris), FUN = gsub, pattern = "\\.", replacement = "")
p5 <- rPlot(SepalWidth ~ SepalLength, data = iris, color = "Species", type = "point", height = 400)
# again match polychartjs example exactly to show how we can change axis and legend titles
p5$guides(color = list(scale = "#! function(value){
color_mapping = {versicolor: '#ff2385',setosa:'#229922',virginica:'#2B24D6'}
return color_mapping[value];
} !#"), y = list(title = "sepalWidth"), x = list(title = "sepalLength"))
p5$set(title = "Iris Flowers")
p5
If you choose to use another library, specifying the color will be different, so let me know and I'll be glad to help out.
