Javascript error using DT package in RStudio - r

This is my first post, and I'm probably doing something silly to create this error ... I am running R 4.2.1 in RStudio, version 2022.07.1, built 554 ("Spotted Wakerobin"). Using a built-in dataset, here is a reproducible example:
table(esoph$agegp, esoph$alcgp, dnn = c("age", "alc")) |>
  DT::datatable(
    options = list(
      scrollY = FALSE)
  )
I am getting a Javascript Alert. In case the image doesn't appear, it says,
DataTables warning: table id=DataTables_Table_0 - Requested unknown parameter '4' for row 0, column 4. For more information about this error, please see http://datatables.net/tn/4
I read that page, and I wonder if it means there is a combination of values that does not exist in the dataset. Appreciate any help. Thank you.

I asked a work colleague about this problem, and she came up with this solution, which converts the table to a data frame and then pivots it wider:
table(agegp = esoph$agegp, alcgp = esoph$alcgp) |>
  as.data.frame() |>
  tidyr::pivot_wider(names_from = alcgp, values_from = Freq) |>
  DT::datatable(options = list(scrollY = FALSE))
Thank you to all who replied.
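For anyone wondering why the pivot helps: as.data.frame() on a table returns the long layout (one row per age/alcohol combination), whereas the cross-tab printed in the console is the wide layout. Below is a minimal base R sketch of the two shapes. as.data.frame.matrix() is a base R alternative to the tidyr step; whether this shape mismatch is exactly what triggers the DataTables warning is an assumption, not something confirmed in this thread.

```r
# Cross-tabulate the built-in esoph data, as in the question
tab <- table(esoph$agegp, esoph$alcgp, dnn = c("age", "alc"))

# Long layout: one row per (age, alc) combination plus a Freq column
long <- as.data.frame(tab)

# Wide layout: one row per age group, one column per alcohol group
wide <- as.data.frame.matrix(tab)

dim(long)
dim(wide)
```

The wide data frame is an ordinary data frame, so it should be usable with DT::datatable() in the same way as the pivoted one.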

Related

webscrape unicef data with rvest

I want to automate downloading some UNICEF data from https://data.unicef.org/indicator-profile/ using rvest or a similar package. I have noticed that there are indicator codes, but I am having trouble identifying the correct codes and actually downloading the data.
Upon inspecting element, there is a data-inner-wrapper class that seems like it might be useful. You can access a download link by going to a page associated with an indicator and specifying a time period. For example, CME_TMY5T9 is the code for Deaths aged 5 to 9.
The data is available by going to
https://data.unicef.org/resources/data_explorer/unicef_f/?ag=UNICEF&df=GLOBAL_DATAFLOW&ver=1.0&dq=.CME_TMY5T9..&startPeriod=2017&endPeriod=2022 and then clicking a download link.
If anyone could help me figure out how to get all the data, that would be fantastic. Thanks
library(rvest)
library(dplyr)
library(tidyverse)
page = "https://data.unicef.org/indicator-profile/"
df = read_html(page) %>%
  #html_nodes("div.data-inner-wrapper")
  html_nodes(xpath = "//div[@class='data-inner-wrapper']")
EDIT: Alternatively, downloading all data for each country would be possible. I think that would just require getting the download link or getting at the data within the table (since country codes aren't much of an issue).
This shows all the data for Afghanistan. I just need to figure out a programmatic way of actually downloading the data....
https://data.unicef.org/resources/data_explorer/unicef_f/?ag=UNICEF&df=GLOBAL_DATAFLOW&ver=1.0&dq=AFG..&startPeriod=1970&endPeriod=2022
You are on the right track! The page https://data.unicef.org/indicator-profile/ does not directly contain the indicator codes, because these are loaded dynamically after the initial page load. You can use the network tab of your web browser's developer tools to look at the requests your browser makes to fully load a page. The one you are looking for, with all the indicator codes, is here: https://uni-drp-rdm-api.azurewebsites.net/api/indicators
library(httr)
library(jsonlite)
library(glue)
library(dplyr)  # provides the %>% pipe
library(readr)  # provides read_csv(), used when downloading the data
## this gets the indicator codes
indicators <- GET("https://uni-drp-rdm-api.azurewebsites.net/api/indicators") %>%
  content(as = "text") %>%
  jsonlite::fromJSON()
## try looking at it in your browser
browseURL("https://uni-drp-rdm-api.azurewebsites.net/api/indicators")
You also correctly identified the URL that lets you download individual datasets in the data browser. You just need to find the request that appears when you actually download an Excel file, and then substitute in the different helix codes from the indicators one at a time. I have not tried applying this to all indicators; for some, the URL might differ and you might get incomplete data or errors. But this should get you started.
GET(glue("https://sdmx.data.unicef.org/ws/public/sdmxapi/rest/data/UNICEF,GLOBAL_DATAFLOW,1.0/.{indicators$helixCode[3]}..?startPeriod=2017&endPeriod=2022&format=csv&labels=name")) %>%
  content(as = "text") %>%
  read_csv()
This might be a good place to start learning how to mimic the requests that your browser executes: https://cran.r-project.org/web/packages/httr/vignettes/quickstart.html
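To make the URL pattern above reusable, a small helper can assemble the request URL from an indicator code and a time period. This is only a sketch in base R: the function name unicef_csv_url and its arguments are hypothetical, and the URL layout is simply copied from the request shown above, so it may not hold for every indicator.

```r
# Hypothetical helper: build the SDMX CSV URL for one indicator code.
# The layout is copied from the request above; it is an assumption
# that the same layout works for every indicator.
unicef_csv_url <- function(code, start = 2017, end = 2022) {
  base <- "https://sdmx.data.unicef.org/ws/public/sdmxapi/rest/data/UNICEF,GLOBAL_DATAFLOW,1.0"
  sprintf("%s/.%s..?startPeriod=%d&endPeriod=%d&format=csv&labels=name",
          base, code, as.integer(start), as.integer(end))
}

# CME_TMY5T9 is the code mentioned in the question (deaths aged 5 to 9)
url <- unicef_csv_url("CME_TMY5T9")
print(url)
```

The resulting string can be passed to httr::GET() in place of the glue() call.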
Here is what I did, based on the very helpful code from @Datapumpernickel:
library(dplyr)
library(httr)
library(jsonlite)
library(glue)
library(tidyverse)
library(tictoc)
## this gets the indicator codes
indicators <- GET("https://uni-drp-rdm-api.azurewebsites.net/api/indicators") %>%
  content(as = "text") %>%
  jsonlite::fromJSON()
## try looking at it in your browser
#browseURL("https://uni-drp-rdm-api.azurewebsites.net/api/indicators")
## loop over the unique codes so the loop index and the code always match
codes <- unique(indicators$helixCode)
tic()
FULL_DF = NULL
for(i in seq_along(codes)){
  # Wrap each request in tryCatch so the loop keeps going when it encounters errors
  tryCatch({
    print(paste0("Processing : ", i, " of ", length(codes), " ", codes[i]))
    TMP = GET(glue("https://sdmx.data.unicef.org/ws/public/sdmxapi/rest/data/UNICEF,GLOBAL_DATAFLOW,1.0/.{codes[i]}..?startPeriod=2017&endPeriod=2022&format=csv&labels=name")) %>%
      content(as = "text") %>%
      read_csv(col_types = cols())
    # Basic formatting for variables I want
    TMP = TMP %>%
      select(`Geographic area`, Indicator, Sex, TIME_PERIOD, OBS_VALUE) %>%
      mutate(description = codes[i]) %>%
      rename(country = `Geographic area`,
             variablename = Indicator,
             disaggregation = Sex,
             year = TIME_PERIOD,
             value = OBS_VALUE)
    # rbind each indicator to the full dataframe
    FULL_DF = FULL_DF %>% rbind(TMP)
  },
  error = function(cond){
    cat("\n WARNING COULD NOT PROCESS : ", i, " of ", length(codes), " ", codes[i])
    message(cond)
    return(NA)
  }
  )
}
toc()
# Save the data
rio::export(FULL_DF, "unicef-data.csv")
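Growing a data frame with rbind() inside a loop gets slower as it grows. A common alternative is to collect each result in a list and bind once at the end. The sketch below shows only that pattern: fetch_indicator is a hypothetical stand-in for the real download step (it makes no network request), and the codes are made up.

```r
# Hypothetical stand-in for the download step: returns a small data frame,
# or throws an error for one code to mimic a failed request.
fetch_indicator <- function(code) {
  if (code == "BAD") stop("download failed")
  data.frame(description = code, year = 2017:2018, value = c(1, 2))
}

codes <- c("A", "BAD", "B")  # made-up codes for illustration

# One list element per code; failed requests become NULL instead of stopping
results <- lapply(codes, function(code) {
  tryCatch(fetch_indicator(code), error = function(cond) NULL)
})

# Drop the failures, then bind everything in a single step
results <- Filter(Negate(is.null), results)
full_df <- do.call(rbind, results)
print(full_df)
```

In the real loop, the body of fetch_indicator would be the GET() / content() / read_csv() chain from above.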

How to exclude cumulative and total proportions from tables generated with `freq` function in R?

I'm using the freq function from the summarytools package to create frequency tables in RStudio.
It doesn't seem possible to turn off the cumulative and total percentage columns in the tables. For example:
library(summarytools)
data(mtcars)
view(freq(mtcars$cyl, totals=FALSE, cumul=FALSE))
still produces a table containing duplicate cumulative and total percentage columns. All I need is a table with the variable values, count #, and percentage.
I've tried resetting the global options with st_options(freq.cumul = FALSE, freq.totals = FALSE) but receive an error message:
Error in st_options(freq.cumul = FALSE, freq.totals = FALSE) :
unused arguments (freq.cumul = FALSE, freq.totals = FALSE)
UPDATE
Finally figured it out - I wasn't using enough arguments in the freq function. The following code produces a decent frequency table:
cyl_freq <- freq(mtcars$cyl, report.nas = FALSE, totals=FALSE, cumul=FALSE, style = "rmarkdown", headings = FALSE);
view(cyl_freq)
and if you need to create a bunch of tables across multiple columns:
multiple_freq <- lapply(mtcars[c(2,8:11)], function(x) freq(x, report.nas = FALSE, totals=FALSE, cumul=FALSE, headings = FALSE));
view(multiple_freq)
This isn't using the summarytools package, but I think this may be what you're looking for.
library(magrittr)  # for %>% and set_colnames()
frtable <- table(mtcars$cyl)
percent <- prop.table(frtable)
dt <- cbind(frtable, percent) %>% set_colnames(c("Count", "Percent"))
DT::datatable(dt) %>% DT::formatPercentage("Percent")
Seems you found how to make it work... Just as a tip, you can skip the lapply part. So this should work as expected:
library(summarytools)
freq(mtcars[c(2,8:11)],
     report.nas=FALSE, totals=FALSE, cumul=FALSE, headings=FALSE)
There was an issue where the cumul argument didn't register when doing this in versions prior to 0.9.8, but it's fixed. Version 0.9.8 will be on CRAN any day now, but you can always install the latest version from GitHub with remotes::install_github("dcomtois/summarytools")

flextable package assigns changes without <-

I'm not sure if this is the correct forum to post this, but I have noticed some strange behavior with the flextable package in R, and was wondering if anyone can shed any light.
In the documentation for flextable it shows objects being modified when they are re-assigned to themselves, eg:
ft <- regulartable(head(iris))
ft <- color(ft, color = "orange", part = "body" )
However, my code is modifying the actual table even without re-assigning it, just using piping %>%:
myft <- regulartable(head(iris))
myft %>% align(j = 1, align = "left")
myft # changed
I don't think piping is the issue as it doesn't have the same effect with other packages, eg:
library(plyr)
df <- head(iris)
df %>% mutate(Sum=Sepal.Width*2)
df # unchanged
Is this a bug in flextable? Or is this by design?
It's true that you can currently change formats without assigning the result, but that's not behavior you can rely on. This is an unwanted design ;) and should be corrected in upcoming versions, so it is safer to assign the result if you want your code to work with future versions.
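As general background, most R objects are copied when modified inside a function, so the caller's object only changes if the result is assigned. Objects built on environments are the exception, because environments are modified in place. The sketch below illustrates that general mechanism only; the helper names are hypothetical, and it is an assumption (not confirmed in this thread) that something similar explains the flextable behavior.

```r
# Hypothetical helpers, for illustration only
set_colour_list <- function(x, colour) {
  x$colour <- colour   # modifies a local copy of the list
  x
}
set_colour_env <- function(x, colour) {
  x$colour <- colour   # modifies the environment itself
  invisible(x)
}

a <- list(colour = "black")
set_colour_list(a, "orange")   # result not assigned
a$colour                       # still "black"

e <- new.env()
e$colour <- "black"
set_colour_env(e, "orange")    # result not assigned
e$colour                       # now "orange"
```

Assigning the result, as in the flextable documentation, works in both cases, which is why it is the safer habit.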

dplyr table display style change [duplicate]

Using the latest version of tibble the output of wide tibbles is not properly displayed when setting width = Inf.
Based on my tests with previous versions wide tibbles were printed nicely until versions later than 1.3.0. This is what I would like the output to be printed like:
...but this is what it looks like using the latest version of tibble:
I tinkered around with the old sources but to no avail. I would like to incorporate this in a package so the solution should pass R CMD check. When I just copied a load of functions from tibble v1.3.0 I managed to restore the old behavior but could not pass the check.
There's an open issue on Github related to this problem but it's apparently 'not high priority'. Is there a way to print tibbles properly with the new version?
Try out this function:
print_width_inf <- function(df, n = 6) {
  df %>%
    head(n = n) %>%
    as.data.frame() %>%
    tibble:::shrink_mat(width = Inf, rows = NA, n = n, star = FALSE) %>%
    `[[`("table") %>%
    print()
}
This seems to have changed; now one can just use:
options(tibble.width = Inf)
