Sorting a column of a table in R

I am using the following code:
Daily_intensity %>%
mutate(weekdays = weekdays(date)) %>%
group_by(weekdays) %>%
summarise(minutes_fairly_very_active = sum(fairlyactiveminutes + veryactiveminutes))
The result is not in weekday order. What should I add so that I get the result in order, from Monday to Sunday?

You could use lubridate instead of base and get an ordered factor of the desired kind without needing to specify the order of the weekdays yourself:
mutate(weekdays = lubridate::wday(date, label = TRUE, week_start = 1))
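For illustration, here is a minimal sketch of the full pipeline with that change. The `Daily_intensity` data below is made up for the example (the real table was not shown); only the column names come from the question. Because `wday(..., label = TRUE)` returns an ordered factor, `group_by()` and `summarise()` should keep the rows in factor order rather than alphabetical order.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in for the asker's Daily_intensity table
Daily_intensity <- tibble(
  date = as.Date("2016-04-11") + 0:13,   # 14 consecutive days
  fairlyactiveminutes = rep(10, 14),
  veryactiveminutes = rep(5, 14)
)

result <- Daily_intensity %>%
  # ordered factor whose levels start on Monday (week_start = 1)
  mutate(weekdays = wday(date, label = TRUE, week_start = 1)) %>%
  group_by(weekdays) %>%
  summarise(minutes_fairly_very_active = sum(fairlyactiveminutes + veryactiveminutes))

print(result)
```

The day labels are locale-dependent, so the exact text of the levels may differ on a non-English system.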

Related

How do I extract the key from a key-value pair (character = character) in R?

I have this variable
variable_pairs <- c("var_name1" = "var_id1", "var_name2" = "var_id2")
I need to extract the names and strip the quotes, so I get something like (var_name1, var_name2).
Is there something to this effect in R? I suspect this is easier done in Python, but I have to do it in R. Also, I'm an absolute R beginner.
Many thanks in advance for any help!
EDIT: making my purpose a bit clearer.
I need to send the names in the select = c(year, *indicator_vector_name*) part. At first I had to do it manually (select = c(year, *total_population, gdp*)), but that obviously doesn't scale. The 'mmr' object then goes to a ggplot function.
indicator_vector <- c('total_population'='SP.POP.TOTL', 'gdp' = 'NY.GDP.MKTP.CD', 'gdp_growth' ='NY.GDP.MKTP.KD.ZG', 'fertility_rate' ='SP.DYN.TFRT.IN', 'pop_growth'='SP.POP.GROW', 'pop_0_14'='SP.POP.0014.TO.ZS', 'pop_65plus'='SP.POP.65UP.TO.ZS', 'trade_ratio' ='NE.TRD.GNFS.ZS')
indicator_vector_name <- noquote(names(indicator_vector))
for (i in indicator_vector){
indicator_data <- WDI(indicator = indicator_vector, country = c(country), start = start, end= end)
}
mmr <- melt(subset(indicator_data, select = c(year, indicator_vector_name)), id.vars = "year")

You don't need to strip the quotes from the variable names. Try this:
indicator_vector_name <- names(indicator_vector)
mmr <- melt(indicator_data[c('year', indicator_vector_name)], id.vars = "year")
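As a small base-R sketch of why this works: `names()` already returns a plain character vector, and `noquote()` only changes how that vector is printed, not what it contains. The `indicator_data` frame below is a made-up stand-in for the WDI result, and the `iso2c` column is only there to show that unselected columns are dropped.

```r
variable_pairs <- c("var_name1" = "var_id1", "var_name2" = "var_id2")

# The "keys" of the named vector, as an ordinary character vector
keys <- names(variable_pairs)
print(keys)

# Hypothetical data frame whose columns are named after the keys
indicator_data <- data.frame(
  year = 2000:2002,
  var_name1 = c(1, 2, 3),
  var_name2 = c(4, 5, 6),
  iso2c = "XX"
)

# Index columns by a character vector; no quote stripping required
selected <- indicator_data[c("year", keys)]
print(names(selected))
```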

We can use pivot_longer
library(dplyr)
library(tidyr)
indicator_data %>%
select(year, all_of(names(indicator_vector))) %>%
pivot_longer(cols = -year)

How to specify a custom formatter function gt table grand summary?

I'm trying to use a custom formatter function to format grand summary rows in a gt table.
In the example below I was trying to use seconds_to_period from lubridate, but I get
the error "Error in stop_if_not_gt(data = data) :"
require(tidyverse)
require(lubridate)
sp500 %>%
dplyr::filter(
date >= "2015-01-05" &
date <="2015-01-16"
) %>%
dplyr::arrange(date) %>%
dplyr::mutate(
week = paste0(
"W", strftime(date, format = "%V"))
) %>%
dplyr::select(-adj_close, -volume) %>%
gt(
rowname_col = "date",
groupname_col = "week"
) %>%
grand_summary_rows(
columns = vars(open, high, low, close),
fns = list(
min = ~min(.),
max = ~max(.),
avg = ~mean(.)),
formatter = fmt(fns=seconds_to_period),
use_seps = FALSE
)
I've tried some variations like formatter = fmt(columns=vars(min),fns=seconds_to_period) with no success.
Thanks

My guess is that, with seconds_to_period, you are trying to do a transformation/computation with a format function. As far as I know, that's not possible: the formatter argument only applies a format to an already computed summary cell. The syntax for passing fmt_* arguments inside grand_summary_rows is also different from what you tried. Instead of passing them inside the parentheses of a fmt_* call, pass the function itself as formatter and its arguments as additional grand_summary_rows arguments:
df |>
gt()|>
grand_summary_rows(
columns = vars(open, high, low, close),
fns = list(
min = ~min(.),
max = ~max(.),
avg = ~mean(.)),
formatter = fmt_number,
decimals = 1,
use_seps = FALSE
)
In any case, your code passes use_seps, which is a fmt_number argument, alongside fmt, which does not accept it. Take a look at the fmt documentation.
Anyway, it's not clear what you are trying to accomplish. OHLC values are price data, and I guess from the function name (seconds_to_period) that you're trying to give this price data a time-class format. If that is the case, you should compute/transform the content before trying to format it.

Reorder panels in facet_wrap/facet_grid based on another factor, with multiple occurrences

Consider this example. I want to create a custom label for my panels by joining two columns into a string.
The panels created through faceting are ordered alphabetically, but actually, I want them to be ordered by src, so SRC01 should come first, then SRC02, etc.
library(tidyverse)
tibble::tibble(
src = rep(c("SRC03", "SRC04", "SRC01", "SRC02"), 2),
data = runif(8)
) %>%
mutate(
foo = case_when(src %in% c("SRC01", "SRC02") ~ "foo", TRUE ~ "bar"),
label = paste(foo, src)
) %>%
ggplot(aes(x = data)) +
geom_density() +
facet_wrap(~label)
Created on 2019-05-22 by the reprex package (v0.3.0)
I know that this order depends on the order of underlying factor levels, but this question shows how to manually specify the levels, which I do not want (there are many more SRC values and I don't want to type all of them…).
I found a solution using fct_reorder, in which I could specify:
mutate(label = fct_reorder(label, src, .fun = identity))
But this only works when there is one line per src/label combination. If I add data (i.e., more than one data point per combination), it fails with:
Error: `fun` must return a single value per group
What would be the most succinct way to achieve what I need?

You can use the numeric part of src, and then use reorder():
tibble::tibble(
src = rep(c("SRC03", "SRC04", "SRC01", "SRC02"), 2),
data = runif(8)
) %>%
mutate(
foo = case_when(src %in% c("SRC01", "SRC02") ~ "foo", TRUE ~ "bar"),
label = paste(foo, src)
) %>%
mutate(label_order = as.numeric(str_extract(src, "\\d+"))) %>%
# use str_extract() to find the "01" inside "SRC01", turn it to numeric.
ggplot(aes(x = data)) +
geom_density() +
facet_wrap(~reorder(label, label_order))
# use reorder() to change the ordering based on the numbers
A note about str_extract(): it works on your example because
str_extract("SRC01", "\\d+") gives "01", which as.numeric() turns into 1. But
str_extract("2SRC01", "\\d+") would return "2", which is probably not what you want.
Luckily, there are plenty of ways to use regex to extract exactly what you need.
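One possible way to make the extraction robust to stray leading digits is to anchor on the literal "SRC" prefix. This is only a sketch in base R; the `src` values below (including "2SRC01") are invented to illustrate the edge case and assume every value contains "SRC" followed by digits.

```r
# Hypothetical src values, including one with a leading digit
src <- c("SRC03", "2SRC01", "SRC10")

# Capture only the digits that directly follow "SRC"
label_order <- as.numeric(sub(".*SRC([0-9]+).*", "\\1", src))
print(label_order)
```

The resulting `label_order` can be passed to `reorder()` in the same way as in the answer above.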

Is it possible to add a third dummy variable using ifelse() in R?

I was using this code to create a new Group column based on partial strings found inside the column var for 2 groups, Sui and Swe. I now have to add another group, TRD, and I've been trying to tweak the ifelse function to do this, without success. Is this doable? Are there any other solutions or functions that might help me do this?
m.df <- molten.df%>% mutate(
Group = ifelse(str_detect(variable, "Sui"), "Sui", "Swedish"))
Current m.df:
var value
ADHD_iFullSuiTrim.Threshold1 0.00549427
ADHD_iFullSuiTrim.Threshold1 0.00513955
ADHD_iFullSweTrim.Threshold1 0.00466352
ADHD_iFullSweTrim.Threshold1 0.00491633
ADHD_iFullTRDTrim.Threshold1 0.00658535
ADHD_iFullTRDTrim.Threshold1 0.00609122
Desired Result:
var value Group
ADHD_iFullSuiTrim.Threshold1 0.00549427 Sui
ADHD_iFullSuiTrim.Threshold1 0.00513955 Sui
ADHD_iFullSweTrim.Threshold1 0.00466352 Swedish
ADHD_iFullSweTrim.Threshold1 0.00491633 Swedish
ADHD_iFullTRDTrim.Threshold1 0.00658535 TRD
ADHD_iFullTRDTrim.Threshold1 0.00609122 TRD
Any help or suggestion would be appreciated even if the result can be accomplished using other functions.

No ifelse() is needed. I'd use Group = str_extract(var, pattern = "(Sui)|(TRD)|(Swe)").
You could do fancier regex with a lookbehind for "iFull" and a lookahead for "Trim", but I can never remember how to do that.
A little more roundabout, but more general if you want whatever is between "iFull" and "Trim", would be a replacement:
str_replace_all(var, pattern = "(.*iFull)|(Trim.*)", "")
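For reference, the lookbehind/lookahead version mentioned above could look roughly like this. It is a sketch in base R using Perl-compatible regex (`perl = TRUE`); the same pattern should also work with `stringr::str_extract()`. It returns "Swe" rather than "Swedish", so a recode step would still be needed to get the exact desired labels.

```r
var <- c("ADHD_iFullSuiTrim.Threshold1",
         "ADHD_iFullSweTrim.Threshold1",
         "ADHD_iFullTRDTrim.Threshold1")

# (?<=iFull) : match must be preceded by "iFull"
# .*?        : as few characters as possible
# (?=Trim)   : match must be followed by "Trim"
group <- regmatches(var, regexpr("(?<=iFull).*?(?=Trim)", var, perl = TRUE))
print(group)
```

Note that `regmatches()` silently drops elements without a match, so this assumes every value contains both "iFull" and "Trim".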

Try nesting multiple ifelse() calls:
library(dplyr)
library(stringr)
m.df <- molten.df %>%
mutate(Group = ifelse(str_detect(var, "Sui"), "Sui",
ifelse(str_detect(var, "Swe"), "Swedish", "TRD")))
Or use case_when():
m.df <- molten.df %>%
mutate(Group = case_when(
str_detect(var, "Sui") ~ "Sui",
str_detect(var, "Swe") ~ "Swedish",
TRUE ~ "TRD"
))
Data Preparation
molten.df <- read.table(text = "var value
'ADHD_iFullSuiTrim.Threshold1' 0.00549427
'ADHD_iFullSuiTrim.Threshold1' 0.00513955
'ADHD_iFullSweTrim.Threshold1' 0.00466352
'ADHD_iFullSweTrim.Threshold1' 0.00491633
'ADHD_iFullTRDTrim.Threshold1' 0.00658535
'ADHD_iFullTRDTrim.Threshold1' 0.00609122",
header = TRUE, stringsAsFactors = FALSE)

For future reference, provide all the components needed to reproduce the analysis, e.g. packages and example data:
# load ----
library(dplyr)
library(stringr)
# data ----
df=data.frame(var=c('ADHD_iFullSuiTrim.Threshold1',
'ADHD_iFullSuiTrim.Threshold1',
'ADHD_iFullSweTrim.Threshold1',
'ADHD_iFullSweTrim.Threshold1',
'ADHD_iFullTRDTrim.Threshold1',
'ADHD_iFullTRDTrim.Threshold1'),
value = c(0.00549427, 0.00513955, 0.00466352, 0.00491633, 0.00658535, 0.00609122))
df %>%
mutate(Group = case_when(str_detect(var, "Sui")~"Sui",
str_detect(var, "Swe")~"Swedish",
str_detect(var, "TRD")~"TRD"))

All values in column used for 'key' property should be unique, but some values are duplicated

I have an application built in R using the shiny and ggvis libraries finally working -- sort of. One issue I'm running into is that the add_tooltip function results in the following warning:
All values in column used for 'key' property should be unique, but some values are duplicated.
I'm using a column named id to serve as the key for my dataframe, and I've run na.omit to eliminate NA values and the duplicated function to test for duplicates in that column; it confirms that each value is unique. The values in that column were generated using the sample function, with replacement set to FALSE. I also generated it by setting df$id <- 1:nrow(df) though the warning persists.
As it is only a warning, I can still run the application, but when I click on a tooltip, I get a whole string of values replete with NAs. Below is the code for the function that builds the tooltip's content, followed by the ggvis code that calls the function.
allviztip <- function(x) { if(is.null(x))return(NULL)
row <- allviz[allviz$id == x$id, ]
paste0(row$Ticker)}
allviz %>% ggvis(x=~PriceBook,y=~DivYield,key:=~id,size=~MarketCapinMil,shape=~PayoutCat,fill=~PriceSales) %>%
layer_points() %>% add_tooltip(allviztip,"click") %>% add_axis('x', title='Price/Book Ratio',title_offset=60,properties=axis_props(labels=list(fontSize=12), title=list(fontSize=18))) %>%
add_axis("y", title = "Dividend Yield",title_offset=50,properties=axis_props(labels=list(fontSize=12), title=list(fontSize=18))) %>%
add_legend(scales="shape", title="Payout Ratio Category") %>%
add_legend(scales="fill", title="Price/Sales Ratio", properties = legend_props(legend = list(y = 100))) %>%
add_legend(scales="size", title="Market Cap",values=c(1,25,50,100,500,1000,5000,50000,100000,300000),properties = legend_props(legend = list(y = 200))) %>%
set_options(duration = 0,height="auto",width="auto") %>%
scale_numeric(property="fill",range=c("lightblue","darkblue")) }) %>% bind_shiny("visplot","visplot_ui")
Any ideas on how I can get the tooltip working properly?
