I have a dataset, which has a common feature - at the end of column names (after comma) is written the group to which a specific column corresponds to. Is it possible to create a table where two row column names will be used? In an example, the first row is Up and goes Quantity, Price, Quality. Is it also possible to somehow separate by empty column/ or some bold border these 3 groups (Up, Down and Total)? I know there is DT library that helps to make it easier, however, I am looking for the solution using shiny library only.
I also found that using tags$style() and CSS can help to solve it, however not familiar with CSS.
library(shiny)
df <- structure(list(Year = c(2022L, 2022L, 2022L, 2022L, 2022L),
Week = c(0, 1, 2, 3, 4),
`Quantity, Up` = c(335, 305, 740, 910, 515),
`Price, Up` = c(1, 2, 5, 5, 3),
`Quality, Up` = c(243, 243, 243, 12321.434052, 1706.327462),
`Quantity, Down` = c(-175, -900, -205, -35, 0),
`Price, Down` = c(243, 243, 1219.717851, 902.819827, 0),
`Quality, Down` = c(2834.205418, 243, -1219.717851, 902.819827, 0),
`Quantity, Total` = c(510, 1205, 945, 945, 515),
`Price, Total` = c(431, 32, 423, 342, 243),
`Quality, Total` = c(24, 4, -2, 42, 1706.327462)),
class = c("grouped_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, -5L),
groups = structure(list(Year = 2022L, .rows = structure(list(1:5), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr", "list"))),
class = c("tbl_df", "tbl", "data.frame" ), row.names = c(NA, -1L), .drop = TRUE))
ui <- fluidPage(
# Application title
titlePanel("aFRR"),
# plot graphs
mainPanel(tabsetPanel(
tabPanel("Up",
h3(helpText("aFRR Price and Quantity")),
tableOutput("table_up"))
)
)
)
server <- function(input, output, session) {
output$table_up <- renderTable(df, na = "missing",align = 'c',striped = TRUE
)
}
shinyApp(ui, server)
Related
I have the following df:
structure(list(Donorcode = c("406A001", "406A002", "406A003",
"406A004"), Doos = c(1, 1, 2, 2), `Leeftijd T0` = c(70, 73, 79,
75), Instituut = c("Spaarne ziekenhuis", "Spaarne ziekenhuis",
"Spaarne ziekenhuis", "Spaarne ziekenhuis"), Datum = structure(c(1567468800,
1567468800, 1567468800, 1567468800), class = c("POSIXct", "POSIXt"
), tzone = "UTC")), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-4L))
I need to randomize the column 'Donorcode' based on the other 4 columns, not one column 'weighs' more than the other so the order of which column randomizes the Donorcode column first does not matter.
Is there a way to do this in R?
Many thanks!
I have a list containing 180.000 elements each represents data about an investor and a specific traded asset.
I want to save all the elements of the list into single dataframes called df into a specific folder "dev/test-data/investors-singleass/" , so that I can later on apply a specific function on all the dfs of the folder
The list of my data has a structure similar to this
list(`4Z627.004125` = structure(list(investor = c("4Z627", "4Z627",
"4Z627"), asset = c("004125", "004125", "004125"), datetime = c("2015-05-12",
"2015-05-28", "2016-08-19"), Avgprice = c(169.4, 168, 162), operation = c(2000,
1000, -3000), portfolio = c(2000, 3000, 0), last_port = c(0,
2000, 3000), marketprice = c(169.4, 166.5, 161.75), portprice = c(169.4,
168.933333333333, 0), G = c(0, 0, 1), gainminus = c(2, 0, 0),
numasset = c(5, 8, 13)), row.names = c(NA, -3L), class = c("tbl_df",
"tbl", "data.frame"))
Basically each elements of the list is an "investor" ID and an "asset" code for which i then have multiple other columns to work with
I would do it like this based on link
df1 <- list(`4Z627.004125` = structure(list(investor = c("4Z627", "4Z627",
"4Z627"), asset = c("004125", "004125", "004125"),
datetime = c("2015-05-12", "2015-05-28", "2016-08-19"),
Avgprice = c(169.4, 168, 162),
operation = c(2000, 1000, -3000), portfolio = c(2000, 3000, 0),
last_port = c(0,2000, 3000), marketprice = c(169.4, 166.5, 161.75),
portprice = c(169.4,
168.933333333333, 0), G = c(0, 0, 1), gainminus = c(2, 0, 0),
numasset = c(5, 8, 13)), row.names = c(NA, -3L),
class = c("tbl_df", "tbl", "data.frame")),
`4Z628.004128` = structure(list(investor = c("4Z627", "4Z627",
"4Z627"), asset = c("004125", "004125", "004125"),
datetime = c("2015-05-12", "2015-05-28", "2016-08-19"),
Avgprice = c(169.4, 168, 162),
operation = c(2000, 1000, -3000), portfolio = c(2000, 3000, 0),
last_port = c(0,2000, 3000), marketprice = c(169.4, 166.5, 161.75),
portprice = c(169.4,
168.933333333333, 0), G = c(0, 0, 1), gainminus = c(2, 0, 0),
numasset = c(5, 8, 13)), row.names = c(NA, -3L),
class = c("tbl_df", "tbl", "data.frame")))
library(purrr)
iwalk(df1, ~saveRDS(.x, paste0("dev/test-data/investors-singleass/", .y, '.RData')))
You can get the data back into R with
library(dplyr)
df <- list.files(path = "dev/test-data/investors-singleass/", pattern = ".RData") %>%
map_dfr(readRDS)
I'm having issues joining a set of columns with a simple inner_join even though all of my data is tidy. Below is the error that I receive and below that I will paste simple samples of my data.
library(tidyverse)
library(janitor)
regions_name = regions %>% select(region, name)
regions_name$region = as.numeric(regions_name$region)
postcode_clean = postcode %>% clean_names()
#postcode_clean$pr = as.double(postcode_clean$pr)
postcode_province = postcode_clean %>% left_join(y = regions_name, by = c("pr", "region"))
#> Error: Join columns must be present in data.
#> x Problem with `region`.
> dput(head(postcode_clean, 10))
structure(list(pc = structure(c("A0A1A0", "A0A1B0", "A0A1C0",
"A0A1C0", "A0A1C0", "A0A1C0", "A0A1C0", "A0A1C0", "A0A1E0", "A0A1G0"
), label = "Postal code", format.spss = "A6"), pr = structure(c(10,
10, 10, 10, 10, 10, 10, 10, 10, 10), label = "Province or territory code", format.spss = "F2.0", display_width = 4L, labels = c(Newfoundland = 10,
`Prince Edward Island` = 11, `Nova Scotia` = 12, `New Brunswick` = 13,
Quebec = 24, Ontario = 35, Manitoba = 46, Saskatchewan = 47,
Alberta = 48, `British Columbia` = 59, Yukon = 60, `Northwest Territories` = 61,
Nunavut = 62), class = c("haven_labelled", "vctrs_vctr", "double"
)), cs_duid = structure(c(1001144, 1001464, 1001557, 1001557,
1001557, 1001557, 1001557, 1001557, 1001347, 1001409), label = "Census subdivision unique identifier", format.spss = "F7.0", display_width = 9L)), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))
> dput(head(regions_name, 10))
structure(list(region = c(1, 35, 24, 59, 48, 46, 47, 12, 13,
10), name = c("Canada", "Ontario", "Quebec", "British Columbia",
"Alberta", "Manitoba", "Saskatchewan", "Nova Scotia", "New Brunswick",
"Newfoundland and Labrador")), row.names = c(NA, -10L), class = c("tbl_df",
"tbl", "data.frame"), last_updated = structure(1648783776.07826, class = c("POSIXct",
"POSIXt")))
I don't understand it. I'm not doing anything complicated, yet I am being thrown this error. Any suggestions?
I have a flextable that I am trying to conditionally format percentage numbers based if they are > or less than a certain %. It's a simple conditional format so I'm not sure why it's not working. I feel as though I'm missing something obvious here.
Here is an example:
myft = structure(list(Name = c("Bob", "Fred", "Joe"), `2020-03-30` = c(96,
100, 36)), row.names = c(NA, -3L), class = c("tbl_df", "tbl",
"data.frame"))
myft = flextable(myft)
myft = bg(myft, i = ~ Name > 50,
j = 2,
bg="red")
myft
This code produces this image:
You want to use the conditional formatting based on the "2020-03-30" column:
library(flextable)
myft = structure(list(Name = c("Bob", "Fred", "Joe"), `2020-03-30` = c(96,
100, 36)), row.names = c(NA, -3L), class = c("tbl_df", "tbl",
"data.frame"))
myft = flextable(myft)
myft = bg(myft, i = ~ `2020-03-30` > 50,
j = 2,
bg="red")
myft
Edit:
If you want conditional coloring across multiple columns, you could create a color matrix:
library(flextable)
myft = structure(list(Name = c("Bob", "Fred", "Joe"),
`2020-03-30` = c(96, 100, 36),
`2020-04-30` = c(30, 100, 36)),
row.names = c(NA, -3L), class = c("tbl_df", "tbl", "data.frame"))
colormatrix <- ifelse(myft[, -1] > 50, "red", "white")
myft %>% flextable() %>% bg(j = 2:3, bg=colormatrix)
I have a panel (cross-sectional time series) dataset. For each group (defined by (NAICS2, occ_type) in time ym) I have many variables. For each variable I would like to subtract each group's first (dplyr::first) value from every value of that group.
Ultimately I am trying to take the Euclidean difference between the vector of each row 's group's first entry, (i.e. sqrt(c_1^2 + ... + c_k^2).
I was able to create the a column equal to the first entries for each group:
df2 <- df %>%
group_by(ym, NAICS2, occ_type) %>%
distinct(ym, NAICS2, occ_type, .keep_all = T) %>%
arrange(occ_type, NAICS2, ym) %>%
select(group_cols(), ends_with("_scf")) %>%
mutate_at(vars(-group_cols(), ends_with("_scf")),
list(first = dplyr::first))
I then tried to include variations of f.diff = . - dplyr::first(.) in the list, but none of those worked. I googled the dot notation for a while as well as first and lag in dplyr timeseries but have not been able to resolve this yet.
Ideally, I unite all variables into a vector for each row first and then take the difference.
df2 <- df %>%
group_by(ym, NAICS2, occ_type) %>%
distinct(ym, NAICS2, occ_type, .keep_all = T) %>%
arrange(occ_type, NAICS2, ym) %>%
select(group_cols(), ends_with("_scf")) %>%
unite(vector, c(-group_cols(), ends_with("_scf")), sep = ',') %>%
# TODO: DISTANCE_BETWEEN_ENTRY_AND_FIRST
mutate(vector.diff = ???)
I expect the output to be a numeric column that contains a distance measure of how different each group's row vector is from its initial row vector.
Here is a sample of the data:
structure(list(ym = c("2007-01-01", "2007-02-01"), NAICS2 = c(0L,
0L), occ_type = c("is_middle_manager", "is_middle_manager"),
Administration_scf = c(344, 250), Agriculture..Horticulture..and.the.Outdoors_scf = c(11,
17), Analysis_scf = c(50, 36), Architecture.and.Construction_scf = c(57,
51), Business_scf = c(872, 585), Customer.and.Client.Support_scf = c(302,
163), Design_scf = c(22, 17), Economics..Policy..and.Social.Studies_scf = c(7,
7), Education.and.Training_scf = c(77, 49), Energy.and.Utilities_scf = c(25,
28), Engineering_scf = c(90, 64), Environment_scf = c(19,
19), Finance_scf = c(455, 313), Health.Care_scf = c(105,
71), Human.Resources_scf = c(163, 124), Industry.Knowledge_scf = c(265,
174), Information.Technology_scf = c(467, 402), Legal_scf = c(21,
17), Maintenance..Repair..and.Installation_scf = c(194, 222
), Manufacturing.and.Production_scf = c(176, 174), Marketing.and.Public.Relations_scf = c(139,
109), Media.and.Writing_scf = c(18, 20), Personal.Care.and.Services_scf = c(31,
16), Public.Safety.and.National.Security_scf = c(14, 7),
Religion_scf = c(0, 0), Sales_scf = c(785, 463), Science.and.Research_scf = c(52,
24), Supply.Chain.and.Logistics_scf = c(838, 455), total_scf = c(5599,
3877)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -2L), groups = structure(list(ym = c("2007-01-01",
"2007-02-01"), NAICS2 = c(0L, 0L), occ_type = c("is_middle_manager",
"is_middle_manager"), .rows = list(1L, 2L)), row.names = c(NA,
-2L), class = c("tbl_df", "tbl", "data.frame"), .drop = TRUE))