ggplot2 geom_text - 'dynamically' place label over barchart - r

I have what I know is going to be an impossibly easy question. I am showing an average number of days by month using a bar chart, using the following example:
dat <- structure(list(Days = c("217.00", "120.00", "180.00", "183.00",
"187.00", "192.00"), Amt = c("1,786.84", "1,996.53",
"1,943.23", "321.30", "2,957.03", "1,124.32"), Month = c(201309L,
201309L, 201309L, 201310L, 201309L, 201309L), Vendor = c("Comp A",
"Comp A", "Comp A", "Comp A", "Comp A",
"Comp A"), Type = c("Full", "Full",
"Self", "Self", "Self", "Self"
), ProjectName = c("Rpt 8",
"Rpt 8", "Rpt 8",
"Rpt 8", "Rpt 8",
"Rpt 8")), .Names = c("Days",
"Amt", "Month", "Vendor", "Type", "ProjectName"
), row.names = c("558", "561", "860", "1157", "1179", "1221"), class =
"data.frame")
ggplot(dat, aes(x=as.character(Month),y=as.numeric(Days),fill=Type))+
stat_summary(fun.y='mean', geom = 'bar')+
ggtitle('Rpt 8')+
xlab('Month')+
ylab('Average Days')+
geom_text(stat='bin',aes(y=100, label=paste('Avg:\n',..count..)))
Right now my labels show counts and appear wherever I set y.
I want to:
place labels at the top of the bars.
display the average, not the count.
I've pretty thoroughly - and unsuccessfully - tried most of the other solutions on SO & elsewhere.

Just got it:
library(plyr)    # provides ddply()
library(ggplot2)
# Pre-compute the mean per bar so geom_text() can use it for both position and label
means<-ddply(dat,.(Vendor,Type,Month), summarise, avg=mean(as.numeric(Days)))
ggplot(dat, aes(x=as.character(Month),y=as.numeric(Days),fill=Type))+
stat_summary(fun.y='mean', geom = 'bar')+ # fun.y is named fun in ggplot2 >= 3.3.0
geom_text(data = means, stat='identity',
aes(y=avg+7, label=round(avg,0),group=Type))
I realize there is nearly identical code elsewhere. My error was placing round's 0 outside the correct closing parenthesis, which moved all my labels to 0 on the x axis... DUH!
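For what it's worth, newer ggplot2 releases can compute the label inside the plot, which avoids the separate ddply() step. A minimal sketch, assuming ggplot2 3.3.0 or later (where fun replaces fun.y and after_stat() is available). The small dat below is a cut-down copy of the question's data, and the vjust offset is just one possible choice:

```r
library(ggplot2)

# Cut-down copy of the question's data (Days stored as character, as in the original)
dat <- data.frame(
  Days  = c("217.00", "120.00", "180.00", "183.00", "187.00", "192.00"),
  Month = c(201309L, 201309L, 201309L, 201310L, 201309L, 201309L),
  Type  = c("Full", "Full", "Self", "Self", "Self", "Self")
)

p <- ggplot(dat, aes(x = as.character(Month), y = as.numeric(Days), fill = Type)) +
  # Bars: mean of Days per Month/Type group
  stat_summary(fun = mean, geom = "bar") +
  # Labels: the same summary, drawn as text just above each bar top
  stat_summary(fun = mean, geom = "text",
               aes(label = after_stat(round(y, 0))),
               vjust = -0.5)

print(p)
```

Because the label is derived from the same summary as the bar, the two cannot drift apart if the data changes.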

Related

How to make cells bold in KableExtra based on a condition

My data looks like this as a kable:
pdtable %>%
kbl(caption = "This is the caption") %>%
kable_classic_2()
However, I want to make some cells bold. Is there a way to do it without editing the input dataframe? I tried to integrate cell_spec in the pipes but I can't get it to work.
Does anyone have a solution?
EDIT:
Here is some example data. I want to bold every cell whose value in parentheses is below 0.05. A conditional row_spec does not seem to work, because each cell contains two values.
structure(list(`2012` = c("4.16 (0.02)", "1.39 (0.043)", "-3.65 (0.213)",
"4.35 (0.248)", "3.16 (0.036)", "8.84 (0.002)", "15.13 (0)",
"13.03 (0)", "11.16 (0.002)", "4.35 (0.047)", "-2.39 (0.6)",
"-1.45 (0.531)"), `2013` = c("-5.97 (0.24)", "-2.45 (0.73)",
"1.58 (0.002)", "17.77 (0)", "24.23 (0)", "17.29 (0)", "24.62 (0)",
"26.95 (0)", "16.92 (0)", "2.53 (0.13)", "3.79 (0.019)", "4.37 (0)"
), `2014` = c("-22.53 (0.04)", "-14.01 (0.899)", "-3.06 (0.079)",
"12.06 (0.072)", "20.32 (0.011)", "13.86 (0.009)", "34.91 (0)",
"32.15 (0)", "27.33 (0)", "2.53 (0.412)", "3.79 (0.158)", "-6.35 (0)"
), `2012-2014` = c("-26.36 (0.002)", "-13.62 (0.028)", "-4.05 (0)",
"34.98 (0)", "46.65 (0)", "37.45 (0)", "76.91 (0)", "77.23 (0)",
"60.26 (0)", "-14.44 (0.004)", "-15.67 (0)", "-6.71 (0)")), class = "data.frame", row.names = c("test 3",
"test 7", "test 15", "test1 3", "test1 7", "test1 15",
"test3 3", "test 3", "test 4", "test 4", "test 4", "test 4"))
You could use cell_spec conditionally with dplyr::mutate and stringr:
library(kableExtra)
library(dplyr)
library(stringr)
pdtable |>
mutate(across(everything(), ~cell_spec(.x, bold = ifelse(as.numeric(str_extract(.x, "(?<=\\().*?(?=\\))"))<0.05, TRUE, FALSE)))) |>
kbl(caption = "This is the caption",
escape = FALSE) |>
kable_classic_2()
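To see what the condition inside cell_spec() is doing, it may help to look at the extraction step on its own. A small base R sketch, using three of the cells from the example data; the names cells, p_values and make_bold are only for illustration, and sub() stands in for the str_extract() call above:

```r
cells <- c("4.16 (0.02)", "-3.65 (0.213)", "15.13 (0)")

# Keep only what sits between the last pair of parentheses, then convert to numeric
p_values <- as.numeric(sub("^.*\\((.*)\\)$", "\\1", cells))

# TRUE marks the cells that cell_spec(bold = ...) would render in bold
make_bold <- p_values < 0.05

data.frame(cells, p_values, make_bold)
```

The comparison already returns a logical vector, so it can be passed to bold directly.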
column_spec can accept a vector of logical values to control text formats of individual cells in a column. This example sets cell (3, 1) to bold.
library(tidyverse)
library(kableExtra)
df <- tibble(a = 1:5, b = 1:5)
df %>%
kbl() %>%
column_spec(1, bold = ifelse(df$a == 3, TRUE, FALSE)) %>%
kable_styling()

table1() Output Labeling all Data as "Missing"

I am trying to make a descriptive statistics table in R and my code functions properly (producing a table) but despite the fact that I have no missing values in my dataset, the table outputs all of my values as missing. I am still a novice in R, so I do not have a broad enough knowledge base to troubleshoot.
My code:
data <- read_excel("Data.xlsx")
data$stage <-
factor(data$stage, levels=c(1,2,3,4,5,6,7),
labels =c("Stage 0", "Stage 1", "Stage 2", "Stage 3", "Unsure", "Unsure (Early Stage)", "Unsure (Late Stage"))
data$primary_language <-factor(data$primary_language, levels=c(1,2), labels = c("Spanish", "English"))
data$status_zipcode <- factor(data$status_zipcode, levels = (1:3), labels = c("Minority", "Majority", "Diverse"))
data$status_censusblock <- factor(data$status_censusblock, levels = c(0:2), labels = c("Minority", "Majority", "Diverse"))
data$self_identity <- factor(data$self_identity, levels = c(0:1), labels = c("Hispanic/Latina","White/Caucasian"))
data$subjective_identity <- factor(data$subjective_identity, levels = c(0,1,2,4), labels = c("Hispanic/Latina", "White/Caucasian", "Multiracial", "Asian"))
label (data$stage)<- "Stage at Diagnosis"
label(data$age) <- "Age"
label(data$primary_language) <- "Primary language"
label(data$status_zipcode)<- "Demographic Status in Zipcode Area"
label(data$status_censusblock)<- "Demographic Status in Census Block Group"
label(data$self_identity) <- "Self-Identified Racial/Ethnic Group"
label(data$subjective_identity)<- "Racial/Ethnic Group as Identified by Others"
table1(~ stage +age + primary_language + status_zipcode + status_censusblock + self_identity + subjective_identity| primary_language, data=data)
The table output and the data set were attached as screenshots (not reproduced here).
When I run the data set the values are there. It actually worked for me when I re-did the spacing:
data$stage <- factor(data$stage,
levels = c(1,2,3,4,5,6,7),
labels = c("Stage 0", "Stage 1", "Stage 2", "Stage 3", "Unsure", "Unsure (Early Stage)", "Unsure (Late Stage"))
When I did it exactly as you typed it came up with NA's, too. Try the first and see if it works for you that way. Then check the spacing for the others. That may be all it is.
I do end up with one NA on the stage column because 0 is not defined in your levels.
Edit: I ran the rest, so here are some other points.
You end up with an NA in stage because one of your values is 0, but 0 is not among the levels, so it has no label.
You end up with NAs in language because your data contain 0 and 1 but you define the levels as 1, 2, so you need to change the levels to match the values. In the other columns the levels were written with `:`; what matters there is that the levels cover the values actually present (for example, status_zipcode was defined with 1:3, while the version below uses 0, 1, 2).
Change your code to this and you should have the values you need, except for that initial 0 in "stage":
data$stage <- factor(data$stage,
levels=c(1,2,3,4,5,6,7),
labels =c("Stage 0", "Stage 1", "Stage 2", "Stage 3", "Unsure", "Unsure (Early Stage)", "Unsure (Late Stage"))
data$primary_language <-factor(data$primary_language,
levels=c(0,1),
labels = c("Spanish", "English"))
data$status_zipcode <- factor(data$status_zipcode,
levels = c(0,1,2),
labels = c("Minority", "Majority", "Diverse"))
data$status_censusblock <- factor(data$status_censusblock,
levels = c(0,1,2),
labels = c("Minority", "Majority", "Diverse"))
data$self_identity <- factor(data$self_identity,
levels = c(0,1),
labels = c("Hispanic/Latina","White/Caucasian"))
data$subjective_identity <- factor(data$subjective_identity,
levels = c(0,1,2,4),
labels = c("Hispanic/Latina", "White/Caucasian", "Multiracial", "Asian"))
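The underlying behaviour is that factor() turns every value that is not listed in levels into NA. A minimal sketch with made-up values coded like primary_language (the names coded, wrong, right and uncovered are only for illustration):

```r
coded <- c(0, 1, 1, 0)   # made-up values, coded 0/1 like primary_language

# Levels 1 and 2 do not cover the 0s, so they become NA (and 1 is labelled "Spanish")
wrong <- factor(coded, levels = c(1, 2), labels = c("Spanish", "English"))
# Levels that match the coded values give no NAs
right <- factor(coded, levels = c(0, 1), labels = c("Spanish", "English"))

table(wrong, useNA = "ifany")
table(right, useNA = "ifany")

# Quick check before converting: which values are not covered by the intended levels?
uncovered <- setdiff(unique(coded), c(1, 2))
uncovered
```

Running the setdiff() check on each column before calling factor() shows which codes would otherwise end up as "Missing" in table1().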

Subgroups in R timevis

Utilizing the grouping feature in the excellent R timevis package is well documented and examples are provided in the help page of timevis::timevis().
The documentation also says that it is possible to define subgroups, which
"Groups all items within a group per subgroup, and positions them on the same height instead of stacking them on top of each other."
I am having trouble understanding how to use this feature. For example, in the example below, I would expect that "event 1" and "event 2" are defined as their own subgroups and hence they would be positioned on the same height. However, this is not the case.
timedata <- data.frame(
id = 1:6,
start = Sys.Date() + c(1, - 10, 4, 20, -10, 10),
end = c(rep(as.Date(NA), 4), Sys.Date(), Sys.Date() + 20),
group = c(1,1,1,2,2,2),
content = c("event 1", "event 2", "event 2", "event 1", "range 1", "range 1"),
subgroup = c("1.1", "1.2", "1.2", "2.1", "2.2", "2.2")
)
groups <- data.frame(id = c(1,2), content = c("g1", "g2"))
timevis::timevis(data =timedata, groups = groups)
The result of the example code: the definition of subgroups is unsuccessful.
How do I use the subgroups feature correctly?
I'm working through the subgroup and subgroupOrder functions myself, and wanted to share a couple of tips. The code below should achieve overlaying the events on top of each other, as opposed to stacking them. Note the addition of stack = FALSE in the options list().
The other place to look is at the JS documentation: http://visjs.org/docs/timeline/
timedata <- data.frame(
id = 1:6,
start = Sys.Date() + c(1, - 10, 4, 20, -10, 10),
end = c(rep(as.Date(NA), 4), Sys.Date(), Sys.Date() + 20),
group = c(1,1,1,2,2,2),
content = c("event 1", "event 2", "event 2", "event 1", "range 1", "range 1"),
subgroup = c("1.1", "1.2", "1.2", "2.1", "2.2", "2.2")
)
groups <- data.frame(id = c(1,2), content = c("g1", "g2"))
timevis::timevis(data =timedata, groups = groups, options = list(stack = FALSE))
This produces a timeline with the items overlaid rather than stacked (screenshot not reproduced).
Not sure if that's exactly what you're trying to achieve, but I hope it helps, and I hope you've made some progress otherwise!

Create empty data frame

I created a completely empty matrix. I would like to split one column into two sub-columns (like merged header cells in Excel).
Indices <- matrix(NA, 8, 2)
rownames(Indices) <- rownames(Indices, do.NULL = FALSE, prefix = "Plot") # do I need this?
rownames(Indices) <- c("Plot 1", "Plot 2", "Plot 3", "Plot 8", "Plot 9", "Plot 10",
"Plot 12", "Plot 13")
colnames(Indices) <- c("Density", "Trees per ha")
I would like to split Density into two columns: density of oaks only and total density. I have no idea what this is called. Is it even possible in R?
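As far as I know, a base R matrix has no merged header cells the way Excel does, so one common workaround is to give each sub-measure its own column. A minimal sketch; the column names and the filled-in value are hypothetical:

```r
plots <- c("Plot 1", "Plot 2", "Plot 3", "Plot 8", "Plot 9", "Plot 10",
           "Plot 12", "Plot 13")
# Hypothetical column names: the two density variants sit side by side
cols <- c("Density (oaks only)", "Density (total)", "Trees per ha")

# dimnames sets row and column names in one step, so no separate rownames() call is needed
Indices <- matrix(NA_real_, nrow = length(plots), ncol = length(cols),
                  dimnames = list(plots, cols))

# Fill a single cell by name (made-up value)
Indices["Plot 1", "Density (oaks only)"] <- 0.4
Indices
```

If the columns later need different types (numbers next to text, say), a data.frame built the same way would be the more usual container.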

Change direction of axis marks in a barplot

I need the following barplot with special axis marks. I have tried for a while but am having difficulty getting it to work. In particular, my axis labels need to change direction. I know that I have to use the axTicks, axis and barplot commands. Does anyone have an idea?
What it should look like: (image not reproduced)
Here is my data:
bpsamplevalues<-structure(c(21.3389252731795, 18.9930828477016, 19.4378755546201,
22.1009743407998, 23.8099463895258, 18.9706355343085, 19.4619810121121,
19.3433394825869, 26.8760997862876, 19.0948710373689), .Names = c("Div 1",
"Div 2", "Div 3", "Div 4", "Div 5", "Div 6", "Div 7", "Div 8",
"Div 9", "Div 10"))
I started with this code but cannot get any further:
barplot(bpsamplevalues, col="#87DEE1", axes=F, names.arg=F)
You may try this. It is the las argument which sets the orientation of axes labels. See ?par for more information.
barplot(bpsamplevalues, col = "#87DEE1", axes = FALSE, las = 2)
axis(side = 2, tick = FALSE, las = 1)
grid(nx = NA, ny = NULL, col = "white", lty = "solid")
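If you want full control over both axes, as the question hints with axTicks and axis, one option is to suppress what barplot() draws and add each axis yourself. A sketch under the assumption that the bar labels should run perpendicular to the x axis; the values are rounded from the question's data and the names vals and mids are only for illustration:

```r
# Values rounded from the question's data
vals <- c("Div 1" = 21.3, "Div 2" = 19.0, "Div 3" = 19.4, "Div 4" = 22.1,
          "Div 5" = 23.8, "Div 6" = 19.0, "Div 7" = 19.5, "Div 8" = 19.3,
          "Div 9" = 26.9, "Div 10" = 19.1)

# Suppress the default axes and bar names; barplot() returns the bar midpoints
mids <- barplot(vals, col = "#87DEE1", axes = FALSE, axisnames = FALSE)

# x axis: one label per bar at its midpoint, written perpendicular to the axis (las = 2)
axis(side = 1, at = mids, labels = names(vals), tick = FALSE, las = 2)
# y axis: tick positions from axTicks(), labels written horizontally (las = 1)
axis(side = 2, at = axTicks(2), tick = FALSE, las = 1)
```

The las values are 0 (parallel to the axis), 1 (always horizontal), 2 (perpendicular to the axis) and 3 (always vertical).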
