I'm new to R and I'm struggling with some plotting in ggplot.
I have some monthly data that I plotted as points connected with lines.
ggplot(data=df, aes(x=x,y=y)) +
geom_line(aes(group=g)) + geom_point()
Now I'd like to add the pairwise results of Wilcoxon tests between the three groups.
It should look like this.
I'm a bit confused: I know stat_pvalue_manual works with categories, but I have a continuous y axis, and the brackets should be horizontal.
Maybe there are other functions that can do this.
Does anyone have an example of how this could be done?
Thanks in advance.
structure(list(x = c("April", "April", "April", "May", "May",
"May", "June", "June", "June", "July", "July", "July", "August",
"August", "August", "September", "September", "September", "October",
"October", "October", "November", "November", "November", "December",
"December", "December", "January", "January", "January", "February",
"February", "February"), g = c("a", "b", "c", "a", "b", "c",
"a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a",
"b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b",
"c"), y = c(4.748, 5.3388, 5.7433, 4.744, 5.4938, 6.1583, 4.767,
5.6, 6.2067, 4.889, 5.8363, 6.295, 4.887, 5.6413, 6.15, 4.94,
5.73, 6.1833, 4.974, 5.2113, 5.77, 5.022, 5.47, 5.9117, 4.964,
5.3425, 5.7217, 4.95, 5.15, 5.9833, 4.75, 5.425, 5.7833)), class = "data.frame", row.names = c(NA, -33L))
A few things make this fiddly. The main ones are that your x-axis is discrete while stat_pvalue_manual seems to need a continuous scale here, and that a coordinate swap is needed to make the brackets horizontal. As a result, the month factor needs to be ordered and converted to numeric, geom_line needs to become geom_path (geom_line would connect the points in order of the swapped x aesthetic rather than in month order), and the mean of each group needs to be calculated and substituted into the group1 and group2 columns of the stat.test object. This results in:
library(ggplot2)
library(ggpubr) # compare_means(), stat_pvalue_manual()
#Test data
df <- structure(list(x = c("April", "April", "April", "May", "May", "May", "June", "June", "June", "July", "July", "July", "August", "August", "August", "September", "September", "September", "October", "October", "October", "November", "November", "November", "December", "December", "December", "January", "January", "January", "February", "February", "February"), g = c("a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c"), y = c(4.748, 5.3388, 5.7433, 4.744, 5.4938, 6.1583, 4.767, 5.6, 6.2067, 4.889, 5.8363, 6.295, 4.887, 5.6413, 6.15, 4.94, 5.73, 6.1833, 4.974, 5.2113, 5.77, 5.022, 5.47, 5.9117, 4.964, 5.3425, 5.7217, 4.95, 5.15, 5.9833, 4.75, 5.425, 5.7833)), class = "data.frame", row.names = c(NA, -33L))
df$x <- factor(df$x, levels=unique(df$x))
#Pairwise tests between the groups (compare_means uses wilcox.test by default)
stat.test <- compare_means(
y ~ g, data = df
)
#Calculate mean values by group
means <- aggregate(df$y, list(g=df$g), mean)
means2 <- means$x
names(means2) <- means$g
#Replace the group labels with the group means so the brackets get numeric endpoints
stat.test$group1 <- means2[stat.test$group1]
stat.test$group2 <- means2[stat.test$group2]
stat.test$y.position <- c(13, 13.5, 13) #arbitrary location for plotting brackets
#Modify the plot
ggplot(data=df, aes(x=y,y=as.numeric(x))) +
geom_path(aes(group=g)) +
geom_point() +
stat_pvalue_manual(stat.test, coord.flip = TRUE) + coord_flip() +
scale_y_continuous("Month", labels=levels(df$x),
breaks=seq_along(levels(df$x)), minor_breaks = 1)
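As a cross-check on the numbers behind the brackets, here is a minimal base-R sketch. It assumes the same data as in the question and uses pairwise.wilcox.test from the stats package instead of ggpubr, so the output format differs from compare_means, and the adjusted p-values are only expected to be comparable, not guaranteed identical. The name group_means is just illustrative.

```r
# Data from the question: three groups (a, b, c) measured over 11 months
df <- data.frame(
  g = rep(c("a", "b", "c"), times = 11),
  y = c(4.748, 5.3388, 5.7433, 4.744, 5.4938, 6.1583, 4.767, 5.6, 6.2067,
        4.889, 5.8363, 6.295, 4.887, 5.6413, 6.15, 4.94, 5.73, 6.1833,
        4.974, 5.2113, 5.77, 5.022, 5.47, 5.9117, 4.964, 5.3425, 5.7217,
        4.95, 5.15, 5.9833, 4.75, 5.425, 5.7833)
)

# Base-R pairwise Wilcoxon rank-sum tests (Holm adjustment is the default)
res <- pairwise.wilcox.test(df$y, df$g)
res$p.value # lower-triangular matrix of adjusted p-values

# Per-group means, i.e. the numeric positions used for the bracket endpoints
group_means <- tapply(df$y, df$g, mean)
group_means
```

The group means are what the answer substitutes for group1 and group2, which is why the brackets end up spanning the average position of each line.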
I am trying to convert a character column containing month names into a numerical column containing numbers (1=January, etc.).
Example 1. This works:
df2 <- structure(list(Month = c("January", "January", "March", "March", "April")), class = "data.frame", row.names = c(NA, -5L))
df2 %>%
group_by(Month) %>%
mutate(Month = which(Month == month.name)) -> df2
Example 2. This does not work (error: "Month must be size 31 or 1, not 2."):
df <- structure(list(Month = c("August", "August", "August", "August",
"August", "August", "August", "August", "August", "August", "August",
"August", "August", "August", "August", "August", "August", "August",
"August", "August", "August", "August", "August", "August", "August",
"August", "August", "August", "August", "August", "August", "December",
"December", "December", "December")), class = "data.frame", row.names = c(NA, -35L))
df %>%
group_by(Month) %>%
mutate(Month = which(Month == month.name)) -> df
The code for converting is the same in both cases. Why doesn't it work in the second example? I can't get my head around it.
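A likely explanation, sketched with base R only: inside each group, Month == month.name compares the group's rows against the 12-element month.name with recycling. With groups of size 1 or 2 only one position matches, so which() returns a single number. With 31 rows of "August", the recycled month.name lines up with "August" at two positions, so which() returns two values and mutate() rejects the result. A vectorised lookup with match() avoids the grouping altogether. The variable names below are only illustrative.

```r
months <- c(rep("August", 31), rep("December", 4))

# Vectorised lookup: position of each name in the built-in month.name
month_num <- match(months, month.name)

# What happens inside the 31-row "August" group: month.name (length 12) is
# recycled against 31 values, so "August" matches at more than one position
recycled_hits <- suppressWarnings(which(rep("August", 31) == month.name))
recycled_hits
```

With dplyr this would be mutate(Month = match(Month, month.name)), with no group_by needed.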
I'm coding an interactive map in R with Shiny and Leaflet. I programmed one "select all" button for the months (checkGroup) and it worked fine, but adding "select all" buttons for the other inputs has stopped the map from working properly.
#import data
data <- structure(list(Area = c("Scarborough", "Etobicoke", "East York",
"North York", "North York", "Etobicoke", "Downtown Core (Central)",
"York", "Downtown Core (Central)", "York"), occurrenceyear = c(2017L,
2018L, 2018L, 2018L, 2018L, 2018L, 2017L, 2018L, 2018L, 2018L
), occurrencemonth = structure(c(12L, 5L, 5L, 5L, 5L, 5L, 6L,
12L, 12L, 12L), .Label = c("", "April", "August", "December",
"February", "January", "July", "June", "March", "May", "November",
"October", "September"), class = "factor"), Long = c(-79.1886063,
-79.5458221, -79.3138199, -79.4392548, -79.4406738, -79.5390091,
-79.3820572, -79.4840012, -79.3930817, -79.4356079), Lat = c(43.7639694,
43.5895691, 43.6753197, 43.7586555, 43.727829, 43.6431503, 43.6683502,
43.6842308, 43.6707535, 43.6820869)), row.names = c(NA, 10L), class = "data.frame")
# Define UI ----
ui <- fluidPage(
titlePanel("Interactive Toronto Auto Theft Visualization"),
sidebarLayout(
sidebarPanel(
checkboxGroupInput("checkGroup", h3("Month"), choices = list("January", "February", "March", "April", "May", "June", "July", "August" ,"September", "October", "November", "December"), selected = "January"),
actionLink("selectall", "Select All"),
checkboxGroupInput("checkGroup2", h3("Year"),
choices = list(2014, 2015,2016 , 2017, 2018 ), selected = 2018),
actionLink("Selectall2", "Select All"),
checkboxGroupInput("checkGroup3", "Toronto Neighbourhoods", choices = list("Downtown Core (Central)", "East End", "North End", "West End", "East York", "Etobicoke", "North York", "Scarborough", "York"), selected = "York"),
actionLink("Selectall3", "Select All")
),
mainPanel (leafletOutput("map", "100%", 500))
))
# Define server logic ----
server <- function(input, output, session){
observe({
if(input$selectall == 0) return(NULL)
else if(input$selectall%%2==0)
{
updateCheckboxGroupInput(session, "checkGroup", "Month", choices = list("January", "February", "March", "April", "May", "June", "July", "August" ,"September", "October", "November", "December"))
}
else
{
updateCheckboxGroupInput(session, "checkGroup", "Month", choices = list("January", "February", "March", "April", "May", "June", "July", "August" ,"September", "October", "November", "December"), selected = list("January", "February", "March", "April", "May", "June", "July", "August" ,"September", "October", "November", "December"))
}
if(input$Selectall2 == 0) return(NULL)
else if(input$Selectall2 %%2 == 0)
{
updateCheckboxGroupInput(session, "checkGroup2", "Year", choices = list(2014, 2015,2016 , 2017, 2018))
}
else
{
updateCheckboxGroupInput(session, "checkGroup2", "Year", choices = list(2014, 2015,2016 , 2017, 2018), selected = list(2014, 2015,2016 , 2017, 2018))
}
})
filtered <- reactive({
if (is.null(input$checkGroup) & is.null(input$checkGroup2) & is.null(input$checkGroup3)){
return (NULL)
}
data %>% filter(occurrencemonth %in% input$checkGroup & occurrenceyear %in% input$checkGroup2 & Area %in% input$checkGroup3)
})
output$map <- renderLeaflet({
leaflet()%>%
addProviderTiles("CartoDB") %>%
addCircleMarkers(data = filtered(), radius = 2)
})
}
I believe the problem is in the observe function, because that is where the code for the select all buttons lives. I had only programmed two of the buttons before running into the problem, and I was trying to fix the issue before adding the third button (Selectall3).
I've tried creating two separate observe functions for the two buttons, but that did not fix the problem.
You have a typo here:
if(input$selectall2 == 0) return(NULL)
Should be:
if(input$Selectall2 == 0) return(NULL)
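Separately from the typo, note that in a single observe() the first return(NULL) exits the whole observer, so the second link is never handled until the first one has been clicked. A minimal sketch of one way to split the logic follows. toggle_all is a hypothetical helper, not part of shiny, and the observeEvent usage is left as comments so the snippet runs without a Shiny session.

```r
# Hypothetical helper: given an actionLink's click count and the full set of
# choices, return what should be selected (odd clicks: all, even clicks: none)
toggle_all <- function(clicks, choices) {
  if (clicks %% 2 == 1) choices else character(0)
}

toggle_all(1, month.name) # all twelve months
toggle_all(2, month.name) # character(0), i.e. nothing selected

# Sketch of use inside server(), one observer per link (requires shiny):
# observeEvent(input$selectall, {
#   updateCheckboxGroupInput(session, "checkGroup",
#                            selected = toggle_all(input$selectall, month.name))
# })
# observeEvent(input$Selectall2, {
#   updateCheckboxGroupInput(session, "checkGroup2",
#                            selected = toggle_all(input$Selectall2, 2014:2018))
# })
```

Because observeEvent only fires once its link has actually been clicked, the explicit == 0 checks should not be needed, and each link works independently of the other.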
I've been around the forums looking for a solution to my issue but can't seem to find anything. Derivatives of my question and their answers haven't really helped either. My data has four columns, including one for Year and one for Month. I want to plot all the data in one graph without faceting by year in ggplot. This is what I have so far:
df<-data.frame(Month = rep(c("January", "February", "March", "April", "May", "June",
"July", "August", "September", "October",
"November", "February", "March"),each = 20),
Year = rep(c("2018", "2019"), times = c(220, 40)),
Type = rep(c("C", "T"), 260),
Value = runif(260, min = 10, max = 55))
df$Month<-ordered(df$Month, month.name)
df$Year<-ordered(df$Year)
ggplot(df) +
geom_boxplot(aes(x = Month, y = Value, fill = Type)) +
facet_wrap(~Year)
I'd ideally like to manage this using dplyr and lubridate. Any help would be appreciated!
One option would be to make a true date value; then you can use the date axis formatter. Something like this is a rough start:
ggplot(df) +
geom_boxplot(aes(x = lubridate::mdy(paste(Month, 1, Year)), y = Value, fill = Type, group=lubridate::mdy(paste(Month, 1, Year)))) +
scale_x_date(breaks="month", date_labels = "%m")
Do you mean this?
df<-data.frame(Month = rep(c("January", "February", "March", "April", "May", "June",
"July", "August", "September", "October",
"November", "February", "March"),each = 20),
Year = rep(c("2018", "2019"), times = c(220, 40)),
Type = rep(c("C", "T"), 260),
Value = runif(260, min = 10, max = 55))
df$Month <- factor(df$Month,levels=c("January", "February", "March", "April", "May", "June",
"July", "August", "September", "October",
"November", "December"), ordered = T)
df$Month<-ordered(df$Month)
df$Year<-ordered(df$Year)
df$Year_Month <- paste0(df$Month, " ", df$Year)
df$Year_Month <- factor(df$Year_Month, levels = unique(df$Year_Month))
ggplot(df) +
geom_boxplot(aes(x = Year_Month, y = Value, fill = Type))
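One caveat with levels = unique(...) is that it keeps the order in which the rows appear, so it only gives a chronological axis when the data is already sorted. Below is a minimal sketch of ordering the combined factor explicitly, using only base R and the built-in month.name. The small data frame is an illustrative stand-in, and key and label are assumed names.

```r
# Small stand-in for the question's data, deliberately out of order
df <- data.frame(
  Month = c("November", "January", "February", "March", "February"),
  Year  = c("2018", "2018", "2019", "2019", "2018")
)

# Sortable "YYYY-MM" key built from the month's position in month.name
key <- sprintf("%s-%02d", df$Year, match(df$Month, month.name))

# Short label such as "Jan 2018", with levels ordered by the key
label <- paste(substr(df$Month, 1, 3), df$Year)
df$Year_Month <- factor(label, levels = unique(label[order(key)]))
levels(df$Year_Month)
```

The resulting Year_Month column can be mapped to x in geom_boxplot exactly as in the answer above.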
Here is my code. I am trying to make an R Shiny page that shows the mean symptoms in the past 30 days versus the state, based on the slider position (1 to 12, for the month). I know it is a little sloppy, but I almost have it. I can get a graph whose title changes with the month on the slider, but the graph shows all of the data instead of only the selected month. Any help would be great.
asthma = read.csv("AsthmaChild.Ozone.2006_2007.Sample.csv")
state.month = asthma[,-3:-10]
state.month = state.month[,-4]
state.month = aggregate(state.month$Symptoms.Past30D ~ state.month$STATE +
state.month$Month, state.month, mean)
colnames(state.month) = c("STATE", "Month", "Symptoms.Past30D")
sd = asthma[,-3:-10]
sd = sd[,-4]
sd = aggregate(sd$Symptoms.Past30D ~ sd$STATE + sd$Month, sd, function(x)
sd = sd(x))
colnames(sd) = c("STATE", "Month", "sd")
merged = merge(state.month,sd, by=c("STATE", "Month"))
df = count(asthma, "STATE", "Month")
colnames(df) = c("STATE","Freq")
data = merge(df, merged,by=c("STATE"))
data$sem = (data$sd)/(sqrt(data$Freq))
merged = data
merged$ConfUp = (merged$Symptoms.Past30D) + (merged$sem)
merged$ConfDown = (merged$Symptoms.Past30D) - (merged$sem)
merged$Month = as.character(merged$Month)
merged$Month = gsub("12", "December", merged$Month)
merged$Month = gsub("11", "November", merged$Month)
merged$Month = gsub("10", "October", merged$Month)
merged$Month = gsub("9", "September", merged$Month)
merged$Month = gsub("8", "August", merged$Month)
merged$Month = gsub("7", "July", merged$Month)
merged$Month = gsub("6", "June", merged$Month)
merged$Month = gsub("5", "May", merged$Month)
merged$Month = gsub("4", "April", merged$Month)
merged$Month = gsub("3", "March", merged$Month)
merged$Month = gsub("2", "February", merged$Month)
merged$Month = gsub("1", "January", merged$Month)
index = c(1:12)
values = c("January", "February", "March", "April", "May", "June", "July",
"August", "September", "October", "November", "December")
ui = fluidPage(
sidebarPanel(
sliderInput("Month", "Month: Jan=1, Dec=12",min = 1, max =
12,step=1,value=1)),
mainPanel(plotOutput("plot")))
server = function(input,output){
sliderInput(inputId="Month",
label="Month: Jan=1, Dec=12",
min = 1,
max = 12,
value=1,
step=1)
mainPanel(plotOutput("plot"))
dat = reactive({
test <- merged[merged$Month %in%
seq(from=min(input$Month),to=max(input$Month),by=1),]
})
output$plot = renderPlot({
ggplot(data=merged, aes(x=Symptoms.Past30D, y = STATE)) +
geom_errorbarh(aes(xmin=ConfUp,xmax=ConfDown), height=1, linetype = 1) +
xlab ("Mean Symptoms.Past30D (SEM)") + ylab ("STATE") +
labs(title=paste(values[match(input$Month, index)]))
})
}
shinyApp(ui, server)
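Two things in the code above would explain the symptom: renderPlot draws from merged rather than from the reactive dat(), and merged$Month has been converted to month names while the slider returns a number, so the %in% filter cannot match anything. Here is a minimal sketch of the filtering step on its own. The data frame is a made-up stand-in for merged, and filter_month is a hypothetical helper.

```r
# Illustrative stand-in for the question's `merged` table (made-up values)
merged <- data.frame(
  STATE = c("AL", "AK", "AL", "AK"),
  Month = c("January", "January", "February", "February"),
  Symptoms.Past30D = c(3.1, 2.4, 4.0, 1.9)
)

# Hypothetical helper: keep the rows for one slider position (1 to 12)
filter_month <- function(data, month_index) {
  data[data$Month == month.name[month_index], , drop = FALSE]
}

feb <- filter_month(merged, 2)
feb

# Inside server(), the reactive and the plot would then look roughly like:
# dat <- reactive(filter_month(merged, input$Month))
# output$plot <- renderPlot({
#   ggplot(data = dat(), aes(x = Symptoms.Past30D, y = STATE)) + geom_point()
# })
```

The key change is that the plot reads from dat(), so it re-renders with only the selected month whenever the slider moves.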
I'm using the MX DateField control in Flex and want to display the date as 01 Jul 2011 or 01 July 2011. Does anyone know how to do this? I tried setting the formatString to "DD MMM YYYY" but it didn't work.
This works:
<fx:Declarations>
<mx:DateFormatter id="myDf" formatString="DD MMM YYYY"/>
</fx:Declarations>
<fx:Script>
<![CDATA[
private function formatDate(date:Date):String{
return myDf.format(date);
}
]]>
</fx:Script>
<mx:DateField id="dateField" labelFunction="formatDate" />
Found it in the LiveDocs at http://livedocs.adobe.com/flex/3/html/help.html?content=controls_12.html
However, this does not explain why the formatString property on the component does not work properly.
I can confirm that it does not work as expected.
Cheers
I would use something like this:
<mx:DateField id="dateField"
dayNames="['S', 'M', 'T', 'W', 'T', 'F', 'S']"
monthNames="['January', 'February', 'March', 'April', 'May',
'June', 'July', 'August', 'September', 'October',
'November', 'December']" />
Since you mentioned 3 character month names, this is a good example. If you don't need the day names, of course, remove that line.
<mx:DateField id="dateField"
dayNames="['S', 'M', 'T', 'W', 'T', 'F', 'S']"
monthNames="['Jan', 'Feb', 'Mar', 'Apr', 'May',
'Jun', 'Jul', 'Aug', 'Sep', 'Oct',
'Nov', 'Dec']"
formatString="DD MMM YYYY" />
Hope this helps.