Incomplete display of plot due to ylim - r

I have a question about a plot that is only partly displayed. The y axis should range from 0 to 5. Because of ceiling and floor effects, parts of the plot are cut off (see attachment). How can I get the plot to be displayed completely without having to change the scaling? Thank you, you are very helpful!
# Visual inspection of the data
fit <- Anxiety_full
# Plot the model's marginal effects
plot_Anxiety <- plot_model(fit, type = "eff", terms = c("Condition", "Group")) + # geom_line(size = 1)
  coord_cartesian(xlim = c(0.5, NA), clip = "off") +
  theme_tq() + scale_colour_tq() + scale_fill_tq(theme = "light") +
  labs(title = "", y = "Anxiety Score [ 0-5 ]", x = "") +
  xlim(c("baseline", "60 bpm", "16 bpm", "10 bpm", "6 bpm", "random")) +
  ylim(c(-0.5, 5.5)) +
  ggplot2::labs(colour = "Group") +
  scale_color_manual(values = c("Red", "Black"))
plot_Anxiety <- plot_Anxiety + theme_apa()
plot_Anxiety
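One possible cause, offered as a sketch rather than a confirmed diagnosis: limits set with ylim() are scale limits, and by default ggplot2 turns values outside them into NA, so an error bar or ribbon whose confidence limit crosses the limit is dropped rather than drawn. Limits set through coord_cartesian(ylim = ...) only zoom the view and keep the data. The data frame pred below is a made-up stand-in for the model predictions, not the asker's data.

```r
library(ggplot2)

# Hypothetical predictions: means with confidence limits that run past
# the 0-5 range of the anxiety score (values invented for illustration).
pred <- data.frame(
  Condition = c("baseline", "60 bpm", "6 bpm"),
  predicted = c(0.3, 2.5, 4.8),
  conf.low  = c(-0.4, 1.9, 4.1),
  conf.high = c(1.0, 3.1, 5.6)
)

# What a scale limit of c(0, 5) does to the confidence limits:
# out-of-range values are censored to NA.
censored <- scales::oob_censor(c(pred$conf.low, pred$conf.high), c(0, 5))

# Zooming with the coordinate system keeps every value; bars that cross
# the limits are drawn up to the panel edge instead of being removed.
p <- ggplot(pred, aes(Condition, predicted)) +
  geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
  coord_cartesian(ylim = c(0, 5))
```

In the question's code this would mean dropping the ylim() call and adding a ylim argument to the existing coord_cartesian() call; whether that is enough depends on what the attached plot actually shows.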

Related

Problem with colouring a GG Plot Histogram

I've got an issue with colouring a ggplot2 histogram.
R chunk:
ggplot(Hospital, aes(x=BodyTemperature)) +
geom_histogram(aes(fill = factor(BodyTemperature))) +
scale_x_continuous(breaks = seq(0, 100, by = 10)) +
ylab("prevalence") +
xlab("BodyTemperature") +
ggtitle("Temperature vs. prevalence")
The histogram should convey that higher temperatures are worse, with the fill colour changing from left to right along the x axis: for example, 36 °C green, 38 °C yellow, 40 °C red.
The y axis should show how often each temperature occurs in the hospital's patient data. "BodyTemperature" is a vector of 200+ values such as "35.3" or "37.4".
How can this chunk be fixed to produce the colour changes? For a non-ggplot version I have already written this working R chunk:
```{r, fig.width=8}
color1 <- rep(brewer.pal(1, "Greens"))
color2 <- rep("#57c4fa", 0)
color3 <- brewer.pal(8, "Reds")
hist(Hospital$BodyTemperature[-357],
breaks = seq(from = 0, to = 100, by = 10),
main = "Temperature vs. prevalence",
ylab = "prevalence",
xlab = "Temperature",
col = c(color1, color2, color3))
```
The key is to make sure the bin intervals used for the fill scale match those used for the x axis. You can do this by setting the binwidth argument to geom_histogram(), and using ggplot2::cut_width() to break BodyTemperature into the same bins for the fill scale:
set.seed(13)
library(ggplot2)
# example data
Hospital <- data.frame(BodyTemperature = 36.5 + rchisq(100, 2))
ggplot(Hospital, aes(BodyTemperature)) +
geom_histogram(
aes(fill = cut_width(BodyTemperature, width = 1)),
binwidth = 1,
show.legend = FALSE
) +
scale_fill_brewer(palette = "RdYlGn", direction = -1) +
labs(
title = "Temperature vs. Prevalence",
x = "Body Temperature (°C)",
y = "Prevalence"
) +
theme_minimal()
Created on 2022-10-24 with reprex v2.0.2

Increase grid.arrange plot size

I'm using grid.arrange to display the following three plots on top of each other.
p1 <- ggseasonplot(ng2) + labs(title = "Natural Gas Consumption from Jan. 2001 to Nov. 2021 - Seasonal Plot", x = "Month", y = "Cubic Feet (Millions)") + scale_y_continuous(labels = unit_format(unit = "M", scale = 1e-6))
p2 <- ggsubseriesplot(ng2) + labs(title = "Natural Gas Consumption from Jan. 2001 to Nov. 2021 - Subseries Plot", x = "Month", y = "Cubic Feet (Millions)") + scale_y_continuous(labels = unit_format(unit = "M", scale = 1e-6))
p3 <- ggAcf(ng2, lag.max = 36) + labs(title = "Natural Gas Consumption from Jan. 2001 to Nov. 2021 - ACF Plot", x = "Lag", y = "Correlation")
gridExtra::grid.arrange(p1, p2, p3, nrow = 3, ncol = 1)
The resulting plots are unreadable.
Using the heights argument only seems to adjust the plot sizes relative to one another. Any idea how to make each plot larger (taller) as a whole so each is more readable?
In your markdown/knitr chunk, try a value like fig.height = 10:
```{r, fig.height = 10}
# Taller figure via fig.height
ggplot(cars, aes(speed, dist)) + geom_point()
```
Or you can set the default height for all figures in an Rmd file:
knitr::opts_chunk$set(fig.height = 10)
References
https://community.rstudio.com/t/rmarkdown-rnotebook-text-in-graphs-too-small-lines-to-thin-after-update/123741/3
http://zevross.com/blog/2017/06/19/tips-and-tricks-for-working-with-images-and-figures-in-r-markdown-documents/
Plot size and resolution with R markdown, knitr, pandoc, beamer
https://rmd4sci.njtierney.com/customising-your-figures.html

ggplot: aggregate multi-year data by Month-Year, aesthetic length error

i've read every relevant aggregate() by month and lubridate question i could find but am still running into an error of aesthetic length. lots didn't work for me bc they grouped data by month but the dataframe only contained data from one year. i don't need the cumulative total of every January across time – i need it to be month- AND year-specific.
my sample data: (df is called "sales")
order_date_create order_sum
2020-05-19 900
2020-08-29 500
2020-08-30 900
2021-02-01 200
2021-02-06 500
aggregating by month-year:
# aggregate by month (i used _moyr short for month year)
sales$bymonth <- aggregate(cbind(order_sum)~month(order_date_create),
data=sales,FUN=sum)
sales$order_moyr <- format(sales$order_date_create, '%m-%Y') # why does this get saved under values instead of data?
here's my ggplot:
# plot
ggplot(sales, aes(order_moyr, order_sum)) +
scale_x_date(limits = c(min, as.Date(now())),
breaks = "1 month",
labels = date_format("%m-%Y")) +
scale_y_continuous(labels = function(x) format(x, big.mark = "'", decimal.mark = ".", scientific = FALSE)) +
labs(x = "Date", y = "Sales Volume", title = "Sales by Month") +
geom_bar(stat="identity")+ theme_economist(base_size = 10, base_family = "sans", horizontal = TRUE, dkpanel = FALSE) + scale_colour_economist()
if i use x = order_date_create and y = order_sum it plots correctly, with month-year axis, but each bar is still daily sum.
if i use x = order_moyr and y = bymonth, i get this error:
Error: Aesthetics must be either length 1 or the same as the data (48839): y
tangentially, if anyone knows how to use both scales::dollar AND format the thousands separator in the same scale_y_continuous function it would be a great help. i've not found how to do both.
library(scales); library(lubridate); library(dplyr);
library(ggthemes)
sales %>%
count(order_moyr = floor_date(order_date_create, "month"),
wt = order_sum, name = "order_sum") %>%
ggplot(aes(order_moyr, order_sum)) +
scale_x_date(breaks = "1 month",
labels = date_format("%m-%Y")) +
scale_y_continuous(labels = scales::dollar_format(big.mark = "'",
decimal.mark = ".")) +
labs(x = "Date", y = "Sales Volume", title = "Sales by Month") +
geom_bar(stat="identity", width = 25)+
theme_economist(base_size = 10, base_family = "sans",
horizontal = TRUE, dkpanel = FALSE) +
scale_colour_economist()
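For comparison, a minimal base-R sketch of the same month-and-year aggregation, using the five sample rows from the question. It also hints at why the question's attempt failed: the aggregated result has one row per month-year, so it is shorter than sales and belongs in its own data frame rather than in a column of sales. The name monthly is only illustrative.

```r
# Sample data copied from the question
sales <- data.frame(
  order_date_create = as.Date(c("2020-05-19", "2020-08-29", "2020-08-30",
                                "2021-02-01", "2021-02-06")),
  order_sum = c(900, 500, 900, 200, 500)
)

# Year-month key, so that months from different years stay separate
sales$order_moyr <- format(sales$order_date_create, "%Y-%m")

# One row per month-year, kept as a separate (shorter) data frame
monthly <- aggregate(order_sum ~ order_moyr, data = sales, FUN = sum)
monthly
#>   order_moyr order_sum
#> 1    2020-05       900
#> 2    2020-08      1400
#> 3    2021-02       700
```

Passing monthly (3 rows) to ggplot() instead of sales (5 rows) keeps x and y the same length.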

Colour segments of density plot by bin

Warning, I am brand-new to R!
I have caught the R bug and am having a play with the possibilities, but I am getting very lost. I want to colour segments of a density plot using a '>' condition to indicate bins. In my head it looks like:
...but not quartile or % change dependent.
My data shows; x = duration (number of days) and y = frequency. I would like the plot to colour split on 3 month intervals up to 12 months and one colour after (using working days i.e. 63 = 3 months).
I have had a go, but really not sure where to start!
ggplot(df3, aes(x=Investigation.Duration))+
geom_density(fill = W.S_CleanNA$Investigation.Duration[W.S_CleanNA$Investigation.Duration>0],
fill = W.S_CleanNA$Investigation.Duration[W.S_CleanNA$Investigation.Duration>63], color = "white",
fill = W.S_CleanNA$Investigation.Duration[W.S_CleanNA$Investigation.Duration>127], color = "light Grey",
fill = W.S_CleanNA$Investigation.Duration[W.S_CleanNA$Investigation.Duration>190], color = "medium grey",
fill = W.S_CleanNA$Investigation.Duration[W.S_CleanNA$Investigation.Duration>253], color = "dark grey",
fill = W.S_CleanNA$Investigation.Duration[W.S_CleanNA$Investigation.Duration>506], color = "black")+
ggtitle ("Investigation duration distribution in 'Wales' complexity sample")+
geom_text(aes(x=175, label=paste0("Mean, 136"), y=0.0053))+
geom_vline(xintercept = c(136.5), color = "red")+
geom_text(aes(x=80, label=paste0("Median, 129"), y=0.0053))+
geom_vline(xintercept = c(129.5), color = "blue")
Any really simple help much appreciated.
Unfortunately, you can't do this directly with geom_density, as "under the hood" it is built with a single polygon, and a polygon can only have a single fill. The only way to do this is to have multiple polygons, and you need to build them yourself.
Fortunately, this is easier than it sounds.
There was no sample data in the question, so we will create a plausible distribution with the same median and mean:
# Simulate data
set.seed(69)
df3 <- data.frame(Investigation.Duration = rgamma(1000, 5, 1/27.7))
round(median(df3$Investigation.Duration))
#> [1] 129
round(mean(df3$Investigation.Duration))
#> [1] 136
# Get the density as a data frame
dens <- density(df3$Investigation.Duration)
dens <- data.frame(x = dens$x, y = dens$y)
# Exclude the artefactual times below zero
dens <- dens[dens$x > 0, ]
# Split into bands of 3 months and group > 12 months together
dens$band <- dens$x %/% 63
dens$band[dens$band > 3] <- 4
# This is the complex bit. For each band we want to add a point on
# the x axis at the upper and lower time limits:
dens <- do.call("rbind", lapply(split(dens, dens$band), function(df) {
df <- rbind(df[1,], df, df[nrow(df),])
df$y[c(1, nrow(df))] <- 0
df
}))
Now we have the polygons, it's just a case of drawing and labelling appropriately:
library(ggplot2)
ggplot(dens, aes(x, y)) +
geom_polygon(aes(fill = factor(band), color = factor(band))) +
theme_minimal() +
scale_fill_manual(values = c("#003f5c", "#58508d", "#bc5090",
"#ff6361", "#ffa600"),
name = "Time",
labels = c("Less than 3 months",
"3 to 6 months",
"6 to 9 months",
"9 to 12 months",
"Over 12 months")) +
scale_colour_manual(values = c("#003f5c", "#58508d", "#bc5090",
"#ff6361", "#ffa600"),
guide = guide_none()) +
labs(x = "Days since investigation started", y = "Density") +
ggtitle ("Investigation duration distribution in 'Wales' complexity sample") +
geom_text(aes(x = 175, label = paste0("Mean, 136"), y = 0.0053),
check_overlap = TRUE)+
geom_vline(xintercept = c(136.5), linetype = 2)+
geom_text(aes(x = 80, label = paste0("Median, 129"), y = 0.0053),
check_overlap = TRUE)+
geom_vline(xintercept = c(129.5), linetype = 2)
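To see what the band-closing step does, here is the same split/rbind logic applied to a tiny hand-made curve. The toy values and the helper name close_band are made up for illustration; only the logic is taken from the answer above.

```r
# Five points on a made-up curve, banded every 63 days as in the answer
toy <- data.frame(x = c(10, 40, 70, 100, 130),
                  y = c(0.1, 0.3, 0.5, 0.4, 0.2))
toy$band <- toy$x %/% 63   # bands 0, 0, 1, 1, 2

# Duplicate the first and last row of a band and drop those copies to
# y = 0, so the band becomes a closed shape resting on the x axis
close_band <- function(df) {
  df <- rbind(df[1, ], df, df[nrow(df), ])
  df$y[c(1, nrow(df))] <- 0
  df
}

closed <- do.call("rbind", lapply(split(toy, toy$band), close_band))
```

Each band gains two rows (11 rows in total here), and every band now starts and ends at y = 0, which is what lets geom_polygon() fill each one separately.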

how do i vectorise (automate) plot creation in R

edited to include sample data:
Sample data
I have been trying to write code to generate and save multiple plots from a large dataset and have to admit defeat. Would love some help if possible..
I have a df (dat) of 4 years of daily monitoring data (the sampling year runs July to June, so the Sampling.Year notation is YYYY-YYYY). I would like to export a jpg for each SITENAME, using facet wrap/facet grid so that each Sampling.Year is stacked vertically. The individual Sampling.Year plots show time-series data for the full year (x = DATE, y = Daily.Ave.PAF). The end result should be individual jpg files (with SITENAME in the file name) with the sampling years stacked but DATE (x axis) aligned. That way we can get a quick snapshot of differences over time.
The str() output is below and my (probably crappy) code is below that. The code exports plots just fine, but the data seems to be mixed up: where a SITENAME only has 2 Sampling.Years' worth of data there should only be 2 plots in the jpg, but this code produces 4. It's obviously wrong but I don't know how to fix it. Thanks in advance.
'data.frame': 521 obs. of 6 variables:
$ STATION : chr "1240062" "125013A" "122013A" "126001A" ...
$ SITENAME : chr "Oconnell River at Caravan Park" "Pioneer River at Dumbleton Weir Headwater" "Proserpine River at Glen Isla" "Sandy Creek at Homebush" ...
$ Sampling.Year: chr "2016-2017" "2018-2019" "2018-2019" "2018-2019" ...
$ DATE : Date, format: "2017-02-01" "2019-02-01" "2019-02-01" "2019-02-01" ...
$ Daily.Ave.PAF: num 24.344 15.226 45.529 44.936 0.208 ...
$ Site.Year : chr "Oconnell River at Caravan Park_2016-2017" "Pioneer River at Dumbleton Weir Headwater_2018-2019" "Proserpine River at Glen Isla_2018-2019" "Sandy Creek at Homebush_2018-2019" …
CODE:
for(i in 1:length(dat)){
png(filename = paste("N:/Projects and project proposals/", dat$SITENAME[i], ".png", sep=""), width = 1500, height = 1000)
print({pesticidePlot <- ggplot(dat, aes(DATE, Daily.Ave.PAF)) +
geom_point(aes(colour = Daily.Ave.PAF)) +
scale_colour_gradientn(colours=c("dark green","yellow","orange", "red"),
breaks=c(5,10,20), labels=format(c("5", "10", "20"))) +
facet_wrap(~Sampling.Year, ncol = 1,scales="free") +
labs(x = "Month", y = "Total PAF (% affected)") +
scale_x_date(breaks = "1 month", labels = date_format("%B")) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))})
dev.off()
}
This code can help you. I have used the data you included (Just define a directory to save the plots):
library(tidyverse)
#Data
dat <- read.csv('Sample.csv', stringsAsFactors = FALSE)
dat$DATE <- as.Date(dat$DATE,'%d/%m/%Y')
#Create a list
List <- split(dat,dat$SITENAME)
#Function for plots
myplot <- function(x)
{
pesticidePlot <- ggplot(x, aes(DATE, Daily.Ave.PAF)) +
geom_point(aes(colour = Daily.Ave.PAF)) +
scale_colour_gradientn(colours=c("dark green","yellow","orange", "red"),
breaks=c(5,10,20), labels=format(c("5", "10", "20"))) +
facet_wrap(~Sampling.Year, ncol = 1,scales="free") +
labs(x = "Month", y = "Total PAF (% affected)") +
scale_x_date(breaks = "1 month", labels = scales::date_format("%B-%y")) +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))+
ggtitle(unique(x$SITENAME))
return(pesticidePlot)
}
#Create plots
List2 <- lapply(List,myplot)
#Export
namesvec <- paste0(names(List2),'.png')
mapply(ggsave, List2,filename=namesvec,width = 15,units = 'cm')
That code will create the following plots:
You can modify myplot if you need more customized plots.
Here is a solution that will save the plots created in a lapply loop. The files are then written in another loop, this time with mapply.
In the example below the files are saved in the working directory, change this at will.
library(ggplot2)
SITENAME_plot <- function(X){
ggplot(X, aes(DATE, Daily.Ave.PAF)) +
geom_point(aes(colour = Daily.Ave.PAF)) +
scale_colour_gradientn(colours=c("dark green","yellow","orange", "red"),
breaks=c(5,10,20), labels=format(c("5", "10", "20"))) +
labs(x = "Month", y = "Total PAF (% affected)") +
scale_x_date(breaks = "1 month", labels = scales::date_format("%B")) +
facet_wrap(~Sampling.Year, ncol = 1, scales = "free") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
}
SITENAME_plot_write <- function(name, g, dir = "N:/Projects and project proposals"){
flname <- file.path(dir, name)
flname <- paste0(flname, ".png")
png(filename = flname, width = 1500, height = 1000)
print(g)
dev.off()
flname
}
dat$DATE <- as.Date(dat$DATE, format = "%d/%m/%Y")
sp <- split(dat, dat$SITENAME)
gg_list <- sapply(sp, SITENAME_plot, simplify = FALSE)
mapply(SITENAME_plot_write, names(gg_list), gg_list, MoreArgs = list(dir = getwd()))
rm(sp) # final clean-up
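A likely reason the loop in the question draws four panels for every site: each iteration plots the whole of dat, and 1:length(dat) counts the columns of the data frame, not the sites. A minimal base-R sketch with made-up data (dat_toy and its values are assumptions, not the asker's data) shows both points and why splitting by SITENAME fixes them:

```r
# Toy data: Site A was sampled in 2 years, Site B in 3
dat_toy <- data.frame(
  SITENAME      = c("Site A", "Site A", "Site B", "Site B", "Site B"),
  Sampling.Year = c("2016-2017", "2017-2018",
                    "2016-2017", "2017-2018", "2018-2019"),
  Daily.Ave.PAF = c(1.2, 3.4, 5.6, 7.8, 9.1),
  stringsAsFactors = FALSE
)

# length() of a data frame is its number of columns, so a loop over
# 1:length(dat_toy) runs once per column, not once per site
n_iterations <- length(dat_toy)

# Splitting first gives one data frame per site; each contains only
# the sampling years that site actually has
by_site <- split(dat_toy, dat_toy$SITENAME)
years_per_site <- sapply(by_site, function(d) length(unique(d$Sampling.Year)))
```

Because each element of by_site holds a single site, facet_wrap(~Sampling.Year) inside the plotting function produces only as many panels as that site has sampling years.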
