How to change x-axis from weekday names into dates? - r

I'm having trouble getting my plot to display dates (e.g. 23/01) instead of weekday names (e.g. Thu). My dataset consists of dates and measurements of bat activity. I've converted the 'Date' column of my data with as.Date using the format "%d.%m.%y", but whenever I plot my graph I get weekday names instead of dates.
My code looks like this:
rdate<-as.Date(df,"%d.%m.%Y")
plot(df$Afromontane)
My plot ends up looking like this (below). It's all fine except I'd like the weekday names to be dates in the format (d/m).
df looks like this:
structure(list(Date = c("23.01.20", "24.01.20", "25.01.20", "26.01.20",
"27.01.20", "28.01.20", "29.01.20"), Afromontane = c(13.67, 0,
0, 1.67, 3.67, 22, 3.33), Milkwood = c(8.33, 3.67, 8, 8.33, 4.33,
6.33, 1)), row.names = c(NA, -7L), class = c("tbl_df", "tbl",
"data.frame"))

A minimal example using ggplot2:
library(ggplot2)
df = data.frame(date = sample(seq(as.Date('2001/01/01'), as.Date('2003/01/01'), by="day"), 10), x = runif(10, 1, 10))
df$shortdate <- format(df$date, format="%m-%d")
ggplot(df, aes(x = shortdate, y = x)) +
geom_point()
Alternatively, using base R:
df = data.frame(date = sample(seq(as.Date('2001/01/01'), as.Date('2003/01/01'), by="day"), 10), x = runif(10, 1, 10))
plot(as.Date(df$date), df$x,xaxt = "n", type = "p")
axis(1, df$date, format(df$date, "%m-%d"))
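Applied to the data from the question, the same idea might look like the sketch below. It assumes the Date column holds day.month.two-digit-year strings, so it is parsed with lowercase %y (uppercase %Y expects a four-digit year), and the labels are formatted as d/m. Only the Date and Afromontane columns are reproduced here.

```r
# Data copied from the question (Milkwood column omitted for brevity)
df <- data.frame(
  Date = c("23.01.20", "24.01.20", "25.01.20", "26.01.20",
           "27.01.20", "28.01.20", "29.01.20"),
  Afromontane = c(13.67, 0, 0, 1.67, 3.67, 22, 3.33)
)

# Convert the column itself; %y parses the two-digit year
df$Date <- as.Date(df$Date, format = "%d.%m.%y")

# Base R: suppress the default axis, then draw one labelled as day/month
plot(df$Date, df$Afromontane, xaxt = "n", type = "b",
     xlab = "Date", ylab = "Afromontane")
axis(1, at = df$Date, labels = format(df$Date, "%d/%m"))

# ggplot2: keep the column as Date and format the labels on the scale
library(ggplot2)
p <- ggplot(df, aes(x = Date, y = Afromontane)) +
  geom_point() +
  geom_line() +
  scale_x_date(date_labels = "%d/%m", date_breaks = "1 day")
print(p)
```

Keeping the column as a Date (rather than a formatted string) preserves the chronological ordering on the axis.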

Related

How to remove the last x zeros from a column dataframe in r?

My data frame looks like this:
dput(tree)
structure(list(date = c(2.0220409e+13, 2.022041e+13, 2.0220411e+13,
2.0220412e+13, 2.0220413e+13, 2.0220414e+13, 2.0220415e+13, 2.0220416e+13,
2.0220417e+13, 2.0220418e+13), N = c(1, 2, 3, 4, 5, 6, 7, 8,
9, 10), NDVI = c(0.7192, 0.7034, 0.689, 0.6761, 0.6646, 0.6545,
0.6458, 0.6386, 0.6299, 0.6231)), class = "data.frame", row.names = c(NA,
-10L))
In the column date I want to remove the last 6 zeros (which are repeated for all entries), how can I do that?
Any help is much appreciated.
Maybe like this?
options(scipen = 999)
library(dplyr)
tree |>
dplyr::mutate(across(date, ~.x/1000000))
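A base R variant of the same idea, as a rough sketch: integer division by 1e6 drops the six trailing zeros. The last step assumes the remaining digits encode a date as YYYYMMDD, which the question does not state explicitly; the column names date_short and date_parsed are made up for illustration.

```r
# First three rows of the data from the question
tree <- data.frame(
  date = c(2.0220409e+13, 2.022041e+13, 2.0220411e+13),
  N = c(1, 2, 3)
)

# Integer division removes the last six digits
tree$date_short <- tree$date %/% 1e6

# Assumption: the remaining eight digits are YYYYMMDD
tree$date_parsed <- as.Date(sprintf("%.0f", tree$date_short), format = "%Y%m%d")

print(tree)
```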

how to make a plot to show start and end days

I have a df that looks like this:
Sample data can be built with this code:
df<-structure(list(ID = c(101, 101, 101, 101, 101, 101), AEDECOD = c("Diarrhoea",
"Vitreous floaters", "Musculoskeletal pain", "Diarrhoea", "Decreased appetite",
"Fatigue"), AESTDY = structure(c(101, 74, 65, 2, 33, 27), class = "difftime", units = "days"),
AEENDY = structure(c(105, 99, NA, 5, NA, NA), class = "difftime", units = "days")), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
I would like to make a plot that looks like following:
Sorry for the blurry plot. This is the closest one that I could find. Could someone give me some guidance on how to make such a plot?
Thanks.
With ggplot2, using Unicode's black right-pointing (U+25BA) and left-pointing (U+25C4) pointer characters for the start and end markers:
library(dplyr) # provides %>%
library(ggplot2)
df %>%
ggplot(aes(y = AEDECOD, yend = AEDECOD, x = AESTDY, xend = AEENDY)) +
geom_point(aes(x = AESTDY), shape = "\u25BA") +
geom_point(aes(x = AEENDY), shape = "\u25C4") +
geom_segment()
This might get you started.
There are issues about what to do with or how to interpret NAs - this approach just ignores them - you do not get a line.
Start days are indicated by a point.
library(dplyr)
library(tidyr)
library(stringr)
library(ggplot2)
df1 <-
df %>%
mutate(across(ends_with("DY"), ~ as.numeric(str_extract(.x, "\\d+"))))
ggplot(df1)+
geom_segment(aes(y = AEDECOD, yend = AEDECOD, x = AESTDY, xend = AEENDY))+
geom_point(data = filter(df1, is.na(AEENDY)), aes(y = AEDECOD, x = AESTDY))
#> Warning: Removed 3 rows containing missing values (geom_segment).
Created on 2021-04-12 by the reprex package (v2.0.0)
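If the missing end days mean the event was still ongoing, one possible treatment is to draw those segments up to a cut-off day and mark them differently. The sketch below is only one interpretation: the cut-off (last_day, taken as the latest observed end day) and the helper columns ongoing and end_plot are assumptions, not part of the original data. The day columns are entered as plain numbers here.

```r
library(dplyr)
library(ggplot2)

# Data from the question, with the day columns as plain numbers
df <- data.frame(
  AEDECOD = c("Diarrhoea", "Vitreous floaters", "Musculoskeletal pain",
              "Diarrhoea", "Decreased appetite", "Fatigue"),
  AESTDY = c(101, 74, 65, 2, 33, 27),
  AEENDY = c(105, 99, NA, 5, NA, NA)
)

# Assumption: events without an end day run until the last observed end day
last_day <- max(df$AEENDY, na.rm = TRUE)

df2 <- df %>%
  mutate(ongoing = is.na(AEENDY),
         end_plot = if_else(ongoing, last_day, AEENDY))

# Dashed segments flag events whose end day was imputed
p <- ggplot(df2, aes(y = AEDECOD, yend = AEDECOD,
                     x = AESTDY, xend = end_plot, linetype = ongoing)) +
  geom_segment() +
  geom_point()
print(p)
```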

How do you create a grouped barplot in R from only certain columns?

I have a data frame that looks like
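Role <- letters[1:3]
df <- data.frame(Role,
Female1 = c(1,4,2),
Male1 = c(3,0,0),
Female2 = c(3,5,3),
Male2 = c(1,3,0))
# Totals are added afterwards, since columns cannot reference each other inside data.frame()
df$FemaleTotal <- df$Female1 + df$Female2
df$MaleTotal <- df$Male1 + df$Male2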
I want to create a barplot grouped by Male/Female for each column category (1 and 2 in this example), stacked by Role, and also another plot with just the totals. For the totals alone I could use melt() and subset the data frame to only those columns, but that seems messy and doesn't help with the main plot I want to make.
An option would be to reshape to 'long' format
library(dplyr)
library(tidyr)
library(ggplot2)
df %>%
pivot_longer(cols = -Role, names_to = c( "group", '.value'),
names_sep="(?<=[a-z])(?=(\\d+|Total))") %>%
pivot_longer(-c(Role, group)) %>%
ggplot(aes(x = Role, y = value, fill = group)) +
geom_col() +
facet_wrap(~ name)
-output
data
df <- structure(list(Role = c("a", "b", "c"), Female1 = c(1, 4, 2),
Male1 = c(3, 0, 0), Female2 = c(3, 5, 3), Male2 = c(1, 3,
0), FemaleTotal = c(4, 9, 5), MaleTotal = c(4, 3, 0)), row.names = c(NA,
-3L), class = c("tbl_df", "tbl", "data.frame"))
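To see what the two reshaping steps produce before anything is plotted, the pipeline can be split up as in this sketch. The intermediate names by_group and long are made up for illustration; the data and the pivot_longer() calls are the ones shown above.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  Role = c("a", "b", "c"),
  Female1 = c(1, 4, 2), Male1 = c(3, 0, 0),
  Female2 = c(3, 5, 3), Male2 = c(1, 3, 0),
  FemaleTotal = c(4, 9, 5), MaleTotal = c(4, 3, 0)
)

# Step 1: split each column name into a group (Female/Male) and a
# category (1, 2, Total); ".value" keeps the categories as columns
by_group <- df %>%
  pivot_longer(cols = -Role, names_to = c("group", ".value"),
               names_sep = "(?<=[a-z])(?=(\\d+|Total))")

# Step 2: stack the category columns into name/value pairs for faceting
long <- by_group %>%
  pivot_longer(-c(Role, group))

print(by_group)
print(long)
```

Filtering long on name == "Total" (or name != "Total") before plotting would give the totals-only plot and the per-category plot separately.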

Adding a log scale on my graph is not working? 'Non-numeric argument to mathematical function'

I am using the package growthcurver to create a graph with a sigmoidal curve. It needs to have a logarithmic scale on the y axis.
This works to create a graph without a log scale:
# install.packages("growthcurver")
library("growthcurver")
gcfit <- SummarizeGrowth(curveA$time, curveA$biomass)
gcfit
plot(gcfit)
I have tried plot(gcfit, log=y) and plot(gcfit, log="curveA$biomass"). This gives me the error
'Non-numeric argument to mathematical function'.
Could it be that I am using a data frame? How do I get around this?
dput(curveA)
structure(list(time = c(1, 2, 3, 4, 5, 6, 7, 8, 17, 18, 19, 20,
21, 22, 23, 24, 25), biomass = c(0.153333333, 1.303333333, 2.836666667,
4.6, 6.21, 6.746666667, 7.283333333, 7.973333333, 8.663333333,
9.046666667, 10.19666667, 10.50333333, 11.04, 11.88333333, 11.96,
11.96, 9.966666667)), class = c("spec_tbl_df", "tbl_df", "tbl",
"data.frame"), row.names = c(NA, -17L), spec = structure(list(
cols = list(time = structure(list(), class = c("collector_number",
"collector")), biomass = structure(list(), class = c("collector_number",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1), class = "col_spec"))
Without the data that you are using it is difficult to reproduce the problem (how to produce reproducible example: How to make a great R reproducible example), but the easiest solution to your problem might be using log function directly on the data...
gcfit_log <- SummarizeGrowth(curveA$time, log10(curveA$biomass))
plot(gcfit_log)
A workaround is to extract the data from the fitted model and plot it with the ggplot2 package:
library(growthcurver) # provides SummarizeGrowth() and NAtT()
library(ggplot2)
library(dplyr)
gcfit <- SummarizeGrowth(curveA$time, curveA$biomass)
points_data <- bind_cols(t = gcfit$data$t, n = gcfit$data$N)
line_fit <- bind_cols(x = max(gcfit$data$t) * (1 : 30) / 30,
y = NAtT(gcfit$vals$k, gcfit$vals$n0, gcfit$vals$r,
max(gcfit$data$t) * (1 : 30) / 30))
ggplot(data = points_data, aes(t, n + 1)) +
geom_point() +
geom_line() +
geom_line(data = line_fit, aes(x, y + 1), color = "red") +
scale_y_log10()
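For reference, in base graphics the log argument of plot() takes a character string naming the axis, such as "y"; an unquoted y refers to an object called y, and "curveA$biomass" is not an axis name. The sketch below plots only the raw data from the question this way. Whether plot(gcfit, ...) passes log on to the underlying plot is not verified here.

```r
# Data from the question
curveA <- data.frame(
  time = c(1, 2, 3, 4, 5, 6, 7, 8, 17, 18, 19, 20, 21, 22, 23, 24, 25),
  biomass = c(0.153333333, 1.303333333, 2.836666667, 4.6, 6.21,
              6.746666667, 7.283333333, 7.973333333, 8.663333333,
              9.046666667, 10.19666667, 10.50333333, 11.04,
              11.88333333, 11.96, 11.96, 9.966666667)
)

# log = "y" (quoted) puts the y axis on a log scale; values must be positive
plot(curveA$time, curveA$biomass, log = "y",
     xlab = "time", ylab = "biomass (log scale)")
```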

Facets and multiple datasets in ggplot2

I need to display two datasets on the same faceted plots with ggplot2. The first dataset (dat) is to be shown as crosses like this:
While the second dataset (dat2) is to be shown as a color line. For an element of context, the second dataset is actually the Pareto frontier of the first set...
Both datasets (dat and dat2) look like this:
modu mnc eff
1 0.3080473 0 0.4420544
2 0.3110355 4 0.4633741
3 0.3334024 9 0.4653061
Here's my code so far:
library(ggplot2)
dat <- structure(list(modu = c(0.30947265625, 0.3094921875, 0.32958984375,
0.33974609375, 0.33767578125, 0.3243359375, 0.33513671875, 0.3076171875,
0.3203125, 0.3205078125, 0.3220703125, 0.28994140625, 0.31181640625,
0.352421875, 0.31978515625, 0.29642578125, 0.34982421875, 0.3289453125,
0.30802734375, 0.31185546875, 0.3472265625, 0.303828125, 0.32279296875,
0.3165234375, 0.311328125, 0.33640625, 0.3140234375, 0.33515625,
0.34314453125, 0.33869140625), mnc = c(15, 9, 6, 0, 10, 12, 14,
9, 5, 11, 0, 15, 0, 2, 14, 13, 14, 17, 11, 12, 13, 6, 4, 0, 13,
7, 10, 12, 7, 13), eff = c(0.492448979591836, 0.49687074829932,
0.49421768707483, 0.478571428571428, 0.493537414965986, 0.493809523809524,
0.49891156462585, 0.499319727891156, 0.495102040816327, 0.492285714285714,
0.482312925170068, 0.498911564625851, 0.479931972789116, 0.492857142857143,
0.495238095238095, 0.49891156462585, 0.49530612244898, 0.495850340136055,
0.50156462585034, 0.496, 0.492897959183673, 0.487959183673469,
0.495605442176871, 0.47795918367347, 0.501360544217687, 0.497850340136054,
0.493496598639456, 0.493741496598639, 0.496734693877551, 0.499659863945578
)), .Names = c("modu", "mnc", "eff"), row.names = c(NA, 30L), class = "data.frame")
dat2 <- structure(list(modu = c(0.26541015625, 0.282734375, 0.28541015625,
0.29216796875, 0.293671875), mnc = c(0.16, 0.28, 0.28, 0.28,
0.28), eff = c(0.503877551020408, 0.504149659863946, 0.504625850340136,
0.505714285714286, 0.508503401360544)), .Names = c("modu", "mnc",
"eff"), row.names = c(NA, 5L), class = "data.frame")
dat$modu = dat$modu
dat$mnc = dat$mnc*50
dat$eff = dat$eff
dat2$modu = dat2$modu
dat2$mnc = dat2$mnc*50
dat2$eff = dat2$eff
res <- do.call(rbind, combn(1:3, 2, function(ii)
cbind(setNames(dat[,c(ii, setdiff(1:3, ii))], c("x", "y")),
var=paste(names(dat)[ii], collapse="/")), simplify=F))
ggplot(res, aes(x=x, y=y))+ geom_point(shape=4) +
facet_wrap(~ var, scales="free")
How should I go about doing this? Do I need to add a layer? If so, how to do this in a faceted plot?
Thanks!
Here's one way:
pts <- do.call(rbind, combn(1:3, 2, function(ii)
cbind(setNames(dat[,c(ii, setdiff(1:3, ii))], c("x", "y")),
var=paste(names(dat)[ii], collapse="/")), simplify=F))
lns <- do.call(rbind, combn(1:3, 2, function(ii)
cbind(setNames(dat2[,c(ii, setdiff(1:3, ii))], c("x", "y")),
var=paste(names(dat2)[ii], collapse="/")), simplify=F))
gg.df <- rbind(cbind(geom="pt",pts),cbind(geom="ln",lns))
ggplot(gg.df,aes(x,y)) +
geom_point(data=gg.df[gg.df$geom=="pt",], shape=4)+
geom_path(data=gg.df[gg.df$geom=="ln",], color="red")+
facet_wrap(~var, scales="free")
The basic idea is to create separate data.frames for the points and the lines, then bind them together row-wise with an extra column (geom) indicating which geometry the data goes with. Then we plot the points based on the subset of gg.df with geom=="pt" and similarly with the lines.
The result isn't very interesting with your limited example, but this seems (??) to be what you want. Notice the use of geom_path(...) rather than geom_line(...). The latter orders the x-values before plotting.
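A variation on the same idea, sketched with made-up toy data: each layer can also be given its own data frame directly through its data argument, as long as both frames carry the faceting column (var here). The values in pts and lns below are hypothetical and only serve to show the mechanics.

```r
library(ggplot2)

# Hypothetical toy data; both frames share the faceting column `var`
pts <- data.frame(x = c(1, 3, 2, 2, 1, 3),
                  y = c(2, 1, 3, 1, 3, 2),
                  var = rep(c("modu/mnc", "modu/eff"), each = 3))
lns <- data.frame(x = c(3, 1, 2, 2, 3, 1),
                  y = c(1, 2, 3, 3, 1, 2),
                  var = rep(c("modu/mnc", "modu/eff"), each = 3))

p <- ggplot(pts, aes(x, y)) +
  geom_point(shape = 4) +                   # crosses from the first dataset
  geom_path(data = lns, colour = "red") +   # line from the second, in row order
  facet_wrap(~ var, scales = "free")
print(p)
```

Because geom_path() connects rows in the order they appear, the second data frame should already be sorted the way the line is meant to be drawn.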
