Decimal digits in `Slope Graph` with `ggplot2` - r

Following a former question I opened few weeks ago:
Slope Chart - ggplot2
I face another issue, concerning the numeric values reported in the graph. Even specifying the decimal digits I need (exactly 3) with any of the two commands:
y=round(y, digit = 3) at the endof the code
or
options(digits=3) at the beginning of the whole code
The graphical output, doesn't give me the desired number of digits but only concerning 0. In the graph, I wanted to have 0.800 (not 0.8) and 0.940 (not 0.94). It looks like it removes 0. Below the graphical output from R, I circled in red the number I intended to change.
Below the whole code:
library(dplyr)
library(ggplot2)
#options(digits=3)
theme_set(theme_classic())
#### Data
df <- structure(list(group = c("Ups", "Ups", "Ups", "Ups", "Ups"),
yshift = c(0, 0, 0, 0, 0), x = structure(1:5, .Label = c("1 day",
"2 days", "3 days", "5 days", "7 days"), class = "factor"),
y = c(0.108, 0.8, 0.94, 1.511, 1.905), ypos = c(0.10754145,
0.8, 0.94, 1.5111111, 1.90544651164516)), row.names = c(1L,
3L, 5L, 7L, 9L), class = "data.frame")
# Define functions. Source: https://github.com/jkeirstead/r-slopegraph
plot_slopegraph <- function(df) {
ylabs <- subset(df, x==head(x,1))$group
yvals <- subset(df, x==head(x,1))$ypos
fontSize <- 3
gg <- ggplot(df,aes(x=x,y=ypos)) +
geom_line(aes(group=group),colour="grey80") +
geom_point(colour="white",size=8) +
geom_text(aes(label=y), size=fontSize, family="American Typewriter") +
scale_y_continuous(name="", breaks=yvals, labels=ylabs)
return(gg)
}
## Plot
plot_slopegraph(df) + labs(title="Monomer content after days of heating")
I am making any mistake or missing something? Is there any other way to force 0 digits?
Thank you in advance for every eventual reply or comment.

I like the scales package functions for things like this (though you could certainly use formatC or sprintf instead).
I've modified plot_slopegraph to use label=scales::label_number(accuracy = 0.001)(y)) in the geom_text():
plot_slopegraph <- function(df) {
ylabs <- subset(df, x==head(x,1))$group
yvals <- subset(df, x==head(x,1))$ypos
fontSize <- 3
gg <- ggplot(df,aes(x=x,y=ypos)) +
geom_line(aes(group=group),colour="grey80") +
geom_point(colour="white",size=8) +
geom_text(aes(label=scales::label_number(accuracy = 0.001)(y)), size=fontSize, family="American Typewriter") +
scale_y_continuous(name="", breaks=yvals, labels=ylabs)
return(gg)
}
plot_slopegraph(df)

Related

X-axis for times

I am trying to generate a series of plots that show the same patient taking drinks and urinating at different times. Each plot represents a single day. I want to compare the days and hence I need to ensure that all graphs plotted have the same x-axis. My code is below which I cribbed from How to specify the actual x axis values to plot as x axis ticks in R
### Data Input
time_Thurs <- c("01:10", "05:50", "06:00","06:15", "06:25", "09:35", "10:00", "12:40",
"14:00", "17:20", "18:50", "19:10", "20:10", "21:00", "22:05", "22:35")
event_Thurs <- c("u", "u", "T", "T", "u", "u", "T","T","u", "u", "T", "T", "T", "T", "u", "W")
volume_Thurs <- c(NA, NA, 0.25, 0.25, NA, NA, 0.125, 0.625, NA, NA, 0.25, 0.25, 0.25, 0.25,
NA, 0.25)
total_liquids_Thurs <- sum(volume_Thurs, na.rm=TRUE)
time_Thurs <- paste("04/04/2019", time_Thurs, sep=" ")
time_Fri <- c("01:15", "06:00", "06:10", "06:25", "06:30", "07:10", "08:40", "09:20",
"12:45", "13:45")
event_Fri <- c("u","u", "T","T","u","uu","T", "u", "T", "u")
volume_Fri <- c(NA, NA, 0.25, 0.25, NA, NA, 0.125, NA, 0.625, NA)
total_liquids_Fri <- sum(volume_Fri, na.rm=TRUE)
time_Fri <- paste("05/04/2019", time_Fri, sep=" ")
### Collect all data together
event <- c(event_Thurs, event_Fri)
Volume <- c(volume_Thurs, volume_Fri)
time_log <- c(time_Thurs, time_Fri)
time_log <- strptime(time_log, format = "%d/%m/%Y %H:%M")
time_view <- format(time_log, "%H:%M")
### Put into Dataframe
patient_data <- data.frame(time_log, time_view, event, Volume)
# write.csv(patient_data, file="patient_data.csv", row.names = FALSE)
daily_plot <- function(x, day) {
# x patient data - a data.frame with four columns:
# POSIXct time, time, event and Volume
# date number of day of month
# y volume of liquid
# TotVol total volume of intake over week
# Event - drink or otherwise
x <- x[as.numeric(format(x[,1], "%d")) == day, ]
TotVol <- sum(x[,4], na.rm = TRUE)
DayOfWeek <- weekdays(x[1,1], abbreviate = FALSE)
plot(x[,1],x[,4],
xlim = c(x[1,1],x[length(x[,1]),1]),
xlab="Hours of Study", ylab = "Volume of Liquid Drank /L",
main = paste("Total Liquids Drank = ", TotVol, " L on ", DayOfWeek, "Week 1, Apr 2019"),
sub = "dashed red line = urination", pch=16,
col = c("black", "yellow", "green", "blue")[as.numeric(x[,3])],
xaxt = 'n'
)
xAxis_hrs <- seq(as.POSIXct(x[1,1]), as.POSIXct(x[length(x[,1]),1]), by="hour")
axis(1, at = xAxis_hrs, las = 2)
abline( v = c(x[x[,3] == "u",1]), lty=3, col="red")
}
When I run the function,
daily_plot(patient_data, 4)
I want to print out my x-axis, as amended in the form of hours representing the events over the 24 hour period.
When I wrap my xAxis_hrs vector in strptime(xAxis_hrs, format = "%H") the code crashes - that is the x-axis doesn't print out and I see, Error in axis(1, at = xAxis_hrs, las = 2) : (list) object cannot be coerced to type 'double' . Any help?
The issue is that you pass the labels to the wrong named argument, namely at (which should be the numeric positions of the labels). Use the following instead:
axis(1, at = xAxis_hrs, labels = strptime(xAxis_hrs, format = "%H"), las = 2)
Unfortunately this doesn’t change the fact that the axis labels don’t fit into the plot, and collide with the axis title. The former can be fixed by adjusting the plot margins. I’m not aware of a good solution for the latter, although changing the time format might help: it’s probably not necessary/helpful to print the full minutes and seconds (which are always 0). In fact, did you mean to use format instead of strptime?
Apart from that I fundamentally agree with the other answer recommending ggplot2 in the long run. It makes this kind of stuff a lot less painful.
If you're open to a ggplot solution:
library(tidyverse)
library(lubridate)
daily_ggplot <- function(df, selected_day) {
df_day <- filter(df, day(time_log) == selected_day)
df_urine <- filter(df_day, event == "u")
df_drink <- filter(df_day, event != "u")
TotVol <- sum(df_day$Volume, na.rm = TRUE)
Date <- floor_date(df_day$time_log[1], 'days')
DayOfWeek <- weekdays(Date, abbreviate = F)
plot_title <- paste0("Total drank = ", TotVol, "L on ", DayOfWeek, " Week 1, Apr 2018")
ggplot(df_drink) +
aes(time_log, Volume, color = event) +
geom_point(size = 2) +
geom_vline(data = df_urine, aes(xintercept = time_log), color = "red", linetype = 3) +
labs(x = "Hours of Study", ylab = "Volume of Liquid Drank (L)",
title = plot_title, subtitle = "lines = urination") +
theme_bw() +
scale_x_datetime(date_labels = "%H:%M", limits = c(Date, Date + days(1)))
}
daily_ggplot(patient_data, 4)

Adding vline to time-series ggplot collapses X asis?

I have ts data that I draw with quarterly legend:
z <- as.zoo(my_data)
breaks <- seq(min(time(z)), max(time(z)), .25);
autoplot(z, geom="line",ylim=c(0,75)) + scale_x_yearqtr(breaks = breaks, format = "%yQ%q")
I'd like to draw a vertical line at a predetermined position (let's say at 1975-08-01).
The problem is, then I add "+ geom_vline", I get a really weird collapsed chart. Obviously, I have no true idea what I'm doing but I've tried this:
+ geom_vline(xintercept=as.Date("1975-08-01"))
+ geom_vline(xintercept=as.numeric(as.Date("1975-08-01")))
and as offsets (not sure how it goes):
+ geom_vline(xintercept=as.numeric(z[c(10,11)]))
+ geom_vline(xintercept=as.numeric(z[10]))
+ geom_vline(xintercept=3)
This is what happens (without geom_vline it is OK):
How can I put an offset ("draw vline at datapoint X") or data ("1975-08-01")?
What am I doing wrong?
Adding some data.
dput(z)
structure(c(NA, NA, NA, 56.0775, 58.53, 58.17, 61.5025, 57.71,
56.5075, 53.9375, 47.345, 48.6975, 53.15, 60.3125, 60.2, 65.1025,
63.445, 57.86, 62.1225, 62.19, 64.075, 71.7725, 69.565, 63.4575000000001,
59.2175, 53.8525, 53.4175, 50.1475, 50.9, 50.0675, 52.6925, 59.9325,
59.8625, 61.8375, 57.655, 50.23, 47.8775, 39.5475, 40.1375, 43.2075,
44.885, 48.115), index = structure(c(1974, 1974.08333333333,
1974.16666666667, 1974.25, 1974.33333333333, 1974.41666666667,
1974.5, 1974.58333333333, 1974.66666666667, 1974.75, 1974.83333333333,
1974.91666666667, 1975, 1975.08333333333, 1975.16666666667, 1975.25,
1975.33333333333, 1975.41666666667, 1975.5, 1975.58333333333,
1975.66666666667, 1975.75, 1975.83333333333, 1975.91666666667,
1976, 1976.08333333333, 1976.16666666667, 1976.25, 1976.33333333333,
1976.41666666667, 1976.5, 1976.58333333333, 1976.66666666667,
1976.75, 1976.83333333333, 1976.91666666667, 1977, 1977.08333333333,
1977.16666666667, 1977.25, 1977.33333333333, 1977.41666666667
), class = "yearmon"), frequency = 12, class = c("zooreg", "zoo"))
The index class of z has class `"yearmon"
class(index(z))
## [1] "yearmon"
so the xintercept= should be specified consistently, i.e. also as a "yearmon" object:
p <- autoplot(z, ylim=c(0,75)) +
scale_x_yearqtr(breaks = breaks, format = "%yQ%q")
p + geom_vline(xintercept = as.yearmon("1975-08"))
Any other valid specification of a "yearmon" object would work as well, e.g.
p + geom_vline(xintercept = as.yearmon(1975 + (8-1) / 12))
p + geom_vline(xintercept = as.yearmon(as.Date("1975-08-01")))
As bVa pointed out, I used dput to see the format of index.
As dates are stored in decimal, the solution is to use simple decimal value. 1975.67 for aug 1975.
geom_vline(xintercept=as.numeric(1975.67))

data of class numeric error when plotting vertical and horizontal lines in ggplot

iarray <- iv$iarray
varray <- iv$varray
n<-gsub("^\\{+(.+)\\}+$", '\\1', iarray)
n1 <- strsplit(n,",")
n1 <- unlist(n1)
n1 <- as.numeric(n1)
df <- as.data.frame(n1)
n<-gsub("^\\{+(.+)\\}+$", '\\1', varray)
n2 <- strsplit(n,",")
n2 <- unlist(n2)
n2 <- as.numeric(n2)
df <- cbind(df,n2)
vmpp <-iv$vmpp
impp <- iv$impp
print(impp)
print(vmpp)
})
output$ivcurve <- renderPlot({
ggplot(data3(), aes(x=n2, y= n1)) + geom_line(colour='blue')+ geom_vline(xintercept = vmpp)+ geom_hline(yintercept = impp) + scale_y_continuous(limits = c(-1, 11))
Basically I'm trying to draw an IV curve from the above code.
As seen in the photo I need a horizontal and a vertical line.
But after I added the geom_vline function it gives me the Error : ggplot2 doesn't know how to deal with data of class numeric
iv is a dataframe and iarray and varray basically looks like this.
iarray = "{9.467182035,9.252423958,9.179368178,9.142931845}"
varray = "{-1.025945126,-0.791203874,-0.506481774,-0.255416444}"
And vmpp and impp are basically numbers as 8.5 and 20
suggestions?
P.s :
dput(iv)
structure(list(id = 3L, seris_id = "SERTPTR0003", module_id = 2L,
isc = 9.1043, voc = 37.61426, impp = 8.524, vmpp = 30.0118,
pmpp = 255.8095, unique_halm_id = 4414L, iarray = "{9.471385758,9.251831868,9.174032904,9.135095327,9.109244512,9.087563112,9.081257993,9.079282455,9.078209387,9.077396672,9.076717653,9.076285598,9.075914058,9.075549594,9.075098675,9.074659768,9.074080201,9.073659578,9.073411255,9.073349331,9.073215686,9.073189667,9.073011759,9.072868405,9.072659064,9.072636165,9.072659725,9.072729724,9.072779321,9.072915415,9.072951718,9.072855259,9.072758863,9.072562734,9.072286497,9.072036161,9.071858009,9.07165223,9.071458902,9.071172024,9.070818323,9.070364851,9.069865071,9.069392026,9.069058847,9.068673155,9.068486996,9.0684006,9.068241175,9.067848351,9.067533806,9.066886103,9.066177782,9.0655086,9.065025577,9.064457111,9.064154995,9.063866251,9.063564149,9.063221961,9.06295813,9.062580288,9.062182005,9.06179715,9.061378517,9.060847632,9.06033015,9.059686156,9.058814993,9.057817299,9.056732355,9.055534236,9.054389596,9.05351149,9.052819766,9.052254696,9.051816304,9.051431465,9.051000987,9.050664797,9.050589584,9.050615635,9.050795719,9.051096084,9.05121704,9.050958132,9.050478383,9.049724325,9.048695951,9.047619756,9.046715916,9.04602525,9.045615278,9.045512729,9.045617691,9.045803509,9.045989974,9.046083526,9.045997615,9.045871618,9.045772357,9.045599926,9.045340971,9.045082036,9.04473025,9.044178732,9.043440888,9.042642632,9.04185002,9.041056695,9.040316091,9.039781509,9.039426971,9.039199774,9.039026035,9.038805897,9.038478843,9.037978051,9.037190302,9.036262611,9.035408047,9.034687132,9.03411323,9.033759457,9.033445779,9.033105372,9.032611665,9.031991392,9.031298017,9.030631384,9.029991493,9.02931152,9.028518372,9.027678053,9.026644378,9.025384369,9.023971135,9.022443918,9.020510444,9.018469233,9.015987042,9.013123551,9.009951782,9.006524239,9.002508657,8.99806541,8.993200713,8.987509287,8.980851319,8.97337198,8.964883202,8.955065215,8.944015742,8.931773812,8.91796823,8.902911552,8.886450605,8.868452754,8.848678419,8.827119435,8.80336248,8.777313996,8.748941051,8.718309497,8.685225063,8.649388501,8.610785476,8.569040812,8.52363426,8.474699468,8.422382481,8.366516735,8.307103187,8.244481209,8.178090447,8.10779633,8.033345875,7.954744415,7.871665908,7.784296593,7.692116999,7.595199333,7.493377787,7.386704971,7.275055109,7.158981607,7.038484468,6.913650942,6.784728642,6.651977027,6.515069048,6.374111623,6.228897233,6.079031999,5.924669253,5.766323899,5.604063459,5.43841477,5.26939121,5.096619936,4.919752772,4.738936722,4.554312451,4.366039658,4.174017769,3.978461295,3.779470133,3.576724216,3.370764477,3.162238756,2.951119622,2.737359938,2.521133452,2.302407806,2.08132299,1.858467726,1.632539296,1.397202225,1.149523324,0.890812319,0.62251893,0.349040094,0.084409259,-0.164612445,-0.4001423,-0.625408177,-0.844927296,-1.067373925,-1.297998987,-1.536777099,-1.782558235,-2.033692207,-2.28906274,-2.54694712,-2.806836154,-3.068463186,-3.331653821,-3.596227332,-3.862303417,-4.129421924,-4.397321356,-4.666082505,-4.935632162,-5.206170796,-5.478105728,-5.751638617,-6.027203502,-6.304753878,-6.584235675,-6.865027697,-7.146774939,-7.428922534,-7.711971427,-7.995982555,-8.281623641,-8.569128828,-8.85847189,-9.14887768,-9.440152159,-9.731968139,-10.02382391,-10.315645796,-10.608918155,-10.906228043,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0}",
varray = "{-1.055634971,-0.820094649,-0.530478984,-0.277519378,-0.049665975,0.168173928,0.369832037,0.557189853,0.73806136,0.918444007,1.100988955,1.285835111,1.471379381,1.656087228,1.83947039,2.021804885,2.204138782,2.387586314,2.572217234,2.757544476,2.943083961,3.127927125,3.311936258,3.49497066,3.677517995,3.860273388,4.043516446,4.227247167,4.411953813,4.597148124,4.781785019,4.965795342,5.149247651,5.331933288,5.514618924,5.698279889,5.882706331,6.067759345,6.253369179,6.43883951,6.623542572,6.807967224,6.991834459,7.175283184,7.359219574,7.543715171,7.727930567,7.91102934,8.092315166,8.270881273,8.44728269,8.622987785,8.800575578,8.981370755,9.16634984,9.355446065,9.546982405,9.738937256,9.930334092,10.119987137,10.30698723,10.492242934,10.676242509,10.859335313,11.042009008,11.224684494,11.40735998,11.589966311,11.77250289,11.955040067,12.138134659,12.321927365,12.506347836,12.691464628,12.877626501,13.06357852,13.248553416,13.432761044,13.616063692,13.798391012,13.981067688,14.165071441,14.350401073,14.536497974,14.722526917,14.907441627,15.090542195,15.272247132,15.453463208,15.634886746,15.817842429,16.00302897,16.188982779,16.375285347,16.561309505,16.745870074,16.92840904,17.110739948,17.293002899,17.47610287,17.660388619,17.84523298,18.029659429,18.213737119,18.397257992,18.580849811,18.765279846,18.950546303,19.136368979,19.322329963,19.507382979,19.691527431,19.874834264,20.057444179,20.239565832,20.422106592,20.605414621,20.789699174,20.974891096,21.161129891,21.347647095,21.533745789,21.718937711,21.903640773,22.087576567,22.271163603,22.454611733,22.638548722,22.822346805,23.006144888,23.189873219,23.373392294,23.556911967,23.740989056,23.925624756,24.110540657,24.296084919,24.481558832,24.666267268,24.850416495,25.034286715,25.217528572,25.400211222,25.582891485,25.765153238,25.947554494,26.130303912,26.31319522,26.495810505,26.678077627,26.859158373,27.03919344,27.2183891,27.396813913,27.574889968,27.752475972,27.92950277,28.106320312,28.283627309,28.460794206,28.638101203,28.814990286,28.99097379,29.16472524,29.336732302,29.506996765,29.675866194,29.843757906,30.011789121,30.179820337,30.347501601,30.514694006,30.680631478,30.844614709,31.006505389,31.166790586,31.32526224,31.482546323,31.638644628,31.793695462,31.947491958,32.10059034,32.252713392,32.40427843,32.555147743,32.705110284,32.853817294,33.001129867,33.14656034,33.29010692,33.431979459,33.572526716,33.711957348,33.85055096,33.988306953,34.125295678,34.260890565,34.395089819,34.528311952,34.660559355,34.792040088,34.923102908,35.053748414,35.1835569,35.312946278,35.441359731,35.568936163,35.695816274,35.821859963,35.946717873,36.071016573,36.194685715,36.317446291,36.439648854,36.561153301,36.681402816,36.801235017,36.924136292,37.050595501,37.180333637,37.313071096,37.446575824,37.573873847,37.693009727,37.80489203,37.910636184,38.012195237,38.11396235,38.218102215,38.324333435,38.432308447,38.541817399,38.652023867,38.762158191,38.872501769,38.982844151,39.092768022,39.202552988,39.31199099,39.420452469,39.528287977,39.635566071,39.742146647,39.848308713,39.954401627,40.060004483,40.165396887,40.270857247,40.376317008,40.481775571,40.587234732,40.692485836,40.797320225,40.901805855,41.005874173,41.109873339,41.21366265,41.317242107,41.420542558,41.523772657,41.626762446,41.729471102,41.832232456,41.937530675,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0}"), .Names = c("id",
"seris_id", "module_id", "isc", "voc", "impp", "vmpp", "pmpp",
"unique_halm_id", "iarray", "varray"), row.names = 1L, class = "data.frame")
Solved it. since the ggplot requires data frames. I just added two more columns to the data frame and added them there.
vmpp <- iv$vmpp
df <- cbind(df,vmpp)
impp <- iv$impp
df <- cbind(df,impp)
print(df)
})
output$ivcurve <- renderPlot({
ggplot(data3(), aes(x=n2, y= n1)) + geom_line(colour='blue')+ scale_y_continuous(limits = c(-1, 11))+ geom_vline(aes(xintercept = vmpp))+ geom_hline(aes(yintercept = impp))

calendar heat map tetris chart

So I was reading this post and I fell a little in love with the calendar heat map with Tetris-style month breaks.
However, the ggplot example doesn't implement the Tetris breaks, which are arguably the best part.
So, FTFY, gist here:
The procedure for this is:
create appropriate Tetris breaks for your data
left_join your data to the Tetris breaks created in (1)
pump the above through ggplot with some specially crafted geoms
The methodology for (1) is reasonably straightforward, implemented in the calendar_tetris_data(...) function in the gist, though it would be nice to make it a little more flexible.
My question is mainly around (3): how do I bundle up the 7 geoms necessary to make the breaks into a single procedure or geom?
If I do this:
calendar_tetris_geoms <- function() {
geom_segment(aes(x=x, xend=x, y=ymin, yend=ymax)) + # (a)
geom_segment(aes(x=xmin, xend=xmax, y=y, yend=y)) + # (b)
geom_segment(aes(x=dec.x, xend=dec.x, y=dec.ymin, yend=dec.ymax)) + # (c)
geom_segment(aes(x=nye.xmin, xend=nye.xmax, y=nye.y, yend=nye.y)) + # (d)
geom_segment(x=-0.5, xend=51.5, y=7.5, yend=7.5) + # put a line along the top
geom_segment(x=0.5, xend=52.5, y=0.5, yend=0.5) + # put a line along the bottom
geom_text(aes(x=month.x, y=month.y, label=month.l), hjust=0.25) # (e)
}
And then try to add that to my ggplot, it doesn't work:
> ggplot(data) + calendar_tetris_geoms()
Error in calendar_tetris_geoms() :
argument "plot" is missing, with no default
I clearly don't understand how this works. How does this work?
Modifying #baptiste's suggestion, if I do this:
calendar_tetris_geoms <- function() {
list(
geom_segment(aes(x=x, xend=x, y=ymin, yend=ymax)), # (a)
geom_segment(aes(x=xmin, xend=xmax, y=y, yend=y)), # (b)
geom_segment(aes(x=dec.x, xend=dec.x, y=dec.ymin, yend=dec.ymax)), # (c)
geom_segment(aes(x=nye.xmin, xend=nye.xmax, y=nye.y, yend=nye.y)), # (d)
geom_segment(x=-0.5, xend=51.5, y=7.5, yend=7.5), # put a line along the top
geom_segment(x=0.5, xend=52.5, y=0.5, yend=0.5), # put a line along the bottom
geom_text(aes(x=month.x, y=month.y, label=month.l), hjust=0.25) # (e)
)
}
Then this works a treat:
calendar_tetris_data(min(stock.data$date), max(stock.data$date)) %>%
left_join(stock.data) %>%
ggplot() +
geom_tile(aes(x=week, y=wday2factor(wday), fill = Adj.Close), colour = "white") +
calendar_tetris_geoms() +
facet_wrap(~ year, ncol = 1)
Update 2019-08-06 - Pulling everything into one post to make a Tetris Calendar Heat Map
Sample date data.
This is a stand in for your date data.
mydatedata<-as.Date(paste(sample(c(2018:2019), 3000, replace = TRUE), # year
sample(c(1:12), 3000, replace = TRUE), # month
sample(c(1:28), 3000, replace = TRUE), # day
sep="-"))
Create a data frame summarizing your data
Replace mydatedata with your df$date field.
newdf<-as.data.frame(table(mydatedata), stringsAsFactors = FALSE);
names(newdf)<-c("date", "n")
newdf$date<-as.Date(newdf$date, format='%Y-%m-%d')
Create Calendar Tetris Data Functions
Note: I created a weekday label, renamed several functions to avoid name collision, and moved the the helper functions inside the main function.
Original source links:
1) https://gist.github.com/dvmlls/5f46ad010bea890aaf17
2) calendar heat map tetris chart
calendar_tetris_data <- function(date_min, date_max) {
year2 <- function(d) as.integer(format(d, '%Y'))
wday2 <- function(d) {
n <- as.integer(format(d, '%u'))
ifelse(n==7, 0, n) + 1 # I want the week to start on Sunday=1, so turn 7 into 0.
}
wday2factor <- function(wd) factor(wd, levels=1:7, labels=c('Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday'))
week2 <- function(d, year) {
# If January 1st is a Sunday, my weeks will start from 1 instead of 0 like the rest of them.
nyd <- as.Date(ISOdate(year, 1, 1))
# So if that's the case, subtract 1.
as.integer(format(d, '%U')) - ifelse(wday2(nyd) == 1, 1, 0)
}
start <- as.Date(ISOdate(year2(min(date_min)),1,1))
end <- as.Date(ISOdate(year2(max(date_max)), 12, 31))
all.dates <- start + 0:as.integer(end - start, units='days')
data.frame(date=all.dates) %>% tbl_df %>%
mutate(
wday=wday2(date),
year=year2(date),
month=as.integer(format(date, '%m')),
week=week2(date, year),
day=as.integer(format(date, '%d')),
weekday=wday2factor(wday), #20190806, adding weekday label
# (a) put vertical lines to the left of the first week of each month
x=ifelse(day <= 7, week - 0.5, NA),
ymin=ifelse(day <= 7, wday - 0.5, NA),
ymax=ifelse(day <= 7, wday + 0.5, NA),
# (b) put a horizontal line at the bottom of the first of each month
y=ifelse(day == 1, wday - 0.5, NA),
xmin=ifelse(day == 1, week - 0.5, NA),
xmax=ifelse(day == 1, week + 0.5, NA),
# (c) in december, put vertical lines to the right of the last week
dec.x=ifelse(month==12 & day >= 25, week + 0.5, NA),
dec.ymin=ifelse(month==12 & day >= 25, wday - 0.5, NA),
dec.ymax=ifelse(month==12 & day >= 25, wday + 0.5, NA),
# (d) put a horizontal line at the top of New Years Eve
nye.y=ifelse(month==12 & day == 31, wday + 0.5, NA),
nye.xmin=ifelse(month==12 & day == 31, week - 0.5, NA),
nye.xmax=ifelse(month==12 & day == 31, week + 0.5, NA),
# (e) put the first letter of the month on the first day
month.x=ifelse(day == 1, week, NA),
month.y=ifelse(day == 1, wday, NA),
month.l=ifelse(day == 1, substr(format(date, '%B'), 1, 3), NA)
)
}
Create the ggplot2 geom:
calendar_tetris_geoms <- function() {
list(
geom_segment(aes(x=x, xend=x, y=ymin, yend=ymax)), # (a)
geom_segment(aes(x=xmin, xend=xmax, y=y, yend=y)), # (b)
geom_segment(aes(x=dec.x, xend=dec.x, y=dec.ymin, yend=dec.ymax)), # (c)
geom_segment(aes(x=nye.xmin, xend=nye.xmax, y=nye.y, yend=nye.y)), # (d)
geom_segment(x=-0.5, xend=51.5, y=7.5, yend=7.5), # put a line along the top
geom_segment(x=0.5, xend=52.5, y=0.5, yend=0.5), # put a line along the bottom
geom_text(aes(x=month.x, y=month.y, label=month.l), hjust=0.25) # (e)
)
}
Create the plot:
library(ggplot2)
library(dplyr) # for %>% pipe
calendar_tetris_data(min(newdf$date), max(newdf$date)) %>%
left_join(newdf) %>%
ggplot() +
geom_tile(aes(x=week, y=weekday, fill = n), colour = "white") +
calendar_tetris_geoms() +
facet_wrap(~ year, ncol = 1)

plot an item map (based on difficulties)

I have a data set of item difficulties that correspond to items on a questionnaire that looks like this:
## item difficulty
## 1 ITEM_01_A 2.31179818
## 2 ITEM_02_B 1.95215238
## 3 ITEM_03_C 1.93479536
## 4 ITEM_04_D 1.62610855
## 5 ITEM_05_E 1.62188759
## 6 ITEM_06_F 1.45137544
## 7 ITEM_07_G 0.94255210
## 8 ITEM_08_H 0.89941812
## 9 ITEM_09_I 0.72752197
## 10 ITEM_10_J 0.61792597
## 11 ITEM_11_K 0.61288399
## 12 ITEM_12_L 0.39947791
## 13 ITEM_13_M 0.32209970
## 14 ITEM_14_N 0.31707701
## 15 ITEM_15_O 0.20902108
## 16 ITEM_16_P 0.19923607
## 17 ITEM_17_Q 0.06023317
## 18 ITEM_18_R -0.31155481
## 19 ITEM_19_S -0.67777282
## 20 ITEM_20_T -1.15013758
I want to make an item map of these items that looks similar (not exactly) to this (I created this in word but it lacks true scaling as I just eyeballed the scale). It's not really a traditional statistical graphic and so I don't really know how to approach this. I don't care what graphics system this is done in but I am more familiar with ggplot2 and base.
I would greatly appreciate a method of plotting this sort of unusual plot.
Here's the data set (I'm including it as I was having difficulty using read.table on the dataframe above):
DF <- structure(list(item = c("ITEM_01_A", "ITEM_02_B", "ITEM_03_C",
"ITEM_04_D", "ITEM_05_E", "ITEM_06_F", "ITEM_07_G", "ITEM_08_H",
"ITEM_09_I", "ITEM_10_J", "ITEM_11_K", "ITEM_12_L", "ITEM_13_M",
"ITEM_14_N", "ITEM_15_O", "ITEM_16_P", "ITEM_17_Q", "ITEM_18_R",
"ITEM_19_S", "ITEM_20_T"), difficulty = c(2.31179818110545, 1.95215237740899,
1.93479536058926, 1.62610855327073, 1.62188759115818, 1.45137543733965,
0.942552101641177, 0.899418119889782, 0.7275219669431, 0.617925967008653,
0.612883990709181, 0.399477905189577, 0.322099696946661, 0.31707700560997,
0.209021078266059, 0.199236065264793, 0.0602331732900628, -0.311554806052955,
-0.677772822413495, -1.15013757942119)), .Names = c("item", "difficulty"
), row.names = c(NA, -20L), class = "data.frame")
Thank you in advance.
Here is a quick example:
ggplot(DF, aes(x=1, y=difficulty, label = item)) +
geom_text(size = 3) +
scale_y_continuous(breaks = DF$difficulty, minor_breaks = NULL, labels = sprintf("%.02f", DF$difficulty)) +
scale_x_continuous(breaks = NULL) +
opts(panel.grid.major = theme_blank())
but sometimes two items are too narrow so overlapped. You may do like this:
m <- 0.1
nd <- diff(rev(DF$difficulty))
nd <- c(0, cumsum(ifelse(nd < m, m, nd)))
DF$nd <- rev(rev(DF$difficulty)[1] + nd)
ggplot(DF, aes(x=1, y=nd, label = item)) +
geom_text(size = 3) +
scale_y_continuous(breaks = DF$nd, labels = sprintf("%.02f", DF$difficulty), DF$difficulty, minor_breaks = NULL) +
scale_x_continuous(breaks = NULL) +
opts(panel.grid.major = theme_blank())
Here is a solution with base graphics.
# Compute the position of the labels to limit overlaps:
# move them as little as possible, but keep them
# at least .1 units apart.
library(quadprog)
spread <- function(b, eps=.1) {
stopifnot(b == sort(b))
n <- length(b)
Dmat <- diag(n)
dvec <- b
Amat <- matrix(0,nr=n,nc=n-1)
Amat[cbind(1:(n-1), 1:(n-1))] <- -1
Amat[cbind(2:n, 1:(n-1))] <- 1
bvec <- rep(eps,n-1)
r <- solve.QP(Dmat, dvec, Amat, bvec)
r$solution
}
DF <- DF[ order(DF$difficulty), ]
DF$position <- spread(DF$difficulty, .1)
ylim <- range(DF$difficulty)
plot( NA,
xlim = c(.5,2),
ylim = ylim + .1*c(-1,1)*diff(ylim),
axes=FALSE, xlab="", ylab=""
)
text(.9, DF$position, labels=round(DF$difficulty,3), adj=c(1,0))
text(1.1, DF$position, labels=DF$item, adj=c(0,0))
arrows(1,min(DF$position),1,max(DF$position),code=3)
text(1,min(DF$position),labels="Easier",adj=c(.5,2))
text(1,max(DF$position),labels="More difficult",adj=c(.5,-1))
text(.9, max(DF$position),labels="Difficulty",adj=c(1,-2))
text(1.1,max(DF$position),labels="Item", adj=c(0,-2))
My own attempt but I think I'm going to like Vincent's solution much better as it looks like my original specification.
DF <- DF[order(DF$difficulty), ]
par(mar=c(1, 1, 3, 0)+.4)
plot(rep(1:2, each=10), DF$difficulty, main = "Item Map ",
ylim = c(max(DF$difficulty)+1, min(DF$difficulty)-.2),
type = "n", xlab="", ylab="", axes=F, xaxs="i")
text(rep(1.55, 20), rev(DF$difficulty[c(T, F)]),
DF$item[c(F, T)], cex=.5, pos = 4)
text(rep(1, 20), rev(DF$difficulty[c(F, T)]),
DF$item[c(T, F)], cex=.5, pos = 4)
par(mar=c(0, 0, 0,0))
arrows(1.45, 2.45, 1.45, -1.29, .1, code=3)
text(rep(1.52, 20), DF$difficulty[c(T, F)],
rev(round(DF$difficulty, 2))[c(T, F)], cex=.5, pos = 2)
text(rep(1.44, 20), DF$difficulty[c(F, T)],
rev(round(DF$difficulty, 2))[c(F, T)], cex=.5, pos = 2)
text(1.455, .5, "DIFFICULTY", cex=1, srt = -90)
text(1.45, -1.375, "More Difficult", cex=.6)
text(1.45, 2.5, "Easier", cex=.6)
par(mar=c(0, 0, 0,0))

Resources