Bokeh labels truncated on x-axis

I have created a chart with Bokeh, where the X-Axis is of type 'datetime'.
Unfortunately the labels shown on the X Axis are truncated.
How can I prevent this truncation?
This is my code:
from bokeh.models import DatetimeTickFormatter
from bokeh.plotting import figure, output_file, output_notebook, show

TOOLS = "pan,wheel_zoom,box_zoom,reset,save"
p = figure(x_axis_type="datetime", tools=TOOLS, plot_width=1000, plot_height=600, title="Feed")
p.xaxis.formatter = DatetimeTickFormatter(
    hours=["%d %B %Y"],
    days=["%d %B %Y"],
    months=["%d %B %Y"],
    years=["%d %B %Y"],
)
p.grid.grid_line_alpha = 0.3
# _df is my DataFrame (not shown); the y values are simply the row index
p.line(_df.datetime, list(range(len(_df.datetime))), color='firebrick', legend='Fast Ask')
output_file("bokeh.html", title="example")
output_notebook()
show(p)  # open a browser
And this is the chart:
Thanks,
Gerald

From the reference documentation for DatetimeTickFormatter:
DatetimeTickFormatter has the following properties (listed together with their default values) that can be used to control the formatting of axis ticks at different scales:
microseconds = ['%fus']
milliseconds = ['%3Nms', '%S.%3Ns']
seconds = ['%Ss']
minsec = [':%M:%S']
minutes = [':%M', '%Mm']
hourmin = ['%H:%M']
hours = ['%Hh', '%H:%M']
days = ['%m/%d', '%a%d']
months = ['%m/%Y', '%b%y']
years = ['%Y']
You are only setting the last four scales, starting with hours. But from your picture it's clear that the x-axis range only spans a few minutes, so Bokeh is using the default format for the minutes scale, shown above. If you want a "full" label at smaller scales (e.g. hourmin, minutes, or anything finer), then you need to configure those as well when you create your DatetimeTickFormatter.
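As a minimal sketch of that advice, the snippet below builds one format entry for every scale listed above, so the same label is used whatever the zoom level. The helper name uniform_formats and the format string "%d %B %Y %H:%M" are my own choices for illustration, not part of Bokeh. The list-valued entries follow the defaults quoted above; I believe newer Bokeh releases expect plain strings instead, so check the version you have installed.

```python
# Scale names as listed in the DatetimeTickFormatter defaults above.
SCALES = [
    "microseconds", "milliseconds", "seconds", "minsec", "minutes",
    "hourmin", "hours", "days", "months", "years",
]


def uniform_formats(fmt, as_list=True):
    """Return {scale: format} for every scale (hypothetical helper).

    as_list=True wraps each format in a list, matching the list-valued
    properties shown above; pass False if your Bokeh version expects
    plain strings instead (an assumption to verify against your install).
    """
    return {scale: ([fmt] if as_list else fmt) for scale in SCALES}


formats = uniform_formats("%d %B %Y %H:%M")
print(formats["minutes"])

# Usage with the figure `p` from the question (requires bokeh):
# from bokeh.models import DatetimeTickFormatter
# p.xaxis.formatter = DatetimeTickFormatter(**formats)
```

If you only want full labels at some scales, pass just those keys and leave the rest at their defaults.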

Related

Getting more x-axis ticks/labels for large time series plot pdf export

I am plotting a very large time series (200,000 observations).
But I want to look at every day in detail and scroll through it day by day.
That is why I am exporting it as a .pdf with width=500.
Here is the code (x is a multivariate zoo time series):
pdf(file = "myPlot.pdf", width = 500, height = 20)
plot(x, plot.type = "multiple", nc = 1, col = c("green"),
     ylim = list(c(0, 25), c(0.05, 1), c(4, 10), c(700, 1000),
                 c(100, 500), c(0, 2), c(0, 3600)))
dev.off()
This works well so far. I can zoom into the pdf and then scroll through the series.
The problem is that I have no accurate x-axis ticks or labels.
I would like to have at least daily labels, but I get one label per month.
I then tried the following:
pdf(file = "myPlot.pdf", width = 500, height = 20)
plot(x, plot.type = "multiple", nc = 1, col = c("green"), xaxt = "n",
     ylim = list(c(0, 25), c(0.05, 1), c(4, 10), c(700, 1000),
                 c(100, 500), c(0, 2), c(0, 3600)))
axis(1, time(x))
dev.off()
But now no x axis appears in the pdf at all.

Display density() graph with date in x axis using R

I have a partial success with
input = "date,data
1-1-2015,5.5
2-1-2016,1.0
3-1-2016,4.0
4-1-2016,4.0
5-1-2019,3.0"
new = read.csv(text=input)
new$date = as.Date(new$date, "%d-%m-%Y")
new$date = as.numeric(new$date, as.Date("2015-01-01"), units="days") #https://stat.ethz.ch/pipermail/r-help/2008-May/162719.html
plot(density(new$date))
This produces a working graph, but the x-axis is formatted as integers. How can I produce a graph with the x-axis formatted as dates?
I expected
new = read.csv(text=input)
new$date = as.Date(new$date, "%d-%m-%Y")
plot(density(new$date))
to work, but it failed with: Error in density.default(new$date) : argument 'x' must be numeric.
density() wasn't really designed to work with dates. The easiest fix is probably to keep the numeric conversion from your first snippet (days since 1970-01-01) and replace the default axis labels with date values. Here's how you can do that:
plot(density(new$date), xaxt = "n")
at <- axTicks(1)
axis(1, at, as.Date(at, origin = "1970-01-01"))

Automatically scale x-axis by date range within a factor using xyplot()

I've been trying to write out an R script that will plot the date-temp series for a set of locations that are identified by a Deployment_ID.
Ideally, each page of the output pdf would have the name of the Deployment_ID (check), a graph with proper axes (check) and correct scaling of the x-axis to best show the date-temp series for that specific Deployment_ID (not check).
At the moment, the script makes a pdf that shows each ID over the full range of the dates in the date column (i.e. 1988-2010), instead of just the relevant dates (i.e. just 2005), which squishes the scatterplot down into uselessness.
I'm pretty sure it's something to do with how you define xlim, but I can't figure out how to have R access the date min and the date max for each factor as it draws the plots.
Script I have so far:
# Read the data from CSV; change the file path and name as needed
data <- read.csv(file.path("C:/Users/Person/Desktop", "SampleData.csv"))
# Make Date a real date - must be in yyyy/mm/dd format in the csv to do so
data$Date <- as.Date(data$Date)
# Load lattice; run install.packages("lattice") first if you don't have it
library(lattice)
# Make the plots with lattice, this takes a while.
dataplot <- xyplot(data$Temp~data$Date|factor(data$Deployment_ID),
                   data = data,
                   stack = TRUE,
                   auto.key = list(space = "right"),
                   layout = c(1,1),
                   ylim = c(-10,40)
)
# Make the pdf
pdf("Dataplots_SampleData.pdf", onefile = TRUE)
# Print to the pdf? Not really sure how this works. Takes a while.
print(dataplot)
dev.off()
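Use the scales argument. With relation = "free", each panel gets its own axis limits instead of sharing the limits computed from the full data. Note that this frees both axes; to free only the x-axis, use scales = list(x = list(relation = "free")). Give this a try:
dataplot <- xyplot(data$Temp~data$Date|factor(data$Deployment_ID),
                   data = data,
                   stack = TRUE,
                   auto.key = list(space = "right"),
                   layout = c(1,1),
                   scales = list(relation = "free")
)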

How to fix a delay in updating xaxis using R's playwith package

I am using R's playwith package to interactively plot large chunks of time-series data. As a result, I often zoom in and zoom out considerably, to the point where a single 'by' argument to axis.POSIXct becomes impracticable (i.e., sometimes I want to look at the whole range of the data, thus preferring by="year", while some other times I zoom in for detail and then a by="day" would be much more helpful).
I have devised a way to adjust 'by' and 'format' as a function of the current zoom level, however my solution still imposes a "delay". This delay means that:
1. I select a zoom area with the 'navigate' tool
2. the plot refreshes, reflecting my 'zoom in' action, except...
3. the xaxis remains formatted as if using the previous 'by' and 'format' arguments
4. I now use the 'navigate' tool again to select a zoom area similar to the current plot area (i.e., I don't zoom in at all)
5. The plot area remains roughly the same (as expected) but the xaxis assumes the format I expected in 3. (this is the 'delay')
You can check that yourselves using the following code:
library(xts)
library(playwith)
xts1 <- xts(rnorm(200, 0), seq(as.POSIXct(0, origin="1970-01-01"), as.POSIXct(100000000, origin="1970-01-01"), length=200))
xts2 <- xts(rnorm(200, 0), seq(as.POSIXct(0, origin="1970-01-01"), as.POSIXct(100000000, origin="1970-01-01"), length=200))
XRANGE <- diff(range(as.numeric(index(xts1))))/3600/24/365
# plot with playwith
playwith(
  {
    plot(xts1, xlab="",xaxt="n",ylab="", main="'Zoomable' xlim", ylim=range(c(xts1, xts2)), xlim=range(index(xts1)), type="n", cex.lab=0.7, cex.axis=0.7)
    # choose the appropriate xaxis "by" parameter
    axis.POSIXct(1, at=seq(range(index(xts1))[1], range(index(xts1))[2], by=BY), format=FORMAT, las=2, cex.axis=0.7)
    # plot data
    lines(xts1, col="red")
    lines(xts2, col="blue")
  },
  update.actions=function(playState)
  {
    .GlobalEnv$XRANGE <- diff(as.numeric(rawXLim(playState)))/3600/24/365;
    # evaluate the "by" and "format" parameters for the xaxis
    .GlobalEnv$BY <- ifelse(XRANGE >= 2.5, "year", ifelse(XRANGE >= 0.15 & XRANGE < 2.5, "month", "day"));
    .GlobalEnv$FORMAT <- ifelse(XRANGE >= 2.5, "%Y", ifelse(XRANGE >= 0.15 & XRANGE < 2.5, "%Y-%m", "%Y-%m-%d"));
    print(diff(as.numeric(rawXLim(playState)))/3600/24/365);
    print(BY)
  },
  time.mode=TRUE)
Can anyone help me with this? Given my very superficial knowledge on using the playwith package I wouldn't be surprised if I was just using things in the wrong order, not updating relevant variables before they get used in the plot. I have, however, exhausted my ability to fiddle with these very "theoretical" playwith constructs, which I hardly understand.
P.S. The script includes 2 print lines that will show you that the 'by' argument does change when its thresholds are crossed; it just somehow doesn't get reflected in the plot's x-axis...
Cheers

R plot with an x time axis: how to force the ticks labels to be the days?

I have this file in csv format:
timestamp,pages
2011-12-09T11:20:50.33,4
2012-01-23T17:44:02.71,132
2012-01-28T15:07:59.34,168
The first column is a timestamp, the second one is a page count.
I need to plot the page count on the vertical axis and the timestamp on the horizontal axis.
The timestamps are not regularly spaced: I have one day in December and two close days in January.
I tried this code
df = read.csv("my_data.csv")
df$timestamp = strptime(df$timestamp, "%Y-%m-%dT%H:%M:%S")
plot(df$timestamp,df$pages)
and I got a plot with just one tick in the middle of the x-axis, labelled "Jan". That's not wrong, but I would like three ticks showing just the day number and the month.
I tried
plot(df$timestamp,df$pages,xaxt="n")
axis.Date(1,df$timestamp,"days")
but no x axis is plotted.
Any idea?
Thank you
I would as.Date() your timestamp like this:
df$timestamp = as.Date(strptime(df$timestamp, "%Y-%m-%dT%H:%M:%S"))
This works then:
plot(df$timestamp,df$pages,xaxt="n")
axis.Date(1,at=df$timestamp,labels=format(df$timestamp,"%b-%d"),las=2)
This will work:
plot(df$timestamp,df$pages,xaxt="n")
axis.POSIXct(1, at=df$timestamp, labels=format(df$timestamp, "%m/%d"))
Essentially in axis.POSIXct (note that you have POSIXct dates in your data frame) you specify where to have the axis ticks (at) and what the labels are.
Typically I like my date labels vertical rather than horizontal. To get that, use par(las=2) before the plot.
I found this: http://personality-project.org/r/r.plottingdates.html
Which gave me my solution...
dm = read.csv("my_data.csv", sep=",", head=TRUE)
dm$DateTime <- as.POSIXct(dm$timestamp, format="%Y-%m-%dT%H:%M:%S")
daterange=c(as.POSIXlt(min(dm$DateTime)), as.POSIXlt(max(dm$DateTime)))
plot(pages ~ DateTime, dm, xaxt = "n")
axis.POSIXct(1, at=seq(daterange[1], daterange[2], by="day"), format="%b %d")
The important parts being daterange and at=seq(..., by="day").
I hope this helps. I wrote this function, which adds a fixed number of equidistant time ticks.
With first="month", the function places a tick on the 1st of each month. With first="day", it places a tick at 00:00 of each day.
The plot must of course be created with the xaxt="n" argument.
By default, it adds 10 ticks (ticks.n=10) in dd/mm format (format.x="%d/%m"), without snapping to the first of the month or the start of the day (first="none"), and with horizontal labels (las=1).
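axis.time = function(time.x = Sys.time(), ticks.n = 10, format.x = "%d/%m", first = "none", las = 1) {
  tz = attr(time.x, "tzone")
  if (is.null(tz)) tz = ""  # assumption: fall back to the local time zone when none is set
  if (first == "day") {
    # candidate times every 30 minutes; keep those where the hour wraps around (a new day)
    time.x = seq(time.x[1], time.x[length(time.x)], 60*30)
    time.x = time.x[which(diff(as.numeric(format(time.x, "%H"))) < 0) + 1]
    time.x = strptime(as.character(as.Date(time.x)), "%Y-%m-%d", tz)
  } else if (first == "month") {
    # candidate times every 12 hours; keep those where the day of month wraps around
    time.x = seq(time.x[1], time.x[length(time.x)], 60*60*24/2)
    time.x = time.x[which(diff(as.numeric(format(time.x, "%d"))) < 0) + 1]
    time.x = strptime(as.character(as.Date(time.x)), "%Y-%m-%d", tz)
  } else {
    # ticks.n equidistant ticks over the whole range
    time.x = seq(time.x[1], time.x[length(time.x)], length.out = ticks.n)
  }
  axis.POSIXct(side = 1, x = time.x, at = time.x, format = format.x, las = las)
}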
Suppose you have a data frame:
df1=data.frame(time=seq(Sys.time()-1e8,Sys.time(),length.out = 100),Y=runif(100))
A plot made with plot(df1) only puts x-axis ticks at the beginning of each year. If you plot with plot(df1,xaxt="n") instead, you can use the axis.time function:
axis.time(time.x = df1$time,first = "month",las=2,format.x = "%m-%y")
to get a tick on the first day of each month and with a different format and alignment.
