I have a couple of questions about microbenchmark and autoplot.
Suppose this is my code:
library("ggplot2")
library("microbenchmark") # provides microbenchmark() and its autoplot() method
tm <- microbenchmark(rchisq(100, 0), rchisq(100, 1), rchisq(100, 2), rchisq(100, 3), rchisq(100, 5), times = 1000L)
autoplot(tm)
What are the units in tm$time? How can I convert them to seconds?
How can I change the marks on the x axis to something like seq(from=0, to=100,by = 5)?
Thanks!
help(microbenchmark) gives:
1. ‘time’ is the measured execution time
of the expression in nanoseconds.
NANOseconds, not milliseconds or microseconds.
So divide by 1e9 (1000000000) to convert to seconds.
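A minimal sketch of that conversion. The values in time_ns are made up to stand in for tm$time, so the snippet runs without the package installed:

```r
# Hypothetical timings in nanoseconds, standing in for tm$time
time_ns <- c(1500, 23000, 4200000)

# 1 second = 1e9 nanoseconds
time_s <- time_ns / 1e9

print(time_s)
```

With the real object this would simply be tm$time / 1e9. print(tm, unit = "s") should also report the summary table in seconds.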
And for your second question, my first response is "why?". But it's ggplot-based, so you can override bits by adding ggplot things:
autoplot(tm) + scale_y_log10(name="seq(whatever)")
Note that the plot is rotated, so the horizontal axis is controlled by the y scale.
It just occurred to me that you may really mean "tick marks". That is slightly different but doable, although not really appropriate given the log axis. You can force a non-log axis with specified tick marks:
autoplot(tm) + scale_y_continuous(breaks=seq(0,10000,length.out=5),name="not a log scale")
You can keep the log scale and set the tick mark points:
autoplot(tm) + scale_y_log10(breaks=c(50,200,500))
I have a dataset with revenue and date columns.
I used ARIMA to forecast the data and plot the result.
library(forecast) # provides auto.arima() and forecast()
ts_data = ts(dataset$Revenue,frequency = 7)
arima.ts = auto.arima(ts_data)
pred = forecast(arima.ts,h=30)
plot(pred,xaxt="n")
When I plot the data, it produces a plot like the one below.
My expectations are:
I need to display values in Million for predicted values like 13.1M.
I need to show x-axis as date instead of data points numbers.
I tried several links but couldn't crack it. Below are the experiments I made:
I tried setting a start date and end date in ts_data, but that doesn't work either. My start date is "2019-09-27" and my end date is "2020-07-02".
I tried axis_date in the plot function, but that doesn't work either.
Please help me to crack this.
Thanks a lot.
You can specify axis tick values and labels with axis():
plot(pred,xaxt="n", yaxt="n") # deactivate x and y axis
yl<-seq(-5000000,15000000,by=5000000) # position of y-axis ticks
axis(2, at=yl, labels=paste(yl/1000000, "M")) # 2 = left axis
You can specify the desired position of the y-axis ticks with at and the labels to be associated with them with labels. To obtain values like 10 M, I have used the function paste to join the numbers with the letter M.
You could use the same method for the x-axis, even though more efficient methods probably exist. Not having the specific data available, I have used a generic spacing and labels only to give you the idea. Once you have set the desired positions, you can generate the sequence of dates associated with them (to generate a sequence of dates see https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/seq.Date)
axis(1, at=seq(1,40,by=1), labels=seq(as.Date("2019-09-27"),as.Date("2020-07-02"),by="week")) # 1 = below axis
You can also change the format of the dates displayed with format(), for example labels=format(vector_of_date, "%Y-%b-%d") to have year-month(in letters)-day.
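Putting both axes together, here is a rough sketch. The series y is random data standing in for the forecast object, and placing one tick per weekly period is an assumption about how the ts index lines up with the dates:

```r
# Y axis: tick positions in raw units, labels in millions
yl <- seq(-5000000, 15000000, by = 5000000)
y_labels <- paste0(yl / 1e6, "M")

# X axis: one date per week between the two dates from the question
week_dates <- seq(as.Date("2019-09-27"), as.Date("2020-07-02"), by = "week")
x_labels <- format(week_dates, "%Y-%m-%d")
x_at <- seq_along(week_dates)  # assumed positions: one tick per weekly period

# Hypothetical series standing in for the forecast object
set.seed(1)
y <- cumsum(rnorm(length(x_at), mean = 2e5, sd = 1e6))

plot(x_at, y, type = "l", xaxt = "n", yaxt = "n", xlab = "", ylab = "Revenue")
axis(2, at = yl, labels = y_labels, las = 1)

# Label every fourth week so the dates stay readable
keep <- seq(1, length(x_at), by = 4)
axis(1, at = x_at[keep], labels = x_labels[keep], las = 2)
```

paste0 is used instead of paste here only to drop the space, giving "10M" rather than "10 M".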
I want to change the x-axis in my plot, but it doesn't work properly with axis(). The data in the plot are daily and I want to show only years. I hope someone understands what I mean and can find a solution. This is how it looks now (image not shown), and this is how it looks with the code axis(1, at = seq(1800, 1975, by = 25), las = 2) (image not shown).
Without reproducible code it is not easy to see what the problem could be. I'll try a "quick and dirty" approach.
High-level plots are composed of elements that are themselves made up of smaller pieces. Separate drawing commands are therefore useful, because they allow finer control over the plotting procedure.
In practice, the first thing to do is plot "nothing".
> plot(x, y, type = "n", xlab = "", ylab = "", axes = FALSE)
type = "n" causes the data not to be drawn. axes = FALSE suppresses the axes and the box around the plot. Even so, the plotting region is set up and ready to show the data.
The main benefit is that now the plotting area is correctly dimensioned. Try now to add the desired x axis as you tried before.
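> points(x, y) # Draws the data in the plotting region
> axis(1, at = seq(1800, 1975, by = 25), las = 2) # Draws the x axis with your own scale
> axis(2) # Draws the default y axis
> title() # Adds the desired titles (pass main =, xlab =, ylab =)
> box() # Draws the box surrounding the plot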
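Here is the same sequence as a self-contained sketch. The x and y values are invented only so it can be run as is:

```r
# Hypothetical data standing in for the asker's x and y
x <- 1:10
y <- c(3, 5, 4, 6, 8, 7, 9, 12, 10, 13)

# Empty plot: sets up a correctly scaled region, draws nothing
plot(x, y, type = "n", xlab = "", ylab = "", axes = FALSE)

points(x, y, pch = 19)                 # the data
axis(1, at = seq(2, 10, by = 2))       # x axis with custom tick positions
axis(2, las = 1)                       # default y axis, horizontal labels
title(main = "Built step by step", xlab = "x", ylab = "y")
box()                                  # frame around the plotting region
```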
EDITED based on comment by @scoa
As a quick and dirty solution, you can simply enter the following line after your plot() line:
# This reads as: on the x axis (1), put ticks AT day 0 up to day 63917
# in steps of 9131 days (about 25 years), draw the labels
# perpendicular to the axis (las = 2) for readability,
# and label those ticks with the years 1800 to 1975 in 25-year steps
axis(1, at = seq(0, 63917, by = 9131), las = 2, labels = seq(1800, 1975, by = 25))
For other parameters, check out ?axis. As @scoa mentioned, this is approximate. I have used 365.25 as the days-per-year conversion, which is not exact, but it should be accurate enough visually at the scale you have provided. If you need a precise conversion from days to years, you need to operate on your original data set before plotting.
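If exact tick positions matter, one option is to let Date arithmetic compute the day offsets. This sketch assumes that day 0 of the series is 1800-01-01, which the question does not actually state:

```r
# Assumption: day 0 of the series is 1800-01-01 (not stated in the question)
origin <- as.Date("1800-01-01")
years <- seq(1800, 1975, by = 25)

# Exact day offset of 1 January of each labelled year
tick_days <- as.numeric(as.Date(paste0(years, "-01-01")) - origin)
print(tick_days)

# Then, after plot():
# axis(1, at = tick_days, labels = years, las = 2)
```

Under that assumed origin the exact offsets come out as 0, 9131, 18262, and so on up to 63917, the same as the 9131-day steps, because every 25-year block in this range happens to contain six leap days. A different start date or tick spacing would not line up as neatly.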
I want to create a one dimensional scatterplot of points in time (range of about 5 hours), to visualise e.g. the time when I get up in the morning.
I tried
time=rep(Sys.time(),100)+round(3600*rnorm(100),1)
stripchart(as.numeric(time), main="stripchart", method="jitter", jitter = 2)
but that gives me an x-axis where I believe time is given in seconds since the epoch. I'm interested in times of the day (8:02, 7:50, ...), so time in seconds does not work for me. However, I need as.numeric, because leaving it out gives an 'invalid first argument' error.
Plot the chart without the x-axis labels with xaxt="n" and add it afterwards with hours and minutes using axis. I'm using pretty to get the beginning of the hour.
time=rep(Sys.time(),100)+round(3600*rnorm(100),1)
stripchart(as.numeric(time), main="stripchart", method="jitter", jitter = 2, xaxt="n")
axis(side=1, at=pretty(time), labels=format(pretty(time),"%H:%M"))
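If the times come from different days and only the time of day matters, a variation on the same idea is to plot decimal hours since midnight. This is only a sketch: the wake-up times are invented, and to_hhmm is a hypothetical helper written for this example.

```r
# Hypothetical wake-up times on different days (UTC keeps the example reproducible)
wake <- as.POSIXct(c("2024-03-01 07:50:00", "2024-03-02 08:02:00",
                     "2024-03-03 07:35:00", "2024-03-04 08:20:00"), tz = "UTC")

# Decimal hours since midnight, so the date no longer matters
lt <- as.POSIXlt(wake, tz = "UTC")
hours <- lt$hour + lt$min / 60 + lt$sec / 3600

# Hypothetical helper: format decimal hours as "HH:MM" for the axis labels
to_hhmm <- function(h) {
  m <- round(h * 60)
  sprintf("%02d:%02d", m %/% 60, m %% 60)
}

stripchart(hours, main = "stripchart", method = "jitter", jitter = 0.5, xaxt = "n")
ticks <- pretty(hours)
axis(side = 1, at = ticks, labels = to_hhmm(ticks))
```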
You could do this:
time=rep(Sys.time(),100)+round(3600*rnorm(100),1)
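# Ask for minutes explicitly; the units of time - min(time) are otherwise chosen automatically
minutes <- as.numeric(difftime(time, min(time), units = "mins"))
stripchart(minutes, main="stripchart", method="jitter", jitter = 2)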
Which yields this (x-axis in minutes):
I need to plot a vector of numbers. Let's say these numbers range from 0 to 1000. I need to make a histogram where the x axis goes from 100 to 500, and I want to specify the number of bins to be 10. How do I do this?
I know how to use xlim and breaks separately, but I don't know how to get a given number of bins inside the custom range.
This is a very good question actually! I was bothered by this all the time but finally your question has kicked me to finally solve it :-)
Well, in this case we cannot simply do hist(x, xlim = c(100, 500), breaks = 10), because breaks applies to the whole range of x and is unrelated to xlim (in other words, xlim is used only for plotting, not for computing the histogram and setting the actual breaks). This is a clear flaw of the hist function, and the documentation offers no simple remedy.
I think the easiest way out is to "xlim" the values before they go to the hist function:
x <- runif(1000, 0, 1000) # example data
hist(x[x >= 100 & x <= 500], breaks = seq(100, 500, length.out = 11))
A single number for breaks is only a suggestion for the number of cells (the break points are set to pretty values), so to get exactly 10 bins give the 11 break points explicitly.
For more info see ?hist
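A small sketch to check the bin count when the break points are given explicitly. The data and the names inside and edges are made up for the example:

```r
set.seed(42)
x <- runif(1000, 0, 1000)                # example data on [0, 1000]

inside <- x[x >= 100 & x <= 500]         # keep only the range of interest
edges <- seq(100, 500, length.out = 11)  # 11 break points give exactly 10 bins

h <- hist(inside, breaks = edges, main = "10 bins on [100, 500]", xlab = "x")
print(length(h$counts))
```

hist returns its result invisibly, so h$breaks and h$counts can be inspected to confirm the bins are the ones asked for.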
If I want my data axis to have more breaks but without a transformation on the values, how can I do it in ggplot2? eg:
... + scale_x_continuous(breaks=scales::trans_breaks("log2", function(x) 2^x, n=8), limits=limits)
works if you want your data transformed and the n= parameter lets you say how many breaks. How can you specify breaks without transforming the data? Do you just give it an identity function?
I prefer not to give explicit ticks based on calculations in the data, and so I want ggplot2 to pick the tick marks for me given only the limits and the number of ticks. This code works for me:
library(scales)
scale_x_continuous(breaks = trans_breaks(identity, identity, n = numticks))
Of course, you can always set the tick marks explicitly with breaks = ... as agstudy wrote.
You can give scale_x_continuous a vector of breaks like this:
n=5
breaks = seq(min(dat$x),max(dat$x), length.out = n)
m + scale_x_continuous(breaks=breaks)
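If round tick values are preferred over evenly spaced ones, base R's pretty() can pick them from just the limits and a target count, and the result can be passed as the breaks vector. This is a sketch: limits, numticks and dat are made-up values, and pretty() treats n as a target rather than a guarantee.

```r
limits <- c(0, 100)   # hypothetical axis limits
numticks <- 8         # desired (approximate) number of intervals

# Round tick positions covering the limits
breaks <- pretty(limits, n = numticks)
print(breaks)

# Only build the plot if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  dat <- data.frame(x = c(5, 40, 95), y = c(1, 2, 3))  # hypothetical data
  m <- ggplot2::ggplot(dat, ggplot2::aes(x, y)) + ggplot2::geom_point()
  p <- m + ggplot2::scale_x_continuous(breaks = breaks, limits = limits)
}
```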