How to select a subplot of the timeVariation function in OpenAir

I want to produce a plot showing diurnal variation per weekday and month. The timeVariation function produces the desired plot, along with three other subplots. This makes the subplot at the top hard to read:
library(openair)
mary <- importAURN(site = "my1", year = 2000)
timeVariation(mary,
              pollutant = 'no2',
              type = 'month')
I would like to plot only the top subplot showing weekdays. I tried using plot(myOutput, subset = "day.hour") as described in the OpenAir manual:
plot(timeVariation(mary,
                   pollutant = 'no2',
                   type = 'month'),
     subset = 'day.hour')
But that produces this:
This plot may contain the correct data, but the replication of the labels makes it overcrowded and very confusing. Is there a way to extract just the plot I want, formatted as shown in the top image?

Related

Forecast plot with x axis labels as date

I have a dataset with revenue and date columns.
I fitted an ARIMA model and plotted the forecast.
library(forecast) # provides auto.arima() and forecast()
ts_data = ts(dataset$Revenue,frequency = 7)
arima.ts = auto.arima(ts_data)
pred = forecast(arima.ts,h=30)
plot(pred,xaxt="n")
When I plot the data, the result is not what I expect.
My expectations are:
I need to display the predicted values in millions, like 13.1M.
I need to show dates on the x-axis instead of data point numbers.
I tried several links but couldn't solve it. These are the experiments I made:
I tried passing a start date and end date to ts_data, but that doesn't work. My start date is "2019-09-27" and my end date is "2020-07-02".
I tried using axis_date in the plot function, but that doesn't work either.
Please help me solve this.
Thanks a lot.
You can specify axis tick values and labels with axis()
plot(pred,xaxt="n", yaxt="n") # deactivate x and y axis
yl<-seq(-5000000,15000000,by=5000000) # position of y-axis ticks
axis(2, at=yl, labels=paste(yl/1000000, "M")) # 2 = left axis
You can specify the desired positions of the y-axis ticks with at and the labels to be associated with them with labels. To obtain values like 10 M, I used the function paste to join the numbers with the letter M.
You could use the same method for the x-axis, even though more efficient methods probably exist. Not having the specific data available, I have used generic spacing and labels only to give you the idea. Once you have set the desired positions, you can generate the sequence of dates associated with them (to generate a sequence of dates see https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/seq.Date)
axis(1, at=seq(1,40,by=1), labels=seq(as.Date("2019-09-27"),as.Date("2020-07-02"),by="week")) # 1 = bottom axis
You can also change the format of the displayed dates with format(), for example labels=format(vector_of_date, "%Y-%b-%d") to get year-month (abbreviated name)-day.
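The tick positions and labels described above can be computed before calling axis(). This is only a sketch, and it assumes the series is daily, starts on 2019-09-27 and was created with ts(..., frequency = 7) and the default start of 1, so one unit on the x-axis is one week. The helper names index_to_date and format_millions, and the tick spacing, are made up for illustration.

```r
# Assumption: daily data starting 2019-09-27, stored with frequency = 7,
# so time index 1 is the first day and each unit on the x-axis is 7 days.
start_date <- as.Date("2019-09-27")

# Hypothetical helper: convert a ts time index to the matching calendar date.
index_to_date <- function(t, start = start_date, freq = 7) {
  start + round((t - 1) * freq)
}

# Hypothetical helper: format a value in millions, e.g. 13100000 -> "13.1M".
format_millions <- function(v) {
  paste0(round(v / 1e6, 1), "M")
}

# Illustrative tick positions (every 4 weeks) and their labels.
x_at <- seq(1, 45, by = 4)
x_labels <- format(index_to_date(x_at), "%Y-%m-%d")

y_at <- seq(-5000000, 15000000, by = 5000000)
y_labels <- format_millions(y_at)
```

After plot(pred, xaxt="n", yaxt="n"), these vectors would be passed on as axis(1, at = x_at, labels = x_labels) and axis(2, at = y_at, labels = y_labels).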

Add data labels to spineplot in R

iFacColName <- "hireMonth"
iTargetColName <- "attrition"
iFacVector <- as.factor(c(1,1,1,1,10,1,1,1,12,9,9,1,10,12,1,9,5))
iTargetVector <- as.factor(c(1,1,0,1,1,0,0,1,1,0,1,0,1,1,1,1,1))
sp <- spineplot(iFacVector,iTargetVector,xlab=iFacColName,ylab=iTargetColName,main=paste0(iFacColName," vs. ",iTargetColName," Spineplot"))
spLabelPass <- sp[,2]/(sp[,1]+sp[,2])
spLabelFail <- 1-spLabelPass
text(seq_len(nrow(sp)),rep(.95,length(spLabelPass)),labels=as.character(spLabelPass),cex=.8)
For some reason, the text() function only plots one label far to the right of the graph. I have used this format to apply data labels to other types of graphs, so I am confused.
EDIT: added more code to make example work
You're not placing your labels inside the plotting region. It only extends to around 1.3 on the x axis. Try plotting something like
text(
  cumsum(prop.table(table(iFacVector))),
  rep(.95, length(spLabelPass)),
  labels = as.character(round(spLabelPass, 1)),
  cex = .8
)
and the labels will appear inside the plotting region.
These are obviously not the right positions for the labels, but you should be able to figure that out by yourself. (You're going to have to subtract half of the frequency for each bar from the cumulative frequency and account for the fact that the bars are padded with some amount of whitespace.)
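The adjustment described above could look roughly like this. It is a sketch that assumes the gap between bars is 0.02 (2% of the axis), which as far as I know is what spineplot() uses by default for a factor on the x-axis. If you pass a different off value, change it here as well. The variable names are made up for illustration.

```r
iFacVector <- as.factor(c(1,1,1,1,10,1,1,1,12,9,9,1,10,12,1,9,5))

# Assumption: gap between adjacent bars, as a fraction of the x-axis.
bar_gap <- 0.02

# Relative width of each bar (proportion of observations per level).
bar_width <- as.numeric(prop.table(table(iFacVector)))

# Right edge of each bar: cumulative widths plus the gaps before it.
bar_right <- cumsum(bar_width + bar_gap) - bar_gap
bar_left <- bar_right - bar_width

# Midpoint of each bar: the cumulative position minus half the bar width.
bar_mid <- (bar_left + bar_right) / 2
```

The midpoints would then replace the x positions in the text() call, e.g. text(bar_mid, rep(.95, length(bar_mid)), labels = ..., cex = .8).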

Change axis in R with different number of datas

I want to change the x-axis in my plot, but it doesn't work properly with axis(). The data in the plot are daily and I want to show only years. I hope someone understands me and can find a solution. This is how it looks now: [image] and this is how it looks with the code axis(1, at = seq(1800, 1975, by = 25), las = 2): [image]
Without reproducible code it is not easy to tell what the problem could be. I'll try a "quick and dirty" approach.
High-level plots are composed of elements that are themselves made up of smaller elements. Hence, separate drawing commands can be useful, because they allow finer control over the plotting procedure.
In practice, the first thing to do is plot "nothing".
> plot(x, y, type = "n", xlab = "", ylab = "", axes = FALSE)
type = "n" causes the data not to be drawn. axes = FALSE suppresses the axes and the box around the plot. Even so, the plotting region is set up and ready to show the data.
The main benefit is that now the plotting area is correctly dimensioned. Try now to add the desired x axis as you tried before.
> points(x, y) # Plots the data in the area
> axis() # Plots the desired axis with your scale
> title() # Plots the desired titles
> box() # Prints the box surrounding the plot
EDITED based on comment by @scoa
As a quick and dirty solution, you can simply enter the following line after your plot() line:
# This reads as: on the x axis (1), place ticks from the first (day) value of 0
# to the last (day) value of 63917 in increments (by) of 9131 days (25 years),
# with labels perpendicular (las = 2) to the axis (for readability),
# and at those tick locations put the labels
# 1800 (year) to 1975 (year) in 25 (year) increments
axis(1, at = seq(0, 63917, by = 9131), las = 2, labels = seq(1800, 1975, by = 25))
For other parameters, check out ?axis. As @scoa mentioned, this is approximate: I have used 365.25 as the day-to-year conversion, which is not quite right, but it should suffice for visual accuracy at the scale you have provided. If you need a precise conversion from days to years, you need to operate on your original data set before plotting.
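If exact positions matter, one option is to let R's Date arithmetic count the days instead of using a fixed step. This is a sketch that assumes day 0 of the data corresponds to 1800-01-01, which is a guess based on the labels in the question. The names origin, years and tick_at are made up for illustration.

```r
# Assumption: day 0 on the x-axis corresponds to 1 January 1800.
origin <- as.Date("1800-01-01")

# Years to label on the axis.
years <- seq(1800, 1975, by = 25)

# Exact number of days from the origin to 1 January of each labelled year.
tick_at <- as.numeric(as.Date(paste0(years, "-01-01")) - origin)
```

The axis would then be drawn with axis(1, at = tick_at, labels = years, las = 2). Under the assumed origin, these positions happen to coincide with the fixed 9131-day step for this particular range. For other ranges or step sizes they generally would not.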

R X-axis Date Labels using plot()

Using the plot() function in R, I'm trying to produce a scatterplot of points of the form (SaleDate,SalePrice) = (saldt,salpr) from a time-series, cross-section real estate sales dataset in dataframe format. My problem concerns labels for the X-axis. Just about any series of annual labels would be adequate, e.g. 1999,2000,...,2013 or 1999-01-01,...,2013-01-01. What I'm getting now, a single label, 2000, at what appears to be the proper location, won't work.
The following is my call to plot():
plot(r12rgr0$saldt, r12rgr0$salpr/1000, type="p", pch=20, col="blue", cex.axis=.75,
     xlim=c(as.Date("1999-01-01"),as.Date("2014-01-01")),
     ylim=c(100,650),
     main="Heritage Square Sales Prices $000s 1990-2014",xlab="Sale Date",ylab="$000s")
The xlim and ylim are called out to bound the date and price ranges of the data to be plotted; note prices are plotted as $000s. r12rgr0$saldt really is a date; str(r12rgr0$saldt) returns:
Date[1:4190], format: "1999-10-26" "2013-07-06" "2003-08-25" NA NA "2000-05-24" xx
I have reviewed several threads here concerning similar questions, and see that the solution probably lies with turning off the default X-axis behavior and using axis.date, but i) At my current level of R skill, I'm not sure I'd be able to solve the problem, and ii) I wonder why the plotting defaults are producing these rather puzzling (to me, at least) results?
Addl Observations: The Y-axis labels are just fine 100, 200,..., 600. The general appearance of the scatterplot indicates the called-for date ranges are being observed and the relative positions of the plotted points are correct. Replacing xlim=... as above with xlim=c("1999-01-01","2014-01-01")
or
xlim=c(as.numeric(as.character("1999-01-01")),as.numeric(as.character("2014-01-01")))
or
xlim=c(as.POSIXct("1999-01-01", format="%Y-%m-%d"),as.POSIXct("2014-01-01", format="%Y-%m-%d"))
all result in error messages.
With plots it's very hard to reproduce results without sample data. Here's a sample I'll use
dd<-data.frame(
saldt=seq(as.Date("1999-01-01"), as.Date("2014-01-10"), by="6 mon"),
salpr = cumsum(rnorm(31))
)
A simple plot with
with(dd, plot(saldt, salpr))
produces a few year marks.
If I wanted more control, I could use axis.Date as you alluded to
with(dd, plot(saldt, salpr, xaxt="n"))
axis.Date(1, at=seq(min(dd$saldt), max(dd$saldt), by="30 mon"), format="%m-%Y")
which labels a tick every 30 months in month-year format.
Note that xlim only zooms in on part of the plot. It is not directly connected to the axis labels, but the axis labels will adjust to provide a "pretty" range to cover the data that is plotted. Doing just
xlim=c(as.Date("1999-01-01"),as.Date("2014-01-01"))
is the correct way to zoom the plot. No need for conversion to numeric or POSIXct.
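For the annual labels asked about in the question, the tick positions can be built with seq() on dates. This is a sketch using the date range from the question. The names year_ticks and year_labels are made up for illustration.

```r
# One tick on 1 January of each year in the range used for xlim.
year_ticks <- seq(as.Date("1999-01-01"), as.Date("2014-01-01"), by = "year")

# The labels these ticks would get with format = "%Y".
year_labels <- format(year_ticks, "%Y")
```

With the default axis turned off via xaxt="n" in the plot() call, the ticks would be added with axis.Date(1, at = year_ticks, format = "%Y"). If sixteen labels are too crowded, by = "2 years" thins them out.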
If you are running a plot in real time and don't mind some warnings, you can just pass, e.g., format = "%Y-%m-%d" in the plot function. For instance:
plot(seq((Sys.Date()-9),Sys.Date(), 1), runif(10), xlab = "Date", ylab = "Random")
yields a plot with the default date labels, while:
plot(seq((Sys.Date()-9), Sys.Date(), 1), runif(10), format = "%Y-%m-%d", xlab = "Date", ylab = "Random")
yields a plot with labels in the requested format, along with lots of warnings about format not being a graphical parameter.

How to add Legend in a graph when using package Gadfly.jl in Julia

I am using Julia for financial data processing and then plotting graphs based on the financial data.
On the x-axis of the graph I am plotting dates (per-day prices).
On the y-axis I am plotting stock prices, MovingAverage13 and MovingAverage21.
I am currently using DataFrames to plot the data.
Code:
df=DataFrame(x=dates,y1=pricesClose,y2=m13,y3=m21)
l1=layer(x="x",y="y1",Geom.line,Theme(default_color=color("blue")));
l2=layer(x="x",y="y2",Geom.line,Theme(default_color=color("red")));
l3=layer(x="x",y="y3",Geom.line,Theme(default_color=color("green")));
p=plot(df,l1,l2,l3);
draw(PNG("stock.png",6inch,3inch),p)
I am getting the graphs correctly, but I am not able to add a legend to the graph that shows:
the blue line is for close prices
the red line is for moving average 13
the green line is for moving average 21
How can we add a legend to the graph?
From the comments in a linked discussion, I understand that it is currently not possible to get a legend for a list of layers.
Gadfly is based on Hadley Wickham's ggplot2 for R, and thus the usual pattern is to arrange the data into a DataFrame with a discrete column for labelling purposes. In your case, this approach would look like:
x = 1:10
df1 = DataFrame(x=x, y=2x, label="double")
df2 = DataFrame(x=x, y=x.^2, label="square")
df3 = DataFrame(x=x, y=1 ./ x, label="inverse")
df = vcat(df1, df2, df3)
p = plot(df, x="x", y="y", color="label", Geom.line,
Scale.discrete_color_manual("blue","red", "green"))
draw(PNG("stock.png", 6inch, 3inch), p)
You can now do this with Guide.manual_color_key.
The only change needed in your code is here:
p=plot(df,l1,l2,l3,
Guide.ylabel("Some text"),
Guide.title("My title"),
Guide.manual_color_key("Legend", ["I'm blue l1", "I'm red l2", "I'm green l3"], ["blue", "red", "green"]))
