Customizing x-axis labels on plot - r

I plotted my data and also suppressed the auto x-axis labeling successfully.
Now I'm using the following command to customize my x-axis labels:
axis(
1,
at = min(LoopVariable[ ,"TW"]) - 1 : max(LoopVariable[ ,"TW"]) + 1,
labels = min(LoopVariable[ ,"TW"]) - 1 : max(LoopVariable[ ,"TW"]) + 1,
las = 2
)
And I'm getting:
This is correct in the sense that I have 28 data points, but when I do:
LoopVariable[ ,"TW"]
Then I get:
[1] 2801 2808 2813 2825 2833 2835 2839 2840 2844 2856 2858 2863 2865 2868 2870 2871 2873 2879 2881 2903 2904 2914 2918 2947 2970 2974 2977 2986
These are the values I want as x-axis labels rather than 1:28. There is obviously something small missing in my command that I can't figure out.
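A likely cause is operator precedence: in R, `:` binds more tightly than binary `-` and `+`, so `min(x) - 1 : max(x) + 1` is evaluated as `min(x) - (1:max(x)) + 1`. The sketch below illustrates this with a shortened stand-in for `LoopVariable[ ,"TW"]`. The names `tw` and `y` are hypothetical, and it assumes the points were plotted against their index, which the question does not confirm.

```r
# Stand-in for LoopVariable[, "TW"] (first five values from the question)
tw <- c(2801, 2808, 2813, 2825, 2833)
y  <- c(3, 1, 4, 1, 5)               # hypothetical y values

# `:` is evaluated first, so this is min(tw) - (1:max(tw)) + 1
wrong <- min(tw) - 1 : max(tw) + 1
head(wrong)                          # counts down from min(tw)

# With parentheses the sequence runs from min - 1 to max + 1
right <- (min(tw) - 1):(max(tw) + 1)
range(right)

# Assuming the x positions are the indices 1..n, put the ticks there
# and use the TW values as the labels
plot(seq_along(tw), y, xaxt = "n", xlab = "TW", ylab = "y")
axis(1, at = seq_along(tw), labels = tw, las = 2)
```

If the data were instead plotted with the TW values on the x-axis, `at = tw, labels = tw` would be the corresponding call.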

Related

Change axes' scale in a plot without creating a new variable

I have a dataset like below (this is only the first 20 rows and the first 3 columns of data):
row fitted measured
1 1866 1950
2 2489 2500
3 1486 1530
4 1682 1720
5 1393 1402
6 2524 2645
7 2676 2789
8 3200 3400
9 1455 1456
10 1685 1765
11 2587 2597
12 3040 3050
13 2767 2769
14 3300 3310
15 4001 4050
16 1918 2001
17 2889 2907
18 2063 2150
19 1591 1640
20 3578 3601
I plotted this data
plot(data$measured~data$fitted, ylab = expression("Measured Length (" * mu ~ "m)"),
xlab = expression("NIR Fitted Length (" * mu ~ "m)"), cex.lab=1.5, cex.axis=1.5)
and got the following:
As you can see, the axis scales are in micrometers; I need the axes to be in millimeters.
How can I plot the data with the axes in millimeters, WITHOUT creating a new variable?
Like this:
If I create a new variable, I will have to change all 2000 lines of code that I've already written, and that's not a road I want to go down! :|
Thanks much :)
I used @bdemarest's method for the plot and @IukeA's method for abline:
plot(y=data$measured/1000,x=data$fitted/1000, ylab = expression("Measured Length (mm)"),
xlab = expression("NIR Fitted Length (mm)"), cex.lab=1.5, cex.axis=1.5)
a = lm(I(data$measured/1000)~I(data$fitted/1000), data=data)
abline(a)
Here is the final plot:
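An alternative that leaves the plotted values untouched is to suppress the default axes and relabel the ticks. The following is only a sketch under assumptions: `data` here is a five-row stand-in built from the posted table, and `x_ticks` and `y_ticks` are names introduced for illustration.

```r
# Small stand-in for the question's data (first rows of the posted table)
data <- data.frame(fitted   = c(1866, 2489, 1486, 1682, 1393),
                   measured = c(1950, 2500, 1530, 1720, 1402))

# Plot in micrometres but suppress the default axes
plot(measured ~ fitted, data = data, xaxt = "n", yaxt = "n",
     xlab = "NIR Fitted Length (mm)", ylab = "Measured Length (mm)")

# Tick positions stay in micrometres; only the labels are divided by 1000
x_ticks <- pretty(data$fitted)
y_ticks <- pretty(data$measured)
axis(1, at = x_ticks, labels = x_ticks / 1000)
axis(2, at = y_ticks, labels = y_ticks / 1000)

# The regression line is fitted and drawn on the original scale
abline(lm(measured ~ fitted, data = data))
```

Because the coordinate system is still in micrometres, anything added later (such as `abline`) uses the original units, and only the tick labels read as millimetres.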

R: How to calculate accuracy in ETS of test set?

I am having a problem calculating the accuracy of ETS forecasts on a test set.
train_ts<- ts(head(t$value,141), frequency=7) # this is train set (first 141 rows)
fit=auto.arima(train_ts)
forecasts = forecast(fit,h=12)
vector = ts(tail(t$value,12),frequency=7) # this is test set (last 12 rows)
accuracy(forecasts, vector, test=NULL, d=NULL, D=NULL) # I try to calculate accuracy
And I have this error:
Error in window.default(x, ...) : 'start' cannot be after 'end'
In addition: Warning message:
In window.default(x, ...) : 'start' value not changed
Result of forecasting:
Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
191 4742.038402 3781.130910 5702.945894 3272.457210 6211.619593
192 5068.467231 4105.169285 6031.765177 3595.230155 6541.704307
193 5233.951079 4270.487205 6197.414954 3760.460238 6707.441921
194 4883.850503 3910.172814 5857.528191 3394.738981 6372.962025
195 4857.666612 3883.140593 5832.192631 3367.257681 6348.075543
196 5180.408585 4203.616284 6157.200886 3686.533674 6674.283496
197 5091.348011 4112.687519 6070.008503 3594.615948 6588.080074
198 4833.290365 3848.222297 5818.358433 3326.758761 6339.821969
199 5003.034291 4017.771775 5988.296807 3496.205304 6509.863278
200 5175.020752 4189.555595 6160.485908 3667.881854 6682.159650
201 4963.008654 3972.665298 5953.352010 3448.409193 6477.608114
202 4882.858876 3890.856391 5874.861360 3365.721997 6399.995754
vector:
Time Series:
Start = 1
End = 12
Frequency = 1
[1] 5243 5010 5374 4952 6911 4260 6063 5597 4536 5522 4254 5048
How can I fix my error or how can I calculate accuracy correctly?
Example data (t$value):
[1] 5564 6657 7184 6456 5597 5951 6771 5990 6289 6885 6171 4739 5737 5950 6721
[16] 6579 6763 6829 5779 5346 5652 6319 6407 7232 6600 6244 5631 5198 6360 7922
[31] 6035 4221 4361 4475 5585 4845 5958 6833 3617 5036 4560 3820 5724 6352 5773
[46] 6200 4378 5614 5165 6345 5769 6228 6378 4827 4402 5829 4880 6333 6406 434
[61] 4754 4303 5498 5048 6042 6664 5492 5684 6194 5349 5846 5916 5069 5071 4367
[76] 5381 5694 5731 6029 5639 5539 4490 5223 5436 5819 941 6576 5235 3574 6319
[91] 5063 5765 5919 6006 5479 3653 4281 5433 4851 5543 5995 5049 4728 5449 5728
[106] 6009 5378 5730 5206 4764 5458 5970 5254 5653 5539 1907 4438 5421 5529 5225
[121] 6158 5572 4777 4575 5275 4742 5648 5198 5624 4781 3959 4368 5478 4681 5288
[136] 5758 4540 3899 5760 4797 5580 5433 4898 4473 3566 4779 4897 5099 5866 6231
[151] 4982 4375 5976
Firstly, something seems off in the forecast output you posted; it starts at point 191 which means the fitted series ended at 190, but that doesn't seem right given the code you posted.
Regardless, DatamineR is correct in his comment. You are providing two time series with different ranges of time. The forecast function will pick up where the fitted time series left off, but when you use ts(tail(t$value,12),frequency=7) you are creating a new time series that starts at 1.
One option is to convert one (or both) into numeric vectors, as DatamineR suggested. Otherwise, you can set the start time of your test set to the correct value with something like:
vector = ts(tail(t$value,12),start=end(train_ts)+c(0, 1), frequency=7)
Here end(train_ts) gives the last time point of the training series, and adding c(0,1) moves one time step forward (within the same cycle) to give the start time of the test series.
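The alignment can be checked with base R alone. This is a minimal sketch that assumes a simulated vector `value` in place of `t$value`; `test_ts` is a name introduced here for the test series.

```r
set.seed(1)
value <- round(rnorm(153, 5500, 800))   # hypothetical stand-in for t$value

train_ts <- ts(head(value, 141), frequency = 7)

# Start the test set one observation after the training set ends
test_ts <- ts(tail(value, 12), start = end(train_ts) + c(0, 1), frequency = 7)

end(train_ts)     # last (cycle, position) of the training set
start(test_ts)    # first (cycle, position) of the test set

# The gap between the two series should be exactly one time step (1/7)
gap <- time(test_ts)[1] - time(train_ts)[141]
all.equal(gap, 1 / 7)
```

With the test series aligned this way, it can be passed as the second argument to accuracy() in place of the series that started at 1.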

Creating line graph using ggplot in r

Days Profit
4672 5195 79823.72824
4673 5196 79823.72824
4674 5197 79823.72824
4675 5198 79823.72824
4676 5199 79823.72824
4677 5200 79823.72824
4678 5201 79823.72824
4679 5203 77760.56168
4680 5204 77760.56168
4681 5205 77760.56168
4682 5206 77760.56168
4683 5207 77760.56168
4684 5212 85379.47144
4685 5213 85379.47144
4686 5214 85379.47144
4687 5215 85379.47144
4688 5216 85379.47144
Above is an example of the data frame that I created. I only posted a small chunk of it, as it has around 7000 rows. I am trying to create a line graph from the data using ggplot. The graphs that I see others post look very nice and professional, but mine does not. Below you can see how I used ggplot and the resulting graph.
df <- data.frame(Days = Day_value, Profit = PnL_value)
p <- ggplot(data=df, aes(x=Days, y=Profit)) + geom_line() + geom_point()
The graph below is the print of my entire data set and not just the select data I shared. One thing to notice is that my Days column does not always increment by 1.
I would ideally have my graph look like this one with only 1 line instead of 3.
Ideal Graph
When I check the Structure I get:
'data.frame': 6993 obs. of 2 variables:
$ Days : Factor w/ 6993 levels "100","1000","1001",..: 4180 4286 4397 4489 4598 4699 4810 4910 5008 5111 ...
$ Profit: num 0 0 0 0 0 0 0 0 0 0 ...
NULL
When I run it I also get this error:
geom_path: Each group consist of only one observation. Do you need to adjust the group aesthetic?
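The str() output shows that Days is a factor with one level per row, which is consistent with the geom_path message: each factor level forms its own group of one observation. A possible fix, sketched here on a hypothetical five-row miniature of the data frame, is to convert Days to numeric via character before plotting. The ggplot2 call is guarded in case the package is not installed.

```r
# Hypothetical miniature of the data frame, with Days stored as a factor
df <- data.frame(Days   = factor(c("5195", "5196", "5197", "5203", "5212")),
                 Profit = c(79823.73, 79823.73, 79823.73, 77760.56, 85379.47))

# as.numeric() on a factor returns the level codes, so go through character first
df$Days <- as.numeric(as.character(df$Days))
str(df)

# With a numeric x, geom_line() can join the points into a single line
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = Days, y = Profit)) + geom_line()
  print(p)
}
```

Keeping the factor and adding `group = 1` inside aes() would also draw a single line, but the x-axis would then be ordered by factor level rather than by numeric value.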

R oce - How to apply read.topo or as.topo to .xyz file from ETOPO1

I really like the oce package and would like to use plot.topo to make a map of the coastline and bathymetry of Eastern Canada and Northeastern United States. I am trying to make a dataset like "topoMaritimes", which is available and covers much of my study area, but I need Newfoundland/Labrador and New England, USA to be included as well.
I tried using the worldCoastline dataset but the resolution is too low.
I obtained a .xyz dataset from ETOPO1 from my study area (using the same source as the oce creator: http://www.ngdc.noaa.gov/mgg/gdas/gd_designagrid.html), which should work for me. I am now trying to read it into R using read.topo() or as.topo() but have run into problems.
I assumed read.topo() would work, given the info specified here: http://www.inside-r.org/packages/cran/oce/docs/read.topo. But when I use read.topo:
geo<-read.topo("ETOPO1data.xyz")
Error in if (is.character(file) && grep(".nc$", file)) { :
missing value where TRUE/FALSE needed
Apparently this error indicates that there are missing values/NAs?
When I try importing it as a generic text file, it works:
geo<-read.table(file="ETOPO1data.xyz", sep="")
head(geo)
V1 V2 V3
1 -73.000000, 56.000000, 331
2 -72.983333, 56.000000, 328
3 -72.966667, 56.000000, 327
4 -72.950000, 56.000000, 327
5 -72.933333, 56.000000, 325
6 -72.916667, 56.000000, 324
However, when I try to coerce this into a "topo" type of object, I get an error:
geo<-as.topo(geo$V1, geo$V2, geo$V3)
Error in Summary.factor(c(1261L, 1260L, 1259L, 1258L, 1257L, 1256L, 1255L, :
min not meaningful for factors
As it turns out, the longitude and latitude fields are factors:
str(geo)
'data.frame': 1211821 obs. of 3 variables:
$ V1: Factor w/ 1261 levels "-52.000000,",..: 1261 1260 1259 1258 1257 1256 1255 1254 1253 1252 ...
$ V2: Factor w/ 961 levels "40.000000,","40.016667,",..: 961 961 961 961 961 961 961 961 961 961 ...
$ V3: int 331 328 327 327 325 324 325 329 336 351 ...
However, converting the lats/lons to numeric type changes their values completely:
geo$V1<-as.numeric(geo$V1)
geo$V2<-as.numeric(geo$V2)
head(geo)
V1 V2 V3
1 1261 961 331
2 1260 961 328
3 1259 961 327
4 1258 961 327
5 1257 961 325
6 1256 961 324
Does anyone know how to convert an x/y/z (i.e. lon/lat/depth) file to a "topo" object? Could the issue be caused by my using ETOPO1 instead of ETOPO2?
Thanks in advance!
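The head() output suggests the file is comma-separated, so reading it with sep="" leaves a trailing comma on the first two fields and turns them into factors. The sketch below uses three hypothetical lines in the same layout; the column names lon, lat and z are assumptions. It covers only the step of getting numeric columns, not the conversion to a "topo" object.

```r
# Three hypothetical lines in the same "lon, lat, z" layout as the export
lines <- c("-73.000000, 56.000000, 331",
           "-72.983333, 56.000000, 328",
           "-72.966667, 56.000000, 327")

# Declaring the comma as the separator keeps all three columns numeric
geo <- read.table(text = lines, sep = ",", strip.white = TRUE,
                  col.names = c("lon", "lat", "z"))
str(geo)

# If a column has already been read as a factor with a trailing comma,
# strip the comma and convert via character; as.numeric() applied directly
# to a factor returns the level codes instead of the values
f   <- factor(c("-73.000000,", "-72.983333,"))
lon <- as.numeric(sub(",", "", as.character(f)))
lon
```

For the real file, the same read.table() call would take `file = "ETOPO1data.xyz"` in place of `text = lines`.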
Try using the marmap package.
library(marmap)
papoue <- getNOAA.bathy(lon1 = 140, lon2 = 155, lat1 = -13, lat2 = 0,
resolution = 10)
summary(papoue)
blues <- colorRampPalette(c("red","purple","blue","cadetblue1",
"white"))
plot(papoue, image = TRUE, bpal = blues(100))
You can convert between data types using as.xyz and as.bathy.

Multiple time series in one plot

I have several time series spanning several years that I need to plot in one graph. The largest series has a mean of 340, a minimum of 245 and a maximum of 900. The smallest series has a mean of 7, a minimum of -28 and a maximum of 31. The remaining series has values in the range of 6 to 700. The series follow a regular annual and seasonal pattern over the years until a sudden upsurge in temperature lasting a month, which was followed by many more deaths than usual.
I cannot provide any real data, but I have simulated the following data and tried the code below, which is based on example code found here: http://www.r-bloggers.com/multiple-y-axis-in-a-r-plot/. The plot does not show what I wanted. I have the following questions:
In the plot it is difficult to clearly depict any of the series and important facts are hidden in the detail. How can I better present this data?
The Y axes have different lengths. How can I get axes of the same length? I would appreciate any ideas and suggestions on how to improve this code and present a better plot. The simulated data does not fully reflect my data, as I was unable to simulate the extreme values from the period of extreme weather.
Many thanks
temp<- rnorm(365, 5, 10)
mort<- rnorm(365, 300, 45)
poll<- rpois(365, lambda=76)
date<-seq(as.Date('2011-01-01'),as.Date('2011-12-31'),by = 1)
df<-data.frame(date,mort,poll,temp)
windows(600,600)
par(mar=c(5, 12, 4, 4) + 0.1)
with(df, {
plot(date, mort, axes=F, ylim=c(170,max(mort)), xlab="", ylab="",type="l",col="black", main="")
points(date,mort,pch=20,col="black")
axis(2, ylim=c(170,max(mort)),col="black",lwd=2)
mtext(2,text="Mortality",line=2)
})
par(new=T)
plot(date, poll, axes=F, ylim=c(45,max(poll)), xlab="", ylab="",
type="l",col="red",lty=2, main="",lwd=1)
axis(2, ylim=c(45,max(poll)),lwd=1,line=3.5)
points(date, poll,pch=20)
mtext(2,text="PM10",line=5.5)
par(new=T)
plot(date, temp, axes=F, ylim=c(-28,max(temp)), xlab="", ylab="",
type="l",lty=3,col="brown", main="",lwd=1)
axis(2, ylim=c(-28,max(temp)),lwd=1,line=7)
points(date, temp,pch=20)
mtext(2,text="Temperature",line=9)
axis(1,pretty(range(date),10))
mtext("date",side=1,col="black",line=2)
Here are 6 approaches:
library(zoo)
z <- read.zoo(df)
# classic graphics in separate and single plots
plot(z)
plot(z, screen = 1)
# lattice graphics in separate and single plots
library(lattice)
xyplot(z)
xyplot(z, screen = 1)
# ggplot2 graphics in separate and single plots
library(ggplot2)
autoplot(z) + facet_free()
autoplot(z, facet = NULL)
I had the same task at hand, and after some research I came across the ts.plot {stats} function in R, which was very helpful.
The usage of the function is as follows:
ts.plot(..., gpars = list())
gpars is a list of graphical parameters that control the appearance of the plot.
I had data similar to this, stored in a variable called time:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
V3 1951 1100 433 5638 1760 2385 2602 11007 2490 421
V5 433 880 216 4988 220 8241 13229 18704 6289 421
V7 4001 440 433 3686 880 9976 12795 21036 13229 1263
V9 2385 1320 650 8241 440 12795 13229 19518 11711 1474
V11 4771 880 1084 6723 0 17783 17566 27326 11060 210
V13 6940 880 2168 2602 1320 21036 16265 10843 15831 1474
V15 3903 1760 1951 3470 0 18217 14964 0 13663 2465
V17 4771 440 2819 8458 880 25591 24940 1518 17783 1895
V19 7807 1760 5205 2385 0 14096 22771 13880 12578 1263
V21 5205 880 5205 6506 880 28410 18217 13229 19952 1474
V23 6506 1760 5638 7590 880 14747 26675 11928 12795 1474
V25 7373 440 5855 10626 0 19301 21470 15398 19952 1895
V27 5638 2640 6289 0 880 16482 20603 30796 14313 2316
V29 8241 440 6506 6723 880 11277 35784 25157 23205 4423
V31 7373 2640 6072 8891 220 17133 27109 31013 27287 4001
V33 6723 660 5855 14313 660 6940 26892 17566 24111 4844
V35 9325 2420 9325 12578 0 6506 30796 34483 23422 5476
V37 4771 440 6872 12361 880 9325 36218 25808 30362 4844
V39 9976 2640 7658 12361 440 11277 36001 31013 40555 4633
V41 10410 880 6506 12795 440 26241 33398 27976 24940 5686
V43 5638 2200 7590 14313 0 9976 34483 29928 33832 6108
V45 10843 440 8675 11711 440 7807 29278 24940 43375 4633
V47 8675 1760 8891 13663 0 9108 38386 31230 33398 4633
V49 10410 1760 9542 13880 440 8675 39051 31446 42507 5476
. . . . . . . . .
And I had to get a Time series plot for each column on the same plot.
The code is as follows:
ts.plot(time,gpars= list(col=rainbow(10)))
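For a self-contained version of the same idea, here is a sketch on simulated data. The matrix `m`, the object `series` and the legend labels are all hypothetical stand-ins for the data shown above.

```r
set.seed(42)
# Hypothetical stand-in: 24 time points (rows) by 10 series (columns)
m <- matrix(round(runif(240, 0, 40000)), nrow = 24, ncol = 10)

series <- ts(m)                    # each column becomes one series
cols   <- rainbow(ncol(series))    # one colour per series

# All columns are drawn on a common y-axis
ts.plot(series, gpars = list(col = cols, xlab = "Time index", ylab = "Value"))
legend("topleft", legend = paste("Series", seq_len(ncol(series))),
       col = cols, lty = 1, cex = 0.6, ncol = 2)
```

Because ts.plot uses one shared y-axis, it suits series of similar magnitude; series on very different scales will still be hard to read this way.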
I'd use a separate plot for each variable, each with its own y-axis. I like this better than introducing multiple y-axes in one plot. I will use ggplot2 to do this, and more specifically the concept of faceting:
library(ggplot2)
library(reshape2)
df_melt = melt(df, id.vars = 'date')
ggplot(df_melt, aes(x = date, y = value)) +
geom_line() +
facet_wrap(~ variable, scales = 'free_y', ncol = 1)
Notice that I stack the facets on top of each other. This lets you easily compare the timing of events in each of the series. Alternatively, you could put them next to each other (using nrow = 1 in facet_wrap), which makes it easier to compare the y-values.
We can also introduce some extremes:
df = within(df, {
temp[61:90] = temp[61:90] + runif(30, 30, 50)
mort[61:90] = mort[61:90] + runif(30, 300, 500)
})
df_melt = melt(df, id.vars = 'date')
ggplot(df_melt, aes(x = date, y = value)) +
geom_line() +
facet_wrap(~ variable, scales = 'free_y', ncol = 1)
Here you can see easily that the increase in temp is correlated with the increase in mortality.
