YYYYMMDD format - r

structure(list(date = c(20140717L, 20140611L, 20140611L, 20140704L,
20140411L, 20140906L, 20140512L, 20140717L, 20140819L, 20140415L,
20140812L, 20140403L, 20140424L, 20140818L, 20140922L, 20140625L,
20141006L, 20140918L, 20140811L, 20140819L, 20140602L, 20140626L,
20140729L, 20140624L, 20140909L, 20140705L, 20140920L, 20140515L,
20140531L, 20140628L, 20140822L, 20140508L, 20140809L, 20140627L,
20140727L, 20140711L, 20140714L, 20140710L, 20140403L, 20140525L,
20140428L, 20140501L, 20140915L, 20140510L, 20140601L, 20140921L,
20140815L, 20140610L, 20140418L, 20140812L, 20140614L, 20140814L,
20140626L, 20140412L, 20140912L, 20140514L, 20140919L, 20140706L,
20140411L, 20140711L, 20140624L, 20140430L, 20140521L, 20140418L,
20140713L, 20140424L, 20140601L, 20140923L, 20140406L, 20140905L,
20140613L, 20140412L, 20140407L, 20140402L, 20140813L, 20140903L,
20140827L, 20140521L, 20140524L, 20140404L, 20140419L, 20140412L,
20140902L, 20140623L, 20140925L, 20140528L, 20140731L, 20140513L,
20140821L, 20140703L, 20140724L, 20140818L, 20140801L, 20140628L,
20140801L, 20140521L, 20140906L, 20140725L, 20140522L, 20140927L,
20140615L, 20140920L, 20140813L, 20140815L, 20140924L, 20140614L,
20140912L, 20140710L, 20140807L, 20140501L, 20140420L, 20140630L,
20140704L, 20140401L, 20140605L, 20140928L, 20140806L, 20140614L,
20140907L, 20140704L, 20140403L, 20140804L, 20140603L, 20140728L,
20140919L, 20140731L, 20140426L, 20140930L, 20140502L, 20140827L,
20140815L, 20140628L, 20140902L, 20140616L, 20140613L, 20140726L,
20140721L, 20140425L, 20140715L, 20140607L, 20140913L, 20140621L,
20140708L, 20140427L, 20140506L, 20140425L, 20140411L, 20140615L,
20140713L, 20140424L, 20140406L, 20140711L, 20140415L, 20140909L,
20141004L, 20140725L, 20140602L, 20140405L, 20140525L, 20140605L,
20140521L, 20140506L, 20140414L, 20140916L, 20140512L, 20140830L,
20140722L, 20140711L, 20140628L, 20140613L, 20140618L, 20140719L,
20140416L, 20140727L, 20140521L, 20140718L, 20140814L, 20140515L,
20140501L, 20140725L, 20140507L, 20140619L, 20140525L, 20140609L,
20140614L, 20140402L, 20140914L, 20140517L, 20140826L, 20140602L,
20140920L, 20140718L, 20140915L, 20140715L, 20140708L, 20140419L,
20140819L, 20140501L, 20140807L, 20140404L)), .Names = "date", row.names = c(NA,
-200L), class = "data.frame")
In this data frame the dates are stored as integers. It is just one column of my data set; the original also has a variable called "total sales". I want to plot total sales (y-axis) against date (x-axis).
However, because the dates are treated as integers, the plot comes out badly. I want R to treat the date column as dates so that the plot improves.
How can I do this? Thank you very much.

You might have better luck with as.Date. If df is the data, then you can do
df$date <- as.Date(as.character(df$date), format = "%Y%m%d")
with(df, plot(date))

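If d is the date column (an integer vector such as df$date), you can also try strptime, which returns a POSIXlt date-time rather than a Date:
strptime(as.character(d), format = "%Y%m%d")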
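A minimal sketch of the whole workflow the question describes, using base R only. The five dates are taken from the start of the posted data; the `total_sales` values are invented for illustration, since the question does not show them.

```r
# Five dates from the posted data; total_sales is made up for illustration.
df <- data.frame(
  date        = c(20140717L, 20140611L, 20140611L, 20140704L, 20140411L),
  total_sales = c(120, 95, 101, 130, 80)
)

# Integers must go through character before as.Date() can parse them.
df$date <- as.Date(as.character(df$date), format = "%Y%m%d")

# Sort by date so that a line plot runs from left to right.
df <- df[order(df$date), ]

plot(df$date, df$total_sales, type = "b",
     xlab = "Date", ylab = "Total sales")
```

Once the column has class Date, plot() picks a date axis on its own, so no manual tick labelling should be needed.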

Related

Rolling Sample standard deviation in R

I want the standard deviation over a seven-row window: the 3 previous rows, the present row, and the 3 following rows.
This is my attempt:
mutate(ming_STDDEV_SAMP = zoo::rollapply(ming_f, list(c(-3:3)), sd, fill = 0)) %>%
Result
ming_f       ming_STDDEV_SAMP
4.235279667  0.222740262
4.265353     0.463348209
4.350810667  0.442607461
3.864739333  0.375839159
3.935632333  0.213821765
3.802632333  0.243294783
3.718387667  0.051625808
4.288542333  0.242010836
4.134689     0.198929941
3.799883667  0.112733475
This is what I expected:
ming_f       ming_STDDEV_SAMP
4.235279667  0.225532646
4.265353     0.212776157
4.350810667  0.23658801
3.864739333  0.253399417
3.935632333  0.26144862
3.802632333  0.246259684
3.718387667  0.20514358
4.288542333  0.208578409
4.134689     0.208615874
3.799883667  0.233948429
It doesn't match your output exactly, but perhaps this is what you need:
zoo::rollapply(quux$ming_f, 7, FUN=sd, partial=TRUE)
(It also works replacing 7 with list(-3:3).)
This expression isn't really different from your sample code, but the output is correct. Perhaps your original frame has a group_by still applied?
Data
quux <- structure(list(ming_f = c(4.235279667, 4.265353, 4.350810667, 3.864739333, 3.935632333, 3.802632333, 3.718387667, 4.288542333, 4.134689, 3.799883667), ming_STDDEV_SAMP = c(0.225532646, 0.212776157, 0.23658801, 0.253399417, 0.26144862, 0.246259684, 0.20514358, 0.208578409, 0.208615874, 0.233948429)), class = "data.frame", row.names = c(NA, -10L))
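To make explicit what a centered seven-wide window with clipped ends computes, here is a sketch in base R, without zoo. The helper name `roll_sd_centered` is hypothetical. It is meant to illustrate the same idea as rollapply with partial=TRUE, not to reproduce zoo's implementation.

```r
# ming_f values from the question.
x <- c(4.235279667, 4.265353, 4.350810667, 3.864739333, 3.935632333,
       3.802632333, 3.718387667, 4.288542333, 4.134689, 3.799883667)

# Centered rolling sample sd: up to k values on each side of position i.
# Near the ends the window is clipped to the data that exist.
roll_sd_centered <- function(x, k = 3) {
  n <- length(x)
  vapply(seq_len(n), function(i) {
    window <- x[max(1, i - k):min(n, i + k)]
    sd(window)
  }, numeric(1))
}

roll_sd <- roll_sd_centered(x, k = 3)
round(roll_sd, 4)
```

Row 4 is the first row with a full window (rows 1 to 7); rows 1 to 3 and 8 to 10 use shorter windows, which is why those values depend on how the ends are handled.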

Why is 'weeks' from specific date not calculated?

I have a sample data frame q (below) that contains three dates in q$test, stored as ddmmyy integers:
test
1 210376
2 141292
3 280280
I want to create a new covariate q$new that calculates the date difference from q$test to today.
I tried
q$new <- as.numeric(difftime(as.Date(q$test,format='%d/%m/%y'), as.Date(Sys.Date()), unit="weeks"))
But I receive an error message
Error in q$new <- as.numeric(difftime(as.Date(q$test, format =
"%d/%m/%y"), : object of type 'closure' is not subsettable
Do you have any idea what's wrong, or another solution?
q <- structure(list(test = c(210376L, 141292L, 280280L)), class = "data.frame", row.names = c(NA,
-3L))
You could do
as.numeric(difftime(Sys.Date(), as.Date(as.character(q$test), "%d%m%y"), units = "weeks"))
#[1] 2257.286 1384.143 2051.714
A few pointers:
1) Sys.Date() already returns an object of class "Date", so there is no need to wrap it in as.Date.
2) as.Date expects a character string as input, hence q$test is wrapped in as.character.
3) The format argument of as.Date describes the input, not the desired output. You used "%d/%m/%y", whereas the data are in the form "%d%m%y".
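A reproducible variant of the same calculation, as a sketch. Two things here are assumptions and not part of the original answer: the fixed reference date `ref`, chosen arbitrarily so the result does not change from day to day, and the zero-padding with sprintf, which guards against a day such as 01 losing its leading zero when stored as an integer. Note also that the reported error ("object of type 'closure' is not subsettable") is what R raises when q still refers to the base function q() and not to a data frame, so q has to be created before q$new is assigned.

```r
q <- data.frame(test = c(210376L, 141292L, 280280L))

# Pad to six digits so a day such as 01 keeps its leading zero.
parsed <- as.Date(sprintf("%06d", q$test), format = "%d%m%y")

# Fixed reference date (an arbitrary choice) instead of Sys.Date().
ref <- as.Date("2020-01-01")

q$new <- as.numeric(difftime(ref, parsed, units = "weeks"))
q
```

Swapping `ref` for Sys.Date() gives the difference to today, as in the answer.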

Building and analysing trends in time series

I need advice about building time series. I have a bunch of files with monthly sea surface temperature data for a number of locations across 408 months. I have aggregated the monthly values in a data frame with the following structure
longitude, latitude, SST for month 1, SST for month 2, .... SST for month n
This is just a small piece of the data frame so you can see
dput(sst_subset)
structure(list(lon = c(-19.875, -19.625, -19.375, -19.125), lat = c(30.125,
30.125, 30.125, 30.125), sst = c(293.197412803228, 293.092251515256,
292.999348291526, 293.013219258958), sst.1 = c(292.490350607051,
292.504279178168, 292.502850606771, 292.438922036772), sst.2 = c(291.994832184947,
291.887412832509, 291.832896704695, 291.810638640677), sst.3 = c(292.095993473008,
292.066660140331, 292.091993473098, 292.110326806021), sst.4 = c(293.071606354427,
293.095799902274, 293.106445063326, 293.116122482465), sst.5 = c(294.981993408501,
294.996326741514, 295.004660074661, 295.018993407674), sst.6 = c(295.568703072806,
295.600315975326, 295.597735330222, 295.49418694544), sst.7 = c(296.250961122073,
296.175154672154, 296.079348222683, 296.052251449095)), .Names = c("lon",
"lat", "sst", "sst.1", "sst.2", "sst.3", "sst.4", "sst.5", "sst.6",
"sst.7"), row.names = c(NA, 4L), class = "data.frame")
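To build a time series I extracted one row of the data frame, which holds all the monthly values for one location (defined by longitude and latitude), transposed it to a column and created a new data frame
ncolumnes <- ncol(sst_all)
# Row 1 is one location; columns 3 onward hold its monthly SST values
sst_point1 <- sst_all[1, 3:ncolumnes]
sst1_df <- as.data.frame(t(sst_point1))
# Monthly series starting January 1982 (matches the .Tsp attribute in the dput output)
sst1_ts <- ts(sst1_df, start = c(1982, 1), frequency = 12)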
dput(sst1_ts)
structure(c(293.197412803228, 292.490350607051, 291.994832184947,
292.095993473008, 293.071606354427, 294.981993408501, 295.568703072806,
296.250961122073, 296.73166003606, 296.385154667461, 294.611660083445,
293.484186990367, 292.372896692626, 291.348207775437, 291.627090257683,
291.957326809441, 292.71063862056, 293.545326773947, 295.897412742879,
296.671928854599, 296.681326703851, 296.483864342674, 294.934660076226,
293.76709020985, 292.45870314232, 291.399993488565, 291.446767681068,
291.918993476964, 292.889025713347, 293.71099343691, 294.01418697852,
296.219025638916, 296.90166003226, 296.119993383065, 294.936326742855,
293.405154734069, 291.834509607885, 291.638564911804, 291.527412840556,
292.055326807251, 292.020961216621, 294.573660084295, 295.850315969738,
295.978380483004, 296.863660033109, 297.228380455065, 296.00866005222,
294.711606317771, 293.067735386772, 291.577136341748, 291.426445100877,
291.602993484028, 292.42096120768, 293.742993436195, 294.709348253305,
295.973219192797, 296.913993365318, 296.213219187433, 294.494326752735,
293.59225150408, 292.492251528667, 291.838207764485, 292.225477341082,
292.385993466526, 294.063864396765, 295.407326732328, 295.98386435385,
297.471928836718, 297.880660010378, 297.070638523107, 294.419993421063,
293.154509578381, 292.307735403759, 291.263441767479, 291.197412847932,
292.566660129155, 293.590316020253, 294.627660083088, 295.085477277156,
296.166122414292, 296.608660038809, 296.143864350273, 294.568660084407,
293.292251510786, 292.269670888481, 291.425350630855, 291.424832197687,
291.351326822986, 292.945799905626, 296.319660045269, 297.158380456629,
297.712251411991, 297.68699334804, 296.391928860858, 294.519660085502,
292.856445068914, 291.953864443927, 291.813922050742, 291.561606388179,
291.680660148958, 293.242574092542, 294.903326743593, 295.748057907507,
297.715799799009, 298.00999334082, 297.161606263009, 295.690326726002,
294.133541814562, 292.727412813734, 292.312493468169, 291.931928960546,
291.646326816392, 291.639670902563, 293.339326778551, 295.357090174311,
297.108703038385, 298.576993328147, 296.577735308317, 295.347660066995,
293.425154733622, 292.446445078078, 291.951027959007, 291.967735411359,
291.957993476093, 292.77838055453, 294.320326756624, 295.738703069007,
296.466122407586, 296.747993369028, 296.3506385392, 294.958326742363,
293.579348278562, 292.182574116234, 291.279279205549, 291.659993482754,
291.872993477993, 292.670316040816, 294.635326749583, 295.305477272238,
296.348057894096, 297.221993358433, 296.08612241608, 294.042993429489,
292.95160635711, 292.009670894293, 291.243207777784, 290.859025758721,
291.319993490353, 292.587412816863, 294.628660083066, 294.788057928965,
296.454832085258, 296.454326708925, 296.265477250781, 295.604326727924,
294.013219236607, 293.043541838926, 292.523922034872, 292.038703151708,
292.477326797818, 294.406122453631, 295.478993397392, 296.886122398199,
297.362251419814, 297.879993343726, 296.978703041291, 295.939326720436,
293.980638592173, 293.048703129133, 291.979993475601, 291.462896712966,
292.266326802534, 293.046445064667, 294.074993428774, 295.435477269333,
296.886122398199, 297.262660024191, 296.517090148383, 295.193326737111,
293.43967086233, 292.486122496546, 292.043564902752, 291.806767673021,
292.480660131077, 293.707735372467, 295.127326738586, 295.877735323964,
296.78192885214, 297.788326679108, 297.02450949188, 295.75766005783,
294.890315991195, 293.371606347722, 292.426422037051, 292.379670886022,
292.746993458457, 293.078057967186, 294.512993418984, 295.54612242815,
296.109348222013, 297.133660027074, 296.816767561039, 295.519326729824,
294.220638586809, 292.947412808816, 291.781422051468, 291.450638648723,
292.118660139168, 293.846122466148, 294.885993410647, 295.964832096211,
297.745154637062, 298.001326674347, 297.287735292448, 295.068993406557,
293.324509574581, 291.593864451974, 291.534821071758, 291.633219289804,
292.017993474752, 292.164187019871, 293.516660107921, 295.506122429044,
296.33321918475, 297.117660027432, 296.34741273282, 294.993660074907,
293.8032192413, 293.077735386549, 292.511779178, 292.344832177124,
292.459326798221, 293.437412797864, 295.860326722202, 296.416444989342,
297.083864329263, 298.678993325867, 297.782251410427, 295.657993393391,
293.652251502739, 293.274186995061, 292.307136325432, 291.922251541408,
291.564993484877, 292.452574110199, 293.996326763866, 294.823219218502,
296.541283696229, 297.421660020637, 296.747735304518, 295.771993390843,
294.041928913384, 293.317090219908, 292.421422037163, 292.680316040593,
292.577660128909, 293.240316028076, 295.254993402399, 296.815477238487,
297.524186900066, 298.126326671553, 297.598380446795, 295.563326728841,
294.207735361291, 293.43805795914, 293.115855519178, 292.753864426046,
292.466993464716, 292.925154744798, 296.035326718291, 296.538380470487,
298.612573972513, 298.241993335634, 297.065154652261, 295.770993390866,
293.72934827521, 292.379670886022, 291.370350632085, 291.601928967922,
292.473326797908, 293.597412794288, 294.678993415274, 296.042896610595,
297.383541741919, 297.729326680427, 296.714186918171, 295.008993407898,
293.465154732728, 292.365154757315, 292.279993468896, 291.722896707154,
292.651993460581, 293.469670861659, 295.145993404835, 296.262896605677,
297.257090131842, 297.550326684428, 297.544832060895, 296.194326714737,
294.499670838637, 293.095799902274, 292.836064885038, 292.445799916802,
292.78566012426, 293.216445060867, 294.3869934218, 295.256767595908,
296.333864346026, 296.692993370257, 296.250315960797, 295.23466006952,
293.713864404588, 292.874187004001, 292.378614156346, 291.931606379908,
292.099326806267, 293.999348269175, 295.055660073521, 296.170638543223,
296.729670788792, 297.024993362837, 296.646444984201, 294.817993412167,
293.368057960704, 292.39579991792, 291.174279207896, 291.343541876924,
291.974660142387, 292.742574103717, 294.785993412882, 296.685477241393,
297.067735297365, 297.318326689613, 297.265154647791, 296.419993376359,
294.439993420616, 293.224509576816, 293.140707735371, 292.928057970539,
293.028326785502, 293.116767643741, 294.067993428931, 295.034832116997,
296.24192886421, 297.204660025487, 297.0212836855, 295.618993394263,
294.195477297049, 293.26644505975, 292.1507077575, 291.842574123834,
292.212326803741, 292.898380551848, 293.698660103853, 294.868057927177,
296.104832093081, 297.440660020212, 296.802574012969, 295.234993402846,
293.692574082483, 292.617090235554, 291.535510726915, 291.344832199475,
292.175660137894, 293.799025693007, 295.795993390307, 296.195799832983,
297.432573998888, 298.643659993323, 297.612251414226, 296.027326718469,
294.692896640769, 293.446122475089, 292.611779175765, 292.494832173771,
293.027326785525, 293.948380528378, 294.144326760558, 295.259670821649,
296.524509503055, 297.014660029734, 296.854832076317, 295.413326732193,
294.306122455866, 292.857735391466, 291.982493475545, 291.549025743299,
292.710993459262, 293.044832161478, 294.210660092408, 296.063864352061,
296.959993364289, 298.161660004097, 297.040315943139, 295.179326737424,
293.474509571228, 292.265799920826, 291.409993488342, 291.042574141715,
291.81732681257, 293.374186992826, 294.908993410133, 296.215799832536,
297.686767541593, 298.667326659461, 297.63999334909, 295.589993394911,
294.077412783559), .Dim = c(408L, 1L), .Dimnames = list(NULL,
"1"), .Tsp = c(1982, 2015.91666666667, 12), class = "ts")
I then decompose it into its additive trend, seasonal and random components, and remove the seasonal component from the original data:
sst1_dec<-decompose(sst1_ts)
sst1_noseason<-sst1_ts - sst1_dec$seasonal
Now, how do I get a linear regression for this data (sst1_noseason)? I have tried lm(), but since there is only a single variable in the data frame I think I can't. Should I build a new date column (time) with monthly dates and then run lm(sst ~ time)?
Is there another R package for time series that can do this better? I have looked at ggseas and tidyr; they seem promising, but maybe I need to build that date column to run this analysis in any case.
My final objective is to have a single value for the trend in each longitude and latitude point and plot a map to look for the areas with the highest climatic trend for sea surface temperature.
Maybe there is a better procedure and you could point me to another R package running spatio-temporal analysis. Any help would be appreciated.
Thanks in advance for your help
I am not a fan of specialised classes in R, since they are usually less intuitive and require additional vocabulary to deal with. Here's an attempt to convert the time series you made into a data.frame, using the zoo package:
library(zoo)
df1 <- data.frame(zoo(sst1_ts), time=as.yearmon(time(sst1_ts)))
df1$jday <- as.Date(df1$time)
(fit1<-lm(X1 ~ jday, df1))
Call:
lm(formula = X1 ~ jday, data = df1)
Coefficients:
(Intercept) jday
2.937e+02 6.025e-05
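Because jday is a Date, the slope above is in SST units per day; multiplied by 365.25 it comes to roughly 0.022 per year. As an alternative that avoids the date column, here is a sketch that regresses the deseasonalised series directly on its own time index, which is in decimal years. The series is synthetic (a known trend of 0.02 per year plus a seasonal cycle and noise) and all object names are hypothetical.

```r
# Synthetic stand-in for one grid cell: 408 monthly values from Jan 1982,
# built with a known trend of 0.02 per year (an invented value).
set.seed(1)
n   <- 408
yrs <- 1982 + (seq_len(n) - 1) / 12
sst <- 293 + 0.02 * (yrs - 1982) + 2.5 * sin(2 * pi * yrs) + rnorm(n, sd = 0.3)
sst_ts <- ts(sst, start = c(1982, 1), frequency = 12)

# Remove the additive seasonal component, as in the question.
dec      <- decompose(sst_ts)
noseason <- sst_ts - dec$seasonal

# time() returns decimal years, so the slope is per year.
fit <- lm(noseason ~ time(noseason))
slope_per_year   <- unname(coef(fit)[2])
slope_per_decade <- 10 * slope_per_year
slope_per_year
```

With the real data, `noseason` would simply be sst1_noseason from the question.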
Plotting is more intuitive with a data.frame as well:
library(ggplot2)
base <- ggplot(df1, aes(jday, X1)) + geom_line() + stat_smooth(method="lm")
p<-base + scale_x_date(date_labels = "%Y")
You can further use an interactive package such as plotly; ggplotly converts the ggplot object into a plot you can navigate.
library(plotly)
ggplotly(p)
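For the stated goal of one trend value per longitude/latitude point, the same regression can be applied row by row. The sketch below uses a synthetic stand-in for sst_all with four grid cells and invented trends; the helper `trend_per_year` and the other names are hypothetical.

```r
set.seed(42)
n_months <- 408
tt <- 1982 + (seq_len(n_months) - 1) / 12   # decimal years, Jan 1982 onward

# Stand-in for the wide data: 4 grid cells, one row per cell, one column per month.
cells <- data.frame(lon = c(-19.875, -19.625, -19.375, -19.125), lat = 30.125)
true_slopes <- c(0.01, 0.02, 0.03, 0.04)    # invented trends per year
sst_mat <- t(sapply(true_slopes, function(b)
  293 + b * (tt - 1982) + 2.5 * sin(2 * pi * tt) + rnorm(n_months, sd = 0.3)))

# Deseasonalise one series and return its linear trend per year.
trend_per_year <- function(y) {
  y_ts <- ts(y, start = c(1982, 1), frequency = 12)
  y_ns <- y_ts - decompose(y_ts)$seasonal
  unname(coef(lm(y_ns ~ time(y_ns)))[2])
}

cells$trend <- apply(sst_mat, 1, trend_per_year)
cells
```

With the real data frame, something like apply(as.matrix(sst_all[, -(1:2)]), 1, trend_per_year) should give one trend per row, which can then be mapped by lon and lat.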

How do you perform subtraction for times vectors in R

I am trying to subtract a time value from a threshold value. Both are of class times from the chron library.
section 1
dput(time)
structure(0.685162037037037, format = "h:m:s", class = "times")
dput(threshold)
structure(0.753472222222222, format = "h:m:s", class = "times")
When threshold is greater than time, it works as follows:
threshold-time
[1] 01:38:22
section 2
But when time is greater than threshold, I get a fractional number instead:
dput(time)
structure(0.83318287037037, format = "h:m:s", class = "times")
dput(threshold)
structure(0.753472222222222, format = "h:m:s", class = "times")
threshold-time
[1] -0.07971065
I need to get the same type of result as in section 1. Any ideas how this could work? Is there a way to format the result in section 2 as %H:%M:%S?
We can wrap with abs and get the %H:%M:%S format
abs(threshold-time)
#[1] 01:54:47
threshold-time
#[1] -0.07971065
If we do the reverse
time-threshold
#[1] 01:54:47
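Note that abs discards the sign. If the direction of the difference matters, one option is to format the signed fraction of a day by hand. The sketch below uses plain numbers in place of chron times objects (a times value is a fraction of a day), and the helper `format_signed_hms` is hypothetical, not part of chron.

```r
# Fractions of a day, copied from the dput output in section 2.
threshold <- 0.753472222222222   # 18:05:00
time_val  <- 0.83318287037037    # 19:59:47

# Format a signed fraction of a day as [-]HH:MM:SS.
format_signed_hms <- function(x) {
  secs <- as.integer(round(abs(x) * 86400))
  sprintf("%s%02d:%02d:%02d",
          ifelse(x < 0, "-", ""),
          secs %/% 3600L, (secs %% 3600L) %/% 60L, secs %% 60L)
}

format_signed_hms(threshold - time_val)   # "-01:54:47"
format_signed_hms(time_val - threshold)   # "01:54:47"
```

With real chron objects, wrapping the difference in as.numeric() before passing it in should be enough.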

Time series data in R, problems with dates

Date T1V T2V T3V T1MV T2MV T3MV
1997-12-31 2.631202 2.201695 -0.660092 -0.77492483 0.282662305 4.66506798
1998-01-30 2.193793 3.763458 5.565432 3.50711734 2.874381814 5.14118430
1998-02-27 5.173496 8.727646 6.333820 2.59892279 8.363146480 9.27289259
This is the table I am working with in R. It is much bigger; the data are monthly up until 2014. The different columns are just the returns on different portfolios. I always get errors when I try to use it as time series data. I downloaded the PerformanceAnalytics package. For example, the SharpeRatio function gives me:
> SharpeRatio(T1V)
Error in checkData(R) :
The data cannot be converted into a time series. If you are trying to passin names from a data object with one column, you should use the form 'data[rows, columns, drop = FALSE]'. Rownames should have standard date formats, such as '1985-03-15'.
When you look at the Date column in the table, you see that the dates are in exactly this format.
I have tried a hundred things. It also doesn't let me plot the charts with lines, only with points.
Any help is much appreciated.
> dput(FactorR[1:5,])
structure(list(Date = structure(1:5, .Label = c("1997-12-31",
"1998-01-30", "1998-02-27", "1998-03-31", "1998-04-30", "1998-05-29",
"1998-06-30", "1998-07-31", "1998-08-31", "1998-09-30", "1998-10-30",
"1998-11-30", "1998-12-31", "1999-01-29", "1999-02-26", "1999-03-31",
"1999-04-30", "1999-05-31", "1999-06-30", "1999-07-30", "1999-08-31",
"1999-09-30", "1999-10-29", "1999-11-30", "1999-12-31", "2000-01-31",
"2000-02-29", "2000-03-31", "2000-04-28", "2000-05-31", "2000-06-30",
"2000-07-31", "2000-08-31", "2000-09-29", "2000-10-31", "2000-11-30",
"2000-12-29", "2001-01-31", "2001-02-28", "2001-03-30", "2001-04-30",
.
.
.
, class = "factor"),
T1V = c(2.631202, 2.193793, 5.173496, 8.033864, 1.369065),
T2V = c(2.201695, 3.763458, 8.727646, 11.375482, 3.097196
), T3V = c(-0.660092, 5.565432, 6.33382, 20.608638, 4.022475
), T1MV = c(-0.774924835, 3.507117337, 2.598922792, 16.26945887,
4.544096701), T2MV = c(0.282662305, 2.874381814, 8.36314648,
12.7091841, 1.078742371), T3MV = c(4.665067984, 5.141184302,
9.27289259, 10.62133318, 2.791853987), T1BTM = c(0.617378168,
3.498582776, 3.332624722, 8.802164975, 1.366229683), T2BTM = c(1.101407825,
5.578394125, 8.910685728, 20.05317039, 1.258609942), T3BTM = c(2.454019461,
2.445706552, 7.991651412, 10.79096755, 5.464002646), T1MOM = c(2.99986853,
4.982808153, 8.657010689, 10.60637296, 4.44333707), T2MOM = c(0.011102554,
3.184165606, 7.55229158, 11.9341773, 0.328377299), T3MOM = c(1.161834369,
3.355709694, 4.025659592, 17.12665788, 3.55822744), Rm = c(1.390935,
3.840895, 6.744987, 13.262647, 2.753486), SMB = c(-5.439992819,
-1.634066965, -6.673969798, 5.648125694, 1.752242715), HML = c(-1.836641293,
1.052876225, -4.65902669, -1.988802574, -4.097772963), MOM = c(1.838034161,
1.62709846, 4.631351096, -6.520284921, 0.885109629)), .Names = c("Date",
"T1V", "T2V", "T3V", "T1MV", "T2MV", "T3MV", "T1BTM", "T2BTM",
"T3BTM", "T1MOM", "T2MOM", "T3MOM", "Rm", "SMB", "HML", "MOM"
), row.names = c(NA, 5L), class = "data.frame")
Two things are wrong:
1) Your Date column doesn't contain dates but factors.
2) SharpeRatio doesn't know how to convert your data.frame to a time series object.
By doing the conversion manually, we can specify which column to use as time index and on-the-fly convert it to Date:
library(PerformanceAnalytics)
FactorR_xts <- xts(x = FactorR[, -1], # use all columns except for first column (date) as data
order.by = as.Date(FactorR$Date) # Convert Date column from factor to Date and use as time index
)
SharpeRatio(FactorR_xts)
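The factor issue can be seen in isolation with base R. This sketch rebuilds the first three rows of the posted data with Date stored as a factor, as in the question; only the T1V column is kept, for brevity.

```r
# First rows of the posted data, with Date as a factor.
FactorR <- data.frame(
  Date = factor(c("1997-12-31", "1998-01-30", "1998-02-27")),
  T1V  = c(2.631202, 2.193793, 5.173496)
)

# as.numeric() on a factor returns the internal level codes (1, 2, 3), not dates.
as.numeric(FactorR$Date)

# Going through character yields real Date values.
FactorR$Date <- as.Date(as.character(FactorR$Date), format = "%Y-%m-%d")
str(FactorR)
```

Once the column is a Date, plot(FactorR$Date, FactorR$T1V, type = "l") should draw lines, which plotting against a factor does not.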
