Add exponential fit to time series in ggplot - r

I've searched around StackOverflow for the issue I'm facing and can't quite find something similar.
I'm working with a large time series, with a portion of the dataset below. With that, I'm trying to find a way to add an exponential fit to it using ggplot. Others have used geom_smooth(method = "lm", formula = (y ~ exp(x))) but that doesn't work with time series data or POSIXct class variables and returns the error "Computation failed in stat_smooth(): NA/NaN/Inf in 'x'". Previously, I simply used method = "loess", span = 0.1, but it doesn't capture the nature of the data very well.
Any help you could provide would be greatly appreciated!
data<-structure(list(avg_time = structure(c(1551420000, 1551506400,
1551592800, 1551679200, 1551765600, 1551852000, 1551938400, 1552024800,
1552111200, 1552197600, 1552280400, 1552366800, 1552453200, 1552539600,
1552626000, 1552712400, 1552798800, 1552885200, 1552971600, 1553058000,
1553144400, 1553230800, 1553317200, 1553403600, 1553490000, 1553576400,
1553662800, 1553749200, 1553835600, 1553922000, 1554008400, 1554094800,
1554181200, 1554267600, 1554354000, 1554440400, 1554526800, 1554613200,
1554699600, 1554786000, 1554872400, 1554958800, 1555045200, 1555131600,
1555218000, 1555304400, 1555390800, 1555477200, 1555563600, 1555650000,
1555736400, 1555822800, 1555909200, 1555995600, 1556082000, 1556168400,
1556254800, 1556341200, 1556427600, 1556514000, 1556600400, 1556686800,
1556773200, 1556859600, 1556946000, 1557032400, 1557118800, 1557205200,
1557291600, 1557378000, 1557464400, 1557550800, 1557637200, 1557723600,
1557810000, 1557896400, 1557982800, 1558069200, 1558155600, 1558242000,
1558328400, 1558414800, 1558501200, 1558587600, 1558674000, 1558760400,
1558846800, 1558933200, 1559019600, 1559106000, 1559192400, 1559278800,
1559365200, 1559451600, 1559538000, 1559624400, 1559710800), tzone = "", class = c("POSIXct",
"POSIXt")), ChlaMed = c(7.49786224129294, 6.33265484668835, 8.02891354394607,
8.36583527788548, 7.21848200004542, 3.87836804380364, 6.12041645730209,
6.11129053757413, 3.82314913061958, 6.66935722139803, 10.5846145945807,
1.3922819262622, 2.46397555374784, 3.5387541991258, 9.4377648342203,
3.8359888625491, 9.92938437268906, 9.84931346445947, 7.61136832417625,
10.422317215878, 9.92795625389519, 10.2145441518957, 9.87188069822321,
6.75768698400432, 7.50045495545547, 7.3979513362914, 12.0524471187313,
11.0031790178811, 9.23929610466274, 12.2253404703908, 10.8260865574934,
5.79312487695101, 7.86859910828088, 13.9784098169617, 13.3707820039944,
8.11038273190177, 13.852156279962, 6.94197529427832, 10.1752314872054,
10.3435349795235, 14.4105077850521, 12.3100928225917, 11.4965118440029,
13.5176883961026, 10.4577799463301, 11.8074169933709, 13.245655700942,
13.5716513275785, 14.0549071116729, 14.6034112846714, 13.8998981372714,
11.0290734663967, 12.7725741301044, 14.0037640681163, 12.99276716795,
12.9177278644427, 15.6103759408624, 11.4159351143177, 14.7053508114725,
14.3380030612979, 14.846661975045, 14.1918024501013, 14.1478311220769,
15.4169566103641, 14.1251696199414, 13.4057098254015, 15.0936022765442,
14.94796281727, 11.9943525040373, 15.6886181916423, 15.7057435474498,
16.1855936444667, 17.4195546581076, 16.977113306558, 16.4826655395595,
14.273959862613, 18.6570604979906, 15.2969835201503, 15.6502935625097,
16.4619111787213, 17.8995674961064, 16.9938925321631, 17.409705465615,
19.7838080835222, 18.7386731671602, 19.6515930205419, 20.4308399460097,
18.787235170191, 18.758368516805, 19.2927499812326, 19.4763785903839,
20.4249755976496, 19.0471858942877, 20.0134726662527, 20.9237871993584,
20.0967875761179, 20.7116516016657)), row.names = c(NA, -97L), class = c("tbl_df",
"tbl", "data.frame"))

You could use nls() to get an exponential fit, make predictions, and plot those in addition to the raw data points:
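library(dplyr)
library(ggplot2)
data %>%
  mutate(
    # days elapsed since the first observation
    d = as.numeric(difftime(as.Date(avg_time), min(as.Date(avg_time)), units = "days")),
    # fitted values of the exponential model ChlaMed = a * exp(r * d)
    preds = predict(nls(ChlaMed ~ a * exp(r * d), start = list(a = 0.5, r = 0.1), data = data))
  ) %>%
  ggplot(aes(x = avg_time)) +
  geom_point(aes(y = ChlaMed)) +
  geom_line(aes(y = preds), color = "red", linewidth = 1.5)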
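nls() can be sensitive to its starting values. A minimal sketch of one common way to pick them, assuming the response is strictly positive: fit a straight line to log(y) first, then refine on the original scale. The series below is synthetic and only for illustration; d, y, init, start_vals and fit are hypothetical names, not part of the question's data.

```r
# Synthetic daily series (an assumption for illustration, not the question's data)
set.seed(1)
d <- 0:96
y <- 6 * exp(0.012 * d) + rnorm(length(d), sd = 1)

# A log-linear fit gives rough starting values: log(y) = log(a) + r * d
init <- lm(log(y) ~ d)
start_vals <- list(
  a = unname(exp(coef(init)[1])),
  r = unname(coef(init)[2])
)

# Refine on the original scale with nonlinear least squares
fit <- nls(y ~ a * exp(r * d), start = start_vals)
coef(fit)
```

The same idea should carry over to the ChlaMed data by replacing y with ChlaMed and d with the elapsed-days column computed above.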

You can also try the timetk package, using the natural log of the response variable.
library(timetk)
data %>%
plot_time_series_regression(
.date_var = avg_time,
.formula = log(ChlaMed) ~ avg_time,
.interactive =FALSE
)

Related

Tidysynth error -- Please specify only one treated unit

I am trying to calculate a Synthetic control using the tidysynth package. I'm fairly new to the package and the data but here is my code:
#Import data
synth <- read.csv("https://raw.githubusercontent.com/FDobkin/coal_paper/main/synth_data.csv")
#Convert year to date
synth$year <- strptime(synth$year, format = "%Y")
synth_out <- synth %>%
synthetic_control(
outcome=saleprice,
unit=fips,
time=year,
i_unit=47145,
i_time=2009-10-19,
generate_placebos=T
) %>%
generate_predictor(
time_window=2000-10-19:2009-10-19,
population = pop,
white = white_p,
age = age65p_p,
rucc = rucc_code,
income = median_income,
unemploy = unemprate,
laborforce = lfrate
) %>%
generate_weights(optimization_window = 2000-10-19:2009-10-19, # time to use in the optimization task
margin_ipop = .02,sigf_ipop = 7,bound_ipop = 6 # optimizer options
) %>%
generate_control()
The error is:
Error in synth_method(treatment_unit_covariates = treatment_unit_covariates, :
Please specify only one treated unit.
The error seems to be coming from the generate_weights() statement. I am specifying the specific county that is receiving the treatment in the synthetic_control() statement. What is the error telling me is wrong?

Use lag(x,1) or lag(x,-1) for dynamic regression?

I have a simple yet somehow confusing question about dynamic regressions and lagged independent variables. I have 3 time series and I want to study the effect of 3 independent variables (namely PSVI, NSVI, and BTC_Ret) from the previous week on the current week's bitcoin log returns. I want to analyse, for example, whether a negative change in PSVI (Positive Sentiment Index) from the previous week can tell us something about the direction of the BTC returns in the following week.
I came across the lag function, which can do exactly that.
If I understand the function correctly, I would use the lag function in combination with the dyn$lm function from the package dyn to get the results I want.
My code would then look as follows:
test1 <- dyn$lm(BTC_Ret~lag(PSVI,1)+lag(NSVI,1)+lag(BTC_Ret,1))
summary(test1)
Am I right to assume that I need to use lag(x,1) and not lag(x,-1)?
And should I use dyn$lm to study the effect or is there a better way to do all of this?
My data looks as follows:
structure(c(0.151825062532955, -0.179352391776254, -0.171610266403897,
0.0159227765884022, -0.353420091085592, -0.0179223189753976,
0.260710954985742, -0.0878045204765083, 0.17494222283881, -0.183889954532262,
-0.15249960475038, 0.0325479482522972, -0.216135243885031, 0.0258548317723122,
0.170469815313808, 0.0552681180119521, 0.0676987678252168, 0.0247151614282206,
-0.101373110320685, -0.0244444101458825, -0.363995910827583,
-0.819549195465083, -0.311532754839479, -0.661660753934884, -0.036159476713393,
-0.0116417252109642, -0.219357256430676, -0.386169350367107,
-0.468384245564164, 0.226420789220966, -0.2366560332375, 0.2425676656972,
-0.351430535471613, -0.287492079068963, 0.548071569094531, -0.228973857164721,
-0.139490538928287, 0.247548840497568, -0.361502742177194, 0.0604938285432965,
0.619445016304069, 0.0947076213861557, -0.887137767470338, 0.0485516007581502,
0.0429273907756451, -0.701341407090506, 0.34191134646093, -0.428167056300805,
-0.298917079322128, 0.517537828051947, 0.0474069010338689, -0.118044838446349,
-0.414289228784203, 0.143198527419672, 0.0733053148180489, 0.0131259707878403,
-0.106103445964187, 0.107827719520595, -0.604074345624302, 0.444400965939648
), .Dim = c(20L, 3L), .Dimnames = list(NULL, c("BTC_Ret", "PSVI",
"NSVI")), .Tsp = c(2018, 2018.36538461538, 52), class = c("mts",
"ts", "matrix"))
Many thanks!
Assuming tt is defined as in the Note at the end (copied from the question), use lag(x, -1), as in the code that follows.
R's own lag is designed for the ts class. A negative k shifts the series forward in time, so lag(x, -1) lines up the previous value with the current row. There is more information in ?lag.
Do not use dplyr's lag: it does not work with the ts class and uses the opposite sign convention. If you need dplyr loaded, use library(dplyr, exclude = c("filter", "lag")) to ensure that you are calling R's lag.
library(dyn)
test1 <- dyn$lm(BTC_Ret ~ lag(PSVI,-1) + lag(NSVI,-1) + lag(BTC_Ret,-1), tt)
These alternatives also work:
Lag <- function(x, k = 1) lag(x, -k)
test2 <- dyn$lm(BTC_Ret ~ Lag(PSVI) + Lag(NSVI) + Lag(BTC_Ret), tt)
test3 <- dyn$lm(BTC_Ret ~ lag(tt, -1), tt)
Note
tt <- structure(c(0.151825062532955, -0.179352391776254, -0.171610266403897, 0.0159227765884022, -0.353420091085592, -0.0179223189753976, 0.260710954985742, -0.0878045204765083, 0.17494222283881, -0.183889954532262, -0.15249960475038, 0.0325479482522972, -0.216135243885031, 0.0258548317723122, 0.170469815313808, 0.0552681180119521, 0.0676987678252168, 0.0247151614282206, -0.101373110320685, -0.0244444101458825, -0.363995910827583, -0.819549195465083, -0.311532754839479, -0.661660753934884, -0.036159476713393, -0.0116417252109642, -0.219357256430676, -0.386169350367107, -0.468384245564164, 0.226420789220966, -0.2366560332375, 0.2425676656972, -0.351430535471613, -0.287492079068963, 0.548071569094531, -0.228973857164721, -0.139490538928287, 0.247548840497568, -0.361502742177194, 0.0604938285432965, 0.619445016304069, 0.0947076213861557, -0.887137767470338, 0.0485516007581502, 0.0429273907756451, -0.701341407090506, 0.34191134646093, -0.428167056300805, -0.298917079322128, 0.517537828051947, 0.0474069010338689, -0.118044838446349, -0.414289228784203, 0.143198527419672, 0.0733053148180489, 0.0131259707878403, -0.106103445964187, 0.107827719520595, -0.604074345624302, 0.444400965939648 ), .Dim = c(20L, 3L), .Dimnames = list(NULL, c("BTC_Ret", "PSVI", "NSVI")), .Tsp = c(2018, 2018.36538461538, 52), class = c("mts", "ts", "matrix"))
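The sign convention of R's lag can be checked on a tiny series. This is a minimal sketch using only base R; x and aligned are hypothetical names introduced for illustration.

```r
# A short toy series observed at times 1, 2, 3, 4
x <- ts(c(10, 20, 30, 40), start = 1)

# stats::lag(x, -1) shifts the time base forward by one step, so at any
# given time the shifted series holds the value from the step before
aligned <- ts.union(current = x, previous = stats::lag(x, -1))
aligned
```

At time 2 the current value is 20 and the previous value is 10, which is the alignment a regression on last week's values needs.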

Error in if (class(x) == "numeric") { : the condition has length > 1

I'm trying to visualise data of the following form:
date volaEUROSTOXX volaSA volaKENYA25 volaNAM volaNIGERIA
1 10feb2012 0.29844454 0.1675901 0.007862087 0.12084170 0.10247617
2 17feb2012 0.31811157 0.2260064 0.157017220 0.33648935 0.22584127
3 24feb2012 0.30013672 0.1039974 0.083863921 0.11694768 0.16388161
To do so, I first converted the date (stored as a character in the original data frame) into Date format, which works just fine:
vola$date <- as.Date(vola$date)
str(vola$date)
Date[1:543], format: "2012-02-10" "2012-02-17" "2012-02-24" "2012-03-02" "2012-03-09"
However, if I now try to graph my data by using the chart.TimeSeries command, I get the following:
chart.TimeSeries(volatility_annul_stringdate,lwd=2,auto.grid=F,ylab="Annualized Log Volatility",xlab="Time",
main="Log Volatility",lty=1,
legend.loc="topright")
Error in if (class(x) == "numeric") { : the condition has length > 1
I tried:
Converting my date variable (in the date format) further into a time series object:
vola$date <- ts(vola$date, frequency=52, start=c(2012,9)) #returned same error from above
Converting the whole data set using the xts command:
vol.xts <- xts(vola, order.by= vola$date, unique = TRUE ) # which then returned:
order.by requires an appropriate time-based object
#even though date is a time-series
What am I doing wrong? I am rather new to RStudio. I really want to use the chart.TimeSeries command. Can someone help me?
Thanks in advance!
My MRE:
library(PerformanceAnalytics)
vola <- structure(list(date_2 = c("2012-02-10", "2012-02-17", "2012-02-24",
"2012-03-02"), volaEUROSTOXX = c(0.298444539308548, 0.318111568689346,
0.300136715173721, 0.299697518348694), volaKENYA25 = c(0.00786208733916283,
0.157017216086388, 0.0838639214634895, 0.152377054095268), volaNAM = c(0.120841704308987,
0.336489349603653, 0.116947680711746, 0.157027021050453), volaNIGERIA = c(0.102476172149181,
0.225841268897057, 0.163881614804268, 0.317349642515182), volaSA = c(0.167590111494064,
0.226006388664246, 0.103997424244881, 0.193037077784538), date = structure(c(1328832000,
1329436800, 1330041600, 1330646400), tzone = "UTC", class = c("POSIXct",
"POSIXt"))), row.names = c(NA, -4L), class = c("tbl_df", "tbl",
"data.frame"))
vola <- subset(vola, select = -c(date))
vola$date_2 <- as.Date(vola$date_2)
chart.TimeSeries(vola,lwd=2,auto.grid=F,ylab="Annualized Log Volatility",xlab="Time",
main="Log Volatility",lty=1,
legend.loc="topright")
#This returns the above mentioned error message.
#Thus, I tried the following:
vola$date_2 <- ts(vola$date_2, frequency=52, start=c(2012,9))
chart.TimeSeries(vola,lwd=2,auto.grid=F,ylab="Annualized Log Volatility",xlab="Time",
main="Log Volatility",lty=1,
legend.loc="topright")
#Which returned a different error (as described above)
#And I tried:
vol.xts <- xts(vola, order.by= vola$date_2, unique = TRUE )
#This also returned an error message.
#My intention was to then run:
#chart.TimeSeries(vol.xts,lwd=2,auto.grid=F,ylab="Annualized Log Volatility",xlab="Time",
#main="Log Volatility",lty=1,
#legend.loc="topright")
The documentation of PerformanceAnalytics::chart.TimeSeries is a bit vague. The issue is that when passing a dataframe you have to set the dates as row.names. To this end I first converted your data (which is a tibble) to a data.frame. Afterwards I add the dates as rownames and drop the date column:
library(PerformanceAnalytics)
vola <- as.data.frame(vola)
vola <- subset(vola, select = -c(date))
row.names(vola) <- as.Date(vola$date_2)
vola$date_2 <- NULL
chart.TimeSeries(vola,
lwd = 2, auto.grid = F, ylab = "Annualized Log Volatility", xlab = "Time",
main = "Log Volatility", lty = 1,
legend.loc = "topright"
)
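Before calling chart.TimeSeries it can help to confirm that the data frame has the shape described above: date-like row names and only numeric columns. The following is a base R sketch under that assumption; vola_demo, all_numeric and dates_ok are hypothetical names, and the values are shortened for illustration.

```r
# Small stand-in for the question's data (values rounded for illustration)
vola_demo <- data.frame(
  date_2  = c("2012-02-10", "2012-02-17", "2012-02-24"),
  volaSA  = c(0.168, 0.226, 0.104),
  volaNAM = c(0.121, 0.336, 0.117)
)

# Move the dates into the row names and drop the date column
rownames(vola_demo) <- as.character(as.Date(vola_demo$date_2))
vola_demo$date_2 <- NULL

# Every remaining column should be numeric, and every row name a valid date
all_numeric <- all(vapply(vola_demo, is.numeric, logical(1)))
dates_ok <- !anyNA(as.Date(rownames(vola_demo), format = "%Y-%m-%d"))
c(all_numeric = all_numeric, dates_ok = dates_ok)
```

If either check is FALSE, a leftover character or date column is a likely cause of the error.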

Find zero crossing in R

If I have the following data:
df <- structure(list(x = c(1.63145539094563, 1.67548187017034, 1.71950834939504,
1.76353482861975, 1.80756130784445, 1.85158778706915, 1.89561426629386,
1.93964074551856, 1.98366722474327, 2.02769370396797, 2.07172018319267,
2.11574666241738, 2.15977314164208, 2.20379962086679, 2.24782610009149,
2.2918525793162, 2.3358790585409, 2.3799055377656, 2.42393201699031,
2.46795849621501, 2.51198497543972, 2.55601145466442, 2.60003793388912,
2.64406441311383, 2.68809089233853, 2.73211737156324, 2.77614385078794,
2.82017033001265, 2.86419680923735, 2.90822328846205, 2.95224976768676,
2.99627624691146, 3.04030272613617, 3.08432920536087, 3.12835568458557,
3.17238216381028, 3.21640864303498, 3.26043512225969, 3.30446160148439,
3.3484880807091, 3.3925145599338, 3.4365410391585, 3.48056751838321,
3.52459399760791, 3.56862047683262, 3.61264695605732, 3.65667343528202,
3.70069991450673, 3.74472639373143, 3.78875287295614), y = c(24.144973858154,
18.6408277478876, 21.9174270206615, 22.8017876727379, 20.9766270378248,
18.604384256745, 18.4805250429826, 15.8436744335752, 13.6357170277296,
11.6228806771368, 9.4065868126964, 6.81644596802601, 4.41187500831424,
4.31911614349431, 0.678259284890563, -1.18632719250877, -2.32986407762089,
-3.84480566043122, -5.24738510499144, -5.20160089844013, -5.42094587600499,
-5.39886757202858, -5.26753920575326, -4.68727963638973, -2.73267203102102,
0.296905237887623, 2.45725152489283, 5.12102449689086, 7.13986218237411,
10.2044876281093, 14.4358946463429, 19.0643081865458, 22.8920445618834,
26.7229418763085, 31.3776791707576, 36.19058349817, 41.2843224331918,
46.3396522631345, 51.4321502764393, 56.4080998038294, 61.5215778808583,
66.6845421308734, 71.3912749310486, 76.0856977880158, 80.7039319129457,
84.4095953723555, 88.0163019647757, 89.918078622734, 91.6341473685881,
94.0404562451352)), class = c("tbl_df", "tbl", "data.frame"), .Names = c("x",
"y"), row.names = c(NA, -50L))
How do I find the exact x value when y == 0? I tried doing interpolation, but it does not necessarily give me a y value equals to zero. Does anyone know of a function to find zero crossings?
Firstly, one can define a corresponding (linearly) interpolated function with
approxfun(df$x, df$y)
where the result looks like
curve(approxfun(df$x, df$y)(x), min(df$x), max(df$x))
Those zero crossings can then be seen as the roots of this function. Base R has a function uniroot, but it looks for a single root, while in your case we have two. Hence, one option would be the rootSolve package, as in
library(rootSolve)
uniroot.all(approxfun(df$x, df$y), interval = range(df$x))
# [1] 2.263841 2.727803
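If an extra package is not wanted, the same linearly interpolated crossings can be computed directly from the sign changes in y. This is a minimal base R sketch, assuming x is sorted in increasing order; zero_crossings is a hypothetical helper, not a function from any package.

```r
zero_crossings <- function(x, y) {
  # indices i where the sign of y changes between point i and point i + 1
  i <- which(diff(sign(y)) != 0)
  # linear interpolation between (x[i], y[i]) and (x[i + 1], y[i + 1]);
  # unique() drops the duplicate reported when some y is exactly zero
  unique(x[i] - y[i] * (x[i + 1] - x[i]) / (y[i + 1] - y[i]))
}

# Toy data with two crossings
x <- c(0, 1, 2, 3, 4)
y <- c(2, 1, -1, -2, 2)
zero_crossings(x, y)
# [1] 1.5 3.5
```

Calling zero_crossings(df$x, df$y) on the question's data should agree with the uniroot.all result up to numerical tolerance, since both use straight-line interpolation between neighbouring points.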

Building and analysing trends in time series

I need advice about building time series. I have a bunch of files with monthly data for sea surface temperature for a number of locations across 408 months. I have aggregated monthly values in a data frame with the following structure
longitude, latitude, SST for month 1, SST for month 2, .... SST for month n
This is just a small piece of the data frame so you can see
dput(sst_subset)
structure(list(lon = c(-19.875, -19.625, -19.375, -19.125), lat = c(30.125,
30.125, 30.125, 30.125), sst = c(293.197412803228, 293.092251515256,
292.999348291526, 293.013219258958), sst.1 = c(292.490350607051,
292.504279178168, 292.502850606771, 292.438922036772), sst.2 = c(291.994832184947,
291.887412832509, 291.832896704695, 291.810638640677), sst.3 = c(292.095993473008,
292.066660140331, 292.091993473098, 292.110326806021), sst.4 = c(293.071606354427,
293.095799902274, 293.106445063326, 293.116122482465), sst.5 = c(294.981993408501,
294.996326741514, 295.004660074661, 295.018993407674), sst.6 = c(295.568703072806,
295.600315975326, 295.597735330222, 295.49418694544), sst.7 = c(296.250961122073,
296.175154672154, 296.079348222683, 296.052251449095)), .Names = c("lon",
"lat", "sst", "sst.1", "sst.2", "sst.3", "sst.4", "sst.5", "sst.6",
"sst.7"), row.names = c(NA, 4L), class = "data.frame")
To build a time series I have extracted a row of the data frame, that corresponds to all the monthly values in a location (defined by longitude and latitude), transposed to a column and created a new data frame
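ncolumnes <- ncol(sst_all)
# row 1 holds one location; columns 3 onwards hold its monthly SST values
sst_point1 <- sst_all[1, 3:ncolumnes]
sst1_df <- as.data.frame(t(sst_point1))
# monthly series starting January 1982 (matches the .Tsp attribute in the output)
sst1_ts <- ts(sst1_df, start = c(1982, 1), frequency = 12)
dput(sst1_ts)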
structure(c(293.197412803228, 292.490350607051, 291.994832184947,
292.095993473008, 293.071606354427, 294.981993408501, 295.568703072806,
296.250961122073, 296.73166003606, 296.385154667461, 294.611660083445,
293.484186990367, 292.372896692626, 291.348207775437, 291.627090257683,
291.957326809441, 292.71063862056, 293.545326773947, 295.897412742879,
296.671928854599, 296.681326703851, 296.483864342674, 294.934660076226,
293.76709020985, 292.45870314232, 291.399993488565, 291.446767681068,
291.918993476964, 292.889025713347, 293.71099343691, 294.01418697852,
296.219025638916, 296.90166003226, 296.119993383065, 294.936326742855,
293.405154734069, 291.834509607885, 291.638564911804, 291.527412840556,
292.055326807251, 292.020961216621, 294.573660084295, 295.850315969738,
295.978380483004, 296.863660033109, 297.228380455065, 296.00866005222,
294.711606317771, 293.067735386772, 291.577136341748, 291.426445100877,
291.602993484028, 292.42096120768, 293.742993436195, 294.709348253305,
295.973219192797, 296.913993365318, 296.213219187433, 294.494326752735,
293.59225150408, 292.492251528667, 291.838207764485, 292.225477341082,
292.385993466526, 294.063864396765, 295.407326732328, 295.98386435385,
297.471928836718, 297.880660010378, 297.070638523107, 294.419993421063,
293.154509578381, 292.307735403759, 291.263441767479, 291.197412847932,
292.566660129155, 293.590316020253, 294.627660083088, 295.085477277156,
296.166122414292, 296.608660038809, 296.143864350273, 294.568660084407,
293.292251510786, 292.269670888481, 291.425350630855, 291.424832197687,
291.351326822986, 292.945799905626, 296.319660045269, 297.158380456629,
297.712251411991, 297.68699334804, 296.391928860858, 294.519660085502,
292.856445068914, 291.953864443927, 291.813922050742, 291.561606388179,
291.680660148958, 293.242574092542, 294.903326743593, 295.748057907507,
297.715799799009, 298.00999334082, 297.161606263009, 295.690326726002,
294.133541814562, 292.727412813734, 292.312493468169, 291.931928960546,
291.646326816392, 291.639670902563, 293.339326778551, 295.357090174311,
297.108703038385, 298.576993328147, 296.577735308317, 295.347660066995,
293.425154733622, 292.446445078078, 291.951027959007, 291.967735411359,
291.957993476093, 292.77838055453, 294.320326756624, 295.738703069007,
296.466122407586, 296.747993369028, 296.3506385392, 294.958326742363,
293.579348278562, 292.182574116234, 291.279279205549, 291.659993482754,
291.872993477993, 292.670316040816, 294.635326749583, 295.305477272238,
296.348057894096, 297.221993358433, 296.08612241608, 294.042993429489,
292.95160635711, 292.009670894293, 291.243207777784, 290.859025758721,
291.319993490353, 292.587412816863, 294.628660083066, 294.788057928965,
296.454832085258, 296.454326708925, 296.265477250781, 295.604326727924,
294.013219236607, 293.043541838926, 292.523922034872, 292.038703151708,
292.477326797818, 294.406122453631, 295.478993397392, 296.886122398199,
297.362251419814, 297.879993343726, 296.978703041291, 295.939326720436,
293.980638592173, 293.048703129133, 291.979993475601, 291.462896712966,
292.266326802534, 293.046445064667, 294.074993428774, 295.435477269333,
296.886122398199, 297.262660024191, 296.517090148383, 295.193326737111,
293.43967086233, 292.486122496546, 292.043564902752, 291.806767673021,
292.480660131077, 293.707735372467, 295.127326738586, 295.877735323964,
296.78192885214, 297.788326679108, 297.02450949188, 295.75766005783,
294.890315991195, 293.371606347722, 292.426422037051, 292.379670886022,
292.746993458457, 293.078057967186, 294.512993418984, 295.54612242815,
296.109348222013, 297.133660027074, 296.816767561039, 295.519326729824,
294.220638586809, 292.947412808816, 291.781422051468, 291.450638648723,
292.118660139168, 293.846122466148, 294.885993410647, 295.964832096211,
297.745154637062, 298.001326674347, 297.287735292448, 295.068993406557,
293.324509574581, 291.593864451974, 291.534821071758, 291.633219289804,
292.017993474752, 292.164187019871, 293.516660107921, 295.506122429044,
296.33321918475, 297.117660027432, 296.34741273282, 294.993660074907,
293.8032192413, 293.077735386549, 292.511779178, 292.344832177124,
292.459326798221, 293.437412797864, 295.860326722202, 296.416444989342,
297.083864329263, 298.678993325867, 297.782251410427, 295.657993393391,
293.652251502739, 293.274186995061, 292.307136325432, 291.922251541408,
291.564993484877, 292.452574110199, 293.996326763866, 294.823219218502,
296.541283696229, 297.421660020637, 296.747735304518, 295.771993390843,
294.041928913384, 293.317090219908, 292.421422037163, 292.680316040593,
292.577660128909, 293.240316028076, 295.254993402399, 296.815477238487,
297.524186900066, 298.126326671553, 297.598380446795, 295.563326728841,
294.207735361291, 293.43805795914, 293.115855519178, 292.753864426046,
292.466993464716, 292.925154744798, 296.035326718291, 296.538380470487,
298.612573972513, 298.241993335634, 297.065154652261, 295.770993390866,
293.72934827521, 292.379670886022, 291.370350632085, 291.601928967922,
292.473326797908, 293.597412794288, 294.678993415274, 296.042896610595,
297.383541741919, 297.729326680427, 296.714186918171, 295.008993407898,
293.465154732728, 292.365154757315, 292.279993468896, 291.722896707154,
292.651993460581, 293.469670861659, 295.145993404835, 296.262896605677,
297.257090131842, 297.550326684428, 297.544832060895, 296.194326714737,
294.499670838637, 293.095799902274, 292.836064885038, 292.445799916802,
292.78566012426, 293.216445060867, 294.3869934218, 295.256767595908,
296.333864346026, 296.692993370257, 296.250315960797, 295.23466006952,
293.713864404588, 292.874187004001, 292.378614156346, 291.931606379908,
292.099326806267, 293.999348269175, 295.055660073521, 296.170638543223,
296.729670788792, 297.024993362837, 296.646444984201, 294.817993412167,
293.368057960704, 292.39579991792, 291.174279207896, 291.343541876924,
291.974660142387, 292.742574103717, 294.785993412882, 296.685477241393,
297.067735297365, 297.318326689613, 297.265154647791, 296.419993376359,
294.439993420616, 293.224509576816, 293.140707735371, 292.928057970539,
293.028326785502, 293.116767643741, 294.067993428931, 295.034832116997,
296.24192886421, 297.204660025487, 297.0212836855, 295.618993394263,
294.195477297049, 293.26644505975, 292.1507077575, 291.842574123834,
292.212326803741, 292.898380551848, 293.698660103853, 294.868057927177,
296.104832093081, 297.440660020212, 296.802574012969, 295.234993402846,
293.692574082483, 292.617090235554, 291.535510726915, 291.344832199475,
292.175660137894, 293.799025693007, 295.795993390307, 296.195799832983,
297.432573998888, 298.643659993323, 297.612251414226, 296.027326718469,
294.692896640769, 293.446122475089, 292.611779175765, 292.494832173771,
293.027326785525, 293.948380528378, 294.144326760558, 295.259670821649,
296.524509503055, 297.014660029734, 296.854832076317, 295.413326732193,
294.306122455866, 292.857735391466, 291.982493475545, 291.549025743299,
292.710993459262, 293.044832161478, 294.210660092408, 296.063864352061,
296.959993364289, 298.161660004097, 297.040315943139, 295.179326737424,
293.474509571228, 292.265799920826, 291.409993488342, 291.042574141715,
291.81732681257, 293.374186992826, 294.908993410133, 296.215799832536,
297.686767541593, 298.667326659461, 297.63999334909, 295.589993394911,
294.077412783559), .Dim = c(408L, 1L), .Dimnames = list(NULL,
"1"), .Tsp = c(1982, 2015.91666666667, 12), class = "ts")
and then decompose it into its additive trend, seasonal and random components and remove the seasonal component from the original data
sst1_dec<-decompose(sst1_ts)
sst1_noseason<-sst1_ts - sst1_dec$seasonal
Now, how do I get a linear regression for this data (sst1_noseason)? I have tried lm(), but as there is only a single variable in the data frame I think I can't. Should I build a new date column (time) with monthly dates and then run lm(sst ~ time)?
Is there any other R package that deals with time series that can do better? I have looked at ggseas and tidyr; they seem promising, but maybe I need to build that date column to run this analysis in any case.
My final objective is to have a single value for the trend in each longitude and latitude point and plot a map to look for the areas with the highest climatic trend for sea surface temperature.
Maybe there is a better procedure and you could point me to another R package running spatio-temporal analysis. Any help would be appreciated.
Thanks in advance for your help
I am not a fan of specialised classes in R, since they are usually not as intuitive and require additional vocabulary to deal with. Here's an attempt to convert the time series you'd made into a data.frame, using the zoo package:
library(zoo)
df1 <- data.frame(zoo(sst1_ts), time=as.yearmon(time(sst1_ts)))
df1$jday <- as.Date(df1$time)
(fit1<-lm(X1 ~ jday, df1))
Call:
lm(formula = X1 ~ jday, data = df1)
Coefficients:
(Intercept) jday
2.937e+02 6.025e-05
Plotting is more intuitive with a data.frame as well:
library(ggplot2)
base <- ggplot(df1, aes(jday, X1)) + geom_line() + stat_smooth(method="lm")
p<-base + scale_x_date(date_labels = "%Y")
You can further use an interactive package such as plotly to navigate the plot created with ggplotly.
library(plotly)
ggplotly(p)
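For the stated goal of one trend value per longitude/latitude point, the same straight-line fit can be applied to every row of the wide table. This is a rough base R sketch on made-up numbers: sst_wide, months and row_slope are hypothetical names, the slope is in kelvin per month rather than per day, and the seasonal component is not removed here.

```r
# Hypothetical wide table: one row per location, one column per month
months <- 1:24
sst_wide <- data.frame(
  lon = c(-19.875, -19.625),
  lat = c(30.125, 30.125),
  rbind(290 + 0.01 * months, 291 - 0.02 * months)
)

# Slope of a straight-line fit through one row's monthly values
row_slope <- function(v) unname(coef(lm(v ~ months))[2])

# Apply the fit to every location (all columns except lon and lat)
sst_wide$trend <- apply(sst_wide[, -(1:2)], 1, row_slope)
sst_wide[, c("lon", "lat", "trend")]
```

The resulting lon, lat and trend columns are in a shape that mapping functions can consume directly.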
