Trying to program trading signals in R

I am new to R and am trying to program a pairs trading strategy in R.
I have already written the code for downloading the data, and have created additional columns and prepared the data. Now I need to calculate the trading signals.
My signal rules are as follows:
- If the Z-score is greater than 2.25, sell the pair; buy back (exit) when the Z-score falls below 0.25.
- If the Z-score is less than -2.25, buy the pair; sell (exit) when the Z-score rises above -0.25.
- Close any open position if there is a change in signal.
When we sell the pair, we sell the first stock and buy the second stock; in this case, we sell ACC and buy Ambujacem.
When we buy the pair, we buy the first stock and sell the second stock; in this case, we buy ACC and sell Ambujacem.
Could anyone help me with the coding for the trading signals?
The code so far is enclosed below.
Regards,
Subash
# Trading Code
library(quantmod)
getSymbols("ACC.NS", from=as.Date('2007-01-01'), to=as.Date('2015-07-24'))
getSymbols("AMBUJACEM.NS", from=as.Date('2007-01-01'), to=as.Date('2015-07-24'))
acc=ACC.NS[,6]
amb=AMBUJACEM.NS[,6]
t.zoo <- merge(acc, amb, all=TRUE)
t.zoo=as.data.frame(t.zoo)
typeof(t.zoo)
t.zoo=na.omit(t.zoo)
#adding columns
t.zoo$spread <- 0
t.zoo$adfTest <- 0
t.zoo$mean <- 0
t.zoo$stdev <- 0
t.zoo$zScore <- 0
t.zoo$signal <- 0
t.zoo$BuyPrice <- 0
t.zoo$SellPrice <- 0
t.zoo$LongReturn <- 0
t.zoo$ShortReturn <- 0
t.zoo$Slippage <- 0
t.zoo$TotalReturn <- 0
#preparing the data
#Calculating the pair ratio
t.zoo$pairRatio <- t.zoo$ACC.NS.Adjusted/t.zoo$AMBUJACEM.NS.Adjusted
#Calculate the log prices of the two time series
t.zoo$LogA <- log10(t.zoo$ACC.NS.Adjusted)
t.zoo$LogB <- log10(t.zoo$AMBUJACEM.NS.Adjusted)
#Calculating the spread
t.zoo$spread <- t.zoo$ACC.NS.Adjusted/t.zoo$AMBUJACEM.NS.Adjusted
#Calculating the mean
# Computes the mean using the SMA function
# choose the number of days for calculating the mean
SMAdays = 20
t.zoo$mean <- SMA(t.zoo$spread,SMAdays)
#Calculating the Std Deviation
t.zoo$stdev <- rollapply(t.zoo$spread,20,sd, fill=NA, align='right')
#Calculating the Z Score
t.zoo$zScore <- (t.zoo$pairRatio - t.zoo$mean)/t.zoo$spread
View(t.zoo)
#Calculation of trading signals and trading prices
#Trigger sell or buy signal if Z Score moves above 2.25 or below -2.25.
# Close position if Z Score reaches 0.2 or -0.2.
# close any open position if there is a change in signal.

I think the main issue was to come up with trading signals for a strategy that depends not only on the current level of the indicator but also on the direction from which the threshold is crossed.
There were a number of problems with the code posted in the comments, including the use of a single = for comparisons, so I've reworked it from scratch.
Here's my attempt at solving this. It seems to work. I've added some plotting code to eyeball the results; I suggest you check the result over different periods.
This code comes after the one in the original question. The only difference is that I have kept t.zoo as an xts/zoo object and not converted it to a data.frame. Also, I've multiplied the zScore by 100.
It generates trigger dates and also a column depicting the state of the strategy. Calculating returns is easy from there.
colnames(t.zoo)
#t.zoo must be an xts object
#working on a separate xts object
sigs<- t.zoo[, c("ACC.NS.Adjusted", "AMBUJACEM.NS.Adjusted" , "zScore")]
# creating my own triggers as there are not enough good values
# buyTrig<- mean(t.zoo$zScore ,na.rm = T) - 1*sd(t.zoo$zScore ,na.rm = T)
# sellTrig<- (-1) * buyTrig
# sqOffTrig<- mean(t.zoo$zScore ,na.rm = T) - 0.5*sd(t.zoo$zScore ,na.rm = T)
# Another approach: scaling the zScore in t.zoo to fit your criterion
sigs$zScore<- sigs$zScore*100
buyTrig<- (-2.25)
sellTrig<- (-1) * buyTrig
sqOffTrig<- 0.25
cat(buyTrig, sellTrig, sqOffTrig)
hist(sigs$zScore, breaks = 40)
abline(v=c(buyTrig,sellTrig), col="red")
abline(v=c(-sqOffTrig, sqOffTrig), col="green")
sum(sigs$zScore >= -sqOffTrig & sigs$zScore<= sqOffTrig , na.rm = T) # 139
sigs$action<- 0
sigs$mode <- NA
sigs$zLag<- lag.xts(sigs$zScore,1)
sigs[19:22,]
#these are not the real trigger dates, but they will serve our purpose
# along with na.locf
buyTrigDays<- time(sigs[sigs$zScore<= buyTrig & sigs$zLag > buyTrig, ])
sellTrigDays<- time(sigs[sigs$zScore>= sellTrig & sigs$zLag < sellTrig, ])
#square offs
buySqOffDays<- time( sigs[sigs$zScore>= (-1*sqOffTrig) & sigs$zLag < (-1*sqOffTrig), ] )
buySqOffDays
sellSqOffDays<- time( sigs[sigs$zScore<= (sqOffTrig) & sigs$zLag > (sqOffTrig), ] )
sellSqOffDays
sigs$mode[buyTrigDays]=1 ; sigs$mode[sellTrigDays]= -1;
sigs$mode[buySqOffDays]=0 ; sigs$mode[sellSqOffDays]= 0;
sigs$mode
# use local fill to repeat these triggered position into future
# till you meet another non NA value
sigs$mode<- na.locf(sigs$mode, fromLast = F)
plot(sigs$zScore["2015"])
points(sigs$zScore[sigs$mode==1], col="red", on=1, pch = 19)
points(sigs$zScore[sigs$mode==-1], col="green", on=1 , pch = 19)
points(sigs$zScore[sigs$mode==0], col="blue", on=1)
sum(is.na(sigs$mode))
#now to get the real dates when square off is triggered
trigdays<- time( sigs[diff(sigs$mode,1) != 0, ] ) #when the value changes
squareOffTrigger_real<- time(sigs[sigs$mode==0][trigdays])
buyTrigger_real<- time(sigs[sigs$mode==1] [trigdays])
sellTrigger_real<- time(sigs[sigs$mode==-1][trigdays])
#check
length(sellTrigger_real) + length(buyTrigger_real) == length(squareOffTrigger_real)
plot(sigs$zScore["2015"])
points(sigs$zScore[buyTrigger_real] , col="blue", pch = 19, on=1)
points(sigs$zScore[sellTrigger_real] , col="red", pch = 19, on=1)
points(sigs$zScore[squareOffTrigger_real] , col="green", pch = 19, on=1)
abline(h=c(-sqOffTrig, sqOffTrig) , col= "green" )
# further calculations can be easily made using either the mode
# column or the trigger dates computed at the end
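For example, here is a rough sketch of how the mode column could be turned into strategy returns (equal money in each leg, log returns, no costs or slippage; this sketch is not part of the original answer):
# daily log returns of each leg
accRet <- diff(log(sigs$ACC.NS.Adjusted))
ambRet <- diff(log(sigs$AMBUJACEM.NS.Adjusted))
# yesterday's mode decides today's position: +1 long ACC / short AMBUJACEM, -1 the reverse, 0 flat
pos <- lag.xts(sigs$mode, 1)
stratRet <- pos * (accRet - ambRet)
stratRet[is.na(stratRet)] <- 0
equity <- xts(cumsum(as.numeric(stratRet)), order.by = index(stratRet))
plot(equity)   # cumulative log return of the pair strategy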

Related

for loop to determine the top 10 percent of values in an interval

I essentially have two columns (vectors) with speed and accel in a data.frame as such:
speed acceleration
1 3.2694444 2.6539535522
2 3.3388889 2.5096979141
3 3.3888889 2.2722134590
4 3.4388889 1.9815256596
5 3.5000000 1.6777544022
6 3.5555556 1.3933215141
7 3.6055556 1.1439051628
8 3.6527778 0.9334115982
9 3.6722222 0.7561602592
For each value of speed on the x axis, I need to find the top 10% of values on the y axis (acceleration). This also needs to be done per interval, for example speed 3.2-3.4, 3.4-3.6, and so on. Can you please show me how a for loop would look in this situation?
As @alistaire already pointed out, you have provided a very limited amount of data, so we first have to simulate a bit more data on which we can test our code.
set.seed(1)
# your data
speed <- c(3.2694444, 3.3388889, 3.3888889, 3.4388889, 3.5,
3.5555556, 3.6055556, 3.6527778, 3.6722222)
acceleration <- c(2.6539535522, 2.5096979141, 2.2722134590,
1.9815256596, 1.6777544022, 1.3933215141,
1.1439051628, 0.9334115982, 0.7561602592)
df <- data.frame(speed, acceleration)
# expand data.frame and add a little bit of noise to all values
# to make them 'unique'
df <- as.data.frame(do.call(
rbind,
replicate(15L, apply(df, 2, \(x) (x + runif(length(x), -1e-1, 1e-1) )),
simplify = FALSE)
))
The function create_intervals, as the name suggests, creates user-defined intervals. The rest of the code does the 'heavy lifting' and stores the desired result in out.
If you would like to have intervals of speed with equal widths, simply specify the number of groups (n_groups) you would like to have and leave the rest of the arguments (i.e. lwr, upr, and interval_span) unspecified.
# Cut speed into user-defined intervals
create_intervals <- \(n_groups = NULL, lwr = NULL, upr = NULL, interval_span = NULL) {
if (!is.null(lwr) & !is.null(upr) & !is.null(interval_span) & is.null(n_groups)) {
speed_low <- subset(df, speed < lwr, select = speed)
first_interval <- with(speed_low, c(min(speed), lwr))
middle_intervals <- seq(lwr + interval_span, upr - interval_span, interval_span)
speed_upp <- subset(df, speed > upr, select = speed)
last_interval <- with(speed_upp, c(upr, max(speed)))
intervals <- c(first_interval, middle_intervals, last_interval)
} else {
step <- with(df, c(max(speed) - min(speed))/n_groups)
intervals <- array(0L, dim = n_groups)
for(i in seq_len(n_groups)) {
intervals[i] <- min(df$speed) + i * step
}
}
return(intervals)
}
# three intervals with equal width
my_intervals <- create_intervals(n_groups = 3L)
# Compute values of speed when acceleration is greater than
# or equal to the 90th percentile
out <- lapply(1:(length(my_intervals)-1L), \(i) {
x <- subset(df, speed >= my_intervals[i] & speed <= my_intervals[i+1L])
x[x$acceleration >= quantile(x$acceleration, 0.9), ]
})
# function to round values to two decimal places
r <- \(x) format(round(x, 2), nsmall = 2L)
# assign names to each element of out
for(i in seq_along(out)) {
names(out)[i] <- paste0(r(my_intervals[i]), '-', r(my_intervals[i+1L]))
}
Output 1
> out
$`3.38-3.57`
speed acceleration
11 3.394378 2.583636
21 3.383631 2.267659
57 3.434123 2.300234
83 3.394886 2.580924
101 3.395459 2.460971
$`3.57-3.76`
speed acceleration
6 3.635234 1.447290
41 3.572868 1.618293
51 3.615017 1.420020
95 3.575412 1.763215
We could also compute the desired values of speed based on intervals that make more 'sense' than just equally spaced speed intervals, e.g. [min(speed), 3.3), [3.3, 3.45), [3.45, 3.6), and [3.6, max(speed)).
This can be accomplished by leaving n_groups unspecified and instead specifying lwr, upr, and an interval_span that makes sense. For instance, it makes sense to have an interval span of 0.15 when the lower limit is 3.3 and the upper limit is 3.6.
# custom boundaries based on a lower limit and upper limit
my_intervals <- create_intervals(lwr = 3.3, upr = 3.6, interval_span = 0.15)
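out then needs to be recomputed with these new boundaries before printing; the recomputation is not shown explicitly in the answer, but it is just the same lapply and naming loop from above run again:
out <- lapply(1:(length(my_intervals)-1L), \(i) {
x <- subset(df, speed >= my_intervals[i] & speed <= my_intervals[i+1L])
x[x$acceleration >= quantile(x$acceleration, 0.9), ]
})
for(i in seq_along(out)) {
names(out)[i] <- paste0(r(my_intervals[i]), '-', r(my_intervals[i+1L]))
}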
Output 2
> out
$`3.18-3.30`
speed acceleration
37 3.238781 2.696456
82 3.258691 2.722076
$`3.30-3.45`
speed acceleration
11 3.394378 2.583636
19 3.328292 2.711825
73 3.315306 2.644580
83 3.394886 2.580924
$`3.45-3.60`
speed acceleration
4 3.520530 2.018930
40 3.517329 2.032943
58 3.485247 2.079893
67 3.458031 2.078545
$`3.60-3.76`
speed acceleration
6 3.635234 1.447290
34 3.688131 1.218969
51 3.615017 1.420020
78 3.628465 1.348873
Note: use function(x) instead of \(x) if you use a version of R <4.1.0

Calculate average of a subset of a vector only when subset values meet a condition in R?

I have a daily curve x and I am trying to approximate the average peak and offpeak values of x:
https://ibb.co/Fq1Byzk
I have defined a delta threshold such that when delta is below the threshold value, x will be in the offpeak or peak period. I want to get the average peak value where the average is only of values within x where the delta < threshold. Right now it is averaging out the outliers as well.
delta <- matrix(0,24,ncol=1)
for (i in 2:24){
# i-th element is the i-th hour per day
delta[i] = x[i,2]-x[i-1,2]
}
# Find hour at which max and min daily values occur
max_threshold = 0.15*max(delta)
min_threshold = 0.15*min(delta)
c <- abs(delta) < max_threshold
t1 <- which(delta>max_threshold)[1]-1 # t1: time index at end of off-peak
t2 <- which.max(delta) + 1 # t2 is time of initial peak
t3 <- which.min(delta)-2 # t3 is time of end peak
t4 <- which.min(delta) # t4 time index of evening off-peak
am <- mean(x[1:t1,2]) # average morning off-peak value
peak <- mean(x[t2:t3,2]) #average peak value
pm <- mean(x[t4:24,2]) # average evening off-peak value
> dput(x)
structure(list(time = structure(c(1451952000, 1451955600, 1451959200,
1451962800, 1451966400, 1451970000, 1451973600, 1451977200, 1451980800,
1451984400, 1451988000, 1451991600, 1451995200, 1451998800, 1452002400,
1452006000, 1452009600, 1452013200, 1452016800, 1452020400, 1452024000,
1452027600, 1452031200, 1452034800, 1452038400, 1452042000, 1452045600,
1452049200, 1452052800, 1452056400, 1452060000, 1452063600, 1452067200,
1452070800, 1452074400, 1452078000, 1452081600, 1452085200, 1452088800,
1452092400, 1452096000, 1452099600, 1452103200, 1452106800, 1452110400,
1452114000, 1452117600, 1452121200), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), Crow_education_Omer = c(0.019186330898848,
0.0192706664192825, 0.0182164724138513, 0.018174304653634, 0.019355001939717,
0.0197345117816722, 0.023951287803397, 0.0323848398468467, 0.0343245568168401,
0.0378244809148717, 0.0393003525224754, 0.0403545465279066, 0.0405232175687756,
0.0393425202826927, 0.0398907011655169, 0.0377401453944372, 0.0344932278577091,
0.0317101556833707, 0.0304872906370705, 0.0297282709531601, 0.0287584124681633,
0.0252584883701317, 0.0196080085010205, 0.0197345117816722, 0.0194815052203687,
0.0196080085010205, 0.0184273112149375, 0.0184694789751548, 0.0191441631386307,
0.019692344021455, 0.025469327171218, 0.0352522475416196, 0.0376136421137855,
0.0403967142881239, 0.0435592963044175, 0.0433484575033313, 0.0430532831818105,
0.042968947661376, 0.043306289743114, 0.044655658070066, 0.0424207667785518,
0.0416195793344241, 0.0382883262772615, 0.03769797763422, 0.0330173562501054,
0.0281680638251219, 0.0234452746807901, 0.0225597517162278)), row.names = 97:144, class = "data.frame")
Also, how would I be able to plot both the new simplified curve and the original curve x on the same graph with ggplot? I can't seem to melt or rbind() the new curve (with its reduced number of data points) with x since my time column is POSIXct.
Thanks.
This is just a partial solution, since it breaks down for the second day. I named the data.frame df instead of x.
library(ggplot2)
library(dplyr)
library(lubridate)
df_obj <- df %>%
group_by(day = day(time)) %>% # group by days
filter(day == 5) %>% # filter for day 5
mutate(
delta_rev = Crow_education_Omer - lag(
Crow_education_Omer,
default = first(Crow_education_Omer)
), # delta between day n and n-1
delta_for = lead(
Crow_education_Omer,
default = last(Crow_education_Omer)
) - Crow_education_Omer, # delta between day n-1 and n
max_tresh = 0.15 * max(delta_rev)
) %>%
group_by(grp = 1 - (abs(delta_rev) < 0.15 * max(delta_rev) | abs(delta_for) < 0.15 * max(delta_for)),
grp2 = cumsum(grp != lag(grp, default = 0))
) %>%
mutate(
average = mean(Crow_education_Omer) *
(1 - grp) *
(abs(first(Crow_education_Omer) - last(Crow_education_Omer)) < max_tresh)
)
The pipeline above modifies your existing data.frame to build up the averages. Based on this calculation, we use ggplot2 for plotting:
df_obj %>%
ggplot(aes(x = time, y = Crow_education_Omer)) +
geom_point() +
geom_line(aes(color = "sample")) +
geom_line(data = df_obj[df_obj$average != 0, ], aes(x = time, y = average, color = "average")) +
xlab("Time") +
ylab("Value")
This returns a plot of the original curve with the computed averages overlaid (the plots themselves are not reproduced here). But for day 6 this doesn't work as expected: changing to filter(day == 6) and plotting again gives a result that isn't what we want, and changing the threshold value to 0.33 * max(delta) only partly helps.
So perhaps you can build on this code to create a correct and working solution. Good luck!
A few explanations:
We build up delta_rev and delta_for. delta_rev equals your delta, so for a given row/data point i we calculate df[i,2] - df[i-1,2].
delta_for changes this: now we calculate df[i+1,2] - df[i,2] for a given i. The idea is that using both delta_rev and delta_for lets us look at the preceding and succeeding points, which gives us more information about the neighbours of a given point and helps determine which group (am, peak, pm) the point belongs to.
The group_by function builds up the groups based on the threshold: grp checks whether a data point is below 0.15 * max(delta), and grp2 creates a unique grouping number.
There are a few issues:
Based on this algorithm, there can be more than three groups.
The group_by finds another group between 15:00 and 20:00, which we filter out (that's the abs(first(Crow_education_Omer) - last(Crow_education_Omer)) < max_tresh part). I'm not sure if this is a good solution.
As stated above, this doesn't return a reasonable plot for day 6. Perhaps the df_obj[df_obj$average != 0, ] subset in the geom_line call causes this.
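If the grouping does behave as intended, the plateau averages themselves could be pulled out of df_obj along these lines (a sketch building on the columns created above, not part of the original answer):
df_obj %>%
filter(average != 0) %>%   # keep only rows that belong to a detected plateau
group_by(grp2) %>%         # one group per plateau
summarise(start = min(time), end = max(time), average = first(average))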

Mapping slope of an area and returning percent above and below a threshold in R

I am trying to figure out the proportion of an area that has a slope of 0 +/- 5 degrees. Another way of saying it: anything steeper than 5 degrees in either direction is bad. I am trying to find the actual number, and a graphic.
To achieve this I turned to R, using the raster package.
Let's use a generic country, in this case, the Philippines
{list.of.packages <- c("sp","raster","rasterVis","maptools","rgeos")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)}
library(sp) # classes for spatial data
library(raster) # grids, rasters
library(rasterVis) # raster visualisation
library(maptools)
library(rgeos)
Now let's get the altitude information and plot the slopes.
elevation <- getData("alt", country = "PHL")
x <- terrain(elevation, opt = c("slope", "aspect"), unit = "degrees")
plot(x$slope)
Not very helpful due to the scale, so let's simply look at the Island of Palawan
e <- drawExtent(show=TRUE) #to crop out Palawan (it's the long skinny island that is roughly midway on the left and is oriented between 2 and 8 O'clock)
gewataSub <- crop(x,e)
plot(gewataSub, 1)## Now visualize the new cropped object
A little bit better to visualize. I get a sense of the magnitude of the slopes and that with a 5 degree restriction, I am mostly confined to the coast. But I need a little bit more for analysis.
I would like the results to be in two parts:
1. "35% (made up) of the selected area has a slope exceeding +/- 5 degrees" or "65% of the selected area is within +/- 5 degrees" (with the code to get it).
2. A picture where everything within +/- 5 degrees is one color, call it good or green, and everything else is another color, call it bad or red.
Thanks
There are no negative slopes, so I assume you want those that are less than 5 degrees.
library(raster)
elevation <- getData('alt', country='CHE')
x <- terrain(elevation, opt='slope', unit='degrees')
z <- x <= 5
Now you can count cells with freq
f <- freq(z)
If you have a planar coordinate reference system (that is, with units in meters or similar) you can do
f <- cbind(f, area=f[,2] * prod(res(z)))
to get areas. But for lon/lat data, you would need to correct for different sized cells and do
a <- area(z)
zonal(a, z, fun=sum)
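If all you need is the headline percentage from the question, you can read it off the cell counts (a quick sketch, assuming z holds only 0/1 and NA cells; cnt is just a throwaway name, and for lon/lat data you would weight by area(z) as above rather than counting cells):
cnt <- freq(z)                                     # counts of FALSE (0), TRUE (1) and NA cells
cnt <- cnt[!is.na(cnt[, "value"]), , drop = FALSE] # drop the NA row
100 * cnt[cnt[, "value"] == 1, "count"] / sum(cnt[, "count"])  # % of cells with slope <= 5 degrees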
And there are different ways to plot, but the most basic one
plot(z)
You can use reclassify from the raster package to achieve that. The function assigns each cell value that lies within a defined interval a certain value. For example, you can assign cell values within the interval (0, 5] to the value 0 and cell values within the interval (5, maxSlope] to the value 1.
library(raster)
library(rasterVis)
elevation <- getData("alt", country = "PHL")
x <- terrain(elevation, opt = c("slope", "aspect"), unit = "degrees")
plot(x$slope)
e <- drawExtent(show = TRUE)
gewataSub <- crop(x, e)
plot(gewataSub$slope, 1)
m <- c(0, 5, 0, 5, maxValue(gewataSub$slope), 1)
rclmat <- matrix(m, ncol = 3, byrow = TRUE)
rc <- reclassify(gewataSub$slope, rclmat)
levelplot(
rc,
margin = F,
col.regions = c("wheat", "gray"),
colorkey = list(at = c(0, 1, 2), labels = list(at = c(0.5, 1.5), labels = c("<= 5", "> 5")))
)
After the reclassification you can calculate the percentages:
length(rc[rc == 0]) / (length(rc[rc == 0]) + length(rc[rc == 1])) # <= 5 degrees
[1] 0.6628788
length(rc[rc == 1]) / (length(rc[rc == 0]) + length(rc[rc == 1])) # > 5 degrees
[1] 0.3371212

Buy & sell after local minima and maxima

I've got a data series of the form date, open, high, low and close (prices). I want to create local maxima and minima for the close column of the data. I further want to buy two days after a local minimum in close and sell two days after a local maximum in close. I further want to calculate the profit and loss for the same. The code for the same is as under.
require(quantmod)
tckr1<-"^NSEI" 
start<-Sys.Date()-200
end<- format(Sys.Date(),"%Y-%m-%d") # yyyy-mm-dd 
getSymbols(tckr1, from=start, to=end) 
data<- NSEI$NSEI.Close
data$n <- 1:nrow(data)
data$z <- ZigZag(data$NSEI.Close , change = 2 , percent = T)
data$level<- data[c(findPeaks(data$z) , findValleys(data$z)) - 1 , ]
data$NSEI.Close.1<- NULL
data$n.1<- NULL
data$trade<- lag(data$level,2)
Now I need the data column to tell me when to buy and sell by +1 and -1 and also to calculate the profit and loss for the same. In the above-mentioned data I will buy when n = 29 at 5719.70 and when n = 36 at 5851.20, etc.
regards
Ashish
require(quantmod)
tckr1<-"^NSEI" 
start<-Sys.Date()-200
end<- format(Sys.Date(),"%Y-%m-%d") # yyyy-mm-dd 
getSymbols(tckr1, from=start, to=end) 
data<- NSEI$NSEI.Close
data$n <- 1:nrow(data)
data$z <- ZigZag(data$NSEI.Close , change = 2 , percent = T)
data$level<- data[c(findPeaks(data$z) , findValleys(data$z)) - 1 , ]
ex <- data[c(findPeaks(data$z) , findValleys(data$z)) - 1 , ]
data$trade<- data$level
data$trade[is.na(data$level)]<- 0
data$trade[data$trade!=0,]<- c(1,-1)
This way you can get your trade column +/- 1.
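Note that recycling c(1, -1) over the non-zero rows assumes the marked extremes strictly alternate and that the first one is a valley. A more explicit variant (just a sketch, not part of the original answer) is to tag peaks and valleys separately:
data$trade <- 0
data$trade[findValleys(data$z) - 1] <- 1    # buy signal at ZigZag valleys
data$trade[findPeaks(data$z) - 1]   <- -1   # sell signal at ZigZag peaks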
Just to provide an answer to the final "P/L" curve part of the question, the code below will generate an equity curve based upon ANUP's code:
require(PerformanceAnalytics)
ex <- data[c(findPeaks(data$z) , findValleys(data$z)) - 1 , ]
returns <- na.trim(ROC(ex$NSEI.Close) * Lag(ex$trade))  # strategy returns at the signal points
equity <- exp(cumsum(returns))                          # equity curve
charts.PerformanceSummary(returns)                      # this function expects returns, not the equity curve

solaR timestamp for radiation on a tilted surface

I am trying to use the R package solaR to calculate irradiance on a tilted plane given measured irradiance on the horizontal plane. I can get the code to work, but the final output timestamp does not make sense.
Data for this code can be found here. It is one day's worth of measured irradiance (global horizontal -- ghz, direct normal -- dir, diffuse horizontal -- dhz, and outdoor temp ta) for Austin, TX. The timestamp is local 'CST6CDT' time. The data is for a clear day, so the maximum value of global horizontal (ghz) should roughly correspond with solar noon (the time that the sun crosses the local meridian).
My code is as follows:
library(solaR)
sol_data <- read.csv(file)
# The data must be named a certain way.
names(sol_data) <- c('time', 'G0', 'B', 'D0', 'Ta')
# The negatives are an artifact of the sensor and are set to 0.
sol_data$G0 <- ifelse(sol_data$G0 < 0, 0, sol_data$G0)
sol_data$B <- ifelse(sol_data$B < 0, 0, sol_data$B)
sol_data$D0 <- ifelse(sol_data$D0 < 0, 0, sol_data$D0)
# This calculates the beam incidence on the horizontal plane.
sol_data$B0 <- sol_data$G0 - sol_data$D0
sol_data$B0 <- ifelse(sol_data$B0 < 0, 0, sol_data$B0)
# This takes the data and assigns the timestamp to a certain format and timezone
idxLocal <- with(sol_data, as.POSIXct(time, format='%Y-%m-%d %H:%M:%S', tz = 'CST6CDT'))
# This converts the timestamp to solar time
idx <- local2Solar(idxLocal, lon = -97.7428)
# Creates a zoo object needed to make the Meteo file for input
z <- zoo(sol_data[,c('G0', 'D0', 'B0', 'Ta')], idx)
# local latitude
lat = 30.2669
# Creates a Meteo file
My_Meteo <- zoo2Meteo(z, lat=lat)
# Finds the start and end date of the input file
start <- idx[1]
end <- idx[length(idx)]
# Returns a base time for the calculations
BTd <- fBTd(mode = 'serie', year = '2013', start = start, end = end, format = '%Y-%m-%d %H:%M:%S')
# Computes the movement of the sun/earth
sol <- calcSol(lat = 30.2669, BTd, sample = 'min')
# Creates a G0 file for solar rad on horizontal surface
compI <- calcG0(30.2669, modeRad = 'bdI', dataRad = My_Meteo, corr = 'none')
# creates the angles for calculation of the rad on a tilted surface
angGen <- fTheta(sol = sol, beta = 0, alfa = 0)
# Calculates the irradiance on a tilted surface
irad_tilt <- fInclin(compI, angGen)
When I use beta = 0, alfa = 0 (a flat plane) I should get roughly the same output as my input. However, when I search for the max value of global horizontal irradiance:
x <- which.max(irad_tilt$G)
irad_tilt[x,]
I get it to return a max at 2013-05-05 10:43:01 and I cannot figure out what this time is or why. It is not local time; that should be around 13:24. Local solar time should be around 12:00. UTC time should be around 18:24, and UTC solar time (if there is such a thing) should be 17:00...
I know this is obscure, but any thoughts?
I have tested the code and data on my computer with correct results. Let's reproduce the main steps with some graphical outputs:
library(solaR)
sol_data <- read.csv('/tmp/one_day_WSL_8.csv')
## The data must be named a certain way.
names(sol_data) <- c('time', 'G0', 'B', 'D0', 'Ta')
## The negatives are an artifact of the sensor and are set to 0.
sol_data$G0 <- ifelse(sol_data$G0 < 0, 0, sol_data$G0)
sol_data$B <- ifelse(sol_data$B < 0, 0, sol_data$B)
sol_data$D0 <- ifelse(sol_data$D0 < 0, 0, sol_data$D0)
## This calculates the beam incidence on the horizontal plane.
sol_data$B0 <- sol_data$G0 - sol_data$D0
sol_data$B0 <- ifelse(sol_data$B0 < 0, 0, sol_data$B0)
## This takes the data and assigns the timestamp to a certain format and timezone
idxLocal <- with(sol_data, as.POSIXct(time, format='%Y-%m-%d %H:%M:%S', tz = 'CST6CDT'))
The function local2Solar converts the time zone of a POSIXct object to mean solar time and sets its time zone to UTC as a synonym of mean solar time. It includes two corrections: the difference of longitudes between the location and the time zone, and the daylight saving time.
idx <- local2Solar(idxLocal, lon = -97.7428)
## Creates a zoo object needed to make the Meteo file for input
z <- zoo(sol_data[,c('G0', 'D0', 'B0', 'Ta')], idx)
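As a rough back-of-the-envelope check of the longitude part of that correction (illustrative arithmetic only, not solaR's internal code): Austin lies about 7.7 degrees west of the CST reference meridian, so its mean solar time runs about 31 minutes behind zone clock time.
## illustrative check of the longitude correction (not solaR internals)
lon <- -97.7428            # Austin, TX
zone_meridian <- -6 * 15   # CST is UTC-6, i.e. the 90 degrees west meridian
(zone_meridian - lon) * 4  # about 31 minutes (4 minutes of clock time per degree)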
Because your data belongs to a clear day and this time series uses
mean solar time, the maximum should be located around noon.
xyplot(z, type=c('l', 'g'))
Now we compute the sun geometry with calcSol. Here I am using a
different code from yours.
## local latitude
lat = 30.2669
## Computes the movement of the sun/earth
sol <- calcSol(lat, BTi=idx)
xyplot(as.zooI(sol), type=c('l', 'g'))
Next we calculate radiation on the horizontal surface.
g0 <- calcG0(lat, modeRad = 'bdI', dataRad = z, corr = 'none')
xyplot(as.zooI(g0), type=c('l', 'g'))
Finally, with calcGef we obtain irradiance on a tilted surface:
gef <- calcGef(lat=lat, modeRad='bdI', dataRad=z)
xyplot(as.zooI(gef), type=c('l', 'g'))
I suspect that your problem is related to the time zones defined on your computer. Could you check these results?
lonHH('America/Chicago')
## [1] -1.570796
lonHH('CST6CDT')
## [1] -1.570796
idxLocal1 <- as.POSIXct(sol_data$time, format='%Y-%m-%d %H:%M:%S', tz = 'CST6CDT')
idxLocal2 <- as.POSIXct(sol_data$time, format='%Y-%m-%d %H:%M:%S', tz = 'America/Chicago')
idxUTC1 <- as.POSIXct(format(idxLocal1, tz='UTC'), tz='UTC')
idxUTC2 <- as.POSIXct(format(idxLocal2, tz='UTC'), tz='UTC')
all.equal(idxUTC1, idxUTC2)
## [1] TRUE
Maybe these technical notes are useful for additional information on this topic:
Ripley, B. D. and Hornik, K. (2001), Date-Time Classes, R News 1(2), 8-11.
Grothendieck, G. and Petzoldt, T. (2004), Date and Time Classes in R, R News 4(1), 29-32.
Besides, you should take a look at the information and examples in help(timezone).
Thank you greatly for responding directly and for the great package. It turns out we had a wildly wrong interpretation of solar time. I am seeing a different possible issue that would not fit into the comments section.
When I run:
local2Solar(as.POSIXct("2013-07-07 13:36:00",tz="America/Chicago"),lon=-97.7428)
I get "2013-07-07 12:05:01 UTC". According to NOAA, "2013-07-07 13:36:00" is solar noon for that day.
Just to confuse matters, when I run:
local2Solar(as.POSIXct("2013-06-07 13:30:00",tz="America/Chicago"),lon=-97.7428)
I get "2013-06-07 11:59:01 UTC", so it appears to be very close. According to NOAA, "2013-06-07 13:30:00" is solar noon for that day.
If you were to run:
local2Solar(as.POSIXct("2013-01-07 12:37:27",tz="America/Chicago"),lon=-97.7428)
You would get "2013-01-07 12:06:28 UTC". According to NOAA, "2013-01-07 12:37:27"" is solar noon for that day.
I ran G. Master's equations separately from solaR and got "2013-06-07 13:29:30 CDT" (the highest precision is one minute for this version) for the time with maximum incident power for the first case on "2013-06-07".
