cannot calculte wind direction means with "timeAverage" function in openair - r

i'm working on a project where I use the openair project. Specifically, I want to use the "timeAverage" function to average a dataset at daily time step, but when performing such a task, the variable wind direction won't be extracted. Here is the R code i'm using:
library("pmetar")
library("openair")
library("dplyr")
library("stringr")
NFTL<-metar_get_historical("NFTL", start_date = "2022-01-14", end_date = "2022-01-18", from = "iastate")
decoded_NFTL <- metar_decode(NFTL, metric = TRUE, altimeter = FALSE)
NFTL_obs<-select(decoded_NFTL, METAR_Date, Wind_direction)
NFTL_obs1 <- NFTL_obs
NFTL_obs1$Wind_direction <- str_replace_all(NFTL_obs1$Wind_direction, 'Variable', 'NA')
NFTL_obs1$Wind_direction<-gsub(",.*","",NFTL_obs1$Wind_direction)
names(NFTL_obs1)[names(NFTL_obs1) == "METAR_Date"] <- "date"
names(NFTL_obs1)[names(NFTL_obs1) == "Wind_direction"] <- "wd"
daily <- timeAverage(NFTL_obs1, avg.time = "day")
In this example, you can check that the wind direction (wd) variable was not extracted when executing the last command from openair, how can I fix this?

The reason to keep Wind_direction as the character is that sometimes the value is the text "Variable" or "variable from n1 to n2". You can use the function metar_dir for extracting wind directions as numeric.
NFTL_obs$Wind_direction <- metar_dir(NFTL, numeric_only = TRUE)

Related

Exporting Data Frame to Excel in R

enter image description hereI try to export the data (which I got by coding from Yahoo). I can export data, but the date part does not go through, only the value part goes through. How can I export the date and value together to excel.
#Code:
library(tseries)
library(prophet)
library(tidyverse)
library(writexl)
library(readxl)
#determine date
start = "2013-01-01"
end = "2021-05-25"
#get the data from sources
TL <- get.hist.quote(instrument = "TRYUSD=X",
start = start,
end = end,
quote = "Close",
compression = "d")
#change the format of the dataset
y <- data.frame(TL)
#convert dataframe to excel document
write_xlsx(y,"C:/Users/hay/OneDrive/Desktop/Turkish Lira Forecast/TL.xlsx")#Load the dataset
#upload the dataset from excel file
bitcoin <- read_excel("C:/Users/hay/OneDrive/Desktop/bitcoin.xlsx")
View(bitcoin)
#call the prophet function to fit the model
model <- prophet(TL)
future <- make_future_dataframe(model, periods = 365)
tail(future)
#forecast
forecast <- predict(model, future)
tail(forecast[c('ds', 'yhat', 'yhat_lower', 'yhat_upper')])
#plot the model estimates
dyplot.prophet(model, forecast)
prophet_plot_components(model, forecast)
We can use
library(dplyr)
library(tibble)
y %>%
rownames_to_column('Date') %>%
writexl::write_xlsx("data.xlsx")
The date part can be extracted from the rownames -
y$Date <- rownames(y)
writexl::write_xlsx(y,"data.xlsx")

Why does this code plot by day of the week?

Link to the data set which is a date and time column along with electricity usage columns
https://d396qusza40orc.cloudfront.net/exdata%2Fdata%2Fhousehold_power_consumption.zip
power1 <- read.csv(file = "c:/datasets/household_power_consumption.txt", stringsAsFactors=F, header = TRUE,
sep=";", dec = ".", na.strings="?", col.names = c("date1","time1","Global_active_power", "Global_reactive_power",
"Voltage","Global_intensity","Sub_metering_1","Sub_metering_2",
"Sub_metering_3"))
power1$date1 <- as.Date(power1$date1, format="%d/%m/%Y")
power2 <- subset(power1, subset=(date1 >= "2007-02-01" & date1 <= "2007-02-02"))
datetime1 <- paste(as.Date(power2$date1), power2$time1)
power2$Datetime <- as.POSIXct(datetime1)
plot(power2$Global_active_power~power2$Datetime, type="l", ylab="Global Active Power (kilowatts)", xlab="")
When I run the above, I get the graph like I'm supposed to with the days of the week on the x axis even when I run summary, head and str() I don't see anything in the data about days of the week.
I tried to add my own day column with mutate but it didn't work.
And it didn't work when I subset it like the following. It subset properly where I had only the data I needed, but it wouldn't plot with the date1 column or the day of the week column I created via mutate
power2 <- subset(power1, subset=(as.Date(date1, format = "%d/%m/%Y") >= "2007-02-01"
& as.Date(date1, format = "%d/%m/%Y") <= "2007-02-02"))
I know that as.Posixct will have all the metadata in there, but I don't understand why is it when I combine the date and time columns into it's own column only then it plots by day of the week graphwithout me asking.
When I run it like this, the combined date and time column data is corrupted with the wrong year
power11 <- read.csv(file = "c:/datasets/household_power_consumption.txt", stringsAsFactors=F, header = TRUE,
sep=";", dec = ".", col.names = c("date1","time1","Global_active_power", "Global_reactive_power",
"Voltage","Global_intensity","Sub_metering_1","Sub_metering_2",
"Sub_metering_3"))
#colClasses = c("Date", "character", "factor", "numeric","numeric","numeric","numeric","numeric","numeric"))
power22 <- subset(power11, subset=(as.Date(date1, format = "%d/%m/%Y") >= "2007-02-01"
& as.Date(date1, format = "%d/%m/%Y") <= "2007-02-02"))
datetime1 <- paste(as.Date(power22$date1), power22$time1)
power22$Datetime <- as.POSIXct(datetime1)
Maybe this link would be helpful:
http://earlh.com/blog/2009/07/07/plotting-with-custom-x-axis-labels-in-r-part-5-in-a-series/
add an argument to your plot() call: xaxt='n'
plot(power2$Global_active_power~power2$Datetime, type="l", ylab="Global Active Power (kilowatts)", xlab="", xaxt='n')
that tells plot not to add x-axis labels. Then add an axis() call:
axis(side=1, at=power22$Datetime, labels=format(power22$Datetime, '%b-%y'))
I used '%b-%y' here, because that's what I saw on the site I referenced, but you would want to use the format code appropriate to your needs.

using wux for ensemble climate data

I am very new to R, and I am working with New England climate data. I am currently attempting to use the package WUX to find ensemble averages for each year of the mean, minimum, and maximum temperatures across all 29 climate models. In the end for example, I want to have a raster stack of model averages, one stack for each year. The ultimate goal is to obtain a graph that shows variability. I have attempted to read through the WUX.pdf online, but because I am so new, and because it is such a general overview I feel I am getting lost. I need help developing a simple framework to run the model. My script so far has a general outline of what I think I need. Correct me if I am wrong, but I think I want to be using the 'models2wux' function. Bare in mind that my script is a bit messy at this point.
#This script will:
#Calculate ensemble mean, min, and max across modules within each year.
#For example, the script will find the average temp across all modules
#for the year 1980. It will do the same for all years.
#There will be a separate ensemble mean, max, and min for each scenario
library(raster)
library(rasterVis)
library(utils)
library(wux)
library(lattice)
#wux information
#https://cran.r-project.org/web/packages/wux/wux.pdf
path <- "/net/nfs/merrimack/raid/Northeast_US_Downscaling_cmip5/"
vars <- c("tasmin", "tasmax") #, "pr")
mods <- c("ACCESS1-0", "ACCESS1-3", "bcc-csm1-1", "bcc-csm1-1-m")
#"CanESM2", "CCSM4", "CESM1-BGC", "CESM1-CAM5", "CMCC-CM",
#"CMCC-CMS", "CNRM-CM5", "CSIRO-Mk3-6-0", "FGOALS-g2", "GFDL-CM3",
#"GFDL-ESM2G", "GFDL-ESM2M", "HadGEM2-AO", "HadGEM2-CC", "HadGEM2-ES",
#"inmcm4", "IPSL-CM5A-LR", "IPSL-CM5A-MR", "MIROC5", "MIROC-ESM-CHEM",
#"MIROC-ESM", "MPI-ESM-LR", "MPI-ESM-MR", "MRI-CGCM3", "NorESM1-M")
scns <- c("rcp45", "rcp85") #, "historical")
#A character vector containing the names of the models to be processed
climate.models <- c(mods)
#ncdf file for important cities we want to look at (with lat/lon)
cities.path <-
"/net/home/cv/marina/Summer2017_Projects/Lat_Lon/NE_Important_Cities.csv"
necity.vars <- c("City", "State", "Population",
"Latitude", "Longitude", "Elevation(meters")
# package = wux -- models2wux
#models2wux(userinput, modelinput)
#modelinput information
#Start 4 Loops to envelope netcdf data
for (iv in 1:2){
for (im in 1:4){
for (is in 1:2){
for(i in 2006:2099){
modelinput <- paste(path, vars[iv], "_day_", mods[im], "_", scns[is], "_r1i1p1_", i, "0101-", i, "1231.16th.nc", sep="")
print(modelinput)
} # end of year loop
} # end of scenario loop
} # end of model loop
} # end of variable loop
# this line will print the full file name
print(full)
#more modelinput information necessary? List of models
# package = wux -- models2wux
#userinput information
parameter.names <- c("tasmin", "tasmax")
reference.period <- "2006-2099"
scenario.period <- "2006-2099"
temporal.aggregation <- #maybe don't need this
subregions <- # will identify key areas we want to look at (important cities?)
#uses projection file
#These both read the .csv file (first uses 'utils', second uses 'wux')
#1
cities.read <- read.delim(cities.path, header = T, sep = ",")
#2
read.table <- read.wux.table(cities.path)
cities.read <- subset(cities.read, subreg = "City", sep = ",")
# To read only "Cities", "Latitude", and "Longitude"
cities.points <- subset(cities.read, select = c(1, 4, 5))
cities.points <- as.data.frame(cities.points)
colnames(cities.points)<- c("City", "Latitude", "Longitude" )
#Set plot coordinates for .csv graph
coordinates(cities.points) <- ~ Longitude + Latitude
proj4string(cities.points) <- c("+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0")
subregions <- proj4string(cities.points)
area.fraction <- # reread pdf on this (p.24)
#Do we want area.fraction = T or F ? (FALSE is default behavior)
spatial.weighting <- FALSE
#spatial.weighting = TRUE enables cosine weighting of latitudes, whereas
omitting or setting
#FALSE results in unweighted arithmetic areal mean (default). This option is
valid only for data on a regular grid.
na.rm = FALSE #keeps NA values
#plot.subregions refer to pdf pg. 25
#save.as.data saves data to specific file
# 1. use the brick function to read the full netCDF file.
# note: the varname argument is not necessary, but if a file has multiple varables, brick will read the first one by default.
air_t <- brick(full, varname = vars[iv])
# 2. use the calc function to get average, min, max for each year over the entire set of models
annualmod_ave_t <- calc(air_t, fun = mean, na.rm = T)
annualmod_max_t <- calc(air_t, fun = max, na.rm = T)
annualmod_min_t <- calc(air_t, fun = min, na.rm = T)
if(i == 2006){
annual_ave_stack = annualmod_ave_t
}else if{
annual_ave_stack <- stack(annual_ave_stack, annualmod_max_t))
}else{
annual_ave_stack <- stack(annual_ave_stack, annualmod_min_t)
} # end of if/else

Plot function outside the candlestick pattern in R

I have two xts objects: stock and base. I calculate the relative strength (which is simply the ratio of closing price of stock and of the base index) and I want to plot the weekly relative strength outside the candlestick pattern. The links for the data are here and here.
library(quantmod)
library(xts)
read_stock = function(fichier){ #read and preprocess data
stock = read.csv(fichier, header = T)
stock$DATE = as.Date(stock$DATE, format = "%d/%m/%Y") #standardize time format
stock = stock[! duplicated(index(stock), fromLast = T),] # Remove rows with a duplicated timestamp,
# but keep the latest one
stock$CLOSE = as.numeric(stock$CLOSE) #current numeric columns are of type character
stock$OPEN = as.numeric(stock$OPEN) #so need to convert into double
stock$HIGH = as.numeric(stock$HIGH) #otherwise quantmod functions won't work
stock$LOW = as.numeric(stock$LOW)
stock$VOLUME = as.numeric(stock$VOLUME)
stock = xts(x = stock[,-1], order.by = stock[,1]) # convert to xts class
return(stock)
}
relative.strength = function(stock, base = read_stock("vni.csv")){
rs = Cl(stock) / Cl(base)
rs = apply.weekly(rs, FUN = mean)
}
stock = read_stock("aaa.csv")
candleChart(stock, theme='white')
addRS = newTA(FUN=relative.strength,col='red', legend='RS')
addRS()
However R returns me this error:
Error in `/.default`(Cl(stock), Cl(base)) : non-numeric argument to binary operator
How can I fix this?
One problem is that "vni.csv" contains a "Ticker" column. Since xts objects are a matrix at their core, you can't have columns of different types. So the first thing you need to do is ensure that you only keep the OHLC and volume columns of the "vni.csv" file. I've refactored your read_stock function to be:
read_stock = function(fichier) {
# read and preprocess data
stock <- read.csv(fichier, header = TRUE, as.is = TRUE)
stock$DATE = as.Date(stock$DATE, format = "%d/%m/%Y")
stock = stock[!duplicated(index(stock), fromLast = TRUE),]
# convert to xts class
stock = xts(OHLCV(stock), order.by = stock$DATE)
return(stock)
}
Next, it looks like the the first argument to relative.strength inside the addRS function is passed as a matrix, not an xts object. So you need to convert to xts, but take care that the index class of the stock object is the same as the index class of the base object.
Then you need to make sure your weekly rs object has an observation for each day in stock. You can do that by merging your weekly data with an empty xts object that has all the index values for the stock object.
So I refactored your relative.strength function to:
relative.strength = function(stock, base) {
# convert to xts
sxts <- as.xts(stock)
# ensure 'stock' index class is the same as 'base' index class
indexClass(sxts) <- indexClass(base)
index(sxts) <- index(sxts)
# calculate relative strength
rs = Cl(sxts) / Cl(base)
# weekly mean relative strength
rs = apply.weekly(rs, FUN = mean)
# merge 'rs' with empty xts object contain the same index values as 'stock'
merge(rs, xts(,index(sxts)), fill = na.locf)
}
Now, this code:
stock = read_stock("aaa.csv")
base = read_stock("vni.csv")
addRS = newTA(FUN=relative.strength, col='red', legend='RS')
candleChart(stock, theme='white')
addRS(base)
Produces this chart:
The following line in your read_stock function is causing the problem:
stock = xts(x = stock[,-1], order.by = stock[,1]) # convert to xts class
vni.csv has the actual symbol name in the third column of your data, so when you put stock[,-1] you're actually including a character column and xts forces all the other columns to be characters as well. Then R alerts you about dividing a number by a character at Cl(stock) / Cl(base). Here is a simple example of this error message with division:
> x <- c(1,2)
> y <- c("A", "B")
> x/y
Error in x/y : non-numeric argument to binary operator
I suggest you remove the character column in vni.csv that contains "VNIndex" in every row or modify your function called read_stock() to better protect against this type of issue.

I want to run code on data frame up to a certain date (column 2)

I am trying to run code on a data frame up to a certain date. I have individual game statistics, the second column is Date in order. I thought this is how to do this however I get an error:
Error in `[.data.frame`(dfmess, dfmess$Date <= Standingdate) :
undefined columns selected
Here is my code:
read.csv("http://www.football-data.co.uk/mmz4281/1516/E0.csv")
dfmess <- read.csv("http://www.football-data.co.uk/mmz4281/1516/E0.csv", stringsAsFactors = FALSE)
Standingdate <- as.Date("09/14/15", format = "%m/%d/%y")
dfmess[dfmess$Date <= Standingdate] -> dfmess
You probably want to convert dfmess$Date to as.Date first prior to comparing. In addition, per #Roland's comment, you require an additional comma ,:
dfmess <- read.csv("http://www.football-data.co.uk/mmz4281/1516/E0.csv", stringsAsFactors = FALSE)
dfmess$Date <- as.Date(dfmess$Date, "%m/%d/%y")
Standingdate <- as.Date("09/14/15", format = "%m/%d/%y")
dfmess[dfmess$Date <= Standingdate, ]

Resources