Hi and thanks in advance.
So I'm trying very hard to get a working forecasting function going in R, but I'm having no luck.
Here's what I need to accomplish:
1) Extract two sets of data from a single txt file (one is a quantity value, the other is a time). The number of records must be able to vary (e.g. 4 quantity values at 4 different times, or 5, or 6, etc.).
For this I have tried to use a data frame, but the later forecasting function won't accept it for some reason.
2) Feed the data into a forecasting function to generate new 'forecasted' data. (The forecasting method itself does not matter, except that I can't use a mean forecast because it is too simple; something like naive() or rwf() will work just fine.)
3) Save the newly produced 'forecasted' data to another txt file for storage.
Here is my code so far:
I used these lines to create and save my sample data (I currently have 18 records):
library(forecast)
library(ggplot2)
library(reshape2)
Quantity <- c(5,3,8,4,0,5,2,7,4,2,6,8,4,7,8,9,4,6)
Time <- c("2010/01/01", "2010/07/02", "2010/08/03", "2011/02/04", "2011/11/05", "2011/12/06", "2012/06/07", "2012/08/30", "2013/04/16", "2013/03/18", "2014/02/22", "2014/01/27", "2015/12/15", "2015/09/08", "2016/05/04", "2017/11/07", "2017/09/22", "2017/04/04")
Frame <- data.frame(Time,Quantity)
write.table(Frame,file="....path..../Frame.txt",quote=F)
I then used this line to read that data back into (hopefully) a data frame, or any container that can hold the data from both of the vectors above:
Frame <- read.table("....path..../Frame.txt")
To be sure, I attempted to plot the data (not something I actually need) just to see whether the program had read my data properly, but to no avail.
There were 4 plot attempts (unfortunately none succeeded):
1-
plot.ts(Frame)
2-
Frame <- read.table("....path..../Frame.txt")
3-
Frame <- window(start = 2000, end = 2019)
autoplot(Frame) + autolayer(meanf(Frame,h=11),PI=FALSE, series="Means") + autolayer(naive(Frame, h=11),PI=FALSE,series="Naive") + ggtitle("Quantity vs Time") + xlab("Time") + ylab("Quantity") + guides(colour=guide_legend(title="Forecast"))
4-
plot(Frame, xlab="Time",ylab="Quantity",main="Stock Quantity vs Time",type='l')
I have yet to reach the part where I need to send the new 'forecasted' data to a new txt file, so I don't have any attempt code for that.
Any help is appreciated. Thank you.
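One way the three steps could fit together is sketched below. It is only a sketch under some assumptions: the files live in a temporary directory (the file names are placeholders for the elided path), only the first six sample records are used, and base R's arima() with order (0, 1, 0) stands in for a naive forecast, since a random walk gives the same point forecasts as naive() or rwf() without drift. The main point is that only the numeric Quantity column is passed to ts(); the Time column is text, which is likely why the whole data frame is rejected by plot.ts() and the forecasting functions.

```r
# Placeholder file locations (assumption): swap in your own paths
in_file  <- file.path(tempdir(), "Frame.txt")
out_file <- file.path(tempdir(), "Forecast.txt")

# Step 0: write a few of the sample records, as in the question
Quantity <- c(5, 3, 8, 4, 0, 5)
Time <- c("2010/01/01", "2010/07/02", "2010/08/03",
          "2011/02/04", "2011/11/05", "2011/12/06")
write.table(data.frame(Time, Quantity), file = in_file, quote = FALSE)

# Step 1: read the file back; the number of rows can vary freely
Frame <- read.table(in_file, stringsAsFactors = FALSE)
Frame$Time <- as.Date(Frame$Time, format = "%Y/%m/%d")
Frame <- Frame[order(Frame$Time), ]   # make sure records are in time order

# Only the numeric column goes into the time series object
qty <- ts(Frame$Quantity)

# Step 2: random-walk (naive) forecast, h steps ahead
h <- 5
fit <- arima(qty, order = c(0, 1, 0))
pred <- predict(fit, n.ahead = h)

# Step 3: store the forecasted values in a second txt file
Forecast <- data.frame(Step = seq_len(h), Quantity = as.numeric(pred$pred))
write.table(Forecast, file = out_file, quote = FALSE, row.names = FALSE)
```

Note that ts() treats the observations as equally spaced, which the sample dates are not, so the forecast steps here mean "next record" rather than a fixed calendar interval.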
I'm sure this is a simple question, but I'm relatively new here. I'm trying to extract the forecasted values into a CSV/table I can use outside of R. I followed along with the multiple series example from here: https://www.mitchelloharawild.com/blog/fable/ . I'm trying to extract the 2 years of forecasted data produced in this step:
fit %>%
forecast(h = "2 years") %>%
autoplot(tourism_state, level = NULL)
I can see the 3 models in the autoplot, but can't figure out how to get the forecasted values out of fit. Any help is appreciated. It looks like quite a bit of information can be generated (forecast intervals, etc.), so if there is a reference on what can be exported and how, please let me know. Thanks!
The forecasted values of a fable can be saved to a csv using readr::write_csv().
When used with columns that are not in a flat format (such as forecast distributions or intervals), the values will be stored as character strings and information will be lost. Before writing to a file, you should flatten these structures by extracting their components into separate columns.
You can use unpack_hilo() to extract the lower, upper, and level values within a <hilo> to create a flat data structure. Alternatively you can access the components of a <hilo> with $, for example: my_interval$lower.
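As a rough illustration of the flattening step, the sketch below forecasts a small made-up quarterly series (a stand-in for tourism_state), pulls the 95% interval bounds out with $, and writes a flat table. The toy data, the SNAIVE model, the column names lower_95 and upper_95, and the output file name are all assumptions made for the example, and it assumes a fable version in which forecasts carry a .mean column.

```r
library(fable)
library(tsibble)
library(dplyr)

# Made-up quarterly data (assumption), 5 years long
toy <- tsibble(
  Quarter = yearquarter("2015 Q1") + 0:19,
  Trips   = c(120, 135, 150, 128, 125, 140, 158, 131, 129, 146,
              161, 137, 133, 149, 166, 140, 138, 153, 170, 144),
  index   = Quarter
)

fc <- toy %>%
  model(snaive = SNAIVE(Trips)) %>%
  forecast(h = "2 years")

# Build the <hilo> column, then extract its parts into plain numeric columns
flat <- fc %>%
  hilo(level = 95) %>%
  as_tibble() %>%
  transmute(
    Quarter  = format(Quarter),
    .model,
    .mean,
    lower_95 = `95%`$lower,
    upper_95 = `95%`$upper
  )

# Every remaining column is character or numeric, so nothing is lost on write
out_csv <- file.path(tempdir(), "forecasts.csv")
readr::write_csv(flat, out_csv)
```

Dropping the distribution column (via transmute()) is deliberate: it is the column that would otherwise be written as a character string.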
I need to find a way to run a loop on ARIMA code across multiple files or within a data frame
This is for the thesis project I am working on. However, given the sample size, running the code one file at a time will be too tedious and time consuming. Is there a way I can get the code below to work in a loop if I had all my observations in a data frame?
Or alternatively, how can I get it to read multiple files and run the same code automatically?
library(readxl)
X104485 <- read_excel("Wits Business School/Thesis/Trial 1/Pilot - Data Files/104485.xlsx")
library(forecast)
myts <- ts(X104485, start=c(2015, 1), end=c(2019, 5), frequency=12)
fit <- arima(myts, order=c(1, 1, 36))
fcast <- forecast(fit,31)
write.csv((fcast), file = "X104485.csv")
The code above works - I just need it to be efficient in running several iterations at a time.
Please help.
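A loop over files could be sketched roughly as follows. To keep the example self-contained it makes several assumptions that differ from the code above: it generates two small CSV files in a temporary folder instead of reading .xlsx files (readxl::read_excel() could be swapped in for read.csv()), it uses base R's arima() and predict() rather than forecast(), and the order c(0, 1, 1) is only a quick-to-fit placeholder for your own c(1, 1, 36). The folder name and the function name forecast_one are hypothetical.

```r
set.seed(1)

# Hypothetical data folder with two toy input files (53 monthly values each)
data_dir <- file.path(tempdir(), "pilot_files")
dir.create(data_dir, showWarnings = FALSE)
for (id in c("104485", "104486")) {
  write.csv(data.frame(value = cumsum(rnorm(53))),
            file.path(data_dir, paste0(id, ".csv")), row.names = FALSE)
}

# Fit and forecast a single file, then write the result next to it
forecast_one <- function(path, h = 31) {
  x    <- read.csv(path)
  myts <- ts(x[[1]], start = c(2015, 1), frequency = 12)
  fit  <- arima(myts, order = c(0, 1, 1), method = "ML")  # placeholder order
  pred <- predict(fit, n.ahead = h)
  out  <- data.frame(step  = seq_len(h),
                     point = as.numeric(pred$pred),
                     se    = as.numeric(pred$se))
  out_path <- file.path(dirname(path), paste0("fcast_", basename(path)))
  write.csv(out, out_path, row.names = FALSE)
  out_path
}

# Pick up only the numeric-named input files, not earlier fcast_ outputs
files   <- list.files(data_dir, pattern = "^[0-9]+\\.csv$", full.names = TRUE)
written <- vapply(files, forecast_one, character(1))
```

Wrapping the per-file work in one function keeps the loop itself to a single line, and the same function can be applied to the columns of a data frame with lapply() if the series are stored that way.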
I have a sample usage table of 'Account','Asset','Date','Asset Network Usage' with 15 days of summarised Usage data per Asset. I am trying to append the table with forecasted usage per day over the next 15 days, or at least create an output with the same table structure.
E.g.
Date (m/d/Y)  Account   Asset    Network Usage
4/4/2019      Acct#100  AS-4310  56.5251
4/5/2019      Acct#100  AS-4310  592.1843
4/6/2019      Acct#100  AS-4310  556.1898
4/7/2019      Acct#100  AS-4310  808.2403
4/8/2019      Acct#100  AS-4310  466.118
I've been able to produce the appended table when aggregating only by Date. I want to include Date / Account / Asset, but I'm struggling to set an index that doesn't cause an error in the time series ts() function.
library(forecast)
library(ggfortify)
dataset <- as.data.frame(read.csv(file="/path/Data.csv",header=TRUE,sep=","))
dataset <- aggregate(Network_Usage ~ Date,data = dataset, FUN= sum)
ts <- ts(dataset$Network_Usage, frequency=15)
decom <- stl(ts,s.window = "periodic")
pred <- forecast(decom,h = 15)
fort <- fortify(pred,ts.connect= TRUE )
Any suggestions on syntax updates, or use of a different method to achieve my outcome?
I think forecast only works on objects convertible to matrices. My suggestion is to use lists and predict only the "values", while keeping the other relevant information (Account, Asset) in separate list elements.
If you provide a dput() dataset I can create an example for you.
Good Luck.
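The idea in the answer above (set the identifying columns aside and forecast only the values) could look roughly like this with split() and lapply(). Treat it as a sketch: the second asset and its values are invented, the function name forecast_group is hypothetical, and a random-walk model via arima(order = c(0, 1, 0)) is only a placeholder for whatever method suits the data. One thing to be aware of is that stl() needs at least two full periods, so 15 daily values with frequency = 15 cannot be decomposed.

```r
# Toy usage table: the first asset is from the question, the second is made up
usage <- data.frame(
  Date    = rep(as.Date("2019-04-04") + 0:4, times = 2),
  Account = "Acct#100",
  Asset   = rep(c("AS-4310", "AS-4311"), each = 5),
  Network_Usage = c(56.5251, 592.1843, 556.1898, 808.2403, 466.118,
                    210.5, 305.2, 150.9, 420.0, 388.7)
)

# Forecast one Account/Asset group and return rows shaped like the input
forecast_group <- function(d, h = 15) {
  d   <- d[order(d$Date), ]
  fit <- arima(ts(d$Network_Usage), order = c(0, 1, 0))  # placeholder model
  data.frame(
    Date    = max(d$Date) + seq_len(h),
    Account = d$Account[1],
    Asset   = d$Asset[1],
    Network_Usage = as.numeric(predict(fit, n.ahead = h)$pred)
  )
}

# One forecast per Account/Asset combination, then append to the original
groups <- split(usage, list(usage$Account, usage$Asset), drop = TRUE)
future <- do.call(rbind, lapply(groups, forecast_group))
result <- rbind(usage, future)
rownames(result) <- NULL
```

Because each group is handled separately, ts() only ever sees a single numeric vector, which avoids the indexing error from the question.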
I'm new to R, and new to this forum. I've searched but cannot easily find an answer to this question:
I have numbers of cases of a disease by week according to location, stored in a .csv file with variable names cases.wk24, cases.wk25, etc. I also have population for each location, and want to generate incidence rates (# cases/population) for each of the locations.
I would like to write a loop that generates incidence rates by location for each week, and stores these in new variables called "ir.wk24", "ir.wk25", etc
I am stuck at 2 points:
1. Is it possible to tell R to run a loop over every variable whose name looks like "cases.wk"? In some programmes, one would use a star: cases.wk*
2. How could I then generate the new variables with sequential naming and store these in the dataset?
I really appreciate any help on this. I've been stuck on internet searches all day!
Thanks
x <- data.frame(case.wk24=c(1,3),case.wk25=c(3,2), pop=c(7,8))
weeks <- 24:25
varnames <- paste("case.wk", weeks, sep="")
ir <- sapply(varnames, FUN=function(.varname){
  x[,.varname]/x[,"pop"]
})
ir <- as.data.frame(ir)
names(ir) <- paste("ir.wk", weeks, sep="")
x <- cbind(x,ir)
x
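For the wildcard part of the question, a pattern match on the column names can stand in for the hard-coded week numbers. The following is a small sketch with the same toy values as above, assuming the columns are named cases.wkNN as in the question (the answer's code uses case.wk).

```r
x <- data.frame(cases.wk24 = c(1, 3), cases.wk25 = c(3, 2), pop = c(7, 8))

# Find every column whose name starts with "cases.wk" (like cases.wk*)
case_cols <- grep("^cases\\.wk", names(x), value = TRUE)

# Derive the matching output names: cases.wk24 -> ir.wk24, and so on
ir_cols <- sub("^cases", "ir", case_cols)

# Divide each case column by population and store it under the new name
x[ir_cols] <- lapply(x[case_cols], function(v) v / x$pop)
x
```

New weeks added to the file are picked up automatically, since nothing in the code refers to a specific week number.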
I want to use ChemoSpec with mass spectra of about 60,000 data points.
I already have them in one txt file as a matrix (X + 90 samples = 91 columns; 60,000 rows).
How can I turn this file into spectra data without exporting each sample again as a separate csv file (which takes quite long in R given the size of my data)?
The typical (and only?) way to import data into ChemoSpec is the getManyCsv() function, which, as the question indicates, requires one CSV file for each sample.
Creating 90 CSV files from the 91-column, 60,000-row file described may be somewhat slow and tedious in R, but it could be done with a standalone application, whether an existing utility or an ad-hoc script.
An R-only solution would be to create a new function, say getOneBigCsv(), adapted from getManyCsv(). After all, the logic of getManyCsv() is relatively straightforward.
Don't expect such a solution to be blazingly fast, but it should at least match the time it takes to run getManyCsv(), and it avoids having to create and manage the many files, so overall it should be faster and certainly less messy.
Sorry I missed your question 2 days ago. I'm the author of ChemoSpec - always feel free to write directly to me in addition to posting somewhere.
The solution is straightforward. You already have your data in a matrix (after you read it in with read.csv("file.txt")), so you can use it to manually create a Spectra object. In the R console type ?Spectra to see the structure of a Spectra object, which is a list with specific entries. You will need to put your X column (which I assume is mass) into the freq slot. The rest of the data matrix goes into the data slot. Then manually create the other needed entries (making sure the data types are correct). Finally, assign the Spectra class to your completed list by doing something like class(my.spectra) <- "Spectra" and you should be good to go. I can give you more details on or off list if you describe your data a bit more fully. Perhaps you have already solved the problem?
By the way, ChemoSpec is totally untested with MS data, but I'd love to find out how it works for you. There may be some changes that would be helpful so I hope you'll send me feedback.
Good Luck, and let me know how else I can help.
Many years have passed, and I am not sure if anybody is still interested in this topic. But I had the same problem and wrote a little workaround that converts my data to class 'Spectra' by extracting the information from the data itself:
# Assumption:
# Data is stored as a numeric data.frame whose column names identify the samples
# and whose row names hold the domain axis (e.g. frequency or mass)
dataframe2Spectra <- function(Spectrum_df,
                              freq = as.numeric(rownames(Spectrum_df)),
                              data = as.matrix(t(Spectrum_df)),
                              names = paste("YourFileDescription", 1:dim(Spectrum_df)[2]),
                              groups = rep(factor("Factor"), dim(Spectrum_df)[2]),
                              colors = rainbow(dim(Spectrum_df)[2]),
                              sym = 1:dim(Spectrum_df)[2],
                              alt.sym = letters[1:dim(Spectrum_df)[2]],
                              unit = c("a.u.", "Domain"),
                              desc = "Some signal. Describe it with 'desc'"){
  # Build an empty named list with the entries a Spectra object expects
  features <- c("freq", "data", "names", "groups", "colors", "sym", "alt.sym", "unit", "desc")
  Spectrum_chem <- vector("list", length(features))
  names(Spectrum_chem) <- features
  Spectrum_chem$freq <- freq
  Spectrum_chem$data <- data
  Spectrum_chem$names <- names
  Spectrum_chem$groups <- groups
  Spectrum_chem$colors <- colors
  Spectrum_chem$sym <- sym
  Spectrum_chem$alt.sym <- alt.sym
  Spectrum_chem$unit <- unit
  Spectrum_chem$desc <- desc
  # important step
  class(Spectrum_chem) <- "Spectra"
  # some warnings (data must be samples in rows, freq values in columns)
  if (length(freq)!=dim(data)[2]) print("Dimension of data is NOT #samples X length of freq")
  if (length(names)>dim(data)[1]) print("Too many names")
  if (length(names)<dim(data)[1]) print("Too few names")
  if (length(groups)>dim(data)[1]) print("Too many groups")
  if (length(groups)<dim(data)[1]) print("Too few groups")
  if (length(colors)>dim(data)[1]) print("Too many colors")
  if (length(colors)<dim(data)[1]) print("Too few colors")
  if (!is.matrix(data) || !is.numeric(data)) print("'data' is not a matrix or it's not numeric")
  return(Spectrum_chem)
}
# 'Spectrum' is your own data.frame in the layout described at the top
Spectrum_chem <- dataframe2Spectra(Spectrum)
chkSpectra(Spectrum_chem)
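To get from the single txt file in the original question to the layout that dataframe2Spectra() assumes, something like the following might work. It is a sketch with a tiny made-up file: the tab separator, the header row, the column names X, S1 and S2, and the file name are all assumptions about how the real file is laid out.

```r
# Tiny stand-in for the real 91-column file (assumed tab-separated with header)
big_file <- file.path(tempdir(), "spectra_matrix.txt")
toy <- data.frame(X  = c(100.1, 100.2, 100.3),
                  S1 = c(0.5, 1.2, 0.9),
                  S2 = c(0.4, 1.5, 0.7))
write.table(toy, big_file, sep = "\t", row.names = FALSE, quote = FALSE)

# Read the whole matrix at once; the X column becomes the row names
Spectrum <- read.table(big_file, header = TRUE, sep = "\t", row.names = 1)

# The two pieces a Spectra object needs most: the axis and samples-in-rows data
freq <- as.numeric(rownames(Spectrum))
data <- t(as.matrix(Spectrum))

# Sanity check: one column of data per freq value
stopifnot(ncol(data) == length(freq))
```

With Spectrum in this shape, no per-sample csv export is needed before building the Spectra object.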