I have a question about the best approach to building a predictive model in R for my data.
Say I have a series of orders per month for the past 5 years. The data have three variables: month, year, and sum of orders.
What is the best way to build a model that predicts the number of orders for next month from the number of orders over the past 6 months and the normal seasonal peaks and troughs? How should I approach this problem in R?
Unfortunately I do not have the data at hand; I am just asking generally how to approach this problem in R.
Thanks in advance.
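One standard approach is to treat the counts as a monthly time series and fit a seasonal model with the forecast package. A minimal sketch, assuming a data frame orders with a column n_orders ordered by month; the column name and the 2011 start date are hypothetical:

    library(forecast)

    # monthly counts as a time series; frequency = 12 lets the model
    # pick up the regular seasonal peaks and troughs
    # (the start year is an assumption for illustration)
    y <- ts(orders$n_orders, start = c(2011, 1), frequency = 12)

    # automatically selected seasonal ARIMA, then a one-month-ahead forecast
    fit <- auto.arima(y)
    fc  <- forecast(fit, h = 1)
    plot(fc)

HoltWinters(y) from base R is a simpler seasonal alternative if you prefer to avoid extra packages.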
I am a cross country runner on a high school team, and I am using my limited knowledge of R and linear algebra to create a ranking index for xc teams.
I get my data from milesplit.com, but I am unsure if I am formatting this data properly. So far I have created a matrix for each race, with odd columns containing runner scores and even columns containing times, so that each team has a team_score and a team_time column. I want to analyze team growth as a time series, but I have two questions about this:
(1) Can I combine all of these "race matrices" into a time series? Can I assign all the data in a race matrix a certain date and then make one big time series including all 25 race matrices I made?
(2) Am I closing myself off from insights by not including the name and grade of each runner (I only record time and score)? If so, how can I build a structure that contains all of this information? (A sketch follows below.)
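One common alternative to wide matrices is a long data frame with one row per runner per race; stacking these tables and tagging each with its race date gives the time series directly. A sketch, with all column names and values hypothetical:

    # one row per runner per race; keeping name and grade costs nothing
    race1 <- data.frame(date  = as.Date("2023-09-16"),
                        team  = c("Central", "Central", "North"),
                        name  = c("A. Smith", "B. Jones", "C. Lee"),
                        grade = c(11, 12, 10),
                        time  = c(16.9, 17.2, 17.5),   # minutes
                        score = c(1, 2, 3))

    # stack all 25 race tables, then aggregate to team level per date
    all_races   <- do.call(rbind, list(race1))   # add race2, race3, ...
    team_totals <- aggregate(cbind(time, score) ~ team + date,
                             data = all_races, FUN = sum)

This also speaks to (2): in long format, dropping name and grade later is trivial, whereas retrofitting them into an odd/even column matrix is not.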
So I have a rather large dataset of stock returns. Specifically, it contains 4 columns: the stock name, date, return, and lagged return. I would like to somehow illustrate how many of the stocks are autocorrelated with the 1-lagged return. It is especially difficult as my data contains about 20k stocks and 227k observations. Any suggestions? :)
Thanks in advance!
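With that many stocks, one tractable approach is to compute the lag-1 correlation per stock and then summarize across stocks. A sketch with dplyr, assuming a data frame returns_df with columns stock, return, and lag_return (hypothetical names):

    library(dplyr)

    # per-stock correlation between return and its first lag, flagged as
    # notable using the rough 95% band |r| > 2 / sqrt(n)
    ac <- returns_df %>%
      group_by(stock) %>%
      summarise(n   = sum(complete.cases(return, lag_return)),
                rho = cor(return, lag_return, use = "complete.obs")) %>%
      mutate(significant = abs(rho) > 2 / sqrt(n))

    sum(ac$significant)   # number of stocks showing lag-1 autocorrelation
    hist(ac$rho)          # distribution of the correlations across stocks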
The dataset contains over 1 million rows, with columns for the monthly return, the date, and the securities ID; so it is monthly data for about 20,000 funds.
The problem is that I cannot get the Sharpe ratio calculation to work with the PerformanceAnalytics library, at least not with the data as given. How would you approach this calculation? I would really appreciate your input, as I am new to the world of R and happy to learn. Thanks in advance!
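PerformanceAnalytics generally expects returns in wide, time-indexed form (one column per asset), so the long table needs reshaping first. A sketch, assuming the columns are named date, fund_id, and return (hypothetical names):

    library(tidyr)
    library(xts)
    library(PerformanceAnalytics)

    # long -> wide: one row per month, one column per fund
    wide <- pivot_wider(monthly_df, id_cols = date,
                        names_from = fund_id, values_from = return)
    returns_xts <- xts(as.matrix(wide[, -1]), order.by = as.Date(wide$date))

    # annualized Sharpe ratio per fund; scale = 12 because the data are monthly
    SharpeRatio.annualized(returns_xts, Rf = 0, scale = 12)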
I have a panel data set with return, ESG score, and market value for a number of companies over 11 years. I need to extract all variables for one year at a time in order to build yearly portfolios.
How can I extract one year at a time and then construct portfolios of high and low ESG score for each year?
Thanks in advance.
Have you considered processing the data with Python and Pandas instead of R? The following solution should help to slice your data into different time intervals:
Slice JSON File into Different Time Intercepts with Python
In terms of sorting ESG scores, you can use the following command: df.sort_values('ESG')
Hope that helps and good luck with your dataset.
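Since the question asks about R, here is a rough R equivalent: split the panel by year, then cut each year on ESG score. A sketch, assuming a data frame panel with columns year, ESG, and the other variables, and an arbitrary 30% cut-off (all names and the cut-off are hypothetical):

    # one data frame per year
    by_year <- split(panel, panel$year)

    # within each year, sort by ESG and take the top/bottom 30% as portfolios
    portfolios <- lapply(by_year, function(d) {
      d <- d[order(-d$ESG), ]
      k <- ceiling(0.30 * nrow(d))   # portfolio size: an arbitrary choice
      list(high_esg = head(d, k), low_esg = tail(d, k))
    })

    portfolios[["2015"]]$high_esg    # e.g. the high-ESG portfolio for 2015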
I'm working on a project on identifying the dynamics of sales. Here is what a piece of my database looks like: http://imagizer.imageshack.us/a/img854/1958/zlco.jpg . There are three columns:
Product - the product group
Week - time since the product's launch (in weeks), for the first 26 weeks
Sales_gain - how the product's sales change from week to week
The database contains 3,302 observations = 127 time series.
My aim is to cluster the time series into groups that show different sales dynamics. I used the k-medoids algorithm (after transforming the data with FFT/DWT), but I do not know how to present each cluster, i.e. a group of time series, on a separate plot.
Can somebody tell me how I should do that?
Here is the clustering code:

    library(cluster)   # provides pam()

    # k-medoids with 8 clusters; each series is a column of salesGain,
    # so it is transposed to give pam() one series per row
    clustersalesGain <- pam(t(salesGain), 8)
    nazwy <- as.character(nazwy)                 # series labels
    cbind(nazwy, clustersalesGain$clustering)    # label -> assigned cluster
I would like to present the output on separate plots.
k-medoids returns actual data points as cluster centers.
Just visualize them the same way you visualize your data!
(And if you haven't been visualizing your data, you had better work on that now.)
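For instance, a sketch that puts each cluster on its own panel, plotting the member series in grey with the medoid overlaid; it assumes, as in the question's pam(t(salesGain), 8) call, that the series are the columns of salesGain:

    cl <- clustersalesGain$clustering       # cluster id of each series
    par(mfrow = c(2, 4))                    # 8 clusters -> 2 x 4 grid of panels
    for (k in 1:8) {
      members <- salesGain[, cl == k, drop = FALSE]
      matplot(members, type = "l", lty = 1, col = "grey",
              main = paste("Cluster", k), xlab = "Week", ylab = "Sales gain")
      lines(clustersalesGain$medoids[k, ], col = "black", lwd = 2)   # medoid
    }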