Sorting specific values into time buckets - r

I'm dealing with an enormous problem. I need to distribute some values across a given time bucket structure based solely on their expiration dates (if needed, the time to expiration in days, months, or years can be computed by subtracting the issue date from the expiration date).
So basically, what I'm asking for is this. I have these amounts with relative dates:
and this time bucket structure:
Is there a way to automatically distribute the amounts in the first picture into the time structure above using R? Basically, the first element should go in "tra 12 mesi e 2 anni", the second element in "overnight", the third element in "tra 12 mesi e 2 anni", and the last in "tra 3 e 4 mesi".
Thank you all
Davide
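One possible approach, as a rough sketch only: compute the days to expiration and pass them to cut(). The amounts, dates, bucket edges, and every label other than the three quoted in the question are hypothetical, since the original pictures are not available; the edges in particular would need to be replaced with the real structure.

```r
# Hypothetical data: the original amounts and dates were posted as a picture.
positions <- data.frame(
  amount     = c(1000, 250, 4000, 700),
  issue      = as.Date(c("2017-01-01", "2017-01-01", "2017-01-01", "2017-01-01")),
  expiration = as.Date(c("2018-06-30", "2017-01-02", "2018-03-15", "2017-04-20"))
)

# Hypothetical bucket edges in days (assumption: 30 days per month).
breaks <- c(0, 1, 30, 90, 120, 365, 730, Inf)
# Only "overnight", "tra 3 e 4 mesi" and "tra 12 mesi e 2 anni" come from
# the question; the other labels are placeholders.
labels <- c("overnight", "fino a 1 mese", "tra 1 e 3 mesi", "tra 3 e 4 mesi",
            "tra 4 e 12 mesi", "tra 12 mesi e 2 anni", "oltre 2 anni")

# Subtracting two Date vectors gives the difference in days.
days_to_expiry <- as.numeric(positions$expiration - positions$issue)

# Each interval is (lower, upper]; include.lowest also keeps a value of 0.
positions$bucket <- cut(days_to_expiry, breaks = breaks, labels = labels,
                        include.lowest = TRUE)

# Total amount per bucket (NA for buckets that received nothing).
totals <- tapply(positions$amount, positions$bucket, sum)
```

With these made-up dates, the four rows land in "tra 12 mesi e 2 anni", "overnight", "tra 12 mesi e 2 anni", and "tra 3 e 4 mesi", which is the order described in the question.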

Related

ideas for creating a report

Colleagues, I really need your advice on how to create a report with the following format in Cognos Analytics:
I only have the value "amount of money" and the dimensions "date" and "Person", and I need to display in the report the value for a specific date, and the change from the previous date.
For example, on 01.02.2018 Person1 had 50 and on 01.03.2018 Person1 had 61, so field № 3 is equal to 11 (61 - 50).
As you can see, there is no "change" column after the first field, because there is nothing to compare it with.
Do you have any ideas on how to generate such a report?
P.S. The user selects the start date and end date of the report independently in the prompt.
Maybe try creating multiple metrics:
Call the first Day 1 Amount
Call the second Day 2 Amount
Call the third Day 3 Amount
You could even define each metric relative to the others:
Day 1 is based on the date selected
Day 2 is for the prior day
Day 3 is 2 days before, etc.
Build the crosstab slightly differently. Instead of placing the metrics in the middle,
place them side by side.
Then you can run calculations (% difference, growth, etc.) on the fly.
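Outside Cognos, the arithmetic behind the "change" columns can be sketched in R. This is only an illustration of the calculation, not Cognos syntax: the Person2 rows are invented, and the dates are assumed to be in dd.mm.yyyy format.

```r
# Only the Person1 figures (50 and 61) come from the question;
# Person2 is hypothetical.
amounts <- data.frame(
  person = c("Person1", "Person1", "Person2", "Person2"),
  date   = as.Date(c("01.02.2018", "01.03.2018", "01.02.2018", "01.03.2018"),
                   format = "%d.%m.%Y"),  # assumed day.month.year
  amount = c(50, 61, 20, 15)
)

# Persons in rows, dates side by side in columns (ISO dates sort correctly).
wide <- tapply(amounts$amount,
               list(amounts$person, format(amounts$date)), sum)

# Change from the previous date: each column minus the one before it.
# The first date has no change column because there is nothing to compare.
change <- wide[, -1, drop = FALSE] - wide[, -ncol(wide), drop = FALSE]
```

Here change["Person1", "2018-03-01"] is 11, matching the 61 - 50 example.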

Creating a Time Series with Half Hourly Data in R

This is my first time ever asking a question on Stack Overflow and I'm a programming novice so any advice as to how to improve my question asking abilities would be appreciated.
Onto my question: I have two csv files, one containing three columns (date time in dd/mm/yyyy hh:(00 or 30) format, production of a certain product, and demand for said product), and the other containing several columns (decomposition of the date time into year, month, day, hour, and whether it is :00 or :30 represented by 1 or 2 respectively, alongside several columns for independent variables which may affect production/demand of said product).
I've only played around with the first csv file, converting the string into a datetime object but the ts() function won't recognise the datetime objects as my times. I've tried adjusting the frequency parameter but ultimately failed and have no idea how to create a time series using half hourly data. Would appreciate any help.
Thanks in advance!
My suggestion is to apply difftime to all of your time data. For instance, as in the following code, use your initial time (the time of the first record) as time_start for every comparison and each of the other times as time_finish. It returns the time intervals as a number of seconds, which you can then use as the time index for the values in your other columns.
interval <- as.integer(difftime(strptime(time_finish, "%H:%M"), strptime(time_start, "%H:%M"), units = "secs"))
Second 0 10 15 ....
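To make this concrete, here is a minimal sketch using the dd/mm/yyyy hh:mm format from the question. The sample values and the variable names are made up, and choosing one day as the seasonal period (48 half-hours) is an assumption; a weekly period would use 336 instead.

```r
# Hypothetical sample in the question's dd/mm/yyyy hh:mm format.
stamps <- c("01/01/2020 00:00", "01/01/2020 00:30",
            "01/01/2020 01:00", "01/01/2020 01:30")
production <- c(10.2, 11.5, 9.8, 10.9)

# Parse the strings; a fixed time zone avoids daylight-saving gaps.
times <- as.POSIXct(stamps, format = "%d/%m/%Y %H:%M", tz = "UTC")

# Seconds elapsed since the first record, as suggested above.
elapsed <- as.integer(difftime(times, times[1], units = "secs"))

# ts() does not take timestamps; it only needs equally spaced values
# and a frequency, so check the spacing before building the series.
stopifnot(all(diff(elapsed) == 1800))

# Assumption: one day is the seasonal period, i.e. 48 observations.
production_ts <- ts(production, frequency = 48)
```

The datetime column is then only used to verify that the observations are regularly spaced and in order; the series itself carries no calendar dates.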

RFID tags moving through a factory: How to expand timestamp data by minute?

Issue:
I have about 9,000 rows of data that show XYZ data for about 10 tags.
All tags ping at different rates. Some at 1 second, some at 7 seconds, some every few hours, and so on.
I want to show how the tagged items move through a factory. Preferably with an animated GIF...
Desired data set:
For each of the 10 tags, on a one minute interval, where is it in the factory? Some tags will have multiple records per minute. Some records will have hours since the last ping.
Is there a way to do this in R? I can brute force it with Excel (shudder), but I'm hoping someone can set me on the right path, even if it's just the proper search term to send me on my journey. Thanks!
Without an example data set to look at, there is a limit to what kind of answer I can provide.
However, the first thing that you want to do is get all of your observations onto a unified timeline. That is, each row should be a timestamp and each column should be the x, y, or z coordinate for one of your tags. Without seeing the data, I can't help you do this, but here are the types of things you need to think about. If tag A has measurements at timestamps 1, 2, 3, 4, 5, and 6 and tag B has measurements at timestamps 1 and 6, what value should tag B have at timestamps 2, 3, 4, and 5? You could simply copy the value from timestamp 1, or you could do some type of interpolation between timestamp 1 and timestamp 6. If you don't know what I am talking about here, let me know.
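As a rough sketch of both options for a single tag (the pings, the grid, and all names here are hypothetical, since no sample data was posted): findInterval() picks the last ping at or before each minute, and approx() interpolates linearly between pings.

```r
# Hypothetical pings for one tag; real data would have one table per tag.
pings <- data.frame(
  time = as.POSIXct(c("2020-01-01 08:00:05", "2020-01-01 08:00:40",
                      "2020-01-01 08:03:10"), tz = "UTC"),
  x = c(1, 2, 5),
  y = c(1, 1, 4)
)
pings <- pings[order(pings$time), ]

# The unified timeline: one row per minute.
grid <- seq(as.POSIXct("2020-01-01 08:01:00", tz = "UTC"),
            by = "1 min", length.out = 4)

# Option 1: copy the most recent ping (last observation carried forward).
idx <- findInterval(as.numeric(grid), as.numeric(pings$time))
idx[idx == 0] <- NA  # minutes before the first ping have no position yet
locf <- data.frame(time = grid, x = pings$x[idx], y = pings$y[idx])

# Option 2: linear interpolation between pings; rule = 2 holds the
# nearest value for minutes outside the observed range.
interp_x <- approx(as.numeric(pings$time), pings$x,
                   xout = as.numeric(grid), rule = 2)$y
```

Repeating this per tag and binding the results by the shared grid gives the one-row-per-minute layout described above.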
Second, there may be a package for making a GIF in R. However, I have taken advantage of the R function Sys.sleep instead. Here is what I would suggest. Once you have the unified timeline for your dataset, write a for loop over its rows. At the top of each iteration, call Sys.sleep(0.25). This will make R "sleep" for 0.25 seconds. Then plot the current row of the dataset for all of your tags. This creates a really cool video that you could record with some other tool.
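A minimal sketch of that loop, assuming a tiny made-up timeline with two tags named A and B (the column names, axis limits, and the animate_tags helper are all hypothetical):

```r
# Hypothetical unified timeline: one row per minute, x/y columns per tag.
timeline <- data.frame(
  minute = 1:3,
  A_x = c(1, 2, 3), A_y = c(1, 1, 2),
  B_x = c(5, 5, 4), B_y = c(3, 2, 2)
)

# Hypothetical helper: redraw all tags once per row, pausing in between.
animate_tags <- function(timeline, delay = 0.25) {
  for (i in seq_len(nrow(timeline))) {
    Sys.sleep(delay)  # pause so each frame stays visible
    xs <- c(timeline$A_x[i], timeline$B_x[i])
    ys <- c(timeline$A_y[i], timeline$B_y[i])
    # Fixed axis limits keep the factory floor from rescaling per frame.
    plot(xs, ys, xlim = c(0, 6), ylim = c(0, 4), pch = 19,
         col = c("red", "blue"), xlab = "x", ylab = "y",
         main = paste("Minute", timeline$minute[i]))
  }
  invisible(nrow(timeline))  # number of frames drawn
}
```

Calling animate_tags(timeline) in an interactive session redraws the plot window once per row.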

R - Cluster x number of events within y time period

I have a dataset of 59k entries recorded over 63 years. I need to identify clusters of events, with the criterion being:
6 or more events within 6 hours
Each event has a unique ID, a time (HH:MM:SS), and a date (DD:MM:YY). The output would ideally have a cluster ID, the events that took place within each cluster, and the start and finish time and date.
Thinking about the problem in R we would need to look at every date/time and count the number of events in the following 6 hours, if the number is 6 or greater save the event IDs, if not move onto the next date and perform the same task. I have taken a data extract that just contains EventID, Date, Time and Year.
https://dl.dropboxusercontent.com/u/16400709/StackOverflow/DataStack.csv
If I come up with anything in the meantime I will post below.
Update: Having taken a break to think about the problem I have a new approach.
Add 6 hours to the Date/Time of each event, then count the number of events that fall between the start and end times. If there are 6 or more, take the event IDs and assign them a cluster ID. Then move on to the next event and repeat 59k times as a loop.
Don't use clustering. It's the wrong tool. And the wrong term. You are not looking for abstract "clusters", but something much simpler and much more well defined. In particular, your data is 1 dimensional, which makes things a lot easier than the multivariate case omnipresent in clustering.
Instead, sort your data and use a sliding window.
If your data is sorted, and time[x+5] - time[x] < 6 hours, then the six events x through x+5 satisfy your condition.
Sorting is O(n log n), but highly optimized. The remainder is O(n) in a single pass over your data. This will beat every single clustering algorithm, because they don't exploit your data characteristics.
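A minimal sketch of the sliding window in R, under some assumptions: the sample events are invented (the linked CSV is not reproduced here), the column names are hypothetical, and overlapping qualifying windows are merged into one cluster, which is one possible reading of the requirement.

```r
# Hypothetical events; real data would combine the Date and Time columns.
events <- data.frame(
  id = 1:9,
  time = as.POSIXct(c("2000-01-01 00:00:00", "2000-01-01 01:00:00",
                      "2000-01-01 02:00:00", "2000-01-01 03:00:00",
                      "2000-01-01 04:00:00", "2000-01-01 05:00:00",
                      "2000-01-01 05:30:00", "2000-01-02 12:00:00",
                      "2000-01-03 12:00:00"), tz = "UTC")
)

k <- 6L                    # events required
window_secs <- 6 * 3600    # within 6 hours

events <- events[order(events$time), ]   # sort first: O(n log n)
secs <- as.numeric(events$time)
n <- nrow(events)

# Window starting at x qualifies if time[x + 5] - time[x] < 6 hours.
starts <- seq_len(max(n - k + 1L, 0L))
hit <- (secs[starts + k - 1L] - secs[starts]) < window_secs

# Single pass: overlapping qualifying windows share one cluster ID.
cluster_id <- rep(NA_integer_, n)
current <- 0L
last_end <- 0L
for (s in starts[hit]) {
  if (s > last_end) current <- current + 1L  # no overlap: new cluster
  last_end <- s + k - 1L
  cluster_id[s:last_end] <- current
}
events$cluster <- cluster_id
```

In this sample the first seven events fall into cluster 1 and the last two stay unassigned. The start and finish of each cluster can then be read off with range() on the times within each cluster ID.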

How to count the selected children of a hierarchy

I'm in the following situation.
I have a cube that holds data aggregated by day.
This cube has a dimension with a hierarchy (year -> month -> day).
In this hierarchy, every day of the last 4 months is visible.
My problem is counting the selected days, because I need to do calculations based on the days observed (in a pivot).
Example:
There is data for the day 11/11/2013 (amount orders), but in the hierarchy I select the values from 10.11.2013 to 20.11.2013.
How do I calculate (amount orders) / 11?
11 is the number of days from the 10th to the 20th, inclusive.
See my answer at MDX average fact items count by time dimension.
Having the day count as a measure in the cube would automatically apply all filters etc. without you having to take care of it.
You can use the Count() function to calculate the number of members in a set. Your set can consist of a range of dates if you use a colon between start and end date. Something like this:
Count({[Date].[Day].&[2013]&[11]&[10] : [Date].[Day].&[2013]&[11]&[20]})

Resources