Colleagues, I really need your advice on how to create a report with the following format in Cognos Analytics:
I have only the measure "amount of money" and the dimensions "date" and "Person", and I need the report to display the value for a specific date and the change from the previous date.
For example, on 01.02.2018 Person1 had 50, and on 01.03.2018 Person1 had 61, so field № 3 equals 11 (61 - 50).
As you can see, there is no "change" column after the first field, because there is nothing to compare it with.
Do you have any ideas on how to generate such a report?
P.S. The user selects the report's start and end dates independently in a prompt.
Maybe try creating multiple metrics:
Call the first Day 1 Amount
Call the second Day 2 Amount
Call the third Day 3 Amount
You could even define each metric relative to the other
Day 1 is based on the date selected
Day 2 is for the prior day
Day 3 is 2 days before... etc
Build the crosstab slightly differently: instead of placing the metric in the middle, place the metrics side by side.
Then you can run calculations (% difference, growth, etc.) on the fly.
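For what it's worth, the logic described here (amounts side by side per date, with the change computed between columns) can be sketched outside Cognos; a minimal pandas illustration, where everything except the 50 and 61 from the question is invented:

import pandas as pd

# Hypothetical source data: one amount per person per date.
df = pd.DataFrame({
    'person': ['Person1', 'Person1', 'Person2', 'Person2'],
    'date': ['2018-02-01', '2018-03-01', '2018-02-01', '2018-03-01'],
    'amount': [50, 61, 30, 27],
})

# Put the dates side by side (the crosstab layout suggested above)...
wide = df.pivot(index='person', columns='date', values='amount')

# ...then compute the change between consecutive dates on the fly.
# The first column has nothing to compare with, so its change is NaN,
# matching the missing "change" column after the first field.
change = wide.diff(axis=1).add_suffix(' change')
print(wide.join(change))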
I need to extract Facebook country likes for a number of brands. My problem is that I don't even know where to start: I have spent the past 4 hours searching, but to be honest I'm not even sure what to search for. I can get the data, but I am struggling to convert it into a usable format in R for time series analysis.
Any assistance would be gratefully accepted.
The data I'm retrieving via the Facebook Graph API for likes by country (for Coca-Cola) is as follows:
[1] "{\"data\":[{\"id\":\"1542920115951985\/insights\/page_fans_country\/lifetime\",\"name\":\"page_fans_country\",\"period\":\"lifetime\",\"values\":[{\"value\":{\"BR\":17270087,\"US\":13567311,\"MX\":5674950,\"AR\":3616300,\"FR\":3409959,\"IN\":2949669,\"GB\":2670260,\"TH\":2657306,\"IT\":2401621,\"CO\":1946677,\"ID\":1921076,\"EG\":1805233,\"PK\":1665707,\"PH\":1614358,\"TR\":1607936,\"CL\":1504917,\"VN\":1384143,\"DE\":1312448,\"PL\":1201112,\"VE\":1084783,\"CA\":990114,\"RO\":932538,\"EC\":856116,\"PE\":815942,\"ES\":790320,\"AU\":759775,\"MA\":578003,\"TN\":515510,\"RS\":476986,\"NG\":476934,\"PT\":469059,\"MY\":435316,\"BE\":431930,\"ZA\":431509,\"IQ\":354145,\"SE\":352331,\"KE\":342997,\"GR\":333749,\"HU\":333281,\"NL\":330307,\"GT\":326328,\"CR\":304006,\"DZ\":300497,\"PR\":287430,\"DO\":278847},\"end_time\":\"2015-01-01T08:00:00+0000\"},{\"value\":{\"BR\":17270151,\"US\":13566624,\"MX\":5675012,\"AR\":3618242,\"FR\":3409837,\"IN\":2949969,\"GB\":2669934,\"TH\":2658044,\"IT\":2401726,\"CO\":1946797,\"ID\":1921156,\"EG\":1805337,\"PK\":1665824,\"PH\":1614402,\"TR\":1608104,\"CL\":1504979,\"VN\":1384782,\"DE\":1312138,\"PL\":1201212,\"VE\":1084776,\"CA\":990093,\"RO\":932788,\"EC\":856129,\"PE\":816002,\"ES\":790385,\"AU\":759775,\"MA\":578080,\"TN\":518210,\"RS\":477264,\"NG\":476965,\"PT\":469177,\"MY\":435296,\"ZA\":433741,\"BE\":431908,\"IQ\":364228,\"SE\":352267,\"KE\":343007,\"GR\":333771,\"HU\":333312,\"NL\":330232,\"GT\":326513,\"CR\":304021,\"DZ\":300587,\"PR\":287432,\"DO\":278892},\"end_time\":\"2015-01-02T08:00:00+0000\"}],\"title\":\"Lifetime Likes by Country\",\"description\":\"Lifetime: Aggregated Facebook location data, sorted by country, about the people who like your Page. (Unique Users)\"}],\"paging\":{\"previous\":\"https:\/\/graph.facebook.com\/cocacola\/insights\/page_fans_country?access_token=EAACEdEose0cBAMLTB1Ufx44l8Q2hT34jxjmVjONPzhqncAvv985cUXOY6Q9FZBLuL3OM8oLXDPTBroD5DY8SS9ZBd1OIhSAMwjrISQRgWh5kkJVu75Ss7aWESIlKrwBLyLt6VYHUEUlUlmCV72TSQGZBkkOeE4OaZA4gvHIZBngZDZD&since=1419897600&until=1420070400\",\"next\":\"https:\/\/graph.facebook.com\/cocacola\/insights\/page_fans_country?access_token=EAACEdEose0cBAMLTB1Ufx44l8Q2hT34jxjmVjONPzhqncAvv985cUXOY6Q9FZBLuL3OM8oLXDPTBroD5DY8SS9ZBd1OIhSAMwjrISQRgWh5kkJVu75Ss7aWESIlKrwBLyLt6VYHUEUlUlmCV72TSQGZBkkOeE4OaZA4gvHIZBngZDZD&since=1420243200&until=1420416000\"}}"
This data is for two days. The data I need to retrieve starts at the first country entry (BR has 17270087 fans for Coca-Cola) and ends at the last one (DO has 278847 fans), plus the date (given by the end_time of 2015-01-01). I then need to repeat the extraction for the second day's block of country entries with the end_time of 2015-01-02. Ideally I also want to capture the Facebook ID at the start of the response (1542920115951985), so I can build a data frame with Facebook ID, Date, Country and Likes in each record.
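For reference, the flattening logic is the same in any language (R's jsonlite::fromJSON parses the same structure); below is a Python sketch run against a trimmed copy of the response above, where the nested loop is the part to translate into R:

import json
import pandas as pd

# A trimmed copy of the response above: two countries, one end_time.
raw = '''{"data":[{"id":"1542920115951985/insights/page_fans_country/lifetime",
"name":"page_fans_country","period":"lifetime",
"values":[{"value":{"BR":17270087,"US":13567311},
"end_time":"2015-01-01T08:00:00+0000"}]}]}'''

rows = []
for item in json.loads(raw)['data']:
    fb_id = item['id'].split('/')[0]       # "1542920115951985"
    for snapshot in item['values']:
        date = snapshot['end_time'][:10]   # "2015-01-01"
        for country, likes in snapshot['value'].items():
            rows.append({'facebook_id': fb_id, 'date': date,
                         'country': country, 'likes': likes})

df = pd.DataFrame(rows)  # one record per Facebook ID / date / country / likes
print(df)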
I am at a standstill with this problem. I outlined it in another question (Creating data histograms/visualizations using ipython and filtering out some values), which meandered a bit, so I'd like to fix the question and give it more context, since I am sure others must have hit this problem or have a workaround for it. I've also seen similar, though not identical, questions asked and can't quite adapt any of the solutions given so far.
I have columns in my data frame for Start Time and End Time, and I created a 'Duration' column for the time elapsed. I'm using IPython.
The Start Time/End Time columns have fields that look like:
2014/03/30 15:45
A date and then a time in hh:mm
When I type:
pd.to_datetime(df['End Time']) and
pd.to_datetime(df['Start Time'])
I get fields that look like:
2014-03-30 15:45:00
the same date but with hyphens, and the same time but with :00 seconds appended
I then decided to create a new column for the difference between the End and Start times. The 'Duration' (time elapsed) column was created with a single command:
df['Duration'] = pd.to_datetime(df['End Time'])-pd.to_datetime(df['Start Time'])
The format of the fields in the Duration column is:
01:14:00
no date, just the time elapsed in hh:mm:ss format, i.e. 74 minutes in the above example.
When I type:
df.Duration.dtype
dtype('m8[ns]') is returned, whereas when I type
df.Duration.head(4)
0 00:14:00
1 00:16:00
2 00:03:00
3 00:09:00
Name: Duration, dtype: timedelta64[ns]
is returned, which seems to indicate a different dtype for Duration.
How can I convert the format I have in the Duration column to a single integer value of minutes elapsed? I see no methods I can use; I'd write a function, but I wouldn't know how to treat the hh:mm:ss input. This must be a common requirement of data analysis. Should I be converting these dates and times differently if my end goal is a single integer of minutes elapsed? Should I just be using Excel? I have so far spent a day on this, and it should be a simple problem to solve.
**update:
THANK YOU!! (Jeff and Dataswede) I added a column with the command:
df['Durationendminusstart'] = pd.to_timedelta(df.Duration, unit='ns').astype('timedelta64[m]')
which seems to give me the Duration (minutes lapsed) as wanted so that huge part is solved!
What is still not clear is why there were two different dtypes for the same column depending on how I asked, but oh well, right now it doesn't matter.**
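(As an aside, 'm8[ns]' is simply NumPy's shorthand for 'timedelta64[ns]', so the two answers actually describe the same dtype.) For reference, the minutes conversion can also be written with total_seconds(); a minimal sketch using the 74-minute example from above:

import pandas as pd

df = pd.DataFrame({'Start Time': ['2014/03/30 15:45'],
                   'End Time': ['2014/03/30 16:59']})
duration = pd.to_datetime(df['End Time']) - pd.to_datetime(df['Start Time'])

# .dt.total_seconds() returns a float Series of seconds; divide for minutes.
minutes = duration.dt.total_seconds() / 60
print(minutes)  # 0    74.0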
I need to save a time interval in a column in a table. Based on http://docs.sqlalchemy.org/en/rel_0_8/core/types.html,
I can use the Interval type for that. My database is SQLite, and I don't quite understand this description in the documentation:
"The Interval type deals with datetime.timedelta objects. In PostgreSQL, the
native INTERVAL type is used; for others, the value is stored as a date which
is relative to the “epoch” (Jan. 1, 1970)."
Can anybody tell me how I should do that?
So, from what I gather from the question, you want to store an interval and take it out of the database to use again, but you want to understand how it is stored?
Concerning the storage: this is probably easier to explain with Unix timestamps than with DateTimes. Suppose you want to store timedelta(1), i.e. a delta of one day. What is stored in the database is the time since the "epoch", i.e. second 0 in Unix timestamps, which as a date is 1970-01-01 00:00:00 (this is where Unix timestamps start counting seconds). If you don't know about the epoch or timestamps, read the Wikipedia article on Unix time.
So we want to store a difference of one day, and the documentation says it stores the time since the epoch. We just learned that the epoch is second 0, so one day later is 60 seconds per minute, 60 minutes per hour, 24 hours per day: 60 * 60 * 24 = 86400. Stored as an integer this would be easy to understand: if you find the value 86400 in your database, it means 1 day, 0 hours, 0 minutes, 0 seconds.
Reality is a bit different: It does not store an integer but a DateTime object. Speaking from this perspective, the epoch is 1970-01-01 00:00:00. So what is a delta of one day since the epoch? That is easy: it's 1970-01-02 00:00:00. You can see, it is a day later.
An hour later? 1970-01-01 01:00:00.
Two days, four hours, 30 seconds? 1970-01-03 04:00:30.
And you could even do it yourself:
from datetime import datetime, timedelta

epoch = datetime.utcfromtimestamp(0)        # 1970-01-01 00:00:00
delta = timedelta(1)                        # one day
one_day = datetime.utcfromtimestamp(86400)  # 1970-01-02 00:00:00
print("Date to be stored in database:", epoch + delta)  # 1970-01-02 00:00:00
print("Timedelta from date:", one_day - epoch)          # 1 day, 0:00:00
As you can see, the calculation is easy, and this is all that is done behind the scenes. Take a look at this fuller example.
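IntervalItem is not defined in the original snippet; the mapping and session setup below are a hypothetical sketch (the table name, column names, and in-memory engine are assumptions) added so the example runs end to end:

from sqlalchemy import create_engine, Column, Integer, Interval
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class IntervalItem(Base):
    __tablename__ = 'interval_item'   # assumed table name
    id = Column(Integer, primary_key=True)
    interval = Column(Interval)       # stored as epoch + delta on SQLite

engine = create_engine('sqlite://')   # in-memory SQLite for this sketch
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()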
interval = IntervalItem(interval=delta)  # delta = timedelta(1) from above
session.add(interval)
session.commit()
i = session.query(IntervalItem).first()
print("Timedelta from database:", i.interval)  # 1 day, 0:00:00
You can see it is no different from the example above, except that it goes through the database. The only thing to keep in mind is this note from the documentation:
Note that the Interval type does not currently provide date arithmetic operations
on platforms which do not support interval types natively.
That means you should be careful how you use it; for example, doing interval arithmetic inside the query might not be a good idea, but you should just play around with it.
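In practice that means pulling the value out first and doing any arithmetic in Python rather than in the SQL expression; continuing the sketch above:

# Safe on SQLite: load the timedelta, then do the math in Python.
i = session.query(IntervalItem).first()
print(i.interval + timedelta(hours=4))  # 1 day, 4:00:00
# By contrast, adding to or filtering on the column inside the query would
# rely on interval arithmetic that SQLite does not support natively.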
Is it possible to create a custom time zone in R for handling datetime objects?
More specifically, I am interested in dealing with POSIXct objects, and would like to create a time zone that corresponds to "US/Eastern" minus 17 hours. Time zones with a similar offset do not follow the same daylight saving time convention as the US.
The reason for using a time zone so defined comes from FX trading, for which 5 pm EST is a reasonable 'midnight'.
When you are concerned about a specific "midnight-like" time for each day, I assume that you want to obtain a date (without a time) that switches over at that moment. If that is your intention, then how about simply subtracting 17 hours (= 17*3600 seconds) from your vector of times and taking the date of the resulting POSIXct value?
That would avoid complicated time zone manipulations, which as far as I know are usually handled not by R itself but by the underlying C library, so they might be difficult to achieve from within R. Instead, all computations would be performed in EST, and you'd still get a switchover time different from the local midnight.
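The same trick, sketched in Python for concreteness (zoneinfo needs Python 3.9+; the 17-hour shift and the "US/Eastern" zone come from the question):

from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

# Hypothetical tick times just before and after the 5 pm EST rollover.
before = datetime(2014, 3, 30, 16, 59, tzinfo=ZoneInfo("US/Eastern"))
after = datetime(2014, 3, 30, 17, 1, tzinfo=ZoneInfo("US/Eastern"))

# Shift back 17 hours, then take the date: the "day" now rolls over at 5 pm.
print((before - timedelta(hours=17)).date())  # 2014-03-29
print((after - timedelta(hours=17)).date())   # 2014-03-30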