I need to routinely call Bing News Search results via its API, checking for fresh stories matching a given search query.
I only want to return stories newly published since the last time I called the API.
For example, an hourly call to the API should constrain the search to stories published between one hour ago and now (i.e. stories published within the last hour).
Here is documentation for Bing News Search API - https://learn.microsoft.com/en-us/rest/api/cognitiveservices/bing-news-api-v7-reference
It documents a parameter, "since", which takes a Unix epoch time. I will always be able to programmatically generate the epoch time for the start of the period.
Documentation states:
The Unix epoch time (Unix timestamp) that Bing uses to select the trending topics. Bing returns trending topics that it discovered on or after the specified date and time, not the date the topic was published.
If I want to return stories starting from June 22, the epoch time for Friday, June 22, 2018 12:39:51 PM GMT is 1529671191.
This should allow me to generate the API query URL https://api.cognitive.microsoft.com/bing/v7.0/news/search?q=%22Cardiff%22&since=1529671191000&count=100&sortBy=Date&textDecorations=true&textFormat=HTML with these parameters (a sketch of how I build this programmatically follows the list):
q="Cardiff"
since=1529671191000
count=100 (maximum)
sortBy=Date
textDecorations=true
textFormat=HTML
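For reference, a minimal Python sketch of how I generate the start-of-period epoch time and build that URL (the docs say "since" takes Unix epoch time):

from datetime import datetime, timezone
from urllib.parse import urlencode

# Epoch seconds for Friday, June 22, 2018 12:39:51 PM GMT
since = int(datetime(2018, 6, 22, 12, 39, 51, tzinfo=timezone.utc).timestamp())  # 1529671191

params = {
    "q": '"Cardiff"',
    "since": since,
    "count": 100,
    "sortBy": "Date",
    "textDecorations": "true",
    "textFormat": "HTML",
}
url = "https://api.cognitive.microsoft.com/bing/v7.0/news/search?" + urlencode(params)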
However, when that call is performed, the earliest "datePublished" field for a returned story object is "2018-06-20T23:18:00.0000000Z" (i.e. June 20), which is clearly two days before the "since" parameter that I specified.
This is curious, and frustrating. The alternative constraint parameter, "freshness", when specified as "Day", does seem to successfully constrain the search period to the last 24 hours, but that is not granular enough. "since" appears to do nothing at all.
Is "since" only intended to be used to return Bing News' "Trending Topics" story lists, and not results of news search queries? The documentation language may be ambiguous.
If this is the case, how can I constrain the start/"since" date for my search through the API, other than with "freshness"?
I think the answer is in your question itself:
You say:
However, when that call is performed, the earliest "datePublished" field for a returned story object is "2018-06-20T23:18:00.0000000Z" (i.e. June 20), which is clearly two days before the "since" parameter that I specified.
But just before, you are quoting this from the documentation:
The Unix epoch time (Unix timestamp) that Bing uses to select the
trending topics. Bing returns trending topics that it discovered on or
after the specified date and time, not the date the topic was
published.
So the story was probably discovered after your since value; you cannot compare it with the datePublished field.
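If you need a hard cutoff on publication date, one workaround is to over-fetch and filter client-side on datePublished. A minimal sketch (assuming the usual response shape, where articles sit in a "value" array):

from datetime import datetime, timezone

def published_since(articles, since_epoch):
    """Keep only articles whose datePublished is on or after the cutoff."""
    cutoff = datetime.fromtimestamp(since_epoch, tz=timezone.utc)
    fresh = []
    for article in articles:
        # datePublished looks like "2018-06-20T23:18:00.0000000Z";
        # the first 19 characters are a plain UTC timestamp
        stamp = article["datePublished"][:19]
        published = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
        if published >= cutoff:
            fresh.append(article)
    return fresh

# e.g. fresh = published_since(response["value"], 1529671191)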
I'm getting campaign/ad group reports from the Sponsored Brands/Sponsored Products Amazon Advertising API. When I select a reportDate older than 60 days, I get the error "Report date is too far in the past. Reports are only available for 60 days." (code 406). Is it really not possible to get older reports? Or do older reports need to be queried differently? Also, is it possible to get a report for a time period longer than one day in one request?
The documentation says of the "reportDate" parameter that it is "The date for which to retrieve the performance report in YYYYMMDD format. The time zone is specified by the profile used to request the report. If this date is today, then the performance report may contain partial information. Reports are not available for data older than 60 days." - but does that apply to all reports, always?
It seems strange to me, as other services normally offer more than two months of stats, and there is also a note in the documentation that "Note: New-to-brand metrics are calculated from November 1, 2018. If a report date is requested earlier than this date, the metrics will be calculated from November 1, 2018."
Thank you for the explanation and your help!
Ela
No, you cannot get data older than 60 days through the API. The data does exist within Amazon's databases, but cannot be accessed via API.
If you have a vendor manager contact or something similar, it's theoretically possible to request this data from them, but they'd probably only do it as a one-off for a large client.
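As for covering a period longer than one day, the usual workaround is one request per day across the available window. A rough sketch (request_report is a placeholder for however you actually call the Advertising API):

from datetime import date, timedelta

def report_dates(days_back=60):
    """Yield each reportDate in YYYYMMDD format within the 60-day window."""
    today = date.today()
    for offset in range(days_back):
        yield (today - timedelta(days=offset)).strftime("%Y%m%d")

# for report_date in report_dates():
#     request_report(report_date)  # placeholder: your actual report request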
Can someone give a little clarification on how to interpret the following parameter:
deletionRequestTime: datetime
"This marks the point in time up to which all user data for the specified end user and Google Analytics property or Firebase project should be deleted."
If I set it to 1st Jan 2018 (GMT), does it delete all user data:
from that date until today (which is how I interpret it), meaning all 2018 data will be gone?
or, from epoch time until that date, meaning all 2016/2017 etc. data is gone and all that remains is the 2018 data?
After trying the API and refreshing the User Explorer report in the GA interface, I notice all-time data seems to be gone (giving me the impression that this field is not respected?). But let me wait 72 hours after the API request before drawing any conclusions.
Thanks for any clarification.
Cheers!
First off, I don't think you're misinterpreting it; I don't think the documentation is clear.
The following is from userDeletionRequest
deletionRequestTime datetime
This marks the point in time up to which all user data for the specified end user and Google Analytics property or Firebase project should be deleted.
Now, to me, that means it's a point in time at which the data should be deleted. As in one day? One minute? A timestamp? Would this then mean you need to loop through every hour and minute in a day to delete everything?
My current answer is that this is confusing. I am going to contact the team for clarification; they are on the US West Coast, so we won't get an answer back for several hours. I will update this when I know more.
Clarification from Google
As per documentation, deletionRequestTime represents a timestamp up to which all user data will be deleted. In other words, all data from the beginning of time until the point returned in deletionRequestTime will be deleted.
I don't believe you can set the deletionRequestTime field. It is set to the time you make the call.
I believe that is why you are seeing this behavior.
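For illustration, a minimal sketch with the google-api-python-client (assuming an authorized Analytics v3 service object named analytics; the IDs are placeholders). Note that deletionRequestTime comes back filled in by the API:

body = {
    "kind": "analytics#userDeletionRequest",
    "id": {"type": "CLIENT_ID", "userId": "555.12345"},  # placeholder client ID
    "webPropertyId": "UA-XXXXXX-Y",                      # placeholder property
}
response = analytics.userDeletion().userDeletionRequest().upsert(body=body).execute()

# Set by the API: everything up to this timestamp is scheduled for deletion.
print(response["deletionRequestTime"])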
I wish to extract (via the Analytics Core Reporting API) all the transactions made TODAY by users that had a specific ga:eventCategory a few weeks ago.
I'm looking to see the date of a transaction and all dates of the events that are related to that transaction.
If GA were SQL, I would join by the GA user and take as dimensions both his transaction dates and the dates of his events...
Thanks.
Noam.
As I have indicated in my comment, you can segment the data to include only those users who have the specific event. Segmentation works fine with the Core Reporting API.
Your segment definition would look like this:
users::condition::ga:eventCategory==[myEventCategory]
(where obviously the thing in [brackets] is a placeholder that needs to be substituted with the event category name). The "users::" prefix means you are segmenting by user scope (as opposed to sessions), so this will include all sessions in the selected timeframe for users who had the event in at least one of their sessions (even if the event was outside the selected timeframe).
Select transactionId as a dimension along with some metric (revenue) and today's date, and you are done. Or you would be done if this were actually going to work, but there are at least two caveats:
Google Analytics does not work in real time, so it's unlikely that TODAY's transactions are fully available (Google says it takes 24 hours until the data is processed - actually it might happen faster, but you cannot rely on it).
If a user has deleted his or her cookie, she won't be recognized as a returning user and GA will be unable to segment her out. The longer the interval between the event and the transaction, the less likely it is that the GA cookie is still present.
So even with a technically correct query it might be that you won't get the data you need.
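Putting it together, a query against the v3 Core Reporting API might look like this (a sketch using the google-api-python-client, assuming an authorized service object and placeholder view ID and event category):

response = analytics.data().ga().get(
    ids="ga:XXXXXXXX",  # placeholder view (profile) ID
    start_date="today",
    end_date="today",
    metrics="ga:transactionRevenue",
    dimensions="ga:transactionId",
    segment="users::condition::ga:eventCategory==myEventCategory",
).execute()

for row in response.get("rows", []):
    print(row)  # [transactionId, revenue]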
When getting information from Twitter's API for a user, they provide two fields related to the user's time zone:
utc_offset: -14400,
time_zone: "Indiana (East)"
Unfortunately, this doesn't tell the full story, because I don't know whether that UTC offset was calculated during standard time or daylight saving time. After dividing by 3600 seconds, I get -4 hours, which is valid during the summer months, but in the winter the correct value would be -5 hours.
If the value were ALWAYS determined by the daylight saving time value, then I could write an algorithm for that; however, after some searching on the subject, I've seen several pasted outputs that contradict that assumption. (As a quick example, this question shows his/her offset as -21600, and then he/she says he/she is on Central time, which, if calculated during daylight saving time, would be -18000.)
It would make sense to me that the value would be calculated as of Jan 1, and the several pasted outputs I've found online fall into that category, but my own Twitter account shows the values listed above, for which this assumption is invalid. My next thought was that maybe it was calculated at the time I created my account, but that seems erroneous as well, because I can change my time zone at any later point (and even so, I created my account in November, when I would have been on standard and not daylight time!).
My last thought was that maybe the value is being calculated from the date of the API request. This makes a lot of sense, and the Twitter accounts I own all seem to validate this. BUT, the SO question I linked to earlier shows that the person answered the question on June 2nd, which is daylight saving time, and his/her value of -21600 reflects standard time for the Central time zone.
Anyone out there solve this problem? Thanks so much!
Twitter's front end uses Ruby on Rails. If you go to your own Twitter account settings and look at the possible options for time zones (view source on the dropdown list), you will find that they match up with those provided by ActiveSupport::TimeZone, shown in this documentation. Although there appear to be some zones understood by Rails that Twitter has omitted, all of the Twitter zone key names are in that list.
I have asked Twitter to use standard time zone names in the future, in this developer request.
Why does Rails limit this list and use their own key values? Who knows. I have asked before, and gotten very little response. Read here.
But you can certainly use their mapping dictionary to turn the time_zone value into a standard IANA time zone identifier. For example:
"Indiana (East)" => "America/Indiana/Indianapolis"
"Central Time (US & Canada)" => "America/Chicago"
This can be found in the Rails documentation, and in the source code. (Scroll down to MAPPING.)
Then you can use any standard IANA/Olson/TZDB implementation you wish. They exist for just about every language and platform. For further details, see the timezone tag wiki. If you need help with a specific implementation, you'll need to expand your question to tell us what language you are using and what you have tried so far. (Or consider asking a new question about just that part of it.)
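For example, in Python (a sketch hard-coding two entries from the Rails MAPPING dictionary; zoneinfo needs Python 3.9+):

from datetime import datetime
from zoneinfo import ZoneInfo

# Two entries copied from the Rails MAPPING dictionary
RAILS_TO_IANA = {
    "Indiana (East)": "America/Indiana/Indianapolis",
    "Central Time (US & Canada)": "America/Chicago",
}

tz = ZoneInfo(RAILS_TO_IANA["Indiana (East)"])

# utcoffset() accounts for DST at the moment you ask about
summer = datetime(2018, 6, 22, tzinfo=tz).utcoffset().total_seconds() / 3600
winter = datetime(2018, 1, 1, tzinfo=tz).utcoffset().total_seconds() / 3600
print(summer, winter)  # -4.0 -5.0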
In regards to the utc_offset field, twitter does not make it clear what basis they use to calculate it. My guess is that it is the user's current offset, based on the time that you call the API.
Update 1
I have added support for converting Rails time zone names to both IANA and Windows standard time zone identifiers in my TimeZoneConverter library for .NET. If you are using .NET, you can use this library to simplify your conversions and stay on top of updates more easily.
Update 2
Twitter's API now returns the time zone in this format:
"time_zone": {
"name": "Pacific Time (US & Canada)",
"tzinfo_name": "America/Los_Angeles",
"utc_offset": -28800
},
Use the tzinfo_name field. Done. :)
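For instance, with the payload above (variable name is hypothetical):

from zoneinfo import ZoneInfo

payload = {"time_zone": {"name": "Pacific Time (US & Canada)",
                         "tzinfo_name": "America/Los_Angeles",
                         "utc_offset": -28800}}
tz = ZoneInfo(payload["time_zone"]["tzinfo_name"])  # America/Los_Angeles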
I have the feeling that in every RSS .xml file, the pubDate and the lastBuildDate match.
I am sure that this is not always true...
So firstly, what is the difference between the two?
Secondly, do RSS readers sort the content by date based on the pubDate or the lastBuildDate?
pubDate:
The original publication date for the channel or item. (optional)
lastBuildDate:
The most recent time the content of the channel was modified. (optional)
Here are some docs for the optional items in the RSS 2.0 spec.
Answers here are all over the place. Some people are getting confused by the fact that item has a pubDate as well. I believe the OP is specifically asking about the difference between lastBuildDate and pubDate at the channel level.
From the best of my understanding of the RSS spec, which is notorious for ambiguous explanations, lastBuildDate would be the last time the feed was created. For example, if you cache a copy of it on your server for some period of time, lastBuildDate would be the time that cached copy was created.
pubDate, on the other hand, seems to be basically the last time any actual content within the feed changed. For the most part, it's pretty much going to be the latest pubDate value from the items in the feed, since, generally, the feed content only changes when some new item gets published. However, it could also be a date when you made some change to the channel itself, such as changing the channel title, description, etc.
lastBuildDate specifies the last date/time the entry was modified. pubDate specifies the actual publication date/time.
The reason you see these as generally the same is because by the time you get the RSS feed, there hasn't been any edit to the article.
I can't find the RSS spec on this unfortunately, but I am pretty positive that's what they are.
By the RSS 2.0 specification, it seems they are roughly equivalent:
lastBuildDate:
The last time the content of the channel changed.
pubDate:
The publication date for the content in the channel. ...
The difference is subtle: they tell us about the method that was used. In the case of <pubDate>, the channel is published manually or at a fixed interval. In the case of <lastBuildDate>, the channel is built automatically as each new article is added on the website, adding it as a new item.
While the other answers here do provide some good information, I feel the need to elaborate just a little bit for any future visitors.
pubDate
The publication date for the content in the channel. For example, the New York Times publishes on a daily basis, so the publication date flips once every 24 hours. That's when the pubDate of the channel changes.
lastBuildDate
The last time the content of the channel changed.
So, taking the New York Times as an example again, the <pubDate> is the date the feed was published while the <lastBuildDate> would be the date the content inside the feed changed. In the end, I would view the <pubDate> as the date the feed is published and the <lastBuildDate> as the date any content in the feed was last modified.
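To make the distinction concrete, here is a minimal illustrative channel (the dates are made up):

<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>https://example.com/</link>
    <description>An example channel</description>
    <!-- when the channel's content was published -->
    <pubDate>Fri, 22 Jun 2018 12:00:00 GMT</pubDate>
    <!-- when the feed was last regenerated -->
    <lastBuildDate>Fri, 22 Jun 2018 12:39:51 GMT</lastBuildDate>
    <item>
      <title>A story</title>
      <pubDate>Fri, 22 Jun 2018 11:58:00 GMT</pubDate>
    </item>
  </channel>
</rss>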