RSS Item updates - rss

I'm working on an RSS feed for a custom tasking system we use, and I'm still wrapping my head around how things should work. What I want to have is a feed for each user that shows tasks assigned to them, and additionally a feed for each task that shows updates for the task.
What I want to know right now concerns the user feed. When a case assigned to a user is updated, I currently have code to change the pubDate entry for that item and the lastBuildDate for the channel. I was hoping this would make the item appear as unread in readers so that the user would know to look at the item again, but this seems not to be the case. Should I be changing the guid, even though it's really the same items? What would the side-effects of that be?
Is there anything I'm missing? How can I solve this?

Changing the <pubDate> does indicate that the entry changed, but there is no requirement that a given RSS reader do anything about it. (Strictly speaking, there is no requirement that an RSS reader do anything, but let's remain reasonable.) Some readers do mark updated entries as changed. For example, Bloglines.com can optionally detect changes in the <description> and mark entries as new again in that case.
Depending on your reader, changing the <title>, <description>, or <pubDate> might give you the behavior you want. But as GateKiller mentions above, your safest option is to make it an entirely new entry with a new <guid>. While you're at it, you might want to use it as an opportunity to add a direct link or details about the update.
Of course, if you're writing both the producer and consumer of the RSS, and your goal is that the feed always contains the full set of assigned tasks, just updating the <pubDate> will work just fine.

The solution is to also change the GUID, which means including the updated time in it. The GUID is what makes each item in the feed unique, so if you put the updated date in it, readers will treat the item as a new one and mark it as unread.
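A minimal sketch of that idea in Python, assuming each task has a numeric id and a timezone-aware "last updated" timestamp. The `build_item` helper and the `task-<id>-<timestamp>` GUID format are made up for illustration; any stable scheme that changes whenever the task changes would do.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from xml.sax.saxutils import escape


def build_item(task_id: int, title: str, updated: datetime) -> str:
    """Render one RSS <item> whose guid changes whenever the task is updated.

    `updated` is assumed to be timezone-aware.
    """
    # Hypothetical GUID scheme: task id plus the update time in UTC.
    stamp = updated.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    guid = f"task-{task_id}-{stamp}"
    return (
        "<item>"
        f"<title>{escape(title)}</title>"
        # isPermaLink="false" because the guid is an opaque string, not a URL.
        f'<guid isPermaLink="false">{guid}</guid>'
        # RSS 2.0 dates use the RFC 822 style that format_datetime produces.
        f"<pubDate>{format_datetime(updated)}</pubDate>"
        "</item>"
    )
```

Note the side effect: a reader that keeps old items will show both the earlier and the updated entry, since it has no way to know they describe the same task.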

How to track most used filters on product filter page with GTM and GA4?

I have a custom-built page where users can filter products based on price, category, brand, ...
These are made out of checkboxes and a range input for the price.
I'm trying to figure out what the best way would be to track every action/filter in order to find out which brand / categories are the most popular.
Important to know
The menu contains a submenu for the categories. When the user clicks one of these links the filterpage will have this category checked in the filters.
The page does not reload when applying a filter. I'm using JS to perform a search and show new results. The page url gets updated with the correct search query parameters.
I think I have 2 options:
1. Track click events on the checkboxes and send every change with dataLayer.push.
2. Track the page URL after each filter.
Option 1 is an issue because people might go to the page with some parameters in the URL. This won't be tracked because there was no click event. This issue will also apply to users that click the category in the submenu that prefills the filter.
Option 2 also is an issue because with this solution the category might be tracked 5 times if the user keeps adding or removing other filters. It always tracks all filters instead of the one that has been added.
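One way around the double counting in option 2 is to compare the query parameters before and after each change and report only the difference. The sketch below shows that logic in Python purely as an illustration. On the real page it would run in the browser's JavaScript before the dataLayer push, and the function names and example URLs are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def filter_set(url):
    """Return the filters in a URL as a set of (name, value) pairs."""
    query = parse_qs(urlsplit(url).query)
    return {(name, value) for name, values in query.items() for value in values}


def filter_changes(previous_url, current_url):
    """Report only the filters that were added or removed between two URLs."""
    before = filter_set(previous_url)
    after = filter_set(current_url)
    return {
        "added": sorted(after - before),
        "removed": sorted(before - after),
    }


# Adding a brand while a category is already selected reports only the brand.
changes = filter_changes(
    "/products?category=shoes",
    "/products?category=shoes&brand=acme",
)
```

Whether filters that arrive pre-filled in the URL should count at all is a separate decision: the first page view has no previous URL to diff against, so they can be reported or ignored explicitly.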
The first step of tracking is using the analog of Occam's Razor. You want to cut off stuff that has no chance of answering legit business questions.
Your business question here is: What filters are the most helpful for the users? Now it's important to know why the business wants to know it. Cuz remember, the business is not very competent at data analysis even if it doesn't realize it.
So you need to know exactly how answering that question improves OKRs/KPIs. In this case, the legit answer could be: cuz we want to sort the filters by usage frequency and measure whether that eases engagement and thus improves the conversion rate for the part of the journey from the product list to the PDP.
That's a pretty weak reason, but passable. Especially if there's an issue in that transition currently.
Good, now having that context, why would we want to track filters used in pre-populated urls? Say some overzealous employee made a mistake and pre-populated some weird unneeded filter using, say, date and time of when the product has been added. And now they use that URL in all ads, so you get a lot of third party traffic coming to product lists with a date as a filter.
And then, let's say, that employee keeps using that filter for other persistent links, to the effect of the date/time filter becoming uncannily popular. There. Your data slowly becomes garbage and stops answering the original question.
There are other issues with tracking pre-set filters, some of which you've outlined, but the real issue is the ability of the data to answer good business questions clearly. Tracking all filters may be able to answer some technical questions, but it's not the aim of behavioral analytics to answer technical questions. Let them use access logs and whatever else they use to answer those.

Drupal Embedded views not working correctly

I recently took over a site from someone else at a new company. Having never used Drupal before, updating things has been a bit cumbersome. There were some outstanding security updates that I applied (but I haven't updated the core yet). Anyway, after doing this, the calls to views_embed_view have not been working. For example:
print views_embed_view('news_block');
This will either break the links (by using the title rather than the alias for the link), or it will link correctly but not follow the paging rules I have set (show 1 page, 6 items per page); instead it shows 10 items and has links for other pages.
I am not sure if the update has anything to do with it, but it seems likely. Would updating the core resolve this issue potentially?
The first argument of views_embed_view is view name, the second one is display id. If display_id is not provided, 'default' is used. Make sure that you are displaying the correct display. (i.e. default can be configured differently than some other display which you actually wish to see)

What are ramifications of changing RSS item GUID or Atom entry id?

When I make a substantial update (not just correcting a typo) to an article in my blog, I want to ensure that readers see the updated article again in their news feed. From what I have read, here are some of the options I see:
Create an entirely new article (largely a duplicate of the original). Apparently a bad idea -- duplicate content would be bad for SEO.
Change the published and/or updated timestamp of the article. It seems that, in most readers, this will not make the article show up as unread.
Change the RSS item GUID or Atom entry id. This is a big NO-NO according to the Atom specs, but I'm not sure about RSS.
So, there doesn't seem to be a good option, unless I'm missing something.
What are the ramifications of changing RSS item GUID or Atom entry id? Are the Feed Police going to show up at my door for changing an article ID?
Updating the "updated" field for that entry should be correct. Do not forget to also update the "updated" field for the feed itself and any ETag/Last-Modified HTTP headers (if they exist but are not auto-generated), and then wait for the reader to refresh or force it to.
If you still have problems with some of the readers, you should check with the authors of the feed reading software to see if that is intentional.
As for the second part, changing the id won't bring the Feed Police to your door, but if it happens often enough, the resulting duplicate articles could annoy your followers into ignoring or dropping the feed.
see this and this answers too
The RSS <guid> or Atom <id> is an element used to uniquely identify its parent item. Feed readers and aggregators use this field to determine if the item has already been downloaded or fetched.
If you change an RSS <guid> or Atom <id>, then readers and aggregators may use this as a signal or flag that the item is to be downloaded again because the GUID or ID held previously no longer matches what it has in its database or lookup.
Changing the GUID or ID is not a way to force an update in place. It's a way to say, "I have something brand new for you to download/fetch".
In RSS, if you add ?fake=parameter to the GUID that can be a substitute way to force a new download. But the old fetched item will still remain because it doesn't share the GUID.
You can't reliably force a download via RSS or Atom using the publish or updated date.
Best you can do is to change the contents of the item and allow readers or aggregators to update as they wish, as not all work the same in what they do when they see a change in content like this for an item it already has.
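To make the distinction concrete, here is a toy model in Python of how one kind of aggregator might key items by GUID. It is only an assumption about reader behaviour: the `Reader` class is hypothetical, and real readers differ, with some ignoring content changes entirely.

```python
import hashlib


class Reader:
    """Toy aggregator that remembers items by guid and a hash of their content."""

    def __init__(self):
        self.seen = {}  # guid -> sha256 hex digest of the last content seen

    def ingest(self, guid, content):
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if guid not in self.seen:
            # Unknown guid: stored as a brand-new item next to any old ones.
            self.seen[guid] = digest
            return "new"
        if self.seen[guid] != digest:
            # Known guid, different content: this reader updates in place.
            self.seen[guid] = digest
            return "updated"
        return "unchanged"


reader = Reader()
results = [
    reader.ingest("post-1", "first version"),
    reader.ingest("post-1", "first version"),
    reader.ingest("post-1", "second version"),
    reader.ingest("post-1?rev=2", "second version"),
]
```

In this model the last call creates a second stored item, which is the duplicate that subscribers end up seeing when a GUID is changed.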
As both answers at this point state: there is no perfect way. Changing the guid will make everyone believe that the content is brand new, hence probably creating duplicate content, and changing just the element will probably not always trigger a full refresh.
Using PubSubHubbub may help, as it uses fat pings, which means that the subscriber gets the updated data right away and can store it under the same key/unique id as the previous version.

Abusing HTTP POST

Currently reading Bloch's Effective Java (2nd Edition) and he makes a point to state, in bold, that overusing POSTs in web applications is inherently bad. Unfortunately, he doesn't specify why.
This startled me, because when I do any web development, all I ever use are POSTs! I have always steered clear of GETs for security reasons and because it felt more professional (long, unsightly URLs always bother me for some reason).
Are there performance differentials between GET and POST? Can anyone elaborate on why overusing POSTs is bad? My understanding, and my preliminary searches, seem to indicate that the two are handled very similarly by the web server. Thanks in advance!
You should use HTTP as it's supposed to be used.
GET should be used for idempotent, read queries (e.g. view an item, search for a product, etc.).
POST should be used for create, delete or update requests (e.g. delete an item, update a profile, etc.)
GET allows refreshing the page, bookmarking it, and sending the URL to someone. POST doesn't allow that. A useful pattern is post/redirect/get (AKA redirect after post).
Note that, except for long search forms, GET URLs should be short. They should usually look like http://www.foo.com/app/product/view?productId=1245, or even http://www.foo.com/app/product/view/1245
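The post/redirect/get pattern can be sketched without any web framework. The Python function below is a simplified stand-in for a request handler, not a real server: the `handle` function, the `/app/product/create` path and the in-memory `products` dict are all assumptions made for the example.

```python
from urllib.parse import parse_qs


def handle(method, path, body, products):
    """Tiny request handler returning (status, headers, body).

    POST changes state and answers with a redirect; GET only reads.
    """
    if method == "POST" and path == "/app/product/create":
        form = parse_qs(body)
        product_id = max(products, default=0) + 1
        products[product_id] = form.get("name", [""])[0]
        # 303 See Other tells the browser to follow up with a GET, so a
        # reload repeats the harmless GET rather than the POST.
        return 303, {"Location": f"/app/product/view/{product_id}"}, ""
    if method == "GET" and path.startswith("/app/product/view/"):
        tail = path.rsplit("/", 1)[1]
        if tail.isdigit() and int(tail) in products:
            return 200, {}, products[int(tail)]
    return 404, {}, "not found"


products = {}
status, headers, _ = handle("POST", "/app/product/create", "name=Widget", products)
page = handle("GET", headers["Location"], "", products)
```

The URL the user ends up on is a plain GET URL, so it can be bookmarked, shared and refreshed without resubmitting the form.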
You should almost always use GET when requesting content. Only use POST when you are either:
Transmitting sensitive information which should not appear in the URL bar, or
Changing the state on the server (adding/changing/deleting stuff, although recently some web applications use POST to change, PUT to add and DELETE to delete.)
Here's the difference: If you want to give the link to the page to a friend, or save it somewhere, or even only add it to your bookmarks, you need the full URL of the page. Just like your address bar should say http://stackoverflow.com/questions/7810876/abusing-http-post at the moment. You can Ctrl-C that. You can save that. Enter that link again, you're back at this page.
Now when you use any action other than GET, there is simply no URL to copy. It's as if your browser said you are at http://stackoverflow.com/question. You can't copy that. You can't bookmark that. Besides, if you try to reload this page, your browser will ask whether you want to send the data again, which is rather confusing for the non-tech-savvy users of your page. And annoying for everyone else.
However, you should use POST/PUT when transferring data. URLs can only be so long. You can't transmit an entire blog post in a URL. Also, if you submit data with GET and then reload the resulting page, you'll almost certainly double-post, because the confirmation prompt described above does not appear.
GET and POST are very different. Choose the right one for the job.
If you are using POST for security reasons, I might drop a mention of other security factors here. You need to ensure that you send the data from a form submit in encrypted form even if you are using POST.
As for the difference between GET and POST, it is as simple as this: GET is used to get data. You request data from a page and act upon it, and that is the end of it.
POST on the other hand, is used to POST data to the application. I am talking about transactions here (complete create, update or delete operations).
Say you have a sensitive application that takes an ID to delete a user. You would not want to use GET for it, because in that case a witty user may raise mayhem simply by changing the ID at the end of the URL and deleting random users.
POST allows more data and can be hacked to send streams of files as well. GET has a limited size though.
There is hardly any tradeoff in using GET or POST.

Architecture question involving search & session state

I have a grid with several thousand rows that can be filtered and sorted. On each row you can click a details button, which will bring you a new page with detailed information about the page. Because this is a button, you can't middle click or right click and open in a new tab. In addition, when clicking back you lose your filters and search results.
To solve this problem, I considered the following: Switch the buttons to links, and when filtering and searching, use get instead of post requests. This way, you could switch to new pages with a right click or middle click, and if you did follow a link normally, back would work properly.
This change was not made, however. We were asked to add a 'next result / previous result' set of buttons on the details page that would allow you to navigate. While not an elegant solution, it would at least work.
I proposed adding querystring parameters to the details page, that would regenerate the search query based on filter, and allow you to get the next and previous results in code.
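A rough Python sketch of that proposal, assuming the details URL carries the same filter and sort parameters as the grid. The `status` and `sort` parameter names, the row fields and the `neighbours` function are invented for the example, and an in-memory list stands in for the database query.

```python
from urllib.parse import parse_qs

# Hypothetical grid rows; in the real application these come from the database.
ROWS = [
    {"id": 1, "name": "delta", "status": "open"},
    {"id": 2, "name": "alpha", "status": "closed"},
    {"id": 3, "name": "charlie", "status": "open"},
    {"id": 4, "name": "bravo", "status": "open"},
]


def neighbours(query_string, current_id, rows):
    """Re-run the search from the querystring and return (previous_id, next_id)."""
    params = parse_qs(query_string)
    status = params.get("status", [None])[0]
    sort_key = params.get("sort", ["id"])[0]
    if sort_key not in ("id", "name"):
        sort_key = "id"  # fall back rather than trusting arbitrary input
    matches = [r for r in rows if status is None or r["status"] == status]
    matches.sort(key=lambda r: r[sort_key])
    ids = [r["id"] for r in matches]
    if current_id not in ids:
        return None, None
    i = ids.index(current_id)
    previous_id = ids[i - 1] if i > 0 else None
    next_id = ids[i + 1] if i + 1 < len(ids) else None
    return previous_id, next_id


result = neighbours("status=open&sort=name", 3, ROWS)
```

Because everything needed is in the URL, each tab navigates independently and nothing has to be kept in session state.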
A team member took issue with this solution. He believes that it is a waste of server resources to re-query the database. Instead, a solution was proposed to add a session variable that included a list of results. You could then use that to navigate.
I took issue with that because you can't have multiple tabs open without breaking navigation, and new results aren't appended to the list in real time. Also, if you are worried about optimization, session would be the last thing to use, since it eats memory and prevents server replication... unless you store the results back in the database.
What's the best solution?
Session doesn't sound like a winner, won't scale with lots of users.
Hitting the database repeatedly does seem unnecessary, but it depends on the cost - how many users, how often would they refresh/filter and what is the cost of that query?
If you do use querystrings you could cache the pages by parameter.
What about some AJAX code on that button to retrieve details? Leave the underlying grid in place and display the details in a div/panel or a new window/tab.
