What is an acceptable poll rate for an RSS feed?

I'm currently polling the YouTube RSS feed servers about 3 times per second (multiple channels, each fetched every 3 seconds). Is this too much, and will polling this often actually make updates slower because the server starts serving me cached responses? I'm trying to pick up new videos as quickly as possible, but I can't find any guidelines on this sort of thing.
I used to fetch them through the /v3/search/ endpoint at the same rate, but the response from my server was always late compared to what I got on my local machine (where I didn't poll this often).
Also, are there any good alternatives? (I tried push notifications, but they were highly unreliable.)

You should be using caching in your app to reduce bandwidth and latency. When caching, store the ETag so that you can include it when requesting the resource again. If the resource has not changed, you will get a 304 (Not Modified) response, which means you can use your cached version; otherwise, you will get the updated version of the resource.
More info:
https://developers.google.com/youtube/v3/getting-started#etags
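For illustration, here is a minimal sketch of that conditional-GET loop in Python with the requests library. The channel ID placeholder and the 60-second interval are assumptions for the example, not recommendations from the docs:

    import time
    import requests

    # Hypothetical feed URL; substitute a real channel_id.
    FEED_URL = "https://www.youtube.com/feeds/videos.xml?channel_id=CHANNEL_ID"

    etag = None
    cached_feed = None

    while True:
        headers = {"If-None-Match": etag} if etag else {}
        resp = requests.get(FEED_URL, headers=headers, timeout=10)
        if resp.status_code == 304:
            pass                                # unchanged: keep cached_feed
        elif resp.ok:
            etag = resp.headers.get("ETag")     # remember for the next poll
            cached_feed = resp.text             # new content: parse for videos
        time.sleep(60)                          # assumed polite interval

The 304 branch costs almost nothing on either side, which is why a conditional GET is so much friendlier than re-downloading the feed on every poll.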

Related

Why is my website experiencing random slow API requests?

I have a VB.NET/Vue website hosted on an internal IIS 8.5 Windows 2012 R2 server. Our company has about 30 users on the site at any given time. The users experience random delays throughout the day, while on some days there are no delays at all (the site works great most of the time). What I'm looking for is any suggestion on where to start looking to solve the issue. Here's what I've found so far:
A user goes to the site and initiates an API request from the UI.
The user sees a loading icon for up to a minute or so while the request is outstanding.
The request eventually reaches the server after some time, executes within milliseconds, and returns the response.
By this time, many users have already refreshed the page, making new requests that succeed on page load. For the users who are patient and wait, the response eventually arrives.
So to sum everything up: most days several users experience multiple delays, ranging from a few seconds to 30 seconds or even a minute; some days there are no delays at all.
I've found all this using LogRocket and New Relic: the requests themselves complete within milliseconds once they arrive, but each request doesn't seem to reach the server for some period of time.
I've been monitoring CPU/memory/network on these servers, and it all looks fine while the issues occur.
It seems the problem lies somewhere between the user's computer and whatever hardware/software sits in front of the web server.
Update here... I found that the problem is occurring on the user's computer in all these instances. Using Google Chrome's Performance API, I was able to capture timing info for these requests and found that the stall begins at fetchStart. Whatever is happening there is the cause of the issue.
Example below:
entryType: resource
startTime: 1119531.820000033
duration: 56882.43999995757
initiatorType: xmlhttprequest
nextHopProtocol: http/1.1
workerStart: 0
redirectStart: 0
redirectEnd: 0
fetchStart: 1119531.820000033
domainLookupStart: 1176401.0199999902
domainLookupEnd: 1176402.2699999623
connectStart: 1176402.2699999623
connectEnd: 1176404.8350000521
secureConnectionStart: 1176403.6700000288
requestStart: 1176404.8549999716
responseStart: 1176413.5300000198
responseEnd: 1176414.2599999905
transferSize: 15145
encodedBodySize: 14884
decodedBodySize: 14884
serverTiming: []
workerTiming: []
fetchStart is at 1119531.820000033, then requestStart is at 1176404.8549999716 so the problem is something between fetchStart and requestStart. Still looking into what is causing this.
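To make the arithmetic explicit, here is a throwaway Python calculation over the entry above; it shows the stall sits before the DNS lookup even begins, which is consistent with the update that the problem is on the user's machine:

    entry = {
        "fetchStart": 1119531.820000033,
        "domainLookupStart": 1176401.0199999902,
        "requestStart": 1176404.8549999716,
        "duration": 56882.43999995757,
    }

    stall = entry["domainLookupStart"] - entry["fetchStart"]
    print(f"stalled before DNS: {stall / 1000:.1f} s")   # ~56.9 s
    # Nearly the entire 56.9 s duration elapses between fetchStart and
    # domainLookupStart, i.e. the request sat queued in the browser
    # before the host was even resolved.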
In 2022, we are experiencing something very similar with a small fraction of our customers. There is a significant gap between the Timing API's requestStart and startTime. This gap can be up to 8 minutes -- I admire the patience of customers waiting that long. The wait periods are also close to multiples of a minute.
In our case, a (transparent?) proxy between those browsers and our server infrastructure appears to be triggering the problem. In particular, it forces a downgrade from HTTP/2 to HTTP/1.1. Whitelisting our website in that proxy does solve the problem. It isn't a very satisfactory solution, but it does make the customer happier!
[UPDATE]
In our case, it turned out that we were sending a Content-Length header with a non-zero value on 304 responses. This is technically invalid, and it caused problems with the proxy. It happened because Django's CommonMiddleware always puts a Content-Length header on responses. The solution was to add a new piece of middleware that strips the Content-Length (and the content) from a 304 response.
It turned out that the content was already being stripped by our nginx frontend, but it is better not to generate it in the first place.
And what was the content? -- in our case, it was the 4 characters 'null'!
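For reference, a minimal sketch of such a middleware (Django-style; the class name is ours). It must be listed above CommonMiddleware in MIDDLEWARE so that its response phase runs after CommonMiddleware has added the header:

    class StripContentOn304Middleware:
        """Drop the body and Content-Length from 304 responses."""

        def __init__(self, get_response):
            self.get_response = get_response

        def __call__(self, request):
            response = self.get_response(request)
            if response.status_code == 304:
                response.content = b""              # a 304 must not carry a body
                if "Content-Length" in response:
                    del response["Content-Length"]  # strip the offending header
            return response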

Different Ways to Call Google Analytics from Server Side?

Currently, I am using the Measurement Protocol to push data to GA. The problem is that I never get any success or error response back in production. Is there a way to get one?
Because of this, I am also looking at whether there are other options to achieve the same thing, for example with Analytics 360.
The Google Analytics production data-collection endpoint does not return a request status (it always answers 200 OK); this is by design, to keep processing ultra-light.
What I usually recommend to clients using the Measurement Protocol server-side is:
1. Log a reasonable number of requests (or all of them) somewhere. Storage is extremely cheap nowadays, and since you know the data format, you will be able to extract the data manually if an emergency happens.
2. Every once in a while (one in a thousand, one in a million, or more often, depending on how important the data is), validate a randomly sampled request against the GA debug endpoint and parse the returned JSON (see the sketch below). If there are any warnings or errors, send a notification for further investigation. This way, if anything goes wrong, you will be on top of the problem before the BI & Marketing teams are affected.
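A minimal sketch of both recommendations in Python with requests (Universal Analytics endpoints; the sample rate is a placeholder, and the hitParsingResult/valid fields come from the debug endpoint's JSON, so verify them against the current docs):

    import json
    import logging
    import random
    import requests

    GA_COLLECT = "https://www.google-analytics.com/collect"
    GA_DEBUG = "https://www.google-analytics.com/debug/collect"
    SAMPLE_RATE = 0.001          # validate roughly one hit in a thousand

    log = logging.getLogger("ga_hits")

    def send_hit(payload: dict) -> None:
        requests.post(GA_COLLECT, data=payload, timeout=5)   # always 200 OK
        log.info("ga_hit %s", json.dumps(payload))           # replayable record

        if random.random() < SAMPLE_RATE:
            # Replay the same payload against the debug endpoint.
            result = requests.post(GA_DEBUG, data=payload, timeout=5).json()
            for hit in result.get("hitParsingResult", []):
                if not hit.get("valid"):
                    log.warning("invalid GA hit: %s", hit.get("parserMessage"))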

How to handle caching before it has been set when multiple users access the website at the same time

I have set up caching for my website; entries expire after an hour.
My problem: if the cache is empty and multiple users access the website at the same time, they all make the same expensive request at once, which pins CPU usage at 100% for an extended period. I would like to avoid that.
I am using System.Runtime.Caching.MemoryCache.
MVC ASP.NET application.
I have thought of a solution, but I am not sure how best to implement it. My idea:
The first user to come in starts the request, and I set a flag that says "fetching data". Any user who arrives after that sees that no cache has been set, but before starting their own request they check the flag. If a request is already in flight, the application should wait until the response comes back (is this possible?) and then use the response from the cache.
This way only one request is sent, the response from the service comes back sooner, and CPU usage stays low.
Please suggest an alternative if my idea is wrong. Can someone advise?
Thanks
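What is described above is a standard guard against a "cache stampede", usually implemented as double-checked locking. The question's stack is ASP.NET (where MemoryCache plus a lock, or a Lazy<T> entry, gives the same shape), but here is the pattern sketched in Python, using the one-hour expiry from the question:

    import threading
    import time

    TTL_SECONDS = 3600            # cache expires after an hour
    _cache = {}                   # key -> {"value": ..., "expires": ...}
    _lock = threading.Lock()      # decides who rebuilds the entry

    def fetch_expensive_data():
        time.sleep(5)             # stand-in for the slow upstream request
        return "payload"

    def get(key):
        entry = _cache.get(key)
        if entry and entry["expires"] > time.time():
            return entry["value"]                   # fast path: cache hit

        with _lock:                                 # one caller rebuilds...
            entry = _cache.get(key)                 # ...the rest block here and
            if entry and entry["expires"] > time.time():   # re-check on entry
                return entry["value"]
            value = fetch_expensive_data()
            _cache[key] = {"value": value, "expires": time.time() + TTL_SECONDS}
            return value

Callers that arrive while the first request is in flight block on the lock and are served from the freshly filled cache once it is released, which is exactly the "wait until a response has come back" behaviour asked about.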

WP-Engine 502 timeout - what options do I have to get around this limitation?

We have a plugin for WordPress that we've been using successfully with many customers; the plugin syncs stock numbers with our warehouse and exports orders to the warehouse.
We recently had a client move to WP-Engine, which seems to impose a hard 30-second limit on the length of a running request. Because we sometimes have many orders to export, the script simply hits a 502 Bad Gateway error.
According to the WP-Engine documentation, this cannot be turned off on a client-by-client basis:
https://wpengine.com/support/troubleshooting-502-error/
My question is: what options do I have to get around a host's 30-second timeout? Setting set_time_limit has no effect (as expected, since it is the web server killing the request, not PHP). The only thing I can think of is to heavily modify the plugin so that it acts as an API and we pull the data from the client's system ourselves, but that is a last resort.
The long-process timeout is 60 seconds.
It cannot be turned off on shared plans, only on plans with dedicated servers. You will not be able to get around it by modifying anything yourself, as it is enforced directly in Apache, outside of your particular install.
Your options are:
1. 'Chunk' the upload into smaller pieces.
2. Upload the SQL file to your SFTP _wpeprivate folder and have their support import it for you.
3. Optimize the import so the content is imported more efficiently.
I can see three options here:
1. Change the web host (the easy option).
2. Modify the plugin to process the sync in batches (see the sketch below). Even this won't give you a 100% guarantee under a hard script execution time limit: something may get lost in one or more batches and you won't even know.
3. Contact WP Engine and ask them to raise the limit for this particular client.
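To illustrate option 2, the time-budgeted batching pattern looks roughly like this (sketched in Python rather than the plugin's PHP; the 25-second budget, the cursor, and send_to_warehouse are all hypothetical):

    import time

    TIME_BUDGET = 25.0            # stop well before the host's 30 s kill

    def send_to_warehouse(order):
        ...                       # hypothetical per-order exporter

    def export_orders(orders, cursor=0):
        """Export as many orders as fit in the budget; return the next cursor."""
        deadline = time.monotonic() + TIME_BUDGET
        while cursor < len(orders) and time.monotonic() < deadline:
            send_to_warehouse(orders[cursor])     # keep each unit of work small
            cursor += 1
        return cursor             # persist this and resume on the next run

Each invocation stays under the limit, and a stored cursor lets the next scheduled run (e.g. via WP-Cron) pick up where the last one stopped.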

RSS: refresh rate?

I'm writing a little application for my own use which will consume a publicly published RSS feed.
As far as I can tell, there's no subscribe/post mechanism in the protocol; I need to have my application HTTP-GET the RSS feed periodically.
If that's the case, I'd like to grab it every ten minutes or so, but I'm worried about being seen as an abuser. I'd certainly be concerned if I saw someone poking my server every ten minutes for weeks on end.
Is this a valid concern? Is there any general advice on what a "reasonable" refresh rate is? Do I even have my facts straight?
Since RSS is built on top of HTTP, most sites should, in general, respect the If-Modified-Since HTTP header. This is fairly lightweight, and most servers should be able to return this information quickly.
So on the client side, you'll need to keep track of the Last-Modified date from the previous response and pass it back to the server in the If-Modified-Since header. If the server returns a 304 code, you'll know that nothing has changed. But even more importantly, the server doesn't need to return the feed body, saving traffic. If the server returns a 200, you'll need to process the results and save the new response date.
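As an illustration, here is a minimal Python client for that flow (the feed URL is a placeholder):

    import requests

    FEED_URL = "https://example.com/feed.rss"     # placeholder
    last_modified = None
    feed_body = None

    def poll():
        global last_modified, feed_body
        headers = {"If-Modified-Since": last_modified} if last_modified else {}
        resp = requests.get(FEED_URL, headers=headers, timeout=10)
        if resp.status_code == 304:
            return feed_body                      # unchanged: reuse cached copy
        resp.raise_for_status()
        last_modified = resp.headers.get("Last-Modified")
        feed_body = resp.text                     # 200: new content to process
        return feed_body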
Ultimately, the answer to this question depends on what type of information is at the other end of the RSS feed. If it is a blog, then polling once every 4-8 hours is probably sufficient. But if the feed carries stock quotes (not likely, just an example), then even every 10 minutes would not be frequent enough.
