I've created an API with Kimono Labs to generate an RSS feed from a website. It works fine, crawling data every hour, but every few days it just stops working. No errors, nothing. In the crawl history I can see that the previous crawls were successful, and then the API simply stops crawling. When I launch a manual crawl, the API starts working again, but only for a few days. Then it stops again, I start another manual crawl, and it works for a while. What could cause this behavior?
This is the intended behaviour, described in the (?) popover of every API:
<p>Auto-run frequency <span class="icon-question-circle" data-html="true" popover="Specify how often this API will automatically fetch new data from the target page(s). APIs are limited to 1 URL for a hourly auto-run, <1000 URLs for a daily auto-run, and <10,000 URLs for a weekly auto-run."></span></p>
Anyway, it was a Kimono issue that is fixed now. I got an e-mail from support:
This is a crawling bug that we've now implemented handling for.
We are running a script that will check for queued scheduled crawls every hour
and start them if they are not running.
I have a VB.NET/Vue website hosted on an internal IIS 8.5 Windows 2012R2 server. Our company has about 30 users on the site at any given time. The users experience random delays throughout the day, and on some days there are no delays at all (the site works great most of the time). I'm looking for suggestions on where to start investigating. Here's what I've found so far.
User goes to site and initiates an api request from the UI
User sees a loading icon for anywhere up to a minute or so while the request returns
The request eventually reaches the server after some time and executes really fast within milliseconds and returns the response to the user
By this time, many users have already refreshed the page making new requests that succeed on page load. For the users that are patient and wait for the response, it eventually returns the response.
So to sum everything up, there are several users experiencing delays on a daily basis.
Some days we don’t have any delays, but most days we have several users experiencing multiple delays of several seconds to 30 seconds to 1 minute.
I’ve found all this using LogRocket and NewRelic and what is happening is all these requests are completing within milliseconds, but the request doesn’t seem to reach the server for some period of time.
I’ve been monitoring the CPU/Memory/Network on these servers and it all seems fine to me when these issues occur.
It seems that the problem lies between the user's computer and whatever hardware/software sits in front of the web server.
Update: I found that the problem occurs on the user's computer in all of these instances. Using Google Chrome's Performance API, I was able to collect timing info for these requests, and the delay shows up right after fetchStart. So whatever happens at that point is the cause of the issue.
Example below:
entryType: resource
startTime: 1119531.820000033
duration: 56882.43999995757
initiatorType: xmlhttprequest
nextHopProtocol: http/1.1
workerStart: 0
redirectStart: 0
redirectEnd: 0
fetchStart: 1119531.820000033
domainLookupStart: 1176401.0199999902
domainLookupEnd: 1176402.2699999623
connectStart: 1176402.2699999623
connectEnd: 1176404.8350000521
secureConnectionStart: 1176403.6700000288
requestStart: 1176404.8549999716
responseStart: 1176413.5300000198
responseEnd: 1176414.2599999905
transferSize: 15145
encodedBodySize: 14884
decodedBodySize: 14884
serverTiming: []
workerTiming: []
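fetchStart is at 1119531.820000033 and requestStart is at 1176404.8549999716, so the problem lies somewhere between fetchStart and requestStart. More precisely, domainLookupStart is already at 1176401.0199999902, so about 56.9 seconds pass between fetchStart and the start of the DNS lookup. Still looking into what is causing this.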
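To make the gap easier to see, here is a minimal sketch that splits the entry above into consecutive phases. The timestamps are copied from the example; the function name and the phase labels ("stalled", "wait" and so on) are my own, not part of the Performance API.

```python
# Timestamps (milliseconds) copied from the resource timing entry above.
entry = {
    "fetchStart": 1119531.820000033,
    "domainLookupStart": 1176401.0199999902,
    "domainLookupEnd": 1176402.2699999623,
    "connectStart": 1176402.2699999623,
    "connectEnd": 1176404.8350000521,
    "requestStart": 1176404.8549999716,
    "responseStart": 1176413.5300000198,
    "responseEnd": 1176414.2599999905,
}


def phase_durations(e):
    """Split one resource timing entry into phases (ms).

    The phase names are illustrative labels, not official API fields.
    """
    return {
        # Time spent before the browser even starts the DNS lookup.
        "stalled": e["domainLookupStart"] - e["fetchStart"],
        "dns": e["domainLookupEnd"] - e["domainLookupStart"],
        "connect": e["connectEnd"] - e["connectStart"],
        # Request sent until the first response byte arrives.
        "wait": e["responseStart"] - e["requestStart"],
        "download": e["responseEnd"] - e["responseStart"],
    }


phases = phase_durations(entry)
for name, ms in phases.items():
    print(f"{name:>8}: {ms:10.2f} ms")
```

For this entry, almost the whole duration lands in the first phase, while DNS, connect, server wait and download together take only a few milliseconds. That matches what the server-side monitoring shows.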
In 2022, we are experiencing something very similar with a small fraction of our customers. There is a significant gap between the Timing API's startTime and requestStart. This gap can be up to 8 minutes (I admire the patience of customers who wait that long). The wait periods are also close to multiples of a minute.
In our case, it appears that there is a (transparent?) proxy between those browsers and our server infrastructure which appears to be triggering the problem. In particular, it forces a downgrade of HTTP/2 to HTTP/1.1. Whitelisting our website in that proxy does solve the problem. This isn't a very satisfactory solution, but it does make the customer happier!
[UPDATE]
In our case, it turned out that we were sending a Content-Length header with a non-zero value on a 304 response. This is technically invalid and it caused problems with the proxy. This happened because of the Django CommonMiddleware, which always puts a Content-Length header on responses. The solution was to add a new piece of middleware that strips out the Content-Length header (and the content) on a 304 response.
It turned out that the content was already being stripped by our nginx frontend, but it is better not to generate it in the first place.
And what was the content? -- in our case, it was the 4 characters 'null'!
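For anyone who wants to try the same fix, here is a minimal sketch of such a middleware. This is not our exact code; the class name is made up, and it only assumes the standard Django response interface (status_code, content, has_header, and header deletion with del).

```python
class StripNotModifiedBodyMiddleware:
    """Hypothetical middleware: drop body and Content-Length on 304 responses."""

    def __init__(self, get_response):
        # Django passes in the next handler in the middleware chain.
        self.get_response = get_response

    def __call__(self, request):
        response = self.get_response(request)
        if response.status_code == 304:
            # Remove whatever body was generated (in our case b"null") ...
            response.content = b""
            # ... and the header that described its length.
            if response.has_header("Content-Length"):
                del response["Content-Length"]
        return response
```

Because response processing runs through the MIDDLEWARE list in reverse order, this class would need to be listed above django.middleware.common.CommonMiddleware, so that it sees the response after CommonMiddleware has added the header.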
My ASP.NET web application goes down every day: it takes forever to respond. Once I stop and start the website in IIS (not an iisreset), it works again. Then, hours or a day later, it becomes unresponsive again. What could be the reason? I suspect an unclosed database connection, but those are hard to find. The code was written by a previous programmer.
Check the queue length, which is a setting on the application pool.
If it's happening at a particular time of day, then please check the resource utilization (CPU/RAM) during that time.
There are APM tools such as Application Insights that you can use to monitor the response time of requests.
You can add Google Analytics to see the number of users online or making requests, to find out whether it's a threshold issue.
Look into the IIS logs for the time of the issue and check the time-taken field. If it's above normal, proceed to the following step.
During the time of issue (before you restart the website), capture a manual hang dump of the w3wp process - https://blogs.msdn.microsoft.com/debugdiag/2013/03/15/debug-diagnostic-1-2-generate-a-manual-hang-dump-on-a-specific-process/
Run the Debug Diag report and share it if you can. It will tell you what is possibly going wrong.
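As a rough illustration of the time-taken check above, here is a small sketch that scans W3C-format IIS log lines and lists the slow requests. The function name, the 5000 ms threshold and the sample log lines are all made up for the example; it assumes the log includes a #Fields: header and that time-taken is logged in milliseconds.

```python
def slow_requests(log_lines, threshold_ms=5000):
    """Return (date, time, uri, time_taken_ms) for requests at or above threshold_ms.

    threshold_ms is an arbitrary example value; tune it to what is normal for the site.
    """
    fields = []
    slow = []
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the space-separated columns that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        taken = row.get("time-taken", "")
        if taken.isdigit() and int(taken) >= threshold_ms:
            slow.append((row.get("date"), row.get("time"),
                         row.get("cs-uri-stem"), int(taken)))
    return slow


# Hypothetical sample lines, only to show the expected input shape.
sample = [
    "#Fields: date time cs-method cs-uri-stem sc-status time-taken",
    "2019-01-01 10:00:00 GET /default.aspx 200 120",
    "2019-01-01 10:05:00 GET /report.aspx 200 45210",
]
for hit in slow_requests(sample):
    print(hit)
```

If the slow entries cluster around the times users report the hang, that is the moment to capture the dump.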
We're displaying embedded videos using the iFrame API on a page that's hit by selenium-driven automation ~2 times/10 mins. Our tests have started failing because we're getting intermittent 404s from https://www.youtube.com/iframe_api.
This has been going on for two days now. Are we being rate limited here? Or does the problem lie on YouTube's end?
I am accessing a Drupal Views feed through xmlrpc. The script has worked in the past and my goal today was solely to access another feed. In theory, there was nothing to do except to change the name of the feed. The endpoint had not changed, my domain had not changed, I can log in to the remote site so my user credentials there are valid.
I am scratching my head as to what may have changed. Is there an obvious question that I have missed? What could have changed on the Drupal end that I should be taking into account?
I can also get a session id for an anonymous user okay.
The failure comes during the complicated authentication step (which has worked in the past).
Any suggestions?
Thanks.
Ah... if anyone else has the same problem, as I worked through my script, printing out its effect at each line, I came across a comment I had made when I wrote it.
Make sure the client and the remote are on the same time, preferably the time provided by www.time.is.
My PC was running a minute slow. The default time resynchronisation on Windows 7 runs at 1am on a Sunday. Change that to a more sensible schedule.
And for an immediate fix, change the PC time to within a few seconds of www.time.is.
That was the problem. Authenticated login uses a time stamp. If the remote server regards your time as too inaccurate, it will reject your login. Make sure the client is running with an accurate clock.
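If you want to check this from a script rather than by eye, one option is to compare the local clock with the Date header that the remote server sends in any HTTP response. This is only a sketch: the function name is made up, the Date header has one-second resolution, and I don't know what tolerance the Drupal side actually applies.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_skew_seconds(date_header, local_now=None):
    """Return local time minus server time, in seconds.

    date_header is the value of an HTTP Date response header.
    A negative result means the local clock is running behind the server.
    """
    server_time = parsedate_to_datetime(date_header)
    if local_now is None:
        local_now = datetime.now(timezone.utc)
    return (local_now - server_time).total_seconds()


# Example with fixed values: the local clock is one minute behind the server.
skew = clock_skew_seconds(
    "Tue, 15 Nov 1994 08:12:31 GMT",
    local_now=datetime(1994, 11, 15, 8, 11, 31, tzinfo=timezone.utc),
)
print(skew)  # -60.0
```

A skew of a minute, as in my case, was enough for the login to be rejected.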
I am examining the page load time numbers in GA and Pingdom. My average in Pingdom is consistently around 3 seconds. My page load time in GA is consistently around 10 seconds. Can anyone explain the technical reason for this difference, please?
Any reference to this information would be helpful; I haven't been able to find a straight answer.
This is an old question, and you've probably moved on, but Pingdom only tests the response time from the server (how long it takes the server to return a 200, 4XX or 5XX status), not even the time to receive the HTML doc; while Google Analytics shows the load time of the entire page, including all content and every asset loaded.