I'm sending 4 metrics every minute from another server to the server hosting Graphite. I've set up Graphite and Grafana and am able to see the data in Grafana. However, I notice there's about a 3-minute delay between the time I send a metric and the time I see it in Grafana.
I'm using Graphite and Grafana for a real-time display and have set Grafana to auto-refresh every 10 s. A 3-minute delay is unusual, and I doubt the network is causing that much of it. Is there any way to look into why this delay is so high?
Thank you
If you point your graphite setup (in graphite-web local_settings.py) to a memcached cluster, metrics are cached there for (by default) 1 min.
It could explain part of the delay.
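A sketch of the relevant graphite-web settings (setting names as in graphite-web's default `local_settings.py`; the host is a placeholder):

```python
# /opt/graphite/webapp/graphite/local_settings.py (sketch)
MEMCACHE_HOSTS = ['127.0.0.1:11211']   # comment out or empty to bypass memcached
DEFAULT_CACHE_DURATION = 60            # seconds rendered data stays cached
```

Lowering `DEFAULT_CACHE_DURATION` (or emptying `MEMCACHE_HOSTS` while testing) is a quick way to check whether this cache accounts for part of the delay.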
It turns out there was an issue in the Graphite version I was using. The answer is here: https://answers.launchpad.net/graphite/+question/254964
I should add that Graphite doesn't show the last point from RRD, which can also cause a delay. If this matters to you, comment out the `rows.pop()` call in the `fetch` method of /opt/graphite/webapp/graphite/readers/rrd.py.
We have a backend API which runs in almost constant time (it sleeps for a given period). When we run a managed API that proxies to it over a long period, we see that from time to time its execution time increases to up to twice the average.
From analyzing the Amazon ALB data in production, it seems that the time the request spends inside Synapse remains the same, but the connection time (the time the request waits before entering the processing queue) is high.
In an isolated environment we noticed that these lags happen approximately every 10 minutes. In production, where multiple workers receive requests all the time, the picture is more obscure, as it happens more often (possibly the lags accumulate).
Is anyone aware of any periodic activity in the worker that results in delays entering the queue every few minutes? Is there a parameter that controls this? Otherwise, any idea how to figure out the cause?
Attached is an image demonstrating it.
Could be due to gateway token cache invalidation. The default timeout is 15 minutes.
Once key validation is done via the Key Manager, the key validation info is cached in the gateway. For subsequent API invocations, key validation is done from this cache, and during this time the execution time is lower.
After the cache is invalidated, token validation is done via the Key Manager again (hitting the DB), which causes the increase in execution time.
Investigating further, we found two causes for the spikes.
Our handler writes logs to a shared file system that was mounted sync instead of async, which caused delays; fixing this removed most of the spikes.
The remaining spikes seem to be related to registry updates. We did not investigate those, as they were more sporadic.
I have a web app, and I have a problem with how long Meteor.logout() and Meteor.call() take. When I call Meteor.logout(), it takes about 30-40 seconds, and the same goes for Meteor.call(). About 200-250 clients use this system at the same time.
If a client sees about 100-200 items on their app screen, the delay is very long; with 10-20 items it's a little better. Each item receives new data every 5-10 seconds, at different times; it's a live screen.
I don't get this problem when I run the same code against the same database on a different port that only I use.
I can't figure it out. What could the reason be? I need your ideas and help.
The logout function waits for a callback from the server, so there is probably something wrong with the way your server is configured.
Run the same code on another machine; it should not happen there.
You can use this.unblock() in every method and publication.
By default, Meteor processes a client's requests one at a time: incoming requests are queued while one is being processed.
This means that a method doing heavier work takes more time, and every other request from that client has to wait until it finishes.
Simply place this.unblock() at the start of every method and publication, and they will no longer block your requests.
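A sketch of where the call goes. This is Meteor server code, so outside a real Meteor app the `Meteor` object is stubbed below just to make the sketch self-contained; the method, publication, and collection names are made up:

```javascript
// Stub so the sketch runs outside Meteor; in a real app `Meteor` is global.
const Meteor = {
  methods: (defs) => defs,
  publish: (name, fn) => fn,
};

const handlers = Meteor.methods({
  heavyReport() {
    // Let this client's subsequent messages run in parallel
    // instead of waiting for this method to finish.
    this.unblock();
    // ... long-running work here (hypothetical) ...
  },
});

Meteor.publish("items", function () {
  this.unblock(); // the same call works in publications
  // return Items.find(); // hypothetical collection
});
```

Without this.unblock(), a slow heavyReport would delay every later call from the same client, including logout, which matches the symptoms described.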
Thanks
I solved my problem.
Collection updates were happening on one side while Meteor was publishing on the other, and as the number of clients increased the server became unresponsive. I solved it by enabling MongoDB's oplog tailing feature.
Thank you for your interest.
There could be multiple reasons.
There could be unsubscription of collections, which means the client and server exchange the list of ids being unsubscribed.
You may have a reactive UI which suddenly gets overwhelmed by the amount of data being transferred and needs to update itself (for example, Angular's digest cycle always runs after a Meteor sub/unsub).
Chrome DevTools' Network tab (the WebSocket frames view) is your best tool for understanding how soon the Meteor logout fires and whether there are messages being passed back and forth before the server returns the result of the logout request.
You may also use this.unblock() in subscriptions. This way your subscriptions run in parallel and don't block each other.
There does not seem to be a way to query/output the availability of a monitored web application via MS Application Insights for a given month, even if it is the current month.
I'd think this would be one of the (if not THE) most important metric to monitor, so I can't imagine that this just isn't possible. What am I overlooking?
Application Insights's Analytics area seems to be limited to queries of just over a week, as is the detail data for a Web Test if one increases the time range.
Is there really no way to do that?
You can use Application Insights Export feature to bypass 8 days retention and store the results externally. As per documentation, this functionality exports web test results as well, so you'll be able to access them for as long as you store them.
Sorry if this sounds like a workaround (because it is) rather than a solution.
[Feb 2017] Data retention is now 90 days.
I'm writing a web application that use websockets for bidirectional communication between the client and the server. My main concern is user-perceived latency, so, I am measuring and profiling whatever I can. In particular, I'm capturing the current time at the onmessage() event. This is useful, but I also want to know when the event has been pushed into the browser's event loop - which happens before the onmessage event is fired.
In Chrome Developer Tools, I see the times in the "Network->Frames" tab, which, I think, is the time when the event enters the event loop. But I need to capture this programmatically in Javascript. Any idea how to do this?
I did some "console.log"ing and saw in a few cases a difference of approximately 10 milliseconds between the time showing in Developer Tools, and the time I capture in the onmessage event. I want my measurements to show if the difference is always as small as 10 milliseconds, or whether sometimes the difference is much higher, due to rendering or some other thing that happens in the page.
The browser API for WebSocket is too restricted to expose the information you want.
Browsers have started to expose timing information through the Performance interface, but it will only tell you the timing of the initial connection to the WebSocket server; it knows nothing about individual WebSocket frames.
Based on your description of the problem, it shouldn't be necessary. The delay between the moment the HTTP stack receives your message and the moment it is handed to your application code, where it can be logged programmatically, is negligible and almost certainly below the precision of a JavaScript Date value (you could use performance.now(), though I have my doubts about how precise it actually is).
Your latency is going to be driven by network factors and server response time - if you can get reasonable measurements of those, you will be where you want to be. The other factors may contribute to measurement "noise" - so long as it is less than 10% of the value you are trying to measure, there are no issues.
I am currently collecting monitoring metrics with Ganglia, and I would like to graph that data with Graphite. I know such an integration is possible, and I found an article describing how it should be done, but I am not quite sure how the integration works, especially if I want to send the data straight into Graphite without parsing gmetad's output. Any help on integrating Ganglia with Graphite would be great.
thanks
There are two approaches to integrating Ganglia with Graphite:
1. Use a third-party process to fetch metrics from gmetad/gmond, tweak the metric data format, and finally send the data to the carbon server.
2. Use gmetad's "graphite integration" feature, where you only need to configure the carbon server address, port, and protocol (with an optional Graphite path syntax), and gmetad does the rest. More details can be found in your /etc/ganglia/gmetad.conf.
I would recommend #2 since it's pretty simple; you just need to upgrade your Ganglia packages to version 3.3+.
With both approaches, metrics end up stored in both RRD and Whisper. If you don't want that, ganglia-web also supports replacing the rrdtool graphs with Graphite; see "Using Graphite as the graphing engine".
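For approach #2, the gmetad.conf directives look roughly like this (directive names as in the sample gmetad.conf shipped with Ganglia 3.3+; check your own file, and the hostname is a placeholder):

```
# /etc/ganglia/gmetad.conf (sketch)
carbon_server "graphite.example.com"
carbon_port 2003          # carbon's plaintext listener
carbon_protocol tcp       # or udp, depending on your version
graphite_prefix "ganglia" # prepended to every metric path
```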
Have you checked the ganglia-web wiki? There is a section called "Graphite Integration" and another called "Using Graphite as the graphing engine" which explain well how to do what you want.
I've worked a lot with Ganglia, and from what I've researched Graphite works similarly. I was never able to master Whisper, but I've found RRDs (round-robin databases) to be pretty reliable. I'm not sure what you're interested in monitoring, but I would definitely check out JMXtrans; you can get the code from Google. It provides multiple methods for extracting metric data from whatever JVM you're monitoring, and lets you define which metrics you'd like to pipe to Ganglia/Graphite, among other options.