Stackdriver Trace on GKE with Python app aggregates metrics in one collection

I am preparing a demo on GKE with a simple server app written in Python that receives requests from an Envoy proxy and adds some delay.
I wanted to use Stackdriver Trace to show the delay applied by each Envoy proxy and the app, but what I am getting is not that attractive visually.
So, there is this Envoy proxy that proxies requests to 3 different apps, called blue, green and red. Each of them also has an Envoy proxy that adds a delay and sends the request to the Python server, which returns a simple "Hello ...".
I sent 1000 requests to the front service, which randomly forwarded these requests to the 3 apps. Now when I go to the Stackdriver Trace console, this is what I see:
When I click on each of them (say green), I can see all the requests accumulated under green, with the time it took to respond for each request:
So, all this seems to be fine, but from the point of view of the demo, it is not that attractive to show this dashboard. If I send only a few requests, they are sometimes not aggregated, and the trace never shows up on the dashboard. I was wondering if there is a way to have these requests broken down one by one, so I will have a dashboard similar to this one (from the Stackdriver examples):
...where each request seems to be 1 dot, and I can just click on it and get the info.
Again, my demo is with Python, and it is in Alpha.
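For reference, here is a minimal sketch of what per-request instrumentation could look like on the Python side, assuming the OpenCensus Stackdriver exporter (the package names, PROJECT_ID, and route are placeholders, not from the demo). Using an always-on sampler also helps with the problem above where a small number of requests never shows up, since default samplers typically export only a fraction of traces:

    # A sketch, assuming the opencensus and opencensus-ext-stackdriver
    # packages are installed and the Trace API is enabled on the project.
    import time
    from flask import Flask
    from opencensus.ext.stackdriver.trace_exporter import StackdriverExporter
    from opencensus.trace.samplers import AlwaysOnSampler
    from opencensus.trace.tracer import Tracer

    PROJECT_ID = 'my-gcp-project'  # placeholder
    app = Flask(__name__)
    exporter = StackdriverExporter(project_id=PROJECT_ID)

    @app.route('/hello')
    def hello():
        # A fresh tracer per request gives each request its own trace ID,
        # so every request shows up individually in the Trace console.
        tracer = Tracer(exporter=exporter, sampler=AlwaysOnSampler())
        with tracer.span(name='hello-handler') as span:
            time.sleep(0.1)  # stands in for the artificial delay
            span.add_attribute('app.color', 'green')
        return 'Hello ...'

    if __name__ == '__main__':
        app.run(host='0.0.0.0', port=8080)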

Related

How to inspect Firestore network traffic with Charles Proxy?

As far as I can tell, Firestore uses protocol buffers when making a connection from an Android/iOS app. Out of curiosity I want to see what network traffic is going up and down, but I can't seem to make Charles Proxy show any real decoded info. I can see the open connection, but I'd like to see what's going over the wire.
Firestore's SDKs appear to be open source, so it should be possible to use them to help decode the output: https://github.com/firebase/firebase-js-sdk/tree/master/packages/firestore/src/protos
A few Google services (like AdMob: https://developers.google.com/admob/android/charles) have documentation on how to read network traffic with Charles Proxy, but I think your question is whether it's possible with Cloud Firestore, since Charles has support for protobufs.
The answer is: it is not possible right now. The Firestore requests can be seen, but you can't actually read any of the data being sent, since it's using protocol buffers. There is no documentation on how to use Charles with Firestore requests; there is an open issue (feature request) on this with the product team, which has no ETA. In the meantime, you can try the Protocol Buffers Viewer.
Alternatives for viewing Firestore network traffic could be:
From the Firestore documentation:
For all app types, Performance Monitoring automatically collects a trace for each network request issued by your app, called an HTTP/S network request trace. These traces collect metrics for the time between when your app issues a request to a service endpoint and when the response from that endpoint is complete. For any endpoint to which your app makes a request, Performance Monitoring captures several metrics:
Response time — Time between when the request is made and when the response is fully received
Response payload size — Byte size of the network payload downloaded by the app
Request payload size — Byte size of the network payload uploaded by the app
Success rate — Percentage of successful responses compared to total responses (to measure network or server failures)
You can view data from these traces in the Network requests subtab of the traces table, which is at the bottom of the Performance dashboard (learn more about using the console later on this page). This out-of-the-box monitoring includes most network requests for your app. However, some requests might not be reported, or you might use a different library to make network requests. In these cases, you can use the Performance Monitoring API to manually instrument custom network request traces. Firebase displays URL patterns and their aggregated data in the Network tab in the Performance dashboard of the Firebase console.
From a Stack Overflow thread:
The wire protocol for Cloud Firestore is based on gRPC, which is indeed a lot harder to troubleshoot than the WebSockets that the Realtime Database uses. One way is to enable debug logging with:
firebase.firestore.setLogLevel('debug');
Once you do that, the debug output will start getting logged.
Firestore uses gRPC for its API, and Charles does not support gRPC at the moment.
In this case you can use Mediator, a cross-platform GUI gRPC debugging proxy, similar to Charles but designed for gRPC.
You can dump all gRPC requests without any configuration.
To decode the gRPC/TLS traffic, you need to download and install the Mediator root certificate on your device, following the documentation.
To decode the request/response messages, you need to download the proto files mentioned in your description, then configure the proto root in Mediator, following the documentation.
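As an aside, once you have raw payload bytes out of a proxy, the same public proto files can be used to decode them offline. A minimal sketch, assuming you have compiled the Firestore protos to Python with grpcio-tools; the generated module path and the capture file name are assumptions:

    # Assumes something like:
    #   python -m grpc_tools.protoc -I protos --python_out=. \
    #       protos/google/firestore/v1/document.proto
    # which generates google/firestore/v1/document_pb2.py.
    from google.firestore.v1 import document_pb2  # generated module (assumption)

    # Raw bytes dumped from the proxy (file name is a placeholder).
    with open('captured_payload.bin', 'rb') as f:
        raw = f.read()

    doc = document_pb2.Document()
    doc.ParseFromString(raw)  # raises DecodeError if the bytes don't parse
    print(doc)

Note that if the bytes come straight off the wire, each gRPC message carries a 5-byte prefix (a compression flag plus a 4-byte length) that has to be stripped before parsing.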

Why are only six HTTPS requests allowed to be sent simultaneously?

One of my front-end developer co-workers asked for my help because the browser was not sending HTTPS requests asynchronously. At least, that is what he thought at first.
By opening up the Firefox networking tab, I realized that the requests do get sent asynchronously, but for some reason only six HTTPS requests are allowed to be sent in parallel.
I assume this is a technical limit of the HTTPS protocol, though I do not know the cause.
I searched for the cause on the web, but I have not been able to find it so far.
In the following picture, the red parts of the lines mark the duration spent being blocked by another request:
My understanding is that this is a courtesy to the server you're connecting to: you wouldn't want to overload a server and prevent others from simultaneously connecting to it. It is not part of the HTTPS protocol itself; it is a per-hostname connection limit that browsers enforce for HTTP/1.x.
Depending on the browser, the limit changes as well.
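To make the blocking behaviour concrete, here is a minimal sketch (an illustration of the concept, not the browser's actual implementation) of a per-host cap of six concurrent requests:

    # Model of a per-hostname connection cap: 20 requests to one host,
    # at most 6 in flight at a time, the rest waiting ("blocked").
    import asyncio

    PER_HOST_LIMIT = 6  # typical HTTP/1.x browser value; varies by browser

    async def fetch(host, path, sem):
        async with sem:  # at most PER_HOST_LIMIT coroutines pass at once
            await asyncio.sleep(0.1)  # stand-in for the real network I/O
            return host + path

    async def main():
        sem = asyncio.Semaphore(PER_HOST_LIMIT)
        urls = [('https://example.com', '/img/%d.png' % i) for i in range(20)]
        results = await asyncio.gather(*(fetch(h, p, sem) for h, p in urls))
        print(len(results), 'requests completed')

    asyncio.run(main())

The red segments in the Firefox network tab correspond to the time a request spends waiting on that semaphore-like limit before a connection slot frees up.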
See also: Max parallel http connections in a browser?

Server receiving webRequests long after removed from LoadBalancer

I had the following issue on a system that I supported ~7 years ago. We never got to the bottom of it, and focus shifted onto other issues. I was recently reminded of it, and wondered if anyone would know what was going on. But alas I'll be a little short on details. Sorry.
The Setup
I had a farm of web servers sitting behind a load balancer. The servers were hosting a system that would receive HTTP requests (XML &/or SOAP) from clients, then for each one kick off a bunch of further HTTP requests to 3rd-party suppliers, wait for the suppliers' responses, process and combine the results, and respond to the client's request.
Think insurance comparison, but as Business-To-Business XML service.
The whole process would take several seconds, from receiving the initial client request to sending back a response to that original HTTP request, and the server would be processing 10s or 100s of requests in parallel (i.e. at any given point, a given webserver would have many client requests that had come in and been logged, but not yet been responded to).
We had detailed logging which recorded the receipt of the requests, including the origin IP and which server was processing the request, and recorded when a response was sent.
All client requests were sent to a single IP address (well, URL), which was the address of the loadbalancer, which would then forward requests to the webservers, which weren't individually accessible to the internet (they didn't have public IP addresses).
Our load balancer would allow us to take individual web-servers out of rotation, for maintenance.
When we did that, we could watch the DB logs and see new requests stop coming in and the existing requests gradually get completed, until there were no outstanding requests and the server was idle.
The problem
We found that sometimes, when we took a server out of rotation ... it wouldn't entirely stop receiving requests. You could see the large bulk of requests suddenly stop coming in, but it would still receive a trickle of fresh requests (I don't know ... maybe 0.1% of normal load, maybe less?). I think the longest we left it going was maybe ... 10 minutes?
Notably we realised that all of those requests were coming from a single client/IP address (I don't remember which).
I forget whether other (still-in-rotation) webservers were still receiving requests from this client, but I think they were?
If we rebooted the webserver, no further requests would come in after restarting.
Web stack was Windows, IIS, ASP.NET; pretty old school even at the time. All servers individually owned and configured.
What was happening?
We vaguely waved our hands and asserted that the client's integration with us was "holding an HTTP tunnel open and sending multiple requests through it", rather than sending each request separately, and thus was maintaining that tunnel even after the LB stopped sending new requests to that server. But that was BS-waffle, and since we never needed to actually understand what was going on, we ignored it and moved on with our lives :)
But I'd still like to know what we were seeing, if anyone can diagnose it from that description.
We vaguely waved our hands and asserted that the client's integration with us was "holding an HTTP tunnel open and sending multiple requests through it", rather than sending each request separately, and thus was maintaining that tunnel even after the LB stopped sending new requests to that server.
That sounds like a good explanation.
Normally, an LB will refuse new connections to a removed server, but will allow open connections to live on until they naturally close. This is known as "connection draining" or "graceful shutdown".
If one of your clients had HTTP keepalive on, and was holding a TCP connection open and sending HTTP requests through it for a long time, it would give the symptoms you describe.
Most LBs will have a configuration knob for how long to wait for connections to close before force-closing them during this "connection draining" time. You can set a timeout here to avoid this scenario if it is a problem for you.
The HTTP connection handling behaviour of clients will vary at the client's discretion, to a large extent. Perhaps most of your clients were of one type (say, web browsers) and weren't holding open a single connection for 10 mins, but perhaps one client was different (say, a programmatic HTTP API client)?
Further reading about "connection draining" on AWS Load Balancers here (the exact details will vary by LB vendor): https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/config-conn-drain.html
Further reading about HTTP keep alive here: https://en.wikipedia.org/wiki/HTTP_persistent_connection
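For illustration, here is a minimal sketch of the client behaviour described above, using Python's requests library (the URL is a placeholder): a keep-alive client reuses one TCP connection for many requests, so a backend being drained keeps receiving its traffic until the connection closes.

    # A keep-alive client: requests.Session pools and reuses TCP
    # connections, sending Connection: keep-alive by default.
    import requests

    session = requests.Session()

    for i in range(10_000):
        # Each request travels over the same persistent connection, so
        # the load balancer never gets a chance to route this client to
        # a different backend.
        resp = session.get('https://api.example.com/quote')  # placeholder URL
        resp.raise_for_status()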

Vugen (Loadrunner) not able to record wms and wfs calls

I have an internal website which is completely based on a geo map and uses the HTTP/HTTPS protocol, along with WFS and WMS service calls.
While recording with LR I am able to launch and record all the contents, but after that, for the next transaction, when I click on any of the options to see the map, it is not recorded in LR.
For the next transaction, the calls are jQuery requests that get the server response in the form of images (geo locations).
I tried recording with the single HTTP protocol, and also with HTTP plus Web Services as multiple protocols.
My LR version is 12.01, and I am recording with the Chrome browser.
Please help me out!
Both WFS and WMS, from the Open Geospatial Consortium, should be using HTTP as a carrier. Can you provide insight on why such calls to these servers are being ignored?
Do you have the servers in any sort of filter as a third party element which should not be tested as you do not have ownership or control of the target?
Are you electing to connect to these services over a protocol other than HTTP?
You note "not recording": what is the specific objective evidence that a recording is not occurring? A lack of evidence in the recording logs? A lack of an increase in the events counter? A lack of ...? Please clarify.
Have you tried the HTTP Proxy method of recording versus the default sockets model?
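To illustrate the first point, a WMS GetMap call is just an HTTP GET with query parameters, which is exactly the kind of traffic an HTTP-level recording should capture. A minimal sketch (the endpoint and layer names are placeholders):

    # A plain WMS 1.3.0 GetMap request; nothing here is outside HTTP.
    import requests

    params = {
        'SERVICE': 'WMS',
        'VERSION': '1.3.0',
        'REQUEST': 'GetMap',
        'LAYERS': 'roads',                  # placeholder layer
        'CRS': 'EPSG:4326',
        'BBOX': '40.0,-75.0,41.0,-74.0',    # lat/lon order in WMS 1.3.0
        'WIDTH': '512',
        'HEIGHT': '512',
        'FORMAT': 'image/png',
    }
    resp = requests.get('https://gis.example.com/wms', params=params)  # placeholder
    print(resp.status_code, resp.headers.get('Content-Type'))  # expect image/png

If calls like this are missing from the recording, the question becomes what is filtering them out, which is what the points above are probing.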

Flex Monitor Server Requests

I have the system below:
one server
20 clients (could increase)
The clients display a webpage that has the Flex SWF file. It displays a list and polls the server every 1 second for any changes in the data. If there are any, it refreshes the data.
The polling is handled using a URL that returns a JSON object.
Now I want to have a web application that I can use to see the current status of all the monitors on the network.
Any smart solutions?
You could potentially have all the monitor apps connect to a NetConnectionGroup along with your status-check app. They could then post pings and health checks into the group, and if your status-check app is also connected, it could report those statuses. (This of course doesn't help if one of the monitors has crashed or not connected, but you'll have that problem with any solution!)
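Another option, sketched below, is an assumption on my part rather than part of the answer above: since every monitor already polls the server each second, the server itself can record when each monitor last polled and expose that as a status page.

    # A sketch: track each monitor's last poll time and report
    # up/down status. Endpoint names and the id parameter are assumed.
    import time
    from flask import Flask, jsonify, request

    app = Flask(__name__)
    last_seen = {}      # monitor id -> timestamp of its most recent poll
    STALE_AFTER = 5.0   # seconds without a poll before a monitor is "down"

    @app.route('/poll')
    def poll():
        # The existing 1-second polling URL; monitors identify
        # themselves with a query parameter.
        monitor_id = request.args.get('id', 'unknown')
        last_seen[monitor_id] = time.time()
        return jsonify(changed=False)  # real data-change check goes here

    @app.route('/status')
    def status():
        # The status web application reads this to see which monitors
        # are alive on the network.
        now = time.time()
        return jsonify({mid: ('up' if now - ts < STALE_AFTER else 'down')
                        for mid, ts in last_seen.items()})

Unlike the group-based approach, this also catches monitors that have crashed or lost connectivity, since they simply stop polling.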
