"Ping" a tensorflow serving server - grpc

I implemented several gRPC services, which all share the basic interface which allows me to "ping" them to see if they are up. For that, I have the getServiceVersion() request, which returns me the service version if the service is up, and the request breaks if the service is not up. How would you do this is tf serving? Or is there a better procedure in GRPC in general?
If no other answer comes up, I will just create a ping model which returns the service version from a constant. This would be a prediction model which predicts with 100% accuracy if the predictor is up. :)

Related

Will multiple AWS Lambdas making request to some HTTP(s) resource be detected as throttling?

Suppose I have multiple AWS Lambdas making a request to some HTTP API, is there any documentation how these requests will be seen on the API side in terms of throttling. Will they be treated as requests coming from different hosts or sometimes (or always) the same one?
You should assume no 2 runs of the same Lambda function will even run on the same machine. AWS might reuse the same instance to run your function, but that's to save the time spent downloading the code on every function call.
You can here from the AWS Lambda FAQ page found here, the following:
Q: Will AWS Lambda reuse function instances?
To improve performance, AWS Lambda may choose to retain an instance of
your function and reuse it to serve a subsequent request, rather than
creating a new copy. To learn more about how Lambda reuses function
instances, visit our documentation. Your code should not assume that
this will always happen.
Q: Why must AWS Lambda functions be stateless?
Keeping functions stateless enables AWS Lambda to rapidly launch as
many copies of the function as needed to scale to the rate of incoming
events. While AWS Lambda’s programming model is stateless, your code
can access stateful data by calling other web services, such as Amazon
S3 or Amazon DynamoDB.
Also, to understand more take a look at this interesting blog post (back from 2014)
If you place your Lambda functions outside of a VPC, your API will see requests coming from different hosts or sometimes the same one (Which is unpredictable since AWS reuses already provisioned Hot Lambda functions to handle requests for the same configuration)
However if you place your Lambda functions inside a VPC in a private subnet while configuring a NAT gateway for egress traffic, your API will see them coming from the NAT gateway IP address.
Depending on your rquirement you can use either the approaches where mostly the second option is needed for security purposes to whitelist IPs for ingress traffic to the API.

How does the Realm Mobile Platform scale?

You could say I am a fan of the Realm Mobile Platform. I'm using it and it seems to be working well.
However I am confused with how to operate it going to production. It seems to be deployed only to one server, and even the professional and enterprise editions are working on my single server.
Assuming Realm have thought of this (as Enterprise edition supports 'enterprise scaling) - how does this work if all clients point to my owned server URL?
Another question is how to monitor the load on that server.
Thanks!
The Professional Edition and the Enterprise Edition emit statsd compatible metrics which allow you to track the usage and load on each node in a Realm Object Server cluster. These metrics are also used internally inside the cluster in order to display statistics about the health of the cluster.
We are obviously still adding metrics as we understand more about our customer's use-cases, and fine-tuning the ones that we have.
With regards to the way the clustering works, we are currently implementing this according to an iterative process, where we add more and more features, and more and more resilience to the system with every passing day.
Basically, we have a logical load balancer process, which receives the incoming client connections, and then dispatches that to a node inside the cluster. This logical load balancer can be HA'd and LB'd itself as well, just like you would any regular WS connection handler. Handling many connections these days is easy. It's handling the quadratic merge algorithms that is expensive on the Realm Object Server, which is why the clustering is required for deployments at scale.

Jmeter TCP sampler used for what

From http://abh1sh3k.blogspot.com/2013/11/simple-tcp-server-and-jmeter-as-client.html, I practiced and it ran well.
I don't understand, what the test (jmeter tcp sampler) is used for? My company usually uses a web application and a process application.
Who can show me any example?

Service Oriented Architecture - AMQP or HTTP

A little background.
Very big monolithic Django application. All components use the same database. We need to separate services so we can independently upgrade some parts of the system without affecting the rest.
We use RabbitMQ as a broker to Celery.
Right now we have two options:
HTTP Services using a REST interface.
JSONRPC over AMQP to a event loop service
My team is leaning towards HTTP because that's what they are familiar with but I think the advantages of using RPC over AMQP far outweigh it.
AMQP provides us with the capabilities to easily add in load balancing, and high availability, with guaranteed message deliveries.
Whereas with HTTP we have to create client HTTP wrappers to work with the REST interfaces, we have to put in a load balancer and set up that infrastructure in order to have HA etc.
With AMQP I can just spawn another instance of the service, it will connect to the same queue as the other instances and bam, HA and load balancing.
Am I missing something with my thoughts on AMQP?
At first,
REST, RPC - architecture patterns, AMQP - wire-level and HTTP - application protocol which run on top of TCP/IP
AMQP is a specific protocol when HTTP - general-purpose protocol, thus, HTTP has damn high overhead comparing to AMQP
AMQP nature is asynchronous where HTTP nature is synchronous
both REST and RPC use data serialization, which format is up to you and it depends of infrastructure. If you are using python everywhere I think you can use python native serialization - pickle which should be faster than JSON or any other formats.
both HTTP+REST and AMQP+RPC can run in heterogeneous and/or distributed environment
So if you are choosing what to use: HTTP+REST or AMQP+RPC, the answer is really subject of infrastructure complexity and resource usage. Without any specific requirements both solution will work fine, but i would rather make some abstraction to be able switch between them transparently.
You told that your team familiar with HTTP but not with AMQP. If development time is an important time you got an answer.
If you want to build HA infrastructure with minimal complexity I guess AMQP protocol is what you want.
I had an experience with both of them and advantages of RESTful services are:
they well-mapped on web interface
people are familiar with them
easy to debug (due to general purpose of HTTP)
easy provide API to third-party services.
Advantages of AMQP-based solution:
damn fast
flexible
cost-effective (in resources usage meaning)
Note, that you can provide RESTful API to third-party services on top of your AMQP-based API while REST is not a protocol but rather paradigm, but you should think about it building your AQMP RPC api. I have done it in this way to provide API to external third-party services and provide access to API on those part of infrastructure which run on old codebase or where it is not possible to add AMQP support.
If I am right your question is about how to better organize communication between different parts of your software, not how to provide an API to end-users.
If you have a high-load project RabbitMQ is damn good piece of software and you can easily add any number of workers which run on different machines. Also it has mirroring and clustering out of the box. And one more thing, RabbitMQ is build on top of Erlang OTP, which is high-reliable,stable platform ... (bla-bla-bla), it is good not only for marketing but for engineers too. I had an issue with RabbitMQ only once when nginx logs took all disc space on the same partition where RabbitMQ run.
UPD (May 2018):
Saurabh Bhoomkar posted a link to the MQ vs. HTTP article written by Arnold Shoon on June 7th, 2012, here's a copy of it:
I was going through my old files and came across my notes on MQ and thought I’d share some reasons to use MQ vs. HTTP:
If your consumer processes at a fixed rate (i.e. can’t handle floods to the HTTP server [bursts]) then using MQ provides the flexibility for the service to buffer the other requests vs. bogging it down.
Time independent processing and messaging exchange patterns — if the thread is performing a fire-and-forget, then MQ is better suited for that pattern vs. HTTP.
Long-lived processes are better suited for MQ as you can send a request and have a seperate thread listening for responses (note WS-Addressing allows HTTP to process in this manner but requires both endpoints to support that capability).
Loose coupling where one process can continue to do work even if the other process is not available vs. HTTP having to retry.
Request prioritization where more important messages can jump to the front of the queue.
XA transactions – MQ is fully XA compliant – HTTP is not.
Fault tolerance – MQ messages survive server or network failures – HTTP does not.
MQ provides for ‘assured’ delivery of messages once and only once, http does not.
MQ provides the ability to do message segmentation and message grouping for large messages – HTTP does not have that ability as it treats each transaction seperately.
MQ provides a pub/sub interface where-as HTTP is point-to-point.
UPD (Dec 2018):
As noticed by #Kevin in comments below, it's questionable that RabbitMQ scales better then RESTful servies. My original answer was based on simply adding more workers, which is just a part of scaling and as long as single AMQP broker capacity not exceeded, it is true, though after that it requires more advanced techniques like Highly Available (Mirrored) Queues which makes both HTTP and AMQP-based services have some non-trivial complexity to scale at infrastructure level.
After careful thinking I also removed that maintaining AMQP broker (RabbitMQ) is simpler than any HTTP server: original answer was written in Jun 2013 and a lot of changed since that time, but the main change was that I get more insight in both of approaches, so the best I can say now that "your mileage may vary".
Also note, that comparing both HTTP and AMQP is apple to oranges to some extent, so please, do not interpret this answer as the ultimate guidance to base your decision on but rather take it as one of sources or as a reference for your further researches to find out what exact solution will match your particular case.
The irony of the solution OP had to accept is, AMQP or other MQ solutions are often used to insulate callers from the inherent unreliability of HTTP-only services -- to provide some level of timeout & retry logic and message persistence so the caller doesn't have to implement its own HTTP insulation code. A very thin HTTP gateway or adapter layer over a reliable AMQP core, with option to go straight to AMQP using a more reliable client protocol like JSONRPC would often be the best solution for this scenario.
Your thoughts on AMQP are spot on!
Furthermore, since you are transitioning from a monolithic to a more distributed architecture, then adopting AMQP for communication between the services is more ideal for your use case. Here is why…
Communication via a REST interface and by extension HTTP is synchronous in nature — this synchronous nature of HTTP makes it a not-so-great option as the pattern of communication in a distributed architecture like the one you talk about. Why?
Imagine you have two services, service A and service B in that your Django application that communicate via REST API calls. This API calls usually play out this way: service A makes an http request to service B, waits idly for the response, and only proceeds to the next task after getting a response from service B. In essence, service A is blocked until it receives a response from service B.
This is problematic because one of the goals with microservices is to build small autonomous services that would always be available even if one or more services are down– No single point of failure. The fact that service A connects directly to service B and in fact, waits for some response, introduces a level of coupling that detracts from the intended autonomy of each service.
AMQP on the other hand is asynchronous in nature — this asynchronous nature of AMQP makes it great for use in your scenario and other like it.
If you go down the AMQP route, instead of service A making requests to service B directly, you can introduce an AMQP based MQ between these two services. Service A will add requests to the Message Queue. Service B then picks up the request and processes it at its own pace.
This approach decouples the two services and, by extension, makes them autonomous. This is true because:
If service B fails unexpectedly, service A will keep accepting requests and adding them to the queue as though nothing happened. The requests would always be in the queue for service B to process them when it’s back online.
If service A experiences a spike in traffic, service B won’t even notice because it only picks up requests from the Message Queues at its own pace
This approach also has the added benefit of being easy to scale— you can add more queues or create copies of service B to process more requests.
Lastly, service A does not have to wait for a response from service B, the end users don’t also have to wait for long— this leads to improved performance and, by extension, a better user experience.
Just in case you are considering moving from HTTP to AMQP in your distributed architecture and you are just not sure how to go about it, you can checkout this 7 parts beginner guide on message queues and microservices. It shows you how to use a message queue in a distributed architecture by walking you through a demo project.

Test harness software for networking failures

During integration testing it is important to simulate various kinds of low-level networking failure to ensure that the components involved properly handle them. Some socket connection examples (from the Release It! book by Michael Nygard) include
connection refused
remote end replies with SYN/ACK but never sends any data
remote end sends only RESET packets
connection established, but remote end never acknowledges receiving packets, causing endless retransmissions
and so forth.
It would be useful to simulate such failures for integration testing involving web services, database calls and so forth.
Are there any tools available able to create failure conditions of this specific sort (i.e. socket-level failures)? One possibility, for instance, would be some kind of dysfunctional server that exhibits different kinds of failure on different ports.
EDIT: After some additional research, it looks like it's possible to handle this kind of thing using a firewall. For example iptables has some options that allow you to match packets (either randomly according to some configurable probability or else on an every-nth-packet basis) and then drop them. So I am thinking that we might set up our "nasty server" with the firewall rules configured on a port-by-port basis to create the kind of nastiness we want to test our apps against. Would be interested to hear thoughts about this approach.
bane is built for this purpose, described as:
Bane is a test harness used to test your application's interaction with other servers. It is based upon the material from Michael Nygard's "Release It!" book as described in the "Test Harness" chapter.
(Edit 2021): a more developed tool for testing behavior with different network issues is toxiproxy
Toxiproxy is a framework for simulating network conditions. It's made specifically to work in testing, CI and development environments, supporting deterministic tampering with connections, but with support for randomized chaos and customization. Toxiproxy is the tool you need to prove with tests that your application doesn't have single points of failure.
Take a look at the dummynet.
You can do it with iptables, or you can do it without actually sending the packets anywhere with ns-3, possibly combined with your favourite virtualisation solution, or you can do all sorts of strange things with scapy.

Resources