Multiple Google Vision OCR requests at once? - google-cloud-vision

According to the Google Vision documentation, the maximum number of image files per request is 16. Elsewhere, however, I'm finding that the maximum number of requests per minute is as high as 1800. Is there any way to submit that many requests in such a short period of time from a single machine? I'm using curl on a Windows laptop, and I'm not sure how to go about submitting a second request before waiting for the first to finish almost a minute later (if such a thing is possible).

If you want to process 1800 images, you can group 16 images per request: 1800 / 16 = 112.5, so you will need 113 requests.
On the other hand, if the limit is 1800 requests per minute and each request can contain 16 images, then you can process 1800 * 16 = 28800 images per minute.
Please note that the docs say: "These limits apply to each Google Cloud Platform Console project and are shared across all applications and IP addresses using that project." So it doesn't matter whether requests are sent from a single machine or from many.
Cloud Vision can receive parallel requests, so your app should be prepared to manage this amount of requests/responses. You may want to check this example and then use threads in your preferred programming language for sending/receiving parallel operations.
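As a rough illustration of the batching and threading idea above, here is a minimal Python sketch. It only shows how to split images into groups of 16 and dispatch the groups from a thread pool. The function send_batch is a hypothetical placeholder (it does not call the Vision API), and the names make_batches, process_all and max_workers are assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_IMAGES_PER_REQUEST = 16  # per-request image limit quoted in the question


def make_batches(images, batch_size=MAX_IMAGES_PER_REQUEST):
    """Split the image list into consecutive groups of at most batch_size."""
    return [images[i:i + batch_size] for i in range(0, len(images), batch_size)]


def send_batch(batch):
    """Hypothetical placeholder: replace the body with the real HTTP call
    that posts one batch to the API and returns the parsed response."""
    return {"image_count": len(batch)}


def process_all(images, max_workers=8):
    """Send all batches from a pool of threads; results keep batch order."""
    batches = make_batches(images)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send_batch, batches))


if __name__ == "__main__":
    images = [f"image_{n}.png" for n in range(1800)]
    responses = process_all(images)
    print(len(responses))  # 113 batches for 1800 images
```

With a real send_batch, the thread pool lets a second request start while the first is still waiting for its response. The max_workers value would need tuning against the project's quota.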

Related

How to send 50.000 HTTP requests in a few seconds?

I want to create a load test for a feature of my app. It’s using a Google App Engine and a VM. The user sends HTTP requests to the App Engine. It’s realistic that this Engine gets thousands of requests in a few seconds. So I want to create a load test, where I send 20.000 - 50.000 in a timeframe of 1-10 seconds.
How would you solve this problem?
I started by trying Google Cloud Tasks, because it seems perfect for this. You schedule HTTP requests for a specific point in time. The docs say that there is a limit of 500 tasks per second per queue. If you need more tasks per second, you can split the tasks across multiple queues. I did this, but Google Cloud Tasks does not execute all the scheduled tasks at the given time. One queue needs 2-5 minutes to execute 500 requests that are all scheduled for the same second :thinking_face:
I also tried a TypeScript script running asynchronous node-fetch requests, but 5.000 requests took 77 seconds on my MacBook.
I don't think you can get 50.000 HTTP requests "in a few seconds" from "your macbook". It's better to consider a dedicated load testing tool, which can be deployed onto a GCP virtual machine in order to minimize network latency and traffic costs.
The tool choice is up to you. Either you need a machine type powerful enough to send 50k requests "in a few seconds" from a single virtual machine, or the tool needs to support running in clustered mode, so that you can start several machines and have them send the requests together at the same moment.
Given that you mention TypeScript, you might want to try the k6 tool (it doesn't scale, though), or check out Open Source Load Testing Tools: Which One Should You Use? to see what the other options are. None of them provides a JavaScript API; however, several don't require any programming knowledge at all.
A tool you could consider using is siege.
It is Linux-based, and to avoid the additional cost of testing from a system outside GCP, you could deploy siege on a relatively large machine or on a few machines inside GCP.
It is fairly simple to set up, but since you mention that you need 20-50k requests in a span of a few seconds, note that siege by default only allows 255 concurrent simulated users. You can raise this limit, though, so it can fit your needs.
You would need to experiment with how many connections a machine can establish, since each machine has a certain limit based on CPU, memory and number of network sockets. You could just increase the -c number until the machine gives an "Error: system resources exhausted" error or something similar. Experiment with what your virtual machine on GCP can handle.

Maximum number of concurrent http connections in Google Cloudfunction

I must call an external API a certain number of times from a Google Cloud Function. I must wait for all results before responding to the client. As I want to respond as quickly as I can, I want to make these calls async. But as I can get many calls (let's say 250 or 1000+ calls), I'm wondering if there is a limit (there certainly is one). I looked for the answer online, but everything I found is about calling a Cloud Function concurrently, which is not my problem here. I found some information about Node.js, but nothing related to Cloud Functions.
I'm using Firebase.
I would also like to know if there is an easy way in a Cloud Function to use the maximum number of concurrent connections and queue the rest of the calls.
On Cloud Functions, each request is processed by one instance of the function, and an instance can process only one request at a time (no concurrent request processing: 2 concurrent requests create 2 Cloud Functions instances).
Alternatively, with Cloud Run, you can process up to 80 requests at the same time on the same instance. So, for the same number of concurrent requests, you have fewer instances (up to 80 times fewer), and because you pay for the CPU and the memory, you will pay less with Cloud Run. I wrote an (old) article on this
The limit on the number of instances of a single Cloud Function has been removed (previously, it was 1000). So there is no limit on scalability (although there is a physical limit when the region doesn't have enough resources to create a new instance of your function).
About the queue: there is not really a queue. The request is held for a few seconds (about 10) while it waits for a new instance to be created or for an existing instance to become free after finishing another request. After 10s, a 429 HTTP error code is returned.
About concurrent requests made from within a Cloud Function: I tested making up to 25000 requests at the same time, and it works without issue.
However, you are limited by the function's capacity (only 1 CPU, so concurrency is limited) and by the memory. Increasing the memory also increases the CPU speed and lets you handle more concurrent requests; I got an out-of-memory error with 256 MB in a test with 2500 concurrent requests.
I performed the test in Go
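The question also asks how to cap the number of concurrent outbound calls and queue the rest. Here is a minimal sketch of that pattern using a semaphore, written in Python rather than the Node.js or Go mentioned above. call_external_api is a hypothetical stand-in for the real API call, and the max_concurrent value is an assumption to be tuned against the memory limits described in the answer.

```python
import asyncio


async def call_external_api(item):
    """Hypothetical placeholder for one outbound call to the external API."""
    await asyncio.sleep(0.01)  # simulates network latency
    return item * 2


async def bounded_gather(items, max_concurrent=100):
    """Run one call per item, with at most max_concurrent in flight.

    Calls beyond the cap wait on the semaphore, which acts as the queue.
    Results are returned in the same order as the input items.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def limited(item):
        async with semaphore:
            return await call_external_api(item)

    return await asyncio.gather(*(limited(item) for item in items))


if __name__ == "__main__":
    results = asyncio.run(bounded_gather(range(1000), max_concurrent=100))
    print(len(results))
```

The function only responds after gather has collected every result, which matches the requirement to wait for all calls before answering the client.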

How can I do performance testing of 10000+ users on Jmeter or any other opensource

I have to perform load testing for a particular application. I know that JMeter can't test a desktop app, so I can expose it as a web link for the purpose of testing.
My client has stated that there are 15000 users for this particular application.
How can I test this huge number in JMeter? Do I actually need to add 15000 virtual users?
I searched for a solution and found that we need to set up several servers. Is this the only option? For this I would have to create 15 different servers, which is not feasible.
Please advise if there is any other open source tool that I can use for this.
Thanks !!!
p.s. I am quite new in Performance Testing
I don't think you need 15000 virtual users to simulate this amount of real users. Real users don't hammer the server non-stop, they need some time to "think" between operations.
For instance, given the following situation:
User does something each 15 seconds
Page load time is 5 seconds
It means that each user completes one request every 20 seconds (15 seconds of activity plus 5 seconds of page load), i.e. 3 requests per minute. 15000 users will send 45000 requests per minute, which is 750 requests per second, and that can be simulated by a single modern mid-range computer.
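The arithmetic above can be checked with a short Python sketch. It assumes that each user's cycle is the think time plus the page load time (15 s + 5 s = 20 s), which is the reading that yields 3 requests per minute; the function name is made up for this example.

```python
def requests_per_second(users, think_time_s, response_time_s):
    """Estimate the aggregate request rate for users who each repeat a
    cycle of one request followed by think time (an assumed model)."""
    cycle_s = think_time_s + response_time_s   # 15 + 5 = 20 seconds
    per_user_per_minute = 60 / cycle_s         # 3 requests per minute
    return users * per_user_per_minute / 60    # convert to per second


if __name__ == "__main__":
    print(requests_per_second(15000, 15, 5))  # 750.0
```

Changing the think time shows how sensitive the target load is to this assumption, so it is worth measuring real user behaviour before sizing the test.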
If you proceed with JMeter, I strongly encourage you to get familiar with the JMeter Performance and Tuning Tips guide - it will allow you to use your machine in the most efficient way. If you still can't simulate 750 (or so) requests per second, you can consider distributed testing.
As for other open source tools, Tsung is known for being capable of simulating huge loads on not so powerful hardware, but it runs only on Linux/Unix systems and doesn't have any GUI, so if you need to conduct your load test fast, I would recommend going for JMeter.

asp.net high number of Request Queued and Context switching

We have a fairly popular site that has around 4 million users a month. It is hosted on a dedicated box with 16 GB of RAM and 2 processors with 24 cores.
At any given time the CPU is under 40% and the memory is under 12 GB, but at peak traffic we see very poor performance. The site is very slow. We have 2 app pools, one for our main site and one for our forum. Only the main site is slow. We don't have any restrictions on CPU or memory per app pool.
I have looked at the performance counters and saw something very interesting. At our peak time, for some reason, requests are being queued. Overall context switching numbers are very high, around 30 - 110 000 k.
As I understand it, high context switching is caused by locks. Can anyone give me example code that would cause a high number of context switches?
I am not too concerned with the context switching, and I don't think the numbers are huge. You have a lot of threads running in IIS (since it's a 24-core machine), and higher context switching numbers are expected. However, I am definitely concerned with the request queuing.
I would do several things and see how it affects your performance counters:
Your server CPU is evidently under-utilized, since you run below 40% all the time. You can try setting a higher value for "Threads per processor limit" in IIS until you get to 50-60% utilization. An optimal value of threads per core by the book is 20, but it depends on the scenario, and you can experiment with higher or lower values. I would recommend trying a value >=30. Low CPU utilization can also be a sign of blocking IO operations.
Adjust the "Queue Length" setting in the IIS properties. If you have configured the "Threads per processor limit" to be 20, then you should configure the Queue Length to be 20 x 24 cores = 480. Again, if requests are getting queued, that can be a sign that all your threads are busy serving other requests or blocked waiting for an IO response.
Don't serve your static files from IIS. Move them to a CDN, Amazon S3 or something similar. This will significantly improve your server performance, because thousands of requests will go somewhere else. If you MUST serve the files from IIS, then configure IIS file compression. In addition, use expires headers for your static content so that it gets cached on the client, which will save a lot of bandwidth.
Use async IO wherever possible (reading/writing from disk, DB, network etc.) in your ASP.NET controllers, handlers etc. to make sure you are using your threads optimally. Blocking the available threads with blocking IO (which is done in 95% of the ASP.NET apps I have seen in my life) could easily cause the thread pool to be fully utilized under heavy load, and queuing would occur.
Do a general optimization to reduce the number of requests that hit your server and the processing time of single requests. This can include minification and bundling of your CSS/JS files, refactoring your JavaScript to make fewer round trips to the server, refactoring your controller/handler methods to be faster etc. I have added links below to Google and Yahoo recommendations.
Disable ASP.NET debugging in IIS.
Google and Yahoo recommendations:
https://developers.google.com/speed/docs/insights/rules
https://developer.yahoo.com/performance/rules.html
If you follow all this advice, I am sure you will see some improvement!

play video from a point defined by the users

Suppose I broadcast a video and divide it into packets, and a user connects to the NetGroup and receives the object from the group. The user should receive it from a specific time: let's say the actual video is 10 minutes long, and a user connects to the group and seeks to the last 5 minutes of the video. How can I achieve this? Is it possible? I am using Flash Player 10.1.
Yes, it is possible, but it is a little complicated.
Flash video over HTTP uses progressive display and download. Random access into the stream is not technically possible. It may work in some instances when the file is already in the browser's cache, but it is not truly reliable. If you are stuck with HTTP only, then the only real option is to edit your video into chunks that represent your random access points. For example, if you have a one hour video, you can make twelve videos representing five minute offsets that play to the end (i.e., a 60 min file, a 55 min file, etc.). There are also some techniques that use a custom server and player which inject metadata to allow random access (I know colleagues who have done this, but I have never had to do it myself).
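To make the chunking scheme concrete, here is a small Python sketch that only plans the files; it does not cut any video. It lists the start offset and duration of each pre-cut file and picks the file to serve for a requested seek point. The function names and the rule of choosing the latest start offset not after the seek point are assumptions made for this example.

```python
def chunk_plan(total_minutes, step_minutes):
    """Return (start_offset, duration) pairs, one per pre-cut file.

    Each file starts at a multiple of step_minutes and plays to the end.
    """
    return [(start, total_minutes - start)
            for start in range(0, total_minutes, step_minutes)]


def pick_chunk(plan, seek_minute):
    """Pick the file with the latest start offset not after the seek point."""
    candidates = [chunk for chunk in plan if chunk[0] <= seek_minute]
    return candidates[-1]


if __name__ == "__main__":
    plan = chunk_plan(60, 5)
    print(len(plan))             # 12 files
    print(plan[0], plan[1])      # (0, 60) (5, 55)
    print(pick_chunk(plan, 32))  # (30, 30)
```

A seek to minute 32 is served by the file that starts at minute 30, so the viewer lands at most one step before the requested point.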
Flash video can also play over an RTMP connection. Flash Media Server provides this, as do one or two alternatives. RTMP / FMS give you many more options for streaming your video and allow for true random access into the stream. You can either purchase and host FMS yourself, or go with a hosted solution like Influxis. Some cloud-based solutions are also starting to become available.

Resources