How to send <n> requests (instead of sending for duration <d> seconds) - wrk

Current wrk configuration allows sending continuous requests for seconds (duration parameter).
Is there a way to use wrk to send requests and then exit.
My use case: I want to create large number of threads + connections (e.g. 1000 threads with 100 connections per thread) and send instantaneous bursts towards the server.

You can do it with LUA script:
local counter = 1
function response()
if counter == 100 then
wrk.thread:stop()
end
counter = counter + 1
end
Pass this script with -s command line parameter.

I make changes to wrk to introduce new knobs. Let me know if anyone is interested in the patch and I could post it.
I added a -r to send exactly requests and bail out.

Artem,
I have this code-change in my fork:
https://github.com/bhakta0007/wrk

Related

How to solve pdblp Time out issue

I am using Python to download some data from bloomberg. It works most of the time, but sometimes it pops up a 'Time Out Issue`. And after that the response and request does not match anymore.
The code I use in the for loop is as follows:
result_IVM=con.bdh(option_name,'IVOL_MID',date_string,date_string,longdata=True)
volatility=result_IVM['value'].values[0]
When I set up the connection, I used following code:
con = pdblp.BCon(debug=True, port=8194, timeout=5000)
If I increase the timeout parameter (now is 5,000), will it help for this issue?
I'd suggest to increase the timeout to 5000 or even 10000 then test for few times. The default value of timeout is 500 milliseconds, which is small!
The TIMEOUT Event is triggered by the blpapi when no Event(s) arrives within milliseconds
The author of pdblp defines timeout as:
timeout: int Number of milliseconds before timeout occurs when
parsing response. See blp.Session.nextEvent() for more information.
Ref: https://github.com/matthewgilbert/pdblp/blob/master/pdblp/pdblp.py

IncompleteRead error when submitting neo4j batch from remote server; malformed HTTP response

I've set up neo4j on server A, and I have an app running on server B which is to connect to it.
If I clone the app on server A and run the unit tests, it works fine. But running them on server B, the setup runs for 30 seconds and fails with an IncompleteRead:
Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/nose-1.3.1-py2.7.egg/nose/suite.py", line 208, in run
self.setUp()
File "/usr/local/lib/python2.7/site-packages/nose-1.3.1-py2.7.egg/nose/suite.py", line 291, in setUp
self.setupContext(ancestor)
File "/usr/local/lib/python2.7/site-packages/nose-1.3.1-py2.7.egg/nose/suite.py", line 314, in setupContext
try_run(context, names)
File "/usr/local/lib/python2.7/site-packages/nose-1.3.1-py2.7.egg/nose/util.py", line 469, in try_run
return func()
File "/comps/comps/webapp/tests/__init__.py", line 19, in setup
create_graph.import_films(films)
File "/comps/comps/create_graph.py", line 49, in import_films
batch.submit()
File "/usr/local/lib/python2.7/site-packages/py2neo-1.6.3-py2.7-linux-x86_64.egg/py2neo/neo4j.py", line 2643, in submit
return [BatchResponse(rs).hydrated for rs in responses.json]
File "/usr/local/lib/python2.7/site-packages/py2neo-1.6.3-py2.7-linux-x86_64.egg/py2neo/packages/httpstream/http.py", line 563, in json
return json.loads(self.read().decode(self.encoding))
File "/usr/local/lib/python2.7/site-packages/py2neo-1.6.3-py2.7-linux-x86_64.egg/py2neo/packages/httpstream/http.py", line 634, in read
data = self._response.read()
File "/usr/local/lib/python2.7/httplib.py", line 532, in read
return self._read_chunked(amt)
File "/usr/local/lib/python2.7/httplib.py", line 575, in _read_chunked
raise IncompleteRead(''.join(value))
IncompleteRead: IncompleteRead(131072 bytes read)
-------------------- >> begin captured logging << --------------------
py2neo.neo4j.batch: INFO: Executing batch with 2 requests
py2neo.neo4j.batch: INFO: Executing batch with 1800 requests
--------------------- >> end captured logging << ---------------------
The exception happens when I submit a sufficiently large batch. If I reduce the size of the data set, it goes away. It seems to be related to request size rather than the number of requests (if I add properties to the nodes I'm creating, I can have fewer requests).
If I use batch.run() instead of .submit(), I don't get an error, but the tests fail; it seems that the batch is rejected silently. If I use .stream() and don't iterate over the results, the same thing happens as .run(); if I do iterate over them, I get the same error as .submit() (except that it's "0 bytes read").
Looking at httplib.py suggests that we'll get this error when an HTTP response has Transfer-Encoding: Chunked and doesn't contain a chunk size where one is expected. So I ran tcpdump over the tests, and indeed, that seems to be what's happening. The final chunk has length 0x8000, and its final bytes are
"http://10.210.\r\n
0\r\n
\r\n
(Linebreaks added after \n for clarity.) This looks like correct chunking, but the 0x8000th byte is the first "/", rather than the second ".". Eight bytes early. It also isn't a complete response, being invalid JSON.
Interestingly, within this chunk we get the following data:
"all_relatio\r\n
1280\r\n
nships":
That is, it looks like the start of a new chunk, but embedded within the old one. This new chunk would finish in the correct location (the second "." of above), if we noticed it starting. And if the chunk header wasn't there, the old chunk would finish in the correct location (eight bytes later).
I then extracted the POST request of the batch, and ran it using cat batch-request.txt | nc $SERVER_A 7474. The response to that was a valid chunked HTTP response, containing a complete valid JSON object.
I thought maybe netcat was sending the request faster than py2neo, so I introduced some slowdown
cat batch-request.txt | perl -ne 'BEGIN { $| = 1 } for (split //) { select(undef, undef, undef, 0.1) unless int(rand(50)); print }' | nc $SERVER_A 7474
But it continued to work, despite being much slower now.
I also tried doing tcpdump on server A, but requests to localhost don't go over tcp.
I still have a few avenues that I haven't explored: I haven't worked out how reliably the request fails or under precisely which conditions (I once saw it succeed with a batch that usually fails, but I haven't explored the boundaries). And I haven't tried making the request from python directly, without going through py2neo. But I don't particularly expect either of these to be be very informative. And I haven't looked closely at the TCP dump except for using wireshark's 'follow TCP stream' to extract the HTTP conversation; I don't really know what I'd be looking for there. There's a large section that wireshark highlights in black in the failed dump, and only isolated lines black in the successful dump, maybe that's relevant?
So for now: does anyone know what might be going on? Anything else I should try to diagnose the problem?
The TCP dumps are here: failed and successful.
EDIT: I'm starting to understand the failed TCP dump. The whole conversation takes ~30 seconds, and there's a ~28-second gap in which both servers are sending ZeroWindow TCP frames - these are the black lines I mentioned.
First, py2neo fills up neo4j's window; neo4j sends a frame saying "my window is full", and then another frame which fills up py2neo's window. Then we spend ~28 seconds with each of them just saying "yup, my window is still full". Eventually neo4j opens its window again, py2neo sends a bit more data, and then py2neo opens its window. Both of them send a bit more data, then py2neo finishes sending its request, and neo4j sends more data before also finishing.
So I'm thinking that maybe the problem is something like, both of them are refusing to process more data until they've sent some more, and neither can send some more until the other processes some. Eventually neo4j enters a "something's gone wrong" loop, which py2neo interprets as "go ahead and send more data".
It's interesting, but I'm not sure what it means, that the penultimate TCP frame sent from neo4j to py2neo starts \r\n1280\r\n - the beginning of the fake-chunk. The \r\n8000\r\n that starts the actual chunk, just appears part-way through an unremarkable TCP frame. (It was the third frame sent after py2neo finished sending its post request.)
EDIT 2: I checked to see precisely where python was hanging. Unsurprisingly, it was while sending the request - so BatchRequestList._execute() doesn't return until after neo4j gives up, which is why neither .run() or .stream() did any better than .submit().
It appears that a workaround is to set the header X-Stream: true;format=pretty. (By default it's just true; it used to be pretty, but that was removed due to this bug (which looks like it's actually a neo4j bug, and still seems to be open, but isn't currently an issue for me).
It looks like, by setting format=pretty, we cause neo4j to not send any data until it's processed the whole of the input. So it doesn't try to send data, doesn't block while sending, and doesn't refuse to read until it's sent something.
Removing the X-Stream header entirely, or setting it to false, seems to have the same effect as setting format=pretty (as in, making neo4j send a response which is chunked, pretty-printed, doesn't contain status codes, and doesn't get sent until the whole request has been processed), which is kinda weird.
You can set the header for an individual batch with
batch._batch._headers['X-Stream'] = 'true;format=pretty'
Or set the global headers with
neo4j._add_header('X-Stream', 'true;format=pretty')

How can you throttle calls server side?

I know client side _underscore.js can be used to throttle click rates, but how do you throttle calls server side? I thought of using the same pattern but unfortunately _throttle doesn't seem to allow for differentiating between Meteor.userId()'s.
Meteor.methods({
doSomething: function(arg1, arg2){
// how can you throttle this without affecting ALL users
}
);
Here's a package I've roughed up - but not yet submitted to Atmosphere (waiting until I familiarize myself with tinytest and write up unit tests for it).
https://github.com/zeroasterisk/Meteor-Throttle
Feel free to play with it, extend, fix and contribute (pull requests encouraged)
The concept is quite simple, and it only runs (should only be run) on the server.
You would first need to come up with a unique key for what you want to throttle...
eg: Meteor.userId() + 'my-function-name' + 'whatever'
This system uses a new Collection 'throttle' and some helper methods to:
check, set, and purge records. There is also a helper checkThenSet
method which is actually the most common pattern, check if we can do something,
and the set a record that we did.
Usage
(Use Case) If your app is sending emails, you wouldn't want to send the same email over
and over again, even if a user triggered it.
// on server
if (!Throttle.checkThenSet(key, allowedCount, expireInSec)) {
throw new Meteor.Error(500, 'You may only send ' + allowedCount + ' emails at a time, wait a while and try again');
}
....
On Throttle Methods
checkThenSet(key, allowedCount, expireInSec) checks a key, if passes it then sets the key for future checks
check(key, allowedCount) checks a key, if less than allowedCount of the (unexpired) records exist, it passes
set(key, expireInSec) sets a record for key, and it will expire after expireInSec seconds, eg: 60 = 1 min in the future
purge() expires all records which are no longer within timeframe (automatically called on every check)
Methods (call-able)
throttle(key, allowedCount, expireInSec) --> Throttle.checkThenSet()
throttle-check(key, allowedCount) --> Throttle.check()
throttle-set(key, expireInSec) --> Throttle.set()
there is not built in support for this currently in meteor, but its on the roadmap https://trello.com/c/SYcbkS3q/18-dos-hardening-rate-limiting
in theory you could use some of the options here Throttling method calls to M requests in N seconds but you would have to roll your own solution

Parallel HTTP web crawler in Erlang

I'm coding on a simple web crawler and have generated a bunch gf static files I try to crawl by the code at bottom. I have two issues/questions I don't have an idea for:
1.) Looping over the sequence 1..200 throws me an error exactly after 100 pages have been crawled:
** exception error: no match of right hand side value {error,socket_closed_remotely}
in function erlang_test_01:fetch_page/1 (erlang_test_01.erl, line 11)
in call from lists:foreach/2 (lists.erl, line 1262)
2.) How to parallelize the requests, e.g. 20 cincurrent reqs
-module(erlang_test_01).
-export([start/0]).
-define(BASE_URL, "http://46.4.117.69/").
to_url(Id) ->
?BASE_URL ++ io_lib:format("~p", [Id]).
fetch_page(Id) ->
Uri = to_url(Id),
{ok, {{_, Status, _}, _, Data}} = httpc:request(get, {Uri, []}, [], [{body_format,binary}]),
Status,
Data.
start() ->
inets:start(),
lists:foreach(fun(I) -> fetch_page(I) end, lists:seq(1, 200)).
1. Error message
socket_closed_remotely indicates that the server closed the connection, maybe because you made too many requests in a short timespan.
2. Parallellization
Create 20 worker processes and one process holding the URL queue. Let each process ask the queue for a URL (by sending it a message). This way you can control the number of workers.
An even more "Erlangy" way is to spawn one process for each URL! The upside to this is that your code will be very straightforward. The downside is that you cannot control your bandwidth usage or number of connections to the same remote server in a simple way.

Why does my concurrent Haskell program terminate prematurely?

I have a UDP server that reflects every ping message it receives (this works well I think). I the client side I would then like to do two things:
make sure that I fired off N (e.g. 10000) messages, and
count the number of correctly received responses.
It seems that either because of the nature of UDP or because of the forkIO thing, my client code below ends prematurely/does not do any counting at all.
Also I am very surprised to see that the function tryOnePing returns 250 times the Int 4. Why could this be?
main = withSocketsDo $ do
s <- socket AF_INET Datagram defaultProtocol
hostAddr <- inet_addr host
thread <- forkIO $ receiveMessages s
-- is there any better way to eg to run that in parallel and make sure
-- that sending/receiving are asynchronous?
-- forM_ [0 .. 10000] $ \i -> do
-- sendTo s "ping" (SockAddrInet port hostAddr)
-- actually this would be preferred since I can discard the Int 4 that
-- it returns but forM or forM_ are out of scope here?
let tryOnePing i = sendTo s "ping" (SockAddrInet port hostAddr)
pings <- mapM tryOnePing [0 .. 1000]
let c = length $ filter (\x -> x==4) pings
-- killThread thread
-- took that out to make sure the function receiveMessages does not
-- end prematurely. still seems that it does
sClose s
print c
-- return()
receiveMessages :: Socket -> IO ()
receiveMessages socket = forever $ do
-- also tried here forM etc. instead of forever but no joy
let recOnePing i = recv socket 1024
msg <- mapM recOnePing [0 .. 1000]
let r = length $ filter (\x -> x=="PING") msg
print r
print "END"
The main problem here is that when your main thread finishes, all other threads gets killed automatically. You have to get the main thread to wait for the receiveMessages thread, or it will in all likelyhood simply finish before any responses have been received. One simple way of doing this is to use an MVar.
An MVar is a synchronized cell that can either be empty or hold exactly one value. The current thread will block if it tries to take from an empty MVar or insert into a full one.
In this case, we don't care about the value itself, so we'll just store a () in it.
We'll start with the MVar empty. Then the main thread will fork off the receiver thread, send all the packets, and try to take the value from the MVar.
import Control.Concurrent.MVar
main = withSocketsDo $ do
-- prepare socket, same as before
done <- newEmptyMVar
-- we need to pass the MVar to the receiver thread so that
-- it can use it to signal us when it's done
forkIO $ receiveMessages sock done
-- send pings, same as before
takeMVar done -- blocks until receiver thread is done
In the receiver thread, we will receive all the messages and then put a () in the MVar to signal that we're done receiving.
receiveMessages socket done = do
-- receive messages, same as before
putMVar done () -- allows the main thread to be unblocked
This solves the main issue, and the program runs fine on my Ubuntu laptop, but there are a couple more things you want to take care of.
sendTo does not guarantee that the whole string will be sent. You'll have to check the return value to see how much was sent, and retry if not all of it was sent. This can happen even for a short message like "ping" if the send buffer is full.
recv requires a connected socket. You'll want to use recvFrom instead. (Although it still works on my PC for some unknown reason).
Printing to standard output is not synchronized, so you might want to alter this so that the MVar will be used to communicate the number of received packets instead of just (). That way, you can do all the output from the main thread. Alternatively, use another MVar as a mutex to control access to standard output.
Finally, I recommend reading the documentation of Network.Socket, Control.Concurrent and Control.Concurrent.MVar carefully. Most of my answer is stitched together from information found there.

Resources