collectd not sharing information to graphite for all data - graphite

I have a strange one.
A number of data items are being collected by collectd and appear correctly with
collectdctl -s /var/run/collectdctl listval|getval and so forth.
These are then rendered into graphite effectively for most items.
Recently, the collectd-graphite connection ceased to be operational
for several recently added items. While it appears in collectd and
is queryable via collectdctl, it remains not on the graphite page.
I am asking to find out how you would approach this.
Thanks for any comment.

There's probably a number of ways you can troubleshoot this, but I end up almost always resorting to tcpdump, sigh. First enable debug logging in collectd just to make sure it really doesn't spit out an error message (LogLevel "debug" https://collectd.org/wiki/index.php/Plugin:LogFile although often collectd is compiled with debug logging disabled).
Then run tcpdump on the graphite server using the -s0 -X flags to tcpdump so you get the packet contents. (You can also use a more sophisticated network sniffer that prints the tcp data stream.) Check whether you see the data items that are missing the packets and whether they look appropriate (see https://collectd.org/wiki/index.php/Plugin:Write_Graphite). Typically this step allows me to quickly determine whether the problem is the sending collectd or the receiving service.

Related

Wireshark Student - I can't see any http post or get requests

I am a student and today for a lab, we were asked to install and use Wireshark. The installation went well, I installed the correct version, installed WinPcap, and the program started without any issues.
I was connected to the University's Wifi and as part of our lab we had to visit http://www.cas.mcmaster.ca/~rzheng/course/CAS4C03W17/Labs/INTRO-wireshark-file1.html and answer questions about the data captured in Wireshark.
Problem is, I am not getting any get or post requests, filtering by http.request.method == "GET" shows nothing, and http.request.method == "POST" shows nothing as well. Filtering by http shows the 200 OK and 304 Not Modified (if I refresh).
I was the only one in my lab who had this problem, and my instructor wasn't able to figure it out. He saved and sent me his output which has Get and Post requests so I can continue my work.
Did anyone have this problem before or have any idea on how to solve it? I can upload the saved outputs if you think it would help. Thanks!
Capture sample looks like it's filtered, since it contains only packets sent to your PC IP address. What is missing:
There is not a single outgoing packet, despite they are obviously on the net. E.g. there are "TCP acknowledge" packets received by PC in capture file, but packets sent by PC, which are acknowledged by them, aren't shown.
Not a single incoming broadcast/multicast packet. This situation is possible, but not very likely.
So there is some trouble with sniffer setup on your site. Possible explanations:
accidentally configured capture filter (don't mix with display filter)
Some interfering software is installed. Example of the same complaint
Method to determine if issue is gone: apply !(ip.dst == YOUR_IP_ADDR) display filter and check if packets output isn't empty on visiting any web page. Possible plan of troubleshooting:
check capture filter
check different network card (e.g. non-wireless connection)
check wireshark operability in pure environment (e.g. liveUSB)
try removing suspected interfering software

Load balancing TCP traffic using Apache Camel with Netty leads to transaction failures

I am new to Apache Camel and Netty and this is my first project. I am trying to use Camel with the Netty component to load balance heavy traffic in a back end load test scenario.This is the setup I have right now:
from("netty:tcp:\\this-ip:9445?defaultCodec=false&sync=true").loadBalance().roundRobin().to("netty:tcp:\\backend1:9445?defaultCodec=false&sync=true,netty:tcp:\\backend2:9445?defaultCodec=false&sync=true)
The issue is unexpected buffer sizes that I am receiving in the response that I see in the client system sending tcp traffic to Camel. When I send multiple requests one after the other I see no issues and the buffer size is as expected. But, when I try running multiple users sending similar requests to Camel on the same port, I intermittently see unexpected buffer sizes, sometimes 0 bytes to sometimes even greater than the expected number of bytes. I tried playing around with multiple options mentioned in the Camel-Netty page like:
Increasing backlog
keepAlive
buffersizes
timeouts
poolSizes
workerCount
synchronous
stream caching (did not work)
disabled useOriginalMessage for performance
System level TCP parameters, etc. among others.
I am yet to resolve the issue. I am not sure if I'm fundamentally missing something. I did take a look at the encoder/decoders and guess if that could be an issue. But, I don't understand why a load balancer needs to encode/decode messages. I have worked with other load balancers which just require endpoint configurations and hence, I am assuming that Camel does not require this. Am I right? Please know that the issue is not with my client/backend as I ran a 2000 user load test from my client to the backend with less than 1% failures but see a large number of failure ( not that there are no successes) with Camel. I have the following questions:
1.Is this a valid use-case for Apache Camel- Netty? Should I be looking at Mina or others?
2.Can I try to route tcp traffic to JMS or other components and then finally to the tcp endpoint?
3.Do I need encoders/decoders or should this configuration work?
4.Should I continue with this approach or try some other load balancer?
Please let me know if you have any other suggestions. TIA.
Edit1:
I also tried the same approach with netty4 and mina components. The route looks similar to the one in netty. The route with netty4 is as follows:
from("netty4:tcp:\\this-ip:9445?defaultCodec=false&sync=true").to("netty4:tcp:\\backend1:9445?defaultCodec=false&sync=true")
I read a few posts which had the same issue but did not find any solution relevant to my issue.
Edit2:
I increased the receive timeout at my client and immediately noticed the mismatch in expected buffer length issue fall to less than 1%. However, I see that the response times for each transaction when using Camel and not using it is huge; almost 10 times higher. Can you help me with reducing the response times for each transaction? The message received back at my client varies from 5000 to 20000 bytes. Here is my latest route:
from("netty:tcp://this-ip:9445?sync=true&allowDefaultCodec=false&workerCount=20&requestTimeout=30000")
.threads(20)
.loadBalance()
.roundRobin()
.to("netty:tcp://backend-1:9445?sync=true&allowDefaultCodec=false","netty:tcp://backend-2:9445?sync=true&allowDefaultCodec=false")
I also used certain performance enhancements like:
context.setAllowUseOriginalMessage(false);
context.disableJMX();
context.setMessageHistory(false);
context.setLazyLoadTypeConverters(true);
Can you point me in the right direction about how I can reduce the individual transaction times?
For netty4 component there is no parameter called defaultCodec. It is called allowDefaultCodec. http://camel.apache.org/netty4.html
Also, try something like this first.
from("netty4:tcp:\\this-ip:9445?textline=true&sync=true").to("netty4:tcp:\\backend1:9445?textline=true&sync=true")
The above means the data being sent is normal text. If you are sending byte or something else you will need to provide decoding/encoding for netty to handle the data.
And a side note. Before running the Camel route, test manually to send test messages via a standard tcp tool like sockettest to verify that everything works. Then implement the same via Camel. You can find sockettest here http://sockettest.sourceforge.net/ .
I finally solved the issue with the same route settings as above. The issue was with the Request and Response Delimiter not configured properly due to which it was either closing the connection too early leading to unexpected buffer sizes or it was waiting too long even after the entire buffer was received leading to high response times.

How to test your client server program

I'm making a multiplayer game and often i want to test out if it perfectly works on global network, because sometime it's just works locally, so how could i do that without sending my client to friend to test it out.
If you want to test it for the "global network" - you have to test it that way. There are multiple things that can go wrong which are not an issue on a local network. Just off the top of my head
latency (important for a game)
NAT (common cause of problems depending on your game architecture - more so if P2P)
security
connection errors (Wifi/3G intermittent loss)
There are many aspects of networking that are often subtly different when you're talking over the internet rather than running on either localhost or a local network.
All manner of delays can occur, and this can throw out some poor assumptions in your code.
The TCP flow can be affected by TCP's flow control (when the TCP Window fills up the sender will stop sending and report the fact to you (or maybe not report it, if you're using async APIs)).
TCP reads that return 'complete messages' on localhost and your own network may start to return pieces of a message.
UDP datagrams may go missing and never arrive or may arrive multiple times or in any sequence.
So you're right to think that you need to test your code for these edge cases which rarely show up on your own network. You're also right to think that simply sending a client to a friend, or running it on a remote machine, is not enough.
One approach is to build a dedicated test client which sends known game play and checks that it gets expected responses (how hard this is depends on your protocol and your game). Once you have that working you then have the test client deliberately send data in such a way that the items above are tested. So, if you're using UDP, you might put some code in your test client so that it sometimes doesn't bother to send a UDP datagram at all. The client should think it sent it. The networking layer simply ditches it. This tests your UDP protocol for missing datagrams. Then send some datagrams multiple times, then send some out of sequence, etc. For TCP add delays, break logical "messages" into separate network sends with large delays between them; ideally send each distinct message type as a sequence of single bytes to check that the server's 'message accumulation code' works correctly.
Once you have done this you need to do the same for your client code, perhaps by adding a "fuzzing" option to your server's network code to do the same kind of thing...
Personally I tend to try and take a step back and do as much of this as possible in dedicated 'unit tests' (I know that some people will say that these aren't unit tests, call them what you like, just write them!). These tests exercise your networking layer using real networking (talking to a dummy server/client that the test creates) and validate the horrible edge cases.

How to debug Websockets?

I want to monitor the websocket traffic (like to see what version of the protocol the client/server is using) for debugging purposes. How would I go about doing this? Wireshark seems too low level for such a task. Suggestions?
Wireshark sounds like what you want actually. There is very little framing or structure to WebSockets after the handshake (so you want low-level) and even if there was, wireshark would soon (or already) have the ability to parse it and show you the structure.
Personally, I often capture with tcpdump and then parse the data later using wireshark. This is especially nice when you may not be able wireshark on the device where you want to capture the data (i.e. a headless server). For example:
sudo tcpdump -w /tmp/capture_data -s 8192 port 8000
Alternately, if you have control over the WebSockets server (or proxy) you could always print out the send and receive data. Note that since websocket frames start with '\x00' will want to avoid printing that since in many languages '\x00' means the end of the string.
If you're looking for the actual data sent and received, the recent Chrome Canary and Chromium have now WebSocket message frame inspection feature.
You find details in this thread.
I think you should use Wireshark
Steps
Open wireshark
Go to capture and follow bellow path: capture > interfaces > start capture in your appropriate device.
Write rules in filter tcp.dstport == your_websoket_port
Hit apply
For simple thing, wireshark is too complex, i wanted to check only if the connection can be establish or not. Following Chrome plugin "Simple Web-socket (link : https://chrome.google.com/webstore/detail/simple-websocket-client/pfdhoblngboilpfeibdedpjgfnlcodoo?hl=en)" work like charm. See image.
https://lh3.googleusercontent.com/bEHoKg3ijfjaE8-RWTONDBZolc3tP2mLbyWanolCfLmpTHUyYPMSD5I4hKBfi81D2hVpVH_BfQ=w640-h400-e365

Using NetConnection and URLStream to send/recieve data at high frequency

I'm writing a Comet-like app using Flex on the client and my own hand-written server.
I need to be able to send short bursts of data from the client at quite a high frequency (e.g. of the order of 10ms between sends).
I also need the server to push short bursts of data at a similarly high frequency.
I'm using NetConnection.call() to send the data to the server, and URLStream (with chunked encoding) to push the data from the server to the client.
What I've found is that the data isn't being sent/received as soon as it's available. For example, in IE, it seems the data is sent every 200ms rather than as soon as NetConnection.call() is called. Similarly, URLStream isn't making the data available as soon as the server is sending it.
Judging by the difference in behaviour between the browsers, it seems as though the Flash Player (version 10) is relying on the host browser to do all the comms. Can anyone confirm this? Update: This is very likely as only the host browser would know about the proxy settings that might be set.
I've tried using the Socket class and there's no problem with speed there: it works perfectly. However, I'd like to be able to use HTTP-based (port 80) connections so that my app can run in heavily fire-walled environments (I tried using a Socket over port 80, but that has its problems).
Incidentally, all development/testing has been done on an internal LAN, so bandwidth/latency is not an issue.
Update: The data being sent/received is in small packets and doesn't need to be in any particular format. For example, I might need to send a short array of Numbers, and this could either be encoded in AMF (e.g. via NetConnection.call()) or could be put into GET parameters (e.g. using sendToURL()). The main point of my question is really to see whether anyone else has experienced the same problem in calling NetConnection/URLStream frequently, and whether there is a workaround (it's also possible that the fault lies with my server code of course, rather than Flash).
Thanks.
Turns out the problem had nothing to do with Flash/Flex or any of the host browsers. The problem was in my server code (written in C++ on Linux), and without access to my source code the cause is hard to find (so I couldn't have hoped for an answer from this forum).
Still - thank you everyone who chipped in.
It was only after looking carefully at the output shown in Wireshark that I noticed the problem, which was twofold:
Nagle's algorithm
I was sending replies in multiple packets by calling write() multiple times (e.g. once for the HTTP response header, and again for the HTTP response body). The server's TCP/IP stack was waiting for an ACK for the first packet before sending the second, but because of Nagle's algorithm the client was waiting 200ms before sending back the ACK to the first packet, so the server took at least 200ms to send the full HTTP response.
The solution is to use send() with the flag MSG_MORE until all the logically connected blocks are written. I could also have used writev() or setsockopt() with TCP_CORK, but it suited my existing code better to use send().
Chunk-encoded streams
I'm using a never-ending HTTP response with chunk encoding to push data back to the client. Naggle's algorithm needs to be turned off here because even if each chunk is written as one packet (using MSG_MORE), the client OS TCP/IP stack will still wait up to 200ms before sending back an ACK, and the server can't push a subsequent chunk until it gets that ACK.
The solution here is to ask the server not to wait for an ACK for each sent packet before sending the next packet, and this is done by calling setsockopt() with the TCP_NODELAY flag.
The above solutions only work on Linux and aren't POSIX-compliant (I think), but that isn't a problem for me.
I'm almost 100% sure the player relies on the browser for such communications. Can't find an official page stating so atm, but check this out for example:
Applications hosting the Flash Player
ActiveX control or Flash Player
plug-in can use the
EnforceLocalSecurity and
DisableLocalSecurity API calls to
control security settings.
Which I think somehow implies the idea. Also, I've suffered some network related bugs on FF/IE only which again points out to the player using each browser for networking (otherwise there wouldn't be such differences).
And regarding your latency problem, I think that if speed is critical, your best bet is sockets. You have some work to do, but seems possible, check out the docs again:
This error occurs in SWF content.
Dispatched if a call to
Socket.connect() attempts to connect
either to a server outside the
caller's security sandbox or to a port
lower than 1024. You can work around
either problem by using a cross-domain
policy file on the server.
HTH,
Juan

Resources