Good tools to understand / reverse engineer a top-layer network protocol

There is an interesting problem at hand. I have a role-playing MMOG running through a client application (not a browser) which sends the actions of my player to a server which keeps all the players in sync by sending packets back.
Now, the game uses a top-layer protocol over TCP/IP to send the data. However, Wireshark does not recognize the protocol and shows everything beyond the TCP header as a raw dump.
Further, this dump does not contain any plain-text strings. Although the game has a chat feature, the chat messages being sent do not appear anywhere in the dump as plain text.
My task is to reverse engineer the protocol a little to find some very basic stuff about the data contained in the packets.
Does anybody know why the chat string is not visible as plain text, and whether it is likely that a standard top-level protocol is being used?
Also, are there any tools which can help to get the data from the dump?

If it's encrypted you do have a chance (in fact, you have a 100% chance if you handle it right): the key must reside somewhere on your computer. Just pop open your favorite debugger, watch for a bit (err, a hundred bytes or so I'd hope) of data to come in from a socket, set a watchpoint on that data, and look at the stack traces of things that access it. If you're really lucky, you might even see it get decrypted in place.
If not, you'll probably pick up on the fact that they're using a standard encryption algorithm (they'd be fools not to, from a theoretical security standpoint), either by looking at stack traces (if you're lucky) or by using one of the IV / S-box profilers out there (avoid the academic ones; most of them don't work without a lot of trouble). Many encryption algorithms use blocks of "standard data" that can be detected - these are the IVs and S-boxes - and they are what you look for in the absence of other information.
Whatever you find, google it, and try to override their encryption library to dump the data that's being encrypted/decrypted. From these dumps, it should be relatively easy to see what's going on.
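To make the S-box hunt concrete, here is a minimal sketch in Python (the dump file name is hypothetical) that scans a client binary or memory dump for the first row of the AES S-box; a hit is a strong hint that a standard cipher is linked in:

    # Scan a memory/binary dump for the first 16 bytes of the AES S-box.
    AES_SBOX_PREFIX = bytes([0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5,
                             0x30, 0x01, 0x67, 0x2B, 0xFE, 0xD7, 0xAB, 0x76])

    with open("game_client.dmp", "rb") as f:   # hypothetical dump of the client
        data = f.read()

    offset = data.find(AES_SBOX_PREFIX)
    if offset != -1:
        print(f"Possible AES S-box at offset {offset:#x}")
    else:
        print("No AES S-box found; try other known constants (DES S-boxes, zlib tables, ...)")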
REing an encrypted session can be a lot of fun, but it requires skill with your debugger and lots of reading. It can be frustrating but you won't be sorry if you spend the time to learn how to do it :)

Best guess: encryption, or compression.
Even telnet supports compression over the wire, even though the whole protocol is entirely text based (well, very nearly).
You could try running the data stream through some common compression utilities, but I doubt that'd do much for you: in all likelihood they don't transmit compression headers, and some predefined values are simply agreed on by both ends.
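That said, a brute-force check costs almost nothing to try. A minimal sketch, assuming the payload might be raw deflate with no headers (the capture file name is hypothetical):

    import zlib

    # Raw TCP payload, e.g. exported from Wireshark (hypothetical file name).
    payload = open("capture_payload.bin", "rb").read()

    # Try headerless ("raw") deflate starting at each of the first 64 offsets.
    for start in range(min(len(payload), 64)):
        d = zlib.decompressobj(wbits=-15)   # negative wbits = no zlib header
        try:
            out = d.decompress(payload[start:])
            if len(out) > 16:               # ignore tiny/garbage hits
                print(f"offset {start}: {len(out)} bytes -> {out[:40]!r}")
        except zlib.error:
            pass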
If it's in fact encryption, then you're pretty much screwed (short of much, much more effort, which I'm not even going to start to get into).

It's most likely either compressed or encrypted.
If it's encrypted you won't have a chance.
If it's compressed you'll have to somehow figure out which parts of the data are compressed, where the compressed parts start, and what the compression algorithm is. If you're lucky there will be standard headers you can identify, although they have probably been stripped out to save space.
None of this is simple. Reverse engineering is hard. There aren't any standard tools to help you; you'll just have to investigate and try things until you figure it out. My advice would be to ask the developers for a protocol spec and see if they are willing to support what you are trying to do.

What are some common methods used in game networking?

So I'm writing a fairly simple game with very low networking requirements, I'm using TCP.
I'm unsure where to start in even defining/implementing a protocol for the client and server to use. I've been looking around and I've seen a few examples, for instance Mojang's Minecraft which uses a table of 'commands' the client sends the server and the server sends the client, with numbers of arguments and such.
What's a good way to do this? I've heard complaints about Minecraft's protocol because if you overread by a byte you ruin the entire stream.
Game networking is a broad subject, and the answer depends on what type of problem you are solving. TCP may not even be the correct choice for you.
For example, games typically send character movement over UDP. The reasoning is that character movement isn't critical to the operation of the game, so some loss of movement data is "acceptable". That may be why your character sometimes "jumps": some UDP packets were lost, or arrived severely out of order.
UDP is often argued to be the preferred protocol for networked games. So before you even get started, carefully consider whether you are picking the correct protocol.
Overall, I consider Glenn Fiedler's series on developing a networked game a fantastic read. I'd start here. He covers all of the basics of using UDP for gaming.
If you want to use TCP simply to get a handle on TCP, then Minecraft is a reasonable example. A known list of commands that can be sent back and forth is a simple way to start. However, as you stated, it is prone to some problems; those have more to do with picking the wrong protocol than with how the scheme itself was implemented.
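One common way to avoid the overread problem is length-prefixed framing: every message declares its own size, so the reader always knows exactly how many bytes belong to it. A minimal sketch with an invented header layout:

    import struct

    HEADER = struct.Struct("!HB")   # 2-byte payload length + 1-byte command id (invented layout)

    def send_message(sock, cmd_id, payload):
        sock.sendall(HEADER.pack(len(payload), cmd_id) + payload)

    def recv_exactly(sock, n):
        buf = b""
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("peer closed mid-message")
            buf += chunk
        return buf

    def recv_message(sock):
        length, cmd_id = HEADER.unpack(recv_exactly(sock, HEADER.size))
        return cmd_id, recv_exactly(sock, length)   # body length is known up front

Because the length arrives before the body, a malformed payload can at worst corrupt one message rather than desynchronize the whole stream.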
Google "game networking library" and you'll get a bunch of results. GNE would be a good one to look at.
I guess it depends on what your game is, what its mechanics are, and what information is necessary. In any case, I think this Stack Exchange site, https://gamedev.stackexchange.com/, is better suited to answering your question.
Gamedev.net's networking forum has a great FAQ covering these sorts of questions and many others. However, to make this more than a 'go-there-look-at-that' answer, I'll suggest some small improvements you can make. When using TCP, delivery is guaranteed, but this has a speed cost, which is fine if you're not making an FPS; it does mean you need to get more out of the data you do send. A great way to do this is via deltas/differentials: send only the change in state, not the entire game state (sketched below). You can also validate incoming packets for corrupt/anomalous data, over and above TCP's own checks, by predicting which values are allowed, and the same prediction lets you cut out even more data. But as others have said, this is a broad question, and not suited to getting truly helpful answers.
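To illustrate the delta idea, a minimal sketch (field names invented):

    # Delta encoding: send only fields that changed since the last state the
    # other side acknowledged, instead of the entire game state.
    def delta(old_state, new_state):
        return {k: v for k, v in new_state.items() if old_state.get(k) != v}

    last_acked = {"x": 10, "y": 4, "hp": 100, "ammo": 24}
    current   = {"x": 11, "y": 4, "hp": 97,  "ammo": 24}

    print(delta(last_acked, current))   # {'x': 11, 'hp': 97} -- far smaller than the full state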
As you're coding in Lua, the only library anyone uses is luasocket (though ZMQ is gaining ground).
You're really going to have several protocols going: TCP for data that must be received (e.g., server commands such as changemap or you_got_kicked, conversations and such), then UDP for non-compulsory data, or data that quickly expires (e.g., character positions).
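In outline, the dual-channel setup looks like this (sketched in Python for brevity; luasocket exposes the same TCP and UDP primitives, and the address and message formats here are invented):

    import socket

    SERVER = ("203.0.113.5", 4000)   # documentation-range address; invented

    tcp = socket.create_connection(SERVER)                    # reliable: commands, chat
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)    # unreliable: positions

    tcp.sendall(b"JOIN alice\n")           # must arrive, in order
    udp.sendto(b"POS 10.5 4.2", SERVER)    # fire-and-forget; the next update supersedes it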

Networking problems in games

I am looking for networking designs and tricks specific to games. I know about a few problems and I have some partial solutions to some of them but there can be problems I can't see yet. I think there is no definite answer to this but I will accept an answer I really like. I can think of 4 categories of problems.
Bad network
The messages sent by the clients take some time to reach the server. The server can't just process them FCFS (first-come, first-served) because that is unfair to players with higher latency. A partial solution for this would be timestamps on the messages, but you need 2 things for that:
Be able to trust the client's clock. (I think this is impossible.)
Constant latencies you can measure. What can you do about variable latency?
A lot of games use UDP which means messages can be lost. In that case they try to estimate the game state based on the information they already have. How do you know if the estimated state is correct or not after the connection is working again?
In MMO games the server handles a large number of clients. What is the best way to distribute the load? Based on location in the game? Bind groups of clients to specific servers? Can you avoid sending everything through the server?
Players leaving
I have seen 2 different behaviours when this happens. In most FPS games, if the player who hosted the game (I guess he is the server) leaves, the others can't play. In most RTS games, if any player leaves, the others can continue playing without him. How is that possible without a dedicated server? Does everyone know the full state? Are they transferring the role of the server somehow?
Access to information
The next problem can be solved by a dedicated server, but I am curious whether it can be done without one. In a lot of games the players should not know the full state of the game. Fog-of-war in RTS and walls in FPS are good examples. However, they need to know whether an action is valid. (E.g., can you shoot me from there, or are you on the other side of the map?) In this case clients need to validate changes against an unknown state. This sounds like something that could be solved with clever use of cryptographic primitives. Any ideas?
Cheating
Some of the above problems are easy in a trusted client environment, but that cannot be assumed. Are there solutions which work, for example, in an 80% normal user / 20% cheater environment? Can you really make anti-cheat software that works (and does not require ridiculous things like kernel modules)?
I did read this question and some of the answers, https://stackoverflow.com/questions/901592/best-game-network-programming-articles-and-books, but other answers link to unavailable/restricted content. This is a platform/OS-independent question, but solutions for specific platforms/OSs are welcome as well.
Thinking cryptography will solve this kind of problem is a very common and very bad mistake: the client itself, of course, has to be able to decrypt it, so it is completely pointless. You are not adding security, you're just adding obscurity (and that will be cracked).
Cheating is too game-specific. There are some kinds of games where it can't be totally eliminated (aimbots in FPS), and some where, if you didn't screw up, it will not be possible at all (server-based turn games).
In general network problems like those are deeply related to prediction which is a very complicated subject at best and is very well explained in the famous Valve article about it.
The server can't just process them FCFS because that is unfair to players with higher latency.
Yes it can. Trying to guess exactly how much latency someone has is no fairer, since latency varies.
In that case they try to estimate the game state based on the information they already have. How do you know if the estimated state is correct or not after the connection is working again?
The server doesn't have to guess at all - it knows the state. The client only has to guess while the connection is down - when it's back up, it will be sent the new state.
In MMO games the server handles a large number of clients. What is the best way to distribute the load? Based on location in the game?
There's no "best way". Geographical partitioning works fairly well, however.
Can you avoid sending everything through the server?
Only for untrusted communications, which generally are so low on bandwidth that there's no point.
In most RTS games, if any player leaves, the others can continue playing without him. How is that possible without a dedicated server? Does everyone know the full state?
Many RTS games maintain the full state simultaneously across all machines.
Some of the above problems are easy in a trusted client environment, but that cannot be assumed.
Most games open to the public need to assume a 100% cheater environment.
Bad network
Players with high latency should buy a new modem. I don't think it's a good idea to add even more latency because one person in the game has a bad connection. Or, if you mean minor latency differences: who cares? You will only make things slower and more complicated if you refuse to process FCFS.
Cheating: aimbots and similar
Can you really make an anti-cheat software that works? No, you can not. You can't know if they are running your program or another program that acts like yours.
Cheating: access to information
If you have a secure connection with a dedicated server you can trust, then cheating, like seeing more state than allowed, should be impossible.
There are a few games where cryptography can prevent cheating: card games like poker, where every player gets a chance to 'shuffle the deck'. Details on Wikipedia: Mental Poker.
With a RTS or FPS you could, in theory, encrypt your part of the game state. Then send it to everyone and only send decryption keys for the parts they are allowed to see or when they are allowed to see it. However, I doubt that in 2010 we can do this in real time.
For example, say I want to verify that you could indeed be at location B. Then I need to know where you came from and when you were there. But if you've told me that before, I knew something I was not allowed to know. If you tell me afterwards, you can tell me anything you want me to believe. You could have told me beforehand, encrypted, and given me the decryption key only when I need to verify it. That would mean you'd have to encrypt every move you make with a different encryption key. Ouch.
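That 'announce now, reveal later' trick is essentially a cryptographic commitment. A minimal hash-based sketch (the random salt prevents brute-forcing the small space of possible moves):

    import hashlib, os

    def commit(move):
        salt = os.urandom(16)
        return hashlib.sha256(salt + move).hexdigest(), salt

    # At move time: broadcast only the commitment.
    commitment, salt = commit(b"move to B")

    # At verification time: reveal move + salt; anyone can recheck the hash.
    def verify(commitment, salt, move):
        return hashlib.sha256(salt + move).hexdigest() == commitment

    assert verify(commitment, salt, b"move to B")       # honest reveal passes
    assert not verify(commitment, salt, b"move to C")   # rewriting history fails

The expense worried about above is real, though: every hidden piece of state needs its own commitment and its own reveal.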
If you're not implementing a poker site, cheating won't be your biggest problem anyway.
With a lot of people accessing games on mobile devices, a "bad network" can occur when a player is in an area of poor reception or connected to slow Wi-Fi. So it's not just a problem of people connecting from sparsely populated areas. With mobile clients, "bad networks" can occur very often, and they are usually EXTREMELY hard to diagnose.
UDP means packet loss, but even games that are TCP- or HTTP-based can experience problems where client/server communication slows to a crawl while packets are verified to have been sent. With UDP, compensation for packet loss usually depends on what the packets contain. If you're talking about motion data and packets aren't received, the server usually extrapolates along the previous trajectory and applies a position change. How this is handled is usually custom to the game, which is why people often avoid UDP unless their game type requires it. Often, to handle high latency, games will automatically degrade the set of features available to users, so that they can still interact with the game without getting kicked or experiencing too many broken features.
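The extrapolation mentioned above can be as simple as projecting the last known position along the last known velocity; a minimal sketch with invented numbers:

    # Dead reckoning: when an update is missed, project the last known
    # position forward along its velocity until a real packet arrives.
    def extrapolate(pos, vel, dt):
        return tuple(p + v * dt for p, v in zip(pos, vel))

    last_pos = (100.0, 50.0)
    last_vel = (3.0, -1.0)    # units per second, from the last received packet

    # 120 ms with no packets: guess where the player probably is now.
    guessed = extrapolate(last_pos, last_vel, 0.120)
    print(guessed)   # (100.36, 49.88); snap or blend to the true position on the next packet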
Optimally you want a logging tool like Loggly that can help you find errors related to bad connections and latency, and show you the conditions on the clients and server at the time they happened; this visibility lets you diagnose the common problems users experience and develop strategies to address them.
Players leaving
Most games these days have dedicated servers, so this issue is mostly moot. However, sometimes yes, the server can be changed to another client.
Cheating
It's extremely hard to anticipate how players will cheat and to create a cheat-proof system no one can hack. These days, a lot of cheat-detection strategies are based on heuristic analysis of logging and behavioral-analytics data, to spot abnormalities when they happen and flag them for review. You definitely should try to cheat-proof as much as is reasonable, but you also really need an early detection system that can spot new flaws people are exploiting.

How can I use hardware solutions to create "unbreakable" encryption or copy protection?

Two types of problems I want to talk about:
Say you wrote a program you want to encrypt for copyright purposes (e.g., denying an unlicensed user access to a certain file, or disabling certain features of the program), but most software-based encryption can be broken by hackers (just look at the number of programs available to hack programs into "full versions").
Say you want to distribute software to other users but want to protect against piracy (i.e., another user making a copy of this software and selling it as their own). What effective ways are there to guard against this (similar to music protection on CDs, like DRM), both from a software perspective and a hardware perspective?
Or do those two belong to the same class of problems (dongles being the hardware/chip-based solution, as many noted below)?
So, can chip- or hardware-based encryption be used? And if so, what exactly is needed? Do you purchase a special kind of CPU or special kind of hardware? What do we need to do?
Any guidance is appreciated, thanks!
Unless you're selling this program for thousands of dollars a copy, it's almost certainly not worth the effort.
As others have pointed out, you're basically talking about a dongle, which, in addition to being a major source of hard-to-fix bugs for developers, is also a major source of irritation for users, and there's a long history of these supposedly "uncrackable" dongles being cracked. AutoCAD and Cubase are two examples that come to mind.
The bottom line is that a determined enough cracker can still crack dongle protection; and if your software isn't an attractive enough target for the crackers to do this, then it's probably not worth the expense in the first place.
Just my two cents.
Hardware dongles, as other people have suggested, are a common approach for this. This still doesn't solve your problem, though, as a clever programmer can modify your code to skip the dongle check - they just have to find the place in your code where you branch based on whether the check passed or not, and modify that test to always pass.
You can make things more difficult by obfuscating your code, but you're still back in the realm of software, and that same clever programmer can figure out the obfuscation and still achieve his desired goal.
Taking it a step further, you could encrypt parts of your code with a key that's stored in the dongle, and require the bootstrap code to fetch it from the dongle. Now your attacker's job is a little more complicated - they have to intercept the key and modify your code to think it got it from the dongle, when really it's hard-coded. Or you can make the dongle itself do the decryption, passing in the code and getting back the decrypted code - so now your attacker has to emulate that, too, or just take the decrypted code and store it somewhere permanently.
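Reduced to a toy, the shape of that scheme looks like this (the dongle call is a stand-in for a vendor SDK, and XOR stands in for a real cipher; nothing here is an actual product API):

    SECRET_KEY = b"\x42"                 # in the real scheme this lives only inside the dongle

    def dongle_get_key():                # stand-in for a hypothetical vendor SDK call
        return SECRET_KEY

    def decrypt(blob, key):              # XOR stands in for a real cipher
        return bytes(b ^ key[0] for b in blob)

    encrypted_module = decrypt(b"print('licensed feature unlocked')", SECRET_KEY)  # XOR is symmetric

    # Bootstrap: fetch the key from the dongle, decrypt the protected code, run it.
    # An attacker who observes this step once can keep the decrypted code forever.
    exec(decrypt(encrypted_module, dongle_get_key()))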
As you can see, just like software protection methods, you can make this arbitrarily complicated, putting more burden on the attacker, but history shows that the tables are tilted in favor of the attacker. While cracking your scheme may be difficult, it only has to be done once, after which the attacker can distribute modified copies to everyone. Users of pirated copies can now easily use your software, while your legitimate customers are saddled with an onerous copy protection mechanism. Providing a better experience for pirates than legitimate customers is a very good way to turn your legitimate customers into pirates, if that's what you're aiming for.
The only - largely hypothetical - way around this is called Trusted Computing, which relies on adding hardware to a user's computer that restricts what can be done with it to approved actions.
I would strongly counsel you against this route for the reasons I detailed above: You end up providing a worse experience for your legitimate customers than for those using a pirated copy, which actively encourages people not to buy your software. Piracy is a fact of life, and there are users who simply will not buy your software even if you could provide watertight protection, but will happily use an illegitimate copy. The best thing you can do is offer the best experience and customer service to your legitimate customers, making the legitimate copy a more attractive proposition than the pirated one.
They are called dongles; they fit in the USB port (nowadays) and contain their own little computer and some encrypted memory.
You can use them to check that the program is valid by testing whether the hardware dongle is present, you can store encryption keys and other info in the dongle, or sometimes you can have some program functions run inside the dongle. The idea is that the dongle is harder to copy and reverse engineer than your software.
See DESkey or HASP (they seem to have been taken over).
Back in the day I saw hardware dongles on the parallel port. Today you use USB dongles; see the Wikipedia article on dongles.

Why is the HTTP protocol designed as plain text?

Yesterday, I had a discussion with my colleagues about HTTP. The question was why HTTP is designed as a plain-text protocol. Surely it could have been designed in a binary fashion, just like the TCP protocol, using flags to represent the different kinds of methods (POST, GET) and variables (HTTP headers). So why is HTTP designed this way? Are there technical or historical reasons?
A reason that's both technical and historical is that text protocols are almost always preferred in the Unix world.
Well, this is not really a reason but a pattern. The rationale behind it is that text protocols allow you to see what's going on on the network by just dumping everything that goes through. You don't need a specialized analyzer as you do for TCP/IP. This makes them easier to debug and easier to maintain.
Not only HTTP, but many protocols are text based (e.g., FTP, POP3, SMTP, IMAP).
You might want to take a look at The Art of Unix Programming for a much more detailed explanation of this Unix thing.
With HTTP, the content of a request is almost always orders of magnitude larger than the protocol overhead. Converting the protocol to a binary one would save very little bandwidth, and the easy debuggability a text protocol offers easily trumps the minor bandwidth savings of a binary one.
Many Internet application protocols use more or less plain text for the protocol (see FTP, POP, SMTP, etc.).
It makes interoperability and troubleshooting much easier.
HTTP stands for "Hypertext Transfer Protocol".
It was initially devised as a way to serve text documents, hence the text based protocol.
What we do with HTTP now is far beyond its original intent.
Per RFC 2616 section 3.7.1 for HTTP 1.1, the key delimiter for a command line or header is the text line break CRLF; text-based application protocols make it easier to carry out a conversation (for troubleshooting) purely with a Telnet client. They also make it easier to program with ReadLine() calls and text-string matching.
The CRLF delimiter also allows near-unlimited arbitrary header extensions, unlike the fixed-size TCP or IP headers, where fields are hard-coded at bit offsets.
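That property is easy to demonstrate: the sketch below speaks HTTP/1.1 to a real server over a raw socket and parses the response with nothing fancier than CRLF-delimited line reads:

    import socket

    sock = socket.create_connection(("example.com", 80))
    sock.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")

    f = sock.makefile("rb")
    print(f.readline().decode().strip())   # status line, e.g. "HTTP/1.1 200 OK"
    for line in f:                         # headers end at an empty CRLF line
        if line == b"\r\n":
            break
        print(line.decode().strip())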
So it's easier to "read" the traffic or create a client or server?
You can debate whether it actually makes it easier, but surely that was the intent.
In the case of HTTP, some people have worked on a "binary" version of it, called Embedded Binary HTTP (EBHTTP):
https://datatracker.ietf.org/doc/html/draft-tolle-core-ebhttp-00
Historically, it all starts from RFC 822 (STANDARD FOR THE FORMAT OF ARPA INTERNET TEXT MESSAGES), whose latest version is RFC 5322 (Internet Message Format). SMTP (RFC 821) was one of the most popular protocols based on RFC 822, and HTTP was born out of SMTP (your mail protocol).
I like the:
...preferred in the Unix world.
reason, but it doesn't go into any explanation for why.
In order to understand why you need to place yourself into the shoes of a designer that wants to make a usable product.
A) You can document the shit out of meaningless gibberish (binary).
B) Develop or hope others develop tools that portray your meaningless gibberish in a meaningful way.
or
A) You can document the shit out of meaningful text that takes advantage of language as a tool for a self-documenting protocol.
B) There is no immediate need for additional tools, and additional tools will be much easier to write and debug.
It allows staged delivery and creates something that is easier to comprehend and recall when doing future development. It also creates a situation where a higher-level abstraction is no longer necessary.
Imagine a world where setting a header value isn't as simple as a dictionary/map somewhere in your framework. When running into errors, you'd have to constantly question whether your framework is correct, because you couldn't easily see that it's doing the right thing without additional tools. That would be the world of HTTP if each framework needed to invent/implement its own higher-level abstraction (browsers come to mind).
Many protocol designers want efficiency; this design focuses on usability, which is paramount in the software development industry. Unusable tools that are prematurely optimized create an unnecessary burden for software developers, and this burden manifests across the board.
Now there is HTTP/2, which is binary-based and much less error-prone:
https://http2.github.io/faq/#why-is-http2-binary

Is it possible to downsample an audio stream at runtime with Flash or FMS?

I'm no expert in audio, so if any of you folks are, I'd appreciate your insights on this.
My client has a handful of MP3 podcasts stored at a relatively high bit rate, and I'd like to be able to serve those files to her users at "different" bit rates depending on that user's credentials. (For example, if you're an authenticated user, you might get the full, unaltered stream, but if you're not, you'd get a lower-bit-rate version -- or at least a purposely tweaked lower-quality version than the original.)
Seems like there are two options: downsampling at the source and downsampling at the client. In this case, knowing of course that the source stream would arrive at the client at a high bit rate (and that there are considerations to be made about that, which I realize), I'd prefer to alter the stream at the client somehow, rather than on the server, for several reasons.
Is doing so possible with the Flash Player and ActionScript alone, at runtime (even with a third-party library), or does a scenario like this one require a server-based solution? If the latter, can Flash Media Server handle this requirement specifically? Again, I'd like to avoid using FMS if I can, since she doesn't really have the budget for it, but if that's the only option and it's really an option, I'm open to considering it.
Thanks in advance...
Note: Please don't question the sanity of the request -- I realize it might sound a bit strange, but the requirements are what they are. In that light, for purposes of answering the question, you can ignore the source and delivery path of the bits; all I'm really looking for is an explanation of whether (and ideally how) a Flash client can downsample an MP3 audio stream at runtime, irrespective of whether the audio's arriving over a network connection or being read directly from disk. Thanks much!
I'd prefer to alter the stream at the client somehow, rather than on the server, for several reasons.
Please elucidate the reasons, because resampling on the client end would normally be considered crazy: you waste bandwidth sending the higher-quality version to a user who cannot hear it, and you risk a canny user ripping the higher-quality stream as it comes in over the network.
In any case the Flash Player doesn't give you the tools to process audio, only play it.
You shouldn't need FMS to process audio at the server end. You could have a server-side script that loaded the newly-uploaded podcasts and saved them back out as lower-bitrate files which could be served to lowly users via a normal web server. For Python see eg. PyMedia, py-lame; or even a shell script using lame or ffmpeg or something from the command line should be pretty easy to pull off.
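For example, a few lines of Python driving ffmpeg would cover the re-encoding step (assuming ffmpeg is on the PATH; the file names and the 48 kbps target are hypothetical):

    import subprocess

    # Re-encode an uploaded podcast to a lower bitrate once, then serve the
    # small file to unauthenticated users from a normal web server.
    subprocess.run(
        ["ffmpeg", "-i", "podcast_full.mp3",
         "-codec:a", "libmp3lame", "-b:a", "48k",
         "podcast_preview.mp3"],
        check=True,
    )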
If storage is at a premium, have you looked into AAC audio? I believe Flash 9 and 10 on desktop browsers will play it. AAC in my experience takes only half the size of a comparable MP3 (i.e., an 80 kbps AAC will sound the same as a 160 kbps MP3).
As for playback quality, if I recall correctly there are audio playback settings in the Publish Settings section of the Flash editor. Whether or not the playback bitrate can be changed at runtime is something I'm not sure of.
