I'm rewriting a Python Twisted server in C++ using Asio. I have set up the following examples from
http://www.boost.org/doc/libs/1_39_0/doc/html/boost_asio/tutorial.html:
Daytime.1 - A synchronous TCP daytime client
Daytime.3 - An asynchronous TCP daytime server
and they seem to be functioning correctly. However, what is puzzling me is that when I created the twisted server both it and the client side required explicit specification of the IP addresses and port numbers. I am having a slightly different experience here:
1. On the client application no specification of the port number is required. I can successfully connect to the server by using only 127.0.0.1 as a command line argument.
2. Also, I appear to be able to connect to the same server with any legal variant of this IP address, as long as the first byte is 127 (e.g. 127.1.2.3 connects).
3. There's a literal in the client code specifying to connect using what I assume is an OS-provided "daytime" TCP service. However, there is no reference to this in the server code. Why do I have to specify a particular service to connect to? I also suspect that this service could be related to the behaviour in points 1 and 2.
Now I know that the server has an acceptor socket listening that only establishes the connection once it receives a request but I would like some more details here.
Daytime is a well-known service in the *nix world. You can get the list of known services by looking at the /etc/services file, where you will find the following records:
daytime 13/udp # Daytime (RFC 867)
daytime 13/tcp # Daytime (RFC 867)
When a service name is provided along with the host name, resolving the TCP endpoint ends up in the following call to getaddrinfo:
int error = ::getaddrinfo(host, service, &hints, result);
Looking at the man page [Emphasis mine]:
int
getaddrinfo(const char *hostname, const char *servname, const struct addrinfo *hints, struct addrinfo **res)
The hostname and servname arguments are either pointers to NUL-terminated strings or the null pointer. An acceptable
value for hostname is either a valid host name or a numeric host address string consisting of a dotted decimal IPv4
address or an IPv6 address. The servname is either a decimal port number or a *service name listed in services(5)*. At
least one of hostname and servname must be non-null.
So, in short, given a valid service name, getaddrinfo knows the correct port number to use: 13 in the case of the "daytime" service.
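As a rough illustration of that lookup, the sketch below parses services(5)-style records, using the two daytime lines quoted above as sample data, and maps a service name to its port. The names `lookup_port` and `SAMPLE_SERVICES` are made up for this example; on a real system the Python standard library's `socket.getservbyname("daytime", "tcp")` consults the services database for you.

```python
# Sample records in services(5) format, copied from the lines quoted above.
SAMPLE_SERVICES = """\
daytime 13/udp # Daytime (RFC 867)
daytime 13/tcp # Daytime (RFC 867)
"""


def lookup_port(service, protocol, services_text=SAMPLE_SERVICES):
    """Return the port for (service, protocol), or None if it is not listed."""
    for line in services_text.splitlines():
        # Drop the trailing comment, then split "name port/proto [aliases...]".
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        name, port_proto, *aliases = fields
        port, _, proto = port_proto.partition("/")
        if proto == protocol and service in (name, *aliases):
            return int(port)
    return None


print(lookup_port("daytime", "tcp"))  # 13
```

This is why the client can say "daytime" while the server code never mentions the name: the name is only a lookup key for the port number.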
I am trying to create a TCP server which will store incoming TCP packets as binary, for a key/value store. I already have an Erlang client which can send TCP packets to my Erlang server; however, for the sake of completeness, I want to allow the user to send TCP packets from a command line using clients such as NetCat. The user would adhere to a spec of how to format the data in the TCP packet so that the server will be able to understand it. For example:
$ nc localhost 8091
add:key:testKey
Key Saved!
add:value:testValue
Value Saved!
get:key:testKey
Value: testValue
The user interacts with the server by using the add:key:, add:value: and get:key: prefixes. Whatever comes after the prefix should be taken literally and passed to the server. This means a situation like the following should be possible, if the user wanted it:
$ nc localhost 8091
add:key:{"Foo","Bar"}
Key Saved!
add:value:["ferwe",324,{2,"this is a value"}]
Value Saved!
get:key:{"Foo","Bar"}
Value: ["ferwe",324,{2,"this is a value"}]
However, this doesn't seem possible to do as what actually happens is as follows...
I will pre-fill the Erlang key/value store (using ETS) from my Erlang client with a key of {"Foo","Bar"} and a value of ["ferwe",324,{2,"this is a value"}]. These are a tuple and a list respectively (in this example), as this key/value store has to be able to accommodate ANY valid Erlang data type.
So in the example, currently there is 1 element in the ETS table:
Key           | Value
{"Foo","Bar"} | ["ferwe",324,{2,"this is a value"}]
I then want to retrieve that entry using NetCat by giving the Key, so I type in NetCat...
$ nc localhost 8091
get:key:{"Foo","Bar"}
My Erlang server receives this as <<"{\"Foo\",\"Bar\"}\n">>
My Erlang server is set up to receive binary, which is not an issue.
My question is therefore: can NetCat be used to send unencoded packets that don't escape the quote marks,
such that my server receives the key as just <<"{"Foo","Bar"}">>?
Thank you.
> My question is therefore, can NetCat be used to send unencoded Packets which doesn't escape the quote marks.
Yes, netcat sends exactly what you give it, so in this case it sends get:key:{"Foo","Bar"} without escaping the quote marks. The backslashes in <<"{\"Foo\",\"Bar\"}\n">> are only how Erlang prints a binary that contains quote characters; they are not part of the data.
> Such that my Server is able to receive the Key and just <<"{"Foo","Bar"}">>
<<"{"Foo","Bar"}">> is not a syntactically correct Erlang term. Do you want to get the tuple {"Foo","Bar"} instead, in order to look it up in the ETS table? You can do it by parsing the binary:
Bin = <<"{\"Foo\",\"Bar\"}\n">>,
%% need to add a dot at the end for erl_parse
{ok, Tokens, _} = erl_scan:string(binary_to_list(Bin) ++ "."),
{ok, Term} = erl_parse:parse_term(Tokens),
ets:lookup(my_table, Term).
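To illustrate the point about netcat outside of Erlang, here is a small Python sketch (the variable names are hypothetical) of the bytes a server would receive for that netcat line. The payload contains plain quote characters and no backslashes, which suggests the escaping seen on the server is purely a matter of how the binary is displayed.

```python
# The exact bytes netcat would send for the typed line, newline included.
payload = b'get:key:{"Foo","Bar"}\n'

# Strip the command prefix and the trailing newline to get the raw key text.
prefix = b"get:key:"
key = payload[len(prefix):].rstrip(b"\n")

print(key.decode())   # {"Foo","Bar"}
print(b"\\" in key)   # False: no backslashes were ever on the wire
print(len(key))       # 13 bytes: each quote character counts once
```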
From my understanding, gRPC's protocol is HTTP/2 on top of TCP. Therefore, the maximum port that can be assigned is 65535, right? But if I set the server address to something like "0.0.0.0:70040", which is out of range, it still works fine and the gRPC client is able to connect and communicate with the gRPC server. What does this actually mean? Thank you.
builder.AddListeningPort("0.0.0.0:70040", grpc::InsecureServerCredentials());
I suspect the issue is that using a string for the endpoint address prevents the compiler from validating that the port fits in an unsigned 16-bit value.
70040 by my early morning calculation is 10001000110011000 in binary, which is 17 bits. Ports must fit in 16 bits, so one of these bits will be getting dropped, probably the most significant one (1).
Dropping that yields 1000110011000, which corresponds to 4504, and that is likely the actual port being used.
You could try running the server on 70040 and connecting to it from the client using 4504.
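The arithmetic above can be checked directly. This sketch only reproduces the 16-bit truncation that the answer conjectures; it does not show what gRPC actually does with an out-of-range port, which would still need to be confirmed by testing.

```python
requested_port = 70040

# Keep only the low 16 bits, as an unsigned 16-bit port field would.
truncated_port = requested_port & 0xFFFF

print(bin(requested_port))           # 0b10001000110011000
print(requested_port.bit_length())   # 17
print(bin(truncated_port))           # 0b1000110011000
print(truncated_port)                # 4504
```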
Should a server application that listens on a port be able to detect and log any connection attempt made by SYN scanning?
Test Scenario
I wrote a Windows program which I simply called "simpleServer.exe".
This program is just a simulation of a very basic server application.
It listens on a port and waits for incoming messages.
The listening socket is defined as a TCP stream socket.
That's all this program does.
I deployed this exact same program on 2 different machines, both running Windows 7 Professional 64-bit.
These machines act as hosts,
and they are on the same network.
Then, using the program "nmap",
I used another machine on the same network to act as a client.
Using the "-sS" parameter of "nmap", I did a SYN scan against the IP and port of the listening simpleServer on both machines (one attempt at a time).
(Note that both hosts already had "wireshark" running, monitoring TCP packets from the client's IP to the listening port.)
In the "wireshark" capture on both machines, I saw the expected TCP packets for a SYN scan:
client ----(SYN)----> host
client <--(SYN/ACK)-- host
client ----(RST)----> host
The above packet exchange suggests that the connection was not established.
But of the two "simpleServer.exe" instances, only one printed "new incoming connection" in its logs; the other instance was not alerted of any new incoming connection and logged nothing at all.
Code Snippets
// socket bind and listen was done above this loop
while(TRUE)
{
    sClient=accept(sListen,(SOCKADDR*)&remoteAddr,&nAddrLen);
    if(sClient == INVALID_SOCKET)
    {
        printf("Failed accept()");
        continue;
    }
    dwSockOpt (sListen);
    printf ("recv a connection: %s\n", inet_ntoa(remoteAddr.sin_addr));
    closesocket(sClient);
}
Side note:
Yes, since it is just a simple program, the flow might be a little funny, such as having no break in the while loop. Please don't mind this simple and flawed design.
Further Investigation
I also put a getsockopt() call in "simpleServer" right after it went into the listening state, to compare the SOL_SOCKET options of the listening sockets on both hosts.
One notable difference I found between the two hosts is SO_MAX_MSG_SIZE.
The host that detects the incoming connection has a hex value of 0x3FFFFFFF (1073741823), while the one that logs nothing has 0xFFFFFFFF (-1). I'm not sure whether this is related; I am just listing every difference I could find in my test environment. The other SOL_SOCKET values are more or less the same.
Side note: I tested on some other machines, covering another Windows 7 Professional, Windows Server 2008 R2, and Windows Server 2003. The machines that have SO_MAX_MSG_SIZE == -1 all failed to detect the SYN scan connection. It may just be a coincidence; I have nothing to prove it either way.
Help That I Needed
Why does the same application behave differently on two machines running the same OS?
What determines the value of SO_MAX_MSG_SIZE, given that two machines with the same OS report different values?
If a connection is never established, accept() will never return. That disposes of 90% of your question.
The only explanation for the 'new incoming connection' (or 'recv a connection' or whatever it is) message is that something else connected.
SO_MAX_MSG_SIZE has no meaning for a TCP socket, let alone a listening TCP socket. So whatever variation you experienced is meaningless.
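The first point can be demonstrated on a loopback socket. The sketch below is in Python rather than Winsock C, and the function name `accept_demo` is invented for the example; it assumes loopback connections are permitted on the machine. It cannot send a bare SYN the way nmap does, but it shows the other side of the claim: accept() has nothing to return until some peer has completed the three-way handshake.

```python
import socket


def accept_demo():
    """Show that accept() only yields connections whose handshake completed."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(5)
    port = listener.getsockname()[1]

    # Nothing has connected yet, so a non-blocking accept() has nothing to return.
    listener.setblocking(False)
    try:
        listener.accept()
        accepted_early = True
    except BlockingIOError:
        accepted_early = False

    # A full connect() completes the three-way handshake...
    client = socket.create_connection(("127.0.0.1", port))
    # ...and only now does accept() hand back a connected socket.
    listener.settimeout(5)
    conn, peer = listener.accept()

    for s in (conn, client, listener):
        s.close()
    return accepted_early, peer[0]


print(accept_demo())  # (False, '127.0.0.1')
```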
I'm working on a network client program that connects to public servers, specified by the user. If the user gives me a hostname to connect to that has both IPv4 and IPv6 addresses (commonly, a DNS name with both A and AAAA records), I'm not sure how I should decide which address I should connect to.
The problem is that it's quite common for machines to support both IPv4 and IPv6, but only to have global connectivity over IPv4. The most common case of this is when only IPv6 link-local addresses are configured. At the moment the best alternatives I can come up with are:
1. Try the IPv6 address(es) first - if the connection fails, try the IPv4 address(es); or
2. Just let the user specify it as a config setting ("prefer_ipv6" versus "prefer_ipv4").
The problem I can see with option 1 is that the connection might not fail straight away - it might take quite a while to time out.
Please do try IPv6. In the significant majority of installations, trying to create an IPv6 connection will fail right away if it can't succeed for some reason:
if the system doesn't support IPv6 sockets, creating the socket will fail
if the system does support IPv6, and has link-local addresses configured, there won't be any routing table entry for the global IPv6 addresses. Again, the local kernel will report failure without sending any packets.
if the system does have a global IP address, but some link necessary for routing is missing, the source should be getting an ICMPv6 error message, indicating that the destination cannot be reached; likewise if the destination has an IPv6 address, but the service isn't listening on it.
There are of course cases where things can break, e.g. if a global (or tunnel) address is configured, and something falsely filters out ICMPv6 error messages. You shouldn't worry about this case - it may be just as well that IPv4 connectivity is somehow broken.
Of course, it's debatable whether you really need to try the IPv6 addresses first - you might just as well try them second. In general, you should try addresses in the order in which they are returned from getaddrinfo. Today, systems support configuration options that let administrators decide in what order addresses should be returned from getaddrinfo.
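A minimal sketch of "try addresses in the order getaddrinfo returns them", written in Python. The function name `connect_in_resolver_order` and the 5-second timeout are assumptions made for the example; Python's own `socket.create_connection` performs essentially the same loop.

```python
import socket


def connect_in_resolver_order(host, port, timeout=5.0):
    """Try each address from getaddrinfo in the order returned; first success wins."""
    last_error = None
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)   # bound the wait on each individual attempt
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            # Remember the failure and fall through to the next address.
            last_error = exc
            if sock is not None:
                sock.close()
    raise last_error or OSError("getaddrinfo returned no addresses")
```

A caller would use it as `sock = connect_in_resolver_order("www.example.com", 80)`. The application never chooses between IPv4 and IPv6 itself; the ordering policy stays with the system resolver configuration.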
Subsequent to the question being asked, the IETF has proposed an answer to this question with RFC 6555, a.k.a. Happy Eyeballs.
The pertinent point being the client and server may both have IPv4 and IPv6 but a hop in between may not so it is impossible to reliably predict which path will work.
You should let the system-wide configuration decide, by way of getaddrinfo(), just like Java does. Asking every single application to cater for every possible IPv6 (mis)configuration is really not scalable! In case of a misconfiguration it is much more intuitive to the user if either all applications break or none do.
On the other hand, you want to log annoying delays and time-outs profusely, so users can quickly identify what to blame. Ideally the same goes for every other kind of delay, including (very common) DNS time-outs.
This talk has the solution. To summarize:
Sometimes there are problems with either DNS lookups or the subsequent connection to the resolved address
You don't want to wait for connecting to an IPv6 address to timeout before connecting to the IPv4 address, or vice versa
You don't want to wait for a lookup for an AAAA record to timeout before looking for an A record or vice versa
You don't want to stall while waiting for both AAAA and A records before attempting to connect with whichever record you get back first.
The solution is to lookup AAAA and A records simultaneously and independently, and to connect independently to the resolved addresses. Use whatever connection succeeds first.
The easiest way to do this is to let the networking API do it for you, using connect-by-name networking APIs. For example, in Java:
InetSocketAddress socketAddress = new InetSocketAddress("www.example.com", 80);
SocketChannel channel = SocketChannel.open(socketAddress);
channel.write(buffer);
The slide notes say at this point:
Here we make an opaque object called an InetSocketAddress from a host
and port, and then when we open that SocketChannel, that can complete
under the covers, doing whatever is necessary, without the
application ever seeing an IP address.
Windows also has connect-by-name APIs. I don’t have code fragments for
those here.
Now, I’m not saying that all implementations of these APIs necessarily
do the right thing today, but if applications are using these APIs,
then the implementations can be improved over time.
The difference with getaddrinfo() and similar APIs is that they
fundamentally can’t be improved over time. The API definition is that
they return you a full list of addresses, so they have to wait until
they have that full list to give you. There’s no way getaddrinfo can
return you a partial list and then later give you some more.
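As another example of a connect-by-name API, Python's asyncio accepts a host name directly, and since Python 3.8 `asyncio.open_connection` forwards a `happy_eyeballs_delay` argument to `loop.create_connection()`. The sketch below assumes Python 3.8 or later; the helper names `connect_by_name` and `demo` and the 0.25 second delay are choices made for this example. As far as I know, asyncio still resolves the name in one step before connecting, so this races the connection attempts but not the AAAA and A lookups.

```python
import asyncio


async def connect_by_name(host, port, delay=0.25):
    """Open a TCP connection by host name, racing addresses if one is slow."""
    # If an attempt has not finished after `delay` seconds, the next address
    # is tried in parallel; the first connection to succeed is the one used.
    return await asyncio.open_connection(host, port, happy_eyeballs_delay=delay)


async def demo():
    """Connect to a throwaway local server and read one line from it."""
    async def greet(reader, writer):
        writer.write(b"hello\n")
        await writer.drain()
        writer.close()

    server = await asyncio.start_server(greet, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await connect_by_name("127.0.0.1", port)
    line = await reader.readline()
    writer.close()
    await writer.wait_closed()
    server.close()
    return line


print(asyncio.run(demo()))  # b'hello\n'
```

The application only ever handles a host name and a port, which is the property the slide notes argue for.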
Some ideas:
Allow the user to specify the preference on a per-site basis.
Try IPv4 first.
Attempt IPv6 in parallel upon the first connection.
On subsequent connections, use IPv6 if the connection was successful previously.
I say to try IPv4 first because that is the protocol which is better established and tested.
I have a web service that is running on IIS (6 or 7, doesn't matter) and I would like to know the port that the caller has sent their request/invocation from.
So if a client makes a call on my web service, how do I find out from the server side what the port number is they made the call from?
Is that something that even gets passed at even the lowest level? Just to be clear I'm not looking for the port for callback purposes. It's for logging only.
You should be able to find it as "REMOTE_PORT" in the ServerVariables collection of the Request object.
However, this port is pretty much always random, and is only active for the request/response pair the client is making. It can't be used for asynchronous callbacks. Even your web service, when calling some other service, will use a random port number to initiate the request from. The only static port in the communication is the receiving port at the server end of the TCP connection.
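The same idea can be seen with plain sockets. This Python sketch is not IIS-specific, and the function name `observe_client_port` is invented for the example; it only shows that the remote port a server observes is the ephemeral source port the client's OS picked for that one connection.

```python
import socket


def observe_client_port():
    """Return (port seen by the server, port the client actually used)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))       # the server's fixed listening port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    conn, (peer_ip, peer_port) = listener.accept()

    # The client's source port was picked by its OS, not by the application.
    client_port = client.getsockname()[1]
    for s in (conn, client, listener):
        s.close()
    return peer_port, client_port


seen_by_server, used_by_client = observe_client_port()
print(seen_by_server == used_by_client)  # True
```

Logging the value is harmless, but a later connection from the same client will normally arrive from a different port.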
If your service is on WCF, then:
OperationContext context = OperationContext.Current;
MessageProperties messageProperties = context.IncomingMessageProperties;
var endpointProperty = messageProperties[RemoteEndpointMessageProperty.Name]
    as RemoteEndpointMessageProperty;
if (endpointProperty != null)
{
    string sRemoteAddress = endpointProperty.Address;
    int nRemotePort = endpointProperty.Port;
}
TCP sockets do have the concept of sender port number but it doesn't have much use in application level protocols. That said, considering the last paragraph of the OP, I think you're looking for some way to call back the client. The ports I said previously cannot be used for that. Asynchronous requests are identical to synchronous ones running on a separate thread, nothing special about them.
Some Network Address Translation-type devices will hide the actual "sending" port number from you. You would then have access to a useless IP address and a useless port number.