Can FTP have multiple TCP connection for multiple parallel file transfer - networking

While reading about the FTP protocol at http://www.pcvr.nl/tcpip/ftp_file.htm, I came across this: "FTP differs from the other applications that we've described because it uses two TCP connections to transfer a file". My question is: can FTP have multiple TCP connections for multiple parallel file transfers? For example, can I transfer two files in parallel over two TCP connections? Is this a matter of customization or standardization?

While it would be theoretically possible to make an FTP server support multiple concurrent transfers over one session, it's not supported by the RFC or any known implementation.
The blocker is a simple one: the control connection, after receiving a transfer request, does not return a final status or accept new commands until the data transfer is completed. Thus, though you could queue up another transfer request, it wouldn't actually be processed by the server until the current one completes.
If you want multiple file transfers, just log into the FTP server multiple times using different programs or command-line windows and have each initiate a transfer.
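The multiple-login approach can be scripted rather than done by hand. Below is a minimal sketch in Python using the standard-library `ftplib`: each thread opens its own FTP session (its own control connection plus per-transfer data connections), so the transfers proceed in parallel. The host, credentials, and file names are placeholders, not anything from the question.

```python
import ftplib
import threading

def fetch(host, user, password, remote_name, local_name):
    # Each call gets a fully independent FTP session: one control
    # connection, plus a data connection per transfer.
    with ftplib.FTP(host) as ftp:
        ftp.login(user, password)
        with open(local_name, "wb") as f:
            ftp.retrbinary("RETR " + remote_name, f.write)

def fetch_parallel(host, user, password, names):
    # One thread (and thus one FTP session) per file.
    threads = [threading.Thread(target=fetch,
                                args=(host, user, password, name, name))
               for name in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

# Example (placeholder host and files):
# fetch_parallel("ftp.example.com", "anonymous", "guest", ["a.bin", "b.bin"])
```

Nothing here changes the protocol: each session still transfers one file at a time, exactly as the answer describes; the parallelism comes entirely from running several sessions side by side.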

No, it can't. FTP uses a control connection for sending commands and a data connection that exists only for the duration of a file transfer or directory listing retrieval; that's it.
For more information you can consult RFC 959, which defines the FTP protocol.

Related

Send a file to Winsock socket with curl utility

I need to send a file to an HTTP server with the curl utility. However, as my server application only needs to process a single file, I'd like to avoid using some large HTTP framework with a wide range of functionality, so I'd like to stick to TCP/UDP protocols with some simple HTTP parser.
The file to send to the server may be quite large, and I doubt it is reasonable to send it as a single TCP packet, so I'm thinking of splitting this file into UDP packets and sending them one by one. But on the client side, all this must be done with a simple curl command, like curl --data-binary @filename 127.0.0.1:80.
Is it possible to split this request into several packets using WinSock API? For example, the server reads the name of a file, detects its size, allocates as many packets as needed and starts receiving UDP packets. Or maybe should I look at other ways of solving it?

What is a file descriptor/connection/socket/ip?

This is a broad question:
How do File descriptor, connection, socket, ip all relate to each other?
Is a file descriptor just a text file? And so when I make a connection to a server, do the server and my computer basically create a file descriptor and communicate by writing into the file?
Why do servers have a limit on file descriptors? When a server opens up too many connections, it can fail with the error "Too many open files". Is that because each connection makes the server open up a new text file?
An IP address is a "number" that identifies a machine in a network. So you can say "send this data to IP x.x.x.x" and the network knows which one is the destination machine. There are several ways to assign IP addresses to machines.
A connection is a conceptual link between two machines. It is conceptual because it is not necessary that the machines are directly wired to each other or anything: they are linked "through the network".
A socket is a programming interface that allows a programmer to manage connections, so the programmer can:
Establish (open) a new connection to another machine
Listen for incoming connections from other machines
Read data received through a connection
Write data through the connection
Discard (close) the connection
A file descriptor is an element (typically a number) that allows the programmer to identify a "stream" of data: files on disk, sockets, pipes, named pipes... basically anything you can read / write bytes in a sequential manner. They are called file descriptors because initially they identified files only.
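This can be demonstrated directly: in Python, a plain file on disk and a socket are both identified by small integer descriptors, and either kind can be handed to the same low-level `os` calls. A short illustration:

```python
import os
import socket
import tempfile

# A regular file on disk gets a descriptor...
fd_file, path = tempfile.mkstemp()

# ...and so does a socket; both are just integers.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
fd_sock = sock.fileno()

print(type(fd_file), type(fd_sock))  # both plain ints

# The same low-level call works on a file descriptor regardless
# of what it describes (here, writing bytes to the disk file):
os.write(fd_file, b"hello")

os.close(fd_file)
os.remove(path)
sock.close()
```

The per-process limit the question asks about applies to this whole pool of descriptors, which is why a server holding many open connections can hit "Too many open files" without opening a single text file.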

Implementing a WebServer

I am trying to create a Web Server of my own and there are several questions about working of Web servers we are using today. Questions are:
After receiving an HTTP request from a client through port 80, does the server respond using the same port 80?
If yes, then while sending a large file (say a picture several MB in size), will the web server be unable to receive requests from other clients?
Is a computer port duplex or simplex? (Can it send and receive at the same time)?
If another port on the server side is used to send the response to the client, then (if TCP is used, which it generally is) 3-way handshaking will have to be done again, which adds overhead...
Here is a good guide on what's going on with web servers: http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html. It's in C, but the concepts are all there. It will explain the whole client/server relationship as well as some implementation details.
I'll just give a high-level view of what's going on:
Usually, when your server gets a new request, it forks a child process to handle it; that way you are not bogged down by each request. When the request comes in, the child process is handed a new file descriptor to read from and write to (again, this is all implementation details).
So really you have one server waiting for requests, and for each request it receives it spawns a child process to deal with it. I'm sure there are much easier languages than C to implement this in (I have had to write both a C and a Java server in the past), but C really makes you understand what is going on, and I'm betting that is what you are looking for here.
Now there are a couple of things to think about:
How you want the web server to work; the example explains the parent/child process model.
Whether you want to use TCP or UDP; there are differences in the way the payload gets delivered.
You don't have to listen on port 80; that's just the default for the web.
Hopefully the guide will help you.
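The fork-per-request model described above can be sketched in a few lines of Python (same idea as the C version in the Beej guide; `os.fork` assumes a Unix-like OS, and port 8080 and the fixed response are placeholders of mine):

```python
import os
import socket

def handle(conn):
    # Child's job: read the request, send a fixed response, hang up.
    conn.recv(4096)
    conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok")
    conn.close()

def serve(port=8080):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", port))
    srv.listen(16)
    while True:
        conn, addr = srv.accept()
        if os.fork() == 0:      # child: handle this one client...
            srv.close()
            handle(conn)
            os._exit(0)
        conn.close()            # ...parent: go straight back to accept()

# serve()  # uncomment to run (blocks forever)
```

The parent never touches the request itself, so a slow client only ties up its own child process, not the accept loop.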
Yes. The server sends the response using the TCP connection established by the client, so it also responds using the same port. The server can handle connections from multiple clients using the same port because TCP connections are identified by (local-ip, local-port, remote-ip, remote-port), so the server can even handle multiple connections from the same client, provided that the source ports are different.
There are different techniques you can use to be able to serve multiple clients at the same time. These include
using multiple processes or threads: when one is busy serving a client the others can serve other clients.
using events: the server listens for events from the OS: when it can write a block of data to a connection it writes it, when a new client connects it accepts the connection, ...
Frequently both approaches are combined.
A TCP connection is duplex: you can send and receive at the same time. The HTTP protocol is based on a simple request-response model though: at any given time only one party is "talking."
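The event-driven approach from the list above can be sketched with the standard-library `select` call: one process watches a set of sockets and services whichever is ready. A minimal illustration (using a local `socketpair` to stand in for a client connection):

```python
import select
import socket

def pump_once(socks, handler):
    # Block until at least one socket is readable, then service each
    # ready socket; a real server would run this in a loop and also
    # watch the listening socket for new clients.
    readable, _, _ = select.select(socks, [], [])
    for s in readable:
        data = s.recv(4096)
        if data:
            handler(s, data)

# Demo: one end plays the client, the other is the "connection"
# the server is watching.
a, b = socket.socketpair()
a.sendall(b"ping")
pump_once([b], lambda s, d: s.sendall(d.upper()))
print(a.recv(4096))  # b'PING'
a.close()
b.close()
```

The key point is that `socks` can hold thousands of connections while the process stays single-threaded; readiness, not threads, drives the work.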

HTTP push over 100,000 connections

I want to use a client-server protocol to push data to clients which will always remain connected, 24/7.
HTTP is a good general-purpose client-server protocol. I don't think the semantics would be very different for any other protocol, and many good HTTP servers exist.
The critical factor is the number of connections: the application will gradually scale up to a very large number of clients, say 100,000. They cannot be servers because they have dynamic IP addresses and may be behind firewalls. So, a socket link must be established and preserved, which leads us to HTTP push. Only rarely will data actually be pushed to a given client, so we want to minimize the connection overhead too.
The server should handle this by accepting the connection, inserting the remote IP and port into a table, and leaving it idle. We don't want 100,000 threads running, just so many table entries and file descriptors.
Is there any way to achieve this using an off-the-shelf HTTP server, without writing at the socket layer?
Use Push Framework : http://www.pushframework.com.
It was designed for that goal of managing a large number of long-lived asynchronous full-duplex connections.
Lightstreamer (http://www.lightstreamer.com/) is a tool made specifically for push operations over HTTP.
It should solve this problem.
You could also look at Jetty + Continuations.

Unix TCP servers and UDP Servers

Why is the design of TCP servers mostly such that whenever a connection is accepted, a new process is spawned to handle it? And why, in the case of UDP servers, is there mostly only a single process that handles all client requests?
The main difference between TCP and UDP is, as stated before, that UDP is connectionless.
A program using UDP has only one socket where it receives messages. So there's no problem if you just block and wait for a message.
If you use TCP, you get one socket for every client that connects. Then you can't just block and wait for ONE socket to receive something, because there are other sockets which must be processed at the same time.
So you have two options: either use non-blocking methods or use threads. Code is usually much simpler when you don't have one while loop that has to handle every client, so threading is often preferred. You can also save some CPU time by using blocking methods.
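The contrast can be sketched in Python: a thread per accepted TCP connection versus one loop over a single UDP socket that answers each datagram with `sendto`. The port numbers are arbitrary placeholders.

```python
import socket
import threading

def echo_tcp(conn):
    # One of these runs per client, each on its own socket.
    while True:
        data = conn.recv(4096)
        if not data:        # empty read: the client closed
            break
        conn.sendall(data)
    conn.close()

def tcp_server(port=9000):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("", port))
    srv.listen(16)
    while True:
        conn, _ = srv.accept()                       # new socket per client
        threading.Thread(target=echo_tcp, args=(conn,)).start()

def udp_server(port=9001):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", port))
    while True:                                      # one socket for everyone
        data, addr = sock.recvfrom(4096)
        sock.sendto(data, addr)                      # reply to whoever sent it
```

The UDP loop never needs a second thread because every `recvfrom` already tells it who to answer; the TCP server needs one `echo_tcp` per connection because each client owns a separate socket.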
When you talk to a client over a TCP connection, you maintain a TCP session. So when a new connection is established, you need a separate process (or thread; how it is implemented and which OS is used don't matter) to maintain that conversation. But with UDP there is no session: each datagram arrives independently (you are told the sender's IP and port), so a single process can reply to it and move straight on to the next one.
First of all, the classic Unix server paradigm is filter based. For example, various network services can be configured in /etc/services and a program like inetd listens on all of the TCP and UDP sockets for incoming connections and datagrams. When a connection / DG arrives it forks, redirects stdin, stdout and stderr to the socket using the dup2 system call, and then execs the server process. You can take any program which reads from stdin and writes to stdout and turn it into a network service, such as grep.
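The inetd idiom described in the paragraph above translates almost line for line into code: accept, fork, point stdin/stdout/stderr at the socket with dup2, then exec an ordinary filter program. A sketch (Unix-only; the port and the use of /bin/cat as the "service" are placeholder choices of mine):

```python
import os
import socket

def inetd_style(port=7000, argv=("/bin/cat",)):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", port))
    srv.listen(16)
    while True:
        conn, _ = srv.accept()
        if os.fork() == 0:
            fd = conn.fileno()
            os.dup2(fd, 0)                # stdin  <- socket
            os.dup2(fd, 1)                # stdout -> socket
            os.dup2(fd, 2)                # stderr -> socket
            os.execv(argv[0], list(argv)) # replace child with the filter
        conn.close()                      # parent returns to accept()

# inetd_style()  # uncomment to run; connecting clients get an echo
#                # service, courtesy of /bin/cat reading its stdin
```

Because the exec'd program only ever sees stdin and stdout, any filter (cat, grep, ...) becomes a network service without containing a line of socket code, which is exactly the point of the paradigm.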
According to Stevens in "Unix Network Programming", there are five kinds of server I/O models (p. 154):
blocking
non-blocking
multiplexing (select and poll)
signal-driven
asynchronous (POSIX aio_ functions)
In addition the servers can be either Iterative or Concurrent.
You ask why TCP servers are typically concurrent, while UDP servers are typically iterative.
The UDP side is easier to answer. Typically, UDP applications follow a simple request-response model, where a client sends a short request and gets back a reply, with each pair constituting a stand-alone transaction. UDP servers are also the only ones which use signal-driven I/O, and even then only rarely.
TCP is a bit more complicated. Iterative servers can use any of the I/O models above except signal-driven I/O. The fastest servers on a single processor are actually iterative servers using non-blocking I/O. However, these are considered relatively complex to implement, and that, plus the Unix filter idiom, were traditionally the primary reasons for using the concurrent model with blocking I/O, whether multiprocess or multithreaded. Now, with the advent of widespread multicore systems, the concurrent model also has a performance advantage.
Your generalization is too broad. This is a pattern you might see with a Unix-based server, where process creation is inexpensive. A .NET-based service would use a new thread from the thread pool instead of creating a new process.
Programs that can continue to do useful work while they are waiting for I/O will often be multithreaded. Programs that do lots of computation which can be neatly divided into separate sections can benefit from multithreading if there are multiple processors. Programs that service lots of network requests can sometimes benefit from having a pool of available threads to service requests. GUI programs that also need to perform computation can benefit from multithreading, because it allows the main thread to continue to service GUI events.
That's why we use TCP as an Internet protocol.
