HTTP is used to display the info and also can be used to transfer files from one host to another host.
FTP is used to transfer files from one host to another.
So I come to this point that FTP and HTTP both are almost doing the same work. Then what is the exact benefit of using FTP while I can do this with the HTTP?
Correct me if I am wrong.
Thanks
FTP is a File Transfer Protocol, for transferring files.
FTP is significantly older, it is a protocol designed to enable the transfer of files over a long-running session. There are a wide array of commands and the intent is to allow you to navigate and browse a remote file system and retrieve files (originally over a separate data connection).
FTP still sees a lot of use, but many files are actually transferred over HTTP instead.
HTTP
The HyperText Transfer Protocol was originally designed to transfer hypertext documents and the various assets needed to render them. In practice, this is the way information is transferred on the web -- html, css, images, data are all transferred between web servers and web browsers, as well as between one server and another this way.
HTTP was designed to retrieve a resource from a URL that may or may not match the remote file system (in many web apps, the structure of the URLs has very little to do with the file locations). There is often only a single request in a single http connection and the data uses the same connection as the request.
So I come to this point that FTP and HTTP both are almost doing the same work.
Not really. FTP can be used for file transfer and not really much more. HTTP is way more flexible since it not only transfers byte streams but also meta data (what kind of data is this), supports implicit compression, client specific responses (like based on supported languages), has more flexible ways for authentication, is tuned for less overhead (i.e. can be faster) ...
Then what is the exact benefit of using FTP while I can do this with the HTTP?
There is no real benefit of FTP today. In contrary, in contrast to alternatives like HTTP the design of FTP leads to lots of problems in today's infrastructure where NAT is heavily used (i.e. multiple internal systems behind a single router with public IP address).
FTP remains mostly in places where clients or servers don't support more modern ways for file exchange. A typical example is cheap web hosting where access to the server to update files is often done by FTP since lots of tools have FTP builtin and it is easy to setup on the server too. Alternatives like WebDAV (HTTP based) or SFTP (SSH based) are less used here since they have less support in clients and servers even though they would offer more security and more flexibility and less problems.
Related
I am currently developing a website for an energy-monitoring company. We are trying to send high volumes data from the devices which record the data to a server so they can be processed in a database. The guy developing the firmware seems to think that the best way to send the data is to produce CSV files and send them via FTP. A program on the server needs to monitor the files received via FTP and run a PHP script to process them. I, however, feel that the best way of sending the data is via HTTP POST.
We had HTTP POST working and then I began trying to work with the CSVs which became a pain as reliably monitoring the files received via FTP meant editing the ProFTPD configuration file (which I found to be a near impossible task) and install a package called mod_exec (which comes with security risks) so that ProFTPD could run a PHP script. These issues and the fact that I am unfamiliar with the linux console which I am required to use extensively to set this up, makes the CSV method very difficult to set up. HTTP POST to me seems like a more direct way of sending the data without having to worry about files or relying on ProFTPD. It would also allow us to use identifiers to give the data being passed meaning as opposed to a string of values for which the meaning is not immediately apparent. In addition, the query string could be URL encoded to pass a multidimensional array which would work well given the type of data being passed.
Nevertheless, just because the HTTP POST method would be easier doesn't mean that the CSV method doesn't have advantages. Furthermore, the firmware guy has far more experience than me with computers so I trust his opinion.
Can you please help me to understand his point of view on the advantages of the CSV method and explain what the best method is?
You're right. FTP has major issues with firewalls, and especially doesn't work well on mobile (NAT'ted) IPv4. HTTP POST works far, far better under such circumstances, if only because nobody accepts an "internet" connection that breaks HTTP.
Furthermore, HTTP is a lot easier on the device as well. It's just a single-socket protocol, with trivial read/write semantics on that socket.
Some more benefits? HTTP has almost-native support for compression (gzip). HTTP transmission can start before the input is complete. HTTP is easier to secure (HTTPS)...
No, there really is little reason to use FTP.
The 'CSV method' (I'd call it the 'FTP method' though) has the advantage of being known to the embedded developer. The receiving side will have to create some way of checking if there is a file though. That adds complexity.
The 'HTTP method' has several advantages:
HTTP is easy to implement on the sending side
No need to create a file-checker
You can reply to the embedded device if everything went OK
I actually just implemented a system just like that (not too much data, but still) and use HTTP POST to send the data. I implemented the HTTP POST myself.
I would like to write an application to manage files, directories and processes on hundreds of remote PCs. There are measurement programs running on these machines, which are currently managed manually using TightVNC / RealVNC. Since the number of machines is large (and increasing) there is a need for automatic management. The plan is that our operators would get a scriptable client application, from which they could send queries and commands to server applications running on each remote PC.
For the communication, I would like to use a TCP-based custom protocol, but it is administratively complicated and would take very long to open pinholes in every firewall in the way. Fortunately, there is a program with a built-in TinyWeb-based custom web server running on every remote PC, and port 80 is opened in every firewall. These web servers serve requests coming from a central server, by starting a CGI program, which loads and sends back parts of the log files of measurement programs.
So the plan is to write a CGI program, and communicate with it from the clients through HTTP (using GET and POST). Although (most of) the remote PCs are inside the corporate intranet, they are scattered all over the country, I would like to secure the communication. It would not be wise to send commands, which manipulate files and processes, in plain text. Unfortunately the program which contains the web server cannot be touched, so I cannot simply prepare it for HTTPS. I can only implement the security layer in the client and in the CGI program. What should I do?
I have read all similar questions in SO, but I am still not sure what to do in this specific situation. Thank you for your help.
There are several webshells but as far as I can see ( http://www-personal.umich.edu/~mressl/webshell/features.html ) they run on the top of an existing SSL/TLS layer.
There is also S-HTTP.
There are several ways of authenticating to an server (username/passwort) in a protected way, without SSL. http://www.switchonthecode.com/tutorials/secure-authentication-without-ssl-using-javascript . But these solutions are focused only on sending a username/password to the server.
Would it be possible to implement something like message-level security in SOAP/WS-Security? I realise this might be a bit heavy duty and complicated to implement, but at least it is
standardised
definitely secure
possibly supported by some libraries or frameworks you could use
suitable for HTTP
Hey,
first of all this is a conceptional question and I do not know if StackOverflow is the appropriate place - so my apologies if I am wrong.
Nowadays the web is not only used for passing raw informations. Many and especially complex web applications are in use. These web application seem to be so complex that it seems irrational to use the HTTP protocol, which is based on so simple data exchange, plus it is stateless.
Would it not be more convincing to use remote invocations for this web applications? The big advantage to my mind is a unified GUI by using HTML. But there are applications, which have no need for a graphical interfaces and then it comes to a point where the HTTP protocol is really cumbersome.
Short answer: HTTP is allowed through firewalls where other protocols would be blocked.
A short partial answer is: first, for historical reasons - HTTP was used since the dawn of the web as protocol for requesting documents, and has since been used for some different purposes. One reason to keep using it is that it is generally served on port 80 which you can be sure won't be blocked by firewalls between your client and the server. The statelessness of the protocol may not always be what you want, but it has at least the advantage of protecting the server side from very trivial overloading problems.
OS independence
firewall passing
the web server is already a well understood and mostly "solved" problem in terms of load balancing, server fall over, etc.
don't have to reinvent the wheel
Other protocols are being used more and more now, including remote invocations and (the one I am particularly familiar with) WCF (which allows binary TCP/IP data transfer).
This allows data to travel faster for applications which require more bandwidth. For example an n-tier application may use WCF binary transfer between application and presentation tiers. Also public web services allow multiple protocols, including binary.
For data transfer protocols, firewalls should be configured (ie. expose a port specifically for your application), not worked around, I would not recommend using a protocol because firewalls do not block it.
The protocol used really depends on who will consume it and what control you have over the consumption - eg external third parties may need a plain-text version with a commonly agrreed data interface. On the other hand, two tiers in a single web application may be able to utilise binary data transfer for performance and security.
What are the advantages (or limitations) of one over the other for transferring files over the Internet?
(I am aware of secure forms of both protocols. I'd like to hear comparisons through personal experiences in terms of performance, reliability, file size limitations etc.)
Here's a performance comparison of the two. HTTP is more responsive for request-response of small files, but FTP may be better for large files if tuned properly. FTP used to be generally considered faster. FTP requires a control channel and state be maintained besides the TCP state but HTTP does not. There are 6 packet transfers before data starts transferring in FTP but only 4 in HTTP.
I think a properly tuned TCP layer would have more effect on speed than the difference between application layer protocols. The Sun Blueprint Understanding Tuning TCP has details.
Heres another good comparison of individual characteristics of each protocol.
I just benchmarked a file transfer over both FTP and HTTP :
over two very good server connections
using the same 1GB .zip file
under the same network conditions (tested one after the other)
The result:
using FTP: 6 minutes
using HTTP: 4 minutes
using a concurrent http downloader software (fdm): 1 minute
So, basically under a "real life" situation:
1) HTTP is faster than FTP when downloading one big file.
2) HTTP can use parallel chunk download which makes it 6x times faster than FTP depending on the network conditions.
Many firewalls drop outbound connections which are not to ports 80 or 443 (http & https); some even drop connections to those ports that are not HTTP(S). FTP may or may not be allowed, not to speak of the active/PASV modes.
Also, HTTP/1.1 allows for much better partial requests ("only send from byte 123456 to the end of file"), conditional requests and caching ("only send if content changed/if last-modified-date changed") and content compression (gzip).
HTTP is much easier to use through a proxy.
From my anecdotal evidence, HTTP is easier to make work with dropped/slow/flaky connections; e.g. it is not needed to (re)establish a login session before (re)initiating transfer.
OTOH, HTTP is stateless, so you'd have to do authentication and building a trail of "who did what when" yourself.
The only difference in speed I've noticed is transferring lots of small files: HTTP with pipelining is faster (reduces round-trips, esp. noticeable on high-latency networks).
Note that HTTP/2 offers even more optimizations, whereas the FTP protocol has not seen any updates for decades (and even extensions to FTP have insignificant uptake by users). So, unless you are transferring files through a time machine, HTTP seems to have won.
(Tangentially: there are protocols that are better suited for file transfer, such as rsync or BitTorrent, but those don't have as much mindshare, whereas HTTP is Everywhereâ„¢)
One consideration is that FTP can use non-standard ports, which can make getting though firewalls difficult (especially if you're using SSL). HTTP is typically on a known port, so this is rarely a problem.
If you do decide to use FTP, make sure you read about Active and Passive FTP.
In terms of performance, at the end of the day they're both spewing files directly down TCP connections so should be about the same.
One advantage of FTP is that there is a standard way to list files using dir or ls. Because of this, ftp plays nice with tools such as rsync. Granted, rsync is usually done over ssh, but the option is there.
Both of them uses TCP as a transport protocol, but HTTP uses a persistent connection, which makes the performance of the TCP better.
We have some files on our website that users of our software can download. Some of the files are in virtual folders on the website while others are on our ftp. The files on the ftp are generally accessed by clicking on an ftp:// link in a browser - most of our customers do not have an ftp client. The other files are accessed by clicking an http:// link in a browser.
Should I move all the files to the ftp? or does it not matter? Whats the difference?
HTTP has many advantages over FTP:
it is available in more places (think workplaces which block anything other than HTTP/S)
it works nicely with proxies (FTP requires extra settings for the proxy - like making sure that it allows the CONNECT method)
it provides built-in compression (with GZIP) which almost all browsers can handle (as opposed to FTP which has a non-official "MODE Z" extension)
NAT gateways must be configured in a special mode to support active FTP connections, while passive FTP connections require them to allow access to all ports (it it doesn't have conneciton tracking)
some FTP clients insist on opening a new data connection for each data transfer, which can leave you with a lot of "TIME_WAIT" sockets
If speed matters to your users, and they are technically inclined, http allows multiple connections for one file (if the client supports it. I use DownThemAll). Most browsers should handle ftp links just fine, though.
I think most users, even today, are more familiar with http than ftp and for that reason you should stick with http by default unless there's a compelling reason to use ftp. It's nit-picking, though.
I think it doesn't matter really, because the ftp is also transparent nowdays. You don't have to know anything special, the browser handles all.
I suggest that if they are downloading one file at one time, you can go to http.
However if they have to download several files with one go, I prefer ftp, because it's much more easy to manage.
There are some nice broswer extensions as _l0ser mentioned, but I prefer ftp for mass file-transfer.
Both FTP and HTTP seem sufficient for your needs, so I would definitely recommend choosing the simplest approach, which is either to leave things as they currently are or consolidate on HTTP.
Personally, I would put everything on HTTP. If nothing else, it eliminates an extra server. There is no compelling reason to choose FTP over HTTP anymore, and there are a few small advantages to HTTP (as others have pointed out).