The File Transfer Protocol (FTP) is an application-layer protocol in the Internet Protocol (IP) suite used to transfer files across a network between a server and a client. It was designed around the client-server model and uses separate connections for control and data. Users authenticate with a clear-text sign-in, usually a username and password, or connect anonymously if the server is configured to allow it. The login credentials can be protected with Transport Layer Security / Secure Sockets Layer (TLS/SSL), which can also encrypt the content of transmissions. Without it, everything is transmitted unencrypted and is susceptible to a sniffing attack. In some scenarios the SSH File Transfer Protocol (SFTP) is used instead, but it is a technically different protocol and not simply FTP secured with TLS/SSL.
How It Functions
The protocol runs in one of two modes, active or passive, which determine how the data connection is established. In 1998 both modes were updated to support IPv6, and passive mode gained an extended variant, extended passive mode. In either mode, the client opens a Transmission Control Protocol (TCP) control connection from a random, usually unprivileged, port to the FTP server's command port, 21.
In active mode, the client starts listening for an incoming data connection on the random port number it created; call this port N. It sends the server the FTP command PORT, whose argument carries the client's IP address and port N, so that the server knows where the client is listening. In response, the server opens the data connection from its data port, 20, to port N.
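The PORT argument is not a bare port number: for IPv4 it is six comma-separated decimal bytes, the four address octets followed by the high and low bytes of the port. The following is a minimal sketch of that encoding; the function name `format_port_argument` and the example address are illustrative assumptions, not part of any FTP library.

```python
import ipaddress


def format_port_argument(ip: str, port: int) -> str:
    """Encode an IPv4 address and TCP port as h1,h2,h3,h4,p1,p2."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError for non-IPv4 input
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    octets = str(address).split(".")
    # p1 is the high byte of the port, p2 the low byte.
    high, low = divmod(port, 256)
    return ",".join(octets + [str(high), str(low)])


# Hypothetical client at 192.168.1.10 listening on port N = 50000.
print("PORT " + format_port_argument("192.168.1.10", 50000))
# prints: PORT 192,168,1,10,195,80
```

Because the address is written into the command text itself, anything that changes the client's apparent address in transit has to change this text as well.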
In passive mode, the client sends a PASV command to the server over the control connection. The server responds with an IP address and port number, and the client opens the data connection from an arbitrary local port to that address and port. This is the preferred mode when the client is behind a firewall and therefore cannot accept incoming TCP connections. Active mode has an additional problem behind Network Address Translation (NAT): the IP address and port number in the PORT command refer to the internal host rather than to the NAT's public address and port. Apart from switching to passive mode, the solution is an application-level gateway that lets the NAT rewrite the IP address and port number in the PORT command.
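The server's answer to PASV is a 227 reply that carries the address and port in the same six-number form used by PORT. As a rough sketch, a client could extract them as shown here; `parse_pasv_reply` and the sample reply text are assumptions for illustration, and real servers vary in the wording around the numbers.

```python
import re

# Six comma-separated decimal numbers: h1,h2,h3,h4,p1,p2
_PASV_RE = re.compile(r"(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3})")


def parse_pasv_reply(reply: str) -> tuple[str, int]:
    """Return (host, port) from a '227 Entering Passive Mode' reply."""
    if not reply.startswith("227"):
        raise ValueError(f"not a passive-mode reply: {reply!r}")
    match = _PASV_RE.search(reply)
    if match is None:
        raise ValueError(f"no address found in reply: {reply!r}")
    numbers = [int(group) for group in match.groups()]
    if any(n > 255 for n in numbers):
        raise ValueError(f"byte value out of range in reply: {reply!r}")
    host = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]  # high byte first, then low byte
    return host, port


print(parse_pasv_reply("227 Entering Passive Mode (192,168,1,10,195,80)."))
# prints: ('192.168.1.10', 50000)
```

The client would then open an ordinary outbound TCP connection to the returned host and port, which is why this mode works through firewalls that block incoming connections.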
Whenever the server responds to the client over the control connection, it sends a three-digit status code in ASCII, optionally followed by a human-readable text message. A reply typically looks like "200" or "200 OK", meaning that the last command was successful. An ongoing file transfer on the data connection can also be aborted by sending an interrupt message over the control connection.
The data transferred over the data connection can be represented in one of four ways. ASCII mode is used for text: the data is converted, if needed, from the sender's character representation to 8-bit ASCII before transmission and then, if needed, to the receiver's character representation. For that reason, this mode is only usable for plain text. In Image (or Binary) mode, the sender transmits the file byte by byte and the receiver stores the bytes as they arrive. EBCDIC mode is used for plain text between nodes that use the EBCDIC character set. Finally, Local mode allows two nodes with identical setups to exchange the data in its original format without conversion to ASCII. For text files, different format control and record structure options are also available; these features were designed to accommodate files containing Telnet or ASA control characters.
Separately from the representation, the transfer itself can be done in one of three modes. Stream mode sends the data as a continuous stream, relieving FTP of any processing. Block mode has FTP break the data into several blocks (block header, byte count, and data field) before transferring it. Compressed mode compresses the data with an algorithm before sending it.
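To make the control-connection replies described above concrete, the sketch below splits a single-line reply into its three-digit code and optional text and maps the first digit to its general meaning. The names `Reply`, `parse_reply`, and `reply_class` are illustrative assumptions, and multi-line replies are deliberately left out to keep the example short.

```python
import re
from typing import NamedTuple


class Reply(NamedTuple):
    code: int
    text: str


# The first digit of the code gives the general outcome of the command.
REPLY_CLASSES = {
    1: "positive preliminary",
    2: "positive completion",
    3: "positive intermediate",
    4: "transient negative completion",
    5: "permanent negative completion",
}

# Three digits, then optionally a space and free-form text.
_REPLY_RE = re.compile(r"([1-5][0-9]{2})(?: (.*))?")


def parse_reply(line: str) -> Reply:
    """Parse a single-line reply such as '200 OK' or a bare '200'."""
    match = _REPLY_RE.fullmatch(line.rstrip("\r\n"))
    if match is None:
        raise ValueError(f"malformed reply: {line!r}")
    return Reply(int(match.group(1)), match.group(2) or "")


def reply_class(code: int) -> str:
    """Describe a reply code by its first digit."""
    return REPLY_CLASSES[code // 100]


reply = parse_reply("200 OK\r\n")
print(reply.code, repr(reply.text), reply_class(reply.code))
```

A client only needs the numeric code to decide what to do next; the text is there for people reading logs or an interactive session.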
Hypertext Transfer Protocol vs File Transfer Protocol
Hypertext Transfer Protocol (HTTP) improves on FTP by fixing many of the issues that make FTP inconvenient, particularly for the many small transfers typical of loading web pages. HTTP is stateless and handles both control and data over a single connection from client to server on well-known port numbers, which lets it pass through NATs and firewalls easily. FTP, by contrast, has a stateful control connection and a separate data connection that uses an arbitrary port number, which causes problems with NATs and firewalls. Furthermore, an FTP control connection is slow to set up because of the round trips needed to send the required commands and await the responses. This has led to the practice of keeping the control connection open across multiple file transfers to speed things up. HTTP, on the other hand, originally dropped the connection at the end of each transfer because doing so was cheap, and has since evolved to handle multiple transfers over one TCP connection. Even so, its conceptual model is still one of independent requests rather than a session. Another difference is that while an FTP transfer is in progress on the data connection, the control connection is idle. A NAT or firewall might interpret the idle control connection as dead and drop it, which disrupts the download. HTTP does not have this issue because its single connection is only idle between requests, and such connections are normally expected to be dropped after a time-out.
Derivatives and Improvements
In the earliest days of FTP, client applications for the protocol were mostly command-line programs, because operating systems did not yet have Graphical User Interfaces (GUIs). Such clients still ship with Windows, Unix, and Linux operating systems today, despite the many improvements, derivatives, and innovations made since.
Among these innovations and derivatives is FTP Secure (FTPS), which encrypts data in transit, as the original protocol was never built to be secure. In explicit FTPS, the client requests encryption by sending the "AUTH TLS" command, and the server can be configured to allow or deny connections that do not request TLS. An older variant, implicit FTPS, required TLS/SSL from the start of the connection and was specified to use different ports from standard FTP; it is now considered outdated. SFTP is another variation made for the sake of encryption, but by different means. It uses the Secure Shell protocol (SSH) to transfer files, encrypting both commands and data, but it cannot interoperate with FTP software.
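As an illustration of explicit FTPS from the client side, Python's standard `ftplib` module provides an `FTP_TLS` class. The sketch below is only an outline: the helper name `open_explicit_ftps` and its arguments are assumptions, the function is defined but not called, and a real server's configuration decides whether the TLS request is accepted.

```python
from ftplib import FTP_TLS


def open_explicit_ftps(host: str, user: str, password: str) -> FTP_TLS:
    """Open an explicit FTPS session (hypothetical helper for illustration)."""
    # Connects to the normal FTP command port, 21, without encryption at first.
    ftps = FTP_TLS(host)
    # FTP_TLS.login() secures the control connection (AUTH TLS)
    # before the username and password are sent.
    ftps.login(user, password)
    # Switch the data connection to protected (encrypted) mode as well.
    ftps.prot_p()
    return ftps
```

A caller would typically follow this with directory or transfer commands and finish with `quit()`.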
Trivial File Transfer Protocol (TFTP) is a much simpler relative of FTP that allows a client to retrieve a file from, or place a file onto, a remote host. It is not nearly as robust as FTP and has no security features, and it is used primarily for booting over a local area network (LAN).
Simple File Transfer Protocol, confusingly also abbreviated SFTP (the same as SSH File Transfer Protocol), falls somewhere between TFTP and FTP in complexity. It offers no security either, but it does define eleven commands and three modes of transmission (ASCII, binary, and continuous). The protocol was never widely used, however, and has been assigned Historic status.
As one of the earliest application-layer protocols in the IP suite, FTP is dated and has largely been displaced by newer protocols such as HTTP, which, when secured with TLS as HTTPS, offers both security and faster transfers. That said, some applications still use FTP or one of its derivatives for one reason or another, but those cases are few.