What is the border of roles of network interfaces in MCUs? - networking

I am an embedded software developer who has any experience with TCPIP on connected devices. Also, I am not a software protocol expert, so I am a bit confusing about TCPIP protocol stack + responsiblities of its various phy layers.
First of all, I have experiences with such protocols like UART, SPI, CAN, USB... As you know, the phy layer directly affects you while selecting the protocol you used at the software level. For example, if you use usb and you build a software protocol on it, you do not occasionally deal with some details like checking corrupted frame in your sofware protocol, because phy layer of it guarantees this operation. CAN also has some CAN Controller facilities like crc and bit stuffing so, it is really reliable. But the situation is not the same for simple peripherals like UART/USART. Let's say you are using a bluetooth module to upgrade your firmware, you need to be aware of almost everything that can occur while communicating like delays, corrupted frames, payload validating etc.
Briefly, i am trying to understand the exact role of newtork interfaces come included in MCUs, that are interfaced with RJ45 phy sockets directly. In another words, imagine that I wrote a server application on my pc. Also i configured and ran an application in my development board which has an RJ45 socket and it runs as a client. Also imagine they established a connection over TCP. So, what will be the situation at the client side, when i send a 32 bytes of data to the socket from the server side? What will I see at the lowest level of MCU that is an RxCompleteInterrupt()? Are the data I sent and some other stuffs appended to the TCP packet guaranteed to be delivered by the eth controller in the MCU and ethernet controller of my PC? OR am i responsible (or the stack i used) check all the things necessary to validate whether the frame is valid or not?
I tried to be as clear as it would be. Please if you have experience, then try to write clean comments. I am not a TCPIP expert, maybe I used some wrong terminology, please focus the main concept of the question.
Thanks folks.

If you don't have any prior experience with the TCP/IP protocol suite, I would strongly suggest you to have a look at this IBM Redbook, more specifically at chapters 2, 3 and 4.
This being said:
So, what will be the situation at the client side, when i send a 32
bytes of data to the socket from the server side? What will I see at
the lowest level of MCU that is an RxCompleteInterrupt()?
You should have received an Ethernet frame in your buffer. This Ethernet frame should contain an IP packet. This IP packet should contain a TCP packet, which payload should consist in your 32 bytes of data. But there will be several exchanges between the client and the server prior to your data to be received, because of TCP being a connection-oriented protocol, i.e. several Ethernet frames will be sent/received.
Are the data I sent and some other stuffs appended to the TCP packet
guaranteed to be delivered by the eth controller in the MCU and
ethernet controller of my PC? OR am i responsible (or the stack i
used) check all the things necessary to validate whether the frame
is valid or not?
The TCP packet will ultimately be delivered, but there there are not warranties that your Ethernet frames and IP packets will be delivered, and will arrive in the right order. This is precisely the job of TCP, as a connection-oriented protocol, than to do what is needed so that the data you are sending as a TCP payload will ultimately be delivered. Your MCU hardware should be the one responsible for validating the Ethernet frames, but the TCP/IP stack running on the MCU is responsible for validating IP and TCP packets and the proper delivery of the data being sent/received over TCP.
You can experiment with TCP on a Linux PC using netcat, and capture the exchange using Wireshark or tcpdump.
Create a 'response' file containing 32 bytes:
echo 0123456789ABCDEFGHIJKL > response.txt
Start Wireshark, and filter on lo interface using filter tcp port 1234
Start a TCP server listening on TCP port 1234, which will send the content of response.txt upon receiving a connection from the client:
netcat -l 1234 < response.txt
In another console/shell, connect to the server listening on tcp/1234, and display what was received:
netcat localhost 1234
0123456789ABCDEFGHIJKL
On Wireshark, you should see the following Wireshark Network Capture, and be able to expand all frames/packets of the full exchange using the IBM Redbook as a reference.
Your 32 bytes of data will be in the payload section of a TCP packet sent by the server.

Related

modbus rtu over tcp (modbus tcp gateway)

Is it possible to build a modbus-RTU ethernet gateway using a linux box or arduino?
I have plc slaves linked together using modbus RTU, and I want to connect one of them over TCP with my pc as master.
I wonder if I can use a linux box (rasbery pi/raspbian) connected to a router, as a modbus-to-tcp converter by piping the usb port to my local ip on some port, as one plc will be connected in modbus rtu to the linux box`s usb port.
Piping command will be something Like this:
nc -l 5626 > /dev/ttyUSB0 < /dev/ttyUSB0
My goal is to connect a pc (networked with the linux box) to that plc through the linux box using modbus.
Yes you can use a linux box with ethernet port.
Its not a simple 'nc' is it.
First things first, you need to know, Does your PLC's support Modbus TCP or Modbus RTU over TCP.
bcos both the formats are not the same, they are not interchangeable.
once you ascertain this, you need to write a TCP Client to Connect to the Slaves as they run the Servers.
If they don't support Modbus TCP yet, you need to write a TCP servers as well.
on top of that, you have the handshaking delays and half open connections and what not.
If it is a commercial device, you need to look at the setup it will be running as well.
Hope this helps
I'm not entirely sure as to what your requirements are, but you may want to have a look at the following library. It's a Modbus TCP->RTU library I assume you can use.
https://github.com/3cky/mbusd
Best Regards
While it's certainly possible to build a Modbus TCP/RTU gateway, you won't be able to do it with a simple nc command, for two reasons:
The actual Modbus protocol data unit (PDU), merely containing the Modbus function code and data, is embedded in an application data unit (ADU) so the request reaches the correct device free of communication errors. This ADU is different for RTU and TCP. Generically, the ADU looks like this:
------------------------------------------
| Additional address | PDU | Error check |
------------------------------------------
For RTU, the "Additional address" is a single byte identifying the unit/slave on the serial line, and the "Error check" is a 16-bit CRC.
For TCP, the "Additional address" is the 7-byte Modbus application protocol header (MBAP), consisting of transaction identifier, protocol identifier, and a length field (2 bytes each), plus a single byte identifying the unit/slave (usually 255, though for a gateway, this would be the ID of the RTU slave behind it). The "Error check" is empty (zero bytes) as that is already taken care of by TCP.
The communication semantics are different for RTU and TCP.
For RTU, you write a request to the serial line, wait for the reply, and only then write the next request. There must be 3½ characters of silence between frames.
For TCP, you can in principle have multiple connections being served concurrently. If you tried to forward two TCP requests to the serial line simultaneously, chaos would ensue.
Still, you can build a working gateway with a slightly more complicated program than nc:
Accept connections on TCP port 5626 (note that the actually recommended port number for Modbus TCP is 502), convert the received TCP ADUs to RTU ADUs and put them into a queue together with a back channel.
Another part of your program takes one item at a time from that queue, sends the ADU over the serial line and reports the result back through the back channel. Then it goes on the to the next item, and so on.
The results arriving on the back channels are converted to TCP ADUs and sent back on the respective TCP connection.
This would certainly work on a Raspberry Pi, and possibly also on Arduino, depending on how large you want your queue to be.
Sources:
http://www.modbus.org/docs/Modbus_Application_Protocol_V1_1b.pdf
http://www.modbus.org/docs/Modbus_Messaging_Implementation_Guide_V1_0b.pdf

Is TCP/IP a mandatory for MQTT?

If so, do you know examples of what can go wrong in a non-TCP network?
Learning about MQTT I came across several mentions of the fact that MQTT relies on TCP/IP stack. For example, from mqtt.org:
MQTT for Sensor Networks is aimed at embedded devices on non-TCP/IP
networks, whereas MQTT itself explicitly expects a TCP/IP stack.
But if you read the reference documents, you won't finding anything like that. Moreover, there's QoS field that can be used for reliable delivery and whose values other than 0 are essentially useless in TCP/IP networks. Right now I see nothing that would prevent me from establishing an MQTT connection using UNIX pipe, domain or UDP socket rather than TCP socket.
MQTT just needs a delivery that is ordered and reliable, it doesn't have to be TCP. SCTP works just fine, for example, but UDP doesn't because there is no way to guarantee your large PUBLISH packet made up of multiple UDP packets will arrive in order and complete.
With regards TCP reliability, in theory what you are saying is true, but in practice when an application calls write() and receives a successful return, it can't guarantee when the data has actually made it out of the computer to the remote host. All write() (or send()) does is copy the data to the kernel buffers, at which point you have no further control.
The only way to be sure that a message reaches the remote host at the application level is to have the remote host reply.
MQTT-SN (for sensor Network ) is the solution for the problem that MQTT was having while running over TCP/IP .
There is a concept of MQTT gateway which is brought in in for MQTT-SN which helps in bringing non-TCP / IP implementation.
http://emqttd-docs.readthedocs.io/en/latest/mqtt-sn.html

What are examples of TCP and UDP in real life?

I know the difference between the two on a technical level.
But in real life, can anyone provide examples (the more the better) of applications (uses) of TCP and UDP to demonstrate the difference?
UDP: Anything where you don't care too much if you get all data always
Tunneling/VPN (lost packets are ok - the tunneled protocol takes care of it)
Media streaming (lost frames are ok)
Games that don't care if you get every update
Local broadcast mechanisms (same application running on different machines "discovering" each other)
TCP: Almost anything where you have to get all transmitted data
Web
SSH, FTP, telnet
SMTP, sending mail
IMAP/POP, receiving mail
EDIT: I'm not going to bother explaining the differences, since you state that you already know and every other answer explains it anyway :)
UDP is mailing a letter at the post office.
TCP is mailing a letter with a return receipt at the post office, except that the post master will organize the letters in-order-of mailing and only deliver them in-order.
Well, it was an attempt anyway.
TCP:
World Wide Web(HTTP)
E-mail (SMTP TCP)
File Transfer Protocol (FTP)
Secure Shell (SSH)
UDP:
Domain Name System (DNS)
Streaming media applications such as movies
Online multiplayer games
Voice over IP (VoIP)
Trivial File Transfer Protocol (TFTP)
REAL TIME APPLICATION FOR TCP:
Email:
Reason: suppose if some packet(words/statement) is missing we cannot understand the content.It should be reliable.
REAL TIME APPLICATION FOR UDP:
video streaming:
* **Reason: ***suppose if some packet(frame/sequence) is missing we can understand the content.Because video is collection of frames.For 1 second video there should be
25 frames(image).Even though we can understand some frames are missing due to our imagination skills. Thats why UDP is used for video streaming.
The classic standpoint is to consider TCP as safe and UDP as unreliable.
But when TCP-IP protocols are used in safety critical applications,
TCP is not recommended because it can stop on error for multiple reasons.
Whereas UDP lets the application software deal with errors, retransmission timers, etc.
Moreover, TCP has more processing overhead than UDP.
Currently, UDP is used in aircraft controls and flight instruments,
in the ARINC 664 standard also named AFDX (Avionics Full-Duplex Switched Ethernet).
In ARINC 664, TCP is optional but UDP is used with the RTOS (real time operating systems) designed for the ARINC 653 standard (high reliability control software in civil aircrafts).
For more information about real time controls using IP and UDP in AFDX,
you can read the pages 27 to 50 in
http://www.afdx.com/pdf/AFDX_Training_October_2010_Full.pdf
TCP
I will not send data anymore until i get an acknowledgment.
this process is slow
It is used for security purpose
example: web, sending mail, receiving mail etc
UDP
Here i have no headache with acknowledgment.
this process is faster but here data can be lost .
example : video streaming , online games etc
TCP + UDP = SMTP(example : mobile,telephone)
TCP guarantees (in-order) packet delivery. UDP doesn't.
TCP - used for traffic that you need all the data for. i.e HTML, pictures, etc.
UDP - used for traffic that doesn't suffer much if a packet is dropped, i.e. video & voice streaming, some data channels of online games, etc.
TCP is a connection oriented protocol, It establishes a path, or a virtual connection all the way through switches routers proxies etc and then starts any communication. Various mechanisms like routing djikstras shortest path algorithm exist to establish the virtual end to end connection. So it finds itself used while browsing HTML and other pages, making payments and web applications in general.
UDP is a connectionless protocol - it simply has a destination and nodes simply pass it along if it comes as best as they can. So packets arriving out of order, along various routes etc are common. So Instant messengers and similar software developers think UDP an ideal solution.
In real life if you want to throw data in the net, without worrying about time taken to reach, order of reaching use UDP. If you want a solid path before you start throwing packets, and want same order and latency for your data packets use TCP - I will use UDP for Torrents and TCP for PayPal!
TCP :
Transmission Control Protocol is a connection-oriented protocol, which means that it requires handshaking to set up end-to-end communications. Once a connection is set up, user data may be sent bi-directionally over the connection.
Reliable – Strictly only at transport layer, TCP manages message acknowledgment, retransmission and timeout. Multiple attempts to deliver the message are made. If it gets lost along the way, the server will re-request the lost part. In TCP, there's either no missing data, or, in case of multiple timeouts, the connection is dropped. (This reliability however does not cover application layer, at which a separate acknowledgement flow control is still necessary)
Ordered – If two messages are sent over a connection in sequence, the first message will reach the receiving application first. When data segments arrive in the wrong order, TCP buffers delay the out-of-order data until all data can be properly re-ordered and delivered to the application.
Heavyweight – TCP requires three packets to set up a socket connection, before any user data can be sent. TCP handles reliability and congestion control.
Streaming – Data is read as a byte stream, no distinguishing indications are transmitted to signal message (segment) boundaries.
Applications of TCP
World Wide Web, email, remote administration, and file transfer rely on TCP.
UDP :
User Datagram Protocol is a simpler message-based connectionless protocol. Connectionless protocols do not set up a dedicated end-to-end connection. Communication is achieved by transmitting information in one direction from source to destination without verifying the readiness or state of the receiver.
Unreliable – When a UDP message is sent, it cannot be known if it will reach its destination; it could get lost along the way. There is no concept of acknowledgment, retransmission, or timeout.
Not ordered – If two messages are sent to the same recipient, the order in which they arrive cannot be predicted.
Lightweight – There is no ordering of messages, no tracking connections, etc. It is a small transport layer designed on top of IP.
Datagrams – Packets are sent individually and are checked for integrity only if they arrive. Packets have definite boundaries which are honored upon receipt, meaning a read operation at the receiver socket will yield an entire message as it was originally sent.
No congestion control – UDP itself does not avoid congestion. Congestion control measures must be implemented at the application level.
Broadcasts – being connectionless, UDP can broadcast - sent packets can be addressed to be receivable by all devices on the subnet.
Multicast – a multicast mode of operation is supported whereby a single datagram packet can be automatically routed without duplication to very large numbers of subscribers.
Applications of UDP
Numerous key Internet applications use UDP, including: the Domain Name System (DNS), where queries must be fast and only consist of a single request followed by a single reply packet, the Simple Network Management Protocol (SNMP), the Routing Information Protocol (RIP) and the Dynamic Host Configuration Protocol (DHCP).
Voice and video traffic is generally transmitted using UDP. Real-time video and audio streaming protocols are designed to handle occasional lost packets, so only slight degradation in quality occurs, rather than large delays if lost packets were retransmitted. Because both TCP and UDP run over the same network, many businesses are finding that a recent increase in UDP traffic from these real-time applications is hindering the performance of applications using TCP, such as point of sale, accounting, and database systems. When TCP detects packet loss, it will throttle back its data rate usage. Since both real-time and business applications are important to businesses, developing quality of service solutions is seen as crucial by some.
Some VPN systems such as OpenVPN may use UDP while implementing reliable connections and error checking at the application level.
TCP is appropriate when you have to move a decent amount of data (> ~1 kB), and you require all of it to be delivered. Almost all data that moves across the internet does so via TCP - HTTP, SMTP, BitTorrent, SSH, etc, all use TCP.
UDP is appropriate when you have small messages which you can afford to lose, and would like to send them as efficiently as possible. One reason you might be able to afford to lose them is because you can re-send them if they get lost. The main example on the internet is DNS - DNS consists of small queries saying things like "what is the IP number for stackoverflow.com?", and the responses are correspondingly small. Computers make a lot of these queries, so they should be made efficiently, but if they get lost en route, it's easy to time out and re-send them.
TCP guarantees packet delivery AND order. Order is almost as important as the delivery in the first place when reconstructing data for files such as executables, etc.
UDP does not guarantee delivery NOR order. Packets can arrive (or not!) in any order.
Common uses for TCP include file transfer where the integrity of the packets is paramount. Voice/video applications can afford to lose some data while still maintaining acceptable quality, and so usually use UDP.
One additional thought on some of the comments above that talks about ordered delivery.... It must be clarified that the destination computer may receive packets out of order on the wire, but the TCP at the destination is responsible for "rearranging out-of-order data" before passing it on to the upper layers of the stack. When you say TCP guarantees ordered packet delivery, what that means is it will deliver packets in correct order to the upper layers of the stack.
SCTP vs TCP vs UDPServices/Features SCTP TCP UDP
Connection-oriented yes yes no
Full duplex yes yes yes
Reliable data transfer yes yes no
Partial-reliable data transfer optional no no
Ordered data delivery yes yes no
Unordered data delivery yes no yes
Flow control yes yes no
Congestion control yes yes no
ECN capable yes yes no
Selective ACKs yes optional no
Preservation of message boundaries yes no yes
Path MTU discovery yes yes no
Application PDU fragmentation yes yes no
Application PDU bundling yes yes no
Multistreaming yes no no
Multihoming yes no no
Protection against SYN flooding attacks yes no n/a
Allows half-closed connections no yes n/a
Reachability check yes yes no
Psuedo-header for checksum no (vtags) yes yes
Time wait state vtags 4-tuple n/a
Since tcp usages are pretty straightforward from other answers, I'll mention some interesting UDP use-cases:
1)DHCP - Dynamic Host Configuration Protocol, which is being used in order to dynamically assign IP address and some other network configuration to the connecting devices. In simple words, this protocol allows you just connect to the network cable(or wifi) and start using the internet, without any additional configurations. DHCP uses UDP protocol. Since the settings request message is being broadcasted from the host and there is no way to establish a TCP connection with DHCP server(you don't know it's address) it's impossible to use TCP instead.
2)Traceroute - well-known network diagnostic tool which allows you to explore which path in the network your datagram passes to reach it's destination(and how much time it takes). By default, it works by sending UDP datagram with unlikely destination port number(ranging from 33434 to 33534) to the destination with the ttl(time-to-live) field set to 1. When the router somewhere in the network gets such datagram - it finds out that the datagram is expired. Then, the router drops the datagram and sends to the origin of the datagram an ICMP(Internet Control Message Protocol) error message indicating that the datagram's ttl was expired and containing router's name and IP address. Each time the host sends datagrams with higher and higher TTL, thus increasing the network part which it succeeds to overcome and getting new ICMP messages from new routers. When it eventually reaches it's destination(datagrams TTL is big enough to allow it),- the destination host sends 'Destination port unreachable' ICMP message to the origin host. This way, Traceroute knows that the destination was reached. Since the TCP guarantees segments delivery it would be at least inefficient to use it instead of UDP which, in turn, allows datagram to be just dropped without any resend attempts(resend is implemented on the higher level, with continuously increasing TTL as described above).
TCP: will get there in meaningful order
UDP: god knows (maybe)
UDP is applied a lot in games or other Peer-to-peer setups because it's faster and most of the time you don't need the protocol itself to make sure everything gets to the destination in the original order (UDP does not garantee packet delivery or delivery order).
Web traffic on the other hand is over TCP. (I'm not sure here but I think it has to do with the way the HTTP protocol is built)
Edited because I failed at UDP.
Real life examples of both TCP and UDP
tcp -> a phone call, sms or anything specific to destination
UDP -> a FM radio channel (AM), Wi-Fi.

Why do we say the IP protocol in TCP/IP suite is connectionless?

Why is the IP called a connectionless protocol? If so, what is the connection-oriented protocol then?
Thanks.
Update - 1 - 20:21 2010/12/26
I think, to better answer my question, it would be better to explain what "connection" actually means, both physically and logically.
Update - 2 - 9:59 AM 2/1/2013
Based on all the answers below, I come to the feeling that the 'connection' mentioned here should be considered as a set of actions/arrangements/disciplines. Thus it's more an abstract concept rather than a concrete object.
Update - 3 - 11:35 AM 6/18/2015
Here's a more physical explanation:
IP protocol is connectionless in that all packets in IP network are routed independently, they may not necessarily go through the same route, while in a virtual circuit network which is connection oriented, all packets go through the same route. This single route is what 'virtual circuit' means.
With connection, because there's only 1 route, all data packets will arrive in the same order as they are sent out.
Without connection, it is not guaranteed all data packets will arrive
in the same order as they are sent out.
Update - 4 - 9:55 AM 2016/1/20/Wed
One of the characteristics of connection-oriented is that the packet order is preserved. TCP use a sequence number to achieve that but IP has no such facility. Thus TCP is connection-oriented while IP is connection-less.
The basic idea is pretty simple: with IP (on its own -- no TCP, UDP, etc.) you're just sending a packet of data. You simply send some data onto the net with a destination address, but that's it. By itself, IP gives:
no assurance that it'll be delivered
no way to find out if it was
nothing to let the destination know to expect a packet
much of anything else
All it does is specify a minimal packet format so you can get some data from one point to another (e.g., routers know the packet format, so they can look at the destination and send the packet on its next hop).
TCP is connection oriented. Establishing a connection means that at the beginning of a TCP conversation, it does a "three way handshake" so (in particular) the destination knows that a connection with the source has been established. It keeps track of that address internally, so it can/will/does expect more packets from it, and be able to send replies to (for example) acknowledge each packet it receives. The source and destination also cooperate to serial number all the packets for the acknowledgment scheme, so each end knows whether packets it sent were received at the other end. This doesn't involve much physically, but logically it involves allocating some memory on both ends. That includes memory for metadata like the next packet serial number to use, as well as payload data for possible re-transmission until the other side acknowledges receipt of that packet.
TCP/IP means "TCP over IP".
TCP
--
IP
TCP provides the "connection-oriented" logic, ordering and control
IP provides getting packets from A to B however it can: "connectionless"
Notes:
UDP is connection less but at the same level as TCP
Other protocols such as ICMP (used by ping) can run over IP but have nothing to do with TCP
Edit:
"connection-oriented" mean established end to end connection. For example, you pick up the telephone, call someone = you have a connection.
"connection-less" means "send it, see what happens". For example, sending a letter via snail mail.a
So IP gets your packets from A to B, maybe, in any order, not always eventually. TCP sorts them out, acknowledges them, requests a resends and provides the "connection"
Connectionless means that no effort is made to set up a dedicated end-to-end connection, While Connection-Oriented means that when devices communicate, they perform handshaking to set up an end-to-end connection.
IP is an example of the Connectionless protocols , in this kind of protocols you usually send informations in one direction, from source to destination without checking to see if the destination is still there, or if it is prepared to receive the information . Connectionless protocols (Like IP and UDP) are used for example with the Video Conferencing when you don't care if some packets are lost , while you have to use a Connection-Oriented protocol (Like TCP) when you send a File because you want to insure that all the packets are sent successfully (actually we use FTP to transfer Files). Edit :
In telecommunication and computing in
general, a connection is the
successful completion of necessary
arrangements so that two or more
parties (for example, people or
programs) can communicate at a long
distance. In this usage, the term has
a strong physical (hardware)
connotation although logical
(software) elements are usually
involved as well.
The physical connection is layer 1 of
the OSI model, and is the medium
through which the data is transfered.
i.e., cables
The logical connection is layer 3 of
the OSI model, and is the network
portion. Using the Internetwork
Protocol (IP), each host is assigned a
32 bit IP address. e.g. 192.168.1.1
TCP is the connection part of TCP/IP. IP's the addressing.
Or, as an analogy, IP is the address written on the envelope, TCP is the postal system which uses the address as part of the work of getting the envelope from point A to point B.
When two hosts want to communicate using connection oriented protocol, one of them must first initiate a connection and the other must accept it. Logically a connection is made between a port in one host and other port in the other host. Software in one host must perform a connect socket operation, and the other must perform an accept socket operation. Physically the initiator host sends a SYN packet, which contains all four connection identifying numbers (source IP, source port, destination IP, destination port). The other receives it and sends SYN-ACK, the initiator sends an ACK, then the connection are established. After the connection established, then the data could be transferred, in both directions.
In the other hand, connectionless protocol means that we don't need to establish connection to send data. It means the first packet being sent from one host to another could contain data payloads. Of course for upper layer protocols such as UDP, the recipient must be ready first, (e.g.) it must perform a listen udp socket operation.
The connectionless IP became foundation for TCP in the layer above
In TCP, at minimal 2x round trip times are required to send just one packet of data. That is : a->b for SYN, b->a for SYN-ACK, a->b for ACK with DATA, b->a for ACK. For flow rate control, Nagle's algorithm is applied here.
In UDP, only 0.5 round trip times are required : a->b with DATA. But be prepared that some packets could be silently lost and there is no flow control being done. Packets could be sent in the rate that are larger than the capability of the receiving system.
In my knowledge, every layer makes a fool of the one above it. The TCP gets an HTTP message from the Application layer and breaks it into packets. Lets call them data packets. The IP gets these packets one by one from TCP and throws it towards the destination; also, it collects an incoming packet and delivers it to TCP. Now, TCP after sending a packet, waits for an acknowledgement packet from the other side. If it comes, it says the above layer, hey, I have established a connection and now we can communicate! The whole communication process goes on between the TCP layers on both the sides sending and receiving different types of packets with each other (such as data packet, acknowledgement packet, synchronization packet , blah blah packet). It uses other tricks (all packet sending) to ensure the actual data packets to be delivered in ordered as they were broken and assembled. After assembling, it transfers them to the above application layer. That fool thinks that it has got an HTTP message in an established connection but in reality, just packets are being transferred.
I just came across this question today. It was bouncing around in my head all day and didn't make any sense. IP doesn't handle transport. Why would anyone even think of IP as connectionless or connection oriented? It is technically connectionless because it offers no reliability, no guaranteed delivery. But so is my toaster. My toaster offers no guaranteed delivery, so why not call aa toaster connectionless too?
In the end, I found out it's just some stupid title that someone somewhere attached to IP and it stuck, and now everyone calls IP connectionless and has no good reason for it.
Calling IP connectionless implies there is another layer 3 protocol that is connection oriented, but as far as I know, there isn't and it is just plain stupid to specify that IP is connectionless. MAC is connectionless. LLC is connectionless. But that is useless, technically correct info.

What is the difference between a port and a socket?

This was a question raised by one of the software engineers in my organisation. I'm interested in the broadest definition.
Summary
A TCP socket is an endpoint instance defined by an IP address and a port in the context of either a particular TCP connection or the listening state.
A port is a virtualisation identifier defining a service endpoint (as distinct from a service instance endpoint aka session identifier).
A TCP socket is not a connection, it is the endpoint of a specific connection.
There can be concurrent connections to a service endpoint, because a connection is identified by both its local and remote endpoints, allowing traffic to be routed to a specific service instance.
There can only be one listener socket for a given address/port combination.
Exposition
This was an interesting question that forced me to re-examine a number of things I thought I knew inside out. You'd think a name like "socket" would be self-explanatory: it was obviously chosen to evoke imagery of the endpoint into which you plug a network cable, there being strong functional parallels. Nevertheless, in network parlance the word "socket" carries so much baggage that a careful re-examination is necessary.
In the broadest possible sense, a port is a point of ingress or egress. Although not used in a networking context, the French word porte literally means door or gateway, further emphasising the fact that ports are transportation endpoints whether you ship data or big steel containers.
For the purpose of this discussion I will limit consideration to the context of TCP-IP networks. The OSI model is all very well but has never been completely implemented, much less widely deployed in high-traffic high-stress conditions.
The combination of an IP address and a port is strictly known as an endpoint and is sometimes called a socket. This usage originates with RFC793, the original TCP specification.
A TCP connection is defined by two endpoints aka sockets.
An endpoint (socket) is defined by the combination of a network address and a port identifier. Note that address/port does not completely identify a socket (more on this later).
The purpose of ports is to differentiate multiple endpoints on a given network address. You could say that a port is a virtualised endpoint. This virtualisation makes multiple concurrent connections on a single network interface possible.
It is the socket pair (the 4-tuple
consisting of the client IP address,
client port number, server IP address,
and server port number) that specifies
the two endpoints that uniquely
identifies each TCP connection in an
internet. (TCP-IP Illustrated Volume 1, W. Richard Stevens)
In most C-derived languages, TCP connections are established and manipulated using methods on an instance of a Socket class. Although it is common to operate on a higher level of abstraction, typically an instance of a NetworkStream class, this generally exposes a reference to a socket object. To the coder this socket object appears to represent the connection because the connection is created and manipulated using methods of the socket object.
In C#, to establish a TCP connection (to an existing listener) first you create a TcpClient. If you don't specify an endpoint to the TcpClient constructor it uses defaults - one way or another the local endpoint is defined. Then you invoke the Connect
method on the instance you've created. This method requires a parameter describing the other endpoint.
All this is a bit confusing and leads you to believe that a socket is a connection, which is bollocks. I was labouring under this misapprehension until Richard Dorman asked the question.
Having done a lot of reading and thinking, I'm now convinced that it would make a lot more sense to have a class TcpConnection with a constructor that takes two arguments, LocalEndpoint and RemoteEndpoint. You could probably support a single argument RemoteEndpoint when defaults are acceptable for the local endpoint. This is ambiguous on multihomed computers, but the ambiguity can be resolved using the routing table by selecting the interface with the shortest route to the remote endpoint.
Clarity would be enhanced in other respects, too. A socket is not identified by the combination of IP address and port:
[...]TCP demultiplexes incoming segments using all four values that comprise the local and foreign addresses: destination IP address, destination port number, source IP address, and source port number. TCP cannot determine which process gets an incoming segment by looking at the destination port only. Also, the only one of the [various] endpoints at [a given port number] that will receive incoming connection requests is the one in the listen state. (p255, TCP-IP Illustrated Volume 1, W. Richard Stevens)
As you can see, it is not just possible but quite likely for a network service to have numerous sockets with the same address/port, but only one listener socket on a particular address/port combination. Typical library implementations present a socket class, an instance of which is used to create and manage a connection. This is extremely unfortunate, since it causes confusion and has lead to widespread conflation of the two concepts.
Hagrawal doesn't believe me (see comments) so here's a real sample. I connected a web browser to http://dilbert.com and then ran netstat -an -p tcp. The last six lines of the output contain two examples of the fact that address and port are not enough to uniquely identify a socket. There are two distinct connections between 192.168.1.3 (my workstation) and 54.252.94.236:80 (the remote HTTP server)
TCP 192.168.1.3:63240 54.252.94.236:80 SYN_SENT
TCP 192.168.1.3:63241 54.252.94.236:80 SYN_SENT
TCP 192.168.1.3:63242 207.38.110.62:80 SYN_SENT
TCP 192.168.1.3:63243 207.38.110.62:80 SYN_SENT
TCP 192.168.1.3:64161 65.54.225.168:443 ESTABLISHED
Since a socket is the endpoint of a connection, there are two sockets with the address/port combination 207.38.110.62:80 and two more with the address/port combination 54.252.94.236:80.
I think Hagrawal's misunderstanding arises from my very careful use of the word "identifies". I mean "completely, unambiguously and uniquely identifies". In the above sample there are two endpoints with the address/port combination 54.252.94.236:80. If all you have is address and port, you don't have enough information to tell these sockets apart. It's not enough information to identify a socket.
Addendum
Paragraph two of section 2.7 of RFC793 says
A connection is fully specified by the pair of sockets at the ends. A
local socket may participate in many connections to different foreign
sockets.
This definition of socket is not helpful from a programming perspective because it is not the same as a socket object, which is the endpoint of a particular connection. To a programmer, and most of this question's audience are programmers, this is a vital functional difference.
#plugwash makes a salient observation.
The fundamental problem is that the TCP RFC definition of socket is in conflict with the defintion of socket used by all major operating systems and libraries.
By definition the RFC is correct. When a library misuses terminology, this does not supersede the RFC. Instead, it imposes a burden of responsibility on users of that library to understand both interpretations and to be careful with words and context. Where RFCs do not agree, the most recent and most directly applicable RFC takes precedence.
References
TCP-IP Illustrated Volume 1 The Protocols, W. Richard Stevens, 1994 Addison Wesley
RFC793, Information Sciences Institute, University of Southern California for DARPA
RFC147, The Definition of a Socket, Joel M. Winett, Lincoln Laboratory
A socket consists of three things:
An IP address
A transport protocol
A port number
A port is a number between 1 and 65535 inclusive that signifies a logical gate in a device.
Every connection between a client and server requires a unique socket.
For example:
1030 is a port.
(10.1.1.2 , TCP , port 1030) is a socket.
A socket represents a single connection between two network applications. These two applications nominally run on different computers, but sockets can also be used for interprocess communication on a single computer. Applications can create multiple sockets for communicating with each other. Sockets are bidirectional, meaning that either side of the connection is capable of both sending and receiving data.
Therefore a socket can be created theoretically at any level of the OSI model from 2 upwards. Programmers often use sockets in network programming, albeit indirectly. Programming libraries like Winsock hide many of the low-level details of socket programming. Sockets have been in widespread use since the early 1980s.
A port represents an endpoint or "channel" for network communications. Port numbers allow different applications on the same computer to utilize network resources without interfering with each other. Port numbers most commonly appear in network programming, particularly socket programming. Sometimes, though, port numbers are made visible to the casual user. For example, some Web sites a person visits on the Internet use a URL like the following:
http://www.mairie-metz.fr:8080/ In this example, the number 8080 refers to the port number used by the Web browser to connect to the Web server. Normally, a Web site uses port number 80 and this number need not be included with the URL (although it can be).
In IP networking, port numbers can theoretically range from 0 to 65535. Most popular network applications, though, use port numbers at the low end of the range (such as 80 for HTTP).
Note: The term port also refers to several other aspects of network technology. A port can refer to a physical connection point for peripheral devices such as serial, parallel, and USB ports. The term port also refers to certain Ethernet connection points, such as those on a hub, switch, or router.
ref http://compnetworking.about.com/od/basicnetworkingconcepts/l/bldef_port.htm
ref http://compnetworking.about.com/od/itinformationtechnology/l/bldef_socket.htm
With some analogy
Although a lot technical stuff is already given above for sockets...
I would like to add my answer, just in case , if somebody still could not feel the difference between ip, port and sockets
Consider a server S,
and say person X,Y,Z need a service (say chat service) from that server S
then
IP address tells --> who? is that chat server 'S' that X,Y,Z want to contact
okay, you got "who is the server"
but suppose that server 'S' is providing some other services to other people as well,say 'S' provides storage services to person A,B,C
then
port tells ---> which? service you (X,Y,Z) need i.e. chat service and not that storage service
okay.., you make server to come to know that 'chat service' is what you want and not the storage
but
you are three and the server might want to identify all the three differently
there comes the socket
now socket tells--> which one? particular connection
that is , say ,
socket 1 for person X
socket 2 for person Y
and socket 3 for person Z
Firsty, I think we should start with a little understanding of what constitutes getting a packet from A to B.
A common definition for a network is the use of the OSI Model which separates a network out into a number of layers according to purpose. There are a few important ones, which we'll cover here:
The data link layer. This layer is responsible for getting packets of data from one network device to another and is just above the layer that actually does the transmitting. It talks about MAC addresses and knows how to find hosts based on their MAC (hardware) address, but nothing more.
The network layer is the layer that allows you to transport data across machines and over physical boundaries, such as physical devices. The network layer must essentially support an additional address based mechanism which relates somehow to the physical address; enter the Internet Protocol (IPv4). An IP address can get your packet from A to B over the internet, but knows nothing about how to traverse individual hops. This is handled by the layer above in accordance with routing information.
The transport layer. This layer is responsible for defining the way information gets from A to B and any restrictions, checks or errors on that behaviour. For example, TCP adds additional information to a packet such that it is possible to deduce if packets have been lost.
TCP contains, amongst other things, the concept of ports. These are effectively different data endpoints on the same IP address to which an Internet Socket (AF_INET) can bind.
As it happens, so too does UDP, and other transport layer protocols. They don't technically need to feature ports, but these ports do provide a way for multiple applications in the layers above to use the same computer to receive (and indeed make) outgoing connections.
Which brings us to the anatomy of a TCP or UDP connection. Each features a source port and address, and a target port and address. This is so that in any given session, the target application can respond, as well as receive, from the source.
So ports are essentially a specification-mandated way of allowing multiple concurrent connections sharing the same address.
Now, we need to take a look at how you communicate from an application point of view to the outside world. To do this, you need to kindly ask your operating system and since most OSes support the Berkeley Sockets way of doing things, we see we can create sockets involving ports from an application like this:
int fd = socket(AF_INET, SOCK_STREAM, 0); // tcp socket
int fd = socket(AF_INET, SOCK_DGRAM, 0); // udp socket
// later we bind...
Great! So in the sockaddr structures, we'll specify our port and bam! Job done! Well, almost, except:
int fd = socket(AF_UNIX, SOCK_STREAM, 0);
is also possible. Urgh, that's thrown a spanner in the works!
Ok, well actually it hasn't. All we need to do is come up with some appropriate definitions:
An internet socket is the combination of an IP address, a protocol and its associated port number on which a service may provide data. So tcp port 80, stackoverflow.com is an internet socket.
An unix socket is an IPC endpoint represented in the file system, e.g. /var/run/database.sock.
A socket API is a method of requesting an application be able to read and write data to a socket.
Voila! That tidies things up. So in our scheme then,
A port is a numeric identifier which, as part of a transport layer protocol, identifies the service number which should respond to the given request.
So really a port is a subset of the requirements for forming an internet socket. Unfortunately, it just so happens that the meaning of the word socket has been applied to several different ideas. So I heartily advise you name your next project socket, just to add to the confusion ;)
Generally, you will get a lot of theoretical but one of the easiest ways to differentiate these two concepts is as follows:
In order to get a service, you need a service number. This service number is called a port. Simple as that.
For example, the HTTP as a service is running on port 80.
Now, many people can request the service, and a connection from client-server gets established. There will be a lot of connections. Each connection represents a client. In order to maintain each connection, the server creates a socket per connection to maintain its client.
A socket = IP Address + a port (numeric address)
Together they identify an end-point for a network connection on a machine. (Did I just flunk network 101?)
These are basic networking concepts so I will explain them in an easy yet a comprehensive way to understand in details.
A socket is like a telephone (i.e. end to end device for communication)
IP is like your telephone number (i.e. address for your socket)
Port is like the person you want to talk to (i.e. the service you want to order from that address)
A socket can be a client or a server socket (i.e. in a company the telephone of the customer support is a server but a telephone in your home is mostly a client)
So a socket in networking is a virtual communication device bound to a pair (ip , port) = (address , service).
Note:
A machine, a computer, a host, a mobile, or a PC can have multiple addresses , multiple open ports, and thus multiple sockets. Like in an office you can have multiple telephones with multiple telephone numbers and multiple people to talk to.
Existence of an open/active port necessitate that you must have a socket bound to it, because it is the socket that makes the port accessible. However, you may have unused ports for the time being.
Also note, in a server socket you can bind it to (a port, a specific address of a machine) or to (a port, all addresses of a machine) as in the telephone you may connect many telephone lines (telephone numbers) to a telephone or one specific telephone line to a telephone and still you can reach a person through all these telephone lines or through a specific telephone line.
You can not associate (bind) a socket with two ports as in the telephone usually you can not always have two people using the same telephone at the same time .
Advanced: on the same machine you cannot have two sockets with same type (client, or server) and same port and ip. However, if you are a client you can open two connections, with two sockets, to a server because the local port in each of these client's sockets is different)
Hope it clears you doubts
There seems to be a lot of answers equating socket with the connection between 2 PC's..which I think is absolutely incorrect. A socket has always been the endpoint on 1 PC, that may or may not be connected - surely we've all used listener or UDP sockets* at some point. The important part is that it's addressable and active. Sending a message to 1.1.1.1:1234 is not likely to work, as there is no socket defined for that endpoint.
Sockets are protocol specific - so the implementation of uniqueness that both TCP/IP and UDP/IP uses* (ipaddress:port), is different than eg., IPX (Network, Node, and...ahem, socket - but a different socket than is meant by the general "socket" term. IPX socket numbers are equivalent to IP ports). But, they all offer a unique addressable endpoint.
Since IP has become the dominant protocol, a port (in networking terms) has become synonomous with either a UDP or TCP port number - which is a portion of the socket address.
UDP is connection-less - meaning no virtual circuit between the 2 endpoints is ever created. However, we still refer to UDP sockets as the endpoint. The API functions make it clear that both are just different type of sockets - SOCK_DGRAM is UDP (just sending a message) and SOCK_STREAM is TCP (creating a virtual circuit).
Technically, the IP header holds the IP Address, and the protocol on top of IP (UDP or TCP) holds the port number. This makes it possible to have other protocols (eg. ICMP that have no port numbers, but do have IP addressing information).
Short brief answer.
A port can be described as an internal address within a host that identifies a program or process.
A socket can be described as a programming interface allowing a program to communicate with other programs or processes, on the internet, or locally.
They are terms from two different domains: 'port' is a concept from TCP/IP networking, 'socket' is an API (programming) thing. A 'socket' is made (in code) by taking a port and a hostname or network adapter and combining them into a data structure that you can use to send or receive data.
After reading the excellent up-voted answers, I found that the following point needed emphasis for me, a newcomer to network programming:
TCP-IP connections are bi-directional pathways connecting one address:port combination with another address:port combination. Therefore, whenever you open a connection from your local machine to a port on a remote server (say www.google.com:80), you are also associating a new port number on your machine with the connection, to allow the server to send things back to you, (e.g. 127.0.0.1:65234). It can be helpful to use netstat to look at your machine's connections:
> netstat -nWp tcp (on OS X)
Active Internet connections
Proto Recv-Q Send-Q Local Address Foreign Address (state)
tcp4 0 0 192.168.0.6.49871 17.172.232.57.5223 ESTABLISHED
...
A socket is a communication endpoint. A socket is not directly related to the TCP/IP protocol family, it can be used with any protocol your system supports. The C socket API expects you to first get a blank socket object from the system that you can then either bind to a local socket address (to directly retrieve incoming traffic for connection-less protocols or to accept incoming connection requests for connection-oriented protocols) or that you can connect to a remote socket address (for either kind of protocol). You can even do both if you want to control both, the local socket address a socket is bound to and the remote socket address a socket is connected to. For connection-less protocols connecting a socket is even optional but if you don't do that, you'll have to also pass the destination address with every packet you want to send over the socket as how else would the socket know where to send this data to? Advantage is that you can use a single socket to send packets to different socket addresses. Once you have your socket configured and maybe even connected, consider it to be a bi-directional communication pipe. You can use it to pass data to some destination and some destination can use it to pass data back to you. What you write to a socket is send out and what has been received is available for reading.
Ports on the other hand are something that only certain protocols of the TCP/IP protocol stack have. TCP and UDP packets have ports. A port is just a simple number. The combination of source port and destination port identify a communication channel between two hosts. E.g. you may have a server that shall be both, a simple HTTP server and a simple FTP server. If now a packet arrives for the address of that server, how would it know if that is a packet for the HTTP or the FTP server? Well, it will know so as the HTTP server will run on port 80 and the FTP server on port 21, so if the packet arrives with a destination port 80, it is for the HTTP server and not for the FTP server. Also the packet has a source port since without such a source port, a server could only have one connection to one IP address at a time. The source port makes it possible for a server to distinguish otherwise identical connections: they all have the same destination port, e.g. port 80, the same destination IP (the IP of the server), and the same source IP, as they all come from the same client, but as they have different source ports, the server can distinguish them from each other. And when the server sends back replies, it will do so to the port the request came from, that way the client can also distinguish different replies it receives from the same server.
A socket is a special type of file handle which is used by a process to request network services from the operating system.
A socket address is the triple:
{protocol, local-address, local-process} where the local process is identified by a port number.
In the TCP/IP suite, for example:
{tcp, 193.44.234.3, 12345}
A conversation is the communication link between two processes thus depicting an association between two.
An association is the 5-tuple that completely specifies the two processes that comprise a connection:
{protocol, local-address, local-process, foreign-address, foreign-process}
In the TCP/IP suite, for example:
{tcp, 193.44.234.3, 1500, 193.44.234.5, 21}
could be a valid association.
A half-association is either:
{protocol, local-address, local-process}
or
{protocol, foreign-address, foreign-process}
which specify each half of a connection.
The half-association is also called a socket or a transport address. That is, a socket is an end point for communication that can be named and addressed in a network.
The socket interface is one of several application programming interfaces (APIs) to the communication protocols. Designed to be a generic communication programming interface, it was first introduced by the 4.2BSD UNIX system. Although it has not been standardized, it has become a de facto industry standard.
A socket address is an IP address & port number
123.132.213.231 # IP address
:1234 # port number
123.132.213.231:1234 # socket address
A connection occurs when 2 sockets are bound together.
An application consists of pair of processes which communicate over the network (client-server pair). These processes send and receive messages, into and from the network through a software interface called socket. Considering the analogy presented in the book "Computer Networking: Top Down Approach". There is a house that wants to communicate with other house. Here, house is analogous to a process, and door to a socket. Sending process assumes that there is a infrastructure on the other side of the door that will transport the data to the destination. Once the message is arrived on the other side, it passes through receiver's door (socket) into the house (process). This illustration from the same book can help you:
Sockets are part of transport layer, which provides logical communication to applications. This means that from application's point of view both hosts are directly connected to each other, even though there are numerous routers and/or switches between them. Thus a socket is not a connection itself, it's the end point of the connection. Transport layer protocols are implemented only on hosts, and not on intermediate routers.
Ports provide means of internal addressing to a machine. The primary purpose it to allow multiple processes to send and receive data over the network without interfering with other processes (their data). All sockets are provided with a port number. When a segment arrives to a host, the transport layer examines the destination port number of the segment. It then forwards the segment to the corresponding socket. This job of delivering the data in a transport layer segment to the correct socket is called de-multiplexing. The segment's data is then forwarded to the process attached to the socket.
The port was the easiest part, it is just a unique identifier for a socket. A socket is something processes can use to establish connections and to communicate with each other. Tall Jeff had a great telephone analogy which was not perfect, so I decided to fix it:
ip and port ~ phone number
socket ~ phone device
connection ~ phone call
establishing connection ~ calling a number
processes, remote applications ~ people
messages ~ speech
A socket is a structure in your software. It's more-or-less a file; it has operations like read and write. It isn't a physical thing; it's a way for your software to refer to physical things.
A port is a device-like thing. Each host has one or more networks (those are physical); a host has an address on each network. Each address can have thousands of ports.
One socket only may be using a port at an address. The socket allocates the port approximately like allocating a device for file system I/O. Once the port is allocated, no other socket can connect to that port. The port will be freed when the socket is closed.
Take a look at TCP/IP Terminology.
from Oracle Java Tutorial:
A socket is one endpoint of a two-way communication link between two programs running on the network. A socket is bound to a port number so that the TCP layer can identify the application that data is destined to be sent to.
Port and socket can be compared to the Bank Branch.
The building number of the "Bank" is analogous to IP address.
A bank has got different sections like:
Savings account department
Personal loan department
Home loan department
Grievance department
So 1 (savings account department), 2 (personal loan department), 3 (home loan department) and 4 (grievance department) are ports.
Now let us say you go to open a savings account, you go to the bank (IP address), then you go to "savings account department" (port number 1), then you meet one of the employees working under "savings account department". Let us call him SAVINGACCOUNT_EMPLOYEE1 for opening account.
SAVINGACCOUNT_EMPLOYEE1 is your socket descriptor, so there may be
SAVINGACCOUNT_EMPLOYEE1 to SAVINGACCOUNT_EMPLOYEEN. These are all socket descriptors.
Likewise, other departments will be having employess working under them and they are analogous to socket.
A socket is a data I/O mechanism. A port is a contractual concept of a communication protocol. A socket can exist without a port. A port can exist witout a specific socket (e.g. if several sockets are active on the same port, which may be allowed for some protocols).
A port is used to determine which socket the receiver should route the packet to, with many protocols, but it is not always required and the receiving socket selection can be done by other means - a port is entirely a tool used by the protocol handler in the network subsystem. e.g. if a protocol does not use a port, packets can go to all listening sockets or any socket.
Port:
A port can refer to a physical connection point
for peripheral devices such as serial, parallel, and USB ports.
The term port also refers to certain Ethernet connection points, s
uch as those on a hub, switch, or router.
Socket:
A socket represents a single connection between two network applications.
These two applications nominally run on different computers,
but sockets can also be used for interprocess communication on a single computer.
Applications can create multiple sockets for communicating with each other.
Sockets are bidirectional, meaning that either side of the connection is capable of both sending and receiving data.
Relative TCP/IP terminology which is what I assume is implied by the question. In layman's terms:
A PORT is like the telephone number of a particular house in a particular zip code. The ZIP code of the town could be thought of as the IP address of the town and all the houses in that town.
A SOCKET on the other hand is more like an established phone call between telephones of a pair of houses talking to each other. Those calls can be established between houses in the same town or two houses in different towns. It's that temporary established pathway between the pair of phones talking to each other that is the SOCKET.
In a broad sense,
Socket - is just that, a socket, just like your electrical, cable or telephone socket. A point where "requisite stuff" (power, signal, information) can go out and come in from. It hides a lot of detailed stuff, which is not required for the use of the "requisite stuff". In software parlance, it provides a generic way of defining a mechanism of communication between two entities (those entities could be anything - two applications, two physically separate devices, User & Kernel space within an OS, etc)
A Port is an endpoint discriminator. It differentiates one endpoint from another. At networking level, it differentiates one application from another, so that the networking stack can pass on information to the appropriate application.
Socket is an abstraction provided by kernel to user applications for data I/O. A socket type is defined by the protocol it's handling, an IPC communication etc. So if somebody creates a TCP socket he can do manipulations like reading data to socket and writing data to it by simple methods and the lower level protocol handling like TCP conversions and forwarding packets to lower level network protocols is done by the particular socket implementation in the kernel. The advantage is that user need not worry about handling protocol specific nitigrities and should just read and write data to socket like a normal buffer. Same is true in case of IPC, user just reads and writes data to socket and kernel handles all lower level details based on the type of socket created.
Port together with IP is like providing an address to the socket, though its not necessary, but it helps in network communications.
A port denotes a communication endpoint in the TCP and UDP transports for the IP network protocol. A socket is a software abstraction for a communication endpoint commonly used in implementations of these protocols (socket API). An alternative implementation is the XTI/TLI API.
See also:
Stevens, W. R. 1998, UNIX Network Programming: Networking APIs: Sockets and XTI; Volume 1, Prentice Hall.
Stevens, W. R., 1994, TCP/IP Illustrated, Volume 1: The Protocols, Addison-Wesley.
Socket is SW abstraction of networking endpoint, used as the interface to the application. In Java, C# it is represented by object, in Linux, Unix it is a file.
Port is just a property of a socket you have specify if you want to establish a communication. To receieve packet from a socket you have to bind it to specific local port and NIC (with local IP address) or all NICs (INADDR_ANY is specified in the bind call). To send packet, you have to specify port and IP of the remote socket.
Already theoretical answers have been given to this question. I would like to give a practical example to this question, which will clear your understanding about Socket and Port.
I found it here
This example will walk you thru the process of connecting to a website, such as Wiley. You would open your web browser (like Mozilla Firefox) and type www.wiley.com into the address bar. Your web browser uses a Domain Name System (DNS) server to look up the name www.wiley.com to identify its IP address is. For this example, the address is 192.0.2.100.
Firefox makes a connection to the 192.0.2.100 address and to the port
where the application layer web server is operating. Firefox knows
what port to expect because it is a well-known port . The well-known
port for a web server is TCP port 80.
The destination socket that Firefox attempts to connect is written as
socket:port, or in this example, 192.0.2.100:80. This is the server
side of the connect, but the server needs to know where to send the
web page you want to view in Mozilla Firefox, so you have a socket for
the client side of the connection also.
The client side connection is made up of your IP address, such as
192.168.1.25, and a randomly chosen dynamic port number. The socket associated with Firefox looks like 192.168.1.25:49175. Because web
servers operate on TCP port 80, both of these sockets are TCP sockets,
whereas if you were connecting to a server operating on a UDP port,
both the server and client sockets would be UDP sockets.
A single port can have one or more sockets connected with different external IP's like a multiple electrical outlet.
TCP 192.168.100.2:9001 155.94.246.179:39255 ESTABLISHED 1312
TCP 192.168.100.2:9001 171.25.193.9:61832 ESTABLISHED 1312
TCP 192.168.100.2:9001 178.62.199.226:37912 ESTABLISHED 1312
TCP 192.168.100.2:9001 188.193.64.150:40900 ESTABLISHED 1312
TCP 192.168.100.2:9001 198.23.194.149:43970 ESTABLISHED 1312
TCP 192.168.100.2:9001 198.49.73.11:38842 ESTABLISHED 1312
A socket is basically an endpoint for network communication, consisting of at least an IP-address and a port. In Java/C# a socket is a higher level implementation of one side of a two-way connection.
Also, a (non-normative) definition in the Java Tutorial.

Resources