How browser access internet when ports are blocked - http

I was having a doubt on how browser gets the data from website. I read these two links:
how can an application use port 80/HTTP without conflicting with browsers?
and
Port 80 blocked on my ISP so how my browser still works?
With this I understand that browser opens a local random source port and connect to port 80 of website. Now our system firewall have opened all outbound connection and blocked all incoming connection as default configuration. So how does it get back the response. Similarly how response comes back when our home routers and ISP have ports blocked.
So now, I am assuming that connection is somewhat different from response. And there must be some sort of header/information that is sent along which helps in recognizing it as response? And this helps in bypassing the ports?
My humble apologies in case I am messing up all terminologies and thanks for patience. I am beginner in this stuff. Any link towards guide will be very useful.

So how does it get back the response
Assuming you're talking about a firewall or NAT, these devices track outgoing connections, and allow replies to pass through. Connections are typically identified using Source IP + Destination IP + Source Port + Destination Port + Protocol (TCP/UDP). These connection identifiers are stored in a table in the NAT/Firewall.

Related

How do I implement a captive portal with STM32 microcontroller?

I am using my device as an Access Point and I need to display a deafault web page once somebody is connected.
Right now I have to connect to a specific IP (socket parameters: IP 192.168.0.1 , PORT 80) to show the page.
The page doesn't need to open automatically, it just has to show up no matter what URL gets inserted in the browser.
I've read the RFC7710 but I'm not able to implement "low level" solutions such as modify the option 160 of the DHCP.
The best thing for me would be to work at the HTTP level.
Any ideas?
HTTP works on the top of the TCP/IP infrastructure, it requires an established TCP connection first. As long as there is no TCP connection from the client to the HTTP server, the server has no way to tell, force, or trick a client to connect to its IP address and TCP port. No connection, no HTTP.

Networking - sending and receiving data

This could not be the right place, as it's not about pure programming;
nevertheless, as a simple web developer I find myself quite
ignorant on the subject of networking(Wikipedia usually mix
different subjects on the matter), and I feel as it is a "must" to know.
I sort of have an image of what happens when you write google.com
on your browser, and I don't know the whole process(I have a modem,
a router and a few computers connected to it. let's use my case for an example):
You write characters into chrome ->
there is some character encoding done to translate the address(ASCII or else) ->
DNS does something, not sure ->
your router receives a digital request from a computer's internet cable/WIFI, it saves the internal IPV4 address of
the sender in order to know to which computer to respond back. it sends the digital data to the modem ->
your modem receives digital data, and translates it from digital to analog ->
now your network provider does some work - >
the google server receives a request from an IP address - >
not sure how the google server handles the data, nevertheless it sends back data ->
service provider - > router gets translated digital data from the modem and remembers who sent the request, and sends it to the right person.
in order to optimize a web server or maybe to write a better code which involves networking, perhaps each beginner(such as myself) needs to understand this first? Thank you for your time.
EDIT: I did read wikipedia's OSI model, though it's not quite as helpful as I thought it would.
i will try to explain the idea, although its may be much ,more complicate - it depends on how deep you want to go ...
you write "www.stackoverflow.com"
your OS will try to resolve the www.stackoverflow.com to an IP address
since your OS probably cant, it needs to ask a DNS server
assuming you use an external DNS ( say IP=5.5.5.5 and your IP=10.10.10.10 which is on a different networks ), your OS will check if it knows how to reach 5.5.5.5
a default route 0.0.0.0/0 exists on your PC (this is also known as 'default-gw' which includes ALL internet, it points to your local router
an IP packet will be sent to the router MAC address with the DNS IP address in the destination
your router will probably change your private IP address to its own public IP address and will sends it to the ISP
ISP will route it to the internet until it reaches 5.5.5.5 which is the DNS
DNS will reply back resolving stackoverflow.com to an IP address
your PC now knows how to send packets to stackoverflow.com
packet will be sent to stackoverflow ip address (104.16.36.249) to port 80 (http)
stackoverflow web server listen to requests on port 80
once a packet arrives it will generate a response packet
it will send it back to you exactly in the same way
all that traffic can be seen with a network capture utility like wireshark, u can use those commands (windows) to verify...
ping stackoverflow.com
netstat -rn
ipconfig
nslookup
tracert -d

Simplest way to find my external ip address and port

I'm trying to develop an applicaton for p2p communication between two android devices. In order to punch a hole through my NAT(s), I'd need to know my external ip address and port.
To that end, I've developed a java server on GAE to report my "remote" ip address and port. The problem is that on GAE I can get my ip address, but not my port. Without it, I'm unable to successfully punch the hole.
So, my question is what's the best, free method to find out my external IP address and port?
That's a question that has no answer with TCP.
Here's the problem: your "port" is not a fixed value. You don't have "an" external port. You typically get one dynamically assigned for each outbound connection.
As answers you should see from the test sites posted in another answer clearly indicate, it's a moving target (though it may stay stationary for a short time due to the browser using HTTP/1.1 keepalives and actually reusing the same connection, not just the same port)... but if you hit the site repeatedly, you'll see it either drift around randomly, or increment. Trying it from two different web browsers on the same machine, you'd never see the same port number -- the port corresponds to the specific source connection, not the machine sourcing the connection.
Sometimes, you may find that it's the same port number as the port your machine's stack opened for the outbound connection, but even when it is, it doesn't matter, because no traffic should be able to return to your machine on that port unless it is from the IP address and port of the machine to which you made the outbound connection. Any decent network address translating device would never accept traffic from another source IP address and/or port, other than the one you addressed in the outbound connection.
There is no standard, simple, predictable, reliable, or consistent way to punch a hole in TCP NAT and then exploit that hole for a peer-to-per connection. To the extent that such things are possible in a given NAT implementation, that is an implementation that is shoddy, broken, defective, and insecure.
See also: https://www.rfc-editor.org/rfc/rfc5128
Sounds like your app could use a STUN server to get its external address.

What happens when my browser does a search? (ARP,DNS,TCP specifics)

I'm trying to learn the basics of ARP/TCP/HTTP (in sort of a scatter-shot way).
As an example, what happens when I go to google.com and do a search?
My understanding so far:
For my machine to communicate with others (the gateway in this case),
it may need to do an ARP Broadcast (if it doesn't already have the
MAC address in the ARP cache)
It then needs to resolve google.com's IP address. It does this by
contacting the DNS server. (I'm not completely sure how it knows
where the DNS server is? Or is it the gateway that knows?)
This involves communication through the TCP protocol since HTTP is
built on it (TCP handshake: SYN, SYN/ACK, ACK, then requests for
content, then RST, RST/ACK, ACK)
To actually load a webpage, the browser gets the index.html, parses
it, then sends more requests based on what it needs? (images,etc)
And finally, to do the actual google search, I don't understand how
the browser knows to communicate "I typed something in the search box
and hit Enter".
Does this seem about right? / Did I get anything wrong or leave out anything crucial?
Firstly try to understand that your home router is two devices: a switch and a router.
Focus on these facts:
The switch connects all the devices in your LAN together(including the router).
The router merely connects your switch(LAN) with the ISP(WAN).
Your LAN is essentially an Ethernet network which works with MAC addresses.
For my machine to communicate with others (the gateway in this case),
it may need to do an ARP Broadcast (if it doesn't already have the MAC
address in the ARP cache)
Correct.
When you want to send a file from your dekstop to your laptop, you do not want to go through the router. You want to go through the switch, as that is faster(lower layer). However you only know the IP of the laptop in your network. For that reason you need to get its MAC address. That's where ARP kicks in.
In this case you would broadcast the ARP request in the LAN until someone responds to you. This could be the router or any other device connected to the switch.
It then needs to resolve google.com's IP address. It does this by
contacting the DNS server. (I'm not completely sure how it knows where
the DNS server is? Or is it the gateway that knows?)
If you use DHCP, then that has already provided you with the IP of the DNS server. If not, then it means that you manually provided the IP of the DNS. So the IP of the DNS server is stored locally on your computer.
Making a DNS request is just about putting its IP in the packet with the request and forwarding the packet to the network.
Sidenote: DHCP also provides the IP address of the router.
This involves communication through the TCP protocol since HTTP is
built on it (TCP handshake: SYN, SYN/ACK, ACK, then requests for
content, then RST, RST/ACK, ACK)
Yes. To clarify things: When your computer sends the request
FRAME[IP[TCP[GET www.google.com]]]
The frame is being sent to your LAN's switch which forwards it to the MAC of the router. Your router will open the frame to check the destination IP and route it accordingly(in this case to the WAN). Finally when the frame arrives at the server, the server will open the TCP segment and read the payload, which is the HTTP message. The ACK/SYN etc. messages are being processed just by your computer and the server and not any router or switch.
To actually load a webpage, the browser gets the index.html, parses
it, then sends more requests based on what it needs? (images,etc)
Yes. An HTML file is essentially a tree structure which can have embedded resources like images, javafiles, CSS etc. For each such resource a new request has to be sent.
Once your browser gets all these recourses, it will render the webpage.
And finally, to do the actual google search, I don't understand how
the browser knows to communicate "I typed something in the search box
and hit Enter".
When you type a single character, it is being sent to the server. The server then responds with its suggestions. Easy as that.
References(good reads):
http://www.tcpipguide.com/free/t_TheNeedForAddressResolution.htm
http://www.howtogeek.com/99001/htg-explains-routers-and-switches/
http://www.eventhelix.com/realtimemantra/networking/ip_routing.htm#.UsrYAvim3yO
http://en.wikipedia.org/wiki/Dynamic_Host_Configuration_Protocol

Port Forwarding For Online Games/Other Services?

I've noticed recently that I don't have to forward ports for mmorpg's that I play.
I'm thinking about working on a game that people can play online and had a question.
Why is this the case given its a two way socket connection that is constantly sending data back and forth? Doesn't their server need to get through my firewall in order to connect to me?
TCP crash lesson: TCP is a two-way protocol. The challenge is that at least one host needs to initiate. Since within an MMORPG, your own computer is never acting as a server, nobody has to connect to it. All the information about game state is passed through the company's public facing servers that have public facing IP addresses (and hey, maybe they actually use port forwarding there, just to confuse my explanation... but you never have to see their pretty network internals, proxies, and other wizardry.).
Anyway, when you connect to Stackoverflow, you're making one outbound connection that requests data from the server, and then over that same connection you're receiving it back. Same exact scenario, only with a webserver instead of a game.
UPnP allows you to tackle many routers. There's also NAT Punch-Through if you have access to a third party that isn't behind a router.
Either way, port forwarding is only necessary if you wish to act as a server (or the sender in a P2P relationship). A client does not need to forward ports.
You don't need to forward ports to access the web either, despite data coming in as well.
When you make an outgoing TCP connection, your NAT router puts the connection in a table, so that when data comes in, it knows what machine in your LAN to send the packet to.
Everyone mentions TCP, but NAT works for UDP as well: The first outgoing UDP packet associates that source port with the internal IP address, and your NAT device will forward incoming traffic to that port to the correct host on the internal network.
In other words, if your computer requests the connection (outbound) first, the router opens up the port automatically, on the assumption that you're going to want data back. But if you want remote users to connect to your computer without your computer requesting it, the router would normally drop the packets since it wouldn't know where to send them (they were unsolicited). So instead, you need to tell the router to deliver any unsolicited packets at port N to your computer.
Sorry to add another answer so late, and I know one was already accepted, but I personally found the other answers to be more confusing than this simple explanation.

Resources