When we send an HTTP request to a web server as to load a web page e.g. http://wwww.nothing_is_here.com, what exactly the server does as to serve our request? Till now, I thought the server was looking to find a file named index (index.html, index.php) which should have HTML content and send it back to our browser. Now, I know this is not always the case. For example, in ASP .NET where we apply routing, home/index path is added to the URL by default as for our app to be routed. That's I cannot understand is how exactly the server acts upon a similar situation. Why it does not return an error message in case there is no index file, how it knows it has to apply routing rules? How can we instruct the server what to do in either case?
What the server does when it receives a request for the root (which is what it will receive in your example), is up to the server configuration. It is not within the control of the client making the request.
The server will be configured to have a default document (file name) in such cases, which is often index.html, but equally could be any file set by the administrator of the server.
The server will often be configured to recognise different hosts (e.g. if it's serving multiple sites on the same interface:port. These different sites will often have different configuration about what the default file name(s) is/are (if any). In some cases the server is configured to display a directory listing of a server-configured site root folder.
Related
I have an assignment that reads "You want to use your web browser to fetch a web page from a site called "www.wagstaff.info". Its web server is at TCP port 8080, and the page you are looking for is called "horse.html".Give the URL that you enter into the navigation field of your web browser."
My first thought is that I want to write "www.wagstaff.info/" and then somehow query the web server for the object and retrieve it if it finds it, but i'm not sure if this is the right approach / how to do this.
I actually entered this site and I think the port is miswritten and it should be 80. I tried making a TCP connection to this site with port 80 and it works. I made a GET request for /horse.html and I got 404 Not Found, which makes me believe the page /horse.html doesn't exist on this site in actuality, but it doesn't have to, the assignment just uses the site as an example. But how would I make such a query/request not in cmd using telnet, but instead using the web browser and entering a URL?
If i'm on the correct path here, then, in other words, what do I type after "www.wagstaff.info/" to query for an object (or page) "/horse.html"? I would expect to get a 404 not found in my web browser, but to me it would mean I have a correct solution.
i would write the following: www.wagstaff.info:8080/horse.html
the www.wagstaff.info is the domain name that is then resolved to an actual ip address (like 192.168.0.1) what comes after the colon would be the port you're attempting to connect to and everything after that would be the path to the file you're trying to fetch.
I have an HTML5 app written in static html/js/css (it's actually written in Dart, but compiles down to javascript). I'm serving the application files via CDN, with the REST api hosted on a separate domain. The app uses client-side routing, so as the user goes about using the app, the url might change to be something like http://www.myapp.com/categories. The problem is, if the user refreshes the page, it results in a 404.
Are there any CDN's that would allow me to create a rule that, if the user requests a page that is a valid client-side route, it would just return the (in my case) client.html page?
More detailed explanation/example
The static files for my web app are stored on S3 and served via Amazon's CloudFront CDN. There is a single HTML file that bootstraps the application, client.html. This is the default file served when visiting the domain root, so if you go to www.mysite.com the browser is actually served www.mysite.com/client.html.
The web app uses client-side routing. Once the app loads and the user starts navigating, the URL is updated. These urls don't actually exist on the CDN. For example, if the user wanted to browse widgets, she would click a button, client-side routing would display the "widgets" view, and the browser's url would update to www.mysite.com/widgets/browse. On the CDN, /widgets/browse doesn't actually exist, so if the user hits the refresh button on the browser, they get a 404.
My question is whether or not any CDNs support looking at the request URI and rewriting it. So, I could see a request for /widgets/browse and rewrite it to /client.html. That way, the application would be served instead of returning a 404.
I realize there are other solutions to this problem, namely placing a server in front of the CDN, but it's less ideal.
I do this using CloudFront, but I use my own server running Apache to accomplish this. I realize you're using a server with Amazon, but since you didn't specify that you're restricted to that, I figured I'd answer with how to accomplish what you're looking to do anyway.
It's pretty simple. Any time you query something that isn't already in the cache on CloudFront, or exists in the Cache but is expired, CloudFront goes back to your web server asking it to serve up the content. At this point, you have total control over the request. I use the mod_rewrite in Apache to capture the request, then determine what content I'm going to serve depending on the request. In fact, there isn't a single file (spare one php script) on my server, yet cloudfront believes there are thousands. Pretty sure url rewriting is standard on most web servers, I can only confirm on lighttp and apache from my own experience though.
More Info
All you're doing here is just telling your server to rewrite incoming requests in order to satisfy them. This would not be considered a proxy or anything of the sort.
The flow of content between your app and your server, with cloudfront in between is like this:
appRequest->cloudFront
if cloudFront has file, return data to user without asking your server
for the file.
If cloudFront DOESN'T have the file (or it has expired), go back to
the origin server and ask it for a new copy to cache.
So basically, what is happening in your situation is this:
A)app->ask cloudfront for url cloud front doesn't have
B)cloudfront
then asks your source server for the file
C)file doesn't exist there,
so the server tells cloudFront to go fly a kite
D)cloudFront comes back empty handed and makes your app 404
E)app crashes and
burns, users run away and use something else.
So, all you're doing with mod_rewrite is telling your server how it can re-interpret certain formatted requests and act accordingly. You could point all .jpg requests to point to singleImage.jpg, then have your app ask for:
www.mydomain.com/image3.jpg
www.mydomain.com/naughtystuff.jpg
Neither of those images even have to exist on your server. Apache would just honor the request by sending back singleImage.jpg. But as far as cloudfront or your app is concerned, those are two different files residing at two different unique places on the server.
Hope this clears it up.
http://httpd.apache.org/docs/current/mod/mod_rewrite.html
I think you are using the URL structure in a wrong way. the path which is defined by forward slashes is supposed to bring you to a specific resource, in your example client.html. However, for routing beyond that point (within that resource) you should make use of the # - as is done in many javascript frameworks. This should tell your router what the state of the resource (your html page or app) is. if there are other resources referenced, e.g. images, then you should provide different paths for them which would go through the CDN.
We have a site hosted on a domain that is provided by a reverse proxy from a 3rd party. The proxy provides sends the url www.example.com/nl/test/ to our server where the site is hosted on.
The site itself uses frameworks like Telerik UI for ASP.NET Ajax and the Ajax Control Toolkit.
By default when the site is loaded all the scriptresources/themes files/images are being written with the path "/scriptresource.axd" "/image.gif" the reverse proxy from the 3rd party picks this up and tries to redirect files to wwww.example.com/defaulthomepage/.
So by default ASP.NET writes paths to the domain root which is not something we want in this scenario, it should point to /nl/test/ instead.
In order to prevent default method of writing file paths the URL Rewrite extension for IIS was installed.
There are no a couple of rules in URL Rewriting which succesfully rewrite all the files to /nl/test/file.whatever but every time an Ajax request is made the following error occurs
Error: Sys.WebForms.PageRequestManagerParserErrorException: The message received from the server could not be parsed. Common causes for this error are when the response is modified by calls to Response.Write(), response filters, HttpModules, or server trace is enabled.
Is there a way to rebase the default file path of an ASP.NET site through code? We have already tried the base html tag.
What would cause a site to try an go to an https url?
We have sitecore set up to redirect non www URLs to www pre-pended URLs. Example: joesrx.com resolves to www.joesrx.com through the Sitecore URLResolver.
What we are seeing is that if you type joesrx.com, it tries to go to https://joesrx.com before it hits the Sitecore server. Since there are no certificates on this server and https is not utilized we get a 404.
Is there something in IIS that is misconfigured? Proxy teams says it is not in their setting and network team says all of the DNS entries are correct.
As a general rule for debugging these sorts of problems, try to imagine all the elements between you and the application and then use a simple divide and conquer approach. You can also test behavior on individual levels of the path between you and the actual application.
In this case for example (from you to application code):
User
Browser
browser may do caching of redirects. Try a different browser / try incognito mode / clear cache
Browser Extensions/Settings
any extensions which make it so the browser always tries to access website(s) via https? Try with extension disabled / another browser
Proxies/Firewalls
any Proxies/Firewalls on your end which may modify requests? Can you try to access the site bypassing any proxies/firewalls, maybe from a different network connection?
Network
Web Server
Web Server Configuration / Pipelines / Resolvers / Filters / Etc.
.htaccess / IIS config / nginx config / servlet filters / (lots of options depending on your framework). Check the server
Actual application code
well.. check the code.
Example of divide and conquer, choosing the Network mid-point: Try accessing the URL with wget/curl from command-line, curl -i will also show you the headers received from the server. If you find a "Location: .." header it's clear that the server is sending a redirect. So now you only have to check Web Server / framework configuration and actual application code.
There are a few things I would check first:
Do you have rewrite rules in your web.config? They may be pattern-matching on www. and redirecting in order to enforce SSL
Do you have code in your pipelines that is attempting to enforce SSL for specific paths? The code here may not be checking the URL correctly.
In IIS, did you bind the 'www' host name to your IIS site? Or is it falling through to another site that has SSL enforced?
In case the other answers don't help, check for HTST headers such as "Strict-Transport-Security: max-age=31536000".
This HTTP header tells browsers to use only SSL for future requests (among other things).
For more info check out:
https://www.owasp.org/index.php/HTTP_Strict_Transport_Security
We have a number of existing clients that point to urls like:
http://sub1.site.com/images/image1.jpg
/images is a virutal directory that points to a directory that actually contains image1.jpg on that server.
We're moving all of the files out of this directory and onto a separate server that will not run this same application.
The file will now only be available at:
http://sub2.site.com/image1.jpg
What is the best way to make it so clients requesting
http://sub1.site.com/images/image1.jpg will get the content that now resides at http://sub2.site.com/image1.jpg?
A few requirements:
We need the actual content to be returned through that url - not a 302 response.
We cannot modify the IIS server configuration - only the web.config for the site
Again, we're running asp.net 3.5
Thanks.
Not totally sure this would work, but you could setup URL Routing on the old site so all requests are sent to a handler and within that handler you could do a web request to get the file from it's new location.
I use a variation of the process to map image URLs to different locations and my handler does some database queries to get the mapped relationship and provide the correct image. I don't see why you could do a web request to get the image.
Since you are using IIS7, you can use the built in URL rewrite module.
You would want an inbound and an outbound rule to change \images\image1.jpg to \image1.jpg
It can get pretty involved, but this should be rather simple.
Assuming you can add handlers to your site (as in add a DLL to your /bin directory in the site) and with the restriction that you can't send 302 responses for better performance, then alternatively you could write a custom handler to grab all requests that match that URL pattern, do the web request for the sub2.site image from the original site via web client code, then serve it back out of the original site, sub1.site.com.
See How To Create an ASP.NET HTTP Handler by Using Visual C# .NET for the very basics of creating and setting up a custom handler. Then use the HttpWebRequest to make the request of sub2.site.com, as in the guide A Deeper Look at Performing HTTP Requests in an ASP.NET Page. Plus a little other code to handle errors, timeouts, passing the image through with as little processing and memory usage as possible, etc.
Depending on the response time/lag between the two servers, this may be slow, but it would fit all your requirements. But if the point of moving the images to site 2 was for performance (CPU or memory) or bandwidth limitations, then this solution would nullify any gains — and would actually make things worse. But if they were moved for other business or technical reasons though, then this solution might be helpful still.
If you have other control over the server or anything upstream from the server, you could use mod_proxy (or similar Windows/IIS tool) to intercept those URLs and forward them to another server and respond back with the real request. Depending on your network configuration and available servers, this could be the simplest, best performing solution.
Can IIS be configure to forward request to another web server? on serverfault has a quick process and link for an IIS 7.5 solution.