nginx - Completely case-insensitive URL matching and file lookup - nginx

I want all URLs on my server to be case-insensitive, in both directions. By that I mean: if the user requests index.html but the file is called Index.html, they should still get it. If they request Index.html but it's called index.html, they should still get it.
My server runs on Linux, whose file system is case-sensitive by default. Can nginx work around this?

Related

How do Google searches show results for WordPress posts?

As we know, WordPress stores its posts and pages in a database, not as physical pages. How, then, is it possible for Google to show results from posts which do not physically exist?
Also, if we do the same, will it work or not?
Please clarify this for me.
Short-answer:
A web-address is ultimately just a string (a piece of text) given to a web-server which it can interpret and act-on any way it likes.
It can simply map that string to a file-system path and see if it matches a file on disk and return that file to the website visitor.
But it can also instead use that string to do something completely different - such as looking it up in a database and then returning database content.
A web-server is not just an (old-school) server like Apache or IIS that defaults to serving filesystem content - the term also includes server-side programs like PHP scripts, Node.js applications, and so on.
Step-by-step explanation:
A website visitor (human's web-browser, search engine spider, a bot, etc) requests GET http://example.wordpress.com/2019/10/12/lorem-ipsum
The TCP packet with the request reaches physical computers owned or operated by WordPress.com.
(This answer will ignore complications like network-load-balancing, application-level routing, HTTP reverse-proxies, and so on.)
The physical computer's operating system routes that network packet to the "outer" webserver software, such as Apache or nginx.
Apache or nginx only looks at the GET and 2019/10/12/lorem-ipsum parts of the request; these are the Method and Path components of the request respectively.
If the "outer" web-server is configured to map the website's root to some filesystem directory, by default /var/www (the default root for Apache on Linux) or C:\inetpub\wwwroot (the default root for IIS on Windows), then it will look to see whether a file named lorem-ipsum exists in /var/www/2019/10/12 (or at C:\inetpub\wwwroot\2019\10\12\lorem-ipsum on Windows).
But WordPress.com does not do this.
Most modern web-applications also do not do this, because exposing raw files directly to the internet generally isn't a good idea (but it's still okay for "static file websites", of course).
Instead, WordPress.com is specifically configured to pass the entire 2019/10/12/lorem-ipsum string into php.exe along with the path to the entrypoint script file of WordPress.com's namesake PHP web-application.
This is actually just WordPress' index.php file - however, remember that it is invoked using a variety of special techniques (using PATH_INFO), which is why you don't see index.php inside URIs, as you would in example.wordpress.com/index.php/2019/10/12/lorem-ipsum.
And then php.exe runs through the script, which looks up 2019/10/12/lorem-ipsum in a database, retrieves the content, builds and renders the page as HTML, and returns the rendered HTML to the website visitor from step 1.
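The two approaches in the steps above (mapping the path onto the filesystem versus handing the path to a script that queries a database) can be sketched roughly as follows. This is only an illustration in Python, not WordPress' actual code: STATIC_FILES, POSTS, serve_static and front_controller are hypothetical names, and the dictionaries stand in for a real disk and a real database.

```python
from pathlib import PurePosixPath

# Hypothetical stand-ins for files on disk and for rows in a database.
STATIC_FILES = {"/var/www/static/site.css": "body { margin: 0 }"}
POSTS = {"2019/10/12/lorem-ipsum": {"title": "Lorem ipsum", "body": "Dolor sit amet."}}


def serve_static(root: str, url_path: str):
    """File-mapping approach: join the document root and the URL path."""
    fs_path = str(PurePosixPath(root) / url_path.lstrip("/"))
    return STATIC_FILES.get(fs_path)  # None means "no such file"


def front_controller(url_path: str):
    """Database approach: use the path as a lookup key, then render HTML."""
    post = POSTS.get(url_path.strip("/"))
    if post is None:
        return 404, "<h1>Not found</h1>"
    html = f"<h1>{post['title']}</h1><p>{post['body']}</p>"
    return 200, html
```

Here serve_static("/var/www", "/2019/10/12/lorem-ipsum") finds nothing, because no such file exists, while front_controller("/2019/10/12/lorem-ipsum") still returns a rendered page.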
Better answer:
Think more abstractly and challenge your assumptions:
URLs do not point to web-pages. They actually point to "resources".
A "resource" is not necessarily a web-page. A resource is a representation of some "thing".
A web-page, too, can itself be a representation of some "thing".
A web-page is not necessarily a static HTML file on-disk.
A web-page can be generated dynamically on-the-fly by server-side software like PHP, NodeJS, ASP.NET, Java/JSP/Servlets, CGI (archaic), and so on.
In WordPress' case:
The URL http://example.wordpress.com/2019/10/12/lorem-ipsum points to a WordPress article.
But a WordPress article can be represented in different ways - such as an HTML web-page (as in this example), but it could also be represented as a JSON blob or XML blob (for consumption by other computer programs).
That WordPress article could also be represented as a part of another resource, such as a link when you request all articles published in October 2019 (by getting http://example.wordpress.com/2019/10).
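To make the "one resource, several representations" idea concrete, here is a small Python sketch. It is an assumption-laden illustration rather than anything WordPress does internally: the ARTICLE record, its field names, and the represent and archive_links functions are all hypothetical.

```python
import json

# Hypothetical article record; the field names are illustrative only.
ARTICLE = {"slug": "2019/10/12/lorem-ipsum", "title": "Lorem ipsum", "body": "Dolor sit amet."}


def represent(article: dict, media_type: str) -> str:
    """Return one representation of the same article, chosen by media type."""
    if media_type == "application/json":
        return json.dumps(article, sort_keys=True)
    if media_type == "text/html":
        return f"<article><h1>{article['title']}</h1><p>{article['body']}</p></article>"
    raise ValueError(f"unsupported media type: {media_type}")


def archive_links(articles: list, month_prefix: str) -> list:
    """Represent articles as links inside another resource (a monthly archive)."""
    return [f"/{a['slug']}" for a in articles if a["slug"].startswith(month_prefix)]
```

The same ARTICLE comes back as HTML, as JSON, or as a single link in the archive for 2019/10, depending on what is asked for.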

nginx redirect map folder

I can't seem to get something to work to redirect anything from a folder or its subfolders to another folder in my nginx redirect map.
I want to redirect any request to anything within the folder /fauxnews to /topics/faux-news. (Not to redirect to another file with the same name in the destination folder, but just to "/topics/faux-news", which will list all posts in that topic.)
I've found things that seemingly should work, like:
/fauxnews(.*) /topics/faux-news/;
/fauxnews.* /topics/faux-news/;
... but they aren't working. What should I use there?
Okay, found out you have to tell nginx you're using regex for a particular line, by inserting the tilde at the beginning. (The '^' is an anchor telling it to start the matching there):
~^/fauxnews(.*) /topics/faux-news/;
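Why the tilde matters can be shown with a deliberately simplified model of a map lookup, written in Python. This is a sketch and not nginx's real implementation: map_lookup is a hypothetical function, Python's re module stands in for the PCRE engine nginx uses, and details such as case handling for plain strings and the precedence between strings and regexes are ignored.

```python
import re


def map_lookup(uri: str, entries: list):
    """Return the first target whose source matches, or None.

    Sources starting with "~" are treated as regular expressions;
    everything else is compared as a literal string.
    """
    for source, target in entries:
        if source.startswith("~"):
            if re.search(source[1:], uri):
                return target
        elif source == uri:
            return target
    return None


# Without the tilde, "(.*)" is just literal text and never matches a real URI.
literal = [("/fauxnews(.*)", "/topics/faux-news/")]
# With the tilde, the source is a regex anchored at the start of the URI.
regex = [("~^/fauxnews(.*)", "/topics/faux-news/")]
```

With these entries, map_lookup("/fauxnews/2020/story", literal) finds nothing, while the regex version returns /topics/faux-news/. Note that ^/fauxnews(.*) would also match a URI such as /fauxnewsletter; a pattern like ^/fauxnews(/|$) would be stricter if that matters.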

NGINX Rewrite Rule without access to the configuration file

In Apache, rewrite rules can be written in the configuration file or in an .htaccess file. How about in nginx? Can I use URL rewriting without access to the configuration file?
Unfortunately, you can't. This is one of the reasons shared hosting providers typically use Apache or LiteSpeed, not nginx or lighttpd.
A (very ugly) workaround would be to handle all requests with a script which contains the rewrite rules and serves the file/script according to the request URI (and which could be modified by a user without root privileges). However, you'd get poor performance serving static files, and the script would need to handle all the request headers itself, which is not very practical.

What is the default file to be visited behind a website?

When I open a website's url such as www.stackoverflow.com via curl, which file is actually being visited in the server? I know usually it is index.html. But I cannot find such a convention in the RFC2616 document. How can I know it?
The document delivered when a website is requested without a path in the URL is configured on the web server, so there is no standard for it. It is the server owner's choice.
Curl will download whatever the web server delivers to it, or follow the redirect (if the -L option is given) when the web server responds with a redirect.
There is no way for the client to know how the data for the HTTP response was generated. It might not even be related to a specific file.
The last time I wrote a significant bit of server side code, everything outside of /static/ was routed (via mod_rewrite) through a FastCGI program that got its data from a few different controller libraries, a dozen database schema libraries, a database and a dozen template files.
The WWW is built on links between URLs, not files. Don't worry about files if you are writing client code.
It's not necessarily index.html, and you can't actually know: it could be anything, depending on the server configuration. For instance, in Apache you can change the directory index to whichever file suits you:
DirectoryIndex home.php
In this case, the default file accessed is home.php.
In IIS, you can look up the default document setting and how to change it.
The defaults, however, are:
In Apache:
index.php (usually: depending on the server configuration)
index.html (is the default that comes with a fresh install)
In IIS:
Default.htm
Default.asp
Index.htm
Index.html
Iisstart.htm
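The way a server picks the default file can be pictured as "try each configured name in order and serve the first one that exists". The Python sketch below illustrates that idea only; resolve_index and demo are hypothetical names, and real servers add more rules (modules, handlers, directory listings) than this shows.

```python
import tempfile
from pathlib import Path


def resolve_index(directory: Path, index_names: list):
    """Return the first configured index file that exists in the directory."""
    for name in index_names:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None  # a real server might list the directory or return an error


def demo() -> tuple:
    """Create a throwaway directory and resolve it under two configurations."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "index.html").write_text("<p>static</p>")
        (root / "home.php").write_text("<?php echo 'home'; ?>")
        default = resolve_index(root, ["index.php", "index.html"])
        custom = resolve_index(root, ["home.php"])
        return default.name, custom.name
```

The same directory yields index.html under one configuration and home.php under the other, which is why a client cannot tell from the URL alone which file, if any, was used.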

Referencing relative (parent) path with subdomain

I have a website with several subdomains that direct the user into a subfolder on my site. Inside each subfolder is a Default.aspx file which does some processing and then redirects the user to "../Default.aspx".
This works fine if you type the full URL to that page. If you try to access it through the subdomain, the ".." parent is not being parsed correctly, and just concatenates the subfolder path into the main path and I get a page not found.
The root path of my application is www.domain.com/root.
The subdomain points to subdomain.domain.com/root/subfolder
When I navigate to subdomain.domain.com, I get this error:
"404 - /root/subfolder/root/Default.aspx not found"
All I want is for subdomain.domain.com to redirect the user up one folder level to www.domain.com/root/Default.aspx
Can anyone help? Is this a feature/restriction of using a shared hosting provider - the subdomains are restricted to the folder where they are pointed?
Your description is a bit confusing, since you mix local paths and URLs together. Am I right that you are trying to do: Page at subdomain.domain.com/root/subfolder/Default.aspx redirects to www.domain.com/root/Default.aspx?
That means you want to do 2 things:
Redirect from sub-domain subdomain to sub-domain www, and
Navigate to a file one folder up.
Both things you can do in a single HTTP redirect. For this, use the Response.Redirect method, and make sure that in the URL you use the www sub-domain, and the correct absolute path to the page you want to show.
Response.Redirect("http://www.domain.com/root/Default.aspx");
Update
Or, redirect to a URL relative to the current page, in the same domain.
Response.Redirect( Page.ResolveClientUrl( "../Default.aspx" ) );
Update 2
Or, use the Host HTTP header to distinguish between subdomains and switch programmatically in your shared codebase.
The answer is to point all the subdomains to the same folder (the main code base) and then in the Master file, switch based on the http header. If they are coming in from partner1.domain.com, use css1 - if they are from partner2.domain.com, use css2, etc.
This allows me to use relative paths throughout the code AND preserve the subdomain in the browser's URL bar.
One caveat - if you are testing in multiple environments (I have a DEV and a TST) you need some code to detect which environment you are in and operate a little differently, since the http header host would show something like "localhost:51510". For me, those subdomains only exist in my Production environment.
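The switching logic described here might look roughly like the sketch below. It is written in Python purely for illustration (the original site is ASP.NET, and this is not the author's code); the STYLESHEETS mapping, the file names and stylesheet_for are hypothetical. Falling back to a default is one way to cope with development hosts such as localhost:51510.

```python
# Hypothetical mapping from subdomain host to stylesheet; the host names
# follow the partner1/partner2 example above, the file names are made up.
STYLESHEETS = {
    "partner1.domain.com": "css1.css",
    "partner2.domain.com": "css2.css",
}
DEFAULT_STYLESHEET = "default.css"


def stylesheet_for(host_header: str) -> str:
    """Pick a stylesheet from the Host header, ignoring port and case."""
    host = host_header.split(":", 1)[0].strip().lower()
    return STYLESHEETS.get(host, DEFAULT_STYLESHEET)
```

Because the choice depends only on the Host header, every subdomain can point at the same folder and the relative paths inside the code stay unchanged.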
Credit to bgever - thanks!

Resources