How does Google search show results for WordPress posts? - wordpress

As we know, WordPress stores its posts and pages in a database, not as physical pages. How is it possible for Google to show results from posts which do not exist physically?
Also, if we do the same, will it work or not?
Please make this clear to me.

Short-answer:
A web-address is ultimately just a string (a piece of text) given to a web-server, which can interpret it and act on it any way it likes.
It can simply map that string to a file-system path and see if it matches a file on disk and return that file to the website visitor.
But it can also instead use that string to do something completely different - such as looking it up in a database and then returning database content.
A "web-server" is not just (old-school) server software like Apache and IIS that defaults to serving filesystem content - it also includes server-side programs like PHP scripts, Node.js applications, and so on.
Step-by-step explanation:
A website visitor (human's web-browser, search engine spider, a bot, etc) requests GET http://example.wordpress.com/2019/10/12/lorem-ipsum
The TCP packet with the request reaches physical computers owned or operated by WordPress.com.
(This answer will ignore complications like network-load-balancing, application-level routing, HTTP reverse-proxies, and so on.)
The physical computer's operating system routes that network packet to the "outer" web-server software, such as Apache or nginx.
Apache or nginx only looks at the GET and 2019/10/12/lorem-ipsum parts of the request. These are the Method and Path components of the request, respectively.
If the "outer" web-server is configured to map the website's root to some filesystem directory, such as /var/www (a common default root for Apache on Linux) or C:\inetpub\wwwroot (the default root for IIS on Windows), then it will look to see if a file named lorem-ipsum exists in /var/www/2019/10/12 (or at C:\inetpub\wwwroot\2019\10\12\lorem-ipsum on Windows).
But WordPress.com does not do this.
Most modern web-applications built today also do not do this, because exposing raw files directly to the internet generally isn't a good idea (but it's still okay for "static file websites", of course).
Instead, WordPress.com is specifically configured to pass the entire 2019/10/12/lorem-ipsum string into the PHP interpreter (php.exe on Windows) along with the path to the entrypoint script file of WordPress.com's namesake PHP web-application.
This is actually just WordPress' index.php file. It is typically invoked through URL rewriting (for example Apache's mod_rewrite) rather than through PATH_INFO, which is why the URI is example.wordpress.com/2019/10/12/lorem-ipsum and not example.wordpress.com/index.php/2019/10/12/lorem-ipsum.
The PHP interpreter then runs through the script, which looks up 2019/10/12/lorem-ipsum in a database, retrieves the content, builds and renders the page as HTML, and returns the rendered HTML to the website visitor from step 1.
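The steps above can be sketched in a few lines of Python. This is only an illustration, not how WordPress is implemented: the POSTS dictionary is a hypothetical stand-in for the database, and handle_request is a made-up name for the entrypoint script.

```python
import html
from typing import Tuple

# Hypothetical stand-in for the posts table in a database.
POSTS = {
    "2019/10/12/lorem-ipsum": {"title": "Lorem ipsum", "body": "Dolor sit amet."},
}


def handle_request(method: str, path: str) -> Tuple[int, str]:
    """Return (status code, HTML) for a request without touching any file on disk."""
    if method != "GET":
        return 405, "<h1>Method Not Allowed</h1>"
    # The path is just a string used as a lookup key.
    post = POSTS.get(path.strip("/"))
    if post is None:
        return 404, "<h1>Not Found</h1>"
    # Build the page on the fly from the stored content.
    page = "<article><h1>{}</h1><p>{}</p></article>".format(
        html.escape(post["title"]), html.escape(post["body"])
    )
    return 200, page
```

A search engine spider requesting /2019/10/12/lorem-ipsum receives ordinary HTML from this function and has no way to tell that no file with that name exists.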
Better answer:
Think more abstractly and challenge your assumptions:
URLs do not point to web-pages. They actually point to "resources".
A "resource" is not necessarily a web-page. A resource is a representation of some "thing".
A web-page, too, can itself be a representation of some "thing".
A web-page is not necessarily a static HTML file on-disk.
A web-page can be generated dynamically on-the-fly by server-side software like PHP, Node.js, ASP.NET, Java/JSP/Servlets, CGI (archaic), and so on.
In WordPress' case:
The URL http://example.wordpress.com/2019/10/12/lorem-ipsum points to a WordPress article.
But a WordPress article can be represented in different ways - such as an HTML web-page (as in this example), but it could also be represented as a JSON blob or XML blob (for consumption by other computer programs).
That WordPress article could also be represented as a part of another resource, such as a link when you request all articles published in October 2019 (by getting http://example.wordpress.com/2019/10).
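A minimal sketch of "one resource, several representations" might look like the following. It assumes the client states its preference in the HTTP Accept header, and it uses a naive substring check rather than full content negotiation; ARTICLE and represent are hypothetical names chosen for the example.

```python
import json
from typing import Tuple

# Hypothetical article; in a real system this would come from the database.
ARTICLE = {"slug": "lorem-ipsum", "title": "Lorem ipsum", "body": "Dolor sit amet."}


def represent(article: dict, accept: str) -> Tuple[str, str]:
    """Return (content type, body) for the same article in the requested form."""
    if "application/json" in accept:
        # Representation for other computer programs.
        return "application/json", json.dumps(article, sort_keys=True)
    # Default representation: an HTML page for people.
    return "text/html", "<h1>{}</h1><p>{}</p>".format(article["title"], article["body"])
```

The URL and the underlying article stay the same in both cases; only the bytes sent back differ.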

Related

What is the default file to be visited behind a website?

When I open a website's url such as www.stackoverflow.com via curl, which file is actually being visited in the server? I know usually it is index.html. But I cannot find such a convention in the RFC2616 document. How can I know it?
The document delivered when a website is requested without a path in the URL is configured on the webserver, so there is no standard. It is up to whoever configures the server.
Curl will download whatever the webserver delivers, or follow the redirect (if the -L option is given) when the webserver responds with a redirect.
There is no way for the client to know how the data for the HTTP response was generated. It might not even be related to a specific file.
The last time I wrote a significant bit of server side code, everything outside of /static/ was routed (via mod_rewrite) through a FastCGI program that got its data from a few different controller libraries, a dozen database schema libraries, a database and a dozen template files.
The WWW is built on links between URLs, not files. Don't worry about files if you are writing client code.
It's not necessarily index.html, and you can't actually know: it could be anything, depending on the server configuration. For instance, in Apache you can change the directory index to whichever file suits you:
DirectoryIndex home.php
In this case the default file accessed is home.php.
In IIS, see the Default Document settings for the default index and how to change it.
The usual defaults are:
in Apache
index.php (usually: depending on the server configuration)
index.html (is the default that comes with a fresh install)
in IIS
Default.htm
Default.asp
Index.htm
Index.html
Iisstart.htm
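How a server picks the default document can be sketched as "first configured name that exists wins". This is a simplified illustration, not Apache's or IIS's actual code; pick_default_document is a hypothetical helper, and the comparison here is case-sensitive, which real servers on Windows are not.

```python
from typing import Iterable, Optional, Sequence


def pick_default_document(
    files_in_dir: Iterable[str], index_names: Sequence[str]
) -> Optional[str]:
    """Return the first configured index name present in the directory, else None."""
    present = set(files_in_dir)
    for name in index_names:  # order of the configured list decides priority
        if name in present:
            return name
    return None  # the server would then show a listing or an error instead
```

With index_names set to ["home.php"] (as with DirectoryIndex home.php above), a directory that contains both index.html and home.php yields home.php.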

Add a subdirectory for locale-based URLs to an existing ASP.NET website, resolving relative paths correctly

An existing ASP.NET (MVC and webforms hybrid) website displays translated content. The language is based on a cookie that stores the user's preference. There is no change in the URL when the user changes the setting. The content is reloaded in the preferred language. For SEO, the locale should be included in the URL (support.google.com/webmasters/answer/182192?hl=en).
I've tried the following:
1) Use URL Rewrite Module: (http://www.iis.net/learn/extensions/url-rewrite-module/setting-http-request-headers-and-iis-server-variables)
Issues:
- All hyperlinks and redirects still point to the old URL without the locale.
- Complex outbound rules required based on the folder structure and usage (mixture of absolute paths and relative paths e.g. ../, ~/, /).
- Also need to disable static compression as per documentation
- Performance considerations due to the large size of the HTML.
- Postback results in runtime exceptions due to issue in the relative path rewrite.
- Paths defined in script files (ajax loading etc) are a huge challenge
- Base tag does not work as expected, because the Rewrite Module seems to append ../ (http://www.iis.net/learn/extensions/url-rewrite-module/url-rewriting-for-aspnet-web-forms#Using_tilda)
2) IIS 7.5 Virtual Directory: Create Virtual Directory for each language and point it to the root. i.e. www.example.com is the root and www.example.com/fr-ca/ is a virtual directory mapped back to the root
Issues:
- Runtime exception in config file saying that the virtual directory needs to be converted to application
- Converting it to application gives 500.19 error due to duplicate entries in the web config (since the virtual directory is pointing back to the root)
- I tried moving the root to another subdirectory (i.e. having a physical directory for each language) to avoid web config conflicts, but that resulted in some sort of "kernel" error. Also, this would mean changing the physical structure of the application and addressing routing issues
3) Using sub-domains:
I have also considered using sub-domains and hosting the application independently for each language, but this has a lot of drawbacks, including having to address scalability, single sign on, cookies, domain specific stuff like analytics etc.
So what is the least painful way to include a language sub-directory in the URL, and make all links relative to that sub-directory?
Note: The site contains a mixture of absolute paths and relative paths e.g. (../, ~/, /) sometimes used in conjunction with ResolveClientUrl, ResolveUrl
In the end, we went with option 2, with the below steps:
Create a new folder, deploy a copy of the application to the new folder. The new folder should be in a different directory from the root application.
Create a new virtual application (not a virtual directory) under the root application, one for each new language, pointing to the new folder. (If the need arises in the future, any of the virtual applications can point to a different folder customized for that specific language)
In the new folder, remove the modules and handlers sections in the system.webServer section of the web.config file (they will be inherited from the parent web.config)
If you are using SQL session state, you will need to specify a custom Application Name in the web.config, and modify TempGetAppID stored procedure so that the Application Name is the same across all the virtual applications. See the following (http://blogs.msdn.com/b/toddca/archive/2007/01/25/sharing-asp-net-session-state-across-applications.aspx)
Hopefully, all the links are resolved on the server side using Url.Content (MVC) or ResolveUrl (webforms). If not, they need to be fixed. Any paths specified in javascript would not automatically resolve to the virtual application either (they would still be resolved to root application)
Test the heck out of it. Each and every link. (A tool like ScreamingFrog may help to make sure that no 404s are returned, methinks. But it wouldn't solve HTTP POST)
Note that depending on custom error handling, and any existing URL rewrite rules, the steps may be different.
Summary: option 1 (URL Rewrite) is totally impractical. Option 2 (sub-directory) is the most practical solution, however it is not quite as straightforward as it should've been.
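To illustrate why server-side resolution matters in the steps above, here is a rough sketch of what expanding an application-relative path does. It is not the ASP.NET implementation of Url.Content or ResolveUrl; resolve_app_url is a hypothetical helper and /fr-ca is just the example virtual application from the question.

```python
def resolve_app_url(app_root: str, url: str) -> str:
    """Expand a '~/'-prefixed URL against the virtual application's root path."""
    if url.startswith("~/"):
        return app_root.rstrip("/") + "/" + url[2:]
    # Root-relative ('/x') and relative ('../x') URLs are passed through
    # unchanged, so they keep pointing wherever they pointed before.
    return url
```

For example, resolve_app_url("/fr-ca", "~/Content/site.css") gives "/fr-ca/Content/site.css", while a hard-coded "/Content/site.css" (for instance in a script file) is returned as-is and therefore still resolves to the root application.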

Force file download in a browser using ASP.Net MVC when the file is located on a different server without downloading it on my server first

Here's what I would like to accomplish:
I have a file stored in Windows Azure Blob Storage (or for that matter any file which is not on my web server but accessible via a URL).
I want to force download a file without actually downloading the file on my web server first, i.e. the browser should automatically fetch the file from this external URL and prompt the user to download it.
Possible Solutions Explored:
Here's what I have explored so far (and why they won't work):
Using something like FileContentResult, as described in "Returning a file to View/Download in ASP.NET MVC", to download the file. This solution would require me to fetch the contents on my server and then stream from my server to the browser. For this reason this solution won't work.
Using HTML 5 download attribute: HTML 5 download attribute would have worked perfectly fine however the problem is that while it is really a very neat solution, it is not supported in all browsers.
Changing the file's content type: Another thing I could do (at least for the files that I own) is to change the content type property of the file to something that the browser wouldn't understand, so that it would be forced to download the file. This might work in some browsers, but not in all, as IE is smart enough to go beyond the content type and inspect the file's content to determine the content type. Furthermore, if I don't own the files, then I won't have access to change the content type of the file.
Simply put, in my controller action I should be able to specify the URL of the file and somehow browser should force download the file.
Is this something which can be accomplished? If yes, then any ideas how I could accomplish this?
Simply put, in my controller action I should be able to specify the URL of the file and somehow browser should force download the file [without exposing the URL of the file to the client].
You can't. If the final URL is to remain hidden, your server must serve the data, so your server must download the file from the URL.
Your client can't download a file it can't get the URL to.
You can create a file-transfer WCF service (REST) that streams your content from blob storage, or from other sources through your file managers, to the client browser directly by URL.
https://{service}/FileTransfer/DownloadFile/{id, synonym, filename etc}
The blob path won't be exposed, and the web application will be free from file transfer issues.
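The idea of relaying a file through your own service can be sketched as follows. This is a language-neutral illustration written in Python rather than WCF; download_headers and relay are hypothetical names, and the upstream argument stands in for whatever stream the blob storage client gives you. The filename is assumed not to contain quote characters.

```python
from typing import BinaryIO, Dict, Iterator


def download_headers(filename: str) -> Dict[str, str]:
    """Response headers that ask the browser to save the body instead of showing it."""
    return {
        "Content-Type": "application/octet-stream",
        "Content-Disposition": 'attachment; filename="{}"'.format(filename),
    }


def relay(upstream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the upstream content chunk by chunk so it is never held in memory whole."""
    while True:
        chunk = upstream.read(chunk_size)
        if not chunk:
            break
        yield chunk
```

The bytes still pass through the server, as the first answer points out, but they are forwarded as they arrive and the original URL is never sent to the client.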

Is there a way to download a PHP/ASP/whatever source code without processing it, as plain text?

Suppose the URL http://example.com/test.php. If I type this URL in the browser address bar, the PHP code is executed, and its output is returned to me. Fine. But what if, instead of executing it, I wanted to view its source as plain text? Is there a way to issue such a request?
I believe that there must be some way, and my concern is that some outsider could retrieve sensitive code, such as configuration files, by guessing its location. For example, Joomla installations have a configuration.php in their root folder. If someone retrieves such a file as plain text, then the database credentials have been seriously compromised. Obviously, this could be prevented with proper permissions, but it's just too common to set 0777 permissions on everything and forget about access denials.
For PHP: if properly configured, there is no way to download it. File permissions won't help either way, as the webserver needs to be able to read the files, and that's the one serving contents. However, a webserver can for instance be configured to serve them with x-httpd-php-source, or the PHP/webserver configuration may be broken. This is why files which don't need direct access (db config, class definitions, etc.) should be outside the document root, so there is no way those files will get served by accident even when the webserver config is incorrect / failing. If your current hoster does not allow you to store files outside the document root, switch hosting a.s.a.p.
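Why the document root matters can be shown with a small sketch of how a file-serving server might map URL paths to files. It is a simplified model, not any particular webserver's code; map_to_file is a hypothetical helper and the /srv/site paths are made up for the example.

```python
import posixpath
from typing import Optional


def map_to_file(doc_root: str, url_path: str) -> Optional[str]:
    """Map a URL path to a file path under doc_root, or None if it would escape it."""
    root = posixpath.normpath(doc_root)
    candidate = posixpath.normpath(posixpath.join(root, url_path.lstrip("/")))
    if candidate == root or candidate.startswith(root + "/"):
        return candidate
    return None  # anything outside the document root is simply not reachable
```

With doc_root set to /srv/site/public, a config file stored at /srv/site/config/db.php has no URL that maps to it, so it cannot be served as plain text even if PHP handling breaks.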
There is a way to issue such request that downloads the source code of http://example.com/test.php if the server is configured to provide a URL to do so. Usually it isn't, so usually there is no way to issue such a request.

Custom VirtualPathProvider unable to serve URLs ending with a directory

As part of a CMS, I have created a custom VirtualPathProvider which is designed to serve a single file in place of an actual file structure. I have it set up such that if a file actually exists on the server, that file will be served. If the file does not exist, the virtual content stored for that address will be served instead. This is similar to the concept of serving a website from files stored in a database, though in this case the content is stored in XML files on the server.
This setup works perfectly when a request is made to a specific page. For example, if I ask for "www.mysite.com/foobar.aspx", the content that is stored for "foobar.aspx" will be served. Further, if I ask for "www.mysite.com/subdir/foobar.aspx", the appropriate content will also be served.
The problem is this: If I ask for something like "www.mysite.com/foobar", things begin to fall apart. If the directory exists on disk (and doesn't have a configured default page in IIS, such as index.aspx), I will get a "Directory Listing Denied" error. If the directory does not exist, I'll simply get a 404 - Resource Not Found.
I've tried several things, and so far nothing I've done has made a bit of difference. It seems as though IIS is simply noting the nonexistence of a directory (or default file in an existing directory) and serving up its own error code, without ever asking my application what to do with the request. If it ever did get to the application, I would be able to solve the problem, but as it stands, I'm quite lost. Does anyone know if there is some setting in IIS that is causing this?
I've looked for every resource I can find on the subject, and am coming up empty. I know this should be possible, because I have read tutorials on serving content from both databases and ZIP files. HELP!
p.s., I am running IIS6 and .NET 3.5
IIS will only pass a request to the ASP.NET process if it is configured to do so for the particular extension. The default is aspx, ascx, etc. In other words, if you request a .html file, ASP.NET will never see that HTTP request. Likewise for empty extension.
To change this behavior, add a wildcard mapping to the ASP.NET process. Load IIS Manager, go to the Properties for your web site and look at the Home Directory tab. Click on "Configuration" and there you will see the extension-to-application mappings.
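The effect of extension mappings and a wildcard mapping can be modelled roughly like this. It is only a conceptual sketch of the dispatch decision, not how IIS is implemented; choose_handler and the handler names "aspnet" and "static-file" are hypothetical.

```python
import posixpath
from typing import Dict


def choose_handler(url_path: str, mappings: Dict[str, str]) -> str:
    """Pick a handler by file extension, falling back to a wildcard entry if present."""
    ext = posixpath.splitext(url_path)[1].lower()
    if ext in mappings:
        return mappings[ext]
    # Without a "*" entry the request never reaches the application.
    return mappings.get("*", "static-file")
```

With only {".aspx": "aspnet"} configured, a request for /foobar falls through to the static file handling and the VirtualPathProvider is never consulted; adding a "*" entry that points to "aspnet" sends it to the application.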

Resources