How to get search engines to understand a DB-driven ASP.NET site

All,
This would seem like a fairly basic ASP.NET question, but in all my years of coding I've never really thought about it.
Say you have an ASP.NET 2.0 site with only a master page and a default.aspx, and it's a blog that saves all the data into the database. Links on the side are generated automatically. So ... the URL is always just http://www.XXXXX.com/default.aspx.
So, with that being the case, what do you need to do so that ... say, Google ... knows about all the different blog entries and links directly to the entries instead of just the base URL?
Is it as simple as changing the form's method to: method="get"?
Thanks, L. Lee Saunders

There are at least two solutions:
Search engines understand query strings, so just add the article IDs to the URLs in your anchor tags -- no need to even use a form control.
Use URL rewriting to expose one set of URLs to the outside world (like /article-title/1234/) in your anchor tags, and then modify the URL to be default.aspx when it arrives at your site; the page could then pull the article to be displayed from any number of places, including but not limited to a query string.
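If you go the rewriting route, a minimal sketch in classic ASP.NET 2.0 might look like the following (the /article-title/1234/ pattern and the "id" query-string key are my own illustration, not something from the question):

    // Global.asax -- a minimal sketch; the URL pattern and the "id"
    // query-string key are hypothetical.
    void Application_BeginRequest(object sender, EventArgs e)
    {
        // Matches friendly URLs like /article-title/1234/
        System.Text.RegularExpressions.Match m =
            System.Text.RegularExpressions.Regex.Match(
                Request.Path, @"^/[^/]+/(\d+)/$");
        if (m.Success)
        {
            // Serve default.aspx while the friendly URL stays in the address bar.
            Context.RewritePath("~/default.aspx", string.Empty,
                "id=" + m.Groups[1].Value);
        }
    }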

You could have a REST web service so that you can just use URLs to navigate the site, and perhaps have a front page with some new posts, so that the spider can navigate the site.
As an example, look at the URLs for SO; it is easy for a spider to navigate this database-driven website.

Create a page that just serves up an XML sitemap (the data obviously being pulled from your database) and submit the sitemap to Google.
Google will then index any links in your sitemap.
(This assumes that there is some difference between each article - e.g. a query string key/value.)
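A minimal sketch of such a page as a generic handler (the "BlogDb" connection string and the Articles/ArticleId names are made up for illustration):

    <%@ WebHandler Language="C#" Class="SitemapHandler" %>
    // Sitemap.ashx -- a sketch; table and connection-string names are hypothetical.
    using System;
    using System.Configuration;
    using System.Data.SqlClient;
    using System.Web;

    public class SitemapHandler : IHttpHandler
    {
        public void ProcessRequest(HttpContext context)
        {
            context.Response.ContentType = "text/xml";
            context.Response.Write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
            context.Response.Write("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">");

            string connStr = ConfigurationManager.ConnectionStrings["BlogDb"].ConnectionString;
            using (SqlConnection conn = new SqlConnection(connStr))
            using (SqlCommand cmd = new SqlCommand("SELECT ArticleId FROM Articles", conn))
            {
                conn.Open();
                using (SqlDataReader reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        // One <url> entry per article, distinguished by query string.
                        context.Response.Write("<url><loc>http://www.XXXXX.com/default.aspx?id="
                            + reader.GetInt32(0) + "</loc></url>");
                    }
                }
            }
            context.Response.Write("</urlset>");
        }

        public bool IsReusable { get { return false; } }
    }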
Useful Link(s):
Web Sitemap Generators
Google Sitemap Validator
Google Sitemaps for ASP.NET 2.0 (there are about a gazillion interesting links off the back of this as well).

Some sort of URL rewriting may be the answer.
I wouldn't recommend a postback for your situation; it can get ugly for refreshes, etc. So, yes, change the method to "get".
Then a page such as default.aspx?postid=12345 can be translated into /mm/dd/yy/this-is-my-post.aspx.
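The other half of that scheme is generating the friendly URLs when you render the sidebar links; a rough sketch (the publishedOn/slug inputs are hypothetical post fields):

    // A minimal sketch of building the friendly URL for a post.
    string BuildPostUrl(DateTime publishedOn, string slug)
    {
        // e.g. 10/12/08 + "this-is-my-post" -> "/10/12/08/this-is-my-post.aspx"
        return string.Format("/{0:MM}/{0:dd}/{0:yy}/{1}.aspx", publishedOn, slug);
    }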

Related

Redirecting to .aspx in a different domain

Okay, I'm dealing with a problem here.
I'm building landing pages for my company.
The main website works with a .aspx form to retrieve car data (for example, from license plates).
Now we've set up some new relevant domain names to use for some of the landing pages.
The problem now is that when I type in a license plate on those pages and click search, it fails, since it tries to find the .aspx form on the landing page's domain.
For example:
Main site: www.mysite.com/category.aspx?k=80zbfk (referred to when the license plate is typed in)
Landing page: www.mysite2.com/category.aspx?k=80zbfk (where it refers to on the landing page)
Now the second one should refer to the first one, but I can't seem to find a way to do so.
I don't have access to the .aspx files since they're under the control of an external company.
Is there any way to fix this - to point the landing page to the .aspx on the main site?
Or do I have to contact the web company to ask for the files so I can copy them to the other domain?
Thanks in advance!
It sounds like you want to be using absolute links rather than relative.
An absolute link includes the host and path of the target url, whereas a relative one contains only the path and lets the browser infer the host (via the url being visited).
For example, if you were editing the HTML of a page here on Stack Overflow, the absolute link to this question would look something like this,
http://stackoverflow.com/questions/26379795/redirecting-to-aspx-in-a-different-domain
While the relative one would look like this,
/questions/26379795/redirecting-to-aspx-in-a-different-domain
In the case of the latter one, the browser would be left to assume based on context that you wanted to go to that path on http://stackoverflow.com/. There's a bit more to it, and variations on that syntax exist. But that's the gist of it.
So, getting back to your question, yes. You will probably have to update the ASPX pages. Relative links are best practice in most cases, which explains why they were used in your code, but you've got an exception. It's probably going to be easiest to just go through and change whatever links you need, to point to the main domain. But for what it's worth, that should be a relatively easy fix, once you get the files.
Alternatively, you could set up a rewrite rule or redirection policy on your landing page servers to automatically 301 redirect any requests that contain search information off to the main server, but that's definitely a workaround approach, and there will be a performance hit in doing so. The one and only advantage that I can imagine to doing that is that you wouldn't need to get the pages from the third party, but it sounds like that wouldn't be a bad idea to do anyway.
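If you do end up trying that workaround and can deploy code on the landing-page site, a sketch of hand-rolling the 301 in Global.asax might look like this (ASP.NET 2.0 has no Response.RedirectPermanent; the host and page names come from the example above, everything else is an assumption):

    // Global.asax on www.mysite2.com -- a minimal sketch.
    void Application_BeginRequest(object sender, EventArgs e)
    {
        if (Request.Path.EndsWith("/category.aspx", StringComparison.OrdinalIgnoreCase)
            && !string.IsNullOrEmpty(Request.QueryString["k"]))
        {
            // Write the 301 by hand and forward the query string unchanged.
            Response.StatusCode = 301;
            Response.AddHeader("Location",
                "http://www.mysite.com/category.aspx?" + Request.QueryString);
            Response.End();
        }
    }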

SEO: Making data and URLs retrieved from a database crawlable and indexable

I have an ASP.NET .aspx page (say, fruits.aspx) which lists all the fruits (apple, banana, mango, etc.) with a thumbnail, title, and a link that leads to each fruit's respective detail page. All of this data is retrieved from an XML file by code-behind, with the help of an XSLT and a user control.
Now, since the data and URLs of each fruit's detail page are not present statically on this page, as far as I know they will not be crawled and indexed.
Is there a workaround to get each fruit's detail page crawled and indexed?
If I had dynamic URLs with something like "?var=value", I could solve it with static/dynamic conversion using URL rewriting. But here the URL itself is not there statically; it is generated from code-behind.
Search engines will not see the .aspx file as it sits on your server; instead, they see the same thing your web browser does: the resulting HTML output.
This means that the parameters you speak of will be seen and indexed properly by search engines.
There is no way around it, then: each page you want indexed must have a unique URL. When you generate the page, just generate a unique URL. Take your query parameters and paste them onto the end of your script name.
For example, say that fruits.aspx is called with ?fruit=banana as a query parameter. Your best option is to generate a page with a unique static URL, for example by making the link to the banana page look like /fruits.aspx/fruit/banana.
Even better would be to rewrite it to remove the .aspx. Then the site looks like all static content, which is even better for indexing. If a URL looks like it is backed by a database, the search engine is less likely to index everything.
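A sketch of how fruits.aspx could read such a URL using ASP.NET's built-in PathInfo support (the DisplayFruitDetail helper is hypothetical):

    // fruits.aspx.cs -- a minimal sketch. A request for
    // /fruits.aspx/fruit/banana arrives with Request.PathInfo == "/fruit/banana".
    using System;
    using System.Web.UI;

    public partial class Fruits : Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            string[] parts = Request.PathInfo.Trim('/').Split('/');
            if (parts.Length == 2 && parts[0] == "fruit")
            {
                DisplayFruitDetail(parts[1]); // e.g. "banana"
            }
        }

        private void DisplayFruitDetail(string fruitName)
        {
            // Hypothetical: look up the fruit and bind the detail view here.
        }
    }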

Flex 3: Project Architecture & SEO

I've got a Flex 3 project. One of the problems I have is that not very much of its content is indexed by Google. Currently, I pull data from a MySQL database, so the Googlebot doesn't see most of the site.
My goal is to increase the amount of content indexed by Google, improve the SEO, and improve SERPs.
I thought that instead of pulling the data from the database, I would change the project's architecture and create separate "pages". So, in my case, I would compile each puzzle separately and upload it to the server in its own directory. This way the info in each puzzle would get indexed.
The downside is that if I add a puzzle, I'd have to add a link to it in all of the puzzles already on the server: I would have to add the link, re-compile each puzzle, and upload it to the server. Is there a way to get around this problem? Also, if I wanted to communicate some data from one puzzle to another in the future, I wouldn't be able to do so.
Any suggestions?
Thank you.
-Laxmidi
The usual way to achieve this goal is to develop a hidden parallel site in HTML.
On the first page you will have your Flash movie and, hidden by JavaScript, a list of links to the other pages. These links will be parsed by the robots. Ideally, the href pages are virtual (look up "URL rewriting"). On each "fake" page, your server-side language prints content or links from your database on the page, AND the Flash. The Flash is provided with a string explaining where it is and what it's supposed to show.
Ex: http://www.mysite.com/category1/content7 - the URL rewriting sends this request to http://www.mysite.com/index.php?uri=category1/content7. The page should display the Flash with the FlashVar "uri=category1/content7". The Flash knows which content it has to display, so when a user comes from Google following this link, he will find the content he was looking for.
All links and content meant for SEO should be in HTML; don't trust robots' ability to read Flash.
Have a look at Adobe's reference on deep linking.
You can generate the website's sitemap.xml with a (daily) cron process, such that the URLs encode the application state you need. Each URL encodes whatever content needs to be retrieved from the DB, with just one index.html page.
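For reference, the generated sitemap.xml just follows the standard sitemaps.org format, one <loc> entry per virtual URL (reusing the example URLs from above):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.mysite.com/category1/content7</loc>
      </url>
      <url>
        <loc>http://www.mysite.com/category1/content8</loc>
      </url>
    </urlset>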
good luck!

How to use ASP.NET Routing in a Quote of the Day Website

Good Afternoon,
A client is interested in creating an ASP.NET 2.0 website whose purpose is to serve up a "quote of the day". He wants the quotes on static content pages, all attached to the same master page. The quote pages must be viewed in a certain sequence, and visitors cannot view any pages other than the starting page when first browsing to the site. That is, everyone must enter the site at page 001.aspx.
Two Questions:
1. The content pages are going to be created by the client using an Excel data source and a merge process by which each quote page is created, e.g. 001.aspx, 002.aspx, etc. This seems clunky to me at best. Would ASP.NET Dynamic Data be a better solution here?
2. I'm new to ASP.NET routing and URL rewriting as a whole. How would I set up a route table to ensure that users always enter the site on the same entry page, and create a route table such that default.aspx resolves to 001.aspx?
Thanks,
Sid
I would suggest using the Excel sheet as a data source and handling the viewing of the quote pages by paging through the result set obtained from that data source.
If your client is concerned about SEO, he must recognize that his requirement to have only one entry page defeats the SEO benefit of having one quote per page.
I don't think the effort to distinguish between a human user and a search bot is worth it.
Anyway, Googlebot is capable of indexing pages with URL parameters, which allows you to be SEO-friendly without generating static content (other bots should be as well).
Possible solution
To allow search bots to index your quotes, you have a query parameter for the date of the quote.
If you want to force human users (hackers don't count ;-)) to enter the site only at the current date, you check the user-agent string and redirect any browser not known to be a search bot to the current date if the referrer is not equal to the previous date.
This solution should give you a reasonable result without too much overhead.
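A sketch of that check in the quote page's code-behind (the page name, the "date" query parameter, and the crude user-agent test are all assumptions):

    // quote.aspx code-behind -- a minimal sketch, not a robust implementation.
    using System;
    using System.Web.UI;

    public partial class Quote : Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            string ua = Request.UserAgent ?? string.Empty;
            if (ua.IndexOf("bot", StringComparison.OrdinalIgnoreCase) >= 0)
                return; // let search bots crawl any date

            string today = DateTime.Today.ToString("yyyy-MM-dd");
            DateTime requested;
            if (!DateTime.TryParse(Request.QueryString["date"], out requested))
            {
                Response.Redirect("quote.aspx?date=" + today);
                return;
            }

            // A human may view a past quote only when arriving in sequence,
            // i.e. the referrer was the previous day's quote.
            string prev = requested.AddDays(-1).ToString("yyyy-MM-dd");
            bool inSequence = Request.UrlReferrer != null
                && Request.UrlReferrer.Query.Contains("date=" + prev);
            if (requested.Date != DateTime.Today && !inSequence)
            {
                Response.Redirect("quote.aspx?date=" + today);
            }
        }
    }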

asp.net url concealment?

In my ASP.NET 2005 app, I would like to conceal the app structure from the user. Currently, the end user can learn intimate details of my web app as they navigate and watch the URL change. I don't want the end user to know about my application structure. I would like the browser URL not to change if possible. Please advise.
thanks
E.A.
URL rewriting is the only one that can provide any kind of real concealment.
Just moving the requests to AJAX or to frames means anyone (well, more advanced users) can still see those requests being fired, just not in the address bar.
The simplest solution is to use frames: a single frame that holds your application and is 100% * 100%. The URL will not change, though the underlying URL can still be seen via "View Frame Info"; however, only advanced users will even figure that out.
In your pages, make sure that they are contained inside the holding frame.
A couple of possibilities.
1) Use AJAX to power everything. This will mean that the user never leaves the home page.
2) Use postbacks to power everything. In this approach, you'd have all those pages be user controls which you programmatically hide or show.
3) URL rewriting (especially if this is ASP.NET 3.0 or later).
My site uses URL parameters to dynamically load .ascx files into a single main .aspx. So if I get 'page_id=123' on the query string, I load the corresponding .ascx. The URL changes, but only the query string - the domain part remains the same.
If you want the URL to remain precisely the same at all times, then frames (per Oded) or AJAX (per Stephen) are probably the only ways to do it.
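For what it's worth, a minimal sketch of that load-an-ascx-by-query-string pattern (the controls/ folder and the contentHolder PlaceHolder, declared in the .aspx markup, are hypothetical):

    // main.aspx code-behind -- a minimal sketch.
    using System;
    using System.Text.RegularExpressions;
    using System.Web.UI;

    public partial class Main : Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            string pageId = Request.QueryString["page_id"] ?? "default";

            // Whitelist the value so the query string can't load arbitrary files.
            if (!Regex.IsMatch(pageId, @"^[0-9]+$"))
            {
                pageId = "default";
            }

            Control ctl = LoadControl("~/controls/" + pageId + ".ascx");
            contentHolder.Controls.Add(ctl);
        }
    }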
Short answer: use URL encryption
A simple and straightforward article: http://devcity.net/PrintArticle.aspx?ArticleID=47
and another article: https://web.archive.org/web/20210610035204/http://aspnet.4guysfromrolla.com/articles/083105-1.aspx
HTH
