I have an ASP.NET .aspx page (say, fruits.aspx) which lists all the fruits (apple, banana, mango, etc.) with a thumbnail, a title, and a link leading to each fruit's detail page. All of this data is retrieved from an XML file in the code-behind, with the help of an XSLT and a user control.
Now, since the data and the URLs of each fruit's detail page are not present statically on this page, my understanding is that they will not be crawled and indexed.
Is there a workaround I can use to get each fruit's detail page crawled and indexed?
If I had dynamic URLs with just something like "?var=value", I could solve this with static/dynamic conversion using URL rewriting. But here the URL itself is not in the page; it is generated from the code-behind.
Search engines will not see the .aspx file as it sits on your server; instead, they see the same thing your web browser does: the resulting HTML output.
This means that the parameters you speak of will be seen and indexed properly by search engines.
There is no way to do it in that case. Each page you want indexed must have a unique URL. When you generate the page, just generate a unique URL: take your query parameters and append them to the end of your script name.
For example, say that fruits.aspx is called with ?fruit=banana as a query parameter. Your best option is to generate a page with a unique static URL; for example, make the link to the banana page look like /fruits.aspx/fruit/banana.
Even better would be to rewrite it to remove the .aspx. Then the site looks like all static content, which is even better for indexing. If a URL looks like it is backed by a database, the search engine is less likely to index everything.
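As a rough sketch of generating such links in the code-behind (the helper name is hypothetical; a URL-rewrite rule would map the path back to fruits.aspx):

    using System.Web;

    // Builds a static-looking link such as /fruits/banana for a fruit
    // read from the XML. BuildFruitLink is a hypothetical helper name.
    private static string BuildFruitLink(string fruitName)
    {
        // A rewrite rule maps /fruits/{name} back to fruits.aspx?fruit={name}.
        return "/fruits/" + HttpUtility.UrlEncode(fruitName.ToLowerInvariant());
    }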
I have an ajaxified website, and I want all my content to be crawlable. I have a photo gallery, which only loads the photo using ajax, without refreshing the whole page. My root URL is this:
http://mysite/photos
and whenever a photo thumbnail is clicked, it displays the photo and the hash becomes #!/photo/photoid/phototitle; or, when you are searching by criteria, it becomes #!/photos/f-number/1.8/iso/640, e.g. for searching for photos shot at f/1.8 and ISO 640 (more criteria can be appended this way). When a user opens a URL like http://mysite/photos/#!/photos/f-number/1.8/iso/640, the landing page, using JavaScript, will redirect the user to http://mysite/photos/f-number/1.8/iso/640 (without the hashbang), and there the page loads http://mysite/Dynamic/PhotoThumbnails.aspx?f-number=1.8&iso=640 using AJAX (yes, the JavaScript looks at the location path and parses it according to that format). For the first case (the link to a photo itself rather than a search), again using only JavaScript, the page loads the photo itself (along with some extra tables showing technical info about the photo) from the URL http://mysite/Dynamic/RenderPhoto.aspx?ID=123 (where 123 is the ID of the photo).
Given this information, my problem is simple: I am planning (in my master page's Load event) to redirect all requests containing _escaped_fragment_ to the appropriate RenderPhoto or PhotoThumbnails page, by parsing the _escaped_fragment_ on the server side. Will that work? My main concerns are:
Will Google follow the HTTP redirect? (301 or 302)
Will I get into any trouble (such as being removed from the index) because I am not showing exactly the same content to Google? (A browser will load a side navigation bar, all those fancy CSS styles, the visually nice-looking page, etc., and then load the real content into a pane on that page, whereas Google will be getting only the "true" content. My base page, sidebar content, thumbnail list page, and photo renderer are COMPLETELY different pages which implement their OWN logic, so I can never merge them.)
If there is a risk of being removed for the reasons above, what are my alternatives (no, I cannot merge the pages, it is NOT an option)? Do you recommend taking regular snapshots of the pages, caching them, and serving those to Googlebot?
Here is the current BETA of my website (yeah I know about lots of bugs), just to give you the idea how it will work: http://canpoyrazoglu.com/photos
I'm on ASP.NET 4.0, and using jQuery, if it helps.
A new answer to an old question: yes, it will follow the redirect. However, you may end up with both the clean and the #! URLs indexed. Also, check this out (from the Google Developer Guides):
"Note that if you use a permanent (301) redirect, the url shown in our search results will typically be the target of the redirect, whereas if a temporary (302) redirect is used, we'll typically show the #! url in search results."
This is the Google Developer Guide link:
https://developers.google.com/webmasters/ajax-crawling/docs/faq#redirects
Yes, I'm pretty sure it will follow a redirect. The Facebook open graph debugger does, and this blog post advocates implementing redirects: http://www.yearofmoo.com/2012/11/angularjs-and-seo.html
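For reference, a minimal sketch of the redirect described in the question, done in the master page's Load event (the fragment format and the page mapping are assumptions based on the URLs above):

    protected void Page_Load(object sender, EventArgs e)
    {
        // Googlebot requests #! URLs in the form ?_escaped_fragment_=...
        string fragment = Request.QueryString["_escaped_fragment_"];
        if (string.IsNullOrEmpty(fragment))
            return;

        // e.g. "/photo/123/some-title" -> RenderPhoto.aspx?ID=123 (assumed format).
        string[] parts = fragment.Trim('/').Split('/');
        if (parts.Length >= 2 && parts[0] == "photo")
        {
            // Permanent (301) redirect, so search results show the target URL.
            Response.RedirectPermanent("~/Dynamic/RenderPhoto.aspx?ID=" +
                                       Server.UrlEncode(parts[1]));
        }
    }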
I have just started adding the new .NET 4.0 URL routing into my project, and I have a question.
Let's say I have a Article.aspx that displays, well, articles. I made a route for it in the Global.asax:
routes.MapPageRoute("article-browse", "article/{id}", "~/Article.aspx");
So the link contains the article's ID, which is obviously not a very nice, nor SEO-friendly, link. I would like to display the article's title in the link instead of the ID.
Do I have to pass the whole title as the parameter (instead of the ID) and then run a SQL query that searches for a database record with the matching title? That sounds scary. Maybe there is some way, similar to the Eval() methods, to turn a title into an ID?
Thank you very much!
There is nothing to prevent you from including both the ID (for quick SQL retrieval) and the article's title in the link (for SEO purposes). This is exactly how Stack Overflow handles its routing (check the address of this question).
routes.MapPageRoute("article-browse", "article/{id}/{title}", "~/Article.aspx");
Obviously, the title after the ID is not necessary to display the page (you only use the ID to fetch the article), but every time you generate a link in your site, generate it with the title, and the bots will use that when indexing your pages.
Oh, and you might also want to create a method that translates your title into a URL-friendly string: making everything lowercase, converting spaces and other characters to '-', etc.
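For example, a minimal slug helper along those lines (the exact character rules are an assumption; adjust to taste):

    using System.Text.RegularExpressions;

    // Turns "My First Article!" into "my-first-article".
    public static string Slugify(string title)
    {
        string slug = title.ToLowerInvariant();
        // Collapse anything that is not a letter or digit into a single hyphen.
        slug = Regex.Replace(slug, "[^a-z0-9]+", "-");
        // Trim hyphens left over from leading/trailing punctuation.
        return slug.Trim('-');
    }

When generating links you could then call, say, Page.GetRouteUrl("article-browse", new { id = article.Id, title = Slugify(article.Title) }), where article stands in for your own data object.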
I've an ASP.NET page with a textbox and an option for the user to pick one of the following choices: Wikipedia, Google, Dictionary.com, Flickr, Google Images.
The user enters a word (or words) in the textbox and selects one of those choices.
Depending on the choice selected by the user, I wish to return the following:
Wikipedia: Return the content of, and a link to, the Wikipedia page about the word.
Google: Return the top 10 Google search results for the word.
Flickr: Return a few images (at most 10) from a Flickr search.
Google Images: Return a few images from a Google Image search.
Dictionary: Return the meaning of the word.
How can I do that?
Since you want to do some processing on the results prior to displaying them, your best bet is probably to invoke a web request on the server to fetch your results as RSS or some other parsable XML format.
So first up, we have Wikipedia, which has API support for open search, and queries with XML or JSON output. You can get the details of the API by going to: http://en.wikipedia.org/w/api.php
I would think either the query action or the opensearch action would be what you want.
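As a rough sketch of that server-side fetch against the opensearch action (error handling omitted; the query-string details are worth double-checking against api.php):

    using System;
    using System.Net;
    using System.Xml;

    // Fetches Wikipedia opensearch results for a term as XML (a sketch).
    static XmlDocument SearchWikipedia(string term)
    {
        string url = "http://en.wikipedia.org/w/api.php?action=opensearch"
                   + "&format=xml&search=" + Uri.EscapeDataString(term);
        using (WebClient client = new WebClient())
        {
            XmlDocument doc = new XmlDocument();
            doc.LoadXml(client.DownloadString(url));
            return doc;
        }
    }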
Next up, there is Google, which supports search results as RSS through their Active Search feature. The link takes you to the main page where you can build the query, at which point it should be easy to drop in your search terms. There is also the Google Search AJAX API (see the "Flash and other Non-Javascript Environments" section of its documentation for building the URLs directly). I believe this option should give you access to Google Image results as well.
For Flickr, have a look at this App Garden page. There are several output formats available to choose from.
I wasn't able to find anything really solid on getting results from Dictionary.com, but it does appear that they have an API. You might be able to dig through Google and find some references on how to get search results as XML or JSON. There are also several other dictionary sites which may have more information about their APIs. While searching, I managed to find this SO question about word lookup from Google Dictionary.
Hope this helps.
Have an iframe within your page, and then set the src of the frame to the appropriate query URL that you craft from the user's input.
This can be done from JavaScript within the page, in response to the user selecting something in the 'choice' dropdown. You can have the appropriate URLs already embedded in the JavaScript (as variables) and just substitute in the user's input.
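The same idea also works server side with a runat="server" iframe, as in this sketch (the control IDs and the Wikipedia URL template are assumptions):

    // Markup: <iframe id="resultFrame" runat="server"></iframe>
    protected void SearchButton_Click(object sender, EventArgs e)
    {
        // Build the target URL from the user's input (Wikipedia chosen here).
        string url = "http://en.wikipedia.org/wiki/Special:Search?search="
                   + Server.UrlEncode(SearchTextBox.Text);
        resultFrame.Attributes["src"] = url;
    }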
Good Afternoon,
A client is interested in creating an ASP.NET 2.0 website whose purpose is to serve up a "quote of the day". He wants the quotes on static content pages, all attached to the same master page. The quote pages must be viewed in a certain sequence, and visitors cannot view any page other than the starting page when they arrive at the site. That is, everyone must enter the site at page 001.aspx.
Two Questions:
1. The content pages are going to be created by the client using an Excel data source and a merge process by which each quote page is created, e.g. 001.aspx, 002.aspx, etc. This seems clunky to me at best. Would ASP.NET Dynamic Data be a better solution here?
2. I'm new to ASP.NET routing and URL rewriting as a whole. How would I set up a route table to ensure that users always enter the site on the same entry page, and create a route such that default.aspx resolves to 001.aspx?
Thanks,
Sid
I would suggest using the Excel sheet as a data source and handling the viewing of the quote pages by paging through the result set obtained from that data source.
If your client is concerned about SEO, he must recognize that his requirement to have only one entry page defeats his one-quote-one-page-is-SEO-friendly goal.
I don't think the effort to distinguish between a human user and a search bot is worth it.
Anyway, Googlebot is capable of indexing pages with URL parameters, which allows you to be SEO friendly without generating static content (other bots should be as well).
Possible solution
To allow search bots to index your quotes, use a query parameter for the date of the quote.
If you want to force human users (hackers don't count ;-)) to enter the site only at the current date, check the user-agent string and redirect any browser not known to be a search bot to the current date's page whenever the referrer is not the previous date's page.
This solution should give you a reasonable result without too much overhead.
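A minimal sketch of that check (the bot list, the date parameter, and the quote.aspx page name are all assumptions, and real user-agent sniffing needs more care):

    protected void Page_Load(object sender, EventArgs e)
    {
        string ua = Request.UserAgent ?? string.Empty;
        bool isBot = ua.IndexOf("Googlebot", StringComparison.OrdinalIgnoreCase) >= 0
                  || ua.IndexOf("bingbot", StringComparison.OrdinalIgnoreCase) >= 0;

        // Bots may deep-link to any dated quote; humans get funneled to today's.
        string today = DateTime.Today.ToString("yyyy-MM-dd");
        if (!isBot && Request.QueryString["date"] != today)
        {
            Response.Redirect("~/quote.aspx?date=" + today);
        }
    }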
All,
This would seem like a fairly basic ASP.NET question, but in all my years of coding, I've never really thought about it.
Say you have an ASP.NET 2.0 site with only a master page and a default.aspx, and it's a blog that saves all its data into a database. Links on the side are generated automatically. So... the URL is always just http://www.XXXXX.com/default.aspx.
With that being the case, what do you need to do so that, say, Google knows about all the different blog entries and links directly to them instead of just to the base URL?
Is it as simple as changing the forms method to: method="get"?
Thanks, L. Lee Saunders
There are at least two solutions:
1. Search engines understand query strings, so just add the article IDs to the URLs in your anchor tags; there is no need to even use a form control.
2. Use URL rewriting to expose one set of URLs to the outside world (like /article-title/1234/) in your anchor tags, and then rewrite the URL to default.aspx when it arrives at your site; the page can then pull the article to be displayed from any number of places, including, but not limited to, a query string.
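A minimal sketch of the second option, handled in Global.asax (the /title/id/ pattern is an assumption; import System.Text.RegularExpressions):

    void Application_BeginRequest(object sender, EventArgs e)
    {
        // e.g. "/my-first-post/1234/" -> default.aspx?postid=1234
        Match m = Regex.Match(Request.Path, @"^/[^/]+/(\d+)/?$");
        if (m.Success)
        {
            // The browser keeps the friendly URL; ASP.NET serves default.aspx.
            Context.RewritePath("~/default.aspx?postid=" + m.Groups[1].Value);
        }
    }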
You could have a REST web service so that you can just use URLs to navigate the site, and perhaps have a front page with some new posts, so that the spider can navigate the site.
As an example, look at the URLs for SO; it is easy for a spider to navigate this database-driven website.
Create a page that just serves up XML Sitemap (the data obviously being pulled from your database) and submit the sitemap to Google.
Google will then index any links in your sitemap.
(This assumes that there is some difference between each article's URL - e.g. a querystring key/value. A sketch of such a sitemap page follows the links below.)
Useful Link(s):
Web Sitemap Generators
Google Sitemap Validator
Google Sitemaps for ASP.NET 2.0 (there are about a gazillion interesting links off the back of this as well).
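Here is that sketch, written as a generic handler (Sitemap.ashx is an assumed name, and GetArticleIds is a stand-in for your database query):

    using System.Collections.Generic;
    using System.Web;
    using System.Xml;

    // Sitemap.ashx: serves an XML sitemap built from the article IDs.
    public class Sitemap : IHttpHandler
    {
        public void ProcessRequest(HttpContext context)
        {
            context.Response.ContentType = "text/xml";
            using (XmlTextWriter xml = new XmlTextWriter(context.Response.Output))
            {
                xml.WriteStartDocument();
                xml.WriteStartElement("urlset",
                    "http://www.sitemaps.org/schemas/sitemap/0.9");
                foreach (int id in GetArticleIds())
                {
                    xml.WriteStartElement("url");
                    xml.WriteElementString("loc",
                        "http://www.example.com/default.aspx?postid=" + id);
                    xml.WriteEndElement();
                }
                xml.WriteEndElement();
                xml.WriteEndDocument();
            }
        }

        // Stand-in for the real database query that returns every article ID.
        private static IEnumerable<int> GetArticleIds()
        {
            return new int[] { 1, 2, 3 };
        }

        public bool IsReusable { get { return true; } }
    }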
Some sort of URL rewriting may be an answer.
I wouldn't recommend a postback for your situation; it can get ugly for refreshes, etc. So, yes, change the method to "get".
Then, say, your page default.aspx?postid=12345 can be translated into /mm/dd/yy/this-is-my-post.aspx.