Sitemap for search engines - asp.net

I need to make an API call and get the response, which contains more than 4,000 URLs.
I need to list all of these URLs in a sitemap so that search engines can crawl the site easily, and I plan to write a handler for this task. Can someone suggest an example of how to do this?

I'll assume you are talking about a sitemap in XML format, although you didn't specify the source beyond the fact that you need to make an API call. However, the third or so result from a Google search on "asp.net google sitemap" should give you a perfect starting point:
http://www.mikesdotnetting.com/Article/94/Create-a-Google-Site-Map-with-ASP.NET
I would suggest creating an ASHX handler (File -> New -> Generic Handler in Visual Studio) instead of a page like they do in the example.
Upload the handler to the website and add the sitemap to e.g. Google by using their Webmaster Tools.
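For reference, a minimal sketch of what such a Generic Handler might look like. GetUrlsFromApi is a hypothetical stand-in for your API call; the sample URL is a placeholder:

using System;
using System.Collections.Generic;
using System.Web;
using System.Xml;

public class SitemapHandler : IHttpHandler
{
    private const string Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    public void ProcessRequest(HttpContext context)
    {
        context.Response.ContentType = "text/xml";

        XmlWriter writer = XmlWriter.Create(context.Response.Output);
        writer.WriteStartDocument();
        writer.WriteStartElement("urlset", Ns);

        // Loop over the 4000+ URLs returned by the API call.
        foreach (string url in GetUrlsFromApi())
        {
            writer.WriteStartElement("url", Ns);
            writer.WriteElementString("loc", Ns, url);
            writer.WriteElementString("lastmod", Ns, DateTime.UtcNow.ToString("yyyy-MM-dd"));
            writer.WriteEndElement();
        }

        writer.WriteEndElement();
        writer.WriteEndDocument();
        writer.Flush();
    }

    private IEnumerable<string> GetUrlsFromApi()
    {
        // Hypothetical: replace with the real API call.
        return new[] { "http://www.example.com/page1.aspx" };
    }

    public bool IsReusable { get { return true; } }
}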

A quick search on Google turned up this link:
XML sitemap with ASP.NET
It should get you most of the way with the handler and composing the XML.

Related

How would you go about writing a custom script that grabs the Adobe or Google Analytics image request?

If I wanted to build a scraper that pings each URL on a site and stores the Adobe (or Google) image request, how would I go about it? That is, I just want something that grabs all the parameters in the URL posted to Adobe and stores them in a CSV or something similar. I'm familiar with building simple web scrapers, but how do I capture the URL I see in, for example, Fiddler that contains all the variables being sent to the analytics solution?
If I could do this, I could run a script that lists all URLs with the corresponding tracking events being fired, and it would make QA much more manageable.
You should be able to query the DOM for the image object created by the tag request. I am more familiar with the IBM Digital Analytics (Coremetrics) platform, where you can find the tag requests by accessing the array document.cmTagCtl.cTI in the Web Console on a Coremetrics-tagged page. I used this method when building a Selenium WebDriver test case in which I wanted to verify the analytics tags.
I don't have the equivalent for Adobe or GA at the moment, since it depends on the library implementation; I am trying to do the same as you for GA.
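For what it's worth, a rough sketch of that Selenium approach using the C# bindings. The page URL is hypothetical, and the page must be Coremetrics-tagged for document.cmTagCtl.cTI to exist:

using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;

class AnalyticsTagTest
{
    static void Main()
    {
        using (IWebDriver driver = new FirefoxDriver())
        {
            // Hypothetical URL of a Coremetrics-tagged page.
            driver.Navigate().GoToUrl("http://www.example.com/tagged-page");

            // Coremetrics records its fired tag requests in this array.
            var js = (IJavaScriptExecutor)driver;
            object tags = js.ExecuteScript("return JSON.stringify(document.cmTagCtl.cTI);");
            Console.WriteLine(tags);
        }
    }
}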
Cheers,
Jamie

Access to sitemap?

I created a sitemap with the name "Web.sitemap" in the root folder, and I need to feed this to Google keywords. Any idea how I can access this file? I tried (domain)/Web.sitemap, but it doesn't load.
What is the proper way to access this file?
Thanks
Web.sitemap is typically used by the SiteMap control in ASP.NET to render menus and the like. It is not exposed publicly; in fact, the default IIS configuration will block it from being loaded through the browser.
You may be thinking of a sitemap.xml file, which is an XML description of every page on your site used by search engines and crawlers. More information on this can be obtained from http://www.sitemaps.org/protocol.php
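For reference, a minimal sitemap.xml following that protocol looks something like this (the URL and date are placeholders):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
<lastmod>2009-11-17</lastmod>
</url>
</urlset>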
I'm not sure what you mean by feeding it to "Google keywords", but if you want to submit a sitemap to Google Webmaster Tools (and search engines in general), it is an XML sitemap following the XML sitemaps protocol that you want (as Mike wrote).

Create Google compliant dynamic XML sitemap

I want to create a dynamic (fetching data from the database) XML sitemap which I can submit to Google webmaster tools.
Surprisingly, I couldn't find any recent controls/code online to do this. The most recent code I found was this: http://weblogs.asp.net/bleroy/archive/2005/12/02/432188.aspx which is for ASP.NET 2.0. I don't mind using it, but I suspect it's outdated.
Can somebody please point me in the direction of code which accomplishes this?
A couple of options include:
The ASP.NET SiteMap infrastructure. It allows you to write a custom sitemap provider, like this one that uses Microsoft Access, to generate a sitemap.
You can also find a very simple sitemap generator project on this site.
Another option (and a fun learning experience) is to write your own by looking at the sitemap protocol and using LINQ to SQL along with LINQ to XML to generate the format. Here is an example that uses LINQ to SQL and LINQ to XML to generate the XML; a minimal sketch of the same idea follows this list.
Finally, Google also accepts RSS/Atom feeds, so you could generate one of those instead. If you go this route, you can make use of the SyndicationFeed class. There are also a couple of open source options available.
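As promised above, a minimal sketch of the LINQ to XML approach. The pages array is a hypothetical stand-in for a LINQ to SQL query against your own tables, and the URLs are placeholders:

using System;
using System.Linq;
using System.Xml.Linq;

class SitemapBuilder
{
    static void Main()
    {
        XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

        // Hypothetical data; in practice this would be a LINQ to SQL query.
        var pages = new[]
        {
            new { Url = "http://www.example.com/", LastModified = DateTime.UtcNow },
            new { Url = "http://www.example.com/events.aspx", LastModified = DateTime.UtcNow }
        };

        // Build the sitemap document with LINQ to XML.
        var sitemap = new XDocument(
            new XDeclaration("1.0", "utf-8", "yes"),
            new XElement(ns + "urlset",
                from p in pages
                select new XElement(ns + "url",
                    new XElement(ns + "loc", p.Url),
                    new XElement(ns + "lastmod", p.LastModified.ToString("yyyy-MM-ddTHH:mm:ssZ")),
                    new XElement(ns + "changefreq", "daily"))));

        sitemap.Save(Console.Out);
    }
}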
Actually, I did just this recently using LINQ to XML:
How to generate xsi:schemalocation attribute correctly when generating a dynamic sitemap.xml with LINQ to XML?
Actually, the string returned by that code is written directly to the Response object. I use an .ashx HttpHandler to deliver the content as XML and use Routing to serve it under the name sitemap.xml. You should also reference it in your robots.txt file.
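If it helps, a rough sketch of wiring that up with Routing. SitemapHandler is a hypothetical name for the class behind the .ashx (like the handler sketched earlier on this page):

using System.Web;
using System.Web.Routing;

public class SitemapRouteHandler : IRouteHandler
{
    public IHttpHandler GetHttpHandler(RequestContext requestContext)
    {
        // Hypothetical: the class behind the .ashx handler.
        return new SitemapHandler();
    }
}

// In Application_Start (Global.asax.cs):
// RouteTable.Routes.Add(new Route("sitemap.xml", new SitemapRouteHandler()));

And the corresponding line in robots.txt:

Sitemap: http://www.example.com/sitemap.xml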

Rss and external feed

I want to build an app similar to this one: http://community.livejournal.com/ohnotheydidnt/32551171.html
using a LiveJournal RSS feed. Is there any way of retrieving an external feed (meaning a feed from a different domain than the one serving your web application, given the same-origin policy)? I've built a parser, but I would like to use Dashcode for simple HTML building.
Across domains, if the data is only available via RSS and you don't have control of the other domain, then your best option is a server-side proxy.
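A minimal sketch of such a proxy as an ASP.NET Generic Handler, since this thread is ASP.NET-centric. The feed URL is a placeholder; substitute the LiveJournal feed you need:

using System.Net;
using System.Web;

public class FeedProxyHandler : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        using (var client = new WebClient())
        {
            // Fetch the cross-domain RSS server-side, where the
            // same-origin policy does not apply. Placeholder URL.
            string rss = client.DownloadString("http://example.livejournal.com/data/rss");
            context.Response.ContentType = "application/rss+xml";
            context.Response.Write(rss);
        }
    }

    public bool IsReusable { get { return false; } }
}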
If you have control over the other domain, you can create a page containing a JavaScript function which uses XmlHttpRequest to pull the RSS and return it. Then you can use a cross-domain messaging library like easyXDM to call that script.
You also might want to check whether the RSS feed's website supports JSONP as an alternative format, which would allow you to get the RSS data via JavaScript. Make sure you trust the site if you do this, though, since it can execute JavaScript inside your page!

Updateable Google Sitemap for ASP.NET 3.5 Web App Project

I am working on an ASP.NET 3.5 Web Application project in C#. I have manually added a Google-friendly sitemap which includes entries for every page in the project - this is not a CMS.
<url>
<loc>http://www.mysite.com/events.aspx</loc>
<lastmod>2009-11-17T20:45:46Z</lastmod>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>
The client updates events using an admin back-end. Other than that, the site is relatively static. I'm trying to decide on the best way to update the <lastmod> values for a handful of pages that are regularly updated.
In particular, I am using the QueryStringField of the ListView control to enhance SEO as described here:
https://web.archive.org/web/20211029044137/https://www.4guysfromrolla.com/articles/010610-1.aspx
http://gsej.wordpress.com/2009/05/31/using-a-datapager-with-both-a-querystringfield-and-renderdisabledbuttonsaslabels/
When the QueryStringField property is set, the DataPager renders the paging interface as a series of hyperlinks which the crawler can follow and index. However, suppose Google crawled my list of events two days ago, and in the meantime the admin has added another dozen events; say the page size is set to 6. In this case, the Google SERP links would now point to the wrong pages. This is why I need to be sure that the sitemap reflects changes to the events page as soon as they happen.
I have already looked though other SO questions for info and didn't find what I needed. Can anyone offer some guidance or an alternative approach?
UPDATE:
Since this is a shared hosting environment, a directory watcher/service won't work:
How to create file watcher in shared webhosting environment
UPDATE:
Starting to realize that I may need to signify to Google that the containing page has been updated; perhaps update the Last-Modified HTTP header?
Rather than using a hand-coded sitemap, create a sitemap handler that will generate the sitemap on the fly. You can create a method in the handler that will grab pages from an existing navigation sitemap, from the database, or even from a hard-coded list of pages. You can create an XmlDocument from the list, and write the InnerXml of the document out to the handler response stream.
Then, create a class with a method that will automatically ping search engines with the above handler's URL (like http://www.google.com/webmasters/tools/ping?sitemap=http://www.mysite.com/sitemap.ashx).
Whenever someone adds a new event, call the above method. This will ping Google with your latest sitemap (freshly generated by the handler above).
You want to make sure that the ping only works if the sitemap has actually been updated. You could use File.SetLastWriteTime on events.aspx in the AddNewEvent handler to signify that the containing page has been updated.
Also, be careful to make sure there have been no pings within the last hour (Google's guidelines discourage pinging more than once per hour).
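Putting those pieces together, a sketch of what the ping method might look like. The sitemap URL comes from the example above; the one-hour throttle is kept in a static field purely for simplicity:

using System;
using System.Net;

public static class SitemapPinger
{
    private static DateTime lastPing = DateTime.MinValue;

    // Call this from the AddNewEvent handler after the event data changes.
    public static void PingGoogle()
    {
        // Google's guidelines discourage pinging more than once per hour.
        if (DateTime.UtcNow - lastPing < TimeSpan.FromHours(1))
            return;

        using (var client = new WebClient())
        {
            // Sitemap handler URL from the answer above.
            client.DownloadString(
                "http://www.google.com/webmasters/tools/ping?sitemap=" +
                Uri.EscapeDataString("http://www.mysite.com/sitemap.ashx"));
        }

        lastPing = DateTime.UtcNow;
    }
}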
I actually plan to implement this in the following OSS project: http://cyclemania.codeplex.com. I will let you know once it's done and you can have a look.
If you let your users add events to the website, you are probably using a database.
This means you can generate the XML-Sitemap at runtime like this:
create a page where your sitemap will be available (this doesn't need to be sitemap.xml but can also be sitemap.aspx or even sitemap.ashx).
open a database connection
loop through all records and create an Xml Element for each record
This blog post should help you further: Build a Search Engine SiteMap in C#.
It does not use the new XElement types from .NET 3.5, but it will work fine.
You can put this in an .aspx page, but adding an HttpHandler is probably better, as described in a different post on the same blog: (creating a httphandler for a sitemap)
