ASP.Net authentication and Googlebot - asp.net

I have an ASP.Net 3.5 web site with forms authentication enabled. Is it possible to have Googlebot crawl my web site without getting prompted for a username/password?

Google claims that it does not want to index pages and show them to users as available when they are not, because those pages actually require a username and password.
It only offers the option to let AdSense crawl the protected pages, so it knows which ads to show on them:
https://www.google.com/adsense/support/bin/answer.py?answer=37081
Other solutions that check whether the visitor is a bot, or comes from Google's crawler machines, are not safe because they can easily be spoofed by users, and they may also fail to show a preview or a cached copy of the page.
So you need to think about your site structure: decide what is important and what is not, show some parts of the pages and hide others when the user is not registered. That way Google has something to index even when it is not logged in.
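A minimal sketch of that partial-content approach in ASP.NET Web Forms (the page class and panel names are hypothetical, and the page would also need to allow anonymous access in web.config so the forms-authentication login redirect does not kick in):

// Code-behind sketch with hypothetical control names: the teaser panel is always
// rendered so crawlers have something to index, while the full article is only
// shown to authenticated users.
using System;
using System.Web.UI;

public partial class ArticlePage : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        TeaserPanel.Visible = true;                           // public summary, always visible
        FullArticlePanel.Visible = Request.IsAuthenticated;   // protected body, members only
        LoginPromptPanel.Visible = !Request.IsAuthenticated;  // invite anonymous visitors to register
    }
}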

Here is an article:
http://www.guruofsearch.com/google-access-password-protected-site
It would be interesting to see if a Google sitemap would result in pages showing up in Google - but I doubt that would work either, as the pages would likely need to be crawled anyway.
And some other interesting comments here:
http://forums.searchenginewatch.com/showthread.php?t=8221

Related

How to check search results of website on google?

I am working on a website "https://datasiplus.com".
When I type datasiplus into Google, the 3rd result is this URL: "https://datasiplus.com.cutestat.com/".
Is that normal?
Could it be the cause of my website showing unwanted popup ads?
How to check search results of website on google?
You can see all indexed pages from your website (domain) if you go to Google search and type the following:
site:datasiplus.com
Is cutestat.com normal?
That page is a tool for getting information about a specific domain: it estimates its value, traffic and a lot more. Either the tool crawled your page automatically or someone looked up your domain with it.
There is a form on their site where you can request to have your domain removed from cutestat.com.
So yes, it's normal that this appears in Google's index, because it's essentially a subpage of their tool, and datasiplus is a keyword for both sites: yours and datasiplus.com.cutestat.com.
If you go to Google now and search for datasiplus, you can already see your own question there.
Can it be the cause for my website having unwanted popup ads?
No, this page will not cause unwanted popup ads on your page (or any other page).
Popups like this are most likely caused by malware on your page, possibly introduced through security holes in WordPress and/or one of the plugins you are using.
To start searching for and removing such malware, see this SO question.

My azure website doesn't show up on search engines

As the title says. I have an ASP.NET web application that I published to my azure account. I did a little SEO and it should show up somewhere on the search engines but it doesn't.
It doesn't even show up if I type the address into the search field, although it works fine when I type the URL into the address bar.
My azure subscription is "Pay-as-you-go".
Any tips or answers are appreciated!
Thanks!
My answer mainly pertains to Google. How long have you waited? In my experience it takes at least a few days to a week to start showing up. If you're using Google, sign up for their Webmaster Tools; when you submit your site you can see when it's indexed and which pages are indexed, which matters because they may skip content they deem duplicated elsewhere, whether it is or not. It's also my experience (using Azure) that subdomains on "azurewebsites.net" end up with poor SEO, but when I put a full domain on my site it ranks much higher.
I also assumed that you have submitted the site to the search engines; if you haven't, sign up for a webmaster account and do that (Bing and Google both have these).
http://www.bing.com/toolbox/webmaster
https://www.google.com/webmasters/tools/home?hl=en
In Google you can also search specifically for your site to see what comes back, which will show that others can get to your content (even if it's buried 100 pages deep in other searches):
site:[your site].azurewebsites.net

How to get all content of website from google cache?

My Gmail account was hacked today, and I can't log in or request a new password anymore. I have also lost all the content of my Blogspot blog.
I looked around and found that it is stored in Google's cache, but I had more than 200 articles, so I would need to go through more than 200 URLs to copy all the content.
Is there any method that can help me retrieve all the content from Google's cache?
A web crawler could help you retrieve many pages of information.
Also, maybe you could use the Internet Archive's Wayback Machine to recover some of the lost information.
Another tip: Google's advanced search could help you too, in particular the site or domain parameter.
Update: Maybe this script can do all the work for you: Retrieving Google's Cache for a Whole Website
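If you would rather script it yourself than rely on that tool, here is a rough C# sketch of the idea; the webcache.googleusercontent.com cache-URL format and the urls.txt input file are assumptions, and Google may throttle or block automated requests:

// Rough sketch: download Google's cached copy of each URL listed in urls.txt.
// Assumes the cache URL format http://webcache.googleusercontent.com/search?q=cache:<url>;
// Google may throttle or block automated requests, so this deliberately goes slowly.
using System;
using System.IO;
using System.Net;
using System.Threading;

class CacheDownloader
{
    static void Main()
    {
        string[] urls = File.ReadAllLines("urls.txt");  // one blog URL per line
        Directory.CreateDirectory("cache");

        using (var client = new WebClient())
        {
            client.Headers[HttpRequestHeader.UserAgent] = "Mozilla/5.0";

            for (int i = 0; i < urls.Length; i++)
            {
                string cacheUrl = "http://webcache.googleusercontent.com/search?q=cache:"
                                  + Uri.EscapeDataString(urls[i]);
                try
                {
                    string html = client.DownloadString(cacheUrl);
                    File.WriteAllText(Path.Combine("cache", i + ".html"), html);
                    Console.WriteLine("Saved " + urls[i]);
                }
                catch (WebException ex)
                {
                    Console.WriteLine("Failed " + urls[i] + ": " + ex.Message);
                }
                Thread.Sleep(2000);  // be polite between requests
            }
        }
    }
}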

Page Administration & Open Graph

I'm an Admin for this page
http://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Fwww.westberks.gov.uk%2Findex.aspx%3Farticleid%3D23789
and I've also specified that my App can administer it too. I've Liked this page so I thought I should be able to access the admin screen for the page but I can't seem to. Any ideas on how I access the admin screen for this page in FB, so I can manually publish updates?
Additionally, when I try to update the page programmatically I get the message:
(OAuthException) (#200) The user hasn't authorized the application to perform this action
but the page has my App listed here
http://graph.facebook.com/10150303466842688
This was definitely a bug rather than a programming or setup issue. The bug has been marked fixed by Facebook as of 1/18/2012 and everything now works as it is supposed to! Bug report:
http://developers.facebook.com/bugs/308356579205492?browse=search_4f0f1475c470b2076799347
Until this recent fix, there was a problem where Open Graph pages did NOT allow the admins of those pages to retrieve page access tokens for them, which meant they were locked out of posting "as the page" and apparently also locked out of the Admin area for their own pages.
I know that this is fixed for me now with this bugfix, and hopefully it will also be fixed for everyone else.
You will need to ask for manage_pages, read_stream and publish_stream. Once your admin accepts those permissions, the app can call me/accounts on the Graph API (try it here: https://developers.facebook.com/tools/explorer). The response lists all the pages they admin, and each entry includes a unique access token, called the page access token. Using that token you should be able to read and write to me/feed for that page.
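A rough C# sketch of that flow; the tokens and page ID are placeholders, the JSON handling is deliberately naive, and the endpoints and permission names are the ones described above:

// Sketch: list the pages the user admins via /me/accounts, then post to the page feed
// "as the page" using the page access token. USER_ACCESS_TOKEN, PAGE_ID and the page
// token below are placeholders; real code should parse the JSON with a proper library.
using System;
using System.Collections.Specialized;
using System.Net;
using System.Text;

class PagePoster
{
    static void Main()
    {
        const string userAccessToken = "USER_ACCESS_TOKEN"; // granted manage_pages/publish_stream
        const string pageId = "PAGE_ID";

        using (var client = new WebClient())
        {
            // 1. Each page entry in this response includes a unique page access token.
            string accounts = client.DownloadString(
                "https://graph.facebook.com/me/accounts?access_token=" + userAccessToken);
            Console.WriteLine(accounts);

            // 2. Post to the page feed with the page access token extracted from step 1.
            string pageAccessToken = "PAGE_ACCESS_TOKEN_FROM_STEP_1"; // placeholder
            var form = new NameValueCollection
            {
                { "message", "Posted via the Graph API" },
                { "access_token", pageAccessToken }
            };
            byte[] response = client.UploadValues(
                "https://graph.facebook.com/" + pageId + "/feed", "POST", form);
            Console.WriteLine(Encoding.UTF8.GetString(response));
        }
    }
}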

capture details from external web page

I'm wondering if it's possible to capture details from the web page that a user previously visited, if my page was not linked from it?
What I am trying to achieve is to allow users to my site to find a page they like while browsing the web, and then navigate to a page on my site via a bookmark, which will add the URL (and possibly some other details like the page title) to a form which they can then submit to my site to add the page to a list of favourites there.
I am not really sure where to start looking for this. I wondered if I could use the HTTP referrer, but I think this may only work if there is a link to my page?
Alternatively, I am open to other suggestions as to how I could capture this data: a Firefox plugin? A page where users browse other sites in an iframe, with a skinny frame on top?
Thanks in advance for your suggestions.
Features like this are typically not allowed by browsers for security and privacy reasons. The IFrame would work, but this is a common hacking technique so it may be likely to break or be flagged in the future.
The Firefox add-on is the best solution, but it requires users to install it manually.
Also, a bookmarklet could be used. While they are actively on the target page, the bookmarklet could send you the URL.
This example bookmarklet would create a tinyURL for the destination page. You could add it to your database or whatnot.
javascript:void(window.open('http://tinyurl.com/create.php?url='+document.location.href));
If some other site links to yours and the user clicked that link to reach your site, you can access the referrer from the HTTP headers. How you get hold of the HTTP headers is language/framework specific; in .NET you would use Request.UrlReferrer, other frameworks would probably handle it differently.
EDIT: After reading your question again, my guess would be what you're looking for is some sort of browser plugin. If I understand correctly, you want to give your clients the ability to bookmark a site, while they are on that site, which would somehow notify your site about the page they're viewing. The cleanest way to achieve this would be a browser plugin. You can also do FRAME tricks, like the Digg bar.
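For the referrer approach from the first part of this answer, a minimal ASP.NET sketch (the page and control names are hypothetical; note that UrlReferrer is null when the visitor arrives via a bookmark or a typed URL, which is exactly the case the question describes):

// Minimal sketch: pre-fill a form field with the referring page's URL.
// Request.UrlReferrer is null when the visitor arrived via a bookmark or a typed URL,
// so this only works when another page actually linked to this one.
using System;
using System.Web.UI;

public partial class AddFavourite : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        Uri referrer = Request.UrlReferrer;
        if (referrer != null)
        {
            UrlTextBox.Text = referrer.AbsoluteUri;  // hypothetical TextBox on the page
        }
    }
}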
