Ways to track the referring page to create other links?

I need to be able to determine which page the user just came from so I can decide which links to display, such as breadcrumbs or links to the previous/next item. This is basically the HTTP_REFERER functionality in PHP, but I need a way of tracking it across multiple pages. I also need to "support" the back button.
I have noticed that Facebook uses a query/GET parameter of "ref" to track the referring page. (They also avoid reloading the entire page by using AJAX instead, but I don't have the budget to do that right now.) Also, the site I'm working on needs to be indexed by Google, so this method will also require that I add the canonical link tag.
I'm wondering whether the ref/referrer query parameter is the best method, or what other options there are.

If you want breadcrumbs, you shouldn't be using HTTP_REFERER at all. It should be a logical path to get to where they are, no matter where they came from, like User > Albums > AlbumName > Photo, even if they came from a direct link their friend gave them. That said, if you do want to go back a few pages, just store them as an array in a SESSION variable.
I'm pretty sure Facebook just uses the ref GET variable to collect some statistics on which buttons users are using, since there are multiple ways to get to the same page.
None of this should break the back button, or interfere with your canonical tag.
From comments: You could use a ?ref=blah query parameter, or session variables ($_SESSION['history'][0] = $_SERVER['HTTP_REFERER'] or REQUEST_URI). Use whatever you find easiest. Session variables rely on cookies or passing an ID through the URL; GETs just clutter the URL and might get passed around to friends.
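For example, a minimal sketch of the session-history idea (assuming sessions are enabled and this snippet runs at the top of every page; the depth of 10 is arbitrary):

session_start();
if (!isset($_SESSION['history'])) {
    $_SESSION['history'] = array();
}
$_SESSION['history'][] = $_SERVER['REQUEST_URI'];              // record the current page
$_SESSION['history'] = array_slice($_SESSION['history'], -10); // keep only the last 10 entries
// the page the user came from is then the second-to-last entry
$previous = count($_SESSION['history']) > 1
    ? $_SESSION['history'][count($_SESSION['history']) - 2]
    : null;

Because the history lives server-side, the back button keeps working as normal and nothing extra appears in the URL.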

Related

Symfony - Restrict QueryParams

Is there a way to force Symfony to throw a 404 if there are extra query params?
For example, I have the route /news/ and I want to allow only the date parameter. So this link could exist: /news/?date=243242, but I want a 404 if the user enters a link like /news/?param=2.
Thanks.
(I don't want to check the query params in the controller; I know I can do that.)
Do you really need these to be GET params? You could meet your objective by having them as values in the URL itself, e.g.
@Route("/news/date/{date}")
Slightly different, I know - but you can enforce it.
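For example, with annotations (the requirements pattern is an assumption - adjust it to your date format):

use Sensio\Bundle\FrameworkExtraBundle\Configuration\Route;

/**
 * @Route("/news/date/{date}", requirements={"date"="\d+"})
 */
public function newsAction($date)
{
    // if {date} is missing or doesn't match \d+, this route simply
    // doesn't match, and Symfony responds with a 404 if nothing else does
}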
Why do you care about the extra params anyway? If some nasty user decides to play with the URL directly, your app is under no obligation to behave nicely.
Don't bother with all these checks - unless those params somehow affect security.
Based on your comment, you want to respond with 404 to get rid of duplicate content in Google. There are several steps you need to take to solve that problem.
If a user enters an extra parameter manually, that would in no way add a page to the Google index. So, if you're seeing duplicate pages based on different params in the Google index, it means that you have links with those extra params on your site. That's how they end up being indexed.
The first thing you could do is get rid of those links. Then you could go to Google Webmaster Tools and manually remove the indexed pages with those extra parameters from the index. If you don't have the problematic links anymore, they won't get into the index again.
If for some reason you can't get rid of those links, go to Webmaster Tools and consult the URL Parameters section to learn how to declare parameters that Google should ignore.

Show constant URL for site in asp.net

I have a web site with a number of pages, developed in ASP.NET.
I have page URLs like the following:
1) http://www.xyz.com/Home.aspx
2) http://www.xyz.com/Index.aspx
3) http://www.xyz.com/viewMember?Name=abc&id=1
But whichever page the end user is on, I would like the URL shown to be "http://www.xyz.ie".
Is there any setting in web.config? If not, is there any other way?
Please help me...
Thanks in advance.
Jagadi
You cannot keep one single URL for different pages - but there are some tricks that simulate it.
To make the URL stay the same while the content changes, you need a trick, and I don't recommend any of them: search engines won't play along and will still see each page as different, users can't bookmark individual pages, and the average user can easily find the real URL of a page anyway - one click in the right place in the browser reveals it.
One trick is to use frames or iframes: on the main page you load everything else inside an iframe or a frame.
A second trick is to use AJAX to load all the other content.
And finally, you can use the session to decide what to show the user: the user never follows links, but makes POST-backs that change the content.
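For example, the frame trick can be a single wrapper page served at the root URL (a sketch - the inner page name is just an illustration):

<!-- served as http://www.xyz.ie/ - the address bar never changes -->
<html>
  <body style="margin:0">
    <iframe src="Home.aspx" style="border:none; width:100%; height:100%"></iframe>
  </body>
</html>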

Abusing HTTP POST

I'm currently reading Bloch's Effective Java (2nd Edition) and he makes a point of stating, in bold, that overusing POSTs in web applications is inherently bad. Unfortunately, he doesn't specify why.
This startled me, because when I do any web development, all I ever use are POSTs! I have always steered clear of GETs for security reasons and because it felt more professional (long, unsightly URLs always bother me for some reason).
Are there performance differences between GET and POST? Can anyone elaborate on why overusing POSTs is bad, and why? My understanding - and preliminary searches - seem to indicate that these two are handled very similarly by the web server. Thanks in advance!
You should use HTTP as it's supposed to be used.
GET should be used for idempotent, read queries (i.e. view an item, search for a product, etc.).
POST should be used for create, delete or update requests (i.e. delete an item, update a profile, etc.)
GET allows refreshing the page, bookmarking it, and sending the URL to someone; POST doesn't. A useful pattern is post/redirect/get (AKA redirect after post).
Note that, except for long search forms, GET URLs should be short. They should usually look like http://www.foo.com/app/product/view?productId=1245, or even http://www.foo.com/app/product/view/1245
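A minimal PHP sketch of the post/redirect/get pattern mentioned above (the script and helper names are placeholders):

// process.php - the form POSTs here
if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    $id = save_item($_POST); // hypothetical helper that stores the submission
    // 303 tells the browser to follow up with a plain GET
    header('Location: view.php?id=' . urlencode($id), true, 303);
    exit;
}

Refreshing view.php afterwards re-issues a harmless GET instead of re-submitting the form.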
You should almost always use GET when requesting content. Only use POST when you are either:
Transmitting sensitive information which should not appear in the URL bar, or
Changing state on the server (adding/changing/deleting stuff; although recently some web applications use POST to change, PUT to add and DELETE to delete.)
Here's the difference: If you want to give the link to the page to a friend, or save it somewhere, or even only add it to your bookmarks, you need the full URL of the page. Just like your address bar should say http://stackoverflow.com/questions/7810876/abusing-http-post at the moment. You can Ctrl-C that. You can save that. Enter that link again, you're back at this page.
Now when you use any action other than GET, there is simply no URL to copy. It's as if your browser said you are at http://stackoverflow.com/question. You can't copy that. You can't bookmark that. Besides, if you tried to reload such a page, your browser would ask you whether you want to send the data again, which is rather confusing for the non-tech-savvy users of your page. And annoying for everyone else.
However, you should use POST/PUT when transferring data: URLs can only be so long, and you can't fit an entire blog post in one. Also, if you transferred data via GET and reloaded such a page, you'd almost certainly double-post silently, because the warning described above only appears for POST.
GET and POST are very different. Choose the right one for the job.
If you are using POST for security reasons, I should mention another security factor here: you need to ensure that you send the data from a form submit over an encrypted connection (HTTPS), even if you are using POST.
As for the difference between GET and POST: GET is used to retrieve data. You request a page, act upon the data it returns, and that is the end of it.
POST on the other hand, is used to POST data to the application. I am talking about transactions here (complete create, update or delete operations).
Say you have a sensitive application that takes an ID to delete a user. You would not want to use GET for it, because a witty user could cause mayhem by simply changing the ID at the end of the URL and deleting random users.
POST allows more data and can also be used to send file uploads. GET has a limited size.
Performance-wise, there is hardly any tradeoff between GET and POST.
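For example, a destructive action can be guarded server-side by refusing anything that is not a POST - a PHP sketch, where delete_user is a hypothetical helper:

// delete_user.php
if ($_SERVER['REQUEST_METHOD'] !== 'POST') {
    header('Allow: POST', true, 405); // 405 Method Not Allowed
    exit;
}
// a real app would also verify that the logged-in user may delete this account
delete_user((int) $_POST['id']);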

Bookmarking ASP.NET search results using POST or GET?

I need a little help understanding how HTML forms work. It is my understanding that forms that use GET as their method submit name/value pairs for all fields within the form tags of said submission. However, if you take a look at the following example from Google (and I've seen this in many other places too) and only use one of the fields on the form:
http://books.google.co.uk/advanced_book_search
Rather than being sent to a page with a name/value pair for each field of the advanced search page you are taken to a much cleaner looking URL:
http://www.google.co.uk/search?tbo=p&tbm=bks&q=hitchiker&num=10
Despite all of the input fields on the advanced search page.
Onto my problem... My own advanced search page is quite large and at the moment is being POSTed to my search results page which is taking in the values and searching accordingly, no problems! However, I want my users to be able to bookmark/share their searches and in order to do this I need to have items being passed into the querystring but I don't want massive querystrings if I don't need them. If my user has only searched by a color for example then I want the URL to be something like search.aspx?color=red; If they're searching by color and size then search.aspx?color=red&size=large and so on. Is this possible?
To complicate things even further, I'm using ASP.NET, so it's not the easiest of things to create a form that uses GET, though I do believe I have already found a way around this.
If you can give any advice or a nudge in the right direction, then thank-you! :)
What you're suggesting should be easily possible if you conditionally check the querystring on the results page to ensure the key/value is there.
if (!string.IsNullOrEmpty(Request.QueryString["color"]))
{
    // QueryString["color"] is null when the key is absent,
    // so IsNullOrEmpty covers both the missing and the empty case
    // Add color to the search parameters
}
To create the GET request, I would think you would need to POST back to your search form and redirect to the results page from there, dynamically adding key/values to the querystring as and when they are required. This Post/Redirect/Get design pattern is typically used with web forms to help with bookmarking.
If you want to share bookmarked searches between users, then you'll have to share the name/value querystring options in the posted URL. It sounds like you don't want to include a pair if it wasn't specified. That's easy: just dynamically build a querystring from the pairs the user HAS provided input for. So, when processing, loop through all input controls; if a value was provided, append it to the querystring, otherwise skip it.

Standard way to persist data between requests in ASP.NET-MVC

What is the most standard or best way to persist data between requests?
Should I use cookies or session variables? I'm interested in keeping data like sort order, sort column, and page number (for pagination).
I'm coming from a webforms background so normally this type of thing was automatically handled for me in the viewstate of the controls I was using.
update
I like the querystring idea for searching and more meaningful URLs; however, I'm working on an "index/list" view, which consists of a View with a header, "control" options like DDLs for filtering, and a partial view that renders the table of data.
The DDLs use a $.load() to call an ActionResult on the controller, which returns the partial view, with parameters passed in the querystring; but since these are AJAX requests, the main page URL in the user's browser does not get updated.
Is there a best-practice for taking querystrings off the main-page URL and using them in ajax requests to other ActionResults?
If you want it to survive only through one request/redirect, TempData is your friend.
However, for things like your pagination, the URL is the best method, if only for the ability to share links.
A standard way is to pass those sort of things via URL Query Parameters. You can modify your routing to expect certain URL variables. That way the pages become more search engine friendly as well.
It depends on how permanent you want the information to be:
Things like the page number should indeed be in the URL (as others have pointed out) - this helps with bookmarking, etc, but remember that if you add more content to the list, then that bookmarked result set will not always be what the user wanted...
If you're happy for these values to be lost when a session times out (by default around 20 minutes), then put them in Session.
If you think that sessions are going to time out before the next request, or you want to save the settings across visits, then you should be storing them in either cookies or a profile (potentially allowing "Anonymous" profiles, which work with the user's cookies, so they would lose them across machines).
Personally, I'd think very carefully about putting sort order and columns in the URL; if you do, you could actually end up really confusing search engines:
Lots of pages with very similar content (page 1 sorted by date desc, page 1 sorted by date asc, etc.) - search engines don't like duplicate content, and nor should you: Google (for instance) will only show two pages from your site in a default result set, and you want them to be distinct, not duplicates.
Search engines will spend lots more time crawling your site, and potentially give up - if on every page they find links to "Sort by this column", they will attempt to follow them, resulting in more work on the server, higher bandwidth use, etc.
These can be mitigated through the use of a robots.txt file denying access to sorted versions of the page, but if those URLs are generated dynamically, that will be very complex to maintain going forward.
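For illustration, if the sort options were exposed as query parameters (the parameter names here are made up), the robots.txt rules might look like this; note that the * wildcard is honoured by Google and Bing but is not part of the original robots.txt standard:

User-agent: *
Disallow: /*?*sort=
Disallow: /*?*dir=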
In response to your update: a nice way to achieve that for pages would be to output links to the "Previous" and "Next" pages of results (or better yet, a list of all pages in the list) with their page numbers on the page, and then hide them with JavaScript.
This way users will see your nice AJAXy behaviour, and search engines (and users without JavaScript - mobile users, or those using older screen readers, for instance) will still be able to get access to all your pages - this helps your pages degrade gracefully, or use "Progressive Enhancement".
Things that were previously in viewstate should probably be put back in the client's hands via either hidden fields or cookies.
Session is "too" easy. In a dev environment it works great, pretty much no matter what you put in it. In production scalability and persistence become a problem. In-process session is likely to disappear unexpectedly if you have crashing bug in your site, and requires server affinity when load balancing. Out-of process session fixes the durability and affinity issues, but can still be a performance bottle neck if too much stuff is put in session. A VERY common problem is that each page will put 1 or 2 items into session but never take them out again when they are done. And even if a page removes it session data when it is no longer needed, the data can still get orphaned if a user starts a process and never completes it.
Cookies are a fast and simple way to persist data between requests, and you can also make them live only for a limited time, depending on your needs.
Sessions are easiest.
