Symfony - Restrict QueryParams - symfony

Is there a way to force Symfony to throw a 404 if there are extra params?
For example, I have the route /news/ and I want to allow only the date parameter. So a link could exist in this form: /news/?date=243242, but I want a 404 if the user enters the following link: /news/?param=2.
Thanks.
(I don't want to check query params in controller, I know I can)

Do you really need these to be GET params? You can meet your objective by having them as values in the URL itself, e.g.
@Route("/news/date/{date}")
Slightly different, I know, but you can enforce it.

Why do you care about the extra params anyway? If some nasty user decides to play with the URL directly, your app is not supposed to behave correctly.
Don't bother with all these checks — unless those params somehow affect security.
Based on your comment, you want to respond with 404 to get rid of duplicate content in Google. There are several steps you need to take to solve that problem.
If a user enters an extra parameter manually, that in no way adds a page to the Google index. So, if you have duplicate pages based on different params in the Google index, it means that you have links with those extra params on your site. That's how they end up being indexed.
First thing you could do would be to get rid of those links. Then you could go to the Google Webmaster Tools and manually remove the indexed pages with those extra parameters from the index. If you don't have the problematic links anymore, they won't get into the index again.
If for some reason you can't get rid of those links, go to the Webmaster Tools and consult the URL Parameters section to understand how to add parameters that Google should ignore.
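For reference, the check the question asks about boils down to comparing the query keys against an allowlist. Here is a minimal, framework-agnostic sketch in Python (this is not Symfony code; the names status_for and ALLOWED_PARAMS are made up for illustration, and in a real app the same logic would presumably sit in some request hook that runs before the controller):

```python
from urllib.parse import urlsplit, parse_qs

# Assumption: "date" is the only query parameter the /news/ route accepts.
ALLOWED_PARAMS = {"date"}


def status_for(url: str, allowed=ALLOWED_PARAMS) -> int:
    """Return 404 if the query string has any key outside `allowed`, else 200."""
    query = urlsplit(url).query
    # keep_blank_values=True so that "?param=" still counts as an extra key.
    keys = set(parse_qs(query, keep_blank_values=True))
    return 404 if keys - set(allowed) else 200
```

With this, /news/?date=243242 maps to 200 and /news/?param=2 maps to 404. Whether that is worth doing at all is a separate matter, as discussed above.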

Related

Why do my Google Analytics URLs begin with slashes?

When looking at Google Analytics, all reports show URLs that begin with a slash and www: "/www.url.com/page.html."
I've never seen Google report like this. Webmaster Tools is set up correctly. Not sure what else can be set up in Analytics. Any idea?
Current: /www.url.com/page.html
Typical: /page.html
By default GA only reports the relative path and query string of the URL (it strips the protocol and domain). So one of two things must be happening:
1) you have code that is passing a custom page name to the _trackPageview call, adding that "/[domain]".
2) you have a custom filter within the interface setup that is prefixing the page name with "/[domain]"
Adding the domain to the page name is a fairly common thing to do when you have GA code spanning multiple domains, most especially when they are going to a rollup profile, so that you can see which pages are coming from which domain.
So if I had to guess (and this is only a guess, seeing as how I don't have access to your code or GA interface), someone must have attempted to rebuild the full url to use as the page name instead of just the path+querystring - and then messed up (probably a messup in some regex with the protocol, if I wanted to throw even more guesses at it).
But the 64 thousand dollar question is.. where is it being changed? Like I said.. GA by default does not do this, so someone has added code to do it on your site, or else a filter within the interface.
I would start by looking to see if there are any filters in your interface, since that is the easier thing to determine. If you see no filters relevant to this, then you will have to look on your page code (including any script includes or other javascript code being output). It would be a value passed with _trackPageview so ctrl+f for that.
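To make the difference concrete, here is a rough Python sketch of the two page names being discussed. It is only an illustration of the string handling, not GA code; both function names are invented, and the second one mimics the kind of "/[domain]" prefix a custom filter or a custom _trackPageview value would produce.

```python
from urllib.parse import urlsplit


def default_page_name(url: str) -> str:
    """Path plus query string: what gets reported when nothing overrides it."""
    parts = urlsplit(url)
    return parts.path + ("?" + parts.query if parts.query else "")


def rollup_page_name(url: str) -> str:
    """Hypothetical customisation: prefix the page name with "/<domain>"."""
    parts = urlsplit(url)
    return "/" + parts.netloc + default_page_name(url)
```

For http://www.url.com/page.html the first gives /page.html (the "Typical" form) and the second gives /www.url.com/page.html (the "Current" form), which is why a filter or custom page name is the first thing to look for.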

Cannot change og_type

I just replaced a Tumblr website for a client with a brand new WordPress site. When running it through the Facebook debugger, I get this error:
The object at 'http://example.com/' previously had type 'tumblr-feed:tumblelog' and cannot be changed to an object of type 'website' to avoid data corruption of existing actions.
I Googled "Cannot change og_type" (in quotes) and got literally zero results (well now it seems there are results stemming from this question). Am I really doomed to Facebook data mismatch?
Per the error message
... cannot be changed to an object of type 'website' to avoid data corruption of existing actions.
If the og:type were changed for a URL, any existing user posts linking to it or sharing it, any Open Graph actions referencing it, as well as any likes of the URL would become broken and the user's profiles would be missing content they'd posted before.
I don't believe there's any way around this, as it's an intentional restriction to avoid breaking existing posts, likes, actions, etc referencing a URL. If the posts were broken, content would be removed from or mangled on the user's timeline.
A possible workaround if you want to have a 'new' object at that URL is to use my instructions in this answer about moving URLs: put a Like button on the URL you're trying to change (let's call it A), but point it at a slightly different, new URL (let's call it B), and then use the redirect mechanism in my answer to bounce users landing at URL B back to A, while serving the metadata describing 'A' on URL B if the Facebook crawler accesses it.
Does the client's site have more than 10,000 likes? If so, Facebook doesn't allow og:type to be changed.
You can update the attributes of your page by updating your page's tags. Note that og:title and og:type are only editable initially - after your page receives 50 likes the title becomes fixed, and after your page receives 10,000 likes the type becomes fixed. These properties are fixed to avoid surprising users who have liked the page already. Changing the title or type tags after these limits are reached does nothing, your page retains the original title and type.
Here's the link to the Open Graph documentation. :)
I would recommend using the Open Graph Debugger to check what Facebook really sees and whether Facebook has a cached version of your site. (You can find the debugger here: https://developers.facebook.com/tools/debug)
Note that it doesn't say og:type - it says og_type
This is hitting me too since my og:type is set to "shamrockirishbar:shamrockirishbar" BUT the linter is saying og_type (of which there is none in my meta data) is set to "website".

Bookmarking ASP.NET search results using POST or GET?

I need a little help understanding how HTML forms work. It is my understanding that forms that use GET as their method submit name/value pairs for all fields within the form tags of said submission. However, if you take a look at the following example from Google (and I've seen this in many other places too) and only use one of the fields on the form:
http://books.google.co.uk/advanced_book_search
Rather than being sent to a page with a name/value pair for each field of the advanced search page you are taken to a much cleaner looking URL:
http://www.google.co.uk/search?tbo=p&tbm=bks&q=hitchiker&num=10
Despite all of the input fields on the advanced search page.
Onto my problem... My own advanced search page is quite large and at the moment is being POSTed to my search results page which is taking in the values and searching accordingly, no problems! However, I want my users to be able to bookmark/share their searches and in order to do this I need to have items being passed into the querystring but I don't want massive querystrings if I don't need them. If my user has only searched by a color for example then I want the URL to be something like search.aspx?color=red; If they're searching by color and size then search.aspx?color=red&size=large and so on. Is this possible?
To complicate things even further I'm using ASP.NET, so it's not the easiest of things to create a form that uses GET, though I do believe I have already found a way around this.
If you can give any advice or a nudge in the right direction, then thank-you! :)
What you're suggesting should be easily possible if you conditionally check the querystring on the results page to ensure the key/value is there.
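if (!string.IsNullOrEmpty(Request.QueryString["color"]))
{
    // Request.QueryString returns null for a missing key, so test for null as well as ""
    // Add color to the search parameters
}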
To create the GET request I would think you would need to POST back to your search form and redirect to the results form from there, dynamically adding key/values to the querystring as and when they are required. This Post/Redirect/Get design pattern is typically used with web forms to help with bookmarking.
If you want to share bookmarked searches between users, then you'll have to share the name/value querystring options in the posted URL. It sounds like you don't want to include the pair if one wasn't specified. That's easy, just dynamically build a querystring for pairs that the user HAS provided input for. So, when processing, loop through all input controls, and if a value was provided, append it to the querystring, or not.
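The "only append what the user filled in" idea is language-independent. Here is a small sketch in Python rather than ASP.NET (search_url and the field names are placeholders for illustration; in WebForms you would do the equivalent in the postback handler before redirecting):

```python
from urllib.parse import urlencode


def search_url(base: str, fields: dict) -> str:
    """Build a results URL containing only the fields that have a value."""
    # Drop empty, whitespace-only and missing (None) inputs.
    filled = {
        name: value.strip()
        for name, value in fields.items()
        if value and value.strip()
    }
    # urlencode also takes care of escaping special characters in the values.
    return base + ("?" + urlencode(filled) if filled else "")
```

So a form posted with only a color gives search.aspx?color=red, and one with color and size gives search.aspx?color=red&size=large, which is the URL you would redirect to.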

Considerations for changing a page's URL in a CMS

I have written a CMS for a website. You can create pages and do all things you would expect but I am just wanting your opinions on what to do if a user changes the URL of a page. You would need to do a 301 for the previous stored URL but if the user changes the URL 10 times you have to account for all those changes.
Therefore do you not allow users to change the URL or are there other approaches?
Thanks
I'd imagine that a user renaming a page isn't going to happen very often, so you might be able to afford to run a scan through all of the pages in your database looking for references to the previous URI. Present a warning page to the user, saying "All of these pages have links that will now go to 404 because of this change", and give them options:
Establish a 301 as you're thinking
Automagically update the identified links
Don't rename the page
Don't make any of the changes
Of course, you could always just perform the automatic update and let the user back that out too, but that necessitates a fairly complex WAL set up that I can tell you from experience is a huge pain.
Just my $0.02!
If you are worried about the 10 sequential requests caused by the 301's, you could have a script periodically going through all "redirect pages", figure out the most recent URL they now point to, and point them straight there without intermediate redirects.
Alternatively, keep a list of the original URLs along with the latest one, so you can update all of them when the URL changes again.
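A rough sketch of that clean-up pass in Python, assuming the redirects are held as a simple old-URL to new-URL mapping (the function name and the dict representation are my own choices, not something from an existing CMS):

```python
def flatten_redirects(redirects: dict) -> dict:
    """Point every old URL straight at the final URL of its chain."""
    flattened = {}
    for source in redirects:
        target = redirects[source]
        seen = {source}
        # Follow the chain until we reach a URL that is not itself redirected.
        # The `seen` set stops us looping forever if the data contains a cycle.
        while target in redirects and target not in seen:
            seen.add(target)
            target = redirects[target]
        flattened[source] = target
    return flattened
```

After running it, a page renamed /a, then /b, then /c, then /d has three redirects that all point directly at /d, so a visitor to any old URL gets a single 301 instead of a chain.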
Quite a lot of CMSs simply don't allow the permalinks to be changed, so if you're worried about not complying with some rule, you wouldn't be the first to not allow it.
If you do however implement changing the permalinks (say by changing the title) you'll have to store N titles for each content item and just redirect to the title with the highest id.
ContentItemTitle
id - auto increment
text - unique constraint
contentId - reference
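This is linked back to ContentItem as a many-to-one relationship (each title row references exactly one content item, and one content item can have many title rows).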
Then when you receive an HTTP request for <text>, look up the ContentItemTitle row for <text>, find all the rows that share its <contentId>, pick the one with the highest <id>, and redirect to that.
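Roughly, the lookup could look like this. It is a sketch using Python with an in-memory SQLite database; the table and column names follow the layout above, while the sample rows, the resolve function and the status codes it returns are illustrative assumptions.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE ContentItemTitle ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " text TEXT UNIQUE NOT NULL,"
    " contentId INTEGER NOT NULL)"
)
# Sample data: content item 1 was renamed once, item 2 never.
db.executemany(
    "INSERT INTO ContentItemTitle (text, contentId) VALUES (?, ?)",
    [("old-title", 1), ("newer-title", 1), ("other-page", 2)],
)


def resolve(text):
    """Return (status, title): 200 for a current title, 301 for an old one, 404 if unknown."""
    row = db.execute(
        "SELECT t2.text FROM ContentItemTitle t1"
        " JOIN ContentItemTitle t2 ON t2.contentId = t1.contentId"
        " WHERE t1.text = ? ORDER BY t2.id DESC LIMIT 1",
        (text,),
    ).fetchone()
    if row is None:
        return 404, None
    latest = row[0]
    # Serve the page if the request already uses the newest title, else redirect to it.
    return (200, latest) if latest == text else (301, latest)
```

Because every old title redirects straight to the newest one, there is never more than one hop, no matter how many times the title has changed.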

Ways to track the referring page to create other links?

I need to be able to determine which page the user just came from to determine which links to display, such as breadcrumbs or links to the previous next item. This is basically the HTTP_REFERER functionality in PHP, but I need a way of tracking it across multiple pages. I also need to "support" the back button.
I have noticed that Facebook uses a query/get parameter of "ref" to track the referring page. (They also avoid reloading the entire page, using AJAX instead, but I don't have the budget to do that right now.) Also, the site I'm working on needs to be indexed by Google, so this method will also require that I add the canonical link tag.
I'm wondering if the ref/referrer query parameter is the best method or what other options there are?
If you want breadcrumbs, you shouldn't be using HTTP_REFERER at all. It should be a logical path to get to where they are, no matter where they came from, like User > Albums > AlbumName > Photo, even if they came from a direct link their friend gave them. That said, if you do want to go back a few pages, just store them as an array in a SESSION variable.
I'm pretty sure Facebook just uses the ref GET variable to collect some statistics on which buttons users are using, since there are multiple ways to get to the same page.
None of this should break the back button, or interfere with your canonical tag.
From comments: You could use a ?ref=blah tag, or session variables, ($_SESSION['history'][0] = $_SERVER['HTTP_REFERER'] or REQUEST_URI). Use whatever you find easiest. Session variables rely on cookies or passing an ID through the URL, GETs just clutter the URL and might get passed around to friends.
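As a sketch of the session-array idea (written in Python with a plain dict standing in for the session; in PHP the same thing would live in $_SESSION['history']; the function names and the cap of five entries are arbitrary choices for illustration):

```python
MAX_HISTORY = 5  # assumption: how many pages back we want to remember


def record_visit(session: dict, url: str) -> None:
    """Append `url` to the per-user history, skipping reloads and capping the length."""
    history = session.setdefault("history", [])
    if history and history[-1] == url:
        return  # a reload of the same page adds nothing
    history.append(url)
    del history[:-MAX_HISTORY]  # keep only the most recent entries


def previous_page(session: dict):
    """URL visited just before the current one, or None if there is none."""
    history = session.get("history", [])
    return history[-2] if len(history) >= 2 else None
```

Call record_visit on every page load and use previous_page when rendering the "back to ..." link. Note that a press of the back button simply shows up here as another visit to the earlier page, so if you need true back/forward semantics you would have to handle that separately.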
