I just replaced a client's Tumblr website with a brand new WordPress site. When running it through the Facebook debugger, I get this error:
The object at 'http://example.com/' previously had type 'tumblr-feed:tumblelog' and cannot be changed to an object of type 'website' to avoid data corruption of existing actions.
I Googled "Cannot change og_type" (in quotes) and got literally zero results (well now it seems there are results stemming from this question). Am I really doomed to Facebook data mismatch?
Per the error message
... cannot be changed to an object of type 'website' to avoid data corruption of existing actions.
If the og:type were changed for a URL, any existing user posts linking to or sharing it, any Open Graph actions referencing it, and any likes of the URL would break, and users' profiles would be missing content they'd posted before.
I don't believe there's any way around this, as it's an intentional restriction to avoid breaking existing posts, likes, actions, etc. referencing a URL. If those posts broke, content would be removed from or mangled on users' timelines.
A possible workaround, if you want a 'new' object at that URL, is to use my instructions in this answer about moving URLs: put a Like button on the URL you're trying to change (let's call it A) but point it at a slightly different, new URL (let's call it B), then use the redirect mechanism in my answer to bounce users landing at URL B back to A, while serving the metadata describing A on URL B whenever the Facebook crawler accesses it.
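A minimal sketch of that bounce, assuming hypothetical URLs https://example.com/a (A) and https://example.com/b (B); checking for the facebookexternalhit user agent is the usual way to spot Facebook's crawler:

<?php
// Hypothetical handler for URL B (the new URL); A is the original page.
// The URLs and og values here are placeholders, not the asker's real site.
$ua = $_SERVER['HTTP_USER_AGENT'] ?? '';

if (stripos($ua, 'facebookexternalhit') !== false) {
    // Facebook's crawler gets the metadata describing A, served from B,
    // so B is a fresh object with the og:type you actually want.
    ?>
    <html prefix="og: http://ogp.me/ns#">
    <head>
        <meta property="og:type" content="website">
        <meta property="og:url" content="https://example.com/b">
        <meta property="og:title" content="My Site">
    </head>
    <body></body>
    </html>
    <?php
} else {
    // Real visitors landing on B get bounced back to A.
    header('Location: https://example.com/a', true, 302);
    exit;
}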
Does the client's site have more than 10,000 likes? If so, Facebook doesn't allow og:type to be changed.
You can update the attributes of your page by updating your page's tags. Note that og:title and og:type are only editable initially - after your page receives 50 likes the title becomes fixed, and after your page receives 10,000 likes the type becomes fixed. These properties are fixed to avoid surprising users who have liked the page already. Changing the title or type tags after these limits are reached does nothing, your page retains the original title and type.
Here's the link to the Open Graph documentation: https://developers.facebook.com/docs/opengraph/ :)
I would recommend using the Open Graph Debugger to check what Facebook really sees, and whether Facebook has a cached version of your site. (You'll find the debugger here: https://developers.facebook.com/tools/debug)
Note that it doesn't say og:type; it says og_type.
This is hitting me too, since my og:type is set to "shamrockirishbar:shamrockirishbar", BUT the linter is saying og_type (of which there is none in my metadata) is set to "website".
One of the purposes of og:url, I thought, was that it was a way you could make sure session variables, or any other personal information that might find its way into a URL, would not be passed along by sharing in places like Facebook. According to the best practices on Facebook's developer pages: "URL: A URL with no session id or extraneous parameters. All shares on Facebook will use this as the identifying URL for this article."
(under good examples: developers.facebook.com/docs/sharing/best-practices)
This does NOT appear to be working, and I am puzzled as to how I've misunderstood it, or what I have wrong in my code. Here's an example:
https://vault.sierraclub.org/fb/test.html?name=adrian
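For context, a page like that would typically build its og:url with the query string stripped; a minimal sketch (my reconstruction, not the actual contents of test.html):

<?php
// Drop the query string (e.g. ?name=adrian) to get the canonical URL.
$path = strtok($_SERVER['REQUEST_URI'], '?');
$canonical = 'https://vault.sierraclub.org' . $path;
?>
<meta property="og:url" content="<?= htmlspecialchars($canonical) ?>">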
When I drop things into the debugger, it seems to be working fine...
https://developers.facebook.com/tools/debug/sharing/?q=https%3A%2F%2Fvault.sierraclub.org%2Ffb%2Ftest.html%3Fname%3Dadrian
og:url reads as expected (without name=adrian).
But if I share this on Facebook and then click the link, the URL goes to the one with name=adrian in it, not the og:url.
Am I doing something incorrectly here, or have I misunderstood? If the latter, how does one keep things like session variables out of shares?
Thanks for any insight.
UPDATE
Facebook replied to a bug report on this, and I learned that I was indeed reading the documentation incorrectly:
developers.facebook.com/bugs/178234669405574/
The question then remains: is there any other method of keeping session variables/authentication tokens out of shares?
Is there a way to force Symfony to throw a 404 if there are extra query params?
For example, I have the route /news/ and I want to allow only a date parameter. So the link could exist in this form: /news/?date=243242, but I want a 404 if the user enters a link like /news/?param=2.
Thanks.
(I don't want to check query params in the controller; I know I can do that.)
Do you really need these to be GET params? You could meet your objective by having them as values in the URL itself, e.g.
@Route("/news/date/{date}")
Slightly different, I know, but you can enforce it; a sketch follows below.
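Here's that route as a Symfony controller action (annotation syntax; the controller name and the \d+ requirement are my assumptions):

<?php

namespace App\Controller;

use Symfony\Component\HttpFoundation\Response;
use Symfony\Component\Routing\Annotation\Route;

class NewsController
{
    /**
     * @Route("/news/date/{date}", requirements={"date"="\d+"})
     */
    public function byDate(string $date): Response
    {
        // Only /news/date/<digits> matches this route; anything else
        // falls through to a 404, with no query-param checks needed.
        return new Response('News for ' . $date);
    }
}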
Why do you care about the extra params anyway? If some nasty user decides to play with the URL directly, your app is not supposed to behave correctly.
Don't bother with all these checks — unless those params somehow affect security.
Based on your comment, you want to respond with 404 to get rid of duplicate content in Google. There are several steps you need to take to solve that problem.
If a user enters an extra parameter manually, that alone will not add a page to the Google index. So if you're seeing duplicate pages with different params in the Google index, it means you have links with those extra params on your site. That's how they end up being indexed.
The first thing you could do is get rid of those links. Then go to Google Webmaster Tools and manually remove the indexed pages with those extra parameters from the index. If the problematic links are gone, the pages won't get into the index again.
If for some reason you can't get rid of those links, go to Webmaster Tools and consult the URL Parameters section to learn how to mark parameters that Google should ignore.
This is a question about web application architecture rather than coding per se; however, I still think it belongs here, as it's in the problem domain of most web developers.
My problem: I have a page on which the content is not complete (only partial content). I don't want to just return a 200 response, because I want it to be clear that the content on the page is only temporary, and that a visitor (Google) should return at a later date to retrieve the correct page.
I'm not sure if there is a status code in the http specification that would be useful here.
I'm thinking about using a 302 redirect to the same URI, but I'm not sure if google will see this as gaming (I don't see why it should - no-one would 302 to the same URI on a permanent basis as the page content would be pretty much disregarded).
That's exactly what I want: For the page to be accessible - but for google to disregard the page, remember the URL and come back later to index it.
I don't want to use a meta 'no-index' tag with a 200 response as I fear this will stop the page being reindexed when the correct content is ready.
206 is the Partial Content status code, but that's not what you are doing here; it's for responses to HTTP range requests. What you have here is an "under construction" type page, but only the content of the page is going to change, not the URI. So the right thing to do is just return a 200 and let Google index it.
If you don't want it indexed yet because it is not ready for the public, then add a meta no-index like you say. Google still downloads the page and parses it to find the no-index, but does not index it. Remove the no-index when you are ready and it will start indexing. You can even prompt this by submitting a new sitemap.xml file with your page in it.
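That tag is just one line in the page head; for example, gated on a hypothetical $pageIsReady flag from your CMS:

<?php if (!$pageIsReady): ?>
    <meta name="robots" content="noindex">
<?php endif; ?>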
Google re-indexes insanely quickly these days so don't worry too much about temp blocking a page with a meta tag.
Google will re-index the page when the content changes automatically. Or you can force an update somewhere in the webmaster tools.
Alternatively, you could have the page 302 to an alternate address with your partially completed content until such time as the page is 'finished'. Then copy the final content into your original page and take off the 302.
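A minimal sketch of that temporary bounce, with a hypothetical staging path:

<?php
// While the page is unfinished, send visitors (and crawlers) to the
// partial version with a temporary redirect. Remove this once the
// final content lives at the original URL.
header('Location: /drafts/my-page', true, 302);
exit;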
Error status codes are reserved for error conditions; there is no such error as "this page is not in its final version". What you might want instead is to specify that the page becomes obsolete and invalidated at some later time. For example, headers along the following lines (a sketch, not a canonical recipe) mark the page as becoming obsolete instantly:
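<?php
// An Expires header already in the past tells clients and caches the
// page is stale immediately (the header names are standard HTTP; the
// exact values here are my choice).
header('Expires: Thu, 01 Jan 1970 00:00:00 GMT');
header('Cache-Control: no-cache, must-revalidate');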
I have written a CMS for a website. You can create pages and do all the things you would expect, but I am just wanting your opinions on what to do if a user changes the URL of a page. You would need to do a 301 for the previously stored URL, but if the user changes the URL 10 times, you have to account for all those changes.
Therefore do you not allow users to change the URL or are there other approaches?
Thanks
I'd imagine that a user renaming a page isn't going to happen very often, so you might be able to afford to run a scan through all of the pages in your database looking for references to the previous URI. Present a warning page to the user, saying "All of these pages have links that will now go to 404 because of this change", and give them options:
Establish a 301 as you're thinking
Automagically update the identified links
Don't rename the page
Don't make any of the changes
Of course, you could always just perform the automatic update and let the user back that out too, but that necessitates a fairly complex WAL setup that I can tell you from experience is a huge pain.
Just my $0.02!
If you are worried about the 10 sequential requests caused by the 301s, you could have a script periodically go through all "redirect pages", figure out the most recent URL they now point to, and point them straight there without intermediate redirects.
Alternatively, keep a list of the original URLs along with the latest one, so you can update all of them when the URL changes again.
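A sketch of that flattening pass, assuming a hypothetical redirects table with from_url and to_url columns:

<?php
// Collapse redirect chains so each stored redirect points straight at
// its final destination. The table and column names are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=cms', 'user', 'pass');

$map = [];
foreach ($pdo->query('SELECT from_url, to_url FROM redirects') as $row) {
    $map[$row['from_url']] = $row['to_url'];
}

$update = $pdo->prepare('UPDATE redirects SET to_url = ? WHERE from_url = ?');
foreach ($map as $from => $to) {
    $seen = [$from => true];              // guard against redirect loops
    while (isset($map[$to]) && !isset($seen[$to])) {
        $seen[$to] = true;
        $to = $map[$to];                  // follow the chain to its end
    }
    if ($to !== $map[$from]) {
        $update->execute([$to, $from]);
    }
}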
Quite a lot of CMSes simply don't allow permalinks to be changed, so if you're worried about not complying with some rule, you wouldn't be the first to disallow it.
If you do implement changeable permalinks (say, by changing the title), you'll have to store N titles for each content item and just redirect to the title with the highest id:
ContentItemTitle
id - auto increment
text - unique constraint
contentId - reference
Each row links back to a single ContentItem, so one content item accumulates many titles over time.
Then, when you receive an HTTP request for <text>, look up all the ContentItemTitle rows that share the same <contentId> as <text>, pick the one with the highest <id>, and redirect to that.
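A sketch of that lookup, assuming the schema above and a PDO connection in $pdo; the slug handling is simplified:

<?php
// Map an incoming (possibly outdated) title to the newest title for
// the same content item, and 301 there if it differs.
$slug = $_GET['slug'] ?? '';

$stmt = $pdo->prepare(
    'SELECT t2.text
       FROM ContentItemTitle t1
       JOIN ContentItemTitle t2 ON t2.contentId = t1.contentId
      WHERE t1.text = ?
   ORDER BY t2.id DESC
      LIMIT 1'
);
$stmt->execute([$slug]);
$current = $stmt->fetchColumn();

if ($current !== false && $current !== $slug) {
    header('Location: /' . rawurlencode($current), true, 301);
    exit;
}
// Otherwise $slug is already the current title: render the page as usual.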
I need to be able to determine which page the user just came from, to decide which links to display, such as breadcrumbs or links to the previous/next item. This is basically the HTTP_REFERER functionality in PHP, but I need a way of tracking it across multiple pages. I also need to "support" the back button.
I have noticed that Facebook uses a query/GET parameter of "ref" to track the referring page. (They also avoid reloading the entire page by using AJAX instead, but I don't have the budget to do that right now.) Also, the site I'm working on needs to be indexed by Google, so this method will also require that I add the canonical link tag.
I'm wondering if the ref/referrer query parameter is the best method or what other options there are?
If you want breadcrumbs, you shouldn't be using HTTP_REFERER at all. It should be a logical path to get to where they are, no matter where they came from, like User > Albums > AlbumName > Photo, even if they came from a direct link their friend gave them. That said, if you do want to go back a few pages, just store them as an array in a session variable.
I'm pretty sure Facebook just uses the ref GET variable to collect some statistics on which buttons users are using, since there are multiple ways to get to the same page.
None of this should break the back button or interfere with your canonical tag.
From comments: You could use a ?ref=blah tag, or session variables ($_SESSION['history'][0] = $_SERVER['HTTP_REFERER'] or REQUEST_URI). Use whatever you find easiest. Session variables rely on cookies or on passing an ID through the URL; GET params just clutter the URL and might get passed around to friends. A sketch of the session approach (the 'history' key and depth of 10 are arbitrary choices):
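<?php
session_start();

// Append the current page to a small rolling history.
$_SESSION['history'][] = $_SERVER['REQUEST_URI'];
$_SESSION['history'] = array_slice($_SESSION['history'], -10);

// The page the user came from, regardless of HTTP_REFERER:
$count = count($_SESSION['history']);
$previous = $count >= 2 ? $_SESSION['history'][$count - 2] : null;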