The following site appears to be hijacking a client's content.
http://mothernova2.rssing.com/chan-24556607/latest.php
This is my client's site.
http://www.mothernova.com/
How would I go about blocking that domain from accessing the site? It also appears they are pulling the site into an iframe allowing full browsing.
FYI, the site is using WordPress, WordFence and iThemes Security (if there are any settings I should add for blocking).
You need to use a framekilling script, which uses JavaScript to check whether your page is the top-level window and, if it is not, replaces the framing page with your own. Here's a simple version:
<script type="text/javascript">
if(top != self) top.location.replace(location);
</script>
One drawback to this approach: if there is a legitimate site iframing your code, you need to check the referrer and start adding exceptions.
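The exception handling mentioned above could be sketched roughly as follows. This is only an illustration, not a tested drop-in: the allowlist contents (`partner.example.com`) and the helper name `isAllowedFramer` are assumptions, and `document.referrer` may be empty or reduced to the origin depending on the framing site's referrer policy.

```javascript
// Hypothetical allowlist of hosts that are permitted to frame the page.
const ALLOWED_FRAMERS = ["partner.example.com"];

// Returns true when the referrer's hostname is on the allowlist.
// An empty or unparseable referrer is treated as not allowed.
function isAllowedFramer(referrer, allowedHosts) {
  if (!referrer) return false;
  try {
    return allowedHosts.includes(new URL(referrer).hostname);
  } catch (e) {
    return false;
  }
}

// Browser-only part: break out of the frame unless the framer is allowed.
if (typeof window !== "undefined" && window.top !== window.self) {
  if (!isAllowedFramer(document.referrer, ALLOWED_FRAMERS)) {
    window.top.location.replace(window.location.href);
  }
}
```

With this sketch, a page framed by a host that is not in `ALLOWED_FRAMERS` (for example the rssing.com page from the question) would break out of the frame, while a page framed by an allowlisted host would stay put.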
And a question to answer before you do it: you're getting a pageview and ad impression from the annoying framing site; is there any reason why you need to go to the bother, when they're sending a few viewers to your client's content?
The owners of rssing.com are well-known scrapers, and they are grabbing your content via RSS, hence the name rssing.com.
You can use the contact form to ask that they take your content down. Tell them they are clearly violating your TOS and copyright for your content.
(I had to do this in the past for my own content scraped from my site; they did remove my site at my request.)
Maybe I wasn't implementing the above suggestions correctly (I was adding them at the page level), but they weren't working for me. I did find this post and it seems to work as outlined.
http://forum.ait-pro.com/forums/topic/rssing-com-good-or-bad/
I updated my .htaccess file with the suggested code.
Brett
Related
Hi I'm having an issue with a site where visitors need to be members to access certain pages, but once logged in they go to these pages and still see the 'not logged in' page and need to refresh to view the actual content.
This obviously leads to a lot of bounces and I'd like to fix so that they see the content right away.
The root issue comes from some cache settings or something from the host - unfortunately we can't change host (and it's not a regular hosting company with a website but a design company reseller) for the time being. This issue does not occur in our offline environment of the same site.
I've already had to add a ?randomnumber to the stylesheet so it loads new versions properly. I was wondering if something like this would work - but dynamically as pages are being added all the time by different admins.
Or any other solutions also appreciated!
Thanks
Like you said, tweaking the caching settings would be ideal. But since that's not an option, I'd suggest adding a random, meaningless query string to the URL of the member pages so that each request is seen as a 'new page' and (likely) won't be served from cache.
So instead of /member-page
Direct them to /member-page?cache-buster=randomlyGeneratedStringHere
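A minimal sketch of generating such links dynamically, assuming the links are built in JavaScript. The function name `withCacheBuster` and the placeholder base URL are made up for illustration; the parameter name `cache-buster` is the one used above.

```javascript
// Appends a cache-busting query parameter to a relative path or absolute URL.
// By default the token is a short random string, so each call yields a new URL.
function withCacheBuster(url, token = Math.random().toString(36).slice(2)) {
  // A dummy base lets the URL API parse relative paths such as "/member-page".
  const base = "http://placeholder.invalid";
  const parsed = new URL(url, base);
  parsed.searchParams.set("cache-buster", token);
  // Return a relative path when the input was relative, a full URL otherwise.
  return parsed.origin === base
    ? parsed.pathname + parsed.search + parsed.hash
    : parsed.toString();
}
```

For example, `withCacheBuster("/member-page", "abc123")` gives `/member-page?cache-buster=abc123`, and existing query parameters are kept. Whether the host's cache actually treats the query string as part of the cache key is an assumption that has to be checked on the live site.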
I'm looking for an easy way to share through LinkedIn without all the hassle of OAuth 2.0, which doesn't seem to be required on other pages that use this kind of sharing (they didn't require anything from me; I could share straight away).
Straight to the issue:
this one works: https://www.linkedin.com/sharing/share-offsite/?url=https%3A%2F%2Frefair.me
this one doesn't: https://www.linkedin.com/sharing/share-offsite/?url=https%3A%2F%2Frefair.me%2Fjob%2F494
It seems that I can't get sharing to work for anything beyond the main domain. For comparison, here is a link from another site that goes deeper and is still shareable: https://www.linkedin.com/shareArticle?mini=true&url=https://bulldogjob.pl/companies/jobs/2043-programista-java-warszawa-bms-sp-z-o-o&title=Programista+Java&summary=&source=https://bulldogjob.pl
I also tested with and without the source and summary query params. Has anyone had this issue?
LinkedIn uses the Open Graph protocol (http://ogp.me/) to determine how pages are shared in LinkedIn.
You may also use the LinkedIn Post Inspector (https://www.linkedin.com/post-inspector/) tool to debug how various pages would be shared in LinkedIn.
I decoded your URL so I could get a cleaner look...
https://www.linkedin.com/sharing/share-offsite/?url=https://refair.me/job/494
So, let's try to visit your URL: https://refair.me/job/494 . The webpage you are sharing DOES NOT LOAD.
Is your site down for everyone? Yes, your site is down for everyone.
In order to share a URL on LinkedIn, you must fulfill the following minimum requirements:
The URL must load.
If you just want to test out the API, try using wikipedia.org or google.com as test pages.
Surprisingly, the old refair.me URL by itself works fine in LinkedIn, but that could be from some internal cache, from way back in the day when the page once did work. It certainly does not do so anymore.
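For reference, the encoding and decoding of the share link discussed above can be sketched as follows. The helper names `buildShareUrl` and `extractSharedUrl` are made up for illustration; the endpoint and the `url` parameter are the ones from the question.

```javascript
// Share endpoint taken from the URLs in the question.
const SHARE_ENDPOINT = "https://www.linkedin.com/sharing/share-offsite/";

// Builds a share link; the page URL must be percent-encoded as a query value.
function buildShareUrl(pageUrl) {
  return SHARE_ENDPOINT + "?url=" + encodeURIComponent(pageUrl);
}

// Recovers the decoded page URL from a share link, for a cleaner look.
function extractSharedUrl(shareUrl) {
  return new URL(shareUrl).searchParams.get("url");
}
```

`buildShareUrl("https://refair.me/job/494")` reproduces the non-working link from the question, which suggests the link itself was encoded correctly and the problem lay with the target page.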
I'm having a few issues with making our site shareable on LinkedIn and I'm at a loss. The og: meta tags all look fine and the Facebook scraper picks them up, but the LinkedIn scraper does not, and the image and other assets are not in a protected folder or anything like that.
In the developer tools, the response to the GET request for the url-preview?url= link shows that the image and other details are missing.
The image is less than 1 MB and all og: meta tags are obeyed. The only thing that may not be 100% is that the image ratio is not 1/4 or 4/1 (it's 2/1), but that is only a recommendation and not a hard-and-fast rule.
Does LinkedIn provide something similar to FB (https://developers.facebook.com/tools/debug/) where you can test the scraper and re-run it? Or is there another way to debug this? Any help appreciated.
https://www.hipla.co.uk (is the page i'm trying to share).
cheers
It transpires that LinkedIn doesn't offer a facility similar to Facebook's or Twitter's for testing the OG meta tags and re-scraping the page. It caches a page for 7 days and then re-scrapes it. However, you can refresh the LinkedIn crawler cache simply by appending GET params to the URL, e.g. https://www.hipla.co.uk?123.
I eventually figured out what our issue was. We were using a wildcard cert (so we could have a single SSL cert for multiple subdomains), which meant we had to set the server name in the Apache default-ssl.conf file, and we had a typo in it for the www instance. That caused an SSL error for the LinkedIn crawler, which can't be debugged through LinkedIn; we only spotted it because we got an SSL error when testing the Twitter metadata tags with the Twitter card validator. Note that the SSL error was not visible in a browser, where everything looked fine. Hope this helps anyone else who has a typo in their SSL settings.
First time posting so please bear with me.
I'm the unofficial web guy at the company I work for and I helped create our basic static HTML site.
Any work that I do to the site offline and then FTP shows up instantly on my machine. I rarely, if ever, need to clear the cache for changes to show up. However, within the company I work for, nearly half of the users never see the updates. Some do, some don't.
On the machines that don't, I've cleared the cache in the browser and through the internet control panel settings. Nothing. They still show the stale content. The only thing that works - and I've seen this in both Chrome and IE - is that when I add www in front of the URL, it then shows the refreshed site. No big deal, right? Well, users who type in mysite.com without the www in front will not see the updates. People who have favorited it like that will not see the updates.
Now, on to what I've tried to fix it. After much research many people have steered me away from meta tag refresh so I haven't tried that, however, with the help of the IT guy we have, from what we can tell, set the HTTP header of the site to always refresh. This did not do anything for us.
I've tried changing image names in the HTML page when updating a photo and that didn't work either.
I haven't been able to find a .htaccess file, so can I create one? If we (the IT guy and I) changed the HTTP header setting to always refresh but there is no .htaccess file, will there be no change?
Any help or suggestions would be greatly appreciated.
I have searched on here for the answer and the two most suggested changes are HTTP Header and Meta refresh. HTTP header didn't help and it seems the Meta tag route is bad form.
This is a DNS issue. You need to ask the provider of your web services to add an A or a CNAME record for the domain's root.
If you don't understand the above, just call the provider of your web presence (the company that hosts your web server) and tell them you want yourdomain.com and www.yourdomain.com to go to the same place.
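One rough way to check whether the two names currently go to the same place is to compare the addresses they resolve to. The sketch below is only illustrative: the helper names are made up, and the resolver is passed in as a parameter so the code makes no assumption about the environment. Matching addresses do not prove that the server is configured to serve the same site for both names.

```javascript
// Returns true when both lists contain the same set of IP addresses,
// regardless of order.
function sameAddressSet(a, b) {
  const left = new Set(a);
  const right = new Set(b);
  if (left.size !== right.size) return false;
  for (const ip of left) {
    if (!right.has(ip)) return false;
  }
  return true;
}

// `resolve` is an injected function: (hostname) => Promise<string[]>.
// Resolves the root domain and its www variant and compares the results.
async function rootMatchesWww(resolve, rootDomain) {
  const [rootIps, wwwIps] = await Promise.all([
    resolve(rootDomain),
    resolve("www." + rootDomain),
  ]);
  return sameAddressSet(rootIps, wwwIps);
}
```

In Node.js, `require("dns").promises.resolve4` can be passed as `resolve`, for example `rootMatchesWww(require("dns").promises.resolve4, "yourdomain.com")`.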
I'm wondering if it's possible to capture details from the web page that a user previously visited, if my page was not linked from it?
What I am trying to achieve is to allow users to my site to find a page they like while browsing the web, and then navigate to a page on my site via a bookmark, which will add the URL (and possibly some other details like the page title) to a form which they can then submit to my site to add the page to a list of favourites there.
I am not really sure where to start looking for this. I wondered if I could use http referrer, but think this may only work if there is a link to my page?
Alternatively, I am open to other suggestions as to how I could capture this data - a Firefox plugin? A page which users browse other sites in an iframe, with a skinny frame on top?
Thanks in advance for your suggestions.
Features like this are typically not allowed by browsers for security and privacy reasons. The IFrame would work, but this is a common hacking technique so it may be likely to break or be flagged in the future.
The firefox addon is the best solution, but requires users to install it manually.
Also, a bookmarklet could be used. While they are actively on the target page, the bookmarklet could send you the URL.
This example bookmarklet would create a tinyURL for the destination page. You could add it to your database or whatnot.
javascript:void(window.open('http://tinyurl.com/create.php?url='+encodeURIComponent(document.location.href)));
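Adapted to the question, a bookmarklet could instead open a form on your own site with the page's URL and title pre-filled. The sketch below assumes a hypothetical endpoint (`https://example.com/favourites/add`) and hypothetical parameter names (`url`, `title`); both would need to match whatever your form actually reads.

```javascript
// Hypothetical endpoint on your own site that shows the "add favourite" form.
const ADD_FAVOURITE_URL = "https://example.com/favourites/add";

// Builds the form link for a given page; both values are percent-encoded
// so that characters such as "&" or "?" in them do not break the query string.
function buildAddFavouriteUrl(pageUrl, pageTitle) {
  return ADD_FAVOURITE_URL +
    "?url=" + encodeURIComponent(pageUrl) +
    "&title=" + encodeURIComponent(pageTitle);
}

// The bookmarklet text users would save as a bookmark. When clicked, it runs
// on whatever page is open and reads that page's URL and title.
const bookmarklet =
  "javascript:void(window.open('" + ADD_FAVOURITE_URL +
  "?url='+encodeURIComponent(location.href)+" +
  "'&title='+encodeURIComponent(document.title)));";
```

The server-side form would then read the `url` and `title` query parameters and let the user confirm before saving.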
If some other site links to yours and the user clicked on that link which took them to your site you can access the "referrer" from the http headers. How you get a hold of the HTTP headers is language / framework specific. In .NET you would use the Request.UrlReferrer; other frameworks would probably handle it differently.
EDIT: After reading your question again, my guess would be what you're looking for is some sort of browser plugin. If I understand correctly, you want to give your clients the ability to bookmark a site, while they are on that site, which would somehow notify your site about the page they're viewing. The cleanest way to achieve this would be a browser plugin. You can also do FRAME tricks, like the Digg bar.