How to crawl other websites with Drupal?

I am looking for a way to crawl specified websites with Drupal and make their content visible in my search after the crawl has finished.
Any ideas about that?
So far I have tried the Drupal Apache Solr module, which works very well as a search, but I don't know how to extend it so that the index is also filled with content from other sites.

Try using the Feeds Crawler module.

Related

How to mechanically identify all broken links in a Drupal site

We have just moved to Drupal and are trying to proactively identify all broken external web (http://, https://) links.
I've seen some references to validation of links but wasn't sure if it only meant validation of the syntax of the link as opposed to whether these web links work or not (e.g. 404).
What is the easiest way to go through all web links in a Drupal site and identify all of the broken external web links? This is something we'd like to automate and schedule every day/week.
As someone else mentioned, use the Link Checker module. It's a great tool.
In addition, you can check the Crawl Errors report in Google Webmaster Tools for links that return a 404.
Clicking any URL from there will show you where the URL was linked from so you can update any internal broken links. Be sure to use canonical URLs to avoid that.
Make sure you're using a proper internal linking strategy to avoid broken internal links in the first place, too: http://www.daymuse.com/blogs/drupal-broken-internal-link-path-module-tutorial
Essentially: use canonical, relative links to avoid broken internal links in the future when you change aliases. In simple Drupal terms, be sure you're linking to "node/23" instead of "domain.ext/content/my-node-title" since multiple parts of that might change in the future.
I have not found a Drupal-based approach for this. The best free piece of software I've found for finding bad links on sites is Screaming Frog SEO Spider Tool.
http://www.screamingfrog.co.uk/seo-spider/
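For a scripted check that can be scheduled (for example from cron), the idea can be sketched in plain Python. This is a minimal sketch, not how Link Checker or Screaming Frog work internally: the names `external_links`, `http_status`, and `broken_links` are hypothetical, and it assumes you already have the HTML of each page to scan. A link counts as broken here if the request fails or the status is 400 or higher.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import http.client
import urllib.error
import urllib.request


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def external_links(html, page_url):
    """Return absolute http(s) links that point to a different host than page_url."""
    collector = LinkCollector()
    collector.feed(html)
    own_host = urlparse(page_url).netloc
    found = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)
        parts = urlparse(absolute)
        is_web = parts.scheme in ("http", "https")
        if is_web and parts.netloc != own_host and absolute not in found:
            found.append(absolute)
    return found


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request failed outright."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404.
        return error.code
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # DNS failure, refused connection, timeout, malformed response, etc.
        return None


def broken_links(html, page_url, status_of=http_status):
    """Return (url, status) pairs for external links that failed or returned >= 400."""
    results = []
    for url in external_links(html, page_url):
        status = status_of(url)
        if status is None or status >= 400:
            results.append((url, status))
    return results
```

The `status_of` parameter exists so the check can be exercised without network access. Some servers reject HEAD requests, so a production script would probably retry with GET before reporting a link as broken.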

How can I hide my platform (CMS)

I have Joomla and Drupal sites, but I don't want others to find out what platform (CMS) I'm running.
I want to prevent detection from tools like Wappalyzer or similar tools. (as seen in this screenshot: http://i43.tinypic.com/2evc6qo.png)
I've heard that has to do with meta tags but I'm not sure.
There is no way to hide the fact that you're using Joomla. If you inspect the source code of a website built with WordPress, for example, you will see wp-includes in the URLs of the CSS and JS file includes.
With Joomla, you can type /administrator at the end of the URL to find the admin login. Even if the admin URL is hidden, inspecting the source can, again, give it away.
This might be of little help:
How to disable right-click context-menu in javascript
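To see why hiding is hard, the source-inspection check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Wappalyzer works: `guess_cms` and the `CMS_MARKERS` table are hypothetical, and apart from `wp-includes` and `/administrator`, which are named above, the marker strings are assumptions based on common default installs.

```python
# Marker strings are illustrative assumptions, except "wp-includes" and
# "/administrator", which the answer above mentions explicitly.
CMS_MARKERS = {
    "WordPress": ["wp-includes", "wp-content"],
    "Joomla": ["/administrator", 'content="Joomla'],
    "Drupal": ["sites/default/files", 'content="Drupal'],
}


def guess_cms(page_source):
    """Return the sorted names of CMSs whose markers appear in the page source."""
    lowered = page_source.lower()
    hits = []
    for cms, markers in CMS_MARKERS.items():
        if any(marker.lower() in lowered for marker in markers):
            hits.append(cms)
    return sorted(hits)
```

Removing the generator meta tag only defeats the `content="..."` markers; asset paths such as `wp-includes` remain in every page that loads a stylesheet or script.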
For Drupal, see the community wiki page "Hide, obscure, or remove clues that a site runs on Drupal":
The short answer is :
You can't. Do not try.
You can get pretty far trying to hide the fact that your site runs on Drupal. But at some point you probably won't be running Drupal anymore ;-)
Have a look …
at our sister site, Drupal SE: How can I obscure the fact my site uses Drupal?
at drupalscout.com: Hiding the fact your site runs Drupal OR Fingerprinting a Drupal Site
There is a way to hide Joomla from bots.
You need to use the jomdefender plugin. It removes the word "joomla" from all pages, changes the admin page, and adds a few anti-bot tricks.
It's not perfect, but it still adds much more security to your Joomla site, such as a file integrity check, which can be quite useful when a file gets hacked.

Rebuild WordPress site with CodeIgniter

I have a client whose website is built with WordPress. They want to expand the site by adding new features. To me it seems best to rebuild the site so that WordPress is no longer used. I like using CodeIgniter, but one issue is how we keep our SEO rankings.
The URLs in WordPress are something like www.foo.com/test-this-site.html
Is there a way to build the site in CodeIgniter but keep that URL structure? I basically need to keep all the current pages working at the same URLs.
Does anyone know if this is possible with CodeIgniter and how it may affect search ranking? Or is there a better way to go about this? Any sort of direction would be helpful.
By using URI Routing in CodeIgniter, you can customize the link structure in any way you want.
You just have to make sure you support all the same links as the WordPress site, and you'll be fine.
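CodeIgniter routes are defined in PHP, so the following is not CodeIgniter code. It is a small Python sketch of the matching logic such a route would need for the URL shape given in the question. The pattern `LEGACY_PAGE`, the function `resolve_legacy_path`, and the idea of checking against a set of known slugs are all assumptions made for illustration.

```python
import re

# Hypothetical pattern for legacy permalinks such as /test-this-site.html:
# lowercase words or digits joined by hyphens, followed by ".html".
LEGACY_PAGE = re.compile(r"^/(?P<slug>[a-z0-9]+(?:-[a-z0-9]+)*)\.html$")


def resolve_legacy_path(path, known_slugs):
    """Return the slug for a legacy path if it names a known page, else None."""
    match = LEGACY_PAGE.match(path)
    if match and match.group("slug") in known_slugs:
        return match.group("slug")
    return None
```

Running every URL from the old sitemap through a check like this before launch is one way to confirm that no existing page would start returning a 404.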

Integrate WordPress blog with Joomla website

I have a website that I developed in Joomla and a blog in WordPress. I would like to integrate that blog with my Joomla website without using wrappers or the CorePHP component.
Is there any way to do this?
I think the best thing you can hope for in this situation is going to be to have one be a subdomain of the other, and when you want one system to reference the other you'll need to link to it. Also, if you could move one of the systems to be a subfolder of the other system, you'll probably get away with having relative links.
The only other option would be to literally migrate all of your content from one system to the other manually.
What about using a feed reader in your Joomla website to show your WordPress posts?

Display Drupal content outside Drupal?

Is it possible to use Drupal to feed a few dynamic portions of a mostly static website? We have a plain old website and are looking to create a sibling site just for web-app stuff (private CMS, databasing, some forms for specific things, etc.). Some of the content we create on the sibling site (which would be Drupal), we'd like to render in areas on the primary site (non-Drupal). An example might be a news feed generator that displays on the primary site, but is actually fed from content created in the secondary site's interface.
Another potential workflow might be a Drupal installation that's located in a subdirectory of a mostly static website. A general login link could redirect users to the Drupal area, but could we get any of the content they create outside of that, modularly, so we can keep our nice rigid site design? I guess I'm looking to harness Drupal as more of a framework than a CMS.
Is any of this possible? Is this even a logical concept, or am I stupid for asking?
Thanks for any suggestions.
It is possible: you could implement custom callbacks in Drupal that your old site accesses via jQuery.
However...
Why would you do this? Drupal is a CMS for websites. If you have a static website, no matter how big, it won't be too difficult to put it into Drupal and have it look the same, even with the same URLs. You then get Drupal goodness wherever and whenever you want, very easily.
You can always access your Drupal database in your external site to display whatever Drupal content you want.
You could build RSS feeds with Views and put a simple feed parser into your static site. But again, if you want more than simple RSS syndication, you are better off planning a migration path than partial Drupal integration.
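The "simple feed parser" on the static side could look roughly like the Python sketch below. It assumes the Drupal view emits a standard RSS 2.0 document. The function names `parse_rss_items` and `render_news_list` are hypothetical, and fetching the feed (for example with `urllib.request`) is left out so the example stays self-contained.

```python
import xml.etree.ElementTree as ET
from html import escape


def parse_rss_items(rss_xml, limit=5):
    """Return up to `limit` dicts with the title and link of each RSS 2.0 item."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.findall("./channel/item")[:limit]:
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
        })
    return items


def render_news_list(items):
    """Render the items as an HTML list fragment for the static page."""
    rows = [
        '<li><a href="{}">{}</a></li>'.format(
            escape(entry["link"], quote=True), escape(entry["title"])
        )
        for entry in items
    ]
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"
```

A scheduled job could write the rendered fragment to a file that the static pages include, so the primary site never queries Drupal at request time.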
