Search multiple plone site indexes - plone

I need to implement a central search for multiple Plone sites on different servers/machines. Being able to select which sites to search would be a plus, but it is not the primary concern. A few approaches I have come across:
- Export the ZCatalog indexes to an XML file and have a crawler periodically fetch all the XML files so a search can be run on them, but this does not allow live searching.
- Use a common catalog, but that is not optimal and cannot be implemented on the sites I am working on because of some requirements.
- I read somewhere that Solr has been used for this, but I need help on how to use it.
I would prefer to use the existing ZCatalog and its indexes rather than create another index, which I think is what Solr requires, because of the extra overhead and the extra index to maintain. I will use Solr if no other solution is possible, though. I am a beginner at search, so please give as much detail as possible.

You should really look into collective.solr:
https://pypi.python.org/pypi/collective.solr/4.1.0
Searching multiple sites is a complex use case, and you most likely need a solution that scales. In the end it will require far less effort to go with Solr than to come up with your own solution. Solr is built for this kind of requirement.

As an alternative, you can also use collective.elasticindex, an extension to index Plone content into ElasticSearch, for this.
According to its documentation:
This doesn’t replace the Plone catalog with ElasticSearch, nor
interact with the Plone catalog at all, it merely index content inside
ElasticSearch when it is modified or published.
In addition to this, it provides a simple search page called
search.html that queries ElasticSearch using Javascript (so Plone is
not involved in searching) and propose the same features than the
default Plone search page. A search portlet let you redirect people to
this new search page as well.
That can be an advantage over collective.solr.

Related

MYSQL based FulltextSearchable With Elemental blocks

Is it possible to use the MySQL-based FulltextSearchable functionality to search Elemental blocks content?
Elemental has integration features with Solr, but I'm working on a few sites that are too small to justify running a server instance for search.
The DNADesign\Elemental\Extensions\ElementalPageExtension provides a getElementsForSearch() method which is the elemental blocks concatenated and includes some configuration to customise it.
For this to work in a database-based search you'll need to store that (or something like it but why re-invent the wheel?) in the database. You can do this by declaring a new db field on the page and then using something like onBeforeWrite() on the page to actually store the data.
Be aware that you can save and publish blocks separately to the page though, which will complicate matters somewhat and could lead to stale data depending on how you handle this.
Yes, MySQL full-text search is efficient and reliable. Just concatenate your blocks into a text column, separated by spaces, and index that column.
Avoiding a search server like Solr should save you a lot of pain.
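To make the "concatenate your blocks" step concrete, here is a minimal sketch of the flattening logic. It is written in Python purely for illustration; on a real site this would live in PHP (for example inside onBeforeWrite()), and getElementsForSearch() may already do the job. The function name build_search_text and the sample blocks are hypothetical.

```python
import re
from html import unescape


def build_search_text(block_html_fragments):
    """Join the text of several content blocks into one searchable string."""
    parts = []
    for fragment in block_html_fragments:
        # Replace tags with spaces, decode HTML entities, collapse whitespace.
        text = re.sub(r"<[^>]+>", " ", fragment)
        text = unescape(text)
        text = re.sub(r"\s+", " ", text).strip()
        if text:  # skip blocks that contain no visible text
            parts.append(text)
    return " ".join(parts)


# Hypothetical block contents, as they might be rendered for a page.
blocks = ["<h2>Opening hours</h2>", "<p>Call &amp; book</p>", "   "]
search_text = build_search_text(blocks)
```

The resulting string is what would be written to the extra text column that the full-text index covers.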

Possible to search through all JSPs in Adobe CQ5 repository with CRXDE?

We have a couple of relatively simple websites running on Adobe CQ 5.5 that were developed by a third party. I'm pretty familiar with how CQ works, but I'm working with somebody else's code here and I need to be able to search through all components in the system for a particular string.
The issue is that I can't seem to find a way to search across all of the various .jsp files stored with the various system components. I would have figured that the query tool in CRXDE Lite would have done the trick with something like this:
/jcr:root//*[jcr:contains(., 'Find this exact string in a JSP')] order by @jcr:score
But I've had no luck.
What I am looking for is some sort of global search that includes JSP files. Is that possible? Were I using a regular Java system, any IDE worth the download would be able to do this.
Thanks.
It might not be the easiest way, but you can use the VLT tool to check out the repository onto your filesystem. Then you can search it with whatever tool you prefer. It might even be faster in the long run.
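Once the repository is on disk, any search tool will do. As a rough sketch of that second step, the Python function below walks a checkout directory and reports every line in a .jsp file that contains a given string. The function name find_in_jsps is hypothetical, and the checkout location is whatever directory you passed to VLT.

```python
from pathlib import Path


def find_in_jsps(root, needle):
    """Yield (path, line_number, line) for each .jsp line containing needle."""
    for path in sorted(Path(root).rglob("*.jsp")):
        # errors="replace" keeps the scan going on files with odd encodings.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if needle in line:
                    yield path, line_number, line.rstrip("\n")
```

Calling list(find_in_jsps("path/to/checkout", "Find this exact string in a JSP")) gives a list of matches, where the path is a placeholder for your own checkout directory.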
I don't have the actual answer but I suppose the JSPs are indexed via a filter that strips out some of their content.
It should be possible to configure the repository to index them as is instead, based on the info at http://wiki.apache.org/jackrabbit/IndexingConfiguration and http://jackrabbit.apache.org/jackrabbit-text-extractors.html
Sorry about the vagueness of this answer - I know the basic principles but to provide the details I would need more time than I can afford now ;-)

How to index & search hierarchical nodes with solr + drupal + cck

My Drupal 6 site uses 3 custom node types that are hierarchically organized: page, book, library. I want to use Solr to index and search the content.
I want the search to return only Book nodes in the results.
But I want the search to use the contents of children (pages) and parents (libraries) when performing the search.
Can Solr be configured to index & search in this way?
Thanks!
You are going to have a couple of issues with this:
- Solr isn't hierarchical by nature; it's denormalized, so indexing a hierarchy is hard.
- You're going to have to figure out how to boost various terms/fields based on where in the hierarchy they are (is the library more important than the book, so to speak).
- Drupal has a specific configuration related to nodes, and modifying that wouldn't be easy by default.
- The Solr implementation is tightly tied to the database, so modifying the configuration would probably take a lot of effort on your part.
I would recommend you don't try to implement this, but if you do, you could look at the Apache Solr Attachments module. You would have to do something similar, basically:
- hook_modify_query to modify the actual indexing of the node
- custom theme your search results to display this hierarchy
Or you could create a single giant field with a bunch of searchable text and use that as part of your searches.
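The "single giant field" idea amounts to denormalizing the hierarchy at index time. The sketch below shows the shape of such a document in Python, not in Drupal module code. The field names (id, type, title, text) and the input dictionaries are assumptions made for illustration, not the schema the Drupal Solr integration actually uses.

```python
def build_book_document(book, pages, library):
    """Flatten a book, its child pages, and its parent library into one document."""
    parts = [book["title"], book["body"]]
    parts.extend(page["body"] for page in pages)         # children
    parts.extend([library["title"], library["body"]])    # parent
    return {
        "id": book["id"],
        "type": "book",  # only book documents are indexed, so only books are returned
        "title": book["title"],
        # One combined field holding everything the search should match on.
        "text": " ".join(part for part in parts if part),
    }


doc = build_book_document(
    {"id": 7, "title": "Atlas", "body": "maps"},
    [{"body": "page one"}, {"body": "page two"}],
    {"title": "City Library", "body": "reference"},
)
```

A single combined field cannot be boosted per hierarchy level. If that matters, keeping the page and library text in separate fields would be the variation to try.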

Module Multi-instance in OrchardCMS

Assuming I have a contacts Orchard module that manages contacts,
can I have two instances like so:
mysite.com/WorkContacts/...
mySite.com/HomeContacts/....
and have the data partitioned by instance/location type etc.?
I assume it should be possible, but I want to be sure before I dig any deeper.
It's not possible by default (although I'm not saying it's impossible).
Each module has its unique, hardcoded Id, which prevents multi-instancing of modules by design. There are also many other reasons why it wouldn't be a good idea...
Achieving such behavior is possible of course, but in a slightly different way. As Orchard is mainly about content, you are free to build your own, different content types for different contact types from existing parts and fields. And then you're free to create instances of those. It's described very well here.
HTH
This would probably be better asked over on the Orchard sites.
If you look at the blogs functionality, you can have multiple of those; following a similar pattern of code, you could have multiple instances of the contacts module.
The path /HomeContacts ... etc would be set through the routing functionality of Orchard.
I think what you're looking for might be the multi-tenancy module, available from the gallery. The only difference from what you describe is that the instances would need different server names rather than subfolders like you described.
Then again it's not quite clear whether you only want to separate just the data for that module (in which case the suggestion to model it after blog is a good one) or for the whole site (that would be multi-tenancy).

Using Solr for multiple sites

I have set up a Solr server. Now I have two sites that I want to index and search using SolrNet.
How do I differentiate the two sites' content in Solr?
You may want to take a look at this document: http://wiki.apache.org/solr/MultipleIndexes
I think the best approach is to use Multiple Solr Cores.
Another option is to simply add a new field that indicates the item's website. For example, you can add a field called type.
Searches on website1.com would require you to filter on the type field.
&fq=type:website1.com
That way you only need to deal with one core and one schema.xml file. This works if the pages of both sites have a very similar field set, and it will make it easier to search across both sites if you plan on doing that.
http://wiki.apache.org/solr/MultipleIndexes#Flattening_Data_Into_a_Single_Index
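As a small illustration of the single-core approach, the sketch below builds a Solr select URL that optionally restricts results to one site through the fq parameter. It is Python rather than SolrNet, and the base URL, the core name sites, and the function name build_select_url are assumptions. Only the type field and the fq filter come from the answer above.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, site=None):
    """Build a Solr /select URL, optionally filtered to a single site."""
    params = [("q", query), ("wt", "json")]
    if site is not None:
        # Quote the value so it is matched as one literal term.
        params.append(("fq", f'type:"{site}"'))
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical Solr location and core name.
one_site = build_select_url("http://localhost:8983/solr/sites", "plone", "website1.com")
all_sites = build_select_url("http://localhost:8983/solr/sites", "plone")
```

Leaving the filter out searches both sites at once, which is the main convenience of keeping everything in one index.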

Resources