MYSQL based FulltextSearchable With Elemental blocks - silverstripe

Is it possible to use the mysql based fulltextsearchable functionality to search elemental blocks content?
Elemental has integration features with Solr, but I'm working on a few sites that are too small to justify running a server instance for search.

The DNADesign\Elemental\Extensions\ElementalPageExtension provides a getElementsForSearch() method which is the elemental blocks concatenated and includes some configuration to customise it.
For this to work in a database-based search you'll need to store that (or something like it but why re-invent the wheel?) in the database. You can do this by declaring a new db field on the page and then using something like onBeforeWrite() on the page to actually store the data.
Be aware that you can save and publish blocks separately to the page though, which will complicate matters somewhat and could lead to stale data depending on how you handle this.

Yes MySql full text search is efficient and reliable, just concatenate your blocks in a text column separated by spaces and index this column.
Avoiding using a search server like Solr should save you lots of pain.

Related

Search multiple plone site indexes

I need to implement a central search for multiple plone sites on different servers/machines.If there is a way to select which sites to search would be a plus but not the primary concern.Few ways I came upon to go about this:
-Export the ZCatalog indexes to an XML file and use a crawler periodically to get all the XML files so a search can be done on them,but this way does not allow for live searching.
-There is a way to use a common catalog but its not optimal and cannot be implemented on the sites i am working on because of some requirements.
-I read somewhere that they used solr but i need help on how to use it.
But I need a way to use the existing ZCatalog and index and not create another index as i think is the case with using solr due to the extra overheads and the extra index required to be maintained.But will use it if no other solution possible.I am a beginner at searching so please give details as much as possible.
You should really look into collective.solr:
https://pypi.python.org/pypi/collective.solr/4.1.0
Searching multiple sites is a complex use case and you most likely need a solution that scales. In the end it will require far less effort to go with Solr instead of coming up with your own solution. Solr is build for these kind of requirements.
As an alternative, you can also use collective.elasticindex, an extension to index Plone content into ElasticSearch, for this.
According to its documentation:
This doesn’t replace the Plone catalog with ElasticSearch, nor
interact with the Plone catalog at all, it merely index content inside
ElasticSearch when it is modified or published.
In addition to this, it provides a simple search page called
search.html that queries ElasticSearch using Javascript (so Plone is
not involved in searching) and propose the same features than the
default Plone search page. A search portlet let you redirect people to
this new search page as well.
That can be and advantage over collective.solr.

How to feed Word 2010 (.docx) documents/templates with data from MySQL database?

What would be the best approach to replace placeholders in a .docx document (Word 2010) with data coming from a MySQL database?
Can I just open the file using a server side language and do a string replace per each placeholder?
Is there any existing tool/library available?
Thanks
Disclosure: I work for Invantive.
Using Invantive Composition (http://www.invantive.com/products/invantive-composition) you can fill Word documents (letters, legal pleadings, insurancy policies) with data from a database (IBM DB2, Oracle, MySQL, Teradata and SQL Server) and then fully change the contents at will manually. It is intended for real Microsoft Word end-users (both the guys that make the template and the ones that use it) that access the databases through a central webservice and models with queries. Invantive Composition allows nested repeating groups of data and lay-out. Integrates into Microsoft Word using click once.
In the past, I personally have also been using JasperReports (http://community.jaspersoft.com/project/jasperreports-library) to generate letters using the RTF output target of JasperReports. It is free and works fine as long as you do not want to edit the output more than a few words and have Java/SQL development skills. Just as Invantive Composition it works fine for large numbers of different reports.
As long as you can control the environment completely, you can also consider using RTF as intermediate language (not for end-users, only real developers). Save document as RTF, replace parts of the text you need to be replacable, write a webservice that accepts the parameter and dumps back the resulting RTF. Takes some time to generate more complex tables (tables are obviously something invented by the human race after the RTF specification was written :-) This approach only works with very limited number of templates and when you have sufficient developer time available to get it up and running and stabilized.
As an independent reviewer, I have also seen cases where XML templates were used, but the results were not as good as with JasperReports.
**Disclosure: I lead the docx4j project **
There are heaps of existing tools/libraries available!
Yes, you can just do a string replace, but that is a brittle approach, since Word may have split the string across runs.
You can use MERGEFIELDs, or content control data binding.
docx4j supports all three approaches, but content control data binding is the most powerful.
ContentControlsMergeXML
MERGEFIELDs
VariableReplace
One thing to consider especially is "repeats". If you want say a row of a table in Word, for each matching row in your MySQL table, then you need a way to make this happen.
docx4j does this with a "repeat" content control around the table row; whichever solution you choose, I'd make sure up front that it can handle repeats.
If you want to use PHP the most complete available solution is PHPDocX.
You may check in the tutorial how to substitute placeholder variables by data coming from any data source (like a MySQL DB).
In particular, you may populate table rows with an indefinite number of entries and you may delete whole blocks of the Word document depending on the data fed to the application or build dynamical Word charts.
You may check the available DEMO for a simple but quite illustrative example (its inner workings are explained in the tutorial section).
You can use open Open XML SDK and replace your placeholders like this.
Disclosure: I lead the docxgenjs project
I think you shouldn't have to code everything by yourself, that's why I created a Mustache-like templating engine for docx
Demo:
http://javascript-ninja.fr/docxgenjs/examples/demo.html
Repo
https://github.com/edi9999/docxgenjs
It is JS-based and works client and server side.
Yes, you can use server side language to do it.
Check on apache POI.
http://poi.apache.org
Hello I read the above esp the comments and Ivantive looks impressive - but the solution I needed was much simpler. Use Selection.Range.InsertDatabase in Word to fetch records from an access database or excel spreadsheet or even just another word document. With the access solution you can choose the layout of the records to fetch and have it fetch just particular recordds based on a field (eg ID). Google the words above and it'll take you to MS guidance and an example VB script. Worked well in just a few mins. Now looking for VB script that asks the person what ID they want from the dbase and we're done.
it uses docx templates that have merge fields with java objects (the objects have the information you load from mysql or any other source). The xdoc report is an project for java language, the home page of the project is https://code.google.com/p/xdocreport/.
*Disclosure: I create the templ4docx project *
Hello
You can use templ4docx java library, which is on maven central repository, so you can just add it to your maven dependencies:
<dependency>
<groupId>pl.jsolve</groupId>
<artifactId>templ4docx</artifactId>
<version>2.0.0</version>
</dependency>
Example usage:
Docx docx = new Docx("E:\\template.docx");
Variables variables = new Variables();
variables.addTextVariable(new TextVariable("${firstName}", "John"));
variables.addTextVariable(new TextVariable("${lastName}", "Sky"));
docx.fillTemplate(variables);
docx.save("E:\\filledTemplate.docx");
More details you can find here: http://jsolve.github.io/java/templ4docx/

How do I create a feed that transfers nodes between Drupal 7 websites?

I have a Drupal 7 website with content types like "events" and "news".
I would like for nodes of these content types to be automatically imported into other websites.
I played with Feeds, XPath on the 'client' websites and Views RSS fields' onn the 'server' side, but I realized that there would be problems with content type fields like files... Any suggestions? I would like to be able to create new views for this content in the other websites.
P.S. The content types will be identical between the websites (but they don't have to, if your solution includes something else).
You probably have more success with services and content Distribution. RSS feeds are not well suited for transfer of semantic data. They are highly focused on lists-of-articles and typically lack information such as "event-start-date".
Services allows you to expose services on the server-drupal-site, exposing the nodes as e.g. RESTfull json. The client side-drupalsite can then use services and content-distribution to import nodes from said server.
That said, the services suit plugs into views, and is really heavy, large and complex. If you are allergic to large and complex projects (like I am), you may be better of writing simple modules:
events-service: a 20+ lines module that grabs the events from the database, and presents them as json.
news-service: a 10+ lines module that fetches a list of news-nodes and presents them as json.
events-client: a small module (~400-800 lines?) that eats said json at the given url and turns them into nodes. It will keep a register of some UUID next to the nodes table, to avoid re-creating nodes on changes upstream (but instead finding the associated one and updating that).
news-client: a small module. Same as above.
Writing such modules is very rewarding, because instead of fighting with poorly documented views-plugins, complex layers around services and such, you have full control and full understanding. It also allows for a lot better tuning and performance.
The one large downside is that Drupal, more specific: CCK or Fields, dictate the database and its structure. There will be a point when some tiny config change on your site breaks your modules SQL query: all of a sudden you are blasting SQL errors because Drupal decided to rename or move out some table, column or reference.
Maybe you can just share data by creating xmls/json (server side) that will be used by the client side.
services is a good way to go. But I find it complex for simple stuffs.
What you can do is create views that will output as xml/json... You can do this by doing preprocess functions in your module/template file.
After which the client side (maybe run cron) will take the xml/json and create nodes programmatically.

Using Solr for multiple sites

I have setup a Solr server, now, I have two sites that I want to index and search using SolrNet.
How do I differentiate the two sites' content in Solr?
You may want to take a look at this document: http://wiki.apache.org/solr/MultipleIndexes
I think the best approach is to use Multiple Solr Cores.
Another option is you can simply add a new field that indicates the item's Web site. For example, you can add a field called type.
Searches on website1.com would require you to filter on the type field.
&fq=type:website1.com
That way you only need to deal with one core and one schema.xml file. This works if the pages of both sites have a very similar field set, and it will make it easier to search across both sites if you plan on doing that.
http://wiki.apache.org/solr/MultipleIndexes#Flattening_Data_Into_a_Single_Index

Best way to create a search function ASP.NET and SQL server

I have an SQL database with multiple tables, and I am working on creating a searching feature. Other than having multiple queries for the different tables, is there a different way to go about said searching function?
I should probably add that a lot of my content is database driven to make upkeep easier. Lucene will not work for this, correct?
Different approaches to consider:
1) Multiple queries pre-baked, like you described.
2) Dynamic sql that you put together on the fly based on user-entered criteria.
3) If text is involved, based on SQL Server full text search or Lucene.
In my open source app BugTracker.NET, I do both 2 and 3 (using Lucene.NET).
I documented how I use Lucene.NET here:
http://www.ifdefined.com/blog/post/2009/02/Full-Text-Search-in-ASPNET-using-LuceneNET.aspx
Since you have tagged the question with Asp.net I suppose you want to search your webpages. In that case you can use Indexing Server to perform freetext searches easily that search the generated html and any keywords you have set up.
As Corey Trager suggested, using Lucene.NET is also an option. It has a good reputation of being fast and quite easy to use.
Although the other answers provide good suggestions such as using Lucene, I have much preferred using a custom caching method.
So for a website that I help create, we cached the searchable data every couple of hours, from many tables, into one simple table with columns such as:
URL
Item/Page Name
Main Keywords
Text Only Contents
Date Updated
I would then write my SQL statement to search this field using different functions to determin the rank.
You might want to check out this post i wrote on writing full text queries, its in C#, but its easilly portable, or just stick it in a library and use it as it.
How to build an SQL full text index search term in c#

Resources