Does anyone know of a way to include PDF documents in the search for Drupal 7?
I can't find anything to achieve this.
The Apache Solr Attachments module does this, but currently only has a development (not stable) release for Drupal 7.
Check out the Search API module and its sibling module Search API Attachments. It uses Apache Tika to parse text from documents.
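To give a feel for what the Tika step does, here is a minimal sketch that calls the Tika command-line app to pull plain text out of a document. It assumes Java is installed and that a Tika app jar is available locally; the jar path `tika-app.jar` and the helper names are placeholders of mine, not something the Drupal module exposes.

```python
import subprocess


def tika_text_command(document, tika_jar="tika-app.jar"):
    """Build the argv for Tika's command-line plain-text extraction."""
    # "--text" asks the Tika app to print the extracted plain text to stdout.
    return ["java", "-jar", str(tika_jar), "--text", str(document)]


def extract_text(document, tika_jar="tika-app.jar"):
    """Run Tika and return the extracted text (needs Java and the jar)."""
    result = subprocess.run(
        tika_text_command(document, tika_jar),
        capture_output=True,
        text=True,
        check=True,  # raise if Tika exits with an error
    )
    return result.stdout
```

The extracted text is what ends up in the search index alongside the node or file it belongs to.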
Another option:
https://drupal.org/project/search_files
This is slightly lighter weight, but given SOLR's power and community adoption, SOLR is probably still the best return for effort (as a service or installed yourself), whether using Apache SOLR modules or Search API modules.
I have a question regarding the search module of DNN. I am currently using DNN Community edition 7.4 and I want to index PDF documents so that I can search the content of the PDF files.
After a lot of research I came to the following conclusions:
By default DNN does not support this. You can enable it by using external modules such as DNN Search Engine and Search Boost 3.2.
The Community edition does not allow indexing documents, only pages. The professional edition does include indexing of documents as well as pages.
Are these conclusions valid? If I upgrade my Community edition, can I then index documents, and if I keep using the Community edition, is buying external modules my only option?
You are correct.
The Evoq (professional) editions have three search crawler schedule tasks that index different types of content:
File Crawler will index any supported document file types in the File Manager to include in the search results. This crawler will parse files such as PDF and Office documents.
Site Crawler will index module content from HTML modules as well as any third-party modules that support the search integration.
Url Crawler will crawl and parse HTML pages, typically on external URLs where you cannot get the content directly from the CMS with the Site Crawler.
The DNN Community edition only comes with the Site Crawler. So if you need file parsing and indexing, you will need to use a third-party module like Searchboost or upgrade to one of the Evoq editions.
I've not dug into the details of what and how SDL Tridion stores data in its internal search engine (SOLR), but I need to build a GUI extension that performs searches on component/metadata fields across publications.
I can't see any reason not to have a look into SOLR, but before I invest the time, does anyone know any reason why this would be a bad idea?
Thanks in advance!
It's a bad idea in general to bypass the API and directly query SOLR.
From your question, I see no reason to do so.
Do you need to index more data than what is already indexed by Tridion?
If not, surely you can just search using the API?
If you do, you could consider implementing a custom Search Indexing Handler for the additional data. Although this is not very well documented at the moment, it seems rather straightforward to create (implement ISearchIndexingHandler and update your CM and SOLR configuration). The benefit would be that your data can also be searched for using the standard Tridion search.
It really depends on your search requirements. If it's just about simple search, then it's probably fine, but if you want to make some Tridion-specific searches then it will be quite difficult, as SDL Tridion does a lot of post-processing on SOLR results. Why can't you just use CoreService and have a convenient, supported search interface?
As Peter said, it's really a bad idea to interact directly with the SOLR instance that comes with Tridion. Tridion has an abstraction layer to hide the complexity of SOLR queries. For example, Tridion hides the case sensitivity of the search keyword.
I strongly recommend using the Tridion search API to build your interface. The Tridion search API also supports executing SOLR queries directly, but that is not recommended.
For indexing additional data you can implement ISearchIndexingHandler. It involves some complexity with the SOLR config files (adding new fields).
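To illustrate the kind of detail such a layer takes off your hands, here is a rough sketch of keyword normalization in front of a Lucene/SOLR-style query parser. This is not Tridion's actual code; the function name and the choice to lowercase the keyword are assumptions made for the example. Only the list of special query characters comes from the Lucene query syntax.

```python
import re

# Characters and operators with special meaning in the Lucene/SOLR query syntax.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/]|&&|\|\|)')


def normalize_keyword(keyword):
    """Lowercase a user keyword and backslash-escape query syntax characters.

    Hypothetical helper: it only illustrates what an abstraction layer
    might do before handing a keyword to the search engine.
    """
    lowered = keyword.strip().lower()
    return _SPECIAL.sub(r"\\\1", lowered)
```

If you query SOLR directly, details like this become your responsibility, which is one more reason to stay with the supported API.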
I just took over a Drupal-based website in which I implemented a forum. Now I need to add an advanced search tool, but I haven’t found out how. Could you help me do it? For example, are there any new plugins I need to download?
If you want any kind of advanced search, you can look into Solr. There are many integration modules for Drupal and Solr. For example, you can use Apache Solr Search. I use neither this module nor Drupal itself, but Solr is a very good solution for indexing and searching.
Faceted Search module is not being moved to Drupal 7.
What are my alternatives?
The Search API module is new for Drupal 7, and allows you to choose from a variety of backends, including Solr, Xapian, native database and others. It supports faceted search regardless of which backend is used.
One option is an Apache Solr integration, http://drupal.org/project/apachesolr, or a self-written Lucene search module.
I have set up a faceted search in Drupal 7 using the Search API, which has a search facets module. You will also need Entity API. The facets get created as blocks.
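For anyone unsure what a facet actually is, here is a small backend-independent sketch: a facet is a count of results per value of some field, and clicking a facet narrows the results to one value. The field names and sample data are made up for the illustration; this is not how the Search API module is implemented.

```python
from collections import Counter


def facet_counts(documents, field):
    """Count how many documents carry each value of `field`."""
    counts = Counter()
    for doc in documents:
        value = doc.get(field)
        if value is not None:
            counts[value] += 1
    return counts


def filter_by_facet(documents, field, value):
    """Keep only the documents whose `field` equals the chosen facet value."""
    return [doc for doc in documents if doc.get(field) == value]


# Hypothetical search results.
results = [
    {"title": "Install guide", "type": "page", "author": "ann"},
    {"title": "Release notes", "type": "article", "author": "bob"},
    {"title": "Upgrade guide", "type": "page", "author": "bob"},
]

type_facet = facet_counts(results, "type")
by_bob = filter_by_facet(results, "author", "bob")
```

A search backend such as Solr computes these counts on the server side, so the site only has to render them.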
Although this goes on to use the Solr module, which I didn't use, I found this video very useful for some pointers on how to set up a search using search api.
http://vimeo.com/15556855
For anyone arriving here from a search engine: you can use the Facet API module: https://www.drupal.org/project/facetapi
It's a very easy module to create faceted search in Drupal.
(http://beautiful7mind.wordpress.com/2013/03/10/step-by-step-how-to-implement-facet-search-on-data-in-drupal-7/)
I would like to find out if there is wiki software that runs on SQLite.
Sure is.
Instiki
Instiki (What Is Instiki) is a basic Wiki clone so pretty and easy to set up, you’ll wonder if it’s really a wiki. Runs on Rails and focuses on portability and stability. Supports file uploads, PDF export, RSS, multiple users and password protection. Some use Instiki as a CMS (Content Management System) because of its ability to export static pages.
Fossil, which is a DVCS with many other features (including a wiki), is built on top of SQLite (in fact, it was made by D. Richard Hipp, who wrote SQLite). It can be found at: http://www.fossil-scm.org/index.html/doc/tip/www/index.wiki
According to the SQLite Users Page, the CVSTrac program uses SQLite as its internal database (you have to follow the link to the CVSTrac site to see the details). I'm sure you can find others by hunting around the site.
Wiki::Toolkit can use SQLite as its database.
MediaWiki does, but its SQLite support is only in the development stage.
Any of the Django ones should be able to, because Django is flexible about which Database backend it uses and SQLite is one of the options.
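As a sketch of what that looks like in practice, this is the usual shape of the database setting in a Django project's settings.py when SQLite is the backend. The file name `wiki.sqlite3` is just a placeholder; a given wiki app may ship its own settings template.

```python
# Fragment of a Django settings.py, assuming the wiki app uses the
# project's default database connection.
DATABASES = {
    "default": {
        # Django's built-in SQLite backend.
        "ENGINE": "django.db.backends.sqlite3",
        # Path to the database file (placeholder name).
        "NAME": "wiki.sqlite3",
    }
}
```

Switching to another database later is then a matter of changing this setting rather than the wiki code.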