Using Tika to search files with Solr and Drupal - drupal

I am using Apache Tika with Solr to search nodes with attachments (ZIPs, PDFs, etc.). In the DB we have indexed nodes with files. Everything appears to be configured correctly in Solr (and Tika), but the results are empty. I only get results for one PDF file that I created as a test; it contains a single word, 'MyTestNew'. Autocomplete shows it, but no results appear after submitting the search.
Search works by title but not by file contents. http://screencast.com/t/lhSvs0iJS
We think it may be a problem in the code, perhaps in module_invoke_all() for the Apache Solr search integration module, but we are not sure yet.
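One way to narrow this down is to query Solr directly, bypassing Drupal, and check whether the test word is found in the indexed file text at all. The following is a minimal Python sketch that only builds the query URL. The base URL (http://localhost:8983/solr) and the field name body are assumptions, not details taken from the setup described above, so substitute whatever your own instance and schema.xml use.

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url, term, field=None, rows=10):
    """Build a URL for Solr's standard /select handler.

    base_url and field are placeholders (assumptions): adjust them to
    match your own Solr instance and schema.xml.
    """
    # Restrict the search to one field if given, otherwise use the default field.
    query = f"{field}:{term}" if field else term
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical local Solr instance and field name.
    print(build_solr_query_url("http://localhost:8983/solr", "MyTestNew", field="body"))
```

If opening that URL returns the test document but the Drupal search page does not, the problem is more likely on the Drupal side. If it returns nothing, the extracted text probably never reached the index.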

Related

Alfresco CMIS service to index document

I have uploaded a document to Alfresco using the Apache Chemistry PHP client. The document is uploaded successfully, but I think it is not indexed: content-based search does not work on it. If I upload the same document from Alfresco Share, it is indexed and content-based search works on it. Kindly tell me about a CMIS service to index the document, or propose a solution. Thanks in advance!
I was unable to figure out the issue with the Apache Chemistry PHP client, so I switched to the Java version, i.e. the alfresco-api-java-examples project (https://github.com/jpotts/alfresco-api-java-examples). Following its CmisCreateDocumentExample, the file is now successfully uploaded and indexed.

Is it possible to upload a file in Symfony2 without referring to entities or using Doctrine?

I'm building a web application that displays specific data to the customers of my client. The client wants to populate the application with data via a CSV file, so he needs to be able to upload the file to the server, and the application then places the CSV data into the database.
I am using the Ddeboer/DataImport bundle and have managed to get it to work with a CSV file already placed on the server. The trick now is to get the CSV file onto the server in the first place.
Because I want the file to be sent directly to the server, and it won't be associated with any other records held in the database, I feel there is absolutely no need to use Doctrine at all. However, the documentation that I've encountered suggests that you should only upload files to the server via Doctrine/Database. Surely this can't be the case?
Is there a simple way of just uploading the file to a designated folder on the server? No bells or whistles, just a pure file upload in Symfony2.
Of course it is. The cookbook only contains a Doctrine-based entry, but it has a link to the file form type:
http://symfony.com/doc/current/reference/forms/types/file.html

Drupal search results filter, similar to drupal.org's search results page

Pretty new to Drupal and I want to create a search results page that includes a filter, just like the one used on drupal.org (http://drupal.org/search/apachesolr_multisitesearch/test).
I've looked around for a module to do this, but haven't been able to locate one. Perhaps I'm not searching for the correct terms, but I've not had any luck.
In case it matters, I have many nodes of content that are attached to a taxonomy of terms called "Tags". I'd like the filter to update the list of results based on the user selecting which terms to remove.
How are you accomplishing this?
TIA!
Edit:
Throwing in my install instructions for future travelers:
Go to the following website: https://hudson.apache.org/hudson/job/Solr-trunk/lastSuccessfulBuild/artifact/checkout/solr/dist/
And find the nightly (Today’s date) build and download it.
Expand and rename the downloaded folder as “apache-solr-nightly”. This should NOT live in your webroot.
Download the zip here: http://solr-php-client.googlecode.com/files/SolrPhpClient.r22.2009-11-09.zip and unzip the file.
Download the zip here: http://ftp.drupal.org/files/projects/apachesolr-6.x-1.2.zip (Or the most current 6.x version) and unzip the file.
Copy the schema.xml from the apachesolr-6.x-1.2/ that was just unzipped to the apache-solr-nightly/example/solr/conf
Copy the solrconfig.xml from apachesolr-6.x-1.2/ to apache-solr-nightly/example/solr/conf
Edit the solrconfig.xml and include the following line as a child of
<luceneMatchVersion>LUCENE_40</luceneMatchVersion>
Search for and comment out the following lines
<queryResponseWriter name="xslt" class="org.apache.solr.request.XSLTResponseWriter">
  <int name="xsltCacheLifetimeSeconds">5</int>
</queryResponseWriter>
Change to the apache-solr-nightly/example directory and execute the solr jar
java -jar start.jar
drupal.org uses Apache Solr Search Integration (together with Apache Solr, of course).

Handling file uploads in Drupal

I've been messing around a bit with various solutions to what I would see as a fairly common problem, but I've not yet been able to solve it in a satisfactory way.
What I wish to achieve is some kind of functionality where a user can upload new files, or select existing files to reuse them.
What I've been using so far is a combination of the filefield, filefield_sources, imce and ckeditor modules. I guess ckeditor isn't really important for the solution, but I need to be able to embed images from the archive somehow, and this is done with IMCE. Since I do not want everything to be accessible from the file browser, I created a subdirectory and set full access to it in the IMCE settings; let's call it default/files/site.
This worked fine as long as all file handling was done through IMCE, but when I uploaded files directly from the filefield, my files ended up in the default/files root. So I set up folders for my fields, for example default/files/site/movies for a field that allowed the .flv format. This worked fine too, as long as I didn't try to access the files through IMCE. It appears the folders created by filefield are not accessible from IMCE?
I also need to support large uploads (200MB+). In my experience from other projects, allowing file uploads through FTP is usually a life-saver, but from what I understand IMCE won't support files that were not uploaded through Drupal in some way, since they are not present in the database (giving the message: The selected file could not be used because the file does not exist in the database.)
I'm aware that I don't really have a clear question to my problem, but somehow I need to figure this out pretty fast. How would I preferably solve this? I'm aware that I'm not the first to have this problem, but I have not yet been able to find a nice and stable solution. What am I missing?
Also check this thread (http://drupal.org/node/438940) and the reference to John Locke's work at: http://www.freelock.com/blog/john-locke/2010-02/using-file-field-imported-files-drupal-drush-rescue
Well, I'm not personally familiar with IMCE off the top of my head, but if you need files that have been uploaded via FTP to be added to the files table, then my impulse would be to write a small module that lets the user click a button to start a batch process. (This is me assuming that you are using Drupal 6, as the Batch API doesn't exist in 5.)
Said batch process would then iterate over all of the files in the appropriate directory, which I would assume you had uploaded the files to, use file_copy() (from Drupal's file API) to copy the files to default/files/site, and then would add said files to the files table, which is actually quite simple with drupal_write_record().
It might not even need the Batch API; it depends somewhat on whether you're uploading just 10-30 really big files or 200-300 MB files.
For using the Batch API, I'd look at http://drupal.org/node/180528, which has a fairly basic example of how it works. It basically consists of telling the API that you want to keep calling function_a, and then inside function_a setting your progress in the context array until you're done, at which point the batch process finishes. Then you just have whoever uploads the files via FTP hit a button on the website to move and register the files.
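The approach described above can be sketched roughly as follows. This is an illustration written in Python, not Drupal code: register_uploaded_files, the chunk size and the context keys are hypothetical names, and the record keys are assumed to mirror a few columns of the Drupal 6 files table. In a real module the copy would go through file_copy() and the insert through drupal_write_record(), as mentioned above.

```python
import mimetypes
import shutil
from pathlib import Path


def register_uploaded_files(incoming_dir, target_dir, context, chunk_size=5):
    """Process one chunk of FTP-uploaded files per call.

    `context` plays the role of the batch context: it carries progress
    between calls. All names here are hypothetical, for illustration only.
    """
    incoming, target = Path(incoming_dir), Path(target_dir)
    if "pending" not in context:
        # First call: list the work to do and initialise progress.
        context["pending"] = sorted(p for p in incoming.iterdir() if p.is_file())
        context["total"] = len(context["pending"])
        context["records"] = []
    target.mkdir(parents=True, exist_ok=True)
    # Take the next chunk off the front of the pending list.
    chunk, context["pending"] = (
        context["pending"][:chunk_size],
        context["pending"][chunk_size:],
    )
    for src in chunk:
        dest = target / src.name
        shutil.copy2(src, dest)  # stands in for file_copy()
        # This dict stands in for the row written to the files table.
        context["records"].append({
            "filename": dest.name,
            "filepath": str(dest),
            "filemime": mimetypes.guess_type(dest.name)[0] or "application/octet-stream",
            "filesize": dest.stat().st_size,
        })
    done = context["total"] - len(context["pending"])
    # Report progress as a fraction; 1.0 means the batch is finished.
    context["finished"] = 1.0 if context["total"] == 0 else done / context["total"]
    return context["finished"]
```

The caller keeps invoking the function with the same context dictionary until it returns 1.0, which is the same keep-calling-until-done pattern the batch process relies on.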

How to store Blobs in Drupal?

I want to store PDF and image files in Drupal. By default Drupal seems to store binary/uploaded files in the filesystem. How can I store files so that I can have workflows, metadata, IP information and URL aliases?
Update: I am still stuck on how to create URL aliases for files on the file system. Drupal only seems to allow creating URL aliases for node content.
The simplest solution for Drupal is to use the Upload module, which is in core. The Upload module allows you to attach files to nodes. As you note, they are stored in the filesystem.
If you want to get more complex, you can build your own content types through the web interface by using the CCK family of modules (such as the filefield module or the imagefield module). Your files are still stored on the filesystem.
You could use CCK to create the metadata field you want.
You could use the workflow or rules modules to implement workflow.
IP information is recorded automatically by Drupal's logging module (either the database logging module, which logs to the database, or the syslog module, which logs to syslog).
URL aliases for nodes are available in core Drupal by enabling the path module (or download the pathauto module for automatic creation of aliases).
I tried Drupal's Upload module, FileField as well as uploading files using FCKEditor - I prefer the last one - it is closer to my specific scenario.
Creating an alias to the file will likely require some custom code.
It's not possible to simply create an alias using the Administer > Site building > URL aliases screen. I tried inserting a record with
insert into url_alias (src, dst) values ('sites/default/files/ChipotleArt.JPG', 'here-is-the-file.jpg');
That still didn't work. So, you will really likely need to write some custom code.
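Such custom code could take roughly the following shape: look up the requested path in an alias table and, if it maps to a file under the files directory, serve that file. The sketch below is in Python purely to illustrate the lookup. The function name and the alias table are hypothetical, and in Drupal this logic would presumably live in a custom module rather than in the url_alias table.

```python
from pathlib import PurePosixPath

# Hypothetical alias table: alias -> path of the file relative to the site root.
FILE_ALIASES = {
    "here-is-the-file.jpg": "sites/default/files/ChipotleArt.JPG",
}

FILES_ROOT = PurePosixPath("sites/default/files")


def resolve_file_alias(request_path, aliases=FILE_ALIASES):
    """Return the real file path for an alias, or None if there is no match.

    Only targets inside FILES_ROOT are accepted, so an alias cannot be
    used to expose arbitrary files.
    """
    target = aliases.get(request_path.strip("/"))
    if target is None:
        return None
    path = PurePosixPath(target)
    # Reject parent-directory tricks and anything outside the files directory.
    if ".." in path.parts or FILES_ROOT not in path.parents:
        return None
    return str(path)
```

A request handler would call resolve_file_alias() with the incoming path and stream the returned file, falling back to normal page handling when the result is None.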

Resources