Create and Consume Collections in Solr 4 - Alfresco

Someone suggested that I create multiple collections in Solr (Lucene) to get faster responses from Solr. I am new to this area, so can anyone please explain how to create collections in Solr and how Alfresco can consume those collections?
Note: I am using Solr 4 and Alfresco 5.2.
Thanks in advance!!

First of all, even though Alfresco uses Solr, you don't have access to all of its functionality. I don't think creating multiple collections is possible in the Alfresco product.
The closest thing to what you are talking about is sharding, which lets you split your index into smaller ones to improve performance: http://docs.alfresco.com/5.1/concepts/solr-shard-config.html.
The documentation explains the mechanism well; you have the choice between manual and dynamic sharding.
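If you go the manual route, the shard layout is declared in each core's solrcore.properties. A minimal sketch for the first of two shards might look like this (the property names follow the Alfresco sharding documentation, but double-check them against your exact version):

shard.method=DB_ID
shard.instance=0
shard.count=2

With DB_ID, nodes are assigned to shards by database ID, which keeps the shards evenly sized; the docs list other methods (ACL_ID, DATE, etc.) if you need a different distribution.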
Finally, if you are looking for better performance, Alfresco replication is another lever for improving your Alfresco responsiveness.

Related

Meteor Approach to Collections, Considering Joins Aren't Supported by Core Yet?

I'm building my project's main functionality right now, so this is a big decision and I want an efficient, scalable solution. I use different APIs to fetch a user's products, ultimately feeding one collection that displays product information in a table, possibly merging products from different sources by SKU/TITLE.
I have thought of two approaches (in both, we add Meteor.userId() to each collection insert so each user has their own products):
1) Give each API its own collection and fetch its products into it. After (or in the middle of) the API query, where I insert into sourceXProducts, I also run the merge-by-SKU logic and add only the fields I need to the main usersProducts. Since we keep the sourceXProducts collections, if we ever need anything we didn't include in usersProducts we can still query for it, so we basically keep all the information possible (because it can come in handy).
source1Products = new Meteor.Collection('source1Products');
source2Products = new Meteor.Collection('source2Products');
usersProducts = new Meteor.Collection('usersProducts');
Pros: Honestly I'm not sure. It keeps things organized, and from the way I learned Meteor this pattern seems to be used a lot.
Cons: Meteor collection joins are not supported in core yet, so I would have to use a package such as meteor-publish-composite, which seems good but might hurt performance (a rough sketch of such a join follows after this list).
2) Create one collection, insert everything the API response contains, and add an apiSource field so we can select the products of user X from API X.
usersProducts = new Meteor.Collection('usersProducts');
Pros: No joins, possibly better performance
Cons: Not as organized, and it can become a large collection; maybe that's not good for MongoDB.
3) Your ideas? :)
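To make the trade-off concrete, here are rough sketches of both approaches. The publication name and the sku/title/apiSource fields are invented for illustration, and approach 1 assumes the meteor-publish-composite package mentioned above.

// Approach 1: publish the merged products plus their raw source
// documents with a composite (reactive join) publication.
Meteor.publishComposite('userProductsWithSources', function () {
  return {
    find: function () {
      // Top level: the merged products of the current user.
      return usersProducts.find({ userId: this.userId });
    },
    children: [{
      find: function (product) {
        // For each merged product, also publish the raw documents
        // from the source collection that share its SKU.
        return source1Products.find({ sku: product.sku });
      }
    }]
  };
});

Approach 2 needs no join at all; one insert and one filtered find cover the "products of user X from API X" case:

// Approach 2: a single collection with a discriminator field.
usersProducts.insert({
  userId: Meteor.userId(),
  apiSource: 'source1',   // which API this document came from
  sku: 'ABC-123',
  title: 'Example product'
});

// Later: only the current user's products from one particular API.
usersProducts.find({ userId: Meteor.userId(), apiSource: 'source1' });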
First, you should improve the question. You do not tell us anything precise about your schema. What entities do you have, what relations exist between them, and what type of joins do you think you will be doing? How often will you be doing them?
Second, you should rethink your schema and think in terms of a non-relational database. I see many people coming from the SQL world who then simply design their schema the same way. Wrong. MongoDB is not SQL, and you should not try to just reuse what you learned there. Start using features like subdocuments and arrays, which can solve many of the basic things you would do with joins in SQL. So knowing your schema would help us help you design it. For example, see this answer, and especially the comments, for a discussion of a question similar to the one you are asking here.
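To illustrate that point with a minimal, hypothetical example (the posts collection and its fields are invented): instead of a separate comments collection that you would have to join against, MongoDB lets you embed the comments and tags directly in the parent document:

Posts = new Meteor.Collection('posts');

Posts.insert({
  title: 'Denormalization in MongoDB',
  tags: ['mongodb', 'schema'],               // array instead of a join table
  comments: [                                // subdocuments instead of a joined collection
    { author: 'alice', text: 'Nice post' },
    { author: 'bob', text: 'Thanks!' }
  ]
});

// One query returns the post together with its comments and tags.
Posts.findOne({ tags: 'mongodb' });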
Third, have you evaluated the various solutions that exist out there? There are many, but you have not shown us that you tried any of them or how they worked for you. What were their pros and cons, for you and your project?
Fourth, if you are too lazy to evaluate them, you can just use peerlibrary:peerdb and peerlibrary:related. They are simply perfect. You should trust me: I am their author.

Database table/field mapping in SilverStripe, integrating additional database

I'm well aware of the standard SilverStripe data structure and table/field naming conventions. But how do you integrate SilverStripe with a pre-existing database? Is there any way to map existing tables/fields with a different naming convention so they are usable by the SilverStripe ORM and DataObjects? Also, is it possible to use the ORM with two different databases?
In a recent project I had the same issue, and I solved it by creating views in the SS database over the CRM database, in order to present the data to SilverStripe in the way it likes. Obviously I also created DataObjects mapping the data, so no dev/build is needed. It's not an easy way to do it, but if you're lucky and the second database's logic is similar to SS logic, it's a feasible task.
Now I have a CRM that writes data into its database with its own logic, and SS reads that data through views as if it were its own DataObjects.
Good luck :)
I am afraid that, as far as I know, the answer to both questions is no.
I guess the best option would be to write an importer that connects to the old database, fetches the data, and then creates SilverStripe objects from it.
If you have to run both systems at the same time, it will become tricky. The first thing I would consider here would probably be a REST API between the two systems, but I'm not sure how well that would work out.

Is it possible to replace Apache Solr in BroadleafCommerce with Elasticsearch?

I am trying to use BroadleafCommerce and customize it. On studying it, I found that it uses Apache Solr. However, I am already handy with Elasticsearch, as it is what I currently use at my workplace. So I'm curious whether I can replace that customizable part of BroadleafCommerce with Elasticsearch. If it is possible, I also want to know how long it would take and what its difficulty level would be.
Thanks in advance!
The product is open source, so you can have a look at the code yourself. Here is the package that would need to be made Solr-independent. As far as I can see there are quite a few dependencies on Solr right now, but maybe you can give it a shot and contribute the result back. In the end, that's the power of open source.
I can't tell exactly how much work that would be, since I don't know the product or what it does with the data. The Solr schema would need to be translated to the corresponding Elasticsearch mapping; then the indexer would need to be converted to push data to Elasticsearch (otherwise, if technically doable, you could write a river that imports data into Elasticsearch from the framework itself). The last step is to convert the search code, together with the facets, highlighting, etc.
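To give a feel for the schema-translation step: a Solr field such as <field name="manufacturer" type="string"/> (an unanalyzed string) roughly corresponds to a keyword field in an Elasticsearch mapping. A minimal, invented sketch (the exact JSON shape depends on your Elasticsearch version, and the field names are purely illustrative):

{
  "mappings": {
    "properties": {
      "productName": { "type": "text" },
      "manufacturer": { "type": "keyword" },
      "price": { "type": "double" }
    }
  }
}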
Maybe you (or the people behind the project) might want to have a look at Spring Data, which now has a community-driven spring-data-solr project and an unofficial Elasticsearch implementation too.

TinkerPop Blueprints and Frames - How to treat data as collections

I found this question: How to store and retrieve different types of Vertices with the Tinkerpop/Blueprints graph API?
But its answer isn't clear to me.
How do I query the 10 most recent articles versus the 10 most recently registered users for an API, for instance? Put graphing aside for a moment, as this stack must be able to handle both a collection/document paradigm and a graphing paradigm (relations between elements of these types of collections). I've read everything, including the source, and am getting closer, but I'm not quite there.
Closest document I've read is the multitenant article here: http://architects.dzone.com/articles/multitenant-graph-applications
But it focuses on using Gremlin for graph segregation and querying, which I'm hoping to avoid until I require analysis on the graphs.
I'm considering using Cassandra/Hadoop at this point, with a reference to the graph ID, but I can see this biting me down the road.
Indexes in TinkerPop/Blueprints only support simple lookups based on exact matches of property values.
If you want to find the 10 most recent articles, or articles containing the phrase 'Foo Bar', you may have to maintain an external index using Lucene or Elasticsearch. These technologies use inverted indexes, which support term/phrase lookups, range queries, wildcard searches, etc. You can store vertex.getId() as a field in the indexed document to link back to the vertex in the graph database.
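For illustration, here is a rough sketch of that pattern in JavaScript. The Elasticsearch client calls, the articles index, and the graph.getVertex/vertex.getId lookups are assumptions standing in for whatever index and Blueprints implementation you actually use (the response shape also varies between client versions):

// Index each article in the external index, storing its graph ID.
function indexArticle(client, vertex) {
  return client.index({
    index: 'articles',
    body: {
      vertexId: vertex.getId(),            // link back to the graph
      title: vertex.getProperty('title'),
      createdAt: Date.now()
    }
  });
}

// Ask the fancy index, not the graph, for the 10 most recent articles,
// then resolve each hit back to its vertex in the graph database.
function tenMostRecentArticles(client, graph) {
  return client.search({
    index: 'articles',
    body: { size: 10, sort: [{ createdAt: 'desc' }] }
  }).then(function (result) {
    return result.hits.hits.map(function (hit) {
      return graph.getVertex(hit._source.vertexId);
    });
  });
}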
I have implemented something like this for my application, which uses a Blueprints database (Bitsy) and a fancy index (Lucene). I have documented the high-level design for (a) keeping the fancy index up to date using batch updates every few seconds and (b) ensuring transactional consistency across the graph database and the fancy index. Hope this helps.
Answering my own question: the answer is indices.
https://github.com/tinkerpop/blueprints/wiki/Graph-Indices

Storing and downloading Data in iOS Applications

I am a bit new to iOS development, and I was wondering if someone could point me in the right direction regarding an application I am working on.
I am currently working on an application that will display product lists and categories. The list is updated on a weekly basis (once every week).
I am now trying to decide two things:
1- What's the best method of storing this data? I am looking for a way that will allow me to replace the data in the application once every week.
2- Is it going to be beneficial to use Core Data? Note that I only have Product Category, Product, and Product Information entities.
Appreciate your support.
I would use Core Data, because I know Core Data and am used to working with it. But this is clearly very much like using a chainsaw to cut a slice of bread.
As I understand it, you're not familiar with Core Data. Maybe it's not the right tool for the job, considering the learning curve.
In your case I would simply use JSON files as provided by the server.
That said, if you're looking into Core Data anyway, any store type will do: atomic, XML, or SQLite. The first two load the whole data set into memory, and queries are done in memory as well. SQLite provides the benefits usually associated with databases, at slightly increased complexity. A chainsaw.
I would use Core Data. If you haven't worked with Core Data before, learn it. It's a great framework.
