I'm programming an ASP.NET MVC 3 application. One part of that application is a product list with faceted search, full-text search and distance search. After a while of research I found Solr and SolrNet.
I have installed Solr on Tomcat 7 and configured a DataImportHandler for indexing data from my MS SQL database.
Now I have a problem, or perhaps just an understanding problem:
The facets I want to use are stored in the database and can change at any time. Where do I have to implement the facet indexing? In the ASP.NET MVC application or in Solr's data-config?
How does Solr work in combination with SolrNet? Does Solr have to build a new index of my database for each search?
How do I make Solr index the data from SolrNet?
Do I have to rebuild the index after every change?
A lot of questions, and I would be happy if someone knows the answer to some of them.
Thank you very much and have a nice weekend!
The facets I want to use are stored in the database and can change at any time. Where do I have to implement the facet indexing? In the ASP.NET MVC application or in Solr's data-config?
You mentioned you already set up the DataImportHandler to index your data, so populating the index is just a matter of running a scheduled full-import or delta-import.
How does Solr work in combination with SolrNet? Does Solr have to build a new index of my database for each search?
No, you don't need to recreate the index for every search.
How do I make Solr index the data from SolrNet?
You mentioned you already set up the DataImportHandler to index your data; that's a valid approach to populate your index by having Solr pull data from the database. If you want to push data to your index using SolrNet instead, use the Add/AddRange methods.
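For reference, pushing documents to Solr from the application with SolrNet looks roughly like the sketch below. The Product class, its field names and the Solr URL are assumptions for illustration, not your actual schema.

    using Microsoft.Practices.ServiceLocation;
    using SolrNet;
    using SolrNet.Attributes;

    // Hypothetical document class; field names must match what you define in schema.xml.
    public class Product
    {
        [SolrUniqueKey("id")]
        public string Id { get; set; }

        [SolrField("name")]
        public string Name { get; set; }

        [SolrField("category")]
        public string Category { get; set; }
    }

    public static class ProductIndexer
    {
        public static void IndexSampleProducts()
        {
            // Startup.Init<Product>("http://localhost:8080/solr") must have been called once at application start.
            var solr = ServiceLocator.Current.GetInstance<ISolrOperations<Product>>();

            solr.Add(new Product { Id = "1", Name = "Trekking boots", Category = "shoes" });
            solr.AddRange(new[]
            {
                new Product { Id = "2", Name = "Rain jacket", Category = "clothing" },
                new Product { Id = "3", Name = "Tent", Category = "camping" }
            });
            solr.Commit(); // make the new documents visible to searches
        }
    }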
The facets I want to use are stored in the database and can change at any time. Where do I have to implement the facet indexing? In the ASP.NET MVC application or in Solr's data-config?
Solr provides faceting out of the box.
You would index the data you want to facet on as fields defined in schema.xml and enable faceting for your searches (per request or through defaults in solrconfig.xml).
On the application side you just need to process the facet data returned by Solr.
http://www.lucidimagination.com/devzone/technical-articles/faceted-search-solr
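For illustration, a faceted query with SolrNet looks roughly like this. The "category" field name and the Product document class (mapped with SolrNet attributes) are assumptions matching whatever you define in schema.xml.

    using System;
    using Microsoft.Practices.ServiceLocation;
    using SolrNet;
    using SolrNet.Commands.Parameters;

    public static class ProductSearch
    {
        public static void SearchWithFacets()
        {
            var solr = ServiceLocator.Current.GetInstance<ISolrOperations<Product>>();

            // Run a normal query and ask Solr to facet on the "category" field at the same time.
            var results = solr.Query(new SolrQuery("boots"), new QueryOptions
            {
                Rows = 10,
                Facet = new FacetParameters
                {
                    Queries = new[] { new SolrFacetFieldQuery("category") }
                }
            });

            // The facet counts come back alongside the documents.
            foreach (var facet in results.FacetFields["category"])
                Console.WriteLine("{0}: {1}", facet.Key, facet.Value);
        }
    }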
How does Solr work in combination with SolrNet? Does Solr have to build a new index of my database for each search?
Usually the client library (Java, Ruby, .NET, etc.) interacts with Solr over HTTP, executing searches and processing the results to give you easy access to them.
So every time a search happens, SolrNet queries the latest index.
Do I have to rebuild the index after every change?
With the DataImportHandler you can incrementally index your data periodically.
The timestamp handling is performed for you by Solr.
You would need jobs that perform the incremental indexing.
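Such a job only needs to hit the DataImportHandler over HTTP; for illustration (the Solr URL and core name here are assumptions):

    using System;
    using System.Net;

    public static class SolrDeltaImportJob
    {
        public static void Run()
        {
            // Trigger an incremental import; use command=full-import for a complete rebuild.
            using (var client = new WebClient())
            {
                string status = client.DownloadString(
                    "http://localhost:8080/solr/products/dataimport?command=delta-import&commit=true");
                Console.WriteLine(status);
            }
        }
    }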
However, if you need the data to be reflected with every change, you would need to index it through your application.
The DataImportHandler with Solr on Tomcat 7 should work fine. Check the JDK version, as JDK 7 had some issues with Solr/Lucene.
Related
I have a .NET 6 application that writes logs out to a SQLite file each time a cycle is completed. I'm using EF Core.
The application sits on a Raspberry Pi with limited resources, so I want to keep the live database small, as the system slows down when it gets large. So I want to archive the logs and only keep the last 50 or so in the live database.
I am hoping to create a method that will create a new SQLite database file and then copy the oldest record over when a new log is created. I'd also want to limit how big the archive file is, maybe splitting it out to create a new one once it reaches a certain size on disk.
I've had a look around and can't find any best practices or anything documented. Could someone give me a steer towards the best way to achieve this or something similar?
I solved my issue by putting EF Core aside and instead using the sqlite-net-pcl NuGet package from SQLite-net.
This allowed me to separate the creation of my archive and also apply additional commands not supported in EF Core, like VACUUM.
I still use EF Core and LINQ to query the records I create my archive with, and then again to remove those records once the archive is created.
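Roughly, the sqlite-net-pcl side looks like the sketch below. It is a simplified version with a hypothetical LogEntry model, and it does the whole move in sqlite-net-pcl for brevity, whereas in my code EF Core still does the querying and deleting on the live database.

    using System;
    using SQLite;

    // Hypothetical log model for illustration.
    public class LogEntry
    {
        [PrimaryKey, AutoIncrement]
        public int Id { get; set; }
        public DateTime Timestamp { get; set; }
        public string Message { get; set; }
    }

    public static class LogArchiver
    {
        // Moves all but the newest `keep` records into the archive file, then VACUUMs the live db.
        public static void Archive(string liveDbPath, string archiveDbPath, int keep = 50)
        {
            using (var live = new SQLiteConnection(liveDbPath))
            using (var archive = new SQLiteConnection(archiveDbPath))
            {
                archive.CreateTable<LogEntry>(); // no-op if the table already exists

                int total = live.Table<LogEntry>().Count();
                if (total <= keep)
                    return;

                var oldest = live.Table<LogEntry>()
                                 .OrderBy(l => l.Timestamp) // oldest first
                                 .Take(total - keep)
                                 .ToList();

                archive.InsertAll(oldest);
                foreach (var entry in oldest)
                    live.Delete(entry);

                live.Execute("VACUUM"); // reclaim the file space; not exposed through EF Core
            }
        }
    }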
For our stored procedures, we were using an approach that worked rather well during CD, which made use of the JavaScript v2 SDK to call container.storedProcedures.upsert. Upsert has now been removed from the API in v3, as it's not supported on partitioned collections (which are the only ones you'll be able to create from now on).
I assumed that the v3 SDK would have a way to at least delete and re-create these objects, but from what I can see it only allows creation: https://learn.microsoft.com/en-us/javascript/api/%40azure/cosmos/storedprocedures?view=azure-node-latest
We followed a similar approach for keeping the index definitions up to date, and this is the main reason we now need to migrate to the v3 SDK, as otherwise updating some kinds of indexes fails through v2.
Given that what we want (if possible) is to be able to maintain all of these objects in source control and automatically deploy them during CD, what would be the recommended way to do this?
(Meanwhile I'm exploring using these PowerShell commands for it: https://github.com/PlagueHO/CosmosDB but attempting to create a UDF through them caused a very bizarre outcome in which the Azure Portal stopped showing me any UDFs on the collection until I removed the one I had created using New-CosmosDbUserDefinedFunction.)
There are a few options today and your choices will get better here over the next couple of months.
Cosmos now has support for creating stored procedures, triggers and UDFs using ARM templates. The second sample on this page has an ARM template that shows this: Cosmos DB ARM Template Samples. The PS tool you are using is not officially supported, so you'll need to file an issue there for any questions. We will be releasing PS cmdlets to create stored procedures, triggers and UDFs, but there is no ETA to share at this time.
I am using Solr 5.4.0 in my production environment (SolrCloud mode) and am trying to automate the reload/restart process of the Solr collections based on certain specific conditions.
I noticed that on a Solr reload the thread count increases a lot, thereby resulting in increased latencies. So I read about the reload process and came to learn that while reloading:
1) Solr creates a new core internally and then assigns this core the same name as the old core. Is this correct?
2) If the above is true, then does Solr actually create a new index internally on reload?
3) If so, then a restart sounds much better than a reload, or is there any better way to upload new configs to Solr?
4) Can you point me to any docs that can give me more details about this?
Any help would be appreciated, Thank you.
If I'm not wrong, you want to restart/reload the collection in production (SolrCloud mode) and are asking for the best approach.
If possible, could you please provide more details on what causes the requirement to reload/restart the collection in production?
I'm assuming the reason could be to refresh a shared resource (for example, to pick up updated synonyms, or an added or deleted stop word) or to update the Solr config set.
Here are a few points for consideration:
If you want to update the shared resources –
Upload the resources through the Solr API and reload the collection through the Collections API (https://lucene.apache.org/solr/guide/6_6/collections-api.html#CollectionsAPI-Input.1); see the small reload sketch after these points.
If you want to update the config set –
When running in SolrCloud mode, changes made to the schema on one node will propagate to all replicas in the collection. You can pass the updateTimeoutSecs parameter with your request to set the number of seconds to wait until all replicas confirm they applied the schema updates. (I got this information from Solr 5.4.0, and it's similar to what we have in Solr 6.6 here: https://lucene.apache.org/solr/guide/6_6/schema-api.html)
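For the reload mentioned in the first point, the call is just an HTTP request to the Collections API; a minimal sketch (the host, port and collection name are assumptions, use whatever your automation runs on):

    using System;
    using System.Net;

    public static class CollectionReloader
    {
        public static void Reload(string solrBaseUrl, string collection)
        {
            // e.g. Reload("http://localhost:8983/solr", "products");
            using (var client = new WebClient())
            {
                string response = client.DownloadString(
                    solrBaseUrl + "/admin/collections?action=RELOAD&name=" + collection);
                Console.WriteLine(response);
            }
        }
    }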
1) Solr creates a new core internally and then assigns this core the same name as the old core. Is this correct?
Not sure about it. Can you please share some reference?
2) If the above is true, then does Solr create a new index internally on reload?
Not sure about it. Can you please share some reference?
3) If so, then a restart sounds much better than a reload, or is there any better way to upload new configs to Solr?
I don't agree, because a reload is effectively part of a restart; as per my understanding, a restart involves additional processes related to caching and syncing.
4) Can you point me to any docs that can give me more details about this?
Here is a link for reference guide- https://archive.apache.org/dist/lucene/solr/ref-guide/apache-solr-ref-guide-5.4.pdf
For one requirement I need to query a just-created document. If I use a Lucene search, it takes a few seconds for the indexing to happen, so the document may not appear in the search results.
The query would be executed from an Alfresco web script or from a scheduler which runs every 5 seconds.
Right now I am doing it by using NodeService and finding the child by name, which is not an efficient way to do it. I am using the Java API.
Is there any other way to do it?
Thank you
You don't mention what version of Alfresco you are using, but it looks like you are using Solr.
If you just created the document, the recommendation is to keep the reference to it, so you don't have to search for it again.
However, sometimes it is not possible to have the document reference. For example, client1 is not aware that client2 just created a document. If you are using Alfresco version 4.2 or later, you can probably enable Transactional Metadata Queries (TMQ), which allows you to perform searches against the database, so there is no Solr latency. Please review the whole section, because you need to comply with four conditions to use TMQ:
Enable the TMQ patch, so the node properties tables get indexed in the database.
Enable searches using the database whenever possible (TRANSACTIONAL_IF_POSSIBLE).
Make sure that you use the correct query language (CMIS, AFTS, db-lucene, etc.)
Your query must be supported by TMQ.
I would like to find a way to create and populate a database during the setup of an ASP.NET application.
So, what I'm willing to do is:
Create the database during the setup
Populate the database with some initial data (country codes or something like that)
Create the appropriate connection string in the configuration file
I'm using .NET 3.5 and Visual Studio 2005, and the Database is SQL Server 2005.
Thanks in advance.
If you are creating an installer I'm sure there is a way to do it in there, but I am not all that familiar with that.
Otherwise, what you might do is the following.
Add an Application_Start handler in the Global.asax; check for a valid connection string, and if it doesn't exist, continue to step two.
Log in to the server using a default connection string
Execute the needed scripts to create the database and objects needed.
Update the web.config with the connection information
The key here is determining what the "default" connection string is. Possibly a second configuration value.
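A rough sketch of that flow (the "SetupConnection" and "AppDb" keys and the script location are assumptions used only to illustrate the idea; note that a script run through a single SqlCommand must not contain GO batch separators):

    using System;
    using System.Configuration;
    using System.Data.SqlClient;
    using System.IO;
    using System.Web.Configuration;

    public class Global : System.Web.HttpApplication
    {
        protected void Application_Start(object sender, EventArgs e)
        {
            // 1. Check for a valid application connection string.
            var appConn = ConfigurationManager.ConnectionStrings["AppDb"];
            if (appConn != null && !string.IsNullOrEmpty(appConn.ConnectionString))
                return; // already set up

            // 2. Log in with the "default" (setup) connection string, which has rights to create databases.
            string setupConn = ConfigurationManager.ConnectionStrings["SetupConnection"].ConnectionString;

            // 3. Execute the script that creates the database and its objects.
            string script = File.ReadAllText(Server.MapPath("~/App_Data/CreateDatabase.sql"));
            using (var conn = new SqlConnection(setupConn))
            {
                conn.Open();
                using (var cmd = new SqlCommand(script, conn))
                    cmd.ExecuteNonQuery();
            }

            // 4. Write the real connection string back to web.config (needs write access to the file).
            var config = WebConfigurationManager.OpenWebConfiguration("~");
            config.ConnectionStrings.ConnectionStrings.Add(
                new ConnectionStringSettings("AppDb",
                    @"Data Source=.\SQLEXPRESS;Initial Catalog=MyAppDb;Integrated Security=True"));
            config.Save();
        }
    }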
Generally, you'll need to have SQL scripts to do this. I tend to do this anyway, as it makes maintaining and versioning the database much easier in the long run.
The core idea is, upon running the setup program, you'll have a custom action to execute this script. The user executing your setup will need permissions to:
Create a database
Create tables and other database-level objects in the newly-created database
Populate data
Your scripts will take care of all of that, though. You'll have a CREATE DATABASE command, the appropriate CREATE SCHEMA, CREATE TABLE, CREATE VIEW, etc. commands, and then after the schema is built, the appropriate INSERT statements to populate the data.
I normally break this into multiple scripts, but YMMV:
Create schema script
"Common scripts" (one for the equivalent of aspnet_regsql for web projects, one with the creation of the Enterprise Library logging tables and procs)
Create stored procedure script, if necessary (to be executed after the schema's created)
Populate initial data script
For future maintenance, I create upgrade scripts where a single script typically handles the entire upgrade process.
When writing the scripts, be sure to use the appropriate safety checks (IF EXISTS, etc) before creating objects. And I tend to make mine transactional, as well.
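If you go the installer route, the custom action is typically an Installer class in a class library added to the setup project; here's a rough sketch (the CustomActionData parameter names and the script file name are assumptions, not a prescribed convention):

    using System;
    using System.Collections;
    using System.ComponentModel;
    using System.Configuration.Install;
    using System.Data.SqlClient;
    using System.IO;

    [RunInstaller(true)]
    public class DatabaseInstaller : Installer
    {
        public override void Install(IDictionary stateSaver)
        {
            base.Install(stateSaver);

            // Passed in through the custom action's CustomActionData,
            // e.g. /connstr="[CONNSTR]" /scriptdir="[TARGETDIR]\"
            string connStr = Context.Parameters["connstr"];
            string scriptDir = Context.Parameters["scriptdir"];

            string script = File.ReadAllText(Path.Combine(scriptDir, "CreateSchema.sql"));

            using (var conn = new SqlConnection(connStr))
            {
                conn.Open();
                // Naive split on GO batch separators; a real script runner should handle this more robustly.
                foreach (string batch in script.Split(new[] { "\r\nGO\r\n" }, StringSplitOptions.RemoveEmptyEntries))
                {
                    using (var cmd = new SqlCommand(batch, conn))
                        cmd.ExecuteNonQuery();
                }
            }
        }
    }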
Good luck!
Well, actually I found a tutorial on MSDN: Walkthrough: Using a Custom Action to Create a Database at Installation
I'll use that and see how it goes, thanks for your help guys, I'll let you know how it goes.
If you can use Linq to Sql then this is easy.
Just import your entire database into the Linq to Sql designer. This will create objects that describe all objects in your database, including the System.Data.Linq.DataContext derived class that encapsulates the entire database setup.
Now you can call DataContext.CreateDatabase() to create the database.
See here for more information.
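A minimal sketch of that approach (MyDatabaseDataContext, the Country table and the connection string are assumptions; the DataContext is whatever the designer generated for your .dbml):

    using System.Data.Linq;

    public static class DatabaseSetup
    {
        public static void EnsureDatabase(string connectionString)
        {
            // MyDatabaseDataContext is the designer-generated DataContext for the .dbml model.
            using (var db = new MyDatabaseDataContext(connectionString))
            {
                if (db.DatabaseExists())
                    return;

                db.CreateDatabase(); // creates the database plus every mapped table

                // Seed the initial reference data, e.g. country codes (hypothetical Country table).
                db.Countries.InsertAllOnSubmit(new[]
                {
                    new Country { Code = "US", Name = "United States" },
                    new Country { Code = "DE", Name = "Germany" }
                });
                db.SubmitChanges();
            }
        }
    }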