What is the best way to create a site search engine for a dynamic ASP.NET site with hundreds of dynamic pages? I have seen many products and articles:
http://www.karamasoft.com/UltimateSearch/overview.aspx
http://www.sitesearchasp.net
http://www.easysearchasp.net/
http://msdn.microsoft.com/en-us/magazine/cc163355.aspx
http://www.codeproject.com/KB/asp/indexserver.aspx
Priyan,
Another high-quality open-source option would be the .NET port of Lucene
CodeProject - Introducing Lucene
dotlucene
lucene.net
You haven't mentioned Google's SiteSearch "product". Is one of your requirements that you'd like to host the search engine/catalog yourself?
Microsoft also has a product, Search Server 2008 Express, although I'm not sure whether you can install it on any hosting provider.
And (disclaimer: I am the author) there is also a very basic open source project on CodeProject called Searcharoo (also at searcharoo.net). It is really meant as a 'demonstration/learning experience' - hence the six how-to articles - but it might suffice for a small dynamic site.
I have used SQL Server Full Text Search for some projects. It works well, but it's really just searching database content, not a combination of static and dynamic HTML/PDF/Word/JPG etc. documents, which a "real" web crawler will do.
Related
I'm about to embark on an ASP.NET project which involves building a pretty powerful search function. The application is very database heavy. Essentially, organisations will be adding a lot of metadata about themselves in the form of multi-selects, free text boxes etc., which are all stored in SQL 2008.
When it comes to search I'm loath to re-invent the wheel. Normally with a content-driven site I'd use a component such as Zoom Search or ASP.net Search engine (http://www.aspnetsearchengine.com/UltimateSearch/Features.aspx).
But I don't think these types of content-driven search controls are appropriate for what I need, given the data-driven nature of the search.
I'm thinking full text search is the way to go, but then I'll probably lose a lot of the bells and whistles I'd typically get with a packaged search module, like spelling suggestions, document search, synonyms, ignore words etc.
Are there any good hybrid solutions (paid or free) for .NET sites that provide these nice features within a search framework of sorts?
Thanks,
Ed
Lucene is pretty highly regarded across a number of languages. It's in use on some pretty large sites too; I know monster.com uses it, and their search is pretty extensive.
https://lucene.apache.org/
Edit: Found some more resources:
Lucene.Net and SQL Server
SQL Server 2008 Full Text Search (FTS) versus Lucene.NET
http://ifdefined.com/blog/post/Full-Text-Search-in-ASPNET-using-LuceneNET.aspx
open source faceted search / guided navigation for ecommerce sites with .net apis
My next assignment is to build two information portals for customers. These portals will be login-protected sites and contain a set of pages displaying information like orders, invoices, PDF files ... for the authenticated user (all presented as lists with links to detail pages). The users and the data are stored in an Oracle database. The portals differ in some of the features and in the layout.
My standard approach is to build an individual ASP.net Web Application for every portal.
But this is not the best way to get something reusable. So for these two projects my idea is to create a set of WCF services to get the Data from the Oracle database and to build user controls to display the different elements in Umbraco. This way I hope to get a set of independent, reusable “modules” which can be used to build these portals.
Now my question: is Umbraco a good platform for this type of projects? And is my “concept” a valid approach?
Kind regards
Volkmar
Umbraco is very flexible. On the one hand, there is the question of security: with Umbraco you can use any Membership Provider you want for all visitors (also with member roles).
On the other hand, there is the question of integration: with Umbraco you can create user controls, XSLT files, or Razor files as macros (which can be seen as the reusable modules).
For XSLT you can implement your own XsltExtension, which pulls in the external content as an XPathNodeIterator that you can use in every XSLT macro. For ascx files or Razor you can use LinQ2Umbraco, your own objects, etc. to connect to the Oracle database.
You can also use some sort of caching to reduce the database calls. Keep in mind that one of Umbraco's biggest advantages is that it stores all the content as XML and as an object tree in memory, so it is very fast at rendering content. With every database call you lose a little bit of this advantage.
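To make the caching idea concrete, here is a minimal sketch of a time-based cache in front of a data loader. It is written in Python purely for illustration (in an Umbraco user control you would use whatever the .NET side offers), and the names `TtlCache`, `load_orders` and the key `customer-42` are made up for this example.

```python
import time


class TtlCache:
    """Keeps loaded values for a fixed number of seconds (hypothetical helper)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested
        self._entries = {}          # key -> (expires_at, value)

    def get(self, key, load):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]         # still fresh: no database call
        value = load(key)           # missing or expired: hit the data source
        self._entries[key] = (now + self.ttl, value)
        return value


calls = []

def load_orders(customer_id):
    # Stands in for a real WCF/Oracle call; records how often it is invoked.
    calls.append(customer_id)
    return ["order-1", "order-2"]


cache = TtlCache(ttl_seconds=60)
first = cache.get("customer-42", load_orders)
second = cache.get("customer-42", load_orders)  # served from the cache
```

The trade-off is the usual one: a longer lifetime means fewer database calls but staler order and invoice lists.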
hth, Thomas
Ruben Verbourgh began the Oracle4Umbraco project to create an abstracted fork for the Datalayer to support running on an Oracle DB. You can find it at http://oracle4umbraco.codeplex.com/, although it has no active releases, so build from source and YMMV.
Volkmar, your concept is perfectly sound - although you might want to consider using the Umbraco data store as the persistence layer for your data rather than in the Oracle DB itself. You get XML content versioning, caching, and all the benefits of the content-management side of things, in a robust and flexible framework which you can expose to other apps later should you so need to, through the Umbraco APIs and web services.
HTH,
Benjamin
Content management of a website becomes simpler with Umbraco.
But if you are planning to use Oracle as the backend, Umbraco does not have support for it.
So decide carefully which parameters can be compromised.
Good luck.
We host websites in a shared hosting environment where Microsoft SQL Server full text searching is not allowed. We would love an ASP.NET API that allowed similar functionality to get around this restriction.
We can't easily install software on the shared servers, so the API would have to be written in ASP.NET.
SQL "like" queries are our alternative, and they are fast enough (our websites never exceed 50 MB of text), but they don't rank results well, have a dictionary, do stemming, etc.
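For anyone wondering what "ranking" and "stemming" add on top of LIKE, here is a rough sketch of the idea: an in-memory inverted index with a very crude suffix-stripping stemmer and TF-IDF scoring. It is in Python for brevity rather than ASP.NET, the stemmer is a toy stand-in for a real one, and the page names and texts are invented for the example.

```python
import math
import re
from collections import Counter, defaultdict


def stem(word):
    """Crude suffix stripping; a real engine would use a proper stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    return [stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())]


def build_index(docs):
    """docs maps id -> text; returns term -> {doc_id: occurrences}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, total_docs, query):
    """Returns doc ids, best match first, scored by term frequency * IDF."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(1 + total_docs / len(postings))  # rarer terms weigh more
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return sorted(scores, key=lambda d: (-scores[d], d))


# Invented sample content.
pages = {
    "home": "Welcome to our hosting company",
    "pricing": "Hosting prices and hosted plans for shared hosting",
    "contact": "Contact the support team",
}
index = build_index(pages)
results = search(index, len(pages), "hosting")
```

Because "hosting" and "hosted" both reduce to the same stem, the pricing page counts three matches and is listed ahead of the home page, which a plain LIKE query could not tell you.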
For this type of circumstance I'd rely on Google and create a proper sitemap. You can integrate Google search right into your website too, with Google SiteSearch.
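If "a proper sitemap" is unfamiliar: it is just an XML file in the sitemaps.org format listing the URLs you want crawled. A minimal sketch of generating one follows, in Python for illustration; the URLs are placeholders, and a dynamic site would pull them from its database instead.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Returns the <urlset> document as a string, one <url><loc> per page."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URLs for the example.
sitemap_xml = build_sitemap([
    "http://example.test/default.aspx",
    "http://example.test/product.aspx?id=1&page=2",
])
```

Save the result with an XML declaration as sitemap.xml at the site root and reference it from robots.txt or submit it through the search engine's webmaster tools.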
If you need more control over full-text search, you can use features of the RDBMS to support this. You don't say which brand of RDBMS you're using. I assume it's likely Microsoft SQL Server if you're using ASP.NET.
See the docs for Full-Text Search at MSDN.
For other brands of RDBMS, see my answer: How best to develop the sql to support Search functionality in a web application?
Lucene is what we were looking for http://incubator.apache.org/lucene.net/
I have no experience building a search solution, but I'd like to have a search box within my solution and I don't know where to even begin. Are there cool SQL Server tricks that I can use to make my search solution performant (I'm using a hosted SQL 2008 server)? I'd love pointers to a multi-step tutorial that starts me off with a simple query search solution... and then layers on more advanced code and features.
You don't actually say whether you need/want a 'spider' to index your site "as is" (like Google; which is useful if your searchable content on each page comes from many different tables/objects/entities) or whether you just want to query EF using full-text-search-like syntax to return a collection of Entities?
If you are interested in the 'spider' approach - here's a CodeProject article for a small ASP.NET Search Engine "Searcharoo". It is a web-crawling search engine for small-ish sites (it doesn't use a database at all), so it may not be applicable for your situation.
The code is also at searcharoo.codeplex.com and there are 7 articles on how it works/was built at Searcharoo.net (disclaimer: I wrote them; I hope they are interesting/useful).
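To give a feel for what the 'spider' approach involves (this is a generic sketch, not Searcharoo's actual code), the core loop is: fetch a page, pull out its visible text and links, and queue any same-site links you haven't seen yet. The example is in Python, and the `fetch` function and the two-page fake site are assumptions so that it runs without a network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PageParser(HTMLParser):
    """Collects absolute link targets and visible words from one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.words = []
        self._skip = 0  # depth inside <script>/<style>, whose text is ignored

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.words.extend(data.split())


def crawl(start_url, fetch):
    """Breadth-first crawl of one host; fetch(url) returns HTML or None."""
    host = urlparse(start_url).netloc
    queue, seen, catalog = [start_url], {start_url}, {}
    while queue:
        url = queue.pop(0)
        html = fetch(url)
        if html is None:
            continue
        parser = PageParser(url)
        parser.feed(html)
        parser.close()
        catalog[url] = " ".join(parser.words)
        for link in parser.links:
            link = link.split("#")[0]  # ignore in-page anchors
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return catalog


# Fake two-page site standing in for real HTTP requests.
site = {
    "http://example.test/": '<a href="/about.aspx">About</a> '
                            '<a href="http://other.test/">Else</a> Home page',
    "http://example.test/about.aspx": '<script>var x = 1;</script>'
                                      '<p>About us</p> <a href="/">Home</a>',
}
catalog = crawl("http://example.test/", site.get)
```

The resulting `catalog` (URL to page text) is what you would then feed into an index; a real crawler also needs to respect robots.txt, handle errors and redirects, and parse non-HTML documents.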
If you need to search your database directly, you should probably look into SQL Server 2008's Full Text Search feature (assuming LIKE isn't sophisticated enough for your needs). We used info from this article (free registration) to set up SQL Full Text Search on a work project... no EF in our solution though.
Also, as you might know StackOverflow is built with ASP.NET MVC - they blogged about some problems with SQL 2008 FTS. There's also some info on SQL FTS versus Lucene.NET (which is another search engine you could research) that might be useful.
You might be interested in reading this.
Read this article:
Create a Site Search Engine in ASP.NET
If you don't have to program an engine yourself you could consider using Google Custom Search Engine. There are couple of articles about this:
Using Google Co-op's Custom Search Engine
Implementing Search in ASP.NET with Google Custom Search
Also could be useful:
Helping Visitors Search Your Site By Creating an OpenSearch Provider
I am exploring the options of establishing a wiki site for my company's division of developers, numbering over a hundred. We are a pure Microsoft (Certified Partner) shop, so it is natural to base an implementation on ASP.NET and IIS for familiarity's sake as well as extended learning opportunities.
It looks like Screwturn wiki does not offer a user provider that can hook up to Active Directory. Is there a wiki engine that natively supports AD? Managing two user bases would not be the most efficient of activities when we wish to control access.
UPDATE: looks like ScrewTurn now has an official AD provider
http://www.screwturn.eu/blog/?p=255
Have you had a look at SharePoint Wiki? Since you are a Microsoft shop, it's probably the easiest to set up.
As expected, it integrates with Active Directory out of the box. It's not really written about much since it's part of SharePoint Server. Here's the Microsoft Page
To be honest, it's not the greatest wiki around. The markup is HTML based, which is clunky coming from MediaWiki, but it fit in nicely.
The N2 Open Source ASP.NET CMS is a lightweight CMS framework to help you build great web sites that anyone can update. It contains a package of functional templates with News, Wiki, Photo Galleries, FAQs, RSS, Data Entry, Polls and more. Also, N2 leverages ASP.NET features such as existing web controls, site maps and membership providers.
We've been using Perspective Wiki - it integrates with Windows AD fairly well, and has most of the features you'd expect of a wiki - which is more than could be said for the SP Wiki - we've not migrated to MOSS wikis yet, mostly because of missing features.
MindTouch is OpenSource and .Net based. Ohloh has rated it as a 5. The source code is pretty tight, and there are connectors for SqlServer as well as a scripting language. Finally it supports LDAP and ActiveDirectory.
MindTouch is generally considered a good open source alternative to SharePoint, as you can quickly create customizations with its toolkits as well as integrate with MS Office.
According to their website LiteWiki supports the ASP.NET membership providers and is pretty lightweight.
It might not be what you are looking for, but my company has MediaWiki on Linux/Apache running with integrated AD security. It is locked down to users in a particular AD group and we log on with our AD credentials.
MediaWiki seems to be a very good wiki too.
You can add AD support with a couple of lines of code. Check the ScrewTurn wiki forums for various examples.