Kentico CMS search results

How do I change the Kentico CMS search settings so that each search result shows an excerpt of the matching text, as Google does? At present the results show only the path.

It really depends on how you have your search set up.
At the page level, if you are using the Portal Engine model (which the majority of people use now), you have to check the widget that you are using. It basically boils down to the regular search or Smart Search.
If you're using the ASPX Template model, you may have to open the source for the page and see which user control file you're using from ~/CMSWebParts/Search/ or ~/CMSWebParts/SmartSearch/.
Once you figure out which user control you are using, it's a matter of inspecting the Transformation that it uses. Most likely you'll be using one of the following:
Click on Edit Transformation and check which field is inside the call to SearchHighlight; normally it is "Content", which means the text is pulled from the main content of the document. I've also seen this tied to a different field such as "Title" or "Caption", but the default is "Content".
If you still don't see results with part of the text, make sure you have a Smart Search index set up, found in CMSSiteManager -> Administration -> Smart Search. If you don't see your site in the index list, you need to add one. Make sure you rebuild and optimize it (click Edit on the row to get to those options). Once the index is rebuilt, you should see the text appear under each result.

One thing to note, as @jao has mentioned, is that this only takes the first 280 characters of the page content. If your matching search text doesn't happen to be in the first 280 characters, no highlighting will occur.

Try the following in your search result transformation:
<%# SearchHighlight(HTMLHelper.HTMLEncode(TextHelper.LimitLength(HttpUtility.HtmlDecode(HTMLHelper.StripTags(GetSearchedContent(DataHelper.GetNotEmpty(Eval("Content"),"")),false, " ")), 280, "...")),"<span style=\"background-color: #FEFF8F\">","</span>") %>
This will show the first 280 characters from your content, with the search terms highlighted.
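To make the nested calls above easier to follow, here is a minimal Python sketch of the same pipeline: strip tags, decode entities, cut to 280 characters, re-encode, then wrap the search terms in a highlight span. It is not Kentico code. The function names are made up for illustration, and counting the "..." suffix toward the 280-character limit is an assumption of this sketch rather than documented behaviour of TextHelper.LimitLength.

```python
import html
import re

HIGHLIGHT_START = '<span style="background-color: #FEFF8F">'
HIGHLIGHT_END = "</span>"


def strip_tags(text: str) -> str:
    """Replace anything that looks like an HTML tag with a space."""
    return re.sub(r"<[^>]+>", " ", text)


def limit_length(text: str, max_len: int = 280, suffix: str = "...") -> str:
    """Cut text to at most max_len characters (suffix included; an assumption)."""
    if len(text) <= max_len:
        return text
    return text[: max_len - len(suffix)] + suffix


def highlight(encoded_text: str, terms: list[str]) -> str:
    """Wrap every case-insensitive occurrence of a term in the highlight span."""
    cleaned = [t for t in terms if t]
    if not cleaned:
        return encoded_text
    # One alternation pattern, so inserted markup is never re-scanned.
    cleaned.sort(key=len, reverse=True)
    pattern = re.compile(
        "|".join(re.escape(html.escape(t)) for t in cleaned), re.IGNORECASE
    )
    return pattern.sub(
        lambda m: HIGHLIGHT_START + m.group(0) + HIGHLIGHT_END, encoded_text
    )


def search_snippet(content: str, terms: list[str], max_len: int = 280) -> str:
    """Plain-text excerpt of content with the search terms highlighted."""
    plain = html.unescape(strip_tags(content or ""))
    plain = " ".join(plain.split())  # collapse runs of whitespace
    return highlight(html.escape(limit_length(plain, max_len)), terms)
```

Because the text is truncated before highlighting, a term that first appears after the cut-off is never wrapped, which matches the 280-character limitation mentioned earlier.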


Customizing Search in Orchard

Is there any option to completely control a Search module with a Summary view? I am struggling to get there. I have the following settings so far:
In admin I created an Index called PublicSearch with a number of fields.
I am getting a search result which is a mixture of two content parts - Question and Expert
I have a Part view to be used in search result for Experts called ExpertSummary.cshtml. The view only contains the following elements now:
@model dynamic
<p>Expert Summary</p>
When the search results come back, I expect only "Expert Summary" to be visible in the Expert region of the search results, but I am also getting an additional "body" section (truncated to X characters). It seems to appear because "body" was selected as a field when I created the index.
Each Expert record has an image of the expert, which shows up in the search result. When I take out <Place Fields_MediaPicker="Content:1" /> the image disappears, which is fine.
But I want this summary to be completely controllable in ExpertSummary.cshtml - like a two column layout where the left col will hold the expert image and the right one will hold a brief description along with some other info - but everything would be in the View - should not come from Orchard search module as defaults.
In short I want Orchard's Index and Search modules to be functional and working like it is now but the layout and information I want to control completely using custom HTML in Parts/ExpertSummary.cshtml view.
Is this at all possible? If yes, how?
Please suggest. I have completely lost my way in the Orchard framework!
I'm not completely sure I understand your question... Orchard's search returns a Summary view by default, so you need to control the layout of your summary. Use shape tracing to create a new alternate to change the layout of the summary view. ExpertSummary won't control the entire layout; it will just control the Expert part. Then use placement to decide what to display.
Also, tagging questions with "orchardcms" will be more useful than "orchardcms-1.7".

SDL Tridion Schema Field "List of Links" Options

I'm looking to create an SDL Tridion schema with a list of repeatable links while avoiding multiple fields per link.
In a rich text field I have the following options for creating a hyperlink:*
When content authors create one of these hyperlinks, they have the option to select linked (visible) text as well as title and target attributes that function like typical HTML hyperlinks.
*"Rich text" means a Text field with Height of the Text Area set to at least 2 rows and Allow Rich Text Formatting selected.
Single Schema Field Link
When creating a single schema field, I see these options:
External Link (author options will include http://, mailto, Other)
Multimedia Link
Component Link (which can allow Multimedia Values)
Current Ideas
The best out-of-the-box (OOTB) setups I've found for this "list of links" are either:
a single 2-line RTF with instructions to create a hyperlink (of any type) in that field
separate fields for each type as well as additional fields for display name, target, and title (where the fields are assembled through template code); authors fill in only one of the link fields (component link or external)
Is there a way in the schema form designer, by updating the schema source, or through code to offer the same (RTF) hyperlink drop-down options, but in a single field? I could be missing something, but recognize this scenario isn't supported OOTB.
One question we are missing here is to consider if those links are going to be used somewhere else individually. If that's the case, multiple components would be my first choice, so we can reuse each component several times.
If you are planning to allow the editor to create a list of links that they are only going to use in a given component (not reusable), well, you have all the options mentioned in the previous answers.
To give you an idea on what's the best approach (in my humble opinion) here are things to consider:
Individual Components per link: use this approach if links are reusable.
Using embedded schemas (with the link structure) so this approach can be used in different component types (schemas)
Custom URL / Single Line Text Field: it requires additional development effort, and it is very unlikely you will keep the hard link references when creating internal links. As you know, SDL Tridion keeps a reference to the TCM ID in order to resolve links, trigger publishing, etc.
Custom URL / 2 Lines RTF: It will do the job, but you need to disable all the other RTF options from the Ribbon Tool Bar within the Schema RTF options, so that the editors can only create links. You might also need to add an XSLT filter to check whether the editors entered something more than just links. These links are not reusable.
In general if you implement something custom (GUI extension + Custom URL) keep in mind all the TRIDION CMS concepts, like blueprinting (what happens when the link is inherited down), where used, etc...
My recommendation has always been to use Separated Components, but be careful with the link propagation when publishing...
I have seen this case at customers. If they want to minimize development effort, a multivalue embedded field is a good idea.
You can have it as:
[text] Link Text
[Component Link] Link to anything
You would need an extra Content schema for External Links, like:
[External Link] Url
[text] target
[any extra option you need]
This means the editor would need to create a new External Link Component every time they create an external link. It is extra work, but it can also mean easier maintenance on the use of external urls within their site.
Lastly, the editor would just add multiple Component Links, each pointing to a Component of schema External Link or any other. The template code then checks the schema of the linked Component and adds the markup accordingly.
XML Name      Description    Field Type
[text]        Text           Text
[title]       Title          Text
[static_url]  External URL   Text
[component]   Internal URL   Component Link
In the field descriptions for "External URL" and "Internal URL" you could add a comment so that the editor doesn't get confused: only one of these two fields should be filled in. From the component, its ID can be used to create the dynamic link in the DWT. This solution has no development effort and for the editor is pretty much as intuitive as it can get. Of course this would be a multivalue embedded schema field inside the Links schema.
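As a rough illustration of the "only one of these two fields should be filled in" rule, here is a small Python sketch of the decision a template would have to make for each embedded link. It is not Tridion template code: the LinkEntry class and the function names are hypothetical, the attribute names simply mirror the XML names in the table above, and the TCM URI is emitted as-is where a real template would hand it to dynamic link resolution.

```python
from dataclasses import dataclass
from html import escape
from typing import Optional


@dataclass
class LinkEntry:
    """One value of the multivalue embedded field (hypothetical model)."""
    text: str
    title: str = ""
    static_url: str = ""   # "External URL" field
    component: str = ""    # "Internal URL" field, e.g. a TCM URI like "tcm:1-2"


def resolve_href(entry: LinkEntry) -> Optional[str]:
    """Return the one target that was filled in, or None if ambiguous."""
    external = entry.static_url.strip()
    internal = entry.component.strip()
    if bool(external) == bool(internal):
        return None  # neither field or both fields were filled in
    return external or internal


def render_link(entry: LinkEntry) -> str:
    """Render an anchor, falling back to plain text for invalid entries."""
    href = resolve_href(entry)
    if href is None:
        return escape(entry.text)
    return '<a href="{}" title="{}">{}</a>'.format(
        escape(href), escape(entry.title), escape(entry.text)
    )
```

Falling back to plain text when both or neither field is set is just one possible policy; a template could equally prefer the internal link or log a warning.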
This use-case might work using a Custom URL field and maybe a GUI extension. The idea is to have a Custom URL that opens a popup (which might be a GUI extension). In that popup, you would select/construct your link (maybe using the same options as a normal RTF link - Component, Anchor, mailto, etc).
The popup would return a specially crafted string. The format could be anything, even an actual anchor tag (but JSON is also fine). Example: {"href": "tcm:1-2", "type": "component"}.
Your Templates would interpret this string in order to generate something meaningful, like a dynamic link or static HTML anchor.
Also the Custom URL popup should be smart enough to 'decode' such a link (if a value was specified in that field previously) and maybe pre-populate some attributes in the RTF link constructor form.
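A minimal Python sketch of that round trip, assuming the popup stores valid JSON with "href" and "type" keys. The function names are hypothetical, and the data-component-uri attribute is only a placeholder for whatever a real template would output before link resolution.

```python
import json
from html import escape
from typing import Optional


def encode_link(href: str, link_type: str) -> str:
    """What the popup would write back into the Custom URL field."""
    return json.dumps({"href": href, "type": link_type}, sort_keys=True)


def decode_link(raw: Optional[str]) -> Optional[dict]:
    """Parse a stored value; return None if it is not a usable link."""
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        return None
    if not isinstance(data, dict) or "href" not in data or "type" not in data:
        return None
    return data


def to_anchor(raw: Optional[str], text: str) -> str:
    """Turn the stored string into markup a template could emit."""
    link = decode_link(raw)
    if link is None:
        return escape(text)
    href = escape(str(link["href"]))
    if link["type"] == "component":
        # Placeholder attribute: real templates would resolve the TCM URI.
        return '<a data-component-uri="{}">{}</a>'.format(href, escape(text))
    return '<a href="{}">{}</a>'.format(href, escape(text))
```

The same decode_link step is what the popup would use to pre-populate its form when the field already holds a value.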

Find every instance of a CSS id/class across a whole site

Before making a CSS change that might possibly have unintended consequences, what's a good way to find where else on the whole site (not just this page) that id or class is used? (It doesn't have to be exhaustive, and semi-manual processes are ok, too.)
For a bit of context, it's a Joomla-based site with a lot of content, and I'm not yet familiar with most of it. The id in question has a two letter name, and I have no idea where else it might be used. I don't have direct access to the server for any grep-like approaches.
The only technique I can think of is using Stylish to make an obvious change to that one selector, and browsing the site for a bit to see where it pops up.
The easiest way would be a local grep, but since you don't have access to the server, try downloading it locally using wget:
wget -r -l --domains=
That'll recursively retrieve pages from your domain to an infinite depth, but only following links to pages within your domain.
Once it's on disk, do a local grep and you're golden.
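If grep isn't available on your machine, the local search can be sketched in Python instead. This is only an illustration: the function name is made up, the regular expressions assume quoted attribute values, and it scans every file in the mirror as text, since pages saved by wget may have no file extension.

```python
import re
from pathlib import Path


def find_selector_uses(root: str, name: str, kind: str = "id"):
    """Return (path, line number, line) for each use of an id or class."""
    quoted = re.escape(name)
    if kind == "id":
        # Matches id="name" but not data-id="name" or id="othername".
        pattern = re.compile(
            r"(?<![\w-])id\s*=\s*[\"']" + quoted + r"[\"']", re.IGNORECASE
        )
    else:
        # Matches the name as a whole token inside a class attribute.
        pattern = re.compile(
            r"(?<![\w-])class\s*=\s*[\"'][^\"']*(?<![\w-])"
            + quoted
            + r"(?![\w-])[^\"']*[\"']",
            re.IGNORECASE,
        )
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Call it with the folder wget created and the two-letter id; the result lists every page and line where it appears, which is usually enough to judge the impact of the CSS change.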
I use unused-css for this sort of thing. You simply put in your webpage, and it will look through the whole site (incl. login) and give you the CSS that you are actually using.
I've found it to be about 95% correct. It only misses things like some CSS browser hacks and some error states (i.e. CSS that only displays after an error), so it should work fine for this.
You could also check the original template (assuming the template is a commercial one) to see where the id perhaps should be (they usually lay everything out in their demo template), but unused-css won't tell you exactly where it is used, only if it is or not. For that, I'd start with a view-source -> find on the major pages, and then try other mentioned solutions.
Get the whole site's source tree into an IDE like NetBeans or Eclipse and then do a recursive search for id="theid" on the root folder.
If this is not possible, how are you updating the CSS?
Assuming you don't want to do the grep approach:
Is the ID in question appearing in the actual content area of the page, or in the 'surrounding' areas? If it seems like it's not part of the content, but rather appears in a template, you could search the template files for it. As you're updating the CSS, I'm going to assume you can at least get hold of the template files. Many text editors/IDEs will let you do a 'global search'. I'd load the template files in TextMate (my text editor of choice) and do a "search in project" for the particular ID.
That will at least give you a semblance of an idea of where in the site that ID shows up. No, it won't be every 'page', but you'll know what kind of page it appears on (which, with a CMS, is really what you're after).
If the ID in question appears in the content, that is, it was hand-entered by content creators, you'll have to go another route. Do you have access to the database? If you can get a dump of the database (I think Joomla! is MySQL based), you can open the sql in something like Sequel Pro and do a search in the content records for that ID.
This is not actually as hard as it sounds. The first place to look is the index.php file for the template. This file should be pretty small, without a ton of code, unless the template is from a developer who uses a template framework. If the ID is in there, it will show up on every page in the website, since this is the foundation that every page is built on.
If you don't find it in there, then you need to determine whether it is displaying in a module position or in the component area. You should be able to tell the difference by looking at the index.php file from the template.
If it's in a module position, then the ID should only show up in instances of that particular module.
If it's in the component area, then it should only display on pages created by that component. That does leave the possibility of it affecting many elements you don't want changed, but there is a solution for that: you can use the page class suffix in a menu item to add a unique id/class to the page you want to change (depends on your template). With that unique suffix you can create a specific selector that will only affect the pages you want to change.

Drupal Site Index - not crawling through "Blocks"?

I created a "View"* in Drupal to grab all the content and essentially make a site map, but I realized that it doesn't have an option to grab content from the Blocks I have created. Does anyone have an idea if I can even do that?
If not, should I essentially make each block a page so that it can crawl through the pages? I worry that this will end up becoming unmanageable in the end... What are some other options/work arounds? My end goal is to make a site map - maybe I am making this too complicated?
*To make my view I did:
Administration->Structure->Views->Add. Then I made it a page, called it "site-index", and made it "show Content of type All" (with tagged field empty). Then I chose "Content: Title" for my Fields and my Filter Criteria is set as: "Content: Published (Yes):" - That way, it will grab the titles of my web pages.
Thanks, and please reply if further clarification is needed!
Apologies if I'm wrong but I think there might be a bit of confusion over terminology here. In the context of a view Content means nodes, not all HTML content on the site. Your view will return a list of all published nodes, which are essentially the pages on your site.
On a normal sitemap (if there is such a thing) you would only link to full pages, not to parts of pages such as blocks. Sitemaps are essentially used to provide a hierarchical overview of your site to aid navigation for users and, probably more importantly these days, search engines (you can submit an XML sitemap to the major search engines instead, but that's really for another question).
Rather than doing this yourself I'd actually recommend you download and install the Sitemap module which will do all of the work for you, as well as arranging the content in their respective hierarchy.

Custom Parser for Nutch (or open source .NET Crawler)

I have been using Nutch/Solr/SolrNet for my search solutions and, I must say, it works a treat. On a new site I'm working on I am using Master pages; as a result, content in the header and footer is getting indexed and distorts the results. For example, I have a link to the Contact Us page in the header, so when I search for 'Contact' the results include every page in the site.
Is there a customizable Nutch parser that I can pass a div id to, so that it only indexes content inside that div?
Or are there .NET-based crawlers that I can customize?
BTW, you'd get a more relevant audience by posting to the Nutch user list.
You can implement a Nutch filter (I like Jericho HTML Parser) to extract only the parts of the page you need to index using DOM manipulation. You can use the TextExtractor class to grab clean text (sans HTML tags) to be used in your index. I usually save that data in custom fields.
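The idea of indexing only the text inside one div can be sketched independently of Nutch or Jericho. The Python below uses only the standard library's html.parser; the class and function names are hypothetical, and the depth counting assumes reasonably well-formed markup in which every non-void element inside the target div has a closing tag.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class DivTextExtractor(HTMLParser):
    """Collects text that appears inside <div id="target_id"> only."""

    def __init__(self, target_id: str):
        super().__init__(convert_charrefs=True)
        self.target_id = target_id
        self.depth = 0        # 0 means we are outside the target div
        self.skip_depth = 0   # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
            if tag in ("script", "style"):
                self.skip_depth += 1
        elif tag == "div" and dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1
        self.depth -= 1

    def handle_data(self, data):
        if self.depth and not self.skip_depth:
            self.chunks.append(data)


def extract_div_text(html_source: str, target_id: str) -> str:
    """Return whitespace-normalized text found inside the target div."""
    parser = DivTextExtractor(target_id)
    parser.feed(html_source)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())
```

With a master page whose header contains the Contact Us link, feeding only the output of extract_div_text for the main content div into the index would keep the header and footer text out of the results.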
