Silverstripe: partialMatchFilter and "&" - silverstripe

I have a client that wants to be able to pick several things off the Silverstripe tag cloud widget.
So, using jQuery, I prepare a string containing all the selected tags which I then pass to a Silverstripe function.
if($_GET["selectedTags"]){
$selectedTagString = $_GET["selectedTags"];
$selectedTagString = substr($selectedTagString, 0, strlen($selectedTagString) -1);
$tagArray = explode("|", $selectedTagString);
$blogEntries = DataObject::get("BlogEntry")->filter(array("Tags:PartialMatch" => $tagArray));
return $blogEntries->renderWith(array("blogSearchResults"));
}
}
And it works quite well.
EXCEPT for tags that have an "&" in them, like "Otago & Southland", where the search fails and nothing is retrieved.
Looking at the generated SQL, everything seems to be fine.
SELECT DISTINCT "SiteTree_Live"."ClassName", "SiteTree_Live"."Created",
.
.
.
"BlogEntry_Live"."BlogEntryThumbnailID", "SiteTree_Live"."ID", CASE WHEN "SiteTree_Live"."ClassName" IS NOT NULL THEN "SiteTree_Live"."ClassName" ELSE 'SiteTree' END AS "RecordClassName" FROM "SiteTree_Live" LEFT JOIN "Page_Live" ON "Page_Live"."ID" = "SiteTree_Live"."ID" LEFT JOIN "BlogEntry_Live" ON "BlogEntry_Live"."ID" = "SiteTree_Live"."ID" WHERE ("BlogEntry_Live"."Tags" LIKE '%southland & otago championships%') AND ("SiteTree_Live"."ClassName" IN ('BlogEntry')) ORDER BY "SiteTree_Live"."Sort" ASC
Has anyone had this problem before?

Are the tags stored using & in the db? Then you first should html encode your string.

Related

How to match space in MarkLogic using CTS functions?

I need to search those elements who have space " " in their attributes.
For example:
<unit href="http:xxxx/unit/2 ">
Suppose above code have space in the last for href attribute.
I have done this using FLOWER query. But I need this to be done using CTS functions. Please suggest.
For FLOWER query I have tried this:
let $x := (
for $d in doc()
order by $d//id
return
for $attribute in data($d//#href)
return
if (fn:contains($attribute," ")) then
<td>{(concat( "id = " , $d//id) ,", data =", $attribute)}</td>
else ()
)
return <tr>{$x}</tr>
This is working fine.
For CTS I have tried
let $query :=
cts:element-attribute-value-query(xs:QName("methodology"),
xs:QName("href"),
xs:string(" "),
"wildcarded")
let $search := cts:search(doc(), $query)
return fn:count($search)
Your query is looking for " " to be the entirety of the value of the attribute. If you want to look for attributes that contain a space, then you need to use wildcards. However, since there is no indexing of whitespace except for exact value queries (which are by definition not wildcarded), you are not going to get a lot of index support for that query, so you'll need to run this as a filtered search (which you have in your code above) with a lot of false positives.
You may be better off creating a string range index on the attribute and doing value-match on that.

SQLite: isolating the file extension from a path

I need to isolate the file extension from a path in SQLite. I've read the post here (SQLite: How to select part of string?), which gets 99% there.
However, the solution:
select distinct replace(column_name, rtrim(column_name, replace(column_name, '.', '' ) ), '') from table_name;
fails if a file has no extension (i.e. no '.' in the filename), for which it should return an empty string. Is there any way to trap this please?
Note the filename in this context is the bit after the final '\'- it shouldn't be searching for'.'s in the full path, as it does at moment too.
I think it should be possible to do it using further nested rtrims and replaces.
Thanks. Yes, you can do it like this:
1) create a scalar function called "extension" in QtScript in SQLiteStudio
2) The code is as follows:
if ( arguments[0].substring(arguments[0].lastIndexOf('\u005C')).lastIndexOf('.') == -1 )
{
return ("");
}
else
{
return arguments[0].substring(arguments[0].lastIndexOf('.'));
}
3) Then, in the SQL query editor you can use
select distinct extension(PATH) from DATA
... to itemise the distinct file extensions from the column called PATH in the table called DATA.
Note that the PATH field must contain a backslash ('\') in this implementation - i.e. it must be a full path.

Obtain data from dynamically incremented IDs in JQuery

I have a quick question about JQuery. I have dynamically generated paragraphs with id's that are incremented. I would like to take information from that page and bring it to my main page. Unfortunately I am unable to read the dynamically generated paragraph IDs to get the values. I am trying this:
var Name = ((data).find("#Name" + id).text());
The ASP.NET code goes like this:
Dim intI As Integer = 0
For Each Item As cItem in alProducts1
Dim pName As New System.Web.UI.HtmlControls.HtmlGenericControl("p")
pName.id = "Name" & intI.toString() pName.InnerText = Item.Name controls.Add(pName) intI += 1
Next
Those name values are the values I want...Name1, name2, name3 and I want to get them individually to put in their own textbox... I'm taking the values from the ASP.NET webpage and putting them into an AJAX page.
Your question is not clear about your exact requirement but you can get the IDs of elements with attr method of jQuery, here is an example:
alert($('selector').attr('id'));
You want to select all the elements with the incrementing ids, right?
// this will select all the elements
// which id starts with 'Name'
(data).find("[id^=Name]")
Thanks for the help everyone. I found the solution today however:
var Name = ($(data).find('#Name' + id.toString()).text());
I forgot the .toString() part and that seems to have made the difference.

Linq-to-SQL question

im really new to linq-to-SQL so this may sound like a really dumb question, i have the following code
var query = from p in DC.General
where p.GeneralID == Int32.Parse(row.Cells[1].Text)
select new
{
p.Comment,
};
how do i got about getting the result from this query to show in a text box ??
That would be:
TextBox1.Text = query.Single().Comment;
You have to filter the first result from your query. To do that, you can use Single() if you know the query only returns one value. You could also use First(), if the results might contain more than one row.
Also, if it's only a single value, you could rewrite the code to:
var query = from p in DC.General
where p.GeneralID == Int32.Parse(row.Cells[1].Text)
select p.Comment;
TextBox1.Text = query.Single();

Why is this looping infinitely?

So I just got my site kicked off the server today and I think this function is the culprit. Can anyone tell me what the problem is? I can't seem to figure it out:
Public Function CleanText(ByVal str As String) As String
'removes HTML tags and other characters that title tags and descriptions don't like
If Not String.IsNullOrEmpty(str) Then
'mini db of extended tags to get rid of
Dim indexChars() As String = {"<a", "<img", "<input type=""hidden"" name=""tax""", "<input type=""hidden"" name=""handling""", "<span", "<p", "<ul", "<div", "<embed", "<object", "<param"}
For i As Integer = 0 To indexChars.GetUpperBound(0) 'loop through indexchars array
Dim indexOfInput As Integer = 0
Do 'get rid of links
indexOfInput = str.IndexOf(indexChars(i)) 'find instance of indexChar
If indexOfInput <> -1 Then
Dim indexNextLeftBracket As Integer = str.IndexOf("<", indexOfInput) + 1
Dim indexRightBracket As Integer = str.IndexOf(">", indexOfInput) + 1
'check to make sure a right bracket hasn't been left off a tag
If indexNextLeftBracket > indexRightBracket Then 'normal case
str = str.Remove(indexOfInput, indexRightBracket - indexOfInput)
Else
'add the right bracket right before the next left bracket, just remove everything
'in the bad tag
str = str.Insert(indexNextLeftBracket - 1, ">")
indexRightBracket = str.IndexOf(">", indexOfInput) + 1
str = str.Remove(indexOfInput, indexRightBracket - indexOfInput)
End If
End If
Loop Until indexOfInput = -1
Next
End If
Return str
End Function
Wouldn't something like this be simpler? (OK, I know it's not identical to posted code):
public string StripHTMLTags(string text)
{
return Regex.Replace(text, #"<(.|\n)*?>", string.Empty);
}
(Conversion to VB.NET should be trivial!)
Note: if you are running this often, there are two performance improvements you can make to the Regex.
One is to use a pre-compiled expression which requires re-writing slightly.
The second is to use a non-capturing form of the regular expression; .NET regular expressions implement the (?:) syntax, which allows for grouping to be done without incurring the performance penalty of captured text being remembered as a backreference. Using this syntax, the above regular expression could be changed to:
#"<(?:.|\n)*?>"
This line is also wrong:
Dim indexNextLeftBracket As Integer = str.IndexOf("<", indexOfInput) + 1
It's guaranteed to always set indexNextLeftBracket equal to indexOfInput, because at this point the character at the position referred to by indexOfInput is already always a '<'. Do this instead:
Dim indexNextLeftBracket As Integer = str.IndexOf("<", indexOfInput+1) + 1
And also add a clause to the if statement to make sure your string is long enough for that expression.
Finally, as others have said this code will be a beast to maintain, if you can get it working at all. Best to look for another solution, like a regex or even just replacing all '<' with <.
In addition to other good answers, you might read up a little on loop invariants a little bit. The pulling out and putting back stuff to the string you check to terminate your loop should set off all manner of alarm bells. :)
Just a guess, but is this like the culprit?
indexOfInput = str.IndexOf(indexChars(i)) 'find instance of indexChar
Per the Microsoft docs, Return Value -
The index position of value if that string is found, or -1 if it is not. If value is Empty, the return value is 0.
So perhaps indexOfInput is being set to 0?
What happens if your code tries to clean the string <a?
As I read it, it finds the indexChar at position 0, but then indexNextLeftBracket and indexRightBracket both equal 0, you fall into the else condition, and then you insert a ">" at position -1, which will presumably insert at the beginning, giving you the string ><a. The new indexRightBracket then becomes 0, so you delete from position 0 for 0 characters, leaving you with ><a. Then the code finds the <a in the code again, and you're off to the races with an infinite memory-consuming loop.
Even if I'm wrong, you need to get yourself some unit tests to reassure yourself that these edge cases work properly. That should also help you find the actual looping code if I'm off-base.
Generally speaking though, even if you fix this particular bug, it's never going to be very robust. Parsing HTML is hard, and HTML blacklists are always going to have holes. For instance, if I really want to get a <input type="hidden" name="tax" tag in, I'll just write it as <input name="tax" type="hidden" and your code will ignore it. Your better bet is to get an actual HTML parser involved, and to only allow the (very small) subset of tags that you actually want. Or even better, use some other form of markup, and strip all HTML tags (again using a real HTML parser of some description).
I'd have to run it through a real compiler but the mindpiler tells me that the str = str.Remove(indexOfInput, indexRightBracket - indexOfInput) line is re-generating an invalid tag such that when you loop through again it finds the same mistake "fixes" it, tries again, finds the mistake "fixes" it, etc.
FWIW heres a snippet of code that removes unwanted HTML tags from a string (It's in C# but the concept translates)
public static string RemoveTags( string html, params string[] allowList )
{
if( html == null ) return null;
Regex regex = new Regex( #"(?<Tag><(?<TagName>[a-z/]+)\S*?[^<]*?>)",
RegexOptions.Compiled |
RegexOptions.IgnoreCase |
RegexOptions.Multiline );
return regex.Replace(
html,
new MatchEvaluator(
new TagMatchEvaluator( allowList ).Replace ) );
}
MatchEvaluator class
private class TagMatchEvaluator
{
private readonly ArrayList _allowed = null;
public TagMatchEvaluator( string[] allowList )
{
_allowed = new ArrayList( allowList );
}
public string Replace( Match match )
{
if( _allowed.Contains( match.Groups[ "TagName" ].Value ) )
return match.Value;
return "";
}
}
That doesn't seem to work for a simplistic <a<a<a case, or even <a>Test</a>. Did you test this at all?
Personally, I hate string parsing like this - so I'm not going to even try figuring out where your error is. It'd require a debugger, and more headache than I'm willing to put in.

Resources