I have a situation where I need to strip out HTML code from some text. However, some of the input text includes lists, and I want to retain the numbering in that case.
If I do
result = Regex.Replace(result, "<li>", vbNewLine & "1. ", RegexOptions.IgnoreCase)
Then after stripping out the other HTML tags, I end up with:
1. List item one
1. List item two
1. List item three
Is there a way to get the index of the match during replacement?
So, for example:
result = Regex.Replace(result, "<li>", vbNewLine & replacementIndex + 1 & " ", RegexOptions.IgnoreCase)
Then after stripping out the other HTML tags, I would get:
1. List item one
2. List item two
3. List item three
Is this possible?
Note: This is inside a function, so that each list is handled separately, and unordered lists get bullets (*) instead.
This should be a good starting point: @"(\<ul\>)((.|\n)*?)(\<\/ul\>)". It will match everything between the tags.
It's messy, but something like the following should work: replace only one match at a time, bumping the counter after each replacement. This may be slow for large data sets.
Dim liTag As New Regex("<li>", RegexOptions.IgnoreCase)
Dim lineNbr As Integer = 1
'Keep going until no <li> is left in the text
Do While liTag.IsMatch(result)
    'The last argument (1) limits the replacement to the first remaining match
    result = liTag.Replace(result, vbNewLine & lineNbr.ToString() & ". ", 1)
    lineNbr += 1
Loop
Here's how I ended up doing it - first, find each ordered list:
Dim result As String = rawText
Dim orderedLists As MatchCollection = Regex.Matches(rawText, "<ol>.*?</ol>", RegexOptions.Singleline)
For Each ol As Match In orderedLists
result = Replace(result, ol.Value, EncodeOrderedList(ol.Value))
Next
And the function to convert each one:
Private Function EncodeOrderedList(ByVal rawText As String) As String
Dim result As String = rawText
result = Regex.Replace(result, "<ol>\s*<li>", "1. ", RegexOptions.IgnoreCase)
result = Regex.Replace(result, "</li>\s*</ol>", vbNewLine & vbNewLine, RegexOptions.IgnoreCase)
Dim bullets As MatchCollection = Regex.Matches(rawText, "</li>\s*<li>")
Dim i As Integer = 2
For Each li As Match In bullets
result = Replace(result, li.Value, vbNewLine & i & ". ", 1, 1)
i += 1
Next
Return result
End Function
I haven't tested it on nested lists.
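For the original question (getting the index of each match during replacement), one option is the Regex.Replace overload that takes a MatchEvaluator. The delegate runs once per match, so a counter captured by it can supply the number. A minimal sketch, assuming VB 2010 or later for the multi-line lambda; the names ListNumbering and NumberListItems are made up for illustration, and the function is meant to be called on one list at a time so that numbering restarts for each list:

```vb
Imports System
Imports System.Text.RegularExpressions

Module ListNumbering
    ' Hypothetical helper: replaces each <li> with a new line plus a running number.
    Function NumberListItems(ByVal listHtml As String) As String
        Dim index As Integer = 0
        Return Regex.Replace(listHtml, "<li>",
            Function(m As Match) As String
                ' The evaluator is called once per match, in document order
                index += 1
                Return Environment.NewLine & index.ToString() & ". "
            End Function,
            RegexOptions.IgnoreCase)
    End Function

    Sub Main()
        ' Each <li> in the sample becomes "1. ", "2. ", "3. " on its own line
        Console.WriteLine(NumberListItems("<li>List item one<li>List item two<li>List item three"))
    End Sub
End Module
```

Because the counter lives inside the function, every call starts again at 1, which fits the "each list is handled separately" setup described in the question.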
I have a string with one value per line. I call it ttl_count.
ttl_count looks like this:
1
1
1
0
0
0
1
1
1
1
0
etc.
Just 1s and 0s.
What I want to do is run down the column of numbers in ttl_count and total each consecutive group of 1s. Using the example above:
1
1
1 >> 3
0
0
0
1
1
1
1 >> 4
0
Here we see 2 consecutive groups, one with a subtotal of 3 and the other with 4. I want to send each calculated subtotal into another variable to determine the MAX value and, if the last entry in the variable is '1', to show the current subtotal.
I'm not quite sure how to do this.
You can use the String.Split and String.Join methods. Since you mentioned one value per line, I am assuming that you are reading from a file with standard Windows CRLF line endings. The first Split removes the line endings; I then join the pieces back together, giving a string of just ones and zeros. Splitting that on the zeros gives an array containing only the groups of ones, so the Length of each array element is the number of ones in that group. If you want to write the subtotals back to the source (I am assuming a file), you will need to iterate through the string, count the ones, append each subtotal to the existing string, and write that back to the file.
Module Module1
Sub Main()
Dim splitFirst As String() = {vbCrLf}
Dim splitNext As String() = {"0"}
Dim testString As String = "1" & vbCrLf &
"1" & vbCrLf &
"1" & vbCrLf &
"0" & vbCrLf &
"0" & vbCrLf &
"0" & vbCrLf &
"1" & vbCrLf &
"1" & vbCrLf &
"1" & vbCrLf &
"1" & vbCrLf &
"0"
Dim results As String() = testString.Split(splitFirst, StringSplitOptions.RemoveEmptyEntries)
Dim holding As String = String.Join("", results)
results = holding.Split(splitNext, StringSplitOptions.RemoveEmptyEntries)
'Show the results
For Each item In results
Console.WriteLine(item & " Count = " & item.Length.ToString())
Next
Console.ReadLine()
End Sub
End Module
Here is the same idea as a Function returning a String Array with the Groups of 1's as individual items.
Public Function getColumnCounts(data As String) As String()
Dim splitFirst As String() = {vbCrLf} 'Separator used to strip CrLf's
Dim splitNext As String() = {"0"} 'Separator used to strip the 0's
'This is where the CrLf information is removed
Dim results As String() = data.Split(splitFirst, StringSplitOptions.RemoveEmptyEntries)
'Join the results array to make a string
Dim holding As String = String.Join("", results)
'Split it again to remove the blocks of zeros, leaving just groups of ones in the array
results = holding.Split(splitNext, StringSplitOptions.RemoveEmptyEntries)
'Return the results as a String Array
'For Example
'For Each item In getColumnCounts(testString)
' Console.WriteLine(item & " Count = " & item.Length.ToString())
'Next
Return results
End Function
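If only the numbers are needed (the largest run of 1s and the run currently in progress), a single pass over the lines also works without building intermediate strings. A rough sketch under the same assumption of one value per line; RunTotals, maxRun and currentRun are illustrative names, not part of the code above:

```vb
Imports System
Imports Microsoft.VisualBasic

Module RunTotals
    Sub Main()
        ' Sample data from the question, one value per line
        Dim ttl_count As String = String.Join(vbCrLf,
            New String() {"1", "1", "1", "0", "0", "0", "1", "1", "1", "1", "0"})

        Dim maxRun As Integer = 0      ' largest subtotal seen so far
        Dim currentRun As Integer = 0  ' subtotal of the group in progress

        Dim separators As String() = {vbCrLf, vbLf}
        For Each line As String In ttl_count.Split(separators, StringSplitOptions.RemoveEmptyEntries)
            If line.Trim() = "1" Then
                currentRun += 1
                If currentRun > maxRun Then maxRun = currentRun
            Else
                currentRun = 0 ' anything other than 1 ends the current group
            End If
        Next

        Console.WriteLine("Max = " & maxRun.ToString())
        ' currentRun is non-zero only when the last entry was a 1
        Console.WriteLine("Current = " & currentRun.ToString())
    End Sub
End Module
```

With the sample data this reports a maximum of 4 and a current subtotal of 0, because the last entry is a 0.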
Hi, I am using the code below to get the referrer URL:
sRef = encode(Request.ServerVariables("HTTP_REFERER"))
The code above is getting the following URL:
http://www.rzammit.com/pages/linux-form.asp?adv=101&loc=349&websync=233344-4555665-454&ptu=454545
From that URL I want to grab ONLY the adv and loc values (Request.QueryString doesn't work because this is a script which runs when the form is submitted).
So, to cut a long story short, I want to extract the values of the adv and loc parameters from the referrer URL.
Any help on how I can do this, please?
Below is the code I am currently using, but it has a problem: the parameters that come after loc are output as well. I want something dynamic, and the values of adv and loc can also be longer.
<%
sRef = Request.ServerVariables("HTTP_REFERER")
a=instr(sRef, "adv")+4
b=instr(sRef, "&loc")
response.write(mid(sRef ,a,b-a))
response.write("<br>")
response.write(mid(sRef ,b+5))
%>
Here is something to get you started; it uses regular expressions to get all URL variables for you. You can use the split() function to split them on the "=" sign and get a simple array, or put them in a dictionary or whatever.
Dim fieldcontent : fieldcontent = "http://www.rzammit.com/pages/linux-form.asp?adv=101&loc=349&websync=233344-4555665-454&ptu=454545"
Dim regEx, Matches, Item
Set regEx = New RegExp
regEx.IgnoreCase = True
regEx.Global = True
regEx.MultiLine = False
regEx.Pattern = "(\?|&)([a-zA-Z0-9]+)=([^&]*)"
Set Matches = regEx.Execute(fieldcontent)
For Each Item in Matches
response.write(Item.Value & "<br/>")
Next
Set regEx = Nothing
Take the substring after the "?".
Split it on "&".
Iterate over the array to find "adv=" and "loc=".
Below is the code:
Dim fieldcontent
fieldcontent = "http://www.rzammit.com/pages/linux-form.asp?adv=101&loc=349&websync=233344-4555665-454&ptu=454545"
fieldcontent = mid(fieldcontent,instr(fieldcontent,"?")+1)
Dim params
params = Split(fieldcontent,"&")
for i = 0 to ubound(params)
if instr(params(i),"adv=")>0 then
advvalue = mid(params(i),len("adv=")+1)
end if
if instr(params(i),"loc=")>0 then
locvalue = mid(params(i),5)
end if
next
You can use the following generic function:
function getQueryStringValueFromUrl(url, key)
dim queryString, queryArray, i, value
' check if a querystring is present
if inStr(url, "?") = 0 then
getQueryStringValueFromUrl = empty
exit function
end if
' extract the querystring part from the url
queryString = mid(url, inStr(url, "?") + 1)
' split the querystring into key/value pairs
queryArray = split(queryString, "&")
' see if the key is present in the pairs
for i = 0 to uBound(queryArray)
if inStr(queryArray(i), key & "=") = 1 then
value = mid(queryArray(i), len(key) + 2)
end if
next
' return the value or empty if not found
getQueryStringValueFromUrl = value
end function
In your case:
dim url
url = "http://www.rzammit.com/pages/linux-form.asp?adv=101&loc=349&websync=233344-4555665-454&ptu=454545"
response.write "ADV = " & getQueryStringValueFromUrl(url, "adv") & "<br />"
response.write "LOC = " & getQueryStringValueFromUrl(url, "loc")
I currently have a variable called databaseinformation in a script that I am writing. I want to separate this into two variables called Instance_Name and Database_Name.
The string in question:
[MSSQLSERVER]BESMgmt.BAK
In this instance the Instance_Name is MSSQLSERVER and the Database_Name is BESMgmt. The string will not necessarily end in .BAK but stripping the last four characters off the variable would be fine. The Instance_Name and Database_Name will change values and length.
Thanks for any help in advance
@Trinitrotoluene: Working sample code --
Option Explicit
Dim ToBeSplit, Instance_Name, Database_Name
Dim SplitMe
Dim Position
ToBeSplit = "[MSSQLSERVER]BESMgmt.BAK"
SplitMe = Split(ToBeSplit, "]")
If UBound(SplitMe) >= 1 Then
Instance_Name = SplitMe(0)
Database_Name = SplitMe(1)
End If
Instance_Name = Replace(Instance_Name, "[", "")
If InStr(Database_Name, ".") > 0 Then
Database_Name = Left(Database_Name, Len(Database_Name) - 4)
End If
Response.Write "Instance_Name = " & Instance_Name & "<br>"
Response.Write "Database_Name = " & Database_Name
Or do it with a regular expression object. This makes you more flexible, although some will say that you now have two problems:
Option Explicit
Dim regEx, matches
Dim myString : myString = "[MSSQLSERVER]BESMgmt.BAK"
set regEx = new RegExp
regEx.pattern = "\[(.*)\](.*).{4}"
set matches = regEx.Execute(myString)
msgbox "Instance_name: " & matches(0).subMatches(0) & vbNewLine & "Database_name: " & matches(0).subMatches(1)
I am developing an ASP.NET web application. I have a string (with a value from a database) containing multiple lines, which I put in a TextBox in multiline mode (a textarea).
The problem is that the string contains many empty lines, and I want to remove only the double line breaks.
Example of my textbox:
+++++++++++++++++++++++++++++++++++++++++++++++++++++
{empty}
{empty}
'This is some text in the textbox on line 3
'some text on line 4
{empty}
'some text on line 6
{empty}
{empty}
'some text on line 9
{empty}
+++++++++++++++++++++++++++++++++++++++++++++++++++++
Now somehow I want to remove lines 1 and 2, and lines 7 and 8.
Thanks in advance.
Here is the solution:
'now rebuild your example string
Dim Empty As String = Chr(13) & Chr(10)
Dim Sb As New System.Text.StringBuilder
Sb.Append("+++++++++++++++++++++++++++++++++++++++++++++++++++++")
Sb.Append(Empty)
Sb.Append(Empty)
Sb.Append(Empty & "This is some text in the textbox on line 3")
Sb.Append(Empty & "some text on line 4")
Sb.Append(Empty)
Sb.Append(Empty & "some text on line 6")
Sb.Append(Empty)
Sb.Append(Empty)
Sb.Append(Empty & "some text on line 9")
Sb.Append(Empty)
Sb.Append(Empty)
Sb.Append("+++++++++++++++++++++++++++++++++++++++++++++++++++++")
Dim YourString As String = Sb.ToString
MessageBox.Show(YourString)
'now replace the double empty
Dim result As String
result = YourString.Replace(Empty & Empty & Empty, Empty)
MessageBox.Show(result)
NOTE: This solution has been tested OK with Visual Studio 2010.
This will get rid of all empty lines.
Dim splt() As Char = New Char() {ControlChars.Lf, ControlChars.Cr}
Dim lines() As String = TextBox1.Text.Split(splt, StringSplitOptions.RemoveEmptyEntries)
TextBox1.Lines = lines
This looks like it will get rid of multiple newlines
Dim s As String = TextBox1.Text.Replace(Environment.NewLine, ControlChars.Cr)
Dim lines As New List(Of String)
lines.AddRange(s.Split(New Char() {ControlChars.Cr}))
For x As Integer = lines.Count - 1 To 1 Step -1
If lines(x) = "" AndAlso lines(x - 1) = "" Then
lines.RemoveAt(x)
End If
Next
TextBox1.Lines = lines.ToArray
The way I usually do this is to convert all of the various line breaks into a single one that I can manage, de-dupe and convert back to vbNewLine:
'//Convert all line break types to vbCr/ASCII 13
T = T.Replace(vbNewLine, vbCr).Replace(vbLf, vbCr)
'//Loop until all duplicate returns are removed
Do While T.Contains(vbCr & vbCr)
T = T.Replace(vbCr & vbCr, vbCr)
Loop
'//Check to see if the string has one at the start to remove
If T.StartsWith(vbCr) Then T = T.TrimStart(Chr(13))
'//Convert back to standard windows line breaks
T = T.Replace(vbCr, vbNewLine)
The following code removes double empty lines at the beginning, and also double empty lines anywhere in the textbox.
Dim myText as String = TextBox1.Text
myText = Regex.Replace(myText, "^(\r\n\r\n)(.*)", "$2")
myText = Regex.Replace(myText, "(.*\r\n)(\r\n\r\n)(.*)", "$1$3")
TextBox1.Text = myText
In the example given, it would remove lines 1 and 2, and lines 7 and 8.
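The answers above interpret the question in different ways (drop every empty line, or collapse repeats into one). A line-based sketch of one more reading, assuming the goal is to drop each run of two or more consecutive empty lines while keeping single empty lines such as line 5 in the example; BlankLineRuns and RemoveBlankRuns are hypothetical names, and lines containing only whitespace are treated as empty:

```vb
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

Module BlankLineRuns
    ' Hypothetical helper: removes runs of 2+ empty lines, keeps single empty lines.
    Function RemoveBlankRuns(ByVal text As String) As String
        ' Normalise CRLF and lone CR to LF so one split handles every line-break style
        Dim lines As String() = text.Replace(vbCrLf, vbLf).Replace(vbCr, vbLf).Split(ControlChars.Lf)
        Dim kept As New List(Of String)
        Dim i As Integer = 0
        While i < lines.Length
            If lines(i).Trim().Length = 0 Then
                ' Measure how long this run of empty lines is
                Dim runStart As Integer = i
                While i < lines.Length AndAlso lines(i).Trim().Length = 0
                    i += 1
                End While
                ' A single empty line survives; longer runs are dropped entirely
                If i - runStart = 1 Then kept.Add("")
            Else
                kept.Add(lines(i))
                i += 1
            End If
        End While
        Return String.Join(vbCrLf, kept.ToArray())
    End Function

    Sub Main()
        Dim sample As String = vbCrLf & vbCrLf & "line 3" & vbCrLf & "line 4" & vbCrLf & vbCrLf & "line 6"
        Console.WriteLine(RemoveBlankRuns(sample))
    End Sub
End Module
```

If the intent is instead to collapse each run down to one empty line, changing the condition on the run length to always add a single "" would do that.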
Here's my code
Dim RefsUpdate As String() = Session("Refs").Split("-"C)
Dim PaymentsPassedUpdate As String() = Session("PaymentsPassed").Split("-"C)
Dim x as Integer
For x = 1 to RefsUpdate.Length - 1
Dim LogData2 As sterm.markdata = New sterm.markdata()
Dim queryUpdatePaymentFlags as String = ("UPDATE OPENQUERY (db,'SELECT * FROM table WHERE ref = ''"+ RefsUpdate(x) +"'' AND bookno = ''"+ Session("number") +"'' ') SET alpaid = '"+PaymentsPassedUpdate(x) +"', paidfl = 'Y', amountdue = '0' ")
Dim drSetUpdatePaymentFlags As DataSet = Data.Blah(queryUpdatePaymentFlags)
Next
I don't get any errors from this, but it doesn't seem to be working as it should.
I'm passing booking refs like this: AA123456 - BB123456 - CC123456 - etc., and payments like this: 50000 - 10000 - 30000 -
I basically need to update the db so that, for the ref AA123456, the alpaid field has 50000 in it.
I can't seem to get it to work.
Any ideas?
Thanks
Jamie
I'm not sure what isn't working, but I can tell you that you are not going to process the first entry in your arrays. You are going from 1 to Length - 1, and since arrays are zero-based, index 0 is skipped. Therefore, unless your input strings start with "-", you will miss the first one.
The indexing problem Mark mentioned is only one issue, but it will cause trouble. I'd say the root of your problem is that the strings are not trimmed. Your database probably doesn't have leading or trailing spaces in its data, so you'll need to do something like:
Dim refsUpdateString as string = RefsUpdate(x).Trim()
Dim paymentsPassedUpdateString as string = PaymentsPassedUpdate(x).Trim()
...
Dim queryUpdatePaymentFlags as String = ("UPDATE OPENQUERY (db,'SELECT * FROM table WHERE ref = ''" & refsUpdateString & "'' AND bookno = ''" & Session("number") & "'' ') SET alpaid = '" & paymentsPassedUpdateString & "', paidfl = 'Y', amountdue = '0' ")
Also, I would recommend sticking with the VB way of concatenation and using the & operator for it.
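To see the effect of the indexing and trimming points together, here is a small sketch that only splits and pairs the two strings, without touching the database. The sample values come from the question; SplitAndPair and SplitTrimmed are made-up names for illustration:

```vb
Imports System
Imports System.Collections.Generic

Module SplitAndPair
    ' Hypothetical helper: splits on "-" and drops empty or whitespace-only pieces.
    Function SplitTrimmed(ByVal value As String) As List(Of String)
        Dim parts As New List(Of String)
        For Each piece As String In value.Split("-"c)
            Dim trimmed As String = piece.Trim()
            If trimmed.Length > 0 Then parts.Add(trimmed)
        Next
        Return parts
    End Function

    Sub Main()
        Dim refs As List(Of String) = SplitTrimmed("AA123456 - BB123456 - CC123456")
        Dim payments As List(Of String) = SplitTrimmed("50000 - 10000 - 30000 -")

        ' Start at 0 so the first ref/payment pair is included
        For x As Integer = 0 To Math.Min(refs.Count, payments.Count) - 1
            Console.WriteLine(refs(x) & " -> " & payments(x))
        Next
    End Sub
End Module
```

Filtering after Trim also takes care of the trailing "-" in the payments string, which would otherwise leave a whitespace-only last element.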