Replace all URLs with hyperlinks - asp.net

I have the following code, which uses a regex to find all the URLs within a given string:
Dim regex As New Regex("(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9#:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9#:%_\+.~#?&//=]*)", RegexOptions.IgnoreCase)
Now, I want to replace all the matches with hyperlinks:
For Each match As Match In matches
strtemp &= strtemp.Replace(match, "<a target='_blank' href='mailto:" & match & "'>" & match & "</a>")
Next
The regex works fine but there is an issue while replacing. Suppose my input string is as follows:
www.google.com is as same as google.com and also http://google.com
The code first replaces www.google.com with an <a> element. Then, when the second match (google.com) comes up, the replacement also rewrites the google.com inside the link that was just created. How can I replace each match only once?

If you use Regex.Replace, it will work correctly, since it will replace each occurrence as it finds them rather than replacing all other matches at the same time:
Dim pattern As String = "(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9#:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9#:%_\+.~#?&//=]*)"
Dim regex As New Regex(pattern, RegexOptions.IgnoreCase)
Dim input As String = "www.google.com is as same as google.com and also http://google.com"
Dim output As String = regex.Replace(input, "<a target='_blank' href='mailto:$&'>$&</a>")
However, if you are just going to recreate the Regex object each time you call it, you could just use the static Regex.Replace method instead.
The $& is a special substitution expression which instructs the Replace method to insert the entire match at that point in the replacement string. For other substitution expressions, see the section on substitutions in the MSDN regular expression quick reference page.
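As a rough sketch of the static form mentioned above: the Linkify helper and the simplified UrlPattern below are illustrative assumptions, not the pattern from the question. The href here omits the mailto: prefix because the targets are web addresses; matches without a scheme would still need "http://" prepended to work as absolute links.

```vb
Imports System
Imports System.Text.RegularExpressions

Module LinkifyExample
    ' Simplified, illustrative pattern (an assumption; substitute your own).
    Private Const UrlPattern As String =
        "(https?://)?(www\.)?[a-z0-9-]+(\.[a-z0-9-]+)*\.[a-z]{2,6}\b"

    ' Wraps every match in an anchor; $& stands for the whole matched text.
    Public Function Linkify(input As String) As String
        Return Regex.Replace(input, UrlPattern,
                             "<a target='_blank' href='$&'>$&</a>",
                             RegexOptions.IgnoreCase)
    End Function

    Sub Main()
        Console.WriteLine(Linkify("www.google.com is the same as google.com"))
    End Sub
End Module
```

Because the regex engine moves past each match after replacing it, text inside an already generated link is never matched a second time.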

Related

Change all href tags to lower case

I have a string that contains multiple href tags such as:
href="HTTPS://www.example.com" and href="hTtp://www.example.com"
I need to change it to lower case, example:
href="https://www.example.com" and href="http://www.example.com"
What do I need to do with the code below to achieve this?
Dim strRegex As String = "<?href\s*=\s*[""'].+?[""'][^>]*?"
Dim myRegex As New Regex(strRegex, RegexOptions.IgnoreCase)
For Each myMatch As Match In myRegex.Matches(StringThatContainsHrefTags)
If myMatch.Success Then
Dim pattern As String
pattern = "http(s)?://([\w+?\.\w+])+([a-zA-Z0-9\~\!\#\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?"
If Regex.IsMatch(myMatch.ToString(), pattern) Then
'what do I do here?
End If
End If
Next
You can do it like this if you do not want to parse the HTML content.
Dim Input As String = "something <a href=""HTTPS://www.example.com"">asdf</a> and then <a href=""hTtp://www.example.com"">asdf</a>"
Dim Output As String = Regex.Replace(Input, "<a\s+[^>]*href=""(?<Protocol>https?):", New MatchEvaluator(Function(X) X.Value.ToLower), RegexOptions.IgnoreCase)
This will handle only the HTTP/HTTPS part. You can extend the regex to the entire URL and perform other validations from your code if needed.
The code takes a shortcut: it lowercases the entire match, which works only because the pattern contains nothing but the tag prefix and the protocol. If you extend the pattern, some tuning with X.Groups("Protocol").Value would be needed.
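One possible way to do that tuning is sketched below. The Prefix group and the NormalizeProtocols and LowerProtocol names are hypothetical additions for illustration; only the Protocol group comes from the answer above.

```vb
Imports System
Imports System.Text.RegularExpressions

Module HrefProtocolExample
    ' Prefix = everything from "<a" up to the opening quote; Protocol = http or https.
    Private Const Pattern As String =
        "(?<Prefix><a\s+[^>]*href\s*=\s*[""'])(?<Protocol>https?)(?=:)"

    ' Rebuilds the match with only the Protocol group lowercased.
    Private Function LowerProtocol(m As Match) As String
        Return m.Groups("Prefix").Value & m.Groups("Protocol").Value.ToLowerInvariant()
    End Function

    Public Function NormalizeProtocols(html As String) As String
        Return Regex.Replace(html, Pattern, AddressOf LowerProtocol, RegexOptions.IgnoreCase)
    End Function
End Module
```

Unlike lowercasing the whole match, this leaves the tag name and other attributes exactly as they were written.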

string.split()(1) removes first character

I am running into an issue: when I split a string on "_Pub" and take the back half of the string, the first character is removed. I don't understand why, or how to fix it other than adding the character back in.
strFilePath = "/C:/Dev/Edge/_Publications/Ann Report/2013-2016/2016 Edge.pdf"
Dim relPath = strFilepath.Split("_Publications")(1)
lb.CommandArgument = relPath
returns Publications\Ann Report\2013-2016\2016 Edge.pdf
What you have as a delimiter is not a string array (String()) but a regular string. You need to pass a string array to use a whole string as a delimiter; otherwise only the first character of your string is used.
https://msdn.microsoft.com/en-us/library/tabh47cf(v=vs.110).aspx
Try this:
Dim relPath = strFilepath.Split(new string() {"_Publications"}, StringSplitOptions.RemoveEmptyEntries)(1)
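The difference can be seen in a small sketch. It assumes the question's call ended up splitting on the single character "_" (which is what happens when the string argument is narrowed to a Char, for example with Option Strict Off on .NET Framework); the shortened path and the variable names are only illustrative.

```vb
Imports System

Module SplitDemo
    Sub Main()
        Dim filePath As String = "/C:/Dev/Edge/_Publications/Ann Report/2016 Edge.pdf"

        ' Splitting on the single character "_"c: only the underscore is removed.
        Dim byChar As String() = filePath.Split("_"c)
        Console.WriteLine(byChar(1))    ' Publications/Ann Report/2016 Edge.pdf

        ' Splitting on the whole string: the entire delimiter is removed.
        Dim byString As String() = filePath.Split(New String() {"_Publications"}, StringSplitOptions.None)
        Console.WriteLine(byString(1))  ' /Ann Report/2016 Edge.pdf
    End Sub
End Module
```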
It appears that you want to get the part of the path starting at some directory. Splitting the path might not be such a good idea: imagine if there was a file "My_Publications_2017.pdf" in a directory "C:\Dev\Edge\_Publications". The split as you intended in the question would give the array of strings {"C:\Dev\Edge\", "\My", "_2017.pdf"}. As has been pointed out elsewhere, the String.Split you used doesn't do that anyway.
A more robust way would be to find where the starting directory's name is in the full path and get the substring of the path starting with it, e.g.:
Function GetRelativePath(fullPath As String, startingDirectory As String) As String
    ' Fix some errors in how the fullPath might be supplied:
    Dim tidiedPath = Path.GetFullPath(fullPath.TrimStart("/".ToCharArray()))
    Dim sep = Path.DirectorySeparatorChar
    Dim pathRoot = sep & startingDirectory.Trim(New Char() {sep}) & sep
    Dim i = tidiedPath.IndexOf(pathRoot)
    If i < 0 Then
        Throw New DirectoryNotFoundException($"Cannot find {pathRoot} in {fullPath}.")
    End If
    ' There will be a DirectorySeparatorChar at the start - do not include it
    Return tidiedPath.Substring(i + 1)
End Function
So,
Dim s = "/C:/Dev/Edge/_Publications/Ann Report/2013-2016/2016 Edge.pdf"
Console.WriteLine(GetRelativePath(s, "_Publications"))
Console.WriteLine(GetRelativePath(s, "\Ann Report"))
outputs:
_Publications\Ann Report\2013-2016\2016 Edge.pdf
Ann Report\2013-2016\2016 Edge.pdf
Guessing that you might have several malformed paths starting with a "/" and using "/" as the directory separator character instead of "\", I put some code in to mitigate those problems.
The Split() function is supposed to exclude the entire delimiter from the result. Could you re-check & confirm your input and output strings?

VB.net / asp.net: Get URL without filename

I want to set a link to a file dynamically in VB.NET.
My URL looks like this:
http://server/folder/folder2/file.aspx?get=param
I tried to use Request.Url, but I have not found a way to get only
http://server/folder/folder2/
without the query string and without the filename.
Please help.
Dim url = Request.Url
Dim result = String.Format(
    "{0}{1}",
    url.GetLeftPart(UriPartial.Authority),
    String.Join(String.Empty, url.Segments.Take(url.Segments.Length - 1))
)
You can easily get a relative file path using the Request instance and then work with that; the Path class ought to help:
Dim relativePath = Request.AppRelativeCurrentExecutionFilePath
Dim relativeDirectoryPath = System.IO.Path.GetDirectoryName(relativePath)
It's worth noting that GetDirectoryName might transform your slashes, so you could expand the path:
Dim mappedPath = HttpContext.Current.Server.MapPath(relativeDirectoryPath)
So, to remove redundancy, we could shorten this:
Dim directoryPath = _
    Server.MapPath( _
        System.IO.Path.GetDirectoryName( _
            Request.AppRelativeCurrentExecutionFilePath))
But you'll need to check for possible exceptions.
You can use Uri.Host to get the computer name and then Uri.Segments (an array) to get everything up to the filename, for example:
Dim baseUrl = Request.Url.Host & Request.Url.Segments(0) & Request.Url.Segments(1) & Request.Url.Segments(2)
This will give you: server/folder/folder2/
If you have a variable number of segments, you can iterate over them and ignore the last one.
I hope that might help :)
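Iterating over the segments and ignoring the last one could look roughly like this. GetDirectoryUrl is a hypothetical helper name, and the sketch assumes the last segment is a file name whenever it does not end with a slash.

```vb
Imports System
Imports System.Text

Module UrlDirectoryExample
    Public Function GetDirectoryUrl(url As Uri) As String
        ' Scheme and host (plus port, if any), e.g. "http://server".
        Dim builder As New StringBuilder(url.GetLeftPart(UriPartial.Authority))
        Dim segments As String() = url.Segments
        Dim count As Integer = segments.Length
        ' Treat the last segment as a file name unless it ends with "/".
        If count > 0 AndAlso Not segments(count - 1).EndsWith("/") Then
            count -= 1
        End If
        For i As Integer = 0 To count - 1
            builder.Append(segments(i))
        Next
        Return builder.ToString()
    End Function
End Module
```

In a page, it could be called as GetDirectoryUrl(Request.Url); the query string is dropped because it is not part of Uri.Segments.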

regular expression validator in vb.net/asp.net custom expressions

Using the custom expression validator, I want to make sure that no ' or " can be entered. I tried [^"'], but whenever I enter normal letters, that fails validation as well.
I would just do a regex replace to match ['"] and replace it with nothing (i.e. "").
In VB.NET, the code would look like:
Dim strTest As String = "This ""is"" a test'"
Dim regex As Regex = New Regex("[" & Chr(34) & "']", RegexOptions.IgnorePatternWhitespace)
MsgBox(regex.Replace(strTest, ""))
The output will be: This is a test
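If the goal is to reject the input instead of stripping the quotes, note that [^"'] on its own matches exactly one character, and the validator expects the whole value to match, which would explain why ordinary words fail. A minimal sketch of the anchored, repeated form follows; HasNoQuotes and NoQuotesPattern are hypothetical names, and in a RegularExpressionValidator the equivalent ValidationExpression would presumably be [^"']*.

```vb
Imports System
Imports System.Text.RegularExpressions

Module QuoteValidationExample
    ' ^...$ anchors the pattern so that every character must be a non-quote.
    Private Const NoQuotesPattern As String = "^[^""']*$"

    Public Function HasNoQuotes(value As String) As Boolean
        Return Regex.IsMatch(value, NoQuotesPattern)
    End Function

    Sub Main()
        Console.WriteLine(HasNoQuotes("normal letters"))   ' True
        Console.WriteLine(HasNoQuotes("it's"))              ' False
    End Sub
End Module
```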

Concept needed for building consistent urls (routes)

My project has the need to build consistent urls similar to the ones here on stackoverflow. I know how I "can" do it by running the string through multiple filters, but I'm wondering if I can do it all with a single method.
Basically, I want to remove all special characters and replace them with dashes, BUT if there are multiple dashes in a row, I need them to be a single dash. How can I implement this as cleanly as possible?
Example: If I were to use the following string.
My #1 Event
My regex would create the following string
my--1-event
Notice how there are two dashes (one for the space and one for the "#" symbol). What I need is
my-1-event
Here's how I'm implementing it currently
' <System.Runtime.CompilerServices.Extension()>
Public Function ToUrlFriendlyString(ByVal input As String) As String
    Dim reg As New Regex("[^A-Za-z0-9]")
    ' I could run a loop filter here to match "--" and replace it with "-"
    ' but that seems like more overhead than necessary.
    Return (reg.Replace(Trim(input), "-"))
End Function
And then all I do is call the extension method
Dim UrlFriendlyString = MyTile.ToUrlFriendlyString
Thanks in advance.
Add a + to the end of the regex.
This will tell it to match one or more characters that match the character class that precedes the +.
Also, you should create your Regex instance in a Shared field outside the method so that .NET won't need to parse the regex again every time you call the method.
[edited by rockinthesixstring]: here's the final result
Private UrlRegex As Regex = New Regex("[^a-z0-9]+", RegexOptions.IgnoreCase)
<System.Runtime.CompilerServices.Extension()>
Public Function ToUrlFriendlyString(ByVal input As String) As String
Return (UrlRegex.Replace(Trim(input), "-"))
End Function
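A slightly extended sketch of the same idea, in case lowercase output without leading or trailing dashes is wanted (as in the my-1-event example from the question). ToSlug, SlugExtensions and NonAlphanumeric are hypothetical names, and lowercasing before matching is a design choice of this sketch, not part of the accepted result above.

```vb
Imports System
Imports System.Runtime.CompilerServices
Imports System.Text.RegularExpressions

Module SlugExtensions
    ' One or more characters that are not lowercase ASCII letters or digits.
    Private ReadOnly NonAlphanumeric As New Regex("[^a-z0-9]+")

    <Extension()>
    Public Function ToSlug(ByVal input As String) As String
        If String.IsNullOrEmpty(input) Then Return String.Empty
        ' Lowercase first, then collapse each run of other characters into one dash.
        Dim dashed As String = NonAlphanumeric.Replace(input.ToLowerInvariant(), "-")
        ' Remove dashes left over from leading or trailing punctuation.
        Return dashed.Trim("-"c)
    End Function
End Module
```

With this, "My #1 Event".ToSlug() yields my-1-event.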
Another way I do this, without using a regex and a little simpler to understand, is the following.
Excuse my VB, as I am mainly a C# guy.
' <System.Runtime.CompilerServices.Extension()>
Public Function ToUrlFriendlyString(ByVal input As String) As String
    If String.IsNullOrEmpty(input) Then
        Return String.Empty
    End If
    Dim builder As New StringBuilder()
    Dim slug = input.Trim().ToLowerInvariant()
    For Each c As Char In slug
        Select Case c
            Case " "c
                builder.Append("-")
            Case "&"c
                builder.Append("and")
            Case Else
                ' Keep only digits and lowercase ASCII letters; drop everything else.
                If (c >= "0"c AndAlso c <= "9"c) OrElse (c >= "a"c AndAlso c <= "z"c) Then
                    builder.Append(c)
                End If
        End Select
    Next
    Return builder.ToString()
End Function
