I have an HTML file and I want to find a particular kind of URL inside it:
It starts with http://abcd.com/Admin/CoreFolder/Resources/
It points to a JPG file, e.g. http://abcd.com/Admin/CoreFolder/Resources/ff8127aa-bf81-4833-ba12-edf2f366ccbb.jpg
Can you help me with a regex for this? I want to find all occurrences and replace abcd.com with xyz.com.
http.*?(abcd\.com)[^\s]*?\.jpg
DEMO : http://regexr.com?3853l
Then you can replace the first capture group (the part within parentheses) with xyz.com
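As a rough illustration of the find-and-replace step, here is a minimal Python sketch. The sample HTML, the `swap_host` helper and the choice to capture the text around the host (instead of the host itself) are assumptions made for this example; the question did not name a language.

```python
import re

# Hypothetical sample HTML; only the Resources .jpg link should change.
html = (
    '<img src="http://abcd.com/Admin/CoreFolder/Resources/'
    'ff8127aa-bf81-4833-ba12-edf2f366ccbb.jpg">\n'
    '<a href="http://abcd.com/Admin/Other/page.html">other</a>'
)

# Group 1 is the scheme, group 2 is the path up to ".jpg".
# The host sits between the two groups, so it is simply not copied back.
pattern = re.compile(
    r'(http://)abcd\.com(/Admin/CoreFolder/Resources/[^\s"\']*?\.jpg)'
)

def swap_host(text: str) -> str:
    """Replace abcd.com with xyz.com in matching .jpg URLs only."""
    return pattern.sub(r'\1xyz.com\2', text)

result = swap_host(html)
print(result)
```

The link outside /Admin/CoreFolder/Resources/ is left untouched because it does not match the path part of the pattern.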
This is a RegEx that will match any .jpg link starting with http://abcd.com/Admin/CoreFolder/Resources/
(http://abcd\.com/Admin/CoreFolder/Resources/(.+?)\.jpg)
Example test on RegExr
Using something like sed to do the search and replace (you didn't actually give enough information about your environment - like "what language do you want to implement your regex in"?):
sed 's/\(http:\/\/\)\(abcd\.com\)\(\/Admin\/CoreFolder\/Resources\/[^[:space:]]*\.jpg\)/\1xyz.com\3/g' myFile.html > newFile.html
I am creating a script to hash some parts of a text, and I would like to do it using sed. Here is what an example string looks like:
..HASHSTART.......HASHSTART..HASHEND..HASHEND..HASHSTART.....
.....HASHEND....HASHSTART.........
I have three problems with it:
I would like to match "HASHSTART" and "HASHEND" properly. For example, once I find the first HASHSTART, I would like to hash everything up to the first HASHEND that follows, even if there are more HASHSTARTs in between.
I would like to hash it even if there is a newline between HASHSTART and HASHEND.
If there is a HASHSTART with no HASHEND before the end of the file, I would like to hash everything after it.
By "hash" I mean changing . to #.
So with this input the output should be:
..HASHSTART##################HASHEND..HASHEND..HASHSTART#####
#####HASHEND....HASHSTART#########
Thank you very much for any help.
Like this?
awk -F 'HASHSTART' -vw=3 '{i=index($w,"HASHEND");if(i)print substr($w,1,i-1)}' file
-vw=3 sets the field position (2 or greater)
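For comparison, the three requirements from the question can be sketched in Python instead of sed or awk (a swapped-in tool, not what the question asked for). The function name `hash_regions` is made up for this example, and it assumes that every character between the markers except the newline becomes #, which is what the expected output in the question shows (the nested HASHSTART is masked too).

```python
import re

def hash_regions(text: str) -> str:
    """Mask everything between HASHSTART and the next HASHEND with '#'.

    An unterminated HASHSTART is masked up to the end of the text.
    Newlines inside a region are kept so the line structure survives.
    """
    def mask(m: re.Match) -> str:
        body = re.sub(r'[^\n]', '#', m.group(2))
        return m.group(1) + body + m.group(3)

    # .*? is lazy, so it stops at the first HASHEND; DOTALL lets it cross
    # newlines; \Z covers a HASHSTART that is never closed.
    return re.sub(r'(HASHSTART)(.*?)(HASHEND|\Z)', mask, text, flags=re.DOTALL)

sample = (
    "..HASHSTART.......HASHSTART..HASHEND..HASHEND..HASHSTART.....\n"
    ".....HASHEND....HASHSTART........."
)
print(hash_regions(sample))
```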
echo "1,a,20,000,aa,s" | sed 's/,\([^0]\)/|\1/g'
Output:
1|a|20,000|aa|s
Please explain the above command.
I am unable to understand this execution.
The given command uses sed to substitute certain characters for other characters.
The basic form for this is
s/FIND/REPLACE/
where FIND is a regular expression and REPLACE is the replacement text (which may refer back to parts of the match).
The g at the end stands for global. It means that not only the first occurrence of a pattern matching FIND is replaced but all occurrences in the input string.
To the regular expressions used:
FIND ,\([^0]\) This pattern matches every two-character string that starts with a , followed by any character other than 0. The \( \) pair captures that second character.
REPLACE |\1 This produces a two-character string that starts with a | followed by the second character matched by FIND. (The \1 refers back to what the first capture group matched.)
For a detailed overview of the sed commands I suggest you also read here: http://www.grymoire.com/Unix/Sed.html#uh-1
And to look up on how to read regular expressions: http://www.grymoire.com/Unix/Regular.html
Of course there are many more sites on this topic if the above pages do not help.
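To see the same substitution outside sed, here is a small Python sketch of the equivalent call. Python writes the capture group as ( ) rather than \( \), and the global behaviour of the g flag is the default for `re.sub`; the variable names are only for illustration.

```python
import re

line = "1,a,20,000,aa,s"

# ,([^0])  -> a comma followed by one character that is not '0' (captured)
# |\1      -> a pipe followed by the captured character
converted = re.sub(r',([^0])', r'|\1', line)
print(converted)  # 1|a|20,000|aa|s
```

The comma in 20,000 survives because it is followed by a 0, so the pattern does not match there.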
I have searched on Google and Stack Overflow but didn't find a good answer. I also tried it myself, but I am no regex guru.
My goal is to replace all relative URLs in an HTML style attribute with the absolute version.
e.g.
style="url(/test.png)" with style="url(http://mysite.com/test.png)"
style="url("/test.png")" with style="url("http://mysite.com/test.png")"
style="url('/test.png')" with style="url('http://mysite.com/test.png')"
style="url(../test.png)" with style="url(http://mysite.com/test.png)"
style="url("../test.png")" with style="url("http://mysite.com/test.png")"
style="url('../test.png')" with style="url('http://mysite.com/test.png')"
and so on.
Here is what I tried with my poor regex skills:
url\((?<Url>[^\)]*)\)
gives me the url in the "url" function.
thanks in advance!
Well, you can try the regex:
style="url\((['"])?(?:\.\.)?(?<url>[^'"]+)\1?\)"
And replace with:
style="url($1http://mysite.com$2$1)"
regex101 demo
(['"])? will capture quotes if they are present and use them again at \1?
(?<url>[^'"]+) will capture the url itself; it is the second group, which is why the replacement refers to it as $2.
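A minimal Python sketch of the same idea follows. It is a variation rather than the exact regex above: the quote group is written as (['"]?) so that it always participates in the match, the path must start with /, and the `absolutize` helper and BASE constant are names invented for this example. Like the answer, it simply drops a leading "..".

```python
import re

BASE = "http://mysite.com"  # assumed site root, taken from the question's examples

# Group 1: optional quote character (may be empty).
# Group 2: the path, starting at the slash after an optional "..".
URL_RE = re.compile(r'''url\((['"]?)(?:\.\.)?(/[^'")]+)\1\)''')

def absolutize(style: str) -> str:
    """Prefix root-relative url(...) values with BASE, keeping the quotes."""
    return URL_RE.sub(lambda m: f"url({m.group(1)}{BASE}{m.group(2)}{m.group(1)})", style)

samples = [
    'style="url(/test.png)"',
    '''style="url('/test.png')"''',
    'style="url("../test.png")"',
    'style="url(http://mysite.com/test.png)"',  # already absolute
]
converted_styles = [absolutize(s) for s in samples]
for s in converted_styles:
    print(s)
```

URLs that are already absolute are left alone because the path group has to begin with a slash.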
I have a RegEx problem. Consider the following URL:
http://ab.cdefgh.com/aa-BB/index.aspx
I need a regular expression that looks at "aa-BB" and, if it doesn't
match a number of specific values, say:
rr-GG
vv-VV
yy-YY
zz-ZZ
then the URL should redirect to some place. For example:
http://ab.cdefgh.com/notfound.aspx
In web.config I have urlrewrite rules. I need to know what
the regex would be between the tags.
<urlrewrites>
<rule>
<url>?</url>
<rewrite>http://ab.cdefgh.com/notfound.aspx</rewrite>
</rule>
</urlrewrites>
Assuming you don't care about the potential for the replacement pattern to be in the domain name or some other level of the directory structure, this should select on the pattern you're interested in:
http:\/\/ab\.cdefgh\.com\/(?:aa\-BB|rr\-GG|vv\-VV|yy\-YY|zz\-ZZ)\/index\.aspx
where the aa-BB, etc. patterns are simply "or"ed together using the | operator.
To break this apart further: the . characters are escaped with a \ so the regex treats them as literal dots rather than "any character". The / and - characters are escaped here as well, which is harmless; strictly, / only needs escaping where it is the regex delimiter, and - only inside a character class. The (?: notation groups the things being "or"ed without storing the match in a backreference (this is slightly more efficient if you don't care about retaining the value selected).
Here is a link to a demonstration (maybe this can help you play around with the regex here to get to exactly which character combinations you want)
http://rubular.com/r/UfB65UyYrj
Will this help?
^([a-z])\1-([A-Z])\2.*
It matches:
uu-FF/
aa-BB/
bb-CC/index
But not
aaBB
asdf
ba-BB
aA-BB
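The match and non-match lists above can be checked with a short Python sketch; the backreference syntax \1 and \2 is the same in Python's re module. The list names are only for illustration.

```python
import re

# \1 repeats the captured lowercase letter, \2 the captured uppercase letter.
doubled = re.compile(r'^([a-z])\1-([A-Z])\2.*')

should_match = ["uu-FF/", "aa-BB/", "bb-CC/index"]
should_not_match = ["aaBB", "asdf", "ba-BB", "aA-BB"]

matched = [s for s in should_match + should_not_match if doubled.match(s)]
print(matched)
```

Note that this pattern accepts any doubled-letter pair, not a specific list of values.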
(Edit based on comment)
Just pipe-delimit your desired URLs inside ( ) and escape any special characters.
Eg.
^(xx-YY|yy-ZZ|aa-BB|goodStuff)/.*
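But I think you might actually want the following, which matches anything other than the URLs that you specify, so that everything else goes to notfound.aspx. Note that a character class such as [^...] negates single characters, not whole alternatives, so a negative lookahead is needed:
^(?!(?:xx-YY|yy-ZZ|aa-BB|goodStuff)/).*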
Assuming you want anything but xx-XX, yy-YY and zz-ZZ to redirect (a negative lookahead, since a [^...] character class cannot exclude whole strings):
^(?!(?:xx-XX|yy-YY|zz-ZZ)/).*
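The "redirect everything except the listed values" idea can be sketched in Python with a negative lookahead. This is only an illustration under assumptions: the `should_redirect` helper is invented for this example, the allowed values are the four from the question, and whether the rewrite module in web.config accepts lookaheads depends on its regex engine (.NET's Regex class does).

```python
import re

# Matches the URL only when the first path segment is NOT one of the
# allowed values; (?!...) checks what follows without consuming it.
NOT_ALLOWED = re.compile(
    r'^http://ab\.cdefgh\.com/(?!(?:rr-GG|vv-VV|yy-YY|zz-ZZ)/)[^/]+/index\.aspx$'
)

def should_redirect(url: str) -> bool:
    """True when the URL should be sent to notfound.aspx."""
    return NOT_ALLOWED.match(url) is not None

print(should_redirect("http://ab.cdefgh.com/aa-BB/index.aspx"))
print(should_redirect("http://ab.cdefgh.com/rr-GG/index.aspx"))
```

The trailing / inside the lookahead keeps a segment such as rr-GGx from being treated as the allowed value rr-GG.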
Sample text =
legacycard.ashx?save=false&iNo=3&No=555
Sample pattern =
^legacycard.ashx(.*)No=(\d+)
Want to grab group #2 value of "555" (the value of "No=" in the sample text)
In Expresso, this works, but in ASP.NET UrlRewrite, it is not catching.
Am I missing something?
Thanks!
I would do something along these lines:
^legacycard\.ashx\?(?:.+&)*No=(\d+)
The \? escapes the question mark that separates the URL path from the parameters. The (?:.+&)* part then consumes any key/value pairs (anything ending in &) that come before the parameter you actually care about. Using ?: makes that set of brackets non-capturing (I'm assuming you won't need that data, and it can slightly speed up the regex), which leaves just 555 captured. An added benefit of this approach is that it works regardless of parameter order.
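Here is a small Python sketch of this pattern against the sample text, with the dot escaped. The second half cross-checks the result with the standard library's query-string parser, which is a different technique from the regex and is included only as a sanity check; the variable names are made up for this example.

```python
import re
from urllib.parse import parse_qs, urlsplit

sample = "legacycard.ashx?save=false&iNo=3&No=555"

# (?:.+&)* skips any earlier key/value pairs without capturing them,
# so the only capture group is the value of No.
match = re.search(r'^legacycard\.ashx\?(?:.+&)*No=(\d+)', sample)
no_value = match.group(1) if match else None
print(no_value)

# Cross-check: parse the query string instead of pattern matching.
parsed_no = parse_qs(urlsplit(sample).query)["No"][0]
print(parsed_no)
```

Because the regex requires either the ? or an & directly before No=, a URL that only contains iNo= is not mistaken for No=.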
Just use this regex:
^legacycard\.ashx\?save=(false|true)&iNo=(?<ino>\d+)&No=(?<no>\d+)
Then Regex Replace with
${no}
Looks fine to me; your regex should match the entire string
legacycard.ashx?save=false&iNo=3&No=555
I'm not sure why you have two groups, but the groups should return
?save=false&iNo=3&
and
555
For good measure, you should know that the . in legacycard.ashx is also interpreted by the regex engine, and you would normally escape it. In this case it doesn't matter, because an unescaped dot matches any single character, including a literal dot. :)
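The group contents described above can be confirmed with a short Python sketch using the pattern exactly as posted in the question. This only shows what the regex itself captures; it says nothing about how ASP.NET UrlRewrite applies the pattern.

```python
import re

sample = "legacycard.ashx?save=false&iNo=3&No=555"

# Pattern as posted in the question, unescaped dot included.
m = re.match(r'^legacycard.ashx(.*)No=(\d+)', sample)

whole = m.group(0)
group1 = m.group(1)  # greedy .* runs up to the last "No="
group2 = m.group(2)
print(repr(group1), repr(group2))
```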
Try this
^legacycard.ashx(\?No=|.*?&No=)(\d+)
this should work.