Finding the first image in content - ASP.NET

How can I find an image in a piece of content? I have a method that I call from my .aspx page to remove all HTML tags, like this: Usage.DeleteHtml(Eval("content").ToString())
But I don't want to delete the img tag from the content. I need to find the first image and show it on my page, like this: <img src="Usage.FindImage("content")" />
But I couldn't write a method for finding the image.
My DeleteHtml method:
public static string DeleteHtml(string text)
{
string mystr = Regex.Replace(text, @"<(.|\n)*?>", string.Empty);
return mystr;
}

I assume that your task is essentially retrieving the first image in the document.
If your HTML document is also a well-formed XML document, you can easily solve this using XPath.
More on XPath in .NET here.
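An XPath query to retrieve the first image's URL looks like this:
(//img)[1]/@src
The parentheses matter: //img[1]/@src would instead select the first img child of every parent element.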
Otherwise, if you really need to strip HTML, this is a duplicate of a couple of existing questions:
Using C# regular expressions to remove HTML tags
How can I strip HTML tags from a string in ASP.NET?
How to clean HTML tags using C#
Short answer: use Html Agility Pack.
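As a minimal sketch of the XPath route, assuming the content is well-formed XHTML, something like the following could back the FindImage call from the question. The class name Usage comes from the question; wrapping the input in a <root> element and returning null on failure are assumptions made for illustration. Real-world HTML (unclosed tags, entities such as &nbsp;) will not parse as XML, which is where a tolerant parser like Html Agility Pack is the better choice.

```csharp
using System;
using System.Xml;

public static class Usage
{
    // Returns the src of the first <img> in a well-formed XHTML fragment,
    // or null when there is no image or the markup is not valid XML.
    public static string FindImage(string xhtml)
    {
        var doc = new XmlDocument();
        try
        {
            // Wrap in a root element so fragments with several
            // top-level nodes still parse as a single XML document.
            doc.LoadXml("<root>" + xhtml + "</root>");
        }
        catch (XmlException)
        {
            return null; // not well-formed XML
        }

        // (//img)[1] is the first img in document order.
        XmlNode src = doc.SelectSingleNode("(//img)[1]/@src");
        return src == null ? null : src.Value;
    }
}
```

Note that XPath name tests are case-sensitive, so this sketch only matches lowercase img elements.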

Related

Can not display base64 encoded images in an HTML fragment in WinJS app

I'm writing a WinJS app that takes an HTML fragment the user has copied to the clipboard, replaces their
Later, when I go to display the .html, I create an iframe element (using jQuery $('')), attempt to source the .html into it, and get the following error:
0x800c001c - JavaScript runtime error: Unable to add dynamic content. A script attempted to inject dynamic content, or elements previously modified dynamically, that might be unsafe. For example, using the innerHTML property to add script or malformed HTML will generate this exception. Use the toStaticHTML method to filter dynamic content, or explicitly create elements and attributes with a method such as createElement. For more information, see http://go.microsoft.com/fwlink/?LinkID=247104.
I don't get the exception if I don't base64-encode the images, i.e. if I leave them intact; the iframes then display on the page with their images showing.
If I take the html after subbing the urls for base64 and run it through toStaticHTML, it removes the src= attribute completely from the tags.
I know the .html with the encoded pngs is right b/c I can open it in Chrome and it displays fine.
My question: why does it strip the src= attributes from the tags, and how can I fix it? For instance, by creating the iframe without jQuery and with some MS voodoo, or by using a different technique to sanitize the HTML?
So, a solution I discovered (I'm not 100% convinced it's the best and am still looking for something a little less M$-specific) is the MS WebView:
http://msdn.microsoft.com/en-us/library/windows/apps/bg182879.aspx#WebView
I use some code like below (where content is the html string with base64 encoded images)
var loadHtmlSuccess = function (content) {
var webview = document.createElement("x-ms-webview");
webview.navigateToString(content);
assetItem.append(webview);
}
I believe you want to use execUnsafeLocalFunction. For example:
var target = document.getElementById('targetDIV');
MSApp.execUnsafeLocalFunction(function () {
target.innerHTML = content;
});

Concatenation String - MVC Razor

With the Razor view engine, I just want to convert a path like this:
src="<%=MyImageServer %>image1.jpg"
into
src="@MyImageServer[PROBLEM_HERE]image1.jpg"
You see the problem... Any Suggestion?
Note: MyImageServer is a variable with a path.
Wrap in parentheses:
src="@(MyImageServer)image1.jpg"
But you probably want to avoid such tag soup in your views and write a custom HTML helper:
@Html.Image("image1.jpg")
which will take care of generating the proper image.
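The answer does not show the helper itself. As a rough sketch, the string-building core that such a helper could wrap might look like this. The class name ImageTag, the method name Build, and passing the image server path as a parameter are all hypothetical choices for illustration. In an MVC project this would typically be exposed as an extension method on HtmlHelper that returns an MvcHtmlString, so that Razor does not HTML-encode the tag a second time.

```csharp
using System.Net;

public static class ImageTag
{
    // Builds an <img> tag whose src is the image server path plus the file name.
    // imageServer is assumed to end with a slash, e.g. "http://images.example.com/".
    public static string Build(string imageServer, string fileName)
    {
        // Encode the attribute value so quotes or ampersands cannot break the markup.
        string src = WebUtility.HtmlEncode(imageServer + fileName);
        return "<img src=\"" + src + "\" />";
    }
}
```

Keeping the concatenation in one place means the views only ever pass the file name, which is the point of avoiding the tag soup.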

How to trim html tags from text in asp.net grid view?

I have used the ASP.NET AJAX HTML editor and saved the data in a database. Now I want to retrieve it and show it in a grid view, but when I retrieve it, it also shows the HTML tags (generated by the ASP.NET AJAX editor). So I want to trim those tags and show plain text in the grid view. How do I do that?
Thanks
Go to your DB and look at how it is saved. Maybe it is saved encoded. If that is not the case, you can use a simple regex to remove all those tags:
<[^<]+?>
This removes all tags and leaves just the plain text.
To strip the HTML tags from text you can use the
Regex.Replace(input, pattern, replacement) method, which lives in the
System.Text.RegularExpressions namespace.
For example:
Plain_Body = Regex.Replace(txtBody.Text, @"<[^>]*>", string.Empty);
Here I am replacing the HTML tags with String.Empty (""). You can extend the pattern @"<[^>]*>" if you also want to handle other things, such as non-breaking spaces (&nbsp;) and ampersands (&amp;).
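Rather than adding each entity to the pattern by hand, one option is to strip the tags first and then decode the entities in a single call. The sketch below assumes .NET 4.0 or later (for System.Net.WebUtility); the class and method names are hypothetical.

```csharp
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlText
{
    // Strips tags, then turns entities such as &nbsp; and &amp; into characters.
    public static string ToPlainText(string html)
    {
        if (string.IsNullOrEmpty(html)) return string.Empty;

        // 1. Remove anything that looks like a tag.
        string noTags = Regex.Replace(html, @"<[^>]*>", string.Empty);

        // 2. Decode entities afterwards, so an encoded "&lt;b&gt;"
        //    survives as literal text instead of being stripped as a tag.
        string decoded = WebUtility.HtmlDecode(noTags);

        // 3. &nbsp; decodes to U+00A0; swap it for an ordinary space.
        return decoded.Replace('\u00A0', ' ').Trim();
    }
}
```

For the grid view, the result could be assigned wherever the raw editor content is currently bound.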

Remove style tags, CSS, scripts and HTML tags from HTML to plain text

Using regular expressions, how do I remove style tags, CSS, scripts and HTML tags from HTML to plain text.
In ASP.NET C#.
I don't think a regex is really what you want for this; however, the following regex should do it
if you run a regex replace:
<[^>]*>
To use this in a Regex.Replace, do the following:
string myHtmlString = "<html><body>my test text</body></html>";
string myPlainTextString = Regex.Replace(myHtmlString ,"<[^>]*>",String.Empty);
I recommend you use something like the Html Agility pack though - http://htmlagilitypack.codeplex.com/
as it has a method to make this even easier called "ConvertToPlainText":
string myHtmlString = "<html><body>my test text</body></html>";
string myPlainTextString = ConvertToPlainText(myHtmlString);
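Note that the <[^>]*> pattern on its own removes only the tags, so the text inside script and style elements would remain. If a regex-only approach is still wanted, a sketch along these lines removes those elements first. The class and method names are hypothetical, and like any regex approach it will be fooled by unusual markup, which is why a real parser is the safer recommendation.

```csharp
using System.Text.RegularExpressions;

public static class HtmlStripper
{
    private const RegexOptions Options =
        RegexOptions.IgnoreCase | RegexOptions.Singleline;

    public static string StripToText(string html)
    {
        // 1. Drop <script> and <style> elements together with their contents.
        string text = Regex.Replace(
            html, @"<(script|style)\b[^>]*>.*?</\1\s*>", string.Empty, Options);

        // 2. Drop HTML comments.
        text = Regex.Replace(text, @"<!--.*?-->", string.Empty, Options);

        // 3. Replace the remaining tags with a space so that text from
        //    adjacent elements does not run together.
        text = Regex.Replace(text, @"<[^>]*>", " ", Options);

        // 4. Collapse runs of whitespace into single spaces.
        return Regex.Replace(text, @"\s+", " ").Trim();
    }
}
```

Replacing tags with a space is a trade-off: it keeps paragraphs apart, but it also splits a word that contains inline markup.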

Dynamically generated HTML in C# to be formatted

I have an ASP.NET web forms site with a rather large menu. The HTML for the menu is dynamically generated via a method in the C# as a string. I.e., what is being returned is something like this:
<ul><li><a href='default.aspx?param=1&anotherparam=2'>LINK</a></li></ul>
Except it is a lot bigger, and the lists are nested up to 4 deep.
This is written to the page via a code block.
However, instead of returning a flat string from the method I would like to return it as formatted HTML, so when rendered it looks like this:
<ul>
<li>
<a href='default.aspx?param=1&anotherparam=2'>LINK</a>
</li>
</ul>
I thought about loading the html into an XmlDocument but it doesn't like the & character found in the query strings (in the href attribute values).
The main reason for doing this is so I can more easily debug the generated HTML during development.
Anyone have any ideas?
Maybe you can work with an HtmlTextWriter? It has Indenting capabilities and it may actually be a cleaner thing as you could write straight into the output stream, which should be more "in the flow" than generating a string in memory etc.
Is there a reason you want to do this? This implicitly minified HTML will perform slightly better anyway. If you do still need to render the HTML for pretty display, you will either need to incorporate indentation into the logic that generates the output HTML or build your content using ASP.NET controls and then call Render().
Try loading the HTML into the HTML Agility Pack. It is an HTML parser that can deal with HTML fragments (and will be fine with & in URLs).
I am not sure if it can output pretty printed (what you call "formatted") HTML, but that would be my first approach.
I like to use format strings for this sort of thing; your HTML output would be generated with:
String.Format("<ul>{0}\t<li>{0}\t\t<a href='{1}'>{2}</a>{0}\t</li>{0}</ul>",
System.Environment.NewLine,
myHrefVariable,
myLinkText);
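For nested lists, the suggestion above to build indentation into the generating logic can be sketched as a small recursive renderer. The MenuItem type and its members are hypothetical stand-ins for whatever structure the menu is built from, and the sketch assumes Text and Href are already safe to emit as-is.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical menu node; the real site will have its own data structure.
public class MenuItem
{
    public string Text;
    public string Href;
    public List<MenuItem> Children = new List<MenuItem>();
}

public static class MenuRenderer
{
    public static string Render(IList<MenuItem> items)
    {
        var sb = new StringBuilder();
        AppendList(sb, items, 0);
        return sb.ToString();
    }

    // Writes one <ul> level, indented by 'depth' tabs, and recurses for children.
    private static void AppendList(StringBuilder sb, IList<MenuItem> items, int depth)
    {
        string pad = new string('\t', depth);
        sb.Append(pad).Append("<ul>\n");
        foreach (MenuItem item in items)
        {
            sb.Append(pad).Append("\t<li>\n");
            sb.Append(pad).Append("\t\t<a href='").Append(item.Href).Append("'>")
              .Append(item.Text).Append("</a>\n");
            if (item.Children.Count > 0)
            {
                // A nested list sits inside the <li>, two levels deeper.
                AppendList(sb, item.Children, depth + 2);
            }
            sb.Append(pad).Append("\t</li>\n");
        }
        sb.Append(pad).Append("</ul>\n");
    }
}
```

Because the depth is passed down explicitly, the same method handles the four levels of nesting mentioned in the question without any special cases.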

Resources