I want parse the html of current page.
How can I get the html of current page for that in asp.net?
Thanks in advance.
for client side
In Internet explorer
Right click on the browser --> View source
IN firefox
Right click on the browser --> View Page Source
for server side
You can override the page's render method to capture the HTML source on the server-side.
protected override void Render(HtmlTextWriter writer)
{
// setup a TextWriter to capture the markup
TextWriter tw = new StringWriter();
HtmlTextWriter htw = new HtmlTextWriter(tw);
// render the markup into our surrogate TextWriter
base.Render(htw);
// get the captured markup as a string
string pageSource = tw.ToString();
// render the markup into the output stream verbatim
writer.Write(pageSource);
// remove the viewstate field from the captured markup
string viewStateRemoved = Regex.Replace(pageSource,
"<input type=\"hidden\" name=\"__VIEWSTATE\" id=\"__VIEWSTATE\" value=\".*?\" />",
"", RegexOptions.IgnoreCase);
// the page source, without the viewstate field, is in viewStateRemoved
// do what you like with it
}
Override Render method and call base.Render with you own HtmlWriter.
Do you really want to parse HTML? It's a tricky business. If you don't absolutely have to do it, I'd avoid it by using DOM methods client-side (if a client-side solution is acceptable). If you're doing a lot of it, you might consider jQuery, Prototype, or some other tool to help.
Related
How would I be able to get the innerHtml of the current aspx page in codebehind? I want to use the innerHTML and pass to a pdf converter function when the user clicks the pdf button, but i need the current page html as string.
I would do a postback and use javascript to provide the current innerHTML
__doPostBack(**event target**, document.documentElement.innerHTML);
You can override Render method of the page.
protected override void Render(HtmlTextWriter writer)
{
StringBuilder sb = new StringBuilder();
HtmlTextWriter tw = new HtmlTextWriter(new StringWriter(sb));
base.Render(tw);
string innerHtml = sb.ToString();
}
innerHtml will contain whole rendered html code of page. A little simplified version.
Asp.net
A.aspx
I'm using JQuery to access an ashx file which loads control ( ascx ) which contains a GridView. The control content is being injected to the page...
When I do this:
StringWriter writer = new StringWriter();
HttpContext.Current.Server.Execute(page, writer, false);
string output = writer.ToString();
It tells me that the GridView must be placed in a form section.
So I've created my Page
public class MyPage: Page
{
public override void VerifyRenderingInServerForm(Control control)
{
//base.VerifyRenderingInServerForm(control);
}
}
and inside of it I override this method. I'm using my page and everything is fine.
The question is why ? Why does it have to be in a form? It doesn't have any inputs !
Also, if my ascx contains only <asp:Label ( runatServer) everything is fine and it doesn't require placing it in a Form.
What am I missing ?
It must not be inside of a form, but the only one who knows are you. This exception is also a way to prevent nasty errors and provide a clear error message. Only controls that can postback need to be nested in a HtmlForm control.
http://msdn.microsoft.com/en-us/library/system.web.ui.page.verifyrenderinginserverform%28v=VS.100%29.aspx
I'm trying to take an existing bunch of code that was previously on a full .aspx page, and do the same stuff in a .ashx handler.
The code created an HtmlTable object, added rows and cells to those rows, then added that html table the .aspx's controls collection, then added it to a div that was already on the page.
I am trying to keep the code in tact but instead of putting the control into a div, actually generate the html and I'll return that in a big chunk of text that can be called via AJAX client-side.
HtmlTable errors out when I try to use the InnerHtml property (says it isn't supported), and when I try RenderControl, after making first a TextWriter and next an HtmlTextWriter object, I get the error that Page cannot be null.
Has anyone done this before? Any suggestions?
*Most recent is above.
OK, even after Matt's update there is a workaround ;)
Firstly, we have to use a page with form inside. Otherwise we won't be able to add a ScriptManager control. One more thing: the ScriptManager control should be the first control in the form. Further is easier:
Page page = new Page();
Button button = new System.Web.UI.WebControls.Button
{
ID = "btnSumbit",
Text = "TextButton",
UseSubmitBehavior = true
};
HtmlForm form = new HtmlForm
{
ID="theForm"
};
ScriptManager scriptManager = new ScriptManager
{
ID = "ajaxScriptManager"
};
form.Controls.Add(scriptManager);
form.Controls.Add(button);
page.Controls.Add(form);
using (StringWriter output = new StringWriter())
{
HttpContext.Current.Server.Execute(page, output, false);
context.Response.ContentType = "text/plain";
context.Response.Write(output.ToString());
}
This works. The output is quite large so I decided not to include it into my answer :)
Actually, there is a workaround. Yep, we may render a control in handler.
Firstly, we need a formless page. Because without it we get:
Control 'btnSumbit' of type 'Button'
must be placed inside a form tag with
runat=server.
public class FormlessPage : Page
{
public override void VerifyRenderingInServerForm(Control control)
{
}
}
Secondly, nobody can prevent us from creating an instance of our FormlessPage page. And now let's add a control there (I decided to add a Button control as an example, but you could use any).
FormlessPage page = new FormlessPage();
Button button = new System.Web.UI.WebControls.Button
{
ID = "btnSumbit",
Text = "TextButton",
UseSubmitBehavior = true
};
page.Controls.Add(button);
Thirdly, let's capture the output. For this we use HttpServerUtility.Execute method:
Executes the handler for the specified
virtual path in the context of the
current request. A
System.IO.TextWriter captures output
from the executed handler and a
Boolean parameter specifies whether to
clear the
System.Web.HttpRequest.QueryString and
System.Web.HttpRequest.Form
collections.
Here is the code:
using (StringWriter output = new StringWriter())
{
HttpContext.Current.Server.Execute(page, output, false);
context.Response.ContentType = "text/plain";
context.Response.Write(output.ToString());
}
The result will be:
<input type="submit" name="btnSumbit" value="TextButton" id="btnSumbit" />
In addition I can recommend ScottGu's article Tip/Trick: Cool UI Templating Technique to use with ASP.NET AJAX for non-UpdatePanel scenarios. Hope, you could find a lot of useful there.
Another option is to host the ASP.NET HTTP pipeline in your process, render the page to a stream and read the HTML you need to send from the HttpListenerContext.Response.OutputStream stream after the page has been processed.
This article has details: http://msdn.microsoft.com/en-us/magazine/cc163879.aspx
I have two pages on my site which are populated with content from a database as well as having user-entered fields (don't ask!). The pages also contain a ListView with a nested DataList. There are buttons on these pages which when clicked grab the html content of the page, write it to a HtmlTextWriter then get the text and put it into an email.
What I need to do is replace any TextBox / DropDownLists in the html source with string literal equivalents before putting into the email.
My aspx code so far looks something like this:
<div id="mailableContent" runat="server">
<asp:TextBox ID="txtMessage" runat="server"/>
<asp:Label ID="lblContentFromDb" runat="server"/>
<asp:ListView ID="lvwOffices" runat="server">
//loads of stuff here including more textboxes for the user to fill in
</asp:ListView>
</div>
and the codebehind is something like this:
StringBuilder stringBuilder = new StringBuilder();
StringWriter writer = new StringWriter(stringBuilder);
HtmlTextWriter htmlWriter = new HtmlTextWriter(writer);
mailableContent.RenderControl(htmlWriter);
MailMessage message = new MailMessage();
//do more stuff to set up the message object
message.Body = stringBuilder.ToString();
//send the message
My ideas so far are to 1. manually set any textboxes to Visible=false then populate literal controls with corresponding textbox values which is rather messy and tedious. Note I have to strip out input controls otherwise html for the email needs to be wrapped with a form elemen which I don't really want to do.
Is there a better way to do all this, I'm thinking that perhaps doing some .Net1.1 style xslt transforms with page content defined in xml files might be a better way to approach this, but am unsure if this will handle my requirement where I'm currently using a ListView with a nested DataList.
I find what you are describing to have a lot of overhead. Does your email template stay prety much the same? if so why not simply have a simple html template with html 3 code. Then simply read this file from disk and replace specific peices (ie. ##Name##) with the dynamic content. This way you have complete control over the html being sent via email and you can control what users input.
this would also limit the amout of work to make the html compatible with email clients.
Clarification: In the preceding suggestion, I propose that the UI implementation and the Email implementation be distinct, this in turn allows to to compose the email with more flexibility. Without using
mailableContent.RenderControl(htmlWriter);
this also allows you to compose the contents of the ListView to your specifications.
Couple of ideas:
Use RegEx to replace the html output of the textboxes and dropdownlists with plaintext. (Tricky, with the complication of finding the <option selected="selected"> stuff.
Do what I do:
Create an interface e.g. IPlainTextable
Make the interface enforce a boolean property called PlainTextMode (set false by default)
Extend TextBox and DropDownList with your own controls that implement IPlainTextable
In the Render section of your extended webcontrols, render out the plaintext value if PlainTextMode is true. e.g for your subclass of TextBox
protected override void Render(HtmlTextWriter writer)
{
if (PlainTextMode)
writer.WriteLine(this.Text);
else
base.Render(writer);
}
Before rendering out your page, run through all the IPlainTextable controls and set the PlainTextMode to true.
I have written a nifty little method for iterating through a nested control set:
public static List<T> FindControlsOfType<T>(Control ctlRoot)
{
List<T> controlsFound = new List<T>();
if (typeof(T).IsInstanceOfType(ctlRoot))
controlsFound.Add((T)(object)ctlRoot);
foreach (Control ctlTemp in ctlRoot.Controls)
{
controlsFound.AddRange(FindControlsOfType<T>(ctlTemp));
}
return controlsFound;
}
So you would just do something like:
foreach (IPlainTextable ctl in FindControlsOfType<IPlainTextable>(this))
{
ctl.PlainTextMode = true;
}
and then do your render to string after that...
Is there any way through which I can get HTML of my current page. By current page I mean let's say I am working on Default.aspx and want to get HTML by providing a button on it.
How to get it.
EDITED in response to clarification of the requirements
You can override the page's render method to capture the HTML source on the server-side.
protected override void Render(HtmlTextWriter writer)
{
// setup a TextWriter to capture the markup
TextWriter tw = new StringWriter();
HtmlTextWriter htw = new HtmlTextWriter(tw);
// render the markup into our surrogate TextWriter
base.Render(htw);
// get the captured markup as a string
string pageSource = tw.ToString();
// render the markup into the output stream verbatim
writer.Write(pageSource);
// remove the viewstate field from the captured markup
string viewStateRemoved = Regex.Replace(pageSource,
"<input type=\"hidden\" name=\"__VIEWSTATE\" id=\"__VIEWSTATE\" value=\".*?\" />",
"", RegexOptions.IgnoreCase);
// the page source, without the viewstate field, is in viewStateRemoved
// do what you like with it
}
Not sure why you want what you want, but... this is off the top of my head, i.e. I didn't try this code.
Add a client-side onclick to your button to show markup and do something like this:
function showMarkup() {
var markup = "<html>" + document.getElementsByTagName("html")[0].innerHTML + "</html>";
alert(markup); // You might want to show a div or some other element instead with the markup variable as the inner text because the alert might get cut off.
}
If you need this rendered markup posted back to the server for some reason, store the encoded markup in a hidden input and post that back. You can register the script below on the server-side using ClientScriptManager.RegisterOnSubmitStatement . Here's the cleint-side code.
var markup = escape("<html>" + document.getElementsByTagName("html")[0].innerHTML + "</html>");
var hiddenInput = $get('hiddenInputClientId');
if (hiddenInput) {
hiddenInput.value = markup;
}
Hope that helps,
Nick
I'm still not sure what your objective is with this. But if you want the total rendered output of the page then your probably better of looking at some client side code as this would be run once the server has returned the fully rendered HTML.
Otherwise you could proably catch the page unload event and do something with the rendered content there.
More info needed on what you want from this.