Rewrite ASP.NET page output - asp.net

A little background: I am trying to create a lightweight, cookieless, database-backed user session using a highly stripped-down ASP.NET implementation. The site's audience will be mobile users connecting via cellular networks, so the page sizes need to be very small. I am not using the .NET session, viewstate, etc., and most pages contain very few server controls, if any.
I want to be able to process the output of a page request so I can modify internal links in the response with my own session information. I have read that there was an ISAPI filter to allow cookieless sessions pre-ASP.NET. That is basically what I want to build, just inside the application.
Has anyone done anything like this? I'm already inheriting from the System.Web.UI.Page class for my page base for other reasons. It seems like I should be able to do something from there.
Thanks

HttpModules can give you complete control over your output, but there are also a couple other things you can do that are a little simpler.
Create a custom Filter for the Response.Filter. More or less you create a Stream that you run everything through before sending it onto the underlying stream letting you make your changes there.
Override the Render method of the Page, write all of its output to a string, and make your changes there. For example:
// This is from memory; you might need to check it.
protected override void Render(HtmlTextWriter writer)
{
    // Render the page into an in-memory buffer instead of the response.
    StringWriter html = new StringWriter();
    HtmlTextWriter render = new HtmlTextWriter(html);
    base.Render(render);
    string output = html.ToString();
    // Make your changes to output here, then send it on.
    // output = ???
    writer.Write(output);
}
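For the "make your changes" step in the question's scenario (adding session information to internal links), a minimal sketch could look like the following. The class name, the `sid` query parameter, and the regex-based approach are all assumptions made for illustration; a regex is a naive way to find links, and a real HTML parser would be more robust.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: appends a session token to relative href values.
public static class SessionLinkRewriter
{
    // Matches href="..." whose value has no URI scheme, is not
    // protocol-relative (//host/...), and is not a bare #fragment.
    private static readonly Regex HrefPattern = new Regex(
        "href=\"(?<url>(?![a-zA-Z][a-zA-Z0-9+.-]*:|//|#)[^\"]*)\"",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static string AppendSession(string html, string sessionId)
    {
        // "sid" is an assumed parameter name, not something ASP.NET defines.
        string pair = "sid=" + Uri.EscapeDataString(sessionId);

        return HrefPattern.Replace(html, match =>
        {
            string url = match.Groups["url"].Value;

            // Keep any #fragment at the very end of the rewritten URL.
            string fragment = "";
            int hash = url.IndexOf('#');
            if (hash >= 0)
            {
                fragment = url.Substring(hash);
                url = url.Substring(0, hash);
            }

            // Inside an HTML attribute the ampersand must be written as &amp;.
            string separator = url.Contains("?") ? "&amp;" : "?";
            return "href=\"" + url + separator + pair + fragment + "\"";
        });
    }
}
```

In the Render override this would be called as `output = SessionLinkRewriter.AppendSession(output, sessionId);`, where `sessionId` comes from wherever your own session lookup keeps it. Note that the pattern also matches `href` on `<link>` elements and ignores single-quoted attributes, so treat it as a starting point only.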

Look into using an HttpModule for this. You can process the entire response on the way out.
You might also be able to do something with a base class - perhaps go through all the server-side controls that might have links in them on the PreRenderComplete event. This wouldn't help you with HTML <A> tags, though.

Related

ASP.NET - Parse / Query HTML Before Transmission and Insert CSS Class References

As a web developer I feel too much of my time is spent on CSS. I am trying to come up with a solution where I can write re-usable CSS i.e. classes and reference these classes in the HTML without additional code in ASPX or ASCX files etc. or code-behind files. I want an intermediary which links up HTML elements with CSS classes.
What I want to achieve:
Modify HTML immediately before transmission
Select elements in the HTML
Based on rules defined elsewhere (e.g. in a text file relating to the page currently being processed):
Add a CSS class reference to multiple HTML elements
Add multiple CSS class references to a single HTML element
How I envisage this working:
Extend ASP.NET functions which generate final HTML
Grab all the HTML as a string
Pass the string into a constructor for an object with querying (e.g. XPath) methods
Go through list of global rules e.g. for child ul of first div then class = "navigation"
Go through list of page specific rules e.g. for child ul of first div then class &= " home"
Get processed HTML from object e.g. obj.ToString
ASP.NET to resume page generation using processed HTML
So what I need to know is:
Where / how can I extend ASP.NET page generation functions (to get all HTML of page)
What classes have element / node querying methods and access to attributes
Thanks for your help in advance.
P.S. I am developing ASP.NET Web Forms websites with VB.NET code-behinds running on IIS 7.
Check out my CsQuery project: https://github.com/jamietre/csquery or on nuget as "CsQuery".
This is a C# (.NET 4) port of jQuery. In basic performance tests (included in the project test suite) selectors are about 100 times faster than HTML Agility Pack + Fizzler (a css selector add-on for HAP); it's plenty fast for manipulating the output stream in real time on a typical web site. If you are amazon.com or something, of course, YMMV.
My initial purpose in developing this was to manipulate HTML from a content management system. Once I had it up and running, I found that using CSS selectors and the jQuery API is a whole lot more fun than using web controls and started using it as a primary HTML manipulation tool for server-rendered pages, and built it out to cover pretty much all of CSS, jQuery and the browser DOM. I haven't touched a web control since.
To intercept HTML in webforms with CsQuery you do this in the page codebehind:
using CsQuery;
using CsQuery.Web;

protected override void Render(HtmlTextWriter writer)
{
    var csqContext = WebForms.CreateFromRender(Page, base.Render, writer);
    // CQ object is like a jQuery object. The "Dom" property of the context
    // returned above represents the output of this page.
    CQ doc = csqContext.Dom;
    doc["li > a"].AddClass("foo");
    // write it
    csqContext.Render();
}
To do the same thing in ASP.NET MVC please see this blog post describing that.
There is basic documentation for CsQuery on GitHub. Apart from getting HTML in and out, it works pretty much like jQuery. The WebForms object above is just to help you handle interacting with the HtmlTextWriter object and the Render method. The general-purpose usage is very simple:
var doc = CQ.Create(htmlString);
// or (useful for scraping and testing)
var doc = CQ.CreateFromUrl(url);
// do stuff with doc, a CQ object that acts like a jQuery object
doc["table tr:first"].Append("<td>A new cell</td>");
Additionally, pretty much the entire browser DOM is available using the same methods you use in a browser. The indexer [0] returns the first element in the selection set, like jQuery; if you are used to writing JavaScript to manipulate HTML it should be very familiar.
// "Select" method is the same as the property indexer [] we used above.
// I go back and forth between them to emphasise their interchangeability.
var element = doc.Select("div > input[type=checkbox]:first-child")[0];
element.Checked = true;
Of course in C# you have a wealth of other general-purpose tools like LINQ at your disposal. Alternatively:
var element = doc["div > input[type=checkbox]:first-child"].Single();
element.Checked = true;
When you're done manipulating the document, you'll probably want to get the HTML out:
string html = doc.Render();
That's all there is to it. There are a vast number of methods on the CQ object, covering all the jQuery DOM manipulation techniques. There are also utility methods for handling JSON, and it has extensive support for dynamic and anonymous types to make passing data structures (e.g. a set of CSS classes) as easy as possible -- much like jQuery.
Some More Advanced Stuff
I don't recommend doing this unless you are familiar with lower-level tinkering with ASP.NET's HTTP workflow. Nothing here is impossible, but there will be a learning curve if you've never heard of an HttpHandler.
If you want to skip the WebForms engine altogether, you can create an IHttpHandler that automatically parses HTML files. This would definitely perform better than overlaying on the ASPX engine -- who knows, maybe even faster than doing a similar amount of server-side processing with web controls. You can then register your handler using web.config for specific extensions (like htm and html).
Yet another way to automatically intercept is with routing. You can use the MVC routing library in a webforms app with no trouble, here's one description of how to do this. Then you can create a route that matches whatever pattern you want (again, perhaps *.html) and pass handling off to a custom IHttpHandler or class. In this case, you're doing everything: you will need to look at the path, load the file from the file system, parse it with CsQuery, and stream the response.
Using either mechanism, you'll need a way to tell your project what code to run for each page, of course. That is, just because you've created a nifty HTML parser, how do you then tell it to run the correct "code behind" for that page?
MVC does this by just locating a controller with the name of "PageNameController.cs" and calling a method that matches the name of the parameter. You could do whatever you want; e.g. you could add an element:
<script type="controller" src="MyPageController"></script>
Your generic handler code could look for such an element, and then use reflection to locate the correct named class & method to call. This is pretty involved, and beyond the scope of this answer; but if you're looking to build a whole new framework or something this is how you would go about it.
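As a rough illustration of that reflection step, here is a minimal sketch. `ControllerDispatcher`, the `Render` method, and the sample `MyPageController` class are hypothetical names invented for this example; it assumes the controller has a public parameterless constructor and that the method name is not overloaded.

```csharp
using System;
using System.Reflection;

// Hypothetical dispatcher: finds a class by simple name and calls one of its methods.
public static class ControllerDispatcher
{
    public static object Invoke(Assembly assembly, string controllerName,
                                string methodName, params object[] args)
    {
        // Look for a concrete class whose simple name matches.
        Type controllerType = null;
        foreach (Type candidate in assembly.GetTypes())
        {
            if (candidate.IsClass && !candidate.IsAbstract && candidate.Name == controllerName)
            {
                controllerType = candidate;
                break;
            }
        }
        if (controllerType == null)
            throw new InvalidOperationException("No controller class named " + controllerName);

        // Only public instance methods are considered.
        MethodInfo method = controllerType.GetMethod(
            methodName, BindingFlags.Public | BindingFlags.Instance);
        if (method == null)
            throw new MissingMethodException(controllerName, methodName);

        // Assumes a public parameterless constructor.
        object controller = Activator.CreateInstance(controllerType);
        return method.Invoke(controller, args);
    }
}

// Hypothetical "code behind" for a page, matching the src value in the element above.
public class MyPageController
{
    public string Render(string html)
    {
        return html.Replace("{title}", "Home");
    }
}
```

A handler would read the `src` value from the parsed document and then call something like `ControllerDispatcher.Invoke(typeof(MyPageController).Assembly, "MyPageController", "Render", html)`. In a real framework you would probably cache the type lookup rather than scanning the assembly on every request.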
Intercepting the content of the page prior to it being sent is rather simple. I did this a while back on a project that compressed content on the fly: http://optimizerprime.codeplex.com/ (It's ugly, but it did its job and you might be able to salvage some of the code). Anyway, what you want to do is the following:
1) Create a Stream object that saves the content of the page until Flush is called. For instance I used this in my compression project: http://optimizerprime.codeplex.com/SourceControl/changeset/view/83171#1795869 Like I said before, it's not pretty. But my point being you'll need to create your own Stream class that will do what you want (in this case give you the string output of the page, parse/modify the string, and then output it to the user).
2) Assign the page's filter object to it. (Page.Response.Filter) Note that you need to do it rather early on so you can catch all of the content. I did this with a HTTP Module that ran on the PreRequestHandlerExecute event. But if you did something like this:
protected override void OnPreInit(EventArgs e)
{
    this.Response.Filter = new MyStream();
    base.OnPreInit(e);
}
That would also most likely work.
3) You should be able to use something like Html Agility Pack to parse the HTML and modify it from there.
That to me seems like the easiest approach.
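To make step 1 more concrete, here is a minimal sketch of such a Stream class. It is not the code from the linked project: `RewritingFilterStream` is a hypothetical name, and the constructor takes the underlying stream, an encoding, and a transform delegate, which is an assumed design. Unlike the description above, it emits the modified output on Close rather than Flush, on the assumption that Flush can be called more than once per response while Close is called once at the end.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical response filter: buffers everything written to it, then
// transforms the whole document once and forwards it to the real stream.
public class RewritingFilterStream : Stream
{
    private readonly Stream _inner;
    private readonly Encoding _encoding;
    private readonly Func<string, string> _transform;
    private readonly MemoryStream _buffer = new MemoryStream();
    private bool _closed;

    public RewritingFilterStream(Stream inner, Encoding encoding,
                                 Func<string, string> transform)
    {
        _inner = inner;
        _encoding = encoding;
        _transform = transform;
    }

    // Collect the bytes; nothing reaches the client yet.
    public override void Write(byte[] buffer, int offset, int count)
    {
        _buffer.Write(buffer, offset, count);
    }

    // Intentionally a no-op: a partial document cannot be parsed reliably.
    public override void Flush() { }

    public override void Close()
    {
        if (!_closed)
        {
            _closed = true;
            string html = _encoding.GetString(_buffer.ToArray());
            byte[] output = _encoding.GetBytes(_transform(html));
            _inner.Write(output, 0, output.Length);
            _inner.Flush();
        }
        base.Close();
    }

    // The filter is write-only, so the read and seek members are unsupported.
    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count)
    {
        throw new NotSupportedException();
    }
    public override long Seek(long offset, SeekOrigin origin)
    {
        throw new NotSupportedException();
    }
    public override void SetLength(long value)
    {
        throw new NotSupportedException();
    }
}
```

It would be installed along the lines of `Response.Filter = new RewritingFilterStream(Response.Filter, Response.ContentEncoding, html => Modify(html));`, where `Modify` stands for your own parsing code (Html Agility Pack or otherwise). Buffering the whole page costs memory and delays the first byte, which is the trade-off for being able to see the complete document.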

How do i render a control to a string

I've seen it time and time again the typical answer being something like this:
public string RenderControlToHtml(Control ControlToRender)
{
    System.Text.StringBuilder sb = new System.Text.StringBuilder();
    System.IO.StringWriter stWriter = new System.IO.StringWriter(sb);
    System.Web.UI.HtmlTextWriter htmlWriter = new System.Web.UI.HtmlTextWriter(stWriter);
    ControlToRender.RenderControl(htmlWriter);
    return sb.ToString();
}
... This is fine if you have simple HTML tags, but when I have a textbox or some other ASP.NET control in my control it throws a wobbly about the control not being on a form (which in fact it is; I'm trying to render a portion of the page to a string that I can then send as an email).
So ...
I'm pretty sure this has been asked and answered before but i'm at a loss for finding a real answer that actually works ...
How do i render the html output of both server and client side controls to a html string in .Net 4.0 because seemingly the above is not good enough?
Note:
I have found examples that talk about doing this at page level ...
public override void VerifyRenderingInServerForm(Control control)
{
    // Do nothing (we don't care if there's a form or not)
}
... and disabling event validation but apparently that's not working for me either.
Is there a way to do this without a "hack" thats clean?
Also:
I even tried creating a new page, adding a form to it, added my control to that then calling renderControl on that page to which i got more errors.
EDIT:
I've been digging around and I think the problem might be related to postbacks or something because i found this:
http://forums.asp.net/t/1325559.aspx
Another thing that might be putting a bit of a spanner in the works is my use of the ajax toolkit, I essentially need only the visible portion of the control which seems to give a bit of a headache for updatepanels for some reason.
I'm guessing the above sample works only on basic .net controls that are not ajax toolkit related.
What further complicates the issue is that I would like to get the control in its current state when a button is clicked at the bottom of it ...
Essentially the control represents a form for booking an MOT and I would like to render the form filled in to an email that is then sent to the garage if that makes sense.
I'm thinking I may have to admit defeat here and simply get the markup from the client and manually build an email, pulling out the control values, as this seems to be a compatibility issue between AJAX controls and the RenderControl method from what I can tell (maybe you can't render a partial-postback-compatible control in this fashion).
Unless someone smarter than me can prove it can be done ???
Create an ajax control extender and get the html markup on the client side then post that back to the server for processing, it's a bit of hack but it seems that anything involving ajax controls will cause the default recommended mechanism to break.

How can i manipulate the page while it is rendering?

I want to change some elements text when page is leaving the server (page_render, endRequest etc.).
How can i get access to the page and how can i find the elements to change their values, texts?
You can do so by using a HttpModule. This sits in the pipeline and can do pre- and postprocessing.
For example take a look at this whitespaceremover.
Besides HttpModules, you can also override the 'Render' method (or do this in a basepage to make it reusable).
protected override void Render(HtmlTextWriter writer)
{
    StringWriter stringWriter = new StringWriter();
    HtmlTextWriter htmlWriter = new HtmlTextWriter(stringWriter);
    base.Render(htmlWriter);
    string html = stringWriter.ToString();
    // do stuff with the html
    writer.Write(html);
}
There are a number of options, and which one suits you will depend largely on what the actual goal is.
Handle the PreRender event of a Page and adjust any elements you want to in this event. Ideally you would put this in a base class that is inherited by all the pages that require this processing. This gives you access to the actual page model and control tree.
Setup a filter that will give you direct access to the response stream. You can implement this in 2 ways, either as a separate HttpModule that installs the filter or you can install the filter directly from the Global.asax. Which route you choose depends on how reusable you need this, with the HttpModule being the most reusable.
Here is a nice article Modifying the HTTP Response Using Filters

ASP.NET Save HTML Sent to Browser

I need to save the full and exact HTML sent to the browser for some transactions (for legal tracking purposes). I am not sure whether there is a suitable hook to do this. Does anyone know? (BTW, I am aware of the need to also save associated files like style sheets and images.)
You can create an http module and have the output stream saved somewhere.
You should hook to PreSendRequestContent event...:
This event is raised just before ASP.NET sends the response contents to the client. This event allows us to change the contents before it gets delivered to the client. We can use this event to add the contents, which are common in all pages, to the page output. For example, a common menu, header or footer.
You could attach to the PreSendRequestContent. This event is raised right before the content is sent and gives you a chance to modify it, or in your case, save it.
P&P article on interception pattern
You could implement a response filter. Here is a nice sample that processes the HTML produced by ASP.NET. In addition to the HTML being sent to the client you should be able to also write the HTML to a database or other suitable storage.
Here is an alternate and IMO much easier way to hook the filter into your application:
in Global.asax, place the following code in the Application_BeginRequest handler:
void Application_BeginRequest(object sender, EventArgs e)
{
    Response.Filter = new HtmlSavingFilter(Response.Filter);
}
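The filter class itself is not shown in this answer, so here is a rough sketch of what a saving filter might look like. This is not the linked sample's implementation: `CopyingFilterStream` is a hypothetical name, and taking a second stream for the copy is an assumed design. Every write is passed straight through to the response and duplicated into the copy, so the client sees exactly the bytes that were saved.

```csharp
using System;
using System.IO;

// Hypothetical saving filter: forwards every write to the real response
// stream and writes the same bytes to a second stream (file, database blob, ...).
public class CopyingFilterStream : Stream
{
    private readonly Stream _inner;
    private readonly Stream _copy;

    public CopyingFilterStream(Stream inner, Stream copy)
    {
        _inner = inner;
        _copy = copy;
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        _inner.Write(buffer, offset, count); // what the client receives
        _copy.Write(buffer, offset, count);  // the archived duplicate
    }

    public override void Flush()
    {
        _inner.Flush();
        _copy.Flush();
    }

    // The filter is write-only, so the read and seek members are unsupported.
    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count)
    {
        throw new NotSupportedException();
    }
    public override long Seek(long offset, SeekOrigin origin)
    {
        throw new NotSupportedException();
    }
    public override void SetLength(long value)
    {
        throw new NotSupportedException();
    }
}
```

With this variant the BeginRequest handler would pass both streams, for example `Response.Filter = new CopyingFilterStream(Response.Filter, File.Create(archivePath));`, where `archivePath` is a per-request file name you choose. The sketch leaves closing the copy stream to the caller, for instance in an EndRequest handler.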
I suppose you only want to save the rendered HTML for certain pages. If so, I have been using the following approach in one of my applications, which stores the rendered HTML somewhere on disk for caching purposes. It simply overrides the Render method of the page (it needs using directives for System.IO and System.Text).
protected override void Render(HtmlTextWriter writer)
{
    // Render the page into an in-memory buffer first.
    using (StringWriter buffer = new StringWriter())
    using (HtmlTextWriter htmlWriter = new HtmlTextWriter(buffer))
    {
        base.Render(htmlWriter);
        string html = buffer.ToString();
        // Save a copy to disk (overwriting any previous file), then send
        // the same HTML on to the client.
        File.WriteAllText(@"C:\temp.html", html, Encoding.UTF8);
        writer.Write(html);
    }
}
Really works well for me.
There are also hardware devices made specifically for this purpose. We've used one called "PageVault".

Modifying the HTML of a page before it is sent to the client

I need to catch the HTML of an ASP.NET page just before it is sent to the client in order to do last-minute string manipulations on it, and then send the modified version to the client.
e.g.
The Page is loaded
Every control has been rendered correctly
The Full html of the page is ready to be transferred back to the client
Is there a way to do that in ASP.NET?
You can override the Render method of your page. Then call the base implementation and supply your HtmlTextWriter object. Here is an example
protected override void Render(HtmlTextWriter writer)
{
    StringWriter output = new StringWriter();
    base.Render(new HtmlTextWriter(output));
    // This is the rendered HTML of your page. Feel free to manipulate it.
    string outputAsString = output.ToString();
    writer.Write(outputAsString);
}
You can use a HTTPModule to change the html. Here is a sample.
Using the answer of Atanas Korchev for some days, I discovered that I get JavaScript errors similar to:
"The message received from the server could not be parsed"
When using this in conjunction with an ASP.NET Ajax UpdatePanel control. The reason is described in this blog post.
Basically the UpdatePanel seems to be strict about the length of the rendered string staying constant. That is, if you change the string but keep its length, it succeeds; if you change the text so that the string length changes, the above JavaScript error occurs.
My not-perfect-but-working solution was to assume the UpdatePanel always does a POST and filter that away:
protected override void Render(HtmlTextWriter writer)
{
    if (IsPostBack || IsCallback)
    {
        base.Render(writer);
    }
    else
    {
        using (var output = new StringWriter())
        {
            base.Render(new HtmlTextWriter(output));
            var outputAsString = output.ToString();
            outputAsString = doSomeManipulation(outputAsString);
            writer.Write(outputAsString);
        }
    }
}
This works in my scenario but has some drawbacks that may not work for your scenario:
Upon postbacks, no strings are changed.
The string that the user sees therefore is the unmanipulated one
The UpdatePanel may fire for NON-postbacks, too.
Still, I hope this helps others who discover a similar issue. Also, see this article discussing UpdatePanel and Page.Render in more details.
Take a look at the sequence of events in the ASP.NET page's lifecycle. Here's one page that lists the events. It's possible you could find an event to handle that's late enough in the page's lifecycle to make your changes, but still get those changes rendered.
If not, you could always write an HttpModule that processes the HTTP response after the page itself has finished rendering.
Obviously it will be much more efficient if you can coax the desired markup out of ASP.Net in the first place.
With that in mind, have you considered using Control Adapters? They will allow you to over-ride how each of your controls render in the first place, rather than having to modify the string later.
I don't think there is a specific event from the page that you can hook into; here is the ASP.Net lifecycle: http://msdn.microsoft.com/en-us/library/ms178472.aspx
You may want to consider hooking into the prerender event to 'adjust' the values of the controls, or perform some client side edits/callbacks.

Resources