I have a really strange issue. I just moved my ColdFusion application from ColdFusion 8/IIS 6 to ColdFusion 9/IIS 7.
None of the HTML files were modified; just copied and pasted to the new server's web root. But the pages are rendering slightly differently. It's as if the CSS margin settings and padding settings for the layout are not being applied in the same way. The elements are bunching together whereas there was proper spacing before I moved the application over.
I thought the ColdFusion server wouldn't interfere with the formatting of HTML pages, since it just handles back-end requests, but these pages clearly look different than they did on the older version of ColdFusion/IIS.
Is there anything that would make this happen in ColdFusion or IIS? I'm just using CFM pages with basic CSS styling, etc. Nothing crazy.
I'll brave an answer. Here are some things to try (or at least think about):
I am assuming that there are no errors and that just the HTML is wacky (but workable).
In your CF Admin, head over to 'Settings':
Look for 'Enable Whitespace Management', which is described as: "Reduces the file size of the pages that ColdFusion returns to the browser by removing many of the extra spaces, tabs, and carriage returns that ColdFusion might otherwise persist from the CFML source file."
See if it is checked; if not, check it and see whether you notice any changes, expected or unexpected.
Also look for <cfsetting> tags with enableCFOutputOnly set to yes/no, true/false, or 0/1 (or with the attribute missing entirely).
Next up: <cfprocessingdirective>
Look for pages that do this:
<cfprocessingdirective
pageEncoding = "page-encoding literal string"
suppressWhiteSpace = "yes|no">
</cfprocessingdirective>
pageEncoding = "page-encoding literal string" <<< this could do some things at the character level that you might want to consider.
See Adam's colorful post here for encoding details.
Provides the following information to ColdFusion about how to process the current page:
Specifies whether to remove excess whitespace characters from ColdFusion-generated content in the tag body.
Identifies the character encoding (character set) of the page contents.
Also check for these tags: <cfcontent> and <cfsilent>.
Maybe that will unravel some mysteries for you (or create more, depending on what you find when you run some searches through your code)...
On a last note: I am not sure what your code base contains, but if you run into things you can consolidate, consider using Application.cfc and moving all of these whitespace and encoding settings there. Some of them do similar things to others, so you might want to clip the redundancy down to a single point of control so that individual pages aren't replicating actions when they don't need to.
Oh, one last thing: make sure you turn your caching off, because some of this sounds like someone trying to beat overhead on pages. If you have caching on and render pages while you are making changes, you may not see your fixes, and then you will pull your hair out. We don't want that.
:)
Good luck.
I just worked out, by trial and error, that IE 7 has an upper limit of 32 stylesheet includes (i.e. <link> tags).
I'm working on the front-end of a very large website, in which we wish to break our CSS into as many separate files as we wish, since this makes developing and debugging much easier.
Performance isn't a concern, as we do compress all these files into a single package prior to deployment.
The problem is on the development side. How can we work with more than 32 stylesheets if IE 7 has an upper limit of 32?
Is there any means of hacking around this?
I'm trying to come up with solutions, but it seems that even if I loaded the stylesheets via Ajax, I'd still be writing out <style> tags, which would still count towards the 32-stylesheet limit.
Is this the case? Am I stuck with the 32-file limit or is there a way around it?
NOTE: I'm asking for a client-side solution to this. Obviously a server-side solution isn't necessary, as we already have a compression system in place. I just don't want to have to re-compress every time I make one little CSS change that I want to test.
Don't support IE7.
To avoid confusion: I'm not seriously suggesting this as a real solution.
Create the CSS files on the server side and merge all the files that are needed for a certain page.
If you are using Apache or Lighttpd, consider using mod_concat.
Write your stylesheet into an existing style block with JavaScript using the cssText property, like this:
document.styleSheets[0].cssText += ourCss;
More info here:
https://bushrobot.blogspot.com/2012/06/getting-around-31-stylesheet-limit-in.html
At my last company we solved this by mashing all the CSS into one big document and inserting a URL in the web page that referenced that one-shot document. This was all done on-the-fly, just before returning the page to the client (we had a bunch of stuff going on behind the scenes that generated dynamic CSS).
You might be able to get your web server to do something similar, depending on your setup, otherwise it sounds like you're stuck with only 32 files.
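For instance, if your development server happened to be ASP.NET (an assumption; the question doesn't say), a throwaway generic handler along these lines could serve a whole folder of stylesheets as a single response. All names and paths here are hypothetical:

using System.IO;
using System.Web;

// Dev-only sketch: concatenate every stylesheet in /css into one response,
// so the page needs a single <link> no matter how many CSS files exist.
public class CssConcatHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        context.Response.ContentType = "text/css";
        // Note: file order is not guaranteed; sort if the cascade matters.
        foreach (string file in Directory.GetFiles(context.Server.MapPath("~/css"), "*.css"))
        {
            context.Response.Write(File.ReadAllText(file));
            context.Response.Write("\n");
        }
    }
}

Registered as an .ashx (or in web.config), the page then references that one URL during development instead of 32+ <link> tags.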
Or you could just not support IE7 ;)
I think bee stings change the HTML every time.
This means the HTML is not able to be cached.
Am I right?
I assume that by bee sting, you mean some sort of random number in the URL.
Yes, this will (usually) stop caching of pages. Some browsers may still cache some or all of the page.
As to whether or not to avoid it: would caching a page stop it working as it should? If your page has fairly random content, or content that changes often, users would not see the changes if the page were cached.
If you can avoid the need to stop caching, pages will be able to load faster, which makes for a happy user.
Old, old question, but anyway... no. You actually can "cache some or all of the response generated by an ASP.NET page, referred to in ASP.NET as output caching". You can read more at this Microsoft page.
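For example, page-level output caching can be declared with a single directive, or set programmatically; a rough sketch (the 60-second duration is arbitrary):

// Declarative form, at the top of the .aspx page:
//   <%@ OutputCache Duration="60" VaryByParam="None" %>
// Programmatic equivalent, e.g. in Page_Load:
Response.Cache.SetExpires(DateTime.UtcNow.AddSeconds(60));
Response.Cache.SetCacheability(HttpCacheability.Public);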
For those still wondering: bee stings are special tags used in Microsoft's ASP.NET to hold server-side code, much like PHP would do. It is, or was, very common to keep simple code inline, although having a significant amount of code in bee stings is considered a bad practice by Microsoft themselves.
But as a rule, yeah, it can be cached.
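For anyone who hasn't seen one, a bee sting is just inline server code in the page markup, for example:

<p>Page generated at: <%= DateTime.Now %></p>

With output caching enabled, that timestamp would be frozen for the duration of the cache, which is exactly the trade-off the question is asking about.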
I have a current ASP website that I need to keep in sync, but it has stuff like this:
<!--#include file="inc_search_form.asp" -->
I can't change this file at all, as it exists in another ASP website, so I can't break that compatibility.
Is there any way I can just shove this same file into my ASP.NET website and have it work the same?
There is an assumption here that the other ASP website may want to change the content of this ASP file and that you would want such changes reflected in your new website. If that isn't the case, you would simply create an ASP.NET version of this content in your new website.
There isn't really enough info in your question for a good answer to your specific scenario.
To the general scenario the answer is a flat no.
However, there may be some mitigation depending on what the include actually does. For example, it may be possible to simply read the ASP file in ASP.NET, perhaps do some text-based tweaking, and include the final HTML content in your ASP.NET page's output. This approach, though, is very fragile if the include is subject to change (if it isn't, see the first paragraph of this answer).
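A rough sketch of that text-based approach, only sensible if the file is essentially static HTML (the control name and path are assumptions):

// Read the classic ASP include as raw text and emit it from the ASP.NET page.
string raw = System.IO.File.ReadAllText(Server.MapPath("~/inc_search_form.asp"));
searchFormLiteral.Text = raw; // hypothetical asp:Literal control on the page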
Another mitigation might be possible if the include file can generate the desired content when requested directly; in that case, you may get away with making an HttpWebRequest that loops back to the ASP page while processing the ASP.NET page. Ugly, and again fragile, but possible.
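Something along these lines; the URL is an assumption for illustration:

// Loop-back sketch: request the ASP page over HTTP while rendering the
// ASP.NET page, then splice the returned HTML into the output.
string url = "http://localhost/oldsite/inc_search_form.asp";
var request = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(url);
using (var response = request.GetResponse())
using (var reader = new System.IO.StreamReader(response.GetResponseStream()))
{
    searchFormLiteral.Text = reader.ReadToEnd(); // hypothetical control again
}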
To what extent does the include file depend on the includer to have created a context for it? For example, does the include file use variables that it expects the includer to have created? In that case, the answer is no.
Does the include expect to be placed in a specific part of an overall HTML page? Does it contain inline JavaScript, and does it attempt to interact with other parts of the containing page? A loop-back HttpWebRequest might still work in this case.
The most likely answer is no, and even if the answer is yes, whatever the solution, it will be fragile. Personally, I just wouldn't attempt it despite any perceived benefits. In the long run, maintaining an ASP.NET version of this content in parallel with the existing ASP version in the other site is much more tenable.
I'm working on StackQL.net, which is just a simple web site that allows you to run ad hoc tsql queries on the StackOverflow public dataset. It's ugly (I'm not a graphic designer), but it works.
One of the choices I made is that I do not want to html encode the entire contents of post bodies. This way, you see some of the formatting from the posts in your queries. It will even load images, and I'm okay with that.
But I am concerned that this will also leave <script> tags active. Someone could plant a malicious script in a stackoverflow answer; they could even immediately delete it, so no one sees it. One of the most common queries people try when they first visit is a simple Select * from posts, so with a little bit of timing a script like this could end up running in several people's browsers. I want to make sure this isn't a concern before I update to the (hopefully soon-to-be-released) October data export.
What is the best, safest way to make sure just script tags end up encoded?
You may want to modify the HTMLSanatize script to fit your purposes. It was written by Jeff Atwood to allow certain kinds of HTML to be shown. Since it was written for Stack Overflow, it'd fit your purpose as well.
I don't know whether it's 'up to date' with what Jeff currently has deployed, but it's a good starting point.
Don't forget onclick, onmouseover, etc., or javascript: pseudo-URLs (<img src="javascript:evil!Evil!">), or CSS (style="property: expression(evil!Evil!);"), or…
There are a host of attack vectors beyond simple script elements.
Implement a white list, not a black list.
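A minimal sketch of the whitelist mindset in C#, assuming a small set of allowed tags (a real implementation should be built on a proper HTML parser):

using System.Text.RegularExpressions;
using System.Web;

static string SanitizeWhitelist(string input)
{
    // Encode everything first, so anything not explicitly allowed stays inert.
    string encoded = HttpUtility.HtmlEncode(input);

    // Selectively restore a few harmless, attribute-free formatting tags.
    // Anything with attributes, and every other tag, remains escaped text.
    return Regex.Replace(encoded, @"&lt;(/?)(b|i|em|strong|p)&gt;",
        "<$1$2>", RegexOptions.IgnoreCase);
}

The point is the direction: start from "everything is inert text" and opt tags in, rather than trying to enumerate every dangerous construct.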
If the messages are in XHTML format then you could do an XSL transform and encode/strip the tags and attributes that you don't want. It gets a little easier if you use something like TinyMCE or CKEditor to provide a WYSIWYG editor that outputs XHTML.
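In .NET, the shell of that approach might look like this; sanitize.xslt is a hypothetical stylesheet that copies whitelisted elements and drops everything else:

using System.Xml.Xsl;

// Apply a whitelisting transform to an XHTML fragment on disk.
var transform = new XslCompiledTransform();
transform.Load("sanitize.xslt");                  // hypothetical stylesheet
transform.Transform("post.xhtml", "clean.xhtml"); // input and output files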
What about simply breaking the <script> tags? Escaping only < and > for that tag, ending up with &lt;script&gt;, could be one simple and easy way.
Of course, links are another vector. You should also disable every instance of href='javascript:', and every attribute starting with on*.
Just to be sure, nuke it from orbit.
But I am concerned that this will also leave <script> tags active.
Oh, that's just the beginning of HTML ‘malicious content’ that can cause cross-site scripting. There's also event handlers; inline, embedded and linked CSS (expressions, behaviors, bindings), Flash and other embeddable plugins, iframes to exploit sites, javascript: and other dangerous schemes (there are more than you think!) in every place that can accept a URL, meta-refresh, UTF-8 overlongs, UTF-7 mis-sniffing, data binding, VML and other non-HTML stuff, broken markup parsed as scripts by permissive browsers...
In short any quick-fix attempt to sanitise HTML with a simple regex will fail badly.
Either escape everything so that any HTML is displayed as plain text, or use a full parser-and-whitelist-based sanitiser. (And keep it up-to-date, because even that's a hard job and there are often newly-discovered holes in them.)
But aren't you using the same Markdown system as SO itself to render posts? That would be the obvious thing to do. I can't guarantee there are no holes in Markdown that would allow cross-site scripting (there certainly have been in the past and there are probably some more obscure ones still in there as it's quite a complicated system). But at least you'd be no more insecure than SO is!
Use a regex to replace the script tags with encoded tags. This will filter the tags which have the word "script" in them and HTML-encode them. Thus, all the script tags, such as <script>, </script> and <script type="text/javascript">, will get encoded, but other tags in the string will be left alone.
// using System.Text.RegularExpressions; using System.Web;
string encoded = Regex.Replace(text, @"</?(\w+)[^>]*>",
    tag => tag.Groups[1].Value.ToLower().Contains("script")
        ? HttpUtility.HtmlEncode(tag.Value)
        : tag.Value,
    RegexOptions.Singleline);
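For example, with text set as below, the snippet above produces:

string text = "<b>hi</b><script>alert('evil');</script>";
// encoded == "<b>hi</b>&lt;script&gt;alert('evil');&lt;/script&gt;"

Note this is still a blacklist at heart; as other answers point out, a whitelist is safer.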
Developing websites is time-consuming. To improve productivity, I would code a prototype to show to our clients. I don't worry about making the prototype conform to the standard. Most of the time, our clients would approve the prototype and give an unreasonable deadline. I usually end up using the prototype in production (hey, the prototype works. No need to make my job harder.)
I could refactor the code to output valid HTML. But is it worth the effort to output valid HTML?
It is only worth the effort if it gives you a practical benefit. Sticking to standards might make it easier to build a website that works across most browsers. Then again, if you're happy with how a website displays on the browsers you care about (maybe one, maybe all), then going through hoops to make it pass validation is a waste of time.
Also, the difference in SEO between an all-valid html website and a mostly-valid html website is negligible.
So always look for the practical benefit, there are some in some situations, but don't do it just for the sake of it.
Yes. It's hard enough trying to deal with how different browsers will render valid HTML, never mind trying to predict what they'll do with invalid code. Same goes for search engines - enough problems in the HTML may lead to the site not being indexed properly or at all.
I guess the real answer is "it depends on what is invalid about the HTML". If the invalid parts relate to accessibility issues, you might even find your customer has legal problems if they use the site on a commercial basis.
Probably not if you have a non-complying site to begin with and are short on time.
However, and you won't believe me because I didn't believe others to begin with, it is easier to make a site compliant from the start - it saves you headaches in terms of browser compatibility, CSS behaviour and even JavaScript behaviour, and it is typically less markup to maintain.
Site compliance (at least to Transitional) is pretty easy.
Producing compliant HTML is similar to ensuring that you have no warnings during a compilation: the warnings are there for a reason. You may not realise what that reason is, but ignore the warnings and, before you know where you are, there are so many that you can't spot the one that's relevant to the problem you're trying to fix.
If you use Firefox to view your web pages, you'll get a helpful green tick or red cross in the bottom right-hand corner, quickly showing you whether you've complied or not. Clicking on a red cross will show you all of the places where you goofed.
Some of the warnings/errors may seem a bit pedantic, but fix them and you'll benefit in many ways.
Your page is much more likely to work with a wider range of browsers.
Accessibility compliance will be easier (You'll have 'alt' attributes on your images, for example)
If you choose XHTML as a standard, your markup will be more likely to be useful in an AJAX environment.
Failure to do this results in unpredictability.
One of the biggest problems with web browsers is that they have perpetuated bad habits (And still do, in some cases) by silently correcting certain markup problems, such as failure to close table cells and/or rows. This single fact has resulted in thousands of web pages that are not compliant but 'work', lulling their developers into a false sense of security.
When you consider how many things there are that can go wrong with a website, being lazy when it comes to compliance is just adding more problems to your workload.
EDIT: having read your original post again, I notice that you say you don't bother with compliance when working on a prototype, then you go on to say that you usually use the prototype in production - this means that it's not strictly a prototype, but a candidate.
The normal situation in such circumstances is that once the customer accepts a candidate, no time is allocated for bug fixing or tidying up, thus strengthening the argument for making the markup compliant in the first place.
If you won't be given time later, do it now.
If you are given time later, then you had the time to do it anyway.
If you want your site to be accessible to people with and without disabilities, as well as to external systems, then yes, you should definitely make sure you output valid HTML.
It's easy to test your HTML with automatic validators.
I'll add to what Mike Edwards said about legal ramifications and remind you that you have a moral obligation too :)
Why not write the prototype in valid (X)HTML in the first place? I've never found that to be more of an effort than using invalid HTML. Producing valid XHTML should be a trivial task. (On the other hand, producing semantically meaningful XHTML might be more taxing.)
In short, I see no advantage whatsoever in using invalid HTML for prototypes.
I honestly don't know why it is extra effort to write standards-based HTML. It's not as if it's hard, and you should be doing it as a matter of professionalism.
If you paid someone to build you a house and he cut corners out of laziness that you didn't notice at the time, but in 10 years cracks appeared in your walls, would you be happy?
Valid HTML just to be able to have a badge on your site - no.
Having "valid HTML" in the sense of "HTML that works on every major browser or browser engine" - yes.
Absolutely. Invalid code can cause all sorts of weird behaviours, and in a validation report the errors that don't cause problems obscure the ones that do.
Case in point:
A yellow background was spilling out of a list of messages and over the heading for the next list of messages - but only in Internet Explorer.
Why? The background was applied to a list item, but the person who wrote the page had written it as a single list with a heading in the middle. Headings are not allowed between list items and different browsers attempted to recover from it in different ways. Internet Explorer ended the list item (with the background colour) when it saw the start of the following item (after the heading), while other browsers ended it when they saw the end tag for the first list item.
It was the only validity error on the page, so it took only a couple of minutes to track down the problem and fix it.
Because, if you stick to standards, your work will be compatible in the future. User agents will strive for standards compliance, and their quirks (non-compliance) modes will always be subject to change. This is the way it's supposed to be.
Unless you're into that whole IE8 broken-standards-perpetuation thing that they want to enable by default -- but that's another argument.
WebKit, Gecko, Presto (is that Opera's engine?), and the others will always become more compliant with every release.
If, on the other hand, your HTML only has to render in an embedded IE browser control, then there's really no reason to output valid HTML as long as it renders.
In my opinion the key criterion is "fit for purpose": if your clients want something for a small/internal market (and don't care if that alienates potential customers who have disabilities or use less-common browsers), then that's their choice.
At the same time, I think it's our responsibility as developers to make sure they know the implications of their decisions. Some organisations will be bound by legislative requirements that websites be usable by screen readers, which typically means standards-compliant HTML.
I believe making valid HTML output won't hurt your development time that much if you've trained yourself to write valid HTML from the start. For one, it's not that hard to learn which tags are not allowed within an element, and the required attributes on a tag are sometimes the ones you'd really need anyway. I believe these are the main errors that make HTML invalid, so why not just learn them as early as now if you plan to stay on the web for long? Plus, outputting valid HTML can help boost your site's ranking.
There are two rules for writing websites:
The site must work for your users.
The site must work for your users.
To meet the first rule, you have to code such that your site renders correctly when using Internet Explorer. Unless you have the freedom to alter your site design to use only those features that IE renders correctly, this means writing invalid HTML.
To meet the second rule, you have to code such that your site renders correctly when using screen-readers and braille screens. Although some newer screen readers can work with IE-targeted sites, in general this means writing valid HTML.
If you're working on a small project, or you're part of a large team, you can code a site that outputs IE-targeted HTML for IE, and valid HTML otherwise. But if you're taking on a medium-to-large project on your own, you have to decide which rule you're going to follow and which one you're going to ignore.
UPDATE:
This is getting voted down by users who think you can always get away with valid HTML in IE. That may be true if you have the flexibility to change your design to get around IE's shortcomings, but if a client has given you a design and you have to get it working, you may have to resort to invalid HTML. It's sad, but it's true, whatever they might think.