I am trying to construct the below URL:
https://console.aws.amazon.com/elasticmapreduce/home?region=us-east-1#cluster-details:j-1IGU6572KT6LB
I am not sure how to include the :j-1IGU6572KT6LB part. When I include the ":", it gets encoded. I am trying to see whether that can be avoided.
This is what I have:
UriBuilder
.fromPath("console.aws.amazon.com")
.path("elasticmapreduce")
.path("home")
.queryParam("region","us-east-1")
.fragment("cluster-details")
.port(-1)
.scheme("https")
If the ":" in a fragment gets encoded, that looks like a bug to me (see RFC 3986, Sections 3.5 and 3.3). I recommend opening a bug report.
OTOH, if a recipient fails to handle the percent-encoded colon, that's a bug as well.
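As a rough illustration in Python rather than JAX-RS (urllib.parse is only a stand-in here, not the UriBuilder API), this sketch builds the same URL while leaving the colon in the fragment unescaped. It also checks that an over-encoded fragment decodes back to the same value, which is what a conforming recipient should do.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit

fragment = "cluster-details:j-1IGU6572KT6LB"

# RFC 3986 allows ":" in a fragment, so mark it as safe when percent-encoding.
encoded_fragment = quote(fragment, safe=":")

url = urlunsplit((
    "https",
    "console.aws.amazon.com",
    "/elasticmapreduce/home",
    "region=us-east-1",
    encoded_fragment,
))
print(url)

# An encoder that escapes everything reserved produces "%3A" instead of ":".
over_encoded = quote(fragment, safe="")
print(over_encoded)

# Decoding the over-encoded form yields the same fragment data.
print(unquote(over_encoded) == fragment)
```

If a builder insists on escaping the colon, the two forms still carry the same fragment data after decoding.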
As far as I can tell, Lua XML-RPC does not include the XML-RPC base64 tag, so transmitting binary data held in a string poses a problem.
I've hacked together a workaround that intercepts the encoded message and flips "string" to "base64" wherever the data is pre-encoded Base64, adding line breaks to keep each line to a sensible length. This works with WordPress servers, which are the target.
Question: is this facility directly in Lua XML-RPC?
Refs.
http://codex.wordpress.org/XML-RPC_WordPress_API/Media
http://keplerproject.github.io/lua-xmlrpc/manual.html#data_types
I am working in a language that has extremely low-level TCP support (if you must know, it's UnrealScript). The response received after making a POST request includes the entire HTTP header, status code, body, etc. as a string.
So, I need to parse the response to extract the body text manually. The HTTP 1.1 specification says:
Response = Status-Line
*(( general-header
| response-header
| entity-header ) CRLF)
CRLF
[message-body]
Am I correct in assuming that the best way to do this is to split the string along a double CRLF (carriage return/line feed) and return the second part of this split?
Or are there weird HTTP edge cases I should be aware of?
Am I correct in assuming that the best way to do this is to split the string along a double CRLF
Yes - but what appears in the body may be compressed using any of three different compression methods, even if you told the server you don't accept compressed responses.
Furthermore, the body may be split into chunks, with each chunk preceded by an indicator of its size.
Do you really have no scope for using an off-the-shelf component for parsing? (I would recommend libcurl.)
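A minimal sketch of the manual approach, written in Python for readability rather than UnrealScript. The function names are made up for this example. It splits on the first blank line only and reassembles a chunked body. It does not handle compressed bodies, repeated header names, or chunk trailers.

```python
def decode_chunked(data: bytes) -> bytes:
    """Reassemble a chunked body: each chunk is '<hex size>[;ext]CRLF<data>CRLF'."""
    out = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size_field = data[pos:line_end].split(b";")[0].decode("ascii")
        size = int(size_field, 16)
        if size == 0:
            break  # last chunk; optional trailers are ignored in this sketch
        start = line_end + 2
        out += data[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data
    return bytes(out)


def split_http_response(raw: bytes):
    """Split a raw HTTP/1.1 response into (status_line, headers, body)."""
    # Split on the FIRST blank line only; the body may itself contain CRLF CRLF.
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete response: no blank line after the headers")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    if "chunked" in headers.get("transfer-encoding", "").lower():
        body = decode_chunked(body)
    return status_line, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
    b"5\r\nHello\r\n"
    b"7\r\n, world\r\n"
    b"0\r\n\r\n"
)
status, headers, body = split_http_response(raw)
print(status)  # HTTP/1.1 200 OK
print(body)    # b'Hello, world'
```

Splitting on every double CRLF instead of only the first one would truncate any body that contains a blank line.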
When running my script, I am getting several errors like this:
Warning: Cannot modify header information - headers already sent by (output started at /some/file.php:12) in /some/file.php on line 23
The lines mentioned in the error messages contain header() and setcookie() calls.
What could be the reason for this? And how to fix it?
No output before sending headers!
Functions that send/modify HTTP headers must be invoked before any output is made.
Otherwise the call fails:
Warning: Cannot modify header information - headers already sent (output started at script:line)
Some functions modifying the HTTP header are:
header / header_remove
session_start / session_regenerate_id
setcookie / setrawcookie
Output can be:
Unintentional:
Whitespace before <?php or after ?>
The UTF-8 Byte Order Mark specifically
Previous error messages or notices
Intentional:
print, echo and other functions producing output
Raw <html> sections prior to <?php code.
Why does it happen?
To understand why headers must be sent before output it's necessary
to look at a typical HTTP
response. PHP scripts mainly generate HTML content, but also pass a
set of HTTP/CGI headers to the webserver:
HTTP/1.1 200 OK
X-Powered-By: PHP/5.3.7
Vary: Accept-Encoding
Content-Type: text/html; charset=utf-8

<html><head><title>PHP page output page</title></head>
<body><h1>Content</h1> <p>Some more output follows...</p>
and <img src=internal-icon-delayed>
The page/output always follows the headers. PHP has to pass the
headers to the webserver first. It can only do that once.
After the double linebreak it can no longer amend them.
When PHP receives the first output (print, echo, <html>) it will
flush all collected headers. Afterward it can send all the output
it wants. But sending further HTTP headers is impossible then.
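The flush-on-first-output behaviour can be mimicked with a toy model. This is a Python sketch with a hypothetical ToyResponse class. It is not how PHP is implemented internally. It only shows why a header call after the first byte of output has nowhere to go.

```python
class ToyResponse:
    """Toy model of a response stream (hypothetical, not PHP's real internals)."""

    def __init__(self):
        self.headers = []          # headers collected so far
        self.wire = []             # lines already handed to the web server
        self.headers_sent = False

    def header(self, line: str) -> bool:
        """Queue a header; fails once any output has been written."""
        if self.headers_sent:
            print("Warning: Cannot modify header information - headers already sent")
            return False
        self.headers.append(line)
        return True

    def echo(self, text: str) -> None:
        """Write body output; the first call flushes the collected headers."""
        if not self.headers_sent:
            self.wire.extend(self.headers)
            self.wire.append("")   # the blank line that ends the header block
            self.headers_sent = True
        self.wire.append(text)


resp = ToyResponse()
resp.header("Content-Type: text/html; charset=utf-8")
resp.echo(" ")                                   # a single stray space counts as output
ok = resp.header("Location: /finished.html")     # too late, prints the warning
print(ok)  # False
```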
How can you find out where the premature output occurred?
The header() warning contains all relevant information to
locate the problem cause:
Warning: Cannot modify header information - headers already sent by
(output started at /www/usr2345/htdocs/auth.php:52) in
/www/usr2345/htdocs/index.php on line 100
Here "line 100" refers to the script where the header() invocation failed.
The "output started at" note within the parentheses is more significant.
It names the source of the previous output. In this example, it's auth.php
and line 52. That's where you have to look for premature output.
Typical causes:
Print, echo
Intentional output from print and echo statements will terminate the opportunity to send HTTP headers. The application flow must be restructured to avoid that. Use functions
and templating schemes. Ensure header() calls occur before messages
are written out.
Functions that produce output include
print, echo, printf, vprintf
trigger_error, ob_flush, ob_end_flush, var_dump, print_r
readfile, passthru, flush, imagepng, imagejpeg
among others and user-defined functions.
Raw HTML areas
Unparsed HTML sections in a .php file are direct output as well.
Script conditions that will trigger a header() call must be evaluated
before any raw <html> blocks.
<!DOCTYPE html>
<?php
// Too late for headers already.
Use a templating scheme to separate processing from output logic.
Place form processing code atop scripts.
Use temporary string variables to defer messages.
The actual output logic and intermixed HTML output should follow last.
Whitespace before <?php for "script.php line 1" warnings
If the warning refers to output in line 1, then it's most likely
leading whitespace, text or HTML before the opening <?php token.
<?php
# There's a SINGLE space/newline before <? - Which already seals it.
Similarly it can occur for appended scripts or script sections:
?>
<?php
PHP actually eats up a single linebreak after close tags. But it won't
compensate for multiple newlines, tabs or spaces shifted into such gaps.
UTF-8 BOM
Linebreaks and spaces alone can be a problem. But there are also "invisible"
character sequences that can cause this. Most famously the
UTF-8 BOM (Byte-Order-Mark)
which isn't displayed by most text editors. It's the byte sequence EF BB BF, which is optional and redundant for UTF-8 encoded documents. PHP however has to treat it as raw output. It may show up as the characters ï»¿ in the output (if the client interprets the document as Latin-1) or similar "garbage".
In particular graphical editors and Java-based IDEs are oblivious to its
presence. They don't visualize it (as obliged by the Unicode standard).
Most programmer and console editors however do.
There it's easy to recognize the problem early on. Other editors may identify
its presence in a file/settings menu (Notepad++ on Windows can identify and
remedy the problem).
Another option to inspect the BOM's presence is resorting to a hex editor.
On *nix systems hexdump is usually available,
if not a graphical variant which simplifies auditing these and other issues.
An easy fix is to set the text editor to save files as "UTF-8 (no BOM)"
or similar nomenclature. Newcomers otherwise often resort to creating new files and just copy-and-pasting the previous code back in.
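If no suitable editor or hex editor is at hand, the check is easy to script. Below is a small Python sketch with helper names invented for this example. It looks for the EF BB BF sequence at the start of a file and strips it. The demo writes a throwaway index.php into a temporary directory, so nothing real is touched.

```python
import codecs
import tempfile
from pathlib import Path


def has_utf8_bom(path: Path) -> bool:
    """Return True if the file starts with the bytes EF BB BF."""
    with path.open("rb") as fh:
        return fh.read(3) == codecs.BOM_UTF8


def strip_utf8_bom(path: Path) -> bool:
    """Remove a leading UTF-8 BOM in place; return True if one was removed."""
    data = path.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    path.write_bytes(data[len(codecs.BOM_UTF8):])
    return True


with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "index.php"
    # Simulate a file saved as "UTF-8 with BOM".
    script.write_bytes(codecs.BOM_UTF8 + b"<?php header('Location: /');\n")

    had_bom = has_utf8_bom(script)
    stripped = strip_utf8_bom(script)
    first_bytes = script.read_bytes()[:2]

print(had_bom, stripped, first_bytes)  # True True b'<?'
```

Back up files first before running something like this across a whole project directory.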
Correction utilities
There are also automated tools to examine and rewrite text files
(sed/awk or recode).
For PHP specifically there's the phptags tag tidier.
It rewrites close and open tags into long and short forms, but also easily
fixes leading and trailing whitespace, Unicode and UTF-x BOM issues:
phptags --whitespace *.php
It's safe to use on a whole include or project directory.
Whitespace after ?>
If the error source is mentioned as behind the
closing ?>
then this is where some whitespace or the raw text got written out.
The PHP end marker does not terminate script execution at this point. Any text/space characters after it will be written out as page content
still.
It's commonly advised, in particular to newcomers, that trailing ?> PHP
close tags should be omitted. This avoids a small portion of these cases.
(Quite commonly include()d scripts are the culprit.)
Error source mentioned as "Unknown on line 0"
It's typically a PHP extension or php.ini setting if no error source
is given.
It's occasionally the gzip stream encoding setting
or the ob_gzhandler.
But it could also be any doubly loaded extension= module
generating an implicit PHP startup/warning message.
Preceding error messages
If another PHP statement or expression causes a warning message or
notice to be printed out, that also counts as premature output.
In this case you need to avoid the error,
delay the statement execution, or suppress the message with e.g.
isset() or @() -
when either doesn't obstruct debugging later on.
No error message
If you have error_reporting or display_errors disabled per php.ini,
then no warning will show up. But ignoring errors won't make the problem go
away. Headers still can't be sent after premature output.
So when header("Location: ...") redirects silently fail it's very
advisable to probe for warnings. Reenable them with two simple commands
atop the invocation script:
error_reporting(E_ALL);
ini_set("display_errors", 1);
Or set_error_handler("var_dump"); if all else fails.
Speaking of redirect headers, you should often use an idiom like
this for final code paths:
exit(header("Location: /finished.html"));
Preferably even a utility function, which prints a user message
in case of header() failures.
Output buffering as a workaround
PHP's output buffering
is a workaround to alleviate this issue. It often works reliably, but shouldn't
substitute for proper application structuring and separating output from control
logic. Its actual purpose is minimizing chunked transfers to the webserver.
The output_buffering=
setting nevertheless can help.
Configure it in the php.ini
or via .htaccess
or even .user.ini on
modern FPM/FastCGI setups.
Enabling it will allow PHP to buffer output instead of passing it to the webserver instantly. PHP thus can aggregate HTTP headers.
It can likewise be engaged with a call to ob_start();
atop the invocation script. Which however is less reliable for multiple reasons:
Even if <?php ob_start(); ?> starts the first script, whitespace or a
BOM might get shuffled before, rendering it ineffective.
It can conceal whitespace for HTML output. But as soon as the application logic attempts to send binary content (a generated image for example),
the buffered extraneous output becomes a problem. (Necessitating ob_clean()
as a further workaround.)
The buffer is limited in size, and can easily overrun when left to defaults.
And that's not a rare occurrence either, difficult to track down
when it happens.
Both approaches therefore may become unreliable - in particular when switching between
development setups and/or production servers. This is why output buffering is
widely considered just a crutch / strictly a workaround.
See also the basic usage example
in the manual, and for more pros and cons:
What is output buffering?
Why use output buffering in PHP?
Is using output buffering considered a bad practice?
Use case for output buffering as the correct solution to "headers already sent"
But it worked on the other server!?
If you didn't get the headers warning before, then the output buffering
php.ini setting
has changed. It's likely unconfigured on the current/new server.
Checking with headers_sent()
You can always use headers_sent() to probe if
it's still possible to... send headers. Which is useful to conditionally print
info or apply other fallback logic.
if (headers_sent()) {
die("Redirect failed. Please click on this link: <a href=...>");
}
else{
exit(header("Location: /user.php"));
}
Useful fallback workarounds are:
HTML <meta> tag
If your application is structurally hard to fix, then an easy (but
somewhat unprofessional) way to allow redirects is injecting a HTML
<meta> tag. A redirect can be achieved with:
<meta http-equiv="Location" content="http://example.com/">
Or with a short delay:
<meta http-equiv="Refresh" content="2; url=../target.html">
This leads to non-valid HTML when utilized past the <head> section.
Most browsers still accept it.
JavaScript redirect
As an alternative, a JavaScript redirect
can be used for page redirects:
<script> location.replace("target.html"); </script>
While this is often more HTML compliant than the <meta> workaround,
it incurs a reliance on JavaScript-capable clients.
Both approaches however make acceptable fallbacks when genuine HTTP header()
calls fail. Ideally you'd always combine this with a user-friendly message and
clickable link as last resort. (Which for instance is what the http_redirect()
PECL extension does.)
Why setcookie() and session_start() are also affected
Both setcookie() and session_start() need to send a Set-Cookie: HTTP header.
The same conditions therefore apply, and similar error messages will be generated
for premature output situations.
(Of course, they're furthermore affected by disabled cookies in the browser
or even proxy issues. The session functionality obviously also depends on free
disk space and other php.ini settings, etc.)
Further links
Google provides a lengthy list of similar discussions.
And of course many specific cases have been covered on Stack Overflow as well.
The WordPress FAQ explains How do I solve the Headers already sent warning problem? in a generic manner.
Adobe Community: PHP development: why redirects don't work (headers already sent)
Nucleus FAQ: What does "page headers already sent" mean?
One of the more thorough explanations is HTTP Headers and the PHP header() Function - A tutorial by NicholasSolutions (Internet Archive link).
It covers HTTP in detail and gives a few guidelines for rewriting scripts.
This error message gets triggered when anything is sent before you send HTTP headers (with setcookie or header). Common reasons for outputting something before the HTTP headers are:
Accidental whitespace, often at the beginning or end of files, like this:
<?php
// Note the space before "<?php"
?>
To avoid this, simply leave out the closing ?> - it's not required anyways.
Byte order marks at the beginning of a PHP file. Examine your PHP files with a hex editor to find out whether that's the case. They should start with the bytes 3C 3F (the characters "<?"). You can safely remove the BOM EF BB BF from the start of files.
Explicit output, such as calls to echo, printf, readfile, passthru, code before <? etc.
A warning outputted by php, if the display_errors php.ini property is set. Instead of crashing on a programmer mistake, php silently fixes the error and emits a warning. While you can modify the display_errors or error_reporting configurations, you should rather fix the problem.
Common reasons are accesses to undefined elements of an array (such as $_POST['input'] without using empty or isset to test whether the input is set), or using an undefined constant instead of a string literal (as in $_POST[input], note the missing quotes).
Turning on output buffering should make the problem go away; all output after the call to ob_start is buffered in memory until you release the buffer, e.g. with ob_end_flush.
However, while output buffering avoids the issues, you should really determine why your application outputs an HTTP body before the HTTP header. That'd be like taking a phone call and discussing your day and the weather before telling the caller that he's got the wrong number.
I have run into this error many times, and I am certain every PHP programmer has seen it at least once.
Possible Solution 1
This error may be caused by blank spaces before the start of the file or after the end of the file. These blank spaces should not be there.
ex)
THERE SHOULD BE NO BLANK SPACES HERE
<?php
echo "your code here";
?>
THERE SHOULD BE NO BLANK SPACES HERE
Check all files associated with the file that causes this error.
Note: Sometimes an editor (IDE) such as gedit (a default Linux editor) adds one blank line when saving a file. This should not happen. If you are using Linux, you can use the vi editor to remove spaces/lines after ?> at the end of the page.
Possible Solution 2:
If this is not your case, then use ob_start to turn on output buffering:
<?php
ob_start();
// code
ob_end_flush();
?>
This will turn output buffering on and your headers will be created after the page is buffered.
Instead of the below line
//header("Location:".ADMIN_URL."/index.php");
write
echo("<script>location.href = '".ADMIN_URL."/index.php?msg=$msg';</script>");
or
?><script><?php echo("location.href = '".ADMIN_URL."/index.php?msg=$msg';");?></script><?php
This should solve your problem.
I faced the same problem and solved it by writing the redirect in the above way.
You do
printf ("Hi %s,</br />", $name);
before setting the cookies, which isn't allowed. You can't send any output before the headers, not even a blank line.
COMMON PROBLEMS:
(copied from: source)
====================
1) there should not be any output (i.e. echo.. or HTML codes) before the header(.......); command.
2) remove any white-space(or newline) before <?php and after ?> tags.
3) GOLDEN RULE! - check that the PHP file (and any other files you include) is saved with "UTF-8 without BOM" encoding (and not just UTF-8). That is the problem in many cases, because a UTF-8 file with a BOM has special bytes at the start of the file which your text editor doesn't show!
4) After header(...); you must use exit;
5) always use 301 or 302 reference:
header("location: http://example.com", true, 301 ); exit;
6) Turn on error reporting, and find the error. Your error may be caused by a function that is not working. When you turn on error reporting, you should always fix the top-most error first. For example, it might be "Warning: date_default_timezone_get(): It is not safe to rely on the system's timezone settings." - then farther down you may see the "headers already sent" error. After fixing the top-most (1st) error, reload your page. If you still have errors, then again fix the top-most error.
7) If none of the above helps, use JAVASCRIPT redirection (however, this method is strongly discouraged); it may be the last resort in custom cases:
echo "<script type='text/javascript'>window.top.location='http://website.com/';</script>"; exit;
It is because of this line:
printf ("Hi %s,</br />", $name);
You should not print/echo anything before sending the headers.
A simple tip: A simple space (or invisible special char) in your script, right before the very first <?php tag, can cause this !
Especially when you are working in a team and somebody is using a "weak" IDE or has messed around in the files with strange text editors.
I have seen these things ;)
Another bad practice can invoke this problem which is not stated yet.
See this code snippet:
<?php
include('a_important_file.php'); //really really really bad practise
header("Location:A location");
?>
Things are okay,right?
What if "a_important_file.php" is this:
<?php
//some php code
//another line of php code
//no line above is generating any output
?>
----------This is the end of the an_important_file-------------------
This will not work. Why? Because a new line has already been output.
Now, even if this exact case is rare, what if you are using an MVC framework which loads lots of files before handing things over to your controller? That is not an uncommon scenario. Be prepared for this.
From PSR-2 2.2 :
All PHP files MUST use the Unix LF (linefeed) line ending.
All PHP files MUST end with a single blank line.
The closing ?> tag MUST be omitted from files containing only php
Believe me, following these standards can save you a hell of a lot of hours of your life :)
Sometimes, when the development process involves both Windows workstations and Linux systems (hosting), and you do not see any output in the code before the related line, the cause may be the formatting of the file and the lack of Unix LF (linefeed)
line endings.
What we usually do to fix this quickly is rename the file, create a new file in its place on the Linux system, and then copy the content into it. This often solves the issue, since some files created on Windows cause the problem once they are moved to the hosting.
This is an easy fix for sites we manage by FTP, and it can sometimes save our new team members some time.
Generally this error arises when we send a header after echoing or printing. If this error arises on a specific page, then make sure that the page is not echoing anything before calling session_start().
Example of Unpredictable Error:
<?php //a white-space before <?php also send for output and arise error
session_start();
session_regenerate_id();
//your page content
One more example:
<?php
include 'functions.php';
?> <!-- This new line will also raise the error -->
<?php
session_start();
session_regenerate_id();
//your page content
Conclusion: Do not output any character before calling the session_start() or header() functions, not even a whitespace or newline.
In some javascript, I have:
var url = "find.aspx?" + "location=" + encodeURIComponent( address );
alert( url );
location.href = url;
where the value of address is the string "Seattle, WA".
In the alert I see
find.aspx?location=Seattle%2C%20WA
as I expect.
But on the server side, when I look at Request.Url, the relevant substring I see is
find.aspx?location=Seattle, WA
And in the Firefox url window I see
find.aspx?location=Seattle%2C WA
So I'm getting three different representations whereas I would expect that in all three places I should see what I see in the alert. My expectation is that the url I assign to location.href should show up as-is in the browser url window, and should be passed as-is to the server in Request.Url (and I would need to decode the values on the server before using them). What's happening?
Firefox converts certain encoded characters into their literal forms as a way to be friendly to users. It will also convert spaces typed into the address bar into %20 for the server.
Update: The reason Firefox doesn't display the comma unencoded is because commas are allowed in URLs, but spaces are not, so it knows that a space is going to be unambiguously interpreted, whereas the pre-encoded comma is different from a non-encoded comma to some servers. see: Can I use commas in a URL?
ASP is probably trying to help you out by auto-un-encoding the string for you.
Update: It looks like ASP.NET unencodes Request.Url for you by default, as mentioned here: QueryString malformed after URLDecode. They also mention that you can use HttpRequest.Url.Query to access the un-decoded version.
The alert is the only thing not doing any "magic" for you.
For the alert, you are doing the encoding yourself. Perhaps it looks the same as on the server-side if you removed encodeURIComponent.
On the server side, ASP.NET will always show you the unencoded form. This is to make it easier to directly map to files that also have text that needed to be (un)encoded.
Note that you can replace every letter with its UTF-8 representation in URL encoding. It will still be the same URL. I.e., type the following in the browser window and it will still work: %66%69%6E%64.aspx?location=Seattle%2C%20WA. To encode only the necessary characters, use UrlEncode on the server side if you create a link yourself.
URL encoding can become fairly tricky. You asked for an explanation. To know the correct escape for a certain character, you need to know how that character looks in UTF-8. The hexadecimal values of the UTF-8 bytes then become the %XX%YY value of your letter. Sometimes it's a single %XX, but it can be up to four bytes in total (most Chinese characters, for instance, take three).
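To make the byte-to-escape mapping concrete, here is a short Python sketch (not ASP.NET or JavaScript). The percent_encode helper is written by hand for this example and escapes every byte unconditionally. The standard library's quote does the same job but leaves unreserved characters alone.

```python
from urllib.parse import quote


def percent_encode(char: str) -> str:
    """Percent-encode a character by hand from its UTF-8 bytes."""
    return "".join(f"%{byte:02X}" for byte in char.encode("utf-8"))


comma = percent_encode(",")      # one byte    -> %2C
e_acute = percent_encode("é")    # two bytes   -> %C3%A9
chinese = percent_encode("中")   # three bytes -> %E4%B8%AD

# The library call escapes only what needs escaping.
library_result = quote("Seattle, WA", safe="")

print(comma, e_acute, chinese)
print(library_result)  # Seattle%2C%20WA
```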
URL encoding works one way only. Never double-encode or double-unencode. This is prohibited by the specification. Also, because you can encode any character, it is not always possible (as you found out) to do round-trip encoding/unencoding. If you unencode and re-encode, it is quite possible that the resulting string is different, yet equivalent.
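Both effects are easy to reproduce. The following Python sketch uses urllib.parse as a stand-in for whatever encoder your platform provides. It shows what double encoding does to the "%" sign, and how an unencode/re-encode round trip can return a different string.

```python
from urllib.parse import quote, unquote

# Double encoding: the "%" of the first pass is itself escaped as "%25".
once = quote("Seattle, WA", safe="")
twice = quote(once, safe="")
decoded_once = unquote(twice)   # a single decode no longer yields the original

# Round trip: an over-encoded name decodes fine, but re-encoding
# produces the shorter, equivalent form rather than the input string.
over_encoded = "%66%69%6E%64.aspx"
round_trip = quote(unquote(over_encoded), safe="")

print(once)          # Seattle%2C%20WA
print(twice)         # Seattle%252C%2520WA
print(decoded_once)  # Seattle%2C%20WA
print(round_trip)    # find.aspx
```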
In HTML, URL encoding is sometimes interspersed with HTML encoding. I.e., the ampersand is valid in a URL, but not in HTML. find.aspx?city=A&name=B becomes find.aspx?city=A&amp;name=B in an HTML URL. However, browsers are lenient and will accept wrongly HTML-encoded strings.
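The two layers can be kept apart by applying them in order: URL-encode the values first, then HTML-encode the finished URL when it goes into markup. Here is a tiny Python sketch of the second step, using the standard html module. The link text is just a placeholder.

```python
from html import escape, unescape

url = "find.aspx?city=A&name=B"

# HTML-encode the URL before placing it in an attribute value.
href = escape(url)
anchor = f'<a href="{href}">Search</a>'

print(href)                   # find.aspx?city=A&amp;name=B
print(unescape(href) == url)  # True: the browser undoes this layer before requesting
```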
Finally, a note on the browser: if you type a space in a link, even inside an <a> tag, it will escape the space (or other character) for you. Likewise, it will nowadays show the odd characters (é, ï etc.) in the address bar, but when it sends the URL over HTTP, the browser will correctly do the encoding for you.
Update: about answering your question of needing a "definitive" reference or proof.
While I couldn't find any on the internet, I decided to look for it myself using Reflector. Going through the methods that set, for instance, the HttpRequest.QueryString, you quickly encounter the private method HttpRequest.FillInQueryStringCollection, which then calls HttpValueCollection.FillFromEncodedBytes. Somewhat near the end of that method, HttpUtility.UrlDecode is called for the values. Conclusion: do not call it yourself, to prevent double decoding.
You can see this for yourself when you download Reflector and disassemble the .NET libs of System.Web.
For your example you can change this line
var url = "find.aspx?" + "location=" + encodeURIComponent( address );
to
var url = "find.aspx?" + "location=" + address;
and see the address as it is. But if the address variable contains any '&' character, your query string will be corrupted. That is why you use encodeURIComponent to encode such characters in the URL.
On the Server side all these encoded strings are decoded back. It means encodeURIComponent is just for sending the address variable (whether it contains & character or not) to server side correctly.
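Here is a quick way to see the corruption, sketched in Python with urllib.parse standing in for both the browser side and the server side. The address value is made up for the demonstration.

```python
from urllib.parse import parse_qs, quote

address = "Barnes & Noble, Seattle"

# Without encoding, the "&" splits the value into two query fields.
raw_query = "location=" + address
raw_parsed = parse_qs(raw_query)

# With encoding, the server-side decode restores the full value.
encoded_query = "location=" + quote(address, safe="")
encoded_parsed = parse_qs(encoded_query)

print(raw_parsed)      # the value is cut off at the ampersand
print(encoded_query)   # location=Barnes%20%26%20Noble%2C%20Seattle
print(encoded_parsed)  # {'location': ['Barnes & Noble, Seattle']}
```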