QUrl and encoding - qt

I'm having a problem on mac with Qt 4.8.x
I have a mailto QURL which has a body component that's encoded. Simplifying it:
mailto:?subject=Hello&body=Hello%2C%0A%0AYou%20have%20been%20invited
My problem is that the encoding still shows on the mail body when my default app launches (tried multiple e-mail clients). This works well for Windows, but fails on Mac.
Is there a way for me to have the encoding turn into the encoded characters into readable text in the e-mail client body?

Also use fromPercentEncoding to decode Qurl format in to clear string.

QUrl is buggy. There is no way around this bug but to go directly to Cocoa.
void MyMacLaunchURLClass::LaunchURL(char * url)
{
NSString *urlStr = [NSString stringWithCString:url encoding:NSASCIIStringEncoding];
[[NSWorkspace sharedWorkspace] openURL:[NSURL URLWithString:urlStr]];
}
That resolved the bug.

Related

Accented characters of a vCard not displaying in Windows Contacts

I am writing a method that allows a user to generate a vCard file based on some information. It works quite well and the user can open the generated file with Microsoft Outlook without any problem (tested on versions 2010 and 2013). But when he wants to open it with Windows Contacts, the accented characters are not well displayed whereas I have set the charset to the different fields to UTF-8.
Here is an example of a generated vCard:
BEGIN:VCARD
VERSION:4.0
FN;CHARSET=utf-8:Vandenbergh Cédric
N;CHARSET=utf-8:Vandenbergh;Cédric;;;
TITLE;CHARSET=utf-8:Manager d'entité
END:VCARD
And here is what I get when opening it with Windows Contacts:
Could it be a bug in Windows Contacts (tested under Windows 8.1)? Or am I missing something?
Edit
Here is the code I use for a user to download a vCard:
Context.Response.Clear();
Context.Response.Buffer = true;
Context.Response.ContentType = "text/x-vCard; charset=utf-8";
Context.Response.AddHeader("Content-Disposition", "attachment;filename=file.vcf");
Context.Response.Write(user.GetVCard()); // Get vCard content
Context.Response.Flush();
Context.Response.End();
Edit 2/10/15
I still have no idea on how to solve this problem. In my opinion, it could really be a Windows Contacts bug but I cannot find anything on the web to confirm my impression.
Just because you add a "CHARSET=utf-8" parameter doesn't mean the value will be written in UTF-8. You are probably not setting the character encoding to "UTF-8" when you write the vCard.
Also, note that vCard version 4.0 does not support the "CHARSET" parameter. The specification states that all vCards MUST be written in UTF-8.

Define Character Encoding of QWebElement's `toPlainText()`

I'm having trouble getting the hang of the character encoding while dealing with QWebKit's QWebElement and its toPlainText() function (*).
I have got a QString with UTF8 encoding holding the content of a HTML page, which was read from local disc via QFile. No I want to parse this page by using QWebKit. Thus I defined a QWebFrame object as part of a QWebPage. With QWebFrame::setHtml() I filled in the QString into the QWebKit environment.
QString rawReport = "some UTF8 encoded string read in previously";
QWebPage p;
QWebFrame *frame = p.mainFrame();
frame->setHtml(rawReport);
QWebElement report = frame->documentElement();
qDebug() << report.toPlainText();
But somehow, qDebug() seems to get the encoding wrong as for example German umlauts äöüß are shown rather funny. Even not as their corresponding HTML entities.
I doubt it's qDebug's fault but rather the encoding inside QWebElement. Somewhere I read, that QWebFrame::setHtml() expects UTF8 encoding. But I'm almost sure, this is the case here.
What am I missing? Is there somewhere a function/option to force QWebFrame/QWebElement to use a specific character encoding for both, input and output?
[*] Using QWebElement::toOuterXml() or QWebElement::toInnerXml() show the same encoding problem.
Have you tried using from***() functions of QString to find how the string returned by toPlainText() is encoded?
The documentation states
When using this method WebKit assumes that external resources such as JavaScript programs or style sheets are encoded in UTF-8 unless otherwise specified. For example, the encoding of an external script can be specified through the charset attribute of the HTML script tag. It is also possible for the encoding to be specified by web server.''.
I would thus try to change the charset specified in the html source (in the corresponding meta tag) that you are loading to explicitly specify that you are using UTF-8.

Question marks in email for characters like non-breaking space. It only occurs on Unix and not on Windows

I am facing a weird problem related to content type/encoding.
Here is my Java code snippet below. This code works perfectly fine on a Windows machine where the application server is running on windows and the SMTP server for sending emails is also Windows localhost. When I deploy the same code on a Unix server, the email sent for the exact same content contains question marks (???) for special characters like non-breaking white space.
I did a lot of googling, but I did not find any solution. How can I fix this problem? The content types I tried were ISO-8859-1, UTF-8 and Windows-1252. Nothing helps.
MimeMessage message = new MimeMessage(session);
.............
Multipart mp = new MimeMultipart();
MimeBodyPart messageBody = new MimeBodyPart();
messageBody.setContent(mailMessage, "text/html;charset=Windows-1252");
messageBody.setHeader("Content-Type", "text/html;charset=Windows-1252");
// Add body to the multimedia part
mp.addBodyPart(messageBody);
message.setContent(mp);
// Send message
Transport.send(message);
Are you using the same mail server in both cases? And the same client program to view the message?
For debugging, just before the Transport.send call, add:
message.writeTo(new FileOutputStream("msg.txt"));
and then examine the msg.txt file to see if the characters are correctly encoded.
How do you create the text in the mailMessage String? If you don't create the string with the correct Unicode characters, no charset is going to make it right.
Also, you don't ever need to set the Content-Type header explicitly, remove that line.
And, instead of setContent, use:
messageBody.setText(mailMessage, "html", "utf-8");
That makes sure the Content-Type header is set correctly and the parameters (e.g., charset) are quoted correctly.
Ultimately, I had to go with a crude way of doing it. I replaced such characters with space.
mailMessage.replaceAll("[^\\x20-\\x7e]", " ");
Now, all the special characters like non-breaking space or any other character falling out of normal range, will be replaced with space. The email in this case was anyway meant for normal text.

How to get the "query string" from a QUrl?

I have a QUrl and I need to extract the path+file+params. Basically everything but the hostname - what would be requested via HTTP.
I looked through the Qt 4.6 docs but I couldn't find anything that looked like it would do this.
What method(s) would I call?
You can clear the scheme with setScheme. After that the url will be relative so it shouldn't return the hostname anymore when converting it to a string.
QUrl someUrl("http://stackoverflow.com/foo/bar?spam=eggs");
someUrl.setScheme("");
someUrl.toString();
Or, you can give the toString() method some extra parameters:
QUrl someUrl("http://stackoverflow.com/foo/bar?spam=eggs");
someUrl.toString(QUrl::RemoveScheme);

"name" web pdf for better default save filename in Acrobat?

My app generates PDFs for user consumption. The "Content-Disposition" http header is set as mentioned here. This is set to "inline; filename=foo.pdf", which should be enough for Acrobat to give "foo.pdf" as the filename when saving the pdf.
However, upon clicking the "Save" button in the browser-embedded Acrobat, the default name to save is not that filename but instead the URL with slashes changed to underscores. Huge and ugly. Is there a way to affect this default filename in Adobe?
There IS a query string in the URLs, and this is non-negotiable. This may be significant, but adding a "&foo=/title.pdf" to the end of the URL doesn't affect the default filename.
Update 2: I've tried both
content-disposition inline; filename=foo.pdf
Content-Type application/pdf; filename=foo.pdf
and
content-disposition inline; filename=foo.pdf
Content-Type application/pdf; name=foo.pdf
(as verified through Firebug) Sadly, neither worked.
A sample url is
/bar/sessions/958d8a22-0/views/1493881172/export?format=application/pdf&no-attachment=true
which translates to a default Acrobat save as filename of
http___localhost_bar_sessions_958d8a22-0_views_1493881172_export_format=application_pdf&no-attachment=true.pdf
Update 3: Julian Reschke brings actual insight and rigor to this case. Please upvote his answer.
This seems to be broken in FF (https://bugzilla.mozilla.org/show_bug.cgi?id=433613) and IE but work in Opera, Safari, and Chrome. http://greenbytes.de/tech/tc2231/#inlwithasciifilenamepdf
Part of the problem is that the relevant RFC 2183 doesn't really state what to do with a disposition type of "inline" and a filename.
Also, as far as I can tell, the only UA that actually uses the filename for type=inline is Firefox (see test case).
Finally, it's not obvious that the plugin API actually makes that information available (maybe someboy familiar with the API can elaborate).
That being said, I have sent a pointer to this question to an Adobe person; maybe the right people will have a look.
Related: see attempt to clarify Content-Disposition in HTTP in draft-reschke-rfc2183-in-http -- this is early work in progress, feedback appreciated.
Update: I have added a test case, which seems to indicate that the Acrobat reader plugin doesn't use the response headers (in Firefox), although the plugin API provides access to them.
Set the file name in ContentType as well. This should solve the problem.
context.Response.ContentType = "application/pdf; name=" + fileName;
// the usual stuff
context.Response.AddHeader("content-disposition", "inline; filename=" + fileName);
After you set content-disposition header, also add content-length header, then use binarywrite to stream the PDF.
context.Response.AddHeader("Content-Length", fileBytes.Length.ToString());
context.Response.BinaryWrite(fileBytes);
Like you, I tried and tried to get this to work. Finally I gave up on this idea, and just opted for a workaround.
I'm using ASP.NET MVC Framework, so I modified my routes for that controller/action to make sure that the served up PDF file is the last part of the location portion of the URI (before the query string), and pass everything else in the query string.
Eg:
Old URI:
http://server/app/report/showpdf?param1=foo&param2=bar&filename=myreport.pdf
New URI:
http://server/app/report/showpdf/myreport.pdf?param1=foo&param2=bar
The resulting header looks exactly like what you've described (content-type is application/pdf, disposition is inline, filename is uselessly part of the header). Acrobat shows it in the browser window (no save as dialog) and the filename that is auto-populated if a user clicks the Acrobat Save button is the report filename.
A few considerations:
In order for the filenames to look decent, they shouldn't have any escaped characters (ie, no spaces, etc)... which is a bit limiting. My filenames are auto-generated in this case, and before had spaces in them, which were showing up as '%20's in the resulting save dialog filename. I just replaced the spaces with underscores, and that worked out.
This is by no names the best solution, but it does work. It also means that you have to have the filename available to make it part of the original URI, which might mess with your program's workflow. If it's currently being generated or retrieved from a database during the server-side call that generates the PDF, you might need to move the code that generates the filename to javascript as part of a form submission or if it comes from a database make it a quick ajax call to get the filename when building the URL that results in the inlined PDF.
If you're taking the filename from a user input on a form, then that should be validated not to contain escaped characters, which will annoy users.
Hope that helps.
Try placing the file name at the end of the URL, before any other parameters. This worked for me.
http://www.setasign.de/support/tips-and-tricks/filename-in-browser-plugin/
In ASP.NET 2.0 change the URL from
http://www. server.com/DocServe.aspx?DocId=XXXXXXX
to
http://www. server.com/DocServe.aspx/MySaveAsFileName?DocId=XXXXXXX
This works for Acrobat 8 and the default SaveAs filename is now MySaveAsFileName.pdf.
However, you have to restrict the allowed characters in MySaveAsFileName (no periods, etc.).
Apache's mod_rewrite can solve this.
I have a web service with an endpoint at /foo/getDoc.service. Of course Acrobat will save files as getDoc.pdf. I added the following lines in apache.conf:
LoadModule RewriteModule modules/mod_rewrite.so
RewriteEngine on
RewriteRule ^/foo/getDoc/(.*)$ /foo/getDoc.service [P,NE]
Now when I request /foo/getDoc/filename.pdf?bar&qux, it gets internally rewritten to /foo/getDoc.service?bar&qux, so I'm hitting the correct endpoint of the web service, but Acrobat thinks it will save my file as filename.pdf.
If you use asp.net, you can control pdf filename through page (url) file name.
As other users wrote, Acrobat is a bit s... when it choose the pdf file name when you press "save" button: it takes the page name, removes the extension and add ".pdf".
So /foo/bar/GetMyPdf.aspx gives GetMyPdf.pdf.
The only solution I found is to manage "dynamic" page names through an asp.net handler:
create a class that implements IHttpHandler
map an handler in web.config bounded to the class
Mapping1: all pages have a common radix (MyDocument_):
<httpHandlers>
<add verb="*" path="MyDocument_*.ashx" type="ITextMiscWeb.MyDocumentHandler"/>
Mapping2: completely free file name (need a folder in path):
<add verb="*" path="/CustomName/*.ashx" type="ITextMiscWeb.MyDocumentHandler"/>
Some tips here (the pdf is dynamically created using iTextSharp):
http://fhtino.blogspot.com/2006/11/how-to-show-or-download-pdf-file-from.html
Instead of attachment you can try inline:
Response.AddHeader("content-disposition", "inline;filename=MyFile.pdf");
I used inline in a previous web application that generated Crystal Reports output into PDF and sent that in browser to the user.
File download dialog (PDF) with save and open option
Points To Remember:
Return Stream with correct array size from service
Read the byte arrary from stream with correct byte length on the basis of stream length.
set correct contenttype
Here is the code for read stream and open the File download dialog for PDF file
private void DownloadSharePointDocument()
{
Uri uriAddress = new Uri("http://hyddlf5187:900/SharePointDownloadService/FulfillmentDownload.svc/GetDocumentByID/1/drmfree/");
HttpWebRequest req = WebRequest.Create(uriAddress) as HttpWebRequest;
// Get response
using (HttpWebResponse httpWebResponse = req.GetResponse() as HttpWebResponse)
{
Stream stream = httpWebResponse.GetResponseStream();
int byteCount = Convert.ToInt32(httpWebResponse.ContentLength);
byte[] Buffer1 = new byte[byteCount];
using (BinaryReader reader = new BinaryReader(stream))
{
Buffer1 = reader.ReadBytes(byteCount);
}
Response.Clear();
Response.ClearHeaders();
// set the content type to PDF
Response.ContentType = "application/pdf";
Response.AddHeader("Content-Disposition", "attachment;filename=Filename.pdf");
Response.Buffer = true;
Response.BinaryWrite(Buffer1);
Response.Flush();
// Response.End();
}
}
I believe this has already been mentioned in one flavor or another but I'll try and state it in my own words.
Rather than this:
/bar/sessions/958d8a22-0/views/1493881172/export?format=application/pdf&no-attachment=true
I use this:
/bar/sessions/958d8a22-0/views/1493881172/NameThatIWantPDFToBe.pdf?GeneratePDF=1
Rather than having "export" process the request, when a request comes in, I look in the URL for GeneratePDF=1. If found, I run whatever code was running in "export" rather than allowing my system to attempt to search and serve a PDF in the location /bar/sessions/958d8a22-0/views/1493881172/NameThatIWantPDFToBe.pdf. If GeneratePDF is not found in the URL, I simply transmit the file requested. (note that I can't simply redirect to the file requested - or else I'd end up in an endless loop)
You could always have two links. One that opens the document inside the browser, and another to download it (using an incorrect content type). This is what Gmail does.
For anyone still looking at this, I used the solution found here and it worked wonderfully. Thanks Fabrizio!
The way I solved this (with PHP) is as follows:
Suppose your URL is SomeScript.php?id=ID&data=DATA and the file you want to use is TEST.pdf.
Change the URL to SomeScript.php/id/ID/data/DATA/EXT/TEST.pdf.
It's important that the last parameter is the file name you want Adobe to use (the 'EXT' can be about anything). Make sure there are no special chars in the above string, BTW.
Now, at the top of SomeScript.php, add:
$_REQUEST = MakeFriendlyURI( $_SERVER['PHP\_SELF'], $_SERVER['SCRIPT_FILENAME']);
Then add this function to SomeScript.php (or your function library):
function MakeFriendlyURI($URI, $ScriptName) {
/* Need to remove everything up to the script name */
$MyName = '/^.*'.preg_quote(basename($ScriptName)."/", '/').'/';
$Str = preg_replace($MyName,'',$URI);
$RequestArray = array();
/* Breaks down like this
0 1 2 3 4 5
PARAM1/VAL1/PARAM2/VAL2/PARAM3/VAL3
*/
$tmp = explode('/',$Str);
/* Ok so build an associative array with Key->value
This way it can be returned back to $_REQUEST or $_GET
*/
for ($i=0;$i < count($tmp); $i = $i+2){
$RequestArray[$tmp[$i]] = $tmp[$i+1];
}
return $RequestArray;
}//EO MakeFriendlyURI
Now $_REQUEST (or $_GET if you prefer) is accessed like normal $_REQUEST['id'], $_REQUEST['data'], etc.
And Adobe will use your desired file name as the default save as or email info when you send it inline.
I was redirected here because i have the same problem. I also tried Troy Howard's workaround but it is doesn't seem to work.
The approach I did on this one is to NO LONGER use response object to write the file on the fly. Since the PDF is already existing on the server, what i did was to redirect my page pointing to that PDF file. Works great.
http://forums.asp.net/t/143631.aspx
I hope my vague explanation gave you an idea.
Credits to Vivek.
Nginx
location /file.pdf
{
# more_set_headers "Content-Type: application/pdf; name=save_as_file.pdf";
add_header Content-Disposition "inline; filename=save_as_file.pdf";
alias /var/www/file.pdf;
}
Check with
curl -I https://example.com/file.pdf
Firefox 62.0b5 (64-bit): OK.
Chrome 67.0.3396.99 (64-Bit): OK.
IE 11: No comment.
Try this, if your executable is "get.cgi"
http://server,org/get.cgi/filename.pdf?file=filename.pdf
Yes, it's completely insane. There is no file called "filename.pdf" on the server, there is directory at all under the executable get.cgi.
But it seems to work. The server ignores the filename.pdf and the pdf reader ignores the "get.cgi"
Dan

Resources