Retrieve from html body - asp.net

From this HTML body of an email, how can I retrieve only the body text (Hi...ThankYou) into a text box?
<html><body><div style="color:#000; background-color:#fff; font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, Sans-Serif;font-size:14px"><div>Hi...ThankYou</div></div></body></html>
Thank You

I suggest you have a look at HTML parsing libraries like HtmlAgilityPack or CsQuery.
Here is how it's done in CsQuery (the selector syntax is compatible with jQuery):
Dim html = "<html><body><div style=""color:#000; background-color:#fff; font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, Sans-Serif;font-size:14px""><div>Hi...ThankYou</div></div></body></html>"
Dim cs = CsQuery.CQ.Create(html)
Dim txt = cs("body>div>div").Text()
textBox.Text = txt
You can obtain CsQuery through NuGet with the command PM> Install-Package CsQuery -Version 1.3.4

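You can use HtmlAgilityPack. Load the markup into an HtmlDocument, then select the inner div with XPath:
var doc = new HtmlAgilityPack.HtmlDocument(); doc.LoadHtml(html); // html is the mail body string
var node = doc.DocumentNode.SelectSingleNode("/html/body/div/div"); // node.InnerText holds the text for the text box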

How do you save a pdf with Avenir font using CSS?

I am trying to save a pdf of html text in Google Apps Script. The full script sends a stylized email with the html text but I would like to back up the body of the email as a pdf. Using the below code I am able to save a pdf, but it does not use the Avenir font as I'd expect.
function printToPDF() {
var htmlMessage = '<html style="font-family: `Avenir`;">Avenir test</html>';
var folder = DriveApp.getFolderById(folderId).createFolder('test')
var blob = Utilities.newBlob(htmlMessage, MimeType.HTML, "text.html");
var pdf = blob.getAs(MimeType.PDF);
var document = DriveApp.getFolderById(folderId).createFile(pdf);
}
General CSS styling works (I followed the post "Google Apps Script - Convert HTML with styling to PDF and attach to an email - Create a PDF blob from HTML").
I replicated your code, tested a few font families, and found that not all fonts are accepted in the PDF conversion.
Example:
Here I tried using Comic Sans MS.
function printToPDF() {
var folderId = "id";
var html = HtmlService.createTemplateFromFile("ABCD");
var output = html.evaluate();
var blob = Utilities.newBlob(output.getContent(), MimeType.HTML, "text.html");
var pdf = blob.getAs(MimeType.PDF);
DriveApp.getFolderById(folderId).createFile(pdf);
}
ABCD.html
<!DOCTYPE html>
<html>
<head>
<base target="_top">
<style>
body {
font-family: 'Comic Sans MS';
font-size: 48px;
}
</style>
</head>
<body>
Comic Sans MS
</body>
</html>
Since there is no proper documentation on which fonts are accepted in the PDF conversion, what you can do for now is look for fonts similar to Avenir and test them by trial and error.
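One way to make the trial and error less tedious is to put every candidate font on a single test page and convert that page once. The sketch below only builds the HTML string. The helper name buildFontTestHtml and the font names in candidates are placeholders chosen for illustration, not fonts known to work in the conversion.

```javascript
// Build one HTML page with a sample line per candidate font, so a single
// PDF conversion shows which families are actually rendered.
function buildFontTestHtml(fonts, sample) {
  const rows = fonts.map(function (font) {
    // Quote each family name so names containing spaces stay valid CSS.
    return '<p style="font-family: \'' + font + '\'; font-size: 24px;">' +
      font + ': ' + sample + '</p>';
  });
  return '<!DOCTYPE html><html><body>' + rows.join('') + '</body></html>';
}

// Placeholder candidates (assumptions, not a list of supported fonts).
const candidates = ['Avenir', 'Comic Sans MS', 'Arial', 'Georgia'];
const testHtml = buildFontTestHtml(candidates, 'The quick brown fox');
```

In Apps Script, testHtml could then be passed to Utilities.newBlob(testHtml, MimeType.HTML, "fonts.html") and converted with getAs(MimeType.PDF), as in the code above. Lines that come out in a default face point to families the converter does not recognise.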

How does PHP / CSS deal with missing glyphs?

So I can specify my fonts in my website style CSS, and then set the font-family:
@font-face {
font-family: "custom-helvetica";
src: url("/assets/HelveticaNeue.ttf");
src: url("/assets/HelveticaNeueBold.ttf");
src: url("/assets/HelveticaBlkIt.ttf");
}
@font-face {
font-family: "custom-tahoma";
src: url("/assets/Tahoma.ttf");
src: url("/assets/Tahomabd.ttf");
}
html {
font-family: Tahoma, Helvetica, Arial, sans-serif;
}
Here's an example piece of text:
Testing glyphs in PHP: ± µ ⁓ â ฿ ₿
So let's suppose that the font Helvetica contains all the glyphs in the example apart from ±, µ, ⁓ and that the font Tahoma contains all the glyphs in the example apart from â, ฿, ₿. Let's suppose that the font Arial contains every glyph in the example.
How does PHP/CSS work with this?
Will it apply Tahoma to the example and get this result? -
Testing glyphs in PHP: ± µ ⁓ ࠀ ࠀ ࠀ
Or will it decide that Arial is the only font that can render the entire string correctly, and apply that font to the whole string? Or will the font change dynamically throughout the string to adapt to any missing glyphs?
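As far as the CSS font-matching rules go, this is decided by the browser rather than by PHP (PHP only emits the markup), and the choice is made character by character: each character takes the first family in the font-family list that has a glyph for it. The sketch below simulates that rule using the glyph coverage assumed in the question. The coverage table and the function names are illustrative, not a real browser API.

```javascript
// Glyphs each font is assumed to LACK, taken from the question's hypotheticals.
const missing = {
  Tahoma: new Set(['\u00e2', '\u0e3f', '\u20bf']),    // â ฿ ₿
  Helvetica: new Set(['\u00b1', '\u00b5', '\u2053']), // ± µ ⁓
  Arial: new Set(),                                   // has everything
};

// Return the first family in the stack that can draw this character.
function fontForChar(ch, stack) {
  for (const font of stack) {
    if (!missing[font].has(ch)) return font;
  }
  return 'system fallback';
}

// Split a string into consecutive runs that share the same chosen font.
function fontRuns(text, stack) {
  const runs = [];
  for (const ch of text) {
    const font = fontForChar(ch, stack);
    const last = runs[runs.length - 1];
    if (last && last.font === font) last.text += ch;
    else runs.push({ font: font, text: ch });
  }
  return runs;
}

const stack = ['Tahoma', 'Helvetica', 'Arial'];
const runs = fontRuns('\u00b1 \u00b5 \u2053 \u00e2 \u0e3f \u20bf', stack);
console.log(runs);
```

Under these assumptions the first three symbols and the spaces are drawn in Tahoma, while â, ฿ and ₿ each drop through to Helvetica. Arial is never reached, and no single font is chosen for the whole string.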

Different font weight for different fonts

I use Bold, Medium and Normal font weights on my website, that's 700, 500 and 400 respectively.
I use the Helvetica Neue font and, as a fallback for systems that don't have it installed, I want to use Open Sans. The problem is that Open Sans doesn't have a Medium style.
I want my elements that I used to define as font-weight: 500 have font-weight: 600 if the browser uses Open Sans. Is it possible somehow?
There's a similar question on Stack Overflow, "How to set different font-weight for fallback font?", but I can't get the result I need using the technique described in the accepted answer.
I need something like
@font-face {
font-family: 'semibold';
src: 'Helvetica Neue':500, 'Open Sans':600;
}
Not sure how to do it though.
You can't really define weight in a font-face declaration. Instead, font-weight is used there as a gatekeeper to match the font and not to pass styles to the element.
It seems like overkill, but you could use this JavaScript function by Sam Clarke as a starting point to see if the font is available, and then conditionally modify the font-weight following the logic that works best for your specific requirements.
For a simplified example with just these two fonts, you might set up the CSS like this:
@font-face {
font-family: h-semibold;
src: local('Helvetica Neue');
}
@font-face {
font-family: os-semibold;
src: local('Open Sans');
}
.semibold {
font-family: h-semibold, os-semibold;
}
.w5 {
font-weight: 500;
}
.w6 {
font-weight: 600;
}
Then, using the function linked above, you put something like this in your JS to conditionally load the weight classes depending on font support:
var semibold = document.querySelectorAll('.semibold');
if (isFontAvailable('h-semibold')) {
semibold.forEach(result => {
result.className += ' ' + 'w5';
});
} else {
semibold.forEach(result => {
result.className += ' ' + 'w6';
});
}
You'll doubtless work out a more elegant solution if you really need to carry it through.
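As one possible tidier shape, the decision can be pulled out into a small function that takes an ordered list of font/weight pairs and a detector. This is only a sketch: pickWeightClass and the preferences table are names made up here, and the detector passed in stands in for the linked isFontAvailable function.

```javascript
// Ordered preferences: the first available family decides the weight class.
const preferences = [
  { family: 'h-semibold', weightClass: 'w5' },  // Helvetica Neue: use 500
  { family: 'os-semibold', weightClass: 'w6' }, // Open Sans: use 600 instead
];

// Return the weight class of the first family the detector reports as
// available, or fallbackClass when none of them is.
function pickWeightClass(prefs, isAvailable, fallbackClass) {
  const hit = prefs.find(function (p) { return isAvailable(p.family); });
  return hit ? hit.weightClass : fallbackClass;
}

// Stand-in detector for illustration: pretend only Open Sans is installed.
const installed = new Set(['os-semibold']);
const chosen = pickWeightClass(
  preferences,
  function (family) { return installed.has(family); },
  'w6'
);
```

In the browser you would pass the real detector instead of the stand-in and then call classList.add(chosen) on each .semibold element. Keeping the decision free of DOM access also makes it easy to extend to more than two fonts.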

CSS Font-Family Support Dropped for <SELECT> in Firefox?

The following CSS used to work in all browsers that I have tested. It even has an option selector to handle Firefox.
select,
option {
font-family: "Lucida Console", Monaco, monospace;
}
<select>
<option>PN-2345 The first element Hardware</option>
<option>Pn-1332-CFG Second thing Powdercoat</option>
</select>
The newest versions of Firefox no longer properly apply font family styles.
Former versions of Firefox, and every other major browser I've tested, fully apply the font family settings both to the select and to the items in the dropdown - now, it only gets applied to the select box itself, but NOT the dropdown.
Does Firefox still support font-family changes to dropdowns? If so, how?
I did some experiments, and apparently the font-family will render correctly in <option> elements as long as the font is installed locally. Which is obviously useless.
If anyone has any info disproving me, please tell us.
You can set the font for both the select and option elements in Firefox using:
select, option {
font: -moz-pull-down-menu;
}
Does this work? You can use this code if you want to :)
var ff = document.getElementById('sel');
function font() {
ff.style.fontFamily = "'" + ff.value + "', sans-serif";
}
select {
font-family: 'Overpass', sans-serif;
}
option#diff {
font-family: 'Ubuntu', sans-serif;
}
option#muli {
font-family: 'Muli', sans-serif;
}
option#over {
font-family: 'Overpass', sans-serif;
}
<link href="https://fonts.googleapis.com/css2?family=Muli:wght@300&family=Overpass:wght@300&family=Ubuntu:wght@300&display=swap" rel="stylesheet">
<select id='sel' onchange='font()'>
<option id='muli' value='Muli'>Muli yay</option>
<option selected id='over' value='Overpass'>Overpass hooray</option>
<option id='diff' value='Ubuntu'>Ubuntu is awesome</option>
</select>
My understanding is that Firefox delegates the rendering of options to the OS to some extent, so only fonts that are installed on the system can be applied. You can mitigate this for most cases by setting a fallback font or at least a generic family, as the code in the question does with monospace at the end of the rule. That's how I interpret this comment from Bugzilla.
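If font-family values are built from script, a small helper can make sure every value ends in a generic family. This is a sketch only: fontFamilyValue is a name made up here, and the GENERIC set covers the common generic keywords rather than every keyword CSS defines.

```javascript
// Common CSS generic family keywords; these must stay unquoted.
const GENERIC = new Set([
  'serif', 'sans-serif', 'monospace', 'cursive', 'fantasy', 'system-ui',
]);

// Build a font-family value that always ends with a generic family.
function fontFamilyValue(families, generic) {
  const parts = families
    .filter(function (name) { return !GENERIC.has(name); })
    .map(function (name) {
      // Quote names that contain whitespace, e.g. "Lucida Console".
      return /\s/.test(name) ? '"' + name + '"' : name;
    });
  parts.push(generic); // the generic family goes last
  return parts.join(', ');
}

const value = fontFamilyValue(['Lucida Console', 'Monaco'], 'monospace');
console.log(value);
```

The example call reproduces the rule from the question. In the onchange handler shown earlier, the assignment could become ff.style.fontFamily = fontFamilyValue([ff.value], 'sans-serif').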
-moz-font-family:"Lucida Console", Monaco, monospace;

Flying Saucer font for unicode characters

I am generating a PDF using the Grails export plugin (basically, Flying Saucer). My GSP page is a UTF-8 page (or at least its properties show that it is UTF-8, and the GSP page begins with a <?xml version="1.0" encoding="UTF-8"?> directive). At first the generated PDF properly contained the umlaut characters "äöüõ", but Cyrillic characters were missing from the PDF (not rendered at all). I then changed my CSS file as described in the documentation by adding the following:
@font-face {
src: url(ARIALUNI.TTF);
-fs-pdf-font-embed: embed;
-fs-pdf-font-encoding: UTF-8;
}
body {
font-family: "Arial Unicode MS", Arial, sans-serif;
}
ArialUni.ttf is also deployed to the server. But now both umlaut and Cyrillic characters are rendered as boxes. If I change the -fs-pdf-font-encoding property value to Identity-H, umlaut characters are rendered properly, but Cyrillic characters are rendered as question marks.
Any ideas about which font can be used to properly render both umlaut and Cyrillic characters? Or maybe my CSS is somehow wrong? Any hints would be much appreciated.
Upd 1:
I have also tried the following CSS (which was generated by http://fontface.codeandmore.com/):
@font-face {
font-family: 'ArialUnicodeMS';
src: url('arialuni.ttf');
src: url('arialuni.eot?#iefix') format('embedded-opentype'),
url('arialuni.woff') format('woff'),
url('arialuni.ttf') format('truetype'),
url('arialuni.svg#arialuni') format('svg');
font-weight: normal;
font-style: normal;
-fs-pdf-font-embed: embed;
-fs-pdf-font-encoding: UTF-8;
}
body {
font-family:'ArialUnicodeMS';
}
I've added <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
I also tried running Grails with -Dfile.encoding=UTF-8, as mentioned here: http://grails.1312388.n4.nabble.com/PDF-plugin-Having-problems-with-instalation-td2297840.html, but nothing helps. Cyrillic characters are still not shown at all. Any other ideas what the problem might be?
BTW: I am packaging my PDF as a zip and sending it back to the browser in the response like this:
response.setHeader "Content-disposition", "attachment; filename=test.zip"
response.setHeader "Content-Encoding", "UTF-8"
response.contentType = 'application/zip'
response.outputStream << zip
response.outputStream.flush()
response.outputStream.close()
Do I need to consider encoding somehow while zipping? I do it like this:
public static byte[] zipBytes(Map<String, ByteArrayOutputStream> fileNameToByteContentMap) throws IOException {
ByteArrayOutputStream zipBaos = new ByteArrayOutputStream();
ZipOutputStream zos = new ZipOutputStream(zipBaos);
fileNameToByteContentMap.eachWithIndex {String fileName, ByteArrayOutputStream baos, i ->
byte[] content = baos.buf
ZipEntry entry = new ZipEntry(fileName)
entry.setSize(content.length)
zos.putNextEntry(entry)
zos.write(content)
zos.closeEntry()
}
zos.close()
return zipBaos.toByteArray();
}
I managed to "enable" Unicode characters (Cyrillic or Czech) from Java code by registering a TrueType font from my resources (CALIBRI.TTF) with the renderer.
import org.w3c.dom.Document;
import org.xhtmlrenderer.pdf.ITextRenderer;
import com.lowagie.text.pdf.BaseFont;
...
ITextRenderer renderer = new ITextRenderer();
URL fontResourceURL = getClass().getResource("fonts/CALIBRI.TTF");
//System.out.println("font-path:"+fontResourceURL.getPath());
/* HERE comes my solution: */
renderer.getFontResolver().addFont(fontResourceURL.getPath(),
BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
renderer.setDocument(doc, null);
renderer.layout();
baos = new ByteArrayOutputStream();
renderer.createPDF(baos);
baos.flush();
result = baos.toByteArray();
...
Finally I added the font-family 'Calibri' to the css section of my document:
...
<style type="text/css">
span { font-size: 11pt; font-family: Calibri; }
...
For some reason it started working with the following CSS and a .ttf file generated by face-kit-generator:
@font-face {
src: url('arialuni.ttf');
-fs-pdf-font-embed: embed;
-fs-pdf-font-encoding: Identity-H;
}
body {
font-family: Arial Unicode MS, Lucida Sans Unicode, Arial, verdana, arial, helvetica, sans-serif;
font-size: 8.8pt;
}
The weird thing is that if I put the font into a folder, say "fonts", it will find the font but the characters won't be rendered.
