When you capture through org-protocol and a browser, either through the capture(); function or encodeURIComponent(window.getSelection());, the text appears to be passed to the Emacs org-protocol server as plain text.
Is there a way to pull in some of the HTML heading/CSS style info to keep a minimal amount of formatting for readability? Most sites aren't anything close to plain text, so even selecting across a heading and a couple paragraphs comes out like garbage.
edit: I found pandoc, which will do HTML to org-mode conversions, but the results are overkill. Is there any way to get just the formatting from the selected objects, rather than a blind parse of the whole HTML chunk?
Related
I've written a program that uses the Google Translate Python API to translate webpages. Most of the time, the API does translation as I expect, but in some cases text within a tag does not get translated.
I tried putting one such tag in the Google Translate web interface and found that the text is still not translated; i.e., the problem has to do with the Google Translate service rather than the way I am using the API.
The specific tag I am looking at is: <div class="someClass">World:</div>
I want the word "World" to be translated in the output, regardless of the language into which I am translating. In certain languages, such as French and Khmer, the word "World" is translated as expected, but in other languages, such as Spanish and Somali, it remains "World." I have noticed that removing the class attribute sometimes helps (translation then works in Spanish but not in Somali), and adding more text seems to help as well (I've never seen this issue when the text is a full sentence or paragraph, for example).
In the context of my project, it is particularly important that the case of a tag with just one word inside be handled correctly. Does anyone know why this is happening or how I can make translation happen consistently? A solution requiring minimal to no changes to the original HTML would be ideal.
Edit: A little more context from playing around with things. Directly calling google.cloud.translate.Client().translate('<div class="someClass">World:</div>', 'es') actually has the correct behavior: "World" becomes "Mundo." I incrementally lengthened the page text by adding the tags that came before and after that div in the original webpage, none of which wrapped more than one word of text, and the text between tags stopped being translated once the page text reached around 1,000 characters. However, when I changed "World:" to a whole sentence, all of the text between tags was translated even when the page text was longer than 1,000 characters.
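Since the short snippet translates correctly on its own, one thing to try is sending each text node to the translator separately and leaving the tags untouched. This is a sketch under assumptions, not a confirmed fix: `translate_text` is a hypothetical callable (in practice it might wrap the client call shown above), and comments, doctypes, and script/style content are not handled.

```python
from html import escape
from html.parser import HTMLParser


class TextNodeTranslator(HTMLParser):
    """Rebuilds the HTML, passing each non-blank text node through a callable."""

    def __init__(self, translate_text):
        super().__init__(convert_charrefs=True)
        self.translate_text = translate_text  # hypothetical: str -> str
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Re-emit the start tag exactly as written, attributes included.
        self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are also copied verbatim.
        self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if data.strip():
            data = self.translate_text(data)
        # Character references were decoded by the parser, so re-escape.
        self.parts.append(escape(data, quote=False))


def translate_html(html, translate_text):
    parser = TextNodeTranslator(translate_text)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# Stand-in translator for demonstration only; it does not call any API.
def fake_translate(text):
    return text.replace("World", "Mundo")


result = translate_html('<div class="someClass">World:</div>', fake_translate)
```

The trade-off is that text split across inline tags is translated without its surrounding context, and one request per text node may be slow on large pages.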
It is said that http://telegra.ph is an editor that can format text via HTML tags or Markdown. In the online editor, neither the first nor the second method works. What am I doing wrong? How do I format text?
See the API documentation: it supports only a limited set of tags.
If you select text in the Telegraph editor, you will see a floating menu that contains formatting options.
I think you are mistaking Telegraph for the Bot API, which provides Markdown and HTML formatting options.
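For the API route, formatting is expressed as a tree of nodes rather than as Markdown or raw HTML typed into the editor. The following is a minimal sketch, assuming the Node format described in the Telegraph API documentation (an object with `tag`, optional `attrs`, and optional `children`); the `node` helper is hypothetical, and no request is actually sent.

```python
import json


def node(tag, *children, **attrs):
    """Build one content node; children are strings or nested nodes."""
    result = {"tag": tag}
    if attrs:
        result["attrs"] = attrs
    if children:
        result["children"] = list(children)
    return result


content = [
    node(
        "p",
        "Plain text with ",
        node("b", "bold"),
        " and ",
        node("a", "a link", href="https://telegra.ph"),
    )
]

# JSON string that would be passed as the page content parameter.
content_json = json.dumps(content)
```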
I want to get data from a textbox as plain HTML, i.e. if I write Hello World then it should return Hello World. I don't want to use HtmlEditor. Can I get plain HTML using a TextArea?
http://www.dotnetperls.com/encode-html-string
If you really need the &nbsp; you can always string-replace spaces
You're probably going to need to transform the text manually (with string.Replace() or something similar) to accomplish this. Consider, for example, the "enter" and "space" tags you're looking for. If the user enters this as plain text (such as in a TextArea):
Hello World
Another line
Then that's precisely the value that's in the TextArea. The user didn't enter this:
Hello World<br />Another line
That's an entirely different string value. A WYSIWYG HTML editing control (and there are many for ASP.NET) would do some of the text transforms for you. At least it would probably convert the carriage returns into break tags for you.
But I doubt it would convert every space into a non-breaking space, since that's a very different value than what was entered. You'll likely have to do that yourself. (And be aware that converting all spaces into non-breaking spaces might not render the output like you expect. Look forward to a lot of horizontal scrolling.)
HTML isn't a translation of text into another medium, it's markup that's included in the text. Both of my examples above are perfectly valid HTML. One of them just doesn't include any markup tags.
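The manual transform described above can be sketched as follows. This is a language-agnostic illustration written in Python rather than ASP.NET code; the function name `text_to_html` and the `preserve_spaces` flag are made up for the example, and in .NET the same steps would be done with string.Replace() and an HTML-encoding call.

```python
from html import escape


def text_to_html(text, preserve_spaces=False):
    """Turn plain textarea input into HTML with explicit line breaks."""
    # Encode &, < and > first so user input cannot inject markup.
    html = escape(text, quote=False)
    # Normalise Windows and old Mac line endings to "\n".
    html = html.replace("\r\n", "\n").replace("\r", "\n")
    if preserve_spaces:
        # Done before inserting "<br />", which itself contains a space.
        html = html.replace(" ", "&nbsp;")
    return html.replace("\n", "<br />")


plain = "Hello World\nAnother line"
with_breaks = text_to_html(plain)
with_nbsp = text_to_html(plain, preserve_spaces=True)
```

Here `with_breaks` is the second string from the answer, and `with_nbsp` shows what the "every space becomes a non-breaking space" variant produces, including its inability to wrap.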
I need users to be able to enter text in a webform with some basic formatting options and then generate a report showing the formatted text.
Crystal Reports' support for HTML is horrible; even a simple bullet list doesn't show properly in the report.
Right now I'm using a textarea with tinyMCE, but only because I don't know what else to use.
Is there a known best-practice for showing formatted text in a Crystal Report?
Edit
I just need to show a report with a bunch of text and icons. Users need to be able to save it to PDF. It doesn't even have to be Crystal Reports, but that's what I have been using and it worked so far, until I needed to show formatted text.
I would welcome another solution that comes with a designer and lets me bind against a DataSet.
The solution is to convert the HTML to RTF. RTF support in CR is much better than it is for HTML. This way users can still use the tinyMCE editor and even paste Word formatted HTML.
I convert the HTML to RTF using an XSL stylesheet. Basically, you load the HTML as an XML document and let the XSL translate it to RTF. This also gives you a lot of freedom over how your text will appear, since you can tweak the XSL.
I used this article to achieve that; the article's attachment includes the .XSL.
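To illustrate the kind of tag-to-control-word mapping such a conversion performs, here is a minimal sketch. It is not the XSL stylesheet from the article: it is a Python stand-in that handles only a few tags (b, strong, i, em, p, br, li), and whether Crystal Reports renders its output well is an assumption that would need checking.

```python
from html.parser import HTMLParser

# RTF control words emitted when a tag opens or closes (small subset only).
RTF_OPEN = {"b": r"\b ", "strong": r"\b ", "i": r"\i ", "em": r"\i ",
            "br": r"\line ", "li": r"\bullet\tab "}
RTF_CLOSE = {"b": r"\b0 ", "strong": r"\b0 ", "i": r"\i0 ", "em": r"\i0 ",
             "p": r"\par ", "li": r"\par "}


def rtf_escape(text):
    """Escape RTF special characters and encode non-ASCII as \\uN? units."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)
        elif ch.isascii():
            out.append(ch)
        else:
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                # RTF expects each UTF-16 unit as a signed 16-bit number.
                unit = int.from_bytes(data[i:i + 2], "little", signed=True)
                out.append(f"\\u{unit}?")
    return "".join(out)


class HtmlToRtf(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        self.out.append(RTF_OPEN.get(tag, ""))  # unknown tags are dropped

    def handle_endtag(self, tag):
        self.out.append(RTF_CLOSE.get(tag, ""))

    def handle_data(self, data):
        self.out.append(rtf_escape(data))


def html_to_rtf(html):
    parser = HtmlToRtf()
    parser.feed(html)
    parser.close()
    return r"{\rtf1\ansi " + "".join(parser.out) + "}"


rtf = html_to_rtf("<p>Hi <b>there</b></p><ul><li>one</li></ul>")
```

An XSL-based version has the advantage the answer mentions: the mapping lives in a stylesheet that can be tweaked without touching code.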
I am having a minor issue with my exported Crystal Report. I can generate the reports just fine on our website; however, when I export them to Word documents, I don't get a document I can do much with.
i.e.:
I can't position the generated text anywhere in the document; it is almost 'frozen' in place. I would expect that if I put the cursor above the report text and pressed Enter a bunch of times, I could move the report down the page, but it just won't budge.
All the text seems to be in its own box, and I can't move it around or do anything with it.
Any thoughts? My expectation would be once it is exported to Word I could play with it like a Word document, move the text down the page, edit the document, do something with it.
Thanks!
btw, this question is similar to the one posted here, but this one wasn't tagged properly and I don't have enough karma to fix it:
https://stackoverflow.com/questions/434381/word-formatting-not-intact-when-exported-from-crystal-reoport
I'm afraid that you can't do much about it. Crystal Reports is very much orientated towards putting data in fixed positions on pages, so when it exports to Word it puts its data into text boxes, because that is the closest equivalent Word offers. You could make the Crystal Report page consist of one giant text field and use spaces and newlines to get the data into the right place, which will probably then give you one giant text box in Word.