Google Translation API - google-translate

I have text that I would like to translate into Russian. The text has custom tags and multiple <BR> tags. The API behaves oddly with <BR> tags. Are there known issues with <BR> tags? Is there a way around this, or what is the best way to use Google jQuery translation to translate the text?
The text is
<INPUTANSWER PARTID='1'>
<SPAN STYLE="FONT: 7pt 'Times New Roman'"> </SPAN>
Place a <STRONG>90 degree</STRONG> explicit angle constraint to the inside
faces of <STRONG>DP-1007:1 </STRONG>and<STRONG>DP-1006:1</STRONG> as shown.</P>
<P STYLE="MARGIN-LEFT: 0.5in; TEXT-INDENT: -0.25in">
2.
<SPAN STYLE="FONT: 7pt 'Times New Roman'"> </SPAN>
Drive this angle constraint between <STRONG>90 and 100 degrees</STRONG>
with an <STRONG>increment</STRONG> <STRONG>of 0.125 degrees.</STRONG>
</INPUTANSWER>

Check this. It's the jQuery Translate project. I've used it before with normal text but never tried it with markup. Quoting their home page:
It also reduces the number of requests by concatenating elements and doesn't send unnecessary html markup still providing access to each element as they've got translated.
If this doesn't work you can always hold on to the original document fragment and just walk it, translate content and replace. I am sure this will work as the API behaved perfectly with plain text.
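A minimal sketch of that fallback, assuming the markup arrives as a string: split it into tag tokens and text tokens, translate only the text, and put everything back in order. The names translateMarkup and fakeTranslate are made up for illustration, and the stub translator just upper-cases the text so the flow is visible; a real translation call would take its place.

```javascript
// Split markup into tag tokens and text tokens, translate only the text,
// and reassemble. `translateText` is a hypothetical stand-in for a real
// translation call; it is synchronous here to keep the sketch short.
function translateMarkup(markup, translateText) {
  // The capturing group makes split() keep the tags in the result.
  const tokens = markup.split(/(<[^>]*>)/);
  return tokens
    .map((token) => {
      const isTag = token.startsWith("<") && token.endsWith(">");
      const isBlank = token.trim() === "";
      // Tags (including <BR> and custom tags) and whitespace pass through untouched.
      return isTag || isBlank ? token : translateText(token);
    })
    .join("");
}

// Stub translator used only to demonstrate the flow (assumption, not a real API).
const fakeTranslate = (text) => text.toUpperCase();

const sample = "Place a <STRONG>90 degree</STRONG> angle<BR>as shown.";
console.log(translateMarkup(sample, fakeTranslate));
// PLACE A <STRONG>90 DEGREE</STRONG> ANGLE<BR>AS SHOWN.
```

Two caveats: the simple tag pattern breaks if an attribute value contains a ">" character, and translating each fragment separately gives the translator less sentence context than translating whole paragraphs.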

Find a translator. Who would read machine translations? It's bad form. Besides, machine translation from English to Russian comes out completely unclear because the two languages have completely different structures. The English original is easier to read than auto-translated text.

Related

Select All Japanese Text, But Without Romanization

Important Edit—
It appears that there's nothing wrong with my language code here; something I've done elsewhere in my CSS has stopped my code from working. This means all my guesses as to why it's not working were completely wrong. (lol)
When I find out what went wrong, I'll update this question; but if I can't find the error, I may just redo it from scratch. The project isn't so monolithic that that isn't still an option.
For now, the question is effectively solved.
Original Question—
I've been writing a mixed-language HTML document —primarily in English— and have been using different formatting for Japanese language text. For example, since Japanese doesn't use italicized/oblique or bold text, emphasis must be done with text-emphasis instead. However, romanized Japanese —Romaji— inherits these text effects when I'd rather it not.
This initial interaction was expected, so I tried to use :not(:lang(ja-latn)) to prevent this. While admittedly a bit messy, it ought to work… but it does not. I think the issue is that ja-latn, Romaji, is a kind of Japanese as far as HTML & CSS are concerned, so the selector doesn't understand what I'm trying to do. Not labeling the Romaji or changing it to English would be textually inaccurate, confuse screen readers, and generally be a bit of a hack.
This is how I had done this (in a condensed form) provided as an example of what I mean. If I made some mistake in formatting not described in this post, that's only because I keep getting "Secure Connection Failed" errors whenever I try to test the snippet, and missed it.
i,em{font-style:italic;}
b,strong{font-weight:bold;}
:is(i,em):lang(ja):not(:lang(ja-latn)){
font-style:normal;
font-weight:normal;
text-emphasis: open currentcolor;
text-emphasis-position: over right;}
ruby{ruby-position:under;}
.goodhappy{color:green}
.wrongangry{color:red; text-emphasis-color:red;}
<div lang="en" >
English text, because I <em class="goodhappy" >don't</em> speak Japanese.<br />
<ruby lang="ja"><!--
-->日<rt lang="ja-hira" >に</rt><!--
-->本<rt lang="ja-hira" >ほん</rt><!--
-->語<rt lang="ja-hira" >ご</rt><!--
--><em class="goodhappy" >わ</em><rt lang="ja-hira" >わ</rt><!--
-->話<rt lang="ja-hira" >はな</rt><!--
-->せません<!--
--></ruby>
<br />
<span lang="ja-latn" >Nihongo <em class="wrongangry" >wa</em> hanasemasen</span>
</div>
How would I go about selecting only Japanese text written in CJK characters, but not Romaji text? To be clear, I realize this could easily be done by using a span with a class and not using em/i/b/strong etc. What I mean is, is there a way to accomplish this only in CSS, without more HTML markup than is strictly necessary?
In your question you stated that you tried :not(:lang(ja-latn)) with no success, but in your code you have :not(ja-latn), which is invalid. I changed your code to use :not(:lang(ja-latn)) and, as you can see, it works properly, leaving the romaji without the emphasis marks on top of it:
i,em{font-style:italic;}
b,strong{font-weight:bold;}
:is(i,em):lang(ja):not(:lang(ja-latn)){
font-style:normal;
font-weight:normal;
text-emphasis: open currentcolor;
text-emphasis-position: over right;}
ruby{ruby-position:under;}
.goodhappy{color:green}
.wrongangry{color:red; text-emphasis-color:red;}
<div lang="en" >
English text, because I <em class="goodhappy" >don't</em> speak Japanese.
<ruby lang="ja"><!--
-->日<rt lang="ja-hira" >に</rt><!--
-->本<rt lang="ja-hira" >ほん</rt><!--
-->語<rt lang="ja-hira" >ご</rt><!--
--><em class="goodhappy" >わ</em><rt lang="ja-hira" >わ</rt><!--
-->話<rt lang="ja-hira" >はな</rt><!--
-->せません<!--
--></ruby>
<br />
<span lang="ja-latn" >Nihongo <em class="wrongangry" >wa</em> hanasemasen</span>
</div>
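The :not() is needed in the first place because :lang(ja) matches any element whose language tag is ja or begins with ja followed by a hyphen, so ja-latn counts as Japanese. A rough sketch of that prefix rule in JavaScript follows; langMatches and getsTextEmphasis are hypothetical helper names, and the sketch covers only the simple prefix case, not every detail of how browsers match language ranges.

```javascript
// Simplified model of how :lang(range) matches an element's language tag:
// the tag equals the range, or starts with the range followed by "-".
function langMatches(elementLang, range) {
  const tag = elementLang.toLowerCase();
  const prefix = range.toLowerCase();
  return tag === prefix || tag.startsWith(prefix + "-");
}

// Mirrors the selector :lang(ja):not(:lang(ja-latn)).
function getsTextEmphasis(elementLang) {
  return langMatches(elementLang, "ja") && !langMatches(elementLang, "ja-latn");
}

console.log(getsTextEmphasis("ja"));      // true
console.log(getsTextEmphasis("ja-hira")); // true
console.log(getsTextEmphasis("ja-latn")); // false
console.log(getsTextEmphasis("en"));      // false
```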

How to avoid broken thematic sections (eg. div) in HTML?

I am trying to transfer a text from a printed book into HTML5, while keeping its thematic and page/paragraph/line layout structure exactly as it is. For example, every page of the printed book is wrapped in a <div> section, e.g. <div class=page id=55>, so that it represents exactly the page unit of the printed book and also facilitates referencing. I don't care much how the text will be rendered in the browser; this is something that I can think about later. I just want the HTML and the browser to "know" the original pagination and layout of the printed book.
The problem is that in the printed book, some paragraphs or even boxes, tables etc span over to the next page. If I translate it to HTML, I do it like this:
<div class=page id=1>
<p>Once upon a time...</p>
...
<p>...and so the bold knight
</div>
<div class=page id=2>
slew the evil dragon.</p>
<p>Text...</p>
...
This is illegal in HTML, as we have a <p> tag being interrupted by a </div> tag, and then a new div element beginning with plain text, which is closed by a </p> tag.
HTML would expect me to close the first part of the broken paragraph with a </p>, and continue with a new <p> tag after the div, but I am not doing this because it doesn't correspond to the pagination of the original book, and would result in half-paragraphs being understood as two proper paragraphs.
So, how can I use legal HTML while maintaining the theoretical page/paragraph/broken paragraph/page break structure and information, or at least making the browser "know" the original pagination? Is there a more appropriate tag or method to emulate the page break while keeping the page number id?
Perhaps something like
<p>...and so the brave knight<some tag(s) that show page 2 begins here>killed the dragon</p>
How about, instead of encapsulating each page within a div, you include a tag at the start of each page designating the page number? An aside tag seems appropriate for this.
<aside class="page-number" data-page="1">Page 1</aside>
<p>Once upon a time...</p>
<p>...and so the bold knight</p>
<aside class="page-number" data-page="2">Page 2</aside>
<p class="continued">slew the evil dragon.</p>
<p>Text...</p>
If you need to continue a paragraph then you'll have to break it into multiple elements, but perhaps you can specify when a paragraph is a continuation of a previous one, for instance by using the continued class as shown above.
If you really don't want to break the p tag then you could put a span within it that is only used for semantic reasons. Something like this:
<p>...and so the bold knight
<span class="page-marker" aria-hidden="true" data-page="1"></span>
slew the evil dragon.</p>
But this kind of makes less semantic sense than the previous solution.
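Either way, the page markers keep the original pagination recoverable by a script. As a rough sketch, assuming the aside markup shown above is written exactly in that form, a hypothetical helper (splitIntoPages is my own name) could split a chapter back into pages:

```javascript
// Split a marked-up chapter into pages using the data-page markers.
// Assumes markers look exactly like:
//   <aside class="page-number" data-page="N">...</aside>
function splitIntoPages(html) {
  const marker = /<aside class="page-number" data-page="(\d+)">[^<]*<\/aside>/g;
  const pages = [];
  let current = null;
  let lastIndex = 0;
  for (const match of html.matchAll(marker)) {
    // Everything between the previous marker and this one belongs to the previous page.
    if (current) current.content = html.slice(lastIndex, match.index).trim();
    current = { page: Number(match[1]), content: "" };
    pages.push(current);
    lastIndex = match.index + match[0].length;
  }
  // Whatever follows the last marker belongs to the last page.
  if (current) current.content = html.slice(lastIndex).trim();
  return pages;
}

const chapter = [
  '<aside class="page-number" data-page="1">Page 1</aside>',
  "<p>...and so the bold knight</p>",
  '<aside class="page-number" data-page="2">Page 2</aside>',
  '<p class="continued">slew the evil dragon.</p>',
].join("\n");

console.log(splitIntoPages(chapter));
```

A paragraph with the continued class tells the same script that its text should be joined to the last paragraph of the previous page when the original paragraph needs to be rebuilt.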
Try adding display: inline; to either the CSS style of the class page or the style attribute of each page div.

How can I put an image outside of the paragraph?

Instead of
<p>Hello <img src="helloworld.jpg"> World</p>
I would like to have:
<p>Hello</p><img src="helloworld.jpg"><p>World</p>
<p> has a padding of 40px and I would like the images to use all space available.
You can turn off paragraphs for all content (link), but as far as I know you can't turn it off for certain elements only.
What you can do is modify the HTML after retrieving it from the database but before outputting it. You haven't specified your server-side language; for C#, I've found that CsQuery is great.
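To illustrate the kind of post-processing meant here, below is a rough JavaScript sketch that lifts images out of paragraphs in an HTML string. The function name hoistImages is made up, and the regular expressions assume plain <p> tags without attributes; for real content an HTML parser (such as CsQuery in C#) is the more robust choice.

```javascript
// Lift <img> tags out of the paragraphs that contain them, wrapping the
// text on either side in its own <p>. Assumes plain <p> tags with no
// attributes and no nested paragraphs.
function hoistImages(html) {
  return html.replace(/<p>([\s\S]*?)<\/p>/g, (whole, inner) => {
    // Paragraphs without images are left exactly as they were.
    if (!/<img\b/i.test(inner)) return whole;
    return inner
      .split(/(<img\b[^>]*>)/i) // the capturing group keeps the <img> tags
      .map((part) => {
        if (/^<img\b/i.test(part)) return part;
        const text = part.trim();
        return text === "" ? "" : "<p>" + text + "</p>";
      })
      .join("");
  });
}

console.log(hoistImages('<p>Hello <img src="helloworld.jpg"> World</p>'));
// <p>Hello</p><img src="helloworld.jpg"><p>World</p>
```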

what is the correct way to code incoming links for SEO?

Our site is giving out 'badges' to our authors. They can post these on their personal blogs, and the badges will serve as incoming links to our site.
We want to give out the best possible code for SEO without doing anything that would get us flagged.
I would like to know what your thoughts are on the following snippet of code, and whether anyone has any DEFINITE advice on dos and don'ts with it. Also, let me know if any of it is redundant or not worth it for SEO purposes.
I've kept the CSS inline since some of the writers would not have access to add a link to external CSS.
I've changed the real values, but title, alt, etc. would be descriptive keywords similar to our page titles (no overloading keywords or any of that).
<div id="writer" style="width:100px;height:50px;">
<h1><strong style="float:left;text-indent:-9999px;overflow:hidden;margin:0;padding:0;">articles on x,y,z</strong>
<a href="http://www.site.com/link-to-author" title="site description">
<img style="border:none" src="http://www.site.com/images/badge.png" alt="description of articles" title="View my published work on site.com"/>
</a>
</h1></div>
thanks
Using H1 to enclose your "badge" is a really bad idea. It may not negatively affect SEO for your site so much, but it will very likely ruin the accessibility (and thus SEO) of the author's site. H1-H6 are used to provide document structure by semantically delimiting document headings. Random use of heading tags can confuse screen readers and web crawlers. There's not much you can do in terms of legitimate SEO aside from making correct use of semantic HTML markup.
Edit:
Something like this would be the safest bet:
<div id="writer-badge" style="width: 100px; height: 50px;">
<strong>
Articles on x,y,z
</strong>
<br />
<a href="..." title="site description" rel="profile">
<img style="border: none" src="..." alt="..."
longdesc="http://site.com/badges-explained"
/>
</a>
</div>
I put a line-break between the text and image to treat the text as sort of a badge title. If it's not meant to be displayed that way, then I would omit the <strong> tags altogether (there's no semantic value in encapsulating the text that way, and any styling could be done using the DIV or a weight-neutral SPAN element).
IMO there's really no reason for an achievement badge to have a heading of its own (it's really not even part of the document, just a flourish in the layout), but if you absolutely must, then H6 would be more appropriate and safer to use than H1.
As far as keyword proximity, that is sorta venturing into the grey-hat area of SEO (similar to keyword stuffing), and I wouldn't know anything about that. I've yet to come across any reliable info on how Google or other search engines treat keyword placement. I think if you properly use tag attributes like alt, title, longdesc, rel, rev, etc. in images and links, you'll be alright.
I don't think there is any issue with this code except your <h1> tag. I would probably change it to <h2>, simply because a page is supposed to have only one <h1> tag.
You could also use an iFrame instead if you wanted. That is what SO does but I know you will not get as much linky goodness.

HTML Tags: Presentational vs Structural

I found many different views in many articles about presentational tags, with some people thinking all tags are presentational and others disagreeing.
For example: in the HTML 5 specification, they do not think <small> is presentational.
In this list of tags, which are all supported in HTML 5, which tags are presentational and which are not?
<abbr>
<address>
<area>
<b>
<bdo>
<blockquote>
<br>
<button>
<cite>
<dd>
<del>
<dfn>
<dl>
<dt>
<em>
<hr>
<i>
<ins>
<kbd>
<map>
<menu>
<pre>
<q>
<samp>
<small>
<span>
<strong>
<sub>
<sup>
<var>
Who decides which HTML tags are presentational and which are not, and how do they make that decision? Is it a particular large group such as the W3C, or is it groups of web developers, i.e. the web community? Also, between the two, whose advice should we follow when deciding which tags are presentational?
If a tag is valid according to the W3C in the accepted doctypes, then what are the pros of not using any XHTML tag, from any point of view?
From the user/usability/accessibility point of view:
If we use more HTML tags, then pages without CSS will be better.
From the developer's point of view:
If we make use of more of the available HTML tags, then we do not need to use <span class="className">,
which takes more time to write and uses more characters than plain tags do, in both HTML and CSS.
For example:
instead of using:
<span class="boldtext">Some text</span>
.boldtext {font-weight:700}
We can use:
<b>Some text</b>
b {font-weight:700}
It looks cleaner, it is easier to use, it uses fewer characters (which will reduce the page size), and it is more readable in the source. It also does not break the rule of content and presentation separation.
We can also do this:
<b class="important">Some text</b>
b.important {font-weight:700}
and whenever we want to change font-weight then we can change css only in both examples.
If a tag is considered valid by w3c in their recognized doctypes, then what are the pros to not using any X/HTML presentational tags which are not directly recognized by either the W3C, or by the HTML specifications?
Can we change any design parameters without changing anything in HTML? Does this fit within the meme of content and presentation separation?
If any HTML tag breaks the rule of separation, then doesn't the CSS content property break it as well?
See this article.
Why are the HEIGHT and WIDTH attributes for the IMG element permitted? Do they not break the rule of separation? A good debate on this matter can be found here.
The W3C decides the semantics of tags. The HTML5 specification documents give conditions on the use of the various tags.
HTML5
To continue with your example, there is nothing wrong with using <b> to bold some text unless:
The text being bolded is a single entity already represented by a tag:
Incorrect:
<label for="name"><b>Name:</b></label>
Correct: (Use CSS to style the element)
label { font-weight: bold; }
<label for="name">Name:</label>
The text is being bolded to put added emphasis and weight on a section or words of a block of text.
Incorrect:
<p>HTML has been created to <b>semantically</b> represent documents.</p>
Correct: (Use <strong>)
<p>HTML has been created to <strong>semantically</strong> represent documents.</p>
The following is an example of proper use of the <b> tag:
Correct:
<p>You may <b>logout</b> at any time.</p>
I realize that there doesn't seem to be a lot of difference between the above example and the one using <strong> as the proper example. To simply explain it, the word semantically plays an important role in the sentence and its emphasis is being strengthened by bold font, while logout is simply bolded for presentation purposes.
The following would be an improper usage.
Incorrect:
<p><b>Warning:</b> Following the procedure described below may irreparably damage your equipment.</p>
Correct: (This is used to add strong emphasis, therefore use <strong>)
<p><strong>Warning:</strong> Following the procedure described below may irreparably damage your equipment.</p>
Using <span class="bold"> is markup-smell and simply shouldn't be allowed. The <span> element is used to apply style to inline content when a generic presentational tag (e.g. <b>) doesn't apply. For example, to make some text green:
Incorrect:
<p>You will also be happy to know <span class="bold">ACME Corp</span> is a <span class="eco-green">certified green</span> company.</p>
Correct: (Explanation below)
<p>You will also be happy to know <b>ACME Corp</b> is a <em class="eco-green">certified green</em> company.</p>
The reason here why you would want to use <em> as opposed to <span> for the word green is because the color green here is used to add emphasis on the fact that ACME Corp is a certified green company.
The following would be a good example of the use of a <span> tag:
Correct:
<p>You may press <kbd>CTRL+G</kbd> at any time to change your pen color to <span class="pen-green">green</span>.</p>
In this example, the word green is styled in green simply to reflect the color, not to add any emphasis (<em>) or strong emphasis (<strong>).
The whole distinction between "presentation" elements and "structure" elements is, in my opinion, a matter of common sense, not something defined by the W3C or anyone else. :-P
An element that describes what its content is (as opposed to how it should look) is a structure element. Everything else is, by definition, not structural, and therefore a presentation element.
Now, I'll answer the second part of your post. I understand this is a contentious topic, but I'll speak my mind anyway.
Well-made HTML should not concern itself with how it should look. That's the job of the stylesheet. The reason it should leave it to the stylesheet, is so you can deliver one stylesheet for desktop computers, another one for netbooks, smartphones, "dumbphones" (for lack of a better term), Kindles, and (if you care about accessibility, and you should) screen readers.
By using presentation markup in your HTML, you force a certain "look" across all these different types of media, removing the ability of the designer to choose a look that works best for such devices. This is micromanagement of the worst sort, and designers will hate you for it. :-)
To use your example, instead of using <b>, you should ask yourself what the boldness is supposed to express. If you're trying to express a section title, use one of the header tags (<h1> through <h6>). If you're trying to express strong emphasis, use <strong>. You get the idea. Express the what, not the how; leave the how to the stylesheet designers.
</soapbox>
It's not that presentational elements should be avoided, it's that markup should be as semantic as possible. When designing a document structure, default styling should be considered a secondary effect. If an element is used solely for presentation, it's not semantic, no matter what element is used.
The example usage of <b> isn't semantic, because <b> imparts no meaning. <span class="boldtext"> also isn't semantic. As such, their usage is mixing presentation into the structure.
