Select All Japanese Text, But Without Romanization - css

Important Edit—
It appears that there's nothing wrong with my language code here; something else I've done somewhere in my CSS has stopped my code from working. This means all my guesses as to why it's not working were completely wrong. (lol)
When I find out what went wrong, I'll update this question; but if I can't find the error, I may just redo it from scratch. The project is small enough that that is still an option.
For now, the question is effectively solved.
Original Question—
I've been writing a mixed-language HTML document —primarily in English— and have been using different formatting for Japanese language text. For example, since Japanese doesn't use italicized/oblique or bold text, emphasis must be done with text-emphasis instead. However, romanized Japanese —Romaji— inherits these text effects when I'd rather it not.
This initial interaction was expected, so I tried to use :not(:lang(ja-latn)) to prevent this. While admittedly a bit messy, it ought to work… but it does not. I think the issue is that ja-latn, Romaji, is a kind of Japanese as far as HTML & CSS are concerned, so the selector doesn't do what I intend. Not labeling the Romaji, or changing it to English, would be textually inaccurate, confuse screen readers, and generally be a bit of a hack.
This is how I had done this (in a condensed form) provided as an example of what I mean. If I made some mistake in formatting not described in this post, that's only because I keep getting "Secure Connection Failed" errors whenever I try to test the snippet, and missed it.
i,em{font-style:italic;}
b,strong{font-weight:bold;}
:is(i,em):lang(ja):not(:lang(ja-latn)){
font-style:normal;
font-weight:normal;
text-emphasis: open currentcolor;
text-emphasis-position: over right;}
ruby{ruby-position:under;}
.goodhappy{color:green}
.wrongangry{color:red; text-emphasis-color:red;}
<div lang="en" >
English text, because I <em class="goodhappy" >don't</em> speak Japanese.<br />
<ruby lang="ja"><!--
-->日<rt lang="ja-hira" >に</rt><!--
-->本<rt lang="ja-hira" >ほん</rt><!--
-->語<rt lang="ja-hira" >ご</rt><!--
--><em class="goodhappy" >わ</em><rt lang="ja-hira" >わ</rt><!--
-->話<rt lang="ja-hira" >はな</rt><!--
-->せません<!--
--></ruby>
<br />
<span lang="ja-latn" >Nihongo <em class="wrongangry" >wa</em> hanasemasen</span>
</div>
How would I go about selecting only CJK japanese characters, but not Romaji text? To be clear, I realize this could be easily done by using a span.class and not using em/i/b/strong etc.. What I mean is, is there a way to accomplish this only in CSS, without more HTML markup than is strictly necessary?

In your question you stated that you tried :not(:lang(ja-latn)) with no success, but in your code you have :not(ja-latn), which is invalid. I changed your code to use :not(:lang(ja-latn)), and as you can see it works properly, leaving the romaji without the emphasis marks on top of it.
i,em{font-style:italic;}
b,strong{font-weight:bold;}
:is(i,em):lang(ja):not(:lang(ja-latn)){
font-style:normal;
font-weight:normal;
text-emphasis: open currentcolor;
text-emphasis-position: over right;}
ruby{ruby-position:under;}
.goodhappy{color:green}
.wrongangry{color:red; text-emphasis-color:red;}
<div lang="en" >
English text, because I <em class="goodhappy" >don't</em> speak Japanese.
<ruby lang="ja"><!--
-->日<rt lang="ja-hira" >に</rt><!--
-->本<rt lang="ja-hira" >ほん</rt><!--
-->語<rt lang="ja-hira" >ご</rt><!--
--><em class="goodhappy" >わ</em><rt lang="ja-hira" >わ</rt><!--
-->話<rt lang="ja-hira" >はな</rt><!--
-->せません<!--
--></ruby>
<br />
<span lang="ja-latn" >Nihongo <em class="wrongangry" >wa</em> hanasemasen</span>
</div>
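A possible alternative, offered only as a sketch: :lang() matches by prefix, so :lang(ja) also matches ja-latn and ja-hira, which is why the exclusion is needed in the first place. Instead of :not(), the emphasis marks can be applied to everything under :lang(ja) and then reset for :lang(ja-Latn) in a later rule of equal specificity. The markup is trimmed from the question, and the reset values are my own assumption about the desired look, not something from the original post.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Emphasis marks for Japanese, but not for romaji</title>
<style>
  /* Prefix match: applies to ja, ja-Hira and ja-Latn alike. */
  em:lang(ja) {
    font-style: normal;
    text-emphasis: open currentcolor;
    text-emphasis-position: over right;
  }
  /* Same specificity, declared later, so it wins for romaji. */
  em:lang(ja-Latn) {
    font-style: italic;
    text-emphasis: none;
  }
</style>
</head>
<body>
  <p>
    <span lang="ja">日本語<em>は</em>話せません</span><br>
    <span lang="ja-Latn">Nihongo <em>wa</em> hanasemasen</span>
  </p>
</body>
</html>
```

Language tags are compared case-insensitively, so ja-latn and ja-Latn select the same elements.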

Related

Semantic mark for testimonial details

I just got the following PSD design:
(Sorry about the grid line in between; that blue line really is not needed.)
Now I was wondering, with all the HTML5 tags available, what would be good semantic markup for the above design. I usually go old school and use divs and spans, but this time I used cite. Having read the MDN docs, it is still not clear to me whether a name or designation can go inside cite; the way I look at cite, it is a tag to be used only when you have a definitive resource to add to your markup. Even though neither the name nor the designation is a definitive resource, I came up with the following markup.
<div class="testimonia-details">
<img src="img/res/p1.png" alt="testimonial giver">
<p>
<span>Brian</span>-May 2015
<span>Managing Partner.<cite>Tammy Lenski LLC</cite></span>
</p>
</div>
Can anybody tell me what would be a more semantic way to code the testimonial details? Thank you; I would greatly appreciate any help or guidance. I have always wondered what semantic markup would look like, especially for a scenario like the one above.
Using schema.org metadata
<div class="testimonia-details" itemscope itemtype="http://schema.org/Person">
<img src="img/res/p1.png" alt="testimonial giver" itemprop="image">
<p>
<span itemprop="name">Brian</span>-May 2015
<span itemprop="jobTitle">Managing Partner.<span itemprop="worksFor">Tammy Lenski LLC</span></span>
</p>
</div>
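A variation worth considering, as a sketch only: in the markup above the worksFor span is nested inside the jobTitle span, so the job title's text would also contain the company name. schema.org expects worksFor to be an Organization, so the version below gives it its own item with a name property. The time element, the separators, and the alt text are my additions and are not part of the original design.

```html
<div class="testimonia-details" itemscope itemtype="http://schema.org/Person">
  <img src="img/res/p1.png" alt="Portrait of Brian" itemprop="image">
  <p>
    <span itemprop="name">Brian</span> -
    <!-- Machine-readable month for the testimonial date. -->
    <time datetime="2015-05">May 2015</time><br>
    <!-- jobTitle now holds only the title text. -->
    <span itemprop="jobTitle">Managing Partner</span>,
    <!-- worksFor is a nested Organization item with its own name. -->
    <span itemprop="worksFor" itemscope itemtype="http://schema.org/Organization">
      <span itemprop="name">Tammy Lenski LLC</span>
    </span>
  </p>
</div>
```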

what is the correct way to code incoming links for SEO?

Our site is giving out 'badges' to our authors. They can post these on their personal blogs, and they will serve as incoming links to our site.
We want to give out the best possible code for SEO without doing anything that would get us flagged.
I would like to know what your thoughts are on the following snippet of code, and whether anyone has any DEFINITE advice on dos and don'ts with it. Also, let me know if any of it is redundant or not worth it for SEO purposes.
I've kept the CSS inline since some of the writers would not have access to add a link to an external stylesheet.
I've changed the real values, but title, alt, etc. would be descriptive keywords similar to our page titles (no overloading keywords or any of that).
<div id="writer" style="width:100px;height:50px;">
<h1><strong style="float:left;text-indent:-9999px;overflow:hidden;margin:0;padding:0;">articles on x,y,z</strong>
<a href="http://www.site.com/link-to-author" title="site description">
<img style="border:none" src="http://www.site.com/images/badge.png" alt="description of articles" title="View my published work on site.com"/>
</a>
</h1></div>
thanks
Using H1 to enclose your "badge" is a really bad idea, not so much because it'll negatively affect SEO for your site, but because it will very likely ruin the accessibility (and thus SEO) of the author's site. H1-H6 are used to provide document structure by semantically delimiting document headings. Random use of heading tags can confuse screen readers and web crawlers. There's not much you can do in terms of legitimate SEO aside from making correct use of semantic HTML markup.
Edit:
Something like this would be the safest bet:
<div id="writer-badge" style="width: 100px; height: 50px;">
<strong>
Articles on x,y,z
</strong>
<br />
<a href="..." title="site description" rel="profile">
<img style="border: none" src="..." alt="..."
longdesc="http://site.com/badges-explained"
/>
</a>
</div>
I put a line-break between the text and image to treat the text as sort of a badge title. If it's not meant to be displayed that way, then I would omit the <strong> tags altogether (there's no semantic value in encapsulating the text that way, and any styling could be done using the DIV or a weight-neutral SPAN element).
IMO there's really no reason for an achievement badge to have a heading of its own (it's really not even part of the document, just a flourish in the layout), but if you absolutely must, then H6 would be more appropriate and safer to use than H1.
As far as keyword proximity, that is sorta venturing into the grey-hat area of SEO (similar to keyword stuffing), and I wouldn't know anything about that. I've yet to come across any reliable info on how Google or other search engines treat keyword placement. I think if you properly use tag attributes like alt, title, longdesc, rel, rev, etc. in images and links, you'll be alright.
I don't think there is any issue with this code except your <h1> tag. I would probably change it to <h2> simply because pages are supposed to have only 1 <h1> tag per page.
You could also use an iFrame instead if you wanted. That is what SO does but I know you will not get as much linky goodness.

HTML Tags: Presentational vs Structural

I found many different views on many articles on presentation tags, with some people thinking all tags are presentational, but some others do not think so.
For example: in the HTML 5 specification, they do not think <small> is presentational.
In this list of tags - which are all HTML 5 supported - which tag is presentational and which is not?
<abbr>
<address>
<area>
<b>
<bdo>
<blockquote>
<br>
<button>
<cite>
<dd>
<del>
<dfn>
<dl>
<dt>
<em>
<hr>
<i>
<ins>
<kbd>
<map>
<menu>
<pre>
<q>
<samp>
<small>
<span>
<strong>
<sub>
<sup>
<var>
Who decides which HTML tag is presentational and which is not, and how do they make that decision? Is it a particular large group such as the W3C, or is it groups of web developers, i.e. the web community? Also, between the two, whose advice should we follow when deciding which tags are presentational?
If a tag is valid according to the W3C in accepted doctypes, then what are the pros of not using any XHTML tag, from any point of view?
From the user/usability/accessibility point of view:
If we use more HTML tags, pages without CSS will look better.
From the developer's point of view:
If we make use of more of the tags available in HTML, then we do not need to use <span class="className">.
That takes more time to write and uses more characters than tags do, in both HTML and CSS.
For example:
instead of using:
<span class="boldtext">Some text</span>
.boldtext {font-weight:700}
We can use:
<b>Some text</b>
b {font-weight:700}
It looks cleaner, is easier to use, uses fewer characters (which reduces the page size), and is more readable in source. It also does not break the rule of content and presentation separation.
We can also do this:
<b class="important">Some text</b>
b.important {font-weight:700}
and whenever we want to change the font-weight, we only need to change the CSS in both examples.
If a tag is considered valid by w3c in their recognized doctypes, then what are the pros to not using any X/HTML presentational tags which are not directly recognized by either the W3C, or by the HTML specifications?
Can we change any design parameters without changing anything in HTML? Does this fit within the meme of content and presentation separation?
If any HTML tag breaks the rule of separation, then does the CSS property content not break it as well?
See this article.
Why are the HEIGHT and WIDTH attributes for the IMG element permitted? Do they not break the rule of separation? A good debate on this matter can be found here.
The W3C decides the semantics of tags. The HTML5 specification documents give conditions on the use of the various tags.
HTML5
To continue with your example, there is nothing wrong with using <b> to bold some text unless:
The text being bolded is a single entity already represented by a tag:
Incorrect:
<label for="name"><b>Name:</b></label>
Correct: (Use CSS to style the element)
label { font-weight: bold; }
<label for="name">Name:</label>
The text is being bolded to put added emphasis and weight on a section or words of a block of text.
Incorrect:
<p>HTML has been created to <b>semantically</b> represent documents.</p>
Correct: (Use <strong>)
<p>HTML has been created to <strong>semantically</strong> represent documents.</p>
The following is an example of proper use of the <b> tag:
Correct:
<p>You may <b>logout</b> at any time.</p>
I realize that there doesn't seem to be a lot of difference between the above example and the one using <strong> as the proper example. To simply explain it, the word semantically plays an important role in the sentence and its emphasis is being strengthened by bold font, while logout is simply bolded for presentation purposes.
The following would be an improper usage.
Incorrect:
<p><b>Warning:</b> Following the procedure described below may irreparably damage your equipment.</p>
Correct: (This is used to add strong emphasis, therefore use <strong>)
<p><strong>Warning:</strong> Following the procedure described below may irreparably damage your equipment.</p>
Using <span class="bold"> is a markup smell and simply shouldn't be allowed. The <span> element is used to apply style to inline content when a generic presentation tag (i.e. <b>) doesn't apply, for example to make some text green:
Incorrect:
<p>You will also be happy to know <span class="bold">ACME Corp</span> is a <span class="eco-green">certified green</span> company.</p>
Correct: (Explanation below)
<p>You will also be happy to know <b>ACME Corp</b> is a <em class="eco-green">certified green</em> company.</p>
The reason here why you would want to use <em> as opposed to <span> for the word green is because the color green here is used to add emphasis on the fact that ACME Corp is a certified green company.
The following would be a good example of the use of a <span> tag:
Correct:
<p>You may press <kbd>CTRL+G</kbd> at any time to change your pen color to <span class="pen-green">green</span>.</p>
In this example, the word green is styled in green simply to reflect the color, not to add any emphasis (<em>) or strong emphasis (<strong>).
The whole distinction between "presentation" elements versus "structure" element is, in my opinion, a matter of common sense, not something defined by W3C or anyone else. :-P
An element that describes what its content is (as opposed to how it should look) is a structure element. Everything else is, by definition, not structural, and therefore a presentation element.
Now, I'll answer the second part of your post. I understand this is a contentious topic, but I'll speak my mind anyway.
Well-made HTML should not concern itself with how it should look. That's the job of the stylesheet. The reason it should leave it to the stylesheet, is so you can deliver one stylesheet for desktop computers, another one for netbooks, smartphones, "dumbphones" (for lack of a better term), Kindles, and (if you care about accessibility, and you should) screen readers.
By using presentation markup in your HTML, you force a certain "look" across all these different types of media, removing the ability of the designer to choose a look that works best for such devices. This is micromanagement of the worst sort, and designers will hate you for it. :-)
To use your example, instead of using <b>, you should ask yourself what the boldness is supposed to express. If you're trying to express a section title, use one of the header tags (<h1> through <h6>). If you're trying to express strong emphasis, use <strong>. You get the idea. Express the what, not the how; leave the how to the stylesheet designers.
</soapbox>
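To make the "what, not how" point concrete, here is a minimal sketch: the markup only says that something is a heading or a strongly emphasized warning, and the stylesheet decides how that looks on each medium. The print rule and the 480px breakpoint are arbitrary assumptions chosen for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Same markup, different presentation</title>
<style>
  /* Default look on screens. */
  strong { font-weight: bold; }

  /* Hypothetical print look: the same <strong>, presented differently. */
  @media print {
    strong { font-weight: normal; text-decoration: underline; }
  }

  /* Hypothetical small-screen tweak; the breakpoint is arbitrary. */
  @media (max-width: 480px) {
    h2 { font-size: 1.1rem; }
  }
</style>
</head>
<body>
  <h2>Safety notes</h2>
  <p><strong>Warning:</strong> Following the procedure described below may irreparably damage your equipment.</p>
</body>
</html>
```

Had the warning been marked up with <b>, the print rule would have had nothing meaningful to hook into except "the bold thing".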
It's not that presentational elements should be avoided, it's that markup should be as semantic as possible. When designing a document structure, default styling should be considered a secondary effect. If an element is used solely for presentation, it's not semantic, no matter what element is used.
The example usage of <b> isn't semantic, because <b> imparts no meaning. <span class="boldtext"> also isn't semantic. As such, their usage is mixing presentation into the structure.

Google Translation API

I have text that I would like to translate into Russian. The text has custom tags and multiple <BR> tags. The API behaves oddly with <BR> tags. Are there known issues with <BR> tags? Is there a way around it, or what is the best way to use Google jQuery translation to translate the text?
The text is
<INPUTANSWER PARTID='1'>
<SPAN STYLE="FONT: 7pt 'Times New Roman'"> </SPAN>
Place a <STRONG>90 degree</STRONG> explicit angle constraint to the inside
faces of <STRONG>DP-1007:1 </STRONG>and<STRONG>DP-1006:1</STRONG> as shown.</P>
<P STYLE="MARGIN-LEFT: 0.5in; TEXT-INDENT: -0.25in">
2.
<SPAN STYLE="FONT: 7pt 'Times New Roman'"> </SPAN>
Drive this angle constraint between <STRONG>90 and 100 degrees</STRONG>
with an <STRONG>increment</STRONG> <STRONG>of 0.125 degrees.</STRONG>
</INPUTANSWER>
Check this: it's the jQuery translate project. I've used it before with normal text and never tried markup, but quoting their home page:
It also reduces the number of requests by concatenating elements and doesn't send unnecessary html markup still providing access to each element as they've got translated.
If this doesn't work you can always hold on to the original document fragment and just walk it, translate content and replace. I am sure this will work as the API behaved perfectly with plain text.
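The walk-translate-replace fallback could look roughly like the sketch below. translateText is a hypothetical stand-in that only tags the text; it is not part of any Google or jQuery API and would have to be replaced by a real translation call. The sample sentences are adapted from the question.

```html
<!DOCTYPE html>
<html lang="en">
<head><meta charset="utf-8"><title>Translate text nodes only</title></head>
<body>
  <div id="content">Place a <strong>90 degree</strong> explicit angle constraint.<br>Drive this angle constraint between <strong>90 and 100 degrees</strong>.</div>
  <script>
    // Hypothetical placeholder for a real translation call (assumption).
    function translateText(text) {
      return "[ru] " + text;
    }

    // Collect the text nodes first, then rewrite them, so the walk is not
    // disturbed by the changes. Tags such as <br> and <strong> are never
    // passed to the translator and stay where they were.
    const root = document.getElementById("content");
    const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
    const textNodes = [];
    while (walker.nextNode()) {
      textNodes.push(walker.currentNode);
    }
    for (const node of textNodes) {
      // Skip whitespace-only nodes between tags.
      if (node.nodeValue.trim() !== "") {
        node.nodeValue = translateText(node.nodeValue);
      }
    }
  </script>
</body>
</html>
```

One caveat with this approach: each text node is translated on its own, so a sentence split by inline tags loses its context, which may hurt translation quality.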
Find a translator. Who would read machine translations? That's bad form. Besides, machine translation from English to Russian comes out quite unclear, because the two languages have completely different structures. It is easier to read the English than auto-translated text.

why <br /> and not <br/>?

This is one of those things that you read once, say "aha!" and then forget. Exactly my case.
Why is the line-break tag in xhtml preferentially written with a space <br /> and not in the also ok format <br/> ? I remember the reason was interesting, and as you can imagine it's not easy to find with google.
For sure it's not an issue of xml well-formedness. From W3C
[44] EmptyElemTag ::= '<' Name (S Attribute)* S? '/>'
Empty-element tags may be used for any element which has no content, whether
or not it is declared using the keyword EMPTY. For interoperability, the
empty-element tag should be used, and should only be used, for elements which
are declared EMPTY.
Examples of empty elements:
<IMG align="left" src="http://www.w3.org/Icons/WWW/w3c_home" />
<br></br>
<br/>
So the space at the end is optional.
w3c specifies this as the grammar:
EmptyElemTag ::= '<' Name (S Attribute)* S? '/>'
That means an open bracket, a name, any number of (space and attribute) tokens, an optional space, a slash, and a closing bracket. According to this, both are correct.
If I recall correctly it's simply because some older browsers had problems with a self-closing tag without a space before the slash. I doubt it's an issue nowadays, but a lot of developers (myself included) got into the habit of including the space.
Edit: Ah, here we are:
http://www.w3.org/TR/xhtml1/#guidelines
Include a space before the trailing / and > of empty elements, e.g. <br />, <hr /> and <img src="karen.jpg" alt="Karen" />. Also, use the minimized tag syntax for empty elements, e.g. <br />, as the alternative syntax <br></br> allowed by XML gives uncertain results in many existing user agents.
Some older browsers didn't parse the element correctly without the space, so most web developers use <br />. I don't remember which browsers offhand, but I believe they're just about extinct.
EDIT: The browser was Netscape 4.
There is no right way in XHTML. They are formally identical in XML. Whitespace is not significant in that location.
For XHTML: both of them. For HTML4 and earlier: neither.
<br /> is valid (old) HTML, while <br/> is not. If you are serving your XHTML as XML, it doesn't matter. If you are serving it as text/html, then it needs to be valid HTML in addition to being valid XHTML. (Why serve XHTML as HTML? Because IE doesn't understand XHTML as XML, and because no major browser will start rendering XHTML mid-way through downloading the text, but they will do that to HTML. My blog appears to load slowly not because the site is slow, but because the browser won't start rendering the page until everything has been fetched. I hate browsers.)
A little background to add to Matt Hamilton's answer.
At least one problem browser was Netscape 4. A quick check shows that in that browser, <br/> (i.e. no space) doesn't cause a line break. In fact, it doesn't appear to do anything. <br /> (i.e. with space) does perform a line break.
When creating polyglot documents that can behave as XHTML or HTML (Note: "behave as" - not "valid") it's necessary to use either <br /> or <br></br>. However, when parsed as HTML, </br> is treated as if it was <br>, so <br></br> produces two line breaks.
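To see the difference described above, a small test page like the following could be served as text/html. The comments state what an HTML parser does with each form; this is a sketch for experimenting, not a recommendation of any one style.

```html
<!DOCTYPE html>
<html lang="en">
<head><meta charset="utf-8"><title>br forms</title></head>
<body>
  <!-- Plain HTML syntax: one line break. -->
  <p>one<br>two</p>
  <!-- XHTML-compatible syntax with a space: one line break. -->
  <p>one<br />two</p>
  <!-- No space: one line break in current browsers. -->
  <p>one<br/>two</p>
  <!-- The stray </br> is treated as another <br>: two line breaks. -->
  <p>one<br></br>two</p>
</body>
</html>
```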
Both <br/> and <br /> are correct. The reason that <br /> came about in the first place was to support older browsers that didn't understand the new <br/> syntax. It's really kind of a hack where the / is interpreted as an attribute with no value and ignored.
Both are correct, and both will be accepted by web browsers. You may as well save yourself the extra character, and use <br/>
Both are correct. But I would use <br /> just to keep my code consistent... because I would never write
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
instead of
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
just to save a byte... and the second version is imho more readable. But that's just a matter of taste. Do it as you like, but do it consistently :-)
Either will work just fine. Assuming you are asking for evangelical reasons, I prefer <br/>
Both are correct.
<br>. You aren't using XML anyway.

Resources