Sometimes in my accessibility audits I will come across a <p> tag without any content inside it. The screen reader will read out "empty", wasting my time and any disabled person's time in browsing the website.
Redundant elements also get announced, like "separator" when I pass an <hr> tag.
I know these things degrade the accessible experience. But do they actually break the WCAG standard? If so, which success criteria apply? Does the standard address this at all?
Empty paragraphs, divs, spans, etc. are definitely annoying for users of assistive technology, and it's best-practice to remove them, but they are not a WCAG failure.
To the best of my knowledge, the only empty elements that may cause a WCAG failure are:
title - the title element must describe the purpose of the page (S.C. 2.4.2 Level A)
labels - a form label associated with an input field must not be empty (S.C. 3.3.2 Level A)
heading - heading elements (h1-h6) are not required; however, if a heading element is present, it must contain text that is descriptive of the content (S.C. 2.4.6 Level AA)
table - if a table element is used for tabular data, it must contain tr, td, and th items (S.C. 1.3.1 Level A)
table header - table header (th) elements are required for displaying tabular data. While there is no restriction on empty table cells (td), table headers may not be empty (S.C. 1.3.1 Level A)
lists - list elements (ul, ol, dl) must have list items as child elements (S.C. 1.3.1 Level A)
links - anchor elements (a) must have a valid href value and programmatically-discernible text, as determined by the accessible name calculation algorithm (S.C. 4.1.2 Level A)
There are more things that will fail without required attributes, but that seems a little outside the scope of the question.
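As a rough illustration of the list above, here is a minimal Python sketch that flags a few of those elements when they contain no text. The names `EmptyElementFinder`, `find_empty`, and `CHECKED` are made up for this example. It assumes an element counts as non-empty if it has non-whitespace text, a descendant img with alt text, or its own aria-label; that is a simplification, not the full accessible name calculation, and it does not check list or table structure.

```python
from html.parser import HTMLParser

# Elements from the list above that this sketch checks for emptiness.
CHECKED = {"title", "label", "h1", "h2", "h3", "h4", "h5", "h6", "th", "a"}


class EmptyElementFinder(HTMLParser):
    """Collects checked elements that contain no text (simplified)."""

    def __init__(self):
        super().__init__()
        self.stack = []  # open checked elements as [tag, has_text]
        self.empty = []  # tags found empty, in the order they were closed

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Assumption: img alt text counts as text for every open ancestor.
        if tag == "img" and (attrs.get("alt") or "").strip():
            for entry in self.stack:
                entry[1] = True
        if tag in CHECKED:
            # Assumption: a non-empty aria-label also names the element.
            has_name = bool((attrs.get("aria-label") or "").strip())
            self.stack.append([tag, has_name])

    def handle_data(self, data):
        if data.strip():
            for entry in self.stack:
                entry[1] = True

    def handle_endtag(self, tag):
        # Close the most recent matching open element, if any.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                if not self.stack[i][1]:
                    self.empty.append(tag)
                del self.stack[i:]
                break


def find_empty(html):
    """Return the tag names of checked elements that had no text."""
    finder = EmptyElementFinder()
    finder.feed(html)
    finder.close()
    return finder.empty
```

Calling `find_empty(page_source)` on a page returns a list of tag names worth a manual look; a result still needs human review before being reported as a WCAG failure.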
I am trying to figure out all the valid HTML5 elements that can be nested inside paragraph elements such that the W3C validator doesn't show any errors. That is, I want to know every tag that can replace the dots in the following code without the validator reporting an error:
<p>...</p>
Is there such a list available? I tried searching on Google without any luck.
Even if the converse list is available, i.e. elements that can not be nested inside paragraph elements, it is good enough for me.
The HTML5 spec tells us that the <p> element's content model is phrasing content. Phrasing content is defined by the spec:
3.2.5.1.5 Phrasing content
Phrasing content is the text of the document, as well as elements that
mark up that text at the intra-paragraph level. Runs of phrasing
content form paragraphs.
a (if it contains only phrasing content)
abbr
area (if it is a descendant of a map element)
audio
b
bdi
bdo
br
button
canvas
cite
code
command
datalist
del (if it contains only phrasing content)
dfn
em
embed
i
iframe
img
input
ins (if it contains only phrasing content)
kbd
keygen
label
map (if it contains only phrasing content)
mark
math
meter
noscript
object
output
progress
q
ruby
s
samp
script
select
small
span
strong
sub
sup
svg
textarea
time
u
var
video
wbr
text
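If you want to check this in code, a simple membership test against the list above is enough for a first pass. This is only a sketch: `PHRASING` and `allowed_in_p` are names invented here, the set is copied from the list quoted above (so it reflects that snapshot of the spec), and the conditional entries (a, area, del, ins, map) are treated as always allowed, which is a simplification.

```python
# Tag names copied from the phrasing content list quoted above.
PHRASING = frozenset(
    """
    a abbr area audio b bdi bdo br button canvas cite code command datalist
    del dfn em embed i iframe img input ins kbd keygen label map mark math
    meter noscript object output progress q ruby s samp script select small
    span strong sub sup svg textarea time u var video wbr
    """.split()
)


def allowed_in_p(tag):
    """True if the tag name is in the quoted phrasing content list."""
    return tag.lower() in PHRASING


print(allowed_in_p("span"))  # True
print(allowed_in_p("div"))   # False
```

Anything not in the set, such as div, ul, table, or another p, is the "converse list" the question asks about.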
I'm building a program that scrapes a website. It looks at the entire website and takes only the header and footer navigation menus from that website, then inserts new html tags (div, p, table, etc.) in between the header and footer menus.
I'm looking for some ideas on how to strip only the header and footer nav menus, as well as add code in between the two.
I'm using HTML Agility Pack and have worked on a few methods.
Method 1:
In most cases, the header and footer navigation menus are mostly
links, and have very little text. I used a threshold variable that
was a ratio of text to links. If the ratio text:links for a node is
less than the threshold, the node would be considered a menu node, and
it would be saved. Any node whose text:links ratio was greater than
the threshold value would be removed.
Method 1 worked for some sites, but not for others, so I ditched it.
Method 2:
I searched each node for an id or class attribute that included "nav"
or "menu". The match was case-insensitive, and "nav" or "menu" could
be surrounded by any combination of characters. That way, it would
include ids and classes such as "bottomNav", "navRight1", "LeftMenu2",
etc. If the id or class contained either "nav" or "menu", the node
would be saved. If neither the node's attributes nor those of any of
its descendants contained either term, the node would be deleted.
Again, method 2 worked for some sites, but not for others.
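For reference, the matching rule in method 2 boils down to something like the following. This is a Python sketch of the rule only, not my actual HTML Agility Pack code; `NAV_PATTERN` and `looks_like_menu` are names made up for the illustration, and the attributes are assumed to arrive as a plain dict.

```python
import re

# Case-insensitive match for "nav" or "menu" anywhere in the value.
NAV_PATTERN = re.compile(r"nav|menu", re.IGNORECASE)


def looks_like_menu(attrs):
    """True if the id or class attribute contains "nav" or "menu"."""
    return any(
        NAV_PATTERN.search(attrs.get(name) or "") for name in ("id", "class")
    )


print(looks_like_menu({"id": "bottomNav"}))      # True
print(looks_like_menu({"class": "LeftMenu2"}))   # True
print(looks_like_menu({"class": "content"}))     # False
```

One weakness of a plain substring match is that a class such as "unavailable" also contains "nav" and would be kept.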
For the sites where either of these methods worked, I still wasn't able to put new html code in between the two menus, because I had no way of telling where the header menu ended, and where the footer menu began.
I'm just looking for other ideas on how to scrape only the header and footer navigation menus from a website, and insert new html code in between the two.
Other than looking for specific elements or element classes (header, nav, ...), you can try to look at the problem in a different way:
first, fetch and parse two (or more) pages from each website, preferably checking that they vary substantially (but not totally);
then, do a diff (of the DOM, preferably), and retain only the common structure.
This common structure should consist mostly of headers, footers, navbars and other elements more or less constant across each website.
A final step might be to look in this common structure for small gaps caused by headers/footers that vary depending on context, as opposed to large gaps caused by different (main) content, and scrape their possible values from the largest set of pages you can fetch from each website.
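The idea can be sketched in Python with only the standard library. This is an approximation rather than a real DOM diff: each page is reduced to a list of (element path, text) pairs, and only the pairs that occur on both pages are kept. `PathCollector`, `collect`, and `common_structure` are names invented for this example, and the pages are assumed to be already fetched as strings.

```python
from html.parser import HTMLParser

# Void elements never get an end tag, so they are not pushed on the path.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class PathCollector(HTMLParser):
    """Records each text node together with the path of open elements."""

    def __init__(self):
        super().__init__()
        self.path = []   # labels of currently open elements
        self.items = []  # (path, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        element_id = dict(attrs).get("id")
        self.path.append(f"{tag}#{element_id}" if element_id else tag)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open element; this tolerates
        # unclosed tags in between.
        for i in range(len(self.path) - 1, -1, -1):
            if self.path[i].split("#")[0] == tag:
                del self.path[i:]
                break

    def handle_data(self, data):
        text = " ".join(data.split())
        if text:
            self.items.append(("/".join(self.path), text))


def collect(html):
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    return collector.items


def common_structure(page_a, page_b):
    """Items of page_a (in order) that also appear on page_b."""
    shared = set(collect(page_b))
    return [item for item in collect(page_a) if item in shared]
```

With two pages that share a nav and a footer but differ in their main content, `common_structure` returns the nav and footer items only. The position where consecutive common items stop being adjacent in the original page is a candidate for where the header ends and the footer begins.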
What determines whether one should prefer <ul> over <article>, or vice versa, in an HTML document?
As an example, I have a portfolio page with a list of items. Which would be more appropriate?
Element names are part of semantic HTML, so you should use the one you deem most appropriate for your content. MDN is often a good resource for an overview of what content suits each element; some of its descriptions are quoted below.
Lists tend to hold shorter, more concise content, often text-only or very image-light. It sounds like you want to look at the section or article elements.
Section
The HTML Section Element (<section>) represents a generic section of a
document, i.e., a thematic grouping of content, typically with a
heading.
Article
The HTML <article> Element represents a self-contained composition in
a document, page, application, or site, which is intended to be
independently distributable or reusable, e.g., in syndication. This
could be a forum post, a magazine or newspaper article, a blog entry,
a user-submitted comment, an interactive widget or gadget, or any
other independent item of content.
List (ul)
The HTML unordered list element (<ul>) represents an unordered list of
items, namely a collection of items that do not have a numerical
ordering, and their order in the list is meaningless. Typically,
unordered-list items are displayed with a bullet, which can be of
several forms, like a dot, a circle or a square.
If I want to link to a place within the same page, I've seen that I should do it like this:
Link Text Here
But what if I have several divs with the same id? Is there a way to distinguish them other than by the id of the div?
I generate the xhtml code from Java and to match the generic css file (that will not be generated) I use "generic" divs for some cases. Of course I could generate a dummy div with no style attributes but with a unique id and wrap that one around the area of interest. I'm however curious if it could be done in a better way?
The id attribute is optional. However according to w3schools if an element has an id attribute then the id value must be unique. Source: http://www.w3schools.com/tags/att_global_id.asp.
ID Naming rules:
Must contain at least one character
Must not contain any space characters
In HTML, id values are case-sensitive
And from what I have learned from experience and HTML validation, it must not start with a number (this holds for HTML 4 and XHTML; HTML5 allows it), though nobody told me that...
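Since the markup is generated, it may be worth checking the output for duplicate ids before relying on fragment links. Here is a minimal Python sketch of such a check; `IdCollector` and `duplicate_ids` are names made up for this example, and it only counts id attributes without validating anything else about them.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Counts how often each id value occurs in a document."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids[value] += 1


def duplicate_ids(html):
    """Return the id values that appear more than once, sorted."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    return sorted(value for value, count in collector.ids.items() if count > 1)


print(duplicate_ids('<div id="box"></div><div id="box"></div><a id="top"></a>'))
# ['box']
```

Any id reported here is one that a link to `#that-id` cannot target unambiguously.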
I am trying to be semantically correct here in my web pages, but not sure how to proceed:
My data looks like this or needs to:
Name: lastname, first mi Address: 123 Main St. City, State, Zip: more...
Fieldx: data1 Fieldy: more data...
What I don't want is the regular table look with column headers across the top:
name Address
lastname, first mi Some address...
I'm not sure what to look up to do this. When I looked up tableless CSS, I only found forms and layouts.
Am I wrong to think I should be using the form layouts with CSS (and no tables), even though it's not really a "form"?
edit: do I just put everything inside a div and then in spans with float right?
I think DIVs are the way to go. DIVs give you much better flexibility when combined with CSS.
What you have is structurally a list of name/value pairs, which corresponds to a two-column table. It could also be marked up as a dl element, if we take the liberal modern interpretation that dl is not really a definition list but a description list, which in turn is effectively a list of name/value pairs. And it could also be marked up as a ul element where each li element contains two span elements (with classes), but then you lose the array-like idea in the structure. Finally, you could use some div or p container, containing span elements with alternating classes.
All of this has little to do with semantics (meaning). Rather, it is about structure. Instead of considering which of the approaches is more “correct”, consider which is most comfortable in styling (and possibly in processing, e.g. in client-side scripting). If you want tabular layout, using a table is natural. (Somewhat amusingly, if you don’t want such layout, then you probably should not use a table, because old versions of IE don’t let you style a table element in a non-tabular way.)
If you intend to have the data as inline text as in your example, then I would use
<div class=foo>
<span class=name>Name</span> <span class=value>lastname, first mi</span>
<span class=name>Address</span> <span class=value>123 Main St. </span>
...
</div>
This is a bit verbose markup, but it can be styled easily, since div and span have no default styling (except that div implies line breaks before and after), and you can conveniently use class selectors.
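If the markup is produced by a program rather than written by hand, the verbosity matters less. As a sketch, assuming the data is available as name/value pairs, a small Python helper could emit the structure above. `render_pairs` is a hypothetical name, and the class names are the ones from the example; attribute values are quoted here, which is equivalent to the unquoted form.

```python
from html import escape


def render_pairs(pairs, css_class="foo"):
    """Render (name, value) pairs as a div of name/value spans."""
    rows = [
        f'<span class="name">{escape(name)}</span> '
        f'<span class="value">{escape(value)}</span>'
        for name, value in pairs
    ]
    return f'<div class="{escape(css_class)}">\n' + "\n".join(rows) + "\n</div>"


print(render_pairs([
    ("Name", "lastname, first mi"),
    ("Address", "123 Main St."),
]))
```

Escaping the names and values keeps characters like & and < in the data from breaking the markup.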