selectors-list ::=
selector[:pseudo-class] [::pseudo-element]
[, selectors-list]
properties-list ::=
[property : value] [; properties-list]
I'm trying to get familiar with CSS and I'd be happy to understand the rules of reading these (from https://developer.mozilla.org/en-US/docs/Web/CSS/Reference).
This MDN page seems to use a pseudo-BNF notation to describe CSS syntax. The notation used is very confusing because [] :: : ; and , all mean something in CSS, yet they used [] to represent optional parts and ::= for grammar rule definition.
I can give you a rough English translation of what they meant:
style-rule ::=
selectors-list {
properties-list
}
A style-rule is made of a selectors-list followed by { followed by a properties-list followed by }.
selectors-list ::=
selector[:pseudo-class] [::pseudo-element]
[, selectors-list]
A selectors-list is made of a selector, optionally followed by : and a pseudo-class, optionally followed by :: and a pseudo-element, optionally followed by , and another selectors-list.
This definition is not only incorrect (you may use multiple pseudo-class in a row), but has a confusing name. If a pseudo-class and a pseudo-element are something separate from the selector, why would you call a list of all three a selectors-list?
Leaving that aside...
properties-list ::=
[property : value] [; properties-list]
A properties-list can be completely empty, or may contain a property followed by : followed by a value, optionally followed by ; and another properties-list.
And then, they don't even use their pseudo-BNF to define what is a selector, a pseudo-class, a pseudo-element, a property or a value. This whole notation is way more confusing than helpful. This MDN page should probably get rewritten.
Related
What characters/symbols are allowed within the CSS class selectors?
I know that the following characters are invalid, but what characters are valid?
~ ! # $ % ^ & * ( ) + = , . / ' ; : " ? > < [ ] \ { } | ` #
You can check directly at the CSS grammar.
Basically1, a name must begin with an underscore (_), a hyphen (-), or a letter(a–z), followed by any number of hyphens, underscores, letters, or numbers. There is a catch: if the first character is a hyphen, the second character must2 be a letter or underscore, and the name must be at least 2 characters long.
-?[_a-zA-Z]+[_a-zA-Z0-9-]*
In short, the previous rule translates to the following, extracted from the W3C specification:
In CSS, identifiers (including element names, classes, and IDs in
selectors) can contain only the characters [a-z0-9] and ISO 10646
characters U+00A0 and higher, plus the hyphen (-) and the underscore
(_); they cannot start with a digit, or a hyphen followed by a digit.
Identifiers can also contain escaped characters and any ISO 10646
character as a numeric code (see next item). For instance, the
identifier "B&W?" may be written as "B&W?" or "B\26 W\3F".
Identifiers beginning with a hyphen or underscore are typically reserved for browser-specific extensions, as in -moz-opacity.
1 It's all made a bit more complicated by the inclusion of escaped Unicode characters (that no one really uses).
2 Note that, according to the grammar I linked, a rule starting with two hyphens, e.g., --indent1, is invalid. However, I'm pretty sure I've seen this in practice.
To my surprise most answers here are wrong. It turns out that:
Any character except NUL is allowed in CSS class names in CSS. (If CSS contains NUL (escaped or not), the result is undefined. [CSS-characters])
Mathias Bynens' answer links to explanation and demos showing how to use these names. Written down in CSS code, a class name may need escaping, but that doesn’t change the class name. E.g. an unnecessarily over-escaped representation will look different from other representations of that name, but it still refers to the same class name.
Most other (programming) languages don’t have that concept of escaping variable names (“identifiers”), so all representations of a variable have to look the same. This is not the case in CSS.
Note that in HTML there is no way to include space characters (space, tab, line feed, form feed and carriage return) in a class name attribute, because they already separate classes from each other.
So, if you need to turn a random string into a CSS class name: take care of NUL and space, and escape (accordingly for CSS or HTML). Done.
I’ve answered your question in-depth at CSS character escape sequences. The article also explains how to escape any character in CSS (and JavaScript), and I made a handy tool for this as well. From that page:
If you were to give an element an ID value of ~!#$%^&*()_+-=,./';:"?><[]{}|`#, the selector would look like this:
CSS:
<style>
#\~\!\#\$\%\^\&\*\(\)\_\+-\=\,\.\/\'\;\:\"\?\>\<\[\]\\\{\}\|\`\#
{
background: hotpink;
}
</style>
JavaScript:
<script>
// document.getElementById or similar
document.getElementById('~!#$%^&*()_+-=,./\';:"?><[]\\{}|`#');
// document.querySelector or similar
$('#\\~\\!\\#\\$\\%\\^\\&\\*\\(\\)\\_\\+-\\=\\,\\.\\/\\\'\\;\\:\\"\\?\\>\\<\\[\\]\\\\\\{\\}\\|\\`\\#');
</script>
Read the W3C spec. (this is CSS 2.1; find the appropriate version for your assumption of browsers)
relevant paragraph:
In CSS, identifiers (including
element names, classes, and IDs in
selectors) can contain only the
characters [a-z0-9] and ISO 10646
characters U+00A1 and higher, plus the
hyphen (-) and the underscore (_);
they cannot start with a digit, or a
hyphen followed by a digit.
Identifiers can also contain escaped
characters and any ISO 10646 character
as a numeric code (see next item). For
instance, the identifier "B&W?" may be
written as "B&W?" or "B\26 W\3F".
As #mipadi points out in Kenan Banks's answer, there's this caveat, also in the same webpage:
In CSS, identifiers may begin with '-'
(dash) or '_' (underscore). Keywords
and property names beginning with '-'
or '_' are reserved for
vendor-specific extensions. Such
vendor-specific extensions should have
one of the following formats:
'-' + vendor identifier + '-' + meaningful name
'_' + vendor identifier + '-' + meaningful name
Example(s):
For example, if XYZ organization added
a property to describe the color of
the border on the East side of the
display, they might call it
-xyz-border-east-color.
Other known examples:
-moz-box-sizing
-moz-border-radius
-wap-accesskey
An initial dash or underscore is
guaranteed never to be used in a
property or keyword by any current or
future level of CSS. Thus typical CSS
implementations may not recognize such
properties and may ignore them
according to the rules for handling
parsing errors. However, because the
initial dash or underscore is part of
the grammar, CSS 2.1 implementers
should always be able to use a
CSS-conforming parser, whether or not
they support any vendor-specific
extensions.
Authors should avoid vendor-specific
extensions
The complete regular expression is:
-?(?:[_a-z]|[\200-\377]|\\[0-9a-f]{1,6}(\r\n|[ \t\r\n\f])?|\\[^\r\n\f0-9a-f])(?:[_a-z0-9-]|[\200-\377]|\\[0-9a-f]{1,6}(\r\n|[ \t\r\n\f])?|\\[^\r\n\f0-9a-f])*
So all of your listed characters, except “-” and “_” are not allowed if used directly. But you can encode them using a backslash foo\~bar or using the Unicode notation foo\7E bar.
For those looking for a workaround, you can use an attribute selector, for instance, if your class begins with a number. Change:
.000000-8{background:url(../../images/common/000000-0.8.png);} /* DOESN'T WORK!! */
to this:
[class="000000-8"]{background:url(../../images/common/000000-0.8.png);} /* WORKS :) */
Also, if there are multiple classes, you will need to specify them in selector or use the ~= operator:
[class~="000000-8"]{background:url(../../images/common/000000-0.8.png);}
Sources:
https://benfrain.com/when-and-where-you-can-use-numbers-in-id-and-class-names/
Is there a workaround to make CSS classes with names that start with numbers valid?
https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors
My understanding is that the underscore is technically valid. Check out:
https://developer.mozilla.org/en/underscores_in_class_and_id_names
"...errata to the specification published in early 2001 made underscores legal for the first time."
The article linked above says never use them, then gives a list of browsers that don't support them, all of which are, in terms of numbers of users at least, long-redundant.
I would not recommend to use anything except A-z, _- and 0-9, while it's just easier to code with those symbols. Also do not start classes with - while those classes are usually browser-specific flags. To avoid any issues with IDE autocompletion, less complexity when you may need to generate those class names with some other code for whatever reason. Maybe some transpiling software may not work, etc., etc.
Yet CSS is quite loose on this. You can use any symbol, and even emoji works.
<style>
.😭 {
border: 2px solid blue;
width: 100px;
height: 100px;
overflow: hidden;
}
</style>
<div class="😭">
😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅😅
</div>
We can use all characters in a class name. Even characters like # and .. We just have to escape them with \ (backslash).
.test\.123 {
color: red;
}
.test\#123 {
color: blue;
}
.test\#123 {
color: green;
}
.test\<123 {
color: brown;
}
.test\`123 {
color: purple;
}
.test\~123 {
color: tomato;
}
<div class="test.123">test.123</div>
<div class="test#123">test#123</div>
<div class="test#123">test#123</div>
<div class="test<123">test<123</div>
<div class="test`123">test`123</div>
<div class="test~123">test~123</div>
For HTML5 and CSS 3, classes and IDs can start with numbers.
Going off of Kenan Banks's answer, you can use the following two regex matches to make a string valid:
[^a-z0-9A-Z_-]
This is a reverse match that selects anything that isn't a letter, number, dash or underscore for easy removal.
^-*[0-9]+
This matches 0 or 1 dashes followed by 1 or more numbers at the beginning of a string, also for easy removal.
How I use it in PHP:
// Make alphanumeric with dashes and underscores (removes all other characters)
$class = preg_replace("/[^a-z0-9A-Z_-]/", "", $class);
// Classes only begin with an underscore or letter
$class = preg_replace("/^-*[0-9]+/", "", $class);
// Make sure the string is two or more characters long
return 2 <= strlen($class) ? $class : '';
I'm working on a parser for LiveScript language, and am having trouble with parsing both object property definition forms — key: value and (+|-)key — together. For example:
prop: "val"
+boolProp
-boolProp
prop2: val2
I have the key: value form working with this:
Expression ::= TestExpression
| ParenExpression
| OpExpression
| ObjDefExpression
| PropDefExpression
| LiteralExpression
| ReferenceExpression
PropDefExpression ::= Expression COLON Expression
ObjDefExpression ::= PropDefExpression (NEWLINE PropDefExpression)*
// ... other expressions
But however I try to add ("+"|"-") IDENTIFIER to PropDefExpression or ObjDefExpression, I get errors about using left recursion. What's the (right) way to do this?
The grammar fragment you posted is already left-recursive, i.e. without even adding (+|-)boolprop, the non-terminal 'Expression' derives a form in which 'Expression' reappears as the leftmost symbol:
Expression -> PropDefExpression -> Expression COLON Expression
And it's not just left-recursive, it's ambiguous. E.g.
Expression COLON Expression COLON Expression
can be derived in two different ways (roughly, left-associative vs right-associative).
You can eliminate both these problems by using something more restricted on the left of the colon, e.g.:
PropDefExpression ::= Identifier COLON Expression
Also, another ambiguity: Expression derives PropDefExpression in two different ways, directly and via ObjDefExpression. My guess is, you can drop the direct derivation.
Once you've taken care of those things, it seems to me you should be able to add (+|-)boolprop without errors (unless it conflicts with one of the other kinds of expression that you didn't show).
Mind you, looking at the examples at http://livescript.net, I'm doubtful how much of that you'll be able to capture in a conventional grammar. But if you're just going for a subset, you might be okay.
I don't know how much help this will be, because I know nothing about GrammarKit and not much more about the language you're trying to parse.
However, it seems to me that
PropDefExpression ::= Expression COLON Expression
is not quite accurate, and it is creating an ambiguity when you add the boolean property production because an Expression might start with a unary - operator. In the actual grammar, though, a property cannot start with an arbitrary Expression. There are two types of key-property definitions:
name : expression
parenthesized_expression : expression
(Which is to say, expressions need to start with a ().
That means that a boolean property definition, starting with + or - is recognizable from the first token, which is precisely the condition needed for successful recursive descent parsing. There are several other property definition syntaxes, including names and parenthesized_expressions not followed by a :
That's easy to parse with an LR(1) parser, like the one Jison produces, but to parse it with a recursive-descent parser you need to left-factor. (It's possible that GrammarKit can do this for you, by the way.) Basically, you'd need something like (this is not complete):
PropertyDefinition ::= PropertyPrefix PropertySuffix? | BooleanProperty
PropertyPrefix ::= NAME | ParenthesizedExpression
PropertySuffix ::= COLON Expression | DOT NAME
I encountered a css selector in a file like this:
#contactDetails ul li a, a[href^=tel] {....}
The circumflex character “^” as such has no defined meaning in CSS. The two-character operator “^=” can be used in attribute selectors. Generally, [attr^=val] refers to those elements that have the attribute attr with a value that starts with val.
Thus, a[href^=tel] refers to such a elements that have the attribute href with a value that starts with tel. It is probably meant to distinguish telephone number links from other links; it’s not quite adequate for that, since the selector also matches e.g. ... but it is probably meant to match only links with tel: as the protocol part. So a[href^="tel:"] would be safer.
a[href^="tel"]
(^) means it selects elements that have the specified attribute with a value beginning/starting exactly with a given string.
Here it selects all the 'anchor' elements the value of href attribute starting exactly with a string 'tel'
The carat "^" used like that will match a tags where the href starts with "tel" ( http://csscreator.com/content/attribute-selector-starts )
It means a tags whose href attribute begins with "tel"
Example:
This is a link
will match.
I'm reading the spec on attribute selectors, but I can't find anything that says if whitespace is allowed. I'm guessing it's allowed at the beginning, before and after the operator, and at the end. Is this correct?
The rules on whitespace in attribute selectors are stated in the grammar. Here's the Selectors 3 production for attribute selectors (some tokens substituted with their string equivalents for illustration; S* represents 0 or more whitespace characters):
attrib
: '[' S* [ namespace_prefix ]? IDENT S*
[ [ '^=' |
'$=' |
'*=' |
'=' |
'~=' |
'|=' ] S* [ IDENT | STRING ] S*
]? ']'
;
Of course, the grammar isn't terribly useful to someone looking to understand how to write attribute selectors, as it's intended for someone who's implementing a selector engine.
Here's a plain-English explanation:
Whitespace before the attribute selector
This isn't covered in the above production, but the first obvious rule is that if you're attaching an attribute selector to another simple selector or a pseudo-element, don't use a space:
a[href]::after
If you do, the space is treated as a descendant combinator instead, with the universal selector implied on the attribute selector and anything that may follow it. In other words, these selectors are equivalent to each other, but different from the above:
a [href] ::after
a *[href] *::after
Whitespace inside the attribute selector
Whether you have any whitespace within the brackets and around the comparison operator doesn't matter; I find that browsers seem to treat them as if they weren't there (but I haven't tested extensively). These are all valid according to the grammar and, as far as I've seen, work in all modern browsers:
a[href]
a[ href ]
a[ href="http://stackoverflow.com" ]
a[href ^= "http://"]
a[ href ^= "http://" ]
Whitespace is not allowed between the ^ (or other symbol) and = as these are treated as a single token, and tokens cannot be broken apart.
If IE7 and IE8 implement the grammar correctly, they should be able to handle them all as well.
If a namespace prefix is used, whitespace is not allowed between the prefix and the attribute name.
These are incorrect:
unit[sh| quantity]
unit[ sh| quantity="200" ]
unit[sh| quantity = "200"]
These are correct:
unit[sh|quantity]
unit[ sh|quantity="200" ]
unit[sh|quantity = "200"]
Whitespace within the attribute value
But notice the quotes around the attribute values above; if you leave them out, and you try to select something whose attribute has spaces in its value you have a syntax error.
This is incorrect:
div[class=one two]
This is correct:
div[class="one two"]
This is because an unquoted attribute value is treated as an identifier, which doesn't include whitespace (for obvious reasons), whereas a quoted value is treated as a string. See this spec for more details.
To prevent such errors, I strongly recommend always quoting attribute values, whether in HTML, XHTML (required), XML (required), CSS or jQuery (once required).
Whitespace after the attribute value
As of Selectors 4 (following the original publication of this answer), attribute selectors can accept flags in the form of an identifier appearing after the attribute value. Two flags have been defined pertaining to character case, one for case-insensitive matching:
div[data-foo="bar" i]
And one for case-sensitive matching (whose addition I had a part in, albeit by proxy of the WHATWG):
ol[type="A" s]
ol[type="a" s]
The grammar has been updated thus:
attrib
: '[' S* attrib_name ']'
| '[' S* attrib_name attrib_match [ IDENT | STRING ] S* attrib_flags? ']'
;
attrib_name
: wqname_prefix? IDENT S*
attrib_match
: [ '=' |
PREFIX-MATCH |
SUFFIX-MATCH |
SUBSTRING-MATCH |
INCLUDE-MATCH |
DASH-MATCH
] S*
attrib_flags
: IDENT S*
In plain English: if the attribute value is not quoted (i.e. it is an identifier), whitespace between it and attrib_flags is required; otherwise, if the attribute value is quoted then whitespace is optional, but strongly recommended for the sake of readability. Whitespace between attrib_flags and the closing bracket is optional as always.
This question already has answers here:
CSS attribute selectors: The rules on quotes (", ' or none?)
(2 answers)
Closed last year.
e.g.:
a[href="val"]
Does "val" need to have quotes around it? Single or double are acceptable? What about for integers?
TLDR: Quotes are required unless the value meets the identifier specification for CSS2.1
The CSS spec might say they are optional, but the real world presents a different story. When making a comparison against the href attribute you will need to use quotes (single or double work in my very limited testing - latest versions of FF, IE, Chrome.)
Interestingly enough the css spec link referenced by #Pekka happens to use quotes around their href-specific examples.
And it's not just due to non-alpha characters like the period or slashes that give this unique situation a quote requirement - using a partial match selector ~= doesn't work if you just use the "domain" in "domain.com"
Ok, every answer here is wrong (including my own previous answer.) The CSS2 spec didn't clarify whether quotes are required in the selector section itself, but the CSS3 spec does and quotes the rule as a CSS21 implementation:
http://www.w3.org/TR/css3-selectors/
Attribute values must be CSS identifiers or strings. [CSS21] The case-sensitivity of attribute names and values in selectors depends on the document language.
And here is the identifier info:
http://www.w3.org/TR/CSS21/syndata.html#value-def-identifier
In CSS, identifiers (including element names, classes, and IDs in selectors) can contain only the characters [a-zA-Z0-9] and ISO 10646 characters U+00A0 and higher, plus the hyphen (-) and the underscore (_); they cannot start with a digit, two hyphens, or a hyphen followed by a digit. Identifiers can also contain escaped characters and any ISO 10646 character as a numeric code (see next item). For instance, the identifier "B&W?" may be written as "B\&W\?" or "B\26 W\3F".
My answer seemed correct but that's because the '~=' is a white-space selector comparator so it will never match a partial string inside an href value. A '*=' comparator does work however. And a partial string like 'domain' does work for matching href='www.domain.com'. But checking for a full domain name would not work because it violates the identifier rule.
According to the examples in the CSS 2.1 specs, quotes are optional.
In the following example, the selector matches all SPAN elements whose "class" attribute has exactly the value "example":
span[class=example] { color: blue; }
Here, the selector matches all SPAN elements whose "hello" attribute has exactly the value "Cleveland" and whose "goodbye" attribute has exactly the value "Columbus":
span[hello="Cleveland"][goodbye="Columbus"] { color: blue; }
Numbers are treated like strings, i.e. they can be quoted, but they don't have to.
No, they don't have to have quotes, tough in order to avoid ambiguities many people do use quotes, which are needed if the value contains whitespace.
Either single or double quotes are fine, and integers will be treated the same way (css does not have a distinction between strings and integers).
See the examples in the spec.
They do not need to be quoted.
There is also no distinction between strings/doubles/integers. CSS isn't Turing-complete, let alone typed.