Why is extnValue in X.509 Extensions always encapsulated in an OCTET_STRING? - x509certificate

I'm curious, and I was not able to find an explanation so far.
In RFC 5280 Extensions define the following:
Extension ::= SEQUENCE {
extnID OBJECT IDENTIFIER,
critical BOOLEAN DEFAULT FALSE,
extnValue OCTET STRING
-- contains the DER encoding of an ASN.1 value
-- corresponding to the extension type identified
-- by extnID
}
What is the reason for defining the encapsulating OCTET_STRING for extnValue, instead of directly defining extnValue as the "DER encoding of an ASN.1 value corresponding to the extension type identified by extnID"?
Thank you.

Not an authoritative answer, but my thoughts are: this is because extension values may have arbitrary enclosing tags and can be defined in external modules.
Most extensions use SEQUENCE, but some do not: in the given example, Subject Key Identifier is just another OCTET_STRING, and Key Usage is a BIT_STRING. In the base type definition you have to use a fixed tag to represent variable content (ANY).
In addition, parsers may not know how to parse a particular extension, so they read it as an octet string without having to dig deeper if the extension type is unknown to the parser.
update 13.02.2023 (based on comments):
Regarding the type / tag, from my understanding, each different type can be easily identified by the leading tag byte, such as SEQUENCE=0x30 (tag number 0x10 with the constructed bit set), OCTET_STRING=0x04 or BIT_STRING=0x03
You cannot define the field with a variable tag, because that introduces type ambiguity. That is, an extnValue ANY field definition is not valid, because its type is indeterminate. When you define a type (in this case, the Extension type), all fields must have a deterministic tag.
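To make the tag discussion concrete, here is a rough Python sketch that reads a hand-assembled Key Usage extension and prints the tag bytes it finds. The `read_tlv` helper and the sample bytes are my own illustration (not taken from any library), and the helper assumes single-byte tags only.

```python
def read_tlv(data: bytes, offset: int = 0):
    """Read one DER tag-length-value triple; return (tag, value, next_offset).

    Assumption: single-byte tags, which covers every type used here.
    """
    tag = data[offset]
    length = data[offset + 1]
    offset += 2
    if length & 0x80:  # long form: low 7 bits give the number of length octets
        count = length & 0x7F
        length = int.from_bytes(data[offset:offset + count], "big")
        offset += count
    return tag, data[offset:offset + length], offset + length


# Hand-assembled sample (illustrative): a critical keyUsage extension.
extension = bytes.fromhex(
    "300e"        # Extension SEQUENCE, 14 content bytes
    "0603551d0f"  # extnID: OID 2.5.29.15 (keyUsage)
    "0101ff"      # critical: BOOLEAN TRUE
    "0404"        # extnValue: OCTET STRING wrapper, 4 content bytes
    "030205a0"    # wrapped DER: BIT STRING with 5 unused bits
)

outer_tag, body, _ = read_tlv(extension)

# Walk the fields of the Extension SEQUENCE.
fields = []
pos = 0
while pos < len(body):
    tag, value, pos = read_tlv(body, pos)
    fields.append((tag, value))

# extnValue is the last field; its content is itself a DER value.
wrapper_tag, wrapped = fields[-1]
inner_tag, inner_value, _ = read_tlv(wrapped)

print(hex(outer_tag), [hex(t) for t, _ in fields], hex(inner_tag))
```

The outer structure always shows the same three field tags, whatever the extension is; only the bytes inside the OCTET STRING vary in type.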

Related

Is "--" a valid CSS3 identifier?

According to the CSS Level 3 specification, for parsing the start of an identifier, you:
Check if three code points would start an identifier
Look at the first code point:
If the first character is -, then we have a valid identifier if:
The second code point is an identifier-start code point ([a-zA-Z_] or non-ASCII).
The second code point is -.
The second and third character form a valid escape.
Otherwise, we do not have a valid identifier start. After determining if we have a valid identifier start, the only requirements to have a valid <ident-token> is we have 0 or more of any combination of the following:
Escape tokens
ASCII letters
Digits
_ or -
Non-ASCII characters
Since we do not require any characters following an identifier start token, this would suggest that -- is a valid identifier, even if never supported by any browser or framework. However, even official CSS validation services (maintained by those that design the CSS specifications) do not consider this a valid identifier. Is this merely a bug in the validation service?
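For reference, the "would start an identifier" check listed above can be sketched in Python roughly as follows. The function names are mine, an empty string stands in for end of input, and the input is assumed to be already preprocessed so that "\n" is the only newline.

```python
import string


def is_ident_start(ch: str) -> bool:
    """ASCII letter, underscore, or any non-ASCII code point."""
    return ch != "" and (
        ch in string.ascii_letters or ch == "_" or ord(ch) >= 0x80
    )


def is_valid_escape(first: str, second: str) -> bool:
    """A backslash that is not followed by a newline."""
    return first == "\\" and second != "\n"


def would_start_identifier(text: str) -> bool:
    """Look at the first three code points only; "" stands in for end of input."""
    a, b, c = (text[i] if i < len(text) else "" for i in range(3))
    if a == "-":
        return is_ident_start(b) or b == "-" or is_valid_escape(b, c)
    if is_ident_start(a):
        return True
    return is_valid_escape(a, b)


for candidate in ["--", "-a", "_x", "-", "-1", "1a"]:
    print(repr(candidate), would_start_identifier(candidate))
```

Under this reading, "--" passes the start check via the second rule, which is the case the question is about.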
Yes it's valid and it works. It's the shortest custom property (aka CSS variable) that you can define:
body {
--:red;
background:var(--);
}
Related: Can a css variable name start with a number?
The -- custom property identifier is reserved for future use, but current browsers incorrectly treat it as a valid custom property.
See also
w3c/csswg-drafts#6313

What does "+" mean in HTTP Accept header?

How could I understand this record:
Accept: application/vnd.my.api+json
I mean, is this "+" symbol standardized (I have not found it in the spec), or is it just a convention?
Thanks.
The Accept header specifies a list of acceptable media types.
The "+xxx" part of the media type is called suffix. It is an augmentation to the media type definition and helps to specify the underlying structure of that media type.
RFC 6838, "4.2.8. Structured Syntax Name Suffixes" defines:
XML in MIME [RFC3023] defined the first such augmentation to the
media type definition to additionally specify the underlying
structure of that media type. To quote:
This document also standardizes a convention (using the suffix
'+xml') for naming media types ... when those media types
represent XML MIME (Multipurpose Internet Mail Extensions)
entities.
That is, it specified a suffix (in that case, "+xml") to be
appended to the base subtype name.
Since this was published, the de facto practice has arisen for
using this suffix convention for other well-known structuring
syntaxes. In particular, media types have been registered with
suffixes such as "+der", "+fastinfoset", and "+json". This
specification formalizes this practice and sets up a registry for
structured type name suffixes.
The primary guideline for whether a structured type name suffix is
registrable is that it be described by a readily available
description, preferably within a document published by an established
standards-related organization, and for which there's a reference
that can be used in a Normative References section of an RFC.
Media types that make use of a named structured syntax SHOULD use
the appropriate registered "+suffix" for that structured syntax
when they are registered. By the same token, media types MUST NOT
be given names incorporating suffixes for structured syntaxes they
do not actually employ. "+suffix" constructs for as-yet
unregistered structured syntaxes SHOULD NOT be used, given the
possibility of conflicts with future suffix definitions.
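As a small illustration of how a client or server might pull the suffix out of such a value, here is a Python sketch. The `split_media_type` helper is hypothetical, and it assumes the suffix is whatever follows the last "+" in the subtype and that parameters after ";" can be ignored.

```python
def split_media_type(value: str):
    """Return (type, subtype_without_suffix, suffix_or_None).

    Assumptions: parameters after ';' are ignored, and the structured
    syntax suffix is the part after the last '+' in the subtype.
    """
    essence = value.split(";", 1)[0].strip().lower()
    top, _, subtype = essence.partition("/")
    base, plus, suffix = subtype.rpartition("+")
    if not plus:  # no '+' present, so there is no suffix
        return top, subtype, None
    return top, base, suffix


print(split_media_type("application/vnd.my.api+json"))
print(split_media_type("text/html; charset=utf-8"))
```

A generic JSON tool could use the "json" suffix to decide it can at least parse the body, even without knowing the vendor-specific type.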

Bad production of 2.5.4.5 oid into X509IssuerName, change proposal

I noticed that during a XAdES signature with xades4j the X509IssuerName element presents a badly formatted issuer serial number value: it shows a hex-encoded PrintableString. I searched the xades4j code and found that the problem is in the DataGenBaseCertRefs class. If you set
cert.getIssuerX500Principal().getName(X500Principal.RFC1779)
in the generate method, you can resolve this problem and change the produced issuer value from this:
2.5.4.5=#130b3037393435323131303036
to this
OID.2.5.4.5=07945211006
I'm not sure that change is correct. XML-DSIG states that RFC 4514 should be used when encoding the distinguished names. Regarding the attribute type, on that RFC one reads:
If the AttributeType is defined to have a short name (...) that short name, a descr, is used. Otherwise the AttributeType is encoded as the dotted-decimal encoding, a numericoid, of its OBJECT IDENTIFIER.
In turn, numericoid is defined on RFC 4512 as follows:
numericoid = number 1*( DOT number )
Regarding the attribute value, one reads:
If the AttributeType is of the dotted-decimal form, the AttributeValue is represented by an number sign ('#' U+0023) character followed by the hexadecimal encoding of each of the octets of the BER encoding of the X.500 AttributeValue.
My understanding is that, since a short name was not known, the hex value should be used. What do you think?
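To see that the two forms carry the same data, here is a quick Python sketch that decodes the "#" hex form back into its string. The `decode_hex_value` helper is my own, and it only handles short-form lengths and a few string tags.

```python
# Tag byte -> text encoding, for the few string types handled in this sketch.
STRING_TAGS = {0x0C: "utf-8", 0x13: "ascii", 0x16: "ascii"}


def decode_hex_value(value: str):
    """Decode an RFC 4514 '#hex' attribute value into (tag, text).

    Assumptions: short-form length only, and the value is a UTF8String,
    PrintableString or IA5String.
    """
    if not value.startswith("#"):
        raise ValueError("not a hex-encoded attribute value")
    der = bytes.fromhex(value[1:])
    tag, length = der[0], der[1]
    if length & 0x80:
        raise ValueError("long-form lengths are not handled in this sketch")
    if tag not in STRING_TAGS:
        raise ValueError(f"unsupported tag 0x{tag:02x}")
    content = der[2:2 + length]
    return tag, content.decode(STRING_TAGS[tag])


tag, text = decode_hex_value("#130b3037393435323131303036")
print(hex(tag), text)  # 0x13 is PrintableString
```

So the hex form is just the BER encoding of the same PrintableString, which is what RFC 4514 asks for when the attribute type is written as a numeric OID.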
This actually makes me realize that xades4j is using RFC 2253, since it is the default on getName().
Are you also including a X509IssuerSerial element on KeyInfo/X509Data? Is that one different from the cert ref?
Can you send me, on another channel, a certificate with those characteristics for tests?

Use of typed URI in sesame sail openrdf

My question is simple but maybe nonsense (in that case, sorry to the people who are going to spend time explaining why).
I'd like to create a resource like this (I don't show the whole resource declaration here):
<owl:DatatypeProperty rdf:about="relation:isPartOf">
<rdfs:domain rdf:resource="http://www.w3.org/2004/02/skos/core#note"/>
<rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#anyURI"/>
</owl:DatatypeProperty>
<rdf:Description rdf:about="resource:context:sc#c1">
<skos:note rdf:datatype="relation:isPartOf" rdf:resource="resource:context:sc#c2">
</skos:note>
</rdf:Description>
The important part is the triple for the skos:note relation:
Subject: c1, a URI. Predicate: skos:note. Object: a typed URI.
My URI is not a plain URI but a "relation:isPartOf" URI.
I created a custom typedUri class to do that. I use a home-made triple store, so I can use my own class.
I changed the RDFXMWritter a little bit to output this example, so "it works".
My question is more: can a URI be typed like this? Why does Sesame OpenRDF not provide a TypedURI class? I'm sure there is a good reason. Any help, ideas or answers would be nice.
I'm quite sure my idea of creating a TypedURI class is wrong somewhere, but where? :-)
Thank you.
EDIT: the TypedURI is not really a new kind of resource. The URI in my context is still a URI. I just declare, inside my skos:note statement, that for c1 the object of the statement is data of type "relation:isPartOf" and that the range of the data is anyURI.
... The typedURI helps to implement the datatype with such a range.
First of all: no, a URI can not be typed like this in RDF. Which also answers your second question: OpenRDF Sesame does not provide this functionality because it is not part of the RDF model.
Typing of URIs (or more accurately, resources, which are identified using URIs) is done by using an rdf:type relation, linking the resource URI to a class URI. For example, to make the resource ex:p1 of type foaf:Person, we would say (using Turtle syntax for RDF):
ex:p1 rdf:type foaf:Person .
There's another kind of typing in RDF, namely datatyping. This only applies to literal values so it can not be used on a URI. It is used to make a literal value a string, an integer number, a date, etc.
Update: some confusion may arise because xsd:anyURI is a valid datatype in RDF, and it is (in XML Schema) defined to be a type for URIs. However, when using a datatype in RDF, its lexical space is always a literal (simply because the spec only allows for literals to actually have a datatype). So you could indeed do something like this (using Turtle syntax for literal notation):
"http://www.example.org/some/uri"^^xsd:anyURI
But from the point of view of the RDF model, this is not a URI, but a literal string (with datatype xsd:anyURI). So in a sense, yes, you can add types to URIs in RDF, but you can only do this by "converting" them to literals first.

ASN.1 Octet Strings

I'm decoding an X.509 certificate in ASN.1 format. I'm decoding it successfully and traversing the structure, but there is one thing that I don't understand.
There are some scenarios where I get an octet string, and this website that I am playing with (http://lapo.it/asn1js/) shows that these octet strings actually contain more of the ASN.1 tree. This website annotates such octet strings with (encapsulates).
My question is this: how do I know during parsing that an octet string actually encapsulates something more? Do I just try to parse it and check whether I get a tag and a valid length? If not, is it pure byte data? And if so, is it a valid sub-tree?
Or is this meant to be output as bytes, and the consumer should only try to parse it if they know that it is encoded data for certain keys?
Take the example that is already loaded on the site and hit "decode". I am referring for example to offset 332 which is an octet string that encapsulates a bit string.
This is what "extensions" looks like in ASN.1 speak (RFC 2459 §B.2 — I know that RFC is "obsolete", but that useful appendix isn't present in the later versions).
Extensions ::= SEQUENCE OF Extension
Extension ::= SEQUENCE {
extnId OBJECT IDENTIFIER,
critical BOOLEAN DEFAULT FALSE,
extnValue OCTET STRING }
Every extension payload is encapsulated within an OCTET STRING. The OID of the extension tells you what to expect within that octet string.
In the case of keyUsage it's a BIT STRING (§4.2.1.3).
And now I have an answer about my own question on subjectAltName, it's in §4.2.1.7.
One benefit of using OCTET STRING for the content is that, as per spec, unknown (non-critical) extensions can be identified as such and trivially be skipped over (though I think DER makes it trivial too).
And the way to tell ASN.1 tools to deal with that encapsulation is by using the keyword "CONTAINING". For example (this is not the actual/correct certificate spec, but it should give you an idea):
TstCert DEFINITIONS IMPLICIT TAGS ::=
BEGIN
Sun ::= SEQUENCE {
subjAltType OBJECT IDENTIFIER,
name GenNames
}
GenNames ::= SEQUENCE SIZE (1..5) OF GenName
GenName ::= CHOICE {
otherName [0] OtherName,
rfc822Name [1] UTF8String
}
OtherName ::= OCTET STRING (CONTAINING SEQUENCE {
type-id OBJECT IDENTIFIER,
value [0] EXPLICIT UTF8String
} )
END
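On the parsing side, the OID-driven approach described above could look roughly like this in Python. All helper names are mine, the sample Extensions value is hand-assembled, 1.2.3.4 is a made-up OID standing in for an unknown extension, and only single-byte tags are handled.

```python
def read_tlv(buf: bytes, pos: int = 0):
    """Return (tag, value, next_pos) for one DER element (single-byte tags)."""
    tag, length = buf[pos], buf[pos + 1]
    pos += 2
    if length & 0x80:  # long-form length
        count = length & 0x7F
        length = int.from_bytes(buf[pos:pos + count], "big")
        pos += count
    return tag, buf[pos:pos + length], pos + length


def iter_tlvs(buf: bytes):
    """Yield (tag, value) for each element packed back to back in buf."""
    pos = 0
    while pos < len(buf):
        tag, value, pos = read_tlv(buf, pos)
        yield tag, value


def decode_oid(content: bytes) -> str:
    """Turn OID content octets into dotted-decimal text."""
    arcs, value = [], 0
    for byte in content:
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:  # last octet of this sub-identifier
            arcs.append(value)
            value = 0
    first = min(arcs[0] // 40, 2)  # first two arcs share one sub-identifier
    return ".".join(map(str, [first, arcs[0] - 40 * first, *arcs[1:]]))


# Extensions this sketch knows how to look inside (illustrative table).
KNOWN = {"2.5.29.15": "keyUsage"}


def summarize(extensions_der: bytes):
    _, body, _ = read_tlv(extensions_der)
    result = []
    for _, ext_body in iter_tlvs(body):
        fields = list(iter_tlvs(ext_body))
        oid = decode_oid(fields[0][1])
        payload = fields[-1][1]  # extnValue is last; critical is optional
        if oid in KNOWN:
            inner_tag, inner_value, _ = read_tlv(payload)
            result.append((KNOWN[oid], inner_tag, inner_value.hex()))
        else:
            # Unknown extension: keep the raw bytes, do not guess at structure.
            result.append((oid, None, payload.hex()))
    return result


sample = bytes.fromhex(
    "301b"
    "300e0603551d0f0101ff0404030205a0"  # keyUsage, critical, BIT STRING inside
    "300906032a03040402abcd"            # made-up OID 1.2.3.4, opaque payload
)
print(summarize(sample))
```

The point is that the parser never probes the octet string to see whether it "looks like" ASN.1; it only descends when the OID says what type to expect.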
