I have an Asset entity with a field called symbol. This field can basically contain any human-readable string, including special symbols.
I'd like to generate a URL with this symbol as a parameter, but without it being escaped.
For instance I have an Asset with symbol $, but it's being generated as assets/%24
I need to be able to generate it in the Twig template without escaping these characters.
I'm using Symfony 5.
$ is a reserved character, as specified in RFC 2396:
2.2. Reserved Characters
Many URI include components consisting of or delimited by, certain
special characters. These characters are called "reserved", since
their usage within the URI component is limited to their reserved
purpose. If the data for a URI component would conflict with the
reserved purpose, then the conflicting data must be escaped before
forming the URI.
reserved = ";" | "/" | "?" | ":" | "#" | "&" | "=" | "+" |
"$" | ","
If you don't mind deviating from this recommendation, you could URL-decode the generated URL by creating a custom Twig filter and using it like this:
{{ asset(...)|urldecode }}
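The effect of encoding and decoding can be illustrated outside Twig. The sketch below uses Python's standard urllib.parse purely to show the principle; it is not Symfony's or Twig's API, and the "assets/" prefix is only taken from the example in the question.

```python
from urllib.parse import quote, unquote

symbol = "$"

# Percent-encode every character that is not unreserved
# (safe="" also removes the default exemption for "/").
encoded_path = "assets/" + quote(symbol, safe="")

# Declare "$" as safe so it is emitted literally.
literal_path = "assets/" + quote(symbol, safe="$")

# Decode an already generated URL, the same idea as a urldecode filter.
decoded_path = unquote(encoded_path)

print(encoded_path)  # assets/%24
print(literal_path)  # assets/$
print(decoded_path)  # assets/$
```

Decoding a whole URL after the fact also decodes characters such as %2F or %3F, which can change how the URL is parsed, so exempting only the characters you need is usually the safer of the two options.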
So I know you can separate your parameters in a query string through a couple different characters
(e.g. www.example.com?foo=1&bar=2 or www.example.com?foo=1;bar=2)
Are there any characters other than ';' and '&' that can be used to separate query parameters? Is it just general coding practice to use ';' or '&' or are there some regulations that list which characters I can use? I know in RFC 3986 the reserved characters include
";" | "/" | "?" | ":" | "#" | "&" | "=" | "+" | "$" | ","
So does this mean that any of these characters can be used to separate query parameters?
The format of the query string contents isn't part of RFC 3986 (URIs) or RFC 723* (HTTP); it's a side effect of how HTML forms work.
So if your code needs to work with HTML forms, you are restricted to what browsers do. Otherwise, in theory, you can use any format you want, as long as it's consistent with RFC 3986's definition of the "query" component.
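Because the separator is a convention of the parser rather than of the URI syntax, it has to be chosen on the parsing side. A minimal sketch with Python's urllib.parse, assuming Python 3.9.2 or later (where parse_qsl accepts a separator argument and no longer splits on ";" by default):

```python
from urllib.parse import parse_qsl

# Default behaviour: only "&" separates the pairs.
amp_pairs = parse_qsl("foo=1&bar=2")

# ";" is accepted only when it is requested explicitly.
semi_pairs = parse_qsl("foo=1;bar=2", separator=";")

# Without the separator argument, ";" is just part of the value.
unsplit = parse_qsl("foo=1;bar=2")

print(amp_pairs)   # [('foo', '1'), ('bar', '2')]
print(semi_pairs)  # [('foo', '1'), ('bar', '2')]
print(unsplit)     # [('foo', '1;bar=2')]
```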
^[^0-9]+$
My requirements are below:
Allow only letters ([a-z] and [A-Z])
Do not allow numbers
The following values are not allowed:
Test123
12345Test
1Test12345678
T1E2S3T
1234
InfoPath shows "Invalid pattern".
The following custom pattern works for me:
Field Name | does not match pattern | [a-zA-Z]([ ]*[a-zA-Z])*
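The pattern can be checked against the listed values outside InfoPath. This is a small sketch in Python, assuming the pattern has to match the whole value (hence fullmatch); the accepted samples "Test" and "Hello World" are made up for illustration.

```python
import re

# The custom pattern from above: letters, optionally separated by spaces.
LETTERS_ONLY = re.compile(r"[a-zA-Z]([ ]*[a-zA-Z])*")

def is_valid(value: str) -> bool:
    # fullmatch requires the entire value to fit the pattern.
    return LETTERS_ONLY.fullmatch(value) is not None

rejected = ["Test123", "12345Test", "1Test12345678", "T1E2S3T", "1234"]
accepted = ["Test", "Hello World"]  # hypothetical valid inputs

for value in rejected + accepted:
    print(value, is_valid(value))
```

Note that ^[^0-9]+$ only excludes digits, so it would still accept punctuation such as "Test!", which the letters-only pattern rejects.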
I am trying to find words in a file that contain the word "file" and do not contain hyphens. It looks correct to me, but my output shows all the words with a hyphen and the word "file". My path name is /folder/file.txt.
cat /folder/file.txt | grep file[!-]*
! isn't the right operator for negation. You need ^.
file[!-]*
will match any string containing the word 'file' and zero or more instances of '!' or '-'.
So basically - anything with the word 'file' in it. If you want to negate a character class, you need to use ^. But the * then allows for zero of the 'not patterns'.
If the dash is immediately after the word file then:
file[^-]
will match:
file1243
somefilefilea
but not:
file-1234
I think what you may be missing from your pattern is that * allows you to ignore part of the pattern.
^file[^-]*$
might do what you're after?
https://www.regex101.com/ will let you test regular expressions.
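The three patterns discussed above can be compared side by side. The sketch below uses Python's re module rather than grep, which behaves the same way for these simple expressions; the sample strings are the ones from the answer.

```python
import re

lines = ["file1243", "somefilefilea", "file-1234"]

def matching(pattern, candidates):
    # Keep every candidate in which the pattern is found anywhere.
    return [s for s in candidates if re.search(pattern, s)]

# "[!-]*" means zero or more "!" or "-", so it never rejects anything.
any_file = matching(r"file[!-]*", lines)

# "file" must be followed by one character that is not a hyphen.
not_dash_next = matching(r"file[^-]", lines)

# The whole line must start with "file" and contain no hyphen after it.
whole_line = matching(r"^file[^-]*$", lines)

print(any_file)       # ['file1243', 'somefilefilea', 'file-1234']
print(not_dash_next)  # ['file1243', 'somefilefilea']
print(whole_line)     # ['file1243']
```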
What characters are allowed in an URL query string?
Do query strings have to follow a particular format?
Per https://www.rfc-editor.org/rfc/rfc3986
In section 2.2 Reserved Characters, the following characters are listed:
reserved = gen-delims / sub-delims
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";"
/ "="
The spec then says:
If data for a URI component would conflict with a reserved character’s
purpose as a delimiter, then the conflicting data must be
percent-encoded before the URI is formed.
Next, in section 2.3 Unreserved Characters, the following are listed:
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
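The character classes quoted from RFC 3986 can be written down as plain sets. The following is a minimal sketch in Python; the classify helper and its return labels are hypothetical names chosen for this illustration.

```python
import string

# Character classes from RFC 3986, sections 2.2 and 2.3.
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def classify(ch: str) -> str:
    """Return the RFC 3986 class of a single character."""
    if ch in UNRESERVED:
        return "unreserved"
    if ch in GEN_DELIMS:
        return "gen-delim"
    if ch in SUB_DELIMS:
        return "sub-delim"
    # Anything else (space, "<", non-ASCII, ...) must be percent-encoded.
    return "other"

for ch in "$@~ ":
    print(repr(ch), classify(ch))
```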
Wikipedia has your answer: http://en.wikipedia.org/wiki/Query_string
"URL Encoding: Some characters cannot be part of a URL (for example, the space) and some other characters have a special meaning in a URL: for example, the character # can be used to further specify a subsection (or fragment) of a document; the character = is used to separate a name from a value. A query string may need to be converted to satisfy these constraints. This can be done using a schema known as URL encoding.
In particular, encoding the query string uses the following rules:
Letters (A-Z and a-z), numbers (0-9) and the characters '.','-','~' and '_' are left as-is
SPACE is encoded as '+' or %20
All other characters are encoded as %HH hex representation with any non-ASCII characters first encoded as UTF-8 (or other specified encoding)
The octet corresponding to the tilde ("~") character is often encoded as "%7E" by older URI processing implementations; the "%7E" can be replaced by "~" without changing its interpretation.
The encoding of SPACE as '+' and the selection of "as-is" characters distinguishes this encoding from RFC 1738."
Regarding the format, query strings are name value pairs. The ? separates the query string from the URL. Each name value pair is separated by an ampersand (&) while the name (key) and value is separated by an equals sign (=). eg. http://domain.com?key=value&secondkey=secondvalue
Under Structure in the Wikipedia reference I provided:
The question mark is used as a separator and is not part of the query string.
The query string is composed of a series of field-value pairs
Within each pair, the field name and value are separated by an equals sign, '='.
The series of pairs is separated by the ampersand, '&' (or semicolon, ';' for URLs embedded in HTML and not generated by a ...; see below).
W3C recommends that all web servers support semicolon separators in addition to ampersand separators[6] to allow application/x-www-form-urlencoded query strings in URLs within HTML documents without having to entity escape ampersands.
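The encoding rules and the name=value structure described above can be tried out with Python's urllib.parse. This is only an illustration of the conventions; the key and value names come from the example URL, with a space added to one value to show the two encodings of SPACE.

```python
from urllib.parse import parse_qs, quote, quote_plus, urlencode

params = {"key": "value", "secondkey": "second value"}

# urlencode joins pairs with "&" and name/value with "=";
# it encodes SPACE as "+" (form encoding).
query = urlencode(params)
url = "http://domain.com?" + query

# Parsing reverses the process; each name maps to a list of values.
parsed = parse_qs(query)

space_as_percent = quote("a b")    # generic percent-encoding
space_as_plus = quote_plus("a b")  # form-style encoding
non_ascii = quote("é")             # encoded as UTF-8 first

print(url)               # http://domain.com?key=value&secondkey=second+value
print(parsed)            # {'key': ['value'], 'secondkey': ['second value']}
print(space_as_percent)  # a%20b
print(space_as_plus)     # a+b
print(non_ascii)         # %C3%A9
```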
This link has the answer and formatted values you all need.
https://perishablepress.com/url-character-codes/
For your convenience, this is the list:
< %3C
> %3E
# %23
% %25
{ %7B
} %7D
| %7C
\ %5C
^ %5E
~ %7E
[ %5B
] %5D
` %60
; %3B
/ %2F
? %3F
: %3A
@ %40
= %3D
& %26
$ %24
+ %2B
" %22
space %20
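Each code in the list is simply "%" followed by the character's byte value in hexadecimal, so the table can be reproduced rather than memorised. A small sketch in Python; percent_code is a hypothetical helper name, not a library function.

```python
def percent_code(ch: str) -> str:
    """Return the percent-encoded form of a character (UTF-8 bytes as %HH)."""
    return "".join(f"%{byte:02X}" for byte in ch.encode("utf-8"))

for ch in ['<', '#', '@', '$', '"', ' ']:
    print(repr(ch), percent_code(ch))
```

Unlike a URL-encoding library function, this helper encodes every character it is given, including unreserved ones such as "~", which is why it matches the table above.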
I have made a custom XHTML validator in .NET (validating through a DTD plus some extra rules) and I have noticed a discrepancy between my validation and the W3C validation.
In my validator I get the following error when there is colon in the id (let's say : id="mustang:horse")
(Error) The 'id' attribute has an invalid value according to its data type.
But I do not get any errors on w3c for this pattern.
I tried to find a list of invalid characters for an attribute in XML/XHTML but couldn't find one.
Thank you for your help,
There is a list, and it does permit colons.
The XHTML 1.0 spec says at http://www.w3.org/TR/xhtml1/#h-4.10
... in XHTML 1.0 the id attribute is defined to be of type ID ...
The XML 1.0 spec says at http://www.w3.org/TR/2008/REC-xml-20081126/#id
Values of type ID MUST match the Name production.
And the Name production is defined at http://www.w3.org/TR/2008/REC-xml-20081126/#NT-Name
[4] NameStartChar ::= ":" |
[A-Z] | "_" | [a-z] | [#xC0-#xD6] |
[#xD8-#xF6] | [#xF8-#x2FF] |
[#x370-#x37D] | [#x37F-#x1FFF] |
[#x200C-#x200D] | [#x2070-#x218F] |
[#x2C00-#x2FEF] | [#x3001-#xD7FF] |
[#xF900-#xFDCF] | [#xFDF0-#xFFFD] |
[#x10000-#xEFFFF]
[4a] NameChar ::= NameStartChar | "-" | "." |
[0-9] | #xB7 | [#x0300-#x036F] |
[#x203F-#x2040]
[5] Name ::= NameStartChar (NameChar)*
And also says above this formal definition:
Document authors are encouraged to use
names which are meaningful words or
combinations of words in natural
languages, and to avoid symbolic or
white space characters in names. Note
that COLON, HYPHEN-MINUS, FULL STOP
(period), LOW LINE (underscore), and
MIDDLE DOT are explicitly permitted.
(My emphasis)
The reason for this difference is that W3C validator doesn't seem to do namespace aware XHTML processing. Although XHTML documents need to be in the XHTML namespace, this is actually reasonable, because HTML documents are not using namespaces and the normative valid structure of XHTML documents (as HTML) is defined by a DTD file and DTDs are not actually namespace aware.
Like @Alochi already noted:
Values of type ID MUST match the Name
production.
This is true when the document is parsed as not namespace aware, but it is not true if the document needs to be namespace conformant. The Namespaces in XML specification states that IDs must match the NCName production, which explicitly forbids the colon character. Namespace-aware parsing is a common convention, and therefore using a colon in the value of an id is not recommended, even though it is allowed when the document parsing is not namespace aware.
Summary: if namespaces are ignored, an ID value must be a valid Name and it can contain a colon; otherwise it must be a valid NCName and it can't contain a colon.
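The difference between the two productions can be sketched as a pair of checks. The following Python transcribes the NameStartChar and NameChar ranges quoted above into a regular expression; is_name and is_ncname are hypothetical helper names, and treating NCName as "a Name without a colon" follows the definition in Namespaces in XML.

```python
import re

# NameStartChar ranges from the XML 1.0 Name production.
NAME_START = (
    ":A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D\u2070-\u218F"
    "\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD"
    "\U00010000-\U000EFFFF"
)
# NameChar adds "-", ".", digits, middle dot and two more ranges.
NAME_CHAR = NAME_START + "\\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"

NAME_RE = re.compile(f"[{NAME_START}][{NAME_CHAR}]*")

def is_name(value: str) -> bool:
    """True if value matches the XML 1.0 Name production."""
    return NAME_RE.fullmatch(value) is not None

def is_ncname(value: str) -> bool:
    """True if value is a Name that contains no colon."""
    return ":" not in value and is_name(value)

print(is_name("mustang:horse"))    # True
print(is_ncname("mustang:horse"))  # False
print(is_ncname("mustang-horse"))  # True
```

A validator that reports the colon as an error is therefore applying the NCName rule, while one that accepts it is applying the plain Name rule.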