How should I encode dictionaries into HTTP GET query strings? - http

An HTTP GET query string is a ordered sequence of key/value pairs:
?spam=eggs&spam=ham&foo=bar
Is, with certain semantics, equivalent to the following dictionary:
{'spam': ['eggs', 'ham'], 'foo': bar}
This happens to work well for boolean properties of that page being requested:
?expand=1&expand=2&highlight=7&highlight=9
{'expand': [1, 2], 'highlight': [7, 9]}
If you want to stop expanding the element with id 2, just pop it out of the expand value and urlencode the query string again. However, if you've got a more modal property (with 3+ choices), you really want to represent a structure like so:
{'highlight_mode': {7: 'blue', 9: 'yellow'}}
Where the values to the corresponding id keys are part of a known enumeration. What's the best way to encode this into a query string? I'm thinking of using a sequence of two-tuples like so:
?highlight_mode=(7,blue)&highlight_mode=(9,yellow)
Edit: It would also be nice to know any names that associate with the conventions. I know that there may not be any, but it's nice to be able to talk about something specific using a name instead of examples. Thanks!

The usual way is to do it like this:
highlight_mode[7]=blue&highlight_mode[9]=yellow
AFAIR, quite a few server-side languages actually support this out of the box and will produce a nice dictionary for these values.

I've also seen people JSON-encode the nested dictionary, then further encode it with BASE64 (or something similar), then pass the whole resulting mess as a single query string parameter.
Pretty ugly.
On the other hand, if you can get away with using POST, JSON is a really good way to pass this kind of information back and forth.

In many Web frameworks it's encoded differently from what you say.
{'foo': [1], 'bar': [2, 3], 'fred': 4}
would be:
?foo[]=1&bar[]=2&bar[]=3&fred=4
The reason array answers should be different from plain answers is so the decoding layer can automatically tell the less common foo case (array which just happens to have a single element) from extremely common fred case (single element).
This notation can be extrapolated to:
?highlight_mode[7]=blue&highlight_mode[9]=yellow
when you have a hash, not just an array.
I think this is pretty much what Rails and most frameworks which copy from Rails do.
Empty arrays, empty hashes, and lack of scalar value look identical in this encoding, but there's not much you can do about it.
This [] seems to be causing just a few flamewars. Some view it as unnecessary, because the browser, transport layer, and query string encoder don't care. The only thing that cares is automatic query string decoder. I support the Rails way of using []. The alternative would be having separate methods for extracting a scalar and extracting an array from querystring, as there's no automatic way to tell when program wants [1] when it wants 4.

This piece of code works for me with Python Backend-
import json, base64
param={
"key1":"val1",
"key2":[
{"lk1":"https://www.foore.in", "lk2":"https://www.foore.in/?q=1"},
{"lk1":"https://www.foore.in", "lk2":"https://www.foore.in/?q=1"}
]
}
encoded_param=base64.urlsafe_b64encode(json.dumps(param).encode())
encoded_param_ready=str(encoded_param)[2:-1]
#eyJrZXkxIjogInZhbDEiLCAia2V5MiI6IFt7ImxrMSI6ICJodHRwczovL3d3dy5mb29yZS5pbiIsICJsazIiOiAiaHR0cHM6Ly93d3cuZm9vcmUuaW4vP3E9MSJ9LCB7ImxrMSI6ICJodHRwczovL3d3dy5mb29yZS5pbiIsICJsazIiOiAiaHR0cHM6Ly93d3cuZm9vcmUuaW4vP3E9MSJ9XX0=
#In JS
var decoded_params = atob(decodeURI(encoded_param_ready));

Related

Dynamic Receiver in Powermail (6.0.0) based on String

TYPO3 8.7.x, Powermail 6.0.0
I'd like to override the receiver based on two subjects they can select.
Now I know that this works just fine:
[globalString = GP:tx_powermail_pi1|field|konsilbereich = 5]
But this does not seem to work:
[globalString = GP:tx_powermail_pi1|field|konsilbereich = "Some phrase"]
I read that some workarounds were to have a hidden field that's been filled by Javascript upon a users selection and instead of the actual field, the hidden field gets submitted. But that is not an option for us.
I checked the docs as well as many support forums but could not find a good answer to this.
Is this not possible, or if, how would I accomplish that I can use an actual string within the comparison?
The problem is a comparisation with a string in TypoScript conditions. Strings could have space, special characters or umlauts. That's why TypoScript works best with integers.
Two possibilities come into my mind for your case:
1) Building an own conditition is quite simple in TYPO3 (see https://docs.typo3.org/typo3cms/TyposcriptReference/latest/Conditions/Reference.html#custom-conditions for an easy example)
2) Use an integer together with GP: - but then I would use a selectbox with a text as label and a number as value

extract elasticsearch date from a <start-date>/<duration> XBRL-JSON format

I am storing XBRL JSON using elasticsearch.
This xBRL-JSON OIM spec describes the oim:period property:
Otherwise, an ISO 8601 time interval representing the {interval}
property, expressed in one of the following forms:
<start>/<end>
<start>/<duration>
<duration>/<end>
Where <start> and <end> are valid according to the xsd:dateTime datatype, and <duration> is valid according to xsd:duration.
Examples from arelle's plugin look like this:
2016-01-01T00:00:00/PT0S
2015-01-01T00:00:00/P1Y
I notice that arelle's plugin exclusively produces this format:
<start>/<duration>
My question
Is there a way to save at least the <start> part as a date type in elasticsearch?
Ideas I had:
elastichsearch only (my preference)
Use a custom date format which anticipates the /<duration> part, but ignores it
I haven't checked Joda yet; will it ignore characters in the date format if they aren't part of the special character? Like the "/" delimiter or the "P" which precedes any duration value (like PT0S and P1Y above)?
EDIT So the single-quote character escapes literals; this works yyyy'/P' will accept a value '2015/P'. However, the rest of the duration could be more dynamic
Re: dynamic; will Joda accept regex or wildcard character like "\d" or "+" qualifier so I can ignore all the possible variations following the P?
Use a character filter to strip out the /<duration> part before saving only <start>as datetime. But I don't know if character filters happen before saving as type: date. If they don't, the '/`part isn't stripped, and I wouldn't be passing valid date strings.
Don't use date type: Use a pattern tokenizer to split on /, and at least the two parts will be saved as separate tokens. Can't use date math, though.
Use a transformation; although it seems like this is deprecated. I read about using copy_to instead, but that seems to combine terms, and I want to break this term apart
Some sort of plugin? Maybe a plugin which will fully support this "interval" datatype described by the OIM spec... maybe a plugin which will store its separate parts...?
change my application (I prefer to use elasticsearch-only techniques if possible)
I could edit this plugin or produce my own plugin which uses exclusively <start> and <end> parts, and saves both into separate fields;
But this breaks the OIM spec, which says they should be combined in a single field
Moreover it can be awkward to express an "instant" fact (with no duration; the PT0S examples above); I guess I just use the same value for end property as start property... Not more awkward than a 0-length duration (PT0S) I guess.
Not a direct answer, but it's worth noting that the latest internal drafts of the xBRL-JSON specification have moved away from the the single-field representation. Although the "/" separated notation is an ISO standard, tool support for it appears to be extremely poor, and so the working group has chosen to switch to separate fields for start and end dates. I would expect Arelle support to follow suit in due course.

Empty URI query string parameters: "a=&b=" versus "a&b"

Should the following URLs be considered functionally equivalent?
http://example.com/foo?a=&b=
http://example.com/foo?a&b
This came about when a user of a Drupal module I wrote which parses apart and then rewrites URIs noticed that the code sometimes causes the query string parts to change in unexpected ways due to how some of the underlying PHP functions behave. For example:
parse_str("a&b", $values); print http_build_query($values);
a=&b=
Is this something I should bother worrying about?
Edit so SO stops complaining that this question is similar to another one: The question is whether it's safe to assume that "no value for X" and "empty value for X" are equivalent, not whether the "no value" style is syntactically correct (which it is).
RFC 3986 Uniform Resource Identifier (URI): Generic Syntax doesn't have anything to say about the structure of the query string aside from how characters like ? should be dealt with. So strictly speaking, your two example URLs are different. Of course, the application which receives those query strings may treat them as functionally equivalent, but this isn't something you can determine from the URL alone.
As per RFC6570 empty query parameters are allowed. Please refer to section 3.2.9
Example Template Expansion
{&x,y,empty} &x=1024&y=768&empty=

A Minor, but annoying niggle - Why does ASP.Net set SQL Server Guids to lowercase?

I'm doing some client-side stuff with Javascript/JQuery with .Net controls which expose their GUID/UniqueIdentifier IDs on the front end to allow them to be manipulated. During debugging something is driving me crazy: The GUIDs in the db are stored in uppercase, however by the time they make it to the front end they're in lowercase.
This means I can't quickly copy and paste IDs into the browser's console to execute JS on the fly when devving/debugging. I have found a just-about-workable way of doing this but I was wondering if anyone knew why this behaviour is the case and whether there is any way of forcing GUIDs to stay uppercase.
According to MSDN docs the Guid.ToString() method will produce lowercase string.
As to why it does that - apparently RFC 4122 states it should be this way.
The hexadecimal values "a" through "f" are output as lower case characters and are case insensitive on input.
Also check this question on SO - net-guid-uppercase-string-format.
So the best thing you can do is to call ToUpper() on your GUID strings, and add extension method as showed in the other answer.
If you're using an Eval template, then I'd see if you can do this via an Extension method.
something like
public static string ToUpperString(this Guid guid, string format = "")
{
string output = guid.ToString(format);
return output.ToUpper();
}
And then in your Eval block,
myGuid.ToUpperString("B")
Or however you need it to look.
I'm on my Mac at the moment so I can't test that, but it should work if you've got the right .Net version.

ASP.NET Localization with singular and plural

Can I with ASP.NET Resources/Localization translate strings to depend on one or other (the English grammar) in a easy way, like I pass number 1 in my translate, it return "You have one car" or with 0, 2 and higher, "You have %n cars"?
Or I'm forced to have logic in my view to see if it's singular or plural?
JasonTrue wrote:
To the best of my knowledge, there isn't a language that requires something more complicated than singular/plural
Such languages do exist. In my native Polish, for example, there are three forms: for 1, for 2-4 and for zero and numbers greater than 4. Then after you reach 20, the forms for 21, 22-24 and 25+ are again different (same grammatical forms as for numerals 0-9). And yes, "you have 0 things" sounds awkward, because you're not likely to see that used in real life.
As a localization specialist, here's what I would recommend:
If possible, use forms which put the numeral at the end:
a: Number of cars: %d
This means the form of the noun "car" does not depend on the numeral, and 0 is as natural as any other number.
If the above is not acceptable, at least always make a complete sentence a translatable resource. That is, do use
b: You have 1 car.
c: You have %d cars.
But never split such units into smaller fragments such as
d: You have
e: car(s)
(Then somewhere else you have a non-localizable resource such as %s %d %s)
The difference is that while I cannot translate (c) directly, since the form of the noun will change, I can see the problem and I can change the sentence to form (a) in translation.
On the other hand, when I am faced with (d) and (e) fragments, there is no way to make sure the resulting sentence will be grammatical. Again: using fragments guarantees that in some languages the translation will be anything from grammatically awkward to completely broken.
This applies across the board, not just to numerals. For example, a fragment such as %s was deleted is also untranslatable, since the form of the verb will depend on the gender of the noun, which is unavailable here. The best I can do for Polish is the equivalent of Deleted: %s, but I can only do it as long as the %s placeholder is included in the translatable resource. If all I have is "was deleted" with no clue to the referent noun, I can only startle my dog by cursing aloud and in the end I still have to produce garbage grammar.
I've been working on a library to assist with internationalization of an application. It's called SmartFormat and is open-source on GitHub.
It contains "grammatical number" rules for many languages that determine the correct singular/plural form based on the locale. When translating a phrase that has words that depend on a quantity, you specify the multiple forms and the formatter chooses the correct form based on these rules.
For example:
var message = "There {0:is:are} {0} {0:item:items} remaining.";
var output = Smart.Format(culture, message, items.Count);
It has a very similar syntax to String.Format(...), but has tons of great features that make it easy to create natural and grammatically-correct messages.
It also deals with possible gender-specific formatting, lists, and much more.
moodforaday wrote:
This applies across the board, not just to numerals. For example, a fragment such as "%s was deleted" is also untranslatable, since the form of the verb will depend on the gender of the noun, which is unavailable here. The best I can do for Polish is the equivalent of "Deleted: %s", but I can only do it as long as the %s placeholder is included in the translatable resource. If all I have is "was deleted" with no clue to the referent noun, I can only startle my dog by cursing aloud and in the end I still have to produce garbage grammar.
The point I would like to make here, is never include a noun as a parameter. Many European languages modify the noun based on its gender, and in the case of German, whether it is the subject, object, or indirect object. Just use separate messages.
Instead of, "%s was deleted.", use a separate translation string for each type:
"The transaction was deleted."
"The user was deleted." etc.
This way, each string can be translated properly.
The solution to handle plural forms in different languages can be found here:
http://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html
While you can't use the above software in a commercial application (it's covered under GPL), you can use the same concepts. Their plural form expressions fit perfectly with lambda expressions in .NET. There are a limited number of plural form expressions that cover a large number of languages. You can map a particular language to a lambda expression that calculates which plural form to use based on the language. Then, look up the appropriate plural form in a .NET resx file.
There is nothing built in, we ended up coding something like this:
Another solution can be:
use place holders like {CAR} in the resource file strings
Maintain separate resource entries for singular and plural words for "car" :
CAR_SINGULAR car
CAR_PLURAL cars
Develop a class with this logic:
class MyResource
{
private List<string> literals = new List<string>();
public MyResource() { literals.Add("{CAR}") }
public static string GetLocalizedString(string key,bool isPlural)
{
string val = rm.GetString(key);
if(isPlural)
{
val = ReplaceLiteralWithPlural(val);
}
else
{
val = ReplaceLiteralWithSingular(val);
}
}
}
string ReplaceLiteralWithPlural(string val)
{
StringBuilder text = new StringBuilder(val)
foreach(string literal in literals)
{
text = text.Replace(literal,GetPluralKey(literal));
}
}
string GetPluralKey(string literal)
{
return literal + "_PLURAL";
}
This is still not possible in .NET natively. But it is a well known problem and already solved by gettext. It have native support for many technologies, even includes C# support. But looks like there is modern reimplementation now in .NET: https://github.com/VitaliiTsilnyk/NGettext
Orchard CMS uses GNU Gettext's PO files for localization and plural forms.
Here is their code where you can grab the idea:
https://github.com/OrchardCMS/OrchardCore/tree/main/src/OrchardCore/OrchardCore.Localization.Core
https://github.com/OrchardCMS/OrchardCore/tree/main/src/OrchardCore/OrchardCore.Localization.Abstractions
The logic needn't be in your view, but it's certainly not in the resource model the DotNet framework. Yours is a simple enough case that you can probably get away with creating a simple format string for singular, and one for plural. "You have 1 car."/"You have {0} cars." You then need to write a method that discriminates between the plural and singular case.
There are also a number of languages where there's no distinction between singular and plural, but this just requires a little duplication in the resource strings. (edit to add: gender and sometimes endings change depending on number of units in many languages, but this should be fine as long as you distinguish between singular and plural sentences, rather than just words).
Microsoft mostly gave up on semantically clever localization because it's hard to generalize even something like pluralization to 30+ languages. This is why you see such boring presentation of numeric quantities in most applications that are translated to many languages, along the lines of "Cars: {0}". That sounds lame, but surprisingly, the last time I checked, usability studies didn't actually favor the verbose natural language presentation in most cases.

Resources