extract elasticsearch date from a <start-date>/<duration> XBRL-JSON format - datetime

I am storing XBRL JSON using elasticsearch.
This xBRL-JSON OIM spec describes the oim:period property:
Otherwise, an ISO 8601 time interval representing the {interval}
property, expressed in one of the following forms:
<start>/<end>
<start>/<duration>
<duration>/<end>
Where <start> and <end> are valid according to the xsd:dateTime datatype, and <duration> is valid according to xsd:duration.
Examples from arelle's plugin look like this:
2016-01-01T00:00:00/PT0S
2015-01-01T00:00:00/P1Y
I notice that arelle's plugin exclusively produces this format:
<start>/<duration>
My question
Is there a way to save at least the <start> part as a date type in elasticsearch?
Ideas I had:
elasticsearch only (my preference)
Use a custom date format which anticipates the /<duration> part, but ignores it
I haven't checked Joda yet; will it ignore characters in the date format if they aren't part of the special character? Like the "/" delimiter or the "P" which precedes any duration value (like PT0S and P1Y above)?
EDIT: So the single-quote character escapes literals; this works: yyyy'/P' will accept a value '2015/P'. However, the rest of the duration could be more dynamic.
Re: dynamic; will Joda accept regex or wildcard character like "\d" or "+" qualifier so I can ignore all the possible variations following the P?
Use a character filter to strip out the /<duration> part before saving only <start> as datetime. But I don't know if character filters happen before saving as type: date. If they don't, the '/' part isn't stripped, and I wouldn't be passing valid date strings.
Don't use date type: Use a pattern tokenizer to split on /, and at least the two parts will be saved as separate tokens. Can't use date math, though.
Use a transformation; although it seems like this is deprecated. I read about using copy_to instead, but that seems to combine terms, and I want to break this term apart
Some sort of plugin? Maybe a plugin which will fully support this "interval" datatype described by the OIM spec... maybe a plugin which will store its separate parts...?
change my application (I prefer to use elasticsearch-only techniques if possible)
I could edit this plugin or produce my own plugin which uses exclusively <start> and <end> parts, and saves both into separate fields;
But this breaks the OIM spec, which says they should be combined in a single field
Moreover it can be awkward to express an "instant" fact (with no duration; the PT0S examples above); I guess I just use the same value for end property as start property... Not more awkward than a 0-length duration (PT0S) I guess.

Not a direct answer, but it's worth noting that the latest internal drafts of the xBRL-JSON specification have moved away from the single-field representation. Although the "/" separated notation is an ISO standard, tool support for it appears to be extremely poor, and so the working group has chosen to switch to separate fields for start and end dates. I would expect Arelle support to follow suit in due course.
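If the split ends up being done in the application before indexing, a minimal Python sketch could look like the following. It only covers the <start>/<duration> form that the question says Arelle's plugin emits. The field names (period, period_start, period_duration) are hypothetical, and the duration pattern is a loose illustrative check, not a full xsd:duration validator.

```python
import re
from datetime import datetime

# Loose check for duration strings such as "PT0S" or "P1Y".
# Illustrative only; this is not a complete xsd:duration validator.
_DURATION = re.compile(
    r"^-?P(?=.)(\d+Y)?(\d+M)?(\d+D)?(T(?=.)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?$"
)


def split_period(period):
    """Split '<start>/<duration>' into (start datetime, duration string)."""
    start_text, sep, duration = period.partition("/")
    if not sep or not _DURATION.match(duration):
        raise ValueError(f"not a <start>/<duration> interval: {period!r}")
    return datetime.fromisoformat(start_text), duration


def to_document(period):
    """Build a dict for indexing; the field names are hypothetical."""
    start, duration = split_period(period)
    return {
        "period": period,                      # original OIM value, kept as-is
        "period_start": start.isoformat(),     # candidate for a date-typed field
        "period_duration": duration,           # kept as a plain string
    }


print(to_document("2015-01-01T00:00:00/P1Y"))
```

Keeping the original string alongside the extracted start means the stored document still carries the single-field OIM value, while the start can be mapped as a date.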


Date validation in Ballerina

I have to validate different types of dates in Ballerina. I have been trying to deal with it using regex, however there are many ways to write a date (e.g. 25/05/2020, 05/25/2020, 25-May-2020 etc.).
It is hard to predict all the types. What's more, it would be nice to validate whether the received input is really a date - that includes different number of days in different months or leap years. Generally such regex would be monstrous (anyway really big). Are you aware of any existing library that would provide a shortcut or have date validation?
Is there any way to facilitate such task?
Validating all the date formats in a single go is a challenge, not only in Ballerina, but in other languages too. Ballerina does not have a single method to validate all the date formats, but you can use the ballerina/time library to parse date strings to Time records.
import ballerina/time;
public function main() {
string timeString = "<your time string>";
string timeFormat = "<your time format>";
time:Time|time:Error time = time:parse(timeString, timeFormat);
}
In the sample, time:parse() will return a valid Time record if the provided string is a valid time according to the provided time format; otherwise it returns an error.
I know this is not the exact answer you want, but this is the way to parse a time in Ballerina.
Alternatively, there are some Java answers which can be used (I am not sure whether they fulfil your requirement though) with Ballerina - Java interoperability.
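The same "parse against a known format" idea extends to several formats by trying each one in turn. The sketch below is in Python rather than Ballerina, purely to illustrate the approach; the list of accepted formats is an assumption, and the %b month names depend on the process locale.

```python
from datetime import datetime

# Formats the application chooses to accept; this list is an assumption.
# Order matters: an ambiguous input is resolved by the first matching format.
ACCEPTED_FORMATS = ["%d/%m/%Y", "%m/%d/%Y", "%d-%b-%Y"]


def parse_date(text, formats=ACCEPTED_FORMATS):
    """Return a date if text matches any accepted format, else None."""
    for fmt in formats:
        try:
            # strptime rejects impossible dates (day 31 in April, 29 Feb
            # in a non-leap year), so no hand-written regex is needed.
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


print(parse_date("25/05/2020"))
print(parse_date("29/02/2021"))
```

Note that a string such as 05/06/2020 is valid under both of the first two formats, so the caller still has to decide which convention wins; no library can infer that from the string alone.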

jsonapi.org correct way to use pagination using the page query string

In the jsonapi documentation for pagination it says the following:
For example, a page-based strategy might use query parameters such as
page[number] and page[size]
How would I represent this in the query string? http://localhost:4200/people?page[number]=1&page[size]=25, I don't think using a map link structure is a valid query string. Only the page parameter is reserved according to the documentation.
I don't think using a map link structure is a valid query string.
You're right technically, and that's why the spec has the note that says:
Note: The example query parameters above use unencoded [ and ] characters simply for readability. In practice, these characters must be percent-encoded, per the requirements in RFC 3986.
So, page[size] is really page%5Bsize%5D which is a valid query parameter name.
Only the page parameter is reserved according to the documentation.
When the spec text says that only page is reserved, it actually means that any page[......] style query parameter is reserved. (I can tell you that for sure as one of the spec's editors.) But it should say so more explicitly, so I'll open an issue for it.
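As a small illustration of the percent-encoding described above, Python's standard urllib.parse produces the encoded form directly. The page number and size values are simply the ones from the question.

```python
from urllib.parse import parse_qs, urlencode

# urlencode percent-encodes the square brackets in the parameter names.
query = urlencode({"page[number]": 1, "page[size]": 25})
print(query)  # page%5Bnumber%5D=1&page%5Bsize%5D=25

# Decoding restores the bracketed names; values come back as lists of strings.
print(parse_qs(query))
```

Most HTTP client libraries apply this encoding automatically when parameters are passed as a mapping rather than concatenated into the URL by hand.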

What's the correct format for TCDL linkAttributes?

I can see the technology-independent Tridion Content Delivery Language (TCDL) link has the following parameters, which are pretty well described on SDL Live Content.
type
origin
destination
templateURI
linkAttributes
textOnFail
addAnchor
VariantId
How do we add multiple attribute-value pairs for the linkAttributes? Specifically, what do we use to escape the double quotes as well as separate pairs (e.g. if we need class="someclass" and onclick="someevent").
The separate pairs are just space delimited, like a normal series of attributes. Try XML encoding the value of linkAttributes however. So, " becomes &quot;, etc...
If you are using some Javascript, you might take care of the Javascript quotes too, as in \".
Edit: after I figured out your real question, the answer is a lot simpler:
You should wrap the values inside your linkAttributes in single quotes. Spaces inside linkAttributes are typically handled fine; but if not, escape them with %20.
If you need something more or want something that isn't handled by the standard tcdl:ComponentLink, remember that you can always create your own TCDL tag and use a TagHandler or TagRenderer (look them up in the docs for examples or search for Jaime's article on TagRenderer) to do precisely what you want.
My original answer was to a question you didn't ask: what is the format for TCDL tags (in general). But the explanation might still be useful to some, so remains below.
I'd suggest having a look at what format the default building blocks (e.g. the Link Resolver TBB in the Default Finish Actions) output and use that as a guide line.
This is what I could quickly get from the transport package of a published page:
<tcdl:Link type="Page" origin="tcm:5-199-64" destination="tcm:5-206-64"
templateURI="tcm:0-0-0" linkAttributes="" textOnFail="true"
addAnchor="" variantId="">Home</tcdl:Link>
<tcdl:ComponentPresentation type="Embedded" componentURI="tcm:5-69"
templateURI="tcm:5-133-32">
<span>
...
One of the things that I know from experience: your entire TCDL tag will have to be on a single line (I wrapped the lines above for readability only). Or at least that is the case if it is used to invoke a REL TagRenderer. Clearly the tcdl:ComponentPresentation tag above will span multiple lines, so that "single line rule" doesn't apply everywhere.
And that is probably the best advice: given the fact that TCDL tags are processed at multiple points in Tridion Publishing, Deployment and Delivery pipeline, I'd stick to the format that the default TBBs output. And from my sample that seems to be: put everything on a single line and wrap the values in (double) quotes.

How can I make input fields accept locale dependent number formatting?

I'm working on a Spring MVC Project and ran into a problem with the internationalization in forms, especially the number formatting.
I already use fmt:formatNumber to format the numbers according to the current selected locale.
<fmt:formatNumber value="${object[field]}"/>
Like this, number formatting works well when I display numbers. But how about the forms?
At the moment, the input fields that are supposed to receive float values are prefilled with 0.0 and expect me to use "." as decimal separator, no matter what locale is selected. Values containing "," are refused by the server (...can not convert String to required type float...).
How can I make my input fields use and accept the appropriate number format as well?
Did you have a look at @NumberFormat? If you annotate the property the input field is bound to, this should result in the proper formatting. Something like:
@NumberFormat(style = Style.NUMBER)
private BigDecimal something;
This style is the "general-purpose number format for the current locale". I guess, the current locale is determined threadwise from the LocaleContextHolder.
Your app needs to be annotation-driven, also see the section "Annotation-driven Formatting" in the docs.
You might want to take a look at the DecimalFormatSymbols as suggested in this answer.

How should I encode dictionaries into HTTP GET query strings?

An HTTP GET query string is an ordered sequence of key/value pairs:
?spam=eggs&spam=ham&foo=bar
Is, with certain semantics, equivalent to the following dictionary:
{'spam': ['eggs', 'ham'], 'foo': 'bar'}
This happens to work well for boolean properties of that page being requested:
?expand=1&expand=2&highlight=7&highlight=9
{'expand': [1, 2], 'highlight': [7, 9]}
If you want to stop expanding the element with id 2, just pop it out of the expand value and urlencode the query string again. However, if you've got a more modal property (with 3+ choices), you really want to represent a structure like so:
{'highlight_mode': {7: 'blue', 9: 'yellow'}}
Where the values to the corresponding id keys are part of a known enumeration. What's the best way to encode this into a query string? I'm thinking of using a sequence of two-tuples like so:
?highlight_mode=(7,blue)&highlight_mode=(9,yellow)
Edit: It would also be nice to know any names that associate with the conventions. I know that there may not be any, but it's nice to be able to talk about something specific using a name instead of examples. Thanks!
The usual way is to do it like this:
highlight_mode[7]=blue&highlight_mode[9]=yellow
AFAIR, quite a few server-side languages actually support this out of the box and will produce a nice dictionary for these values.
I've also seen people JSON-encode the nested dictionary, then further encode it with BASE64 (or something similar), then pass the whole resulting mess as a single query string parameter.
Pretty ugly.
On the other hand, if you can get away with using POST, JSON is a really good way to pass this kind of information back and forth.
In many Web frameworks it's encoded differently from what you say.
{'foo': [1], 'bar': [2, 3], 'fred': 4}
would be:
?foo[]=1&bar[]=2&bar[]=3&fred=4
The reason array answers should be different from plain answers is so the decoding layer can automatically tell the less common foo case (array which just happens to have a single element) from extremely common fred case (single element).
This notation can be extrapolated to:
?highlight_mode[7]=blue&highlight_mode[9]=yellow
when you have a hash, not just an array.
I think this is pretty much what Rails and most frameworks which copy from Rails do.
Empty arrays, empty hashes, and lack of scalar value look identical in this encoding, but there's not much you can do about it.
This [] seems to be causing quite a few flamewars. Some view it as unnecessary, because the browser, transport layer, and query string encoder don't care. The only thing that cares is the automatic query string decoder. I support the Rails way of using []. The alternative would be having separate methods for extracting a scalar and extracting an array from the query string, as there's no automatic way to tell when the program wants [1] and when it wants 4.
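The bracket convention described above can be sketched in Python as follows. This is a minimal illustration handling one level of nesting only; encode_nested and decode_nested are hypothetical helper names, not part of any framework.

```python
import re
from urllib.parse import parse_qsl, urlencode


def encode_nested(params):
    """Encode dict values as key[sub]=v and list values as key[]=v."""
    pairs = []
    for key, value in params.items():
        if isinstance(value, dict):
            pairs.extend((f"{key}[{sub}]", v) for sub, v in value.items())
        elif isinstance(value, list):
            pairs.extend((f"{key}[]", v) for v in value)
        else:
            pairs.append((key, value))
    return urlencode(pairs)  # percent-encodes the brackets


# Matches "name[sub]" or "name[]" after percent-decoding.
_KEY = re.compile(r"^([^\[\]]+)\[([^\[\]]*)\]$")


def decode_nested(query):
    """Inverse of encode_nested; all keys and values come back as strings."""
    result = {}
    for key, value in parse_qsl(query):
        match = _KEY.match(key)
        if not match:
            result[key] = value                              # plain scalar
        elif match.group(2) == "":
            result.setdefault(match.group(1), []).append(value)   # key[]
        else:
            result.setdefault(match.group(1), {})[match.group(2)] = value
    return result


query = encode_nested({"highlight_mode": {7: "blue", 9: "yellow"}})
print(query)
print(decode_nested(query))
```

As noted above, empty lists and empty dicts produce no pairs at all, so they cannot be told apart from a missing key after decoding, and integer keys such as 7 come back as the string "7".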
This piece of code works for me with a Python backend:
import json, base64
param={
"key1":"val1",
"key2":[
{"lk1":"https://www.foore.in", "lk2":"https://www.foore.in/?q=1"},
{"lk1":"https://www.foore.in", "lk2":"https://www.foore.in/?q=1"}
]
}
encoded_param = base64.urlsafe_b64encode(json.dumps(param).encode())
encoded_param_ready = encoded_param.decode("ascii")  # bytes -> str without the b'...' wrapper
#eyJrZXkxIjogInZhbDEiLCAia2V5MiI6IFt7ImxrMSI6ICJodHRwczovL3d3dy5mb29yZS5pbiIsICJsazIiOiAiaHR0cHM6Ly93d3cuZm9vcmUuaW4vP3E9MSJ9LCB7ImxrMSI6ICJodHRwczovL3d3dy5mb29yZS5pbiIsICJsazIiOiAiaHR0cHM6Ly93d3cuZm9vcmUuaW4vP3E9MSJ9XX0=
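// In JS: atob expects the standard Base64 alphabet, so map the URL-safe "-" and "_" back first
var decoded_params = atob(decodeURI(encoded_param_ready).replace(/-/g, "+").replace(/_/g, "/"));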
