Passing additional arguments to _normalize_coerce methods in Cerberus

I have some code (see the end of this message); it's by no means final, but it is the best way I've seen or conceived so far for validating multiple date formats in a reasonably performant way.
I'm wondering if there is a way to pass an additional argument to this kind of function (_normalize_coerce). It would be nice if the date format string could be defined in the schema, something like:
{
    "a_date": {
        "type": "datetime",
        "coerce": "to_datetime",
        "coerce_args": "%m/%d/%Y %H:%M"
    }
}
That would avoid making a code change in the function every time an additional date format has to be supported. I've looked through the docs and not found anything suitable. There's a fairly good chance I'm looking at this all wrong, but I figured asking the experts was the best approach. I think defining it within the schema is the cleanest solution to the problem, but I'm all eyes and ears for facts, thoughts and opinions.
Some context:
Performance is essential as this could be running against millions of rows in AWS lambdas (and Cerbie (my nickname for cerberus) isn't exactly a spring chicken :P ).
None of the schemas will be native python dicts as they're all defined in JSON/YAML, so it all needs to be string friendly.
Not using the built-in coercion as the python types cannot be parsed from strings
I don't need the datetime object, so regex is a possibility, just less explicit and less futureproof.
If this is all wrong and I'm grossly incompetent, please be gentle (づ。◕‿‿◕。)づ
from datetime import datetime
from typing import Union

def _normalize_coerce_to_datetime(self, value: Union[str, datetime, None]) -> Union[datetime, str, None]:
    '''
    Casts valid datetime strings to the datetime python type.
    :param value: (str, datetime, None): python datetime, datetime string
    :return: datetime, string, None. python datetime,
        invalid datetime string or None if the value is empty or None
    '''
    datetime_formats = ['%m/%d/%Y %H:%M']
    if isinstance(value, datetime):
        return value
    if value and not value.isspace():
        for fmt in datetime_formats:
            try:
                return datetime.strptime(value, fmt)
            except ValueError:
                # Try the next format.
                continue
        # No format matched: hand back the original string unchanged.
        return value
    return None

I have attempted to do this myself and have not found a way to pass additional arguments to a custom _normalize_coerce rule. If you extend the Cerberus library with custom validators, however, you can include arguments in the schema and access them through the constraint passed to the custom validator. Below is an example I have used for a conditional-to-default coercer. I needed to specify the condition, the value to check against and the value to return, and I couldn't find a way to do this with _normalize_coerce, so I applied it inside a validation rule and edited self.document, as seen in the code.
Schema:
{
    "columns": {
        "Customer ID": {
            "type": "number",
            "conditional_to_default": {
                "condition": "greater_than",
                "value_to_check_against": 100,
                "value_to_return": 22
            }
        }
    }
}
# Module-level imports this method relies on.
import operator
import cerberus

def _validate_conditional_to_default(self, constraint, field, value):
    """
    Test the values and transform if conditions are met.
    :param constraint: Dictionary with the args needed for the conditional check.
    :param field: Field name.
    :param value: Field value.
    :return: the new document value if applicable, or keep the existing document value if not
    """
    value_to_check_against = constraint["value_to_check_against"]
    value_to_return = constraint["value_to_return"]
    rule_name = 'conditional_to_default'
    condition_mapping_dict = {"greater_than": operator.gt, "less_than": operator.lt, "equal_to": operator.eq,
                              "less_than_or_equal_to": operator.le,
                              "greater_than_or_equal_to": operator.ge}
    if constraint["condition"] in condition_mapping_dict:
        if condition_mapping_dict[constraint["condition"]](value, value_to_check_against):
            # Condition met: overwrite the field in the document being validated.
            self.document[field] = value_to_return
            return self.document
        else:
            return self.document
    if constraint["condition"] not in condition_mapping_dict:
        # Unknown condition name: report it as a validation error.
        custom_errors_list = []
        custom_error = cerberus.errors.ValidationError(document_path=(field, ), schema_path=(field, rule_name),
                                                       code=0x03, rule=rule_name, constraint="Condition must be "
                                                                                             "one of: "
                                                                                             "{condition_vals}"
                                                       .format(condition_vals=list(condition_mapping_dict.keys())),
                                                       value=value, info=())
        custom_errors_list.append(custom_error)
        self._error(custom_errors_list)
        return self.document
This is probably the wrong way to do it, but I hope the above gives you some inspiration and gets you a bit further. Equally, I'm following this to see if anyone else has found a way to pass arguments to the _normalize_coerce function.
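Another angle, offered only as a rough sketch: since the Cerberus coerce rule accepts a callable as well as a method name, the schema loaded from JSON/YAML could be pre-processed before it reaches the validator, binding the extra argument with functools.partial. The "coerce_args" key comes from the question; bind_coerce_args, COERCERS and the standalone to_datetime function are hypothetical names made up for this illustration and are not part of Cerberus. The sketch handles only top-level fields, not nested schemas.

```python
from datetime import datetime
from functools import partial
import json


def to_datetime(value, fmt):
    """Parse value with fmt; return the input unchanged if it does not match."""
    if isinstance(value, datetime):
        return value
    if not value or value.isspace():
        return None
    try:
        return datetime.strptime(value, fmt)
    except ValueError:
        return value


# Hypothetical registry: names that may appear under "coerce" in the schema.
COERCERS = {"to_datetime": to_datetime}


def bind_coerce_args(schema):
    """Return a copy of schema in which every "coerce"/"coerce_args" pair
    is replaced by one callable with the argument already bound."""
    bound = {}
    for field, rules in schema.items():
        rules = dict(rules)  # shallow copy so the input schema is untouched
        if "coerce_args" in rules:
            func = COERCERS[rules["coerce"]]
            rules["coerce"] = partial(func, fmt=rules.pop("coerce_args"))
        bound[field] = rules
    return bound


raw = json.loads(
    '{"a_date": {"type": "datetime", "coerce": "to_datetime",'
    ' "coerce_args": "%m/%d/%Y %H:%M"}}'
)
schema = bind_coerce_args(raw)
coerce = schema["a_date"]["coerce"]
print(coerce("01/31/2020 13:45"))
```

The resulting dict no longer contains the non-standard "coerce_args" key, so it should be usable as an ordinary Cerberus schema; whether the per-row overhead of partial is acceptable at the scale mentioned in the question would need measuring.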

Related

Is it bad practice to provide your own setter or should I use setproperty?

Suppose if I had the following Employee struct:
mutable struct Employee
    _id::Int64
    _first_name::String
    _last_name::String
    function Employee(_id::Int64,_first_name::String,_last_name::String)
        # validation left out.
        new(_id,_first_name,_last_name)
    end
end
If I wanted to implement my own setproperty!() I can do:
function setproperty!(value::Employee,name::Symbol,x)
    if name == :_id
        if !isa(x,Int64)
            throw(ErrorException("ID type is invalid"))
        end
        setfield!(value,:_id,x)
    end
    if name == :_first_name
        if is_white_space(x)
            throw(ErrorException("First Name cannot be blank!"))
        end
        setfield!(value,:_first_name,x)
    end
    if name == :_last_name
        if is_white_space(x)
            throw(ErrorException("Last Name cannot be blank!"))
        end
        setfield!(value,:_last_name,x)
    end
end
Have I implemented setproperty!() correctly?
The reason why I use setfield!() for _first_name and _last_name, is because if I do:
if name == :_first_name
    setproperty!(value,:_first_name,x) # or value._first_name = x
end
it causes a StackOverflowError because it's recursively using setproperty!().
I don't really like the use of setproperty!(), because as the number of parameters grows, so would setproperty!().
It also brings to mind using Enum and if statements (only we've switched Enum with Symbol).
One workaround I like, is to document that the fields are meant to be private and use the provided setter to set the field:
function set_first_name(obj::Employee,first_name::AbstractString)
    # Validate first_name before assigning it.
    obj._first_name = first_name
end
The function is smaller and has a single purpose.
Of course this doesn't prevent someone from using setproperty!(), setfield!() or value._field_name = x, but if you're going to circumvent the provided setter then you'll have to handle the consequences of doing it.
I would recommend doing this: defining getter and setter functions instead of overloading getproperty/setproperty!. In the wild, the main use I have seen for overloading getproperty/setproperty! is when fields can be calculated from the data. For a getter/setter pattern, I recommend using the ! convention:
getter:
function first_name(value::Employee)
    return value._first_name
end
setter:
function first_name!(value::Employee,text::String)
    #validate here
    value._first_name = text
    return value._first_name
end
If your struct is mutable, it could be that some fields are uninitialized. You could add a getter with a default by adding a method:
function first_name(value::Employee,default::String)
    value_stored = value._first_name
    if is_initialized(value_stored) #define is_initialized function
        return value_stored
    else
        return default
    end
end
With a setter and a getter with a default, the only difference between first_name(val,text) and first_name!(val,text) is whether val is mutated; the result is the same. This is useful if you are writing mutating and non-mutating versions of functions. As you said, overloading getproperty/setproperty! is cumbersome in comparison. If you want to disallow accessing the fields, you could do:
Base.getproperty(val::Employee,key::Symbol) = error("use the getter functions instead!")
Base.setproperty!(val::Employee,key::Symbol,x) = error("use the setter functions instead!")
This disallows the syntax sugar of val.key and val.key = x, so the getter and setter functions themselves must then use getfield/setfield! internally. (If someone really wants raw access, there is still getfield/setfield!, but they were warned.)
Finally, I found this recommendation in the Julia docs, which recommends getter/setter methods over direct field access:
https://docs.julialang.org/en/v1/manual/style-guide/#Prefer-exported-methods-over-direct-field-access

Swiftui: how do you assign the value in a "String?" object to a "String" object?

Swift dictionaries have the feature that the value returned by key access is always of an optional type. For example, a dictionary with String keys and String values is tricky to access because each returned value is an optional.
An obvious need is to assign x = myDictionary[key], where you are trying to get the String stored as the dictionary value into the String variable x.
This is tricky because the value is always returned as an optional String, written as the type String?.
So how is it possible to convert the String? value returned by the dictionary access into a plain String that can be assigned to a plain String variable?
I guess the problem is that there is no way to know for sure that a dictionary value exists for the key. The key used to access the dictionary could be anything, so somehow you have to deal with that.
As described in @jnpdx's answer to this SO question (How do you assign a String?-type object to a String-type variable?), there are at least three ways to convert a String? to a String:
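import SwiftUI
var x: Double? = 6.0
var a = 2.0
// Method 1: explicit nil check, then force-unwrap.
if x != nil {
    a = x!
}
// Method 2: optional binding; b holds the unwrapped value.
if let b = x {
    a = b
}
// Method 3: nil-coalescing with a default value.
a = x ?? 0.0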
Two key concepts:
Check the optional to see if it is nil.
If the optional is not nil, then go ahead and use its value.
In the first method above, "if x != nil" explicitly checks that x is not nil before the block is executed, so force-unwrapping with x! is safe.
In the second method above, "if let b = x" executes the block only if x is not nil, and binds the unwrapped value to b.
In the third method above, the nil-coalescing operator ?? is employed. If x is nil, the default value after ?? is assigned to a.
The above code will run in a playground.
Besides the three methods above, there is at least one other method using "guard let" but I am uncertain of the syntax.
I believe that the three above methods also apply to variables other than String? and String.

Convert large xml values into double type json?

I'm forming an xml whose snippet is -
<cache-properties>
  <list-cache-hit-rate>
    <units>hits/sec</units>
    <value>1.5308452E6</value>
  </list-cache-hit-rate>
  <list-cache-miss-rate>
    <units>misses/sec</units>
    <value>25422.167</value>
  </list-cache-miss-rate>
  <compressed-tree-cache-hit-rate>
    <units>hits/sec</units>
    <value>970.2339</value>
Notice the value 1.5308452E6 is big enough that the values are stored as exponent while performing fn:sum() behind the scene.
Later, I'm converting the xml to json by the following function -
let $arr := json:to-array(local:tojson($data))
return (($data))
and value converted looks like this -
"cache-properties": {
  "list-cache-hit-rate": {
    "units": "hits/sec",
    "value": 1.5308452E6
  },
  "list-cache-miss-rate": {
    "units": "misses/sec",
    "value": "25422.167"
  },
  "compressed-tree-cache-hit-rate": {
    "units": "hits/sec",
    "value": "970.2339"
  },
Notice the values are enclosed in quotes except 1.5308452E6 this value. This is not in quotes. What correction is needed here ? Or is this correct? I'd rather have all values in quotes. This is my custom transform function-
declare function local:tojson($func){
  let $custom :=
    let $config := json:config("custom")
    let $_ := map:put( $config, "whitespace", "ignore" )
    let $_ := map:put( $config, "array-element-names", "Video" )
    return $config
  return json:transform-to-json($func,$custom)
};
Take a look at the XML schema. Your snippets appear to be similar or identical to the MarkLogic system status XML schema; however, you mention fn:sum() running behind the scenes, so I'm guessing you have applied a transformation that has changed the XSD type.
The JSON transformation code uses the XSD type, if in scope, to determine the typed output in JSON (for XML numeric types). Also, if the number is 'too large', it can be converted to a string to avoid JavaScript issues.
(It basically uses fn:data(value) to convert.)
If needed, you can either force a string type onto your XML, or you can specialize the transformation by overriding one of the json-custom: primitives in json/custom.xqy by supplying the appropriate mapping in the config. Look into the source for the full list of overridable functions. They are not fully documented, as they were not written with full generality in mind, and it may not be obvious, easy or even possible to change behaviour in every conceivable way.
The strategies are to either
Use an XML with schema in scope that types atomic values explicitly (in your case as xs:string),
Override one of the low level functions in custom.xqy
Convert the JSON by post-processing and 'stringify' the desired elements
Roll your own (not too difficult with the samples shown)
All of the above

idl: pass keyword dynamically to isa function to test structure read by read_csv

I am using IDL 8.4. I want to use the isa() function to check the types of the fields read by read_csv(). I want to use /number, /integer, /float and /string: some fields must be float, others must be integer, and for others I don't care. I can do it like this, but it is not very readable:
str = read_csv(filename, header=inheader)
; TODO check header
if not isa(str.(0), /integer) then stop
if not isa(str.(1), /number) then stop
if not isa(str.(2), /float) then stop
I am hoping I can do something like
expected_header = ['id', 'x', 'val']
expected_type = ['/integer', '/number', '/float']
str = read_csv(filename, header=inheader)
if not array_equal(strlowcase(inheader), expected_header) then stop
for i=0l,n_elements(expected_type)-1 do begin
  if not isa(str.(i), expected_type[i]) then stop
endfor
The above doesn't work, as '/integer' is taken literally and I guess isa() is looking for a named structure. How can you do something similar?
Ideally I want to pick the expected type based on the header read from the file, so that the script still works as long as the header specifies the expected fields.
EDIT:
My tentative solution is to write a wrapper for ISA(). It's not very pretty, but it does what I wanted. If there is a cleaner solution, please let me know.
Also, read_csv is defined to return only one of long, long64, double and string, so I could write a function that tests within this limitation. But I wanted it to work in general so that I can reuse it for other similar cases.
function isa_generic,var,typ
  ; calls isa() http://www.exelisvis.com/docs/ISA.html with keyword
  ; if 'n', test /number
  ; if 'i', test /integer
  ; if 'f', test /float
  ; if 's', test /string
  if typ eq 'n' then return, isa(var, /number)
  if typ eq 'i' then return, isa(var, /integer)
  if typ eq 'f' then return, isa(var, /float)
  if typ eq 's' then return, isa(var, /string)
  print, 'unexpected typename: ', typ
  stop
end
IDL has some limited reflection abilities, which will do exactly what you want:
expected_types = ['integer', 'number', 'float']
expected_header = ['id', 'x', 'val']
str = read_csv(filename, header=inheader)
if ~array_equal(strlowcase(inheader), expected_header) then stop
foreach type, expected_types, index do begin
  if ~isa(str.(index), _extra=create_struct(type, 1)) then stop
endforeach
It's debatable if this is really "easier to read" in your case, since there are only three cases to test. If there were 500 cases, it would be a lot cleaner than writing 500 slightly different lines.
This snippet uses some rather esoteric IDL features, so let me explain what's happening:
expected_types is just a list of (string) keyword names in the order they should be used.
The foreach part iterates over expected_types, putting the keyword string into the type variable and the iteration count into index.
This is equivalent to using for index = 0, n_elements(expected_types) - 1 do and then using expected_types[index] instead of type, but the foreach loop is easier to read IMHO.
_extra is a special keyword that can pass a structure as if it were a set of keywords. Each of the structure's tags is interpreted as a keyword.
The create_struct function takes one or more pairs of (string) tag names and (any type) values, then returns a structure with those tag names and values.
Finally, I replaced not (bitwise not) with ~ (logical not). This step, like foreach vs for, is not necessary in this instance, but can avoid headache when debugging some types of code, where the distinction matters.
--
Reflective abilities like these can do an awful lot, and come in super handy. They're work-horses in other languages, but IDL programmers don't seem to use them as much. Here's a quick list of common reflective features I use in IDL, with links to the documentation for each:
create_struct - Create a structure from (string) tag names and values.
n_tags - Get the number of tags in a structure.
_extra, _strict_extra, and _ref_extra - Pass keywords by structure or reference.
call_function - Call a function by its (string) name.
call_procedure - Call a procedure by its (string) name.
call_method - Call a method (of an object) by its (string) name.
execute - Run complete IDL commands stored in a string.
Note: Be very careful using the execute function. It will blindly execute any IDL statement you (or a user, file, web form, etc.) feed it. Never ever feed untrusted or web user input to the IDL execute function.
You can't access the keywords quite like that, but there is a typename parameter to ISA that might be useful. This is untested, but should work:
expected_header = ['id', 'x', 'val']
expected_type = ['int', 'long', 'float']
str = read_csv(filename, header=inheader)
if not array_equal(strlowcase(inheader), expected_header) then stop
for i = 0L, n_elements(expected_type) - 1L do begin
  if not isa(str.(i), expected_type[i]) then stop
endfor

Passing strings as task creation discriminants in Ada

I'm taking my first steps with Ada, and I'm finding that I struggle to understand how to do common, even banal, operations that in other languages would be immediate.
In this case, I defined the following task type (and access type so I can create new instances):
task type Passenger(
   Name : String_Ref;
   Workplace_Station : String_Ref;
   Home_Station : String_Ref
);
type Passenger_Ref is access all Passenger;
As you can see, it's a simple task that has 3 discriminants that can be passed to it when creating an instance. String_Ref is defined as:
type String_Ref is access all String;
and I use it because apparently you cannot use "normal" types as task discriminants, only references or primitive types.
So I want to create an instance of such a task, but whatever I do, I get an error. I cannot pass the strings directly by simply doing:
Passenger1 := new Passenger(Name => "foo", Workplace_Station => "man", Home_Station => "bar");
Because those are strings and not references to strings, fair enough.
So I tried:
task body Some_Task_That_Tries_To_Use_Passenger is
   Passenger1 : Passenger_Ref;
   Name1 : aliased String := "Foo";
   Home1 : aliased String := "Man";
   Work1 : aliased String := "Bar";
begin
   Passenger1 := new Passenger(Name => Name1'Access, Workplace_Station => Work1'Access, Home_Station => Home1'Access);
But this doesn't work either, as, from what I understand, the Home1/Name1/Work1 variables are local to task Some_Task_That_Tries_To_Use_Passenger and so cannot be used by Passenger's "constructor".
I don't understand how I have to do it to be honest. I've used several programming languages in the past, but I never had so much trouble passing a simple String to a constructor, I feel like a total idiot but I don't understand why such a common operation would be so complicated, I'm sure I'm approaching the problem incorrectly, please enlighten me and show me the proper way to do this, because I'm going crazy :D
Yes, I agree it is a serious problem with the language that discriminants of task and record types have to be discrete. Fortunately there is a simple solution for task types: the data can be passed via an "entry" point.
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
procedure Main is
   task type Task_Passenger is
      entry Construct(Name, Workplace, Home : in String);
   end Task_Passenger;
   task body Task_Passenger is
      N, W, H : Unbounded_String;
   begin
      -- Wait for the caller to hand over the strings, then copy them.
      accept Construct(Name, Workplace, Home : in String) do
         N := To_Unbounded_String(Name);
         W := To_Unbounded_String(Workplace);
         H := To_Unbounded_String(Home);
      end Construct;
      --...
   end Task_Passenger;
   Passenger : Task_Passenger;
begin
   Passenger.Construct("Any", "length", "strings!");
   --...
end Main;
Ada doesn't really have constructors. In other languages, a constructor is, in essence, a method that takes parameters and has a body that does stuff with those parameters. Trying to get discriminants to serve as a constructor doesn't work well, since there's no subprogram body to do anything with the discriminants. Maybe it looks like it should, because the syntax involves a type followed by a list of discriminant values in parentheses and separated by commas. But that's a superficial similarity. The purpose of discriminants isn't to emulate constructors.
For a "normal" record type, the best substitute for a constructor is a function that returns an object of the type. (Think of this as similar to using a static "factory method" instead of a constructor in a language like Java.) The function can take String parameters or parameters of any other type.
For a task type, it's a little trickier, but you can write a function that returns an access to a task.
type Passenger_Acc is access all Passenger;
function Make_Passenger (Name : String;
                         Workplace_Station : String;
                         Home_Station : String) return Passenger_Acc;
To implement it, you'll need to define an entry in the Passenger task (see Roger Wilco's answer), and then you can use it in the body:
function Make_Passenger (Name : String;
                         Workplace_Station : String;
                         Home_Station : String) return Passenger_Acc is
   Result : Passenger_Acc;
begin
   Result := new Passenger;
   Result.Construct (Name, Workplace_Station, Home_Station);
   return Result;
end Make_Passenger;
(You have to do this by returning a task access. I don't think you can get the function to return a task itself, because you'd have to use an extended return to set up the task object and the task object isn't activated until after the function returns and thus can't accept an entry.)
You say
"I don't understand how I have to do it to be honest. I've used several programming languages in the past, but I never had so much trouble passing a simple String to a constructor, I feel like a total idiot but I don't understand why such a common operation would be so complicated, I'm sure I'm approaching the problem incorrectly, please enlighten me and show me the proper way to do this, because I'm going crazy :D"
Ada's access types are often a source of confusion. The main issue is that Ada doesn't have automatic garbage collection, and wants to ensure you can't suffer from the problem of returning pointers to local variables. The combination of these two results in a curious set of rules that force you to design your solution carefully.
If you are sure your code is good, then you can always use 'Unrestricted_Access on an aliased String. This puts all the responsibility on you to ensure the accessed variable won't disappear from underneath the task, though.
It doesn't have to be all that complicated. You can use an anonymous access type and allocate the strings on demand, but please consider if you really want the strings to be discriminants.
Here is a complete, working example:
with Ada.Text_IO;
procedure String_Discriminants is
   task type Demo (Name : not null access String);
   task body Demo is
   begin
      Ada.Text_IO.Put_Line ("Demo task named """ & Name.all & """.");
   exception
      when others =>
         Ada.Text_IO.Put_Line ("Demo task terminated by an exception.");
   end Demo;
   Run_Demo : Demo (new String'("example 1"));
   Second_Demo : Demo (new String'("example 2"));
begin
   null;
end String_Discriminants;
Another option is to declare the strings as aliased constants in a library level package, but then you are quite close to just having an enumerated discriminant, and should consider that option carefully before discarding it.
I think another solution would be the following:
task body Some_Task_That_Tries_To_Use_Passenger is
   Name1 : aliased String := "Foo";
   Home1 : aliased String := "Man";
   Work1 : aliased String := "Bar";
   Passenger1 : aliased Passenger(
      Name => Name1'Access,
      Workplace_Station => Work1'Access,
      Home_Station => Home1'Access
   );
begin
   --...

Resources