Define a jsonable type using mypy / PEP-526 - mypy

Values that can be converted to a JSON string via json.dumps are:
Scalars: Numbers and strings
Containers: Mapping and Iterable
Union[str, int, float, Mapping, Iterable]
Do you have a better suggestion?

Long story short, you have the following options:
If you have zero idea how your JSON is structured and must support arbitrary JSON blobs, you can:
Wait for mypy to support recursive types.
If you can't wait, just use object or Dict[str, object]. It ends up being nearly identical to using recursive types in practice.
If you don't want to constantly have to type-check your code, use Any or Dict[str, Any]. Doing this lets you avoid needing to sprinkle in a bunch of isinstance checks or casts at the expense of type safety.
If you know precisely what your JSON data looks like, you can:
Use a TypedDict
Use a library like Pydantic to deserialize your JSON into an object
More discussion follows below.
Case 1: You do not know how your JSON is structured
Properly typing arbitrary JSON blobs is unfortunately awkward to do with PEP 484 types. This is partly because mypy (currently) lacks recursive types: this means that the best we can do is use types similar to the one you constructed.
(We can, however, make a few refinements to your type. In particular, json.Dumps(...) actually does not accept arbitrary iterables. A generator is a subtype of Iterable, for example, but json.dumps(...) will refuse to serialize generators. You probably want to use something like Sequence instead.)
That said, having access to recursive types may not end up helping that much either: in order to use such a type, you would need to start sprinkling in isinstance checks or casts into your code. For example:
JsonType = Union[None, int, str, bool, List[JsonType], Dict[JsonType]]
def load_config() -> JsonType:
# ...snip...
config = load_config()
assert isinstance(config, dict)
name = config["name"]
assert isinstance(name, str)
So if that's the case, do we really need the full precision of recursive types? In most cases, we can just use object or Dict[str, object] instead: the code we write at runtime is going to be nearly the same in either case.
For example, if we changed the example above to use JsonType = object, we would still end up needing both asserts.
Alternatively, if you find sprinkling in assert/isinstance checks to be unnecessary for your use case, a third option is to use Any or Dict[str, Any] and have your JSON be dynamically typed.
It's obviously less precise than the options presented above, but asking mypy to not type check uses of your JSON dict and relying on runtime exceptions instead can sometimes be more ergonomic in practice.
Case 2: You know how your JSON data will be structured
If you do not need to support arbitrary JSON blobs and can assume it forms a particular shape, we have a few more options.
The first option is to use TypedDicts instead. Basically, you construct a type explicitly specifying what a particular JSON blob is expected to look like and use that instead. This is more work to do, but can let you gain more type-safety.
The main disadvantage of using TypedDicts is that it's basically the equivalent of a giant cast in the end. For example, if you do:
from typing import TypedDict
import json
class Config(TypedDict):
name: str
env: str
with open("my-config.txt") as f:
config: Config = json.load(f)
...how do we know that my-config.txt actually matches this TypedDict?
Well, we don't, not for certain.
This can be fine if you have full control over where the JSON is coming from. In this case, it might be fine to not bother validating the incoming data: just having mypy check uses of your dict is good enough.
But if having runtime validation is important to you, your options are to either implement that validation logic yourself or use a 3rd party library that can do it on your behalf, such as Pydantic:
from pydantic import BaseModel
import json
class Config(BaseModel):
name: str
env: str
with open("my-config.txt") as f:
# The constructor will raise an exception at runtime
# if the input data does not match the schema
config = Config(**json.load(f))
The main advantage of using these types of libraries is that you get full type safety. You can also use object attribute syntax instead of dict lookups (e.g. do config.name instead of config["name"]), which is arguably more ergonomic.
The main disadvantage is doing this validation does add some runtime cost, since you're now scanning over the entire JSON blob. This might end up introducing some non-trivial slowdowns to your code if your JSON happens to contain a large quantity of data.
Converting your data into an object can also sometimes be a bit inconvenient, especially if you plan on converting it back into a dict later on.

There has been a lengthy discussion (https://github.com/python/typing/issues/182) about the possibility of introducing a JSONType; however, no definitive conclusion has yet been reached.
The current suggestion is to just define JSONType = t.Union[str, int, float, bool, None, t.Dict[str, t.Any], t.List[t.Any]] or something similar in your own code.

Related

Why recommend use ctx as the first parameter?

As the documentation said
Do not store Contexts inside a struct type; instead, pass a Context explicitly to each function that needs it. The Context should be the first parameter, typically named ctx
but I found, in the typical http request handle function, a http.Request object has .Context() method can retrieve the context which http request associate with.
So why recommend to use context as the first parameter in these functions? Is that reasonable in this situation?
I know that is not an absolute rule. But I want to know why the HandlerFunc is func(ResponseWriter, *Request) instead of func(context.Context, ResponseWriter, *Request).
Apparently HandlerFunc breaks this recommendation.
As described in the documentation you quoted above, ctx should be a (very) common argument for many functions. This is similar to the way many functions return an error. The best place for a common argument/return value is either as the first, or last in a list. (Arguably, Go could have chosen to make error always be the first return value--I won't discuss that here).
Since variadic variables may only be the last in the list of function arguments, this leaves the only option for a common argument to be the first one.
I expect this is why ctx is always first.
This pattern is often seen with other variables in Go (and other languages) as well. Any time a common variable is used by a set of related functions, that common variable often comes first in the argument list (or possibly second, after ctx).
Contrary to the advice you quoted, there are libraries that store ctx in a struct, rather than passing it around as the first argument. These are usually (always?) libraries which had to be retro-fitted to use ctx, long after the library contract was set in stone (by the Go 1.x compatibility guarantee).
Generally, you should follow the advice to pass ctx as the first argument, for any new work.

Pointers sent to function

I have following code in main():
msgs, err := ch.Consume(
q.Name, // queue
//..
)
cache := ttlru.New(100, ttlru.WithTTL(5 * time.Minute)) //Cache type
//log.Println(reflect.TypeOf(msgs)) 'chan amqp.Delivery'
go func() {
//here I use `cache` and `msgs` as closures. And it works fine.
}
I decided to create separate function for instead of anonymous.
I declared it as func hitCache(cache *ttlru.Cache, msgs *chan amqp.Delivery) {
I get compile exception:
./go_server.go:61: cannot use cache (type ttlru.Cache) as type *ttlru.Cache in argument to hitCache:
*ttlru.Cache is pointer to interface, not interface
./go_server.go:61: cannot use msgs (type <-chan amqp.Delivery) as type *chan amqp.Delivery in argument to hitCache
Question: How should I pass msg and cache into the new function?
Well, if the receiving variable or a function parameter expects a value
of type *T — that is, "a pointer to T",
and you have a variable of type T, to get a pointer to it,
you have to get the address of that variable.
That's because "a pointer" is a value holding an address.
The address-taking operator in Go is &, so you need something like
hitCache(&cache, &msgs)
But note that some types have so-called "reference semantics".
That is, values of them keep references to some "hidden" data structure.
That means when you copy such values, you're copying references which all reference the same data structure.
In Go, the built-in types maps, slices and channels have reference semantics,
and hence you almost never need to pass around pointers to the values of such types (well, sometimes it can be useful but not now).
Interfaces can be thought of to have reference semantics, too (let's not for now digress into discussing this) because each value of any interface type contains two pointers.
So, in your case it's better to merely not declare the formal parameters of your function as pointers — declare them as "plain" types and be done with it.
All in all, you should definitely complete some basic resource on Go which explains these basic matters in more detail and more extensively.
You're using pointers in the function signature but not passing pointers - which is fine; as noted in the comments, there is no reason to use pointers for interface or channel values. Just change the function signature to:
hitCache(cache ttlru.Cache, msgs chan amqp.Delivery)
And it should work fine.
Pointers to interfaces are nearly never used. You may simplify things and use interfaces of pass by value.

Why Map does not work for GString in Groovy?

With the following snippet I cannot retrieve gString from a map:
def contents = "contents"
def gString = "$contents"
def map = [(gString): true]
assert map.size() == 1 // Passes
assert gString.hashCode() == map.keySet().first().hashCode() // Passes, same hash code
assert map[gString] // Fails
How on earth is that possible?
Assertion message clearly shows that there's something seriously wrong with Groovy:
assert map[gString] // Fails
| ||
| |contents
| null
[contents:true]
It's not the same question as Why groovy does not see some values in dictionary?
First answer there suggests:
You're adding GString instances as keys in your map, then searching for them using String instances.
In this question I clearly add GString and try to retrieve GString.
Also neither Why are there different behaviors for the ways of addressing GString keys in maps? nor Groovy different results on using equals() and == on a GStringImpl have an answer for me. I do not mutate anything and I do not mix String with GString.
tl;dr: You seem to have discovered a bug in Groovy's runtime argument overloading evaluation.
Answer:
map[gString] is evaluated as map.getAt(gString) at runtime straightforwardly via Groovy's operator overloading mechanism. So far, so good, but now is where everything starts to go awry. The Java LinkedHashMap class does not have a getAt method anywhere in it's type hierarchy, so Groovy must use dynamically associated mixin methods instead (Actually that statement is sort of reversed. Groovy uses mixin methods before using the declared methods in the class hierarchy.)
So, to make a long story short, Groovy resolves map.getAt(gString) to use the category method DefaultGroovyMethods.getAt(). Easy-peasy, right? Except that this method has a large number of different argument overloads, several of which might apply, especially when you take Groovy's default argument coercion into account.
Unfortunately, instead of choosing DefaultGroovyMethods.getAt(Map<K,V>,K), which would seem to be a perfect match, Groovy chooses DefaultGroovyMethods.getAt(Object,String), which coerces the GString key argument into a String. Since the actual key is in fact a GString, the method ultimately fails to find the value.
To me the real killer is that if the argument overload resolution is performed directly from code (instead of after the operator resolution and the category method selection), then Groovy makes the right overload choice! That is to say, if you replace this expression:
map[gString]
with this expression:
DefaultGroovyMethods.getAt(map,gString)
then the argument overloading is resolved correctly, and the correct value is found and returned.
There's nothing wrong with Groovy. A GString is not a String. It is mutable and as such should never be used as a key in a map (like any other mutable object in Java).
Learn more about this in the docs: http://docs.groovy-lang.org/latest/html/documentation/index.html#_gstring_and_string_hashcodes

Find all imported interfaces that object supports

I have an object like os.Stdout and I want to know if it supports io.WriteCloser on my platform. I can get the type of my object, but it doesn't tell me anything about interfaces.
package main
import ("fmt"; "reflect"; "os")
func main() {
fmt.Println(reflect.TypeOf(os.Stdout))
}
This code prints *os.File to console.
I can manually lookup if os.File matches io.WriteCloser methods, but I am curious to get all interfaces that this object supports.
It's not an exactly answer on the question, because it is not for runtime. Anyway I think it maybe useful
Take a look on https://golang.org/lib/godoc/analysis/help.html
godoc has static analysis features. And it can display your type implements relations.
For example you can run godoc -http=:8081 -analysis=type and get all your packages documentation with type analysis.
To expand on the comment from #Volker regarding type assertions, that would look like this:
_, implements := interface{}(os.Stdout).(io.Reader)
It casts os.Stdout to an interface{} type and then attempts to assert that it is an io.Reader. Type assertions return two values; the first is the asserted value (or nil if assertion fails) and the second is a boolean indicating if the assertion was successful or not. If you omit capturing the second return value then a failed assertion will cause a panic.
For alternative, possibly more generic or runtime requirements the types package may have some useful functions based on reflection: https://godoc.org/golang.org/x/tools/go/types

How do generics (Vector) work inside the AVM?

Support for generics (currently only Vector.<*>, and called 'postfix type parameters' by Adobe) was added in Flash Player 10, but the only AVM2 documentation does not describe how these objects are accessed.
Specifically, I noticed a new opcode (0x53) and a new multiname kind (0x1D) that seem relevant, but their usage is not documented.
NB: This question was created with the answer already known as it is more easily found here than on my blog or the Adobe Bug DB.
The reverse engineering work I did on this did not include declaring your own generic types, though it's very likely possible.
References to the declaring (parameterless) generic type (Vector) are made through a regular qualified name (though any multiname should do).
References to a typed generic type (Vector.<int> as opposed to Vector.<>) are made by a new multiname kind (0x1D), which I call GenericName. GenericName has a format like so:
[Kind] [TypeDefinition] [ParamCount] [Param1] [Param2] [ParamN]
Where:
[TypeDefinition] is a U30 into the multiname table
[ParamCount] is a U8 (U30?) of how many type parameters there are
[ParamX] is a U30 into the multiname table.
Obviously generics are not generally supported yet, so ParamCount will always be 1 (for Vector.<*>).
The other interesting thing is how instances of the class are created. A new opcode was added in Flash 10 (0x53), which I will call MakeGenericType. MakeGenericType is declared with the following stack:
TypeDefinition, ParameterType1, ParameterTypeN -> GenericType
It also has one parameter, a U8 (U30?) specifying how many parameters are on the stack. You will generally see MakeGenericType being used like this:
GetLex [TypeDefinitionMultiname]
GetLex [ParameterTypeMultiname]
MakeGeneric [ParamCount]
Coerce [GenericNameMultiname]
Construct [ConstructorParamCount]
So if you had the following...
GetLex __AS3__.vec::Vector
GetLex int
MakeGeneric 1
Coerce __AS3__.vec::Vector.<int>
Construct 0
You would now have an instance of Vector.<int>

Resources