Can we use standalone Spring Cloud Schema Registry with Confluent's KafkaAvroSerializer? - spring-kafka

I have a project using Spring Cloud Stream with the Kafka Streams binder. For the output of a stream, I am using Avro, with the Serde provided by Confluent (io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde).
I am able to use it with the Confluent Schema Registry; serialization and deserialization take place correctly.
However, I wanted to see whether we can use the Spring Cloud Schema Registry Server instead of the Confluent one. I configured a standalone Schema Registry server and pointed my project at it (changed the schemaRegistryClient.endpoint and schema.registry.url properties).
When I tried it out, Spring Cloud seems able to work with the standalone server: it registers the schema that is available in the resources folder as a .avsc file. However, when I send a message, the Confluent serializer continues to treat it as a Confluent Schema Registry (which has different REST endpoints from the Spring Schema Registry). As a result, it gets a 405 response code.
We get the following exception (partial stack trace):
org.apache.kafka.common.errors.SerializationException: Error registering Avro schema: <my-avro-schema>
Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Unexpected character ('<' (code 60)): expected a valid value (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
at [Source: (sun.net.www.protocol.http.HttpURLConnection$HttpInputStream); line: 1, column: 2]; error code: 50005
at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:230)
It seems to me that there are two possibilities:
The Spring Cloud Schema Registry Server can work only with the content type provided by Spring (specified as content-type: application/*+avro) and not with the native Serde provided by Confluent, or
There is an issue with the project configuration.
Can someone help me figure out which one it is? If it is the second, can someone point out what is wrong?

Each schema registry provider requires its own proprietary SerDe library. For example, if you would like to integrate the AWS Glue Schema Registry with Kafka, you would need Amazon's SerDe library. In the same way, Confluent's SerDe library expects Confluent's Schema Registry at the address specified in the schema.registry.url property, and it speaks Confluent's REST protocol against that address, which is why it fails when pointed at the Spring Cloud Schema Registry Server.
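To make that concrete, here is a minimal sketch (not taken from the question; MyAvroRecord and the URL are placeholders) of where the registry address enters the picture. The Confluent Serde only knows how to make Confluent-style REST calls to whatever is listening at schema.registry.url, so a Spring Cloud Schema Registry server at that address will answer with the errors shown above.
import io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde;
import java.util.HashMap;
import java.util.Map;
// MyAvroRecord stands in for your generated Avro class.
Map<String, Object> serdeConfig = new HashMap<>();
// Must point at a Confluent-compatible registry; a Spring Cloud Schema Registry
// server does not expose the REST endpoints the Serde calls against this URL.
serdeConfig.put("schema.registry.url", "http://localhost:8081");
SpecificAvroSerde<MyAvroRecord> valueSerde = new SpecificAvroSerde<>();
valueSerde.configure(serdeConfig, false); // false = this Serde is used for record values
If you want to stay with the Spring Cloud Schema Registry server, the registry interaction has to move out of the Confluent Serde and into Spring's own Avro message conversion (the application/*+avro content type you mention), rather than pointing the Confluent Serde at the Spring server.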

Related

Confused about health checking protocol

I have read the documentation, source code and issue below:
https://github.com/grpc/grpc/blob/master/doc/health-checking.md
https://github.com/grpc/grpc-node/blob/master/packages/grpc-health-check/test/health_test.js
https://github.com/grpc/grpc/issues/10428
Let me provide an example and try to explain:
// Import package
let health = require('grpc-health-check');
// Define service status map. Key is the service name, value is the corresponding status.
// By convention, the empty string "" key represents the status of the entire server.
const statusMap = {
  "ServiceFoo": proto.grpc.health.v1.HealthCheckResponse.ServingStatus.SERVING,
  "ServiceBar": proto.grpc.health.v1.HealthCheckResponse.ServingStatus.NOT_SERVING,
  "": proto.grpc.health.v1.HealthCheckResponse.ServingStatus.NOT_SERVING,
};
// Construct the service implementation
let healthImpl = new health.Implementation(statusMap);
// Add the service and implementation to your pre-existing gRPC-node server
server.addService(health.service, healthImpl);
I am not clear about the following points:
Does the service name in statusMap need to be the same as the service name in the protocol buffers file, or can it be specified arbitrarily? If it can be arbitrary, how does the service name map to the service defined in the protocol buffers?
From the health checking protocol:
The server should register all the services manually and set the individual status
Why do we need to register manually? If the service code can be generated, why doesn't gRPC automatically register the service names in statusMap for us? (Imagine setting the status of 100 services one by one.)
The service status is hard-coded and cannot be changed at application runtime. If my service becomes unavailable at runtime for some reason, such as misconfiguration or a downstream service being unavailable, the status is still reported as serving because it is hard-coded. If so, what is the point of the health check?
For a RESTful API, we can provide a /health-check or /ping endpoint to check that the entire server is running normally.
Regarding the service names, the first linked document says this:
The suggested format of service name is package_names.ServiceName, such as grpc.health.v1.Health.
This does correspond to the package names and service name defined in the Protobuf definition.
The services need to be registered "manually" because the status is determined at the application level, which the grpc library does not know about, and a registered service name is only meaningful along with the corresponding status. In addition, the naming format mentioned above is just a convention; the health check service user is not constrained to it, and the actual services on the server are not constrained to use the standard /package_names.ServiceName/MethodName method naming scheme either.
Regarding the third point, the service status should not be hardcoded, and can be changed at runtime. The HealthImplementation class used in the code in the question has a setStatus method that can be used to update the status.
Also, as mentioned in a comment in the code in the question,
By convention, the empty string "" key represents the status of the entire server.
That can be used as the equivalent of the /health-check or /ping REST APIs.
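For illustration only, here is roughly what the same pattern looks like with grpc-java's HealthStatusManager (from the grpc-services artifact); the grpc-node Implementation class in the question exposes an analogous setStatus method for updating the status at runtime:
import io.grpc.Server;
import io.grpc.ServerBuilder;
import io.grpc.health.v1.HealthCheckResponse.ServingStatus;
import io.grpc.protobuf.services.HealthStatusManager;
HealthStatusManager health = new HealthStatusManager();
Server server = ServerBuilder.forPort(50051)
        .addService(health.getHealthService())
        .build()
        .start();
// Flip statuses at runtime, e.g. after a downstream dependency check fails.
health.setStatus("ServiceFoo", ServingStatus.NOT_SERVING);
// The empty-string service name covers the server as a whole.
health.setStatus("", ServingStatus.NOT_SERVING);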

How to create/simulate a deserialization exception when using a schema registry (in addition to brokers and ZooKeeper)

We are using spring-kafka 2.2.7.RELEASE to produce and consume Avro messages, with a schema registry for schema validation and 'FORWARD_TRANSITIVE' as the compatibility type. I am now trying to use ErrorHandlingDeserializer2 from spring-kafka to handle the exception/error when a deserializer fails to deserialize a message, and I want to write a component test for this configuration. My component test is expected to have the steps below:
Spin up a local Kafka cluster using Docker containers
Send an Avro message (using KafkaTemplate) with an invalid schema to a test topic, to re-create/simulate the deserialization exception
What is happening is that, since we have the schema registry in place, when I send a message with a new (invalid) schema, the registry validates it against the configured compatibility type and does not let me produce the message onto Kafka; an exception is thrown at the producer level itself.
My question is: in this scenario, how can I create/simulate a deserialization exception to test my configuration? Please suggest.
Note: I don't want to disable/stop the schema registry, because that would not reflect our prod setup.
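One way to trigger the deserialization failure without touching the registry is to bypass the Avro serializer entirely on the producer side and write raw bytes that are not a valid Confluent Avro payload; the consumer's Avro deserializer (wrapped in ErrorHandlingDeserializer2) then fails on that record. A rough sketch, with the broker address and topic name as placeholders:
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;
// Publish a "poison pill" record; the schema registry is never consulted because
// no Avro serializer is involved on the producer side.
Map<String, Object> props = new HashMap<>();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
    byte[] poisonPill = "not-an-avro-payload".getBytes(StandardCharsets.UTF_8);
    producer.send(new ProducerRecord<>("test-topic", poisonPill)).get(); // block until sent
}
The same thing can be done through a KafkaTemplate configured with ByteArraySerializer if you prefer to stay inside spring-kafka for the test.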

Why set up Serialization Context in Corda off node?

Recently, when I wanted to sign something with a certificate outside a node, I got the exception below:
Caused by: java.lang.IllegalStateException: Expected exactly 1 of
{nodeSerializationEnv, globalSerializationEnv,
contextSerializationEnv, inheritableContextSerializationEnv} but got:
{}
https://github.com/corda/corda/blob/671a9d232cf1f29dbce4432bc91096ffd098a91c/core/src/main/kotlin/net/corda/core/serialization/internal/SerializationEnvironment.kt#L91
On debugging, I found that the data is first serialised and then signed, so I had to set up the serialisation context to get it serialised and signed.
I have a limited understanding of why this is required. I understand that different contexts are required for P2P and RPC calls, but I'm not entirely sure. Can someone please fill me in with some background?
The internal library you are using to sign the certificate requires the certificate to be serialised first. In turn, this requires you to specify a serialisation context. A serialisation context defines how serialisation is performed in various situations, such as P2P, client-side RPC, server-side RPC, storage and checkpointing.
Note that these serialisation contexts are set for you automatically when running a node or a suite of node tests. You are only encountering this issue because you are using an internal library outside the context where it is expected to be used.
In your case, you should probably use globalSerializationEnv, which is the serialisation environment used for mock nodes and nodes created using the node driver. nodeSerializationEnv is used by the node itself, and contextSerializationEnv and inheritableContextSerializationEnv are used for various platform tests.
For educational purposes, it can be helpful to look at how the node sets up its serialisation framework when it starts up (see https://github.com/corda/corda/blob/release-V3/node/src/main/kotlin/net/corda/node/internal/Node.kt):
nodeSerializationEnv = SerializationEnvironmentImpl(
        SerializationFactoryImpl().apply {
            registerScheme(KryoServerSerializationScheme())
            registerScheme(AMQPServerSerializationScheme(cordappLoader.cordapps))
        },
        p2pContext = AMQP_P2P_CONTEXT.withClassLoader(classloader),
        rpcServerContext = KRYO_RPC_SERVER_CONTEXT.withClassLoader(classloader),
        storageContext = AMQP_STORAGE_CONTEXT.withClassLoader(classloader),
        checkpointContext = KRYO_CHECKPOINT_CONTEXT.withClassLoader(classloader))

Querying Titan ElasticSearch backend via Rexster

I have Titan 0.3.2 running in embedded mode, and have been able to create and query ElasticSearch indexes via the Gremlin shell (see previous question). I am using the default configuration, which calls the ES index "search".
These searches return the correct nodes without error via the Gremlin shell:
g.query().has('my_label','abc').vertices()
g.query().has('my_label',CONTAINS,'abc').vertices()
However, if I attempt to run these same Gremlin queries via RexPro, Rexster sends back this error for the first query above:
java.util.concurrent.ExecutionException:
javax.script.ScriptException:
javax.script.ScriptException:
java.lang.IllegalArgumentException: Index is unknown or not configured: search
and this for the second:
java.util.concurrent.ExecutionException:
javax.script.ScriptException:
javax.script.ScriptException:
groovy.lang.MissingPropertyException: No such property: CONTAINS for class: Script3
Similarly, if I try to query on the indexed key via the REST API (GET):
http://localhost:8182/graphs/graph/vertices?key=my_key&value=abc
I receive the same error response:
{"message":"Index is unknown or not configured: search","error":"Index is unknown or not configured: search"}
Lastly, if I try to start with a clean database and run the index creation script through rexpro:
g.makeType().name("my_label").dataType(String.class).indexed("search", Vertex.class).unique(Direction.OUT).makePropertyKey();
I see the same unknown index error:
java.util.concurrent.ExecutionException:
javax.script.ScriptException:
javax.script.ScriptException:
java.lang.IllegalArgumentException: Index is unknown or not configured: search
So it seems that Rexster needs some additional information about the indexing backend, possibly in its configuration file (I am using the default one included with the installation). Anyone familiar with this issue? Happy to provide more information.
You might have a mix of things going on, but the main thing is that as of Titan 0.3.2, Rexster does not automatically import Titan classes, so you can't query with CONTAINS. You need to specify the full package name when doing so:
com.thinkaurelius.titan.core.attribute.Text.CONTAINS
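With the fully qualified name, the second query from the question would look like this:
g.query().has('my_label', com.thinkaurelius.titan.core.attribute.Text.CONTAINS, 'abc').vertices()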
I can't say for sure what else is wrong, but it looks like the Elasticsearch index is not configured properly. That doesn't have much to do with the Rexster configuration file for Titan Server; it has more to do with the second argument you pass to titan.sh, which contains the Titan configuration. Make sure that Elasticsearch is configured as it is in this file (shipped with the installation): https://github.com/thinkaurelius/titan/blob/master/config/titan-server-cassandra-es.properties

BizTalk error Schema Not Found

I'm receiving an HL7 2.3 ORU message. I've configured the appropriate party to use a schema namespace of "http://mycompany.ca/application/HL7/2X/2.3/1".
I've built my custom HL7 schema, set the targetNamespace to "http://mycompany.ca/application/HL7/2X/2.3/1", and ensured it has a root element of "ORU_R01_23_GLO_DEF".
I've deployed the schema to BizTalk by importing it and then running the MSI.
I can see that my BizTalk application has the schema in it, and I can see that the MSI installed the schema on the drive.
When I send an HL7 message to my receive location, I get an error in the event log:
Error happened in body during parsing
Error # 1
Alternate Error Number: 301
Alternate Error Description: Schema http://mycompany.ca/application/HL7/2X/2.3/1#ORU_R01_23_GLO_DEF not found
Alternate Encoding System: HL7-BTA
From this, I can tell that party resolution worked correctly, but I can't figure out why it cannot find the schema.
