Reading from PubsubIO writing to DatastoreIO - google-cloud-datastore

Is it possible to create a pipeline that reads data from Pub/Sub and writes to Datastore? In my code I specify PubsubIO as the input and apply windowing to get a bounded PCollection, but it seems that DatastoreIO.writeTo cannot be used while options.setStreaming is true, even though streaming mode is required in order to use PubsubIO as input. Is there a way around this? Or is it simply not possible to read from Pub/Sub and write to Datastore?
Here's my code:
DataflowPipelineOptions options = PipelineOptionsFactory.create()
        .as(DataflowPipelineOptions.class);
options.setRunner(DataflowPipelineRunner.class);
options.setProject(projectName);
options.setStagingLocation("gs://my-staging-bucket/staging");
options.setStreaming(true);

Pipeline p = Pipeline.create(options);

PCollection<String> input = p.apply(
        PubsubIO.Read.topic("projects/" + projectName + "/topics/event-streaming"));

PCollection<String> inputWindow = input.apply(
        Window.<String>into(FixedWindows.of(Duration.standardSeconds(5)))
                .triggering(AfterPane.elementCountAtLeast(1))
                .discardingFiredPanes()
                .withAllowedLateness(Duration.standardHours(1)));

PCollection<String> inputDecode = inputWindow.apply(ParDo.of(new DoFn<String, String>() {
    private static final long serialVersionUID = 1L;

    public void processElement(ProcessContext c) {
        String msg = c.element();
        byte[] decoded = Base64.decodeBase64(msg.getBytes());
        String outmsg = new String(decoded);
        c.output(outmsg);
    }
}));

PCollection<DatastoreV1.Entity> inputEntity = inputDecode.apply(ParDo.of(new CreateEntityFn("stream", "events")));

inputEntity.apply(DatastoreIO.writeTo(datasetid));

p.run();
And this is the exception I get:
Exception in thread "main" java.lang.UnsupportedOperationException: The Write transform is not supported by the Dataflow streaming runner.
at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner$StreamingWrite.apply(DataflowPipelineRunner.java:488)
at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner$StreamingWrite.apply(DataflowPipelineRunner.java:480)
at com.google.cloud.dataflow.sdk.runners.PipelineRunner.apply(PipelineRunner.java:74)
at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner.apply(DataflowPipelineRunner.java:314)
at com.google.cloud.dataflow.sdk.Pipeline.applyInternal(Pipeline.java:358)
at com.google.cloud.dataflow.sdk.Pipeline.applyTransform(Pipeline.java:267)
at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner.apply(DataflowPipelineRunner.java:312)
at com.google.cloud.dataflow.sdk.Pipeline.applyInternal(Pipeline.java:358)
at com.google.cloud.dataflow.sdk.Pipeline.applyTransform(Pipeline.java:267)
at com.google.cloud.dataflow.sdk.values.PCollection.apply(PCollection.java:159)
at my.own.project.google.dataflow.EventStreamingDataflow.main(EventStreamingDataflow.java:104)

The DatastoreIO sink is not currently supported in the streaming runner. To write to Datastore from a streaming pipeline, you can make direct calls to the Datastore API from a DoFn.

Ok, after a lot of banging my head against the wall, I finally got it working. Like danielm suggested, I'm making calls to the Datastore API from a ParDo DoFn. One problem was that I didn't realize there is a separate API for using Cloud Datastore outside of App Engine (com.google.api.services.datastore... vs. com.google.appengine.api.datastore...). Another problem was that there is apparently some kind of bug in the latest version of the Cloud Datastore API (google-api-services-datastore-protobuf v1beta2-rev1-4.0.0; I got an IllegalAccessError), which I resolved by using an older version (v1beta2-rev1-2.1.2).
So, here's my working code:
import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.io.PubsubIO;
import com.google.cloud.dataflow.sdk.options.DataflowPipelineOptions;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner;
import com.google.cloud.dataflow.sdk.transforms.DoFn;
import com.google.cloud.dataflow.sdk.transforms.ParDo;
import com.google.cloud.dataflow.sdk.values.PCollection;
import com.google.api.services.datastore.DatastoreV1.*;
import com.google.api.services.datastore.client.Datastore;
import com.google.api.services.datastore.client.DatastoreException;
import com.google.api.services.datastore.client.DatastoreFactory;
import static com.google.api.services.datastore.client.DatastoreHelper.*;
import java.security.GeneralSecurityException;
import java.io.IOException;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
import org.json.simple.parser.ParseException;
//--------------------
public static void main(String[] args) {
    DataflowPipelineOptions options = PipelineOptionsFactory.create()
            .as(DataflowPipelineOptions.class);
    options.setRunner(DataflowPipelineRunner.class);
    options.setProject(projectName);
    options.setStagingLocation("gs://my-staging-bucket/staging");
    options.setStreaming(true);

    Pipeline p = Pipeline.create(options);

    PCollection<String> input = p.apply(
            PubsubIO.Read.topic("projects/" + projectName + "/topics/my-topic-name"));

    input.apply(ParDo.of(new DoFn<String, String>() {
        private static final long serialVersionUID = 1L;

        public void processElement(ProcessContext c) throws ParseException, DatastoreException {
            JSONObject json = (JSONObject) new JSONParser().parse(c.element());

            // Connect to Cloud Datastore through the standalone (non-App Engine) client
            Datastore datastore = null;
            try {
                datastore = DatastoreFactory.get().create(getOptionsFromEnv()
                        .dataset(datasetid).build());
            } catch (GeneralSecurityException exception) {
                System.err.println("Security error connecting to the datastore: " + exception.getMessage());
            } catch (IOException exception) {
                System.err.println("I/O error connecting to the datastore: " + exception.getMessage());
            }

            // Build the entity and insert it with an auto-allocated id
            Key.Builder keyBuilder = makeKey("my-kind");
            keyBuilder.getPartitionIdBuilder().setNamespace("my-namespace");
            Entity.Builder event = Entity.newBuilder()
                    .setKey(keyBuilder);
            event.addProperty(makeProperty("my-prop", makeValue((String) json.get("my-prop"))));

            CommitRequest commitRequest = CommitRequest.newBuilder()
                    .setMode(CommitRequest.Mode.NON_TRANSACTIONAL)
                    .setMutation(Mutation.newBuilder().addInsertAutoId(event))
                    .build();
            if (datastore != null) {
                datastore.commit(commitRequest);
            }
        }
    }));

    p.run();
}
And the dependencies in pom.xml:
<dependency>
    <groupId>com.google.cloud.dataflow</groupId>
    <artifactId>google-cloud-dataflow-java-sdk-all</artifactId>
    <version>[1.0.0,2.0.0)</version>
</dependency>
<dependency>
    <groupId>com.google.apis</groupId>
    <artifactId>google-api-services-datastore-protobuf</artifactId>
    <version>v1beta2-rev1-2.1.2</version>
</dependency>
<dependency>
    <groupId>com.google.http-client</groupId>
    <artifactId>google-http-client</artifactId>
    <version>1.17.0-rc</version>
</dependency>
<!-- Some more.. like JUnit etc.. -->

Related

openqa selenium Session Not Created Exception. Flutter Automation

I am trying to automate a Flutter APK by using ValueKey locators, with Appium and Flutter Finder for the automation. I used the following code to automate the APK.
package io.github.ashwith.flutter.example;

import java.net.MalformedURLException;
import java.net.URL;
import java.time.Duration;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.remote.DesiredCapabilities;
import org.openqa.selenium.remote.RemoteWebDriver;
import io.appium.java_client.android.AndroidDriver;
import io.github.ashwith.flutter.FlutterFinder;

public class Flutter_Finder {

    public static RemoteWebDriver driver;

    public static void main(String[] args) throws MalformedURLException {
        DesiredCapabilities capabilities = new DesiredCapabilities();
        capabilities.setCapability("deviceName", "Android");
        capabilities.setCapability("platformName", "Android");
        capabilities.setCapability("noReset", true);
        capabilities.setCapability("app", "E:\\Testsigma.apk");
        capabilities.setCapability("automationName", "flutter");

        driver = new AndroidDriver(new URL("http://localhost:4723/wd/hub"), capabilities);
        driver.manage().timeouts().implicitlyWait(Duration.ofSeconds(30));

        FlutterFinder finder = new FlutterFinder(driver);
        WebElement element = finder.byValueKey("incrementButton");
        element.click();
    }
}
When I try to run the code, I get the following error:
Exception in thread "main" org.openqa.selenium.SessionNotCreatedException:
Could not start a new session.
Response code 500.
Message: An unknown server-side error occurred while processing the command.
Original error: Cannot read property 'match' of undefined
I have used the following Appium Java client version as my dependency for this automation.
<dependency>
    <groupId>io.appium</groupId>
    <artifactId>java-client</artifactId>
    <version>8.3.0</version>
</dependency>
Please help me to resolve this error.
Thank you very much!

Google Vision API : java.lang.NoClassDefFoundError: com/google/cloud/vision/v1/ImageAnnotatorClient ERROR

I'm trying to run the Google Vision API sample code but I'm getting this error:
java.lang.NoClassDefFoundError: com/google/cloud/vision/v1/ImageAnnotatorClient
These are the dependencies imported into my project.
<dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-vision</artifactId>
    <version>1.74.0</version>
</dependency>
<dependency>
    <groupId>com.google.api.grpc</groupId>
    <artifactId>proto-google-common-protos</artifactId>
    <version>1.7.0</version>
</dependency>
<dependency>
    <groupId>com.google.code.findbugs</groupId>
    <artifactId>jsr305</artifactId>
    <version>3.0.2</version>
</dependency>
<dependency>
    <groupId>javax.annotation</groupId>
    <artifactId>javax.annotation-api</artifactId>
    <version>1.3.2</version>
</dependency>
This is the code I'm using, which is provided in the Google Vision API documentation: https://cloud.google.com/vision/docs/libraries
package com.google.cloud.vision.api.utils;

// Imports the Google Cloud client library
import com.google.cloud.vision.v1.AnnotateImageRequest;
import com.google.cloud.vision.v1.AnnotateImageResponse;
import com.google.cloud.vision.v1.BatchAnnotateImagesResponse;
import com.google.cloud.vision.v1.EntityAnnotation;
import com.google.cloud.vision.v1.Feature;
import com.google.cloud.vision.v1.Feature.Type;
import com.google.cloud.vision.v1.Image;
import com.google.cloud.vision.v1.ImageAnnotatorClient;
import com.google.protobuf.ByteString;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class QuickstartSample {

    public static void main(String... args) throws Exception {
        // Instantiates a client
        try (ImageAnnotatorClient vision = ImageAnnotatorClient.create()) {
            // The path to the image file to annotate
            String fileName = "/content/dam/USGBoral/Australia/Website/Images/products/steel_framing/SteelFraming-335x135_en.jpg";

            // Reads the image file into memory
            Path path = Paths.get(fileName);
            byte[] data = Files.readAllBytes(path);
            ByteString imgBytes = ByteString.copyFrom(data);

            // Builds the image annotation request
            List<AnnotateImageRequest> requests = new ArrayList<>();
            Image img = Image.newBuilder().setContent(imgBytes).build();
            Feature feat = Feature.newBuilder().setType(Type.LABEL_DETECTION).build();
            AnnotateImageRequest request = AnnotateImageRequest.newBuilder()
                    .addFeatures(feat)
                    .setImage(img)
                    .build();
            requests.add(request);

            // Performs label detection on the image file
            BatchAnnotateImagesResponse response = vision.batchAnnotateImages(requests);
            List<AnnotateImageResponse> responses = response.getResponsesList();

            for (AnnotateImageResponse res : responses) {
                if (res.hasError()) {
                    System.out.printf("Error: %s\n", res.getError().getMessage());
                    return;
                }
                for (EntityAnnotation annotation : res.getLabelAnnotationsList()) {
                    annotation.getAllFields()
                            .forEach((k, v) -> System.out.printf("%s : %s\n", k, v.toString()));
                }
            }
        }
    }
}
Use the package Google.Cloud.Vision.V1, or you might also want to check the correct client library to use depending on your framework.

Infinispan cluster with Karaf instances

We are very new to Infinispan and also quite new to Apache Karaf. Installing Infinispan in Karaf was easy; we wrote two OSGi bundles to form a cluster with two nodes that run on one host. We followed the distributed cache tutorial from the Infinispan website. Unfortunately the cluster does not seem to be formed and we can't determine why. Any help or push in the right direction would be very appreciated.
The code of the bundle that writes something into the cache looks like this:
import org.infinispan.Cache;
import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.global.GlobalConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;
import org.infinispan.context.Flag;
import org.osgi.framework.BundleActivator;
import org.osgi.framework.BundleContext;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class CacheProducer implements BundleActivator {

    private static Logger LOG = LoggerFactory.getLogger(CacheProducer.class);
    private static DefaultCacheManager cacheManager;

    @Override
    public void start(BundleContext context) throws Exception {
        LOG.info("Start Producer");

        GlobalConfigurationBuilder global = GlobalConfigurationBuilder.defaultClusteredBuilder();
        global.transport().clusterName("ClusterTest");

        // Make the default cache a distributed synchronous one
        ConfigurationBuilder builder = new ConfigurationBuilder();
        builder.clustering().cacheMode(CacheMode.DIST_SYNC);

        // Initialize the cache manager
        cacheManager = new DefaultCacheManager(global.build(), builder.build());

        // Obtain the default cache
        Cache<String, String> cache = cacheManager.getCache();
        cache.put("message", "Hello World!");

        LOG.info("Producer: whole cluster content!");
        cache.entrySet().forEach(entry -> LOG.info(entry.getKey() + ": " + entry.getValue()));

        LOG.info("Producer: current cache content!");
        cache.getAdvancedCache().withFlags(Flag.SKIP_REMOTE_LOOKUP)
                .entrySet().forEach(entry -> LOG.info(entry.getKey() + ": " + entry.getValue()));
    }

    @Override
    public void stop(BundleContext context) throws Exception {
        cacheManager.stop();
    }
}
And the bundle that tries to print out what is in the cache looks like this:
package metdoc81.listener;

import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.global.GlobalConfigurationBuilder;
import org.osgi.framework.BundleActivator;
import org.osgi.framework.BundleContext;
import org.infinispan.Cache;
import org.infinispan.manager.DefaultCacheManager;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Activator implements BundleActivator {

    private static Logger LOG = LoggerFactory.getLogger(Activator.class);
    private static DefaultCacheManager cacheManager;

    public void start(BundleContext bundleContext) throws Exception {
        LOG.info("start cluster listener");

        GlobalConfigurationBuilder global = GlobalConfigurationBuilder.defaultClusteredBuilder();
        global.transport().clusterName("ClusterTest");

        // Make the default cache a distributed synchronous one
        ConfigurationBuilder builder = new ConfigurationBuilder();
        builder.clustering().cacheMode(CacheMode.DIST_SYNC);

        // Initialize the cache manager
        cacheManager = new DefaultCacheManager(global.build(), builder.build());

        // Obtain the default cache
        Cache<String, String> cache = cacheManager.getCache();
        LOG.info("After configuration");
        cache.entrySet().forEach(entry -> LOG.info(entry.getKey() + ": " + entry.getValue()));
        LOG.info("After logging");
    }

    public void stop(BundleContext bundleContext) throws Exception {
    }
}
The printing from the CacheProducer works, printing from the Listener does not.
We found the solution ourselves.
The problem only occurs when you run the code on macOS; on Windows it works. According to a discussion at JBossDeveloper there was a problem with multicast routing on macOS. Even though they added a workaround into the example code, you still have to add the -Djava.net.preferIPv4Stack=true flag when running it, or add these two lines of code:
Properties properties = System.getProperties();
properties.setProperty( "java.net.preferIPv4Stack", "true" );
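For instance, in the bundle activators above the property could be set at the top of start(), before the cache manager is built (a minimal sketch; the exact placement is an assumption and java.util.Properties must be imported):
public void start(BundleContext bundleContext) throws Exception {
    // Workaround for multicast routing on macOS: force IPv4 before JGroups starts
    Properties properties = System.getProperties();
    properties.setProperty("java.net.preferIPv4Stack", "true");

    GlobalConfigurationBuilder global = GlobalConfigurationBuilder.defaultClusteredBuilder();
    global.transport().clusterName("ClusterTest");
    // ... configure and create the DefaultCacheManager as shown earlier
}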

Printing method arguments using Byte Buddy API

I am working on a project where I need to access method arguments during execution.
Is it possible to print method arguments using the Byte Buddy framework? Any sample code for this using a javaagent is highly appreciated.
Yes, this is possible. You can use MethodDelegation or Advice to inject your code and then use the @AllArguments annotation to get hold of the actual arguments.
The question is how you apply your code in your project: you can either use a Java agent with the AgentBuilder or create proxy subclasses using ByteBuddy instances. Refer to the documentation and the javadoc of the mentioned classes to find out how this is done.
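As a minimal sketch of the Advice route mentioned above (the com.example package and class names are placeholders, and this is an untested outline rather than code from either answer; it assumes the target classes are loaded after the agent is installed):
package com.example;

import net.bytebuddy.agent.ByteBuddyAgent;
import net.bytebuddy.agent.builder.AgentBuilder;
import net.bytebuddy.asm.Advice;
import net.bytebuddy.matcher.ElementMatchers;
import java.lang.instrument.Instrumentation;
import java.util.Arrays;

public class ArgumentLoggingAgent {

    public static void main(String[] args) {
        premain(ByteBuddyAgent.install());
        new SampleClass().greet("world", 42); // loaded and instrumented after install
    }

    public static void premain(Instrumentation instrumentation) {
        new AgentBuilder.Default()
                .type(ElementMatchers.nameStartsWith("com.example")) // placeholder package
                .transform((builder, type, classLoader, module) ->
                        builder.visit(Advice.to(ArgumentLoggingAdvice.class).on(ElementMatchers.isMethod())))
                .installOn(instrumentation);
    }

    public static class ArgumentLoggingAdvice {
        @Advice.OnMethodEnter
        public static void enter(@Advice.Origin String method, @Advice.AllArguments Object[] args) {
            // Inlined at the start of every matched method
            System.out.println(method + " called with " + Arrays.toString(args));
        }
    }

    public static class SampleClass {
        void greet(String name, int count) { /* ... */ }
    }
}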
Here is an example of how this can be implemented using MethodDelegation. I use it to measure the execution time of methods. I deliberately did not remove the extra code, because I want to show the capabilities of Byte Buddy more fully.
package md.leonis.shingler;

import net.bytebuddy.agent.ByteBuddyAgent;
import net.bytebuddy.agent.builder.AgentBuilder;
import net.bytebuddy.implementation.MethodDelegation;
import net.bytebuddy.implementation.bind.annotation.AllArguments;
import net.bytebuddy.implementation.bind.annotation.Origin;
import net.bytebuddy.implementation.bind.annotation.RuntimeType;
import net.bytebuddy.implementation.bind.annotation.SuperCall;
import net.bytebuddy.matcher.ElementMatchers;
import java.lang.instrument.Instrumentation;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.concurrent.Callable;
import java.util.stream.Collectors;

public class MeasureMethodTest {

    public static void main(String[] args) throws InterruptedException {
        premain(ByteBuddyAgent.install());
        for (int i = 0; i < 4; i++) {
            SampleClass.foo("arg" + i);
        }
    }

    public static void premain(Instrumentation instrumentation) {
        new AgentBuilder.Default()
                .type(ElementMatchers.nameStartsWith("md.leonis.shingler"))
                .transform((builder, type, classLoader, module) ->
                        builder.method(ElementMatchers.any()).intercept(MethodDelegation.to(AccessInterceptor.class))
                ).installOn(instrumentation);
    }

    public static class AccessInterceptor {

        @RuntimeType
        public static Object intercept(@Origin Method method, @SuperCall Callable<?> callable, @AllArguments Object[] args) throws Exception {
            long start = System.nanoTime();
            try {
                return callable.call();
            } finally {
                if (method.getAnnotationsByType(Measured.class).length > 0) {
                    String params = Arrays.stream(args).map(Object::toString).collect(Collectors.joining(", "));
                    System.out.println(method.getReturnType().getSimpleName() + " " + method.getName() + "(" + params + ") took " + ((System.nanoTime() - start) / 1000000) + " ms");
                }
            }
        }
    }

    public static class SampleClass {

        @Measured
        static void foo(String s) throws InterruptedException {
            Thread.sleep(50);
        }
    }
}
This example measures the execution time of all methods found in the md.leonis.shingler package and marked with the @Measured annotation.
To run it, you need two libraries: byte-buddy and byte-buddy-agent.
The result of work:
void foo(arg0) took 95 ms
void foo(arg1) took 50 ms
void foo(arg2) took 50 ms
void foo(arg3) took 50 ms
Note that the console displays the values of all arguments passed to the method. This is the answer to the question asked.
Here is the annotation example:
package md.leonis.shingler;

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface Measured {
}
To be honest, I was not able to directly configure filtering by annotations in the Agent. Here is an example (not working):
new AgentBuilder.Default()
        .type(ElementMatchers.isAnnotatedWith(Measured.class))
        .transform((builder, type, classLoader, module) ->
                builder.method(ElementMatchers.any()).intercept(MethodDelegation.to(AccessInterceptor.class))
        ).installOn(instrumentation);
If someone knows how to do this, please comment below.
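One possibility worth trying (an untested sketch, not part of the original answer) is to match at the type level on types that declare a @Measured method, and narrow the delegation to the annotated methods instead of all methods:
new AgentBuilder.Default()
        // match any type that declares at least one @Measured method
        .type(ElementMatchers.declaresMethod(ElementMatchers.isAnnotatedWith(Measured.class)))
        .transform((builder, type, classLoader, module) ->
                // delegate only the annotated methods to the interceptor
                builder.method(ElementMatchers.isAnnotatedWith(Measured.class))
                        .intercept(MethodDelegation.to(AccessInterceptor.class))
        ).installOn(instrumentation);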

Glassfish 3.1.2 seems not to run more than one thread in an EJB

I have an EJB that calculates something with increasing precision for as long as the calculation runs.
One asynchronous method starts the calculation and another asynchronous method should stop it. But running on GlassFish 3.1.2, calling stopCalculating() does not run in a new thread; instead it waits until startCalculating() finishes, which obviously never happens.
import java.util.concurrent.Future;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.ejb.AsyncResult;
import javax.ejb.Asynchronous;
import javax.ejb.Singleton;

@Singleton
public class Calculator {

    private boolean calculating = false;
    private String result = "Empty";

    @Asynchronous
    public void startCalculating() {
        calculating = true;
        Logger.getGlobal().log(Level.INFO, "Starting!");
        calculate();
    }

    private void calculate() {
        result = "";
        while (calculating) {
            /* Calculate and update result */
            Logger.getGlobal().log(Level.INFO, "Calculate...");
        }
    }

    @Asynchronous
    public Future<String> stopCalculating() {
        Logger.getGlobal().log(Level.INFO, "Stopping!");
        calculating = false;
        return new AsyncResult<String>(result);
    }
}
How can I get GlassFish to run stopCalculating() in another thread?
I think it's because by default a Singleton EJB has @Lock(WRITE) access.
You're using local calls instead of going through the EJB proxy, so calculate() invoked from startCalculating(-) is part of the whole startCalculating(-) invocation (and so runs under @Lock(WRITE) as well).
I'd try adding @Lock(READ) to your calculate() method and changing the local call to a business call, or just set @Lock(READ) for your EJB and give it a shot.
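A minimal sketch of the class-level @Lock(READ) variant suggested above (marking the fields volatile is an extra assumption here, for visibility between the two concurrent calls; it is not part of the original answer):
import java.util.concurrent.Future;
import javax.ejb.AsyncResult;
import javax.ejb.Asynchronous;
import javax.ejb.Lock;
import javax.ejb.LockType;
import javax.ejb.Singleton;

@Singleton
@Lock(LockType.READ) // allow startCalculating() and stopCalculating() to run concurrently
public class Calculator {

    private volatile boolean calculating = false; // volatile so the stop flag is seen across threads
    private volatile String result = "Empty";

    @Asynchronous
    public void startCalculating() {
        calculating = true;
        while (calculating) {
            /* calculate and update result */
        }
    }

    @Asynchronous
    public Future<String> stopCalculating() {
        calculating = false;
        return new AsyncResult<String>(result);
    }
}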
