Simple Google Cloud Dataflow pipeline:
PubsubIO.readStrings().fromSubscription -> Window -> ParDo -> DatastoreIO.v1().write()
When load is applied to the Pub/Sub topic, the messages are read but not acked:
Jul 25, 2017 4:20:38 PM org.apache.beam.sdk.io.gcp.pubsub.PubsubUnboundedSource$PubsubReader stats
INFO: Pubsub projects/my-project/subscriptions/my-subscription has 1000 received messages, 950 current unread messages, 843346 current unread bytes, 970 current in-flight msgs, 28367ms oldest in-flight, 1 current in-flight checkpoints, 2 max in-flight checkpoints, 770B/s recent read, 1000 recent received, 0 recent extended, 0 recent late extended, 50 recent ACKed, 990 recent NACKed, 0 recent expired, 898ms recent message timestamp skew, 9224873061464212ms recent watermark skew, 0 recent late messages, 2017-07-25T23:16:49.437Z last reported watermark
What pipeline step should ack the messages?
The Stackdriver dashboard shows that there are some acks, but the number of unacked messages stays stable.
There are no error messages in the trace indicating that message processing failed.
Entries do show up in Datastore.
Dataflow will only acknowledge Pub/Sub messages after they are durably committed somewhere else. In a pipeline that consists of Pub/Sub -> ParDo -> one or more sinks, this may be delayed by any of the sinks having problems (even if they are being retried, that will slow things down). This is part of ensuring that results appear to be processed effectively once. See a previous question about when Dataflow acknowledges a message for more details.
One (easy) option to change this behavior is to add a GroupByKey (using a randomly generated key) after the Pub/Sub source and before the sinks. This will cause the messages to be acknowledged earlier, but it may perform worse, since Pub/Sub is generally better at holding unprocessed inputs than the GroupByKey is.
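For illustration, a rough sketch of that change in the Beam Java SDK, assuming `windowed` is the PCollection<String> coming out of your existing Window step (the variable name and the key range of 1024 are placeholders, not from your pipeline):
windowed
    // Attach a randomly generated key so the grouping spreads elements across workers.
    .apply(WithKeys.of(
            (SerializableFunction<String, Integer>)
                msg -> ThreadLocalRandom.current().nextInt(1024))
        .withKeyType(TypeDescriptors.integers()))
    // The GroupByKey durably commits its input; once that commit succeeds, the
    // Pub/Sub messages can be acked even while the Datastore sink is still retrying.
    .apply(GroupByKey.create())
    // Unwrap the grouped values and continue with the original ParDo and
    // DatastoreIO.v1().write() steps.
    .apply(Values.create())
    .apply(Flatten.iterables());
Recent Beam releases also bundle this exact pattern as Reshuffle.viaRandomKey(), if you prefer not to write the keying by hand.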
I am using these SQSSensor settings to poll messages:
fetch_sqs_message = SQSSensor(
    task_id="...",
    sqs_queue="...",
    aws_conn_id="aws_default",
    max_messages=10,
    wait_time_seconds=30,
    poke_interval=60,
    timeout=300,
    dag=dag
)
I would assume that every time it polls, it should pull up to 10 messages; my queue had around 5 messages when I tested this.
But each time I trigger the DAG, it only polls 1 message at a time, which I can see from the SQS message count.
Why is it doing this? How can I get it to poll as many messages as possible?
Recently, a new feature was added to SQSSensor so that the sensor can poll SQS multiple times instead of only once.
You can check out this merged PR.
For example, if num_batches is set to 3, SQSSensor will poll the queue 3 times before returning the results.
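A sketch of the sensor with that parameter (assuming a provider version that includes it; the task id is a placeholder):
from airflow.providers.amazon.aws.sensors.sqs import SqsSensor  # named SQSSensor in older provider releases

fetch_sqs_messages = SqsSensor(
    task_id="fetch_sqs_messages",
    sqs_queue="...",
    aws_conn_id="aws_default",
    max_messages=10,        # up to 10 messages per ReceiveMessage call
    num_batches=3,          # poll the queue 3 times before returning the results
    wait_time_seconds=20,   # SQS long polling allows at most 20 seconds
    poke_interval=60,
    timeout=300,
    dag=dag,
)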
Disclaimer: I contributed to this feature.
I am using Alertatron to send manual alerts to the Bybit testnet exchange. I am getting the following error log. Please let me know what the issue is.
======error start==
[v283, bybit, jothibybit, EOSUSD] ::: market(side=buy, amount=1);
Script market v1.0.0, by Alertatron
using buy offset of 0 from 2.884 (current price) --> 2.884
[bybit, jothibybit, EOSUSD] Executing market order to buy 1.
Not enough margin to cover order that size [post /v2/private/order/create] - status code: 30031
Session 40b71524 has no more commands to process on bybit (jothibybit), EOSUSD - waiting for background processes...
bybit : jothibybit : No active background processes for EOSUSD. Done.
Session 40b71524 finished waiting for related background tasks
Request to close exchange connection bybit, jothibybit. Not being used any more.
Bybit closed
Bot entering idle state - updating to latest release.
===end log
My code:
jothibybit(EOSUSD) {
  exchangeSettings(leverage=cross);
  cancel(which=all);
  market(side=buy, amount=1);
}
#bot
Generally, that error means that you do not have enough money in your Bybit account to execute the trade.
The reason can be that you simply do not have enough funds, in which case you need to reduce the trade size, or that you are trying to execute the trade from a subaccount, which sometimes does not work properly. If you are using a subaccount API key on Bybit, try executing the trades using a main account API key.
In either case, double-check in Alertatron that you have access to the Bybit balance by running a manual alert:
https://alertatron.com/docs/automated-trading/balance
While you are running that test, open the "live bot output" in Alertatron to see the response you get from Bybit.
If that works, try to execute an example trade:
https://alertatron.com/docs/automated-trading/api-keys-bybit
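For reference, the balance check mentioned above can be sent as a manual alert along these lines (a sketch assuming the balance() command described in the first link; the API key name jothibybit and the EOSUSD symbol are copied from your script):
jothibybit(EOSUSD) {
  balance();
}

#bot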
I have a queue in NiFi that contains items to be processed through an API call (InvokeHTTP). These items can be processed successfully and return the data correctly (status 200), they can come back as not found (status 404), or they can fail (status 500).
However, with statuses 404 and 500, false negatives can happen: if I query the same data that returned an error again, it comes back with status 200. But there are also cases where there really is a failure and it is not a false negative.
So I created a retry/failure queue that feeds back into InvokeHTTP to query the API again, and I set an expiration time of 5 minutes so that data that is genuinely failing does not keep hitting the API forever.
However, I want to prioritize this failure/retry queue, so that as soon as an item reaches it, it is queried against the API again ahead of the standard processing queue, so as not to lose the data that gave false negatives.
Is it possible to do this with this self-looping relationship, or do I need a new flow file?
Each queue can have a prioritizer configured in the queue's settings. Currently you have two separate queues feeding InvokeHTTP: the failure/retry queue and the incoming queue from the matched relationship of EvaluateJsonPath. You need to put a funnel in front of InvokeHTTP and send both of these queues to the funnel, then connect the funnel to InvokeHTTP. This way you create a single incoming queue to InvokeHTTP and can configure the prioritizer there.
To prioritize correctly, you may want to use the PriorityAttributePrioritizer. Use UpdateAttribute to add a "priority" attribute to each flow file: the ones from failure/retry get priority "A" and the others get priority "B" (or anything that sorts after "A").
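Roughly, the rearranged flow would look like this (a sketch; the attribute values "A" and "B" are arbitrary, and PriorityAttributePrioritizer is the stock prioritizer that sorts on the priority attribute, lowest value first):
EvaluateJsonPath (matched)       -> UpdateAttribute (priority = B) -> Funnel
InvokeHTTP (retry/failure loop)  -> UpdateAttribute (priority = A) -> Funnel
Funnel                           -> InvokeHTTP
Then open the settings of the single queue between the funnel and InvokeHTTP and add PriorityAttributePrioritizer to its selected prioritizers, so the "A" (retry/failure) flow files are pulled ahead of the "B" ones.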
What is the situation with fees when:
I have one channel
10 users belong to the channel
one of the users adds a message to the channel, so all users who are listening will receive the message
Costs:
1. Adding a message costs X1
2. What will the cost be for everyone to read?
Adding a message costs X1
Adding a message will cost you exactly one document write.
What will the cost be for everyone to read?
If all 10 users read that message, then the cost will be 10 document reads.
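More generally, with N listening users and M new messages, the read cost is N × M document reads (here 10 × 1 = 10), on top of the M document writes needed to add the messages.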
In short, we are sometimes seeing that a small number of Cloud Bigtable queries fail repeatedly (for 10s or even 100s of times in a row) with the error rpc error: code = 13 desc = "server closed the stream without sending trailers" until (usually) the query finally works.
In detail, our setup is as follows:
We are running a collection (< 10) of Go services on Google Compute Engine. Each service leases tasks from a pair of PULL task queues. Each task contains an ID of a bigtable row. The task handler executes the following query:
row, err := tbl.ReadRow(ctx, <my-row-id>,
    bigtable.RowFilter(bigtable.ChainFilters(
        bigtable.FamilyFilter(<my-column-family>),
        bigtable.LatestNFilter(1))))
If the query fails, the task handler simply returns. Since we lease tasks with a lease time between 10 and 15 minutes, the lease on that task will expire a little while later, it will be leased again, and we'll retry. The tasks have a maximum of 1000 retries, so they can be retried many times over a long period. In a small number of cases, a particular task will fail with the grpc error above. The task will typically fail with this same error every time it runs, for hours or days on end, before (seemingly out of the blue) eventually succeeding (or the task runs out of retries and dies).
Since this often takes so long, it seems unrelated to server load. For example, right now on a Sunday morning these servers are very lightly loaded, and yet I see plenty of these errors when I tail the logs. From this answer, I had originally thought that this might be due to querying for a large amount of data, perhaps near the maximum that Cloud Bigtable will support. However, I now see that this is not the case; I can find many examples where tasks that failed many times finally succeed and report that only a small amount of data (e.g. < 1 MB) was retrieved.
What else should I be looking at here?
edit: From further testing I now know that this is completely machine (client) independent. If I tail the log on one of the task leasing machines, wait for a "server closed the stream without sending trailers" error, and then try a one-off ReadRow query to the same rowId from another, unrelated, totally unused machine, I get the same error repeatedly.
This error is typically caused by having more than 256MB of data in your reply.
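If reply size were the issue, one way to bound it would be to tighten the row filter, for example (a sketch; CellsPerRowLimitFilter and the value 100 are only an illustration of capping the data returned per row):
// Cap the number of cells returned for the row, in addition to the
// existing family and latest-version filters; 100 is an arbitrary example.
row, err := tbl.ReadRow(ctx, <my-row-id>,
    bigtable.RowFilter(bigtable.ChainFilters(
        bigtable.FamilyFilter(<my-column-family>),
        bigtable.LatestNFilter(1),
        bigtable.CellsPerRowLimitFilter(100))))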
However, there is currently a bug in our server-side error handling code that lets some invalid characters into HTTP/2 trailers, which is not allowed by the spec. This means that some error messages containing those invalid characters will surface as this kind of error instead. This should be fixed early next year.