How to await connections to a socket and convert them into a FusedStream in Rust - asynchronous

I have a loop {} around a futures::select!. I'd like to:
1. await new connections on a Unix socket.
2. await a new message on one of the Unix streams created in 1 (preferably with some kind of UUID, sent by the connecting client in 1 directly after connecting, attached to it).
I still haven't understood the concept of a Stream correctly, especially futures-rs's FusedFuture. I tried to create the streams I need using tokio's async-stream, but that didn't work, as it does not implement futures-rs's FusedFuture (which I require, because I check other streams in the same select! that benefit from/require FusedFuture's is_terminated() method).
I think something like tokio_stream::StreamMap is interesting (the key as the UUID and the value as a Stream that yields a tuple with the message received on the socket and a reference to the UnixStream in order to write back).
I'm really sorry that I cannot provide any examples or something similar to make the issue less unclear. I've tried a lot but I was unable to come any closer to my goal. I hope the way I wrote this is somewhat understandable 😟
Whether to use a std::os::unix::net::UnixListener or tokio::net::UnixListener is up in the air. I've tried both but was unable to achieve anything with either of them.
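For illustration, here is a minimal, untested sketch of that setup using tokio::net::UnixListener and tokio_stream::StreamMap. It assumes the client sends its UUID as the first line after connecting and that messages are line-delimited; the socket path is a placeholder. It uses tokio::select!, which (unlike futures::select!) does not need FusedFuture/FusedStream; with futures::select! you would fuse the accept future and the StreamMap first.

use tokio::io::{AsyncBufReadExt, BufReader};
use tokio::net::{UnixListener, UnixStream};
use tokio_stream::{wrappers::LinesStream, StreamExt, StreamMap};

#[tokio::main]
async fn main() -> std::io::Result<()> {
    // Listen on a (hypothetical) socket path.
    let listener = UnixListener::bind("/tmp/example.sock")?;
    // Key: UUID sent by the client; value: a stream of lines from that client.
    let mut connections: StreamMap<String, LinesStream<BufReader<UnixStream>>> = StreamMap::new();

    loop {
        tokio::select! {
            // 1. A new client connected: read its UUID, then track its line stream.
            Ok((stream, _addr)) = listener.accept() => {
                let mut lines = BufReader::new(stream).lines();
                if let Ok(Some(uuid)) = lines.next_line().await {
                    connections.insert(uuid, LinesStream::new(lines));
                }
            }
            // 2. A message arrived on one of the tracked streams, keyed by UUID.
            Some((uuid, Ok(line))) = connections.next(), if !connections.is_empty() => {
                println!("{uuid}: {line}");
            }
        }
    }
}

To write back to a client, one option is to split each UnixStream (e.g. with UnixStream::into_split) and keep the write halves in a separate HashMap keyed by the same UUID.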

Related

Implementing a cold source with Mutiny

I'd like to know what a proper way is to implement my own cold source (publisher) using the Mutiny library.
Let's say there is a huge file parser that should return lines as Multi<String> items according to the subscriber's consumption rate.
New lines should be read only after the previous ones were processed, to optimize memory usage, while buffering a couple of hundred items to eliminate consumer idling.
I know about the Multi.createFrom().emitter() factory method, but using it I can't see a convenient way to implement backpressure.
Does Mutiny have an idiomatic way to create cold sources that produce the next items only after they are requested by the downstream, or am I in this case supposed to implement my own Publisher using the Java Reactive Streams API and then wrap it in a Multi?
You can use Multi.createFrom().generator(...).
The function is called for every request. But you can pass a "state" to remember where you are, typically an Iterator.
This is the opposite of the emitter approach (which does not check for requests but has a backpressure strategy attached to it).
If you need more fine-grained backpressure support, you would need to implement a Publisher.
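A minimal sketch of the generator approach for the file-parser example (untested; the file path, line-based parsing, and the FileLines class name are placeholders):

import io.smallrye.mutiny.Multi;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Supplier;

public class FileLines {
    // Returns a cold Multi that reads one line per downstream request.
    static Multi<String> lines(Path file) {
        // Initial state: a reader over the (hypothetical) file.
        Supplier<BufferedReader> open = () -> {
            try {
                return Files.newBufferedReader(file);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
        return Multi.createFrom().generator(open, (reader, emitter) -> {
            try {
                String line = reader.readLine();
                if (line != null) {
                    emitter.emit(line);      // exactly one item per request
                } else {
                    emitter.complete();      // end of file
                }
            } catch (IOException e) {
                emitter.fail(e);
            }
            return reader;                   // state handed to the next call
        });
    }
}

A subscriber that requests a few hundred items at a time then gives you the small read-ahead buffer you mentioned.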

Need to write multiple commands to a device one by one

It's taking too long to write a single command to the characteristic. I am using the code below for a single command and looping over it.
getConnObservable()
    .first()
    .flatMap(rxBleConnection -> rxBleConnection.writeCharacteristic(characteristics, command))
    .observeOn(AndroidSchedulers.mainThread())
    .subscribe(
        bytes -> onWriteSuccess(),
        this::onWriteFailure
    );
It's taking almost 600 ms to write to the device, and I need to write around 100 commands one by one.
Can anyone please explain the best way to do that batch operation?
The best way to get the highest performance possible over BLE is to use the same RxBleConnection to carry out all writes; this mitigates the overhead of RxJava, i.e.:
getConnObservable()
    .first()
    .flatMapCompletable(rxBleConnection -> Completable.merge(
        rxBleConnection.writeCharacteristic(characteristics, command0).toCompletable(),
        rxBleConnection.writeCharacteristic(characteristics, command1).toCompletable(),
        // (...)
        rxBleConnection.writeCharacteristic(characteristics, command99).toCompletable()
    ))
    .observeOn(AndroidSchedulers.mainThread())
    .subscribe(
        this::onWriteSuccess,
        this::onWriteFailure
    );
Additionally one could try to negotiate the shortest possible Connection Interval (CI) by subscribing to rxBleConnection.requestConnectionPriority(BluetoothGatt.CONNECTION_PRIORITY_HIGH, delay, timeUnit)
Further speedup can be achieved by setting bluetoothGattCharacteristic.setWriteType(BluetoothGattCharacteristic.WRITE_TYPE_NO_RESPONSE) if the peripheral/characteristic supports this write type.*
*Be aware that the internal buffer for writes without response is limited and depending on the API level behaves a bit differently. It should not matter for ~100 writes though.
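For illustration, the two calls look roughly like this (a hedged sketch: rxBleConnection and characteristics are the same objects as in the snippets above, the one-second delay is arbitrary, and the returned Completable still has to be composed into the write chain, e.g. with andThen in RxJava 2, before the writes run):

// Ask for a shorter connection interval; the peripheral may not honour the request.
Completable priorityRequest = rxBleConnection.requestConnectionPriority(
        BluetoothGatt.CONNECTION_PRIORITY_HIGH, 1, TimeUnit.SECONDS);

// Use write-without-response when the characteristic's properties allow it.
characteristics.setWriteType(BluetoothGattCharacteristic.WRITE_TYPE_NO_RESPONSE);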
In regards to this conversation:
RxAndroidBle is a Bluetooth Low Energy library, and comparing it to Blue2Serial (which uses standard Bluetooth) in terms of performance is not the best thing to do. They have different use cases, just like WiFi and an Ethernet cable are different ways to get access to the Internet.
Best Regards

A MailboxProcessor that operates with a LIFO logic

I am learning about F# agents (MailboxProcessor).
I am dealing with a rather unconventional problem.
I have one agent (dataSource) which is a source of streaming data. The data has to be processed by an array of agents (dataProcessor). We can consider dataProcessor as some sort of tracking device.
Data may flow in faster than the dataProcessor is able to process it.
It is OK to have some delay. However, I have to ensure that the agent stays on top of its work and does not get piled under obsolete observations.
I am exploring ways to deal with this problem.
The first idea is to implement a stack (LIFO) in dataSource. dataSource would send over the latest observation available when dataProcessor becomes available to receive and process the data. This solution may work, but it may get complicated, as dataProcessor may need to be blocked and re-activated and communicate its status to dataSource, leading to a two-way communication problem. This problem may boil down to a blocking queue in the producer-consumer problem, but I am not sure.
The second idea is to have dataProcessor take care of message sorting. In this architecture, dataSource would simply post updates to dataProcessor's queue. dataProcessor would use Scan to fetch the latest data available in its queue. This may be the way to go. However, I am not sure if in the current design of MailboxProcessor it is possible to clear a queue of messages, deleting the older, obsolete ones. Furthermore, here, it is written that:
Unfortunately, the TryScan function in the current version of F# is broken in two ways. Firstly, the whole point is to specify a timeout but the implementation does not actually honor it. Specifically, irrelevant messages reset the timer. Secondly, as with the other Scan function, the message queue is examined under a lock that prevents any other threads from posting for the duration of the scan, which can be an arbitrarily long time. Consequently, the TryScan function itself tends to lock-up concurrent systems and can even introduce deadlocks because the caller's code is evaluated inside the lock (e.g. posting from the function argument to Scan or TryScan can deadlock the agent when the code under the lock blocks waiting to acquire the lock it is already under).
Having the latest observation bounced back may be a problem.
The author of this post, @Jon Harrop, suggests that
I managed to architect around it and the resulting architecture was actually better. In essence, I eagerly Receive all messages and filter using my own local queue.
This idea is surely worth exploring but, before starting to play around with code, I would welcome some inputs on how I could structure my solution.
Thank you.
Sounds like you might need a destructive scan version of the mailbox processor. I implemented this with TPL Dataflow in a blog series that you might be interested in.
My blog is currently down for maintenance but I can point you to the posts in markdown format.
Part1
Part2
Part3
You can also check out the code on github
I also wrote about the issues with scan in my lurking horror post
Hope that helps...
tl;dr I would try this: take the Mailbox implementation from FSharp.Actor or Zach Bray's blog post, replace ConcurrentQueue with ConcurrentStack (plus add some bounded-capacity logic), and use this changed agent as a dispatcher to pass messages from dataSource to an army of dataProcessors implemented as ordinary MBPs or Actors.
tl;dr2 If workers are a scarce and slow resource and we need to process the message that is the latest at the moment a worker becomes ready, then it all boils down to an agent with a stack instead of a queue (with some bounded-capacity logic) plus a BlockingQueue of workers. The dispatcher dequeues a ready worker, then pops a message from the stack and sends this message to the worker. After the job is done, the worker enqueues itself back into the queue when it becomes ready (e.g. before let! msg = inbox.Receive()). The dispatcher's consumer thread then blocks until any worker is ready, while the producer thread keeps the bounded stack updated. (A bounded stack could be done with an array + offset + size inside a lock; the one below is overly complex.)
Details
MailboxProcessor is designed to have only one consumer. This is even commented in the source code of MBP here (search for the word 'DRAGONS' :) )
If you post your data to an MBP, then only one thread can take it from the internal queue or stack.
In your particular use case I would use ConcurrentStack directly, or better, wrapped in a BlockingCollection:
It will allow many concurrent consumers
It is very fast and thread safe
BlockingCollection has a BoundedCapacity property that allows you to limit the size of the collection. It throws on Add, but you could catch the exception or use TryAdd. If A is the main stack and B is a standby, then TryAdd to A; on false, Add to B and swap the two with Interlocked.Exchange, then process the needed messages in A, clear it, and make it the new standby (or use three stacks if processing A could take longer than B takes to fill up again). This way you do not block and do not lose any messages, but can discard unneeded ones in a controlled way.
BlockingCollection has methods like AddToAny/TakeFromAny, which work on arrays of BlockingCollections. This could help, e.g.:
dataSource produces messages to a BlockingCollection with ConcurrentStack implementation (BCCS)
another thread consumes messages from the BCCS and sends them to an array of processing BCCSs. You said that there is a lot of data, so you may sacrifice one thread to block and dispatch your messages indefinitely
each processing agent has its own BCCS or is implemented as an Agent/Actor/MBP to which the dispatcher posts messages. In your case you need to send a message to only one processorAgent, so you may store the processing agents in a circular buffer to always dispatch a message to the least recently used processor (a rough sketch follows the diagram below).
Something like this:
(data stream produces 'T)
|
[dispatcher's BCCS]
|
(a dispatcher thread consumes 'T and pushes to processors, manages capacity of BCCS and LRU queue)
| |
[processor1's BCCS/Actor/MBP] ... [processorN's BCCS/Actor/MBP]
| |
(process) (process)
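A minimal, untested F# sketch of the dispatcher part of this diagram, assuming round-robin dispatch instead of the LRU queue described above (the Observation type, worker count, and capacity are placeholders):

open System.Collections.Concurrent
open System.Threading

type Observation = { Timestamp : System.DateTime; Value : float }

// LIFO buffer with bounded capacity: the newest observation is taken first.
// The producer (dataSource) calls buffer.TryAdd obs and may drop on overflow.
let buffer = new BlockingCollection<Observation>(ConcurrentStack<Observation>(), 1000)

// Ordinary MailboxProcessor workers (the dataProcessor army).
let makeProcessor id =
    MailboxProcessor<Observation>.Start(fun inbox ->
        let rec loop () = async {
            let! obs = inbox.Receive()
            printfn "processor %d handling observation from %O" id obs.Timestamp
            return! loop () }
        loop ())

let processors = [| for i in 1 .. 4 -> makeProcessor i |]

// One dedicated thread blocks on the buffer and dispatches to the workers.
let dispatcher =
    Thread(fun () ->
        let mutable i = 0
        for obs in buffer.GetConsumingEnumerable() do
            processors.[i % processors.Length].Post obs
            i <- i + 1)
dispatcher.IsBackground <- true
dispatcher.Start()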
Instead of a ConcurrentStack, you may want to read about the heap data structure. If you need your latest messages by some property of the messages, e.g. a timestamp, rather than by the order in which they arrive on the stack (e.g. if there could be delays in transit and arrival order <> creation order), you can get the latest message by using a heap.
If you still need Agents semantics/API, you could read several sources in addition to Dave's links, and somehow adapt the implementation to multiple concurrent consumers:
An interesting article by Zach Bray on an efficient Actors implementation. There you would need to replace (under the comment // Might want to schedule this call on another thread.) the line execute true with something like async { execute true } |> Async.Start, because otherwise the producing thread will also be the consuming thread, which is not good for a single fast producer. However, for a dispatcher like the one described above this is exactly what is needed.
The FSharp.Actor (aka Fakka) development branch and the FSharp MBP source code (first link above) here could be very useful for implementation details. The FSharp.Actor library has been frozen for several months, but there is some activity in the dev branch.
You should not miss the discussion about Fakka in Google Groups in this context.
I have a somewhat similar use case, and for the last two days I have researched everything I could find on F# Agents/Actors. This answer is a kind of TODO list for myself to try these ideas, half of which were born while writing it.
The simplest solution is to greedily eat all messages in the inbox when one arrives and discard all but the most recent. Easily done using TryReceive:
let rec readLatestLoop oldMsg =
  async { let! newMsg = inbox.TryReceive 0
          match newMsg with
          | None -> return oldMsg
          | Some newMsg -> return! readLatestLoop newMsg }

let readLatest() =
  async { let! msg = inbox.Receive()
          return! readLatestLoop msg }
When faced with the same problem I architected a more sophisticated and efficient solution I called cancellable streaming and described in an F# Journal article here. The idea is to start processing messages and then cancel that processing if they are superseded. This significantly improves concurrency if significant processing is being done.

Monitor own network traffic in Java

I have a Java program which connects to the internet and send files (emails with attachments, SSL, javamail).
It sends only one email at a time.
Is there a way that my program could track the network traffic it itself is generating?
That way I could track progress of emails being sent...
It would also be nice if it were a cross-platform solution...
Here's another approach that only works for sending messages...
The data for a message to be sent is produced by the Message.writeTo method and filtered through various streams that send it directly out the socket. You could subclass MimeMessage, override the writeTo method, wrap the OutputStream with your own OutputStream that counts the data flowing through it (similar to my other suggestion), and report that to your program. In code...
public class MyMessage extends MimeMessage {
    ...
    public void writeTo(OutputStream os, String[] ignoreList) throws IOException, MessagingException {
        super.writeTo(new MyCountingStream(os), ignoreList);
    }
}
If you want percent completion, you could first use Message.writeTo to write the message to a stream that does nothing but count the amount of data being written, while throwing away the data. Then you know how big the message really is, so while the message is being sent you can tell what percentage of it has gone out.
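For concreteness, a counting stream of the kind referred to above as MyCountingStream could look roughly like this (a hedged sketch, not a JavaMail class): write the message once into new CountingOutputStream(OutputStream.nullOutputStream()) (Java 11+) to learn the total size, then wrap the real stream in writeTo and report progress against that total.

import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Counts every byte that flows through it; the wrapped stream does the real work.
public class CountingOutputStream extends FilterOutputStream {
    private long count;

    public CountingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        count++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        count += len;
    }

    // Poll this from the code that reports progress.
    public long getCount() {
        return count;
    }
}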
Hope that helps...
Another user's approach is here:
Using JProgressBar with Java Mail ( knowing the progress after transport.send() )
At a lower level, if you want to monitor how many bytes are being sent, you should be able to write your own SocketFactory that produces Sockets that produce wrapped InputStreams and OutputStreams that monitor the amount of data passing through them. It's a bit of work, and perhaps lower level than you really want, but it's another approach.
I've been meaning to do this myself for some time, but I'm still waiting for that round tuit... :-)
Anyway, here's just a bit more detail. There might be gotchas I'm not aware of once you get into it...
You need to create your own SocketFactory class. There's a trivial example in the JavaMail SSLNOTES.txt file that delegates to another factory to do the work. Instead of factory.createSocket(...), you need to use "new MySocket(factory.createSocket(...))", where MySocket is a class you write that overrides all the methods to delegate to the Socket that's passed in the constructor, except for getInputStream and getOutputStream, which have to use a similar approach to wrap the returned streams with stream classes you create yourself. Those stream classes then have to override all the read and write methods to keep track of how much data is being transferred, and make that information available however you want to your code that wants to monitor progress. Before you do an operation that you want to monitor, you reset the count. Then as the operation progresses, the count will be updated. What it won't give you is a "percent completion" measure, since you have no idea how much low-level data needs to be sent to complete the operation.
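To make the shape of that concrete, here is a rough, hypothetical sketch of the wrapping Socket (the MySocket mentioned above); only the stream accessors are shown, and a real implementation must delegate every other Socket method to the wrapped socket as well. CountingOutputStream is the counting stream from the previous answer; an input-side counter would be analogous.

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

public class MonitoringSocket extends Socket {
    private final Socket delegate;

    public MonitoringSocket(Socket delegate) {
        this.delegate = delegate;
    }

    @Override
    public OutputStream getOutputStream() throws IOException {
        // Count outgoing bytes (the uploaded message data).
        return new CountingOutputStream(delegate.getOutputStream());
    }

    @Override
    public InputStream getInputStream() throws IOException {
        // Wrap with an analogous counting InputStream to track incoming bytes.
        return delegate.getInputStream();
    }

    @Override
    public synchronized void close() throws IOException {
        delegate.close();
    }

    // ... connect(), bind(), isConnected(), etc. must also delegate.
}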

Techniques for infinitely long pipes

There are two really simple ways to let one program send a stream of data to another:
A Unix pipe, TCP socket, or something like that. This requires constant attention from the consumer program, or the producer program will block. Even after increasing buffers from their typically tiny defaults, it's still a huge problem.
Plain files - the producer program appends with O_APPEND, the consumer just reads whatever new data becomes available at its convenience. This doesn't require any synchronization (as long as disk space is available), but Unix files only support truncation at the end, not at the beginning, so this will fill up the disk until both programs quit.
Is there a simple way to have it both ways, with data stored on disk until it gets read, and then freed? Obviously programs could communicate via database server or something like that, and not have this problem, but I'm looking for something that integrates well with normal Unix piping.
A relatively simple hand-rolled solution.
You could have the producer create files and keep writing until it gets to a certain size/number of records, whatever suits your application. The producer then closes the file and starts a new one following an agreed naming algorithm.
The consumer reads new records from a file, and when it gets to the agreed maximum size, closes and unlinks it and then opens the next one.
If your data can be split into blocks or transactions of some sort, you can use the file method for this with a serial number. The data producer would store the first megabyte of data in outfile.1, the next in outfile.2 etc. The consumer can read the files in order and delete them when read. Thus you get something like your second method, with cleanup along the way.
You should probably wrap all this in a library, so that from the application's point of view this is a pipe of some sort.
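A small sketch of that rotation scheme in Python (the spool directory, file names, chunk size, and the "next file exists" handshake are all assumptions; a real library would also need to handle shutdown and a partial last chunk):

import os
import time

SPOOL_DIR = "/tmp/spool"          # agreed-upon directory
ROTATE_BYTES = 1_000_000          # roughly 1 MB per chunk file

def produce(records):
    os.makedirs(SPOOL_DIR, exist_ok=True)
    seq, written = 1, 0
    out = open(os.path.join(SPOOL_DIR, f"outfile.{seq}"), "ab")
    for rec in records:
        out.write(rec)
        written += len(rec)
        if written >= ROTATE_BYTES:          # close this chunk, start the next one
            out.close()
            seq, written = seq + 1, 0
            out = open(os.path.join(SPOOL_DIR, f"outfile.{seq}"), "ab")
    out.close()

def consume(handle_record):
    seq = 1
    while True:
        path = os.path.join(SPOOL_DIR, f"outfile.{seq}")
        next_path = os.path.join(SPOOL_DIR, f"outfile.{seq + 1}")
        # Only consume a chunk once the producer has moved on to the next one,
        # so we never race a partially written file.
        if not (os.path.exists(path) and os.path.exists(next_path)):
            time.sleep(0.1)
            continue
        with open(path, "rb") as f:
            for line in f:
                handle_record(line)
        os.unlink(path)                      # free the disk space
        seq += 1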
You should read some documentation on socat. You can use it to bridge the gap between TCP sockets, FIFO files, pipes, stdio, and others.
If you're feeling lazy, there are some nice examples of useful commands.
I'm not aware of anything, but it shouldn't be too hard to write a small utility that takes a directory as an argument (or uses $TMPDIR), and uses select/poll to multiplex between reading from stdin, paging to a series of temporary files, and writing to stdout.
