Scheduling tool with callbacks - asynchronous

I'm trying to develop amateur web games (just for fun) and I'm looking for a tool that allows me to (with a use case for each):
Enqueue items in a system, with a finish_at property. Typically, it would be used for the unit production queue;
Alter the finish_at property of an item. E.g., if the frostwyrm attacks the barracks, its production is temporarily stopped;
Set up callbacks on the progression (at given percentages). When a player builds a Wonder, I want to notify his enemies about the progress.
Do you know of such a tool? If you don't, what technologies can you recommend for building one?
Thanks a lot!

I'd go for a doubly linked list, where each node can be a member of multiple different lists at the same time:
struct node
{
    struct node *factory_prev;   /* list of items produced by one factory */
    struct node *factory_next;
    struct node *finish_prev;    /* all items in production, ordered by finish time */
    struct node *finish_next;
    struct node *progress_prev;  /* items with progress events, ordered by next event */
    struct node *progress_next;
};
The first list links all the items produced by this factory; the list head lives in the factory object. When the factory is attacked, you can walk this list to find all items that are now delayed, so you can take them out of the other lists.
The second list links all items being produced, ordered by the time they will be finished. Looking at the first element tells you in O(1) whether any item is finished.
The third list links all items that broadcast progress events, again sorted by when the next such event occurs. This mechanism is very similar to the "finish" list and the two could theoretically be merged; however, keeping a separate, shorter list lets you find the items that need progress broadcasts faster.
As these are doubly linked lists, as soon as you have a pointer to a node you can unlink it from all the other lists as well. Be aware that in a multithreaded program you might still need some form of locking.
The alternative would be to reschedule delayed items at the time when they would ordinarily have been produced; however, this is pretty difficult to get right for items that also have progress reports.
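To make the multi-list idea concrete, here is a minimal sketch in Go using the standard container/list package rather than a hand-rolled intrusive C list; the item type and its field names are my own illustrative choices, not part of the answer above. Each item keeps a handle into every list it belongs to, so unlinking is O(1):

package main

import (
	"container/list"
	"fmt"
	"time"
)

// item keeps a handle into each list it belongs to, so it can be
// unlinked from any of them in O(1) once you hold a pointer to it.
type item struct {
	name        string
	finishAt    time.Time
	factoryElem *list.Element // position in the factory's production list
	finishElem  *list.Element // position in the global finish-time list
}

func main() {
	factoryQueue := list.New() // items produced by this factory
	finishQueue := list.New()  // all items, ordered by finish time

	it := &item{name: "grunt", finishAt: time.Now().Add(30 * time.Second)}
	it.factoryElem = factoryQueue.PushBack(it)
	it.finishElem = finishQueue.PushBack(it) // a real version would insert in sorted order

	// The factory is attacked: walk its list and pull every affected
	// item out of the finish list without scanning that list.
	for e := factoryQueue.Front(); e != nil; e = e.Next() {
		delayed := e.Value.(*item)
		finishQueue.Remove(delayed.finishElem)
		delayed.finishElem = nil
	}
	fmt.Println(finishQueue.Len()) // 0: nothing finishes while production is stopped
}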

Related

Having Qt read-only model and view in different threads

As I learned from the Qt docs, in the Qt Model/View framework, models and their attached views should live in the same (GUI) thread. This can lead to some undesirable effects, like the following. I have subclassed QAbstractTableModel and implemented the required virtual functions. Internally, the model makes queries into an SQLite database with lots of records, and provides the data to attached views via the reimplemented data() function.
Now, in the GUI I have a QTableView attached to that model. I also have a QLineEdit input field. Typing text in this field emits a textChanged() signal, which is connected to a (custom) query() slot of the model. In this way, typing a new character in the input field should update the table with the records matching the typed phrase.
Well, as my database is large, I do not expect the table update to follow immediately after typing another letter; the update waits for the query to complete, which might take a second or two.
But what bothers me is that, since I am obliged to have the model and the table in the same GUI thread, the input field also freezes after each letter until the table is updated. I would like to be able to type the phrase without the freezing effect, and let the table update when it is ready. Notifying the model only when the entire phrase has been typed, by pressing Enter, is not an option for me; I need the textChanged() signal to work.
So then I thought: would it be a big offense to Qt if I ignored the docs and put the model into a non-GUI thread? To my surprise, it worked! Now typing does not freeze, and the program does not crash (at least for now).
So my question is: is it nonetheless unsafe to use the model in a non-GUI thread, so that my program might suddenly crash some other day? I should also mention that I want to use the model in a read-only way. If I need to change the data underlying the model, I will not do it through the view/delegates; I will just send appropriate signals to the model's thread, and all changes will be performed within that thread.
Imagine this example of removing the last row:
Synchronous (same thread):
1. emit beginRemoveRows(int r = last row)
2. view reacts and removes references to r
3. remove r from model
4. endRemoveRows()
5. view knows it may repaint
Asynchronous (different threads):
1. emit beginRemoveRows(r)
2. remove r from model
3. endRemoveRows()
Both signals are in the event queue of the GUI thread. If the GUI event queue contains a repaint event before the beginRemoveRows(), the view will call model->data(r) and your program will likely crash*.
(*) Or at least run into the safeties of your data() implementation, but there are other things under the hood like QPersistentModelIndex which you don't control...
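The failure mode is not Qt-specific: it is a race between queued, stale notifications and data that has already changed. Here is a schematic sketch of the same interleaving in Go; the buffered channel stands in for the GUI thread's event queue, all names are illustrative, and this is deliberately a toy, not Qt code. It panics during the repaint, mirroring the crash described above:

package main

import "fmt"

func main() {
	model := []string{"a", "b", "c"}
	events := make(chan string, 3) // stands in for the GUI thread's event queue
	viewRows := 3                  // the view's idea of the row count

	events <- "repaint" // a repaint was already queued...

	// "Model thread": mutates the data immediately, then posts
	// notifications that will only be handled after the repaint.
	model = model[:len(model)-1]
	events <- "beginRemoveRows"
	events <- "endRemoveRows"

	// "GUI thread" drains its queue: the repaint is handled first,
	// while the view still believes there are 3 rows.
	for i := 0; i < 3; i++ {
		switch <-events {
		case "repaint":
			for r := 0; r < viewRows; r++ {
				fmt.Println(model[r]) // row 2 is gone: index out of range
			}
		case "beginRemoveRows", "endRemoveRows":
			viewRows = len(model) // too late: the repaint already ran
		}
	}
}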

Golang RWMutex on map content edit

I'm starting to use RWMutex with a map in my Go project, since I now have more than one goroutine running at the same time, and while making the changes for that a doubt came to my mind.
The thing is, I know that we must use RLock when only reading, to allow other goroutines to do the same, and Lock when writing, to fully block the map. But what are we supposed to do when editing a previously created element in the map?
For example... Let's say I have a map[int]string where I Lock, put "hello " inside, and then Unlock. What if I want to add "world" to it? Should I use Lock, or can I use RLock?
You should approach the problem from another angle.
A simple rule of thumb you seem to understand just fine is
You need to protect the map from concurrent accesses when at least one of them is a modification.
Now the real question is what constitutes a modification of a map.
To answer it properly, it helps to notice that values stored in maps are not addressable, by design. This was engineered that way simply because maps internally have an intricate implementation which might move the values they contain around in memory, in order to provide (amortized) fast access time when the map's structure changes due to insertions and/or deletions of its elements.
The fact that map values are not addressable means you cannot do something like this:
m := make(map[int]string)
m[42] = "hello"
go mutate(&m[42]) // take a single element and go modifying it...
// ...while other parts of the program change _other_ values
m[123] = "blah blah"
The reason you are not allowed to do this is that the insertion operation m[123] = ... might trigger moving the map's element storage around, and that might involve moving the storage of the element keyed by 42 to some other place in memory, pulling the rug out from under the feet of the goroutine running the mutate function.
So, in Go, maps really only support three operations:
Insert — or replace — an element;
Read an element;
Delete an element.
You cannot modify an element "in place" — you can only
go in three steps:
Read the element;
Modify the variable containing the (read) copy;
Replace the element by the modified copy.
As you can now see, steps (1) and (3) are mere map accesses, and so the answer to your question is (hopefully) apparent: step (1) must be done under at least a read lock, while step (3) must be done under a write (exclusive) lock.
In contrast, elements of other compound types, such as arrays (and slices) and fields of struct types, do not have the restriction maps have: provided the storage of the "enclosing" variable is not relocated, it is fine to change its different elements concurrently from different goroutines.
Since the only way to change the value associated with a key in a map is to reassign the changed value to the same key, that is a write/modification, so you have to obtain the write lock; simply using the read lock will not be sufficient.
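Here is a minimal sketch of that advice; the safeMap type and its method names are illustrative, not from the question. Note that holding the write lock across all three steps of the read-modify-replace also prevents a lost update between the read and the write:

package main

import (
	"fmt"
	"sync"
)

type safeMap struct {
	mu sync.RWMutex
	m  map[int]string
}

// Get only reads the map, so a read lock is sufficient and lets
// other readers proceed concurrently.
func (s *safeMap) Get(k int) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.m[k]
	return v, ok
}

// Append is a read-modify-replace: step (3) is a write, so it needs
// the exclusive lock; taking it for the whole operation also keeps
// another writer from sneaking in between the read and the replace.
func (s *safeMap) Append(k int, suffix string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[k] = s.m[k] + suffix
}

func main() {
	s := &safeMap{m: make(map[int]string)}
	s.Append(1, "hello ")
	s.Append(1, "world")
	v, _ := s.Get(1)
	fmt.Println(v) // hello world
}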

Difference between returning a pointer and a value in initialization methods [duplicate]

This question already has answers here:
Pointers vs. values in parameters and return values
(5 answers)
Closed 3 years ago.
Consider the following struct:
type Queue struct {
	Elements []int
}
What would be the difference between:
func NewQueue() Queue {
	queue := Queue{}
	return queue
}
and
func NewQueue() *Queue {
	queue := &Queue{}
	return queue
}
To me they seem practically the same (and in fact trying it with some enqueuing and dequeuing yields the same results), but I still see both usages in the wild, so perhaps one is preferable.
It's possible to return a value and then have the caller call methods that have a pointer receiver. However, if the caller is always going to want to use pointers, because the object is big or because methods need to modify it in place, you might as well return a pointer. Pointers vs. values is a common question in Go and there's an answer trying to break down when to use one or the other.
In the specific case of a slice-backed Queue type, it's pretty small and fast to copy as a value, but if you want to be able to copy it around and have everyone see the same data whichever copy is accessed, you're going to need to use a pointer, because a slice is really a little struct of start pointer, length, and capacity, and those change when you reslice or grow it. If this is a surprise, the Go blog posts on the mechanics of append and slice usage and internals could be useful reading.
If your queue isn't for sharing or passing around but for using locally in a single function, you could provide an append-style interface where operations return a modified queue, but at that point maybe you just want to use slice tricks directly.
(If your queue is meant to be used concurrently, think hard about using a buffered channel. It might not be exactly what you're imagining, but a lot of the tricky bits have already been figured out for you by the implementers.)
Also, if Queue is really just a slice with methods added, you can make it type Queue []int.
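A small sketch of the difference, using the Queue type from the question (the Enqueue method is my own illustrative addition): copying the value copies the slice header, so two copies can silently diverge once one of them grows.

package main

import "fmt"

type Queue struct {
	Elements []int
}

// Enqueue needs a pointer receiver: append may allocate a new backing
// array and always changes the slice header.
func (q *Queue) Enqueue(v int) {
	q.Elements = append(q.Elements, v)
}

func main() {
	q1 := Queue{}
	q2 := q1      // copies the slice header (pointer, length, capacity)
	q2.Enqueue(1) // grows q2's slice; q1 never sees the element

	fmt.Println(len(q1.Elements), len(q2.Elements)) // 0 1

	// With the pointer-returning constructor, both names refer to the
	// same Queue, so every "copy" sees the same data.
	p1 := &Queue{}
	p2 := p1
	p2.Enqueue(1)
	fmt.Println(len(p1.Elements), len(p2.Elements)) // 1 1
}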

A MailboxProcessor that operates with a LIFO logic

I am learning about F# agents (MailboxProcessor).
I am dealing with a rather unconventional problem.
I have one agent (dataSource) which is a source of streaming data. The data has to be processed by an array of agents (dataProcessor). We can consider dataProcessor as some sort of tracking device.
Data may flow in faster than the dataProcessor is able to process its input.
It is OK to have some delay. However, I have to ensure that the agent stays on top of its work and does not get buried under obsolete observations.
I am exploring ways to deal with this problem.
The first idea is to implement a stack (LIFO) in dataSource. dataSource would send over the latest observation available when dataProcessor becomes available to receive and process the data. This solution may work, but it may get complicated, as dataProcessor may need to be blocked and re-activated and communicate its status to dataSource, leading to a two-way communication problem. This problem may boil down to a blocking queue in the producer-consumer problem, but I am not sure...
The second idea is to have dataProcessor take care of message sorting. In this architecture, dataSource would simply post updates to dataProcessor's queue. dataProcessor would use Scan to fetch the latest data available in its queue. This may be the way to go. However, I am not sure if, in the current design of MailboxProcessor, it is possible to clear a queue of messages, deleting the older, obsolete ones. Furthermore, here it is written that:
Unfortunately, the TryScan function in the current version of F# is
broken in two ways. Firstly, the whole point is to specify a timeout
but the implementation does not actually honor it. Specifically,
irrelevant messages reset the timer. Secondly, as with the other Scan
function, the message queue is examined under a lock that prevents any
other threads from posting for the duration of the scan, which can be
an arbitrarily long time. Consequently, the TryScan function itself
tends to lock-up concurrent systems and can even introduce deadlocks
because the caller's code is evaluated inside the lock (e.g. posting
from the function argument to Scan or TryScan can deadlock the agent
when the code under the lock blocks waiting to acquire the lock it is
already under).
Having the latest observation bounced back may be a problem.
The author of this post, @Jon Harrop, suggests that:
I managed to architect around it and the resulting architecture was actually better. In essence, I eagerly Receive all messages and filter using my own local queue.
This idea is surely worth exploring but, before starting to play around with code, I would welcome some input on how I could structure my solution.
Thank you.
Sounds like you might need a destructive-scan version of the mailbox processor. I implemented this with TPL Dataflow in a blog series that you might be interested in.
My blog is currently down for maintenance but I can point you to the posts in markdown format.
Part1
Part2
Part3
You can also check out the code on github
I also wrote about the issues with Scan in my lurking horror post.
Hope that helps...
tl;dr I would try this: take the Mailbox implementation from FSharp.Actor or Zach Bray's blog post, replace the ConcurrentQueue with a ConcurrentStack (plus add some bounded-capacity logic) and use this changed agent as a dispatcher to pass messages from dataSource to an army of dataProcessors implemented as ordinary MBPs or Actors.
tl;dr2 If workers are a scarce and slow resource and we need to process the message that is the latest at the moment a worker becomes ready, then it all boils down to an agent with a stack instead of a queue (with some bounded-capacity logic) plus a BlockingQueue of workers. The dispatcher dequeues a ready worker, then pops a message from the stack and sends this message to the worker. After the job is done, the worker enqueues itself back into the queue when it becomes ready (e.g. before let! msg = inbox.Receive()). The dispatcher's consumer thread then blocks until any worker is ready, while the producer thread keeps the bounded stack updated. (A bounded stack could be done with an array + offset + size inside a lock; the one below is more complex.) A sketch of this recipe appears after the diagram below.
Details
MailboxProcessor is designed to have only one consumer. This is even commented on in the source code of the MBP here (search for the word 'DRAGONS' :) ).
If you post your data to an MBP, then only one thread can take it from the internal queue or stack.
In your particular use case, I would use a ConcurrentStack directly, or better, wrapped in a BlockingCollection:
It will allow many concurrent consumers
It is very fast and thread safe
BlockingCollection has a BoundedCapacity property that allows you to limit the size of the collection. It throws on Add when full, but you can catch the exception or use TryAdd. If A is the main stack and B a standby, then TryAdd to A; on false, Add to B and swap the two with Interlocked.Exchange, then process the needed messages in A, clear it, and make a new standby; or use three stacks if processing A could take longer than B takes to fill up again. This way you do not block and do not lose any messages, but can discard unneeded ones in a controlled way.
BlockingCollection has methods like AddToAny/TakeFromAny, which work on arrays of BlockingCollections. This could help, e.g.:
dataSource produces messages to a BlockingCollection with ConcurrentStack implementation (BCCS)
another thread consumes messages from the BCCS and sends them to an array of processing BCCSs. You said that there is a lot of data, so you may sacrifice one thread to block and dispatch your messages indefinitely;
each processing agent has its own BCCS, or is implemented as an Agent/Actor/MBP to which the dispatcher posts messages. In your case you need to send a message to only one processing agent, so you may store the processing agents in a circular buffer to always dispatch a message to the least recently used processor.
Something like this:

(data stream produces 'T)
            |
  [dispatcher's BCCS]
            |
(a dispatcher thread consumes 'T and pushes to processors,
 manages capacity of the BCCS and the LRU queue)
      |                                     |
[processor1's BCCS/Actor/MBP] ... [processorN's BCCS/Actor/MBP]
      |                                     |
  (process)                             (process)
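As promised above, here is a rough sketch of the tl;dr2 recipe, written in Go rather than F# for concision; all type and variable names are my own, and the capacity of 8 and the single worker are arbitrary simplifications. A mutex-guarded bounded LIFO stack holds the observations, while a channel on which an idle worker parks its inbox plays the role of the BlockingQueue of workers:

package main

import (
	"fmt"
	"sync"
)

// boundedStack keeps at most cap observations; when full, the oldest
// one is discarded, so the newest data is never lost.
type boundedStack struct {
	mu    sync.Mutex
	items []string
	cap   int
}

func (s *boundedStack) push(v string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if len(s.items) == s.cap {
		s.items = s.items[1:] // full: drop the oldest observation
	}
	s.items = append(s.items, v)
}

func (s *boundedStack) pop() (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if len(s.items) == 0 {
		return "", false
	}
	v := s.items[len(s.items)-1] // LIFO: newest observation first
	s.items = s.items[:len(s.items)-1]
	return v, true
}

func main() {
	stack := &boundedStack{cap: 8}
	ready := make(chan chan string) // an idle worker parks its inbox here

	inbox := make(chan string)
	go func() { // the dataProcessor
		for {
			ready <- inbox // announce readiness, cf. before inbox.Receive()
			fmt.Println("processing", <-inbox)
		}
	}()

	// The producer keeps the bounded stack updated...
	for _, obs := range []string{"obs1", "obs2", "obs3"} {
		stack.push(obs)
	}

	// ...while the dispatcher blocks until a worker is ready, then
	// hands it the latest observation.
	worker := <-ready
	if msg, ok := stack.pop(); ok {
		worker <- msg // prints "processing obs3"
	}
	<-ready // wait until the worker has finished before exiting
}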
Instead of a ConcurrentStack, you may want to read about the heap data structure. If you need your latest messages by some property of the messages, e.g. a timestamp, rather than by the order in which they arrive on the stack (e.g. if there could be delays in transit and arrival order <> creation order), you can get the latest message by using a heap.
If you still need Agent semantics/APIs, you could read several sources in addition to Dave's links, and somehow adapt the implementation to multiple concurrent consumers:
An interesting article by Zach Bray on an efficient Actors implementation. There you would need to replace (under the comment // Might want to schedule this call on another thread.) the line execute true with a line like async { execute true } |> Async.Start or similar, because otherwise the producing thread would become the consuming thread, which is not good for a single fast producer. However, for a dispatcher like the one described above this is exactly what is needed.
The FSharp.Actor (aka Fakka) development branch and the F# MBP source code (first link above) could be very useful for implementation details. The FSharp.Actor library has been frozen for several months, but there is some activity in the dev branch.
Don't miss the discussion about Fakka in Google Groups in this context.
I have a somewhat similar use case, and for the last two days I have researched everything I could find on F# Agents/Actors. This answer is a kind of TODO for myself to try these ideas, half of which were born while writing it.
The simplest solution is to greedily eat all messages in the inbox when one arrives and discard all but the most recent. Easily done using TryReceive:
let rec readLatestLoop oldMsg =
  async { let! newMsg = inbox.TryReceive 0
          match newMsg with
          | None -> return oldMsg
          | Some newMsg -> return! readLatestLoop newMsg }

let readLatest() =
  async { let! msg = inbox.Receive()
          return! readLatestLoop msg }
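For comparison, the same greedy-drain idea maps naturally onto a Go channel, where a select with a default clause plays the role of TryReceive with a zero timeout; the Msg type and function names here are illustrative:

package main

import "fmt"

type Msg struct{ seq int }

// readLatest blocks for the first message, then drains whatever else
// is already queued, keeping only the most recent one.
func readLatest(inbox <-chan Msg) Msg {
	msg := <-inbox // wait for at least one message
	for {
		select {
		case newer := <-inbox:
			msg = newer // a newer message supersedes the old one
		default:
			return msg // inbox is drained: msg is the latest
		}
	}
}

func main() {
	inbox := make(chan Msg, 8)
	for i := 1; i <= 3; i++ {
		inbox <- Msg{seq: i}
	}
	fmt.Println(readLatest(inbox).seq) // 3
}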
When faced with the same problem, I architected a more sophisticated and efficient solution I call cancellable streaming, described in an F# Journal article here. The idea is to start processing messages and then cancel that processing if the message is superseded. This significantly improves concurrency if significant processing is being done.

Push an object originally on the stack into a vector: will the object get lost?

I just started using the STL. Say I have a rabbit class; now I'm creating a rabbit army...
#include <vector>

std::vector<rabbit> rabbitArmy(numOfRabbits, rabbit());
// Q1: these rabbits are on the heap, right?

rabbit* rabbitOnHeap = new rabbit();
// Q2: rabbitOnHeap is on the heap, right?

rabbit rabbitOnStack;
// Q3: this rabbit is on the stack, right?

rabbitArmy.push_back(rabbitOnStack);
// Q4: rabbitOnStack will remain stored on the stack?
// And it will be deleted automatically, though it's put in the rabbitArmy now?
Q4 is the one I'm most concerned with: should I always use the new keyword to add rabbits to my army?
Q5: Is there a better way to add rabbits to the army than:
rabbitArmy.push_back(*rabbitOnHeap);
Q1: Since you haven't specified otherwise, the objects you put in the vector will be allocated with std::allocator<rabbit>, which uses new. For what it's worth, that's usually called the "free store" rather than the heap¹.
Q2: Yes. Again, though, the usual term is the free store.
Q3: Officially, that's "automatic storage", but yes, on a typical implementation that'll be the stack; on an implementation that doesn't support a stack in hardware, it'll still be a stack-like (LIFO) data structure of some sort.
Q4: When you add an item to a vector (or other standard container), what's actually added to the container is a copy of the item you pass as a parameter. The item you pass remains yours to do with as you please. Something with automatic storage class will be destroyed when it goes out of scope, but the copy of it in the collection will remain valid until it's erased or the collection is destroyed.
As for always using new: no. In fact, you should only rarely use new to allocate items you're going to put in a standard container. Since the item in the container will be a copy of what you pass, you don't normally need to use new to allocate it.
Q5: Usually you just push back a local or temporary object.
For example:
for (int i = 0; i < 10; i++)
    rabbitArmy.push_back(rabbit());
This creates 10 temporary rabbit objects (the rabbit() part) and adds a copy of each to the rabbitArmy. Each temporary is then destroyed, but the copies of them in the rabbitArmy remain.
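The copy-in behavior is not unique to C++ containers; here is the same experiment sketched in Go, where append also stores a copy of the value (the rabbit struct and its hp field are illustrative stand-ins for the class in the question):

package main

import "fmt"

type rabbit struct{ hp int }

func main() {
	army := []rabbit{}

	r := rabbit{hp: 10}    // a local ("automatic") rabbit
	army = append(army, r) // a copy of r goes into the slice

	r.hp = 0 // mutating the original afterwards...

	// ...does not touch the stored copy: the army keeps its own rabbit.
	fmt.Println(army[0].hp) // 10
}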
¹ In typical usage, "the heap" refers to memory managed by calloc, malloc, realloc, and free. What new and delete manage is the free store. A new expression, in turn, obtains memory from an operator new (either global or inside a class). operator new and operator delete are specified so that they could be almost a direct pass-through to malloc and free respectively, but even when that's the case, the heap and the free store are normally thought of as separate.
