Pipelining in Architectural Simulators

So I'm studying architectural simulators (mostly SimpleScalar) to figure out what really happens in the innards of a microprocessor. One fascinating thing I noticed was that the entire pipeline is written backwards! That is, in a purely sequential while loop, the writeback stage comes before the issue stage, which comes before the decode stage, and so on.
What's the point of this? Let's say the output of a hypothetical fetch() stage is stored in a shared buffer ('latch') that is read by the input of the decode() stage. Since it's a purely sequential while loop, I don't see how this latch/buffer could be overwritten. However, answers to questions like this one claim that simulating the pipeline in reverse somehow avoids this 'problem'. Some insight/guidance in the right direction would be much appreciated!

I don't know SimpleScalar very well, but I've seen this done in other architectural simulators.
The outer loop of the pipeline is supposed to represent one cycle of the processor. Imagine this is the first cycle and, for simplicity, that the front end has a width of one. What would happen if each stage were executed in order from fetch to commit?
cycle 1
- fetch: instruction 1 (place it in fetch/decode latch)
- decode: instruction 1 (place it in decode/rename latch)
- rename: instruction 1 (place it in rename/dispatch latch)
- dispatch: instruction 1 (place it in issue queue)
- issue: instruction 1
etc...
You haven't simulated anything useful here as this isn't a pipeline. What happens when the loop executes each stage in the order of commit to fetch?
cycle 1
- issue: noop
- dispatch: noop
- rename: noop
- decode: noop
- fetch: instruction 1 (place it in fetch/decode latch)
cycle 2
- issue: noop
- dispatch: noop
- rename: noop
- decode: instruction 1 (place it in decode/rename latch)
- fetch: instruction 2 (place it in fetch/decode latch)
It's not a very complicated idea, but it helps to simplify the simulator.
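A minimal cycle-driven sketch of this reversed loop (illustrative Java, not SimpleScalar's actual code; the stage and latch names are made up, and only the front of the pipeline is modelled) shows how each stage consumes the value its upstream neighbour wrote on the previous cycle:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ReversedPipeline {
    // Pipeline latches: each holds at most one in-flight instruction.
    static String fetchDecode, decodeRename, renameDispatch;
    static Deque<String> program = new ArrayDeque<>();
    static StringBuilder trace = new StringBuilder();

    public static void main(String[] args) {
        program.add("i1");
        program.add("i2");
        for (int cycle = 1; cycle <= 3; cycle++) {
            trace.append("cycle ").append(cycle).append('\n');
            // Stages run back to front, so each stage sees the value its
            // predecessor wrote on the PREVIOUS cycle, not this one.
            if (renameDispatch != null) {
                trace.append("dispatch: ").append(renameDispatch).append('\n');
                renameDispatch = null; // consumed into the issue queue
            }
            if (decodeRename != null) {
                renameDispatch = decodeRename;
                trace.append("rename: ").append(decodeRename).append('\n');
                decodeRename = null;
            }
            if (fetchDecode != null) {
                decodeRename = fetchDecode;
                trace.append("decode: ").append(fetchDecode).append('\n');
                fetchDecode = null;
            }
            if (!program.isEmpty()) {
                fetchDecode = program.poll();
                trace.append("fetch: ").append(fetchDecode).append('\n');
            }
        }
        System.out.print(trace);
    }
}
```

Running this prints exactly the staggered schedule from the answer: i1 is fetched in cycle 1, decoded in cycle 2, renamed in cycle 3, advancing one stage per cycle. Running the stages front to back in the same loop would instead push i1 through every stage within cycle 1.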

Related

Making blocking http call in akka stream processing

I am new to Akka and still trying to understand the different Akka and streaming concepts. For a new feature I need to add an HTTP call to an already existing stream which works on an internal object. Something like this -
val step1Flow = Flow[SampleObject].filter(...--Filtering condition--...)
val step2Flow = Flow[SampleObject].map(obj => {
...
-- Business logic to update values in the obj --
...
})
...
override val flowGraph: Flow[SampleObject, SampleObject, NotUsed] =
  bufferIn.via(Flow.fromGraph(GraphDSL.create() { implicit builder =>
    import GraphDSL.Implicits._
    ...
    val step1 = builder.add(step1Flow)
    val step2 = builder.add(step2Flow)
    val step3 = builder.add(step3Flow)
    ...
    source ~> step1 ~> step2 ~> step3 ~> merge
    ...
  }))
I need to add the new HTTP request flow (let's call it newFlow) after step1. All these flows have SampleObject as their Inlet and Outlet. Now my understanding is that newFlow would need to be blocking, because the outlet needs to be SampleObject only. For that I have used Await on the future from the HTTP call. The code looks like this -
val responseFuture: Future[(Try[HttpResponse], SomeContext)] =
  Source
    .single(httpRequest -> context)
    .via(Retry(retrySettings).join(clientFlow))
    .runWith(Sink.head)
...
val (httpTry, passedAlongContext) = Await.result(responseFuture, 30.seconds)
-- logic to process response and return SampleObject --
Now this works fine, but I think there should be a better way to do this without using Await. I also think this blocks the calling thread until the request completes, which is going to affect the app's throughput.
Could you please guide me on whether the approach I used is correct or not, and how I can use some other thread pool to handle these blocking calls so my main thread pool is not affected?
This question seems very similar to mine, but I do not understand it completely - connect Akka HTTP to Akka stream. Also, I can't change step2 or the further flows.
EDIT : Added some code details for the stream
I ended up using the approach mentioned in the question because I couldn't find anything better after looking around. Adding this step decreased the throughput of my application as expected, but there are approaches that can be used to increase it. Check these awesome blogs by Colin Breck -
https://blog.colinbreck.com/maximizing-throughput-for-akka-streams/
https://blog.colinbreck.com/partitioning-akka-streams-to-maximize-throughput/
To summarize -
- Use asynchronous boundaries for flows which are blocking.
- Use Futures if possible and add callbacks to the futures. There are several ways to do that.
- Use buffers. There are several types of buffers available; choose what suits your needs.

Other than these, you can use built-in flows like -
- Broadcast, to broadcast your events to multiple consumers.
- Partition, to partition your stream into multiple streams based on some condition.
- Balance, to partition your stream when there is no logical way to partition your events, or when they could all have different workloads.

You can use any one, or a combination, of the options above.
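Independent of the Akka specifics, the core of the "use another thread pool" advice can be sketched in plain Java with CompletableFuture (blockingHttpCall is a made-up stand-in for the real HTTP call; in Akka Streams the analogous step is a mapAsync stage whose futures run on a dedicated dispatcher):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BlockingOffload {
    // Hypothetical stand-in for the blocking HTTP call.
    static String blockingHttpCall(String request) {
        try {
            Thread.sleep(50); // simulate network latency
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "response:" + request;
    }

    public static void main(String[] args) {
        // A dedicated pool for blocking work keeps the main pool responsive.
        ExecutorService blockingPool = Executors.newFixedThreadPool(4);
        CompletableFuture<String> future =
                CompletableFuture.supplyAsync(() -> blockingHttpCall("req1"), blockingPool);
        // Attach a callback instead of blocking the caller with get()/Await.
        future.thenAccept(response -> System.out.println(response));
        future.join(); // demo only: keep the JVM alive until completion
        blockingPool.shutdown();
    }
}
```

The point is that the caller never parks on the result: the blocking sleep happens on blockingPool, and the callback fires when the future completes, which is the same shape mapAsync gives you inside a stream.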

How to gracefully shut down reactive kafka-consumer and commit last processed record?

My painful hunt for this feature is fully described in a disgustingly long question: Several last offsets aren't getting committed with reactive kafka, which shows my multiple attempts with different failures.
How would one subscribe to ReactiveKafkaConsumerTemplate<String, String> so that it processes the records synchronously (for simplicity), and acks/commits every 2s AND upon manual cancellation of the stream? I.e. it runs and acks/commits every 2s; then, via rest/jmx/whatever, a signal arrives, the stream terminates and acks/commits the last processed Kafka record.
After a lot of attempts I was able to come up with the following solution. It seems to work, but it's kinda ugly, because it's very "white-box": the outer flow highly depends on stuff happening inside other methods. Please criticise and suggest improvements. Thanks.
kafkaReceiver.receive()
    .flatMapSequential(receivedKafkaRecord -> processKafkaRecord(receivedKafkaRecord), 16)
    .takeWhile(e -> !stopped)
    .sample(configuration.getKafkaConfiguration().getCommitInterval())
    .concatMap(offset -> {
        log.debug("ack/commit offset {}", offset.offset());
        offset.acknowledge();
        return offset.commit();
    })
    .doOnTerminate(() -> log.info("stopped."));
What didn't work:
A) You cannot use Disposable.dispose, since that would break the stream and your latest processed record wouldn't be committed.
B) You cannot put take on top of the stream, as that would cancel the stream and you wouldn't be able to commit either.
C) I'm not sure how I'd be able to incorporate error handling here.
Because of what didn't work, stream termination is triggered by a boolean field named stopped, which can be set from anywhere.
Flow explained:
flatMapSequential — because of the inner parallelism and the necessity to commit offset N only after all offsets up to N-1 have been processed.
processKafkaRecord returns Mono<ReceiverOffset>, i.e. the offset of the processed record, so there is something to ack/commit. When stopped, the method skips processing and returns Mono.empty.
takeWhile will stop the stream once stopped is set; it has to be placed here because a whole sample interval could consist only of "empties".
The rest is simple: sample by the given interval and commit in order. If sample returns an empty record, the commit is skipped. Finally, we log that the stream is cancelled.
If anyone knows how to improve this, please criticise.

What is the use of count here at the end of stream Java 8

I am a little bit confused about the purpose of count() at the end of the stream in this code:
Stream.iterate(1, (Integer n) -> n + 1)
      .peek(n -> System.out.println("number generated - " + n))
      .filter(n -> (n % 2 == 0))
      .peek(n -> System.out.println("Even number filter passed for –" + n))
      .limit(5)
      .count();
With count() added at the end of the stream, I get what I want, which is this result:
number generated - 1
number generated - 2
Even number filter passed for –2
number generated - 3
number generated - 4
Even number filter passed for –4
number generated - 5
number generated - 6
Even number filter passed for –6
number generated - 7
number generated - 8
Even number filter passed for –8
number generated - 9
number generated - 10
Even number filter passed for –10
But if I delete count() from the end of the stream statement, I don't get any output at all.
You need to perform a terminal operation on the stream for anything to happen. If you drop the .count(), the stream basically does nothing. Think of methods like peek and filter as just taking in a function and putting it on an internal queue of operations to execute.
The remembered functions will only be executed once you use a terminal operation, after which you no longer have a stream but some result (or nothing at all). count, collect, reduce, and forEach are such terminal operations.
Read more about it in the "Stream operations and pipelines" section of the docs, and take a look through the functions in the Stream docs to see which methods are terminal and which are not.
As mentioned by @luk2302 in his answer, "you need to perform a terminal operation on the stream for anything to happen".
Just to expand upon this, there are two types of stream operations: an operation is either intermediate, which is lazy (executed as late as possible, or only when needed), or eager (executed as soon as possible, i.e. immediately).
In the example you've provided, all the operations bar count() are intermediate, and they will not trigger processing of the data unless told to by a terminal operation.
The count() operation is eager (i.e. terminal), so it triggers processing of the entire stream.
To determine whether a method in the pipeline is intermediate (lazy) or terminal (eager), you only have to look at the type that method returns.
The rule is simple: if a method in the stream pipeline returns a stream, it's an intermediate operation and therefore lazy. If it returns anything else, it's a terminal operation and therefore eager.
Stream operations are of two types: intermediate and terminal operations.
Intermediate operations are lazy, meaning they are not executed on the stream pipeline when invoked but merely add a stage to the pipeline.
The pipeline is executed when a terminal operation is encountered; thus terminal operations are eager.
count(), being a terminal operation, executes the stream pipeline and gives the result; on removing it, the pipeline has only intermediate operations, so it is not evaluated, hence no result.
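The laziness is easy to demonstrate with peek: its side effect never fires until a terminal operation runs. A small sketch (the filter matters here: since Java 9, count() may skip the traversal entirely when it can compute the size up front, which would hide the peek calls):

```java
import java.util.stream.Stream;

public class LazyDemo {
    static final StringBuilder log = new StringBuilder();

    public static void main(String[] args) {
        Stream<Integer> pipeline = Stream.of(1, 2, 3)
                .peek(n -> log.append("seen ").append(n).append(';'))
                .filter(n -> n % 2 == 1);
        // Only intermediate operations so far: nothing has executed yet.
        System.out.println("before terminal op: \"" + log + "\"");
        long odds = pipeline.count(); // terminal operation drives the pipeline
        System.out.println("after count=" + odds + ": \"" + log + "\"");
    }
}
```

Before count() the log is empty; after it, peek has seen all three elements and count() returns 2 (the two odd ones), which is exactly the behaviour described in the answers above.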

Doctrine2 and Symfony2 - why does this getter mess everything up?

This snippet of code is driving me crazy.
Assume the entity on which I perform all these operations has all the members I access with getters (both in the database and in the class file; moreover, all values were inserted into the database).
$this->logger->debug('INV ROOM RATES CAPACITY - $ROOM ID: '.$camera->getId());
$this->logger->debug('INV ROOM RATES CAPACITY - $ROOMCAPACITY: '.$camera->getCapienza());
$this->logger->debug('INV ROOM RATES CAPACITY - CAMERA DB ID: '.$camera->getId());
This outputs the following
[2013-02-11 14:17:32] app.DEBUG: INV ROOM RATES CAPACITY - $ROOM ID: 14 [] []
[2013-02-11 14:17:32] app.DEBUG: INV ROOM RATES CAPACITY - $ROOMCAPACITY: [] []
[2013-02-11 14:17:32] app.DEBUG: INV ROOM RATES CAPACITY - CAMERA DB ID: [] []
It seems that the ->getCapienza() method is messing everything up (a second getter on the same object doesn't return the previous value).
Obviously, no error or exceptions are raised.
What's going on here? Any ideas? I've been stuck for hours...
EDIT
public function getCapienza()
{
return $this->capienza;
}
Since you just told me that you sometimes clear the EntityManager, I dug into it a bit further.
It is a known issue that will be fixed in Doctrine ORM 2.4 when https://github.com/doctrine/doctrine2/pull/406 is merged.
The issue happens because of the following piece of code: https://github.com/doctrine/doctrine2/blob/2.3.2/lib/Doctrine/ORM/UnitOfWork.php#L1724-L1780
Basically, it is impossible to merge or generally use proxies that are detached from an EntityManager (you should have lazy-loaded them beforehand).
The problem you are experiencing is related to http://www.doctrine-project.org/jira/browse/DDC-1734
A temporary solution is to re-fetch the data you want to use instead of recycling detached instances.

Is there a way to acquire a lock in Redis? (Node.js)

My Node.js application accepts connections from the outside. Each connection handler reads a SET on Redis, eventually modifies the set itself, then moves on. The problem is that in the meantime another async connection can try to read the same SET and update it, or decide its next step based on what it reads.
I know that Redis does its best to be atomic, but this is not quite sufficient for my use case. Think about this: the set is read to understand if it's FULL (there is a business rule for that). If it's FULL, then something happens. The problem is that if there is one only slot left, two semi-concurrent connections could think each one is the last one. And I get an overflow.
Is there a way to keep a connection "waiting" for the very short time the other one may need to update the set's state?
I think this is a corner case, very very unlikely... but you know :)
Using another key as the "lock" is an option, or does that stink?
How about using BLPOP to do locking? BLPOP key 5 waits up to 5 seconds for an item at key. At startup, put an item at key (to signal that the queue is not empty). BLPOP removes the item, so the connection that gets it acquires the lock; the next connection then can't acquire the lock, because the list is empty. But BLPOP has the following nice property:
Multiple clients can block for the same key. They are put into a queue, so the first to be served will be the one that started to wait earlier, in a first-BLPOP first-served fashion.
When the connection that acquired the lock has finished its task, it should push the item back into the queue; the next waiting connection can then acquire the lock (item).
You may be looking for WATCH with MULTI/EXEC. Here's the pattern that both threads follow:
WATCH sentinel_key
GET value_of_interest
if (value_of_interest = FULL)
    MULTI
    SET sentinel_key = foo
    EXEC
    if (EXEC returned 1, i.e. succeeded)
        do_something();
    else
        do_nothing();
else
    UNWATCH
The way this works is that all of the commands between MULTI and EXEC are queued up but not actually executed until EXEC is called. When EXEC is called, before actually executing the queued instructions it checks to see if sentinel_key has changed at all since the WATCH was set; if it has, it returns (nil) and the queued commands are discarded. Otherwise the commands are executed atomically as a block, and it returns the number of commands executed (1 in this case), letting you know you won the race and do_something() can be called.
It's conceptually similar to the fork()/exec() Unix system calls - the return value from fork() tells you which process you are (parent or child). In this case it tells you whether you won the race or not.
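The same optimistic check-and-retry shape can be sketched in plain Java with a compare-and-set (an analogy for WATCH/EXEC only, not Redis client code; CAPACITY and the slot counter are made-up names for the "is the set FULL?" business rule):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class OptimisticSlot {
    static final int CAPACITY = 10;
    // Shared state standing in for the Redis set's size; one slot left.
    static final AtomicInteger slots = new AtomicInteger(9);

    // Returns true if this caller won a slot, false if the set was full.
    static boolean tryTakeSlot() {
        while (true) {
            int current = slots.get();             // like GET after WATCH
            if (current >= CAPACITY) return false; // already FULL, back off
            // Like EXEC: succeeds only if nobody changed the value meanwhile;
            // on interference we loop and re-read, just as you would re-WATCH.
            if (slots.compareAndSet(current, current + 1)) return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryTakeSlot()); // first caller takes the last slot
        System.out.println(tryTakeSlot()); // second caller sees FULL
    }
}
```

Two near-simultaneous callers can both read "one slot left", but only one compare-and-set (one EXEC) succeeds; the loser re-reads and now sees FULL, which is exactly the overflow the question is trying to prevent.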
