Using goto vs while(wrongchoice) with a switch

I was wondering if there's a reason why you should not use a goto instead of a while(variable) in this case:
:start
result = askUser("do you want 1, 2, or 3?")
switch (result) {
case 1: display("you chose 1"); break
case 2: display("you chose 2"); break
case 3: display("you chose 3"); break
default:
    display("choice not available")
    goto :start
}
vs
boolean wrongchoice = false
do {
    wrongchoice = false   // reset each pass so a later valid choice exits
    result = askUser("do you want 1, 2, or 3?")
    switch (result) {
    case 1: display("you chose 1"); break
    case 2: display("you chose 2"); break
    case 3: display("you chose 3"); break
    default:
        display("choice not available")
        wrongchoice = true
    }
} while (wrongchoice)
It seems to me that goto would be better on a small µC because you avoid having to set a variable and use memory for it (when you only have 256 bytes, a bit is a bit).
For today's computers, being greedy for a few bits is futile, though, so is using the while much clearer then?

For today's computers, being greedy for a few bits is futile, though, so is using the while much clearer then?
Exactly. Readability and maintainability are more important than performance in an overwhelming number of cases.
I've encountered a situation where goto was faster than a while loop even with -O3 optimization, but that was in the innermost loop of an O(n^2) algorithm in a molecular dynamics simulator. Barring intense use cases like that (which require more intensive maintenance over the life of the code!), use the while, because it's more readable and less likely to do weird things.
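For what it's worth, here is a minimal, runnable C sketch of the do/while version (the askUser/display helpers from the pseudocode are replaced by scanf/printf purely for illustration):

#include <stdio.h>

int main(void) {
    int result, wrongchoice;
    do {
        wrongchoice = 0; /* reset on every pass, or a later valid choice can't exit */
        printf("do you want 1, 2, or 3? ");
        if (scanf("%d", &result) != 1)
            return 1; /* bail out on unreadable input */
        switch (result) {
        case 1: printf("you chose 1\n"); break;
        case 2: printf("you chose 2\n"); break;
        case 3: printf("you chose 3\n"); break;
        default:
            printf("choice not available\n");
            wrongchoice = 1;
        }
    } while (wrongchoice);
    return 0;
}

On most compilers, even with light optimization, the flag is likely to be compiled down to the same backward branch the goto would produce, so the µC cost of the structured version is usually nil anyway.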

Related

Does a switch statement care about case order?

I have an Arduino Pro Micro that during a loop will read the state of 4 pins and then use those pins to evaluate a switch statement, i.e.
int bob = digitalRead(1) + (digitalRead(2) * 2) + (digitalRead(3) * 4) + (digitalRead(4) * 8);
switch (bob) {
case 1:
case 2:
.
.
.
case 15:
}
My question is: do I have to go in numerical order? Does the switch statement actually care about that, or will I lose performance by NOT going in order? Would I be better off grouping them so the code can fall through, or using goto case#? There are several cases where I want to have some common code executed, so I was thinking I could group those cases together and only have the code with the break at the end of it in the last case statement.
So I could have cases 4 and 5 grouped together, as well as 8 and 10 grouped together, or 9 and 11 grouped together.
Is that possible, or will it see 10 comes before 9 and quit looking for 9?
My question is: do I have to go in numerical order?
No.
Does the switch statement actually care about that ...
No.
... will I lose performance by NOT going in order?
No.
:-)
Would I be better off grouping them so the code can fall through or using goto case#?
Depends on what you need.
Remember, you need a break statement between cases if you do not want them to run together.
I would say that practically all compilers today are smart enough to optimise switch statements. First, make your code readable and maintainable. Later, if there's a performance bottleneck, AND the switch is part of it, then see what other options you might have.
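To make the grouping concrete, here is a small runnable C sketch (the handler bodies are made up for illustration); the case labels are deliberately out of numerical order, and the compiler handles that just fine:

#include <stdio.h>

/* Cases 4 and 5 share one body, 8 and 10 another, 9 and 11 a third. */
static void handle(int bob) {
    switch (bob) {
    case 4:
    case 5:
        printf("%d: common code for 4 and 5\n", bob);
        break; /* one break after the shared body */
    case 10:
    case 8:
        printf("%d: common code for 8 and 10\n", bob);
        break;
    case 11:
    case 9:
        printf("%d: common code for 9 and 11\n", bob);
        break;
    default:
        printf("%d: no shared group\n", bob);
        break;
    }
}

int main(void) {
    int bob;
    for (bob = 0; bob <= 15; bob++)
        handle(bob);
    return 0;
}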

How come Go calculates Fibonacci recursion so fast?

This is not a correct version of it; I am just playing around with Go, but I was shocked how fast Go calculated the 42nd (43rd, actually) number in the Fibonacci sequence.
Can someone please explain how it calculates this fast? I tried comparing it to Python (I know it's slow compared to other languages), but Python took > 1 minute and I had to break the recursion.
package main

import "fmt"

func fib(a uint) uint {
    if a <= 1 {
        return 1
    }
    return fib(a-1) + fib(a-2)
}

func main() {
    fmt.Println(fib(42))
}
[ `go run Hello.go` | done: 2.316821835s ]
433494437
Its compiler isn't as smart or mature as C's (at least not yet), but Go is still closer to C than to Python in time performance (space performance is a separate thing, and not what you asked about). Just being a compiled language instead of an interpreted one gives it a major leg up in time performance over Python (it is also still faster than PyPy in general, just not by as much).
Why compiled languages generally offer greater time performance than interpreted languages has been thoroughly covered elsewhere. You can research this question on stackoverflow and elsewhere on the internet. For example, here's the TL;DR in one stackoverflow answer to that question:
Native programs run using instructions written for the processor they run on.
And here's the TL;DR in another answer:
Interpreted languages are slower because their method, object and global variable space model is dynamic
You can also find plenty of benchmark case studies and results comparing implementations in different languages if you look for them.
Performance improvements to the Go compiler and Go toolchain are also frequently made, which you can read about in the release notes (and elsewhere) such as this excerpt about version 1.8:
The new back end, based on static single assignment form (SSA), generates more compact, more efficient code and provides a better platform for optimizations such as bounds check elimination. The new back end reduces the CPU time required by our benchmark programs by 20-30% on 32-bit ARM systems.
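To see the compiled-vs-compiled comparison the answer opens with, here is a direct C translation of the same function with a crude timer around it (illustrative only; absolute numbers depend on your machine and compiler flags):

#include <stdio.h>
#include <time.h>

/* Same naive, doubly recursive Fibonacci as the Go version. */
static unsigned fib(unsigned a) {
    if (a <= 1)
        return 1;
    return fib(a - 1) + fib(a - 2);
}

int main(void) {
    clock_t start = clock();
    unsigned result = fib(42);
    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
    printf("%u (%.2fs)\n", result, secs);
    return 0;
}

Compiled with optimizations (e.g. gcc -O2), this will typically land in the same ballpark as the Go timing above, while CPython pays interpreter overhead on every one of the hundreds of millions of recursive calls.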

Game Theory with prediction

To impress two (German) professors, I am trying to improve on game theory for AI in computer games.
Game theory: intelligence is a well-educated, proven answer to a question.
This means a thoughtful decision is choosing the act that leads to an optimal result.
Question -> Resolution -> Answer -> Test (Check)
For example, one robot is fighting another robot.
This robot has 3 choices:
-move forward
-hold position
-move backward
The resulting program is pretty simple:
randomseed = initvalue;
while (one_is_alive)
{
    choice = randomselect(options, probability);
    do_choice(roboter);
}
We are using pseudorandomness.
The test for success is simply: did it eliminate the opponent?
The robots have automatically firing weapons:
struct weapon
{
    range;
    damage;
};

struct life
{
    hitpoints;
};
Now for some evolution.
We let 2 robots fight each other and remember the random seeds.
What is the sign of a successful robot?
struct {
    ownrandomseed;
    list_of_opponentrandomseed; // the array of the beaten opponents' seeds
};
Now the question is: how do we choose the right strategy against an opponent?
We assume we have, for every possible seed strategy, the optimal anti-strategy.
Now the only thing we have to do is observe the opponent's numbers and calculate his seed value. Then we can choose the right strategy.
For cracking the random generator we can use the manual method:
http://alumni.cs.ucr.edu/~jsun/random-number.pdf
or the brute Force :
https://jazzy.id.au/2010/09/20/cracking_random_number_generators_part_1.html
It depends on the algorithm used to generate the (pseudo) random numbers. If the pseudo-random number generator algorithm is known, you can guess the seed by observing a number of states (robot moves). This is similar to brute-force guessing a password used for encryption, as some encryption algorithms are known as stream ciphers and are basically (sometimes exactly) a one-time pad used to obfuscate the data.

Now, let's say that you know the pseudo-random number generator used is a simple lagged Fibonacci generator. Then you know that each number is generated as x(n) = (x(n-2) + x(n-3)) mod 3. Therefore, by observing 3 different robot moves, you will be able to predict all of the future moves. The seed is the first 3 numbers supplied that give the sequence you observe.

Now, most random number generators are not this simple; some have seeds up to 1024 bits long, and it would be impossible for a modern computer to cycle through all of those possibilities in a brute-force manner. So basically, what you would need to do is find out what PRNG algorithm is used, find out all possible initial seed values, and devise an algorithm to determine the seed the opponent robot is using based upon its actions.

Depending on the algorithm, there are ways of guessing the seed faster than testing each and every one. If there is a faster way of guessing such a seed, it means the PRNG in question is not suitable for cryptographic applications, as it means passwords are more easily guessed. AES-256 itself has a break, but it still takes theoretically 2^111 guesses (instead of the brute-force 2^256), which means it has been broken, technically, but 2^111 is still way too many operations for modern computers to process in a meaningful time frame.
If the PRNG were lagged Fibonacci (which is never used anymore; I am just giving a simple example) and you observed that the robot did option 0, then 1, then 2, you would know that the next thing the robot will do is 1, since (0 + 1) mod 3 = 1. You could also backtrack and figure out what the initial values were for this PRNG, which represent the seed.
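As a toy demonstration of that prediction step, here is a minimal C sketch, assuming the generator really is x(n) = (x(n-2) + x(n-3)) mod 3 and that the three observed moves were 0, 1, 2:

#include <stdio.h>

int main(void) {
    int seen[3] = {0, 1, 2}; /* the three observed robot moves: x(n-3), x(n-2), x(n-1) */
    int i;
    for (i = 0; i < 5; i++) { /* predict the next five moves */
        int next = (seen[1] + seen[0]) % 3; /* x(n) = (x(n-2) + x(n-3)) mod 3 */
        printf("predicted move: %d\n", next);
        seen[0] = seen[1]; /* slide the observation window forward */
        seen[1] = seen[2];
        seen[2] = next;
    }
    return 0;
}

The first prediction printed is 1, matching the hand calculation above; once the window is known, every future move follows deterministically.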

CPLEX outputting different results on consecutive runs - Asynchronity issue?

I'm running CPLEX from IBM ILOG CPLEX Optimization Studio 12.6.
Here I'm facing a weird issue: solving the same optimization problem (pure LP) multiple times in a row yields different results.
The aim is to solve once, then iteratively modify the coefficient matrix and re-solve the problem. However, we experienced that the changes between iterations did not correspond to the modifications.
This led us to try re-solving the problem without making modifications in between, which returned different results.
The catch is that we still do one major modification before we start iterating, and our hypothesis is that this change (cplex.setCoef(...) on about 10,000 rows) is done asynchronously, so that it is only partially done during the first re-solution iterations.
However, we cannot seem to find any documentation stating that this method is asynchronous, nor any way to ensure synchronous execution, so that all the changes are done before CPLEX restarts.
Does anyone know if this is the case? Is there any way to delay restart until cplex.setCoef(...) is done? The problem is quite huge, but the representative lines are:
functionUsingSetCoefOn10000rows();
for (var j = 0; j < 100; j++) {
    cplex.solve();
    writeln("Iteration " + j + ": " + cplex.getObjValue());
    for (var k = 0; k < 100000; k++) {
        doBusyWork(); // just to kill time
    }
}
which outputs
Iteration 0: 1529486959.814946
Iteration 1: 1544325969.750444
Iteration 2: 1549669732.757587
Iteration 3: 1551818419.584333
...
Iteration 33: 1564007987.849925
...
Iteration 98: 1564007987.849925
Iteration 99: 1564007987.849925
Last-minute update
Reducing the number of calls to cplex.setCoef to about 2,500 removes the issue, and all iterations return the same objective value. Sadly, we do need to change all 10,000 coefficients.
Edit: The OPL scripting and engine log: http://goo.gl/ywJhkm and here: http://goo.gl/v2Qhm9
Sorry that this is not really an answer, but it is too big to go as a comment...
I don't think that the setCoef() calls would be asynchronous and not complete; that would be very surprising. Such behaviour would be too unpredictable, and too many other people would have run into it. However, CPLEX itself will use multiple threads to solve a problem, and that means it can generate different solutions each time it runs. The example objective values that you show do seem to change significantly, so a few questions/observations:
1: The numbers seem to be monotonically increasing. Are they all increasing like this until they reach the maximum value? It looks like some kind of convergence behaviour. On re-running, CPLEX will start from a previous solution if it can. Check that there isn't some other CPLEX parameter stopping the search early, such as an iteration or time limit, or a wider solution optimality tolerance.
2: Have you looked at the CPLEX logs from each run to see what CPLEX is doing in each run?
3: If you have doubts about the model being solved, try dumping out the model as an LP file and check the values in each iteration. They should all be the same in your case. You can also try solving the LP file in the CPLEX standalone optimiser to see what value that gives.
4: Have you tried setting the parameters to make CPLEX use a different LP algorithm (e.g. primal simplex, barrier, etc.)?
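If you want to experiment with points 3 and 4, here is a rough sketch using the CPLEX C Callable Library (the question uses OPL scripting, where parameters are instead set as properties on the cplex object; the parameter and function names below are from the legacy C API and are worth double-checking against your 12.6 documentation):

#include <ilcplex/cplex.h>

int main(void) {
    int status;
    CPXENVptr env = CPXopenCPLEX(&status);
    CPXLPptr lp = CPXcreateprob(env, &status, "model");

    /* ... build the LP or read it from a file here ... */

    /* A single thread removes one source of run-to-run variation. */
    CPXsetintparam(env, CPX_PARAM_THREADS, 1);

    /* Pin the LP algorithm, e.g. primal simplex (point 4). */
    CPXsetintparam(env, CPX_PARAM_LPMETHOD, CPX_ALG_PRIMAL);

    /* Dump the model so each iteration can be diffed (point 3). */
    CPXwriteprob(env, lp, "iteration.lp", NULL);

    status = CPXlpopt(env, lp);

    CPXfreeprob(env, &lp);
    CPXcloseCPLEX(&env);
    return status;
}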

OpenCL: Better to use more registers, or to pack values using bit-shift operators?

In OpenCL, on GPUs, register pressure will reduce occupancy, so we want to reduce the number of registers used.
In my program, I have tons of values that are not known at compile time but are typically in the range 0 to 127. I'm wondering whether it might be better to pack these into a smaller number of registers, using bit-shift operators, rather than use tons of registers for them?
e.g., maybe create some macros like:
#define posToRow( pos ) ( ( (pos) >> 10 ) & ( (1 << 10) - 1 ) )
#define posToCol( pos ) ( (pos) & ( (1 << 10) - 1 ) )
#define rowColToPos( row, col ) ( ( (row) << 10 ) | (col) )
#define linearIdToPos( linearId, base ) ( rowColToPos( (linearId) / (base), (linearId) % (base) ) )
Thoughts on this? Any experiences with this? It would seem that the advantages/disadvantages are:
-using bit-shifts involves slightly more computation (but: bit-shifts are fast?)
-... but fewer registers?
This will probably not answer your question the way you want it.
In OpenCL, on GPUs, register pressure will reduce occupancy, so we want to reduce the number of registers used.
Register pressure will indeed reduce occupancy, but that does not imply it will reduce performance as a consequence: it depends on many other factors. If you haven't read it yet, I strongly encourage you to read Vasily Volkov's Better Performance at Lower Occupancy.
I'm wondering whether it might be better to pack these into a smaller number of registers, using bit-shift operators, rather than use tons of registers for these?
A very important rule in a developer's life is "don't optimize yet". This rule still stands even in high-performance code:
First, get your code to work correctly.
Then, if you find that performance is unsatisfactory and/or could be improved, profile it.
And then, identify the bottleneck and try to find a solution.
All in all, it is pointless to try to guess in advance about what may or may not create register pressure. Implement, measure, act. Don't try to forecast issues that do not exist yet and may very well never exist at all.
That being said, most of the time (if not always), when writing high-performance code (even "standard" code, actually), the only way to know which option is the fastest between several is to implement all of them and benchmark them. As such, I am afraid your question will have no definitive answer until you do such a benchmark.
As most of the answers here already mentioned, take care not to over-optimize your code.
Are your values typically between 0 and 127, or are they ALWAYS within this range? If the latter (you did not mention whether you had already tried this): why don't you turn to vector data (uchar16)?
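As a quick sanity check on the packing idea, here is a tiny host-side C test of the macros from the question (round-tripping every pair in the stated 0..127 range; since those values need only 7 bits, the 10-bit fields hold them with room to spare):

#include <assert.h>
#include <stdio.h>

#define posToRow( pos ) ( ( (pos) >> 10 ) & ( (1 << 10) - 1 ) )
#define posToCol( pos ) ( (pos) & ( (1 << 10) - 1 ) )
#define rowColToPos( row, col ) ( ( (row) << 10 ) | (col) )

int main(void) {
    int row, col;
    for (row = 0; row < 128; row++) {
        for (col = 0; col < 128; col++) {
            int pos = rowColToPos(row, col);
            assert(posToRow(pos) == row); /* high 10 bits recover the row */
            assert(posToCol(pos) == col); /* low 10 bits recover the column */
        }
    }
    printf("pack/unpack round-trips for all 128 x 128 pairs\n");
    return 0;
}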
