JMH measurement iterations

I'm using JMH and I find something hard to understand: I have one method annotated with @Benchmark and I set measurementIterations(3). The method is called 3 times, but within each iteration the function runs a rather large and seemingly random number of times.
My question is: is that number completely random? Is there a way to control it and determine how many times the function should run within an iteration? And what is the point of setting measurementIterations if, one way or another, the function will run a random number of times?

measurementIterations defines how many measured iterations of the benchmark you want. I don't know which parameters you have specified, but by default JMH runs the benchmark time-based (the default is, I believe, 1 second per iteration). This means the benchmark method is invoked as often as possible within that time frame. There are also ways to specify how often the method should be called in one iteration (batching); a minimal sketch follows below.
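For illustration, here is a small sketch of batching through the OptionsBuilder. MyBatchBenchmark is a placeholder class name and the counts are arbitrary; batching is usually combined with single-shot mode so that one iteration measures exactly one batch of calls:

import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class BatchingExample {
    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(MyBatchBenchmark.class.getSimpleName()) // placeholder benchmark class
                .mode(Mode.SingleShotTime)    // measure whole batches instead of running time-based
                .measurementBatchSize(1000)   // call the @Benchmark method 1000 times per iteration
                .warmupIterations(2)
                .measurementIterations(3)
                .forks(1)
                .build();
        new Runner(opt).run();
    }
}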
I would recommend studying the JMH samples provided by JMH: http://hg.openjdk.java.net/code-tools/jmh/file/tip/jmh-samples/src/main/java/org/openjdk/jmh/samples/
They are a very good introduction to JMH and cover the pitfalls that are easy to fall into when writing benchmarks.

The number of invocations per iteration depends on the JMH mode. I think you must be using AverageTime mode, which performs as many invocations as fit into each measured iteration.
The available modes are:
Mode.Throughput: Calculate the number of operations per unit of time.
Mode.AverageTime: Calculate the average running time per operation.
Mode.SampleTime: Sample how long it takes for a method to run (including percentiles).
Mode.SingleShotTime: Just run the method once (useful for cold-start testing).
For example, with Mode.SingleShotTime each iteration consists of exactly one invocation, so the method runs exactly the number of iterations you specify in the run (see below).
// Example runner class
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class BenchmarkRunner {
    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(JMHSample_01_HelloWorld.class.getSimpleName())
                .mode(Mode.SingleShotTime)   // one invocation per measured iteration
                .warmupIterations(1)         // number of warmup iterations (not measured)
                .measurementIterations(1)    // number of measured iterations
                .forks(1)
                .shouldDoGC(true)            // request a GC between iterations
                .build();
        new Runner(opt).run();
    }
}

JMH runs warm-up iterations that are not measured but are necessary for valid results.
measurementIterations defines how many iterations should actually be measured; warm-up iterations are not included in that count because they are not measured.
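The same counts can also be set with annotations directly on the benchmark; a small sketch (the class name and the benchmark body are just placeholders):

import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Warmup;

public class LogBenchmark {
    @Benchmark
    @Warmup(iterations = 2, time = 1, timeUnit = TimeUnit.SECONDS)      // not measured
    @Measurement(iterations = 3, time = 1, timeUnit = TimeUnit.SECONDS) // measured
    public double compute() {
        return Math.log(42.0);   // placeholder workload; returning the result avoids dead-code elimination
    }
}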

Yes, within each iteration the number of times the method runs is effectively random (it is simply as many times as the method can run within the iteration's time budget). That count is not what matters; what matters is the average time taken per invocation.
Besides, you can control how many iterations to run with measurementIterations() and the duration of every iteration with measurementTime().
For example, if you want to run your method for only 1 iteration lasting 1 ms, without warmup, just set warmupIterations to 0, measurementTime to 1 ms and measurementIterations to 1, like below:
Options opt = new OptionsBuilder()
        .include(xxx.class.getSimpleName())
        .warmupIterations(0)
        .measurementTime(TimeValue.milliseconds(1))
        .measurementIterations(1)
        .forks(1)
        .build();
The significance of multiple iterations: the more you run, the more reliable the results should be.


Flipping coin simulation with gain/loss

Suppose you successively toss a fair coin; each time the result is heads you win $1, while if you get tails you lose $1. Your initial capital is $3. The throws stop when your capital is reduced to zero or reaches $10. Let X_n be the process that describes your capital at the nth throw.
1. Simulate the X_n process 1000 times and present the graph of its evolution in R.
2. Estimate the average number of throws until you stop. Is the result expected?
Can someone help me solve this or at least understand the steps I am supposed to take?
Someone already posted a link to a solution to your homework in the comments. I fear, however, that this uncommented code is incomprehensible to you, given that you asked the question in the first place.
I would therefore suggest first writing your own implementation: an outer for loop over the runs and an inner while loop conditioned on the running capital; call rbinom on each throw and update the running capital. Store the length of each run in a numeric vector and call mean on that vector.
It starts to become interesting when you measure the runtime of your solution, which will be surprisingly slow. To speed it up, you must use "vectorization", which the linked-to solution uses, but that is a completely different topic to be left for a different lesson...

What if the FD steps varied w.r.t. output/input

I am using the finite difference scheme to find gradients.
Let's say I have 2 outputs (y1, y2) and 1 input (x) in a single component, and I know in advance that the sensitivity of y1 with respect to x is not the same as the sensitivity of y2 with respect to x. Thus I could potentially use two different steps for those, as in:
self.declare_partials(of='y1', wrt='x', method='fd', step=0.01, form='central')
self.declare_partials(of='y2', wrt='x', method='fd', step=0.05, form='central')
There is nothing that stops me from doing this algorithmically, but it is not clear what exactly the OpenMDAO gradient calculation would do in this case.
Does it exchange information between the two cases by looking at the ratio of the steps, or does it simply treat them independently and therefore double the computational time?
I just tested this, and it does the finite difference twice with the two different step sizes, and only saves the requested outputs for each step. I don't think we could do anything with the ratios as you suggested, since the reason for using different step sizes to resolve individual outputs is that you don't trust the accuracy of the outputs at the smaller (or larger) step size.
This is a fair question about the effect of the API. In typical FD applications you would get only 1 function call per design variable for forward and backward difference and 2 function calls for central difference.
However in this case, you have asked for two different step sizes for two different outputs, both with central difference. So here, you'll end up with 4 function calls to compute all the derivatives. dy1_dx will be computed using the step size of .01 and dy2_dx will be computed with a step size of .05.
There is no crosstalk between the two different FD calls, and you do end up with more function calls than you would have if you just specified a single step size via:
self.declare_partials(of='*', wrt='x', method='fd', step=0.05, form='central')
If the cost is something you can bear, and you get improved accuracy, then you could use this method to get different step sizes for different outputs.

Wall time dominated by setup()

I have created an OpenMDAO problem where the total wall time is dominated by prob.setup(). Calling prob.run() takes 10 seconds, while calling prob.setup() takes 1916 seconds. There are 8 individual components. The root group has 20 groups with 4 subgroups of 17 sub-subgroups. The total number of params in the whole system is 115,021, but almost all of them are 20 user inputs that are promoted throughout. I will be using this for optimization. Is there a way to speed this up, especially since all the lowest groups use exactly the same params except for one or two? Has any scalability testing been done for larger problems like this? Is it possible to run setup() in parallel?
We have done some work on getting setup times under control for some problems. Things get expensive when you have a lot of separate variables. Our usual trick is to link things into bigger array variables and have components depend on slices of the larger arrays.
Setup does work for problems that run in parallel, but we haven't parallelized setup itself.

How to verify number of function evaluations when profiling R code

When profiling R code with Rprof-type functions, we get the time spent in a function alone and the time spent in a function and its callees. However, as far as I know, we don't get the number of times a given function was evaluated.
For example, assume I want to compare two integration functions:
integrate_1(myfunc, from = -Inf, to = Inf)
integrate_2(myfunc, from = -Inf, to = Inf)
I could easily see how much time each function takes and where this time was spent, but I don't know how to check how many times myfunc had to be evaluated in each of the integrate functions.
Thanks,
One way of implementing Joran's counter method is to use the trace function.
For example, first we set the counter to zero. (Assigned in the global environment, for convenience.)
count <- 0
Then set up the trace. Here we set it on the identity function (that just returns the value that you input to it).
trace("identity", quote(count <<- count + 1), print = FALSE)
Now whenever identity is called, the value of count is incremented. print = FALSE just stops a message being printed to the console when the function is called.
Let's call the function a few times and inspect the count:
for(i in seq_len(123)) identity(1)
count
## [1] 123
Rprof works by sampling the call stack on a timer. It does not count calls.
It records the sampled call stacks in a file, and though it does not record line numbers where calls occur, those samples are still useful for seeing what causes time to be spent.
For example, if you happen to look at M random samples, and you see a pattern like A calling B calling C on N of them, then you know the program spends roughly fraction N/M of its time doing that (assuming N > 1).
If you see such a thing, and you can think of a way to avoid even part of it, you will save a substantial fraction of the total time.
Rprof comes with a summarization tool that gives you the kind of numbers you mentioned, but I don't find those numbers useful anyway.
I would much rather get a real sense of what's happening.

Formula to prioritize tasks based on weight and date

Is there a formula or algorithm which can prioritize items based on weight and date? For instance, a critical item would always be at the top of the list, while two normal items would be prioritized based on their due date.
Scheduling is one of the most-studied areas of computer science, which is convenient, because it gives a lot of prior art that you can learn from.
Perhaps the easiest approach is Earliest Deadline First: schedule the task with the earliest deadline and work on it until it finishes or blocks, then move on to the task with the next earliest deadline. The downside is that low-priority tasks that take a long time might stall higher-priority tasks.
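A minimal Java sketch of that idea (the Task type and the sample deadlines are hypothetical, purely for illustration): tasks are kept in a priority queue ordered by deadline, and the earliest deadline is always worked on first.

import java.time.LocalDateTime;
import java.util.Comparator;
import java.util.PriorityQueue;

public class EarliestDeadlineFirst {
    record Task(String name, LocalDateTime deadline) {}

    public static void main(String[] args) {
        // Tasks come out of the queue in order of their deadlines.
        PriorityQueue<Task> queue = new PriorityQueue<>(Comparator.comparing(Task::deadline));

        queue.add(new Task("file report", LocalDateTime.of(2024, 1, 20, 17, 0)));
        queue.add(new Task("reply to client", LocalDateTime.of(2024, 1, 15, 9, 0)));
        queue.add(new Task("book travel", LocalDateTime.of(2024, 1, 18, 12, 0)));

        while (!queue.isEmpty()) {
            Task next = queue.poll();   // the task with the earliest remaining deadline
            System.out.println("Work on: " + next.name() + " (due " + next.deadline() + ")");
        }
    }
}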
It might be worthwhile to determine whether your scheduling must be hard, firm, or soft -- sometimes it makes more sense to drop some tasks completely and finish nearly everything on time than to finish everything but half a second too late.
Yes. One option is to define a comparison function that checks priority first, i.e.:
// Returns n < 0, 0, or n > 0 if value1 is less than, equal to,
// or greater than value2, comparing by priority and then by date.
compare(value1, value2) {
    if (value1.priority != value2.priority) {
        return value1.priority - value2.priority;
    }
    return value1.date - value2.date;
}
Alternatively, a function like the following returns a single value calculated from the date and the priority; this value can be used to compare tasks and order them by priority (and then by date):
// Returns a single sortable score; multiplying the priority by MAX_DATE_VALUE
// ensures that the priority always dominates the date component.
task.GetValue() {
    return me.GetDateAsIntegerValue() + MAX_DATE_VALUE * me.GetPriority();
}
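For concreteness, here is a small runnable Java sketch of the same priority-then-date ordering; the Task class, its fields, and the sample data are hypothetical, introduced only for illustration.

import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

public class TaskOrdering {
    static class Task {
        String name;
        int priority;       // higher number = more important (assumption for this sketch)
        LocalDate dueDate;

        Task(String name, int priority, LocalDate dueDate) {
            this.name = name;
            this.priority = priority;
            this.dueDate = dueDate;
        }
    }

    public static void main(String[] args) {
        List<Task> tasks = new ArrayList<>();
        tasks.add(new Task("Check emails", 1, LocalDate.of(2024, 1, 10)));
        tasks.add(new Task("Call potential customer", 9, LocalDate.of(2024, 1, 20)));
        tasks.add(new Task("Write report", 1, LocalDate.of(2024, 1, 5)));

        // Highest priority first; ties are broken by the earliest due date.
        tasks.sort((a, b) -> {
            if (a.priority != b.priority) {
                return Integer.compare(b.priority, a.priority);
            }
            return a.dueDate.compareTo(b.dueDate);
        });

        tasks.forEach(t -> System.out.println(t.priority + "  " + t.dueDate + "  " + t.name));
    }
}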
But just as sarnold mentioned, this is a highly studied area.
A different way to look at this is as a ranking problem. If you take the two values, weight and priority, as inputs, you can create a table of paired comparisons in which items are decomposed into their inputs (weight and priority) and the outputs are their relative orderings.
Consider, say, item 42 and item 69, denoted X42 and X69: if you have their weights and priorities (W42, P42) and (W69, P69), you'd like to know whether X42 should appear before X69, after it, or at an equal position. If you have a training set, you can tag whether one is preferred to the other.
What we're lacking here is a method for comparing these. A very simple method is to use logistic regression on the differences, i.e. a simple function f( (W_A - W_B), (P_A - P_B)), or f((W42 - W69),(P42 - P69)), in this case. If the result is above some threshold, then A is preferred to B, otherwise B is preferred to A. You can use this to sort the results.
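As a rough illustration of that approach, here is a small Java sketch. It assumes the two coefficients of the logistic model have already been fitted on labelled pairs; the coefficient values and the example inputs are made up purely for illustration.

public class PairwiseRanker {
    // Hypothetical coefficients, e.g. from fitting a logistic regression
    // on labelled pairwise comparisons; the values are purely illustrative.
    static final double COEF_WEIGHT = 1.3;
    static final double COEF_PRIORITY = 0.7;

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Probability that item A should be ranked ahead of item B,
    // computed from the differences of their inputs.
    static double probabilityABeforeB(double weightA, double priorityA,
                                      double weightB, double priorityB) {
        double z = COEF_WEIGHT * (weightA - weightB)
                 + COEF_PRIORITY * (priorityA - priorityB);
        return sigmoid(z);
    }

    public static void main(String[] args) {
        // e.g. X42 = (W42, P42) = (5.0, 2.0) and X69 = (W69, P69) = (3.0, 4.0)
        double p = probabilityABeforeB(5.0, 2.0, 3.0, 4.0);
        System.out.println(p > 0.5 ? "X42 before X69" : "X69 before X42");
    }
}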
As usual, most of the results online are not very accessible to beginners. Here's a short chapter that may be helpful in understanding the logistic regression. However, if you'd like to address such matters in more depth, the statistics StackExchange site would be better.
You'll have to decide: (1) if what you're looking at can be decomposed into an additive function of the weight and priority, and, if so, (2) the loss function or objective function that you need to minimize, so that you can get the optimal parameters for this additive function. An ordinal logistic model is one choice, ordinal probit another, and there are tons of other options. If you don't use an additive function (i.e. a linear combination), you'll have a challenging range of possibilities to consider, so it's best to start with something simple.
You can separate the tasks by rating the impact from 1 to 10 (10 being highest) and the output needed from 1 to 10 (also 10 being hardest).
Add the two numbers together and divide by two; the result is the priority ranking of the task from 1 to 10 (10 being most important).
Example:
Check emails: impact 2, output 1 → (2 + 1) / 2 = 1.5
Call potential customer: impact 10, output 2 → (10 + 2) / 2 = 6
From this example, calling the customer would be placed at a higher priority than checking emails.
