Is Randoop's --time-limit for spending that time generating tests for each class, or for the entire project? For example, if I set --time-limit=180 for a project with over 100 classes, will Randoop spend 180 seconds on all of the classes together, or 180 seconds generating tests for each class?
Per the Randoop manual, it is the time for the entire project:
--time-limit=int. Maximum number of seconds to spend generating tests. Zero means no limit. If nonzero, Randoop is nondeterministic: it may generate different test suites on different runs.
The default value is appropriate for generating tests for a single class in the context of a larger program, but is too small to be effective for generating tests for an entire program.
If you are testing multiple classes, you might want to multiply the default time limit by the number of classes being tested.
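For example, at the time of writing the documented default is 100 seconds, so a project with 100 classes under test would call for something on the order of the following invocation (the classpath and class-list file are placeholders):

java -classpath myproject.jar:randoop.jar randoop.main.Main gentests \
    --classlist=myclasses.txt \
    --time-limit=10000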
My book mentions: "Depending on what you consider as the baseline, the reduction can be viewed as decreasing the number of clock cycles per instruction (CPI), as decreasing the clock cycle time, or as a combination. If the starting point is a processor that takes multiple clock cycles per instruction, then pipelining is usually viewed as reducing the CPI."
What I fail to understand is whether pipelining affects CPI or the clock period. With pipelining, the clock period is taken as max stage delay + latch delay, so pipelining does affect the clock time. It also affects CPI, because CPI becomes 1 with pipelining. Am I missing some concept?
Executing an instruction requires a set of operations. For the sake of simplicity assume there are 5:
instruction fetch, decode, execute, memory access, write back.
This can be implemented with several schemes.
A/ Single-cycle processor
The scheme is the following:
The processor fetches an instruction and directs it to a decoder, which controls a bank of multiplexers that configure a large combinational datapath implementing the instruction.
In this model, every instruction requires one cycle, and, assuming all the 5 "stages" require an equal time t, the period will be 5t.
Hence CPI=1, T=5t
Actually, this was more or less the underlying model of the earliest computers in the late 40's. Since then, no real processor has been built like that, but it is theoretically quite doable.
B/ Multi cycle processor
Compared to the previous model, you introduce registers on the datapath. First, the processor fetches the instruction and sends it to the inputs of an automaton that will sequentially apply the computation "stages".
In that case, instructions require 5 cycles (maybe slightly fewer, as some instructions may be simpler and, for instance, skip the memory access). The period is 1t (or maybe slightly more, to take into account the register traversal time).
CPI=5, T=1t
The first "true" computers were implemented like that and this was the main architectural model up to the early 80's. Nowadays several microcontrollers or, for instance, the simpler version of NIOS, are still relying on this scheme.
C/ Pipelined processor
You add extra registers between the stages in order to keep track of the instruction and of all the partial results. In that case, the execution of every stage can be independent and you can execute several instructions simultaneously in different stages.
CPI becomes 1, as you can start a new instruction at every clock cycle (probably a bit more because of the hazards, but that is another story).
And T=1t.
So CPI=1, T=1t
(the CPI reflects the throughput increase but the execution time of a single instruction is not reduced)
So pipelining can be seen as either reducing the cycle time wrt scheme A, or reducing the CPI wrt scheme B. And you can also imagine an intermediate scheme (say 3 stages, with a period of 2t) where pipelining reduces both.
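To make the trade-off concrete, a quick back-of-the-envelope comparison for N = 1000 instructions, ignoring hazards and latch overhead:

A/ single cycle: 1000 instructions × 1 cycle × 5t = 5000t
B/ multi cycle: 1000 instructions × 5 cycles × 1t = 5000t
C/ pipelined: (1000 + 4) cycles × 1t ≈ 1004t (4 extra cycles to fill the 5-stage pipeline)

The total time drops roughly 5× with the pipeline, even though each individual instruction still spends 5t in flight.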
I've implemented a sensor scheduling problem using OptaPlanner 6.2 that has 1 hard constraint, 1 medium constraint, and 1 soft constraint. The trouble I'm having is that the solver satisfies some of the hard constraints after 30 seconds or so, and then makes very little progress on the remaining ones with additional minutes of solving before termination. I don't think the problem is over-constrained; I also don't know how to help the local search process significantly increase the score.
My problem is a scheduling one, where I precalculate all possible times (intervals) that a sensor can observe objects prior to solving. I've modeled the problem as follows:
Hard constraint - no intervals can overlap
when
    $s1 : A( interval != null, $id : id, $doy : interval.doy, $interval : interval, $sensor : interval.getSensor() )
    exists A( id > $id, interval != null, $interval2 : interval,
              $interval2.getSensor() == $sensor, $interval2.getDoy() == $doy,
              $interval.getStartSlot() <= $interval2.getEndSlot(),
              $interval.getEndSlot() >= $interval2.getStartSlot() )
then
    scoreHolder.addHardConstraintMatch(kcontext, -10000);
Medium constraint - every assignment should have an Interval
when
    A( interval == null )
then
    scoreHolder.addMediumConstraintMatch(kcontext, -100);
Soft constraint - maximize a property/value in the Interval class
when
    $s1 : A( interval != null )
then
    scoreHolder.addSoftConstraintMatch(kcontext, -1 * $s1.getInterval().getSomeProperty());
A: planning entity class; each instance is an assignment for a particular object (i.e. has a member objectid that corresponds with one in the Interval class)
Interval: planning variable class, all possible intervals (start time, stop time) for the sensor and objects
In A, I restrict the value range to just the Interval instances for the object associated with that assignment. For my test case, there are 40000 or so Intervals, but only dozens for each object. There are about 1100 instances of A (so dozens of possible Intervals for each one).
@PlanningVariable(valueRangeProviderRefs = {"intervalRange"}, strengthComparatorClass = IntervalStrengthComparator.class, nullable = true)
public Interval getInterval() {
return interval;
}
@ValueRangeProvider(id = "intervalRange")
public List<Interval> getPossibleIntervalList() {
return task.getAvailableIntervals();
}
In my solution class:
//have tried commenting this out since the overall interval list does not apply to all A
@ValueRangeProvider(id = "intervalRangeAll")
public List<Interval> getIntervalList() {
return intervals;
}
@PlanningEntityCollectionProperty
public List<A> getAList() {
return AList;
}
I've reviewed the documentation and tried a lot of things. My problem is somewhat of a cross between the nurse rostering and course scheduling examples, which I have looked at. I am using the FIRST_FIT_DECREASING construction heuristic.
What I have tried:
Turning on and off nullable in the planning variable annotation for A.getInterval()
Late acceptance, Tabu, both.
Benchmarking. I wasn't seeing any problems and average
Adding an IntervalChangeFactory as a moveListFactory, and restricting the custom ChangeMove based on whether the interval can be accepted or not (i.e. enforcing, or not, the hard constraint in IntervalChangeMove.isDoable).
Here is an example run, where most of the hard constraints are not satisfied, but the medium ones are:
[main] INFO org.optaplanner.core.impl.solver.DefaultSolver - Solving started: time spent (202), best score (0hard/-112500medium/0soft), environment mode (REPRODUCIBLE), random (WELL44497B with seed 987654321).
[main] INFO org.optaplanner.core.impl.constructionheuristic.DefaultConstructionHeuristicPhase - Construction Heuristic phase (0) ended: step total (1125), time spent (2296), best score (-9100000hard/0medium/-72608soft).
[main] INFO org.optaplanner.core.impl.localsearch.DefaultLocalSearchPhase - Local Search phase (1) ended: step total (92507), time spent (30000), best score (-8850000hard/0medium/-74721soft).
[main] INFO org.optaplanner.core.impl.solver.DefaultSolver - Solving ended: time spent (30000), best score (-8850000hard/0medium/-74721soft), average calculate count per second (5643), environment mode (REPRODUCIBLE).
So what I don't understand is why the hard constraints can't be dealt with by the search process. Also, my calculate count per second has dropped to below 10000 due to all the tinkering I've done.
If it's not due to a score trap (see the docs; this is the first thing to rule out), it's probably because the solver gets stuck in a local optimum and there are no moves that go from one feasible solution to another feasible solution except those that don't really change much. There are several ways to deal with that:
Add coarse-grained moves (but still leave the fine-grained moves such as ChangeMove in!). You can add generic coarse-grained moves (such as the pillar moves) or add a custom Move. Don't start making smarter selectors that try to only select feasible moves; that's a bad idea (it will kill your ACC (average calculate count) or limit your search space). Just mix in coarse-grained moves (= diversification) to complement the fine-grained moves (= intensification); see the config sketch after this list.
A better domain model might help too. The Project Job Scheduling and Cheap Time scheduling examples have a domain model which naturally leads to a smaller search space while still allowing all feasible solutions.
Increase the Tabu list size or the LA size, or include a hard score component in the SA starting temperature. But I presume you've tried that with the benchmarker.
Turn on TRACE logging to see OptaPlanner's decision making. Only look at the part after it reaches the local optimum.
In the future, I'll also add Ruin&Recreate moves, which will be far less code than custom moves or a better domain model (but will be less efficient).
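As an illustration of that move mix, here is a minimal local-search config sketch (standard OptaPlanner 6.x selector names; the late acceptance size is just a starting point to tune with the benchmarker):

<localSearch>
  <unionMoveSelector>
    <!-- fine-grained moves (intensification) -->
    <changeMoveSelector/>
    <swapMoveSelector/>
    <!-- coarse-grained move (diversification): swaps whole groups of entities at once -->
    <pillarSwapMoveSelector/>
  </unionMoveSelector>
  <acceptor>
    <lateAcceptanceSize>400</lateAcceptanceSize>
  </acceptor>
</localSearch>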
I am using Visual Studio TS Load Test for running a WebTest (one client/controller hitting one server). How can I configure a goal-based load pattern to achieve a constant tests/second rate?
I tried to use the counter 'Tests/Sec' under 'LoadTest:Test' but it does not seem to do anything.
I've recently tested against Tests/Sec, and I've confirmed it works.
For the settings on the Goal Based Load Pattern, I used:
Category: LoadTest:Test
Counter: Tests/Sec
Instance: _Total
When the load test starts, verify it doesn't show an error re: not being able to access that Performance Counter.
Tests I ran for my own needs:
Set the Initial User Load quite low (10) and gave it 5 minutes to see if it would reach the Tests/Sec target and stabilise. In my case, it stabilised after about 1 minute 40.
Set the Maximum User Count [Increment|Decrement] to 50. Turns out the user load would yo-yo up and down quite heavily, as it would keep trying to play catch-up. (The tests took 10-20 seconds each.)
Set the Initial User Load quite close to the 'answer' from test 1, and watched it make small but constant adjustments to the user volume.
Note: When watching the stats, watch the value under "Last". I believe the "Average" is averaged over a relatively long period, and may appear out of step with the target.
I'm experiencing a really weird case while doing some performance tuning of Hadoop. I was running a job with large intermediate output (like InvertedIndex or WordCount without a combiner); the network and computation resources are all homogeneous. According to how MapReduce works, when there are more WAVES of reduce tasks the overall run time should be slower, as there is less overlap between map and shuffle, but that is not the case. It turns out that the job with 5 WAVES of reduce tasks is about 10% faster than the one with only one WAVE. I checked the logs, and it turns out that the map tasks' execution time is longer when there are fewer reduce tasks; also, the overall computation time (not shuffle or merge) during the reduce phase is longer when there are fewer tasks.
I tried to rule out other factors: I set the reduce slow-start factor to 1 so that there is no overlap between map and shuffle; I limited execution to only one reduce task at a time so there is no overlap between reduce tasks; and I modified the scheduler to force mappers and reducers onto different machines so there is no I/O congestion. Even with the above, the same thing still happens. (I also set the map memory buffer to be large enough, io.sort.factor to 32 or even larger, and io.sort.mb to larger than 320 accordingly.)
I really can't think of any other reason that cause this problem, so any suggestions would be greatly appreciated!
Just in case of confusion, the problem I am experiencing is:
0. I'm comparing the performance of running 1 reduce task vs 5 reduce tasks of the same job under otherwise identical configurations. There is only one tasktracker for reduce computation.
1. I have forced all reduce tasks to execute sequentially by having only one tasktracker for reduce tasks in both cases, and mapred.tasktracker.reduce.tasks.maximum=1, so there won't be any parallelism during the reduce phase.
2. I have set mapred.reduce.slowstart.completed.maps=1 so none of the reducers will start to pull data before all maps are done.
3. It turns out that having one reduce task is slower than having 5 SEQUENTIAL reduce tasks!
4. Even if I set mapred.reduce.slowstart.completed.maps=0.05 to allow overlap between map & shuffle (so with only one reduce task the overlap should be greater and it should run faster, because the 5 reduce tasks execute SEQUENTIALLY), the 5-reduce-task job is still faster, and the map phase of the 1-reduce-task job becomes slower!
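For reference, a sketch of the settings described above, using the old (Hadoop 1.x) property names via the Java Configuration API, with the values stated in the question:

Configuration conf = new Configuration();
// one reduce slot per tasktracker, so reduce tasks run sequentially
// (a tasktracker-level setting, normally placed in mapred-site.xml)
conf.set("mapred.tasktracker.reduce.tasks.maximum", "1");
// reducers start only after 100% of maps finish: no map/shuffle overlap
conf.set("mapred.reduce.slowstart.completed.maps", "1.00");
// larger merge factor and sort buffer, as described above
conf.setInt("io.sort.factor", 32);
conf.setInt("io.sort.mb", 320);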
This is not a problem. The more reduce tasks you have, the faster your data gets processed.
The outputs of the map phase are sent to the reducers. If you have two reducers, the load is distributed between the two.
In the case of the wordcount example, you would have two separate output files with the counts divided between them. So you would have to add the totals manually, or run another MapReduce job to calculate the totals, if you had lots of reduce tasks.
This is as expected: if you only have a single reducer, your job has a single point of failure. Your number of reducers should be set to about 90% of capacity. You can find your reduce capacity by multiplying your number of reduce slots per node by your total number of nodes. I have found that it is also good practice to use a combiner where applicable.
If you have just 1 reduce task, then that reducer has to wait for all mappers to finish, and the shuffle phase has to collect all intermediate data to be redirected to just that one reducer. So, it's natural that the map and shuffle times are larger, and so is the overall time, if you have just one reducer.
However, if you have more reducers, your data gets processed in parallel, and that makes it faster. Again, if you have too many reducers, then there's too much data being shuffled around, resulting in increased network traffic. So you have to find the optimal number of reducers that gives you a good balance.
The right number of reduces seems to be 0.95 or 1.75 * (nodes * mapred.tasktracker.tasks.maximum). At 0.95 all of the reduces can launch immediately and start transferring map outputs as the maps finish. At 1.75 the faster nodes will finish their first round of reduces and launch a second round of reduces, doing a much better job of load balancing.
courtesy:
http://wiki.apache.org/hadoop/HowManyMapsAndReduces
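As a quick worked example (cluster size assumed for illustration): with 10 nodes and 2 reduce slots each, reduce capacity is 20, so 0.95 × 20 ≈ 19 reduces gives one wave, and 1.75 × 20 = 35 gives two better-balanced waves. In the job driver:

// hypothetical cluster: 10 nodes × 2 reduce slots = 20 reduce slots
// one wave: 0.95 * 20 ≈ 19; two balanced waves: 1.75 * 20 = 35
job.setNumReduceTasks(19);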
Setting the number of map tasks and reduce tasks (similar question with a resolved answer)
Hope this helps!
If I monitor the ASP.NET Requests/Sec performance counter every N seconds, how should I interpret the values? Is it the number of requests processed during the sample interval divided by N? Or is it the current requests/sec regardless of the sample interval?
It is dependent upon your N-second interval, which coincidentally is also why the performance counter will always read 0 on the first NextValue() call after being initialized: it makes all of its calculations relative to the last point at which you called NextValue().
You may also want to beware of calling the counter's NextValue() function at intervals of less than a second, as that can tend to produce some very outlandish outlier results. For my uses, calling the function every 5 seconds provides a good balance between up-to-date information and a smooth average.
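To make that concrete, a minimal C# sketch (the category/counter/instance names shown are from the standard ASP.NET Applications counter set; adjust them to the counter you actually monitor):

using System;
using System.Diagnostics;
using System.Threading;

var counter = new PerformanceCounter("ASP.NET Applications", "Requests/Sec", "__Total__", true);
counter.NextValue();                    // first call only establishes the baseline, so it returns 0
while (true)
{
    Thread.Sleep(5000);                 // N = 5 seconds; sub-second intervals give noisy outliers
    float reqPerSec = counter.NextValue();  // average rate over the 5 s since the previous call
    Console.WriteLine(reqPerSec);
}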