Opened 10 months ago

Last modified 5 weeks ago

#2634 assigned feature request

Run-length distribution (RLD) Analysis of Algorithm Instances

Reported by: abeham Owned by: abeham
Priority: medium Milestone: HeuristicLab 3.3.15
Component: Analysis Version: 3.3.13
Keywords: Cc:

Description

Run-length distributions give insight into the required algorithmic effort (in terms of function evaluations or wall-clock time) for reaching a certain target.

The empirical cumulative distribution function (ECDF) can be calculated from several runs of an optimization algorithm. From a first-hit graph several targets will be defined in the objective function and it will be measured when a target is hit by an algorithm. The proportion of all these hits forms the distribution.

RLD Analysis is to be included through analyzer operators that can be attached to the Analyzers of an algorithm instance. When run they track the monotonic convergence graph. In addition a run collection view is to be created that constructs and displays the ECDF.

Change History (9)

comment:1 Changed 10 months ago by abeham

  • Owner set to abeham
  • Status changed from new to accepted

comment:2 Changed 9 months ago by abeham

r14101: Fixed some remaining bugs in runcollection rld view regarding runs with unequal length

comment:3 Changed 9 months ago by abeham

  • Component changed from Analysis.Views to Analysis
  • Owner changed from abeham to mkommend
  • Status changed from accepted to reviewing

Trunk Integration

r14102: Integrated RLD analysis into trunk

  • HeuristicLab.Analysis:
    • IndexedDataTable<T>, IndexedDataRow<T>
    • QualityPerClockAnalyzer, QualityPerEvaluationsAnalyzer
    • ExpectedRuntimeHelper
  • HeuristicLab.Analysis.Views:
    • IndexedDataTableView
  • HeuristicLab.Optimization.Views:
    • RunCollectionRLDView

To test:

  1. Configure an algorithm/problem combination
  2. In the algorithm's analyzers add the QualityPerEvaluationsAnalyzer
  3. Make sure the BestAverageWorstQualityAnalyzer is executed before
  4. Run the algorithm several times
  5. In the Runs tab select the "Run-length Distribution View"

mkommend, if you don't have time for review, please move it to the next release milestone.

comment:4 Changed 9 months ago by mkommend

  • Milestone changed from HeuristicLab 3.3.14 to HeuristicLab 3.3.15

comment:5 Changed 7 months ago by mkommend

  • Owner changed from mkommend to abeham
  • Status changed from reviewing to assigned

Review comments

IndexedDataRow

  • ItemDescription "A data row that contains time series." is somehow misleading. Isn't it possible that the data row contains other data than time series?
  • Forward of NameChange to the VisualProperty.DisplayProperties (compare DataRow line 119-123)

IndexedDataTable

  • RegisterRowsEvents should not be virtual due to the call in the ctor. IMHO it is better to make it private and not virtual and the registered methods (e.g. rows_ItemsAdded) protected virtual. As a result a potential subclass is automatically registered to the relevant events, but could decide how to react on those by overriding the registered methods.

ExpectedRuntTimeHelper

  • Why are successful and unsuccessful timestamps < avg - 2 stddev removed for the ert calculation (line 33-34) ?

ErtCalculationResult

  • Fields should be readonly
  • ToString depends on the SuccessfulRuns. Wouldn't it be better to set the ExpectedRuntime directly to infinity and let .Net handle the string display.

QualityPerEvaluationsAnalyzer

  • I don't understand the second part of the if clause in line 92.
     if (values.Count > 0 && evaluations < values.Last().Item1) evaluations = 1;
    
  • Comments in line 94 and 101 are misleading. The values are not directly replaced, but rather in the next application of the analyzer. Does this code work if two consecutive improvements are achieved? (I think so) The code is IMHO pretty complex.
  • Line 101 checks for inequality between the last item and the best quality.
    var improvement = values.Last().Item2 != bestQuality;
    

What if its not an improvement but a decrease in quality? Should the graph still be updated and is the variable named incorrectly or should the best values be kept? The current implementation depends on the bestQuality which is updated only if an improvement has been achieved.

  • Why is the data row called first-hit graph? Is there a second hit graph as well?
  • What is the difference between newEntry and the newly created Tuple?

The same comments apply to the QualityPerExecutionTimeAnalyzer as well.

IndexedDataTableView

  • Line 160 - 162: wouldn't it be easier to write series.Color = row.VisualProperty.Color; ?
  • A whole lot of code (> 600 lines)
  • IsInvalidValue is never used.

comment:6 Changed 3 months ago by abeham

r14647:

  • fixed RLD analysis for problem instances with best known quality of 0
  • added option to perform comparison on relative targets or absolute targets
  • some performance improvements

comment:7 Changed 3 months ago by abeham

r14652:

  • changed relative/absolute checkbox button into combobox
  • added optional markers
  • added translucency of line in case run-data is more scarce (e.g. because they're already terminated)

comment:8 Changed 3 months ago by abeham

r14654:

  • restructured code a little to be easier to understand
  • removed some redundancies

comment:9 Changed 5 weeks ago by abeham

r14775: fixed bugs with respect to total number of runs

  • some further restructuring to increase readability
  • fixing numerical instability issues
Note: See TracTickets for help on using tickets.