Free cookie consent management tool by TermsFeed Policy Generator

Opened 8 years ago

Closed 7 years ago

#2634 closed feature request (done)

Run-length distribution (RLD) Analysis of Algorithm Instances

Reported by: abeham Owned by: abeham
Priority: medium Milestone: HeuristicLab 3.3.15
Component: Analysis Version: 3.3.13
Keywords: Cc:

Description

Run-length distributions give insight into the required algorithmic effort (in terms of function evaluations or wall-clock time) for reaching a certain target.

The empirical cumulative distribution function (ECDF) can be calculated from several runs of an optimization algorithm. From a first-hit graph several targets will be defined in the objective function and it will be measured when a target is hit by an algorithm. The proportion of all these hits forms the distribution.

RLD Analysis is to be included through analyzer operators that can be attached to the Analyzers of an algorithm instance. When run they track the monotonic convergence graph. In addition a run collection view is to be created that constructs and displays the ECDF.

Change History (15)

comment:1 Changed 8 years ago by abeham

  • Owner set to abeham
  • Status changed from new to accepted

comment:2 Changed 8 years ago by abeham

r14101: Fixed some remaining bugs in runcollection rld view regarding runs with unequal length

comment:3 Changed 8 years ago by abeham

  • Component changed from Analysis.Views to Analysis
  • Owner changed from abeham to mkommend
  • Status changed from accepted to reviewing

Trunk Integration

r14102: Integrated RLD analysis into trunk

  • HeuristicLab.Analysis:
    • IndexedDataTable<T>, IndexedDataRow<T>
    • QualityPerClockAnalyzer, QualityPerEvaluationsAnalyzer
    • ExpectedRuntimeHelper
  • HeuristicLab.Analysis.Views:
    • IndexedDataTableView
  • HeuristicLab.Optimization.Views:
    • RunCollectionRLDView

To test:

  1. Configure an algorithm/problem combination
  2. In the algorithm's analyzers add the QualityPerEvaluationsAnalyzer
  3. Make sure the BestAverageWorstQualityAnalyzer is executed before
  4. Run the algorithm several times
  5. In the Runs tab select the "Run-length Distribution View"

mkommend, if you don't have time for review, please move it to the next release milestone.

comment:4 Changed 8 years ago by mkommend

  • Milestone changed from HeuristicLab 3.3.14 to HeuristicLab 3.3.15

comment:5 Changed 8 years ago by mkommend

  • Owner changed from mkommend to abeham
  • Status changed from reviewing to assigned

Review comments

IndexedDataRow

  • ItemDescription "A data row that contains time series." is somehow misleading. Isn't it possible that the data row contains other data than time series?
  • Forward of NameChange to the VisualProperty.DisplayProperties (compare DataRow line 119-123)

IndexedDataTable

  • RegisterRowsEvents should not be virtual due to the call in the ctor. IMHO it is better to make it private and not virtual and the registered methods (e.g. rows_ItemsAdded) protected virtual. As a result a potential subclass is automatically registered to the relevant events, but could decide how to react on those by overriding the registered methods.

ExpectedRuntTimeHelper

  • Why are successful and unsuccessful timestamps < avg - 2 stddev removed for the ert calculation (line 33-34) ?

ErtCalculationResult

  • Fields should be readonly
  • ToString depends on the SuccessfulRuns. Wouldn't it be better to set the ExpectedRuntime directly to infinity and let .Net handle the string display.

QualityPerEvaluationsAnalyzer

  • I don't understand the second part of the if clause in line 92.
     if (values.Count > 0 && evaluations < values.Last().Item1) evaluations = 1;
    
  • Comments in line 94 and 101 are misleading. The values are not directly replaced, but rather in the next application of the analyzer. Does this code work if two consecutive improvements are achieved? (I think so) The code is IMHO pretty complex.
  • Line 101 checks for inequality between the last item and the best quality.
    var improvement = values.Last().Item2 != bestQuality;
    

What if its not an improvement but a decrease in quality? Should the graph still be updated and is the variable named incorrectly or should the best values be kept? The current implementation depends on the bestQuality which is updated only if an improvement has been achieved.

  • Why is the data row called first-hit graph? Is there a second hit graph as well?
  • What is the difference between newEntry and the newly created Tuple?

The same comments apply to the QualityPerExecutionTimeAnalyzer as well.

IndexedDataTableView

  • Line 160 - 162: wouldn't it be easier to write series.Color = row.VisualProperty.Color; ?
  • A whole lot of code (> 600 lines)
  • IsInvalidValue is never used.

comment:6 Changed 7 years ago by abeham

r14647:

  • fixed RLD analysis for problem instances with best known quality of 0
  • added option to perform comparison on relative targets or absolute targets
  • some performance improvements

comment:7 Changed 7 years ago by abeham

r14652:

  • changed relative/absolute checkbox button into combobox
  • added optional markers
  • added translucency of line in case run-data is more scarce (e.g. because they're already terminated)

comment:8 Changed 7 years ago by abeham

r14654:

  • restructured code a little to be easier to understand
  • removed some redundancies

comment:9 Changed 7 years ago by abeham

r14775: fixed bugs with respect to total number of runs

  • some further restructuring to increase readability
  • fixing numerical instability issues

comment:10 Changed 7 years ago by abeham

r15048: implemented reviewer's comments, details see below

IndexedDataRow

  • ItemDescription "A data row that contains time series." is somehow misleading. Isn't it possible that the data row contains other data than time series? Changed description
  • Forward of NameChange to the VisualProperty.DisplayProperties (compare DataRow line 119-123) forwarded

IndexedDataTable

  • RegisterRowsEvents should not be virtual due to the call in the ctor. IMHO it is better to make it private and not virtual and the registered methods (e.g. rows_ItemsAdded) protected virtual. As a result a potential subclass is automatically registered to the relevant events, but could decide how to react on those by overriding the registered methods. changed visibility and virtual definition, also fixed a potential bug as RegisterRowsEvents did not register individual row events (e.g. when called after copy constructor)

ExpectedRunTimeHelper

  • Why are successful and unsuccessful timestamps < avg - 2 stddev removed for the ert calculation (line 33-34) ?
    • I consider runs to be outliers if they're shorter than the mean runtime minus two standard deviations. I want to exclude them, because result analysis with respect to the shaded area is more clear if longer runs are used.

ErtCalculationResult

  • Fields should be readonly made readonly
  • ToString depends on the SuccessfulRuns. Wouldn't it be better to set the ExpectedRuntime directly to infinity and let .Net handle the string display. did as suggested

QualityPerEvaluationsAnalyzer

  • I don't understand the second part of the if clause in line 92. changed the analyzer to start recording values in the convergence graph only when evaluations is strictly greater than 0
  • Comments in line 94 and 101 are misleading. The values are not directly replaced, but rather in the next application of the analyzer. Does this code work if two consecutive improvements are achieved? (I think so) The code is IMHO pretty complex.
    • Yes the code works in case of consecutive improvements
    • Changed the comments
  • Line 101 checks for inequality between the last item and the best quality.
    var improvement = values.Last().Item2 != bestQuality;
    
    • What if it's not an improvement but a decrease in quality? Should the graph still be updated and is the variable named incorrectly or should the best values be kept? The current implementation depends on the bestQuality which is updated only if an improvement has been achieved.
      • Our algorithms and analyzers use best quality to track the overall progress of the algorithm, if it changes it means that a progress was made
    • Why is the data row called first-hit graph? Is there a second hit graph as well?
      • first-hit means the first time a certain quality is achieved
    • What is the difference between newEntry and the newly created Tuple?
      • the last entry tracks the maximum number of evaluations. I wanted to record that information inside the graph, because I need it in the analysis. Still, I wanted to keep amount of data low and thus the last entry is always overwritten with the current amount of evaluated solutions.

The same comments apply to the QualityPerExecutionTimeAnalyzer as well. adapted the comments as well

IndexedDataTableView

  • Line 160 - 162: wouldn't it be easier to write series.Color = row.VisualProperty.Color; ? done
  • A whole lot of code (> 600 lines)
    • As is the DataTableView. I remember that I tried deriving from DataTableView, but then had problems and reimplemented most functionality from scratch. Speaking of which, the improvements made by Philipp to DataTableView are not present in IndexedDataTableView
  • IsInvalidValue is never used. used it in FillSeriesWithRowValues

comment:11 Changed 7 years ago by abeham

  • Owner changed from abeham to mkommend
  • Status changed from assigned to reviewing

r15125:

  • Improved performance slightly by using binary instead of linear search
  • Fixed target value ECDFs by removing aggregate option (only drawn for each budget separately)
  • Added utility class for calculating ECDFs in HeuristicLab.Common
  • Added option to hide labels
  • Improved add result methods

comment:12 Changed 7 years ago by abeham

r15126: fixed build

comment:13 Changed 7 years ago by gkronber

Please re-include project HeuristicLab.Algorithms.DataAnalysis.Glmnet into the solution after this ticket has been released (see ticket:745#comment:64)

comment:14 Changed 7 years ago by abeham

  • Owner changed from mkommend to abeham
  • Status changed from reviewing to readytorelease

personally discussed, some issues remaining, a bit difficult to use, further improvement in future

Last edited 7 years ago by abeham (previous) (diff)

comment:15 Changed 7 years ago by abeham

  • Resolution set to done
  • Status changed from readytorelease to closed

r15220: merged revisions 14102, 14647, 14652, 14654, 14734, 14737, 14775, 15048, 15125, 15126 to stable, reverted changes from revision 15153

Note: See TracTickets for help on using tickets.