Opened 3 weeks ago

Last modified 2 weeks ago

#3030 reviewing defect

RF/GBT SurrogateModel triggers unnecessary recalculation

Reported by: pfleck Owned by: mkommend
Priority: medium Milestone: HeuristicLab 3.3.17
Component: Algorithms.DataAnalysis Version: trunk
Keywords: Cc:

Description (last modified by pfleck)

When the surrogate-storage mechanism is used for RFs and GBTs, the full model is still created after a run and passed to the surrogate model for immediate use. It currently is stored as a Lazy object: actualModel = new Lazy<IRandomForestModel>(() => model);

When a surrogate model is cloned, it checks whether the Lazy object is initialized and clones the full model if it is available:

IRandomForestModel clonedModel = null;
if (original.actualModel.IsValueCreated)
  clonedModel = cloner.Clone(original.ActualModel);
actualModel = new Lazy<IRandomForestModel>(() => clonedModel ?? RecalculateModel());

In case the cloning constructor is called immediately after creation of the surrogate Model, IsValueCreated is actually returning false since the Lazy object itself has not yet accessed its Value property. Thus the full model gets lost, and an unnecessary recalculation of the model is required later.

Change History (6)

comment:1 Changed 3 weeks ago by pfleck

  • Description modified (diff)

comment:2 Changed 3 weeks ago by pfleck

  • Description modified (diff)

comment:3 Changed 3 weeks ago by mkommend

  • Milestone set to HeuristicLab 3.3.17
  • Owner set to mkommend
  • Status changed from new to accepted
  • Version set to trunk

comment:4 Changed 3 weeks ago by mkommend

  • Owner changed from mkommend to pfleck
  • Status changed from accepted to reviewing

r17272: Fixed cloning of RF & GBT surrogate models with already created models.

comment:5 Changed 2 weeks ago by pfleck

Works as expected.

Code:

  • Alternatively to the Lazy<T>, we could also have a look at System.Threading.LazyInitializer. This could help avoiding to store the full model twice (in the Lazy<T> and the fullModel field). This way we could also get rid of the CreateLazyInitFunc, which I don't really like.
  • RandomForestModelSurrogate.cs
    • actualModel = CreateLazyInitFunc(); on line 86 is not necessary since it is already initialized in the this-constructor call.
  • GradientBoostedTreesModelSurrogate.cs
    • actualModel = CreateLazyInitFunc(); on line 120 is not necessary since it is already initialized in the this-constructor call.

comment:6 Changed 2 weeks ago by pfleck

  • Owner changed from pfleck to mkommend

r17278 Removed duplicated LazyInitFunc initialization.

Note: See TracTickets for help on using tickets.