Opened 13 months ago

Closed 7 months ago

#3030 closed defect (done)

RF/GBT SurrogateModel triggers unnecessary recalculation

Reported by: pfleck Owned by: mkommend
Priority: medium Milestone: HeuristicLab 3.3.17
Component: Algorithms.DataAnalysis Version: trunk
Keywords: Cc:

Description (last modified by pfleck)

When the surrogate-storage mechanism is used for RFs and GBTs, the full model is still created after a run and passed to the surrogate model for immediate use. It currently is stored as a Lazy object: actualModel = new Lazy<IRandomForestModel>(() => model);

When a surrogate model is cloned, it checks whether the Lazy object is initialized and clones the full model if it is available:

IRandomForestModel clonedModel = null;
if (original.actualModel.IsValueCreated)
  clonedModel = cloner.Clone(original.ActualModel);
actualModel = new Lazy<IRandomForestModel>(() => clonedModel ?? RecalculateModel());

In case the cloning constructor is called immediately after creation of the surrogate Model, IsValueCreated is actually returning false since the Lazy object itself has not yet accessed its Value property. Thus the full model gets lost, and an unnecessary recalculation of the model is required later.

Change History (9)

comment:1 Changed 13 months ago by pfleck

  • Description modified (diff)

comment:2 Changed 13 months ago by pfleck

  • Description modified (diff)

comment:3 Changed 13 months ago by mkommend

  • Milestone set to HeuristicLab 3.3.17
  • Owner set to mkommend
  • Status changed from new to accepted
  • Version set to trunk

comment:4 Changed 13 months ago by mkommend

  • Owner changed from mkommend to pfleck
  • Status changed from accepted to reviewing

r17272: Fixed cloning of RF & GBT surrogate models with already created models.

comment:5 Changed 13 months ago by pfleck

Works as expected.

Code:

  • Alternatively to the Lazy<T>, we could also have a look at System.Threading.LazyInitializer. This could help avoiding to store the full model twice (in the Lazy<T> and the fullModel field). This way we could also get rid of the CreateLazyInitFunc, which I don't really like.
  • RandomForestModelSurrogate.cs
    • actualModel = CreateLazyInitFunc(); on line 86 is not necessary since it is already initialized in the this-constructor call.
  • GradientBoostedTreesModelSurrogate.cs
    • actualModel = CreateLazyInitFunc(); on line 120 is not necessary since it is already initialized in the this-constructor call.

comment:6 Changed 13 months ago by pfleck

  • Owner changed from pfleck to mkommend

r17278 Removed duplicated LazyInitFunc initialization.

comment:7 Changed 8 months ago by mkommend

  • Status changed from reviewing to readytorelease

comment:8 Changed 7 months ago by mkommend

r17494: Merged r17272 and r17278 into stable.

comment:9 Changed 7 months ago by mkommend

  • Resolution set to done
  • Status changed from readytorelease to closed
Note: See TracTickets for help on using tickets.