Opened 8 years ago

Last modified 8 years ago

#2612 closed feature request

Regression tree models should support evaluation even when some of the variables are missing or contain missing values — at Version 7

Reported by: gkronber
Owned by: gkronber
Priority: medium
Milestone: HeuristicLab 3.3.14
Component: Algorithms.DataAnalysis
Version: 3.3.13
Keywords:
Cc:

Description (last modified by gkronber)

As described in the "Greedy Function Approximation" paper.

Change History (7)

comment:1 Changed 8 years ago by gkronber

  • Owner set to gkronber
  • Status changed from new to accepted

comment:2 Changed 8 years ago by gkronber

  • Owner changed from gkronber to pfleck
  • Status changed from accepted to assigned

The regression tree models can easily be extended to calculate a weighted average of the estimated values when the value of a given variable is not available in the dataset. This can also be used for partial dependence plots: add only the variable for which the partial dependence should be calculated to the dataset.
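A minimal sketch of the weighted-average evaluation, in Python for illustration only. It is not the HeuristicLab implementation: `Node`, `weight_left`, `evaluate`, and the example `tree` are hypothetical names, and the sketch assumes each split stores the fraction of training rows that went to its left branch. When the split variable is absent from the row or is NaN, both branches are evaluated and combined with those weights.

```python
import math
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Node:
    """Hypothetical tree node; a node with var=None is a leaf."""
    var: Optional[str] = None       # name of the split variable
    threshold: float = 0.0          # rows with value <= threshold go left
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    weight_left: float = 0.5        # assumed: fraction of training rows sent left
    value: float = 0.0              # prediction, used only for leaves


def evaluate(node: Node, row: Mapping[str, float]) -> float:
    """Evaluate one row; missing or NaN variables yield a weighted average."""
    if node.var is None:
        return node.value
    x = row.get(node.var, math.nan)
    if math.isnan(x):
        # Variable unavailable: follow both branches and weight the results.
        return (node.weight_left * evaluate(node.left, row)
                + (1.0 - node.weight_left) * evaluate(node.right, row))
    child = node.left if x <= node.threshold else node.right
    return evaluate(child, row)


# Small example tree: split on x1, then on x2 in the right branch.
tree = Node(
    var="x1", threshold=2.0, weight_left=0.25,
    left=Node(value=10.0),
    right=Node(var="x2", threshold=5.0, weight_left=0.5,
               left=Node(value=20.0), right=Node(value=40.0)),
)

# Partial dependence on x2: supply only x2, so x1 is averaged out.
partial_dependence_x2 = [evaluate(tree, {"x2": v}) for v in (3.0, 7.0)]
```

With only `x2` supplied, the split on `x1` is averaged over both branches, which is the partial dependence use case described above. A gradient boosted ensemble would apply the same traversal to every tree and add up the results.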

r13895:

  • extended GBT to support calculation of partial dependence
  • changed persistence of regression tree models
  • added two unit tests.

comment:3 Changed 8 years ago by gkronber

  • Milestone changed from HeuristicLab 3.3.14 to HeuristicLab 3.3.15

In r13895 the persistence format for gradient boosted trees has been improved, and handling of missing values for evaluation of GBT models has been added.

The related ticket #2622 is concerned with adding correct handling of missing values to the training phase.

A view for plotting the partial dependence has not yet been added. Therefore, I'm moving this ticket to the next milestone.

comment:4 Changed 8 years ago by mkommend

  • Milestone changed from HeuristicLab 3.3.15 to HeuristicLab 3.3.14
  • Owner changed from pfleck to gkronber

This ticket blocks the release of several others, e.g. #2604, #2541, and #1795, because they depend on the changes to gradient boosted trees in r13895.

Last edited 8 years ago by gkronber (previous) (diff)

comment:5 Changed 8 years ago by mkommend

Reviewed r13895. The only changes in r13895 are the handling of missing values in the tree models and the change of the persistence format.

Last edited 8 years ago by gkronber (previous) (diff)

comment:6 Changed 8 years ago by gkronber

r14015: added NaN handling for the evaluation of regression tree models (GBT)

comment:7 Changed 8 years ago by gkronber

  • Description modified (diff)
  • Summary changed from "Partial dependence plots for gradient boosted trees" to "Regression tree models should support evaluation even when some of the variables are missing or contain missing values"