Opened 8 years ago

Closed 8 years ago

#2612 closed feature request (done)

Regression tree models should support evaluation even when some of the variables are missing or contain missing values

Reported by: gkronber Owned by: gkronber
Priority: medium Milestone: HeuristicLab 3.3.14
Component: Algorithms.DataAnalysis Version: 3.3.13
Keywords: Cc:

Description (last modified by gkronber)

As described in the "Greedy Function Approximation" paper.

Change History (12)

comment:1 Changed 8 years ago by gkronber

  • Owner set to gkronber
  • Status changed from new to accepted

comment:2 Changed 8 years ago by gkronber

  • Owner changed from gkronber to pfleck
  • Status changed from accepted to assigned

The regression tree models can easily be extended to return a weighted average of the estimated values when the value of a variable is not available in the dataset. The same mechanism can also be used for partial dependence plots: the dataset then contains only the variable for which the partial dependence should be calculated.
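The weighted-average evaluation described above can be sketched as follows. This is a minimal illustration in Python, not the HeuristicLab code: the `Leaf` and `Split` classes, the `left_weight` field (assumed to hold the fraction of training rows that went to the left child) and the `predict` function are all hypothetical names chosen for this example.

```python
import math
from dataclasses import dataclass
from typing import Mapping, Union


@dataclass
class Leaf:
    value: float  # estimated target value of this leaf


@dataclass
class Split:
    variable: str       # name of the variable tested at this node
    threshold: float    # rows with value <= threshold go left
    left: "Node"
    right: "Node"
    left_weight: float  # ASSUMPTION: fraction of training rows that went left


Node = Union[Leaf, Split]


def predict(node: Node, row: Mapping[str, float]) -> float:
    """Evaluate the tree; average over both children when a value is unavailable."""
    if isinstance(node, Leaf):
        return node.value
    x = row.get(node.variable)
    if x is None or math.isnan(x):
        # Variable absent (or NaN): follow both branches and weight the
        # results by the share of training rows that took each branch.
        w = node.left_weight
        return w * predict(node.left, row) + (1.0 - w) * predict(node.right, row)
    child = node.left if x <= node.threshold else node.right
    return predict(child, row)


# Hypothetical two-level tree used only for illustration.
tree = Split(
    "x1", 2.0,
    left=Leaf(1.0),
    right=Split("x2", 0.5, left=Leaf(3.0), right=Leaf(5.0), left_weight=0.25),
    left_weight=0.6,
)

full = predict(tree, {"x1": 1.0, "x2": 0.0})  # all variables present
partial = predict(tree, {"x1": 3.0})          # x2 missing: 0.25 * 3 + 0.75 * 5
```

For a partial dependence curve under the same assumptions, `predict` would be called repeatedly with a row that contains only the variable of interest, varying its value over the desired range.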

r13895:

  • extended GBT to support calculation of partial dependence
  • changed persistence of regression tree models
  • added two unit tests.

comment:3 Changed 8 years ago by gkronber

  • Milestone changed from HeuristicLab 3.3.14 to HeuristicLab 3.3.15

In r13895 the persistence format for gradient boosted trees has been improved, and handling of missing values for evaluation of GBT models has been added.

The related ticket #2622 is concerned with adding correct handling of missing values to the training phase.

A view for plotting the partial dependence has not yet been added. Therefore, I'm moving this ticket to the next milestone.

comment:4 Changed 8 years ago by mkommend

  • Milestone changed from HeuristicLab 3.3.15 to HeuristicLab 3.3.14
  • Owner changed from pfleck to gkronber

This ticket blocks the release of several others, e.g. #2604, #2541, or #1795, because they depend on the changes to gradient boosted trees in r13895.

Last edited 8 years ago by gkronber (previous) (diff)

comment:5 Changed 8 years ago by mkommend

Reviewed r13895. The only changes in r13895 are the handling of missing values in the tree models and the change of the persistence format.

Last edited 8 years ago by gkronber (previous) (diff)

comment:6 Changed 8 years ago by gkronber

r14015: added NaN handling for the evaluation of regression tree models (GBT)
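One reason explicit NaN handling is needed during tree evaluation: under IEEE 754 every ordered comparison with NaN is false, so a plain threshold test silently routes NaN to one fixed branch. The short Python sketch below illustrates this. The function names are hypothetical, and whether r14015 follows this exact approach is not stated in the ticket.

```python
import math


def goes_left_naive(x: float, threshold: float) -> bool:
    # Plain threshold test: for NaN this is always False,
    # so a NaN value would always end up in the right branch.
    return x <= threshold


def route(x: float, threshold: float) -> str:
    # Explicit check: treat NaN as "value not available" so the caller
    # can combine both branches instead of silently picking one.
    if math.isnan(x):
        return "both"
    return "left" if x <= threshold else "right"


nan = float("nan")
naive_result = goes_left_naive(nan, 1.0)
explicit_result = route(nan, 1.0)
```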

comment:7 Changed 8 years ago by gkronber

  • Description modified (diff)
  • Summary changed from Partial dependence plots for gradient boosted trees to Regression tree models should support evaluation even when some of the variables are missing or contain missing values

comment:8 Changed 8 years ago by gkronber

  • Status changed from assigned to reviewing

comment:9 Changed 8 years ago by gkronber

  • Status changed from reviewing to readytorelease

comment:10 Changed 8 years ago by gkronber

r14016: reverse merge of r14015

comment:11 Changed 8 years ago by gkronber

r14017: added NaN handling for the evaluation of regression tree models (GBT) (again, see r14015).

comment:12 Changed 8 years ago by mkommend

  • Resolution set to done
  • Status changed from readytorelease to closed

r14023: Merged r13895 and r14017 into stable. r14015 and r14016 have been recorded in the merge info, but have not actually been merged.