Opened 2 months ago

Last modified 2 months ago

#2955 reviewing enhancement

Improve evaluating models on new data

Reported by: mkommend
Owned by: gkronber
Priority: high
Milestone: HeuristicLab 3.3.16
Component: Problems.DataAnalysis
Version: trunk
Keywords:
Cc:

Description

Currently, a model can only be evaluated on new data if the new problem data exactly matches the original problem data with respect to variable names. Even unused variables that do not occur in the model (either because of feature selection or because they were disabled in the problem data) are compared and must be spelled identically. This is a relic of how the source code evolved: previously there was no way to determine which variables (input and target) a model uses. Now that variable information is included in the models (#2604), it is feasible to refactor this functionality and check only the variables that are actually used when applying a model to new data.
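
The intended check can be illustrated with a minimal sketch. It is written in Python for brevity and is not HeuristicLab code: the `Model` class, its fields, and both function signatures are hypothetical, and the split between an input-only dataset check and a problem-data check that also requires the target is an assumption, not something this ticket specifies.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Model:
    """Hypothetical stand-in for a model that knows which variables it uses."""
    target_variable: str
    variables_used: list[str]


def is_dataset_compatible(model: Model, columns: list[str]) -> tuple[bool, str]:
    """Check only the input variables the model actually uses.

    Columns that the model never references are ignored, so they may be
    renamed, added, or removed freely.
    """
    available = set(columns)
    missing = [name for name in model.variables_used if name not in available]
    if missing:
        return False, "missing input variables: " + ", ".join(missing)
    return True, ""


def is_problem_data_compatible(model: Model, columns: list[str]) -> tuple[bool, str]:
    """Assumed stricter check: inputs plus the target needed for evaluation."""
    ok, message = is_dataset_compatible(model, columns)
    if not ok:
        return ok, message
    if model.target_variable not in columns:
        return False, "missing target variable: " + model.target_variable
    return True, ""


# The model uses x1 and x3 only; x2 was dropped by feature selection.
model = Model(target_variable="y", variables_used=["x1", "x3"])

# The new data renames the unused column x2, which no longer matters.
new_columns = ["x1", "x2_renamed", "x3", "y"]
compatible, reason = is_problem_data_compatible(model, new_columns)
```

Returning a message alongside the boolean is a design choice of this sketch: it lets the caller report which variable is missing instead of a bare failure.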

Change History (5)

comment:1 Changed 2 months ago by mkommend

  • Status changed from new to accepted

comment:2 Changed 2 months ago by mkommend

r16241: Added utility method that checks if a variable is present in the dataset.

comment:3 Changed 2 months ago by mkommend

r16243: Added IsProblemDataCompatible and IsDatasetCompatible to all DataAnalysisModels.

comment:4 Changed 2 months ago by mkommend

r16244: Used IsProblemDataCompatible and IsDatasetCompatible instead of the now-obsolete AdjustProblemDataProperties when exchanging the problem data of data analysis solutions.

Last edited 2 months ago by mkommend

comment:5 Changed 2 months ago by mkommend

  • Owner changed from mkommend to gkronber
  • Status changed from accepted to reviewing