Opened 8 weeks ago

Last modified 7 days ago

#2760 assigned feature request

Shuffle samples in the cross-validation wrapper for data analysis algorithms

Reported by: bburlacu Owned by: bburlacu
Priority: medium Milestone: HeuristicLab 3.3.15
Component: Algorithms.DataAnalysis Version: 3.3.14
Keywords: Cc:

Description

The cross-validation wrapper should offer an option to shuffle the data samples.
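To illustrate the request, here is a minimal Python sketch of k-fold partitioning with an optional shuffle. It is not HeuristicLab code: the function name `k_fold_indices` and its parameters `shuffle` and `seed` are hypothetical, chosen only to show the idea.

```python
import random
from typing import List, Optional, Tuple


def k_fold_indices(n_samples: int, folds: int, shuffle: bool = False,
                   seed: Optional[int] = None) -> List[Tuple[List[int], List[int]]]:
    """Return one (training_rows, test_rows) pair per fold.

    Hypothetical helper for illustration; not part of HeuristicLab.
    """
    if folds < 2 or folds > n_samples:
        raise ValueError("folds must be between 2 and n_samples")

    rows = list(range(n_samples))
    if shuffle:
        # A seeded generator makes the permutation reproducible.
        random.Random(seed).shuffle(rows)

    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, folds)
    partitions = []
    start = 0
    for fold in range(folds):
        size = base + (1 if fold < extra else 0)
        test_rows = rows[start:start + size]
        training_rows = rows[:start] + rows[start + size:]
        partitions.append((training_rows, test_rows))
        start += size
    return partitions
```

Without shuffling, each test partition is a contiguous block of rows. With shuffling, the test rows are scattered over the dataset, which matters when the data are sorted by target value or by time of collection.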

Change History (7)

comment:1 Changed 6 weeks ago by bburlacu

  • Owner set to bburlacu
  • Status changed from new to accepted

r14864: Implement shuffling of crossvalidation samples.

comment:2 Changed 6 weeks ago by bburlacu

  • Owner changed from bburlacu to mkommend
  • Status changed from accepted to reviewing

comment:3 Changed 6 weeks ago by bburlacu

r14865: Fix issue with resources in CrossValidationView.Designer.cs

comment:4 Changed 5 weeks ago by gkronber

It seems that the ensemble does not correctly store whether a point was used for training or test. To reproduce:

  1. Run cross-validation with shuffling enabled and deliberately produce an overfit model.
  2. Check the line chart.
  3. Expected result: errors for training predictions (yellow) are very small; errors for test predictions (red) are significantly higher.
  4. Actual result: some errors for training predictions are also high, and some errors for test points are suspiciously small.

comment:5 Changed 3 weeks ago by bburlacu

r14904: Reuse the shuffled data when creating the solution ensemble.

comment:6 Changed 3 weeks ago by gkronber

These changes overlap with the changes in r14781 (#2756); the two must be merged together.

comment:7 Changed 7 days ago by mkommend

  • Owner changed from mkommend to bburlacu
  • Status changed from reviewing to assigned

Review comments:

  • Backwards compatibility is not ensured
  • Shuffling can be changed during execution, yielding inconsistent results
  • A clone shows the wrong value for shuffle samples in the view
  • Shuffled problemData is neither cloned nor serialized

Why do we need the shuffledProblemData at all?
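As a hedged illustration of the review points, the Python sketch below stores only a shuffle flag and a seed, and recomputes the row order on demand instead of keeping a shuffled copy of the data. The class `ShuffleSettings` and its fields are hypothetical and do not reflect HeuristicLab's cloning or persistence mechanisms.

```python
import copy
import json
import random
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class ShuffleSettings:
    """Immutable settings (hypothetical); frozen so a run cannot change them midway."""
    shuffle: bool
    seed: int

    def row_order(self, n_samples: int) -> List[int]:
        # Recompute the permutation from the seed instead of storing shuffled data.
        rows = list(range(n_samples))
        if self.shuffle:
            random.Random(self.seed).shuffle(rows)
        return rows


settings = ShuffleSettings(shuffle=True, seed=1234)

# A clone carries the same flag and seed, so it yields the same row order.
clone = copy.deepcopy(settings)

# A round trip through JSON stands in for serialization.
restored = ShuffleSettings(**json.loads(json.dumps(asdict(settings))))
```

Under these assumptions, the only state that has to be cloned and serialized is the flag and the seed, and assigning to `settings.shuffle` after construction raises `dataclasses.FrozenInstanceError`.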

Last edited 7 days ago by mkommend