Opened 10 months ago

Closed 5 months ago

#2942 closed enhancement (done)

KNN-Regression/Classification should allow "self" points

Reported by: bwerth
Owned by: gkronber
Priority: medium
Milestone: HeuristicLab 3.3.16
Component: Algorithms.DataAnalysis
Version: trunk
Keywords: kNN, DataAnalysis
Cc: gkronber@heuristiclab.com, maffenze@heuristiclab.com

Description

The kNN model's prediction currently excludes data points that have zero distance to the query point. This is counterintuitive and can lead to worse prediction results, especially when the features are de facto ordinal or integer-valued and zero distances are common. Including zero-distance points should at least be optional. This, however, requires changing how weights are assigned to neighboring points: the current weight is 1/distance, which would cause division-by-zero exceptions.
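To illustrate the weighting problem, here is a minimal Python sketch of inverse-distance-weighted kNN regression with an optional self-match switch. It is not the HeuristicLab implementation: the function name `knn_predict`, the `self_match` flag, and the rule for zero distances (average the exact matches among the k nearest, which is the limit of 1/distance weighting as the distance goes to zero) are assumptions chosen for illustration.

```python
import math


def knn_predict(train_x, train_y, query, k, self_match=True):
    """Inverse-distance-weighted kNN regression for a single query point.

    Hypothetical sketch: the zero-distance rule below is one possible
    choice, not necessarily the one used in HeuristicLab.
    """
    # Euclidean distance from the query to every training row.
    dists = [(math.dist(x, query), y) for x, y in zip(train_x, train_y)]
    if not self_match:
        # Behaviour described in the ticket: ignore zero-distance points.
        dists = [(d, y) for d, y in dists if d > 0.0]
    neighbours = sorted(dists, key=lambda pair: pair[0])[:k]
    if not neighbours:
        raise ValueError("no neighbours available")

    # 1/d is undefined at d == 0, so exact matches are handled separately:
    # they dominate the weighted mean, so only they are averaged.
    exact = [y for d, y in neighbours if d == 0.0]
    if exact:
        return sum(exact) / len(exact)

    weights = [1.0 / d for d, _ in neighbours]
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, neighbours))
    return weighted_sum / sum(weights)


train_x = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
train_y = [10.0, 20.0, 40.0]
with_self = knn_predict(train_x, train_y, (0.0, 0.0), k=2, self_match=True)
without_self = knn_predict(train_x, train_y, (0.0, 0.0), k=2, self_match=False)
```

With self matches allowed, the query at the first training point returns that point's target value. With them disallowed, the prediction is the 1/distance-weighted mean of the next two neighbours.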

Change History (12)

comment:1 Changed 10 months ago by bwerth

  • Status changed from new to accepted

comment:2 Changed 7 months ago by gkronber

  • Cc gkronber@heuristiclab.com maffenze@heuristiclab.com added
  • Owner changed from bwerth to msemenki
  • Status changed from accepted to assigned

comment:3 Changed 7 months ago by gkronber

bwerth has already prepared a document describing the necessary steps for implementation and will gladly help.

Last edited 7 months ago by gkronber

comment:4 Changed 6 months ago by msemenki

  • Owner changed from msemenki to gkronber
  • Status changed from assigned to reviewing

r16408: Added the ability for KNN-Regression/Classification to use data points with zero distance to the query point. Changed the way weights are assigned to neighboring points (to avoid division by zero).

comment:5 Changed 6 months ago by gkronber

  • Version set to branch

comment:6 Changed 6 months ago by gkronber

r16488: added SelfMatch parameters in the AfterDeserialization hook (for loading files stored with the old version)

comment:7 Changed 6 months ago by gkronber

r16489: merged changes from trunk to support testing with trunk

comment:8 Changed 6 months ago by gkronber

r16490: added a comment on the handling of weights in the self-matching case

comment:9 Changed 6 months ago by gkronber

  • Status changed from reviewing to readytorelease
  • Version changed from branch to trunk

r16491: merged r16408, r16488, r16490 from branch to trunk (manually)

comment:10 Changed 6 months ago by gkronber

Thank you @msemenki. Good work. I only added the parameters in the AfterDeserialization hooks (see r16488). This is necessary so that we can open and re-run stored kNN experiments.
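The backward-compatibility idea can be sketched in Python, where `__setstate__` plays a role roughly comparable to an after-deserialization hook (pickle calls it when restoring an object). This is an analogy only, not HeuristicLab code: the class `KnnModel`, the attribute `self_match`, and the default of `False` for files written before the parameter existed are all assumptions.

```python
class KnnModel:
    """Hypothetical stand-in for a stored kNN model or algorithm."""

    def __init__(self, k, self_match=False):
        self.k = k
        self.self_match = self_match

    def __setstate__(self, state):
        # Called on load. State written by an older version has no
        # "self_match" entry, so add it with a default (assumed here
        # to be False, i.e. the old behaviour) before restoring.
        state = dict(state)
        state.setdefault("self_match", False)
        self.__dict__.update(state)


# Simulate restoring state that was saved before the parameter existed.
old_state = {"k": 3}
restored = KnnModel.__new__(KnnModel)
restored.__setstate__(old_state)
```

Without the default, code that reads `self_match` on a restored old object would fail with a missing attribute, which is the kind of breakage the hook is meant to prevent when re-running stored experiments.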

comment:11 Changed 6 months ago by gkronber

r16493: deleted branch after changes have been merged to trunk

comment:12 Changed 5 months ago by gkronber

  • Resolution set to done
  • Status changed from readytorelease to closed