Opened 14 years ago
Closed 12 years ago
#1481 closed feature request (done)
Visual view for clustering solutions displaying cluster centers and distributions
Reported by: | gkronber | Owned by: | abeham |
---|---|---|---|
Priority: | medium | Milestone: | HeuristicLab 3.3.8 |
Component: | Problems.DataAnalysis.Views | Version: | 3.3.8 |
Keywords: | | Cc: | |
Description
This is essentially a scatter plot with different marker colors (or types) for different clusters. Several dimensionality reduction methods could be used to display the clusters in two dimensions: principal component analysis is already implemented in Alglib, and multidimensional scaling is already implemented by abeham in HeuristicLab.Analysis.
Change History (6)
comment:1 Changed 14 years ago by gkronber
- Component changed from Problems.DataAnalysis to Problems.DataAnalysis.Views
comment:2 Changed 12 years ago by abeham
- Milestone changed from HeuristicLab 3.3.x Backlog to HeuristicLab 3.3.8
- Owner changed from gkronber to abeham
- Status changed from new to accepted
- Version changed from branch to 3.3.7
comment:3 Changed 12 years ago by abeham
- Owner changed from abeham to gkronber
- Status changed from accepted to reviewing
comment:4 Changed 12 years ago by abeham
r8453: Fixed plugin dependencies
comment:5 Changed 12 years ago by mkommend
- Owner changed from gkronber to abeham
- Status changed from reviewing to readytorelease
The implementation looks fine to me and the views work as expected.
comment:6 Changed 12 years ago by swagner
- Resolution set to done
- Status changed from readytorelease to closed
- Version changed from 3.3.7 to 3.3.8
r8435: Added clustering solution evaluation view that performs PCA on the data to reduce it to two dimensions
Note: I dropped the IDimensionReductionModel and IDimensionReductionSolution again, since a simple solution evaluation view was all that was needed here. Btw, I didn't use MDS here since that would require calculating a dissimilarity matrix between all instances. I think this is impractical when several thousand instances have to be clustered.
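To give a rough sense of the scale involved, here is a small Python sketch that counts the pairwise dissimilarities MDS would need and compares a dense dissimilarity matrix with the covariance matrix that PCA works on. The function names, the 8-byte value size, and the example figures (5000 instances, 10 features) are assumptions for illustration only; they are not taken from the HeuristicLab code.

```python
def dissimilarity_pairs(n_instances):
    """Number of distinct instance pairs (upper triangle without the diagonal)."""
    return n_instances * (n_instances - 1) // 2


def dense_matrix_bytes(size, bytes_per_value=8):
    """Memory for a dense size x size matrix, assuming 8-byte floating point values."""
    return size * size * bytes_per_value


# Hypothetical data set: 5000 instances with 10 features each.
n_instances = 5000
n_features = 10

mds_pairs = dissimilarity_pairs(n_instances)      # distances MDS has to compute
mds_bytes = dense_matrix_bytes(n_instances)       # n x n dissimilarity matrix
pca_bytes = dense_matrix_bytes(n_features)        # d x d covariance matrix

print(f"MDS: {mds_pairs} pairwise distances, {mds_bytes} bytes if stored densely")
print(f"PCA: {pca_bytes} bytes for the covariance matrix")
```

The dissimilarity matrix grows quadratically with the number of instances, while the covariance matrix used by PCA depends only on the number of features.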
I think the downside of PCA is that it doesn't tell you anything about how good your clusters are. It's just a projection onto a 2D surface. Some users might misinterpret the chart and think the clustering didn't work when it's actually just PCA failing to provide a good projection.
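As an illustration of the idea (not the code committed in r8435, which uses Alglib from C#), here is a minimal pure-Python sketch of projecting data onto its first two principal components. It uses power iteration with deflation, which is a simplification chosen to keep the example dependency-free. All names are hypothetical. The sketch also reports the fraction of total variance retained by the two components; this can hint at how much the projection discards, but it says nothing about the quality of the clusters themselves.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenpair(matrix, iterations=500, seed=0):
    """Dominant eigenvalue and eigenvector via power iteration."""
    rng = random.Random(seed)
    d = len(matrix)
    v = [rng.random() + 0.1 for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # matrix maps v to zero; keep the current unit vector
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return eigenvalue, v


def pca_2d(rows):
    """Project rows onto the first two principal components.

    Returns the 2D points and the fraction of total variance retained.
    """
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    total_variance = sum(cov[i][i] for i in range(d))
    components, eigenvalues = [], []
    for _ in range(2):
        lam, v = top_eigenpair(cov)
        eigenvalues.append(lam)
        components.append(v)
        # Deflation: remove the component just found from the matrix.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    points = [tuple(sum(r[j] * c[j] for j in range(d)) for c in components)
              for r in centered]
    retained = sum(eigenvalues) / total_variance if total_variance > 0 else 0.0
    return points, retained


# Hypothetical data: two clusters in 3D that lie on the plane z = x + y.
data = [[0, 0, 0], [1, 0, 1], [0, 1, 1],
        [10, 0, 10], [11, 0, 11], [10, 1, 11]]
labels = [0, 0, 0, 1, 1, 1]  # cluster assignment, used only for display

points, retained = pca_2d(data)
for (x, y), label in zip(points, labels):
    print(f"cluster {label}: ({x:.3f}, {y:.3f})")
print(f"fraction of variance retained: {retained:.4f}")
```

In a view like the one described here, each projected point would be drawn with a marker color chosen by its cluster label. Showing the retained-variance figure next to the chart is one possible way to warn users when the 2D projection discards much of the data's structure.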