Opened 4 years ago
Closed 4 years ago
#1999 closed feature request (done)
Regression problem instances for testing feature selection
Reported by: | gkronber | Owned by: | gkronber |
---|---|---|---|
Priority: | medium | Milestone: | HeuristicLab 3.3.8 |
Component: | Problems.Instances | Version: | 3.3.8 |
Keywords: | | Cc: | |
Description
Change History (9)
comment:1 Changed 4 years ago by gkronber
comment:2 Changed 4 years ago by gkronber
r9094: formatting
comment:3 Changed 4 years ago by gkronber
- Status changed from new to accepted
comment:4 Changed 4 years ago by gkronber
- Owner changed from gkronber to mkommend
- Status changed from accepted to reviewing
comment:5 follow-up: ↓ 6 Changed 4 years ago by mkommend
- Owner changed from mkommend to gkronber
- Status changed from reviewing to assigned
Review comments:
- Use LINQ syntax in FeatureSelectionInstanceProvider.GetDataDescriptors, as it is IMHO more readable.
- Make the number of training and test samples configurable in the FeatureSelection DataDescriptor.
- The ranges for the input variables and weights should also be configurable.
- Why are the generated input values normally distributed, N(0,1), rather than uniformly distributed?
- The formula for the sigma of the noise RNG (targetSigma * Math.Sqrt(noiseRatio)) evidently works, but I don't understand why. Is it because targetSigma is approximately 1.0?
- The final formula y = f(x,w)+e with the selected variables and the corresponding weights should be displayed somewhere (e.g. problem description, problem data parameter).
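A note on the noise formula questioned in the list above: if noiseRatio is read as the ratio of noise variance to target variance (an assumption, the ticket does not define it), then sigma = targetSigma * sqrt(noiseRatio) yields that variance ratio for any targetSigma, not only when targetSigma is close to 1.0, because squaring sigma gives targetSigma² * noiseRatio. The following is a minimal Python sketch of that reasoning, not the HeuristicLab code; the function names are hypothetical.

```python
import math
import random
import statistics


def noise_sigma(target_sigma, noise_ratio):
    """Sigma for the noise RNG, using the formula quoted in the review."""
    return target_sigma * math.sqrt(noise_ratio)


def measured_variance_ratio(target_sigma, noise_ratio, n=20000, seed=0):
    """Empirical var(noise) / var(target) for a Gaussian stand-in target.

    Assumption: the noise-free target is modelled here as Normal(0, target_sigma);
    in the real instance it would be the weighted sum of selected inputs.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0.0, target_sigma) for _ in range(n)]
    sigma_e = noise_sigma(target_sigma, noise_ratio)
    noise = [rng.gauss(0.0, sigma_e) for _ in range(n)]
    return statistics.pvariance(noise) / statistics.pvariance(target)


if __name__ == "__main__":
    # The ratio stays near noise_ratio regardless of the target's scale.
    for sigma in (1.0, 7.5):
        print(sigma, round(measured_variance_ratio(sigma, 0.1), 3))
```

Under this reading, the measured ratio should stay close to the requested noiseRatio for both scales, which suggests the formula does not depend on targetSigma being approximately 1.0.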
comment:6 in reply to: ↑ 5 Changed 4 years ago by gkronber
- Owner changed from gkronber to mkommend
- Status changed from assigned to reviewing
r9217: improved implementation of feature selection problem instances based on the review comments by mkommend.
- Created a PRNG for uniformly distributed values within a specified range [min..max[
- Created a class FeatureSelectionRegressionProblemData, derived from RegressionProblemData, with additional informative parameters
- fixed typos: shuffeled and varialbe
Replying to mkommend:
Review comments:
- Use LINQ syntax in FeatureSelectionInstanceProvider.GetDataDescriptors, as it is IMHO more readable.
Fixed in r9217.
- Make the number of training and test samples configurable in the FeatureSelection DataDescriptor.
Fixed in r9217 (default values: training: 20% more than the number of features, test: 5000).
- The ranges for the input variables and weights should also be configurable.
Fixed in r9217 by adding IRandom parameters to generate the values. Default values: x: Normal(0,1), weights: Uniform(0,10).
- Why are the generated input values normally distributed, N(0,1), rather than uniformly distributed?
- The formula for the sigma of the noise RNG (targetSigma * Math.Sqrt(noiseRatio)) evidently works, but I don't understand why. Is it because targetSigma is approximately 1.0?
The last two points were discussed in person.
- The final formula y = f(x,w)+e with the selected variables and the corresponding weights should be displayed somewhere (e.g. problem description, problem data parameter).
Added informative parameters for the selected features, weights, and best achievable R² in the ProblemData parameter (introduced a new derived class with the additional parameters).
Please review the changes again.
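For readers without the source at hand, the defaults described in this comment can be illustrated roughly as follows. This is a hedged Python sketch, not the r9217 implementation: the function names, the dictionary layout, and the integer rounding of "20% more than the number of features" are all assumptions, and the noise term e is left out.

```python
import random


def uniform(rng, lo, hi):
    """Draw from the half-open range [lo, hi), up to floating-point rounding."""
    return lo + (hi - lo) * rng.random()


def make_instance(n_features, n_selected, n_test=5000, seed=0):
    """Hypothetical generator for a noise-free feature selection instance."""
    rng = random.Random(seed)
    # Assumed reading of the training default, using integer arithmetic.
    n_train = n_features + n_features // 5
    rows = n_train + n_test
    # Inputs follow Normal(0, 1), the default quoted in the ticket.
    x = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(rows)]
    # Pick the relevant features and draw their weights from Uniform(0, 10).
    selected = sorted(rng.sample(range(n_features), n_selected))
    weights = [uniform(rng, 0.0, 10.0) for _ in selected]
    # Noise-free target: weighted sum of the selected inputs only.
    y = [sum(w * row[j] for w, j in zip(weights, selected)) for row in x]
    # Human-readable formula, in the spirit of the informative parameters.
    formula = "y = " + " + ".join(
        f"{w:.3f} * x{j}" for w, j in zip(weights, selected)
    )
    return {
        "x": x,
        "y": y,
        "n_train": n_train,
        "selected": selected,
        "weights": weights,
        "formula": formula,
    }


if __name__ == "__main__":
    instance = make_instance(n_features=10, n_selected=3, seed=1)
    print(instance["n_train"], len(instance["x"]))
    print(instance["formula"])
```

Returning the selected indices, weights, and formula string alongside the data mirrors the idea of exposing them as informative parameters, so a feature selection result can be compared against the known ground truth.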
comment:7 Changed 4 years ago by gkronber
r9218: fixed build failure
comment:8 Changed 4 years ago by mkommend
- Owner changed from mkommend to gkronber
- Status changed from reviewing to readytorelease
comment:9 Changed 4 years ago by swagner
- Resolution set to done
- Status changed from readytorelease to closed
- Version changed from 3.3.7 to 3.3.8
r9093: added a provider and a configurable problem instance for testing feature selection