Opened 16 months ago

Closed 5 months ago

#2904 closed enhancement (done)

Create new CalculateImpacts method for a set of variables

Reported by: fholzing Owned by: mkommend
Priority: medium Milestone: HeuristicLab 3.3.16
Component: Algorithms.DataAnalysis Version: trunk
Keywords: Cc:

Description

Currently the calculation of the VariableImpacts isn't possible for a single variable or a chosen set of variables. Two new methods like the current RegressionSolutionVariableImpactsCalculator.CalculateImpacts and ClassificationSolutionVariableImpactsCalculator.CalculateImpacts should be created which take an additional set of variables as an input.

Change History (59)

comment:1 Changed 16 months ago by fholzing

  • Status changed from new to accepted

comment:2 Changed 16 months ago by fholzing

  • Version changed from trunk to branch

comment:3 Changed 16 months ago by fholzing

r15804: Created branch

comment:4 Changed 16 months ago by fholzing

r15805: Made the branch compile

comment:5 Changed 16 months ago by fholzing

r15807: Deleted project (wrong one)

comment:6 Changed 16 months ago by fholzing

r15808: Created branch (this time with the correct project)

comment:7 Changed 16 months ago by fholzing

r15809: Made the branch compile

comment:8 Changed 16 months ago by fholzing

r15815: Splitted the Method CalculateImpact into smaller parts

comment:9 Changed 16 months ago by fholzing

r15816: Cleaned up the method signature (some optional parameters aren't so optional)

comment:10 Changed 16 months ago by fholzing

r15831: Refactored RegressionSolutionVariableImpactsCalculator. We don't depend on the solution anymore. The impact can be calculated for a single variable. The calculator can be chosen.

Last edited 11 months ago by fholzing (previous) (diff)

comment:11 Changed 16 months ago by fholzing

Important Points:
I guess the same refactoring should be also applied to Classifications.
There may be a better alternative than the additional method in the Interface IOnlineCalculator.
Some OnlineCalculators aren't in the branch, so attention is required as we merge the branch.
Additional UnitTest are preferred.
The UI doesn't support the new functionality (Maybe in an additional ticket later on).

comment:12 Changed 16 months ago by fholzing

  • Owner changed from fholzing to mkommend
  • Status changed from accepted to reviewing

comment:13 Changed 11 months ago by fholzing

r16001: Reverted the additional CalculateValue-Method in IOnlineCalculator and created a static property/method in RegressionSolutionVariableImpactsCalculator

comment:14 Changed 11 months ago by fholzing

r16002: Fixed static Property

comment:15 Changed 11 months ago by mkommend

  • Owner changed from mkommend to fholzing
  • Status changed from reviewing to assigned

Review comments

  • Adhere to the coding styles and fix the parentheses in the impact calculator
  • Factor replacement method cannot be set in a property
  • Program as threadsafe as possible -> No static properties that can be modified (Calculator)
  • Line 163 use Count() directly instead of .Where().Count()
  • Use self-explaining parameter names and not IDataset problemData
  • EvaluateModelWithReplacedVariable should be replace to GetReplacedValues or something similar that describes its purpose and not call EvaluateModelWithReplacedVariable .
  • EvaluateModelWithReplacedVariable should have a better ordering of its parameters and not start with the original vlaues
  • Calculate value do not use .Count before zip, because this evaluates the enumerable multiple times
  • Line 199: avoid creation of modifiable datasets on the original dataset directly
  • Use consistent values for default parameters, e.g. the defined enums

comment:16 Changed 11 months ago by fholzing

I imported and adhered to the given VisualStudio Settings from https://dev.heuristiclab.com/trac.fcgi/attachment/wiki/Documentation/DevelopmentCenter/DeveloperGuidelines/Visual%20Studio%202010%20text%20editor%20settings.vssettings (using the AutoFormat-Shortcut in VisualStudio). Either the Settings are depricated, in which case we should update them, or they don't include our preferred formatting whereby the formatting should be included i guess.

comment:17 Changed 11 months ago by fholzing

r16016: Removed unnecessary where-condition (.Where(...).Count()); Cloned the dataset before the .ToModifiable() call; Adapted the CalculateValue-Method for a single loop through the IEnumerables for the target and estimated values

comment:18 Changed 11 months ago by fholzing

I just had a closer look at the missing Property/Parameter for the factor replacement method.

At this moment, DataPartition and ReplacementMethod are defined as IFixedValueParameter but are never set this way. They are only used in the Calculate-Method, which is currently not in use.

I would like to suggest three options to think about (each with it's own (dis)advantages:

-Just use them as IFixedValueParameter
-Just use them as method-parameters
-A combination of both with the Parameter as Default, which can be overwritten (via Default-Parameter)

comment:19 Changed 11 months ago by fholzing

r16017: Added IFixedValueParameter for factory variable replacement. Unified the default values.

comment:20 Changed 11 months ago by fholzing

r16018: Unified order of IFixedValueParameter. Use FactorReplacementMethod as default for .Calculate method

comment:21 Changed 11 months ago by fholzing

r16020: Removed static calculator-variable, Changed default ReplacementMethod from Median to Shuffle, Adapted Calculation-Method adhering to the OnlineCalculators, Re-Added the condition for counting the input-parameters

Last edited 11 months ago by fholzing (previous) (diff)

comment:22 Changed 11 months ago by fholzing

r16027: Added folder for View-branch
r16028: Removed folder for View-branch
r16029: Branched the View
r16030: Made the branch compile

comment:23 Changed 11 months ago by fholzing

r16031: Changed formatting (adhering to the HL-standard) and renamed variables/methods for better comprehensibility

comment:24 Changed 11 months ago by fholzing

r16034: Removed callback, adapted both view and calculator.

comment:25 Changed 11 months ago by fholzing

r16035: Better method-ordering, variable-naming and cleaned up some code not necessary anymore.

comment:26 Changed 11 months ago by fholzing

r16036: Streamlined the variableimpactcalculator code on both Regression and Classification. Taken over the regression-code for classification with some minor adaptations.

comment:27 Changed 11 months ago by fholzing

r16037: Restyled ClassificationSolutionVariableImpactsCalculator to be nearly identical to it's Regression-pendant

comment:28 Changed 11 months ago by fholzing

r16038: Branched UnitTest-Project
r16039: Added Test-Proj to Solution

comment:29 Changed 11 months ago by fholzing

r16041: Removed ElementAt

comment:30 Changed 11 months ago by fholzing

r16042: Renamed variableImactsArrayView to variableImpactsArrayView

comment:31 Changed 11 months ago by fholzing

r16051: Refactored Methods for better readability. Fixed styling mistakes.

comment:32 Changed 11 months ago by fholzing

r16052: Fixed wrong variable usage

comment:33 Changed 11 months ago by fholzing

r16055: Also applied changes from regression to classification

comment:34 Changed 11 months ago by fholzing

r16058: Added Unit-Tests for Regression

comment:35 Changed 11 months ago by fholzing

r16061: Added Unit-Tests for Classification/Performance and renamed VariableImpactCalculationTest.cs to RegressionVariableImpactCalculationTest.cs

comment:36 Changed 11 months ago by fholzing

r16065: Added more Unit-Tests for Classification

comment:37 Changed 11 months ago by fholzing

r16067: Added additional Unit-Test for performance measurement

comment:38 Changed 11 months ago by fholzing

  • Owner changed from fholzing to mkommend
  • Status changed from assigned to reviewing

comment:39 Changed 10 months ago by fholzing

r16127: Shifted ascendingCheckBox to the left

comment:40 Changed 10 months ago by fholzing

r16128: Fixed Types (String for FactorVariables)

comment:41 Changed 9 months ago by fholzing

r16181: Fixed Types (String for FactorVariables) for Classification

comment:42 Changed 9 months ago by fholzing

r16188: Merged changes from trunk

comment:43 Changed 9 months ago by fholzing

r16189: Overwrote trunk-changes with branch-changes

comment:44 Changed 9 months ago by fholzing

r16190: Added changes from #2910 to ClassificationSolutionVariableImpactsCalculator

comment:45 Changed 6 months ago by mkommend

r16397: Updated branch with trunk changes.

comment:46 Changed 6 months ago by fholzing

r16401: Updated branch with trunk changes.

comment:47 Changed 6 months ago by fholzing

r16402: Updated branch with trunk changes (HeuristicLab.Tests).

comment:48 Changed 6 months ago by fholzing

r16410: Updated branch with trunk changes (#2966).

comment:49 Changed 6 months ago by fholzing

r16416: cleared variableImpacts on OnContentChanged(); Used existing epsilon (almost) function for comparison of estimated impacts; removed direct usage of alglib; Added a LDA- and RF-Unittest for Impact-Calculation; Changed the approach of loading ProblemData (no more strings); Renamed PerformanceTests for Regression/Classification so it can be found easier

comment:50 Changed 6 months ago by fholzing

r16418: fixed bug (duplicated entries in VariableImpacts)

comment:51 Changed 6 months ago by fholzing

r16420: fixed bug (duplicated entries in VariableImpacts)

comment:52 Changed 6 months ago by fholzing

r16421: fixed bug (duplicated entries in Classifciation VariableImpacts View)

comment:53 Changed 6 months ago by fholzing

r16422: reintegrated branch

comment:54 Changed 6 months ago by fholzing

  • Status changed from reviewing to readytorelease
  • Version changed from branch to trunk

comment:55 Changed 6 months ago by fholzing

r16423: Deleted Branch

comment:56 Changed 6 months ago by mkommend

r16438: Merged r16422 and r16423 into stable.

comment:57 Changed 6 months ago by mkommend

r16441: Changed unit tests to old LR because #2892 has not yet been released.

comment:58 Changed 6 months ago by mkommend

r16442: Merged r16441 into stable.

comment:59 Changed 5 months ago by mkommend

  • Resolution set to done
  • Status changed from readytorelease to closed
Note: See TracTickets for help on using tickets.