Opened 10 months ago

Last modified 3 months ago

#2904 reviewing enhancement

Create new CalculateImpacts method for a set of variables

Reported by: fholzing Owned by: mkommend
Priority: medium Milestone: HeuristicLab 3.3.16
Component: Algorithms.DataAnalysis Version: branch
Keywords: Cc:

Description

Currently the calculation of the VariableImpacts isn't possible for a single variable or a chosen set of variables. Two new methods like the current RegressionSolutionVariableImpactsCalculator.CalculateImpacts and ClassificationSolutionVariableImpactsCalculator.CalculateImpacts should be created which take an additional set of variables as an input.

Change History (44)

comment:1 Changed 10 months ago by fholzing

  • Status changed from new to accepted

comment:2 Changed 10 months ago by fholzing

  • Version changed from trunk to branch

comment:3 Changed 10 months ago by fholzing

r15804: Created branch

comment:4 Changed 10 months ago by fholzing

r15805: Made the branch compile

comment:5 Changed 10 months ago by fholzing

r15807: Deleted project (wrong one)

comment:6 Changed 10 months ago by fholzing

r15808: Created branch (this time with the correct project)

comment:7 Changed 10 months ago by fholzing

r15809: Made the branch compile

comment:8 Changed 10 months ago by fholzing

r15815: Splitted the Method CalculateImpact into smaller parts

comment:9 Changed 10 months ago by fholzing

r15816: Cleaned up the method signature (some optional parameters aren't so optional)

comment:10 Changed 9 months ago by fholzing

r15831: Refactored RegressionSolutionVariableImpactsCalculator. We don't depend on the solution anymore. The impact can be calculated for a single variable. The calculator can be chosen.

Last edited 5 months ago by fholzing (previous) (diff)

comment:11 Changed 9 months ago by fholzing

Important Points:
I guess the same refactoring should be also applied to Classifications.
There may be a better alternative than the additional method in the Interface IOnlineCalculator.
Some OnlineCalculators aren't in the branch, so attention is required as we merge the branch.
Additional UnitTest are preferred.
The UI doesn't support the new functionality (Maybe in an additional ticket later on).

comment:12 Changed 9 months ago by fholzing

  • Owner changed from fholzing to mkommend
  • Status changed from accepted to reviewing

comment:13 Changed 5 months ago by fholzing

r16001: Reverted the additional CalculateValue-Method in IOnlineCalculator and created a static property/method in RegressionSolutionVariableImpactsCalculator

comment:14 Changed 5 months ago by fholzing

r16002: Fixed static Property

comment:15 Changed 5 months ago by mkommend

  • Owner changed from mkommend to fholzing
  • Status changed from reviewing to assigned

Review comments

  • Adhere to the coding styles and fix the parentheses in the impact calculator
  • Factor replacement method cannot be set in a property
  • Program as threadsafe as possible -> No static properties that can be modified (Calculator)
  • Line 163 use Count() directly instead of .Where().Count()
  • Use self-explaining parameter names and not IDataset problemData
  • EvaluateModelWithReplacedVariable should be replace to GetReplacedValues or something similar that describes its purpose and not call EvaluateModelWithReplacedVariable .
  • EvaluateModelWithReplacedVariable should have a better ordering of its parameters and not start with the original vlaues
  • Calculate value do not use .Count before zip, because this evaluates the enumerable multiple times
  • Line 199: avoid creation of modifiable datasets on the original dataset directly
  • Use consistent values for default parameters, e.g. the defined enums

comment:16 Changed 5 months ago by fholzing

I imported and adhered to the given VisualStudio Settings from https://dev.heuristiclab.com/trac.fcgi/attachment/wiki/Documentation/DevelopmentCenter/DeveloperGuidelines/Visual%20Studio%202010%20text%20editor%20settings.vssettings (using the AutoFormat-Shortcut in VisualStudio). Either the Settings are depricated, in which case we should update them, or they don't include our preferred formatting whereby the formatting should be included i guess.

comment:17 Changed 5 months ago by fholzing

r16016: Removed unnecessary where-condition (.Where(...).Count()); Cloned the dataset before the .ToModifiable() call; Adapted the CalculateValue-Method for a single loop through the IEnumerables for the target and estimated values

comment:18 Changed 5 months ago by fholzing

I just had a closer look at the missing Property/Parameter for the factor replacement method.

At this moment, DataPartition and ReplacementMethod are defined as IFixedValueParameter but are never set this way. They are only used in the Calculate-Method, which is currently not in use.

I would like to suggest three options to think about (each with it's own (dis)advantages:

-Just use them as IFixedValueParameter
-Just use them as method-parameters
-A combination of both with the Parameter as Default, which can be overwritten (via Default-Parameter)

comment:19 Changed 5 months ago by fholzing

r16017: Added IFixedValueParameter for factory variable replacement. Unified the default values.

comment:20 Changed 5 months ago by fholzing

r16018: Unified order of IFixedValueParameter. Use FactorReplacementMethod as default for .Calculate method

comment:21 Changed 5 months ago by fholzing

r16020: Removed static calculator-variable, Changed default ReplacementMethod from Median to Shuffle, Adapted Calculation-Method adhering to the OnlineCalculators, Re-Added the condition for counting the input-parameters

Last edited 5 months ago by fholzing (previous) (diff)

comment:22 Changed 5 months ago by fholzing

r16027: Added folder for View-branch
r16028: Removed folder for View-branch
r16029: Branched the View
r16030: Made the branch compile

comment:23 Changed 5 months ago by fholzing

r16031: Changed formatting (adhering to the HL-standard) and renamed variables/methods for better comprehensibility

comment:24 Changed 5 months ago by fholzing

r16034: Removed callback, adapted both view and calculator.

comment:25 Changed 4 months ago by fholzing

r16035: Better method-ordering, variable-naming and cleaned up some code not necessary anymore.

comment:26 Changed 4 months ago by fholzing

r16036: Streamlined the variableimpactcalculator code on both Regression and Classification. Taken over the regression-code for classification with some minor adaptations.

comment:27 Changed 4 months ago by fholzing

r16037: Restyled ClassificationSolutionVariableImpactsCalculator to be nearly identical to it's Regression-pendant

comment:28 Changed 4 months ago by fholzing

r16038: Branched UnitTest-Project
r16039: Added Test-Proj to Solution

comment:29 Changed 4 months ago by fholzing

r16041: Removed ElementAt

comment:30 Changed 4 months ago by fholzing

r16042: Renamed variableImactsArrayView to variableImpactsArrayView

comment:31 Changed 4 months ago by fholzing

r16051: Refactored Methods for better readability. Fixed styling mistakes.

comment:32 Changed 4 months ago by fholzing

r16052: Fixed wrong variable usage

comment:33 Changed 4 months ago by fholzing

r16055: Also applied changes from regression to classification

comment:34 Changed 4 months ago by fholzing

r16058: Added Unit-Tests for Regression

comment:35 Changed 4 months ago by fholzing

r16061: Added Unit-Tests for Classification/Performance and renamed VariableImpactCalculationTest.cs to RegressionVariableImpactCalculationTest.cs

comment:36 Changed 4 months ago by fholzing

r16065: Added more Unit-Tests for Classification

comment:37 Changed 4 months ago by fholzing

r16067: Added additional Unit-Test for performance measurement

comment:38 Changed 4 months ago by fholzing

  • Owner changed from fholzing to mkommend
  • Status changed from assigned to reviewing

comment:39 Changed 3 months ago by fholzing

r16127: Shifted ascendingCheckBox to the left

comment:40 Changed 3 months ago by fholzing

r16128: Fixed Types (String for FactorVariables)

comment:41 Changed 3 months ago by fholzing

r16181: Fixed Types (String for FactorVariables) for Classification

comment:42 Changed 3 months ago by fholzing

r16188: Merged changes from trunk

comment:43 Changed 3 months ago by fholzing

r16189: Overwrote trunk-changes with branch-changes

comment:44 Changed 3 months ago by fholzing

r16190: Added changes from #2910 to ClassificationSolutionVariableImpactsCalculator

Note: See TracTickets for help on using tickets.