Opened 7 years ago
Closed 7 years ago
#2809 closed enhancement (done)
Remove "Logic" Classes in DataPreprocessing
Reported by: | pfleck | Owned by: | pfleck |
---|---|---|---|
Priority: | medium | Milestone: | HeuristicLab 3.3.15 |
Component: | DataPreprocessing | Version: | trunk |
Keywords: | Cc: |
Description
The "Logics" in the DataPreprocessing encapsulates different aspects of how the DataPreprocessingData can be accessed and manipulated. However, the "Logic"-classes mostly contain some helper methods that are only used within a single corresponding view. The logic classes should be removed and the code moved to the corresponding views.
By removing the "Logics", the initialization of the DataPreprocessingView should be simplified to avoid the highly coupled and entangled object initialization. Generally, the overall design of the data preprocessing should be more like the rest of HL.
Change History (25)
comment:1 Changed 7 years ago by pfleck
- Status changed from new to accepted
comment:2 Changed 7 years ago by pfleck
comment:3 Changed 7 years ago by pfleck
r15269: Removed SearchLogic
comment:4 Changed 7 years ago by pfleck
r15270: Removed TransactionalPreprocessingData and moved relevant code to PreprocessingData.
comment:5 Changed 7 years ago by pfleck
- Removed FilterLogic.
- Made Contents storable and implemented proper cloning.
comment:6 Changed 7 years ago by pfleck
r15283: Removed StatisticsLogic.
comment:7 Changed 7 years ago by pfleck
r15285 removed ManipulationLogic
- Moved "ReplaceByAverage/Random/..." to DataGridContent only (functions are redundant in manipulation view)
- Refactored some code for replacing values with ...
- Removed Smoothing (not very usefull and changes the data incomprehensibly)
- Moved Shuffling to DataGridContent
- Moved "Delete Rows/Columns with insufficient ..." to ManipulationContent
- Removed Logic Folder
- Moved Filter folder out
comment:8 Changed 7 years ago by pfleck
r15291: Added (Double/String/DateTime)PreprocessingDataColumn. (experimental state)
comment:9 Changed 7 years ago by pfleck
r15309 Worked on type-save PreprocessingDataColumns.
comment:10 Changed 7 years ago by pfleck
comment:11 Changed 7 years ago by pfleck
- Owner changed from pfleck to bburlacu
- Status changed from accepted to reviewing
comment:12 Changed 7 years ago by bburlacu
- Owner changed from bburlacu to pfleck
- Status changed from reviewing to assigned
The functionality is fine and consistent with the old behavior (tested side by side with the SARCOS problem data).
Some nitpicks:
- FilteredPreprocessingData.cs:
- Line 45: typo "preporcessingData"
- Filtering logic seems unnecessarily complicated
- FilterView.cs:
- Line 86: this piece of code is a little misleading and the whole logic in my opinion counter-intuitive
result1[row] = isAndCombination ? result1[row] || filterResult[row] : result1[row] && filterResult[row];
- the actual result is not the rows for which the filter combination is valid, but the rows for which the filter combination is NOT valid (and these rows get deleted in the SetFilter() method)
- i think the intent should be more clear and the result explicitly negated in order to get the rows which should be deleted
- names like "result1" should be more clear (eg., rowsToDelete, filteredRows)
comment:13 Changed 7 years ago by mkommend
- Milestone changed from HeuristicLab 3.3.16 to HeuristicLab 3.3.15
comment:14 Changed 7 years ago by pfleck
- Status changed from assigned to accepted
comment:15 Changed 7 years ago by pfleck
r15466: Simplified the overall filtering logic as suggested by bburlacu
- changed parameter names to actively reflect that filter means "remaining"
- moved filter combination logic to FilterContent
- simplified/restructured code
comment:16 Changed 7 years ago by pfleck
- Owner changed from pfleck to bburlacu
- Status changed from accepted to reviewing
comment:17 Changed 7 years ago by bburlacu
- Owner changed from bburlacu to pfleck
- Status changed from reviewing to assigned
- unchecking a filter results in the wrong number of remaining and total rows displayed in the filter preview
- filtering is quite slow (takes ~7 seconds on my machine to remove 15k rows after filtering) - this problem also exists in the trunk version of the DataPreprocessing)
EDIT: The slowness is caused by the calls to IList.RemoveAt() which is called in succession for each row that needs to be removed inside the PreprocessingData.DeleteRow() method. Simply replacing the original values with the filtered values (instead of by row removal) results in a massive speed improvement.
comment:18 Changed 7 years ago by pfleck
- Status changed from assigned to accepted
comment:19 Changed 7 years ago by pfleck
- Fixed wrong number of remaining/toral rows after filter changed
- Improved performance for filtering as suggested by bburlacu. Thanks for the patch.
- Start all charts with index zero.
comment:20 Changed 7 years ago by pfleck
- Owner changed from pfleck to bburlacu
- Status changed from accepted to reviewing
comment:21 Changed 7 years ago by pfleck
- Owner changed from bburlacu to pfleck
- Version changed from branch to trunk
r15518: merged branch to trunk
comment:22 Changed 7 years ago by pfleck
r15534: Fixed a potential InvalidOperationException when applying filters
comment:23 Changed 7 years ago by pfleck
- Status changed from reviewing to readytorelease
comment:24 Changed 7 years ago by pfleck
comment:25 Changed 7 years ago by pfleck
- Resolution set to done
- Status changed from readytorelease to closed
r15536 deleted branch
r15268 Added branch with the DataPreprocessing plugins.