Opened 12 years ago
Closed 11 years ago
#2055 closed feature request (done)
Tool to reduce the file size of data analysis experiments
Reported by: | mkommend | Owned by: | ascheibe |
---|---|---|---|
Priority: | medium | Milestone: | HeuristicLab 3.3.9 |
Component: | Tools | Version: | 3.3.8 |
Keywords: | | Cc: | |
Description
Files storing data analysis experiments executed on the Hive grow rather large because the same dataset is saved multiple times in each file. Therefore, a small utility tool that removes duplicate datasets from saved HL files would be nice to have.
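A minimal sketch of the deduplication idea, not the code that was committed for this ticket: every run is redirected to the first dataset instance with equal content, so that a serializer that writes a shared object only once (an assumption here) stores the data a single time. `ToyDataset`, `ToyRun` and `DatasetDeduplicator` are hypothetical stand-ins and are not HeuristicLab types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a dataset: variable name -> column of values.
public sealed class ToyDataset {
  public Dictionary<string, double[]> Values { get; }
  public ToyDataset(Dictionary<string, double[]> values) { Values = values; }
}

// Hypothetical stand-in for a run that references a dataset.
public sealed class ToyRun {
  public ToyDataset Dataset { get; set; }
}

public static class DatasetDeduplicator {
  // Two datasets count as equal when they list the same variable names
  // in the same order and hold identical values for every variable.
  public static bool ContentEquals(ToyDataset a, ToyDataset b) {
    if (ReferenceEquals(a, b)) return true;
    // Compare variable names first (order-sensitive).
    if (!a.Values.Keys.SequenceEqual(b.Values.Keys)) return false;
    // Then compare the values of each variable.
    return a.Values.Keys.All(k => a.Values[k].SequenceEqual(b.Values[k]));
  }

  // Points every run at the first dataset seen with equal content.
  // Returns the number of references that were redirected.
  public static int ShareDuplicates(IEnumerable<ToyRun> runs) {
    var canonical = new List<ToyDataset>();
    int replaced = 0;
    foreach (var run in runs) {
      var match = canonical.FirstOrDefault(d => ContentEquals(d, run.Dataset));
      if (match == null) {
        canonical.Add(run.Dataset);
      } else if (!ReferenceEquals(match, run.Dataset)) {
        run.Dataset = match;
        replaced++;
      }
    }
    return replaced;
  }
}
```

The linear search over `canonical` keeps the sketch short; with many distinct datasets, a content hash would be the more usual choice.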
Change History (19)
comment:1 Changed 12 years ago by mkommend
- Status changed from new to accepted
comment:2 Changed 12 years ago by mkommend
comment:3 Changed 12 years ago by mkommend
Performed first tests of the file shrinker on my Eurocast and GECCO experiments. Each folder took around 30 minutes to process, and the folder size was reduced from 2.16 GB to 141 MB (Eurocast) and from 1.34 GB to 123 MB (GECCO).
comment:4 Changed 12 years ago by abeham
- Owner changed from mkommend to architects
- Status changed from accepted to assigned
We could think about expanding our file format to include separate "input files" that may be linked from the main serialization file.
comment:5 Changed 11 years ago by gkronber
- Owner changed from architects to mkommend
A tools menu item that calls the functionality provided by the command line program should be added to the optimizer.
comment:6 Changed 11 years ago by mkommend
- Milestone set to HeuristicLab 3.3.9
comment:7 Changed 11 years ago by abeham
- Version changed from 3.3.8 to branch
comment:8 Changed 11 years ago by mkommend
- Status changed from assigned to accepted
comment:9 Changed 11 years ago by mkommend
- Version changed from branch to 3.3.8
comment:10 Changed 11 years ago by mkommend
r9859: Added menuitem that removes duplicate datasets.
comment:11 Changed 11 years ago by mkommend
r9860: Added new menu item for data analysis commands.
comment:12 Changed 11 years ago by mkommend
r9861: Removed fileshrinker from tools directory.
comment:13 Changed 11 years ago by mkommend
- Owner changed from mkommend to gkronber
- Status changed from accepted to reviewing
comment:14 Changed 11 years ago by gkronber
I reviewed the code of the ShrinkDataAnalysisRunsMenuItem but didn't understand why it is necessary to create the variableValuesGetter and variableValuesSetter via Expressions. Seemingly, this also allows private fields to be set?
Please add a comment explaining what you are doing in the static initializer and why this is necessary.
Please also add a comment stating that you are comparing variable names on line 106: if(!values1.Keys.SequenceEqual(values2.Keys)) return false;
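For readers unfamiliar with the technique the review asks about, the following sketch shows how a getter and a setter for a private field can be built from expression trees. It is an illustration under stated assumptions and not the code of ShrinkDataAnalysisRunsMenuItem: `Holder`, its field `secret` and the `FieldAccessors` helper are hypothetical names. Only standard `System.Linq.Expressions` and `System.Reflection` calls are used.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical class with a private field that has no public setter.
public sealed class Holder {
  private int[] secret = { 1, 2, 3 };
  public int[] Peek() { return secret; }
}

public static class FieldAccessors {
  private static FieldInfo FindField(Type owner, string fieldName) {
    FieldInfo field = owner.GetField(fieldName, BindingFlags.Instance | BindingFlags.NonPublic);
    if (field == null) throw new ArgumentException("No such private instance field.", nameof(fieldName));
    return field;
  }

  // Builds and compiles: owner => owner.<fieldName>
  public static Func<TOwner, TField> CreateGetter<TOwner, TField>(string fieldName) {
    FieldInfo field = FindField(typeof(TOwner), fieldName);
    ParameterExpression owner = Expression.Parameter(typeof(TOwner), "owner");
    MemberExpression access = Expression.Field(owner, field);
    return Expression.Lambda<Func<TOwner, TField>>(access, owner).Compile();
  }

  // Builds and compiles: (owner, value) => owner.<fieldName> = value
  public static Action<TOwner, TField> CreateSetter<TOwner, TField>(string fieldName) {
    FieldInfo field = FindField(typeof(TOwner), fieldName);
    ParameterExpression owner = Expression.Parameter(typeof(TOwner), "owner");
    ParameterExpression value = Expression.Parameter(typeof(TField), "value");
    BinaryExpression assign = Expression.Assign(Expression.Field(owner, field), value);
    return Expression.Lambda<Action<TOwner, TField>>(assign, owner, value).Compile();
  }
}
```

A possible usage: create the delegates once, for example in a static initializer, and reuse them for every object. A compiled delegate is generally cheaper per call than repeated `FieldInfo.GetValue` and `FieldInfo.SetValue` calls, which is one plausible reason for choosing expressions.

```csharp
var getter = FieldAccessors.CreateGetter<Holder, int[]>("secret");
var setter = FieldAccessors.CreateSetter<Holder, int[]>("secret");
var holder = new Holder();
int[] shared = { 7, 8 };
setter(holder, shared);          // overwrites the private field
int[] current = getter(holder);  // reads it back; same instance as shared
```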
comment:15 Changed 11 years ago by mkommend
r9866: Added comments to ShrinkDataAnalysisRunsMenuItem.
comment:16 Changed 11 years ago by gkronber
- Owner changed from gkronber to mkommend
- Status changed from reviewing to readytorelease
comment:17 Changed 11 years ago by ascheibe
- Owner changed from mkommend to ascheibe
- Status changed from readytorelease to reviewing
comment:18 Changed 11 years ago by ascheibe
- Status changed from reviewing to readytorelease
comment:19 Changed 11 years ago by ascheibe
- Resolution set to done
- Status changed from readytorelease to closed
r9497: Added first version of HL.FileShrinker.