Opened 13 years ago
Closed 13 years ago
#1640 closed enhancement (done)
Refactor datasets to allow the storage of strings and datetimes
Reported by: | mkommend | Owned by: | gkronber |
---|---|---|---|
Priority: | medium | Milestone: | HeuristicLab 3.3.6 |
Component: | Problems.DataAnalysis | Version: | 3.3.6 |
Keywords: | Cc: |
Description
Currently, only double values can be stored in the Dataset. Although these are the only values that can be used for modeling, string and DateTime values are useful for informational purposes.
Change History (15)
comment:1 Changed 13 years ago by mkommend
- Status changed from new to accepted
comment:2 Changed 13 years ago by mkommend
- Owner changed from mkommend to gkronber
- Status changed from accepted to assigned
The SymbolicDataAnalysisExpressionTreeILEmittingInterpreter must be adapted to work with r6740 correctly.
comment:3 Changed 13 years ago by gkronber
- Status changed from assigned to accepted
comment:4 Changed 13 years ago by gkronber
comment:5 Changed 13 years ago by gkronber
- Status changed from accepted to reviewing
comment:6 Changed 13 years ago by mkommend
r6749: Added Storable attribute to dataset values.
comment:7 Changed 13 years ago by gkronber
r6754: fixed a problem with empty training indices when using cross-validation
comment:8 Changed 13 years ago by gkronber
comment:9 Changed 13 years ago by gkronber
Importing data does not work reliably. In lines 100 - 122 of the TableFileParser, the type of each column is first determined heuristically, and the parsed values are then filled into the columns.
First of all, the statement in line 100:
var columnType = types.GroupBy(v => v).OrderBy(v => v).Last().Key;
throws an exception because a group (an enumerable) cannot be used as the sort key. Possibly OrderBy(v => v.Count()) was meant?
The next problem is that the type of a column is determined by the majority of the parsed elements for that column, but the elements are later pushed into the columns without checking whether their type is compatible. This throws another exception.
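The two problems described in this comment can be illustrated with a minimal sketch. This is not the actual TableFileParser code and not necessarily how the issue was later fixed; the class ColumnTypeSketch and its methods GuessType, MajorityType and ParseColumn are hypothetical names. The sketch assumes a non-empty column, invariant-culture parsing, and a fallback to strings whenever a cell does not match the majority type.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical illustration only; not part of HeuristicLab.
public static class ColumnTypeSketch {
  private static readonly CultureInfo Inv = CultureInfo.InvariantCulture;

  // Guess the type of a single cell: double first, then DateTime, otherwise string.
  public static Type GuessType(string cell) {
    if (double.TryParse(cell, NumberStyles.Float, Inv, out _))
      return typeof(double);
    if (DateTime.TryParse(cell, Inv, DateTimeStyles.None, out _))
      return typeof(DateTime);
    return typeof(string);
  }

  // Pick the most frequent cell type. The groups are ordered by their size
  // (g.Count()), not by the group itself, which is not comparable.
  // Assumes at least one cell; Last() throws on an empty sequence.
  public static Type MajorityType(IEnumerable<string> cells) {
    return cells.Select(GuessType)
                .GroupBy(t => t)
                .OrderBy(g => g.Count())
                .Last().Key;
  }

  // Fill a column only if every cell is compatible with the majority type.
  // Assumption of this sketch: otherwise the whole column is kept as strings.
  public static IList ParseColumn(IReadOnlyList<string> cells) {
    Type columnType = MajorityType(cells);
    if (columnType == typeof(double) && cells.All(c => GuessType(c) == typeof(double)))
      return cells.Select(c => double.Parse(c, NumberStyles.Float, Inv)).ToList();
    if (columnType == typeof(DateTime) && cells.All(c => GuessType(c) == typeof(DateTime)))
      return cells.Select(c => DateTime.Parse(c, Inv, DateTimeStyles.None)).ToList();
    return cells.ToList();
  }
}
```

For example, ColumnTypeSketch.ParseColumn(new[] { "1.5", "2", "abc" }) determines double as the majority type but returns a List<string>, because "abc" is not compatible with double. Falling back to strings is only one possible policy; storing a missing-value marker for incompatible cells would be another.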
comment:10 Changed 13 years ago by gkronber
- Status changed from reviewing to assigned
comment:11 Changed 13 years ago by gkronber
- Status changed from assigned to accepted
comment:12 Changed 13 years ago by gkronber
r6776: fixed a bug in parsing DateTime values and improved the code for filling dataset columns
comment:13 Changed 13 years ago by gkronber
- Owner changed from gkronber to mkommend
- Status changed from accepted to reviewing
comment:14 Changed 13 years ago by mkommend
- Owner changed from mkommend to gkronber
- Status changed from reviewing to readytorelease
comment:15 Changed 13 years ago by swagner
- Resolution set to done
- Status changed from readytorelease to closed
- Version changed from 3.3.5 to 3.3.6