Opened 5 years ago

Closed 4 years ago

#1942 closed feature request (done)

Improve CSV import for data analysis problems

Reported by: sforsten
Owned by: mkommend
Priority: medium
Milestone: HeuristicLab 3.3.8
Component: Problems.Instances
Version: 3.3.8
Keywords:
Cc:

Description

When importing a data analysis problem from a CSV file, it should be possible to shuffle the data and to choose how much of it is used for training and how much for test.

For classification problems, it should also be possible to choose whether to normalize the dataset so that exactly the same number of samples is imported from every class, or to leave it as it is.
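The three requested options (shuffling, a training/test split, and equal class sizes) can be sketched as below. This is only an illustration in Python, not HeuristicLab code: the function names `load_rows`, `balance_classes` and `shuffle_and_split`, the `SAMPLE` data and the choice to balance by downsampling to the smallest class are all assumptions made for the example.

```python
import csv
import io
import random
from collections import defaultdict


def load_rows(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def balance_classes(rows, target, rng):
    """Keep the same number of rows per class (assumption: downsample to the smallest class)."""
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[target]].append(row)
    smallest = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        # Draw `smallest` rows without replacement from each class.
        balanced.extend(rng.sample(group, smallest))
    return balanced


def shuffle_and_split(rows, training_percent, rng):
    """Shuffle a copy of the rows and split it into training and test parts."""
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = len(shuffled) * training_percent // 100
    return shuffled[:cut], shuffled[cut:]


# Hypothetical sample data: four rows of class "a", two rows of class "b".
SAMPLE = """x,label
1,a
2,a
3,a
4,a
5,b
6,b
"""

rng = random.Random(42)  # fixed seed so the example is repeatable
rows = balance_classes(load_rows(SAMPLE), "label", rng)
training, test = shuffle_and_split(rows, 50, rng)
```

With the sample data, balancing keeps two rows per class, and a 50 percent split then puts two rows into each partition.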

Change History (24)

comment:1 Changed 5 years ago by sforsten

  • Status changed from new to accepted

comment:2 Changed 5 years ago by sforsten

r8598: csv files for data analysis problems can be shuffled when imported

comment:3 Changed 5 years ago by sforsten

r8599: Training and test partition can be defined (with a TrackBar in percent), when importing a csv file for data analysis problems.

comment:4 Changed 5 years ago by sforsten

r8601: training has now at least one sample
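The rule from r8599 and r8601 (a percentage chosen in the dialog, with at least one training sample) could look roughly like the following Python sketch. The function name `training_row_count` and the rounding-down behaviour are assumptions; the ticket does not say how the actual implementation rounds.

```python
def training_row_count(total_rows, training_percent):
    """Return the number of training rows for a percentage, never less than one."""
    if total_rows < 1:
        raise ValueError("the dataset must contain at least one row")
    if not 0 <= training_percent <= 100:
        raise ValueError("training_percent must be between 0 and 100")
    # Assumption: round down, then enforce the minimum of one training sample.
    count = total_rows * training_percent // 100
    return max(1, count)
```

With 10 rows, a setting of 0 percent still yields one training row, and 66 percent yields six.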

comment:5 Changed 5 years ago by sforsten

r8602: removed .resx file in project files

comment:6 Changed 5 years ago by gkronber

Review comments:

  • fix dialog icon
  • fix dialog name and type (close and cancel options should not be available, minimize and maximize options should not be available)
  • fix "Shuffel"

comment:7 Changed 5 years ago by mkommend

  • Milestone changed from HeuristicLab 3.3.x Backlog to HeuristicLab 3.3.8

comment:8 Changed 5 years ago by sforsten

r8693:

  • fixed typo "Shuffel"
  • removed icon and control box

A new branch will be created to add remaining features and features of #1819

comment:9 Changed 5 years ago by sforsten

r8694: create branch for remaining features and features of #1819
r8695: add project "Problems.Instances.DataAnalysis"
r8696: branch project "Problems.Instances.DataAnalysis.Views"
r8697: delete incorrectly branched project

comment:10 Changed 5 years ago by sforsten

r8701:

  • add combo boxes to DataAnalysisImportTypeDialog to select csv settings
  • get branch ready

comment:11 Changed 5 years ago by sforsten

r8713: branch additional project

comment:12 Changed 5 years ago by sforsten

  • Owner changed from sforsten to mkommend
  • Status changed from accepted to reviewing

r8715:

  • added csv import dialog for regression
  • improved existing dialog (tool tip, design, preview of dataset)

comment:13 Changed 5 years ago by sforsten

r8842: merged r8690:8840 from trunk into branch

comment:14 Changed 4 years ago by mkommend

r8875: Merged trunk changes into branch.

comment:15 Changed 4 years ago by mkommend

Reviewing comments:

  • Overhaul DataAnalysisInstanceProvider and remove not always supported import methods.
  • Place OK / Cancel button on the right side of the import dialog.
  • CancelButton hides a member inherited by Form. Use a different name for the button.
  • DataAnalysisImportTypeDialog: Is it really necessary to use a BindingList?
  • Improve ErrorHandling of ConsumerViews (catching of Exception).
  • Shuffling of classification problem data doesn't work correctly!
  • There should be a checkbox to choose whether the distribution of class values should be taken into account during the shuffling.
  • Evaluate if it is possible to import files although they are opened in other programs (e.g., Excel). Currently this is not possible.
  • License header is missing in .designer files.
Last edited 4 years ago by mkommend
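One way to take the distribution of class values into account during shuffling, as the review asks, is a stratified split: shuffle and split each class separately, then combine the parts. The Python sketch below only illustrates that idea and is not the code that was committed; `stratified_split`, `label_of` and the sample rows are hypothetical names and data.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_of, training_percent, rng):
    """Shuffle and split the rows so each class keeps its share in both partitions."""
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)

    training, test = [], []
    for group in by_class.values():
        rng.shuffle(group)
        # Split every class at the same percentage (rounded down).
        cut = len(group) * training_percent // 100
        training.extend(group[:cut])
        test.extend(group[cut:])

    # Shuffle again so rows of one class are not stored contiguously.
    rng.shuffle(training)
    rng.shuffle(test)
    return training, test


# Hypothetical data: eight rows of class "a" and four rows of class "b".
rows = [("a", i) for i in range(8)] + [("b", i) for i in range(4)]
training, test = stratified_split(rows, lambda row: row[0], 50, random.Random(0))
```

In this example both partitions keep the original 2:1 ratio between class "a" and class "b", which a plain shuffle does not guarantee.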

comment:16 Changed 4 years ago by mkommend

r8877: Reintegrated branch for CSV import.

comment:17 Changed 4 years ago by mkommend

  • Owner changed from mkommend to sforsten
  • Status changed from reviewing to assigned

comment:18 Changed 4 years ago by mkommend

r8878: Fixed compiler error in CSV import functionality.

comment:19 Changed 4 years ago by sforsten

  • Status changed from assigned to accepted

r8885:

  • implemented changes suggested by mkommend in comment:15:ticket:1942 except the first remark
  • TimeSeriesPrognosisInstanceProvider has been adapted to work similarly to the other DataAnalysisInstanceProviders; views have also been created for it

The first remark of mkommend's comment will be dealt with next.

comment:20 Changed 4 years ago by sforsten

  • Owner changed from sforsten to mkommend
  • Status changed from accepted to reviewing

r8892: delete obsolete branch

As discussed with mkommend, no changes are going to be made to DataAnalysisInstanceProvider.

comment:21 Changed 4 years ago by mkommend

Reviewed r8885 & r8892.

comment:22 Changed 4 years ago by mkommend

r9021: Renamed field of DataAnalysisImportType.

comment:23 Changed 4 years ago by mkommend

  • Status changed from reviewing to readytorelease

comment:24 Changed 4 years ago by swagner

  • Resolution set to done
  • Status changed from readytorelease to closed
  • Version changed from 3.3.7 to 3.3.8