Opened 6 months ago

Last modified 11 days ago

#2709 reviewing enhancement

DataPreprocessing Enhancements

Reported by: pfleck
Owned by: mkommend
Priority: medium
Milestone: HeuristicLab 3.3.15
Component: DataPreprocessing.Views
Version: branch
Keywords:
Cc:

Description (last modified by pfleck)

This ticket contains smaller visual enhancements of the preprocessing views.

This ticket depends on #2698.

  • Multi-Scatterplot changes

This ticket depends on #2713.

  • moved DataTable/ScatterPlotControl out of DataTable/ScatterPlotView
  • introduced regression curves in scatterplot

This ticket depends on #2715.

  • introduce Histogram aggregation

Enhancements

  • ViewHost/ViewShortcut usage
    • Remove ViewHost icons for the ViewShortcuts
      • Split Single- and Multi-Scatterplot
    • Remove “View Shortcuts” grouping box
    • Double-clicking a ViewShortcut should not reset the state in the new view (not possible right now because the state of the views is partially stored in the views, not in the contents; will be fixed in the future)
  • PreprocessingCheckedItemView
    • Hide Move, Add, Delete buttons
    • Add Select Input/Target, All and None as Buttons (checkboxes with tooltips)
      • Remove context menu instead
  • DataGrid + Statistics
    • Show/Hide columns and rows
      • Check All, Input/Target, None Variables option
      • Initially only Input/Target variables should be checked
  • DataCompletenessChart
    • Remove title
    • Move legend to top (column style)
  • Scatterplot
    • Better default axis ranges (Bogdan's helper functions)
    • Axis description instead of legend
    • Manual axis-range (also for linechart) (currently via config dialog)
  • MultiScatterplot
    • X-axis labels vertical

New features

  • Distinguish Color and Grouping option in scatterplot
    • Current “Color” feature becomes “Grouping”
    • “Color” should be possible for all features, using the Color Gradient
  • Scatterplot
    • Add slider for changing point size (currently via config dialog)
    • Add regression line and add option to show/hide (implemented in #2713)
  • MultiScatterplot
    • Add (better) tooltips (implemented as legend tooltips in #2713)
      • Add correlation coefficient to scatterplot (visible in tooltip of legend)
  • Histogram + MultiLinechart
    • Add chart size sliders (as in MultiScatterplot)
      • or column count?
  • Feature Correlation Matrix
    • Check All, Input/Target, None Variables option
  • The New button should open an “Are you sure? The current data will be deleted” confirmation dialog

Change History (51)

comment:1 Changed 6 months ago by mkommend

  • Summary changed from Preprocessing Visual Enhancements to DataPreprocessing Enhancements

comment:2 Changed 6 months ago by pfleck

  • Description modified (diff)

comment:3 Changed 6 months ago by pfleck

  • Status changed from new to accepted

r14440 created branch

r14441 Copied plugins.

comment:4 Changed 6 months ago by pfleck

r14444 reverse merged r14441 because local changes were accidentally included in the branch.

comment:5 Changed 6 months ago by pfleck

r14445 Branched DataPreprocessing plugins. Adapted build paths and references.

comment:6 Changed 6 months ago by pfleck

r14446 Removed the PreprocessingScatterPlotView and use the HL ScatterPlotControl instead.

comment:7 Changed 6 months ago by pfleck

r14459

  • Removed the PreprocessingDataTable and PreprocessingDataTableView and use the HL DataTableControl instead.
  • Moved and refactored some code of PreprocessingChart and moved unnecessary code from base classes to actual derivative classes.

Some features of the PreprocessingDataTableView are included in the regular DataTableView in #2715.

comment:8 Changed 6 months ago by pfleck

  • Description modified (diff)

comment:9 Changed 6 months ago by pfleck

r14460 Fixed missing resx in csproj.

comment:10 Changed 6 months ago by pfleck

  • Description modified (diff)

r14462

  • Added a separate MultiScatterPlot entry and removed the ViewHost views-icon instead.
  • Moved legend of DataCompletenessChart to the top and removed the title instead.

comment:11 Changed 6 months ago by pfleck

  • Description modified (diff)

r14467

  • Removed some groupboxes in ViewShortcutListView.
  • Removed unnecessary IViewChartShortcut
  • Split ScatterPlot Multi and Single into separate contents.
  • Renamed Color-combo box in Scatterplot to "Group".

comment:12 Changed 6 months ago by pfleck

r14470

  • Fixed bugs with double-click on view shortcut.
  • Reuse visual properties for single scatterplot.

comment:13 Changed 6 months ago by pfleck

  • Description modified (diff)

r14472 Better initial axis intervals for scatterplots.
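
For readers unfamiliar with how "better" initial axis intervals are usually derived, the following is a minimal sketch of the common "nice numbers" rounding approach. It is not the code from r14472 or from the helper functions mentioned in the description; the names `nice_step` and `nice_axis_range` and the `max_ticks` parameter are hypothetical and chosen only for illustration.

```python
import math


def nice_step(raw_step):
    """Round a positive raw step up to the nearest 1, 2, 5 or 10 times a power of ten."""
    exponent = math.floor(math.log10(raw_step))
    fraction = raw_step / 10 ** exponent
    if fraction <= 1:
        nice = 1
    elif fraction <= 2:
        nice = 2
    elif fraction <= 5:
        nice = 5
    else:
        nice = 10
    return nice * 10 ** exponent


def nice_axis_range(lo, hi, max_ticks=10):
    """Return (axis_min, axis_max, step) enclosing [lo, hi] with rounded bounds.

    Hypothetical helper: assumes finite inputs with lo <= hi.
    """
    if lo == hi:
        # Degenerate data range: widen it so that a step can be computed.
        lo, hi = lo - 0.5, hi + 0.5
    step = nice_step((hi - lo) / max_ticks)
    axis_min = math.floor(lo / step) * step
    axis_max = math.ceil(hi / step) * step
    return axis_min, axis_max, step
```

With this sketch, data ranging from 12 to 87 and at most five ticks would be shown on an axis from 0 to 100 in steps of 20, instead of on the raw data bounds.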

comment:14 Changed 6 months ago by pfleck

r14473 Improved default y-axis for line charts.

comment:15 Changed 6 months ago by pfleck

r14474

  • Improved legend description for grouped histogram and scatterplots.
  • Fixed initial size of points for scatterplots.
  • Added correlation calculation for scatterplots (not used yet).
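
The ticket does not state which correlation measure was added. Assuming it is Pearson's correlation coefficient, the calculation can be sketched as below. The function name `pearson_r` and the NaN handling are illustrative assumptions, not the HeuristicLab implementation.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r of two equally long sequences, skipping pairs that contain NaN.

    Returns NaN if fewer than two valid pairs remain or if either variable is constant.
    """
    pairs = [(x, y) for x, y in zip(xs, ys)
             if not (math.isnan(x) or math.isnan(y))]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sums of squared deviations and of cross products.
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)
```

A value close to 1 or -1 indicates a strong linear relationship between the two plotted variables, which is the kind of information the description asks to show in the legend tooltip.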

comment:16 Changed 5 months ago by pfleck

  • Description modified (diff)

r14495

  • Fixed initial point size for scatterplots.
  • Reuse the visual properties of the old data row if a single variable is changed in the ScatterPlotSingleView

comment:17 Changed 5 months ago by pfleck

  • Description modified (diff)

r14511

  • Added Check Inputs/All/None buttons instead of showing disabled buttons of the ItemCollectionView.
  • Removed the PreprocessingCheckedItemListView. A standard ListView is used instead.
  • Fixed slow updating when simultaneously (un-)checking multiple variables in the chart views. (currently only works by using the new buttons)

comment:18 Changed 5 months ago by pfleck

  • Description modified (diff)

r14514

  • Added a VerticalLabel for the multi-scatterplot.
  • Added regression options for single- and multi-scatterplot

comment:19 Changed 5 months ago by pfleck

  • Description modified (diff)

r14512 Added an option for the preprocessing scatterplot to use a color gradient instead of the chart color palette.

comment:20 Changed 5 months ago by pfleck

r14525

  • Added suggestion feature for singlescatterplotview.
  • Shows NaN groups in scatterplot (black if gradient is selected).
  • Only enables input variables in DataGridContentView per default.
  • Added missing resx file (gradient image).

comment:21 Changed 5 months ago by pfleck

  • Description modified (diff)

r14545

  • Uses StringMatrix for statistics instead of winforms datagrid.
  • Precheck input/target variables only for statistics.

comment:22 Changed 5 months ago by pfleck

  • Description modified (diff)

r14546 Added shortcuts for select input/all/none variables in datagrid and statistics.

Last edited 5 months ago by pfleck (previous) (diff)

comment:23 Changed 5 months ago by pfleck

  • Owner changed from pfleck to mkommend
  • Status changed from accepted to reviewing

comment:24 Changed 4 months ago by pfleck

r14578 Fixed wrongly positioned options in histogram view.

comment:25 Changed 4 months ago by mkommend

r14579: Refactored histogram view and content to support grouping by string and datetime variables.

comment:26 Changed 4 months ago by mkommend

r14580: Changed initialization of caches to avoid NullReferenceExceptions.

comment:27 Changed 4 months ago by mkommend

r14581: Refactored get variables for grouping (extracted method to another class).

comment:28 Changed 4 months ago by pfleck

r14583

  • Added histogram aggregation option.
  • Show all columns in data grid per default.

comment:29 Changed 3 months ago by mkommend

r14723: Updated branch with most recent trunk changes.

comment:30 Changed 3 months ago by mkommend

Testing

  • View shortcuts should have more descriptive names and use spaces instead of camel case, for example "Line chart" instead of "LineChart".
  • All multi XXX charts should support opening an individual chart in a new tab by double-clicking it.
  • Data grid
    • What is the point of showing no variables? Especially because the show column context menu cannot be opened anymore.
    • Spacing between row / column count & action button should be the same as for action buttons & the show variables
    • Show Variables GroupBox just as label or centered. Currently it looks slightly odd.
  • Statistics
    • Horizontally listed columns look much better.
    • However, would it be possible to configure the direction (horizontal vs. vertical)?
    • The Show Variables GroupBox should be laid out vertically to make better use of the available space.
    • The data grid shows all columns by default, whereas the statistics view shows only the inputs and target. By default, all variables should be shown in both the data grid and the statistics view, with non-inputs highlighted, maybe in italics.
  • Line chart
    • Check and uncheck all variables have unintuitive icons. Can't you use a checked and unchecked box? (Applies to the histogram as well).
    • Reuse the icons for the data grid and statistics as well?
    • Size / Column count slider is missing. (Applies to the histogram as well).
  • Histogram
    • Title font size increases when grouping is enabled.
    • Aggregation options are pretty cool.
    • There should be an option to order the legend alphabetically instead of by occurrence in the data. (comment:47)
  • Scatter plot
    • It should be possible to change the point size and transparency of the data points. (Applies to the multi scatter plot as well.) (comment:45)
    • More reasonable default text size.
  • Multi Scatter plot
    • It should only have one size slider instead of two separate ones for width and height

Review Comments

  • Chart classes should be sealed and members should be private (e.g. LineChartView). (will be done in a separate ticket on general DataPreprocessing architecture overhaul)
  • Commented code should be removed (PreprocessingChartView). (comment:50)
  • Remove resx files (ScatterPlotSingleView) (resx in ScatterPlotSingleView contains the gradient image)
Last edited 11 days ago by pfleck (previous) (diff)

comment:31 Changed 3 months ago by mkommend

  • Status changed from reviewing to assigned

comment:32 Changed 3 months ago by mkommend

  • Status changed from assigned to accepted

comment:33 Changed 3 months ago by mkommend

r14724: Adapted data preprocessing scatter plot to allow grouping of string variables.

comment:34 Changed 3 months ago by mkommend

  • Owner changed from mkommend to pfleck
  • Status changed from accepted to assigned

r14725: Added grouping for multi scatter plot view.

comment:35 Changed 4 weeks ago by pfleck

  • Status changed from assigned to accepted

comment:36 Changed 4 weeks ago by pfleck

  • Description modified (diff)

r14902

  • Changed chart sizing to absolute values (pixels).
  • Added chart sizing to Linechart and Histogram.

comment:37 Changed 4 weeks ago by pfleck

  • Description modified (diff)

comment:38 Changed 4 weeks ago by pfleck

r14903

  • Added warning when creating a new regression/classification that data will be lost.
  • Renamed view shortcuts to have a more descriptive name instead of the camel casing.
  • Added missing license header.

comment:39 Changed 4 weeks ago by pfleck

r14915

  • Added Check All/Inputs&Target/None Icons.
  • Improved location and formatting of the "Show Variables" groupbox in datagrid and statistics view.
  • Added an "Orientation" option for the statistics view.

comment:40 Changed 4 weeks ago by pfleck

r14917

  • Use the new icons for PreprocessingCheckedVariablesView (linechart, histogram).
  • Added a "lock aspect ratio" sizing for the multi scatter plot.
  • Fixed a bug in single scatter plot when changing the regression line.

comment:41 Changed 3 weeks ago by pfleck

  • Description modified (diff)

comment:42 Changed 3 weeks ago by pfleck

  • Description modified (diff)

comment:43 Changed 3 weeks ago by pfleck

Review comments from 2698#comment:11

  • Maximum of 20 variables should be selected by default. (comment:45)
  • All controls should be displayed (variable check box, slider, ...) before the charts are drawn (asynchronously?).
  • One slider for the chart size is sufficient. It is quite cumbersome to handle two sliders. (r14917) The point size should also be adapted when the chart size changes.
  • Possible leak (memory, window handles). Removed controls are not disposed. (comment:44)
  • Inserting and removing charts is quite ugly and pretty slow. Another possibility would be to set the column / row width to 0. Maybe that is a "better" solution. (comment:45)
Last edited 13 days ago by pfleck (previous) (diff)

comment:44 Changed 3 weeks ago by pfleck

r14953 Disposed dynamically created controls.

comment:45 Changed 2 weeks ago by pfleck

r14975

  • Improved Check/Uncheck of variables.
    • Instead of removing whole columns/rows from the table layout, the table layout stays the same and the column width/row height is set to zero.
    • Hidden charts are not updated to avoid unnecessary calculations.
  • Added a prompt (message box) asking whether more than 20 variables should be displayed in the multi scatterplot or the selection reduced to 20.
  • Added configuration for point size/opacity and (histogram)aggregation.

comment:46 Changed 2 weeks ago by pfleck

r14983 Adapted DataTable/ScatterPlotControl to the recent (re-) merge of the -View and -Control.

comment:47 Changed 12 days ago by pfleck

r14993

  • Added Legend order when grouping for histogram and (single and multi)scatterplot.
  • Removed the limitation of distinct values for the singlescatterplot (for the color gradient).
  • Added a legend-visible checkbox for the multi-scatterplot.

comment:48 Changed 12 days ago by pfleck

  • Description modified (diff)

r14994 Added Check All/Inputs/None Buttons for the feature correlation view.

comment:49 Changed 12 days ago by pfleck

reviewed r14579, r14580, r14581, r14724 and r14725.

Looks good, especially that the code for creating a single chart (CreateHistogram/Scatterplot) was moved into the respective Contents.

comment:50 Changed 11 days ago by pfleck

r14996

  • Fixed initial selection of the grouping text box (empty string instead of null to select the first entry).
  • General code fixes (removed unnecessary blank lines and code, class member order, ...)

comment:51 Changed 11 days ago by pfleck

  • Owner changed from pfleck to mkommend
  • Status changed from accepted to reviewing