Opened 6 years ago

Closed 5 years ago

#1756 closed defect (done)

Line chart is slow for large regression problems

Reported by: bburlacu
Owned by: mkommend
Priority: medium
Milestone: HeuristicLab 3.3.7
Component: Algorithms.DataAnalysis.Views
Version: 3.3.7
Keywords:
Cc: mkommend

Description

When the problem data contains a large number of rows (20000-50000), the line chart lags noticeably (up to several seconds) compared, for instance, to the scatter plot.

Attachments (1)

chart types.png (127.9 KB) - added by bburlacu 6 years ago.
Chart types and empty points behavior


Change History (12)

comment:1 Changed 6 years ago by gkronber

  • Milestone changed from HeuristicLab 3.3.x Backlog to HeuristicLab 3.3.7
  • Priority changed from low to medium

comment:2 Changed 6 years ago by bburlacu

  • Status changed from new to accepted

comment:3 Changed 6 years ago by bburlacu

  • Owner changed from bburlacu to mkommend
  • Status changed from accepted to reviewing

r7327: Fixed speed issue in RegressionSolutionLineChartView. The problem was an unnecessary call to the InsertEmptyPoints procedure. The MSDN website (http://msdn.microsoft.com/en-us/library/dd456677.aspx) specifies the context in which this method is useful: when the data points have no Y value. In our case, however, there are no such points, so removing the calls makes performance similar to that of the scatter plot (which also does not insert empty points) and reasonably fast (tested on a data set with 50000 rows).

Last edited 6 years ago by gkronber

comment:4 Changed 6 years ago by gkronber

I introduced the call to InsertEmptyPoints in order to fix problems with NaN values in either predicted values or original values. Please check if this still works as expected.

Please also check if everything works correctly for regression and classification ensembles (as produced for instance by CV).

comment:5 Changed 6 years ago by bburlacu

r7333: Fixed series color for empty points.

Some observations after testing the chart behavior with different types of points: when calling the DataBindXY method, NaN y-values get converted and each such point's IsEmpty flag remains set to true:

  • FastPoint and FastLine chart types convert NaN values to 0.0
  • Point and Line chart types convert NaN values to the Y value of the nearest non-empty neighboring point, or to the average of the left and right neighboring points (if both exist and are non-empty)

By default, empty points are invisible on the chart (transparent foreground color), but this is solved by setting their color to the default series color. I attached a screenshot of this.
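
For reference, here is a small Python model of the two conversion behaviors observed above. It is only a sketch of what was seen during testing, not the chart control's actual implementation. The function names (fill_fast, fill_neighbors, empty_flags) are made up for illustration. Two details are assumptions: ties between equally distant neighbors go to the left one, and a series with no non-empty point at all is left as NaN.

```python
import math


def fill_fast(ys):
    """Model of FastPoint/FastLine as observed: NaN y-values become 0.0."""
    return [0.0 if math.isnan(y) else y for y in ys]


def fill_neighbors(ys):
    """Model of Point/Line as observed.

    A NaN becomes the average of its direct left and right neighbors when
    both exist and are non-empty; otherwise it takes the Y value of the
    nearest non-empty point.
    """
    n = len(ys)
    out = list(ys)
    for i, y in enumerate(ys):
        if not math.isnan(y):
            continue
        left_ok = i > 0 and not math.isnan(ys[i - 1])
        right_ok = i < n - 1 and not math.isnan(ys[i + 1])
        if left_ok and right_ok:
            out[i] = (ys[i - 1] + ys[i + 1]) / 2.0
            continue
        # Search outwards for the nearest non-empty point.
        # Assumption: on a tie, the left neighbor wins.
        for d in range(1, n):
            if i - d >= 0 and not math.isnan(ys[i - d]):
                out[i] = ys[i - d]
                break
            if i + d < n and not math.isnan(ys[i + d]):
                out[i] = ys[i + d]
                break
        # Assumption: if no non-empty point exists, the value stays NaN.
    return out


def empty_flags(ys):
    """The IsEmpty flag stays true for every point whose input was NaN."""
    return [math.isnan(y) for y in ys]
```

Applied to the series 1.0, NaN, 3.0, NaN, NaN, 6.0, NaN, fill_fast replaces every NaN with 0.0, while fill_neighbors averages the first gap to 2.0 and fills the remaining gaps from the nearest non-empty point.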

Changed 6 years ago by bburlacu

Chart types and empty points behavior

comment:6 Changed 6 years ago by bburlacu

r7335: Also removed the InsertEmptyPoints call from inside the ToggleSeriesData method.

comment:7 Changed 6 years ago by mkommend

  • Owner changed from mkommend to bburlacu
  • Status changed from reviewing to assigned

comment:8 Changed 6 years ago by bburlacu

r7406: Fixed line chart behavior for cases when the data point series are not continuous (some indices are not consecutive).

comment:9 Changed 5 years ago by ascheibe

  • Owner changed from bburlacu to mkommend
  • Status changed from assigned to reviewing

comment:10 Changed 5 years ago by mkommend

  • Status changed from reviewing to readytorelease

comment:11 Changed 5 years ago by mkommend

  • Resolution set to done
  • Status changed from readytorelease to closed
  • Version changed from 3.3.6 to 3.3.7