Opened 13 years ago
Closed 12 years ago
#1756 closed defect (done)
Line chart is slow for large regression problems
Reported by: | bburlacu | Owned by: | mkommend |
---|---|---|---|
Priority: | medium | Milestone: | HeuristicLab 3.3.7 |
Component: | Algorithms.DataAnalysis.Views | Version: | 3.3.7 |
Keywords: | | Cc: | mkommend |
Description
When the problem data contains a large number of rows (20000-50000), the line chart responds with a noticeable lag (up to several seconds), which is much slower than, for instance, the scatter plot on the same data.
Attachments (1)
Change History (12)
comment:1 Changed 13 years ago by gkronber
- Milestone changed from HeuristicLab 3.3.x Backlog to HeuristicLab 3.3.7
- Priority changed from low to medium
comment:2 Changed 13 years ago by bburlacu
- Status changed from new to accepted
comment:3 Changed 13 years ago by bburlacu
- Owner changed from bburlacu to mkommend
- Status changed from accepted to reviewing
comment:4 Changed 13 years ago by gkronber
I introduced the call to InsertEmptyPoints in order to fix problems with NaN values in either predicted values or original values. Please check if this still works as expected.
Please also check if everything works correctly for regression and classification ensembles (as produced for instance by CV).
comment:5 Changed 13 years ago by bburlacu
r7333: Fixed series color for empty points.
Some observations after testing the chart behavior with different types of points: when calling the DataBindXY method, NaN y-values get converted and the point's IsEmpty flag remains set to true:
- FastPoint and FastLine chart types convert NaN values to 0.0
- Point and Line chart types convert NaN values to the Y value of the nearest non-empty neighboring point, or to the average of the left and right neighboring points (if both exist and are not empty)
By default, empty points are invisible on the chart (transparent foreground color), but this is solved by setting their color to the default series color. I attached a screenshot of this.
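To make the two observed conversions concrete, here is a small Python sketch that imitates them on a plain list of y-values. It is only an illustration of the behavior described above, not the chart control's code. The function names `fill_empty_fast` and `fill_empty_line` are hypothetical. The sketch also assumes that "neighboring" means the nearest non-NaN value on each side, which may differ from what the control does for longer runs of NaN values.

```python
import math


def fill_empty_fast(ys):
    """Imitates the observed FastPoint/FastLine behavior: NaN becomes 0.0."""
    return [0.0 if math.isnan(y) else y for y in ys]


def fill_empty_line(ys):
    """Imitates the observed Point/Line behavior (assumption: nearest non-NaN
    value on each side counts as the neighbor).

    - both neighbors exist: average of left and right
    - only one neighbor exists: that neighbor's value
    - no neighbor exists: the value stays NaN
    """
    n = len(ys)
    result = list(ys)
    for i, y in enumerate(ys):
        if not math.isnan(y):
            continue
        # Search outward for the nearest non-NaN value on each side.
        left = next((ys[j] for j in range(i - 1, -1, -1)
                     if not math.isnan(ys[j])), None)
        right = next((ys[j] for j in range(i + 1, n)
                      if not math.isnan(ys[j])), None)
        if left is not None and right is not None:
            result[i] = (left + right) / 2.0
        elif left is not None:
            result[i] = left
        elif right is not None:
            result[i] = right
    return result
```

For the input `[1.0, NaN, 3.0]`, the first function replaces the NaN with 0.0 and the second replaces it with 2.0, which shows why the two chart families draw different lines for the same data.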
comment:6 Changed 13 years ago by bburlacu
r7335: Also removed the InsertEmptyPoints call from inside the ToggleSeriesData method.
comment:7 Changed 13 years ago by mkommend
- Owner changed from mkommend to bburlacu
- Status changed from reviewing to assigned
comment:8 Changed 13 years ago by bburlacu
r7406: Fixed line chart behavior for cases when the data point series are not continuous (some indices are not consecutive).
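The ticket does not say how r7406 handles non-consecutive indices. As a rough illustration of the underlying issue, the following Python sketch groups row indices into consecutive runs, so that each run could be drawn as its own line segment rather than being joined across a gap. The helper name `consecutive_runs` is hypothetical, and the sketch assumes the indices arrive in ascending order.

```python
def consecutive_runs(indices):
    """Group ascending row indices into runs of consecutive values.

    Hypothetical helper for illustration; not taken from the r7406 change.
    """
    runs = []
    for idx in indices:
        if runs and idx == runs[-1][-1] + 1:
            # Index continues the current run.
            runs[-1].append(idx)
        else:
            # A gap (or the first index) starts a new run.
            runs.append([idx])
    return runs
```

For example, the indices 0, 1, 2, 5, 6, 9 fall into three runs: 0 to 2, 5 to 6, and 9 alone.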
comment:9 Changed 13 years ago by ascheibe
- Owner changed from bburlacu to mkommend
- Status changed from assigned to reviewing
comment:10 Changed 12 years ago by mkommend
- Status changed from reviewing to readytorelease
comment:11 Changed 12 years ago by mkommend
- Resolution set to done
- Status changed from readytorelease to closed
- Version changed from 3.3.6 to 3.3.7
r7326: Fixed speed issue in RegressionSolutionLineChartView. The problem was an unnecessary call to the InsertEmptyPoints procedure. The MSDN website (http://msdn.microsoft.com/en-us/library/dd456677.aspx) specifies the context in which this method is useful: when the data points have no Y value. In our case however there are no such points, so by removing the calls, performance becomes similar to the scatter plot (which also does not insert empty points) and is reasonably fast (tested on a data set with 50000 rows).