Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#2958 closed enhancement (done)

Vectorized/batch-mode interpreter for symbolic expression trees

Reported by: bburlacu Owned by: mkommend
Priority: medium Milestone: HeuristicLab 3.3.16
Component: Problems.DataAnalysis.Symbolic Version: trunk
Keywords: merged Cc:

Description (last modified by bburlacu)

This ticket explores batching and vectorisation techniques (i.e., using dedicated data types from System.Numerics) to speed up the interpretation of symbolic expression trees.

Batching consists of allocating a small buffer for each instruction and performing each operation on the whole buffer (instead of on individual values for each row in the dataset).

Vectorisation additionally involves using SIMD (Single Instruction Multiple Data) CPU instructions to speed up batch processing.
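As an illustration of the batching idea, here is a minimal C++ sketch. It is not the code committed for this ticket: the `Instr` layout, the `Op` set, the postfix instruction order and the buffer length `kBatchSize` are all assumptions made for the example. Each instruction owns a small buffer, and every operation runs over the whole buffer in a tight loop, which is the kind of loop an optimizing compiler may auto-vectorize.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

constexpr std::size_t kBatchSize = 64;  // hypothetical buffer length

enum class Op { Const, Var, Add, Mul };

// Hypothetical instruction layout, for illustration only.
struct Instr {
    Op op = Op::Const;
    double value = 0.0;                    // used by Const
    const double* column = nullptr;        // used by Var: points at a dataset column
    std::size_t lhs = 0, rhs = 0;          // child instruction indices (Add, Mul)
    std::array<double, kBatchSize> buf{};  // one small buffer per instruction
};

// Instructions are stored in postfix order, so children precede their
// parent and the last instruction is the root of the tree.
std::vector<double> evaluate(std::vector<Instr>& code, std::size_t rows) {
    std::vector<double> result(rows);
    if (code.empty()) return result;
    for (std::size_t start = 0; start < rows; start += kBatchSize) {
        const std::size_t n = std::min(kBatchSize, rows - start);
        for (Instr& in : code) {
            switch (in.op) {
            case Op::Const:
                for (std::size_t i = 0; i < n; ++i) in.buf[i] = in.value;
                break;
            case Op::Var:
                for (std::size_t i = 0; i < n; ++i) in.buf[i] = in.column[start + i];
                break;
            case Op::Add:
                for (std::size_t i = 0; i < n; ++i)
                    in.buf[i] = code[in.lhs].buf[i] + code[in.rhs].buf[i];
                break;
            case Op::Mul:
                for (std::size_t i = 0; i < n; ++i)
                    in.buf[i] = code[in.lhs].buf[i] * code[in.rhs].buf[i];
                break;
            }
        }
        // The root buffer holds the tree's values for this batch of rows.
        std::copy_n(code.back().buf.begin(), n,
                    result.begin() + static_cast<std::ptrdiff_t>(start));
    }
    return result;
}
```

The instruction dispatch (the `switch`) is paid once per batch rather than once per row, which is where most of the saving over a row-by-row interpreter would be expected to come from.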

Managed (C#) interpreter

Batch processing using the Vector<double> class in System.Numerics allows us to achieve a 2-3x speed improvement compared to the standard linear interpreter.

Native interpreter

A tree interpreter in native code (C++) can offer a significant speed advantage due to more mature backends (msvc, gcc) and features like auto-vectorization and loop unrolling.

Preliminary results show a 5-10x speed improvement compared to the linear tree interpreter. We should also investigate the potential benefit of integrating fast math libraries such as vdt (vectorized math) to increase computation speed.

This functionality should be implemented as an external plugin.

Attachments (1)

PerformanceChart.xlsx (14.6 KB) - added by bburlacu 5 years ago.
Preliminary results

Change History (35)

Changed 5 years ago by bburlacu

Preliminary results

comment:1 Changed 5 years ago by bburlacu

  • Status changed from new to accepted

comment:2 Changed 5 years ago by bburlacu

r16266: Add native interpreter dll wrapper as external lib.

r16269: Add C++ source code

r16274: Update dll files and C++ source code to the latest version.

r16276: Add SymbolicDataAnalysisExpressionTreeNativeInterpreter which calls into the native implementation.

r16277: SymbolicDataAnalysisExpressionTreeNativeInterpreter: add EvaluatedSolutions as parameter, similar to the other interpreters.

Last edited 5 years ago by bburlacu

comment:3 Changed 5 years ago by bburlacu

  • Owner changed from bburlacu to gkronber
  • Status changed from accepted to reviewing

comment:4 Changed 5 years ago by bburlacu

  • Description modified
  • Summary changed from Native interpreter for symbolic expression trees to Vectorized/batch-mode interpreter for symbolic expression trees

comment:5 Changed 5 years ago by bburlacu

r16285: Add vectorized SymbolicDataAnalysisExpressionTreeBatchInterpreter and update project config (Nuget package System.Numerics).

r16286: Forgot to commit changes to project file

r16287: Keep the SymbolicDataAnalysisExpressionTreeBatchInterpreter, but remove vectorization.

r16289: Add plugin dependency to native interpreter plugin.

Last edited 5 years ago by bburlacu

comment:6 Changed 5 years ago by bburlacu

r16293: Support additional symbols in the SymbolicDataAnalysisExpressionTreeBatchInterpreter

comment:7 Changed 5 years ago by bburlacu

r16296: SymbolicDataAnalysisExpressionTreeBatchInterpreter: simplify Compile, add cache for variable values (helps a lot with performance).

r16297: Very minor refactor.

Last edited 5 years ago by bburlacu

comment:8 Changed 5 years ago by bburlacu

r16298: Add batch interpreter performance unit tests (for arithmetic and typecoherent grammar).

comment:9 Changed 5 years ago by bburlacu

r16333: Native interpreter dlls: statically link against the Visual C++ runtime

r16334: Add support for sqrt in the interpreter and update dlls.

Last edited 5 years ago by bburlacu

comment:10 Changed 5 years ago by gkronber

Reviewed the code and made some changes in the branch for #2915. Will merge back later.

comment:11 Changed 5 years ago by gkronber

In r16356 I merged back changes to the native interpreter from the #2915 branch.

I tested a lot and found that the native interpreter produces exactly the same results as the managed interpreters except when using sqrt() or abs(). Do you have any idea why this might happen?

Last edited 5 years ago by gkronber

comment:12 Changed 5 years ago by gkronber

  • Owner changed from gkronber to bburlacu

comment:13 Changed 5 years ago by gkronber

  • Status changed from reviewing to assigned

The BatchInterpreter assumes that GetSymbolicExpressionTreeValues() is always called with the same dataset. On the first call it caches the supplied dataset and on later calls it just takes the values from the cache.

The API allows interpreters to be called with different datasets, so the supplied dataset must be checked against the cache on each call.

I just fell into this trap and only recognized the problem because I was skeptical of the results. Looking only at my code I would have never found that the interpreter actually ignores the dataset that I set as a parameter.

Please fix!

Probably this is true for the native interpreter as well.
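The check requested above could look roughly like the following C++ sketch. It is only an illustration under stated assumptions: `Dataset` and `VariableCache` are hypothetical stand-ins, not types from the interpreter, and the comparison is by object identity, so it would not detect a dataset that is modified in place.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a dataset: named columns of doubles.
struct Dataset {
    std::unordered_map<std::string, std::vector<double>> columns;
};

// Hypothetical cache of variable values, tied to one dataset at a time.
class VariableCache {
public:
    // Returns the cached column, discarding the cache first when a
    // different dataset object is supplied than on the previous call.
    const std::vector<double>& get(const Dataset& ds, const std::string& name) {
        if (&ds != cached_) {  // identity check on every call
            values_.clear();
            cached_ = &ds;
            ++rebuilds_;
        }
        auto it = values_.find(name);
        if (it == values_.end())
            it = values_.emplace(name, ds.columns.at(name)).first;
        return it->second;
    }

    // Number of times the cache was (re)initialised, exposed for testing.
    int rebuilds() const { return rebuilds_; }

private:
    const Dataset* cached_ = nullptr;
    std::unordered_map<std::string, std::vector<double>> values_;
    int rebuilds_ = 0;
};
```

With this shape, calling `get` with a second dataset returns that dataset's values instead of silently reusing the first one.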

comment:14 Changed 5 years ago by bburlacu

r16378: Batch and Native interpreter: keep a cached reference to the dataset so we can detect when it changes.

comment:15 Changed 5 years ago by bburlacu

r16379: NativeInterpreter: avoid memory leak (free pinned array handles when the cache changes)

comment:16 Changed 5 years ago by gkronber

  • Owner changed from bburlacu to gkronber
  • Status changed from assigned to reviewing

comment:17 Changed 5 years ago by gkronber

  • Version changed from 3.4 to trunk

comment:18 Changed 5 years ago by abeham

r16542: changed reference to PluginInfrastructure from file to project

The project build order wasn't right due to the missing dependency on the project. NativeInterpreter could be built before PluginInfrastructure since the reference wasn't recorded as a project reference.

comment:19 Changed 5 years ago by gkronber

The new interpreters do not support factor variables.

Version 0, edited 5 years ago by gkronber

comment:20 Changed 5 years ago by gkronber

Reviewed r16378, r16379, r16542 and tested the new interpreters.

comment:21 Changed 5 years ago by gkronber

  • Status changed from reviewing to assigned

There should be an exception when the evaluator does not support a symbol.
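A minimal C++ sketch of such a check follows. The symbol names and both function names are assumptions chosen for illustration and do not reflect the actual list of supported symbols in either interpreter.

```cpp
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical set of symbol names the interpreter can evaluate.
const std::set<std::string>& supportedSymbols() {
    static const std::set<std::string> symbols = {
        "Add", "Sub", "Mul", "Div", "Sqrt", "Variable", "Constant"};
    return symbols;
}

// Throws before evaluation starts if the tree contains any symbol
// outside the supported set, naming the offending symbol.
void checkSupported(const std::vector<std::string>& treeSymbols) {
    for (const std::string& name : treeSymbols) {
        if (supportedSymbols().count(name) == 0)
            throw std::invalid_argument("Unsupported symbol: " + name);
    }
}
```

Failing up front like this avoids returning values that look valid but were computed from a partially evaluated tree.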

comment:22 Changed 5 years ago by gkronber

r16762: added checks and exceptions if native and batch interpreters encounter an unsupported symbol.

comment:23 Changed 5 years ago by gkronber

  • Status changed from assigned to readytorelease

comment:24 Changed 5 years ago by gkronber

r16768: fixed supported operations in BatchInterpreter (fix failing unit test)

comment:25 Changed 5 years ago by abeham

  • Keywords mergewith-2915 added
  • Owner changed from gkronber to abeham
  • Status changed from readytorelease to reviewing

comment:26 Changed 5 years ago by abeham

  • Status changed from reviewing to readytorelease

comment:27 Changed 5 years ago by mkommend

  • Owner changed from abeham to mkommend
  • Status changed from readytorelease to assigned

comment:28 Changed 5 years ago by mkommend

  • Status changed from assigned to readytorelease

comment:29 Changed 5 years ago by abeham

  • Keywords depends-2520 mergewith-(2915 2866 2966) added; mergewith-2915 removed

comment:30 Changed 5 years ago by mkommend

r17071: Merged 16266, 16269, 16274, 16276, 16277, 16285, 16286, 16287, 16289, 16293, 16296, 16297, 16298, 16333, 16334 into stable.
r17073: Merged 16378, 16379, 16542 into stable.

Last edited 5 years ago by mkommend

comment:31 Changed 5 years ago by mkommend

  • Keywords merged added; depends-2520 mergewith-(2915 2866 2966) removed

r17104: Merged 16762, 16768 into stable.

comment:32 Changed 5 years ago by abeham

  • Resolution set to done
  • Status changed from readytorelease to closed

comment:33 Changed 5 years ago by gkronber

r16301 not merged?

comment:34 Changed 5 years ago by gkronber

r17170: merged r16301 from trunk to stable
