Opened 22 months ago

Closed 7 weeks ago

Last modified 12 days ago

#2442 closed enhancement (done)

Linq Expression Trees Interpreter for HeuristicLab

Reported by: bburlacu Owned by: mkommend
Priority: medium Milestone: HeuristicLab 3.3.15
Component: Problems.DataAnalysis.Symbolic Version: 3.3.12
Keywords: Cc:

Description

The same functionality provided by the IL-emitting tree interpreter can also be achieved using linq expression trees, with the added benefit of more flexibility and a slight performance improvement.

This branch holds the implementation for the linq expression trees interpreter, with an additional symbol added which allows user-defined functions - along the lines of what was requested in #2437.

Attachments (1)

test-grammar.hl (107.5 KB) - added by bburlacu 22 months ago.
Example of a programmable problem using the new symbol and interpreter.

Download all attachments as: .zip

Change History (39)

comment:1 Changed 22 months ago by bburlacu

  • Milestone changed from HeuristicLab 3.3.13 to HeuristicLab 3.3.x Backlog
  • Status changed from new to accepted

Changed 22 months ago by bburlacu

Example of a programmable problem using the new symbol and interpreter.

comment:2 Changed 22 months ago by bburlacu

r12807: Initial commmit (see the attached file for an example).

comment:3 Changed 20 months ago by bburlacu

r13039: Very minor changes in the naming of things. Removed readonly attributes for the min and max arity variables.

comment:4 Changed 19 months ago by bburlacu

  • Owner changed from bburlacu to mkommend
  • Status changed from accepted to reviewing

comment:5 Changed 19 months ago by mkommend

  • Milestone changed from HeuristicLab 3.3.x Backlog to HeuristicLab 3.3.13

comment:6 Changed 19 months ago by mkommend

Review comments:

  • ClearState should not reset the evaluated solutions counter
  • Both parameters should be of type FixedValueParam<>
  • Improve handling of variables
    • Why does the interpreter has to retrieve Dataset.GetReadonlyDoubleValues.List.array?
    • Don't copy the values in another array
    • Just use the indexer of the ReadonlyCollection
    • Can't the IDataset be reused instead of double[][]? This would ease the use of the interpeter and removes copying the values.
  • Please add performance unit tests (similar the existing ones)
  • Add another unit tests, which interprets the same tree with all available interpreters and compares the results. The trees should contain all available symbols and be randomly created. If a mismatch is found the used seed, tree index and tree should be additionally output.

comment:7 Changed 19 months ago by mkommend

  • Owner changed from mkommend to bburlacu
  • Status changed from reviewing to assigned

comment:8 Changed 19 months ago by bburlacu

r13139: Commit code version using IDataset as an argument for the returned lambda.

comment:9 Changed 19 months ago by bburlacu

r13140: Commit code version using a List instead of double array (makes code slightly simpler since the underlying double array need not be retrieved from the list)

comment:10 Changed 19 months ago by bburlacu

  • Owner changed from bburlacu to mkommend
  • Status changed from assigned to reviewing

r13141: Merged files from trunk and updated project file. Implemented missing operations in the CompiledTreeInterpreter: Integral, Derivative, Lag, TimeLag. Adapted lambda signature to accept an array of List<double> in order to make it easier to work with compiled trees. Changed value parameters to fixed value parameters and adjusted interpreter constructors and after serialization hooks. Removed function symbol.

From the performance point of view, compiling the tree into a lambda accepting a double[][] parameter (an array of arrays for the values of each double variable), accessed with Expression.ArrayIndex is the fastest, but it can be cumbersome to provide the data as a double[][]. Therefore the variant with List<double>[] was chosen. Internally, for each variable node the List's underlying double array is used, result in an overall decent speed compromise.

comment:11 Changed 19 months ago by mkommend

  • Owner changed from mkommend to bburlacu
  • Status changed from reviewing to assigned

Review comments (r13141):

  • TreeInterpreter & LinearInterpreter & ILEmittingInterpreter
    • AfterDeserialization hook contains bugs
    • Formatting has been changed! (placement of curly brackets)
  • CompiledInterpreter
    • ReadOnlyCollectionRetrieveList, ListRetrieveItems, and listGetValue are never used
    • AfterDeserializationHook is not necessary
    • Change compiled tree to use IList<double> instead of List<double>
    • Improve formatting

comment:12 Changed 19 months ago by bburlacu

r13222: Removed unused code, fixed formatting, fixed bug in AfterDeserializationHook.

comment:13 Changed 19 months ago by bburlacu

  • Owner changed from bburlacu to mkommend
  • Status changed from assigned to reviewing

comment:14 Changed 19 months ago by mkommend

r13247: Merged trunk changes into branch for compiled interpreter.

comment:15 Changed 19 months ago by mkommend

r13248: Reintegrated branch for compiled symbolic expression tree interpreter. r13249: Updated mergeinfo for HL.Problems.DataAnalysis.Symbolic. r13250: Removed integrated branch for the compiled symbolic expression tree interpreter.

comment:16 Changed 19 months ago by mkommend

r13251: Adapted locking in the symbolic expression trees interpreters when incrementing the evaluated solutions.

comment:17 Changed 19 months ago by mkommend

r13254: Corrected behavior of the linear interpreter when evaluating the root symbol.

comment:18 Changed 19 months ago by mkommend

r13255: Automatically disabled variable condition symbols if no variable names are present.

comment:19 Changed 19 months ago by mkommend

r13256: Added new unit tests for interpreters.

comment:20 Changed 19 months ago by mkommend

  • Owner changed from mkommend to bburlacu
  • Status changed from reviewing to assigned

comment:21 Changed 19 months ago by bburlacu

r13268: Fix potential out of bounds exception when getting variable values in the SymbolicDataAnalysisExpressionTreeLinearInterpreter

comment:22 Changed 19 months ago by bburlacu

r13288 Removed time lagged symbols as they bring extra complexity in evaluating variables and a big performance penalty. Fixed interpretation of Psi, Power, Root, Square symbols.

comment:23 Changed 19 months ago by bburlacu

  • Owner changed from bburlacu to mkommend
  • Status changed from assigned to reviewing

r13313: Added CreateDelegate method to be able to get the lambda expression corresponding to a symbolic expression tree, performed minor cosmetic enhancements of the SymbolicDataAnalysisExpressionCompiledTreeInterpreter. Updated unit tests, added unit test for testing the correctness of the SymbolicDataAnalysisExpressionCompiledTreeInterpreter which can show exactly which operations cause deviations.

Due to the operations being performed slightly different (not sure exactly how), small deviations can occurr, which can propagate until they become significant. The C# reference specifies a precision of 15-16 digits for the double type (https://msdn.microsoft.com/en-us/library/678hzkk9.aspx).

Example:

Test Name:	TestCompiledInterpreterEvaluationResults
Test Outcome:	Failed
Result Message:	Assert.AreEqual failed. Expected a difference no greater than <1E-10> between expected value <4.18130961429957E+15> and actual value <4.18130961429954E+15>. Interpreters SymbolicDataAnalysisExpressionCompiledTreeInterpreter and SymbolicDataAnalysisExpressionTreeInterpreter do not agree on tree 8 and row 96 (seed = 199483619).
Result StandardOutput:	
199483619
ProgramRootSymbol──StartSymbol──Exponential──Multiplication┬─Cosine──CosineIntegral──6.3361E-002 x11
                                                           └─Square──6.1001E+000

Exponential
	++ Row 96: 4181309614299568.5 4181309614299538.5, Deviation = 30
Multiplication
	== Row 96: 35.96940089723398870091841672547161579132080078125 35.969400897233981595491059124469757080078125, Deviation = 7.105427357601E-15
Cosine
	== Row 96: 0.9666257786402574492257144811446778476238250732421875 0.9666257786402574492257144811446778476238250732421875, Deviation = 0
CosineIntegral
	== Row 96: -6.0241041182568455525370154646225273609161376953125 -6.0241041182568455525370154646225273609161376953125, Deviation = 0
6.3361E-002 x11
	== Row 96: -0.00135857445785335698672235960060561410500667989253997802734375 -0.00135857445785335698672235960060561410500667989253997802734375, Deviation = 0
Square
	== Row 96: 37.21129902808072387188076390884816646575927734375 37.21129902808072387188076390884816646575927734375, Deviation = 0
6.1001E+000
	== Row 96: 6.100106476782247710843876120634377002716064453125 6.100106476782247710843876120634377002716064453125, Deviation = 0

An experiment with the arithmetic grammar, along with exponential, power, and trigonometric symbols showed no deviation between the quality using the compiled tree interpreter and the standard interpreter over 50 runs.

comment:24 Changed 19 months ago by bburlacu

r13314: Update unit test (forgot to uncomment code and remove Debugger.Launch)

comment:25 Changed 18 months ago by bburlacu

r13315: Updated unit tests

comment:26 Changed 18 months ago by mkommend

  • Milestone changed from HeuristicLab 3.3.13 to HeuristicLab 4.0

comment:27 Changed 18 months ago by bburlacu

  • Milestone changed from HeuristicLab 4.0 to HeuristicLab 3.3.13

r13318: Enabled power functions for the grammar in the TestCompiledInterpreterEstimatedValuesConsistency test method.

comment:28 Changed 18 months ago by mkommend

  • Milestone changed from HeuristicLab 3.3.13 to HeuristicLab 4.0

comment:29 Changed 18 months ago by bburlacu

Unit tests showed that in some cases the evaluated tree values between this interpreter and the other two interpreters (SymbolicDataAnalysisExpressionTreeLinearInterpreter and SymbolicDataAnalysisExpressionTreeInterpreter) were different. The reason was that when run in x86 mode, the runtime-generated code for the linq expressions differs from the compiled code in using x86-specific registers (80bit FPU registers, allowing some trucation when the result is put back into 64 bits)

So we have the following scenarios:

  • HL compiled for AnyCpu:
    • Run in x86 mode: unit test fails (consistency of values across interpreters)
    • Run in x64 mode: unit test passes (no problems)
  • HL compiled for x86:
    • Unit test fails, no 2 interpreters are consistent with each other (not all the time)

Solution: make sure to run tests in x64 mode, and maybe give up support for 32 bit since it is likely the results will differ when using different interpreters.

comment:30 Changed 12 months ago by gkronber

What's the status of this ticket? Ready for the next 3.3 release?

comment:31 Changed 8 months ago by gkronber

r14282 must be merged with this ticket (see comments in #2668)

comment:32 Changed 8 months ago by mkommend

  • Milestone changed from HeuristicLab 4.0 to HeuristicLab 3.3.15

comment:33 Changed 2 months ago by bburlacu

r14809: Update SymbolicDataAnalysisExpressionTreeILEmittingInterpreter with explicit conversions to ensure correct results.

comment:34 Changed 2 months ago by bburlacu

r14810: Update interpreter unit test.

comment:35 Changed 2 months ago by mkommend

  • Status changed from reviewing to readytorelease

Reviewed all changesets

comment:37 Changed 7 weeks ago by mkommend

  • Resolution set to done
  • Status changed from readytorelease to closed

comment:38 Changed 12 days ago by mkommend

r14980: Merged r12349 and r12350 into stable.

Note: See TracTickets for help on using tickets.