#2442 closed enhancement (done)
Linq Expression Trees Interpreter for HeuristicLab
Reported by: | bburlacu | Owned by: | mkommend |
---|---|---|---|
Priority: | medium | Milestone: | HeuristicLab 3.3.15 |
Component: | Problems.DataAnalysis.Symbolic | Version: | 3.3.12 |
Keywords: | Cc: |
Description
The same functionality provided by the IL-emitting tree interpreter can also be achieved using linq expression trees, with the added benefit of more flexibility and a slight performance improvement.
This branch holds the implementation for the linq expression trees interpreter, with an additional symbol added which allows user-defined functions - along the lines of what was requested in #2437.
Attachments (1)
Change History (39)
comment:1 Changed 9 years ago by bburlacu
- Milestone changed from HeuristicLab 3.3.13 to HeuristicLab 3.3.x Backlog
- Status changed from new to accepted
Changed 9 years ago by bburlacu
comment:2 Changed 9 years ago by bburlacu
r12807: Initial commmit (see the attached file for an example).
comment:3 Changed 9 years ago by bburlacu
r13039: Very minor changes in the naming of things. Removed readonly attributes for the min and max arity variables.
comment:4 Changed 9 years ago by bburlacu
- Owner changed from bburlacu to mkommend
- Status changed from accepted to reviewing
comment:5 Changed 9 years ago by mkommend
- Milestone changed from HeuristicLab 3.3.x Backlog to HeuristicLab 3.3.13
comment:6 Changed 9 years ago by mkommend
Review comments:
- ClearState should not reset the evaluated solutions counter
- Both parameters should be of type FixedValueParam<>
- Improve handling of variables
- Why does the interpreter has to retrieve Dataset.GetReadonlyDoubleValues.List.array?
- Don't copy the values in another array
- Just use the indexer of the ReadonlyCollection
- Can't the IDataset be reused instead of double[][]? This would ease the use of the interpeter and removes copying the values.
- Please add performance unit tests (similar the existing ones)
- Add another unit tests, which interprets the same tree with all available interpreters and compares the results. The trees should contain all available symbols and be randomly created. If a mismatch is found the used seed, tree index and tree should be additionally output.
comment:7 Changed 9 years ago by mkommend
- Owner changed from mkommend to bburlacu
- Status changed from reviewing to assigned
comment:8 Changed 9 years ago by bburlacu
r13139: Commit code version using IDataset as an argument for the returned lambda.
comment:9 Changed 9 years ago by bburlacu
r13140: Commit code version using a List instead of double array (makes code slightly simpler since the underlying double array need not be retrieved from the list)
comment:10 Changed 9 years ago by bburlacu
- Owner changed from bburlacu to mkommend
- Status changed from assigned to reviewing
r13141: Merged files from trunk and updated project file. Implemented missing operations in the CompiledTreeInterpreter: Integral, Derivative, Lag, TimeLag. Adapted lambda signature to accept an array of List<double> in order to make it easier to work with compiled trees. Changed value parameters to fixed value parameters and adjusted interpreter constructors and after serialization hooks. Removed function symbol.
From the performance point of view, compiling the tree into a lambda accepting a double[][] parameter (an array of arrays for the values of each double variable), accessed with Expression.ArrayIndex is the fastest, but it can be cumbersome to provide the data as a double[][]. Therefore the variant with List<double>[] was chosen. Internally, for each variable node the List's underlying double array is used, result in an overall decent speed compromise.
comment:11 Changed 9 years ago by mkommend
- Owner changed from mkommend to bburlacu
- Status changed from reviewing to assigned
Review comments (r13141):
- TreeInterpreter & LinearInterpreter & ILEmittingInterpreter
- AfterDeserialization hook contains bugs
- Formatting has been changed! (placement of curly brackets)
- CompiledInterpreter
- ReadOnlyCollectionRetrieveList, ListRetrieveItems, and listGetValue are never used
- AfterDeserializationHook is not necessary
- Change compiled tree to use IList<double> instead of List<double>
- Improve formatting
comment:12 Changed 9 years ago by bburlacu
r13222: Removed unused code, fixed formatting, fixed bug in AfterDeserializationHook.
comment:13 Changed 9 years ago by bburlacu
- Owner changed from bburlacu to mkommend
- Status changed from assigned to reviewing
comment:14 Changed 9 years ago by mkommend
r13247: Merged trunk changes into branch for compiled interpreter.
comment:15 Changed 9 years ago by mkommend
comment:16 Changed 9 years ago by mkommend
r13251: Adapted locking in the symbolic expression trees interpreters when incrementing the evaluated solutions.
comment:17 Changed 9 years ago by mkommend
r13254: Corrected behavior of the linear interpreter when evaluating the root symbol.
comment:18 Changed 9 years ago by mkommend
r13255: Automatically disabled variable condition symbols if no variable names are present.
comment:19 Changed 9 years ago by mkommend
r13256: Added new unit tests for interpreters.
comment:20 Changed 9 years ago by mkommend
- Owner changed from mkommend to bburlacu
- Status changed from reviewing to assigned
comment:21 Changed 9 years ago by bburlacu
r13268: Fix potential out of bounds exception when getting variable values in the SymbolicDataAnalysisExpressionTreeLinearInterpreter
comment:22 Changed 9 years ago by bburlacu
r13288 Removed time lagged symbols as they bring extra complexity in evaluating variables and a big performance penalty. Fixed interpretation of Psi, Power, Root, Square symbols.
comment:23 Changed 9 years ago by bburlacu
- Owner changed from bburlacu to mkommend
- Status changed from assigned to reviewing
r13313: Added CreateDelegate method to be able to get the lambda expression corresponding to a symbolic expression tree, performed minor cosmetic enhancements of the SymbolicDataAnalysisExpressionCompiledTreeInterpreter. Updated unit tests, added unit test for testing the correctness of the SymbolicDataAnalysisExpressionCompiledTreeInterpreter which can show exactly which operations cause deviations.
Due to the operations being performed slightly different (not sure exactly how), small deviations can occurr, which can propagate until they become significant. The C# reference specifies a precision of 15-16 digits for the double type (https://msdn.microsoft.com/en-us/library/678hzkk9.aspx).
Example:
Test Name: TestCompiledInterpreterEvaluationResults Test Outcome: Failed Result Message: Assert.AreEqual failed. Expected a difference no greater than <1E-10> between expected value <4.18130961429957E+15> and actual value <4.18130961429954E+15>. Interpreters SymbolicDataAnalysisExpressionCompiledTreeInterpreter and SymbolicDataAnalysisExpressionTreeInterpreter do not agree on tree 8 and row 96 (seed = 199483619). Result StandardOutput: 199483619 ProgramRootSymbol──StartSymbol──Exponential──Multiplication┬─Cosine──CosineIntegral──6.3361E-002 x11 └─Square──6.1001E+000 Exponential ++ Row 96: 4181309614299568.5 4181309614299538.5, Deviation = 30 Multiplication == Row 96: 35.96940089723398870091841672547161579132080078125 35.969400897233981595491059124469757080078125, Deviation = 7.105427357601E-15 Cosine == Row 96: 0.9666257786402574492257144811446778476238250732421875 0.9666257786402574492257144811446778476238250732421875, Deviation = 0 CosineIntegral == Row 96: -6.0241041182568455525370154646225273609161376953125 -6.0241041182568455525370154646225273609161376953125, Deviation = 0 6.3361E-002 x11 == Row 96: -0.00135857445785335698672235960060561410500667989253997802734375 -0.00135857445785335698672235960060561410500667989253997802734375, Deviation = 0 Square == Row 96: 37.21129902808072387188076390884816646575927734375 37.21129902808072387188076390884816646575927734375, Deviation = 0 6.1001E+000 == Row 96: 6.100106476782247710843876120634377002716064453125 6.100106476782247710843876120634377002716064453125, Deviation = 0
An experiment with the arithmetic grammar, along with exponential, power, and trigonometric symbols showed no deviation between the quality using the compiled tree interpreter and the standard interpreter over 50 runs.
comment:24 Changed 9 years ago by bburlacu
r13314: Update unit test (forgot to uncomment code and remove Debugger.Launch)
comment:25 Changed 9 years ago by bburlacu
r13315: Updated unit tests
comment:26 Changed 9 years ago by mkommend
- Milestone changed from HeuristicLab 3.3.13 to HeuristicLab 4.0
comment:27 Changed 9 years ago by bburlacu
- Milestone changed from HeuristicLab 4.0 to HeuristicLab 3.3.13
r13318: Enabled power functions for the grammar in the TestCompiledInterpreterEstimatedValuesConsistency test method.
comment:28 Changed 9 years ago by mkommend
- Milestone changed from HeuristicLab 3.3.13 to HeuristicLab 4.0
comment:29 Changed 9 years ago by bburlacu
Unit tests showed that in some cases the evaluated tree values between this interpreter and the other two interpreters (SymbolicDataAnalysisExpressionTreeLinearInterpreter and SymbolicDataAnalysisExpressionTreeInterpreter) were different. The reason was that when run in x86 mode, the runtime-generated code for the linq expressions differs from the compiled code in using x86-specific registers (80bit FPU registers, allowing some trucation when the result is put back into 64 bits)
So we have the following scenarios:
- HL compiled for AnyCpu:
- Run in x86 mode: unit test fails (consistency of values across interpreters)
- Run in x64 mode: unit test passes (no problems)
- HL compiled for x86:
- Unit test fails, no 2 interpreters are consistent with each other (not all the time)
Solution: make sure to run tests in x64 mode, and maybe give up support for 32 bit since it is likely the results will differ when using different interpreters.
comment:30 Changed 8 years ago by gkronber
What's the status of this ticket? Ready for the next 3.3 release?
comment:31 Changed 8 years ago by gkronber
comment:32 Changed 8 years ago by mkommend
- Milestone changed from HeuristicLab 4.0 to HeuristicLab 3.3.15
comment:33 Changed 8 years ago by bburlacu
r14809: Update SymbolicDataAnalysisExpressionTreeILEmittingInterpreter with explicit conversions to ensure correct results.
comment:34 Changed 8 years ago by bburlacu
r14810: Update interpreter unit test.
comment:35 Changed 8 years ago by mkommend
- Status changed from reviewing to readytorelease
Reviewed all changesets
comment:36 Changed 8 years ago by mkommend
comment:37 Changed 8 years ago by mkommend
- Resolution set to done
- Status changed from readytorelease to closed
Example of a programmable problem using the new symbol and interpreter.