Opened 21 months ago

Last modified 17 months ago

#2435 accepted enhancement

Update AlgLib to 3.10.0

Reported by: gkronber
Owned by: gkronber
Priority: medium
Milestone: HeuristicLab 4.0
Component: ExtLibs
Version: 3.3.12
Keywords:
Cc:

Description


Change History (15)

comment:1 Changed 21 months ago by gkronber

  • Owner set to gkronber
  • Status changed from new to accepted

comment:2 Changed 21 months ago by gkronber

Script to test alglib changes (including persistence)

// use 'vars' to access variables in the script's variable store (e.g. vars.x = 5)
// use 'vars[string]' to access variables via runtime strings (e.g. vars["x"] = 5)
// use 'vars.Contains(string)' to check if a variable exists
// use 'vars.Clear()' to remove all variables
// use 'foreach (KeyValuePair<string, object> v in vars) { ... }' to iterate over all variables
// use 'variables' to work with IEnumerable<T> extension methods on the script's variable store

using System;
using System.Linq;
using System.Collections.Generic;
using System.Threading;

using HeuristicLab.Core;
using HeuristicLab.Common;
using HeuristicLab.Algorithms.DataAnalysis;
using HeuristicLab.Algorithms.OffspringSelectionGeneticAlgorithm;
using HeuristicLab.Problems.Instances.DataAnalysis;
using HeuristicLab.Optimization;
using HeuristicLab.Algorithms.CMAEvolutionStrategy;
using HeuristicLab.Algorithms.GradientDescent;
using HeuristicLab.Problems.TestFunctions;
using HeuristicLab.Problems.DataAnalysis;
using HeuristicLab.Problems.DataAnalysis.Symbolic;
using HeuristicLab.Problems.DataAnalysis.Symbolic.Regression;
//using HeuristicLab.Collections;
using HeuristicLab.Data;
using HeuristicLab.ParallelEngine;

public class MyScript : HeuristicLab.Scripting.CSharpScriptBase {
  public override void Main() {
    TestRegressionTower();

    TestCMAES();
    TestLmbgfs();
    TestSymbReg();
  }
  
  public void TestRegressionTower() {
    // create tower problem
    var prov = new RegressionRealWorldInstanceProvider();
    var dd = prov.GetDataDescriptors().First(n=>n.Name.Contains("Tower"));
    var problemData = prov.LoadData(dd);
    var regProblem = new RegressionProblem();
    regProblem.Load(problemData);
    
    RunAlg("lr", new LinearRegression(), (IRegressionProblem)regProblem.Clone());
    RunAlg("rf", new RandomForestRegression(), (IRegressionProblem)regProblem.Clone());
       
    var smallProblem = (IRegressionProblem)regProblem.Clone();
    smallProblem.ProblemData.TrainingPartition.End = 300;
    RunAlg("gp", new GaussianProcessRegression(), smallProblem);
    
    RunAlg("nn", new NeuralNetworkRegression(), (IRegressionProblem)smallProblem.Clone());
    RunAlg("nne", new NeuralNetworkEnsembleRegression(), (IRegressionProblem)smallProblem.Clone());
    RunAlg("knn", new NearestNeighbourRegression(), (IRegressionProblem)regProblem.Clone());
  }
 
  
  public void TestCMAES() {   
    var cmaes = new CMAEvolutionStrategy();
    var prob = new SingleObjectiveTestFunctionProblem();
    prob.EvaluatorParameter.Value = new RosenbrockEvaluator();
    prob.ProblemSize = new IntValue(20);
    RunAlg("cmaes", cmaes, prob);
  }
  
  public void TestLmbgfs() {
    var lmbfgs = new LbfgsAlgorithm();
    var prob = new SingleObjectiveTestFunctionProblem();
    prob.EvaluatorParameter.Value = new RosenbrockEvaluator();
    prob.ProblemSize = new IntValue(20);
    lmbfgs.MaxIterations = 1000;
    RunAlg("lbfgs", lmbfgs, prob);
  }
  
  public void TestSymbReg() {
    // create tower problem
    var prov = new RegressionRealWorldInstanceProvider();
    var dd = prov.GetDataDescriptors().First(n=>n.Name.Contains("Tower"));
    var problemData = prov.LoadData(dd);
    var regProblem = new SymbolicRegressionSingleObjectiveProblem();
    regProblem.Load(problemData);
    regProblem.EvaluatorParameter.Value = new SymbolicRegressionConstantOptimizationEvaluator();
    regProblem.MaximumSymbolicExpressionTreeLength.Value = 100;
    regProblem.MaximumSymbolicExpressionTreeDepth.Value = 12;
    
    var gp = new OffspringSelectionGeneticAlgorithm();
    gp.Problem = regProblem;
    gp.MaximumGenerations.Value = 10;
    gp.ComparisonFactorLowerBound.Value = 1.0;
    gp.Mutator = gp.MutatorParameter.ValidValues.First(n => n.Name.Contains("TreeManipulator"));
    gp.Selector = gp.SelectorParameter.ValidValues.First(n => n.Name.Contains("GenderSpecific"));
    
    RunAlg("symbreg", gp, regProblem);
  }
  
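  // Runs 'alg' on 'prob' ten times, blocking until each run has stopped or failed.
  public void RunAlg(string name, IAlgorithm alg, IProblem prob) {
    alg.Problem = prob;
    // use the parallel engine for engine-based algorithms
    var engineAlg = alg as EngineAlgorithm;
    if (engineAlg != null) {
      engineAlg.Engine = new ParallelEngine();
    }
    // keep the algorithm in the script's variable store
    vars[name] = alg;
    using (var wh = new AutoResetEvent(false)) {
      // signal the wait handle when a run ends, either normally or with an exception
      alg.ExceptionOccurred += (object sender, EventArgs<Exception> e) => { wh.Set(); };
      alg.Stopped += (object sender, EventArgs e) => { wh.Set(); };
      for (int i = 0; i < 10; i++) {
        alg.Prepare();
        alg.Start();
        // block until the current run has signalled completion
        wh.WaitOne();
      }
    }
  }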
  // implement further classes and methods

}
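
The wait pattern used in RunAlg can be sketched in isolation as follows. This is only an illustration under stated assumptions: FakeAlgorithm and RepeatedRuns are hypothetical stand-ins and not HeuristicLab types. The sketch also removes the event handler before the wait handle is disposed, which the script above does not do.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for an algorithm whose Start() returns immediately.
// This is not a HeuristicLab type.
public class FakeAlgorithm {
  public event EventHandler Stopped;
  public int Runs { get; private set; }

  // Does the "work" on a thread-pool thread and raises Stopped when done.
  public void Start() {
    Task.Run(() => {
      Runs++;
      var handler = Stopped;
      if (handler != null) handler(this, EventArgs.Empty);
    });
  }
}

public static class RepeatedRuns {
  // Starts the algorithm 'repetitions' times and blocks until each run
  // has signalled completion. Returns the number of completed runs.
  public static int Run(FakeAlgorithm alg, int repetitions) {
    using (var wh = new AutoResetEvent(false)) {
      EventHandler onStopped = (sender, e) => wh.Set();
      alg.Stopped += onStopped;
      try {
        for (int i = 0; i < repetitions; i++) {
          alg.Start();
          wh.WaitOne(); // resets automatically after releasing this thread
        }
      } finally {
        // remove the handler so that it cannot touch the disposed wait handle later
        alg.Stopped -= onStopped;
      }
    }
    return alg.Runs;
  }
}
```

Keeping the handler in a local variable is what makes the later removal possible; an inline lambda cannot be unsubscribed.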

comment:3 Changed 21 months ago by gkronber

r12790: updated alglib to version 3.9.0 (reverse merge)

Last edited 20 months ago by gkronber

comment:4 Changed 21 months ago by gkronber

r12791: updated alglib reference in test project (reverse merge)

Last edited 20 months ago by gkronber

comment:5 Changed 21 months ago by gkronber

r12792: updated some of the alglib calls (we should use only the external API wherever possible) (reverse merge)

Last edited 20 months ago by gkronber

comment:6 Changed 21 months ago by gkronber

r12796: updated alglib references to version 3.9.0 in FLA branch (reverse merge)

Last edited 20 months ago by gkronber

comment:7 Changed 21 months ago by gkronber

r12798: quick fix for new result values in unit tests because of update to alglib 3.9 (reverse merge)

Is a version increment of HeuristicLab.Algorithms.DataAnalysis necessary? It would also allow us to adjust the persistence code.

Last edited 20 months ago by gkronber

comment:8 Changed 20 months ago by gkronber

r12801: changed result value for gaussian process regression unit test because of changes of alglib version (reverse merge)

Last edited 20 months ago by gkronber

comment:9 Changed 20 months ago by gkronber

It seems there will be a new release soon. http://bugs.alglib.net/roadmap_page.php

comment:10 Changed 20 months ago by gkronber

r12817: reverse merge of all trunk changes for updating to alglib version 3.9.0 (r12790:12792, r12798, r12801)

comment:11 Changed 20 months ago by gkronber

r12818: reverse merge of r12796

comment:12 Changed 20 months ago by gkronber

As discussed by the architects, we have to increment the minor version of all dependent plugins if the alglib update leads to a change in behavior.

comment:13 Changed 20 months ago by gkronber

Results of the evaluation of the commercial version of alglib (including the native implementation using Intel MKL):

  1. The native implementation detects at runtime whether it is running on an x86 or x64 system and simply works.
  2. The native implementation is basically a C# wrapper around a compiled binary of the C++ alglib that is statically linked to Intel MKL where useful. Thus even the random forest implementation, which uses neither MKL nor parallelization, should be faster because it falls back to the native version.
  3. In several places we use 'internal' classes of alglib instead of the standard API (e.g. alglib.dforest.dfbuild(...) instead of alglib.dfbuild(...)). This is a problem because the native implementation only provides a C# wrapper for the user-facing API.
    • The code has to be rewritten to use 'official' alglib API calls, especially for persistence (alglib now provides functions for serialization/deserialization of composed types).
    • Our L-BFGS implementation uses the internal API extensively. It has to be completely rewritten to use the external API, and it might not be possible to implement it in the same way as now.
    • We use a patched version of the free alglib in which we marked the PRNG as ThreadStatic. Alglib now has a different way of supporting multi-core parallel processing and thread-safety for the PRNG, so our code has to be adapted accordingly.
  4. The native implementation might produce slightly different results than the managed implementation. The native implementation might also produce different results on different processors (especially x86 vs. x64 systems).
  5. Updating alglib would lead to different algorithm results; therefore we need at least a minor version increment for all plugins that use alglib directly or indirectly.
  6. Incrementing the version would also be useful because we could completely update the persistence format for alglib types.
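
For readers unfamiliar with the ThreadStatic approach mentioned in item 3, a minimal sketch of a per-thread PRNG in plain .NET could look as follows. This is not the patched alglib code and not alglib's newer mechanism; the class PerThreadRandom and its seeding scheme are assumptions made only for illustration.

```csharp
using System;
using System.Threading;

// Hypothetical illustration of a per-thread PRNG; this is not alglib code.
public static class PerThreadRandom {
  // [ThreadStatic] gives every thread its own copy of this field.
  // A field initializer would only run for the first thread,
  // so the instance is created lazily in the getter below.
  [ThreadStatic]
  private static Random rng;

  // Shared counter that hands out a distinct seed to each thread.
  private static int seedCounter = 0;

  private static Random Instance {
    get {
      if (rng == null) {
        rng = new Random(Interlocked.Increment(ref seedCounter));
      }
      return rng;
    }
  }

  // Safe to call concurrently because no Random instance is shared between threads.
  public static double NextDouble() {
    return Instance.NextDouble();
  }
}
```

With this design each thread draws from its own sequence, so results depend on which thread performs which part of the work. That is one reason why parallel runs are not necessarily reproducible.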

comment:14 Changed 19 months ago by gkronber

  • Summary changed from Update AlgLib to 3.9.0 to Update AlgLib to 3.10.0

comment:15 Changed 17 months ago by gkronber

  • Milestone changed from HeuristicLab 3.3.13 to HeuristicLab 4.0.x Backlog