Opened 10 years ago

Closed 10 years ago

#1968 closed enhancement (done)

The number of variables used per tree should be configurable in random forest modeling

Reported by: mkommend Owned by: gkronber
Priority: medium Milestone: HeuristicLab 3.3.8
Component: Algorithms.DataAnalysis Version: 3.3.8
Keywords: Cc:

Description (last modified by mkommend)

Another problem is that the random seed currently cannot be specified, so the results of a random forest modeling run are not reproducible.

Change History (12)

comment:1 Changed 10 years ago by mkommend

  • Description modified (diff)
  • Status changed from new to accepted

comment:2 Changed 10 years ago by mkommend

r8786: Added seed and m parameter to random forest modeling.

comment:3 Changed 10 years ago by mkommend

  • Owner changed from mkommend to gkronber
  • Status changed from accepted to reviewing

Please take a detailed look at the locking of the RNG, which is what makes it possible to specify a seed value, and feel free to rename the M parameter.

comment:4 Changed 10 years ago by mkommend

  • Owner changed from gkronber to mkommend
  • Status changed from reviewing to assigned

comment:5 follow-up: ↓ 6 Changed 10 years ago by mkommend

  • Status changed from assigned to accepted

There is a problem: if multiple random forest regression algorithms run in parallel, the result is not guaranteed to be reproducible.

comment:6 in reply to: ↑ 5 Changed 10 years ago by mkommend

  • Owner changed from mkommend to gkronber
  • Status changed from accepted to assigned

Replying to mkommend:

There is a problem: if multiple random forest regression algorithms run in parallel, the result is not guaranteed to be reproducible.

I had a further look at the problem, and IMHO we either have to add a [ThreadStatic] attribute to the RNG in the ALGLIB sources or remove the seed from the parameter list, which would leave random forest runs non-reproducible.

comment:7 Changed 10 years ago by mkommend

  • Owner changed from gkronber to mkommend

It was decided to change the ALGLIB sources to use [ThreadStatic] for the RNG.
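
The idea behind a per-thread RNG can be sketched as follows. This is only an analogy in Python, not the ALGLIB or HeuristicLab code: `threading.local` stands in for `[ThreadStatic]`, and `seed_rng`, `next_double`, `run_forest`, and `run_parallel` are hypothetical names made up for the illustration. Each thread seeds and draws from its own generator, so runs that execute in parallel cannot interleave their draws and no lock is needed.

```python
import random
import threading

# Hypothetical stand-in for a library-wide RNG; each thread gets its own slot.
_local = threading.local()


def seed_rng(seed):
    """Give the calling thread its own generator, seeded independently."""
    _local.rng = random.Random(seed)


def next_double():
    """Draw from the calling thread's generator only."""
    return _local.rng.random()


def run_forest(seed, n_draws, out, key):
    # Stand-in for a modeling run: it only records the random draws it consumed.
    seed_rng(seed)
    out[key] = [next_double() for _ in range(n_draws)]


def run_parallel(seeds, n_draws=1000):
    """Start one thread per seed and return the draws of each run, in seed order."""
    out = {}
    threads = [
        threading.Thread(target=run_forest, args=(s, n_draws, out, i))
        for i, s in enumerate(seeds)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [out[i] for i in range(len(seeds))]
```

With this layout, two runs started with the same seed consume identical random sequences regardless of what other threads do in the meantime, which is the property the seed parameter needs.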

comment:8 Changed 10 years ago by mkommend

  • Status changed from assigned to accepted

comment:9 Changed 10 years ago by mkommend

  • Owner changed from mkommend to gkronber
  • Status changed from accepted to reviewing

r8803: Added [ThreadStatic] to the RNG of ALGLIB and removed lock from random forest algorithm.

The ALGLIB source file was automatically reformatted according to my local settings. However, the only line actually changed was ap.cs line 494.

comment:10 Changed 10 years ago by mkommend

r8805: Added initialization code for the RNG in the ALGLIB sources.
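
One plausible reason such initialization code is needed, stated here as an assumption rather than as a description of r8805: in C#, the initializer of a `[ThreadStatic]` field runs only once, on the thread that first initializes the type, so every other thread sees an unset field and has to create its RNG on first use. The Python sketch below shows the same pitfall and the usual lazy per-thread initialization; `get_rng` and `fresh_thread_sees_rng` are hypothetical names.

```python
import random
import threading

_local = threading.local()
# Set once at import time; only the importing thread ever sees this value.
_local.rng = random.Random(0)


def get_rng():
    """Return the calling thread's generator, creating it on first use."""
    rng = getattr(_local, "rng", None)
    if rng is None:
        rng = random.Random()
        _local.rng = rng
    return rng


def fresh_thread_sees_rng():
    """Report whether a newly started thread inherits the import-time generator."""
    result = []
    t = threading.Thread(target=lambda: result.append(hasattr(_local, "rng")))
    t.start()
    t.join()
    return result[0]
```

A new thread does not inherit the generator created at import time, so any access has to go through a lazily initializing accessor such as `get_rng`.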

comment:11 Changed 10 years ago by gkronber

  • Status changed from reviewing to readytorelease

Reviewed r8805, r8803, r8786.

comment:12 Changed 10 years ago by swagner

  • Resolution set to done
  • Status changed from readytorelease to closed
  • Version changed from 3.3.7 to 3.3.8