Opened 8 years ago

Closed 8 years ago

#1569 closed defect (done)

Parallel execution of many algorithms fails (sometimes)

Reported by: cneumuel
Owned by: gkronber
Priority: medium
Milestone: HeuristicLab 3.3.5
Component: Core
Version: 3.3.5
Keywords:
Cc:

Description (last modified by cneumuel)

When many algorithms are executed in parallel, some fail because the ExecutionContext property of Parameters and Operators is null. It is unclear why ExecutionContext can be null (it might be related to #1522).

This issue was found on a 4-core machine.

Attachments (1)

1569.patch (5.5 KB) - added by cneumuel 8 years ago.
Adds SetControl to IStatefulItem to check which object cleans it up.


Change History (10)

comment:1 Changed 8 years ago by cneumuel

  • Description modified (diff)

r6495: Checked in a test case for this issue.

The following output shows the number of objects in the object graph of each genetic algorithm instance. It indicates a problem with the objects reachable from the instances: every algorithm starts with 5920 or 5924 objects, but the count before cleanup varies from 7723 to 42548.

("before cleanup" and "after cleanup" were inserted for debugging purposes in Algorithm.OnStopped, right before and after ClearState is called on the IStatefulItems.)

Alg 1; Objects before execution: 5920
Alg 3; Objects before execution: 5920
Alg 2; Objects before execution: 5920
Alg 0; Objects before execution: 5920
Alg 4; Objects before execution: 5924
Alg 5; Objects before execution: 5924
Alg 6; Objects before execution: 5924
Alg 7; Objects before execution: 5924
Alg 8; Objects before execution: 5924
Alg 9; Objects before execution: 5924
Alg 10; Objects before execution: 5924
Alg 11; Objects before execution: 5924
Alg 12; Objects before execution: 5924
Alg 13; Objects before execution: 5924
Alg 14; Objects before execution: 5924
Alg 15; Objects before execution: 5924
Alg 16; Objects before execution: 5924
Alg 17; Objects before execution: 5924
Alg 18; Objects before execution: 5924
Alg 19; Objects before execution: 5924
Alg 20; Objects before execution: 5924
Alg 21; Objects before execution: 5924
Alg 22; Objects before execution: 5924
Alg 23; Objects before execution: 5924
Alg 24; Objects before execution: 5924
Alg 25; Objects before execution: 5924
Alg 26; Objects before execution: 5924
Alg 27; Objects before execution: 5924
Alg 28; Objects before execution: 5924
Alg 29; Objects before execution: 5924
Alg 30; Objects before execution: 5924
Alg 31; Objects before execution: 5924
Alg 32; Objects before execution: 5924
Alg 33; Objects before execution: 5924
Alg 34; Objects before execution: 5924
Alg 35; Objects before execution: 5924
Alg 36; Objects before execution: 5924
Alg 37; Objects before execution: 5924
Alg 38; Objects before execution: 5924
Alg 39; Objects before execution: 5924
Alg 40; Objects before execution: 5924
Alg 41; Objects before execution: 5924
Alg 42; Objects before execution: 5924
Alg 43; Objects before execution: 5924
Alg 44; Objects before execution: 5924
Alg 45; Objects before execution: 5924
Alg 46; Objects before execution: 5924
Alg 47; Objects before execution: 5924
Alg 48; Objects before execution: 5924
Alg 49; Objects before execution: 5924
Alg 50; Objects before execution: 5924
Alg 51; Objects before execution: 5924
Alg 52; Objects before execution: 5924
Alg 53; Objects before execution: 5924
Alg 54; Objects before execution: 5924
Alg 55; Objects before execution: 5924
Alg 56; Objects before execution: 5924
Alg 57; Objects before execution: 5924
Alg 58; Objects before execution: 5924
Alg 59; Objects before execution: 5924
Alg 58; Objects before cleanup: 7723
Alg 58; Objects after cleanup: 5987
Alg 58; Objects after execution: 6617
Alg 58; ExecutionTime: 00:00:02.0686906 
Alg 52; Objects before cleanup: 7723
Alg 52; Objects after cleanup: 5987
Alg 3; Objects before cleanup: 7723
Alg 52; Objects after execution: 6617
Alg 52; ExecutionTime: 00:00:07.8375751 
Alg 3; Objects after cleanup: 5987
Alg 3; Objects after execution: 6617
Alg 3; ExecutionTime: 00:00:54.3482500 
Alg 12; Objects before cleanup: 7723
Alg 12; Objects after cleanup: 5987
Alg 43; Objects before cleanup: 7743
Alg 16; Objects before cleanup: 7743
Alg 43; Objects after cleanup: 5987
Alg 16; Objects after cleanup: 5987
Alg 12; Objects after execution: 6617
Alg 12; ExecutionTime: 00:00:46.5861780 
Alg 19; Objects before cleanup: 7723
Alg 43; Objects after execution: 6617
Alg 43; ExecutionTime: 00:00:17.1807794 
Alg 16; Objects after execution: 6617
Alg 16; ExecutionTime: 00:00:42.6905119 
Alg 19; Objects after cleanup: 5987
Alg 40; Objects before cleanup: 7743
Alg 36; Objects before cleanup: 7743
Alg 36; Objects after cleanup: 5987
Alg 40; Objects after cleanup: 5987
Alg 36; Objects after execution: 6617
Alg 36; ExecutionTime: 00:00:23.9725462 
Alg 40; Objects after execution: 6617
Alg 40; ExecutionTime: 00:00:20.4900846 
Alg 49; Objects before cleanup: 11568
Alg 19; Objects after execution: 6617
Alg 19; ExecutionTime: 00:00:40.3966295 
Alg 35; Objects before cleanup: 11588
Alg 49; Objects after cleanup: 5991
Alg 35; Objects after cleanup: 5991
Alg 49; Objects after execution: 6621
Alg 49; ExecutionTime: 00:00:11.6556820 
Alg 20; Objects before cleanup: 9196
Alg 9; Objects before cleanup: 9022
Alg 35; Objects after execution: 6621
Alg 35; ExecutionTime: 00:00:25.2416881 
Alg 57; Objects before cleanup: 9128
Alg 30; Objects before cleanup: 9452
Alg 9; Objects after cleanup: 5991
Alg 44; Objects before cleanup: 8592
Alg 30; Objects after cleanup: 5991
Alg 44; Objects after cleanup: 5991
Alg 9; Objects after execution: 6621
Alg 9; ExecutionTime: 00:00:50.4785472 
Alg 30; Objects after execution: 6621
Alg 30; ExecutionTime: 00:00:30.6093741 
Alg 20; Objects after cleanup: 5991
Alg 44; Objects after execution: 6621
Alg 44; ExecutionTime: 00:00:17.1490891 
Alg 57; Objects after cleanup: 5991
Alg 0; Objects before cleanup: 7747
Alg 20; Objects after execution: 6621
Alg 20; ExecutionTime: 00:00:40.2014259 
Alg 57; Objects after execution: 6621
Alg 57; ExecutionTime: 00:00:04.3012417 
Alg 0; Objects after cleanup: 5991
Alg 29; Objects before cleanup: 7747
Alg 31; Objects before cleanup: 7747
Alg 22; Objects before cleanup: 7747
Alg 4; Objects before cleanup: 7727
Alg 29; Objects after cleanup: 5991
Alg 0; Objects after execution: 6621
Alg 0; ExecutionTime: 00:00:55.9005753 
Alg 22; Objects after cleanup: 5991
Alg 4; Objects after cleanup: 5991
Alg 29; Objects after execution: 6621
Alg 29; ExecutionTime: 00:00:32.0662430 
Alg 22; Objects after execution: 6621
Alg 22; ExecutionTime: 00:00:38.6215223 
Alg 23; Objects before cleanup: 7747
Alg 31; Objects after cleanup: 5991
Alg 23; Objects after cleanup: 5991
Alg 4; Objects after execution: 6621
Alg 4; ExecutionTime: 00:00:55.6459584 
Alg 23; Objects after execution: 6621
Alg 23; ExecutionTime: 00:00:38.5701435 
Alg 15; Objects before cleanup: 7747
Alg 31; Objects after execution: 6621
Alg 31; ExecutionTime: 00:00:31.3148336 
Alg 15; Objects after cleanup: 5991
Alg 13; Objects before cleanup: 16522
Alg 51; Objects before cleanup: 8098
Alg 10; Objects before cleanup: 12684
Alg 37; Objects before cleanup: 8118
Alg 14; Objects before cleanup: 12408
Alg 8; Objects before cleanup: 7727
Alg 10; Objects after cleanup: 5991
Alg 37; Objects after cleanup: 5991
Alg 15; Objects after execution: 6621
Alg 15; ExecutionTime: 00:00:47.3591908 
Alg 14; Objects after cleanup: 5991
Alg 10; Objects after execution: 6621
Alg 10; ExecutionTime: 00:00:53.3813388 
Alg 17; Objects before cleanup: 13496
Alg 6; Objects before cleanup: 16809
Alg 8; Objects after cleanup: 5991
Alg 1; Objects before cleanup: 22548
Alg 5; Objects before cleanup: 13598
Alg 1; Objects after cleanup: 5991
Alg 8; Objects after execution: 6621
Alg 8; ExecutionTime: 00:00:55.4761503 
Alg 59; Objects before cleanup: 13606
Alg 5; Objects after cleanup: 5991
Alg 11; Objects before cleanup: 13736
Alg 13; Objects after cleanup: 17975
Alg 32; Objects before cleanup: 38395
Alg 6; Objects after cleanup: 5991
Alg 2; Objects before cleanup: 13954
Alg 59; Objects after cleanup: 5991
Alg 32; Objects after cleanup: 5991
Alg 37; Objects after execution: 6621
Alg 37; ExecutionTime: 00:00:28.0989391 
Alg 18; Objects before cleanup: 12968
Alg 18; Objects after cleanup: 5991
Alg 5; Objects after execution: 6621
Alg 5; ExecutionTime: 00:00:59.4582223 
Alg 11; Objects after cleanup: 5991
Alg 51; Objects after cleanup: 5991
Alg 13; Objects after execution: 17975
Alg 13; ExecutionTime: 00:00:51.7377608 
Alg 6; Objects after execution: 6621
Alg 6; ExecutionTime: 00:00:58.3172097 
Alg 51; Objects after execution: 6621
Alg 51; ExecutionTime: 00:00:15.8311793 
Alg 55; Objects before cleanup: 14215
Alg 27; Objects before cleanup: 14429
Alg 55; Objects after cleanup: 5991
Alg 32; Objects after execution: 6621
Alg 32; ExecutionTime: 00:00:34.2596412 
Alg 48; Objects before cleanup: 22047
Alg 48; Objects after cleanup: 5991
Alg 11; Objects after execution: 6621
Alg 11; ExecutionTime: 00:00:55.3964186 
Alg 17; Objects after cleanup: 5991
Alg 2; Objects after cleanup: 5991
Alg 18; Objects after execution: 6621
Alg 18; ExecutionTime: 00:00:49.2897451 
Alg 2; Objects after execution: 6621
Alg 2; ExecutionTime: 00:01:03.4336448 
Alg 48; Objects after execution: 6621
Alg 48; ExecutionTime: 00:00:20.7766015 
Alg 59; Objects after execution: 6621
Alg 59; ExecutionTime: 00:00:07.8375179 
Alg 27; Objects after cleanup: 5991
Alg 17; Objects after execution: 6621
Alg 17; ExecutionTime: 00:00:50.4518862 
Alg 1; Objects after execution: 12922
Alg 1; ExecutionTime: 00:01:00.1245930 
Alg 46; Objects before cleanup: 15590
Alg 27; Objects after execution: 6621
Alg 27; ExecutionTime: 00:00:41.2345048 
Alg 46; Objects after cleanup: 5991
Alg 46; Objects after execution: 6621
Alg 46; ExecutionTime: 00:00:23.6075889 
Alg 50; Objects before cleanup: 15780
Alg 56; Objects before cleanup: 14142
Alg 42; Objects before cleanup: 15362
Alg 28; Objects before cleanup: 42548
Alg 42; Objects after cleanup: 5991
Alg 21; Objects before cleanup: 15616
Alg 56; Objects after cleanup: 5991
Alg 53; Objects before cleanup: 15652
Alg 56; Objects after execution: 6621
Alg 56; ExecutionTime: 00:00:14.0335432 
Alg 50; Objects after cleanup: 5991
Alg 34; Objects before cleanup: 15517
Alg 50; Objects after execution: 6621
Alg 50; ExecutionTime: 00:00:20.1975716 
Alg 34; Objects after cleanup: 5991
Alg 25; Objects before cleanup: 12298
Alg 53; Objects after cleanup: 5991
Alg 21; Objects after cleanup: 5991
Alg 34; Objects after execution: 6621
Alg 34; ExecutionTime: 00:00:35.8141091 
Alg 25; Objects after cleanup: 5991
Alg 53; Objects after execution: 6621
Alg 53; ExecutionTime: 00:00:17.4529265 
Alg 33; Objects before cleanup: 14389
Alg 25; Objects after execution: 6621
Alg 25; ExecutionTime: 00:00:44.5057761 
Alg 47; Objects before cleanup: 15398
Alg 55; Objects after execution: 6621
Alg 55; ExecutionTime: 00:00:13.3723473 
Alg 14; Objects after execution: 12922
Alg 14; ExecutionTime: 00:00:49.6233374 
Alg 45; Objects before cleanup: 15642
Alg 24; Objects before cleanup: 21406
Alg 47; Objects after cleanup: 5991
Alg 33; Objects after cleanup: 5991
Alg 42; Objects after execution: 6621
Alg 42; ExecutionTime: 00:00:27.9519557 
Alg 28; Objects after cleanup: 5991
Alg 47; Objects after execution: 6621
Alg 47; ExecutionTime: 00:00:23.7910409 
Alg 24; Objects after cleanup: 5991
Alg 28; Objects after execution: 6621
Alg 28; ExecutionTime: 00:00:41.8742630 
Alg 33; Objects after execution: 6621
Alg 33; ExecutionTime: 00:00:37.3226893 
Alg 24; Objects after execution: 6621
Alg 24; ExecutionTime: 00:00:45.8945664 
Alg 45; Objects after cleanup: 5991
Alg 45; Objects after execution: 6621
Alg 45; ExecutionTime: 00:00:26.0076927 
Alg 21; Objects after execution: 6621
Alg 21; ExecutionTime: 00:00:48.4631807 
Alg 38; Objects before cleanup: 18481
Alg 38; Objects after cleanup: 5991
Alg 38; Objects after execution: 6621
Alg 38; ExecutionTime: 00:00:32.8644902 
Alg 7; Objects before cleanup: 12298
Alg 54; Objects before cleanup: 15694
Alg 7; Objects after cleanup: 5991
Alg 54; Objects after cleanup: 5991
Alg 41; Objects before cleanup: 18297
Alg 7; Objects after execution: 6621
Alg 7; ExecutionTime: 00:01:02.6671363 
Alg 54; Objects after execution: 6621
Alg 54; ExecutionTime: 00:00:17.6873038 
Alg 41; Objects after cleanup: 5991
Alg 39; Objects before cleanup: 18730
Alg 26; Objects before cleanup: 18296
Alg 41; Objects after execution: 6621
Alg 41; ExecutionTime: 00:00:30.8185525 
Alg 39; Objects after cleanup: 5991
Alg 26; Objects after cleanup: 5991
Alg 39; Objects after execution: 6621
Alg 39; ExecutionTime: 00:00:32.9168649 
Alg 26; Objects after execution: 6621
Alg 26; ExecutionTime: 00:00:44.9566087
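
Because the lines of the parallel runs are interleaved, the log is easier to read once it is grouped per algorithm. The following is a minimal sketch in Python, not part of HeuristicLab or the test case from r6495; the function names and the `slack` threshold are assumptions chosen for illustration. It flags algorithms whose object count after cleanup did not fall back close to the count before execution, using a few lines copied from the output above as sample input.

```python
import re
from collections import defaultdict

# Matches lines such as "Alg 13; Objects after cleanup: 17975".
LINE = re.compile(r"Alg (\d+); Objects (before execution|before cleanup|"
                  r"after cleanup|after execution): (\d+)")

def parse(log_text):
    """Group the object counts by algorithm index and measurement point."""
    counts = defaultdict(dict)
    for alg, phase, n in LINE.findall(log_text):
        counts[int(alg)][phase] = int(n)
    return counts

def not_cleaned(counts, slack=100):
    """Algorithms whose count after cleanup exceeds the initial count by more than slack.

    The slack value is an assumption; the usual difference in the log is 67 objects.
    """
    return sorted(
        alg for alg, c in counts.items()
        if "after cleanup" in c and "before execution" in c
        and c["after cleanup"] - c["before execution"] > slack
    )

# Lines copied from the log above (Alg 58 and Alg 13 only).
SAMPLE = """\
Alg 58; Objects before execution: 5924
Alg 58; Objects before cleanup: 7723
Alg 58; Objects after cleanup: 5987
Alg 58; Objects after execution: 6617
Alg 13; Objects before execution: 5924
Alg 13; Objects before cleanup: 16522
Alg 13; Objects after cleanup: 17975
Alg 13; Objects after execution: 17975
"""

counts = parse(SAMPLE)
flagged = not_cleaned(counts)
print(flagged)  # [13]
```

On this sample only Alg 13 is flagged: its count grows from 16522 to 17975 during cleanup instead of dropping to about 5991.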

Changed 8 years ago by cneumuel

Adds SetControl to IStatefulItem to check which object cleans it up.

comment:2 Changed 8 years ago by mkommend

  • Owner changed from swagner to mkommend
  • Status changed from new to accepted

comment:3 Changed 8 years ago by mkommend

The cause of this issue is that the backing fields of ThreadLocal<> were reached during the object graph collection. As a result, several algorithms call the ClearState method of the same IStatefulItems, which clears the IExecutionContext of the affected IOperators and IParameters, and a NullReferenceException occurs.
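
The mechanism can be reproduced in a few lines. The following is a language-neutral sketch in Python, not HeuristicLab code: `SharedSlot` is a hypothetical stand-in for ThreadLocal<> whose private field points at storage shared by all instances, and `object_graph` is a simplified assumption of how a reflection-based traversal follows every field. How ThreadLocal<> actually stores its values internally is not described in this ticket.

```python
class Operator:
    def __init__(self, name):
        self.name = name
        self.execution_context = object()  # set while the algorithm runs

    def clear_state(self):
        self.execution_context = None


class SharedSlot:
    """Hypothetical stand-in for ThreadLocal<>: a private field leads to shared storage."""
    _registry = []  # shared by every SharedSlot instance

    def __init__(self, value):
        SharedSlot._registry.append(value)
        self._backing = SharedSlot._registry


class Algorithm:
    def __init__(self, name):
        self.operator = Operator(name)
        self.slot = SharedSlot(self.operator)


def object_graph(root):
    """Naive traversal: follow every instance field and every list element."""
    seen, stack, found = set(), [root], []
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        found.append(obj)
        if isinstance(obj, list):
            stack.extend(obj)
        elif hasattr(obj, "__dict__"):
            stack.extend(vars(obj).values())
    return found


a, b = Algorithm("A"), Algorithm("B")

# The graph of A reaches B's operator through the shared backing storage.
leaked = any(obj is b.operator for obj in object_graph(a))

# Cleaning up A therefore also clears the state of B's operator.
for obj in object_graph(a):
    if isinstance(obj, Operator):
        obj.clear_state()

print(leaked, b.operator.execution_context)  # True None
```

If B were still running at this point, its next access to the execution context would fail, which matches the symptom in the description.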

comment:4 Changed 8 years ago by mkommend

r6500: Added special handling for ThreadLocal<> in Object.GetObjectGraphObjects().
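
The ticket does not say what form the special handling in r6500 takes. One plausible form, sketched here in Python rather than in HeuristicLab's own code, is to stop the traversal from descending into the private fields of the wrapper and to follow only the value it exposes. `SharedSlot` is a hypothetical stand-in for ThreadLocal<>, and `children` and `object_graph` are assumed names.

```python
class Operator:
    def __init__(self, name):
        self.name = name


class SharedSlot:
    """Hypothetical stand-in for ThreadLocal<> with shared backing storage."""
    _registry = []  # shared by every SharedSlot instance

    def __init__(self, value):
        self.value = value
        SharedSlot._registry.append(value)
        self._backing = SharedSlot._registry


class Algorithm:
    def __init__(self, name):
        self.slot = SharedSlot(Operator(name))


def children(obj):
    """Return the objects directly reachable from obj."""
    if isinstance(obj, SharedSlot):
        return [obj.value]  # special case: ignore the backing fields
    if isinstance(obj, (list, tuple)):
        return list(obj)
    if hasattr(obj, "__dict__"):
        return list(vars(obj).values())
    return []


def object_graph(root):
    """Collect every object reachable from root, visiting each one once."""
    seen, stack, found = set(), [root], []
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        found.append(obj)
        stack.extend(children(obj))
    return found


a, b = Algorithm("A"), Algorithm("B")
operators = [o.name for o in object_graph(a) if isinstance(o, Operator)]
print(operators)  # ['A']
```

With the special case in place, the graph of one algorithm no longer contains operators that belong to another, so ClearState only touches the algorithm's own items.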

comment:5 Changed 8 years ago by mkommend

  • Owner changed from mkommend to cneumuel
  • Status changed from accepted to assigned

Please test the committed changes on your quad-core machine to ensure that everything works as expected.

comment:6 Changed 8 years ago by mkommend

  • Owner changed from cneumuel to gkronber
  • Status changed from assigned to reviewing

comment:7 Changed 8 years ago by swagner

  • Milestone changed from HeuristicLab 3.3.x Backlog to HeuristicLab 3.3.5

comment:8 Changed 8 years ago by gkronber

  • Status changed from reviewing to readytorelease

Reviewed r6500. Thanks!

comment:9 Changed 8 years ago by swagner

  • Resolution set to done
  • Status changed from readytorelease to closed
  • Version changed from 3.3.4 to 3.3.5