Opened 10 years ago

Closed 10 years ago

#1522 closed enhancement (done)

Improve object graph traversing

Reported by: mkommend Owned by: swagner
Priority: medium Milestone: HeuristicLab 3.3.5
Component: Common Version: 3.3.5
Keywords: Cc:

Description

The traversing of an object graph was implemented to clean up all IStatefulItems when an algorithm is prepared or stopped (see #1424).

This traversing is rather slow especially when large arrays are enumerated (e.g., Datasets).

Additionally structs are not handled correctly because they are skipped due to a check for ValueType.

Change History (20)

comment:1 Changed 10 years ago by mkommend

  • Status changed from new to accepted

comment:2 Changed 10 years ago by mkommend

r6181: Reorganized type checks and added a clause to exclude primitive arrays from being enumerated.

comment:3 follow-up: Changed 10 years ago by mkommend

  • Owner changed from mkommend to cneumuel
  • Status changed from accepted to assigned

The include structs in the traversing of the object graph, value types must be allowed for traversing. As this leads to an StackoverflowException I didn't change this behavior. It would be good if you could have a look at this issue.

comment:4 follow-up: Changed 10 years ago by cneumuel

  • Cc swagner added

Some further comments (open for discussion):

  • The type should be checked for IsArray rather than for IEnumerable. By definition the implementation of IEnumerable could result an infinate number of elements.
  • The check fot the ElementType should only be made if it is an array.
  • String values should also be collected, as discussed with Stefan Wagner (this was changed in r6181)
  • A possible performance optimization: The datatype of the objects could be given. In this case we could skip collecting strings or going through arrays with strings and primitive datatypes.

comment:5 Changed 10 years ago by swagner

  • Cc swagner removed

comment:6 in reply to: ↑ 3 Changed 10 years ago by swagner

Replying to mkommend:

The include structs in the traversing of the object graph, value types must be allowed for traversing. As this leads to an StackoverflowException I didn't change this behavior. It would be good if you could have a look at this issue.

To avoid the StackOverflowException it is sufficient to exclude the type System.Reflection.Pointer. However, I am not quite sure about any possible side effects. We have to have a detailed look at this.

comment:7 in reply to: ↑ 4 Changed 10 years ago by swagner

Replying to cneumuel:

  • The type should be checked for IsArray rather than for IEnumerable. By definition the implementation of IEnumerable could result an infinate number of elements.

I experimented with checking for arrays and not for IEnumerables right before the 3.3.4 release. However, this change reduced the performance drastically and resulted in fewer collected objects (quite strange). We have to analyze this in more detail.

comment:8 Changed 10 years ago by mkommend

In a discussion with swagner it was decided to exclude primitive types (int,short,string,char,...) from the returned objects, as they do not include any semantics and are immutable.

comment:9 Changed 10 years ago by cneumuel

  • The reason IsArray returns less objects than checking for IEnumerable is that it's quite different if an Enumerator of an IEnumerable is used rather than going deeper down into the object until an array is reached. An example is the Dictionary. dict is IEnumerable returns true, resulting in enumerating that dictionary. The implementation of the enumerator reveals that KeyValuePair<TKey, TValue> objects are created within the enumerator. If IsArray is used, dict.IsArray returns false, resulting in a further recursion into the dictionary object. In there an array of Entry<TKey, TValue>[] is found, which is then used. So in this example all the KeyValuePair objects are missing when IsArray is used (which is good!).
  • I have not noticed a big performance difference between IsArray and is IEnumerable, so in my opinion IsArray would be the way to go. I could reproduce the performance difference, but I could mostly fix it by handling dictionaries in a special way.
Last edited 10 years ago by cneumuel (previous) (diff)

comment:10 Changed 10 years ago by cneumuel

r6192

  • included ValueTypes (which includes structs and enums)
  • avoided stack overflow by excluding Type
  • improved performance by handling dictionaries in a special way
  • avoided stack overflow by excluding ThreadLocal<> from (occured after many repetitions, when the field the field ConcurrentStack<int> s_availableIndices grew larger
  • using IsArray instead of is IEnumerable

mkommend: Please test if performance is acceptable with your datasets.

comment:11 Changed 10 years ago by cneumuel

  • Owner changed from cneumuel to mkommend
  • Status changed from assigned to reviewing

comment:12 Changed 10 years ago by mkommend

r6201: Updated object graph traversing.

comment:13 Changed 10 years ago by mkommend

  • Owner changed from mkommend to cneumuel
  • Status changed from reviewing to assigned

Please check if further simplifications and unifications can be made without affecting the results and the performance (e.g., check for delegates).

comment:14 Changed 10 years ago by cneumuel

  • Owner changed from cneumuel to mkommend
  • Status changed from assigned to reviewing

r6205

  • simplified check for EventHandler (by simply checking for delegate)
  • special handling types which use a hashtable internally (dictionaries, hashset, hashtable, ...) (as discussed with swagner)
  • added testcase which shows a stack overflow if ThreadLocal is included and compiler is on Debug (it does not happen in Release)

comment:15 Changed 10 years ago by cneumuel

r6207, r6208: fixed missing files. r6209: added comment for test method

comment:16 Changed 10 years ago by mkommend

  • Owner changed from mkommend to cneumuel

r6380: Replaced recursive object graph traversing with a linear solution using a stack to avoid stack overflows.

comment:17 Changed 10 years ago by cneumuel

  • Owner changed from cneumuel to swagner
  • Status changed from reviewing to readytorelease

comment:18 Changed 10 years ago by mkommend

  • Owner changed from swagner to mkommend
  • Status changed from readytorelease to reviewing

comment:19 Changed 10 years ago by mkommend

  • Owner changed from mkommend to swagner
  • Status changed from reviewing to readytorelease

comment:20 Changed 10 years ago by swagner

  • Resolution set to done
  • Status changed from readytorelease to closed
  • Version changed from 3.3.4 to 3.3.5
Note: See TracTickets for help on using tickets.