Opened 2 years ago

Closed 20 months ago

#2417 closed enhancement (rejected)

Extend interface for IDataset to allow access by column index

Reported by: gkronber
Owned by: gkronber
Priority: medium
Milestone: HeuristicLab 3.3.14
Component: Problems.DataAnalysis
Version: 3.3.11
Keywords:
Cc:

Description

Currently, values in the dataset can only be accessed via variable names. As a result, every call to GetDoubleValues() or GetDoubleValue() requires a dictionary lookup. Accessing the value through the variable (or column) index would be faster.

This is especially problematic when GetDoubleValue() must be called repeatedly, because each call repeats the lookup.

Change History (5)

comment:1 Changed 23 months ago by gkronber

  • Owner set to gkronber
  • Status changed from new to assigned

comment:2 Changed 21 months ago by gkronber

  • Milestone changed from HeuristicLab 3.3.13 to HeuristicLab 4.0.x Backlog

comment:3 follow-up: ↓ 4 Changed 20 months ago by gkronber

  • Milestone changed from HeuristicLab 4.0 to HeuristicLab 3.3.14

I believe this is already accomplished with the changes for modifiable datasets?

comment:4 in reply to: ↑ 3 ; follow-up: ↓ 5 Changed 20 months ago by mkommend

Replying to gkronber:

I believe this is already accomplished with the changes for modifiable datasets?

Access to Dataset values by row and column is implemented for Dataset and ModifiableDataset, but only as part of the IStringConvertibleMatrix implementation, so it returns a string.

The implemented method is not very efficient for batch access, because every call first resolves the variable name for the column and then looks it up in the dictionary.

    string IStringConvertibleMatrix.GetValue(int rowIndex, int columnIndex) {
      return variableValues[variableNames[columnIndex]][rowIndex].ToString();
    }

If efficient access to values by row and column is necessary, the way Dataset organizes and stores values internally has to be rewritten. A possible solution would be to store the value lists in an indexed collection (List or array) and keep the dictionary synchronized with it.
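A minimal sketch of that idea, for illustration only: the class IndexedColumns and all of its members are hypothetical and are not part of HeuristicLab. It assumes the columns are kept in a list and the dictionary maps each variable name to its position in that list, so the two structures only need to be updated together when a column is added.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch, not HeuristicLab's Dataset: the columns live in a list,
// and the dictionary only maps a variable name to its column index.
public sealed class IndexedColumns {
  private readonly List<double[]> columns = new List<double[]>();
  private readonly Dictionary<string, int> indexByName = new Dictionary<string, int>();

  public void AddColumn(string name, double[] values) {
    if (indexByName.ContainsKey(name))
      throw new ArgumentException("Duplicate variable name: " + name);
    // Both structures are updated together so that they stay synchronized.
    indexByName.Add(name, columns.Count);
    columns.Add(values);
  }

  // Resolve a name to its column index once, e.g. before a loop.
  public int GetColumnIndex(string name) {
    return indexByName[name];
  }

  // Access by name: one dictionary lookup, then an indexed read.
  public double GetDoubleValue(string name, int rowIndex) {
    return columns[indexByName[name]][rowIndex];
  }

  // Access by column index: no dictionary lookup.
  public double GetDoubleValue(int columnIndex, int rowIndex) {
    return columns[columnIndex][rowIndex];
  }
}
```

With such a layout a caller could resolve the column index once via GetColumnIndex() and use the index-based overload inside a loop. Removing or reordering columns would require updating the stored indices as well, which this sketch does not cover.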

However, I don't see the point of the ticket. For performance reasons, one can always retrieve the whole read-only collection once and store it locally.
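To illustrate the difference, here is a small sketch under stated assumptions: INamedColumns and DictionaryColumns are hypothetical stand-ins for the dataset, and the member GetReadOnlyDoubleValues() is assumed here for the purpose of the example. SumPerCall() performs one dictionary lookup per row, while SumCached() performs a single lookup and then reads from the locally stored collection.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the dataset; only the two members used below.
public interface INamedColumns {
  double GetDoubleValue(string variableName, int rowIndex);
  IList<double> GetReadOnlyDoubleValues(string variableName);
}

// Simple dictionary-backed implementation, for illustration only.
public sealed class DictionaryColumns : INamedColumns {
  private readonly Dictionary<string, List<double>> variableValues =
    new Dictionary<string, List<double>>();

  public void Add(string variableName, IEnumerable<double> values) {
    variableValues[variableName] = new List<double>(values);
  }

  public double GetDoubleValue(string variableName, int rowIndex) {
    return variableValues[variableName][rowIndex];
  }

  public IList<double> GetReadOnlyDoubleValues(string variableName) {
    return variableValues[variableName].AsReadOnly();
  }
}

public static class ColumnSum {
  // One dictionary lookup per row.
  public static double SumPerCall(INamedColumns data, string name, int rows) {
    double sum = 0.0;
    for (int r = 0; r < rows; r++) sum += data.GetDoubleValue(name, r);
    return sum;
  }

  // One dictionary lookup in total; the loop reads the cached collection.
  public static double SumCached(INamedColumns data, string name) {
    IList<double> column = data.GetReadOnlyDoubleValues(name);
    double sum = 0.0;
    for (int r = 0; r < column.Count; r++) sum += column[r];
    return sum;
  }
}
```

Both methods return the same result; they differ only in the number of dictionary lookups performed.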

IMHO this ticket should be closed as obsolete unless further reasons for column-index-based access can be given.

comment:5 in reply to: ↑ 4 Changed 20 months ago by gkronber

  • Resolution set to rejected
  • Status changed from assigned to closed

Replying to mkommend:

Replying to gkronber:

I believe this is already accomplished with the changes for modifiable datasets?

IMHO this ticket should be closed as obsolete unless further reasons for column-index-based access can be given.

OK, we just need to be careful about how we use GetDoubleValue(string name, int rowIndex).