org.knime.base.node.preproc.groupby
Class GroupByTable

java.lang.Object
  extended by org.knime.base.node.preproc.groupby.GroupByTable
Direct Known Subclasses:
BigGroupByTable, MemoryGroupByTable

public abstract class GroupByTable
extends Object

Author:
Tobias Koetter, University of Konstanz

Constructor Summary
protected GroupByTable(ExecutionContext exec, BufferedDataTable inDataTable, List<String> groupByCols, ColumnAggregator[] colAggregators, GlobalSettings globalSettings, boolean sortInMemory, boolean enableHilite, ColumnNamePolicy colNamePolicy, boolean retainOrder)
          Constructor for class GroupByTable.
 
Method Summary
protected  void addHiliteMapping(RowKey newKey, Set<RowKey> oldKeys)
           
protected  void addSkippedGroup(String colName, String skipMsg, DataCell[] groupVals)
           
static BufferedDataTable appendOrderColumn(ExecutionContext exec, BufferedDataTable dataTable, Set<String> workingCols, String retainOrderCol)
           
static void checkGroupCols(DataTableSpec spec, List<String> groupCols)
           
protected abstract  BufferedDataTable createGroupByTable(ExecutionContext exec, BufferedDataTable dataTable, DataTableSpec resultSpec, int[] groupColIdx)
           
static DataTableSpec createGroupByTableSpec(DataTableSpec spec, List<String> groupColNames, ColumnAggregator[] columnAggregators, ColumnNamePolicy colNamePolicy)
           
static String createSkippedGroupName(DataCell[] groupVals)
           
 BufferedDataTable getBufferedTable()
           
 ColumnAggregator[] getColAggregators()
           
 GlobalSettings getGlobalSettings()
           
 List<String> getGroupCols()
           
 Map<RowKey,Set<RowKey>> getHiliteMapping()
          Returns the hilite translation Map, or null if the enableHilite flag in the constructor was set to false.
 Map<String,Collection<Pair<String,String>>> getSkippedGroupsByColName()
          Returns a Map with all skipped groups.
 String getSkippedGroupsMessage(int maxGroups, int maxCols)
           
 boolean isEnableHilite()
           
 boolean isRetainOrder()
           
 boolean isSortInMemory()
           
static BufferedDataTable sortTable(ExecutionContext exec, BufferedDataTable table2sort, List<String> sortCols, boolean sortInMemory)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

GroupByTable

protected GroupByTable(ExecutionContext exec,
                       BufferedDataTable inDataTable,
                       List<String> groupByCols,
                       ColumnAggregator[] colAggregators,
                       GlobalSettings globalSettings,
                       boolean sortInMemory,
                       boolean enableHilite,
                       ColumnNamePolicy colNamePolicy,
                       boolean retainOrder)
                throws CanceledExecutionException
Constructor for class GroupByTable.

Parameters:
exec - the ExecutionContext
inDataTable - the table to aggregate
groupByCols - the names of all columns to group by
colAggregators - the aggregation columns with the aggregation method to use, in the order in which the columns should appear in the result table
globalSettings - the global settings
sortInMemory - true if the table should be sorted in memory
enableHilite - true if a row key map should be maintained to enable hiliting
colNamePolicy - the ColumnNamePolicy for the aggregation columns
retainOrder - true if the original row order should be retained
Throws:
CanceledExecutionException - if the user has canceled the execution
Method Detail

createGroupByTable

protected abstract BufferedDataTable createGroupByTable(ExecutionContext exec,
                                                        BufferedDataTable dataTable,
                                                        DataTableSpec resultSpec,
                                                        int[] groupColIdx)
                                                 throws CanceledExecutionException
Parameters:
exec - the ExecutionContext
dataTable - the data table to aggregate
resultSpec - the result DataTableSpec
groupColIdx - the group column indices
Returns:
the aggregated input table
Throws:
CanceledExecutionException - if the operation has been canceled
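
The role of groupColIdx can be illustrated with a minimal, self-contained sketch of in-memory grouping. It does not use KNIME classes: rows are plain Object[] arrays, and GroupBySketch, countPerGroup, and the row-count aggregation are hypothetical stand-ins chosen for illustration, not the implementation found in MemoryGroupByTable or BigGroupByTable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; not part of the KNIME API.
final class GroupBySketch {

    /**
     * Groups rows by the values found at the given column indices and
     * counts the rows per group. Groups are kept in first-seen order.
     */
    static Map<List<Object>, Integer> countPerGroup(List<Object[]> rows, int[] groupColIdx) {
        Map<List<Object>, Integer> counts = new LinkedHashMap<>();
        for (Object[] row : rows) {
            // The group key consists of the group column values only.
            List<Object> key = new ArrayList<>(groupColIdx.length);
            for (int idx : groupColIdx) {
                key.add(row[idx]);
            }
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Object[]> rows = Arrays.asList(
            new Object[] {"A", "x", 1},
            new Object[] {"B", "y", 2},
            new Object[] {"A", "x", 3});
        // Group by the first two columns and print the row count per group.
        System.out.println(countPerGroup(rows, new int[] {0, 1}));
    }
}
```

A real subclass would additionally report progress and check for cancellation through the ExecutionContext, which this sketch omits.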

getGroupCols

public List<String> getGroupCols()
Returns:
the columns to group by

getGlobalSettings

public GlobalSettings getGlobalSettings()
Returns:
the global settings

isEnableHilite

public boolean isEnableHilite()
Returns:
true if a hilite mapping should be maintained

isSortInMemory

public boolean isSortInMemory()
Returns:
true if sorting should be performed in memory

isRetainOrder

public boolean isRetainOrder()
Returns:
true if the input table order should be retained

getColAggregators

public ColumnAggregator[] getColAggregators()
Returns:
the column aggregators

appendOrderColumn

public static BufferedDataTable appendOrderColumn(ExecutionContext exec,
                                                  BufferedDataTable dataTable,
                                                  Set<String> workingCols,
                                                  String retainOrderCol)
                                           throws CanceledExecutionException
Parameters:
exec - the ExecutionContext
dataTable - the BufferedDataTable to add the order column to
workingCols - the names of all columns needed for grouping
retainOrderCol - the name of the order column
Returns:
the given table with the appended order column
Throws:
CanceledExecutionException - if the operation has been canceled
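
The idea of an order column can be sketched without KNIME classes as follows. This is an assumption-laden illustration: rows are plain Object[] arrays, OrderColumnSketch and appendOrder are hypothetical names, and the workingCols filtering is not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration; not part of the KNIME API.
final class OrderColumnSketch {

    /**
     * Returns copies of the rows with the original row position
     * appended as an additional last cell.
     */
    static List<Object[]> appendOrder(List<Object[]> rows) {
        List<Object[]> result = new ArrayList<>(rows.size());
        int position = 0;
        for (Object[] row : rows) {
            // Copy the row into an array that has room for one more cell.
            Object[] extended = Arrays.copyOf(row, row.length + 1);
            extended[row.length] = position++;
            result.add(extended);
        }
        return result;
    }
}
```

With such a column in place, the original row order can be restored after grouping by sorting on the appended position.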

sortTable

public static BufferedDataTable sortTable(ExecutionContext exec,
                                          BufferedDataTable table2sort,
                                          List<String> sortCols,
                                          boolean sortInMemory)
                                   throws CanceledExecutionException
Parameters:
exec - the ExecutionContext
table2sort - the BufferedDataTable to sort
sortCols - the columns to sort by
sortInMemory - true if the table should be sorted in memory
Returns:
the sorted BufferedDataTable
Throws:
CanceledExecutionException - if the operation has been canceled
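
Sorting by a list of column names can be sketched as below. The sketch uses plain String[] rows instead of a BufferedDataTable, SortSketch and sortBy are hypothetical names, and the sortInMemory flag is not modeled because everything here is held in memory anyway.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for illustration; not part of the KNIME API.
final class SortSketch {

    /**
     * Returns a sorted copy of the rows. Rows are compared on the named
     * columns, in the order in which the names are listed in sortCols.
     */
    static List<String[]> sortBy(List<String> colNames, List<String[]> rows, List<String> sortCols) {
        // Start with a comparator that treats all rows as equal.
        Comparator<String[]> comparator = (a, b) -> 0;
        for (String col : sortCols) {
            int idx = colNames.indexOf(col);
            if (idx < 0) {
                throw new IllegalArgumentException("Unknown sort column: " + col);
            }
            comparator = comparator.thenComparing(
                Comparator.comparing((String[] row) -> row[idx]));
        }
        List<String[]> sorted = new ArrayList<>(rows);
        sorted.sort(comparator); // List.sort is stable, so ties keep input order
        return sorted;
    }
}
```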

createSkippedGroupName

public static String createSkippedGroupName(DataCell[] groupVals)
Parameters:
groupVals - the group values of the skipped group
Returns:
the group name

addHiliteMapping

protected void addHiliteMapping(RowKey newKey,
                                Set<RowKey> oldKeys)
Parameters:
newKey - the new RowKey
oldKeys - all old RowKeys

addSkippedGroup

protected void addSkippedGroup(String colName,
                               String skipMsg,
                               DataCell[] groupVals)
Parameters:
colName - the name of the column
skipMsg - the skip message to display
groupVals - the skipped group values

createGroupByTableSpec

public static final DataTableSpec createGroupByTableSpec(DataTableSpec spec,
                                                         List<String> groupColNames,
                                                         ColumnAggregator[] columnAggregators,
                                                         ColumnNamePolicy colNamePolicy)
Parameters:
spec - the original DataTableSpec
groupColNames - the names of all columns to group by
columnAggregators - the aggregation columns with the aggregation method to use, in the order in which the columns should appear in the result table
colNamePolicy - the ColumnNamePolicy for the aggregation columns
Returns:
the result DataTableSpec
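
How a result spec might be assembled from group columns and aggregation columns can be sketched as follows. Everything here is an assumption made for illustration: the sketch only derives column names, it places the group columns first, and the three naming policies in the hypothetical NamePolicy enum are stand-ins rather than the actual ColumnNamePolicy values. Name uniqueness is not enforced.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; not part of the KNIME API.
final class SpecSketch {

    // Assumed naming policies; the real ColumnNamePolicy may differ.
    enum NamePolicy { KEEP_ORIGINAL, METHOD_THEN_COLUMN, COLUMN_THEN_METHOD }

    /**
     * Builds the result column names: group columns first, followed by one
     * column per aggregation, where each aggregation is {column, method}.
     */
    static List<String> resultColumnNames(List<String> groupCols,
                                          List<String[]> aggregations,
                                          NamePolicy policy) {
        List<String> names = new ArrayList<>(groupCols);
        for (String[] agg : aggregations) {
            String column = agg[0];
            String method = agg[1];
            switch (policy) {
                case METHOD_THEN_COLUMN:
                    names.add(method + "(" + column + ")");
                    break;
                case COLUMN_THEN_METHOD:
                    names.add(column + " (" + method + ")");
                    break;
                default:
                    names.add(column);
                    break;
            }
        }
        return names;
    }
}
```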

getHiliteMapping

public Map<RowKey,Set<RowKey>> getHiliteMapping()
Returns the hilite translation Map, or null if the enableHilite flag in the constructor was set to false. The key of the Map is the row key of the new group row, and the corresponding value is the Set of all old row keys that belong to this group.

Returns:
the hilite translation Map, or null if the enableHilite flag in the constructor was set to false
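
The shape of the hilite mapping can be illustrated with a small stand-alone sketch. It uses String keys in place of RowKey; HiliteMappingSketch, buildMapping, and the "Row0", "Row1", ... naming of new group rows are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration; not part of the KNIME API.
final class HiliteMappingSketch {

    /**
     * Maps each new group row key to the set of old row keys in that group.
     * Each input row is {oldRowKey, groupValue}. Returns null when hiliting
     * is disabled, mirroring the documented contract.
     */
    static Map<String, Set<String>> buildMapping(List<String[]> keyedRows, boolean enableHilite) {
        if (!enableHilite) {
            return null;
        }
        Map<String, String> groupToNewKey = new LinkedHashMap<>();
        Map<String, Set<String>> mapping = new LinkedHashMap<>();
        for (String[] row : keyedRows) {
            String oldKey = row[0];
            String group = row[1];
            String newKey = groupToNewKey.get(group);
            if (newKey == null) {
                // Assumed naming scheme for the new group rows.
                newKey = "Row" + groupToNewKey.size();
                groupToNewKey.put(group, newKey);
            }
            mapping.computeIfAbsent(newKey, k -> new LinkedHashSet<>()).add(oldKey);
        }
        return mapping;
    }
}
```

Callers should check the returned Map for null before using it, since a disabled hilite flag yields no mapping at all.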

getSkippedGroupsByColName

public Map<String,Collection<Pair<String,String>>> getSkippedGroupsByColName()
Returns a Map with all skipped groups. The key of the Map is the name of the column and the value is a Collection with Pair objects with the group name as first and the corresponding skip message as second object.

Returns:
a Map with all skipped groups
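
The nested structure described above can be sketched with standard collections. The sketch substitutes Map.Entry<String,String> for the Pair class, and SkippedGroupsSketch is a hypothetical name; it only shows how entries might be collected per column, not how KNIME does it.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not part of the KNIME API.
final class SkippedGroupsSketch {

    // Column name -> (group name, skip message) entries for that column.
    private final Map<String, Collection<Map.Entry<String, String>>> skipped =
        new LinkedHashMap<>();

    /** Records that the given group was skipped for the given column. */
    void addSkippedGroup(String colName, String skipMsg, String groupName) {
        skipped.computeIfAbsent(colName, k -> new ArrayList<>())
               .add(new AbstractMap.SimpleEntry<String, String>(groupName, skipMsg));
    }

    /** Returns all skipped groups, keyed by column name. */
    Map<String, Collection<Map.Entry<String, String>>> getSkippedGroupsByColName() {
        return skipped;
    }
}
```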

getSkippedGroupsMessage

public String getSkippedGroupsMessage(int maxGroups,
                                      int maxCols)
Parameters:
maxGroups - the maximum number of skipped groups to display
maxCols - the maximum number of columns to display per group
Returns:
a message listing the skipped groups per column, or null if no groups were skipped
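
A truncated summary message of this kind could be built as in the following sketch. The message layout, the "..." truncation marker, and the exact interpretation of the two limits are assumptions; SkippedMessageSketch and message are hypothetical names, and the input is simplified to group names per column.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; not part of the KNIME API.
final class SkippedMessageSketch {

    /**
     * Lists at most maxCols columns and, for each, at most maxGroups skipped
     * group names. Returns null if nothing was skipped.
     */
    static String message(Map<String, List<String>> skippedGroupsByCol, int maxGroups, int maxCols) {
        if (skippedGroupsByCol.isEmpty()) {
            return null;
        }
        StringBuilder buf = new StringBuilder();
        int cols = 0;
        for (Map.Entry<String, List<String>> e : skippedGroupsByCol.entrySet()) {
            if (cols++ >= maxCols) {
                buf.append("..."); // more columns exist than are displayed
                break;
            }
            List<String> groups = e.getValue();
            int shown = Math.min(maxGroups, groups.size());
            buf.append(e.getKey()).append(": ")
               .append(String.join(", ", groups.subList(0, shown)));
            if (shown < groups.size()) {
                buf.append(", ..."); // more groups exist than are displayed
            }
            buf.append('\n');
        }
        return buf.toString();
    }
}
```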

checkGroupCols

public static void checkGroupCols(DataTableSpec spec,
                                  List<String> groupCols)
                           throws IllegalArgumentException
Parameters:
spec - the DataTableSpec to check
groupCols - the list of group-by column names
Throws:
IllegalArgumentException - if one of the group-by columns does not exist in the given DataTableSpec
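
The validation described here amounts to a membership check per column name. A minimal sketch, assuming the table spec is reduced to a Set of column names (CheckColsSketch and the exception message text are hypothetical):

```java
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for illustration; not part of the KNIME API.
final class CheckColsSketch {

    /**
     * Throws IllegalArgumentException if any group-by column name is
     * missing from the set of available column names.
     */
    static void checkGroupCols(Set<String> availableCols, List<String> groupCols) {
        for (String col : groupCols) {
            if (!availableCols.contains(col)) {
                throw new IllegalArgumentException(
                    "Group column '" + col + "' not found in input table");
            }
        }
    }
}
```

Running such a check before any grouping work starts lets a misconfigured column list fail fast rather than partway through execution.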

getBufferedTable

public BufferedDataTable getBufferedTable()
Returns:
the aggregated BufferedDataTable


Copyright, 2003 - 2012. All rights reserved.
University of Konstanz, Germany.
Chair for Bioinformatics and Information Mining, Prof. Dr. Michael R. Berthold.
You may not modify, publish, transmit, transfer or sell, reproduce, create derivative works from, distribute, perform, display, or in any way exploit any of the content, in whole or in part, except as otherwise expressly permitted in writing by the copyright owner or as specified in the license file distributed with this product.