org.knime.base.node.preproc.groupby
Class GroupByNodeModel

java.lang.Object
  extended by org.knime.core.node.NodeModel
      extended by org.knime.base.node.preproc.groupby.GroupByNodeModel
Direct Known Subclasses:
Pivot2NodeModel

public class GroupByNodeModel
extends NodeModel

The NodeModel implementation of the group-by node. It uses the GroupByTable class implementations to create the resulting table.

Author:
Tobias Koetter, University of Konstanz

Field Summary
protected static String CFG_COLUMN_NAME_POLICY
          Configuration key for the aggregation column name policy.
protected static String CFG_ENABLE_HILITE
          Configuration key for the enable hilite option.
protected static String CFG_GROUP_BY_COLUMNS
          Configuration key of the selected group by columns.
protected static String CFG_IN_MEMORY
          Configuration key for the in memory option.
protected static String CFG_MAX_UNIQUE_VALUES
          Configuration key for the maximum number of unique non-numerical values.
protected static String CFG_RETAIN_ORDER
          Configuration key for the retain order option.
protected static String CFG_SORT_IN_MEMORY
          Configuration key for the sort in memory option.
protected static String CFG_VALUE_DELIMITER
          Configuration key for the value delimiter option.
 
Constructor Summary
GroupByNodeModel()
          Creates a new group by model with one in- and one out-port.
GroupByNodeModel(int ins, int outs)
          Creates a new group by model.
 
Method Summary
protected static List<ColumnAggregator> compGetColumnMethods(DataTableSpec spec, List<String> excludeCols, ConfigRO config)
          Compatibility method for versions prior to KNIME 2.0.
protected static ColumnNamePolicy compGetColumnNamePolicy(NodeSettingsRO settings)
          Compatibility method for versions prior to KNIME 2.0.
protected  DataTableSpec[] configure(PortObjectSpec[] inSpecs)
          Configure method for general port types.
protected  DataTableSpec createGroupBySpec(DataTableSpec origSpec, List<String> groupByCols)
          Generate table spec based on the input spec and the selected columns for grouping.
protected  GroupByTable createGroupByTable(ExecutionContext exec, BufferedDataTable table, List<String> groupByCols)
          Create group-by table.
protected  GroupByTable createGroupByTable(ExecutionContext exec, BufferedDataTable table, List<String> groupByCols, boolean inMemory, boolean sortInMemory, boolean retainOrder, List<ColumnAggregator> aggregators)
          Create group-by table.
protected  PortObject[] execute(PortObject[] inData, ExecutionContext exec)
          Execute method for general port types.
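protected  List<ColumnAggregator> getColumnAggregators()
          Returns the list of column aggregator methods.
protected  ColumnNamePolicy getColumnNamePolicy()
          Returns the column name policy used to create the resulting pivot columns.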
protected  List<String> getGroupByColumns()
          Returns list of columns selected for group-by operation.
protected  HiLiteHandler getOutHiLiteHandler(int outIndex)
          Returns the HiLiteHandler for the given output index.
protected  void inMemoryChanged()
          Call this method if the process-in-memory flag has changed.
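protected  boolean isProcessInMemory()
          Returns true if all operations should be processed in memory.
protected  boolean isRetainOrder()
          Returns true if the row order should be retained.
protected  boolean isSortInMemory()
          Returns true if any sorting should be performed in memory.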
protected  void loadInternals(File nodeInternDir, ExecutionMonitor exec)
          Load internals into the derived NodeModel.
protected  void loadValidatedSettingsFrom(NodeSettingsRO settings)
          Sets new settings from the passed object in the model.
protected  void reset()
          Override this function in the derived model and reset your NodeModel.
protected  void saveInternals(File nodeInternDir, ExecutionMonitor exec)
          Save internals of the derived NodeModel.
protected  void saveSettingsTo(NodeSettingsWO settings)
          Adds to the given NodeSettings the model specific settings.
protected  void setHiliteMapping(DefaultHiLiteMapper mapper)
          Applies a new mapping to the hilite translator.
protected  void setInHiLiteHandler(int inIndex, HiLiteHandler hiLiteHdl)
          This implementation is empty.
protected  void validateSettings(NodeSettingsRO settings)
          Validates the settings in the passed NodeSettings object.
 
Methods inherited from class org.knime.core.node.NodeModel
addWarningListener, configure, continueLoop, execute, executeModel, getAvailableFlowVariables, getCredentialsProvider, getInHiLiteHandler, getLoopEndNode, getLoopStartNode, getNrInPorts, getNrOutPorts, getOutgoingFlowObjectStack, getWarningMessage, notifyViews, notifyWarningListeners, onDispose, peekFlowVariableDouble, peekFlowVariableInt, peekFlowVariableString, peekScopeVariableDouble, peekScopeVariableInt, peekScopeVariableString, pushFlowVariableDouble, pushFlowVariableInt, pushFlowVariableString, pushScopeVariableDouble, pushScopeVariableInt, pushScopeVariableString, removeWarningListener, resetAndConfigureLoopBody, setWarningMessage, stateChanged
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CFG_GROUP_BY_COLUMNS

protected static final String CFG_GROUP_BY_COLUMNS
Configuration key of the selected group by columns.

See Also:
Constant Field Values

CFG_MAX_UNIQUE_VALUES

protected static final String CFG_MAX_UNIQUE_VALUES
Configuration key for the maximum number of unique non-numerical values.

See Also:
Constant Field Values

CFG_ENABLE_HILITE

protected static final String CFG_ENABLE_HILITE
Configuration key for the enable hilite option.

See Also:
Constant Field Values

CFG_SORT_IN_MEMORY

protected static final String CFG_SORT_IN_MEMORY
Configuration key for the sort in memory option.

See Also:
Constant Field Values

CFG_RETAIN_ORDER

protected static final String CFG_RETAIN_ORDER
Configuration key for the retain order option.

See Also:
Constant Field Values

CFG_IN_MEMORY

protected static final String CFG_IN_MEMORY
Configuration key for the in memory option.

See Also:
Constant Field Values

CFG_COLUMN_NAME_POLICY

protected static final String CFG_COLUMN_NAME_POLICY
Configuration key for the aggregation column name policy.

See Also:
Constant Field Values

CFG_VALUE_DELIMITER

protected static final String CFG_VALUE_DELIMITER
Configuration key for the value delimiter option.

See Also:
Constant Field Values
Constructor Detail

GroupByNodeModel

public GroupByNodeModel()
Creates a new group by model with one in- and one out-port.


GroupByNodeModel

public GroupByNodeModel(int ins,
                        int outs)
Creates a new group by model.

Parameters:
ins - number of data input ports
outs - number of data output ports
Method Detail

inMemoryChanged

protected void inMemoryChanged()
Call this method if the process-in-memory flag has changed.


loadInternals

protected void loadInternals(File nodeInternDir,
                             ExecutionMonitor exec)
                      throws IOException
Load internals into the derived NodeModel. This method is only called if the Node was executed. Read all your internal structures from the given file directory to create your internal data structure which is necessary to provide all node functionalities after the workflow is loaded, e.g. view content and/or hilite mapping.

Specified by:
loadInternals in class NodeModel
Parameters:
nodeInternDir - The directory to read from.
exec - Used to report progress and to cancel the load process.
Throws:
IOException - If an error occurs during reading from this dir.
See Also:
NodeModel.saveInternals(File,ExecutionMonitor)

saveInternals

protected void saveInternals(File nodeInternDir,
                             ExecutionMonitor exec)
                      throws IOException
Save internals of the derived NodeModel. This method is only called if the Node is executed. Write all your internal structures into the given file directory which are necessary to recreate this model when the workflow is loaded, e.g. view content and/or hilite mapping.

Specified by:
saveInternals in class NodeModel
Parameters:
nodeInternDir - The directory to write into.
exec - Used to report progress and to cancel the save process.
Throws:
IOException - If an error occurs during writing to this dir.
See Also:
NodeModel.loadInternals(File,ExecutionMonitor)
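
The save/load round trip described for saveInternals and loadInternals can be illustrated with plain java.io. The following is only a minimal sketch under stated assumptions: the class name, the file name and the use of java.util.Properties as the storage format are hypothetical, the ExecutionMonitor parameter is left out, and nothing here reflects how GroupByNodeModel actually stores its internals.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.util.Properties;

// Hypothetical stand-in; NOT part of the KNIME API.
final class InternalsSketch {
    // Assumed file name inside the node's internals directory.
    static final String FILE_NAME = "hilite_mapping.properties";

    // Writes everything needed to restore the state after a workflow reload.
    static void saveInternals(File nodeInternDir, Properties mapping) throws IOException {
        try (OutputStream out = new FileOutputStream(new File(nodeInternDir, FILE_NAME))) {
            mapping.store(out, "group id -> member row ids");
        }
    }

    // Reads back what saveInternals wrote; a missing file surfaces as an IOException.
    static Properties loadInternals(File nodeInternDir) throws IOException {
        Properties mapping = new Properties();
        try (InputStream in = new FileInputStream(new File(nodeInternDir, FILE_NAME))) {
            mapping.load(in);
        }
        return mapping;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("internals").toFile();
        Properties mapping = new Properties();
        mapping.setProperty("DE", "Row0,Row2");
        saveInternals(dir, mapping);
        System.out.println(loadInternals(dir));
    }
}
```

Both methods use the same file name, so whatever one writes the other can read; that symmetry is the point of the pairing.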

saveSettingsTo

protected void saveSettingsTo(NodeSettingsWO settings)
Adds to the given NodeSettings the model specific settings. The settings don't need to be complete or consistent. If, right after startup, no valid settings are available this method can write either nothing or invalid settings.

This method is called by the Node if the current settings need to be saved or transferred to the node's dialog.

Specified by:
saveSettingsTo in class NodeModel
Parameters:
settings - The object to write settings into.
See Also:
NodeModel.loadValidatedSettingsFrom(NodeSettingsRO), NodeModel.validateSettings(NodeSettingsRO)

validateSettings

protected void validateSettings(NodeSettingsRO settings)
                         throws InvalidSettingsException
Validates the settings in the passed NodeSettings object. The specified settings should be checked for completeness and consistency. It must be possible to load a settings object validated here without any exception in the loadValidatedSettingsFrom(NodeSettingsRO) method. The method must not change the current settings in the model; it is only supposed to check them. If some settings are missing, invalid, or inconsistent, throw an exception with a message useful to the user.

Specified by:
validateSettings in class NodeModel
Parameters:
settings - The settings to validate.
Throws:
InvalidSettingsException - If the validation of the settings failed.
See Also:
NodeModel.saveSettingsTo(NodeSettingsWO), NodeModel.loadValidatedSettingsFrom(NodeSettingsRO)

loadValidatedSettingsFrom

protected void loadValidatedSettingsFrom(NodeSettingsRO settings)
                                  throws InvalidSettingsException
Sets new settings from the passed object in the model. You can safely assume that the object passed has been successfully validated by the validateSettings(NodeSettingsRO) method. The model must set its internal configuration according to the settings object passed.

Specified by:
loadValidatedSettingsFrom in class NodeModel
Parameters:
settings - The settings to read.
Throws:
InvalidSettingsException - If a property is not available.
See Also:
NodeModel.saveSettingsTo(NodeSettingsWO), NodeModel.validateSettings(NodeSettingsRO)
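
The contract shared by saveSettingsTo, validateSettings and loadValidatedSettingsFrom can be sketched without the KNIME classes. In the following minimal sketch a plain Map<String, String> stands in for the NodeSettings objects and IllegalArgumentException stands in for InvalidSettingsException. The class name, the key strings and the default value are hypothetical, and this is not the implementation used by GroupByNodeModel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a node model; NOT the KNIME API.
final class SettingsContractSketch {
    // Assumed key names; the real values of the CFG_* constants are not shown on this page.
    static final String KEY_GROUP_COLS = "groupByColumns";
    static final String KEY_MAX_UNIQUE = "maxUniqueValues";

    private List<String> groupByCols = new ArrayList<>();
    private int maxUniqueValues = 10000; // assumed default

    // Writes the current state; the result need not be complete or consistent.
    void saveSettingsTo(Map<String, String> settings) {
        settings.put(KEY_GROUP_COLS, String.join(",", groupByCols));
        settings.put(KEY_MAX_UNIQUE, Integer.toString(maxUniqueValues));
    }

    // Only checks the settings; never modifies the fields of this object.
    void validateSettings(Map<String, String> settings) {
        if (!settings.containsKey(KEY_GROUP_COLS)) {
            throw new IllegalArgumentException("No group-by columns configured");
        }
        int parsed;
        try {
            parsed = Integer.parseInt(settings.get(KEY_MAX_UNIQUE));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Maximum unique values is not a number", e);
        }
        if (parsed < 0) {
            throw new IllegalArgumentException("Maximum unique values must not be negative");
        }
    }

    // May assume that validateSettings accepted the same settings object.
    void loadValidatedSettingsFrom(Map<String, String> settings) {
        String cols = settings.get(KEY_GROUP_COLS);
        groupByCols = cols.isEmpty()
                ? new ArrayList<>()
                : new ArrayList<>(Arrays.asList(cols.split(",")));
        maxUniqueValues = Integer.parseInt(settings.get(KEY_MAX_UNIQUE));
    }

    List<String> getGroupByColumns() {
        return groupByCols;
    }

    int getMaxUniqueValues() {
        return maxUniqueValues;
    }

    public static void main(String[] args) {
        SettingsContractSketch model = new SettingsContractSketch();
        Map<String, String> settings = new HashMap<>();
        settings.put(KEY_GROUP_COLS, "country,year");
        settings.put(KEY_MAX_UNIQUE, "500");
        model.validateSettings(settings);
        model.loadValidatedSettingsFrom(settings);
        System.out.println(model.getGroupByColumns() + " / " + model.getMaxUniqueValues());
    }
}
```

Keeping validation free of side effects means that a rejected settings object leaves the previously loaded configuration untouched.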

setHiliteMapping

protected final void setHiliteMapping(DefaultHiLiteMapper mapper)
Applies a new mapping to the hilite translator.

Parameters:
mapper - new hilite mapping, or null
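
For a group-by node, a hilite mapping plausibly relates each output (group) row to the input rows it was built from. The sketch below shows that idea with plain strings and java.util collections. The class and method names are hypothetical, row ids are simple strings rather than KNIME row keys, and the actual structure held by DefaultHiLiteMapper is not described on this page.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; NOT the KNIME hilite API.
final class HiliteMappingSketch {
    // Each row is {inputRowId, groupValue}; the result maps group -> member row ids.
    static Map<String, Set<String>> buildMapping(List<String[]> rows) {
        Map<String, Set<String>> mapping = new LinkedHashMap<>();
        for (String[] row : rows) {
            mapping.computeIfAbsent(row[1], k -> new LinkedHashSet<>()).add(row[0]);
        }
        return mapping;
    }

    // Translates hilited groups into the input rows that should be hilited.
    static Set<String> translate(Map<String, Set<String>> mapping, Set<String> hilitGroups) {
        Set<String> result = new LinkedHashSet<>();
        for (String group : hilitGroups) {
            Set<String> members = mapping.get(group);
            if (members != null) { // unknown groups are ignored
                result.addAll(members);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"Row0", "DE"},
                new String[] {"Row1", "FR"},
                new String[] {"Row2", "DE"});
        Map<String, Set<String>> mapping = buildMapping(rows);
        System.out.println(translate(mapping, mapping.keySet()));
    }
}
```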

reset

protected void reset()
Override this function in the derived model and reset your NodeModel. All components should unregister themselves from any observables (at least from the hilite handler right now). All internally stored data structures should be released. User settings should not be deleted/reset though.

Specified by:
reset in class NodeModel

setInHiLiteHandler

protected void setInHiLiteHandler(int inIndex,
                                  HiLiteHandler hiLiteHdl)
This implementation is empty. Subclasses may override this method in order to be informed when the hilite handler changes at the inport, e.g. when the node (or a preceding node) is newly connected.

Overrides:
setInHiLiteHandler in class NodeModel
Parameters:
inIndex - The index of the input.
hiLiteHdl - The HiLiteHandler at input index. May be null when not available, i.e. not properly connected.

getOutHiLiteHandler

protected HiLiteHandler getOutHiLiteHandler(int outIndex)
Returns the HiLiteHandler for the given output index. This default implementation simply passes on the handler of input port 0 or generates a new one if this node has no inputs.

This method is intended to be overridden.

Overrides:
getOutHiLiteHandler in class NodeModel
Parameters:
outIndex - The output index.
Returns:
HiLiteHandler for the given output port.

configure

protected DataTableSpec[] configure(PortObjectSpec[] inSpecs)
                             throws InvalidSettingsException
Configure method for general port types. The argument specs represent the input object specs and are guaranteed to be subclasses of the PortObjectSpecs that are defined through the PortTypes given in the constructor. Similarly, the returned output specs need to comply with their port types spec class (otherwise an error is reported by the framework). They may also be null.

For a general description of the configure method refer to the description of the specialized NodeModel.configure(DataTableSpec[]) methods as it addresses more use cases.

Overrides:
configure in class NodeModel
Parameters:
inSpecs - The input data table specs. Items of the array could be null if no spec is available from the corresponding input port (i.e. not connected or upstream node does not produce an output spec). If a port is of type BufferedDataTable.TYPE and no spec is available the framework will replace null by an empty DataTableSpec (no columns) unless the port is marked as optional.
Returns:
The output objects specs or null.
Throws:
InvalidSettingsException - If this node can't be configured.

createGroupBySpec

protected final DataTableSpec createGroupBySpec(DataTableSpec origSpec,
                                                List<String> groupByCols)
                                         throws InvalidSettingsException
Generate table spec based on the input spec and the selected columns for grouping.

Parameters:
origSpec - original input spec
groupByCols - group-by columns
Returns:
a new table spec containing the group-by and aggregation columns
Throws:
InvalidSettingsException - if the group-by spec can't be generated due to invalid settings
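
The shape of the resulting spec (group-by columns followed by aggregation columns) can be sketched with column names alone. The following is a minimal sketch under stated assumptions: column specs are reduced to strings, IllegalArgumentException stands in for InvalidSettingsException, and the "Method(column)" naming pattern is only one hypothetical column name policy. The class and method names are not part of KNIME.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; NOT the KNIME DataTableSpec API.
final class GroupBySpecSketch {
    // aggregations maps an input column name to the name of its aggregation method.
    static List<String> createSpec(List<String> inputCols, List<String> groupByCols,
            Map<String, String> aggregations) {
        List<String> out = new ArrayList<>();
        for (String col : groupByCols) {
            if (!inputCols.contains(col)) {
                throw new IllegalArgumentException("Group column '" + col + "' not in input spec");
            }
            out.add(col); // group columns keep their original names
        }
        for (Map.Entry<String, String> agg : aggregations.entrySet()) {
            if (!inputCols.contains(agg.getKey())) {
                throw new IllegalArgumentException(
                        "Aggregation column '" + agg.getKey() + "' not in input spec");
            }
            // Assumed naming policy: Method(column)
            out.add(agg.getValue() + "(" + agg.getKey() + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> aggregations = new LinkedHashMap<>();
        aggregations.put("sales", "Sum");
        System.out.println(createSpec(
                Arrays.asList("country", "year", "sales"),
                Arrays.asList("country"),
                aggregations));
    }
}
```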

getGroupByColumns

protected final List<String> getGroupByColumns()
Returns list of columns selected for group-by operation.

Returns:
group-by columns

execute

protected PortObject[] execute(PortObject[] inData,
                               ExecutionContext exec)
                        throws Exception
Execute method for general port types. The argument objects represent the input objects and are guaranteed to be subclasses of the PortObject classes that are defined through the PortTypes given in the constructor. Similarly, the returned output objects need to comply with their port types object class (otherwise an error is reported by the framework).

For a general description of the execute method refer to the description of the specialized NodeModel.execute(BufferedDataTable[], ExecutionContext) methods as it addresses more use cases.

Overrides:
execute in class NodeModel
Parameters:
inData - The input objects.
exec - For BufferedDataTable creation and progress.
Returns:
The output objects.
Throws:
Exception - If the node execution fails for any reason.

createGroupByTable

protected final GroupByTable createGroupByTable(ExecutionContext exec,
                                                BufferedDataTable table,
                                                List<String> groupByCols)
                                         throws CanceledExecutionException
Create group-by table.

Parameters:
exec - execution context
table - input table to group
groupByCols - columns selected for the group-by operation
Returns:
table with group and aggregation columns
Throws:
CanceledExecutionException - if the group-by table generation was canceled externally

createGroupByTable

protected final GroupByTable createGroupByTable(ExecutionContext exec,
                                                BufferedDataTable table,
                                                List<String> groupByCols,
                                                boolean inMemory,
                                                boolean sortInMemory,
                                                boolean retainOrder,
                                                List<ColumnAggregator> aggregators)
                                         throws CanceledExecutionException
Create group-by table.

Parameters:
exec - execution context
table - input table to group
groupByCols - columns selected for the group-by operation
inMemory - keep data in memory
sortInMemory - perform sorting in memory
retainOrder - reconstruct the original data order
aggregators - column aggregation to use
Returns:
table with group and aggregation columns
Throws:
CanceledExecutionException - if the group-by table generation was canceled externally
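
The effect of a retain-order flag on a group-by result can be illustrated with plain collections. This sketch assumes one plausible reading of retainOrder: groups are emitted in the order in which they first appear in the input, whereas without the flag they come out in sorted key order. That reading, the class name and the single sum aggregation are assumptions; the GroupByTable implementations themselves are not shown on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; NOT the KNIME GroupByTable implementation.
final class RetainOrderSketch {
    // Each row is {groupKey, integerValue}; returns "key=sum" entries.
    static List<String> sumByKey(List<String[]> rows, boolean retainOrder) {
        // LinkedHashMap keeps first-appearance order, TreeMap sorts by key.
        Map<String, Integer> sums = retainOrder
                ? new LinkedHashMap<String, Integer>()
                : new TreeMap<String, Integer>();
        for (String[] row : rows) {
            sums.merge(row[0], Integer.parseInt(row[1]), Integer::sum);
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : sums.entrySet()) {
            out.add(entry.getKey() + "=" + entry.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"b", "1"},
                new String[] {"a", "2"},
                new String[] {"b", "3"});
        System.out.println(sumByKey(rows, true));
        System.out.println(sumByKey(rows, false));
    }
}
```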

isRetainOrder

protected boolean isRetainOrder()
Returns:
true if the row order should be retained

isProcessInMemory

protected boolean isProcessInMemory()
Returns:
true if all operations should be processed in memory

isSortInMemory

protected boolean isSortInMemory()
Returns:
true if any sorting should be performed in memory

getColumnAggregators

protected List<ColumnAggregator> getColumnAggregators()
Returns:
list of column aggregator methods

getColumnNamePolicy

protected ColumnNamePolicy getColumnNamePolicy()
Returns:
column name policy used to create resulting pivot columns

compGetColumnNamePolicy

protected static ColumnNamePolicy compGetColumnNamePolicy(NodeSettingsRO settings)
Compatibility method for versions prior to KNIME 2.0. Helper method to get the ColumnNamePolicy for the old node settings.

Parameters:
settings - the settings to read the old column name policy from
Returns:
the ColumnNamePolicy equivalent to the old setting

compGetColumnMethods

protected static List<ColumnAggregator> compGetColumnMethods(DataTableSpec spec,
                                                             List<String> excludeCols,
                                                             ConfigRO config)
Compatibility method for versions prior to KNIME 2.0. Helper method to get the aggregation methods for the old node settings.

Parameters:
spec - the input DataTableSpec
excludeCols - the columns that should be excluded from the aggregation columns
config - the config object to read from
Returns:
the ColumnAggregators


Copyright, 2003 - 2012. All rights reserved.
University of Konstanz, Germany.
Chair for Bioinformatics and Information Mining, Prof. Dr. Michael R. Berthold.
You may not modify, publish, transmit, transfer or sell, reproduce, create derivative works from, distribute, perform, display, or in any way exploit any of the content, in whole or in part, except as otherwise expressly permitted in writing by the copyright owner or as specified in the license file distributed with this product.