Once this has been implemented, the actual algorithm for equidistant binning can be written. The code that operates on the data belongs in the execute method. In this example only one column is appended to the original data. A ColumnRearranger is used for this purpose. It requires a CellFactory, which returns the cells to append for a given row.
...
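// The factory produces the appended cell (the bin number) for each row.
CellFactory cellFactory = new NumericBinnerCellFactory(
        createOutputColumnSpec(), splitPoints, colIndex);
// The rearranger starts from the input table spec and appends the new column.
ColumnRearranger outputTable = new ColumnRearranger(
        inData[IN_PORT].getDataTableSpec());
outputTable.append(cellFactory);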
...
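The splitPoints passed to the factory above are not shown in this excerpt. As a minimal sketch only, the upper bounds of equally wide bins could be derived from the minimum and maximum of the selected column as follows. The class EquidistantSplitPoints and its method upperBounds are hypothetical names introduced for illustration; they are not part of the KNIME API, and the actual node may compute its split points differently.

```java
// Hypothetical helper for illustration; not part of the KNIME API or the original node.
final class EquidistantSplitPoints {

    private EquidistantSplitPoints() { }

    /**
     * Returns the upper bounds of numBins equally wide bins covering [min, max].
     * Assumption: min and max of the selected column are already known.
     */
    static double[] upperBounds(double min, double max, int numBins) {
        if (numBins < 1) {
            throw new IllegalArgumentException("numBins must be at least 1");
        }
        if (!(min <= max)) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        double width = (max - min) / numBins;
        double[] bounds = new double[numBins];
        for (int i = 0; i < numBins; i++) {
            // The upper bound of bin i lies (i + 1) bin widths above the minimum.
            bounds[i] = min + (i + 1) * width;
        }
        // Pin the last bound to max so that rounding cannot exclude the maximum value.
        bounds[numBins - 1] = max;
        return bounds;
    }
}
```

For example, EquidistantSplitPoints.upperBounds(0.0, 10.0, 5) yields the bounds 2, 4, 6, 8 and 10.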
Once the ColumnRearranger has been created, it is passed together with the input table to the ExecutionContext. The ExecutionContext creates the BufferedDataTable that the execute method returns, i.e. the table provided at the outport. Each node buffers its data in a BufferedDataTable. The ColumnRearranger avoids redundant buffering of the same data: only the appended column is buffered in our node. That is why the BufferedDataTable has to be obtained from the ExecutionContext:
...
BufferedDataTable bufferedOutput = exec.createColumnRearrangeTable(
        inData[IN_PORT], outputTable, exec);
return new BufferedDataTable[]{bufferedOutput};
...
The CellFactory used here is a NumericBinnerCellFactory, which has to be implemented. It extends SingleCellFactory and only implements the getCell method. The method checks which bin contains the row's value in the selected column and returns the number of that bin as a DataCell. A missing input value, or a value above the last upper bound, yields a missing cell.
@Override
public DataCell getCell(DataRow row) {
    DataCell currCell = row.getCell(m_colIndex);
    // A missing input value cannot be binned.
    if (currCell.isMissing()) {
        return DataType.getMissingCell();
    }
    double currValue = ((DoubleValue)currCell).getDoubleValue();
    int binNr = 0;
    // Return the first bin whose upper bound is not exceeded by the value.
    for (Double intervalBound : m_intervalUpperBounds) {
        if (currValue <= intervalBound) {
            return new IntCell(binNr);
        }
        binNr++;
    }
    // The value lies above the last upper bound, so no bin contains it.
    return DataType.getMissingCell();
}
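To try out the lookup logic of getCell without the KNIME classes, it can be reduced to plain Java. The following is only a sketch under that assumption: the class BinLookup, the method findBin and the constant NO_BIN are hypothetical names, and NO_BIN stands in for the missing cell returned above.

```java
// Hypothetical stand-alone version of the bin lookup; not part of the KNIME API.
final class BinLookup {

    /** Stands in for the missing cell that getCell returns when no bin matches. */
    static final int NO_BIN = -1;

    private BinLookup() { }

    /**
     * Returns the index of the first bin whose upper bound is not exceeded
     * by value, or NO_BIN if value lies above the last upper bound.
     * Assumption: upperBounds is sorted in ascending order.
     */
    static int findBin(double value, double[] upperBounds) {
        for (int binNr = 0; binNr < upperBounds.length; binNr++) {
            if (value <= upperBounds[binNr]) {
                return binNr;
            }
        }
        return NO_BIN;
    }
}
```

With the upper bounds 2, 4, 6, 8 and 10, the value 2.0 falls into bin 0 because the comparison is inclusive, 2.1 falls into bin 1, and 11.0 yields NO_BIN. As in getCell, a value below the lowest bin is also assigned to bin 0, since only upper bounds are checked.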