This example demonstrates the usage of the network mining plug-in based on an artificially generated social network. The network consists of 105 nodes representing people and 240 edges representing relationships between these people. Each person has different attributes such as age, gender, income, etc., which are assigned as features to the corresponding nodes.
This workflow demonstrates the usage of the Object Inserter and Feature Inserter.
The Object Inserter node inserts objects (e.g. nodes and edges) from a data table into a network. The table can either list all nodes of an edge in a single row or spread the nodes of one edge over several rows. An example of a table that contains one row per edge (typical edge table) is the following:
|Node1 ID||Node2 ID|
whereas a table that contains several rows per edge could look like this:
|Edge ID||Node ID|
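The two table layouts carry the same information. As a minimal sketch in plain Python (not the Object Inserter's actual implementation; the function names and the sample IDs are made up for illustration), both can be reduced to the same list of edges:

```python
from collections import defaultdict


def edges_from_pair_rows(rows):
    """One row per edge: each row is (node1_id, node2_id)."""
    return [frozenset(row) for row in rows]


def edges_from_membership_rows(rows):
    """Several rows per edge: each row is (edge_id, node_id)."""
    members = defaultdict(set)
    for edge_id, node_id in rows:
        members[edge_id].add(node_id)
    # Sort by edge ID so the result does not depend on row order.
    return [frozenset(nodes) for _, nodes in sorted(members.items())]


# Hypothetical sample data in both layouts.
pair_rows = [("anna", "ben"), ("ben", "carla")]
membership_rows = [
    ("e1", "anna"),
    ("e1", "ben"),
    ("e2", "ben"),
    ("e2", "carla"),
]

same = edges_from_pair_rows(pair_rows) == edges_from_membership_rows(membership_rows)
print(same)
```

The second layout is the more general one, since an edge ID can group more than two nodes.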
The Feature Inserter node allows features to be assigned to all graph elements (e.g. graph, node, edge, end object). In this example workflow, it is used to assign various node features such as the age and gender of a person as well as a dummy graph feature.
To get a first impression of the network, the workflow uses the Network Viewer node.
The Network Writer node is used to create a network file, which we will use in the following workflow.
This workflow shows the benefits of integrating the network mining capability into the KNIME platform. The workflow makes use of the wealth of data mining nodes available within KNIME to predict the phase of life attribute for all persons that are missing this information.
In order to make use of the existing nodes, the network features need to be extracted into a KNIME data table using the Feature Table node. Once the features are extracted, the standard KNIME data processing and mining nodes are used to predict the missing features. The predicted features are then inserted into the network, which subsequently contains the phase of life feature for all people.
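The extract, predict and re-insert cycle can be illustrated outside KNIME with a small Python sketch. Everything in it is an assumption made for illustration: the feature names, the sample values, and the nearest-neighbour-by-age rule, which merely stands in for whatever learner the workflow actually uses.

```python
def feature_table(node_features, columns):
    """Flatten per-node feature dicts into table rows (missing values become None)."""
    return [
        {"node": node, **{col: feats.get(col) for col in columns}}
        for node, feats in sorted(node_features.items())
    ]


def predict_missing(rows, target, predictor):
    """Fill in the target from the row with the closest predictor value."""
    known = [row for row in rows if row[target] is not None]
    predicted = {}
    for row in rows:
        if row[target] is None:
            nearest = min(known, key=lambda k: abs(k[predictor] - row[predictor]))
            predicted[row["node"]] = nearest[target]
    return predicted


# Hypothetical node features; "carla" lacks the phase of life feature.
node_features = {
    "anna": {"age": 34, "phase_of_life": "family"},
    "ben": {"age": 19, "phase_of_life": "student"},
    "carla": {"age": 21},
}

rows = feature_table(node_features, ["age", "phase_of_life"])
predictions = predict_missing(rows, target="phase_of_life", predictor="age")

# Write the predicted values back to the nodes.
for node, value in predictions.items():
    node_features[node]["phase_of_life"] = value
```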
This workflow demonstrates the usage of various nodes that allow network objects to be filtered. It shows, for example, the filtering of leaves using the Node Degree Filter node or filtering based on feature values using the Feature Value Filter node.
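Both kinds of filtering can be sketched in a few lines of plain Python. This is an illustration of the idea only, assuming a leaf is a node of degree one and that edges are simple node pairs; it does not reproduce the configuration options of the KNIME filter nodes, and all names below are hypothetical.

```python
from collections import Counter


def node_degrees(edges):
    """Count how many edges touch each node."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


def remove_leaves(edges):
    """Drop every edge that touches a node of degree one (single pass)."""
    degree = node_degrees(edges)
    return [(a, b) for a, b in edges if degree[a] > 1 and degree[b] > 1]


def filter_nodes_by_feature(features, name, predicate):
    """Keep the nodes that have the feature and whose value passes the predicate."""
    return {
        node
        for node, feats in features.items()
        if name in feats and predicate(feats[name])
    }


# Hypothetical sample network: a triangle a-b-c with a leaf d attached to c.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
features = {"a": {"age": 30}, "b": {"age": 17}, "c": {}}

without_leaves = remove_leaves(edges)
adults = filter_nodes_by_feature(features, "age", lambda age: age >= 18)
```

Removing leaves once can expose new leaves; the sketch makes a single pass and does not repeat the step.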
This workflow demonstrates the symbiosis between the network based data structure provided by the network plug-in and the existing data mining nodes within KNIME.
In order to make use of the existing mining nodes, the workflow first converts the network structure into a standard data table using the Node Adjacency Matrix node. As soon as we have a standard KNIME data table, we can use the countless nodes within KNIME, such as the "Distance Matrix" nodes in this example, to cluster the persons of the network based on their neighborhood.
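The idea of comparing persons by their neighborhood can be sketched as follows. The source does not state which distance measure the workflow uses, so the Jaccard distance below is an assumption chosen for illustration, as are the function names and the sample network.

```python
def adjacency_matrix(nodes, edges):
    """Build a symmetric 0/1 matrix with one row and one column per node."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1
    return matrix


def jaccard_distance(row_a, row_b):
    """1 - |shared neighbours| / |all neighbours| for two adjacency rows."""
    both = sum(1 for x, y in zip(row_a, row_b) if x and y)
    either = sum(1 for x, y in zip(row_a, row_b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def distance_matrix(matrix):
    """Pairwise distances between all adjacency rows."""
    return [[jaccard_distance(a, b) for b in matrix] for a in matrix]


# Hypothetical network: a and b share the same neighbours c and d.
nodes = ["a", "b", "c", "d"]
edges = [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d")]

adjacency = adjacency_matrix(nodes, edges)
distances = distance_matrix(adjacency)
```

Here `a` and `b` have identical neighborhoods and therefore distance 0, so any clustering based on this matrix would place them together.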
The cluster result is used to partition the person network into distinct partitions using the Assign Partition node. The result is an n-partite graph where n is the number of clusters.
Using the Partition Graph Creator node, we can analyze the connections between and within each partition. This node generates a new network that contains the partitions as nodes, which are connected by an edge if the corresponding partitions are also connected in the original graph. It also provides information about the number of nodes and edges that belong to a partition and the number of edges that connect two partitions.
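The projection described above amounts to counting nodes per partition, edges inside each partition, and edges between each pair of partitions. A minimal Python sketch of that counting, assuming every node belongs to exactly one partition (the function name and sample data are hypothetical, not part of the plug-in):

```python
from collections import Counter


def partition_graph(partition_of, edges):
    """Summarise a partitioned network.

    Returns three counters: nodes per partition, edges within each
    partition, and edges between each unordered pair of partitions.
    """
    node_count = Counter(partition_of.values())
    inner_edges = Counter()
    between_edges = Counter()
    for a, b in edges:
        pa, pb = partition_of[a], partition_of[b]
        if pa == pb:
            inner_edges[pa] += 1
        else:
            between_edges[tuple(sorted((pa, pb)))] += 1
    return node_count, inner_edges, between_edges


# Hypothetical clustering result: two partitions with two persons each.
partition_of = {"a": 0, "b": 0, "c": 1, "d": 1}
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]

node_count, inner_edges, between_edges = partition_graph(partition_of, edges)
```

Each key of `between_edges` corresponds to one edge of the partition graph, and its count is the number of original edges that the new edge summarises.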
The result of this graph projection can be visualized in an external program (e.g. visone) using the Viz Output Connector node.
The example image shows the generated partition graph in visone with the number of edges within a partition mapped on the node color whereas the node size represents the number of nodes within a partition. The edge width reflects the number of edges between two partitions.
Notice: This workflow requires the Distance Matrix plug-in, which is available as a KNIME extension from the KNIME update site.
This workflow demonstrates how subgraphs can be processed in KNIME by using the available Flow Controls.
The workflow extracts each person's direct neighborhood using the SubGraph Extractor node. The result is a data table that contains a NetworkCell with a subgraph in each row.
To process each row (i.e. each subgraph) separately, we use the "Chunk Loop Start" node with a chunk size of one. Once the NetworkCell is converted back into a network using the Row To Network node, it can be analyzed, e.g. by using the Network Analyzer node.
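The loop over subgraphs can be pictured with the following Python sketch. It assumes that a person's direct neighborhood consists of the person, all adjacent persons, and every edge among them, and it uses graph density as a stand-in for the analysis step; neither detail is confirmed by the source, and all names are hypothetical.

```python
def neighborhood_subgraph(center, edges):
    """Return the centre node, its direct neighbours, and the edges among them."""
    members = {center}
    for a, b in edges:
        if a == center:
            members.add(b)
        elif b == center:
            members.add(a)
    sub_edges = [(a, b) for a, b in edges if a in members and b in members]
    return members, sub_edges


def density(nodes, edges):
    """Share of possible undirected edges that are present."""
    n = len(nodes)
    return 0.0 if n < 2 else 2 * len(edges) / (n * (n - 1))


# Hypothetical sample network: a triangle a-b-c with d attached to c.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
persons = ["a", "b", "c", "d"]

# One subgraph per person, processed one at a time (chunk size of one).
results = {}
for person in persons:
    nodes, sub_edges = neighborhood_subgraph(person, edges)
    results[person] = density(nodes, sub_edges)
```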