Justifying the need to consolidate
Aggregating groups data along the time axis. When data has been collected for the same activity over several periods, aggregating gives a global view of that activity by widening the time frame.
Consolidating also groups data, but not along the time axis. When data has been collected for several activities over different periods, consolidating gives a global view of those activities for a single period.
For example, an application server has several nodes running the application, data has been captured at OS level on each node, and we want a single view of the total CPU consumption across all nodes. Consolidating is the way to do this.
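The difference between the two operations can be sketched in a few lines of Python. This is only an illustration, not KAIROS code: the sample values, the hourly periods and the function names are hypothetical, and averaging is assumed as the aggregation method.

```python
from collections import defaultdict

# Hypothetical CPU samples (percent): (node, hour, value). Not real KAIROS data.
samples = [
    ("NODE1", 9, 20.0), ("NODE1", 10, 40.0),
    ("NODE2", 9, 10.0), ("NODE2", 10, 30.0),
]

def aggregate(samples, node):
    """Group along the time axis: one averaged value for one node over a wider period."""
    values = [v for n, _, v in samples if n == node]
    return sum(values) / len(values)

def consolidate(samples):
    """Group across nodes: one summed value per period for all nodes together."""
    totals = defaultdict(float)
    for _, hour, value in samples:
        totals[hour] += value
    return dict(totals)

print(aggregate(samples, "NODE1"))  # 30.0
print(consolidate(samples))         # {9: 30.0, 10: 70.0}
```

Aggregating reduces the number of periods for one node, while consolidating keeps the periods and reduces the number of nodes.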
How to consolidate?
Let's take an example:
We have the following situation: a node "CONSOLIDATE" and, under this node, 2 other nodes "NODE1" and "NODE2", each of them holding data from May 19th, 2014.
We would like, at OS level (statistics coming from nmon), to consolidate the data coming from both NODE1 and NODE2 for this particular day.
How do we proceed?
a) create a new node and rename it to some mnemonic value
This new node should be independent of both NODE1 and NODE2 but placed under CONSOLIDATE. We will name it, for example, ALL_NODES.
b) create a new node under ALL_NODES (here named 2014-05-19) which will be the consolidated view of both NODE1/2014-05-19 and NODE2/2014-05-19
At this point, there is no link between the created node and the 2 producers in NODE1 and NODE2.
c) drag /CONSOLIDATE/NODE1/2014-05-19 over /CONSOLIDATE/ALL_NODES/2014-05-19 while the "ALT" key is pressed
This turns the chip attached to the target red. If we open this node, we can see that /CONSOLIDATE/NODE1/2014-05-19 is a producer.
d) drag /CONSOLIDATE/NODE2/2014-05-19 over /CONSOLIDATE/ALL_NODES/2014-05-19 while the "ALT" key is pressed
Here we repeat the step above, but we drag from a different node (NODE2/2014-05-19 instead of NODE1/2014-05-19).
If we have more than 2 nodes, we can repeat this operation as many times as needed, changing the origin each time.
e) open the aggregated node
When the drag & drop operations are complete, the producers list should contain every node from which we want to take data.
In this example the "selector" in the "aggregator" section is: /CONSOLIDATE/NODE1/2014-05-19$|/CONSOLIDATE/NODE2/2014-05-19$
This is a list of 2 regular expressions. The separator character is "|".
The first regular expression is "/CONSOLIDATE/NODE1/2014-05-19$", the second is "/CONSOLIDATE/NODE2/2014-05-19$". KAIROS doesn't try to simplify the list of regular expressions. We could have written it as follows: /CONSOLIDATE/NODE./2014-05-19$. The result would have been the same here, because NODE1 and NODE2 are the only nodes whose names match "NODE." (the "." matches any single character).
When consolidating, it's very important to consider the aggregation level. Data coming from NODE1 and NODE2 don't necessarily have the same timestamps. Check the timestamps in NODE1 and NODE2 and choose an aggregation level that gives the same timestamps for both.
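The equivalence of the two selectors can be checked outside KAIROS with a short Python sketch. It assumes that a selector is applied to each node path as an unanchored regular expression search, which is an assumption about KAIROS rather than something stated here. The `producers` helper and the NODE3 path are hypothetical and are only there to show where the shorter form would start to differ.

```python
import re

# Selector as built by the drag & drop steps, and the hand-written shorter form.
generated = r"/CONSOLIDATE/NODE1/2014-05-19$|/CONSOLIDATE/NODE2/2014-05-19$"
shortened = r"/CONSOLIDATE/NODE./2014-05-19$"

# Paths from the example tree.
example_paths = [
    "/CONSOLIDATE/NODE1/2014-05-19",
    "/CONSOLIDATE/NODE2/2014-05-19",
    "/CONSOLIDATE/ALL_NODES/2014-05-19",
]

def producers(selector, paths):
    """Return the paths matched by the selector (assumed re.search semantics)."""
    pattern = re.compile(selector)
    return [p for p in paths if pattern.search(p)]

# Both selectors pick NODE1 and NODE2 only; ALL_NODES is not matched.
print(producers(generated, example_paths))
print(producers(shortened, example_paths))

# With a hypothetical third node, only the shortened form picks it up.
extra = example_paths + ["/CONSOLIDATE/NODE3/2014-05-19"]
print(producers(generated, extra) == producers(shortened, extra))  # False
```

The explicit list produced by drag & drop is therefore the safer form when other nodes with similar names may be added later.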
In this particular example, an average method set to 10 minutes is a good choice.
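The effect of the aggregation level can be illustrated with a minimal Python sketch. It is not the KAIROS implementation: the timestamps, the values and the function names are hypothetical, and it assumes that each node is first averaged over 10-minute buckets and that the consolidated value is the sum of the per-node averages.

```python
from collections import defaultdict
from datetime import datetime

def average_by_10_minutes(samples):
    """Average (timestamp, value) samples into 10-minute buckets."""
    buckets = defaultdict(list)
    for ts, value in samples:
        # Round the timestamp down to the start of its 10-minute bucket.
        bucket = ts.replace(minute=ts.minute - ts.minute % 10, second=0, microsecond=0)
        buckets[bucket].append(value)
    return {b: sum(v) / len(v) for b, v in buckets.items()}

def consolidate(*nodes):
    """Sum the per-bucket averages of several nodes."""
    totals = defaultdict(float)
    for node in nodes:
        for bucket, value in node.items():
            totals[bucket] += value
    return dict(totals)

def parse(rows):
    fmt = "%Y-%m-%d %H:%M:%S"
    return [(datetime.strptime("2014-05-19 " + t, fmt), v) for t, v in rows]

# Hypothetical samples: the two nodes never share a raw timestamp.
node1 = parse([("10:00:05", 20.0), ("10:05:05", 30.0), ("10:10:05", 40.0)])
node2 = parse([("10:02:40", 10.0), ("10:07:40", 20.0), ("10:12:40", 30.0)])

result = consolidate(average_by_10_minutes(node1), average_by_10_minutes(node2))
for bucket in sorted(result):
    print(bucket.strftime("%H:%M"), result[bucket])
# 10:00 40.0
# 10:10 70.0
```

Without the 10-minute averaging, no timestamp would be common to both nodes, so no value of NODE1 could be added to a value of NODE2.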
f) view the result
Above is an example of the result on the aggregated node. It's easy to check that this result is the sum of the results for NODE1 and NODE2.
for NODE1:
for NODE2:
There is a better way to compare data within KAIROS. See How to compare data within KAIROS.