Why aggregate?


With KAIROS you can record data collected at regular intervals (every day for example). Each collection will appear as a different node in the KAIROS repository.


The idea behind "aggregating" is to group data into a larger view in order to reveal and predict trends.


Some examples:


a) with Oracle, a request can take more and more time to execute, and the drift is sometimes difficult to perceive on a single day. By grouping several days, it becomes easier to prove that this request is drifting

b) at the system level, you want to monitor the CPU consumption over time and determine whether the system capacity must be upgraded.


How to aggregate?


Aggregating uses the same mechanism as the one described in "How to average data within KAIROS". In fact, "averaging" is a particular case of aggregating.


Let's take an example:



In this example, we have a lot of reports (represented by nodes with blue chips). Each node represents a day of activity. 


We collected data during the first five working days of each month (Saturdays, Sundays and holidays are excluded). We want to get a consolidated view of this picture with one point per month, each point being the average of the 5 working days of the month.


So how to build this consolidated view?


a) create a new node


This node can be anywhere in the tree, most probably at the same level as the nodes we want to aggregate



b) rename the created node with an appropriate name



c) drag one of the nodes to be aggregated over the created node while the "Alt" key is pressed


The yellow chip then turns red, meaning that the node is now an aggregated node


d) open the aggregated node



At this step of the process, the producers section contains only one producer. This is not what we want. We want several nodes to be producers, namely all nodes under the /SAMPLE directory whose name begins with 2014 or a later year.


To update this, we need to change the "selector" field in the "aggregator" section. Its current value is "/SAMPLE/2015-03-06$".


KAIROS interprets this field as a list of regular expressions. The separator of the list is the pipe character "|". In this example there is no "|" character, so the list is reduced to a single regular expression.


To get more producers we need to modify this regular expression, for example "/SAMPLE/20."


Every node matching this expression will be a candidate to be a producer.
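As an illustration, the selection of candidates can be sketched in Python as follows. This is only a sketch and not the KAIROS implementation: the function name "candidates" and the list of node paths are hypothetical, and it assumes that each expression of the list is searched anywhere in the full node path.

```python
import re

def candidates(selector, node_paths):
    """Return the node paths matched by at least one expression of the selector."""
    # Assumption: the selector is split on "|" and every part is an
    # independent regular expression searched in the node path.
    patterns = [re.compile(part) for part in selector.split("|")]
    return [path for path in node_paths if any(p.search(path) for p in patterns)]

# Hypothetical repository content
nodes = [
    "/SAMPLE/2014-01-02",
    "/SAMPLE/2015-03-05",
    "/SAMPLE/2015-03-06",
    "/OTHER/2015-03-06",
]

print(candidates("/SAMPLE/2015-03-06$", nodes))  # only one candidate
print(candidates("/SAMPLE/20.", nodes))          # the three nodes under /SAMPLE
```

With the first selector only "/SAMPLE/2015-03-06" is a candidate, while the second selector turns every node under /SAMPLE whose name starts with "20" into a candidate.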


Let's try it (don't forget to press "apply aggregator" to see the result)



The result is the same: the selector has been updated but after pushing "apply aggregator" we have only one producer which is node "/SAMPLE/2015-03-06".


This is not what we were expecting.


Why?


In fact, we need to look at the other fields in the "aggregator" section. We have the "take" field with the value 1, "skip" with 0, ....


"Take" means: among the list of candidates, KAIROS retains "take" candidates. In this example, with the value "1", KAIROS retains only one candidate among the list of candidates. Candidates are ordered by name (here in descending order) and the first of the list has been retained.


So, if we modify "take" with a greater value, what happens?


Suppose we update "take" with the value "3"



We now have more producers, and you can check which producers are retained in the "producers" section.


If we want many nodes but don't know exactly how many, we can choose a number large enough to be sure that it exceeds the real number of nodes to be considered.


For example 1000:



Now the list of producers matches what we were expecting.


There is another field named "skip", to be used if needed. The candidates selected by the regular expression are sorted according to the "sort" parameter, the first "skip" candidates are ignored, and the next "take" candidates are retained.
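The combination of "sort", "skip" and "take" can be sketched in Python as follows. This is a sketch under stated assumptions, not the KAIROS code: the function name "select_producers" and the candidate list are hypothetical, and the sort is assumed to be on the node name, in descending order as in the example above.

```python
def select_producers(candidates, take=1, skip=0, descending=True):
    """Order the candidates, ignore the first `skip`, keep the next `take`."""
    # Assumption: ordering is done on the node name only.
    ordered = sorted(candidates, reverse=descending)
    return ordered[skip:skip + take]

# Hypothetical candidates selected by the regular expression
found = [
    "/SAMPLE/2014-01-02",
    "/SAMPLE/2015-03-05",
    "/SAMPLE/2015-03-06",
    "/SAMPLE/2014-02-03",
]

print(select_producers(found, take=1))          # the single node seen at first
print(select_producers(found, take=3))          # three producers
print(select_producers(found, take=1000))       # every candidate
print(select_producers(found, take=2, skip=1))  # the most recent node is ignored
```

A "take" value larger than the number of candidates is harmless in this sketch: the slice simply stops at the end of the list, which is the behavior relied upon when choosing 1000.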


From day to day and from month to month, the result of such an aggregation is not constant, because new nodes matching the selector automatically become producers. This is the advantage of this method: the report follows the data over time without having to be rebuilt each time.


Before viewing the result of such an aggregation, we should first consider the "method" field to decide whether the data has to be averaged. When we aggregate many reports over time, the number of points is the sum of the number of points in each report. This number can be very high; it is generally not necessary to keep so many points, and the chart can become difficult to read.


In our example, we want to have one point per month. So we need to set the "method" field and choose the averaging method that matches our expectation.
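To make "one point per month" concrete, here is a minimal Python sketch of a monthly average. It does not reproduce the averaging methods offered by the "method" field, which are not listed here; the function name "monthly_average" and the sample values are hypothetical, and the timestamps are assumed to follow the 17-character format described in "Filtering on time".

```python
from collections import defaultdict
from statistics import mean

def monthly_average(points):
    """Reduce (timestamp, value) points to one averaged value per month."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Assumption: the first 6 characters of a timestamp are YYYYMM.
        buckets[timestamp[:6]].append(value)
    return {month: mean(values) for month, values in sorted(buckets.items())}

# Hypothetical values collected on two working days of two months
points = [
    ("20150302090000000", 10.0),
    ("20150303090000000", 20.0),
    ("20150401090000000", 30.0),
    ("20150402090000000", 50.0),
]

print(monthly_average(points))  # {'201503': 15.0, '201504': 40.0}
```

Four points coming from four daily reports are reduced to two points, one per month.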



Now we are ready to view the result.


e) viewing the result



You then have an overall view of a given situation (here the wait events on an Oracle database) and you can evaluate whether the picture is clean, whether some actions should be decided, or whether the actions already taken have been effective.


Of course, all available charts at the detailed level are available at the aggregated level.


Filtering on time


In the above example, no time filter has been applied: all the data coming from a producer node is included in the result. KAIROS gives the capability to filter the data coming from the producer nodes. In the "aggregator" section, there is an additional field named "TimeFilter" that restricts the data taken from producers.


This field is a regular expression that is applied to the "timestamp" field when data is copied from a producer to the aggregated node. By default, this regular expression equals "." so that everything is taken from a producer.


Within KAIROS a timestamp is a 17-character field with the following layout: "YYYYMMDDHHMNSSXXX" where:


"YYYY" is the year

"MM" is the month

"DD" is the day

"HH" is the hour

"MN" is the minute

"SS" is the second

"XXX" is the millisecond

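As an illustration of this layout, a timestamp can be decoded in Python as follows. This is a minimal sketch: the function name "parse_timestamp" is hypothetical and not part of KAIROS.

```python
from datetime import datetime

def parse_timestamp(ts):
    """Decode a 17-character "YYYYMMDDHHMNSSXXX" timestamp into a datetime."""
    if len(ts) != 17 or not ts.isdigit():
        raise ValueError("expected 17 digits, got %r" % ts)
    return datetime(
        int(ts[0:4]),            # YYYY: year
        int(ts[4:6]),            # MM: month
        int(ts[6:8]),            # DD: day
        int(ts[8:10]),           # HH: hour
        int(ts[10:12]),          # MN: minute
        int(ts[12:14]),          # SS: second
        int(ts[14:17]) * 1000,   # XXX: milliseconds, converted to microseconds
    )

print(parse_timestamp("20150306093015250"))
```

Characters 9 and 10 of the timestamp hold the hour, which is the part targeted when filtering on a time range.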

Let's take an example:


Suppose we want to keep only the events that occur between 9h and 11h. We can specify the time filter like this: "^........(09|10).......$"


That is: 8 "." characters for the year (4), the month (2) and the day (2), followed by an alternative for the hour (09 or 10), followed by 7 "." characters for the minute (2), the second (2) and the millisecond (3).
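The effect of such a filter can be checked outside KAIROS with a short Python sketch. The helper names "hour_filter" and "apply_time_filter" and the sample timestamps are hypothetical; the sketch assumes the filter is searched in the 17-character timestamp, the hour range being taken from the start hour included to the end hour excluded.

```python
import re

def hour_filter(start_hour, end_hour):
    """Build a time filter keeping the hours from start_hour to end_hour - 1."""
    hours = "|".join("%02d" % h for h in range(start_hour, end_hour))
    # 8 characters for the date, the hour alternative, 7 characters for the rest
    return "^" + "." * 8 + "(" + hours + ")" + "." * 7 + "$"

def apply_time_filter(time_filter, timestamps):
    """Keep only the timestamps matched by the time filter."""
    pattern = re.compile(time_filter)
    return [ts for ts in timestamps if pattern.search(ts)]

# Hypothetical timestamps around the limits of the range
stamps = [
    "20150306085959999",
    "20150306090000000",
    "20150306105959999",
    "20150306110000000",
]

print(hour_filter(9, 11))                             # ^........(09|10).......$
print(apply_time_filter(hour_filter(9, 11), stamps))  # the two middle timestamps
```

The same helper called with 20 and 22 builds "^........(20|21).......$", the filter for the range between 20h and 22h.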



With that definition, the previous chart becomes:



Note the difference on the Y axis.


If the time selection is the range between 20h and 22h, the result will be: