Aggregation

Aggregation is located under Model Studio in Data Preparation, in the task pane on the left. Drag and drop the algorithm onto the canvas to use it. Click the algorithm to view and select its properties for analysis. Refer to Properties of Aggregation.

Properties of Aggregation

The available properties of Aggregation are shown in the figure given below.


The table given below describes the fields present in the properties of Aggregation.

Field

Description

Remark

Run

It allows you to run the node.

-

Explore

It allows you to explore the successfully executed node.

-

Vertical Ellipsis

The available options are

  • Run till node
  • Run from node
  • Publish as a model
  • Publish code

-

Task Name

It displays the name of the selected task.

You can click the text field to edit the name of the task as required.

GroupBy

It allows you to select the columns by which you want to group the data.

  • Multiple columns can be selected.
  • You can group by numerical as well as categorical columns.

Aggregate Function

It allows you to select the data to be aggregated and the function used to aggregate it.

  • Multiple functions can be selected.
  • The same data can be aggregated by different statistical measures at the same time.
  • You can aggregate numerical as well as categorical data.
  • Numerical data is aggregated by
    • Sum
    • Mean
    • Mode
    • Minimum
    • Maximum
    • Count
    • Count (Distinct)
    • Standard Deviation
    • Variance
  • Categorical data is aggregated by
    • Minimum
    • Maximum
    • Count
    • Count (Distinct)
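
As a rough illustration of the functions listed above (not Rubiscape's implementation), the sketch below groups rows by one or more columns and applies several aggregate functions at once, using only the Python standard library. The names `NUMERICAL`, `CATEGORICAL`, and `aggregate`, and the sample rows, are hypothetical. The standard deviation and variance used here are the sample versions, which is an assumption.

```python
from collections import defaultdict
import statistics

# Hypothetical mapping from the function names in the table to Python callables.
NUMERICAL = {
    "Sum": sum,
    "Mean": statistics.mean,
    "Mode": statistics.mode,
    "Minimum": min,
    "Maximum": max,
    "Count": len,
    "Count (Distinct)": lambda values: len(set(values)),
    # Sample statistics (assumption); both need at least two values per group.
    "Standard Deviation": statistics.stdev,
    "Variance": statistics.variance,
}

# Categorical data supports only the four functions listed for it.
CATEGORICAL = {
    name: NUMERICAL[name]
    for name in ("Minimum", "Maximum", "Count", "Count (Distinct)")
}


def aggregate(rows, group_by, column, functions):
    """Group rows (dicts) by the group_by columns, then apply every
    function in `functions` to the values of `column` in each group."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[name] for name in group_by)
        groups[key].append(row[column])
    return {
        key: {name: fn(values) for name, fn in functions.items()}
        for key, values in groups.items()
    }


# Made-up sample rows, for illustration only.
rows = [
    {"region": "East", "product": "pen", "units": 10},
    {"region": "East", "product": "ink", "units": 30},
    {"region": "West", "product": "pen", "units": 5},
    {"region": "West", "product": "pen", "units": 5},
]

# Several numerical measures on the same column at the same time.
chosen = {name: NUMERICAL[name] for name in ("Sum", "Mean", "Maximum")}
numeric = aggregate(rows, ["region"], "units", chosen)

# Categorical column aggregated with the categorical functions.
categorical = aggregate(rows, ["region"], "product", CATEGORICAL)

print(numeric)
print(categorical)
```

Each group key is a tuple so that grouping by more than one column works the same way as grouping by one.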

Example of Aggregation

The figure given below displays the output of aggregation performed on sample data. The number of deaths (numerical data) in US counties is aggregated by sum, mean, standard deviation, and maximum. The data is grouped by county name and date (both categorical data).


Field

Result

county

It displays the name of the US county for which the number of deaths is aggregated.

deaths_Aggr_0

It displays the sum of deaths in that county on a particular date.

date

It displays the date for which the data is aggregated.

deaths_Aggr_1

It displays the mean number of deaths in that county on a particular date.

deaths_Aggr_2

It displays the standard deviation of the number of deaths in that county on a particular date.

deaths_Aggr_3

It displays the maximum number of deaths in that county on a particular date.
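
The output columns described above can be mimicked with a small standard-library Python sketch. It is only an approximation under stated assumptions: the rows are made up (they are not the data shown in the figure), the `deaths_Aggr_<i>` suffix is taken to follow the order in which the functions were selected, and the standard deviation is the sample version.

```python
from collections import defaultdict
import statistics

# Made-up rows for illustration only; not the data shown in the figure.
rows = [
    {"county": "A", "date": "2020-04-01", "deaths": 2},
    {"county": "A", "date": "2020-04-01", "deaths": 4},
    {"county": "A", "date": "2020-04-02", "deaths": 6},
    {"county": "A", "date": "2020-04-02", "deaths": 6},
    {"county": "B", "date": "2020-04-01", "deaths": 1},
    {"county": "B", "date": "2020-04-01", "deaths": 3},
]

# Same order as in the example: sum, mean, standard deviation, maximum.
# statistics.stdev is the sample standard deviation (assumption) and
# needs at least two values in every group.
functions = [sum, statistics.mean, statistics.stdev, max]

# Group the deaths values by (county, date).
groups = defaultdict(list)
for row in rows:
    groups[(row["county"], row["date"])].append(row["deaths"])

# Build one output row per group, naming columns deaths_Aggr_0 .. deaths_Aggr_3.
result = []
for (county, date), values in groups.items():
    out = {"county": county, "date": date}
    for i, fn in enumerate(functions):
        out[f"deaths_Aggr_{i}"] = fn(values)
    result.append(out)

for out in result:
    print(out)
```

Each printed row corresponds to one county and date pair, with one `deaths_Aggr_<i>` column per selected function.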

