Fold: Rotate many fields into one field

class txf.Fold(pipeline, inputs, outputs, tags=None)

The Fold transform folds (unpivots) a set of fields.

Simple folding consists of rotating multiple input fields into a single output field and a field of tags containing the original input field names.

This can be generalised to multiple output fields where the input fields are broken up into equal-sized groups, each of which is mapped to one of the output fields. In the latter case, the tags can be supplied by the caller or generated by concatenating the field names.

pipeline: Transform

The input pipeline (required).

inputs: tuple(str)

The list of fields to be folded. They will be dropped from the output, so use Copy to preserve them.

outputs: tuple(str)

The the output fields receiving the tags and the folded fields. The first output contains the tags and the remaining fields contain the group values. The number of inputs must be a multiple of the number of groups. They cannot overwrite existing fields, so use Drop to remove unwanted fields.

tags: tuple(str)

The optional tags for the output. It will be generated if not provided.

Usage

Fold(p, ('Sales 1992', 'Sales 1993', 'Sales 1994',),
        ('Year', 'Sales',),
        ('1992', '1993', '1994',))
Fold(p, ('Sales 1992', 'Sales 1993', 'Sales 1994',
         'Profit 1992', 'Profit 1993', 'Profit 1994',),
        ('Year', 'Sales', 'Profit',),
        ('1992', '1993', '1994',))

Examples

Single Group

The first Usage example is a case where Sales values have been pivoted by Year, so that the Sales for each Year is in a separate field.

Input

Dept

Sales 1992

Sales 1993

Sales 1994

Houseware

S-H-1992

S-H-1993

S-H-1994

Auto

S-A-1992

S-A-1993

S-A-1994

In order to graph Sales by Department and Year, the table needs a Year field. Fold takes the list of Sales fields to combine (“fold”) as its inputs and the fields to put them in as the outputs. The first output field is the “Tags” field, which contains the value used to identify the original field. In this example, this is the Year of the field.

After Folding, each Sales value appears in a separate row tagged by Year:

Output

Dept

Year

Sales

Home

1992

S-H-1992

Home

1993

S-H-1993

Home

1994

S-H-1994

Auto

1992

S-A-1992

Auto

1993

S-A-1993

Auto

1994

S-A-1994

Multiple Groups

The second Usage example is a related case where multiple measures (Sales and Profit) have been pivoted by Year so that the Sales and Profits for each Year are in separate fields.

Input

Dept

Sales 1992

Sales 1993

Sales 1994

Profit 1992

Profit 1993

Profit 1994

Home

S-H-1992

S-H-1993

S-H-1994

P-H-1992

P-H-1993

P-H-1994

Auto

S-A-1992

S-A-1993

S-A-1994

P-A-1992

P-A-1993

P-A-1994

In order to do an analysis comparing Sales and Profit by Year, the table needs to have each record contain the Year, Sales and Profit. This means that there are two groups that need to be Folded: Sales and Profit, and the value from each group needs to be tagged by Year. To express this, each group is listed in order in the inputs and the values are mapped to the corresponding tag value and output field. In this example, the Year is again the first output field, and the following output fields are the groups in the order given by the inputs.

After Folding, each Sales and Profit pair appears in a separate row tagged by Year:

Output

Dept

Year

Sales

Profit

Home

1992

S-H-1992

P-H-1992

Home

1993

S-H-1993

P-H-1993

Home

1994

S-H-1994

P-H-1994

Auto

1992

S-A-1992

P-A-1992

Auto

1993

S-A-1993

P-A-1993

Auto

1994

S-A-1994

P-A-1994