Fold: Rotate many fields into one field¶
- class txf.Fold(pipeline, inputs, outputs, tags=None)¶
The
Fold
transform folds (unpivots) a set of fields.Simple folding consists of rotating multiple input fields into a single output field and a field of tags containing the original input field names.
This can be generalised to multiple output fields where the input fields are broken up into equal-sized groups, each of which is mapped to one of the output fields. In the latter case, the tags can be supplied by the caller or generated by concatenating the field names.
- inputs: tuple(str)¶
The list of fields to be folded. They will be dropped from the output, so use
Copy
to preserve them.
- outputs: tuple(str)¶
The the output fields receiving the tags and the folded fields. The first output contains the tags and the remaining fields contain the group values. The number of inputs must be a multiple of the number of groups. They cannot overwrite existing fields, so use
Drop
to remove unwanted fields.
- tags: tuple(str)¶
The optional tags for the output. It will be generated if not provided.
Usage¶
Fold(p, ('Sales 1992', 'Sales 1993', 'Sales 1994',),
('Year', 'Sales',),
('1992', '1993', '1994',))
Fold(p, ('Sales 1992', 'Sales 1993', 'Sales 1994',
'Profit 1992', 'Profit 1993', 'Profit 1994',),
('Year', 'Sales', 'Profit',),
('1992', '1993', '1994',))
Examples¶
Single Group¶
The first Usage example is a case where Sales values have been pivoted by Year, so that the Sales for each Year is in a separate field.
Dept |
Sales 1992 |
Sales 1993 |
Sales 1994 |
---|---|---|---|
Houseware |
S-H-1992 |
S-H-1993 |
S-H-1994 |
Auto |
S-A-1992 |
S-A-1993 |
S-A-1994 |
In order to graph Sales by Department and Year, the table needs a Year field.
Fold
takes the list of Sales fields to combine (“fold”) as its inputs
and the fields to put them in as the outputs.
The first output field is the “Tags” field, which contains the value used to
identify the original field.
In this example, this is the Year of the field.
After Folding, each Sales value appears in a separate row tagged by Year:
Dept |
Year |
Sales |
---|---|---|
Home |
1992 |
S-H-1992 |
Home |
1993 |
S-H-1993 |
Home |
1994 |
S-H-1994 |
Auto |
1992 |
S-A-1992 |
Auto |
1993 |
S-A-1993 |
Auto |
1994 |
S-A-1994 |
Multiple Groups¶
The second Usage example is a related case where multiple measures (Sales and Profit) have been pivoted by Year so that the Sales and Profits for each Year are in separate fields.
Dept |
Sales 1992 |
Sales 1993 |
Sales 1994 |
Profit 1992 |
Profit 1993 |
Profit 1994 |
---|---|---|---|---|---|---|
Home |
S-H-1992 |
S-H-1993 |
S-H-1994 |
P-H-1992 |
P-H-1993 |
P-H-1994 |
Auto |
S-A-1992 |
S-A-1993 |
S-A-1994 |
P-A-1992 |
P-A-1993 |
P-A-1994 |
In order to do an analysis comparing Sales and Profit by Year, the table needs to have each record contain the Year, Sales and Profit. This means that there are two groups that need to be Folded: Sales and Profit, and the value from each group needs to be tagged by Year. To express this, each group is listed in order in the inputs and the values are mapped to the corresponding tag value and output field. In this example, the Year is again the first output field, and the following output fields are the groups in the order given by the inputs.
After Folding, each Sales and Profit pair appears in a separate row tagged by Year:
Dept |
Year |
Sales |
Profit |
---|---|---|---|
Home |
1992 |
S-H-1992 |
P-H-1992 |
Home |
1993 |
S-H-1993 |
P-H-1993 |
Home |
1994 |
S-H-1994 |
P-H-1994 |
Auto |
1992 |
S-A-1992 |
P-A-1992 |
Auto |
1993 |
S-A-1993 |
P-A-1993 |
Auto |
1994 |
S-A-1994 |
P-A-1994 |