Divide: Separate one field into two using a predicate

class txf.Divide(source, input, passed, failed, predicate, fills)

The Divide transform separates a field into two fields: one receives the values that pass a predicate, the other receives the values that fail it. If the predicate is a regular expression, a value passes when it matches the expression.
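The two predicate forms can be pictured as one callable. The following is a minimal sketch in plain Python, not the library's implementation; `to_predicate` is a hypothetical helper, and the use of `re.search` (match anywhere in the value) rather than `re.match` is an assumption.

```python
import re

def to_predicate(predicate):
    """Return a callable that reports whether a value passes.

    Hypothetical helper for illustration; not part of txf.
    """
    if callable(predicate):
        # A callable is used as the predicate directly.
        return predicate
    pattern = re.compile(predicate)
    # Assumption: a value passes when the pattern is found anywhere in it
    # (re.search). The library may instead anchor at the start (re.match).
    return lambda value: pattern.search(value) is not None

is_query = to_predicate(r'Q(\d+)')                     # regular expression
is_long = to_predicate(lambda value: len(value) > 10)  # callable
```

Under this assumption, `is_query('|| Q01_PARALLEL ||')` is true because the pattern occurs inside the value, while an anchored match would fail on the leading `||`.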

Divide can be used to separate a field that mixes several layouts, to find erroneous values, or to pull section names out of the main data.

source: Transform

The input pipeline (required).

input: str

The name of the field to apply the predicate to. It will be dropped from the output, so use Copy to preserve it.

passed: str

The output field receiving the values that pass the predicate. It cannot overwrite existing fields, so use Drop to remove unwanted fields.

failed: str

The output field receiving the values that fail the predicate. It cannot overwrite existing fields, so use Drop to remove unwanted fields.

predicate: str or callable

A callable predicate or a regular expression.

fills: str or tuple(str)

The value(s) to be used for the field that does not receive the input value.
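How the parameters interact on a single record can be sketched in plain Python over dictionaries. This is an illustration under stated assumptions, not the txf implementation: `divide_row` is a hypothetical function, the predicate is taken to be a callable already, the input field is assumed to be dropped before the overwrite check, the default fill of `''` is assumed, and a tuple of fills is assumed to be ordered as (fill for passed, fill for failed).

```python
def divide_row(row, input_field, passed, failed, predicate, fills=''):
    """Split row[input_field] into the passed or failed field.

    Hypothetical sketch for illustration; not part of txf.
    """
    out = dict(row)
    # The input field is dropped from the output.
    value = out.pop(input_field)
    # Output fields cannot overwrite existing fields.
    for name in (passed, failed):
        if name in out:
            raise ValueError(f'field {name!r} already exists')
    # Assumption: a single string fills either field; a tuple is
    # (fill for passed, fill for failed).
    if isinstance(fills, str):
        fills = (fills, fills)
    passed_fill, failed_fill = fills
    if predicate(value):
        out[passed], out[failed] = value, failed_fill
    else:
        out[passed], out[failed] = passed_fill, value
    return out

rows = [
    {'Line': '|| Q01_PARALLEL ||', 'Branch': 'master'},
    {'Line': 'Cold run…DONE', 'Branch': 'master'},
]
result = [
    divide_row(r, 'Line', 'Query', 'Run', lambda v: 'Q' in v)
    for r in rows
]
```

Each record keeps its other fields unchanged; only the input field is replaced by the two output fields, one holding the value and the other holding its fill.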

Usage

Divide(p, 'Date', 'Date', 'Invalid', r'(\d+)/(\d+)/(\d+)')
Divide(p, 'Line', 'Query', 'Run', r'Q(\d+)')

Example

Line                Branch
|| Q01_PARALLEL ||  master
Cold run…DONE       master
1/5…0.083052        master
2/5…0.079075        master
3/5…0.079176        master
4/5…0.078928        master
5/5…0.079142        master

Divide(p, 'Line', 'Query', 'Run', 'Q')

Query               Run            Branch
|| Q01_PARALLEL ||                 master
                    Cold run…DONE  master
                    1/5…0.083052   master
                    2/5…0.079075   master
                    3/5…0.079176   master
                    4/5…0.078928   master
                    5/5…0.079142   master