Data transformations

Data transformations provide values for all the arrays defined in one transformation node (toNode) given arrays defined in another transformation node (fromNode). Each transformation defines a dictionary of transformation parameters; the parameter values are set individually for each data item. Each transformation in a pipeline is defined by subclassing Transform.

class parseq.core.transforms.Transform(fromNode, toNode)

Base Transform class. It must be subclassed to define the following class variables:

name: str name that must be unique within the pipeline.

defaultParams: dict of default transformation parameters for new data.

Transforms, if several are present, must be instantiated in the order of data flow.

The method run_main() must be declared with either the @staticmethod or the @classmethod decorator. Any return value other than None indicates success.
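A minimal sketch of such a subclass follows. To keep it runnable without ParSeq installed, the Transform class below is a hypothetical stand-in for parseq.core.transforms.Transform. The data item, its attributes (y, yScaled, transformParams) and the parameter name factor are illustrative assumptions, not part of the documented API.

```python
from types import SimpleNamespace


class Transform:
    """Hypothetical stand-in for parseq.core.transforms.Transform."""
    name = ''
    defaultParams = {}

    def __init__(self, fromNode, toNode):
        self.fromNode, self.toNode = fromNode, toNode


class ScaleTransform(Transform):
    name = 'scale'  # must be unique within the pipeline
    defaultParams = dict(factor=2.0)  # defaults applied to new data

    @classmethod
    def run_main(cls, data):
        # Assumption: per-item parameters are stored in data.transformParams
        factor = data.transformParams['factor']
        data.yScaled = [v * factor for v in data.y]
        return True  # any value other than None signals success


# A fake data item standing in for a Spectrum instance
item = SimpleNamespace(
    y=[1.0, 2.0, 3.0],
    transformParams=dict(ScaleTransform.defaultParams))
ok = ScaleTransform.run_main(item)
```

Because run_main() is a classmethod, it can be called on the class itself without an instance, as in the last line.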

nThreads or nProcesses can be > 1 to use threading or multiprocessing, respectively. If both are > 1, threading is used. If either is > 1, the lists inArrays and outArrays must be defined so that the operational arrays (those used in run_main()) are sent over process-shared queues. Each value can be an integer, ‘all’ or ‘half’; the latter two refer to the hardware limit multiprocessing.cpu_count().
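The selection rule above can be sketched as follows. Both helper functions are hypothetical and only illustrate the described behaviour; in particular, treating 'half' as cpu_count() // 2 (with a minimum of 1) is an assumption about how the value is interpreted.

```python
import multiprocessing


def resolve_worker_count(value):
    """Hypothetical helper: turn an int, 'all' or 'half' into a count."""
    limit = multiprocessing.cpu_count()
    if value == 'all':
        return limit
    if value == 'half':
        # Assumption: half of the hardware limit, but never less than 1
        return max(1, limit // 2)
    return max(1, int(value))


def choose_backend(nThreads, nProcesses):
    """Hypothetical helper: threading wins when both counts exceed 1."""
    nT = resolve_worker_count(nThreads)
    nP = resolve_worker_count(nProcesses)
    if nT > 1:
        return 'threading', nT
    if nP > 1:
        return 'multiprocessing', nP
    return 'sequential', 1
```

For example, choose_backend(4, 4) selects threading with 4 workers, while choose_backend(1, 3) selects multiprocessing with 3 workers.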

progressTimeDelta: float, default 1.0 s, the time interval for reporting transformation progress. Only needed if run_main() is defined with a progress parameter.

__init__(fromNode, toNode)

fromNode and toNode are instances of Node. They may be the same object.

classmethod run_main(data)

Provides the actual functionality of the class. Other possible signatures:

run_main(cls, data, allData, progress)
run_main(cls, data, allData)
run_main(cls, data, progress)

data is a data item, instance of Spectrum.

allData and progress are both optional in the method’s signature. If used, the parameter names must be kept as given above, and if both are present they must appear in this order.
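Fixed parameter names make it possible for a caller to find out which optional arguments a given run_main() accepts. The sketch below shows one way this could be done with inspect.signature; call_run_main and the two example classes are hypothetical, and this is not necessarily how ParSeq dispatches the call internally.

```python
import inspect
from types import SimpleNamespace


def call_run_main(transformClass, data, allData, progress):
    """Hypothetical dispatcher: pass only what run_main() declares."""
    # For a classmethod accessed via the class, cls is already bound
    # and does not appear among the parameters.
    params = inspect.signature(transformClass.run_main).parameters
    kwargs = {}
    if 'allData' in params:
        kwargs['allData'] = allData
    if 'progress' in params:
        kwargs['progress'] = progress
    return transformClass.run_main(data, **kwargs)


class OnlyData:
    @staticmethod
    def run_main(data):
        return True


class WithProgress:
    @classmethod
    def run_main(cls, data, progress):
        progress.value = 1.0  # report completion
        return True


progress = SimpleNamespace(value=0.0)
resultPlain = call_run_main(OnlyData, object(), [], progress)
valueAfterPlain = progress.value  # untouched by OnlyData
resultProgress = call_run_main(WithProgress, object(), [], progress)
```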

allData is a list of all data items living in the data model. If allData is needed, both nThreads and nProcesses must be set to 1.
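As an illustration, a transformation that normalizes each item to the maximum over the whole data model needs allData. In the sketch below the class is a plain class so that the example runs without ParSeq; in a pipeline it would subclass Transform. The attribute names y and yNorm are assumptions, as is the placement of nThreads and nProcesses as class attributes.

```python
from types import SimpleNamespace


class NormalizeToGlobalMax:  # would subclass Transform in a pipeline
    name = 'normalize to global max'
    defaultParams = {}
    nThreads = 1    # allData requires single-threaded,
    nProcesses = 1  # single-process execution

    @classmethod
    def run_main(cls, data, allData):
        # Largest value found in any data item of the model
        globalMax = max(max(d.y) for d in allData)
        data.yNorm = [v / globalMax for v in data.y]
        return True


# Fake data items standing in for Spectrum instances
items = [SimpleNamespace(y=[1.0, 2.0]), SimpleNamespace(y=[2.0, 4.0])]
for it in items:
    NormalizeToGlobalMax.run_main(it, items)
```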

progress is an object having a field value. A heavy transformation should periodically update this field, for example progress.value = 0.5 (meaning 50% completion). If used with the GUI, progress will be visualized as an expanding colored background rectangle in the data tree. Quick transformations do not need progress reporting.
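A sketch of progress reporting inside a loop is given below. The class is a plain class so that the example runs without ParSeq; the progress object is imitated with types.SimpleNamespace, and the attribute names y and sumSq are assumptions.

```python
from types import SimpleNamespace


class SumOfSquares:  # would subclass Transform in a pipeline
    name = 'sum of squares'
    defaultParams = {}
    progressTimeDelta = 1.0  # seconds

    @classmethod
    def run_main(cls, data, progress):
        n = len(data.y)
        total = 0.0
        for i, v in enumerate(data.y):
            total += v * v
            # Fraction of work done so far, between 0 and 1
            progress.value = (i + 1) / n
        data.sumSq = total
        return True


# Fake data item and progress object for a stand-alone run
item = SimpleNamespace(y=[1.0, 2.0, 3.0])
progress = SimpleNamespace(value=0.0)
ok = SumOfSquares.run_main(item, progress)
```

In a real heavy transformation the update would typically be done once per outer iteration rather than per element, so that reporting itself stays cheap.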

Should an error happen during the transformation, the error state will be reported in the ParSeq status bar and the traceback will be shown in the data item’s tooltip in the data tree view.

Returns True when successful. If it returns an int, this int is set as the data state at the destination node (the state is a dict keyed by node names).