Pipes & Filters Architecture Pattern
Source: Pattern-Oriented Software Architecture, Vol. 1, Buschmann et al.

Problem
• You are designing a system that needs to perform several transformations on some data
• The transformations can be performed sequentially
• It should be easy to:
  – Reorder the sequence of the transformations
  – Add new kinds of transformations
  – Remove transformations
• There are different sources of input data (files, network, programs)
• There are different destinations for output data (files, network, programs)

Solution
[Figure: Data Source -> Pipe -> Filter -> Pipe -> Filter -> Pipe -> Data Sink]
• Organize the system using Pipes & Filters
• Input data originates from Data Source components
• Output data is consumed by Data Sink components
• Each transformation required to convert the input into the output is implemented by a Filter component
• Data Sources, Filters, and Data Sinks are arranged in a pipeline
• Pipes connect adjacent components, the output of one component becoming the input to the next component

Component responsibilities:
Data Source
  - Delivers input to processing pipeline
Filter
  - Reads input data
  - Performs a transformation on the input data
  - Writes output data
Data Sink
  - Consumes output of processing pipeline
Pipe
  - Transfers data
  - Buffers data
  - Synchronizes active neighbors

• Sources, Filters, and Sinks can be Active or Passive
• Active components run on their own thread of control
• Passive components execute only when invoked, directly or indirectly, by an Active component

Dynamic Behavior: Scenario I
• Active Source / Passive Filter / Passive Sink
[Sequence diagram: participants Data Source, Filter 1, Filter 2, Data Sink; messages in order: Write(data), Transform(data), Write(data), Transform(data), Write(data)]

Dynamic Behavior: Scenario II
• Passive Source / Passive Filter / Active Sink
[Sequence diagram: participants Data Source, Filter 1, Filter 2, Data Sink; messages in order: Read, Read, Read, data, Transform(data), data, Transform(data), data]

Dynamic Behavior: Scenario III
• Passive Source / Active Filter / Passive Sink
[Sequence diagram: participants Data Source, Filter 1, Filter 2, Data Sink; messages in order: Read, Read, data, Transform(data), data, Transform(data), Write(data)]

Dynamic Behavior: Scenario IV
• Passive Source / Multiple Active Filters / Passive Sink
[Sequence diagram: participants Data Source, Filter 1, Buffering Pipe, Filter 2, Data Sink; messages: Read, data, Transform(data), Write(data), repeated by each active filter]

Implementation
• Divide the system into a sequence of processing stages
• Define the data format to be passed along each pipe
• Decide how to implement each pipe connection
  – The simplest Pipe is direct Read/Write method calls between adjacent filters
  – If buffering or synchronization is required between filters, actual Pipe objects will be needed to perform these functions
• Design and implement the filters
  – Passive filters can be implemented as regular methods
  – Active filters can be implemented as processes or threads
• Design the error handling
  – In pipelines with one active component, standard error handling mechanisms can be used (exceptions, return codes, etc.)
  – In pipelines with multiple active components, how will an error in one thread of control become visible to other threads?
  – If an error occurs, we could:
    • Tear down the entire pipeline and restart it from scratch, or
    • Restart only the parts that failed and resynchronize the pipeline
• Set up the pipeline

Known Uses: UNIX Command Pipelines
cat fall.txt win.txt | sort | gzip | mail fred@byu.edu
[Figure: fall.txt and win.txt -> cat -> sort -> gzip -> mail]

Known Uses: Image Processing

Known Uses: Compilers
[Figure: Source File -> (ASCII Text) -> Lexical Analyzer -> (Token Stream) -> Parser -> (Abstract Syntax Tree) -> Semantic Analysis -> (Augmented Abstract Syntax Tree) -> Code Generator -> (Object Code) -> Optimizer -> (Optimized Object Code) -> Object File]

Consequences
• Very flexible
  – Filters can be reused and recombined in arbitrary ways
  – Individual filters can be easily replaced (e.g., plug in a different kind of compression)
• No intermediate files necessary between processing stages
• Benefits from efficiencies inherent in parallel processing (multiple active components)
• Some filters don't produce any output until they've consumed all of their input (e.g., sort), which is not conducive to parallelism
• Data transfer and context switching between processes and threads can be expensive
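
Example (illustrative sketch, not from the source)
The implementation steps listed above (stages, data format, pipe connections, active filters, pipeline setup) might look roughly like this in Python. This is only one possible arrangement: it assumes every component is active (one thread each), uses the standard library's bounded queue.Queue as the buffering, synchronizing Pipe, and passes plain strings along each pipe. The names END, source, make_filter, sink, and run_pipeline are hypothetical, chosen for this sketch.

```python
import queue
import threading

END = object()  # hypothetical end-of-stream marker passed through every pipe


def source(items, out_pipe):
    """Data Source: delivers input to the processing pipeline."""
    for item in items:
        out_pipe.put(item)  # blocks while the pipe is full
    out_pipe.put(END)


def make_filter(transform):
    """Wrap a per-item transformation as the body of an active Filter."""
    def run(in_pipe, out_pipe):
        while True:
            item = in_pipe.get()  # blocks until the upstream neighbor writes
            if item is END:
                out_pipe.put(END)  # forward the marker so downstream stops too
                return
            out_pipe.put(transform(item))
    return run


def sink(in_pipe, results):
    """Data Sink: consumes the output of the processing pipeline."""
    while True:
        item = in_pipe.get()
        if item is END:
            return
        results.append(item)


def run_pipeline(items, transforms, pipe_size=2):
    """Set up the pipes and threads, run to completion, return the sink's output."""
    # One pipe between each pair of adjacent components.
    pipes = [queue.Queue(maxsize=pipe_size) for _ in range(len(transforms) + 1)]
    results = []
    threads = [threading.Thread(target=source, args=(items, pipes[0]))]
    for i, transform in enumerate(transforms):
        threads.append(threading.Thread(target=make_filter(transform),
                                        args=(pipes[i], pipes[i + 1])))
    threads.append(threading.Thread(target=sink, args=(pipes[-1], results)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Two filters: strip whitespace, then upper-case.
    print(run_pipeline([" fall ", "win "], [str.strip, str.upper]))
```

Reordering, adding, or removing a transformation only changes the list passed to run_pipeline. The sketch deliberately omits error handling: if a transform raises, its thread ends without forwarding END and the downstream components block forever, which is the multi-threaded visibility problem raised under "Design the error handling".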