PROJECT TITLE:
Spotting Code Optimizations in Data-Parallel Pipelines through PeriSCOPE
To reduce the amount of data-shuffling I/O that occurs between the pipeline stages of a distributed data-parallel program, its procedural code must be optimized with full awareness of the pipeline that it executes in. Unfortunately, neither pipeline optimizers nor traditional compilers examine both the pipeline and the procedural code of a data-parallel program, so programmers must either hand-optimize their program across pipeline stages or live with poor performance. To resolve this tension between performance and programmability, this paper describes PeriSCOPE, which automatically optimizes a data-parallel program’s procedural code in the context of data flow that is reconstructed from the program’s pipeline topology. Such optimizations eliminate unnecessary code and data, perform early data filtering, and compute small derived values (e.g., predicates) earlier in the pipeline, so that less data—sometimes much less data—is transferred across pipeline stages. PeriSCOPE further leverages symbolic execution to enlarge the scope of such optimizations by eliminating dead code. We describe how PeriSCOPE is implemented and evaluate its effectiveness on real production jobs.
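The early-filtering idea described above can be illustrated with a small sketch. The pipeline, record layout, and predicate below are hypothetical (they are not PeriSCOPE's actual API or code); the sketch only shows why moving a filter and projection ahead of the shuffle reduces cross-stage data transfer while preserving the result:

```python
# Hypothetical two-stage pipeline: stage 1 emits records, a simulated shuffle
# moves them to stage 2, which filters and projects. Early filtering moves the
# predicate and projection before the shuffle so less data crosses the boundary.

records = [{"user": i, "score": i * 7 % 100, "payload": "x" * 50}
           for i in range(1000)]

def unoptimized():
    # Shuffle every full record, then filter and project in stage 2.
    shuffled = list(records)                 # simulated cross-stage transfer
    stage2 = [r["user"] for r in shuffled if r["score"] > 90]
    return len(shuffled), stage2

def optimized():
    # Evaluate the predicate and project early, in stage 1, before the shuffle.
    early = [r["user"] for r in records if r["score"] > 90]
    shuffled = list(early)                   # fewer, smaller records shuffled
    return len(shuffled), shuffled

n_unopt, result_unopt = unoptimized()
n_opt, result_opt = optimized()
assert result_unopt == result_opt            # same answer either way
print(n_unopt, n_opt)                        # the optimized pipeline shuffles far fewer records
```

Here the transformation is safe because the predicate depends only on fields available in stage 1; PeriSCOPE's contribution, per the abstract, is discovering such opportunities automatically from the reconstructed data flow rather than relying on the programmer to hand-optimize across stages.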