
Design and Optimization of Predicated Execution for Tiled Processor Architecture

Tutor: AnHong
School: University of Science and Technology of China
Course: Computer System Architecture
Keywords: Tiled Processor Architecture, Dataflow-like Computing Model, Hyperblock, Predicated
CLC: TP302
Type: Master's thesis
Year: 2011

With the evolution of computer architecture and semiconductor technology, computer system performance is now improved by placing more cores on a single processor rather than by raising the clock frequency. This alleviates the shortage of processor resources, but it raises another problem: how to maximize performance by fully utilizing these resources without exceeding the energy budget. Tiled processor architecture, which distributes computing and storage resources evenly across the chip, addresses the issues of the memory wall, resource utilization, wire delay, and scalability, and has therefore become a trend in microprocessor development.
This thesis implements the predicated execution technique in the compiler back end for the tiled processor TPA-PI and optimizes its execution. The main work and contributions are as follows. First, we investigate the back-end implementation of the LLVM compiler framework and implement predicated execution for TPA-PI. The technique traverses the control flow graph of the program, finds candidate blocks for predicated execution, and predicates those blocks according to the dependences between them. Second, we study the procedure for selecting basic blocks to build a hyperblock, analyze how various program factors influence the selection, and make dynamic block-selection decisions based on profiling information. Lastly, we study the factors that affect hyperblock splitting and propose a heuristic algorithm for choosing where to split. By balancing the execution overhead of splitting against the quality of the resulting hyperblocks, we derive criteria for choosing the hyperblock splitting node. Following these criteria improves both the execution efficiency of hyperblocks and the overall performance of the processor.
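
The first contribution (traverse the control flow graph, find candidate blocks, predicate them) can be illustrated with a minimal sketch of if-conversion on a single "diamond" region. This is not the thesis's LLVM implementation for TPA-PI: the `Instr` and `Block` classes, the function `if_convert_diamond`, and the candidate test (each arm has one predecessor, one successor, and no branch of its own) are all assumptions made here for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Instr:
    text: str
    # (predicate register, required value); None means "always execute".
    pred: Optional[Tuple[str, bool]] = None


@dataclass
class Block:
    name: str
    instrs: List[Instr] = field(default_factory=list)
    cond: Optional[str] = None  # register tested by a conditional branch
    succs: List[str] = field(default_factory=list)  # [taken, not-taken] or [next]


def preds_of(cfg: Dict[str, Block], name: str) -> List[str]:
    """Names of all blocks that branch or fall through to `name`."""
    return [b.name for b in cfg.values() if name in b.succs]


def if_convert_diamond(cfg: Dict[str, Block], head_name: str) -> bool:
    """Merge a head block and its two arms into one predicated block.

    Hypothetical candidate test: both arms are branch-free, reachable only
    from the head, and rejoin at the same block. Arm instructions are
    assumed to be unpredicated before conversion.
    """
    head = cfg[head_name]
    if head.cond is None or len(head.succs) != 2:
        return False
    taken, not_taken = cfg[head.succs[0]], cfg[head.succs[1]]
    if taken.name == not_taken.name:
        return False
    for arm in (taken, not_taken):
        if arm.cond is not None or len(arm.succs) != 1:
            return False
        if preds_of(cfg, arm.name) != [head.name]:
            return False
    if taken.succs[0] != not_taken.succs[0]:
        return False

    join = taken.succs[0]
    # Guard each arm's instructions with the branch condition, then drop
    # the arm: the control dependence becomes a data dependence on `cond`.
    for arm, value in ((taken, True), (not_taken, False)):
        for ins in arm.instrs:
            head.instrs.append(Instr(ins.text, (head.cond, value)))
        del cfg[arm.name]
    head.cond = None  # the conditional branch is eliminated
    head.succs = [join]
    return True


def example_cfg() -> Dict[str, Block]:
    """if (a < b) x = a + 1; else x = b - 1; return x;"""
    blocks = [
        Block("entry", [Instr("p = a < b")], cond="p", succs=["then", "else"]),
        Block("then", [Instr("x = a + 1")], succs=["join"]),
        Block("else", [Instr("x = b - 1")], succs=["join"]),
        Block("join", [Instr("ret x")]),
    ]
    return {b.name: b for b in blocks}
```

Calling `if_convert_diamond(example_cfg(), "entry")` folds `then` and `else` into `entry`, leaving one block whose last two instructions are guarded by `p` and its negation. A real back end would also have to check that every instruction is predicable on the target and weigh the cost of executing both arms, which is where the profiling-based selection described above comes in.
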
Preliminary experimental results suggest that the predicated execution technique proposed in this thesis can effectively eliminate branch instructions and merge instruction blocks in programs. Meanwhile, the optimization of predicated execution increases branch prediction accuracy by 0.68%-3% and overall program performance by 1.67%-8.39%.
The research in this thesis implements predicated execution for the tiled processor TPA-PI and lays the foundation for the compiler back-end design. In addition, the proposed optimization can serve as a compiler-side approach to improving instruction-level parallelism.