ECE734 Course Project Report I: A Survey on Retiming: Algorithms and Applications James D. Z. Ma (9017465734) Electrical and Computer Engineering Department University of Wisconsin, Madison, WI 53706 Retiming is a transformation technique used to change the locations of delay elements in a circuit without affecting the functionality of the circuit. It has many applications in synchronous circuit design, including the clock period minimization, register count minimization, power reduction, preprocessing for folding, computation of round-off noise, and logic synthesis. In this ECE734 course project, a few recent algorithms and applications of retiming will be surveyed. This phase report I lists some of the papers to be referred but not limited to, with a short summary for each of them. [1] first formulated the retiming problem, proposed several polynomial algorithms, and provided a framework for retiming via mathematical programming. Many follow-up works based their retiming algorithms on the framework, incorporating new components in the objectives and/or constraints. Different from [1] that introduced retiming for circuits with edge-triggered latches and a single phase clock, [2] developed an algorithm for the retiming of single phase sequential circuits with level sensitive (transparent) latches. [3] displayed a different view of retiming by using an equivalence between clock skew and retiming. In [4], the effectiveness of retiming was improved by transforming the sequential circuits. The bottlenecks in the sequential circuits were identified and eliminated. [5] presented a different way of considering retiming. A loop analysis is developed to evaluate all possible retiming solutions, which were then efficiently reduced to those that are appropriate for implementation. The retiming for multi-rate synchronous data-flow graph (DFG) was studied in [6] to reduce the execution time. [7] introduced a modification of the retiming algorithm [1] to help minimize the effort required to find equivalent initial states and reduce the chances that the network needs to be modified in order to find an equivalent initial state. In [8], the interconnect, variable register, and clock distribution, and non-zero clock skew were considered into a retiming algorithm. In order to consider propagation delay variation due to VLSI manufacturing, a new probabilistic retiming method was presented in [9] to transform a circuit based on statistical timing data. [10] described a polynomial algorithm for min-period and min-area retiming of edge-triggered circuits with setup and hold constraints as well as satisfying both long-path and short-path timing requirements. [11,12] proposed a new graph-transformation technique -- extended retiming -- to achieve optimal schedules, which cannot be done by traditional retiming, and develop efficient algorithms to find and verify the extended retiming solution. To consider retiming circuits that operate on 2-D signals, [13] designed two techniques to minimize the amount of memory required to implement a 2-D DFG while maintaining a desired clock rate for the circuit. Retiming techniques have been extensively applied to circuit optimization. [14] proposed a technique for optimizing a sequential network by moving the registers to the boundary of the network using an extension of retiming [1], resynthesizing the combinational logic between registers using existing logic minimization techniques, and replacing registers throughout the network using retiming algorithms. [15] gave a method of estimating power in pipelined sequential CMOS circuits, considering the impact of switching activity on the power dissipation, and applied retiming that modifies the switching activity. In [16], a negative retiming technique, together with algebraic and redundancy manipulation transformations, was applied to improve the performance of embedded systems in terms of latency and throughput. [17] developed a novel framework, which consisted of a multi-dimensional (MD) retiming technique that considered the final schedule as part of the DFG optimization process. MD retiming have also been applied in [18] and [19] to achieve the full parallelism of the DFG's with more than more dimension, substantially decreasing the overall computation time. The retiming technique that reorganizes an iteration and the unfolding technique that schedules several iterations together were combined in [20] to obtain a static schedule with a reduced average computation time per iteration. A novel multirate folding transformation was proposed in [21] to systematically synthesize control circuit for pipelined VLSI architectures which implement multirate algorithms. A retiming method for folding constraints was derived to indicate how a multirate DFG must be retimed for a given schedule to be feasible. In [22], reducing the communication caused by data and loop dependencies for perfect nested loops was explored. It was shown that for a given partition, the data dependencies may be regarded as a specialized form of loop dependencies and the effects of changing the partition for both loop and data communication were examined. [23] studied the characteristics of retiming and scheduling, and the interaction between them. An algorithm was developed to generate all possible retiming or scheduling solutions, allowing a circuit designer to explore the space of possible implementations of a given DFG. Retiming has also been applied to partial scan [24,25,26], Test Pattern Generation (TPG) for Built-In Self Test (BIST) [27,28], equivalence checking [29], FPGA technology mapping [30,31,32], software pipelining [33], and even physical design automation [34,35,36], which is beyond the scope of this survey. References [1] C. E. Leiserson, F. Rose, and J. B. Saxe, ``Optimizing synchronous circuitry by retiming,'' in The Third Caltech Conference on VLSI, pp. 87--116, 1983. [2] N. Shenoy and R. K. Brayton, ``Retiming of circuits with single phase transparent latch,'' in Proc. IEEE Int. Conf. on Computer Design, pp. 86--89, 1991. [3] S. S. Sapatnekar and R. B. Deokar, ``Utilizing the retiming-skew equivalence in a practical algorithm for retiming large circuits,'' IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol. 15, pp. 1237--1248, Oct. 1996. [4] S. Dey, M. Potkonjak, and S. G. Rothweiler, ``Performance optimization of sequential circuits by eliminating retiming bottlenecks,'' in Proc. Int. Conf. on Computer Aided Design, pp. 504--509, 1992. [5] S. Simon, E. Bernard, M. Sauer, and J. A. Nossek, ``A new retiming algorithm for circuit design,'' in Proc. IEEE Int. Symp. on Circuits and Systems, pp. 35--38, 1994. [6] T. W. O'Neil and E. H.-M. Sha, ``Retiming synchronous data-flow graphs to reduce execution time,'' IEEE Trans on Signal Processing, vol. 49, pp. 2397--2407, Oct. 2001. [7] G. Even, I. Y. Spillinger, and L. Stok, ``Retiming revisited and reversed,'' IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol. 15, pp. 348--357, Mar. 1996. [8] T. Soyata, E. G. Friedman, and J. J. H. Mulligan, ``Incorporating interconnect, register, and clock distribution delays into the retiming process,'' IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol. 16, pp. 105--120, Jan. 1997. [9] S. Tongsima, C. Chantrapornchai, E. H.-M. Sha, and N. L. Passos, ``Optimizing circuits with confidence probability using probabilistic retiming,'' in Proc. IEEE Int. Symp. on Circuits and Systems, pp. 270--273, 1998. [10] V. Sundararajan, S. Sapatnekar, and K. L. Parhi, ``MARSH: Min-area retiming with setup and hold constraints,'' in Proc. Int. Conf. on Computer Aided Design, pp. 2--6, 1999. [11] T. W. O'Neil, S. Tongsima, and E. H.-M. Sha, ``Extended retiming: Optimal scheduling via a graph-theoretical approach,'' in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, pp. 2001--2004, 1999. [12] S. Tongsima, E. H.-M. Sha, C. Chantrapornchai, D. R. Surma, and N. L. Passos, ``Probabilistic loop scheduling for applications with uncertain execution time,'' IEEE Trans. on Computers, vol. 49, pp. 65--80, Jan. 2000. [13] T. C. Denk and K. K. Parhi, ``Two-dimensional retiming,'' IEEE Trans. on Very Large Scale Integration (VLSI) Systems, vol. 7, pp. 198--211, June 1999. [14] S. Malik, E. M. Sentovich, R. K. Brayton, and A. Sangiovanni-Vincentelli, ``Retiming and resynthesis: Optimizing sequential networks with combinational techniques,'' IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol. 10, pp. 74--84, Jan. 1991. [15] J. Nonteiro, S. Devadas, and A. Ghosh, ``Retiming sequential circuits for low power,'' in Proc. Int. Conf. on Computer Aided Design, pp. 398--402, 1993. [16] M. Potkonjak, S. Dey, Z. Iqbal, and A. C. Parker, ``High performance embedded system optimization using algebraic and generalized retiming techniques,'' in Proc. IEEE Int. Conf. on Computer Design, pp. 498--504, 1993. [17] N. L. Passos, E. H.-M. Sha, and S. C. Bass, ``Optimizing DSP flow graphs via schedule-based multidimensional retiming,'' IEEE Trans on Signal Processing, vol. 44, pp. 150--155, Jan. 1996. [18] N. L. Passos and E. H.-M. Sha, ``Achieving full parallelism using multidimensional retiming,'' IEEE Trans on Parallel and Distributed Systems, vol. 7, pp. 1150--1163, Nov. 1996. [19] N. L. Passos and E. H.-M. Sha, ``Synchronous circuit optimization via multidimensional retiming,'' IEEE Trans. on Circuits and Systems II: Analog and Digital Signal Processing, vol. 43, pp. 507--519, July 1996. [20] L.-F. Chao and E. H.-M. Sha, ``Scheduling data-flow graphs via retiming and unfolding,'' IEEE Trans on Parallel and Distributed Systems, vol. 8, pp. 1259--1267, Dec. 1997. [21] T. C. Denk and K. K. Parhi, ``Synthesis of folded pipelined architectures for multirate DSP algorithms,'' IEEE Trans. on Very Large Scale Integration (VLSI) Systems, vol. 6, pp. 695--607, Dec. 1998. [22] M. Sheliga, Z. Yu, F. Chen, and E. H.-M. Sha, ``Graph transformation for communication minimization using retiming,'' in Proc. IEEE Int. Symp. on Circuits and Systems, pp. 207--210, 1998. [23] T. C. Denk and K. K. Parhi, ``Exhaustive scheduling and retiming of digital signal processing systems,'' IEEE Trans. on Circuits and Systems II: Analog and Digital Signal Processing, vol. 45, pp. 821--837, July 1998. [24] K. Kagaris and S. Tragoudas, ``Partial scan with retiming,'' in Proc. Design Automation Conf, pp. 249--254, 1993. [25] P. Pan and C. L. Liu, ``Partial scan with preselected scan signals,'' IEEE Trans. on Computers, vol. 48, pp. 1000--1005, Sept. 1999. [26] S. T. Chakradhar and S. Dey, ``Resynthesis and retiming for optimal partial scan,'' IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol. 18, pp. 621--630, May 1999. [27] A. H. El-Maleh and Y. E. Osais, ``A retiming-based test pattern generator design for built-in self test of data path architectures,'' in Proc. IEEE Int. Symp. on Circuits and Systems, pp. 550--553, 2001. [28] A. El-Maleh, T. E. Marchok, J. Rajski, and W. Maly, ``Behavior and testability preservation under the retiming transformation,'' IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol. 16, pp. 528--543, May 1998. [29] G. Hasteer, A. Mathur, and P. Bannerjee, ``Efficient equivalence checking of multi-phase designs using phase abstraction and retiming,'' ACM Trans. on Design Automation of Electronics Systems, vol. 3, pp. 600--625, Oct. 1998. [30] J. Cong and C. Wu, ``An efficient algorithm for performance-optimal FPGA technology mapping with retiming,'' IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol. 17, pp. 738--748, Sept. 1998. [31] J. Cong and C. Wu, ``Optimal FPGA mapping and retiming with efficient initial state computation,'' IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol. 18, pp. 1595--1607, Nov. 1999. [32] P. Pan and C. L. Liu, ``Optimal clock period FPGA technology mapping for sequential circuits,'' ACM Trans. on Design Automation of Electronics Systems, vol. 3, pp. 437--462, July 1998. [33] P.-Y. Calland, A. Darte, and Y. Robert, ``Circuit retiming applied to decomposed software pipelining,'' IEEE Trans on Parallel and Distributed Systems, vol. 9, pp. 24--35, Jan. 1998. [34] J. Cong, S. K. Lim, and C. Wu, ``Performance driven multi-level and multiway partitioning with retiming,'' in Proc. Design Automation Conf., pp. 274--279, 2000. [35] J. Cong and S. K. Lim, ``Physical planning with retiming,'' in Proc. Int. Conf. on Computer Aided Design, pp. 2--7, 2000. [36] J. Cong, H. Li, and C. Wu, ``Simultaneous circuit partitioning/clustering with retiming for performance optimization,'' in Proc. Design Automation Conf., pp. 460--465, 1999.