Clock Mesh Network Design with Through-Silicon Vias in 3D Integrated Circuits

ETRI Journal. 2014 Dec; 36(6): 931-941

- Received : November 28, 2013
- Accepted : May 17, 2014
- Published : December 01, 2014


Many methodologies for clock mesh networks have been introduced for two-dimensional integrated circuit clock distribution networks, such as methods to reduce the total wirelength for power consumption and to reduce the clock skew variation through consideration of buffer placement and sizing. In this paper, we present a methodology for clock mesh to reduce both the clock skew and the total wirelength in three-dimensional integrated circuits. To reduce the total wirelength, we construct a smaller mesh size on a die where the clock source is not directly connected. We also insert through-silicon vias (TSVs) to distribute the clock signal using an effective clock TSV insertion algorithm, which can reduce the total wirelength on each die. The results of our proposed methods show that the total wirelength was reduced by 12.2%, the clock skew by 16.11%, and the clock skew variation by 11.74%, on average. These advantages are possible through increasing the buffer area by 2.49% on the benchmark circuits.
In a 2D clock mesh network, the clock signal is propagated to the clock sinks in the $x$- and $y$-directions. However, in a 3D clock mesh network, the design is more complicated because the clock signal is propagated to the clock sinks in three directions (the $x$-, $y$-, and $z$-directions) using clock TSVs. Conventional 2D clock mesh network methods [12]–[22] do not consider clock TSV insertion for propagating the clock signal across multiple dies. When a conventional 2D clock mesh network is directly applied to a 3D IC by regularly inserting clock TSVs at the nodes of the mesh, low skew variation is guaranteed but the wirelength increases. The wirelength is directly affected by the position and number of clock TSVs, and any unnecessary increase in wirelength can cause thermal issues in a 3D IC consisting of multiple vertically stacked dies. In addition, if a conventional 2D clock mesh network is directly applied to a 3D IC without TSV insertion, the reliability of the chip decreases. This approach also does not guarantee the global clock skew variation and leads to a decrease in chip performance. For these reasons, it is necessary to study 3D ICs consisting of 3D clock mesh networks with clock TSVs.
In this work, we present an effective method to reduce the clock skew and wirelength using clock TSVs in a 3D clock mesh network.
The contributions of this paper are as follows:
The rest of the paper is organized as follows. In Section II, we discuss the preliminary research and background for the 2D clock mesh design. In Section III, the proposed 3D clock mesh design methodology is explained in detail. In Section IV, the simulation results are presented. The paper ends with concluding remarks in Section V.
The total wirelength $L_{\text{total}}$ is calculated as follows:
(1) $${L}_{\text{total}}={L}_{\text{top}}+{L}_{\text{mesh}}+{L}_{\text{stub}},$$
where $L_{\text{top}}$ is the wirelength of the top-level tree, $L_{\text{mesh}}$ is the wirelength of the mesh grid wires, and $L_{\text{stub}}$ is the wirelength of the stub wires, which connect the mesh grid to the clock sinks (see Fig. 1). The stub and mesh grid wires affect the power consumption of the circuit: as the wirelength increases, the total capacitance and the power consumption of the circuit increase. Consequently, to reduce power consumption in the clock mesh network, we need to reduce the length of both the stub wires and the mesh grid wires. In this paper, to reduce $L_{\text{mesh}}$, we select the smallest possible mesh size such that the selected mesh has a clock skew lower than the clock skew constraint; clock TSVs are then inserted to reduce $L_{\text{stub}}$.
Clock mesh size.
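The global clock skew $t_{\text{skew}}$ is estimated as follows:
(2) $${t}_{skew}={t}_{skew}^{\text{buf}}+{D}_{\text{mesh}}({d}_{\mathrm{max}})+{D}_{\text{stub}}({L}_{\text{stub}}^{\text{max}}),$$
where $t_{\text{skew}}^{\text{buf}}$ is the skew introduced by the difference between the maximum and minimum delays of the mesh drivers, $D_{\text{mesh}}(d_{\max})$ is the maximum delay from the mesh drivers to the points where the stub wires meet the mesh grid wires, and $D_{\text{stub}}(L_{\text{stub}}^{\max})$ is the maximum delay from those points to the clock sinks. As the mesh becomes denser, $t_{\text{skew}}$ is reduced because $D_{\text{mesh}}(d_{\max})$ and $D_{\text{stub}}(L_{\text{stub}}^{\max})$ are reduced, so the mesh must be chosen by considering both $L_{\text{total}}$ and $t_{\text{skew}}$.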
L
_{stub}
.
Flow of proposed methodology.
A denser clock mesh has a lower $L_{\text{stub}}$. The global clock skew $t_{\text{skew}}$ of a 3D IC is given by
(3) $${t}_{skew}={t}_{skew}^{\text{buf}}+{D}_{\text{mesh}}({d}_{\mathrm{max}})+{D}_{\text{stub}}({L}_{\text{stub}}^{\text{max}})+{D}_{\text{TSV}},$$
where $D_{\text{TSV}}$ is the delay caused by the clock TSVs inserted between die 1 and die 2. The total wirelength $L_{\text{total}}$ is given by
(4) $${L}_{\text{total}}={L}_{\text{top}}+{L}_{\text{mesh}}+{L}_{\text{stub}}+{L}_{\text{TSV}},$$
where $L_{\text{TSV}}$ is the total wirelength of the inserted clock TSVs. We assume that all clock TSVs have the same wirelength. After the clock signal is instantaneously transferred to each mesh node, the mesh that has a lower $L_{\text{mesh}}$ and $L_{\text{stub}}$ also has a lower $t_{\text{skew}}$. However, according to (4), a sparser mesh has a lower total wirelength than a denser mesh. Therefore, we need to select the size of the mesh by considering both the total wirelength and the clock skew.
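The bookkeeping in (1) to (4) can be illustrated with a minimal Python sketch. The class and function names are illustrative assumptions, and `uniform_mesh_wirelength` is a hypothetical simplification (n horizontal and n vertical wires spanning the die) that is not taken from the paper; the numbers are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class ClockMeshEstimate:
    """Wirelength terms of (1) and (4); units are arbitrary but must match."""
    l_top: float         # top-level tree wirelength
    l_mesh: float        # mesh grid wirelength
    l_stub: float        # total stub wirelength
    l_tsv: float = 0.0   # total clock TSV length; 0 reproduces the 2D case (1)

    def total_wirelength(self) -> float:
        return self.l_top + self.l_mesh + self.l_stub + self.l_tsv


def global_skew(t_skew_buf: float, d_mesh_max: float,
                d_stub_max: float, d_tsv: float = 0.0) -> float:
    """Skew estimate of (3); d_tsv = 0 reproduces the 2D estimate (2)."""
    return t_skew_buf + d_mesh_max + d_stub_max + d_tsv


def uniform_mesh_wirelength(n: int, die_w: float, die_h: float) -> float:
    """Hypothetical helper: n horizontal wires of length die_w plus
    n vertical wires of length die_h."""
    return n * die_w + n * die_h


# Toy comparison: only the mesh density differs between the two estimates.
sparse = ClockMeshEstimate(1.0, uniform_mesh_wirelength(2, 10.0, 10.0), 3.0, 4.0)
dense = ClockMeshEstimate(1.0, uniform_mesh_wirelength(4, 10.0, 10.0), 3.0, 4.0)
```

In this toy setting the sparser mesh has the smaller total wirelength, which is the trade-off against skew that the mesh size selection has to balance.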
Many works consider the above elements when selecting the mesh size in a two-dimensional integrated circuit. A method is proposed in [23] to select the minimum mesh size so as to reduce both the wirelength and the clock skew in each voltage domain. In [21], constraints on the clock skew and wirelength are considered when the initial mesh is constructed so that the total wirelength is lower than $L_{\text{const}}$ and the clock skew is lower than $t_{\text{skew}}^{\text{constraint}}$. In this paper, the number of vertical metal wires $m$ and the number of horizontal metal wires $n$ of the clock mesh are assumed to be equal. In physical IC design, a uniform clock mesh is generally preferred since the mesh grid can be placed between uniform power rails to prevent crosstalk [17]. The detailed proposed mesh size–selection algorithm is presented in Fig. 3.
Pseudocode for mesh size selection procedure.
The proposed mesh size–selection method selects the mesh size with the minimum total wirelength among the sizes whose clock skew is lower than the clock skew constraint. The inputs of the proposed mesh size–selection algorithm are the locations of the clock sinks on each die, the clock skew constraint, and the maximum number of TSVs ($\text{TSV}_{\max}$). We select the minimum candidate mesh size so as to obtain the minimum total wirelength (line 2 of Fig. 3). The delay from the clock source to the clock sinks on each die is calculated using (5) and (6) with the Elmore delay model (line 3 of Fig. 3). The delay $D_{\text{sink}\in\text{die}1}$ from the clock source to a clock sink located on die 1, which is directly connected to the clock source, is represented by (5) using Elmore delay modeling.
(5) $${D}_{\text{sink}\in \text{die}1}={R}_{\text{d}}\left({C}_{\text{w}}+{C}_{\text{s}}\right)+{R}_{\text{w}}\left(\frac{{C}_{\text{w}}}{2}+{C}_{\text{s}}\right)+{D}_{\text{sd}}+{D}_{\text{d}}.$$
In (5), $R_{\text{d}}$ and $R_{\text{w}}$ are the resistance of the mesh driver and the resistance of the wire, respectively. The variables $D_{\text{d}}$, $D_{\text{sd}}$, $C_{\text{w}}$, and $C_{\text{s}}$ are the intrinsic delay of the mesh driver, the delay from the clock source to the mesh driver, the capacitance of the wire, and the capacitance of the clock sink, respectively. The delay $D_{\text{sink}\in\text{die}2}$ from the clock source to a clock sink located on die 2, where the clock signal is transferred from the clock source to the clock sink through the clock TSVs, is represented by (6) using Elmore delay modeling.
(6) $$\begin{array}{l}{D}_{\text{sink}\in \text{die2}}={R}_{\text{d}}({C}_{\text{TSV}}+{C}_{\text{d}})+{R}_{\text{TSV}}\left(\frac{{C}_{\text{TSV}}}{2}+{C}_{\text{Tb}}\right)\\ \qquad+{D}_{\text{Tb}}+{R}_{\text{Tb}}({C}_{\text{w}}+{C}_{\text{s}})+{R}_{\text{w}}\left(\frac{{C}_{\text{w}}}{2}+{C}_{\text{s}}\right)\\ \qquad+{D}_{\text{sd}}+{D}_{\text{d}}.\end{array}$$
The variables $R_{\text{TSV}}$ and $R_{\text{Tb}}$ are the resistance of the clock TSV and the resistance of the clock TSV buffer, respectively. The variables $C_{\text{TSV}}$, $C_{\text{d}}$, $C_{\text{Tb}}$, and $D_{\text{Tb}}$ are the capacitance of the clock TSV, the capacitance of the mesh driver, the capacitance of the clock TSV buffer, and the intrinsic delay of the clock TSV buffer, respectively. The delay models of the wire, clock TSV, and clock buffer are shown in Fig. 4.
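As a minimal sketch, (5) and (6) can be evaluated directly once the element values are known. The container `DelayParams` and the function names are illustrative assumptions; the terms follow (5) and (6) as printed, and the unit values are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class DelayParams:
    """Element values used in (5) and (6); names mirror the paper's symbols."""
    r_d: float    # mesh driver resistance
    c_d: float    # mesh driver capacitance
    d_d: float    # intrinsic delay of the mesh driver
    d_sd: float   # delay from the clock source to the mesh driver
    r_w: float    # wire resistance
    c_w: float    # wire capacitance
    c_s: float    # clock sink capacitance
    r_tsv: float  # clock TSV resistance
    c_tsv: float  # clock TSV capacitance
    r_tb: float   # clock TSV buffer resistance
    c_tb: float   # clock TSV buffer capacitance
    d_tb: float   # intrinsic delay of the clock TSV buffer


def delay_sink_die1(p: DelayParams) -> float:
    """Elmore delay to a sink on die 1, following (5)."""
    return (p.r_d * (p.c_w + p.c_s)
            + p.r_w * (p.c_w / 2 + p.c_s)
            + p.d_sd + p.d_d)


def delay_sink_die2(p: DelayParams) -> float:
    """Elmore delay to a sink on die 2 through a clock TSV, following (6)."""
    return (p.r_d * (p.c_tsv + p.c_d)
            + p.r_tsv * (p.c_tsv / 2 + p.c_tb)
            + p.d_tb
            + p.r_tb * (p.c_w + p.c_s)
            + p.r_w * (p.c_w / 2 + p.c_s)
            + p.d_sd + p.d_d)


# Arbitrary unit values, only to exercise the two formulas.
unit = DelayParams(*([1.0] * 12))
```

With every element set to 1, the die 2 path is slower than the die 1 path because it adds the TSV and TSV buffer stages.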
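To calculate the delay from the mesh driver on die 1 to the clock sinks on dies 1 and 2, we use the delay models illustrated in Fig. 4. If the calculated $t_{\text{skew}}$ is higher than $t_{\text{skew}}^{\text{constraint}}$, clock TSVs are inserted to reduce $t_{\text{skew}}$ (lines 4–7). The insertion of the clock TSVs is performed using Algorithm 2 (introduced in Section III-4). If $t_{\text{skew}}$ is still higher than $t_{\text{skew}}^{\text{constraint}}$ after the maximum number of TSVs has been inserted, the mesh size is increased by one and the process is repeated.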
Delay models of (a) wire, (b) clock TSV, and (c) clock buffer.
($x_s$, $y_s$, $z_s$) on die 2 (see Fig. 5(a)). The candidate location of a clock TSV ($x_k$, $y_k$, $z_s$) is determined so as to be on the same die as the target sink. The detailed procedure for inserting a clock TSV is as follows. The variables $i$ and $n$ are the step number and the number of horizontal (and also vertical) metal wires, respectively.
Clock TSV insertion with three-step search on die 2.
Pseudocode for TSV insertion and local mesh sizing.
A. Step 1 ($i = 1$)
We select the center of each mesh grid cell as a candidate location ($x_k$, $y_k$, $z_s$) for the insertion of a clock TSV (line 1). To insert a clock TSV close to the target sink ($x_s$, $y_s$, $z_s$), the distances between the candidate locations and the location of the target sink are compared using (line 2)
(7) $$\begin{array}{l}\text{dist}=\Vert ({x}_{s}-{x}_{k}),\ ({y}_{s}-{y}_{k}),\ ({z}_{s}-{z}_{k})\Vert \\ \left\{s,\ k\right\}\in N,\quad N=\left\{1,\ 2,\ \dots,\ {n}^{2}\right\}.\end{array}$$
As the distance between ($x_k$, $y_k$, $z_s$) and ($x_s$, $y_s$, $z_s$) decreases, the wirelength between the target sink and the clock TSV also decreases. We select the candidate location ($x_i$, $y_i$, $z_s$) for the insertion of a clock TSV that gives the lowest value of (7) (line 4).
B. Step 2 ($i = 2$)
The mesh grid cell selected in Section III-4-A is bisected in both the vertical and horizontal directions at the candidate location found in Section III-4-A. The center of each of the resulting four regions then becomes a new candidate location for a clock TSV ($x_k$, $y_k$, $z_s$); these candidates are obtained using
(8) $$\begin{array}{l}({x}_{k},\ {y}_{k},\ {z}_{k})=({x}_{i-1},\ {y}_{i-1},\ {z}_{s})-{M}_{T}\frac{1}{n}\left(\frac{x}{{2}^{i}},\ \frac{y}{{2}^{i}},\ 0\right),\\ \qquad{t}_{0}=(1,\ 1,\ 0),\\ \qquad{t}_{1}=(1,\ -1,\ 0).\end{array}$$
In (8), $T$ is the set consisting of the center locations $t_0$, $t_1$, $t_2$, and $t_3$ of the four regions obtained by dividing the cell in both the vertical and horizontal directions at the clock TSV location ($x_{i-1}$, $y_{i-1}$, $z_s$). The location of the clock TSV ($x_i$, $y_i$, $z_s$) in Section III-4-B is the candidate with the lowest value of (7) (lines 7–8 and 4).
C. Step 3 ($i = 3$)
In this step, the process performed in Section III-4-B is repeated. The candidate location of the clock TSV ($x_3$, $y_3$, $z_3$) selected in this step may not be the closest location to the target sink. In other words, the distance from the target sink to the location selected in this step may be larger than the corresponding distances found in the previous steps. Therefore, to insert the clock TSV at the closest possible location to the target sink, the results of all three steps are compared (lines 11–12). When the clock TSV is vertically inserted, the clock signal is propagated from die 1 to die 2 through it. After the clock TSV is connected to a mesh node on die 1, the clock signal is transferred simultaneously to both the mesh node where the clock TSV is connected and the other mesh nodes on die 2. However, the proposed clock TSV insertion algorithm may not place the clock TSV at the position of a mesh node on die 1. To solve this issue, we propose a method to locally increase the size of the mesh on die 1. When the size of the mesh is locally increased, mesh nodes are added, and the clock TSV is inserted at the added mesh node on die 1 (line 13). The method for locally increasing the size of the mesh on die 1 is described in Fig. 7.
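The three-step search described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name and arguments are hypothetical, the die is taken as a rectangle with an n-by-n grid of cells, the distance of (7) is evaluated in-plane because the candidate lies on the same die as the sink, and the sign patterns for the four quadrant centers (including those of $t_2$ and $t_3$, which are not printed in (8)) are assumed by symmetry.

```python
import math


def three_step_tsv_search(sink, die_w, die_h, n, steps=3):
    """Return (best_location, per_step_choices) for one target sink.

    sink  : (x_s, y_s) of the target sink on die 2
    die_w : die width, die_h : die height, n : wires per direction
    """
    sx, sy = sink
    cell_w, cell_h = die_w / n, die_h / n

    def dist(p):
        # In-plane form of (7); z_s - z_k is zero on the same die.
        return math.hypot(sx - p[0], sy - p[1])

    # Step 1: the center of every mesh grid cell is a candidate.
    candidates = [((c + 0.5) * cell_w, (r + 0.5) * cell_h)
                  for r in range(n) for c in range(n)]
    best = min(candidates, key=dist)
    choices = [best]

    # Steps 2..steps: centers of the four regions around the previous choice.
    signs = ((1, 1), (1, -1), (-1, 1), (-1, -1))  # assumed t_0 .. t_3
    for i in range(2, steps + 1):
        dx, dy = die_w / (n * 2 ** i), die_h / (n * 2 ** i)
        quadrant_centers = [(best[0] - s * dx, best[1] - t * dy)
                            for s, t in signs]
        best = min(quadrant_centers, key=dist)
        choices.append(best)

    # The last step is not necessarily the closest, so compare all steps.
    return min(choices, key=dist), choices


location, per_step = three_step_tsv_search((1.0, 1.0), 8.0, 8.0, n=2)
```

In this toy case the step 2 candidate coincides with the sink while step 3 moves away from it, so the final comparison across all steps is what returns the closest location.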
The vertical and horizontal mesh wires are generated at the clock TSV location on die 1 and extend in the vertical and horizontal directions until they connect to the uniform mesh grid wires. Figure 7(a) shows the location of a clock TSV determined in step 1, and Figs. 7(b) and 7(c) show the locally increased size of the mesh on die 1 when a clock TSV is inserted.
Figure 8 shows an example of the increased size of the mesh on die 1. The clock TSVs that are inserted close to the target sinks by Algorithm 2 affect the delay from the clock source not only to the target sink but also to the other clock sinks. Therefore, the clock skews are recalculated using (5) and (6). The clock TSV insertion is repeated until the clock skew is less than the clock skew constraint. The iteration also terminates when the maximum number of clock TSVs ($\text{TSV}_{\max}$) has been used.
Size of the mesh on die 1 is locally increased by inserting the clock TSV: (a) locations of a clock TSV as determined in steps 1–3, and (b)–(d) size of the mesh locally increased at die 1.
Figure 8(a) shows the inserted clock TSVs on the clock mesh. Figure 8(b) shows an example of a clock TSV on die 2 that connects not only to the target sink but also to other clock sinks; a clock sink is connected to the clock TSV if the clock TSV is closer to it than the nearby mesh nodes.
3D clock mesh with clock TSVs: (a) clock TSV insertion between die 1 and die 2, and (b) clock sinks connected to a clock TSV on die 2 and the locally increased size of the mesh at die 1.
$x$ and that the distance between each buffer is $y$. However, the delay equations proposed in [25] are complex and have many variables. In [24], the authors assume that the distance between each buffer is $x$; therefore, the delay equations are simpler than those in [25]. In this paper, we propose a buffer assignment method that finds the optimized number of buffers using the simple delay equation presented in [24], which can reduce the delay. We assume that all of the wire parameters, such as the capacitance per unit length and the resistance per unit length, are the same, although these parameters in fact differ. We assume that $k$ buffers are inserted between the clock sink and either the clock TSV on die 2 or the mesh driver on die 1. The buffers inserted on the highly delayed wire are selected from the buffer library. To find $k$, the minimum delay $D_{\text{m}}$ is obtained by
(9) $$\begin{array}{l}{D}_{\text{m}}={R}_{\text{Tb}}\left({c}_{\text{w}}x+{C}_{\text{b}}\right)+{D}_{\text{Tb}}+\frac{1}{2}{r}_{\text{w}}{c}_{\text{w}}{x}^{2}+{r}_{\text{w}}x{C}_{\text{b}}\\ \qquad+\left(k-1\right)\left[{R}_{\text{b}}\left({c}_{\text{w}}x+{C}_{\text{b}}\right)+{D}_{\text{b}}+\frac{1}{2}{r}_{\text{w}}{c}_{\text{w}}{x}^{2}+{r}_{\text{w}}x{C}_{\text{b}}\right]\\ \qquad+{R}_{\text{b}}\left[{c}_{\text{w}}\left(L-kx\right)+{C}_{\text{s}}\right]+{D}_{\text{b}}+\frac{1}{2}{r}_{\text{w}}{c}_{\text{w}}{\left(L-kx\right)}^{2}\\ \qquad+{r}_{\text{w}}\left(L-kx\right){C}_{\text{s}}.\end{array}$$
In (9), $R_{\text{Tb}}$, $R_{\text{b}}$, $r_{\text{w}}$, and $c_{\text{w}}$ are the resistance of the clock TSV buffer, the resistance of the buffer, the resistance per unit length of wire, and the capacitance per unit length of wire, respectively. The variables $C_{\text{b}}$, $C_{\text{s}}$, $D_{\text{Tb}}$, and $D_{\text{b}}$ are the capacitance of the clock buffer, the capacitance of the clock sink, the intrinsic delay of the clock TSV buffer, and the intrinsic delay of the clock buffer, respectively. We assume that the clock buffers are inserted at the positions that divide the wirelength $L$ from a clock TSV buffer to a clock sink into $k + 1$ equal segments. According to this assumption, the distance between adjacent buffers is $x = L(k + 1)^{-1}$. Substituting $x$ into (9), the delay $D_{\text{m}}$ is represented as follows:
(10) $${D}_{\text{m}}=\left(k+1\right)\left[{R}_{\text{b}}\left({C}_{\text{w}}\frac{1}{k+1}+{C}_{\text{b}}\right)+{D}_{\text{b}}+\frac{{r}_{\text{w}}{c}_{\text{w}}{L}^{2}}{2{\left(k+1\right)}^{2}}+\frac{{r}_{\text{w}}{C}_{\text{b}}L}{k+1}\right].$$
According to the method in [25], inserting a buffer is effective when the delay with the inserted buffer is less than the delay without it. This condition is generalized as follows:
(11) $${D}_{\text{m}}\left(k-1\right)>{D}_{\text{m}}\left(k\right).$$
Substituting (10) into (11) gives the following:
(12) $${R}_{\text{b}}{C}_{\text{b}}+{D}_{\text{b}}+\frac{{r}_{\text{w}}{c}_{\text{w}}{L}^{2}}{2(k+1)}-\frac{{r}_{\text{w}}{c}_{\text{w}}{L}^{2}}{2k}<0.$$
Rearranging (12) in terms of $k$, we have
(13) $${k}^{2}+k-\frac{{r}_{\text{w}}{c}_{\text{w}}{L}^{2}}{2\left({R}_{\text{b}}{C}_{\text{b}}+{D}_{\text{b}}\right)}<0.$$
Then, the solution of (13) is as follows:
(14) $$k<\frac{-1+\sqrt{1+2\frac{{r}_{\text{w}}{c}_{\text{w}}{L}^{2}}{({R}_{\text{b}}{C}_{\text{b}}+{D}_{\text{b}})}}}{2}.$$
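The steps from (10) to (14) can be spelled out as a short worked derivation. Here $K$ is shorthand introduced only for this sketch; it collects the terms of (10) that do not depend on $k$ and therefore cancel in the difference.

```latex
\begin{align*}
D_{\text{m}}(k) &= R_{\text{b}} C_{\text{w}} + (k+1)\left(R_{\text{b}} C_{\text{b}} + D_{\text{b}}\right)
                 + \frac{r_{\text{w}} c_{\text{w}} L^{2}}{2(k+1)} + K, \\
D_{\text{m}}(k) - D_{\text{m}}(k-1) &= R_{\text{b}} C_{\text{b}} + D_{\text{b}}
                 + \frac{r_{\text{w}} c_{\text{w}} L^{2}}{2(k+1)}
                 - \frac{r_{\text{w}} c_{\text{w}} L^{2}}{2k} < 0, \\
R_{\text{b}} C_{\text{b}} + D_{\text{b}} &< \frac{r_{\text{w}} c_{\text{w}} L^{2}}{2k(k+1)}
  \;\Longrightarrow\;
  k^{2} + k - \frac{r_{\text{w}} c_{\text{w}} L^{2}}{2\left(R_{\text{b}} C_{\text{b}} + D_{\text{b}}\right)} < 0, \\
k &< \frac{-1 + \sqrt{1 + 4 \cdot \dfrac{r_{\text{w}} c_{\text{w}} L^{2}}{2\left(R_{\text{b}} C_{\text{b}} + D_{\text{b}}\right)}}}{2}
   = \frac{-1 + \sqrt{1 + 2\,\dfrac{r_{\text{w}} c_{\text{w}} L^{2}}{R_{\text{b}} C_{\text{b}} + D_{\text{b}}}}}{2}.
\end{align*}
```

The last line takes the positive root of the quadratic, since $k$ is a non-negative buffer count.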
We obtain the optimal number of buffers $k$ with the minimum delay $D_{\text{m}}$ by using (14). The clock skew $t_{\text{skew}}$ is then calculated. If the calculated clock skew is greater than the clock skew constraint, then the above procedure is repeated for the highly delayed wire until the clock skew is lower than the clock skew constraint.
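A minimal Python sketch of the buffer count selection is given below. The function names are illustrative assumptions. The delay function evaluates one of $k+1$ identical segments of length $L/(k+1)$ and multiplies by $k+1$, which assumes identical buffers and a sink capacitance equal to the buffer capacitance, as the per-segment form of (10) implies; the numeric values are arbitrary.

```python
import math


def max_buffer_count(r_w, c_w, length, r_b, c_b, d_b):
    """Largest integer k that strictly satisfies the bound in (14)."""
    ratio = r_w * c_w * length ** 2 / (r_b * c_b + d_b)
    bound = (-1 + math.sqrt(1 + 2 * ratio)) / 2
    # Largest integer strictly below the bound, never negative.
    return max(math.ceil(bound) - 1, 0)


def buffered_delay(k, r_w, c_w, length, r_b, c_b, d_b):
    """Delay of a wire split into k + 1 equal buffered segments."""
    x = length / (k + 1)  # distance between adjacent buffers
    segment = (r_b * (c_w * x + c_b) + d_b
               + 0.5 * r_w * c_w * x ** 2
               + r_w * x * c_b)
    return (k + 1) * segment


# Arbitrary unit values: r_w = c_w = r_b = c_b = d_b = 1, L = 10.
args = (1.0, 1.0, 10.0, 1.0, 1.0, 1.0)
k_opt = max_buffer_count(*args)
```

For these values the bound in (14) evaluates to roughly 4.5, so four buffers are selected, and the delay with four buffers is lower than with three or five.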
“$\mu_{\text{skew}}$” for the mean deviation of the clock skew, and “$\sigma_{\text{skew}}$” for the standard deviation of the clock skew. The parameters $\mu_{\text{skew}}$ and $\sigma_{\text{skew}}$ are obtained through a Monte Carlo simulation in HSPICE. The columns under “Ratio” give the values relative to “Proposal.” For the ISCAS benchmarks, the proposed buffer assignment method, denoted by “[21]_EX with buffer,” reduces the clock skew by 11.31% and the clock skew variation by 30.02%, on average, compared to the method in [21]. However, the buffer area generally increases when clock buffers are inserted. Our proposed buffer assignment method increases the buffer area by 8.92%, on average.

The method with iterative deletion of buffers, which is proposed in [22], shows better performance than our buffer insertion algorithm. However, our proposed method with Algorithms 1 and 2, as well as the buffer assignment, shows better performance in terms of buffer area, wirelength, and clock skew variation. Table 2 shows the results of the 3D clock mesh network on the ISPD 2010 benchmarks. Our proposed algorithm reduces the clock skew by 4.2% and the wirelength by 12.2%, on average, compared to the method in [22] extended to the 3D clock mesh network. Table 3 shows the effects of variation on the top-level tree, modeled through the input arrival time of the mesh drivers. The additional parameter in Table 3 is the skew variation, which represents the clock skew difference between the mesh drivers. We obtained the reduced total wirelength and clock skew shown in Table 1 and Table 2. These results show the effectiveness of our proposed methods. When we assign the variation of skew to the buffers constructing the top-level tree, the buffer area is increased by approximately 2.48%, on average, compared with the conventional method in [21] extended to 3D. However, in exchange for this 2.48% increase in buffer area, our methods reduce the total wirelength by 12.2%, the clock skew by 16.11%, and the clock skew variation by 11.74%, on average.

This work was supported by the MSIP (Ministry of Science, ICT & Future Planning), Rep. of Korea, under the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency), (NIPA-2013-H0301-13-1011).
ruddls1116@gmail.com
Kyungin Cho received his BS degree in electronics engineering from the school of Electronics Computer Engineering, Inha University, Incheon, Rep. of Korea, in 2010. From 2010 to 2012, he was a researcher at the HE Department of LG Electronics, Pyeongtaek, Rep. of Korea. Since 2013, he has been with the Department of Electronics Computer Engineering, Hanyang University, Seoul, Rep. of Korea, where he is currently pursuing his MS degree. His main research interest is SoC design methodology, including the physical design and automation of 3D ICs.
jangcj@hanyang.ac.kr
Cheoljon Jang received his BS and MS degrees in electronics and computer engineering from the school of Electronics Computer Engineering, Hanyang University, Seoul, Rep. of Korea, in 2011 and 2013, respectively. He is currently pursuing his PhD degree in nanoscale semiconductor engineering at Hanyang University. His main research interest is SoC design methodology, including the physical design and automation of 3D ICs.
Corresponding Author jchong@hanyang.ac.kr
Jong-wha Chong received his BS and MS degrees in electronics engineering from Hanyang University, Seoul, Rep. of Korea, in 1975 and 1979, respectively and his PhD degree in electronics and communication engineering from Waseda University, Shinjuku-ku, Tokyo, Japan, in 1981. Since 1981, he has been a professor of the Department of Electronics Engineering, Hanyang University. From 1979 to 1980, he was a researcher at the C&C Research Center of Nippon Electronics Company, Shiba Minato, Tokyo, Japan. From 1983 to 1984, he was a visiting researcher at the Korean Institute of Electronics & Technology, Seongnam, Rep. of Korea. Between 1986 and 2008, he was a visiting professor at the University of California, Berkeley, USA. He was the chairman of the CAD & VLSI society of the Institute of the Electronic Engineers of Korea in 1993. In 2007, he was the president of the IEEK, and from 2009 to 2010, he was the president of the KIEEE. He is currently the chairman of the Fusion SoC Forum. His main research interests are SoC design methodology; including memory centric design and the physical design and automation of 3D ICs; indoor wireless communication SoC design for ranging and location; video systems; and power IT systems.

I. Introduction

CMOS process technology continues to scale down, and as integrated circuit designs become more complex, the ability to optimize performance is fast approaching its limit. Three-dimensional integrated circuits (3D ICs) using through-silicon vias (TSVs) have attracted considerable attention as an important technology for continuing the scaling trend that Moore predicted [1]. A 3D IC connects multiple dies that are vertically stacked on top of each other. TSVs, which are vias that pass through the dies, are used to connect the vertically stacked dies. This 3D stacking technology can significantly reduce the mean and maximum wirelength, power consumption, chip area, and signal delay [2]–[3]. The methods of reducing the wirelength with TSVs in 3D ICs can also be applied to clock distribution networks [4]–[6].
A clock distribution network propagates a clock signal from a clock source to clock sinks. The clock skew is defined as the maximum difference in the arrival time of the clock signal from the clock source among all of the clock sinks. A high clock skew directly affects the maximum frequency and timing of circuits, which leads to a degradation of chip performance. According to the International Technology Roadmap for Semiconductors projection [7], the clock skew is generally required to be less than 3% to 4% of a clock period in a clock network design. To reduce the clock skew variation, many methods have been studied, including variation-aware buffer and wire sizing [8], variation-aware routing [9], link insertion in clock trees [10], and leaf-level mesh [11]–[12]. Among these methods, the leaf-level mesh with a top-level tree has demonstrated highly effective results in several commercial chips [11]. The clock mesh network connects the clock sinks to a grid of intersecting vertical and horizontal metal wires. The clock signal is distributed from a clock source through a top-level tree to the clock sinks via the mesh drivers located at the intersections of the vertical and horizontal metal wires. The clock mesh network requires more resources, including wirelength and power, than the clock tree network [11]. In addition, in a 3D clock mesh network, the wirelength increases in proportion to the number of dies. For this reason, clock trees are more actively researched than clock meshes.
However, a clock mesh network guarantees a global skew variation, while a clock tree network guarantees a local skew variation. Well known for having a low global skew variation, the clock mesh network is mainly used in high-performance microprocessor design. With this in mind, research on 3D clock mesh networks focuses on reducing resources such as wirelength and power.
Because of the advantages of a clock mesh network, many studies related to mesh synthesis and optimization have been conducted [12]–[20]. The work of [13] proposes removing only the mesh wires that do not significantly impact the clock skew of the mesh. The authors in [13] also suggest a method of buffer placement and sizing. The work of [14] connects clock sinks to mesh wires using a Steiner tree, instead of connecting each clock sink individually, in an effort to reduce the stub wires of the mesh. A method is proposed in [12] to determine the buffer driver insertion and sizing and to remove mesh wires to reduce power consumption. In [15], the authors consider the timing delay on the paths of the combinational circuits to determine the size of the initial mesh when the clock mesh is constructed. A non-uniform clock mesh grid is proposed in [16]–[17] to reduce power consumption. A method is proposed in [18] to generate the clock mesh grid wires using an integer linear programming formulation to minimize the wirelength of the mesh. The works of [19] and [20] propose methods to simultaneously reduce the wires of a mesh grid and the stub wires of the mesh by placing the mesh grid wires close to the clock sinks. A method is proposed in [21]–[22] to determine the initial mesh size by considering the clock skew and wirelength; these works also propose a method for buffer placement, sizing, and wire sizing. In these works on 2D clock mesh networks, the clock signal is propagated to the clock sinks only in the $x$- and $y$-directions.
- We propose a method to select the size of the mesh on each die. By inserting clock TSVs, the proposed mesh size selection algorithm achieves a clock skew lower than the clock skew constraint. Additionally, the mesh size selection algorithm can reduce the total clock mesh wirelength by decreasing the wirelength of the clock mesh grid.
- We present a method to insert clock TSVs close to clock sinks. The proposed clock TSV insertion and local mesh sizing methods achieve a low clock skew variation, although the mesh is sparse. This is possible since the clock TSVs are inserted close to the clock sinks.
- We suggest a clock buffer assignment method to reduce the clock skew variation caused by an imbalanced stub wirelength. The proposed buffer assignment method can reduce the clock skew variation by inserting buffers at regular intervals on the wires that go from the mesh driver to the clock sink. Additionally, the proposed buffer assignment method selects the number of buffers that can minimally reduce the clock skew.

II. Preliminary Research and Background of 2D Clock Mesh Network Design

The most important elements in clock mesh design are the wirelength and clock skew. In a clock mesh network, because of a wire’s resistance and capacitance, wirelength affects power consumption. A high clock skew degrades the maximum operating frequency of the chip and causes a signal timing issue. We present stub and mesh grid wires, which are affected by the mesh size in a 2D clock mesh network (see Section II-1). In Section II-2, clock skew and the size of the mesh in a 2D clock mesh network are discussed.
- 1. Wirelength in a 2D Clock Mesh Network

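In a 2D clock mesh network, the total wirelength $L_{\text{total}}$ is calculated by (1). The selected mesh must have a clock skew lower than the clock skew constraint $t_{\text{skew}}^{\text{constraint}}$. We simply insert clock TSVs to reduce $L_{\text{stub}}$.
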
- 2. Clock Skew in 2D Clock Mesh Networks

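In [13], the global clock skew $t_{\text{skew}}$ is estimated by (2), where $t_{\text{skew}}^{\text{buf}}$ is the skew introduced by the difference between the maximum and minimum delays of the mesh drivers, $D_{\text{mesh}}(d_{\max})$ is the maximum delay from the mesh drivers to the points where the stub wires meet the mesh grid wires, and $D_{\text{stub}}(L_{\text{stub}}^{\max})$ is the maximum delay from the points where the stub wires meet the mesh grid wires to the clock sinks. In (2), $D_{\text{mesh}}(d_{\max})$ and $D_{\text{stub}}(L_{\text{stub}}^{\max})$ must be reduced to decrease the clock skew. As we move to the right in Fig. 1, the size of the mesh increases and $t_{\text{skew}}$ is reduced because $D_{\text{mesh}}(d_{\max})$ and $D_{\text{stub}}(L_{\text{stub}}^{\max})$ are reduced. However, as the size of the mesh increases, the number of horizontal and vertical metal wires increases. Therefore, in this paper, we suggest a method to construct a clock mesh network that considers both $L_{\text{total}}$ and $t_{\text{skew}}$.
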
III. Proposed Methodology

- 1. Overview

The flow of the proposed method is presented in Fig. 2. The first step is the selection of the mesh size. A uniform mesh size is selected for each die such that the total wirelength is minimal and the clock skew is lower than the clock skew constraint. After the size of the mesh is selected, the buffer-inserted top-level tree is constructed to simultaneously transfer the clock signal from the clock source to all of the clock sinks. A clock TSV is inserted at the center of die 1 to transfer the clock signal from die 1 to die 2 with minimum wirelength. After the initial mesh construction, clock TSVs are inserted and the local mesh-sizing algorithm is performed to reduce D_{stub}(L_{stub}^{max}). In the last step, the buffer assignment, buffers are inserted to reduce the delay that occurs in the imbalanced

- 2. Mesh Size Selection

As mentioned in Section II, the size of the mesh affects both the wirelength and the clock skew. In Fig. 1, it can be seen that the denser the clock mesh, the lower the value of
t skew constraint

.
The proposed mesh size selection algorithm extends the initial mesh sizing method of [21] to a 3D clock mesh network with clock TSVs. To find the mesh size of minimal wirelength, we select a candidate mesh size and insert clock TSVs using our proposed clock TSV insertion method. We then calculate the clock skew and select the mesh size of minimum wirelength. In this paper, we assume that the vertical metal wire

t skew constraint

, then clock TSVs are inserted to reduce
t skew constraint

after the maximum number of TSVs is inserted, then the above process is repeated by increasing the size of the mesh by one (lines 8–10). During this iteration, the size of the mesh is selected when the clock skew is less than the clock skew constraint set by the designer (lines 12–13).
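The iteration described above can be sketched as a simple search loop. This is a minimal illustration under stated assumptions, not the paper's algorithm listing: `estimate_skew` is a caller-supplied, hypothetical skew model, and `toy_skew`, `start_size`, and `max_size` are invented for the example. The sketch also assumes that wirelength grows with mesh size, so the first mesh size that meets the constraint is the one with the least wirelength.

```python
def select_mesh_size(estimate_skew, skew_constraint, max_tsvs,
                     start_size=2, max_size=64):
    """Return (mesh_size, num_tsvs) for the first configuration whose
    estimated skew is below the constraint.

    estimate_skew(size, tsvs) is a hypothetical skew model returning ps.
    """
    for size in range(start_size, max_size + 1):
        # Begin with the single initial TSV, then add clock TSVs one at a
        # time up to the allowed maximum for this candidate mesh size.
        for tsvs in range(1, max_tsvs + 1):
            if estimate_skew(size, tsvs) < skew_constraint:
                return size, tsvs
        # Constraint still violated with max_tsvs: grow the mesh by one.
    raise ValueError("no mesh size up to max_size meets the skew constraint")


def toy_skew(size, tsvs):
    # Toy model (assumption for illustration only): skew falls as the mesh
    # gets denser and as more clock TSVs are inserted.
    return 400.0 / size - 5.0 * tsvs


result = select_mesh_size(toy_skew, skew_constraint=50.0, max_tsvs=4)
```

With the toy model and a 50 ps constraint, mesh sizes 2 through 5 fail even with four TSVs, so the loop settles on a mesh size of 6.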

- 3. Initial Mesh Construction

After the candidate mesh size is selected, an initial clock TSV is inserted at the center of die 1 so that the clock signal propagates from die 1 to die 2 with minimum wirelength.
- 4. TSV Insertion and Local Mesh Size

In this step, clock TSVs are inserted on die 1 so that the clock signal is propagated to the clock sinks on die 2. We calculate the delay from the clock source to each clock sink with (5) and (6). To obtain a clock skew value that is less than the clock skew constraint, a clock TSV is inserted near the clock sink on die 2 that has the largest delay from the clock source.
Figure 5 shows the clock TSV insertion process with a three-step search, and Fig. 6 shows the detailed procedure of the clock TSV insertion method.
We assume that the clock sink with the largest delay from the clock source — namely, the target sink — is located at (
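The choice of the target sink in this subsection can be sketched as follows. This is only a minimal illustration and not the three-step search of Fig. 5: it assumes that per-sink delays are already known and that a clock TSV may be placed at any mesh node, and it simply picks the node nearest to the target sink in Manhattan distance. All names, delays, and coordinates are hypothetical.

```python
def pick_tsv_location(sink_delays, sink_positions, mesh_nodes):
    """Return (target_sink, node): the sink with the largest delay and the
    candidate mesh node closest to it in Manhattan distance."""
    # The target sink is the one with the largest source-to-sink delay.
    target = max(sink_delays, key=sink_delays.get)
    tx, ty = sink_positions[target]
    # Choose the candidate node that minimizes the distance to the target.
    node = min(mesh_nodes, key=lambda p: abs(p[0] - tx) + abs(p[1] - ty))
    return target, node


# Hypothetical delays (ps) and positions of three sinks on die 2.
sink_delays = {"s0": 41.0, "s1": 58.5, "s2": 47.2}
sink_positions = {"s0": (10, 10), "s1": (82, 67), "s2": (40, 90)}
# Candidate TSV sites: the nodes of a 3 x 3 mesh on a 100 x 100 die.
mesh_nodes = [(x, y) for x in (0, 50, 100) for y in (0, 50, 100)]

target, node = pick_tsv_location(sink_delays, sink_positions, mesh_nodes)
```

In a complete flow, the delays would be recomputed after each insertion and the step repeated until the skew constraint is met or the maximum number of clock TSVs is reached.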

- 5. Buffer Assignment

The stub wirelength on die 2 is longer than that on die 1 because the size of the mesh on die 1 is larger than that on die 2. As the wirelength increases, the signal delay from the clock source to the clock sinks generally increases because both the capacitance and the resistance of the wire increase [24]. In this step, we propose a method to assign buffers to reduce the delay caused by the longer stub wirelength. Buffer insertion for delay reduction is well studied in [24]–[25]. The method proposed in [25] assumes that the distance from the source to the nearest buffer is
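To illustrate why regularly spaced buffers shorten the delay of a long stub wire, the following sketch uses the Elmore delay of a distributed RC wire, 0.5 * r * c * L^2, which grows quadratically with length. The per-unit resistance and capacitance, the buffer delay, and the function name are hypothetical values chosen for illustration, and the buffer input capacitance and output resistance are ignored. It is not the buffer assignment algorithm of the paper.

```python
def stub_delay_ps(length_um, r_ohm_per_um, c_ff_per_um, n_buffers, t_buf_ps):
    """Delay of a wire split into equal segments by n_buffers buffers,
    using a simplified Elmore model (assumption, see lead-in)."""
    segments = n_buffers + 1
    seg_len = length_um / segments
    # Elmore delay of one distributed RC segment; 1 ohm * 1 fF = 1e-3 ps.
    seg_delay = 0.5 * r_ohm_per_um * c_ff_per_um * seg_len ** 2 * 1e-3
    return segments * seg_delay + n_buffers * t_buf_ps


# Hypothetical 2000 um stub, 0.1 ohm/um, 0.2 fF/um, 5 ps per buffer.
delays = [stub_delay_ps(2000.0, 0.1, 0.2, b, 5.0) for b in range(6)]
# Number of buffers that gives the smallest delay in this toy setting.
best = min(range(6), key=lambda b: delays[b])
```

In this toy setting the unbuffered wire has a delay of about 40 ps and two buffers give the smallest total; adding more buffers beyond that point increases the delay again because the buffer delay dominates.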
IV. Simulation Results

- 1. Design Environment

The algorithms were implemented in C++, and simulations were run on a Linux workstation with 2 GB of RAM. The proposed methods were verified using experiments performed on the ISCAS89 and ISPD 2010 benchmark circuits. The proposed method uses an existing placement result as the input. Because a clock mesh network for a 3D IC has not yet been presented, we compared our results with those of the 2D clock mesh networks in [21] and [22]. The studies in [21] and [22] only constructed a 2D clock mesh network, so we extended their results to a 3D clock mesh network by stacking two identical dies vertically. The nominal skew constraint is 75 ps, which is the same as in [21]. We used 12 different buffer sizes with a maximum-capacitance limit ranging from 60 fF to 300 fF; this is the same as in [21]. We used buffer sizes with a maximum-capacitance limit ranging from 60 fF in the proposed Algorithm 3. The clock skew constraint used in Algorithms 1 and 2 is 50 ps, and the clock skew constraint used in Algorithm 3 is 35 ps. We also used the same 65 nm technology parameters and transistor model as in [21], and a similar set of benchmark circuits. In this paper, the resistance of the clock TSV is 0.053 Ω, and the capacitance of the clock TSV is 27.9 fF. The variation parameters considered are the buffer channel length, the power supply voltage, and the sink load capacitance. These parameters are varied with a 5% standard deviation from their nominal values. We model the effects of the variation in the top-level tree in a similar way to [21], by modeling the input arrival time for the mesh drivers as a random variable. We used a range of ±25 ps for the clock skew between two mesh drivers and used the same slew for all of the mesh buffers. The methods used in the simulation are as follows:
- ▪ In [21] and [22], the clock mesh was constructed only as a 2D clock mesh network. Therefore, we extended the methods of [21] and [22] to a 3D clock mesh network to compare them with our proposed TSV insertion, local mesh sizing, and buffer assignment methods. We construct a 3D clock mesh by vertically stacking two identical dies, each with the 2D clock mesh proposed in [21] and [22]. To connect the two dies, the same number of clock TSVs as in the proposed method were inserted at regular intervals. This approach is denoted by “[21]_EX” and “[22]_EX” in our tables.
- ▪ We construct a 3D clock mesh by applying the proposed Algorithm 3 to [21]_EX to demonstrate the effects of our proposed buffer assignment algorithm. This approach is denoted by “[21]_EX with buffer” in our tables.
- ▪ We run our 3D clock mesh algorithm on the clock mesh obtained from the clock mesh sizing, clock TSV insertion, and buffer assignment. This approach is denoted by “Proposal” in our tables.
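The variation setup described in the design environment can be illustrated with a small Monte Carlo sketch. It is a simplification under stated assumptions and not the simulation used in the paper: each sink is reduced to a single lumped path delay that is perturbed with a 5% standard deviation, the driver arrival offset is drawn from a uniform distribution over ±25 ps (the paper does not state the distribution), and the averaging effect of the mesh is ignored. The function name and nominal delays are hypothetical.

```python
import random
import statistics


def sample_skews(nominal_delays_ps, n_trials=1000, rel_sigma=0.05,
                 arrival_range_ps=25.0, seed=0):
    """Return (mean, standard deviation) of the skew over random trials."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    skews = []
    for _ in range(n_trials):
        arrivals = []
        for nominal in nominal_delays_ps:
            # Gaussian perturbation with a 5% standard deviation.
            delay = rng.gauss(nominal, rel_sigma * nominal)
            # Driver arrival offset; the uniform distribution is an assumption.
            offset = rng.uniform(-arrival_range_ps, arrival_range_ps)
            arrivals.append(delay + offset)
        # Skew of one trial: latest arrival minus earliest arrival.
        skews.append(max(arrivals) - min(arrivals))
    return statistics.mean(skews), statistics.stdev(skews)


# Hypothetical nominal source-to-sink delays in ps.
mean_skew, sigma_skew = sample_skews([100.0, 104.0, 98.0, 101.0])
```

The two returned values correspond to the kind of average skew and skew standard deviation that the result tables report per method.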

- 2. Results

Table 1 shows the results of the 3D clock mesh network with the maximum clock skew of ±50 ps between mesh drivers and a slew of 50 ± 10 ps on ISCAS benchmarks. The parameters in Table 1 are “BA” for the buffer area, “WL” for the total wirelength, “
Comparison of clock mesh with maximum skew of ±50 ps between mesh buffers and a slew of 50±10 ps on ISCAS benchmarks.

| Benchmark (# sinks) | Method | BA (μm^{2}) | BA ratio | WL (μm) | WL ratio | _{skew} (ps) | σ_{skew} (ps) |
|---|---|---|---|---|---|---|---|
| S5378 (165) |  | 63.2 | 1.29 | 65382 | 1.14 | 31.8 | 8.7 |
| S5378 (165) |  | 69.6 | 1.42 | 65382 | 1.14 | 28.7 | 6.1 |
| S5378 (165) |  | 51.4 | 1.05 | 65382 | 1.14 | 30.1 | 7.4 |
| S5378 (165) | Proposal | 48.7 | 1.00 | 57040 | 1.00 | 29.3 | 6.9 |
| S13207 (500) |  | 168.5 | 1.10 | 245858 | 1.16 | 19 | 4.5 |
| S13207 (500) |  | 177.4 | 1.16 | 245858 | 1.16 | 17.1 | 3.8 |
| S13207 (500) |  | 152.6 | 0.99 | 245858 | 1.16 | 18.1 | 4.5 |
| S13207 (500) | Proposal | 152.5 | 1.00 | 210311 | 1.00 | 17.9 | 4.6 |
| S15850 (566) |  | 200.5 | 1.06 | 218550 | 1.23 | 19.2 | 4.0 |
| S15850 (566) |  | 214.1 | 1.13 | 218550 | 1.23 | 17.2 | 3.1 |
| S15850 (566) |  | 189.1 | 1.01 | 218550 | 1.23 | 17.8 | 3.8 |
| S15850 (566) | Proposal | 188.7 | 1.00 | 176445 | 1.00 | 17.8 | 3.5 |
| S35932 (1426) |  | 536.2 | 1.05 | 637424 | 1.13 | 23.7 | 5.1 |
| S35932 (1426) |  | 588.2 | 1.15 | 637424 | 1.13 | 21.0 | 3.0 |
| S35932 (1426) |  | 510.5 | 1.01 | 637424 | 1.13 | 21.9 | 3.6 |
| S35932 (1426) | Proposal | 509.4 | 1.00 | 559993 | 1.00 | 21.6 | 3.4 |
| S38584 (1728) |  | 658.4 | 1.02 | 761346 | 1.10 | 28.7 | 4.9 |
| S38584 (1728) |  | 742.0 | 1.15 | 761346 | 1.10 | 24.4 | 2.9 |
| S38584 (1728) |  | 646.6 | 1.01 | 761346 | 1.10 | 26.9 | 4.4 |
| S38584 (1728) | Proposal | 640.6 | 1.00 | 689531 | 1.00 | 25.6 | 3.6 |

Comparison of clock mesh with maximum skew of ±50 ps between mesh buffers and a slew of 50±10 ps on ISPD 2010 benchmarks.

| Benchmark (# sinks) | Method | BA (μm^{2}) | BA ratio | WL (μm) | WL ratio | _{skew} (ps) | σ_{skew} (ps) |
|---|---|---|---|---|---|---|---|
| 01 |  | 1658.4 | 1.38 | 3232546 | 1.22 | 29.3 | 6.9 |
| 01 |  | 1891.2 | 1.58 | 3232546 | 1.22 | 26.1 | 3.1 |
| 01 |  | 1108.4 | 0.92 | 3232546 | 1.22 | 28.4 | 5.0 |
| 01 | Proposal | 1193.2 | 1.00 | 2634824 | 1.00 | 27.9 | 4.5 |
| 02 |  | 2832.5 | 1.47 | 3024654 | 1.18 | 42.1 | 9.8 |
| 02 |  | 3521.4 | 1.83 | 3024654 | 1.18 | 38.4 | 6.1 |
| 02 |  | 1953.4 | 1.01 | 3024654 | 1.18 | 40.6 | 7.3 |
| 02 | Proposal | 1923.7 | 1.00 | 2542389 | 1.00 | 39.8 | 6.9 |
| 03 |  | 953.4 | 1.23 | 2032485 | 1.25 | 19.9 | 9.6 |
| 03 |  | 990.7 | 1.28 | 2032485 | 1.25 | 14.2 | 4.4 |
| 03 |  | 753.4 | 0.97 | 2032485 | 1.25 | 16.1 | 6.9 |
| 03 | Proposal | 769.4 | 1.00 | 1624685 | 1.00 | 15.8 | 5.9 |
| 04 |  | 1035.2 | 1.18 | 2498327 | 1.27 | 15.9 | 8.9 |
| 04 |  | 1142.9 | 1.30 | 2498327 | 1.27 | 11.6 | 4.5 |
| 04 |  | 890.6 | 1.02 | 2498327 | 1.27 | 13.4 | 5.0 |
| 04 | Proposal | 872.6 | 1.00 | 1954654 | 1.00 | 12.8 | 4.9 |
| 05 |  | 893.5 | 1.21 | 2456872 | 1.34 | 16.7 | 9.4 |
| 05 |  | 953.4 | 1.30 | 2456872 | 1.34 | 13.1 | 3.8 |
| 05 |  | 742.8 | 1.01 | 2456872 | 1.34 | 15.8 | 6.4 |
| 05 | Proposal | 732.5 | 1.00 | 1824648 | 1.00 | 14.8 | 5.8 |

Results of different skew values for mesh buffer input signals.

| Skew variation (ps) | Method | BA %Red | WL %Red | _{skew} AVG. | σ_{skew} AVG. |
|---|---|---|---|---|---|
| ±10 |  | 0.00 | 0.00 | 26.88 | 8.42 |
| ±10 |  | −10.49 | 0.00 | 21.89 | 5.91 |
| ±10 |  | 3.54 | 0.00 | 24.04 | 7.31 |
| ±10 | Proposal | −2.49 | 12.2 | 23.03 | 7.29 |
| ±30 |  | 0.00 | 0.00 | 25.87 | 6.24 |
| ±30 |  | −10.49 | 0.00 | 21.97 | 5.07 |
| ±30 |  | 8.65 | 0.00 | 24.18 | 5.98 |
| ±30 | Proposal | −2.49 | 12.2 | 23.17 | 5.74 |
| ±50 |  | 0.00 | 0.00 | 27.45 | 5.45 |
| ±50 |  | −10.49 | 0.00 | 21.09 | 4.96 |
| ±50 |  | 7.68 | 0.00 | 23.99 | 5.35 |
| ±50 | Proposal | −2.49 | 12.2 | 22.84 | 5.11 |
| Average |  | 0.00 | 0.00 | 26.73 | 6.70 |
| Average |  | −10.49 | 0.00 | 21.65 | 5.31 |
| Average |  | 12.21 | 0.00 | 23.98 | 6.27 |
| Average | Proposal | −2.49 | 12.2 | 23.79 | 6.05 |
| Improvement | Proposal | −2.49 | 12.2 | 4.61 | 0.65 |

V. Conclusion

In this paper, we presented effective methods for constructing a 3D clock mesh network: mesh size selection, TSV insertion, local mesh sizing, and buffer assignment. From the simulation results, we verified that our proposed mesh size selection, clock TSV insertion, and local mesh sizing methods can reduce the total wirelength and that the proposed buffer assignment method can reduce the clock skew and clock skew variation. The simulation results show that the total wirelength is reduced by 12.2%, the clock skew by 16.11%, and the clock skew variation by 11.74%, compared with the conventional method in [21] extended to 3D with the same number of clock TSVs as used in the proposed methods. These advantages come at the cost of a 2.49% increase in buffer area on the benchmark circuits. The above results show that our proposed method can construct an effective 3D clock mesh network.
References

[1] Tsai Y.F., “Three-Dimensional Cache Design Exploration Using 3DCacti,” Proc. IEEE Int. Conf. Comput. Des.: VLSI Comput. Processors, San Jose, CA, USA, Oct. 2–5, 2005, pp. 519–524. DOI: 10.1109/ICCD.2005.108
[2] Dong X., Xie Y., “System-Level Cost Analysis and Design Exploration for Three-Dimensional Integrated Circuits (3D ICs),” Asia South Pacific Des. Autom. Conf., Yokohama, Japan, Jan. 19–22, 2009, pp. 234–241. DOI: 10.1109/ASPDAC.2009.4796486
[3] Deng Y., Maly W., “A Feasibility Study of 2.5D System Integration,” Proc. IEEE Custom Integr. Circuits Conf., San Jose, CA, USA, Sept. 21–24, 2003, pp. 667–670. DOI: 10.1109/CICC.2003.1249483
[4] Zhao X., Minz J., Lim S.K., “Low-Power and Reliable Clock Network Design for Through-Silicon Via (TSV) Based 3D ICs,” IEEE Trans. Compon., Packaging Manuf. Technol., vol. 1, no. 2, 2011, pp. 247–259. DOI: 10.1109/TCPMT.2010.2099590
[5] Zhao X., Lim S.K., “Power and Slew-Aware Clock Network Design for Through-Silicon-Via (TSV) Based 3D ICs,” Asia South Pacific Des. Autom. Conf., Taipei, Taiwan, Jan. 18–21, 2010, pp. 175–180.
[6] Kim T.Y., Kim T.W., “Clock Tree Embedding for 3D ICs,” Asia South Pacific Des. Autom. Conf., Taipei, Taiwan, Jan. 18–21, 2010, pp. 486–491. DOI: 10.1109/ASPDAC.2010.5419833
[7] International Technology Roadmap for Semiconductors (ITRS). http://www.itrs.net
[8] Guthaus M.R., Sylvester D., Brown R.B., “Clock Buffer and Wire Sizing Using Sequential Programming,” ACM/IEEE, Des. Autom. Conf., San Francisco, CA, USA, July 24–28, 2006, pp. 1041–1046. DOI: 10.1145/1146909.1147171
[9] Xiao L., “Local Clock Skew Minimization Using Blockage-Aware Mixed Tree-Mesh Clock Network,” IEEE/ACM Int. Conf. Comput.-Aided Des., San Jose, CA, USA, Nov. 7–11, 2010, pp. 458–462.
[10] Rajaram A., Hu J., Mahapatra R., “Reducing Clock Skew Variability via Crosslinks,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 25, no. 6, 2006, pp. 1176–1182. DOI: 10.1109/TCAD.2005.855928
[11] Restle P.J., “A Clock Distribution Network for Microprocessors,” Symp. VLSI Circuits, Dig. Techn. Paper, Honolulu, HI, USA, June 15–17, 2000, pp. 184–187. DOI: 10.1109/4.918917
[12] Venkataraman G., “Combinatorial Algorithms for Fast Clock Mesh Optimization,” IEEE Trans. Very Large Scale Integr. Syst., vol. 18, no. 1, 2010, pp. 131–141. DOI: 10.1109/TVLSI.2008.2007737
[13] Rajaram A., Pan D.Z., “MeshWorks: An Efficient Framework for Planning, Synthesis and Optimization of Clock Mesh Network,” Asia South Pacific Des. Autom. Conf., Seoul, Rep. of Korea, Mar. 21–24, 2008, pp. 250–257.
[14] Shelar R.S., “An Algorithm for Routing with Capacitance/Distance Constraints for Clock Distribution in Microprocessors,” Int. Symp. Physical Des., San Diego, CA, USA, Mar. 29–Apr. 1, 2009, pp. 141–148. DOI: 10.1145/1514932.1514964
[15] Abdelhadi A., “Timing–Driven Variation–Aware Nonuniform Clock Mesh Synthesis,” Proc. Symp. Great Lakes Symp. VLSI, Providence, RI, USA, May 16–18, 2010, pp. 15–20. DOI: 10.1145/1785481.1785487
[16] Guthaus M.R., Wilke G., Reis R., “Non-uniform Clock Mesh Optimization with Linear Programming Buffer Insertion,” ACM/IEEE Des. Autom. Conf., Anaheim, CA, USA, June 13–18, 2010, pp. 74–79. DOI: 10.1145/1837274.1837295
[17] Lu J., Mao X., Taskin B., “Integrated Clock Mesh Synthesis with Incremental Register Placement,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 31, no. 2, 2012, pp. 217–227. DOI: 10.1109/TCAD.2011.2173491
[18] Cho M., Pan D.Z., Puri R., “Novel Binary Linear Programming for High Performance Clock Mesh Synthesis,” IEEE/ACM Int. Conf. Comput.-Aided Des., San Jose, CA, USA, Nov. 7–11, 2010, pp. 438–443.
[19] Lu J., Mao X., Taskin B., “Timing Slack Aware Incremental Register Placement with Non-uniform Grid Generation for Clock Mesh Synthesis,” Proc. Int. Symp. Physical Des., Santa Barbara, CA, USA, Mar. 27–30, 2011, pp. 131–138. DOI: 10.1145/1960397.1960426
[20] Lu J., Aksehir Y., Taskin B., “Register On MEsh (ROME): A Novel Approach for Clock Mesh Network Synthesis,” IEEE Int. Symp. Circuits Syst., Rio de Janeiro, Brazil, May 15–18, 2011, pp. 1219–1222. DOI: 10.1109/ISCAS.2011.5937789
[21] Rajaram A., Pan D.Z., “MeshWorks: A Comprehensive Framework for Optimized Clock Mesh Network Synthesis,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 29, no. 12, 2010, pp. 1945–1958. DOI: 10.1109/TCAD.2010.2061130
[22] Guthaus M.R., “High-Performance Clock Mesh Optimization,” ACM Trans. Des. Autom. Electron. Syst., vol. 17, no. 3, 2012, 33. DOI: 10.1145/2209291.2209306
[23] Sitik C., Taskin B., “Multi-voltage Domain Clock Mesh Design,” IEEE Int. Conf. Comput. Des., Montreal, Canada, Sept. 30–Oct. 3, 2012, pp. 201–206. DOI: 10.1109/ICCD.2012.6378641
[24] You M., Shin H., “Improvement of Delay and Noise Characteristics by Buffer Insertion,” IEEK, vol. 41, no. 6, 2004, pp. 81–90.
[25] Alpert C., Devgan A., “Wire Segmenting for Improved Buffer Insertion,” Proc. Des. Autom. Conf., Anaheim, CA, USA, June 9–13, 1997, pp. 588–593. DOI: 10.1145/266021.266291

Citing 'Clock Mesh Network Design with Through-Silicon Vias in 3D Integrated Circuits'

@article{ HJTODO_2014_v36n6_931}
,title={Clock Mesh Network Design with Through-Silicon Vias in 3D Integrated Circuits}
,volume={36}
, url={http://dx.doi.org/10.4218/etrij.14.0113.1257}, DOI={10.4218/etrij.14.0113.1257}
, number= {6}
, journal={ETRI Journal}
, publisher={Electronics and Telecommunications Research Institute}
, author={Cho, Kyungin
and
Jang, Cheoljon
and
Chong, Jong-wha}
, year={2014}
, month={Dec}