#### Horizon 2020 Program (2014-2020) **FET-Open – Novel ideas for radically new technologies FETOPEN-01-2018-2019-2020**



Architecting More than Moore – Wireless Plasticity for Massive Heterogeneous Computer Architectures <sup>†</sup>

## **D3.1: Wireless Channel Modeling**

WP3 - Wireless Communications within Package

| Contractual Date of Delivery    | 31/12/2020          |
|---------------------------------|---------------------|
| Actual Date of Delivery         | 17/01/2021          |
| Deliverable Dissemination Level | Public              |
| Editor                          | Sergi Abadal (UPC)  |
| Contributors                    | UPC (leader), UNIBO |
| Quality Assurance               | Sergi Abadal (UPC)  |

<sup>†</sup>This project is supported by the European Commission under the Horizon 2020 Program with Grant agreement no: 863337.

### **Document Revisions & Quality Assurance**

| Deliverable Number      | D3.1         |
|-------------------------|--------------|
| Deliverable Responsible | UPC          |
| Work Package            | WP3          |
| Main Editor             | Sergi Abadal |

#### **Internal Reviewers**

- 1. Davide Rossi (UNIBO)
- 2. Mohamed Elsayed (RWTH)

#### Revisions

| Version | Date       | By             | Overview                                                      |
|---------|------------|----------------|---------------------------------------------------------------|
| 1.3.0   | 17/01/2021 | Editor         | Final draft, with figures and text finalized.                 |
| 1.2.0   | 14/01/2021 | Editor, #1, #2 | Updated draft, including reviewers' comments.                 |
| 1.1.0   | 07/01/2021 | Editor         | Second draft, missing some figures, sent for internal review. |
| 1.0.0   | 31/12/2020 | Editor         | First draft, missing figures and portions of text.            |

#### Legal Disclaimer

The information in this document is provided "as is", and no guarantee or warranty is given that the information is fit for any particular purpose. The above referenced consortium members shall have no liability to third parties for damages of any kind including without limitation direct, special, indirect, or consequential damages that may result from the use of these materials subject to any liability which is mandatory due to applicable law. ©2019 by WiPLASH Consortium.

### **Executive Summary**

Multicore processors rely on an integrated packet-switched network for cores to exchange and share data. The performance of these intra-chip networks is a key determinant of the processor speed and, at high core counts, becomes an important bottleneck due to scalability issues. To address this, several works propose the use of wireless interconnects for intra-chip communication due to their superior broadcast capabilities and system-level flexibility. These same works, however, generally make unrealistic assumptions about the wireless channel within the computing package. Hence, there is an urgent need for channel characterization in the context of chip-scale networks. While various works have approached this problem, most of them focus on the 60–100 GHz band and do not provide a realistic model of the integrated package.

This deliverable aims to go beyond the state of the art in channel characterization for wireless networks within computing packages in two ways: (i) extending the frequency range up to 240 GHz, which is the expected frequency band of experimental tests during the course of the project, and (ii) faithfully modeling three flavours of computing packages, i.e. flip-chip, interposer, and wirebond. Both are necessary to model, in future work, the transmission speeds and power consumption that can be expected in those packages at the frequencies targeted by the WiPLASH project.

The characterization methodology employed in this deliverable is based on full-wave solvers that calculate the channel response in the frequency and time domains. The channel response is then post-processed to obtain the path loss and delay spread as functions of the distance and multiple design parameters for each of the three considered packages. Therefore, the main contributions of this work are (i) channel characterizations in both frequency and time domains up to 240 GHz, (ii) a comparison of the three different packages, including interposers as a key element of the WiPLASH vision, and (iii) sensitivity analyses for each package, allowing us to identify which package design decisions have the largest impact on the channel characteristics. Our results show that flip-chip and interposer packages are preferable to wirebond, that path losses of 30–40 dB and delay spreads below 0.1 ns can be achieved without cumbersome optimization processes, and that thinning down the silicon die is the most impactful design decision. We conclude that, to enable ultra-fast and ultra-low-power communications at the chip scale, additional effort is required, either in the form of package optimization, metasurface-led channel engineering, or the use of directional antennas.
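As a rough illustration of the post-processing step described above, the following Python sketch derives a path loss value from a linear channel power gain and a delay spread from a power-delay profile (PDP). It is a minimal sketch under stated assumptions, not the tool chain used in this deliverable: the function names, the sample values, and the choice of the RMS definition of delay spread are all assumptions made here for illustration.

```python
import math


def path_loss_db(channel_power_gain):
    """Path loss in dB from a linear channel power gain (assumed 0 < gain <= 1)."""
    return -10.0 * math.log10(channel_power_gain)


def rms_delay_spread(delays_s, powers_linear):
    """RMS delay spread (assumed definition): square root of the second
    central moment of the power-delay profile."""
    total = sum(powers_linear)
    mean_delay = sum(t * p for t, p in zip(delays_s, powers_linear)) / total
    second_moment = sum(t * t * p for t, p in zip(delays_s, powers_linear)) / total
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(max(second_moment - mean_delay ** 2, 0.0))


# Hypothetical inputs, chosen only to exercise the functions.
gain = 1e-3                        # linear power gain between two ports
delays = [0.0, 0.1e-9, 0.2e-9]     # tap delays in seconds
powers = [1.0, 0.5, 0.25]          # tap powers, linear scale

loss = path_loss_db(gain)
spread = rms_delay_spread(delays, powers)
print(f"path loss: {loss:.1f} dB, RMS delay spread: {spread * 1e9:.3f} ns")
```

In practice the gain and the PDP would come from the simulated frequency-domain and time-domain channel responses, respectively, evaluated per antenna pair so that both metrics can be plotted against distance.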

## **Abbreviations and Acronyms**

**NoC** Network-on-Chip

**WNoC** Wireless Network-on-Chip

**TSV** Through-Silicon Via

**MCM** Multi-chip Module

**mmWave** millimeter-Wave

**THz** terahertz

**FDTD** Finite-Difference Time-Domain

**PDP** Power-Delay Profile

**AlN** Aluminum nitride

**I/O** Input/Output

**ISI** Inter-Symbol Interference

**BER** Bit Error Rate

**FEM** Finite Element Method

**SNR** Signal-to-Noise Ratio

**MOO** Multi-Objective Optimization

**SA** Simulated Annealing

**PEC** Perfect Electric Conductor

**SiGe** Silicon-Germanium

### The WiPLASH consortium is composed of

| Partner | Role        | Country     |
|---------|-------------|-------------|
| UPC     | Coordinator | Spain       |
| IBM   | Beneficiary | Switzerland |
| UNIBO | Beneficiary | Italy       |
| EPFL  | Beneficiary | Switzerland |
| AMO   | Beneficiary | Germany     |
| UoS   | Beneficiary | Germany     |
| RWTH  | Beneficiary | Germany     |



# Contents

- 1 Introduction
- 2 Background
  - 2.1 Design Drivers
    - 2.1.1 High Performance
    - 2.1.2 Resource Awareness
    - 2.1.3 Monolithic System
  - 2.2 Physical Landscape
    - 2.2.1 Flip-chip Package
    - 2.2.2 Interposer Package
    - 2.2.3 Wirebond Package
  - 2.3 On-Chip Electromagnetics
    - 2.3.1 Antennas
    - 2.3.2 Propagation
  - 2.4 Modeling Methods
- 3 State of the Art of Channel Characterization at the Chip Scale
  - 3.1 Characterization at mmWave Frequencies
  - 3.2 Characterization at THz Frequencies
- 4 Methodology
  - 4.1 Simulation Setup
  - 4.2 Frequency Domain Analysis
  - 4.3 Time Domain Analysis
- 5 Analysis of Flip-chip Package
  - 5.1 Environment Description
- 6 Analysis of Interposer-based Package
  - 6.1 Environment Description
  - 6.2 Frequency Domain Analysis
    - 6.2.1 Analysis at mmWave Frequencies
    - 6.2.2 Scale-up to THz Frequencies
  - 6.3 Time Domain Analysis
  - 6.4 Sensitivity Analysis for Channel Engineering
- 7 Analysis of Wire-Bonding Package
  - 7.1 Environment Description
  - 7.2 Frequency Domain Analysis
    - 7.2.1 Analysis at mmWave Frequencies
    - 7.2.2 Scale-up to THz Frequencies
  - 7.3 Time Domain Analysis
  - 7.4 …
- 8 Discussion and Concluding Remarks

# **List of Figures**

- 1.1 A general view on wireless communications at the chip scale within a heterogeneous computer architecture (top) with detail on the wireless transmission process (bottom).
- 1.2 Graphical abstract of this deliverable (D3.1). Channel modeling is performed via full-wave simulations and post-processing to obtain path loss and delay spread as functions of the distance, frequency range, and package design. Input ranges are determined in conjunction with task T1.1. Results provide feedback to guide further simulations changing geometry and materials, and lead to the development of models for task T3.2.
- 2.1 Cross-section view of a typical chip without package, from [1].
- 2.2 Different flavours of computing packages capable of hosting multiple chips.
- 2.3 Schematic representation of wave propagation in an interposer system with flip-chip package excited with vertical monopole antennas, distinguishing between intra- and inter-chip regions, and exemplifying different propagation phenomena.
- 3.1 Channel characteristics for intra-chip propagation within a flip-chip package at 60 GHz with variable silicon thickness, heat spreader of 0.2 mm, and heat sink on top. Data from [2].
- 3.2 …
- 4.1 General view of the evaluation methodology used in this deliverable.
- 4.2 Overview of a chip package and the simulated ports. We only need to excite one white (edge), one grey (center), and one black port (corner). The rest of combinations can be inferred thanks to symmetry.
- 4.3 Gaussian pulse used in the time-domain simulations and whose spectrum spans from 10 GHz up to 1 THz.
- 5.1 Schematic of the layers of a flip-chip package.
- 5.2 … approximated radiation patterns of the different ports.
- 5.3 …
- 5.4 Path loss in a flip-chip package at different frequencies.
- 5.5 Path loss in a flip-chip package at 240 GHz for different substrate and heat spreader thicknesses.
- 5.6 Delay spread in a flip-chip package for different substrate and heat spreader thicknesses.
- 5.7 Path loss and delay spread in a flip-chip package for different die sizes.
- 5.8 Path loss at 60 GHz and 240 GHz on top plot and delay spread on bottom plot for a flip-chip package with different package margin dimensions.
- 5.9 Path loss at 60 GHz and 240 GHz and delay spread in a flip-chip package for different package margin materials.
- 5.10 Proposed optimization methodology to engineer the wireless channel
- 5.11 Evaluation of the design space for channel engineering with respect to (a–c) the silicon thickness $T_s$, (d–f) heat spreader thickness $T_h$, and (g–i) central frequency $f_c$. Panels show the path loss over distance, delay spread over distance, and maximum delay spread/average path loss as functions of the swept parameter. Unless noted, $T_s = 0.2$ mm, $T_h = 0.7$ mm, and $f_c = 60$ GHz.
- 5.12 Figure of merit $\phi_w$ as function of $\{T_s, T_h, f_c\}$ for different priority weights. Unless noted, $T_s = 0.2$ mm, $T_h = 0.7$ mm, and $f_c = 60$ GHz.
- 6.1 Schematic of the layers of an interposer package.
- 6.2 (a) Path loss in an interposer package at 60 GHz for a silicon thickness of 0.1 mm, AlN thickness of 0.5 mm, four chiplets, and a chiplet separation of 2 mm. (b-d) As a reference, approximated radiation patterns of the different ports.
- 6.3 Path loss in an interposer package at 60 GHz for different substrate and heat spreader thicknesses.
- 6.4 Path loss in an interposer package at different frequencies.
- 6.5 Delay spread in an interposer package for different substrate and heat spreader thicknesses.
- 6.6 Path loss and delay spread in a 20×20 mm<sup>2</sup> silicon interposer divided into 4 or 16 chiplets. The distance between chiplets is 2 mm.
- 6.7 Path loss and delay spread for different inter-chiplet spacings.
- 6.8 Path loss and delay spread in an interposer package for low-resistivity (active) and high-resistivity (passive) silicon interposers.
- 6.9 Path loss and delay spread in an interposer package for different filling materials.
- 7.1 Schematic of the layers of a wirebond package, together with a top view and cross-section diagrams.
- 7.2 (a) Path loss in a wirebond package at 60 GHz for a silicon thickness of 0.1 mm, AlN thickness of 0.5 mm, and 32 bondwires. The red circle points out nearby ports with reduced coupling. (b-d) As a reference, approximated XY-plane radiation patterns of the different ports.
- 7.3 Path loss in a wirebond package at 60 GHz for different substrate and heat spreader thicknesses.
- 7.4 Path loss in a wirebond package at different frequencies.
- 7.5 Delay spread in a wirebond package for different substrate and heat spreader thicknesses.
- 7.6 Path loss and delay spread for different die sizes in a wirebond package.
- 7.7 Path loss and delay spread in a wirebond package for different amounts of bond wires (i.e. different I/O pitches).
- 7.8 Path loss and delay spread in a wirebond package for different enclosure materials.
- 7.9 Path loss and delay spread in a wirebond package for different molding compound thicknesses.

# **List of Tables**

- 2.1 Summary of integrated antennas amenable to the intra-/inter-chip communication scenario.
- 2.2 …
- 3.1 Works on mmWave channel modeling at the chip scale up to 2020.
- 5.1 Characteristics of the layers in a flip-chip package and default dimensions.
- 5.2 Package parameters for flip-chip.
- 5.3 …
- 6.1 Characteristics of the layers in an interposer-based package.
- 6.2 Package parameters for interposer.
- 7.1 Characteristics of the layers in a wirebond package.
- 7.2 …

### 1. Introduction

Multicore processors are present in virtually every computing domain nowadays. They integrate a number of processor cores within the same chip and, in the past few years, manufacturers have been consistently increasing the core count seeking higher execution speeds. However, in order to translate this potential into effective performance, the on-chip communication problem must be solved: cores need an integrated interconnect to exchange or share data and, for densely populated chips, traditional interconnects are burdensome and slow down the processor. Communication, not computation, thus becomes the main performance bottleneck in multicore systems [4].

In the past, most chips did not contain more than a handful of cores and on-chip communication was easily performed through a bus. Since buses do not scale well with the number of cores, a completely different approach was soon required. The adopted solution, called Network-on-Chip (NoC), consists of a packet-switched network of routers that are co-integrated with the cores. Since then, NoCs have been widely applied not only in research works [5–8], but also in commercial chips such as Tilera's TILE-GX [9] or Intel's Xeon Phi [10]. Nevertheless, with the arrival of extreme scaling and new architectural trends such as massively parallel accelerators or heterogeneous architectures, standard NoCs start to show issues in performance, efficiency, or area overhead [11, 12]. New paradigms are thus required in the manycore era, which is the hypothesis over which the WiPLASH project unfolds.

The scalability problems of NoCs are mainly the network diameter and overprovisioning. These cause the communication latency and power to increase, especially for chip-wide transactions. Therefore, any new candidate to improve existing NoCs should address them. In this context, the WiPLASH project advocates the use of wireless communications at the chip scale to complement a backbone wired NoC and redress its performance and rigidity issues. This concept, commonly referred to as Wireless Network-on-Chip (WNoC), has been shown to reduce the latency of chip-wide transfers, including broadcasts, by virtue of the speed-of-light and possibly omnidirectional propagation of radio waves [13, 14]. It also combats overprovisioning thanks to its global reconfigurability potential. As demonstrated in the literature, these unique features become key enablers of new multicore architectures capable of pushing current scalability limits [15–17].

To illustrate the WNoC paradigm, Figure 1.1 represents a possible scenario with wireless links within a heterogeneous architecture. Bits are serialized, modulated and radiated by the transmitter before being picked up, demodulated and deserialized by the receiver. In between, the wireless channel attenuates and broadens the transmitted signal. The higher the attenuation, the more amplification is needed to reach the target Bit Error Rate (BER) and, thus, the higher the power consumption of the transceiver circuits. The higher the broadening (i.e. dispersion), the lower the transmission speed must be in order to avoid Inter-Symbol Interference (ISI).



Figure 1.1: A general view on wireless communications at the chip scale within a heterogeneous computer architecture (top) with detail on the wireless transmission process (bottom).

The main caveat of the majority of WNoC research is that it relies on inaccurate channel models. Many works [18–25] either neglect the influence of the chip package, which introduces losses and dispersion, or neglect dispersion altogether. This does not invalidate the potential of the WNoC paradigm, but leads to erroneous assumptions on the achievable speed and power. For instance, many WNoC architectures assume bandwidths well over 10 GHz [16, 26–29], which may not be achievable due to multipath effects arising from the numerous wave reflections within highly integrated packages. Other works obtain power consumption estimates by assuming path losses well below 30 dB [30–32], often ignoring that propagation may occur through lossy silicon, in which case losses may be far greater.

This problem is exacerbated in the WiPLASH project as it explores uncharted territories in the WNoC paradigm, i.e. new antennas and new computing packages. In particular, WiPLASH seeks to take WNoCs to the next level by exploiting the unique properties of graphene antennas and applying them to multi-chiplet environments. Graphene antennas are proposed due to their natural support of plasmonic surface waves at frequencies between 0.1 and 1 THz, which allows reducing the size of the antenna by up to 1-2 orders of magnitude with respect to that of a metallic antenna, and tuning the resonance point to a large extent [33–35]. These properties indeed make them a perfect candidate for the creation of versatile on-chip and off-chip networks for heterogeneous architectures, but pose new challenges in the channel modeling for WNoCs.

The main focus of this deliverable is on the realistic modeling of the wireless channel within a computing package. Through a simulation-based study, we aim to provide a solid foundation upon which to build a realistic physical layer design of future WNoCs. As we summarize in Chapter 3, several research groups have approached the problem of wireless channel characterization at the chip scale with a similar goal in mind. However, the existing studies are limited to frequencies below 100 GHz and, except for some prior works by the authors of this deliverable, consider only open dies without a realistic computing package. It is therefore clear that, in terms of channel characterization, the WiPLASH project needs to go beyond the state of the art in two main ways:

- First, it must study the channel beyond 100 GHz in order to approach the bands where graphene antennas are expected to operate. In this deliverable, we set the upper bound of the study to 240 GHz for two main reasons: (i) prototypes of the technologically integrated antennas in the project are expected to operate in this band, and (ii) computational constraints discourage extending full-wave simulations of chip environments beyond 240 GHz.
- Second, it must model different computing packages to bring channel modeling close to a realistic WNoC scenario. In this respect, we compare three existing and widespread packages: flip-chip, interposer, and wire-bond. The second one is especially relevant here due to the rising importance of multi-chiplet architectures and because it is in the scope of the project. It is worth noting that, in response to this need, our work is among the first to consider channel characterization in interposer-based systems.

Figure 1.2 summarizes the main contributions described in this document. In essence, we take as inputs a frequency range of operation and multiple package designs (i.e. flip-chip, interposer, wire-bond), which are modeled and simulated in a full-wave solver in both frequency and time domains. With some post-processing and analysis of the resulting path loss and delay spread, we obtain attenuation and dispersion scaling trends. We finally note that, since manufacturers have a rough control over the package design, we can close the loop by repeating the process for multiple geometries and material choices. Therefore, an additional contribution of this deliverable is the study of relevant package design decisions that can greatly affect the attenuation and dispersion of the wireless channel.

Within the project, the specification of the wireless architecture from task T1.1 provides fundamental inputs in terms of target frequency ranges and package design. The outputs of this study are relevant to the physical layer design (task T3.2). The channel models will allow an accurate link budget to be computed as a step towards determining the area overhead and power consumption of the transceiver circuits necessary to establish the wireless links (deliverable D3.2). More tangentially, these will also guide the experimental measurement of integrated antennas and chipsets in an emulated package, which will be addressed in work packages WP1 and WP2.

The remainder of this document is organized as follows. We first provide background on the topic of characterization of wireless channels within computing packages, including on-chip electromagnetics, antenna positioning, and characterization methods in Chapter 2. We then outline the state of the art in channel characterization in chip-scale systems, identifying a lack of models for realistic computing packages at mmWave and, most notably, THz bands in Chapter 3. After that, we describe the channel characterization methodology in Chapter 4. The wireless channel characterization results are presented in the three subsequent chapters, one for each considered package: flip-chip in Chapter 5, interposer in Chapter 6 and wire-bond in Chapter 7. We finally summarize the main findings and discuss possible future lines of research in Chapter 8, and list the publications that stemmed from this work in Chapter 9.

Figure 1.2: Graphical abstract of this deliverable (D3.1). Channel modeling is performed via full-wave simulations and post-processing to obtain path loss and delay spread as functions of the distance, frequency range, and package design. Input ranges are determined in conjunction with task T1.1. Results provide feedback to guide further simulations changing geometry and materials, and lead to the development of models for task T3.2.

## 2. Background

Wireless chip-scale communications emerge from the need to tightly couple processing elements and memory within complex computing packages. Specifically, the wireless paradigm has been proposed as a complement to the wired interconnects to (i) improve the communication between far-apart processors, (ii) alleviate existing bandwidth bottlenecks caused by Input/Output (I/O) pin limitations, and (iii) implement global channels. These are possible thanks to the inherent low latency, broadcast capability, and lack of path infrastructure of the wireless technology.

As illustrated in Figure 1.1, wireless chip-scale communications broadly refer to the implementation of intra-chip or inter-chip links with integrated antennas. In general terms, any of the components within a multiprocessor architecture (e.g. CPUs, GPUs, accelerators, memory) may be provided with a wireless transceiver that would serialize, modulate and radiate outgoing information. Electromagnetic waves propagate through the processor package until reaching the intended destinations, where they are demodulated and deserialized.

This deliverable focuses on the electromagnetic propagation aspect. Signals radiated at the transmitting end suffer losses and dispersion, which affect the ability of the receiver to correctly demodulate the transmitted information. This is generally made explicit in the *RF link budget*, where the RF designer lists the sources of losses (including the channel) to then evaluate the minimum power that needs to be transmitted to (i) meet the receiver's power sensitivity, and (ii) achieve a Signal-to-Noise Ratio (SNR) that meets the BER requirement of the communications scenario for a particular modulation scheme. Another study that hinges on the channel response is that of dispersion: different propagation paths cause transmitted signals to be spread in time. Dispersion limits the bandwidth of the channel, which in turn limits the achievable symbol rate (again depending on the chosen modulation). Transmitting at rates beyond the dispersion limit will lead to ISI, which reduces the effective SNR and thus increases the BER.
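As a rough illustration of the link budget reasoning above, the following Python sketch derives a minimum transmit power from a channel path loss and a target SNR. The noise figure, bandwidth, antenna gains, and SNR target are hypothetical placeholders chosen for the example, not values reported in this deliverable.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K


def thermal_noise_dbm(bandwidth_hz, temperature_k=300.0):
    """Thermal noise power k*T*B expressed in dBm."""
    noise_w = BOLTZMANN * temperature_k * bandwidth_hz
    return 10.0 * math.log10(noise_w / 1e-3)


def min_tx_power_dbm(path_loss_db, snr_req_db, bandwidth_hz,
                     noise_figure_db=10.0, tx_gain_db=0.0, rx_gain_db=0.0,
                     rx_sensitivity_dbm=None):
    """Minimum transmit power (dBm) that satisfies both link conditions.

    (i)  received power >= receiver sensitivity (if one is given)
    (ii) received power >= noise floor + required SNR
    All default values are illustrative assumptions.
    """
    noise_floor_dbm = thermal_noise_dbm(bandwidth_hz) + noise_figure_db
    rx_needed_dbm = noise_floor_dbm + snr_req_db
    if rx_sensitivity_dbm is not None:
        rx_needed_dbm = max(rx_needed_dbm, rx_sensitivity_dbm)
    # Work back through the channel and antenna gains to the transmitter.
    return rx_needed_dbm + path_loss_db - tx_gain_db - rx_gain_db


if __name__ == "__main__":
    # Hypothetical case: 10 GHz of bandwidth and a 15 dB SNR target.
    for path_loss_db in (30.0, 50.0):
        p_tx = min_tx_power_dbm(path_loss_db, 15.0, 10e9)
        print(f"path loss {path_loss_db:.0f} dB -> min TX power {p_tx:.1f} dBm")
```

In this simple model every additional dB of path loss translates directly into one additional dB of required transmit power, which is why optimistic path loss assumptions lead to optimistic power estimates.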

For all the reasons above, understanding the channel characteristics is crucial to determine the potential performance and cost of wireless chip-scale communication. In this chapter, we provide background on the chip-scale environment in an attempt to gain insight into the particularities of the scenario. First, in Section 2.1, we review the main design drivers of multiprocessor interconnects to better understand their performance and efficiency requirements. In Section 2.2, we explain the physical landscape for both single-chip and multi-chip architectures within the different package flavors evaluated later in this document. Finally, in Section 2.3, we discuss the fundamentals of chip-scale antennas and propagation.

### 2.1 Design Drivers

The wireless chip-scale scenario has a unique blend of requirements and constraints that impact on the channel modeling and characterization. We next summarize them in three main points: high performance, resource awareness, and monolithic system.

#### 2.1.1 High Performance

Computing systems generally require very fast and reliable communications at the chip scale for two main reasons: (i) the latency introduced by communication essentially delays the progress of the computation, lagging the system, and (ii) seemingly minor errors may corrupt an entire computation. Most WNoC proposals consider wireless speeds over 10 or even 100 Gb/s to achieve chip-wide latencies below 10 ns, whereas it is widely accepted that the error rate should be on the order of $10^{-15}$ [36]. Thus, an accurate characterization of the wireless channel (especially in the time domain) becomes critical in order to achieve such stringent performance requirements.
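To give a sense of what an error rate of $10^{-15}$ implies, the sketch below numerically inverts the textbook BER expression of coherent BPSK over an additive white Gaussian noise channel, $P_b = \tfrac{1}{2}\,\mathrm{erfc}(\sqrt{E_b/N_0})$. BPSK and the noise model are assumptions made only for illustration. The deliverable does not prescribe a modulation, and the function names are hypothetical.

```python
import math


def ber_bpsk(ebn0_db):
    """Bit error rate of coherent BPSK in AWGN for a given Eb/N0 in dB."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return 0.5 * math.erfc(math.sqrt(ebn0))


def required_ebn0_db(target_ber, lo_db=-10.0, hi_db=30.0, iterations=100):
    """Smallest Eb/N0 (dB) meeting target_ber, found by bisection.

    Relies on the BER decreasing monotonically with Eb/N0.
    """
    for _ in range(iterations):
        mid_db = 0.5 * (lo_db + hi_db)
        if ber_bpsk(mid_db) > target_ber:
            lo_db = mid_db   # still too many errors: need more SNR
        else:
            hi_db = mid_db   # target met: try a lower SNR
    return hi_db


if __name__ == "__main__":
    for target in (1e-3, 1e-9, 1e-15):
        print(f"BER {target:.0e} -> Eb/N0 of about {required_ebn0_db(target):.2f} dB")
```

Under these assumptions the requirement sits at roughly 15 dB, and any ISI that erodes the effective SNR has to be absorbed by additional margin in the link budget.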

#### 2.1.2 Resource Awareness

Nodes in wireless networks are typically mobile and hence have a limited battery, rendering communication largely energy-constrained. In chip environments, energy availability is guaranteed, yet energy cannot be considered unlimited since heat dissipation is expensive. Actually, power has become a driver of multiprocessor design, suggesting the use of power-gating techniques to increase the overall efficiency [37]. Similarly, chip real estate is a precious resource in the scenario at hand, prompting WNoCs to employ simple and low-power transceivers that support only low-order modulations [38]. From the perspective of the channel, this forces architects to minimize path loss while increasing the frequency and looking for wide spectral bandwidths to accommodate the high data rate requirements. Moreover, this situation also limits the signal processing techniques that could be used to combat dispersion, therefore making time-domain characterization critical.

#### 2.1.3 Monolithic System

The propagation of electromagnetic waves takes place in a confined space. This physical landscape, including the network topology, the chip layout, and the characteristics of the employed materials, is fixed and known beforehand [39]. This is one of the main distinguishing traits of the WNoC scenario, since nodes in other wireless networks generally move within a propagation environment that can also be dynamic. The chip-scale channels, in fact, become quasi-deterministic at the data-link layer and can be accurately characterized by exploiting *a priori* knowledge of the physical landscape. We argue that the only factor affecting the channel characteristics that may vary at runtime is temperature, which may modify the amount of thermal noise, while other decisions such as the antenna positioning or the package design do not change and can be known with high accuracy at design time. We expect that the stringent performance and efficiency requirements of the scenario, discussed above, will only be met by introducing the wireless chip-scale network within the design loop of the complete architecture.



Figure 2.1: Cross-section view of a typical chip without package, from [1].

### 2.2 Physical Landscape

The typical cross-section of a standard chip consists of a metal stack with 5–10 layers, separated by an insulator and placed over a lossy silicon substrate as shown in Figure 2.1 [1]. On-chip antennas can be implemented using any of the metallization layers of the chip, as we will see in Section 2.3. On-chip antennas are appreciated for their compact form factor and integration with other circuitry.

In most wireless communication applications, signals need to be radiated away from the transmitting device. A clear example is that of cellular communications, where on-chip antennas are a solution that reduces integration costs and fits well within cell phones. In this case, the signals are radiated from the cell phone to the base station. As a result, the on-chip antenna needs to radiate towards free-space (passivation in Fig. 2.1) and not towards the substrate. This has motivated the use of on-chip antennas in open die (i.e. without package) and other stacked configurations. The interested reader is referred to [40] for a very complete survey on this matter.

The case of WNoC is different for two reasons: (i) multiprocessors are typically enclosed in a computing package, and (ii) the antennas need to radiate towards the other antennas within the chip or the package, instead of to free space. With regard to the first point, and although a recent work suggests a packageless architecture [41], multiprocessor dies have historically been embedded in a package to (a) act as a space transformer for I/O pins, (b) provide mechanical support to the dies, and (c) ease testing and repair. Some packages include a molding compound around the chip to improve mechanical stability [42], but its typically poor thermal conductivity discourages its use in hot architectures. In most cases, even the packageless one [41], the die is contacted by a thermal interface material with a metallic heat sink on top, avoiding the use of the molding compound.

Depending on the actual implementation of the system package, this scenario could lead to a totally enclosed volume, which is actually desirable because communications occur within the package and because one wants to avoid external interference or eavesdropping for security reasons. Still, as we will see, losses arise due to reflections, dielectric losses of the materials found within the package, or spreading into undesired areas. To better understand the source of these impairments, we next describe the packages analyzed in this deliverable, which are pictorially represented in Figure 2.2.

Figure 2.2: Different flavours of computing packages capable of hosting multiple chips.

#### 2.2.1 Flip-chip Package

Together with wirebond, *flip-chip* has been the most common package in multiprocessor systems, although multiple custom variants and alternatives exist depending on the final application [21, 41]. Flip-chip packaging is generally preferred in the multiprocessor context due to its lower inductance and higher power/bandwidth density [43]. In this configuration, chips are turned over and carefully connected to the system substrate through a set of solder bumps, as shown in Figure 2.2(a). The packaged chip therefore has the silicon substrate on top, which is in turn interfaced with the heat spreader material and the system heat sink. The insulator and metal stack are placed at the bottom, interfaced by the solder bumps that connect them to the system substrate [1].

Flip-chip is compatible with the (heterogeneous) integration of multiple chips either vertically or horizontally. The former, represented in Figure 2.2(a), consists of the stacking of several chips that have been previously thinned down below 100 $\mu$m [44, 45]. Once stacked, the chips are interconnected through a forest of vertical Through-Silicon Vias (TSVs) with very fine pitch. This provides a huge bandwidth density and efficiency due to the very short link lengths. On the downside, 3D integration suffers from evident heat dissipation issues and the available area of integration basically depends on the dimensions of the chip at the base, i.e. a maximum of around 20×20 mm<sup>2</sup>. Horizontal integration is explained below.

#### 2.2.2 Interposer Package

Contrary to 3D stacking, heterogeneous 2.5D integration takes a co-planar approach and interconnects chips (or smaller chiplets) through a common platform [46]. Depending on the level of integration, this common platform may be a silicon interposer (Fig. 6.1) or the package substrate in a more classical Multi-chip Module (MCM) approach. Such an arrangement alleviates the heat dissipation issue of 3D stacking and also increases the available area, as the limit is now set by the interposer (up to $40 \times 40$ mm<sup>2</sup> in [46, 47]) or the substrate ($77 \times 77$ mm<sup>2</sup> in [48]). It also reduces the cost of the interconnects, as the pitch of TSVs is significantly coarsened. The main downside of the approach is the reduction of bandwidth density and efficiency due to pin limitations and the need for longer links.

For heat dissipation purposes, heat spreading material is generally applied to each chip individually. Then, all chips are covered by a common lid that acts as a heat sink. Molding compounds are sometimes used to fill the gaps between chips and below the heat spreader [42]. However, due to their poor thermal behavior, we advocate directly interfacing the chip with the heat spreader. The lateral space between chips can be filled with either the molding compound or vacuum.

#### 2.2.3 Wirebond Package

Wire bonding is possibly the most widespread technique for packaging integrated circuits and, for this reason, we also describe it in this deliverable. In contrast to flip-chip, in a wirebond package the chip is mounted upright and wires are used to interconnect the chip pads to external circuitry [49]. The I/O pads are manufactured first and, afterwards, the wire is aligned and bonded with the pads using different techniques that depend on manufacturing constraints, pitch requirements, and so on, but that generally use heat to fuse the wire with the metallic contact.

Since the chip is mounted upright directly on top of the package substrate or filler, the space beneath the chip cannot be used for the contacts as in flip-chip. In this case, only the periphery of the chip is amenable to bonding. This reduces the bandwidth density that can be achieved. Moreover, the wires have much higher inductance. For these reasons, many multiprocessor systems employ flip-chip and interposer packages; yet wirebond is popular in low-cost and low-power processor architectures.

### 2.3 On-Chip Electromagnetics

While the fundamental electromagnetic concepts and principles affecting antenna design and electromagnetic wave propagation are still valid in WNoC, the chip-scale environment introduces unique requirements on the design of wireless communication systems, as we describe next.

#### 2.3.1 Antennas

Table 2.1 summarizes the main characteristics of common on-chip antennas for free-space applications that have been proposed for use in the inter-/intra-chip communications domain. Here, we discuss some of their characteristics and their impact on channel characterization.

The miniaturization of the largest dimension of an antenna to meet the chip-scale size requirements imposes the use of very high communication frequencies [50, 51]. In broad terms, an antenna becomes resonant at a frequency at which its length corresponds to half of the wavelength. For example, a 1-mm-long antenna is expected to resonate at approximately 150 GHz, whereas a 150-$\mu$m-long antenna would do so at 1 THz. In the case of graphene antennas such as those considered in this project, the resonance condition is met at half of the *plasmonic* wavelength, which may be up to an order of magnitude smaller than the conventional free-space wavelength. In light of these considerations, channel modeling for chip-scale communications should extend to millimeter-Wave (mmWave) and, to the extent possible, terahertz (THz) frequencies. Note that, given the miniaturization of the antennas at such high frequencies and the need for high performance in multiprocessor architectures, some works start to consider the use of simplified antenna arrays in WNoC [52, 53].
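The half-wavelength rule of thumb quoted above can be checked with a few lines of Python. This is a minimal sketch: it uses the free-space wavelength, ignores dielectric loading by the surrounding materials, and models the plasmonic case with a hypothetical `wavelength_reduction` factor (the ratio between the free-space and the plasmonic wavelength).

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s


def half_wave_resonance_hz(length_m, wavelength_reduction=1.0):
    """Frequency at which `length_m` equals half of the guided wavelength.

    wavelength_reduction = free-space wavelength / guided wavelength.
    It is 1 for a metallic antenna in free space; a value up to about 10
    is assumed here for the plasmonic (graphene) case.
    """
    return C0 / (2.0 * length_m * wavelength_reduction)


def half_wave_length_m(freq_hz, wavelength_reduction=1.0):
    """Antenna length that resonates at `freq_hz` under the same rule."""
    return C0 / (2.0 * freq_hz * wavelength_reduction)


if __name__ == "__main__":
    print(f"1 mm antenna  : {half_wave_resonance_hz(1e-3) / 1e9:.0f} GHz")
    print(f"150 um antenna: {half_wave_resonance_hz(150e-6) / 1e12:.2f} THz")
    # Hypothetical plasmonic antenna with a tenfold wavelength reduction.
    length_um = half_wave_length_m(1e12, wavelength_reduction=10.0) * 1e6
    print(f"1 THz with 10x reduction: about {length_um:.0f} um long")
```

The first two cases reproduce the 150 GHz and 1 THz figures given in the text, and the third shows how a tenfold wavelength reduction shrinks the 1 THz antenna by the same factor.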

| Antenna           | Direction  | Position           | Characteristics                                  | Frequency [Ref]                        |
|-------------------|------------|--------------------|--------------------------------------------------|----------------------------------------|
| Printed dipole    | Horizontal | Within insulator   | Easy to manufacture, but has an end-fire null    | 15 GHz [54], 60 GHz [50], 150 GHz [24] |
| Meander zig-zag   | Horizontal | Within insulator   | More complex, but more compact than dipole       | 15 GHz [49], 25 GHz [55], 60 GHz [56]  |
| Circular antenna  | Horizontal | Within insulator   | Better omnidirectionality at the chip plane      | 60 GHz [57], 100 GHz [1]               |
| Vivaldi antenna   | Horizontal | Within insulator   | Broadband and directional, but also complex      | 180 GHz [58], 200 THz (optical) [59]   |
| Bond-wire         | Vertical   | Off-chip           | Reuses bond-wiring process, but is hard to tune  | 20 GHz [60], 43 GHz [20], 200 GHz [61] |
| Vertical monopole | Vertical   | Through silicon    | Coplanar radiation, embedded in lossy silicon    | 60 GHz [62], 120 GHz [63]              |
| Vertical monopole | Vertical   | Through dielectric | Coplanar radiation, extra packaging steps        | 20 GHz [64], 150 GHz [24], 160 GHz [65] |
| Folded monopole   | Both       | Through and on top | Uses vertical dimension to shorten the footprint | 60 GHz [21], 77 GHz [66]               |
| 2×2 array         | Horizontal | Within insulator   | Simple feed array, directional to diagonals      | 60 GHz [52, 53]                        |

Table 2.1: Summary of integrated antennas amenable to the intra-/inter-chip communication scenario.

Moving to higher frequencies usually opens the door to also communicating over much larger bandwidths. Traditional narrow-band antenna designs (e.g., dipole and patch antennas) commonly exhibit a bandwidth approaching 1% of their resonant frequency. In contrast, ultra-broadband antenna designs (e.g., bowtie, log-periodic, spiral) offer bandwidths in excess of 10% of the carrier frequency. Therefore, up to a few tens of GHz of bandwidth<sup>1</sup> can be supported in chip-scale environments. Such a broad bandwidth suggests the need for characterization over multiple frequency spans (in frequency domain analysis) and for a broadband excitation (generally employed in impulse-based time domain analysis). Finally, one can further increase the aggregate bandwidth not through more wideband behavior, but rather via arrays enabling spatial diversity, that is, multiple concurrent transmissions that do not overlap in space.
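The bandwidth figures above follow directly from the fractional bandwidths. The short Python sketch below evaluates the 1% and 10% cases for a few carrier frequencies. The carriers are picked for illustration from the frequencies mentioned elsewhere in this document, and the function name is hypothetical.

```python
def absolute_bandwidth_hz(carrier_hz, fractional_bandwidth):
    """Absolute bandwidth for a given carrier and fractional bandwidth."""
    return carrier_hz * fractional_bandwidth


if __name__ == "__main__":
    for carrier_ghz in (60, 150, 240):
        narrow = absolute_bandwidth_hz(carrier_ghz * 1e9, 0.01) / 1e9
        broad = absolute_bandwidth_hz(carrier_ghz * 1e9, 0.10) / 1e9
        print(f"{carrier_ghz} GHz carrier: {narrow:.1f} GHz (1%) "
              f"to {broad:.1f} GHz (10%)")
```

At the 240 GHz upper bound of this study, a 10% fractional bandwidth corresponds to 24 GHz, consistent with the "few tens of GHz" stated in the text.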

In an environment as highly integrated as WNoC, the antenna placement is an important design consideration which also affects the location of the excitation ports in the channel characterization methodology. Placing the radiating element as far from the lossy silicon as possible, as is generally done in conventional wireless applications [19, 50, 67], is also feasible in wirebond packages. However, it is not realistic in flip-chip-based packages because the antenna would be short-circuited by the array of micro-bumps. Instead, antennas may be implemented in the metal layers closest to the silicon. However, the proximity of the antennas to the *virtual ground plane* formed by the array of micro-bumps reduces their efficiency, whereas co-planarity between antennas further increases losses. Alternatively, one could use TSVs as quarter-wave monopole antennas because the antenna would radiate laterally, directly towards the receiving antennas, while the array of micro-bumps would naturally act as a ground plane [2, 65, 68].

<sup>1</sup>In general terms, bandwidth in this deliverable refers to the range of frequencies where the antenna/channel/circuit shows good performance and is expressed in Hertz (Hz). This is in contrast to the other possible definition of bandwidth, which refers to transmission or access speeds in computers and is expressed in bits per second (b/s).

Figure 2.3: Schematic representation of wave propagation in an interposer system with flip-chip package excited with vertical monopole antennas, distinguishing between intra- and inter-chip regions, and exemplifying different propagation phenomena.

#### 2.3.2 Propagation

The propagation of electromagnetic waves in chip-scale environments is governed by the same phenomena affecting those in larger-scale scenarios, but with the caveat that, as in any system governed by Maxwell's equations, we need to reconsider the entire system in light of the now much smaller wavelength of the frequency bands of interest.

Based on the description of the typical structure of a chip and of different computing packages, it is straightforward to see that propagation occurs in two regions as exemplified in Figure 2.3. First, in the *intra-chip region*, the waves radiated by the antenna travel through several layers of the chip, including the dielectric. Second, in the *inter-chip region*, waves that have left the chip travel through the inter-chip space until they reach the boundaries of another chip or the package limits. The layers and materials most relevant to propagation in both regions will eventually depend on the antenna position, frequency band, and choice of package.

Beyond the free-space path loss and its antenna-induced frequency dependence, electromagnetic waves within the package can suffer from reflections, refraction, diffraction and absorption as illustrated in Figure 2.3. More specifically, *reflections* will appear both when a wave reaches an obstacle (e.g., another chip or core) and at the interface between different material layers within one chip. The latter depend on the exact material composition and the frequency and, depending on the smoothness of the surface/obstacle (measured relative to the wavelength), can be specular (as defined by Snell's law) or scattered. In addition, when transitioning from one medium to another, *refraction* of the electromagnetic wave will occur, again depending on the change in the refractive index. *Diffraction* or bending of the wave around the (sharp) edges of chips will further determine the propagation of signals in the scenarios under study. Last but not least, *absorption* limits the distance that electromagnetic waves travel within a material before being absorbed, and depends on the material and radiation frequency. All these phenomena ultimately depend on the environment and the frequency of operation. They will appear superimposed in the channel response, creating notches in the frequency domain and spreading in the time domain.

| Methodology              | Measurement | Full-Wave Solver | EM Field Analysis | Ray Tracing |
|--------------------------|-------------|------------------|-------------------|-------------|
| Accuracy                 | High        | High             | Medium            | Low         |
| Computational Complexity | Low         | High             | Low               | Medium      |
| Examples                 | [19]        | [2]              | [18]              | [69]        |

Table 2.2: Methodologies for channel modeling at the chip scale.
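The reflection and refraction phenomena discussed in Section 2.3.2 can be made concrete with the standard lossless, non-magnetic plane-wave formulas. The Python sketch below computes the normal-incidence reflection coefficient and the Snell refraction angle at an interface between two dielectrics. The relative permittivities used (about 11.9 for silicon, 3.9 for silicon dioxide) are typical textbook values assumed here for illustration and are not taken from the package tables of this deliverable. Material losses are ignored.

```python
import math


def refractive_index(eps_r):
    """Refractive index of a lossless, non-magnetic dielectric."""
    return math.sqrt(eps_r)


def reflection_coefficient_normal(eps_r1, eps_r2):
    """E-field reflection coefficient at normal incidence, medium 1 -> 2."""
    n1, n2 = refractive_index(eps_r1), refractive_index(eps_r2)
    return (n1 - n2) / (n1 + n2)


def refraction_angle_deg(eps_r1, eps_r2, incidence_deg):
    """Snell's law: n1*sin(theta_i) = n2*sin(theta_t).

    Returns None when no transmitted angle exists (total internal reflection).
    """
    n1, n2 = refractive_index(eps_r1), refractive_index(eps_r2)
    s = n1 / n2 * math.sin(math.radians(incidence_deg))
    if abs(s) > 1.0:
        return None
    return math.degrees(math.asin(s))


if __name__ == "__main__":
    EPS_SI, EPS_SIO2 = 11.9, 3.9  # assumed typical values
    gamma = reflection_coefficient_normal(EPS_SI, EPS_SIO2)
    print(f"|Gamma| = {abs(gamma):.3f}, reflected power = {gamma ** 2:.1%}")
    for angle in (10.0, 30.0, 45.0):
        print(angle, refraction_angle_deg(EPS_SI, EPS_SIO2, angle))
```

With these assumed values, waves travelling from silicon into the insulator beyond roughly 35 degrees of incidence are totally reflected, which hints at why layered chip structures can guide a significant share of the radiated energy.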

### 2.4 Modeling Methods

Channel modeling generally implies capturing the effects of electromagnetic propagation across the physical environment. There are several methods and approaches that can be used to this end, from analytical to semi-analytical, numerical, and empirical. Table 2.2 lists the main approaches, namely:

- **Measurement campaigns:** Measurement-based approaches are adopted to characterize wireless propagation and generate empirical and statistical models. On the one hand, frequency-domain measurements sweep a spectrum band and record the channel transfer function. The time domain characteristic, i.e., the channel impulse response, is then obtained by inverse Fourier transform. The time resolution is determined by the measurement bandwidth while the maximum excess delay is determined by the sampling interval in the frequency domain. On the other hand, a time-domain measurement usually correlates the received sequence with the transmitted random sequence at the receiver to obtain the channel impulse response.
- **Full-wave Solving:** Full-wave electromagnetic solvers, including High Frequency Structure Simulator (HFSS), COMSOL Multiphysics, Computer Simulation Technology (CST), IE3D and FEKO, among others, employ one or more computational electromagnetic (CEM) methods to solve Maxwell's equations with boundary conditions and compute the EM fields in a propagation medium. CEM methods are divided into time-domain and frequency-domain methods, as well as integral and differential methods, depending on the solving domain and the form of Maxwell's equations. The memory and time cost of full-wave solving increases with the simulation scale in number of wavelengths, although it varies from method to method. For WNoC, the largest dimension of the environment is up to 100 millimeters, which is comparable to dozens of wavelengths for THz and optical waves.
- **EM Field Analysis:** Exact mathematical solutions to Maxwell's equations can be derived under specific system assumptions that keep the mathematical expressions tractable. The EM fields radiated by an antenna can be calculated through the Green's function of the radiation space, which we call the analytical EM evaluation. The intra-chip environment can be considered as a stratified medium, for which the EM fields are expressed as Sommerfeld integrals. It should be noted that the lateral dimension of the stratified medium is assumed to be infinite. Therefore, the effect of the chip edge is ignored in this method [70], trading off accuracy against complexity.

- **Ray Tracing:** As a compromise between accuracy, mathematical tractability and complexity, ray-tracing techniques derived from geometric optics can be utilized. This essentially implies approximating the complete response by the sum of the most relevant far-field *rays*, whose propagation is evaluated accurately. Ray tracing is widely used for modeling large-scale environments, e.g., indoor WiFi scenarios, since it has low computational complexity but gives reasonable results. The main challenge is to determine this set of relevant rays from the geometry of the environment, which may be difficult in highly integrated environments.
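For the frequency-domain measurements described above, the link between the sweep parameters and the recovered impulse response can be illustrated with a minimal sketch. The function name `sweep_to_impulse_response`, the sweep settings (1000 points spaced 10 MHz apart), and the synthetic two-path channel are assumptions chosen for illustration only; they are not taken from any of the cited measurement campaigns.

```python
import numpy as np

def sweep_to_impulse_response(H, delta_f):
    """Inverse-DFT a uniformly sampled transfer function H[k] = H(f0 + k*delta_f).

    Returns the delay axis, the (complex, band-limited) impulse response,
    the time resolution and the maximum unambiguous excess delay.
    """
    H = np.asarray(H, dtype=complex)
    n = H.size
    bandwidth = n * delta_f             # swept bandwidth [Hz]
    time_resolution = 1.0 / bandwidth   # delay-bin spacing [s]
    max_excess_delay = 1.0 / delta_f    # delay window before aliasing [s]
    tau = np.arange(n) * time_resolution
    return tau, np.fft.ifft(H), time_resolution, max_excess_delay

# Hypothetical sweep: 1000 points spaced 10 MHz apart (10 GHz of bandwidth).
n_points, delta_f = 1000, 10e6
k = np.arange(n_points)
# Synthetic two-path channel: direct path plus an echo delayed by 2 ns.
echo_delay = 2e-9
H = 1.0 + 0.5 * np.exp(-2j * np.pi * k * delta_f * echo_delay)

tau, h, resolution, window = sweep_to_impulse_response(H, delta_f)
echo_bin = int(np.argmax(np.abs(h[1:]))) + 1  # strongest tap after the direct path
print(f"resolution = {resolution * 1e12:.0f} ps, window = {window * 1e9:.0f} ns")
print(f"echo found at {tau[echo_bin] * 1e9:.1f} ns")
```

A wider sweep sharpens the delay bins, whereas a finer frequency step lengthens the delay window that can be observed without aliasing.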

All the channel modeling methods discussed in this section are deterministic methods. While this would be a major shortcoming for the modeling of traditional wireless networks, in which the users and the environment are generally mobile and not static, it is acceptable in chip-scale networks, in which everything is static.

Most works on channel characterization for chip-scale communications have been based on full-wave simulation due to manufacturing costs and the complexity of probing in highly integrated packages [22, 23, 64]. In open packages, however, experimental works have been more common because measurements characterize the channel parameters in the real world, whereas in the other methodologies the channel environment is simplified to some degree under certain assumptions [22, 49, 71]. Analytical methods have been explored less [18] due to their low accuracy in geometrically complex environments (i.e., small objects like solder bumps and irregular metal lines are generally neglected because they make the exact solution of the Green's function challenging). Finally, ray tracing has recently emerged as a valid alternative in the THz band [69] due to the increasing computational cost of simulating complex packages at high frequencies<sup>2</sup>. We refer the reader to Chapter 3 for a comprehensive list of channel characterization works.

<sup>2</sup>In broad terms, the solution provided by CEM methods is accurate if the space in which Maxwell's equations need to be solved is sampled with a resolution of approximately a fifth of the wavelength corresponding to the highest frequency of interest.

## 3. State of the Art of Channel Characterization at the Chip Scale

The study of the wireless channel at the chip scale has mostly raised interest in the last decade with the advent of mmWave integrated antennas and compact transceivers. However, the works that provided the first rudimentary chip-scale channel models, dating back to the early 2000s, explored the use of lower frequencies. More specifically, Kenneth K. O's group from the University of Florida pioneered the field by unveiling the first measurements between integrated antennas located within the same chip at the 6–18 GHz band [49, 54, 71]. Those works not only showed the relatively high loss introduced by the channel (around 60 dB), but also discussed the potential effects of the chip package or the role of the dielectrics used for thermal aspects. The latter two aspects, however, were not investigated again until recently. The next sections detail these investigations at the mmWave and THz bands, and briefly discuss why they are not enough in the context of the WiPLASH project.

### 3.1 Characterization at mmWave Frequencies

Table 3.1 shows a comprehensive summary of the efforts that followed the pioneering works [49, 54, 71] and compares them with the work contained within this deliverable. It can be observed that progress in mmWave integrated antennas [50, 72] and pioneering works in WNoC [73] in the late 2000s renewed the interest in this area. Some works appeared in the 2007–2013 period, followed by a significant surge of papers from 2017 to date. Most efforts have been centered on the more mature bands between 20 GHz and 60 GHz [2, 18, 20, 21, 55, 57, 64, 74], with some forays into frequencies over 100 GHz [24, 56, 65, 75], mostly using full-wave solving and actual measurements. Due to the relatively reduced size of the environment at mmWave frequencies, there have been no serious attempts at using ray tracing in this band [63]. In terms of antennas, there has been a shift from the printed dipole and its variants [19, 76] towards vertical monopoles, which several research groups have considered [2, 24, 62]. Package-wise, open chip or custom packages have been evaluated most frequently [19, 23, 55], but with an increasing interest in flip-chip packages [22, 64, 75]. Wirebond [20] and interposers [68] have had marginal relevance thus far. This deliverable addresses three different packages and frequencies up to 240 GHz.

Frequency domain analysis has driven most of the efforts, highlighting the importance of path loss in the feasibility of chip-scale links. Full-wave simulations of a standard flip-chip package, reproduced in Figure 3.1(a), confirmed that path loss can exceed 70 dB over a distance of a few centimeters. To put such figures in context, recent on-chip mmWave transceivers with reasonable efficiency (2 pJ/bit in [38]) considered

| Ref.                | Year | Freq.<br>[GHz] | Method                       | Antenna                | Package                               | Scope         | Domain     |
|---------------------|------|----------------|------------------------------|------------------------|---------------------------------------|---------------|------------|
| [71]                | 2001 | 6–18           | Measurements,<br>Ray Tracing | Printed dipole         | Open chip,<br>Flip-chip               | Intra-chip    | Frequency  |
| [54]                | 2002 | 10–18          | Measurements                 | Printed dipole         | Open chip                             | Intra-chip    | Frequency  |
| [49]                | 2005 | 15             | Measurements                 | Printed dipole         | Flip-chip                             | Intra-chip    | Frequency  |
| [19]                | 2007 | 10–110         | Measurements                 | Printed dipole         | Open chip                             | Intra-chip    | Freq, time |
| [20]                | 2009 | 30–55          | Measurements                 | Bond-wire antenna      | Bond wires                            | Inter-chip    | Frequency  |
| [55]                | 2009 | 16–30          | Measurements                 | Printed meander        | Open chip                             | Intra-chip    | Frequency  |
| [18]                | 2009 | 15–140         | Field Analysis               | Dipole (model)         | Open chip                             | Intra-chip    | Frequency  |
| [21]                | 2013 | 50–70          | Measurements                 | Folded monopole        | Custom over flip-chip                 | Inter-chip    | Frequency  |
| [74]                | 2015 | 55–60          | Full-wave solver             | Printed Dipole         | Open chip                             | Intra-chip    | Frequency  |
| [64]                | 2016 | 17–27          | Full-wave solver             | Vert monopole          | Flip-chip                             | Inter-chip    | Frequency  |
| [77]                | 2016 | 10–90          | Full-wave solver             | Vert monopole          | Custom over flip-chip                 | Intra-chip    | Frequency  |
| [76]                | 2017 | 0–80           | Full-wave solver             | Zig-zag monopole       | Custom                                | Inter-chip    | Frequency  |
| [23]                | 2017 | 60             | Full-wave solver             | Loop, PLPA             | Open chip                             | Intra-chip    | Time       |
| [24]                | 2017 | 130–170        | Full-wave solver             | Vert Monopole          | Custom                                | Intra-/Inter- | Freq, time |
| [65]                | 2017 | 155–165        | Measurements                 | Vert Monopole          | Custom                                | Intra-/Inter- | Frequency  |
| [78]                | 2018 | 55–65          | Full-wave solver             | Folded dipole, PLPA    | Open chip                             | Intra-chip    | Freq, time |
| [56]                | 2018 | 155–165        | Measurements                 | Vert Monopole          | Custom<br>open chip                   | Intra-chip    | Frequency  |
| [68]                | 2018 | 60–120         | Full-wave solver             | Vert Monopole          | Flip-chip                             | Intra-/Inter- | Frequency  |
| [57]                | 2018 | 56–67          | Measurements                 | Dipole, circular patch | Custom over flip-chip                 | Inter-chip    | Frequency  |
| [79]                | 2018 | 150–250        | Full-wave solver             | Vert monopole          | Open chip                             | Intra-chip    | Frequency  |
| [25]                | 2019 | 40–60          | Measurements                 | Dipole                 | Open chip                             | Intra-chip    | Frequency  |
| [80]                | 2019 | 50–60          | Measurements                 | Printed dipole         | None                                  | Inter-chip    | Frequency  |
| [62]                | 2019 | 60–70          | Full-wave solver             | Vert monopole          | Custom                                | Intra-chip    | Frequency  |
| [81]                | 2019 | 60–180         | Full-wave solver             | Vert monopole          | Custom                                | Intra-/Inter- | Freq, time |
| [22]                | 2019 | 10–40          | Measurements                 | Zig-zag monopole       | Custom open chip                      | Intra-/Inter- | Frequency  |
| [22]                | 2019 | 50–70          | Full-wave solver             | Zig-zag monopole       | Flip-chip                             | Intra-/Inter- | Frequency  |
| [53]                | 2020 | 55–65          | Full-wave solver             | Zig-zag array          | Custom open chip                      | Intra-/Inter- | Frequency  |
| [75]                | 2020 | 60–120         | Full-wave solver             | Vert Monopole          | Flip-chip                             | Intra-chip    | Freq, time |
| D3.1                | 2021 | 60–240         | Full-wave<br>solver          | Punctual excitation    | Flip-chip,<br>interposer,<br>wirebond | Intra-/Inter- | Freq, time |

Table 3.1: Works on mmWave channel modeling at the chip scale up to 2020.

an attenuation of 26.5 dB between transmitter and receiver. Subsequently, Zhang *et al.* [19] tested high-resistivity silicon as a way to reduce the losses induced by the lossy substrate. This method achieved improvements of around 20–30 dB. In [74], the authors reduce attenuation to 15–30 dB using a layer of undoped silicon below the die substrate. Another line of research [21, 57] resorts to metamaterial-like structures in open chip schemes to enhance the coupling of surface waves and reduce the amount of energy radiated away from the chip and into the silicon. Similarly, Wu *et al.* [65] propose a 3D-printed optimized dielectric attempting to jointly optimize several links within a single package. The disadvantage of the above methods, however, is that they resort to non-standard processes. Thus, there has been an alternative line of



Figure 3.1: Channel characteristics for intra-chip propagation within a flip-chip package at 60 GHz with variable silicon thickness, heat spreader of 0.2 mm, and heat sink on top. Data from [2].

research which looks at solutions compatible with standard packages and that is further pursued in this deliverable. For instance, WiPLASH partner UPC has evaluated the impact of optimizing the silicon and thermal interface material thicknesses within a flip-chip package [68], taking chip-wide losses down to around 30 dB.

As for time domain analysis, little has been reported about the dispersive nature of chip-scale wireless channels. In their theoretical work, Matolak *et al.* predicted worst-case values of several nanoseconds using the micro-reverberation chamber model at mmWave and THz frequencies [39]. First measurements of the Power-Delay Profile (PDP) of open die schemes, on the contrary, yielded delay spreads around 100 ps for transmissions at 30–60 GHz [19]. This is because the reverberation chamber model assumes full encasement and does not take dielectric losses into account. For flip-chip and custom packages, which fall in between these two extremes, the simulated delay spread has been a few hundred picoseconds [24, 81]. In this context, it may not be possible to provide the speeds of several tens of Gb/s promised in several works [36], since the coherence bandwidth would be around a few GHz at most. With this in mind, WiPLASH partner UPC proposes to optimize the flip-chip package taking dispersion into account [75]. For instance, it was shown that thinning down the silicon can have a positive effect on the delay spread, as shown in Figure 3.1(b). With more exhaustive explorations, the authors are able to reduce the worst-case delay spread below 100 ps while maintaining a reasonable path loss, ensuring a chip-wide coherence bandwidth over 10 GHz.

Why is the existing work not enough for WiPLASH? While prior work has taught us lessons on the detrimental effects of thick bulk silicon layers and the benefits of relatively thick heat spreaders, the studies have mostly focused on flip-chip or open die configurations in the 60–100 GHz range. Interposers and other multi-chip packages have rarely been studied, and little to no work has been done in the range between 100 GHz and 1 THz. This is important in WiPLASH because we propose the use of antennas in the THz range and within environments including but not limited to multi-chiplet packages. Therefore, in this deliverable we go beyond the state of the art by providing models of wirebond, flip-chip, and interposer packages and making a fair comparison between their characteristics at different frequencies spanning the 60–240 GHz range (we remind the reader that 240 GHz is the experimentally targeted frequency in WiPLASH) and with time-domain impulses covering the whole band from 10 GHz to 1 THz.

### 3.2 Characterization at THz Frequencies

The research in characterization of wireless channels at the chip scale has been less intense in the THz band. As a pioneering work, Lee *et al.* simulated the intra-chip channel where antennas are placed in a polyimide layer in an open die scheme, by using a full-wave solver at 300 GHz [82]. The authors report an attenuation of around 40 dB at 1 cm distance, and argue that, compared with a conventional on-chip antenna over silicon, the on-chip antenna placed in the low-loss dielectric polyimide layer improves the channel loss by 20-30 dB.

As one of the first attempts at THz chip-scale propagation modeling, Chen *et al.* analyze the EM fields in the CMOS chip using the Sommerfeld integration method, and the results are validated with the full-wave solver HFSS [3]. As main observations, the path loss is highly frequency-selective due to surface wave and guided wave propagation, as presented in Fig. 3.2. The path loss oscillates periodically in the THz band, with a period corresponding to the frequency spacing between two adjacent surface wave modes. Some recommendations, such as thinning the underfill layer or using a conductive layer as heat spreader, are also analyzed there.



Figure 3.2: Path loss of the THz chip-scale wireless channel [3].

Recently, Chen *et al.* developed a multi-ray model by using the ray-tracing method for intra-chip channels within flip-chip packaging structures in the THz band (0.1-1 THz) [69]. The authors observed that the intra-chip channel is highly frequency-selective due to multipath and that, as expected, low-loss silicon leads to large delay spread due to reverberation. Yet still, with the right conditions, the capacity of the intra-chip channel can reach 150 Gbps and 1 Tbps with BER below  $10^{-14}$  when the transmit power is 1 dBm and 10 dBm, respectively, and the transmission distance is 40 mm.

Moving up in the spectrum, the optical frequency bands enter into consideration. Given that light is already utilized for intra-chip and inter-chip wired communications [83], the possibility to reuse some of the existing components to enable wireless optical communications in WNoC has recently been considered. As a result, channel modeling for optical frequencies has also raised the interest of the research community. Given that this deliverable focuses on the mmWave and THz bands, for the sake of brevity, we refer the interested reader to related works [84–88].

Why is the existing work not enough for WiPLASH? It can be observed that works in the THz band have been very scarce and have mostly relied on analytical and ray-tracing methods. Even though these works lay the foundations for an in-depth analysis of propagation at deep THz frequencies, their methods may be inaccurate in highly integrated packages or at frequencies below a few hundred GHz. For the purposes of WiPLASH, where we aim to provide models of wirebond, flip-chip, and interposer packages at different frequencies spanning the 60–240 GHz range, full-wave solving is preferred due to its higher accuracy and ease of use in multiple packages.

## 4. Methodology

This chapter summarizes the methods employed in subsequent chapters to evaluate propagation within the different computing packages. Figure 4.1 shows a graphical schematic of the methodology. In essence, we provide 3D models that capture the geometry and materials of the different packages. These are simulated in a particular frequency band or using broadband pulses in the time domain by means of a full-wave solver, as further described in Section 4.1. The outcome of the simulations is a set of S-parameters or time signals relating the output at the receiving antenna to the input at the transmitting one. These parameters are then fed to custom MATLAB scripts that obtain the path loss characteristics out of the S-parameters, as elaborated in Section 4.2, and the delay spread scaling out of the time signals, as described in Section 4.3.

### 4.1 Simulation Setup

In Chapter 2, we mentioned that full-wave electromagnetic simulators solve Maxwell equations with boundary conditions for computing the EM fields in a propagation medium. This method is chosen here given its high accuracy and the impracticality of probing highly integrated computing packages for direct measurements. In our case, we employ CST Microwave Studio [89], which hosts a variety of methods for the solving of electromagnetic problems in the time and frequency domains.

**Computational resources.** Since the memory and time cost of full-wave solving increase with the simulation scale in number of wavelengths, the accurate simulation of highly integrated packages at mmWave and THz frequencies requires significant



Figure 4.1: General view of the evaluation methodology used in this deliverable.



Figure 4.2: Overview of a chip package and the simulated ports. We only need to excite one white (edge), one grey (center), and one black port (corner). The rest of combinations can be inferred thanks to symmetry.

resources. In our case, we provisioned two dedicated server workstations: the first one with a quad-core CPU at 3.90 GHz, 32 GB of RAM, and a GeForce GTX 1080Ti GPU; the second one with a 16-core CPU at 2.16 GHz and 128 GB of RAM. The GPU can be used to accelerate time-domain simulations.

**Package modeling.** The structures of flip-chip, interposer, and wirebond shown in their respective sections, namely, Sections 5.1, 6.1, and 7.1, are modeled in CST based on datasheets and schematics from real packages. Common parameters in the models are as follows: the silicon die has a resistivity of 10  $\Omega$ ·cm, with  $\varepsilon_{r,Si} = 11.9$ , whereas the heat spreader is Aluminum Nitride (AlN) with  $\varepsilon_{r,AlN} = 9$  and negligible losses. Both keep the thickness as a simulation parameter. The insulator is silicon dioxide with  $\varepsilon_{r,SiO2} = 3.9$  and  $\tan \delta = 0.025$ , with a fixed thickness of 10  $\mu$ m. To reduce the computational burden, metals are modeled as perfect electrical conductors. Moreover, the interconnect layers and micro-bump arrays are generally approximated as a solid metallic element. This assumption has been validated in previous simulation works [2, 68] and is justified by the small pitch of the interconnect layers (<10  $\mu$ m) and bump array (<0.1 mm) as compared to the excitation wavelength (0.3–3 mm). Finally, the PCB is also generally modeled as a solid block of metal due to the dense maze of metal layers within it that route signals from the chips to outside the system. We refer to this in subsequent sections as the *redistribution layer*.
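As a quick sanity check of the wavelength range quoted above, the following sketch computes the wavelength at a few frequencies, both in free space and inside the dielectrics listed in the text. It assumes lossless, non-magnetic materials, so the in-material wavelength is simply the free-space value divided by $\sqrt{\varepsilon_r}$; the helper `wavelength_mm` and the choice of frequencies are illustrative assumptions.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum [m/s]

def wavelength_mm(freq_hz, eps_r=1.0):
    """Wavelength in a lossless dielectric of relative permittivity eps_r, in mm."""
    return C0 / (freq_hz * math.sqrt(eps_r)) * 1e3

# Relative permittivities taken from the package description in the text.
materials = {"free space": 1.0, "SiO2": 3.9, "Si": 11.9}
bump_pitch_mm = 0.1  # upper bound on the bump pitch quoted in the text

for freq in (100e9, 240e9, 1e12):
    for name, eps_r in materials.items():
        lam = wavelength_mm(freq, eps_r)
        print(f"{freq / 1e9:6.0f} GHz  {name:10s}  lambda = {lam:.3f} mm  "
              f"lambda / bump pitch = {lam / bump_pitch_mm:.1f}")
```

The free-space values at 100 GHz and 1 THz reproduce the 0.3–3 mm range, while the in-material wavelengths are shorter and are the ones to compare against the feature pitch when judging the solid-metal approximation.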

**Antennas.** Unless noted, we consider a homogeneous distribution of  $4 \times 4$  antennas within the die(s) of the package. In order to minimize the impact of the antenna on the channel characterization procedure, we employ electrically small antennas implemented as small discrete or waveguide ports in CST, depending on the package. This way, the source is as omnidirectional and broadband as it can be, yet at the cost of a low effective gain. In any case, the antenna losses are decoupled in post-processing (see Equation 4.1). Finally, due to the XY symmetry in all cases, we only need to excite three ports as shown in Figure 4.2 to obtain all the relevant responses.
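The reduction to three excited ports can be checked with a short sketch that groups the positions of a $4 \times 4$ grid into classes that map onto each other under the symmetries of a square. This assumes that the package and the antenna layout are square, so that rotations by 90° and diagonal mirrors apply in addition to the X and Y mirrors; the helper names are hypothetical.

```python
def symmetric_positions(r, c, n=4):
    """All grid positions equivalent to (r, c) under the symmetries of a square."""
    m = n - 1
    return {
        (r, c), (c, m - r), (m - r, m - c), (m - c, r),   # rotations by 0/90/180/270 deg
        (r, m - c), (m - r, c), (c, r), (m - c, m - r),   # mirrors (axes and diagonals)
    }

def symmetry_classes(n=4):
    """Group the n x n ports by their canonical (smallest) equivalent position."""
    classes = {}
    for r in range(n):
        for c in range(n):
            key = min(symmetric_positions(r, c, n))
            classes.setdefault(key, []).append((r, c))
    return classes

classes = symmetry_classes(4)
for representative, members in sorted(classes.items()):
    print(representative, len(members), "ports")
```

For the $4 \times 4$ case this yields three classes (corner, edge, and center ports), so exciting one representative of each is enough to recover the remaining responses by symmetry.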

**Meshing.** The process of meshing determines the accuracy and computational cost of the simulation. In our case, we employ CST's adaptive mesh refinement process in both time and frequency domain, with a minimum of 2 passes and a maximum of 8. In simulations of large structures at high frequencies where adaptive refinement may not be affordable, we manually adapt the mesh properties in a trial and error process to avoid running out of memory and keep simulations tractable.

### 4.2 Frequency Domain Analysis

The full-wave solver uses the Finite Element Method (FEM) to obtain the field distribution, the antenna gain, and the coupling between antennas in the frequency domain, i.e., the S parameters. Our simulations consider a bandwidth of 10 GHz around the target central frequency, which is enough for path loss calculation purposes.

Once the S parameters are obtained, the channel frequency response  $H_{ij}(f)$  is evaluated for each antenna pair as

$$G_i G_j |H_{ij}(f)|^2 = \frac{|S_{ji}(f)|^2}{(1 - |S_{ii}(f)|^2) \cdot (1 - |S_{jj}(f)|^2)}, \tag{4.1}$$

where  $G_i$  and  $G_j$  are the transmitter and receiver antenna gains,  $S_{ji}$  is the coupling between transmitter *i* and receiver *j*, whereas  $S_{ii}$  and  $S_{jj}$  are the reflection coefficients at both ends. With respect to the antenna gain, there are two important observations to make, namely:

- The antenna gain in the Friis formula and similar expressions such as Equation (4.1) is single-valued and generally corresponds to the gain at the direction pointing to the receiver or at the direction of maximum radiation. However, computing packages are highly integrated and the receiving antenna will pick up signals coming from a myriad of directions since we are using quasi-omnidirectional excitations. Therefore, we instead employ the antenna gain averaged over the complete solid angle, which essentially represents the losses of the antenna.
- A large fraction of our simulations occur in a closed environment where all boundaries are Perfect Electric Conductor (PEC) layers. This hinders the evaluation of the far field, from which CST obtains the gain. To obtain an approximation of the actual radiation pattern and gain, we perform extra simulations in which the packages are surrounded by open boundaries.
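Equation (4.1) can be sketched in a few lines of post-processing code. The snippet below is an illustrative Python version, not the MATLAB scripts used in this deliverable; the function name `channel_gain_db` and the numeric S-parameter values are assumptions, and the gains are taken as linear, solid-angle-averaged values as discussed above.

```python
import numpy as np

def channel_gain_db(s_ji, s_ii, s_jj, g_i=1.0, g_j=1.0):
    """Return |H_ij(f)|^2 in dB following Eq. (4.1).

    s_ji       : coupling between transmitter i and receiver j (complex, linear)
    s_ii, s_jj : reflection coefficients at both ends (complex, linear)
    g_i, g_j   : antenna gains (linear), assumed averaged over the solid angle
    """
    s_ji = np.asarray(s_ji, dtype=complex)
    s_ii = np.asarray(s_ii, dtype=complex)
    s_jj = np.asarray(s_jj, dtype=complex)
    # Remove the power lost to impedance mismatch at both ports.
    mismatch = (1 - np.abs(s_ii) ** 2) * (1 - np.abs(s_jj) ** 2)
    h_squared = np.abs(s_ji) ** 2 / (mismatch * g_i * g_j)
    return 10 * np.log10(h_squared)

# Hypothetical values: -60 dB coupling, half of the power reflected at each port.
gain_db = channel_gain_db(1e-3, np.sqrt(0.5), np.sqrt(0.5))
path_loss_db = -gain_db
print(f"path loss = {path_loss_db:.2f} dB")
```

In this example the mismatch accounts for roughly 6 dB of the 60 dB measured between the ports, which is why the de-embedded path loss is lower than the raw coupling would suggest.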

Once the whole matrix of frequency responses H is obtained, a path loss analysis can be performed by fitting the average attenuation L over distance d to

$$L = 10 n \cdot \log_{10}(d/d_0) + L_0, \tag{4.2}$$

where  $L_0$  is the path loss at the reference distance  $d_0$  and n is the path loss exponent [19]. The path loss exponent is around 2 in free space, below 2 in guided or enclosed structures, and above 2 in lossy environments. Since losses at the channel are crucial to determine the power consumption at the transceiver, we may report path loss in terms of worst-case  $L_{max}$ , average  $L_{avg}$ , and path loss exponent n.
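The fit of Equation (4.2) reduces to a linear regression of the attenuation in dB against $10\log_{10}(d/d_0)$. A minimal sketch follows, using synthetic data generated with a known exponent; the function name `fit_path_loss`, the reference distance of 1 mm, and the averaging of $L_{avg}$ directly in dB are assumptions made for illustration.

```python
import numpy as np

def fit_path_loss(distances_m, loss_db, d0=1e-3):
    """Least-squares fit of L = 10 n log10(d / d0) + L0; returns (n, L0)."""
    x = 10 * np.log10(np.asarray(distances_m, dtype=float) / d0)
    n, l0 = np.polyfit(x, np.asarray(loss_db, dtype=float), 1)
    return n, l0

# Synthetic link set: exponent 2.5 and 30 dB at d0 = 1 mm (hypothetical values).
distances = np.array([2, 4, 8, 16]) * 1e-3
losses = 10 * 2.5 * np.log10(distances / 1e-3) + 30.0

n_fit, l0_fit = fit_path_loss(distances, losses)
l_max = losses.max()    # worst-case path loss
l_avg = losses.mean()   # average path loss, here averaged in dB (assumption)
print(f"n = {n_fit:.2f}, L0 = {l0_fit:.1f} dB, "
      f"L_max = {l_max:.1f} dB, L_avg = {l_avg:.1f} dB")
```

With simulated data, the scatter of the points around the fitted line also gives a rough indication of how well a single exponent describes the package.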

### 4.3 Time Domain Analysis

In the time domain, we define an input excitation  $x_i(t)$  at the input of the transmitting antenna *i*. Then, CST employs the Finite-Difference Time-Domain (FDTD) method to calculate the output signal  $y_j(t)$  at the receiving antenna *j*. Hence, the impulse response  $h_{ij}(t)$  between transmitter *i* and receiver *j* can be derived with the classical formulation

$$y_j(t) = x_i(t) \star h_{ij}(t), \tag{4.3}$$

where  $\star$  denotes the convolution operator.

Our simulations consider a Gaussian pulse whose bandwidth spans the whole spectrum from 10 GHz to 1 THz. The range below 10 GHz is eliminated from the input signal to minimize the resonant cavity effects of the different packages. Regardless, this extremely large frequency span leads to a very short impulse with a duration of less than 5 ps (Figure 4.3). Since the channel response is expected to be much longer, on the order of hundreds of picoseconds up to a few nanoseconds, we consider that the input signal tends to a delta,  $x(t) \rightarrow \delta(t)$. Therefore, the output signal approximates the impulse response,  $y(t) \rightarrow h(t)$.



Figure 4.3: Gaussian pulse used in the time-domain simulations and whose spectrum spans from 10 GHz up to 1 THz.

Once h(t) is calculated, it is straightforward to evaluate the PDP, which gives the intensity of a signal received through a multipath channel as a function of time delay  $\tau$ . For small-scale channel modeling, the PDP of the channel is generally found by taking the spatial average of the channel's baseband impulse response h(t). However, since this scenario is static, it is not necessary to resort to statistical models. We rather obtain the PDP between transmitter *i* and receiver *j* as

$$P_{ij}(\tau) = |h_{ij}(t,\tau)|^2, \tag{4.4}$$

therefore obtaining a matrix of PDP functions  $\mathbf{P}$  for all transmitters and receivers within the chip.

One metric for evaluating the multipath richness of the channel is the delay spread  $\tau_{rms}$ , which is evaluated using the PDP of each channel as

$$\tau_{rms}^{(i,j)} = \sqrt{\frac{\int (\tau - \overline{\tau_{ij}})^2 P_{ij}(\tau) \, d\tau}{\int P_{ij}(\tau) \, d\tau}},\tag{4.5}$$


where  $\overline{\tau_{ij}}$  is the mean delay of the channel, which is calculated as the first moment of the PDP. In other words,

$$\overline{\tau_{ij}} = \frac{\int \tau P_{ij}(\tau) d\tau}{\int P_{ij}(\tau) d\tau}. \tag{4.6}$$

In this work, we will assume that the transmission rate of all nodes is dimensioned to the worst case across all links and, therefore, that they are operated at the lowest speed ensuring correct decoding at all nodes. As a result, we will take the worst delay spread across all transmitter-receiver pairs (i.e., across all distances) as the limiting case and use it to evaluate the coherence bandwidth  $B_c$, as follows

$$\tau_{rms} = \max_{i,j \neq i} \tau_{rms}^{(i,j)} \Rightarrow B_c \propto \frac{1}{\tau_{rms}}. \tag{4.7}$$

For simplicity, we will take  $B_c = \frac{1}{\tau_{rms}}$ .
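The chain from impulse response to coherence bandwidth in Equations (4.4) to (4.7) can be illustrated with the following sketch. It assumes uniformly sampled impulse responses, so that the integrals reduce to sums, and uses two synthetic two-tap channels as stand-ins for simulated links; the function name and link labels are hypothetical.

```python
import numpy as np

def rms_delay_spread(tau, h):
    """RMS delay spread of a uniformly sampled impulse response (Eqs. 4.4-4.6)."""
    tau = np.asarray(tau, dtype=float)
    pdp = np.abs(np.asarray(h)) ** 2                   # power-delay profile
    mean_delay = np.sum(tau * pdp) / np.sum(pdp)       # first moment of the PDP
    return np.sqrt(np.sum((tau - mean_delay) ** 2 * pdp) / np.sum(pdp))

# Delay axis: 1 ps steps over 1 ns.
tau = np.arange(1001) * 1e-12

# Two synthetic links, each with two equal-power taps (illustrative only).
h_long = np.zeros(tau.size)
h_long[[0, 200]] = 1.0      # taps at 0 ps and 200 ps
h_short = np.zeros(tau.size)
h_short[[0, 100]] = 1.0     # taps at 0 ps and 100 ps

links = {("tx0", "rx1"): h_long, ("tx0", "rx2"): h_short}
spreads = {pair: rms_delay_spread(tau, h) for pair, h in links.items()}

worst_spread = max(spreads.values())   # Eq. (4.7): worst case across all links
coherence_bw = 1.0 / worst_spread      # taking B_c = 1 / tau_rms
print(f"worst delay spread = {worst_spread * 1e12:.0f} ps, "
      f"B_c = {coherence_bw / 1e9:.0f} GHz")
```

Two equal taps separated by 200 ps give a delay spread of 100 ps and hence a coherence bandwidth of 10 GHz under the simplification $B_c = 1/\tau_{rms}$.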

## 5. Analysis of Flip-chip Package

This chapter is devoted to the evaluation of a flip-chip package in the frequency and time domains. The chapter is organized as follows. First, we detail the geometry and materials of the package in Section 5.1. Then, we analyze the results in the frequency domain in Section 5.2 and in the time domain in Section 5.3. Finally, we discuss other aspects relative to package engineering and optimization in Section 5.4.

### 5.1 Environment Description

An instance of a complete flip-chip package with solder bumps is shown in Figure 5.1. During the manufacturing process, the solder bumps are deposited on the chip pads, which already carry a valid under bump metallization (UBM) like nickel/gold (Ni/Au). Then, the chip is flipped over and its solder bumps are aligned precisely to the pads of the package carrier external circuit.

The layers are described from top to bottom as summarized in Table 5.1. On top, the heat sink and heat spreader dissipate the heat out of the silicon chip, as they both have good thermal conductivity. Bulk silicon serves as the foundation of the transistors. This layer has low resistivity (10  $\Omega \cdot cm$ ), which is convenient for the operation of transistors, but not for electromagnetic propagation [72]. The interconnect layers, which occupy the bottom of the silicon die as shown in the inset of Fig. 5.1, are generally made of copper and surrounded by an insulator such as silicon dioxide (SiO<sub>2</sub>) [1]. Finally, we find a package substrate or PCB below the bump array. Although the material of the carrier may be alumina or similar, we model it as perfect electrical conductor due to the existence of a dense metallic redistribution layer within it.



Figure 5.1: Schematic of the layers of a flip-chip package.

| Layer                | Thickness  | Material         | $\varepsilon_r$ | $\tan(\delta)$     | $\rho$  |
|----------------------|------------|------------------|-----------------|--------------------|---------|
| Heat sink            | 0.1–0.5 mm | Aluminum         | PEC             | PEC                | PEC     |
| Heat spreader        | 0.1–0.5 mm | Aluminum Nitride | 8.6             | 3·10 <sup>-4</sup> | —       |
| Silicon die          | 0.5 mm     | Bulk Silicon     | 11.9            | _                  | 10 Ω·cm |
| Insulator            | 10 µm      | $SiO_2$          | 3.9             | 0.025              | —       |
| Bumps                | 87.5 μm    | Cu and Sn        | PEC             | PEC                | PEC     |
| Redistribution layer | 3 μm       | Copper           | PEC             | PEC                | PEC     |
| PCB                  | 0.5 mm     | Epoxy resin      | 4               | _                  | -       |

Table 5.1: Characteristics of the layers in a flip-chip package and default dimensions.
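To put the resistivity of bulk silicon in Table 5.1 on the same footing as the loss tangents listed for the other layers, the following sketch converts it into an equivalent conduction loss tangent, $\tan\delta = \sigma / (\omega \varepsilon_0 \varepsilon_r)$. This is a textbook relation that the text above does not state explicitly, and the function name and the choice of 60 GHz are ours, so the result should be read as an order-of-magnitude illustration only.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def conduction_loss_tangent(resistivity_ohm_cm, eps_r, freq_hz):
    """Equivalent loss tangent of a conductive dielectric: sigma / (omega * eps0 * eps_r)."""
    sigma = 1.0 / (resistivity_ohm_cm * 1e-2)  # ohm*cm -> ohm*m, then invert to S/m
    omega = 2.0 * math.pi * freq_hz
    return sigma / (omega * EPS0 * eps_r)

# Bulk silicon values taken from Table 5.1; 60 GHz is the default frequency of Table 5.2.
tan_delta_si = conduction_loss_tangent(10.0, 11.9, 60e9)
tan_delta_aln = 3e-4  # heat spreader value listed in Table 5.1
print(f"bulk Si: {tan_delta_si:.3f}, AlN: {tan_delta_aln:.0e}")
```

Under these assumptions the bulk silicon comes out hundreds of times lossier than the heat spreader material, which is in line with the preference for thin silicon discussed in this chapter.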

Table 5.2: Package parameters for flip-chip.

| Parameter                | Default Value | Variations    | Units |
|--------------------------|---------------|---------------|-------|
| Die size                 | 8             | 12, 16, 20    | mm    |
| Silicon thickness        | 0.1           | 0.5           | mm    |
| Heat spreader thickness  | 0.5           | 0.1           | mm    |
| Lateral space material   | Vacuum        | Epoxy         | N/A   |
| Lateral space dimensions | 1             | 1.4, 1.8      | mm    |
| Frequency                | 60            | 120, 180, 240 | GHz   |

The vertical dimensions of the different layers have an impact on propagation [2]. The bulk silicon used in the chip substrate generally has low resistivity, so a thin substrate is preferred [2], whereas the materials used as heat spreaders have low electrical losses [72], so rather thick layers are desirable. To evaluate this impact in our simulations, we assume that both the substrate and the heat spreader, aluminum nitride (AlN) in our case, can have a thickness of either 0.1 or 0.5 mm each.

As for the lateral dimensions, we initially assume a small die with a lateral size of 8 mm so that it is comparable to the size of dies in bondwire and interposer packages. Later, in Section 5.4, we simulate larger dies. On the sides of the die, we assume an empty space of variable size filled with air or epoxy. The package is laterally enclosed with a metallic lid.

In this scenario, antennas are modeled as electrically small waveguide ports within the silicon dioxide, oriented upwards along the Z axis. This configuration yields a rather omnidirectional radiation laterally and upwards to the substrate, as shown in Figure 5.2(b-d). A set of $4 \times 4$ antennas is distributed homogeneously across the die. For small dies, this antenna distribution could theoretically lead to near-field coupling and undesired resonances. However, the presence of lossy silicon and the poor efficiency of the excitation structures minimize such effects.
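The set of links that results from a homogeneous $4 \times 4$ antenna grid can be enumerated with a short sketch. The exact antenna coordinates are not given here, so the layout below (one antenna at the center of each of 16 equal cells) and the helper names `antenna_grid` and `link_distances` are assumptions made for illustration.

```python
import itertools
import math

def antenna_grid(die_size_mm, n=4):
    """Centers of an n x n grid of equal cells covering a square die (assumed layout)."""
    pitch = die_size_mm / n
    coords = [pitch * (i + 0.5) for i in range(n)]
    return [(x, y) for x in coords for y in coords]

def link_distances(positions):
    """Euclidean distance of every unordered transmitter-receiver pair."""
    return [math.dist(a, b) for a, b in itertools.combinations(positions, 2)]

positions = antenna_grid(8.0)  # default 8 mm die from Table 5.2
distances = link_distances(positions)
print(len(distances), min(distances), round(max(distances), 2))
```

With this assumed placement, the 8 mm die yields 120 distinct antenna pairs, with links spanning from one grid pitch up to the grid diagonal.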

## 5.2 Frequency Domain Analysis

#### 5.2.1 Analysis at mmWave Frequencies

Here, we start by quantifying the path loss in a flip-chip package at 60 GHz for the default dimensions and materials given in Table 5.2. We simulate the different combinations of silicon and heat spreader thickness, obtain the path loss for all antenna pairs, and perform a linear regression to obtain the dependence on distance.





Figure 5.2 shows the path loss as a function of the transmission distance for an instance of a flip-chip package, along with approximate radiation patterns. We observe that the path loss points are scattered even though the radiation patterns are quite omnidirectional. This suggests that there is no clear direct path between transmitters and receivers, and that energy arrives through many different reflections. Hence, the path loss depends more on the positions of transmitter and receiver than on the distance between them. Still, a linear regression suggests an upward trend with distance.
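The regression step can be sketched as an ordinary least-squares fit of path loss (in dB) against distance. The sample points below are hypothetical placeholders rather than simulation results, and the function name `linear_fit` is ours; the actual fitting procedure used for the figures may differ.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical (distance in mm, path loss in dB) samples, NOT simulation results.
distance_mm = [2.0, 2.8, 4.0, 4.5, 6.0, 7.2, 8.5]
path_loss_db = [31.0, 36.5, 30.2, 38.0, 35.1, 41.3, 39.0]

slope, intercept = linear_fit(distance_mm, path_loss_db)
residuals = [pl - (slope * d + intercept) for d, pl in zip(distance_mm, path_loss_db)]
print(f"trend: {slope:.2f} dB/mm, offset: {intercept:.1f} dB")
```

The spread of `residuals` around the fitted line is a simple way to express how strongly individual links deviate from the distance trend.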

Figure 5.3 plots the path loss for all the silicon and AlN thickness combinations. First, we observe that the benefits of thinning down the silicon layer are significant. A chip with 100-$\mu$m silicon has a path loss ranging between 20 and 45 dB, whereas packages with thick silicon have an extra 30 dB of path loss in the worst case. We also observe that the AlN thickness has a subtle impact on path loss, affecting distant links mostly. The impact is more noticeable when the silicon die is also thick, because waves that enter the silicon suffer much higher losses than when the silicon is thin, whereas those travelling through the AlN layer are less attenuated.

#### 5.2.2 Scale-up to THz Frequencies

To understand propagation at frequencies closer to the objectives set by the WiPLASH project, we gradually increase the frequency until reaching 240 GHz. This is the frequency band at which the test Silicon-Germanium (SiGe) transceivers will operate in co-integration with the graphene antennas in WiPLASH.







Figure 5.4: Path loss in a flip-chip package at different frequencies.

Figure 5.4 presents the results of the frequency scaling analysis assuming a silicon thickness of 0.1 mm and an AlN thickness of 0.5 mm. We observe that, in general, the path loss increases with frequency. At 60 GHz, the path loss ranges between 30 and 40 dB, approximately, and it rises up to 55 dB at 240 GHz. Since the size of the ports is not modified when changing frequency, and since the antenna and mismatch losses are removed from the channel response, this increase in path loss is not due to the antenna. Rather, we speculate that it is due to an increase of the losses in the materials. The trend does not appear to be monotonic, however, as the results at 120 GHz are significantly lower than the rest (less than 30 dB on average). This may be due to particular resonances or waveguiding behavior of the package structures at this frequency.

To confirm whether the trends associated with the silicon and AlN thicknesses still hold at higher frequencies, we repeated the analysis at 240 GHz. Figure 5.5 shows that the trends are indeed largely maintained: thin silicon is always preferred, and thick AlN may help in improving path loss. When compared to the results at 60 GHz (Figure 5.3), we observe a general increase of around 10 dB. We also see that increasing the AlN thickness is more effective at high frequencies, as it better combats the effect of lossy silicon.



Figure 5.5: Path loss in a flip-chip package at 240 GHz for different substrate and heat spreader thicknesses.

### 5.3 Time Domain Analysis

We next assess how the delay spread scales with distance in a flip-chip package of lateral size 8 mm for different substrate and heat spreader thicknesses. In subsequent sections, we analyze the impact of other package dimensions and materials. Recall that we excite the antennas with an extremely short Gaussian pulse whose spectrum spans all frequencies between 10 GHz and 1 THz.
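For reference, the sketch below computes an RMS delay spread from a sampled power delay profile using the standard definition (square root of the power-weighted second central moment of the delay). The profile values are hypothetical, and the estimator applied to the simulation outputs may differ in detail, for instance in how the received pulse is thresholded.

```python
import math

def rms_delay_spread(delays_ns, powers):
    """Square root of the power-weighted second central moment of the delay profile."""
    total = sum(powers)
    mean_delay = sum(t * p for t, p in zip(delays_ns, powers)) / total
    second_moment = sum(p * (t - mean_delay) ** 2 for t, p in zip(delays_ns, powers)) / total
    return math.sqrt(second_moment)

# Hypothetical power delay profile (linear power units), NOT a simulation result.
delays_ns = [0.00, 0.05, 0.10, 0.20]
powers = [1.0, 0.5, 0.2, 0.05]
tau_rms = rms_delay_spread(delays_ns, powers)
print(f"tau_rms = {tau_rms:.3f} ns")
```

A profile whose late components are strongly attenuated yields a small value, whereas strong delayed reflections lengthen the tail and increase it.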

Figure 5.6 shows the results of the time domain analysis. The first observation is that the delay spread is generally lower than 0.1 ns in such a small chip, leading to a coherence bandwidth larger than 10 GHz. The delay spread is generally larger when thick layers are employed, because the main components of the signal become weaker and more delayed reflections appear, creating a longer tail. The best design point is with thick silicon and thin AlN, leading to a delay spread below 0.05 ns (20 GHz). The cause is most likely that the thick layer of lossy silicon *kills* all long multipath components. The second best design point is that of thin silicon and AlN, which is better suited due to the lower path loss shown previously. In this case, the worst-case delay spread drops to below 0.07 ns for a coherence bandwidth of ${>}14$ GHz.

Figure 5.6: Delay spread in a flip-chip package for different substrate and heat spreader thicknesses.
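The delay spread and coherence bandwidth pairs quoted in this section (0.1 ns with 10 GHz, 0.05 ns with 20 GHz, 0.07 ns with about 14 GHz) are consistent with the reciprocal relation $B_c \approx 1/\tau_{rms}$. The sketch below assumes that relation; other proportionality constants are also common in the literature, and the function name is ours.

```python
def coherence_bandwidth_ghz(tau_rms_ns):
    """Coherence bandwidth taken as the reciprocal of the delay spread (ns -> GHz)."""
    return 1.0 / tau_rms_ns

for tau in (0.1, 0.07, 0.05):
    print(f"{tau:.2f} ns -> {coherence_bandwidth_ghz(tau):.1f} GHz")
```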

## 5.4 Channel Engineering

The results shown in previous sections suggest that manufacturers can exploit the monolithic nature of the system to engineer the channel towards reducing the path loss, the delay spread, or both. Here, we provide a deeper insight on this aspect. First, we show the impact of parameters such as the lateral dimensions or the filling material in Section 5.4.1. Second, in Section 5.4.2 we propose a package optimization methodology that explores the package design space to find design points that strike a balance between path loss and delay spread.

### 5.4.1 Sensitivity Analysis

Here, we evaluate the wireless channel within a flip-chip package for the set of variations listed in Table 5.2, both in the frequency and time domains.

**Die size.** Figure 5.7 shows how scaling the die size impacts the path loss and delay spread while keeping the rest of the parameters fixed. Our first observation is that larger chips allow longer wireless links that, in principle, will lead to larger losses and dispersion. The results seem to confirm this, as the scatter points tend towards the top-right part of the plot. The range of path loss values does not change much from small to large chips. At 20 mm, the path loss stays around 25–50 dB for the evaluated silicon and AlN thicknesses. In terms of delay spread, larger chips lead to higher values in general. Worst-case values increase from 0.09 to 0.12 ns, leading to coherence bandwidths lower than 8.5 GHz. It is also worth noting that, in general, larger chips seem to improve the performance of particular mid-range links (see the points in the range of 6–10 mm in Figure 5.7). The longer distance between antennas (and the consequent reduction of possible near-field coupling effects), together with the lower impact of parameters such as the dimensions of the package margins, may be the cause of this improvement. However, this result may not extend to all combinations of Si and AlN thicknesses.



Figure 5.7: Path loss and delay spread in a flip-chip package for different die sizes.

**Package dimensions and filler materials.** Figures 5.8 and 5.9 show the impact of the characteristics of the lateral interface between the die and the package. We simulate different lengths and switch the material to epoxy, which is typically used for mechanical support.

From Fig. 5.8, we observe that the lateral margin has a relatively small impact on the average scaling of the path loss over distance. The trend is similar at both 60 GHz and 240 GHz. The effect is more noticeable for long links, whose main component comes from this lateral space. By reducing this margin, the wave travelling through the sides of the package arrives stronger and faster, thereby reducing the path loss. With respect to the delay spread, the effect is more noticeable. Except for a few outlier links, the delay spread for the package with a margin of 1 mm at each side is well below 0.05 ns (coherence bandwidth over 20 GHz).

Figure 5.8: Path loss at 60 GHz and 240 GHz (top plot) and delay spread (bottom plot) for a flip-chip package with different package margin dimensions.

From Fig. 5.9, we make two interesting observations. First, the use of epoxy instead of vacuum seems to improve path loss by a decent amount (up to 10 dB) at 60 GHz. The effect is less noticeable at 240 GHz, but nonetheless shows specific points of great improvement (at a distance of 8.5 mm, the path loss is improved by 12 dB). While not shown for the sake of brevity, we have checked that the impact of using epoxy instead of vacuum is always present, although the exact amount varies with the silicon and AlN thicknesses. We hypothesize that the reason for this behavior is the larger refractive index of the epoxy material, which is closer to that of Si/AlN. This increases the transmission of waves into the package margins, where they propagate with lower loss before entering back into the chip, and thus improves the coupling between antennas that are close to the edges of the chip. Second, as a side effect, the delay spread seems to increase by a moderate amount (an extra 0.01 ns on average). The reason is similar: package-travelling waves generally have larger delays due to their longer paths to the destination. If these rays are stronger, then the energy distribution is spread over a longer time. Still, worst-case values are maintained below 0.1 ns (10 GHz).

### 5.4.2 Package Optimization

Results shown in prior sections highlight that some design decisions can shift the package to a more or less resonant cavity, affecting path loss and delay spread in contradicting ways at times. Based on this, we postulate that there is a way to reach design points that can balance both path loss and delay spread and that optimization methods can be employed to find this sweet spot. For the sake of self-containment, we next reproduce the formulation and some results of the work presented in [75] with a different set of parameters than the evaluations above. In the interest of brevity, this is not repeated for other packages, leaving it for future work.





#### 5.4.2.1 Formulation

Our methodology, summarized in Figure 5.10, takes path loss and delay spread as two metrics to be optimized. Since both aspects are dependent on multiple inputs, the channel engineering can be either formally treated as a Multi-Objective Optimization (MOO) problem and solved with evolutionary algorithms or others [90], or reduced to a single-objective problem using weights. In particular, our methodology defines a single custom figure of merit  $\phi_w$  that we will attempt to maximize. Since the aim is to mitigate the path loss and the delay spread, the figure of merit takes the form

$$\phi_w = \frac{1}{PL^w DS^{(1-w)}} \tag{5.1}$$

where PL is the path loss metric, DS is the delay spread metric, and  $w \in [0, 1]$  models the importance of power or speed in different designs. In other words, w is fixed by the architect: small values will be used in high performance devices where speed needs to be optimized over power, whereas large values imply minimization of the path loss oriented to low-power embedded systems.
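As a quick check of how the weight steers Eq. (5.1), the limiting and midpoint cases follow directly by substituting $w$, with no further assumptions:

$$\begin{aligned}
w = 0 &: \quad \phi_0 = \frac{1}{DS}, \\
w = 0.5 &: \quad \phi_{0.5} = \frac{1}{\sqrt{PL \cdot DS}}, \\
w = 1 &: \quad \phi_1 = \frac{1}{PL}.
\end{aligned}$$

Hence $w = 0$ rewards only a short delay spread, $w = 1$ rewards only a low path loss, and intermediate values penalize the weighted geometric mean of both metrics.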

In this work, the metrics used are $PL = L_{avg}$ and $DS = \tau_{rms}$. Moreover, we normalize both metrics so that they have the same dynamic range between 0 and 1. For the purpose of illustration, here we consider three variables that can be modified at design time: the silicon thickness $T_s$, the heat spreader thickness $T_h$, and the carrier frequency $f_c$<sup>1</sup>. Then, the objective is to maximize the figure of merit

$$\max_{T_s,T_h,f_c}\phi_w , \qquad (5.2)$$

that is, to find the $T_s$, $T_h$, and $f_c$ values that maximize the figure of merit for a given $w$ and within the bounds given by the manufacturer or the architect. Note that the optimization can be extended to other decisions such as those evaluated in Section 5.4.1. In our design exploration, we consider $T_s \in [0.1, 0.7]$ mm and $T_h \in [0, 0.8]$ mm.
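A minimal sketch of how Eq. (5.1) and Eq. (5.2) could be evaluated over a handful of candidate designs is given below. The candidate values are hypothetical, not simulation results, and the normalization (division by the maximum, which keeps values in (0, 1] and avoids a division by zero) is our assumption, since the exact normalization scheme is not detailed here.

```python
def normalize(values):
    """Scale positive values into (0, 1] by dividing by the maximum (assumed scheme)."""
    peak = max(values)
    return [v / peak for v in values]

def figure_of_merit(pl_norm, ds_norm, w):
    """Eq. (5.1): phi_w = 1 / (PL^w * DS^(1 - w))."""
    return 1.0 / (pl_norm ** w * ds_norm ** (1.0 - w))

def best_design(designs, w):
    """Return the candidate with the largest phi_w, i.e. Eq. (5.2) over a finite set."""
    pl = normalize([d["pl_db"] for d in designs])
    ds = normalize([d["ds_ns"] for d in designs])
    scores = [figure_of_merit(p, s, w) for p, s in zip(pl, ds)]
    return designs[scores.index(max(scores))]

# Hypothetical candidates (T_s, T_h in mm; f_c in GHz), NOT simulation results.
designs = [
    {"t_s": 0.1, "t_h": 0.8, "f_c": 60, "pl_db": 25.0, "ds_ns": 0.50},
    {"t_s": 0.3, "t_h": 0.4, "f_c": 90, "pl_db": 38.0, "ds_ns": 0.15},
    {"t_s": 0.7, "t_h": 0.0, "f_c": 110, "pl_db": 45.0, "ds_ns": 0.08},
]
for w in (0.0, 0.5, 1.0):
    print(w, best_design(designs, w))
```

In a real exploration each candidate requires a full-wave simulation, so the finite list above would be replaced by a guided search over the design space.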

<sup>&</sup>lt;sup>1</sup>To exemplify the impact of carrier frequency in the delay spread, we modified the time-domain methodology slightly. In this section only, the excitation has a cut-off frequency around the carrier frequency, thereby adding up the effect of limited bandwidth at the antenna.

To solve the optimization problem, we note that exhaustive search is impractical due to the computational demand of full-wave simulations and the relatively large size of the design space. Also, path loss and dispersion are related to $\{T_s, T_h, f_c\}$ in non-monotonic ways and often show opposed trends. This creates local peaks in the $\phi_w$ function, which rules out methods such as the gradient-based *hill climbing*, which tends to get stuck in local maxima. One feasible alternative would be Simulated Annealing (SA), which uses a probabilistic method to avoid local peaks and progressively approach a global optimum. Although SA can be modified to solve MOOs [90], we treat our problem as a single-objective optimization and use conventional SA variants.
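The acceptance rule that lets SA escape local peaks can be illustrated with a generic sketch. This is not the specific SA variant used in this work: the cooling schedule, step size, and the toy objective `toy_phi` (a stand-in for $\phi_w$, which in practice would require a full-wave simulation per evaluation) are all assumptions chosen for illustration.

```python
import math
import random

def simulated_annealing(score, start, neighbor, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Maximize score(x); worse moves are accepted with probability exp(delta / T)."""
    rng = random.Random(seed)
    current, current_score = start, score(start)
    best, best_score = current, current_score
    temperature = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current, current_score
        temperature *= cooling  # geometric cooling: fewer uphill escapes over time
    return best, best_score

def toy_phi(x):
    """Toy multi-peak objective over (T_s, T_h); NOT derived from simulations."""
    t_s, t_h = x
    return math.sin(12 * t_s) + math.cos(9 * t_h) - (t_s - 0.3) ** 2

def neighbor(x, rng):
    """Small random step, clipped to T_s in [0.1, 0.7] mm and T_h in [0, 0.8] mm."""
    t_s = min(0.7, max(0.1, x[0] + rng.uniform(-0.05, 0.05)))
    t_h = min(0.8, max(0.0, x[1] + rng.uniform(-0.05, 0.05)))
    return (t_s, t_h)

best, best_score = simulated_annealing(toy_phi, (0.4, 0.4), neighbor)
print(best, round(best_score, 3))
```

Early on, the high temperature makes the search accept many downhill moves and roam across peaks; as the temperature decays, it behaves increasingly like hill climbing around the best region found.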



Figure 5.11: Exploration of the design space for channel engineering with respect to (a–c) the silicon thickness  $T_s$ , (d–f) heat spreader thickness  $T_h$ , and (g–i) central frequency  $f_c$ . Panels show the path loss over distance, delay spread over distance, and maximum delay spread/average path loss as functions of the swept parameter. Unless noted,  $T_s = 0.2 \text{ mm}$ ,  $T_h = 0.7 \text{ mm}$ , and  $f_c = 60 \text{ GHz}$ .

#### 5.4.2.2 Results

Here, we show the potential of channel engineering through a partial exploration of the $\{T_h, T_s, f_c\}$ design space. The package model is the same as in the rest of the explorations, with the exception of using a die size of 22 mm and a package size of 33 mm.

Figure 5.11 shows an overview of the design exploration, illustrating how our PL and DS metrics (average path loss and maximum delay spread, respectively) scale with respect to the three considered parameters. The results confirm that thinner silicon reduces losses and delay spread, and that their relation with the silicon thickness and frequency is not necessarily monotonic.

Taking the aforementioned data, we can then plot the figure of merit $\phi_w$ as a function of each exploration parameter while leaving the others fixed. The results, summarized in Figure 5.12, confirm the tendencies outlined above and suggest that the choice of $w$ also plays an important role in the optimization. Since path loss and delay spread often show opposed trends, the shape of $\phi_w$ changes in unexpected ways and causes wild variations in the optimal design points. Take, for instance, the frequency scaling trend. The optimal point is clearly at 110 GHz for $w = 0$, but that peak dilutes progressively and disappears around $w = 0.6$. At that point, the optimal frequency becomes 60 GHz or 80 GHz due to the better path loss behavior.

In order to estimate the maximum gains that we can achieve through channel engineering, we further explored the design space in the quest for points close to a hypothetical global optimum. We chose three representative values of $w$ ($w = 0$ for high performance, $w = 1$ for low losses, and $w = 0.5$ for balanced design points) and compared the results with those of a standard chip ($T_s = 0.7 \text{ mm}$, $T_h = 0.2 \text{ mm}$, $f_c = 60$ GHz). Table 5.3 summarizes the main results. There, $L_{max}$ and $L_{avg}$ refer to the maximum and average path loss across all measured transmitter-receiver pairs within the 4×4 homogeneous grid of antennas.

Figure 5.12: Figure of merit $\phi_w$ as function of $\{T_s, T_h, f_c\}$ for different priority weights. Unless noted, $T_s = 0.2 \text{ mm}$, $T_h = 0.7 \text{ mm}$, and $f_c = 60 \text{ GHz}$.

|         | $\tau_{rms}$ (ns) | $B_c$ (GHz) | $L_{max}$ (dB) | $L_{avg}$ (dB) | n    |
|---------|-------------------|-------------|----------------|----------------|------|
| w = 0   | 0.07              | 14.02       | 58.62          | 42.76          | 3.28 |
| w = 0.5 | 0.15              | 6.76        | 45.49          | 36.48          | 1.74 |
| w = 1   | 0.59              | 1.69        | 28.55          | 21.88          | 1.32 |
| Std.    | 0.52              | 1.92        | 75.62          | 54.57          | 4.61 |

Table 5.3: Summary of the optimized package designs.
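To read Table 5.3 in terms of gains over the standard chip, the following sketch computes, for each optimized design, the ratio of delay spreads and the reduction in maximum and average path loss with respect to the "Std." row. The numbers are copied from the table; the helper name `gains_over_standard` is ours.

```python
# Rows of Table 5.3: (tau_rms in ns, L_max in dB, L_avg in dB).
table_5_3 = {
    "w = 0":   (0.07, 58.62, 42.76),
    "w = 0.5": (0.15, 45.49, 36.48),
    "w = 1":   (0.59, 28.55, 21.88),
    "Std.":    (0.52, 75.62, 54.57),
}

def gains_over_standard(table, baseline="Std."):
    """Delay-spread ratio and path-loss reductions (dB) relative to the baseline row."""
    tau0, lmax0, lavg0 = table[baseline]
    return {
        name: (tau0 / tau, lmax0 - lmax, lavg0 - lavg)
        for name, (tau, lmax, lavg) in table.items()
        if name != baseline
    }

gains = gains_over_standard(table_5_3)
for name, (tau_ratio, d_lmax, d_lavg) in gains.items():
    print(f"{name}: tau ratio {tau_ratio:.2f}, L_max -{d_lmax:.2f} dB, L_avg -{d_lavg:.2f} dB")
```

A delay-spread ratio above 1 means the optimized design is less dispersive than the standard chip, and a ratio below 1 means that delay spread was traded for lower path loss.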

# 6. Analysis of Interposer-based Package

This chapter is devoted to the evaluation of an interposer package as main enabler of multi-chip architectures. The chapter is organized like the previous one. First, we detail the geometry and materials of an interposer package in Section 6.1. Then, we analyze the results in the frequency and time domains in Sections 6.2 and 6.3, respectively. Finally, in Section 6.4, we perform a sensitivity analysis to find the package design decisions that have the highest impact on the channel characteristics.

### 6.1 Environment Description

Figure 6.1 shows a schematic representation of an interposer-based package. The integration process is similar to that of flip-chip, but with a few extra steps. In particular, the interposer is a thin layer of silicon that interfaces with the PCB/carrier through an array of solder bumps at a granularity similar to that of a flip-chip package. On top, however, the contacts are patterned at a finer granularity. The top side of the interposer interfaces with the chiplets, which are integrated using a flip-chip technique. Therefore, the chiplets have the same structure as the one summarized in Chapter 5. As for heat dissipation, we consider that each chiplet receives its own heat spreader and that all of them are then covered by a common heat sink.

Table 6.1 describes the layers from top to bottom, whereas Table 6.2 lists the different variants that we evaluate in Section 6.4. On top, the heat sink and heat spreader dissipate the heat out of the silicon chip. Bulk silicon (10 $\Omega \cdot cm$) serves as the foundation of the transistors in each chiplet. The interconnect layers reside within the silicon dioxide (SiO<sub>2</sub>) insulator. Then, below the fine array of micro-bumps, we find the silicon interposer. Interposers can be (i) active, which include active devices and are implemented in bulk silicon, or (ii) passive, which can be implemented in high-resistivity silicon [91]. Due to its lower cost and more widespread adoption nowadays, a passive interposer is assumed by default, although we evaluate the impact of having an active interposer in Section 6.4. Below the interposer, we model an interposer-wide bump array and, below it, a PCB whose body material is irrelevant because we model it as a perfect electrical conductor due to the existence of a dense metallic redistribution layer within it.

Figure 6.1: Schematic of the layers of an interposer package.

|                      | Thickness  | Material         | $\varepsilon_r$ | $\tan(\delta)$     | $\rho$   |
|----------------------|------------|------------------|-----------------|--------------------|----------|
| Heat sink            | 0.1–0.5 mm | Aluminum         | PEC             | PEC                | PEC      |
| Heat spreader        | 0.1–0.5 mm | Aluminum Nitride | 8.6             | 3·10<sup>-4</sup>  | -        |
| Silicon die          | 0.5 mm     | Bulk Silicon     | 11.9            | -                  | 10 Ω·cm  |
| Insulator            | 10 µm      | SiO<sub>2</sub>  | 3.9             | 0.025              | -        |
| Microbumps           | 40 µm      | Cu and Sn        | PEC             | PEC                | PEC      |
| Interposer           | 0.1 mm     | High-Res Silicon | 11.9            | -                  | 0.1 Ω·cm |
| Bumps                | 0.1 mm     | Lead             | PEC             | PEC                | PEC      |
| Redistribution layer | 3 µm       | Copper           | PEC             | PEC                | PEC      |
| PCB                  | 0.5 mm     | Epoxy resin      | 4               | -                  | -        |

Table 6.1: Characteristics of the layers in an interposer-based package.

Table 6.2: Package parameters for interposer.

| Parameter                 | Default Value | Variations    | Units |
|---------------------------|---------------|---------------|-------|
| Interposer size           | 20            | -             | mm    |
| Interposer resistivity    | 0.1           | 1, 10         | Ω·cm  |
| Number of chiplets        | 4             | 16            | -     |
| Chiplet silicon thickness | 0.1           | 0.5           | mm    |
| Heat spreader thickness   | 0.5           | 0.1           | mm    |
| Chiplet separation        | 2             | 1, 4          | mm    |
| Filling material          | Vacuum        | Epoxy         | N/A   |
| Frequency                 | 60            | 120, 180, 240 | GHz   |

Laterally, the cross-section of the interposer package resembles that of flip-chip, with the exception that empty space now appears not only between the chiplets and the package limits, but also between chiplets and between the interposer and the package limits. Several works report different interposer sizes, such as $25 \times 25 \text{ mm}^2$ in [92], $24 \times 36 \text{ mm}^2$ in [47], or $40 \times 40 \text{ mm}^2$ in [46]. To have chiplet sizes commensurate with those evaluated in the previous chapter, we assume an interposer of $20 \times 20 \text{ mm}^2$. We finally note that we still account for an array of $4 \times 4$ antennas, which are distributed among the chiplets: $2 \times 2$ antennas per chiplet in the case of four chiplets, or one antenna per chiplet in the case of sixteen chiplets. Since chiplets are essentially integrated using a flip-chip approach, antennas are modeled similarly here: as electrically small waveguide ports within the silicon dioxide, oriented upwards along the Z axis.



Figure 6.2: (a) Path loss in an interposer package at 60 GHz for a silicon thickness of 0.1 mm, AlN thickness of 0.5 mm, four chiplets, and a chiplet separation of 2 mm. (b-d) As a reference, approximate radiation patterns of the different ports.

## 6.2 Frequency Domain Analysis

### 6.2.1 Analysis at mmWave Frequencies

As in the previous chapter, we begin the assessment by quantifying the path loss in the interposer package at 60 GHz for the default dimensions and materials given in Table 6.2. We simulate the different combinations of silicon and heat spreader thickness.

Figure 6.2 plots the path loss of the interposer as a function of the distance between nodes, as well as reference radiation patterns of the different antennas. Important observations are as follows. First, the radiation patterns are very similar for all ports, since each chiplet hosts four antennas that have very similar surroundings (this does not happen in flip-chip or wirebond packages, where central/edge ports are surrounded by silicon primarily). This makes the path loss trend more predictable. Second, the path loss is similar in value to that of flip-chip packages, but with a weaker dependence on distance. This may be due to the presence of vacuum/epoxy pathways that separate the chiplets, as well as to the high-resistivity interposer, which may be providing an extra low-loss pathway for propagation.

Figure 6.3 plots the path loss for all the silicon and AlN thickness combinations. The results show that, as in flip-chip packages, thin silicon is preferable as it minimizes the losses of waves propagating through it. Thinning down the silicon from 0.5 mm to 0.1 mm reduces the path loss by up to 20 dB. Moreover, having a thick AlN layer seems to help reduce the path loss a bit further, but with marginal differences (of a few dB for long links). For the best design point out of the four evaluated ones, the average path loss is around 40 dB with a worst-case value of 50 dB.

#### 6.2.2 Scale-up to THz Frequencies

Again, to gain insight into propagation at frequencies closer to the objectives set by the WiPLASH project, we increase the frequency up to 240 GHz. The results, summarized in Figure 6.4, suggest that lower frequencies are preferable. Increasing the frequency to sub-THz bands in search of higher potential bandwidth has a cost of around 10 dB when tripling the frequency from 60 to 180 GHz. The reason may be the increase in losses of the different materials found along the path. We also note that the results at 120 GHz follow a trend similar to that of the other frequencies, unlike the case of flip-chip, where 120 GHz seemed to be a sweet spot.



Figure 6.4: Path loss in an interposer package at different frequencies.

## 6.3 Time Domain Analysis

We next evaluate the dispersion within an interposer package of 20 mm for different substrate and heat spreader thicknesses. To this end, we use a picosecond-long impulse signal covering the whole spectrum from 0.01 to 1 THz. In the next sections, we extend the analysis to other package dimensions and materials.

Figure 6.5 shows the results of the time domain analysis. The first observation is that the delay spread does not exceed 0.25 ns in any of the evaluated scenarios, even though the transmission distance scales up to almost 20 mm. We see that the AlN thickness has an impact on the scaling trend with distance: thick AlN leads to relatively higher delay spreads at short distances and better values at longer distances. On the contrary, thin AlN layers lead to better results at short distances, but worse at long distances. The reason may be that the extra propagation length of having to go through the AlN layer, reflect on the heat sink, and propagate back to the receiving antenna is proportionally larger at short co-planar distances. At longer distances, this extra thickness of the AlN layer actually aids propagation through waveguiding. Since we calculate the coherence bandwidth based on the worst-case delay spread, it seems that thick AlN layers are preferable in this scenario.



Figure 6.5: Delay spread in an interposer package for different substrate and heat spreader thicknesses.

## 6.4 Sensitivity Analysis for Channel Engineering

Next, we evaluate the different variations of interposer package listed in Table 6.2 in both domains. For the sake of brevity, we do not repeat the optimization formulation and exploration performed in Section 5.4.2, although it could be applicable in the case of interposer-based packages as well.

**Number of chiplets.** We start by evaluating the impact of dividing the interposer area into more chiplets. To this end, we break down the $20 \times 20 \text{ mm}^2$ space into four and sixteen chiplets, always leaving a separation of 2 mm between chiplets and between the edge chiplets and the package limits. As a result, chiplets are 7-mm and 2.5-mm wide and long for the 4 and 16 chiplets cases, respectively.

Figure 6.6: Path loss and delay spread in a $20 \times 20$ mm<sup>2</sup> silicon interposer divided into 4 or 16 chiplets. The distance between chiplets is 2 mm.
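The chiplet sizes quoted above follow from simple arithmetic: with $k$ chiplets per side there are $k + 1$ gaps of equal width along each dimension. The sketch below encodes this; the function name is ours.

```python
def chiplet_side_mm(interposer_mm, chiplets_per_side, separation_mm):
    """Chiplet side length when the same gap separates chiplets and package limits."""
    gaps = chiplets_per_side + 1
    return (interposer_mm - gaps * separation_mm) / chiplets_per_side

print(chiplet_side_mm(20, 2, 2))  # 4 chiplets (2 per side)
print(chiplet_side_mm(20, 4, 2))  # 16 chiplets (4 per side)
```

The same expression applies to the inter-chiplet separation study later in this section, where the four chiplets are resized for separations of 1, 2, and 4 mm.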

Figure 6.6 shows that more chiplets lead to an improvement of the path loss of up to 10 dB. Theoretically, having more chiplets leads to a more frequent change of propagation medium and more reflections in the inter-chiplet filler. Therefore, the improvement seems counter-intuitive. A possible reason for this behavior is that the waves leave the chiplet sooner and, instead of propagating through lossy silicon, propagate through the lossless filler and/or couple onto the interposer to reach the rest of the chiplets more efficiently. We also note that such small chiplets may become a resonant structure and lead to distorted or more directive radiation patterns at certain frequencies.

With regard to the delay spread, the right plot of Figure 6.6 illustrates that more chiplets lead to a rather constant increase of the delay spread of around 0.02 ns on average. The worst-case delay spread increases from 0.2 to 0.25 ns, reducing the coherence bandwidth from 5 to 4 GHz. One possible reason is the more frequent change of propagation medium, which may be generating more reflections at the interface between the chiplets and the package. These reflections may accumulate at the tail of the received signal.

**Inter-chiplet separation.** Due to interconnectivity, floorplanning, and thermal reasons, chiplets may need to be integrated with a certain separation. Figure 6.7 plots the path loss and delay spread for different chiplet separations, namely, 1, 2, and 4 mm. We note that the interposer size is left constant and, therefore, the chiplets are down-scaled accordingly, i.e. 8.5, 7, and 4 mm. On the one hand, the results from Figure 6.7 seem to imply that larger separations lead to better path loss, especially at longer distances. The improvement can be larger than 20 dB. The reasons are compatible with the discussion made above for varying number of chiplets: smaller chiplets allow waves to escape the package sooner, leaving most part of the propagation a matter within the filling material and/or the interposer itself, which are much better conductors than the lossy silicon at the chiplets. On the other hand, the delay spread analysis seems to imply that the improvement in path loss comes at the cost of a degradation of the delay spread. However, the effect is clearly focused on mid-range links. In fact,



Figure 6.7: Path loss and delay spread for different inter-chiplet spacings.



Figure 6.8: Path loss and delay spread in an interposer package for low-resistivity (active) and high-resistivity (passive) silicon interposers.

the existence of larger spacings reduces the length of the link between the cores that are farthest apart. However, even overlooking that fact, the worst-case delay spread is very similar in the three evaluated cases, at around 0.22 ns (4.5 GHz).

**Interposer resistivity.** Following up on the discussions above, it appears that the high-resistivity silicon may be supporting the propagation of waves within the package well. To confirm or refute this hypothesis, we repeated a set of simulations using low-resistivity bulk silicon to emulate the effect of employing a more expensive active interposer. The use of low-resistivity silicon is motivated by the fact that active interposers can host transistors and other devices whose performance degrades with high-resistivity silicon.

Figure 6.8 shows the result of such an experiment. For a silicon thickness of 0.1 mm and an AlN thickness of 0.5 mm, the impact of using bulk silicon instead of a high-resistivity material on path loss is marginal. A potential reason may be that the bump arrays act as a sort of *barrier* hindering the coupling of waves to the interposer. In fact, to couple into the interposer, radiated signals need to be reflected by the heat sink or the limits of the package and propagate down again to the interposer areas not covered by the



Figure 6.9: Path loss and delay spread in an interposer package for different filling materials.

chiplets. This long path reduces the chances of exploiting the interposer and, thus, leads to such a marginal change of path loss.

The case of the delay spread is interestingly different. In this case, the use of bulk silicon cuts the delay spread in half. One possible reason may be that the lossy interposer is attenuating multipath rays that would otherwise lead to higher dispersion.

**Filling material.** Figure 6.9 shows the impact of the characteristics of the filling material. To this end, we switch the material from vacuum to epoxy, which is typically used for mechanical support. The results suggest that, similarly to flip-chip packages, using epoxy instead of vacuum could improve path loss. However, in the interposer case, the introduction of epoxy resin as package filling material also seems to help reduce the delay spread, which did not happen in simple flip-chips. A likely reason for this behavior is that the change of refractive index between the chiplet and the package is less abrupt, reducing intra-chiplet reflections and improving the chiplet–package transition. We speculate that, since chiplets are smaller and the filling material is also present in the space between chiplets (and not only in the package margins), the impact is more profound and positive.

# 7. Analysis of Wire-Bonding Package

In contrast to the previous chapters characterizing high-end flip-chip and interposer packages, here we evaluate the more classical wirebond package. We first detail the geometry and materials of the package in Section 7.1. Then, we analyze the results in the frequency domain in Section 7.2 and in the time domain in Section 7.3. Finally, we discuss other aspects relative to package engineering and optimization in Section 7.4.

### 7.1 Environment Description

Figure 7.1 shows a three-dimensional schematic of a wirebond package. The key aspect of this option is that it is a surface-mount technology that does not require any holes or vias to connect the external die to the system. The die is mounted in the upright position, with the insulator facing up, and placed on top of an underfill material that fixes the chip mechanically to a metallic frame. The role of this frame is to mechanically interface the chip with the PCB. The electrical I/O connections, on the other hand, are made by means of bond wires stemming directly from the top metallization layers of the die and reaching the contacts in the PCB or ceramic carrier. Finally, the chip and the bond wires are covered by a mold compound and, on top, a ceramic enclosure.

Table 7.1 depicts the layers from top to bottom, whereas Table 7.2 lists the different variants that we evaluate in Section 7.4. On top, the ceramic enclosure and mold compound cover the entire system. The heat sink and heat spreader dissipate the heat out of the silicon chip. Bulk silicon ( $10 \ \Omega \cdot cm$ ) serves as the foundation of the transistors in



Figure 7.1: Schematic of the layers of a wirebond package, together with top-view and cross-section diagrams.

| Layer                | Thickness  | Material         | $\varepsilon_r$ | $\tan(\delta)$    | $\rho$  |
|----------------------|------------|------------------|-----------------|-------------------|---------|
| Enclosure            | 50 μm      | Alumina          | 9.9             | $10^{-4}$         | —       |
| Mold compound        | 0.45–1 mm  | Epoxy resin      | 4               | —                 | —       |
| Insulator            | 10 μm      | $SiO_2$          | 3.9             | 0.025             | —       |
| Silicon die          | 0.5 mm     | Bulk silicon     | 11.9            | —                 | 10 Ω⋅cm |
| Underfill            | 0.1–0.5 mm | Aluminum nitride | 8.6             | $3\cdot10^{-4}$   | —       |
| Frame                | 0.1 mm     | Copper           | PEC             | PEC               | PEC     |
| Leads                | 0.1 mm     | Copper           | PEC             | PEC               | PEC     |
| Redistribution layer | 3 μm       | Copper           | PEC             | PEC               | PEC     |
| PCB                  | 0.5 mm     | Epoxy resin      | 4               | —                 | —       |

Table 7.1: Characteristics of the layers in a wirebond package.

Table 7.2: Package parameters for wirebond.

| Parameter               | Default Value | Variations    | Units |
|-------------------------|---------------|---------------|-------|
| Die size                | 8             | 12, 16, 20    | mm    |
| Bond wires              | 32            | 64, 128       | —     |
| Molding compound margin | 0.1           | 0.05, 0.5     | mm    |
| Silicon thickness       | 0.1           | 0.5           | mm    |
| Heat spreader thickness | 0.5           | 0.1           | mm    |
| Enclosure material      | Alumina       | PEC           | N/A   |
| Frequency               | 60            | 120, 180, 240 | GHz   |

each chiplet. The interconnect layers reside within the silicon dioxide (SiO<sub>2</sub>) insulator. Then, below the fine array of micro-bumps, we find the silicon interposer. Interposers can be (i) active, which include active devices and are implemented in bulk silicon, and (ii) passive, which can be implemented in high-resistivity silicon [91]. Due to lower cost and more widespread adoption nowadays, passive interposer is assumed by default, although we evaluate the impact of having an active interposer in Section 6.4. Below the interposer, we model an interposer-wide bump array, and below it, a PCB whose body material is irrelevant because we model it as perfect electrical conductor due to the existence of a dense metallic redistribution layer within it.

Dies connected through bond wires are generally relatively small because only the periphery of the chip can be used to implement I/O connectors. We assume a die size of 8 mm. The package extends laterally beyond the die, first through the frame, which has a size of 10.4 mm. There is additional space between the frame and the limit of the package, which is necessary to host the PCB-side leads of the bond wires. The complete package has a size of 13.52 mm. The number of bond wires used by default is 32 (8 per side) and their pitch is calculated based on the specifications of the widespread QFN64 package provided by the partner UNIBO. As shown in Table 7.2, a higher density of bond wires enabled by a smaller pitch is possible and will be evaluated in Section 7.4.

In this scenario, since the die is mounted upright, the antennas implemented within the insulator do not find large metal components nearby (at least in the vertical dimension). Therefore, unlike in flip-chip and interposer cases where the bump array creates a sort of virtual ground plane for the antennas, here antennas may potentially radiate upwards or downwards. To capture the effect of all these potential propagation paths, we model the antennas as infinitesimally short discrete ports seeking a low directivity at least in the elevation angles. Laterally, the presence of potentially resonating bond wires can produce losses and significant distortion.

The presence of a layer of higher dielectric constant below the insulator (i.e. the lossy silicon) theoretically leads to higher radiation downwards. However, the losses in the silicon layer may jeopardize this option. Even so, the metallic frame and/or the PCB may reflect waves back up. The fraction of electromagnetic waves not reaching the receiver in these conditions may escape the package because the cover is generally of a ceramic material and, therefore, there is no metallic encasement in the lateral and vertical directions. The change of refractive index between the molding compound found on top of the chip and the ceramic cover, plus the change at the interface between the cover and the outside of the package, may generate some reflections that could travel back to the antennas. In the next sections, we evaluate how these considerations affect the path loss and delay spread.
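To give a rough feel for the strength of the reflections caused by these refractive index changes, the sketch below evaluates the normal-incidence Fresnel power reflectance for the two interfaces mentioned above, using the permittivities of Table 7.1. This is an idealization and not part of the simulation setup: it assumes lossless, non-magnetic media, normal incidence, air outside the package, and a single isolated interface. Since the 50 μm enclosure is much thinner than the wavelength at 60 GHz, the two interfaces actually interact, so the numbers are only indicative. The function and constant names are ours.

```python
import math


def normal_incidence_reflectance(eps_r1: float, eps_r2: float) -> float:
    """Power reflectance at a planar interface between two lossless,
    non-magnetic dielectrics for a plane wave at normal incidence."""
    n1, n2 = math.sqrt(eps_r1), math.sqrt(eps_r2)
    gamma = (n1 - n2) / (n1 + n2)  # field reflection coefficient
    return gamma ** 2


EPS_MOLD = 4.0      # epoxy resin mold compound (Table 7.1)
EPS_ALUMINA = 9.9   # ceramic enclosure (Table 7.1)
EPS_OUTSIDE = 1.0   # assumption: air outside the package

r_mold_cover = normal_incidence_reflectance(EPS_MOLD, EPS_ALUMINA)
r_cover_out = normal_incidence_reflectance(EPS_ALUMINA, EPS_OUTSIDE)
print(f"mold compound -> cover: {100 * r_mold_cover:.1f}% of power reflected")
print(f"cover -> outside:       {100 * r_cover_out:.1f}% of power reflected")
```

Under these assumptions, only a minor fraction of the incident power is reflected at each interface, which is compatible with the expectation that a ceramic cover lets a sizable share of the radiation leak out of the package.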

## 7.2 Frequency Domain Analysis

#### 7.2.1 Analysis at mmWave Frequencies

As in the previous chapters, we begin the assessment by quantifying the path loss in the wirebond package at 60 GHz for the default dimensions and materials given in Table 7.2. We evaluate the effect of modifying the silicon and heat spreader thicknesses.

Figure 7.2 plots the path loss of a representative design point of wirebond packaging, together with the polar plots of the antennas' radiation pattern in the plane of the chip ( $\theta = \pi/2$ ). We observe that the path loss stays rather constant at different distances, with large values around 50 dB. Several causes are possible: the presence of resonating bond wires that turn part of the radiation into heat, the use of ceramic enclosures that allow radiation to leak outside the package, and the presence of lossy silicon. We also observe that a few short links show an extremely high attenuation of around 80–90 dB, indicated with a red circle in Figure 7.2(a). Our hypothesis is that either (i) these are an artifact of the simulation and should be ignored, or (ii) the package structure and the presence of resonating bond wires create directions of minimum radiation, which lead to low lateral coupling at short distances. For more distant links, energy may still arrive via reflections from the bonding wires or the edge of the package.

Due to the presence of these high-attenuation short links, regression fitting yields lines with negative slope, which counter-intuitively suggest that the path loss improves with distance. To avoid confusion, we do not show the regression fitting lines in further path loss figures.
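The effect of the outliers on the fit can be illustrated with a small sketch. It assumes the usual log-distance model, $PL(d) = PL_0 + 10\,n\,\log_{10}(d/d_0)$, fitted by ordinary least squares. The data points are hypothetical values chosen for illustration only (a roughly flat 50 dB trend plus two heavily attenuated short links) and are not simulation results; the function name is ours.

```python
import math


def fit_path_loss(distances_mm, path_loss_db, d0_mm=1.0):
    """Least-squares fit of PL(d) = PL0 + 10*n*log10(d/d0).

    Returns (PL0 in dB, path loss exponent n).
    """
    x = [10.0 * math.log10(d / d0_mm) for d in distances_mm]
    y = list(path_loss_db)
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    n = sxy / sxx
    pl0 = mean_y - n * mean_x
    return pl0, n


# Hypothetical link distances (mm) and path losses (dB), for illustration only.
d = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]
pl_flat = [50.0, 51.0, 50.0, 52.0, 51.0, 52.0]
pl_with_outliers = [85.0, 90.0] + pl_flat[2:]  # two strongly attenuated short links

_, n_flat = fit_path_loss(d, pl_flat)
_, n_outliers = fit_path_loss(d, pl_with_outliers)
print(f"exponent without outliers: {n_flat:+.2f}")
print(f"exponent with outliers:    {n_outliers:+.2f}")
```

With the two short-range outliers included, the fitted exponent turns negative even though the remaining links are unchanged, which is the kind of misleading regression line described above.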

Figure 7.3 plots the path loss for all the silicon and AlN thickness combinations. The main observation to make here is that, as usual, thick silicon harms the wireless channel by introducing significant losses. This is clearly observable at long distances. Another observation is that having a thick piece of AlN does not necessarily help reduce losses. The reason is that in the wirebond package the die is mounted upright, leaving the thermal pad at the bottom. Should there be any thermal filler below the die, waves would need to go through the lossy silicon die to reach it, and travel back again through it to reach the receiving antenna. A large fraction of the waves traveling through this path, however, can escape through the lateral openings of the package.



Figure 7.2: (a) Path loss in a wirebond package at 60 GHz for a silicon thickness of 0.1 mm, AlN thickness of 0.5 mm, and 32 bond wires. The red circle points out nearby ports with reduced coupling. (b-d) As a reference, approximated XY-plane radiation patterns of the different ports.



Figure 7.3: Path loss in a wirebond package at 60 GHz for different substrate and heat spreader thicknesses.

#### 7.2.2 Scale-up to THz Frequencies

In this section, we raise the frequency of operation up to 240 GHz to reproduce a scenario relevant to the objectives of the WiPLASH project. Figure 7.4 shows the results of the analysis, which suggest that an increase in the antenna frequency may have a negative impact on the path loss. We see that the path loss reaches an average of up to 80 dB at 240 GHz. Moreover, the upscaling in frequency



Figure 7.4: Path loss in a wirebond package at different frequencies.

does not avoid the presence of extremely attenuated links at short distances, which maintain a similar range of values around 100–120 dB. This also suggests that the behavior at short distances is not an issue related to near-field coupling.
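As a reference point for the frequency scaling, the free-space (Friis) path loss at a fixed distance and with fixed antenna gains grows by $20\log_{10}(f_2/f_1)$ dB when the carrier moves from $f_1$ to $f_2$. The sketch below evaluates this for the frequencies of Table 7.2. The in-package channel is not free space, so this is only a baseline against which the simulated increase can be compared, not a model of it; the function name is ours.

```python
import math


def free_space_extra_loss_db(f_low_ghz: float, f_high_ghz: float) -> float:
    """Extra free-space path loss (dB) when raising the carrier frequency
    from f_low to f_high at fixed distance and fixed antenna gains."""
    return 20.0 * math.log10(f_high_ghz / f_low_ghz)


for f_ghz in (120, 180, 240):
    extra = free_space_extra_loss_db(60, f_ghz)
    print(f"60 -> {f_ghz} GHz: +{extra:.1f} dB")
```

Moving from 60 to 240 GHz accounts for about 12 dB in free space, so any additional degradation would have to come from package-specific effects such as material losses or interactions with the bond wires.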

### 7.3 Time Domain Analysis

We next evaluate the dispersion within a wirebond package of 8 mm for different substrate and heat spreader thicknesses. To make sure that the dispersion limits are given by the channel and not the excitation port, we use a picosecond-long impulse signal covering the whole spectrum from 0.01 to 1 THz.

Figure 7.5 shows the results of the time domain analysis. We observe that the wirebond package has a reasonable delay spread, with worst-case values below 0.15 ns. This means that the coherence bandwidth is around 7 GHz. We also see how thin silicon alternatives are again preferable by a wide margin, as they reduce the worst-case delay spread by around 30%. For thin silicon, the impact of AlN is higher at short distances. At long distances, its impact becomes marginal.
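For readers who want to reproduce this kind of metric from their own impulse responses, the sketch below computes the RMS delay spread of a discrete power delay profile. It assumes the common second-central-moment definition, which may differ in detail from the definition used in this deliverable; the two-tap profile is a hypothetical example and not simulation data.

```python
import math


def rms_delay_spread(delays_ns, powers):
    """RMS delay spread (ns) of a discrete power delay profile.

    delays_ns: arrival time of each multipath component (ns)
    powers:    linear (not dB) power of each component
    """
    total = sum(powers)
    if total <= 0:
        raise ValueError("total power must be positive")
    mean_delay = sum(t * p for t, p in zip(delays_ns, powers)) / total
    second_moment = sum(t * t * p for t, p in zip(delays_ns, powers)) / total
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(second_moment - mean_delay ** 2, 0.0))


# Hypothetical profile: two equally strong rays arriving 0.2 ns apart.
tau_rms = rms_delay_spread([0.0, 0.2], [1.0, 1.0])
print(f"RMS delay spread: {tau_rms:.3f} ns")
```

Two equal-power rays separated by 0.2 ns give an RMS delay spread of half their separation; weaker or earlier echoes pull the value down, which is why attenuating late multipath components reduces dispersion.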



Figure 7.5: Delay spread in a wirebond package for different substrate and heat spreader thicknesses.

## 7.4 Sensitivity Analysis for Channel Engineering

Next, we evaluate the different wirebond package variations summarized in Table 7.2 in both domains. Even though the package optimization performed in Section 5.4.2 would be applicable to wirebond packages, we do not repeat the whole process for the sake of brevity.

**Die size.** Figure 7.6 shows how scaling the die size affects the path loss and delay spread. In terms of path loss, we draw the counter-intuitive conclusion that increasing the die size reduces the attenuation suffered by the electromagnetic waves by 5–10 dB on average between 8-mm and 20-mm dies. One possible reason is that the package *breathes* as the distance between the antennas and the bonding wires increases, reducing the detrimental coupling between them and affecting the antennas' radiation pattern less.

In terms of delay spread, the size of the chip does not change the main linearly increasing trend. When scaling to 12 or 16 mm (this figure is missing the 20-mm point due to time constraints), the same average trend is observed, but simply extrapolated to longer distances. For a die size of 16 mm, which yields a maximum transmission distance of approximately 17 mm through the diagonal, the maximum delay spread is around 0.2 ns (5 GHz).

**I/O pitch.** Figure 7.7 shows the path loss and delay spread as a function of the number of bond wires. From the first plot, we can see how the presence of more bond wires seems to have a mildly detrimental effect on path loss. We speculated that increasing the bond wire density, that is, reducing the I/O pitch, would reduce the leakage and scattering of EM waves in-between the bond wires and that, instead, the bond wire array would become a virtual reflection plane. This would theoretically help improve the strength of the main propagation path. However, we observe the opposite trend: the values for path loss may increase by more than 10 dB when increasing the number of bond wires in the periphery of the chip from 32 to 128. Therefore, it seems that either this stronger reflection causes destructive interference at the antennas, or simply that the presence of more bond wires increases the amount of wave energy that resonates in the wires and is lost in the form of heat or re-radiation towards other directions.



Figure 7.6: Path loss and delay spread for different die sizes in a wirebond package.



Figure 7.7: Path loss and delay spread in a wirebond package for different amounts of bond wires (i.e. different I/O pitches).



Figure 7.8: Path loss and delay spread in a wirebond package for different enclosure materials.

From the second plot, we see that the presence of a dense array of bond wires seems to have a positive effect on the delay spread. Worst-case spreads as low as 0.1 ns (10 GHz) are shown in this plot for 128 wires in contrast to the value of 0.13 ns (7.7 GHz) for 32 wires. One possible reason compatible with the explanations above is that having more bond wires reduces the amount of reflections that *come back* to within the chip, making the direct ray more important and diminishing the multipath components.

**Top of the package.** Finally, Figures 7.8 and 7.9 show how the decisions relative to the vertical dimensions of the package and the material of the lid affect the channel.

On the one hand, we first see in Figure 7.8 the impact of using a metallic cover. This decision is inspired by security aspects derived from the need to not leak sensitive information through the emanation of electromagnetic waves. The use of a metallic cover prevents that possibility and, as we see in the figure, can improve the path loss by variable amounts, from a few dB up to more than 10 dB. This is because of the internal reflections that the metallic cover produces, which help to conserve more energy within the system. In the case of delay spread, since the package starts becoming an attenuated reverberation chamber, the delay spread is expected to increase. As



Figure 7.9: Path loss and delay spread in a wirebond package for different molding compound thicknesses.

shown in the figure, the delay spread with the metallic cover is around  $2 \times$  larger than the delay spread obtained with the ceramic cover.

On the other hand, Figure 7.9 illustrates that the dimensions of the molding compound that fills the cavity containing the bond wires do not have a noticeable impact on path loss. Note that the value shown in the legend corresponds to the thickness of the molding compound on top of the bond wires. One would expect that a larger package (larger molding compound) would lead to higher losses due to the extra distance of propagating to the package limits and back. However, we observe a rather marginal impact, which suggests that the main propagation mechanism is either surface waves at the interface between the insulator and other materials, or space waves within the silicon/heat spreading material. In terms of delay spread, the effect of the molding compound thickness is more noticeable: it leads to an increase of around 40 ps.

# 8. Discussion and Concluding Remarks

Wireless Network-on-Chip, or WNoC, has been proposed as a potential solution to the scalability problems of current multicore processors. However, the realization of this potential requires overcoming a broad set of challenges and research questions. This deliverable addresses one of them, namely: *is the wireless channel within computing packages amenable to the transmission speeds and energy efficiencies assumed in visionary works on WNoC?*

To answer this question, we have first reviewed the fundamentals of on-chip electromagnetics across the spectrum, from the mmWave to the THz band, and surveyed the state of the art in wave propagation and channel modeling for WNoC. Table 3.1 summarizes over 25 papers on the topic. We have confirmed that, with few exceptions mostly coming from our own prior work, the existing literature focuses on the 60–100 GHz band and does not model the computing package realistically, resorting to variants of an open die configuration.

Aware of this gap, we have performed an extensive simulation campaign to characterize the wireless channel within realistic computing packages. Since the vision of the WiPLASH project is based on the application of graphene-based antennas in the THz band (0.1–1 THz), our study pushes the carrier frequency up to 240 GHz in the frequency domain and considers the whole spectrum from 0.01 to 1 THz in the time domain analysis. Further, our evaluations include not only the common wirebond and flip-chip packages, but also multi-chip interposer-based environments, again consistent with the WiPLASH vision.

#### Summary of Results

The flip-chip package evaluations from Chapter 5 show that a path loss below 40 dB can be achieved in an  $8 \times 8 \text{ mm}^2$  die without considering the effect of the antennas. Despite being highly influenced by the losses within the silicon die, path loss exponents below 2 can be obtained. Thus, extending to larger dies is possible without increasing the path loss beyond 50 dB. Unfortunately, scaling the operation frequency up to 240 GHz seems to add an extra 10 dB of attenuation. In terms of dispersion, we have confirmed that flip-chip is able to yield coherence bandwidths on the order of 10–20 GHz even for large dies. We have finally observed that using epoxy as filling material can help improve path loss at the cost of increasing dispersion moderately.

The interposer package, modeled in Chapter 6, is similar to taking a large flip-chip, placing it on top of a high-resistivity silicon die, and breaking it down into a number of chiplets. It was therefore not surprising to find that the trends are similar to those of flip-chip, yet with a few more design decisions to make. We have obtained a path loss below 50 dB even for distances around 20 mm, although it worsens as we increase the operation frequency. In terms of dispersion, an insufficient coherence bandwidth of 5 GHz has been obtained. In the sensitivity analysis, we have observed that having a higher number of small chiplets is preferable to having a few larger ones due to the presence of more inter-chiplet *corridors* where waves can propagate with low losses. For the same reason, larger inter-chiplet separations are preferred from the perspective of path loss (reducing it below 40 dB), with a small impact on the delay spread. We found that using epoxy instead of vacuum as filling material is a promising approach to further reduce the path loss and the delay spread, whereas the choice of silicon at the interposer seems to have an impact only in terms of dispersion.

The simulations of the wirebond package, modeled in Chapter 7, are not conclusive. They show a set of unusually large path loss values at very short distances, which may be just an artifact of the simulations. Even ignoring those points, the path loss results are discouraging, as even at short distances the path loss is above 40 dB, possibly because the package is not fully enclosed and the bond wires may be resonating and leading to extra losses. Fortunately, it seems that the results improve for larger dies or when a metallic cover is introduced. On the other hand, the delay spread results are similar to those of the flip-chip or interposer packages.

In summary, our evaluations show that (i) flip-chip and interposer packages are preferable over wirebond, (ii) path losses of 30–40 dB and delay spreads below 0.1 ns can be achieved without cumbersome optimization processes, and (iii) thinning down the silicon die is the most impactful design decision, which can be combined with other optimizations such as using epoxy resin instead of vacuum as filling material. However, we also need to be aware that increasing the distance and the frequency of operation will increase both the path loss and delay spread. Fortunately, we have also observed some promising design points (e.g. the simulation at 120 GHz in Figure 5.4) with unexpectedly good performance. We speculate that those may be due to a particular combination of dimensions and frequencies causing the package to exhibit an opportunistic waveguiding behavior.

### **Promising Directions**

In light of the results above, we conclude that additional effort is required to enable ultra-fast and ultra-low power communications at the chip scale. In Section 5.4.2, we proposed a methodology to automatically optimize the chip package according to certain target channel metrics. Using this approach, we reduced the path loss and delay spread of two separate instances of a  $22 \times 22 \text{ mm}^2$  die within a large flip-chip package of  $33 \times 33 \text{ mm}^2$  down to 28.55/21.88 dB (maximum/average) and 0.07 ns, respectively, by just changing the substrate and heat spreader thicknesses.

Another promising direction, which we started exploring recently [93], is the use of programmable metasurfaces to shape the channel response. The idea is that, since the chip package acts as a chaotic cavity with significant reverberating behavior, one can use the programmable metasurfaces to engineer the channel in a way that either radiation is focused on a particular antenna or the delay spread is minimized.

Finally, to further improve the quality of the wireless links within package, we cannot discard the possibility of using directional antennas or other mechanisms to increase their gain. This is of key relevance in the WiPLASH project, as it proposes the design and integration of graphene-based antennas. In particular, the miniaturization of the antenna enabled by the plasmonic effects of graphene at THz frequencies opens the door to new approaches to antenna array or metasurface design.

# 9. Publications

Out of the work performed in Task T3.1, leading to this deliverable, the following papers have been published:

- M. Imani, S. Abadal, P. Del Hougne, "Toward Dynamically Adapting Wireless Intra-Chip Channels to Traffic Needs with a Programmable Metasurface," in Proceedings of the ACM NanoCoCoA '20, Yokohama, Japan, November 2020.
- X. Timoneda, S. Abadal, A. Franques, D. Manessis, J. Zhou, J. Torrellas, E. Alarcón, and A. Cabellos-Aparicio, "Engineer the Channel and Adapt to it: Enabling Wireless Intra-Chip Communication," IEEE Transactions on Communications, vol. 68, no. 5, pp. 3247-3258, February 2020.
- S. Abadal, C. Han, and J. M. Jornet, "Wave Propagation and Channel Modeling in Chip-Scale Wireless Communications: A Survey from Millimeter-Wave to Terahertz and Optics," IEEE Access, vol. 8, pp. 278-293, December 2019.

## Bibliography

- [1] O. Markish, B. Sheinman, O. Katz, D. Corcos, and D. Elad, "On-chip mmWave Antennas and Transceivers," in *Proceedings of the NoCS '15*, p. Art. 11, 2015.
- [2] X. Timoneda, S. Abadal, A. Cabellos-Aparicio, D. Manessis, J. Zhou, A. Franques, J. Torrellas, and E. Alarcón, "Millimeter-Wave Propagation within a Computer Chip Package," in *Proceedings of the ISCAS '18*, 2018.
- [3] Y. Chen, X. Cai, and C. Han, "Wave Propagation Modeling for mmWave and Terahertz Wireless Networks-on-Chip Communications," in *Proc. of IEEE ICC*, 2019.
- [4] R. Marculescu, U. Ogras, L.-S. Peh, N. Enright Jerger, and Y. Hoskote, "Outstanding research problems in NoC design: system, microarchitecture, and circuit perspectives," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 28, no. 1, pp. 3–21, 2009.
- [5] S. Vangal, J. Howard, G. Ruhl, S. Dighe, H. Wilson, J. Tschanz, D. Finan, A. Singh, T. Jacob, S. Jain, V. Erraguntla, C. Roberts, Y. Hoskote, N. Borkar, and S. Borkar, "An 80-Tile Sub-100-W TeraFLOPS Processor in 65-nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 1, pp. 29–41, 2008.
- [6] G. Nychis, C. Fallin, T. Moscibroda, O. Mutlu, and S. Seshan, "On-chip networks from a networking perspective: congestion and scalability in many-core interconnects," in *Proceedings of the SIGCOMM*, pp. 407–18, 2012.
- [7] S. Park, T. Krishna, C.-H. Chen, B. Daya, A. Chandrakasan, and L.-S. Peh, "Approaching the theoretical limits of a mesh NoC with a 16-node chip prototype in 45nm SOI," in *Proceedings of the DAC-49*, pp. 398–405, 2012.
- [8] G. Chen, M. A. Anders, H. Kaul, S. K. Satpathy, S. K. Mathew, S. K. Hsu, A. Agarwal, R. K. Krishnamurthy, V. De, and S. Borkar, "A 340 mV-to-0.9 V 20.2 Tb/s source-synchronous hybrid packet/circuit-switched 16x16 network-on-chip in 22 nm tri-gate CMOS," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 1, pp. 59–67, 2015.
- [9] D. Wentzlaff, P. Griffin, H. Hoffmann, L. Bao, B. Edwards, C. Ramey, M. Mattina, C.-C. Miao, J. F. Brown III, and A. Agarwal, "On-chip interconnection architecture of the tile processor," *IEEE Micro*, vol. 27, no. 5, pp. 15–31, 2007.
- [10] G. Chrysos, "Intel® Xeon Phi™ coprocessor - the architecture," *Intel Whitepaper*, vol. 176, p. 43, 2014.
- [11] D. Bertozzi, G. Dimitrakopoulos, J. Flich, and S. Sonntag, "The fast evolving landscape of on-chip communication," *Design Automation for Embedded Systems*, vol. 19, no. 1, pp. 59–76, 2015.
- [12] S. Abadal, R. Guirado, H. Taghvaee, A. Jain, E. P. de Santana, P. Haring Bolívar, M. Saeed, R. Negra, Z. Wang, K.-T. Wang, *et al.*, "Graphene-based wireless agile interconnects for massive heterogeneous multi-chip processors," *arXiv preprint arXiv:2011.04107*, 2020.
- [13] J. Kim, K. Choi, and G. Loh, "Exploiting new interconnect technologies in on-chip communication," *IEEE Journal on Emerging and Selected Topics in Circuits and Systems*, vol. 2, no. 2, pp. 124–136, 2012.
- [14] D. Matolak, A. Kodi, S. Kaya, D. DiTomaso, S. Laha, and W. Rayess, "Wireless networks-on-chips: architecture, wireless channel, and devices," *IEEE Wireless Communications*, vol. 19, no. 5, 2012.

- [15] S. Abadal, B. Sheinman, O. Katz, O. Markish, D. Elad, Y. Fournier, D. Roca, M. Hanzich, G. Houzeaux, M. Nemirovsky, E. Alarcón, and A. Cabellos-Aparicio, "Broadcast-Enabled Massive Multicore Architectures: A Wireless RF Approach," *IEEE MICRO*, vol. 35, no. 5, pp. 52–61, 2015.
- [16] R. G. Kim, W. Choi, Z. Chen, P. P. Pande, D. Marculescu, and R. Marculescu, "Wireless NoC and Dynamic VFI Codesign: Energy Efficiency Without Performance Penalty," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 24, no. 7, pp. 2488–2501, 2016.
- [17] M. A. I. Sikder, A. Kodi, W. Rayess, D. Ditomaso, D. Matolak, and S. Kaya, "Exploring wireless technology for off-chip memory access," in *Proceedings of the HOTI '16*, pp. 92–99, 2016.
- [18] L. Yan and G. W. Hanson, "Wave propagation mechanisms for intra-chip communications," *IEEE Transactions on Antennas and Propagation*, vol. 57, no. 9, pp. 2715–2724, 2009.
- [19] Y. P. Zhang, Z. M. Chen, and M. Sun, "Propagation Mechanisms of Radio Waves Over Intra-Chip Channels With Integrated Antennas: Frequency-Domain Measurements and Time-Domain Analysis," *IEEE Transactions on Antennas and Propagation*, vol. 55, no. 10, pp. 2900–2906, 2007.
- [20] W.-H. Chen, S. Joo, S. Sayilir, R. Willmot, T.-Y. Choi, D. Kim, J. Lu, D. Peroulis, and B. Jung, "A 6-Gb/s Wireless Inter-Chip Data Link Using 43-GHz Transceivers and Bond-Wire Antennas," *IEEE Journal of Solid-State Circuits*, vol. 44, pp. 2711–2721, Oct. 2009.
- [21] H. H. Yeh, N. Hiramatsu, and K. L. Melde, "The design of broadband 60 GHz AMC antenna in multi-chip RF data transmission," *IEEE Transactions on Antennas and Propagation*, vol. 61, no. 4, pp. 1623–1630, 2013.
- [22] R. S. Narde, J. Venkataraman, A. Ganguly, and I. Puchades, "Intra- and Inter-Chip Transmission of Millimeter-Wave Interconnects in NoC-based Multi-Chip Systems," *IEEE Access*, vol. 7, pp. 112200–112215, 2019.
- [23] S. H. Gade, S. Garg, and S. Deb, "OFDM Based High Data Rate, Fading Resilient Transceiver for Wireless Networks-on-Chip," in *Proceedings of the ISVLSI '17*, pp. 483–488, 2017.
- [24] W. Rayess, D. W. Matolak, S. Kaya, and A. K. Kodi, "Antennas and Channel Characteristics for Wireless Networks on Chips," *Wireless Personal Communications*, vol. 95, no. 4, pp. 5039–5056, 2017.
- [25] I. El Masri, T. Le Gouguec, P.-M. Martin, R. Allanic, and C. Quendo, "Electromagnetic Characterization of the Intra-chip Propagation Channel in Ka and V Bands," *IEEE Transactions on Components, Packaging and Manufacturing Technology*, vol. 9, no. 10, pp. 1931–1941, 2019.
- [26] D. DiTomaso, A. Kodi, D. Matolak, S. Kaya, S. Laha, and W. Rayess, "A-WiNoC: Adaptive Wireless Network-on-Chip Architecture for Chip Multiprocessors," *IEEE Transactions on Parallel and Distributed Systems*, vol. 26, no. 12, pp. 3289–3302, 2015.
- [27] W. Choi, K. Duraisamy, R. G. Kim, J. R. Doppa, P. P. Pande, D. Marculescu, and R. Marculescu, "On-Chip Communication Network for Efficient Training of Deep Convolutional Networks on Heterogeneous Manycore Systems," *IEEE Transactions on Computers*, vol. 67, no. 5, pp. 672–686, 2018.
- [28] S. Abadal, E. Alarcón, A. Cabellos-Aparicio, and J. Torrellas, "WiSync: An Architecture for Fast Synchronization through On-Chip Wireless Communication," in *Proceedings of the ASPLOS '16*, pp. 3–17, 2016.
- [29] V. Fernando, A. Franques, S. Abadal, S. Misailovic, and J. Torrellas, "Replica: A Wireless Manycore for Communication-Intensive and Approximate Data," in *Proceedings of the ASPLOS '19*, pp. 849–863, 2019.
- [30] X. Yu, J. Baylon, P. Wettin, D. Heo, P. Pratim Pande, and S. Mirabbasi, "Architecture and Design of Multi-Channel Millimeter-Wave Wireless Network-on-Chip," *IEEE Design & Test*, vol. 31, no. 6, pp. 19–28, 2014.
- [31] S. Subramaniam, T. Shinde, P. Deshmukh, S. Shamim, M. Indovina, and A. Ganguly, "A 0.36pJ/bit, 17Gbps OOK Receiver in 45-nm CMOS for Inter and Intra-Chip Wireless Interconnects," in *Proceedings of the SOCC '17*, 2017.

- [32] T. Shinde, S. Subramaniam, P. Deshmukh, M. M. Ahmed, M. Indovina, and A. Ganguly, "A 0.24 pJ/bit, 16 Gbps OOK Transmitter Circuit in 45-nm CMOS for Inter and Intra-Chip Wireless Interconnects," in *Proceedings of the GLSVLSI '18*, pp. 69–74, 2018.
- [33] J. M. Jornet and I. F. Akyildiz, "Graphene-based plasmonic nano-antenna for terahertz band communication in nanonetworks," *IEEE Journal on Selected Areas in Communications*, vol. 31, no. 12, pp. 685–694, 2013. U.S. Patent No. 9,643,841, May 9, 2017 (Priority Date: Dec. 6, 2013).
- [34] A. Singh, M. Andrello, N. Thawdar, and J. M. Jornet, "Design and operation of a graphene-based plasmonic nano-antenna array for communication in the terahertz band," *IEEE Journal on Selected Areas in Communications*, vol. 38, no. 9, pp. 2104–2117, 2020.
- [35] S. Abadal, S. E. Hosseininejad, M. Lemme, P. Haring Bolívar, J. Solé-Pareta, E. Alarcón, and A. Cabellos-Aparicio, "Graphene-based antenna design for communications in the terahertz band," *Nanoscale Networking and Communications Handbook*, p. 25, 2019.
- [36] S. Deb, A. Ganguly, P. P. Pande, B. Belzer, and D. Heo, "Wireless NoC as Interconnection Backbone for Multicore Chips: Promises and Challenges," *IEEE Journal on Emerging and Selected Topics in Circuits and Systems*, vol. 2, no. 2, pp. 228–239, 2012.
- [37] V. Catania, A. Mineo, S. Monteleone, M. Palesi, and D. Patti, "Improving Energy Efficiency in Wireless Network-on-Chip Architectures," *ACM Journal on Emerging Technologies in Computing Systems*, vol. 14, no. 1, Art. 9, 2018.
- [38] X. Yu, J. Baylon, P. Wettin, D. Heo, P. P. Pande, and S. Mirabbasi, "Architecture and design of multichannel millimeter-wave wireless NoC," *IEEE Design & Test*, vol. 31, no. 6, pp. 19–28, 2014.
- [39] D. Matolak, S. Kaya, and A. Kodi, "Channel modeling for wireless networks-on-chips," *IEEE Communications Magazine*, vol. 51, no. 6, pp. 180–186, 2013.
- [40] Y. Zhang and J. Mao, "An Overview of the Development of Antenna-in-Package Technology for Highly Integrated Wireless Devices," *Proceedings of the IEEE*, vol. 107, no. 11, pp. 2265–2280, 2019.
- [41] S. Pal, D. Petrisko, A. A. Bajwa, P. Gupta, S. S. Iyer, and R. Kumar, "A Case for Packageless Processors," in *Proceedings of the HPCA-24*, pp. 466–479, 2018.
- [42] H. Ardebili and M. Pecht, *Encapsulation technologies for electronic applications*. William Andrew, 2009.
- [43] S. L. Wright, R. Polastre, H. Gan, L. P. Buchwalter, R. Horton, P. S. Andry, E. Sprogis, C. Patel, C. Tsang, J. Knickerbocker, J. R. Lloyd, A. Sharma, and M. S. Sri-Jayantha, "Characterization of micro-bump C4 interconnects for Si-carrier SOP applications," in *Proceedings of the ECTC '06*, pp. 633–640, 2006.
- [44] A. W. Topol, D. C. La Tulipe, L. Shi, D. J. Frank, K. Bernstein, S. E. Steen, A. Kumar, G. U. Singco, A. M. Young, K. W. Guarini, and M. Ieong, "Three-dimensional integrated circuits," *IBM Journal of Research and Development*, vol. 50, no. 4, pp. 491–506, 2006.
- [45] P. Garrou, C. Bower, and P. Ramm, *Handbook of 3D Integration, Volume 1: Technology and Applications of 3D Integrated Circuits*. John Wiley & Sons, 2011.
- [46] X. Zhang, J. K. Lin, S. Wickramanayaka, S. Zhang, R. Weerasekera, R. Dutta, K. F. Chang, K. J. Chui, H. Y. Li, D. S. Wee Ho, L. Ding, G. Katti, S. Bhattacharya, and D. L. Kwong, "Heterogeneous 2.5D integration on through silicon interposer," *Applied Physics Reviews*, vol. 2, no. 2, 2015.
- [47] A. Kannan, N. Enright Jerger, and G. H. Loh, "Exploiting Interposer Technologies to Disintegrate and Reintegrate Multicore Processors," *IEEE Micro*, vol. 36, no. 3, pp. 84–93, 2016.
- [48] A. Arunkumar, E. Bolotin, B. Cho, U. Milic, E. Ebrahimi, O. Villa, A. Jaleel, C.-J. Wu, and D. Nellans, "MCM-GPU: Multi-Chip-Module GPUs for Continued Performance Scalability," in *Proceedings of the ISCA '17*, pp. 320–332, 2017.
- [49] J. Branch, X. Guo, L. Gao, A. Sugavanam, J. J. Lin, and K. K. O, "Wireless communication in a flip-chip package using integrated antennas on silicon substrates," *IEEE Electron Device Letters*, vol. 26, no. 2, pp. 115–117, 2005.

- [50] F. Gutierrez, S. Agarwal, K. Parrish, and T. S. Rappaport, "On-chip integrated antenna structures in CMOS for 60 GHz WPAN systems," *IEEE Journal on Selected Areas in Communications*, vol. 27, no. 8, pp. 1367–1378, 2009.
- [51] H. M. Cheema and A. Shamim, "The last barrier: On-chip antennas," *IEEE Microwave Magazine*, vol. 14, no. 1, pp. 79–91, 2013.
- [52] P. Baniya and K. L. Melde, "Switched-Beam Endfire Planar Array With Integrated 2-D Butler Matrix for 60 GHz Chip-to-Chip Space-Surface Wave Communications," *IEEE Antennas and Wireless Propagation Letters*, vol. 18, no. 2, pp. 236–240, 2019.
- [53] R. S. Narde, J. Venkataraman, A. Ganguly, and I. Puchades, "Antenna Arrays as Millimeter-Wave Wireless Interconnects in Multichip Systems," *IEEE Antennas and Wireless Propagation Letters*, vol. 19, no. 11, pp. 1973–1977, 2020.
- [54] X. Guo, J. Caserta, R. Li, B. Floyd, and K. K. O, "Propagation Layers for Intra-Chip Wireless Interconnection Compatible with Packaging and Heat Removal," in *Proceedings of the VLSIT '02*, pp. 36–37, 2002.
- [55] M. Sun and Y. Zhang, "Performance of intra-chip wireless interconnect using on-chip antennas and UWB radios," *IEEE Transactions on Antennas and Propagation*, vol. 57, no. 9, pp. 2756–2762, 2009.
- [56] R. S. Narde, N. Mansoor, A. Ganguly, and J. Venkataraman, "On-Chip Antennas for Inter-Chip Wireless Interconnections: Challenges and Opportunities," in *Proceedings of the EuCAP '18*, 2018.
- [57] P. Baniya, A. Bisognin, K. L. Melde, and C. Luxey, "Chip-to-Chip Switched Beam 60 GHz Circular Patch Planar Antenna Array and Pattern Considerations," *IEEE Transactions on Antennas and Propagation*, vol. 66, no. 4, pp. 1776–1787, 2018.
- [58] R. Hahnel, B. Klein, and D. Plettemeier, "Integrated stacked Vivaldi-shaped on-chip antenna for 180 GHz," in *2015 IEEE International Symposium on Antennas and Propagation & USNC/URSI National Radio Science Meeting*, pp. 1448–1449, IEEE, 2015.
- [59] G. Bellanca, G. Calò, A. E. Kaplan, P. Bassi, and V. Petruzzelli, "Integrated Vivaldi plasmonic antenna for wireless on-chip optical communications," *Optics Express*, vol. 25, no. 14, pp. 16214–16227, 2017.
- [60] H.-T. Wu, J.-J. Lin, and K. K. O, "Inter-Chip Wireless Communication," in *Proceedings of the EuCAP '13*, pp. 3647–3649, 2013.
- [61] P. V. Testa, C. Carta, and F. Ellinger, "Novel high-performance bondwire chip-to-chip interconnections for applications up to 220 GHz," *IEEE Microwave and Wireless Components Letters*, vol. 28, pp. 102–104, Feb. 2018.
- [62] V. Pano, I. Tekin, Y. Liu, K. R. Dandekar, and B. Taskin, "In-Package Wireless Communication with TSV-based Antenna," in *Proceedings of the ISCAS '19*, pp. 19–21, 2019.
- [63] X. Timoneda, S. Abadal, A. Cabellos-Aparicio, and E. Alarcón, "Modeling the EM Field Distribution within a Computer Chip Package," in *Proceedings of the WCNC '18*, 2018.
- [64] W. Wang, Y. Chen, S. Yang, Q. Cao, H. Li, X. Zheng, and Y. Wang, "Wireless inter/intra-chip communication using an innovative PCB channel bounded by a metamaterial absorber," *IEEE Antennas and Wireless Propagation Letters*, vol. 15, pp. 1634–1637, 2016.
- [65] J. Wu, A. Kodi, S. Kaya, A. Louri, and H. Xin, "Monopoles Loaded with 3-D-Printed Dielectrics for Future Wireless Intra-Chip Communications," *IEEE Transactions on Antennas and Propagation*, vol. 65, no. 12, pp. 6838–6846, 2017.
- [66] S. Hwangbo, R. Bowrothu, H.-I. Kim, and Y.-K. Yoon, "Integrated Compact Planar Inverted-F Antenna (PIFA) with a Shorting Via Wall for Millimeter-wave Wireless Chip-to-chip (C2C) Communications in 3D-SiP," in *Proceedings of the ECTC '19*, pp. 983–988, 2019.
- [67] Y. P. Zhang, M. Sun, and L. H. Guo, "On-chip antennas for 60-GHz radios in silicon technology," *IEEE Transactions on Electron Devices*, vol. 52, no. 7, pp. 1664–1668, 2005.

- [68] X. Timoneda, A. Cabellos-Aparicio, D. Manessis, E. Alarcón, and S. Abadal, "Channel Characterization for Chip-scale Wireless Communications within Computing Packages," in *Proceedings of the NOCS '18*, 2018.
- [69] Y. Chen and C. Han, "Channel modeling and analysis for wireless networks-on-chip communications in the millimeter wave and terahertz bands," in *Proceedings of the IEEE INFOCOM Workshop on WCNEE*, 2018.
- [70] Y. Chen and C. Han, "Channel Modeling and Characterization for Wireless Networks-on-Chip Communications in the Millimeter Wave and Terahertz Bands," *IEEE Transactions on Molecular, Biological, and Multi-Scale Communications*, 2019.
- [71] K. Kim, W. Bornstad, and K. K. O, "A Plane Wave Model Approach to Understanding Propagation in an Intra-chip Communication System," in *Proceedings of the APS '01*, pp. 166–169, 2001.
- [72] K. Kimoto, N. Sasaki, S. Kubota, W. Moriyama, and T. Kikkawa, "High-Gain On-Chip Antennas for LSI Intra-/Inter-Chip Wireless Interconnection," in *Proceedings of the EuCAP '09*, pp. 278–282, 2009.
- [73] A. Ganguly, K. Chang, S. Deb, P. P. Pande, B. Belzer, and C. Teuscher, "Scalable Hybrid Wireless Network-on-Chip Architectures for Multi-Core Systems," *IEEE Transactions on Computers*, vol. 60, no. 10, pp. 1485–1502, 2010.
- [74] Y. Liu, V. Pano, D. Patron, K. Dandekar, and B. Taskin, "Innovative propagation mechanism for inter-chip and intra-chip communication," in *Proceedings of the WAMICON '15*, 2015.
- [75] X. Timoneda, S. Abadal, A. Franques, D. Manessis, J. Zhou, J. Torrellas, E. Alarcón, and A. Cabellos-Aparicio, "Engineer the Channel and Adapt to it: Enabling Wireless Intra-Chip Communication," *IEEE Transactions on Communications*, vol. 68, no. 5, pp. 3247–3258, 2020.
- [76] R. S. Narde and J. Venkataraman, "Feasibility study of Transmission between Wireless Interconnects in Multichip Multicore systems," in *Proceedings of the APS/URSI '17*, pp. 1821–1822, 2017.
- [77] M. Opoku Agyeman, Q.-T. Vien, A. Ahmadnia, A. Yakovlev, K.-F. Tong, and T. Mak, "A Resilient 2-D Waveguide Communication Fabric for Hybrid Wired-Wireless NoC Design," *IEEE Transactions on Parallel and Distributed Systems*, vol. 28, no. 2, pp. 359–373, 2016.
- [78] S. H. Gade, S. S. Rout, and S. Deb, "On-Chip Wireless Channel Propagation: Impact of Antenna Directionality and Placement on Channel Performance," in *Proceedings of the NOCS '18*, 2018.
- [79] I. El Masri, P. M. Martin, H. K. Mondal, R. Allanic, T. Le Gouguec, C. Quendo, C. Roland, and J. P. Diguet, "Accurate Channel Models for Realistic Design Space Exploration of Future Wireless NoCs," in *Proceedings of the NOCS '18*, pp. 148–155, 2018.
- [80] Y. Al-Alem, A. A. Kishk, and R. M. Shubair, "One-to-Two Wireless Interchip Communication Link," *IEEE Antennas and Wireless Propagation Letters*, vol. 18, no. 11, pp. 2375–2378, 2019.
- [81] A. C. Tasolamprou, A. Pitilakis, S. Abadal, O. Tsilipakos, X. Timoneda, H. Taghvaee, M. S. Mirmoosa, F. Liu, C. Liaskos, A. Tsioliaridou, *et al.*, "Exploration of intercell wireless millimeter-wave communication in the landscape of intelligent metasurfaces," *IEEE Access*, vol. 7, pp. 122931–122948, 2019.
- [82] S.-B. Lee, S.-W. Tam, I. Pefkianakis, S. Lu, M. F. Chang, C. Guo, G. Reinman, C. Peng, M. Naik, L. Zhang, *et al.*, "A scalable micro wireless interconnect structure for CMPs," in *Proceedings of the ACM 15th Annual International Conference on Mobile Computing and Networking*, pp. 217–228, 2009.
- [83] C. Thraskias, E. Lallas, N. Neumann, L. Schares, B. Offrein, R. Henker, D. Plettemeier, F. Ellinger, J. Leuthold, and I. Tomkos, "Survey of Photonic and Plasmonic Interconnect Technologies for Intra-Datacenter and High-Performance Computing Communications," *IEEE Communications Surveys and Tutorials*, vol. 20, no. 4, pp. 2758–2783, 2018.
- [84] D. M. Solís, J. M. Taboada, F. Obelleiro, and L. Landesa, "Optimization of an optical wireless nanolink using directive nanoantennas," *Optics Express*, vol. 21, no. 2, pp. 2369–2377, 2013.
- [85] M. Saad-Bin-Alam, M. I. Khalil, A. Rahman, and A. M. Chowdhury, "Hybrid plasmonic waveguide fed broadband nanoantenna for nanophotonic applications," *IEEE Photonics Technology Letters*, vol. 27, pp. 1092–1095, May 2015.

- [86] Y. Yang, Q. Li, and M. Qiu, "Broadband nanophotonic wireless links and networks using on-chip integrated plasmonic antennas," *Scientific Reports*, vol. 6, p. 19490, 2016.
- [87] M. Nafari and J. M. Jornet, "Modeling and performance analysis of metallic plasmonic nanoantennas for wireless optical communication in nanonetworks," *IEEE Access*, vol. 5, pp. 6389–6398, 2017.
- [88] G. Calò, G. Bellanca, A. E. Kaplan, F. Fuschini, M. Barbiroli, M. Bozzetti, P. Bassi, and V. Petruzzelli, "On-chip wireless optical communication through plasmonic nanoantennas," in *12th European Conference on Antennas and Propagation (EuCAP 2018)*, pp. 1–5, Apr. 2018.
- [89] "CST Microwave Studio."
- [90] D. Nam and C. H. Park, "Multiobjective simulated annealing: A comparative study to evolutionary algorithms," *International Journal of Fuzzy Systems*, vol. 2, no. 2, pp. 87–97, 2000.
- [91] N. Kim, D. Wu, D. Kim, A. Rahman, and P. Wu, "Interposer design optimization for high frequency signal transmission in passive and active interposer using through silicon via (TSV)," in *2011 IEEE 61st Electronic Components and Technology Conference (ECTC)*, pp. 1160–1167, IEEE, 2011.
- [92] X. Zhang, T. Chai, J. H. Lau, C. Selvanayagam, K. Biswas, S. Liu, D. Pinjala, G. Tang, Y. Ong, S. Vempati, *et al.*, "Development of through silicon via (TSV) interposer technology for large die (21 × 21 mm) fine-pitch Cu/low-k FCBGA package," in *2009 59th Electronic Components and Technology Conference*, pp. 305–312, IEEE, 2009.
- [93] M. F. Imani, S. Abadal, and P. Del Hougne, "Toward dynamically adapting wireless intra-chip channels to traffic needs with a programmable metasurface," in *Proceedings of the 1st ACM International Workshop on Nanoscale Computing, Communication, and Applications*, pp. 20–25, 2020.