Architectures for synchronous optical TDM switching employing semiconductor laser amplifiers

D.K. Hunter
I. Andonovic

Indexing terms: Synchronous optical TDM switching, Semiconductor laser amplifier

Abstract: Designs for synchronous optical TDM switching fabrics are introduced that can be implemented using semiconductor laser amplifier switches or any on/off switch technology. Their performance is compared, and one architecture, which can be built up on large silicon substrates, is highlighted. The issues of switch count, passive coupler count, control complexity, attenuation and noise buildup are considered.

1 Introduction

Optical time division multiplexed (TDM) switching has several advantages which make it attractive for deployment in future telecommunications networks. Its transparency to bitrate (instantaneous bitrate within a block), coding format and wavelength, coupled with its virtually limitless bitrate capacity, ensure that optical switching systems would be essentially 'future-proof'. Furthermore, the adoption of a TDM approach implies compatibility with the existing network, which represents a very substantial investment in both hardware and software. The approach considered in this paper is synchronous TDM switching, which is used for the transport of high speed continuous bitrate services such as high quality videophone and high definition television. In addition, it could also be used in a high-speed transport network to carry ATM cells and other traffic. The switch fabrics described here use device technologies which are quite well established, so that these architectures could be implemented with relatively little further device development.

Much of the existing development work on optical TDM switching has used lithium niobate devices. However, these suffer from serious problems such as sensitivity to temperature, instability and drift of characteristics, high loss and polarisation sensitivity; although polarisation insensitive switches have been realised, they suffer from large drive voltages [1]. Semiconductor laser amplifiers (SLAs), although not so well developed, offer two major advantages over lithium niobate: first, their inherent gain can be used to overcome any attenuation due to coupling in the system, thus removing the need for external amplification and secondly, devices can be fabricated that are polarisation insensitive with little or no penalty with respect to drive voltage, removing the need for polarisation controllers.

Since TDM switching will play an important role in future telecommunications networks, a study of optical TDM architectures using SLA devices is timely and highly desirable. SLAs produce noise in the form of amplified spontaneous emission (ASE) which limits the number of fabrics that can be cascaded. Analysis of these fabrics must initially be in terms of noise buildup, although there are other problems such as saturation nonlinearities which have been studied elsewhere [2]. Many factors are relevant in evaluating and comparing the performance of different architectures:

- 'Number of switches' is loosely correlated to the cost of the fabric.
- 'System loss' is the loss due to the splitters and combiners only.
- 'Coupling loss' is the loss at substrate boundaries when coupling to other substrates or to fibres.
- 'Number of stages' is equal to the number of switches a block must pass through, and is related to the noise accumulation in the fabric.
- 'Crosstalk' refers to the stray signals added to a block due to the finite extinction ratio of each switch. If the crosstalk and the signal come from the same source, the crosstalk manifests itself as intensity noise. Coherent crosstalk effects can also limit performance [3]. The nature and severity of the crosstalk depends both on the architecture and the performance of each individual switch.
- 'Noise' is produced in the switch fabric as amplified spontaneous emission (ASE) in the switches; it degrades the error rate performance of the fabric.

This work was performed under a contract with BT Labs. In addition, the authors wish to thank Steve Cassidy, Mike Robertson and Julie Burton of BT Labs for providing useful information on devices and their fabrication.

© IEE, 1995
Paper 1921J (E13), first received 19th September 1994 and in revised form 2nd March 1995
The authors are with the Department of Electronic and Electrical Engineering, University of Strathclyde, 204 George Street, Glasgow G1 1XW, United Kingdom
- 'Bit error ratio' (BER) is a function of the crosstalk and noise levels.
- 'Control complexity' is related both to the switch architecture and the number of processors used to control it. The amount of processor time available to set up each new call will depend on the application. Blocking characteristics will also have implications for control complexity, but this will not be considered in this paper.
- 'Frame integrity' is held by a TDM switching system if all the blocks entering during one time-frame leave during one time-frame; without frame integrity, the blocks may be spread over several output time-frames, and blocks may be lost when the switch assignment is changed. Frame integrity is essential for practical telecommunications switching systems.
- 'Frame delay', in a frame integrity system, is the delay between an input frame entering the fabric and the corresponding output frame leaving the fabric.

Throughout the subsequent analysis, it is assumed that crosstalk does not impact significantly on system performance; this is only true for switches having an extinction ratio of approximately 50 dB [4]. Therefore the analysis must be regarded as only a preliminary study of fabric performance, since in practice, devices operating at hundreds of megahertz would probably be unable to exhibit such efficient extinction. A study of the effect of crosstalk on the fabric would be much more complex, and would have to take into account the effect of interferometric noise [5] which is produced as a result of phase noise in the signal. Nevertheless, the analysis presented here should provide a useful initial guide to system performance.

The analysis of InP space switch architectures is considered elsewhere [4, 6]; the novelty of this paper lies in its consideration of TDM switch fabric networks, the use of silicon motherboards, and the study of architectures of unrestricted size. Four main classes of fabric are considered, some classes containing several subclasses. Indium phosphide (InP) substrates carrying switch blocks such as $2 \times 2$ [7], $4 \times 4$ [8] and $1 \times 4$ [9] have been realised using on/off switches, and can form building blocks for the fabrication of large fabrics. Larger switch blocks may be possible in the future; the desirability of using these will be discussed later. The substrates are interconnected by silicon motherboards which are in turn interconnected by optical fibre delay lines. Fibre is capable of providing the long delays required (memory) for telecommunications TDM switching; silica-on-silicon integrated optical delays would only be used for path length equalisation on the motherboards. Although this technology is assumed throughout this paper, the architectures themselves are device-independent and can be used with any on/off switching technology plus delay lines.

This paper discusses the various space switch 'building blocks' which make up the TDM switching architectures. Also a number of architectures for lithium niobate switching [10] are adapted for use with SLAs. A very simple architecture is also introduced which uses many components, but has very low control complexity. The final architecture uses small timeslot interchangers as 'building blocks' out of which larger architectures can be built. It should be stressed that the architectures presented here, realised through SLA on/off switch blocks, have, for the most part, a corresponding lithium niobate directional coupler implementation. To facilitate their presentation, a new categorisation will be adopted with no reference to their lithium niobate counterparts. These architectures represent a new family of TDM switch fabrics, hitherto not reported in the literature and provide an alternative approach to realising switches for use in the optical domain.

2 Space switch constituents of TDM switching fabrics

TDM switches are made up from space switches and delay lines; in this Section, space switch architectures are discussed. The first type is a conventional $b \times b$ space switch termed $S(b)$, having $b$ inputs and $b$ outputs and capable of connecting each input to a different output simultaneously. The most obvious method of producing the TDM architecture would be to place each of these space switches onto a single InP substrate; however, a more complex method, with benefits regarding power budget, is presented in Section 2.1.

In Section 2.2, baseline and reverse baseline networks are discussed, which could be built out of smaller substrates containing $S(b)$ networks interconnected on a silicon motherboard. Likewise, in Section 2.3, Benes networks are described, which will be used as building blocks when constructing the final TDM switching fabrics.

2.1 Interconnection of space switch arrays

One route to realising TDM switching fabrics made from $S(b)$ networks involves using InP substrates each containing one $S(b)$ network [7–9]. However, a more efficient method of realising fabrics, where each substrate contains part of two different space switches, is advantageous in that the coupling loss in a signal path is reduced, and the size of substrate is reduced, for a given size of space switch. The latter implies that yield considerations will not be as significant a problem, as well as relaxing the alignment, since a large number of silica/InP couplings at the boundaries of the substrate are not necessary.

Consider a network made from InP substrates on silicon consisting of $s$ stages of $b \times b$ switches. Using the network of Fig. 1 as an example (here, $s = b = 2$), the obvious way of implementing this is as several InP substrates each containing a $b \times b$ switch (Fig. 2a); here, this is a $2 \times 2$ switch containing four SLA gates. It is also possible to combine pairs of Y-couplers to form X-couplers; Fig. 2b shows the modified network where back-to-back pairs of Y-couplers have been replaced by X-couplers. Since the insertion loss of both types of coupler is typically the same (i.e. 3 dB), this results in a reduction in total coupler loss. Possible structures for the X-coupler include directional couplers, merged directional couplers, and structures based on 90° mirrors, with appropriate attention to fabrication tolerances, and assuming the problem of polarisation dependence can be overcome. In the interests of tractability, losses due to waveguide bends and crossovers have not been considered.

![Fig. 1 Two-stage network of $2 \times 2$ switches](image-url)

Finally, the switches from two stages can be integrated onto one substrate (Fig. 2c), reducing the number of silica/InP couplings in a signal path by a factor of two. Ultimately, this would represent approximately 2 dB loss per coupling, so the loss in the system has been drastically reduced. When following a similar procedure for larger switches of size $b \times b$, the number of switches per InP substrate is reduced from $b^2$ to $2b$, although this does not make any difference for the present example (Figs. 2a–2c) because, for $b = 2$, both quantities are equal. For $b > 2$, this process reduces the number of switches per substrate while increasing the number of these substrates; the total number of switches remains constant.

The number of stages of InP substrates in the modified network is $s/2$ if $s$ is even or $(s + 1)/2$ if $s$ is odd. If $s$ is even, only one type of substrate is required; this is shown in Figs. 3a–c, for $b = 2$, 4 or 8, respectively. These substrates consist of two stages of $b$ SLAs interconnected by a $b \times b$ coupler [11]. There is only one interface between each SLA and the silica waveguide, since an intervening section of InP waveguide is not required, minimising the loss due to coupling. If $s$ is odd, several stages of substrates, each containing two stages of switches, followed by one extra stage of switches, are required. This extra switch stage would essentially be implemented as a stage of substrates each containing a column of switches.
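The substrate counts quoted above can be checked with a short sketch. The function names are illustrative only and do not come from the paper; the code simply encodes the two rules stated in the text (substrate stages of $s/2$ or $(s + 1)/2$, and $b^2$ versus $2b$ switches per substrate).

```python
def substrate_stages(s: int) -> int:
    """Stages of InP substrates when switch stages are integrated in pairs.

    Encodes the rule in the text: s/2 for even s, (s + 1)/2 for odd s.
    """
    if s < 1:
        raise ValueError("s must be a positive integer")
    return (s + 1) // 2


def switches_per_substrate(b: int, paired: bool) -> int:
    """Switches (SLA gates) on one InP substrate.

    paired=False: one whole b x b switch per substrate (b*b gates).
    paired=True:  two stages of b gates around a b x b coupler (2*b gates).
    """
    if b < 2:
        raise ValueError("b must be at least 2")
    return 2 * b if paired else b * b


if __name__ == "__main__":
    for s in (2, 3, 4, 5):
        print(f"s = {s}: {substrate_stages(s)} substrate stage(s)")
    for b in (2, 4, 8):
        print(f"b = {b}: {switches_per_substrate(b, False)} gates unpaired, "
              f"{switches_per_substrate(b, True)} gates paired")
```

For $b = 2$ both arrangements give four gates per substrate, which is why Figs. 2a–2c show no difference in this respect.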

Thus the modified approach detailed above for implementing the space switches results in a smaller InP substrate size and a lower coupling loss. The only difficulty lies in its dependence on X-couplers, which are difficult to fabricate, but any alterations to the scheme in order to circumvent this would only alter the details of the remainder of this paper.
2.2 Baseline and reverse baseline networks

Baseline and reverse baseline networks represent a particular class of so-called 'multistage interconnection networks' [12] and will be used to form TDM switching fabrics. They may be constructed from \( S(b) \) networks, where \( b \) is referred to as the base of the (reverse) baseline network. Baseline networks \( B(mb) \) can be constructed from smaller networks \( B(m) \) and space switches \( S(b) \) (Fig. 4).

![Diagram of baseline network](image)

**Fig. 4** Definition of a baseline network \( B(mb) \) in terms of smaller baseline networks and space switches

The small network \( B(b) \) is simply an \( S(b) \) network, a starting point before applying the concept illustrated in Fig. 4 repeatedly to build up a baseline network of arbitrary size. Such a small network will be referred to as a seed. Although the seed is a single space switch here, the seed for producing the TDM switching fabrics themselves can be much larger units (which are still small compared to the finished TDM fabric). It is possible to use a different value for \( b \) (the base) every time a large network is made from a smaller one, but it will be assumed in the interest of tractability that \( b \) is constant; \( m \) must therefore be an integral power of \( b \).

An example of a base 4 and base 2 \( 16 \times 16 \) baseline network is shown in Fig. 5. In the case of the base 4 network, the seed is a \( S(4) \) switch. For the base 2 structure, the seed would have been an \( S(2) \) switch (i.e., a \( 2 \times 2 \) switch), and Fig. 4 would be used first to create a \( 4 \times 4 \) baseline network \( B(4) \); after a second time \( B(8) \), and finally, after the third time, the final network \( B(16) \). Although baseline networks themselves are blocking, they can be used to build larger networks which are non-blocking.
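The repeated application of Fig. 4 can be sketched as a list of intermediate network sizes. This is only an illustration of the build-up order under the constant-base assumption made above; the function name is hypothetical.

```python
def baseline_build_sequence(m: int, b: int) -> list[int]:
    """Sizes of the baseline networks produced on the way from the seed to B(m).

    Starts from the seed of size b and multiplies the size by b at each
    application of the Fig. 4 construction (constant base assumed).
    """
    if b < 2:
        raise ValueError("base b must be at least 2")
    sizes = [b]
    while sizes[-1] < m:
        sizes.append(sizes[-1] * b)
    if sizes[-1] != m:
        raise ValueError("m must be an integral power of b")
    return sizes


if __name__ == "__main__":
    print(baseline_build_sequence(16, 2))  # seed S(2), then three applications
    print(baseline_build_sequence(16, 4))  # seed S(4), then one application
```

The length of the returned list equals the number of switch stages in the finished network.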

A reverse baseline network \( R(m) \) is a baseline network that has been reflected about its vertical axis, generally producing a network which is distinct from the original baseline network since baseline networks need not be symmetrical [12]. For example, the base 2 network of Fig. 5, when reflected to become a reverse baseline network is different from the original, while the base 4 network happens to be the same.

It can be shown that:

\[
N(B(m)) = mb \log_b m
\]

where the function \( N \) represents the number of switches in the switch fabric. The number of \( 2 \times 2 \) couplers required to make a \( b \times b \) coupler is \( (b \log_2 b)/2 \), so the total number of couplers required \( (X) \), assuming \( 2 \times 2 \) devices are used throughout, is

\[
X(B(m)) = \frac{mb(\log_2 b)(\log_b m - 1)}{2} + 2m(b - 1)
\]
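A minimal sketch of these counts follows. It assumes that a base-\( b \) baseline network of size \( m \) has \( \log_b m \) stages, each holding \( mb \) gates, with one \( b \times b \) coupler per line between adjacent stages and \( 1 \times b \) splitters and \( b \times 1 \) combiners (each made of \( b - 1 \) two-way couplers) at the fabric edges. The helper names are illustrative and \( b \) is taken to be a power of two.

```python
def int_log(value: int, base: int) -> int:
    """Return k >= 1 with base**k == value, or raise ValueError."""
    if base < 2:
        raise ValueError("base must be at least 2")
    k, size = 0, 1
    while size < value:
        size *= base
        k += 1
    if size != value or k == 0:
        raise ValueError(f"{value} is not a positive integral power of {base}")
    return k


def couplers_per_bxb(b: int) -> int:
    """2x2 couplers needed for one b x b coupler: (b log2 b) / 2."""
    return b * int_log(b, 2) // 2


def baseline_switches(m: int, b: int) -> int:
    """N(B(m)): m*b gates in each of log_b(m) stages."""
    return m * b * int_log(m, b)


def baseline_couplers(m: int, b: int) -> int:
    """X(B(m)): inter-stage b x b couplers plus edge splitters/combiners."""
    stages = int_log(m, b)
    inner = m * couplers_per_bxb(b) * (stages - 1)  # m couplers per gap
    edges = 2 * m * (b - 1)                         # input and output sides
    return inner + edges


if __name__ == "__main__":
    for base in (2, 4):
        print(f"B(16), base {base}: N = {baseline_switches(16, base)}, "
              f"X = {baseline_couplers(16, base)}")
```

Under these assumptions the two \( 16 \times 16 \) networks of Fig. 5 use the same number of gates, while the base 2 version needs fewer couplers at the cost of more stages.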

![Diagram of a base 4 and a base 2 \( 16 \times 16 \) network](image)

**Fig. 5** A base 4 and a base 2 \( 16 \times 16 \) network

Values for different types of coupling losses are shown in Table 1. These figures do not represent typical losses; they are representative of losses that might be attained by refining existing techniques*. Also, these values do not include the packaging losses associated with bonding; since this is just a preliminary study, this additional source of loss has been neglected. Lenses would be used where appropriate, to improve mode matching and allow greater tolerance to alignment error. Present InP laser to waveguide coupling exhibits a loss of around 2 dB, but in principle, lossless operation is possible, as shown in the table. Waveguide losses, bend losses and intersection losses are neglected in the interest of tractability, although these would ultimately have to be considered when a more accurate system characterisation is required. Also, effects due to unequal attenuation along different paths will be ignored.

Table 1: Values used for coupling losses

<table>
<thead>
<tr>
<th>Type of loss</th>
<th>Symbol</th>
<th>Value used, dB</th>
</tr>
</thead>
<tbody>
<tr>
<td>Fibre to silica waveguide</td>
<td>( L_\alpha )</td>
<td>1</td>
</tr>
<tr>
<td>Silica to InP switch</td>
<td>( L_\gamma )</td>
<td>2</td>
</tr>
<tr>
<td>Fibre to InP switch</td>
<td>( L_\beta )</td>
<td>3</td>
</tr>
<tr>
<td>InP laser to waveguide</td>
<td>( L_\omega )</td>
<td>0</td>
</tr>
</tbody>
</table>

* With current technology, the fabrics will still be feasible, although the number of fabrics that can be cascaded will be smaller; for reasons of space, an analysis of this has not been included here. It should certainly be possible to build demonstration architectures.

IEE Proc.-Optoelectron., Vol. 142, No. 3, June 1995

Normally, the gain of the laser amplifiers will be constrained so that the gain of the entire switch fabric is unity. The loss that must be overcome by each stage of laser diodes is shown in Tables 2 and 3, for T both even and odd, where \( T = \log_b m \) is the number of stages in the baseline or reverse baseline network. Here it is assumed that the loss due to a \( b \times b \) coupler is equally shared by the laser diodes on either side of it. The loss is divided up into two components; \( L_{in} \) is the loss on the input to a laser diode, and \( L_{out} \) is the loss on the output. This figure includes both system and coupling losses.
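As a rough numerical illustration of the loss sharing described above, the sketch below computes the ideal splitting loss of a \( b \times b \) coupler, \( 10 \log_{10} b \) dB, the half of it assigned to the amplifiers on each side, and the gain an amplifier needs to offset its own input and output loss. The function names are illustrative, and excess loss in the coupler is assumed to be zero.

```python
from math import log10


def coupler_loss_db(b: int) -> float:
    """Ideal splitting loss of a b x b coupler in dB: 10 log10(b)."""
    if b < 1:
        raise ValueError("b must be a positive integer")
    return 10.0 * log10(b)


def shared_coupler_loss_db(b: int) -> float:
    """Half of the coupler loss, attributed to the amplifier on one side."""
    return coupler_loss_db(b) / 2.0


def required_gain_db(loss_in_db: float, loss_out_db: float) -> float:
    """Gain an amplifier needs so that its own stage has unity net gain."""
    return loss_in_db + loss_out_db


if __name__ == "__main__":
    for b in (2, 4, 8):
        print(f"b = {b}: coupler loss {coupler_loss_db(b):.2f} dB, "
              f"{shared_coupler_loss_db(b):.2f} dB per side")
```

For \( b = 2 \) this reproduces the 3 dB figure used earlier for a single coupler.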

If \( b = m \), the entire switch can be fabricated on an InP substrate with no silicon motherboard; otherwise, if \( m > b \), a silicon motherboard must be used to hold many \( b \times b \) InP substrates. The loss is \( L_{in} = L_{out} = 10 \log_{10} b + L_{inw} + L_{tj} \).

2.3 Benes networks

Unlike baseline and reverse baseline networks, Benes networks \( W(m) \) [13–15] are nonblocking, but at the expense of using more components, and, again, can be used as a constituent of TDM switching fabrics. The definition of larger Benes networks in terms of smaller networks is shown in Fig. 6. Since Benes networks are rearrangeably nonblocking, there are implications for the control of TDM switching fabrics incorporating them (Section 4). The seed \( W(b) \) is defined as \( S(b) \), and it will be assumed again that \( b \) is held constant throughout the production of the network. As with baseline networks, a Benes network of any size can be built starting with the seed. A Benes network is similar in appearance to a baseline network and a reverse baseline network joined back-to-back. It can be shown that:

\[
N(W(m)) = 2mb \log_b m - mb
\]

\[
X(W(m)) = mb(\log_2 b)(\log_b m - 1) + 2m(b - 1)
\]

The loss to be overcome by the laser amplifiers in each stage is the same as for a baseline network, although the number of stages is always odd (Table 3) since:

\[
T = 2 \log_b m - 1
\]
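The Benes counts can be sketched in the same way as for baseline networks. The stage count \( T = 2 \log_b m - 1 \) and the switch count follow the formulas above; the coupler count is an assumption of this sketch, obtained by counting one \( b \times b \) coupler per line between adjacent stages plus \( 1 \times b \) splitters and \( b \times 1 \) combiners at the edges. The function names are illustrative and \( b \) is taken to be a power of two.

```python
def int_log(value: int, base: int) -> int:
    """Return k >= 1 with base**k == value, or raise ValueError."""
    if base < 2:
        raise ValueError("base must be at least 2")
    k, size = 0, 1
    while size < value:
        size *= base
        k += 1
    if size != value or k == 0:
        raise ValueError(f"{value} is not a positive integral power of {base}")
    return k


def benes_stages(m: int, b: int) -> int:
    """T = 2 log_b(m) - 1, which is always odd."""
    return 2 * int_log(m, b) - 1


def benes_switches(m: int, b: int) -> int:
    """N(W(m)) = 2mb log_b(m) - mb, i.e. m*b gates per stage."""
    return m * b * benes_stages(m, b)


def benes_couplers(m: int, b: int) -> int:
    """X(W(m)) under the counting assumption stated in the lead-in."""
    per_bxb = b * int_log(b, 2) // 2          # 2x2 couplers per b x b coupler
    inner = m * per_bxb * (benes_stages(m, b) - 1)
    edges = 2 * m * (b - 1)
    return inner + edges


if __name__ == "__main__":
    for base in (2, 4):
        print(f"W(16), base {base}: T = {benes_stages(16, base)}, "
              f"N = {benes_switches(16, base)}, X = {benes_couplers(16, base)}")
```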

3 Optical TDM switching architectures

There are various types of optical TDM switching fabrics which can be built out of baseline, reverse baseline and Benes networks. A comparison of all types is made in Section 5. The notation \( T(m, n) \) is used to represent a TDM switching fabric with \( m \) inputs and outputs handling \( n \) timeslots per frame. The different types of switch are summarised as follows:

(i) Fabrics with no frame integrity (type SL1A), unsuitable for many practical applications where data must be retained in the same frame and blocks must not be lost when the assignment is changed.

(ii) Fabrics with frame integrity (type SL1B), achieved by including extra baseline networks, and hence using more hardware than type SL1A.

(iii) Fabrics with frame integrity (type SL1C), but accomplished with fewer components than type SL1B. Extra delay lines are included between the silicon substrates to provide frame integrity.

(iv) Fabrics which generally use fewer switches (type SL2) than type SL1B or type SL1C but are not amenable to partitioning into silicon boards.

(v) Fabrics realised through the generalisation of both tree architectures and parallel timeslot interchangers, without (type SL4A) or with (type SL4B) frame integrity.

(vi) Fabrics which use small timeslot interchangers as building blocks to realise larger architectures (type SL5).

---

Table 2: Loss to be overcome by each laser amplifier stage with an even number of stages for baseline and reverse baseline networks

<table>
<thead>
<tr>
<th>Number</th>
<th>( L_{in} )</th>
<th>( L_{out} )</th>
</tr>
</thead>
<tbody>
<tr>
<td>First stage</td>
<td>( 1 )</td>
<td>( L_{in} + 10 \log_{10} b + L_{inw} )</td>
</tr>
<tr>
<td>Even stages 2 to ( T - 2 )</td>
<td>( T/2 )</td>
<td>( 5 \log_{10} b + L_{inw} )</td>
</tr>
<tr>
<td>Odd stages 3 to ( T - 3 )</td>
<td>( T/2 )</td>
<td>( 5 \log_{10} b + L_{inw} )</td>
</tr>
<tr>
<td>Last stage</td>
<td>( 1 )</td>
<td>( 5 \log_{10} b + L_{inw} )</td>
</tr>
</tbody>
</table>

Table 3: Loss to be overcome by each laser amplifier stage with an odd number of stages for baseline, reverse baseline and Benes networks

<table>
<thead>
<tr>
<th>Number</th>
<th>( L_{in} )</th>
<th>( L_{out} )</th>
</tr>
</thead>
<tbody>
<tr>
<td>First stage</td>
<td>( 1 )</td>
<td>( 10 \log_{10} b + L_{inw} )</td>
</tr>
<tr>
<td>Even stages 2 to ( T - 1 )</td>
<td>( (T-1)/2 )</td>
<td>( 5 \log_{10} b + L_{inw} )</td>
</tr>
<tr>
<td>Odd stages 3 to ( T - 2 )</td>
<td>( (T-3)/2 )</td>
<td>( 5 \log_{10} b + L_{inw} )</td>
</tr>
<tr>
<td>Last stage</td>
<td>( 1 )</td>
<td>( 5 \log_{10} b + L_{inw} )</td>
</tr>
</tbody>
</table>

---

Fig. 6 Definition of a Benes network \( W(mb) \) in terms of smaller Benes networks and space switches

Fig. 7 shows how a large SL1 fabric \( T(m, n) \) can be constructed out of smaller ones \( T(m/e, n/f) \) by adding baseline and reverse baseline networks; \( e \) and \( f \) must both be integral powers of \( b \). To produce a \( T(m, n) \) fabric, start with a seed \( T(m, 1) \), which is equivalent to \( W(m) \), and produce larger and larger fabrics until the desired size of fabric has been realised. The thick dotted lines in Fig. 7 have various meanings depending on the type of architecture to be constructed. To ensure correct operation, the connections must be made in exactly the order stated. The fabrics in the centre stage must have all delay lines multiplied by a factor of \( f \). \( e \) may be thought of as the 'space expansion factor', since the fabric \( T(m, n) \) has \( e \) times as many inputs and outputs as the centre stage fabrics \( T(m/e, n/f) \). Likewise, \( f \) may be thought of as the 'time expansion factor'.

Each of the thick dotted lines (Fig. 7) represents a bundle of \( f \) connections; the precise way in which these connections are made depends on the architecture. The substitutions for the dotted lines to create types SL1A, SL1B, and SL1C are shown in Figs. 8a, b, and c, respectively. To specify the delay line lengths correctly, it is necessary to introduce \( r(i) \), the bit-reversal function [12], which operates on \( \log_2(f) \)-digit binary numbers; if the binary representation of \( i \) is \( p_{k-1} p_{k-2} \ldots p_1 p_0 \), where \( k = \log_2(f) \), then \( r(i) = p_0 p_1 \ldots p_{k-2} p_{k-1} \).
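The bit-reversal function can be sketched as follows. This is a generic implementation of bit reversal on \( \log_2(f) \)-digit binary numbers, not code from the paper; the function name is illustrative and \( f \) is assumed to be a power of two.

```python
def bit_reverse(i: int, f: int) -> int:
    """Reverse the log2(f)-digit binary representation of i.

    f must be a power of two and 0 <= i < f.
    """
    if f < 1 or f & (f - 1) != 0:
        raise ValueError("f must be a power of two")
    if not 0 <= i < f:
        raise ValueError("i must satisfy 0 <= i < f")
    k = f.bit_length() - 1  # number of binary digits, log2(f)
    r = 0
    for _ in range(k):
        r = (r << 1) | (i & 1)  # move the lowest digit of i onto r
        i >>= 1
    return r


if __name__ == "__main__":
    f = 8
    for i in range(f):
        print(f"r({i}) = {bit_reverse(i, f)}")
```

Applying the function twice returns the original index, so \( r \) is its own inverse.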

For type SL1A fabrics, each thick dotted line represents a bundle of \( f \) connections (Fig. 8a). To provide frame integrity (type SL1B), an \( f \times f \) omega network \( P(f) \) [16] is used to align the blocks and keep them in separate frames (Fig. 8b). Omega networks were originally proposed for interconnection within computer systems; they are easily obtained from baseline networks [12]. For type SL1C, yet another scheme is used (Fig. 8c). The \( 1 \times 2 \) and \( 2 \times 1 \) couplers would be incorporated within the space switch fabrics since \( b \times 2 \) couplers included in the connection to the \( r(f - 1) \) delay lines to equalise attenuation. The SLAs in Fig. 8c could be incorporated within the substrate lying immediately to their left.

Sometimes it is necessary to have switch fabrics with a single input and output, i.e., timeslot interchangers (TSIs), expressed as \( T(1, n) \). One way of creating such a fabric would be to take some fabric \( T(2, n) \) and simply leave one input and output unused. However, a more economical method is shown in Fig. 9 with frame integrity (type SL1C). The \( 2 \times 1 \) and \( 1 \times 2 \) couplers on either side of the \( T(2, n/4) \) fabric are integrated onto the substrate. The number of components for each type of SL1 fabric is summarised in Table 4.

It must be noted that there are many ways of building up a switching fabric of any given dimensions. When compiling the table, it was assumed that the fabric was built out of \( d \times d \) substrates, using seed \( W(d) \).

---

**Table 4: Number of components for type SL1 fabrics**

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Type SL1A</th>
<th>Type SL1B</th>
<th>Type SL1C</th>
</tr>
</thead>
<tbody>
<tr>
<td>( T(1, n) )</td>
<td>( T(m, n) )</td>
<td>( T(m, n) )</td>
<td>( T(m, n) )</td>
</tr>
<tr>
<td>( m \geq 2 )</td>
<td>( m \geq 2 )</td>
<td>( m \geq 2 )</td>
<td>( m \geq 2 )</td>
</tr>
<tr>
<td>( N )</td>
<td>( 8 \log_2(n) - 8 )</td>
<td>( 2mb \log_2 mn - mb )</td>
<td>( 16 \log_2(n) - 16 )</td>
</tr>
<tr>
<td>( X )</td>
<td>( 8 \log_2(n) - 10 )</td>
<td>( m(b \log_2 mn) )</td>
<td>( 16 \log_2(n) - 18 )</td>
</tr>
</tbody>
</table>

---

Fabrics without frame integrity (type SL1A) will not be considered further, since this property is intolerable in a practical telecommunications system. Type SL1B architectures have more stages of switches than type SL1C, and will therefore have poorer noise performance. The control complexity for the two types of architecture is identical. The latter architectures have the advantage of requiring fewer switches and couplers (Table 4). For the more general case of \( n \geq 2 \), the number of switches used by SL1C is smaller than that used by SL1B by the following number:

\[
2m \log_2 n \left( b \log_2 d - \frac{2d - 1}{d} \right)
\]

Recall that \( b \) is the size of the space switch elements in the substrates, and \( d \) is the size of the silicon substrates. Since \( b \geq 2 \), and \( d \geq b \), the term inside brackets is always positive, thus SL1C networks are clearly superior in performance.

A third approach to achieving frame integrity will be considered now, which has certain advantages and disadvantages over type SL1B (extra baseline networks) and type SL1C (extra delay lines). It is sufficient to say that although the new architectures (type SL2) generally use fewer switches, they are not amenable to partitioning into silicon boards containing large numbers of semiconductor switches, thereby increasing the packaging costs. The control complexity of all three architectures is approximately equal [10]. As intermediate steps in creating the new desired fabric \( T(m, n) \), two other types of fabric are produced: \( X(m, n) \) and \( Y(m, n) \) [17]. They are essentially the same as the \( T(m, n) \)-fabrics already considered, the only difference being that the frame boundaries are not necessarily aligned on the inputs and outputs. This misalignment is crucial to the operation of the fabric and its economical use of hardware.

As before, the principle is to build up larger fabrics from smaller ones until the desired size of fabric is produced. The small fabrics use more than the single \( 2 \times 2 \) switch that would be required for \( T(2, 1) \), since extra hardware is used to re-align frame boundaries. Also, note that switches larger than \( 2 \times 2 \) may not be used to build this part of the fabric. Producing larger fabrics from smaller ones is, consequently, more complex than with SL1 architectures. There are certain restrictions on what seeds may be used; the choice when starting to create a fabric can be shown, as a consequence, to depend on the size of frame ultimately desired [17].

Once a fabric with the desired number of timeslots per frame has been produced, it can be extended to the required number of inputs and outputs by substituting into Fig. 7 one or more times with \( e = m/2 \) and \( f = 1 \). The first time, a minor optimisation is possible where each pair of \( B(m/2) \) or \( R(m/2) \) are combined, along with all the \( 2 \times 2 \) switches directly connected to it, into one \( B(m) \) or \( R(m) \) switch, respectively.

Fig. 10 is an example of a SL2 architecture, \( T(8, 4) \). Assuming \( n \geq 8 \) throughout, the number of components in this type of architecture is given by:

\[
N(T(m, n)) = 2m b \log_2 m + 4m \log_2 n + (3m/2)
\]

\[
N(T(1, n)) = 8 \log_2 n + 1
\]

\[
X(T(m, n)) = m(4 \log_2 n + b) + mb(\log_2 b)(\log_b m - 1)
\]

\[
X(T(1, n)) = 8 \log_2 n + 6
\]

The values for \( T(1, n) \) were calculated assuming the same method used to obtain one input and one output as for type SL1C.

---

Fig. 10 A type SL2 T(8, 4) network

---

Fig. 11 SL4 architecture and network

a) Definition of a type SL4 architecture

b) Substitution for dotted lines in a, for type SL4 network

Type SL4 architectures are a generalisation of both tree architecture space switches [18] and parallel structure TSIs [19]. They consist of a passive demultiplexer on each input and a passive multiplexer on each output, interconnected so that each input can be connected to each output via an SLA by any delay from 0 timeslots up to and including some value \( k \). The value of \( k \) is \( n - 1 \) for type SL4A architectures, which do not have frame integrity, and \( 2n - 2 \) for type SL4B, which have frame integrity (Figs. 11a and b). All the control logic must do is route each incoming block through the appropriate delay line so that it arrives at the correct output at the correct time. Controlling the fabric is therefore very simple, although the fabric does use a lot of hardware.
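The delay sets for the two SL4 variants, and the resulting number of SLA gates, can be sketched as below. The gate count assumes exactly one SLA for every combination of input, output and delay, which is how the description above is read here; the function names are illustrative.

```python
def sl4_max_delay(n: int, frame_integrity: bool) -> int:
    """Largest delay k in timeslots: n - 1 (SL4A) or 2n - 2 (SL4B)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return 2 * n - 2 if frame_integrity else n - 1


def sl4_delays(n: int, frame_integrity: bool) -> list[int]:
    """All delay line lengths 0, 1, ..., k available on each path."""
    return list(range(sl4_max_delay(n, frame_integrity) + 1))


def sl4_gate_count(m: int, n: int, frame_integrity: bool) -> int:
    """SLA gates, assuming one per (input, output, delay) combination."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return m * m * len(sl4_delays(n, frame_integrity))


if __name__ == "__main__":
    m, n = 2, 4
    for integrity in (False, True):
        label = "SL4B" if integrity else "SL4A"
        print(f"{label}: delays {sl4_delays(n, integrity)}, "
              f"gates {sl4_gate_count(m, n, integrity)}")
```

Providing frame integrity roughly doubles the number of delay lines, and with it the number of gates, while the control logic stays the same.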

The hardware usage for both types is shown in Table 5; note that the multiplexers and demultiplexers must have path independent insertion loss even though the number of ways in which they split (or combine) is not an integral power of two. This is achieved by leaving some splitter outputs and combiner inputs uncoupled in type SL4B architectures.

Table 5: Performance figures for type SL4 architectures

<table>
<thead>
<tr>
<th></th>
<th>Type SL4A (no frame integrity)</th>
<th>Type SL4B (frame integrity)</th>
</tr>
</thead>
<tbody>
<tr>
<td>( N )</td>
<td>( m^2 n )</td>
<td></td>
</tr>
<tr>
<td>( X )</td>
<td>( 2m(mn-1) )</td>
<td></td>
</tr>
</tbody>
</table>

Fig. 12 SL5 architecture and network

a Definition of a large type SL5 network \( T(m,n) \) in terms of smaller type SL5 networks
b Substitution for dotted lines in a to produce a type SL5 architecture

Table 6: Values of \( f \) and lengths of delays for types SL5A and SL5B

<table>
<thead>
<tr>
<th></th>
<th>Type SL5A (without frame integrity)</th>
<th>Type SL5B (with frame integrity)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Value of ( f )</td>
<td>( f = b )</td>
<td></td>
</tr>
<tr>
<td>Lengths of delays</td>
<td>( 0, 1, 2, \ldots, k = f - 1 )</td>
<td>(one switch not used)</td>
</tr>
</tbody>
</table>

Fig. 12a shows how to produce the final type of architecture considered here, type SL5. Each dashed line in Fig. 12a corresponds to the parallel structure shown in Fig. 12b. The lengths of the delay lines and the value of \( f \) are shown in Table 6 for type SL5A (without frame integrity) and type SL5B (with frame integrity). For type SL5B, only \( b - 1 \) of the switches are used; this might provide an opportunity to make use of substrates where one switch is not functional. Table 7 shows the relationship between \( b, n \) and the delay line lengths for this fabric in a TSI realisation.

A switch fabric with frame integrity, representing \( T(4,4) \), is shown in Fig. 13. It is effectively an 'STS' configuration: the time switching is in the centre, in the form of \( m \) TSIs, while the space switching takes place on either side. Alternatively, the fabric can assume a 'TST' configuration, with the centre of the fabric consisting of one large space switch.

Table 7: Value of \( n \) and lengths of delay lines for type SL5 fabrics

<table>
<thead>
<tr>
<th>Type SL5A</th>
<th>Type SL5B</th>
</tr>
</thead>
<tbody>
<tr>
<td>(without frame integrity)</td>
<td>(with frame integrity)</td>
</tr>
<tr>
<td>Value of ( n )</td>
<td>( n = b )</td>
</tr>
<tr>
<td>Lengths of delays</td>
<td>( 0, 1, \ldots, k = n - 1 )</td>
</tr>
<tr>
<td>(one switch not used)</td>
<td></td>
</tr>
</tbody>
</table>

Fig. 14 Scheme for concatenating time stages in type SL5 networks

Table 8 gives the hardware requirements for types SL5A and SL5B in both the TST and STS configurations. Formulae are also given for timeslot interchangers (TSIs), i.e. \( T(1, n) \); this is necessary because the formulae for \( N \) and \( X \) for baseline and Benes networks do not apply for \( m = 1 \). Since these formulae were used to calculate the results in Table 8, separate formulae must be supplied for the case of a single input and output. It is easily shown that type SL5B STS architectures use only \((1/2)mb \log_2 b + m + h\) more couplers and only \( m \) more switches than type SL5B TST. Both varieties of architecture may therefore be considered roughly equivalent in their use of hardware. Throughout the remainder of this paper, the TST architecture will be considered, since it uses marginally fewer components.

It will be assumed when calculating performance that the time switch sections of the fabric are fabricated as in Fig. 14.

4 Fabric control

When using one of the fabrics described above, it is necessary to have some means of translating the desired connections between input and output channels into control signals for the switches; the latter may change state every timeslot. A standard Benes network control algorithm may be used, requiring one processor; the architecture is rearrangeably nonblocking. The control complexity is \( \text{O}(m \log m) \) [20], which would be satisfactory for cross-connect applications but is generally too high for use as a switching node. There are two ways of reducing the computation time: (i) by using \( m \) processors in parallel, the computation time can be reduced to \( \text{O}(m \log m)(\log m - \log b) \) [20, 21]; and (ii) by using multiple fabrics in parallel, it is possible to make the fabric strict-sense nonblocking. This requires \( \log_2 m \) parallel fabrics, and implies a control complexity of \( (\log_2 m)^2 \) [22]. Unfortunately, a very large number of switches are required, and methods of using fewer parallel fabrics while still retaining acceptable performance are under investigation.

The control algorithm computes the switch settings for a hypothetical space switch i.e., a fabric composed solely of switches, with no delay lines. The space switch can be obtained from the real switch fabric by using a space-time mapping [23, 24] and has \( m \) inputs and outputs. The architectures map into Waksman networks [25] and Benes networks [13–15].

The switch settings for the space switch are then fed in parallel into an array of shift registers which transmit them in series to the real switch fabric. Since the shift registers are the only part of the electronics that has to operate at the timeslot rate, the rest of the control electronics is essentially future-proof. If the bit-rate passing through the fabric is updated without changing the timeslot rate, then only the interfaces to the telecommunications network need be replaced. If the timeslot rate is increased, the shift registers may have to be changed; however, the electronics which implements the control algorithm itself remains the same.
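The parallel-load, serial-output behaviour of the shift registers can be sketched as follows. This is only an illustration under stated assumptions: one register is assumed per switch in the real fabric, each holding one binary state per timeslot, and the class and function names are hypothetical. The space-time mapping of [23, 24] that decides which register receives which setting is not modelled.

```python
from collections import deque


class ShiftRegister:
    """Holds one switch's states for a frame and emits one state per timeslot."""

    def __init__(self):
        self._bits = deque()

    def load_parallel(self, bits):
        # All settings for the frame are written at once by the control processor.
        self._bits = deque(bits)

    def shift_out(self):
        # One setting is delivered to the switch at the timeslot rate.
        return self._bits.popleft() if self._bits else 0


def serialise(settings):
    """settings[k] lists the states of switch k, one entry per timeslot.

    Returns, for each timeslot, the states applied to all switches.
    """
    registers = []
    for bits in settings:
        reg = ShiftRegister()
        reg.load_parallel(bits)
        registers.append(reg)
    timeslots = len(settings[0]) if settings else 0
    return [[reg.shift_out() for reg in registers] for _ in range(timeslots)]


# Two switches, three timeslots per frame (hypothetical settings).
for slot, states in enumerate(serialise([[1, 0, 1], [0, 0, 1]])):
    print(slot, states)
```

Only `shift_out` runs at the timeslot rate in this sketch, which mirrors the point that the rest of the control electronics is independent of that rate.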

5 Fabric model and performance comparison

Following Kalman et al. [4], the performance of the fabric is analysed neglecting crosstalk. It must be noted that the analysis presented here is intended as a first approximation to obtain a preliminary idea of the switch fabric characteristics. Further work will be necessary to obtain a full and accurate model. The output power from an SLA, \( P_{\text{out}} \), is related to the input power, \( P_{\text{in}} \), by [4]:

\[
P_{\text{out}} = L_{\text{in}} L_{\text{out}} G P_{\text{in}} + p L_{\text{out}} n_{\text{sp}} (G - 1) h \nu_c \Delta\nu
\]

where:

\[
G = \text{internal SLA gain}
\]
\[
L_{\text{in}} = \text{input coupling loss to SLA}
\]
\[
L_{\text{out}} = \text{output coupling loss from SLA}
\]
\[
\nu_c = \text{centre optical frequency of amplifier bandpass}
\]
\[
\Delta\nu = \text{effective amplifier optical bandwidth}
\]
\[
n_{\text{sp}} = \text{excess spontaneous emission factor}
\]
\[
p = 1 \, (\text{completely polarisation dependent})
\]
\[
p = 2 \, (\text{completely polarisation independent})
\]
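As a rough numerical illustration of the single-amplifier relation, the sketch below adds the amplified signal to the amplified spontaneous emission, taking the spontaneous term to scale with the polarisation factor p defined above. The function name and all example values (input power, gain, losses, \( n_{\text{sp}} \), bandwidth) are assumptions chosen for illustration only.

```python
PLANCK = 6.62607015e-34  # Planck constant, J s


def sla_output_power(p_in, gain, l_in, l_out, n_sp, p, nu_c, delta_nu):
    """Output power (W) of one SLA: amplified signal plus spontaneous emission.

    l_in and l_out are linear transmission factors (0..1), not dB values.
    """
    signal = l_in * l_out * gain * p_in
    ase = p * l_out * n_sp * (gain - 1.0) * PLANCK * nu_c * delta_nu
    return signal + ase


# Hypothetical example: 1 uW input, 20 dB internal gain, 3 dB coupling loss per facet.
example = sla_output_power(
    p_in=1e-6, gain=100.0, l_in=0.5, l_out=0.5,
    n_sp=4.0, p=2, nu_c=2.306e14, delta_nu=3.548e12,
)
print(f"{example:.3e} W")
```

Setting `gain=1.0` removes the spontaneous term, which is a quick check that the noise contribution comes only from amplification.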

Now consider a cascade of \( M \) stages, numbered from 1 to \( M \). It can be shown that the output power from the last stage is given by:

\[
P_M = G_s P_{\text{in}} + G_{\text{sp}} n_{\text{sp}} p_{\text{eff}} h \nu_c \Delta\nu_{\text{eff}}
\]

where:

\[
\Delta\nu_{\text{eff}} = \text{overall gain bandwidth due to the cascade of SLAs}
\]
\[
p_{\text{eff}} = \text{effective polarisation-dependent factor, varying from 1 to 2}
\]

Also, it can be shown that the net signal gain through the fabric, assuming that the overall gain bandwidth is constant, is:

\[
G_s = \prod_{i=1}^{M} L_{\text{in},i} L_{\text{out},i} G_i
\]

where:

\[
L_{\text{in},i} = \text{input loss for stage } i \, (\text{Section 2})
\]
\[
L_{\text{out},i} = \text{output loss for stage } i \, (\text{Section 2})
\]
\[
G_i = \text{gain of amplifier in stage } i
\]

\( L_{\text{in},i} \) and \( L_{\text{out},i} \) include both the system and coupling loss (Section 2). The spontaneous gain for the whole system is given by:

\[
G_{\text{sp}} = \sum_{i=1}^{M} L_{\text{out},i} (G_i - 1) \prod_{j=i+1}^{M} \left( L_{\text{in},j} L_{\text{out},j} G_j \right)
\]

If the gain of each stage is set to cancel out the coupling and system losses exactly, then \( L_{\text{in},i} L_{\text{out},i} G_i = 1 \), implying \( G_s = 1 \), and

\[
G_{\text{sp}} = \sum_{i=1}^{M} L_{\text{out},i} (G_i - 1)
\]

Hence \( G_{\text{sp}} \) for the whole fabric can be computed by summing the \( G_{\text{sp}} \) of each stage (still assuming that the net gain of each stage is unity). Also, if \( G_{\text{sp max}} \) is the maximum allowed value of \( G_{\text{sp}} \), and each fabric has a spontaneous gain of \( G_{\text{sp nw}} \), then the number that can be connected in cascade is

\[
N = \frac{G_{\text{sp max}}}{G_{\text{sp nw}}}
\]

To achieve a bit error ratio (BER) of \( 10^{-9} \), the input power should satisfy:

\[
(G_s P_{\text{in}})^2 = 576 \, G_{\text{sp}} n_{\text{sp}} h \nu_c B_e \left( G_s P_{\text{in}} + G_{\text{sp}} n_{\text{sp}} h \nu_c B_o \right)
\]

where \( B_o \) is the bandwidth of the optical filter at the fabric output and \( B_e \) is the electrical bandwidth of the receiver. Satisfying the saturation constraint and choosing the optimum value of \( P_{\text{in}} \) to maximise the spontaneous gain \( G_{\text{sp}} \) yields:

\[
G_{\text{sp max}} = \frac{L_{\text{out},M} P_{\text{sat}}}{n_{\text{sp}} h \nu_c \left\{ \dfrac{1}{\sqrt{\dfrac{1}{4 B_o^2} + \dfrac{1}{576 B_o B_e}} - \dfrac{1}{2 B_o}} + p_{\text{eff}} \Delta\nu_{\text{eff}} \right\}}
\]
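The cascade bookkeeping can be sketched in a few lines. The sketch assumes that spontaneous emission generated in stage i is subsequently scaled by the net gain of every later stage, so that with unity net gain per stage the total reduces to a simple sum, as stated above. Each stage is described by linear (not dB) values; the function names and example numbers are hypothetical.

```python
import math


def cascade_gains(stages):
    """Return (signal gain, spontaneous gain) for a cascade.

    stages is a sequence of (l_in, l_out, gain) tuples in linear units.
    """
    g_s = 1.0
    g_sp = 0.0
    for l_in, l_out, gain in stages:
        net = l_in * l_out * gain
        # Noise from earlier stages is scaled by this stage's net gain,
        # then this stage adds its own contribution.
        g_sp = g_sp * net + l_out * (gain - 1.0)
        g_s *= net
    return g_s, g_sp


def max_cascaded(g_sp_max, g_sp_fabric):
    """Whole number of identical fabrics that fit within the spontaneous gain budget."""
    return math.floor(g_sp_max / g_sp_fabric)


# Three identical stages, each with 3 dB loss on either side and 6 dB internal gain.
g_s, g_sp = cascade_gains([(0.5, 0.5, 4.0)] * 3)
print(g_s, g_sp, max_cascaded(100.0, g_sp))
```

The budget of 100 passed to `max_cascaded` is an arbitrary illustrative figure, not a value from the paper.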

Using the following values:

\[ L_{\text{tot}} = L_0 + 10 \log_{10} b + L_{f_s} \text{ (dB)} \]

- \[ L_0 = 10 \log_{10} b + 3 \text{ (dB)} \] if \( b > d \)
- \[ L_{f_s} = 10 \log_{10} b + 3 \text{ (dB)} \] if \( b = d \)

\[ n_s = 4.5 \times 10^2 \times 10^{14} \text{ Hz} (\lambda = 1300 \text{ nm}) \]

- \( B_o = 177.4 \text{ GHz} \) (1 nm optical filter at fabric output)
- \( B_e = 1.5 \text{ GHz} \) (assuming a 2.488 Gb/s bitrate)
- \( p_{\text{eff}} = 2 \)
- \( \Delta\nu_{\text{eff}} = 3.548 \text{ THz} \) (\( \Delta\lambda = 20 \text{ nm} \))
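
The two bandwidth figures above can be checked by converting a wavelength span to a frequency span, taking \( \lambda = 1300 \text{ nm} \) as stated and assuming \( c = 2.998 \times 10^{8} \text{ m/s} \):

\[
\Delta\nu = \frac{c \, \Delta\lambda}{\lambda^2} = \frac{(2.998 \times 10^{8} \text{ m/s})(1 \times 10^{-9} \text{ m})}{(1.3 \times 10^{-6} \text{ m})^2} \approx 1.774 \times 10^{11} \text{ Hz} = 177.4 \text{ GHz}
\]

\[
\Delta\lambda = 20 \text{ nm}: \quad \Delta\nu \approx 20 \times 177.4 \text{ GHz} = 3.548 \text{ THz}
\]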

For any value of \( P_{\text{sat}} \), the corresponding \( G_{\text{sp max}} \) can be substituted into eqn. 1 to obtain the number of fabrics possible in cascade. Here, it is assumed that a high \( P_{\text{sat}} \) of 10 mW is used; higher values (and hence higher \( G_{\text{sp max}} \) and more cascaded fabrics) could be obtained by using MQW devices. For example, devices with a very high \( P_{\text{sat}} \) of 100 mW would give ten times as many switch fabrics in cascade.
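The proportionality between \( P_{\text{sat}} \) and the permitted spontaneous gain can be checked numerically. The sketch below assumes the following reading of the expression for the maximum spontaneous gain: \( L_{\text{out},M} P_{\text{sat}} \) divided by \( n_{\text{sp}} h \nu_c \{ 1/[\sqrt{1/(4B_o^2) + 1/(576 B_o B_e)} - 1/(2B_o)] + p_{\text{eff}} \Delta\nu_{\text{eff}} \} \). It uses the optical filter bandwidth (177.4 GHz), electrical bandwidth (1.5 GHz), polarisation factor and gain bandwidth listed above; the output loss, \( n_{\text{sp}} \) and centre frequency passed in the example are illustrative assumptions.

```python
import math

PLANCK = 6.62607015e-34  # Planck constant, J s


def g_sp_max(p_sat, l_out_m, n_sp, nu_c, b_o, b_e, p_eff, dnu_eff):
    """Maximum spontaneous gain allowed by the saturation and BER constraints.

    Assumed reading of the paper's expression; all powers in W, bandwidths in Hz.
    """
    ratio = math.sqrt(1.0 / (4.0 * b_o**2) + 1.0 / (576.0 * b_o * b_e)) - 1.0 / (2.0 * b_o)
    return l_out_m * p_sat / (n_sp * PLANCK * nu_c * (1.0 / ratio + p_eff * dnu_eff))


common = dict(
    l_out_m=0.5,       # assumed 3 dB output loss at the last stage
    n_sp=4.0,          # assumed spontaneous emission factor
    nu_c=2.306e14,     # roughly c / 1300 nm
    b_o=177.4e9, b_e=1.5e9, p_eff=2, dnu_eff=3.548e12,
)
low = g_sp_max(p_sat=10e-3, **common)
high = g_sp_max(p_sat=100e-3, **common)
print(high / low)
```

Because \( P_{\text{sat}} \) appears only as a multiplying factor in this reading, the ratio is set by the two saturation powers alone and does not depend on the assumed loss or \( n_{\text{sp}} \).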

The control complexity of the fabrics has already been considered in Section 4; it must be taken into account, since otherwise the fabrics are not suitable for practical telecommunications applications. Here, the switching fabrics are evaluated by considering the number of switches and also the buildup of amplified spontaneous emission (ASE). The latter is done using the technique described above, with crosstalk assumed to be negligible.

Type SL4B fabrics use far more components than the other types, since the component count varies approximately with \( m^2 n \). Thus this type should only be used for small fabrics requiring low control complexity.

The relationships between the number of switches required for the other architectures are more subtle. Type SL1C architectures have the desirable property of being integrated onto large substrates. This implies reduced fabrication costs, reduced attenuation, and perhaps reduced packaging costs also, although the possibility of reduced yield for large substrates must be taken into account. Therefore, it seems sensible to adopt type SL1C assuming that it does not require considerably more switches than the other approaches. This argument does not hold for timeslot interchangers; this case will be considered shortly.

For fabrics suitable for use in practice (\( 16 \leq m \leq 256 \) and \( 16 \leq n \leq 4096 \)), type SL1C uses no more than 50% more switches than type SL2 if \( b = 2 \), 24% if \( b = 4 \), and 26% if \( b = 8 \). If fabrics for which \( d = m \) are considered (i.e., the substrates are as large as possible), these figures become 12%, 9%, and 17%, respectively. Therefore, the benefit in number of switches offered by type SL2 is outweighed by type SL1C's advantage of being able to have many small InP substrates mounted on one large silicon substrate.

A comparison of type SL5B and SL1C shows that, for the same range of values as used above, the former uses at least 25% more switches if \( b = 4 \), and 84% if \( b = 8 \). Since the latter implies a possible saving in interconnection costs due to the use of large substrates, it seems that it is the best architecture assuming that the noise performance is no worse than the other approaches. This assumption will be confirmed shortly.

A type SL5B TSI uses \( 6 \log_2 n - 3 \) switches. It is easily

Fig. 15 Number of cascaded fabrics possible for types SL1C, SL2 and SL5B

Assumptions: \( b = 2 \), \( d = 16 \), \( m = 16 \), and \( n = 16 \), 256 or 4096
× type SL1C
○ type SL2
▽ type SL5B

Fig. 16 Number of cascaded fabrics possible for types SL1C, SL2 and SL5B

Assumptions: \( b = 2 \), 4 or 16, \( d = 16 \), \( m = 16 \), and \( n = 256 \)
× type SL1C
○ type SL2
▽ type SL5B

best approach for most applications. Indeed, in many cases, type SL1C architectures give slightly better performance. An example of this is shown in Fig. 15 for fabrics made from 2 × 2 switches on 16 × 16 substrates, with 16 inputs and outputs. Also, base 2 fabrics give consistently better performance than base 4. Fig. 16 compares the performance of fabrics using different bases and having 16 inputs and outputs, and 256 timeslots per frame. Base 8 and base 16 fabrics are not generally feasible, presumably due to the increased system loss; it would be worthwhile to see if strategically placed boosters could be inserted to reduce \( L_{\text{sys}} \) in eqn. 3, thereby improving the performance of the fabrics.

It has been made clear that several simplifying assumptions were made in arriving at these results.

6 Conclusions

A number of optical TDM switch architectures, suitable for implementation with laser amplifier switches, have been compared, many based on previously proposed architectures implemented with lithium niobate directional coupler switches. The best architecture for most purposes has been identified (designated type SL1C), which uses relatively few switches, is integrated onto large substrates, and may easily be modified to reduce control complexity.

The analysis in this paper should only be taken as representing a first attempt at the problem of evaluating the performance of these architectures. Further work will be necessary to characterise these architectures accurately and identify possible applications within the telecommunications network.

7 References


