Linearized Bregman Iterations for Automatic Optical Fiber Fault Analysis


Authors: Michael Lunglmayr, Gustavo C. Amaral

Abstract: Supervision of the physical layer of optical networks is an extremely relevant subject. To detect fiber faults, single-ended solutions such as the Optical Time Domain Reflectometer (OTDR) allow for precise measurements of fault profiles. Combining the OTDR with a signal processing approach for high-dimensional sparse parameter estimation allows for automated and reliable results in reduced time. In this work, a measurement system composed of a Photon-Counting OTDR data acquisition unit and a processing unit based on a Linearized Bregman Iterations algorithm for automatic fault finding is proposed. An in-depth comparative study of the proposed algorithm's fault-finding prowess in the presence of noise is presented. Characteristics such as sensitivity, specificity, processing time, and complexity are analysed in simulated environments. Real-life measurements conducted using the Photon-Counting OTDR subsystem for data acquisition and the Linearized Bregman-based processing unit for automated data analysis demonstrated accurate results. It is concluded that the proposed measurement system is particularly well suited to the task of fault finding. The natural characteristics of the algorithm foster embedding the solution in digital hardware, allowing for reduced costs and processing time.

Index Terms: Optical fiber measurements, Optical Time Domain Reflectometry, Fault location, Signal Processing, Signal Processing Algorithms

M. Lunglmayr is with the Institute of Signal Processing, Johannes Kepler University, Linz, Austria (e-mail: michael.lunglmayr@jku.at). G. C. Amaral is with the Center for Telecommunications Studies, Pontifical Catholic University of Rio de Janeiro, RJ, Brazil, and with QuTech and Kavli Institute of Nanoscience, Delft University of Technology, Delft, The Netherlands (e-mail: gustavo@opto.cetuc.puc-rio.br).

Copyright (c) 2018 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org.

I. INTRODUCTION

Optical fiber networks provide the backbone of today's information society. In the year 2014, already about 25 million kilometers of optical fiber had been installed worldwide, carrying over 80% of the long-distance data traffic [1]. Even though an excellent medium for broadband communications, the mechanical vulnerability of the optical fiber might jeopardize its transmission capacity: fiber bending, fiber breaking, imperfect fiber splices, and corrupted connectors are examples of faults that compromise the power budget in a fiber optical link and, eventually, render the link inoperable. Therefore, a means of detecting, locating, and evaluating the magnitude of such fault events is unavoidable and of great interest to network operators. Consequently, many physical layer monitoring techniques have been developed over the years, with Optical Time Domain Reflectometry (OTDR) [2] and Optical Frequency Domain Reflectometry (OFDR) [3] figuring as the most successful ones. The OTDR, which measures the Rayleigh backscattered power of a high-intensity optical probing pulse to determine a fiber's profile [4], is typically used as the data acquisition system for fiber fault analysis. As the architectures and sizes of the optical networks change with the development of new technologies and increasing demand for higher data rates, the monitoring techniques must also evolve to remain compatible with the networks [5].
An example of the evolution of supervision techniques to meet the requirements of new architectures is the wavelength-tunable OTDR, which allows for the supervision of wavelength division multiplexing (WDM) optical networks [6]. An example monitoring system for multiple fibers using an OTDR device is described in [7]. In this context, a natural evolution is the development of mathematical tools capable of automatically identifying the physical layer faults from the raw supervision data. This can eliminate the necessity of human resource allocation to this exhaustive process, which, in turn, translates into minimized expenses and reduced downtime in the network caused by the scheduling of in-field repair units [8]. Apart from this operational advantage related to optical network management, location and estimation of losses in an OTDR profile can also be used for sensing applications: in [9], a system for measuring the mechanical displacement of a fiber by measuring the losses due to fiber bends has been presented; in [10], fiber losses sensed by a Photon-Counting OTDR are used to measure the pH value of a liquid. Naturally, a methodology able to locate and characterize faults in a fiber is general enough that it can be adapted to either application.

The previously mentioned fiber faults, or any other fault in a fiber that causes the transmitted power to drop at an individual location, have been shown to manifest as trend breaks, or steps, in the digitized OTDR measurement data [11], [12]. Therefore, the location of a step can be directly associated with the location of the corresponding fiber fault. In other words, detecting faults in an OTDR profile is equivalent to detecting steps in the trend of the digitized data. If an algorithm can be responsible for such identification, automatic location of potential faults can be performed.
The reliable identification of steps in a data series attracts substantial interest in areas such as detection of disease outbreaks [13], analysis of geophysical signals [14], surveillance [15], speech processing [16], macroeconomics [17], fraud detection [18], and even biological machines [19]. Although this problem is extremely simple in a noiseless scenario, it becomes quite challenging in the presence of noise. In Fig. 1, this feature is exemplified for a trend break under different signal-to-noise ratio (SNR) conditions. While in the first case (Fig. 1-a and -b) taking the discrete derivative of the original data and searching for a peak in the outcome is sufficient to precisely identify the fault, as the SNR decreases (Fig. 1-c, -d, -e, and -f), the process yields poor results. For this reason, and due to its relevance, signal processing techniques have been developed [20]–[24] to attack the problem of multiple, unknown trend break detection in the presence of noise. Specifically in OTDR data acquisition, noise is present due to the power losses along the fiber, which cause the measurement SNR to decrease, and due to the intrinsic scattering noise known as Coherent Rayleigh Noise (CRN) [4]. Also specific to the context of fiber monitoring is the fact that, from the point of view of a network operator, being able to identify and repair a fault before a user complaint is registered is extremely advantageous, so the technique must be highly sensitive to the presence of a fault. On the other hand, triggering a repair unit to repair a fault that does not exist is economically harmful, so the technique must also exhibit high specificity in fault detection. In a recent work [12], the Adaptive ℓ1 Filter, an algorithm based on sparse estimation, was shown to outperform other techniques with respect to the aforementioned expected characteristics.
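The derivative-and-peak strategy of Fig. 1 and its failure under noise can be sketched in a few lines (a minimal illustration, not the authors' implementation; the signal length, break position, and noise level below are made up for demonstration):

```python
import numpy as np

def detect_break_by_derivative(y):
    """Locate a single trend break as the largest magnitude of the discrete derivative."""
    d = np.diff(y)
    return int(np.argmax(np.abs(d))) + 1  # index where the new level starts

rng = np.random.default_rng(0)
n, break_pos, drop = 1000, 400, -3.0
clean = -0.002 * np.arange(n)   # linear fiber attenuation trend
clean[break_pos:] += drop       # trend break (fault)

# Noise-free: the derivative peak pinpoints the fault exactly.
assert detect_break_by_derivative(clean) == break_pos

# Under strong noise the same method often points far from the true position.
noisy = clean + rng.normal(scale=2.0, size=n)
noisy_est = detect_break_by_derivative(noisy)  # frequently far from break_pos
```

The example mirrors the qualitative behavior shown in Fig. 1: the method is exact at infinite SNR and unreliable when the noise amplitude approaches the break magnitude.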
Despite being an excellent method for trend break detection under noise, the Adaptive ℓ1 Filter reveals some shortcomings with respect to an implementation in an embedded system for automated fiber measurements. It relies on a coordinate descent algorithm, with a greedy candidate selection process similar to the algorithm described in [25]. Even though the simplicity of the model used for step detection enables low-complexity estimation algorithms, the pre-processing step required for the Adaptive ℓ1 Filter leads to multiplication-heavy algorithms, increasing its complexity. In addition, as the authors of [25] point out, to ensure stability, the algorithm requires calculating complete solution paths with different levels of sparsity, which may compromise its processing time. In a different proposal [26], [27], detection of splice faults in an OTDR profile has been demonstrated with a digital signal processing approach based on a Gabor series representation and rank-1 matched subspace detection. Despite the excellent detection capability demonstrated, the computational complexity of this approach is cubic in the number of measurement samples, thus also limiting its implementation in an embedded measurement system for long fibers. In the context of sparse estimation, Linearized Bregman Iterations (LBI) offer a low-complexity and high-precision approach for solving the combined ℓ1/ℓ2 minimization problem [28]–[33].

Fig. 1. Trend break detection using a discrete derivative and peak detection method. The result is compared under three signal-to-noise ratio environments: infinite; 20 dB; and 10 dB. The upper panels show the original data and the lower panels show the results, where a peak identifies the trend break.
By adapting the LBI algorithm to take into account the characteristics of the trend break detection problem, a signal processing methodology was developed exhibiting better timing and accuracy when compared to state-of-the-art solutions, namely the Adaptive ℓ1 Filter. In this paper, the Linearized Bregman Iterations framework is presented as a candidate for an automatic trend break detection technique. Through an in-depth analysis focused on the particular case of fault detection in noisy OTDR profiles, figures of merit such as sensitivity, specificity, and processing time have been drawn in simulated and real-life environments. For the comparative analysis between the proposed algorithm and the Adaptive ℓ1 Filter, both algorithms have been implemented in the Julia language. The conclusion is that the LBI algorithm is a promising candidate not only for this particular problem, but also for the more general issue of trend break detection in noisy data. Furthermore, since the prospect of automated operation calls for reduced complexity, so that a low-cost embedded processing unit such as an FPGA can be responsible for real-time monitoring of the fiber links, this point has also been investigated and the proposed algorithm is shown to offer promising characteristics for FPGA embedding. The paper introduces the basic concepts of OTDR operation and the mathematical model of the Linearized Bregman Iterations algorithm in Section II. Section III highlights the modifications to the original Linearized Bregman Iterations algorithm necessary to adapt it to the fault finding problem, thus creating a consistent parameter-free processing algorithm. With the background developed in the previous sections, a testbench experimental result is tested with the proposed algorithm.
The results allow the development of a framework of extensive simulated profiles; the results from both real fiber measurements and simulated fiber profiles are detailed in Section IV. Finally, Section V concludes the work, summarizing the findings as well as laying the path for future developments.

II. OTDR AND LBI - BASIC CONCEPTS

Optical Time Domain Reflectometry is a single-ended reflectometry measurement technique that consists of measuring the Rayleigh backscattered portion of a light pulse traversing the fiber as a function of time. The OTDR provides the fiber profile by associating the measured data to the pulse's round-trip time, which, combined with the knowledge of the fiber's refractive index and the speed of light, can be translated into distance. The fiber profile is, thus, a descending line in logarithmic scale with a slope equal to the fiber's attenuation coefficient [2], [4], [34]. Any event which causes optical power loss is interpreted accordingly, i.e., the aforementioned effects of fiber bending, fiber breaking, imperfect fiber splices, and corrupted connectors can be identified by an abrupt level shift in the signal profile. The goal is to identify such level shifts and associate their positions and magnitudes to faults. The proposed automated measurement system that provides such results is depicted schematically in Fig. 2, where the Photon-Counting OTDR presented in [11] is used as the data acquisition subsystem. In Fig. 3, an example OTDR fiber profile acquired with this system is presented, indicating the most common effects associated to power loss and/or fiber faults. After the data acquisition step, the Linearized Bregman-based processing unit analyses the resulting OTDR profile and provides the event list to the user.

Fig. 2.
Block diagram of the proposed measurement system with its desired output: an "event list" containing positions and magnitudes of events along the optical fiber link. The data acquisition subsystem represents the Photon-Counting OTDR of [11].

Given an OTDR fiber profile such as the one in Fig. 3, eventual power losses that could possibly compromise the link's transmission capacity will be identified by trend breaks. Although some of the power losses are expected – for instance, the splitters and connectors necessary for the network architecture – some are spurious or unexpected and often arise due to the mechanical fragility of the optical fiber [4]. The power balance of the optical link, which takes into account the input optical power and the detector margin for error-free reception, may be able to cope with some of these losses but, eventually, they might render the link inoperable. The task, therefore, is to rapidly detect the presence of such breaks with the highest accuracy possible (within the probing method's limitation). This way, scheduling of in-field repair units can be accelerated, and both the managing costs of the network supervisor and any down-times experienced by the network user can be reduced.

Fig. 3. Example of an OTDR fiber profile highlighting the most common fiber fault causes. The profile was acquired with a tunable Photon-Counting OTDR setup, presented in detail in [11].

It should be noted that the spatial resolution employed by the measuring technique will limit the achievable accuracy of an algorithm that processes the output digitized data. The OTDR, or any other conceivable fiber monitoring apparatus, albeit offering an excellent tool for fiber monitoring, poses a trade-off in terms of its measurable distance and resolution: in order to investigate distant parts of the fiber, the probe pulse is enlarged – so as to carry more optical power – and the spatial resolution is thus reduced.
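The time-to-distance mapping and the pulse-width trade-off can be made concrete with the standard OTDR relations z = c·t/(2n_g) and Δz ≈ c·W/(2n_g), where n_g is the fiber's group index and W the probe pulse width (a sketch with a typical silica-fiber group index of 1.468, not a value taken from this work's measurements):

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def roundtrip_time_to_distance(t_s, group_index=1.468):
    """Map the probe pulse's round-trip time (s) to a position along the fiber (m)."""
    return C * t_s / (2.0 * group_index)

def spatial_resolution(pulse_width_s, group_index=1.468):
    """Two-point spatial resolution (m) set by the probe pulse width (s)."""
    return C * pulse_width_s / (2.0 * group_index)

# A 10 ns pulse resolves events roughly 1 m apart ...
assert round(spatial_resolution(10e-9)) == 1
# ... while a 1 us pulse, needed to reach distant fiber sections, resolves only ~100 m.
assert round(spatial_resolution(1e-6)) == 102
```

This illustrates why the algorithmic accuracy discussed next can never exceed the physical resolution of the acquisition apparatus.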
If a higher resolution is required, on the other hand, the probe pulse width is reduced at the expense of reduced reach. For that reason, a signal processing approach will not be able to locate a fault with a precision higher than the physical parameters of the measuring apparatus allow for. Accuracy for automatic fault detection should be interpreted as "as accurate as the physically available spatial resolution". The signal processing approach deals with points in a data series, and it is left to the measurement apparatus to link the digitized data points to a physical position or distance.

The concept behind the automatic identification of fault positions is to interpret the OTDR profile as a weighted sum of step functions and then find the steps and respective weights that best explain the original data series. The profile depicted in Fig. 3 clarifies this point, as, after each highlighted event, there is a sharp decrease in the measured power. Step functions are uniquely defined by their amplitude and the position beyond which they assume a non-zero value. This allows one to cast an arbitrary profile, under this interpretation, in the following form:

y(z) = αz + ∑_{i ∈ F} a_i u(z − z_i),    (1)

where y(z) is the ideal (noise-free) fiber profile, z denotes the distance inside the fiber, F is the set of fault indices, a_i are the weights (or magnitudes) of the faults, and z_i their positions. The step function is represented by the Heaviside function u(z), and the factor α stands for the linear slope of the fiber attenuation. For a measurement application, y(z) is typically sampled equidistantly and the samples are combined into a vector y. Identification of the set F of fault indices is the goal of the fault finding problem, which can be assumed to be sparse in this particular application. Fig. 2 is an example of the sparse characteristic of the problem: the total number of possible fault positions is 8000 (each entry in the dataset, or the spatial resolution of the acquisition system, corresponds to one meter), but the size of the set F is 7. Indeed, from a practical point of view, it makes sense that the number of events should be much smaller than the number of points in the data series: a high number of fault events translates into high optical losses and, as commented above, the inoperability of the optical link. A remarkable point that arises from this reasoning is that, given the spatial resolution, one has access to all possible positions in the digitized data series where a trend break might appear. Thus, a naive approach would be to check all possible combinations of step functions until the best approximation to the measured signal is found following (1). Such a procedure, however, is known to belong to the class of so-called NP-hard problems [35] and, as such, is computationally intractable. A relaxation of this extremely difficult problem is to use the ℓ1 norm instead, which allows one to employ algorithms with manageable complexity that output sparse results for a broad range of practical problems [36]. To estimate the fault positions within a given OTDR profile, a dictionary of candidate vectors is provided, where the name dictionary is used to emphasize that candidate vectors do not have to be linearly independent [37]. There can even be, as is the case in this work, more candidate vectors than measurement values, in which case the dictionary is often referred to as an overcomplete dictionary [37]. The ℓ1 minimization problem where the dictionary of candidates is provided is known as the basis pursuit problem which, in its matrix form, can be represented as:

min_β ‖β‖₁  s.t.  Aβ = y,    (2)

where A is the dictionary, with each column of A being a dictionary or candidate vector.
β contains the coefficients of the dictionary vectors, which represent the amount of contribution, or weight, of a candidate to the fiber profile data series y. Based on (1), (sampled) step functions shifted to all possible sample points are included as candidate vectors in matrix A. In addition, the estimate of the attenuation coefficient is taken into account by also including a single linear candidate with unitary slope in the dictionary (A is described in more detail in Sect. III). Ultimately, estimation of the attenuation as well as of the level shifts amounts to the estimation of the coefficient vector β, where the non-zero coefficients identify the step locations and, thus, the positions of the faults in the profile.

Linearized Bregman Iterations (LBI) form a class of implementation-efficient algorithms that solve the combined ℓ1/ℓ2 problem of the form:

min_β λ‖β‖₁ + (1/(2α))‖β‖₂²  s.t.  Aβ = y.    (3)

As shown, e.g., in [28], [38], [39], Linearized Bregman-based algorithms also give meaningful results in the presence of noise, i.e., even in cases when Aβ = y does not have a solution. In such scenarios, it can be shown that the squared error norm of an estimate β^(k) at iteration k (details on the algorithm are given below) is reduced by the following iterations as long as ‖Aβ^(k) − y‖₂ is higher than the noise level [28], [38]. An interesting observation is that adding the ℓ2 norm to the cost function above leads to very low complexity solution algorithms [28], [32] that can be implemented very efficiently in digital hardware [40]. Furthermore, it can be noted that, with the addition of the ℓ2 norm term, a correct balancing of λ and α allows for the solution of the basis pursuit problem (2) within the framework of the LBI (3) in an efficient manner. For simplicity, we set α = 1, leaving only the parameter λ to adjust the weight of the ℓ1 versus the ℓ2 norm.
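A small dense version of such a dictionary can be assembled directly, and a sparse coefficient vector then reproduces a faulty profile following (1) (an illustrative construction only; the fault position and magnitudes are made up, and Sect. III later exploits the structure of A to avoid storing it at all):

```python
import numpy as np

def build_dictionary(n):
    """Candidate matrix: one unitary-slope column plus n shifted step columns."""
    slope = np.arange(1.0, n + 1.0)  # single linear candidate for the attenuation
    steps = np.tri(n)                # column j: step active from sample j onward
    return np.column_stack([slope, steps])

n = 12
A = build_dictionary(n)
beta = np.zeros(n + 1)
beta[0] = -0.1      # attenuation slope alpha
beta[1 + 6] = -2.0  # fault of magnitude -2 starting at sample 6
y = A @ beta

# The profile drops by the attenuation step plus the fault magnitude at sample 6.
assert np.isclose(y[6] - y[5], -0.1 - 2.0)
```

With more step candidates than measurement samples excluded here for brevity, the same construction extends to the overcomplete case mentioned above.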
Despite their low complexity, Linearized Bregman Iterations still require matrix-vector multiplications. A further simplification, however, called Sparse Kaczmarz, only requires vector operations while still maintaining convergence [41]. The basic algorithm of Sparse Kaczmarz is shown in Algorithm 1, where the Kaczmarz simplification has been used: instead of the full matrix A, only a single row of A is used in a single iteration of the algorithm. Although this algorithm represents an adaptation of the original Linearized Bregman Iterations algorithm described in [28], in the following, references to the LBI algorithm should be interpreted as referring to the Kaczmarz variant. This will also prevent confusion between LBI-based Sparse Kaczmarz algorithms and other algorithms also called Sparse Kaczmarz, e.g., as described in [42].

Algorithm 1 Linearized Bregman-based Sparse Kaczmarz
1: β^(0) ← 0
2: v^(0) ← 0
3: for k = 1..N do
4:   for j = 1..p do
5:     β_j^(k) ← shrink(v_j^(k), λ)
6:   end for
7:   i ← ((k − 1) mod p) + 1    ▷ cyclic re-use of rows of A
8:   v^(k+1) ← v^(k) + (1/‖a_i‖₂²) a_i (y_i − a_iᵀ β^(k))
9: end for

The LBI-based Sparse Kaczmarz algorithm can be seen as a steepest descent approach using an approximate gradient derived from a single row of A and a single measurement value (line 8), combined with a sparsifying operation (line 5), while the term 1/‖a_i‖₂² can be seen as the step width of the descent. The sparsifying operation is due to the definition of the shrink function:

shrink(v, λ) = max(|v| − λ, 0) · sign(v),

which sets values of v with a magnitude smaller than λ to zero. In general, a sensible relation when using the Linearized Bregman Iterations is, as commented above, the balance between the ℓ1 and ℓ2 norm translated by λ.
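Algorithm 1 translates almost line by line into code (a sketch for a generic dense A, with the row index simply cycling over the rows; the test system, λ, and iteration count are illustrative, not values from the paper):

```python
import numpy as np

def shrink(v, lam):
    """Soft-threshold: zero out entries with magnitude below lam."""
    return np.maximum(np.abs(v) - lam, 0.0) * np.sign(v)

def sparse_kaczmarz(A, y, lam, n_iter):
    """LBI-based Sparse Kaczmarz: one row of A per iteration (cf. Algorithm 1)."""
    n, p = A.shape
    v = np.zeros(p)
    beta = np.zeros(p)
    for k in range(n_iter):
        beta = shrink(v, lam)                    # sparsifying step (lines 4-6)
        i = k % n                                # cyclic re-use of rows (line 7)
        a = A[i]
        v = v + a * (y[i] - a @ beta) / (a @ a)  # approximate gradient step (line 8)
    return shrink(v, lam)

# Tiny consistent example: recover a single step of height -2 at sample 3.
n = 8
A = np.tri(n)  # step-only dictionary, no attenuation column here
beta_true = np.zeros(n)
beta_true[3] = -2.0
y = A @ beta_true
est = sparse_kaczmarz(A, y, lam=0.1, n_iter=20000)
assert int(np.argmax(np.abs(est))) == 3
```

Each iteration touches only one row, so the per-iteration cost is a handful of vector operations, which is the property exploited for the hardware-friendly variant in Sect. III.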
When adapting the Sparse Kaczmarz algorithm to the OTDR fault detection problem, this parameter turned out to have a much lower sensitivity than is observed for general ℓ1/ℓ2 estimation problems, as will be discussed in the following Section.

III. ANALYSING OTDR FIBER PROFILES WITH THE LINEARIZED BREGMAN ITERATIONS

The first, most crucial aspect of adapting the Bregman methodology to any basis pursuit problem is the definition of the matrix A, the dictionary of candidates that will compose the signal of interest as a linear combination. In the specific case of OTDR profiles, the candidates are step functions, which correspond to discontinuous descents of the profile magnitude. As described in Sect. II, due to the intrinsic attenuation of light propagation through optical fibers, the matrix A must also account for a negative slope. The final form of A is:

    ⎡  1     1 0 0 ⋯ 0 0 ⎤
    ⎢  2     1 1 0 ⋯ 0 0 ⎥
A = ⎢  3     1 1 1 ⋯ 0 0 ⎥ ,    (4)
    ⎢  ⋮     ⋮ ⋮ ⋮ ⋱ ⋮ ⋮ ⎥
    ⎢ p−2    1 1 1 ⋯ 1 0 ⎥
    ⎣ p−1    1 1 1 ⋯ 1 1 ⎦

with n = p − 1 rows. The structure of A allows for a significant simplification that can be incorporated into the Sparse Kaczmarz algorithm, preventing the need to store and load this matrix, as follows. When analysing a row a_iᵀ of A, one can see that this row consists of the number i as its first element, followed by i one-elements and, finally, by p − i − 1 zero-elements. This allows one to formulate the inner product a_iᵀβ efficiently as

a_iᵀ β = iβ_1 + ∑_{s=2}^{i+1} β_s,    (5)

as well as the calculation of the squared norm ‖a_i‖₂² as

‖a_i‖₂² = i² + i.    (6)

Considering that this norm is used as step width in the LBI algorithm (1/‖a_i‖₂² = 1/(i² + i)), the squared part i², originating from the first column, contributes most significantly to the norm.
This means that the contribution of the first column scales down the step width, limiting the algorithm's convergence speed. For this reason, a scaling factor σ for the first column of A is used in

a_iᵀ β = σiβ_1 + ∑_{s=2}^{i+1} β_s,    (7)

as well as in

‖a_i‖₂² = σ²i² + i,    (8)

leading to larger step widths and faster convergence. One point to consider is that, with this scaling factor, the first element of the output of the algorithm is also scaled by σ. Since the algorithm counteracts this scaling and implicitly multiplies by 1/σ during the estimation process, the first element has to be multiplied by σ at the end of the algorithm for compensation. Considering a future hardware implementation, the value of σ was selected as a power of two, which allows for replacing the multiplication by a shift operation. Furthermore, to obtain large values of the step width 1/‖a_i‖₂², the value of σ should be chosen as small as possible. However, a compromise between large step widths and the required numerical dynamic range, especially when considering a fixed-point implementation, has to be found. For this reason, σ = 2⁻¹⁰ = 1/1024 was selected, allowing for fast convergence while still keeping the required dynamic range manageable in a future hardware implementation. These considerations allowed for the development of the following Algorithm 2. In order to clarify the steps of the algorithm, a flowchart accompanies the pseudo-code's main steps. This flowchart shows the simple iterative nature of the algorithm, consisting mainly of a successive application of the described Sparse Kaczmarz iteration: an approximate gradient step followed by the evaluation of the shrink function. The structure displayed in Algorithm 2 fosters a low-complexity implementation (complexities of Algorithm 2 as well as of the Adaptive ℓ1 Filter [12] are discussed and compared in Sect. IV-C).
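Equations (5)–(8) let the Kaczmarz update run without ever storing A; a direct check against explicitly built rows confirms the matrix-free formulas (a sketch with a 1-based row index i as in the text, and σ scaling applied to the first column; the vector values are arbitrary):

```python
import numpy as np

def row_dot(i, beta, sigma=1.0):
    """a_i^T beta = sigma*i*beta_1 + sum over s = 2..i+1 of beta_s, cf. (5)/(7)."""
    return sigma * i * beta[0] + np.sum(beta[1:i + 1])

def row_norm_sq(i, sigma=1.0):
    """||a_i||_2^2 = sigma^2 * i^2 + i, cf. (6)/(8)."""
    return sigma * sigma * i * i + i

p = 10
beta = np.arange(1.0, p + 1.0)
sigma = 1.0 / 1024.0  # 2**-10, as selected above
for i in range(1, p):
    # Explicit row of the scaled A: [sigma*i, then i ones, then p-i-1 zeros].
    a = np.concatenate(([sigma * i], np.ones(i), np.zeros(p - i - 1)))
    assert np.isclose(row_dot(i, beta, sigma), a @ beta)
    assert np.isclose(row_norm_sq(i, sigma), a @ a)

# The scaling restores usefully large step widths deep into the profile:
i = 5000
assert 1.0 / row_norm_sq(i, sigma) > 1000.0 / row_norm_sq(i, 1.0)
```

Only a running sum of β and two scalars per row are needed, which is precisely what makes the algorithm attractive for an FPGA implementation.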