A Survey on Federated Learning Systems: Vision, Hype and Reality for Data Privacy and Protection

Qinbin Li¹, Zeyi Wen², Zhaomin Wu¹, Sixu Hu¹, Naibo Wang¹, Yuan Li¹, Xu Liu¹, Bingsheng He¹
¹National University of Singapore  ²The University of Western Australia
¹{qinbin, zhaomin, sixuhu, naibowang, liyuan, liuxu, hebs}@comp.nus.edu.sg  ²zeyi.wen@uwa.edu.au

Abstract

Federated learning has been a hot research topic in enabling the collaborative training of machine learning models among different organizations under privacy restrictions. As researchers try to support more machine learning models with different privacy-preserving approaches, there is a need to develop systems and infrastructures that ease the development of various federated learning algorithms. Similar to deep learning systems such as PyTorch and TensorFlow that boost the development of deep learning, federated learning systems (FLSs) are equally important, and face challenges from various aspects such as effectiveness, efficiency, and privacy. In this survey, we conduct a comprehensive review of federated learning systems. To achieve smooth flow and guide future research, we introduce the definition of federated learning systems and analyze the system components. Moreover, we provide a thorough categorization of federated learning systems according to six different aspects, including data distribution, machine learning model, privacy mechanism, communication architecture, scale of federation, and motivation of federation. The categorization can help the design of federated learning systems, as shown in our case studies. By systematically summarizing the existing federated learning systems, we present the design factors, case studies, and future research opportunities.
1 Introduction

Many machine learning algorithms are data-hungry, and in reality, data are dispersed over different organizations under the protection of privacy restrictions. Due to these factors, federated learning (FL) [129, 207, 85] has become a hot research topic in machine learning. For example, data of different hospitals are isolated and become "data islands". Since each data island has limitations in size and in approximating the real distribution, a single hospital may not be able to train a high-quality model with good predictive accuracy for a specific task. Ideally, hospitals can benefit more if they collaboratively train a machine learning model on the union of their data. However, the data cannot simply be shared among the hospitals due to various policies and regulations. Such "data island" phenomena are commonly seen in many areas such as finance, government, and supply chains. Policies such as the General Data Protection Regulation (GDPR) [10] stipulate rules on data sharing among different organizations. Thus, it is challenging to develop a federated learning system that has good predictive accuracy while obeying policies and regulations to protect privacy.

Many efforts have recently been devoted to implementing federated learning algorithms that support effective machine learning models. Specifically, researchers try to support more machine learning models with different privacy-preserving approaches, including deep neural networks (NNs) [119, 213, 24, 158, 129], gradient boosted decision trees (GBDTs) [217, 38, 104], logistic regression [141, 36], and support vector machines (SVMs) [169]. For instance, Nikolaenko et al. [141] and Chen et al. [36] propose approaches to conduct FL based on linear regression. Since GBDTs have become very successful in recent years [34, 200], the corresponding federated learning systems (FLSs) have also been proposed by Zhao et al.
[217], Cheng et al. [38], and Li et al. [104]. Moreover, there are many FLSs supporting the training of NNs. Google proposes a scalable production system which enables tens of millions of devices to train a deep neural network [24].

As there are common methods and building blocks (e.g., privacy mechanisms such as differential privacy) for building FL algorithms, it makes sense to develop systems and infrastructures to ease the development of various FL algorithms. Systems and infrastructures allow algorithm developers to reuse the common building blocks and avoid building algorithms from scratch every time. Similar to deep learning systems such as PyTorch [148, 149] and TensorFlow [7] that boost the development of deep learning algorithms, FLSs are equally important for the success of FL. However, building a successful FLS is challenging; it needs to consider multiple aspects such as effectiveness, efficiency, privacy, and autonomy.

In this paper, we survey the existing FLSs from a system view. First, we show the definition of FLSs and compare it with conventional federated systems. Second, we analyze the system components of FLSs, including the parties, the manager, and the computation-communication framework. Third, we categorize FLSs based on six different aspects: data distribution, machine learning model, privacy mechanism, communication architecture, scale of federation, and motivation of federation. These aspects can direct the design of an FLS as common building blocks and system abstractions. Fourth, based on these aspects, we systematically summarize the existing studies, which can be used to direct the design of FLSs. Last, to make FL more practical and powerful, we present future research directions to work on. We believe that systems and infrastructures are essential for the success of FL.
More work has to be carried out to address the system research issues in effectiveness, efficiency, privacy, and autonomy.

1.1 Related Surveys

There have been several surveys on FL. A seminal survey written by Yang et al. [207] introduces the basics and concepts in FL, and further proposes a comprehensive secure FL framework. The paper mainly targets a relatively small number of parties which are typically enterprise data owners. Li et al. [109] summarize challenges and future directions of FL in massive networks of mobile and edge devices. Recently, Kairouz et al. [85] have given a comprehensive description of the characteristics and challenges of FL from different research topics. However, they mainly focus on cross-device FL, where the participants are a very large number of mobile or IoT devices. More recently, another survey [11] summarizes the platforms, protocols, and applications of federated learning. Some surveys focus on only one aspect of federated learning. For example, Lim et al. [113] conduct a survey of FL specific to mobile edge computing, while [125] focuses on the threats to federated learning.

1.2 Our Contribution

To the best of our knowledge, there lacks a survey reviewing the existing systems and infrastructure of FLSs and drawing attention to creating systems for FL (similar to the prosperous system research in deep learning). In comparison with the previous surveys, the main contributions of this paper are as follows. (1) Our survey is the first one to provide a comprehensive analysis of FL from a system's point of view, including system components, taxonomy, summary, design, and vision. (2) We provide a comprehensive taxonomy of FLSs on six different aspects, including data distribution, machine learning model, privacy mechanism, communication architecture, scale of federation, and motivation of federation, which can be used as common building blocks and system abstractions of FLSs.
(3) We summarize existing typical and state-of-the-art studies according to their domains, which is convenient for researchers and developers to refer to. (4) We present the design factors for a successful FLS and comprehensively review solutions for each scenario. (5) We propose interesting research directions and challenges for future generations of FLSs.

The rest of the paper is organized as follows. In Section 2, we introduce the concept and the system components of FLSs. In Section 3, we propose six aspects to classify FLSs. In Section 4, we summarize existing studies and systems on FL. We then present the design factors and solutions for an FLS in Section 5. Last, we propose possible future directions on FL in Section 7 and conclude our paper in Section 8.

2 An Overview of Federated Learning Systems

2.1 Background

As data breaches become a major concern, more and more governments establish regulations to protect users' data, such as GDPR in the European Union [185], PDPA in Singapore [39], and CCPA [1] in the US. The cost of breaching these policies is quite high for companies. In a breach of 600,000 drivers' personal information in 2016, Uber had to pay $148 million to settle the investigation [3]. SingHealth was fined $750,000 by the Singapore government for a breach of PDPA [5]. Google was fined $57 million for a breach of GDPR [4], which is the largest penalty as of March 18, 2020 under the European Union privacy law. Under the above circumstances, federated learning, a form of collaborative learning without exchanging users' original data, has drawn increasing attention nowadays. While machine learning, especially deep learning, has again attracted much attention recently, the combination of federation and machine learning is emerging as a new and hot research topic.

2.2 Definition

FL enables multiple parties to jointly train a machine learning model without exchanging the local data.
It covers techniques from multiple research areas such as distributed systems, machine learning, and privacy. Inspired by the definitions of FL given by other studies [85, 207], here we give a definition of FLSs. In a federated learning system, multiple parties collaboratively train machine learning models without exchanging their raw data. The output of the system is a machine learning model for each party (which can be the same or different). A practical federated learning system has the following constraint: given an evaluation metric such as test accuracy, the performance of the model learned by federated learning should be better than that of the model learned by local training with the same model architecture.

2.3 Comparison with Conventional Federated Systems

The concept of federation can be found with its counterparts in the real world such as business and sports. The main characteristic of federation is cooperation. Federation not only commonly appears in society, but also plays an important role in computing. In computer science, federated computing systems have been an attractive area of research under different contexts. Around 1990, there were many studies on federated database systems (FDBSs) [166]. An FDBS is a collection of autonomous databases cooperating for mutual benefit. As pointed out in a previous study [166], three important components of an FDBS are autonomy, heterogeneity, and distribution.

• Autonomy. A database system (DBS) that participates in an FDBS is autonomous, which means it is under separate and independent control. The parties can still manage their data without the FDBS.

• Heterogeneity. The database management systems can differ inside an FDBS. For example, the differences can lie in the data structures, query languages, system software requirements, and communication capabilities.

• Distribution.
Due to the existence of multiple DBSs before an FDBS is built, the data distribution may differ among DBSs. A data record can be horizontally or vertically partitioned across different DBSs, and can also be duplicated in multiple DBSs to increase reliability.

More recently, with the development of cloud computing, many studies have been done on federated cloud computing [97]. A federated cloud (FC) is the deployment and management of multiple external and internal cloud computing services. The concept of cloud federation enables further reduction of costs due to partial outsourcing to more cost-efficient regions. Resource migration and resource redundancy are two basic features of federated clouds [97]. First, resources may be transferred from one cloud provider to another; migration enables the relocation of resources. Second, redundancy allows concurrent usage of similar service features in different domains. For example, the data can be partitioned and processed at different providers following the same computation logic. Overall, the scheduling of different resources is a key factor in the design of a federated cloud system.

There are some similarities and differences between FLSs and conventional federated systems. First, the concept of federation still applies. The common and basic idea is the cooperation of multiple independent parties. Therefore, the perspective of considering heterogeneity and autonomy among the parties can still be applied to FLSs. Second, some factors in the design of distributed systems are still important for FLSs. For example, how the data are shared between the parties can influence the efficiency of the systems. As for the differences, these federated systems have different emphases on collaboration and constraints. While FDBSs focus on the management of distributed data and FCs focus on the scheduling of resources, FLSs care more about secure computation among multiple parties.
FLSs induce new challenges such as the algorithm design of distributed training and data protection under privacy restrictions.

Figure 1 shows the number of papers in each year for these three research areas. Here we count the papers by searching the keywords "federated database", "federated cloud", and "federated learning" in Google Scholar (https://scholar.google.com/). Although the federated database was proposed 30 years ago, there are still about 400 papers that mention it in recent years. The popularity of federated cloud grew more quickly than federated database at the beginning, while it appears to decrease in recent years, probably because cloud computing has become more mature and the incentives of federation diminish. For FL, the number of related papers is increasing rapidly and reached about 4,400 last year. Nowadays, the "data island" phenomena are common and have increasingly become an important issue in machine learning. Also, there is an increasing privacy concern and social awareness from the general public. Thus, we expect the popularity of FL to keep increasing for at least five years until there may be mature FLSs.

2.4 System Components

There are three major components in an FLS: parties (e.g., clients), the manager (e.g., server), and the communication-computation framework to train the machine learning model.

2.4.1 Parties

In FLSs, the parties are the data owners and the beneficiaries of FL. They can be organizations or mobile devices, named the cross-silo or cross-device settings [85], respectively. We consider the following properties of the parties that affect the design of FLSs.

First, what is the hardware capacity of the parties? The hardware capacity includes the computation power and storage. If the parties are mobile phones, the capacity is weak and the parties cannot perform much computation or train a large model. For example, Wang et al. [192] consider a resource-constrained setting in FL.
They design an objective to include the resource budget and propose an algorithm to determine the rounds of local updates.

Second, what are the scale and stability of the parties? For organizations, the scale is relatively small compared with that of mobile devices. Also, the stability of the cross-silo setting is better than that of the cross-device setting. Thus, in the cross-silo setting, we can expect that every party can continuously conduct computation and communication tasks in the entire federated process, which is a common setting in many studies [104, 38, 169]. If the parties are mobile devices, the system has to handle possible issues such as connection loss [24].

Figure 1: The number of related papers on "federated database", "federated cloud", and "federated learning" (papers per year, 1990-2020).

Moreover, since the number of devices can be very large (e.g., millions), it is impractical to assume that all the devices participate in every round of FL. The widely used setting is to choose a fraction of devices to perform computation in each round [129, 24].

Last, what are the data distributions among the parties? Usually, whether in the cross-device or cross-silo setting, the non-IID (not identically and independently distributed) data distribution is considered a practical and challenging setting in federated learning [85], and is evaluated in the experiments of recent work [104, 213, 111, 189]. Such non-IID data distributions may be even more pronounced among organizations. For example, a bank and an insurance company can conduct FL to improve their predictions (e.g., whether a person can repay a loan and whether the person will buy insurance products), while even the features can vary a lot between these organizations.
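In experiments, such non-IID settings are commonly simulated by skewing the label distribution across parties, e.g., with a Dirichlet split. The helper below is our own illustrative sketch (not taken from any of the cited systems); smaller values of `alpha` give more skewed, and thus more non-IID, partitions.

```python
import numpy as np

def dirichlet_partition(labels, n_parties, alpha, seed=0):
    """Split sample indices among parties with Dirichlet-distributed
    label proportions; smaller alpha -> more non-IID partitions."""
    rng = np.random.default_rng(seed)
    parties = [[] for _ in range(n_parties)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Fraction of class c that each party receives.
        props = rng.dirichlet(alpha * np.ones(n_parties))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for p, chunk in enumerate(np.split(idx, cuts)):
            parties[p].extend(chunk.tolist())
    return parties

labels = np.repeat([0, 1, 2], 100)  # toy pool: 300 samples, 3 classes
parts = dirichlet_partition(labels, n_parties=5, alpha=0.1)
```

With `alpha=0.1`, most parties end up holding samples from only one or two classes, mimicking the label skew described above.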
Techniques in transfer learning [147], meta-learning [55], and multi-task learning [157] may be useful to combine the knowledge of various kinds of parties.

2.4.2 Manager

In the cross-device setting, the manager is usually a powerful central server. It conducts the training of the global machine learning model and manages the communication between the parties and the server. The stability and reliability of the server are quite important. Once the server fails to provide accurate computation results, the FLS may produce a bad model. To address these potential issues, blockchain [176] may be a possible technique to offer a decentralized solution that increases the system reliability. For example, Kim et al. [93] leverage a blockchain in lieu of the central server in their system, where the blockchain enables exchanging the devices' updates and providing rewards to them.

In the cross-silo setting, since the organizations are expected to have powerful machines, the manager can also be one of the organizations that dominates the FL process. This is particularly used in vertical FL [207], which we will introduce in Section 3.1 in detail. In a vertical FL setting by Liu et al. [119], the features of the data are vertically partitioned across the parties and only one party has the labels. The party that owns the labels is naturally considered as the FL manager.

One challenge is that it can be hard to find a trusted server or party as the manager, especially in the cross-silo setting. Then, a fully decentralized setting can be a good choice, where the parties communicate with each other directly and contribute almost equally to the global machine learning model training. These parties jointly set an FL task and deploy the FLS. Li et al.
[104] propose a federated gradient boosting decision trees framework, where each party trains decision trees sequentially and the final model is the combination of all trees. It is challenging to design a fully decentralized FLS with reasonable communication overhead.

Figure 2: Federated learning frameworks. (a) FedAvg: (1) send the global model to the selected parties; (2) update the model with local data; (3) send the local models to the server; (4) update the global model. (b) SimFL: (1) update the local gradients; (2) send the gradients to the selected party; (3) update the model with local data and all gradients; (4) send the model to the other parties.

2.4.3 Communication-Computation Framework

In FLSs, the computation happens on the parties and the manager, while the communication happens between the parties and the manager. Usually, the aim of the computation is model training and the aim of the communication is exchanging the model parameters.

A basic and widely used framework is Federated Averaging (FedAvg) [129], proposed in 2016, as shown in Figure 2a. In each iteration, the server first sends the current global model to the selected parties. Then, the selected parties update the global model with their local data. Next, the updated models are sent back to the server. Last, the server averages all the received local models to get a new global model. FedAvg repeats the above process until reaching the specified number of iterations. The global model of the server is the final output.

While FedAvg is a centralized FL framework, SimFL, proposed by Li et al. [104], represents a decentralized FL framework. In SimFL, no trusted server is needed. In each iteration, the parties first update the gradients of their local data. Then, the gradients are sent to a selected party.
Next, the selected party uses its local data and the gradients to update the model. Last, the model is sent to all the other parties. To ensure fairness and utilize the data from different parties, every party is selected to update the model for about the same number of rounds. SimFL repeats this for a specified number of iterations and outputs the final model.

3 Taxonomy

Considering the common system abstractions and building blocks for different FLSs, we classify FLSs by six aspects: data partitioning, machine learning model, privacy mechanism, communication architecture, scale of federation, and motivation of federation. These aspects include common factors (e.g., data partitioning, communication architecture) in previous federated systems [166, 97] and considerations unique to FLSs (e.g., machine learning model and privacy mechanism). Furthermore, these aspects can be used to guide the design of FLSs. Figure 3 shows a summary of the taxonomy of FLSs.

Figure 3: Taxonomy of federated learning systems. Data partitioning: horizontal, vertical, hybrid. Machine learning model: linear models, decision trees, neural networks, etc. Privacy mechanism: differential privacy, cryptographic methods, etc. Communication architecture: centralized, decentralized. Scale of federation: cross-silo, cross-device. Motivation of federation: incentive, regulation.

In Table 1 of [85], the authors consider different characteristics to distinguish distributed learning, cross-device federated learning, and cross-silo federated learning, including setting, data distribution, communication, etc. Our taxonomy is used to distinguish different federated learning systems from a deployment view, and aspects like machine learning models and motivation of federation are not considered in [85].
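The FedAvg loop described in Section 2.4.3 (send the global model, update locally, average the results) can be sketched in a few lines. This is a minimal illustration; `fedavg` and `local_update` are our own placeholder names for real training code, not the production implementation of [129].

```python
import numpy as np

def fedavg(global_w, party_data, local_update, rounds=10, frac=1.0, seed=0):
    """Minimal FedAvg sketch: each round, selected parties start from the
    global weights, train locally, and the server averages the returned
    models, weighting each party by its local sample count."""
    rng = np.random.default_rng(seed)
    for _ in range(rounds):
        k = max(1, int(frac * len(party_data)))
        chosen = rng.choice(len(party_data), size=k, replace=False)
        models = [local_update(global_w.copy(), party_data[i]) for i in chosen]
        sizes = np.array([len(party_data[i]) for i in chosen], dtype=float)
        # Weighted averaging of the received local models.
        global_w = sum(w * s for w, s in zip(models, sizes / sizes.sum()))
    return global_w

# Toy run: each "local update" nudges the weight toward the local data mean.
data = [np.array([1.0, 1.0]), np.array([3.0, 3.0, 3.0])]
step = lambda w, d: w + 0.5 * (d.mean() - w)   # hypothetical local trainer
final = fedavg(np.array([0.0]), data, step, rounds=5)
```

In the toy run, the global weight converges toward the sample-size-weighted mean of the two parties' data, which is exactly the behavior the averaging step is designed to produce.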
3.1 Data Partitioning

Based on how data are distributed over the sample and feature spaces, FLSs can typically be categorized into horizontal, vertical, and hybrid FLSs [207].

In horizontal FL, the datasets of different parties have the same feature space but little intersection in the sample space. This is a natural data partitioning especially for the cross-device setting, where different users try to improve their model performance on the same task using FL. Also, the majority of FL studies adopt horizontal partitioning. Since the local data are in the same feature space, the parties can train local models on their local data with the same model architecture. The global model can simply be updated by averaging all the local models. A basic and popular framework for horizontal federated learning is FedAvg, as shown in Figure 2a. Wake-word recognition [98], such as 'Hey Siri' and 'OK Google', is a typical application of horizontal partitioning because each user speaks the same sentence with a different voice.

In vertical FL, the datasets of different parties have the same or a similar sample space but differ in the feature space. A vertical FLS usually adopts entity alignment techniques [206, 41] to collect the overlapping samples of the parties. Then the overlapping data are used to train the machine learning model using encryption methods. Cheng et al. [38] propose a lossless vertical FLS to enable parties to collaboratively train gradient boosting decision trees. They use privacy-preserving entity alignment to find common users between two parties, whose gradients are used to jointly train the decision trees. Cooperation among different companies can usually be treated as a situation of vertical partitioning.

In many other applications, while existing FLSs mostly focus on one kind of partitioning, the partitioning of data among the parties may be a hybrid of horizontal and vertical partitioning.
Let us take a cancer diagnosis system as an example. A group of hospitals wants to build an FLS for cancer diagnosis, but each hospital has different patients as well as different kinds of medical examination results. Transfer learning [147] is a possible solution for such scenarios. Liu et al. [119] propose a secure federated transfer learning system which can learn a representation among the features of parties using common instances.

3.2 Machine Learning Models

Since FL is used to solve machine learning problems, the parties usually want to train a state-of-the-art machine learning model on a specified task. There have been many efforts in developing new models or adapting current models to the federated setting. Here, we consider the widely used models nowadays.

The most popular machine learning model now is the neural network (NN), which achieves state-of-the-art results in many tasks such as image classification and word prediction [96, 175]. There are many federated learning studies based on stochastic gradient descent [129, 189, 24], which can be used to train NNs.

Another widely used model is the decision tree, which is highly efficient to train and easy to interpret compared with NNs. A tree-based FLS is designed for the federated training of single or multiple decision trees (e.g., gradient boosting decision trees (GBDTs) and random forests). GBDTs have been especially popular recently and have very good performance in many classification and regression tasks. Li et al. [104] and Cheng et al. [38] propose FLSs for GBDTs on horizontally and vertically partitioned data, respectively.

Besides NNs and trees, linear models (e.g., linear regression, logistic regression, SVM) are classic and easy-to-use models. There are some well-developed systems for linear regression and logistic regression [141, 72]. These linear models are easy to learn compared with other complex models (e.g., NNs).
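The horizontal/vertical distinction of Section 3.1, and the role of entity alignment in vertical FL, can be made concrete on toy data. The plaintext set intersection below stands in for the privacy-preserving alignment protocols used in real systems [206, 41]; all identifiers and values are hypothetical.

```python
import numpy as np

# Each party holds (user_id -> feature vector); their feature spaces differ.
party_a = {101: [0.2, 1.5], 102: [0.7, 0.3], 104: [1.1, 0.9]}
party_b = {102: [5.0], 104: [3.2], 105: [8.1]}

# Horizontal FL would instead assume the SAME feature space and (mostly)
# disjoint users, with each party training on its own rows.

# Vertical FL: align the overlapping users, then join their feature columns.
shared = sorted(party_a.keys() & party_b.keys())              # [102, 104]
joined = np.array([party_a[u] + party_b[u] for u in shared])  # shape (2, 3)
```

Only the aligned rows (`joined`) take part in vertical training; in a real system the intersection itself must be computed without revealing non-overlapping users to the other party.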
While a single machine learning model may be weak, ensemble methods [150] such as stacking and voting can be applied in the federated setting. Each party trains a local model and sends it to the server, which aggregates all the models as an ensemble. The ensemble can be used directly for prediction by max voting or be used to train a meta-model by stacking. A benefit of federated ensemble learning is that each party can train heterogeneous models, as there is no averaging of model parameters. As shown in previous studies [213, 103], federated ensemble learning can also achieve a good accuracy in a single communication round.

Currently, many FL frameworks [129, 94, 192, 108] are based on stochastic gradient descent, which is a typical optimization algorithm for many models including neural networks and logistic regression. However, to increase the effectiveness of FL, we may have to exploit the model architecture [189]. Since the research on FL is still at an early stage, there is still a gap for FLSs to better support the state-of-the-art models.

3.3 Privacy Mechanisms

Although the local data are not exposed in FL, the exchanged model parameters may still leak sensitive information about the data. There have been many attacks against machine learning models [56, 167, 137, 131], such as the model inversion attack [56] and the membership inference attack [167], which can potentially infer the raw data by accessing the model. Moreover, there are many privacy mechanisms such as differential privacy [48] and k-anonymity [50], which provide different privacy guarantees. The characteristics of existing privacy mechanisms are summarized in the survey [186]. Here we introduce two major approaches that are adopted in current FLSs for data protection: cryptographic methods and differential privacy.
Cryptographic methods such as homomorphic encryption [15, 72, 28, 156, 116] and secure multi-party computation (SMC) [165, 32, 22, 23] are widely used in privacy-preserving machine learning algorithms. Basically, the parties have to encrypt their messages before sending them, operate on the encrypted messages, and decrypt the encrypted output to get the final result. Applying the above methods, the user privacy of FLSs can usually be well protected [89, 211, 91, 144]. For example, SMC [63] guarantees that the parties cannot learn anything except the output, which can be used to securely aggregate the transferred gradients. However, SMC does not provide privacy guarantees for the final model, which is still vulnerable to inference attacks and model inversion attacks [167, 56]. Also, due to the additional encryption and decryption operations, such systems suffer from extremely high computation overhead.

Differential privacy [48, 49] guarantees that a single record does not influence the output of a function much. Many studies adopt differential privacy [31, 8, 105, 180, 220, 112] for data privacy protection, ensuring that the parties cannot know whether an individual record participates in the learning or not. By injecting random noise into the data or model parameters [8, 105, 170, 202], differential privacy provides statistical privacy guarantees for individual records and protection against inference attacks on the model. Due to the noise in the learning process, such systems tend to produce less accurate models. Note that the above methods are independent of each other, and an FLS can adopt multiple methods to enhance the privacy guarantees [64, 205, 86]. There are also other approaches to protect user privacy.
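A common recipe for the noise injection described above, in the style of DP-SGD training, is to first clip each update to bound any single record's influence and then add Gaussian noise scaled to that bound. The sketch below is illustrative only; the clip norm and noise multiplier are arbitrary example values, not settings from the cited systems.

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_mult=1.1, seed=None):
    """Clip an update's L2 norm, then add Gaussian noise proportional to
    the clipping bound (the Gaussian mechanism for differential privacy)."""
    rng = np.random.default_rng(seed)
    norm = np.linalg.norm(update)
    # Scale the update down so its L2 norm is at most clip_norm.
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(0.0, noise_mult * clip_norm, size=update.shape)
    return clipped + noise
```

The noise multiplier trades off privacy against model quality, matching the observation above that differentially private systems tend to produce less accurate models.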
An interesting hardware-based approach is to use a trusted execution environment (TEE) such as Intel SGX processors [159, 145], which can guarantee that the code and data loaded inside are protected. Such an environment can be used inside the central server to increase its credibility.

Related to the privacy level, the threat models also vary in FLSs [125]. The attacks can come from any stage of the FL process, including the inputs, the learning process, and the learnt model.

• Inputs. Malicious parties can conduct data poisoning attacks [35, 99, 12] on FL. For example, the parties can modify the labels of training samples of a specific class, so that the final model performs badly on this class.

• Learning process. During the learning process, the parties can perform model poisoning attacks [16, 203] to upload specially designed model parameters. Like data poisoning attacks, the global model can have a very low accuracy due to the poisoned local updates. Besides model poisoning attacks, the Byzantine fault [27, 21, 173] is also a common issue in distributed learning, where the parties may behave arbitrarily badly and upload random updates.

• The learnt model. If the learnt model is published, inference attacks [56, 167, 131, 137] can be conducted on it. The server can infer sensitive information about the training data from the exchanged model parameters. For example, membership inference attacks [167, 137] can infer whether a specific data record was used in the training. Note that inference attacks may also be conducted during the learning process by the FL manager, who has access to the local updates of the parties.

3.4 Communication Architecture

There are two major ways of communication in FLSs: the centralized design and the decentralized design. In the centralized design, the data flow is often asymmetric, which means the manager aggregates information (e.g., local models) from the parties and sends back training results [24].
The parameter updates on the global model are always done by this manager. The communication between the manager and the local parties can be synchronous [129] or asynchronous [204, 171]. In a decentralized design, the communications are performed among the parties [217, 104], and every party is able to update the global parameters directly.

Google Keyboard [71] is a case of the centralized architecture. The server collects local model updates from users' devices and trains a global model, which is sent back to the users for inference, as shown in Figure 2a. Scalability and stability are two important factors in the system design of centralized FL.

While the centralized design is widely used in existing studies, the decentralized design is preferred in some aspects, since concentrating information on one server may bring potential risks or unfairness. However, the design of a decentralized communication architecture is challenging; it should take fairness and communication overhead into consideration. There are currently three different decentralized designs: P2P [104, 217], graph [128], and blockchain [191, 219]. In a P2P design, the parties are equally privileged and treated during federated learning. An example is SimFL [104], where each party trains a tree sequentially and sends the tree to all the other parties. The communication architecture can also be modeled as a graph with additional constraints such as latency and computation time. Marfoq et al. [128] propose an algorithm to find a throughput-optimal topology design. Recently, blockchain [223] has become a popular decentralized platform for consideration. It can be used to store the information of the parties in federated learning and to ensure the transparency of federated learning [191].

3.5 Scale of Federation

FLSs can be categorized into two typical types by the scale of federation: cross-silo FLSs and cross-device FLSs [85].
The differences between them lie in the number of parties and the amount of data stored in each party.

In a cross-silo FLS, the parties are organizations or data centers. There is usually a relatively small number of parties, and each of them has a relatively large amount of data as well as computational power. For example, Amazon may want to recommend items for users by training on the shopping data collected from hundreds of data centers around the world. Each data center possesses a huge amount of data as well as sufficient computational resources. Another example is that federated learning can be used among medical institutions. Different hospitals can use federated learning to train a CNN for chest radiography classification while keeping their chest X-ray images locally [86]. With federated learning, the accuracy of the model can be significantly improved. One challenge that such an FLS faces is how to efficiently distribute computation to data centers under the constraint of privacy models [224].

In a cross-device FLS, on the contrary, the number of parties is relatively large, and each party has a relatively small amount of data as well as computational power. The parties are usually mobile devices. Google Keyboard [208] is an example of a cross-device FLS. The query suggestions of Google Keyboard can be improved with the help of FL. Due to energy consumption concerns, the devices cannot be asked to conduct complex training tasks. In this setting, the system should be powerful enough to manage a large number of parties and deal with possible issues such as unstable connections between the devices and the server.

3.6 Motivation of Federation

In real-world applications of FL, individual parties need motivation to get involved in the FLS. The motivation can be regulations or incentives. FL inside a company or an organization is usually motivated by regulations (e.g., FL across different departments of a company).
For example, the department which has the transaction records of users can help another department to predict user credit by federated learning. In many cases, parties cannot be forced to provide their data by regulations. However, parties that choose to participate in federated learning can benefit from it, e.g., through higher model accuracy. For example, hospitals can conduct federated learning to train a machine learning model for chest radiography classification [86] or COVID-19 detection [152]. Then, the hospitals can get a good model which has a higher accuracy than human experts and than models trained locally without federation. Another example is Google Keyboard [208]. While users have the choice to prevent Google Keyboard from utilizing their data, those who agree to upload input data may enjoy a higher accuracy of word prediction. Users may be willing to participate in federated learning for their convenience.

A challenging problem is how to design a fair incentive mechanism, such that a party that contributes more can also benefit more from federated learning. There have been some successful cases of incentive designs in blockchain [228, 51]. The parties inside the system can be collaborators as well as competitors. Other incentive designs like [88, 87] are proposed to attract participants with high-quality data for FL. We expect that game theory models [163, 84, 136] and their equilibrium designs should be revisited under FLSs. Even in the case of Google Keyboard, the users need to be motivated to participate in this collaborative learning process.

4 Summary of Existing Studies

In this section², we summarize and compare the existing studies on FLSs according to the aspects considered in Section 3.

² Last updated on December 7, 2021. We will periodically update this section to include the state-of-the-art and valuable FL studies. Please check out our latest version at this URL: https://arxiv.org/abs/1907.09693.
Also, if you have any reference that you want to add to this survey, kindly drop Dr. Bingsheng He an email (hebs@comp.nus.edu.sg).

4.1 Methodology

To discover the existing studies on FL, we search the keyword "Federated Learning" in Google Scholar. Here we only consider the published studies in the computer science community. Since the scale of federation and the motivation of federation are problem-dependent, we do not compare the existing studies on these two aspects. For ease of presentation, we use "NN", "DT" and "LM" to denote neural networks, decision trees and linear models, respectively. Moreover, we use "CM" and "DP" to denote cryptographic methods and differential privacy, respectively. Note that the algorithms (e.g., federated stochastic gradient descent) in some studies can be used to learn many machine learning models (e.g., logistic regression and neural networks). Thus, in the "model implementation" column, we present the models implemented in the experiments of the corresponding papers. Moreover, in the "main area" column, we indicate the major area that the papers study.

4.2 Individual Studies

We summarize existing popular and state-of-the-art research work, as shown in Table 1. From Table 1, we have the following four key findings.

First, most of the existing studies consider a horizontal data partitioning. We conjecture that part of the reason is that the experimental studies and benchmarks for horizontal data partitioning are relatively more mature than those for vertical data partitioning. However, vertical FL is also common in the real world, especially between different organizations. Vertical FL can enable more collaboration between diverse parties. Thus, more effort should be devoted to vertical FL to fill the gap.

Second, most studies consider exchanging the raw model parameters without any privacy guarantees. This may not be safe if more powerful attacks on machine learning models are discovered in the future.
Currently, the mainstream methods to provide privacy guarantees are differential privacy and cryptographic methods (e.g., secure multi-party computation and homomorphic encryption). Differential privacy may influence the final model quality a lot. Moreover, the cryptographic methods bring much computation and communication overhead and may be the bottleneck of FLSs. We look forward to a cheap way with reasonable privacy guarantees to satisfy the regulations.

Third, the centralized design is the mainstream of current implementations. A trusted server is needed in their settings. However, it may be hard to find a trusted server, especially in the cross-silo setting. One naive approach to remove the central server is that each party shares its model parameters with all the other parties and also maintains the same global model locally. This method brings more communication and computation cost compared with the centralized setting. More studies should be done on practical FL with a decentralized architecture.

Last, the main research directions (and the main challenges) of FL are to improve effectiveness, efficiency, and privacy, which are also three important metrics for evaluating an FLS. Meanwhile, there are many other research topics on FL such as fairness and incentive mechanisms. Since FL is related to many research areas, we believe that FL will attract more researchers and we will see more interesting studies in the near future.

4.2.1 Effectiveness Improvement

While some algorithms are based on SGD, other algorithms are specially designed for one or several kinds of model architectures. Thus, we classify them into SGD-based algorithms and model-specialized algorithms accordingly.

SGD-Based. If we consider the local data on a party as a single batch, SGD can be easily implemented in a federated setting by performing a single batch gradient calculation each round (i.e., FedSGD [129]).
However, such a method may require a large number of communication rounds to converge. To reduce the number of communication rounds, FedAvg [129], as introduced in Section 2.3.3 and Figure 1a of the main paper, is now a typical and practical FL framework based on SGD. In FedAvg, each party conducts multiple training rounds with SGD on its local model. Then, the weights of the global model are updated as the mean of the weights of the local models. The global model is sent back to the parties to finish a global iteration. By averaging the weights, the local parties can take multiple steps of gradient descent on their local models, so that the number of communication rounds can be reduced compared with FedSGD.

Konečný et al. [94] propose federated SVRG (FSVRG). The major difference between federated SVRG and federated averaging is the way to update the parameters of the local model and the global model (i.e., step 2 and step 4). In federated SVRG, the formulas to update the model weights are based on the stochastic variance reduced gradient (SVRG) [82] and the distributed approximate Newton algorithm (DANE). They compare their algorithm with other baselines like CoCoA+ [126] and simple distributed gradient descent. Their method can achieve better accuracy with the same communication rounds for the logistic regression model. There is no comparison between federated averaging and federated SVRG.

Table 1: Comparison among existing published studies. LM denotes Linear Models. DT denotes Decision Trees. NN denotes Neural Networks. CM denotes Cryptographic Methods. DP denotes Differential Privacy. (A blank cell repeats the value of the cell above.)

| FL Studies | main area | data partitioning | model implementation | privacy mechanism | communication architecture | remark |
|---|---|---|---|---|---|---|
| FedAvg [129] | Effective Algorithms | horizontal | NN | | centralized | SGD-based |
| FedSVRG [94] | | | LM | | | |
| FedProx [108] | | | LM, NN | | | |
| SCAFFOLD [90] | | | LM, NN | | | |
| FedNova [190] | | | NN | | | |
| Per-FedAvg [52] | | | NN | | | |
| pFedMe [46] | | | LM, NN | | | |
| IAPGD, AL2SGD+ [69] | | | LM | | | |
| IFCA [61] | | | LM, NN | | | |
| Agnostic FL [134] | | | LM, NN | | | |
| FedRobust [155] | | | NN | | | |
| FedDF [114] | | | NN | | | |
| FedBCD [120] | | vertical | NN | | | |
| PFNM [213] | | horizontal | NN | | | NN-specialized |
| FedMA [189] | | | | | | |
| SplitNN [189] | | vertical | | | | |
| Tree-based FL [217] | | horizontal | DT | DP | decentralized | DT-specialized |
| SimFL [104] | | | | hashing | | |
| FedXGB [122] | | | | CM | centralized | |
| FedForest [121] | | | | | | |
| SecureBoost [38] | | vertical | | | | |
| Ridge Regression FL [141] | | horizontal | LM | | | LM-specialized |
| PPRR [36] | | | | | | |
| Linear Regression FL [162] | | vertical | | | | |
| Logistic Regression FL [72] | | horizontal | | | | |
| Federated MTL [169] | | | | | | multi-task learning |
| Federated Meta-Learning [33] | | | NN | | | meta-learning |
| Personalized FedAvg [81] | | | | | | |
| LFRL [115] | | | | | | reinforcement learning |
| FBO [44] | | | LM | | | Bayesian optimization |
| Structure Updates [95] | Practicality Enhancement | | NN | | | efficiency improvement |
| Multi-Objective FL [226] | | | | | | |
| On-Device ML [79] | | | | | | |
| Sparse Ternary Compression [164] | | | | | | |
| DP ASGD [128] | | | | DP | decentralized | |
| Client-Level DP FL [60] | | | | DP | centralized | privacy guarantees |
| FL-LSTM [130] | | | | | | |
| Local DP FL [20] | | | LM, NN | | | |
| Secure Aggregation FL [23] | | | NN | CM | | |
| Hybrid FL [181] | | | LM, DT, NN | CM, DP | | |
| Backdoor FL [16, 174, 188] | | | NN | | | robustness and attacks |
| Adversarial Lens [19] | | | | | | |
| Distributed Backdoor [203] | | | | | | |
| Image Reconstruction [58] | | | | | | |
| RSA [100] | | | LM | | | |
| Model Poison [53] | | | LM, NN | | | |
| q-FedAvg [110] | | | LM, NN | | | fairness |
| BlockFL [93] | | | LM | | | incentives |
| Reputation FL [87] | | | | | | |
| FedCS [143] | Applications | | NN | | | edge computing |
| DRL-MEC [194] | | | | | | |
| Resource-Constrained MEC [192] | | | LM, NN | | | |
| FedGKT [73] | | | NN | | | |
| FedCF [14] | | | LM | | | collaborative filtering |
| FedMF [29] | | | | | | matrix factorization |
| FedRecSys [177] | | | LM, NN | CM | | recommender system |
| FL Keyboard [71] | | | NN | | | natural language processing |
| Fraud detection [222] | | | NN | | | credit card transaction |
| FedML [74] | Benchmarks | horizontal & vertical | LM, NN | | centralized & decentralized | general purpose benchmarks |
| FedEval [30] | | horizontal | NN | | centralized | |
| OARF [77] | | | NN | CM, DP | centralized | |
| Edge AIBench [70] | | | | | | |
| PerfEval [142] | | | NN | | centralized | targeted benchmarks |
| FedReID [227] | | | | | | |
| semi-supervised benchmark [216] | | | | | | |
| non-IID benchmark [117] | | | | | | |
| LEAF [25] | | | | | centralized | datasets |
| Street Dataset [124] | | | | | | |
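The FedAvg round structure (multiple local SGD steps per party, followed by averaging on the manager) can be sketched on a toy least-squares problem; this is an illustrative simplification, not the original implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def local_sgd(w, X, y, lr=0.1, epochs=5):
    """Several local gradient steps on a least-squares objective."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

# Two parties with local data drawn from the same underlying linear model.
w_true = np.array([2.0, -1.0])
parties = []
for _ in range(2):
    X = rng.normal(size=(50, 2))
    parties.append((X, X @ w_true))

w_global = np.zeros(2)
for _ in range(20):                                          # global rounds
    local_models = [local_sgd(w_global, X, y) for X, y in parties]
    w_global = np.mean(local_models, axis=0)                 # FedAvg aggregation
```

With five local epochs per round, each communication round makes far more progress than a single FedSGD gradient step would, which is exactly the communication saving the text describes.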
A key challenge in federated learning is the heterogeneity of local data (i.e., non-IID data) [106], which can degrade the performance of federated learning a lot [108, 90, 111]. Since the local models are updated towards their local optima, which are far from each other due to the non-IID data, the averaged global model may also be far from the global optimum. To address this challenge, Li et al. [108] propose FedProx. Since too many local updates may lead the averaged model far from the global optimum, FedProx introduces an additional proximal term in the local objective to limit the amount of local change. Instead of directly limiting the size of the local updates, SCAFFOLD [90] applies the variance reduction technique to correct the local updates. While FedProx and SCAFFOLD improve the local training stage of FedAvg, FedNova [190] improves the aggregation stage of FedAvg. It takes the heterogeneous local updates of each party into consideration and normalizes the local models according to the local updates before averaging.

The above studies' objective is to minimize the loss on the whole training dataset under the non-IID data setting. Another solution is to design personalized federated learning algorithms, where the aim is for each party to learn a personalized model which performs well on its local data. Per-FedAvg [52] applies the idea of the model-agnostic meta-learning [55] framework in FedAvg. pFedMe [46] uses the Moreau envelope to help decompose the personalized model optimization. Hanzely et al. [69] establish the lower bound for the communication complexity and local oracle complexity of personalized federated learning optimization. Moreover, they apply accelerated proximal gradient descent (APGD) and accelerated L2SGD+ [68], which can achieve the optimal complexity bound. IFCA [61] assumes that the parties are partitioned into clusters by their local objectives.
The idea is to alternately minimize the loss functions while estimating the cluster identities. Another research direction related to the non-IID data setting is to design federated learning that is robust against possible combinations of the local distributions. Mohri et al. [134] propose a new framework named agnostic FL. Instead of minimizing the loss with respect to the average distribution among the data distributions of the local clients, they try to train a centralized model optimized for any possible target distribution formed by a mixture of the client distributions. FedRobust [155] considers structured affine distribution shifts. It proposes a gradient descent ascent method to solve the distributed minimax optimization problem.

While the above studies consider the heterogeneity of data, heterogeneity of the local models may also exist in federated learning. The parties can train models with different architectures. FedDF [114] utilizes knowledge distillation [75] to aggregate the local models. It assumes that a public dataset exists on the server side, which can be used to extract the knowledge of the local models and update the global model.

There are few studies on SGD-based vertical federated learning. The study [120] proposes Federated Stochastic Block Coordinate Descent (FedBCD) for vertical FL. By applying coordinate descent, each party updates its local parameters for multiple rounds before communicating the intermediate results. The authors also provide a convergence analysis for FedBCD. Hu et al. [78] propose FDML for vertical FL, assuming all parties have the labels. Instead of exchanging the intermediate results, it aggregates the local predictions from each of the participating parties.

Neural Networks. Although neural networks can be trained using the SGD optimizer, we can potentially increase the model utility if the model architecture can also be exploited. Yurochkin et al.
[213] develop probabilistic federated neural matching (PFNM) for multilayer perceptrons by applying Bayesian nonparametric machinery [59]. They use a Beta-Bernoulli-process-informed matching procedure to combine the local models into a federated global model. The experiments show that their approach can outperform FedAvg under both IID and non-IID data partitionings. Wang et al. [189] show how to apply PFNM to CNNs (convolutional neural networks) and LSTMs (long short-term memory networks). Moreover, they propose Federated Matched Averaging (FedMA) with a layer-wise matching scheme that exploits the model architecture. Specifically, they use matched averaging to update one layer of the global model at a time, which also reduces the communication size. The experiments show that FedMA performs better than FedAvg and FedProx [108] on CNNs and LSTMs.

Another study for vertical federated learning on neural networks is split learning [184]. Vepakomma et al. [184] propose a novel paradigm named SplitNN, where a neural network is divided into two parts. Each participating party needs to train only a few layers of the network; the outputs at the cut layer are then transmitted to the party who holds the labels and completes the rest of the training.

Trees. Besides neural networks, decision trees are also widely used in academia and industry [34, 92, 54, 105]. Compared with NNs, the training and inference of trees are highly efficient. However, the tree parameters cannot be directly optimized by SGD, which means that SGD-based FL frameworks are not applicable to learning trees; we need specialized frameworks for trees. Among the tree models, the Gradient Boosting Decision Tree (GBDT) model [34] is quite popular, and there are several studies on federated GBDTs.

Some studies target horizontal federated GBDTs. Zhao et al. [217] propose the first FLS for GBDTs.
In their framework, each decision tree is trained locally without communication between the parties. The trees trained in one party are sent to the next party to continue training a number of trees. Differential privacy is used to protect the decision trees. Li et al. [104] exploit similarity information in the building of federated GBDTs by using locality-sensitive hashing [45]. They utilize the data distribution of the local parties by aggregating the gradients of similar instances. Under a weaker privacy model compared with secure multi-party computation, their approach is effective and efficient. Liu et al. [122] propose a federated extreme boosting learning framework for mobile crowdsensing. They adopt secret sharing to achieve privacy-preserving learning of GBDTs.

Liu et al. [121] propose Federated Forest, which enables training random forests in the vertical FL setting. In the building of each node, the party with the corresponding split feature is responsible for splitting the samples and sharing the results. They encrypt the communicated data to protect privacy. Their approach is as accurate as the non-federated version. Cheng et al. [38] propose SecureBoost, a framework for GBDTs in the vertical FL setting. In their assumption, only one party has the label information. They use an entity alignment technique to get the common data and then build the decision trees. Additively homomorphic encryption is used to protect the gradients.

Linear/Logistic Regression. Linear/logistic regression can be achieved using SGD. Here we show the studies that are not SGD-based and are specially designed for linear/logistic regression. In the horizontal FL setting, Nikolaenko et al. [141] propose a system for privacy-preserving ridge regression. Their approach combines both homomorphic encryption and Yao's garbled circuits to achieve the privacy requirements. An extra evaluator is needed to run the algorithm. Chen et al.
[36] propose a system for privacy-preserving ridge regression. Their approach combines both secure summation and homomorphic encryption to achieve the privacy requirements. They provide a complete communication and computation overhead comparison between their approach and the previous state-of-the-art approaches.

In the vertical FL setting, Sanil et al. [162] present a secure regression model. They focus on the linear regression model, and secret sharing is applied to ensure privacy in their solution. Hardy et al. [72] present a solution for two-party vertical federated logistic regression. They apply entity resolution and additively homomorphic encryption.

Others. There are many studies that combine FL with other machine learning techniques such as multi-task learning [157], meta-learning [55], reinforcement learning [133], and transfer learning [147]. Smith et al. [169] combine FL with multi-task learning [26, 215]. Their method considers the issues of high communication cost, stragglers, and fault tolerance for MTL in the federated environment. Corinzia and Buhmann [43] propose a federated MTL method with non-convex models. They treat the central server and the local parties as a Bayesian network, and the inference is performed using variational methods.

Chen et al. [33] adopt meta-learning in the learning process of FedAvg. Instead of training the local NNs and exchanging the model parameters, the parties adopt the Model-Agnostic Meta-Learning (MAML) [55] algorithm in the local training and exchange the gradients of MAML. Jiang et al. [81] interpret FedAvg in the light of existing MAML algorithms. Furthermore, they apply the Reptile algorithm [139] to fine-tune the global model trained by FedAvg. Their experiments show that the meta-learning algorithm can improve the effectiveness of the global model. Liu et al. [115] propose a lifelong federated reinforcement learning framework.
Adopting transfer learning techniques, a global model is trained to effectively remember what the robots have learned in reinforcement learning. Dai et al. [44] consider Bayesian optimization in FL. They propose federated Thompson sampling to address the communication efficiency and the heterogeneity of the clients. Their approach can potentially be used for parameter search in federated learning.

Another issue in FL is packet loss or party disconnection during the FL process, which usually happens on mobile devices. When the number of failed messages is small, the server can simply ignore them, as they have a small weight in the updating of the global model. If the party failure is significant, the server can restart from the results of the previous round [24]. We look forward to more novel solutions to deal with the disconnection issue for effectiveness improvement.

Summary. We summarize the above studies as follows.

• As the SGD-based framework has been widely studied and used, more studies have recently focused on model-specialized FL. We expect to achieve better model accuracy by using model-specialized methods. Moreover, we encourage researchers to study federated decision tree models. The tree models have a small model size and are easy to train compared with neural networks, which can result in a low communication and computation overhead in FL.

• The study of FL is still at an early stage. Few studies have been done on applying FL to train state-of-the-art neural networks such as ResNeXt [127] and EfficientNet [178]. How to design an effective and practical algorithm to train a complex machine learning model is still a challenging and ongoing research direction.

• While most studies focus on horizontal FL, there is still no well-developed algorithm for vertical FL. However, the vertical federated setting is common in real-world applications where multiple organizations are involved.
We look forward to more studies on this promising area.

4.2.2 Communication Efficiency

While the computation of FL can be accelerated using modern hardware and techniques [123, 101, 102] from the high performance computing community [197, 199], the FL studies mainly work on reducing the communication size during the FL process. Konečný et al. [95] propose two ways, structured updates and sketched updates, to reduce the communication costs in federated averaging. The first approach restricts the structure of the local updates and transforms each update into the product of two smaller matrices; only one small matrix is sent during the learning process. The second approach uses a lossy compression method to compress the updates. Their method can reduce the communication cost by two orders of magnitude with a slight degradation in convergence speed.

Zhu and Jin [226] design a multi-objective evolutionary algorithm to minimize the communication costs and the global model test errors simultaneously. Considering the minimization of the communication cost and the maximization of the global learning accuracy as two objectives, they formulate FL as a bi-objective optimization problem and solve it with the multi-objective evolutionary algorithm. Jeong et al. [79] propose an FL framework for devices with non-IID local data. They design federated distillation, whose communication size depends on the output dimension but not on the model size. Also, they propose a data augmentation scheme using a generative adversarial network (GAN) to make the training dataset IID. Many other studies also design specialized approaches for non-IID data [221, 111, 118, 210]. Sattler et al. [164] propose a new compression framework named sparse ternary compression (STC). Specifically, STC compresses the communication using sparsification, ternarization, error accumulation, and optimal Golomb encoding. Their method is robust to non-IID data and large numbers of parties.
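The sparsification and ternarization steps used by compression schemes of this kind can be sketched as follows: keep only the k largest-magnitude entries of an update and replace them by a shared magnitude times their signs. This is a simplified illustration of the general idea, not the full STC pipeline with error accumulation and Golomb encoding:

```python
import numpy as np

def sparsify_ternarize(update, k):
    """Keep the k largest-magnitude entries, ternarized to {-mu, 0, +mu}."""
    idx = np.argsort(np.abs(update))[-k:]   # indices of the top-k entries
    mu = np.mean(np.abs(update[idx]))       # single shared magnitude
    compressed = np.zeros_like(update)
    compressed[idx] = mu * np.sign(update[idx])
    return compressed

update = np.array([0.1, -2.0, 0.3, 1.0, -0.05])
compressed = sparsify_ternarize(update, k=2)  # only two entries survive: -1.5 and +1.5
```

After this step, a party only needs to transmit the k positions, their signs, and one scalar mu, which is what makes the per-round communication so much smaller than sending dense floating-point updates.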
Besides the communication size, the communication architecture can also be improved to increase the training efficiency. Marfoq et al. [128] consider the topology design for cross-silo federated learning. They propose an approach to find a throughput-optimal topology, which can significantly reduce the training time.

4.2.3 Privacy, Robustness and Attacks

Although the original data are not exchanged in FL, the model parameters can still leak sensitive information about the training data [167, 137, 196]. Thus, it is important to provide privacy guarantees for the exchanged local updates.

Differential privacy is a popular method to provide privacy guarantees. Geyer et al. [60] apply differential privacy in federated averaging from a client-level perspective. They use the Gaussian mechanism to distort the sum of the updates of gradients to protect a whole client's dataset instead of a single data point. McMahan et al. [130] deploy federated averaging in the training of LSTMs. They also use client-level differential privacy to protect the parameters. Bhowmick et al. [20] apply local differential privacy to protect the parameters in FL. To increase the model quality, they consider a practical threat model in which the adversary wishes to decode individuals' data but has little prior information about them. Under this assumption, they can better utilize the privacy budget.

Bonawitz et al. [23] apply secure multi-party computation to protect the local parameters on the basis of federated averaging. Specifically, they present a secure aggregation protocol to securely compute the sum of vectors based on secret sharing [165]. They also discuss how to combine differential privacy with secure aggregation. Truex et al. [181] combine both secure multi-party computation and differential privacy for privacy-preserving FL. They use differential privacy to inject noises into the local updates.
The noisy updates are then encrypted using the Paillier cryptosystem [146] before being sent to the central server.

Regarding the attacks on FL, one popular kind of attack is the backdoor attack, which aims to produce a bad global model by exchanging malicious local updates. Bagdasaryan et al. [16] conduct a model poisoning attack on FL. The malicious parties commit the attack models to the server so that the global model may overfit to the poisoned data. Secure multi-party computation cannot prevent such attacks, since it aims to protect the confidentiality of the model parameters. Bhagoji et al. [19] also study the model poisoning attack on FL. Since the averaging step reduces the effect of the malicious model, they adopt an explicit boosting approach to increase the committed weight update. Sun et al. [174] conduct experiments to evaluate backdoor attacks and defenses for federated learning on the federated EMNIST dataset, to see which factors can affect the performance of the adversary. They find that in the absence of defenses, the performance of the attack largely depends on the fraction of adversaries present and the "complexity" of the targeted task: the more backdoor tasks there are, the harder it is to backdoor a fixed-capacity model while maintaining its performance on the main task. Wang et al. [188] discuss the backdoor attack from a theoretical view and prove that it is feasible in FL. They also propose a new class of backdoor attacks named edge-case backdoors, which are resistant to the current defense methods. Xie et al. [203] propose a distributed backdoor attack on FL. They decompose the global trigger pattern into local patterns; each adversarial party only employs one local pattern. The experiments show that their distributed backdoor attack outperforms the central backdoor attack.

Another kind of attack is the Byzantine attack, where adversaries fully control some authenticated devices and behave arbitrarily to disrupt the network.
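A standard family of defenses against such Byzantine behavior is robust aggregation; a minimal sketch of the Krum rule [21], which selects the single submitted update closest to its nearest neighbors, under the simplifying assumption that at most f parties are Byzantine:

```python
import numpy as np

def krum(updates, f):
    """Krum: pick the update with the smallest sum of squared distances
    to its n - f - 2 nearest neighbors (tolerates up to f Byzantine parties)."""
    n = len(updates)
    scores = []
    for i in range(n):
        dists = sorted(np.sum((updates[j] - updates[i]) ** 2)
                       for j in range(n) if j != i)
        scores.append(sum(dists[: n - f - 2]))
    return updates[int(np.argmin(scores))]

# Four honest parties near [1, 1] and one Byzantine party far away.
updates = [np.array([1.0, 1.0]), np.array([1.1, 0.9]), np.array([0.9, 1.1]),
           np.array([1.0, 1.1]), np.array([100.0, -100.0])]
chosen = krum(updates, f=1)  # one of the honest updates is selected
```

Because the score only counts the closest neighbors, the outlier update accumulates a huge score and can never be selected, whereas plain averaging would be dragged arbitrarily far by it.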
There have been some robust aggregation rules in distributed learning, such as Krum [21] and Bulyan [132]. These rules can be directly applied in federated learning. However, since each party conducts multiple local update steps in federated learning, it is interesting to investigate Byzantine attacks and defenses specifically in federated learning. Li et al. [100] propose RSA, a Byzantine-robust stochastic aggregation method for federated learning in the non-IID data setting. Fang et al. [53] propose model poisoning attacks against Byzantine-robust federated learning approaches. The goal of their approach is to modify the local models such that the global model deviates the most towards the inverse of the correct update direction.

Another line of study on FL attacks is the inference attack. There are existing studies on inference attacks [56, 167, 137] against machine learning models trained in a centralized setting. For the federated setting, Geiping et al. [58] show that it is possible to reconstruct the training images from knowledge of the exchanged gradients.

4.2.4 Fairness and Incentive Mechanisms

Taking fairness into consideration on the basis of FedAvg, Li et al. [110] propose q-FedAvg. Specifically, they define fairness according to the variance of the performance of the model across the parties: if this variance is smaller, then the model is fairer. Thus, they design a new objective inspired by α-fairness [13]. Based on federated averaging, they propose q-FedAvg to solve their new objective. The major difference between q-FedAvg and FedAvg lies in the formulas to update the model parameters.

Kim et al. [93] combine the blockchain architecture with FL. On the basis of federated averaging, they use a blockchain network to exchange the devices' local model updates, which is more stable than a central server and can provide rewards for the devices. Kang et al.
[87] designed a reputation-based worker selection scheme for reliable FL using a multi-weight subjective logic model. They also leverage the blockchain to achieve secure reputation management for workers with non-repudiation and tamper-resistance properties in a decentralized manner.

Summary According to the review above, we summarize the studies in Section 4.2.2 to Section 4.2.4 as follows.

• Besides effectiveness, efficiency and privacy are the other two important factors of an FLS. Compared with these three areas, there are fewer studies on fairness and incentive mechanisms. We look forward to more studies on fairness and incentive mechanisms, which can encourage the usage of FL in the real world.

• For the efficiency improvement of FLSs, the communication overhead is still the main challenge. Most studies [95, 79, 164] try to reduce the communication size of each iteration. How to reasonably set the number of communication rounds is also a promising direction [226]. The trade-off between computation and communication still needs further investigation.

• For the privacy guarantees, differential privacy and secure multi-party computation are two popular techniques. However, differential privacy may significantly degrade model quality, and secure multi-party computation may be very time-consuming. It is still challenging to design a practical FLS with strong privacy guarantees. Also, effective robust algorithms against poisoning attacks are not widely adopted yet.

4.2.5 Applications

One area related to FL is edge computing [140, 212, 153, 47, 218], where the parties are edge devices. Many studies try to integrate FL with mobile edge systems. FL also shows promising results in recommender systems [14, 29, 225], natural language processing [71] and transaction fraud detection [222].
Edge Computing Nishio and Yonetani [143] implement federated averaging in practical mobile edge computing (MEC) frameworks. They use an operator of MEC frameworks to manage the resources of heterogeneous clients. Wang et al. [194] adopt both distributed deep reinforcement learning (DRL) and federated learning in a mobile edge computing system. The usage of DRL and FL can effectively optimize mobile edge computing, caching, and communication. Wang et al. [192] perform FL on resource-constrained MEC systems, addressing the problem of how to efficiently utilize the limited computation and communication resources at the edge. Using federated averaging, they implement many machine learning algorithms including linear regression, SVM, and CNN. He et al. [73] also consider the limited computing resources of edge devices. They propose FedGKT, where each device only trains a small part of a whole ResNet to reduce the computation overhead.

Recommender System Ammad-ud-din et al. [14] formulate the first federated collaborative filtering method. Based on a stochastic gradient approach, the item-factor matrix is trained in a global server by aggregating the local updates. They empirically show that the federated method has almost no accuracy loss compared with the centralized method. Chai et al. [29] design a federated matrix factorization framework. They use federated SGD to learn the matrices. Moreover, they adopt homomorphic encryption to protect the communicated gradients. Tan et al. [177] build a federated recommender system (FedRecSys) based on FATE. FedRecSys implements popular recommendation algorithms with SMC protocols, including matrix factorization, singular value decomposition, factorization machines, and deep learning.

Natural Language Processing Hard et al. [71] apply FL to mobile keyboard next-word prediction.
They adopt the federated averaging method to learn a variant of LSTM called Coupled Input and Forget Gate (CIFG) [65]. The FL method achieves better precision and recall than server-based training on log data.

Transaction Fraud Detection Zheng et al. [222] introduce FL into the field of fraud detection on credit card transactions. They design a novel meta-learning-based federated learning framework, named deep K-tuplet network, which not only guarantees data privacy but also achieves significantly higher performance compared with existing approaches.

Summary According to the above studies, we have the following summaries.

• Edge computing naturally fits the cross-device federated setting. A nontrivial issue in applying FL to edge computing is how to effectively utilize and manage the edge resources. The usage of FL can bring benefits to users, especially for improving mobile device services.

• FL can solve many traditional machine learning tasks such as image classification and word prediction. Due to the regulations and "data islands", the federated setting may become a common setting in the coming years. With the fast development of FL, we believe that there will be more applications in computer vision, natural language processing, and healthcare.

4.2.6 Benchmark

Benchmarks are important for directing the development of FLSs. Multiple benchmark-related works have been conducted recently, and several benchmark frameworks are available online. We categorize them into three types: 1) General purpose benchmark systems aim to comprehensively evaluate FLSs and give a detailed characterization of their different aspects; 2) Targeted benchmarks aim at one or more aspects concentrated in a small domain and try to optimize the performance of the system in that domain; 3) Dataset benchmarks aim at providing dedicated datasets for federated learning.
General Purpose Benchmark Systems FedML [74] is a research library that provides both frameworks for federated learning and benchmark functionalities. As a benchmark, it provides comprehensive baseline implementations for multiple ML models and FL algorithms, including FedAvg, FedNAS, vertical FL, and split learning. Moreover, it supports three computing paradigms, namely distributed training, mobile on-device training, and standalone simulation. Although some of its experimental results are currently still at a preliminary stage, it is one of the most comprehensive benchmark frameworks with respect to its functionalities. FedEval [30] is another evaluation model for federated learning. It features the "ACTPR" model, i.e., using accuracy, communication, time consumption, privacy and robustness as its evaluation targets. It utilizes Docker containers to provide an isolated evaluation environment to work around hardware resource limitations, and simulates up to 100 clients in the implementation. Currently, two horizontal algorithms are supported, FedSGD and FedAvg, and models including MLP and LeNet are tested. OARF [77] provides a set of utilities and reference implementations for FL benchmarks. It features the measurement of different components in FLSs, including FL algorithms, encryption mechanisms, privacy mechanisms, and communication methods. In addition, it also features realistic partitioning of datasets, which utilizes public datasets collected from different sources to reflect real-world data distributions. Both horizontal and vertical algorithms are tested. Edge AIBench [70] provides a testbed for federated learning applications, and models four application scenarios as reference implementations: ICU patient monitoring, surveillance cameras, smart homes, and autonomous vehicles. The implementation is open sourced, but no experimental results have been reported so far.

Targeted Benchmarks Nilsson et al.
[142] propose a method utilizing a correlated t-test to compare different types of federated learning algorithms while bypassing the influence of data distributions. Three FL algorithms, FedAvg, FedSVRG [95] and CO-OP [195], are compared in both IID and non-IID setups in their work, and the results show that FedAvg achieves the highest accuracy among the three algorithms regardless of how the data is partitioned. Zhuang et al. [227] utilize benchmark analysis to improve the performance of federated person re-identification. The benchmark part uses 9 different datasets to simulate real-world situations and uses federated partial averaging, an algorithm that allows the aggregation of partially different models, as the reference implementation. Zhang et al. [216] present a benchmark targeted at the semi-supervised federated learning setting, where users only have unlabelled data and the server only has a small amount of labelled data, and explore the relation between final model accuracy and multiple metrics, including the distribution of the data, the algorithm and communication settings, and the number of clients. Utilizing the experimental results, their improved semi-supervised learning method achieves better generalization performance. Liu et al. [117] focus on the non-IID problem, where datasets are distributed unevenly across the participating parties. Their work explores methods for quantitatively describing the skewness of the data distribution, and proposes several non-IID dataset generation approaches.

Datasets LEAF [25] is one of the earliest dataset proposals for federated learning. It contains six datasets covering different domains, including image classification, sentiment analysis, and next-character prediction. A set of utilities is provided to divide datasets into different parties in an IID or non-IID way.
For each dataset, a reference implementation is also provided to demonstrate the usage of that dataset in the training process. Luo et al. [124] present real-world image datasets collected from 26 different street cameras. Images in the dataset contain objects of 7 different categories and are suitable for the object detection task. Implementations of federated averaging running the YOLOv3 and Faster R-CNN models are provided as references.

Summary Summarizing the studies above, we have the following discoveries.

• Benchmarks serve an important role in the development of federated learning. Through different types of benchmarks, we can quantitatively characterize the different components and aspects of federated learning. Benchmarks regarding the security and privacy issues in federated learning are still at an early stage and require further development.

• Currently, no sufficiently comprehensive benchmark system has been implemented to cover all the algorithms or application types in FLSs. Even the most comprehensive benchmark systems lack support for certain algorithms and evaluation metrics at each level of the system. Further development of comprehensive benchmark systems requires the support of extensive FL frameworks.

• Most benchmark studies use datasets that are split from a single dataset, and there is no consensus on what type of splitting method should be used. Similarly, regarding the non-IID problem, there is no consensus on the metric of non-IID-ness. Using realistic partitioning methods, as proposed in FedML [74] and OARF [77], may mitigate this issue, but for federated learning at a large scale, realistic partitioning is not suitable due to the difficulty of collecting data from different sources.
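One widely used family of synthetic non-IID splitting schemes draws each party's class proportions from a Dirichlet prior, with a smaller concentration parameter producing stronger label skew. The stdlib sketch below is illustrative only; the function name, the alpha value, and the cumulative rounding rule are our own choices, and the benchmarks surveyed above differ in their details.

```python
import random
from collections import defaultdict

def dirichlet_partition(labels, n_parties, alpha, seed=0):
    """Split sample indices across parties with Dirichlet(alpha) label skew.

    Smaller alpha -> each class concentrates on fewer parties (more non-IID).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    parties = [[] for _ in range(n_parties)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # A Dirichlet draw is a normalized vector of Gamma draws
        # (the stdlib has no direct Dirichlet sampler).
        raw = [rng.gammavariate(alpha, 1.0) for _ in range(n_parties)]
        total = sum(raw)
        # Cut this class's samples at cumulative proportion boundaries,
        # so every index is assigned to exactly one party.
        cuts, acc = [0], 0.0
        for r in raw[:-1]:
            acc += r / total
            cuts.append(round(acc * len(idxs)))
        cuts.append(len(idxs))
        for p in range(n_parties):
            parties[p].extend(idxs[cuts[p]:cuts[p + 1]])
    return parties

labels = [i % 3 for i in range(300)]  # three balanced classes
parts = dirichlet_partition(labels, n_parties=4, alpha=0.5)
# every sample lands in exactly one party; per-party class mixes are skewed
```

With large alpha the per-party class mixes approach the IID split; sweeping alpha gives a one-knob control over non-IID-ness, which is one pragmatic answer to the missing consensus noted above.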
4.3 Open Source Systems

In this section, we introduce five open source FLSs: Federated AI Technology Enabler (FATE)³, Google TensorFlow Federated (TFF)⁴, OpenMined PySyft⁵, Baidu PaddleFL⁶, and FedML⁷.

³ https://github.com/FederatedAI/FATE
⁴ https://github.com/tensorflow/federated
⁵ https://github.com/OpenMined/PySyft
⁶ https://github.com/PaddlePaddle/PaddleFL
⁷ https://github.com/FedML-AI/FedML

Figure 4: The FATE system structure

4.3.1 FATE

FATE is an industrial-level FL framework developed by WeBank, which aims to provide FL services between different organizations. FATE is based on Python and can be installed on Linux or Mac. It has attracted about 3.2k stars and 900 forks on GitHub. The overall structure of FATE is shown in Figure 4. It has six major modules: EggRoll, FederatedML, FATE-Flow, FATE-Serving, FATE-Board, and KubeFATE. EggRoll manages the distributed computing and storage, providing computing and storage APIs for the other modules. FederatedML includes the federated algorithms and secure protocols. Currently, it supports training many kinds of machine learning models under both horizontal and vertical federated settings, including NNs, GBDTs, and logistic regression. FATE assumes that the parties are honest-but-curious. Thus, it uses secure multi-party computation and homomorphic encryption to protect the communicated messages. However, it does not support differential privacy to protect the final model. FATE-Flow is a platform for users to define their pipeline of the FL process.
The pipeline can include data preprocessing, federated training, federated evaluation, model management, and model publishing. FATE-Serving provides inference services for the users: it supports loading the FL models and conducting online inference on them. FATE-Board is a visualization tool for FATE, providing a visual way to track job execution and model performance. Last, KubeFATE helps deploy FATE on clusters using Docker or Kubernetes, providing customized deployment and cluster management services. In general, FATE is a powerful and easy-to-use FLS. Users can simply set the parameters to run an FL algorithm. Moreover, FATE provides detailed documentation on its deployment and usage. However, since FATE provides algorithm-level interfaces, practitioners have to modify the source code of FATE to implement their own federated algorithms, which is not easy for non-expert users.

4.3.2 TFF

TFF, developed by Google, provides the building blocks for FL based on TensorFlow. It has attracted about 1.5k stars and 380 forks on GitHub. TFF provides a Python package which can be easily installed and imported. As shown in Figure 5, it provides APIs on two layers: the FL API and the Federated Core (FC) API. The FL API offers high-level interfaces. It includes three key parts: models, federated computation builders, and datasets. The FL API allows users to define models or simply load a Keras [66] model. The federated computation builders include the typical federated averaging algorithm. Also, the FL API provides simulated federated datasets and functions to access and enumerate the local datasets for FL. Besides the high-level interfaces, the FC API includes lower-level interfaces as the foundation of the FL process. Developers can implement their own functions and interfaces inside the federated core. Finally, FC provides the building blocks for FL.
It supports multiple federated operators such as federated sum, federated reduce, and federated broadcast, and developers can define their own operators to implement FL algorithms. Overall, TFF is a lightweight system for developers to design and implement new FL algorithms. Currently, TFF does not consider any adversaries during FL training and does not provide privacy mechanisms. TFF can only be deployed on a single machine for now, where the federated setting is implemented by simulation.

Figure 5: The TFF system structure

Figure 6: The PaddleFL system structure

4.3.3 PySyft

PySyft, first proposed by Ryffel et al. [158] and developed by OpenMined, is a Python library that provides interfaces for developers to implement their training algorithms. It has attracted about 7.3k stars and 1.7k forks on GitHub. While TFF is based on TensorFlow, PySyft can work well with both PyTorch and TensorFlow. PySyft provides multiple optional privacy mechanisms, including secure multi-party computation and differential privacy. Thus, it can support running on honest-but-curious parties. Moreover, it can be deployed on a single machine or multiple machines, where the communication between different clients goes through the websocket API [168]. However, while PySyft provides a set of tutorials, there is no detailed documentation on its interfaces and system architecture.

4.3.4 PaddleFL

PaddleFL is an FLS based on PaddlePaddle⁸, a deep learning platform developed by Baidu.
It is implemented in C++ and Python. It has attracted about 260 stars and 60 forks on GitHub. Like PySyft, PaddleFL supports both differential privacy and secure multi-party computation and can work with honest-but-curious parties. The system structure of PaddleFL is shown in Figure 6. At compile time, there are four components: FL strategies, user-defined models and algorithms, distributed training configuration, and the FL job generator. The FL strategies include horizontal FL algorithms such as FedAvg; vertical FL algorithms will be integrated in the future. Besides the provided FL strategies, users can also define their own models and training algorithms. The distributed training configuration defines the training node information in the distributed setting. The FL job generator generates the jobs for the federated server and workers. At run time, there are three components: the FL server, FL workers, and the FL scheduler. The server and workers are the manager and parties in FL, respectively. The scheduler selects the workers that participate in training in each round. Currently, the development of PaddleFL is still at an early stage, and the documentation and examples are not clear enough.

⁸ https://github.com/PaddlePaddle/Paddle

Figure 7: The FedML system structure

4.3.5 FedML

FedML provides both a framework for federated learning and a platform for FL benchmarks. It is developed by a team from the University of Southern California [74] based on PyTorch. FedML has attracted about 660 stars and 180 forks on GitHub. As an FL framework, its core structure is divided into two levels, as shown in Figure 7. In the low-level FedML-core, the training engine and distributed communication infrastructure are implemented.
The high-level FedML-API is built on top of it and provides training models, datasets, and FL algorithms. Reference application/benchmark implementations are further built on top of the FedML-API. While most algorithms implemented in FedML do not consider any adversaries, it supports applying differential privacy when aggregating the messages from the parties. FedML supports three computing paradigms, namely standalone simulation, distributed computing and on-device training, which provides a simulation environment for a broad spectrum of hardware requirements. Reference implementations for all supported FL algorithms are provided. Although there are still gaps between some of the experimental results and the optimal results, they provide useful information for further development.

4.3.6 Others

There are also closed-source federated learning systems. NVIDIA Clara⁹ has enabled FL. It adopts a centralized architecture and an encrypted communication channel. The targeted users of Clara FL are hospitals and medical institutions. Ping An Technology aims to build a federated learning system named Hive [2], which targets the financial industry. While Clara FL provides APIs and documentation, we cannot find official documentation for Hive.

4.3.7 Summary

Overall, FATE, PaddleFL, and FedML try to provide algorithm-level APIs for users to use directly, while TFF and PySyft try to provide more detailed building blocks so that developers can easily implement their own FL processes. Table 2 shows the comparison between the open-source systems. At the algorithm level, FATE is the most comprehensive system, supporting many machine learning models under both horizontal and vertical settings. TFF and PySyft only implement FedAvg, which is a basic framework in FL as shown in Section 4.2. PaddleFL currently supports several horizontal FL algorithms on NNs and logistic regression.
FedML integrates several state-of-the-art FL algorithms such as FedOpt [154] and FedNova [190]. Compared with FATE, TFF, and FedML, PySyft and PaddleFL provide more privacy mechanisms. PySyft covers all the listed features that TFF supports, while TFF is based on TensorFlow and PySyft works better with PyTorch. Based on popularity on GitHub, PySyft is currently the most impactful federated learning system in the machine learning community.

⁹ https://developer.nvidia.com/clara

Table 2: The comparison among some existing FLSs. The notations used in this table are the same as Table 1. A feature is omitted from a system's entry if the system does not support it. There is no release version for FedML. (Systems compared: FATE 1.5.0, TFF 0.17.0, PySyft 0.3.0, PaddleFL 1.1.0, FedML.)

Operating systems:          Mac (all five); Linux (all five); Windows (TFF, PySyft, FedML); iOS (FedML); Android (FedML)
Data partitioning:          horizontal (all five); vertical (FATE, PaddleFL, FedML)
Models:                     NN (all five); DT (FATE); LM (all five)
Privacy mechanisms:         DP (TFF, PySyft, PaddleFL, FedML); CM (FATE, PySyft, PaddleFL)
Communication:              simulated (all five); distributed (all except TFF)
Hardware:                   CPUs (all five); GPUs (TFF, PySyft, FedML)

5 System Design

Figure 8 shows the factors that need to be considered in the design of an FLS. Here, effectiveness, efficiency, and privacy are three important metrics of FLSs, which are also the main research directions of federated learning. Inspired by federated databases [166], we also consider autonomy, which is necessary to make FLSs practical. Next, we explain these factors in detail.

5.1 Effectiveness

The core of an FLS is one or more effective algorithms. To determine which algorithm to implement from the many existing studies shown in Table 1, we should first check the data partitioning of the parties. If the parties have the same features but different samples, one can use FedAvg [129] for NNs and SimFL [104] for trees.
If the parties have the same sample space but different features, one can use FedBCD [120] for NNs and SecureBoost [38] for trees.

5.2 Privacy

An important requirement of FLSs is to protect user privacy. Here we analyze the reliability of the manager. If the manager is honest and not curious, then we do not need to adopt any additional technique, since the FL framework ensures that the raw data is not exchanged. If the manager is honest but curious, then we have to take possible inference attacks into consideration, as the model parameters may expose sensitive information about the training data. One can adopt differential privacy [60, 40, 130] to inject random noise into the parameters, or use SMC [22, 72, 23] to exchange encrypted parameters. If the manager cannot be trusted at all, then we can use trusted execution environments [37] to execute the code in the manager. Blockchain is also an option to play the role of the manager [93].

5.3 Efficiency

Efficiency is an important factor in the success of many existing systems such as XGBoost [34] and ThunderSVM [198]. Since federated learning involves multiple rounds of training and communication, the computation and communication costs may be large, which raises the barrier to using FLSs.

Figure 8: The design factors of FLSs

To increase efficiency, the most effective way is to address the bottleneck. If the bottleneck lies in the computation, we can use powerful hardware such as GPUs [42] and TPUs [83].
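The effectiveness and privacy choices above can be sketched together in a few lines: a FedAvg-style server step that averages local models weighted by local dataset sizes, with optional Gaussian noise standing in for a differentially private mechanism. This is an illustrative stdlib sketch only; a real DP guarantee additionally requires update clipping and a noise scale calibrated to a privacy budget, and the function name and toy numbers are our own.

```python
import random

def fedavg_step(local_models, n_samples, dp_sigma=0.0, seed=0):
    """One FedAvg server step: average local parameter vectors weighted by
    local dataset size; optionally perturb each coordinate with Gaussian
    noise (a crude stand-in for a calibrated DP mechanism)."""
    rng = random.Random(seed)
    total = sum(n_samples)
    dim = len(local_models[0])
    global_model = []
    for i in range(dim):
        avg = sum(w * m[i] for m, w in zip(local_models, n_samples)) / total
        global_model.append(avg + rng.gauss(0.0, dp_sigma))
    return global_model

# Two parties holding 100 and 300 samples respectively.
g = fedavg_step([[1.0, 2.0], [3.0, 4.0]], n_samples=[100, 300])
# weighted average: [(100*1 + 300*3)/400, (100*2 + 300*4)/400] = [2.5, 3.5]
```

Setting `dp_sigma > 0` trades model quality for privacy, which is exactly the tension between DP and model performance noted in the summary of Section 4.2.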
If the bottleneck lies in the communication, compression techniques [18, 95, 164] can be applied to reduce the communication size.

5.4 Autonomy

Like federated databases [166], a practical FLS has to consider the autonomy of the parties. The parties may drop out (e.g., due to network failure) during the FL process, especially in the cross-device setting where the scale is large and the parties are unreliable [85]. Thus, the FLS should be robust and stable, tolerating the failure of parties or reducing the number of failure cases. Google has developed a practical FLS [24]. In their system, they monitor devices' health statistics to avoid wasting devices' battery or bandwidth. Also, the system will complete the current round or restart from the results of the previously committed round if there are failures. Zhang et al. [214] propose a blockchain-based approach to detect device disconnection. Robust secure aggregation [17] is applicable to protect the communicated messages in case of party dropout. Besides disconnection issues, the parties may be selfish and unwilling to share models of good quality. Incentive mechanisms [87, 88] can encourage the participation of the parties and improve the final model quality.

5.5 The Design Reference

Based on our taxonomy shown in Section 3 and the design factors shown in Figure 8, we derive a simple design reference for developing an FLS. The first step is to identify the participating entities and the task, which significantly influence the system design. The participating entities determine the communication architecture, the data partitioning and the scale of the federation. The task determines the suitable machine learning models to train. Then, we can choose or design a suitable FL algorithm according to the above attributes and Table 1.
After fixing the FL algorithm, to satisfy the privacy requirements, we may determine the privacy mechanisms used to protect the communicated messages. Compared with SMC, DP is preferred if efficiency is more important than model performance. Last, an incentive mechanism can be considered to enhance the system. Existing systems [74, 24] usually do not support incentive mechanisms. However, incentive mechanisms can encourage the parties to participate and contribute in the system and make the system more attractive. The Shapley value [193, 187] is a fair approach that can be considered.

Table 3: Requirements of the real-world federated systems

System Aspect              | Mobile Service      | Healthcare         | Financial
Data Partitioning          | Horizontal          | Hybrid             | Vertical
Machine Learning Model     | No specific models  | No specific models | No specific models
Scale of Federation        | Cross-device        | Cross-silo         | Cross-silo
Communication Architecture | Centralized         | Distributed        | Distributed
Privacy Mechanism          | DP                  | DP/SMC             | DP/SMC
Motivation of Federation   | Incentive motivated | Policy motivated   | Interest motivated

For real-world applications of federated learning systems, please refer to Section 4 of the supplementary material.

5.6 Evaluation

The evaluation of FLSs is very challenging. According to the system factors we studied, it has to cover the following aspects: (1) model performance, (2) system security, (3) system efficiency, and (4) system robustness. For the evaluation of the model, there are two different settings. One is to evaluate the performance (e.g., prediction accuracy) of the final global model on a global dataset. The other is to evaluate the performance of the final local models on the corresponding non-IID local datasets. The evaluation setting depends on the objective of FL, i.e., to learn a global model or to learn personalized local models.
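Related both to the incentive mechanisms above and to model-performance evaluation, a party's contribution to the federation can be quantified with the Shapley value mentioned in the design reference: average a party's marginal gain over all orders in which parties could join. The brute-force stdlib sketch below is illustrative; the party names and valuation numbers are made up, and exact enumeration is only feasible for a handful of parties.

```python
from itertools import permutations

def shapley(parties, value):
    """Exact Shapley values: average each party's marginal contribution
    over all join orders (O(n!) -- for small cross-silo federations only)."""
    shap = {p: 0.0 for p in parties}
    perms = list(permutations(parties))
    for order in perms:
        coalition = []
        prev = value(frozenset())
        for p in order:
            coalition.append(p)
            cur = value(frozenset(coalition))
            shap[p] += cur - prev
            prev = cur
    return {p: s / len(perms) for p, s in shap.items()}

# Hypothetical valuation: accuracy gain of a model trained on a coalition's
# data. Here the gains are additive, so Shapley recovers each party's own gain.
gains = {"A": 0.10, "B": 0.06, "C": 0.02}
accuracy_gain = lambda coalition: sum(gains[p] for p in coalition)
phi = shapley(["A", "B", "C"], accuracy_gain)
```

In practice the valuation function is itself expensive (each coalition means retraining or re-evaluating a model), so approximate or gradient-based Shapley estimates are typically used instead.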
While a theoretical security/privacy guarantee is a good evaluation metric for system security, another way is to conduct membership inference attacks [167] or model inversion attacks [56] to test the system's security. These attacks can be conducted in two ways: (1) white-box attack, where the attacker has access to all the exchanged models during the FL process; and (2) black-box attack, where the attacker only has access to the final output model. The attack success ratio can serve as an evaluation metric for system security. The efficiency of the system includes two parts: computation efficiency and communication efficiency. An intuitive metric is the training time, including the computation and communication time. Note that FL is usually a multi-round process. Thus, for a fair comparison, one approach is to use time per round as a metric. Another approach is to record the time or number of rounds needed to achieve the same target performance [90, 107]. It is challenging to quantify the robustness of an FLS. A possible solution is to use a metric similar to that of robust secure aggregation, i.e., the maximum number of disconnected parties that can be tolerated during the FL process.

6 Case Study

In this section, we present several real-world applications of FL according to our taxonomy, as summarized in Table 3.

6.1 Mobile Service

Many corporations provide prediction services to their mobile users, such as Google Keyboard [208] and Apple's emoji suggestion and QuickType [179]. These services bring much convenience to the users. However, the training data come from users' edge devices, such as smartphones. If the company collects data from all users and trains a global model, it might potentially cause privacy leakage. On the other hand, the data of any single user are insufficient to train an accurate prediction model. FL enables these companies to train an accurate prediction model without accessing users' original data, thus protecting users' privacy.
In the framework of FLSs, the users compute and send their local models instead of their original data. That means a Google Keyboard user can enjoy accurate prediction of the next word while not sharing his/her input history. If FLSs can be widely applied to such prediction services, there will be much less data leakage, since data always stay at the edge.

In such a scenario, data are usually horizontally split across millions of devices. Hence, the limited computational resources of a single device and the bandwidth are two major problems. Besides, the robustness of the system should also be considered, since a user could join or leave the system at any time. In other words, a centralized, cross-device FLS on horizontal data should be designed for such prediction services. Although the basic framework of an FLS provides some protection of individuals' privacy, it may not be secure against inference attacks [167]. Additional privacy mechanisms like differential privacy should be leveraged to ensure the indistinguishability of individuals. Here, secure multi-party computation may not be appropriate, since each device has weak computation capacity and cannot afford expensive encryption operations. Apart from guaranteeing users' privacy, incentive mechanisms should be developed to encourage users to contribute their data. In reality, these incentives could be vouchers or additional services.

6.2 Healthcare

Modern health systems require cooperation among research institutes, hospitals, and federal agencies to improve the health care of the nation [57]. Moreover, collaborative research among countries is vital when facing global health emergencies like COVID-19 [6]. These health systems mostly aim to train a model for the diagnosis of a disease, and these diagnostic models should be as accurate as possible. However, patients' information is not allowed to be transferred under regulations such as GDPR [10].
The privacy of data is an even greater concern in international collaboration. Without solving the privacy issue, such collaborative research could stagnate, threatening public health. Data privacy in such collaboration is largely based on confidentiality agreements. However, this solution is based on "trust", which is not reliable. FL makes the cooperation possible because it can ensure privacy in a provable and reliable way. Then, every hospital or institute only has to share local models to obtain an accurate model for diagnosis. In such a scenario, the health care data are partitioned both horizontally and vertically: each party holds the health data of residents for a specific purpose (e.g., patient treatment), but the features used in each party are diverse. The number of parties is limited, and each party usually has plenty of computational resources. In other words, a private FLS on hybrid partitioned data is required. One of the most challenging problems is how to train on the hybrid partitioned data; the design of the FLS can be much more complicated than that of a simple horizontal system. In a healthcare federation, there is probably no central server, so another challenge is the design of a decentralized FLS, which should also be robust against dishonest or malicious parties. Moreover, the privacy concern can be addressed by additional mechanisms such as secure multi-party computation and differential privacy. The collaboration is largely motivated by regulations.

6.3 Finance

A financial federation consists of banks, insurance companies, etc. They often hope to cooperate in daily financial operations. For example, a 'bad' user might pay back a loan in one bank with the money borrowed from another bank. All the banks want to avoid such malicious behavior while not revealing their customers' information.
Also, insurance companies want to learn from the banks about the reputation of their customers. However, a leakage of 'good' customers' information may cause a loss of interest or legal issues. This kind of cooperation could happen if there were a trusted third party such as the government. However, in many cases the government is not involved in the federation, or the government is not always trusted. Thus, an FLS with privacy mechanisms can be introduced, in which the privacy of each bank is guaranteed by theoretically proven privacy mechanisms. In such a scenario, financial data are often vertically partitioned and linked by user IDs. Training a classifier on vertically partitioned data is quite challenging. Generally, the training process can be divided into two parts: privacy-preserving record linkage [183] and vertical federated training. The first part aims to find the links between vertically partitioned data, and it has been well studied. The second part aims to train on the linked data without sharing the original data of each party, which still remains a challenge. The cross-silo and decentralized settings apply in this federation. Also, some privacy mechanisms should be adopted in this scenario, and the participants can be motivated by their interests.

7 Vision

In this section, we show interesting directions to work on in the future. Although some of these directions are already covered by the existing studies introduced in Section 4, we believe they are important and provide more insights on them.

7.1 Heterogeneity

The heterogeneity of the parties is an important characteristic of FLSs. Basically, the parties can differ in accessibility, privacy requirements, contribution to the federation, and reliability. Thus, it is important to consider such practical issues in FLSs.

Dynamic scheduling. Due to the instability of the parties, the number of parties may not be fixed during the learning process.
However, the number of parties is fixed in many existing studies, which do not consider the entry of new parties or the departure of current parties. The system should support dynamic scheduling and have the ability to adjust its strategy when the number of parties changes. There are some studies addressing this issue; for example, Google's system [24] can tolerate the drop-out of devices. Also, the emergence of blockchain [223] can provide an ideal and transparent platform for multi-party learning. However, to the best of our knowledge, no work studies an increasing number of parties during federated learning. In such a case, more attention may need to be paid to the later parties, as the current global model may have already been well trained on the existing parties.

Diverse privacy restrictions. Little work has considered the privacy heterogeneity of FLSs, where the parties have different privacy requirements. The existing systems adopt techniques to protect the model parameters or gradients of all the parties at the same level. However, the privacy restrictions of the parties usually differ in reality. It would be interesting to design an FLS that treats the parties differently according to their privacy restrictions. The learned model should have a better performance if we can maximize the utilization of the data of each party while not violating its privacy restrictions. Heterogeneous differential privacy [9] may be useful in such settings, where users have different privacy attitudes and expectations.

Intelligent benefits. Intuitively, one party should gain more from the FLS if it contributes more information. Existing incentive mechanisms are mostly based on Shapley values [193, 187], whose computation overhead is a major concern. A computation-efficient and fair incentive mechanism remains to be developed.
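The computation overhead of Shapley-based incentives comes from enumerating every coalition of parties, as the exact definition requires. A minimal sketch for a handful of parties follows; the party names and the additive toy utility are purely illustrative assumptions.

```python
from itertools import combinations
from math import factorial

def shapley_values(parties, utility):
    """Exact Shapley values for a small set of parties.

    `utility` maps a frozenset of parties to the value (e.g., test accuracy)
    of a model trained on their combined data.  The exponential enumeration
    below is precisely the overhead the survey points out; it is only
    feasible for a handful of parties.
    """
    n = len(parties)
    values = {}
    for p in parties:
        others = [q for q in parties if q != p]
        total = 0.0
        for r in range(n):
            for coalition in combinations(others, r):
                s = frozenset(coalition)
                # Weight of a coalition of size r in the Shapley formula.
                weight = factorial(r) * factorial(n - r - 1) / factorial(n)
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values

# Toy utility: party A's data is twice as useful as B's or C's.
data_value = {"A": 2.0, "B": 1.0, "C": 1.0}
util = lambda s: sum(data_value[q] for q in s)
result = shapley_values(["A", "B", "C"], util)  # ≈ {'A': 2.0, 'B': 1.0, 'C': 1.0}
```

For such an additive utility the Shapley value reduces to each party's own contribution, but in general `utility` must be evaluated on 2^n coalitions, which is why approximation and more efficient incentive schemes are needed.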
7.2 System Development

To boost the development of FLSs, besides the detailed algorithm design, we also need to study FLSs from a high-level view.

System architecture. Like the parameter server [76] in deep learning, which controls parameter synchronization, some common system architectures need to be investigated for FL. Although FedAvg is a widely used framework, its applicable scenarios are still limited. For example, existing studies [129, 107, 108] still adopt model averaging as the model aggregation method, which cannot work if the parties want to train heterogeneous models. We want a general system architecture that provides many aggregation methods and learning algorithms for different settings.

Model market. The model market [182] is a promising platform for model storing, sharing, and selling. An interesting idea is to use the model market for federated learning. A party can buy models to conduct model aggregation locally. Moreover, it can contribute its models to the market with additional information such as the target task. Such a design introduces more flexibility to the federation and is more acceptable for organizations, since FL then simply works like several transactions. A good evaluation of the models is important in such systems, and incentive mechanisms may be helpful [201, 87, 88].

Benchmark. As more FLSs are being developed, a benchmark with representative data sets and workloads is quite important to evaluate the existing systems and direct future development. Although there have been quite a few benchmarks [25, 77, 74], none of them has been widely used in the experiments of federated learning studies. We need a robust benchmark with representative datasets and strict privacy evaluation. Also, a comprehensive set of evaluation metrics, including model performance, system efficiency, system security, and system robustness, is often ignored in existing benchmarks.
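Returning to the system-architecture point above, the model averaging at the core of FedAvg can be sketched as a data-size-weighted mean of the parties' parameter vectors. This is a simplified sketch: real systems operate on tensors and handle client sampling, drop-out, and secure aggregation, all omitted here.

```python
def federated_average(local_models, num_samples):
    """Aggregate local model parameters as in FedAvg.

    local_models: list of parameter vectors (lists of floats), one per party.
    num_samples:  number of local training samples per party, used as the
                  averaging weight.
    """
    total = sum(num_samples)
    dim = len(local_models[0])
    global_model = [0.0] * dim
    for params, n in zip(local_models, num_samples):
        w = n / total  # parties with more data get more weight
        for i, p in enumerate(params):
            global_model[i] += w * p
    return global_model

# Two parties: the one with more data pulls the average toward its model.
g = federated_average([[1.0, 1.0], [3.0, 3.0]], num_samples=[100, 300])
```

Because the aggregation is a plain weighted mean of parameters, it presumes all parties share one model architecture, which is exactly why heterogeneous models need a different aggregation method.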
The evaluation of model performance on non-IID datasets and of system security under data pollution needs more investigation.

Data life cycles. Learning is simply one aspect of a federated system. A data life cycle [151] consists of multiple stages, including data creation, storage, use, sharing, archiving, and destruction. For the data security and privacy of the entire application, we need to invent new data life cycles in the FL context. Although data sharing is clearly one of the stages in focus, the design of FLSs also affects the other stages. For example, data creation may help to prepare the data and features that are suitable for FL.

7.3 FL in Domains

Internet of things. Security and privacy issues have been a hot research area in fog computing and edge computing, due to the increasing deployment of Internet-of-things applications. For more details, readers can refer to some recent surveys [172, 209, 135]. FL can be one potential approach to addressing the data privacy issues while still offering reasonably good machine learning models [113, 138]. The additional key challenges come from the computation and energy constraints, since privacy and security mechanisms introduce runtime overhead. For example, Jiang et al. [80] apply independent Gaussian random projection to improve data privacy, but then the training of a deep network can become too costly, and the authors have to develop a new resource scheduling algorithm to move the workload to nodes with more computational power. Similar issues arise in other environments such as vehicle-to-vehicle networks [160].

Regulations. While FL enables collaborative learning without exposing the raw data, it is still not clear how FL complies with the existing regulations. For example, GDPR imposes limitations on data transfer. Since the model and gradients are actually not safe enough, does such a limitation still apply to the model or gradients?
Also, the "right to explainability" is hard to execute, since the global model is an average of the local models; the explainability of FL models is an open problem [67, 161]. Moreover, if a user wants to delete its data, should the global model be retrained without the data [62]? There is still a gap between the FL techniques and the regulations in reality, and we may expect cooperation between the computer science community and the law community.

8 Conclusion

Many efforts have been devoted to developing federated learning systems (FLSs). A complete overview and summary of existing FLSs is important and meaningful. Inspired by previous federated systems, we have shown that heterogeneity and autonomy are two important factors in the design of practical FLSs. Moreover, we have provided a comprehensive categorization of FLSs along six different aspects. Based on these aspects, we have also presented a comparison of the features and designs of existing FLSs. More importantly, we have pointed out a number of opportunities, ranging from more benchmarks to the integration of emerging platforms such as blockchain. FLSs will be an exciting research direction, which calls for effort from the machine learning, system, and data privacy communities.

Acknowledgement

This work is supported by a MoE AcRF Tier 1 grant (T1 251RES1824), a SenseTime Young Scholars Research Fund, and a MOE Tier 2 grant (MOE2017-T2-1-122) in Singapore.

References

[1] California Consumer Privacy Act Home Page. https://www.caprivacy.org/.
[2] URL https://www.intel.com/content/www/us/en/customer-spotlight/stories/ping-an-sgx-customer-story.html.
[3] Uber settles data breach investigation for $148 million, 2018. URL https://www.nytimes.com/2018/09/26/technology/uber-data-breach.html.
[4] Google is fined $57 million under Europe's data privacy law, 2019. URL https://www.nytimes.com/2019/01/21/technology/google-europe-gdpr-fine.html.
[5] 2019 is a 'fine' year: PDPC has fined S'pore firms a record $1.29m for data breaches, 2019. URL https://vulcanpost.com/676006/pdpc-data-breach-singapore-2019/.
[6] Rolling updates on coronavirus disease (COVID-19), 2020. URL https://www.who.int/emergencies/diseases/novel-coronavirus-2019/events-as-they-happen.
[7] Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. TensorFlow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pages 265–283, 2016.
[8] Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pages 308–318. ACM, 2016.
[9] Mohammad Alaggan, Sébastien Gambs, and Anne-Marie Kermarrec. Heterogeneous differential privacy. arXiv preprint arXiv:1504.06998, 2015.
[10] Jan Philipp Albrecht. How the GDPR will change the world. Eur. Data Prot. L. Rev., 2:287, 2016.
[11] Mohammed Aledhari, Rehma Razzak, Reza M Parizi, and Fahad Saeed. Federated learning: A survey on enabling technologies, protocols, and applications. IEEE Access, 8:140699–140725, 2020.
[12] Scott Alfeld, Xiaojin Zhu, and Paul Barford. Data poisoning attacks against autoregressive models. In Thirtieth AAAI Conference on Artificial Intelligence, 2016.
[13] Eitan Altman, Konstantin Avrachenkov, and Andrey Garnaev. Generalized α-fair resource allocation in wireless networks. In 2008 47th IEEE Conference on Decision and Control, pages 2414–2419. IEEE, 2008.
[14] Muhammad Ammad-ud-din, Elena Ivannikova, Suleiman A Khan, Were Oyomno, Qiang Fu, Kuan Eeik Tan, and Adrian Flanagan. Federated collaborative filtering for privacy-preserving personalized recommendation system.
arXiv preprint, 2019.
[15] Yoshinori Aono, Takuya Hayashi, Lihua Wang, Shiho Moriai, et al. Privacy-preserving deep learning via additively homomorphic encryption. IEEE Transactions on Information Forensics and Security, 13(5):1333–1345, 2018.
[16] Eugene Bagdasaryan, Andreas Veit, Yiqing Hua, Deborah Estrin, and Vitaly Shmatikov. How to backdoor federated learning. In International Conference on Artificial Intelligence and Statistics, pages 2938–2948. PMLR, 2020.
[17] James Henry Bell, Kallista A Bonawitz, Adrià Gascón, Tancrède Lepoint, and Mariana Raykova. Secure single-server aggregation with (poly)logarithmic overhead. In CCS, 2020.
[18] Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, and Anima Anandkumar. signSGD: Compressed optimisation for non-convex problems. arXiv preprint arXiv:1802.04434, 2018.
[19] Arjun Nitin Bhagoji, Supriyo Chakraborty, Prateek Mittal, and Seraphin Calo. Analyzing federated learning through an adversarial lens, 2018.
[20] Abhishek Bhowmick, John Duchi, Julien Freudiger, Gaurav Kapoor, and Ryan Rogers. Protection against reconstruction and its applications in private federated learning. arXiv preprint arXiv:1812.00984, 2018.
[21] Peva Blanchard, Rachid Guerraoui, Julien Stainer, et al. Machine learning with adversaries: Byzantine tolerant gradient descent. In Advances in Neural Information Processing Systems, pages 119–129, 2017.
[22] Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, H Brendan McMahan, Sarvar Patel, Daniel Ramage, Aaron Segal, and Karn Seth. Practical secure aggregation for federated learning on user-held data. arXiv preprint arXiv:1611.04482, 2016.
[23] Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, H Brendan McMahan, Sarvar Patel, Daniel Ramage, Aaron Segal, and Karn Seth. Practical secure aggregation for privacy-preserving machine learning.
In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pages 1175–1191. ACM, 2017.
[24] Keith Bonawitz, Hubert Eichner, Wolfgang Grieskamp, Dzmitry Huba, Alex Ingerman, Vladimir Ivanov, Chloe Kiddon, Jakub Konečný, Stefano Mazzocchi, H Brendan McMahan, et al. Towards federated learning at scale: System design. arXiv preprint arXiv:1902.01046, 2019.
[25] Sebastian Caldas, Peter Wu, Tian Li, Jakub Konečný, H Brendan McMahan, Virginia Smith, and Ameet Talwalkar. LEAF: A benchmark for federated settings. arXiv preprint arXiv:1812.01097, 2018.
[26] Rich Caruana. Multitask learning. Machine Learning, 28(1):41–75, 1997.
[27] Miguel Castro, Barbara Liskov, et al. Practical Byzantine fault tolerance. In OSDI, volume 99, pages 173–186, 1999.
[28] Hervé Chabanne, Amaury de Wargny, Jonathan Milgram, Constance Morel, and Emmanuel Prouff. Privacy-preserving classification on deep neural network. IACR Cryptology ePrint Archive, 2017:35, 2017.
[29] Di Chai, Leye Wang, Kai Chen, and Qiang Yang. Secure federated matrix factorization. arXiv preprint arXiv:1906.05108, 2019.
[30] Di Chai, Leye Wang, Kai Chen, and Qiang Yang. FedEval: A benchmark system with a comprehensive evaluation model for federated learning. arXiv preprint arXiv:2011.09655, 2020.
[31] Kamalika Chaudhuri, Claire Monteleoni, and Anand D Sarwate. Differentially private empirical risk minimization. Journal of Machine Learning Research, 12(Mar):1069–1109, 2011.
[32] David Chaum. The dining cryptographers problem: Unconditional sender and recipient untraceability. Journal of Cryptology, 1(1):65–75, 1988.
[33] Fei Chen, Zhenhua Dong, Zhenguo Li, and Xiuqiang He. Federated meta-learning for recommendation. arXiv preprint, 2018.
[34] Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. In KDD, pages 785–794. ACM, 2016.
[35] Xinyun Chen, Chang Liu, Bo Li, Kimberly Lu, and Dawn Song.
Targeted backdoor attacks on deep learning systems using data poisoning. arXiv preprint, 2017.
[36] Yi-Ruei Chen, Amir Rezapour, and Wen-Guey Tzeng. Privacy-preserving ridge regression on distributed data. Information Sciences, 451:34–49, 2018.
[37] Yu Chen, Fang Luo, Tong Li, Tao Xiang, Zheli Liu, and Jin Li. A training-integrity privacy-preserving federated learning scheme with trusted execution environment. Information Sciences, 522:69–79, 2020.
[38] Kewei Cheng, Tao Fan, Yilun Jin, Yang Liu, Tianjian Chen, and Qiang Yang. SecureBoost: A lossless federated learning framework. arXiv preprint arXiv:1901.08755, 2019.
[39] Warren B Chik. The Singapore Personal Data Protection Act and an assessment of future trends in data privacy reform. Computer Law & Security Review, 29(5):554–575, 2013.
[40] Olivia Choudhury, Aris Gkoulalas-Divanis, Theodoros Salonidis, Issa Sylla, Yoonyoung Park, Grace Hsu, and Amar Das. Differential privacy-enabled federated learning for sensitive health data. arXiv preprint arXiv:1910.02578, 2019.
[41] Peter Christen. Data Matching: Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection. Springer Science & Business Media, 2012.
[42] Shane Cook. CUDA Programming: A Developer's Guide to Parallel Computing with GPUs. Newnes, 2012.
[43] Luca Corinzia and Joachim M Buhmann. Variational federated multi-task learning. arXiv preprint arXiv:1906.06268, 2019.
[44] Zhongxiang Dai, Kian Hsiang Low, and Patrick Jaillet. Federated Bayesian optimization via Thompson sampling. NeurIPS, 2020.
[45] Mayur Datar, Nicole Immorlica, Piotr Indyk, and Vahab S Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. In Proceedings of the Twentieth Annual Symposium on Computational Geometry, pages 253–262. ACM, 2004.
[46] Canh T Dinh, Nguyen H Tran, and Tuan Dung Nguyen. Personalized federated learning with Moreau envelopes. arXiv preprint, 2020.
[47] Moming Duan. Astraea: Self-balancing federated learning for improving classification accuracy of mobile deep learning applications. arXiv preprint, 2019.
[48] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography Conference, pages 265–284. Springer, 2006.
[49] Cynthia Dwork, Aaron Roth, et al. The algorithmic foundations of differential privacy. Foundations and Trends® in Theoretical Computer Science, 9(3–4):211–407, 2014.
[50] Khaled El Emam and Fida Kamal Dankar. Protecting privacy using k-anonymity. Journal of the American Medical Informatics Association, 15(5):627–637, 2008.
[51] Ittay Eyal, Adem Efe Gencer, Emin Gun Sirer, and Robbert Van Renesse. Bitcoin-NG: A scalable blockchain protocol. In 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI 16), pages 45–59, Santa Clara, CA, March 2016. USENIX Association. ISBN 978-1-931971-29-4. URL https://www.usenix.org/conference/nsdi16/technical-sessions/presentation/eyal.
[52] Alireza Fallah, Aryan Mokhtari, and Asuman Ozdaglar. Personalized federated learning with theoretical guarantees: A model-agnostic meta-learning approach. Advances in Neural Information Processing Systems, 33, 2020.
[53] Minghong Fang, Xiaoyu Cao, Jinyuan Jia, and Neil Gong. Local model poisoning attacks to Byzantine-robust federated learning. In USENIX, 2020.
[54] Ji Feng, Yang Yu, and Zhi-Hua Zhou. Multi-layered gradient boosting decision trees. In Advances in Neural Information Processing Systems, pages 3551–3561, 2018.
[55] Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1126–1135. JMLR.org, 2017.
[56] Matt Fredrikson, Somesh Jha, and Thomas Ristenpart.
Model inversion attacks that exploit confidence information and basic countermeasures. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pages 1322–1333. ACM, 2015.
[57] Charles P Friedman, Adam K Wong, and David Blumenthal. Achieving a nationwide learning health system. Science Translational Medicine, 2(57):57cm29–57cm29, 2010.
[58] Jonas Geiping, Hartmut Bauermeister, Hannah Dröge, and Michael Moeller. Inverting gradients – how easy is it to break privacy in federated learning? arXiv preprint arXiv:2003.14053, 2020.
[59] Samuel J Gershman and David M Blei. A tutorial on Bayesian nonparametric models. Journal of Mathematical Psychology, 56(1):1–12, 2012.
[60] Robin C Geyer, Tassilo Klein, and Moin Nabi. Differentially private federated learning: A client level perspective. arXiv preprint, 2017.
[61] Avishek Ghosh, Jichan Chung, Dong Yin, and Kannan Ramchandran. An efficient framework for clustered federated learning. arXiv preprint, 2020.
[62] Antonio Ginart, Melody Guan, Gregory Valiant, and James Y Zou. Making AI forget you: Data deletion in machine learning. In Advances in Neural Information Processing Systems, pages 3513–3526, 2019.
[63] Oded Goldreich. Secure multi-party computation. Manuscript. Preliminary version, 78, 1998.
[64] Slawomir Goryczka and Li Xiong. A comprehensive comparison of multiparty secure additions with differential privacy. IEEE Transactions on Dependable and Secure Computing, 14(5):463–477, 2015.
[65] Klaus Greff, Rupesh K Srivastava, Jan Koutník, Bas R Steunebrink, and Jürgen Schmidhuber. LSTM: A search space odyssey. IEEE Transactions on Neural Networks and Learning Systems, 28(10):2222–2232, 2016.
[66] Antonio Gulli and Sujit Pal. Deep Learning with Keras. Packt Publishing Ltd, 2017.
[67] David Gunning. Explainable artificial intelligence (XAI). Defense Advanced Research Projects Agency (DARPA), nd Web, 2, 2017.
[68] Filip Hanzely and Peter Richtárik. Federated learning of a mixture of global and local models. arXiv preprint arXiv:2002.05516, 2020.
[69] Filip Hanzely, Slavomír Hanzely, Samuel Horváth, and Peter Richtárik. Lower bounds and optimal algorithms for personalized federated learning. arXiv preprint, 2020.
[70] Tianshu Hao, Yunyou Huang, Xu Wen, Wanling Gao, Fan Zhang, Chen Zheng, Lei Wang, Hainan Ye, Kai Hwang, Zujie Ren, et al. Edge AIBench: Towards comprehensive end-to-end edge computing benchmarking. arXiv preprint, 2019.
[71] Andrew Hard, Kanishka Rao, Rajiv Mathews, Françoise Beaufays, Sean Augenstein, Hubert Eichner, Chloé Kiddon, and Daniel Ramage. Federated learning for mobile keyboard prediction. arXiv preprint arXiv:1811.03604, 2018.
[72] Stephen Hardy, Wilko Henecka, Hamish Ivey-Law, Richard Nock, Giorgio Patrini, Guillaume Smith, and Brian Thorne. Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption. arXiv preprint arXiv:1711.10677, 2017.
[73] Chaoyang He, Murali Annavaram, and Salman Avestimehr. Group knowledge transfer: Federated learning of large CNNs at the edge. Advances in Neural Information Processing Systems, 33, 2020.
[74] Chaoyang He, Songze Li, Jinhyun So, Mi Zhang, Hongyi Wang, Xiaoyang Wang, Praneeth Vepakomma, Abhishek Singh, Hang Qiu, Li Shen, et al. FedML: A research library and benchmark for federated machine learning. arXiv preprint, 2020.
[75] Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531, 2015.
[76] Qirong Ho, James Cipar, Henggang Cui, Seunghak Lee, Jin Kyu Kim, Phillip B Gibbons, Garth A Gibson, Greg Ganger, and Eric P Xing. More effective distributed ML via a stale synchronous parallel parameter server. In Advances in Neural Information Processing Systems, pages 1223–1231, 2013.
[77] Sixu Hu, Yuan Li, Xu Liu, Qinbin Li, Zhaomin Wu, and Bingsheng He. The OARF benchmark suite: Characterization and implications for federated learning systems. arXiv preprint arXiv:2006.07856, 2020.
[78] Yaochen Hu, Di Niu, Jianming Yang, and Shengping Zhou. FDML: A collaborative machine learning framework for distributed features. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 2232–2240, 2019.
[79] Eunjeong Jeong, Seungeun Oh, Hyesung Kim, Jihong Park, Mehdi Bennis, and Seong-Lyun Kim. Communication-efficient on-device machine learning: Federated distillation and augmentation under non-IID private data. arXiv preprint, 2018.
[80] Linshan Jiang, Rui Tan, Xin Lou, and Guosheng Lin. On lightweight privacy-preserving collaborative learning for internet-of-things objects. In Proceedings of the International Conference on Internet of Things Design and Implementation, IoTDI '19, pages 70–81, New York, NY, USA, 2019. ACM. ISBN 978-1-4503-6283-2. doi: 10.1145/3302505.3310070. URL http://doi.acm.org/10.1145/3302505.3310070.
[81] Yihan Jiang, Jakub Konečný, Keith Rush, and Sreeram Kannan. Improving federated learning personalization via model agnostic meta learning. arXiv preprint, 2019.
[82] Rie Johnson and Tong Zhang. Accelerating stochastic gradient descent using predictive variance reduction. In Advances in Neural Information Processing Systems, pages 315–323, 2013.
[83] Norman P Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, Suresh Bhatia, Nan Boden, Al Borchers, et al. In-datacenter performance analysis of a tensor processing unit. In Proceedings of the 44th Annual International Symposium on Computer Architecture, pages 1–12, 2017.
[84] R. Jurca and B. Faltings. An incentive compatible reputation mechanism. In IEEE International Conference on E-Commerce, 2003. CEC 2003., pages 285–292, June 2003.
doi: 10.1109/COEC.2003.1210263.
[85] Peter Kairouz, H Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Keith Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, et al. Advances and open problems in federated learning. arXiv preprint, 2019.
[86] Georgios Kaissis, Alexander Ziller, Jonathan Passerat-Palmbach, Théo Ryffel, Dmitrii Usynin, Andrew Trask, Ionésio Lima, Jason Mancuso, Friederike Jungmann, Marc-Matthias Steinborn, et al. End-to-end privacy preserving deep learning on multi-institutional medical imaging. Nature Machine Intelligence, pages 1–12, 2021.
[87] Jiawen Kang, Zehui Xiong, Dusit Niyato, Shengli Xie, and Junshan Zhang. Incentive mechanism for reliable federated learning: A joint optimization approach to combining reputation and contract theory. IEEE Internet of Things Journal, 2019.
[88] Jiawen Kang, Zehui Xiong, Dusit Niyato, Han Yu, Ying-Chang Liang, and Dong In Kim. Incentive design for efficient federated learning in mobile networks: A contract theory approach. arXiv preprint arXiv:1905.07479, 2019.
[89] Murat Kantarcioglu and Chris Clifton. Privacy-preserving distributed mining of association rules on horizontally partitioned data. IEEE Transactions on Knowledge & Data Engineering, (9):1026–1037, 2004.
[90] Sai Praneeth Karimireddy, Satyen Kale, Mehryar Mohri, Sashank Reddi, Sebastian Stich, and Ananda Theertha Suresh. SCAFFOLD: Stochastic controlled averaging for federated learning. In International Conference on Machine Learning, pages 5132–5143. PMLR, 2020.
[91] Alan F Karr, Xiaodong Lin, Ashish P Sanil, and Jerome P Reiter. Privacy-preserving analysis of vertically partitioned data using secure matrix products. Journal of Official Statistics, 25(1):125, 2009.
[92] Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. LightGBM: A highly efficient gradient boosting decision tree. In NIPS, 2017.
[93] Hyesung Kim, Jihong Park, Mehdi Bennis, and Seong-Lyun Kim. On-device federated learning via blockchain and its latency analysis. arXiv preprint arXiv:1808.03949, 2018.
[94] Jakub Konečný, H Brendan McMahan, Daniel Ramage, and Peter Richtárik. Federated optimization: Distributed machine learning for on-device intelligence. arXiv preprint arXiv:1610.02527, 2016.
[95] Jakub Konečný, H Brendan McMahan, Felix X Yu, Peter Richtárik, Ananda Theertha Suresh, and Dave Bacon. Federated learning: Strategies for improving communication efficiency. arXiv preprint arXiv:1610.05492, 2016.
[96] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
[97] Tobias Kurze, Markus Klems, David Bermbach, Alexander Lenk, Stefan Tai, and Marcel Kunze. Cloud federation. Cloud Computing, 2011:32–38, 2011.
[98] David Leroy, Alice Coucke, Thibaut Lavril, Thibault Gisselbrecht, and Joseph Dureau. Federated learning for keyword spotting. In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 6341–6345. IEEE, 2019.
[99] Bo Li, Yining Wang, Aarti Singh, and Yevgeniy Vorobeychik. Data poisoning attacks on factorization-based collaborative filtering. In Advances in Neural Information Processing Systems, pages 1885–1893, 2016.
[100] Liping Li, Wei Xu, Tianyi Chen, Georgios B Giannakis, and Qing Ling. RSA: Byzantine-robust stochastic aggregation methods for distributed learning from heterogeneous datasets. In AAAI, 2019.
[101] Peilong Li, Yan Luo, Ning Zhang, and Yu Cao. HeteroSpark: A heterogeneous CPU/GPU Spark platform for machine learning algorithms. In 2015 IEEE International Conference on Networking, Architecture and Storage (NAS), pages 347–348. IEEE, 2015.
[102] Qinbin Li, Zeyi Wen, and Bingsheng He.
Adaptive kernel value caching for SVM training. IEEE Transactions on Neural Networks and Learning Systems, 2019.
[103] Qinbin Li, Bingsheng He, and Dawn Song. Model-agnostic round-optimal federated learning via knowledge transfer. arXiv preprint, 2020.
[104] Qinbin Li, Zeyi Wen, and Bingsheng He. Practical federated gradient boosting decision trees. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 4642–4649, 2020.
[105] Qinbin Li, Zhaomin Wu, Zeyi Wen, and Bingsheng He. Privacy-preserving gradient boosting decision trees. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 784–791, 2020.
[106] Qinbin Li, Yiqun Diao, Quan Chen, and Bingsheng He. Federated learning on non-IID data silos: An experimental study. arXiv preprint arXiv:2102.02079, 2021.
[107] Qinbin Li, Bingsheng He, and Dawn Song. Model-contrastive federated learning. In CVPR, 2021.
[108] Tian Li, Anit Kumar Sahu, Manzil Zaheer, Maziar Sanjabi, Ameet Talwalkar, and Virginia Smith. Federated optimization in heterogeneous networks. arXiv preprint arXiv:1812.06127, 2018.
[109] Tian Li, Anit Kumar Sahu, Ameet Talwalkar, and Virginia Smith. Federated learning: Challenges, methods, and future directions, 2019.
[110] Tian Li, Maziar Sanjabi, and Virginia Smith. Fair resource allocation in federated learning. arXiv preprint arXiv:1905.10497, 2019.
[111] Xiang Li, Kaixuan Huang, Wenhao Yang, Shusen Wang, and Zhihua Zhang. On the convergence of FedAvg on non-IID data. arXiv preprint arXiv:1907.02189, 2019.
[112] Hans Albert Lianto, Yang Zhao, and Jun Zhao. Attacks to federated learning: Responsive web user interface to recover training data from user gradients. In The ACM Asia Conference on Computer and Communications Security (ASIA CCS), 2020.
[113] Wei Yang Bryan Lim, Nguyen Cong Luong, Dinh Thai Hoang, Yutao Jiao, Ying-Chang Liang, Qiang Yang, Dusit Niyato, and Chunyan Miao.
Federated learning in mobile edge networks: A comprehensi ve surv ey , 2019. [114] T ao Lin, Lingjing K ong, Sebastian U Stich, and Martin Jaggi. Ensemble distillation for rob ust model fusion in federated learning. arXiv preprint , 2020. [115] Boyi Liu, Lujia W ang, Ming Liu, and Chengzhong Xu. Lifelong federated reinforcement learning: a learning architecture for na vigation in cloud robotic systems. arXiv pr eprint arXiv:1901.06455 , 2019. 36 [116] Jian Liu, Mika Juuti, Y ao Lu, and Nadarajah Asokan. Oblivious neural network predictions via minionn transformations. In Pr oceedings of the 2017 A CM SIGSA C Conference on Computer and Communications Security , pages 619–631. A CM, 2017. [117] Lifeng Liu, Fengda Zhang, Jun Xiao, and Chao W u. Evaluation framework for lar ge-scale federated learning. arXiv preprint , 2020. [118] Lumin Liu, Jun Zhang, SH Song, and Khaled B Letaief. Edge-assisted hierarchical federated learning with non-iid data. arXiv preprint , 2019. [119] Y ang Liu, Tianjian Chen, and Qiang Y ang. Secure federated transfer learning. arXiv pr eprint arXiv:1812.03337 , 2018. [120] Y ang Liu, Y an Kang, Xinwei Zhang, Liping Li, Y ong Cheng, T ianjian Chen, Mingyi Hong, and Qiang Y ang. A communication ef ficient vertical federated learning framework. arXiv pr eprint arXiv:1912.11187 , 2019. [121] Y ang Liu, Y ingting Liu, Zhijie Liu, Junbo Zhang, Chuishi Meng, and Y u Zheng. Federated forest. arXiv pr eprint arXiv:1905.10053 , 2019. [122] Y ang Liu, Zhuo Ma, Ximeng Liu, Siqi Ma, Surya Nepal, and Robert Deng. Boosting pri- v ately: Priv acy-preserving federated extreme boosting for mobile cro wdsensing. arXiv preprint arXiv:1907.10218 , 2019. [123] Noel Lopes and Bernardete Ribeiro. Gpumlib: An ef ficient open-source gpu machine learning library . International Journal of Computer Information Systems and Industrial Management Applications , 3:355–362, 2011. 
[124] Jiahuan Luo, Xue yang W u, Y un Luo, Anb u Huang, Y unfeng Huang, Y ang Liu, and Qiang Y ang. Real-world image datasets for federated learning. arXiv pr eprint arXiv:1910.11089 , 2019. [125] Lingjuan L yu, Han Y u, and Qiang Y ang. Threats to federated learning: A survey . arXiv pr eprint arXiv:2003.02133 , 2020. [126] Chenxin Ma, Jakub K one ˇ cn ` y, Martin Jaggi, V irginia Smith, Michael I Jordan, Peter Richt ´ arik, and Martin T ak ´ a ˇ c. Distributed optimization with arbitrary local solvers. optimization Methods and Softwar e , 32(4):813–848, 2017. [127] Dhruv Mahajan, Ross Girshick, V ignesh Ramanathan, Kaiming He, Manohar Paluri, Y ixuan Li, Ashwin Bharambe, and Laurens van der Maaten. Exploring the limits of weakly supervised pretraining. In Pr oceedings of the Eur opean Confer ence on Computer V ision (ECCV) , pages 181–196, 2018. [128] Othmane Marfoq, Chuan Xu, Giov anni Neglia, and Richard V idal. Throughput-optimal topology design for cross-silo federated learning. arXiv preprint , 2020. [129] H Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, et al. Communication-ef ficient learning of deep networks from decentralized data. arXiv pr eprint arXiv:1602.05629 , 2016. [130] H Brendan McMahan, Daniel Ramage, Kunal T alwar , and Li Zhang. Learning differentially priv ate recurrent language models. arXiv preprint , 2017. [131] Luca Melis, Congzheng Song, Emiliano De Cristofaro, and V italy Shmatiko v . Exploiting unin- tended feature leakage in collaborati ve learning. In 2019 IEEE Symposium on Security and Privacy (SP) , pages 691–706. IEEE, 2019. [132] El Mahdi El Mhamdi, Rachid Guerraoui, and S ´ ebastien Rouault. The hidden vulnerability of distributed learning in byzantium. arXiv pr eprint arXiv:1802.07927 , 2018. 37 [133] V olodymyr Mnih, K oray Kavukcuoglu, Da vid Silver , Andrei A Rusu, Joel V eness, Marc G Bellemare, Alex Gra ves, Martin Riedmiller , Andreas K Fidjeland, Georg Ostrovski, et al. 
Human- le vel control through deep reinforcement learning. Natur e , 518(7540):529–533, 2015. [134] Mehryar Mohri, Gary Sivek, and Ananda Theertha Suresh. Agnostic federated learning. arXiv pr eprint arXiv:1902.00146 , 2019. [135] M. Mukherjee, R. Matam, L. Shu, L. Maglaras, M. A. Ferrag, N. Choudhury, and V . K umar. Security and pri vacy in fog computing: Challenges. IEEE Access , 5:19293–19304, 2017. doi: 10.1109/A CCESS.2017.2749422. [136] Moni Naor, Benny Pinkas, and Reuban Sumner . Pri vacy preserving auctions and mechanism design. In Pr oceedings of the 1st ACM Confer ence on Electr onic Commerce , EC ’99, pages 129–139, Ne w Y ork, NY , USA, 1999. A CM. ISBN 1-58113-176-3. doi: 10.1145/336992.337028. URL http://doi.acm.org/10.1145/336992.337028. [137] Milad Nasr , Reza Shokri, and Amir Houmansadr . Comprehensi ve priv acy analysis of deep learning: Passi ve and activ e white-box inference attacks against centralized and federated learning. In Compr ehensive Privacy Analysis of Deep Learning: P assive and Active White-box Infer ence Attacks a gainst Centralized and F ederated Learning , page 0. IEEE, 2019. [138] Thien Duc Nguyen, Samuel Marchal, Markus Miettinen, Hossein Fereidooni, N. Asokan, and Ahmad-Reza Sadeghi. D ¨ Iot: A federated self-learning anomaly detection system for iot, 2018. [139] Alex Nichol and John Schulman. Reptile: a scalable metalearning algorithm. arXiv pr eprint arXiv:1803.02999 , 2:2, 2018. [140] Solmaz Niknam, Harpreet S Dhillon, and Jeffery H Reed. Federated learning for wireless commu- nications: Motiv ation, opportunities and challenges. arXiv pr eprint arXiv:1908.06847 , 2019. [141] V aleria Nikolaenko, Udi W einsberg, Stratis Ioannidis, Marc Joye, Dan Boneh, and Nina T aft. Pri v acy-preserving ridge re gression on hundreds of millions of records. In 2013 IEEE Symposium on Security and Privacy , pages 334–348. IEEE, 2013. [142] Adrian Nilsson, Simon Smith, Gregor Ulm, Emil Gusta vsson, and Mats Jirstrand. 
A performance e v aluation of federated learning algorithms. In Pr oceedings of the Second W orkshop on Distributed Infrastructur es for Deep Learning , pages 1–8. A CM, 2018. [143] T akayuki Nishio and Ryo Y onetani. Client selection for federated learning with heterogeneous resources in mobile edge. In ICC 2019-2019 IEEE International Confer ence on Communications (ICC) , pages 1–7. IEEE, 2019. [144] Richard Nock, Stephen Hardy , W ilko Henecka, Hamish Ive y-Law , Giorgio Patrini, Guillaume Smith, and Brian Thorne. Entity resolution and federated learning get a federated resolution. arXiv pr eprint arXiv:1803.04035 , 2018. [145] Olga Ohrimenko, Felix Schuster , C ´ edric Fournet, Aastha Mehta, Sebastian Now ozin, Kapil V aswani, and Manuel Costa. Oblivious multi-party machine learning on trusted processors. In 25th { USENIX } Security Symposium ( { USENIX } Security 16) , pages 619–636, 2016. [146] Pascal Paillier . Public-key cryptosystems based on composite degree residuosity classes. In International Confer ence on the Theory and Applications of Crypto graphic T echniques , pages 223–238. Springer , 1999. [147] Sinno Jialin Pan and Qiang Y ang. A surve y on transfer learning. IEEE T ransactions on knowledge and data engineering , 22(10):1345–1359, 2010. 38 [148] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Y ang, Zachary DeV ito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer . Automatic differentiation in p ytorch. 2017. [149] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer , James Bradbury , Gregory Chanan, T rev or Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. Pytorch: An imperative style, high-performance deep learning library . In Advances in Neural Information Pr ocessing Systems , pages 8024–8035, 2019. [150] Robi Polikar . Ensemble learning. In Ensemble machine learning . Springer , 2012. [151] Neoklis Polyzotis, Sudip Roy , Steven Euijong Whang, and Martin Zinke vich. 
Data lifecycle challenges in production machine learning: a survey . ACM SIGMOD Recor d , 47(2):17–28, 2018. [152] Adnan Qayyum, Kashif Ahmad, Muhammad Ahtazaz Ahsan, Ala Al-Fuqaha, and Junaid Qadir . Collaborati ve federated learning for healthcare: Multi-modal covid-19 diagnosis at the edge, 2021. [153] Y ongfeng Qian, Long Hu, Jing Chen, Xin Guan, Mohammad Mehedi Hassan, and Abdulhameed Alelaiwi. Priv acy-aware service placement for mobile edge computing via federated learning. Information Sciences , 505:562–570, 2019. [154] Sashank Reddi, Zachary Charles, Manzil Zaheer , Zachary Garrett, K eith Rush, Jakub K one ˇ cn ` y, Sanji v Kumar , and H Brendan McMahan. Adaptiv e federated optimization. arXiv preprint arXiv:2003.00295 , 2020. [155] Amirhossein Reisizadeh, F arzan Farnia, Ramtin Pedarsani, and Ali Jadbabaie. Robust federated learning: The case of affine distrib ution shifts. arXiv pr eprint arXiv:2006.08907 , 2020. [156] M Sadegh Riazi, Christian W einert, Oleksandr Tkachenko, Ebrahim M Songhori, Thomas Schnei- der , and Farinaz K oushanfar . Chameleon: A hybrid secure computation frame work for machine learning applications. In Pr oceedings of the 2018 on Asia Confer ence on Computer and Communi- cations Security , pages 707–721. A CM, 2018. [157] Sebastian Ruder . An overvie w of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098 , 2017. [158] Theo Ryf fel, Andre w T rask, Morten Dahl, Bobby W agner , Jason Mancuso, Daniel Rueckert, and Jonathan Passerat-Palmbach. A generic framework for priv acy preserving deep learning. arXiv pr eprint arXiv:1811.04017 , 2018. [159] Mohamed Sabt, Mohammed Achemlal, and Abdelmadjid Bouabdallah. T rusted ex ecution en viron- ment: what it is, and what it is not. In 2015 IEEE T rustcom/BigDataSE/ISP A , volume 1, pages 57–64. IEEE, 2015. [160] Sumudu Samarakoon, Mehdi Bennis, W alid Saad, and Merouane Debbah. Federated learning for ultra-reliable lo w-latency v2v communications, 2018. 
[161] W ojciech Samek, Thomas W ieg and, and Klaus-Robert M ¨ uller . Explainable artificial intelligence: Understanding, visualizing and interpreting deep learning models. arXiv pr eprint arXiv:1708.08296 , 2017. [162] Ashish P Sanil, Alan F Karr, Xiaodong Lin, and Jerome P Reiter . Priv acy preserving regression modelling via distrib uted computation. In Pr oceedings of the tenth ACM SIGKDD international confer ence on Knowledge discovery and data mining , pages 677–682. A CM, 2004. [163] Y unus Sarikaya and Ozgur Ercetin. Motiv ating work ers in federated learning: A stack elberg g ame perspecti ve, 2019. 39 [164] Felix Sattler, Simon W iedemann, Klaus-Robert M ¨ uller , and W ojciech Samek. Robust and communication-ef ficient federated learning from non-iid data. arXiv pr eprint arXiv:1903.02891 , 2019. [165] Adi Shamir . How to share a secret. Communications of the A CM , 22(11):612–613, 1979. [166] Amit P Sheth and James A Larson. Federated database systems for managing distributed, heteroge- neous, and autonomous databases. ACM Computing Surve ys (CSUR) , 22(3):183–236, 1990. [167] Reza Shokri, Marco Stronati, Congzheng Song, and V italy Shmatikov . Membership inference attacks against machine learning models. In 2017 IEEE Symposium on Security and Privacy (SP) , pages 3–18. IEEE, 2017. [168] Dejan Skvorc, Matija Horvat, and Sinisa Srbljic. Performance e v aluation of websocket protocol for implementation of full-duplex web streams. In 2014 37th International Convention on Information and Communication T echnology , Electr onics and Micr oelectr onics (MIPR O) , pages 1003–1008. IEEE, 2014. [169] V irginia Smith, Chao-Kai Chiang, Maziar Sanjabi, and Ameet S T al walkar . Federated multi-task learning. In Advances in Neural Information Pr ocessing Systems , pages 4424–4434, 2017. [170] Shuang Song, Kamalika Chaudhuri, and Anand D Sarwate. Stochastic gradient descent with dif ferentially pri vate updates. 
In 2013 IEEE Global Confer ence on Signal and Information Pr ocessing , pages 245–248. IEEE, 2013. [171] Michael R Sprague, Amir Jalalirad, Marco Scavuzzo, Catalin Capota, Moritz Neun, L yman Do, and Michael Kopp. Asynchronous federated learning for geospatial applications. In J oint Eur opean Confer ence on Machine Learning and Knowledge Discovery in Databases , pages 21–28. Springer , 2018. [172] Iv an Stojmenovic, Sheng W en, Xin yi Huang, and Hao Luan. An o verview of fog computing and its security issues. Concurr . Comput. : Pract. Exper . , 28(10):2991–3005, July 2016. ISSN 1532-0626. doi: 10.1002/cpe.3485. URL https://doi.org/10.1002/cpe.3485. [173] Lili Su and Jiaming Xu. Securing distributed machine learning in high dimensions. arXiv preprint arXiv:1804.10140 , 2018. [174] Ziteng Sun, Peter Kairouz, Ananda Theertha Suresh, and H Brendan McMahan. Can you really backdoor federated learning? arXiv preprint , 2019. [175] Martin Sundermeyer , Ralf Schl ¨ uter , and Hermann Ney . Lstm neural networks for language modeling. In Thirteenth annual confer ence of the international speech communication association , 2012. [176] Melanie Swan. Blockchain: Blueprint for a new economy . ” O’Reilly Media, Inc. ”, 2015. [177] Ben T an, Bo Liu, V incent Zheng, and Qiang Y ang. A federated recommender system for online services. In F ourteenth ACM Confer ence on Recommender Systems , pages 579–581, 2020. [178] Mingxing T an and Quoc V Le. Ef ficientnet: Rethinking model scaling for con volutional neural networks. arXiv pr eprint arXiv:1905.11946 , 2019. [179] ADP T eam et al. Learning with priv acy at scale. Apple Machine Learning J ournal , 1(8), 2017. [180] Om Thakkar , Galen Andrew , and H Brendan McMahan. Dif ferentially priv ate learning with adapti ve clipping. arXiv pr eprint arXiv:1905.03871 , 2019. 40 [181] Stacey T ruex, Nathalie Baracaldo, Ali Anwar , Thomas Steinke, Heik o Ludwig, Rui Zhang, and Y i Zhou. 
A hybrid approach to priv acy-preserving federated learning. In Pr oceedings of the 12th A CM W orkshop on Artificial Intelligence and Security , pages 1–11. A CM, 2019. [182] Manasi V artak, Harihar Subramanyam, W ei-En Lee, Srinidhi V iswanathan, Saadiyah Husnoo, Samuel Madden, and Matei Zaharia. Modeldb: a system for machine learning model management. In Pr oceedings of the W orkshop on Human-In-the-Loop Data Analytics , pages 1–3, 2016. [183] Dinusha V atsalan, Ziad Sehili, Peter Christen, and Erhard Rahm. Priv acy-preserving record linkage for big data: Current approaches and research challenges. In Handbook of Big Data T ec hnologies , pages 851–895. Springer , 2017. [184] Praneeth V epakomma, Otkrist Gupta, T ristan Swedish, and Ramesh Raskar . Split learning for health: Distributed deep learning without sharing raw patient data. arXiv preprint , 2018. [185] Paul V oigt and Ax el V on dem Bussche. The eu general data protection regulation (gdpr). A Practical Guide , 1st Ed., Cham: Springer International Publishing , 2017. [186] Isabel W agner and Da vid Eckhof f. T echnical priv acy metrics: a systematic surv ey . ACM Computing Surve ys (CSUR) , 51(3):57, 2018. [187] Guan W ang, Charlie Xiaoqian Dang, and Ziye Zhou. Measure contribution of participants in federated learning. In 2019 IEEE International Confer ence on Big Data (Big Data) , pages 2597– 2604. IEEE, 2019. [188] Hongyi W ang, Kartik Sreeni v asan, Shashank Rajput, Harit V ishwakarma, Saurabh Agarwal, Jy- yong Sohn, Kangwook Lee, and Dimitris Papailiopoulos. Attack of the tails: Y es, you really can backdoor federated learning. Advances in Neural Information Pr ocessing Systems , 33, 2020. [189] Hongyi W ang, Mikhail Y urochkin, Y uekai Sun, Dimitris Papailiopoulos, and Y asaman Khazaeni. Federated learning with matched av eraging. arXiv pr eprint arXiv:2002.06440 , 2020. [190] Jianyu W ang, Qinghua Liu, Hao Liang, Gauri Joshi, and H V incent Poor . 
T ackling the objectiv e inconsistency problem in heterogeneous federated optimization. Advances in Neural Information Pr ocessing Systems , 2020. [191] Rui W ang, Heju Li, and Erwu Liu. Blockchain-based federated learning in mobile edge networks with application in internet of vehicles. arXiv pr eprint arXiv:2103.01116 , 2021. [192] Shiqiang W ang, Tif fany T uor , Theodoros Salonidis, Kin K Leung, Christian Makaya, Ting He, and K e vin Chan. Adaptiv e federated learning in resource constrained edge computing systems. IEEE J ournal on Selected Ar eas in Communications , 37(6):1205–1221, 2019. [193] T ianhao W ang, Johannes Rausch, Ce Zhang, Ruoxi Jia, and Dawn Song. A principled approach to data v aluation for federated learning. In F ederated Learning , pages 153–167. Springer , 2020. [194] Xiaofei W ang, Y iwen Han, Chenyang W ang, Qiyang Zhao, Xu Chen, and Min Chen. In-edge ai: Intelligentizing mobile edge computing, caching and communication by federated learning. IEEE Network , 2019. [195] Y ushi W ang. Co-op: Cooperati ve machine learning from mobile de vices. 2017. [196] Zhibo W ang, Mengkai Song, Zhifei Zhang, Y ang Song, Qian W ang, and Hairong Qi. Beyond inferring class representatives: User -le vel priv acy leakage from federated learning. In IEEE INFOCOM 2019-IEEE Confer ence on Computer Communications , pages 2512–2520. IEEE, 2019. 41 [197] Zeyi W en, Bingsheng He, Ramamohanarao K otagiri, Shengliang Lu, and Jiashuai Shi. Efficient gradient boosted decision tree training on gpus. In 2018 IEEE International P arallel and Distributed Pr ocessing Symposium (IPDPS) , pages 234–243. IEEE, 2018. [198] Zeyi W en, Jiashuai Shi, Qinbin Li, Bingsheng He, and Jian Chen. ThunderSVM: A fast SVM library on GPUs and CPUs. Journal of Machine Learning Resear ch , 19:797–801, 2018. [199] Zeyi W en, Jiashuai Shi, Bingsheng He, Jian Chen, K otagiri Ramamohanarao, and Qinbin Li. Exploiting gpus for efficient gradient boosting decision tree training. 
IEEE T ransactions on P arallel and Distributed Systems , 2019. [200] Zeyi W en, Jiashuai Shi, Qinbin Li, Bingsheng He, and Jian Chen. Thundergbm: Fast gbdts and random forests on gpus. In https://github.com/Xtr a-Computing/thundergbm , 2019. [201] Jiasi W eng, Jian W eng, Jilian Zhang, Ming Li, Y ue Zhang, and W eiqi Luo. Deepchain: Auditable and priv ac y-preserving deep learning with blockchain-based incenti ve. IEEE T ransactions on Dependable and Secur e Computing , 2019. [202] Xiaokui Xiao, Guozhang W ang, and Johannes Gehrke. Differential pri v acy via wa velet transforms. IEEE T ransactions on knowledg e and data engineering , 23(8):1200–1214, 2010. [203] Chulin Xie, Keli Huang, Pin-Y u Chen, and Bo Li. Dba: Distributed backdoor attacks against federated learning. In International Conference on Learning Repr esentations , 2019. [204] Cong Xie, Sanmi K oyejo, and Indranil Gupta. Asynchronous federated optimization. arXiv pr eprint arXiv:1903.03934 , 2019. [205] Runhua Xu, Nathalie Baracaldo, Y i Zhou, Ali Anwar , and Heiko Ludwig. Hybridalpha: An ef ficient approach for pri v acy-preserving federated learning. In Pr oceedings of the 12th ACM W orkshop on Artificial Intelligence and Security , pages 13–23, 2019. [206] Zhuang Y an, Li Guoliang, and Feng Jianhua. A survey on entity alignment of kno wledge base. J ournal of Computer Resear ch and Development , 1:165–192, 2016. [207] Qiang Y ang, Y ang Liu, T ianjian Chen, and Y ongxin T ong. Federated machine learning: Concept and applications. A CM T ransactions on Intelligent Systems and T echnolo gy (TIST) , 10(2):12, 2019. [208] T imothy Y ang, Galen Andrew , Hubert Eichner , Haicheng Sun, W ei Li, Nicholas K ong, Daniel Ramage, and Fran c ¸ oise Beaufays. Applied federated learning: Improving google ke yboard query suggestions. arXiv preprint , 2018. [209] Shanhe Y i, Zhengrui Qin, and Qun Li. Security and priv ac y issues of fog computing: A surv ey . In W ASA , 2015. 
[210] Naoya Y oshida, T akayuki Nishio, Masahiro Morikura, K oji Y amamoto, and Ryo Y onetani. Hybrid- fl: Cooperativ e learning mechanism using non-iid data in wireless networks. arXiv preprint arXiv:1905.07210 , 2019. [211] Hwanjo Y u, Xiaoqian Jiang, and Jaideep V aidya. Priv acy-preserving svm using nonlinear kernels on horizontally partitioned data. In Pr oceedings of the 2006 A CM symposium on Applied computing , pages 603–610. A CM, 2006. [212] Zhengxin Y u, Jia Hu, Geyong Min, Haochuan Lu, Zhiwei Zhao, Haozhe W ang, and Nektarios Georg alas. Federated learning based proactiv e content caching in edge computing. In 2018 IEEE Global Communications Confer ence (GLOBECOM) , pages 1–6. IEEE, 2018. 42 [213] Mikhail Y urochkin, Mayank Agarwal, Soumya Ghosh, Kristjan Greenew ald, T rong Nghia Hoang, and Y asaman Khazaeni. Bayesian nonparametric federated learning of neural networks. arXiv pr eprint arXiv:1905.12022 , 2019. [214] W eishan Zhang, Qinghua Lu, Qiuyu Y u, Zhaotong Li, Y ue Liu, Sin Kit Lo, Shiping Chen, Xiwei Xu, and Liming Zhu. Blockchain-based federated learning for device failure detection in industrial iot. IEEE Internet of Things Journal , 2020. [215] Y u Zhang and Qiang Y ang. A survey on multi-task learning. arXiv pr eprint arXiv:1707.08114 , 2017. [216] Zhengming Zhang, Zhe wei Y ao, Y aoqing Y ang, Y ujun Y an, Joseph E Gonzalez, and Michael W Mahoney . Benchmarking semi-supervised federated learning. arXiv pr eprint arXiv:2008.11364 , 2020. [217] Lingchen Zhao, Lihao Ni, Shengshan Hu, Y aniiao Chen, Pan Zhou, Fu Xiao, and Libing W u. Inpri- v ate digging: Enabling tree-based distributed data mining with dif ferential pri v acy . In INFOCOM , pages 2087–2095. IEEE, 2018. [218] Y ang Zhao, Jun Zhao, Linshan Jiang, Rui T an, and Dusit Niyato. Mobile edge computing, blockchain and reputation-based crowdsourcing iot federated learning: A secure, decentralized and pri v acy-preserving system. arXiv pr eprint arXiv:1906.10893 , 2019. 
[219] Y ang Zhao, Jun Zhao, Linshan Jiang, Rui T an, Dusit Niyato, Zengxiang Li, Lingjuan L yu, and Y ingbo Liu. Priv acy-preserving blockchain-based federated learning for iot de vices. IEEE Internet of Things J ournal , 2020. [220] Y ang Zhao, Jun Zhao, Mengmeng Y ang, T eng W ang, Ning W ang, Lingjuan L yu, Dusit Niyato, and Kwok Y an Lam. Local differential pri vac y based federated learning for internet of things. arXiv pr eprint arXiv:2004.08856 , 2020. [221] Y ue Zhao, Meng Li, Liangzhen Lai, Na veen Suda, Damon Civin, and V ikas Chandra. Federated learning with non-iid data. arXiv preprint , 2018. [222] W enbo Zheng, Lan Y an, Chao Gou, and Fei-Y ue W ang. Federated meta-learning for fraudulent credit card detection. In Pr oceedings of the T wenty-Ninth International Joint Confer ence on Artificial Intelligence (IJCAI-20) , 2020. [223] Zibin Zheng, Shaoan Xie, Hong-Ning Dai, Xiangping Chen, and Huaimin W ang. Blockchain challenges and opportunities: A surve y . International Journal of W eb and Grid Services , 14(4): 352–375, 2018. [224] Amelie Chi Zhou, Y ao Xiao, Bingsheng He, Jidong Zhai, Rui Mao, et al. Priv acy regulation aw are process mapping in geo-distributed cloud data centers. IEEE T ransactions on P arallel and Distributed Systems , 2019. [225] Pan Zhou, K ehao W ang, Linke Guo, Shimin Gong, and Bolong Zheng. A pri v acy-preserving distributed conte xtual federated online learning framework with big data support in social recom- mender systems. IEEE T ransactions on Knowledge and Data Engineering , 2019. [226] Hangyu Zhu and Y aochu Jin. Multi-objectiv e ev olutionary federated learning. IEEE transactions on neural networks and learning systems , 2019. [227] W eiming Zhuang, Y onggang W en, Xuesen Zhang, Xin Gan, Daiying Y in, Dongzhan Zhou, Shuai Zhang, and Shuai Y i. Performance optimization of federated person re-identification via benchmark analysis. 
In Pr oceedings of the 28th A CM International Confer ence on Multimedia , pages 955–963, 2020. 43 [228] G. Zyskind, O. Nathan, and A. ’. Pentland. Decentralizing pri v acy: Using blockchain to protect personal data. In 2015 IEEE Security and Privacy W orkshops , pages 180–184, May 2015. doi: 10.1109/SPW .2015.27. 44