A client-server architecture to simultaneously solve multiple learning tasks from distributed datasets is described. In this architecture, each client is associated with an individual learning task and a corresponding dataset of examples. The goal of the architecture is to perform information fusion from multiple datasets while preserving the privacy of individual data. The role of the server is to collect data in real time from the clients and to encode the information in a common database. The information coded in this database can be used by all the clients to solve their individual learning tasks, so that each client can exploit the informative content of all the datasets without actually having access to the private data of the others. The proposed algorithmic framework, based on regularization theory and kernel methods, uses a suitable class of mixed-effect kernels. The new method is illustrated through a simulated music recommendation system.
Client-server multi-task learning from distributed datasets
The solution of learning tasks by joint analysis of multiple datasets is receiving increasing attention in different fields and under various perspectives. Indeed, the information provided by data for a specific task may serve as a domain-specific inductive bias for the others. Combining datasets to solve multiple learning tasks is an approach known in the machine learning literature as multi-task learning or learning to learn [68,21,69,9,13,8,39]. In this context, the analysis of the inductive transfer process and the investigation of general methodologies for the simultaneous learning of multiple tasks are important topics of research. Many theoretical and experimental results support the intuition that, when relationships exist between the tasks, simultaneous learning performs better than separate (single-task) learning [61,76,77,75,17,6,16,79,54,4]. Theoretical results include the extension to the multi-task setting of generalization bounds and of the notion of VC-dimension [10,14,43], as well as a methodology for learning multiple tasks exploiting unlabeled data (the so-called semi-supervised setting) [5].
The importance of combining datasets is especially evident in biomedicine. In pharmacological experiments, few training examples are typically available for a specific subject due to technological and ethical constraints [20,35]. This makes it hard to formulate and quantify models from experimental data. To obviate this problem, the so-called population method has been studied and applied with success since the seventies in pharmacology [63,12,78]. Population methods are based on the knowledge that subjects, albeit different, belong to a population of similar individuals, so that data collected from one subject may be informative with respect to the others [72,50]. Such population approaches belong to the family of so-called mixed-effect statistical methods. In these methods, clinical measurements from different subjects are combined to simultaneously learn individual features of physiological responses to drug administration [64]. Population methods have also been applied with success in other biomedical contexts such as medical imaging and bioinformatics [28,15]. Classical approaches postulate finite-dimensional nonlinear dynamical systems whose unknown parameters can be determined by means of optimization algorithms [11,62,25,1]. Other strategies include Bayesian estimation with stochastic simulation [74,41,29] and nonparametric population methods [27,42,46,47,48,51].
Information fusion from different but related datasets is widespread also in econometrics and marketing analysis, where the goal is to learn user preferences by analyzing both user-specific information and information from related users, see e.g. [66,3,2,32]. The so-called conjoint analysis aims to determine the features of a product that most influence customers' decisions. On the web, collaborative approaches to estimating user preferences have become standard methodologies in many commercial systems and social networks, under the name of collaborative filtering or recommender systems, see e.g. [58]. Pioneering collaborative filtering systems include Tapestry [30], GroupLens [57,38], ReferralWeb [36], and PHOAKS [67]. More recently, the collaborative filtering problem has been attacked with machine learning methodologies such as Bayesian networks [18], MCMC algorithms [22], mixture models [34], dependency networks [33], and maximum margin matrix factorization [65].
Coming back to the machine learning literature, in the single-task context much attention has been given in recent years to non-parametric techniques such as kernel methods [60] and Gaussian processes [56]. These approaches are powerful and theoretically sound, having their mathematical foundations in regularization theory for inverse problems, statistical learning theory, and Bayesian estimation [7,70,53,73,71,24]. The flexibility of kernel engineering allows for the estimation of functions defined on generic sets from arbitrary sources of data. These methodologies have recently been extended to the multi-task setting. In [26], a general framework to solve multi-task learning problems using kernel methods and regularization has been proposed, relying on the theory of reproducing kernel Hilbert spaces (RKHS) of vector-valued functions [44].
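As a concrete illustration of this kernel-based multi-task formulation, the sketch below builds a mixed-effect kernel that blends a component shared across tasks with a task-specific component, and plugs it into kernel ridge regression. This is a minimal sketch, not the paper's exact algorithm: the choice of an RBF base kernel for both components, the mixing weight `lam`, and the regularization parameter `mu` are all illustrative assumptions.

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of a and b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mixed_effect_kernel(x1, t1, x2, t2, lam=0.5, gamma=1.0):
    # K((x,i),(x',j)) = lam * Kshared(x,x') + (1-lam) * delta_ij * Kshared(x,x')
    # For simplicity the same RBF base kernel is reused for both the
    # shared and the task-specific components (an assumption).
    shared = rbf(x1, x2, gamma)
    same_task = (t1[:, None] == t2[None, :]).astype(float)
    return lam * shared + (1.0 - lam) * same_task * shared

# Toy data: two tasks sampled from related (offset) functions.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(40, 1))
tasks = np.repeat([0, 1], 20)
y = np.sin(3 * X[:, 0]) + 0.3 * tasks + 0.05 * rng.standard_normal(40)

# Regularized solution of kernel ridge regression: c = (K + n*mu*I)^{-1} y.
K = mixed_effect_kernel(X, tasks, X, tasks)
mu = 1e-2
c = np.linalg.solve(K + len(y) * mu * np.eye(len(y)), y)

# Predict for task 0 on a small test grid.
Xs = np.linspace(-1, 1, 5)[:, None]
Ks = mixed_effect_kernel(Xs, np.zeros(5, dtype=int), X, tasks)
pred = Ks @ c
```

The mixing weight `lam` interpolates between pooling all data into one task (`lam = 1`) and fully separate single-task estimation (`lam = 0`), which is the trade-off the mixed-effect formulation is designed to expose.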
In many applications (e-commerce, social network data processing, recommender systems), real-time processing of examples is required. On-line multi-task learning schemes find their natural application in data mining problems involving very large datasets, and are therefore required to scale well with the number of tasks and examples. In [52], an on-line task-wise algorithm to solve multi-task regression problems has been proposed. The learning problem is formulated in the context of on-line Bayesian estimation, see e.g. [49,23], within which Gaussian processes with suitable covariance functions are used to characterize a non-parametric mixed-effect model.
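To make the scaling concern concrete, here is a hedged sketch of a recursive Gaussian-process update: when a new example arrives, the inverse Gram matrix is extended via a block-inverse (Schur complement) identity in O(n^2), instead of being refactored from scratch in O(n^3). The RBF covariance and the noise level are illustrative assumptions, not the scheme of [52].

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    # RBF covariance between the rows of a and b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

class OnlineGP:
    # Recursive GP regression: the inverse of the noisy Gram matrix is
    # grown one row/column at a time with the block-matrix inversion
    # formula, so each update costs O(n^2).
    def __init__(self, noise_var=0.05):
        self.noise_var = noise_var
        self.X = None
        self.y = None
        self.Kinv = None

    def update(self, x, y):
        x = np.atleast_2d(x)
        if self.X is None:
            self.X = x
            self.y = np.array([y], dtype=float)
            self.Kinv = np.array([[1.0 / (rbf(x, x)[0, 0] + self.noise_var)]])
            return
        k = rbf(self.X, x)[:, 0]                 # covariances with stored points
        kss = rbf(x, x)[0, 0] + self.noise_var   # new diagonal entry
        v = self.Kinv @ k
        s = kss - k @ v                          # Schur complement (scalar)
        n = len(self.y)
        Kinv = np.empty((n + 1, n + 1))
        Kinv[:n, :n] = self.Kinv + np.outer(v, v) / s
        Kinv[:n, n] = -v / s
        Kinv[n, :n] = -v / s
        Kinv[n, n] = 1.0 / s
        self.Kinv = Kinv
        self.X = np.vstack([self.X, x])
        self.y = np.append(self.y, y)

    def predict(self, Xs):
        # Posterior mean at test points.
        return rbf(Xs, self.X) @ (self.Kinv @ self.y)

# Usage: stream 10 examples and compare against the batch solution.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (10, 1))
y = np.sin(3 * X[:, 0])
gp = OnlineGP()
for xi, yi in zip(X, y):
    gp.update(xi, yi)
batch_Kinv = np.linalg.inv(rbf(X, X) + 0.05 * np.eye(10))
```

After the streaming loop, `gp.Kinv` coincides (up to rounding) with the inverse computed in one batch, which is what makes such recursive schemes attractive when examples arrive in real time.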
…(Full text truncated)…