Neural Federated Learning for Livestock Growth Prediction
Authors: Shoujin Wang (University of Technology Sydney), Mingze Ni (University of Technology Sydney), Wei Liu (University of Technology Sydney), Victor W. Chu (University of Technology Sydney), Kenny Sabir (AgriWebb), Bryan Zheng (University of Technology Sydney), Ayush Kanwal (AgriWebb), Roy Jing Yang (Queensland University of Technology), Fang Chen (University of Technology Sydney)

Abstract—Livestock growth prediction is essential for optimising farm management and improving the efficiency and sustainability of livestock production, yet it remains underexplored due to limited large-scale datasets and privacy concerns surrounding farm-level data. Existing biophysical models rely on fixed formulations, while most machine learning approaches are trained on small, isolated datasets, limiting their robustness and generalisability. To address these challenges, we propose LivestockFL, the first federated learning framework specifically designed for livestock growth prediction. LivestockFL enables collaborative model training across distributed farms without sharing raw data, thereby preserving data privacy while alleviating data sparsity, particularly for farms with limited historical records. The framework employs a neural architecture based on a Gated Recurrent Unit combined with a multilayer perceptron to model temporal growth patterns from historical weight records and auxiliary features.
We further introduce LivestockPFL, a novel personalised federated learning framework that extends the above federated learning framework with a personalized prediction head trained on each farm's local data, producing farm-specific predictors. Experiments on a real-world dataset demonstrate the effectiveness and practicality of the proposed approaches.

Index Terms—Livestock growth modeling, deep learning, federated learning

I. INTRODUCTION

Livestock growth modeling and prediction play a critical role in optimising farm management, improving production efficiency, supporting sustainable food systems, and ensuring the security and resilience of the global food supply chain. Accurate and reliable growth predictions enable data-driven and more informed decisions on feeding and grazing strategies, stocking rates, and market timing, thereby enhancing productivity while reducing economic and environmental risks. This importance is particularly pronounced in agriculture-dominated countries such as Australia, where livestock production operates across vast and climatically diverse regions and is increasingly challenged by climate variability and resource constraints [1].

Although livestock growth prediction is highly relevant to both academic research and industry practice, publicly available studies in this domain remain scarce. The limited body of existing literature indicates that livestock growth modeling is an underexplored research area, highlighting a clear need for further systematic investigation. Existing studies can generally be classified into two main categories based on the technical approaches employed: (1) biophysical approaches and (2) machine learning approaches.
Biophysical approaches primarily rely on mathematical models (e.g., GrazFeed [2], GRAZPLAN [3]) to simulate interactions among animals, pastures, soil, and climate in order to forecast livestock weight gain, body composition, and resource requirements. While these models embed strong domain knowledge and predefined mathematical formulations, they are often difficult to parameterise at scale, as doing so requires precise inputs on feed quality and quantity that farmers often lack, especially in complex real-world settings where data across farms and regions are heterogeneous, noisy, and sparse. In contrast, machine learning approaches train predictive models directly on collected livestock growth data and subsequently use these models to forecast growth outcomes [4]. As these approaches are data-driven and tailored to specific datasets from individual farms or regions [5], they are generally more flexible and better able to generalize to diverse real-world scenarios.

In recent years, machine learning approaches have attracted increasing attention due to their strong capability to model complex data and deliver accurate and reliable predictions, and they have been widely applied in the livestock sector, including tasks such as weight estimation and precision farming [6]. However, studies specifically focused on machine learning–based livestock growth prediction remain limited. According to our investigation, although many related studies use the term "weight prediction," they primarily apply machine learning models to estimate livestock live weight from body indicators such as height and ribeye area [7]. These approaches do not predict livestock growth by forecasting future weights over time.
Moreover, most existing studies rely on one or more small-scale datasets, typically containing fewer than 1,000 livestock samples, which substantially limits model generalization and reduces their practicality in real-world settings where data are often large-scale, complex, and highly heterogeneous. This reveals a major barrier to advancing machine learning–based livestock growth prediction: the lack of dedicated large datasets that intensively and continuously record livestock growth. In practical farming operations, frequent livestock weighing and the associated data collection and storage are highly time-consuming and labor-intensive, further constraining data availability.

Another factor hindering research on machine learning–based livestock growth prediction is the privacy and security of farm management data. In real-world settings, farmers are often reluctant to share operational and production data, as such data may contain sensitive information related to business practices and are critical for preserving farm-level competitive advantage. Collecting and centralizing data from multiple farms on an external platform can therefore raise significant concerns regarding data privacy, security, and the risk of information leakage. Federated learning, first proposed in 2017 [8], provides a natural and effective solution to these challenges by enabling machine learning models to be trained locally on each client's data without transferring raw data to any external or centralized storage, thereby preserving data privacy and security. Despite its suitability, to the best of our knowledge, federated learning has not yet been explored for livestock growth prediction. The only closely related work in the literature applies federated learning to crop yield prediction [9].
To address the aforementioned gaps in livestock growth prediction, this paper proposes a novel neural federated learning framework specifically designed for livestock growth prediction, termed LivestockFL. In LivestockFL, a Gated Recurrent Unit (GRU) model combined with a multilayer perceptron (MLP) is first adopted as the base prediction model to forecast future livestock weights using historical weight records and auxiliary features (e.g., livestock attributes and location) as inputs. Building upon this model, a federated learning scheme is developed to enable joint training across farms while preserving data locality. Specifically, each farm is treated as an independent client that retains its data in local storage without uploading potentially sensitive livestock production data to a centralized platform. During training, model updates are computed locally on each client using farm-specific data and then transmitted to a central server, where they are aggregated to update the global model. The updated global model parameters are subsequently distributed back to clients for the next training round. Through this iterative process, the global model is collaboratively trained using livestock growth data distributed across multiple farms.

To further enhance the prediction performance of the global model, we propose a personalized extension of the aforementioned federated framework for livestock growth prediction, termed LivestockPFL. LivestockPFL first trains a general federated livestock growth prediction model using LivestockFL, and then fine-tunes the global model on each farm's local data. This personalization process enables the model to better capture farm-specific characteristics, thereby improving prediction accuracy and reliability for individual farms.
The main contributions of this work are summarized as follows:

• We propose the first federated learning framework for livestock growth prediction, which enables collaborative model training across distributed farm-level edge devices without centralising sensitive or private data. By allowing multiple farms to jointly train a shared model while preserving data privacy, the proposed framework effectively mitigates data sparsity and data scarcity issues, particularly for small farms with limited historical records. This provides a practical and scalable solution for deploying machine learning–based livestock growth prediction in real-world agricultural settings.

• Building upon this framework, we further introduce a personalised federated learning approach that extends the standard federated learning framework with a personalized prediction head trained on each farm's local production data. This yields farm-specific growth prediction models that better capture local conditions and management practices, thereby improving prediction accuracy and reliability for individual farms.

• We conduct extensive experiments on a large-scale real-world livestock production dataset, and the results consistently demonstrate the effectiveness and robustness of the proposed federated and personalised learning frameworks.

II. RELATED WORK

A. Biophysical Livestock Modelling

Biophysical models have been used to simulate livestock growth by explicitly modelling feed intake, digestion and metabolism, and their interactions with pasture, soil, and climate. In Australia, the SCA Feeding Standards provide widely used nutrition-requirement relationships [10], while tools such as GrazFeed operationalise these feeding-standard concepts to predict voluntary intake, nutrient supply, and animal production responses under varying diets and pasture conditions [11].
At the system level, GRAZPLAN integrates animal, pasture, and soil components to forecast live weight change under alternative management and seasonal scenarios [3], and SGS has been used as a whole-farm/grazing-system simulator combining pasture growth, feed supply, and animal demand to analyse livestock performance across variable environments [12], [13]. Despite their interpretability and suitability for scenario analysis, these models are rarely end-to-end and often require multiple intermediate estimates (e.g., metabolism and intake) and calibrated parameters, which can cause errors to propagate through the modelling pipeline. In addition, many underlying processes are difficult to observe at scale, leading to systematic bias when key inputs are misspecified or poorly measured. Their transferability across regions and climates can also be limited because core assumptions and parameter settings may not remain valid under geographical or environmental change.

B. Machine Learning for Livestock Growth

Machine learning (ML) methods provide a data-driven alternative for predicting livestock weight and supporting precision livestock management, and recent reviews report growing adoption of ML for monitoring, decision support, and productivity-related prediction [6]. Many ML studies focus on current liveweight estimation by fitting regression models (e.g., RF/SVR/NN) to manually collected morphometric measurements such as body length, heart girth, and wither/hip height [14], [15]. Others replace manual measurements with computer vision (CV) pipelines that learn body-shape features from images (e.g., RGB-D) and directly regress to weight, which can reduce handling but depends on controlled image capture and can be sensitive to pose, occlusion, and camera setup [16].
Despite this progress, there remains a notable gap in ML models that perform longitudinal growth forecasting from historical weight trajectories, where the learning problem requires modelling temporal dynamics under irregular sequences, missing records, and measurement noise. Such forecasting models are practically important because they enable proactive management by predicting future weights for planning feeding, drafting, and marketing decisions, rather than only estimating an animal's current weight at observation time [6].

C. Federated Learning in Agriculture

Federated learning (FL) enables multiple clients to collaboratively train a shared model without exchanging raw data by iteratively sending local model updates to a coordinating server [17]. In agriculture, this privacy-preserving paradigm is attractive for farm management because production data are often sensitive and constrained by ownership and commercial considerations. Prior work has mainly studied FL from a crop-focused perspective, for example demonstrating benefits for crop-yield prediction using distributed data while avoiding direct data sharing [9], [18]. In contrast, FL remains underexplored for livestock weight prediction, and we are not aware of studies that develop and evaluate FL specifically for livestock growth forecasting from longitudinal weight histories. This gap is practically important because multi-farm livestock data are typically heterogeneous across farms in genetics, management, and measurement practices, which can challenge standard FL aggregation and motivate personalized or heterogeneity-aware training strategies [17].

III. PROBLEM FORMALIZATION

In our work, we leverage a diverse set of data types from multiple sources to predict livestock growth.
Specifically, for each animal, we incorporate both dynamic data, namely sequences of historical weight records, and static features, including animal attributes (e.g., sex and breed) and location-related information (e.g., state, NRM region, and farm). Let $i$ index livestock individuals and $t$ index observation times. Each individual $i$ has an irregular time series of length $T_i$. Let

$$\mathbf{x}_{i,t} \in \mathbb{R}^{d_n}, \quad \mathbf{m}_{i,t} \in \mathbb{R}^{d_m}, \quad \mathbf{c}_i = (c_{i,1}, \ldots, c_{i,K})$$

denote, respectively, the weight-record-related numerical features at each time step, the masking vector indicating whether an observed weight record exists at the current time step, and the static categorical features of the $i$th animal. Given an individual animal $i$'s historical weight records up to the current time point $T_i$, together with the animal's static features, we aim to build a prediction model $M$ that predicts the animal's future weights $y_{i,h}$ ($h = 1, \ldots, H$), where $H$ denotes the prediction horizon. Each time step in the weight record sequence corresponds to a one-month period in our work.

IV. METHODOLOGY

In this section, we present our proposed method for neural federated learning for livestock growth prediction. First, we describe the base prediction model, which is built on top of the widely used time series model Gated Recurrent Unit (GRU) combined with a multilayer perceptron (MLP), as shown in Fig. 1. The GRU is employed to model the time series weight record data, while the MLP models the static feature data and combines the two parts in the modeling process. Then, we describe the federated learning architecture built on top of this prediction model, which converts the commonly used centralized training scheme into a federated training scheme for data privacy and security.
Finally, we describe our proposed personalized federated learning architecture, which extends the federated learning architecture to a farm-level personalized edition so that the model is more sensitive and reliable with respect to each farm's specific data.

A. GRU-based Livestock Growth Prediction Model

Static data embedding. All static features are categorical, so we employ a typical categorical feature embedding layer to transform each categorical value into a numerical vector that can be fed into the machine learning model. Specifically, we first embed each categorical value, e.g., $c_{i,l}$ from the $l$th categorical feature,

$$e^{(c)}_{i,l} = \mathrm{Embedding}_l(c_{i,l}) \in \mathbb{R}^{d_e}, \tag{1}$$

and then concatenate all categorical feature embeddings to form a unified categorical embedding vector:

$$e^{(c)}_i = \left[ e^{(c)}_{i,1}; \ldots; e^{(c)}_{i,L} \right] \in \mathbb{R}^{L \cdot d_e}. \tag{2}$$

[Fig. 1. Architecture of our proposed livestock weight prediction model: static data modelling (embeddings of location and livestock characteristics such as state, region, breed, and sex), sequence data modelling (a GRU over observed and synthetic weight records with age, weight, distance-to-closest-observation, credibility, and observation-mask features), and future weight prediction (a mean and standard deviation for each future time step).]

Sequence data embedding and modeling. Given the numerical feature input $\mathbf{x}_{i,t}$ for time step $t$ of the $i$th animal, we first embed it into a latent vector $e_{i,t}$ and then combine it with the masking vector $\mathbf{m}_{i,t}$ to construct the input for the GRU model:

$$e_{i,t} = \phi_n(\mathbf{x}_{i,t}) = W_n \mathbf{x}_{i,t} + b_n, \quad e_{i,t} \in \mathbb{R}^{d_e}, \tag{3}$$

$$z_{i,t} = [e_{i,t}; \mathbf{m}_{i,t}] \in \mathbb{R}^{d_e + d_m}. \tag{4}$$

We then feed the constructed input vector $z_{i,t}$ at each time step into the GRU to model the temporal correlations over each animal's historical weight records, and take the final hidden state $h_i$ from the last time step $T_i$ of the GRU as the input of the prediction layer. Specifically,

$$h_{i,t} = \mathrm{GRU}(z_{i,t}, h_{i,t-1}), \tag{5}$$

$$h_i = h_{i,T_i} \in \mathbb{R}^{d_h}. \tag{6}$$

Encoding aggregation and weight prediction. Once all encoding is complete, the embedding vectors are combined into a final representation of all input information, which is fed into the prediction layer to output the predicted future weights:

$$\tilde{h}_i = \left[ h_i; e^{(c)}_i \right], \tag{7}$$

$$o_i = W_2 \, \sigma\!\left(W_1 \tilde{h}_i + b_1\right) + b_2, \tag{8}$$

where $\sigma(\cdot)$ denotes the ReLU activation. In our work, we aim to quantify the uncertainty of the prediction, and thus both the mean $\mu_i$ and variance $\sigma^2_i$ of future weights are predicted instead of a single deterministic scalar weight value:

$$o_i = \left[ \mu_i; \log \sigma^2_i \right], \quad \mu_i, \log \sigma^2_i \in \mathbb{R}^H. \tag{9}$$

Accordingly, we select the Gaussian Negative Log-Likelihood (NLL) as the loss function to optimize the prediction model:

$$\mathcal{L}_{\mathrm{NLL}} = \frac{1}{2H} \sum_{h=1}^{H} \left[ \log \sigma^2_{i,h} + \frac{(y_{i,h} - \mu_{i,h})^2}{\sigma^2_{i,h}} \right], \tag{10}$$

where the predicted future weight $y_{i,h}$ is drawn from the predicted distribution:

$$y_{i,h} \sim \mathcal{N}(\mu_{i,h}, \sigma^2_{i,h}), \quad h = 1, \ldots, H. \tag{11}$$
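As a concrete illustration, the Gaussian NLL objective of Eq. (10) can be sketched in a few lines of NumPy. This is a minimal sketch, not the authors' implementation: the GRU/MLP forward pass that would produce the means and log-variances is omitted, and the function name `gaussian_nll` is our own.

```python
import numpy as np

def gaussian_nll(y, mu, log_var):
    """Gaussian negative log-likelihood averaged over the H forecast
    horizons, as in Eq. (10): (1/2H) * sum(log sigma^2 + (y - mu)^2 / sigma^2).
    Predicting log-variance rather than variance keeps sigma^2 positive."""
    var = np.exp(log_var)
    return 0.5 * np.mean(log_var + (y - mu) ** 2 / var)

# Toy check with H = 3: a perfect mean prediction with unit variance
# (log_var = 0) makes every term vanish, so the loss is exactly zero.
y = np.array([300.0, 320.0, 340.0])    # future weights (kg)
mu = np.array([300.0, 320.0, 340.0])   # predicted means
log_var = np.zeros(3)                  # predicted log-variances
print(gaussian_nll(y, mu, log_var))    # 0.0
```

Note that, unlike squared error, this loss lets the model trade off accuracy against confidence: an off-target mean can be partially excused by a larger predicted variance, which is what enables the uncertainty estimates in Fig. 1.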
B. Federated Learning Architecture

We adopt a standard federated averaging (FedAvg) paradigm to train the livestock growth prediction model collaboratively across multiple farms while keeping all raw livestock production data local. Let $k \in \{1, \ldots, K\}$ denote the client (farm) index and $r \in \{1, \ldots, R\}$ the federated communication round. Each client $k$ owns a private dataset $\mathcal{D}_k$ with $n_k = |\mathcal{D}_k|$ samples, and the total number of samples is $N = \sum_{k=1}^{K} n_k$. Let $\theta$ denote the full set of model parameters, including the encoder, GRU layers, and prediction head.

Global optimization objective. Federated learning implicitly minimizes a data-size-weighted global empirical risk across all clients:

$$\min_{\theta} \sum_{k=1}^{K} \frac{n_k}{N} \, \mathbb{E}_{(\mathbf{x}, y) \sim \mathcal{D}_k} \left[ \ell(y, f(\mathbf{x}; \theta)) \right], \tag{12}$$

where $f(\cdot\,; \theta)$ is the livestock growth prediction model and $\ell(\cdot)$ denotes the local training loss.

Server initialization and broadcast. At the beginning of communication round $r$, the server broadcasts the current global model to all clients:

$$\theta^{(r)} \longrightarrow \text{all clients } k. \tag{13}$$

Each client initializes its local model as:

$$\theta_k^{(r,0)} = \theta^{(r)}. \tag{14}$$

Local model training and update. Each client performs local optimization for $E$ epochs using its private dataset. For a training sample $i$ at client $k$, the model predicts a mean $\mu_{k,i,h}$ and variance $\sigma^2_{k,i,h}$ for each forecast horizon $h$. The local Gaussian negative log-likelihood loss is:

$$\ell_{k,i} = \frac{1}{2H} \sum_{h=1}^{H} \left[ \log \sigma^2_{k,i,h} + \frac{(y_{k,i,h} - \mu_{k,i,h})^2}{\sigma^2_{k,i,h}} \right]. \tag{15}$$

The empirical local objective at client $k$ is:

$$\mathcal{L}_k(\theta) = \frac{1}{n_k} \sum_{i=1}^{n_k} \ell_{k,i}. \tag{16}$$

Starting from the received global parameters, each client performs gradient-based optimization:

$$\theta_k^{(r,E)} = \theta^{(r)} - \eta \sum_{e=1}^{E} \nabla \mathcal{L}_k\!\left(\theta_k^{(r,e-1)}\right), \tag{17}$$

where $\eta$ denotes the learning rate. After local training, the updated parameters are denoted as:

$$\theta_k^{(r)} = \theta_k^{(r,E)}. \tag{18}$$

Federated aggregation (FedAvg) and optimization.
Once all clients complete local training, the server aggregates the received models using data-size-weighted averaging:

$$\theta^{(r+1)} = \sum_{k=1}^{K} \frac{n_k}{N} \, \theta_k^{(r)}. \tag{19}$$

This aggregation is applied to all layers of the network, ensuring that both temporal dynamics and uncertainty estimates are globally shared. The broadcast-train-aggregate procedure is repeated for $R$ rounds:

$$\theta^{(1)} \rightarrow \theta^{(2)} \rightarrow \cdots \rightarrow \theta^{(R)}. \tag{20}$$

The final global model $\theta^{(R)}$ is used for inference on all farms.

C. Personalized Federated Learning Architecture

To account for farm-specific growth patterns and management practices, we extend the standard federated learning framework with a personalized model head, while keeping the representation learning components globally shared. Specifically, we decompose the model parameters into two parts:

$$\theta = \left\{ \theta^{(s)}, \theta^{(p)} \right\}, \tag{21}$$

where $\theta^{(s)}$ denotes the shared body, including numeric embeddings, GRU layers, and categorical embeddings, and $\theta^{(p)}$ denotes the personalized prediction head, implemented as the final fully connected layers. The federated optimization objective with personalization becomes:

$$\min_{\theta^{(s)}, \{\theta_k^{(p)}\}_{k=1}^{K}} \sum_{k=1}^{K} \frac{n_k}{N} \, \mathbb{E}_{(\mathbf{x}, y) \sim \mathcal{D}_k} \left[ \ell\!\left(y, f(\mathbf{x}; \theta^{(s)}, \theta_k^{(p)})\right) \right], \tag{22}$$

where each client $k$ maintains its own personalized head $\theta_k^{(p)}$ while sharing the body $\theta^{(s)}$. At communication round $r$, the server broadcasts only the shared body:

$$\theta^{(s,r)} \longrightarrow \text{all clients } k. \tag{23}$$

Each client initializes its local model as:

$$\theta_k^{(s,r)} = \theta^{(s,r)}. \tag{24}$$

If available, the client restores its previously learned personalized head $\theta_k^{(p)}$.

Local training with personalized head. Each client performs local optimization on its private data, updating both shared and personalized parameters. For sample $i$ at client $k$:

$$\ell_{k,i} = \frac{1}{2H} \sum_{h=1}^{H} \left[ \log \sigma^2_{k,i,h} + \frac{(y_{k,i,h} - \mu_{k,i,h})^2}{\sigma^2_{k,i,h}} \right]. \tag{25}$$

The empirical local objective is:

$$\mathcal{L}_k\!\left(\theta^{(s)}, \theta_k^{(p)}\right) = \frac{1}{n_k} \sum_{i=1}^{n_k} \ell_{k,i}. \tag{26}$$

Starting from the broadcast shared body, client $k$ performs gradient-based updates:

$$\theta_k^{(s,r,E)} = \theta^{(s,r)} - \eta \sum_{e=1}^{E} \nabla_{\theta^{(s)}} \mathcal{L}_k, \tag{27}$$

$$\theta_k^{(p,r,E)} = \theta_k^{(p,r)} - \eta \sum_{e=1}^{E} \nabla_{\theta_k^{(p)}} \mathcal{L}_k. \tag{28}$$

After local training, the client retains:

$$\theta_k^{(p,r)} = \theta_k^{(p,r,E)}, \tag{29}$$

while sending only the shared parameters $\theta_k^{(s,r)} = \theta_k^{(s,r,E)}$ to the server. The server aggregates only the shared body parameters using FedAvg:

$$\theta^{(s,r+1)} = \sum_{k=1}^{K} \frac{n_k}{N} \, \theta_k^{(s,r)}. \tag{30}$$

The personalized heads $\theta_k^{(p)}$ are excluded from aggregation and remain private to each client. The personalized federated learning process repeats for $R$ rounds:

$$\theta^{(s,1)} \rightarrow \theta^{(s,2)} \rightarrow \cdots \rightarrow \theta^{(s,R)}. \tag{31}$$

At convergence, the final model for client $k$ is:

$$f_k(\mathbf{x}) = f(\mathbf{x}; \theta^{(s,R)}, \theta_k^{(p)}). \tag{32}$$

For inference on farm $k$, predictions are generated using the shared body and the farm-specific head:

$$\hat{y}_k = f(\mathbf{x}; \theta^{(s,R)}, \theta_k^{(p)}). \tag{33}$$

V. EXPERIMENTS

A. Dataset

We conducted intensive experiments on a real-world livestock production dataset collected in Australia in collaboration with a local livestock management service provider. After processing, the dataset contains 25,422 beef cattle from 110 farms distributed across 10 NRM regions in New South Wales. The cattle are primarily of the Angus breed and its crosses, including 'Angus', 'Angus X', 'Angus Hereford X', 'Red Angus', 'Angus Friesian X', 'South Devon Angus X', 'Angus Lowline', 'Brahman Angus X', and 'Australis South Devon/Angus'. Four categorical static livestock features are included: sex, breed, state, and NRM region, with more features (e.g., climate) easily incorporable when available. For sequence data, we use historical weight records from each animal between 2 and 24 months of age, the main growth stage of beef cattle. In the raw dataset, most animals have very sparse weight records, averaging only 2.5 measurements per animal.
To address this, we designed a novel quantile regression-based inter- and extrapolation method to generate monthly weight records from 2 to 24 months. To improve reliability, an age-based distance is calculated between each augmented record and its nearest observed weight record, and a distance-based credibility is assigned: smaller distances correspond to higher credibility. Min-max normalization is applied to weight, age, distance, and credibility, and a binary masking vector indicates whether a weight record is observed at each time step. The final dataset statistics are as follows: 20,337 animals contribute 77,987 training instances, and 5,085 animals contribute 19,502 testing instances.

B. Experiment Settings

We aim to answer the following research questions (RQs) through experiments:

RQ1: Prediction accuracy: How does our proposed method compare to centralized training, and how does our personalized federated learning perform?

RQ2: How effectively does our proposed model alleviate data sparsity and data scarcity issues, particularly for small farms (a typical issue in our current project)?

RQ3: Who benefits most from our proposed method, small farms or large farms, and how does our method benefit farms of different sizes differently?

C. Experiment Results and Analysis

1) Weight prediction accuracy comparison under different settings (RQ1): We evaluate the GRU-based livestock growth prediction model under three settings to understand its overall performance: (1) centralized training, serving as the reference baseline; (2) federated learning (LivestockFL), where each client's contribution to global model updates is weighted by its sample size; and (3) personalized federated learning (LivestockPFL), an enhanced version of LivestockFL designed to address imbalanced and heterogeneous data distributions across farms.
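To fix ideas before the results, the two federated settings compared here (Sections IV-B and IV-C) differ only in which parameters the server averages. Assuming each client's parameters live in a plain dictionary (the parameter names below are hypothetical, chosen for illustration), one FedAvg aggregation step (Eq. (19)), with an optional set of personalized keys excluded as in LivestockPFL (Eq. (30)), can be sketched as:

```python
import numpy as np

def fedavg_round(client_params, client_sizes, personalized_keys=()):
    """Data-size-weighted averaging of client parameters (Eq. 19).
    Keys listed in `personalized_keys` (e.g. the prediction head) are
    skipped, so each farm keeps its own copy, as in LivestockPFL (Eq. 30)."""
    total = sum(client_sizes)
    shared = {}
    for key in client_params[0]:
        if key in personalized_keys:
            continue  # personalized head stays private to each client
        shared[key] = sum(
            (n / total) * p[key] for p, n in zip(client_params, client_sizes)
        )
    return shared

# Two toy farms with unequal data sizes (hypothetical parameter names).
clients = [
    {"gru.w": np.array([1.0]), "head.w": np.array([10.0])},
    {"gru.w": np.array([3.0]), "head.w": np.array([30.0])},
]
sizes = [1, 3]  # farm 2 holds three times more samples
g = fedavg_round(clients, sizes, personalized_keys={"head.w"})
print(g["gru.w"])     # [2.5] = 0.25 * 1.0 + 0.75 * 3.0
print("head.w" in g)  # False: heads are excluded from aggregation
```

With `personalized_keys=()` this reduces to plain LivestockFL; passing the head keys gives the LivestockPFL aggregation, where only the shared body is averaged.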
To further mitigate the impact of highly imbalanced client sizes, we also introduce LivestockFL-Sqrt and LivestockPFL-Sqrt, variants that apply $\sqrt{n}$ weighting in FedAvg. Specifically, instead of weighting each client's model by its sample size $n_i$, the weight is proportional to the square root of the sample size:

$$w_i = \frac{\sqrt{n_i}}{\sum_{j=1}^{K} \sqrt{n_j}}.$$

This $\sqrt{n}$ weighting reduces the dominance of large clients in standard federated learning and helps alleviate potential bias in the aggregated prediction models.

Overall prediction performance (Table I): Table I shows the overall weight prediction accuracy across all animals and time horizons. Centralized training achieves the best performance, with RMSE = 14.34 kg, MAE = 10.30 kg, MAPE = 2.71%, and R² = 0.97, as expected since it has full access to all data. Standard federated learning (LivestockFL) performs substantially worse (RMSE = 42.76 kg, MAE = 33.46 kg, MAPE = 9.13%, R² = 0.75), reflecting the challenges of imbalanced and heterogeneous farm-level data across the 110 farms, where small farms' contributions are underrepresented. Introducing the $\sqrt{n}$ weighting (LivestockFL-Sqrt) improves FL performance (RMSE = 39.55 kg, MAE = 31.84 kg, MAPE = 8.53%, R² = 0.80), confirming that down-weighting the dominance of very large farms mitigates aggregation bias.

Personalized FL (LivestockPFL) significantly improves accuracy over standard FL, achieving RMSE = 23.42 kg, MAE = 17.33 kg, MAPE = 4.66%, R² = 0.93. Its $\sqrt{n}$ variant (LivestockPFL-Sqrt) also performs well, although slightly below standard PFL (RMSE = 26.81 kg, MAE = 20.41 kg, MAPE = 5.42%, R² = 0.91). This demonstrates that personalization is highly effective in heterogeneous and imbalanced data scenarios, allowing the model to adapt to farm-specific characteristics while preserving privacy. Overall, LivestockPFL bridges a large portion of the gap between FL and centralized training.
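The effect of the $\sqrt{n}$ weighting is easy to see numerically. The sketch below uses a hypothetical split of one large farm and ten small farms (not taken from the paper's dataset):

```python
import math

def sqrt_weights(sizes):
    """Aggregation weights w_i = sqrt(n_i) / sum_j sqrt(n_j), as used by
    the -Sqrt variants to damp the dominance of very large farms."""
    roots = [math.sqrt(n) for n in sizes]
    total = sum(roots)
    return [r / total for r in roots]

# One large farm (10,000 samples) vs. ten small farms (100 samples each).
sizes = [10_000] + [100] * 10
plain = [n / sum(sizes) for n in sizes]  # standard FedAvg weights n_i / N
sqrtw = sqrt_weights(sizes)
print(round(plain[0], 3))  # 0.909 -> plain FedAvg: large farm dominates
print(round(sqrtw[0], 3))  # 0.5   -> sqrt weighting: dominance nearly halved
```

Because square roots compress a 100:1 size ratio to 10:1, the large farm's share of the aggregate drops from about 91% to 50%, while each small farm's influence grows accordingly.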
TABLE I
OVERALL COMPARISONS ON LIVESTOCK WEIGHT PREDICTION ACCURACY ON THE WHOLE DATASET

Method                  RMSE    MAE     MAPE    R-score
Centralized training    14.34   10.30   2.71%   0.97
LivestockFL             42.76   33.46   9.13%   0.75
LivestockFL-Sqrt        39.55   31.84   8.53%   0.80
LivestockPFL            23.42   17.33   4.66%   0.93
LivestockPFL-Sqrt       26.81   20.41   5.42%   0.91

TABLE II
PER-PREDICTION-HORIZON COMPARISONS ON LIVESTOCK WEIGHT PREDICTION ACCURACY ON THE WHOLE DATASET

Horizon    Method                  RMSE    MAE     MAPE     R-score
Horizon 1  Centralized training    9.24    6.77    1.90%    0.98
           LivestockFL             46.22   36.55   10.44%   0.72
           LivestockFL-Sqrt        39.49   32.29   9.16%    0.80
           LivestockPFL            19.71   14.78   4.17%    0.95
           LivestockPFL-Sqrt       24.43   19.20   5.32%    0.92
Horizon 2  Centralized training    14.08   10.39   2.77%    0.97
           LivestockFL             42.01   32.65   8.84%    0.77
           LivestockFL-Sqrt        40.07   31.61   8.50%    0.79
           LivestockPFL            23.03   16.88   4.55%    0.93
           LivestockPFL-Sqrt       25.88   19.51   5.18%    0.91
Horizon 3  Centralized training    18.25   13.74   3.45%    0.95
           LivestockFL             39.79   31.18   8.08%    0.78
           LivestockFL-Sqrt        39.08   31.63   7.95%    0.79
           LivestockPFL            26.96   20.33   5.26%    0.90
           LivestockPFL-Sqrt       29.82   22.51   5.76%    0.88

Per-horizon prediction performance (Table II): Table II further reveals trends across three prediction horizons. Centralized training consistently outperforms all FL methods at each horizon, with RMSE increasing gradually from horizon 1 (9.24 kg) to horizon 3 (18.25 kg), reflecting the increased difficulty of longer-term predictions. Standard FL (LivestockFL) shows the largest errors at horizon 1 (RMSE = 46.22 kg) but improves slightly at longer horizons, likely due to averaging effects. The $\sqrt{n}$ variant (LivestockFL-Sqrt) reduces horizon errors noticeably, especially for horizon 1 (RMSE = 39.49 kg), validating its bias-mitigation effect.
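For reference, the four metrics reported in Tables I and II can be computed as follows. This is a self-contained sketch with toy values, not the evaluation code used in the experiments:

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """RMSE, MAE, MAPE (in %), and R^2 (the 'R-score' of Tables I and II)."""
    err = y_true - y_pred
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    mape = float(np.mean(np.abs(err / y_true)) * 100.0)
    ss_res = float(np.sum(err ** 2))                       # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, mape, r2

# Toy weights (kg): every prediction is off by exactly 10 kg.
y_true = np.array([300.0, 320.0, 340.0, 360.0])
y_pred = np.array([310.0, 310.0, 350.0, 350.0])
rmse, mae, mape, r2 = regression_metrics(y_true, y_pred)
print(round(rmse, 2), round(mae, 2), round(mape, 2), round(r2, 3))  # 10.0 10.0 3.04 0.8
```

Note that MAPE divides each error by the true weight, which is why the same absolute error looks small for heavy animals and large for light ones.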
Personalized FL methods (LivestockPFL and LivestockPFL-Sqrt) achieve consistently lower errors across all horizons, with LivestockPFL achieving RMSE = 19.71 kg, 23.03 kg, and 26.96 kg for horizons 1, 2, and 3, respectively. The improvement over standard FL is particularly pronounced at the earliest horizon, demonstrating the ability of PFL to leverage farm-specific information for more accurate short-term predictions while maintaining robustness over longer horizons.

2) Prediction error analysis on small farms with extreme data scarcity (for RQ2): We evaluated the performance of LivestockPFL against local training on small farms characterized by extreme data scarcity (< 20 individual animal measurements (IAMs)). As illustrated in Figs. 2 and 3, LivestockPFL consistently outperforms local training in both predictive accuracy and model stability. Local training exhibits high volatility, with error metrics spiking significantly (RMSE ≈ 180, MAE ≈ 170) at an IAM count as low as 4. This instability underscores the failure of local models to generalize when trained on sparse, high-variance datasets. In contrast, LivestockPFL maintains a substantially lower and more consistent error profile across the entire spectrum of data scarcity. By leveraging the federated framework to capture global weight-gain patterns while personalizing the model to local farm conditions, our approach effectively mitigates the "cold-start" problem. The results demonstrate that LivestockPFL provides a robust "knowledge floor", preventing the performance collapse typical of local deep learning models in data-poor environments. We conclude that the proposed personalized federated learning architecture successfully alleviates data sparsity issues, making accurate livestock weight prediction accessible and reliable for small-scale enterprises that lack sufficient local data for independent model training.

Fig. 2. RMSE comparison on small farms.

Fig. 3. MAE comparison on small farms.

3) Small vs. large farm analysis (for RQ3): Table III reports performance comparisons between the proposed personalised federated learning method LivestockPFL and local training across five farm-size categories defined by the number of IAMs. A clear and consistent pattern emerges: the benefit of the proposed method strongly depends on farm size. For small farms (< 50 IAMs), PFL improves performance on 4 out of 5 farms (80%). In this regime, local models often exhibit high prediction error and unstable or even negative R² values, reflecting severe data scarcity. By contrast, PFL substantially reduces RMSE, MAE, and MAPE while yielding consistently positive R², indicating improved generalisation through cross-farm knowledge transfer. For 51–200 IAMs and 201–500 IAMs, the proposed method achieves improvements on all farms (100% improvement rate in both groups). Error reductions are consistent across RMSE, MAE, and MAPE, and R² values are uniformly high. These results suggest that medium-sized farms represent the most favourable regime for PFL, where sufficient local data enables effective personalisation while still benefiting from shared global representations. In contrast, the advantage diminishes for 501–1000 IAMs, where only 1 out of 5 farms (20%) shows improvement. For most farms in this group, local training already achieves strong performance, limiting the marginal gains from federated learning. For large farms (> 1000 IAMs), no improvements are observed: local models consistently outperform or match PFL across all metrics, indicating that abundant local data largely eliminates the need for external knowledge transfer. Overall, these results demonstrate that the proposed method benefits small and medium farms most, while remaining competitive but unnecessary for very large, data-rich farms.

TABLE III
PERFORMANCE COMPARISON BETWEEN PERSONALISED FEDERATED LEARNING (PFL) AND LOCAL TRAINING ACROSS DIFFERENT FARM-SIZE LEVELS. IAMs REFER TO INDIVIDUAL ANIMAL MEASUREMENTS. THE LAST COLUMN INDICATES WHETHER PFL IMPROVES OVER LOCAL TRAINING (1 = YES, 0 = NO).

                              Personalised FL                  Local Training
Farm-size level   # IAMs   RMSE   MAE    MAPE(%)  R²      RMSE   MAE    MAPE(%)  R²      Improve
< 50 IAMs         6        55.35  44.11  11.84    0.59    97.46  70.16  20.51    -0.26   1
                  14       49.00  39.16  11.13    0.60    71.27  59.96  17.55    0.15    1
                  26       53.38  46.83  14.57    0.13    51.69  40.72  11.65    0.18    0
                  38       29.46  22.31  10.44    0.85    38.60  29.55  12.23    0.75    1
                  46       24.69  19.92  6.42     0.93    51.26  39.00  13.54    0.69    1
51–200 IAMs       55       37.82  29.12  6.94     0.42    56.83  46.44  11.02    -0.31   1
                  70       48.98  37.21  8.63     0.26    61.30  46.26  11.77    -0.15   1
                  92       24.53  18.97  5.34     0.91    30.00  24.62  7.05     0.86    1
                  127      22.97  17.75  4.19     0.83    27.98  24.50  5.61     0.75    1
                  174      19.07  14.91  5.23     0.95    19.24  14.80  4.97     0.95    1
201–500 IAMs      214      25.56  19.43  5.92     0.93    30.63  24.14  6.55     0.90    1
                  245      14.97  11.98  4.37     0.96    25.63  19.85  6.64     0.89    1
                  361      17.91  12.94  3.72     0.96    22.44  17.51  5.12     0.94    1
                  408      20.14  16.77  4.50     0.94    21.13  17.49  4.75     0.93    1
                  490      23.59  18.34  4.36     0.89    59.26  55.30  13.58    0.32    1
501–1000 IAMs     502      11.94  9.71   2.84     0.97    10.21  7.91   2.34     0.98    0
                  747      20.62  15.90  5.05     0.95    23.65  17.94  5.51     0.93    1
                  837      16.14  12.96  3.16     0.96    16.02  12.90  3.20     0.96    0
                  883      43.33  38.99  12.08    0.75    28.40  23.15  7.41     0.89    0
                  969      25.16  21.18  5.23     0.93    17.84  14.20  3.54     0.96    0
> 1000 IAMs       1796     18.07  13.74  3.78     0.93    16.98  13.49  3.83     0.93    0
                  2004     22.60  17.94  4.34     0.90    22.04  16.91  4.00     0.90    0
                  2667     19.24  14.89  3.94     0.94    15.90  11.87  3.16     0.96    0
                  3279     23.80  18.02  4.42     0.91    16.88  12.67  3.05     0.96    0
                  3419     23.39  16.90  3.94     0.93    13.37  9.58   2.31     0.98    0

VI. CONCLUSIONS

Livestock growth prediction is essential for optimizing farm management and improving both the efficiency and sustainability of livestock production. In practice, livestock production data are often fragmented and distributed across multiple farms. Effectively leveraging these data for reliable growth prediction while addressing data privacy and security concerns remains a significant challenge, with limited studies addressing it. In this paper, we proposed a novel neural federated learning framework as a promising solution. To further tackle data imbalance and heterogeneity, we extended this framework to a personalized federated learning approach. To the best of our knowledge, this is the first published study applying federated learning to livestock growth prediction. Extensive experiments on a real-world livestock production dataset demonstrate the effectiveness of our proposed methods. Future work will explore more advanced federated learning models for livestock growth prediction, and collect and investigate more real-world livestock datasets.