Beyond Proximity: A Keypoint-Trajectory Framework for
Classifying Affiliative and Agonistic Social Networks in Dairy
Cattle
Sibi Parivendan 1, Kashfia Sailunaz 2 and Suresh Neethirajan 1,2*
1 Faculty of Computer Science, Dalhousie University, 6050 University Avenue, Halifax, NS
B3H 4R2, Canada
2 Faculty of Agriculture, Dalhousie University, Truro, NS B3H 4R2, Canada
Abstract
Precision livestock farming requires objective assessment of social behavior to support herd
welfare monitoring, yet most existing approaches infer interactions using static proximity
thresholds that cannot distinguish affiliative from agonistic behaviors in complex barn
environments. This limitation constrains the interpretability and usefulness of automated social
network analysis in commercial settings. In this study, we present a pose-based computational
framework for interaction classification that moves beyond proximity heuristics by modeling
the spatiotemporal geometry of anatomical keypoints. Rather than relying on pixel-level
appearance or simple distance measures, the proposed method encodes interaction-specific
motion signatures from keypoint trajectories, enabling differentiation of social interaction
valence. The framework is implemented as an end-to-end computer vision pipeline integrating
YOLOv11 for object detection (mAP@0.50: 96.24%), supervised individual identification
(98.24% accuracy), ByteTrack for multi-object tracking (81.96% accuracy), ZebraPose for 27-
point anatomical keypoint estimation, and a support vector machine classifier trained on pose-
derived distance dynamics. On annotated interaction clips collected from a commercial dairy
barn, the interaction classifier achieved 77.51% accuracy in distinguishing affiliative and
agonistic behaviors using pose information alone. Comparative evaluation against a proximity-
only baseline demonstrates that keypoint-trajectory features substantially improve behavioral
discrimination, particularly for affiliative interactions. The results establish a proof-of-concept
for automated, vision-based inference of social interactions suitable for constructing
interaction-aware social networks. The modular architecture and low computational overhead
support near-real-time processing on commodity hardware, providing a scalable
methodological foundation for future precision livestock welfare monitoring systems.
Keywords: Precision livestock farming; Computer vision; Keypoint detection; Dairy cattle
welfare; Automated behavior monitoring; Deep learning; Interaction classification
1. Introduction
Understanding dairy cow behavior is essential for effective herd management, animal welfare
improvement, and productivity enhancement in commercial dairy operations. Recognition of
fundamental behaviors such as lying, standing, walking, and feeding, alongside more complex
patterns including lameness detection, gait analysis, and social interactions, enables early
disease identification and proactive welfare interventions. Timely detection and management
of health problems like lameness minimize long-term negative impacts on productivity,
reproductive performance, and animal well-being [1]. Beyond individual health, social
interactions among cows, whether affiliative (e.g., grooming and allogrooming) or agonistic
(e.g., headbutting and displacement), offer critical insight into stress levels, dominance
hierarchies, resource access patterns, and overall herd cohesion [2]. Comprehensive monitoring
of these behaviors supports holistic herd welfare assessment and data-driven decision-making
in precision livestock management.
Behavior monitoring in dairy cattle has traditionally relied on contact sensors, including
accelerometers, magnetometers, and radio-frequency identification (RFID) ear-tag systems [3, 4].
These sensor-based systems, while accurate for specific behaviors, encounter significant
practical limitations such as animal discomfort, sensor detachment or loss, high maintenance
costs, and limited scalability across large herds. Although earlier generations of three-
dimensional vision and wearable sensing systems showed promising improvements in
locomotion tracking and calving detection, device durability and data integration challenges
persisted in farm-level applications [5]. These limitations catalyzed a transition toward non-
invasive computer vision methods, which have gained substantial momentum in automated
animal motion and behavior recognition systems [6, 7].
Recent advances in machine learning (ML) and deep learning (DL) models integrated with
computer vision have transformed livestock behavior analysis. Deep neural networks such as
C3D, ConvLSTM, and DenseNet-based spatiotemporal architectures achieve 86 to 98%
accuracies in classifying single-animal behaviors including walking, lying, and feeding [8, 9].
However, pixel-level models impose substantial computational demands.