GaussianPlant: Structure-aligned Gaussian Splatting for 3D Reconstruction of Plants
We present a method for jointly recovering the appearance and internal structure of botanical plants from multi-view images based on 3D Gaussian Splatting (3DGS). While 3DGS robustly reconstructs scene appearance for novel-view synthesis, it lacks a structural representation underlying that appearance (e.g., the branching patterns of plants), which limits its applicability to tasks such as plant phenotyping. To achieve both high-fidelity appearance and structural reconstruction, we introduce GaussianPlant, a hierarchical 3DGS representation that disentangles structure and appearance. Specifically, we employ structure primitives (StPs) to explicitly represent branch and leaf geometry, and appearance primitives (ApPs) to represent the plants’ appearance using 3D Gaussians. StPs represent a simplified structure of the plant, modeling branches as cylinders and leaves as disks. To accurately distinguish branches from leaves, each StP’s attribute (branch or leaf) is optimized in a self-organized manner. ApPs are bound to each StP to represent the appearance of branches or leaves as in conventional 3DGS. StPs and ApPs are jointly optimized using a re-rendering loss on the input multi-view images, with gradients flowing from ApPs to StPs through the binding correspondences. We conduct experiments to quantitatively evaluate the reconstruction accuracy of both appearance and structure, as well as real-world experiments to qualitatively validate practical performance. Experiments show that GaussianPlant achieves both high-fidelity appearance reconstruction via ApPs and accurate structural reconstruction via StPs, enabling the extraction of branch structure and leaf instances.
💡 Research Summary
GaussianPlant introduces a novel framework for jointly reconstructing the appearance and internal structure of plants from multi‑view RGB images by extending the recent 3D Gaussian Splatting (3DGS) paradigm. Traditional 3DGS excels at photorealistic novel‑view synthesis but lacks explicit structural cues, which limits its usefulness for tasks such as plant phenotyping that require knowledge of branching architecture and leaf instances. To bridge this gap, the authors propose a hierarchical representation consisting of Structure Primitives (StPs) and Appearance Primitives (ApPs).
StPs are low‑frequency Gaussian clusters that are converted into explicit geometric primitives: cylinders for branches and elliptic disks for leaves. Each StP carries a branch‑leaf probability pₛₜ, a 3D position µₛₜ, orientation Rₛₜ, and anisotropic scale Sₛₜ. The initial StP set is obtained by clustering a pre‑optimized 3DGS point cloud with k‑means, then applying PCA to each cluster to extract principal axes and scales. Anisotropy and planarity heuristics provide a coarse prior for pₛₜ (e.g., 0.6 for branch‑like clusters, 0.4 for leaf‑like clusters).
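The summary does not give the exact heuristic, but a common way to separate elongated (branch-like) from flat (leaf-like) clusters uses the sorted PCA eigenvalues λ₁ ≥ λ₂ ≥ λ₃ of each cluster. The sketch below follows that standard recipe; only the 0.6/0.4 prior values come from the text above, the linearity/planarity scores are an assumption:

```python
def branch_leaf_prior(l1, l2, l3):
    """Coarse branch/leaf prior for one StP cluster from its sorted PCA
    eigenvalues l1 >= l2 >= l3 (variances along the principal axes).

    linearity  ~ (l1 - l2) / l1 : high for elongated, branch-like clusters
    planarity  ~ (l2 - l3) / l1 : high for flat, leaf-like clusters
    Returns the coarse prior p_st: 0.6 (branch-like) or 0.4 (leaf-like).
    """
    assert l1 >= l2 >= l3 >= 0 and l1 > 0
    linearity = (l1 - l2) / l1
    planarity = (l2 - l3) / l1
    return 0.6 if linearity > planarity else 0.4
```

A thin branch cluster (one dominant axis) scores high linearity and receives the 0.6 prior; a flat leaf cluster (two comparable axes, one negligible) scores high planarity and receives 0.4.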
ApPs are high‑frequency 3D Gaussians identical to those used in vanilla 3DGS. They store view‑dependent color (base color + spherical harmonics), opacity, and a learnable semantic feature vector fₐₚ. ApPs are densely sampled on the explicit surfaces of StPs; their orientation aligns the Gaussian’s local z‑axis with the surface normal of the underlying StP.
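The paper's exact sampling scheme is not spelled out here, but the orientation step it describes, aligning each ApP's local z-axis with the StP surface normal, amounts to building a rotation that maps (0, 0, 1) onto the normal. A minimal sketch using the compact Rodrigues form (everything beyond the alignment idea is an assumption):

```python
def align_z_to_normal(n):
    """Rotation matrix (3x3 nested lists, row-major) mapping the local
    z-axis (0, 0, 1) onto the unit surface normal n = (nx, ny, nz) of the
    underlying StP. Uses the compact Rodrigues form
    R = I + K + K^2 / (1 + nz); the near-degenerate n ~ (0, 0, -1) case is
    handled separately to avoid division by zero.
    """
    nx, ny, nz = n
    if nz < -0.999999:  # normal points straight down: 180-degree flip
        return [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
    k = 1.0 / (1.0 + nz)
    return [
        [1.0 - nx * nx * k, -nx * ny * k,      nx],
        [-nx * ny * k,      1.0 - ny * ny * k, ny],
        [-nx,               -ny,               nz],
    ]
```

The third column of the returned matrix is n itself, i.e., the rotated z-axis coincides with the surface normal, so a Gaussian flattened along z lies tangent to the cylinder or disk surface.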
Training optimizes both primitive sets jointly using four complementary loss terms:
- Photometric loss (Lₚₕₒₜ) – standard L2 reconstruction between rendered images (computed by volumetric splatting of ApPs) and the input photographs. Gradients flow from ApPs to their bound StPs, updating StP positions, orientations, and scales indirectly.
- Semantic loss (Lₛₑₘ) – a contrastive alignment between DINO‑derived image features and the semantic vectors fₐₚ. This loss drives the branch‑leaf probability pₛₜ toward a self‑organized segmentation, effectively distinguishing branches from leaves without any manual labeling.
- Structural loss (Lₛₜᵣ) – a graph‑based regularizer that encourages StPs to form a tree‑like connectivity graph. It penalizes discontinuities, enforces smoothness along the cylinder axes, and promotes plausible branch lengths, thereby allowing the recovery of partially occluded branches.
- Regularization loss (Lᵣₑg) – standard priors on scale, opacity, and color to prevent degenerate solutions.
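The summary names the four terms but not their relative weights; the joint objective is presumably a weighted sum, as is standard for multi-task optimization. A minimal sketch with illustrative placeholder weights (the λ values are assumptions, not values from the paper):

```python
def total_loss(l_phot, l_sem, l_str, l_reg,
               w_sem=0.1, w_str=0.1, w_reg=0.01):
    """Joint objective over ApPs and StPs: photometric term plus weighted
    semantic, structural, and regularization terms. The weights w_* are
    illustrative placeholders, not values reported by the authors."""
    return l_phot + w_sem * l_sem + w_str * l_str + w_reg * l_reg
```

In practice each scalar above would be a differentiable tensor, so minimizing the sum backpropagates through the ApPs and, via the binding, into the StP parameters.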
Through this multi‑task optimization, ApPs capture fine‑grained appearance (textures, shading, view‑dependent effects) while StPs encode the coarse geometry and topology of the plant. The binding relationship between the two tiers enables a bidirectional flow of information: appearance cues refine structure, and structural cues constrain appearance.
The authors built a new dataset comprising indoor and outdoor plants captured from multiple viewpoints. Quantitative evaluations compare GaussianPlant against baseline 3DGS, SuGaR (surface‑aligned Gaussian reconstruction), and plant‑specific pipelines such as Splant‑ing and Wheat3DGS. Metrics include PSNR/SSIM for appearance fidelity, and branch‑length error and graph‑matching scores for structural accuracy. GaussianPlant consistently outperforms the baselines, achieving higher visual quality while delivering accurate branch graphs and leaf instance segmentation without any post‑hoc skeletonization.
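For reference, the PSNR metric used for appearance fidelity is computed from the mean squared error between rendered and ground-truth pixels; the standard definition (not code from the paper) for intensities normalized to [0, 1] is:

```python
import math

def psnr(rendered, target, peak=1.0):
    """Peak signal-to-noise ratio between two equally sized images,
    given as flat lists of pixel intensities in [0, peak].
    PSNR = 10 * log10(peak^2 / MSE); identical images give +inf."""
    mse = sum((r - t) ** 2 for r, t in zip(rendered, target)) / len(rendered)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(peak ** 2 / mse)
```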
Qualitative results demonstrate that the recovered branch graph can be directly extracted from the StP set, and leaf instances are obtained by clustering leaf‑labeled StPs. The method works on dense foliage where traditional point‑cloud skeletonization fails, and it does not require species‑specific priors, manual masks, or LiDAR scans.
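The summary says leaf instances are obtained by clustering leaf-labeled StPs but does not name the clustering algorithm. One simple possibility is single-linkage grouping by center distance with union-find; the sketch below is that generic approach, with `radius` an assumed hyperparameter:

```python
def leaf_instances(centers, radius=0.05):
    """Group leaf-labeled StP centers into leaf instances: two StPs join
    the same instance if their centers lie within `radius` (single-linkage
    clustering via union-find, O(n^2)). `radius` is an assumed
    hyperparameter, not a value from the paper."""
    parent = list(range(len(centers)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            d2 = sum((a - b) ** 2 for a, b in zip(centers[i], centers[j]))
            if d2 <= radius * radius:
                parent[find(i)] = find(j)

    roots = [find(i) for i in range(len(centers))]
    ids = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return [ids[r] for r in roots]  # consecutive instance ids
```

Nearby leaf StPs chain into one instance, so a leaf represented by several disks still yields a single instance label.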
Potential applications span high‑throughput phenotyping (automatic measurement of branch length, leaf count, and leaf area), growth modeling, robotic manipulation of plants, and creation of photorealistic CG assets. Because the framework relies only on RGB images and self‑organized learning, it scales to large, heterogeneous plant collections with minimal data acquisition overhead.
In summary, GaussianPlant extends 3DGS with a dual‑primitive hierarchy, semantic and structural regularizers, and a binding‑based optimization scheme that together enable simultaneous high‑fidelity appearance rendering and accurate internal structure recovery of plants. Future work may explore dynamic growth modeling, root system reconstruction, and cross‑species generalization, further cementing the approach as a versatile tool for both computer vision and graphics communities.