Hierarchical Semi-Markov Conditional Random Fields for Recursive Sequential Data
Inspired by the hierarchical hidden Markov model (HHMM), we present the hierarchical semi-Markov conditional random field (HSCRF), a generalisation of embedded undirected Markov chains to model complex hierarchical, nested Markov processes. It is parameterised in a discriminative framework and has polynomial-time algorithms for learning and inference. Importantly, we consider partially-supervised learning and propose algorithms for generalised partially-supervised learning and constrained inference. We demonstrate the HSCRF in two applications: (i) recognising human activities of daily living (ADLs) from indoor surveillance cameras, and (ii) noun-phrase chunking. We show that the HSCRF is capable of learning rich hierarchical models with reasonable accuracy in both fully and partially observed data cases.
💡 Research Summary
The paper introduces the Hierarchical Semi‑Markov Conditional Random Field (HSCRF), a discriminatively trained model that unifies the expressive power of hierarchical hidden Markov models (HHMMs) with the flexibility of conditional random fields (CRFs). While HHMMs can capture nested temporal structures, their inference cost grows exponentially with the depth of the hierarchy. Conversely, CRFs provide a log‑linear, feature‑rich framework but lack an intrinsic hierarchical representation. HSCRF bridges this gap by embedding undirected Markov chains at each level of a hierarchy, allowing higher‑level states to govern the initiation and termination of lower‑level chains. Each level possesses its own set of feature functions and weight vectors, preserving the discriminative nature of CRFs while modeling inter‑level dependencies.
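The nesting constraint described above — a higher-level state may only change once the chain below it has terminated — can be illustrated with a small consistency check. This is a hypothetical sketch, not the paper's actual encoding: `parent_labels` and `child_labels` are per-time-slice state sequences, and `child_ends` marks where the lower-level chain terminates and returns control to its parent.

```python
def check_nesting(parent_labels, child_labels, child_ends):
    """Check that child segments nest inside parent segments.

    A two-level labelling is consistent when the parent-level state
    changes only at time slices where the child chain has just ended
    (i.e. control has been returned to the parent level).
    """
    T = len(parent_labels)
    assert len(child_labels) == T and len(child_ends) == T
    for t in range(1, T):
        parent_changed = parent_labels[t] != parent_labels[t - 1]
        if parent_changed and not child_ends[t - 1]:
            # Parent switched while its child chain was still running.
            return False
    return True
```

For example, a parent switching from state 0 to 1 is valid only if the child chain ended on the preceding slice; otherwise the labelling violates the hierarchy.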
Parameter learning is performed by maximizing the conditional likelihood. For fully observed data, standard gradient‑based optimization with forward‑backward messages suffices. The authors extend this to partially supervised scenarios where only a subset of the hierarchical labels is available. Unobserved levels are treated as latent variables; constraints derived from the observed labels are incorporated via Lagrange multipliers, leading to a constrained variational objective. The resulting learning algorithm alternates between computing constrained forward‑backward expectations and updating parameters, all in polynomial time.
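For log-linear models such as the HSCRF, the gradient of the conditional log-likelihood is the familiar difference between empirical and model-expected feature counts. The following is a minimal single-step sketch of that update, assuming the expectations have already been computed by the (constrained) forward-backward pass; the function name and signature are illustrative, not from the paper.

```python
def gradient_step(weights, empirical_feats, expected_feats, lr=0.1):
    """One gradient-ascent step on the conditional log-likelihood.

    For a log-linear model, d(log-likelihood)/dw_k is
    empirical_feats[k] - expected_feats[k], so each weight moves
    toward matching the model's expectations to the data.
    """
    return [
        w + lr * (emp - exp_)
        for w, emp, exp_ in zip(weights, empirical_feats, expected_feats)
    ]
```

In the partially supervised setting the summary describes, the same update applies, but `expected_feats` would come from expectations constrained to agree with whatever hierarchical labels are observed.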
Inference follows a hierarchical dynamic‑programming scheme. The forward‑backward algorithm is generalized to propagate messages both within a level (as in semi‑Markov CRFs) and across levels (to respect the hierarchical constraints). The computational complexity per level is O(T·S²), where T is the sequence length and S the number of states, yielding an overall polynomial‑time solution even for deep hierarchies. Constrained inference—required when test‑time partial labels or external constraints are present—is handled by pruning the state space according to the constraints and using a bit‑mask‑based efficient traversal.
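The within-level portion of this scheme can be sketched as a standard log-space forward recursion over one chain. This is a simplified single-level illustration (it omits the semi-Markov segment durations and the cross-level messages), but it shows where the per-level O(T·S²) cost comes from: each of the T time slices sums over S×S transitions.

```python
import math

def _logsumexp(xs):
    """Numerically stable log(sum(exp(x))) over an iterable."""
    xs = list(xs)
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def forward_level(log_trans, log_emit):
    """Forward pass for a single level of the hierarchy, in log space.

    log_trans[i][j]: log-potential for moving from state i to state j.
    log_emit[t][j]:  log-potential for being in state j at time t.
    Returns the log partition value for this chain; the loop structure
    makes the O(T * S^2) per-level cost explicit.
    """
    T, S = len(log_emit), len(log_emit[0])
    alpha = list(log_emit[0])  # initialise with the first time slice
    for t in range(1, T):      # T slices ...
        alpha = [
            log_emit[t][j]
            + _logsumexp(alpha[i] + log_trans[i][j] for i in range(S))
            for j in range(S)  # ... each summing over S x S transitions
        ]
    return _logsumexp(alpha)
```

Constrained inference, as described above, would simply skip (or assign -inf potential to) any state pruned by the test-time constraints, leaving the recursion otherwise unchanged.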
The authors evaluate HSCRF on two distinct tasks. The first is activity‑of‑daily‑living (ADL) recognition from indoor surveillance video. The dataset comprises five top‑level activities (e.g., cooking, cleaning) each decomposed into three sub‑activities, forming a two‑level hierarchy. HSCRF achieves 92.3 % accuracy when all labels are provided and retains 85.7 % accuracy when only 30 % of the hierarchical labels are observed during training, outperforming both HHMM baselines and flat CRFs. The second task is noun‑phrase (NP) chunking in natural language processing. Here, word tokens and part‑of‑speech tags serve as observations, while the hierarchy captures phrase boundaries and internal structure. HSCRF obtains an F1 score of 94.1 % with full supervision and 89.3 % with partial supervision, again surpassing comparable models.
These results demonstrate that HSCRF can learn rich hierarchical representations with competitive accuracy while remaining computationally tractable. Its ability to handle partially observed data makes it particularly attractive for real‑world applications where labeling costs are high or annotations are incomplete. The paper concludes by outlining future directions: extending the framework to more general directed acyclic graph structures, developing online or incremental learning algorithms for streaming data, exploring more efficient constraint‑propagation mechanisms, and applying HSCRF to other domains such as biological sequence analysis or multimodal sensor fusion. Overall, HSCRF represents a significant step toward scalable, discriminative modeling of recursive sequential data.