Boxplots and quartile plots for grouped and periodic angular data


Angular observations, or observations lying on the unit circle, arise in many disciplines and require special care in their description, analysis, interpretation and visualization. We provide methods to construct concentric circular boxplot displays of distributions of groups of angular data. The use of concentric boxplots brings challenges of visual perception, so we set the boxwidths to be inversely proportional to the square root of their distance from the centre. A perception survey supports this scaled boxwidth choice. For a large number of groups, we propose circular quartile plots. A three-dimensional toroidal display is also implemented for periodic angular distributions. We illustrate our methods on datasets in (1) psychology, to display motor resonance under different conditions, (2) genomics, to understand the distribution of peak phases for ancillary clock genes, and (3) meteorology and wind turbine power generation, to study the changing and periodic distribution of wind direction over the course of a year.


💡 Research Summary

The paper addresses the challenge of visualizing angular (circular) data, which appear in many scientific fields and require special treatment because of their periodic nature. Traditional linear boxplots cannot be directly applied, and existing circular visualizations such as rose diagrams or single‑group circular boxplots are limited when multiple groups or temporal periodicity must be displayed.

The authors first extend the circular boxplot introduced by Fisher (2010) to handle several groups simultaneously by arranging the individual group boxplots concentrically. A key perceptual problem arises: when boxes of equal width are drawn at different radii, an outer box covers a longer arc and a larger area than an inner box spanning the same angular interval, so it appears larger and misleads viewers about the true spread of the data. To counter this, the authors set the box width inversely proportional to the square root of the radius (width ∝ 1/√r). Because the area of a box spanning a fixed angular interval is roughly its arc length (∝ r) times its width, this scaling slows the growth of box area with radius to roughly √r rather than removing it entirely, which makes the perceived spread less dependent on radius.
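The effect of the width rule on box area can be checked numerically. The following is a minimal sketch, not the package's code: the function names, the `base_width` and `base_radius` parameters, and the example radii are assumptions made for illustration. Each box is treated as an annular sector with a given inner radius, radial width, and angular span.

```python
import math


def scaled_box_width(radius, base_width=1.0, base_radius=1.0):
    """Box width proportional to 1/sqrt(radius).

    Hypothetical helper: returns base_width when radius == base_radius.
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    return base_width * math.sqrt(base_radius / radius)


def box_area(inner_radius, width, angular_span):
    """Area of an annular sector: 0.5 * span * (outer^2 - inner^2)."""
    outer_radius = inner_radius + width
    return 0.5 * angular_span * (outer_radius ** 2 - inner_radius ** 2)


if __name__ == "__main__":
    span = math.pi / 2  # every box covers the same quarter circle
    for r in (1.0, 2.0, 4.0):
        fixed = box_area(r, 0.5, span)
        scaled = box_area(r, scaled_box_width(r, base_width=0.5), span)
        print(f"r={r}: fixed-width area={fixed:.3f}, scaled-width area={scaled:.3f}")
```

Running the loop shows that the area of a fixed-width box grows much faster with radius than the area of a box whose width follows the 1/√r rule.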

A perception survey with 64 participants (all with at least introductory statistics training) compared the unscaled and scaled designs. With the unscaled design, 76 % of respondents incorrectly judged the outer group to have the larger spread; with the scaled design, only 30 % made that error. A one‑sided McNemar test on the paired responses yielded a highly significant p‑value (7.6 × 10⁻⁶), indicating that the scaling substantially improves accurate perception.
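McNemar's test uses only the discordant pairs, that is, respondents whose answer changed between the two designs. The sketch below implements the exact one-sided version as a binomial tail probability. The summary does not report the discordant counts, so the values passed in the example are hypothetical and are not intended to reproduce the reported p-value; the function name is also an assumption.

```python
from math import comb


def mcnemar_exact_one_sided(b, c):
    """Exact one-sided McNemar p-value.

    b: pairs that were wrong with the first design and right with the second
    c: pairs that were right with the first design and wrong with the second
    Under the null hypothesis, b ~ Binomial(b + c, 0.5); the p-value is
    P(X >= b), the evidence that the second design reduces errors.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence either way
    return sum(comb(n, k) for k in range(b, n + 1)) / 2 ** n


if __name__ == "__main__":
    # Hypothetical discordant counts, for illustration only.
    print(mcnemar_exact_one_sided(b=12, c=2))
```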

When the number of groups becomes large, scaling each box by radius becomes impractical. The authors therefore adapt the quartile plot concept (originally proposed for linear data) to the circular setting. Instead of a full box, a thick arc represents the inter‑quartile range, while thin gray arcs serve as whiskers. This “circular quartile plot” reduces visual clutter and eliminates the area‑based bias inherent in full boxes.
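A quartile arc needs a circular median and two quartile angles. The sketch below shows one simple way to obtain them, and it is an assumption rather than the paper's definition: the median is taken as the observation with the smallest total arc distance to all others, and the quartiles are ordinary linear quantiles of the signed offsets from that median. All function names are hypothetical, and the approach is only reasonable for data that do not wrap most of the way around the circle.

```python
import math

TWO_PI = 2 * math.pi


def circ_dist(a, b):
    """Shortest arc length between two angles in radians."""
    d = abs(a - b) % TWO_PI
    return min(d, TWO_PI - d)


def circular_median(angles):
    """Observation minimising the total arc distance to all observations."""
    return min(angles, key=lambda m: sum(circ_dist(a, m) for a in angles))


def linear_quantile(sorted_values, p):
    """Quantile with linear interpolation between order statistics."""
    pos = p * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def circular_quartiles(angles):
    """Return (q1, median, q3) as angles in [0, 2*pi)."""
    med = circular_median(angles)
    # Signed offsets from the median, mapped into [-pi, pi).
    offsets = sorted(((a - med + math.pi) % TWO_PI) - math.pi for a in angles)
    q1 = linear_quantile(offsets, 0.25)
    q3 = linear_quantile(offsets, 0.75)
    return (med + q1) % TWO_PI, med % TWO_PI, (med + q3) % TWO_PI


if __name__ == "__main__":
    data = [math.radians(d) for d in (350, 355, 0, 5, 10)]
    print([round(math.degrees(q), 1) for q in circular_quartiles(data)])
```

The thick arc of a quartile plot would then run from `q1` to `q3` through the median, which handles data that straddle the 0°/360° boundary.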

To capture periodicity over time (e.g., wind direction throughout a year), the paper introduces a three‑dimensional toroidal display. The torus’s circular cross‑section encodes the angular variable, while the toroidal axis encodes a periodic temporal variable (such as month). Grouped circular boxplots or quartile plots are placed on the torus surface, allowing simultaneous inspection of seasonal shifts in direction and their dispersion.
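The mapping from a (time, direction) pair to a point on a torus can be sketched with the standard torus parametrisation. This is an illustration under stated assumptions, not the package's implementation: the function names, the default radii, and the choice of month as the time unit are all hypothetical.

```python
import math


def month_to_angle(month):
    """Map month 1..12 to an angle in [0, 2*pi) around the torus axis."""
    return 2 * math.pi * (month - 1) / 12


def torus_point(time_angle, direction_angle, major_radius=3.0, minor_radius=1.0):
    """Standard torus parametrisation.

    time_angle: position around the main ring (the periodic time variable)
    direction_angle: position around the tube cross-section (the angular datum)
    """
    ring = major_radius + minor_radius * math.cos(direction_angle)
    x = ring * math.cos(time_angle)
    y = ring * math.sin(time_angle)
    z = minor_radius * math.sin(direction_angle)
    return x, y, z


if __name__ == "__main__":
    # A direction of 90 degrees observed in April (month 4).
    print(torus_point(month_to_angle(4), math.radians(90)))
```

Because both coordinates are angles, December sits next to January and 359° sits next to 0°, which is the property a flat display cannot provide.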

Three real‑world case studies demonstrate the utility of the proposed visualizations:

  1. Motor resonance experiment (psychology) – Phase differences between a mover’s hand and an observer’s hand were recorded under three observation conditions (explicit, semi‑implicit, implicit). The concentric circular boxplots clearly separate the median phase shifts and reveal subtle differences in spread and skewness across conditions.

  2. Ancillary clock‑gene peak phases (genomics) – Peak expression phases of secondary circadian genes were measured across several tissues. The circular boxplots expose tissue‑specific phase clustering and variability, facilitating biological interpretation of tissue‑specific circadian regulation.

  3. Wind direction and turbine power (meteorology) – Annual wind direction data were grouped by month and visualized on a toroidal boxplot. The torus makes it easy to see that certain months exhibit tightly clustered wind directions (high potential for turbine efficiency), while others show broader distributions.

All methods are implemented in the R package CircularBoxplots, which provides functions for concentric circular boxplots, circular quartile plots, and toroidal visualizations. The package builds on the existing R package circular for angular statistics (e.g., von Mises fitting, mean resultant length).
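For readers unfamiliar with the mean resultant length mentioned above, the following sketch computes it together with the circular mean direction. It is written in plain Python for illustration and does not reproduce the API of either R package; the function names are assumptions.

```python
import math


def mean_resultant_length(angles):
    """Length of the average unit vector: near 1 for tightly clustered
    angles, near 0 for angles spread evenly around the circle."""
    n = len(angles)
    c = sum(math.cos(a) for a in angles) / n
    s = sum(math.sin(a) for a in angles) / n
    return math.hypot(c, s)


def circular_mean(angles):
    """Direction of the average unit vector, in [0, 2*pi)."""
    c = sum(math.cos(a) for a in angles)
    s = sum(math.sin(a) for a in angles)
    return math.atan2(s, c) % (2 * math.pi)


if __name__ == "__main__":
    clustered = [math.radians(d) for d in (80, 90, 100)]
    print(mean_resultant_length(clustered), math.degrees(circular_mean(clustered)))
```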

The discussion emphasizes design principles for minimizing perceptual bias (e.g., width scaling, color contrast) and outlines future extensions such as handling multivariate circular data (direction plus magnitude), interactive web‑based visualizations (Shiny, Plotly), and performance optimization for very large datasets.

In conclusion, by introducing radius‑scaled concentric boxplots, circular quartile plots for many groups, and a toroidal 3‑D display for periodic data, the authors provide a comprehensive toolkit that greatly improves the clarity and interpretability of grouped and time‑varying angular data across diverse scientific domains.

