A Dataset and Benchmark for Robotic Cloth Unfolding Grasp Selection: The ICRA 2024 Cloth Competition
Robotic cloth manipulation suffers from a lack of standardized benchmarks and shared datasets for evaluating and comparing different approaches. To address this, we created a benchmark and organized the ICRA 2024 Cloth Competition, a unique head-to-head evaluation focused on grasp pose selection for in-air robotic cloth unfolding. Eleven diverse teams participated in the competition, utilizing our publicly released dataset of real-world robotic cloth unfolding attempts and a variety of methods to design their unfolding approaches. Afterwards, we expanded our dataset with 176 competition evaluation trials, resulting in a dataset of 679 unfolding demonstrations across 34 garments. Analysis of the competition results revealed insights into the trade-off between grasp success and coverage, the surprisingly strong performance of hand-engineered methods, and a significant discrepancy between competition performance and prior work, underscoring the importance of independent, out-of-the-lab evaluation in robotic cloth manipulation. The associated dataset is a valuable resource for developing and evaluating grasp selection methods, particularly for learning-based approaches. We hope that our benchmark, dataset and competition results can serve as a foundation for future benchmarks and drive further progress in data-driven robotic cloth manipulation. The dataset and benchmarking code are available at https://airo.ugent.be/cloth_competition.
💡 Research Summary
This paper addresses a critical gap in robotic cloth manipulation research: the lack of standardized benchmarks and publicly available, large‑scale datasets for evaluating grasp‑selection strategies. To fill this void, the authors designed a benchmark focused on in‑air cloth unfolding through re‑grasping and organized the ICRA 2024 Cloth Competition, a head‑to‑head contest in which eleven diverse teams competed on a shared dual‑arm robot platform.
Benchmark design – The task is deliberately simple: a cloth is initially held by one gripper, a second gripper must select a new grasp point while the cloth hangs freely, and both grippers then execute a single stretch motion. Only the grasp‑selection algorithm is left to the participants; the stretching motion is fixed. Evaluation metrics are (1) grasp success rate (whether the chosen point yields a stable grasp), (2) final coverage (the ratio of the unfolded cloth’s planar area to its original area), and (3) execution time per attempt. All metrics are measured automatically using RGB‑D cameras and a calibrated laser‑scanner, ensuring objective, reproducible scoring.
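The coverage metric described above can be illustrated with a minimal sketch. The mask-based computation and the flattened-area reference below are assumptions for illustration, not the authors' actual evaluation code:

```python
import numpy as np

def final_coverage(unfolded_mask: np.ndarray, flattened_area_px: float) -> float:
    """Ratio of the cloth's visible planar area after unfolding to its
    area when fully flattened, both measured in pixels of a top-down view.

    unfolded_mask: boolean array, True where cloth is visible.
    flattened_area_px: reference area of the fully flattened garment.
    (Hypothetical helper; the benchmark's real pipeline uses RGB-D
    cameras and a calibrated scanner.)
    """
    unfolded_area = float(np.count_nonzero(unfolded_mask))
    return unfolded_area / flattened_area_px

# Toy example: the cloth covers 48 of 100 reference pixels.
mask = np.zeros((10, 10), dtype=bool)
mask[:6, :8] = True  # 6 x 8 = 48 visible pixels
print(final_coverage(mask, 100.0))  # 0.48
```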
Dataset – The authors released an initial set of 500 real‑world unfolding attempts covering 34 garment types (t‑shirts, shirts, towels, scarves, etc.). During the competition an additional 176 trials were recorded, resulting in a total of 679 demonstrations. Each trial includes synchronized RGB‑D image sequences, full 6‑DOF grasp poses (position, orientation, and grasp depth), binary success labels, final coverage values, and auxiliary metadata such as material properties, friction coefficients, and layer count. The data are provided in a clean JSON‑based format, making them immediately usable for supervised learning, reinforcement learning, or simulation‑to‑real transfer studies.
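A JSON-based trial record like the one described could be consumed along these lines. The field names (`grasp_pose`, `success`, `coverage`) are illustrative placeholders, not the dataset's actual schema:

```python
import json

def load_trial(path: str) -> dict:
    """Load one unfolding-trial record and pull out the fields a
    supervised grasp-selection model would train on. Field names are
    hypothetical stand-ins for the dataset's real schema."""
    with open(path) as f:
        trial = json.load(f)
    return {
        "grasp_pose": trial["grasp_pose"],  # assumed: 6-DOF pose as a list
        "success": trial["success"],        # assumed: binary grasp label
        "coverage": trial["coverage"],      # assumed: final coverage in [0, 1]
    }

# Toy round-trip with a fabricated record:
import tempfile
record = {"grasp_pose": [0.1, 0.2, 0.3, 0.0, 0.0, 0.0],
          "success": True, "coverage": 0.55}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(record, f)
    tmp_path = f.name
print(load_trial(tmp_path)["coverage"])  # 0.55
```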
Competition outcomes – Teams fell into three methodological categories: (i) hand‑engineered geometric heuristics (e.g., highest/lowest point, edge‑based corner detection), (ii) deep‑learning models that predict grasp points directly from images, and (iii) reinforcement‑learning or simulation‑based policies. Surprisingly, the hand‑engineered approaches performed on par with, and sometimes outperformed, the learning‑based methods. The overall average grasp success rate across all teams was 78 %, with the best team achieving 92 %. However, the average final coverage was only 0.48, and the top‑performing team reached 0.60—substantially lower than the 0.70–0.80 coverage reported in many laboratory studies.
Key insights –
- Trade‑off between success and coverage: Selecting points that are easy to grasp (central, flat regions) yields high success but limited unfolding; targeting extreme points (corners, edges) increases coverage but raises the risk of multi‑layer grasps and slip. This demonstrates the need for multi‑objective optimization rather than a single scalar metric.
- Hand‑engineered methods remain competitive: Simple geometric heuristics are robust to sensor noise, calibration errors, and material variability, suggesting that current deep models may overfit to controlled lab conditions.
- Real‑world performance gap: The drop from reported laboratory coverage to competition results highlights the importance of out‑of‑lab evaluation; factors such as lighting changes, cloth wrinkling, and robot dynamics significantly affect performance.
- Data‑driven opportunities: The released dataset, with its rich annotations, enables researchers to train more generalizable perception models, explore domain randomization, and develop simulation‑based pre‑training pipelines.
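The success-versus-coverage trade-off in the first insight can be made concrete as a multi-objective scoring rule. The weighting scheme and candidate format below are assumptions for illustration, not a method used by any competition team:

```python
def select_grasp(candidates, w_success=0.5):
    """Pick the candidate maximizing a weighted sum of predicted
    grasp-success probability and expected coverage.

    candidates: list of (success_prob, expected_coverage) tuples.
    w_success: weight on grasp success; 1 - w_success goes to coverage.
    (Hypothetical scoring rule illustrating the trade-off only.)
    """
    def score(c):
        success_prob, expected_coverage = c
        return w_success * success_prob + (1 - w_success) * expected_coverage
    return max(candidates, key=score)

# A 'safe' central grasp vs. a 'risky' corner grasp:
safe = (0.95, 0.40)   # easy to grasp, little unfolding
risky = (0.60, 0.75)  # corner grasp: more coverage, more slip risk
print(select_grasp([safe, risky], w_success=0.7))  # (0.95, 0.4)
print(select_grasp([safe, risky], w_success=0.2))  # (0.6, 0.75)
```

Shifting `w_success` flips the decision, which is exactly why a single scalar metric hides the trade-off the competition exposed.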
Future directions – The authors propose extending the benchmark to multi‑grasp, multi‑step unfolding, integrating tactile and force feedback for better depth estimation, and combining simulation‑generated data with the real dataset for hybrid training. They also emphasize the value of automated data‑collection pipelines that let robots self‑label failures, which could dramatically scale future datasets.
In summary, this work delivers the first large, publicly available dataset and a rigorously defined benchmark for robotic cloth unfolding via grasp selection. The competition results reveal that the field is still in its early stages, with substantial room for improvement, especially in bridging the gap between controlled laboratory experiments and real‑world deployment. The dataset and benchmarking code are made freely available, providing a solid foundation for the community to develop, compare, and advance data‑driven cloth manipulation methods.