Research | 2026-06-26

Charting the Growth of Social-Physical HRI (spHRI): A Systematic Review Pipeline Augmented by Small Language Models

arXiv:2606.26382v1 (announce type: cross). Abstract: Social-physical human-robot interaction (spHRI) has grown rapidly across robotics, human-computer interaction, human-robot interaction, and haptics. Yet, fragmented terminology and inconsistent methodologies make systematic synthesis difficult. To...

What Happened

Researchers have published a systematic review pipeline for social-physical human-robot interaction (spHRI), augmented by small language models (SLMs). The work addresses a growing but fragmented field where robotics, haptics, and human-computer interaction converge. The core innovation is methodological: rather than relying solely on manual literature curation or large, resource-intensive language models, the authors deploy smaller, more efficient SLMs to automate parts of the systematic review process—specifically screening, categorization, and terminology mapping.

The paper establishes a structured pipeline that first identifies relevant spHRI literature across multiple databases, then uses SLMs to classify studies based on interaction modalities (e.g., touch, gesture, force feedback) and social contexts (e.g., collaboration, assistance, therapy). The goal is to create a unified taxonomy for a domain that currently suffers from inconsistent vocabulary—where one researcher’s “physical human-robot interaction” might be another’s “haptic social robotics.”
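The classification step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the label sets are just the examples named in the paragraph, `tag_study` and `keyword_tagger` are hypothetical names, and the keyword matcher is a stand-in for wherever a small language model call would go.

```python
from dataclasses import dataclass
from typing import Callable

# Assumed label sets, taken only from the examples in the text above.
MODALITIES = ("touch", "gesture", "force feedback")
CONTEXTS = ("collaboration", "assistance", "therapy")


@dataclass
class TaggedStudy:
    title: str
    modalities: list[str]
    contexts: list[str]


def keyword_tagger(text: str, labels: tuple[str, ...]) -> list[str]:
    """Placeholder for an SLM call: keep each label that literally appears in the text."""
    lowered = text.lower()
    return [label for label in labels if label in lowered]


def tag_study(
    title: str,
    abstract: str,
    tagger: Callable[[str, tuple[str, ...]], list[str]] = keyword_tagger,
) -> TaggedStudy:
    """Tag one study with interaction modalities and social contexts."""
    text = f"{title} {abstract}"
    return TaggedStudy(title, tagger(text, MODALITIES), tagger(text, CONTEXTS))


example = tag_study(
    "Comforting touch from a robot arm",
    "We study touch and force feedback during therapy sessions.",
)
print(example.modalities, example.contexts)
```

Passing the tagger in as a function keeps the pipeline shape the same whether the labels come from keywords, a fine-tuned small model, or a human reviewer.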

Why It Matters

The spHRI field has grown rapidly, but this growth has come with a cost: researchers use different terms for similar phenomena, making it nearly impossible to compare results or build cumulative knowledge. A robot that gently touches a human to convey comfort might be described as performing “affective haptics,” “social touch,” or “physical human-robot interaction,” depending on the lab. This fragmentation slows progress and creates barriers for newcomers.
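To make the terminology problem concrete, here is a small sketch of mapping lab-specific terms onto one shared label. The synonym table and the choice of "social touch" as the canonical label are assumptions made for illustration; they are not the taxonomy proposed in the paper.

```python
from collections import Counter

# Hypothetical synonym table built from the three terms mentioned above.
# The canonical label is an arbitrary choice for this example.
SYNONYMS = {
    "affective haptics": "social touch",
    "social touch": "social touch",
    "physical human-robot interaction": "social touch",
}


def normalize_term(term: str) -> str:
    """Lowercase, collapse whitespace, then map known synonyms to one label."""
    key = " ".join(term.lower().split())
    return SYNONYMS.get(key, key)


# Keywords as three different labs might report them for the same phenomenon.
reported = ["Affective Haptics", "social touch", "Physical Human-Robot Interaction"]
counts = Counter(normalize_term(term) for term in reported)
print(counts)
```

Without the mapping, the three keywords would count as three separate topics; with it, they collapse into a single comparable category.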

The use of SLMs is particularly notable. Large language models (LLMs) like GPT-4 are powerful but expensive, both computationally and financially. For a systematic review—a task that requires processing thousands of papers—LLM costs can become prohibitive. SLMs offer a pragmatic middle ground: they are cheaper, faster, and can be fine-tuned on domain-specific data. This paper demonstrates that for structured classification tasks like tagging interaction types, SLMs can achieve high accuracy without the overhead of larger models.

For the broader AI community, this work serves as a case study in how to deploy language models for knowledge synthesis rather than generation. Many current applications focus on chatbots or content creation; this paper shows a more analytical use case—organizing and making sense of existing research.

Implications for AI Practitioners

First, practitioners working on human-robot interaction should adopt the proposed taxonomy to ensure their work is discoverable and comparable. Without standardized terminology, even excellent research can be overlooked.

Second, the SLM-augmented pipeline offers a template for any domain facing literature fragmentation. If your field has multiple overlapping sub-communities (e.g., AI safety, explainable AI, or multi-agent systems), a similar approach could help map the landscape.

Third, the cost-efficiency of SLMs is a practical lesson. Not every NLP task requires a frontier model. For classification, filtering, and structured extraction, smaller models can often deliver most of the value at a fraction of the cost; as a rough rule of thumb, something like 90% of the value at 10% of the cost. Practitioners should evaluate whether their use case truly needs the generative power of an LLM or whether a smaller, specialized model suffices.
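One simple way to run that evaluation is to label a small sample by hand and check how often the smaller model agrees. The sketch below assumes exactly that setup; the labels, the sample, and the 0.9 threshold are made-up illustrative values, not results from the paper.

```python
def accuracy(predicted: list[str], gold: list[str]) -> float:
    """Fraction of items where the model label matches the human label."""
    if not gold:
        raise ValueError("gold labels must not be empty")
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    matches = sum(p == g for p, g in zip(predicted, gold))
    return matches / len(gold)


# Hypothetical hand-labeled sample and small-model output.
gold_labels = ["touch", "gesture", "touch", "force feedback"]
small_model_labels = ["touch", "gesture", "touch", "touch"]

REQUIRED_ACCURACY = 0.9  # assumed project-specific threshold

score = accuracy(small_model_labels, gold_labels)
small_model_suffices = score >= REQUIRED_ACCURACY
print(f"accuracy={score:.2f}, small model suffices: {small_model_suffices}")
```

In practice the sample would need to be large enough to trust the estimate, and per-label metrics may matter more than overall accuracy when some categories are rare.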

Finally, this work highlights the importance of systematic review methodology in fast-moving fields. As AI research accelerates, the ability to synthesize knowledge becomes a competitive advantage. Tools that automate parts of this process will become essential.

Key Takeaways

The spHRI field suffers from fragmented terminology; this paper proposes a unified taxonomy using SLM-augmented systematic review.
Small language models can effectively automate literature classification at lower cost than large models, offering a practical alternative for knowledge synthesis.
AI practitioners in any fragmented domain can adopt this pipeline to map research landscapes and identify gaps.
The work reinforces that not all NLP tasks require large models—smaller, task-specific models often provide sufficient accuracy with greater efficiency.

