IVAC-$\mathbf {P^{2}L}$: Leveraging Irregular Repetition Priors for Improving Video Action Counting

Hang Wang; Zhi-Qi Cheng; Youtian Du; Lei Zhang

doi:10.1109/tmm.2025.3604935

Abstract

The quantification of repetitive actions in videos, a task commonly referred to as Video Action Counting (VAC), is a critical challenge in understanding and analyzing content in sports, fitness, and daily activities. Traditional approaches to VAC have largely overlooked the nuanced irregularities inherent in action repetitions, such as interruptions and variable lengths between cycles. Addressing this gap, our study introduces a novel perspective on VAC, focusing on Irregular Video Action Counting (IVAC), which emphasizes the importance of modeling the irregular repetition priors present in video content. We conceptualize these priors through two key aspects: Inter-cycle Consistency and Cycle-interval Inconsistency. Inter-cycle Consistency ensures that the spatiotemporal representations across all cycle segments in a video remain homogeneous, thereby reflecting the uniformity of actions between different cycle segments. In contrast, Cycle-interval Inconsistency mandates a clear semantic distinction between the representations of cycle segments and intervals, acknowledging the inherent dissimilarities in content. To effectively encapsulate these priors, we introduce a novel methodology consisting of consistency and inconsistency modules, underpinned by a tailored pull-push loss ($\mathrm{P^{2}L}$) mechanism. This approach employs a pull loss to enhance the cohesion among cycle segment features and a push loss to distinctly differentiate between cycle and interval segment features. Empirical evaluations on the RepCount dataset illustrate that our IVAC-$\mathrm{P^{2}L}$ model sets a new benchmark in state-of-the-art performance for the VAC task. Moreover, our model demonstrates adaptability and generalization across diverse video content, achieving superior performance on two additional datasets, UCFRep and Countix, without necessitating dataset-specific fine-tuning. These findings not only validate the effectiveness of our approach in addressing the complexities of irregular repetitions in videos but also open new avenues for future research in video understanding and analysis.
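
The pull and push terms described in the abstract can be illustrated with a small sketch. This is not the authors' formulation: the abstract does not give the loss equations, so the cosine-similarity form, the hinge `margin`, and the names `pull_push_loss`, `features`, and `is_cycle` are all assumptions chosen for illustration. The sketch assumes each video has already been split into segments, each with a feature vector and a flag marking it as a cycle or an interval.

```python
import math
from itertools import combinations


def _unit(v):
    """Scale a vector to unit length (a zero vector is left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def _cos(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def pull_push_loss(features, is_cycle, margin=0.5):
    """Hypothetical pull-push loss over per-segment features.

    features: list of feature vectors, one per segment (assumed input).
    is_cycle: list of bools, True for cycle segments, False for intervals.
    margin:   assumed hinge threshold on cycle/interval similarity.
    Returns (pull, push) as two floats.
    """
    unit = [_unit(f) for f in features]
    cycles = [u for u, c in zip(unit, is_cycle) if c]
    intervals = [u for u, c in zip(unit, is_cycle) if not c]

    # Pull: penalise dissimilarity between every pair of cycle segments,
    # so cycle features are drawn together (Inter-cycle Consistency).
    pairs = list(combinations(cycles, 2))
    pull = sum(1.0 - _cos(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0

    # Push: penalise cycle/interval pairs whose similarity exceeds the
    # margin, so the two groups are kept apart (Cycle-interval Inconsistency).
    cross = [(c, i) for c in cycles for i in intervals]
    push = (
        sum(max(0.0, _cos(c, i) - margin) for c, i in cross) / len(cross)
        if cross
        else 0.0
    )
    return pull, push
```

Under these assumptions, the pull term falls to zero when all cycle features point in the same direction, and the push term falls to zero once every cycle/interval pair has cosine similarity at or below `margin`. A training setup would typically add a weighted sum of the two terms to the counting objective; the weighting is likewise not specified in the abstract.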
