A BTM‐Based Adaptive Objectionable Short Text Filtering Framework
Article 2022 en
Authors
Qiaoyan Wen
Hua Zhang
Wenmin Li
Abstract
Existing methods for filtering objectionable text include URL-based filtering, keyword-based filtering, and intelligence-based analysis filtering. URL-based filtering cannot inspect the content of objectionable short text, keyword-based filtering tends to overblock, and intelligence-based analysis filtering is inefficient and ineffective on short text. This paper proposes an adaptive filtering framework for objectionable short text based on the biterm topic model (BTM). We propose a feature extraction algorithm for objectionable short text and build a sensitive-word feature dataset from the descriptions of applications on the Internet. We then construct a judgment standard that automatically selects the number of topics K for the BTM, which makes the framework self-adaptive. The resulting feature dataset effectively reflects the characteristics of objectionable short text, and the proposed framework identifies such text effectively, with a higher filtering rate than other approaches.
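For readers unfamiliar with BTM, the following minimal sketch shows what a biterm is: an unordered pair of words that co-occur in the same short text. It illustrates only this general idea, not the feature extraction algorithm or the K-selection standard of the paper. The function names and the sample token lists are hypothetical, and tokenization is assumed to have been done beforehand.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) in one tokenized short text."""
    # Sorting each pair makes ("a", "b") and ("b", "a") the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def count_biterms(docs):
    """Count biterms over a corpus of tokenized short texts."""
    counts = Counter()
    for tokens in docs:
        counts.update(extract_biterms(tokens))
    return counts


# Hypothetical sample data: two tokenized application descriptions.
docs = [["free", "casino", "bonus"], ["casino", "bonus", "app"]]
biterm_counts = count_biterms(docs)
```

BTM fits its topics to corpus-level biterm counts like these rather than to per-document word counts, which is why it copes better with the sparsity of short text.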