Residential College | false |
Status | Published |
Title | A Local-Global Estimator based on Large Kernel CNN and Transformer for Human Pose Estimation and Running Pose Measurement |
Author | Qingtian Wu1; Yongfei Wu2; Yu Zhang1; Liming Zhang1 |
Date Issued | 2022-08 |
Source Publication | IEEE Transactions on Instrumentation and Measurement |
ISSN | 0018-9456 |
Volume | 71
Pages | 1-12
Abstract | Running pose in a crowd can serve as an early warning of most abnormal events (e.g., chasing, fleeing, and robbing), and it can be obtained through human behavior analysis based on human pose measurement. Although deep convolutional neural networks (CNNs) have achieved impressive progress on human pose estimation, how to further improve the trade-off between estimation accuracy and speed remains an open issue. In this work, we first propose an efficient local-global estimator for human pose estimation (called LGPose). Then, based on the keypoints estimated by LGPose, a simple regression model defined on the geometry of the joints achieves fast and accurate running pose measurement. To model the relationships among human keypoints, a vision transformer (ViT) encoder is adopted to learn their long-range interdependencies at the pixel level. However, the transformer encoder operates on sequences, linearly projecting 2D image patches into 1D tokens, which discards important local information. Locality is nevertheless crucial, since it relates to lines, edges, and shapes. To learn locality, we integrate effective CNN modules, in place of the original fully connected network, into the feed-forward module of the ViT. Experiments on the MPII and COCO Keypoint val2017 datasets show that the proposed LGPose achieves the best trade-off among the compared state-of-the-art methods. Moreover, we build a lightweight running movement dataset to verify the effectiveness of LGPose. Based on the human pose estimated by LGPose, we propose a regression model that measures running pose with an accuracy of 86.4% without training any additional classifier. Our source code and running dataset will be made publicly available. |
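To illustrate the local-global idea described in the abstract (global self-attention combined with a convolutional, locality-aware feed-forward module), the following is a minimal PyTorch-style sketch. It is not the paper's actual LGPose architecture; the module names, kernel sizes, and dimensions are illustrative assumptions only.

```python
# Sketch of a transformer encoder block whose feed-forward network is replaced
# by a large-kernel depthwise convolutional block, so each token also aggregates
# information from its spatial neighbours (locality: lines, edges, shapes).
# Hypothetical dimensions and names; not the published LGPose implementation.
import torch
import torch.nn as nn


class ConvFeedForward(nn.Module):
    """Feed-forward module built from convolutions instead of fully connected layers."""

    def __init__(self, dim: int, hidden_dim: int, kernel_size: int = 7):
        super().__init__()
        self.expand = nn.Conv2d(dim, hidden_dim, kernel_size=1)
        # Large-kernel depthwise convolution injects local spatial context.
        self.depthwise = nn.Conv2d(
            hidden_dim, hidden_dim, kernel_size=kernel_size,
            padding=kernel_size // 2, groups=hidden_dim,
        )
        self.act = nn.GELU()
        self.project = nn.Conv2d(hidden_dim, dim, kernel_size=1)

    def forward(self, x: torch.Tensor, h: int, w: int) -> torch.Tensor:
        # x: (batch, h*w, dim) token sequence -> reshape to a 2-D feature map.
        b, n, c = x.shape
        x = x.transpose(1, 2).reshape(b, c, h, w)
        x = self.project(self.act(self.depthwise(self.expand(x))))
        return x.reshape(b, c, n).transpose(1, 2)


class LocalGlobalBlock(nn.Module):
    """Encoder block: global self-attention + local convolutional feed-forward."""

    def __init__(self, dim: int = 256, heads: int = 8, hidden_dim: int = 1024):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = ConvFeedForward(dim, hidden_dim)

    def forward(self, x: torch.Tensor, h: int, w: int) -> torch.Tensor:
        y = self.norm1(x)
        x = x + self.attn(y, y, y, need_weights=False)[0]  # global interactions
        x = x + self.ffn(self.norm2(x), h, w)               # local interactions
        return x


if __name__ == "__main__":
    # Example: a 64x48 feature map (a typical pose-estimation resolution), 256 channels.
    h, w, dim = 64, 48, 256
    tokens = torch.randn(2, h * w, dim)
    block = LocalGlobalBlock(dim=dim)
    print(block(tokens, h, w).shape)  # torch.Size([2, 3072, 256])
```

The keypoint heatmaps predicted from such features could then feed a geometric regression over joint positions for the running pose measurement described above; that downstream model is not sketched here.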
Keyword | Convolutional Neural Networks (CNN) ; Human Pose Estimation (HPE) ; Local-Global Estimator ; Running Pose Measurement ; Vision Transformer (ViT)
DOI | 10.1109/TIM.2022.3200438 |
URL | View the original |
Indexed By | SCIE |
WOS Research Area | Engineering ; Instruments & Instrumentation |
WOS Subject | Engineering, Electrical & Electronic ; Instruments & Instrumentation |
WOS ID | WOS:000852478000012 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141 |
Scopus ID | 2-s2.0-85136857689 |
Document Type | Journal article |
Collection | Faculty of Science and Technology > Department of Computer and Information Science
Corresponding Author | Liming Zhang |
Affiliation | 1. Faculty of Science and Technology, University of Macau, Macau, China 2. College of Data Science, Taiyuan University of Technology, Taiyuan 030024, China
First Author Affiliation | University of Macau
Corresponding Author Affiliation | University of Macau
Recommended Citation GB/T 7714 | Qingtian Wu, Yongfei Wu, Yu Zhang, et al. A Local-Global Estimator based on Large Kernel CNN and Transformer for Human Pose Estimation and Running Pose Measurement[J]. IEEE Transactions on Instrumentation and Measurement, 2022, 71: 1-12.
APA | Qingtian Wu, Yongfei Wu, Yu Zhang, & Liming Zhang (2022). A Local-Global Estimator based on Large Kernel CNN and Transformer for Human Pose Estimation and Running Pose Measurement. IEEE Transactions on Instrumentation and Measurement, 71, 1-12.
MLA | Qingtian Wu, et al. "A Local-Global Estimator based on Large Kernel CNN and Transformer for Human Pose Estimation and Running Pose Measurement". IEEE Transactions on Instrumentation and Measurement 71 (2022): 1-12.
Files in This Item: | There are no files associated with this item. |
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.