Digital twin for human-machine interaction with convolutional neural network
Abstract
Digital twin (DT) technology creates a virtual model of a physical entity so that an intelligent manufacturing system can be analyzed efficiently. Human-machine interaction (HMI) is a typical DT application. Deep learning is employed in the digital twin to realize and strengthen HMI by analyzing both physical and virtual data. Because convolutional neural networks (CNNs) are used for analyzing visual information, two CNN models, the Visual Geometry Group Network (VGG) and the Residual Network (ResNet), are adopted for the HMI task in the DT. This paper proposes modified 3D-VGG and 3D-ResNet models that improve on the existing VGG and ResNet models. The models focus on human information in videos captured by the HMI system's sensors. This information, which can be regarded as digital twin data, includes the action and the position of the human skeleton. The proposed models are also end-to-end. Experiments show that both models perform well on the human motion recognition task and that skeletal data can be generated effectively from video data. With the generated information, the human and the machine can interact effectively, aided by the digital twin data analysis.
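To make the "3D" in 3D-VGG and 3D-ResNet concrete, the following minimal sketch traces how a video clip's (frames, height, width) shape would change through one VGG-style stage and one ResNet-style residual block when the convolutions also slide along the time axis. It uses only the standard convolution output-size formula. The clip size, the 3x3x3 kernels, the padding, and the 2x2x2 pooling are illustrative assumptions, not the configuration of the proposed models, and the helper names are hypothetical.

```python
from typing import Tuple

Shape3D = Tuple[int, int, int]  # (frames, height, width) of one video clip


def conv3d_output_shape(
    in_shape: Shape3D,
    kernel: Shape3D,
    stride: Shape3D = (1, 1, 1),
    padding: Shape3D = (0, 0, 0),
) -> Shape3D:
    """Standard output-size formula, applied per axis:
    out = floor((n + 2 * p - k) / s) + 1
    """
    frames, height, width = (
        (n + 2 * p - k) // s + 1
        for n, k, s, p in zip(in_shape, kernel, stride, padding)
    )
    return (frames, height, width)


def vgg_stage_shape(in_shape: Shape3D) -> Shape3D:
    """Assumed VGG-style stage: a padded 3x3x3 convolution, then 2x2x2 pooling."""
    out = conv3d_output_shape(in_shape, (3, 3, 3), padding=(1, 1, 1))
    # Pooling windows follow the same size formula as convolutions.
    return conv3d_output_shape(out, (2, 2, 2), stride=(2, 2, 2))


def residual_block_shape(in_shape: Shape3D) -> Shape3D:
    """Assumed ResNet-style block: two padded 3x3x3 convolutions.

    The shape must be preserved so that the block input can be added
    back to its output through the skip connection.
    """
    out = conv3d_output_shape(in_shape, (3, 3, 3), padding=(1, 1, 1))
    out = conv3d_output_shape(out, (3, 3, 3), padding=(1, 1, 1))
    if out != in_shape:
        raise ValueError("skip connection requires matching shapes")
    return out


if __name__ == "__main__":
    clip = (16, 112, 112)  # hypothetical clip: 16 frames of 112x112 pixels
    print("VGG-style stage:", vgg_stage_shape(clip))
    print("Residual block: ", residual_block_shape(clip))
```

Under these assumptions, the VGG-style stage halves the temporal and spatial extent of the clip, while the residual block keeps the shape unchanged so the identity shortcut remains valid.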