Abstract
Multi-modal vibration analysis is a valuable technique for detecting faults in mechanical systems. It monitors a system’s health through several modalities, such as heat signatures and acoustic signals: heat signature analysis measures the temperature distribution of the system, while acoustic signal analysis measures the sound it produces. Combined, the two modalities give a more comprehensive view of the system’s condition and support more accurate and reliable fault diagnosis. To identify faults in motors, a cross-attention-based transformer model is proposed. The model applies transformers, which have been highly successful in natural language processing tasks, to multi-modal data. The cross-attention mechanism lets the model learn correlations between the modalities and so integrate their information more effectively. The proposed solution works in two stages. In the first stage, the transformer model is trained on a dataset of normal and faulty motors, with both heat signatures and acoustic signals as input, and learns to distinguish normal from faulty states using the cross-modal information. In the second stage, the trained model classifies new data samples as normal or faulty. Evaluated on a motor fault diagnosis dataset, the proposed solution showed a significant improvement in classification accuracy over traditional methods. Combining multi-modal data with a cross-attention-based transformer thus yielded a more comprehensive and accurate diagnosis of motor faults.
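To make the cross-attention idea concrete, the following is a minimal sketch of scaled dot-product cross-attention, in which tokens from one modality act as queries over tokens from the other. It is an illustration under stated assumptions, not the model evaluated in this work: the names `cross_attention`, `acoustic_tokens`, and `thermal_tokens`, the embedding size of 8, and the random toy inputs are all hypothetical, and the learned query, key, and value projections of a full transformer layer are omitted for brevity.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention across two modalities.

    queries: n_q vectors of length d (one modality)
    keys:    n_k vectors of length d (the other modality)
    values:  n_k vectors of length d_v (the other modality)
    Returns (outputs, weights): n_q fused vectors and an n_q x n_k weight matrix.
    """
    d = len(queries[0])
    d_v = len(values[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        w = softmax(scores)
        # Weighted sum of the value vectors from the other modality.
        fused = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(d_v)]
        outputs.append(fused)
        weights.append(w)
    return outputs, weights


# Hypothetical toy inputs: 3 acoustic frames and 4 thermal patches,
# each assumed to be already embedded into 8 dimensions.
rng = random.Random(0)
acoustic_tokens = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(3)]
thermal_tokens = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(4)]

# Acoustic tokens query the thermal tokens.
fused, attn = cross_attention(acoustic_tokens, thermal_tokens, thermal_tokens)
print(len(fused), len(fused[0]))
```

Each row of `attn` indicates how strongly one acoustic frame attends to each thermal patch, which is one way the correlations between modalities could be represented. In a classifier, the fused vectors would typically be pooled and passed to a normal/faulty output layer; that step is left out of this sketch.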