As a data augmentation method, masking word is commonly used in many natural language processing tasks. However, most mask methods are based on rules and are not related to downstream tasks. In this paper, we propose a novel masking word generator, named Actor-Critic Mask Model (ACMM), which can adaptively adjust the mask strategy according to the performance of downstream tasks. In order to demonstrate the effectiveness of the method, we conducted experiments on two causal event extraction datasets. Experiment results show that, compared with various rule-based masking methods, the masked sentences generated by our proposed method can significantly enhance the generalization of the model and improve the model performance.