This paper investigates damage detection and identification for a seven-storey steel structure using vibration signals and deep learning techniques. Vibration characteristics, such as natural frequencies and mode shapes, are used as input to a deep learning network, and the output vector represents the structural damage at the associated locations. A deep auto-encoder with a sparsity constraint extracts features from each type of signal, and another deep auto-encoder learns the relationship between the different signals for the final regression. The existing SAF model, from a recent study of the same problem, processes all signals in a single serial auto-encoder model. Such models face two difficulties: (1) the natural frequencies and mode shapes differ in magnitude scale, so normalizing them to a common scale when building the model from training samples is not well justified; (2) some frequencies and mode shapes may be unrelated to each other, so reducing their dimensionality jointly is not appropriate. To address these problems for multi-scale datasets in structural health monitoring (SHM), we propose a novel parallel auto-encoder framework (Para-AF). It processes the frequency signals and mode shapes separately for feature selection via dimension reduction and then combines these features in a relationship-learning stage for regression. Furthermore, we introduce a sparsity constraint in the model reduction stage to improve performance. Two experiments are conducted to evaluate performance, and the results show significant advantages of the proposed model over existing approaches.
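The parallel arrangement described above can be illustrated with a minimal NumPy sketch. It is not the implementation used in this work: the array sizes, the single-layer encoders with random untrained weights, the sigmoid output layer standing in for the relationship-learning stage, and the target sparsity `rho` are all assumptions chosen for illustration. The sketch only shows the data flow: each signal type is normalized and encoded in its own branch, a KL-divergence sparsity penalty is computed on the hidden activations, and the two feature sets are concatenated before regression.

```python
import numpy as np

rng = np.random.default_rng(0)

def standardize(x):
    """Scale each column to zero mean and unit variance, separately per branch."""
    return (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-8)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def kl_sparsity(hidden, rho=0.05):
    """KL-divergence sparsity penalty on the mean activation of each hidden unit."""
    rho_hat = np.clip(hidden.mean(axis=0), 1e-6, 1 - 1e-6)
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

class Encoder:
    """One dense sigmoid layer standing in for the encoder half of an auto-encoder.
    Weights are random here; in practice they would be learned."""
    def __init__(self, n_in, n_hidden):
        self.W = rng.normal(scale=0.1, size=(n_in, n_hidden))
        self.b = np.zeros(n_hidden)

    def __call__(self, x):
        return sigmoid(x @ self.W + self.b)

# Hypothetical sizes: 7 frequencies, 49 mode-shape values, 70 damage locations.
n_samples, n_freq, n_mode, n_locations = 32, 7, 49, 70
freqs = rng.uniform(1.0, 50.0, size=(n_samples, n_freq))   # synthetic, larger scale
modes = rng.uniform(-1.0, 1.0, size=(n_samples, n_mode))   # synthetic, unit scale

# Parallel branches: each signal type gets its own normalization and encoder.
freq_enc = Encoder(n_freq, 4)
mode_enc = Encoder(n_mode, 16)
h_freq = freq_enc(standardize(freqs))
h_mode = mode_enc(standardize(modes))

# Sparsity term that would be added to the reconstruction loss during training.
sparsity_loss = kl_sparsity(h_freq) + kl_sparsity(h_mode)

# Combine the reduced features, then map them to the damage vector.
features = np.concatenate([h_freq, h_mode], axis=1)
W_out = rng.normal(scale=0.1, size=(features.shape[1], n_locations))
damage_pred = sigmoid(features @ W_out)
```

Because each branch is standardized on its own, the larger-magnitude frequency values cannot dominate the mode-shape values during dimension reduction, which is the motivation given for processing them separately.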