Buildings along riverbanks are likely to be affected by rising water levels, so acquiring accurate building information is important not only for riverbank environmental protection but also for responding to emergencies such as flooding. Compared with satellite images, UAV-based photographs are flexible and cloud-free and can provide very high-resolution images, down to the centimeter level. However, quickly and accurately detecting and extracting buildings from UAV images remains a major challenge, because such images usually contain too many details and distortions. In this paper, a deep learning (DL)-based approach is proposed for extracting building information more accurately. The network architecture SegNet is used for semantic segmentation after being trained on a completely labeled UAV image dataset that covers multi-dimensional urban settlement appearances along a riverbank area in Chongqing. The experimental results show excellent performance in detecting buildings at locations not used in training, with an average overall accuracy of more than 90%. To verify the generality and advantages of the proposed method, the procedure is further evaluated by training and testing on two other open standard datasets that contain a variety of building patterns and styles; the final overall accuracies of building extraction are more than 93% and 95%, respectively.
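For readers unfamiliar with the metric, overall accuracy in binary building segmentation is conventionally the fraction of pixels whose predicted label (building or background) matches the reference label. The following is a minimal sketch of that standard definition only; the function name `overall_accuracy` and the toy masks are illustrative assumptions and do not reproduce the evaluation code or data used in this work.

```python
def overall_accuracy(pred, truth):
    """Return the fraction of pixels where the predicted label equals the reference label.

    Both arguments are 2-D lists of the same shape holding 1 (building) or 0 (background).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of rows")
    correct = 0
    total = 0
    for pred_row, truth_row in zip(pred, truth):
        if len(pred_row) != len(truth_row):
            raise ValueError("mask rows must have the same length")
        for p, t in zip(pred_row, truth_row):
            correct += int(p == t)  # count matching pixels
            total += 1
    if total == 0:
        raise ValueError("masks must contain at least one pixel")
    return correct / total


# Hypothetical 3x4 masks for illustration (not taken from any dataset in the paper).
pred = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
]
truth = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]

# Two of the twelve pixels disagree, so ten are counted as correct.
print(f"overall accuracy: {overall_accuracy(pred, truth):.3f}")
```

Because overall accuracy counts background pixels as well as building pixels, it can look high on scenes where buildings occupy a small share of the image, which is worth keeping in mind when comparing figures across datasets.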