Detailed land use and land cover (LULC) information is important for land use surveys and for applications in the earth sciences. LULC classification using very-high resolution remotely sensed imagery has therefore become an active research topic in the remote sensing community. However, successfully extracting LULC information from such imagery remains a challenge, because the individual characteristics of the various LULC categories are difficult to describe using single-level features. Traditional pixel-wise or spectral-spatial methods focus on low-level feature representations of the target LULC categories. Deep convolutional neural networks, in contrast, offer great potential for extracting high-level features that describe objects, and have been successfully applied to scene understanding and classification. However, existing studies have paid little attention to constructing multi-level feature representations that better characterize each category. In this paper, a multi-level feature representation framework is first designed to extract more robust feature representations for the complex LULC classification task using very-high resolution remotely sensed imagery. To this end, spectral reflectance, morphological profiles, and morphological attribute profiles are used to describe the pixel-level and neighborhood-level information. Furthermore, a novel object-based convolutional neural network (CNN) is proposed to extract scene-level information. The object-based CNN combines the advantages of object-based methods and CNNs and can perform multi-scale analysis at the scene level. Finally, the random forest method is employed to carry out the final classification using the multi-level features. 
The proposed method was validated on three challenging remotely sensed images, comprising one hyperspectral image and two multispectral images with very-high spatial resolution, and achieved excellent classification performance.
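
To make the idea of stacking pixel-level and neighborhood-level features concrete, the following is a minimal, dependency-free sketch and not the implementation used in this work. It assumes a single-band toy image, square flat structuring elements, and plain opening and closing rather than the reconstruction-based operators that morphological profiles commonly use. The function names (`opening`, `closing`, `multilevel_features`) and the window sizes are hypothetical. The scene-level CNN features and the random forest classifier are not shown.

```python
def _window_reduce(image, size, reduce_fn):
    """Apply reduce_fn (min or max) over a size x size window clipped to the image."""
    rows, cols = len(image), len(image[0])
    r = size // 2
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            window = [
                image[y][x]
                for y in range(max(0, i - r), min(rows, i + r + 1))
                for x in range(max(0, j - r), min(cols, j + r + 1))
            ]
            row.append(reduce_fn(window))
        out.append(row)
    return out


def opening(image, size):
    # Erosion (min) then dilation (max): removes bright structures smaller than the window.
    return _window_reduce(_window_reduce(image, size, min), size, max)


def closing(image, size):
    # Dilation (max) then erosion (min): removes dark structures smaller than the window.
    return _window_reduce(_window_reduce(image, size, max), size, min)


def multilevel_features(band, sizes=(3, 5)):
    """Per pixel, return [value, opening_s1, closing_s1, opening_s2, closing_s2, ...]."""
    layers = [band]  # pixel-level information: the raw band value
    for s in sizes:  # neighborhood-level information at increasing scales
        layers.append(opening(band, s))
        layers.append(closing(band, s))
    rows, cols = len(band), len(band[0])
    return [[[layer[i][j] for layer in layers] for j in range(cols)] for i in range(rows)]


# Toy 7x7 single-band image with one isolated bright pixel (hypothetical data).
band = [[1] * 7 for _ in range(7)]
band[3][3] = 9
features = multilevel_features(band)
print(features[3][3])  # [9, 1, 9, 1, 9]
```

In this toy case the openings suppress the isolated bright pixel while the closings preserve it, so the stacked vector carries information about the pixel's neighborhood that the raw value alone does not. In a full pipeline, vectors of this kind would be concatenated with scene-level features before being passed to a classifier.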