This paper presents a visual-based approach that allows an Unmanned Aerial Vehicle (UAV) to detect and track a cooperative flying vehicle autonomously using a monocular camera. The algorithms are based on template matching and morphological filtering, thus being able to operate within a wide range of relative distances (i.e., from a few meters up to several tens of meters), while ensuring robustness against variations of illumination conditions, target scale and background. Furthermore, the image processing chain takes full advantage of navigation hints (i.e., relative positioning and own-ship attitude estimates) to improve the computational efficiency and optimize the trade-off between correct detections, false alarms and missed detections. Clearly, the required exchange of information is enabled by the cooperative nature of the formation through a reliable inter-vehicle data-link. Performance assessment is carried out by exploiting flight data collected during an ad hoc experimental campaign. The proposed approach is a key building block of cooperative architectures designed to improve UAV navigation performance either under nominal GNSS coverage or in GNSS-challenging environments.