Forest structure is a crucial component in assessing whether a forest is likely to act as a carbon sink under a changing climate. Detailed 3D structural information about the tundra–taiga ecotone of Siberia is largely missing and underrepresented in current research because the region is remote and difficult to access. Field-based, high-resolution remote sensing can provide important knowledge for understanding vegetation properties and dynamics. In this study, we test the applicability of consumer-grade unmanned aerial vehicles (UAVs) for the rapid calculation of stand metrics in treeline forests. We reconstructed high-resolution photogrammetric point clouds and derived canopy height models for 10 study sites in NE Chukotka and SW Yakutia. Subsequently, we detected individual tree tops using a variable-window-size local maximum filter and applied a marker-controlled watershed segmentation to delineate tree crowns. With this approach, we successfully detected 67.1% of the validation individuals. Simple linear regressions of observed against detected metrics show a higher coefficient of determination (R²) and a lower relative root mean square error (RMSE%) for tree heights (mean R² = 0.77, mean RMSE% = 18.46%) than for crown diameters (mean R² = 0.46, mean RMSE% = 24.9%). The comparison between detected and observed tree height distributions revealed that our tree detection method was unable to representatively identify trees <2 m. Our results show that plot sizes for vegetation surveys in the tundra–taiga ecotone should be adapted to the forest structure and have a radius of >15–20 m to capture homogeneous and representative forest stands. Additionally, we identify sources of omission and commission errors and give recommendations for their mitigation. In conclusion, the efficiency of the method depends on the complexity of the forest's stand structure.
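
The tree-top detection step and the RMSE% metric mentioned above can be illustrated with a minimal Python sketch. It is not the implementation used in the study. The linear height-to-window relation in `window_radius`, the 2 m height cut-off, the toy canopy height model `chm`, and the definition of RMSE% as RMSE divided by the mean observed value are all assumptions made for illustration. The crown delineation by marker-controlled watershed segmentation is not covered here.

```python
import math


def window_radius(height_m):
    # Hypothetical linear relation between tree height and search radius (metres).
    # The coefficients are illustrative only, not taken from the study.
    return 0.5 + 0.1 * height_m


def detect_tree_tops(chm, cell_size=1.0, min_height=2.0, radius_fn=window_radius):
    """Return (row, col, height) for every cell that is the highest cell
    inside a circular window whose radius depends on the cell's own height."""
    n_rows, n_cols = len(chm), len(chm[0])
    tops = []
    for r in range(n_rows):
        for c in range(n_cols):
            h = chm[r][c]
            if h < min_height:  # assumed cut-off for low vegetation
                continue
            radius_cells = max(1, math.ceil(radius_fn(h) / cell_size))
            is_top = True
            for dr in range(-radius_cells, radius_cells + 1):
                for dc in range(-radius_cells, radius_cells + 1):
                    if dr == 0 and dc == 0:
                        continue
                    if dr * dr + dc * dc > radius_cells ** 2:
                        continue  # outside the circular window
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        other = chm[rr][cc]
                        # Ties go to the earlier cell in row-major order.
                        if other > h or (other == h and (rr, cc) < (r, c)):
                            is_top = False
            if is_top:
                tops.append((r, c, h))
    return tops


def relative_rmse(observed, detected):
    """RMSE as a percentage of the mean observed value (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    mse = sum((o - d) ** 2 for o, d in zip(observed, detected)) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_obs


# Toy 5 x 5 canopy height model (metres) with two trees, 1 m cell size.
chm = [
    [0.0, 0.5, 0.0, 0.0, 0.0],
    [0.5, 3.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 6.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]

tops = detect_tree_tops(chm)
print(tops)  # [(1, 1, 3.0), (3, 3, 6.0)]
print(relative_rmse([10.0, 10.0], [8.0, 12.0]))  # 20.0
```

Because the window grows with height, the 6 m tree is compared against a wider neighbourhood than the 3 m tree. This is the general idea behind a variable-window filter: tall trees with wide crowns are less likely to yield several false tops, while small trees standing close together can still be separated.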