Unmanned aerial vehicles (UAVs) are now frequently used for periodic visual inspection of building envelopes to detect unsafe conditions or damage. Inspection practitioners must manually examine the large numbers of high-resolution images collected by UAVs to identify anomalies or damage on building façades for reporting and repair. Computer vision and deep learning technologies have emerged as promising solutions for automating this image-based inspection process. However, for the detection of façade cracks in UAV-captured images, existing deep learning solutions may not perform well because of the complex background noise introduced by different façade components and materials. To that end, this paper proposes a two-step deep learning method for the automated detection of façade cracks in UAV-captured images. In the first step, a convolutional neural network (CNN) model was designed and trained on 26,177 images to classify 128 × 128 pixel image patches as crack or non-crack. In the second step, a U-Net neural network model was trained on 2,870 image sets to segment crack pixels within the patches classified as crack. Experimental results show that the CNN model and the U-Net model achieved precision of 94% and 96%, recall of 94% and 95%, and F1-scores of 94% and 96%, respectively. These results demonstrate that the two-step method can improve the reliability and efficiency of detecting façade cracks and differentiating them from complex façade noise. The proposed method can also be extended to detect other types of façade anomalies (e.g., corrosion and joint failures), thus facilitating a comprehensive assessment of façade conditions and better decision-making for the maintenance of building façades throughout their service life.
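The classify-then-segment flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function `two_step_detect`, the non-overlapping tiling, the skipping of partial border tiles, and the threshold-based stand-ins `toy_classifier` and `toy_segmenter` are all assumptions made for illustration. In practice the two callables would wrap the trained CNN and U-Net models. Only the 128 × 128 patch size is taken from the text.

```python
from typing import Callable, List

Patch = List[List[float]]  # 2-D grayscale patch, row-major, values in [0, 1]
Mask = List[List[int]]     # 2-D binary mask, 1 = crack pixel

PATCH = 128  # patch side length in pixels, as stated in the abstract


def two_step_detect(
    image: List[List[float]],
    is_crack_patch: Callable[[Patch], bool],
    segment_patch: Callable[[Patch], Mask],
    patch: int = PATCH,
) -> Mask:
    """Return a full-image crack mask using classify-then-segment."""
    height, width = len(image), len(image[0])
    mask = [[0] * width for _ in range(height)]
    # Walk non-overlapping tiles; partial border tiles are skipped (assumption).
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            tile = [row[left:left + patch] for row in image[top:top + patch]]
            # Step 1: patch-level classification filters out background tiles.
            if not is_crack_patch(tile):
                continue
            # Step 2: pixel-level segmentation runs only on crack tiles.
            tile_mask = segment_patch(tile)
            for dy, mask_row in enumerate(tile_mask):
                mask[top + dy][left:left + patch] = mask_row
    return mask


# Hypothetical stand-ins for the trained models: a pixel darker than 0.2
# is treated as "crack". Real models would replace both functions.
def toy_classifier(tile: Patch) -> bool:
    return any(value < 0.2 for row in tile for value in row)


def toy_segmenter(tile: Patch) -> Mask:
    return [[1 if value < 0.2 else 0 for value in row] for row in tile]


# Synthetic 256 x 256 image: bright facade with one dark horizontal streak.
image = [[1.0] * 256 for _ in range(256)]
for x in range(20, 100):
    image[10][x] = 0.0

mask = two_step_detect(image, toy_classifier, toy_segmenter)
crack_pixels = sum(map(sum, mask))
print(crack_pixels)
```

In this arrangement the segmenter runs only on tiles that the classifier flags, so the three background tiles of the synthetic image are never segmented. That is the source of the efficiency argument in the text.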