Breakthroughs in deep learning have stimulated new research trends in civil infrastructure inspection, but progress in deep learning-based defect inspection is hindered by the lack of quality-guaranteed, human-annotated, free-of-charge, and publicly available defect datasets of sufficient size. To boost research in deep learning-based visual defect inspection, this paper first reviews and summarizes 40 publicly available defect datasets covering common defects in various types of buildings and infrastructure. A taxonomy of the datasets is proposed based on the deep learning objective (classification, segmentation, or detection). Each dataset is also characterized by its data volume, data resolution, data source, defect categories covered, infrastructure types addressed, material types targeted, algorithms adopted for validation, annotation level, context level, and publication license for future use. In total, the summarized defect datasets offer around 13.38M labeled images and cover more than 5 defect types, 5 infrastructure types, 5 material types, and 3 levels of image context. Because cracks are of common interest in civil engineering, this paper further combines existing datasets with self-labeled crack images to establish a benchmark dataset providing more than 15,000 and 11,000 labeled images for crack classification and segmentation, respectively. Based on the established crack dataset, experiments are conducted for classification, segmentation, and the subsequent non-maximum suppression-based detection tasks. The proposed multi-branch self-attention module and multi-stage-fused attentional pyramid network have been successfully integrated into a state-of-the-art (SOTA) classification network, Swin Transformer, and into segmentation networks including DeepLab V3+, DenseNet, and Full Resolution ResNet.
The resulting classification network achieves 88.0% accuracy, and the adapted segmentation models reach 77.8%, 77.6%, and 76.9% mIoU (mean Intersection over Union), respectively. Moreover, a comprehensive comparison of 11 SOTA classification algorithms and 12 SOTA segmentation algorithms has been conducted. The algorithms proposed in this work are shown to achieve satisfactory performance with acceptable efficiency on modern graphics processing units. Detailed suggestions are provided for constructing high-quality datasets and inspection algorithms. Finally, this paper remarks on the quantity, diversity, difficulty, and scalability of the reviewed defect datasets, the feasibility on robotic platforms, the superiority of the proposed algorithms, and the criticality of the algorithm comparison results, establishing a solid baseline for future defect inspection research.
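For readers unfamiliar with the mIoU metric reported above, the following is a minimal sketch of how mean Intersection over Union can be computed for flattened label maps. It is an illustration only: the function name `mean_iou`, the choice to skip classes absent from both prediction and ground truth, and the toy inputs are assumptions, and the evaluation protocol actually used in this paper (for example, per-image versus dataset-level averaging) may differ.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over classes for two flattened label maps of equal length.

    Assumption: classes that appear in neither map are skipped rather than
    counted as IoU 0 or 1.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    ious = []
    for c in range(num_classes):
        # Pixels labeled c in both maps.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labeled c in at least one map.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)

    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


# Toy example: 0 = background, 1 = crack (hypothetical labels).
pred = [0, 0, 1, 1, 1, 0]
target = [0, 1, 1, 1, 0, 0]
score = mean_iou(pred, target, num_classes=2)
```

In this toy case each class has an intersection of 2 pixels and a union of 4, so both per-class IoUs are 0.5 and their mean is 0.5.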