

Citation: Hong, Q.; Jiang, L.; Zhang, Z.; Ji, S.; Gu, C.; Mao, W.; Li, W.; Liu, T.; Li, B.; Tan, C. A Lightweight Model for Wheat Ear Fusarium Head Blight Detection Based on RGB Images. Remote Sens. 2022, 14, 3481. https://doi.org/10.3390/rs14143481

Academic Editor: Francisco Javier Mesas Carrascosa

Received: 12 June 2022; Accepted: 18 July 2022; Published: 20 July 2022



A Lightweight Model for Wheat Ear Fusarium Head Blight Detection Based on RGB Images

Qingqing Hong 1, Ling Jiang 1, Zhenghua Zhang 1, Shu Ji 1, Chen Gu 1, Wei Mao 2, Wenxi Li 2, Tao Liu 1, Bin Li 1 and Changwei Tan 1,*

1 Jiangsu Key Laboratory of Crop Genetics and Physiology, Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education of China, Jiangsu Province Engineering Research Center of Knowledge Management and Intelligent Service, College of Information Engineering, Yangzhou University, Yangzhou 225009, China; [email protected] (Q.H.); [email protected] (L.J.); [email protected] (Z.Z.); [email protected] (S.J.); [email protected] (C.G.); [email protected] (T.L.); [email protected] (B.L.)

2 Station of Land Protection of Yangzhou City, Yangzhou 225009, China; [email protected] (W.M.); [email protected] (W.L.)

* Correspondence: [email protected]

Abstract: Detection of Fusarium head blight (FHB) is crucial for protecting wheat yield: precise and rapid FHB detection increases wheat yield and protects the agricultural ecological environment. FHB detection tasks in agricultural production are currently handled by cloud servers and utilize unmanned aerial vehicles (UAVs). Hence, this paper proposed a lightweight model for wheat ear FHB detection based on UAV-enabled edge computing, aiming to achieve intelligent prevention and control of agricultural disease. Our model combined the You Only Look Once version 4 (YOLOv4) and MobileNet deep learning architectures, was applicable to edge devices, and balanced accuracy with real-time FHB detection. Specifically, the backbone network Cross Stage Partial Darknet53 (CSPDarknet53) of YOLOv4 was replaced by a lightweight network, significantly decreasing the network parameters and the computational complexity. Additionally, we employed the Complete Intersection over Union (CIoU) loss and Non-Maximum Suppression (NMS) for bounding-box regression to guarantee the detection accuracy of FHB. Furthermore, the loss function incorporated the focal loss to reduce the error caused by the unbalanced distribution of positive and negative samples. Finally, mix-up and transfer learning schemes enhanced the model's generalization ability. The experimental results demonstrated that the proposed model performed admirably well in detecting FHB of the wheat ear, with an accuracy of 93.69%, and it was somewhat better than the MobileNetv2-YOLOv4 model (F1 by 4%, AP by 3.5%, Recall by 4.1%, and Precision by 1.6%). Meanwhile, the suggested model was scaled down to a fifth of the size of the state-of-the-art object detection models. Overall, the proposed model could be deployed on UAVs so that wheat ear FHB detection results could be sent back to the end-users for timely, intelligent decisions, promoting the intelligent control of agricultural disease.

Keywords: fusarium head blight; unmanned aerial vehicles; lightweight model; YOLOv4

1. Introduction

Wheat is one of the world's most important staple foods, providing both calories and nutrients, and wheat-based foods are essential for the food security of the global population. Wheat proteins contain sufficient essential amino acids to meet daily requirements [1], while wheat grains provide a variety of vitamins, minerals, and other phytochemicals, significantly impacting human health [2]. In many parts of the world, Fusarium head blight (FHB), often known as head scab, is one of the most severe and widespread wheat diseases, caused by Fusarium graminearum and Fusarium asiaticum [3].


FHB mainly occurs in warm and humid environments. Under normal circumstances, there is little or no FHB infection below 15 °C, and the likelihood of wheat becoming infected with FHB increases as the temperature rises from 20 °C to 30 °C [4]. When FHB infects wheat kernels, diseased spikes reveal premature bleaching and produce shriveled kernels. Meanwhile, several Fusarium toxins, such as deoxynivalenol, are secreted. Processed products made from wheat kernels carrying these toxins can lead to food poisoning and a gradual decline of immunity and physical health [5]. FHB is also one of the most devastating cereal diseases, causing significant yield and quality reduction [6].

Scientific research suggests a substantial risk of wheat disease under future climatic conditions. Climate change can greatly influence temperature and moisture by causing excessive rainfall or drought, which are the most important environmental factors influencing the development of FHB [7]. Consequently, an accurate and rapid approach to detecting FHB is particularly critical for disease control and for screening FHB-resistant cultivars.

Visual inspection, chromatography, and enzyme-linked immunoassay (ELISA) are the conventional methods for detecting FHB [8]. Visual inspection is convenient but highly subjective, relying on morphological characteristics and experience and requiring considerable human, material, and financial resources. Chromatography and ELISA are highly accurate, sensitive, and dependable, but they typically require multiple processing steps, expensive equipment, and a long turnaround time. Furthermore, the conventional approaches are ineffective for detecting disease at large scale. Many studies have therefore sought a quick and accurate way to detect FHB. For instance, UAV-based wheat disease detection has gained considerable attention for its safety and convenience [9]. Liu et al. [10] proposed a method combining unmanned aerial vehicle (UAV) technology and multispectral technology to detect the severity of wheat FHB with 98% accuracy. When employing UAVs, the wheat images are captured using an onboard camera. The primary applications of UAVs in agriculture include mapping, spraying, pest detection, and other tasks. Rapid and accurate localization of the disease type and diseased area is crucial for disease prevention and control, achieving high accuracy in assessing disease severity and enabling intelligent prevention.

The development of deep learning has already been extended to FHB detection. Deep learning-based detection approaches have demonstrated distinct improvements in detection efficiency, accuracy, and application scenarios [11,12]. Classical object detection algorithms were based on hand-engineered features and suitable classifiers for object classification [13]. Current vision-based research centers on general object detection algorithms. Gu et al. [14] proposed an FHB severity evaluation method based on the Relief-F algorithm, and the accuracy for the severity of a single wheat ear was 94%. Su et al. [15] also proposed an FHB severity evaluation method based on the Mask-RCNN network, and the evaluation accuracy on wheat ears was 77%. Although deep convolutional neural networks have richer network structures and offer superior feature learning advantages, the problem with deeper networks is that their many parameters slow down computation, as edge devices have low computational capability [16]. Therefore, for effective wheat ear FHB detection, reducing the parameters of deep learning algorithms deployed on edge devices is critical.

To address the abovementioned issues, this paper proposed a lightweight wheat ear FHB detection model relying on YOLOv4 and MobileNet. The research items for this study were wheat images captured in the field setting. The feature extraction network CSPDarknet53 in YOLOv4 was replaced with the lightweight network MobileNet, which had significantly fewer parameters. Furthermore, the error caused by the imbalanced distribution of positive and negative samples was addressed by improving the loss function of YOLOv4 and introducing the focal loss into the loss function.


2. Related Work

2.1. Crop Detection Based on Deep Learning

In recent decades, crop disease identification primarily relied on classification techniques such as the K-Nearest Neighbor (KNN) [17], Support Vector Machine (SVM) [18], Fisher linear discriminant (FLD) [19], and Random Forest (RF) [20]. Nevertheless, the characteristics of lesion segmentation and the manual design of various algorithms significantly impact disease detection in conventional techniques. Specifically, handcrafted features require significant effort and expert knowledge, both of which are subjective. Deep learning technology has recently become popular for crop pest and disease detection [21]. For instance, Picon et al. [22] presented a challenging dataset with seventeen diseases and five crops (wheat, barley, corn, rice, and rapeseed). They utilized three alternative CNN architectures that integrated contextual meta-data into image-based convolutional neural networks to ease disease classification. The results demonstrated that, for each crop, a single model incorporating large numbers of crops and diseases performed better than an isolated classification model. Espejo-Garcia et al. [23] developed a new crop/weed identification system that combined fine-tuned pre-trained neural networks with classic machine learning classifiers to prevent overfitting and achieve robust performance. Chen et al. [24] employed transfer learning on deep convolutional neural networks to quickly identify and diagnose disease and merged VGGNet with an Inception module. This architecture attained more than 91.83% accuracy on a public dataset.

Spectroscopy and imaging techniques, such as hyperspectral imaging (HSI), have been increasingly popular in disease detection and monitoring [25] and have already been integrated with deep learning schemes to enhance automatic disease detection. For instance, Li et al. [26] designed a hyperspectral imaging reconstruction model based on DCNN to improve spatial features and found that a DCNN-based framework could produce acceptable results. For the early diagnosis of FHB, Jin et al. [27] applied a deep neural network to categorize healthy and diseased wheat heads from the pixels of hyperspectral images.

2.2. Lightweight Model

Due to its ability to extract features, deep learning technology, and particularly the Convolutional Neural Network, is the favored strategy for detecting crop disease. Although deep learning achieves good detection performance, it involves many parameters and a slow reasoning speed. Nevertheless, farmlands are typically located in regions with limited resources, and deploying these models in rural districts is limited by network bandwidth and delay. Furthermore, deep convolutional neural networks could not be effectively employed on edge devices. As a consequence, compressing neural networks effectively is crucial, with current strategies including pruning [28], knowledge distillation [29], quantization [30], and developing lightweight models [31,32].

Employing modified deep learning architectures to detect crop disease has become mainstream. Ale et al. [33] proposed a lightweight model and used transfer learning to strengthen the model's generalization performance on a plant disease dataset, achieving an accuracy of 89.70%. In order to improve the model's performance under limited hardware conditions, Wang [34] altered the residual connection mode and the convolution method, considerably decreasing the model's parameters and memory space. Toda and Okura [35] designed a CNN based on the Inceptionv3 architecture. By visualizing the output of the intermediate layer, they reduced the network parameters by 75%. Wang et al. [16] introduced a DNN-based compression strategy to reduce the computational burden, compressing the model to 0.04 MB.

Although decreasing the network parameters has already been achieved, the corresponding accuracy was relatively low. Thus, some researchers suggested adding an attention mechanism to improve accuracy, with Tang et al. [36] improving the real-time identification performance of the lightweight neural network models ShuffleNet-V1 and ShuffleNet-V2 by including such an attention mechanism. Zhao et al. [37] developed SE-ResNet50 by integrating the squeeze-and-excitation network (SENet), one of the attention mechanisms, into each stage of the ResNet50 architecture. The performance of SE-ResNet50 was then evaluated on tomato and grape disease classification, outperforming several baseline models that did not use the SENet module. Chen [38] also introduced the Location-wise Soft Attention mechanism into the pre-trained lightweight neural network MobileNetv2 to boost the learning ability of minor lesion features, achieving an average accuracy of 99.71% for crop disease detection. According to the current literature, adding an attention mechanism to deep networks improves the detection performance on plant and crop diseases. However, this strategy increases the model's parameters, preventing deployment on edge devices.

2.3. The YOLO Algorithm

In recent years, the YOLO models have been widely utilized for object detection due to their fast detection speed and high accuracy. In 2015, Redmon et al. [39] proposed the YOLO algorithm, where the object detection task was transformed into a regression problem. Extensions of the YOLOv1 model were YOLOv2 [40] and YOLOv3 [41], with YOLOv2 being substantially more accurate than the original model. YOLOv3 improved on the shortcomings of YOLOv2 in small object detection by employing the residual neural network concept [42]. Bochkovskiy [43] proposed YOLOv4, which optimized YOLOv3 using data augmentation, a backbone feature extraction network, loss function, and activation function. YOLOv4 realized the best tradeoff between detection accuracy and speed, with Figure 1 illustrating a performance comparison of YOLOv4 against other state-of-the-art object detectors. On the MS COCO dataset, the Average Precision (AP) and detection speed of YOLOv4 were improved by 10% and 12% over YOLOv3, respectively.

Figure 1. Comparison between YOLOv4 and other state-of-the-art object detectors. (* AP refers to average precision; FPS refers to the number of frames processed per second. YOLOv4 is faster than EfficientDet at the same AP value and improves YOLOv3's AP and FPS by 10% and 12%, respectively.)

The superiority of the YOLO algorithm has extended its applicability in crop detection, making it an effective tool in precision agriculture. In orchards with fluctuating illumination, complicated backgrounds, overlapping apples, and branches and leaves, Tian et al. [44] proposed an improved YOLOv3 model for detecting apples during different growth stages. In this modified YOLOv3 network, the DenseNet method processed the low-resolution feature layers, with the results revealing that the proposed model effectively performed apple detection under overlapping apples and occlusion circumstances. Arunabha et al. [45] suggested Dense-YOLOv4, a real-time object detection framework, which has been applied to detect distinct stages of mango growth. This model's mean average precision (mAP) and F1-score reached 96.20% and 93.61%, respectively, at a detection rate of 44.2 FPS. Guo et al. [46] applied YOLOv3 and YOLOv4 for flower and bud detection. The results showed that the YOLOv4 model achieved the highest AP of 98.6% on bud detection (4.6% higher than the YOLOv3 model) and an mAP of 97.6% on the kiwifruit flower, while bud detection required 38.6 ms per image.

Based on previous research findings and the advantages of the YOLOv4 algorithm, this paper employed the YOLOv4 algorithm for wheat ear FHB detection. Additionally, we replaced the standard convolution in YOLOv4 with point convolution and depth-wise separable convolution to construct a lightweight model while preserving accuracy.

3. Materials and Methods

3.1. Data Collection and Preprocessing

The wheat images used in this paper were captured in the field environment of the Yangzhou experimental base (32°42′N, 119°53′E) of the Institute of Agricultural Sciences in Lixiahe District, Jiangsu Province, China, which is primarily responsible for scientific research on crops such as wheat and rice. We employed a Sony α6300 digital camera with 24.2 million effective pixels, a 3-inch 920,000-pixel LCD screen, a 4K high-definition camera, and an SD/SDHC/SDXC/microSD/microSDHC/microSDXC memory card with strong endurance. All wheat images were captured during the grain-filling stage in natural daylight.

We took 866 wheat images stored in the Joint Photographic Experts Group (JPEG) format. Since some images contained uneven illumination and complex backgrounds, 604 images were selected for the experiments. We employed data enhancement to expand the number of images to 1616. An example of the captured FHB images is illustrated in Figure 2.

Figure 2. Wheat images infected with FHB.

The LabelImg tool was used to annotate the captured images by manually sorting the categories and drawing the bounding boxes. Figure 3 depicts the annotation results on a diseased wheat ear, with Figure 3b presenting the converted gray image in PNG format. We deduce from the annotation that the diseased spikes show early bleaching and produce shriveled grains. A pink mold layer emerges at the base of diseased spikelets, and coal-dust-like black particles appear afterward. After annotation, the final images were transformed into the PASCAL VOC format. After data enhancement, the dataset was then randomly partitioned into training, validation, and test datasets at the ratio of 8:1:1 to ensure uniform distribution.
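As a concrete illustration of the 8:1:1 partition described above, the following minimal Python sketch splits a VOC-style dataset into training, validation, and test lists; the folder layout and file names are assumptions for illustration, not the authors' actual paths.

```python
import random
from pathlib import Path

# Hypothetical layout: every JPEG has a matching PASCAL VOC XML annotation.
DATASET_DIR = Path("data/fhb_voc")                 # assumed folder name
image_ids = sorted(p.stem for p in DATASET_DIR.glob("JPEGImages/*.jpg"))

random.seed(0)                                     # reproducible shuffling
random.shuffle(image_ids)

n = len(image_ids)
n_train, n_val = int(0.8 * n), int(0.1 * n)        # 8:1:1 split
splits = {
    "train": image_ids[:n_train],
    "val":   image_ids[n_train:n_train + n_val],
    "test":  image_ids[n_train + n_val:],
}

for name, ids in splits.items():
    (DATASET_DIR / f"{name}.txt").write_text("\n".join(ids))
    print(name, len(ids))
```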


Figure 3. The annotation results of a diseased wheat ear image. (a) The original image. (b) PNG gray image.

3.2. Lightweight YOLOv4-Based Network Design

3.2.1. YOLOv4 Algorithm

The YOLO algorithm [39], proposed by Redmon et al., was a one-stage network that separates an image into S × S grids. Each grid was in charge of detecting objects whose center points fell within it. Each grid would predict B bounding boxes, their confidence scores, and the conditional class probabilities. The detection pipeline of YOLO is illustrated in Figure 4. The YOLOv4 algorithm, an updated version of YOLOv3, was proposed by Bochkovskiy [43] and comprised the Cross Stage Partial Darknet-53 (CSPDarknet-53) [43], Spatial Pyramid Pooling (SPP) block [47], and Path Aggregation Network (PANet) [48]. The YOLOv4 algorithm optimized the feature extraction backbone network, the neck network for feature fusion, and the prediction head for classification and regression. Based on the original Darknet53 of YOLOv3, the YOLOv4 algorithm utilized a cross-stage partial network (CSP) [49] to build the CSPDarknet53 feature extraction network. The YOLOv4 algorithm incorporated the SPP block to expand the receptive field. Moreover, instead of using FPN for feature fusion, PANet improved the amount of image feature information extracted. The YOLOv4 algorithm employed Mosaic data augmentation, CIoU, learning rate cosine annealing decay, and label smoothing to prevent overfitting.
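To make the grid-based prediction concrete, the short worked example below computes the output tensor sizes of the three YOLOv4 detection heads for a 416 × 416 input, assuming 3 anchors per scale and a single FHB class; the class count is our assumption for illustration, not a figure stated in the text.

```python
# Assumption: 3 anchors per grid cell and 1 class ("diseased wheat ear").
# Each anchor predicts 4 box offsets + 1 objectness score + class scores.
num_anchors, num_classes = 3, 1
channels = num_anchors * (5 + num_classes)      # 3 * 6 = 18

for grid in (13, 26, 52):                       # the three scales for a 416x416 input
    print(f"{grid}x{grid} head -> output tensor {grid}x{grid}x{channels}")
```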

3.2.2. MobileNet

Although the YOLOv4 object detection algorithm had improved accuracy and speed, the CSPDarknet53 network was still a deep network involving many parameters, which, therefore, made it unsuitable for deployment on edge devices with low processing capacity. As a result, studying the lightweight model for its application and development in agriculture is critical. Hence, this study introduced the lightweight neural network (MobileNet) [31] into the YOLOv4 algorithm.

MobileNet was a lightweight network proposed by Google in 2017 for mobile terminals and edge devices. In MobileNet, the standard convolution was replaced with point and depth-wise separable convolutions, dramatically decreasing the model calculation complexity. Moreover, MobileNet replaced the activation function with ReLU6 (Equation (1)), allowing the model to learn sparse features sooner.

$Y = \min(\max(feature, 0), 6)$ (1)

where Y is the output of ReLU6, and feature is the input feature.
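Equation (1) can be checked directly in PyTorch, where the same operation is available as relu6 (equivalently, clamping the input to the range [0, 6]):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-3.0, 0.5, 4.0, 9.0])
print(F.relu6(x))                         # tensor([0.0000, 0.5000, 4.0000, 6.0000])
print(torch.clamp(x, min=0.0, max=6.0))   # identical result: min(max(x, 0), 6)
```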


Figure 4. YOLO detection (the YOLO algorithm models detection as a regression problem. It divides the image into an S × S grid and, for each grid cell, predicts bounding boxes, confidence scores for those boxes, and class probabilities).

Standard convolution (Figure 5a) convolves the input feature maps of every channel with the corresponding convolution kernels and sums the results to output the features. The typical standard convolution operation has a computational cost of:

$n_1 = D_k \cdot D_k \cdot M \cdot N \cdot D_w \cdot D_h$ (2)

where $D_k$ is the kernel size, $M$ and $N$ are the numbers of input and output channels, and $D_w$ and $D_h$ are the width and height of the output feature map.

Figure 5. (a) Standard convolution process and (b) Depth-wise separable convolution process.


The depth-wise separable convolution (Figure 5b) splits the standard convolution's one-step operation into two steps: one depth-wise convolution and one point-by-point convolution. Its computational cost is:

$n_2 = D_k \cdot D_k \cdot M \cdot D_w \cdot D_h + M \cdot N \cdot D_w \cdot D_h$ (3)

After applying the depth-wise separable convolution, the computational cost is reduced to $\frac{n_2}{n_1} = \frac{1}{N} + \frac{1}{D_k^2}$ of the original (roughly 1/8 to 1/9 for the common 3 × 3 kernels), resulting in a significantly smaller model and a faster detection speed.
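The reduction can be reproduced numerically from Equations (2) and (3); the layer dimensions in the sketch below are illustrative and not taken from the paper.

```python
def standard_conv_cost(dk, m, n, dw, dh):
    # Equation (2): Dk * Dk * M * N * Dw * Dh multiply-accumulate operations
    return dk * dk * m * n * dw * dh

def depthwise_separable_cost(dk, m, n, dw, dh):
    # Equation (3): depth-wise part + 1x1 point-wise part
    return dk * dk * m * dw * dh + m * n * dw * dh

# Example: a 3x3 convolution with 256 input / 512 output channels on a 52x52 map
n1 = standard_conv_cost(3, 256, 512, 52, 52)
n2 = depthwise_separable_cost(3, 256, 512, 52, 52)
print(n2 / n1)        # ~0.113, i.e. 1/N + 1/Dk^2
```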

3.2.3. The Proposed Lightweight Model

The backbone structure of the YOLOv4 model was CSPDarknet53. Although the CSPDarknet53 network had an appealing detection accuracy and a low computing cost, it still had many parameters, making it unsuitable to be embedded in edge devices. In order to reduce the parameters of the YOLOv4 model and facilitate embedding in edge equipment, this paper proposed a lightweight neural network model called the MobileNetv3-YOLOv4 network, based on the traditional YOLOv4 model. As can be seen from Figure 6, the structure replaces the YOLOv4 backbone network with MobileNetv3.


Figure 6. The proposed lightweight network structure based on the YOLOv4. (The original CSPDarknet53 backbone, built from CBM units (Conv + BN + Mish) and CSP blocks, is replaced by MobileNetv3, a convolution layer followed by bneck blocks; the backbone feeds the SPP and PANet modules, built from CBL units (Conv + BN + LeakyReLU), with detection outputs at the 19 × 19, 38 × 38, and 76 × 76 scales.)

The SPP module converted feature maps of arbitrary size into fixed-size feature vectors. Figure 7 illustrates the bottleneck (bneck) module of MobileNetv3, and the PANet structure is shown in detail in Figure 8. YOLOv4 aggregated parameters using PANet instead of the Feature Pyramid Network (FPN) for object detection at different levels. PANet's first enhancement improves the utilization of low-level information and increases its transmission efficiency through bottom-up path augmentation. The second innovation is that, when the FPN selects anchors in multiple layers, it allocates them to the corresponding layer based on the anchor size for hierarchical selection; although this is efficient, it cannot utilize all available information simultaneously, so PANet affords adaptive feature pooling to alleviate this problem. The third improvement involves fully connected fusion, which combines convolution-up-sampling and fully connected layers to optimize the mask generation quality.



Figure 7. The structure of bneck. (a) Bottleneck block; (b) Squeeze-and-Excite (SE).

Figure 8. The structure of PANet. (a) FPN backbone, (b) Bottom-up path augmentation, (c) Adaptive feature pooling, (d) Box branch, and (e) Fully-connected fusion.

MobileNetv3 combined the depth-wise separable convolution of MobileNetv1 with the inverted residual linear bottleneck and the lightweight attention model (Squeeze-and-Excitation, SE-Net) of MobileNetv2, and utilized the Network Architecture Search (NAS) to search the network’s configuration and parameters. The 3 × 3 and 1 × 1 convolution layers were discarded to reduce the calculation burden further. Finally, the Hard-swish and ReLU6 activation functions of MobileNetv3 significantly improved network accuracy.
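A minimal PyTorch sketch of the bneck block in Figure 7 is given below: a 1 × 1 expansion, a depth-wise convolution, an optional SE attention module, a 1 × 1 projection, and a residual connection when the shapes match. The expansion widths and activation choices are typical MobileNetv3 settings, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class SqueezeExcite(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        w = self.fc(self.pool(x).flatten(1)).view(x.size(0), -1, 1, 1)
        return x * w                                  # channel-wise re-weighting

class Bneck(nn.Module):
    """Inverted residual bottleneck: expand (1x1) -> depth-wise -> [SE] -> project (1x1)."""
    def __init__(self, c_in, c_exp, c_out, k=3, stride=1, use_se=False):
        super().__init__()
        self.use_res = stride == 1 and c_in == c_out
        self.expand = nn.Sequential(nn.Conv2d(c_in, c_exp, 1, bias=False),
                                    nn.BatchNorm2d(c_exp), nn.Hardswish())
        self.depthwise = nn.Sequential(
            nn.Conv2d(c_exp, c_exp, k, stride, k // 2, groups=c_exp, bias=False),
            nn.BatchNorm2d(c_exp), nn.Hardswish())
        self.se = SqueezeExcite(c_exp) if use_se else nn.Identity()
        self.project = nn.Sequential(nn.Conv2d(c_exp, c_out, 1, bias=False),
                                     nn.BatchNorm2d(c_out))

    def forward(self, x):
        y = self.project(self.se(self.depthwise(self.expand(x))))
        return x + y if self.use_res else y

x = torch.randn(1, 40, 52, 52)
print(Bneck(40, 120, 40, k=5, use_se=True)(x).shape)  # torch.Size([1, 40, 52, 52])
```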

Although the proposed lightweight model replaced the ordinary convolution in the backbone feature extraction network with the depth-wise separable convolution, the neck network and the prediction head still adopted the SPP + PAN + YOLO Head structure. Furthermore, the standard convolution in the enhanced feature extraction network (PANet) was replaced with the depth-wise separable convolution, considerably reducing the model's parameter cardinality. The specific structural parameters are listed in Table 1, with columns 1–3 representing the input of each feature layer, each feature layer's operation, and the number of channels, respectively. Moreover, in the same table, columns 4–7 represent whether the attention module is used in this layer, whether the H-swish activation is adopted, the stride used for each block structure, and the output of each feature layer, respectively. After the preliminary feature extraction by CSPDarknet-53, three preliminary effective feature layers were obtained in the YOLOv4 network structure. When the input image was 416 × 416 × 3, the three preliminary effective feature layers were 52 × 52 × 256, 26 × 26 × 512, and 13 × 13 × 1024, respectively. If the backbone feature extraction network was directly replaced with MobileNetv3, the sizes of the output preliminary effective feature layers were 52 × 52 × 40, 26 × 26 × 112, and 13 × 13 × 160, respectively. Directly training the network would therefore suffer from a channel mismatch error; thus, before training, the number of input channels of the convolution operations must be adjusted.
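The channel mismatch mentioned above can be visualized with a short sketch; the 1 × 1 adapter convolutions shown are one illustrative way to align the MobileNetv3 outputs with the channel counts the YOLOv4 neck expects, not necessarily the authors' solution.

```python
import torch
import torch.nn as nn

# Effective feature maps for a 416x416x3 input (values from the text):
#   CSPDarknet53:  52x52x256, 26x26x512, 13x13x1024
#   MobileNetv3:   52x52x40,  26x26x112, 13x13x160
backbone_channels = [40, 112, 160]
neck_channels = [256, 512, 1024]

# One way to resolve the mismatch: 1x1 convolutions that remap channel counts.
adapters = nn.ModuleList(
    [nn.Conv2d(c_in, c_out, kernel_size=1) for c_in, c_out in zip(backbone_channels, neck_channels)]
)

feats = [torch.randn(1, 40, 52, 52), torch.randn(1, 112, 26, 26), torch.randn(1, 160, 13, 13)]
for f, adapter in zip(feats, adapters):
    print(tuple(adapter(f).shape))   # (1, 256, 52, 52), (1, 512, 26, 26), (1, 1024, 13, 13)
```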


Table 1. MobileNetv3 network structure.

| Input | Operation | Number of Channels | SE | HS | Stride | Output |
|---|---|---|---|---|---|---|
| 416 × 416 × 3 | Conv, 1 × 1 | 16 | - | - | 2 | 208 × 208 × 16 |
| 208 × 208 × 16 | Bneck, 3 × 3 | 16 | - | - | 1 | 208 × 208 × 16 |
| 208 × 208 × 16 | Bneck, 3 × 3 | 24 | - | - | 2 | 104 × 104 × 24 |
| 104 × 104 × 24 | Bneck, 3 × 3 | 24 | - | - | 1 | 104 × 104 × 24 |
| 104 × 104 × 24 | Bneck, 5 × 5 | 40 | √ | - | 2 | 52 × 52 × 40 |
| 52 × 52 × 40 | Bneck, 5 × 5 | 40 | √ | - | 1 | 52 × 52 × 40 |
| 52 × 52 × 40 | Bneck, 5 × 5 | 40 | √ | - | 1 | 52 × 52 × 40 |
| 52 × 52 × 40 | Bneck, 3 × 3 | 80 | - | √ | 2 | 26 × 26 × 80 |
| 26 × 26 × 80 | Bneck, 3 × 3 | 80 | - | √ | 1 | 26 × 26 × 80 |
| 26 × 26 × 80 | Bneck, 3 × 3 | 80 | - | √ | 1 | 26 × 26 × 80 |
| 26 × 26 × 80 | Bneck, 3 × 3 | 80 | - | √ | 1 | 26 × 26 × 80 |
| 26 × 26 × 80 | Bneck, 3 × 3 | 112 | √ | √ | 1 | 26 × 26 × 112 |
| 26 × 26 × 112 | Bneck, 3 × 3 | 112 | √ | √ | 1 | 26 × 26 × 112 |
| 26 × 26 × 112 | Bneck, 5 × 5 | 160 | √ | √ | 2 | 13 × 13 × 160 |
| 13 × 13 × 160 | Bneck, 5 × 5 | 160 | √ | √ | 1 | 13 × 13 × 160 |
| 13 × 13 × 160 | Bneck, 5 × 5 | 160 | √ | √ | 1 | 13 × 13 × 160 |

3.3. Network Training and Evaluation

Deep learning models require large amounts of data for training. Hence, we expanded the wheat images through data augmentation involving image cropping, affine transformation, horizontal flipping, adding noise and blur, and altering the contrast and brightness. Figure 9 illustrates an example of wheat disease samples after image augmentation.
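A torchvision-based sketch of these augmentation types is shown below; the parameter values are placeholders rather than the authors' settings, and in a real detection pipeline the bounding boxes would have to be transformed consistently with the images.

```python
import torch
from torchvision import transforms

# Additive Gaussian noise applied after the image is converted to a tensor.
add_noise = transforms.Lambda(lambda t: (t + 0.02 * torch.randn_like(t)).clamp(0.0, 1.0))

augment = transforms.Compose([
    transforms.RandomResizedCrop(416, scale=(0.8, 1.0)),         # cropping
    transforms.RandomHorizontalFlip(p=0.5),                      # horizontal flipping
    transforms.RandomAffine(degrees=10, translate=(0.1, 0.1)),   # affine transformation
    transforms.ColorJitter(brightness=0.3, contrast=0.3),        # brightness/contrast change
    transforms.GaussianBlur(kernel_size=3),                      # minor blurring
    transforms.ToTensor(),
    add_noise,                                                   # noise
])
```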

The experiment selected 1616 wheat disease images, randomly divided into training, validation, and test datasets at a ratio of 8:1:1. We adopted transfer learning during the deep learning model training, with the pre-training model able to perform freezing and unfreezing training operations. The learning rate is a hyperparameter that significantly impacts the model's performance. In our trials, the initial learning rate was 0.0001 and its momentum was 0.9, the same for both the proposed and the competitor detection algorithms. This study adopted adaptive moment estimation (Adam) to optimize the learning rate automatically. To ensure a minimum validation loss, we saved the trained model at every epoch. When the models completed training, they were validated on the same testing datasets employing the Precision, Recall, and F1 score performance metrics (Equations (4)–(6)) to reveal their FHB detection accuracy. Additionally, we evaluated the models' detection speed by averaging the time required to detect 100 test images.

$Precision = \frac{TP}{TP + FP}$ (4)

$Recall = \frac{TP}{TP + FN}$ (5)

$F1 = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}$ (6)

where TP represents the number of correctly predicted positive samples, FP is the number of negative samples incorrectly predicted as positive, and FN is the number of positive samples incorrectly predicted as negative.
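Equations (4)-(6) translate directly into a small helper function; the counts in the example call are illustrative and are not the paper's results.

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall, and F1 score as defined in Equations (4)-(6)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Illustrative counts only:
print(detection_metrics(tp=63, fp=4, fn=37))
```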

Figure 9. Examples of wheat disease images after image enhancement. (a) Original image, (b) horizontal flipping and cropping, (c) cropping and minor Gaussian blurring, and (d) cropping and contrast and brightness change.

All experiments were implemented on an i7-11700K CPU with an ROG-STRIX-RTX3070-O8G-GAMING GPU and 8 GB of memory, employing PyTorch (torch 1.7.0), CUDA Toolkit 11.0, cuDNN v8.0, and Python 3.6.
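Putting the training settings above together, a minimal PyTorch sketch might look as follows; the dummy backbone/head modules and the use of beta1 = 0.9 as the "momentum" of Adam are our assumptions for illustration.

```python
import torch
import torch.nn as nn

# Stand-in for the detector: a backbone plus a head (placeholders, not the real model).
model = nn.ModuleDict({
    "backbone": nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU()),
    "head": nn.Conv2d(16, 18, 1),
})

# Adam with the initial learning rate 1e-4 and momentum 0.9 (interpreted as beta1).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.999))

# Freezing stage: train only the head on top of pre-trained backbone weights.
for p in model["backbone"].parameters():
    p.requires_grad = False

# ... later, unfreezing stage: fine-tune the whole network.
for p in model["backbone"].parameters():
    p.requires_grad = True
```

During training, the checkpoint with the lowest validation loss would be kept, matching the model-saving strategy described above.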

3.4. Loss Function

The loss function indicates the disparity between the prediction and the actual data, revealing the model's prediction quality. The YOLOv4 algorithm utilized the CIoU and DIoU losses, and its overall loss comprised the bounding box regression, confidence, and classification losses.

The bounding box regression loss utilized the CIoU loss. Based on IoU, the CIoU considered three geometric factors: the overlapping area, the center distance, and the aspect ratio. IoU is the ratio of the intersection to the union of prediction frame A and real frame B, reflecting the quality of the predicted detection frame. IoU is defined as:

$IoU = \frac{|A \cap B|}{|A \cup B|}$ (7)

The CIoU loss function added the detection frame's length and width loss, affording a prediction frame more consistent with the real frame. CIoU is defined as:

$CIoU = IoU - \left( \frac{\rho^2(b, b^{gt})}{c^2} + \alpha\upsilon \right)$ (8)

$\upsilon = \frac{4}{\pi^2}\left( \arctan\frac{\omega^{gt}}{h^{gt}} - \arctan\frac{\omega}{h} \right)^2$ (9)


$\alpha = \frac{\upsilon}{(1 - IoU) + \upsilon}$ (10)

where $\omega^{gt}$ and $h^{gt}$ represent the width and height of the real frame, respectively, and $\omega$ and $h$ are the width and height of the prediction frame; $b$ and $b^{gt}$ denote the center points of the prediction and real frames, $\rho(\cdot)$ is the Euclidean distance between them, and $c$ is the diagonal length of the smallest box enclosing both frames.
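A minimal numeric sketch of Equations (7)-(10) for two axis-aligned boxes (prediction first, ground truth second) is given below; actual training code would operate on batched tensors.

```python
import math

def ciou(box_pred, box_gt):
    """CIoU of two boxes given as (x1, y1, x2, y2), following Equations (7)-(10)."""
    ax1, ay1, ax2, ay2 = box_pred
    bx1, by1, bx2, by2 = box_gt

    # IoU (Equation (7))
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # Squared centre distance rho^2 and squared diagonal c^2 of the enclosing box
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + (max(ay2, by2) - min(ay1, by1)) ** 2

    # Aspect-ratio terms (Equations (9) and (10))
    v = 4 / math.pi ** 2 * (math.atan((bx2 - bx1) / (by2 - by1))
                            - math.atan((ax2 - ax1) / (ay2 - ay1))) ** 2
    alpha = v / ((1 - iou) + v)

    return iou - (rho2 / c2 + alpha * v)   # Equation (8)

print(ciou((10, 10, 50, 60), (15, 12, 55, 58)))
```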

The confidence loss and category loss in YOLOv4 employed a binary cross-entropy loss function, which is affected by the imbalanced proportion of positive and negative samples, since negative samples are far more numerous. In order to better focus on the difficult-to-classify samples during training and reduce the false detection rate for similar objects, the focal loss function replaced the binary cross-entropy loss function.

The focal loss was modified from the cross-entropy loss function, with its expression being:

$L_{fl} = -(1 - p_t)^{\gamma} \log(p_t)$ (11)

where $p_t$ denotes the predicted probability of the true class and reflects the classification difficulty: the larger $p_t$, the higher the classification confidence and the easier the sample is to classify; the smaller $p_t$, the lower the classification confidence and the harder the sample is to distinguish. $\gamma > 0$ is an adjustable focusing factor.
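Equation (11) can be sketched in PyTorch as a binary focal loss applied to sigmoid outputs; the absence of an alpha-balancing term follows the expression above, and gamma = 2 is a common default rather than a value reported by the authors.

```python
import torch

def focal_loss(pred_logits, targets, gamma=2.0):
    """Binary focal loss of Equation (11) for objectness/class logits."""
    p = torch.sigmoid(pred_logits)
    p_t = torch.where(targets == 1, p, 1 - p)          # probability of the true class
    return (-(1 - p_t) ** gamma * torch.log(p_t.clamp(min=1e-7))).mean()

logits = torch.tensor([2.0, -1.0, 0.2])
labels = torch.tensor([1.0, 0.0, 1.0])
print(focal_loss(logits, labels))
```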

4. Results

The performance of the proposed lightweight model was verified on specific detection images. To prove the superiority of the developed MobileNetv3-YOLOv4 model, we challenged it against current state-of-the-art detection models. All models were trained under the same settings and datasets.

4.1. Loss Function of the Model

It could be seen from Figure 10 that the training loss curves of all object detection models converge, indicating that all models were successfully trained. As the number of training epochs increased, the loss values of YOLOv4 and the improved models initially declined quickly and then slowly. Figure 11 showed the detailed loss values between epochs 10 and 40. From Figure 11, we could find that the loss curve of YOLOv4 progressively converged near 0.07 after roughly 34 epochs, while the loss curves of MobileNetv1-YOLOv4, MobileNetv2-YOLOv4, and MobileNetv3-YOLOv4 converged near 0.07, 0.07, and 0.06 after approximately 34, 30, and 38 epochs, respectively. All loss curves converged, indicating that the trained models acquired the FHB features to complete the detection tasks.

Figure 10. The loss values of YOLOv4 and the improved models.


Figure 11. Detailed loss value between epoch 10 and 40.

4.2. Network Parameters and PR Curve Analysis

In the enhanced feature extraction network of YOLOv4, we replaced the backbone feature extraction network and the ordinary convolution with the depth-wise separable convolutions. Table 2 revealed that the parameters of the improved models decreased dramatically, with MobileNetv2-YOLOv4 having the fewest parameters and the proposed model slightly more. This was because MobileNetv3 incorporated an attention mechanism on top of MobileNetv2. The algorithms' performance was evaluated based on the accuracy and recall metrics, while the precision-recall (PR) curves of the models are illustrated in Figure 12. The PR curves take the detection accuracy as the ordinate and the recall rate as the abscissa; the area between the curve and the coordinate axis is the AP. The higher the AP value, the better the model's detection performance. Figure 12 revealed that the PR curve area of the MobileNetv3-YOLOv4 model was slightly less than that of the YOLOv4 model, probably because some features learned during training were not sufficiently generalized.

Table 2. Parameter statistics of four network models.

| Network Model | Parameters (×10^4) |
|---|---|
| YOLOv4 | 5000–6000 |
| MobileNetv1-YOLOv4 | 1269.20 |
| MobileNetv2-YOLOv4 | 1080.12 |
| MobileNetv3-YOLOv4 | 1172.91 |
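Parameter counts like those in Table 2 (reported in units of 10^4) can be obtained by summing the element counts of a model's tensors; torchvision's MobileNetV3-Large is used below purely as an illustration, not the exact detector evaluated here.

```python
import torchvision

# Count parameters of any PyTorch model and report them in Table 2's unit (x 10^4).
model = torchvision.models.mobilenet_v3_large()
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e4:.2f} x 10^4 parameters")
```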

4.3. Comparison of Different Detection Methods

To verify the superiority of the proposed model, we compared it with YOLOv3, YOLOv4, YOLOv4-tiny, YOLOv4 with ResNet50, and YOLOv4 with VGG, which are currently state-of-the-art detection models. Table 3 reported the models' AP, F1, Recall, Precision, average detection time, and model size, highlighting that YOLOv4 had a better Recall (68.43%) than the other methods, but its average detection time was the largest. The YOLOv3 and YOLOv4 models afforded appealing detection results, but YOLOv4 outperformed YOLOv3 in F1, AP, Recall, and Precision (by 20%, 4.8%, 27%, and 4.7%, respectively). The reason was that CSPDarknet-53, utilized as the backbone in YOLOv4, may enhance its detection capability, while the SPP and PANet of YOLOv4 afforded feature aggregation at multiple levels to preserve adequate spatial information for small object detection. In terms of detection performance, the proposed MobileNetv3-YOLOv4 model was more appealing than MobileNetv2-YOLOv4, attaining higher performance metrics (F1 by 4%, AP by 3.5%, Recall by 4.1%, and Precision by 1.6%). Consequently, despite the smaller model size of MobileNetv2-YOLOv4, we favor MobileNetv3-YOLOv4. Our model's AP and F1 score were 1.1% and 2% lower than YOLOv4, but the error was between 4–5%, which was within the acceptable range. Furthermore, the proposed model outperformed the YOLOv4 model in terms of lightweight indicators, reducing the average detection time by 0.0141 s and the model size by 200 MB. Although the proposed model sacrificed a certain detection accuracy, the error was within a controllable range, and the model size was significantly decreased.

Figure 12. PR curves of YOLOv4 and improved models.

Table 3. Performance comparison between different models.

| Model | AP | F1 Score | Recall | Precision | Time/s | Size/MB |
|---|---|---|---|---|---|---|
| YOLOv3 | 76.44% | 58% | 41.81% | 86.08% | 0.1140 | 247 |
| YOLOv4 | 81.28% | 78% | 68.43% | 90.72% | 0.1249 | 256 |
| YOLOv4-tiny | 79.99% | 73% | 61.09% | 91.33% | 0.1004 | 24 |
| YOLOv4 with ResNet50 | 80.81% | 76% | 65.36% | 92.07% | 0.1123 | 134 |
| YOLOv4 with VGG | 80.41% | 75% | 62.63% | 92.68% | 0.1174 | 94 |
| MobileNetv1-YOLOv4 | 79.50% | 74% | 61.43% | 92.31% | 0.1071 | 54 |
| MobileNetv2-YOLOv4 | 76.64% | 72% | 59.22% | 92.04% | 0.1084 | 49 |
| Ours | 80.17% | 76% | 63.31% | 93.69% | 0.1108 | 56 |

4.4. FHB Detection

We randomly chose an image from the FHB validation set for manual marking and compared the marking result with the detection result of the proposed model. As shown in Figure 13, the model's detections were consistent with the manual marking results, presenting a high detection accuracy. The model still had certain deficiencies (Figure 13b, where the blue circle marks a missed detection): an ear at the early stage of infection was detected as normal, but the missed detection rate was low.

Eight models were used to evaluate the testing datasets. The results demonstrated that, under low density, YOLOv3 produced marginally less accurate results in some detection frames compared with our algorithm, while YOLOv4 missed some detections and presented some false detections. We challenged the proposed model against popular models, with the corresponding results indicating that our model was comparatively lightweight due to fewer network parameters, and its detection performance was comparable to current models. Figure 14 depicted the detection results of randomly selected wheat ears.


Figure 13. Comparison between manual annotation and the proposed model. (a) Manual annotation and (b) detection results of the proposed model.


Figure 14. Comparisons of FHB detection with different algorithms. (a) The original images. (b) YOLOv3. (c) YOLOv4. (d) YOLOv4-tiny. (e) YOLOv4 with ResNet50. (f) YOLOv4 with VGG. (g) Ours.

5. Discussion

This research proposed a lightweight model for wheat ear FHB detection based on RGB images. The developed model was based on YOLOv4 but had three significant enhancements. First, YOLOv4's backbone network, CSPDarknet53, was replaced with the lightweight MobileNet. Second, we guaranteed the FHB detection accuracy by employing the CIoU loss and NMS for bounding box regression and introducing the focal loss into the loss function to reduce the error caused by the unbalanced distribution of positive and negative samples. Finally, we adopted the mix-up training and transfer learning methods to enhance the model's generalization ability. The proposed model's performance was compared against manual annotation, revealing that it outperformed current popular models in wheat ear FHB detection.

The developed lightweight model required only 56 MB, a fifth of the memory required by YOLOv4, achieving an accuracy of 93.69% compared to the competitor models presented in Table 4. Although the size of the proposed model was not the smallest, its precision was relatively high while remaining small enough for our task. Taking these two factors into consideration, we selected MobileNetv3-YOLOv4 as our proposed model. In a recent study, Hayit et al. [50] proposed a deep CNN-based model, named Yellow-Rust-Xception, to classify wheat leaf rust severity with a 91% accuracy. In order to quickly and accurately detect the severity of FHB, a method of detecting the disease utilizing continuous wavelet analysis and particle swarm optimization support vector machines was proposed [51]; this detection model produced the best overall accuracy (93.5%). Jiang et al. [52] enhanced the Visual Geometry Group Network-16 (VGG16) model through multi-task learning to simultaneously detect rice and wheat leaf diseases; for rice leaf diseases, the accuracy was 97.22%, and for wheat leaf diseases, 98.75%. Unlike the methods mentioned above, the proposed wheat ear FHB detection method focused on the lightweight design. The proposed model successfully decreased the network parameters and simplified the computing complexity. While ensuring a certain accuracy, the model size was also compressed to 1/5 of the state-of-the-art model. Therefore, the proposed model is suitable for deployment on edge devices, technologically supporting a UAV-enabled edge computing system for wheat disease diagnosis. The entire architecture is illustrated in Figure 15. The edge server manages the edge devices, sets the UAV's flight path, and performs management control. The edge device employs the model trained in the cloud server to detect diseases based on the captured images and uploads them to the cloud server for processing, storage, and updating the model. The detection results are provided to the end-users to make intelligent decisions in time, promoting the intelligent control of agricultural disease.

Table 4. Performance comparison of the proposed model with previous studies.

| Reference | Plant | Model | Precision |
|---|---|---|---|
| Ramcharan et al. [53] | Cassava | MobileNet | 84.7% |
| Gonzalez-Huitron et al. [54] | Tomato leaves | NASNetMobile | 84% |
| Hayit et al. [50] | Wheat | Xception | 91% |
| Liu et al. [55] | Tomato | Improved YOLOv3 | 92.39% |
| Huang et al. [51] | Wheat ear | Continuous wavelet analysis and PSO-SVM | 93.5% |
| Jiang et al. [52] | Wheat leaf | Multi-task deep learning transfer | 98.75% |
| Proposed method | Wheat ear | MobileNetv3-YOLOv4 | 93.69% |

Figure 15. System architecture.

6. Conclusions

This paper proposes a lightweight model for wheat ear Fusarium head blight detection. Aiming at the problem that existing deep neural networks are too large to be embedded on edge devices, this paper replaces the backbone network CSPDarknet53 of the YOLOv4 model with the lightweight MobileNetv3. Introducing a lightweight module effectively decreases the parameter cardinality and the computational complexity while preserving the model's detection performance. The depth-wise separable convolution of the lightweight modules reconstructs the object detection model, reducing the convolution operations. Moreover, based on the CIoU loss function in the YOLOv4 model, we introduce the focal loss into the loss function to reduce the error caused by the imbalanced distribution of positive and negative samples. Additionally, Adam automatically optimizes the learning rate. The proposed model's parameter cardinality is reduced by 50% compared to the original YOLOv4 model, and the average detection time is reduced by 0.014 s. The experimental results demonstrate that the proposed model has appealing advantages in wheat disease detection, with the detection accuracy on FHB of wheat ear reaching 93.69%, 2.9% higher than the YOLOv4 model. Compared with the YOLOv4 model, on the premise that the precision loss of the AP value is less than 5%, the network model is compressed from 256 MB to 56 MB, improving the detection speed and effectively overcoming the problem of difficult deployment and application. In conclusion, the proposed model could reliably and quickly detect FHB and could be deployed on edge devices. This is important because reacting in the early stages based on the disease conditions could reduce yield losses and guarantee food security.

The results indicate that the proposed model has great potential for detecting wheat ear FHB in real-time. However, this study is a starting point, and several issues are worth considering in subsequent studies.

(1) This study focused solely on wheat ear FHB detection, ignoring FHB detection on wheat leaves. Wheat is also susceptible to a variety of other diseases. As a result, further research on additional disease types should be conducted on this basis to realize the detection of wheat leaf and other diseases;

(2) Although the proposed model successfully detected FHB, incorrect detections of small objects were inevitable. In addition, as seen in Figure 14, due to the background noise, the detection performance on the second image was not as accurate as that of other models. Future research should enhance the model's detection performance for extremely small objects in complex backgrounds;

(3) This research focused merely on lowering the number of parameters and overlooked the differences between different edge platforms. Future studies should consider designing platform-dependent models to investigate the extended model's generalization ability further.

Author Contributions: Conceptualization, Q.H. and L.J.; methodology, L.J. and Q.H.; software, L.J. and Z.Z.; validation, L.J. and Q.H.; formal analysis, L.J., Q.H. and T.L.; investigation, L.J., Q.H., Z.Z. and B.L.; data curation, L.J. and C.T.; writing—original draft preparation, L.J.; writing—review and editing, Q.H. and L.J.; visualization, L.J. and Q.H.; supervision, C.T., S.J., C.G., W.M., W.L. and T.L. All authors have read and agreed to the published version of the manuscript.

Funding: This research was funded by the National Natural Science Foundation of China (32071902), the Key Research Program of Jiangsu Province, China (BE2020319), the Yangzhou University Interdisciplinary Research Foundation for Crop Science Discipline of Targeted Support (yzuxk202008) and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).

Data Availability Statement: The study did not report any data.

Conflicts of Interest: The authors declare no conflict of interest.


References1. Chen, P.F. Estimation of Winter Wheat Grain Protein Content Based on Multisource Data Assimilation. Remote Sens. 2020, 12, 20.

[CrossRef]
2. Zhang, L.; Zhang, W.S.; Cui, Z.L.; Schmidhalter, U.; Chen, X.P. Environmental, human health, and ecosystem economic performance of long-term optimizing nitrogen management for wheat production. J. Clean Prod. 2021, 311, 11. [CrossRef]
3. Brandfass, C.; Karlovsky, P. Upscaled CTAB-Based DNA Extraction and Real-Time PCR Assays for Fusarium culmorum and F. graminearum DNA in Plant Material with Reduced Sampling Error. Int. J. Mol. Sci. 2008, 9, 2306–2321. [CrossRef]
4. Anderson, J.A. Marker-assisted selection for Fusarium head blight resistance in wheat. Int. J. Food Microbiol. 2007, 119, 51–53. [CrossRef] [PubMed]
5. Xu, S.J.; Wang, Y.X.; Hu, J.Q.; Chen, X.R.; Qiu, Y.F.; Shi, J.R.; Wang, G.; Xu, J.H. Isolation and characterization of Bacillus amyloliquefaciens MQ01, a bifunctional biocontrol bacterium with antagonistic activity against Fusarium graminearum and biodegradation capacity of zearalenone. Food Control 2021, 130, 10. [CrossRef]
6. Ma, H.Q.; Huang, W.J.; Dong, Y.Y.; Liu, L.Y.; Guo, A.T. Using UAV-Based Hyperspectral Imagery to Detect Winter Wheat Fusarium Head Blight. Remote Sens. 2021, 13, 16. [CrossRef]
7. Wegulo, S.N. Factors Influencing Deoxynivalenol Accumulation in Small Grain Cereals. Toxins 2012, 4, 1157–1180. [CrossRef]
8. Barbedo, J.G.A.; Tibola, C.S.; Lima, M.I.P. Deoxynivalenol screening in wheat kernels using hyperspectral imaging. Biosyst. Eng. 2017, 155, 24–32. [CrossRef]
9. Su, J.Y.; Liu, C.J.; Hu, X.P.; Xu, X.M.; Guo, L.; Chen, W.H. Spatio-temporal monitoring of wheat yellow rust using UAV multispectral imagery. Comput. Electron. Agric. 2019, 167, 10. [CrossRef]
10. Liu, L.Y.; Dong, Y.Y.; Huang, W.J.; Du, X.P.; Ma, H.Q. Monitoring Wheat Fusarium Head Blight Using Unmanned Aerial Vehicle Hyperspectral Imagery. Remote Sens. 2020, 12, 19. [CrossRef]
11. Reder, S.; Mund, J.P.; Albert, N.; Wassermann, L.; Miranda, L. Detection of Windthrown Tree Stems on UAV-Orthomosaics Using U-Net Convolutional Networks. Remote Sens. 2022, 14, 25. [CrossRef]
12. Cui, Z.Y.; Li, Q.; Cao, Z.J.; Liu, N.Y. Dense Attention Pyramid Networks for Multi-Scale Ship Detection in SAR Images. IEEE Trans. Geosci. Remote Sens. 2019, 57, 8983–8997. [CrossRef]
13. Wu, F.Y.; Duan, J.L.; Ai, P.Y.; Chen, Z.Y.; Yang, Z.; Zou, X.J. Rachis detection and three-dimensional localization of cut off point for vision-based banana robot. Comput. Electron. Agric. 2022, 198, 12. [CrossRef]
14. Gu, C.Y.; Wang, D.Y.; Zhang, H.H.; Zhang, J.; Zhang, D.Y.; Liang, D. Fusion of Deep Convolution and Shallow Features to Recognize the Severity of Wheat Fusarium Head Blight. Front. Plant Sci. 2021, 11, 14. [CrossRef] [PubMed]
15. Su, W.H.; Zhang, J.J.; Yang, C.; Page, R.; Szinyei, T.; Hirsch, C.D.; Steffenson, B.J. Automatic Evaluation of Wheat Resistance to Fusarium Head Blight Using Dual Mask-RCNN Deep Learning Frameworks in Computer Vision. Remote Sens. 2021, 13, 20. [CrossRef]
16. Yu, D.B.; Xiao, J.; Wang, Y. Efficient Lightweight Surface Reconstruction Method from Rock-Mass Point Clouds. Remote Sens. 2022, 14, 22. [CrossRef]
17. Bian, Z.; Vong, C.M.; Wong, P.K.; Wang, S. Fuzzy KNN Method With Adaptive Nearest Neighbors. IEEE Trans. Cybern. 2020, 52, 5380–5393. [CrossRef] [PubMed]
18. Cao, L.J.; Keerthi, S.S.; Ong, C.J.; Uvaraj, P.; Fu, X.J.; Lee, H.P. Developing parallel sequential minimal optimization for fast training support vector machine. Neurocomputing 2006, 70, 93–104. [CrossRef]
19. Zou, B.; Li, L.Q.; Xu, Z.B.; Luo, T.; Tang, Y.Y. Generalization Performance of Fisher Linear Discriminant Based on Markov Sampling. IEEE Trans. Neural Netw. Learn. Syst. 2013, 24, 288–300.
20. Luo, C.W.; Wang, Z.F.; Wang, S.B.; Zhang, J.Y.; Yu, J. Locating Facial Landmarks Using Probabilistic Random Forest. IEEE Signal Process. Lett. 2015, 22, 2324–2328. [CrossRef]
21. Hernandez, S.; Lopez, J.L. Uncertainty quantification for plant disease detection using Bayesian deep learning. Appl. Soft Comput. 2020, 96, 9. [CrossRef]
22. Picon, A.; Seitz, M.; Alvarez-Gila, A.; Mohnke, P.; Ortiz-Barredo, A.; Echazarra, J. Crop conditional Convolutional Neural Networks for massive multi-crop plant disease classification over cell phone acquired images taken on real field conditions. Comput. Electron. Agric. 2019, 167, 10. [CrossRef]
23. Espejo-Garcia, B.; Mylonas, N.; Athanasakos, L.; Fountas, S.; Vasilakoglou, I. Towards weeds identification assistance through transfer learning. Comput. Electron. Agric. 2020, 171, 10. [CrossRef]
24. Chen, J.D.; Chen, J.X.; Zhang, D.F.; Sun, Y.D.; Nanehkaran, Y.A. Using deep transfer learning for image-based plant disease identification. Comput. Electron. Agric. 2020, 173, 11. [CrossRef]
25. Xiao, Y.X.; Dong, Y.Y.; Huang, W.J.; Liu, L.Y.; Ma, H.Q. Wheat Fusarium Head Blight Detection Using UAV-Based Spectral and Texture Features in Optimal Window Size. Remote Sens. 2021, 13, 19. [CrossRef]
26. Li, Y.S.; Xie, W.Y.; Li, H.Q. Hyperspectral image reconstruction by deep convolutional neural network for classification. Pattern Recognit. 2017, 63, 371–383. [CrossRef]
27. Jin, X.; Jie, L.; Wang, S.; Qi, H.J.; Li, S.W. Classifying Wheat Hyperspectral Pixels of Healthy Heads and Fusarium Head Blight Disease Using a Deep Neural Network in the Wild Field. Remote Sens. 2018, 10, 20. [CrossRef]
28. Singh, P.; Verma, V.K.; Rai, P.; Namboodiri, V.P. Acceleration of Deep Convolutional Neural Networks Using Adaptive Filter Pruning. IEEE J. Sel. Top. Signal Process. 2020, 14, 838–847. [CrossRef]
29. Wu, J.F.; Hua, Y.Z.; Yang, S.Y.; Qin, H.S.; Qin, H.B. Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method. Appl. Sci. 2019, 9, 8.
30. Peng, P.; You, M.Y.; Xu, W.S.; Li, J.X. Fully integer-based quantization for mobile convolutional neural network inference. Neurocomputing 2021, 432, 194–205. [CrossRef]
31. Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861.
32. Zhang, X.; Zhou, X.Y.; Lin, M.X.; Sun, R. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. In Proceedings of the 31st IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 6848–6856.
33. Ale, L.; Sheta, A.; Li, L.Z.; Wang, Y.; Zhang, N. Deep Learning based Plant Disease Detection for Smart Agriculture. In Proceedings of the IEEE Global Communications Conference (IEEE GLOBECOM), Waikoloa, HI, USA, 9–13 December 2019.
34. Wang, C.; Zhou, J.; Wu, H.; Teng, G.; Zhao, C.; Li, J. Identification of vegetable leaf diseases based on improved multi-scale ResNet. Trans. Chin. Soc. Agric. Eng. 2020, 36, 209–217.
35. Toda, Y.; Okura, F. How Convolutional Neural Networks Diagnose Plant Disease. Plant Phenomics 2019, 2019, 14. [CrossRef] [PubMed]
36. Tang, Z.; Yang, J.; Li, Z.; Qi, F. Grape disease image classification based on lightweight convolution neural networks and channelwise attention. Comput. Electron. Agric. 2020, 178, 105735. [CrossRef]
37. Zhao, S.Y.; Peng, Y.; Liu, J.Z.; Wu, S. Tomato Leaf Disease Diagnosis Based on Improved Convolution Neural Network by Attention Module. Agriculture 2021, 11, 15. [CrossRef]
38. Chen, J.D.; Zhang, D.F.; Suzauddola, M.; Zeb, A. Identifying crop diseases using attention embedded MobileNet-V2 model. Appl. Soft Comput. 2021, 113, 12. [CrossRef]
39. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
40. Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the 30th IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 6517–6525.
41. Redmon, J.; Farhadi, A. YOLOv3: An incremental improvement. arXiv 2018, arXiv:1804.02767.
42. Zhu, L.Z.; Zhang, S.N.; Chen, K.Y.; Chen, S.; Wang, X.; Wei, D.X.; Zhao, H.C. Low-SNR Recognition of UAV-to-Ground Targets Based on Micro-Doppler Signatures Using Deep Convolutional Denoising Encoders and Deep Residual Learning. IEEE Trans. Geosci. Remote Sens. 2022, 60, 13. [CrossRef]
43. Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. YOLOv4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934.
44. Tian, Y.N.; Yang, G.D.; Wang, Z.; Wang, H.; Li, E.; Liang, Z.Z. Apple detection during different growth stages in orchards using the improved YOLO-V3 model. Comput. Electron. Agric. 2019, 157, 417–426. [CrossRef]
45. Roy, A.M.; Bhaduri, J. Real-time growth stage detection model for high degree of occultation using DenseNet-fused YOLOv. Comput. Electron. Agric. 2022, 193, 14.
46. Li, G.; Suo, R.; Zhao, G.A.; Gao, C.Q.; Fu, L.S.; Shi, F.X.; Dhupia, J.; Li, R.; Cui, Y.J. Real-time detection of kiwifruit flower and bud simultaneously in orchard using YOLOv4 for robotic pollination. Comput. Electron. Agric. 2022, 193, 8. [CrossRef]
47. He, K.M.; Zhang, X.Y.; Ren, S.Q.; Sun, J. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. In Proceedings of the 13th European Conference on Computer Vision (ECCV), Zurich, Switzerland, 6–12 September 2014; pp. 346–361.
48. Liu, S.; Qi, L.; Qin, H.F.; Shi, J.P.; Jia, J.Y. Path Aggregation Network for Instance Segmentation. In Proceedings of the 31st IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 8759–8768.
49. Wang, C.Y.; Liao, H.Y.M.; Wu, Y.H.; Chen, P.Y.; Hsieh, J.W.; Yeh, I.H. CSPNet: A New Backbone that can Enhance Learning Capability of CNN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), virtual, 14–19 June 2020; pp. 1571–1580.
50. Hayit, T.; Erbay, H.; Varcin, F.; Hayit, F.; Akci, N. Determination of the severity level of yellow rust disease in wheat by using convolutional neural networks. J. Plant Pathol. 2021, 103, 923–934. [CrossRef]
51. Huang, L.S.; Wu, K.; Huang, W.J.; Dong, Y.Y.; Ma, H.Q.; Liu, Y.; Liu, L.Y. Detection of Fusarium Head Blight in Wheat Ears Using Continuous Wavelet Analysis and PSO-SVM. Agriculture 2021, 11, 13. [CrossRef]
52. Jiang, Z.C.; Dong, Z.X.; Jiang, W.P.; Yang, Y.Z. Recognition of rice leaf diseases and wheat leaf diseases based on multi-task deep transfer learning. Comput. Electron. Agric. 2021, 186, 9. [CrossRef]
53. Ramcharan, A.; McCloskey, P.; Baranowski, K.; Mbilinyi, N.; Mrisho, L.; Ndalahwa, M.; Legg, J.; Hughes, D.P. A Mobile-Based Deep Learning Model for Cassava Disease Diagnosis. Front. Plant Sci. 2019, 10, 8. [CrossRef]
54. Gonzalez-Huitron, V.; Leon-Borges, J.A.; Rodriguez-Mata, A.E.; Amabilis-Sosa, L.E.; Ramirez-Pereda, B.; Rodriguez, H. Disease detection in tomato leaves via CNN with lightweight architectures implemented in Raspberry Pi. Comput. Electron. Agric. 2021, 181, 9.
55. Liu, J.; Wang, X.W. Tomato Diseases and Pests Detection Based on Improved Yolo V3 Convolutional Neural Network. Front. Plant Sci. 2020, 11, 12. [CrossRef]