Chowdhury Rafeed Rahman
United International University (UIU)
Preetom Saha Arko
Bangladesh University of Engineering and Technology (BUET)
Mohammed Eunus Ali
Bangladesh University of Engineering and Technology (BUET)
Mohammad Ashik Iqbal Khan
Bangladesh Rice Research Institute (BRRI)
Sajid Hasan Apon
Bangladesh University of Engineering and Technology (BUET)
Farzana Nowrin
Bangladesh Rice Research Institute (BRRI)
Abu Wasif
Bangladesh University of Engineering and Technology (BUET)
Keywords: Rice disease, Pest, Convolutional neural network, Dataset, Memory efficient, Two-stage training
Pest Management
Diseases, Rice
2.1 Data Collection

Rice diseases and pests occur in different parts of the rice plant. Their occurrence depends on many factors such as temperature, humidity, rainfall, rice plant variety, season and nutrition. An extensive exercise was undertaken to collect a total of 1,426 images of rice diseases and pests from the paddy fields of the Bangladesh Rice Research Institute (BRRI). Images have been collected in real-life scenarios with heterogeneous backgrounds from December 2017 to June 2018, a total of seven months. Image collection has been performed in a range of weather conditions (winter, summer and overcast) in order to obtain as representative a set of images as possible, and four different types of camera have been used. These steps increase the robustness of our model. This work encompasses five classes of diseases, three classes of pests, and one class covering healthy plants and others: nine classes in total. The dataset records each class name along with the number of images collected for that class. Note that Sheath Blight, Sheath Rot and their simultaneous occurrence have been placed in the same class, because their treatment method and place of occurrence are the same.

Symptoms of different diseases and pests appear on different parts of the rice plant, such as the leaf, stem and grain. Bacterial Leaf Blight (BLB) disease, Brown Spot disease, Brown Plant Hopper pest (late stage) and Hispa pest occur on rice leaves. Sheath Blight disease, Sheath Rot disease and Brown Plant Hopper pest (early stage) occur on the rice stem. Neck Blast disease and False Smut disease occur on the rice grain. Stemborer pest occurs on both the rice stem and the rice grain. All these aspects have been considered while capturing images. To prevent classification models from confusing dead parts of the rice plant with diseased parts, images of dead leaves, dead stems and dead grains have been incorporated into the dataset.
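The class-to-plant-part relationships described above can be sketched as a small lookup table. This is an illustrative data structure, not the authors' released code; the class names are taken from the text, while the dictionary layout and helper function are assumptions:

```python
# Sketch of the class -> affected plant part mapping described in the text.
# Brown Plant Hopper appears on leaves in its late stage and on the stem
# in its early stage, so it maps to both parts.
AFFECTED_PARTS = {
    # diseases
    "Bacterial Leaf Blight":           {"leaf"},
    "Brown Spot":                      {"leaf"},
    "Sheath Blight and/or Sheath Rot": {"stem"},
    "Neck Blast":                      {"grain"},
    "False Smut":                      {"grain"},
    # pests
    "Brown Plant Hopper":              {"leaf", "stem"},
    "Hispa":                           {"leaf"},
    "Stemborer":                       {"stem", "grain"},
}

def classes_on(part):
    """Return all disease/pest classes whose symptoms appear on `part`."""
    return sorted(c for c, parts in AFFECTED_PARTS.items() if part in parts)
```

Such a table makes it easy to verify coverage when photographing symptoms, e.g. `classes_on("grain")` lists every class that must be captured on rice grains.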
For example, BLB, Neck Blast and Sheath Blight resemble a dead leaf, dead grain and dead stem of the rice plant, respectively. Thus images of dead leaves, dead stems and dead grains, along with images of healthy rice plants, have been grouped into a class named Others. The False Smut, Stemborer, Healthy Plant, and Sheath Blight and/or Sheath Rot classes show multiple types of symptoms. Early-stage symptoms of Hispa and Brown Plant Hopper differ from their later-stage symptoms. All symptom variations of these classes found in the paddy fields of BRRI have been covered in this work. BLB, Brown Spot and Neck Blast show no considerable intra-class variation around the BRRI area.

2.2 Experimental Setup

The Keras framework with TensorFlow backend has been used to train the models. Experiments have been conducted with two state-of-the-art CNN architectures containing a large number of parameters, VGG16 and InceptionV3. Later, the proposed light-weight two-stage Simple CNN has been tested and compared with three state-of-the-art memory-efficient CNN architectures: MobileNetV2, NASNet Mobile and SqueezeNet. VGG16 is a sequential CNN architecture using 3×3 convolution filters; after each max-pooling layer, the number of convolution filters doubles. InceptionV3 is a non-sequential CNN architecture consisting of Inception blocks; in each block, convolution filters of various dimensions and pooling are applied to the input in parallel.

Three different training methods have been implemented on each of these five architectures. Baseline training: all layers of the architecture are randomly initialized and trained from scratch. This method of training takes time to converge.
Fine tuning: the convolution layers of the CNN architectures are trained starting from their pre-trained ImageNet weights, while the dense layers are trained from randomly initialized weights. Transfer learning: the convolution layers are not trained at all; their pre-trained ImageNet weights are kept intact, and only the dense layers are trained from randomly initialized weights.

10-fold cross-validation accuracy along with its standard deviation has been used as the model performance metric, since the dataset used in this work does not have any major imbalance. Categorical cross-entropy has been used as the loss function for all CNN architectures, since this work deals with multi-class classification. All intermediate layers of the CNN architectures used in this work have ReLU activation functions, while the activation function of the last layer is softmax. The hyperparameters used are as follows: dropout rate of 0.3, learning rate of 0.0001, mini-batch size of 64 and 100 epochs. These values have been obtained through hyperparameter tuning using 10-fold cross-validation. The Adaptive Moment Estimation (Adam) optimizer has been used for updating the model weights. All images have been resized to the default input size of each architecture before working with that architecture; for example, InceptionV3 requires 299×299×3 images, while VGG16 requires 224×224×3 images.

Random rotation from -15 to 15 degrees, rotations by random multiples of 90 degrees, random distortion, shear transformation, vertical flip, horizontal flip, skewing and intensity transformation have been used as part of the data augmentation process. Every augmented image is the result of a particular subset of all these transformations, where rotation-type transformations have been assigned a high probability, because CNN models in general are not rotation invariant.
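The three training regimes above differ only in how each layer group is initialized and whether it is trainable. A minimal, framework-agnostic sketch of these differences (the function and flag names are illustrative assumptions, not the authors' code):

```python
# Illustrative sketch of the three training regimes described above.
# "conv" = convolutional base, "dense" = classifier head.
def training_config(method):
    if method == "baseline":
        # Everything randomly initialized and trained from scratch.
        return {"conv_init": "random",   "conv_trainable": True,
                "dense_init": "random",  "dense_trainable": True}
    if method == "fine_tuning":
        # Conv layers start from ImageNet weights but keep training.
        return {"conv_init": "imagenet", "conv_trainable": True,
                "dense_init": "random",  "dense_trainable": True}
    if method == "transfer_learning":
        # Conv layers frozen at ImageNet weights; only the head trains.
        return {"conv_init": "imagenet", "conv_trainable": False,
                "dense_init": "random",  "dense_trainable": True}
    raise ValueError(f"unknown training method: {method}")
```

In Keras terms, freezing the convolutional base for transfer learning corresponds to setting `layer.trainable = False` on those layers before compiling the model.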
In this way, 10 augmented images have been created from every original image. Randomly choosing the subset of transformations helps augment an original image in a heterogeneous way. A remote Red Hat Enterprise Linux server of RMIT University has been used for carrying out the experiments. The server configuration includes 56 CPUs, 503 GB of RAM, 1 petabyte of user-specific storage and two NVIDIA Tesla P100-PCIE GPUs with 16 GB of memory each.
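The random-subset augmentation policy described above can be sketched as follows. The transform names come from the text, but the selection probabilities and the pipeline structure are assumptions for illustration; the actual probabilities are not stated in the paper:

```python
import random

# Transforms listed in the text; the probabilities are illustrative.
# Rotation-type transforms get a higher selection probability, since
# CNN models in general are not rotation invariant.
TRANSFORMS = {
    "random_rotation_-15_to_15_deg":  0.8,
    "rotation_multiple_of_90_deg":    0.8,
    "random_distortion":              0.3,
    "shear":                          0.3,
    "vertical_flip":                  0.3,
    "horizontal_flip":                0.3,
    "skew":                           0.3,
    "intensity_transform":            0.3,
}

def sample_transform_subset(rng=random):
    """Pick the subset of transforms applied to one augmented image."""
    return [name for name, p in TRANSFORMS.items() if rng.random() < p]

def augment_dataset(images, copies_per_image=10):
    """Each original image yields `copies_per_image` augmented variants,
    each paired with its own independently sampled transform subset."""
    return [(img, sample_transform_subset())
            for img in images
            for _ in range(copies_per_image)]
```

Because each of the 10 copies draws its own transform subset, two augmented versions of the same original image rarely undergo the same combination, which is what gives the augmented dataset its heterogeneity.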
Rice Disease and Pest Classification