Research Article  |  Open Access  |  27 Jan 2022

Deep transfer learning benchmark for plastic waste classification

Intell Robot 2022;2(1):1-19.
10.20517/ir.2021.15 |  © The Author(s) 2022.

Abstract

Millions of people throughout the world have been harmed by plastic pollution. There are microscopic pieces of plastic in the food we eat, the water we drink, and even the air we breathe. The average human is estimated to ingest 74,000 microplastic particles every year, with a significant impact on health. This pollution must be addressed before its negative influence on the population grows further. This research benchmarks six state-of-the-art convolutional neural network models pre-trained on the ImageNet dataset. The models Resnet-50, ResNeXt, MobileNet_v2, DenseNet, SqueezeNet and AlexNet were tested and evaluated on the WaDaBa plastic dataset to classify plastic types based on their resin codes by integrating the power of transfer learning. The accuracy and training time for each model have been compared in this research. Due to the imbalance in the data, an under-sampling approach has been used. The ResNeXt model attains the highest accuracy in approximately thirteen minutes.

Keywords

Plastic, transfer learning, recycling, waste, classification

1. INTRODUCTION

Plastic finds itself in everyday human activities. Mass production of plastic, introduced in 1907 by Leo Baekeland, proved to be a boon to humankind[1]. Over the years, plastic has increasingly become an everyday necessity for humanity, and the population explosion has played a critical part in increasing domestic plastic usage[2]. Lightweight plastics have a crucial role in the transportation industry, and their usage in space exploration gives enormous leverage over heavy and expensive alternatives[3]. The packaging industry has used plastics widely since the e-commerce revolution because they are lightweight, cheap, and abundant. In 2015, the packaging sector produced 141 million metric tons of garbage, accounting for 97 percent of all waste produced relative to total consumption in that sector[4]. Discarded polyethylene terephthalate (PETE) bottles are a common source of household waste, and global waste plastic bottle consumption was estimated to surpass 500 billion in 2021[2].

The increasing use of plastics and their wastage negatively affect the global economy. This surge in consumption and the low degradability of plastic have resulted in massive plastic accumulation in the environment, which has harmed ecosystems and human health[5]. Countries have consequently formulated strict policies for plastics and even banned some types of single-use plastics. Plastics are non-biodegradable and take a considerably long time to degrade. Reusing and recycling are viable ways to stop contaminating the environment with plastic pollution[6]. Plastic waste can be recovered either before or after it enters municipal treatment plants. However, the plastic waste from municipal treatment plants is usually contaminated and ends up in landfills or incineration centers, while waste collected outside such plants is relatively cleaner and can be reused or recycled. Plastics recovered from such waste streams are of varied types, making it extremely difficult to identify and sort the different kinds.

By integrating transfer learning, a model needs only a limited number of input images to acquire high accuracy; transfer learning also accelerates the training of neural networks, consequently improving the classification of multiple classes in a dataset[7]. Balancing the number of images in each class compensates for the class imbalance problem. This research contributes a benchmark of pre-trained models and concludes that the ResNeXt model achieves the highest accuracy on the WaDaBa dataset among the pre-trained models specified in this paper.

1.1. Literature review

Seven different varieties of plastics exist in the modern day. They are classified as polyethylene terephthalate (PET or PETE), high-density polyethylene (HDPE), polyvinyl chloride (PVC or Vinyl), low-density polyethylene (LDPE), polypropylene (PP), polystyrene (PS or Styrofoam), and Others, covering plastics that do not belong to any of the preceding types, as shown in Figure 1[3].


Figure 1. Types of plastic, their resin codes, and everyday examples. PETE: polyethylene terephthalate; HDPE: high-density polyethylene; PVC: polyvinyl chloride; LDPE: low-density polyethylene; PP: polypropylene; PS: polystyrene.

1.1.1. Traditional sorting techniques

Initially, segregation of wastes and separation of different types of plastics were done manually. However, this results in increased labor costs and time consumption[6]. Traditional macro sorting of plastics was performed with the aid of sensors, including near-infrared spectrometers[8,9], X-ray transmission sensors, the Fourier transform infrared technique[10], laser-aided identification, and marker identification based on the resin type[11]. However, these approaches are limited to recognizing only particular types of plastics and are costly due to the large equipment required. The intricacy of mechanical sorting and its maintenance, as well as the high initial investment, are the drawbacks of traditional sorting methods.

1.1.2. Modern sorting techniques

Deep learning has made classification easier, more efficient, and more cost-effective, with less human intervention. The deep learning approach was enhanced by convolutional neural networks (CNNs)[12], which are excellent for object classification and detection[13]. After a model has been trained on the data, plastics can be sorted into the appropriate classes with the assistance of a CNN. CNNs do, however, require a huge quantity of training data, which can be difficult to obtain. When the input data is small, overfitting develops, resulting in inaccurate classifications[14]. Transfer learning reduces the training time of a CNN by starting from a model pre-trained on a benchmark dataset such as ImageNet.

Bobulski et al.[15] proposed an end-to-end system with a micro-computer and embedded vision to sort the PETE-type plastics in the WaDaBa dataset. The authors introduced data augmentation, which reduced the number of parameters but exponentially increased the number of samples, increasing the training time. Bobulski et al.[16] also proposed classifying distinct plastic categories based on a gradient feature vector. Agarwal et al.[17] presented Siamese and triplet loss neural networks to classify the WaDaBa dataset and succeeded with very high accuracy; however, this method requires a significant amount of time for training the neural networks. Chazhoor et al.[18] utilised transfer learning to compare the three most often used architectures (ResNeXt, Resnet-50 and AlexNet) on the WaDaBa dataset to select the optimal model; however, K-fold cross-validation was not applied, so the testing accuracy could vary widely.

The aim of the paper is to provide researchers with benchmark accuracies and the average time required to train the latest CNN models on the WaDaBa dataset, using cross-validation, to categorise a range of plastics into their appropriate resin types. An unbiased and concrete set of parameters has been fixed to evaluate the dataset and compare the models fairly[19]. This benchmark work will assist in gaining an impartial view of numerous recent CNN models applied to the WaDaBa dataset, establishing a baseline for future research. The models used in this paper are AlexNet[20], Resnet-50[21], ResNeXt[22], SqueezeNet[23], MobileNet_v2[24] and DenseNet[25].

2. METHODS

2.1. Dataset

The WaDaBa dataset is a sophisticated collection that contains images of common plastics used in society. The dataset includes seven distinct varieties of plastic. Images show several forms of plastics on a platform under two lighting conditions, an LED bulb and a fluorescent lamp, as displayed in Figure 2. Table 1 shows the distribution of the 4000 images in the dataset according to their classes. As there are no images in the PVC and PE-LD classes, both classes have been excluded from the deep learning models. The models are therefore trained on the five class types with images in the current work, i.e., PETE, PE-HD, PP, PS, and Other, and are set up so that each output matches one of these five class categories. When images for PVC and PE-LD are released, these classes can be included in the models. The dataset's classes are imbalanced, with the last class holding just 40 images and the PETE class consisting of 2200 images. The dataset is freely accessible to the public[15].


Figure 2. Examples of the plastic types from Figure 1, taken from the WaDaBa dataset. (A) Class 1 representing PETE (polyethylene terephthalate); (B) Class 2 representing HDPE (high-density polyethylene); (C) Class 5 representing PP (polypropylene); (D) Class 6 representing PS (polystyrene); (E) Class 7 representing Others[15].

Table 1

The number of images corresponding to each class in the WaDaBa dataset[15]

Resin code | Class type | Number of images
1 | PETE | 2200
2 | PE-HD | 600
3 | PVC | 0
4 | PE-LD | 0
5 | PP | 640
6 | PS | 520
7 | Other | 40

2.2. Transfer learning

A large amount of data is needed to reach optimum accuracy with a neural network, and models need to be trained for hours on a powerful graphics processing unit (GPU) to get results. With the advent of transfer learning[26], there has been a significant change in how deep neural networks are trained. A model that has already been trained on a large dataset like ImageNet[27], known as a pre-trained model, is the starting point of the transfer learning process. Transfer learning works by freezing[28] the initial hidden layers of the model and fine-tuning only the final layers. A frozen layer is not trained, so its weights remain unchanged. As the dataset used in this research is relatively small, with a limited number of images in each class, transfer learning suits it well. The pre-trained models used in the research are explained in the following subsections.
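As a minimal sketch of this freeze-and-fine-tune procedure (assuming torchvision's pre-trained Resnet-50; the five-class output layer matches the class types used in this work):

```python
import torch.nn as nn
from torchvision import models

# Load a model pre-trained on ImageNet.
model = models.resnet50(pretrained=True)

# Freeze the existing layers: frozen layers are not trained,
# so their ImageNet weights remain unchanged.
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer with a fresh five-way
# classifier (PETE, PE-HD, PP, PS, Other); only it is fine-tuned.
model.fc = nn.Linear(model.fc.in_features, 5)
```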

2.2.1. AlexNet

AlexNet is a neural network with five convolutional layers and three fully connected layers, introduced in 2012 by Alex Krizhevsky. AlexNet increases learning capacity by increasing network depth and using multi-parameter tuning techniques. AlexNet uses ReLU to add non-linearity and dropout to decrease the overfitting of data. CNN-based applications gained popularity following AlexNet's excellent performance on the ImageNet dataset in 2012[23]. The architecture of AlexNet is shown in Figure 3.


Figure 3. The architecture of AlexNet, having five convolutional layers and three fully connected layers. This figure is quoted with permission from Han et al.[29].

2.2.2. Resnet-50

Residual networks (Resnet-50) are extremely deep convolutional neural networks with skip connections and approximately 23 million parameters. A skip connection after each block bypasses some layers in the network and solves the vanishing gradient problem. Each block combines convolutions with batch normalization and ReLU activation to achieve the desired result[21]. The architecture of Resnet-50 is displayed in Figure 4.
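For intuition, a toy residual block might look as follows (a simplified sketch of the basic two-convolution block; Resnet-50 itself stacks deeper bottleneck blocks):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Toy residual block: output = F(x) + x. The identity path lets
    gradients flow around F, mitigating the vanishing gradient problem."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)  # the skip connection adds the input back
```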


Figure 4. Architecture of Resnet-50. This figure is quoted with permission from Talo et al.[30].

2.2.3. ResNeXt

Proposed by Facebook and ranked second in ILSVRC 2016, ResNeXt uses the repeating-layer strategy of Resnet-50 and appends the split-transform-merge method[22]. The size of the set of transformations is known as cardinality. Cardinality provides a novel approach to modifying model capacity by increasing the number of separate routes. To width and depth as critical characteristics, ResNeXt adds cardinality as a new dimension, and increasing cardinality is a practical approach to enhance the accuracy of the model[22]. The architecture of ResNeXt is shown in Figure 5.
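In practice, the set of parallel transformations is implemented as a grouped convolution whose number of groups equals the cardinality (a sketch with illustrative channel sizes, not the exact ResNeXt layer dimensions):

```python
import torch.nn as nn

# Split-transform-merge as a grouped convolution: 128 input channels are
# split into 32 groups (cardinality = 32) of 4 channels each, every group
# is transformed by its own 3x3 filters, and the outputs are merged.
aggregated_conv = nn.Conv2d(
    in_channels=128, out_channels=128,
    kernel_size=3, padding=1,
    groups=32,  # the cardinality dimension
    bias=False,
)
```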


Figure 5. Architecture of ResNeXt. (Figure is redrawn and quoted from Go et al.[31])

2.2.4. MobileNet_v2

MobileNet_v2 is a CNN architecture built on an inverted residual structure, with shortcut connections between narrow bottleneck layers, designed for mobile and embedded vision systems. A bottleneck residual block is a type of residual block that creates a bottleneck using 1 × 1 convolutions; the bottleneck reduces the number of parameters and matrix multiplications. The goal is to make residual blocks as small as possible so that depth may be increased while the parameters are reduced. The model uses ReLU as the activation function. The architecture comprises a 32-filter convolutional layer at the top, followed by 19 bottleneck layers[24]. The architecture of MobileNet_v2 is shown in Figure 6.
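A sketch of one such inverted residual block follows (simplified: MobileNet_v2 uses ReLU6 and a linear projection, and stride/skip handling is omitted here):

```python
import torch.nn as nn

def inverted_residual(in_ch, out_ch, expand_ratio=6):
    """Simplified inverted residual block: a 1x1 convolution expands the
    narrow input, a depthwise 3x3 convolution filters it, and a final 1x1
    convolution projects it back down to a narrow bottleneck."""
    hidden = in_ch * expand_ratio
    return nn.Sequential(
        nn.Conv2d(in_ch, hidden, 1, bias=False),         # expand
        nn.BatchNorm2d(hidden),
        nn.ReLU6(inplace=True),
        nn.Conv2d(hidden, hidden, 3, padding=1,
                  groups=hidden, bias=False),             # depthwise filter
        nn.BatchNorm2d(hidden),
        nn.ReLU6(inplace=True),
        nn.Conv2d(hidden, out_ch, 1, bias=False),         # linear projection
        nn.BatchNorm2d(out_ch),
    )
```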


Figure 6. The architecture of MobileNet_v2. This figure is quoted with permission from Seidaliyeva et al.[32]

2.2.5. DenseNet

Using a feed-forward pattern, DenseNet connects each layer to every other layer. Each layer receives the feature maps of all preceding layers as input, and its own feature maps are used by all subsequent layers. This solves the vanishing-gradient problem and improves feature propagation and reuse while significantly reducing the number of parameters. The architecture of DenseNet is shown in Figure 7.
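The dense connectivity amounts to channel-wise concatenation, as in this sketch (illustrative only; DenseNet's real layers add batch normalization and transition blocks):

```python
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    """Sketch of DenseNet connectivity: each layer's output is concatenated
    with its input, so later layers see the feature maps of all earlier ones."""
    def __init__(self, in_ch, growth_rate=32):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, growth_rate, 3, padding=1, bias=False)

    def forward(self, x):
        # Concatenating along the channel axis preserves and reuses
        # every earlier feature map.
        return torch.cat([x, self.conv(x)], dim=1)
```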


Figure 7. The architecture of DenseNet. This figure is quoted with permission from Huang et al.[25].

2.2.6. SqueezeNet

SqueezeNet is a small CNN that shrinks the network by reducing parameters while maintaining adequate accuracy. Its Fire module is an entirely new building block: a squeeze convolution layer containing only 1 × 1 filters feeds into an expand layer with a combination of 1 × 1 and 3 × 3 convolution filters. Starting with a standalone convolution layer, SqueezeNet then stacks eight Fire modules before concluding with a final convolution layer. The architecture of SqueezeNet is shown in Figure 8.
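The Fire module can be sketched as follows (a minimal version mirroring the squeeze/expand structure described above; channel sizes are left as arguments):

```python
import torch
import torch.nn as nn

class Fire(nn.Module):
    """Sketch of SqueezeNet's Fire module: a 1x1 'squeeze' layer cuts the
    channel count, then parallel 1x1 and 3x3 'expand' filters are applied
    and their outputs concatenated."""
    def __init__(self, in_ch, squeeze_ch, expand_ch):
        super().__init__()
        self.squeeze = nn.Conv2d(in_ch, squeeze_ch, 1)
        self.expand1x1 = nn.Conv2d(squeeze_ch, expand_ch, 1)
        self.expand3x3 = nn.Conv2d(squeeze_ch, expand_ch, 3, padding=1)

    def forward(self, x):
        x = torch.relu(self.squeeze(x))
        return torch.cat([torch.relu(self.expand1x1(x)),
                          torch.relu(self.expand3x3(x))], dim=1)
```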


Figure 8. The architecture of SqueezeNet. This figure is quoted with permission from Nguyen et al.[33].

2.3. Experimental settings and the experiment

All the experiments were run on the Ubuntu Linux operating system. The models were trained on an Intel i7 at 3.60 GHz with 32 GB RAM, and the graphics processing unit used was an Nvidia GeForce RTX 2080 Super. The deep learning framework used in this research is PyTorch[34]. The images from the WaDaBa dataset are input to the pre-trained models after under-sampling the dataset. A batch size of 4 was chosen so that the GPU does not run out of memory while processing. The learning rate is 0.001 and is decayed by a factor of 0.1 every seven epochs; decaying the learning rate aids the network's convergence to a local minimum and enhances the learning of complicated patterns[35]. Cross-entropy loss[36] is used for training, together with the Stochastic Gradient Descent (SGD) optimizer[37], a gradient-descent technique extensively employed in training deep learning models, with a momentum of 0.9, a value widely used in the machine learning and neural network communities. Training uses a five-fold cross-validation technique, and the results are reported along with graphs of epochs vs. accuracy and epochs vs. loss. On the WaDaBa dataset, each model was trained for twenty epochs.
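The training configuration above can be sketched as follows (assuming a `model` and `train_loader` have already been constructed; the hyperparameters are the ones stated in the text):

```python
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler

criterion = nn.CrossEntropyLoss()                  # cross-entropy loss
optimizer = optim.SGD(model.parameters(),          # SGD optimizer
                      lr=0.001, momentum=0.9)      # lr 0.001, momentum 0.9
# Decay the learning rate by a factor of 0.1 every seven epochs.
scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

for epoch in range(20):                            # twenty epochs per fold
    for inputs, labels in train_loader:            # batches of size 4
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```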

Before being passed to the models, the images were normalized; random horizontal flipping and centre cropping were also applied. The size of the input image is 224 × 224 pixels [Figure 9].
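In torchvision, the preprocessing described above might be expressed as follows (a sketch: the pre-crop resize and the ImageNet normalization constants are assumptions based on standard practice):

```python
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize(256),               # assumed pre-crop size
    transforms.CenterCrop(224),           # centre crop to 224 x 224
    transforms.RandomHorizontalFlip(),    # random horizontal flipping
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406],   # ImageNet channel means
                         [0.229, 0.224, 0.225]),  # ImageNet channel stds
])
```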


Figure 9. Flowchart summarizing the experiment.

2.3.1. Imbalance in the dataset

The number of images per class in the dataset is uneven: the first class (PETE) contains 2200 photos, while the last class (Others) contains only 40. Due to the size and cost of certain forms of plastic, obtaining balanced data is quite difficult. Because of the class imbalance, an under-sampling strategy was used. Images were split into training and validation sets, eighty percent for training and twenty percent for testing.
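A minimal sketch of such under-sampling (illustrative only; the helper name and data layout are assumptions):

```python
import random
from collections import defaultdict

def undersample(samples):
    """Balance classes by randomly keeping, for every class, only as many
    images as the smallest class contains. `samples` is a list of
    (image_path, class_label) pairs."""
    by_class = defaultdict(list)
    for path, label in samples:
        by_class[label].append((path, label))
    smallest = min(len(items) for items in by_class.values())
    balanced = []
    for items in by_class.values():
        balanced.extend(random.sample(items, smallest))
    random.shuffle(balanced)
    return balanced
```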

2.3.2. K-fold cross-validation

Five-fold cross-validation was used in all the tests to validate the benchmark models[38]. The data was tested on the six models, and the training loss and accuracy, validation loss and accuracy, and training time were recorded for 20 epochs with identical model parameters. The resulting average data was tabulated, and the corresponding graphs were plotted for visual representation. The flow chart of the experimental process is displayed in Figure 9.
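The cross-validation loop can be sketched with scikit-learn's KFold (assuming a `dataset` object and a hypothetical `train_and_evaluate` helper that trains a fresh model for 20 epochs and returns its validation accuracy):

```python
import numpy as np
from sklearn.model_selection import KFold

indices = np.arange(len(dataset))               # assumed dataset object
kfold = KFold(n_splits=5, shuffle=True, random_state=0)

fold_accuracies = []
for train_idx, val_idx in kfold.split(indices):
    # Every data point appears in the validation split of exactly one fold.
    acc = train_and_evaluate(train_idx, val_idx)  # hypothetical helper
    fold_accuracies.append(acc)

print("mean accuracy:", np.mean(fold_accuracies))
print("std deviation:", np.std(fold_accuracies, ddof=1))  # unbiased estimate
```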

3. RESULTS

3.1. Accuracy, loss, area under curve and receiver operating characteristic curve

The metrics used to benchmark the models on the WaDaBa dataset are accuracy and loss. Accuracy corresponds to the correctness of a prediction[39]: it measures how close the predicted value is to the actual value. Loss quantifies how erroneous the predictions of a neural network are and is calculated with the help of a loss function[40]. The area under the curve (AUC) measures the classifier's ability to differentiate between classes and summarizes the receiver operating characteristic (ROC) curve. The ROC curve plots a classification model's performance across classification thresholds, charting the true positive rate against the false positive rate.
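For a five-class problem, these metrics can be computed one-vs-rest, as in this sketch (the random arrays are placeholders standing in for real labels and softmax scores):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.preprocessing import label_binarize

# y_true: integer class labels; y_score: per-class softmax probabilities.
# Random placeholders stand in for real model outputs here.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 5, size=200)
y_score = rng.dirichlet(np.ones(5), size=200)

# One-vs-rest AUC averaged over the five classes.
auc = roc_auc_score(y_true, y_score, multi_class="ovr")

# Per-class ROC curves: true positive rate against false positive rate.
y_bin = label_binarize(y_true, classes=[0, 1, 2, 3, 4])
for c in range(5):
    fpr, tpr, _ = roc_curve(y_bin[:, c], y_score[:, c])
```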

Table 2 clearly shows that the ResNeXt architecture achieves the highest mean accuracy of 87.44 percent in an average time of 13.11 minutes. When implemented on smaller and portable devices, smaller networks such as MobileNet_v2, SqueezeNet, and DenseNet offer comparable accuracy. AlexNet trains in the shortest time but with the lowest accuracy; in comparison to the other models, DenseNet takes the longest to train. With a classification accuracy of 97.6 percent, ResNeXt comes out as the top model for reliably classifying PE-HD, while MobileNet_v2 classifies PS more accurately than the other models. Table 2 also shows that PP has the lowest classification accuracy for all the models. The standard deviation, σ, displayed in Table 2 measures how far values deviate from the mean. The standard deviation is given by the following unbiased estimation:

$$\sigma = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}$$

where $x_i$ = accuracy at the $i$th epoch, $\bar{x}$ = mean of the accuracies, and $n$ = total number of epochs (e.g., 20).

Table 2

The mean and class-wise accuracies of the models pre-trained on the ImageNet dataset, the time taken to train for 20 epochs, the standard deviation of accuracy across the five folds, and the total number of parameters for each model

Metric | AlexNet | Resnet-50 | ResNeXt | MobileNet_v2 | DenseNet | SqueezeNet
Mean accuracy (%) | 80.08 | 85.54 | 87.44 | 87.35 | 85.58 | 82.59
PETE (%) | 84.8 | 85 | 85 | 85 | 88.8 | 84.4
PE-HD (%) | 85.0 | 95.4 | 97.6 | 94.2 | 95.6 | 91.4
PP (%) | 67.2 | 68.6 | 74 | 74.8 | 66.4 | 66.8
PS (%) | 80.2 | 86.0 | 83.2 | 89.6 | 85.4 | 82.2
Other (%) | 100 | 100 | 100 | 100 | 100 | 97.5
Time (min) | 11.8 | 12.05 | 13.11 | 12.06 | 17.33 | 12.01
Std. deviation σ (%) | 7.5 | 4.9 | 5.4 | 6.0 | 5.3 | 1.7
No. of parameters (million) | 57 | 23 | 22 | 2 | 6 | 0.7

4. DISCUSSION

From Table 2 in the results section, we can observe that the ResNeXt architecture performs better than all the other architectures discussed in this paper. The MobileNet_v2 architecture falls behind ResNeXt by less than 0.1 percent in accuracy. Considering the time factor, MobileNet_v2 trains about a minute faster than ResNeXt. On considerably larger datasets, this difference in training time would grow, giving the MobileNet_v2 architecture the advantage.

The validation loss of the AlexNet architecture from Table 3 and the SqueezeNet architecture from Table 4 does not drop significantly compared to the other models used in this research, and from Figure 10 and Figure 11 it can be observed that there is a diverging gap between the training loss and validation loss curves for both models. The small number of images in the dataset spread over multiple classes causes this effect on the AlexNet architecture; similar results can be observed for SqueezeNet from Table 4 and Figure 11, which has a similar architecture to AlexNet. Table 5 and Figure 12 present the training and validation accuracies and loss values and their corresponding graphs for the pre-trained Resnet-50 model. From Table 6 and Figure 13, we can observe the training and validation accuracy and loss values and their plots for the ResNeXt architecture. Similarly, Table 7 and Figure 14 show the accuracies and graphs for MobileNet_v2. The DenseNet architecture, presented in Table 8 and Figure 15, takes the longest time to train and has a good accuracy score of 85.58%, comparable to the Resnet-50 architecture at 85.54%. The five-fold cross-validation approach tests every data point in the dataset and gives a reliable estimate of the overall accuracy.


Figure 10. Accuracy and loss curves for AlexNet architecture.


Figure 11. Accuracy and loss curves for SqueezeNet architecture.


Figure 12. Accuracy and loss curves for Resnet-50 architecture.


Figure 13. Accuracy and loss curves for ResNeXt architecture.


Figure 14. Accuracy and loss curves for MobileNet_v2 architecture.


Figure 15. Accuracy and loss curves for DenseNet architecture.

Table 3

The mean training and validation accuracies and losses for AlexNet architecture for 20 epochs

Epoch | Training accuracy | Validation accuracy | Training loss | Validation loss
1 | 0.5815 | 0.57302 | 1.00228 | 1.1308
2 | 0.6675 | 0.64806 | 0.80658 | 1.09448
3 | 0.7177 | 0.5804 | 0.69244 | 1.1246
4 | 0.73384 | 0.64656 | 0.6721 | 1.01474
5 | 0.77882 | 0.67598 | 0.55144 | 0.9506
6 | 0.78652 | 0.66568 | 0.51194 | 1.04706
7 | 0.79548 | 0.7093 | 0.50188 | 0.84044
8 | 0.84654 | 0.7696 | 0.36054 | 0.82302
9 | 0.87302 | 0.7642 | 0.30162 | 0.89168
10 | 0.87962 | 0.77646 | 0.28896 | 0.90384
11 | 0.87458 | 0.77746 | 0.29108 | 0.92258
12 | 0.88206 | 0.78874 | 0.28282 | 0.8886
13 | 0.88462 | 0.78236 | 0.26542 | 0.99196
14 | 0.88192 | 0.78532 | 0.26406 | 0.99434
15 | 0.89248 | 0.78972 | 0.25636 | 0.98168
16 | 0.89126 | 0.78972 | 0.2576 | 0.98266
17 | 0.88914 | 0.79118 | 0.25864 | 0.95596
18 | 0.897 | 0.79608 | 0.24166 | 0.95004
19 | 0.89344 | 0.79706 | 0.24634 | 0.9735
20 | 0.89602 | 0.79414 | 0.24826 | 0.98582
Table 4

The mean training and validation accuracies and losses for SqueezeNet architecture for 20 epochs

Epoch | Training accuracy | Validation accuracy | Training loss | Validation loss
1 | 0.47992 | 0.7281 | 1.02608 | 1.32476
2 | 0.64688 | 0.7437 | 0.78012 | 0.96076
3 | 0.7134 | 0.718 | 0.68612 | 1.05972
4 | 0.74428 | 0.67796 | 0.6426 | 1.14184
5 | 0.76116 | 0.7003 | 0.5903 | 0.81164
6 | 0.79006 | 0.70916 | 0.53186 | 0.88014
7 | 0.81026 | 0.65862 | 0.51222 | 0.89182
8 | 0.85586 | 0.69658 | 0.42766 | 0.81594
9 | 0.87364 | 0.70138 | 0.3871 | 0.89832
10 | 0.87874 | 0.70724 | 0.37834 | 0.99886
11 | 0.88684 | 0.6838 | 0.3752 | 0.9401
12 | 0.89062 | 0.69988 | 0.36256 | 0.93402
13 | 0.89798 | 0.69218 | 0.3465 | 0.94986
14 | 0.88878 | 0.7183 | 0.36842 | 0.8951
15 | 0.89504 | 0.70776 | 0.35906 | 0.97796
16 | 0.89798 | 0.70376 | 0.35146 | 1.0066
17 | 0.89896 | 0.70712 | 0.35242 | 0.99574
18 | 0.90166 | 0.70396 | 0.34732 | 1.00284
19 | 0.90422 | 0.70202 | 0.34508 | 1.01182
20 | 0.90238 | 0.70606 | 0.34562 | 0.9707
Table 5

The mean training and validation accuracies and losses for Resnet-50 architecture for 20 epochs

Epoch | Training accuracy | Validation accuracy | Training loss | Validation loss
1 | 0.5515 | 0.6706 | 1.12794 | 1.04068
2 | 0.69346 | 0.70782 | 0.81024 | 0.96718
3 | 0.7455 | 0.7691 | 0.66772 | 0.86036
4 | 0.77918 | 0.76568 | 0.5758 | 0.82058
5 | 0.80062 | 0.77648 | 0.52012 | 0.66052
6 | 0.8256 | 0.75932 | 0.44886 | 0.85278
7 | 0.83992 | 0.74364 | 0.42794 | 1.16314
8 | 0.87704 | 0.82598 | 0.32214 | 0.60218
9 | 0.89198 | 0.82254 | 0.2835 | 0.6571
10 | 0.90986 | 0.82942 | 0.24506 | 0.62152
11 | 0.90324 | 0.83382 | 0.2566 | 0.58042
12 | 0.91498 | 0.83234 | 0.23156 | 0.63032
13 | 0.91182 | 0.81626 | 0.23618 | 0.6429
14 | 0.91476 | 0.83726 | 0.23086 | 0.65462
15 | 0.9151 | 0.83484 | 0.2235 | 0.6636
16 | 0.91464 | 0.82894 | 0.22348 | 0.70444
17 | 0.91684 | 0.8343 | 0.21748 | 0.65494
18 | 0.91684 | 0.83776 | 0.21546 | 0.6189
19 | 0.91708 | 0.83482 | 0.22578 | 0.68982
20 | 0.91352 | 0.83922 | 0.22412 | 0.61236
Table 6

The mean training and validation accuracies and losses for ResNeXt architecture for 20 epochs

Epoch | Training accuracy | Validation accuracy | Training loss | Validation loss
1 | 0.57454 | 0.71078 | 1.09714 | 0.97576
2 | 0.69518 | 0.74312 | 0.8304 | 0.87308
3 | 0.752 | 0.67498 | 0.66784 | 1.3998
4 | 0.79228 | 0.76764 | 0.57174 | 0.93114
5 | 0.81336 | 0.78234 | 0.52164 | 0.7225
6 | 0.83306 | 0.83136 | 0.4542 | 0.70478
7 | 0.84494 | 0.81374 | 0.42144 | 0.7807
8 | 0.88366 | 0.8564 | 0.30548 | 0.5644
9 | 0.89836 | 0.85442 | 0.28038 | 0.64594
10 | 0.90642 | 0.85294 | 0.26156 | 0.62974
11 | 0.90826 | 0.85834 | 0.2503 | 0.65006
12 | 0.9145 | 0.85 | 0.2385 | 0.6518
13 | 0.9084 | 0.84118 | 0.2411 | 0.64972
14 | 0.91084 | 0.8544 | 0.24424 | 0.59668
15 | 0.91316 | 0.85246 | 0.2417 | 0.55656
16 | 0.92564 | 0.84854 | 0.2097 | 0.58186
17 | 0.91156 | 0.85882 | 0.23282 | 0.58778
18 | 0.916 | 0.85688 | 0.22358 | 0.63122
19 | 0.91598 | 0.84658 | 0.223 | 0.62936
20 | 0.92014 | 0.85246 | 0.21606 | 0.65276
Table 7

The mean training and validation accuracies and losses for MobileNet_v2 architecture for 20 epochs

Epoch | Training accuracy | Validation accuracy | Training loss | Validation loss
1 | 0.55528 | 0.66322 | 1.12416 | 0.97572
2 | 0.64264 | 0.71714 | 0.94286 | 0.79604
3 | 0.6871 | 0.77108 | 0.806 | 0.77816
4 | 0.72912 | 0.7392 | 0.70786 | 0.89686
5 | 0.75566 | 0.74462 | 0.6542 | 0.8389
6 | 0.7858 | 0.78334 | 0.57576 | 0.75382
7 | 0.78846 | 0.7799 | 0.54498 | 0.86344
8 | 0.8392 | 0.83332 | 0.4141 | 0.62084
9 | 0.85942 | 0.8495 | 0.36976 | 0.57796
10 | 0.8649 | 0.85296 | 0.35118 | 0.57304
11 | 0.87458 | 0.84954 | 0.33336 | 0.57328
12 | 0.87606 | 0.85734 | 0.32184 | 0.5281
13 | 0.8768 | 0.86618 | 0.3207 | 0.50986
14 | 0.88106 | 0.84902 | 0.31194 | 0.545
15 | 0.88464 | 0.85344 | 0.30746 | 0.53638
16 | 0.88756 | 0.86178 | 0.2966 | 0.5141
17 | 0.88804 | 0.8613 | 0.30038 | 0.50172
18 | 0.88342 | 0.8608 | 0.30566 | 0.52828
19 | 0.88512 | 0.85688 | 0.30972 | 0.53054
20 | 0.8822 | 0.86176 | 0.31576 | 0.50632
Table 8

The mean training and validation accuracies and losses for DenseNet architecture for 20 epochs

Epoch | Training accuracy | Validation accuracy | Training loss | Validation loss
1 | 0.55724 | 0.6446 | 1.0884 | 1.04494
2 | 0.68426 | 0.73088 | 0.81858 | 0.74552
3 | 0.7488 | 0.72302 | 0.6718 | 1.14064
4 | 0.76168 | 0.75196 | 0.64602 | 0.90288
5 | 0.7874 | 0.79118 | 0.5675 | 0.69646
6 | 0.81936 | 0.76862 | 0.50594 | 0.85718
7 | 0.82216 | 0.77744 | 0.48568 | 0.76844
8 | 0.87188 | 0.79952 | 0.36034 | 0.66998
9 | 0.87814 | 0.83136 | 0.31836 | 0.51186
10 | 0.8911 | 0.80736 | 0.30766 | 0.5814
11 | 0.8954 | 0.82354 | 0.28282 | 0.58526
12 | 0.90164 | 0.83874 | 0.27306 | 0.59644
13 | 0.89908 | 0.8392 | 0.2748 | 0.5592
14 | 0.9019 | 0.84118 | 0.27446 | 0.57224
15 | 0.90704 | 0.83578 | 0.25116 | 0.5755
16 | 0.9096 | 0.84366 | 0.24786 | 0.5398
17 | 0.90582 | 0.84216 | 0.24938 | 0.5301
18 | 0.9063 | 0.84316 | 0.26094 | 0.60658
19 | 0.91196 | 0.8299 | 0.24698 | 0.57962
20 | 0.9079 | 0.84364 | 0.24388 | 0.52476

Figure 16 shows the AUC and ROC curves for all the models in this paper. The SqueezeNet and AlexNet architectures display the lowest AUC scores, while MobileNet_v2, Resnet-50, ResNeXt and DenseNet have comparable AUC scores. From the ROC curves, it can be inferred that the models can correctly distinguish between the types of plastics in the dataset. The ResNeXt architecture achieves the largest AUC.


Figure 16. Area under curve and receiver operating characteristic for Resnet-50, ResNeXt, DenseNet, SqueezeNet, MobileNet_v2 and AlexNet models. AUC: Area under curve; ROC: receiver operating characteristic.

5. CONCLUSION

When we compare our findings to previous studies in the field, we find that including transfer learning reduces total training time significantly. If the WaDaBa dataset is enlarged in the future, it will be simple to train the existing models and attain improved accuracy in a short amount of time. This paper has benchmarked six state-of-the-art models on the WaDaBa plastic dataset by integrating deep transfer learning, and it lays a baseline for future developments on the dataset. The paper focuses on supervised learning for plastic waste classification; unsupervised learning procedures have received less focus. The latter might be beneficial for pre-training or for enhancing the supervised classification models using pre-trained feature selection. Pattern decomposition methods[41] such as nonnegative matrix factorization[42] and ensemble joint sparse low-rank matrix decomposition[43] are examples of unsupervised learning strategies. Higher-order decomposition approaches, such as low-rank tensor decomposition[44,45] and hierarchical sparse tensor decomposition[46], can yield improved performance. This would be a future path of study to improve plastic waste classification.

DECLARATIONS

Authors’ contributions

Investigated the research area, reviewed and summarized the literature, wrote and edited the original draft: Chazhoor AAP

Managed the research activity planning and execution, contributed to the development of ideas according to the research aims: Ho ESL

Performed critical review, commentary and revision, funding acquisition: Gao B

Managed the research activity planning and execution, contributed to the development of ideas according to the research aims, funding acquisition, provided administrative support: Woo WL

Availability of data and materials

The data can be found at http://wadaba.pcz.pl/. Emailing the creator and signing a consent form will give password access to the data[15]. The code has been uploaded to GitHub: https://github.com/ashys2012/plastic_wadaba/tree/main.

Financial support and sponsorship

The project is partially funded by Northumbria University and National Natural Science Foundation of China (No. 61527803, No. 61960206010).

Conflicts of interest

All authors declared that there are no conflicts of interest.

Ethical approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Copyright

© The Author(s) 2022. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, sharing, adaptation, distribution and reproduction in any medium or format, for any purpose, even commercially, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

REFERENCES

1. Hiraga K, Taniguchi I, Yoshida S, Kimura Y, Oda K. Biodegradation of waste PET: a sustainable solution for dealing with plastic pollution. EMBO Rep 2019;20:e49365.

2. Alqattaf A. Plastic waste management: global facts, challenges and solutions. 2020 Second International Sustainability and Resilience Conference: Technology and Innovation in Building Designs(51154). 2020 Nov 11-12; Sakheer, Bahrain. IEEE; 2020. p. 1-7.

3. Klemeš JJ, Fan YV. Plastic replacements: win or loss? 2020 5th International Conference on Smart and Sustainable Technologies (SpliTech). 2020 Sep 23-26; Split, Croatia. IEEE; 2020. p. 1-6.

4. Backstrom J, Kumar N. Advancing the circular economy of plastics through eCommerce. Available from: https://hdl.handle.net/1721.1/130968 [Last accessed on 24 Jan 2022].

5. Joshi C, Browning S, Seay J. Combating plastic waste via Trash to Tank. Nat Rev Earth Environ 2020;1:142-142.

6. Siddique R, Khatib J, Kaur I. Use of recycled plastic in concrete: a review. Waste Manag 2008;28:1835-52.

7. Jiao W, Wang Q, Cheng Y, Zhang Y. End-to-end prediction of weld penetration: a deep learning and transfer learning based method. J Manuf Process 2021;63:191-7.

8. Duan Q, Li J. Classification of common household plastic wastes combining multiple methods based on near-infrared spectroscopy. ACS EST Eng 2021;1:1065-73.

9. Masoumi H, Safavi SM, Khani Z. Identification and classification of plastic resins using near infrared reflectance. Int J Mech Ind Eng 2012;6:213-20.

10. Veerasingam S, Ranjani M, Venkatachalapathy R, et al. Contributions of Fourier transform infrared spectroscopy in microplastic pollution research: a review. Crit Rev Environ Sci Technol 2021;51:2681-743.

11. Bruno EA. Automated sorting of plastics for recycling. Available from: https://www.semanticscholar.org/paper/Automated-Sorting-of-Plastics-for-Recycling-Edward-Bruno/e6e5110c06f67171409bab3b38f742db6dc110fc [Last accessed on 24 Jan 2022].

12. Alzubaidi L, Zhang J, Humaidi AJ, et al. Review of deep learning: concepts, CNN architectures, challenges, applications, future directions. J Big Data 2021;8:53.

13. Albawi S, Mohammed TA, Al-Zawi S. Understanding of a convolutional neural network. 2017 International Conference on Engineering and Technology (ICET). 2017 Aug 21-23; Antalya, Turkey. IEEE;2017. p. 1-6.

14. Xie L, Wang J, Wei Z, Wang M, Tian Q. Disturblabel: regularizing CNN on the loss layer. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016 Jun 27-30; Las Vegas, NV, USA. IEEE; 2016. p. 4753-62.

15. Bobulski J, Piatkowski J. PET waste classification method and plastic waste DataBase - WaDaBa. In: Choraś M, Choraś RS, editors. Image processing and communications challenges 9. Cham: Springer International Publishing; 2018. p. 57-64.

16. Bobulski J, Kubanek M. Waste classification system using image processing and convolutional neural networks. In: Rojas I, Joya G, Catala A, editors. Advances in computational intelligence. Cham: Springer International Publishing; 2019. p. 350-61.

17. Agarwal S, Gudi R, Saxena P. One-Shot learning based classification for segregation of plastic waste. 2020 Digital Image Computing: Techniques and Applications (DICTA). 2020 Nov 29-2020 Dec 2; Melbourne, Australia. IEEE; 2020. p. 1-3.

18. Chazhoor AAP, Zhu M, Ho ES, Gao B, Woo WL. Intelligent classification of different types of plastics using deep transfer learning. Available from: https://researchportal.northumbria.ac.uk/ws/portalfiles/portal/55869518/ROBOVIS_2021_33_CR.pdf [Last accessed on 24 Jan 2022].

19. Guo Y, Zhang L, Hu Y, He X, Gao J. MS-Celeb-1M: a dataset and benchmark for large-scale face recognition. In: Leibe B, Matas J, Sebe N, Welling M, editors. Computer Vision - ECCV 2016. Cham: Springer International Publishing; 2016. p. 87-102.

20. Krizhevsky A, Sutskever I, Hinton GE. Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 2012;25:1097-105.

21. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016 Jun 27-30; Las Vegas, NV, USA. IEEE; 2016. p. 770-8.

22. Xie S, Girshick R, Dollár P, Tu Z, He K. Aggregated residual transformations for deep neural networks. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017 Jul 21-26; Honolulu, HI, USA. IEEE; 2017. p. 5987-95.

23. Iandola FN, Han S, Moskewicz MW, Ashraf K, Dally WJ, Keutzer K. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. Available from: https://arxiv.org/abs/1602.07360 [Last accessed on 24 Jan 2022].

24. Sandler M, Howard A, Zhu M, Zhmoginov A, Chen L-C. Mobilenetv2: Inverted residuals and linear bottlenecks. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018 Jun 18-23; Salt Lake City, UT, USA. IEEE; 2018. p. 4510-20.

25. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. Densely connected convolutional networks. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017 Jul 21-26; Honolulu, HI, USA. IEEE; 2017. p. 2261-9.

26. Tan C, Sun F, Kong T, Zhang W, Yang C, Liu C. A survey on deep transfer learning. In: Kůrková V, Manolopoulos Y, Hammer B, Iliadis L, Maglogiannis I, editors. Artificial neural networks and machine learning - ICANN 2018. Cham: Springer International Publishing; 2018. p. 270-9.

27. Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L. Imagenet: a large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition. 2009 Jun 20-25; Miami, FL, USA. IEEE; 2009. p. 248-55.

28. Brock A, Lim T, Ritchie JM, Weston N. Freezeout: accelerate training by progressively freezing layers. Available from: https://arxiv.org/abs/1706.04983 [Last accessed on 24 Jan 2022].

29. Han X, Zhong Y, Cao L, Zhang L. Pre-trained AlexNet Architecture with pyramid pooling and supervision for high spatial resolution remote sensing image scene classification. Remote Sensing 2017;9:848.

30. Talo M. Convolutional neural networks for multi-class histopathology image classification. 2019. Available from: https://arxiv.org/ftp/arxiv/papers/1903/1903.10035.pdf [Last accessed on 24 Jan 2022].

31. Go JH, Jan T, Mohanty M, Patel OP, Puthal D, Prasad M. Visualization approach for malware classification with ResNeXt. 2020 IEEE Congress on Evolutionary Computation (CEC). 2020 Jul 19-24; Glasgow, UK. IEEE; 2020. p. 1-7.

32. Seidaliyeva U, Akhmetov D, Ilipbayeva L, Matson ET. Real-time and accurate drone detection in a video with a static background. Sensors (Basel) 2020;20:3856.

33. Nguyen THB, Park E, Cui X, Nguyen VH, Kim H. fPADnet: small and efficient convolutional neural network for presentation attack detection. Sensors (Basel) 2018;18:2532.

34. Paszke A, Gross S, Chintala S, et al. Automatic differentiation in pytorch. Available from: https://openreview.net/pdf?id=BJJsrmfCZ [Last accessed on 24 Jan 2022].

35. You K, Long M, Wang J, Jordan MI. How does learning rate decay help modern neural networks? Available from: https://arxiv.org/abs/1908.01878 [Last accessed on 24 Jan 2022].

36. Li X, Chang D, Tian T, Cao J. Large-margin regularized Softmax cross-entropy loss. IEEE Access 2019;7:19572-8.

37. Ketkar N. Stochastic gradient descent. Deep learning with Python. Springer; 2017. p. 113-32.

38. Mukherjee H, Ghosh S, Dhar A, Obaidullah SM, Santosh KC, Roy K. Shallow convolutional neural network for COVID-19 outbreak screening using chest X-rays. Cognit Comput 2021; doi: 10.1007/s12559-020-09775-9.

39. Selvik JT, Abrahamsen EB. On the meaning of accuracy and precision in a risk analysis context. Proceedings of the Institution of Mechanical Engineers, Part O: Journal of Risk and Reliability 2017;231:91-100.

40. Singh A, Príncipe JC. A loss function for classification based on a robust similarity metric. The 2010 International Joint Conference on Neural Networks (IJCNN). 2010 Jul 18-23; Barcelona, Spain. IEEE; 2010. p. 1-6.

41. Gao B, Bai L, Woo WL, Tian G. Thermography pattern analysis and separation. Appl Phys Lett 2014;104:251902.

42. Gao B, Zhang H, Woo WL, Tian GY, Bai L, Yin A. Smooth nonnegative matrix factorization for defect detection using microwave nondestructive testing and evaluation. IEEE Trans Instrum Meas 2014;63:923-34.

43. Ahmed J, Gao B, Woo WL, Zhu Y. Ensemble joint sparse low-rank matrix decomposition for thermography diagnosis system. IEEE Trans Ind Electron 2021;68:2648-58.

44. Song J, Gao B, Woo W, Tian G. Ensemble tensor decomposition for infrared thermography cracks detection system. Infrared Physics & Technology 2020;105:103203.

45. Ahmed J, Gao B, Woo WL. Sparse low-rank tensor decomposition for metal defect detection using thermographic imaging diagnostics. IEEE Trans Ind Inf 2021;17:1810-20.

46. Wu T, Gao B, Woo WL. Hierarchical low-rank and sparse tensor micro defects decomposition by electromagnetic thermography imaging system. Philos Trans A Math Phys Eng Sci 2020;378:20190584.
