Abstract:
A new data stream classification method based on similarity weighting and differential integration of classifiers is proposed. The method takes the most recent base classifier as the reference classifier, representing the upcoming concept in the data stream. Relative to this classifier, the similarity of each base classifier is computed using Gower’s similarity coefficient, and the similarities serve as the base classifier weights in a weighted majority vote. At the same time, the Q-statistic is used to measure the difference between the reference classifier and the other base classifiers, and relatively weak base classifiers are eliminated according to the size of this difference so that the diversity of the ensemble classification model is preserved. Finally, simulation experiments on a standard simulated dataset show that the classification accuracy of the proposed method is about 11% higher than that of the traditional ensemble classification method when classifying a noisy dynamic data stream, indicating that the method has good classification accuracy and stability against noise.
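
The two measures named above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not state: each classifier is compared with the reference classifier through its predicted labels on a shared evaluation chunk (for nominal values, Gower's coefficient reduces to the fraction of matching entries), the Q-statistic is the usual pairwise form computed from correct/incorrect counts, and an undefined Q (zero denominator) is treated as 0. The function names are hypothetical, and the elimination threshold is left out because the abstract does not give one.

```python
from collections import defaultdict


def gower_similarity(preds_a, preds_b):
    """Gower's coefficient for nominal values: share of matching positions.

    Assumption: classifiers are compared via their predicted labels
    on the same evaluation chunk.
    """
    if len(preds_a) != len(preds_b) or len(preds_a) == 0:
        raise ValueError("prediction lists must be non-empty and equal in length")
    matches = sum(a == b for a, b in zip(preds_a, preds_b))
    return matches / len(preds_a)


def q_statistic(preds_a, preds_b, truth):
    """Pairwise Q-statistic from the correct/incorrect contingency counts.

    Q = (N11*N00 - N01*N10) / (N11*N00 + N01*N10)
    """
    n11 = n00 = n10 = n01 = 0
    for a, b, y in zip(preds_a, preds_b, truth):
        a_ok, b_ok = (a == y), (b == y)
        if a_ok and b_ok:
            n11 += 1          # both correct
        elif not a_ok and not b_ok:
            n00 += 1          # both wrong
        elif a_ok:
            n10 += 1          # only the first is correct
        else:
            n01 += 1          # only the second is correct
    denom = n11 * n00 + n01 * n10
    if denom == 0:
        return 0.0            # assumption: treat an undefined Q as 0
    return (n11 * n00 - n01 * n10) / denom


def weighted_vote(votes, weights):
    """Weighted majority vote: return the label with the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(votes, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Toy data: the reference classifier is the most recent base classifier.
truth = [1, 0, 1, 1, 0, 1]
reference = [1, 0, 1, 0, 0, 1]
older = [1, 1, 1, 0, 0, 0]

weight_older = gower_similarity(reference, older)   # weight for the older classifier
q_older = q_statistic(reference, older, truth)      # difference measure vs. the reference
decision = weighted_vote(["a", "b", "b"], [1.0, 0.4, 0.4])
```

In this sketch a Q value near 1 means the two classifiers tend to be right and wrong on the same instances, while values near 0 or below indicate more diverse behaviour; how such values map to the elimination step is a design choice of the paper that the abstract leaves unspecified.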