Portable Document Format (PDF) documents are often used as carriers of malicious code that launches attacks or steals personal information. Traditional manual and supervised-learning-based detection methods rely heavily on labeled samples of malicious documents. This is problematic because very few labeled malicious samples are available in real-world scenarios. This chapter presents a semi-supervised machine learning method for detecting malicious PDF documents. The method extracts structural features as well as statistical features derived from entropy sequences using the wavelet energy spectrum. A random sub-sampling strategy is employed to train multiple sub-classifiers. Each sub-classifier is trained independently, which enhances the generalization capability during detection. Because the learning is semi-supervised, both labeled and unlabeled samples can be used to classify PDF documents as malicious or benign.
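To make the entropy-based statistical features more concrete, here is a minimal sketch of how an entropy sequence and a wavelet energy spectrum could be computed from the raw bytes of a file. The abstract does not state the window size, the wavelet family, or the number of decomposition levels, so the 256-byte window, the Haar wavelet, and the function names `entropy_sequence` and `haar_energy_spectrum` are all assumptions made for illustration, not the chapter's actual implementation.

```python
import math
from collections import Counter


def entropy_sequence(data: bytes, window: int = 256) -> list[float]:
    """Shannon entropy (bits per byte) of consecutive non-overlapping windows.

    The window size is an assumed value; the source does not specify one.
    """
    seq = []
    for start in range(0, len(data), window):
        chunk = data[start:start + window]
        n = len(chunk)
        counts = Counter(chunk)
        seq.append(sum(-(c / n) * math.log2(c / n) for c in counts.values()))
    return seq


def haar_energy_spectrum(signal: list[float], levels: int = 3) -> list[float]:
    """Energy of the Haar detail coefficients at each decomposition level.

    Haar is an assumed wavelet choice, used here because it is easy to
    write without external libraries.
    """
    approx = list(signal)
    energies = []
    for _ in range(levels):
        if len(approx) < 2:
            break
        if len(approx) % 2:
            approx.append(approx[-1])  # pad odd lengths by repeating the last value
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    return energies
```

Calling `haar_energy_spectrum(entropy_sequence(pdf_bytes))` on the contents of a file would yield a short fixed-length vector (one energy value per level) that could sit alongside the structural features as classifier input. The intuition is that embedded or obfuscated payloads tend to produce abrupt changes in byte entropy, which show up as energy in the detail coefficients.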