Support vector machine algorithms for machine learning

Author: Everyone is a product manager
How should we understand the concept of the SVM algorithm and its application scenarios? In this article, the author offers a fairly detailed analysis and interpretation. Let's take a look.

1. What is the support vector machine algorithm?

SVM stands for support vector machine. In machine learning, SVM is a supervised learning algorithm. At its core it is a binary classifier, and it can be extended to multi-class classification and regression tasks.

2. Basic principles

The core task of SVM is to construct an (N-1)-dimensional separating hyperplane that divides N-dimensional sample data, so that the sample points on the two sides of the hyperplane belong to two different categories. Let's start with the simplest example, a 2D plane:

[Figure: two classes of points in a 2D plane with three candidate lines A, B, and C]

Scenario 1: Which of the three straight lines A, B, and C is the correct classification boundary? Only A completely separates the two types of data, so A is the decision boundary.

[Figure: two classes of points with three lines A, B, and C that all separate them]

Scenario 2: Unlike Scenario 1, in the diagram above all three lines A, B, and C separate the two classes completely, so how should we choose? Picture the data points as two kinds of beans. Since many lines can divide the beans, we naturally want the best one. The best answer is B, because B is farthest from the nearest points of both classes. The line with the largest margin is chosen because it is more fault tolerant and more stable: when more beans are added, the probability of misclassifying them is smaller.
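The idea of choosing the line with the widest margin can be sketched in code. The example below is only an illustration under assumptions that are not in the original article: it uses the scikit-learn library, six made-up 2D points, and a very large C value to approximate a boundary that tolerates no misclassified points.

```python
import numpy as np
from sklearn.svm import SVC

# Two small, linearly separable groups of 2D points (made-up data).
X = np.array([[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C leaves almost no room for margin violations,
# which approximates the "completely separating" case described above.
model = SVC(kernel="linear", C=1e6).fit(X, y)

w = model.coef_[0]        # normal vector of the separating line
b = model.intercept_[0]   # offset of the separating line

# Distance between the two margin edges on either side of the line.
margin_width = 2.0 / np.linalg.norm(w)

print("separating line: %.2f*x1 + %.2f*x2 + %.2f = 0" % (w[0], w[1], b))
print("margin width:", round(margin_width, 3))
print("support vectors:")
print(model.support_vectors_)
```

The points printed as support vectors are the "edge data" that sit closest to the boundary. Moving any other point slightly would not change the line.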

It follows that SVM weighs two things when classifying. The first is that the samples are classified correctly. The second is maximizing the distance from the nearest data points to the boundary. Then a more troublesome scene appears: see the picture below. How should this pile of beans be divided? On a two-dimensional plane, what line could divide them?

[Figure: two classes of points in a 2D plane that no straight line can separate]

Do we really have to draw an endless number of lines to divide them? What if no straight line can separate them at all? This is the linearly inseparable case. In real life a large share of problems are linearly inseparable, and SVM is well suited to handling them.

The way to handle this kind of problem is to map the two-dimensional plane into a three-dimensional space, because a problem that is nonlinear in a low-dimensional space often becomes linear once it is mapped into a higher-dimensional space. For example, the picture above may become the case below.

[Figure: the same points mapped into 3D space, where a plane separates the two classes]

As shown in the figure above, the 3D sample data is divided by a 2D plane: the problem that was inseparable in 2D space has been converted into a linearly separable problem in 3D, which a support vector machine can handle. Based on this idea, SVM uses kernel functions to map a low-dimensional space to a high-dimensional space, which solves the linear inseparability of the low-dimensional space to a certain extent.
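A small sketch can make the 2D-to-3D idea concrete. The data and the mapping here are assumptions chosen for illustration, not the ones in the article's figure: the points lie on two concentric rings, and the added third coordinate is z = x² + y².

```python
import math


def ring(radius, n=8):
    """Return n points evenly spaced on a circle of the given radius."""
    return [
        (radius * math.cos(2 * math.pi * k / n),
         radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


def lift(point):
    """Map a 2D point to 3D by adding z = x^2 + y^2 (illustrative mapping)."""
    x, y = point
    return (x, y, x * x + y * y)


inner = ring(1.0)   # class A: small ring, surrounded by class B
outer = ring(3.0)   # class B: large ring

# In 2D no straight line separates a ring from the ring around it.
# After lifting, the flat plane z = 5 separates the two classes.
threshold = 5.0
inner_below = all(lift(p)[2] < threshold for p in inner)
outer_above = all(lift(p)[2] > threshold for p in outer)

print("inner ring below the plane:", inner_below)
print("outer ring above the plane:", outer_above)
```

In this toy case the separating plane is horizontal. With real data, the SVM would search for the best plane in the lifted space.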

Let's briefly describe the definition of a kernel function to help us understand its role further.

Kernel function: if the inner product of any two sample points in the expanded (higher-dimensional) space equals the output of a function applied to those two sample points in the original space, that function is called a kernel function.

Role: with a kernel function, the inner product in the high-dimensional space can be computed through a function evaluated in the low-dimensional space. With a degree-2 polynomial kernel, for example, we only need to compute the low-dimensional inner product and then square it. This greatly reduces the computational complexity. In summary, a kernel function implies a mapping from a lower dimension to a higher dimension while avoiding the direct calculation of the inner product in the higher dimension.
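The "compute the low-dimensional inner product, then square it" shortcut can be checked numerically. The sketch below assumes the degree-2 polynomial kernel K(a, b) = (a · b)² on 2D inputs, whose explicit mapping is φ(x1, x2) = (x1², √2·x1·x2, x2²). The function names are hypothetical helpers written for this example.

```python
import math


def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def phi(v):
    """Explicit map from 2D to 3D that matches the degree-2 kernel."""
    x1, x2 = v
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def poly2_kernel(a, b):
    """Kernel shortcut: inner product in the original 2D space, squared."""
    return dot(a, b) ** 2


a = (1.0, 2.0)
b = (3.0, 4.0)

explicit = dot(phi(a), phi(b))   # map to 3D first, then take the inner product
implicit = poly2_kernel(a, b)    # never leaves the 2D space

print("explicit high-dimensional inner product:", explicit)
print("kernel computed in low dimension:", implicit)
```

Both routes give the same number, but the kernel route never builds the higher-dimensional vectors. That saving grows quickly as the dimension of the mapped space increases.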

Commonly used kernel functions include the linear kernel, the polynomial kernel, the radial basis function (RBF) kernel (the Gaussian kernel is its most common form), the Laplace kernel, and the sigmoid kernel.

3. Application steps of the support vector machine algorithm

The following are the key steps in the application of SVM algorithms:

Step 1: Data Preparation and Preprocessing (General): Before applying SVM, you first need to collect and prepare the relevant data. Preprocessing may include data cleansing (removing noise and irrelevant data points), data transformation (such as feature scaling, so that different features fall within a similar numerical range), and data normalization.
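As a minimal sketch of the feature scaling mentioned in Step 1, the example below assumes scikit-learn's StandardScaler and a tiny made-up dataset in which the second feature is about a thousand times larger than the first. Neither the library nor the data comes from the original article.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up data: column 0 is in single digits, column 1 is in the thousands.
X_train = np.array([[1.0, 1000.0], [2.0, 3000.0], [3.0, 5000.0]])
X_test = np.array([[2.0, 3000.0]])

# Learn the mean and standard deviation from the training data only,
# then apply the same transformation to the test data.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("scaled training data:")
print(X_train_scaled)
print("scaled test data:")
print(X_test_scaled)
```

Fitting the scaler on the training data alone keeps information from the test set out of the model, which matters for the evaluation step later.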

Step 2: Select the kernel function: Select an appropriate kernel function according to the characteristics of the dataset; this is one of the core steps of SVM. If the dataset is linearly separable, a linear kernel can be used. For nonlinear data, you can choose a radial basis function (RBF) kernel or a polynomial kernel, which implicitly maps the data into a higher-dimensional space and uncovers complex relationships in the data.
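One practical way to choose a kernel is to compare cross-validated accuracy. The sketch below assumes scikit-learn and its synthetic make_circles dataset (one ring of points inside another), which is not linearly separable. Both are illustrative choices, not part of the original article.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic data: an inner ring (class 1) surrounded by an outer ring (class 0).
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Mean 5-fold cross-validated accuracy for each candidate kernel.
scores = {
    kernel: cross_val_score(SVC(kernel=kernel), X, y, cv=5).mean()
    for kernel in ("linear", "rbf")
}

for kernel, score in scores.items():
    print("%s kernel: mean accuracy %.3f" % (kernel, score))
```

On data shaped like this, the RBF kernel should score clearly higher than the linear kernel. On data that is already linearly separable, the simpler linear kernel is usually the better starting point.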

Step 3: Parameter Optimization: The SVM parameters (for example, the C parameter and the kernel parameters) have a direct impact on model performance. The C parameter controls how much the model tolerates data points that violate the margin: a larger C penalizes violations more heavily. Kernel parameters, such as the γ parameter of the RBF kernel, control the distribution of the data after it is mapped to the high-dimensional space.
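Parameter optimization is commonly done with a grid search. The sketch below assumes scikit-learn's GridSearchCV, the synthetic make_moons dataset, and an arbitrary grid of candidate values. The specific numbers are illustrative, not recommendations from the article.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic two-class data with a curved boundary.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# Scaling and the SVM are chained so that scaling is refit inside each fold.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Candidate values for C and gamma (arbitrary illustrative grid).
# make_pipeline names the SVC step "svc", hence the "svc__" prefix.
param_grid = {
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": [0.01, 0.1, 1, 10],
}

search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_

print("best parameters:", best_params)
print("best cross-validated accuracy: %.3f" % best_score)
```

Each combination of C and gamma is scored by 5-fold cross-validation, and the combination with the highest mean accuracy is kept.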

Step 4: Train the SVM model: Train the SVM model on the training data using the selected kernel function and the optimized parameters. At this stage, the algorithm learns the best hyperplane for dividing the classes and determines the support vectors.
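A minimal training sketch, again assuming scikit-learn and synthetic data from make_blobs, shows how the support vectors can be inspected once the model has been fitted.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two synthetic clusters of 2D points, one per class.
X, y = make_blobs(n_samples=100, centers=2, cluster_std=0.6, random_state=0)

# Train with a linear kernel; C=1.0 is scikit-learn's default value.
model = SVC(kernel="linear", C=1.0).fit(X, y)

support_vectors = model.support_vectors_   # training points that define the boundary
n_support_per_class = model.n_support_     # number of support vectors in each class

print("number of training points:", len(X))
print("support vectors per class:", n_support_per_class)
print("support vectors:")
print(support_vectors)
```

Only the support vectors are needed to make predictions, which is why a trained SVM typically keeps just a subset of the training data.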

Step 5: Model Evaluation (General): Use a test set to evaluate the performance of the SVM model. Common evaluation metrics include accuracy, recall, and F1 score. The results help us understand how well the model generalizes to unseen data.

Step 6: Model Deployment and Monitoring (General): The final step is to deploy the trained SVM model to production and monitor it continuously. During deployment, you need to ensure that the format of the real-time data is consistent with that of the training data, and to evaluate the model regularly so that it keeps up with possible changes in the data or environment.
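The evaluation in Step 5 can be sketched as follows. The example assumes scikit-learn and its bundled breast cancer dataset, with a 75/25 train/test split. The dataset, split ratio, and parameter values are illustrative assumptions rather than the author's setup.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, f1_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Binary classification dataset shipped with scikit-learn.
X, y = load_breast_cancer(return_X_y=True)

# Hold out 25% of the data as a test set the model never sees in training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling and the SVM are bundled so the same preprocessing is applied
# at training time and at prediction time.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}

for name, value in metrics.items():
    print("%s: %.3f" % (name, value))
```

Bundling preprocessing and model in one pipeline also helps with Step 6, because live data then passes through exactly the same transformation as the training data.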

4. Application scenarios

SVM is suitable not only for linear problems but also for nonlinear ones. It has good classification performance and generalization ability, and it applies to a wide range of practical problems.

  • Text classification: Once text is represented as feature vectors, an SVM classifier can be trained to sort it into different categories, for tasks such as spam classification, sentiment analysis, and topic classification.
  • Image classification: By extracting the feature vectors of images, SVMs can be used to train a classifier to classify images into different categories, such as face recognition, object recognition, image retrieval, etc.
  • Biomedical: SVM can be used for cancer classification, protein structure prediction, gene expression data analysis, and more.
  • Finance: SVM can be used for multiple tasks in the financial sector, such as credit scoring, fraud detection, stock market forecasting, and more.
  • Medical image analysis: SVM can be used for lesion detection, disease diagnosis, medical image segmentation, etc.
  • Natural language processing: SVM can be used for tasks such as named entity recognition, syntactic analysis, and machine translation.

5. Advantages and disadvantages

Advantages:

  • It performs well when there is a clear margin of separation between the classes;
  • It is particularly effective in high-dimensional spaces;
  • It remains effective when the number of dimensions is greater than the number of samples;
  • It uses only a subset of the training points (the support vectors) in the decision function, so it is memory efficient.

Disadvantages:

  • It does not perform well on very large datasets, because the required training time becomes too long;
  • It performs poorly when the dataset contains a lot of noise, for example when the classes overlap;
  • It does not directly provide probability estimates; these have to be obtained through an additional, computationally expensive cross-validation procedure.
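The last point can be illustrated with a short sketch. It assumes scikit-learn's CalibratedClassifierCV as one way to obtain probabilities from an SVM, and uses synthetic data from make_classification. This is an illustrative approach, not one named in the original article.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic two-class dataset.
X, y = make_classification(n_samples=300, n_features=5, random_state=0)

# A plain SVM only outputs signed scores relative to the boundary,
# not probabilities.
raw_scores = SVC(kernel="rbf").fit(X, y).decision_function(X[:3])

# Calibration trains the SVM on several cross-validation folds (5 here)
# to learn how scores map to probabilities, which is the extra cost
# mentioned in the list above.
calibrated = CalibratedClassifierCV(SVC(kernel="rbf"), method="sigmoid", cv=5)
calibrated.fit(X, y)
probabilities = calibrated.predict_proba(X[:3])

print("raw SVM scores:", raw_scores)
print("calibrated probabilities:")
print(probabilities)
```

With cv=5 the SVM is trained five times instead of once, so the cost of getting probabilities grows with the size of the dataset.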

Reference:

This article was originally published by @Houqian on Everyone is a product manager, and it is forbidden to reprint without the permission of the author.

The title image is from Unsplash and is licensed under CC0.

The views in this article represent only the author's own. The Everyone is a product manager platform only provides information storage services.