Yoshua Bengio

Basically, a kernel-based SVM requires on the order of n^2 computations for training and on the order of nd computations for classification, where n is the number of training examples and d the input dimension (assuming that the number of support vectors ends up being a fraction of n, which both theory and practice suggest is to be expected). A two-class linear SVM, by contrast, requires on the order of nd computations for training (times the number of training iterations, which remains small even for large n) and on the order of d computations for classification.

So when the number of training examples is large (e.g., millions of documents, images, or customer records), kernel SVMs are too expensive for training and very expensive for classification (unless one uses one of the approximations that have been proposed, but these have not yet become widely used as far as I know; someone more expert in this area could correct me). In the case of sparse inputs, linear SVMs are also convenient because d in the above can be replaced by the average number of non-zeros in each example's input vector.
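To make the training-cost contrast concrete, here is a minimal sketch using scikit-learn (my choice of library; the answer above names none). `SVC` with an RBF kernel has roughly quadratic training cost in n, while `LinearSVC` scales roughly linearly in n, so the timing gap should widen quickly as n grows; exact numbers will vary by machine.

```python
import time
import numpy as np
from sklearn.svm import SVC, LinearSVC

rng = np.random.default_rng(0)

d = 100                                        # input dimension
for n in (1_000, 4_000, 16_000):               # growing training-set sizes
    X = rng.normal(size=(n, d))
    y = (X[:, 0] > 0).astype(int)              # simple 2-class labels

    t0 = time.perf_counter()
    SVC(kernel="rbf").fit(X, y)                # kernel SVM: ~O(n^2) training
    t_kernel = time.perf_counter() - t0

    t0 = time.perf_counter()
    LinearSVC(dual=False).fit(X, y)            # linear SVM: ~O(n*d) per pass
    t_linear = time.perf_counter() - t0

    print(f"n={n:6d}  kernel SVM: {t_kernel:7.2f}s  linear SVM: {t_linear:6.2f}s")
```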

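And to illustrate the sparse-input point, again a sketch under the same scikit-learn assumption (plus SciPy): `LinearSVC` accepts `scipy.sparse` matrices directly, so the per-example cost tracks the number of stored non-zeros rather than the nominal dimension d.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# 10,000 examples in d = 100,000 dimensions with ~50 non-zeros each:
# training touches the ~5e5 stored non-zeros, not the 1e9 dense entries.
X = sp.random(10_000, 100_000, density=0.0005, format="csr", random_state=0)
y = rng.integers(0, 2, size=10_000)

clf = LinearSVC(dual=True).fit(X, y)   # liblinear handles CSR input natively
print(clf.score(X, y))
```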