Text categorization with Support Vector Machines: Learning with many relevant features

  • Support Vector Learning
  • Conference paper
  • First Online: 01 January 2005
  • pp 137–142
Machine Learning: ECML-98 (ECML 1998)
  • Thorsten Joachims

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1398)

Included in the following conference series:

  • European Conference on Machine Learning

Abstract

This paper explores the use of Support Vector Machines (SVMs) for learning text classifiers from examples. It analyzes the particular properties of learning with text data and identifies why SVMs are appropriate for this task. Empirical results support the theoretical findings. SVMs achieve substantial improvements over the currently best-performing methods and behave robustly over a variety of different learning tasks. Furthermore, they are fully automatic, eliminating the need for manual parameter tuning.
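
To make the idea of learning a text classifier with a linear SVM concrete, here is a minimal Python sketch using only the standard library. It is not the paper's implementation: the training loop is a simple Pegasos-style subgradient method on the regularized hinge loss, swapped in for illustration, and the toy documents, the function names (`fit_idf`, `vectorize`, `train_linear_svm`, `predict`), and the settings `lam` and `epochs` are all assumptions chosen for this example.

```python
import math
from collections import Counter


def fit_idf(docs):
    """Compute a smoothed inverse document frequency for each training term."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def vectorize(doc, idf):
    """Map a tokenized document to an L2-normalized sparse TF-IDF vector."""
    tf = Counter(term for term in doc if term in idf)  # unseen terms are ignored
    vec = {term: count * idf[term] for term, count in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {term: v / norm for term, v in vec.items()}


def score(w, x):
    """Sparse dot product between the weight vector and a document vector."""
    return sum(w.get(term, 0.0) * v for term, v in x.items())


def train_linear_svm(X, y, lam=0.01, epochs=50):
    """Pegasos-style subgradient descent on the L2-regularized hinge loss.

    Illustrative assumption: no bias term, examples visited in a fixed order.
    """
    w = {}
    t = 0
    for _ in range(epochs):
        for x, label in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            margin = label * score(w, x)   # computed before the update
            shrink = 1.0 - 1.0 / t         # effect of the regularizer
            for term in w:
                w[term] *= shrink
            if margin < 1.0:               # example violates the margin
                for term, v in x.items():
                    w[term] = w.get(term, 0.0) + eta * label * v
    return w


def predict(w, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if score(w, x) >= 0.0 else -1


# Hypothetical toy corpus: +1 = grain-related, -1 = banking-related.
train_docs = [
    "wheat harvest grain export".split(),
    "grain prices wheat farmers".split(),
    "bank interest rates loan".split(),
    "loan rates bank profit".split(),
]
train_labels = [1, 1, -1, -1]

idf = fit_idf(train_docs)
X_train = [vectorize(doc, idf) for doc in train_docs]
weights = train_linear_svm(X_train, train_labels)

pred_grain = predict(weights, vectorize("wheat grain farmers".split(), idf))
pred_bank = predict(weights, vectorize("bank loan interest".split(), idf))
print(pred_grain, pred_bank)
```

The sketch keeps every term as a feature and stores vectors sparsely, which mirrors the setting the title describes (many relevant features); a production system would more likely rely on an established SVM solver than on this hand-rolled loop.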

Author information

Authors and Affiliations

  1. Universität Dortmund, Informatik LS8, Baroper Str. 301, 44221, Dortmund, Germany

    Thorsten Joachims

Editor information

Claire Nédellec, Céline Rouveirol

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Joachims, T. (1998). Text categorization with Support Vector Machines: Learning with many relevant features. In: Nédellec, C., Rouveirol, C. (eds) Machine Learning: ECML-98. ECML 1998. Lecture Notes in Computer Science, vol 1398. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0026683

  • DOI: https://doi.org/10.1007/BFb0026683

  • Published: 16 June 2005

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64417-0

  • Online ISBN: 978-3-540-69781-7

  • eBook Packages: Springer Book Archive

Keywords

  • Support Vector Machine
  • Radial Basis Function
  • Text Categorization
  • Irrelevant Feature
  • Linear Threshold Function

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
