|
Here's our August 24, 2025 release!
This release has
- preference alignment with DPO in the post-training chapter (Chapter 9)
- completely new ASR (Whisper) and TTS (EnCodec and VALL-E) material in Chapters 15 and 16
- a restructuring of earlier chapters to fit how we are teaching now:
- Moving Naive Bayes to the appendix and instead using logistic regression to teach classification
- Moving PPMI to the appendix and covering tf-idf only in Chapter 11, to move more quickly through sparse vectors
- Introducing the concept of LLMs, along with LLM sampling and training, in Chapter 7,
before introducing the internals with the transformer in Chapter 8.
- Delaying the RNN/LSTM chapter to Chapter 13, because students have asked to go directly to transformers
without first learning RNNs. The new structure allows either order (LSTM/Transformer or Transformer/LSTM).
- a restructured Chapter 2 to focus more on tokens and words and introduce Unicode.
- typo fixes (thanks again to all of you!)
- some new slides
- The dialogue and chatbot chapter was divided up and folded into various other chapters, now that LLMs have largely replaced most earlier chatbot architectures. Much of the introduction and the ethics section went into the LLM chapter. The summary of human conversational structure went to the new Chapter 25, "Conversation and its structure". The frame-based dialogue agents section is currently in Appendix J, although that may change.
Individual chapters and updated slides are below.
Here is a single PDF of the Aug 24, 2025 book!
- Feel free to use the draft chapters and slides in your classes, print them out, whatever; the resulting feedback we get from you makes the book better!
- Typos and comments are very welcome (just email slp3edbugs@gmail.com
and let us know the date on the draft)!
(Don't bother reporting missing refs due to cross-chapter cross-reference problems in the individual chapter PDFs; those are fixed in the full book draft.)
- Gratitude! We've put up a list here of the amazing people who have sent so many fantastic suggestions and bug-fixes for improving the book.
We are really grateful to all of you for your help; the book would not be possible without you!
- How to cite the book:
Daniel Jurafsky and James H. Martin. 2025. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition with Language Models, 3rd edition. Online manuscript released August 24, 2025. https://web.stanford.edu/~jurafsky/slp3.
- A bib entry for the book is here.
@Book{jm3,
author = "Daniel Jurafsky and James H. Martin",
title = "Speech and Language Processing: An Introduction to Natural Language Processing,
Computational Linguistics, and Speech Recognition,
with Language Models",
year = "2025",
url = {https://web.stanford.edu/~jurafsky/slp3/},
note = "Online manuscript released August 24, 2025",
edition = "3rd",
}
- When will the book be finished? Don't ask.
- If you need the previous Jan 2025 draft chapters, they are here;
if you need the previous Aug 2024 draft chapters, they are here.
|