By SciForce
When it comes to choosing the right book, the abundance of possibilities is immediately overwhelming: should you choose a classic for a solid base or a fresh-from-the-oven book for the newest trends? Which level should you stick to? Will a beginner’s guide be too easy?
In this review, we have collected our Top 10 NLP and Text Analysis Books of all time, for readers ranging from beginners to experts.
1. Natural Language Processing with Python
by Steven Bird, Ewan Klein and Edward Loper.
It is so popular that every top list seems to include it. It is a timeless classic that provides an introduction to NLP using Python and its NLTK library.
Target readers:
Beginners in NLP, computational linguists and AI developers
Why it is good:
The book is very practice-oriented: you won’t be introduced to the complex theories behind the methods, just plenty of code and concepts to start experimenting with right away.
Where to find:
2. Foundations of Statistical Natural Language Processing
by Christopher Manning and Hinrich Schütze.
This book offers a thorough introduction to statistical methods for NLP and it covers both the linguistic essentials and basic statistical methods as of 1999.
Target readers:
Beginners in natural language processing with no required knowledge of linguistics or statistics
Why it is good:
Though rather old, this book gives a strong foundation in linguistics and statistical methods, which helps readers better understand the newer methods and encodings.
Where to find:
3. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition
by Dan Jurafsky and James H. Martin
Also quite old, this book offers a unified vision of speech and language processing covering statistical and symbolic approaches to language processing, and presents algorithms and techniques for speech recognition, spelling and grammar correction, information extraction, search engines, machine translation, and the creation of spoken-language dialog agents.
Target readers:
Beginners in natural language and speech processing
Why it is good:
The book provides solid foundational knowledge, as it introduces linguistics, computer science and statistics in comprehensive depth.
Where to find:
4. The Oxford Handbook of Computational Linguistics
edited by Ruslan Mitkov
This handbook describes major concepts, methods, and applications in computational linguistics. It starts from linguistic fundamentals comprehensible even for undergraduates and non-specialists from other fields of linguistics, and proceeds with an overview of current tasks, techniques, and tools in Natural Language Processing aimed at more experienced computational language researchers.
Target readers:
Linguists as well as researchers in informatics, artificial intelligence, language engineering, and cognitive science.
Why it is good:
It is an academic edition, meaning that it is theory-oriented and provides a deeper understanding of major concepts and how they function.
Where to find:
5. Text Mining with R
by Julia Silge and David Robinson.
This book presents an introduction to text mining using the tidytext package and other tidy tools in R. It demonstrates statistical natural language processing methods on a range of modern applications.
Target readers:
Practitioners at least slightly familiar with R.
Why it is good:
It is quite new; therefore it has a practical and modern feel to the demonstrations and provides examples of real text mining problems.
Where to find:
6. Neural Network Methods in Natural Language Processing (Synthesis Lectures on Human Language Technologies)
by Yoav Goldberg (series editor: Graeme Hirst)
This book focuses on the application of neural network models to natural language processing tasks. It covers the basics of supervised machine learning and of working with machine learning over language data, and then introduces more specialized neural network architectures, such as 1D convolutional neural networks, recurrent neural networks, conditioned-generation models, and attention-based models.
Target readers:
Software developers and industry practitioners who are already familiar with neural networks.
Why it is good:
The book offers a thorough overview of state-of-the-art neural network models that may be useful for NLP.
Where to find:
7. Taming Text
by Grant Ingersoll, Thomas Morton and Drew Farris.
This book provides an introduction to several NLP problems and tools, including Apache Solr, Apache OpenNLP, and Apache Mahout, with code samples in Java.
Target readers:
Software developers who want to familiarize themselves with enterprise-grade NLP tools for work projects.
Why it is good:
This book offers first-hand insights into Apache-based NLP from a cofounder of the Apache Mahout project. It is also one of the rare NLP books with Java code examples.
Where to find:
8. Deep Learning in Natural Language Processing
by Li Deng, Yang Liu
This book presents an overview of the state-of-the-art deep learning techniques and their successful applications to major NLP tasks, such as speech recognition and understanding, dialogue systems, lexical analysis, parsing, knowledge graphs, machine translation, question answering, sentiment analysis, social computing, and natural language generation from images.
Target readers:
Advanced undergraduate and graduate students in computational linguistics and computer science, as well as academic and industrial researchers.
Why it is good:
First of all, it is a 2018 edition, so it reviews the real state of the art. Besides, it provides deep and fundamental knowledge of deep learning far beyond practical applications.
Where to find:
9. Applied Text Analysis with Python: Enabling Language-Aware Data Products with Machine Learning
by Benjamin Bengfort, Rebecca Bilbro and Tony Ojeda
The book presents robust, repeatable, and scalable techniques for text analysis with Python, including contextual and linguistic feature engineering, vectorization, classification, topic modeling, entity resolution, graph analysis, and visual steering.
Target readers:
Python software developers who are interested in adding natural language processing and machine learning to their software development toolkit.
Why it is good:
This practical book presents a data scientist’s perspective on building language-aware products with applied machine learning techniques.
Where to find:
10. Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems 1st Edition
by Aurélien Géron
Though not particularly dedicated to natural language processing, this practice-oriented book presents the most popular libraries that may be used for NLP and text analysis.
Target readers:
Software developers with at least minor previous experience in machine learning.
Why it is good:
The book gives a comprehensive overview of the most recent developments in machine learning, starting from simple linear regression and progressing to deep neural networks, all using two of the most popular libraries: Scikit-Learn and TensorFlow.
Where to find:
We are sure that everyone has their own favorites that have helped them master text and speech analysis. As usual, we would be happy to hear your success stories and read your hints and suggestions of good literature in the comments.
SciForce is a Ukraine-based IT company specializing in the development of software solutions based on science-driven information technologies. We have wide-ranging expertise in many key AI technologies, including Data Mining, Digital Signal Processing, Natural Language Processing, Machine Learning, Image Processing and Computer Vision.
Original. Reposted with permission.