An incredible amount of unstructured text data is generated every day by social media, web pages, and a variety of other sources. But without the ability to tame and harness that data, you'll be unable to glean any value from it. In this course, learn how to translate messy text data into powerful insights using Python. Instructor Derek Jedamski begins with a quick review of foundational NLP concepts, including how to clean text data and build a model on top of vectorized text. He then jumps into more complex topics such as word2vec, doc2vec, and recurrent neural networks. To wrap up the course, he lends these concepts a real-world context by applying them to a machine learning problem.
- English title: Advanced NLP with Python for Machine Learning
- Leveraging the power of messy text data
- What you should know
- What tools you need
- Using the exercise files
- What is NLP?
- NLTK setup
- Reading text data into Python
- Cleaning text data
- Vectorize text using TF-IDF
- Building a model on top of vectorized text
- What is word2vec?
- What makes word2vec powerful?
- How to implement word2vec
- How to prep word vectors for modeling
- What is doc2vec?
- What makes doc2vec powerful?
- How to implement doc2vec
- How to prep document vectors for modeling
- What is a neural network?
- What is a recurrent neural network?
- What makes RNNs so powerful for NLP problems?
- Preparing data for an RNN
- How to implement a basic RNN
- Prep the data for modeling
- Build a model on TF-IDF vectors
- Build a model on word2vec embeddings
- Build a model on doc2vec embeddings
- Build an RNN model
- Compare all methods using key performance metrics
- Key takeaways for advanced NLP modeling techniques
- How to continue advancing your skills
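
As a rough illustration of the foundational review (cleaning text, then vectorizing it with TF-IDF), here is a minimal sketch that uses only the Python standard library. It is not taken from the course: the small stopword list, the `clean_text` and `tfidf_vectors` helpers, and the sample messages are all assumptions made for illustration. It also uses the textbook weighting, term frequency times log(N / document frequency), rather than any particular library's smoothed variant.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; NLTK ships a much fuller one).
STOPWORDS = {"a", "an", "the", "is", "to", "and", "of", "in", "on", "for"}


def clean_text(text):
    """Lowercase, split on non-word characters, and drop stopwords."""
    tokens = re.split(r"\W+", text.lower())
    return [tok for tok in tokens if tok and tok not in STOPWORDS]


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) with one TF-IDF vector per document."""
    tokenized = [clean_text(doc) for doc in documents]
    vocabulary = sorted({tok for tokens in tokenized for tok in tokens})
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(tok for tokens in tokenized for tok in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append([
            (counts[term] / total) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors


# Hypothetical sample messages, not data from the course.
docs = [
    "Free entry in a weekly prize draw!",
    "Are we still on for lunch today?",
    "Claim the free prize today.",
]
vocab, vectors = tfidf_vectors(docs)
```

Each row of `vectors` lines up with `vocab`, so within a message a term shared with other messages (such as "free") receives a lower weight than a term unique to that message (such as "entry"). A classifier could then be trained on these rows, which is the general idea behind building a model on top of vectorized text.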