Applied Statistical Natural Language Processing Methods (2017)

July 19-21, 2017
(Instructor: Jeffrey Ferraro)

Natural language processing (NLP) is concerned with the practical issues of using computer systems to process human language. Statistical NLP applies machine learning and statistical techniques to produce inferences, making it possible to reason and make decisions over text much as humans do. In this two-and-a-half-day course we will explore techniques for reasoning over text and drawing meaning from it. We will learn through applied, real-world biomedical and clinical text processing exercises. Student teams will collaboratively build statistical NLP applications and learn how to mine text for meaning. You will apply existing machine learning algorithms in Python to build functional language processing prototypes.

Students completing this course will gain a hands-on working knowledge of:

  • The Principles of Machine Learning applied to Natural Language Processing
  • Methods of Machine Learning for Text Processing
  • Feature Space Representations for Text Processing
  • Representation of Semantic Context for Text Processing
  • Document Level Inference and Classification
  • Information Retrieval and Ranking Methods
  • Topic Clustering
  • Concept Mapping


You will be working in small teams, so the following prerequisites apply to the combined skills of the team rather than to each member:

  • Experience programming with Python
  • Experience using a Python IDE & debugger
  • Clinical domain expertise
  • Organizational abilities
  • Motivation to learn & explore


Topic Product