This course introduces a variety of analytical methods for text data. Texts surround us in our professional and daily lives in the form of written communication, document collections, social media streams, etc. Computer science methods for text analytics can thus be useful for scientific and engineering tasks, including domain applications involving literature, social media, or medical text data. This course combines two perspectives, computational (i.e., natural language processing) and visual (i.e., information visualization for raw and derived text data), to support various analytical tasks, e.g., topic analysis, opinion mining, and named entity recognition.
The following topics are covered:
- Definitions and standard models for text data
- Standard text preprocessing methods (tokenization, "stop-word" filtering)
- Vector representations and transformations for text data (bag-of-words (BoW), "document-term" matrix, TF-IDF, word2vec)
- Overview of traditional and modern computational text analysis tasks and methods
- Overview of tasks and design options for text visualization techniques
- Interaction between computational and visual methods for applied analytical tasks (sentiment analysis, topic modeling, named entity recognition)
- Domain applications of text visualization and visual text analysis
- Open challenges in visual text analysis
- Overview of software tools and libraries for computational and visual text analysis
The course will be given during the period 20 Jan - 22 Mar 2020.
Please sign up no later than 10/1 2021
For more information, see the course plan:
https://kursplan.lnu.se/kursplaner/syllabus-4DV808-1.pdf
If you have any questions, please contact Kostiantyn Kucher:
kostiantyn.kucher@lnu.se