Solving Problems with Data and Text - COMP8481

Module delivery information

Term: Spring Term
Level [1]: 7
Credits (ECTS) [2]: 15 (7.5)
Current Convenor [3]: Marek Grzes
Running in 2022 to 2023: Yes


Data types: nominal, numerical, ordinal, text, audio, visual, temporal and non-temporal. Basic descriptive statistics: measures of average and spread, different ways of graphing data. Choosing appropriate and valid methods for the analysis and presentation of data, and understanding the limitations of methods. Data at different scales, including big data, and the computational challenges of processing data at scale. The process of discovering useful knowledge from data, including understanding the need for preprocessing and cleaning data, the challenges of gathering relevant data, and the need to present results in a comprehensible and actionable way. Data mining: classification/regression and clustering, and the idea of predictive analytics. Elements of information retrieval from text. Vector representations of text documents. Fairness and ethical issues concerning data.


Contact hours

Private Study Hours: 118
Contact Hours: 32 (22h lectures + 10h classes)
Total Hours: 150

Indicative reading

Ian H. Witten, Eibe Frank, Mark A. Hall, and Christopher Pal, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, 2016.
Joel Grus, Data Science from Scratch, O'Reilly, 2015.
Sowmya Vajjala, Bodhisattwa Majumder, Anuj Gupta, and Harshit Surana, Practical Natural Language Processing: A Comprehensive Guide to Building Real-World NLP Systems, O'Reilly Media, 2020.

Learning outcomes

On successfully completing the level 6 module students will be able to:
1. Present data using descriptive statistics and visualisations.
2. Describe methods for obtaining knowledge from data at different scales and of different types.
3. Apply computer packages for data visualisation, text processing, and data mining to sample datasets.
4. Demonstrate knowledge and critical understanding of the process of discovery from data and be able to apply it to specific examples.
5. Describe the challenges of ethics and fairness in data and apply these to specific examples.
On successfully completing the level 7 module students will also be able to:
6. Demonstrate systematic understanding and critical awareness of the process of discovery from data and be able to technically evaluate specific results.


  1. Credit level 7. Undergraduate or postgraduate masters level module.
  2. ECTS credits are recognised throughout the EU and allow you to transfer credit easily from one university to another.
  3. The named convenor is the convenor for the current academic session.

The University of Kent makes every effort to ensure that module information is accurate for the relevant academic session and to provide educational services as described. However, courses, services and other matters may be subject to change.