Solving Problems with Data and Text - COMP6481

Module delivery information

Location: Canterbury
Term: Spring Term
Level [1]: 6
Credits (ECTS) [2]: 15 (7.5)
Current Convenor [3]: Marek Grzes
Offered in 2024 to 2025: Yes

Overview

Data types: nominal, numerical, ordinal, text, audio, visual, temporal and non-temporal. Basic descriptive statistics: measures of average and spread, different ways of graphing data. Choosing appropriate and valid methods for the analysis and presentation of data, and understanding the limitations of methods. Data at different scales, including big data, and the computational challenges of processing data at scale. The process of discovering useful knowledge from data: including understanding the need for preprocessing and cleaning data, the challenges of gathering relevant data, and the need to present results in a comprehensible and actionable way. Data mining: classification/regression and clustering, and the idea of predictive analytics. Elements of information retrieval from text. Vector representations of text documents. Fairness and ethical issues concerning data.
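As a rough illustration of two of the topics listed above (measures of average and spread, and vector representations of text documents), the following is a minimal Python sketch using only the standard library. It is not taken from the module materials: the function names `describe`, `bag_of_words` and `cosine_similarity` and the sample inputs are hypothetical, and whitespace tokenisation is a simplifying assumption.

```python
import math
import statistics
from collections import Counter


def describe(values):
    """Return simple measures of average and spread for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "range": max(values) - min(values),
    }


def bag_of_words(text):
    """Represent a document as a sparse term-count vector.

    Assumption: tokens are lower-cased and split on whitespace only;
    real text processing would also handle punctuation, stop words, etc.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    print(describe([2, 4, 4, 4, 5, 5, 7, 9]))
    doc_a = bag_of_words("data mining finds patterns in data")
    doc_b = bag_of_words("text mining finds patterns in text")
    print(cosine_similarity(doc_a, doc_b))
```

Representing documents as count vectors makes them comparable with the same numerical tools used for other data types, which is one reason the approach is common in information retrieval.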

Details

Contact hours

Private Study Hours: 118
Contact Hours: 32 (22h lectures + 10h classes)
Total Hours: 150

Method of assessment

100% coursework

Indicative reading

The University is committed to ensuring that core reading materials are in accessible electronic format in line with the Kent Inclusive Practices.
The most up to date reading list for each module can be found on the university's reading list pages.
Ian H. Witten, Eibe Frank, Mark A. Hall, and Christopher Pal, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, 2016.
Joel Grus, Data Science from Scratch, O'Reilly, 2015.
Sowmya Vajjala, Bodhisattwa Majumder, Anuj Gupta, and Harshit Surana, Practical Natural Language Processing: A Comprehensive Guide to Building Real-World NLP Systems, O'Reilly Media, 2020.

Learning outcomes

On successfully completing the level 6 module students will be able to:
1. Present data using descriptive statistics and visualisations.
2. Describe methods for obtaining knowledge from data at different scales and of different types.
3. Apply computer packages for data visualisation, text processing, and data mining to sample datasets.
4. Demonstrate knowledge and critical understanding of the process of discovering knowledge from data, and be able to apply it to specific examples.
5. Describe the challenges of ethics and fairness in data and apply these to specific examples.

Notes

  1. Credit level 6. Higher level module usually taken in Stage 3 of an undergraduate degree.
  2. ECTS credits are recognised throughout the EU and allow you to transfer credit easily from one university to another.
  3. The named convenor is the convenor for the current academic session.

University of Kent makes every effort to ensure that module information is accurate for the relevant academic session and to provide educational services as described. However, courses, services and other matters may be subject to change.