Analysis of Large Data Sets - MAST8890


Module delivery information

This module is not currently running in 2024 to 2025.

Overview

This module considers statistical analysis when several characteristics are observed on each experimental unit, for example a sample of students' marks on several exams, or the genders, ages and blood pressures of a group of patients. We are particularly interested in understanding the relationships between the characteristics and the differences between experimental units. Regression methods can be used if one characteristic can be treated as a response variable and the others as explanatory variables. Selecting among the explanatory variables can be daunting when the number of characteristics is large, and suitable methods will be investigated.

Outline syllabus includes: measures of dependence, principal component analysis, factor analysis, canonical correlation analysis, hypothesis testing, discriminant analysis, clustering, scaling, information criterion methods for variable selection, false discovery rate, penalised maximum likelihood.

Details

Contact hours

36 hours

Method of assessment

80% examination and 20% coursework

Indicative reading

K. V. Mardia, J. T. Kent and J. M. Bibby (1979) Multivariate Analysis, London, Academic Press.
D. F. Morrison (1990) Multivariate Statistical Methods, McGraw-Hill Series in Probability and Statistics.
T. Hastie, R. Tibshirani and J. H. Friedman (2009) The Elements of Statistical Learning, Springer-Verlag.
P. J. Brown (1994) Measurement, Regression and Calibration, Oxford University Press.


Learning outcomes

The intended subject-specific learning outcomes. On successful completion of the module students
- will be able to summarise and interpret multivariate data effectively;
- will have a critical awareness of the logical link between multivariate techniques and corresponding univariate techniques;
- will have a systematic understanding of a wide range of modern techniques in dimension reduction, including their strengths and weaknesses;
- will be able to use statistical software to apply multivariate techniques and variable selection methods;
- will be able to select and apply these techniques to solve practical problems, to undertake statistical calculations and manipulations, and to communicate the results effectively to statisticians.

The intended generic learning outcomes. On successful completion of the module students
- will have developed a mathematical, critical approach to their work;
- will have developed the ability to solve practical problems;
- will have improved their key skills in numeracy, problem solving and information technology.

