- Develop a well-defined set of objectives that need to be met by the results of the data analysis

o Here the objective is to determine whether mathematics entrance exam results are correlated with the performance of \(1^{\text{st}}\) year Actuarial Science students.

- Identify the data items required for the analysis

o The data items needed are the mathematics entrance marks of \(1^{\text{st}}\) year Actuarial Science students at the different South African universities, and the \(1^{\text{st}}\) year final results of those students over a period of time.

- Collection of the data from appropriate sources

o The data can be obtained from the universities offering Actuarial Science degrees.

- Processing and formatting the data for analysis, e.g. inputting into a spreadsheet, database or other model.

o The data will need to be extracted from the universities' administrative systems and loaded into whichever statistical package is being used for the analysis.

- Cleaning data, e.g. addressing unusual, missing or inconsistent values

o For example, a student may be recorded as registered at University X but have missing marks, or have unrealistic marks, e.g. negative numbers or marks above \(100\%\) for a subject.

- Exploratory data analysis.

o Here this takes the form of inferential analysis, as we are testing the hypothesis that the mathematics entrance mark is correlated with \(1^{\text{st}}\) year Actuarial Science performance at universities.

- Modelling the data.

o In this case we need to choose an appropriate statistical method for the analysis, e.g. a Chi-squared test.

- Communicating the results, which includes describing the data sources used, the analysis performed and the conclusions of the analysis.

- Monitoring the process. Updating the data and repeating the process if required.

o This may mean choosing another statistical package, or adjusting the chosen level of significance.
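
The cleaning and modelling steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the cut-off marks (65 for the entrance exam, 60 for the first-year result) and all function names are hypothetical, and only the standard library is used rather than any particular statistical package. The sketch drops missing and unrealistic marks, computes a Pearson correlation coefficient, and runs a Chi-squared test of association on a 2x2 table of low/high marks.

```python
import math

# Hypothetical records: (student_id, entrance_mark, first_year_mark).
# None marks a missing value; all figures are invented for illustration.
records = [
    ("s01", 85, 78),
    ("s02", 62, 55),
    ("s03", 91, 88),
    ("s04", 47, None),   # missing first-year mark
    ("s05", 73, 105),    # unrealistic: above 100%
    ("s06", 58, 49),
    ("s07", -5, 60),     # unrealistic: negative mark
    ("s08", 66, 71),
    ("s09", 79, 64),
    ("s10", 54, 45),
]


def is_valid(mark):
    """A mark is usable only if it is present and lies between 0 and 100."""
    return mark is not None and 0 <= mark <= 100


def clean(rows):
    """Keep only rows where both marks are present and realistic."""
    return [row for row in rows if is_valid(row[1]) and is_valid(row[2])]


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient (assumes non-constant inputs)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def chi_squared_2x2(xs, ys, x_cut, y_cut):
    """Chi-squared test of association on a 2x2 table (1 degree of freedom).

    Each mark is classed as below or at-or-above its cut-off.
    Returns (statistic, p_value).
    """
    table = [[0, 0], [0, 0]]
    for x, y in zip(xs, ys):
        table[int(x >= x_cut)][int(y >= y_cut)] += 1
    n = len(xs)
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("a cut-off leaves one category empty")
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


cleaned = clean(records)
entrance = [row[1] for row in cleaned]
first_year = [row[2] for row in cleaned]

r = pearson_r(entrance, first_year)
chi2, p = chi_squared_2x2(entrance, first_year, x_cut=65, y_cut=60)

print(f"{len(records) - len(cleaned)} of {len(records)} records dropped in cleaning")
print(f"Pearson r = {r:.3f}")
print(f"Chi-squared = {chi2:.3f}, p = {p:.4f}")
```

With so few students the expected cell counts fall well below 5, so the Chi-squared approximation is unreliable here; a real analysis would use several years of data from each university and compare the p-value against the chosen level of significance.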