Sport Informatics and Analytics/Pattern Recognition

Overview
The conceptualisation and operationalisation of pattern recognition are foundations of sport informatics and analytics. This theme (Theme 2 of the course):
 * Discusses systematic observation of performance.
 * Introduces supervised learning approaches to data analysis.
 * Explores the connections between performance trends and athlete actions.

We present three datasets for you to analyse in this theme: 2013 CitiBike bicycle hire data; an Australian Rules Football GPS data set (from the 2014 season); and physical and blood measurements from athletes at the Australian Institute of Sport (2018).

Elsewhere, there is a growing network of data sharing. Michael Timbs (2019), for example, shared his AFL Brownlow data. R for Data Science curated data from the FIFA Women's World Cup in France. Keith Lyons (2019) gathered data from the official FIFA record of the tournament. Mark Padgham (2019) created the CRAN package bikedata for downloading and aggregating data from public bicycle hire, or bike share, systems. James Curley (2016) developed the engsoccerdata package that "is mainly a repository for complete soccer datasets, along with some built-in functions for analyzing parts of the data". Mart Jürisoo (2019) compiled the dataset International football results from 1872 to 2019, which contains 40,838 results of international football matches.

In addition to this introduction, the theme covers these topics:
 * Using R
 * Python
 * Knowledge discovery
 * Structured Query Language (SQL)
 * Capstone

Video signpost
In this video, Melissa Breen discusses the impact of pattern recognition data on her performance as an elite athlete. Melissa was the University of Canberra's first athlete in residence in 2014.

Resources
The resources to support this theme include:


 * A Theme outline.
 * A slide presentation.
 * A mind map for this theme that includes resources up to 2015. For more recent resources (2016 onward) see this site.
 * Links to performance monitoring, systematic observation and supervised learning on the course wiki.
 * An introduction to computer vision.
 * Video suggestions (see slides 5 and 6).
 * Jason Mayes' introduction to machine learning.
 * Darrell Cobner's iBook The Value of Numbers.
 * R Resources.
 * Five papers.
 * The Office of the Victorian Information Commissioner's (2019) report Closer to the Machine.
 * There are some additional resources.

Artificial intelligence
Tannya also includes a reference to artificial super intelligence. She cites Nick Bostrom's definition of a superintelligence as “any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest”.

Data discussions
In our discussions of pattern recognition we are mindful that we need to reflect on the forms data take and how we name files. Hadley Wickham notes the importance of data cleaning and preparation. Tamraparni Dasu and Theodore Johnson, in their introduction to data cleaning, observe: "Most data mining and analysis techniques assume that the data have been joined into a single table and cleaned, and that the analyst already knows what she or he is looking for. Unfortunately, the data set is usually dirty, composed of many tables, and has unknown properties. Before any results can be produced, the data must be cleaned and explored".
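As a rough illustration of what "cleaned and explored" can mean in practice, the sketch below normalises a small table and counts missing values per column. It uses only the Python standard library. The athlete records, the column names and the list of missing-value markers are all invented for this example and are not drawn from the course datasets.

```python
import csv
import io

# Hypothetical athlete records; names and values are invented for illustration.
RAW = """athlete,sport,height_cm,ferritin
A01,Rowing,181.5,60
A02,rowing ,NA,
A03,Swimming,172.0,41
"""

# Assumed markers for missing data; real files may use others.
MISSING = {"", "NA", "N/A", "null"}


def clean_rows(text):
    """Strip whitespace, map missing markers to None, and tidy the 'sport' labels."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        cleaned = {}
        for key, value in row.items():
            value = value.strip()
            cleaned[key] = None if value in MISSING else value
        if cleaned["sport"] is not None:
            # "rowing " and "Rowing" should count as the same category.
            cleaned["sport"] = cleaned["sport"].title()
        rows.append(cleaned)
    return rows


def missing_counts(rows):
    """Count missing values per column, a first step in exploring unknown data."""
    counts = {key: 0 for key in rows[0]}
    for row in rows:
        for key, value in row.items():
            if value is None:
                counts[key] += 1
    return counts


rows = clean_rows(RAW)
print(missing_counts(rows))
```

A summary like this does not fix anything by itself, but it shows where a data set is incomplete before any analysis is attempted.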

We recommend that, as an introduction to data cleaning and preparation, you look at Hadley Wickham's approach to data tidying. You might also consider looking at an R package, tidyr, that provides tools to help tidy messy data. For more on the tidyverse approach, see Zev Ross, Hadley Wickham and David Robinson's 2017 discussion of decluttering R workflow.
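The core idea of data tidying, as we understand it, is that each variable forms a column and each observation forms a row. The minimal sketch below reshapes a "wide" table into that form in plain Python. The function to_long is a hypothetical helper written for this example, similar in spirit to the reshaping tools in tidyr but not part of that package, and the athlete scores are invented.

```python
# Hypothetical "messy" table: one row per athlete, one column per testing week.
wide = [
    {"athlete": "A01", "week_1": 52.1, "week_2": 53.4},
    {"athlete": "A02", "week_1": 48.7, "week_2": None},
]


def to_long(rows, id_column, names_to, values_to):
    """Return one row per (id, variable) pair.

    Missing values (None) are dropped here; that is a choice made for this
    sketch, not a rule of tidy data.
    """
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_column or value is None:
                continue
            tidy.append({id_column: row[id_column], names_to: column, values_to: value})
    return tidy


tidy = to_long(wide, "athlete", "week", "score")
for record in tidy:
    print(record)
```

In the long form each row is a single measurement, which makes it simpler to filter, group and plot by week or by athlete.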

Your reading and reflections might lead you to consider your own role as a data scientist. Chris Dowsett (2016) notes "it takes people to use data in order for it to have any value". He explores how we might develop data science as a platform. This approach offers "the opportunity to bring together a great User Experience with holistic insights on-demand". Aidan Condron (2016) provides an example of a data science as a platform project that sought "to establish a technological infrastructure supporting data archivists and ... researchers in managing and analysing both familiar and new and novel forms of data".