Sport Informatics and Analytics/Pattern Recognition

From WikiEducator

Overview

The conceptualisation and operationalisation of pattern recognition are foundations of sport informatics and analytics. This theme (Theme 2 of the course):

  • Discusses systematic observation of performance.
  • Introduces supervised learning approaches to data analysis.
  • Explores the connections between performance trends and athlete actions.

We present three datasets for you to analyse in this theme: 2013 CitiBike bicycle hire data; an Australian Rules Football GPS data set (from the 2014 season); and physical measurements and blood measurements from athletes at the Australian Institute of Sport (2018). Elsewhere, there is a growing network of data sharing. Michael Timbs (2019), for example, shared his AFL Brownlow data. R for Data Science curated data from the FIFA Women's World Cup in France. Keith Lyons (2019) gathered data from the official FIFA record of the tournament. Mark Padgham (2019) created the CRAN package bikedata for downloading and aggregating data from public bicycle hire, or bike share, systems. James Curley (2016) developed the engsoccerdata package that "is mainly a repository for complete soccer datasets, along with some built-in functions for analyzing parts of the data". Mart Jürisoo (2019)[1] has compiled the dataset International football results from 1872 to 2019, which contains the results of 40,838 international football matches.

In addition to this introduction to the theme, these topics are part of this theme:

Video signpost

In this video, Melissa Breen discusses the impact of pattern recognition data on her performance as an elite athlete. Melissa was the University of Canberra's first athlete in residence in 2014.


Resources

The resources to support this theme include:

Theme activities

Artificial intelligence

Definitions

Stuart Russell and Peter Norvig (2016)[2] note that the main unifying theme of their textbook is an intelligent agent. They define artificial intelligence as "the study of agents that receive percepts from the environment and perform actions."
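One way to make this definition concrete is a minimal sketch in Python of a simple reflex agent. The two-square "vacuum world" used here is a common textbook illustration; the function names `reflex_vacuum_agent` and `run`, and the details of the environment, are assumptions made for this example rather than anything specified in this theme.

```python
# A simple reflex agent: it maps each percept directly to an action.
# The two-square world (locations "A" and "B") is an illustrative assumption.

def reflex_vacuum_agent(percept):
    """Return an action for a percept of the form (location, status)."""
    location, status = percept
    if status == "dirty":
        return "suck"
    return "right" if location == "A" else "left"


def run(world, location, steps):
    """Let the agent perceive and act for a fixed number of steps."""
    actions = []
    for _ in range(steps):
        percept = (location, world[location])   # the agent receives a percept
        action = reflex_vacuum_agent(percept)   # and performs an action
        actions.append(action)
        if action == "suck":
            world[location] = "clean"
        elif action == "right":
            location = "B"
        else:
            location = "A"
    return actions


world = {"A": "dirty", "B": "dirty"}
actions = run(world, "A", 4)
print(actions)
```

The agent has no memory and no model of the world: everything it does follows from the current percept, which is the simplest form of the percept-to-action mapping in the definition.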

Tannya Jalal (2019)[3] distinguishes three definitions of artificial intelligence.

Artificial intelligence

A broad area of computer science that makes machines seem like they have human intelligence.

Artificial narrow intelligence

Pulls information from a specific data-set.

Artificial general intelligence

Refers to machines that exhibit human intelligence.



Tannya also includes a reference to artificial super intelligence. She cites Nick Bostrom's description of it as “any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest”.

Reading about pattern recognition, machine learning, and artificial neural networks

Pattern recognition

Take some time to explore the range of resources for this theme. You might like to start with a summary of five papers on pattern recognition. To get a feel for where this work is going, have a look at a 2017 paper written by Nazanin Mehrasa and her colleagues[4] on learning person trajectory representations for team activity analysis and a 2018 paper by Manuel Stein and his colleagues[5] about combining video and movement data. You might also find the discussions of ghosting in association football (2017)[6] and basketball (2018)[7] of interest. For a portfolio of research in pattern recognition, see Luke Bornn and colleagues' (2019)[8] eleven papers submitted to the Sloan Sports Analytics Conference 2014-2019. Patrick Lucey (2019)[9] discussed interactive sport analytics in order "to find play similarity using multi-agent trajectory data, as well as predicting fine-grain plays".



Computer Science

Philosophy of Computer Science

William Rapaport (2019)[10] has provided a comprehensive discussion of computer science in his book the Philosophy of Computer Science. William suggests that Computer Science tries to answer five central questions:

  • What can be computed and how?
  • What can be computed efficiently, and how?
  • What can be computed practically, and how?
  • What can be computed physically, and how?
  • What can be computed ethically, and how?



Machine learning

As you explore the pattern recognition theme, you will find references to machine learning and deep learning. These references include Jørgen Veisdal's (2018)[11] account of the first artificial intelligence workshop at Dartmouth. As an example of how approaches to machine learning have developed over the last sixty years, you might like to compare eight papers. The first is by Allen Newell, John Shaw and Herbert Simon (1958)[12] on addressing the problems of designing computerised chess-playing. The second, written by Arthur Samuel in 1959[13], is Some Studies in Machine Learning Using the Game of Checkers. Four papers were written by David Silver and his colleagues: the first in January 2016[14], Mastering the game of Go with deep neural networks and tree search; the second in October 2017[15], Mastering the game of Go without human knowledge; the third in December 2017[16], Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm; and the fourth in December 2018[17], A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. David Foster (2018)[18] discussed the AlphaGo, AlphaGo Zero and AlphaZero machine learning approaches taken by David Silver and his colleagues and provided his guide to help build your own AlphaZero AI with Python and Keras. An eighth paper, written by Nazanin Mehrasa and her colleagues (2018)[19], discussed a generic deep learning model for team activity analysis.

Jesus Rodriguez (2019)[20] provided background detail to machine learning and the card game of poker. He reported on the development of Pluribus and a paper that discussed the use of machine learning "in six-player no-limit Texas hold’em poker".[21]

For an overview of this period of machine learning see Andrey Kurenkov (2016)[22] and James Somers (2018)[23]. Emily Cust and her colleagues (2018) have provided a systematic review of machine and deep learning for sport-specific movement recognition.[24] Aman Agarwal (2018)[25] shared his detailed reading of David Silver and colleagues' 2016 paper. Steven Strogatz (2018)[26] extended the discussion of David Silver and his colleagues' work and contemplated a move from AlphaZero to AlphaInfinity. Michael Garbade (2018)[27] sought to clear confusion about the use of the terms AI, machine learning and deep learning.

If these readings have inspired you, Lauri Hartikka (2017)[28] offered a step-by-step guide to building a simple chess AI. Mark Farragher (2019)[29] discussed an artificial intelligence resolution of the Coastal Runners game and, in doing so, examined the reward function of artificial intelligence.

You might also find the discussions about predicting the outcomes of Bundesliga football games of interest.[30]

For an introduction to developing a machine learning model see Victor Roman's (2018)[31] post.

Marc Deisenroth, Aldo Faisal and Cheng Soon Ong (2019)[32] shared, in an open resource, their introduction to mathematics for machine learning.

Omayma Said (2019)[33] shared an introduction to the why and how of machine learning.

Christoph Molnar (2019)[34] provided a guide for making black box models explainable.

Jesus Rodriguez (2019)[35] described the development of a Google Research Football project that used a reinforcement learning environment in which agents learned to play football in a Gameplay environment. A paper by Karol Kurach and colleagues (2019)[36], Google Research Football: A Novel Reinforcement Learning Environment, accompanies Jesus's introduction. They note:

Recent progress in the field of reinforcement learning has been accelerated by virtual learning environments such as video games, where novel algorithms and ideas can be quickly tested in a safe and reproducible manner.

Further discussion of Google's work in this space can be found in an Emerging Technology update from arXiv (2019).[37]

Mat Herold and his colleagues (2019)[38] provided a review of machine learning in men’s professional football. They provide "a critical appraisal of the application of machine learning in football related to attacking play, discussing current challenges and future directions".



Artificial neural networks

Anders Krogh (2008) notes:

Artificial neural networks are inspired by the early models of sensory processing by the brain. An artificial neural network can be created by simulating a network of model neurons in a computer. By applying algorithms that mimic the processes of real neurons, we can make the network ‘learn’ to solve many types of problems.[39]
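To see what 'learning' can mean for a single model neuron, here is a minimal sketch in Python of a perceptron that learns the logical AND function. It is an illustration only: the training data, the number of epochs and the learning rate are assumptions chosen for the example, and a single neuron is far simpler than the networks discussed in the readings.

```python
# A single model neuron (perceptron) that 'learns' the logical AND function.
# Learning rate, epoch count and training data are illustrative assumptions.

def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs exceeds zero, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(samples, epochs=20, rate=1):
    """Perceptron learning rule: nudge weights in proportion to the error."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(and_samples)
predictions = [predict(weights, bias, inputs) for inputs, _ in and_samples]
print(predictions)
```

The neuron starts with no knowledge (all weights zero) and adjusts its weights only when it makes a mistake. After training it classifies all four input pairs correctly.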

Brian Ripley (1996)[40] provided a general introduction to pattern recognition with neural networks. Carlos Gershenson (2003)[41] shared his introduction to artificial neural networks for beginners. Branislav Holländer (2018)[42] discussed natural and artificial neural networks. Jay Alammar (2016[43], 2018[44]) has provided an introduction to basic neural networks and the mathematics involved.

In sport, Jürgen Perl has been a leading advocate of the use of artificial neural networks. A 2004 paper[45] introduced his neural network approach to movement pattern analysis. Subsequently, he and his colleague, Stefan Endler, have explored a variety of applications of artificial neural networks in sport contexts. See, for example, their discussions of endurance sports[46][47] and Stefan and his colleagues' report of research into simulated anaerobic threshold compared with lactate-based thresholds[48]. Other examples of Jürgen's work include game creativity[49] and tactical pattern recognition[50][51][52].

Donald Barron and his colleagues (2018)[53] used an artificial neural network to identify key performance indicators that influenced outfield players' league standings in association football. Their analysis used data collected for 966 players.



Data science, machine learning, artificial intelligence and intelligence augmentation

What is in a name?

David Donoho (2017)[54] noted "there is a solid case for some entity called Data Science to be created, which would be a true science: facing essential questions of a lasting nature and using scientifically rigorous techniques to attack those questions". His paper identified six divisions of Greater Data Science (GDS):

  • Data exploration and preparation
  • Data representation and transformation
  • Computing with data
  • Data modelling
  • Data visualisation and presentation
  • Science about data science

David concludes his article with this observation:

GDS proposes that Data Science is the science of learning from data; it studies the methods involved in the analysis and processing of data and proposes technology to improve methods in an evidence-based manner. The scope and impact of this science will expand enormously in coming decades as scientific data and data about science itself become ubiquitously available.

You might find it interesting to look at the commentary on David's article.

Hanif Samad (2019)[55] looked carefully at the process of finding employment as a data scientist. He included reference to the Conway Venn Diagram[56]. His research led him to look at the profiles of 869 data scientists. His findings included: most data scientists have postgraduate degrees; Computer Science and Engineering, along with Business Analytics, dominate fields of study; currently employed data scientists tend to be in mid-career positions; most data scientist positions are new; and half of data scientist roles come from non-technology companies. Hanif concluded "the background of data scientists is incredibly diverse" but noted that "a postgraduate degree is a far better indicator of your prospects as a data science hire".[57]

Roger Peng (2018)[58] discussed the role of theory in data analysis. He identified five tentpoles of data science in a subsequent post (2019)[59]:

  • the application of design thinking to data problems;
  • the creation and management of workflows for transforming and processing data;
  • the negotiation of human relationships to identify context, allocate resources, and characterize audiences for data analysis products;
  • the application of statistical methods to quantify evidence;
  • the transformation of data analytic information into coherent narratives and stories.

David Robinson (2018)[60] sought to distinguish the essential characteristics of data science, machine learning, and artificial intelligence (AI). He used a descriptivist 'rule of three' to propose:

  • Data science produces insights
  • Machine learning produces predictions
  • AI produces actions (original emphases)

David pointed out that this is not a sufficient qualification but offered it as "a useful way to distinguish the three types of work, and to avoid sounding silly when you’re talking about it".[61] You might find Gil Press's (2013)[62] discussion of data science of interest in this context. See also Francesco Corea's (2018)[63] classification of AI technologies.

We believe there is a further clarification to be made in the context of intelligence augmentation (IA) as distinct from artificial intelligence (AI). The epistemological foundations of augmentation can be found in work by Vannevar Bush (1945)[64] and Douglas Engelbart (1962)[65].

Peter Skagestad (1993)[66] observed:

the pioneers of the personal-computer revolution did not theorize about the essence of the computer, but focused rather on the essence of human thinking, and then sought ways to adapt computers to the goal of improving human thinking.[67]

More recently, Melanie Cook (2017) proposed that IA is:

The idea that a computer system supplements and supports human thinking, analysis, and planning, leaving the intentionality of a human actor at the heart of the human-computer interaction. Focusing on the interaction of humans and computers, rather than on computers alone.[68]

You might also find Cassie Kozyrkov's (2018a[69], 2018b[70]) discussions of machine learning of interest, as well as her discussions of data science (2018c[71], 2018d[72], 2018e[73], 2018f[74]).

Tirthajyoti Sarkar (2018)[75] combined insights from William of Ockham, Thomas Bayes and Claude Shannon to construct a definition of machine learning.

John Rollins (2015)[76] outlined a foundational methodology for data scientists. This has ten stages.

Karen Hao (2019)[77] shared a review of 16,625 papers that referred to artificial intelligence in arXiv.

Varuna De Silva (2018)[78] provided an example of the use of artificial intelligence in an association football club. Marcus Woo (2018)[79] discussed the use of artificial intelligence in NBA basketball.



Reflection

As you engage with the theory and practice of sport informatics and analytics, you will notice a lot of technical language. If you have an opportunity to read the authors listed above, you might start to get a feel for this language and have a sense of how you might use the terms you discover. We hope this is a good point in the course for you to reflect on how you will describe your own work and conceptualise the work of others.

In the process of reflection you might like to consider the approaches taken by Peter Sweeney (2018a[80], 2018b[81]) and Zachary Lipton and Jacob Steinhardt (2018)[82]. Peter explores philosophical issues in the consideration of artificial intelligence. Zachary and Jacob discuss patterns in machine learning scholarship.



Examples from sport contexts

American football

NFL data

Have a look at Alex Castrounis' discussion of supervised learning and unsupervised learning with NFL data. How might you use the learning approaches Alex discusses in relation to the Chicago Bears in your sport contexts? See also, Iman Behravan and colleagues' (2019)[83] use of an automatic particle swarm optimisation-clustering algorithm to identify players' roles.



Association football

Player recognition

Nicolas Bortolotti (2017)[84] has discussed how he has used TensorFlow (an open source machine learning framework) with an ObjectDetection model to analyse a segment of a football game to identify a player. Nicolas described how he:

  • Trained a model.
  • Used the model during a live broadcast of a football game.
  • Considered how such an approach might contribute to conversations about game tactics.

He shared a video of his player recognition model.

Analysis of spatio-temporal data

Michael Horton's PhD thesis (2018)[85] investigated algorithmic approaches to mining sports trajectory data. The thesis reported Michael's research into the automatic classification of passes made during association football games. The data used for his analysis comprised four games played by Arsenal Football Club in the English Premier League season in 2008. The data contained trajectories for all players that participated in each half of each of the four games, and an event log for each game. The trajectories were sampled at 10 Hz and had a resolution of 10 cm.
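To get a feel for what trajectory data sampled at 10 Hz allow, here is a minimal sketch in Python that derives speed and distance covered from consecutive positions. The function name `speeds` and the positions in `track` are hypothetical, and the sketch assumes coordinates in metres; it is not drawn from Michael's thesis.

```python
import math

SAMPLE_RATE_HZ = 10  # sampling rate reported for the trajectory data


def speeds(positions, sample_rate_hz=SAMPLE_RATE_HZ):
    """Speed (m/s) between consecutive (x, y) positions given in metres."""
    dt = 1.0 / sample_rate_hz  # time between samples, in seconds
    return [
        math.hypot(q[0] - p[0], q[1] - p[1]) / dt
        for p, q in zip(positions, positions[1:])
    ]


# Hypothetical player positions in metres (not real tracking data)
track = [(0.0, 0.0), (0.3, 0.4), (0.6, 0.8), (0.6, 0.8)]
player_speeds = speeds(track)

# Distance covered is each speed multiplied by the sample interval
total_distance = sum(s / SAMPLE_RATE_HZ for s in player_speeds)
print(player_speeds, total_distance)
```

With a resolution of 10 cm and ten samples per second, a single-sample speed estimate can be noisy, so in practice you might smooth over several samples before interpreting the values.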

Shots, goals and predicting team play

Debangan Dey and Andrew Pita (2018)[86] investigated: where on the field do most shots come from?; what patterns of play give rise to the most effective shots?; and can we develop team level summary measures that are predictive of team performance? Their investigations used data collected by StatsBomb during the 2018 FIFA World Cup.

Machine Learning

In 2018[87], the journal Machine Learning published a guest editorial on machine learning for football. As part of the special issue, the editors posed the 2017 Soccer Prediction Challenge that revolved around predicting the outcomes of football matches.



Australian rules football

Analysing GPS data

An Australian Rules football team shared with us a whole game GPS data set from a game played in the 2014 season.

The data can be found at this location.

The team that provided the data won the game and scored the same number of points each of the four quarters of the game.

The scores by quarter in the game were:

  • 33 v 22 (Q1)
  • 33 v 12 (Q2)
  • 33 v 26 (Q3)
  • 33 v 37 (Q4)

Questions

  1. Do the data available help map player effort in relation to this scoring pattern?
  2. What inferences can you draw from these data?
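Before you open the GPS data, it can help to lay out the scoring pattern itself. The following minimal Python sketch uses only the quarter scores listed above to compute the margin in each quarter, the running margin and the final score. The variable names are our own.

```python
from itertools import accumulate

# Scores by quarter as listed above: (team that shared the data, opponent)
quarters = [(33, 22), (33, 12), (33, 26), (33, 37)]

quarter_margins = [team - opp for team, opp in quarters]
running_margin = list(accumulate(quarter_margins))
team_total = sum(team for team, _ in quarters)
opp_total = sum(opp for _, opp in quarters)

for number, (margin, running) in enumerate(
    zip(quarter_margins, running_margin), start=1
):
    print(f"Q{number}: quarter margin {margin:+d}, running margin {running:+d}")
print(f"Final score: {team_total} v {opp_total}")
```

The team's own scoring was constant, so all of the variation in the margin comes from the opponent's scoring. That suggests looking at whether player effort in the GPS data changes between the second quarter (largest margin) and the fourth (the only quarter lost).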



Basketball

Modelling player movement

There is a substantial literature reporting pattern recognition approaches in basketball. Kirk Goldsberry[88] has explored the use of visual and spatial analytics to investigate shooting abilities. Other research conducted by Kirk and his colleagues includes: score prediction[89]; shot selection[90]; and defending[91][92][93].

Steven Wu and Luke Bornn[94] provide a detailed account of their use of secondary data to analyse attacking play in professional basketball. The example they use is a data set from a 2013 game between the Miami Heat and the Brooklyn Nets.

Positive Residual has used the Shiny application to share insights into basketball performance. See, for example, NBA team rolling charts and NBA team play type.

Derek Corcoran and Nicholas Watanabe[95] share their use of the R package Spatialball to analyse and visualise spatial data in the NBA. The package enables the user to explore player, team and league patterns of performance.

Evangelos Papalexakis and Konstantinos Pelechrinis (2018)[96] proposed "a framework based on tensor decomposition for obtaining a set of prototype spatio-temporal patterns based on the core spatio-temporal information and contextual meta-data" to provide contextual information about performance patterns.

Long Sha and his colleagues (2018)[97] shared an intelligent human-computer interface that used trajectory data to enhance the retrieval of basketball team and player performance.

Wade Hobbs and his colleagues (2018)[98] measured spatial scoring effectiveness in women's basketball in the 2016 Olympic Games. The aim of this study was to quantify how effectively teams move the ball across the basketball court and to identify the most commonly occurring sequences of ball movement in international women’s basketball.

March madness

In 2014, Kaggle announced a March Machine Learning Mania[99] competition to coincide with the NCAA Division 1 Men's Basketball Tournament hosted in Arlington, Texas. There has been a Kaggle competition each year since then with the most recent taking place in the 2018 season. [100] There was a Kaggle competition for the NCAA Division 1 Women's Basketball Tournament in 2018.[101] Sam Firke (2018)[102] provided a guide to analysing basketball performance at the championships. You might find it interesting to look at the resources he has shared in his GitHub account as a tutorial introduction.

Michael Lopez and Gregory Matthews (2014)[103] shared their reflections on their success in winning the inaugural Kaggle competition in 2014. You might consider their reflection on their model for your own work in this area of prediction, namely:

While one of our two submissions finished first in the Kaggle contest, we estimate that this winning entry had no more than about a 12% chance of doing so, even under the most optimistic of game probability scenarios.

In 2017[104], Google partnered with the NCAA to migrate eighty years of historical and play-by-play data including basketball championships. These data were used in the 2018 NCAA basketball championships to provide real-time data analysis in the two semi-finals of the tournament.[105][106]

Cooperative behaviours

Motokazu Hojo and his colleagues (2018)[107] proposed an automatic recognition system for strategic cooperative plays, which are the minimal, basic, and diverse plays in a ball game. They aimed to shed light on inconspicuous players who play important roles in basketball. Data were collected from a Japanese university team.

Analytics at scale

Eric Schmidt and Allen Jarvis (2018)[108] have provided a detailed insight into Google Cloud's involvement in the analysis of NCAA basketball data. We recommend that you read their account of the architecture required to deliver:

  • A flexible and scalable data processing workflow to support collaborative data analysis.
  • New analytic explorations through collaboratively developed queries and visualizations.
  • Real-time predictive insights and analysis related to the games, modeled around NCAA men’s and women’s basketball.

There are two other articles to add to your reading list about analytics at scale. One is by Tariq Shaukat (2017) [109], the other is by Courtney Blacker (2018)[110].

What do these three articles suggest to you about the skills you might need as you work to provide insights from archived data?[111]



Bicycle journeys

Analysing bicycle journey data

There is growing research interest in the analysis of open data about bicycle journeys. You might find Jake VanderPlas's 2014 paper[112] a good place to start. He writes "this post is as much about how to work with data as it is about what we learn from the data" (original emphases). Two examples of how to work with data to create visualisations and to learn from openly available data are: Todd Schneider's[113] tale of twenty-two million Citi Bike rides in New York and Luis Carli's[114] analysis of Boston bike sharing data. Mark Padgham (2017)[115] reported the availability of an rOpenSci package, bikedata, that provides access to data from all cities which openly publish bicycle share data. Christoph Molnar (2018)[116] used data from Capital-Bikeshare in Washington to support discussion of machine learning. Florian Teschner (2018a[117], 2018b[118], 2018c[119]) used New York Citi Bike data to discuss embeddings for categorical variables.

In 2018, Mark Padgham published the CRAN package bikedata "an R package for downloading and aggregating data from public bicycle hire, or bike share, systems"[120]. In 2019, Martin Frigaard and Peter Spangler[121] described their analysis of data released by the City of Chicago.



Cricket

Fast bowling detection

Joseph McGrath and his colleagues (2018)[122] reported their use of an inertial measurement unit to provide data from fast bowling actions of 17 elite fast bowlers. Their paper shared their machine learning approach to the data collected. You might find their account of interest as you consider how you might approach a machine learning task using smartphone technology.



Cross-country running

Race strategies

Steve Lane[123] has analysed the pacing strategies of athletes in USA championship collegiate cross-country races and suggests:

Basic statistical analysis suggests a very strong relationship between pacing and finishing time: relatively even pacing predicts faster times.

You might like to read Steve's paper as an introduction to the literature on the use of analytics to understand pacing in sport. Chris Abbiss and Paul Larsen[124] provided a comprehensive review of the pacing literature up to 2008. Mark Waldron and Jamie Highton[125] extended the discussion of the literature with their 2014 paper that explored pacing in high-intensity intermittent team sport. For an example of a sport-specific discussion, you might like to read Andrew Edwards and his colleagues'[126] discussion of pacing in rowing.
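If you would like to experiment with the idea of 'even pacing', one simple way to quantify it is the coefficient of variation of an athlete's split times. The Python sketch below is an assumption on our part rather than the measure used in the papers above, and the split times are invented for illustration.

```python
from statistics import mean, pstdev


def pacing_cv(split_times):
    """Coefficient of variation of split times; lower means more even pacing."""
    return pstdev(split_times) / mean(split_times)


# Hypothetical kilometre splits in seconds for two runners (not real data)
even_runner = [190, 191, 190, 189, 190]
fast_start_runner = [175, 185, 195, 200, 205]

even_cv = pacing_cv(even_runner)
fast_start_cv = pacing_cv(fast_start_runner)
print(f"even: {even_cv:.4f}, fast start: {fast_start_cv:.4f}")
```

A measure like this could then be related to finishing time across a field of runners to explore the relationship Steve describes.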

Jürgen Perl has explored how we might model performance in training and competition. He developed a Performance Potential meta-model, PerPot, that "simulates the interaction between load and performance in adaptive physiological processes like training in sport by means of antagonistic dynamics"[127]. His research provides a comprehensive insight into how neural networks can be used in sport settings.

Iztok Fister and his colleagues (2018)[128] discussed the use of pacing strategies in half marathon races and shared their use of a differential evolution algorithm to inform their post hoc analysis of performance.



Cross-country skiing

Analysing cross country skiing data

Finn Marsland[129] has combined his experiences as a national coach for cross country skiing with a research interest in pattern recognition. With colleagues at the Australian Institute of Sport and the University of Canberra, he has produced three research papers to share his work[130][131][132]. The three papers illustrate how Finn's work developed from a preliminary investigation[133] that considered "the potential of micro-sensors for use in the identification of the main movement patterns used in cross-country skiing" to the use of sensors in on-snow training environments[134] to their use in competition events[135]. The range of Finn's investigations provides an excellent case study in how a coach can develop his understanding of performance through considered use of pattern recognition technology.

You might find Trine Seeberg and her colleagues' (2017)[136] discussion of a multi-sensor system for automatic analysis of classical cross-country skiing techniques of interest, along with Jihyeok Jang and colleagues' (2018)[137] investigation of a deep-learning model for classifying cross-country skiing techniques.



Ice hockey

Emmanuel Perry[138] observes "Hockey is inherently random, but it isn’t roulette. With the right data and a little handiwork (a good computer doesn’t hurt either) though, you can make a decent go of it...". He explores a variety of approaches to analyse game outcome in ice hockey. His discussion provides a detailed insight into the range of tools an analyst might use to investigate performance patterns. These include:

  • bagged logistic regression
  • gradient-boosted trees
  • neural networks
  • bagged naive Bayes model
  • a random forest using fuzzy logic

Emmanuel combines eleven sub-models into his prediction model for performance. He discusses each of these in detail and outlines the validation process he used to test his model. He notes that this process is essential but "by far the least enjoyable part of building a statistical model".

We recommend Emmanuel's analysis of ice hockey performance to you as an example of an ensemble of sub-models that is discussed explicitly to guide you as a reader. His model raises an important question about the generalisability of a sport specific approach.
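To illustrate the general idea of an ensemble, here is a minimal Python sketch that combines the probabilities produced by several sub-models into one prediction using a weighted average. Averaging is an assumption made for this example: this page does not describe how Emmanuel combines his sub-models, and the function name `ensemble_probability` and the probabilities are hypothetical.

```python
def ensemble_probability(sub_model_probs, weights=None):
    """Weighted average of sub-model probabilities for the same outcome."""
    if weights is None:
        weights = [1.0] * len(sub_model_probs)  # equal weighting by default
    total_weight = sum(weights)
    return sum(w * p for w, p in zip(weights, sub_model_probs)) / total_weight


# Hypothetical home-win probabilities from three sub-models (not real outputs)
sub_model_probs = [0.58, 0.62, 0.51]

combined = ensemble_probability(sub_model_probs)
weighted = ensemble_probability(sub_model_probs, weights=[2, 1, 1])
print(round(combined, 4), round(weighted, 4))
```

Weights would normally be chosen on held-out data, which is one reason the validation process Emmanuel describes matters so much.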



Movement pattern recognition

Kylie Steel[139] provides an introduction to movement pattern recognition. She notes that it is a field of study that has attracted research interest for over a century. If you would like to explore discussions about human movement characteristics and the attention we pay to movement after reading Kylie's summary, you might find the 1996 paper by Eva Bonda and her colleagues[140] of interest. For a 2017 example of identifying movement patterns in sport we suggest you look at Panna Felsen and Patrick Lucey's[141] investigation of shooting styles in basketball. For an indication of how this work in movement recognition is progressing, you might like to have a look at Hoang Le and colleagues'[142] discussion of coordinated multi-agent imitation learning.



Physical activity monitoring

Nick Strayer (2018)[143] shared an analysis of the recordings of "30 subjects performing basic activities and postural transitions while carrying a waist-mounted smartphone with embedded inertial sensors". He used Keras to train a convolutional neural network to classify physical activity. Data came from the Smartphone-Based Recognition of Human Activities and Postural Transitions Data Set.



Running

Estimation of lactate threshold

Urtats Etxegarai and his colleagues (2018)[144] reported the use of a machine learning system that modelled the lactate evolution using recurrent neural networks. Their account provided details of the approach they used to develop a system that predicted with accuracy lactate thresholds and performance.



Speed skating

Analysing speed skating data

Arno Knobbe and his colleagues[145] discuss their approach to analysing speed skating data. You can find the paper at this location. Note the process they share in the paper as they move from the records kept by a coach over a fifteen-year period to their analysis in order to "extract actionable and interpretable patterns that can provide input to future improvements in training".



Surfing

Surf forecasts

Surfing has been included in the 2020 Olympic Games in Tokyo and will take place at Tsurigasaki Beach[146]. It is likely the surfers and the organisers will pay particular attention to surf forecasts developed by Walter Munk[147]. Walter worked with Harald Sverdrup to develop a methodology to forecast the relationships between wind, sea and swell. In 1947 they produced a report for the United States of America's Hydrographic Office[148] that established the framework for surf forecasts that was extended by Charles Bretschneider[149]. In the 1950s, Walter worked with John Tukey to examine power spectra in wave behaviour[150][151]. Walter was still active in oceanographic research on his 100th birthday on 19 October 2017[152].



Tennis

Within-match forecasting

Stephanie Kovalchik and Machar Reid (2018)[153] discussed a methodology to provide dynamic updates to within-match forecasting of wins in tennis. They combine a pre-match calibration method with a Bayes updating rule to report on data from the 2017 tennis season.
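As a way into the idea of a Bayes updating rule, here is a minimal Python sketch of a Beta-Binomial update of a player's probability of winning a point on serve. It is not Stephanie and Machar's published model: the choice of a Beta prior, the `strength` parameter and all of the numbers are assumptions made for illustration.

```python
def beta_prior(pre_match_prob, strength):
    """Encode a pre-match estimate as Beta(a, b) pseudo-counts."""
    return pre_match_prob * strength, (1 - pre_match_prob) * strength


def update(prior, won, lost):
    """Bayes update of a Beta prior with observed points won and lost."""
    a, b = prior
    return a + won, b + lost


def posterior_mean(params):
    """Expected point-win probability under Beta(a, b)."""
    a, b = params
    return a / (a + b)


# Hypothetical pre-match estimate and in-match serve points (not real data)
prior = beta_prior(0.65, strength=40)
posterior = update(prior, won=12, lost=8)
updated_prob = posterior_mean(posterior)
print(round(updated_prob, 4))
```

The `strength` value controls how quickly the in-match evidence overrides the pre-match estimate: a larger value means the forecast moves more slowly as points are played.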



Triathlon

Predicting performance

Marian Hoffmann and his colleagues (2017)[154] report their use of two computational approaches to predict Olympic distance triathlon race times of two German male elite triathletes. Their first computational method (a statistical approach) after race time normalisation was: "exploratory factor analysis, as a mathematical preselection method, followed by multiple linear regression and dominance paired comparison"[155]. The second used an expertise-based nonlinear approach that included an artificial neural network.

Marian and his colleagues analysed data from eleven male elite triathletes. Before they could apply the two computational approaches, they had to normalise race times. They note:

Normalization was necessary to obtain comparable individual race times independent of the various triathlon races in which the subjects participated. These normalized race times were fundamental to all following analyses, since they accounted for the slightly different competition calendars of each elite triathlete.[156]

Marian and his colleagues used a reference factor calculated as "the mean value of overall race times of the Top 10 athletes in World Triathlon Series races between 2009 and 2012".[157]
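
The paper should be consulted for the exact normalisation formula. As a rough illustration of why a reference factor makes race times comparable, the Python sketch below rescales a race time so that the mean of the leading finishers in that race maps onto a common reference value. The function name normalise_race_time, the scaling rule and every number are assumptions made for this example, not figures or formulas from the study.

```python
from statistics import mean
from typing import Sequence


def normalise_race_time(
    race_time_s: float, race_top_times_s: Sequence[float], reference_s: float
) -> float:
    """Rescale a race time so the race's leading-finisher mean maps onto the reference."""
    return race_time_s * reference_s / mean(race_top_times_s)


# Made-up times in seconds, for illustration only.
reference_s = 6500.0                          # stand-in for the reference factor
fast_course_top = [6380.0, 6400.0, 6420.0]    # abbreviated stand-in for a Top 10
slow_course_top = [6580.0, 6600.0, 6620.0]

# The same athlete finishes 100 s behind the leaders' mean on both courses.
on_fast_course = normalise_race_time(6500.0, fast_course_top, reference_s)
on_slow_course = normalise_race_time(6700.0, slow_course_top, reference_s)
print(round(on_fast_course, 1), round(on_slow_course, 1))
```

The raw times differ by 200 seconds because the courses differ, but after rescaling the two performances sit within a few seconds of each other, which is the kind of comparability the authors describe.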

We recommend you read this discussion of triathlon performance prediction. You might find the authors' consideration of the limitations of their study of particular interest.[158]



Data discussions

In our discussions of pattern recognition we are mindful that we need to reflect on the forms data take and how we name files.[159] Hadley Wickham[160] notes the importance of data cleaning and preparation. Tamraparni Dasu and Theodore Johnson, in their introduction to data cleaning, observe:

Most data mining and analysis techniques assume that the data have been joined into a single table and cleaned, and that the analyst already knows what she or he is looking for. Unfortunately, the data set is usually dirty, composed of many tables, and has unknown properties. Before any results can be produced, the data must be cleaned and explored,[161]

We recommend that, as an introduction to data cleaning and preparation, you look at Hadley Wickham's[162] approach to data tidying. You might also consider looking at an R package, tidyr, that provides tools to help tidy messy data. For a 2017 discussion of the tidyverse approach, see Zev Ross, Hadley Wickham and David Robinson's[163] discussion of decluttering R workflow.
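
If you have not met tidy data before, the following Python sketch may help. It uses only the standard library and is an analogue of a wide-to-long reshape, not tidyr itself. The helper pivot_longer is written for this example (its name is borrowed from the tidyr function that performs a similar reshape in R), and the split times are made up.

```python
def pivot_longer(rows, id_col, names_to, values_to):
    """Turn one wide row per athlete into one tidy row per (athlete, variable) pair."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_col:
                continue  # the identifier is repeated on every tidy row instead
            tidy.append({id_col: row[id_col], names_to: column, values_to: value})
    return tidy


# Made-up split times in seconds, for illustration only.
wide = [
    {"athlete": "A", "swim": 1080, "bike": 3420, "run": 1920},
    {"athlete": "B", "swim": 1110, "bike": 3390, "run": 1950},
]

tidy = pivot_longer(wide, id_col="athlete", names_to="leg", values_to="seconds")
for record in tidy:
    print(record)
```

In the wide table the variable "leg" is hidden in the column names. In the tidy version each variable has its own column and each observation its own row, which is the form most analysis and plotting tools expect.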

Your reading and reflections might lead you to consider your own role as a data scientist. Chris Dowsett (2016) notes "it takes people to use data in order for it to have any value"[164]. He explores how we might develop data science as a platform. This approach offers "the opportunity to bring together a great User Experience with holistic insights on-demand"[165]. Aidan Condron (2016) provides an example of a data science as a platform project that sought "to establish a technological infrastructure supporting data archivists and ... researchers in managing and analysing both familiar and new and novel forms of data"[166].

Data science challenges

Addressing fallacies about a data scientist's role

We suggest you have a look at Shane Brennan's (2017) post The Ten Fallacies of Data Science[167]. In it, Shane lists these ten fallacies for a newly qualified data scientist to consider:

  • The data exist
  • The data are accessible
  • The data are consistent
  • The data are relevant
  • The data are intuitively understandable
  • The data can be processed
  • Analyses can be easily re-executed
  • We do not need encryption
  • Analytics outputs are easily shared and understood
  • The answer you are looking for is there in the first place

Do any of Shane's ten points resonate with your experience?



An example of a data science process

We suggest you have a look at Vicky Szuflita's (2018) post Pitch Recommendation: a look into the data science process[168] for an example of a data science process. Vicky uses data from baseball to consider a process that has the following steps:

  • Identify your problem and Goal
  • Gather and clean your data
  • Get to know your data
  • Picking your model
  • How do I know if my model is good?
  • Improving your model
  • Make your model useable

Does Vicky's example help clarify the ways we might approach the collection and analysis of data?
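
One step in the list, "How do I know if my model is good?", can be sketched in a few lines of Python. The example below is not Vicky's code or data. It assumes a made-up set of (ball-strike count, pitch type) records, holds some of them back for testing, and compares a very simple model against an even simpler baseline. All function names and values are hypothetical.

```python
import random
from collections import Counter


def train_test_split(rows, test_fraction, seed):
    """Shuffle a copy of the rows and hold back a fraction for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_baseline(train):
    """Baseline: always predict the most common pitch in the training rows."""
    overall = Counter(pitch for _, pitch in train).most_common(1)[0][0]
    return lambda count: overall


def fit_by_count(train):
    """Simple model: predict the most common pitch seen for each ball-strike count."""
    by_count = {}
    for count, pitch in train:
        by_count.setdefault(count, Counter())[pitch] += 1
    table = {c: tally.most_common(1)[0][0] for c, tally in by_count.items()}
    fallback = fit_baseline(train)
    # Fall back to the baseline for counts that never appeared in training.
    return lambda count: table.get(count, fallback(count))


def accuracy(predict, rows):
    """Share of rows where the predicted pitch matches the recorded pitch."""
    return sum(predict(count) == pitch for count, pitch in rows) / len(rows)


# Made-up records, for illustration only.
rows = (
    [("0-2", "slider")] * 30 + [("0-2", "fastball")] * 10
    + [("3-0", "fastball")] * 35 + [("3-0", "slider")] * 5
)

train, test = train_test_split(rows, test_fraction=0.25, seed=1)
baseline = fit_baseline(train)
model = fit_by_count(train)
print("baseline:", accuracy(baseline, test), "model:", accuracy(model, test))
```

Scoring on rows the model has not seen, and comparing against a baseline, gives a first answer to whether a model is any good. A model that cannot beat "always guess the most common pitch" is not yet adding value.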



ePortfolio questions

Questions about this theme

As you work your way through this theme and compile your ePortfolio, you might like to consider these six questions.


Q 7. What is systematic about ‘systematic’ observation?

Q 8. Do we need to concern ourselves about the reliability and validity of data?

Q 9. Why is it important to de-identify performance data?

Q10. What did you discover in the shared dataset?

Q11. What have you learned about supervised learning approaches?

Q12. What are your thoughts about how we relate patterns of performance to moments of performance within games?



References

  1. Jürisoo, Mart (20 April 2018). "International football results from 1872 to 2019". https://www.kaggle.com/martj42/international-football-results-from-1872-to-2017. Retrieved 20 April 2018.
  2. Russell, Stuart; Norvig, Peter (2016). Artificial Intelligence: A Modern Approach. Malaysia: Pearson Educational.
  3. Jajal, Tannya (21 May 2018). "Distinguishing between Narrow AI, General AI and Super AI". https://medium.com/@tjajal/distinguishing-between-narrow-ai-general-ai-and-super-ai-a4bc44172e22. Retrieved 19 September 2019.
  4. Mehrasa, Nazanin et al (2017). "Learning Person Trajectory Representations for Team Activity Analysis". arXiv.org 3 June: arXiv:1706.00893 (cs.CV).
  5. Stein, Manuel et al (2018). "Bring it to the Pitch: Combining Video and Movement Data to Enhance Team Sport Analysis". IEEE Transactions on Visualization and Computer Graphics 24(1).
  6. Le, Hoang et al (March 2017). "Data-Driven Ghosting using Deep Imitation Learning". https://authors.library.caltech.edu/75181/1/1671-2.pdf. Retrieved 25 February 2018.
  7. Seidl, Thomas et al (February 2018). "Bhostgusters: Realtime Interactive Play Sketching with Synthesized NBA Defenses". http://www.sloansportsconference.com/wp-content/uploads/2018/02/1006.pdf. Retrieved 25 February 2018.
  8. Bornn, Luke (21 February 2019). "Eleven Sloan papers in five years". https://twitter.com/LukeBornn/status/965986388863631360. Retrieved 23 February 2019.
  9. Lucey, Patrick (25 September 2019). "Interactive Sports Analytics". https://www.oreilly.com/radar/interactive-sports-analytics. Retrieved 26 September 2019.
  10. Rapaport, William (September 2019). "Philosophy of Computer Science". https://cse.buffalo.edu/~rapaport/Papers/phics.pdf. Retrieved 11 September 2019.
  11. Veisdal, Jørgen (12 September 2019). "The Birthplace of AI". https://medium.com/cantors-paradise/the-birthplace-of-ai-9ab7d4e5fb00. Retrieved 24 September 2019.
  12. Newell, Allen; Shaw, John; Simon, Herbert (1958). "Chess-playing programs and the problem of complexity". IBM Journal of Research and Development 2(4): 320-335.
  13. Samuel, Arthur (1959). "Some Studies in Machine Learning Using the Game of Checkers". IBM Journal of Research and Development 3(3): 210.
  14. Silver, David et al (2016). "Mastering the game of Go with deep neural networks and tree search". Nature 529: 484-489.
  15. Silver, David et al (2017). "Mastering the game of Go without human knowledge". Nature 550: 354-359.
  16. Silver, David et al (2017). "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm". arXiv arXiv:1712.01815 [cs.AI].
  17. Silver, David et al (2018). "A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play". Science 362 (6419): 1140-1144.
  18. Foster, David (27 January 2018). "How to build your own AlphaZero AI using Python and Keras". https://medium.com/applied-data-science/how-to-build-your-own-alphazero-ai-using-python-and-keras-7f664945c188. Retrieved 30 January 2018.
  19. Mehrasa, Nazanin et al (February 2018). "Deep Learning of Player Trajectory Representations for Team Activity Analysis". http://www.sloansportsconference.com/wp-content/uploads/2018/02/2003.pdf. Retrieved 19 February 2018.
  20. Rodriguez, Jesus (15 July 2019). "Inside Pluribus: Facebook’s New AI That Just Mastered the World’s Most Difficult Poker Game". https://towardsdatascience.com/inside-pluribus-facebooks-new-ai-that-just-mastered-the-world-s-most-difficult-poker-game-2fb4486cf9c1. Retrieved 10 August 2019.
  21. Brown, Noam; Sandholm, Tuomas (11 July 2019). "Superhuman AI for multiplayer poker". Science DOI: 10.1126/science.aay2400.
  22. Kurenkov, Andrey (18 April 2016). "A 'Brief' History of Game AI Up To AlphaGo". http://www.andreykurenkov.com/writing/ai/a-brief-history-of-game-ai/. Retrieved 11 March 2018.
  23. Somers, James (28 December 2018). "How the artificial intelligence program AlphaZero mastered its games". https://www.newyorker.com/science/elements/how-the-artificial-intelligence-program-alphazero-mastered-its-games. Retrieved 2 January 2019.
  24. Cust, Emily et al (2018). "Machine and deep learning for sport-specific movement recognition: a systematic review of model development and performance". Journal of Sports Sciences 11: 1-33.
  25. Agarwal, Aman (9 March 2018). "Explained Simply: How an AI program mastered the ancient game of Go". https://medium.com/@mngrwl/explained-simply-how-an-ai-program-mastered-the-ancient-game-of-go-62b8940a9080. Retrieved 13 March 2018.
  26. Strogatz, Steven (26 December 2018). "One Giant Step for a Chess-Playing Machine". https://www.nytimes.com/2018/12/26/science/chess-artificial-intelligence.html. Retrieved 27 December 2018.
  27. Garbade, Michael (15 September 2018). "Clearing the Confusion: AI vs Machine Learning vs Deep Learning Differences". https://towardsdatascience.com/clearing-the-confusion-ai-vs-machine-learning-vs-deep-learning-differences-fce69b21d5eb. Retrieved 30 August 2019.
  28. Hartikka, Lauri (30 March 2017). "A step-by-step guide to building a simple chess AI". https://medium.freecodecamp.org/simple-chess-ai-step-by-step-1d55a9266977. Retrieved 31 March 2018.
  29. Farragher, Mark (15 March 2019). "This AI figured out that the only winning move is not to play". https://medium.com/machinelearningadvantage/this-ai-figured-out-that-the-only-winning-move-is-not-to-play-a59acc763da8. Retrieved 4 August 2019.
  30. Roman, Victor (23 December 2018). "How To Develop a Machine Learning Model From Scratch". https://towardsdatascience.com/machine-learning-general-process-8f1b510bd8af. Retrieved 25 January 2019.
  31. Lechner, Michael (2018). "Soccer Analytics - Further explanations". https://sew.unisg.ch/en/empirische-wirtschaftsforschung/sports-economics-research-group/soccer-analytics/weitere-erklaerungen. Retrieved 18 April 2018.
  32. Deisenroth, Marc; Faisal, Aldo; Ong, Cheng Soon (2019). "Mathematics for Machine Learning". https://mml-book.github.io/. Retrieved 26 March 2019.
  33. Said, Omayma (6 April 2019). "Interpreting Machine Learning Models". https://speakerdeck.com/omaymas/interpreting-machine-learning-models-why-and-how. Retrieved 7 April 2019.
  34. Molnar, Christoph (12 April 2019). "Interpretable Machine Learning". https://christophm.github.io/interpretable-ml-book/. Retrieved 19 June 2019.
  35. Rodriguez, Jesus (June 2019). "How Google uses Reinforcement Learning to Train AI Agents in the Most Popular Sport in the World". https://www.kdnuggets.com/2019/06/google-reinforcement-learning-ai-agents-sport.html. Retrieved 22 June 2019.
  36. Kurach, Karol et al (June 2019). "Google Research Football: A Novel Reinforcement Learning Environment". https://github.com/google-research/football/blob/master/paper.pdf. Retrieved 22 June 2019.
  37. Technology, Emerging (13 August 2019). "Having mastered Space Invaders, chess, and Go, AI tackles video soccer". https://www.technologyreview.com/s/614049/having-mastered-space-invaders-chess-and-go-ai-tackles-video-soccer/. Retrieved 17 August 2019.
  38. Herold, Mat et al (1 October 2019). "Machine learning in men’s professional football: Current applications and future directions for improving attacking play". International Journal of Sports Science & Coaching https://doi.org/10.1177/1747954119879350.
  39. Krogh, Anders (2008). "What are artificial neural networks?". Nature Biotechnology 26(2): 195-197.
  40. Ripley, Brian (1996). Pattern Recognition and Neural Networks. Cambridge: Cambridge University Press.
  41. Gershenson, Carlos (2003). Artificial neural networks for beginners. arXiv preprint cs/0308031.
  42. Holländer, Branislav (20 August 2018). "Natural vs Artificial Neural Networks". https://becominghuman.ai/natural-vs-artificial-neural-networks-9f3be2d45fdb. Retrieved 21 September 2018.
  43. Alammar, Jay (14 December 2016). "A Visual and Interactive Guide to the Basics of Neural Networks". https://jalammar.github.io/visual-interactive-guide-basics-neural-networks/. Retrieved 25 September 2018.
  44. Alammar, Jay (February 2018). "A Visual And Interactive Look at Basic Neural Network Math". https://jalammar.github.io/feedforward-neural-networks-visual-interactive/. Retrieved 25 September 2018.
  45. Perl, Jürgen (2004). "A neural network approach to movement pattern analysis". Human Movement Science 23(5): 605-620.
  46. Perl, Jürgen; Endler, Stefan (2006). "Training and contest-scheduling in endurance sports by means of course profiles and PerPot-based analysis". International Journal of Computer Science in Sport 5(2): 42-46.
  47. Perl, Jürgen; Endler, Stefan (2012). "PerPot individual anaerobe threshold marathon scheduling". International Journal of Computer Science in Sport 11(2): 52-60.
  48. Endler, Stefan et al (2017). "The PerPot simulated anaerobic threshold : a comparison to typical lactate-based thresholds". International Journal of Human Movement and Sports Sciences 5(1): 9-15.
  49. Memmert, Daniel; Perl, Jürgen (2009). "Game creativity analysis using neural networks". Journal of Sports Sciences 27(2): 139-149.
  50. Pfeiffer, Mark; Perl, Jürgen (2006). "Analysis of tactical structures in team handball by means of artificial neural networks". International Journal of Computer Science in Sport 5(1): 4-14.
  51. Grunz, Andreas; Memmert, Daniel; Perl, Jürgen (2012). "Tactical pattern recognition in soccer games by means of special self-organizing maps". Human Movement Science 31(2): 334-343.
  52. Perl, Jürgen; Grunz, Andreas; Memmert, Daniel (2013). "Tactics analysis in soccer–An advanced approach". International Journal of Computer Science in Sport 12(1): 33-44.
  53. Barron, Donald; Ball, Graham; Robins, Matthew; Sunderland, Caroline (2018). "Artificial neural networks and player recruitment in professional soccer". PLoS ONE https://doi.org/10.1371/journal.pone.0205818.
  54. Donoho, David (2017). "50 years of data science". Journal of Computational and Graphical Statistics 26(4): 745-766.
  55. Samad, Hanif (1 August 2019). "I wasn’t getting hired as a Data Scientist. So I sought data on who is.". https://towardsdatascience.com/i-wasnt-getting-hired-as-a-data-scientist-so-i-sought-data-on-who-is-c59afd7d56f5. Retrieved 17 August 2019.
  56. Conway, Drew (26 March 2013). "The Data Science Venn Diagram.". https://towardsdatascience.com/i-wasnt-getting-hired-as-a-data-scientist-so-i-sought-data-on-who-is-c59afd7d56f5. Retrieved 17 August 2019.
  57. Samad, Hanif (1 August 2019). "I wasn’t getting hired as a Data Scientist. So I sought data on who is.". https://towardsdatascience.com/i-wasnt-getting-hired-as-a-data-scientist-so-i-sought-data-on-who-is-c59afd7d56f5. Retrieved 17 August 2019.
  58. Peng, Roger (11 December 2018). "The Role of Theory in Data Analysis". https://simplystatistics.org/2018/12/11/the-role-of-theory-in-data-analysis/. Retrieved 26 January 2019.
  59. Peng, Roger (18 January 2019). "The Tentpoles of Data Science". https://simplystatistics.org/2019/01/18/the-tentpoles-of-data-science/. Retrieved 26 January 2019.
  60. Robinson, David (9 January 2018). "What's the difference between data science, machine learning, and artificial intelligence?". http://varianceexplained.org/r/ds-ml-ai/. Retrieved 13 February 2018.
  61. Robinson, David (9 January 2018). "What's the difference between data science, machine learning, and artificial intelligence?". http://varianceexplained.org/r/ds-ml-ai/. Retrieved 13 February 2018.
  62. Press, Gil (28 May 2013). "A Very Short History Of Data Science". https://www.forbes.com/sites/gilpress/2013/05/28/a-very-short-history-of-data-science/#1aca6ca655cf. Retrieved 22 August 2018.
  63. Corea, Francesco (29 August 2018). "AI Knowledge Map: how to classify AI technologies". https://medium.com/@Francesco_AI/ai-knowledge-map-how-to-classify-ai-technologies-6c073b969020. Retrieved 21 September 2018.
  64. Bush, Vannevar (1945). "As we may think". The Atlantic Monthly 176(1): 101-108.
  65. Engelbart, Douglas (1962). "Augmenting human intellect: a conceptual framework". From Wagner to Virtual Reality. WW Norton & Company. pp. 64–90.
  66. Skagestad, Peter (1993). "Thinking with machines: intelligence augmentation, evolutionary epistemology and semiotic". The Journal of Social and Evolutionary Systems 16(2): 157-180.
  67. Skagestad, Peter (1993). "Thinking with machines: intelligence augmentation, evolutionary epistemology and semiotic". The Journal of Social and Evolutionary Systems 16(2): 157.
  68. Cook, Melanie (14 March 2017). "Intelligence Augmentation - The Next-Gen AI". https://www.slideshare.net/melsb/intelligence-augmentation-the-nextgen-ai. Retrieved 27 March 2018.
  69. Kozyrkov, Cassie (24 May 2018). "The simplest explanation of machine learning you’ll ever read". https://hackernoon.com/the-simplest-explanation-of-machine-learning-youll-ever-read-bebc0700047c. Retrieved 20 July 2018.
  70. Kozyrkov, Cassie (22 June 2018). "Unsupervised learning demystified". https://hackernoon.com/unsupervised-learning-demystified-4060eecedeaf. Retrieved 22 July 2018.
  71. Kozyrkov, Cassie (18 August 2018). "What on earth is data science?". https://hackernoon.com/what-on-earth-is-data-science-eb1237d8cb37. Retrieved 22 August 2018.
  72. Kozyrkov, Cassie (18 August 2018). "Is data science a bubble?". https://hackernoon.com/is-data-science-a-bubble-c70ceac0f264. Retrieved 22 August 2018.
  73. Kozyrkov, Cassie (14 September 2018). "Machine learning — Is the emperor wearing clothes?". https://hackernoon.com/machine-learning-is-the-emperor-wearing-clothes-59933d12a3cc. Retrieved 21 September 2018.
  74. Kozyrkov, Cassie (2 November 2018). "5 Bite-Sized Data Science Summaries". https://towardsdatascience.com/5-bite-sized-data-science-summaries-a5afb8509353. Retrieved 4 November 2018.
  75. Sarkar, Tirthajyoti (8 September 2018). "When Bayes, Ockham, and Shannon come together to define machine learning". https://towardsdatascience.com/when-bayes-ockham-and-shannon-come-together-to-define-machine-learning-96422729a1ad. Retrieved 12 September 2018.
  76. Rollins, John (24 August 2015). "Why we need a methodology for data science". https://www.ibmbigdatahub.com/blog/why-we-need-methodology-data-science. Retrieved 18 June 2019.
  77. Hao, Karen (25 January 2019). "We analyzed 16,625 papers to figure out where AI is headed next". https://www.technologyreview.com/s/612768/we-analyzed-16625-papers-to-figure-out-where-ai-is-headed-next/. Retrieved 27 January 2019.
  78. De Silva, Varuna (3 November 2018). "Chelsea is using our AI research for smarter football coaching". https://theconversation.com/chelsea-is-using-our-ai-research-for-smarter-football-coaching-105750. Retrieved 4 November 2018.
  79. Woo, Marcus (22 December 2018). "Artificial Intelligence in NBA Basketball". https://www.insidescience.org/news/artificial-intelligence-nba-basketball?_scpsug=crawled,49188,en_0c85c99320e2c0fb1e24841e2ab4262b1702d692d19a00030e0bf5ddff96e866#_scpsug=crawled,49188,en_0c85c99320e2c0fb1e24841e2ab4262b1702d692d19a00030e0bf5ddff96e866. Retrieved 23 December 2018.
  80. Sweeney, Peter (9 May 2018). "One problem to explain why AI works". https://towardsdatascience.com/one-problem-to-explain-ai-218a29e8fbc0. Retrieved 13 July 2018.
  81. Sweeney, Peter (31 May 2018). "Is strong AI inevitable?". https://towardsdatascience.com/is-strong-ai-inevitable-f4ed58c05293. Retrieved 16 July 2018.
  82. Lipton, Zachary; Steinhardt, Jacob (9 July 2018). "Troubling Trends in Machine Learning Scholarship". arXiv https://arxiv.org/abs/1807.03341.
  83. Behravan, Iman et al (15 February 2019). "Finding Roles of Players in Football Using Automatic Particle Swarm Optimization-Clustering Algorithm". Big Data: https://doi.org/10.1089/big.2018.0069.
  84. Bortolotti, Nicolas (9 October 2017). "Following Messi with TensorFlow and Object Detection". https://becominghuman.ai/following-messi-with-tensorflow-and-object-detection-20ba6d75667. Retrieved 3 April 2018.
  85. Horton, Michael (January 2018). "Algorithms for the analysis of spatio-temporal data from team sports". https://ses.library.usyd.edu.au/bitstream/2123/17755/2/Thesis%20-%20Michael%20Horton.pdf. Retrieved 24 October 2018.
  86. Dey, Debangan; Pita, Andrew (December 2018). "The Good, The Bad, and The Ugly of the Beautiful Game". https://ddey07.github.io/open-data/. Retrieved 22 December 2018.
  87. Berrar, Daniel et al (October 2018). "Guest editorial: special issue on machine learning for soccer". Machine Learning: https://doi.org/10.1007/s10994-018-5763-8.
  88. Goldsberry, Kirk (2012). "CourtVision: New Visual and Spatial Analytics for the NBA". http://www.sloansportsconference.com/wp-content/uploads/2012/02/Goldsberry_Sloan_Submission.pdf. Retrieved 21 November 2017.
  89. Cervone, Dan et al (2014). "Predicting Points and Valuing Decisions in Real Time with NBA Optical Tracking Data". https://pdfs.semanticscholar.org/f4b3/81f4482586dbdd15fc92bee81ce68bcb6898.pdf. Retrieved 21 November 2017.
  90. Miller, Andrew et al (2014). "Factorized Point Process Intensities: A Spatial Analysis of Professional Basketball". http://proceedings.mlr.press/v32/miller14.pdf. Retrieved 21 November 2017.
  91. Goldsberry, Kirk; Weiss, Eric (2013). "The Dwight Effect: A New Ensemble of Interior Defense Analytics for the NBA". http://www.sloansportsconference.com/wp-content/uploads/2013/The%20Dwight%20Effect%20A%20New%20Ensemble%20of%20Interior%20Defense%20Analytics%20for%20the%20NBA.pdf. Retrieved 21 November 2017.
  92. Franks, Alexander et al (2015). "Counterpoints: Advanced Defensive Metrics for NBA Basketball". https://pdfs.semanticscholar.org/1016/c66483e546eee19e0f1a5bdc811876950158.pdf. Retrieved 21 November 2017.
  93. Franks, Alexander et al (2015). "Characterizing the spatial structure of defensive skill in professional basketball". https://arxiv.org/pdf/1405.0231.pdf. Retrieved 21 November 2017.
  94. Wu, Steven; Bornn, Luke (2017). "Modeling offensive player movement in professional basketball". PeerJ Preprints: https://doi.org/10.7287/peerj.preprints.3201v1.
  95. Corcoran, Derek; Watanabe, Nicholas (2 February 2018). "Starting to use the Spatialball package". https://derek-corcoran-barrios.github.io/SpatialBall.html. Retrieved 3 February 2018.
  96. Papalexakis, Evangelos; Pelechrinis, Konstantinos (2018). "tHoops: A Multi-Aspect Analytical Framework Spatio-Temporal Basketball Data". arXiv: arXiv:1712.01199.
  97. Sha, Long et al (April 2018). "Interactive Sports Analytics: An Intelligent Interface for Utilizing Trajectories for Interactive Sports Play Retrieval and Analytics". ACM Transactions on Computer-Human Interaction 25(2).
  98. Hobbs, Wade et al (2018). "Measuring spatial scoring effectiveness in women’s basketball at the 2016 Olympic Games". International Journal of Performance Analysis in Sport https://doi.org/10.1080/24748668.2018.1550892.
  99. Kaggle (2014). "March Machine Learning Mania". https://www.kaggle.com/c/march-machine-learning-mania-2014. Retrieved 9 March 2018.
  100. Kaggle (2018). "Google Cloud & NCAA® ML Competition 2018-Men's Apply Machine Learning to NCAA® March Madness®". https://www.kaggle.com/c/mens-machine-learning-competition-2018. Retrieved 9 March 2018.
  101. Kaggle (2018). "Google Cloud & NCAA® ML Competition 2018-Women's Apply machine learning to NCAA® March Madness®". https://www.kaggle.com/c/womens-machine-learning-competition-2018. Retrieved 9 February 2018.
  102. Firke, Sam (5 March 2018). "Machine learning tutorial to create an entry for the Kaggle March Mania contest". https://github.com/sfirke/predicting-march-madness. Retrieved 9 March 2018.
  103. Lopez, Michael; Matthews, Gregory (30 November 2014). "Building an NCAA mens basketball predictive model and quantifying its success". arXiv: arXiv:1412.0248v1.
  104. Shaukat, Tariq (19 December 2017). "NCAA teams up with Google Cloud". https://www.blog.google/topics/google-cloud/ncaa-teams-google-cloud/. Retrieved 31 March 2018.
  105. Blacker, Courtney (30 March 2018). "Tip off: how we’re using predictive analytics during the Final Four". https://www.blog.google/topics/google-cloud/how-were-using-predictive-analytics-during-final-four/. Retrieved 31 March 2018.
  106. Schmidt, Eric; Jarvis, Allen (30 March 2018). "Architecting live NCAA predictions: from archives to insights". https://cloud.google.com/blog/big-data/2018/03/architecting-live-ncaa-predictions-from-archives-to-insights. Retrieved 31 March 2018.
  107. Hojo, Motokazu (18 December 2018). "Automatically recognizing strategic cooperative behaviors in various situations of a team sport". PLoS ONE 13(12): https://doi.org/10.1371/journal.pone.0209247.
  108. Schmidt, Eric; Jarvis, Allen (30 March 2018). "Architecting live NCAA predictions: from archives to insights". https://cloud.google.com/blog/big-data/2018/03/architecting-live-ncaa-predictions-from-archives-to-insights. Retrieved 31 March 2018.
  109. Shaukat, Tariq (19 December 2017). "NCAA teams up with Google Cloud". https://www.blog.google/topics/google-cloud/ncaa-teams-google-cloud/. Retrieved 31 March 2018.
  110. Blacker, Courtney (30 March 2018). "Tip off: how we’re using predictive analytics during the Final Four". https://www.blog.google/topics/google-cloud/how-were-using-predictive-analytics-during-final-four/. Retrieved 31 March 2018.
  111. Lyons, Keith (3 April 2018). "Basketball: archives and insights". https://keithlyons.me/blog/2018/04/03/basketball-archives-and-insights/. Retrieved 3 April 2018.
  112. Vandeplas, Jake (10 June 2014). "Is Seattle Really Seeing an Uptick In Cycling?". https://jakevdp.github.io/blog/2014/06/10/is-seattle-really-seeing-an-uptick-in-cycling/. Retrieved 12 October 2017.
  113. Schneider, Todd (13 January 2016). "A Tale of Twenty-Two Million Citi Bike Rides: Analyzing the NYC Bike Share System". http://toddwschneider.com/posts/a-tale-of-twenty-two-million-citi-bikes-analyzing-the-nyc-bike-share-system/. Retrieved 8 November 2017.
  114. Carli, Luis (3 October 2017). "An animated guide to Frequency Trails (aka Joyplots)". http://vis.design/2017/08/how-to-joyplot/. Retrieved 12 October 2017.
  115. Padgham, Mark (17 October 2017). "Data from Public Bicycle Hire Systems". https://ropensci.org/blog/blog/2017/10/17/bikedata. Retrieved 18 October 2017.
  116. Molnar, Christoph (28 January 2018). "Interpretable Machine Learning". https://christophm.github.io/interpretable-ml-book/. Retrieved 4 February 2018.
  117. Teschner, Florian (29 January 2018). "Exploring Embeddings for Categorical Variables with Keras". https://flovv.github.io/Embeddings_with_keras/. Retrieved 15 February 2018.
  118. Teschner, Florian (5 February 2018). "Concatenate Embeddings for Categorical Variables with Keras". https://flovv.github.io/Embeddings_with_keras_part2/. Retrieved 15 February 2018.
  119. Teschner, Florian (13 February 2018). "tfestimators - Package: Embeddings for Categorical Variables". https://flovv.github.io/Embeddings_with_tf/. Retrieved 15 February 2018.
  120. Padgham, Mark (27 April 2018). "bikedata". https://cran.r-project.org/web/packages/bikedata/vignettes/bikedata.html. Retrieved 30 August 2018.
  121. Frigaard, Martin; Spangler, Peter (7 May 2019). "Exploring Chicago rideshare data in R". http://www.storybench.org/exploring-chicago-rideshare-data/. Retrieved 9 May 2019.
  122. McGrath, Joseph et al (December 2018). "Cricket fast bowling detection in a training setting using an inertial measurement unit and machine learning". Journal of Sports Sciences: 10.1080/02640414.2018.1553270.
  123. Lane, Steve (August 2017). "Pacing Strategy: Can Analytics Help Us Run Faster in Cross Country?". http://www.pageturnpro.com/Renaissance-Publishing/79939-Techniques-August-2017/index.html#1. Retrieved 19 August 2017.
  124. Abbiss, Chris; Larsen, Paul (2008). "Describing and understanding pacing strategies during athletic competition". Sports Medicine 38(3): 239-252.
  125. Waldron, Mark; Highton, Jamie (2014). "Fatigue and pacing in high-intensity intermittent team sport: an update". Sports Medicine 44(12): 1645-1658.
  126. Edwards, Andrew et al (2016). "Oxford and Cambridge boat race: performance, pacing and tactics between 1890 and 2014". Sports Medicine 46(10): 1553-1562.
  127. Perl, Jürgen (2004). "PerPot - a meta-model and software tool for analysis and optimisation of load-performance-interaction". International Journal of Performance Analysis in Sport 4(2): 61-73.
  128. Fister, Iztok et al (2018). "Post hoc analysis of sport performance with differential evolution". Neural Computing and Applications https://doi.org/10.1007/s00521-018-3395-3.
  129. McCourt, Warren (14 August 2004). "A Chat with Finn Marsland, Australia’s National Coach". http://fasterskier.com/fsarticle/a-chat-with-finn-marsland-australiaae%E2%84%A2s-national-coach/. Retrieved 8 August 2017.
  130. Marsland, Finn et al (2012). "Identification of Cross-Country Skiing Movement Patterns Using Micro-Sensors". Sensors 12(4): 5047-5066.
  131. Marsland, Finn et al (2015). "Using micro-sensor data to quantify macro kinematics of classical cross-country skiing during on-snow training". Sports Biomechanics 14(4): 435-447.
  132. Marsland, Finn et al (2017). "Full course macro-kinematic analysis of a 10 km classical cross-country skiing competition". PLoS One https://doi.org/10.1371/journal.pone.0182262.
  133. Marsland, Finn et al (2012). "Identification of Cross-Country Skiing Movement Patterns Using Micro-Sensors". Sensors 12(4): 5047-5066.
  134. Marsland, Finn et al (2015). "Using micro-sensor data to quantify macro kinematics of classical cross-country skiing during on-snow training". Sports Biomechanics 14(4): 435-447.
  135. Marsland, Finn et al (2017). "Full course macro-kinematic analysis of a 10 km classical cross-country skiing competition". PLoS One https://doi.org/10.1371/journal.pone.0182262.
  136. Seeberg, Trine et al (2017). "A multi-sensor system for automatic analysis of classical cross-country skiing techniques". Sports Engineering 20(4): 313-327.
  137. Jang, Jihyeok et al (2018). "A Unified Deep-Learning Model for Classifying the Cross-Country Skiing Techniques Using Wearable Gyroscope Sensors". Sensors 18(11).
  138. Perry, Emmanuel. "On Salad and Predicting Hockey Games". http://www.corsica.hockey/blog/2017/10/06/on-salad-and-predicting-hockey-games/. Retrieved 7 October 2017.
  139. Steel, Kylie. "Friend or foe? Just look at the way a person moves". https://theconversation.com/friend-or-foe-just-look-at-the-way-a-person-moves-78334. Retrieved 22 August 2017.
  140. Bonda, Eva et al (1996). "Specific Involvement of Human Parietal Systems and the Amygdala in the Perception of Biological Motion". Journal of Neuroscience 16(11): 3737-3744.
  141. Felsen, Panna; Lucey, Patrick (March 2017). "Body Shots: Analyzing Shooting Styles in the NBA using Body Pose". https://statsweb-wpengine.netdna-ssl.com/wp-content/uploads/2017/0/STATS_ResearchPaper_BodyShots.pdf. Retrieved 22 August 2017.
  142. Le, Hoang et al. "Coordinated Multi-Agent Imitation Learning". https://www.disneyresearch.com/publication/coordinated-multi-agent-imitation-learning/. Retrieved 23 August 2017.
  143. Strayer, Nick (17 July 2018). "Classifying physical activity from smartphone data". http://blogs.rstudio.com/tensorflow/posts/2018-07-17-activity-detection/. Retrieved 18 July 2018.
  144. Etxegarai, Urtats et al (2018). "Estimation of lactate threshold with machine learning techniques in recreational runners". Applied Soft Computing 63(February): 181-196.
  145. Knobbe, Arno et al (2017). "Sports analytics for professional speed skating". Data Mining and Knowledge Discovery 27 May: 1-31.
  146. Tokyo 2020. "Surfing". https://tokyo2020.jp/en/games/sport/olympic/surfing/. Retrieved 19 October 2017.
  147. Spence, Paul; Keating, Shane. "Hang ten (decades): Walter Munk, inventor of the surf forecast, turns 100". https://theconversation.com/hang-ten-decades-walter-munk-inventor-of-the-surf-forecast-turns-100-85117. Retrieved 19 October 2017.
  148. Sverdrup, Harald; Munk, Walter (1947). "Wind, sea, and swell: theory of relations for forecasting". Hydrographic Office: 1-36.
  149. Bretschneider. "Generation of waves by wind: state of the art". https://repository.tudelft.nl/islandora/object/uuid:6bc0ec3a-8d52-49d8-a624-22718b58cd4e/datastream/OBJ/download. Retrieved 19 October 2017.
  150. Tukey, John (1984). "Styles of spectrum analysis". Scripps Institution of Oceanography 84(5): 100-103.
  151. Munk, Walter. "Research". http://waltermunk.com/research/. Retrieved 19 October 2017.
  152. Spence, Paul; Keating, Shane. "Hang ten (decades): Walter Munk, inventor of the surf forecast, turns 100". https://theconversation.com/hang-ten-decades-walter-munk-inventor-of-the-surf-forecast-turns-100-85117. Retrieved 19 October 2017.
  153. Kovalchik, Stephanie; Reid, Machar (2018). "A calibration method with dynamic updates for within-match forecasting of wins in tennis". International Journal of Forecasting https://doi.org/10.1016/j.ijforecast.2017.11.008.
  154. Hoffmann, Marian et al (2017). "Predicting Elite Triathlon Performance: A Comparison of Multiple Regressions and Artificial Neural Networks". International Journal of Computer Science in Sport 16(2): 101-116.
  155. Hoffmann, Marian et al (2017). "Predicting Elite Triathlon Performance: A Comparison of Multiple Regressions and Artificial Neural Networks". International Journal of Computer Science in Sport 16(2): 101.
  156. Hoffmann, Marian et al (2017). "Predicting Elite Triathlon Performance: A Comparison of Multiple Regressions and Artificial Neural Networks". International Journal of Computer Science in Sport 16(2): 105.
  157. Hoffmann, Marian et al (2017). "Predicting Elite Triathlon Performance: A Comparison of Multiple Regressions and Artificial Neural Networks". International Journal of Computer Science in Sport 16(2): 105.
  158. Hoffmann, Marian et al (2017). "Predicting Elite Triathlon Performance: A Comparison of Multiple Regressions and Artificial Neural Networks". International Journal of Computer Science in Sport 16(2): 113-114.
  159. Bryan, Jenny (14 May 2015). "How to name files". https://speakerdeck.com/jennybc/how-to-name-files. Retrieved 2 February 2018.
  160. Wickham, Hadley (2014). "Tidy data". Journal of Statistical Software 59(10): 1-23.
  161. Dasu, Tamraparni; Johnson, Theodore (2003). Exploratory Data Mining and Data Cleaning. Hoboken, New Jersey: John Wiley & Sons. p. ix.
  162. Wickham, Hadley (2014). "Tidy data". Journal of Statistical Software 59(10): 1-23.
  163. Ross, Zev; Wickham, Hadley; Robinson, David (2017). "Declutter your R workflow with tidy tools". PeerJ Preprints. https://peerj.com/preprints/3180.pdf.
  164. Dowsett, Chris (29 May 2016). "Thinking about Data Science as-a-Platform". https://towardsdatascience.com/thinking-about-data-science-as-a-platform-f9e98277dcc6. Retrieved 25 November 2017.
  165. Dowsett, Chris (29 May 2016). "Thinking about Data Science as-a-Platform". https://towardsdatascience.com/thinking-about-data-science-as-a-platform-f9e98277dcc6. Retrieved 25 November 2017.
  166. Condron, Aiden (2016). "Servicing New and Novel Forms of Data: Opportunities for Social Science". IASSIST Quarterly 40(4).
  167. Brennan, Shane (17 September 2017). "The Ten Fallacies of Data Science". https://towardsdatascience.com/the-ten-fallacies-of-data-science-9b2af78a1862. Retrieved 27 November 2017.
  168. Szuflita, Vicky (25 May 2018). "Pitch Recommendation: a look into the data science process". https://medium.com/@vszuflita/pitch-recommendation-a-look-into-the-data-science-process-ab15f45c8687. Retrieved 30 June 2018.