Sport Informatics and Analytics/Pattern Recognition/Using R
Introduction
This topic develops issues raised in Pattern Recognition, Theme 2 of this course. It starts a conversation about the use of R in sport analytics.
R is a programming language and a software environment for statistical computing and graphics that is supported by the R Foundation for Statistical Computing.[1]
Kurt Hornik and Friedrich Leisch[2] introduce R in the first issue of R News, the R project's newsletter. The R Core Team provide a brief background report about R in the same issue.[3]
Wikipedia's article on R gives a detailed description of the language.
There is a vibrant R community on Twitter that includes RStudio and RLadies Global.
Learning about R
Using R in sport contexts
Ice Hockey
Parkrun
Australian rules football
Netball
Association football
Cricket
Basketball
Tennis
Salaries in sport
Strava
Olympic medals
Extreme skiing and snowboarding
Baseball
NFL
Ice hockey
Visualising data with R
R lets you visualise your data, and it has a number of built-in functions and add-on packages to support your visualisations.
If you would like to explore R's potential for visualising data, you might find Remko Duursma, Jeff Powell and Glenn Stone's (2017)[79] introduction to learning R very helpful. Their Chapter 4 deals explicitly with visualising data and the use of RStudio, and includes discussion of scatterplots, bar plots, histograms, curves, pie charts, box and whisker plots, and symbols.
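As a minimal sketch of some of these plot types, the example below uses only base R graphics. The `matches` data frame and its values are invented for illustration and do not come from Duursma, Powell and Stone's notes or from any real competition.

```r
# Hypothetical match data, invented purely for illustration
matches <- data.frame(
  team  = rep(c("Home", "Away"), each = 10),
  shots = c(12, 15, 9, 14, 11, 16, 13, 10, 17, 12,
            8, 11, 7, 10, 9, 12, 6, 9, 11, 8),
  goals = c(2, 3, 1, 2, 1, 3, 2, 1, 4, 2,
            1, 1, 0, 2, 1, 2, 0, 1, 1, 1)
)

# Scatterplot: shots against goals
plot(matches$shots, matches$goals,
     xlab = "Shots", ylab = "Goals", main = "Shots and goals per match")

# Bar plot: mean goals for each team category
mean_goals <- tapply(matches$goals, matches$team, mean)
barplot(mean_goals, ylab = "Mean goals", main = "Mean goals by team")

# Histogram: distribution of shots (the returned object holds the bin counts)
h <- hist(matches$shots, xlab = "Shots", main = "Distribution of shots")

# Box and whisker plot: shots grouped by team
boxplot(shots ~ team, data = matches,
        xlab = "Team", ylab = "Shots", main = "Shots by team")
```

In RStudio each call draws into the Plots pane; when the script is run non-interactively, R writes the plots to a file instead.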
A powerful visualisation tool in R is ggplot2[80].
ggplot2 was inspired by Leland Wilkinson's (1999) The Grammar of Graphics[81] and is available as a package from CRAN for use in R and RStudio.
Edwin Chen (2012)[82] provides "a bare-bones introduction to ggplot2" that "assumes no knowledge of R". A definitive introduction to ggplot2 is provided by Hadley Wickham (2016)[83].
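To give a flavour of the grammar-of-graphics approach, here is a short sketch of a ggplot2 scatterplot. It assumes ggplot2 has already been installed (for example with `install.packages("ggplot2")`). The `players` data frame is hypothetical and is not taken from Chen's or Wickham's material.

```r
library(ggplot2)

# Hypothetical player data, invented purely for illustration
players <- data.frame(
  minutes  = c(34, 28, 31, 22, 18, 36, 25, 15),
  points   = c(24, 15, 19, 10, 6, 27, 12, 4),
  position = c("Guard", "Guard", "Forward", "Forward",
               "Centre", "Guard", "Centre", "Forward")
)

# Build the plot in layers: data and aesthetic mappings, then a geom,
# then labels and a theme
p <- ggplot(players, aes(x = minutes, y = points, colour = position)) +
  geom_point(size = 3) +
  labs(x = "Minutes played", y = "Points scored", colour = "Position",
       title = "Points against minutes played") +
  theme_minimal()

# Printing the object draws the plot
print(p)
```

Because the plot is stored as an object, further layers (for example a smoother via `geom_smooth()`) can be added to `p` later without rewriting the earlier code.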
R as an ePortfolio resource
References
- ↑ Hornik, Kurt; Leisch, Friedrich (November 26, 2015). "R FAQ". https://cran.r-project.org/doc/FAQ/R-FAQ.html#What-is-R_003f. Retrieved 9 February 2016.
- ↑ Hornik, Kurt; Leisch, Friedrich (1 January 2001). "Editorial". R-project. https://www.r-project.org/doc/Rnews/Rnews_2001-1.pdf. Retrieved 9 February 2016.
- ↑ The R Core Team (1 January 2001). "What is R?". R-project. https://www.r-project.org/doc/Rnews/Rnews_2001-1.pdf. Retrieved 9 February 2016.
- ↑ Hicks, Stephanie; Irizarry, Rafael (2016). "A Guide to Teaching Data Science". https://arxiv.org/ftp/arxiv/papers/1612/1612.07140.pdf.
- ↑ Campbell, Paul (September 2018). "A whirlwind tour of working with data in R". https://paulc91.github.io/intro_to_r/#1. Retrieved 23 September 2018.
- ↑ Dancho, Matt (4 November 2018). "New R cheatsheet: data science workflow with R". https://www.business-science.io/learning-r/2018/11/04/data-science-r-cheatsheet.html. Retrieved 5 November 2018.
- ↑ Wickham, Hadley (August 2019). "Mastering Shiny". https://mastering-shiny.org/. Retrieved 14 August 2019.
- ↑ Walum, Hasse; De Leon, Desiree (August 2019). "Introduction". https://tinystats.github.io/teacups-giraffes-and-statistics/02_bellCurve.html. Retrieved 15 August 2019.
- ↑ Schneider, Todd (2016). https://toddwschneider.com/posts/ballr-interactive-nba-shot-charts-with-r-and-shiny/. Retrieved 18 October 2017.
- ↑ Frick, Hannah; Kosmidis, Ioannis (2017). "trackeR: Infrastructure for Running and Cycling Data from GPS-Enabled Tracking Devices in R". Journal of Statistical Software 82 (7).
- ↑ Frick, Hannah; Kosmidis, Ioannis (2017). "trackeR: Infrastructure for Running and Cycling Data from GPS-Enabled Tracking Devices in R". Journal of Statistical Software 82 (7): 1.
- ↑ Tran, Jacquie (15 February 2018). "Sport analytics in R". https://jacquietran.neocities.org/acu-gcpa-2018-02/presentation.html. Retrieved 15 February 2018.
- ↑ Nakagawara, Ryo (4 July 2018). https://datascienceplus.com/visualize-the-world-cup-with-r-part-1-recreating-goals-with-ggsoccer-and-ggplot2/. Retrieved 8 August 2018.
- ↑ Nakagawara, Ryo (6 August 2018). https://www.r-bloggers.com/animating-the-goals-of-the-world-cup-comparing-the-old-vs-new-gganimate-and-tweenr-api/. Retrieved 8 August 2018.
- ↑ Benz, Luke. https://github.com/lbenz730/ncaahoopR. Retrieved 8 August 2018.
- ↑ Positive Residual (2019). "Portfolio". https://positiveresidual.com/. Retrieved 7 January 2019.
- ↑ Arregoitia, Luis (January 2019). "Animate shot distances for NBA games". https://luisdva.github.io/rstats/bball-shots/. Retrieved 7 January 2019.
- ↑ Ward, Patrick (20 January 2019). "A Simple Approach to Analyzing Athlete Data in Applied Sports Science". http://optimumsportsperformance.com/blog/testing-syntax-highlighter-evolved/. Retrieved 21 January 2019.
- ↑ Averick, Mara (27 February 2019). "NBA Advanced Metrics". http://rpubs.com/maraaverick/470388. Retrieved 28 February 2019.
- ↑ Frigaard, Martin; Spangler, Peter (7 May 2019). "Exploring Chicago rideshare data in R". http://www.storybench.org/exploring-chicago-rideshare-data/. Retrieved 9 May 2019.
- ↑ O'Hara-Wild, Mitchell (17 June 2019). "Introducing tsibbledata". https://www.mitchelloharawild.com/blog/tsibbledata/. Retrieved 15 June 2019.
- ↑ Padgham, Mark (9 May 2019). "bikedata". https://cran.r-project.org/web/packages/bikedata/vignettes/bikedata.html. Retrieved 2 September 2019.
- ↑ Hall, Meghan (11 December 2019). "An Introduction to R With Hockey Data". https://hockey-graphs.com/. Retrieved 15 January 2020.
- ↑ Hall, Meghan (8 October 2019). "Exploratory Data Analysis Using Tidyverse". https://hockey-graphs.com/2019/10/08/exploratory-data-analysis-using-tidyverse/. Retrieved 15 January 2020.
- ↑ Lyons, Keith (5 January 2019). "Braidwood Showground Parkruns 2018". https://keithlyons.me/blog/2019/01/05/braidwood-showground-parkruns-2018/. Retrieved 5 January 2019.
- ↑ Jovanović, Mladen (13 March 2015). "AFL Data Analysis Report". http://complementarytraining.net/wp-content/uploads/2015/03/AFL_Analysis.html. Retrieved 26 March 2016.
- ↑ Tran, Jacquie (12 January 2019). "Getting to know the fitzRoy package (AFL game statistics)". https://underthehood.jacquietran.com/2019/01/12/getting-to-know-the-fitzroy-package-afl-game-statistics/. Retrieved 13 January 2019.
- ↑ Sweeting, Alice (2017). "Discovering the Movement Sequences of Elite and Junior Elite Netball Athletes" (PhD). Institute of Sport, Exercise and Active Living, Victoria University, Melbourne, Australia. http://trove.nla.gov.au/work/227110648?q&versionId=249204357. Retrieved 18 July 2017.
- ↑ Sweeting, Alice (11 June 2016). "Introduction to R and A Basic Analysis of Athlete Load". https://sportstatisticsrsweet.wordpress.com/2016/06/. Retrieved 18 July 2017.
- ↑ Sweeting, Alice (29 January 2018). "k-means Clustering in R". https://sportstatisticsrsweet.wordpress.com/2018/01/29/k-means-clustering-in-r/. Retrieved 30 January 2018.
- ↑ Loridan, Thomas. "téouch analytics". https://teouchanalytics.wordpress.com/. Retrieved 8 September 2017.
- ↑ Loridan, Thomas. "Google Scholar Profile". https://scholar.google.com.au/citations?user=VVRMn3cAAAAJ&hl=en. Retrieved 8 September 2017.
- ↑ Loridan, Thomas. "Episode 1: feature engineering (and some data to play with)". https://teouchanalytics.wordpress.com/2017/07/08/episode-1-feature-engineering-and-some-data-to-play-with/. Retrieved 8 September 2017.
- ↑ Loridan, Thomas. "Episode 2: Assessing feature importance". https://teouchanalytics.wordpress.com/2017/07/10/episode-2-assessing-feature-importance/. Retrieved 8 September 2017.
- ↑ Loridan, Thomas. "Episode 3: Building and testing a predictive model". https://teouchanalytics.wordpress.com/2017/07/13/episode-3-building-and-testing-a-predictive-model/. Retrieved 8 September 2017.
- ↑ Loridan, Thomas. "Episode 4: Tuning a football predictive model with caret". https://teouchanalytics.wordpress.com/2017/07/18/tuning-a-football-prediction-model-with-caret/. Retrieved 8 September 2017.
- ↑ Loridan, Thomas. "Episode 5: how to bet on football using a prediction model". https://teouchanalytics.wordpress.com/2017/07/21/episode-5-how-to-bet-on-football-using-a-prediction-model/. Retrieved 8 September 2017.
- ↑ Loridan, Thomas. "Episode 6: where to from here?". https://teouchanalytics.wordpress.com/2017/08/04/episode-6-where-to-from-here/. Retrieved 8 September 2017.
- ↑ Loridan, Thomas. "Episode 6: where to from here?". https://teouchanalytics.wordpress.com/2017/08/04/episode-6-where-to-from-here/. Retrieved 8 September 2017.
- ↑ Wilson, Robbie et al (2017). "Skill not athleticism predicts individual variation in match performance of soccer players". Proceedings of the Royal Society B Biological Sciences 284(1869).
- ↑ Tyner, Sam; Briatte, François; Hofmann, Heike (2017). "Network Visualization with ggplot2". The R Journal 9(1).
- ↑ Curley, James. "Introducing engsoccerdata". https://github.com/jalapic/engsoccerdata. Retrieved 8 November 2017.
- ↑ https://ewen.io/2018/12/10/understatr/. Retrieved 12 December 2018.
- ↑ "#15: Getting Started with Free StatsBomb Event Data – xG Shot Map Tutorial". 16 June 2019. https://thelastmananalytics.home.blog/2019/06/16/15-getting-started-with-free-statsbomb-event-data-xg-shot-map-tutorial/. Retrieved 18 June 2019.
- ↑ Torvaney, Ben (1 January 2019). https://stats-and-snakeoil.herokuapp.com/2019/01/01/predicting-the-premier-league-with-dixon-coles/. Retrieved 12 December 2018.
- ↑ Torvaney, Ben (6 August 2019). ggsoccer. https://github.com/Torvaney/ggsoccer. Retrieved 7 August 2019.
- ↑ Ganesh, Tinniam. "Introducing cricketr! : An R package to analyze performances of cricketers". https://gigadom.wordpress.com/2015/07/04/introducing-cricketr-a-r-package-to-analyze-performances-of-cricketers/. Retrieved 25 October 2017.
- ↑ Ganesh, Tinniam. "The making of cricket package yorkr – Part 1". https://gigadom.wordpress.com/2016/03/05/the-making-of-cricket-package-yorkr-part-1-2/. Retrieved 25 October 2017.
- ↑ Ganesh, Tinniam. "More book, more cricket! 2nd edition of my books now on Amazon". https://gigadom.wordpress.com/2017/03/26/more-book-more-cricket-2nd-edition-of-my-books-now-on-amazon/. Retrieved 25 October 2017.
- ↑ Ganesh, Tinniam. "cricketr sizes up legendary All-rounders of yesteryear". https://gigadom.wordpress.com/2016/09/10/cricketr-sizes-up-legendary-all-rounders-of-yesteryear/. Retrieved 25 October 2017.
- ↑ Ganesh, Tinniam. "Analysis of IPL T20 matches with yorkr templates". https://gigadom.wordpress.com/2017/03/04/analysis-of-ipl-t20-matches-with-yorkr-templates/. Retrieved 25 October 2017.
- ↑ Cervone, Daniel et al (4 August 2014). "A Multiresolution Stochastic Process Model for Predicting Basketball Possession Outcomes". https://arxiv.org/pdf/1408.0777.pdf. Retrieved 21 November 2017.
- ↑ Cervone, Daniel. "EPVDemo". https://github.com/dcervone/EPVDemo. Retrieved 21 November 2017.
- ↑ Schneider, Todd (8 March 2016). "BallR: Interactive NBA Shot Charts with R and Shiny". http://toddwschneider.com/posts/ballr-interactive-nba-shot-charts-with-r-and-shiny/. Retrieved 4 April 2018.
- ↑ Schneider, Todd (8 March 2016). "BallR: Interactive NBA Shot Charts with R and Shiny". http://toddwschneider.com/posts/ballr-interactive-nba-shot-charts-with-r-and-shiny/. Retrieved 4 April 2018.
- ↑ Arregoitia, Luis (14 February 2019). "Quantifying point overlap for NBA shot chart data". https://luisdva.github.io/rstats/nba-overlap/. Retrieved 27 February 2019.
- ↑ Arregoitia, Luis (9 January 2019). "Animate shot distances for NBA games". https://luisdva.github.io/rstats/bball-shots/. Retrieved 27 February 2019.
- ↑ Greenberg, Neil (18 March 2019). "2019 NCAA tournament: The perfect bracket to win your March Madness pool". https://www.washingtonpost.com/sports/2019/03/18/ncaa-tournament-perfect-bracket-win-your-march-madness-pool/. Retrieved 19 March 2019.
- ↑ Firke, Sam (18 March 2019). "Predicting March Madness". https://github.com/sfirke/predicting-march-madness. Retrieved 19 March 2019.
- ↑ Brooks, Dan; Folsom, Keith (11 May 2016). "Predicting March Madness". https://rstudio-pubs-static.s3.amazonaws.com/180553_8d12f96839b74f4aa3b562beb54dff25.html. Retrieved 20 March 2019.
- ↑ Lopez, Michael; Matthews, Gregory (30 November 2014). "Building an NCAA men's basketball predictive model and quantifying its success". https://arxiv.org/abs/1412.0248. Retrieved 19 March 2019.
- ↑ Kovalchik, Stephanie (13 October 2017). "Measuring Match Fatigue". http://on-the-t.com/2017/10/13/fatigue-effects/. Retrieved 9 December 2017.
- ↑ Kovalchik, Stephanie (20 October 2017). "Is Fatigue Cumulative?". http://on-the-t.com/2017/10/20/cumulative-fatigue-effects/. Retrieved 9 December 2017.
- ↑ Burris, Kyle (7 September 2017). "Relief-Fatigue". https://github.com/burrisk/Relief-Fatigue. Retrieved 9 December 2017.
- ↑ Ritz, Christian et al (2015). "Dose-Response Analysis Using R". PLoS ONE 10(12).
- ↑ Kovalchik, Stephanie (18 March 2018). "Cape Town celebrates R and tennis data science at satRday". http://on-the-t.com/2018/03/16/satrday-capetown/. Retrieved 24 March 2018.
- ↑ Kovalchik, Stephanie (18 March 2018). "satRday". https://github.com/skoval/satRday. Retrieved 24 March 2018.
- ↑ Kovalchik, Stephanie (10 July 2018). "Material from 2018 UseR Conference: Statistical Models for Sport in R". https://github.com/skoval/UseR2018. Retrieved 24 July 2018.
- ↑ Tran, Jacquie (2 January 2018). "How much do you get paid? Part I - An initial exploration". http://underthehood.jacquietran.com/2018/01/02/how-much-do-you-get-paid-part-1/. Retrieved 3 January 2018.
- ↑ Smith, David (23 January 2018). http://blog.revolutionanalytics.com/2018/01/strava-visualization.html. Retrieved 24 January 2018.
- ↑ Rinker, Tyler (20 March 2018). "Building the Olympics blog: tidy data preparation". https://edwinth.github.io/olympics-dataprep/. Retrieved 22 March 2018.
- ↑ Rinker, Tyler (20 March 2018). "Building the Olympics blog: tidy data preparation". https://edwinth.github.io/olympics-dataprep/. Retrieved 22 March 2018.
- ↑ Rinker, Tyler (9 February 2014). "Sochi Olympic Medals". https://trinkerrstuff.wordpress.com/2014/02/09/sochi-olympic-medals-2/. Retrieved 22 March 2018.
- ↑ Oldach, Matthew (8 May 2018). "Analyzing extreme skiing and snowboarding in R: Freeride World Tour 1996–2018". https://medium.com/@MattOldach_65321/analyzing-extreme-skiing-and-snowboarding-in-r-freeride-world-tour-1996-2018-ffde401fb3ae. Retrieved 10 May 2018.
- ↑ Petti, Bill (21 September 2015). "A Short(-ish) Introduction to Using R Packages for Baseball Research". https://www.fangraphs.com/tht/a-short-ish-introduction-to-using-r-for-baseball-research/. Retrieved 2 June 2018.
- ↑ Protacio, Angeline (September 2019). "Using R and the Tidyverse to Play Fantasy Baseball". https://github.com/angelinepro/useR_july2019/blob/master/Using%20R%20and%20the%20Tidyverse%20to%20Play%20Fantasy%20Baseball_useR2019.pdf. Retrieved 11 September 2019.
- ↑ Petersen, Isaac. "Fantasy Football Analytics". https://fantasyfootballanalytics.net/. Retrieved 6 September 2018.
- ↑ https://github.com/jflancer/nwhlR. Retrieved 12 December 2018.
- ↑ Duursma, Remko; Powell, Jeff; Stone, Glenn (28 August 2017). https://www.westernsydney.edu.au/__data/assets/pdf_file/0011/830909/Rnotes_20170828_web.pdf. Retrieved 26 November 2017.
- ↑ Wickham, Hadley (2011). "ggplot2". WIREs Computational Statistics 3 (2): 180-185.
- ↑ Wickham, Hadley (2007). http://ggplot2.org/resources/2007-past-present-future.pdf. Retrieved 26 November 2017.
- ↑ Chen, Edwin (17 January 2012). http://blog.echen.me/2012/01/17/quick-introduction-to-ggplot2/. Retrieved 26 November 2017.
- ↑ Wickham, Hadley (2016). ggplot2: Elegant Graphics for Data Analysis. Berlin: Springer.
- ↑ Atkinson, Anthony (1986). "Comment: Aspects of Diagnostic Regression Analysis". Statistical Science 1(3): 379-402.
- ↑ Healy, Kieran (2017). "Data Visualization for Social Science: A practical introduction with R and ggplot2". http://socviz.co/index.html. Retrieved 9 December 2017.
- ↑ MacKintosh, John (16 May 2016). "Intro to ggplot2". https://cdn.rawgit.com/johnmackintosh/ggplot2_demo/a18cc631/pres.html#1. Retrieved 22 February 2018.
- ↑ Tyner, Sam; Briatte, François; Hofmann, Heike (2017). "Network Visualization with ggplot2". The R Journal 9(1).
- ↑ Fry, Chris (9 April 2015). "Graphing in R". https://chrisfryperformanceanalyst.wordpress.com/2015/04/09/graphing-in-r/. Retrieved 21 February 2018.
- ↑ Toumi, Asmae (February 2018). "R for data visualization". https://docs.google.com/presentation/d/1f5PGhzkW0ouqvtow9JbnpNe9AKATKXJac5CLV7JSWbU/edit#slide=id.gc6f90357f_0_0. Retrieved 25 February 2018.
- ↑ Toumi, Asmae (February 2018). "R for data visualization". https://drive.google.com/drive/folders/1A-yoLHJ7VJHlo0QL28LMDg0CGogF6xeq. Retrieved 25 February 2018.
- ↑ Hvitfeldt, Emil (12 June 2018). "ggplot2 trial and error - US trade data". https://www.hvitfeldt.me/2018/06/ggplot2-trial-and-error-us-trade-data/. Retrieved 14 June 2018.
- ↑ Navarro, Danielle (6 April 2019). "Data visualisation in R". https://djnavarro.github.io/satrdayjoburg/. Retrieved 7 April 2019.
- ↑ Byrd, Larie (8 February 2018). "The First (and Namesake) Post: Is It Cake?". https://aczane.netlify.com/2018/02/08/the-first-and-namesake-post-is-it-cake/. Retrieved 10 February 2018.
- ↑ Robinson, David (14 November 2017). "Advice to aspiring data scientists: start a blog". http://varianceexplained.org/r/start-blog/. Retrieved 15 February 2018.
- ↑ Salmon, Maelle (15 March 2018). "Get on your soapbox!". http://www.masalmon.eu/rladiesct/slides#1. Retrieved 16 March 2018.
- ↑ Koehrsen, William (11 August 2018). "The most important part of a data science project is writing a blog post". https://towardsdatascience.com/the-most-important-part-of-a-data-science-project-is-writing-a-blog-post-50715f37833a. Retrieved 15 August 2018.
- ↑ SportSciData (4 April 2019). "How to Create Interactive Reports with R Markdown Part I:". https://www.sportscidata.com/2019/04/04/how-to-create-interactive-reports-with-r-markdown-part-i/. Retrieved 16 April 2016.
- ↑ SportSciData (12 April 2019). "How to Create Interactive Reports in R Markdown Part II: Data Visualisation". https://www.sportscidata.com/2019/04/12/using-data-visualisation-in-r-markdown/. Retrieved 16 April 2016.
- ↑ SportSciData (4 April 2019). "How to Create Interactive Reports with R Markdown Part I:". https://www.sportscidata.com/2019/04/04/how-to-create-interactive-reports-with-r-markdown-part-i/. Retrieved 16 April 2016.
- ↑ SportSciData (12 April 2019). "How to Create Interactive Reports in R Markdown Part II: Data Visualisation". https://www.sportscidata.com/2019/04/12/using-data-visualisation-in-r-markdown/. Retrieved 16 April 2016.
- ↑ Bajak, Aleszu (25 August 2017). "How to convert a Google Doc to RMarkdown and publish on Github pages". http://www.storybench.org/convert-google-doc-rmarkdown-publish-github-pages/. Retrieved 15 November 2017.
- ↑ Collins, Neil. "How to Create Reports In R Markdown I: Data Tables". https://www.sportscidata.com/2019/04/04/how-to-create-interactive-reports-with-r-markdown-part-i/. Retrieved 17 June 2019.
- ↑ Monkman, Martin. "Per-game run scoring by league". https://monkmanmh.shinyapps.io/MLBrunscoring_shiny/. Retrieved 17 February 2018.
- ↑ Monkman, Martin (26 March 2017). "Updated Shiny app". https://bayesball.blogspot.com.au/2017/03/updated-shiny-app.html. Retrieved 17 February 2018.
- ↑ Davis, Scott (9 June 2018). "NBA Finals Gamecast Summary". https://sdavis.shinyapps.io/NBAFinals/. Retrieved 10 June 2018.
- ↑ Berndsen, Chris (8 March 2018). "Introduction to RMarkdown and Shiny". https://youtu.be/O04l-LpmoE8. Retrieved 13 March 2018.
- ↑ Biecek, Przemysław; Kosiński, Marcin (2017). "archivist: An R Package for Managing, Recording and Restoring Data Analysis Results". Journal of Statistical Software 82(11): 10.18637/jss.v082.i11.
- ↑ Biecek, Przemysław (14 December 2017). "archivist: Boost the reproducibility of your research". http://smarterpoland.pl/index.php/2017/12/boost-the-reproducibility-of-your-research-with-archivist/. Retrieved 16 December 2017.
- ↑ Xiao, Nan (20 May 2017). "Persistent Reproducible Reporting with Docker and R". https://nanx.me/talks/#talk-chinar-2017. Retrieved 31 July 2018.
- ↑ Xiao, Nan (30 July 2018). "liftr: an R Package for Persistent Reproducible Research". https://nanx.me/talks/#talk-jsm-2018. Retrieved 31 July 2018.
- ↑ Turnbull, James (August 2018). "Documentation as a gateway to open source". https://increment.com/documentation/documentation-as-a-gateway-to-open-source/. Retrieved 10 August 2018.
- ↑ Vuorre, Matti; Curley, James (11 April 2018). "Curating Research Assets: A Tutorial on the Git Version Control System". Advances in Methods and Practices in Psychological Science https://doi.org/10.1177/2515245918754826.
- ↑ Sweeting, Alice (29 January 2019). "A little about me…". https://sportstatisticsrsweet.rbind.io/#about. Retrieved 29 January 2019.