(A few remarks and tips before the practical session)
The most famous is the tidyverse ecosystem for data science:
There are packages for machine learning (Keras, TensorFlow), spatial packages (sf, stars), and packages specific to research fields (genomics, ecology, etc.). More than 23000 packages in total.
Quarto “authoring system” for writing automated reports, slides, PDF documents, etc. (our “Topic #4!”)
targets pipelining framework (possibly the most powerful and flexible of its kind)
tidyverse framework (particularly the dplyr R package introduced as “Topic 2/3”) is designed to facilitate building readable, easy-to-write processing pipelines
R itself is a very powerful, flexible programming language
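To give a feel for what a "readable, easy-to-write processing pipeline" looks like, here is a minimal sketch. It assumes the dplyr package is installed and uses the built-in mtcars dataset purely for illustration; the column name mean_mpg is made up for this example.

```r
library(dplyr)  # assumption: dplyr is installed

# Read the pipeline top to bottom, like a recipe:
result <- mtcars %>%
  filter(cyl == 4) %>%                 # keep only 4-cylinder cars
  group_by(gear) %>%                   # one group per number of gears
  summarise(
    mean_mpg = mean(mpg),              # average fuel efficiency per group
    n = n()                            # number of cars per group
  ) %>%
  arrange(desc(mean_mpg))              # most efficient group first

print(result)
```

Each step takes a table and returns a table, which is why the steps chain together so naturally.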
Some slides on “R as a calculator” (only half joking)
Then straight into plotting histograms and computing t-tests
R was first created “by statisticians for statisticians” (1991)
But teaching needs change in modern times:
Programming is a skill to practice, not a body of knowledge to be transferred
Teaching R in a lecture format would mean 3 hours of torture
A series of problems-solutions to develop understanding of:
Don’t think of it as just a text editor like Notepad.
It’s the starship Enterprise of data science at your fingertips. It’s incredibly powerful and has a lot of features.
This cheatsheet has a lot of information, but try to internalize the keyboard shortcuts that I highlighted in yellow in the PDF.
At first it will be annoying and slower to use the keyboard instead of the mouse, but trust me. It will pay off in the long run.
[…] the user enters expressions (rather than an entire [computer program]), the REPL evaluates them and displays the results […] – Wikipedia
An idea from ancient computers (1964!) with these functions:
1. Read the expression entered by the user (e.g. 1 + 2)
2. Evaluate it (call + on 1 and 2, yielding 3)
3. Print the result (3 on the screen)
Steps 1.-3. repeat in an infinite loop, until the program closes.
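The read, evaluate, print cycle can be imitated in a few lines of R. This is only a toy sketch to build intuition, not how the R console is actually implemented: instead of waiting for keyboard input, it loops over a fixed vector of example inputs (the variable names are made up for this example).

```r
# Pretend these lines were typed by the user, one after another
inputs <- c("1 + 2", "sqrt(16)")

for (line in inputs) {
  expr  <- parse(text = line)  # 1. read: turn the text into an R expression
  value <- eval(expr)          # 2. evaluate: compute its value
  print(value)                 # 3. print: show the result on the screen
}                              # loop: go back to step 1 for the next input
```

The real console does the same thing, except that it keeps waiting for new input until you quit R.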
R encourages a highly interactive workflow.
When I don’t understand something, some code I don’t get, etc., I always type it in the REPL to build an intuition.
Doing data analysis is like playing a detective, especially when figuring out bugs and problems.
Form a hypothesis, run a tiny bit of R code to test the hypothesis. Move forward based on the result you got.
I see a lot of experienced PhD students writing and running long code top-to-bottom, instead of thinking methodically.
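As a small illustration of the "form a hypothesis, test it with a tiny bit of code" habit: suppose a summary comes out as NA and you suspect missing values are the cause. The vector x below is made up for this example.

```r
x <- c(1, 2, NA)

# Hypothesis: the NA in the data is what breaks the mean
mean(x)                 # NA, so the hypothesis looks plausible

# Test: ignore missing values and see whether the problem goes away
mean(x, na.rm = TRUE)   # 1.5, hypothesis confirmed

# Follow-up question: how many values are actually missing?
sum(is.na(x))           # 1
```

Three one-liners in the console, each answering one question, are usually faster than rerunning a whole script.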
All languages (and their packages) have documentation, sure.
But it’s mostly scattered on the internet, often hard to find.
R packages have standardized documentation available inside R!
Every function func has a manual page, available via the command ?func
Each such help page describes:
ts_tajima() from my R package.)
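A few ways to reach the built-in documentation from the R console, sketched here with t.test and mean from base R as stand-in examples (any function from a loaded package works the same way):

```r
# Open the manual page of a function
?t.test

# The same thing written as a regular function call
h <- help("t.test")
h

# Quick look at a function's arguments without opening the help page
args(t.test)

# Run the examples from the bottom of a help page
example(mean)
```

In RStudio, the help page opens in the Help pane, so you can read it next to your code.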
In the RStudio menu, under Tools -> Global Options -> Pane Layout, set:
Maximum vertical space for code and easy switching between script and R console (particularly with keyboard shortcuts).