Commit c0de846

author
Quarto GHA Workflow Runner
committed
Built site for gh-pages
1 parent 0a03a8a commit c0de846

19 files changed

Lines changed: 14 additions & 14 deletions

.nojekyll

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -1 +1 @@
1-
972f695a
1+
d96edc82

data-applications.html

Lines changed: 9 additions & 9 deletions
@@ -852,27 +852,27 @@ <h1 class="title"><span id="sec-data-applications" class="quarto-section-identif
852852
<div class="workedexample">
853853
<p>Determine whether each variable in the <code>paralympic_1500</code> dataset is numerical or categorical. For numerical variables, further classify them as continuous or discrete. For categorical variables, determine if the variable is ordinal.</p>
854854
<hr>
855-
<p>The numerical variables in the dataset are <code>year</code> (discrete), and <code>time_min</code> (continuous). The categorical variables are <code>city</code>, <code>country_of_games</code>, <code>division</code>, <code>type</code>, <code>name</code>, and <code>country_of_athlete</code>. The <code>time</code> variable is trickier to classify – we can think of it as numerical, but it is classified as categorical. The categorical classification is due to the colon <code>:</code> which separates the hours from the seconds. Sometimes the data dictionary (presented in <a href="#tbl-paralympic-var-def" class="quarto-xref">Table&nbsp;<span>3.3</span></a>) isn’t sufficient for a complete analysis, and we need to go back to the data source and try to understand the data better before we can proceed with the analysis meaningfully.</p>
855+
<p>The numerical variables in the dataset are <code>year</code> (discrete), and <code>time_min</code> (continuous). The categorical variables are <code>city</code>, <code>country_of_games</code>, <code>division</code>, <code>type</code>, <code>name</code>, and <code>country_of_athlete</code>. The <code>time</code> variable is trickier to classify – we can think of it as numerical, but it is classified as categorical. The categorical classification is due to the colon <code>:</code> which separates the minutes from the seconds. Sometimes the data dictionary (presented in <a href="#tbl-paralympic-var-def" class="quarto-xref">Table&nbsp;<span>3.3</span></a>) isn’t sufficient for a complete analysis, and we need to go back to the data source and try to understand the data better before we can proceed with the analysis meaningfully.</p>
856856
</div>
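The worked example above notes that `time` is stored as text because of the colon separating minutes from seconds, while `time_min` is numerical. As a rough sketch of how one representation could be turned into the other, the snippet below parses a `minutes:seconds` string into decimal minutes. The function name `time_to_minutes` and the sample strings are hypothetical illustrations, not values or code taken from the `paralympic_1500` dataset, and the book's own pipeline may well differ.

```python
def time_to_minutes(time_str: str) -> float:
    """Convert a 'minutes:seconds' string (e.g. '3:30.0') to decimal minutes.

    Assumes exactly one colon, with whole minutes before it and
    (possibly fractional) seconds after it.
    """
    minutes, seconds = time_str.split(":")
    return int(minutes) + float(seconds) / 60


# Hypothetical example strings, not rows from the dataset.
sample_times = ["3:30.0", "4:15.0"]
for t in sample_times:
    print(t, "->", time_to_minutes(t))
```

Once converted this way, the value behaves like a continuous numerical variable, which is why `time_min` is classified differently from `time`.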
857857
<p>Next, let’s try to get to know each variable a little bit better. For categorical variables, this involves figuring out what their levels are and how commonly represented they are in the data. <a href="#fig-paralympic-cat" class="quarto-xref">Figure&nbsp;<span>3.1</span></a> shows the distributions of two of the categorical variables in this dataset. We can see that the United States has hosted the Games most often, but runners from Great Britain and Kenya have won the 1500m most often. There are a large number of countries who have had a single gold medal winner of the 1500m. Similarly, there are a large number of countries who have hosted the Games only once. Over the last century, the name describing the country for athletes from one particular region has changed and includes Russian Federation, Unified Team, and Russian Paralympic Committee. Both of the visualizations are bar plots, which you will learn more about in <a href="explore-categorical.html" class="quarto-xref"><span>Chapter 4</span></a>.</p>
858858
<p>Similarly, we can examine the distributions of the numerical variables as well. We already know that the 1500m times are mostly between 3.5min and 4.5min, based on <a href="#tbl-paralympic-df-tail" class="quarto-xref">Table&nbsp;<span>3.1</span></a> and <a href="#tbl-paralympic-df-head" class="quarto-xref">Table&nbsp;<span>3.2</span></a>. We can break down the 1500m time by division and type of race. <a href="#tbl-paralympic-summary" class="quarto-xref">Table&nbsp;<span>3.4</span></a> shows the mean, minimum, and maximum 1500m times broken down by division and race type. Recall that the Men’s Olympic division has taken place since 1896, whereas the Men’s Paralympic division has happened only since 1960. The maximum race time, therefore, should be taken into context in terms of the year of the Games.</p>
859859
<div id="fig-paralympic-cat" class="quarto-layout-panel">
860860
<figure class="quarto-float quarto-float-fig figure"><div aria-describedby="fig-paralympic-cat-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
861861
<div class="quarto-layout-row">
862862
<div class="cell-output-display quarto-layout-cell-subref quarto-layout-cell" data-ref-parent="fig-paralympic-cat" style="flex-basis: 50.0%;justify-content: flex-start;">
863-
<div id="fig-paralympic-cat-1" class="quarto-float quarto-figure quarto-figure-center anchored" alt="Two separate bar plots. The left panel shows a bar plot counting the number of gold medal athletes from each country. Great Britain has had 8 top finishers, Kenya has had 7 top finishers, and Tunisia and Algeria have both had 5. The right panel shows a bar plot counting the number of Games which have happened in each country. The USA has hosted 4 Games, the UK has hosted 3 Games, and each of Japan, Greece, Germany, France, and Australia have hosted the Games twice. ">
863+
<div id="fig-paralympic-cat-1" class="quarto-float quarto-figure quarto-figure-center anchored" alt="Two separate bar plots. The left panel shows a bar plot counting the number of gold medal athletes from each country. Great Britain has had 8 top finishers, Kenya has had 7 top finishers, and Tunisia and Algeria have both had 5. The right panel shows a bar plot counting the number of Games which have happened in each country. The USA has hosted 4 Games, the UK has hosted 3 Games, and each of Japan, Greece, Germany, France, and Australia have hosted the Games twice. ">
864864
<figure class="quarto-float quarto-subfloat-fig figure"><div aria-describedby="fig-paralympic-cat-1-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
865-
<a href="data-applications_files/figure-html/fig-paralympic-cat-1.png" class="lightbox" data-gallery="fig-paralympic-cat" title="Figure&nbsp;3.1&nbsp;(a): Country of origin of the athlete"><img src="data-applications_files/figure-html/fig-paralympic-cat-1.png" class="img-fluid figure-img" style="width:90.0%" data-ref-parent="fig-paralympic-cat" alt="Two separate bar plots. The left panel shows a bar plot counting the number of gold medal athletes from each country. Great Britain has had 8 top finishers, Kenya has had 7 top finishers, and Tunisia and Algeria have both had 5. The right panel shows a bar plot counting the number of Games which have happened in each country. The USA has hosted 4 Games, the UK has hosted 3 Games, and each of Japan, Greece, Germany, France, and Australia have hosted the Games twice. "></a>
865+
<a href="data-applications_files/figure-html/fig-paralympic-cat-1.png" class="lightbox" data-gallery="fig-paralympic-cat" title="Figure&nbsp;3.1&nbsp;(a): Country of origin of the athlete"><img src="data-applications_files/figure-html/fig-paralympic-cat-1.png" class="img-fluid figure-img" style="width:90.0%" data-ref-parent="fig-paralympic-cat" alt="Two separate bar plots. The left panel shows a bar plot counting the number of gold medal athletes from each country. Great Britain has had 8 top finishers, Kenya has had 7 top finishers, and Tunisia and Algeria have both had 5. The right panel shows a bar plot counting the number of Games which have happened in each country. The USA has hosted 4 Games, the UK has hosted 3 Games, and each of Japan, Greece, Germany, France, and Australia have hosted the Games twice. "></a>
866866
</div>
867867
<figcaption class="quarto-float-caption-bottom quarto-subfloat-caption quarto-subfloat-fig" id="fig-paralympic-cat-1-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
868868
(a) Country of origin of the athlete
869869
</figcaption></figure>
870870
</div>
871871
</div>
872872
<div class="cell-output-display quarto-layout-cell-subref quarto-layout-cell" data-ref-parent="fig-paralympic-cat" style="flex-basis: 50.0%;justify-content: flex-start;">
873-
<div id="fig-paralympic-cat-2" class="quarto-float quarto-figure quarto-figure-center anchored" alt="Two separate bar plots. The left panel shows a bar plot counting the number of gold medal athletes from each country. Great Britain has had 8 top finishers, Kenya has had 7 top finishers, and Tunisia and Algeria have both had 5. The right panel shows a bar plot counting the number of Games which have happened in each country. The USA has hosted 4 Games, the UK has hosted 3 Games, and each of Japan, Greece, Germany, France, and Australia have hosted the Games twice. ">
873+
<div id="fig-paralympic-cat-2" class="quarto-float quarto-figure quarto-figure-center anchored" alt="Two separate bar plots. The left panel shows a bar plot counting the number of gold medal athletes from each country. Great Britain has had 8 top finishers, Kenya has had 7 top finishers, and Tunisia and Algeria have both had 5. The right panel shows a bar plot counting the number of Games which have happened in each country. The USA has hosted 4 Games, the UK has hosted 3 Games, and each of Japan, Greece, Germany, France, and Australia have hosted the Games twice. ">
874874
<figure class="quarto-float quarto-subfloat-fig figure"><div aria-describedby="fig-paralympic-cat-2-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
875-
<a href="data-applications_files/figure-html/fig-paralympic-cat-2.png" class="lightbox" data-gallery="fig-paralympic-cat" title="Figure&nbsp;3.1&nbsp;(b): Country in which the Games took place"><img src="data-applications_files/figure-html/fig-paralympic-cat-2.png" class="img-fluid figure-img" style="width:90.0%" data-ref-parent="fig-paralympic-cat" alt="Two separate bar plots. The left panel shows a bar plot counting the number of gold medal athletes from each country. Great Britain has had 8 top finishers, Kenya has had 7 top finishers, and Tunisia and Algeria have both had 5. The right panel shows a bar plot counting the number of Games which have happened in each country. The USA has hosted 4 Games, the UK has hosted 3 Games, and each of Japan, Greece, Germany, France, and Australia have hosted the Games twice. "></a>
875+
<a href="data-applications_files/figure-html/fig-paralympic-cat-2.png" class="lightbox" data-gallery="fig-paralympic-cat" title="Figure&nbsp;3.1&nbsp;(b): Country in which the Games took place"><img src="data-applications_files/figure-html/fig-paralympic-cat-2.png" class="img-fluid figure-img" style="width:90.0%" data-ref-parent="fig-paralympic-cat" alt="Two separate bar plots. The left panel shows a bar plot counting the number of gold medal athletes from each country. Great Britain has had 8 top finishers, Kenya has had 7 top finishers, and Tunisia and Algeria have both had 5. The right panel shows a bar plot counting the number of Games which have happened in each country. The USA has hosted 4 Games, the UK has hosted 3 Games, and each of Japan, Greece, Germany, France, and Australia have hosted the Games twice. "></a>
876876
</div>
877877
<figcaption class="quarto-float-caption-bottom quarto-subfloat-caption quarto-subfloat-fig" id="fig-paralympic-cat-2-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
878878
(b) Country in which the Games took place
@@ -983,9 +983,9 @@ <h1 class="title"><span id="sec-data-applications" class="quarto-section-identif
983983
<p>Let’s start by considering how the 1500m gold medal race times have changed over the years. <a href="#fig-paralympic-ungrouped" class="quarto-xref">Figure&nbsp;<span>3.3</span></a> shows a scatterplot describing 1500m race times and year for Men’s Olympic and Paralympic (T11) athletes with a line of best fit (to the entire dataset) superimposed (see <a href="model-slr.html" class="quarto-xref"><span>Chapter 7</span></a> where we will present fitting a line to a scatterplot). Notice that the line of best fit shows a <em>positive</em> relationship between race time and year. That is, for later years, the predicted gold medal time is higher than in earlier years.</p>
984984
<div class="cell">
985985
<div class="cell-output-display">
986-
<div id="fig-paralympic-ungrouped" class="quarto-float quarto-figure quarto-figure-center anchored" alt="A scatterplot with year on the x-axis and gold medal 1500m time on the y-axis. A line of best fit is drawn over the points. ">
986+
<div id="fig-paralympic-ungrouped" class="quarto-float quarto-figure quarto-figure-center anchored" alt="A scatterplot with year on the x-axis and gold medal 1500m time on the y-axis. A line of best fit is drawn over the points. ">
987987
<figure class="quarto-float quarto-float-fig figure"><div aria-describedby="fig-paralympic-ungrouped-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
988-
<a href="data-applications_files/figure-html/fig-paralympic-ungrouped-1.png" class="lightbox" data-gallery="quarto-lightbox-gallery-4" title="Figure&nbsp;3.3: 1500m race time for Men’s Olympic and Paralympic (T11) athletes. The line represents a line of best fit to the entire dataset."><img src="data-applications_files/figure-html/fig-paralympic-ungrouped-1.png" class="img-fluid figure-img" style="width:90.0%" alt="A scatterplot with year on the x-axis and gold medal 1500m time on the y-axis. A line of best fit is drawn over the points. "></a>
988+
<a href="data-applications_files/figure-html/fig-paralympic-ungrouped-1.png" class="lightbox" data-gallery="quarto-lightbox-gallery-4" title="Figure&nbsp;3.3: 1500m race time for Men’s Olympic and Paralympic (T11) athletes. The line represents a line of best fit to the entire dataset."><img src="data-applications_files/figure-html/fig-paralympic-ungrouped-1.png" class="img-fluid figure-img" style="width:90.0%" alt="A scatterplot with year on the x-axis and gold medal 1500m time on the y-axis. A line of best fit is drawn over the points. "></a>
989989
</div>
990990
<figcaption class="quarto-float-caption-bottom quarto-float-caption quarto-float-fig" id="fig-paralympic-ungrouped-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
991991
Figure&nbsp;3.3: 1500m race time for Men’s Olympic and Paralympic (T11) athletes. The line represents a line of best fit to the entire dataset.
@@ -1016,8 +1016,8 @@ <h1 class="title"><span id="sec-data-applications" class="quarto-section-identif
10161016
<p>In this case study, we introduced you to the very first steps a data scientist takes when they start working with a new dataset. In the next few chapters, we will introduce exploratory data analysis, and you’ll learn more about the various types of data visualizations and summary statistics you can make to get to know your data better.</p>
10171017
<p>Before you move on, we encourage you to think about whether the following questions can be answered with this dataset, and if yes, how you might go about answering them? It’s okay if your answer is “I’m not sure”, we simply want to get your exploratory juices flowing to prime you for what’s to come!</p>
10181018
<ol type="1">
1019-
<li>Has there every been a year when a visually impaired paralympic gold medal athlete beat the Olympic gold medal athlete?</li>
1020-
<li>When comparing the paralympic and Olympic 1500m gold medal athletes, does Simpson’s paradox hold in the Women’s division?</li>
1019+
<li>Has there ever been a year when a visually impaired Paralympic gold medal athlete beat the Olympic gold medal athlete?</li>
1020+
<li>When comparing the Paralympic and Olympic 1500m gold medal athletes, does Simpson’s paradox hold in the Women’s division?</li>
10211021
<li>Is there a biological boundary which establishes a time under which no human could run 1500m?</li>
10221022
</ol></section><section id="sec-data-tutorials" class="level2" data-number="3.3"><h2 data-number="3.3" class="anchored" data-anchor-id="sec-data-tutorials">
10231023
<span class="header-section-number">3.3</span> Interactive R tutorials</h2>
-28.4 KB
Binary file not shown.
-35.4 KB
Binary file not shown.
-23.6 KB
Binary file not shown.
-20.9 KB
Binary file not shown.
-17.7 KB
Binary file not shown.
