The assignment is due on Monday the 23rd of May at 9:00am, Canberra time (the beginning of semester week 12). Like all other due dates, this deadline is hard: late submissions will NOT be accepted.
For this assignment, you may work in groups of up to three students. Working in larger groups (more than three students) is not allowed. If there is an indication of four or more students working together, or sharing parts of solutions, all students involved will have to be investigated for possible plagiarism.
A group sign-up activity on Wattle is available until the 9th of May at 9:00am (Monday of semester week 10). If you intend to work in a group, you and your teammates should find a free group and add yourselves to it. (The group numbers have no meaning; we only care about which students are in the same group.)
Working in a group is not mandatory. If you want to do the assignment on your own, please add yourself to the “I want to do the assignment on my own” group, so that we can keep track. Remember that deadline extensions can only ever be given to individuals, not to groups. If you choose to work in a group, it is your responsibility to organise your work so that it cannot be held up by the unexpected absence of one group member.
Each student must submit two files:
- Their code (a Python file). Students working in a group must all submit identical code files.
- An individual report, with answers to a set of questions. Details about the format of the report and the questions are in the section Questions for the individual report below.
Data and files provided
The data for global COVID-19 vaccinations by countries are stored in a CSV file:
(Please note that when marking we may test your code with this and other CSV files.)
The file has a header line with names of the columns (see below). The following lines contain actual data. The meanings of the columns are:
location: name of the country (or region within a country).
iso_code: ISO 3166-1 alpha-3 – three-letter country codes.
date: date of the observation.
total_vaccinations: total number of doses administered. For vaccines that require multiple doses, each individual dose is counted. If a person receives one dose of the vaccine, this metric goes up by 1. If they receive a second dose, it goes up by 1 again. If they receive a third/booster dose, it goes up by 1 again.
people_vaccinated: total number of people who received at least one vaccine dose. If a person receives the first dose of a 2-dose vaccine, this metric goes up by 1. If they receive the second dose, the metric stays the same.
people_fully_vaccinated: total number of people who received all doses prescribed by the initial vaccination protocol. If a person receives the first dose of a 2-dose vaccine, this metric stays the same. If they receive the second dose, the metric goes up by 1.
total_boosters: total number of COVID-19 vaccination booster doses administered (doses administered beyond the number prescribed by the initial vaccination protocol).
daily_vaccinations_raw: daily change in the total number of doses administered. It is only calculated for consecutive days. This is a raw measure provided for data checks and transparency, but it is strongly recommended that any analysis on daily vaccination rates be conducted using daily_vaccinations instead.
daily_vaccinations: new doses administered per day (smoothed out over a 7-day period). For countries that don’t report data on a daily basis, it is assumed that doses changed equally on a daily basis over any periods in which no data was reported. This produces a complete series of daily figures, which is then averaged over a rolling 7-day window. An example of this calculation can be found here.
total_vaccinations_per_hundred: total_vaccinations per 100 people in the total population of the country.
people_vaccinated_per_hundred: people_vaccinated per 100 people in the total population of the country.
people_fully_vaccinated_per_hundred: people_fully_vaccinated per 100 people in the total population of the country.
total_boosters_per_hundred: total_boosters per 100 people in the total population of the country.
daily_vaccinations_per_million: daily_vaccinations per 1,000,000 people in the total population of the country.
daily_people_vaccinated: daily number of people receiving a first COVID-19 vaccine dose (7-day smoothed).
daily_people_vaccinated_per_hundred: daily_people_vaccinated per 100 people in the total population of the country.
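The smoothing described for daily_vaccinations can be sketched roughly as follows. This is only an illustration of the idea as stated in the column description (fill unreported days by spreading the change evenly, then average over a rolling window); it is not the exact Our World in Data procedure. The function names, the use of integer day indices, and the choice to average over fewer values at the start of the series are all assumptions made for this example.

```python
def interpolate_totals(reported):
    """Fill gaps between reported cumulative totals, assuming the total
    changed by the same amount on every unreported day.

    `reported` maps a day index (int) to the cumulative total on that day.
    """
    days = sorted(reported)
    filled = []
    for start, end in zip(days, days[1:]):
        step = (reported[end] - reported[start]) / (end - start)
        for day in range(start, end):
            filled.append(reported[start] + step * (day - start))
    filled.append(reported[days[-1]])
    return filled


def daily_changes(totals):
    """Day-to-day differences of a complete series of cumulative totals."""
    return [later - earlier for earlier, later in zip(totals, totals[1:])]


def rolling_mean(values, window=7):
    """Average of each value with up to `window - 1` values before it.

    Assumption: early entries average over however many values exist.
    """
    result = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


# Hypothetical totals: reports on days 0, 2 and 3; day 1 is missing.
reported = {0: 0, 2: 200, 3: 400}
totals = interpolate_totals(reported)
smoothed = rolling_mean(daily_changes(totals), window=2)
print(totals, smoothed)
```

A short window is used in the usage lines only so that the tiny made-up series shows the effect; the column description specifies a 7-day window.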
As an example, let’s look at this one line:
This means that in Australia until April 11th 2022:
56,805,008 doses of vaccinations have been given.
22,241,967 people have been given at least one dose.
21,396,664 people have been fully vaccinated.
13,166,377 booster doses have been given.
53,455 new doses were given during that one day, i.e., total_vaccinations on 2022-04-11 minus total_vaccinations on 2022-04-10.
an average of 38,931 per day over the last 7 days until that date.
220.28 vaccine doses per 100 people.
86.25 people vaccinated with at least one dose per 100 people.
82.97 people fully vaccinated per 100 people.
51.06 booster doses per 100 people.
an average of 1,510 vaccine doses per day per 1 million people (the 7-day smoothed daily_vaccinations figure scaled to the population).
an average of 2,825 people receiving the first vaccine dose per day during the last 7 days until that date.
an average of 0.011 people per 100 people receiving the first vaccine dose per day during the last 7 days until that date.
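The relationships between these figures can be checked with a little arithmetic. The sketch below uses only the numbers quoted in the list above. The population is not given in the data, so it is estimated here by inverting total_vaccinations_per_hundred; that estimate, and the helper names, are assumptions of this example, and the results agree with the quoted values only up to rounding.

```python
# Figures quoted above for Australia on 2022-04-11.
total_vaccinations = 56_805_008
people_vaccinated = 22_241_967
people_fully_vaccinated = 21_396_664
total_boosters = 13_166_377
daily_vaccinations_raw = 53_455
daily_vaccinations = 38_931
daily_people_vaccinated = 2_825
total_vaccinations_per_hundred = 220.28

# Assumption: estimate the population from the per-hundred figure.
population = total_vaccinations / total_vaccinations_per_hundred * 100


def per_hundred(count):
    """Scale a count to 'per 100 people' using the estimated population."""
    return count / population * 100


def per_million(count):
    """Scale a count to 'per 1,000,000 people' using the estimated population."""
    return count / population * 1_000_000


# daily_vaccinations_raw is the difference between consecutive totals,
# so the total on 2022-04-10 is implied by the two quoted values.
previous_total = total_vaccinations - daily_vaccinations_raw

print(round(per_hundred(people_vaccinated), 2))        # compare with 86.25
print(round(per_hundred(people_fully_vaccinated), 2))  # compare with 82.97
print(round(per_hundred(total_boosters), 2))           # compare with 51.06
print(round(per_million(daily_vaccinations)))          # compare with 1,510
print(round(per_hundred(daily_people_vaccinated), 3))  # compare with 0.011
```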
The people_vaccinated and people_fully_vaccinated figures depend on countries publishing the necessary data, so these metrics may not be available for some countries.
This dataset includes some subnational locations and international aggregates (World, continents, European Union…). They can be identified by an iso_code that starts with OWID_.
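One possible way to separate such entries from ordinary countries is to test the iso_code prefix. In the sketch below, the function name and the sample rows are made up for illustration (the rows are not taken from the provided file), and rows are assumed to be dictionaries keyed by column name.

```python
AGGREGATE_PREFIX = "OWID_"


def is_aggregate(row):
    """True if the row's iso_code marks a non-country entry."""
    return row["iso_code"].startswith(AGGREGATE_PREFIX)


# Illustrative rows only; real rows contain all the columns listed above.
sample_rows = [
    {"location": "Australia", "iso_code": "AUS"},
    {"location": "World", "iso_code": "OWID_WRL"},
    {"location": "Europe", "iso_code": "OWID_EUR"},
]

countries = [row["location"] for row in sample_rows if not is_aggregate(row)]
aggregates = [row["location"] for row in sample_rows if is_aggregate(row)]
print(countries, aggregates)
```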
If you’re interested, you can find the original data files on the GitHub repository by Our World in Data.
For information and examples of how to read and process CSV files, see Lab 6.
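As a rough reminder of the general approach (not a replacement for the lab material), the sketch below reads rows by column name using the standard csv module. The sample text, the reduced set of columns, and the helper names are assumptions of this example; in particular, it assumes that a metric that is not available appears as an empty field, which you should check against the provided file.

```python
import csv
import io

# Illustrative sample with only a few of the columns described above.
SAMPLE_CSV = """location,iso_code,date,total_vaccinations,people_vaccinated
Australia,AUS,2022-04-10,56751553,
Australia,AUS,2022-04-11,56805008,22241967
"""


def to_number(text):
    """Convert a field to float, or None if the field is empty (assumed
    to mean that the value is not available)."""
    return float(text) if text.strip() else None


def read_rows(file_obj):
    """Read all data lines as dictionaries keyed by the header names."""
    return list(csv.DictReader(file_obj))


def load_rows(path):
    """Read rows from a CSV file on disk; `path` is supplied by the caller."""
    with open(path, newline="", encoding="utf-8") as csv_file:
        return read_rows(csv_file)


rows = read_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row["location"], row["date"],
          to_number(row["total_vaccinations"]),
          to_number(row["people_vaccinated"]))
```

Reading by header name rather than by column position means the code keeps working if the columns appear in a different order in another CSV file.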