The Human Mortality Database (HMD) includes the following types of data:
- Live birth counts,
- Death counts,
- Population size on January 1st,
- Population exposed to risk of death (period & cohort),
- Death rates (period & cohort), and
- Life tables (period & cohort).
Format of the data
- All HMD data files are tab-delimited text (ASCII) files.
- Files are organized by sex, age, and time.
- Population size is given for one-year¹ and five-year² age groups.³
- Deaths are given in the following formats:³
  - Lexis triangles (i.e., by age¹, birth cohort, and calendar year)
  - 1x1 (i.e., by age¹ and calendar year)
  - 5x1 (i.e., by 5-year² age group and calendar year).
- Exposure-to-risk, death rates, and life tables are given in similar formats of age and time. In all cases, period data are indexed by year of death, whereas cohort data (if available) are indexed by year of birth:
  - 1x1 (i.e., by age¹ and year)
  - 1x5 (i.e., by age¹ and 5-year time interval)
  - 1x10 (i.e., by age¹ and 10-year time interval)
  - 5x1 (i.e., by 5-year² age group and year)
  - 5x5 (i.e., by 5-year² age group and 5-year time interval)
  - 5x10 (i.e., by 5-year² age group and 10-year time interval).
¹ One-year age groups (or "by age") means 0, 1, 2, ..., 109, 110+.
² Five-year age groups means 0, 1-4, 5-9, 10-14, ..., 105-109, 110+. Age groups are defined in terms of completed age, so "5-9" extends from exact age 5 to just before the 10th birthday (sometimes written elsewhere as "5-10").
³ Some of these numbers are estimates (of population size or numbers of deaths), not actual counts, and therefore may be expressed as non-integers.
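As an illustration of the conventions above, the following Python sketch shows one way a reader might interpret age labels and numeric fields when loading a file. The function names and the column layout of the sample row are hypothetical and are not taken from any HMD file specification. The sketch only assumes what these notes state: fields are tab-delimited, age labels look like "0", "1-4", or "110+", values may be non-integers, and missing data are written as a single dot.

```python
from typing import List, Optional, Tuple

MISSING = "."  # these notes state that missing data are denoted by a single dot


def parse_age_group(label: str) -> Tuple[int, Optional[int]]:
    """Return (start_age, width_in_years); width is None for the open interval."""
    label = label.strip()
    if label.endswith("+"):            # open age interval, e.g. "110+"
        return int(label[:-1]), None
    if "-" in label:                   # completed ages, e.g. "5-9" covers 5 years
        low, high = label.split("-")
        return int(low), int(high) - int(low) + 1
    return int(label), 1               # single year of age, e.g. "0"


def parse_value(field: str) -> Optional[float]:
    """Parse a numeric field; estimates may be non-integers, "." means missing."""
    field = field.strip()
    if field == MISSING:
        return None
    return float(field)


def split_fields(line: str) -> List[str]:
    """Split one tab-delimited line into its raw string fields."""
    return line.rstrip("\r\n").split("\t")


# Hypothetical row: year, age group, death count, death rate (layout is assumed).
sample = "1950\t5-9\t1234.56\t."
year, age, deaths, rate = split_fields(sample)
print(int(year), parse_age_group(age), parse_value(deaths), parse_value(rate))
```

Returning the interval width alongside the starting age keeps the "5-9" label tied to its meaning of exact age 5 up to just before exact age 10.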
The following columns are included in each life table:
| Year | Year or range of years (for both period & cohort data) |
| Age | Age group for n-year interval from exact age x to just before exact age x+n, where n = 1, 4, 5, or ∞ (open age interval) |
| l(x) | Number of survivors at exact age x, assuming l(0) = 100,000 |
| d(x) | Number of deaths between exact ages x and x+n |
| q(x) | Probability of death between exact ages x and x+n |
| L(x) | Number of person-years lived between exact ages x and x+n |
| T(x) | Number of person-years remaining after exact age x |
| e(x) | Life expectancy at exact age x (in years) |
See the Methods Protocol (pp. 35-44) for more details about life table calculations.
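The column definitions above imply some standard life table identities: d(x) = l(x) · q(x), the next l value is l(x) minus d(x), T(x) is the sum of L from age x onward, and e(x) = T(x) / l(x). The Python sketch below applies these identities to made-up numbers. It is not the HMD procedure: the function name is hypothetical, q(x) and L(x) are taken as given inputs, and how HMD derives them is described only in the Methods Protocol.

```python
from typing import Dict, List


def derive_columns(q: List[float], L: List[float],
                   radix: float = 100000.0) -> Dict[str, List[float]]:
    """Derive l(x), d(x), T(x), e(x) from q(x) and L(x) using standard identities.

    q and L hold one value per age interval, youngest first. The radix is
    l(0) = 100,000, as stated in the column definitions.
    """
    if len(q) != len(L):
        raise ValueError("q and L must have the same length")

    l: List[float] = []
    d: List[float] = []
    survivors = radix
    for qx in q:
        l.append(survivors)
        deaths = survivors * qx        # d(x) = l(x) * q(x)
        d.append(deaths)
        survivors -= deaths            # survivors entering the next interval

    # T(x) is the sum of L over the current and all older intervals.
    T = [0.0] * len(L)
    running = 0.0
    for i in range(len(L) - 1, -1, -1):
        running += L[i]
        T[i] = running

    # e(x) = T(x) / l(x); guard against an interval with no survivors.
    e = [T[i] / l[i] if l[i] > 0 else float("nan") for i in range(len(l))]
    return {"l": l, "d": d, "T": T, "e": e}


# Toy three-interval table (invented numbers); q = 1.0 closes the open interval.
table = derive_columns(q=[0.1, 0.5, 1.0], L=[95000.0, 67500.0, 22500.0])
print(table["l"], table["e"])
```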
Important Notes:
- Missing data are denoted by a single dot ("."). Currently, there are two situations in which data are unavailable. First, cohort exposure and death rates are missing for ages attained outside the period covered by the HMD (see below for more details). Second, the national statistical organization may not publish death counts for certain years because of war or other reasons (e.g., Belgium 1914-18). When death counts are missing, all quantities that depend on those data (i.e., population estimates, exposure-to-risk, death rates, period and cohort life tables) are also missing.
- Deaths, population estimates, death rates, and life tables are provided by single years of age up to 109, with an open age interval for 110+. However, these data are sometimes derived from aggregated raw data (e.g., 5-year age groups, open age intervals), which have been split into single years of age using the methods described in the Methods Protocol. The original raw data that were extracted from published or unpublished sources are available from the HMD Input Database.
- For populations with territorial changes, two sets of population estimates are given for years in which a territorial change occurred. The first set of estimates (identified as year "19xx -") refers to the population just before the territorial change, while the second set (identified as year "19xx +") refers to the population just after the change. For example, in France, the data for "1914 -" cover the previous territory (i.e., as of December 31, 1913), while the data for "1914 +" reflect the territorial boundaries as of January 1, 1914.
- Cohort death rates (i.e., by year of birth) are provided if there are at least 30 consecutive calendar years of data for that cohort. For example, the mortality series for Sweden begins in 1751, so we can show death rates for the 1675 birth cohort for ages 76 and older. The cohort death rates at younger ages are shown as missing (denoted by "."). Similarly, if the mortality series ends in 2002, we can show death rates for the 1972 cohort up to age 29 because by December 31, 2002, everyone in that cohort has reached exact age 30. Mortality data for age 30, however, will remain incomplete until December 31, 2003.
- Cohort life tables are presented for a population if there is at least one cohort observed from birth until extinction (i.e., the date by which all cohort members are assumed to have died). In that case, life tables are provided for all extinct cohorts and for some almost-extinct cohorts as well (see the Methods Protocol, pp. 42-44).
- For period life tables, the central death rate m(x) is used to compute probabilities of death q(x). Although not given here, the values of m(x) below age 80 are by definition equal to the observed population death rate M(x) shown on each country page. At older ages, however, the number of deaths and the exposure-to-risk eventually become quite small, and thus observed death rates display considerable random variation. Therefore, we smooth the M(x) values for ages 80 and older and use these smoothed values to compute q(x) above a certain age (based on the number of observed deaths). For details, see the Methods Protocol (pp. 35-37). This procedure helps to avoid certain difficulties in period life table calculations at older ages that may be caused by: 1) extremely high death rates resulting from exposure being smaller than the number of deaths, 2) death rates of zero resulting from no deaths at an age where exposure is non-zero, and 3) undefined death rates at all ages where exposure is zero. For cohort life table calculations, such difficulties are not present.
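The two worked examples in the cohort death-rate note (Sweden's 1675 cohort from age 76, and the 1972 cohort up to age 29) follow from simple arithmetic on calendar years. The Python sketch below reproduces that arithmetic. The function name and the cap at age 110 are assumptions made for illustration, and the 30-consecutive-years requirement is deliberately not implemented.

```python
from typing import Optional, Tuple

OPEN_AGE = 110  # assumption: cap at the start of the 110+ open age interval


def complete_cohort_ages(birth_year: int, first_year: int,
                         last_year: int) -> Optional[Tuple[int, int]]:
    """Return (first_age, last_age) with complete cohort data, or None.

    A cohort born in birth_year passes through age x during calendar years
    birth_year + x and birth_year + x + 1, so both years must fall inside
    the series [first_year, last_year] for age x to be fully observed.
    """
    first_age = max(0, first_year - birth_year)
    last_age = min(OPEN_AGE, last_year - birth_year - 1)
    if last_age < first_age:
        return None                    # no age is fully observed
    return first_age, last_age


# Series assumed to run 1751-2002, matching the examples in the note.
print(complete_cohort_ages(1675, 1751, 2002))  # Sweden's 1675 cohort
print(complete_cohort_ages(1972, 1751, 2002))  # 1972 cohort, series ending 2002
```

Ages outside the returned range would appear as missing (".") in a cohort file under this reading of the note.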
Raw Data
The raw data and an explanation of the format of those data can be found on the Input Database page.