Chap 50 Exercises

\[ \newcommand{\dnorm}{\text{dnorm}} \newcommand{\pnorm}{\text{pnorm}} \newcommand{\recip}{\text{recip}} \]

Exercise 1 The functions dunif(), dnorm(), and dexp() implement, respectively, the uniform, Gaussian, and exponential families of distributions. The word “family” is used because each family has its own parameters:

  • Uniform: min and max
  • Gaussian: mean and sd
  • Exponential: rate (with the exponential function parameterized as \(\exp\left(-\text{rate}\cdot t\right)\))

Pick 3 very different sets of parameters for each family. (They should be meaningful, for instance sd \(>0\) and rate \(>0\).)

  1. Numerically integrate each of the 9 distributions to confirm that the total probability is 1 for each.
Active R chunk 1
  1. Compute the expectation value of each of the 9 distributions.
Active R chunk 2
  1. Compute the variance of each of the 9 distributions.
Active R chunk 3
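
One way to set up these three calculations is sketched below in base R, using integrate() rather than the R/mosaic tools the chunks may expect. The helper name density_summary() and the three parameter sets (one per family) are illustrative assumptions, not part of the exercise. The Gaussian is integrated over mean \(\pm 10\)sd rather than the whole number line, which keeps the numerical integration well behaved while leaving out only a negligible amount of probability.

```r
# Total probability, expectation value, and variance of a density f,
# computed by numerical integration over [lower, upper].
density_summary <- function(f, lower, upper) {
  total <- integrate(f, lower, upper)$value
  mu <- integrate(function(x) x * f(x), lower, upper)$value
  v <- integrate(function(x) (x - mu)^2 * f(x), lower, upper)$value
  c(total = total, mean = mu, variance = v)
}

# One arbitrary member of each family (parameter values are assumptions).
unif_ex <- density_summary(function(x) dunif(x, min = 2, max = 6), 2, 6)
norm_ex <- density_summary(function(x) dnorm(x, mean = 10, sd = 2), -10, 30)
exp_ex  <- density_summary(function(x) dexp(x, rate = 0.5), 0, Inf)

rbind(unif_ex, norm_ex, exp_ex)
```

The same helper can be reused for the remaining parameter sets by changing the density and the integration limits.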

Exercise 2 Plot out dnorm() on semi-log axes. You can choose your own mean and sd, and your graphics domain should cover at least mean \(\pm 3\)sd.
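
A minimal base-R sketch of such a plot, assuming mean = 0, sd = 1, and a domain of mean \(\pm 4\)sd (all arbitrary choices made for illustration):

```r
mu <- 0   # assumed mean
s  <- 1   # assumed standard deviation

# Evaluate the density on a grid covering mean +/- 4 sd.
x <- seq(mu - 4 * s, mu + 4 * s, length.out = 201)
y <- dnorm(x, mean = mu, sd = s)

# log = "y" puts the vertical axis on a logarithmic scale (semi-log axes).
plot(x, y, type = "l", log = "y",
     xlab = "x", ylab = "dnorm(x), log scale")
```

A logarithmic axis can only display strictly positive values, so the plot is drawn without warnings only if every evaluated density value is greater than zero.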

  1. Describe the shape of the function graph on semi-log axes.

question id: snake-hurt-candy-1

  1. Explain why dnorm(x) will never be zero for any finite x.

question id: snake-hurt-candy-2

Exercise 3  

  1. The function \(e^{-k x}\) can be thought of as a relative density function, but \(\sin(x)\) cannot. Why?

question id: seahorse-catch-cotton-1

  1. The exponential probability function dexp() is a scaled version of \(e^{-kx}\). Find, symbolically, the scalar by which \(e^{-k x}\) must be multiplied to turn it into a probability density function.

\(e^1\)

\(e^{-1}\)

\(k\)

\(k^{-1}\)

None of these.

question id: seahorse-catch-cotton

Exercise 4 Count the grid squares under the probability density functions in XREF not implemented yet and use the fact that the total area under a probability density function is always 1 to estimate the area of a single grid square in each graph. Make sure to give units, if any.

question id: oak-hurt-ring-1

Exercise 5 The cumulative distribution translates the probability density into an actual probability (a number between zero and one). Formally, the cumulative distribution is \[P(t) \equiv \int_{-\infty}^t p(s)\, ds\]

Active R chunk 4

Active R chunk 4 plots the cumulative probability function of dexp(t, rate = 1/100), corresponding to the probability of a 100-year storm. Evaluating \(P(t)\) at a given value of \(t\) gives a probability. For instance, for the exponential density function with rate = 0.01, \(P(10) \approx 0.095\), roughly 10%. In terms of storms, this means that according to the standard model of these things, the time between consecutive 100-year storms has about a 10% chance of being 10 years or less!
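
The quoted value of \(P(10)\) can be checked by carrying out the integral numerically. The sketch below uses base R's integrate() and starts the integral at 0 rather than \(-\infty\), since dexp() is zero for negative inputs. The function name P_storm is an assumption made for illustration.

```r
rate <- 1 / 100  # one storm per 100 years, on average

# Cumulative probability: integrate the density from 0 up to t.
P_storm <- function(t) {
  integrate(function(s) dexp(s, rate = rate), lower = 0, upper = t)$value
}

P_storm(10)             # about 0.095, the value quoted in the text
pexp(10, rate = rate)   # R's built-in exponential cumulative, for comparison
```

Evaluating P_storm() at other values of t gives the probability that the gap between consecutive storms is t years or less.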

  1. Imagine that a 100-year storm has just happened at your location. What is the probability that the next 100-year storm will happen within 50 years?

11%

27%

39%

51%

question id: goat-sit-knob-1

  1. The median time between 100-year storms is the value where there is a 50% probability that consecutive storms will happen closer in time than this value and 50% that consecutive storms will happen further apart than this value. What is the median time between 100-year storms, according to the standard model? (Hint: You can read this off the graph.)

about 30 years

50 years

about 70 years

100 years

about 130 years

question id: goat-sit-knob-2

Exercise 6  

Active R chunk 5

The code in Active R chunk 5 creates the anti-derivative of a particular probability density function. The cumulative probability function at any x will be P(x) - P(-Inf).

Plot out the cumulative probability function P(x) - P(-Inf) over the domain \(x \in [-5,5]\). The shape of the cumulative should remind you of another basic modeling function. Plot out that other function with appropriate parameters to compare the two. What do you find?

question id: goat-sit-knob2-1

Exercise 7 In the Social Security life-table M2014F, one column is nliving. The nliving variable is computed by tracking the age-specific mortality rate as it plays out in a hypothetical population of 100,000 newborns. The age-specific mortality rate at age 0 is applied to the 100,000 to calculate the number of deaths in the first year: 531. Therefore 99,469 survive to age 1. Then the age-specific mortality rate at age 1 is applied to the 99,469 survivors to calculate the number of deaths of one-year-olds: 34. This leaves 99,434 surviving two-year-olds. (There is round-off error involved, which is why the number is not 99,435.) The process is continued up through age 120, at which point there are no survivors.
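
The bookkeeping described above can be sketched in a few lines of base R. The two mortality rates below are not taken from M2014F; they are back-calculated from the death counts quoted in the paragraph (531 of 100,000 and 34 of 99,469), so this is only an illustration of the process. Because of that back-calculation, the sketch yields 99,435 survivors at age 2 rather than the table's 99,434.

```r
# Age-specific mortality rates at ages 0 and 1, back-calculated from the
# death counts quoted in the text (an assumption, not the table's values).
mortality <- c(531 / 100000, 34 / 99469)

# Start with 100,000 newborns; each year the survivors are multiplied by
# (1 - mortality rate at that age).
nliving <- 100000 * cumprod(c(1, 1 - mortality))

round(nliving)   # survivors at ages 0, 1, and 2
```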

The following R code constructs from M2014F a function died_before(age) giving the fraction of the cohort of 100,000 who died at or before the given age.

  1. Plot out died_before(age) vs age. Explain what you see in the graph that tells you that this is a cumulative probability function.

question id: seal-tug-mattress-1

To calculate life-expectancy, we need to convert died_before(age) into died_at(age), the probability density of death at any given age. Use R/mosaic to construct died_at(age), which will be a basic calculus transformation of died_before().

  1. What are the units of the output of the died_at(age) function?

No units

year

\(\text{year}^{-1}\)

age

question id: seal-tug-mattress-2

Find the expectation value of age under the probability density died_at(age). This is called the life-expectancy at birth: the average number of years of life of the people in the imaginary cohort of 100,000.

  1. What is the life-expectancy at birth to judge from the M2014F data?

73 years

77 years

81 years

85 years

question id: seal-tug-mattress-3

Activities

Exercise 8 Exponential distributions are self-similar. Look at ?fig-exponential-density and assume that \(1/k = 100\) days. According to the density function, the probability of an event happening in the first 100 days is 63.2%. Of course, that means there is a 36.8% chance that the event will happen after the 100-day mark. If the event does not happen in the first 100 days, there is a 63.2% chance that it will happen in the interval 100-200 days. Similarly, if the event does not happen in the first 200 days, there is a 63.2% chance that it will happen in the interval 200-300 days. Use these facts to calculate the probability mass in each of these intervals:

  • 0-100 days
  • 100-200 days
  • 200-300 days
  • 300-400 days

Hint: Make sure that the sum of these probability masses does not exceed 1.
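
The two facts quoted in the exercise can be confirmed numerically before they are used. The sketch below relies on pexp(), R's built-in cumulative function for the exponential distribution; the variable names are assumptions made for illustration.

```r
rate <- 1 / 100  # events per day, so that 1/k = 100 days

# Probability that the event happens in the first 100 days.
p_first <- pexp(100, rate = rate)

# Probability that the event happens in days 100-200, GIVEN that it has
# not happened in the first 100 days (a conditional probability).
p_given_survived <- (pexp(200, rate = rate) - pexp(100, rate = rate)) /
  (1 - pexp(100, rate = rate))

round(c(p_first, p_given_survived), 3)   # both 0.632
```

The conditional probability is not the same thing as the probability mass in the interval; the mass also has to account for the chance of reaching day 100 in the first place.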

Exercise 9 Calculate symbolically the expectation value and variance of the uniform distribution with parameters \(a\) and \(b\):

\[\text{unif}(x, a, b) \equiv \left\{{\Large\strut}\begin{array}{cl}\frac{1}{b-a}& \text{for}\ a \leq x \leq b\\0& \text{otherwise} \end{array}\right.\]

Exercise 10 The code in Active R chunk 6 will construct a function, prob_death60(age), that gives the probability that a person reaching her 60th birthday will die at any given age. (The function is constructed from US Social Security Administration data for females in 2014.)

Active R chunk 6
  1. The “life expectancy at age 60” is the expectation value for the number of years of additional life for a person who reaches age 60. (The number of years of additional life is age - 60.) Compute the life-expectancy at age 60 based on the prob_death60(age) function.
  1. A more technically descriptive name for life-expectancy would be “expectation value of additional life-duration.” Calculate the standard deviation of “additional life-duration.”
  1. Construct the cumulative probability function for age at death for those reaching age 60. (Hint: Since the value of the cumulative at age 60 should be 0, set the argument lower.bound=60 in antiD() so that the value will be zero at age 60.) From the cumulative, find the median age of death for those reaching age 60. (Hint: Zeros().)
  1. In a previous exercise, we found from these same data that the life expectancy at birth is about 81 years. Many people misunderstand “life expectancy at birth” to mean that people will die mainly around 81 years of age. That is not quite so. People who are approaching 81 should keep in mind that they likely have additional years of life. A good way to quantify this is with the life-expectancy at age 81. We can calculate life-expectancy at 81 based on prob_death60(). You can do this by scaling prob_death60() by \(A\) such that \[\frac{1}{A} = \int_{81}^{120} \text{prob\_death60}(\text{age})\, d\text{age}\ .\]
  1. Calculate \(A\) for age 81.
  1. Using the \(A\) you just calculated, find the life-expectancy at age 81, that is, the expectation value of additional years of life at age 81. Also calculate the standard deviation.
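
The rescaling in the last three items can be sketched with a stand-in density. The function stand_in() below is a hypothetical Gaussian-shaped density restricted to ages 60 to 120; it is NOT the Social Security data and will not reproduce the numbers asked for, but the same steps should carry over to prob_death60(). Base R's integrate() is used here in place of the R/mosaic tools.

```r
# Hypothetical stand-in for prob_death60(); not the real life-table data.
raw <- function(age) dnorm(age, mean = 85, sd = 10)
norm60 <- integrate(raw, 60, 120)$value
stand_in <- function(age) {
  ifelse(age >= 60 & age <= 120, raw(age) / norm60, 0)
}

# Scale factor A: one over the probability of dying between 81 and 120.
A <- 1 / integrate(stand_in, 81, 120)$value

# Density of age at death for those who have reached age 81.
cond81 <- function(age) A * stand_in(age)
check <- integrate(cond81, 81, 120)$value   # should be very close to 1

# Expectation value and standard deviation of additional years of life.
extra_years <- integrate(function(age) (age - 81) * cond81(age),
                         81, 120)$value
sd_extra <- sqrt(integrate(
  function(age) (age - 81 - extra_years)^2 * cond81(age), 81, 120)$value)

c(A = A, check = check, extra_years = extra_years, sd_extra = sd_extra)
```

Multiplying by \(A\) is what turns the slice of the original density above age 81 back into a proper probability density, which is why the check integral comes out to 1.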