Creating a One Sample Mean Test with LAMBDA

Suppose you visit a village whose elders claim to possess a secret elixir that makes their children grow taller! You know that the national average height for 12-year-olds is 150cm, and you are able to measure the heights of the village's 12-year-old children. Plotting those heights gives you a distribution like the one in the graph above.

At first glance the village's children do seem taller than the national average. But are they really taller? Is the distribution significantly different?

One Sample Mean Test

A one sample mean test compares the mean of a sample distribution against a target value. It is similar to the one proportion test, but the sample data is continuous, giving a normal distribution curve instead of a yes/no binary result.

We write the Null Hypothesis as the sample mean equal to the population mean.

`H_0: mu = mu_0`

And the Alternative Hypothesis as the sample mean not equal to the population mean.

`H_1: mu != mu_0`

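We could also run a left tail test, where the null hypothesis is that the sample mean is greater than or equal to the population mean (the alternative being that it is smaller).

`H_0: mu >= mu_0`

Or a right tail test, where the null hypothesis is that the sample mean is less than or equal to the population mean (the alternative being that it is greater).

`H_0: mu <= mu_0`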

Test Statistics

The test statistic for the one sample mean test is

`t = (mu - mu_0) / (s/sqrt(n))`

where
`t` = the t statistic
`mu` = sample average
`mu_0` = population average or target value
`s` = sample standard deviation
`n` = sample size
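
To make the formula concrete, here is a minimal Python sketch of the same calculation. The function name `t_statistic` and the numbers are hypothetical, picked only so the arithmetic is easy to follow; they are not the village data.

```python
import math

def t_statistic(sample_mean, expected_mean, sample_stdev, sample_size):
    """Return (sample_mean - expected_mean) / (sample_stdev / sqrt(sample_size))."""
    # The denominator is the standard error of the mean.
    standard_error = sample_stdev / math.sqrt(sample_size)
    return (sample_mean - expected_mean) / standard_error

# Hypothetical summary values: mean 152cm, target 150cm, stdev 8cm, 16 children.
# Standard error = 8 / 4 = 2, so t = (152 - 150) / 2.
t = t_statistic(sample_mean=152.0, expected_mean=150.0, sample_stdev=8.0, sample_size=16)
print(t)  # 1.0
```

Note the brackets around the numerator: without them the division would bind only to `mu_0`.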

Intermediate Calculations in LAMBDA

To keep the LAMBDA formula readable, I will break the calculation into intermediate variables.

      numerator, sample_mean - expected_mean,
      denominator, sample_stdev / SQRT(sample_size),
      tStatistics, numerator / denominator,
      df, sample_size - 1,

To calculate the left tail p-value, I will use Excel's T distribution function T.DIST. In the T distribution, the degrees of freedom are the sample size minus 1.

      df, sample_size - 1,
      pvalueLeftTail, T.DIST(tStatistics, df, TRUE),

To calculate the right tail p-value, I will use the T.DIST.RT function.

      pvalueRightTail, T.DIST.RT(tStatistics, df),

The two-tail p-value is 2 times the smaller of these two values.

      pvalueTwoTail, 2 * IF(pvalueLeftTail < pvalueRightTail, pvalueLeftTail, pvalueRightTail),
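
If you want to cross-check these three p-values outside Excel, here is a minimal Python sketch. It is not how Excel computes T.DIST or T.DIST.RT. The helper names `t_pdf`, `t_cdf` and `p_values` are my own, and the cumulative probability is approximated by integrating the t density with Simpson's rule, an assumption chosen only to keep the example free of external libraries.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) with Simpson's rule on [0, |t|] plus symmetry."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return 0.5 + area if t > 0 else 0.5 - area

def p_values(t, df):
    """Return (left tail, right tail, two tail) p-values."""
    left = t_cdf(t, df)    # plays the role of T.DIST(t, df, TRUE)
    right = t_cdf(-t, df)  # plays the role of T.DIST.RT(t, df), by symmetry
    two = 2 * min(left, right)
    return left, right, two

# With 1 degree of freedom and t = 1 the exact left tail is 0.75.
left, right, two = p_values(1.0, 1)
print(round(left, 6), round(right, 6), round(two, 6))  # 0.75 0.25 0.5
```

Because the t distribution is symmetric, the left and right tail values always add up to 1, which is why doubling the smaller one gives the two-tail p-value.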

Putting It Together

I put these together in a tidied-up LAMBDA formula like this:

=LAMBDA(expected_mean, sample_mean, sample_stdev, sample_size, [tail], [show_details],
    LET(tail, IF(ISOMITTED(tail), 0, tail),
      show_details, IF(ISOMITTED(show_details), FALSE, show_details),
      numerator, sample_mean - expected_mean,
      denominator, sample_stdev / SQRT(sample_size),
      tStatistics, numerator / denominator,
      df, sample_size - 1,
      pvalueLeftTail, T.DIST(tStatistics, df, TRUE),
      pvalueRightTail, T.DIST.RT(tStatistics, df),
      pvalueTwoTail, 2 * IF(pvalueLeftTail < pvalueRightTail, pvalueLeftTail, pvalueRightTail),
      pvalue, SWITCH(tail, 0, pvalueTwoTail, -1, pvalueLeftTail, 1, pvalueRightTail, "Tail:Left Tail=-1, Two Tail=0, Right Tail=1"),
      details, HSTACK(
        VSTACK(
          "Null Mean Mo",
          "Sample Mean Mu",
          "Sample StdDev SD",
          "Sample Size",
          "t-statistics",
          "p-value Left Tail Test (H0: Mu >= Mo)",
          "p-value Right Tail Test (H0: Mu <= Mo)",
          "p-value Two Tail Test (H0: Mu = Mo)"
        ),
        VSTACK(
          expected_mean,
          sample_mean,
          sample_stdev,
          sample_size,
          tStatistics,
          pvalueLeftTail,
          pvalueRightTail,
          pvalueTwoTail
        )
      ),
      IF(show_details, IFNA(details,""), pvalue)
    )
)

One Sample Mean Test In Action

Let's see how the LAMBDA formula works. Suppose the measured data is in column B. First we calculate the average, standard deviation and size of the sample data.
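
As a rough illustration of those three summary values, here is a small Python sketch. The heights are made-up numbers standing in for column B, not the village measurements. I am assuming the sample standard deviation (the n - 1 version, which Excel provides as STDEV.S), since that matches the `s` in the test statistic.

```python
import statistics

# Hypothetical heights in cm, standing in for the measurements in column B.
heights = [148.0, 152.0, 150.0, 154.0, 146.0]

sample_mean = statistics.mean(heights)    # the average, like Excel's AVERAGE
sample_stdev = statistics.stdev(heights)  # sample standard deviation (divides by n - 1)
sample_size = len(heights)                # number of measurements, like Excel's COUNT

print(sample_mean, round(sample_stdev, 4), sample_size)  # 150.0 3.1623 5
```

These three values, together with the target mean, are the four required inputs of the LAMBDA formula.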

The implementation shows helpful hints as you type the formula out. Here I am testing against the population mean of 150cm.

Here you can see the implementation displays a nice table summary with relevant calculations.


From this example we see that the p-value for the two-tail test, 0.782623, is much greater than an alpha value of 0.05. We therefore cannot reject the null hypothesis: the sample mean is not statistically different from the population mean, i.e. there is no evidence that the village elixir makes the children taller.

The implementation also shows the p-values for the left and right tail tests.
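
The comparison against alpha can be written as a tiny decision rule. This is only a sketch: the function name `decide` and its wording are my own, and 0.05 is simply the alpha used in this example.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the significance level alpha."""
    # A p-value below alpha means the result is statistically significant.
    return "reject H0" if p_value < alpha else "fail to reject H0"

# The two-tail p-value from the example above.
print(decide(0.782623))  # fail to reject H0
```

Failing to reject the null hypothesis is not proof that the means are equal, only that this sample gives no strong evidence of a difference.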

Note: This implementation went through a few revisions as I wrote it. As you write more LAMBDA functions, you will find better ways to do things and pick up ideas to enhance the display.

In the next blog post we will see how we can extend this formula for arrays. For now, have a try at writing your own in your DC-DEN!
