Posts

Z Score, Z Statistics and P value

What is a Z score or Z statistics? Firstly Z score and Z statistics is the same. It is a measure of the number of standard deviations a data point is away from the sample mean. Mathematically this is written as: `z = (x - bar x) / sigma` where `z` = Z score or Z statistics `x` = data point `bar x` = sample mean `sigma` = sample standard deviation The mathematical equation of standard deviation is: `sigma = sqrt((x - bar x)/N)` where `N` = number of sample points What is P value? The P value is the probability that a particular value assuming the null hypothesis is true. Conventionally a P value of less than 0.05 is grounds for rejecting the null hypothesis. For the lower-tailed test or left-tailed test, the P value is expressed as: `P = Pr(TS <=ts | H_0) = cdf(ts)` And for the upper-tailed test or right-tailed test, the P value is expressed as: `P=Pr(TS>=ts |H_0) = 1-cdf(ts)` And the P value for a two-tailed test is `P = 2 xx min{Pr(TS<=ts |H_0), Pr(TS>

Power measurements and Confidence Interval

Image
Recently I assessed the power consumption of a particular server rack to determine if we had sufficient power capacity remaining to support additional server installation. Note: For my intent and purposes, the names and values have been obscured and changed to maintain confidentiality. The power capacity of the rack was 2.5 kW, while the average measure power consumption was 2.16 kW. This suggested that the remaining capacity = 2.5 - 2.16 = 0.34 kW But making decision based on a single aggregated point seems risky. So I calculated the confidence interval range instead. What is Confidence Interval The confidence interval is the value range where the mean of the population can be found based on the sample data collected. Mathematically, `CI = bar x pm z s/sqrt(n)` where `CI = "confidence interval"` `bar x = "sample mean"` `z = "confidence level value, e.g. 95%"` `s = "sample standard deviation"` `n = "sample siz

Displaying Mathematical Equations

Image
When I wrote on implementing Standard Error formula using LAMBDA, I used  CodeCogs Equation Editor webservice. Since then I have been trying a few other solutions. I put together this table for comparison. Renderer Engine Input Output Format Strong Points CodeCogs WebService LaTeX Image WYSIWYG Equation Editor Katex JavaScript MathML, TeX, and AsciiMath HTML+CSS, SVG, or MathML Fast rendering unlike MathJax and AsciiMath MathJax JavaScript MathML, TeX, and AsciiMath HTML+CSS, SVG, or MathML Widest browser compatibility AsciiMath JavaScript AsciiMath HTML+CSS, SVG, or MathML Easy to write equations, format like Excel equations. Here are some examples of a Quadratic Equation Renderer Quadratic Equation Example

Formula Name Manager Limit: ~2084 Characters

Image
While trying to implement a LAMBDA function I found myself unable to paste my formula into the Formula Name Manager. My first thought was that the LAMBDA has a character limit. But further investigation showed it was related to the Formula Name Manager, specifically the Refers to: field. Also it is not  related to the Name or the Comment field. Tweaking my formula the maximum number of characters is about 2084. Whitespaces are counted. This includes carriage returns and tabs. So, when implementing LAMBDA with the Formula Name Manager, do remember to keep your formula as concise as possible, but don't be too sloppy with variable names and indentation. That said, go edit your LAMBDA in your DC-DEN! Update: The Comment:  field limit is 256 characters.

Descriptive Statistics summary with LAMBDA

Image
One function in Excel that I find useful is the Descriptive Statistics from the Analysis Toolpak Add-In. At a glance I see a summary of the sample data. However I also find it cumbersome to use. It requires you to add-in Analysis Toolpak - not loaded by default. I wonder why Microsoft choses not to pre-load in Excel even though it has been available since Excel 2000 (maybe even earlier?). It launches a dialog. Recently, this becomes an issue as it unavailable in Excel Web. It returns a static result, not an auto recalculating function, i.e. the result is not updated if and when the data is updated. Of the 3 reasons above, I find the last most annoying. While field/laboratory sample data does not change after collection, but input data from external sources can be changing frequently. Any report based on external sources would have to be painstakingly updated manually. So in this blog, I will create a Descriptive Statistics function with LAMBDA that would recalculate automatically just
Image
  Recently I saw some Travel facts and figures for the Lunar New Year 2023 . The table for the Top Five Destinations (see above) was interesting but difficult to digest. Would it be possible to summarise this? Would it be possible to make this table more comprehensible at a glance? It may mean turning this table into some form of a diagram. A network diagram or Sankey diagram would probably do the job. But what if Excel was the only that I had? This would be challenging. Converting Table Information into Data Table above presents nominal data. There are a number of options to convert it into numerical data. Using Excel, it was easy to auto-calculate a count of the mentioned countries. Frequency is the number of times the country is mentioned in the Table. A simple COUNTIF  formula gave the results instantly. The Ranked is a weighted preference of destination. I put an arbitrary weightage of 5, 4, .., 1 for Top 1, Top 2, ..., Top 5 in reverse order respectively. As an example, if Thai

Confidence Interval function with LAMBDA and Optional Parameters

Image
Excel provides CONFIDENCE.T  formula to calculate the confidence interval. This value, however, does not stand on its own. We usually want to know the upper and lower confidence limits as well.  In this blog, I will implement the calculation of upper and lower confidence intervals using LAMBDA. I will also demonstrate how to implement optional parameters, if a different significance level from the default 5% is required. But before that, I will introduce the LET function. LET Function The LET  function assigns names to calculation results. This is useful for storing intermediate results or values, or defining names inside a formula. The general syntax is as follows: =LET(name1, name_value1, [name2, name_value2,...], calculation) It supports up to 126 name-and-value pairs . The last parameter calculation is the return result of LET . Let's see this in action. Confidence Interval function I will create a new function  dcrConfidence  that takes an  array  and calculates the confide