debiased_sd package

Submodules

debiased_sd.estimators module

The estimators module contains the main std function.

debiased_sd.estimators.std(x: ndarray, method: str, axis: int | None = None, ddof: int = 1, num_boot: int = 1000, random_state: int | None = None, **kwargs) → ndarray[source]

Main standard deviation (SD) adjustment function, used to remove or reduce the bias of the sample SD.

If \(\sigma^2 = E[(X - E[X])^2]\), then we are looking for an estimator, \(S\), with the property \(E[S] = \sigma\). When \(S\) is the sample standard deviation (SD), \(E[S] \leq \sigma\) (by Jensen's inequality, since the square root is concave), so we adjust it either with a scaling factor, \(E[S] \cdot C_n = \sigma\), or with a non-parametric bias shift, \(E[S + \mathrm{bias}(X)] = \sigma\).
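The downward bias of the sample SD can be checked with a small simulation. The sketch below uses only NumPy (not this package) and assumes Gaussian data with a small sample size; the names `sigma`, `n`, `n_sim`, `mean_s`, and `c_n_empirical` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, n, n_sim = 2.0, 5, 200_000

# Draw n_sim independent samples of size n and compute each sample SD
x = rng.normal(loc=0.0, scale=sigma, size=(n_sim, n))
s = x.std(axis=1, ddof=1)

# The Monte Carlo average of S approximates E[S], which sits below sigma
mean_s = s.mean()

# Empirical scaling factor C_n such that mean_s * C_n equals sigma
c_n_empirical = sigma / mean_s
print(mean_s, c_n_empirical)
```

For Gaussian data with n = 5, the ratio of `mean_s` to `sigma` should land close to 0.94, so the scaling factor is a little above 1.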

Parameters:

x (np.ndarray) – An array or array-like object of arbitrary dimension: x.shape = (d1, d2, …, dk)
method (str) –

Which method should be used? Must be one of:
vanilla: No adjustment
jackknife: Leave-one-out jackknife
bootstrap: Bootstrap
gaussian: Known Gaussian C_n calculation
kappa: First-order adjustment based on kurtosis
axis (int | None = None) – Axis to calculate the SD over
ddof (int, optional) – The degrees of freedom for the sample SD; should be kept to 1
num_boot (int, optional) – If method=='bootstrap', how many bootstrap samples to draw? Note that this approach will broadcast the original array with an additional {num_boot} rows in the axis=-1 dimension, so keep memory usage in mind
random_state (int | None, optional) – Reproducibility seed for the bootstrap method
**kwargs – Optional arguments to pass into methods, see utils.sd_{method} for additional details

Returns:

If x.shape = (d1, d2, …, dk), and axis=j, then returns a (d1, …, dj-1, dj+1, …, dk) array

Return type:

np.ndarray
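The shape rule above is the same reduction that plain NumPy applies. As a minimal illustration using np.std rather than this package, an array of shape (3, 4, 5) reduced over axis=1 loses that axis:

```python
import numpy as np

# A 3-d array with shape (d1, d2, d3) = (3, 4, 5)
x = np.arange(60, dtype=float).reshape(3, 4, 5)

# Reducing over axis=1 drops d2 and leaves (d1, d3)
sd = np.std(x, axis=1, ddof=1)
print(sd.shape)  # (3, 5)
```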

debiased_sd.utils module

Utility helpers

debiased_sd.utils.check_x_axis_ddof(x: ndarray, axis: int | None = None, ddof: int = 1) → Tuple[ndarray, ndarray, int][source]

Checks that x aligns with the stated axis, and that ddof is valid

Parameters:

x (np.ndarray) – An array or array-like object of arbitrary dimension: x.shape = (d1, d2, …, dk)
axis (int | None = None) – Axis to calculate the SD over
ddof (int = 1) – The degrees of freedom for the sample SD

Return type:

(x, std, n) or (np.array(x), sd_array[-axis] , x.shape[-axis])

debiased_sd.utils.sd_bootstrap(x: ndarray, axis: int = 0, ddof: int = 0, num_boot: int = 1000, random_state: int | None = None, **kwargs) → ndarray[source]

Generates {num_boot} bootstrap replicates for a 1-d array

Parameters:

x (np.ndarray) – An array or array-like object of arbitrary dimension: x.shape = (d1, d2, …, dk)
axis (int | None, optional) – Axis to calculate the SD over
ddof (int, optional) – The degrees of freedom for the sample SD
num_boot (int, optional) – If method=='bootstrap', how many bootstrap samples to draw? Note that this approach will broadcast the original array with an additional {num_boot} rows in the axis=-1 dimension, so keep memory usage in mind
random_state (int | None, optional) – Reproducibility seed for the bootstrap method
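To make the idea concrete, here is a minimal sketch of a bootstrap bias correction for a 1-d array. It is not the package's implementation: the function name `sd_bootstrap_sketch` is hypothetical, and the correction used (subtracting the estimated bias, mean of the bootstrap SDs minus the sample SD) is the standard textbook form, assumed here for illustration.

```python
import numpy as np

def sd_bootstrap_sketch(x, num_boot=1000, ddof=1, random_state=None):
    """Hypothetical helper: bootstrap bias-corrected SD for a 1-d array."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    rng = np.random.default_rng(random_state)

    # Resample indices with replacement: one row per bootstrap replicate.
    # This materializes a (num_boot, n) array, hence the memory caveat.
    idx = rng.integers(0, n, size=(num_boot, n))
    sd_boot = x[idx].std(axis=1, ddof=ddof)

    # Estimated bias is the average bootstrap SD minus the sample SD
    sd_hat = x.std(ddof=ddof)
    bias_hat = sd_boot.mean() - sd_hat
    return sd_hat - bias_hat

x = np.random.default_rng(1).normal(size=20)
print(sd_bootstrap_sketch(x, num_boot=2000, random_state=42))
```

Passing the same `random_state` twice gives the same result, which is the role the seed plays in the parameter list above.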

debiased_sd.utils.sd_gaussian(x: ndarray, axis: int | None = None, ddof: int = 1, approx: bool = True, **kwargs) → ndarray[source]

Calculate the de-biased sample SD when the data is drawn from a normal distribution

Parameters:

x (np.ndarray) – An array or array-like object of arbitrary dimension: x.shape = (d1, d2, …, dk)
axis (int, optional) – Axis to calculate the SD over
ddof (int, optional) – The degrees of freedom for the sample SD
approx (bool, optional) – Should a log-approximation be used for the Gamma function calculation? Recommended if n is large
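For Gaussian data the expected sample SD (with ddof=1) is a known multiple of sigma, often written c4(n) = sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2), so dividing the sample SD by c4(n) removes the bias. The sketch below is an illustration under that assumption, not the package's code; `c4` and `sd_gaussian_sketch` are hypothetical names, and reading `approx` as "work with log-Gamma values" is an assumption about what the flag does.

```python
import math
import numpy as np

def c4(n, approx=True):
    """Ratio E[S] / sigma for Gaussian data and ddof=1 (hypothetical helper)."""
    if approx:
        # Log-Gamma avoids overflow of Gamma(n/2) when n is large
        log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
        return math.sqrt(2 / (n - 1)) * math.exp(log_ratio)
    # Direct evaluation; math.gamma overflows once n/2 exceeds roughly 171
    return math.sqrt(2 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)

def sd_gaussian_sketch(x):
    """Scale the ddof=1 sample SD of a 1-d array by 1 / c4(n)."""
    x = np.asarray(x, dtype=float)
    return x.std(ddof=1) / c4(x.shape[0])

print(c4(5), c4(1000))
```

The factor approaches 1 as n grows, so the adjustment matters most for small samples.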

debiased_sd.utils.sd_jackknife(x: ndarray, axis: int = 0, ddof: int = 0, **kwargs) → ndarray[source]

Calculates the jackknife-adjusted sample SD by estimating the leave-one-out (LOO) bias, where the final estimator is: n*sighat - (n-1)*mean(sighat_loo)

Parameters:

x (np.ndarray) – An array or array-like object of arbitrary dimension: x.shape = (d1, d2, …, dk)
axis (int, optional) – Axis to calculate the SD over
ddof (int, optional) – The degrees of freedom for the sample SD
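The estimator n*sighat - (n-1)*mean(sighat_loo) can be written in a few lines for a 1-d array. This is a sketch of the formula only, with a hypothetical function name `sd_jackknife_sketch` and a default of ddof=1 chosen for the illustration; it does not handle the multi-dimensional `axis` argument.

```python
import numpy as np

def sd_jackknife_sketch(x, ddof=1):
    """Hypothetical helper: jackknife bias-corrected SD for a 1-d array."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    sd_hat = x.std(ddof=ddof)

    # Build an (n, n-1) array whose row i omits observation i
    mask = ~np.eye(n, dtype=bool)
    x_loo = np.broadcast_to(x, (n, n))[mask].reshape(n, n - 1)
    sd_loo = x_loo.std(axis=1, ddof=ddof)

    # Jackknife estimator: n * sighat - (n - 1) * mean(sighat_loo)
    return n * sd_hat - (n - 1) * sd_loo.mean()

x = np.array([1.0, 2.0, 4.0, 7.0, 11.0])
print(sd_jackknife_sketch(x))
```

The leave-one-out array costs O(n^2) memory, which is acceptable for the small samples where the bias is largest.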

debiased_sd.utils.sd_kappa(x: ndarray, axis: int | None = None, ddof: int = 1, debias_cumulants: bool = True, kurtosis_fun: Callable | None = None, **kwargs) → ndarray[source]

Adjusts the vanilla SD estimator with the first-order adjustment from a Maclaurin expansion: sighat*[1-(kappa-1+2/(n-1))/(8n)]^{-1}, where sighat is the sample SD and kappa is the kurtosis. See Giles (2021): http://web.uvic.ca/~dgiles/downloads/working_papers/std_dev.pdf

Parameters:

x (np.ndarray) – An array or array-like object of arbitrary dimension: x.shape = (d1, d2, …, dk)
axis (int | None, optional) – Axis to calculate the SD over
ddof (int, optional) – The degrees of freedom for the sample SD
debias_cumulants (bool, optional) – Should the “debiased” versions of the cumulants be used? See https://mathworld.wolfram.com/Cumulant.html (this is also the default in the pandas kurtosis calculation)
kurtosis_fun (Callable | None, optional) – Should an alternative kurtosis calculation be used? Will pass in kwargs & axis.
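A minimal sketch of the kurtosis-based adjustment for a 1-d array follows. It assumes the correction factor [1 - (kappa - 1 + 2/(n-1))/(8n)]^{-1}, which comes from a first-order expansion of the square root using Var(S^2) = sigma^4 * [(kappa - 1)/n + 2/(n(n-1))] and vanishes as n grows. The function name `sd_kappa_sketch` is hypothetical, and kappa is estimated with the plain (not debiased) moment ratio m4 / m2^2, which is a simplification relative to the `debias_cumulants` option.

```python
import numpy as np

def sd_kappa_sketch(x):
    """Hypothetical helper: first-order kurtosis adjustment for a 1-d array."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    sd_hat = x.std(ddof=1)

    # Plain moment estimate of the (non-excess) kurtosis
    centered = x - x.mean()
    m2 = np.mean(centered**2)
    m4 = np.mean(centered**4)
    kappa = m4 / m2**2

    # First-order approximation of E[S] / sigma, then invert it
    shrink = 1.0 - (kappa - 1.0 + 2.0 / (n - 1)) / (8.0 * n)
    return sd_hat / shrink

x = np.random.default_rng(0).exponential(size=30)
print(x.std(ddof=1), sd_kappa_sketch(x))
```

With kappa = 3 (the Gaussian value) the factor reduces to 1 - 1/(4(n-1)), which is close to the exact Gaussian constant for moderate n.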

Module contents

Top-level package for Debiased SD.