statgis.statutils.stats

Submodule with helpers for common statistical calculations.

Module Contents

Functions

corr_matrix(→ pandas.DataFrame)

Calculate the Pearson correlation matrix of the variables in a dataframe.

statgis.statutils.stats.corr_matrix(data: pandas.DataFrame, variables: Union[numpy.typing.ArrayLike, None] = None, half: bool = False, hide_insignificants: bool = False, singificant_threshold: float = 0.05) → pandas.DataFrame

Calculate the Pearson correlation matrix of the variables in a dataframe.

Parameters:
  • data (pd.DataFrame) – Dataframe containing the variables whose correlations are evaluated.

  • variables (ArrayLike | None (optional)) – The variables of interest. If not given, all variables in the dataframe are evaluated.

  • half (bool (optional)) – If True, show only the first half of the matrix, omitting the repeated (mirrored) correlations. Defaults to False.

  • hide_insignificants (bool (optional)) – If True, hide every correlation whose p-value is greater than the significance threshold. Defaults to False.

  • singificant_threshold (float (optional)) – p-value threshold used to decide whether a correlation is significant. Defaults to 0.05.

Returns:

corr – Dataframe with the correlation values.

Return type:

pd.DataFrame
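
Example:

The behaviour described by the parameters above can be illustrated with a minimal sketch built on pandas and SciPy. This is not the statgis implementation: the function name `corr_matrix_sketch`, the `threshold` argument, and the choice to keep the lower triangle when `half=True` are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr


def corr_matrix_sketch(data, variables=None, half=False,
                       hide_insignificants=False, threshold=0.05):
    """Illustrative stand-in for corr_matrix (hypothetical, not statgis code)."""
    # Use every column when no subset of variables is requested.
    cols = list(data.columns) if variables is None else list(variables)
    corr = pd.DataFrame(np.nan, index=cols, columns=cols, dtype=float)
    pval = pd.DataFrame(np.nan, index=cols, columns=cols, dtype=float)

    # Pearson r and its two-sided p-value for every pair of variables.
    for a in cols:
        for b in cols:
            r, p = pearsonr(data[a], data[b])
            corr.loc[a, b] = r
            pval.loc[a, b] = p

    if hide_insignificants:
        # Blank out correlations whose p-value exceeds the threshold.
        corr = corr.mask(pval > threshold)
    if half:
        # Assumption: keep the lower triangle (with diagonal), hide the mirror.
        keep = np.tril(np.ones(corr.shape, dtype=bool))
        corr = corr.where(keep)
    return corr


# Small deterministic dataset: y is a linear function of x, z alternates sign.
x = np.arange(10.0)
df = pd.DataFrame({"x": x, "y": 2 * x + 1, "z": np.tile([1.0, -1.0], 5)})

full = corr_matrix_sketch(df)
result = corr_matrix_sketch(df, half=True, hide_insignificants=True)
print(result)
```

Hidden entries are represented as NaN in this sketch. With statgis installed, the corresponding call would presumably be `corr_matrix(df, half=True, hide_insignificants=True)`, following the signature documented above.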