Validation

stats

class pycomlink.validation.stats.RainError(pearson_correlation, coefficient_of_variation, root_mean_square_error, mean_absolute_error, R_sum_reference, R_sum_predicted, R_mean_reference, R_mean_predicted, false_wet_rate, missed_wet_rate, false_wet_precipitation_rate, missed_wet_precipitation_rate, rainfall_threshold_wet, N_all_pairs, N_nan_pairs, N_nan_reference_only, N_nan_predicted_only)

Bases: RainError

namedtuple with the following rainfall performance measures:

pearson_correlation: Pearson correlation coefficient
coefficient_of_variation: Coefficient of variation following the definition in [1]
root_mean_square_error: Root mean square error
mean_absolute_error: Mean absolute error
R_sum_reference: Precipitation sum of the reference array (mm)
R_sum_predicted: Precipitation sum of the predicted array (mm)
R_mean_reference: Precipitation mean of the reference array (mm)
R_mean_predicted: Precipitation mean of the predicted array (mm)
false_wet_rate: Rate of CML wet events when the reference is dry
missed_wet_rate: Rate of CML dry events when the reference is wet
false_wet_precipitation_rate: Mean precipitation rate of false wet events
missed_wet_precipitation_rate: Mean precipitation rate of missed wet events
rainfall_threshold_wet: Threshold separating wet/rain and dry/non-rain periods
N_all_pairs: Number of all reference-predicted pairs
N_nan_pairs: Number of reference-predicted pairs with at least one NaN
N_nan_reference_only: Number of NaN values in the reference array
N_nan_predicted_only: Number of NaN values in the predicted array

References

class pycomlink.validation.stats.WetDryError(false_wet_rate, missed_wet_rate, matthews_correlation, true_wet_rate, true_dry_rate, N_dry_reference, N_wet_reference, N_true_wet, N_true_dry, N_false_wet, N_missed_wet, N_all_pairs, N_nan_pairs, N_nan_reference_only, N_nan_predicted_only)

Bases: WetDryError

namedtuple with the following wet-dry performance measures:

false_wet_rate: Rate of CML wet events when the reference is dry
missed_wet_rate: Rate of CML dry events when the reference is wet
matthews_correlation: Matthews correlation coefficient
true_wet_rate: Rate of CML wet events when the reference is also wet
true_dry_rate: Rate of CML dry events when the reference is also dry
N_dry_reference: Number of dry events in the reference
N_wet_reference: Number of wet events in the reference
N_true_wet: Number of CML wet events when the reference is also wet
N_true_dry: Number of CML dry events when the reference is also dry
N_false_wet: Number of CML wet events when the reference is dry
N_missed_wet: Number of CML dry events when the reference is wet
N_all_pairs: Number of all reference-predicted pairs
N_nan_pairs: Number of reference-predicted pairs with at least one NaN
N_nan_reference_only: Number of NaN values in the reference array
N_nan_predicted_only: Number of NaN values in the predicted array

class pycomlink.validation.stats.WetError(false, missed)

Bases: tuple

false: Alias for field number 0

missed: Alias for field number 1

pycomlink.validation.stats.calc_rain_error_performance_metrics(reference, predicted, rainfall_threshold_wet)

Calculate performance metrics for rainfall estimation

This function calculates metrics and statistics for judging the performance of rainfall estimation. The calculation is based on two arrays of rainfall values, which should contain rain rates or rainfall sums. Beware that the units of R_sum… and R_mean… depend on your input: the calculation does not take any information on temporal resolution or aggregation into account.

Parameters:

reference (float array-like) – Rainfall reference
predicted (float array-like) – Predicted rainfall
rainfall_threshold_wet (float) – Rainfall threshold at or above which (value >= threshold) reference and predicted values are considered wet. This threshold only affects the performance metrics that are based on the differentiation between wet and dry periods.

Returns:

RainError

Return type:

named tuple

References

https://en.wikipedia.org/wiki/Matthews_correlation_coefficient
https://github.com/scikit-learn/scikit-learn/blob/7389dba/sklearn/metrics/regression.py#L184
https://github.com/scikit-learn/scikit-learn/blob/7389dba/sklearn/metrics/regression.py#L112
Overeem et al. 2013: www.pnas.org/cgi/doi/10.1073/pnas.1217961110
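To make the metric definitions more concrete, the following is a rough pure-Python sketch of how a few of the RainError fields could be computed. It is not pycomlink's implementation: the function name rain_metrics_sketch is hypothetical, and the choices to drop every pair containing a NaN and to normalise the false and missed wet counts by the number of dry and wet reference values are assumptions made for illustration.

```python
import math


def rain_metrics_sketch(reference, predicted, rainfall_threshold_wet):
    """Illustrative only; NOT the pycomlink implementation."""
    all_pairs = list(zip(reference, predicted))
    # Assumption: pairs with at least one NaN are excluded from all metrics
    pairs = [(r, p) for r, p in all_pairs
             if not (math.isnan(r) or math.isnan(p))]
    n = len(pairs)
    ref = [r for r, _ in pairs]
    pred = [p for _, p in pairs]
    mean_ref = sum(ref) / n
    mean_pred = sum(pred) / n

    # Pearson correlation coefficient
    cov = sum((r - mean_ref) * (p - mean_pred) for r, p in pairs)
    var_ref = sum((r - mean_ref) ** 2 for r in ref)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    pearson = cov / math.sqrt(var_ref * var_pred)

    # Error measures on the differences predicted - reference
    rmse = math.sqrt(sum((p - r) ** 2 for r, p in pairs) / n)
    mae = sum(abs(p - r) for r, p in pairs) / n

    # A value counts as wet if value >= threshold
    thr = rainfall_threshold_wet
    n_dry_ref = sum(1 for r in ref if r < thr)
    n_wet_ref = n - n_dry_ref
    n_false_wet = sum(1 for r, p in pairs if r < thr and p >= thr)
    n_missed_wet = sum(1 for r, p in pairs if r >= thr and p < thr)
    nan = float("nan")

    return {
        "pearson_correlation": pearson,
        "root_mean_square_error": rmse,
        "mean_absolute_error": mae,
        "R_sum_reference": sum(ref),
        "R_sum_predicted": sum(pred),
        "R_mean_reference": mean_ref,
        "R_mean_predicted": mean_pred,
        # Assumption: rates are relative to the dry / wet reference counts
        "false_wet_rate": n_false_wet / n_dry_ref if n_dry_ref else nan,
        "missed_wet_rate": n_missed_wet / n_wet_ref if n_wet_ref else nan,
        "N_all_pairs": len(all_pairs),
        "N_nan_pairs": len(all_pairs) - n,
    }


# Made-up example data; units are whatever the inputs use
reference = [0.0, 0.0, 1.0, 2.0, float("nan"), 4.0]
predicted = [0.0, 0.5, 1.0, 1.0, 3.0, 5.0]
result = rain_metrics_sketch(reference, predicted, rainfall_threshold_wet=0.1)
```

Because the sketch never looks at timestamps, the sums and means simply inherit the units of the inputs, which is the caveat stated in the function description.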

pycomlink.validation.stats.calc_wet_dry_performance_metrics(reference, predicted)

Calculate performance metrics for a wet-dry classification

This function calculates metrics and statistics for judging the performance of a wet-dry classification. The calculation is based on two boolean arrays, where wet is True and dry is False.

Parameters:

reference (boolean array-like) – Reference values, with wet being True
predicted (boolean array-like) – Predicted values, with wet being True

Returns:

WetDryError

Return type:

named tuple
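The wet-dry measures listed for WetDryError can be illustrated with a small confusion-matrix sketch in pure Python. This is not the pycomlink implementation: the function name wet_dry_metrics_sketch is hypothetical, the inputs are assumed to contain no missing values, and the rates are assumed to be normalised by the number of wet or dry reference events. The Matthews correlation coefficient uses its standard definition.

```python
import math


def wet_dry_metrics_sketch(reference, predicted):
    """Illustrative only; NOT the pycomlink implementation."""
    pairs = list(zip(reference, predicted))

    # Confusion-matrix counts, with wet == True and dry == False
    n_true_wet = sum(1 for r, p in pairs if r and p)
    n_true_dry = sum(1 for r, p in pairs if not r and not p)
    n_false_wet = sum(1 for r, p in pairs if not r and p)
    n_missed_wet = sum(1 for r, p in pairs if r and not p)

    n_wet_ref = n_true_wet + n_missed_wet
    n_dry_ref = n_true_dry + n_false_wet
    nan = float("nan")

    # Standard Matthews correlation coefficient
    denominator = math.sqrt(
        (n_true_wet + n_false_wet)
        * (n_true_wet + n_missed_wet)
        * (n_true_dry + n_false_wet)
        * (n_true_dry + n_missed_wet)
    )
    numerator = n_true_wet * n_true_dry - n_false_wet * n_missed_wet
    mcc = numerator / denominator if denominator else nan

    return {
        # Assumption: rates are relative to the reference wet / dry counts
        "false_wet_rate": n_false_wet / n_dry_ref if n_dry_ref else nan,
        "missed_wet_rate": n_missed_wet / n_wet_ref if n_wet_ref else nan,
        "true_wet_rate": n_true_wet / n_wet_ref if n_wet_ref else nan,
        "true_dry_rate": n_true_dry / n_dry_ref if n_dry_ref else nan,
        "matthews_correlation": mcc,
        "N_dry_reference": n_dry_ref,
        "N_wet_reference": n_wet_ref,
        "N_true_wet": n_true_wet,
        "N_true_dry": n_true_dry,
        "N_false_wet": n_false_wet,
        "N_missed_wet": n_missed_wet,
        "N_all_pairs": len(pairs),
    }


# Made-up example data
reference = [True, True, True, False, False, False, False, True]
predicted = [True, True, False, False, False, True, False, True]
wet_dry = wet_dry_metrics_sketch(reference, predicted)
```

With real data, missing values would need to be handled before counting, since a NaN evaluates as True in a boolean context and would otherwise be counted as wet.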

pycomlink.validation.stats.calc_wet_error_rates(df_wet_truth, df_wet)

validator

class pycomlink.validation.validator.GridValidator(lats=None, lons=None, values=None, xr_ds=None)

Bases: Validator

get_time_series(cml, values)

plot_intersections(cml, ax=None)

resample_to_grid_time_series(df, grid_time_index_label, grid_time_zone=None)

class pycomlink.validation.validator.PointValidator(lons, values)

Bases: Validator

get_time_series(cml, values)

class pycomlink.validation.validator.Validator

Bases: object

calc_stats(cml, time_series)

pycomlink.validation.validator.calc_wet_dry_error(df_wet_truth, df_wet)