Spatial

coverage

pycomlink.spatial.coverage.calc_coverage_mask(cml_list, xgrid, ygrid, max_dist_from_cml)

Generate a coverage mask with a certain area around all CMLs

Parameters:

cml_list (list) – List of Comlink objects
xgrid (array) – 2D matrix of x locations
ygrid (array) – 2D matrix of y locations
max_dist_from_cml (float) – Maximum distance from a CML path that should be considered as covered. The units must be the same as for the coordinates of the CMLs. Hence, if lat-lon is used in decimal degrees, this unit has also to be used here. Note that the different scaling of lat-lon degrees for higher latitudes is not accounted for.

Returns:

grid_points_covered_by_cmls – 2D array with size of xgrid and ygrid with True values where the grid point is within the area considered covered.

Return type:

array of bool

helper

pycomlink.spatial.helper.haversine(lon1, lat1, lon2, lat2): Calculate the great circle distance between two points on the earth (specified in decimal degrees)

idw

class pycomlink.spatial.idw.Invdisttree(X, leafsize=10, stat=0)

Bases: object

inverse-distance-weighted interpolation using KDTree:

Copied from http://stackoverflow.com/questions/3104781/ inverse-distance-weighted-idw-interpolation-with-python

Usage granted by original author here: https://github.com/scipy/scipy/issues/2022#issuecomment-296373506

invdisttree = Invdisttree( X, z ) – data points, values interpol = invdisttree( q, nnear=3, eps=0, p=1, weights=None, stat=0 )

interpolates z from the 3 points nearest each query point q; For example, interpol[ a query point q ] finds the 3 data points nearest q, at distances d1 d2 d3 and returns the IDW average of the values z1 z2 z3

(z1/d1 + z2/d2 + z3/d3) / (1/d1 + 1/d2 + 1/d3) = .55 z1 + .27 z2 + .18 z3 for distances 1 2 3

q may be one point, or a batch of points. eps: approximate nearest, dist <= (1 + eps) * true nearest p: use 1 / distance**p weights: optional multipliers for 1 / distance**p, of the same shape as q stat: accumulate wsum, wn for average weights

How many nearest neighbors should one take ? a) start with 8 11 14 .. 28 in 2d 3d 4d .. 10d; see Wendel’s formula b) make 3 runs with nnear= e.g. 6 8 10, and look at the results –

|interpol 6 - interpol 8| etc., or |f - interpol*| if you have f(q). I find that runtimes don’t increase much at all with nnear – ymmv.

p=1, p=2 ?

p=2 weights nearer points more, farther points less. In 2d, the circles around query points have areas ~ distance**2, so p=2 is inverse-area weighting. For example,

(z1/area1 + z2/area2 + z3/area3) / (1/area1 + 1/area2 + 1/area3) = .74 z1 + .18 z2 + .08 z3 for distances 1 2 3

Similarly, in 3d, p=3 is inverse-volume weighting.

Scaling:

if different X coordinates measure different things, Euclidean distance can be way off. For example, if X0 is in the range 0 to 1 but X1 0 to 1000, the X1 distances will swamp X0; rescale the data, i.e. make X0.std() ~= X1.std() .

A nice property of IDW is that it’s scale-free around query points: if I have values z1 z2 z3 from 3 points at distances d1 d2 d3, the IDW average

(z1/d1 + z2/d2 + z3/d3) / (1/d1 + 1/d2 + 1/d3)

is the same for distances 1 2 3, or 10 20 30 – only the ratios matter. In contrast, the commonly-used Gaussian kernel exp( - (distance/h)**2 ) is exceedingly sensitive to distance and to h.

interpolator

class pycomlink.spatial.interpolator.IdwKdtreeInterpolator(nnear=8, p=2, exclude_nan=True, max_distance=None): Bases: PointsToGridInterpolator

class pycomlink.spatial.interpolator.OrdinaryKrigingInterpolator(nlags=100, variogram_model='spherical', variogram_parameters={'nugget': 0.1, 'range': 1, 'sill': 0.9}, weight=True, n_closest_points=None, exclude_nan=True, backend='C'): Bases: PointsToGridInterpolator

class pycomlink.spatial.interpolator.PointsToGridInterpolator

Bases: object

PointsToGridInterpolator class docstring

pycomlink.spatial.interpolator.clim_var_param(date_str='20130404', time_scale_hours=0.25)

Obtain climatological values of sill, range, and nugget of spherical variogram model This is based on a climatological variogram based on 30-year automatic rain gauge data sets from The Netherlands. Spherical variograms have been modelled as function of the day number and duration in Van de Beek et al. (2012). They use durations of 1 - 24 h. In this function the relationships can be extrapolated to, e.g. 15-min, data. Returns the values of sill, range and nugget. The nugget is set to 0.1 * sill. Python implementation of the function “ClimVarParam.R” from the R RAINLINK package: Retrieval algorithm for rainfall mapping from microwave links in a cellular communication network (Overeem et al., 2016).

References: Overeem, A., Leijnse, H., and Uijlenhoet, R., 2016: Retrieval algorithm for rainfall mapping from microwave links in a cellular communication network, Atmospheric Measurement Techniques, 9, 2425-2444, https://doi.org/10.5194/amt-9-2425-2016. Van de Beek, C. Z., Leijnse, H., Torfs, P. J. J. F., and Uijlenhoet, R., 2012: Seasonal semi-variance of Dutch rainfall at hourly to daily scales, Adv. Water Resour., 45, 76-85, doi:10.1016/j.advwatres.2012.03.023.

Parameters:

date_str (the end date of the chosen daily period in a format that pandas.to_datetime can parse e.g. ‘YYYYMMDD’)
time_scale_hours (rainfall aggregation interval in hours)

Return type:

dict with ‘sill’, ‘range’ and ‘nugget’ keys.