dask.array.nanmedian

dask.array.nanmedian

dask.array.nanmedian(a, axis=None, keepdims=False, out=None)[source]

Compute the median along the specified axis, while ignoring NaNs.

This docstring was copied from numpy.nanmedian.

Some inconsistencies with the Dask version may exist.

This works by automatically chunking the reduced axes to a single chunk and then calling numpy.nanmedian function across the remaining dimensions

Returns the median of the array elements.

New in version 1.9.0.

Parameters
aarray_like

Input array or object that can be converted to an array.

axis{int, sequence of int, None}, optional

Axis or axes along which the medians are computed. The default is to compute the median along a flattened version of the array. A sequence of axes is supported since version 1.9.0.

outndarray, optional

Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output, but the type (of the output) will be cast if necessary.

overwrite_inputbool, optional (Not supported in Dask)

If True, then allow use of memory of input array a for calculations. The input array will be modified by the call to median. This will save memory when you do not need to preserve the contents of the input array. Treat the input as undefined, but it will probably be fully or partially sorted. Default is False. If overwrite_input is True and a is not already an ndarray, an error will be raised.

keepdimsbool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.

If this is anything but the default value it will be passed through (in the special case of an empty array) to the mean function of the underlying array. If the array is a sub-class and mean does not have the kwarg keepdims this will raise a RuntimeError.

Returns
medianndarray

A new array holding the result. If the input contains integers or floats smaller than float64, then the output data-type is np.float64. Otherwise, the data-type of the output is the same as that of the input. If out is specified, that array is returned instead.

See also

mean, median, percentile

Notes

Given a vector V of length N, the median of V is the middle value of a sorted copy of V, V_sorted - i.e., V_sorted[(N-1)/2], when N is odd and the average of the two middle values of V_sorted when N is even.

Examples

>>> import numpy as np  
>>> a = np.array([[10.0, 7, 4], [3, 2, 1]])  
>>> a[0, 1] = np.nan  
>>> a  
array([[10., nan,  4.],
       [ 3.,  2.,  1.]])
>>> np.median(a)  
np.float64(nan)
>>> np.nanmedian(a)  
3.0
>>> np.nanmedian(a, axis=0)  
array([6.5, 2. , 2.5])
>>> np.median(a, axis=1)  
array([nan,  2.])
>>> b = a.copy()  
>>> np.nanmedian(b, axis=1, overwrite_input=True)  
array([7.,  2.])
>>> assert not np.all(a==b)  
>>> b = a.copy()  
>>> np.nanmedian(b, axis=None, overwrite_input=True)  
3.0
>>> assert not np.all(a==b)