pycircular package

Subpackages

Submodules

pycircular.circular module

Port of the circular functions needed by the time periodic analyzer. Original circular project in gitlab/R7-Projects/circular.

Note: _date2rad is different from pycircular.date2rad.

pycircular.circular.bwEstimation(x, lower=0.1, upper=500, xatol=1e-05)[source]

Estimate the bandwidth/smoothing parameter for the von Mises kernel, based on the bw.cv.ml.pycircular function from http://www.inside-r.org/packages/cran/circular/docs/bandwidth

Parameters
x: array-like of shape = [n_samples], in radians.
lower, upper: range over which to minimize for cross-validatory bandwidths. The default is almost always satisfactory, although it is recommended to experiment a little with different ranges.

Returns
bw: the bandwidth that minimises the squared-error loss and Kullback–Leibler loss for the von Mises pdf
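
One common way to pick such a bandwidth is leave-one-out likelihood cross-validation. The sketch below illustrates that idea using only the standard library; it is not taken from pycircular, and whether bwEstimation follows exactly this procedure is an assumption. The names bessel_i0e, loo_log_likelihood and cv_bandwidth are hypothetical, and a log-spaced grid search stands in for a proper scalar minimizer.

```python
import math


def bessel_i0e(kappa):
    """exp(-kappa) * I0(kappa) via the power series of I0 (fine for kappa below ~700)."""
    half_sq = (kappa / 2.0) ** 2
    term, total, k = 1.0, 1.0, 0
    while term > 1e-16 * total:
        k += 1
        term *= half_sq / (k * k)
        total += term
    return total * math.exp(-kappa)


def loo_log_likelihood(x, kappa):
    """Leave-one-out log-likelihood of a von Mises kernel density estimate."""
    n = len(x)
    # Each kernel is exp(kappa * (cos(d) - 1)) / (2 * pi * i0e(kappa)).
    norm = 2.0 * math.pi * bessel_i0e(kappa) * (n - 1)
    total = 0.0
    for i in range(n):
        s = sum(math.exp(kappa * (math.cos(x[i] - x[j]) - 1.0))
                for j in range(n) if j != i)
        if s <= 0.0:  # every other kernel underflowed at x[i]
            return float("-inf")
        total += math.log(s / norm)
    return total


def cv_bandwidth(x, lower=0.1, upper=500.0, n_grid=200):
    """Return the grid value in [lower, upper] with the highest LOO likelihood."""
    step = (math.log(upper) - math.log(lower)) / (n_grid - 1)
    grid = [lower * math.exp(step * k) for k in range(n_grid)]
    return max(grid, key=lambda kappa: loo_log_likelihood(x, kappa))


tight = [1.0 + 0.05 * i for i in range(10)]  # samples packed into 0.45 rad
spread = [0.6 * i for i in range(10)]        # samples spread around the circle
bw_tight = cv_bandwidth(tight)
bw_spread = cv_bandwidth(spread)
print(bw_tight, bw_spread)
```

In this sketch a tightly clustered sample selects a larger concentration than a sample spread around the circle, which is why experimenting with the lower and upper bounds can matter.
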

pycircular.circular.kernel(x, bw=10, n=256)[source]

Estimate the von Mises kernel

Parameters
x: array-like of shape = [n_samples], in radians.
bw: bandwidth of the kernel estimation

The bandwidth is related to the width of the individual von Mises distributions

n: number of points of the kernel
Returns
y: array-like of shape = [n]

Calculated von Mises kernel
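
As an illustration of what a von Mises kernel estimate looks like, here is a minimal standard-library sketch. It assumes the n evaluation points are equally spaced on [0, 2*pi) and that bw plays the role of the von Mises concentration; neither assumption is confirmed by this page, and bessel_i0 and vonmises_kde are hypothetical names, not pycircular functions.

```python
import math


def bessel_i0(kappa):
    """Modified Bessel function of the first kind, order 0, by power series."""
    half_sq = (kappa / 2.0) ** 2
    term, total, k = 1.0, 1.0, 0
    while term > 1e-16 * total:
        k += 1
        term *= half_sq / (k * k)
        total += term
    return total


def vonmises_kde(x, bw=10.0, n=256):
    """Average one von Mises density per sample, evaluated on n grid points."""
    grid = [2.0 * math.pi * k / n for k in range(n)]  # assumed evaluation grid
    norm = 2.0 * math.pi * bessel_i0(bw) * len(x)
    density = [sum(math.exp(bw * math.cos(t - xi)) for xi in x) / norm
               for t in grid]
    return grid, density


x = [0.8, 1.0, 1.1, 1.15, 4.0, 4.2, 4.3, 4.4]
grid, y = vonmises_kde(x, bw=10.0, n=256)
area = sum(y) * 2.0 * math.pi / len(y)  # Riemann sum over the circle
peak = grid[y.index(max(y))]            # grid point with the highest density
print(area, peak)
```

The Riemann sum is a quick sanity check that the estimate is a proper density over the circle.
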

pycircular.plots module

pycircular.plots.base_periodic_fig(dates, freq, bottom=0, ymax=1, rescale=True, figsize=(8, 8), time_segment='hour', fig=None, ax1=None)[source]

Base figure for plotting periodic time variables

Parameters
dates: array-like of shape = [unique_n_samples] of dates in format ‘datetime64[ns]’.
freq: array-like of shape = [unique_n_samples] of frequencies for each date.
# TODO: finish
time_segment: string of values [‘hour’, ‘dayweek’, ‘daymonth’]
Returns
fig, ax: Figure and axis objects

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> from pycircular.utils import freq_time, date2rad
>>> from pycircular.plots import base_periodic_fig
>>> dates = pd.to_datetime(["2013-10-02 19:10:00", "2013-10-21 19:00:00", "2013-10-24 3:00:00"])
>>> time_segment = 'dayweek'  # 'hour', 'dayweek', 'daymonth'
>>> freq_arr, times = freq_time(dates, time_segment=time_segment)
>>> fig, ax1 = base_periodic_fig(freq_arr[:, 0], freq_arr[:, 1], time_segment=time_segment)

pycircular.plots.clock_vonmises_distribution(ax1, mean, x, p, rescale=True)[source]

pycircular.plots.plot_CDF_kernel(x, y)[source]

Plot of the CDF kernel and kuiper test

Parameters
x: array-like of shape = [n_samples], in radians.
y: array-like of the kernel density.
Returns
fig, ax: Figure and axis objects

Examples

>>> import numpy as np
>>> from pycircular.circular import kernel, bwEstimation
>>> from pycircular.stats import kuiper_two
>>> from pycircular.plots import plot_CDF_kernel
>>> x = np.array([0.8 ,  1.  ,  1.1 ,  1.15,  4.  ,  4.2 ,  4.3 ,  4.4])
>>> bw = bwEstimation(x, upper=500)
>>> y = kernel(x, bw=bw)
>>> fig, ax1 = plot_CDF_kernel(x, y)

pycircular.plots.plot_kernel(dates, freq, y, bottom=0, ymax=1, rescale=True, figsize=(8, 8), time_segment='hour', fig=None, ax1=None)[source]

Figure for plotting the kernel # TODO: finish

pycircular.stats module

pycircular.stats.kuiper_two(x, y, return_all=False)[source]

Compute the Kuiper statistic to compare two samples. By Anne M. Archibald, 2007 and 2009, from https://github.com/aarchiba/kuiper/blob/master/kuiper.py

Parameters
x: array-like

The first set of data values.

y: array-like

The second set of data values.

return_all: bool, whether to return additional information for plotting
Returns
fpp: float

The probability of obtaining two samples this different from the same distribution.

others: tuple, if return_all

(z, d_cdf, k_cdf, D1_, D2_, D1, D2)

Notes

Warning: the fpp is quite approximate, especially for small samples.
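
For intuition, the two-sample Kuiper statistic itself can be sketched with the standard library: it adds the largest positive and the largest negative gap between the two empirical CDFs. This sketch computes only the statistic, not the fpp, and kuiper_statistic is a hypothetical helper rather than the implementation referenced above.

```python
from bisect import bisect_right


def kuiper_statistic(x, y):
    """Two-sample Kuiper statistic V = D+ + D- from the empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    d_plus = d_minus = 0.0
    # Both CDFs are step functions, so the gaps only change at sample points.
    for t in xs + ys:
        fx = bisect_right(xs, t) / len(xs)
        fy = bisect_right(ys, t) / len(ys)
        d_plus = max(d_plus, fx - fy)
        d_minus = max(d_minus, fy - fx)
    return d_plus + d_minus


a = [0.8, 1.0, 1.1, 1.15]
b = [4.0, 4.2, 4.3, 4.4]
print(kuiper_statistic(a, a))  # identical samples
print(kuiper_statistic(a, b))  # completely separated samples
```

Unlike the Kolmogorov-Smirnov statistic, summing both gaps makes the value independent of where the circle is cut, which is why it suits angular data.
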

pycircular.stats.periodic_mean_std(angles)[source]

Calculate the periodic mean and std

Parameters
angles: array-like or pandas.DataFrame of angles.

It does not matter whether they are in radians or degrees.

Returns
mean: float

calculated periodic mean

std: float

calculated periodic std

Examples

>>> import numpy as np
>>> from pycircular.stats import periodic_mean_std
>>> angles = np.array([0, 1, 2, 4, 6, 7, 8])
>>> print(angles.mean(), angles.std())
>>> print(periodic_mean_std(angles))
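
To show why a periodic mean differs from the arithmetic one, the sketch below uses the usual resultant-vector definitions: the mean is the direction of the average unit vector and the std is sqrt(-2 ln R), with R the resultant length. It assumes angles in radians and is not taken from pycircular; whether periodic_mean_std uses these exact formulas is an assumption, and circular_mean_std is a hypothetical name.

```python
import math


def circular_mean_std(angles):
    """Circular mean and standard deviation of angles given in radians."""
    n = len(angles)
    c = sum(math.cos(a) for a in angles) / n
    s = sum(math.sin(a) for a in angles) / n
    mean = math.atan2(s, c) % (2.0 * math.pi)
    r = min(1.0, math.hypot(c, s))  # resultant length, clipped against rounding
    std = math.inf if r == 0.0 else math.sqrt(-2.0 * math.log(r))
    return mean, std


# Two angles just either side of 0: the arithmetic mean lands on the far side
# of the circle, while the circular mean stays next to 0.
angles = [0.1, 2.0 * math.pi - 0.1]
arithmetic_mean = sum(angles) / len(angles)
mean, std = circular_mean_std(angles)
print(arithmetic_mean, mean, std)
```
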

pycircular.stats.von_mises_distribution(mean, std, size=240)[source]

Calculate the von Mises distribution

Parameters
mean: float

calculated periodic mean

std: float

calculated periodic std

Returns
x: array-like of shape [size]

radians across the circle

p: array-like of shape [size]

von Mises pdf for each value of x

pycircular.training module

pycircular.training.train_time_periodic(trx_train, n=256, idname='account')[source]

Evaluate the time periodic risk of different accounts

Parameters
trx_train: pd.DataFrame of the transactions
n: number of points of the kernel
Returns
risks_all: dictionary of shape n_accounts where each value is a pd.DataFrame of shape = [3, n + 1] for each account. The rows are the time segments [‘hour’, ‘dayweek’, ‘daymonth’]; the columns are the n points of the kernel and the confidence of the kernel.

pycircular.utils module

pycircular.utils.date2rad(dates, time_segment='hour')[source]

Convert hours to radians

Parameters
dates: array-like of shape = [n_samples] of hours.

Where the decimal part represents minutes / 60 + seconds / 60 / 100 … 0 <= times[:] <= 24

time_segment: string of values [‘hour’, ‘dayweek’, ‘daymonth’]
Returns
radians: array-like of shape = [n_samples]

Calculated radians

Examples

>>> import numpy as np
>>> from pycircular.utils import date2rad
>>> times = np.array([4, 6, 7])
>>> for time_segment in ['hour', 'dayweek', 'daymonth']:
...     print(time_segment, date2rad(times, time_segment))
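
The underlying idea is to map a time value onto the circle by dividing it by the length of its period. The sketch below assumes periods of 24 hours, 7 weekdays and 31 days of the month; these values, the PERIODS table and times_to_radians are illustrative assumptions, not pycircular's implementation.

```python
import math

# Assumed period lengths for each time segment (hypothetical values).
PERIODS = {"hour": 24, "dayweek": 7, "daymonth": 31}


def times_to_radians(times, time_segment="hour"):
    """Map time values onto [0, 2*pi) according to the segment's period."""
    period = PERIODS[time_segment]
    return [2.0 * math.pi * (t % period) / period for t in times]


times = [4, 6, 7]
for time_segment in ["hour", "dayweek", "daymonth"]:
    print(time_segment, times_to_radians(times, time_segment))
```

Under these assumptions 6 o'clock sits a quarter of the way around the circle, and a value equal to the full period wraps back to 0.
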

pycircular.utils.freq_time(dates, time_segment='hour', freq=True, continious=True)[source]

Calculate the frequency per time period and the continuous time period

Parameters
dates: array-like of shape = [n_samples] of dates in format ‘datetime64[ns]’.
time_segment: string of values [‘hour’, ‘dayweek’, ‘daymonth’]
freq: whether to return the frequencies
Returns
freq_arr: array-like of shape = [unique_time_segment, 2]
where: freq_arr[:, 0] = unique_time_segment

freq_arr[:, 1] = frequency in percentage

times: array-like of shape = [n_samples]

Where the decimal part represents minutes / hours depending on time_segment

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from pycircular.utils import freq_time
>>> dates = pd.to_datetime(["2013-10-02 19:10:00", "2013-10-21 19:00:00", "2013-10-24 3:00:00"])
>>> for time_segment in ['hour', 'dayweek', 'daymonth']:
...     print(time_segment)
...     freq_arr, times = freq_time(dates, time_segment=time_segment)
...     print(freq_arr)
...     print(times)
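
A rough standard-library sketch of the two outputs follows: a continuous time value per date and the share of samples falling in each whole time unit. It assumes Monday is weekday 0 and reports frequencies as fractions of the sample; continuous_time and frequency_table are hypothetical helpers, and pycircular's own conventions may differ.

```python
from collections import Counter
from datetime import datetime


def continuous_time(date, time_segment="hour"):
    """Express a datetime as a fractional hour, weekday or day of month."""
    hours = date.hour + date.minute / 60 + date.second / 3600
    if time_segment == "hour":
        return hours
    if time_segment == "dayweek":
        return date.weekday() + hours / 24  # assumes Monday = 0
    if time_segment == "daymonth":
        return date.day + hours / 24
    raise ValueError(f"unknown time_segment: {time_segment!r}")


def frequency_table(dates, time_segment="hour"):
    """Return ({unit: fraction of samples}, continuous times)."""
    times = [continuous_time(d, time_segment) for d in dates]
    counts = Counter(int(t) for t in times)
    freq = {unit: count / len(times) for unit, count in sorted(counts.items())}
    return freq, times


dates = [
    datetime(2013, 10, 2, 19, 10),
    datetime(2013, 10, 21, 19, 0),
    datetime(2013, 10, 24, 3, 0),
]
for time_segment in ["hour", "dayweek", "daymonth"]:
    freq, times = frequency_table(dates, time_segment)
    print(time_segment, freq, times)
```
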

Module contents

PyCircular

pycircular is a Python module for circular data analysis, built on top of Scikit-Learn and SciPy and distributed under the 3-Clause BSD license. In particular, it provides:

1. A set of circular analysis algorithms
2. Different real-world datasets

Installation

You can install pycircular with pip:

    pip install pycircular