pyxpcm.pcm

class pyxpcm.pcm(K: int, features: {}, scaling=1, reduction=1, maxvar=15, classif='gmm', covariance_type='full', verb=False, debug=False, timeit=False, timeit_verb=False, chunk_size='auto', backend='sklearn')

Profile Classification Model class constructor. Consumes and returns xarray objects.

__init__(self, K: int, features: {}, scaling=1, reduction=1, maxvar=15, classif='gmm', covariance_type='full', verb=False, debug=False, timeit=False, timeit_verb=False, chunk_size='auto', backend='sklearn')

Create the PCM instance.
Parameters:

- K: int
  The number of classes (clusters) in the classification model.
- features: dict()
  The vertical axis to use for each feature, e.g.: {'temperature': np.arange(-2000, 0, 1)}
- scaling: int (default: 1)
  Define the scaling method:
  - 0: No scaling
  - 1: Center on sample mean and scale by sample std
  - 2: Center on sample mean only
- reduction: int (default: 1)
  Define the dimensionality reduction method:
  - 0: No reduction
  - 1: Reduction using sklearn.decomposition.PCA
- maxvar: float (default: 15)
  Maximum feature variance to preserve, in %, in the reduced dataset using sklearn.decomposition.PCA.
- classif: str (default: 'gmm')
  Define the classification method. The only method available as of now is a Gaussian Mixture Model; see sklearn.mixture.GaussianMixture for more details.
- covariance_type: str (default: 'full')
  Type of covariance matrix to use in the default GMM classifier: 'full' (default), 'tied', 'diag' or 'spherical'.
- verb: boolean (default: False)
  More verbose output.
- timeit: boolean (default: False)
  Register the execution time of operations for performance evaluation.
- timeit_verb: boolean (default: False)
  Print the execution time of operations during execution.
- chunk_size: 'auto' or int
  Sampling chunk size of the array of features after pre-processing.
- backend: str
  Statistics library backend: 'sklearn' (default) or 'dask_ml'.
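The scaling, reduction and classification options map onto standard scikit-learn components. A minimal sketch of what they control, written with scikit-learn directly rather than pyxpcm (the data, K and maxvar values here are illustrative, not defaults taken from the library):

```python
# Illustrative sketch of the pipeline the constructor options configure,
# using scikit-learn directly (this is NOT pyxpcm's internal code).
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 50))   # 500 profiles x 50 vertical levels (toy data)

# scaling=1: center on sample mean and scale by sample std
Xs = StandardScaler(with_mean=True, with_std=True).fit_transform(X)

# reduction=1 with maxvar: keep enough PCA components to explain
# maxvar % of the variance (sklearn takes a float in (0, 1))
maxvar = 99.9
Xr = PCA(n_components=maxvar / 100).fit_transform(Xs)

# classif='gmm' with covariance_type='full': fit a K-class mixture model
K = 4
gmm = GaussianMixture(n_components=K, covariance_type='full', random_state=0)
labels = gmm.fit_predict(Xr)     # one class label per profile

# sklearn also exposes the Bayesian information criterion, which is what
# the pcm.bic() method reports for model selection
print(labels.shape, gmm.bic(Xr))
```

Varying K and comparing the BIC values is the usual way to choose the number of classes.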
Methods

- __init__(self, K, features[, scaling, …]): Create the PCM instance
- bic(self, ds[, features, dim]): Compute the Bayesian information criterion for the current model on the input dataset
- display(self[, deep]): Display detailed parameters of the PCM (this is not a get_params, because it does not return a dictionary); set the boolean option 'deep' to True to display all properties
- fit(self, ds[, features, dim]): Estimate PCM parameters
- fit_predict(self, ds[, features, dim, …]): Estimate PCM parameters and predict classes
- predict(self, ds[, features, dim, inplace, name]): Predict labels for profile samples
- predict_proba(self, ds[, features, dim, …]): Predict the posterior probability of each component given the data
- preprocessing(self, ds[, features, dim, …]): Dataset pre-processing of feature(s)
- preprocessing_this(self, da[, dim, …]): Pre-process data before anything
- ravel(self, da[, dim, feature_name]): Extract from an N-d array a 2-d X(feature, sample) array and the vertical dimension z
- score(self, ds[, features, dim]): Compute the per-sample average log-likelihood of the given data
- to_netcdf(self, ncfile, \*\*ka): Save a PCM to a netcdf file
- unravel(self, ds, sampling_dims, X): Create a DataArray from a numpy array and sampling dimensions

Attributes

- F: Return the number of features
- K: Return the number of classes
- backend: Return the name of the statistic backend
- features: Return the features definition dictionary
- fitstats: Estimator fit properties
- plot: Access plotting functions
- stat: Access statistics functions
- timeit: Return a pandas.DataFrame with the execution time of methods called on this instance
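The ravel/unravel pair underlies most of the methods above: sampling dimensions of an N-d field are collapsed into one sample axis (with the vertical axis as the feature axis) before classification, and per-sample results are mapped back afterwards. A minimal numpy sketch of the idea, not pyxpcm's implementation (the dimension names and sizes are made up for illustration):

```python
# Toy sketch of the ravel/unravel idea with plain numpy reshapes
# (pyxpcm does this on labelled xarray objects, not raw arrays).
import numpy as np

lat, lon, depth = 10, 20, 30
da = np.arange(lat * lon * depth, dtype=float).reshape(lat, lon, depth)

# "ravel": (lat, lon, depth) -> (sample, feature), sample = lat * lon
X = da.reshape(-1, depth)
print(X.shape)                 # (200, 30)

# "unravel": map a per-sample result (e.g. predicted class labels)
# back onto the original sampling dimensions
labels = np.zeros(X.shape[0])
labels_2d = labels.reshape(lat, lon)
print(labels_2d.shape)         # (10, 20)
```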