pyxpcm.models.pcm

class pyxpcm.models.pcm(K: int, features: {}, scaling=1, reduction=1, maxvar=15, classif='gmm', covariance_type='full', verb=False, debug=False, timeit=False, timeit_verb=False, chunk_size='auto', backend='sklearn')
Profile Classification Model class constructor
Consume and return xarray objects

__init__(self, K: int, features: {}, scaling=1, reduction=1, maxvar=15, classif='gmm', covariance_type='full', verb=False, debug=False, timeit=False, timeit_verb=False, chunk_size='auto', backend='sklearn')
Create the PCM instance
Parameters:
- K: int
The number of classes, or clusters, in the classification model.
- features: dict()
The vertical axis to use for each feature, e.g. {'temperature': np.arange(-2000, 0, 1)}
- scaling: int (default: 1)
Define the scaling method:
- 0: No scaling
- 1: Center on sample mean and scale by sample std
- 2: Center on sample mean only
- reduction: int (default: 1)
Define the dimensionality reduction method:
- 0: No reduction
- 1: Reduction using sklearn.decomposition.PCA
- maxvar: float (default: 99.9)
Maximum feature variance to preserve in the reduced dataset using sklearn.decomposition.PCA. In %.
- classif: str (default: ‘gmm’)
Define the classification method. The only method available as of now is a Gaussian Mixture Model. See sklearn.mixture.GaussianMixture for more details.
- covariance_type: str (default: ‘full’)
Define the type of covariance matrix shape to be used in the default classifier GMM. It can be ‘full’ (default), ‘tied’, ‘diag’ or ‘spherical’.
- verb: boolean (default: False)
More verbose output
- timeit: boolean (default: False)
Register time of operation for performance evaluation
- timeit_verb: boolean (default: False)
Print time of operation during execution
- chunk_size: ‘auto’ or int
Sampling chunk size (for the array of features after pre-processing)
- backend: str
Statistics library backend: ‘sklearn’ (default) or ‘dask_ml’
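The three scaling options listed above can be illustrated without pyxpcm. The following is a minimal sketch and not the library's implementation: scale_columns and profiles are hypothetical names, only the Python standard library is used, and the standard deviation is assumed to be the population one (ddof=0). Whether pyxpcm uses exactly that estimator is an assumption.

```python
from statistics import fmean, pstdev


def scale_columns(samples, scaling=1):
    """Scale a list of samples (rows) column by column.

    scaling: 0 = no scaling, 1 = center on the mean and divide by the std,
    2 = center on the mean only. Hypothetical helper, for illustration only.
    """
    if scaling not in (0, 1, 2):
        raise ValueError("scaling must be 0, 1 or 2")
    if scaling == 0:
        return [list(row) for row in samples]

    columns = list(zip(*samples))  # one tuple of values per vertical level
    means = [fmean(col) for col in columns]
    if scaling == 1:
        # Population std (assumption); a zero std is replaced by 1
        # so that constant columns do not trigger a division by zero.
        stds = [pstdev(col) or 1.0 for col in columns]
    else:
        stds = [1.0] * len(columns)

    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in samples
    ]


# Three made-up temperature profiles sampled at two depth levels.
profiles = [[10.0, 4.0], [12.0, 6.0], [14.0, 8.0]]

centered = scale_columns(profiles, scaling=2)      # mean removed per level
standardized = scale_columns(profiles, scaling=1)  # mean removed, unit std
```

With scaling=2 each depth level ends up with zero mean but keeps its original spread; with scaling=1 every level also has unit standard deviation, so no single level dominates the later reduction and classification steps.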
Methods
- __init__(self, K, features[, scaling, …])
Create the PCM instance
- bic(self, ds[, features, dim])
Compute Bayesian information criterion for the current model on the input dataset
- display(self[, deep])
Display detailed parameters of the PCM. This is not a get_params because it doesn’t return a dictionary. Set the Boolean option ‘deep’ to True to display all properties.
- fit(self, ds[, features, dim])
Estimate PCM parameters
- fit_predict(self, ds[, features, dim, …])
Estimate PCM parameters and predict classes
- predict(self, ds[, features, dim, inplace, name])
Predict labels for profile samples
- predict_proba(self, ds[, features, dim, …])
Predict posterior probability of each component given the data
- preprocessing(self, ds[, features, dim, …])
Dataset pre-processing of feature(s)
- preprocessing_this(self, da[, dim, …])
Pre-process data before anything
- ravel(self, da[, dim, feature_name])
Extract from N-d array a X(feature,sample) 2-d array and vertical dimension z
- score(self, ds[, features, dim])
Compute the per-sample average log-likelihood of the given data
- to_netcdf(self, ncfile, **ka)
Save a PCM to a netcdf file
- unravel(self, ds, sampling_dims, X)
Create a DataArray from a numpy array and sampling dimensions
Attributes
- F
Return the number of features
- K
Return the number of classes
- backend
Return the name of the statistic backend
- features
Return features definition dictionary
- fitstats
Estimator fit properties
- plot
Access plotting functions
- stat
Access statistics functions
- timeit
Return a pandas.DataFrame with the execution time of methods called on this instance