Metric functions

pasted._metrics

Disorder-metric computations.

All public metric functions accept pts (position array) plus a cutoff parameter and build their own neighbor lists internally.

C++ path (HAS_GRAPH = True): rdf_h_cpp and graph_metrics_cpp use a single FlatCellList pass for O(N*k) pair enumeration. scipy.spatial.distance.pdist / squareform are not called on this path.

Pure-Python fallback (HAS_GRAPH = False): _compute_graph_ring_charge falls back to _squareform(_pdist(pts)) — an O(N²) operation. This path is active when the C++17 extensions did not compile at install time (e.g. no compiler available). For N ≳ 500 the O(N²) cost becomes significant; see the Installation section of the quickstart for performance guidance.
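The difference between the two enumeration strategies can be illustrated with SciPy alone. The sketch below is not the library's code: it compares a dense pdist / squareform matrix against a cKDTree pair query on random points (the point count, box size, and cutoff are arbitrary illustration values) and checks that both find the same neighbor pairs.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.spatial.distance import pdist, squareform

# Arbitrary illustration data: 200 random points in a 10 Å box.
rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 10.0, size=(200, 3))
cutoff = 2.0

# Dense route: builds the full N x N matrix, O(N^2) time and memory.
dmat = squareform(pdist(pts))
i, j = np.nonzero(np.triu(dmat <= cutoff, k=1))
dense_pairs = set(zip(i.tolist(), j.tolist()))

# Sparse route: only pairs within the cutoff are ever visited.
tree_pairs = cKDTree(pts).query_pairs(r=cutoff)

print(len(dense_pairs), len(tree_pairs))
```

Both routes return pairs (i, j) with i < j, so the two sets can be compared directly.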

pasted._metrics.compute_all_metrics(atoms: list[str], positions: list[Vec3], n_bins: int, w_atom: float, w_spatial: float, cutoff: float, cov_scale: float = 1.0) -> dict[str, float]

Compute all disorder metrics for a single structure.

The number of metrics returned equals len(ALL_METRICS).

C++ path (HAS_GRAPH = True): all pair-enumeration uses a single FlatCellList built in C++, giving O(N*k) complexity throughout. scipy.spatial.distance.pdist / squareform are not called.

Pure-Python fallback (HAS_GRAPH = False): compute_h_spatial() and compute_rdf_deviation() use scipy.spatial.cKDTree (O(N*k)), but the five graph/ring/charge/Moran metrics are computed via _compute_graph_ring_charge(), which builds a full N×N distance matrix with pdist / squareform — an O(N²) operation.

Parameters:
  • atoms – Element symbols.

  • positions – Cartesian coordinates (Å).

  • n_bins – Histogram bins for compute_h_spatial() and compute_rdf_deviation().

  • w_atom – Weight of H_atom in H_total.

  • w_spatial – Weight of H_spatial in H_total.

  • cutoff – Distance cutoff (Å) for all local metrics.

  • cov_scale – Retained for API compatibility; no longer used internally. Defaults to 1.0.

Return type:

dict with keys matching pasted._atoms.ALL_METRICS.

pasted._metrics.compute_angular_entropy(positions: list[Vec3], cutoff: float, n_bins: int = 20) -> float

Mean per-atom angular entropy of neighbor direction distributions.

For each atom i, the directions to its neighbors within cutoff are projected onto the unit sphere. The polar angle theta distribution is histogrammed and its Shannon entropy is computed. The result is averaged over all atoms that have at least one neighbor.

A value close to ln(n_bins) indicates a near-uniform (maximum-entropy) angular distribution – neighbors are spread evenly over the sphere. A low value indicates clustering of neighbors in certain directions, i.e. accidental local order.

This metric is intended as a diagnostic for the maxent placement mode and is not included in ALL_METRICS or in XYZ comment lines.

Uses scipy.spatial.cKDTree for O(N*k) pair enumeration instead of a full O(N^2) distance matrix.

Parameters:
  • positions – Cartesian coordinates (Å).

  • cutoff – Neighbor distance cutoff (Å).

  • n_bins – Number of histogram bins for the theta distribution (default: 20).

Returns:

Mean per-atom angular Shannon entropy. Returns 0.0 for structures with fewer than two atoms or no neighbors within cutoff.

Return type:

float
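The procedure described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name is hypothetical, and it assumes the polar angle is measured from the z axis and histogrammed in equal-width bins over [0, π].

```python
import numpy as np
from scipy.spatial import cKDTree


def angular_entropy_sketch(positions, cutoff, n_bins=20):
    """Mean per-atom Shannon entropy of the neighbor polar-angle histogram."""
    pts = np.asarray(positions, dtype=float)
    if len(pts) < 2:
        return 0.0
    tree = cKDTree(pts)
    neighbor_lists = tree.query_ball_point(pts, r=cutoff)
    entropies = []
    for i, nbrs in enumerate(neighbor_lists):
        nbrs = [j for j in nbrs if j != i]  # drop the atom itself
        if not nbrs:
            continue
        vecs = pts[nbrs] - pts[i]
        norms = np.linalg.norm(vecs, axis=1)
        keep = norms > 0  # skip coincident points
        if not keep.any():
            continue
        # Assumption: theta is the angle to the z axis.
        cos_theta = np.clip(vecs[keep, 2] / norms[keep], -1.0, 1.0)
        theta = np.arccos(cos_theta)
        hist, _ = np.histogram(theta, bins=n_bins, range=(0.0, np.pi))
        p = hist[hist > 0] / hist.sum()
        entropies.append(-np.sum(p * np.log(p)))
    return float(np.mean(entropies)) if entropies else 0.0
```

With equal-width theta bins the maximum ln(n_bins) is reached only when every bin is equally populated; whether the library bins theta or cos(theta) is not stated here, so the binning is part of the assumption.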

pasted._metrics.compute_charge_frustration(atoms: list[str], dmat: ndarray, cutoff: float) -> float

Variance of Pauling electronegativity differences across neighbor pairs.

For each neighbor pair (i, j) within cutoff, the absolute electronegativity difference abs(chi_i - chi_j) is computed. The metric is the variance of these differences over all neighbor pairs.

A high value indicates a structure where electronegativity differences are inconsistently distributed across bonds: some neighbors are well matched while others are highly mismatched. This is analogous to charge frustration in disordered materials, where local charge neutrality cannot be satisfied simultaneously at every site.

Noble gases and elements without a Pauling value use the module-level fallback of 1.0 (see pauling_electronegativity()).

Parameters:
  • atoms – Element symbols.

  • dmat – Full n x n pairwise distance matrix (Å).

  • cutoff – Distance cutoff (Å). A pair is counted as connected when d_ij <= cutoff.

Returns:

Variance of abs(delta-chi) across all neighbor pairs. Returns 0.0 when fewer than two neighbor pairs are detected (variance is undefined for a single observation).

Return type:

float
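A minimal sketch of this definition is shown below. The function name and the abbreviated electronegativity table are hypothetical stand-ins for pauling_electronegativity(), and the use of the population variance (NumPy's default, ddof=0) is an assumption.

```python
import numpy as np

# Hypothetical, abbreviated Pauling table for illustration only.
CHI = {"H": 2.20, "C": 2.55, "N": 3.04, "O": 3.44, "Na": 0.93, "Cl": 3.16}
FALLBACK_CHI = 1.0  # fallback value stated in the text above


def charge_frustration_sketch(atoms, dmat, cutoff):
    """Variance of |chi_i - chi_j| over neighbor pairs with d_ij <= cutoff."""
    chi = np.array([CHI.get(a, FALLBACK_CHI) for a in atoms])
    i, j = np.triu_indices(len(atoms), k=1)  # each pair counted once
    mask = dmat[i, j] <= cutoff
    if mask.sum() < 2:
        return 0.0
    diffs = np.abs(chi[i[mask]] - chi[j[mask]])
    # Assumption: population variance (ddof=0).
    return float(np.var(diffs))
```

For a linear H, O, C chain with unit spacing and cutoff 1.5, the two neighbor pairs have differences 1.24 and 0.89, so the population variance is 0.175² = 0.030625.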

pasted._metrics.compute_graph_metrics(dmat: ndarray, cutoff: float) -> dict[str, float]

Largest connected-component fraction and mean clustering coefficient.

Pure-Python fallback used when HAS_GRAPH is False. The C++ path in graph_metrics_cpp is preferred and is invoked automatically by _compute_graph_ring_charge().

Parameters:
  • dmat – Full n x n pairwise distance matrix.

  • cutoff – Adjacency distance cutoff (Å).

Return type:

dict with keys "graph_lcc" and "graph_cc".
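The two quantities can be sketched from a dense distance matrix as follows. This is an illustration under stated assumptions, not the library's fallback code: the function name is hypothetical, and atoms with fewer than two neighbors are assumed to contribute a clustering coefficient of 0 to the mean.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components


def graph_metrics_sketch(dmat, cutoff):
    """Largest-component fraction and mean local clustering coefficient."""
    n = dmat.shape[0]
    if n == 0:
        return {"graph_lcc": 0.0, "graph_cc": 0.0}
    adj = (dmat <= cutoff).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops

    # Largest connected component as a fraction of all atoms.
    _, labels = connected_components(csr_matrix(adj), directed=False)
    lcc = np.bincount(labels).max() / n

    # diag(A^3)[i] counts closed 3-walks from i, i.e. twice its triangles.
    triangles = np.diag(adj @ adj @ adj) / 2.0
    deg = adj.sum(axis=1)
    possible = deg * (deg - 1.0) / 2.0
    # Assumption: atoms with degree < 2 get a coefficient of 0.
    cc = np.divide(triangles, possible, out=np.zeros(n), where=possible > 0)
    return {"graph_lcc": float(lcc), "graph_cc": float(cc.mean())}
```

For an equilateral triangle of atoms plus one distant atom, both values come out as 0.75 under these conventions.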

pasted._metrics.compute_h_atom(atoms: list[str]) -> float

Shannon entropy of the element composition.

Range: 0 (pure single element) to ln(k) for k distinct elements.
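As a minimal sketch of this definition (the function name is hypothetical; natural logarithms are used, consistent with the stated upper bound of ln(k)):

```python
import math
from collections import Counter


def composition_entropy_sketch(atoms):
    """Shannon entropy (natural log) of the element fractions."""
    n = len(atoms)
    if n == 0:
        return 0.0
    counts = Counter(atoms)
    # Sum of -p * ln(p) over the distinct elements.
    return sum(-(c / n) * math.log(c / n) for c in counts.values())
```

An equimolar two-element list such as ["C", "O"] gives ln(2), and a single-element list gives 0.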

pasted._metrics.compute_h_spatial(pts: ndarray, cutoff: float, n_bins: int) -> float

Shannon entropy of the pair-distance histogram within cutoff.

Only pairs with d_ij <= cutoff are included, matching the locality assumption used by all other metrics. Higher values indicate a more uniform distribution of short-range distances.

Parameters:
  • pts – Positions array of shape (n, 3).

  • cutoff – Neighbor distance cutoff (Å).

  • n_bins – Number of histogram bins over [0, cutoff].
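A minimal sketch of this metric, assuming natural logarithms and a return value of 0.0 when no pair lies within cutoff (the function name is hypothetical):

```python
import numpy as np
from scipy.spatial import cKDTree


def h_spatial_sketch(pts, cutoff, n_bins):
    """Shannon entropy of the histogram of pair distances <= cutoff."""
    pts = np.asarray(pts, dtype=float)
    if len(pts) < 2:
        return 0.0
    # Each unordered pair within the cutoff, as an (n_pairs, 2) index array.
    pairs = cKDTree(pts).query_pairs(r=cutoff, output_type="ndarray")
    if len(pairs) == 0:
        return 0.0
    d = np.linalg.norm(pts[pairs[:, 0]] - pts[pairs[:, 1]], axis=1)
    hist, _ = np.histogram(d, bins=n_bins, range=(0.0, cutoff))
    p = hist[hist > 0] / hist.sum()  # empty bins contribute nothing
    return float(-np.sum(p * np.log(p)))
```

Three collinear atoms at x = 0, 1, 3 with cutoff 2.5 and five bins yield two occupied bins with equal weight, so the entropy is ln(2).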

pasted._metrics.compute_moran_I_chi(atoms: list[str], dmat: ndarray, cutoff: float) -> float

Moran's I spatial autocorrelation for Pauling electronegativity.

Measures whether atoms with similar electronegativity cluster spatially.

\[I = \frac{N}{W} \frac{\sum_{i \neq j} w_{ij}(\chi_i - \bar{\chi}) (\chi_j - \bar{\chi})}{\sum_i (\chi_i - \bar{\chi})^2}\]

where \(w_{ij} = 1\) when \(d_{ij} \leq\) cutoff and 0 otherwise.

Parameters:
  • atoms – Element symbols.

  • dmat – Full n x n pairwise distance matrix (Å).

  • cutoff – Distance cutoff for the step-function weight matrix (Å).

Returns:

Moran's I in (-1, 1]. Returns 0.0 when all atoms share the same electronegativity or no pair falls within cutoff.

  • I ~= 0 : random spatial arrangement (target for disordered structures)

  • I > 0 : same-electronegativity atoms cluster spatially

  • I < 0 : alternating high/low electronegativity (ionic-crystal-like)

Return type:

float
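The formula above can be evaluated directly with NumPy. In this sketch the function name is hypothetical, and it takes the per-atom values as an array rather than looking up electronegativities from element symbols.

```python
import numpy as np


def moran_i_sketch(values, dmat, cutoff):
    """Moran's I with step-function weights w_ij = 1 if d_ij <= cutoff."""
    x = np.asarray(values, dtype=float)
    n = x.size
    w = (dmat <= cutoff).astype(float)
    np.fill_diagonal(w, 0.0)  # the sum runs over i != j
    w_sum = w.sum()  # W in the formula
    dev = x - x.mean()
    denom = np.sum(dev**2)
    if w_sum == 0 or denom == 0:
        return 0.0
    # (N / W) * sum_ij w_ij dev_i dev_j / sum_i dev_i^2
    return float((n / w_sum) * (dev @ w @ dev) / denom)
```

For a four-atom chain with unit spacing, cutoff 1.1, and values [1, 1, 3, 3], like values sit next to each other and the result is 1/3, a positive autocorrelation.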

pasted._metrics.compute_rdf_deviation(pts: ndarray, cutoff: float, n_bins: int) -> float

RMS deviation of the empirical g(r) from an ideal-gas baseline.

A value of 0 indicates a perfectly random (ideal-gas-like) distribution of pair distances within cutoff. Only pairs with d_ij <= cutoff are included in the histogram, consistent with the other local metrics.

Parameters:
  • pts – Positions array of shape (n, 3).

  • cutoff – Neighbor distance cutoff (Å). The histogram range is [0, cutoff].

  • n_bins – Number of histogram bins.
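One way such a deviation could be computed is sketched below. The normalization is an assumption, not taken from the library: the ideal-gas expectation for each bin is taken as the total number of in-cutoff pairs times the bin's share of the sphere volume, so g(r) = 1 in every bin for a uniform distribution. The function name is hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree


def rdf_deviation_sketch(pts, cutoff, n_bins):
    """RMS of g(r) - 1 over the bins, with a self-normalized baseline."""
    pts = np.asarray(pts, dtype=float)
    if len(pts) < 2:
        return 0.0
    pairs = cKDTree(pts).query_pairs(r=cutoff, output_type="ndarray")
    if len(pairs) == 0:
        return 0.0
    d = np.linalg.norm(pts[pairs[:, 0]] - pts[pairs[:, 1]], axis=1)
    counts, edges = np.histogram(d, bins=n_bins, range=(0.0, cutoff))
    # Assumed baseline: pair counts proportional to the shell volume.
    shell_share = (edges[1:] ** 3 - edges[:-1] ** 3) / cutoff**3
    expected = counts.sum() * shell_share
    g = counts / expected
    return float(np.sqrt(np.mean((g - 1.0) ** 2)))
```

With this normalization a single bin always gives 0, and two atoms 1 Å apart with cutoff 2 Å and two bins give g = [0, 8/7], hence an RMS deviation of 5/7.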

pasted._metrics.compute_ring_fraction(atoms: list[str], dmat: ndarray, cutoff: float) -> float

Fraction of atoms that belong to at least one ring.

Builds a neighbor graph using the cutoff distance threshold, then detects rings via a Union-Find spanning-tree construction: every back-edge (i.e. an edge between two vertices already in the same component) indicates a cycle, and both its endpoints are marked as ring members.

Parameters:
  • atoms – Element symbols (unused; retained for API symmetry with other metrics).

  • dmat – Full n x n pairwise distance matrix (Å).

  • cutoff – Distance cutoff (Å). A pair is counted as connected when d_ij <= cutoff.

Returns:

Fraction of atoms in at least one ring, in [0, 1]. Returns 0.0 for structures with fewer than three atoms or no cycles.

Return type:

float
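The Union-Find back-edge detection described above can be sketched as follows. The function name is hypothetical and the edge visiting order (i < j, row by row) is an assumption; since only the endpoints of each back-edge are marked, the order affects which atoms of a cycle are counted.

```python
import numpy as np


def ring_fraction_sketch(dmat, cutoff):
    """Fraction of atoms that are endpoints of a cycle-closing edge."""
    n = dmat.shape[0]
    if n < 3:
        return 0.0
    parent = list(range(n))

    def find(x):
        # Root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    in_ring = np.zeros(n, dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            if dmat[i, j] > cutoff:
                continue
            ri, rj = find(i), find(j)
            if ri == rj:
                # Back-edge: i and j were already connected, so a cycle closes.
                in_ring[i] = in_ring[j] = True
            else:
                parent[ri] = rj  # tree edge: merge the two components
    return float(in_ring.sum() / n)
```

For a triangle of atoms plus one isolated atom, edges (0, 1) and (0, 2) become tree edges and (1, 2) is the back-edge, so two of four atoms are marked and the sketch returns 0.5.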

pasted._metrics.compute_shape_anisotropy(pts: ndarray) -> float

Relative shape anisotropy from the gyration tensor.

Range: 0 (spherical) to 1 (rod-like). Returns NaN for a single atom.
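A minimal sketch using the standard definition κ² = (3/2)·Σλᵢ² / (Σλᵢ)² − 1/2, where λᵢ are the eigenvalues of the gyration tensor. The function name is hypothetical, and equal weights for all atoms (no mass weighting) are assumed.

```python
import numpy as np


def shape_anisotropy_sketch(pts):
    """Relative shape anisotropy from the unweighted gyration tensor."""
    pts = np.asarray(pts, dtype=float)
    centered = pts - pts.mean(axis=0)
    gyration = centered.T @ centered / len(pts)  # 3 x 3, symmetric
    lam = np.linalg.eigvalsh(gyration)
    trace = lam.sum()
    if trace == 0:
        # A single atom (or coincident atoms) has no extent.
        return float("nan")
    return float(1.5 * np.sum(lam**2) / trace**2 - 0.5)
```

Collinear points give a single nonzero eigenvalue and a value of 1; the six vertices of a regular octahedron give three equal eigenvalues and a value of 0.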

pasted._metrics.compute_steinhardt(pts: ndarray, l_values: list[int], cutoff: float) -> dict[str, float]

Steinhardt Q_l averaged over all atoms.

Delegates to compute_steinhardt_per_atom() and returns the per-structure mean for each l.

Parameters:
  • pts – Positions array of shape (n, 3).

  • l_values – List of l values (e.g. [4, 6, 8]).

  • cutoff – Neighbor distance cutoff (Å).

Return type:

dict mapping "Q{l}" to its global average value.

pasted._metrics.compute_steinhardt_per_atom(pts: ndarray, l_values: list[int], cutoff: float) -> dict[str, ndarray]

Per-atom Steinhardt Q_l values.

When the C++ extension pasted._ext._steinhardt_core is available (HAS_STEINHARDT = True), the computation uses a sparse neighbor list built internally by the extension, evaluating spherical harmonics only for actual neighbor pairs. This gives O(N*k) complexity (k = mean neighbor count).

When the extension is absent the function falls back to a sparse Python/NumPy implementation using scipy.spatial.cKDTree for neighbor enumeration and np.bincount for accumulation. Both paths have the same O(N*k) complexity.

Parameters:
  • pts – Positions array of shape (n, 3).

  • l_values – List of l values (e.g. [4, 6, 8]).

  • cutoff – Neighbor distance cutoff (Å).

Returns:

dict mapping "Q{l}" to a numpy.ndarray of shape (n,). Atoms with no neighbors within cutoff are assigned Q_l = 0.

pasted._metrics.passes_filters(metrics: dict[str, float], filters: list[tuple[str, float, float]]) -> bool

Return True iff metrics satisfies every (metric, lo, hi) filter.
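A minimal sketch of such a check, assuming both bounds are inclusive and that every filtered metric is present in the dict (the function name is hypothetical):

```python
def passes_filters_sketch(metrics, filters):
    """True when lo <= metrics[name] <= hi holds for every filter."""
    # Assumption: bounds are inclusive; a missing key raises KeyError.
    return all(lo <= metrics[name] <= hi for name, lo, hi in filters)


example = {"graph_lcc": 0.9, "moran_I_chi": 0.05}
print(passes_filters_sketch(example, [("graph_lcc", 0.8, 1.0)]))
```

An empty filter list passes any structure, since all() over an empty sequence is True.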

MM-level structural descriptors

compute_ring_fraction() and compute_charge_frustration(), both documented above, were added in v0.1.9 and revised in v0.1.13. They use the same cutoff distance threshold as graph_lcc, graph_cc, and moran_I_chi, so all five cutoff-based metrics share a single unified adjacency definition.

Both metrics are included in ALL_METRICS and can therefore be used as --filter targets on the CLI and in the StructureGenerator filters= parameter.