Analysis

Coupling Function Analysis Module for gwexpy.

Estimates the coupling function (CF) with flexible threshold strategies: - RatioThreshold: Mean power ratio. - SigmaThreshold: Statistical significance (Gaussian assumption). - PercentileThreshold: Data-driven percentile (Robust to non-Gaussianity).

class gwexpy.analysis.coupling.ThresholdStrategy[source]

Bases: ABC

Abstract base class for excess detection strategies.

abstractmethod check(psd_inj: FrequencySeries, psd_bkg: FrequencySeries, raw_bkg: TimeSeries | None = None, **kwargs: object) ndarray[source]

Return a boolean mask where P_inj is considered ‘excess’.

Parameters:
  • psd_inj (FrequencySeries) – PSD of the injection data.

  • psd_bkg (FrequencySeries) – PSD of the background data (usually mean or median).

  • raw_bkg (TimeSeries, optional) – Raw background time series data. Required for PercentileThreshold to calculate the distribution of PSDs across segments.

abstractmethod threshold(psd_inj: FrequencySeries, psd_bkg: FrequencySeries, raw_bkg: TimeSeries | None = None, **kwargs: object) ndarray[source]

Return the PSD threshold values used by this strategy.

class gwexpy.analysis.coupling.RatioThreshold(ratio: float = 2.0)[source]

Bases: ThresholdStrategy

Checks if P_inj > ratio * P_bkg_mean.

Statistical Assumptions:
  • No specific statistical distribution is assumed.

  • Tests if injection power exceeds the background level by a fixed factor.

Usage:
  • Best for simple, physical excess screening where precise statistical significance is less critical.

  • Extremely fast as it requires no variance estimation.

check(psd_inj: FrequencySeries, psd_bkg: FrequencySeries, raw_bkg: TimeSeries | None = None, **kwargs: object) ndarray[source]
threshold(psd_inj: FrequencySeries, psd_bkg: FrequencySeries, raw_bkg: TimeSeries | None = None, **kwargs: object) ndarray[source]
class gwexpy.analysis.coupling.SigmaThreshold(sigma: float = 3.0)[source]

Bases: ThresholdStrategy

Checks if P_inj > P_bkg + sigma * std_error.

Statistical Assumptions

  • Background Power Spectral Density (PSD) at each bin approximately follows a Gaussian distribution (valid when n_avg is sufficiently large).

  • The parameter n_avg represents the number of independent averages (e.g., in Welch’s method).

  • Assumes standard deviation of the noise reduces as 1 / sqrt(n_avg).

Meaning of Threshold

  • threshold = mean + sigma * (mean / sqrt(n_avg))

  • This is a statistical significance test, NOT a physical upper limit.

  • It identifies frequencies where the injection is statistically distinguishable from background variance.

Gaussian Approximation Validity

Welch PSD estimates follow a χ² distribution with 2K degrees of freedom (K = n_avg). The Gaussian approximation is valid when K ≥ 10 (approximately).

For K < 10, consider: - Using PercentileThreshold (empirical distribution, no Gaussian assumption) - Increasing FFT averaging by using longer data or shorter fftlength

References

  • Welch, P.D. (1967): PSD estimation via overlapped segment averaging

  • Bendat & Piersol, Random Data (4th ed., 2010), Ch. 11

Warning

This method relies heavily on the Gaussian and stationary assumptions. It may be unreliable if: - The background contains significant non-Gaussian features (glitches) - n_avg is small (< ~10), where the central limit theorem has not converged - There are strong spectral lines (non-stationary or deterministic signals)

In such cases, PercentileThreshold is recommended as it uses the empirical distribution.

check(psd_inj: FrequencySeries, psd_bkg: FrequencySeries, raw_bkg: TimeSeries | None = None, **kwargs: object) ndarray[source]
threshold(psd_inj: FrequencySeries, psd_bkg: FrequencySeries, raw_bkg: TimeSeries | None = None, **kwargs: object) ndarray[source]
class gwexpy.analysis.coupling.PercentileThreshold(percentile: float = 99.7, factor: float = 2.6)[source]

Bases: ThresholdStrategy

Threshold strategy based on empirical percentile of background distribution.

This strategy follows Appendix B of the PEM injection paper, using the 99.7th percentile of background segments and a correction factor to account for finite-averaging and χ² distribution scaling.

Parameters:
  • percentile (float, default=99.7) – The percentile of the background distribution (0-100). 99.7% equivalent to 3-sigma for Gaussian noise.

  • factor (float, default=2.6) – Correction factor (multiplier) for the percentile value. The value 2.6 is recommended in Appendix B.1 to set reduced χ² ≈ 1.

check(psd_inj: FrequencySeries, psd_bkg: FrequencySeries, raw_bkg: TimeSeries | None = None, **kwargs: object) ndarray[source]
threshold(psd_inj: FrequencySeries, psd_bkg: FrequencySeries, raw_bkg: TimeSeries | None = None, **kwargs: Any) ndarray[source]
class gwexpy.analysis.coupling.CouplingResult(cf: FrequencySeries, psd_witness_inj: FrequencySeries, psd_witness_bkg: FrequencySeries, psd_target_inj: FrequencySeries, psd_target_bkg: FrequencySeries, valid_mask: ndarray, witness_name: str, target_name: str, cf_ul: FrequencySeries | None = None, ts_witness_bkg: TimeSeries | None = None, ts_target_bkg: TimeSeries | None = None, fftlength: float | None = None, overlap: float | None = None)[source]

Bases: object

Result object for a SINGLE Witness -> Target pair.

property frequencies: IndexLike
plot_cf(figsize: tuple[float, float] | None = None, xlim: tuple[float, float] | None = None, **kwargs: object) Plot[source]

Plot the Coupling Function and its Upper Limit.

plot(figsize: tuple[float, float] = (10, 12), xlim: tuple[float, float] | None = None) Plot[source]

Create a diagnostic plot showing ASDs and the resulting CF.

class gwexpy.analysis.coupling.CouplingFunctionAnalysis[source]

Bases: object

Analysis class to estimate Coupling Functions (CF).

compute(data_inj: ~gwexpy.timeseries.collections.TimeSeriesDict, data_bkg: ~gwexpy.timeseries.collections.TimeSeriesDict, fftlength: float, witness: str | None = None, frange: tuple[float, float] | None = None, overlap: float = 0, threshold_witness: ~gwexpy.analysis.coupling.ThresholdStrategy = <gwexpy.analysis.coupling.RatioThreshold object>, threshold_target: ~gwexpy.analysis.coupling.ThresholdStrategy = <gwexpy.analysis.coupling.RatioThreshold object>, n_jobs: int | None = None, memory_limit: float = 2147483648.0, **kwargs: object) CouplingResult | dict[str, CouplingResult][source]

Compute Coupling Function(s) from TimeSeriesDicts.

Parameters:
  • data_inj (TimeSeriesDict) – Injection data (Witness + Targets).

  • data_bkg (TimeSeriesDict) – Background data (Witness + Targets).

  • fftlength (float) – FFT length in seconds.

  • witness (str, optional) – The name (key) of the witness channel. If None, the FIRST channel in data_inj is used.

  • frange (tuple of float, optional) – Frequency range (fmin, fmax) to evaluate CF and CF upper limit. Values outside the range are set to NaN.

  • overlap (float, optional) – Overlap in seconds (default 0).

  • threshold_witness (ThresholdStrategy) – Strategy to determine if Witness is excited.

  • threshold_target (ThresholdStrategy) – Strategy to determine if Target is excited.

  • n_jobs (int, optional) – Number of jobs for parallel processing. None means 1 unless in a joblib.parallel_config context. -1 means using all processors.

gwexpy.analysis.coupling.estimate_coupling(data_inj: TimeSeriesDict, data_bkg: TimeSeriesDict, fftlength: float, witness: str | None = None, frange: tuple[float, float] | None = None, threshold_witness: ThresholdStrategy | float = 25.0, threshold_target: ThresholdStrategy | float = 4.0, n_jobs: int | None = None, **kwargs: Any) CouplingResult | dict[str, CouplingResult][source]

Helper function to estimate CF.

Parameters:

frange (tuple of float, optional) – Frequency range (fmin, fmax) to evaluate CF and CF upper limit. Values outside the range are set to NaN.

Stat-info utilities: association edge extraction and lightweight graph builder.

gwexpy.analysis.stat_info.association_edges(target: Any, matrix: Any, *, method: str = 'pearson', parallel: int | None = None, threshold: float | None = None, threshold_mode: str = 'abs', topk: int | None = None, return_dataframe: bool = True) Any[source]

Compute association edges between a target TimeSeries and a TimeSeriesMatrix.

Returns a DataFrame (default) with columns: [“source”, “target”, “score”, “row”, “col”, “channel”].

gwexpy.analysis.stat_info.build_graph(edges: Any, *, backend: str = 'networkx', directed: bool = False, weight: str = 'score') Any[source]

Build a graph object from association edges.

If backend=”none”, returns edges unchanged.