Analysis Utilities

class gwexpy.analysis.Bruco(target_channel: str, aux_channels: list[str], excluded_channels: list[str] | None = None)[source]

Bases: object

Brute force Coherence (Bruco) scanner.

target

The name of the target channel (e.g., DARM).

Type:

str

aux_channels

List of auxiliary channels to scan.

Type:

List[str]

excluded

List of channels to exclude from analysis.

Type:

List[str]

compute(start: int | float | None = None, duration: int | None = None, fftlength: float = 2.0, overlap: float = 1.0, parallel: int = 4, batch_size: int = 100, top_n: int = 5, block_size: int | str | None = None, target_data: TimeSeries | None = None, aux_data: TimeSeriesDict | Iterable[TimeSeries] | None = None, preprocess_batch: Callable[[TimeSeriesDict], TimeSeriesDict] | None = None) BrucoResult[source]

Execute the coherence scan.

Parameters:
  • start (int or float, optional) – GPS start time. Required if not inferable from data.

  • duration (int, optional) – Duration of data in seconds. Required if not inferable.

  • fftlength (float) – FFT length in seconds.

  • overlap (float) – Overlap in seconds.

  • parallel (int) – Number of parallel jobs for reading data and computing coherence.

  • batch_size (int) – Channels per batch.

  • top_n (int) – Number of top channels to keep per frequency bin.

  • block_size (int or 'auto', optional) – Channels per block in Top-N updates.

  • target_data (TimeSeries, optional) – Pre-loaded target channel data.

  • aux_data (TimeSeriesDict or Iterable[TimeSeries], optional) – Pre-loaded auxiliary channels data. Can be a dictionary-like object or an iterable/generator yielding TimeSeries.

  • preprocess_batch (Callable, optional) – Batch preprocessing callback.

Returns:

Object containing frequency-wise analysis results.

Return type:

BrucoResult
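The fftlength and overlap parameters together set how many FFT segments are averaged for a given duration. A minimal sketch of that arithmetic, assuming standard Welch-style segmentation; the helper n_welch_segments is hypothetical and not part of gwexpy:

```python
import math


def n_welch_segments(duration: float, fftlength: float = 2.0, overlap: float = 1.0) -> int:
    """Count the full FFT segments that fit in `duration` seconds.

    Hypothetical helper for illustration; defaults mirror the compute() signature.
    """
    if not 0 <= overlap < fftlength:
        raise ValueError("overlap must satisfy 0 <= overlap < fftlength")
    if duration < fftlength:
        return 0
    stride = fftlength - overlap  # time step between consecutive segment starts
    return int(math.floor((duration - fftlength) / stride)) + 1


n_default = n_welch_segments(64.0)               # 2 s segments, 1 s overlap
n_no_overlap = n_welch_segments(64.0, 2.0, 0.0)  # 2 s segments, no overlap
```

Overlapping segments are correlated, so the number of statistically independent averages is typically smaller than this count.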

class gwexpy.analysis.BrucoResult(frequencies: ndarray, target_name: str, target_spectrum: ndarray, top_n: int = 5, metadata: Mapping[str, str | int | float | bool] | None = None, block_size: int | str | None = None)[source]

Bases: object

Hold and analyze Bruco results with Top-N coherence per frequency bin.

coherence_for_channel(channel: str, asd: bool = True) ndarray[source]

Get the coherence spectrum for a specific channel. Values are NaN where the channel is not in the Top-N.

Parameters:
  • channel – Channel name.

  • asd – If True, return Amplitude Coherence. If False, Squared Coherence.

Returns:

Coherence spectrum (same length as frequencies).
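The NaN convention can be illustrated with a small stand-alone sketch. It assumes the Top-N record is stored as two rank-by-bin tables (channel names and squared coherence); that layout and the function below are assumptions for illustration, not the library's internals:

```python
import math


def coherence_for_channel(top_names, top_coh2, channel, asd=True):
    """Extract one channel's coherence spectrum from rank-by-bin Top-N tables.

    top_names[rank][bin] is a channel name and top_coh2[rank][bin] the matching
    squared coherence (assumed layout). Bins where the channel is absent stay NaN.
    """
    n_bins = len(top_names[0])
    spectrum = [math.nan] * n_bins
    for names_row, coh_row in zip(top_names, top_coh2):
        for i, (name, c2) in enumerate(zip(names_row, coh_row)):
            if name == channel:
                # Amplitude coherence is the square root of squared coherence.
                spectrum[i] = math.sqrt(c2) if asd else c2
    return spectrum


top_names = [["A", "B", "A"], ["B", "A", "C"]]
top_coh2 = [[0.81, 0.64, 0.49], [0.25, 0.16, 0.09]]
spec_c = coherence_for_channel(top_names, top_coh2, "C")
spec_a_sq = coherence_for_channel(top_names, top_coh2, "A", asd=False)
```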

dominant_channel(rank: int = 0) str | None[source]

Return the most frequent channel name at a given rank.
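One plausible reading of "most frequent" is a simple count over frequency bins. The sketch below assumes the names at the requested rank are available as a flat list, with None marking empty bins; both the input shape and the function are hypothetical:

```python
from collections import Counter


def dominant_channel(names_at_rank):
    """Return the name occurring in the most bins, or None if there are none."""
    counts = Counter(name for name in names_at_rank if name is not None)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


winner = dominant_channel(["A", "B", "A", None])
```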

generate_report(output_dir: str, max_rows: int = 2000, coherence_threshold: float = 0.5, plot_ranks: int = 3, asd: bool = True) str[source]

Generate an HTML report with plots and data summary.

Parameters:

asd – If True (default), report and plots use ASD units.

Returns:

Path to the generated HTML file.

get_noise_projection(rank: int = 0, asd: bool = True, coherence_threshold: float = 0.0) tuple[ndarray, ndarray][source]

Calculate noise projection for the channel at a specific rank (0 = highest coherence).

Parameters:
  • asd – If True (default), return ASD projection. If False, return PSD projection.

  • coherence_threshold – Frequencies with coherence below this value contribute zero noise.

Returns:

(projection, coherence)
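A common way to form a coherence-based noise projection is to scale the target spectrum by the coherence. The sketch below assumes that convention (ASD projection = target ASD × sqrt(Coh^2), PSD projection = target PSD × Coh^2) and assumes the threshold is compared against squared coherence; neither detail is confirmed by this reference:

```python
import math


def noise_projection(target_asd, coh2, asd=True, coherence_threshold=0.0):
    """Project target noise using squared coherence, bin by bin.

    Bins that are NaN or below the threshold contribute zero, as described above.
    """
    projection = []
    for a, c2 in zip(target_asd, coh2):
        if math.isnan(c2) or c2 < coherence_threshold:
            projection.append(0.0)
        elif asd:
            projection.append(a * math.sqrt(c2))  # amplitude units
        else:
            projection.append(a * a * c2)         # power units
    return projection


proj_asd = noise_projection([2.0, 4.0], [0.25, 0.01], coherence_threshold=0.1)
proj_psd = noise_projection([2.0, 4.0], [0.25, 0.01], asd=False, coherence_threshold=0.1)
```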

get_ranked_channels(limit: int = 5, band: tuple[float, float] | None = None) list[str][source]

Get a list of channels ranked by their total coherence contribution.

Parameters:
  • limit – Maximum number of channels to return.

  • band – Optional (f_low, f_high) tuple (Hz). When given, only frequency bins inside the band contribute to the per-channel score. Bins outside the band that happen to appear in the Top-N arrays are ignored. NaN bins (outside Top-N) are always excluded via numpy.nanmax().

Returns:

List of channel names sorted by importance.
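The band-limited scoring can be sketched in plain Python. This version assumes the per-channel score is the maximum non-NaN coherence inside the band (mirroring the numpy.nanmax() remark) and that band edges are inclusive; both are assumptions, and rank_channels is a hypothetical stand-in:

```python
import math


def rank_channels(frequencies, coherence_by_channel, limit=5, band=None):
    """Rank channels by their peak coherence, optionally restricted to a band."""
    scores = {}
    for name, spectrum in coherence_by_channel.items():
        values = [
            c
            for f, c in zip(frequencies, spectrum)
            if not math.isnan(c) and (band is None or band[0] <= f <= band[1])
        ]
        if values:  # channels with no usable bins are left out entirely
            scores[name] = max(values)
    return sorted(scores, key=scores.get, reverse=True)[:limit]


freqs = [10.0, 20.0, 30.0, 40.0]
coh = {
    "A": [0.9, math.nan, 0.2, 0.1],
    "B": [math.nan, 0.5, 0.6, math.nan],
    "C": [0.1, 0.1, math.nan, 0.8],
}
full = rank_channels(freqs, coh)
banded = rank_channels(freqs, coh, band=(15.0, 35.0))
```

Restricting the band changes the order because channel A's strong 10 Hz bin no longer counts.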

plot_coherence(ranks: Sequence[int] | None = None, channels: Sequence[str] | None = None, max_channels: int = 3, asd: bool = True, coherence_threshold: float = 0.0, save_path: str | None = None) Figure[source]

Plot coherence spectrum for selected ranks or channels.

Default behavior (ranks=None, channels=None):

Plots the Top-K contributors (per-channel mode).

Parameters:
  • asd – If True (default), plot Amplitude Coherence (sqrt(Coh^2)). If False, plot Squared Coherence (Coh^2).

  • coherence_threshold – Draw a horizontal line at this value (default 0.0=off).

plot_projection(ranks: Sequence[int] | None = None, channels: Sequence[str] | None = None, max_channels: int = 3, asd: bool = True, coherence_threshold: float = 0.0, save_path: str | None = None) Figure[source]

Plot target spectrum and noise projections for selected ranks or channels.

Default behavior (ranks=None, channels=None):

Plots the Top-K contributors (per-channel mode).

plot_ranked(top_k: int = 3, band: tuple[float, float] | None = None, asd: bool = True, coherence_threshold: float = 0.0, save_path: str | None = None) Figure[source]

Plot coherence spectra for the top-ranked channels.

Selects channels via topk() (optionally band-limited) and delegates to plot_coherence().

Parameters:
  • top_k – Number of top channels to plot.

  • band – Optional (f_low, f_high) frequency band (Hz) for band-limited ranking.

  • asd – If True (default), plot Amplitude Coherence sqrt(Coh^2).

  • coherence_threshold – Draw a horizontal reference line at this value (default 0.0 = off).

  • save_path – Optional file path to save the figure.

Returns:

matplotlib.figure.Figure

projection_for_channel(channel: str, asd: bool = True, coherence_threshold: float = 0.0) ndarray[source]

Calculate projection spectrum for a specific channel where it appears in Top-N.

to_dataframe(ranks: Sequence[int] | None = None, stride: int = 1, asd: bool = True, coherence_threshold: float = 0.0) DataFrame[source]

Convert results to a long-form DataFrame.

topk(n: int = 5, band: tuple[float, float] | None = None) list[str][source]

Return the top-n channels ranked by coherence.

This is a convenience alias for get_ranked_channels().

Parameters:
  • n – Number of channels to return.

  • band – Optional (f_low, f_high) frequency band (Hz) for band-limited scoring.

Returns:

List of up to n channel names, most coherent first.

update_batch(channel_names: Sequence[str], coherences: ndarray) None[source]

Update the Top-N records with a new batch of results.

Parameters:
  • channel_names – List of channel names in this batch.

  • coherences – Coherence matrix of shape (n_channels, n_bins). Must align to self.frequencies.
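The Top-N update amounts to merging each new batch into a per-bin leaderboard and trimming it back to N entries. A minimal sketch under that assumption, using plain lists instead of arrays; update_top_n and its data layout are hypothetical:

```python
def update_top_n(top, channel_names, coherences, top_n=5):
    """Merge a batch into `top`, a per-bin list of (coherence, name) pairs.

    `coherences` has one row per channel and one column per frequency bin.
    """
    for name, row in zip(channel_names, coherences):
        if len(row) != len(top):
            raise ValueError("coherence row must align with the frequency bins")
        for i, value in enumerate(row):
            top[i].append((value, name))
    for entries in top:
        # Keep only the N highest-coherence entries in each bin.
        entries.sort(key=lambda entry: entry[0], reverse=True)
        del entries[top_n:]
    return top


top = [[] for _ in range(3)]
update_top_n(top, ["A", "B"], [[0.9, 0.1, 0.5], [0.2, 0.8, 0.4]], top_n=2)
update_top_n(top, ["C"], [[0.5, 0.5, 0.6]], top_n=2)
names_per_bin = [[name for _, name in entries] for entries in top]
```

Because only N entries per bin survive each update, memory stays bounded regardless of how many batches are scanned.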

gwexpy.analysis.estimate_coupling(data_inj: TimeSeriesDict, data_bkg: TimeSeriesDict, fftlength: float, witness: str | None = None, frange: tuple[float, float] | None = None, threshold_witness: ThresholdStrategy | float = 25.0, threshold_target: ThresholdStrategy | float = 4.0, n_jobs: int | None = None, **kwargs: Any) CouplingResult | dict[str, CouplingResult][source]

Helper function to estimate coupling functions (CF) from injection and background data.

Parameters:

frange (tuple of float, optional) – Frequency range (fmin, fmax) to evaluate CF and CF upper limit. Values outside the range are set to NaN.

class gwexpy.analysis.CouplingFunctionAnalysis[source]

Bases: object

Analysis class to estimate Coupling Functions (CF).

compute(data_inj: TimeSeriesDict, data_bkg: TimeSeriesDict, fftlength: float, witness: str | None = None, frange: tuple[float, float] | None = None, overlap: float = 0, threshold_witness: ThresholdStrategy = <gwexpy.analysis.coupling.RatioThreshold object>, threshold_target: ThresholdStrategy = <gwexpy.analysis.coupling.RatioThreshold object>, n_jobs: int | None = None, memory_limit: float = 2147483648.0, **kwargs: object) CouplingResult | dict[str, CouplingResult][source]

Compute Coupling Function(s) from TimeSeriesDicts.

Parameters:
  • data_inj (TimeSeriesDict) – Injection data (Witness + Targets).

  • data_bkg (TimeSeriesDict) – Background data (Witness + Targets).

  • fftlength (float) – FFT length in seconds.

  • witness (str, optional) – The name (key) of the witness channel. If None, the FIRST channel in data_inj is used.

  • frange (tuple of float, optional) – Frequency range (fmin, fmax) to evaluate CF and CF upper limit. Values outside the range are set to NaN.

  • overlap (float, optional) – Overlap in seconds (default 0).

  • threshold_witness (ThresholdStrategy) – Strategy to determine if Witness is excited.

  • threshold_target (ThresholdStrategy) – Strategy to determine if Target is excited.

  • n_jobs (int, optional) – Number of jobs for parallel processing. None means 1 unless in a joblib.parallel_config context. -1 means using all processors.
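For orientation, the sketch below shows one conventional coupling-function estimate: the square root of the excess target power divided by the excess witness power, evaluated only where both channels pass a ratio test and the frequency lies inside frange. The formula, the ratio defaults (borrowed from the estimate_coupling signature) and the function itself are assumptions for illustration; upper-limit handling is omitted:

```python
import math


def coupling_function(freqs, wit_inj, wit_bkg, tgt_inj, tgt_bkg,
                      frange=None, ratio_wit=25.0, ratio_tgt=4.0):
    """Estimate CF per bin from PSD values; NaN where it is not measurable."""
    cf = []
    for f, wi, wb, ti, tb in zip(freqs, wit_inj, wit_bkg, tgt_inj, tgt_bkg):
        in_range = frange is None or frange[0] <= f <= frange[1]
        excited = wi > ratio_wit * wb and ti > ratio_tgt * tb
        if in_range and excited:
            # Amplitude coupling: sqrt(excess target power / excess witness power).
            cf.append(math.sqrt((ti - tb) / (wi - wb)))
        else:
            cf.append(math.nan)
    return cf


cf = coupling_function(
    freqs=[10.0, 20.0, 30.0],
    wit_inj=[101.0, 101.0, 1.5],
    wit_bkg=[1.0, 1.0, 1.0],
    tgt_inj=[5.0, 5.0, 5.0],
    tgt_bkg=[1.0, 1.0, 1.0],
    frange=(15.0, 40.0),
)
```

Here the 10 Hz bin is NaN because it lies outside frange, and the 30 Hz bin is NaN because the witness is not excited.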

class gwexpy.analysis.RatioThreshold(ratio: float = 2.0)[source]

Bases: ThresholdStrategy

Checks if P_inj > ratio * P_bkg_mean.

Statistical Assumptions:
  • No specific statistical distribution is assumed.

  • Tests if injection power exceeds the background level by a fixed factor.

Usage:
  • Best for simple, physical excess screening where precise statistical significance is less critical.

  • Extremely fast as it requires no variance estimation.
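The ratio test stated above reduces to a per-bin comparison. A minimal stand-alone sketch using plain lists in place of FrequencySeries; ratio_check is a hypothetical name:

```python
def ratio_check(psd_inj, psd_bkg, ratio=2.0):
    """Return True for bins where P_inj > ratio * P_bkg_mean."""
    return [p_inj > ratio * p_bkg for p_inj, p_bkg in zip(psd_inj, psd_bkg)]


mask = ratio_check([1.0, 5.0], [1.0, 2.0])
```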

check(psd_inj: FrequencySeries, psd_bkg: FrequencySeries, raw_bkg: TimeSeries | None = None, **kwargs: object) ndarray[source]

threshold(psd_inj: FrequencySeries, psd_bkg: FrequencySeries, raw_bkg: TimeSeries | None = None, **kwargs: object) ndarray[source]

class gwexpy.analysis.SigmaThreshold(sigma: float = 3.0)[source]

Bases: ThresholdStrategy

Checks if P_inj > P_bkg + sigma * std_error.

Statistical Assumptions

  • Background Power Spectral Density (PSD) at each bin approximately follows a Gaussian distribution (valid when n_avg is sufficiently large).

  • The parameter n_avg represents the number of independent averages (e.g., in Welch’s method).

  • Assumes standard deviation of the noise reduces as 1 / sqrt(n_avg).

Meaning of Threshold

  • threshold = mean + sigma * (mean / sqrt(n_avg))

  • This is a statistical significance test, NOT a physical upper limit.

  • It identifies frequencies where the injection is statistically distinguishable from background variance.
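The threshold formula above can be evaluated directly. A minimal sketch with plain lists in place of FrequencySeries; the function names are hypothetical, and n_avg is passed explicitly here rather than derived from the data:

```python
import math


def sigma_threshold(psd_bkg, n_avg, sigma=3.0):
    """threshold = mean + sigma * (mean / sqrt(n_avg)), per frequency bin."""
    return [mean + sigma * (mean / math.sqrt(n_avg)) for mean in psd_bkg]


def sigma_check(psd_inj, psd_bkg, n_avg, sigma=3.0):
    """Return True for bins where the injection PSD exceeds the threshold."""
    thresholds = sigma_threshold(psd_bkg, n_avg, sigma)
    return [p > t for p, t in zip(psd_inj, thresholds)]


thresholds = sigma_threshold([4.0, 1.0], n_avg=16)
mask = sigma_check([7.5, 1.5], [4.0, 1.0], n_avg=16)
```

With 16 averages the standard error is a quarter of the mean, so a 3-sigma threshold sits 75% above the background level.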

Gaussian Approximation Validity

Welch PSD estimates follow a χ² distribution with 2K degrees of freedom (K = n_avg). The Gaussian approximation is valid when K ≥ 10 (approximately).

For K < 10, consider:
  • Using PercentileThreshold (empirical distribution, no Gaussian assumption).

  • Increasing FFT averaging by using longer data or a shorter fftlength.
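The 1/sqrt(n_avg) scaling in the threshold follows from the χ² statistics stated above. As a short worked derivation, assuming K independent segments of stationary Gaussian noise with true PSD P:

```latex
\hat{P} \sim P\,\frac{\chi^2_{2K}}{2K},
\qquad
\mathrm{E}[\hat{P}] = P,
\qquad
\mathrm{Var}[\hat{P}] = P^2\,\frac{2 \cdot 2K}{(2K)^2} = \frac{P^2}{K},
\qquad
\sigma_{\hat{P}} = \frac{P}{\sqrt{K}}
```

The standard error is therefore mean / sqrt(n_avg), matching the term multiplied by sigma in the threshold. Overlapping segments are correlated, which lowers the effective K.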

References

  • Welch, P.D. (1967): PSD estimation via overlapped segment averaging

  • Bendat & Piersol, Random Data (4th ed., 2010), Ch. 11

Warning

This method relies heavily on the assumptions of Gaussianity and stationarity. It may be unreliable if:
  • The background contains significant non-Gaussian features (glitches).

  • n_avg is small (< ~10), where the central limit theorem has not converged.

  • There are strong spectral lines (non-stationary or deterministic signals).

In such cases, PercentileThreshold is recommended as it uses the empirical distribution.

check(psd_inj: FrequencySeries, psd_bkg: FrequencySeries, raw_bkg: TimeSeries | None = None, **kwargs: object) ndarray[source]

threshold(psd_inj: FrequencySeries, psd_bkg: FrequencySeries, raw_bkg: TimeSeries | None = None, **kwargs: object) ndarray[source]

class gwexpy.analysis.PercentileThreshold(percentile: float = 99.7, factor: float = 2.6)[source]

Bases: ThresholdStrategy

Threshold strategy based on empirical percentile of background distribution.

This strategy follows Appendix B of the PEM injection paper, using the 99.7th percentile of background segments and a correction factor to account for finite-averaging and χ² distribution scaling.

Parameters:
  • percentile (float, default=99.7) – The percentile of the background distribution (0-100). 99.7% is equivalent to 3-sigma for Gaussian noise.

  • factor (float, default=2.6) – Correction factor (multiplier) for the percentile value. The value 2.6 is recommended in Appendix B.1 to set reduced χ² ≈ 1.
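The two parameters combine as a per-bin percentile over background segments, scaled by the correction factor. The sketch below assumes the background is available as one PSD per segment and uses linear interpolation between order statistics; the data layout and both helper functions are hypothetical:

```python
def percentile(values, q):
    """Percentile with linear interpolation between sorted samples (q in 0-100)."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def percentile_threshold(bkg_segments, q=99.7, factor=2.6):
    """bkg_segments[k][i] is the PSD of background segment k at frequency bin i."""
    n_bins = len(bkg_segments[0])
    return [
        factor * percentile([segment[i] for segment in bkg_segments], q)
        for i in range(n_bins)
    ]


median = percentile([1.0, 2.0, 3.0, 4.0, 5.0], 50)
thresholds = percentile_threshold([[1.0], [2.0], [3.0]], q=100, factor=2.0)
```

Because the percentile is taken from the data themselves, glitches and other non-Gaussian features raise the threshold automatically.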

check(psd_inj: FrequencySeries, psd_bkg: FrequencySeries, raw_bkg: TimeSeries | None = None, **kwargs: object) ndarray[source]

threshold(psd_inj: FrequencySeries, psd_bkg: FrequencySeries, raw_bkg: TimeSeries | None = None, **kwargs: Any) ndarray[source]

gwexpy.analysis.association_edges(target: Any, matrix: Any, *, method: str = 'pearson', parallel: int | None = None, threshold: float | None = None, threshold_mode: str = 'abs', topk: int | None = None, return_dataframe: bool = True) Any[source]

Compute association edges between a target TimeSeries and a TimeSeriesMatrix.

Returns a DataFrame (default) with columns: ["source", "target", "score", "row", "col", "channel"].
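As a rough illustration of the default settings (method='pearson', threshold_mode='abs'), the sketch below scores each channel against the target with the Pearson coefficient, filters on the absolute score, and keeps the strongest edges. It uses plain lists and dicts; the edge direction, the omitted row/col/channel columns and the function bodies are assumptions:

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def association_edges(target_name, target, channels, threshold=None, topk=None):
    """Build edge records, strongest absolute score first."""
    edges = []
    for name, series in channels.items():
        score = pearson(target, series)
        if threshold is None or abs(score) >= threshold:
            edges.append({"source": name, "target": target_name, "score": score})
    edges.sort(key=lambda edge: abs(edge["score"]), reverse=True)
    return edges if topk is None else edges[:topk]


edges = association_edges(
    "DARM",
    [1.0, 2.0, 3.0, 4.0],
    {"up": [2.0, 4.0, 6.0, 8.0], "down": [4.0, 3.0, 2.0, 1.0], "weak": [1.0, 2.0, 1.0, 2.0]},
    threshold=0.9,
)
```

Filtering on the absolute value keeps strongly anti-correlated channels such as "down" alongside positively correlated ones.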

gwexpy.analysis.build_graph(edges: Any, *, backend: str = 'networkx', directed: bool = False, weight: str = 'score') Any[source]

Build a graph object from association edges.

If backend="none", returns edges unchanged.