Skip to contents

Performs a permutation test to assess whether the observed fraction of converts with overlap is significantly different from what would be expected by chance. Tests overlap at facility, floor, and room levels.

Usage

cluster_overlap_perm_test(
  clusters,
  dna_aln,
  seq2pt,
  adm_seqs,
  adm_pos_pt_seqs,
  dates,
  surv_df,
  facility_trace,
  floor_trace,
  room_trace,
  nperm = 1000,
  num_cores = detectCores() - 1
)

Arguments

clusters

A named numeric vector of cluster assignments.

dna_aln

A DNA alignment object (used for isolate IDs).

seq2pt

A named vector mapping sequence IDs to patient IDs.

adm_seqs

A vector of sequence IDs which correspond to admission positive sequences.

adm_pos_pt_seqs

A vector of sequence IDs for patients who are admission positive.

dates

A named vector mapping sequence IDs to dates.

surv_df

A data frame with surveillance data.

facility_trace

A matrix with patient IDs as row names and dates as column names.

floor_trace

A matrix with floor-level trace data (same structure as facility_trace).

room_trace

A matrix with room-level trace data (same structure as facility_trace).

nperm

Number of permutations to perform. Default 1000.

num_cores

Number of cores for parallel processing. Default is detectCores() - 1.

Value

A list containing:

  • observed: A named list with facility, floor, room, seq_facility, seq_floor, seq_room per-cluster fractions for observed data

  • observed_n_overlap: A named list (same trace types) of per-cluster converts-with-overlap counts (numerators) behind the observed fractions

  • observed_n_converts: A named list (same trace types) of per-cluster convert counts (denominators) behind the observed fractions

  • permuted: A numeric array of dimensions (n_clusters, 6 trace_types, nperm) of per-cluster fractions

  • permuted_n_overlap: A numeric array (same dimensions) of per-cluster converts-with-overlap counts (numerators)

  • permuted_n_converts: A numeric array (same dimensions) of per-cluster convert counts (denominators)

  • valid_clusters: A numeric vector of cluster IDs that have more than one patient

The numerator/denominator components let callers compute a pooled, convert-weighted fraction (sum(n_overlap) / sum(n_converts)) across clusters and sequence types, rather than averaging the per-cluster fractions.