face_rhythm package
face_rhythm.alignment module
Image alignment pipeline and video frame ingestion.
Image_preparation_pipeline builds a clean reference image for registration
from a sequence of frames by downsampling, masking with a VQT spectrogram to
keep only low-spectral-variance (non-behavior) frames, and applying CLAHE.
Also provides SFTP / local video frame extractors used to seed alignment.
- class face_rhythm.alignment.Image_preparation_pipeline(ds_factor: int = 20, ptile_specVar_keep: float = 10, ptile_intensity_keep: float = 90, params_vqt: Dict = {'F_max': 60, 'F_min': 0.5, 'Fs_sample': 250, 'Q_highF': 20, 'Q_lowF': 3.5, 'downsample_factor': 10, 'fft_conv': True, 'n_freq_bins': 50, 'plot_pref': False, 'window_type': 'hann'}, clip_limit: float = 2.0, grid_size: int = 20, verbose: bool = True)[source]
Bases: object
Builds a clean reference image for registration by downsampling frames, selecting frames with low spectral variance via a VQT spectrogram, and applying CLAHE contrast enhancement. RH 2023
- Parameters:
ds_factor (int) – Spatial downsampling factor applied before spectral analysis. (Default is 20)
ptile_specVar_keep (float) – Percentile cutoff for the per-frame mean spectral magnitude; frames at or below this percentile are kept as low-variance (non-behavior) frames. (Default is 10)
ptile_intensity_keep (float) – Percentile cutoff used when normalizing pixel intensities (kept for backwards compatibility; current implementation no longer applies this clip). (Default is 90)
params_vqt (Dict) – Keyword arguments forwarded to vqt.VQT for the spectral analysis. Recognized keys include Fs_sample (sample rate), Q_lowF, Q_highF (quality factors at the low and high frequency bounds), F_min, F_max (frequency range), n_freq_bins (number of frequency bins), window_type, downsample_factor, fft_conv (use FFT-based convolution), and plot_pref. (Default is the dictionary shown in the signature)
clip_limit (float) – clipLimit argument forwarded to OpenCV CLAHE. (Default is 2.0)
grid_size (int) – Tile grid size forwarded to OpenCV CLAHE. (Default is 20)
verbose (bool) – If True, prints progress messages and shows intermediate plots. (Default is True)
- downsample(images: numpy.ndarray, ds_factor: int | None = None) numpy.ndarray[source]
Spatially downsamples a stack of images by ds_factor using bilinear interpolation, collapsing any color channel by mean.
- Parameters:
images (np.ndarray) – Input image stack. shape: (n_frames, H, W) or (n_frames, H, W, C).
ds_factor (Optional[int]) – Integer downsampling factor. If None, self.ds_factor is used. (Default is None)
- Returns:
- images_ds (np.ndarray):
Downsampled image stack. shape: (n_frames, H // ds_factor, W // ds_factor), dtype: float32.
- Return type:
(np.ndarray)
- find_low_spectral_variance_idx(images_ds: numpy.ndarray, images: numpy.ndarray, ptile_specVar_keep: float | None = 10, ptile_intensity_keep: float | None = 90, params_vqt: Dict | None = {'F_max': 60, 'F_min': 0.5, 'Fs_sample': 250, 'Q_highF': 20, 'Q_lowF': 3.5, 'downsample_factor': 10, 'fft_conv': True, 'n_freq_bins': 50, 'plot_pref': False, 'window_type': 'hann'})[source]
Selects frames whose mean VQT spectral magnitude lies in the lowest ptile_specVar_keep percentile and returns a normalized mean image over those frames for use as a registration reference.
- Parameters:
images_ds (np.ndarray) – Downsampled image stack used to compute spectrograms. shape: (n_frames, H_ds, W_ds).
images (np.ndarray) – Full-resolution image stack used to build the reference image. shape: (n_frames, H, W) or (n_frames, H, W, C).
ptile_specVar_keep (Optional[float]) – Percentile cutoff for the mean spectral magnitude; frames at or below this percentile are kept. If None, self.ptile_specVar_keep is used. (Default is 10)
ptile_intensity_keep (Optional[float]) – Percentile cutoff for intensity normalization (currently unused in the active code path; retained for backwards compatibility). If None, self.ptile_intensity_keep is used. (Default is 90)
params_vqt (Optional[Dict]) – Keyword arguments forwarded to vqt.VQT. If None, self.params_vqt is used. (Default is the dictionary shown in the signature)
- Returns:
- im (np.ndarray):
Square-rooted, max-normalized mean of the kept full-resolution frames. shape: (H, W), dtype: float32 (or matching the dtype of
images.mean(0)).
- Return type:
(np.ndarray)
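The selection and normalization steps described above can be sketched with NumPy alone. This is a minimal illustration, not the package's implementation: `build_reference_image` and `frame_scores` are hypothetical names, `frame_scores` stands in for the per-frame mean VQT magnitude (the spectrogram itself is not computed here), and the order of the square root and the max-normalization is an assumption.

```python
import numpy as np

def build_reference_image(images, frame_scores, ptile_keep=10.0):
    """Average the frames scoring at or below a percentile, then normalize.

    frame_scores is a hypothetical stand-in for the per-frame mean
    spectral magnitude; it is supplied by the caller in this sketch.
    """
    threshold = np.percentile(frame_scores, ptile_keep)
    idx_keep = np.flatnonzero(frame_scores <= threshold)
    im = images[idx_keep].astype(np.float32).mean(axis=0)
    if im.ndim == 3:
        # Collapse a trailing color channel by averaging.
        im = im.mean(axis=-1)
    im = np.sqrt(im)  # assumed order: square root first, then max-normalize
    return im / im.max(), idx_keep

rng = np.random.default_rng(0)
images = rng.integers(1, 255, size=(100, 8, 8)).astype(np.uint8)
frame_scores = rng.random(100)
im_ref, idx_keep = build_reference_image(images, frame_scores, ptile_keep=10.0)
```

With 100 distinct scores and a 10th-percentile cutoff, ten frames are kept and the resulting image peaks at 1.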
- apply_clahe(image: numpy.ndarray, clip_limit: float | None = 2.0, grid_size: int | None = 20) numpy.ndarray[source]
Applies CLAHE contrast enhancement to a single image via rois.Image_Aligner.augment_images.
- Parameters:
- Returns:
- im_aug (np.ndarray):
CLAHE-enhanced image. shape: (H, W).
- Return type:
(np.ndarray)
- apply_pipeline(images: numpy.ndarray)[source]
Runs the full reference-image pipeline: downsample, select low-spectral-variance frames, then apply CLAHE.
- Parameters:
images (np.ndarray) – Input image stack. shape: (n_frames, H, W) or (n_frames, H, W, C).
- Returns:
- im_aug (np.ndarray):
CLAHE-enhanced reference image. shape: (H, W).
- Return type:
(np.ndarray)
- class face_rhythm.alignment.SFTPVideoFrameExtractor(host: str, username: str, password: str, port: int = 22, verbose: bool = True)[source]
Bases: object
Extracts frames from remote video files over SFTP and returns them as a NumPy array. The password is held base64-encoded with a random salt and only decoded transiently when constructing the SFTP URL; frame extraction is streamed through ffmpeg so the full file is never downloaded. RH 2023
- Parameters:
host (str) – Hostname or IP address of the remote server.
username (str) – Username for authenticating to the remote server.
password (str) – Password for authenticating to the remote server. Stored internally in base64-encoded form with a random salt.
port (int) – TCP port for the SFTP connection. (Default is 22)
verbose (bool) – If True, prints progress messages during probing and frame retrieval. (Default is True)
- extract_frames(remote_video_path: str, time_start: float, duration: int, fps: float | None = None) numpy.ndarray[source]
Streams a window of frames from a remote video over SFTP using ffmpeg and returns them as a stacked NumPy array. The number of frames returned is int(fps * duration).
- Parameters:
remote_video_path (str) – Path to the video file on the remote server, e.g. "/path/to/video.mp4".
time_start (float) – Start time, in seconds, of the extraction window.
duration (int) – Length of the extraction window, in seconds. The total number of frames returned is int(fps * duration).
fps (Optional[float]) – Frame rate of the video. If None, it is probed from the video metadata via ffmpeg.probe. (Default is None)
- Returns:
- frames (np.ndarray):
Decoded frames stacked along axis 0. shape: (n_frames, H, W, 3), dtype: uint8.
- Return type:
(np.ndarray)
- Raises:
RuntimeError – If ffmpeg fails while extracting frames from the SFTP stream.
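As a rough illustration of the frame count and output shape documented above, the sketch below reshapes a raw byte buffer into a (n_frames, H, W, 3) uint8 array. It assumes the decoder emits packed 8-bit RGB; that pixel format, and the helper name `frames_from_raw_rgb`, are assumptions made for this example rather than details confirmed by the documentation.

```python
import numpy as np

def frames_from_raw_rgb(buffer: bytes, height: int, width: int) -> np.ndarray:
    """Reshape a packed 8-bit RGB byte stream into (n_frames, H, W, 3)."""
    bytes_per_frame = height * width * 3
    # Drop any trailing partial frame so the reshape is always valid.
    n_frames = len(buffer) // bytes_per_frame
    arr = np.frombuffer(buffer[: n_frames * bytes_per_frame], dtype=np.uint8)
    return arr.reshape(n_frames, height, width, 3)

fps, duration = 29.97, 2
n_expected = int(fps * duration)  # truncates 59.94 to 59
height, width = 4, 6
raw = bytes(n_expected * height * width * 3)  # zero-filled placeholder stream
frames = frames_from_raw_rgb(raw, height, width)
```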
- face_rhythm.alignment.get_frames(path, time_start, time_end, verbose=False)[source]
Reads a contiguous range of frames from a local video using OpenCV by seeking with cv2.CAP_PROP_POS_FRAMES. Stops early if the requested range extends past EOF rather than raising.
- Parameters:
path (str) – Path to the local video file.
time_start (float) – Start time, in seconds, of the read window.
time_end (float) – End time, in seconds, of the read window. The number of frames requested is int((time_end - time_start) * fps).
verbose (bool) – If True, displays a tqdm progress bar over the seek loop. (Default is False)
- Returns:
- ims (np.ndarray):
Decoded frames stacked along axis 0. shape: (n_frames, H, W, 3), dtype matches the dtype returned by
cv2.VideoCapture.read (typically uint8).
- Return type:
(np.ndarray)
- Raises:
ValueError – If no frames could be read in the requested interval.
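The index arithmetic behind the read window can be sketched without OpenCV. The helper `frame_range` is hypothetical; in particular, computing the first frame index as int(time_start * fps) is an assumption, since the documentation only states how the number of requested frames is derived.

```python
def frame_range(time_start, time_end, fps, n_frames_total):
    """Return the frame indices to read, clamped at the end of the file."""
    idx_start = int(time_start * fps)  # assumed seek position
    n_requested = int((time_end - time_start) * fps)
    # Stop early at EOF instead of failing on an over-long request.
    idx_stop = min(idx_start + n_requested, n_frames_total)
    if idx_stop <= idx_start:
        raise ValueError("No frames could be read in the requested interval.")
    return range(idx_start, idx_stop)

full_window = frame_range(1.0, 3.0, 30.0, n_frames_total=1000)
clamped_window = frame_range(1.0, 3.0, 30.0, n_frames_total=70)
```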
face_rhythm.alignment_multisession module
Multi-session (cross-session) image alignment.
Ported from ROICaT (https://github.com/RichieHakim/ROICaT,
roicat/tracking/alignment.py and roicat/helpers.py).
Both projects (C) Rich Hakim — released under the face-rhythm LICENSE
alongside the rest of the package. The ROICaT source is GPL-3.0-only;
because Rich is the sole author of both packages, there is no license
conflict; this module re-licenses the ported portions under face-rhythm’s
terms for face-rhythm users.
This module provides Aligner, which registers a list of FOV images
to a template using one of several geometric-registration backends. The
public API and call-shape intentionally match ROICaT’s
tracking.alignment.Aligner so notebooks and scripts that used the
ROICaT entry-point can swap roicat.tracking.alignment for
face_rhythm.alignment_multisession unchanged.
- Backends ported:
'RoMa' (optional; requires pip install face-rhythm[multisession])
'ECC_cv2' (OpenCV-only, always available)
'PhaseCorrelation' (torch-FFT only, always available)
'NullRegistration' (identity, always available)
Backends deliberately NOT ported (pull heavy deps that face-rhythm users don’t need): LoFTR, DISK_LightGlue, DeepFlow, OpticalFlowFarneback, SIFT, ORB. If you need them, install and use ROICaT directly.
Low-level helpers (warp_matrix_to_remappingIdx, remap_images,
compose_transform_matrices, cv2RemappingIdx_to_pytorchFlowField,
find_geometric_transformation, make_batches, hash_file) are imported from
face_rhythm.helpers, which already carries their ROICaT-ported
equivalents. Only the helpers unique to alignment (ImageAlignmentChecker,
phase_correlation, 2-D Butterworth bandpass filter construction, Dijkstra
path reconstruction) are re-implemented here.
- face_rhythm.alignment_multisession.make_distance_grid(shape: Tuple[int, int] = (512, 512), p: int = 2, idx_center: Tuple[int, int] | None = None, use_fftshift_center: bool = False) numpy.ndarray[source]
Creates an (H, W) array of Minkowski-p distances to a reference index. Ported from roicat.helpers.make_distance_grid.
- Parameters:
shape (Tuple[int, int]) – Grid shape (H, W). (Default is (512, 512))
p (int) – Minkowski order. Use 1 for Manhattan, 2 for Euclidean, and inf for Chebyshev. Values above 2 move progressively closer to the max-norm. (Default is 2)
idx_center (Optional[Tuple[int, int]]) – Center index for the distances. If None, uses the geometric middle of the array (between two pixels on even shapes). (Default is None)
use_fftshift_center (bool) – If True, uses the index where np.fft.fftshift(np.fft.fftfreq(N)) is zero as the center (the correct reference for fftshifted 2-D FFTs). (Default is False)
- Returns:
- grid_dist (np.ndarray):
Minkowski-p distances to the center. shape: shape.
- Return type:
(np.ndarray)
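A minimal NumPy sketch of such a distance grid follows. It is not the ported function: the name `distance_grid` is hypothetical, and letting use_fftshift_center take precedence over idx_center is an assumption. For a length-N axis, the zero of np.fft.fftshift(np.fft.fftfreq(N)) sits at index N // 2, which is what the sketch uses.

```python
import numpy as np

def distance_grid(shape=(512, 512), p=2, idx_center=None, use_fftshift_center=False):
    """Minkowski-p distance of every pixel to a center index."""
    if use_fftshift_center:
        # Zero-frequency index of an fftshifted axis of length n is n // 2.
        center = tuple(n // 2 for n in shape)
    elif idx_center is None:
        # Geometric middle; falls between two pixels on even-sized axes.
        center = tuple((n - 1) / 2 for n in shape)
    else:
        center = idx_center
    yy, xx = np.meshgrid(
        np.arange(shape[0]) - center[0],
        np.arange(shape[1]) - center[1],
        indexing="ij",
    )
    if np.isinf(p):
        return np.maximum(np.abs(yy), np.abs(xx))  # Chebyshev distance
    return (np.abs(yy) ** p + np.abs(xx) ** p) ** (1.0 / p)

grid = distance_grid((4, 6), p=2, use_fftshift_center=True)
grid_manhattan = distance_grid((3, 3), p=1)
grid_chebyshev = distance_grid((3, 3), p=float("inf"))
```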
- face_rhythm.alignment_multisession.design_butter_bandpass(lowcut: float, highcut: float, fs: float, order: int = 5) Tuple[numpy.ndarray, numpy.ndarray][source]
Designs a Butterworth bandpass filter, with low/highpass edge cases. Ported from roicat.helpers.design_butter_bandpass.
- Parameters:
- Returns:
- tuple containing:
- b (np.ndarray):
Numerator polynomial of the IIR filter.
- a (np.ndarray):
Denominator polynomial of the IIR filter.
- Return type:
(Tuple[np.ndarray, np.ndarray])
- face_rhythm.alignment_multisession.make_2D_frequency_filter(hw: Tuple[int, int], low: float = 5.0, high: float = 6.0, order: int = 3, distance_p: int = 100) numpy.ndarray[source]
Builds a 2-D fftshifted bandpass mask for phase-correlation scoring. Ported from roicat.helpers.make_2D_frequency_filter. The filter is the 1-D Butterworth magnitude response from design_butter_bandpass() evaluated on a Minkowski-distance_p distance grid produced by make_distance_grid().
- Parameters:
- Returns:
- filt (np.ndarray):
2-D bandpass mask with values in
[0, 1]. shape: hw.
- Return type:
(np.ndarray)
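To give a feel for what a radial bandpass mask looks like, the sketch below evaluates the analytic Butterworth magnitude formulas on a Euclidean distance grid. This swaps in a different technique from the one described above: it does not design an IIR filter or use a Minkowski-100 grid, it treats `low` and `high` as radii in grid units (an assumption), and `butterworth_band_mask` is a hypothetical name.

```python
import numpy as np

def butterworth_band_mask(hw, low=5.0, high=6.0, order=3):
    """Analytic radial band mask centered on the fftshift zero-frequency index."""
    h, w = hw
    yy, xx = np.meshgrid(np.arange(h) - h // 2, np.arange(w) - w // 2, indexing="ij")
    r = np.hypot(yy, xx)
    # Low-pass response: near 1 inside `high`, rolling off outside it.
    lowpass = 1.0 / np.sqrt(1.0 + (r / high) ** (2 * order))
    # High-pass response: near 0 inside `low`; r == 0 gives exactly 0.
    with np.errstate(divide="ignore"):
        highpass = 1.0 / np.sqrt(1.0 + (low / r) ** (2 * order))
    return (lowpass * highpass).astype(np.float32)

mask = butterworth_band_mask((64, 64), low=5.0, high=10.0, order=3)
```

The mask is zero at the center, close to 1 for radii between the two cutoffs, and decays toward zero at large radii.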
- face_rhythm.alignment_multisession.phase_correlation(im_template: numpy.ndarray | torch.Tensor, im_moving: numpy.ndarray | torch.Tensor, mask_fft: numpy.ndarray | torch.Tensor | None = None, return_filtered_images: bool = False, eps: float = 1e-08) numpy.ndarray | torch.Tensor | Tuple[source]
Computes the phase-correlation of two images along the last two axes. Ported from roicat.helpers.phase_correlation.
- Parameters:
im_template (Union[np.ndarray, torch.Tensor]) – Template image(s). shape: (…, H, W). Leading dims broadcast.
im_moving (Union[np.ndarray, torch.Tensor]) – Moving image(s). shape: (…, H, W). Broadcasts against the template.
mask_fft (Optional[Union[np.ndarray, torch.Tensor]]) – Optional 2-D bandpass mask. Assumed to already be fftshifted; this function un-shifts it so that it lines up with the raw FFT output. (Default is None)
return_filtered_images (bool) – If True, additionally returns the mask-filtered template and moving images in the image domain. (Default is False)
eps (float) – Floor used to avoid division by zero in the phase-correlation normalization. (Default is 1e-8)
- Returns:
- cc (Union[np.ndarray, torch.Tensor]):
Phase-correlation response with a shape that matches the broadcast of the inputs. Returned as np.ndarray when im_template is numpy, otherwise as torch.Tensor. When return_filtered_images is True, a 3-tuple (cc, filtered_template, filtered_moving) is returned instead, with the filtered images in the image domain.
- Return type:
(Union[np.ndarray, torch.Tensor, Tuple])
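The core computation can be sketched in NumPy as a normalized cross-power spectrum followed by an inverse FFT. This is an illustrative sketch only: `phase_correlation_sketch` is a hypothetical name, the choice of which spectrum is conjugated (and therefore the sign of the recovered shift) is an assumption, and the ported function may place or orient the peak differently.

```python
import numpy as np

def phase_correlation_sketch(im_template, im_moving, mask_fft=None, eps=1e-8):
    """Phase correlation over the last two axes of two images."""
    f_template = np.fft.fft2(im_template)
    f_moving = np.fft.fft2(im_moving)
    if mask_fft is not None:
        # The 2-D mask is assumed fftshifted; undo that to match raw FFT layout.
        mask = np.fft.ifftshift(mask_fft)
        f_template = f_template * mask
        f_moving = f_moving * mask
    cross_power = np.conj(f_template) * f_moving
    # Keep phase only; eps guards against division by zero.
    cross_power = cross_power / (np.abs(cross_power) + eps)
    return np.fft.ifft2(cross_power).real

rng = np.random.default_rng(0)
im = rng.random((32, 48))
shifted = np.roll(im, shift=(3, 5), axis=(0, 1))
cc = phase_correlation_sketch(im, shifted)
peak = np.unravel_index(np.argmax(cc), cc.shape)
```

With this conjugation convention, a circular shift of (3, 5) pixels applied to the moving image places the correlation peak at index (3, 5).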
- face_rhythm.alignment_multisession.get_path_between_nodes(idx_start: int, idx_end: int, predecessors: numpy.ndarray, max_length: int = 9999) List[int][source]
Reconstructs a shortest path from a predecessor matrix. Ported from roicat.helpers.get_path_between_nodes. The predecessor matrix is the one returned by scipy.sparse.csgraph.shortest_path(), so predecessors[idx_end, idx_current] gives the previous node on the shortest path from idx_current to idx_end.
- Parameters:
idx_start (int) – Index of the first node on the path.
idx_end (int) – Index of the destination node.
predecessors (np.ndarray) – Square predecessor matrix returned by scipy.sparse.csgraph.shortest_path().
max_length (int) – Safety cap on path length to avoid infinite loops. (Default is 9999)
- Returns:
- path (List[int]):
Node indices along the shortest path, in the form
[idx_start, ..., idx_end].
- Return type:
(List[int])
- Raises:
AssertionError – Input validation failed (shapes, integer types, or the no-path placeholder -9999).
ValueError – Reconstructed path length exceeds max_length.
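The predecessor walk can be sketched in plain Python. This is a sketch under stated assumptions: `path_between_nodes` is a hypothetical name, the graph is assumed undirected (so row idx_end of the matrix can be followed from idx_start toward idx_end), and the predecessor matrix is written by hand for a four-node chain 0-1-2-3 instead of being computed by SciPy.

```python
NO_PATH = -9999  # placeholder used for "no predecessor"

def path_between_nodes(idx_start, idx_end, predecessors, max_length=9999):
    """Follow row idx_end of the predecessor matrix from idx_start to idx_end."""
    path = [idx_start]
    idx_current = idx_start
    while idx_current != idx_end:
        idx_current = predecessors[idx_end][idx_current]
        assert idx_current != NO_PATH, "No path between the requested nodes."
        path.append(idx_current)
        if len(path) > max_length:
            raise ValueError("Reconstructed path length exceeds max_length.")
    return path

# Hand-written predecessor matrix for the chain 0-1-2-3:
# entry [i][j] is the node visited just before j on the path from i to j.
n = 4
predecessors = [
    [NO_PATH if i == j else (j - 1 if i < j else j + 1) for j in range(n)]
    for i in range(n)
]
path_forward = path_between_nodes(0, 3, predecessors)
path_backward = path_between_nodes(3, 1, predecessors)
```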
- class face_rhythm.alignment_multisession.ImageAlignmentChecker(hw: Tuple[int, int], radius_in: float | Tuple[float, float], radius_out: float | Tuple[float, float], order: int = 5, device: str = 'cpu')[source]
Bases: object
Scores whether a set of images is spatially aligned via phase correlation. Ported from roicat.helpers.ImageAlignmentChecker.
The class constructs two band-selectable 2-D filters in the phase-correlation domain: an “in” filter over the center (within radius_in) and an “out” filter away from the center. Statistics of the phase-correlation peak under each filter are compared to produce an alignment z-score.
- Parameters:
hw (Tuple[int, int]) – Image height and width. All scored images must match this shape.
radius_in (Union[float, Tuple[float, float]]) – Either the upper bound of the “in” bandpass (lower bound is 0) or an explicit (low, high) tuple.
radius_out (Union[float, Tuple[float, float]]) – Either the lower bound of the “out” bandpass (upper bound is min(H, W) / 2) or an explicit (low, high) tuple.
order (int) – Butterworth order shared by both filters. Values above 5 may cause the filters to collapse numerically. (Default is 5)
device (str) – Torch device string (e.g. 'cpu' or 'cuda:0') on which the precomputed filters live. (Default is 'cpu')
- filt_in
Precomputed in-band 2-D bandpass filter. shape: hw, dtype: float32.
- Type:
- filt_out
Precomputed out-band 2-D bandpass filter. shape: hw, dtype: float32.
- Type:
- score_alignment(images: numpy.ndarray | torch.Tensor | List | Tuple, images_ref: numpy.ndarray | torch.Tensor | List | Tuple | None = None) Dict[str, Any][source]
Computes per-pair alignment statistics for a stack of images.
- Parameters:
images (Union[np.ndarray, torch.Tensor, List, Tuple]) – Stack of images. shape: (N, H, W), or (H, W) for a single image (which is broadcast).
images_ref (Optional[Union[np.ndarray, torch.Tensor, List, Tuple]]) – Reference images. If None, images is compared against itself (N x N scoring). (Default is None)
- Returns:
- stats (Dict[str, Any]):
Per-pair statistics keyed by name. Contains 'pc' (the phase-correlation array), 'mean_in', 'mean_out', 'ptile95_out', 'max_in', 'std_in', 'std_out', 'max_diff', 'z_in' (the primary alignment score), and 'r_in'.
- Return type:
(Dict[str, Any])
- class face_rhythm.alignment_multisession.ImageRegistrationMethod(device: str = 'cpu', verbose: bool | int = False)[source]
Bases: object
Base class for image-to-image registration backends. Subclasses either implement _forward_rigid() (to emit keypoint pairs for the RANSAC pipeline in fit_rigid()) or override fit_rigid() directly.
- Parameters:
- fit_rigid(im_template: numpy.ndarray | torch.Tensor, im_moving: numpy.ndarray | torch.Tensor, inl_thresh: float = 2.0, max_iter: int = 10, confidence: float = 0.99, constraint: str = 'homography', **kwargs) numpy.ndarray[source]
Estimates a constrained 3x3 warp between two images via RANSAC. Subclasses that emit keypoint pairs use this default implementation; the estimator branches on constraint.
- Parameters:
im_template (Union[np.ndarray, torch.Tensor]) – Template image. shape: (H, W).
im_moving (Union[np.ndarray, torch.Tensor]) – Moving image. shape: (H, W).
inl_thresh (float) – RANSAC inlier threshold in pixels. (Default is 2.0)
max_iter (int) – Maximum RANSAC iterations. (Default is 10)
confidence (float) – RANSAC confidence level. (Default is 0.99)
constraint (str) – Warp family to fit. One of
'rigid': Procrustes (rotation + translation).
'euclidean': skimage.measure.ransac() with skimage.transform.EuclideanTransform.
'similarity': cv2.estimateAffinePartial2D().
'affine': cv2.estimateAffine2D().
'homography': cv2.findHomography() with MAGSAC.
(Default is 'homography')
**kwargs – Additional keyword arguments forwarded to _forward_rigid() for keypoint detection.
- Returns:
- warp_matrix (np.ndarray):
3x3 warp matrix. Affine rows are padded with
[0, 0, 1]where appropriate. dtype: float32.
- Return type:
(np.ndarray)
- Raises:
RuntimeError – A fitting branch failed (e.g. RANSAC returned None).
ValueError – constraint is not one of the supported values.
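For the 'rigid' constraint, a Procrustes fit from matched keypoints can be sketched with the standard Kabsch solution. This sketch makes several assumptions: `fit_rigid_procrustes` is a hypothetical name, points are (x, y) rows, the returned matrix maps moving points onto template points, and no RANSAC outlier rejection is performed, so it illustrates only the least-squares core.

```python
import numpy as np

def fit_rigid_procrustes(pts_moving, pts_template):
    """Least-squares rotation + translation mapping pts_moving onto pts_template."""
    c_m = pts_moving.mean(axis=0)
    c_t = pts_template.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (pts_moving - c_m).T @ (pts_template - c_t)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = c_t - R @ c_m
    warp = np.eye(3, dtype=np.float32)
    warp[:2, :2] = R
    warp[:2, 2] = t
    return warp

theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta), np.cos(theta)]])
t_true = np.array([2.0, -1.0])
rng = np.random.default_rng(0)
pts_moving = rng.random((20, 2)) * 100
pts_template = pts_moving @ R_true.T + t_true
warp = fit_rigid_procrustes(pts_moving, pts_template)
```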
- class face_rhythm.alignment_multisession.RoMa(model_type: str = 'outdoor', n_points: int = 10000, batch_size: int = 1000, device: str = 'cpu', weight_urls: Dict | None = None, fallback_weight_urls: Dict | None = None, verbose: bool = False)[source]
Bases: ImageRegistrationMethod
Feature-matching registration backend that uses the RoMa model.
Requires the optional dependency romatch-roicat, installed via pip install face-rhythm[multisession]. The package imports as romatch regardless of which PyPI distribution was installed.
On first use the constructor downloads ~1.5 GB of weights via torch.hub.load_state_dict_from_url() into Path(torch.hub.get_dir()) / "checkpoints". Set the TORCH_HOME environment variable before import to redirect the cache.
- Parameters:
model_type (str) – RoMa model variant. Either
'outdoor': Outdoor-trained RoMa weights.
'indoor': Indoor-trained RoMa weights.
(Default is 'outdoor')
n_points (int) – Number of matched points to sample per image pair. (Default is 10000)
batch_size (int) – Sub-batch size used by the matching sampler. (Default is 1000)
device (str) – Torch device string for the RoMa model. (Default is 'cpu')
weight_urls (Optional[Dict]) – Primary download URLs and MD5 hashes for the RoMa and DINOv2 weights. If None, uses DEFAULT_WEIGHT_URLS. (Default is None)
fallback_weight_urls (Optional[Dict]) – OSF mirror URLs and matching hashes used if the primary downloads fail. If None, uses DEFAULT_FALLBACK_WEIGHT_URLS. (Default is None)
verbose (bool) – Verbosity flag. (Default is False)
- weight_urls
Primary URLs and hashes for the model weights.
- Type:
Dict
- fallback_weight_urls
Fallback (mirror) URLs and hashes for the model weights.
- Type:
Dict
- DEFAULT_WEIGHT_URLS = {'dinov2': {'filename': 'dinov2_vitl14_pretrain.pth', 'hash': '19a02c10947ed50096ce382b46b15662', 'url': 'https://dl.fbaipublicfiles.com/dinov2/dinov2_vitl14/dinov2_vitl14_pretrain.pth'}, 'romatch': {'indoor': {'filename': 'roma_indoor.pth', 'hash': '349a17aaa21883bb164b1a5884febb21', 'url': 'https://github.com/Parskatt/storage/releases/download/roma/roma_indoor.pth'}, 'outdoor': {'filename': 'roma_outdoor.pth', 'hash': '9a451dfb65745e777bf916db6ea84933', 'url': 'https://github.com/Parskatt/storage/releases/download/roma/roma_outdoor.pth'}}}
- DEFAULT_FALLBACK_WEIGHT_URLS = {'dinov2': {'filename': 'dinov2_vitl14_pretrain.pth', 'hash': '19a02c10947ed50096ce382b46b15662', 'url': 'https://osf.io/tmj5c/download'}, 'romatch': {'indoor': {'filename': 'roma_indoor.pth', 'hash': '349a17aaa21883bb164b1a5884febb21', 'url': 'https://osf.io/uzx64/download'}, 'outdoor': {'filename': 'roma_outdoor.pth', 'hash': '9a451dfb65745e777bf916db6ea84933', 'url': 'https://osf.io/cmzpa/download'}}}
- class face_rhythm.alignment_multisession.ECC_cv2(mode_transform: str = 'euclidean', n_iter: int = 200, termination_eps: float = 1e-09, gaussFiltSize: float | int = 1, auto_fix_gaussFilt_step: int | None = 10, device: str = 'cpu', verbose: bool | int = False)[source]
Bases: ImageRegistrationMethod
OpenCV Enhanced Correlation Coefficient (ECC) registration backend. Wraps face_rhythm.helpers.find_geometric_transformation(), which in turn wraps cv2.findTransformECC(). On failure, the call is retried with a larger Gaussian filter size.
- Parameters:
mode_transform (str) – Warp family for ECC. One of
'translation': Translation-only warp.
'euclidean': Rotation + translation.
'affine': Affine warp.
'homography': 3x3 homography.
(Default is 'euclidean')
n_iter (int) – Maximum ECC iterations. (Default is 200)
termination_eps (float) – ECC convergence tolerance. (Default is 1e-09)
gaussFiltSize (Union[float, int]) – Gaussian-filter kernel size used as a smoothing pre-pass before the ECC iteration. Cast to int via np.round. (Default is 1)
auto_fix_gaussFilt_step (Optional[int]) – If set, on ECC failure the kernel size is incremented by this value and ECC is retried recursively. None disables the retry. (Default is 10)
device (str) – Ignored; ECC always runs on CPU. (Default is 'cpu')
verbose (Union[bool, int]) – Verbosity flag or integer level. (Default is False)
- auto_fix_gaussFilt_step
Increment applied to gaussFiltSize after each ECC failure.
- Type:
Optional[int]
- fit_rigid(im_template: numpy.ndarray | torch.Tensor, im_moving: numpy.ndarray | torch.Tensor, **kwargs) numpy.ndarray[source]
Estimates a 3x3 warp matrix via ECC, retrying with a larger Gaussian filter on failure.
- Parameters:
im_template (Union[np.ndarray, torch.Tensor]) – Template image. shape: (H, W).
im_moving (Union[np.ndarray, torch.Tensor]) – Moving image. shape: (H, W).
**kwargs – Unused; accepted for interface compatibility with
ImageRegistrationMethod.fit_rigid().
- Returns:
- warp_matrix (np.ndarray):
Homogeneous warp matrix. shape: (3, 3). Affine warps are padded with
[0, 0, 1].
- Return type:
(np.ndarray)
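The retry behavior can be illustrated with a small stand-alone sketch. Everything here is hypothetical scaffolding: `fit_with_retry` and `fake_ecc` are invented names, RuntimeError stands in for whatever exception the real ECC call raises, the loop replaces the recursion described above, and the `max_size` cap is an addition made so the sketch always terminates.

```python
def fit_with_retry(fit_fn, gauss_filt_size=1, step=10, max_size=101):
    """Call fit_fn(size); on failure grow the filter size by `step` and retry."""
    size = int(round(gauss_filt_size))
    while True:
        try:
            return fit_fn(size), size
        except RuntimeError:
            # step=None disables retries; max_size is a cap added for this sketch.
            if step is None or size + step > max_size:
                raise
            size += step

def fake_ecc(size):
    """Hypothetical stand-in that only converges with enough smoothing."""
    if size < 21:
        raise RuntimeError("ECC did not converge")
    return "warp"

result, size_used = fit_with_retry(fake_ecc, gauss_filt_size=1, step=10)
```

Starting from a size of 1 with a step of 10, the stand-in fails at sizes 1 and 11 and succeeds at 21.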
- class face_rhythm.alignment_multisession.PhaseCorrelationRegistration(device: str = 'cpu', bandpass_freqs: List[float] | None = None, order: int = 5, verbose: bool = False)[source]
Bases: ImageRegistrationMethod
Translation-only registration via phase_correlation() peak detection. Supports an optional bandpass on the phase-correlation mask for robustness against low- and high-frequency noise.
- Parameters:
device (str) – Torch device used for the FFT. (Default is 'cpu')
bandpass_freqs (Optional[List[float]]) – [low, high] cutoffs for the bandpass filter. None skips the bandpass. (Default is None)
order (int) – Butterworth order for the bandpass filter. (Default is 5)
verbose (bool) – Verbosity flag. (Default is False)
- fit_rigid(im_template: numpy.ndarray | torch.Tensor, im_moving: numpy.ndarray | torch.Tensor, **kwargs) numpy.ndarray[source]
Estimates a translation-only 3x3 warp via phase-correlation peak detection.
- Parameters:
im_template (Union[np.ndarray, torch.Tensor]) – Template image. shape: (…, H, W).
im_moving (Union[np.ndarray, torch.Tensor]) – Moving image. shape: (…, H, W).
**kwargs – Unused; accepted for interface compatibility with
ImageRegistrationMethod.fit_rigid().
- Returns:
- warp_matrix (np.ndarray):
Translation-only homogeneous warp matrix. shape: (3, 3), dtype: float32.
- Return type:
(np.ndarray)
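Turning a phase-correlation peak into a translation-only warp can be sketched as follows. The helper `peak_to_translation_warp` is hypothetical, it handles a single 2-D response only, and both the wrap-around rule and the sign placed in the matrix are assumptions; the backend may negate or reorder the shift.

```python
import numpy as np

def peak_to_translation_warp(cc):
    """Convert the argmax of a 2-D phase-correlation map into a 3x3 warp."""
    h, w = cc.shape
    iy, ix = np.unravel_index(np.argmax(cc), cc.shape)
    # Peaks past the midpoint wrap around to negative shifts.
    dy = iy - h if iy > h // 2 else iy
    dx = ix - w if ix > w // 2 else ix
    warp = np.eye(3, dtype=np.float32)
    warp[0, 2] = dx  # assumed convention: x shift in the first row
    warp[1, 2] = dy
    return warp

cc = np.zeros((32, 48))
cc[30, 5] = 1.0  # synthetic peak: 2 pixels up (wrapped), 5 pixels right
warp = peak_to_translation_warp(cc)
```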
- class face_rhythm.alignment_multisession.NullRegistration(device: str | None = None, verbose: bool = False)[source]
Bases: ImageRegistrationMethod
Identity registration backend that returns an identity warp for every pair. Useful for debugging the Aligner.fit_geometric() pipeline, evaluating pre-registered images, and as a zero-cost method baseline.
- Parameters:
- fit_rigid(im_template: numpy.ndarray | torch.Tensor, im_moving: numpy.ndarray | torch.Tensor, **kwargs) numpy.ndarray[source]
Returns an identity 2x3 affine warp regardless of the input images.
- Parameters:
im_template (Union[np.ndarray, torch.Tensor]) – Template image. Ignored.
im_moving (Union[np.ndarray, torch.Tensor]) – Moving image. Ignored.
**kwargs – Unused; accepted for interface compatibility with
ImageRegistrationMethod.fit_rigid().
- Returns:
- warp_matrix (np.ndarray):
Identity affine warp. shape: (2, 3), dtype: float32.
Aligner.fit_geometric() pads this to (3, 3).
- Return type:
(np.ndarray)
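Padding a 2x3 affine warp to a 3x3 homogeneous matrix amounts to appending the row [0, 0, 1]. A minimal sketch, with `to_homogeneous` as a hypothetical helper name:

```python
import numpy as np

def to_homogeneous(warp):
    """Return a (3, 3) float32 warp, appending [0, 0, 1] to a (2, 3) input."""
    warp = np.asarray(warp, dtype=np.float32)
    if warp.shape == (3, 3):
        return warp
    if warp.shape != (2, 3):
        raise ValueError(f"Expected shape (2, 3) or (3, 3), got {warp.shape}")
    return np.vstack([warp, np.array([[0, 0, 1]], dtype=np.float32)])

identity_2x3 = np.eye(2, 3, dtype=np.float32)
warp_3x3 = to_homogeneous(identity_2x3)
```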
- class face_rhythm.alignment_multisession.Aligner(use_match_search: bool = True, all_to_all: bool = False, radius_in: float = 4, radius_out: float = 20, order: int = 5, z_threshold: float = 4.0, um_per_pixel: float = 1.0, device: str = 'cpu', verbose: bool | int = True)[source]
Bases: _AlignerModuleStub
Registers a list of FOV images to a template using a chosen backend. The public API mirrors ROICaT’s tracking.alignment.Aligner so that existing notebooks can swap the import path without further changes.
- Workflow:
aligner = Aligner(...).
aligner.fit_geometric(template=..., ims_moving=[...], method='RoMa' | 'ECC_cv2' | 'PhaseCorrelation' | 'NullRegistration', ...).
Use aligner.remappingIdx_geo (a list of (H, W, 2) float32 arrays) to warp points or images, or call aligner.transform_images(ims_moving, remappingIdx=aligner.remappingIdx_geo).
Inspect alignment quality with aligner.plot_alignment_results_geometric().
- Parameters:
use_match_search (bool) – If True, runs the Dijkstra match-search step whenever any image scores <= z_threshold against the template, to find a pairwise path through other images. (Default is True)
all_to_all (bool) – If True, always run the all-to-all match search even when direct registrations all pass z_threshold. Much slower (O(N^2)). (Default is False)
radius_in (float) – Inner radius for the ImageAlignmentChecker, scaled by um_per_pixel. (Default is 4)
radius_out (float) – Outer radius for the ImageAlignmentChecker, scaled by um_per_pixel. (Default is 20)
order (int) – Butterworth order for the in- and out-band filters used by ImageAlignmentChecker. (Default is 5)
z_threshold (float) – z-score cutoff below which a pair is considered mis-aligned. The multi-session notebook sets 50 to always trigger the match-search. (Default is 4.0)
um_per_pixel (float) – Pixel scale, which must match across all images. (Default is 1.0)
device (str) – Torch device string for the backends (e.g. 'cuda:0'). ECC_cv2 and PhaseCorrelationRegistration ignore this and run on CPU; RoMa on CPU is prohibitively slow. (Default is 'cpu')
verbose (Union[bool, int]) – Verbosity flag or integer level. (Default is True)
- radius_in
Inner radius parameter for ImageAlignmentChecker.
- Type:
- radius_out
Outer radius parameter for ImageAlignmentChecker.
- Type:
- order
Butterworth order parameter for ImageAlignmentChecker.
- Type:
- remappingIdx_geo
Per-image remapping arrays produced by fit_geometric(), each with shape (H, W, 2) and dtype float32. None until fit_geometric() runs.
- Type:
Optional[List[np.ndarray]]
- warp_matrices
Composed warp matrices set by fit_geometric(). None until fit_geometric() runs.
Example
    aligner = Aligner(z_threshold=50, device='cuda:0')
    aligner.fit_geometric(
        template=0,
        ims_moving=images,
        method='RoMa',
    )
    warped = aligner.transform_images(
        ims_moving=images,
        remappingIdx=aligner.remappingIdx_geo,
    )
- fit_geometric(template: int | float | numpy.ndarray, ims_moving: List[numpy.ndarray], template_method: str = 'sequential', mask_borders: Tuple[int, int, int, int] = (0, 0, 0, 0), method: str = 'RoMa', kwargs_method: Dict[str, Dict[str, Any]] | None = None, constraint: str = 'affine', kwargs_RANSAC: Dict[str, Any] | None = None, verbose: bool | None = None) List[numpy.ndarray][source]
Fits geometric warps from ims_moving to template and scores their alignment.
Calls the backend identified by method once per pair, composes warps across sequential templates (if any), then scores alignment via ImageAlignmentChecker. If any pair fails the z_threshold gate and use_match_search is True, a Dijkstra search through all intermediate images is run to reconstruct better paths.
- Parameters:
template (Union[int, float, np.ndarray]) – Template image or index. Fractional indices in [0, 1] are mapped to int(N * f).
ims_moving (List[np.ndarray]) – Same-shape images to register. shape: (H, W) each.
template_method (str) – Template-resolution mode. Either
'image': template is a concrete image (or pinned index resolved to one).
'sequential': Each image is registered to its neighbor along a chain that ends at the template index.
(Default is 'sequential')
mask_borders (Tuple[int, int, int, int]) – Pre-crop borders (top, bottom, left, right) removed from every image before registration. (Default is (0, 0, 0, 0))
method (str) – Backend key into _METHODS_LUT. One of 'RoMa', 'ECC_cv2', 'PhaseCorrelation', or 'NullRegistration'. (Default is 'RoMa')
kwargs_method (Optional[Dict[str, Dict[str, Any]]]) – Per-backend kwargs keyed by backend name, so the same dict can be passed for any method choice. If None, uses _DEFAULT_KWARGS_METHOD. (Default is None)
constraint (str) – Warp family passed through to ImageRegistrationMethod.fit_rigid(). (Default is 'affine')
kwargs_RANSAC (Optional[Dict[str, Any]]) – RANSAC kwargs for fit_rigid. If None, uses {'inl_thresh': 2.0, 'max_iter': 10, 'confidence': 0.99}. (Default is None)
verbose (Optional[bool]) – Overrides self._verbose when not None. (Default is None)
- Returns:
- remappingIdx_geo (List[np.ndarray]):
One remapping array per input image. shape: (H, W, 2) each, dtype: float32. Also stored on
self.remappingIdx_geo.
- Return type:
(List[np.ndarray])
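Composing warps along a sequential chain can be sketched as a running matrix product toward the template index. This sketch rests on assumptions that the documentation does not confirm: `compose_to_template` is a hypothetical helper, `warps_to_next[i]` is taken to map image i onto image i + 1 as a 3x3 matrix acting on homogeneous column vectors, and images past the template are mapped back with matrix inverses.

```python
import numpy as np

def compose_to_template(warps_to_next, idx_template):
    """Chain neighbor-to-neighbor warps so every image maps onto the template."""
    n_images = len(warps_to_next) + 1
    composed = []
    for i in range(n_images):
        total = np.eye(3)
        if i < idx_template:
            # Walk forward: i -> i+1 -> ... -> template.
            for k in range(i, idx_template):
                total = warps_to_next[k] @ total
        else:
            # Walk backward: i -> i-1 -> ... -> template, inverting each step.
            for k in range(i - 1, idx_template - 1, -1):
                total = np.linalg.inv(warps_to_next[k]) @ total
        composed.append(total)
    return composed

def translation(tx, ty):
    m = np.eye(3)
    m[0, 2], m[1, 2] = tx, ty
    return m

# Four images, each shifted one pixel in x relative to its neighbor.
warps_to_next = [translation(1.0, 0.0) for _ in range(3)]
composed = compose_to_template(warps_to_next, idx_template=2)
```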
- transform_images(ims_moving: List[numpy.ndarray] | numpy.ndarray, remappingIdx: List[numpy.ndarray] | numpy.ndarray) List[numpy.ndarray] | numpy.ndarray[source]
Applies per-image remapping indices via face_rhythm.helpers.remap_images().
- Parameters:
ims_moving (Union[List[np.ndarray], np.ndarray]) – Images to warp. May be a list of (H, W) or (H, W, C) arrays, or a single np.ndarray (returned as a bare array).
remappingIdx (Union[List[np.ndarray], np.ndarray]) – Matching remap arrays. shape: (H, W, 2) each. The cv2 backend is used with a per-image border_value = im_moving.mean() so that the cropped border matches the image statistics.
- Returns:
- ims_registered (Union[List[np.ndarray], np.ndarray]):
Registered images. Returned as a single
np.ndarraywhenims_movingwas a bare ndarray, otherwise as a list.
- Return type:
(Union[List[np.ndarray], np.ndarray])
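To show what a remapping array does, the sketch below applies one with nearest-neighbor sampling in NumPy and fills out-of-bounds samples with the image mean. It is a simplified substitute for the cv2-based helper: `remap_nearest` is a hypothetical name, only 2-D images are handled, interpolation is nearest-neighbor, and the layout (channel 0 holds x source coordinates, channel 1 holds y) is assumed to follow the cv2.remap convention.

```python
import numpy as np

def remap_nearest(image, remapping_idx):
    """Sample a 2-D image at the (x, y) source coordinates in remapping_idx."""
    h, w = image.shape
    x = np.rint(remapping_idx[..., 0]).astype(int)
    y = np.rint(remapping_idx[..., 1]).astype(int)
    valid = (x >= 0) & (x < w) & (y >= 0) & (y < h)
    # Out-of-bounds samples take the image mean, mirroring the border_value choice.
    out = np.full(remapping_idx.shape[:2], image.mean(), dtype=np.float32)
    out[valid] = image[y[valid], x[valid]]
    return out

image = np.arange(12, dtype=np.float32).reshape(3, 4)
xx, yy = np.meshgrid(np.arange(4), np.arange(3))
# Each output pixel samples the input one pixel to its right.
remap_shift = np.stack([xx + 1, yy], axis=-1).astype(np.float32)
warped = remap_nearest(image, remap_shift)
```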
- plot_alignment_results_geometric(plot_direct: bool = True) Tuple[matplotlib.pyplot.Figure, matplotlib.pyplot.Figure | None][source]
Renders two-panel score + alignment heatmaps per registration stage.
- Parameters:
plot_direct (bool) – If True and a direct all-to-all matrix was produced (i.e. the match-search ran), also render the “direct” stage. Otherwise only the “final” stage is drawn. (Default is True)
- Returns:
- tuple containing:
- fig_final (matplotlib.figure.Figure):
Figure for the post-registration results.
- fig_direct (Optional[matplotlib.figure.Figure]):
Figure for the direct (pre-match-search) results, or
Noneif the match-search did not run.
- Return type:
(Tuple[matplotlib.figure.Figure, Optional[matplotlib.figure.Figure]])
face_rhythm.data_importing module
- class face_rhythm.data_importing.Dataset_videos(bufferedVideoReader: BufferedVideoReader = None, paths_videos: str | List[str] = None, contiguous: bool = False, frame_rate_clamp: float = None, verbose: bool | int = 1)[source]
Bases: FR_Module
Container for one or more videos used as input to the face-rhythm pipeline. RH 2022
Imports videos via decord (or wraps an existing BufferedVideoReader) and exposes lazy per-video readers along with aggregated metadata (frame counts, frame rate, frame shape, channel count). Acts as a sequence of video readers.
- Parameters:
bufferedVideoReader (object) – Pre-built
BufferedVideoReaderwhose readers and metadata are reused. Mutually exclusive withpaths_videos; exactly one must be provided. (Default isNone)paths_videos (Union[str, List[str]]) – Path or list of paths to the video files to load. Used when
bufferedVideoReaderisNone. (Default isNone)contiguous (bool) – If
True, videos are treated as a single contiguous stream (the first frame of each subsequent video continues the frame index of the previous one). (Default isFalse)frame_rate_clamp (float) – If
None, the frame rate stored in self.frame_rate is the median of the per-video metadata frame rates. If a float, that value is used verbatim. (Default is None)
verbose (Union[bool, int]) – Verbosity level.
0: Silent. 1: Warnings only. 2: Warnings and informational progress messages.
(Default is 1)
- videos
Per-video lazy reader objects (
VideoReaderWrapperinstances or readers borrowed frombufferedVideoReader).- Type:
List[object]
- metadata
Per-video metadata with keys
'paths_videos','num_frames','frame_rate','frame_height_width', and'num_channels'.- Type:
- example_image
The first frame of the first video, materialized as a CPU
numpyarray.- Type:
np.ndarray
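The frame_rate_clamp rule described above can be sketched in a few lines. This is an illustration only: resolve_frame_rate is a hypothetical helper, not part of face_rhythm, and it assumes the per-video frame rates are already available as a list of floats.

```python
from statistics import median
from typing import List, Optional


def resolve_frame_rate(frame_rates: List[float], frame_rate_clamp: Optional[float] = None) -> float:
    """Hypothetical helper mirroring the documented frame_rate_clamp rule."""
    if frame_rate_clamp is not None:
        # A user-supplied value is used verbatim.
        return float(frame_rate_clamp)
    # Otherwise fall back to the median of the per-video metadata frame rates.
    return float(median(frame_rates))


rates = [29.97, 30.0, 30.0]
fr_default = resolve_frame_rate(rates)
fr_clamped = resolve_frame_rate(rates, frame_rate_clamp=120.0)
```

Using the median rather than the mean keeps a single video with corrupt metadata from shifting the reported rate.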
face_rhythm.decomposition module
Tensor Component Analysis (TCA) wrapper around tensorly.
Provides the TCA class for decomposing multi-way arrays (time x points x
frequency x …) produced upstream by the spectral pipeline. Handles dict-of-
arrays ingestion, axis concatenation, complex-to-real unfolding, normalization,
and the actual decomposition via tensorly’s CP / NN-HALS / Randomized CP
solvers on either numpy or pytorch backends.
- class face_rhythm.decomposition.TCA(verbose: bool | int = 1)[source]
Bases:
FR_ModulePerforms Tensor Component Analysis (TCA) on multi-way arrays produced by the spectral pipeline using
tensorlysolvers. RH 2022- Parameters:
verbose (Union[bool, int]) – Verbosity level. One of
0: No messages. 1: Warnings only. 2: Info messages.
(Default is 1)
- config
Configuration dictionary populated by
__init__and updated by subsequent method calls.- Type:
- run_data
Run-time data (factors and dimension names) populated after fitting and rearranging.
- Type:
- rearrange_data(data: dict, names_dims_array: list = ['xy', 'points', 'frequency', 'time'], names_dims_concat_array: list = [['xy', 'points']], concat_complexDim: bool = True, name_dim_concat_complexDim: str = 'time', name_dim_dictElements: str = 'trials', method_handling_dictElements: str = 'concatenate', name_dim_concat_dictElements: str = 'time', idx_windows: list = None, name_dim_array_window: str = 'time')[source]
Rearranges the input data dictionary into a single tensor (or set of tensors) suitable for TCA. Supports concatenating array dimensions, unfolding the complex dimension, combining or stacking dictionary elements, and windowing each array along a chosen dimension.
- Parameters:
data (dict) – Dictionary mapping element name to a
numpy.ndarrayof consistent rank. Arrays may be complex valued.names_dims_array (list) – Names of the dimensions of the data arrays, in axis order. (Default is
['xy', 'points', 'frequency', 'time'])names_dims_concat_array (list) – List of 2-element lists
[dim_a, dim_b]describing pairs of array dimensions to concatenate.dim_ais folded intodim_b, producing a single dimension named'(dim_a dim_b)'with lengthlen(dim_a) * len(dim_b). Pairs are applied in the order given. (Default is[['xy', 'points']])concat_complexDim (bool) – If
True, real and imaginary parts are stacked and folded intoname_dim_concat_complexDim. Requires complex valued input. (Default isTrue)name_dim_concat_complexDim (str) – Name of the array dimension into which the complex dimension is folded. Typically
'time'. (Default is'time')name_dim_dictElements (str) – Semantic name for the dictionary elements (e.g.
'trials'or'videos'). (Default is'trials')method_handling_dictElements (str) –
How to combine dictionary elements. One of
'concatenate': Concatenate elements alongname_dim_concat_dictElements; output is a single array of the same rank as inputs.'stack': Stack elements along a new leading axis; output is a single array with one extra dimension.'separate': Keep each element as its own tensor; decompositions run independently.
(Default is
'concatenate')name_dim_concat_dictElements (str) – Array dimension along which to concatenate dictionary elements. Only used when
method_handling_dictElementsis'concatenate'. (Default is'time')idx_windows (list) – Per-element
(start, end)index pairs (inclusive) defining a window alongname_dim_array_window. IfNone, the full array is used. (Default isNone)name_dim_array_window (str) – Array dimension along which
idx_windowsis applied. Only used whenidx_windowsis notNone. (Default is'time')
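To make the dimension folding concrete, the following NumPy sketch folds 'xy' into 'points' and then unfolds the complex dimension into 'time' on a toy array. It is a minimal illustration under assumptions: the element ordering inside the folded axes, and the choice to place the real half before the imaginary half, are not confirmed by this documentation and may differ from what rearrange_data does internally.

```python
import numpy as np

# Toy complex array with dims ('xy', 'points', 'frequency', 'time').
rng = np.random.default_rng(0)
shape = (2, 5, 4, 7)
arr = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

# Fold 'xy' into 'points': (2, 5, 4, 7) -> (10, 4, 7), i.e. '(xy points)'.
# Assumption: a plain C-order reshape; the library may order elements differently.
folded = arr.reshape(shape[0] * shape[1], *shape[2:])

# Unfold the complex dimension into 'time': (10, 4, 7) -> (10, 4, 14).
# Assumption: real half first, imaginary half second.
unfolded = np.concatenate([folded.real, folded.imag], axis=-1)

# Reversing the complex unfolding recovers the complex-valued array.
n_time = unfolded.shape[-1] // 2
recovered = unfolded[..., :n_time] + 1j * unfolded[..., n_time:]
```

The result is a real-valued 3-way tensor, which is what a non-negative or real-valued CP solver expects; the last three lines show the kind of inversion that rearrange_factors is documented to perform after fitting.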
- normalize_data(mean_subtract: bool = False, std_divide: bool = False, dim_name: str = 'time')[source]
Normalizes
self.dataalong a named dimension by optional mean subtraction and/or standard-deviation scaling. Requires thatself.rearrange_datahas already populatedself.data.- Parameters:
mean_subtract (bool) – If
True, subtracts the mean alongdim_name. (Default isFalse)std_divide (bool) – If
True, divides by the standard deviation alongdim_name. (Default isFalse)dim_name (str) – Name of the array dimension to normalize over. Must match a name in
self.names_dims_array_preDecomp(i.e. the names that exist afterrearrange_data). (Default is'time')
- fit(data: dict = None, method: str = 'CP_NN_HALS', params_method: dict = {'cvg_criterion': 'abs_rec_error', 'exact': False, 'fixed_modes': None, 'init': 'svd', 'n_iter_max': 100, 'nn_modes': 'all', 'rank': 6, 'sparsity_coefficients': None, 'svd': 'truncated_svd', 'tol': 1e-07, 'verbose': False}, backend: str = 'pytorch', DEVICE: str = 'cpu', verbose: bool | int = 1)[source]
Fits a TCA model to the rearranged data using a
tensorlydecomposition. Populatesself.factors,self.factors_raw, andself.factor_weights.- Parameters:
data (dict) – Dictionary of
numpy.ndarraydata arrays of identical shape. IfNone,self.data(set byrearrange_data) is used. (Default isNone)method (str) –
tensorlydecomposition class to instantiate. One of'CP_NN_HALS': Non-negative CP decomposition via the HALS algorithm.'CP': Standard CP decomposition.'RandomizedCP': Randomized CP decomposition for large tensors.'ConstrainedCP': Constrained CP decomposition.
(Default is
'CP_NN_HALS')params_method (dict) – Keyword arguments forwarded to the
tensorlydecomposition class. Seetensorlydocumentation for valid keys. (Default is theCP_NN_HALSparameter set defined in the signature)backend (str) –
tensorlybackend. One of'pytorch': Recommended for most use cases.'numpy': NumPy backend.
(Default is
'pytorch')DEVICE (str) – Torch device string (e.g.
'cpu'or'cuda') used whenbackendis'pytorch'. (Default is'cpu')verbose (Union[bool, int]) – Verbosity level.
0is silent,1warnings,2info. (Default is1)
- order_factors_by_EVR(data: dict = None, factors: dict = None, weights: dict = None, overwrite_factors: bool = True)[source]
Reorders TCA factors by descending explained variance ratio (EVR) on each data tensor.
- Parameters:
data (dict) – Dictionary of
numpy.ndarraydata tensors. IfNone,self.datais used. (Default isNone)factors (dict) – Dictionary of factor sets keyed to match
data. IfNone,self.factorsis used. (Default isNone)weights (dict) – Dictionary of CP weights keyed to match
data. IfNone,self.factor_weightsis used. (Default isNone)overwrite_factors (bool) – If
True, writes the reordered results back toself.factors,self.factor_weights, andself.evrs_ordered. (Default isTrue)
- Returns:
- tuple containing:
- orders (dict):
Per-key sort indices used to reorder factors.
- factors_ordered (dict):
Per-key dictionaries of factors reordered by EVR.
- weights_ordered (dict):
Per-key CP weight vectors reordered by EVR.
- evrs_ordered (dict):
Per-key explained variance ratios in descending order.
- Return type:
(tuple)
- rearrange_factors(factors: dict = None, undo_concat_dictElements: bool = True, undo_concat_complexDim: bool = True)[source]
Reverses the dimension folding applied in
rearrange_dataso the fitted factors can be interpreted in the original data space. Populatesself.factors_rearranged,self.names_dims_array_postDecomp, andself.name_dim_dictElements_postDecomp.- Parameters:
factors (dict) – Dictionary of factors to rearrange. If
None,self.factorsis used. (Default isNone)undo_concat_dictElements (bool) – If
True, splits the concatenated dictionary-elements dimension back into per-element factors. Requires thatmethod_handling_dictElementswas'concatenate'. (Default isTrue)undo_concat_complexDim (bool) – If
True, recombines real and imaginary halves of the folded complex dimension into complex-valued arrays. Requires thatconcat_complexDimwasTrue. (Default isTrue)
- plot_factors(factors: dict = None, figure_saver: Figure_Saver = None, show_figures: bool = True)[source]
Plots each leaf factor as a normalized line plot, one figure per factor. Optionally writes figures to disk via a
util.Figure_Saver.- Parameters:
factors (dict) – Nested dictionary of factors to plot. If
None,self.factors_rearrangedis used if available, otherwiseself.factors. (Default isNone)figure_saver (util.Figure_Saver) – Saver used to persist figures. If
None, figures are not saved. (Default isNone)show_figures (bool) – If
True, enables interactive mode so figures are shown. (Default isTrue)
face_rhythm.h5_handling module
HDF5 utilities: hierarchical traversal, group I/O, and bulk-close helpers.
Convenience wrappers around h5py for the face-rhythm project. Nothing
here is CUDA- or video-specific; the module is safe to import anywhere.
- face_rhythm.h5_handling.close_all_h5()[source]
Closes every open h5py.File object found in the Python workspace.
Iterates over all live objects via gc and calls close on any h5py.File instance. Falls back to tables.file._open_files.close_all if the primary loop raises. Adapted from https://stackoverflow.com/questions/29863342/close-an-open-h5py-data-file.
- face_rhythm.h5_handling.show_group_items(hObj)[source]
Prints the items at the top hierarchical level of an HDF5 object or dict. RH 2021
See
show_item_tree()for a full recursive listing.- Parameters:
hObj (object) – Hierarchical object: an
h5py.File,h5py.Group, or a Pythondict.
Example
with h5py.File(path, 'r') as f: h5_handling.show_group_items(f)
- face_rhythm.h5_handling.show_item_tree(hObj=None, path=None, depth=None, show_metadata=True, print_metadata=False, indent_level=0)[source]
Recursively prints the items and groups in an HDF5 object or dict. RH 2021
- Parameters:
hObj (object) – Hierarchical object: an
h5py.File,h5py.Group, or a Pythondict. Ignored whenpathis provided. (Default isNone)path (Optional[object]) – Path-like to an HDF5 file to open in read mode. If not
None, the file is opened and traversed in place ofhObj. (Default isNone)depth (Optional[int]) – Maximum number of hierarchical levels to descend.
Nonemeans unlimited. (Default isNone)show_metadata (bool) – If
True, list per-node metadata attributes alongside items. (Default isTrue)print_metadata (bool) – If
True, also print the value of each metadata attribute; otherwise only its shape and dtype are shown. (Default isFalse)indent_level (int) – Internal recursion bookkeeping for indentation; users should leave this at the default. (Default is
0)
Example
with h5py.File(path, 'r') as f: h5_handling.show_item_tree(f)
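Because show_item_tree also accepts a plain dict, the traversal can be illustrated without an HDF5 file. The sketch below is a simplified stand-in, not the library's code: sketch_item_tree is a hypothetical name, it returns lines instead of printing, it ignores metadata, and it assumes depth counts the number of levels listed.

```python
from typing import List, Optional


def sketch_item_tree(obj: dict, depth: Optional[int] = None, indent_level: int = 0) -> List[str]:
    """Hypothetical dict-only traversal; returns one line per item."""
    lines = []
    prefix = "  " * indent_level
    for key, val in obj.items():
        if isinstance(val, dict):
            # Sub-dictionaries play the role of HDF5 groups.
            lines.append(f"{prefix}{key}/")
            # Assumption: depth limits how many levels are listed in total.
            if depth is None or indent_level + 1 < depth:
                lines.extend(sketch_item_tree(val, depth=depth, indent_level=indent_level + 1))
        else:
            # Leaves play the role of datasets; only the type name is shown here.
            lines.append(f"{prefix}{key}: {type(val).__name__}")
    return lines


tree = {"a": {"b": 1, "c": {"d": "x"}}, "e": [1, 2]}
lines_full = sketch_item_tree(tree)
lines_top = sketch_item_tree(tree, depth=1)
```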
- face_rhythm.h5_handling.make_h5_tree(dict_obj, h5_obj, group_string='', use_compression=False, track_order=True)[source]
Recursively writes a Python dict into an HDF5 group/dataset tree. RH 2021
Intended to be called by
write_dict_to_h5(); using it directly is not recommended.- Parameters:
dict_obj (dict) – Source dictionary whose hierarchy and leaf values become groups and datasets, respectively.
h5_obj (h5py.File) – Open HDF5 file (or group) into which the tree is written.
group_string (str) – Path of the current HDF5 group within
h5_objduring recursion. An empty string is treated as the root'/'. (Default is'')use_compression (bool) – If
True, write each dataset with gzip level 9 compression. (Default isFalse)track_order (bool) – If
True, seth5py.get_config()to preserve insertion order of items. (Default isTrue)
- face_rhythm.h5_handling.write_dict_to_h5(path_save, input_dict, use_compression=False, track_order=True, write_mode='w-', show_item_tree_pref=True)[source]
Writes a Python dict to an HDF5 file, mirroring its hierarchy and data. RH 2021
Wraps
make_h5_tree()and optionally prints the resulting tree.- Parameters:
path_save (object) – Full path of the file to write.
strorpathlib.Path.input_dict (dict) – Dictionary whose leaves are HDF5-writable values (typically
numpy.ndarrayor strings).use_compression (bool) – If
True, write each dataset with gzip compression. (Default isFalse)track_order (bool) – If
True, preserve dict insertion order in the HDF5 file. (Default isTrue)write_mode (str) –
File-open mode forwarded to
h5py.File. Either'w': Overwrite any existing file.'w-': Refuse to overwrite an existing file.
(Default is
'w-')show_item_tree_pref (bool) – If
True, print the resulting HDF5 hierarchy after writing. (Default isTrue)
- face_rhythm.h5_handling.simple_load(filepath, return_dict=True, verbose=False)[source]
Loads an HDF5 file and returns it as a nested
dictor an open file. RH 2023- Parameters:
filepath (object) – Full path of the file to read.
strorpathlib.Path.return_dict (bool) – If
True, return a nesteddictwhose keys are group names and whose leaves are the dataset arrays. IfFalse, return the openh5py.Fileobject instead. (Default isTrue)verbose (bool) – If
True, print the file’s hierarchy viashow_item_tree()before returning. (Default isFalse)
- Returns:
- data (object):
Either a nested
dictof arrays (whenreturn_dictisTrue) or an openh5py.Filehandle.
- Return type:
(object)
- face_rhythm.h5_handling.h5Obj_to_dict(hObj)[source]
Converts an
h5pygroup or file into a nested Pythondict. RH 2023
- face_rhythm.h5_handling.simple_save(dict_to_save, path=None, use_compression=False, track_order=True, write_mode='w-', verbose=False)[source]
Saves a Python dict to an HDF5 file or appends it to an existing one. RH 2021
- Parameters:
dict_to_save (dict) – Dictionary to save to the HDF5 file.
path (object) – Full path of the file to write.
strorpathlib.Path. (Default isNone)use_compression (bool) – If
True, write each dataset with gzip compression. (Default isFalse)track_order (bool) – If
True, preserve dict insertion order in the HDF5 file. (Default isTrue)write_mode (str) –
File-open mode forwarded to
h5py.File. Either'w': Overwrite any existing file.'w-': Refuse to overwrite an existing file.'a': Append a new dataset to an existing file.
(Default is
'w-')verbose (bool) – If
True, print the resulting HDF5 hierarchy after writing. (Default isFalse)
- face_rhythm.h5_handling.merge_helper(d, group)[source]
Recursively merges a dictionary into an open
h5py.Group.Sub-dictionaries map to subgroups; non-dict values are written as datasets, replacing any existing dataset with the same name.
- face_rhythm.h5_handling.merge_dict_into_h5_file(d, filepath=None, h5Obj=None)[source]
Merges a dictionary into an existing HDF5 file or open file object.
Wraps
merge_helper(), which recursively walks the dict and merges each level into the matching HDF5 group. Exactly one offilepathorh5Objmust be supplied.- Parameters:
face_rhythm.helpers module
General-purpose helpers: video I/O wrappers, path tools, image warping, downloads.
Collected utilities used across the face-rhythm package. Notable groups:
Video readers (VideoReaderWrapper, BufferedVideoReader) around decord / torchcodec with pre-fetch threads.
Path and file helpers (find_paths, prepare_filepath_for_saving, download + hash verification, zip extraction).
Image registration helpers (find_geometric_transformation, remap-index and flow-field conversions) used by face_rhythm.rois.
Parameter dictionary utilities (fill_missing_keys_with_defaults, flatten_dict) and a handful of numerical / plotting / device utilities.
Some routines are adapted from Rich Hakim’s basic_neural_processing_modules.
- face_rhythm.helpers.prepare_cv2_imshow()[source]
Pre-initializes cv2.imshow to avoid kernel crashes. RH 2022
Calling cv2.imshow after av or decord have been imported can crash the Python kernel. Showing a small dummy frame here primes the OpenCV display loop so subsequent cv2.imshow calls work safely.
- face_rhythm.helpers.find_paths(dir_outer: str | List[str], reMatch: str = 'filename', reMatch_in_path: str | None = None, find_files: bool = True, find_folders: bool = False, depth: int = 0, natsorted: bool = True, alg_ns: str | None = None, verbose: bool = False) List[str][source]
Searches for files and/or folders recursively in a directory using a regex match. RH 2022-2023
- Parameters:
dir_outer (Union[str, List[str]]) – Path(s) to the directory(ies) to search. If a list of directories, then all directories will be searched.
reMatch (str) – Regular expression to match. Each file or folder name encountered will be compared using
re.search(reMatch, filename). If the output is notNone, the file will be included in the output.reMatch_in_path (Optional[str]) –
Additional regular expression to match anywhere in the upper path. Useful for finding files/folders in specific subdirectories. If
None, then no additional matching is done.(Default is
None)find_files (bool) – Whether to find files. (Default is
True)find_folders (bool) – Whether to find folders. (Default is
False)depth (int) –
Maximum folder depth to search. (Default is 0).
depth=0 means only search the outer directory.
depth=2 means search the outer directory and two levels of subdirectories below it
natsorted (bool) – Whether to sort the output using natural sorting with the natsort package. (Default is True)
alg_ns (Optional[str]) – Algorithm to use for natural sorting. See natsort.ns or https://natsort.readthedocs.io/en/4.0.4/ns_class.html/ for options. If None, PATH is used. Other common options are INT, FLOAT, and VERSION. (Default is None)
verbose (bool) – Whether to print the paths found. (Default is False)
- Returns:
- paths (List[str]):
Paths to matched files and/or folders in the directory.
- Return type:
(List[str])
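The depth-limited regex search can be sketched with the standard library alone. This is a reduced illustration, not the library's implementation: find_paths_sketch is a hypothetical name, it handles files only, it omits reMatch_in_path, and it sorts lexicographically instead of using natsort.

```python
import os
import re
import tempfile
from typing import List


def find_paths_sketch(dir_outer: str, reMatch: str, depth: int = 0) -> List[str]:
    """Hypothetical files-only search with a maximum folder depth."""
    matches: List[str] = []

    def _walk(directory: str, level: int) -> None:
        with os.scandir(directory) as it:
            entries = sorted(it, key=lambda e: e.name)
        for entry in entries:
            if entry.is_file():
                # Same test as documented: re.search on the bare file name.
                if re.search(reMatch, entry.name) is not None:
                    matches.append(entry.path)
            elif entry.is_dir() and level < depth:
                # depth=0 never descends; depth=1 descends one level, and so on.
                _walk(entry.path, level + 1)

    _walk(dir_outer, 0)
    return sorted(matches)


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.avi", "notes.txt", os.path.join("sub", "b.avi")):
        open(os.path.join(root, rel), "w").close()
    names_depth0 = [os.path.relpath(p, root) for p in find_paths_sketch(root, r"\.avi$", depth=0)]
    names_depth1 = [os.path.relpath(p, root) for p in find_paths_sketch(root, r"\.avi$", depth=1)]
```

Note that the pattern is a regular expression, not a glob, so a literal dot must be escaped as in r"\.avi$".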
- face_rhythm.helpers.prepare_path(path: str, mkdir: bool = False, exist_ok: bool = True) str[source]
Validates a directory or file path for saving or loading. RH 2023
Resolution rules:
If the path exists and exist_ok is True, it is accepted.
If the path exists and exist_ok is False, an error is raised.
If the path does not exist and refers to a file: the parent directory is created when mkdir is True; otherwise an error is raised when the parent does not exist.
If the path does not exist and refers to a directory: the directory is created when mkdir is True; otherwise an error is raised.
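The resolution rules above can be written out with pathlib. This is a sketch under stated assumptions: prepare_path_sketch is a hypothetical name, a non-existent path is treated as a file when it has a suffix, and the exception types are guesses, since the documentation does not say which errors the real function raises.

```python
import tempfile
from pathlib import Path


def prepare_path_sketch(path: str, mkdir: bool = False, exist_ok: bool = True) -> str:
    """Hypothetical re-statement of the documented resolution rules."""
    p = Path(path).resolve()
    if p.exists():
        if not exist_ok:
            raise FileExistsError(f"{p} already exists and exist_ok is False")
        return str(p)
    # Assumption: a missing path with a suffix (e.g. '.pkl') refers to a file.
    if p.suffix:
        if not p.parent.exists():
            if not mkdir:
                raise FileNotFoundError(f"Parent directory {p.parent} does not exist")
            p.parent.mkdir(parents=True)
    else:
        if not mkdir:
            raise FileNotFoundError(f"Directory {p} does not exist")
        p.mkdir(parents=True)
    return str(p)


with tempfile.TemporaryDirectory() as root:
    target = Path(root) / "new_dir" / "out.pkl"
    try:
        prepare_path_sketch(str(target), mkdir=False)
        raised = False
    except FileNotFoundError:
        raised = True
    prepared = prepare_path_sketch(str(target), mkdir=True)
    parent_created = target.parent.exists()
    file_created = target.exists()
```

Only the parent directory is created for a file path; the file itself is left for the caller to write.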
- face_rhythm.helpers.prepare_filepath_for_saving(filepath: str, mkdir: bool = False, allow_overwrite: bool = True) str[source]
Prepares a file path for saving a file. Ensures the file path is valid and has the necessary permissions.
- Parameters:
- Returns:
- path (str):
The prepared file path for saving.
- Return type:
(str)
- face_rhythm.helpers.prepare_filepath_for_loading(filepath: str, must_exist: bool = True) str[source]
Prepares a file path for loading a file. Ensures the file path is valid and has the necessary permissions.
- face_rhythm.helpers.prepare_directory_for_saving(directory: str, mkdir: bool = False, exist_ok: bool = True) str[source]
Prepares a directory path for saving a file. This function is rarely used.
- Parameters:
- Returns:
- path (str):
The prepared directory path for saving.
- Return type:
(str)
- face_rhythm.helpers.prepare_directory_for_loading(directory: str, must_exist: bool = True) str[source]
Prepares a directory path for loading a file. This function is rarely used.
- face_rhythm.helpers.pickle_save(obj: Any, filepath: str, mode: str = 'wb', zipCompress: bool = False, mkdir: bool = False, allow_overwrite: bool = True, **kwargs_zipfile: Dict[str, Any]) None[source]
Saves an object to a pickle file using pickle.dump. Allows for zipping of the file.
RH 2022
- Parameters:
obj (Any) – The object to save.
filepath (str) – The path to save the object to.
mode (str) –
The mode to open the file in. Options are:
'wb': Write binary.'ab': Append binary.'xb': Exclusive write binary. Raises FileExistsError if the file already exists.
(Default is
'wb')zipCompress (bool) – If
True, compresses the pickle file using the zipfile compression method (with zipfile.ZIP_DEFLATED), which is similar to savez_compressed in numpy. Useful for saving redundant and/or sparse array objects. (Default is False)
mkdir (bool) – If
True, creates parent directory if it does not exist. (Default isFalse)allow_overwrite (bool) – If
True, allows overwriting of existing file. (Default isTrue)kwargs_zipfile (Dict[str, Any]) –
Keyword arguments passed to zipfile.ZipFile. compression=zipfile.ZIP_DEFLATED by default. See https://docs.python.org/3/library/zipfile.html#zipfile-objects. Options for 'compression' (given either as an int or as the zipfile constant):
0: zipfile.ZIP_STORED (no compression)
8: zipfile.ZIP_DEFLATED (usual zip compression)
12: zipfile.ZIP_BZIP2 (bzip2 compression; usually not as good as ZIP_DEFLATED)
14: zipfile.ZIP_LZMA (lzma compression; usually better than ZIP_DEFLATED but slower)
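One way to combine pickle with zipfile compression is sketched below. It is not the library's code: the function names are hypothetical, and the archive member name "data" is an assumption made for this example, so files written this way are not guaranteed to be readable by pickle_load.

```python
import os
import pickle
import tempfile
import zipfile


def pickle_save_sketch(obj, filepath, zip_compress=False, **kwargs_zipfile):
    """Hypothetical saver: plain pickle, or pickle bytes inside a zip archive."""
    if zip_compress:
        # ZIP_DEFLATED unless the caller overrides 'compression'.
        kwargs_zipfile.setdefault("compression", zipfile.ZIP_DEFLATED)
        with zipfile.ZipFile(filepath, "w", **kwargs_zipfile) as zf:
            zf.writestr("data", pickle.dumps(obj))  # member name is an assumption
    else:
        with open(filepath, "wb") as f:
            pickle.dump(obj, f)


def pickle_load_sketch(filepath, zip_compressed=False):
    """Hypothetical loader matching pickle_save_sketch."""
    if zip_compressed:
        with zipfile.ZipFile(filepath, "r") as zf:
            return pickle.loads(zf.read("data"))
    with open(filepath, "rb") as f:
        return pickle.load(f)


payload = {"zeros": [0] * 100_000}  # highly redundant, so it compresses well
with tempfile.TemporaryDirectory() as root:
    path_raw = os.path.join(root, "raw.pkl")
    path_zip = os.path.join(root, "zipped.pkl")
    pickle_save_sketch(payload, path_raw)
    pickle_save_sketch(payload, path_zip, zip_compress=True)
    size_raw, size_zip = os.path.getsize(path_raw), os.path.getsize(path_zip)
    restored = pickle_load_sketch(path_zip, zip_compressed=True)
```

The loader must be told whether the file was zip-compressed, which mirrors the separate zipCompressed argument on pickle_load.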
- face_rhythm.helpers.pickle_load(filepath: str, zipCompressed: bool = False, mode: str = 'rb') Any[source]
Loads an object from a pickle file. RH 2022
- Parameters:
- Returns:
- obj (Any):
The object loaded from the pickle file.
- Return type:
(Any)
- face_rhythm.helpers.json_save(obj: Any, filepath: str, indent: int = 4, mode: str = 'w', mkdir: bool = False, allow_overwrite: bool = True) None[source]
Saves an object to a json file using json.dump. RH 2022
- Parameters:
obj (Any) – The object to save.
filepath (str) – The path to save the object to.
indent (int) – Number of spaces for indentation in the output json file. (Default is 4)
mode (str) –
The mode to open the file in. Options are:
'w': Write text. 'a': Append text. 'x': Exclusive write. Raises FileExistsError if the file already exists.
(Default is
'w')mkdir (bool) – If
True, creates parent directory if it does not exist. (Default isFalse)allow_overwrite (bool) – If
True, allows overwriting of existing file. (Default isTrue)
- face_rhythm.helpers.json_load(filepath: str, mode: str = 'r') Any[source]
Loads an object from a json file. RH 2022
- face_rhythm.helpers.yaml_save(obj: object, filepath: str, indent: int = 4, mode: str = 'w', mkdir: bool = False, allow_overwrite: bool = True) None[source]
Saves an object to a YAML file using the
yaml.dumpmethod. RH 2022- Parameters:
obj (object) – The object to be saved.
filepath (str) – Path to save the object to.
indent (int) – The number of spaces for indentation in the saved YAML file. (Default is 4)
mode (str) –
Mode to open the file in.
'w': write (default)'wb': write binary'ab': append binary'xb': exclusive write binary. RaisesFileExistsErrorif file already exists.
(Default is
'w')mkdir (bool) – If
True, creates the parent directory if it does not exist. (Default isFalse)allow_overwrite (bool) – If
True, allows overwriting of existing files. (Default isTrue)
- face_rhythm.helpers.yaml_load(filepath: str, mode: str = 'r', loader: object = <class 'yaml.loader.FullLoader'>) object[source]
Loads a YAML file. RH 2022
- Parameters:
filepath (str) – Path to the YAML file to load.
mode (str) – Mode to open the file in. (Default is
'r')loader (object) –
The YAML loader to use.
yaml.FullLoader: Loads the full YAML language. Avoids arbitrary code execution. (Default for PyYAML 5.1+)yaml.SafeLoader: Loads a subset of the YAML language, safely. This is recommended for loading untrusted input.yaml.UnsafeLoader: The original Loader code that could be easily exploitable by untrusted data input.yaml.BaseLoader: Only loads the most basic YAML. All scalars are loaded as strings.
(Default is
yaml.FullLoader)
- Returns:
- loaded_obj (object):
The object loaded from the YAML file.
- Return type:
(object)
- face_rhythm.helpers.download_file(url: str | None, path_save: str, check_local_first: bool = True, check_hash: bool = False, hash_type: str = 'MD5', hash_hex: str | None = None, mkdir: bool = False, allow_overwrite: bool = True, write_mode: str = 'wb', verbose: bool = True, chunk_size: int = 1024) None[source]
Downloads a file from a URL to a local path using requests. Checks if file already exists locally and verifies the hash of the downloaded file against a provided hash if required. RH 2023
- Parameters:
url (Optional[str]) – URL of the file to download. If
None, then no download is attempted. (Default isNone)path_save (str) – Path to save the file to.
check_local_first (bool) – Whether to check if the file already exists locally. If
Trueand the file exists locally, the download is skipped. IfTrueandcheck_hashis alsoTrue, the hash of the local file is checked. If the hash matches, the download is skipped. If the hash does not match, the file is downloaded. (Default isTrue)check_hash (bool) – Whether to check the hash of the local or downloaded file against
hash_hex. (Default isFalse)hash_type (str) – Type of hash to use. Options are:
'MD5','SHA1','SHA256','SHA512'. (Default is'MD5')hash_hex (Optional[str]) – Hash to compare to, in hexadecimal format (e.g., ‘a1b2c3d4e5f6…’). Can be generated using
hash_file() or the hexdigest() method of a hashlib hash object. If check_hash is True, hash_hex must be provided. (Default is None)
mkdir (bool) – If
True, creates the parent directory ofpath_saveif it does not exist. (Default isFalse)write_mode (str) – Write mode for saving the file. Options include:
'wb'(write binary),'ab'(append binary),'xb'(write binary, fail if file exists). (Default is'wb')verbose (bool) – If
True, prints status messages. (Default isTrue)chunk_size (int) – Size of chunks in which to download the file. (Default is 1024)
- face_rhythm.helpers.hash_file(path: str, type_hash: str = 'MD5', buffer_size: int = 65536) str[source]
Computes the hash of a file using the specified hash type and buffer size. RH 2022
- Parameters:
path (str) – Path to the file to be hashed.
type_hash (str) –
Type of hash to use. One of
'MD5': MD5 hash algorithm. 'SHA1': SHA1 hash algorithm. 'SHA256': SHA256 hash algorithm. 'SHA512': SHA512 hash algorithm.
(Default is 'MD5')
buffer_size (int) – Buffer size (in bytes) for reading the file. 65536 corresponds to 64KB. (Default is 65536)
- Returns:
- hash_val (str):
The computed hash of the file.
- Return type:
(str)
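Buffered hashing, as described by the buffer_size parameter, can be sketched with hashlib. The function name hash_file_sketch is hypothetical, and mapping type_hash to a hashlib algorithm by lowercasing it is an assumption about how the names are interpreted.

```python
import hashlib
import os
import tempfile


def hash_file_sketch(path: str, type_hash: str = "MD5", buffer_size: int = 65536) -> str:
    """Hypothetical buffered file hasher returning a hex digest."""
    hasher = hashlib.new(type_hash.lower())  # 'MD5' -> 'md5', 'SHA256' -> 'sha256', ...
    with open(path, "rb") as f:
        while True:
            chunk = f.read(buffer_size)
            if not chunk:
                break
            # Feeding chunks keeps memory use bounded by buffer_size.
            hasher.update(chunk)
    return hasher.hexdigest()


data = b"face-rhythm" * 50_000
with tempfile.TemporaryDirectory() as root:
    path_blob = os.path.join(root, "blob.bin")
    with open(path_blob, "wb") as f:
        f.write(data)
    digest_md5 = hash_file_sketch(path_blob, "MD5")
    digest_sha256 = hash_file_sketch(path_blob, "SHA256", buffer_size=1024)
```

The digest does not depend on buffer_size; the buffer only trades memory for the number of read calls.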
- face_rhythm.helpers.get_dir_contents(directory: str) Tuple[List[str], List[str]][source]
Retrieves the names of the folders and files in a directory (does not include subdirectories). RH 2021
- face_rhythm.helpers.compare_file_hashes(hash_dict_true: Dict[str, Tuple[str, str]], dir_files_test: str | None = None, paths_files_test: List[str] | None = None, verbose: bool = True) Tuple[bool, Dict[str, bool], Dict[str, str]][source]
Compares hashes of files in a directory or list of paths to provided hashes. RH 2022
- Parameters:
hash_dict_true (Dict[str, Tuple[str, str]]) – Dictionary of hashes to compare. Each entry should be in the format: {‘key’: (‘filename’, ‘hash’)}.
dir_files_test (str) – Path to directory containing the files to compare hashes. Unused if paths_files_test is not
None. (Optional)paths_files_test (List[str]) – List of paths to files to compare hashes. dir_files_test is used if
None. (Optional)verbose (bool) – If
True, failed comparisons are printed out. (Default isTrue)
- Returns:
- tuple containing:
- total_result (bool):
True if all hashes match, False otherwise.
- individual_results (Dict[str, bool]):
Dictionary indicating whether each hash matched.
- paths_matching (Dict[str, str]):
Dictionary of paths that matched. Each entry is in the format: {‘key’: ‘path’}.
- Return type:
(tuple)
- face_rhythm.helpers.extract_zip(path_zip: str, path_extract: str | None = None, verbose: bool = True) List[str][source]
Extracts a zip file. RH 2022
- Parameters:
- Returns:
- paths_extracted (List[str]):
List of paths to the extracted files.
- Return type:
(List[str])
- face_rhythm.helpers.make_batches(iterable, batch_size=None, num_batches=None, min_batch_size=0, return_idx=False, length=None)[source]
Generates batches of data from an iterable. RH 2021
- Parameters:
iterable (Iterable) – Iterable to be batched.
batch_size (Optional[int]) – Size of each batch. If
None,batch_sizeis computed fromnum_batches. (Default isNone)num_batches (Optional[int]) – Number of batches to make. Used only when
batch_sizeisNone. (Default isNone)min_batch_size (int) – Minimum size of each batch. Batches smaller than this are skipped. (Default is
0)return_idx (bool) – If
True, yields(batch, [start, end])tuples instead of just the batch. (Default isFalse)length (Optional[int]) – Length of the iterable. If
None, useslen(iterable). Useful when the iterable does not implement__len__. (Default isNone)
- Returns:
- output (Generator):
Yields successive batches from
iterable. Ifreturn_idxisTrue, yields(batch, [start, end])tuples.
- Return type:
(Generator)
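The batching behaviour can be sketched as a generator. This is an illustration under assumptions: make_batches_sketch is a hypothetical name, batch_size is derived from num_batches by rounding up, and the reported end index is exclusive; the real function may differ on those two points.

```python
import math


def make_batches_sketch(iterable, batch_size=None, num_batches=None,
                        min_batch_size=0, return_idx=False, length=None):
    """Hypothetical batching generator over a sliceable sequence."""
    if length is None:
        length = len(iterable)
    if batch_size is None:
        # Assumption: round up so that num_batches batches cover the whole input.
        batch_size = math.ceil(length / num_batches)
    for start in range(0, length, batch_size):
        end = min(start + batch_size, length)
        if (end - start) < min_batch_size:
            continue  # batches smaller than min_batch_size are skipped
        batch = iterable[start:end]
        yield (batch, [start, end]) if return_idx else batch


batches = list(make_batches_sketch(list(range(10)), batch_size=4))
batches_idx = list(make_batches_sketch(list(range(10)), num_batches=3,
                                       min_batch_size=3, return_idx=True))
```

In the second call the trailing two-element batch is dropped because it is shorter than min_batch_size.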
- face_rhythm.helpers.cp_to_dense(cp, weights=None)[source]
Reconstructs a dense tensor from a CP-format list of factor matrices. RH 2022
- Parameters:
cp (List[np.ndarray]) – List of length
n_modesof 2D factor matrices, each with shape (len_dim, rank). This is the format Tensorly uses for its'cp'representation. Elements may be NumPy arrays ortorch.Tensor(matching dtype).weights (Optional[np.ndarray]) – Per-rank weights of length
rank. IfNone, uses a vector of ones. (Default isNone)
- Returns:
- dense (np.ndarray):
Reconstructed dense tensor. shape: (len_dim_0, len_dim_1, …).
- Return type:
(np.ndarray)
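The reconstruction is a weighted sum of rank-one outer products, one per component. The NumPy sketch below shows that sum directly; cp_to_dense_sketch is a hypothetical name and handles NumPy inputs only, whereas the documented function also accepts torch tensors.

```python
import numpy as np


def cp_to_dense_sketch(cp, weights=None):
    """Hypothetical dense reconstruction from CP factor matrices (len_dim, rank)."""
    rank = cp[0].shape[1]
    if weights is None:
        weights = np.ones(rank)
    dense = np.zeros([factor.shape[0] for factor in cp])
    for r in range(rank):
        # Build the r-th rank-one tensor as a chain of outer products.
        component = weights[r]
        for factor in cp:
            component = np.multiply.outer(component, factor[:, r])
        dense += component
    return dense


rng = np.random.default_rng(0)
factors = [rng.random((4, 3)), rng.random((5, 3)), rng.random((6, 3))]
w = np.array([2.0, 1.0, 0.5])
dense = cp_to_dense_sketch(factors, weights=w)
```

For three modes this is equivalent to np.einsum("r,ir,jr,kr->ijk", w, *factors); the loop form generalizes to any number of modes.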
- class face_rhythm.helpers.Lazy_repeat_item(item, pseudo_length=None)[source]
Bases:
objectLazy iterator-like container that always returns the same item. RH 2021
- Parameters:
item (Any) – Item to repeat on every access.
pseudo_length (Optional[int]) – Reported length of the container. If
None, the container has no enforced length and__getitem__always returnsitem. (Default isNone)
- item
The repeated item.
- Type:
Any
- face_rhythm.helpers.deep_update_dict(dictionary, key, new_val=None, new_key=None, in_place=False)[source]
Updates a value or renames a key inside a nested dictionary. RH 2022
- Parameters:
dictionary (Dict) – Dictionary to update.
key (List[str]) – Hierarchical path of string keys leading to the entry to update. Each element corresponds to a nesting level.
new_val (Optional[Any]) – New value to assign. If
None,new_keymust be provided and only the key is renamed. (Default isNone)new_key (Optional[str]) – If provided,
key[-1]is removed and replaced withnew_key(mapping tonew_valif given, otherwise to the existing value). (Default isNone)in_place (bool) – If
True, updatesdictionaryin place and returnsNone. IfFalse, returns a deep-copied updated dictionary. (Default isFalse)
- Returns:
- output (Optional[Dict]):
Updated dictionary when
in_placeisFalse; otherwiseNone.
- Return type:
(Optional[Dict])
Example
deep_update_dict(params, ['dataloader_kwargs', 'prefetch_factor'], val)
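The update and rename behaviour can be sketched as follows. This is a hypothetical re-implementation for illustration (deep_update_dict_sketch is not the library function), and raising ValueError when neither new_val nor new_key is given is an assumption.

```python
import copy


def deep_update_dict_sketch(dictionary, key, new_val=None, new_key=None, in_place=False):
    """Hypothetical nested update: set a value and/or rename the final key."""
    if new_val is None and new_key is None:
        raise ValueError("Provide new_val, new_key, or both")  # assumed behaviour
    d = dictionary if in_place else copy.deepcopy(dictionary)
    parent = d
    for k in key[:-1]:
        parent = parent[k]  # walk down to the dict that holds the final key
    if new_key is not None:
        old_val = parent.pop(key[-1])
        parent[new_key] = old_val if new_val is None else new_val
    else:
        parent[key[-1]] = new_val
    return None if in_place else d


params = {"dataloader_kwargs": {"prefetch_factor": 2, "shuffle": True}}
updated = deep_update_dict_sketch(params, ["dataloader_kwargs", "prefetch_factor"], new_val=4)
renamed = deep_update_dict_sketch(params, ["dataloader_kwargs", "shuffle"], new_key="shuffle_data")
```

With in_place left at False the input dictionary is untouched, which is convenient when deriving several parameter sets from one template.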
- face_rhythm.helpers.flatten_dict(d: MutableMapping, parent_key: str = '', sep: str = '.') MutableMapping[source]
Flattens a nested dictionary into a single dictionary. RH 2022
All keys are coerced to strings and joined by
sep. Adapted from https://stackoverflow.com/a/6027615.- Parameters:
- Returns:
- flattened (Dict):
Flat dictionary with paths joined by sep.
- Return type:
(Dict)
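A minimal sketch of the recursive flattening approach, assuming the same argument names as the signature above. The name flatten_dict_sketch is hypothetical and the code is illustrative rather than the package source.

```python
from collections.abc import MutableMapping

def flatten_dict_sketch(d, parent_key='', sep='.'):
    """Flatten nested mappings into one dict with sep-joined string keys."""
    items = []
    for k, v in d.items():
        # Coerce every key to a string and prefix it with the parent path
        new_key = f"{parent_key}{sep}{k}" if parent_key else str(k)
        if isinstance(v, MutableMapping):
            items.extend(flatten_dict_sketch(v, new_key, sep=sep).items())
        else:
            items.append((new_key, v))
    return dict(items)

flat = flatten_dict_sketch({'a': {'b': 1, 'c': {'d': 2}}, 3: 'x'})
```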
- face_rhythm.helpers.find_subDict_key(d: dict, s: str, max_depth: int = 9999999)[source]
Recursively searches a nested dictionary for keys matching a regex.
- Parameters:
- Returns:
- k_all (List[Tuple[List[str], Any]]):
List of 2-tuples (path, value), where path is the list of string keys leading to the matched entry and value is the matched sub-dictionary value.
- Return type:
(List[Tuple[List[str], Any]])
- face_rhythm.helpers.fill_in_dict(d: Dict, defaults: Dict, verbose: bool = True, hierarchy: List[str] = ['dict'])[source]
Fills in a dictionary in place with values from defaults for missing keys, recursing into nested dictionaries. RH 2023
- Parameters:
d (Dict) – Dictionary to fill in (modified in place).
defaults (Dict) – Dictionary of default values.
verbose (bool) – If True, prints a message each time a default value is inserted. (Default is True)
hierarchy (List[str]) – Path of keys leading to d. Used internally for recursion. (Default is ['dict'])
- face_rhythm.helpers.check_keys_subset(d, default_dict, error_on_missing_keys=True, hierarchy=['defaults'])[source]
Verifies recursively that every key in d also appears in default_dict. RH 2023
- Parameters:
d (Dict) – Dictionary to check.
default_dict (Dict) – Dictionary containing the allowed keys.
error_on_missing_keys (bool) – If True, raises AssertionError when a key in d is not in default_dict. If False, emits a warning instead. (Default is True)
hierarchy (List[str]) – Path of keys leading to d. Used internally for recursion. (Default is ['defaults'])
- face_rhythm.helpers.prepare_params(params, defaults, error_on_missing_keys=True, verbose=True)[source]
Validates params against defaults and fills in missing keys.
Performs the following:
Checks that all keys in params are also in defaults.
Fills in any missing keys in params with values from defaults.
Returns a deepcopy of the filled-in params.
- Parameters:
params (Dict) – Dictionary of parameters.
defaults (Dict) – Dictionary of defaults.
error_on_missing_keys (bool) – If True, raises an error when a key in params is not in defaults. If False, emits a warning instead. (Default is True)
verbose (bool) – If True, prints messages while filling in defaults. (Default is True)
- Returns:
- params_out (Dict):
Validated and default-filled deepcopy of params.
- Return type:
(Dict)
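The validate-then-fill flow can be sketched as follows. All three function names are hypothetical stand-ins for check_keys_subset, fill_in_dict, and prepare_params, and the bodies are illustrative, not the package source. One deliberate difference is assumed here: the sketch copies params before filling, so the caller's dictionary is never modified.

```python
import copy
import warnings

def check_keys_subset_sketch(d, default_dict, error_on_missing_keys=True, hierarchy=('defaults',)):
    """Recursively confirm that every key of d exists in default_dict."""
    for key, val in d.items():
        path = list(hierarchy) + [str(key)]
        if key not in default_dict:
            msg = f"Key not found in defaults: {' > '.join(path)}"
            if error_on_missing_keys:
                raise AssertionError(msg)
            warnings.warn(msg)
        elif isinstance(val, dict) and isinstance(default_dict[key], dict):
            check_keys_subset_sketch(val, default_dict[key], error_on_missing_keys, path)

def fill_in_dict_sketch(d, defaults):
    """Insert missing keys from defaults into d, in place, recursively."""
    for key, val in defaults.items():
        if key not in d:
            d[key] = copy.deepcopy(val)
        elif isinstance(val, dict) and isinstance(d[key], dict):
            fill_in_dict_sketch(d[key], val)

def prepare_params_sketch(params, defaults, error_on_missing_keys=True):
    """Validate params against defaults, then return a filled-in copy."""
    check_keys_subset_sketch(params, defaults, error_on_missing_keys)
    params_out = copy.deepcopy(params)  # assumption: copy first, then fill
    fill_in_dict_sketch(params_out, defaults)
    return params_out
```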
- class face_rhythm.helpers.VideoReaderWrapper(*args: Any, **kwargs: Any)[source]
Bases: VideoReader
Subclass of decord.VideoReader that works around a memory leak.
Calls self.seek(0) after initialization and after every __getitem__ so that decord releases buffered frames. Adapted from https://github.com/dmlc/decord/issues/208#issuecomment-1157632702.
- class face_rhythm.helpers.TorchCodecVideoReader(path_video: str, device: str = 'cpu', num_ffmpeg_threads: int = 0)[source]
Bases: object
Video reader backed by torchcodec.decoders.VideoDecoder with a workaround for torchcodec issue #905.
Provides the same __getitem__ / __len__ / get_avg_fps interface as VideoReaderWrapper (decord) so it can be used as a drop-in replacement inside BufferedVideoReader.
Frames are returned as torch.Tensor with shape (H, W, C) and dtype uint8 (NHWC layout), matching the output of decord’s torch bridge.
Issue #905 workaround: torchcodec’s sequential access path skips cursor reset; after reading n - has_b_frames frames on a single decoder, FFmpeg’s H.264 drain emits has_b_frames AVFrames with pts = INT64_MIN, which the internal PTS filter rejects, causing EndOfFileException before the last frames are decoded. The fix is to serve only frames [0, n - SAFETY) from the primary decoder (which never reaches the drain), and route the trailing SAFETY frames through a fresh decoder that takes the non-sequential seek branch (avformat_seek_file + flush + forward-decode from keyframe) where pkt_dts remains valid.
SAFETY = max(has_b_frames, 2). torchcodec’s VideoStreamMetadata does not currently expose has_b_frames, so SAFETY always defaults to 2, which matches ffprobe-reported has_b_frames=2 for H.264 AVIs and is a safe overestimate for has_b_frames=0 files.
The tail decoder is lazily created on first tail access and cached for the lifetime of this reader (one decoder recreation per video pass, versus one per chunk with earlier workarounds).
Thread safety is guaranteed by an internal lock, which is required because BufferedVideoReader loads slots from background threads.
- Parameters:
path_video (str) – Path to the video file.
device (str) – Decode device. 'cpu' for software decode, 'cuda' or 'cuda:0' for NVDEC hardware decode. NVDEC requires torchcodec built with CUDA support and an FFmpeg built with --enable-cuda.
num_ffmpeg_threads (int) – Number of FFmpeg internal threads for decoding. 0 lets FFmpeg choose automatically (recommended).
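The frame routing rule described in the workaround can be illustrated with a small pure-Python helper. The function name route_frame_sketch and its return labels are hypothetical; the real class manages decoder objects and a lock, which this sketch omits.

```python
def route_frame_sketch(idx, n_frames, has_b_frames=None):
    """Return which decoder ('primary' or 'tail') would serve frame idx."""
    # SAFETY = max(has_b_frames, 2); a missing has_b_frames is treated as 0
    safety = max(has_b_frames or 0, 2)
    if not 0 <= idx < n_frames:
        raise IndexError(idx)
    # Frames in [0, n - SAFETY) stay on the primary decoder
    return 'primary' if idx < n_frames - safety else 'tail'
```

For a 100-frame video with the default SAFETY of 2, frames 0 to 97 stay on the primary decoder and frames 98 and 99 go to the tail decoder.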
- class face_rhythm.helpers.BufferedVideoReader(video_readers: list = None, paths_videos: list = None, buffer_size: int = 1000, prefetch: int = 2, posthold: int = 1, method_getitem: str = 'continuous', starting_seek_position: int = 0, backend: str = 'torchcodec', device: str = 'cpu', decord_backend: str = 'torch', decord_ctx=None, verbose: int = 1)[source]
Bases: object
Reads frames from one or more videos with a chunked memory buffer and optional background prefetching. RH 2022
Sequential batches of frames can be read quickly because buffers are filled by background threads. In many cases, batches can be consumed without waiting for the next chunk to finish loading.
Optimal use:
Create a BufferedVideoReader object.
EITHER set method_getitem='continuous' and iterate over the object (fastest path), OR request batches of frames sequentially (going backwards is slow because buffers move forward).
Each batch should fit inside a single buffer slot. Slices that span multiple buffer slots require concatenation and are slow. With a buffer size of 1000 frames, the sequence [0:1000], [1000:2000], ... is fast, while [0:1700], [1700:3200], [0:990], [990:1010] are slow (too big, overlapping, backwards, or crossing slot boundaries).
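To check ahead of time whether a requested slice stays inside one buffer slot, a helper along these lines may be useful. It is a hypothetical sketch that assumes a single video (slots never span videos) and looks at one slice in isolation, so it does not detect backwards or overlapping access patterns.

```python
def slice_is_slot_aligned(start, stop, buffer_size=1000):
    """True if frames [start, stop) fall inside a single buffer slot."""
    if stop <= start:
        return False
    # First and last requested frames must live in the same slot
    return start // buffer_size == (stop - 1) // buffer_size
```

With buffer_size=1000, the slices [0:1000] and [1000:2000] pass this check, while [0:1700] and [990:1010] do not.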
- Parameters:
video_readers (Optional[list]) – List of video reader objects (decord.VideoReader or TorchCodecVideoReader). A single reader is also accepted. If None, paths_videos must be provided. (Default is None)
paths_videos (Optional[list]) – List of paths to videos. A single str is also accepted. If None, video_readers must be provided. If both are supplied, video_readers wins. (Default is None)
buffer_size (int) – Number of frames per buffer slot. Avoid indexing more than buffer_size frames at a time or across slot boundaries (e.g. across idx % buffer_size == 0); these require concatenating buffers and are slow. (Default is 1000)
prefetch (int) – Number of buffers to prefetch ahead. 0 disables prefetching. A single buffer slot only contains frames from one video, so buffer_size <= video length is recommended. (Default is 2)
posthold (int) – Number of buffers to keep loaded behind the current position. 0 disables posthold. Useful when iterating backwards. (Default is 1)
method_getitem (str) – Indexing mode for __getitem__. One of
'continuous': index across all videos as a single concatenated sequence; reader[idx_frames_slice].
'by_video': index requires a (idx_video, idx_frames) tuple; reader[(idx_video, slice)].
(Default is 'continuous')
starting_seek_position (int) – Starting frame index for the iterator. Used only when method_getitem == 'continuous' and iterating. (Default is 0)
backend (str) – Video decoding backend. One of
'torchcodec': uses torchcodec.decoders.VideoDecoder. Frame-accurate seeking, actively maintained, supports CPU and GPU (NVDEC) decode. Includes a workaround for torchcodec issue #905 (sequential-access drain bug near EOF in H.264 AVIs): frames in [0, n - SAFETY) come from a persistent decoder; the trailing SAFETY = max(has_b_frames, 2) frames go through a fresh decoder cached as the tail decoder.
'decord': uses decord.VideoReader. Well-tested fallback and the only backend available on Windows. Provided by the decord2 PyPI package on Linux/macOS (with vendored FFmpeg 8 wheels for py3.10-3.14) and by eva_decord on Windows. Both are installed by face-rhythm’s default dependencies.
Only used when paths_videos is provided. (Default is 'torchcodec')
device (str) – Device for video decoding when using torchcodec. 'cpu' decodes on CPU. 'cuda' or 'cuda:0' decodes on GPU using NVDEC; frames are returned as CUDA tensors. GPU decode requires an NVIDIA GPU, torchcodec installed with CUDA support, and FFmpeg built with --enable-cuda. (Default is 'cpu')
decord_backend (str) – Backend used by decord when loading frames ('torch', 'numpy', 'mxnet', …). Only used when backend='decord'. (Default is 'torch')
decord_ctx (object) – Context used by decord when loading frames (e.g. decord.cpu(), decord.gpu()). Only used when backend='decord'. (Default is None)
verbose (int) – Verbosity level. 0 silences output, 1 prints warnings, 2 prints warnings and info. (Default is 1)
- metadata
Per-video metadata (path, length, fps, frame size, channels).
- Type:
pandas.DataFrame
- slots
Buffer slots holding chunks of decoded frames.
- Type:
List[List[Optional[torch.Tensor]]]
- lookup
Lookup table mapping continuous frame index to (video, slot).
- Type:
pandas.DataFrame
- get_frames_from_single_video_index(idx: tuple)[source]
Returns frames from a single video by (video, frame) index.
If idx is an int or slice, it is interpreted as a video index and a new BufferedVideoReader is constructed over the selected videos.
- Parameters:
idx (Union[int, slice, Tuple[int, Union[int, slice]]]) – Either (idx_video, idx_frames) to read frames from one video, or an int / slice to spawn a reader over a subset of videos.
- Returns:
- frames (Union[torch.Tensor, BufferedVideoReader]):
Decoded frames with shape (num_frames, H, W, C) when idx is a tuple, or a new reader when idx selects videos.
- Return type:
(Union[torch.Tensor, BufferedVideoReader])
- get_frames_from_continuous_index(idx)[source]
Returns frames addressed by a continuous (concatenated) frame index.
The videos are treated as one long sequence of frames; idx is the index of the frames within this sequence.
- face_rhythm.helpers.save_gif(array, path, frameRate=5.0, loop=0, backend='PIL', kwargs_backend={})[source]
Saves an array of images as an animated GIF. RH 2023
- Parameters:
array (Union[np.ndarray, list]) – 3D (grayscale) or 4D (color) array of images. If dtype is floating, values are interpreted in [0, 1]; if integer, in [0, 255].
path (str) – Output path for the GIF.
frameRate (float) – Frame rate of the GIF in frames per second. (Default is 5.0)
loop (int) – Number of loops. 0 loops forever, 1 plays once, 2 plays twice, etc. (Default is 0)
backend (str) – GIF writer backend. One of
'imageio'
'PIL'
(Default is 'PIL')
kwargs_backend (dict) – Extra keyword arguments forwarded to the chosen backend. (Default is {})
- face_rhythm.helpers.grayscale_to_rgb(array)[source]
Converts a grayscale image or movie to RGB by repeating the channel. RH 2023
- Parameters:
array (Union[np.ndarray, torch.Tensor, list]) – 2D image or 3D movie of grayscale frames. Lists of arrays or tensors are stacked first.
- Returns:
- rgb (Union[np.ndarray, torch.Tensor]):
Same backend as the input with an extra trailing channel dimension of size 3.
- Return type:
(Union[np.ndarray, torch.Tensor])
- class face_rhythm.helpers.Toeplitz_convolution2d(x_shape, k, mode='same', dtype=None)[source]
Bases: object
Convolves a 2D array with a 2D kernel via Toeplitz matrix multiplication. RH 2022
Allows sparse x inputs (k must remain dense). Ideal when x is very sparse (density < 0.01), x is small (shape < (1000, 1000)), k is small (shape < (100, 100)), and the batch size is large (e.g. 1000+). Generally faster than scipy.signal.convolve2d when convolving many arrays with the same kernel. Memory footprint stays low because the Toeplitz matrix is held as a sparse matrix.
See https://stackoverflow.com/a/51865516 and https://github.com/alisaaalehi/convolution_as_multiplication for an illustration. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.convolution_matrix.html for the 1D version, and https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.matmul_toeplitz.html for potential speedups.
- Parameters:
x_shape (Tuple[int, int]) – Shape of the 2D array to be convolved.
k (np.ndarray) – 2D kernel to convolve with.
mode (str) – Convolution mode. One of
'full'
'same'
'valid'
See scipy.signal.convolve2d for details. (Default is 'same')
dtype (Optional[np.dtype]) – Data type for the Toeplitz matrix. Ideally matches the dtype of the input array. If None, the dtype of k is used. (Default is None)
- k
Flipped copy of the kernel used internally.
- Type:
np.ndarray
- dtype
Data type of the Toeplitz matrix.
- Type:
np.dtype
- dt
The double-block Toeplitz matrix in sparse CSR form.
- Type:
Example
conv = Toeplitz_convolution2d(x_shape=x.shape, k=kernel, mode='same')
y = conv(x)
- face_rhythm.helpers.cosine_kernel_2D(center=(5, 5), image_size=(11, 11), width=5)[source]
Generates a 2D radial cosine kernel. RH 2021
- Parameters:
center (Tuple[int, int]) – (x, y) peak position, zero-indexed. Set the second element to 0 to obtain a 1D kernel. (Default is (5, 5))
image_size (Tuple[int, int]) – (width, height) of the output kernel. Set the second element to 0 for a 1D kernel. (Default is (11, 11))
width (float) – Full width of one cycle of the cosine. (Default is 5)
- Returns:
- k_cos (np.ndarray):
Cosine kernel. shape: (image_size[0], image_size[1]).
- Return type:
(np.ndarray)
- face_rhythm.helpers.bounded_logspace(start, stop, num)[source]
Logarithmically spaced values between start and stop (inclusive). RH 2022
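One way to obtain such values is to space the logarithms evenly and exponentiate. The sketch below is an assumption about how this can be done with NumPy, not the package source; bounded_logspace_sketch is a hypothetical name.

```python
import numpy as np

def bounded_logspace_sketch(start, stop, num):
    """Return num values from start to stop (inclusive), evenly spaced in log."""
    # Evenly spaced exponents, both endpoints included
    return np.exp(np.linspace(np.log(start), np.log(stop), num, endpoint=True))

freqs = bounded_logspace_sketch(0.5, 60, 50)
```

Unlike np.logspace, the arguments here are the actual bounds rather than exponents; np.geomspace gives the same spacing.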
- face_rhythm.helpers.gaussian(x=None, mu=0, sig=1, plot_pref=False)[source]
Evaluates a normalized 1D Gaussian function on a grid. RH 2021
- Parameters:
x (Optional[np.ndarray]) – 1D array of x positions. If None, a default range covering five sigma on each side is used. (Default is None)
mu (float) – Mean of the Gaussian. (Default is 0)
sig (float) – Standard deviation of the Gaussian. (Default is 1)
plot_pref (bool) – If True, plots the Gaussian using matplotlib. (Default is False)
- Returns:
- gaus (np.ndarray):
Gaussian evaluated at each value of x.
- Return type:
(np.ndarray)
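For reference, a normalized Gaussian (unit area) can be evaluated as in the sketch below. It assumes "normalized" means the standard probability-density scaling 1 / (sig * sqrt(2 * pi)); the name gaussian_sketch is hypothetical and plotting is omitted.

```python
import numpy as np

def gaussian_sketch(x, mu=0.0, sig=1.0):
    """Evaluate a unit-area Gaussian with mean mu and std sig at x."""
    return np.exp(-0.5 * ((x - mu) / sig) ** 2) / (sig * np.sqrt(2 * np.pi))

# A grid spanning five sigma on each side of the mean
x_grid = np.linspace(-5, 5, 1001)
gaus = gaussian_sketch(x_grid, mu=0.0, sig=1.0)
```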
- face_rhythm.helpers.torch_hilbert(x, N=None, dim=0)[source]
Computes the analytic signal of x via a Hilbert transform. RH 2022
Mirrors scipy.signal.hilbert but operates on torch.Tensor inputs.
- Parameters:
x (torch.Tensor) – Real-valued signal of arbitrary rank.
N (Optional[int]) – Number of Fourier components. If None, uses x.shape[dim]. (Default is None)
dim (int) – Dimension along which to transform. (Default is 0)
- Returns:
- xa (torch.Tensor):
Complex analytic signal with the same shape as x.
- Return type:
- face_rhythm.helpers.make_VQT_filters(Fs_sample=1000, Q_lowF=3, Q_highF=20, F_min=10, F_max=400, n_freq_bins=55, win_size=501, symmetry='center', taper_asymmetric=True, plot_pref=False)[source]
Builds a bank of complex sinusoid filters for the VQT algorithm. RH 2022
Setting Q_lowF == Q_highF produces a Constant-Q Transform (CQT) filter set. Differing values vary the Q factor logarithmically across the frequency range.
- Parameters:
Fs_sample (float) – Sampling frequency of the signal. (Default is 1000)
Q_lowF (float) – Q factor for the lowest frequency. (Default is 3)
Q_highF (float) – Q factor for the highest frequency. (Default is 20)
F_min (float) – Lowest frequency. (Default is 10)
F_max (float) – Highest frequency (inclusive). (Default is 400)
n_freq_bins (int) – Number of frequency bins. (Default is 55)
win_size (int) – Window size in samples. Must be odd. (Default is 501)
symmetry (str) – Window symmetry. One of
'center': symmetric / two-sided window.
'left': one-sided window, only the left half is nonzero.
'right': one-sided window, only the right half is nonzero.
(Default is 'center')
taper_asymmetric (bool) – If True and symmetry != 'center', the center sample of the window is multiplied by 0.5 to taper the discontinuity. (Default is True)
plot_pref (bool) – If True, plots the filters and windows. (Default is False)
- Returns:
- tuple containing:
- filts_complex (torch.Tensor):
Complex sinusoid filters. shape: (n_freq_bins, win_size).
- freqs (np.ndarray):
Filter center frequencies. shape: (n_freq_bins,).
- wins (torch.Tensor):
Gaussian window for each filter. shape: (n_freq_bins, win_size).
- Return type:
(tuple)
- class face_rhythm.helpers.VQT(Fs_sample=1000, Q_lowF=3, Q_highF=20, F_min=10, F_max=400, n_freq_bins=55, win_size=501, symmetry='center', taper_asymmetric=True, downsample_factor=4, padding='valid', DEVICE_compute='cpu', DEVICE_return='cpu', batch_size=1000, return_complex=False, filters=None, plot_pref=False, progressBar=True)[source]
Bases: object
Variable Q Transform implemented with PyTorch. RH 2022
Differs from librosa / nnAudio: this implementation does not iterate lowpass filtering. Instead it convolves a fixed set of complex filters, optionally returns the envelope via Hilbert transform, and downsamples. Gradients propagate through the transform, and computation can run on GPU.
Q is the quality factor, roughly the number of cycles inside four sigma (95%) of a Gaussian window.
- Parameters:
Fs_sample (float) – Sampling frequency of the signal. (Default is 1000)
Q_lowF (float) – Q factor for the lowest frequency. (Default is 3)
Q_highF (float) – Q factor for the highest frequency. (Default is 20)
F_min (float) – Lowest frequency. (Default is 10)
F_max (float) – Highest frequency. (Default is 400)
n_freq_bins (int) – Number of frequency bins. (Default is 55)
win_size (int) – Window size in samples. Must be odd. (Default is 501)
symmetry (str) – Window symmetry passed through to make_VQT_filters. One of
'center'
'left'
'right'
(Default is 'center')
taper_asymmetric (bool) – If True and symmetry != 'center', the center sample of the window is multiplied by 0.5. (Default is True)
downsample_factor (int) – Time-downsampling factor. The input is zero-padded to be a multiple of this value. (Default is 4)
padding (str) – Convolution padding. 'same' pads to keep output length equal to input length; 'valid' does not pad. (Default is 'valid')
DEVICE_compute (str) – Device used for computation. (Default is 'cpu')
DEVICE_return (str) – Device on which results are returned. (Default is 'cpu')
batch_size (int) – Number of signals processed per batch. Reduce when out of memory. (Default is 1000)
return_complex (bool) – If True, returns the complex-valued transform; otherwise returns its absolute value (envelope). downsample_factor must be 1 when True. (Default is False)
filters (Optional[torch.Tensor]) – Pre-built complex sinusoid filters. shape: (n_freq_bins, win_size). If None, make_VQT_filters is called. (Default is None)
plot_pref (bool) – If True, plots the filters. (Default is False)
progressBar (bool) – If True, displays a tqdm progress bar during __call__. (Default is True)
- filters
Complex sinusoid filters used for convolution.
- Type:
- freqs
Filter center frequencies (only when filters were generated internally).
- Type:
np.ndarray
- wins
Gaussian windows for each filter (only when filters were generated internally).
- Type:
- face_rhythm.helpers.generate_multiphasic_sinewave(n_samples: int = 10000, n_periods: float = 1.0, n_waves: int = 3, return_x: bool = False, return_phases: bool = False)[source]
Generates n_waves cosine waves with evenly spaced phase offsets. RH 2024
- Parameters:
n_samples (int) – Number of samples per wave. (Default is 10000)
n_periods (float) – Number of full periods spanned by n_samples. (Default is 1.0)
n_waves (int) – Number of phase-shifted cosine waves to return. (Default is 3)
return_x (bool) – If True, also returns the x positions. (Default is False)
return_phases (bool) – If True, also returns the per-wave phase arrays. (Default is False)
- Returns:
- output (Union[np.ndarray, tuple]):
Combination depending on return_x / return_phases:
waves (np.ndarray): generated cosine waves.
x (np.ndarray): x positions, if return_x.
phases (np.ndarray): per-wave phases, if return_phases.
- Return type:
(Union[np.ndarray, tuple])
- face_rhythm.helpers.set_device(use_GPU: bool = True, device_num: int = 0, verbose: bool = True) str[source]
Sets the device for PyTorch. If a GPU is available and use_GPU is True, it will be set as the device. Otherwise, the CPU will be set as the device. RH 2022
- Parameters:
use_GPU (bool) – Determines if the GPU should be utilized:
True: the function will attempt to use the GPU if one is available.
False: the function will use the CPU.
(Default is True)
device_num (int) – Specifies the index of the GPU to use. (Default is 0)
verbose (bool) – Determines whether to print the device information.
True: the function will print out the device information.
(Default is True)
- Returns:
- device (str):
A string specifying the device, either “cpu” or “cuda:<device_num>”.
- Return type:
(str)
- face_rhythm.helpers.tensorly_cp_to_device(cp, device='cpu')[source]
Moves the factors and weights of a tensorly CP object to device. RH 2024
- face_rhythm.helpers.simple_cmap(colors=[[1, 0, 0], [1, 0.6, 0], [0.9, 0.9, 0], [0.6, 1, 0], [0, 1, 0], [0, 1, 0.6], [0, 0.8, 0.8], [0, 0.6, 1], [0, 0, 1], [0.6, 0, 1], [0.8, 0, 0.8], [1, 0, 0.6]], under=[0, 0, 0], over=[0.5, 0.5, 0.5], bad=[0.9, 0.9, 0.9], name='none')[source]
Builds a LinearSegmentedColormap from a sequence of RGB values.
Adapted from https://gist.github.com/ahwillia/3e022cdd1fe82627cbf1f2e9e2ad80a7e.
- Parameters:
colors (list) – Sequence of RGB triples (or matplotlib color strings) defining the colormap stops.
under (list) – RGB color used for values below the colormap range. (Default is [0, 0, 0])
over (list) – RGB color used for values above the colormap range. (Default is [0.5, 0.5, 0.5])
bad (list) – RGB color used for masked / NaN values. (Default is [0.9, 0.9, 0.9])
name (str) – Colormap name. (Default is 'none')
- Returns:
- cmap (matplotlib.colors.LinearSegmentedColormap):
Resulting linear-segmented colormap.
- Return type:
Example
cmap = simple_cmap([(1, 1, 1), (1, 0, 0)])  # white to red
cmap = simple_cmap(['w', 'r'])  # white to red
cmap = simple_cmap(['r', 'b', 'r'])  # red to blue to red
- class face_rhythm.helpers.Cmap_conjunctive(cmaps, dtype_out=<class 'int'>, normalize=False, normalization_range=[0, 255], name='cmap_conjunctive')[source]
Bases: object
Combines multiple colormaps by multiplying their per-channel outputs. RH 2022
- Parameters:
cmaps (list) – List of matplotlib.colors.LinearSegmentedColormap objects to combine.
dtype_out (np.dtype) – Data type of the returned color array. (Default is int)
normalize (bool) – If True, normalizes each input column to [0, 1] before applying the colormaps. (Default is False)
normalization_range (list) – [lo, hi] to which the output is rescaled. (Default is [0, 255])
name (str) – Name of the resulting colormap. (Default is 'cmap_conjunctive')
- fn_conj_cmap
Function that maps an input array of shape (n_samples, n_cmaps) to the elementwise product of each colormap’s output.
- Type:
Callable
- class face_rhythm.helpers.Colorwheel(rotation: float = 0.0, saturation: float = 1.0, center: int = 0, radius: int = 255, dtype: numpy.dtype = numpy.uint8, bit_depth: int = 16, exponent: float = 10, normalize: bool = True, colors: List[List | Tuple] = [[1, 0, 0], [1, 0.5, 0], [1, 1, 0], [0.5, 1, 0], [0, 1, 0], [0, 1, 0.5], [0, 1, 1], [0, 0.5, 1], [0, 0, 1], [0.5, 0, 1], [1, 0, 1], [1, 0, 0.5]])[source]
Bases: object
2D colorwheel colormap (angle + magnitude) for cyclic data. RH 2024
Useful for visualizing complex/polar values, optical flow, and other cyclic data.
- Parameters:
rotation (float) – Rotation of the colorwheel in radians. (Default is 0.0)
saturation (float) – Color saturation in [0, 1]. (Default is 1.0)
center (int) – Color value at the center of the wheel. (Default is 0)
radius (int) – Maximum color value at the rim of the wheel. (Default is 255)
dtype (np.dtype) – Output dtype of the color array. (Default is np.uint8)
bit_depth (int) – Number of samples used to discretize the wheel: 2 ** bit_depth. (Default is 16)
exponent (float) – Exponent applied to each base color wave to sharpen transitions. (Default is 10)
normalize (bool) – If True, normalizes the per-angle color sum to 1 so that color intensity is uniform around the wheel. (Default is True)
colors (List[Union[List, Tuple]]) – Sequence of base RGB triples used to build the rainbow. (Default is a 12-color rainbow)
- fn_interp
Interpolator that maps an angle (radians) to per-base-color weights along the wheel.
- Type:
Callable
- colors
Array of base colors. shape: (n_colors, 3).
- Type:
np.ndarray
- face_rhythm.helpers.clahe(im, grid_size=50, clipLimit=0, normalize=True)[source]
Applies Contrast Limited Adaptive Histogram Equalization to an image. RH 2022
- Parameters:
im (np.ndarray) – Input image.
grid_size (int) – Tile grid size passed to cv2.createCLAHE. (Default is 50)
clipLimit (int) – Contrast clip limit passed to cv2.createCLAHE. (Default is 0)
normalize (bool) – If True, normalizes the input to span the full 16-bit range before applying CLAHE. (Default is True)
- Returns:
- im_c (np.ndarray):
CLAHE-enhanced image. dtype: uint16.
- Return type:
(np.ndarray)
- face_rhythm.helpers.add_text_to_images(images, text, position=(10, 10), font_size=1, color=(255, 255, 255), line_width=1, font=None, show=False, frameRate=30)[source]
Overlays multi-line text onto each frame using cv2.putText. RH 2022
- Parameters:
images (np.ndarray) – Frames of video or images. shape: (n_frames, H, W, C).
text (List[List[str]]) – Text per frame. Outer list has one element per frame; each inner list holds the lines of text drawn on that frame.
position (Tuple[int, int]) – (x, y) position of the top-left corner of the text. (Default is (10, 10))
font_size (int) – Font scale passed to cv2.putText. (Default is 1)
color (Tuple[int, int, int]) – (R, G, B) text color. (Default is (255, 255, 255))
line_width (int) – Line thickness passed to cv2.putText. (Default is 1)
font (Optional[int]) – OpenCV font constant. If None, uses cv2.FONT_HERSHEY_SIMPLEX. (Default is None)
show (bool) – If True, displays each annotated frame using cv2.imshow. (Default is False)
frameRate (float) – Display frame rate when show is True. (Default is 30)
- Returns:
- images_with_text (np.ndarray):
Frames of video or images with text overlays applied.
- Return type:
(np.ndarray)
- face_rhythm.helpers.mask_image_border(im: numpy.ndarray, border_outer: int | Tuple[int, int, int, int] | None = None, border_inner: int | None = None, mask_value: float = 0) numpy.ndarray[source]
Masks an image within specified outer and inner borders. RH 2022
- Parameters:
im (np.ndarray) – Input image of shape: (height, width) or (height, width, channels).
border_outer (Union[int, tuple[int, int, int, int], None]) – Number of pixels along the border to mask. If None, the border is not masked. If an int is provided, all borders are equally masked. If a tuple of ints is provided, borders are masked in the order: (top, bottom, left, right). (Default is None)
border_inner (int, Optional) – Number of pixels in the center to mask. Will be a square with side length equal to this value. (Default is None)
mask_value (float) – Value to replace the masked pixels with. (Default is 0)
- Returns:
- im_out (np.ndarray):
Masked output image.
- Return type:
(np.ndarray)
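The masking rules above can be sketched with NumPy slicing. This is an illustrative version, not the package source: mask_image_border_sketch is a hypothetical name, and the exact placement of the inner square (centered on the pixel at height // 2, width // 2) is an assumption.

```python
import numpy as np

def mask_image_border_sketch(im, border_outer=None, border_inner=None, mask_value=0):
    """Mask the outer border and/or a central square of an image copy."""
    im_out = im.copy()
    h, w = im.shape[:2]
    if border_outer is not None:
        if isinstance(border_outer, int):
            border_outer = (border_outer,) * 4  # same width on all four sides
        top, bottom, left, right = border_outer
        im_out[:top] = mask_value
        im_out[h - bottom:] = mask_value
        im_out[:, :left] = mask_value
        im_out[:, w - right:] = mask_value
    if border_inner is not None:
        # Assumed centering: a border_inner x border_inner square around the middle
        y0 = h // 2 - border_inner // 2
        x0 = w // 2 - border_inner // 2
        im_out[y0:y0 + border_inner, x0:x0 + border_inner] = mask_value
    return im_out
```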
- face_rhythm.helpers.find_geometric_transformation(im_template: numpy.ndarray, im_moving: numpy.ndarray, warp_mode: str = 'euclidean', n_iter: int = 5000, termination_eps: float = 1e-10, mask: numpy.ndarray | None = None, gaussFiltSize: int = 1) numpy.ndarray[source]
Estimates the geometric transformation between two images via ECC. RH 2022
Wraps cv2.findTransformECC.
- Parameters:
im_template (np.ndarray) – Template image. dtype: uint8 or float32.
im_moving (np.ndarray) – Moving image. dtype: uint8 or float32.
warp_mode (str) – Motion model. One of
'translation': 2x3 warpMatrix; only translation is estimated.
'euclidean': 2x3 warpMatrix; rigid (rotation + translation).
'affine': 2x3 warpMatrix; six parameters.
'homography': 3x3 warpMatrix; eight parameters.
(Default is 'euclidean')
n_iter (int) – Maximum number of iterations. (Default is 5000)
termination_eps (float) – Threshold on the per-iteration increment of the correlation coefficient. (Default is 1e-10)
mask (Optional[np.ndarray]) – Binary mask. Pixels where mask is zero are ignored. If None, no mask is used. (Default is None)
gaussFiltSize (int) – Gaussian filter size. 0 disables filtering. (Default is 1)
- Returns:
- warp_matrix (np.ndarray):
Estimated warp matrix. Apply with cv2.warpAffine or cv2.warpPerspective.
- Return type:
(np.ndarray)
- face_rhythm.helpers.apply_warp_transform(im_in: numpy.ndarray, warp_matrix: numpy.ndarray, interpolation_method: int = cv2.INTER_LINEAR, borderMode: int = cv2.BORDER_CONSTANT, borderValue: int = 0) numpy.ndarray[source]
Applies a warp transform to an image. RH 2022
Wraps cv2.warpAffine (for 2x3 matrices) and cv2.warpPerspective (for 3x3 matrices).
- Parameters:
im_in (np.ndarray) – Input image with any dimensions.
warp_matrix (np.ndarray) – Warp matrix. Shape should be (2, 3) for affine transformations, and (3, 3) for homography. See cv2.findTransformECC for more info.
interpolation_method (int) – Interpolation method. See cv2.warpAffine for more info. (Default is cv2.INTER_LINEAR)
borderMode (int) – Border mode. Determines how to handle pixels from outside the image boundaries. See cv2.warpAffine for more info. (Default is cv2.BORDER_CONSTANT)
borderValue (int) – Value to use for border pixels if borderMode is set to cv2.BORDER_CONSTANT. (Default is 0)
- Returns:
- im_out (np.ndarray):
Transformed output image with the same dimensions as the input image.
- Return type:
(np.ndarray)
- face_rhythm.helpers.warp_matrix_to_remappingIdx(warp_matrix: numpy.ndarray | torch.Tensor, x: int, y: int) numpy.ndarray | torch.Tensor[source]
Converts a warp matrix (2x3 or 3x3) into a 2D remapping index field. RH 2023
- Parameters:
warp_matrix (Union[np.ndarray, torch.Tensor]) – Warp matrix of shape (2, 3) for affine transformations, or (3, 3) for homography.
x (int) – Width of the output remapping field.
y (int) – Height of the output remapping field.
- Returns:
- remapIdx (Union[np.ndarray, torch.Tensor]):
Remapping indices. shape: (y, x, 2). The last axis stores the pixel coordinate (x, y) to sample from.
- Return type:
(Union[np.ndarray, torch.Tensor])
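A remapping field of this shape can be built by applying the warp matrix to a grid of homogeneous pixel coordinates. The NumPy sketch below is illustrative only: warp_matrix_to_remapping_sketch is a hypothetical name, torch inputs are not handled, and the matrix is applied directly to the output grid coordinates, which is an assumption about the direction convention.

```python
import numpy as np

def warp_matrix_to_remapping_sketch(warp_matrix, x, y):
    """Map every (x, y) pixel coordinate through a 2x3 or 3x3 warp matrix."""
    warp_matrix = np.asarray(warp_matrix, dtype=float)
    xs, ys = np.meshgrid(np.arange(x), np.arange(y))  # each has shape (y, x)
    # Homogeneous coordinates (x, y, 1) for every pixel: shape (y, x, 3)
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=-1).astype(float)
    mapped = coords @ warp_matrix.T
    if warp_matrix.shape == (3, 3):
        # Homography: divide by the projective coordinate
        mapped = mapped[..., :2] / mapped[..., 2:3]
    return mapped
```

With an identity matrix, entry [row, col] of the result is simply (col, row), i.e. every pixel samples from itself.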
- face_rhythm.helpers.remap_images(images: numpy.ndarray | torch.Tensor, remappingIdx: numpy.ndarray | torch.Tensor, backend: str = 'torch', interpolation_method: str = 'linear', border_mode: str = 'constant', border_value: float = 0, device: str = 'cpu') numpy.ndarray | torch.Tensor[source]
Applies remapping indices to a set of images. Remapping indices, similar to flow fields, describe the index of the pixel to sample from rather than the displacement of each pixel. RH 2023
- Parameters:
images (Union[np.ndarray, torch.Tensor]) – The images to be warped. Shapes can be (N, C, H, W), (C, H, W), or (H, W).
remappingIdx (Union[np.ndarray, torch.Tensor]) – The remapping indices, describing the index of the pixel to sample from. Shape is (H, W, 2).
backend (str) – The backend to use. Can be either 'torch' or 'cv2'. (Default is 'torch')
interpolation_method (str) – The interpolation method to use. Options are 'linear', 'nearest', 'cubic', and 'lanczos'. Refer to cv2.remap or torch.nn.functional.grid_sample for more details. (Default is 'linear')
border_mode (str) – The border mode to use. Options include 'constant', 'reflect', 'replicate', and 'wrap'. Refer to cv2.remap for more details. (Default is 'constant')
border_value (float) – The border value to use. Refer to cv2.remap for more details. (Default is 0)
device (str) – The device to use for computations. Commonly either 'cpu' or 'cuda'. (Default is 'cpu')
- Returns:
- warped_images (Union[np.ndarray, torch.Tensor]):
The warped images. The shape will be the same as the input images, which can be (N, C, H, W), (C, H, W), or (H, W).
- Return type:
(Union[np.ndarray, torch.Tensor])
- face_rhythm.helpers.invert_remappingIdx(remappingIdx: numpy.ndarray, method: str = 'linear', fill_value: float | None = numpy.nan) numpy.ndarray[source]
Inverts a remapping index field.
Assumes that the remapping index field is invertible (bijective / one-to-one) and non-occluding. If ‘remap_AB’ is a remapping index field that warps image A onto image B, then ‘remap_BA’ is the remapping index field that warps image B onto image A. This function computes ‘remap_BA’ given ‘remap_AB’.
RH 2023
- Parameters:
remappingIdx (np.ndarray) – An array of shape (H, W, 2) representing the remap field.
method (str) – Interpolation method to use. See scipy.interpolate.griddata. Options are 'linear', 'nearest', and 'cubic'. (Default is 'linear')
fill_value (Optional[float]) – Value used to fill points outside the convex hull. (Default is np.nan)
- Returns:
An array of shape (H, W, 2) representing the inverse remap field.
- Return type:
(np.ndarray)
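One way to picture the inversion is as scattered-data interpolation: at each location remap_AB[y, x], the inverse field should take the value (x, y). The sketch below does this with scipy.interpolate.griddata. It is an illustration under that assumption, and invert_remap_sketch is a hypothetical name, not the library's implementation.

```python
import numpy as np
from scipy.interpolate import griddata

def invert_remap_sketch(remap_AB, method="linear", fill_value=np.nan):
    """Hypothetical helper: invert an (H, W, 2) remapping index field."""
    h, w = remap_AB.shape[:2]
    xx, yy = np.meshgrid(np.arange(w), np.arange(h))
    grid = np.stack([xx, yy], axis=-1).astype(float).reshape(-1, 2)
    # Scattered samples: at position remap_AB[y, x] the inverse field equals (x, y).
    remap_BA = griddata(
        points=remap_AB.reshape(-1, 2),
        values=grid,
        xi=grid,
        method=method,
        fill_value=fill_value,
    )
    return remap_BA.reshape(h, w, 2)

h, w = 6, 8
xx, yy = np.meshgrid(np.arange(w), np.arange(h))
grid = np.stack([xx, yy], axis=-1).astype(float)

remap_AB = grid + [1.0, 0.0]            # pure translation by +1 in x
remap_BA = invert_remap_sketch(remap_AB)
```

For this translation the inverse is grid - (1, 0) wherever it is defined. The x = 0 column falls outside the convex hull of the scattered samples, so it is filled with fill_value.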
- face_rhythm.helpers.invert_warp_matrix(warp_matrix: numpy.ndarray) numpy.ndarray[source]
Inverts a provided warp matrix for the transformation A->B to compute the warp matrix for B->A. RH 2023
- Parameters:
warp_matrix (np.ndarray) – A 2x3 or 3x3 array representing the warp matrix. Shape: (2, 3) or (3, 3).
- Returns:
- inverted_warp_matrix (np.ndarray):
The inverted warp matrix. Shape: same as input.
- Return type:
(np.ndarray)
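A common way to invert an affine warp is to pad the 2x3 matrix to 3x3, invert it, and crop it back. The sketch below assumes that approach; invert_warp_sketch is a hypothetical name and the library may handle edge cases differently.

```python
import numpy as np

def invert_warp_sketch(warp_matrix):
    """Hypothetical helper: invert a (2, 3) or (3, 3) warp matrix, preserving its shape."""
    warp_matrix = np.asarray(warp_matrix, dtype=float)
    if warp_matrix.shape == (2, 3):
        # Append the homogeneous row so the matrix becomes square and invertible.
        full = np.vstack([warp_matrix, [0.0, 0.0, 1.0]])
        return np.linalg.inv(full)[:2, :]
    if warp_matrix.shape == (3, 3):
        return np.linalg.inv(warp_matrix)
    raise ValueError("warp_matrix must have shape (2, 3) or (3, 3)")

warp_AB = np.array([[1.0, 0.0, 5.0],
                    [0.0, 2.0, -3.0]])
warp_BA = invert_warp_sketch(warp_AB)

point_A = np.array([2.0, 4.0, 1.0])                # homogeneous (x, y, 1)
point_B = warp_AB @ point_A                        # forward transform
round_trip = warp_BA @ np.append(point_B, 1.0)     # back to the original (x, y)
```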
- face_rhythm.helpers.compose_remappingIdx(remap_AB: numpy.ndarray, remap_BC: numpy.ndarray, method: str = 'linear', fill_value: float | None = numpy.nan, bounds_error: bool = False) numpy.ndarray[source]
Composes two remapping index fields using scipy.interpolate.interpn.
This function computes ‘remap_AC’ from ‘remap_AB’ and ‘remap_BC’, where ‘remap_AB’ is a remapping index field that warps image A onto image B, and ‘remap_BC’ is a remapping index field that warps image B onto image C.
RH 2023
- Parameters:
remap_AB (np.ndarray) – An array of shape (H, W, 2) representing the remap field from image A to image B.
remap_BC (np.ndarray) – An array of shape (H, W, 2) representing the remap field from image B to image C.
method (str) – Interpolation method to use. One of:
'linear': Use linear interpolation (default).
'nearest': Use nearest interpolation.
'cubic': Use cubic interpolation.
fill_value (Optional[float]) – The value used for points outside the interpolation domain. (Default is np.nan)
bounds_error (bool) – If True, a ValueError is raised when interpolated values are requested outside of the domain of the input data. (Default is False)
- Returns:
- remap_AC (np.ndarray):
An array of shape (H, W, 2) representing the remap field from image A to image C.
- Return type:
(np.ndarray)
- face_rhythm.helpers.compose_transform_matrices(matrix_AB: numpy.ndarray, matrix_BC: numpy.ndarray) numpy.ndarray[source]
Composes two transformation matrices to create a transformation from one image to another. RH 2023
This function is used to combine two transformation matrices, ‘matrix_AB’ and ‘matrix_BC’. ‘matrix_AB’ represents a transformation that warps an image A onto an image B. ‘matrix_BC’ represents a transformation that warps image B onto image C. The result is ‘matrix_AC’, a transformation matrix that would warp image A directly onto image C.
- Parameters:
matrix_AB (np.ndarray) – A transformation matrix from image A to image B. The array can have the shape (2, 3) or (3, 3).
matrix_BC (np.ndarray) – A transformation matrix from image B to image C. The array can have the shape (2, 3) or (3, 3).
- Returns:
- matrix_AC (np.ndarray):
A composed transformation matrix from image A to image C. The array has the shape (2, 3) or (3, 3).
- Return type:
(np.ndarray)
- Raises:
AssertionError – If the input matrices do not have the shape (2, 3) or (3, 3).
Example
import numpy as np
from face_rhythm.helpers import compose_transform_matrices

# Define the transformation matrices
matrix_AB = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
matrix_BC = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])

# Compose the transformation matrices
matrix_AC = compose_transform_matrices(matrix_AB, matrix_BC)
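Identity matrices do not show the order of composition. The following sketch uses a shift followed by a scale, under the assumption that points are transformed as column vectors, so that A->C is matrix_BC @ matrix_AB. The helpers to_3x3 and compose_sketch are hypothetical, and the library's multiplication order is not confirmed by this page.

```python
import numpy as np

def to_3x3(m):
    """Hypothetical helper: promote a (2, 3) matrix to homogeneous (3, 3) form."""
    m = np.asarray(m, dtype=float)
    assert m.shape in [(2, 3), (3, 3)], "matrix must have shape (2, 3) or (3, 3)"
    return np.vstack([m, [0.0, 0.0, 1.0]]) if m.shape == (2, 3) else m

def compose_sketch(matrix_AB, matrix_BC):
    # Column-vector convention (an assumption): apply AB first, then BC.
    return to_3x3(matrix_BC) @ to_3x3(matrix_AB)

shift_AB = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]])   # x += 2
scale_BC = np.array([[3.0, 0.0, 0.0], [0.0, 3.0, 0.0]])   # x *= 3, y *= 3
matrix_AC = compose_sketch(shift_AB, scale_BC)

# The point (1, 1) is shifted to (3, 1) and then scaled to (9, 3).
point_C = matrix_AC @ np.array([1.0, 1.0, 1.0])
```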
- face_rhythm.helpers.flowField_to_remappingIdx(ff: numpy.ndarray | object) numpy.ndarray | object[source]
Converts a flow field into a remapping index by adding the pixel grid. RH 2023
WARNING: Strictly speaking, a flow field (displacement) and a remapping index (interpolation mapping) are different concepts; this helper performs the obvious sum and is correct under the standard convention.
- Parameters:
ff (Union[np.ndarray, torch.Tensor]) – Flow field describing the displacement of each pixel. shape: (H, W, 2). Last dimension is (x, y).
- Returns:
- ri (Union[np.ndarray, torch.Tensor]):
Remapping index of source pixel coordinates. shape: (H, W, 2).
- Return type:
(Union[np.ndarray, torch.Tensor])
- face_rhythm.helpers.remappingIdx_to_flowField(ri: numpy.ndarray | object) numpy.ndarray | object[source]
Converts a remapping index into a flow field by subtracting the pixel grid. RH 2023
WARNING: Strictly speaking, a remapping index (interpolation mapping) and a flow field (displacement) are different concepts; this helper performs the obvious subtraction.
- Parameters:
ri (Union[np.ndarray, torch.Tensor]) – Remapping index. shape: (H, W, 2). Last dimension is (x, y).
- Returns:
- ff (Union[np.ndarray, torch.Tensor]):
Flow field. shape: (H, W, 2).
- Return type:
(Union[np.ndarray, torch.Tensor])
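The two conversions above amount to adding or subtracting the pixel grid. A minimal NumPy sketch, with hypothetical helper names (pixel_grid, flow_to_remap, remap_to_flow) and the (x, y) ordering of the last dimension taken from the parameter descriptions:

```python
import numpy as np

def pixel_grid(h, w):
    """Hypothetical helper: (H, W, 2) array holding each pixel's own (x, y) index."""
    xx, yy = np.meshgrid(np.arange(w, dtype=float), np.arange(h, dtype=float))
    return np.stack([xx, yy], axis=-1)

def flow_to_remap(ff):
    # Remapping index = displacement + the pixel's own coordinate.
    return ff + pixel_grid(*ff.shape[:2])

def remap_to_flow(ri):
    # Flow field = remapping index - the pixel's own coordinate.
    return ri - pixel_grid(*ri.shape[:2])

ff = np.zeros((3, 4, 2))
ff[..., 0] = 1.5                 # uniform displacement of 1.5 px in x
ri = flow_to_remap(ff)
ff_back = remap_to_flow(ri)      # round trip recovers the flow field
```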
- face_rhythm.helpers.cv2RemappingIdx_to_pytorchFlowField(ri: numpy.ndarray | torch.Tensor) numpy.ndarray | torch.Tensor[source]
Converts remapping indices from the OpenCV format to the PyTorch format. In the OpenCV format, coordinates are given in pixels relative to the top-left pixel of the image. In the PyTorch format (the grid expected by torch.nn.functional.grid_sample), coordinates are normalized to the range [-1, 1] relative to the center of the image. RH 2023
- Parameters:
ri (Union[np.ndarray, torch.Tensor]) – Remapping indices. Each pixel describes the index of the pixel in the original image that should be mapped to the new pixel. Shape: (H, W, 2). The last dimension is (x, y).
- Returns:
- normgrid (Union[np.ndarray, torch.Tensor]):
”Flow field”, in the PyTorch format. Technically not a flow field, since it doesn’t describe displacement. Rather, it is a remapping index relative to the center of the image. Shape: (H, W, 2). The last dimension is (x, y).
- Return type:
(Union[np.ndarray, torch.Tensor])
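As a rough illustration of the conversion, the sketch below maps pixel indices to a grid normalized to [-1, 1], assuming the align_corners=True convention of torch.nn.functional.grid_sample (corner pixels map exactly to -1 and 1). The name remap_to_normgrid is hypothetical, and the convention used by the library is an assumption here.

```python
import numpy as np

def remap_to_normgrid(ri):
    """Hypothetical helper: pixel-index remap (H, W, 2) -> grid normalized to [-1, 1]."""
    h, w = ri.shape[:2]
    # Half-extent of the image in x and y, measured between corner pixel centers.
    half = np.array([(w - 1) / 2.0, (h - 1) / 2.0])
    return (ri - half) / half

xx, yy = np.meshgrid(np.arange(5.0), np.arange(3.0))
ri = np.stack([xx, yy], axis=-1)     # identity remap for a 3 x 5 image
normgrid = remap_to_normgrid(ri)
```

With an identity remap, the top-left pixel lands at (-1, -1), the bottom-right at (1, 1), and the central pixel at (0, 0).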
- face_rhythm.helpers.remap_points(points: numpy.ndarray, remappingIdx: numpy.ndarray, interpolation: str = 'linear', fill_value: float = None) numpy.ndarray[source]
Remaps a set of 2D points through an index map produced for image warping.
- Parameters:
points (np.ndarray) – Array of points to be remapped. shape: (n_points, 2), dtype: floating. Each row is an (x, y) coordinate within the image.
remappingIdx (np.ndarray) – Index map describing the warp. shape: (height, width, 2), dtype: floating.
interpolation (str) – Interpolation method passed to scipy.interpolate.RegularGridInterpolator. One of 'linear', 'nearest', 'slinear', 'cubic', 'quintic', 'pchip'. (Default is 'linear')
fill_value (Optional[float]) – Value used to fill points outside the convex hull. If None, values outside the convex hull are extrapolated. (Default is None)
- Returns:
- points_remap (np.ndarray):
Remapped points. shape: (n_points, 2).
- Return type:
(np.ndarray)
- class face_rhythm.helpers.NVIDIA_Device_Checker(device_index=None, verbose=1)[source]
Bases: _Device_Checker_Base
Resource utilization checker for an NVIDIA GPU.
Requires the nvidia-ml-py3 package.
- Parameters:
- get_device_handles()[source]
Returns one nvmlDeviceGetHandleByIndex handle per GPU detected by NVML.
- check_utilization()[source]
Returns a snapshot of the current GPU utilization metrics.
- Returns:
- info_changing (Dict[str, Any]):
Includes time, memory_free, memory_used, memory_used_percentage, power_used, power_used_percentage, processor_used_percentage, temperature, and fan_speed.
- Return type:
(Dict[str, Any])
- class face_rhythm.helpers.CPU_Device_Checker(verbose=1)[source]
Bases: _Device_Checker_Base
Resource utilization checker for the host CPU and disk.
- Parameters:
verbose (int) – Verbosity level passed to the base class. (Default is 1)
- check_utilization()[source]
Returns a snapshot of CPU, memory, network, and disk utilization.
- Returns:
- info_changing (Dict[str, Any]):
Per-snapshot metrics including memory, network I/O, disk free/used, disk read/write throughput, and overall + per-core CPU usage percentages.
- Return type:
(Dict[str, Any])
- class face_rhythm.helpers.Equivalence_checker(kwargs_allclose: dict | None = {'equal_nan': True, 'rtol': 1e-07}, assert_mode=False, verbose=False)[source]
Bases: object
Class for checking whether all items in two complex data structures are equivalent or allclose (almost equal). Can check nested lists, dicts, and other data structures. Can also optionally assert (raise errors) if any items are not equivalent. RH 2023
- Parameters:
kwargs_allclose (Optional[dict]) – Keyword arguments for the numpy.allclose function. (Default is {'rtol': 1e-7, 'equal_nan': True})
assert_mode (bool) – Whether to raise an assertion error if items are not close. (Default is False)
verbose (bool) – How much information to print out:
False / 0: No information printed out.
True / 1: Mismatched items only.
2: All items printed out.
(Default is False)
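The idea of comparing nested structures can be sketched as a short recursive function. This is only an illustration of the technique with the documented default tolerances. nested_allclose is a hypothetical name, and the real class returns richer per-item results.

```python
import numpy as np

def nested_allclose(a, b, rtol=1e-7, equal_nan=True):
    """Hypothetical helper: recursively compare dicts, lists/tuples, arrays, and scalars."""
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(
            nested_allclose(a[k], b[k], rtol, equal_nan) for k in a
        )
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        return len(a) == len(b) and all(
            nested_allclose(x, y, rtol, equal_nan) for x, y in zip(a, b)
        )
    numeric = (np.ndarray, float, int)
    if isinstance(a, numeric) and isinstance(b, numeric):
        a_arr, b_arr = np.asarray(a), np.asarray(b)
        # Shapes must match exactly; values only need to be close.
        return a_arr.shape == b_arr.shape and bool(
            np.allclose(a_arr, b_arr, rtol=rtol, equal_nan=equal_nan)
        )
    # Everything else (strings, None, ...) falls back to strict equality.
    return type(a) == type(b) and a == b

x = {"w": [np.array([1.0, np.nan]), 2], "name": "run"}
y = {"w": [np.array([1.0 + 1e-9, np.nan]), 2], "name": "run"}
z = {"w": [np.array([1.0, 0.5]), 2], "name": "run"}

close_xy = nested_allclose(x, y)   # tiny numeric difference, NaNs treated as equal
close_xz = nested_allclose(x, z)   # NaN versus 0.5 is a real mismatch
```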
- face_rhythm.helpers.order_cp_factors_by_EVR(tensor_dense: numpy.ndarray | torch.Tensor, cp_factors: list | object, cp_weights: numpy.ndarray | torch.Tensor | None = None, orthogonalizable_EVR: bool = True) Tuple[numpy.ndarray, numpy.ndarray][source]
Sorts CP factors by descending explained variance ratio. RH 2024
- Parameters:
tensor_dense (Union[np.ndarray, torch.Tensor]) – Dense tensor to be reconstructed.
cp_factors (Union[list, object]) – CP factors. Either a list of 2D factor matrices of shape (n_samples, rank) or a tensorly.CPTensor object.
cp_weights (Optional[Union[np.ndarray, torch.Tensor]]) – Per-rank weights of length (rank,). (Default is None)
orthogonalizable_EVR (bool) – If True, optimizes each factor’s scaling to maximize EVR by OLS-orthogonalizing the dense tensor against each factor. (Default is True)
- Returns:
- tuple containing:
- order (np.ndarray):
Indices that sort the factors by descending EVR.
- evrs (np.ndarray):
Sorted explained variance ratios.
- Return type:
(tuple)
- face_rhythm.helpers.cp_reconstruction_EVR(tensor_dense, tensor_CP)[source]
Explained variance ratio of a CP-reconstructed tensor. RH 2023
- Parameters:
tensor_dense (Union[np.ndarray, torch.Tensor]) – Dense reference tensor. shape: (n_samples, n_features).
tensor_CP (Union[list, object]) – CP tensor. Either a list of 2D factor matrices of shape (n_samples, rank) or a tensorly.CPTensor object.
- Returns:
- ev (Union[float, torch.Tensor]):
Explained variance ratio 1 - var(tensor_dense - tensor_rec) / var(tensor_dense).
- Return type:
(Union[float, torch.Tensor])
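The explained variance ratio formula above can be checked on a small two-mode example. The sketch assumes a list of two factor matrices and NumPy arrays; evr_sketch is a hypothetical name and does not handle tensorly.CPTensor objects or higher-order tensors.

```python
import numpy as np

def evr_sketch(tensor_dense, factors, weights=None):
    """Hypothetical helper: EVR of a two-mode CP reconstruction."""
    a, b = factors                                   # shapes (n_samples, rank), (n_features, rank)
    rank = a.shape[1]
    weights = np.ones(rank) if weights is None else np.asarray(weights, dtype=float)
    # Reconstruction is the weighted sum of rank-1 outer products.
    tensor_rec = (a * weights) @ b.T
    return 1.0 - np.var(tensor_dense - tensor_rec) / np.var(tensor_dense)

rng = np.random.default_rng(0)
a = rng.normal(size=(20, 3))
b = rng.normal(size=(15, 3))
dense = a @ b.T                                      # exactly rank 3

evr_exact = evr_sketch(dense, [a, b])                # all three components
evr_rank1 = evr_sketch(dense, [a[:, :1], b[:, :1]])  # first component only
```

A perfect reconstruction gives an EVR of 1 up to floating-point error, while a partial reconstruction gives a smaller value.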
- face_rhythm.helpers.rolling_mean(tensor: torch.Tensor, dim: int) torch.Tensor[source]
Computes the running mean along dim using Welford’s update. RH 2025
- Parameters:
tensor (torch.Tensor) – Input tensor.
dim (int) – Dimension along which the running mean is accumulated.
- Returns:
- mean (torch.Tensor):
Final mean across dim (last accumulated value).
- Return type:
(torch.Tensor)
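The mean part of Welford's update is the recurrence mean_n = mean_(n-1) + (x_n - mean_(n-1)) / n. The sketch below applies it with NumPy instead of torch to stay dependency-light; running_mean_welford is a hypothetical name.

```python
import numpy as np

def running_mean_welford(x, axis=0):
    """Hypothetical helper: accumulate the mean along `axis` one slice at a time."""
    x = np.moveaxis(np.asarray(x, dtype=float), axis, 0)
    mean = np.zeros(x.shape[1:])
    for n, slice_n in enumerate(x, start=1):
        # Incremental update: move the mean toward the new slice by 1/n of the gap.
        mean += (slice_n - mean) / n
    return mean

x = np.arange(12.0).reshape(3, 4)
mean_rows = running_mean_welford(x, axis=0)
mean_cols = running_mean_welford(x, axis=1)
```

The result matches the ordinary mean, but only one slice and the accumulator need to be held at a time.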
- face_rhythm.helpers.play_video_cv2(array=None, path_video=None, frameRate=30, path_save=None, show=True, fourcc_code='MJPG', text=None, kwargs_text={})[source]
Plays or saves a video using OpenCV. RH 2021/2024
- Parameters:
array (Optional[np.ndarray]) – 3D (frames, H, W) or 4D (frames, H, W, channels) uint8 array. Values are clipped to [0, 255]. If None, path_video must be supplied and decord is used to read it. (Default is None)
path_video (Optional[Union[str, pathlib.Path]]) – Path to a video file. Used only when array is None. (Default is None)
frameRate (float) – Playback / output frame rate in Hz. (Default is 30)
path_save (Optional[Union[str, pathlib.Path]]) – Destination path for the saved video. None disables saving. (Default is None)
show (bool) – If True, displays the video in a cv2 window. (Default is True)
fourcc_code (str) – FourCC codec string passed to cv2.VideoWriter_fourcc. (Default is 'MJPG')
text (Optional[Union[str, List[str]]]) – Text overlay. If a list, element i is drawn on frame i. (Default is None)
kwargs_text (dict) – Keyword arguments forwarded to cv2.putText. (Default is {})
- face_rhythm.helpers.make_tiled_video_array(videos: List[numpy.ndarray], shape: Tuple[int, int] | None = None, verbose: bool = True)[source]
Tiles a list of videos into a single grid video array. RH 2021/2024
Videos are placed top-to-bottom and then left-to-right.
- Parameters:
videos (List[np.ndarray]) – List of video arrays with shape (frames, H, W, channels) or (frames, H, W). All videos must share the same dtype.
shape (Optional[Tuple[int, int]]) – Grid layout (n_rows, n_cols). If None, uses the smallest square grid that fits all videos. (Default is None)
verbose (bool) – If True, prints progress messages. (Default is True)
- Returns:
- video_array (np.ndarray):
Tiled video. shape: (max_frames, total_H, total_W, channels).
- Return type:
(np.ndarray)
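The placement order (top-to-bottom, then left-to-right) can be illustrated with a small grayscale sketch. It assumes every tile is padded to the largest video size and that missing frames stay zero. tile_videos_sketch is a hypothetical name, and the library's padding and channel handling may differ.

```python
import math
import numpy as np

def tile_videos_sketch(videos, shape=None):
    """Hypothetical helper: tile (frames, H, W) videos into one grid video."""
    n = len(videos)
    if shape is None:
        side = math.ceil(math.sqrt(n))       # smallest square grid that fits all videos
        shape = (side, side)
    n_rows, n_cols = shape
    n_frames = max(v.shape[0] for v in videos)
    tile_h = max(v.shape[1] for v in videos)
    tile_w = max(v.shape[2] for v in videos)
    out = np.zeros((n_frames, n_rows * tile_h, n_cols * tile_w), dtype=videos[0].dtype)
    for i, v in enumerate(videos):
        # Fill a column top-to-bottom before moving one column to the right.
        row, col = i % n_rows, i // n_rows
        y0, x0 = row * tile_h, col * tile_w
        out[: v.shape[0], y0 : y0 + v.shape[1], x0 : x0 + v.shape[2]] = v
    return out

videos = [np.full((2, 4, 5), k, dtype=np.uint8) for k in (1, 2, 3)]
tiled = tile_videos_sketch(videos)           # 2 x 2 grid, last cell left empty
```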
face_rhythm.pipelines module
Example end-to-end pipelines for running the face_rhythm package.
Each pipeline accepts a params dictionary that must contain all fields
required by the steps it executes.
- face_rhythm.pipelines.pipeline_basic(params)[source]
Runs the basic face_rhythm pipeline, mirroring notebooks/interactive_pipeline_basic.ipynb. RH 2023
The ROIs must be defined ahead of time and saved as an ROIs.h5 file referenced by params['ROIs']['initialize']['path_file']. Steps executed (gated by params['steps']):
'load_videos': Load video data via BufferedVideoReader.
'ROIs': Load ROIs and seed point positions.
'point_tracking': Track points across frames.
'VQT': Compute variable-Q spectrograms.
'TCA': Perform tensor component analysis.
Each step also persists its outputs to the project’s analysis_files directory.
- Parameters:
params (dict) – Dictionary of parameters controlling every pipeline step. See scripts/params_pipeline_basic.json for a complete example. Top-level keys consumed include 'project', 'paths_videos', 'figure_saver', 'steps', 'BufferedVideoReader', 'Dataset_videos', 'ROIs', 'PointTracker', 'VQT_Analyzer', and 'TCA'.
- Returns:
- results (dict):
Dictionary with keys:
'path_config' (str): Path to the saved config file.
'path_run_info' (str): Path to the saved run_info file.
'directory_project' (str): Path to the project directory.
'SEED' (int): Random seed used for the run.
'params' (dict): The parameter dictionary that was used.
- Return type:
(dict)
face_rhythm.point_tracking module
Point tracking via Lucas-Kanade optical flow, with mesh-relaxation and outlier handling.
PointTracker advects a set of (x, y) seed points through a
BufferedVideoReader using either CPU or CUDA OpenCV LK optical flow. Mesh
distances to k-nearest neighbors are regularized toward their initial values,
and frames with any point displaced beyond a threshold halt and replay the
surrounding region to suppress outlier streaks.
- class face_rhythm.point_tracking.PointTracker(buffered_video_reader: BufferedVideoReader, point_positions: numpy.ndarray, rois_masks: ROIs = None, contiguous: bool = False, params_optical_flow: dict = {'kwargs_method': {'criteria': (3, 10, 0.03), 'maxLevel': 2, 'winSize': (15, 15)}, 'mesh_rigidity': 0.005, 'method': 'lucas_kanade', 'relaxation': 0.5}, params_clahe: dict = {'clipLimit': 40.0, 'tileGridSize': (150, 150)}, params_outlier_handling: dict = {'framesHalted_after': 30, 'framesHalted_before': 30, 'threshold_displacement': 25}, frames_freeze: numpy.ndarray | None = None, relaxation_during_freeze_frames: bool = True, idx_start: int | list | numpy.ndarray = 0, visualize_video: bool = False, params_visualization: dict = {'alpha': 1.0, 'point_sizes': 1}, verbose: bool | int = 1)[source]
Bases: FR_Module
Tracks a set of seed points across video frames with Lucas-Kanade optical flow, mesh-rigidity regularization, and outlier handling.
Wraps OpenCV LK (CPU or CUDA when available) and applies a k-nearest-neighbor mesh constraint plus a relaxation force toward the original point positions. Frames where any point is displaced beyond a threshold trigger a rewind-and-replay of the surrounding window so violating points are frozen. Optional frames_freeze masks proactively zero the optical flow delta on chosen frames.
- Parameters:
buffered_video_reader (BufferedVideoReader) – BufferedVideoReader object containing the videos to track. Created by fr.helpers.BufferedVideoReader.
point_positions (np.ndarray) – Initial seed points to track. Each row is one point; columns are (x, y). Typically produced by fr.rois.ROIs via the ROIs.point_positions attribute. shape: (n_points, 2), dtype: float.
rois_masks (Union[np.ndarray, List[np.ndarray], ROIs]) – ROI mask(s) used to zero-out non-ROI pixels before tracking. A single 2D bool array (shape: (H, W)) or a list of such arrays. When a list is provided, the masks are intersected into a single combined mask. (Default is None)
contiguous (bool) – If True, all videos are treated as one continuous stream (the first frame of each video continues from the previous video). If False, point tracking restarts for each video. (Default is False)
params_optical_flow (dict) – Parameters for optical flow. Missing keys fall back to defaults. Supported keys:
'method': Optical flow method. Only 'lucas_kanade' is supported.
'mesh_rigidity': Strength of the mesh-distance restoring force. Depends on point spacing.
'mesh_n_neighbors': Number of nearest neighbors used for the mesh constraint.
'relaxation': Per-frame fraction by which points relax back toward their original positions.
'kwargs_method': Extra kwargs forwarded to cv2.calcOpticalFlowPyrLK (winSize, maxLevel, criteria).
See the OpenCV LK optical flow docs for parameter meanings. (Default is the dict shown in the signature)
params_clahe (dict) – Keyword arguments forwarded to cv2.createCLAHE (or cv2.cuda.createCLAHE). If None, CLAHE is not applied. (Default is {'clipLimit': 40.0, 'tileGridSize': (150, 150)})
params_outlier_handling (dict) – Parameters for outlier (violation) handling. A violation is a frame in which a point exceeds the displacement threshold from its original position; on a violation the affected point has its velocity frozen for a window around the event. Supported keys:
'threshold_displacement': Maximum allowed displacement from the original position, in pixels.
'framesHalted_before': Number of frames to halt before a violation.
'framesHalted_after': Number of frames to halt after a violation.
(Default is the dict shown in the signature)
frames_freeze (Optional[np.ndarray]) – 1D bool array marking frames whose optical flow delta should be proactively zeroed. True = freeze the OF delta for that frame. Length should equal the total number of frames across all videos in contiguous mode, or per-video length in non-contiguous mode. (Default is None)
relaxation_during_freeze_frames (bool) – Controls behavior on proactively frozen frames. If True, the OF delta is zeroed but mesh rigidity and relaxation forces still apply, so the mesh can maintain shape and relax toward home positions. If False, points are fully frozen and their positions are copied from the previous frame with no forces applied. (Default is True)
idx_start (Union[int, list, np.ndarray]) – Index of the first frame to track. If an int, it is used for all videos (or for the contiguous index when contiguous=True). If a list/array and contiguous=False, each entry is the start index for the corresponding video. (Default is 0)
visualize_video (bool) – If True, displays the tracked frames via cv2.imshow. Set to False on headless systems. (Default is False)
params_visualization (dict) – Parameters forwarded to fr.visualization.FrameVisualizer. Do not include 'points_colors' since it is reserved for outlier coloring. (Default is {'alpha': 1.0, 'point_sizes': 1})
verbose (Union[bool, int]) – Verbosity level.
0: silent.
1: warnings only.
2: all info.
(Default is 1)
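The outlier-handling parameters describe a window of halted frames around each violation. The sketch below only computes which (frame, point) entries would fall inside such a window, given precomputed displacements from the original positions. halted_frames_sketch is a hypothetical name, and the tracker itself rewinds and replays frames instead of post-processing an array like this.

```python
import numpy as np

def halted_frames_sketch(displacement, threshold_displacement=25,
                         frames_halted_before=30, frames_halted_after=30):
    """Hypothetical helper.

    displacement: (n_frames, n_points) distance of each point from its original position.
    Returns (violations, halted), both boolean arrays of the same shape.
    """
    n_frames = displacement.shape[0]
    violations = displacement > threshold_displacement
    halted = np.zeros_like(violations)
    for frame, point in zip(*np.nonzero(violations)):
        # Freeze the violating point for a window around the violating frame.
        start = max(frame - frames_halted_before, 0)
        stop = min(frame + frames_halted_after + 1, n_frames)
        halted[start:stop, point] = True
    return violations, halted

disp = np.zeros((10, 2))
disp[5, 1] = 40.0                 # point 1 jumps 40 px away on frame 5
violations, halted = halted_frames_sketch(
    disp, threshold_displacement=25, frames_halted_before=2, frames_halted_after=1
)
```

In this toy case point 1 is halted on frames 3 through 6 and point 0 is never halted.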
- point_positions
Initial seed point positions. shape: (n_points, 2).
- Type:
np.ndarray
- mask
Combined ROI mask. shape: (H, W), dtype: bool.
- Type:
- neighbors
Indices of the k-nearest neighbors of each point. shape: (n_points, mesh_n_neighbors), dtype: int64.
- Type:
- d_0
Initial mean neighbor-distance vectors per point. dtype: float32.
- Type:
- points_tracked
Tracked point arrays. Populated by track_points; first stored as a list of arrays per video, then re-keyed as a dict {str(video_idx): np.ndarray}.
- violations
Per-video sparse COO matrices of violation flags (populated by track_points).
- Type:
- violations_sparseCOO
Per-video violation flags packed as {'row', 'col', 'data', 'shape'} dicts (populated by track_points).
- Type:
- violation_fraction
Per-video fraction of frames that contain at least one violation.
- Type:
List[float]
- cleanup()[source]
Deletes all instance attributes and runs garbage collection to free the large arrays held by the tracker.
- track_points()[source]
Runs the full point tracking workflow across all videos.
Tracks the seed points through every video using the configured optical flow, mesh, outlier handling, and freeze parameters. Populates self.points_tracked, self.violations, self.violations_sparseCOO, self.violation_fraction, and the run_info / run_data dictionaries used by FR_Module.
face_rhythm.project module
Project bootstrap: creates the on-disk project layout and config files.
prepare_project builds <dir>/config.yaml, <dir>/run_info.json and
the analysis_files / visualizations subfolders used by the rest of the
pipeline.
- face_rhythm.project.prepare_project(directory_project='./', overwrite_config=False, update_project_paths=False, mkdir=True, initialize_visualization=True, verbose=1)[source]
Prepares the project folder and creates config.yaml and run_info.json (if they do not already exist or an overwrite is requested).
- Parameters:
directory_project (str) – Path to the project directory. If './' is passed, the current working directory is used. (Default is './')
overwrite_config (bool) – Whether to overwrite the entire config.yaml file with a brand-new config. If False, update_project_paths can still be set to True. (Default is False)
update_project_paths (bool) – If True, updates the following entries within the existing config.yaml to reflect the current directory_project:
paths > project: directory_project/
paths > config: directory_project/config.yaml
paths > run_info: directory_project/run_info.json
If overwrite_config is True, this argument is ignored. (Default is False)
mkdir (bool) – Whether to create the project directory if it does not exist. (Default is True)
initialize_visualization (bool) – Whether to initialize cv2.imshow visualization. On a headless server this should be set to False. (Default is True)
verbose (int) – Verbosity level. One of:
0: No output.
1: Warnings.
2: Info.
(Default is 1)
- Returns:
- tuple containing:
- path_config (str):
Path to the config.yaml file.
- path_run_info (str):
Path to the run_info.json file.
- directory_project (str):
Path to the project directory.
- Return type:
(tuple)
face_rhythm.rois module
ROI selection, point-grid generation, and image warping / registration.
Provides the ROIs class (choose face regions via GUI, file, or explicit
dict), the ImageAlignmentChecker and registration helpers used by the
alignment module, and the interactive _Select_ROI Plotly/ipywidgets GUI.
- class face_rhythm.rois.ROIs(select_mode='gui', exampleImage=None, path_file=None, coords_rois=None, point_positions=None, mask_images=None, verbose=1)[source]
Bases: FR_Module
Container for one or more face ROIs and the tracking points sampled within them. Supports three construction modes: interactive GUI, loading a saved ROIs.h5 file, or building from explicit polygon coordinates. RH 2022
- Parameters:
select_mode (str) – How to populate the ROIs. One of:
'gui': Launch the interactive Plotly/ipywidgets selector. exampleImage must be provided.
'file': Load mask_images, roi_points, and exampleImage from a previously saved ROIs.h5 file. path_file must be provided.
'custom': Build masks from explicit polygon coordinates. coords_rois and exampleImage must be provided.
(Default is 'gui')
exampleImage (np.ndarray) – Image to display in the GUI or to define the canvas size for 'custom' mode. Only used when select_mode is 'gui' or 'custom'. (Default is None)
path_file (str) – Path to a saved ROIs.h5 file. Only used when select_mode is 'file'. (Default is None)
coords_rois (dict) – Dictionary mapping ROI names (e.g. 'ROI_0', 'ROI_1') to polygon vertices, given as either an np.ndarray of shape (N, 2) or a list of [x, y] pairs. Only used when select_mode is 'custom'. (Default is None)
point_positions (np.ndarray) – Optional pre-computed array of tracking point positions, shape (n_points, 2). (Default is None)
mask_images (dict) – Dictionary mapping mask names to 2D boolean np.ndarray masks with the same height and width as the videos. (Default is None)
verbose (int) – Verbosity level. One of:
0: No output.
1: Warnings only.
2: All output.
(Default is 1)
- exampleImage
The reference image associated with the ROIs.
- Type:
np.ndarray
- point_positions
Tracking point positions, shape (n_points, 2).
- Type:
np.ndarray
- make_points(rois, point_spacing=10)[source]
Generates a regular grid of tracking points inside the intersection of the supplied ROI masks and stores them on self.point_positions.
- Parameters:
rois (Union[List[np.ndarray], np.ndarray]) – Either a list of 2D boolean masks, a single 2D boolean mask, or a 3D boolean array stacked along axis 0. All masks must share the same shape.
point_spacing (int) – Spacing between adjacent grid points, in pixels. (Default is 10)
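A regular grid restricted to the intersection of masks can be sketched in a few lines of NumPy. The grid origin at pixel (0, 0) is an assumption of this sketch, and grid_points_in_masks is a hypothetical name; the library may place or offset the grid differently.

```python
import numpy as np

def grid_points_in_masks(masks, point_spacing=10):
    """Hypothetical helper: (x, y) grid points that lie inside every mask."""
    masks = np.asarray(masks, dtype=bool)
    if masks.ndim == 2:
        masks = masks[None]                      # single mask -> stack of one
    combined = np.all(masks, axis=0)             # intersection of all masks
    h, w = combined.shape
    xx, yy = np.meshgrid(np.arange(0, w, point_spacing),
                         np.arange(0, h, point_spacing))
    keep = combined[yy, xx]                      # grid nodes that fall inside the mask
    return np.stack([xx[keep], yy[keep]], axis=-1).astype(float)

mask_a = np.zeros((40, 60), dtype=bool)
mask_a[:, :30] = True                            # left half of the image
mask_b = np.zeros((40, 60), dtype=bool)
mask_b[10:, :] = True                            # everything below row 10
points = grid_points_in_masks([mask_a, mask_b], point_spacing=10)
```

Here the intersection keeps grid nodes with x in {0, 10, 20} and y in {10, 20, 30}, giving nine points of shape (n_points, 2).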
- set_point_positions(point_positions)[source]
Manually overrides self.point_positions with an explicit array.
- Parameters:
point_positions (np.ndarray) – Tracking point coordinates as (x, y) pairs. shape: (n_points, 2).
- plot_rois(image=None, **kwargs_imshow)[source]
Plots ROI polygon outlines (and tracking points if available) on top of an image.
- Parameters:
image (np.ndarray) – Background image to draw the ROIs on. If None, falls back to self.exampleImage; if that is also missing, a blank image is used. (Default is None)
**kwargs_imshow – Additional keyword arguments forwarded to matplotlib.pyplot.imshow.
- Returns:
- tuple containing:
- fig (matplotlib.figure.Figure):
The Matplotlib figure containing the plot.
- ax (matplotlib.axes.Axes):
The Matplotlib axes containing the plot.
- Return type:
(tuple)
- class face_rhythm.rois.ROI_Alinger(method='createOptFlow_DeepFlow', kwargs_method=None, verbose=1)[source]
Bases: object
Registers a template image to a set of new images using OpenCV optical flow, then warps the template’s ROI polygons and tracking points onto each new image. RH 2022
- Parameters:
method (str) – Optical-flow method to use for non-rigid registration. One of:
'calcOpticalFlowFarneback'
'createOptFlow_DeepFlow'
(Default is 'createOptFlow_DeepFlow')
kwargs_method (dict) – Keyword arguments forwarded to the chosen optical-flow method. If None, hard-coded defaults are used. (Default is None)
verbose (int) – Verbosity level. One of:
0: No updates.
1: Warnings only.
2: All updates.
(Default is 1)
- align_and_make_ROIs(ROIs_object_template, images_new, image_template=None, template_method='image', shifts=None, normalize=True)[source]
Performs non-rigid registration of a template image onto each new image and warps the template’s tracking points, ROI polygons, masks, and the new images themselves into the template’s frame. RH 2022
Results are stored on the instance as self.flows, self.pointPositions_new, self.roiPoints_new, self.maskImages_new, self.ROIs_objects_new, and self.images_warped.
- Parameters:
ROIs_object_template (ROIs) – A single ROIs object built from the template image. Its ROIs and tracking points are warped onto each new image.
images_new (List[np.ndarray]) – Images to align the template to. Each image must have shape (H, W, n_channels) and dtype uint8.
image_template (np.ndarray) – Template image to warp onto the new images. shape: (H, W, n_channels), dtype: uint8. If None, ROIs_object_template.exampleImage is used. (Default is None)
template_method (str) – Strategy for choosing the template per registration. One of:
'image': image_template is treated as a single image.
'sequential': image_template is treated as the integer index of the image to use as the zero-offset reference.
(Default is 'image')
shifts (np.ndarray) – Per-image (dx, dy) shifts to add to each computed flow field, e.g. from a phase-correlation pre-registration step. If None, zero shifts are applied. (Default is None)
normalize (bool) – If True, normalize images to [0, 255] (using each image’s own min and max) before registration. (Default is True)
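The normalize option rescales each image by its own minimum and maximum. A minimal sketch of that min-max scaling follows. normalize_to_uint8 is a hypothetical name, and the rounding and the handling of constant images are assumptions of this sketch.

```python
import numpy as np

def normalize_to_uint8(image):
    """Hypothetical helper: min-max scale one image to the range [0, 255]."""
    image = image.astype(float)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # Constant image: avoid dividing by zero and return all zeros (an assumption).
        return np.zeros(image.shape, dtype=np.uint8)
    return np.round((image - lo) / (hi - lo) * 255).astype(np.uint8)

img = np.array([[10, 20], [30, 50]], dtype=np.uint8)
out = normalize_to_uint8(img)        # darkest pixel -> 0, brightest pixel -> 255
flat = normalize_to_uint8(np.full((2, 2), 7, dtype=np.uint8))
```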
- class face_rhythm.rois.Image_Aligner(verbose=True)[source]
Bases: FR_Module
A class for registering points to a template image. Currently relies on available OpenCV methods for rigid and non-rigid registration. RH 2023
- Parameters:
verbose (bool) – Whether to print progress updates. (Default is True)
- classmethod augment_images(ims: List[numpy.ndarray], use_CLAHE: bool = True, CLAHE_grid_size: int = 1, CLAHE_clipLimit: int = 1, CLAHE_normalize: bool = True) None[source]
Augments the FOV images by mixing the FOV with the ROI images and optionally applying CLAHE. RH 2023
- Parameters:
ims (List[np.ndarray]) – A list of FOV images.
use_CLAHE (bool) – Whether to apply CLAHE to the images. (Default is True)
CLAHE_grid_size (int) – The grid size for CLAHE. See alignment.clahe for more details. (Default is 1)
CLAHE_clipLimit (int) – The clip limit for CLAHE. See alignment.clahe for more details. (Default is 1)
CLAHE_normalize (bool) – Whether to normalize the CLAHE output. See alignment.clahe for more details. (Default is True)
- Returns:
- FOV_images_augmented (List[np.ndarray]):
The augmented FOV images.
- Return type:
(List[np.ndarray])
- fit_geometric(template: int | numpy.ndarray, ims_moving: List[numpy.ndarray], template_method: str = 'sequential', mode_transform: str = 'affine', gaussFiltSize: int = 11, mask_borders: Tuple[int, int, int, int] = (0, 0, 0, 0), n_iter: int = 1000, termination_eps: float = 1e-09, auto_fix_gaussFilt_step: bool | int = 10) numpy.ndarray[source]
Performs geometric registration of ims_moving to a template, using cv2.findTransformECC. RH 2023
- Parameters:
template (Union[int, np.ndarray]) – Depends on the value of template_method. If template_method == 'image', this should be a 2D np.ndarray image, an integer index of the image to use as the template, or a float between 0 and 1 representing the fractional index of the image to use as the template. If template_method == 'sequential', then template is the integer index or fractional index of the image to use as the template.
ims_moving (List[np.ndarray]) – List of images to be aligned.
template_method (str) – Method to use for template selection.
'image': use the image specified by template.
'sequential': register each image to the previous or next image.
(Default is 'sequential')
mode_transform (str) – Mode of geometric transformation. Can be 'translation', 'euclidean', 'affine', or 'homography'. See cv2.findTransformECC for more details. (Default is 'affine')
gaussFiltSize (int) – Size of the Gaussian filter. (Default is 11)
mask_borders (Tuple[int, int, int, int]) – Border mask for the image. Format is (top, bottom, left, right). (Default is (0, 0, 0, 0))
n_iter (int) – Number of iterations for cv2.findTransformECC. (Default is 1000)
termination_eps (float) – Termination criteria for cv2.findTransformECC. (Default is 1e-9)
auto_fix_gaussFilt_step (Union[bool, int]) – Automatically fixes convergence issues by increasing the gaussFiltSize. If False, no automatic fixing is performed. If True, the gaussFiltSize is increased by 2 until convergence. If int, the gaussFiltSize is increased by this amount until convergence. (Default is 10)
- Returns:
- remapIdx_geo (np.ndarray):
An array of shape (N, H, W, 2) representing the remap field for N images.
- Return type:
(np.ndarray)
- fit_nonrigid(template: int | numpy.ndarray, ims_moving: List[numpy.ndarray], remappingIdx_init: numpy.ndarray | None = None, template_method: str = 'sequential', mode_transform: str = 'createOptFlow_DeepFlow', kwargs_mode_transform: dict | None = None) numpy.ndarray[source]
Perform geometric registration of
ims_movingto a template. Currently relies oncv2.findTransformECC. RH 2023- Parameters:
template (Union[int, np.ndarray]) –
If
template_method=='image': Then template is either an image or an integer index or a float fractional index of the image to use as the template.If
template_method=='sequential': then template is the integer index of the image to use as the template.
ims_moving (List[np.ndarray]) – A list of images to be aligned.
remappingIdx_init (Optional[np.ndarray]) – An array of shape (N, H, W, 2) representing any initial remap field to apply to the images in ims_moving. The output of this method will be added/composed with remappingIdx_init. (Default is None)
template_method (str) –
The method to use for template selection. Either
'image': use the image specified by ‘template’.
'sequential': register each image to the previous or next image (will be next for images before the template and previous for images after the template)
(Default is ‘sequential’)
mode_transform (str) – The type of transformation to use for registration. Either ‘createOptFlow_DeepFlow’ or ‘calcOpticalFlowFarneback’. (Default is ‘createOptFlow_DeepFlow’)
kwargs_mode_transform (Optional[dict]) – Keyword arguments for the transform chosen. See cv2 docs for chosen transform. (Default is None)
- Returns:
- remapIdx_nonrigid (np.ndarray):
An array of shape (N, H, W, 2) representing the remap field for N images.
- Return type:
(np.ndarray)
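The ‘sequential’ template method pairs each image with a neighbor on the side facing the template. The sketch below makes that pairing explicit. The function names are hypothetical, and the rounding rule for a fractional template index is an assumption, since the documentation only says the value lies between 0 and 1.

```python
def sequential_targets(n_images, idx_template):
    """For each image index, return the index of the image it is registered to.

    Images before the template register to the next image, images after the
    template register to the previous image, and the template maps to itself.
    """
    targets = []
    for i in range(n_images):
        if i < idx_template:
            targets.append(i + 1)
        elif i > idx_template:
            targets.append(i - 1)
        else:
            targets.append(i)
    return targets


def resolve_template_index(template, n_images):
    """Turn an integer or fractional (0 to 1) template value into an integer index.

    The rounding rule for fractional values is an assumption of this sketch.
    """
    if isinstance(template, float):
        return min(int(template * n_images), n_images - 1)
    return int(template)


idx = resolve_template_index(0.5, n_images=5)
targets = sequential_targets(5, idx)
```

Chaining registrations this way means every image is compared only with an adjacent image, which usually differs less from it than a distant template would.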
- transform_images_geometric(ims_moving: numpy.ndarray, remappingIdx: numpy.ndarray | None = None) numpy.ndarray[source]
Transforms images based on geometric registration warps.
- Parameters:
ims_moving (np.ndarray) – The images to be transformed. (N, H, W)
remappingIdx (Optional[np.ndarray]) – An array specifying how to remap the images. If None, the remapping index from geometric registration is used. (Default is None)
- Returns:
- ims_registered_geo (np.ndarray):
The images after applying the geometric registration warps. (N, H, W)
- Return type:
(np.ndarray)
- transform_images_nonrigid(ims_moving: numpy.ndarray, remappingIdx: numpy.ndarray | None = None) numpy.ndarray[source]
Transforms images based on non-rigid registration warps.
- Parameters:
ims_moving (np.ndarray) – The images to be transformed. (N, H, W)
remappingIdx (Optional[np.ndarray]) – An array specifying how to remap the images. If None, the remapping index from non-rigid registration is used. (Default is None)
- Returns:
- ims_registered_nonrigid (np.ndarray):
The images after applying the non-rigid registration warps. (N, H, W)
- Return type:
(np.ndarray)
- transform_images(ims_moving: List[numpy.ndarray], remappingIdx: List[numpy.ndarray]) List[numpy.ndarray][source]
Transforms images using the specified remapping index.
- Parameters:
ims_moving (List[np.ndarray]) – The images to be transformed. List of arrays with shape: (H, W) or (H, W, C)
remappingIdx (List[np.ndarray]) – The remapping index to apply to the images.
- Returns:
- ims_registered (List[np.ndarray]):
The transformed images. (N, H, W)
- Return type:
(List[np.ndarray])
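A remapping index stores, for every output pixel, the (x, y) source coordinate to sample from. The sketch below shows that lookup with nearest-neighbor sampling in plain NumPy. It is a simplification for illustration: remap_nearest is a hypothetical name, and an interpolating routine such as cv2.remap would normally be used for real images.

```python
import numpy as np


def remap_nearest(im, remappingIdx):
    """Sample `im` at the source coordinates stored in `remappingIdx`.

    remappingIdx has shape (H, W, 2); the last dimension is (x, y) and gives,
    for every output pixel, the source pixel to copy from.
    """
    h, w = im.shape[:2]
    # Round to the nearest pixel and keep the indices inside the image.
    x = np.clip(np.rint(remappingIdx[..., 0]).astype(int), 0, w - 1)
    y = np.clip(np.rint(remappingIdx[..., 1]).astype(int), 0, h - 1)
    return im[y, x]


im = np.arange(12, dtype=np.float32).reshape(3, 4)
xx, yy = np.meshgrid(np.arange(4), np.arange(3))
identity = np.stack([xx, yy], axis=-1).astype(np.float32)

same = remap_nearest(im, identity)              # identity field leaves the image unchanged
shifted = remap_nearest(im, identity + [1, 0])  # each output pixel reads from one pixel to its right
```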
- transform_points(points: numpy.ndarray, remappingIdx: numpy.ndarray)[source]
Warps points through the supplied remapping index field. RH 2022
- Parameters:
points (np.ndarray) – Points to warp as (x, y) pairs. shape: (n_points, 2), dtype: float.
remappingIdx (np.ndarray) – Remapping index field that maps output (x, y) coordinates to source (x, y) coordinates. shape: (H, W, 2), dtype: float. Last dim is (x, y).
- Returns:
- points_remap (np.ndarray):
Warped points clipped to image bounds. shape: (n_points, 2).
- Return type:
(np.ndarray)
- get_flowFields(remappingIdx: numpy.ndarray | None = None) List[numpy.ndarray][source]
Returns the flow fields based on the remapping indices.
- Parameters:
remappingIdx (Optional[np.ndarray]) – The indices for remapping the flow fields. If None, geometric or nonrigid registration must be performed first. (Default is None)
- Returns:
- flow_fields (List[np.ndarray]):
The transformed flow fields.
- Return type:
(List[np.ndarray])
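A common convention relates the two representations by the identity pixel grid: a remapping index holds absolute source coordinates, while a flow field holds displacements. The sketch below assumes that convention. The function names are hypothetical, and whether get_flowFields uses exactly this definition is not stated in this documentation.

```python
import numpy as np


def remappingIdx_to_flow(remappingIdx):
    """Subtract the identity pixel grid from an (H, W, 2) remapping field."""
    h, w = remappingIdx.shape[:2]
    xx, yy = np.meshgrid(np.arange(w), np.arange(h))
    grid = np.stack([xx, yy], axis=-1).astype(remappingIdx.dtype)
    return remappingIdx - grid


def flow_to_remappingIdx(flow):
    """Inverse conversion: add the identity pixel grid back to a flow field."""
    h, w = flow.shape[:2]
    xx, yy = np.meshgrid(np.arange(w), np.arange(h))
    grid = np.stack([xx, yy], axis=-1).astype(flow.dtype)
    return flow + grid


flow = np.zeros((3, 4, 2), dtype=np.float32)
flow[..., 0] = 1.5  # uniform 1.5-pixel displacement along x
remap = flow_to_remappingIdx(flow)
recovered = remappingIdx_to_flow(remap)
```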
- face_rhythm.rois.clahe(im: numpy.ndarray, grid_size: int = 50, clipLimit: int = 0, normalize: bool = True) numpy.ndarray[source]
Perform Contrast Limited Adaptive Histogram Equalization (CLAHE) on an image.
- Parameters:
- Returns:
- im_out (np.ndarray):
Output image after applying CLAHE.
- Return type:
(np.ndarray)
face_rhythm.spectral_analysis module
Spectral analysis of point-tracked motion via Variable-Q Transform (VQT).
VQT_Analyzer converts point trajectories (from face_rhythm.point_tracking)
into per-point spectrograms with a Variable-Q transform, applies 1/f and
per-timepoint normalization, and writes the resulting complex or magnitude
tensors out for downstream face_rhythm.decomposition.
- class face_rhythm.spectral_analysis.VQT_Analyzer(params_VQT: dict = {'F_max': 40, 'F_min': 1, 'Fs_sample': 90, 'Q_highF': 20, 'Q_lowF': 3, 'downsample_factor': 8, 'fast_length': True, 'fft_conv': True, 'filters': None, 'n_freq_bins': 55, 'padding': 'valid', 'plot_pref': False, 'symmetry': 'center', 'take_abs': True, 'taper_asymmetric': True, 'window_type': 'hann'}, batch_size: int = 10, device='cpu', normalization_factor: float = 0.99, spectrogram_exponent: float = 1.0, one_over_f_exponent: float = 1.0, verbose: int = 1)[source]
Bases: FR_Module
Computes normalized Variable-Q Transform (VQT) spectrograms for point displacement traces. RH 2022
- Parameters:
params_VQT (dict) – Keyword arguments forwarded to vqt.VQT (the Variable Q-Transform implementation in vqt). Notable keys include Fs_sample (sampling rate in Hz), Q_lowF and Q_highF (Q-factors at the low and high frequency bounds), F_min and F_max (frequency range in Hz), n_freq_bins, window_type, downsample_factor, and take_abs. (Default is the dict shown in the signature)
batch_size (int) – Number of points processed per VQT batch. (Default is 10)
device (str) – Torch device on which the VQT model is run (e.g. 'cpu' or 'cuda'). (Default is 'cpu')
normalization_factor (float) – Strength of the per-timepoint power normalization, in the range [0, 1]. 0 disables normalization; 1 forces every time point to have equal total power. (Default is 0.99)
spectrogram_exponent (float) – Exponent applied to the spectrogram magnitudes prior to normalization. (Default is 1.0)
one_over_f_exponent (float) – Exponent for the 1/f correction; the spectrogram is multiplied by freqs ** one_over_f_exponent. 0 disables the correction. (Default is 1.0)
verbose (int) – Verbosity level. 0 is silent, higher values print and show progress bars. (Default is 1)
- spectrograms
Dict mapping each input key to its normalized spectrogram array. Populated by transform_all().
- Type:
dict
- x_axis
Dict mapping each input key to the time axis (in samples) of its spectrogram. Populated by transform_all().
- Type:
dict
- freqs
Dict mapping each input key to the frequency bin centers (in Hz). Populated by transform_all().
- Type:
dict
- point_positions
Reshaped reference positions used to subtract offsets from traces.
- Type:
- vqt_model
The underlying VQT filter-bank model.
- Type:
vqt.VQT
- run_data
Output payload (filters, frequencies, spectrograms, axes) used by FR_Module for export.
- Type:
dict
- cleanup()[source]
Deletes every attribute on the instance and triggers garbage collection to free large tensors held by the analyzer.
- transform(points_tracked: numpy.ndarray, point_positions: numpy.ndarray)[source]
Transforms a single batch of tracked points into a normalized VQT spectrogram.
- Parameters:
points_tracked (np.ndarray) – Tracked point coordinates. shape: (n_frames, n_points, 2).
point_positions (np.ndarray) – Reference positions of the tracked points used to compute displacements. shape: (n_points, 2).
- Returns:
- tuple containing:
- spectrograms (np.ndarray):
Normalized spectrograms for the x and y displacement components. shape: (2, n_points, n_freq_bins, n_frames_ds), where n_frames_ds is the downsampled frame count.
- x_axis (np.ndarray):
Time axis of the spectrogram, in samples at the original Fs_sample rate. shape: (n_frames_ds,).
- freqs (np.ndarray):
Frequency bin centers in Hz. shape: (n_freq_bins,).
- Return type:
(tuple)
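The effect of the constructor arguments normalization_factor, spectrogram_exponent, and one_over_f_exponent can be sketched on a single magnitude spectrogram. The blending formula and the order of the steps below are assumptions chosen to reproduce the documented endpoints (0 disables normalization, 1 equalizes total power per time point). The library's exact expression may differ, and normalize_spectrogram is a hypothetical name.

```python
import numpy as np


def normalize_spectrogram(spec, freqs, normalization_factor=0.99,
                          spectrogram_exponent=1.0, one_over_f_exponent=1.0):
    """Apply exponent, 1/f correction, and per-timepoint power normalization.

    spec: (n_freq_bins, n_frames) non-negative magnitudes.
    freqs: (n_freq_bins,) frequency bin centers in Hz.
    """
    s = spec ** spectrogram_exponent
    # 1/f correction: boost higher frequencies; exponent 0 leaves s unchanged.
    s = s * (freqs[:, None] ** one_over_f_exponent)
    # Total power at each time point, shape (1, n_frames).
    total = s.sum(axis=0, keepdims=True)
    # Blend between no normalization (factor 0) and full normalization (factor 1).
    return s / (normalization_factor * total + (1.0 - normalization_factor))


rng = np.random.default_rng(0)
spec = rng.random((5, 8)) + 0.1
freqs = np.linspace(1.0, 40.0, 5)

unchanged = normalize_spectrogram(spec, freqs, normalization_factor=0.0, one_over_f_exponent=0.0)
equal_power = normalize_spectrogram(spec, freqs, normalization_factor=1.0)
```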
- transform_all(points_tracked: dict, point_positions: numpy.ndarray)[source]
Generates spectrograms for every entry in a dict of tracked-point arrays and stores the results on the instance.
- Parameters:
points_tracked (dict) – Mapping from a name to a tracked-points array of shape (n_frames, n_points, 2).
point_positions (np.ndarray) – Reference positions of the tracked points used to compute displacements. shape: (n_points, 2).
- Set Attributes:
- spectrograms (dict):
Dict mapping each input key to its normalized spectrogram.
- x_axis (dict):
Dict mapping each input key to the spectrogram time axis.
- freqs (dict):
Dict mapping each input key to the frequency bin centers in Hz.
- run_data (dict):
Updated with spectrograms, x_axis, and point_positions for FR_Module export.
- demo_transform(points_tracked: dict, point_positions: numpy.ndarray, idx_point: list = [0], name_points: str = '0', plot: bool = True)[source]
Runs a single-point demo transform for visual sanity checking and prints the projected memory footprint of the full spectrogram set.
- Parameters:
points_tracked (dict) – Mapping from a name to a tracked-points array of shape (n_frames, n_points, 2).
point_positions (np.ndarray) – Reference positions of the tracked points. shape: (n_points, 2).
idx_point (list) – Indices of points within the selected entry to transform and plot. (Default is [0])
name_points (str) – Key into points_tracked selecting which array to use. (Default is '0')
plot (bool) – If True, displays a matplotlib figure with the x and y spectrograms. (Default is True)
- Returns:
- tuple containing:
- spec (np.ndarray):
Demo spectrogram. shape: (2, n_freq_bins, n_frames_ds).
- x_axis (np.ndarray):
Time axis of the spectrogram, in samples at the original Fs_sample rate. shape: (n_frames_ds,).
- freqs (np.ndarray):
Frequency bin centers in Hz. shape: (n_freq_bins,).
- Return type:
(tuple)
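The projected memory footprint follows from the documented spectrogram shape (2, n_points, n_freq_bins, n_frames_ds). The sketch below scales a single-point demo result up to a full dataset. The helper names are hypothetical, the default of 4 bytes per value assumes float32 storage, and the example numbers are made up.

```python
def spectrogram_nbytes(n_points, n_freq_bins, n_frames_ds, bytes_per_value=4):
    """Bytes needed for one (2, n_points, n_freq_bins, n_frames_ds) spectrogram."""
    return 2 * n_points * n_freq_bins * n_frames_ds * bytes_per_value


def project_total_gb(demo_shape, n_points, n_videos, bytes_per_value=4):
    """Scale a single-point demo spectrogram up to every point in every video.

    demo_shape is the (2, n_freq_bins, n_frames_ds) shape returned for one point.
    Assumes all videos have the same length as the demo video.
    """
    _, n_freq_bins, n_frames_ds = demo_shape
    per_video = spectrogram_nbytes(n_points, n_freq_bins, n_frames_ds, bytes_per_value)
    return per_video * n_videos / 1e9


# Example numbers are made up for illustration.
total_gb = project_total_gb(demo_shape=(2, 55, 10_000), n_points=500, n_videos=3)
print(f"projected size: {total_gb:.2f} GB")
```

If the estimate exceeds available memory, raising downsample_factor or lowering n_freq_bins in params_VQT shrinks the corresponding dimension.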
face_rhythm.util module
Miscellaneous project utilities: FR_Module base class, config I/O, system info, batch launcher.
Contains the shared FR_Module base class (save config / run_info / run_data
for every pipeline stage), YAML helpers, matplotlib -> numpy array helpers, a
system-info collector used to snapshot the environment in run_info.json, and a
SLURM batch-run wrapper.
- face_rhythm.util.get_default_parameters(path_defaults=None, directory_project=None, directory_videos=None, filename_videos_strMatch=None, path_ROIs=None)[source]
Returns a dictionary of default parameters for running face-rhythm pipelines. RH 2023
- Parameters:
path_defaults (Optional[str]) – Path to a JSON file containing a parameters dictionary. If provided, parameters are loaded from this file. If None, the built-in defaults are used. (Default is None)
directory_project (Optional[str]) – Directory to use as the project directory. Passed through to fr.project.prepare_project. (Default is None)
directory_videos (Optional[str]) – Directory containing the videos. Passed through to fr.helpers.find_paths to discover video paths. (Default is None)
filename_videos_strMatch (Optional[str]) – Regex that video filenames must match. Passed through to fr.helpers.find_paths to filter discovered videos. (Default is None)
path_ROIs (Optional[str]) – Path to the file containing the ROIs. Used by fr.rois.ROIs when running in 'file' mode instead of 'gui' mode. (Default is None)
- Returns:
- params (dict):
Dictionary containing the default (or loaded) parameters for each pipeline stage.
- Return type:
(dict)
- class face_rhythm.util.FR_Module[source]
Bases: object
Superclass for all face-rhythm module classes. Provides shared helpers for saving run_data, run_info, and config files. RH 2022
- run_info
Per-run metadata populated by the subclass. Saved by save_run_info().
- Type:
Optional[dict]
- run_data
Per-run output data populated by the subclass. Saved by save_run_data().
- Type:
Optional[dict]
- module_name
Name of the concrete subclass; used as the top-level key in the config and run_info files.
- Type:
- save_config(path_config=None, overwrite=True, verbose=1)[source]
Appends self.config to the config.yaml file. RH 2022
self.config is created by the subclass and should contain all parameters used to run the module.
- save_run_info(path_run_info=None, path_config=None, overwrite=True, verbose=1)[source]
Appends self.run_info to the run_info.json file.
Exactly one of path_run_info or path_config must be supplied.
- Parameters:
path_run_info (Optional[str]) – Path to the run_info.json file. If None, path_config must be provided and must contain config['paths']['run_info']. If the file does not exist, it will be created. (Default is None)
path_config (Optional[str]) – Path to the config.yaml file. If None, path_run_info must be provided. (Default is None)
overwrite (bool) – If True, overwrites the existing field for this module inside run_info.json. (Default is True)
verbose (int) –
Verbosity level. Either
0: Silent.
1: Print warnings.
2: Print all info.
(Default is 1)
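Appending to run_info.json amounts to a read, update, and write keyed by module name. The sketch below illustrates that pattern with the standard library. append_run_info is a hypothetical helper, raising when overwrite is False is a choice made here rather than documented behavior, and 'PointTracker' is only an example key.

```python
import json
import tempfile
from pathlib import Path


def append_run_info(path_run_info, module_name, run_info, overwrite=True):
    """Store run_info under module_name in a JSON file, creating the file if needed."""
    path = Path(path_run_info)
    existing = json.loads(path.read_text()) if path.exists() else {}
    if module_name in existing and not overwrite:
        # Refusing to overwrite is this sketch's choice; the real method may warn instead.
        raise FileExistsError(f"'{module_name}' already present in {path}")
    existing[module_name] = run_info
    path.write_text(json.dumps(existing, indent=2))
    return existing


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "run_info.json"
    append_run_info(path, "VQT_Analyzer", {"n_points": 10})
    merged = append_run_info(path, "PointTracker", {"n_frames": 100})
```

Keying by module name lets each pipeline stage update its own entry without disturbing entries written by earlier stages.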
- save_run_data(path_run_data=None, path_config=None, overwrite=True, use_compression=False, track_order=True, verbose=1)[source]
Saves self.run_data to an .h5 file under the project’s analysis_files directory. RH 2022
self.run_data is created by the subclass and should contain all the data generated by the module. Exactly one of path_run_data or path_config must be supplied. The project directory should already exist (use face_rhythm.project.prepare_project).
- Parameters:
path_run_data (Optional[str]) – Path to the output .h5 file. If None, path_config must be provided and must contain config['paths']['project']. Resolved path will be <project>/analysis_files/<module_name>.h5. If the file does not exist, it will be created. (Default is None)
path_config (Optional[str]) – Path to the config.yaml file. If None, path_run_data must be provided. (Default is None)
overwrite (bool) – If True, overwrites the existing .h5 file. (Default is True)
use_compression (bool) – If True, uses compression when writing the .h5 file. (Default is False)
track_order (bool) – If True, preserves insertion order of keys inside the .h5 file. (Default is True)
verbose (int) –
Verbosity level. Either
0: Silent.
1: Print warnings.
2: Print all info.
(Default is 1)
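The path rule for save_run_data can be sketched as a small resolver. This is an illustration only: resolve_path_run_data is a hypothetical name, and it takes an already-loaded config dictionary instead of the path_config argument that the real method accepts.

```python
from pathlib import Path


def resolve_path_run_data(module_name, path_run_data=None, config=None):
    """Return the .h5 output path from an explicit path or a loaded config dict."""
    if (path_run_data is None) == (config is None):
        raise ValueError("Supply exactly one of path_run_data or config.")
    if path_run_data is not None:
        return Path(path_run_data)
    return Path(config["paths"]["project"]) / "analysis_files" / f"{module_name}.h5"


path = resolve_path_run_data("VQT_Analyzer", config={"paths": {"project": "/data/my_project"}})
```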
- face_rhythm.util.load_yaml_safe(path, verbose=0)[source]
Loads a YAML file, falling back to yaml.Loader if FullLoader fails.
- face_rhythm.util.load_config_file(path, verbose=0)[source]
Loads a config.yaml file as a dictionary.
- face_rhythm.util.load_run_info_file(path, verbose=0)[source]
Loads a run_info.json file as a dictionary.
- class face_rhythm.util.Saver_Viz_Base(path_config: str = None, dir_save: str = None, formats_save: list = ['png'], kwargs_method: dict = {}, overwrite: bool = False, verbose: int = 1)[source]
Bases: object
Superclass for saving visualizations (e.g. Figure_Saver, Image_Saver).
- Parameters:
path_config (Optional[str]) – Path to the config.yaml file. Optional if dir_save is specified. (Default is None)
dir_save (Optional[str]) – Directory to save visualizations into. Optional if path_config is specified. (Default is None)
formats_save (List[str]) – File formats to save visualizations as. Valid values depend on the saving method used by the subclass. (Default is ['png'])
kwargs_method (Dict[str, Any]) – Keyword arguments forwarded to the underlying save method. (Default is {})
overwrite (bool) – If True, overwrites existing files. (Default is False)
verbose (int) –
Verbosity level. Either
0: Silent.
1: Print warnings.
2: Print warnings and info.
(Default is 1)
- class face_rhythm.util.Figure_Saver(path_config: str = None, dir_save: str = None, formats_save: list = ['png'], kwargs_savefig: dict = {'bbox_inches': 'tight', 'dpi': 300, 'pad_inches': 0.1, 'transparent': True}, overwrite: bool = False, verbose: int = 1)[source]
Bases: Saver_Viz_Base
Saves matplotlib figures to disk in one or more file formats. RH 2022
- Parameters:
path_config (Optional[str]) – Path to the config.yaml file. If None, dir_save must be specified. (Default is None)
dir_save (Optional[str]) – Directory to save the figure into. Used when path_config is None. (Default is None)
formats_save (List[str]) – File formats to save the figure as. Common choices are 'png', 'svg', 'eps', and 'pdf'. (Default is ['png'])
kwargs_savefig (Dict[str, Any]) – Keyword arguments forwarded to matplotlib.figure.Figure.savefig. (Default is {'bbox_inches': 'tight', 'pad_inches': 0.1, 'transparent': True, 'dpi': 300})
overwrite (bool) – If True, overwrites existing files. (Default is False)
verbose (int) –
Verbosity level. Either
0: Silent.
1: Print warnings.
2: Print warnings and info.
(Default is 1)
- save_figure(fig, name_save: str = None, dir_save: str = None, formats_save: str = None, kwargs_savefig: dict = None)[source]
Saves a single matplotlib figure to one or more file formats.
- Parameters:
fig (matplotlib.figure.Figure) – Figure to save.
name_save (Optional[str]) – Name of the file to save the figure as (without extension). If None, the figure’s label is used. (Default is None)
dir_save (Optional[str]) – Directory to save the figure into. If None, the directory stored on the instance is used. (Default is None)
formats_save (Optional[Union[str, List[str]]]) – File format(s) to save the figure as. If None, the formats stored on the instance are used. (Default is None)
kwargs_savefig (Optional[Dict[str, Any]]) – Keyword arguments forwarded to matplotlib.figure.Figure.savefig. If None, the stored kwargs are used. (Default is None)
- class face_rhythm.util.Image_Saver(path_config: str = None, dir_save: str = None, formats_save: list = ['png'], kwargs_PIL_save: dict = {}, overwrite: bool = False, verbose: int = 1)[source]
Bases: Saver_Viz_Base
Saves images and animated GIFs to disk using PIL. RH 2022
- Parameters:
path_config (Optional[str]) – Path to the config.yaml file. If None, dir_save must be specified. (Default is None)
dir_save (Optional[str]) – Directory to save the image into. Used when path_config is None. (Default is None)
formats_save (List[str]) – File formats to save the image as. Common choices are 'png', 'jpg', and 'tif'. (Default is ['png'])
kwargs_PIL_save (Dict[str, Any]) – Keyword arguments forwarded to PIL.Image.Image.save. (Default is {})
overwrite (bool) – If True, overwrites existing files. (Default is False)
verbose (int) –
Verbosity level. Either
0: Silent.
1: Print warnings.
2: Print warnings and info.
(Default is 1)
- save_image(array_image, name_save: str = None, dir_save: str = None, formats_save: str = None, kwargs_PIL_save: dict = None)[source]
Saves a single image array as one or more files using PIL.
- Parameters:
array_image (np.ndarray) – Image to save. shape: (H, W) or (H, W, C) with C in {1, 3}. If dtype is float, values must lie in [0, 1] and will be scaled by 255 and cast to uint8. If dtype is int, values must lie in [0, 255] and will be cast to uint8.
name_save (Optional[str]) – Name of the file to save the image as (without extension). If None, 'image' is used. (Default is None)
dir_save (Optional[str]) – Directory to save the image into. If None, the directory stored on the instance is used. (Default is None)
formats_save (Optional[Union[str, List[str]]]) – File format(s) to save the image as. If None, the formats stored on the instance are used. (Default is None)
kwargs_PIL_save (Optional[Dict[str, Any]]) – Keyword arguments forwarded to PIL.Image.Image.save. If None, the stored kwargs are used. (Default is None)
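The dtype rule for array_image can be sketched as a small conversion helper. to_uint8 is a hypothetical name, and raising on out-of-range values is an assumption of this sketch; the documentation only states that values must lie in the given range.

```python
import numpy as np


def to_uint8(array_image):
    """Convert a float image in [0, 1] or an integer image in [0, 255] to uint8."""
    arr = np.asarray(array_image)
    if np.issubdtype(arr.dtype, np.floating):
        if arr.min() < 0 or arr.max() > 1:
            raise ValueError("Float images must lie in [0, 1].")
        # Scale to [0, 255]; the cast truncates fractional values.
        return (arr * 255).astype(np.uint8)
    if np.issubdtype(arr.dtype, np.integer):
        if arr.min() < 0 or arr.max() > 255:
            raise ValueError("Integer images must lie in [0, 255].")
        return arr.astype(np.uint8)
    raise TypeError(f"Unsupported dtype: {arr.dtype}")


im_float = np.array([[0.0, 0.5], [1.0, 0.25]])
im_u8 = to_uint8(im_float)
```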
- save_gif(array_images, name_save: str = None, dir_save: str = None, frame_rate: float = 5.0, loop: int = True, optimize: bool = True, kwargs_PIL_save: dict = None)[source]
Saves a sequence of images as an animated GIF using PIL.
- Parameters:
array_images (List[np.ndarray]) – List of frames to save. Each frame has shape (H, W) or (H, W, C) with C in {1, 3}.
name_save (Optional[str]) – Name of the file to save the GIF as (without extension). If None, 'image' is used. (Default is None)
dir_save (Optional[str]) – Directory to save the GIF into. If None, the directory stored on the instance is used. (Default is None)
frame_rate (float) – Playback frame rate in frames per second. (Default is 5.0)
loop (Union[int, bool]) – Number of times the GIF should loop. True loops forever. (Default is True)
optimize (bool) – If True, applies PIL’s GIF size optimization. (Default is True)
kwargs_PIL_save (Optional[Dict[str, Any]]) – Keyword arguments forwarded to PIL.Image.Image.save. If None, the stored kwargs are used. (Default is None)
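Pillow's GIF writer takes a per-frame duration in milliseconds and treats loop=0 as looping forever, so frame_rate and loop need translating before the save call. The sketch below shows one possible translation. gif_save_kwargs is a hypothetical helper, and mapping loop=False to an omitted loop key is an assumption.

```python
def gif_save_kwargs(frame_rate=5.0, loop=True, optimize=True):
    """Build keyword arguments for PIL.Image.Image.save when writing a GIF."""
    kwargs = {
        "save_all": True,
        # Pillow expects the display time of each frame in milliseconds.
        "duration": 1000.0 / frame_rate,
        "optimize": optimize,
    }
    if loop is True:
        kwargs["loop"] = 0  # Pillow uses 0 for "loop forever".
    elif loop is not False:
        kwargs["loop"] = int(loop)
    # loop=False: leave the key out so the GIF does not loop (assumed mapping).
    return kwargs


kwargs = gif_save_kwargs(frame_rate=5.0, loop=True)
# With a list of PIL images named frames, the call would look like:
# frames[0].save("movie.gif", append_images=frames[1:], **kwargs)
```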
- face_rhythm.util.system_info(verbose: bool = False) Dict[source]
Collects information about the OS, CPU, RAM, GPU, and key Python packages, and optionally prints it. RH 2022
- Parameters:
verbose (bool) – If True, prints each section to stdout as it is collected. (Default is False)
- Returns:
- versions (Dict):
Dictionary containing the system snapshot. Keys include 'datetime', 'face_rhythm', 'operating_system', 'cpu_info', 'user', 'ram', 'gpu_info', 'conda_env', 'python', 'gcc', 'torch', 'cuda', 'cudnn', 'torch_devices', and 'pkgs'.
- Return type:
(Dict)
- face_rhythm.util.batch_run(paths_scripts, params_list, sbatch_config_list, max_n_jobs=2, dir_save=None, name_save='jobNum_', verbose=True)[source]
Submits a batch of SLURM jobs that each run a Python script with a parameter file. Adapted from BNPM. RH 2021
A typical workflow is to sweep one script over a list of parameter dictionaries: each entry in params_list is written to its own job directory as params.json, the corresponding SBATCH script is materialized, and sbatch is invoked. Variants with multiple scripts or multiple SBATCH configs are also supported – any of paths_scripts, params_list, and sbatch_config_list may have length 1 (broadcast) or length n_jobs.
- Parameters:
paths_scripts (List[str]) – Paths to the Python scripts to run. Length must be 1 or n_jobs. Each script should accept the kwargs --path_params and --directory_save injected by this function.
params_list (List[Dict[str, Any]]) – Parameter dictionaries, one per job. Length must be 1 or n_jobs. Each dictionary is written as params.json inside its job directory and its path is passed to the script.
sbatch_config_list (List[str]) – SBATCH script bodies, one per job. Length must be 1 or n_jobs. Each string must contain the literal python "$@" on its final command line; this is replaced with the resolved python <script> --path_params <...> --directory_save <...> invocation before being written to disk.
max_n_jobs (Optional[int]) – Safety cap on the number of jobs that may be submitted. If the inferred n_jobs exceeds this value, a ValueError is raised. Set to None to disable the cap. (Default is 2)
dir_save (Union[str, pathlib.Path]) – Outer directory under which each job’s subdirectory is created. Created if it does not exist. Must be supplied – there is no sensible default. (Default is None)
name_save (Union[str, List[str]]) – Base name for each job’s subdirectory; the job index is always appended. If a string, it is reused for every job; if a list, it must have n_jobs items. (Default is 'jobNum_')
verbose (bool) – If True, prints a status line per submitted job. (Default is True)
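The broadcasting rule and the python "$@" substitution can be sketched without touching SLURM. The helpers below (broadcast, plan_jobs, fill_sbatch) are hypothetical and submit nothing. Inferring n_jobs as the longest input list is an assumption, and the script name, parameters, and SBATCH body are made-up examples.

```python
def broadcast(items, n_jobs, name):
    """Repeat a length-1 list n_jobs times; otherwise require length n_jobs."""
    if len(items) == 1:
        return list(items) * n_jobs
    if len(items) != n_jobs:
        raise ValueError(f"{name} must have length 1 or {n_jobs}, got {len(items)}")
    return list(items)


def plan_jobs(paths_scripts, params_list, sbatch_config_list, max_n_jobs=2):
    """Return one (script, params, sbatch_body) tuple per job without submitting anything."""
    n_jobs = max(len(paths_scripts), len(params_list), len(sbatch_config_list))
    if max_n_jobs is not None and n_jobs > max_n_jobs:
        raise ValueError(f"n_jobs={n_jobs} exceeds max_n_jobs={max_n_jobs}")
    scripts = broadcast(paths_scripts, n_jobs, "paths_scripts")
    params = broadcast(params_list, n_jobs, "params_list")
    configs = broadcast(sbatch_config_list, n_jobs, "sbatch_config_list")
    return list(zip(scripts, params, configs))


def fill_sbatch(body, path_script, path_params, directory_save):
    """Replace the `python "$@"` placeholder with the resolved command line."""
    command = f"python {path_script} --path_params {path_params} --directory_save {directory_save}"
    return body.replace('python "$@"', command)


jobs = plan_jobs(
    paths_scripts=["run.py"],
    params_list=[{"lr": 0.1}, {"lr": 0.01}],
    sbatch_config_list=['#!/bin/bash\n#SBATCH --time=0-01:00\npython "$@"\n'],
)
script_text = fill_sbatch(jobs[0][2], jobs[0][0], "job_0/params.json", "job_0")
```

The default max_n_jobs of 2 guards against accidentally flooding the scheduler, so a larger sweep has to raise the cap deliberately.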
face_rhythm.visualization module
Frame and video visualization: overlay points/text on images and write videos.
FrameVisualizer wraps OpenCV’s video writer and draw primitives; helper
functions play back buffered readers with overlaid trajectories and produce
interactive image stacks for Jupyter contexts.
- class face_rhythm.visualization.FrameVisualizer(display=False, handle_cv2Imshow='FaceRhythmPointVisualizer', path_save=None, frame_height_width=(480, 640), frame_rate=None, fourcc='MJPG', error_checking=True, verbose: int = 1, point_sizes=None, points_colors=None, alpha=None, text=None, text_positions=None, text_color=None, text_size=None, text_thickness=None)[source]
Bases: object
Wraps OpenCV draw primitives and an optional cv2.VideoWriter to overlay points and text on single frames, optionally displaying them via cv2.imshow and/or writing them to a video file. RH 2022
- Parameters:
display (bool) – If True, display each frame using cv2.imshow. (Default is False)
handle_cv2Imshow (str) – Window name passed to cv2.imshow. Used to close the window later. (Default is 'FaceRhythmPointVisualizer')
path_save (Optional[str]) – If not None, frames are written to this video file path. Use an .avi extension (e.g. 'directory/filename.avi'). (Default is None)
frame_height_width (Tuple[int, int]) – Height and width of the displayed and/or saved video. (Default is (480, 640))
frame_rate (Optional[int]) – Frame rate of the displayed and/or saved video. If None, playback runs at top speed and saved videos default to 60 fps. (Default is None)
fourcc (str) – Four-character codec passed to cv2.VideoWriter. (Default is 'MJPG')
error_checking (bool) – If True, perform input validation in visualize_image_with_points. (Default is True)
verbose (int) –
Verbosity level.
0: No messages.
1: Warnings.
2: Info.
(Default is 1)
point_sizes (Optional[Union[int, List[int]]]) – Optional override applied in every call to visualize_image_with_points. Passed to cv2.circle. If an int, all points use this radius; if a list, each element is the radius for one batch of points. (Default is None)
points_colors (Optional[Union[Tuple[int, int, int], List]]) – Optional override applied in every call to visualize_image_with_points. Passed to cv2.circle. If a tuple of 3 ints in [0, 255], all points use this color; if a list, each element is a color or a per-point color array of shape (N, 3) for one batch. (Default is None)
alpha (Optional[float]) – Optional override applied in every call to visualize_image_with_points. Transparency of the overlaid points; values other than 1 are slow. (Default is None)
text (Optional[Union[str, List[str]]]) – Optional override applied in every call to visualize_image_with_points. If None, no text is drawn; if a string, the same string is drawn at every position; if a list, each element is drawn at the corresponding row of text_positions. (Default is None)
text_positions (Optional[np.ndarray]) – Optional override applied in every call to visualize_image_with_points. Must be specified if text is not None. shape: (n_text, 2), order (x, y). (Default is None)
text_color (Optional[Union[str, List[str]]]) – Optional override applied in every call to visualize_image_with_points. Passed to cv2.putText. If a string, the same color is used for all text; if a list, each element is the color for one text item. (Default is None)
text_size (Optional[Union[int, List[int]]]) – Optional override applied in every call to visualize_image_with_points. Passed to cv2.putText. If an int, the same scale is used for all text; if a list, each element is the scale for one text item. (Default is None)
text_thickness (Optional[Union[int, List[int]]]) – Optional override applied in every call to visualize_image_with_points. Passed to cv2.putText. If an int, the same thickness is used for all text; if a list, each element is the thickness for one text item. (Default is None)
- video_writer
Underlying cv2.VideoWriter instance, or None if path_save is not set.
- Type:
Optional[object]
- visualize_image_with_points(image, points=None, point_sizes=None, points_colors=(0, 255, 255), alpha=None, text=None, text_positions=None, text_color='white', text_size=1, text_thickness=1)[source]
Draws points and text onto a single image and optionally displays and/or writes the result. Input validation is intentionally minimal for performance, so the caller must follow the documented formats.
- Parameters:
image (np.ndarray) – Image to draw on. shape: (H, W, 3), dtype: uint8. The last dimension is channels.
points (Optional[Union[np.ndarray, List[np.ndarray]]]) – Points to overlay. If a single np.ndarray of shape (n_points, 2) and integer dtype, it is treated as one batch and clamped to the image bounds. If a list, each element is one batch of shape (n_points, 2) and dtype int; column order is (x, y). (Default is None)
point_sizes (Optional[Union[int, List[int]]]) – Radius passed to cv2.circle. If an int, every point uses this size; if a list, each element is the size for one batch of points. (Default is None)
points_colors (Union[Tuple[int, int, int], List]) – Color passed to cv2.circle. If a tuple of 3 ints in [0, 255], every point uses this color; if a list, each element is either a 3-tuple for one batch or an np.ndarray of shape (n_points, 3) with per-point colors in [0, 255]. (Default is (0, 255, 255))
alpha (Optional[float]) – Transparency of the overlaid points; values other than 1 are slow. (Default is None)
text (Optional[Union[str, List[str]]]) – Text passed to cv2.putText. If None, no text is drawn; if a string, the same string is drawn at every row of text_positions; if a list, each element is drawn at the matching row. (Default is None)
text_positions (Optional[np.ndarray]) – Positions for each text item. Required if text is not None. shape: (n_text, 2), order (x, y). (Default is None)
text_color (Union[str, List[str]]) – Color passed to cv2.putText. If a string, all text uses this color; if a list, each element is the color for one text item. (Default is 'white')
text_size (Union[int, List[int]]) – Font scale passed to cv2.putText. If an int, all text uses this scale; if a list, each element is the scale for one text item. (Default is 1)
text_thickness (Union[int, List[int]]) – Stroke thickness passed to cv2.putText. If an int, all text uses this thickness; if a list, each element is the thickness for one text item. (Default is 1)
- Returns:
- image_out (np.ndarray):
Copy of image with points and text drawn on top. shape: (H, W, 3), dtype: uint8.
- Return type:
(np.ndarray)
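Clamping a batch of integer (x, y) points to the image bounds can be sketched with np.clip. clamp_points is a hypothetical helper written for illustration, and the 480 by 640 frame matches the default frame_height_width.

```python
import numpy as np


def clamp_points(points, height, width):
    """Clip integer (x, y) points so they fall inside an image of the given size."""
    points = np.asarray(points)
    clamped = points.copy()
    clamped[:, 0] = np.clip(points[:, 0], 0, width - 1)   # x runs along the width
    clamped[:, 1] = np.clip(points[:, 1], 0, height - 1)  # y runs along the height
    return clamped


points = np.array([[-5, 10], [700, 500], [320, 240]])
clamped = clamp_points(points, height=480, width=640)
```

Note the column order: points are (x, y), whereas image arrays are indexed as [row, column], that is [y, x].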
- face_rhythm.visualization.play_video_with_points(bufferedVideoReader, frameVisualizer=None, points=None, idx_frames=None)[source]
Plays a video with optional point overlays and optionally writes it to disk via the supplied FrameVisualizer. RH 2022
- Parameters:
bufferedVideoReader (BufferedVideoReader) – Source of frames, created with fr.helpers.BufferedVideoReader.
frameVisualizer (FrameVisualizer) – Visualizer that draws and optionally saves each frame, created with fr.visualization.FrameVisualizer. Required in practice despite the default. (Default is None)
points (Optional[np.ndarray]) – Points to overlay on the video. shape: (num_frames, num_points, 2). (Default is None)
idx_frames (Optional[np.ndarray]) – Indices of frames to play. If None, all frames in the reader are played. (Default is None)
- face_rhythm.visualization.display_toggle_image_stack(images, image_size=None, clim=None, interpolation='nearest')[source]
Renders an HTML image slider in a Jupyter notebook to scrub through a stack of images. RH 2023
- Parameters:
images (List[Union[np.ndarray, torch.Tensor]]) – Images to display, each as a 2D or 3D np.ndarray or torch.Tensor. All images must share an interpretation compatible with PIL fromarray.
image_size (Optional[Union[Tuple[int, int], float]]) –
Output size per image.
Tuple[int, int]: explicit (width, height) applied to every image.
float: scale factor applied to each image’s native shape.
None: images are displayed at their native size.
(Default is None)
clim (Optional[Tuple[float, float]]) – (min, max) intensity bounds used to scale pixel values to [0, 255]. If None, the per-image min and max are used. (Default is None)
interpolation (str) –
Resampling method used when resizing. One of 'nearest', 'box', 'bilinear', 'hamming', 'bicubic', 'lanczos'.
Mapped to the matching PIL.Image.Resampling.* constant. (Default is 'nearest')
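The clim scaling can be sketched as a linear map onto [0, 255]. scale_to_uint8 is a hypothetical helper. Clipping values outside clim and returning zeros for a constant image are choices made in this sketch, not documented behavior.

```python
import numpy as np


def scale_to_uint8(image, clim=None):
    """Linearly map [clim[0], clim[1]] to [0, 255], clipping values outside the range."""
    image = np.asarray(image, dtype=np.float64)
    lo, hi = (image.min(), image.max()) if clim is None else clim
    if hi == lo:
        # A constant image has no range to stretch; return zeros (a choice of this sketch).
        return np.zeros(image.shape, dtype=np.uint8)
    scaled = (image - lo) / (hi - lo)
    return (np.clip(scaled, 0.0, 1.0) * 255).astype(np.uint8)


image = np.array([[0.0, 5.0], [10.0, 20.0]])
auto = scale_to_uint8(image)                 # uses the image's own min and max
fixed = scale_to_uint8(image, clim=(0, 10))  # values above 10 saturate at 255
```

A fixed clim keeps brightness comparable across the stack, whereas per-image scaling maximizes the contrast of each frame individually.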
- face_rhythm.visualization.complex_colormap(mags: numpy.ndarray, angles: numpy.ndarray, normalize_mags: bool = True, color_sin: Tuple[int, int, int] = (255, 0, 0), color_cos: Tuple[int, int, int] = (0, 0, 255)) numpy.ndarray[source]
Generates an RGB colormap for complex values, where hue tracks the angle and brightness tracks the magnitude.
- Parameters:
mags (np.ndarray) – Magnitudes of the complex values. Must broadcast against angles.
angles (np.ndarray) – Angles in radians. Must share shape with mags.
normalize_mags (bool) – If True, apply min-max normalization to mags before scaling brightness. (Default is True)
color_sin (Tuple[int, int, int]) – RGB color contributed in proportion to sin(angles). (Default is (255, 0, 0))
color_cos (Tuple[int, int, int]) – RGB color contributed in proportion to cos(angles). (Default is (0, 0, 255))
- Returns:
- rgb (np.ndarray):
RGB values per element. shape: (mags.size, 3).
- Return type:
(np.ndarray)
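One plausible reading of this mapping is sketched below: the sine and cosine of the angle weight the two colors, and the magnitude scales the brightness. This is an assumption-laden illustration, not the library's code. complex_colormap_sketch is a hypothetical name, and rescaling sin and cos from [-1, 1] to [0, 1] is a choice made here to keep the color weights non-negative.

```python
import numpy as np


def complex_colormap_sketch(mags, angles, normalize_mags=True,
                            color_sin=(255, 0, 0), color_cos=(0, 0, 255)):
    """Return an (n, 3) array of RGB values for flattened magnitudes and angles."""
    mags = np.asarray(mags, dtype=np.float64).ravel()
    angles = np.asarray(angles, dtype=np.float64).ravel()
    if normalize_mags:
        span = mags.max() - mags.min()
        mags = (mags - mags.min()) / span if span > 0 else np.ones_like(mags)
    # Map sin and cos from [-1, 1] to [0, 1] so they can act as color weights.
    w_sin = (np.sin(angles) + 1) / 2
    w_cos = (np.cos(angles) + 1) / 2
    rgb = w_sin[:, None] * np.array(color_sin) + w_cos[:, None] * np.array(color_cos)
    # Brightness follows the (optionally normalized) magnitude.
    rgb = rgb * mags[:, None]
    return np.clip(rgb, 0, 255)


rgb = complex_colormap_sketch(mags=[0.0, 1.0, 2.0], angles=[0.0, np.pi / 2, np.pi])
```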