analyze_cell_shapes

Workflow for analyzing cell shapes.

Working location structure:

(name)
├── analysis
│   ├── analysis.BASIC_METRICS
│   │   └── (name)_(key).BASIC_METRICS.csv
│   ├── analysis.CELL_SHAPES_COEFFICIENTS
│   │   └── (name)_(key).CELL_SHAPES_COEFFICIENTS.csv
│   ├── analysis.CELL_SHAPES_DATA
│   │   └── (name)_(key).CELL_SHAPES_DATA.csv
│   ├── analysis.CELL_SHAPES_MODELS
│   │   └── (name)_(key).CELL_SHAPES_MODELS.pkl
│   ├── analysis.CELL_SHAPES_PROPERTIES
│   │   └── (name)_(key).CELL_SHAPES_PROPERTIES.csv
│   └── analysis.CELL_SHAPES_STATISTICS
│       └── (name)_(key).CELL_SHAPES_STATISTICS.csv
└── calculations
    ├── calculations.COEFFICIENTS
    │   ├── (name)_(key)_(seed)_(region).COEFFICIENTS.csv
    │   └── (name)_(key)_(seed)_(region).COEFFICIENTS.tar.xz
    └── calculations.PROPERTIES
        ├── (name)_(key)_(seed)_(region).PROPERTIES.csv
        └── (name)_(key)_(seed)_(region).PROPERTIES.tar.xz

Data from calculations.PROPERTIES are processed into analysis.CELL_SHAPES_PROPERTIES. Data from calculations.COEFFICIENTS are processed into analysis.CELL_SHAPES_COEFFICIENTS. Data from analysis.BASIC_METRICS are combined with data from analysis.CELL_SHAPES_PROPERTIES and analysis.CELL_SHAPES_COEFFICIENTS into analysis.CELL_SHAPES_DATA. PCA models are saved to analysis.CELL_SHAPES_MODELS. Statistical analysis is saved to analysis.CELL_SHAPES_STATISTICS.
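As a rough illustration of the naming scheme in the tree above, the sketch below assembles file paths relative to the working location. The helpers `analysis_path` and `calculation_path` are hypothetical names introduced only for this example and are not part of the documented workflow. The seed is formatted as a plain integer here, which is an assumption; the workflow may pad or format it differently.

```python
def analysis_path(name: str, key: str, group: str, extension: str = "csv") -> str:
    """Build (name)/analysis/analysis.GROUP/(name)_(key).GROUP.(extension)."""
    return f"{name}/analysis/analysis.{group}/{name}_{key}.{group}.{extension}"


def calculation_path(
    name: str, key: str, seed: int, region: str, group: str, extension: str = "csv"
) -> str:
    """Build (name)/calculations/calculations.GROUP/(name)_(key)_(seed)_(region).GROUP.(extension)."""
    # Assumption: the seed is written without zero padding.
    return (
        f"{name}/calculations/calculations.{group}/"
        f"{name}_{key}_{seed}_{region}.{group}.{extension}"
    )


if __name__ == "__main__":
    # Hypothetical series name, condition key, seed, and region.
    print(analysis_path("SERIES", "A", "CELL_SHAPES_MODELS", "pkl"))
    print(calculation_path("SERIES", "A", 0, "NUCLEUS", "COEFFICIENTS", "tar.xz"))
```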

Flows

run_flow

Main analyze cell shapes flow.

run_flow_analyze_stats

Analyze cell shapes subflow for analyzing distribution statistics.

run_flow_combine_data

Analyze cell shapes subflow for combining data.

run_flow_fit_models

Analyze cell shapes subflow for fitting a PCA model.

run_flow_process_coefficients

Analyze cell shapes subflow for processing coefficients.

run_flow_process_properties

Analyze cell shapes subflow for processing properties.

run_flow(context: ContextConfig, series: SeriesConfig, parameters: ParametersConfig) -> None

Main analyze cell shapes flow.

Calls the following subflows, in order:

  1. run_flow_process_properties()

  2. run_flow_process_coefficients()

  3. run_flow_combine_data()

  4. run_flow_fit_models()

  5. run_flow_analyze_stats()
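The call order listed above is consistent with the data dependencies described on this page. The sketch below checks that property. The `DEPENDS_ON` mapping is an assumption inferred from the subflow descriptions (for example, that statistics need both the combined data and the fitted model); it is not taken from the workflow's source.

```python
# Documented call order of the subflows.
ORDER = [
    "run_flow_process_properties",
    "run_flow_process_coefficients",
    "run_flow_combine_data",
    "run_flow_fit_models",
    "run_flow_analyze_stats",
]

# Assumed dependencies, inferred from the descriptions on this page.
DEPENDS_ON = {
    "run_flow_process_properties": [],
    "run_flow_process_coefficients": [],
    "run_flow_combine_data": [
        "run_flow_process_properties",
        "run_flow_process_coefficients",
    ],
    "run_flow_fit_models": ["run_flow_combine_data"],
    "run_flow_analyze_stats": ["run_flow_combine_data", "run_flow_fit_models"],
}


def respects_dependencies(order: list, depends_on: dict) -> bool:
    """Return True if every step appears after all of its dependencies."""
    position = {name: index for index, name in enumerate(order)}
    return all(
        position[dependency] < position[name]
        for name in order
        for dependency in depends_on[name]
    )


if __name__ == "__main__":
    print(respects_dependencies(ORDER, DEPENDS_ON))
```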

run_flow_analyze_stats(context: ContextConfig, series: SeriesConfig, parameters: ParametersConfig) -> None

Analyze cell shapes subflow for analyzing distribution statistics.

Perform statistical analysis of shape distributions. If the analysis file already exists for a given key, that key is skipped.

run_flow_combine_data(context: ContextConfig, series: SeriesConfig, parameters: ParametersConfig) -> None

Analyze cell shapes subflow for combining data.

Combine processed spherical harmonics coefficients, cell shape properties, and parsed simulation results into a single dataframe that can be used for PCA. If the combined dataframe already exists for a given key, that key is skipped.
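One way to picture the combining step is as an inner join of three tables on shared identifier columns. The sketch below does this with plain dictionaries. The identifier columns (`SEED`, `ID`, `TICK`) and the value columns in the example are hypothetical; the actual workflow may join on different columns and most likely operates on dataframes rather than lists of dictionaries.

```python
def combine_rows(metrics, properties, coefficients, on=("SEED", "ID", "TICK")):
    """Inner-join three lists of row dictionaries on shared identifier columns."""

    def index(rows):
        # Map each identifier tuple to its row for constant-time lookup.
        return {tuple(row[column] for column in on): row for row in rows}

    properties_index = index(properties)
    coefficients_index = index(coefficients)

    combined = []
    for row in metrics:
        identifier = tuple(row[column] for column in on)
        # Keep only rows that are present in all three inputs.
        if identifier in properties_index and identifier in coefficients_index:
            combined.append(
                {**row, **properties_index[identifier], **coefficients_index[identifier]}
            )
    return combined


if __name__ == "__main__":
    # Hypothetical rows; column names are assumptions for illustration only.
    metrics = [
        {"SEED": 0, "ID": 1, "TICK": 0, "phase": "G1"},
        {"SEED": 0, "ID": 2, "TICK": 0, "phase": "S"},
    ]
    properties = [{"SEED": 0, "ID": 1, "TICK": 0, "volume": 10.0}]
    coefficients = [{"SEED": 0, "ID": 1, "TICK": 0, "coefficient_0": 1.5}]
    print(combine_rows(metrics, properties, coefficients))
```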

run_flow_fit_models(context: ContextConfig, series: SeriesConfig, parameters: ParametersConfig) -> None

Analyze cell shapes subflow for fitting a PCA model.

Fit PCA for each key and save the resulting PCA object as a pickle. If the model already exists for a given key, that key is skipped.
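The skip-if-present behavior can be sketched as follows for a local working location. `fit_if_missing` is a hypothetical helper, and the `fit` callable stands in for whatever produces the PCA object; S3 locations are not handled in this sketch.

```python
import pickle
from pathlib import Path
from typing import Any, Callable


def fit_if_missing(path: Path, fit: Callable[[], Any]) -> bool:
    """Fit and pickle a model unless the pickle already exists.

    Returns True if a model was fitted and saved, False if the key was skipped.
    """
    if path.exists():
        return False  # a model already exists for this key, so skip it

    model = fit()
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as file:
        pickle.dump(model, file)
    return True
```

Because the check happens before fitting, rerunning the step after a partial failure would only fit the keys whose pickles are still missing.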

run_flow_process_coefficients(context: ContextConfig, series: SeriesConfig, parameters: ParametersConfig) -> None

Analyze cell shapes subflow for processing coefficients.

Process cell shape spherical harmonics coefficients and compile them into a single dataframe. If the combined dataframe already exists for a given key, that key is skipped.

run_flow_process_properties(context: ContextConfig, series: SeriesConfig, parameters: ParametersConfig) -> None

Analyze cell shapes subflow for processing properties.

Process cell shape properties and compile them into a single dataframe. If the combined dataframe already exists for a given key, that key is skipped.

Configs

ContextConfig

Context configuration for analyze cell shapes flow.

ParametersConfig

Parameter configuration for analyze cell shapes flow.

SeriesConfig

Series configuration for analyze cell shapes flow.

class ContextConfig

Context configuration for analyze cell shapes flow.

working_location: str

Location for input and output files (local path or S3 bucket).

class ParametersConfig

Parameter configuration for analyze cell shapes flow.

reference: dict | None = None

Dictionary of keys for reference data and model for statistics.

regions: list[str]

List of subcellular regions.

components: int = 8

Number of principal components (i.e. shape modes).

ds: float | None = None

Spatial scaling in units/um.

dt: float | None = None

Temporal scaling in hours/tick.

valid_phases: list[str]

Valid phases for processing cell shapes.

valid_times: list[int]

Valid times for processing cell shapes.
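As a minimal sketch of how `valid_times` and `dt` might interact, the helper below keeps only the listed ticks and converts each to hours. Multiplying a tick by `dt` follows from the stated hours/tick units; how the workflow itself applies the filter is not documented here, so `select_times` is a hypothetical helper.

```python
def select_times(ticks, valid_times, dt=None):
    """Keep ticks listed in valid_times, paired with their time in hours.

    The hours entry is None when no temporal scaling is given.
    """
    valid = set(valid_times)
    return [
        (tick, tick * dt if dt is not None else None)
        for tick in ticks
        if tick in valid
    ]


if __name__ == "__main__":
    # Hypothetical ticks and scaling: 0.5 hours per tick.
    print(select_times(range(5), valid_times=[0, 2, 4], dt=0.5))
```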

sample_replicates: int = 100

Number of replicates for calculating stats with sampling.

sample_size: int = 100

Sample size for each tick for calculating stats with sampling.
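The two sampling parameters can be read as "draw `sample_size` values per tick, and repeat that `sample_replicates` times". The sketch below implements that reading with the standard library. Sampling without replacement, the cap at the number of available values, and the fixed random seed are all assumptions; `draw_replicates` is a hypothetical helper.

```python
import random


def draw_replicates(values_by_tick, sample_size=100, sample_replicates=100, seed=0):
    """Draw repeated per-tick samples from a mapping of tick -> list of values."""
    rng = random.Random(seed)  # seeded so that the example is reproducible
    replicates = []
    for _ in range(sample_replicates):
        sample = []
        for tick in sorted(values_by_tick):
            values = values_by_tick[tick]
            # Assumption: sample without replacement, capped at what is available.
            sample.extend(rng.sample(values, min(sample_size, len(values))))
        replicates.append(sample)
    return replicates
```

Each replicate then holds at most `sample_size` values per tick, so a statistic computed per replicate gives a distribution rather than a single number.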

outlier: float | None = None

Standard deviation threshold for outliers.
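A standard deviation threshold is commonly applied by dropping values that lie more than `outlier` standard deviations from the mean. The sketch below shows that interpretation. Whether the workflow uses the population or sample standard deviation, and which features it filters on, is not stated here, so treat `drop_outliers` as a hypothetical illustration.

```python
from statistics import mean, pstdev


def drop_outliers(values, outlier=None):
    """Drop values farther than `outlier` standard deviations from the mean."""
    values = list(values)
    if outlier is None or not values:
        return values  # no threshold given, or nothing to filter

    center = mean(values)
    spread = pstdev(values)  # assumption: population standard deviation
    if spread == 0:
        return values  # all values are identical, so nothing is an outlier

    return [value for value in values if abs(value - center) <= outlier * spread]
```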

features: list[str]

List of features.

class SeriesConfig

Series configuration for analyze cell shapes flow.

name: str

Name of the simulation series.

seeds: list[int]

List of series random seeds.

conditions: list[dict]

List of series condition dictionaries (must include unique condition “key”).
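The uniqueness requirement on condition keys can be checked before running a flow. The sketch below uses a stand-in dataclass, `SeriesStandIn`, that mirrors the three documented fields; it is not the real `SeriesConfig`, and `condition_keys` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class SeriesStandIn:
    """Stand-in mirroring the documented SeriesConfig fields (not the real class)."""

    name: str
    seeds: list
    conditions: list


def condition_keys(series: SeriesStandIn) -> list:
    """Return the condition keys, raising ValueError if any key is repeated."""
    keys = [condition["key"] for condition in series.conditions]
    if len(keys) != len(set(keys)):
        raise ValueError("condition keys must be unique")
    return keys


if __name__ == "__main__":
    # Hypothetical series with two conditions.
    series = SeriesStandIn(
        name="SERIES",
        seeds=[0, 1, 2],
        conditions=[{"key": "A"}, {"key": "B"}],
    )
    print(condition_keys(series))
```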