4.1.1.1.1.4. lib.data_handling.data_analysis
Module that provides functionality to analyze an experiment, including pre-processing the data and analyzing it afterwards.
4.1.1.1.1.4.1. Module Contents#
4.1.1.1.1.4.1.1. Classes#
EvaluationFacade | Wrapper to provide convenient access to data evaluation.
Study | Class to cluster the analysis of an ensemble of experiment objects.
AnalysisRoutines | Class to handle various analysis routines.
Fit | Class that contains fit routine to estimate positions of molecules from experimental record. Every new fit needs the initialization of a new fit routine to clear results.
Estimators | Class that provides different estimators to evaluate the shape of the intensity minimum.
4.1.1.1.1.4.1.2. Functions#
study_single_file | Analyze a single file (experiments.yaml).
4.1.1.1.1.4.1.3. API#
- lib.data_handling.data_analysis.study_single_file(full_file_path, methods=['QUAD'], agnostic=True, collect_artefacts=True)#
Analyze a single file (experiments.yaml).
This function analyzes an experiments.yaml file, applying different methods to study and collect data. The analysis results are then stored in a queue for later saving.
- Parameters:
full_file_path (str) – The absolute path of the experiments.yaml file to be analyzed.
methods (list) – The list of method(s) to use for analysis. Default is ['QUAD'].
agnostic (bool) – Flag to determine if the analysis should be agnostic. Default is True.
collect_artefacts (bool) – Flag to determine if the collected artefacts should be saved. Default is True.
- Returns:
None. This function adds the analyzed result as a dictionary to a queue for later saving.
- Return type:
None
- Raises:
Exception – If a study of a specific type cannot be performed, or if the artefacts from the study cannot be saved.
- Example:
study_single_file('/path/to/experiments.yaml', methods=['QUAD'], agnostic=True, collect_artefacts=True)
Note
The function prints diagnostic information to the console during execution.
Warning
This function may raise exceptions if specific study types fail or if artefacts cannot be saved.
See also
Study
: The class used to perform the analysis.
- class lib.data_handling.data_analysis.EvaluationFacade(collect_artefacts=True, max_files=1)#
Wrapper to provide convenient access to data evaluation.
Initialization
Initialize the EvaluationFacade object.
- Parameters:
collect_artefacts (bool) – Flag indicating whether to collect artifacts. Default is True.
max_files (int) – Maximum number of files to consider for evaluation. Default is 1.
- evaluate_data(in_folder, methods=['QUAD'], agnostic=True)#
Evaluate data in a given folder.
- Parameters:
in_folder (str) – The path to the input folder.
methods (list) – The list of evaluation methods. Default is ['QUAD'].
agnostic (bool) – Flag indicating whether the evaluation should be done agnostically. Default is True.
- Returns:
None
- Return type:
None
- Raises:
Exception – If the evaluation of a specific file fails.
- Example:
evaluation_facade = EvaluationFacade()
evaluation_facade.evaluate_data('/path/to/data', methods=['QUAD'], agnostic=True)
- class lib.data_handling.data_analysis.Study(study_type='QUAD', collect_artefacts=True)#
Class to cluster the analysis of an ensemble of experiment objects.
Initialization
Initialize the Study object.
- Parameters:
study_type (str) – The type of study to perform. Default is 'QUAD'.
collect_artefacts (bool) – Flag indicating whether to collect artifacts. Default is True.
- perform(experiments, agnostic=True)#
Perform an evaluation of the conducted experiment.
- Parameters:
experiments (list) – List of experiment objects to evaluate.
agnostic (bool) – Whether to discard all prior knowledge (True) or to employ it (False). This only affects the MLE analysis.
- Returns:
None
- Return type:
None
- Raises:
Exception – If the experiment object is invalid for non-agnostic analysis.
- Example:
study = Study()
study.perform(experiments_list, agnostic=True)
- _get_metrics(experiments)#
Calculates metrics for several experiments to evaluate the success of the experiment, mainly in terms of photon numbers.
- Parameters:
experiments (list) – List of experiments.
- Returns:
None. Modifies self.results.
- Example:
study = Study()
study._get_metrics(experiments_list)
- class lib.data_handling.data_analysis.AnalysisRoutines(agnostic=True, collect_artefacts=True)#
Class to handle various analysis routines.
Initialization
Initialize the AnalysisRoutines object.
- Parameters:
agnostic (bool) – Flag indicating whether the analysis should be agnostic. Default is True.
collect_artefacts (bool) – Flag indicating whether to collect artifacts. Default is True.
- MLE_analysis(experiments)#
Perform a Maximum Likelihood Estimation (MLE) on distance parameters.
This method estimates distance parameters using MLE based on the provided experiments. It calculates background estimates, kappa values, and performs fitting for different molecule counts.
- Parameters:
experiments (list) – List of experiments for analysis.
- Returns:
None. Modifies self.results and self.artefacts with analysis results.
- Example:
routines = AnalysisRoutines()
routines.MLE_analysis(experiments_list)
- POLY_analysis(experiments, method)#
Perform NALM analysis of a sorted list of experiments, estimating the distance via the center-of-mass shift after bleaching steps. The position of the minimum is extracted from a quadratic approximation near the minimum and combined with an estimate from the quadratic estimator.
- Parameters:
experiments (list) – Sorted list of experiments for analysis.
method (str) – The analysis method, one of ['MIN-POLY', 'MIN-QUAD', 'MAX-QUAD'].
- Returns:
None. Modifies self.results and self.artefacts with analysis results.
- Example:
routines = AnalysisRoutines()
routines.POLY_analysis(experiments_list, method='MIN-POLY')
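The quadratic approximation near the minimum described for POLY_analysis can be illustrated with a minimal sketch. It assumes NumPy, and `quadratic_minimum` is a hypothetical helper that is not part of this module; the actual routine may work differently.

```python
import numpy as np

def quadratic_minimum(pos, counts):
    """Hypothetical helper: fit a parabola and return the position of its vertex."""
    # Fit counts ~ a*pos**2 + b*pos + c (np.polyfit returns the highest order first).
    a, b, c = np.polyfit(pos, counts, deg=2)
    if a <= 0:
        raise ValueError("fitted parabola has no minimum")
    return -b / (2.0 * a)

# Synthetic, noise-free line scan with its minimum at x = 0.3.
pos = np.linspace(-1.0, 1.0, 21)
counts = 50.0 * (pos - 0.3) ** 2 + 2.0
x_min = quadratic_minimum(pos, counts)
```

On real photon counts the fit window would have to be restricted to the region where the parabola is a reasonable approximation.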
- HARMONIC_analysis(experiments, method)#
Perform harmonic analysis on a list of experiments.
- Parameters:
experiments (list) – List of experiments for harmonic analysis.
method (str) – The harmonic analysis method, one of ['CORR', 'FOURIER', 'HARMONIC'].
- Returns:
None. Modifies self.results and self.artefacts with analysis results.
- Example:
routines = AnalysisRoutines()
routines.HARMONIC_analysis(experiments_list, method='CORR')
- KAPPA_analysis(experiments, method, fixed_curvature=False)#
Perform NALM analysis of a sorted list of experiments, estimating the distance via the center-of-mass shift after bleaching steps. The position of the minimum is extracted from a quadratic approximation near the minimum and combined with an estimate from the quadratic estimator.
- Parameters:
experiments (list) – Sorted list of experiments for analysis.
method (str) – The analysis method.
fixed_curvature (bool) – Flag indicating whether to use a fixed curvature value. Default is False.
- Returns:
None. Modifies self.results with analysis results.
- Example:
routines = AnalysisRoutines()
routines.KAPPA_analysis(experiments_list, method='your_method', fixed_curvature=False)
- WINDOW_analysis(experiments, estimator='harmonic')#
Use a sliding window of the data to perform an analysis on a list of experiments for different spatial areas around the minimum, i.e. extract photons from different regions of a full line scan.
- Parameters:
experiments (list) – List of experiments for window analysis.
estimator (str) – The estimator used for the window analysis. Default is 'harmonic'.
- Returns:
None. Modifies self.results and self.artefacts with analysis results.
- Example:
routines = AnalysisRoutines()
routines.WINDOW_analysis(experiments_list)
- _get_fixed_curvature(exp, fit_dict)#
Obtain the average curvature of the minimum in the experiment.
- Parameters:
exp – Experiment object for curvature estimation.
fit_dict (dict) – Fit dictionary containing the parameters for the fit.
- Returns:
Array of curvatures, one for each axis.
- _get_background_estimate(experiments)#
Estimate background from the 0M experiment and set the corresponding parameter in the parameter dictionary.
- Parameters:
experiments (list) – List of experiments for background estimation.
- Returns:
Numpy array representing the estimated background.
- _get_kappa_estimate(experiments, fit_dict, n=1, kap0=None)#
Estimate the quality of the minimum from the 1M experiment and set the corresponding parameter in the parameter dictionary.
- Parameters:
experiments (list) – List of experiments for kappa estimation.
fit_dict (dict) – Fit dictionary containing the parameters for the fit.
n (int) – Number of molecules in the experiment. Default is 1.
kap0 (numpy.ndarray) – Initial value for kappa. Default is None.
- Returns:
Numpy array representing the estimated kappa.
- class lib.data_handling.data_analysis.Fit(agnostic=True, collect_artefacts=True)#
Class that contains fit routine to estimate positions of molecules from experimental record. Every new fit needs the initialization of a new fit routine to clear results.
Initialization
Initialize the Fit class.
- Parameters:
agnostic (bool) – Flag indicating whether the fit should be agnostic.
collect_artefacts (bool) – Flag indicating whether to collect artifacts during fit.
- do_fit(exp, fit_dict, check_residuals=True)#
Method to fit experiments either line-wise or globally. Results are accessible via self.results.
- Parameters:
exp (Experiment) – Experimental data.
fit_dict (dict) – Dictionary containing fit parameters.
check_residuals (bool) – Flag indicating whether to check residuals after fitting.
- Returns:
Tuple containing solution dictionary and artifacts.
- Return type:
tuple
- _do_global_fit(experiment, param_dict={}, estimator=None)#
Fit one experiment globally (not line-wise). Suitable for 1M and 2M experiments.
- Parameters:
experiment (Experiment) – Experimental data.
param_dict (dict) – Dictionary containing fit parameters.
estimator (str) – Estimation method.
- Raises:
Exception – If no estimator is provided.
- Returns:
Tuple containing solution dictionary and fit array.
- Return type:
tuple
- _do_local_fit(experiment, param_dict={}, estimator='quadratic', max_lines=np.inf)#
Find the position(s) of the molecule(s) in an experiment. Perform a line-wise fit by fitting each line of the record separately with MLE, quadratic, and FE.
- Parameters:
experiment (Experiment) – The experiment to be fitted.
param_dict (dict) – Dictionary of parameter dictionaries for each line. {'0': {'FWHM': 1., …}, '1': …}
estimator (str) – List of estimators to be applied.
max_lines (int) – Maximum number of lines to fit (default is infinity).
- Returns:
Dictionary of fit results for each line.
- Return type:
dict
- _line_to_nan_mapping(dict, key, line, start, end, lines, block_size)#
Remove redundant information depending on axis of line.
Substitutes value of local fit by nan if it is not a fit of the corresponding axis.
- Parameters:
dict (dict) – The dictionary containing fit information.
key (str) – The key corresponding to the axis.
line (int) – The line number.
start (int) – The start index.
end (int) – The end index.
lines (int) – The total number of lines.
block_size (int) – The block size.
- Returns:
The modified dictionary.
- Return type:
dict
- check_residuals(exp, fit_arr, estimator, scope)#
Evaluate the residuals with respect to the original full fit model, with optional visualization.
The estimate might have been obtained from a different model, e.g., quadratic or Fourier estimate.
- Parameters:
exp (Experiment) – The experimental data.
fit_arr (np.ndarray) – The array containing fitted values.
estimator (str) – The estimation method used.
scope (str) – The scope of the fit ('global' or 'local').
- Returns:
Chi2 value for each axis, normalized by the number of pixels, i.e., the average chi2 per pixel of that axis.
- Return type:
list
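As an illustration of a per-axis chi2 normalized by pixel count, the following sketch computes a Pearson chi-square for each axis. The flat data layout with one axis label per pixel, the Poisson variance assumption, and the helper name `chi2_per_pixel` are all assumptions made for this example and are not taken from the module.

```python
import numpy as np

def chi2_per_pixel(data, model, axis_labels):
    """Hypothetical helper: Pearson chi-square per axis, divided by pixel count."""
    data = np.asarray(data, dtype=float)
    model = np.asarray(model, dtype=float)
    labels = np.asarray(axis_labels)
    result = []
    for axis in np.unique(labels):  # np.unique returns the labels sorted
        mask = labels == axis
        # Assumption: Poisson-distributed counts, so the variance equals the model mean.
        chi2 = np.sum((data[mask] - model[mask]) ** 2 / model[mask])
        result.append(float(chi2 / mask.sum()))
    return result

data = [12, 8, 10, 21, 19]
model = [10, 10, 10, 20, 20]
axes = ["x", "x", "x", "y", "y"]
chi2 = chi2_per_pixel(data, model, axes)
```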
- get_MLE(exp, estimator, param_dict={}, show=False)#
Perform Maximum Likelihood Estimation (MLE) to obtain fit results for the given experiment.
- Parameters:
exp (Experiment) – The experimental data.
estimator (str) – The estimator to be used for MLE.
param_dict (dict) – Dictionary of additional parameters for the estimation (default is an empty dictionary).
show (bool) – Flag indicating whether to display the fitting results (default is False).
- Returns:
Dictionary containing the MLE fit results and the masked model.
- Return type:
dict
- get_taylor_estimate(exp, estimator, param_dict={}, show=False)#
Obtain fit results using Taylor series expansion-based estimation.
- Parameters:
exp (Experiment) – The experimental data.
estimator (str) – The estimator to be used for Taylor series expansion.
param_dict (dict) – Dictionary of additional parameters for the estimation (default is an empty dictionary).
show (bool) – Flag indicating whether to display the fitting results (default is False).
- Returns:
Dictionary containing the fit results and the masked model.
- Return type:
dict
- get_harmonic_estimate(exp, estimator, param_dict={}, show=False)#
Obtain fit results using harmonic estimation.
- Parameters:
exp (Experiment) – The experimental data.
estimator (str) – The estimator to be used for harmonic estimation ('fourier', 'correlate', 'harmonic').
param_dict (dict) – Dictionary of additional parameters for the estimation (default is an empty dictionary).
show (bool) – Flag indicating whether to display the fitting results (default is False).
- Returns:
Dictionary containing the fit results and the masked model.
- Return type:
dict
- _dict_to_line_mapping(line, lines, start, end, block_size, line_dict, glob_dict, key)#
Map values in each axis to the corresponding x or y line.
- Parameters:
line (int) – Current line number.
lines (int) – Maximum number of lines.
start (int) – Size of the starting block.
end (int) – Size of the ending block.
block_size (int) – Block size.
line_dict (dict) – Current parameter dictionary for the fit in the respective axis/line.
glob_dict (dict) – Dictionary with two-dimensional key.
key (str) – Key for which values are assigned to the current axis/line.
- Returns:
Updated dictionary with assigned values for the current axis/line.
- Return type:
dict
- class lib.data_handling.data_analysis.Estimators#
Class that provides different estimators to evaluate the shape of the intensity minimum.
Initialization
- get_taylorE(xdata, ydata, estimator, param_dict={})#
Estimate parameters using the Taylor expansion of the full harmonic model.
- Parameters:
xdata (array-like) – Independent variable data.
ydata (array-like) – Dependent variable data.
estimator (str) – The type of estimator ('min-poly', 'min-quad', 'max-quad').
param_dict (dict) – Dictionary of additional parameters for the estimation (default is an empty dictionary).
- Returns:
Tuple containing the estimated parameters, success flag, and the fitted model.
- Return type:
tuple
- get_MLE(xdata, ydata, estimator, param_dict={})#
Maximum Likelihood Estimation (MLE) estimator for distance estimate.
- Parameters:
xdata (array-like) – Independent variable data.
ydata (array-like) – Dependent variable data.
estimator (str) – The type of estimator ('min-poly', 'min-quad', 'max-quad', 'harmonic').
param_dict (dict) – Dictionary of additional parameters for the estimation (default is an empty dictionary).
- Returns:
Tuple containing the estimated parameters and a success flag.
- Return type:
tuple
- get_CE(counts, show=False)#
Correlative estimator to determine phase shift and amplitude of harmonic signal.
This estimator is independent of the wavelength but assumes that the data cover one full period of the signal. By correlating the data with a harmonic signal over that period, it extracts the phase shift and amplitude that fit the signal best.
- Parameters:
counts (array-like) – Array containing the signal data.
show (bool) – Flag to display plots of the original data, pure cosine, and residuals (default is False).
- Returns:
Tuple containing the phase shift and amplitude.
- Return type:
tuple
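The correlation idea can be sketched in a few lines of plain Python. The sketch assumes equally spaced samples covering exactly one period and the sign convention counts ~ offset + amplitude * cos(phase - shift); `correlative_estimate` is a hypothetical name, and the module's own implementation may differ.

```python
import math

def correlative_estimate(counts):
    """Hypothetical helper: phase shift and amplitude from one full period."""
    k = len(counts)
    phases = [2.0 * math.pi * i / k for i in range(k)]
    # Correlate with cosine and sine; a constant offset averages out over a full period.
    c = 2.0 / k * sum(y * math.cos(p) for y, p in zip(counts, phases))
    s = 2.0 / k * sum(y * math.sin(p) for y, p in zip(counts, phases))
    return math.atan2(s, c), math.hypot(c, s)

# Synthetic, noise-free signal with known phase shift and amplitude.
true_shift, true_amp = 0.7, 5.0
signal = [20.0 + true_amp * math.cos(2.0 * math.pi * i / 32 - true_shift) for i in range(32)]
shift, amp = correlative_estimate(signal)
```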
- get_HE(xdata, ydata, param_dict={}, show=False)#
Get harmonic estimator, i.e. simple sinusoidal fit of phase scan.
- Parameters:
xdata (array-like) – Array containing the phase data.
ydata (array-like) – Array containing the photon counts.
param_dict (dict) – Dictionary of additional parameters for the estimator (default is an empty dictionary).
show (bool) – Flag to display plots of the original data, pure cosine, and residuals (default is False).
- Returns:
Tuple containing the solution vector and a success flag.
- Return type:
tuple
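A simple sinusoidal fit of a phase scan can be written as a linear least-squares problem, because offset + c*cos(phase) + s*sin(phase) is linear in its three coefficients. The sketch below assumes NumPy and that xdata already holds phases in radians; `harmonic_fit` is a hypothetical helper, and the module may instead use a nonlinear optimizer.

```python
import numpy as np

def harmonic_fit(phase, counts):
    """Hypothetical helper: offset, amplitude and phase shift of a sinusoid."""
    phase = np.asarray(phase, dtype=float)
    counts = np.asarray(counts, dtype=float)
    # Design matrix for counts ~ offset + c*cos(phase) + s*sin(phase).
    design = np.column_stack([np.ones_like(phase), np.cos(phase), np.sin(phase)])
    (offset, c, s), *_ = np.linalg.lstsq(design, counts, rcond=None)
    # Convert to amplitude * cos(phase - shift).
    return float(offset), float(np.hypot(c, s)), float(np.arctan2(s, c))

# Irregularly sampled, noise-free phase scan.
phase = np.array([0.1, 0.9, 1.7, 2.8, 4.0, 5.5])
counts = 12.0 + 3.0 * np.cos(phase - 1.2)
offset, amplitude, shift = harmonic_fit(phase, counts)
```

Unlike a correlation over a full period, this formulation does not require equally spaced phases.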
- get_FE(counts, L, K, show=False)#
Get 1D Fourier estimator (only for one full phase scan line!).
Call only on 1M and 2M experiments!
- Parameters:
counts (array-like) – Array containing photon counts.
L (float) – Length of the signal.
K (int) – Number of samples in the signal.
show (bool) – Flag to display plots of the full FFT spectrum, cleaned signal, and residuals (default is False).
- Returns:
Tuple containing the phase and amplitude of the selected frequency.
- Return type:
tuple
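A one-dimensional Fourier estimate of phase and amplitude might look like the sketch below. It assumes NumPy, equally spaced samples spanning a length `length`, and that the strongest non-constant frequency bin is the one of interest; `fourier_estimate` is a hypothetical helper, and how the module selects the frequency is not documented here.

```python
import numpy as np

def fourier_estimate(counts, length):
    """Hypothetical helper: phase, amplitude and frequency of the dominant harmonic."""
    counts = np.asarray(counts, dtype=float)
    k = counts.size
    spectrum = np.fft.rfft(counts)
    # Skip the zero-frequency bin, which only carries the constant offset.
    idx = 1 + int(np.argmax(np.abs(spectrum[1:])))
    amplitude = 2.0 * np.abs(spectrum[idx]) / k
    # Sign convention: counts ~ offset + amplitude * cos(2*pi*f*x - phase).
    phase = -np.angle(spectrum[idx])
    frequency = np.fft.rfftfreq(k, d=length / k)[idx]
    return phase, amplitude, frequency

# One full period of a noise-free cosine sampled at K points over length L.
K, L = 64, 2.0
x = np.arange(K) * L / K
counts = 30.0 + 4.0 * np.cos(2.0 * np.pi * x / L - 0.5)
phase, amplitude, frequency = fourier_estimate(counts, L)
```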
- min_hexa_model(x, params, **kwargs)#
Model based on a sixth-order polynomial.
- Parameters:
x (array_like) – Input values.
params (tuple) – Model parameters (m0, m1, b).
kwargs (dict) – Additional keyword arguments.
- Returns:
Model values.
- Return type:
array_like
- min_poly_model(x, params, **kwargs)#
Model based on a fourth-order polynomial.
- Parameters:
x (array_like) – Input values.
params (tuple) – Model parameters (m0, m1, b).
kwargs (dict) – Additional keyword arguments.
- Returns:
Model values.
- Return type:
array_like
- min_quad_model(x, params, **kwargs)#
Model based on a quadratic polynomial.
- Parameters:
x (array_like) – Input values.
params (tuple) – Model parameters (m0, m1, b).
kwargs (dict) – Additional keyword arguments.
- Returns:
Model values.
- Return type:
array_like
- max_quad_model(x, params, **kwargs)#
Model based on a quadratic polynomial to fit a maximum.
- Parameters:
x (array_like) – Input values.
params (tuple) – Model parameters (m0, m1, b).
kwargs (dict) – Additional keyword arguments.
- Returns:
Model values.
- Return type:
array_like
- harmonic_model(x, params, **kwargs)#
Harmonic model.
- Parameters:
x (array_like) – Input values.
params (tuple) – Model parameters (m0, m1, b).
kwargs (dict) – Additional keyword arguments.
- Returns:
Model values.
- Return type:
array_like
- lsqs_objective(x, y, model, params, **kwargs)#
Least squares objective function.
- Parameters:
x (array_like) – Input values.
y (array_like) – Target values.
model (callable) – Model function.
params (array_like) – Model parameters.
kwargs (dict) – Additional keyword arguments.
- Returns:
Objective value.
- Return type:
float
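For orientation, a least-squares objective of this shape can be sketched as follows. The parabola model and the interpretation of (m0, m1, b) as curvature, minimum position and offset are assumptions made for the example; the module's models may define these parameters differently.

```python
def quad_model(x, params):
    """Hypothetical model: parabola with curvature m0, minimum position m1, offset b."""
    m0, m1, b = params
    return [m0 * (xi - m1) ** 2 + b for xi in x]

def least_squares(x, y, model, params):
    """Sum of squared residuals between the data and the model prediction."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, model(x, params)))

x = [-1.0, 0.0, 1.0]
y = [5.0, 1.0, 1.0]
loss = least_squares(x, y, quad_model, (2.0, 0.5, 0.0))
```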
- loglike_objective(x, y, model, params, **kwargs)#
Log-likelihood objective function.
- Parameters:
x (array_like) – Input values.
y (array_like) – Target values.
model (callable) – Model function.
params (array_like) – Model parameters.
kwargs (dict) – Additional keyword arguments.
- Returns:
Objective value.
- Return type:
float
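A log-likelihood objective for photon counts is commonly based on Poisson statistics. The sketch below assumes that noise model and returns the negative log-likelihood so that it can be minimized; whether the module uses the same sign convention and noise model is not stated in this reference, and `poisson_nll` and `flat_model` are hypothetical names.

```python
import math

def poisson_nll(x, y, model, params):
    """Negative Poisson log-likelihood of counts y given the model means."""
    total = 0.0
    for yi, mu in zip(y, model(x, params)):
        # log P(y | mu) = y*log(mu) - mu - log(y!)
        total -= yi * math.log(mu) - mu - math.lgamma(yi + 1.0)
    return total

def flat_model(x, params):
    """Hypothetical model: constant expected count."""
    return [params[0] for _ in x]

x = [0, 1, 2, 3]
y = [3, 5, 4, 4]
nll_at_mean = poisson_nll(x, y, flat_model, (4.0,))
nll_low = poisson_nll(x, y, flat_model, (3.0,))
nll_high = poisson_nll(x, y, flat_model, (6.0,))
```

For a constant model the objective is smallest at the sample mean of the counts, which is 4.0 in this example.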
- objective_jac(objective, params)#
Compute the Jacobian of an objective function.
- Parameters:
objective (callable) – Objective function.
params (array_like) – Model parameters.
- Returns:
Jacobian matrix.
- Return type:
array_like
- objective_hess(x, y, objective, params, **kwargs)#
Compute the Hessian matrix of an objective function.
- Parameters:
x (array_like) – Input values.
y (array_like) – Target values.
objective (callable) – Objective function.
params (array_like) – Model parameters.
kwargs (dict) – Additional keyword arguments.
- Returns:
Hessian matrix.
- Return type:
array_like
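Derivatives of an objective can be approximated numerically when no analytic form is available. The sketch below uses central finite differences in plain Python; it is an illustration of the general technique, not the method used by objective_jac or objective_hess, and all function names are hypothetical.

```python
def shifted(params, index, delta):
    """Return a copy of params with one entry shifted by delta."""
    out = list(params)
    out[index] += delta
    return out

def numerical_gradient(f, params, h=1e-5):
    """Central-difference gradient of a scalar function f at params."""
    return [
        (f(shifted(params, i, h)) - f(shifted(params, i, -h))) / (2.0 * h)
        for i in range(len(params))
    ]

def numerical_hessian(f, params, h=1e-4):
    """Hessian from central differences of the numerical gradient."""
    n = len(params)
    columns = []
    for j in range(n):
        g_up = numerical_gradient(f, shifted(params, j, h), h)
        g_down = numerical_gradient(f, shifted(params, j, -h), h)
        columns.append([(u - d) / (2.0 * h) for u, d in zip(g_up, g_down)])
    # columns[j][i] holds d2f / (dp_i dp_j); transpose to row-major order.
    return [[columns[j][i] for j in range(n)] for i in range(n)]

def objective(p):
    """Example objective with known derivatives."""
    return p[0] ** 2 + 3.0 * p[0] * p[1] + 2.0 * p[1] ** 2

grad = numerical_gradient(objective, [1.0, 2.0])
hess = numerical_hessian(objective, [1.0, 2.0])
```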
- get_initial_guess(x, y, estimator, params0=None, constr=None, **kwargs)#
Retrieve an initial guess for model parameters.
- Parameters:
x (array_like) – Input values.
y (array_like) – Target values.
estimator (str) – Estimation method.
params0 (array_like, optional) – Initial guess for parameters.
constr (LinearConstraint, optional) – Constraints on parameters.
kwargs (dict) – Additional keyword arguments.
- Returns:
Initial guess for parameters.
- Return type:
array_like
- _estimate_initial_values(x, y)#
Estimate initial values for model parameters.
- Parameters:
x (array_like) – Input values.
y (array_like) – Target values.
- Returns:
Initial values for parameters.
- Return type:
array_like
- _find_ext(estimator, p, w=None)#
Find the extremum in 1D via Kernel Density Estimation (KDE) and differential evolution.
- Parameters:
estimator (str) – Estimator to decide whether to find a maximum or minimum.
p (array_like) – Array of positions.
w (array_like, optional) – Photons used as weights for the KDE.
- Returns:
Position of the maximum density.
- Return type:
array_like
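The idea of locating a density maximum with a weighted kernel density estimate can be sketched as below. Note that this sketch replaces differential evolution with a plain grid search for simplicity, uses a fixed Gaussian bandwidth, and only handles the maximum case; `kde_maximum` and its parameters are hypothetical.

```python
import numpy as np

def kde_maximum(positions, weights=None, bandwidth=0.1, grid_points=2001):
    """Hypothetical helper: position of the maximum of a weighted Gaussian KDE."""
    positions = np.asarray(positions, dtype=float)
    if weights is None:
        weights = np.ones_like(positions)
    weights = np.asarray(weights, dtype=float)
    grid = np.linspace(positions.min(), positions.max(), grid_points)
    # Sum of weighted Gaussian kernels centred on each position.
    # Normalization is omitted because it does not change the location of the maximum.
    z = (grid[:, None] - positions[None, :]) / bandwidth
    density = (weights[None, :] * np.exp(-0.5 * z ** 2)).sum(axis=1)
    return float(grid[np.argmax(density)])

positions = [0.0, 0.95, 1.0, 1.05, 2.0]
peak = kde_maximum(positions)
# Photon numbers used as weights can move the maximum to a different cluster.
weighted_peak = kde_maximum(positions, weights=[10, 1, 1, 1, 1])
```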
- class lib.data_handling.data_analysis.Results#
Initialization
- append(obj)#
Merge two objects by appending their attributes.
Attributes have to be lists!
- Parameters:
obj (object) – The object to be appended.
- Returns:
None
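Merging two objects by appending their list attributes can be illustrated with a small stand-in class. `ResultsSketch` is hypothetical; in particular, how the real Results class treats attributes that exist on only one of the two objects is an assumption here.

```python
class ResultsSketch:
    """Hypothetical stand-in for a results container whose attributes are lists."""

    def __init__(self, **attributes):
        for name, values in attributes.items():
            setattr(self, name, list(values))

    def append(self, other):
        # Extend every shared list attribute; adopt attributes that are new.
        for name, values in vars(other).items():
            if not isinstance(values, list):
                raise TypeError(f"attribute {name!r} is not a list")
            if hasattr(self, name):
                getattr(self, name).extend(values)
            else:
                setattr(self, name, list(values))

first = ResultsSketch(distance=[1.0, 2.0], chi2=[0.9])
second = ResultsSketch(distance=[3.0], photons=[120])
first.append(second)
```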
- class lib.data_handling.data_analysis.MinfluxAnalysis#
Initialization
- fit_chunk(input_df, estimator='min-quad', plot=False, output=None, **kwargs)#
Fit a chunk of data using the specified estimator.
- Parameters:
input_df (pd.DataFrame) – Input DataFrame containing 'photons', 'pos', 'weights', and 'time' columns.
estimator (str) – Estimation method (default is 'min-quad').
plot (bool) – Flag indicating whether to plot the fit results (default is False).
output (str or None) – Output file or path for the plot (default is None).
kwargs (dict) – Additional keyword arguments to be passed to the estimator.
- Returns:
Series containing fit results.
- Return type:
pd.Series
- _plot_fit(df, fit_vals, output=None)#
Plot the fit and residuals of the given DataFrame.
- Parameters:
df (pd.DataFrame) – DataFrame containing 'photons', 'pos', 'weights', 'fit', and 'residuals' columns.
fit_vals (np.ndarray) – Fit values obtained from the fitting procedure.
output (str or None) – Output file or path for saving the plot.
- assign_chunk_id(df, mode='tuple', chunk_size=50, max_chunks=10, overlap=0.0, bin_size=50, **kwargs)#
Assign chunk IDs to the given DataFrame based on the specified mode and chunking parameters.
- Parameters:
df (pd.DataFrame) – Input DataFrame containing relevant data.
mode (str, optional) – Chunking mode ('tuple' or 'photons').
chunk_size (int, optional) – Size of each chunk.
max_chunks (int, optional) – Maximum number of chunks.
overlap (float, optional) – Overlap percentage between chunks.
bin_size (int, optional) – Bin size for assigning bin IDs to chunks.
kwargs – Additional keyword arguments.
- Returns:
DataFrame with assigned chunk IDs.
- Return type:
pd.DataFrame
- grid_data(df, num_points=100)#
Grid the given DataFrame to interpolate and create a new DataFrame with a specified number of points.
- Parameters:
df (pd.DataFrame) – Input DataFrame containing 'pos' and 'photons' columns.
num_points (int, optional) – Number of points for the new grid.
- Returns:
Gridded DataFrame with interpolated ‘photons’ values.
- Return type:
pd.DataFrame
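Gridding onto a regular set of positions can be sketched with NumPy arrays in place of a DataFrame. Linear interpolation and the helper name `grid_photons` are assumptions made for this example; the interpolation scheme used by grid_data is not specified in this reference.

```python
import numpy as np

def grid_photons(pos, photons, num_points=100):
    """Hypothetical helper: interpolate photons onto a regular position grid."""
    pos = np.asarray(pos, dtype=float)
    photons = np.asarray(photons, dtype=float)
    order = np.argsort(pos)  # np.interp requires increasing x values
    grid = np.linspace(pos.min(), pos.max(), num_points)
    return grid, np.interp(grid, pos[order], photons[order])

# Unsorted positions are handled by sorting before interpolation.
pos = [0.0, 2.0, 1.0]
photons = [10.0, 30.0, 20.0]
grid, gridded = grid_photons(pos, photons, num_points=5)
```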