# 4.2.1. `discorpy.prep.preprocessing`

Module of pre-processing methods:

• Normalize, binarize an image.

• Determine the median dot-size, median distance between two nearest dots, and the slopes of grid-lines of a dot-pattern image.

• Remove non-dot objects or misplaced dots.

• Group dot-centroids into horizontal lines and vertical lines.

• Calculate a threshold value for binarizing.

Functions:

• `normalization`(mat[, size]) – Correct a non-uniform background of an image using the median filter.

• `normalization_fft`(mat[, sigma, pad, mode]) – Correct a non-uniform background image using a Fourier Gaussian filter.

• `binarization`(mat[, ratio, thres, denoise]) – Apply a list of operations: binarizing a 2D array; inverting the contrast of dots if needed; removing border components; cleaning salty noise; and filling holes.

• `check_num_dots`(mat) – Check whether the number of dots is too small for a parabolic fit.

• `calc_size_distance`(mat[, ratio]) – Find the median size of dots and the median distance between two nearest dots.

• `select_dots_based_size`(mat, dot_size[, ratio]) – Select dots having a certain size.

• `select_dots_based_ratio`(mat[, ratio]) – Select dots for which the ratio between the axis lengths of the fitted ellipse is smaller than a threshold.

• `select_dots_based_distance`(mat, dot_dist[, ...]) – Select dots having a certain range of distance to their neighbouring dots.

• `calc_hor_slope`(mat[, ratio]) – Calculate the slope of horizontal lines against the horizontal axis.

• `calc_ver_slope`(mat[, ratio]) – Calculate the slope of vertical lines against the vertical axis.

• `group_dots_hor_lines`(mat, slope, dot_dist[, ...]) – Group dots into horizontal lines.

• `group_dots_ver_lines`(mat, slope, dot_dist[, ...]) – Group dots into vertical lines.

• `remove_residual_dots_hor`(list_lines, slope) – Remove dots having distances larger than a certain value from fitted horizontal parabolas.

• `remove_residual_dots_ver`(list_lines, slope) – Remove dots having distances larger than a certain value from fitted vertical parabolas.

• `calculate_threshold`(mat[, bgr, snr]) – Calculate a threshold value based on Algorithm 4 in Ref.

discorpy.prep.preprocessing.normalization(mat, size=51)

Correct a non-uniform background of an image using the median filter.

Parameters:
• mat (array_like) – 2D array.

• size (int) – Size of the median filter.

Returns:

array_like – 2D array. Corrected background.
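
To illustrate the idea of median-filter background correction, here is a minimal pure-Python sketch on a 1D intensity profile. It is not discorpy's implementation: the helper names `running_median` and `normalize_profile` are hypothetical, and dividing by the smoothed background and rescaling by its mean is an assumption about how such a correction is typically done.

```python
from statistics import median


def running_median(profile, size):
    """Median of a sliding window; the window is truncated at the edges."""
    half = size // 2
    return [median(profile[max(0, i - half): i + half + 1])
            for i in range(len(profile))]


def normalize_profile(profile, size=5):
    """Divide by the smoothed background, then rescale by its mean value."""
    background = running_median(profile, size)
    mean_bck = sum(background) / len(background)
    return [mean_bck * val / bck for val, bck in zip(profile, background)]


# A brightness ramp (the non-uniform background) with one dark dot at index 4.
profile = [10.0, 11.0, 12.0, 13.0, 2.0, 15.0, 16.0, 17.0, 18.0]
corrected = normalize_profile(profile, size=5)
```

The filter window has to be larger than a dot, otherwise the dot itself is treated as background and flattened away; this is presumably why the default `size` is fairly large.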

discorpy.prep.preprocessing.normalization_fft(mat[, sigma, pad, mode])

Correct a non-uniform background image using a Fourier Gaussian filter.

Parameters:
• mat (array_like) – 2D array.

• sigma (int) – Sigma of the Gaussian.

• pad (int) – Pad width.

• mode (str) – Padding mode.

Returns:

array_like – 2D array. Corrected background image.

discorpy.prep.preprocessing.binarization(mat, ratio=0.3, thres=None, denoise=True)

Apply a list of operations: binarizing a 2D array; inverting the contrast of dots if needed; removing border components; cleaning salty noise; and filling holes.

Parameters:
• mat (array_like) – 2D array.

• ratio (float) – Used to select the ROI around the middle of the image for calculating the threshold.

• thres (float, optional) – Threshold for binarizing. Automatically calculated if None.

• denoise (bool, optional) – Apply denoising to the image if True.

Returns:

array_like – 2D binary array.

discorpy.prep.preprocessing.check_num_dots(mat)

Check whether the number of dots is too small for a parabolic fit.

Parameters:

mat (array_like) – 2D binary array.

Returns:

bool – True if there are not enough dots.

discorpy.prep.preprocessing.calc_size_distance(mat, ratio=0.3)

Find the median size of dots and the median distance between two nearest dots.

Parameters:
• mat (array_like) – 2D binary array.

• ratio (float) – Used to select the ROI around the middle of an image.

Returns:

• dot_size (float) – Median size of the dots.

• dot_dist (float) – Median distance between two nearest dots.
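
The "median distance between two nearest dots" can be pictured with a short sketch: for every dot centroid, take the distance to its closest neighbour, then take the median over all dots. This is an illustration only. The helper `median_nearest_distance` is hypothetical and works on a list of (y, x) centroids, whereas the documented function takes a binary image.

```python
from math import dist
from statistics import median


def median_nearest_distance(points):
    """Median over all points of the distance to the closest other point."""
    nearest = []
    for i, point in enumerate(points):
        nearest.append(min(dist(point, other)
                           for j, other in enumerate(points) if j != i))
    return median(nearest)


# Centroids of a perfect 4 x 4 grid with a pitch of 10 pixels.
grid = [(y, x) for y in range(0, 40, 10) for x in range(0, 40, 10)]
dot_dist = median_nearest_distance(grid)
```

Using the median rather than the mean keeps the estimate stable when a few dots are missing or when stray objects sit close to real dots.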

discorpy.prep.preprocessing.select_dots_based_size(mat, dot_size, ratio=0.3)

Select dots having a certain size.

Parameters:
• mat (array_like) – 2D binary array.

• dot_size (float) – Size of the standard dot.

• ratio (float) – Used to calculate the acceptable range: [dot_size - ratio*dot_size, dot_size + ratio*dot_size].

Returns:

array_like – 2D array. Selected dots.
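
The acceptable range given for `ratio` can be worked through with a small sketch. The helpers `size_range` and `select_by_size` are hypothetical and operate on a plain list of dot sizes rather than on a binary image; treating the range limits as inclusive is an assumption.

```python
def size_range(dot_size, ratio=0.3):
    """Return the (low, high) limits of the acceptable dot size."""
    return dot_size - ratio * dot_size, dot_size + ratio * dot_size


def select_by_size(sizes, dot_size, ratio=0.3):
    """Keep only the sizes that fall inside the acceptable range."""
    low, high = size_range(dot_size, ratio)
    return [size for size in sizes if low <= size <= high]


# With dot_size=100 and ratio=0.3 the acceptable range is about [70, 130].
sizes = [95, 100, 140, 60, 104]
selected = select_by_size(sizes, 100)
```

Here the sizes 140 and 60 fall outside the range and are dropped, which is how oversized blobs and small noise specks would be rejected.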

discorpy.prep.preprocessing.select_dots_based_ratio(mat, ratio=0.3)

Select dots for which the ratio between the axis lengths of the fitted ellipse is smaller than a threshold.

Parameters:
• mat (array_like) – 2D binary array.

• ratio (float) – Threshold value.

Returns:

array_like – 2D array. Selected dots.

discorpy.prep.preprocessing.select_dots_based_distance(mat, dot_dist, ratio=0.3)

Select dots having a certain range of distance to their neighbouring dots.

Parameters:
• mat (array_like) – 2D array.

• dot_dist (float) – Median distance of two nearest dots.

• ratio (float) – Used to calculate the acceptable range.

Returns:

array_like – 2D array. Selected dots.

discorpy.prep.preprocessing.calc_hor_slope(mat, ratio=0.3)

Calculate the slope of horizontal lines against the horizontal axis.

Parameters:
• mat (array_like) – 2D binary array.

• ratio (float) – Used to select the ROI around the middle of an image.

Returns:

float – Horizontal slope of the grid.

discorpy.prep.preprocessing.calc_ver_slope(mat, ratio=0.3)

Calculate the slope of vertical lines against the vertical axis.

Parameters:
• mat (array_like) – 2D binary array.

• ratio (float) – Used to select the ROI around the middle of an image.

Returns:

float – Vertical slope of the grid.

discorpy.prep.preprocessing.group_dots_hor_lines(mat, slope, dot_dist, ratio=0.3, num_dot_miss=6, accepted_ratio=0.65)

Group dots into horizontal lines.

Parameters:
• mat (array_like) – A binary image or a list of (y,x)-coordinates of points.

• slope (float) – Horizontal slope of the grid.

• dot_dist (float) – Median distance of two nearest dots.

• ratio (float) – Acceptable variation.

• num_dot_miss (int) – Acceptable number of missing dots between dot1 and dot2.

• accepted_ratio (float) – Used to select lines whose number of dots is equal to or larger than accepted_ratio times the maximum number of dots per line.

Returns:

list of array_like – List of 2D arrays. Each array holds the (y, x) coordinates of the dot-centroids belonging to the same group. The arrays may differ in length.
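
The effect of `accepted_ratio` can be sketched as a filter applied to lines that have already been grouped. The helper `select_long_lines` is hypothetical and covers only this final selection step, not the grouping itself.

```python
def select_long_lines(lines, accepted_ratio=0.65):
    """Keep lines with at least accepted_ratio * (max dots per line) dots."""
    max_dots = max(len(line) for line in lines)
    min_dots = accepted_ratio * max_dots
    return [line for line in lines if len(line) >= min_dots]


# Four grouped lines holding 10, 7, 6 and 3 dot-centroids as (y, x) pairs.
lines = [[(0, x) for x in range(10)],
         [(10, x) for x in range(7)],
         [(20, x) for x in range(6)],
         [(30, x) for x in range(3)]]
kept = select_long_lines(lines)
```

With the longest line holding 10 dots, the cut-off is 6.5 dots, so only the lines with 10 and 7 dots are kept. Short lines usually come from the image border or from partially detected rows.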

discorpy.prep.preprocessing.group_dots_ver_lines(mat, slope, dot_dist, ratio=0.3, num_dot_miss=6, accepted_ratio=0.65)

Group dots into vertical lines.

Parameters:
• mat (array_like) – A binary image or a list of (y,x)-coordinates of points.

• slope (float) – Vertical slope of the grid.

• dot_dist (float) – Median distance of two nearest dots.

• ratio (float) – Acceptable variation.

• num_dot_miss (int) – Acceptable number of missing dots between dot1 and dot2.

• accepted_ratio (float) – Used to select lines whose number of dots is equal to or larger than accepted_ratio times the maximum number of dots per line.

Returns:

list of array_like – List of 2D arrays. Each array holds the (y, x) coordinates of the dot-centroids belonging to the same group. The arrays may differ in length.

discorpy.prep.preprocessing.remove_residual_dots_hor(list_lines, slope, residual=2.5)

Remove dots having distances larger than a certain value from fitted horizontal parabolas.

Parameters:
• list_lines (list of array_like) – List of the coordinates of dot-centroids on horizontal lines.

• slope (float) – Horizontal slope of the grid.

• residual (float) – Acceptable distance, in pixels, between a dot and a fitted parabola.

Returns:

list of array_like – List of 2D arrays. Each array holds the (y, x) coordinates of the dot-centroids belonging to the same group. The arrays may differ in length.
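
The residual test can be sketched for a single horizontal line as follows: fit a parabola y = a*x^2 + b*x + c to the dot-centroids by least squares, then drop every dot whose vertical distance to the fit exceeds `residual`. This is a simplified illustration, not discorpy's code. The helpers `det3`, `fit_parabola` and `remove_outliers` are hypothetical, and the sketch ignores the `slope` argument and measures the distance along y only.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_parabola(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c via the normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    mat = [[s[4], s[3], s[2]], [s[3], s[2], s[1]], [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]
    det = det3(mat)
    coeffs = []
    for col in range(3):  # Cramer's rule: swap one column for the rhs
        m = [row[:] for row in mat]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(det3(m) / det)
    return coeffs  # a, b, c


def remove_outliers(line, residual=2.5):
    """Keep the (y, x) dots lying within `residual` pixels of the fit."""
    ys = [dot[0] for dot in line]
    xs = [dot[1] for dot in line]
    a, b, c = fit_parabola(xs, ys)
    return [(y, x) for y, x in line
            if abs(y - (a * x * x + b * x + c)) <= residual]


# Nine dots on a gentle parabola; the dot at x=40 is displaced by 8 pixels.
line = [(0.001 * x * x + 50.0, x) for x in range(0, 90, 10)]
line[4] = (line[4][0] + 8.0, 40)
cleaned = remove_outliers(line)
```

Because the outlier also pulls the fit towards itself, a very large displacement could push neighbouring dots over the limit as well; applying the filter to lines that have already been cleaned by size and distance keeps this effect small.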

discorpy.prep.preprocessing.remove_residual_dots_ver(list_lines, slope, residual=2.5)

Remove dots having distances larger than a certain value from fitted vertical parabolas.

Parameters:
• list_lines (list of array_like) – List of the coordinates of dot-centroids on vertical lines.

• slope (float) – Vertical slope of the grid.

• residual (float) – Acceptable distance, in pixels, between a dot and a fitted parabola.

Returns:

list of array_like – List of 2D arrays. Each array holds the (y, x) coordinates of the dot-centroids belonging to the same group. The arrays may differ in length.

discorpy.prep.preprocessing.calculate_threshold(mat, bgr='bright', snr=2.0)

Calculate a threshold value based on Algorithm 4 in Ref.

Parameters:
• mat (array_like) – 2D array.

• bgr ({“bright”, “dark”}) – To indicate the brightness of the background against image features.

• snr (float) – Ratio (>1.0) used to separate image features from noise. A greater value is less sensitive.

Returns:

float – Threshold value.

References