DEVELOPMENT OF ALGORITHMS FOR BIOMEDICAL IMAGE SEGMENTATION BASED ON PRELIMINARY MARKUP AND TEXTURE ATTRIBUTES

Introduction
Biomedical images are used for the diagnosis and treatment of malignant neoplasms. Biomedical images of normal and pathological cells and tissues are obtained using light microscopes. These images are the objects of study in histology and cytology.
To automate the analysis of histological images, systems of automated microscopy (SAM) are applied. SAM include hardware and software parts [1]. One of the most important stages in the automated measurement of optical and geometrical parameters of cells and tissues is the segmentation of micro objects in histological images [2, 3]. The complexity of the analysis lies in the high variability of parameters and the low contrast of micro objects.
Micro objects in histological images are the cells of the tissues of organs. Tissues are composed of rounded cells, which are arranged in layers. The size of the cells ranges from 0.5 to 1.2 μm. Micro objects in cytological images are cells that are randomly arranged.
An analysis of a histological image involves the following stages: obtaining an image; manual or automated segmentation; measurement of the size, shape, position and optical parameters of micro objects; classification; and statistical processing of the measurement results.
Histological images have the following features [4, 5]:
- weak contrast due to the use of low-resolution cameras;
- images contain micro objects surrounded by a background that is complex in its geometric and optical characteristics;
- a non-uniform background that results from incorrect configuration of the microscope illumination module;
- the brightness attributes of micro objects are the same as those of the background.
Image segmentation splits an image into regions with similar characteristics. The main characteristics used for segmentation are brightness for black-and-white images and color components for color images. Segmentation may also involve contour and texture attributes. The process of segmentation splits the image but does not identify separate segments or the relations between them [6].
At present, there are no universal methods for segmentation. In practice, a set of specialized methods is employed, namely those that are the most effective for a given class of tasks. Paper [2] proposed the following requirements for segmentation:
- regions should be homogeneous in terms of brightness or texture;
- the interior of a region should be simple in shape and have only small "openings";
- adjacent regions should differ significantly from each other in some preset characteristics;
- the contours of each region should be simple, continuous and spatially precise.
Improving the segmentation of cytological images with low contrast and a non-uniform background requires evaluating the attributes of points from neighboring homogeneous regions. Such an assessment makes it possible to form rules for growing regions following the preliminary image markup. Because the nuclei of cells are significantly darker than the surrounding background, detecting groups of cells in histological images requires considering the distribution of gray levels on the plane. We propose to estimate this distribution using spatial moments calculated for each point of the image. Application of first-order spatial moments will enable segmentation of blood vessels and ducts in the image.

Literature review and problem statement
There are several approaches to the classification of segmentation algorithms [8]. The approaches are based on the following criteria [9]: properties of points, regions, contours, a priori knowledge about micro objects, etc. Additional criteria for categorizing segmentation algorithms [10] are the image type (color, grayscale, binary) and the nature of the segmentation process (parallel or sequential processing) [13][14][15]. However, these characteristics are rather ambiguous. For example, threshold segmentation may run in parallel or sequentially and may process both binary and grayscale images [11]. This leads to ambiguity in the classification of segmentation algorithms [12]. The criteria specified below make it possible to categorize segmentation algorithms better.
Algorithms based on the properties of points. A decision on including a point in a homogeneous segment is made by analyzing the characteristics of the point (brightness, color components) [16, 17]. The advantage of this class of algorithms is speed, since the decision is made for each point separately. The drawback is the difficulty of selecting (estimating) preliminary information for images with a large number of homogeneous regions that have similar characteristics.
Algorithms based on texture properties [4]. A decision on including a point in a homogeneous region is made based on the similarity of texture attributes at the given point. This type of algorithm is recommended for images with repeating patterns.
A texture segmentation process [14][15][16][17] can be divided into 3 stages:
1) formation of a set of characteristics (attributes) for the original image, which creates a multidimensional feature space; each vector describes the attributes of a texture [7];
2) classification, where each point of the image is assigned to a certain class based on the estimation of its vector of attributes;
3) segmentation of the original image based on the information obtained after classification.
The advantage of this class of algorithms is the selection of segments with the same texture; the drawback is the complexity of the segmentation algorithms.
Algorithms based on the selection of the contour [5]. A decision on including a point in a homogeneous region is made by analyzing the characteristics of the contour point. The algorithms are recommended for images with characteristic changes in brightness along the region's contour. The advantage of this class of algorithms is speed. The disadvantages are discontinuities at the edges (for filtering algorithms) and the difficulty of identifying the a priori information (for active contour algorithms).
Systems for the analysis of biomedical images contain tools for manual detection of micro objects by selecting parameters for the segmentation algorithms [17]. However, academic studies have not addressed the problem of automating the tuning of these algorithms. During research, the results of the algorithms are checked visually. Minimizing manual operations in the selection of segmentation parameters and in the estimation of the quality of the performed segmentation would allow automated segmentation of particular types of images.

The aim and objectives of the study
The aim of the present research is to construct algorithms for the segmentation of cytological and histological images that improve quality, compared with known algorithms, for low-contrast images with a non-uniform background.
To achieve the set aim, the following tasks have been solved:
- to analyze relations between neighboring points and to devise rules for combining adjacent points into homogeneous regions in order to highlight cells;
- to analyze textural features of complex micro objects in histological images in order to develop methods and algorithms for highlighting the layers of cells;
- to construct algorithms for segmentation quality evaluation and to estimate the segmentation algorithms against the splitting into homogeneous regions performed by experts.

A method of image segmentation based on preliminary markup
We shall consider characteristics of separate points in an image and the relationships between them. We shall denote: $I$ is the original image; $I_s^i$ is the original image marked up using the $i$-th type of markup; $V_{ij}$ is the $j$-th homogeneous region in the original image marked up using the $i$-th type of markup; $M_k(x, y, z)$, $x = 1..l$, $y = 1..m$, $z = 1..8$, is the array of coefficients of correlations for the $k$-th markup, where $l$ is the width of the original image, $m$ is the height of the original image, and $z$ is the index of the neighboring pixel.
The array of total coefficients of correlations $M_{sum}$ is equal to

$$M_{sum}(x, y, z) = \sum_{k=1}^{n} M_k(x, y, z), \qquad (1)$$

where $n$ is the number of preliminary markups used in the process of segmentation.

Definition 1. We shall denote markup as the process of splitting original image $I$ into an array of homogeneous regions $V_j$ based on the homogeneity criterion KO. The homogeneity criterion is determined in advance based on an analysis of original image $I$.

If two adjacent points $I(x_1, y_1)$ and $I(x_2, y_2)$ are located in the same homogeneous region, the relationship between them is equal to 1, and otherwise it is equal to 0:

$$R = \begin{cases} 1, & P(I(x_1, y_1)) = P(I(x_2, y_2)), \\ 0, & P(I(x_1, y_1)) \neq P(I(x_2, y_2)), \end{cases}$$

where $P(I(x, y))$ is the identifier of the homogeneous region to which point $I(x, y)$ belongs, and $R$ is the coefficient of relationship between two neighboring points.

Definition 4. The total coefficient of relationship between two pixels $I(x_1, y_1)$ and $I(x_2, y_2)$ is defined as the sum of their relations over $n$ markups:

$$R_{sum} = \sum_{k=1}^{n} R_k, \qquad (2)$$

where $R_{sum}$ is the total coefficient of relationship between two neighboring pixels $I(x_1, y_1)$ and $I(x_2, y_2)$, and $R_k$ is their coefficient of relationship in the $k$-th markup.

1. Segmentation algorithm based on previous markups
The segmentation algorithm is as follows:
1) perform preliminary markup of the original image $I$ using $n$ markups;
2) create an array of coefficients of correlations $M_k$ between adjacent points for each of the $n$ markups;
3) create an array of total coefficients of correlations $M_{sum}$ between adjacent points by summing over all $n$ markups;
4) group the points of original image $I$ into homogeneous regions based on the array of total coefficients $M_{sum}$.
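Steps 2 and 3 of the algorithm can be illustrated with a minimal Python sketch. It assumes that each markup is given as a 2-D list of region identifiers and that the eight neighbors are indexed 0..7 in a fixed order; the function names and the neighbor order are illustrative assumptions, not part of the original method description.

```python
# Offsets (dx, dy) of the 8 neighbours; index z = 0..7 (the order is an assumption).
NEIGHBOURS = [(-1, -1), (0, -1), (1, -1), (-1, 0), (1, 0), (-1, 1), (0, 1), (1, 1)]


def relation_array(labels):
    """M_k: entry [y][x][z] is 1 if pixel (x, y) and its z-th neighbour share a region."""
    h, w = len(labels), len(labels[0])
    m = [[[0] * 8 for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for z, (dx, dy) in enumerate(NEIGHBOURS):
                nx, ny = x + dx, y + dy
                inside = 0 <= nx < w and 0 <= ny < h
                if inside and labels[ny][nx] == labels[y][x]:
                    m[y][x][z] = 1
    return m


def total_relation_array(markups):
    """M_sum: element-wise sum of the relation arrays of all n markups."""
    arrays = [relation_array(labels) for labels in markups]
    h, w = len(markups[0]), len(markups[0][0])
    return [[[sum(a[y][x][z] for a in arrays) for z in range(8)]
             for x in range(w)] for y in range(h)]


# Two hypothetical markups of the same 2x2 image.
markup_1 = [[0, 0], [1, 1]]
markup_2 = [[0, 0], [0, 1]]
m_sum = total_relation_array([markup_1, markup_2])
```

In this toy example the top-left pixel is linked to its right neighbor in both markups and to its lower neighbor in only one of them, so the corresponding totals differ.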
Preliminary markup can be performed in three ways.
Manual markup. The markup of the images is performed by several users independently, by manually selecting the homogeneous regions. This approach is labor-intensive and subjective because the preliminary markup is influenced by the human factor. The advantage of the approach is that the number of preliminary markups can be minimal.
Semi-automated markup. Marking is carried out using known methods of segmentation, but the user assigns the input parameters independently. The advantages of this approach are high accuracy and speed and improved objectivity of the preliminary markup.
Automated markup. Preliminary markup is performed based on an automated analysis of the original image, for example, an analysis of the brightness histogram that defines different thresholds of belonging to a homogeneous region.
Because the algorithm is designed for the segmentation of color images, preliminary automated markup includes a transition from a three-dimensional representation of color to a one-dimensional one. Representing the image in a one-dimensional space allows for the automated analysis of histograms of color distribution using known algorithms for determining thresholds.
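As an illustration of automated markup, the sketch below maps color points to one dimension and finds a single threshold from the histogram. Otsu's method is used here only as one example of a known thresholding algorithm, and the BT.601 luma weights are one possible color-to-gray mapping; neither choice is stated in the text, so both are assumptions.

```python
def to_gray(rgb):
    """Map an (R, G, B) triple to one brightness value (BT.601 weights, an assumption)."""
    r, g, b = rgb
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def otsu_threshold(values, levels=256):
    """Return the threshold t that maximises between-class variance (class 0: v <= t)."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0, sum0 = 0, 0
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue          # no points in the dark class yet
        if w1 == 0:
            break             # all points already in the dark class
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Hypothetical image: five dark points (nuclei) and five bright points (background).
pixels = [(10, 10, 10)] * 5 + [(200, 200, 200)] * 5
gray = [to_gray(p) for p in pixels]
threshold = otsu_threshold(gray)
markup = [0 if v <= threshold else 1 for v in gray]
```

Repeating the procedure in another color space or with another thresholding rule would yield further markups for the same image.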
Preliminary markups can be applied in different color spaces.
The following rules were devised to complete the segmentation by assigning the points of the original image to homogeneous regions based on the relationships between adjacent points:
1) if the connection between two adjacent points $I(x_1, y_1)$ and $I(x_2, y_2)$ is the maximum for the original image, $M_{sum} \to \max$, then the points are merged into homogeneous region $V_j$ (Fig. 1, a);
2) if the total coefficient of connection of point $I(x_1, y_1)$ with neighboring point $I(x_2, y_2)$ is larger than that with its other neighboring points, these points are merged into homogeneous region $V_j$;
3) if point $I(x_1, y_1)$ has the same total coefficient of connection with two (or more) neighboring points that belong to different homogeneous regions $V_i$ and $V_j$, $i \neq j$, then the point is added to the region in which it has more neighboring points (Fig. 1, c).
The result of the method is a set of homogeneous regions. Since the objects in an image typically consist of a group of homogeneous regions, selecting the objects requires an additional procedure that combines homogeneous regions.
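Rule 1 can be sketched in Python as follows. The sketch covers only rule 1 (merging points whose total coefficient reaches the image-wide maximum) and uses a union-find structure, which is an implementation choice made here for illustration; the array layout `m_sum[y][x][z]` and the neighbor order are likewise assumptions.

```python
# Offsets (dx, dy) of the 8 neighbours; index z = 0..7 (the order is an assumption).
NEIGHBOURS = [(-1, -1), (0, -1), (1, -1), (-1, 0), (1, 0), (-1, 1), (0, 1), (1, 1)]


def merge_max_connected(m_sum):
    """Rule 1 only: merge adjacent points whose total coefficient equals the maximum."""
    h, w = len(m_sum), len(m_sum[0])
    parent = list(range(h * w))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    best = max(v for row in m_sum for cell in row for v in cell)
    for y in range(h):
        for x in range(w):
            for z, (dx, dy) in enumerate(NEIGHBOURS):
                nx, ny = x + dx, y + dy
                inside = 0 <= nx < w and 0 <= ny < h
                if inside and best > 0 and m_sum[y][x][z] == best:
                    parent[find(y * w + x)] = find(ny * w + nx)
    # Each point is labelled with the representative of its region.
    return [[find(y * w + x) for x in range(w)] for y in range(h)]


# Hypothetical 1x3 image: points 0 and 1 are linked in two markups, points 1 and 2 in one.
m = [[[0] * 8 for _ in range(3)]]
m[0][0][4] = 2
m[0][1][3] = 2
m[0][1][4] = 1
m[0][2][3] = 1
regions = merge_max_connected(m)
```

Points left unmerged by this step would then be handled by rules 2 and 3.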

Method of texture segmentation
A texture segmentation method involves the following stages [19]: a) computation of texture attributes for each point in an image within a sliding window of size W×W; b) segmentation of the created texture field. We shall consider textural attributes based on the spatial moments of a region and on matrices of the distribution of the levels of gray.

1. Computation of spatial moments in a region
The texture of an image can be described quantitatively using simple statistical characteristics, such as the mathematical expectation, the variance and higher-order moments [19]. The term spatial moments (SM) originates from mechanics. When applied to images, SM reflect the distribution of the gray levels of the image along its axes. Based on SM, it is possible to calculate region attributes that are invariant to rotation, translation and scale.
Spatial moments of a region with gray-level function $f(x, y)$ at the point with coordinates $(x, y)$ are calculated from

$$m_{pq} = \iint x^p y^q f(x, y)\, dx\, dy, \qquad (3)$$

where $p + q = 0, 1, 2, 3$.
An image is considered to be a function of two variables $f(x, y)$; for each pixel in the image we calculate a set of low-order moments ($p + q \le 2$). The moments are calculated within local windows of size W×W around each pixel.
In the discrete case, SM within a window centered at pixel $(i, j)$ are calculated as a sum over normalized coordinates $(x_m, y_n)$:

$$m_{pq} = \sum_{m} \sum_{n} f(m, n)\, x_m^p\, y_n^q,$$

where $m$, $n$ are the coordinates of a point relative to the window and the sums run over all points of the window.
To estimate the distribution of the levels of gray, we use the row-column moment of inertia $m_{1,1}$.
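The discrete moment within a window can be sketched in Python as shown below. The normalization of coordinates to the range [-1, 1] by half the window size, the row/column indexing and the clipping of the window at the image border are assumptions made for this illustration.

```python
def spatial_moment(image, i, j, p, q, w=3):
    """m_pq in the w x w window centred on row i, column j (w odd, w >= 3).

    Coordinates are normalised to [-1, 1] by the half-width of the window
    (an assumed convention); window points outside the image are skipped.
    """
    half = w // 2
    total = 0.0
    for m in range(i - half, i + half + 1):
        for n in range(j - half, j + half + 1):
            if 0 <= m < len(image) and 0 <= n < len(image[0]):
                x_m = (m - i) / half
                y_n = (n - j) / half
                total += image[m][n] * (x_m ** p) * (y_n ** q)
    return total


# Hypothetical 3x3 gray-level patches.
flat = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
diagonal = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
m00_flat = spatial_moment(flat, 1, 1, 0, 0)
m11_flat = spatial_moment(flat, 1, 1, 1, 1)
m11_diag = spatial_moment(diagonal, 1, 1, 1, 1)
```

A uniform patch gives a zero mixed moment, while brightness concentrated along one diagonal gives a non-zero value, which is why $m_{1,1}$ responds to oriented structures.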

2. Computing the matrices of distribution of the levels of gray
In the matrix of distribution of the levels of gray (MDLG) [19] $P_d$ for displacement vector $d = (dx, dy)$, the value of element $p_{i,j}$ is the number of occurrences of the pair of gray levels $i$ and $j$ that are located at distance $d = (dx, dy)$ from each other. Thus, each point in the image $f(x, y)$ can be matched with matrix $P_d$, which describes the distribution of brightness in a window of size W×W centered at the point with coordinates $(x, y)$. The elements of matrix $P_d$ are determined by summing, over the points of the window, a function that is equal to 1 when the pair of points separated by $d$ has gray levels $i$ and $j$, and to 0 otherwise.
This function is an indicator that the points located at a given distance have certain levels of brightness. Parameter $d$ determines the distance at which the analysis of neighboring points is performed. Based on MDLG, we define such textural attributes as energy, entropy, contrast, homogeneity, correlation, etc.
By applying matrix $P_d(i, j)$, which describes the distribution of brightness inside a region centered at point $(x, y)$, we compute the attributes of textures. After processing the entire image, we form for each attribute a matrix that stores the value of the attribute at all processed points, that is, a field of texture attributes. The attributes are expressed through auxiliary quantities computed from $P_d$.
The computed attributes include: 1. Total mean value; 2. Inertia; 3. Second angular moment; ...; 6. Correlation.
The complexity of applying MDLG for the segmentation of biomedical images lies in the large number of the enumerated attributes and the need to choose values for $d$, $m$, $n$. Existing SAM do not possess means to search for optimal parameters of the texture segmentation algorithm.
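The construction of an MDLG for one window and a few attributes derived from it can be sketched in Python. The sketch counts ordered pairs only (no symmetric counting), normalizes the counts to probabilities, and uses the common textbook definitions of energy, contrast and entropy; these conventions and all function names are assumptions for illustration and may differ from the formulae of the original paper.

```python
import math


def cooccurrence(window, dx, dy, levels):
    """Count pairs (i, j): level i at (m, n) and level j at (m + dx, n + dy)."""
    p = [[0] * levels for _ in range(levels)]
    h, w = len(window), len(window[0])
    for n in range(h):
        for m in range(w):
            m2, n2 = m + dx, n + dy
            if 0 <= m2 < w and 0 <= n2 < h:
                p[window[n][m]][window[n2][m2]] += 1
    return p


def texture_attributes(p):
    """Energy, contrast and entropy of a co-occurrence matrix (common definitions)."""
    total = sum(map(sum, p))
    probs = [[v / total for v in row] for row in p]
    energy = sum(v * v for row in probs for v in row)
    contrast = sum((i - j) ** 2 * v
                   for i, row in enumerate(probs) for j, v in enumerate(row))
    entropy = -sum(v * math.log(v) for row in probs for v in row if v > 0)
    return {"energy": energy, "contrast": contrast, "entropy": entropy}


# Hypothetical 2x2 window with two gray levels and displacement d = (1, 0).
window = [[0, 0], [1, 1]]
p_d = cooccurrence(window, dx=1, dy=0, levels=2)
attrs = texture_attributes(p_d)
```

Computing such attributes in a sliding window around every point yields one field of texture attributes per attribute.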

3. Texture segmentation algorithm
The proposed texture field segmentation algorithm involves the following stages: 1. Construction of texture field $G$, every point $g \in G$ of which holds the value of the selected texture attribute.
To test MDLG (5) as a texture attribute, it is necessary to execute the given algorithm by constructing $g(i, j) = F_5$ at step 1. Parameter $k$ is used to control the interval between the lowest and the largest threshold.
The optimal number of thresholds $n$ (and, accordingly, the number of iterations of the algorithm) can be assigned a priori, based on the actual application, or chosen based on the change in the value of the signal/noise ratio ρ. Values of ρ can be calculated between the original image and the segmented image in which the pixels inside each segment are replaced by their averaged value.

Quantitative estimation of segmentation algorithms
In order to quantitatively estimate algorithms for the segmentation of cytological and histological images, we shall employ a metric approach. It implies using the Fréchet and Hausdorff metrics. The original image is segmented by an expert, that is, it is split into a set of homogeneous regions that do not intersect. The same image is processed by the constructed segmentation algorithms. In the general case, the expert's set of homogeneous regions and the set produced by the algorithm do not match. The level of similarity between these sets defines the quality of segmentation. To compare the contours of images, we shall apply the Fréchet metric, while the Hausdorff metric will be employed to compare regions. In order to find the smallest distances, we shall use the Gromov-Fréchet metric between contours and the Gromov-Hausdorff metric between regions. Quantitative assessment of the quality of segmentation algorithms is based on the following algorithms [22].
Consider polygonal regions. Fig. 2 shows polygonal regions after segmentation by an expert and after segmentation executed by the algorithm.
Consider algorithms for determining the Fréchet distance and the Hausdorff distance.

1. Algorithm for computing the discrete Fréchet distance
Let two contours C and R be given (Fig. 2), where r and s are the numbers of their segments.
2. Introduce set L between C and R:
Before finding distances between contours, we perform segmentation of the biomedical image and linear approximation of the contours of the homogeneous regions.
Computing the Gromov-Fréchet distance and the Gromov-Hausdorff distance implies conducting a number of isometric transformations. That is why we shall consider the algorithm of isometric transformations of polygonal regions.

3. Algorithm of isometric transformations
Isometric transformations are assigned by the set {P, R, S}, where P is parallel translation, R is rotation, and S is reflection about an axis (Fig. 3). Each transformation is represented by a matrix in affine space, which takes a restricted form for isometric transformations. We shall represent the algorithm of isometric transformations in the following steps: 1. Compute the centers of mass.
Estimation of the quality of segmentation for two images will consist of the following stages: splitting the image into homogeneous regions, approximation of contours, finding the Hausdorff distance, finding the Fréchet distance, and calculating the weighted sum of the distances for the two metrics.
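The three isometric transformations can be sketched in Python with 3×3 matrices in homogeneous coordinates. The column-vector convention, the choice of the x-axis for reflection, the vertex-average center of mass and all function names are assumptions made for this illustration; the matrices of the original paper may use a different layout.

```python
import math


def translation(tx, ty):
    """Parallel translation P by (tx, ty)."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def rotation(theta):
    """Rotation R by angle theta (radians) about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def reflection_about_x():
    """Reflection S about the x-axis."""
    return [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]]


def apply(matrix, points):
    """Apply a homogeneous 3x3 matrix to a list of (x, y) vertices."""
    return [(matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
             matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])
            for x, y in points]


def centroid(points):
    """Centre of mass of the vertices (vertex average, an assumed simplification)."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


# Hypothetical polygon: move its centre of mass to the origin, then rotate it.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
cx, cy = centroid(square)
centred = apply(translation(-cx, -cy), square)
rotated = apply(rotation(math.pi / 2), centred)
```

Because each matrix is an isometry, distances between vertices are unchanged, so the transformations can be searched over to minimize the distance between two polygons without altering their shapes.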

Experimental research into segmentation algorithms
In order to perform computer experiments, software in the Java programming language was developed using the OpenCV computer vision library. The software is designed for preliminary processing, image segmentation and segmentation quality evaluation. The preprocessing algorithm is presented in detail in [24]. Segmentation algorithms are estimated based on the devised metric approach [22].
The main purpose of the preprocessing algorithm is to adapt automatically to different types of images, which greatly simplifies operation compared with improving an image manually. The devised algorithm involves the following stages:
1. Determining the level of brightness, the contrast, and the mean values of the red, green, and blue channels of the original RGB image.
2. Selecting parameters for improving the quality of the image, depending on the input parameters obtained at stage 1. As a rule, the brightness levels of cytological and histological images are non-uniform, which is why they require correction. The result of correcting the brightness and contrast of a cytological image is shown in Fig. 4. Fig. 5 shows that a filter with a kernel of p=3 reduced the noise level of the original image without distorting the image visually. A filter with a kernel of p=7 reduced the noise level, but it significantly distorted the image.
4. Obtaining the mean values of the RGB channels, which are used to perform morphological operations ("dilation", "erosion"). The morphological operation "erosion" processes the image with a filter so that the image is generated from the local minimum, that is, dark regions grow. The operation "dilation" processes the image with a filter so that the image is generated from the local maximum, that is, bright regions grow. Examples of performing the operations of dilation and erosion are shown in Fig. 6.
... estimation of the quality of segmentation is to search for metrics for comparing non-convex regions. This path implies overcoming the problem of the dependence of computing cost on the complexity of the contours of objects.
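The morphological operations of stage 4 can be illustrated with a small Python sketch on a grayscale image stored as a 2-D list. The square structuring element, the clipping of the window at the image border and the function names are assumptions of this sketch; the software described in the paper relies on OpenCV rather than on code of this kind.

```python
def _rank_filter(image, size, pick):
    """Replace each point by pick(window) over a size x size window clipped at borders."""
    half = size // 2
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - half), min(h, y + half + 1))
                      for i in range(max(0, x - half), min(w, x + half + 1))]
            row.append(pick(window))
        out.append(row)
    return out


def erode(image, size=3):
    """Erosion: local minimum, so dark regions grow."""
    return _rank_filter(image, size, min)


def dilate(image, size=3):
    """Dilation: local maximum, so bright regions grow."""
    return _rank_filter(image, size, max)


# Hypothetical patch: one dark point (a nucleus) on a bright background.
patch = [[9, 9, 9], [9, 0, 9], [9, 9, 9]]
eroded = erode(patch)
dilated = dilate(patch)
```

In this example erosion spreads the dark point over the whole patch, while dilation removes it.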
The developed algorithms and software tools make it possible to automate the operations of preliminary segmentation in the systems of automated microscopy and to reduce the time of image processing.Automating manual operations leads to improved efficiency of medical diagnosis based on cytological and histological images.

Conclusions
1. We devised the method and constructed the algorithm for segmentation based on preliminary markup, which made it possible to improve the quality of cytological image segmentation by 21 % on average. The method of segmentation implies splitting a color image into homogeneous regions, computing a coefficient of connection between adjacent points, and merging the points into homogeneous regions. The algorithm allows for automated segmentation.
2. We devised the method and built the algorithms for texture segmentation of layers of cells, which made it possible to improve the quality of histological image segmentation by 18 % on average. The method involves computing the values of spatial moments for each point of the image. The feature space obtained in this way is segmented by the algorithm. The algorithm calculates thresholds based on mathematical expectation.
3. The algorithms of segmentation quality assessment based on a metric approach were developed. The application of the Fréchet and Hausdorff distances made it possible to find similarity between the contours and regions in quantitative form. The use of the Gromov-Fréchet and Gromov-Hausdorff metrics allowed us to find the shortest distances between the contours and regions, respectively.
Fig. 1. Examples of merging the points into homogeneous regions: a - maximum connection between points; b - the same total connection coefficient with neighboring points; c - the same total connection coefficient with adjacent points that do not belong to one homogeneous region
In the definition of the MDLG, D is the square of size W×W (W is an odd number); i, j = 0..255 are the values of brightness of points; $x_{m,n}$ is the brightness of the point with coordinates (m, n).

Fig. 2. Polygons: a - expert segmentation; b - algorithm segmentation
1. Represent contours C and R by sets:

where $a_1 = 1$, $b_1 = 1$, $a_m = r$, $b_m = s$.
3. Then find the Euclidean norm:
3.1. If i=1 and j=1, then the Euclidean distance is equal to:
3.2. If i>1 and j=1, then the Euclidean distance is equal to:
3.3. If i=1 and j>1, then the Euclidean distance is equal to:
3.4. If i>1 and j>1, then the distance is equal to:
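The four cases above match the structure of the standard dynamic-programming recursion for the discrete Fréchet distance (the Eiter-Mannila formulation). The Python sketch below implements that standard recursion as an illustration; it is assumed, not confirmed by the surviving text, that the original formulae coincide with it, and the function name is hypothetical.

```python
import math


def discrete_frechet(c, r):
    """Discrete Frechet distance between vertex sequences c and r (0-based indices)."""
    n, m = len(c), len(r)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(c[i], r[j])          # Euclidean distance of the current pair
            if i == 0 and j == 0:              # case i=1, j=1 in the text
                ca[i][j] = d
            elif j == 0:                       # case i>1, j=1
                ca[i][j] = max(ca[i - 1][0], d)
            elif i == 0:                       # case i=1, j>1
                ca[i][j] = max(ca[0][j - 1], d)
            else:                              # case i>1, j>1
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[n - 1][m - 1]


# Hypothetical contours: two parallel polylines one unit apart.
expert = [(0, 0), (1, 0), (2, 0)]
algorithm = [(0, 1), (1, 1), (2, 1)]
distance = discrete_frechet(expert, algorithm)
```

The table `ca` holds, for every pair of prefixes, the smallest achievable maximum pair distance, so the last cell is the distance between the whole contours.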

where $d_{O_1}$ are the projections of the vertices of region $O_1$ onto region $O_2$, and $d_{O_2}$ are the projections of the vertices of region $O_2$ onto region $O_1$ [24]. Projections $d_{O_l}$ ($l = 1, 2$) are computed from: $\mathrm{Proj}_{O_l}(v, w)$ is the minimum Euclidean distance from point $P(v, w)$ to region $O_l$ [22]. Then the Hausdorff distance is calculated based on the following steps: 1) We obtain polygonal regions $O_1 = (v_1, v_2, \ldots, v_m)$ and $O_2 = (w_1, w_2, \ldots, w_m)$. Find distances $d_{O_l}$ ($l = 1, 2$) of regions $O_1$ and $O_2$ according to expression (20).
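A vertex-based computation of this kind can be sketched in Python as follows. The sketch measures the distance from each vertex of one polygon to the boundary of the other (so a vertex lying inside the other region still gets a positive distance) and takes the larger of the two directed values; these simplifications and the function names are assumptions, not the exact expression (20) of the paper.

```python
import math


def point_segment_distance(p, a, b):
    """Minimum Euclidean distance from point p to segment ab."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    vx, vy = bx - ax, by - ay
    length2 = vx * vx + vy * vy
    if length2 == 0:
        t = 0.0
    else:
        t = max(0.0, min(1.0, ((px - ax) * vx + (py - ay) * vy) / length2))
    return math.hypot(px - (ax + t * vx), py - (ay + t * vy))


def directed_distance(poly_a, poly_b):
    """Largest distance from a vertex of poly_a to the boundary of poly_b."""
    edges = list(zip(poly_b, poly_b[1:] + poly_b[:1]))
    return max(min(point_segment_distance(v, a, b) for a, b in edges)
               for v in poly_a)


def hausdorff(poly_a, poly_b):
    """Symmetric vertex-based Hausdorff distance between two polygons (lists of vertices)."""
    return max(directed_distance(poly_a, poly_b), directed_distance(poly_b, poly_a))


# Hypothetical regions: a unit square and the same square shifted by 2 along x.
o1 = [(0, 0), (1, 0), (1, 1), (0, 1)]
o2 = [(2, 0), (3, 0), (3, 1), (2, 1)]
distance = hausdorff(o1, o2)
```

The cost of this sketch grows with the product of the numbers of vertices, which reflects the dependence of computing cost on contour complexity.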

Fig. 4. Result of the correction of contrast and brightness of the image: a - original image; b - adjustment of contrast and brightness
3. Filtration of images. Results of the experiments based on a sample of 100 cytological images from database [25] have shown that the averaging filter kernel with p=3 is the most appropriate to reduce the level of noise. Fig. 5 shows an example of the image filtration.
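The averaging filter of stage 3 can be sketched in Python for a grayscale image stored as a 2-D list. Clipping the window at the image border and the function name are assumptions of this sketch; it only illustrates the effect of the kernel size p.

```python
def mean_filter(image, p=3):
    """Averaging filter with a p x p kernel; the window is clipped at the borders."""
    half = p // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - half), min(h, y + half + 1))
                      for i in range(max(0, x - half), min(w, x + half + 1))]
            out[y][x] = sum(window) / len(window)
    return out


# Hypothetical patch with a single noisy point in the centre.
noisy = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
smoothed = mean_filter(noisy, p=3)
```

A larger kernel averages over more points, which suppresses noise more strongly but also smears the boundaries of micro objects.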