DEVELOPMENT OF DECISION-MAKING TECHNIQUE BASED ON SENTIMENT ANALYSIS OF CROWDSOURCING DATA IN MEDICAL SOCIAL MEDIA RESOURCES

The object of the study is decision-making modeling in the context of medical social media to increase the clinics' effectiveness. The problem is to classify the patient reviews collected in the patient-clinic segment of medical social media and to identify the situation related to the clinics' activity by deriving, from the opinions, the criteria that characterize that activity. The proposed technique comprises lexicon-based sentiment analysis of opinions, classification based on the Valence Aware Dictionary and Sentiment Reasoner (VADER), verification of the accuracy of the results with Multinomial Naive Bayes and Support Vector Machine, manual sentiment analysis of opinions to detect criteria, and classification of opinions according to each criterion. Using this technique


Introduction
A large amount of information collected on social media sites, professional social societies, online forums, and personal blogs supports people in making decisions [1, 2]. Social networks, which users rely on to communicate with one another and develop relationships, have revolutionized their culture, education and general social status [3, 4]. In medical social networks and Internet communities such as health-related Facebook or LinkedIn groups, people can search for information and discuss health conditions, symptoms of certain diseases, and treatments [5]. People who care about their health use medical media resources to ask questions, get answers, hold discussions, build relationships with doctors, clinics, nurses, pharmaceutical companies, or other patients, and at the same time express their opinions about them [6, 7]. A study conducted within the framework of the Pew Internet & American Life Project found that almost 80 % of Internet users in the United States search for health-related topics online, 63 % of users seek information about a specific medical problem, and approximately 47 % try to find medical information about treatment or procedures on medical social networks [8]. The study [9] assesses social media as the most flexible source of information for quickly detecting user feedback on a certain drug, and shows that obtaining such information can prevent a large amount of damage to the company that produces the drug.
Currently, medical social networks and online communities not only provide user support; the information gathered in this environment has also become a valuable resource for healthcare decision-making. As a result of the expansion and deepening of medical social networks and societies, medical crowdsourcing research has emerged. Crowdsourcing is the practice of involving large groups of people, with their knowledge and experience, through the Internet to solve a certain problem [10, 11]. Studies devoted to the use of information gathered in medical social media to make decisions about improving the quality of medical care are a clear example in this regard [12-14]. Drawing on crowdsourcing in medical social media resources, [15] applies statistical and content analysis to the information collected from medical social media tools and describes scenarios for improving the quality of medical services. [16] conceptually describes the solution of several issues, such as identifying the medical problems most frequently raised by the population with regard to age and gender, and determining the disease, the medicine, etc., by referring to the statistical analysis of the surveys collected in the doctor-patient segment of medical social media and the demographic indicators (location, gender, age) of e-patients. [17] explores the possibilities of applying sentiment analysis (SA) so that information obtained from crowdsourcing in medical social media resources can be used in medical decision-making, and indicates that taking «public opinion» or «mass opinion» into account in this way when solving a certain medical problem ensures more objective and transparent decisions.
The prospect of more objective and transparent decisions for improving the quality of medical care makes research aimed at realizing this opportunity urgent. Such studies require working out the methodological basis of decision-making and its implementation stages through sentiment analysis of the information received from crowdsourcing in medical social media resources. [18] highlights the rapid development of technologies that increase social media activity as a harmful feature, which leads to media addiction among people. In contrast, [19] substantiates that, alongside the harmful sides of the medical social media environment, the increase in social media activity has become a valuable source of information for solving various problems. In line with these theses, [13] shows the use of information shared by cancer research institutes on social media to solve the problems of cancer patients, and [20] is dedicated to promoting health and well-being through social networks. These studies show that the information shared in the social media environment helps individuals and medical media subjects to solve the problems they face. However, the most important aspect of medical social media as a valuable source of information is the use of information gathered in this environment for the study of public opinion. Based on individual opinions shared by nearly two million Twitter users about the COVID-19 (SARS-CoV-2) vaccine, [21] uses content analysis of public opinion to categorize people as pro-vaccine, anti-vaccine, and vaccine-hesitant.

Literature review and problem statement
In [9], the authors use the data collected in the doctor-patient relationship segment of an online social networking environment to obtain reviews about drugs. They use the data collected on two very popular social networks in China as the primary source of information. Based on the content analysis of drug names and contents found in the texts, they offer a method for evaluating reviews.
However, content analysis of the information available in the network for studying public opinion is laborious: manually extracting and analyzing the content present on the web is a tedious task. This gave rise to a new research area called sentiment analysis. [22] presents the sentiment analysis of patients' opinions on hospitals, which is mainly used to improve healthcare services. This is implemented using a lexicon-based methodology for sentiment analysis.
However, applying the results of lexicon-based sentiment analysis to any problem solution requires checking their accuracy. [23] highlights the importance of machine learning methods for more accurate analysis of large volumes of patient reviews on social networks, and presents the verification of the accuracy of SA results with machine learning methods. However, it does not provide a mechanism for using the results obtained from this approach to make decisions related to the solution of a specific medical problem.
In [17], the possibilities of applying lexicon-based sentiment analysis (SA) to use the information collected in the medical social media environment in medical decision-making are explored. It is shown that SA of the information obtained from crowdsourcing in medical media resources makes it possible to reach more objective and transparent decisions that take public opinion into account when solving a certain medical problem. To this end, the results obtained from the classification of patient opinions collected in the patient-medical institution segment are presented. However, that article lacks a mechanism for checking the accuracy of the results obtained from SA and for using the research results in making decisions related to the solution of a more important problem in the field of medicine.
Therefore, as a logical continuation of the research conducted on the use of information collected in medical social media resources for decision-making, the present article proposes a decision-making technique for solving certain medical problems that relies on the SA of opinions and machine learning. To develop the technique, it is necessary to select a specific problem that is important in the field of medicine and concerns the stakeholders of medical social media.
One such problem is increasing the efficiency of clinics, i.e., the core of the reforms carried out to improve the quality of medical services in health care. Taking into account the role and functions of healthcare in the socio-economic system of each country, the effectiveness of the activities of medical institutions refers to values such as satisfying the needs of the population for medical assistance, providing higher quality assistance at a minimum cost, early prevention of diseases, and promoting a healthy lifestyle [24]. Reviewing the improvement of organizations' activities in the context of social media, [25] introduces the concept of social organizational credibility, evaluates the organization's social media activity and media relations as the main factor of its credibility, and shows the importance of paying special attention to the activity of staff members in social media. [13] demonstrates the significance of using social media platforms to strengthen the relationships of cancer research institutes with stakeholders and to promote their brands, and justifies the need to integrate oncologists and nurses into the company's corporate communication initiatives. These challenges have made it urgent to pay attention to medical social media resources to increase the effectiveness of clinics. Solving the problem requires obtaining and analyzing the information collected in social media resources about the activity of clinics, determining the criteria that characterize the activity of clinics according to public opinion, and evaluating their activity.
Obviously, whether in the context of traditional health care or in the context of social media, improving the clinics' activity and increasing their effectiveness relies on certain criteria, on the basis of which the activity is evaluated and identified and appropriate management decisions are made. Ensuring more objective and transparent decisions that take «mass opinion» and «public opinion» into account makes it more urgent to solve the issue of increasing the effectiveness of clinics in the context of medical social media. This necessitates research on making decisions about the management of clinics based on the information collected in medical media resources and on sentiment analysis of the information obtained from crowdsourcing in the patient-clinic relationship segment of medical media.

The aim and objectives of the study
The aim of the study is to develop a decision-making technique that takes into account public opinion to increase the effectiveness of clinics based on the information collected from medical social media resources.
To achieve this aim, the following objectives are accomplished:
- to conduct sentiment analysis and classify the patient feedback regarding the activity of clinics;
- to check the results of sentiment analysis with machine learning methods and determine the accuracy;
- to determine the clinics' rating according to patient satisfaction and to evaluate their activity;
- to determine the criteria characterizing the activity of clinics and to identify their priority;
- to conduct feedback analysis according to each criterion and identify the situation related to the activity of clinics.

Materials and methods
The object of the research is decision-making modeling in the context of medical social media to increase the clinics' effectiveness. The main hypothesis of the study is that taking public opinion into account, by referring to the information collected in medical social media resources, ensures the objectivity of decision-making. Proceeding from this hypothesis, it is possible to identify the status of clinics based on the patient opinions collected in the patient-clinic segment of medical social media and to take public opinion into account in making relevant decisions. To this end, patient reviews should be classified according to tonality, and the criteria characterizing the clinic's activity should be identified from the reviews. This is possible by using sentiment analysis and by using machine learning methods to check the accuracy of the results.
SA, also known as feedback analysis, is a natural language processing task that enables automatic classification of the content (opinion) expressed in text [26]. Although the history of SA goes back to the 1990s, research on its application has expanded since the emergence of Web 2.0, increased access to information generated by network users, and the proliferation of social media platforms. SA is currently applied as a valuable tool in industry, the economy, healthcare, etc. SA includes machine learning-based, lexicon-based and hybrid methods [27]. The lexicon-based approach is the simplest of these. It refers to a sentiment lexicon, which consists of words and phrases commonly used to express positive and negative attitudes. Unlike machine learning-based and hybrid methods, this method does not require training data [28]. The Valence Aware Dictionary and Sentiment Reasoner (VADER) is a lexicon- and rule-based system. The VADER approach classifies text as negative, positive, or neutral. Manually generated sentiment lexicons include:
- the Multi-Perspective Question Answering (MPQA) Subjectivity Lexicon, a lexicon of over 8,000 subjectivity clues, each classified as positive or negative. This lexicon includes 2,718 positive and 4,909 negative words, among them a number of adjectives, adverbs and words of any other part of speech (POS);
- the Bing Liu Lexicon, which is based on a list of about 6,800 English words expressing positive and negative opinions (feelings).
Auto-generated sentiment lexicons include:
- the NRC Hashtag Sentiment Lexicon, which is built from tweets and indicates the association with a positive emotion by a positive score and the association with a negative emotion by a negative score;
- the Sentiment140 Lexicon;
- SentiWordNet, which automatically classifies all WordNet synsets according to their positive, negative and neutral degrees.
Thus, SA makes it possible to classify written feedback (texts) about the stakeholders of medical media as negative, positive, or neutral, and this classification is used to identify the situation related to the activities of the medical media subjects.
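As a minimal illustration of the lexicon-based idea described above, the following Python sketch scores a text by counting the words found in a toy positive and a toy negative word list. The word lists, the function names lexicon_score and classify, and the example sentences are assumptions made for illustration only; they are not taken from any of the lexicons listed above, and VADER additionally applies valence weights and rules that this sketch omits.

```python
import re

# Toy word lists (hypothetical); real lexicons such as VADER or the
# Bing Liu Lexicon contain thousands of entries.
POSITIVE = {"good", "great", "kind", "clean", "helpful"}
NEGATIVE = {"bad", "rude", "dirty", "noisy", "slow"}


def lexicon_score(text):
    """Return (number of positive words) - (number of negative words)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


def classify(text):
    """Map the raw score to the three classes used in the study."""
    score = lexicon_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify("The nurses were kind and helpful"))  # positive
print(classify("The room was dirty and noisy"))      # negative
print(classify("I stayed for two days"))             # neutral
```

No training data are needed here, which is the property of lexicon-based methods noted above; the price is that the result depends entirely on the coverage of the word lists.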
Machine learning methods are applied to check the accuracy of the results obtained from SA [26, 29]. The study uses Multinomial Naive Bayes (MultinomialNB) and Support Vector Machine (SVM) for this purpose.
Naive Bayes is a probabilistic machine learning algorithm that classifies data based on the given features. In other words, the probability of a label given the features is calculated using Bayes' rule. It uses the bag-of-words approach, where the individual words in the document constitute its features and the order of the words is ignored. This differs from the way people actually communicate with each other: the model treats language as if it were just a bag full of words and each message a random handful of them. Large documents contain many words, so they are generally characterized by a very high-dimensional feature space with thousands of features. Hence, the learning algorithm has to cope with high-dimensional problems in terms of both classification performance and computational speed.
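The bag-of-words Naive Bayes idea can be sketched in a few lines of plain Python. This is an illustrative sketch, not the implementation used in the study: the function names train_mnb and predict_mnb, the add-one (Laplace) smoothing parameter alpha, and the four toy training reviews are all assumptions introduced here.

```python
import math
from collections import Counter, defaultdict


def train_mnb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on tokenised documents."""
    docs_per_class = Counter(labels)
    word_counts = defaultdict(Counter)  # class label -> word frequencies
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = {word for counts in word_counts.values() for word in counts}
    model = {}
    for label, n_docs in docs_per_class.items():
        log_prior = math.log(n_docs / len(docs))
        # Smoothed denominator: all words of the class plus alpha per vocab entry.
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        log_lik = {
            word: math.log((word_counts[label][word] + alpha) / denom)
            for word in vocab
        }
        model[label] = (log_prior, log_lik)
    return model, vocab


def predict_mnb(model, vocab, tokens):
    """Return the label with the highest log-posterior; word order is ignored."""
    scores = {}
    for label, (log_prior, log_lik) in model.items():
        scores[label] = log_prior + sum(log_lik[t] for t in tokens if t in vocab)
    return max(scores, key=scores.get)


# Hypothetical training reviews, already tokenised.
docs = [["kind", "nurse"], ["helpful", "doctor"], ["rude", "staff"], ["dirty", "room"]]
labels = ["positive", "positive", "negative", "negative"]
model, vocab = train_mnb(docs, labels)
print(predict_mnb(model, vocab, ["kind", "doctor"]))  # positive
print(predict_mnb(model, vocab, ["room", "rude"]))    # negative
```

Because each document is reduced to word counts, shuffling the tokens of a review does not change the prediction, which is exactly the bag-of-words simplification described above.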
SVM is a type of machine learning algorithm used for both classification and regression problems. The main goal of the SVM algorithm is to find the hyperplane that separates the provided data into classes in the feature space. The hyperplane is an (N-1)-dimensional flat affine subspace, where N is the dimension of the feature space. In the case of 3 features, which make up a 3D space, the hyperplane is a 2D plane. The hyperplane is chosen to maximize the distance to the nearest data points from either class; this distance is called the margin. Support vectors are the data points that lie closest to the hyperplane and thereby define it: if any of the support vectors is moved, the position of the hyperplane changes. Commonly used SVM kernels are the linear, polynomial, radial basis function (RBF) and sigmoid kernels. A kernel is a function that implicitly maps the input data to a higher-dimensional space in which they become linearly separable. With a suitable kernel, the algorithm is quite effective when the data are not linearly separable in the original space. Furthermore, SVM is effective in terms of building generalized models by maximizing the margin between classes. However, for larger datasets, it is computationally expensive and slow.
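For reference, the maximum-margin idea and the kernels named above can be written in their standard textbook form. The notation is an assumption introduced here and does not appear in the source: training pairs (x_i, y_i) with labels y_i in {-1, +1}, weight vector w, bias b, and kernel parameters gamma, r and d.

```latex
% Hard-margin SVM: the separating hyperplane is w^T x + b = 0
% and the margin width equals 2 / ||w||.
\min_{w,\,b}\ \tfrac{1}{2}\lVert w\rVert^{2}
\quad\text{subject to}\quad
y_i\,(w^{\top}x_i + b) \ge 1,\qquad i = 1,\dots,m.

% Common kernels K(x, z):
K_{\text{linear}}(x,z)  = x^{\top}z, \qquad
K_{\text{poly}}(x,z)    = (\gamma\,x^{\top}z + r)^{d},
K_{\text{RBF}}(x,z)     = \exp\!\left(-\gamma\,\lVert x - z\rVert^{2}\right), \qquad
K_{\text{sigmoid}}(x,z) = \tanh\!\left(\gamma\,x^{\top}z + r\right).
```

Minimizing the norm of w is equivalent to maximizing the margin 2/||w||, and the training points for which the constraint holds with equality are the support vectors.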
To evaluate the detection performance of classifiers in machine learning, precision, recall, false positive rate (FPR), true positive rate (TPR), F-measure and accuracy criteria are used [30].
Precision (P) denotes the proportion of true positives among all predicted positives and is defined as follows:
$$P = \frac{T_p}{T_p + F_p}, \qquad (1)$$
where $T_p$ is the number of correctly classified data items belonging to the predicted class (true positives) and $F_p$ is the number of data items wrongly assigned to the predicted class (false positives).
Recall (R) is defined as the proportion of true positives among all actual positives and is calculated using the following formula:
$$R = \frac{T_p}{T_p + F_n}, \qquad (2)$$
where $F_n$ is the number of data items of the class that are wrongly assigned to another class (false negatives).
F1-Score is defined as the harmonic mean of recall and precision and is calculated by the following formula:
$$F_1 = \frac{2 P R}{P + R}. \qquad (3)$$
Accuracy is defined as follows:
$$\text{Accuracy} = \frac{T_p + T_n}{T_p + T_n + F_p + F_n}, \qquad (4)$$
where $T_n$ is the number of true negatives.
High values of the indicators obtained from formulas (1)-(4), especially accuracy and recall, indicate good performance of the analysis results and justify continuing the research on the basis of the classification results.
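The four indicators follow directly from the confusion counts. The sketch below assumes the standard definitions (precision, recall, their harmonic mean, and the share of correct predictions); the function name classification_metrics and the counts in the example are hypothetical and are not results of the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1-score and accuracy from confusion counts.

    Assumes tp + fp > 0, tp + fn > 0 and tp > 0 so that no denominator is zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical counts for one class of a binary split.
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For a three-class task such as negative, neutral and positive, these quantities are usually computed per class (one class against the rest) and then averaged.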

1. Sentiment analysis and classification of patient feedbacks regarding the activity of clinics
The Python environment and the Pandas, NumPy, Matplotlib, Seaborn and NLTK libraries are used to collect patient opinions about clinics. The open database cms_hospital_satisfaction_2020, hosted on Kaggle and generated from the opinions collected in the patient-clinic segment of medical social media, is used as the information source for the study [31]. For the feedback analysis in the database, let's refer to the lexicon-based sentiment analysis algorithm proposed in [32], and apply the VADER approach described in [28] in order to classify the feedback. With reference to the listed methodological base, the algorithm offered for solving the problem of analysis and classification of patient opinions about the activity of clinics is implemented in the following stages:
Stage 1. Access to the open database cms_hospital_satisfaction_2020; 442,587 reviews are available in the database (Fig. 1).
Stage 2. This is the data pre-processing stage: it performs data cleaning and tokenization, removes spaces and special characters, and names the remaining tokens.
Stage 3. This is the opinion extraction stage: it prepares the processed opinions for SA.
Stage 4. This stage applies the lexicon-based sentiment analysis algorithm described in [32]. This approach refers to the sentiment lexicon, which consists of words and phrases commonly used to express positive and negative attitudes.
Stage 5. This is the opinion classification stage: it applies the VADER approach described in [28] and classifies the texts expressing patient opinions into 3 classes: negative, positive, and neutral. To classify the database with the VADER approach, the column «sentiment_type» is added to the dataset with the values «negative», «neutral», and «positive». Fig. 2 shows a fragment of Kaggle's cms_hospital_satisfaction_2020 database after the classification based on SA.

Fig. 2. A fragment of Kaggle's cms_hospital_satisfaction_2020 database after classification based on lexicon-based sentiment analysis and VADER approach
It is determined that out of the 442,587 patient feedback entries in the database column «sentiment_type», 218,914 are positive, 190,360 are neutral, and 33,313 are negative.
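The pre-processing and classification stages can be sketched as follows. This is a hedged illustration rather than the authors' code: the function names preprocess and label_from_compound are hypothetical, and the cut-off of 0.05 on the VADER compound score is the convention suggested in the VADER documentation, assumed here because the source does not state its thresholds.

```python
import re


def preprocess(text):
    """Lower-case the text, drop special characters and split into tokens."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return cleaned.split()  # split() also collapses repeated spaces


def label_from_compound(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a sentiment_type value."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


print(preprocess("  Nurses ALWAYS  communicated well!! "))
# ['nurses', 'always', 'communicated', 'well']
print(label_from_compound(0.62), label_from_compound(0.0), label_from_compound(-0.4))
# positive neutral negative
```

With NLTK installed and its vader_lexicon resource downloaded, the compound score for a review text can be obtained as SentimentIntensityAnalyzer().polarity_scores(text)["compound"] from nltk.sentiment.vader and passed to label_from_compound to fill the «sentiment_type» column.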

2. Checking the results of sentiment analysis with machine learning methods, determination of the accuracy
In order to continue the research, the accuracy of the results obtained from the classification of opinions is checked. To this end, 80 % of the dataset in the reference database is allocated to training data and 20 % to test data, and Multinomial Naive Bayes (MultinomialNB) and Support Vector Machine (SVM) machine learning models are built (Fig. 3). By referring to formulas (1)-(4), machine learning methods are applied to check the accuracy of the results obtained from SA. Fig. 4 illustrates the results of text classification according to the MultinomialNB machine learning model. As presented, both methods show that the classification of reviews is performed with high accuracy. Therefore, it was considered appropriate to continue our research by referring to the results obtained from SA and the classification of opinions.
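The 80/20 split and the accuracy check can be sketched in plain Python. The helper names split_dataset and accuracy, the fixed random seed, and the toy data are assumptions for illustration; the study itself may have used library routines for this step.

```python
import random


def split_dataset(rows, test_share=0.2, seed=42):
    """Shuffle a copy of the rows and return (train, test) with the given test share."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_share)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Share of predictions that agree with the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


rows = list(range(100))  # stand-in for (review, sentiment_type) pairs
train, test = split_dataset(rows)
print(len(train), len(test))  # 80 20
print(accuracy(["pos", "neg", "neu", "pos"], ["pos", "neg", "neu", "neg"]))  # 0.75
```

In this setting the labels produced by the lexicon-based classification serve as the reference labels, so a high accuracy indicates that the machine learning models reproduce the SA labels consistently on unseen reviews.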

3. Determination of the clinics' rating according to patient satisfaction
The tonality of the feedback should be determined in order to investigate patient satisfaction with the clinics' activities. The main issue in the tonality analysis of opinions is the classification of feedback according to different emotional states. Fig. 6 illustrates an overview of the «patient satisfaction» of clinics in Kaggle's cms_hospital_satisfaction_2020 database.
It is determined that out of the 442,587 patient feedback entries in the database column «HCAHP Answer Description», 49.5 % are positive, 43.0 % are neutral, and 7.5 % are negative. Fig. 7 visualizes the patient feedback classified by positive, neutral, and negative tonality.
Based on the positive reviews expressing patient satisfaction, the rating of the clinics and the best performing clinic can be determined. To this end, the column «Hospitals» of the cms_hospital_satisfaction_2020 database on Kaggle and the column «sentiment_type», which expresses patient satisfaction according to the opinion, are analyzed in Excel. Fig. 8 presents the list of the top ten clinics in the rating table.
Correspondingly, it is possible to determine the clinics' rating and the worst performing clinics based on negative feedback. Presenting a rating that expresses patient satisfaction with the activity of clinics can support patients in deciding which clinic to choose. However, obtaining more complete information about the activity of the clinics requires determining the criteria for evaluating the activity and analyzing the situation by these criteria. The next subsection focuses on this problem.
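The percentages above can be checked against the class counts reported after classification (218,914 positive, 190,360 neutral and 33,313 negative). The short sketch below only redoes this arithmetic; the variable names are illustrative.

```python
# Class counts as reported for the «sentiment_type» column.
counts = {"positive": 218_914, "neutral": 190_360, "negative": 33_313}

total = sum(counts.values())
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

print(total)   # 442587
print(shares)  # {'positive': 49.5, 'neutral': 43.0, 'negative': 7.5}
```

The counts add up to the 442,587 entries in the database, and the rounded shares agree with the reported 49.5 %, 43.0 % and 7.5 %.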
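The rating step was carried out in Excel in the study; as an alternative sketch, the same count-and-sort logic can be expressed in Python. The function name rank_clinics and the sample rows are hypothetical (apart from Memorial Hospital, the clinic names are placeholders), and real input would be the pairs of «Hospitals» and «sentiment_type» values.

```python
from collections import Counter

# Hypothetical (hospital, sentiment_type) pairs standing in for database rows.
reviews = [
    ("Memorial Hospital", "positive"), ("Memorial Hospital", "positive"),
    ("Memorial Hospital", "positive"), ("Memorial Hospital", "negative"),
    ("Clinic A", "positive"), ("Clinic A", "positive"),
    ("Clinic B", "positive"), ("Clinic B", "negative"), ("Clinic B", "negative"),
]


def rank_clinics(rows, sentiment="positive", top_n=10):
    """Rank clinics by the number of reviews carrying the given sentiment."""
    counts = Counter(hospital for hospital, label in rows if label == sentiment)
    return counts.most_common(top_n)


print(rank_clinics(reviews))
# [('Memorial Hospital', 3), ('Clinic A', 2), ('Clinic B', 1)]
print(rank_clinics(reviews, sentiment="negative", top_n=1))
# [('Clinic B', 2)]
```

Ranking by raw counts favours clinics with many reviews; dividing by the total number of reviews per clinic would give a share-based rating, which is a design choice the source does not discuss.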

4. Determination of criteria characterizing the activity of clinics and determination of their priority
A manual sentiment analysis of patient reviews is conducted to determine the criteria characterizing the clinic's activity. Based on the analysis, it is determined that, in the comments in the cms_hospital_satisfaction_2020 database on Kaggle, patients prioritize the following criteria for evaluating the performance of clinics:
- behavior of nurses with patients;
- behavior of doctors with patients;
- behavior of assisting medical staff with patients;
- compliance with hygiene and cleanliness rules;
- maintaining quietness;
- support for patients after discharge.
In order to find out which criterion patients prioritize when evaluating the activity of clinics, patient reviews are classified according to these criteria (Fig. 9).

Fig. 9. Description of the priority of criteria characterizing the performance of clinics in the reviews of Kaggle's cms_hospital_satisfaction_2020 database (recoverable chart labels: Cleans and help 11 %, Quietness 6 %, After discharge 16 %)
The behavior of nurses with patients (23 % of reviews), the behavior of doctors with patients (22 % of reviews), and the behavior of assisting medical staff with patients (22 % of reviews) are the criteria prioritized by patients. We assume that the more often a criterion is mentioned when opinions are expressed, the higher its priority. From this point of view, Fig. 9 shows the priority of the criteria in the patient opinions about the clinic's activity as a percentage.

5. Feedback analysis according to each criterion and identification of the situation related to the activity of clinics
Let's review the distribution of positive and negative reviews by criterion in order to identify the clinics' activities. Fig. 10 presents the results obtained from the classification of positive and negative opinions on the activity of the clinics according to 6 criteria.
As seen, according to the crowdsourcing results, patient satisfaction with the clinics' activity is mostly high, with 65 % of the total reviews evaluated as «positive» (15 % (nurses) + 15 % (doctors) + 15 % (staff) + 7 % (cleans) + 4 % (quietness) + 9 % (after discharge)). The activity of clinics is evaluated at the same level according to the criteria of the behavior of doctors, nurses, and assisting medical staff with patients: 15 % of the opinions are «positive» for each of these criteria. Note that the shares of «positive» and «negative» opinions of the patients about the support provided to them after being discharged are almost the same. This makes it urgent to review the activity related to that criterion in the clinics, etc.
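The comparison of «positive» and «negative» shares can be made explicit by computing their ratio per criterion. The percentages below are those shown in Fig. 10; the function name flag_criteria and the cut-off ratio of 1.5 are assumptions chosen for illustration and are not part of the original technique.

```python
# (positive %, negative %) per criterion, as shown in Fig. 10.
fig10_shares = {
    "Nurses": (15, 8),
    "Doctors": (15, 7),
    "Staff": (15, 7),
    "Cleans and help": (7, 4),
    "Quietness": (4, 2),
    "After discharge": (9, 7),
}


def flag_criteria(shares, min_ratio=1.5):
    """Return criteria whose positive/negative ratio falls below min_ratio."""
    return [name for name, (pos, neg) in shares.items() if pos / neg < min_ratio]


total_positive = sum(pos for pos, _ in fig10_shares.values())
print(total_positive)               # 65
print(flag_criteria(fig10_shares))  # ['After discharge']
```

With this hypothetical cut-off, only support after discharge (ratio 9/7, about 1.3) is flagged, which matches the observation that its «positive» and «negative» shares are almost the same.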

Fig. 10. Distribution of positive and negative reviews according to the criteria representing the performance of clinics in Kaggle's cms_hospital_satisfaction_2020 database (chart labels: Nurses 15 % pos., 8 % neg.; Doctors 15 % pos., 7 % neg.; Staff 15 % pos., 7 % neg.; Cleans and help 7 % pos., 4 % neg.; Quietness 4 % pos., 2 % neg.; After discharge 9 % pos., 7 % neg.)
The situation related to the activity of each individual clinic is identified analogously. To this end, Memorial Hospital, which leads the rating table by the number of positive reviews in the cms_hospital_satisfaction_2020 database on Kaggle, is selected. Fig. 11 presents the distribution of the positive results obtained from the sentiment analysis of opinions about this clinic by 6 criteria. Fig. 13 illustrates the distribution of sentiment analysis results (positive, negative and neutral) of opinions about Memorial Hospital by 6 criteria.
As Fig. 13 shows, the main positive aspects of Memorial Hospital's activities are the behavior of doctors, nurses, and assisting medical staff with patients; «positive» opinions on these criteria exceed «negative» opinions almost twofold. It is noteworthy that the «positive» and «negative» shares of patient opinions regarding the criterion of support after discharge are similar. It is therefore important to examine how the work related to that criterion is organized in this clinic, etc. The situation related to the activity of other clinics can be identified analogously.

Discussion of experimental results
The formation of medical social networks and online medical communities has led to the collection of a large amount of information in media resources. This information is unstructured and fragmentary, but very important. It expresses public opinion (mass opinion), and by taking advantage of it, objective and transparent decisions can be made that represent public opinion regarding the solution of a certain problem. This article considered it appropriate to analyze the unsystematized information collected in social media resources for making decisions to improve the quality of medical care according to the relationship segments formed between the stakeholders of the media, that is, doctors, patients, clinics, nurses, etc. The study was regarded as a crowdsourcing study, since the patient opinions collected in the patient-clinic segment of medical social media were selected as the research subject (in this case, the patients expressing opinions were the participants of the study, and the authors of the article acted as professional researchers).
As noted, it was considered appropriate to use open databases collecting relevant information, owing to the lack of directly available information in medical media resources and the need to ensure the reliability of the source. The study used the cms_hospital_satisfaction_2020 database on Kaggle. The sentiment of the 442,587 records about the clinics included in the database, more precisely of the patient opinions (texts), was analyzed and classified as «positive», «negative» and «neutral» (Fig. 2). Then the accuracy of SA was checked with the MultinomialNB and SVM machine learning methods (Fig. 3). Verification confirmed that the obtained SA-based classification provided high performance (0.97) (Fig. 4, 5). Subsequently, referring to the classification results, the general picture of the clinics' activity was described visually (Fig. 6), and the possibility of determining the rating of the clinics was shown (Fig. 8).
Based on public opinion, the problems of determining the aspects of the clinic's activity and the criteria that patients prioritize most in evaluating the activity were resolved. To this end, a manual sentiment analysis of all reviews in the referenced cms_hospital_satisfaction_2020 database was performed. The authors of this study determined that patients prioritize 6 criteria for evaluating the activity of clinics: the behavior of nurses, the behavior of doctors, the behavior of assisting medical staff, compliance with the rules of hygiene and cleanliness, maintaining quietness, and providing support to patients after discharge. Correspondingly, new criteria for evaluating the clinics' activity in the context of medical social media were defined and their priority was determined (Fig. 9). In order to determine the situation related to these criteria in the operation of the clinics, the patient opinions were reanalyzed. The general opinions were reclassified by each criterion, and an attempt was made to identify the situation related to the activity of the clinics according to the ratio of «positive» to «negative» opinions (Fig. 10). The possibility of identifying the activity of each clinic was shown with an analogous approach (Fig. 11-13). The identified situation is intended for use in making appropriate medical decisions.
The methodological base in the sources [16, 22, 23, 28], which treat the information contained in medical social media resources as a valuable resource, has been extended. This article defines the criteria for evaluating the clinics' activity in the context of medical media.
In this modeling, lexicon-based sentiment analysis, the classification of opinions as positive, neutral, negative based on VADER (Fig. 2) and the verification of the accuracy of the classification results based on MNB and SVM (Fig. 3, 4) are realized with Python software, and the determination of the clinics' ranking according to the classification results (Fig. 6, 8), detection of criteria from reviews based on manual sentiment analysis, and the determination of the criteria priority according to the frequency (Fig. 9) are realized with Excel software.
The accuracy of the opinion classification results is checked by the MultinomialNB (0.96) and SVM (0.97) machine learning methods, and once the results are shown to perform sufficiently well (Fig. 4, 5), the research is continued. The graphical representation in Excel of the results obtained in the next steps is aimed at reflecting the adequacy and reliability of the results.
This study considers, for the first time, decision-making modeling to increase the effectiveness of clinics in the context of medical social media. An original approach is proposed for taking public opinion into account in decision-making by drawing on sentiment analysis, content analysis, statistical analysis, machine learning methods, and the integration of the results obtained with different software tools.
A number of shortcomings of the study should also be noted. The main problem is that the information in medical social networks is not directly available; therefore, to obtain the necessary information, it is necessary to turn to the open databases of the companies that collect it. However, such databases are currently few, and most of them are either unavailable or available only at a very high price. In this regard, we express our gratitude to Kaggle, a constantly updated platform that structures the information collected in the patient-clinic segment of medical social networks and gathers it in a single database. It is also worth noting that Kaggle has a database of «patients' opinions about doctors' activities» formed on the basis of information collected in the patient-doctor segment of medical social media. In our further study, we intend to conduct similar research using this source of information.
During the research, it was found that when expressing an opinion about the activity of a clinic on any criterion, patients use natural-language quality levels such as «usually», «almost», «partially», «sometimes», etc. However, this point was not considered in this research. It would be better to take these expressions into account for a more objective identification of the situation by any criterion in the evaluation of the clinics' activity. Given that the newly defined criteria are qualitative in nature, that their priorities differ, and that they are expressed with different quality levels, it would be reasonable to treat the evaluation of the clinics' activity as a multi-criteria evaluation problem in a fuzzy environment. This issue corresponds to the authors' traditional research area, and its solution is planned for our further studies.
It is noted that analogous research can be performed in other segments of medical social media with the proposed approach. For example, a system of new criteria characterizing the activity of doctors can be formed through crowdsourcing research based on the opinions collected in the doctor-patient segment; the priority of doctors can be determined by these criteria and in different areas of medicine, and problems can be detected.

Conclusions
1. A total of 442,587 records, more precisely patient opinions (texts) about clinics, generated on the basis of opinions collected in the patient-clinic segment of medical social media and entered into the cms_hospital_satisfaction_2020 open database by Kaggle, were classified: 218,914 as positive, 190,360 as neutral and 33,313 as negative.
2. The accuracy of the classification results obtained from the sentiment analysis was checked with the MultinomialNB and SVM machine learning methods, and the classification was confirmed to show high performance (accuracy of 0.96 and 0.97, respectively).
3. With reference to the classification results, the possibility of evaluating the clinics' activity and determining the rating was shown.
4. Based on the manual sentiment analysis, the criteria characterizing the clinics' activity were identified, and their priority was determined according to the frequency of the «word forms» expressing them.
5. Opinions were classified according to each criterion, and the possibility of identifying the situation related to the clinics' activity and of using the results for making decisions to increase the effectiveness of the clinics' activity was shown.

Fig. 4. Sentiment analysis accuracy of texts with the MultinomialNB machine learning model

Fig. 5 presents the results of text classification with the SVM machine learning model. As presented, both methods show that the classification of reviews is performed with high accuracy. Therefore, it was considered appropriate to continue our research by

Fig. 11. Distribution of positive opinions by criteria representing the Memorial Hospital's performance

Fig. 12. Distribution of negative opinions by criteria representing the Memorial Hospital's performance

Fig. 13. Distribution of positive, negative and neutral opinions by the criteria representing the Memorial Hospital's performance

in making relevant medical decisions taking into account public opinion. The advantage of the present article is that the algorithm of the research dedicated to the solution of this problem is clearly presented step by step, with reference to the available scientific-theoretical base, and it is hoped that it will be easily understood by the reader.