Influence of procedures for processing the voice signal of authentication system on the quality of formant data
DOI: https://doi.org/10.30837/pt.2025.1.04
Abstract
This paper addresses the current scientific task of improving the efficiency of voice authentication systems, which are widely used in modern access systems. Errors of the first and second kind in such systems can be reduced by improving the digital processing of the analyzed voice signal, by extracting user features more accurately, or by improving the decision-making procedures for user admission. Formant data (spectral power levels, formant frequencies, spectral envelopes, and the width of the formant frequency spectrum) play an important role in all voice signal processing procedures. The first two formants are used for speech recognition and synthesis, while the next two enable user authentication. The purpose of this work is to outline ways to improve the quality of formant data for digital speech signal processing tasks. The object of the study is the process of obtaining formant data from amplitude-frequency and phase information, as well as from the autocorrelation function of the analyzed signal. The subject of the study is the methods and procedures for extracting formant data in the context of experimental research. The scientific novelty lies in the first comparative analysis of formant data obtained from different source information: amplitude-frequency information, phase information, and the autocorrelation function of the analyzed signal. The reliability of the results is supported by the proper use of established mathematical tools and by the agreement of the formant data estimates obtained from processing an experimental user signal.
The practical significance lies in the fact that the obtained results enable improvements in the quality and efficiency of voice data processing for speech recognition and synthesis, user authentication in voice systems, and several other applied tasks related to speech production.
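To make the idea of obtaining formant data from the autocorrelation function concrete, the sketch below uses linear prediction (LPC) solved with the Levinson-Durbin recursion, a common textbook approach. It is an assumption made for illustration and not necessarily the procedure used in the paper. The function names, the default model order, and the optional Hamming window are all hypothetical choices.

```python
import numpy as np

def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of a 1-D frame (no normalization)."""
    n = len(frame)
    return np.array([np.dot(frame[: n - k], frame[k:]) for k in range(max_lag + 1)])

def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns coefficients a with a[0] = 1."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Correlation between the current prediction error and lag i.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def estimate_formants(frame, fs, order=8, apply_window=True):
    """Estimate formant frequencies and bandwidths (Hz) from one frame.

    Illustrative sketch: the poles of the LPC model are read as resonances.
    """
    x = np.asarray(frame, dtype=float)
    if apply_window:
        x = x * np.hamming(len(x))
    r = autocorrelation(x, order)
    a = levinson_durbin(r, order)
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]  # keep one pole of each conjugate pair
    freqs = np.angle(roots) * fs / (2 * np.pi)
    bandwidths = -np.log(np.abs(roots)) * fs / np.pi
    idx = np.argsort(freqs)
    return freqs[idx], bandwidths[idx]
```

In this sketch each complex pole pair of the prediction filter is treated as one candidate formant: the pole angle gives the frequency and the pole radius gives the bandwidth. A practical system would typically add pre-emphasis and discard poles with implausibly large bandwidths before using the lowest candidates as formants.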
License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).