ANALYSIS OF STABILITY OF THE USER'S KEYBOARD HANDWRITING CHARACTERISTICS IN THE BIOMETRIC AUTHENTICATION SYSTEMS

This paper considers the use of biometric characteristics to increase the efficiency of user authentication. An identifier based on biometric characteristics is inextricably linked to the user, and its unauthorized use is virtually impossible. Keyboard handwriting is an expedient choice of biometric characteristic. Keyboard handwriting, or typing rhythm, reflects a manner of typing on a keyboard that is specific to a particular user. It is quite simple to implement and requires no additional hardware. Moreover, using keyboard handwriting during password entry eliminates the main disadvantages of classical password systems and of systems based on access cards. The research focuses on the stability of the time characteristics of a particular user's keyboard handwriting over a long period of time. To grant the user access to the computer system, an algorithm based on the Hamming distance was selected. In accordance with the chosen algorithm, an algorithm was developed for forming the vector of the user's biometric characteristics, which includes the hold duration of each key and the time between presses of neighboring keys. An algorithm for forming the user's biometric standard was also developed. To analyze the use of keyboard handwriting, two software applications were developed: one that implements user access based on keyboard handwriting, and one that collects the time characteristics. Both applications use the developed algorithms. To study the constancy of the handwriting time characteristics, an empirical study was conducted. For this purpose, a group of individuals was selected, each with adequate typing skills. They all entered the proposed phrase over the course of a year. From the statistical data obtained, the average values and mean square deviations of the time characteristics of keyboard handwriting over various time intervals were calculated. The probability of correct user recognition was estimated from its frequency in n independent experiments. As a result of the study, the stability of the user's keyboard handwriting as a biometric characteristic for use in computer data protection systems, in particular authentication systems, was analyzed.

development of biometric authentication systems that analyze the static, invariant characteristics of a person, which include fingerprints, the face or hands, DNA, and others. The second category includes biometric systems that analyze the dynamic, behavioral characteristics of a person. They are based on the study of the human voice, the dynamics of handwriting, or the user's keyboard handwriting.
Systems that use the biometric characteristics of the user are virtually free of the drawbacks of traditional authentication systems, since the identifier is inextricably linked to the user and its unauthorized use is virtually impossible. As a biometric characteristic, it is advisable to use the user's keyboard handwriting [1]-[4].
Analysis of recent research and publications. As a biometric characteristic, keyboard handwriting belongs to the dynamic characteristics, which describe subconscious actions habitual to the user. Keyboard handwriting, or typing rhythm, reflects the way a user types text. The following features are the most typical carriers of the unique information inherent in a particular user [1], [3]:
• intervals between key presses;
• key hold time;
• number of overlaps between keys;
• degree of arrhythmia of the typing;
• speed of typing;
• number of errors when typing.
The advantages of systems based on keyboard handwriting are as follows [5], [6]:
• no additional hardware is needed for implementation;
• ease of implementation;
• purely software-based implementation;
• fast decisions on the authenticity of the user;
• an attacker cannot log in even with a valid password;
• passwords cannot be guessed by brute force.
The disadvantages of systems based on keyboard handwriting are as follows [5]-[10]:
• commercial solutions such as BioPassword® for Enterprise Networks and B-Identified™ Professional are aimed at large computer systems that require powerful hardware resources and skilled personnel. They are costly because they are geared towards the needs of large businesses;
• neither commercial nor scientific solutions provide source code, so it is impossible to test them for undocumented features or for the presence of vulnerable or malicious code;
• there are no studies of the stability of the user's keyboard handwriting over a long period of time. Existing studies do not focus on this issue. However, without it, it is impossible to make constructive recommendations on the effective use of such systems.
The goal of the article is to analyze the stability of the user's keyboard handwriting characteristics in biometric authentication systems. It is achieved by solving the following tasks: 1. Analyze existing approaches, recent research and publications. 2. Analyze the stability of the user's keyboard handwriting characteristics. 3. Evaluate the results and develop recommendations for creating and updating the user's biometric standard.
Presentation of the main research material. Fig. 1 shows a generalized scheme of operation of biometric authentication systems [1]. All biometric systems operate in two modes: training and decision-making. The main processes of the training mode are the formation of a vector of biometric characteristics and, on its basis, the formation of the user's biometric standard. The main processes of the decision-making mode are the formation of a vector of biometric characteristics and a decision based on the user's standard and the given vector.
Among the most widely used decision-making algorithms, the following can be noted [6]:
• a user recognition algorithm based on checking whether a sample falls within the region of reference samples;
• a decision-making algorithm based on neural networks.
The main disadvantages of these algorithms are as follows [3], [6]:
• the learning process is quite labor-intensive;
• training requires a large number of samples of the users' biometric characteristics.
Figure 1 - Generalized scheme of biometric authentication systems
Therefore, to develop the software module, it was decided to use a user recognition algorithm based on the Hamming distance, which is free of these drawbacks. The Hamming distance is the number of positions in which the corresponding characters of two words of the same length differ [8].
In the more general case, the Hamming distance is applied to strings (vectors) of the same length and serves as a difference metric (a function that defines the distance in a metric space) for objects of the same dimension.
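The definition above can be illustrated with a minimal sketch. The function name `hamming_distance` is chosen here for illustration and does not come from the software described in the article.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


# Works for words (strings) and for vectors (lists) alike.
print(hamming_distance("karolin", "kathrin"))        # 3
print(hamming_distance([1, 0, 1, 1], [1, 1, 1, 0]))  # 2
```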
In accordance with the chosen user recognition algorithm, an algorithm was developed for forming the vector of biometric parameters (the hold time of each key of the password and the time between presses of neighboring keys). The general form of the vector is:

$$V = \left(T_1^{up} - T_1^{down},\ T_2^{down} - T_1^{up},\ \ldots,\ T_n^{down} - T_{n-1}^{up},\ T_n^{up} - T_n^{down}\right),$$

where $T_i^{down}$ is the press time and $T_i^{up}$ is the release time of the i-th key. The hold time is defined as the difference between the moments of releasing and pressing the i-th key. The interval between presses is the difference between the moment of pressing the (i + 1)-th key and the moment of releasing the i-th key. The general form of the biometric standard for the selected user recognition algorithm was also defined. It consists of confidence intervals for the individual time parameters and the maximum allowable Hamming distance between the standard and the vector of time characteristics provided during authentication.
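The formation of the vector can be sketched as follows. This is only an illustration under stated assumptions: the input is taken to be a list of (press time, release time) pairs in milliseconds, the hold times and intervals are interleaved in typing order, and the name `build_feature_vector` is hypothetical.

```python
def build_feature_vector(events):
    """Build the vector of time parameters from key events.

    events: list of (t_down, t_up) pairs in typing order, in milliseconds
            (an assumed input format).
    Returns [hold_1, gap_1, hold_2, gap_2, ..., gap_(n-1), hold_n].
    """
    vector = []
    for i, (t_down, t_up) in enumerate(events):
        # Hold time: release moment minus press moment of the same key.
        vector.append(t_up - t_down)
        if i + 1 < len(events):
            # Interval: press of the next key minus release of this key.
            # It is negative when the two keys overlap.
            next_down = events[i + 1][0]
            vector.append(next_down - t_up)
    return vector


events = [(0, 90), (150, 230), (300, 410)]
print(build_feature_vector(events))  # [90, 60, 80, 70, 110]
```

For n keys the vector has 2n - 1 elements: n hold times and n - 1 intervals.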
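The standard of biometric characteristics comprises the confidence interval of each time parameter and the threshold Hamming distance $E_p$. In its definition, $m(E)$ and $\sigma(E)$ are, respectively, the mathematical expectation and the mean square deviation of the Hamming distance for each vector; $t_{L,p}$ is Student's coefficient with $L$ degrees of freedom, where $p$ is the probability of a type I error.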
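One plausible form of such a standard, stated here as an assumption rather than as the authors' exact formulas, uses Student-type confidence bounds. The notation $t_i^{\min}$, $t_i^{\max}$, $m(t_i)$, $\sigma(t_i)$, $t_{L,p}$ and $E_p$ is illustrative:

```latex
% Assumed form: confidence interval for the i-th time parameter
t_i^{\min} = m(t_i) - t_{L,p}\,\sigma(t_i), \qquad
t_i^{\max} = m(t_i) + t_{L,p}\,\sigma(t_i),
% Assumed form: maximum allowable Hamming distance (threshold)
E_p = m(E) + t_{L,p}\,\sigma(E).
```

Here $m(\cdot)$ and $\sigma(\cdot)$ denote the mathematical expectation and the mean square deviation estimated over the training vectors, and $t_{L,p}$ is Student's coefficient for $L$ degrees of freedom and type I error probability $p$.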
The decision on the authenticity of the user is made as follows. The user is considered genuine if [6]:

$$\sum_{i} e_i < E_p, \qquad e_i = \begin{cases} 0, & \text{if } t_i \text{ lies within the confidence interval of the standard}, \\ 1, & \text{otherwise}, \end{cases}$$

where $E = (e_1, e_2, \ldots)$ is the Hamming vector; $e_i$ is the distance between the corresponding parameters of the given time vector and the user's standard; $t_i$ is the i-th time parameter of the vector of biometric characteristics; $E_p$ is the threshold Hamming distance. Thus, the user is considered genuine when the Hamming distance from the given biometric vector to the standard is less than the threshold. When the decision is made, a Hamming vector is formed whose elements are ones where the time parameter falls outside the confidence interval and zeros where it falls inside. The Hamming distance is the number of ones in the Hamming vector.
To analyze the use of keyboard handwriting, software applications that implement user access based on keyboard handwriting [14], [15], as well as a program for collecting the handwriting time characteristics, have been developed.
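The decision rule described above can be sketched as follows. The data layout (a list of `(low, high)` confidence intervals) and the names `hamming_vector` and `is_genuine` are assumptions made for illustration; they are not taken from the applications in [14], [15].

```python
def hamming_vector(sample, intervals):
    """Return 0 where a time parameter lies inside its confidence
    interval and 1 where it falls outside."""
    if len(sample) != len(intervals):
        raise ValueError("sample and standard must have the same dimension")
    return [0 if low <= t <= high else 1
            for t, (low, high) in zip(sample, intervals)]


def is_genuine(sample, intervals, threshold):
    """Accept the user when the Hamming distance (the number of ones
    in the Hamming vector) is strictly less than the threshold."""
    return sum(hamming_vector(sample, intervals)) < threshold


# Hypothetical standard for a three-parameter vector (milliseconds).
intervals = [(80, 120), (40, 90), (70, 110)]
sample = [95, 100, 75]

print(hamming_vector(sample, intervals))           # [0, 1, 0]
print(is_genuine(sample, intervals, threshold=2))  # True
```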
Analysis of the dependence of the characteristics of the keyboard handwriting on time. To study the constancy of the handwriting, a group of individuals was selected, all of whom have adequate typing skills. All participants entered the proposed phrase about 3 times a week on average. Thus, over the year, the time characteristics of the keyboard handwriting of the group of users were collected in the amount of 144 results per user.
The experimental statistical material obtained consists of the vectors of the time parameters of the participants' keyboard handwriting when entering the same text (the access password). According to [13], the optimal length of the control phrase is from 8 to 20 characters; for such lengths the probability of correct authentication is the highest. The length of the text in the study was 15 characters. This means that the dimension of the Hamming vector is 29 (15 parameters reflecting the hold duration of each key of the text and 14 reflecting the duration of the intervals between presses of adjacent keys). It is advisable to analyze the key hold durations and the intervals between key presses separately. Based on the accumulated material, the average values and mean square deviations of the time characteristics of keyboard handwriting were calculated for different periods of time (day, week, month, year). The calculated values allow us to conclude that the time characteristics of the user's keyboard handwriting are sufficiently stable over a long period of time. For example, Fig. 2 shows, for each participant in the study, the dynamics of the weekly average key hold duration and the weekly average interval between keystrokes over the year. The data obtained suggest that the key hold durations and the intervals between keystrokes recorded during the year show no pronounced trend towards either an increase or a decrease.
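The per-period statistics described above can be sketched for the weekly case. This is a minimal illustration under assumptions: each measurement is a `(day_number, value_ms)` pair with days counted from 1, weeks are consecutive 7-day blocks, and the population form of the mean square deviation is used. The name `weekly_stats` is hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def weekly_stats(samples):
    """Group measurements by week and return {week: (mean, deviation)}.

    samples: list of (day_number, value_ms) pairs, days counted from 1
             (an assumed input format).
    """
    by_week = defaultdict(list)
    for day, value in samples:
        week = (day - 1) // 7 + 1  # days 1..7 -> week 1, 8..14 -> week 2
        by_week[week].append(value)
    return {week: (mean(values), pstdev(values))
            for week, values in sorted(by_week.items())}


# Three hypothetical key hold measurements per week, in milliseconds.
samples = [(1, 100), (3, 110), (5, 90), (8, 105), (10, 95), (12, 100)]
for week, (avg, dev) in weekly_stats(samples).items():
    print(week, avg, round(dev, 2))
```

The same grouping applies to days, months, or the whole year by changing how the period index is derived from the day number.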
Research results. Figure 3 shows the dynamics of the Hamming distance to the biometric standard for two participants in the study whose threshold values of the Hamming distance differ significantly. The abscissa axis shows the number of the vector of biometric parameters (which corresponds to the number of the day on which the vector was obtained), and the ordinate axis shows the value of the Hamming measure, that is, the number of time parameters of the vector of biometric characteristics that miss the confidence interval of the standard. The red line shows the Hamming measure as a function of time, $E_v(t)$, and the blue line shows its threshold for the given user, $E_p$. Each value of the Hamming measure that exceeds the threshold is regarded as a denial of access to the system for the genuine user. The results show that the number of errors when entering the password practically does not increase during the year. After the same graphs had been analyzed for each participant in the experiment, the number of failures was counted and the experimental frequency of correct access to the system for the genuine user was calculated. Typical values for the user group are shown in Table 1.
Let us estimate the probability p of correct user recognition from its frequency in n independent experiments [12]. The average frequency of correct user recognition in a series of 144 experiments is 0.96. We determine a 90% confidence interval for this probability.
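For orientation, the interval can be approximated as in the sketch below. It uses the simple normal approximation for a frequency, which is an assumption made here and is not necessarily the exact procedure of [12]. The function name `normal_approx_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def normal_approx_interval(freq, n, confidence):
    """Two-sided confidence interval for a probability estimated by
    the frequency `freq` in `n` independent experiments, using the
    normal approximation (an assumed, simplified method)."""
    # Quantile of the standard normal law for the two-sided level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sqrt(freq * (1 - freq) / n)
    return freq - half_width, freq + half_width


low, high = normal_approx_interval(freq=0.96, n=144, confidence=0.90)
print(round(low, 3), round(high, 3))  # 0.933 0.987
```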
The applicability of the normal distribution law is assessed from the values of $np$ and $nq$ [12]. Assuming approximately $p \approx p^*$, we obtain: $np \approx 144 \cdot 0.96 = 138.24$ and $nq \approx 144 \cdot 0.04 = 5.76$. The obtained values give grounds to assume that the normal distribution law can be applied in this case. According to the tables given in [12], for a confidence probability of 0.9