DEVELOPMENT OF A MULTINOMIAL LOGIT MODEL TO CHOOSE A TRANSPORTATION MODE FOR INTERCITY TRAVEL

Received date 30.05.2020


Introduction
Intercity passenger transportation (excluding travelers who use privately owned cars for long-distance trips) is mainly carried out by air, by rail, and by intercity buses. The distribution of passenger traffic among these modes differs between regions and depends on social conditions, the availability and comfort of the different types of transport, and other related factors. According to statistical information [1], motor vehicles clearly dominate in Ukraine in terms of the number of passengers transported. Over the last 10 years, this type of transport accounted for an average of 89 % of the total annual volume of passenger traffic, railroads for 10 %, and air and water transportation for only 1 %. In terms of the actual transportation operations (the product of the number of passengers transported and the distance of travel), the distribution is more uniform: the average annual values over the past 10 years are 40 % for railroad transport, 43 % for motor vehicles, and almost 17 % for aviation. The difference arises because railroads and aircraft are mainly used for long-distance travel.
Environmental issues related to transportation are becoming increasingly important. Transport exerts a significant adverse effect on the environment: it contaminates the air, increases noise levels, and affects the climate. According to work [2], the overall cost of combating the harmful environmental impact of transport is twice as large for bus transportation as for railroads. If one also considers privately owned passenger cars, the share of rail passenger traffic in the negative impact is only 14 %. Therefore, increasing the share of rail transportation could positively affect both the environmental and economic indicators of the state.
Random utility theory is currently the basis for modeling a choice among discrete alternatives, in transportation and in other fields. It is based on the hypothesis that every person attempts to maximize the utility of his/her choice.
Understanding why people make a specific decision when faced with several alternatives is essential for many industries [3]. For example, in paper [4], discrete choice modeling is used to determine the optimum location of railroad stations (when designing urban rail transit systems). Study [5] uses discrete modeling to estimate international cargo flows (with distribution by the types of cargo). In [6], discrete modeling is applied to assess the readiness to accept a crowdshipping service, while in [7] it is applied to the use of bicycles for travel (with distribution by travel purpose).
The main task of the current work is to define the criteria by which users select a certain type of transport for intercity transportation and, subsequently, to construct a model of choosing the type of an external transport hub. The results obtained would make it possible to determine the passenger flow of users of the urban transport system who travel across the city to the hubs of external transportation, as well as to assess the potential for increasing the attractiveness of railroad transportation.

Literature review and problem statement
Random utility models are widely used in travel modeling. The most commonly used are the multinomial logit model, the single-level hierarchical logit model, the multi-level hierarchical logit model, the cross-nested logit model, the generalized extreme value model, the probit model, and the mixed logit model [8].
The multinomial logit model (MLM) is considered the classical method of modeling user behavior during travel (the choice of a travel mode, a route, etc.) because the results of logit models can be evaluated easily and accurately enough using standard analytical methods [9]. According to the MLM, the probability of choosing alternative j among the available options is determined as the ratio of the exponential of the systematic utility of alternative j to the sum of the exponentials of the systematic utilities over the entire set of possible alternatives, with each utility scaled by the parameter of the Gumbel distribution law.
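The probability rule described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard logit form in which each systematic utility is divided by the Gumbel parameter (called `theta` here); the function name and the utility values are hypothetical and are not taken from the study.

```python
import math

def mnl_probabilities(utilities, theta=1.0):
    """Return multinomial logit choice probabilities for a list of systematic utilities."""
    # Scale each systematic utility by the Gumbel parameter theta.
    scaled = [v / theta for v in utilities]
    # Subtracting the maximum avoids overflow and leaves the ratios unchanged.
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical utilities of three alternatives (illustrative values only).
probs = mnl_probabilities([1.2, 0.4, -0.3])
print([round(p, 3) for p in probs])
```

The probabilities always sum to one, and only differences between utilities matter: adding the same constant to every utility leaves the result unchanged.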
The utility function of an alternative consists of two parts: deterministic (expected) and random. The deterministic part is a linear combination of the attributes taken into consideration when evaluating the existing set of alternatives, that is, the sum of the products of the attributes of the j-th alternative and their coefficients. The random part reflects the decisions that the user makes directly during travel.
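The link between the two-part utility and the logit probabilities can be checked numerically. The sketch below assumes that the random part is an independent Gumbel-distributed term with scale `theta` (the usual assumption behind logit models); the utilities, the seed, and the function names are hypothetical. It simulates many travelers who each pick the alternative with the highest perceived utility and compares the resulting shares with the closed-form logit probabilities.

```python
import math
import random

def simulate_choice_shares(systematic, theta=1.0, draws=50_000, seed=1):
    """Share of simulated travelers choosing each alternative under random utility."""
    rng = random.Random(seed)
    counts = [0] * len(systematic)
    for _ in range(draws):
        perceived = []
        for v in systematic:
            u = rng.random()
            while u == 0.0:  # avoid log(0); random() never returns 1.0
                u = rng.random()
            # Inverse-CDF draw from a Gumbel distribution with scale theta.
            noise = -theta * math.log(-math.log(u))
            perceived.append(v + noise)
        counts[perceived.index(max(perceived))] += 1
    return [c / draws for c in counts]

def logit_shares(systematic, theta=1.0):
    """Closed-form logit probabilities for the same utilities."""
    weights = [math.exp(v / theta) for v in systematic]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical systematic utilities of two alternatives.
V = [0.5, 0.0]
simulated = simulate_choice_shares(V)
analytic = logit_shares(V)
print(simulated, analytic)
```

With enough draws, the simulated shares approach the analytic ones, which is why the logit formula can be used in place of simulation.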
When choosing a travel method, a person is guided, consciously or unconsciously, by a complex set of factors. This process is called a modal choice. There are four approaches to assessing the modal choice: rationalistic, socio-geographical, socio-demographic, and socio-psychological [10]. Under the rationalistic approach, the main role belongs to the time and cost of travel. The socio-geographical approach considers population density, characteristics of land use, etc. In the socio-demographic approach, the main impact on the choice is exerted by age, gender, and employment, and in the socio-psychological approach by lifestyle, habits, etc. Combining these factors when developing a model of choice of the travel method would make it possible to better assess their impact on the final decision of a transport system user.
Scientific studies pay considerable attention to modeling the behavior of individuals when traveling. In particular, work [11] reported a utility function for assessing the attractiveness of bicycle use for traveling the "first/last mile" after the introduction of a bicycle-sharing system in Beijing (China). The authors described a function for calculating the probability of choosing such modes of movement as walking, riding one's own bike, riding a bike from the BSS (Bicycle-Sharing System), and driving one's own car. The factors of the utility function include the access distance, age, gender, the availability of one's own car and/or bike, as well as travel frequency. Work [12] developed a utility function for a passenger to select a route between a pair of transport areas in the city of Seoul (South Korea). The authors took into consideration such factors as travel time in a vehicle, the time of travel without a vehicle, the coefficient of transfers, the stability of travel duration, and a path circuity index. However, the derived utility functions do not include the cost characteristics of a trip, which are important for intercity passenger transportation. In addition, paper [12] disregards the socio-demographic characteristics of users because the travel data were taken from smart card transaction records.
Study [13] built a model of choosing a travel mode for suburban trips (along the route Bekasi-Jakarta, Indonesia) for connecting home and work (own car, BRT, railroad). The impact factors are the cost, travel time, travel frequency, and travel delays. However, the indicators chosen for the model are important only for users older than 50.
The authors of [14] use the MLM to assess the probability that a consumer chooses a certain travel mode in a joint air-railroad hub for two cases of interaction between these modes of transport: competition and cooperation. The factors of the utility function are the tariff value, travel duration, and traffic frequency. However, the characteristics of an external transport hub (ETH) are not taken into consideration (in particular, the ETH capacity is considered unlimited), which can affect the result obtained.
In work [15], a utility function was used to form a model of demand for tourist trips across the cities of Ukraine. A multivariate fuzzy analysis was used to develop the deterministic part of the utility function. The transport zones considered were the oblasts, whose attributes of transportation demand were population density and average income. The attributes of zone attractiveness were the cost of a journey for the consumer (with fuzzy sub-attributes: the distance, geometry, and quality of movement) and the number of hotel rooms. However, the authors developed only a model for assessing the likelihood of such a trip, without specifying the mode of traveling.
Paper [16] reported the results of surveys of students in Krakow regarding their preferences when choosing a mode of intercity travel, taking into consideration factors such as time, distance, availability, cost, and comfort indicators. Work [17] analyzes the data from polls, conducted in Hungary, on the choice of users between bus and rail. However, the cited works give only the actual probabilities of choosing a certain type of transportation, calculated from the survey results; they do not develop a model to estimate such a choice.
In [18], the authors estimate the influence of time, cost, and quality of travel on the choice of railroad as a travel mode given two alternatives: railroad and bus transport. The authors emphasize the indicators of trip quality but do not take into consideration the socio-demographic characteristics of the traveler.
Paper [19] addresses the formation of a function of attractiveness of the way from home to work when using public passenger transport in Kharkiv. The attributes of the utility function are the duration of travel, the bus capacity factor, the fare, and the number of transfers. The paper considered trips to work, which, obviously, are the most in-demand. However, for cities with a significant share of students in the structure of the population, it is important to study the peculiarities of their behavior when choosing a travel mode.
Defining criteria for choosing the type of transport for intercity travel and evaluating the impact of these criteria on the user choice of a transportation type, depending on travel conditions, is a component in determining the demand for passenger transportation. This knowledge, in particular, is important to plan ETH operation. Taking into account, in the formation of the utility function for the choice of the type of external transportation, both the characteristics of the trip and the ETH characteristics would make it possible to study the mutual influence of these factors on the attractiveness of a particular travel mode.
A special feature of multinomial logit models is that the coefficients calculated for one project territory are not relevant for another territory; each case requires specific research to choose the indexes and calculate the coefficients.

The aim and objectives of the study
The aim of this study is to build a utility function for users to choose the type of external transportation for intercity travel using a multinomial logit model.
To accomplish the aim, the following tasks have been set:
- to define a list of attributes of the utility function;
- to calculate the utility function coefficients (based on the information collected by surveys or the analysis of statistical data) and build a logit model for the user to choose the type of an external transport hub.

1. Methods to study the influence of ETH characteristics, as well as travel characteristics, on choosing the type of transportation
The following methods were used in this paper:
- empirical (for surveying users of the transport system and analyzing the information obtained);
- probabilistic-statistical (to study the behavior of transport system users in their choice of an external transport hub);
- formalization (for processing the information obtained from polling for further modeling);
- modeling (for the mathematical notation of the probability of choosing a certain type of ETH).
Data for our analysis were obtained by conducting surveys among students who study in the city of Lviv (Ukraine). Students are a significant group of users of transport services and often make long-distance trips, but their travel behavior remains insufficiently studied.

2. Descriptive statistics
Lviv is a city with a population of 760,000 and an area of 182 km². According to statistics, there are 120,000 students in Lviv, that is, 16 % of the population [20]. A large part of them are nonresident, that is, they regularly travel out of the city. Long-distance trips in Lviv are carried out from 11 main hubs of external transportation: three railroad stations, seven bus terminals, and an airport. Taking into consideration the location of Lviv on the map and the orientation of the main highways, seven directions for an intercity trip were chosen: western, northern, northeastern, eastern, southeastern, southern, and southwestern (Fig. 1).
The survey was conducted in 2019. The sample consists of 510 respondents: 58 % are students at higher educational establishments of accreditation level IV; 42 % are students at educational establishments of accreditation level III. The respondents first indicated the direction of the trip, the external transport hub where it started, and the frequency of the trip. Most trips (45 %) began from the main railroad station. Most of the surveyed students traveled east (26 %) and southeast (24 %). The distribution of the number of trips by the respondents based on these criteria is shown by charts in Fig. 2.
Regarding the travel frequency, 44 % of users travel to an external transport hub once a week, 16 % more than once a week, and 40 % less than once a week.
The questionnaire contained the following indicators:
- average travel time;
- the average time taken to get from the place of residence to the place of dispatch (external transport hub);
- the average cost of travel;
- the time interval of dispatch.
The respondents also assessed the importance of these factors. The scoring scale consists of four judgments: "very important", "important", "not very important", and "least important".
The survey yielded data on the characteristics of the trips made. The range of each metric was divided into 5 intervals. The results are shown by diagrams in Fig. 3. Based on the derived values, it can be concluded that the respondents mostly travel for 1-2 hours and pay up to USD 2 (USD 1 ≈ UAH 25). In 71 % of cases, getting around the city to the hub takes from 15 to 45 minutes, and the most common dispatch time (58 %) is the period from 13:00 to 18:00.
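Dividing a metric into intervals and computing the share of responses in each can be sketched as follows. The interval boundaries (lower bounds of 0, 1, 2, 3, and 4 hours, with the last interval open-ended) and the sample durations are assumptions made for illustration; the actual boundaries used in the survey are those shown in Fig. 3.

```python
from bisect import bisect_right

def interval_shares(values, lower_bounds):
    """Share of observations falling into each interval.

    Interval i covers [lower_bounds[i], lower_bounds[i + 1]); the last one is open-ended.
    """
    counts = [0] * len(lower_bounds)
    for v in values:
        idx = bisect_right(lower_bounds, v) - 1
        if idx < 0:
            raise ValueError(f"value {v} lies below the first interval")
        counts[idx] += 1
    return [c / len(values) for c in counts]

# Hypothetical trip durations in hours (not survey data).
durations = [0.5, 1.5, 1.2, 2.5, 4.5, 1.8, 3.2, 0.8, 1.1, 2.0]
shares = interval_shares(durations, [0, 1, 2, 3, 4])
print(shares)
```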

If one analyzes separately the trips made from railroad hubs and the trips starting at bus terminals, the greatest difference is observed in the distribution of the number of trips relative to their duration (Fig. 4).

Fig. 4. Results of choosing a transportation mode depending on the travel parameters: a - distribution of the duration of intercity travel for trips from railroad hubs; b - distribution of the cost of intercity travel for trips from railroad hubs; c - distribution of the time to get around the city for trips from railroad hubs; d - distribution of the time periods of dispatch for trips from railroad hubs; e - distribution of the duration of intercity travel for trips from bus hubs; f - distribution of the cost of intercity travel for trips from bus hubs; g - distribution of the time to get around the city for trips from bus hubs; h - distribution of the time periods of dispatch for trips from bus hubs

If we analyze the correlation between the trip duration and its price (Fig. 5), it is clear that the difference between railroad and bus in the cost of a trip of a given duration grows with the distance of transportation. At a travel duration of up to 1 hour, USD 2 for the trip was paid by 83 % of railroad users and 89 % of bus users (a difference of only 6 percentage points). At a trip duration from 1 to 2 hours, the same price was paid by 56 % of railroad users and 46 % of bus users (a difference of 10 percentage points). At a trip duration from 2 to 3 hours, the shares are 44 % and 14 %, respectively (a difference of 30 percentage points). At a trip duration of 3-4 hours, USD 2 was paid by 43 % of railroad users and 8 % of bus users (a difference of 35 percentage points). At a trip duration exceeding 4 hours, 14 % of users could still take a trip by rail at this cost, whereas no bus users among the respondents paid this price.
Therefore, we expect a growing number of railroad transport users with an increase in the duration of travel.
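The widening gap quoted above can be reproduced with a few lines of arithmetic. The shares are the ones stated in the text; the dictionary layout, the interval labels, and the sign convention (rail share minus bus share, in percentage points) are choices made for this sketch.

```python
# Share (in %) of respondents who paid USD 2, by trip duration: (railroad, bus).
paid_usd2 = {
    "up to 1 h": (83, 89),
    "1-2 h": (56, 46),
    "2-3 h": (44, 14),
    "3-4 h": (43, 8),
    "over 4 h": (14, 0),
}

# Signed gap in percentage points; a negative value means the bus share is larger.
gaps = {duration: rail - bus for duration, (rail, bus) in paid_usd2.items()}

for duration, gap in gaps.items():
    print(f"{duration}: {gap:+d} percentage points")
```

Only for the shortest trips is the bus share larger; from 1-2 hours onward the rail share is larger, and the gap peaks at 3-4 hours.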
In terms of importance, the respondents ranked the duration of the trip from the external transport hub to the destination as the most important attribute. The second in importance is the cost of traveling, followed by the time to get around the city to a hub; the least significant attribute for the respondents was the time interval of dispatch.

Calculation of the utility function coefficients and the results of building a multinomial logit model to choose a mode of transportation
A multinomial logit model is used to simulate the choice of a travel mode. The expediency of using logit models to simulate intercity trips is confirmed by available research [21]. The advantages of these models are that they are relatively easy to construct, remain applicable with a small number of parameters, and can be used to analyze the significance and elasticity of the model parameters.
Since students rarely use air transport for domestic trips, the model includes two travel options: by railroad and by bus. The data on trips were included in the model as panel data.
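As an illustration of the elasticity analysis mentioned above, the sketch below uses the standard direct-elasticity result for logit models: the elasticity of the probability of alternative j with respect to its own attribute equals the coefficient times the attribute value times (1 - P_j), divided by the Gumbel parameter. The coefficient, constant, and cost values are hypothetical and are not the coefficients estimated in this study; the analytic value is checked against a finite-difference estimate.

```python
import math

def binary_logit(v_j, v_other, theta=1.0):
    """Probability of alternative j when only two alternatives are available."""
    return 1.0 / (1.0 + math.exp(-(v_j - v_other) / theta))

def direct_elasticity(beta_k, x_k, prob_j, theta=1.0):
    """Direct elasticity of P_j with respect to its own attribute x_k."""
    return (beta_k / theta) * x_k * (1.0 - prob_j)

# Hypothetical values for illustration only.
BETA_COST = -0.5   # assumed cost coefficient of the rail alternative
CONSTANT = 1.0     # assumed alternative-specific constant
V_BUS = 0.0        # bus utility used as the reference level

def prob_rail(cost):
    return binary_logit(CONSTANT + BETA_COST * cost, V_BUS)

cost = 2.4
p = prob_rail(cost)
analytic = direct_elasticity(BETA_COST, cost, p)

# Finite-difference check: d ln(P) / d ln(cost).
h = 1e-6
numeric = (math.log(prob_rail(cost * (1 + h))) - math.log(p)) / math.log(1 + h)
print(round(analytic, 4), round(numeric, 4))
```

A negative elasticity here means that a 1 % increase in the rail fare lowers the probability of choosing rail by roughly that percentage.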
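The text does not state which procedure was used to calculate the coefficients, so the following is only a generic sketch of one common approach: maximum-likelihood estimation of a two-alternative logit by gradient ascent. The data are synthetic, the "true" coefficients are invented for the example, the Gumbel parameter is normalized to 1, and the single attribute (centred trip duration) is a simplification.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_binary_logit(rows, choices, steps=500, rate=0.5):
    """Maximize the average log-likelihood of a binary logit by gradient ascent.

    rows    -- attribute vectors entering the utility difference (rail minus bus)
    choices -- 1 if rail was chosen, 0 if bus was chosen
    """
    k, n = len(rows[0]), len(rows)
    beta = [0.0] * k
    for _ in range(steps):
        grad = [0.0] * k
        for x, y in zip(rows, choices):
            p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
            for i in range(k):
                grad[i] += (y - p) * x[i]  # gradient of the log-likelihood
        beta = [b + rate * g / n for b, g in zip(beta, grad)]
    return beta

# Synthetic data: hypothetical coefficients, not the values of Table 1.
TRUE_BETA = [-0.2, 0.8]  # [constant, coefficient of centred trip duration]
rng = random.Random(42)
rows, choices = [], []
for _ in range(1000):
    duration = rng.uniform(0.5, 5.0)   # hours
    x = [1.0, duration - 2.75]         # centring improves convergence
    p_rail = sigmoid(sum(b * xi for b, xi in zip(TRUE_BETA, x)))
    rows.append(x)
    choices.append(1 if rng.random() < p_rail else 0)

beta_hat = fit_binary_logit(rows, choices)
print([round(b, 2) for b in beta_hat])
```

The estimates should land close to the coefficients used to generate the data, up to sampling noise. In practice, a statistical package that also reports standard errors and significance tests would normally be preferred over a hand-written optimizer.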
The probability of selecting a specific alternative j among the possible ones is determined from the following formula:

$$p(j) = \frac{\exp(V_j/\theta)}{\sum_{i=1}^{2} \exp(V_i/\theta)}, \quad (1)$$

where $V_j$ is the systematic utility of the alternative j; $\theta$ is the parameter of the Gumbel distribution law; i=1...2 is the set of possible alternatives.

The systematic utility of choosing the alternative of traveling by bus:

$$V_{bus} = \beta_{dir}^{bus} X_{dir} + \beta_{travel}^{bus} X_{travel} + \beta_{cost}^{bus} X_{cost} + \beta_{city}^{bus} X_{city} + \beta_{time}^{bus} X_{time}, \quad (2)$$

where $X_{dir}$ is the attribute that evaluates possible directions of travel from a bus or railroad hub of external transport; $X_{travel}$ is the attribute that evaluates the duration of travel from a bus or railroad hub of external transport; $X_{cost}$ is the attribute that evaluates the cost of travel from a bus or railroad hub of external transport; $X_{city}$ is the attribute that evaluates the duration of travel in the city when moving to a bus or railroad hub of external transport; $X_{time}$ is the attribute that evaluates the possible period of dispatch from a bus or railroad hub of external transport; $\beta_k$ are the evaluation attribute coefficients.
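With only two alternatives, the logit probability reduces to a binary logit that depends only on the difference between the two systematic utilities. The short derivation below assumes the standard logit form with the Gumbel parameter $\theta$; the labels $V_{train}$ and $V_{bus}$ are notation introduced here for the two alternatives.

```latex
\begin{align*}
p(\mathrm{train})
  &= \frac{\exp(V_{train}/\theta)}{\exp(V_{train}/\theta) + \exp(V_{bus}/\theta)} \\
  &= \frac{1}{1 + \exp\left(-(V_{train} - V_{bus})/\theta\right)}, \\
p(\mathrm{bus}) &= 1 - p(\mathrm{train}).
\end{align*}
```

The second line follows from dividing the numerator and the denominator by $\exp(V_{train}/\theta)$. When the two utilities are equal, each alternative is chosen with probability 0.5.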
Information about the trip direction was introduced based on the distribution of the total number of trips for directions (information is in Fig. 2).
Our analysis has revealed that the city trip duration and the time interval of dispatch have no statistically significant influence on the selection of the type of external transport by the investigated category of users. In addition, a greater correlation between the indicators is observed if one forms separate models of choice for directions with a different proportion of route length. This criterion was used to divide ETH into three categories:
- the main share of routes within the limits of 100 km (western and southwestern directions);
- the main share of routes ranging from 100 to 200 km (northern and northeastern directions);
- the main share of routes longer than 200 km (southern, southeastern, and eastern directions).
Table 1 gives the derived coefficients and statistical characteristics of the obtained logit models of choosing the type of an external transport hub by users during their intercity trips. The $\beta_k$ coefficients reflect the contribution of each attribute to the general utility of the choice. For the trip duration indicator, the positive values of the $\beta$ coefficient for railroad hubs indicate that, as the duration of travel grows, the utility of choosing these hubs increases, while the utility of choosing bus hubs decreases.
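The split into three distance categories described above can be represented as a simple lookup that routes each surveyed trip to the data set of the corresponding model. The mapping of directions to categories follows the text; the key spellings, the record layout, and the sample trips are hypothetical.

```python
# Direction -> distance category, following the three categories listed in the text.
DISTANCE_CATEGORY = {
    "west": "up to 100 km",
    "southwest": "up to 100 km",
    "north": "100-200 km",
    "northeast": "100-200 km",
    "south": "over 200 km",
    "southeast": "over 200 km",
    "east": "over 200 km",
}

def split_by_category(trips):
    """Group trip records by distance category so each group can feed its own model."""
    groups = {}
    for trip in trips:
        category = DISTANCE_CATEGORY[trip["direction"]]
        groups.setdefault(category, []).append(trip)
    return groups

# Hypothetical survey records (illustrative only).
sample_trips = [
    {"direction": "east", "mode": "rail"},
    {"direction": "west", "mode": "bus"},
    {"direction": "northeast", "mode": "rail"},
    {"direction": "southeast", "mode": "bus"},
]
groups = split_by_category(sample_trips)
print({category: len(items) for category, items in groups.items()})
```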
The assessment of the findings by the Fisher criterion testifies to their adequacy (Table 2): the significance level of the Fisher criterion is less than 0.05 in most cases, and the Fisher criterion exceeds its tabular value [22]. As an example, the probability that a student chooses a railroad hub as the point of departure for an intercity trip in the northeast direction (number 1.6), at a trip duration of 2.5 hours (number 3) and a transportation cost of USD 2.4 (number 2), equals 0.68.
The resulting model was tested on a test sample of 30 poll results that were not included in the original dataset. The sample included 15 choices of travel by bus and 15 by railroad. The graphical comparison of the actual data and the results predicted by the constructed model is shown in Fig. 6. The model produced 86 % correct results on the test sample, which indicates its adequacy.
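A probability of 0.68 can be translated back into a utility difference, which helps to interpret the example. Under the binary logit form, the difference between the rail and bus utilities equals the Gumbel parameter times the log-odds. The sketch below assumes the parameter is normalized to 1; the function names are hypothetical.

```python
import math

def utility_gap(prob, theta=1.0):
    """Utility difference (chosen minus other) implied by a binary logit probability."""
    return theta * math.log(prob / (1.0 - prob))

def probability_from_gap(gap, theta=1.0):
    """Inverse mapping: binary logit probability for a given utility difference."""
    return 1.0 / (1.0 + math.exp(-gap / theta))

gap = utility_gap(0.68)
print(round(gap, 3))
```

So a 0.68 probability of choosing the railroad hub corresponds to a rail utility that exceeds the bus utility by roughly 0.75 units when the Gumbel parameter is 1.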
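A share of correct results of this kind can be computed as sketched below. The text does not state the decision rule, so the sketch assumes that a prediction counts as "railroad" when the modeled probability is at least 0.5; the probabilities are invented so that 26 of 30 cases are classified correctly, which gives a hit rate of about 86.7 %.

```python
def hit_rate(rail_probabilities, chose_rail, threshold=0.5):
    """Share of cases where the predicted mode matches the observed one."""
    hits = 0
    for p, actual in zip(rail_probabilities, chose_rail):
        predicted_rail = p >= threshold
        if predicted_rail == actual:
            hits += 1
    return hits / len(chose_rail)

# Hypothetical test sample: 15 rail choices followed by 15 bus choices.
observed = [True] * 15 + [False] * 15
# Hypothetical model outputs: 13 + 13 correct classifications, 2 + 2 errors.
predicted = [0.7] * 13 + [0.4] * 2 + [0.3] * 13 + [0.6] * 2

rate = hit_rate(predicted, observed)
print(f"{rate:.1%}")
```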

Discussion of results of constructing an MLM to choose a transportation mode for intercity travel
The distribution of passenger traffic between railroads and motor vehicles for student trips differs from the generalized nationwide statistical data for Ukraine [1]. According to our study, when choosing between traveling by bus and by rail, 53 % of the users within this group select, if possible, a trip by railroad, and 47 % a trip by bus (Fig. 2). Therefore, it is relevant to select the criteria for this particular group of users. Based on an analysis of the results of our surveys in the city of Lviv, we investigated the factors that would affect this choice. It was assumed that such factors were the distance and travel time, the duration of travel around the city, and the characteristics of ETH operation. The results shown in Fig. 5 demonstrate that, from the user's point of view, the cost and travel time are the important attributes for choosing a mode of transportation. The percentage of railroad users is smallest when the trip duration is within 1-2 hours (33 %) and largest during long journeys (at a trip duration longer than 4 hours, railroads are chosen by 92 % of the respondents). As the travel duration increases, the difference between the cost of traveling by bus and by rail grows in favor of railroads (in this case, it is usually possible to choose between more expensive and cheaper carriages). Regarding the attributes of ETH, the choice of a transportation mode is affected by the share of dispatches along a particular direction from a hub. This indicator is especially important for trips up to 100 km. This agrees with the results reported in work [14]: for short travel, the frequency of movement is important. Such indicators as the duration of getting around the city and the time period of dispatch are not statistically significant (Table 1).
The constructed utility functions for buses and railroads (2), (3) make it possible to estimate the volumes of travel by users of the selected category at ETH with the known parameters of the trip. The ETH parameters are important when choosing a transportation mode, and their effect depends on the length of the trip, as evidenced by data in Table 2 and the result of testing the adequacy of the model (Fig. 6). Therefore, they should be considered when simulating intercity trips. The resulting MLM could be particularly useful for cities with a significant proportion of students in the population structure for the initial assessment of demand for rail and bus intercity trips.

Fig. 6. Comparison of actual data and forecasting data by experiment number (1 to 30)

Given the initial limitations of the model, its application may prove impractical when the share of transportation by other modes of transport (for example, by air) exceeds the magnitude of the model's error. Further studies are also required to explore the following conditions: a trip distance of more than 500 km, as well as the restrictions related to the epidemiological situation. The use of the MLM coefficients for cities whose average population income differs from that in Ukraine would require the adjustment of travel costs.
Since our polling revealed that a certain share of users prefers a ride-sharing service for a journey (BlaBlaCar, etc.), further studies may consider such a travel alternative. It may also be advisable to introduce additional factors influencing the model.

Conclusions
1. Our study helps better understand the impact of individual attributes on the choice, by student users, of the type of an external transport hub for intercity travel. As expected, the most important factors are the trip duration and the cost of travel, as well as the share of dispatches from an ETH in a certain direction. It turned out that such factors as the duration of getting around the city and the dispatch time interval are not important for users within the investigated group, so in further studies these attributes may be replaced by others.
2. We have calculated the MLM coefficients, which characterize the importance and impact of each of the evaluated parameters on the general utility of choosing a particular transportation mode. These coefficients differ depending on the length of the trip. In particular, the coefficient $\beta_{travel}^{train}$ takes its highest value when the trip length is from 100 to 200 km, the coefficient $\beta_{cost}^{train}$ when traveling up to 100 km, and the constant $\beta^{train}$ increases dramatically with an increasing travel length. The coefficient $\beta_{dir}^{train}$ decreases with an increasing travel length, which can be explained by the fact that short trips are usually serviced by smaller railroad stations with a limited choice of directions, while long-distance travel starts from the main railroad hub. The constructed multinomial logit model for the selection of a transportation mode during intercity trips makes it possible, taking into consideration the attracting capacity of a hub [23], to determine the probable number of trips using a certain type of transport by users within the investigated group. These results could be used for improving a transportation network in a city or region. It is also possible to apply the derived utility function to simulate the functioning of ETH in different unpredictable cases (mass events, festive and fair days, emergencies, etc.).