Public Transit System Assessment in Austin, TX

Yang Qing

Abstract

Austin is a newly developed city and was the third fastest growing big city in the US in the last decade. Public transit matters to Austin's growing population in terms of location accessibility and connectivity. It also has numerous socioeconomic impacts, since it is a major connector to human services and economic activities. Hence, this project uses GIS techniques to assess the reasonableness and quality of the transit system in Austin, based on data in the General Transit Feed Specification (GTFS) format. The assessment covers three aspects: (1) the average speed between each pair of bus stops; (2) the flows at each bus stop; (3) the population coverage and spatial coverage of the whole transit system. The social impacts of the public transit system are also evaluated quantitatively.

Keywords: Public Transit System, GIS

1. Introduction

Before the emergence of automobiles, public transit systems were the major transport mode providing access to jobs and city cores. Even after the diffusion of private cars, public transit remains an important alternative, especially in big cities and concentrated employment zones where population density is high. Public transit also benefits from economies of scale, which makes it more economically efficient than private cars. Moreover, it has further advantages over private vehicles, such as providing equal access to the public, reducing traffic congestion and reducing environmental problems (T. L. Lei and R. L. Church, 2010). In addition, public transit provides basic mobility to those who do not own vehicles or are unable to drive because of financial, physical or mental difficulties. However, the use of public transit depends largely on the quality of the system: if accessibility or connectivity is low, people are more likely to choose other transport modes.
Austin is the 11th most populous city in the US and the 4th in Texas, with population growth averaging approximately 30% per decade over the last several decades. According to the US Census Bureau, Austin had a population of more than 900,000 in 2014. Austin had a median of 2 vehicles per household, compared with 2.3 vehicles per household in Texas and 2.2 in the US. Therefore, commuting and other movements of residents depend largely on public transit. As a matter of fact, there were 2 million rides on just two of the transit routes in 2014, according to Capital Metro (Capital Metro, 2015), Austin's largest transit authority.

However, because public transit systems are complex, it is often difficult to analyze and measure their accessibility and connectivity. Moreover, there is no consensus on how to measure and analyze the characteristics of a public transit system, since many approaches and many kinds of measures exist (T. L. Lei and R. L. Church, 2010). Most analyses of public transit nevertheless fall within the scope of time consumption, schedule frequency, served population, location and user groups. This work partly uses the methodology and assessment framework proposed by Hadas (2013), a methodology for extracting, storing and analyzing public transit data based on the spatial and temporal coverage of the public transit system. The assessment is then made from the results of these measures, in terms of the quality and socio-economic impacts of the public transit system.

This work uses data in the General Transit Feed Specification (GTFS) format, which is maintained by Google (Google Developers, 2015). In this case, the data was published by Capital Metro, the transit authority in Austin. GTFS defines a common format for public transportation schedules and associated geographic information.
GTFS allows public transit agencies to publish their transit data and developers to write applications that consume that data in an interoperable way (Google Developers, 2015). This work also presents a data-oriented model, generating maps and statistics that show different quantitative measurements of the public transit system: the speed on each segment, the flows at each location, and the overall route coverage. The social relevance and implications of these measures are also considered and discussed.

2. Literature Review

Public transit systems have numerous socio-economic impacts and benefits. They provide access to locations associated with residents' daily activities, such as hospitals, employment zones and schools, which are closely tied to people's daily lives and material needs. Economically, good public transit systems increase the attractiveness of bus travel relative to car travel, which helps alleviate congestion. Environmentally, the increased attractiveness of bus travel relative to car travel helps reduce pollution. Socially, the existence of a bus service increases the accessibility of social services and employment opportunities for non-car owners (Department for Transport, 2013).

There are numerous approaches, indicators and factors for measuring and assessing a transit system. Notably, Kittleson & Associates (2003) proposed standard factors and measures of transit quality, covering transit availability as well as comfort and convenience. The availability factors are (1) service coverage: access to transit stops based on proximity; (2) scheduling: how often and when transit service is provided; (3) capacity: the capacity of a bus or train; (4) information: information on how to use the service and when to get off and transfer. The comfort and convenience factors concern travel time, passenger loads, reliability and security.
As a matter of fact, most public transit quality measurements are closely related to, or fall within the scope of, these factors.

Coverage is an important factor for the accessibility of a public transit system: it accounts for the area reachable by the system, also referred to as the service area. Many transit studies measure the spatial coverage of a public transit system (Mamun et al., 2013; Ryus et al., 2000), typically using proximity to transit stops to represent spatial coverage (Kittleson & Associates, 2003). Notably, Kittleson & Associates (2003) presented standard measuring approaches, such as creating a 400m buffer around bus routes and a 700m buffer around light rail routes, then computing the buffered area as a percentage of the total census block area, where the buffer indicates the walkable vicinity of the transit route. Ryus et al. (2000), however, define the walking vicinity by walking time instead of walking distance, which is more precise because walking time can vary with the friction surface even when the distance stays the same. Distance-based measures have a limitation: the Euclidean distance from a bus stop does not necessarily reflect proximity, because it ignores physical barriers and friction surfaces.

Similarly, the report by the Department for Transport (2013) used access to the transit network as a factor for quantifying the social impact of the transit system. It put the distance to the transit system into three categories: within 400m, 400m-800m, and above 800m. Access to the transit network was then given and compared for different population groups, so as to understand the socio-demographic impact of the transit system.

Some other transit studies measure the temporal coverage of a public transit system, which is similar to the measure of flow in this project.
Such temporal coverage concerns daily or weekly frequencies and hours of operation (Polzin, 2002). Often, temporal and spatial coverage are combined to give a broader analysis of a transit system. For example, Hadas (2013) merged the timetables of fixed transit routes with spatial information to derive the hours of operation, service frequencies and time intervals between bus stops. Mamun et al. (2013) used per capita service frequency and spatial coverage as indicators of transit access.

3. Data

3.1 GTFS

The transit data is in GTFS (General Transit Feed Specification) format, a common format for public transportation schedules and associated geographic information. The specification is maintained by Google, and over 100 public transit agencies in US cities and in other countries have published their fixed or planned operating schedules in it. More specifically, the GTFS data consists of 10 text files, among them agency, stops, routes, trips, stop times, calendar, shapes, fare attributes and fare rules. The files used in this study are stops, trips, stop times and shapes. Stops contains the stop id, stop location (longitude and latitude) and street name. Trips contains the trip id and the shape id. Stop times contains the trip id, stop id, arrival/departure time and stop sequence. Shapes contains the shape longitude/latitude, shape sequence and shape id. In short, each file holds useful information for route planning and public transit analysis.

What matters most in GTFS is the meaning and significance of each file for public transit assessment. The stops file, which contains unique stop ids and locations, can be used for point-oriented spatial analysis. The shapes file, which contains the shape of roads, can be used for road visualization. The stop times file can be used to extract spatio-temporal information and, through data merging, to derive the average speed.
The trips file, which contains trip ids and shape ids, can be used to merge stops and trips with the road shapes, from which a public transit map can be derived.

A data quality check is necessary beforehand. Fig.1 shows the road shapes and bus stops visualized in ArcGIS with an OpenStreetMap base layer for comparison. The quality and accuracy of the data appear high, since the road shapes and bus stops match the base layer.

Fig.1 Data Quality Check on ArcGIS

3.2 Population Data

The second data set is population by census block, published by the US Census Bureau (US Census, 2015). It is mainly used to estimate the population coverage and spatial coverage of the public transit system. The data is vector data at state scale, containing population and other demographic characteristics by census block. Because it is at state scale, it first has to be clipped to the boundary of the Austin metropolitan area, which is also provided by the US Census Bureau. The population by census block then needs to be converted to population density and rasterized for spatial computation.

4. Methodology

4.1 Data Preprocessing

The raw GTFS data is merely a set of text files; they are of little use until they are combined and spatialized. The tables are joined following relational database principles: each table has a primary key, which is referenced as a foreign key by another table. Table.1 lists the keys of each table, which are used to join them.

File Name     Primary Key    Foreign Key
Trips         trip id        trip id, shape id
Stops         stop id        stop id
Stop Times    trip id        trip id, stop id
Shapes        shape id       shape id

Table.1. The relations of text files in GTFS.

The relations between the objects in these tables are: (1) each trip contains multiple stops; (2)
each shape is shared by multiple trips. Since the desired information is the locations and stop times along each trip, plus the shapes for visualization, the final outputs are two tables: (1) a joined table containing stop locations, stop times, trip ids, the corresponding shape ids and so forth; (2) the shapes table.

4.2 Average speed between each pair of stops

Since the output of the data preprocessing is a joined table containing stop locations (longitude and latitude), arrival/departure times, stop sequences, travel distances, trip ids and shape ids, the average speed between each pair of stops can be computed and visualized. Fun.1 gives the computation.

$$V_{i,j} = \frac{1}{n}\sum_{k=1}^{n}\frac{d_{i,j}}{\left|t_k^{j} - t_k^{i}\right|}$$ (Fun.1. Average speed between each pair of stops)

In this equation, i and j indicate a pair of stops, n is the total number of buses passing this pair of stops according to the table of records, d_{i,j} is the travel distance between the two stops taken from the joined table, and t_k^i is the time at which bus k reaches stop i.

4.3 Total flows at each stop

Since the output table records all bus trips in June, July and August 2015, with 76 days in the record, the total number of bus passes at each stop can be counted from the timetable. This measure gives the total flow at each stop and shows the statistical and spatial distribution of service across bus stops.

4.4 Population coverage and spatial coverage

Based on the spatial data of population by census block, the population coverage of the transit system can be measured by counting the population in the proximity of each bus stop. This measurement uses the walking distance to bus stops, whereas some scholars measured coverage using the walking distance to bus routes (Mamun et al., 2013). Proximity to bus stops is arguably more realistic, because a bus only picks up passengers at its stops.
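As an illustration of the measures in Sections 4.2 and 4.3, the following Python sketch computes the mean speed per stop pair (Fun.1) and the flow per stop with pandas. It is not the implementation used in this work. The column names follow the GTFS stop_times.txt fields (trip_id, stop_id, stop_sequence, arrival_time, shape_dist_traveled); the function names are hypothetical, and the sketch assumes that shape_dist_traveled is present and expressed in kilometres, which depends on the feed.

```python
import pandas as pd


def to_seconds(hms: str) -> int:
    """Convert a GTFS 'HH:MM:SS' time (hours may exceed 23) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return 3600 * hours + 60 * minutes + seconds


def segment_speeds(stop_times: pd.DataFrame) -> pd.DataFrame:
    """Mean speed in km/h (as in Fun.1) and bus count n for each stop pair."""
    st = stop_times.sort_values(["trip_id", "stop_sequence"]).reset_index(drop=True)
    st["t"] = st["arrival_time"].map(to_seconds)
    # Pair every record with the next stop on the same trip.
    nxt = st.groupby("trip_id", sort=False)[
        ["stop_id", "t", "shape_dist_traveled"]
    ].shift(-1)
    seg = pd.DataFrame({
        "stop_id": st["stop_id"],
        "next_stop_id": nxt["stop_id"],
        "hours": (nxt["t"] - st["t"]).abs() / 3600.0,
        # Assumption: shape_dist_traveled is cumulative distance in km.
        "km": nxt["shape_dist_traveled"] - st["shape_dist_traveled"],
    })
    # Drop the last stop of each trip and zero-duration segments.
    seg = seg[seg["next_stop_id"].notna() & (seg["hours"] > 0)]
    seg = seg.assign(kmh=seg["km"] / seg["hours"])
    return (
        seg.groupby(["stop_id", "next_stop_id"])["kmh"]
        .agg(mean_kmh="mean", n="count")
        .reset_index()
    )


def stop_flows(stop_times: pd.DataFrame) -> pd.Series:
    """Total number of recorded bus passes at each stop (Section 4.3)."""
    return stop_times.groupby("stop_id").size().rename("flow")
```

For instance, a bus that reaches stop A at 08:00:00 and stop B, 2 km further along the shape, at 08:06:00 contributes 2 km / 0.1 h = 20 km/h to the A-B pair. Because the time difference is taken between arrivals, the dwell time at the first stop is included in the segment time.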
The walking threshold applied in this work is 500m, given that most scholars used a 0.25-mile buffer for bus stops and a 0.5-mile buffer for light rail stops (Mamun et al., 2013). Because the data does not indicate whether a route is a bus route or a light rail route, an approximate buffer of 500m was taken for this measure. Fun.2 gives the computation.

$$PC_i = \sum_{k=1}^{n} S_k \, p_k$$ (Fun.2. Population coverage at each stop)

In this function, PC_i is the population coverage at stop i, n is the number of pixels within the buffer area of stop i, S_k is the area (km2) of pixel k, and p_k is the population density (persons/km2) of pixel k, computed from the population data. The overall population coverage is obtained by summing this function over the stops, and can be expressed as a percentage of the whole population.

Spatial coverage is very similar to population coverage, except that it does not take population into account and only measures area. This work also calculated spatial coverage as a percentage of the whole area, where the whole area excludes places with zero population.

5. Result and Discussion

The operating speed of the Austin transit system was estimated in this work. The minimum speed between a pair of stops is 5.5 km/h, whereas the maximum is 89 km/h. Although the maximum speed is fairly high, the mean and median speeds are both about 24.5 km/h, which means that most routes are slow. The estimated speed is not the actual running speed of the buses, because the travel time includes the dwell time at stops, which leads to an underestimate of the real speed. Nevertheless, this measure at least partially represents the speed of the transit system.

Fig.2 Speed map with Stamen base layer

Fig.2 shows the spatial layout of the speed; blue indicates low speed and red indicates high speed.
Generally, the speed in downtown areas is low, whereas the speed in the periphery is high. This pattern is consistent with the heavier traffic in downtown areas, which causes congestion and slows the flow.

The total flow at each stop was counted and visualized in this work. The minimum flow count is 1, the maximum is 1907, and the mean and median are approximately 200 and 160, respectively. However, the data source has a limitation: some stops had only 1 flow in 76 days. Some records are therefore very likely missing, which may lead to an underestimate of flows. Hence, this measure only accounts for the records in the data and may not reflect reality. Fig.3 shows the spatial layout of flows at each stop, with the road shapes as base layer. Red represents more flows and blue represents fewer flows. The red dots are mostly in the central urban areas, while some are in the periphery and might be major bus stations. Some bus trips happened only a few times over the period, so these trips might be special services.

Fig.3 Total flows at stops

The population coverage and spatial coverage were estimated in this work. The population coverage was measured by counting the population within the 500m buffer zone of the transit stops, whereas the spatial coverage was measured as the total area within that buffer zone. The results are not surprising: the population coverage is approximately 491500, while the total population in the study area is estimated at around 930000, so the percentage coverage is 52.7%. The spatial coverage is 295.66 km2, while the total area is 819.34 km2 with rivers and zero-population zones excluded, so the percentage coverage is estimated to be 36%.
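The coverage measures of Section 4.4 can be sketched on a population density raster as follows. This is a minimal Python (NumPy) illustration under stated assumptions, not the procedure actually run in this work: the function name and the representation of stops as raster (row, column) indices are hypothetical, cells are assumed square, and distance is the straight-line distance between cell centres. The sketch also takes the union of all stop buffers, so that a pixel lying within 500m of several stops is counted once; this is a choice made here for the illustration rather than something stated in the method above.

```python
import numpy as np


def coverage_shares(density, stops_rc, cell_km, radius_km=0.5):
    """Return (population share, area share) covered by stop buffers.

    density  : 2-D array of population density (persons/km2) per pixel
    stops_rc : iterable of (row, col) pixel indices of stops (hypothetical)
    cell_km  : side length of a square pixel in km
    """
    density = np.asarray(density, dtype=float)
    rows, cols = np.indices(density.shape)
    covered = np.zeros(density.shape, dtype=bool)
    for r, c in stops_rc:
        # Straight-line distance from the stop's pixel to every pixel, in km.
        dist_km = np.hypot(rows - r, cols - c) * cell_km
        covered |= dist_km <= radius_km      # union of buffers
    population = density * cell_km ** 2      # S_k * p_k for every pixel
    populated = density > 0                  # zero-population pixels excluded
    pop_share = population[covered].sum() / population.sum()
    area_share = (covered & populated).sum() / populated.sum()
    return float(pop_share), float(area_share)
```

As a check on the reported figures, the spatial share 295.66 km2 / 819.34 km2 is approximately 0.361, which matches the stated 36%.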
Discussion

This work measured and visualized the performance of the transit system in Austin, adopting and modifying methods proposed by other scholars (Hadas, 2013; Mamun et al., 2013; O'Sullivan et al., 2000). These methods measure speed, flows and route coverage. The overall performance of the transit system is fairly good in terms of accessibility and connectivity. The average speed is not too low: it is fairly high in the city periphery and low in the city's core areas, which makes sense given that the core has more congestion and more traffic lights. The flows at each stop indicate where most routes go and how frequently. Based on the map, most flows occurred in the city's core areas and at some stations in the periphery.

The spatial coverage is 36%, which is higher than the result of Mamun et al. (2013) for New Haven, CT, although they used whole-route coverage instead of bus stop coverage. The two results are therefore not directly comparable. Still, whole-route coverage should be higher than bus stop coverage, and the result of Mamun et al. is nevertheless lower than the result in this project, which suggests that coverage in Austin is higher. The population coverage is 52.7%, which is mediocre compared with the UK, where 81% of the population is within 400m of a public transit network (Department for Transport, 2013). Urban core areas are mostly covered, indicating that most routes only serve populous areas, and some sparsely populated areas are unreachable by public transit. However, residents of suburban areas are generally more likely to own cars, so their mobility will not be significantly limited by the absence of public transit.

However, there are some limitations in this work, including some inaccuracies and the choice of methodology.
First, the speed of the transit is underestimated because the travel time includes the dwell time at stops, which increases the travel time and reduces the estimated speed, although this travel time does reflect the real time that users spend. Second, the flow counts at transit stops are underestimated because some records are missing from the data. Third, to address the social implications of these results, they should be compared with results from other cities; although some were compared with those from New Haven and the UK as noted previously, these comparisons are not adequate. Social relevance could also be addressed in the way the Department for Transport (2013) did, by putting the access of different demographic groups to different places into multiple categories, for example that 58% of 11-15 year olds are able to access a school within 20 minutes by public transit.

Despite these limitations, this project still offers some insight into the performance of the public transit system, which could potentially be used to judge the quality of public transit and help with planning.

References

Capital Metro. (2015). Retrieved from Capital Metro: http://www.capmetro.org/
David O'Sullivan, A. M. (2000). Using desktop GIS for the investigation of accessibility by public transport: an isochrone approach. International Journal of Geographical Information Science, 85-104.
Department for Transport, UK. (2013). Valuing the social impacts of public transport. From: https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/226802/final-report.pdf
Google Developers. (2015, May 27). Retrieved from Google Developers: https://developers.google.com/transit/gtfs/
Hadas, Y. (2013). Assessing public transport systems connectivity based on Google. Journal of Transport Geography, 105-116.
Kittleson & Associates. (2003). Transit Capacity and Quality of Service Manual, 2nd edition.
Washington, DC: Transportation Research Board.
Polzin, S. P. (2002). Development of time-of-day-based transit accessibility analysis tool. Transportation Research Record 1799, 35-41.
Ryus, P., Ausman, J., Teaf, D., Cooper, M., Knoblauch, M. (2000). Development of Florida's transit level-of-service indicator. Transportation Research Record 1713, 123-129.
Sha A. Mamun, Nicholas E. Lownes, Jeffrey P. Osleeb, Kelly Bertolaccini. (2013). A method to define public transit opportunity space. Journal of Transport Geography, 144-154.
T. L. Lei and R. L. Church. (2010). Mapping transit-based access: integrating GIS, routes and schedules. International Journal of Geographical Information Science, 283-304.
US Census. (2015). Retrieved from United States Census Bureau: http://www.census.gov/