A Functioning Beta Solution to the Challenge of Opening Transit Payment System Transaction Data
📝 Abstract
The deployment of smart-card-based public transit fare payment systems provides government the opportunity to create a valuable derivative data product. Companies such as Urban Engines have demonstrated an ability to add value to the data derived from transit fare transactions. The challenge for the public sector is to, for the societal good, leverage private sector interest by giving access to useful fare transaction data in a manner that protects customer privacy. This challenge is particularly acute in California, where privacy laws make sharing data in a manner that supports the public interest difficult. This paper presents the Metropolitan Transportation Commission’s (MTC’s) proposed solution to the problem. MTC operates the Clipper® transit fare payment system for the San Francisco Bay Area. In an effort to share usable data that protects customer privacy, MTC developed an anonymizing scheme that is the subject of the present paper. We seek feedback on our approach from the Data for Good Exchange community, asking: in seeking a balance between customer privacy and usability, does the scheme go too far in either direction? And, should we take a different anonymizing approach?
📄 Content
A Functioning Beta Solution to the Challenge of Opening Transit Payment System Transaction Data
David Ory, Metropolitan Transportation Commission, San Francisco, CA, USA, DOry@mtc.ca.gov
Stephen Granger-Bevan, Formerly of Metropolitan Transportation Commission, San Francisco, CA, USA, stephen.grangerbevan.sjsu@gmail.com
ABSTRACT
The deployment of smart-card-based public transit fare payment
systems provides government the opportunity to create a valuable
derivative data product. Companies such as Urban Engines have
demonstrated an ability to add value to the data derived from transit
fare transactions. The challenge for the public sector is to, for the
societal good, leverage private sector interest by giving access to
useful fare transaction data in a manner that protects customer
privacy. This challenge is particularly acute in California, where
privacy laws make sharing data in a manner that supports the public
interest difficult. This paper presents the Metropolitan
Transportation Commission’s (MTC’s) proposed solution to the
problem. MTC operates the Clipper® transit fare payment system
for the San Francisco Bay Area. In an effort to share usable data
that protects customer privacy, MTC developed an anonymizing
scheme that is the subject of the present paper. We seek feedback
on our approach from the Data for Good Exchange community,
asking: in seeking a balance between customer privacy and
usability, does the scheme go too far in either direction? And,
should we take a different anonymizing approach?
1. INTRODUCTION
The Metropolitan Transportation Commission (MTC) is the
transportation planning, financing, and coordinating agency for the
nine-county San Francisco Bay Area: San Francisco, San Mateo,
Santa Clara, Alameda, Contra Costa, Solano, Napa, Sonoma, and
Marin Counties, which together ring the San Francisco Bay
shoreline. In 1998, MTC began an effort to introduce a single fare
payment system across the many transit operators that serve the
region. Today the Clipper® card can be used on twenty Bay Area
transit providers. The service handles around 800,000 transactions
each weekday and settles over $40 million each month.
As noted above, in addition to operating Clipper®, MTC serves as
the region’s transportation planning agency. As the Clipper® card
gained wider adoption, the transportation planners at MTC sought
detailed transaction data to better understand the travel behavior of
Bay Area residents. Around the same time, private sector interest
in the data increased, in particular from the firm Urban Engines,
which sought the data to support its software business. To MTC’s
planners, the highest value aspect of the Clipper® transaction data
is individual trajectories through the transportation network. This
data has the potential to reveal interesting and important behaviors,
including preference for rail (rather than bus), disdain for
transferring, seasonal variation, and day-to-day route variation.
Importantly, MTC’s planning staff did not want to be the sole
customers of the prospective Clipper® data product. Rather, we
sought to create a product that was useful to us that we could share
with others.
Though MTC’s planners and the Clipper® program are housed in the
same agency, California’s privacy laws (California Streets and
Highways Code Section 31490 in particular) make sharing the data
complicated. MTC’s planning, Clipper®, and legal personnel set out
to create a useful data product that protected customer privacy and
could be broadly distributed. The remainder of the paper describes
our work.
2. PROBLEM
MTC sought to share useful Clipper® data in a manner that protects
customer privacy. In this case, the primary customers of the data,
MTC’s planners, defined “useful” as containing individual
trajectories through the transportation network across a full day,
with each day in each year represented in the data. California’s
privacy laws governing electronic payment systems require that
any data product that can be shared with MTC’s planners can be
shared more widely (i.e., MTC’s planners receive no preferential
access), in particular with private sector actors interested in
leveraging the data for business ventures.
3. SOLUTION
MTC’s planning, Clipper®, and legal teams iteratively
experimented with solutions that attempted to balance protecting
customer privacy and retaining the data’s usefulness. We were
guided by our collective judgment and settled on the following
anonymizing scheme:
All personally identifiable information is held in a separate database that was not joined, examined, or considered as part of our effort, and no personally identifiable information is present in the anonymous data.
Each Clipper® card’s unique serial number is replaced with a pseudo-random identification field that persists for a single “circadian” day – the data is separated from 3 am to 3 am, as this represents a more logica
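
The daily-identifier step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not MTC's implementation: the paper does not say how the pseudo-random field is generated, so the sketch simply draws a random token per card per 3 am-to-3 am day and keeps the lookup in memory. The record fields (daily_id, time, stop), the helper names, and the sample taps are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta
import secrets

DAY_START_HOUR = 3  # the source separates days from 3 am to 3 am


def circadian_date(ts):
    """Return the 3 am-to-3 am "circadian" day that a timestamp falls in."""
    return (ts - timedelta(hours=DAY_START_HOUR)).date()


class DailyPseudonymizer:
    """Hypothetical helper: one random token per (card serial, circadian day)."""

    def __init__(self):
        self._tokens = {}

    def pseudonym(self, card_serial, ts):
        key = (card_serial, circadian_date(ts))
        if key not in self._tokens:
            # Assumption: the token is drawn at random rather than derived
            # from the serial, so it cannot be reversed to recover the card.
            self._tokens[key] = secrets.token_hex(8)
        return self._tokens[key]


def anonymize(transactions):
    """Replace each card serial with its daily token (other fields assumed kept)."""
    mapper = DailyPseudonymizer()
    return [
        {"daily_id": mapper.pseudonym(serial, ts), "time": ts, "stop": stop}
        for serial, ts, stop in transactions
    ]


def trajectories(records):
    """Group anonymized records into time-ordered stop sequences per daily ID."""
    grouped = defaultdict(list)
    for rec in sorted(records, key=lambda r: r["time"]):
        grouped[rec["daily_id"]].append(rec["stop"])
    return dict(grouped)


# Made-up sample data: (card serial, tap time, stop name)
sample = [
    ("card-A", datetime(2015, 6, 1, 8, 15), "Stop 1"),
    ("card-A", datetime(2015, 6, 1, 17, 40), "Stop 2"),
    ("card-A", datetime(2015, 6, 2, 1, 30), "Stop 3"),  # before 3 am: same day
    ("card-A", datetime(2015, 6, 2, 8, 10), "Stop 1"),  # after 3 am: new day
]
anonymized = anonymize(sample)
daily_trips = trajectories(anonymized)
```

In this sketch the first three taps share one identifier, so the full-day trajectory that the planners asked for survives, while the fourth tap receives a fresh identifier and cannot be linked to the previous day.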