### Plenary (09:30 - 10:30)

#### Towards robust machine learning for transportation systems

Associate Professor Justin Dauwels, NTU

The field of machine learning has progressed rapidly in recent years, fueled especially by new developments in deep learning. While such technologies are often hyped in the media, weaknesses of deep learning systems are starting to become obvious, potentially spelling trouble for mission-critical systems. Most current deep learning systems are brittle, since they typically do not encode or learn information about the physical world. For instance, state-of-the-art deep-learning-based object detection systems can potentially distinguish hundreds of animals, but do not necessarily know that birds fly or fish swim. In that sense, they are far from intelligent. The next generation of deep learning systems will be made more robust by letting them learn about the physical world. How such prior information can be encoded into deep learning networks is an emerging area of research.

In recent work, we have shown that convolutional neural networks for object detection in images can be made substantially more robust to image transformations (occurring in real-world applications) and to adversarial attacks by incorporating prior knowledge about the physical world. We encode physical properties of objects by means of hidden variables, and let the model infer what physical transformations have taken place in a given scene. As an illustration, we will present the Affine Disentangled Generative Adversarial Network (ADIS-GAN). On the MNIST dataset, ADIS-GAN can achieve over 98 percent classification accuracy within 30 degrees of rotation, and over 90 percent classification accuracy against FGSM and PGD adversarial attacks, outperforming systems trained through data augmentation.
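
FGSM, one of the two attacks named above, perturbs an input by a small step along the sign of the loss gradient. The following is a minimal sketch on a hypothetical two-feature logistic-regression model in plain Python; the model, its weights, the input, and `eps` are assumptions for illustration and are unrelated to ADIS-GAN or the MNIST experiments.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def loss_gradient_wrt_input(w, b, x, y):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]

def fgsm(w, b, x, y, eps):
    """One FGSM step: shift every coordinate by eps in the direction
    that increases the loss."""
    grad = loss_gradient_wrt_input(w, b, x, y)
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

# Hypothetical toy model and input (not taken from the talk).
w, b = [2.0, -1.0], 0.0
x, y = [0.3, 0.1], 1
x_adv = fgsm(w, b, x, y, eps=0.25)
print(predict(w, b, x), predict(w, b, x_adv))
```

In this toy case the clean input is classified as class 1 (probability above 0.5), while the perturbed input falls below 0.5, which is the kind of brittleness the talk refers to.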

We will also briefly outline ongoing application-oriented machine learning projects in our team related to intelligent transportation systems. At the end of the talk, we will explore future research directions.

### Session 1 (11:00 - 12:30)

#### Dynamic pricing of repeated recruitment on mobile crowdsourcing

11:00 - 11:15

Hao Shugang, SUTD

Online content platforms now use crowdsourcing to get work done efficiently. How to pay the crowd and motivate its effort is a critical issue to solve. In this paper we consider two scenarios, in which workers can or cannot choose their effort level after accepting the work. Additionally, we consider myopic and non-myopic parties over a finite number of rounds. We find that in all cases the platform will not choose an introductory offer to reduce its payment from the second round. Moreover, it is better for the platform to use an ex-ante payment method than an ex-post one.

#### Competitive analysis in duopoly information marketing of mobile crowdsensing

11:15 - 11:30

Hong Shu, SUTD

The project studies competition in an information market, where two platforms recruit sensors following a Poisson point process to collect information and sell it to a population of consumers who are heterogeneous in their willingness to pay for the quality of information. Assuming that the unit costs of both platforms to recruit sensors are common knowledge, we study the competition as a two-stage problem: in the first stage, the platforms decide the sensor density to recruit; in the second stage, they decide the price of the information sold to maximize total profit.

#### Product description and consumer reviews in Omni-channel retailing

11:30 - 11:45

Deng Qiyuan, SMU

This paper studies how a retailer strategically provides product information in its offline and online channels. The two channels are operated either separately (dual-channel) or collectively (omni-channel). We consider two types of information: product description, which helps consumers identify whether the product fits their tastes, and consumer reviews, which are generated by the consumers who have made the purchase. We find that, without consumer reviews, the omni-channel strategy leads to a higher profit than the dual-channel strategy if and only if the limit of product description in the offline channel is low and consumers' valuation of the product is small. However, with consumer reviews, even if the limit of product description in the offline channel is high, the omni-channel strategy can still lead to a higher profit. Furthermore, consumer reviews can reduce the retailer's profit if consumers' valuation is sufficiently large.

#### Anywhere but a Nash equilibrium: Follow-the-Regularized-Leader in zero-sum games (The Stochastic Case)

11:45 - 12:00

Sai Ganesh Nagarajan, SUTD

We analyze arguably one of the most classic settings in online learning in games. We study the class of Follow-the-Regularized-Leader (FTRL) algorithms (which include Multiplicative Weights Update and Gradient Descent) in zero-sum games. Unlike recent attempts at the problem that focus on the deterministic variant of such dynamics, we focus on the actual stochastic trajectories of the realized play. In contrast to the standard regret-based analysis, which merely analyzes the time-average characteristics of the trajectories, we take a dynamical systems approach that aims at understanding the stability of the Nash equilibria and the actual day-to-day system behavior. In the case of FTRL, the daily randomized distribution of each agent represents her current beliefs; it is her answer to the question "which action do you think is the best for you at this point in time?". How often are these beliefs in (approximate) agreement with the Nash equilibrium beliefs? In a very strong sense, the answer to this question is effectively never. Specifically, given any initial beliefs, even if the agents' beliefs are initialized exactly at the Nash equilibrium (e.g. due to advice from a centralized mediator or mechanism designer), if the mechanism designer were to check back with the agents in the future, the probability that the agents' beliefs would be close to their Nash beliefs converges to zero as time grows.
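
The stochastic setting described above can be sketched with Multiplicative Weights Update on matching pennies. This is only an illustration under assumptions not taken from the talk: the game, the step size `eta`, the full-information feedback model (each player sees what every action would have earned against the opponent's realized action), and all function names are hypothetical.

```python
import math
import random

# Matching pennies: payoff of the row player; the column player gets the negative.
A = [[1.0, -1.0], [-1.0, 1.0]]

def normalize(w):
    s = sum(w)
    return [wi / s for wi in w]

def mwu_step(p, payoffs, eta):
    """Multiplicative Weights Update: reweight each action by exp(eta * payoff)."""
    return normalize([pi * math.exp(eta * u) for pi, u in zip(p, payoffs)])

def simulate(rounds, eta=0.1, seed=0):
    rng = random.Random(seed)
    x, y = [0.5, 0.5], [0.5, 0.5]  # both players start exactly at the Nash equilibrium
    distances = []
    for _ in range(rounds):
        i = rng.choices([0, 1], weights=x)[0]  # realized action of the row player
        j = rng.choices([0, 1], weights=y)[0]  # realized action of the column player
        # Update on the payoffs each action would have earned against
        # the opponent's realized (sampled) action.
        x_new = mwu_step(x, [A[0][j], A[1][j]], eta)
        y_new = mwu_step(y, [-A[i][0], -A[i][1]], eta)
        x, y = x_new, y_new
        distances.append(abs(x[0] - 0.5) + abs(y[0] - 0.5))
    return x, y, distances

x, y, distances = simulate(rounds=1000)
print(x, y, distances[-1])
```

Because the realized actions are random, the beliefs leave the equilibrium after the very first round even though they were initialized there; the list `distances` records how far the sampled trajectory is from the Nash beliefs over time.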

#### Reasoning on Knowledge Graph with Dependent Type Theory

12:00 - 12:15

Lai Zhangsheng, SUTD

Reasoning is an important element of intelligent behavior. Traditionally, it has been approached using predicate logic and, more recently, extensively explored using neural networks. While both approaches have been shown to exhibit reasoning capabilities, inference in predicate logic can be intractable and neural networks do not perform well on questions requiring multiple supporting facts. In the talk, we demonstrate how to query and perform reasoning on a typed knowledge graph; the results of queries are supported by witnesses and the querying process corresponds to graph operations. The constructivism and compositionality of dependent type theory also prevent spurious reasoning results, as answers must be supported by witnesses. As far as we know, our work is the first to convert knowledge graphs into the dependent type theory space to perform reasoning.

#### Assisted algorithm design with dependent type theory

12:15 - 12:30

Lim Jin Xing, SUTD

The designer behind every masterpiece must conceptualise their ideas before properly portraying them. Similarly, in designing algorithms, the programmer needs to be clear about the goals the code should accomplish before formulating the algorithm. Often, the programmer knows the specification but spends most of the effort figuring out how to code the algorithm in a specific programming language. Moreover, with increasing complexity in algorithmic designs, formal verification of such designs has gained importance. By the Curry-Howard correspondence, one can view computer programs as mathematical proofs, and vice versa. Due to this proofs-as-programs correspondence, dependent type theory allows us to formally prove and write the algorithm with assistance from the machine at the same time. In this project, we will see how Coq, a dependently typed language, assists us in writing up the proof of the insertion sort algorithm, and hence lets us extract the proof as a sorting algorithm in the OCaml programming language. As such, programmers can save effort by writing and proving the correctness of algorithms concurrently with the machine-assisted programming that dependent type theory provides.
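
To make the distinction between specification and implementation concrete, here is a small sketch in plain Python rather than Coq or OCaml. It is not the speaker's development: the function names are hypothetical, and the specification is only checked at run time on given inputs, whereas a Coq proof would establish it for all inputs.

```python
from collections import Counter

def insert(x, sorted_list):
    """Insert x into an already sorted list, keeping it sorted."""
    for k, y in enumerate(sorted_list):
        if x <= y:
            return sorted_list[:k] + [x] + sorted_list[k:]
    return sorted_list + [x]

def insertion_sort(items):
    """Build the result by inserting the elements one at a time."""
    result = []
    for x in items:
        result = insert(x, result)
    return result

def meets_specification(inp, out):
    """Specification of sorting: the output is ordered and is a
    permutation of the input."""
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    return ordered and Counter(inp) == Counter(out)

data = [3, 1, 2, 3, 0]
print(insertion_sort(data), meets_specification(data, insertion_sort(data)))
```

The two conditions in `meets_specification` (ordered, and a permutation of the input) are the usual statement of what a sorting function must satisfy.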

### Session 2 (13:30 - 15:00)

#### Statistics lie – so does data visualization. And that's perfectly fine.

**Invited talk**

13:30 - 14:00

Asst. Prof. Ate Poorthuis, SUTD

#### Application of queueing theory in patients' decision

14:00 - 14:15

Zhang Yufeng, SUTD

Jinting et al. (2018) introduced equilibrium strategies for observable M/M/1 priority queues with balking. They considered an observable M/M/1 queueing system with a pay-for-priority option, studied customers' joint decisions between joining or balking and paying for priority, and derived the equilibrium. The challenge in their research is to find the expected waiting time of customers in the regular queue, who might be overtaken by future arrivals. They use a threshold method and a three-dimensional continuous-time Markov chain to solve this problem. My research faces a similar challenge, and I am trying to apply these methods to it.
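
As background for the threshold idea, the sketch below covers only the classic observable M/M/1 queue without priorities, where nobody can be overtaken: an arrival who sees `n` customers expects to spend `(n + 1) / mu` in the system and joins when the reward covers the expected waiting cost. The reward, the cost rate, and the function names are illustrative assumptions; the priority model in the abstract is harder precisely because this simple expression no longer holds for regular customers.

```python
import math

def expected_sojourn(n, mu):
    """Expected time in system for an arrival who finds n customers present
    in a first-come-first-served M/M/1 queue with service rate mu."""
    return (n + 1) / mu

def joins(n, mu, reward, cost_rate):
    """Join if the service reward covers the expected waiting cost."""
    return reward - cost_rate * expected_sojourn(n, mu) >= 0

def balking_threshold(mu, reward, cost_rate):
    """Smallest observed queue length at which an arrival balks."""
    return math.floor(reward * mu / cost_rate)

# Hypothetical parameters: service rate 2, reward 5, waiting cost 3 per unit time.
mu, reward, cost_rate = 2.0, 5.0, 3.0
print(balking_threshold(mu, reward, cost_rate))
print([joins(n, mu, reward, cost_rate) for n in range(5)])
```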

#### Appointment systems with the effect of consumer risk preferences on no shows

14:15 - 14:30

Zhang Ruijie, SMU

Appointment systems are widely used in daily life, for example for doctor and consultation appointments. One problem with such appointment systems is customer no-shows. No-shows greatly impede system efficiency, causing system idle time and lost revenue. This paper presents a new appointment system to reduce no-show behavior. We first investigate consumers' likelihood of no-show under different waiting time settings through a controlled lab experiment. Based on the experimental data, we study a new appointment system in which customers' expected waiting time is identical irrespective of their arrival position, but the uncertainty of waiting (i.e., its variance) differs. We compare our system against the traditional equal-space appointment system using both analysis and simulation. Our results show that, although there might be fewer appointment slots, the customer no-show rate in our system is significantly reduced, leading to higher server utilization and system performance.

#### Performance analysis of the greedy algorithm for maximum weighted matching

14:30 - 14:45

Gao Shuqin, SUTD

A simple distributed greedy algorithm has been proposed to compute a weighted matching in a given weighted graph; it finds a matching whose weight is within a factor of 2 of the maximum in the worst case. In particular, we observe that this algorithm performs much better on real-world instances than in the worst case. In this work, we study the average performance guarantee of the greedy algorithm in random graph models. We consider the simple straight-line graph whose edges are uniformly weighted and precisely derive some lower bounds on the average performance. Moreover, simulation results indicate that the proposed lower bounds are close to the simulated values.
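
The comparison described above can be sketched as follows. This is a sequential version of the greedy rule (always take the heaviest edge whose endpoints are both free), not the distributed algorithm itself, and the path length, trial count, and function names are assumptions for illustration. The optimum on a path is computed by a standard dynamic program.

```python
import random

def greedy_matching(edges):
    """edges: list of (weight, u, v). Repeatedly take the heaviest edge
    whose endpoints are both still unmatched; return the total weight."""
    matched, total = set(), 0.0
    for w, u, v in sorted(edges, reverse=True):
        if u not in matched and v not in matched:
            matched.update((u, v))
            total += w
    return total

def optimal_matching_on_path(weights):
    """Maximum weight matching on a path, where weights[i] is the weight
    of edge (i, i + 1), by dynamic programming."""
    skip, take = 0.0, 0.0  # best value without / with the previous edge
    for w in weights:
        skip, take = max(skip, take), skip + w
    return max(skip, take)

def average_ratio(n_edges, trials, seed=0):
    """Average of greedy / optimal over paths with uniform random weights."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        weights = [rng.random() for _ in range(n_edges)]
        edges = [(w, i, i + 1) for i, w in enumerate(weights)]
        total += greedy_matching(edges) / optimal_matching_on_path(weights)
    return total / trials

print(average_ratio(n_edges=50, trials=200))
```

On the path with edge weights 2, 3, 2 the greedy rule takes the middle edge (weight 3) while the optimum takes the two outer edges (weight 4), so the ratio is 0.75; the worst-case guarantee says it can never drop below 0.5.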

#### New bounds for pairwise independent Bernoulli random variables

14:45 - 15:00

Arjun K Ramachandra, SUTD

Probability bounds for sums of Bernoulli random variables have been extensively studied due to their relative simplicity and wide range of applications in risk management, network reliability, stockouts in the OM context, stochastic programming, and graph theory. When the variables are mutually independent, the bounds are known to be efficiently computable using a DP recursion, while under worst-case dependency, closed-form analytical bounds based on order statistics, also computable through a compact LP, are known. However, the broader question of whether such bounds are tractable under the assumption of pairwise independence is, to the best of our knowledge, still unanswered. In this talk, we propose new analytical bounds for pairwise independent variables using order statistics. We further prove that these closed-form upper bounds are tight in certain regions of the probability space by constructing distributions which attain them, thus improving on well-known earlier bounds such as Chebyshev's (1867), Boris Prekopa's (1989), and Schmidt, Siegel, and Srinivasan's (1995) bounds. The same bounds can not only be achieved using a compact LP formulation, but can be further improved with additional constraints. Additionally, with homogeneous marginals, we prove that Boris Prekopa's (1989) closed-form bounds are tight, by a reduction from the large LP to their LP using aggregated information up to second order.
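
For reference, the Chebyshev baseline mentioned above can be computed in a few lines. Pairwise independence is enough for it because all covariances vanish, so the variance of the sum is the sum of the variances. This sketch covers only that classical bound, not the new order-statistics bounds of the talk; the function name and the example probabilities are assumptions.

```python
def chebyshev_upper_bound(probs, k):
    """Upper bound on P(X_1 + ... + X_n >= k) for pairwise independent
    Bernoulli variables with success probabilities probs."""
    mean = sum(probs)
    # Covariances vanish under pairwise independence.
    var = sum(p * (1 - p) for p in probs)
    if k <= mean:
        return 1.0  # Chebyshev's inequality gives no information here
    return min(1.0, var / (k - mean) ** 2)

# Hypothetical example: ten variables, each with success probability 0.1.
print(chebyshev_upper_bound([0.1] * 10, k=4))
```

In the example the mean is 1 and the variance is 0.9, so the bound on the probability of at least 4 successes is about 0.9 / 9 = 0.1.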

### Flash talks (15:15 - 16:15)

**Each talk is five minutes**

#### Wealth inequality and the price of anarchy

Barnabé Monnot

The price of anarchy quantifies the degradation of social welfare in games due to the lack of a centralized authority that can enforce the optimal outcome. It is known that, in certain games, such effects can be ameliorated via tolls or taxes. This leads to a natural, but largely unexplored, question: what is the effect of such transfers on social inequality?

We study this question in nonatomic congestion games, arguably one of the most thoroughly studied settings from the perspective of the price of anarchy. We introduce a new model that incorporates the income distribution of the population and captures the income elasticity of travel time (i.e., how does loss of time translate to lost income). This allows us to argue about the equality of wealth distribution both before and after employing a mechanism. We establish that, under reasonable assumptions, tolls always increase inequality in symmetric congestion games under any reasonable metric of inequality such as the Gini index. We introduce the inequity index, a novel measure for quantifying the magnitude of these forces towards a more unbalanced wealth distribution and show it has good normative properties (robustness to scaling of income, no-regret learning). We analyze inequity both in theoretical settings (Pigou's network under various wealth distributions) as well as experimental ones (based on a large scale field experiment in Singapore). Finally, we provide an algorithm for computing optimal tolls for any point of the trade-off of relative importance of efficiency and equality. We conclude with a discussion of our findings in the context of theories of justice as developed in contemporary social sciences and present several directions for future research.
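
The Gini index mentioned above is the mean absolute difference between all pairs of incomes divided by twice the mean income. The sketch below computes it and applies a flat charge to a hypothetical income vector; the incomes and the flat toll are illustrative assumptions, not the income-elasticity model of the talk. A flat charge leaves all pairwise differences unchanged but lowers the mean, so the index rises.

```python
def gini(incomes):
    """Gini index: mean absolute difference over all ordered pairs,
    divided by twice the mean income."""
    n = len(incomes)
    mean = sum(incomes) / n
    total_diff = sum(abs(a - b) for a in incomes for b in incomes)
    return total_diff / (2 * n * n * mean)

# Hypothetical incomes and a flat toll of 5 paid by everyone.
incomes = [10.0, 20.0, 30.0, 40.0]
after_toll = [w - 5.0 for w in incomes]
print(gini(incomes), gini(after_toll))  # 0.25 0.3125
```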

#### Predicting commercial vehicle parking duration using Generative Adversarial Multiple Imputation Networks

Low Ching Nam Raymond, SUTD

As the world rapidly urbanises in pace with economic growth, the rising demand for products and services in cities is putting a strain on the existing road infrastructure, leading to traffic congestion and other negative externalities. To mitigate the impacts of freight movement within commercial areas, city planners have begun focusing their attention on the parking behaviours of commercial vehicles. Unfortunately, there is a general lack of information on such activities due to the heterogeneity of practices and the complex nature of urban goods movement. Furthermore, field surveys and observations of truck parking behaviour often face significant challenges, resulting in data that is sparse and incomplete. The objective of this study is to develop a regression model to predict the parking duration of commercial vehicles at the loading bays of urban retail malls and identify significant factors that contribute to this dwell time. The dataset used in this study originates from a truck parking and observation survey conducted at the loading bays of nine retail malls in Singapore, containing information about the truck and the driver's activities. However, because the dataset contains incomplete fields, the authors propose using a Generative Adversarial Multiple Imputation Networks algorithm to impute them before developing the regression model on the imputed dataset. Through the parking duration model, the activity type, parking location, and volume of goods delivered (or picked up) were identified as significant features influencing vehicle dwell time, corroborating findings in the literature.

#### Exploring the potential for crowdshipping using public transport

Zhang Meijing, SUTD

With passenger and freight routes overlapping, and with urban traffic congestion and its social and economic consequences worsening, crowdshipping has gained growing interest as an innovative delivery mode in which the main service providers are crowd members, and as a demonstrably effective means of alleviating these problems. Based on MRT passenger origin-destination (O-D) data from the EZ-link tap-in and tap-out dataset and a single day of Ninjavan delivery data in Singapore, this research studies a joint parcel pickup point selection and delivery assignment problem and develops an algorithm for matching parcels with passengers. The proposed approach is applied to an area in Singapore to demonstrate the feasibility and practicability of the model, and to compare the carrier's performance in the base case with regular trucks against the crowdshipping scenario, from both freight and passenger perspectives.

#### Representing reservoir storage dynamics and operations in the Variable Infiltration Capacity (VIC) model

Thanh Duc Dang and Kamal Chowdhury, SUTD

Civil infrastructure projects, such as water reservoirs, are invaluable assets for meeting water and energy needs, especially in developing economies; yet, the installation of such infrastructure may increase public tension due to its adverse impacts on the hydrological cycle. Developing conceptual hydrological models that can incorporate reservoir dynamics and operations is thus of paramount importance to analyze and explore solutions for sustainable water management. In this work, we developed a reservoir kernel for the Variable Infiltration Capacity (VIC) model, a macro-scale, distributed hydrological model. The key idea is to keep the existing mechanisms for water (and energy) balance and runoff generation, and to integrate reservoirs into the routing module so as to better simulate the effects of anthropogenic interventions. The newly modified module requires data not only on rainfall-runoff generation, but also on the reservoirs' location and design features, including their operating curves. The operation of multi-reservoir systems can thus be modeled more accurately by reconciling water availability (inflow), water demand, and the physical constraints of reservoirs. Considering the scarcity of water resources worldwide, this modelling effort could be useful to support joint operational practices and maximize the benefits of water resources in large and complex river basins.

#### Reservoir regulation could significantly influence flooding dynamics in the Chao Phraya Delta

Vu Trung Dung, SUTD

Reservoir operations may significantly alter not only flow regimes, but also the timing, duration, and depth of downstream floods. In this work, we focus on the delta of the Chao Phraya River Basin, where we study the changes in flow regimes and flood patterns associated with different management scenarios of the upstream reservoirs. We pay particular attention to the Bhumibol and Sirikit dams, which were constructed on the Chao Phraya River's tributaries for the purposes of power generation and irrigation supply. These dams, however, also control the flow regime and flood dynamics in the Chao Phraya delta, which includes the urban area of Bangkok. Our investigation is based on a hydrological model, namely the Variable Infiltration Capacity (VIC) model with a reservoir operation module, which was developed for the Chao Phraya Basin to simulate rainfall-runoff processes and reservoir operations. The model parameters were calibrated, and the optimal rule curves of the reservoirs were obtained with a simulation-optimization system using a multi-objective evolutionary algorithm (MOEA). Modeling results confirm that different management approaches could largely impact the hydrology of the entire basin. Overall, studies like this one point to the importance of human-nature coupled systems in hydrological sciences.

#### Understanding impacts of transmission capacity on the power system performance in Laos

Rachel Koh, SUTD

With increasing emphasis on the negative effects of carbon dioxide emissions on climate change, renewable energy sources, such as hydropower, play a pivotal role in meeting the increasing energy demand. The energy production in Laos is dominated by hydropower, with dams generating almost 90% of the annual electricity (Chowdhury et al., 2019). As Laos exports the majority of its production to the neighbouring countries (i.e., Thailand, Vietnam, and Cambodia), the regional energy security is influenced by the performance of Laos' hydropower system. To understand the interaction between water availability and regional energy security, Chowdhury et al. (2018) developed a coupled water-energy model that schedules the hourly energy production mix to satisfy national demands and exports at minimum cost, while considering the operational constraints of the power generation and transmission facilities. The energy landscape in Laos is being rapidly shaped by multiple international players (including Thailand, China, and Norway). In 2015 and 2016 alone, the grid generation capacity was increased by more than 3000 MW. The uncoordinated development of power plants without a corresponding expansion of the transmission system could be a prime reason affecting the performance of the power system.

The study investigates the impact of transmission capacity on the power system by comparing the existing infrastructure to a scenario where the transmission capacity of each line was arbitrarily modified to increase the amount of electricity it can convey. The system performance is then measured using three indicators: unused hydropower, cost of production, and carbon dioxide emissions. It was found that the ability to dispatch more power in each transmission line can potentially reduce both emissions and production cost, with emissions decreasing by up to 0.6 million tonnes a year and costs falling by up to US$20 million a year. The results indicate that the performance of the system can be improved with a corresponding expansion of the transmission capacity.

#### Spatial-temporal variability of streamflow in monsoon Asia over the past eight centuries and links to climate drivers

Nguyen Tan Thai Hung, SUTD

The Asian Monsoon region is home to a quarter of the world's population, most of whom rely on rivers for water supply. Water management in this region would benefit from an improved understanding of long-term hydrologic variability, made possible by streamflow reconstruction studies. In this work, we produce the first large-scale streamflow reconstruction over the last eight centuries in monsoon Asia, using a Linear Dynamical Systems approach and the Monsoon Asia Drought Atlas (MADA) as the paleoclimate proxy. The reconstructions reveal a history of regime shifts with prolonged droughts exceeding the lengths of those found in instrumental records and show the spatial footprints of the Asian megadroughts. Analyses of the dominant modes of variability suggest that streamflow in Asia is linked to both ENSO and IOD, but these relationships vary significantly through space and time. Overall, the findings presented advance understanding of regional hydrologic variability and can help improve water resource management practice in many countries.

#### Can long-range streamflow forecasts increase hydropower production?

Ng Jia Yi, SUTD

Hydropower generates about 17% of the world's total electricity and 70% of all renewable energy. Exploring how hydropower dams could benefit from forecast-informed operations can allow us to increase hydropower production at little cost. Here, we examine this by simulating hydropower production for 1,593 hydropower dams, which collectively represent more than 38% of the world's existing hydroelectricity capacity. We first predict monthly inflows up to 3 months ahead for each of the dams using four global climate drivers, namely the El Niño Southern Oscillation (ENSO), North Atlantic Oscillation (NAO), Pacific Decadal Oscillation (PDO), and Atlantic Multidecadal Oscillation (AMO), and two local variables, namely lagged inflow and soil moisture. We then simulate the hydropower dams using forecast-informed operations and benchmark the performance against optimized control rules designed using stochastic dynamic programming. Our results show that 51.2% of the hydropower dams increase their hydropower production using forecast-informed operations. Reservoirs that can benefit from forecasts have a small storage-to-release ratio, and their hydraulic heads are largely dependent on the reservoir water depth (i.e. they lack a natural waterfall). Interestingly, we also find that these reservoirs require different levels of forecast skill to benefit. Reservoirs with a low storage-to-inflow ratio and with inflow that often exceeds the maximum turbine release rate can easily benefit from forecast-informed operations even when forecast skill is not high. Our study identifies the dam specifications suitable for forecast-informed operations and also the regions where forecast skill is high. Dams in such regions and with these specifications should look into adopting forecast-informed operations to reap the benefits of increased hydropower production.