Axis 1: Data valuation for decision making
There are many strategic advantages to using data. Data often present methodological challenges due to their complex nature or structure, their large size, their degree of confidentiality or even sometimes their scarcity or poor quality. This line of research focuses on the design of mathematical, statistical and machine learning tools for processing, analyzing and modeling data for descriptive, predictive and prescriptive purposes.
Members
Cahiers du GERAD
Unboundedness in bilevel optimization
Bilevel optimization has garnered growing interest over the past decade. However, little attention has been paid to detecting and dealing with unboundedness...
Extremal chemical graphs of maximum degree at most 3 for 33 degree-based topological indices
We consider chemical graphs that are defined as connected graphs of maximum degree at most 3. We characterize the extremal ones, that is, those that maximize...
Integrating Optical Transport Networks (OTNs) into multilayer Elastic Optical Networks (EONs) enhances data transmission efficiency but introduces significan...
Publications
Events
Artem Sedakov – Saint Petersburg State University
Ludovic Salomon – Polytechnique Montréal
Mathieu Boudreault – Professor, Department of Mathematics, Université du Québec à Montréal
News
GERAD is proud that the first edition of OR@Africa Day has been published as a conference report on the INFORMS website. OR@Africa was founded in GERAD's seminar room in September 2023 by a group of GERAD students, accompanied by Professor Issmail El Hallaoui. Congratulations on your efforts and on this richly deserved recognition. We wish you all the best for the future.
Example in Economy and Finance
The Informational Content of High-Frequency Option Prices
Most asset prices exhibit volatile and unexpected movements. Some of these fluctuations are due to sudden corrections—jumps—whereas others are associated with increases in the diffusive, transitory volatility. It is, however, challenging to separate the diffusive portion of volatility from its jump component as there is no direct measure of volatility—it is latent. This study investigates these two components using different datasets.
Historically, estimation of such quantities relied on low-frequency daily returns, using one observation per day. Nowadays, however, returns over smaller intraday time steps are also available. For more than a decade, researchers in finance and econometrics have used these so-called high-frequency asset returns to measure financial risk more accurately and to better understand the breakdown between diffusive volatility and jumps.
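To make the idea of splitting variation into a diffusive part and a jump part more concrete, here is a minimal sketch based on two standard high-frequency estimators: realized variance (the sum of squared intraday returns) and bipower variation (a scaled sum of products of adjacent absolute returns, which is much less sensitive to isolated jumps). This is a textbook illustration, not the model or estimation method used in the study. The simulated returns, the function names and all parameter values are assumptions chosen for the example.

```python
import math
import random


def realized_variance(returns):
    """Sum of squared intraday returns: diffusive variance plus jump variation."""
    return sum(r * r for r in returns)


def bipower_variation(returns):
    """(pi/2) * sum of |r_i| * |r_{i-1}|: largely unaffected by isolated jumps."""
    pairs = zip(returns[1:], returns[:-1])
    return (math.pi / 2) * sum(abs(a) * abs(b) for a, b in pairs)


def jump_variation(returns):
    """Rough estimate of the jump contribution, truncated at zero."""
    return max(realized_variance(returns) - bipower_variation(returns), 0.0)


def simulate_day(n_steps=390, daily_vol=0.01, jump_size=0.0, seed=0):
    """Hypothetical one-minute returns: Gaussian noise plus one optional jump."""
    rng = random.Random(seed)
    step_vol = daily_vol / math.sqrt(n_steps)
    returns = [rng.gauss(0.0, step_vol) for _ in range(n_steps)]
    returns[n_steps // 2] += jump_size  # single midday jump (assumption)
    return returns


calm_day = simulate_day(jump_size=0.0)
jump_day = simulate_day(jump_size=0.02)

for name, day in [("calm", calm_day), ("jump", jump_day)]:
    print(name,
          "RV=%.6f" % realized_variance(day),
          "BV=%.6f" % bipower_variation(day),
          "jump part=%.6f" % jump_variation(day))
```

On the simulated day with a jump, realized variance rises sharply while bipower variation moves much less, so their difference picks up most of the squared jump. With daily returns alone, this decomposition would not be available.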
In addition to these low- and high-frequency asset returns, there is a whole array of options being traded every day. These options can be seen as insurance contracts on the asset itself: a premium must be paid on purchase to have the right to exercise the option at maturity. Their premiums depend on the risk associated with the embedded insurance. Including the prices of various options in our sample therefore helps us identify the behaviour of the unobservable variables (volatility and jumps). Indeed, many studies already do this using end-of-day option prices. However, very few studies have used intraday variations in option prices, because doing so increases the sample size considerably and appreciably amplifies the computational issues involved in estimating model parameters.
This study uses traditional observations (returns, intraday variations, and end-of-day option prices) and adds the intraday variations of option prices. We show that the informational content of high-frequency option prices allows us to gain more insight into the breakdown mentioned above. Besides, the model’s unobservable factors (volatility and jumps) are measured more precisely. Consequently, omitting the option intraday information might lead to suboptimal investor decisions.
(by Diego Amaya, Jean-François Bégin, and Geneviève Gauthier)
Example in Energy, Environment and natural resources
Smart Buildings
The buildings of the 21st century are being designed and built in a new context, one that clearly calls for the development of new cutting-edge information and communication technologies providing decision support tools for efficient energy management in buildings. As a result, the design of smart buildings is not limited to installing various sensors: it is part of a broader context where distributed energy resources and the active participation of consumers are integrated into the effective management of demand throughout the building. Given advances in smart meter infrastructure, a digital strategy has become vital for leveraging the data collected. Artificial intelligence (AI) seems essential for big data processing, increasing the performance of short-term energy demand forecasting models, gaining a better understanding of the volatility in each consumption profile and generating learning mechanisms adapted to demand management in several types of buildings.
Moreover, recent advances in the Internet of Things offer interesting opportunities for buildings to communicate, allowing the pooling of different resources at the neighbourhood level as well as the possibility of energy trading to reduce peak demand. Along with the big data revolution brought about by the widespread deployment of meters, sensors and smart technologies, there is a need to better explore the potential of predictive models based on deep learning to improve the accuracy and efficiency of forecasting in the energy context.
The team of Hanane Dagdougui, professor in the Department of Mathematics and Industrial Engineering and a member of GERAD, is particularly interested in the development of mathematical models and the application of machine learning techniques to energy management problems in buildings. Hanane Dagdougui is working on the development of distributed algorithms and new approaches based on machine learning, as well as the implementation of applications derived from them in smart building networks. These management algorithms will make use of demand response strategies that will increase the flexibility of the building and the network. Hanane Dagdougui is currently developing several large-scale projects with major partners such as CanmetÉNERGIE, the Hydro-Québec Research Institute, Innovée, Hitachi ABB, Fusion Énergie and VadimUS. She works in collaboration with Charles Audet, Sébastien Le Digabel and Antoine Lesage-Landry, professors at Polytechnique Montréal and GERAD members.
When the demand-side management of a significant number of buildings is precisely controlled by aggregators, it can play an increasing role in the wholesale electricity market. In this case, demand-side management can help the power system operator better manage peak demand while exploiting the potential for flexibility and enabling consumers to benefit from rewards or lower energy bills.
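As a toy illustration of how demand-side flexibility can reduce peak demand, the sketch below clips an aggregated hourly load profile at a target cap and moves the clipped energy to the least-loaded hours. It is a deliberately simple greedy heuristic under strong assumptions (all load above the cap is shiftable, with no comfort or timing constraints), not one of the distributed or learning-based algorithms developed by the researchers. The load values, the cap and the function name are hypothetical.

```python
def shift_peak_load(load, cap, step=0.1):
    """Clip each hour at `cap` and move the clipped energy into the valleys.

    Total energy is preserved; raises ValueError if the cap is too low
    to absorb the shifted energy.
    """
    shifted = [min(x, cap) for x in load]
    surplus = sum(load) - sum(shifted)  # energy that must be rescheduled
    while surplus > 1e-9:
        # Pick the hour that currently has the lowest load.
        i = min(range(len(shifted)), key=shifted.__getitem__)
        room = cap - shifted[i]
        if room <= 1e-9:
            raise ValueError("cap is too low to reschedule all clipped energy")
        delta = min(step, surplus, room)
        shifted[i] += delta
        surplus -= delta
    return shifted


# Hypothetical aggregated demand (kW) for 24 hours, with an evening peak.
baseline = [30] * 6 + [50, 70, 90, 80, 70, 65, 60, 60, 65,
                       70, 85, 100, 95, 80, 60, 45, 35, 30]
cap_kw = 80.0

managed = shift_peak_load(baseline, cap_kw)
print("peak before:", max(baseline), "kW")
print("peak after: ", round(max(managed), 2), "kW")
print("energy before/after:", sum(baseline), round(sum(managed), 6))
```

In practice, an aggregator would face device-level constraints, forecasts with uncertainty and market prices, which is where optimization and machine learning methods come in.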
Example in Smart Infrastructure
Smart Infrastructures
Maintaining and renewing our infrastructure will require considerable investment in coming decades. Major changes are needed in our transport and energy systems, particularly to meet environmental challenges. At the same time, advances in information technology provide an opportunity for GERAD members to work at improving the capacity, efficiency and reliability of our infrastructure rather than simply repairing it.
More Data Available
Smart infrastructure aims to improve, often in real time, a service provided to a population of users, using different kinds of available data about its own condition as well as about the users themselves. For example, management of a public transport network or a taxi fleet can rely on the precise positioning of vehicles, rapid reporting of incidents, as well as forecasts of traffic and changes in demand made more reliable by location data transmitted by the cell phones of users. More generally, with the development of the Internet of Things, a proliferation of sensors and data sources is becoming available in many areas. GERAD brings together several researchers working to transform and merge these raw data into statistical models that can then be used for decision-making purposes. Such data can also be used for longer-term planning of changes to infrastructure.
Social and Ethical Dimensions
Personal data are the pillar of the idea of smart infrastructure, but there are many legitimate concerns about the practical uses of personal data, since their dissemination may affect privacy. Many cities have developed "open data" programs which do not necessarily apply the state-of-the-art methods that are needed to protect private data. For example, it is well documented that datasets recording the movements of individuals over time are particularly hard to anonymize, although such datasets can frequently be found in open access. Professor Jérôme Le Ny's research group is interested in the development of estimation and decision-making models that can use aggregated personal data while providing formal mathematical guarantees with respect to the protection of data confidentiality (so-called "differential privacy" guarantees). This includes developing systems that collect data only for well-defined purposes, and then applying protection methods (aggregation, scrambling, etc.) that are specifically tailored for these purposes in order to limit the impact on statistical accuracy. There are many such applications, and this research work will contribute to strengthening the trust of users in smart infrastructures as well as their willingness to provide the data that are needed.
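A small example may help show what such a formal guarantee looks like in practice. The sketch below applies the classical Laplace mechanism to an aggregated count: noise with scale 1/epsilon is added to the true count, which gives epsilon-differential privacy for a query whose value changes by at most 1 when one individual is added or removed. This is a generic textbook mechanism, not a description of the group's specific methods. The trip records, zone names and function names are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) noise, drawn as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon, rng):
    """Release a noisy count satisfying epsilon-differential privacy.

    Assumes each individual contributes at most one record, so the
    sensitivity of the count is 1.
    """
    true_count = sum(1 for r in records if predicate(r))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical data: one destination zone per traveller.
rng = random.Random(42)
trips = ["downtown"] * 120 + ["airport"] * 35 + ["campus"] * 60

for eps in (0.1, 1.0):
    noisy = private_count(trips, lambda z: z == "airport", eps, rng)
    print("epsilon=%.1f -> released count: %.1f (true count: 35)" % (eps, noisy))
```

A smaller epsilon means stronger privacy but noisier statistics, which is the accuracy trade-off that purpose-specific protection methods aim to limit.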
Examples of publications:
Le Ny, J., Differential Privacy for Dynamic Data. SpringerBriefs in Control, Automation and Robotics, Springer, 2020.
Le Ny, J., Privacy in Network Systems. In Encyclopedia of Systems and Control, J. Baillieul, T. Samad, Editors, Springer, 2021.
Pelletier, M., Saunier, N., Le Ny, J., Differentially Private Analysis of Transportation Data. In Privacy in Dynamical Systems, F. Farokhi, Editor, pp. 131-155, Springer, 2020.
Example in Smart Logistics
The Importance of Data in Transportation and Retailing
Data in supply chains are known to be highly complex and voluminous. Such data, both structured and unstructured, are critical in many decisions which are being made on a regular basis. For example, in retailing, data related to demand, markets, customer engagements, prices, and many other relevant factors are constantly collected and leveraged in the decision-making process by demand and merchandise planners. In transportation, planners and dispatchers often rely on many sources of data involving real-time traffic, lead times, road conditions, costs, customer requirements and other sources of data in their planning and execution processes. Combining and extracting data from different sources and generating valuable insights from such data to support decision-making processes can be very challenging. In addition, the performance of data-driven decision-making methods depends heavily on the information and representations created from the original data. This research axis aims to tackle the data valuation aspect and its implications in decision-making, either in a fully automated manner or with human interventions.
GERAD researchers have carried out several studies that aim to improve the quality and reliability of decisions, using different quantitative approaches to enhance the value of data in multiple real-world supply chain and logistics applications. Several notable studies led by Carolina Osorio, Guy Desaulniers and Andrea Lodi have specifically addressed the prominent uncertainty issue in traffic and transportation management in the public domain. The research work carried out by Okan Arslan and Yichuan Daniel Ding demonstrates how data analytics can enhance planning and scheduling decisions in last-mile delivery and workforce scheduling. In a retail context, Andrea Lodi proposed an efficient decomposition method to learn latent customer preferences from retail data. Finally, the value of data availability and information sharing is also examined analytically in a retail context in the research papers of Georges Zaccour.
General note: in the text, only GERAD members involved in the mentioned research are indicated, but not those co-authors who are not members of GERAD. The information on co-authors, together with further information, can be found in the references.
References:
Arslan, O., Abay, R., Data-driven vehicle routing in last mile delivery. CIRRELT-2021-30, 2021.
Fields, E., Osorio, C., Zhou, T., A data-driven method for reconstructing a distribution from a truncated sample with an application to inferring car-sharing demand. Transportation Science, 55(3), 616-636, 2021.
Jena, S. D., Lodi, A., Palmer, H., Sole, C., A partially ranked choice model for large-scale data-driven assortment optimization. INFORMS Journal on Optimization, 2(4), 297-319, 2020.
Lu, J., Osorio, C., A probabilistic traffic-theoretic network loading model suitable for large-scale network analysis. Transportation Science, 52(6), 1509-1530, 2018.
Osorio, C., High-dimensional offline origin-destination (OD) demand calibration for stochastic traffic simulators of large-scale road networks. Transportation Research Part B: Methodological, 124, 18-43, 2019.
Ricard, L., Desaulniers, G., Lodi, A., Rousseau, L.M., Predicting the probability distribution of bus travel time to move towards reliable planning of public transport services. arXiv:2102.02292, 2021.
Yu, M., Ding, Y., Lindsey, R., Shi, C., A data-driven approach to manpower planning at US–Canada border crossings. Transportation Research Part A: Policy and Practice, 91, 34-47, 2016.
Zhang, Q., Chen, J., Zaccour, G., Market targeting and information sharing with social influences in a luxury supply chain. Transportation Research Part E: Logistics and Transportation Review, 133, 101822, 2020.