The following tutorial provides an overview of the statistical challenges that arise when modeling electronic commerce (eCommerce) data. eCommerce data originates from many behavioral, social, and economic processes and interactions online that were not observable or measurable in the offline world. This data-rich environment makes it possible to question existing theories and to uncover new phenomena. However, eCommerce data, and the new research questions associated with it, are often not supported by classic statistical machinery. New dependency structures arise from factors such as online competition and user interaction. In this paper, we discuss three key aspects of eCommerce data: eCommerce process dynamics, competition between processes, and user networks.
In the following papers we propose new ways to make early and dynamic marketing decisions using information from prediction markets. We use functional shape analysis to tease out mood swings from the observed trading patterns. We then use these trading patterns dynamically to identify optimal points for decision making.
In this paper, we propose a novel automated and data-driven bidding strategy. Our strategy consists of two main components. First, we develop a dynamic, forward-looking model for price in competing auctions. By incorporating dynamic features of the auction process and its competitive environment, our model is capable of accurately predicting an auction's price, taking into account information from simultaneous auctions. Then, using the idea of maximizing consumer surplus, we build a bidding framework around this model that determines a triumvirate of decision points: the best auction to bid on, the best bid time, and the best bid amount. In simulations, we compare our automated strategy to early and last-minute bidding and find that our approach extracts considerably higher expected surplus. We also argue that our approach devotes significantly less effort to the process of bidding.
o An Automated and Data-Driven Bidding Strategy for Online Auctions, with Zhang.
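The surplus-maximizing choice among competing auctions summarized above can be illustrated with a small sketch. It is not the paper's model: it assumes that each auction's closing price forecast is a normal distribution (a hypothetical simplification), that the bidder bids up to a private valuation, and that the bidder pays the closing price when it falls below that valuation. The names `expected_surplus` and `pick_auction` are illustrative only.

```python
from statistics import NormalDist


def expected_surplus(valuation, price_mean, price_sd):
    """E[(v - P) * 1{P < v}] for a forecast P ~ Normal(price_mean, price_sd).

    Assumption: the bidder wins whenever the closing price P is below the
    valuation v and then pays P, so the surplus is v - P (zero otherwise).
    """
    std = NormalDist()
    z = (valuation - price_mean) / price_sd
    return (valuation - price_mean) * std.cdf(z) + price_sd * std.pdf(z)


def pick_auction(valuation, forecasts):
    """Return the auction id whose forecast gives the highest expected surplus.

    forecasts maps auction id -> (predicted price mean, predicted price sd).
    """
    return max(forecasts, key=lambda a: expected_surplus(valuation, *forecasts[a]))


# Hypothetical forecasts for two simultaneous auctions of the same item.
forecasts = {"A": (80.0, 5.0), "B": (95.0, 5.0)}
best = pick_auction(100.0, forecasts)
```

In a dynamic setting the forecasts would be refreshed as the auctions progress and the choice re-evaluated, which is where the timing decision enters; this sketch covers only a single snapshot.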
The following paper is new in that it studies dynamics in a special class of auctions: simultaneous auctions for Indian contemporary art. We develop a novel forecasting model that predicts price in ongoing auctions, using the concept of price dynamics. We also study the source of the predictive power of dynamics and find that dynamics capture bidder competition within and across auctions. The importance of this finding is both conceptual and practical: price dynamics are simple to compute at high accuracy, as they require information only from the focal auction, and are therefore a parsimonious representation of different forms of within-auction and between-auction competition.
The following papers focus on the dynamics during online auctions. We propose different ways of capturing dynamics using functional data analysis. We also propose several new ways of modeling auction dynamics via differential equation models and differential equation trees. One of the key insights of this research is that dynamics exist and that they matter for the outcome of an auction.
In the following set of papers we focus on forecasting the price in online auctions. We develop a dynamic and real-time forecasting model based on functional data analysis. Another aspect that makes this work novel is that the forecasting method takes advantage of the changing dynamics during an online auction.
In the following set of papers we focus on competition between auctions. We are particularly interested in understanding the price process of concurrent auctions and whether certain types of price processes occur in groups. We also investigate the effect of competing products in the associated feature space and whether time plays a role in online auctions.
In the following set of papers we explore auction data visually. We develop several static visualizations to better sift through large amounts of auction data. We also develop a novel interactive tool to explore and forecast online auction price processes.
The following paper proposes a novel model for bid arrivals. In particular, our model captures the three distinct bidding phases that characterize a typical auction. The model is based on a non-homogeneous Poisson process.
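As a rough illustration of what a non-homogeneous Poisson process with three phases looks like, the sketch below simulates bid arrival times by thinning (accepting candidate arrivals from a constant-rate process with probability proportional to the current intensity). The piecewise-constant intensity, its rates, and the seven-day duration are assumptions made for this example, not the intensity function of the paper.

```python
import random


def intensity(t, duration=7.0):
    """Hypothetical bids-per-day rate: early activity, a quiet middle, a late surge."""
    if t < 1.0:
        return 3.0
    if t < duration - 1.0:
        return 0.5
    return 10.0


def simulate_bid_arrivals(duration=7.0, rate_max=10.0, rng=None):
    """Simulate arrival times on [0, duration) by thinning.

    rate_max must be an upper bound on intensity(t) over the whole auction.
    """
    rng = rng or random.Random()
    t, arrivals = 0.0, []
    while True:
        # Candidate arrival from a homogeneous process with rate rate_max.
        t += rng.expovariate(rate_max)
        if t >= duration:
            return arrivals
        # Keep the candidate with probability intensity(t) / rate_max.
        if rng.random() < intensity(t, duration) / rate_max:
            arrivals.append(t)


bids = simulate_bid_arrivals(rng=random.Random(1))
```

Under these assumed rates, the last day is expected to receive about ten bids while the five quiet middle days together receive about two and a half, which mimics the familiar last-minute surge.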
The following paper proposes a novel way to distinguish between private-value and common-value auctions. The idea is based on functional residual analysis.
The following paper features a novel data collection method to gauge consumer surplus in online auctions.
The following papers feature methodological work motivated by the data challenges in online auctions. The first paper gives an overview of functional data methods in the context of electronic commerce and argues that many data structures found in electronic commerce motivate the use of functional data analysis. The second paper proposes a novel way of smoothing unevenly sampled price curves (such as those found in online auctions). The third paper proposes several ways to deal with data that is neither entirely continuous nor entirely discrete.
In this line of research we look at the geographical scatter of customer choices. In particular, we want to better understand how the choices customers make are determined by other customers in close (geographical) proximity. We develop a novel spatial model for customer choices. We also derive a dynamic implementation of that model that can react to changes in the population and give updated decisions in real time.
In this paper we propose a novel way of capturing flight departure delay. We differentiate between short-term daily patterns and long-term trends. We use global optimization methods to find the best fit of a mixture distribution to the observed data.
The following papers develop different methods and algorithms to make the EM algorithm amenable to finding the global solution. Like many other deterministic optimization methods, EM can get stuck in local, sub-optimal solutions. We develop a variety of methods that can overcome these local traps.
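To make the local-trap problem concrete, the sketch below runs EM for a two-component, one-dimensional Gaussian mixture from several random starting points and keeps the fit with the highest log-likelihood. Random restarts are a generic remedy swapped in here for illustration; they are not claimed to be the methods developed in these papers. The function names, the variance floor, and the simulated data are all assumptions of this example.

```python
import math
import random


def normal_pdf(x, mu, var):
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm(data, init_means, n_iter=100):
    """EM for a two-component 1-D Gaussian mixture; returns (loglik, weights, means, variances)."""
    w, mu, var = [0.5, 0.5], list(init_means), [1.0, 1.0]
    for _ in range(n_iter):
        # E-step: posterior membership probabilities for every observation.
        resp = []
        for x in data:
            p = [w[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)]
            s = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pk / s for pk in p])
        # M-step: weighted updates of weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp) + 1e-12
            w[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            # Variance floor (an assumption) keeps a component from collapsing.
            var[k] = max(sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk, 1e-3)
    loglik = sum(
        math.log(max(sum(w[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)), 1e-300))
        for x in data
    )
    return loglik, w, mu, var


def em_with_restarts(data, n_restarts=5, rng=None):
    """Run EM from random starting means and keep the highest-likelihood fit."""
    rng = rng or random.Random()
    lo, hi = min(data), max(data)
    fits = [em_gmm(data, [rng.uniform(lo, hi), rng.uniform(lo, hi)]) for _ in range(n_restarts)]
    return max(fits, key=lambda fit: fit[0])


# Simulated data from two well-separated components (illustration only).
gen = random.Random(0)
data = [gen.gauss(0.0, 1.0) for _ in range(200)] + [gen.gauss(6.0, 1.0) for _ in range(200)]
loglik, weights, means, variances = em_with_restarts(data, rng=random.Random(1))
```

Each individual EM run is deterministic given its starting values, so any escape from a poor solution comes entirely from the restarts.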
The following papers propose several new ways of implementing EM (and in particular its stochastic version, Monte Carlo EM) more efficiently. Among the proposed solutions are new (automated) rules for increasing the
The following paper proposes several ways to implement Simulated Maximum Likelihood (SML) more efficiently. Among the proposed solutions are using SML in stages and using variance-reducing Quasi-Monte Carlo simulation methods.
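The role of quasi-Monte Carlo draws in simulated likelihood can be sketched as follows. The example assumes a logit model with a normally distributed random intercept (a hypothetical model chosen for brevity) and replaces pseudo-random draws with a base-2 van der Corput sequence mapped through the inverse normal CDF. It does not reproduce the staged procedure or the specific QMC constructions of the paper.

```python
import math
from statistics import NormalDist


def van_der_corput(n, base=2):
    """First n points of the van der Corput sequence (starting at index 1)."""
    points = []
    for i in range(1, n + 1):
        q, denom, x = i, 1.0, 0.0
        while q:
            q, r = divmod(q, base)
            denom *= base
            x += r / denom
        points.append(x)
    return points


def simulated_likelihood(y, beta, sigma, uniforms):
    """Average the conditional likelihood of binary responses y over intercept draws.

    Assumed model: P(y_i = 1 | u) = logistic(beta + sigma * u), with u ~ N(0, 1).
    uniforms are points in (0, 1) that are mapped to normal draws.
    """
    inv_cdf = NormalDist().inv_cdf
    total = 0.0
    for u in uniforms:
        p = 1.0 / (1.0 + math.exp(-(beta + sigma * inv_cdf(u))))
        lik = 1.0
        for yi in y:
            lik *= p if yi else 1.0 - p
        total += lik
    return total / len(uniforms)


# Hypothetical responses for one subject, evaluated with 256 QMC draws.
y = [1, 0, 1, 1]
draws = van_der_corput(256)
lik_qmc = simulated_likelihood(y, beta=0.3, sigma=1.0, uniforms=draws)
```

Because the sequence fills the unit interval evenly, the simulated likelihood typically fluctuates less across evaluations than with the same number of pseudo-random draws, which is the variance reduction referred to above.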
The following paper compares, both empirically and theoretically, the efficiency of Monte Carlo EM to that of Simulated Maximum Likelihood.
The following papers give a detailed overview of the problems and challenges associated with stochastic implementations of the EM algorithm.