Transcript talk

Trading Agent Competition
Bassam Aoun
[email protected]
08/11/2004
Outline
TacTex
TAC SCM
Boticelli
Trading Agent
Competition
TAC classic
WhiteBear
Trading Agent Competition
• The TAC annual contest has been designed by
– a team of researchers from the e-Supply Chain
Management Lab at Carnegie Mellon University
– the Swedish Institute of Computer Science (SICS).
• It offers agent designers a forum and a common
platform to evaluate agents’ trading techniques
• It aims to spur research by
– comparing the various approaches
– enabling researchers to build over each other’s ideas;
TAC Games
• TAC-SCM, the Trading Agents Competition in
a Supply Chain Management scenario, was
designed to capture many of the challenges
involved in supporting dynamic supply chain
practices.
• TAC Classic - The software agents will
represent travel coordinators whose goal is to
arrange travel packages (flights, hotel rooms,
and tickets to entertainment events) for
clients.
Outline
Boticelli
TacTex
TAC classic
TAC SCM
WhiteBear
Trading Agent
Competition
TAC SCM Game Overview
Agent Daily Events
Tracking Agents…
TacTex vs. Boticelli
Similarities:
• TacTex and Botticelli build models of the environment and
attempt to optimize with respect to those models
Differences:
• Given computational complexity
– TacTex proposes greedy heuristics
– Boticelli proposes Sample Average Approximation (SAA)
• What if there are any production cycles remaining?
– TacTex uses them to build up an equal inventory of all computer
types to be used to satisfy future orders.
– Boticelli uses stochastic approach to predict future customer
orders through scenarios
Outline
TAC classic
Boticelli
WhiteBear
TacTex
Trading Agent
Competition
TAC SCM
Starting Point
• Formulate optimal solutions for major
decisions
• Used linear programming that plans for the
next several days of production
• They referred to [Benisch et al. 2004]
• Unfortunately, it failed to produce a result
within the 15 seconds available per game
day.
• Proposed a greedy heuristic approach
The greedy production scheduler
• Divide the current orders into two lists:
– orders that are late or will be late if not produced
immediately
– all other orders
• Sort each list in order of decreasing value
Rank by (price − cost + penalties)
• Append the second list to the first
• Go through the list attempting to fill each order:
– Use any computers in inventory that are available
– See if the remaining amount needed can be produced
– If the order can be filled, earmark the computers for
delivery
The first-day ordering strategy
• On the first day: (Supply are cheapest)
– send RFQs: 8800, 4400, 2200, 1100, and 550 of each
component (To maintain flexibility)
• On the second day:
– predict the number of components needed based on
the number of customer RFQs
Prediction based on day #2 RFQ  Total RFQs
– project total future production using the offered
components to find the usable amount
– accept a subset of the offers providing the desired
amount
After day #1…
• Need a different strategy after day 1 (if more
needed)
• Prices are determined by due date
– Supplier has lots of orders before due date  high
price
• Probe price as function of due date with small
RFQs
• Request enough to maintain threshold supply 50
days ahead (assuming current rate of use)
• Only accept if expected profit increase > price
– Marginal value based on assumptions about
computer prices and other components’ costs
Offering Computers
• Find the set of offers that maximizes profit
• Need to estimate P( winning order | offer price)
• Given RFQ and cost of computer in question,
optimal price maximizes
 (price - cost) * P( order | price)
• May not be able to produce all orders for optimal
prices
– Then need to raise prices to reduce demand
• Iteratively raise prices on the least profitable
offers
• Repeat until all orders can be produced
• Too many orders  less capacity for future
order
Summary
• Very little opportunity for optimal decisionmaking
• Lots of prediction in their strategy
• Many attempts at learning and adaptation
• So far only a few are useful
– Predicting future customers’ RFQs is difficult (e.g.
quantity and day factors)
– Predicting the acceptance of an offer P( order | price)
– Number of days to look into the future such that the
model is still valid
Outline
WhiteBear
TAC classic
Trading Agent
Competition
Boticelli
TAC SCM
TacTex
The Botticelli Agent
• This paper addresses scheduling component of
the TAC SCM problem.
• Use stochastic information in the form of
probabilistic models built from historical data
• Formulate the problem as a stochastic program
• Optimize solution using sample average
approximation (SAA)
Maximize expected profit, given a prior for each RFQ
(how likely is it to become an order)
Expected Production Scheduling
• Input
– Bidding policy, product prices, orders, RFQs,
procurement schedule, inventory, historical
data…
• Objective
– maximize order profits and expected offer
profits
• Constraint
– Quantity of SKU j in orders or expected offers
delivered by day t cannot exceed amount of
SKU j produced by day (t-1) + initial inventory
Simple Scheduling Problem
Given:
• Orders (SKU,
quantity, due date,
penalty, price)
• Inventory
• Procurement
schedule
• Number of production
cycles per product
• Product specification
Find:
• Production schedule
that optimizes profit
ILP Solution Constants
• o – set of orders
• Order i = { SKU si, price pi, quantity qi, due date di,
penalty ρi, reserve price ri}
• The profit πil for filling order i on day l
•
•
•
•
•
ak – quantity of component k in initial inventory
bj – quantity of SKU j in initial inventory
C – machine capacity in production cycles
cj – number of production cycles for SKU j
ejk – indicator, is component k part of SKU j
ILP Solution Variables
• zil – indicates whether or not order i is filled on
day l
• yjl – amount of SKU j scheduled for production
on day l
ILP Objective function
and constraints
Such that:
(1)
(2)
(3)
(4)
(5)
Stochastic Programming
• Extending the simple model by adding a set of
RFQs
• Each RFQ is given a prior αi, according to
historical information, indicating its probability of
becoming an order.
Given
• a set of orders, and a set of RFQs today, only a
fraction of which will be realized tomorrow
Goal
• to produce an optimal set of SKUs today, such
that tomorrow’s profits will be maximized.
Additional variables
& constants
• Constants
Ωm – set of possible scenarios
• Variables
zilm – indicates whether or not RFQ i is filled on day l
in scenario m
yjlm – amount of SKU j scheduled for production on
day l in scenario m
wi – indicates whether or not order i is filled on day
1
vj – amount of SKU j scheduled for production on
day 1
Objective function
& constraints
s.t.
Stage 1
Stage 2
Approximation algorithms
Expected Profit/Quantity/Value
algorithms
• Solve variants of the simple scheduling
problem
• Expected profits algorithm uses expected
profits, calculated by multiplying πil by αi
• Expected quantities algorithm uses
expected quantities, calculated by
multiplying qi by αi
• Expected value uses expected profits and
expected quantities
Sample Average Approximation
• Sample the scenario space (used 30 samples in the
paper), and optimize a regular ILP problem:

1
,
N
N  30
• SAA-Greedy
– Samples scenarios with RFQs for only one day,
– Do not reason about future RFQs
• SAA-Average and SAA-Sampling
– Sample scenarios with RFQs for N days,
– They differ on how to sample future RFQs.
Not-In-Time Algorithm
• Ignores stochastic information completely
• Schedules only orders, i.e., realized RFQs
• Production does not begin until one day
after RFQs are received – can lead to late
penalties
Metrics Used
• P
• C
– mean profit per order
– Percentage of cycles used to fill
orders
• P/C – profit per cycle
• EVPI – Expected value of perfect
information
• VSI – Value of stochastic information
Results for the 7 algorithms
Conclusions
1. The stochastic
programs
outperformed all of
the other schedulers
(in all except one
metric)
1. Using stochastic
information
improves
performance
2. Stochastic algorithms
that rely on forecasts
about future RFQs
outperformed SAAG
2. Utilizing more
stochastic
information about
the future leads to
better performance
Outline
Trading Agent
Competition
WhiteBear
TAC SCM
TAC classic
TacTex
Boticelli
TAC Classic
• General problem capturing several issues of
bidding in simultaneous auctions
• Provides a universal testbed for researchers
• Travel agents
– Working on behalf of 8 customers each
• The type of each agent is determined by the preferences by
its clients.
– Arranging for a trip to Tampa
• round-trip flight tickets
• hotel accommodations
• entertainment tickets
– GOAL: Maximize clients’ utilities
TAC Classic
URL: www.sics.se/tac
Outline
TAC SCM
Trading Agent
Competition
TacTex
WhiteBear
Boticelli
TAC classic
General Architecture (Modular)
Follows the Sense Model Plan Act (SMPA) architecture
While (not end of game)
{
Get price quotes
Calculate estimates & statistics
Planner (Formulate desired plan)
Bidder (Bid to implement plan)
}
Challenges:
– The quantity of each good to be bought
– The prices offered for each individual unit
– The times at which the bids are placed.
Determining Partial Strategies
• Determine “boundary strategies”
– E.g. minimum and maximum price for the bid, if
bid price is the issue
• Determine “intermediate strategies”
– By modifying boundary strategies
– By combining boundary strategies
– By using a strategy that constitutes an
equilibrium for a simpler but similar game
Bidding Strategies – Hotels
Auction Rules:
• Ascending and multi-unit auctions with price quotes
announced periodically. One randomly selected auction
closes at minutes 4 to 11 (one each minutes).
Issue: Bid Price
Dilemma:
• If not aggressive, could get outbid and lose rooms
needed
– will get outbid by other agents and lose utility for not
implementing the plan and for unused resources
• If too aggressive, prices will skyrocket and the agent’s
score will get hurt more than other agents’ scores
– All agents’ scores are hurt
– But this hurts the agent more, since rooms it desires will have
an increased price
Bidding Strategies – Hotels (cont.)
1.
Low aggressiveness : (boundary str.)
 Bids higher than the current ask price by an
increment
2. High aggressiveness : (boundary str.)
 Bids for all rooms progressively closer to the
marginal utility
3. Medium aggressiveness : (intermediate str.)
 Combines two previous strategies
 For critical rooms (rooms with high marginal utility)
the bid is close to the marginal utility
 For all other rooms it bids an increment above the
current price (the increment increases as time
passes)
Bidding – Plane Tickets
Auction Rules:
• Ticket prices are expected to increase approximately in
proportion to the square of the time elapsed since the
start of the game
Issue: Time of Bid Placement
Dilemma:
– To bid early in order to get the cheapest tickets
– Or to bid later in order not to limit its options
Solution:
• Bid for some of the tickets at the beginning
• Bids for the rest after some hotel room auctions have closed
• Strategies: Which tickets are bought at the beginning
Bidding – Plane Tickets (cont.)
1. Late Bidder : (boundary str.)
 Buy at the beginning only tickets that are
“certain” to be used
2. Early Bidder : (boundary str.)
 Buy all tickets at the beginning
3. Strategic Bidder : (intermediate str.)
 Modifies “Early Bidder” boundary strategy
 Uses “Strategic Demand Reduction”
 Buy all tickets at the beginning, except the ones
that are “highly likely not to be used”
Decomposing the Problem
Agent
Optimizer / Planner
Partial
Bidding
Strategy 1
Partial
Bidding
Strategy 2
Partial
Bidding
Strategy k
Auction
Type 1
Auction
Type 2
Auction
Type k
Exploring Strategy Space
• Determine the best partial strategy for one
particular auction type
– Keep all other partial strategies fixed
– Use a fixed number of agents using intermediate
strategies
– Vary the mixture of agents using boundary strategies
• Explore strategy space systematically
– Use several experiments to evaluate the strategies
for different auction types
– Use the best partial strategies found in the previous
experiments as the strategies that are kept fixed in
each experiment
– Stop when experiments “converge”
Experimental Results
 Overall the medium and high aggressiveness
versions perform the best
– But the medium aggressiveness agent is more
consistent in general
 Overall the strategic agent versions perform
the best
– The early bidder is significantly better than the
late bidder
 In general you win when you are “going against
the tide”, i.e. being aggressive when most
other agents are not
General Observations
• Planner is adaptive, versatile, fast and robust
• Agent uses both principled methods and
approaches guided by the knowledge acquired
by observing the behavior of the games and
combines both seamlessly
• The agent used in TAC was the strategic
agent with medium aggressiveness
• Agent White Bear always ranks in the top
three agents in all the competition rounds of
the Trading Agent Competition
TAC Classic 2002- 2004
# Agent (2002)
Score
# Agent (2003)
Score
1 WhiteBear
3556 1 ATTac01
3200
2 Southampton
3492 2 PackaTAC
3163
3 Thalis
3351 3 WhiteBear
3142
# Agent (2004)
1 WhiteBear
2 Walverine
3 LearnAgents
Score
4122
3849
3737
References
• The Supply Chain Management Game for the Trading
Agent Competition 2004 (Aranuchalam et al. 2004)
• TacTex03 - A supply chain management agent, (Pardoe
et al. 2004)
• A Stochastic Programming Approach to Scheduling in
TAC-SCM (Benisch et al. 2004)
• A principled study of design tradeoffs for autonomous
trading agents, (Ioannis et al. 2003)