ICLR 2020 (Spotlight) - Shao-Hua Sun · ICLR 2020 (Spotlight) Shao-Hua Sun Te-Lin Wu Joseph J. Lim....

Program Guided Agent ICLR 2020 (Spotlight) Shao-Hua Sun Te-Lin Wu Joseph J. Lim

Page 1:

Program Guided Agent ICLR 2020 (Spotlight)

Shao-Hua Sun Te-Lin Wu Joseph J. Lim

Page 2:

Follow an Instruction to Solve a Complex Task

Recipe: cooking fried rice

Stir-fry the onions until tender, and repeat this for garlic and carrots, if you have soy sauce, add some. Pour 2/3 cups the whisked eggs into the stir-fried and scramble.

Page 3:

Natural Language Instruction

Recipe: cooking fried rice

Stir-fry the onions until tender, and repeat this for garlic and carrots, if you have soy sauce, add some. Pour 2/3 cups the whisked eggs into the stir-fried and scramble.

Ambiguities in Language

• Scoping
• Coreferences
• Entities

• Bahdanau et al. in ICLR 2019
• Misra et al. “Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction” in EMNLP 2018
• Anderson et al. “Vision-and-language navigation: Interpreting visually-grounded navigation instructions in real environments” in CVPR 2018
• Misra et al. “Mapping Instructions and Visual Observations to Actions with Reinforcement Learning” in EMNLP 2017
• Hermann et al. “Grounded Language Learning in a Simulated 3D World” in arXiv 2017

Page 4:

Program

Function: cooking fried rice

    for item in [onions, garlic, carrots]:
        if is_there("soy sauce"):
            add("soy sauce", "pot")
        while not tender(item):
            stir_fry(item)
    pour(whisked("eggs"), "pot", 0.66)
    scramble("eggs")

Advantages of Programs

• Explicit scoping
• Resolved Coreferences
• Resolved Entities
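
To make the point about explicit scoping concrete, the recipe program can be run against a toy kitchen. This is only a sketch: the `Kitchen` class, its `stirs_needed` parameter, and the logging format are hypothetical stand-ins invented for illustration, and the nesting of the `if` and `while` blocks is one possible reading of the slide.

```python
# Hypothetical stand-ins for the primitives named on the slide.
# Nothing here is taken from the authors' code.
class Kitchen:
    def __init__(self, pantry, stirs_needed=2):
        self.pantry = set(pantry)
        self.stirs_needed = stirs_needed  # stirs until an item counts as tender
        self.stirs = {}                   # item -> number of stirs so far
        self.log = []                     # trace of executed primitive calls

    def is_there(self, name):
        return name in self.pantry

    def tender(self, item):
        return self.stirs.get(item, 0) >= self.stirs_needed

    def stir_fry(self, item):
        self.stirs[item] = self.stirs.get(item, 0) + 1
        self.log.append(("stir_fry", item))

    def whisked(self, what):
        return "whisked " + what

    def add(self, what, where):
        self.log.append(("add", what, where))

    def pour(self, what, where, cups):
        self.log.append(("pour", what, where, cups))

    def scramble(self, what):
        self.log.append(("scramble", what))


def cook_fried_rice(k):
    # Under this reading, the soy-sauce check is scoped inside the loop,
    # so it runs once per item rather than once per recipe.
    for item in ["onions", "garlic", "carrots"]:
        if k.is_there("soy sauce"):
            k.add("soy sauce", "pot")
        while not k.tender(item):
            k.stir_fry(item)
    k.pour(k.whisked("eggs"), "pot", 0.66)
    k.scramble("eggs")
    return k.log
```

Calling `cook_fried_rice(Kitchen({"soy sauce"}))` returns the full trace of primitive calls, which shows exactly how often each step ran. The natural-language recipe leaves that count open.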

Pages 5-18:

Problem Formulation

[Figure, animated across these pages: panels Program, State, and Execution; the State panel shows counters such as x3 x1 x0.]

Page 19:

Exemplar Instructions

    def Task():
        if is_there[River]:
            mine(Wood)
            build_bridge()
            if agent[Iron] < 3:
                mine(Iron)
            place(Iron, 2, 3)
        else:
            goto(4, 2)
        while env[Gold] > 0:
            mine(Gold)

    def Task():
        if is_there[River]:
            build_bridge()
        place(Gold, 3, 4)
        if agent[Gold] == 13:
            while agent[Gold] <= 12:
                place(Gold, 8, 3)
        if agent[Iron] >= 8:
            place(Wood, 2, 4)
        elif env[Gold] <= 10:
            sell(Iron)

Programs

Natural Language Instructions

Page 20:

End-to-end Learning Baseline

[Figure: inputs are a State and either a Program or an NL Instruction. Other labels: Module, Module Output, Environment, Action, Policy, Goal, Program Interpreter, Response, Query, Perception Module.]

Page 21:

Program Guided Agent

[Figure: framework diagram. Legend: Module, Module Output. Labels: Program Interpreter, Perception Module, Policy, Environment, Query, Response, Goal, Action.]

Program

    def run():
        while env[Gold] > 0:
            mine(Gold)
        if is_there[River]:
            build_bridge()
        place(Wood, 2, 3)

State: 3 0 1
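
The interaction between the labelled components can be sketched as a small execution loop: the interpreter walks the program, sends queries to a perception function, and hands subtasks (goals) to something that acts. Everything below is an illustrative assumption rather than the authors' implementation: the tuple-based program encoding, the dictionary state, and the hand-written `perceive` and `act` stand-ins are all hypothetical, and the nesting of the example program is one reading of the slide.

```python
import operator

OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
       "<=": operator.le, "==": operator.eq}

# Hypothetical encoding of the slide's run() program as nested tuples.
PROGRAM = [
    ("while", ("env", "Gold", ">", 0), [("subtask", "mine", ("Gold",))]),
    ("if", ("is_there", "River"), [("subtask", "build_bridge", ())]),
    ("subtask", "place", ("Wood", 2, 3)),
]


def perceive(query, state):
    """Rule-based stand-in for a perception module: query -> True/False."""
    if query[0] == "is_there":
        return state["env"].get(query[1], 0) > 0
    scope, item, op, value = query
    return OPS[op](state[scope].get(item, 0), value)


def act(goal, state):
    """Stand-in for policy plus environment: only 'mine' changes the state."""
    name, args = goal
    if name == "mine":
        item = args[0]
        if state["env"].get(item, 0) > 0:
            state["env"][item] -= 1
            state["agent"][item] = state["agent"].get(item, 0) + 1


def interpret(program, state, trace=None):
    """Walk the program, querying perception and issuing goals in order."""
    trace = [] if trace is None else trace
    for node in program:
        kind = node[0]
        if kind == "subtask":
            goal = (node[1], node[2])
            act(goal, state)
            trace.append(goal)
        elif kind == "if":
            if perceive(node[1], state):
                interpret(node[2], state, trace)
        elif kind == "while":
            while perceive(node[1], state):
                interpret(node[2], state, trace)
    return trace
```

With `state = {"env": {"Gold": 2, "River": 1}, "agent": {}}`, `interpret(PROGRAM, state)` issues two `mine` goals, then `build_bridge`, then `place`. The control flow stays in the interpreter, and the other two components only answer queries or carry out goals.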

Page 22:

Program Interpreter

• Parse a given program into three categories:
  • Subtasks (actions): what the agent should perform
  • Perception: information queried from the environment
  • Control flow: which subtasks to call, depending on the perceived information
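
The three categories can be illustrated by tagging each line of an example program. This is a minimal sketch, not the interpreter from the talk: the `SUBTASKS` vocabulary is assumed from the example programs on the slides, and the regular expressions and the `categorize` helper are hypothetical.

```python
import re

# Assumed vocabulary, collected from the example programs on the slides.
SUBTASKS = {"mine", "build_bridge", "place", "goto", "sell"}
PERCEPTION = re.compile(r"(is_there|env|agent)\[\w+\]")
CONTROL = ("if", "elif", "else", "while")


def categorize(line):
    """Return the set of categories that a single program line touches."""
    text = line.strip()
    cats = set()
    first = re.match(r"\w+", text)
    if first and first.group(0) in CONTROL:
        cats.add("control_flow")
    if PERCEPTION.search(text):
        cats.add("perception")
    call = re.match(r"(\w+)\(", text)
    if call and call.group(1) in SUBTASKS:
        cats.add("subtask")
    return cats
```

A line such as `while env[Gold] > 0:` falls into two categories at once, control flow and perception, while `mine(Gold)` is a pure subtask.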

Page 23:

Perception Module

• Extract environmental information for choosing a path in a program
• Input
  • Query: a symbolically represented query (e.g. is_there[River])
  • State s: environment map and agent inventory status
• Output
  • Predicted answer to the query (e.g. True/False)
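
The input/output contract above can be sketched with a rule-based function that parses a symbolic query and answers it from a map and an inventory. This is a hand-written stand-in used only to show the interface; the slides do not describe the module's internals. The grid-of-strings map, the `answer` function, and the query grammar are assumptions based on the queries that appear in the example programs.

```python
import operator
import re

OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
       "<=": operator.le, "==": operator.eq}

# Assumed query grammar: is_there[X], env[X] <op> n, agent[X] <op> n.
QUERY = re.compile(
    r"^(is_there|env|agent)\[(\w+)\](?:\s*(<=|>=|==|<|>)\s*(\d+))?$")


def count_on_map(env_map, item):
    """Count the cells of the (hypothetical) grid map that hold `item`."""
    return sum(row.count(item) for row in env_map)


def answer(query, env_map, inventory):
    """Answer a symbolic query with True/False, given the state."""
    m = QUERY.match(query.strip())
    if m is None:
        raise ValueError("unsupported query: " + query)
    scope, item, op, value = m.groups()
    if scope == "is_there":
        return count_on_map(env_map, item) > 0
    if op is None:
        raise ValueError("comparison missing in query: " + query)
    if scope == "env":
        amount = count_on_map(env_map, item)
    else:
        amount = inventory.get(item, 0)
    return OPS[op](amount, int(value))
```

For example, with a map containing one `River` cell and an inventory of two `Iron`, both `answer("is_there[River]", ...)` and `answer("agent[Iron] < 3", ...)` evaluate to `True`.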

Page 24:

Policy

• Take low-level actions in the environment to fulfill a subtask
• Input
  • Symbolically represented subtask (goal) g
  • State s
• Output
  • Predicted action distribution
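
To show what "goal and state in, action distribution out" can look like, here is a hand-coded sketch for a single `goto(x, y)` goal. It is not the policy from the talk, and the slides do not describe how the policy is implemented. The action set, the coordinate convention, the distance-based scoring, and the `temperature` parameter are all assumptions made for illustration.

```python
import math

# Hypothetical low-level action set and the grid offsets they cause.
ACTIONS = ["up", "down", "left", "right", "stay"]
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0),
         "right": (1, 0), "stay": (0, 0)}


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def goto_policy(goal, agent_pos, temperature=1.0):
    """Map a goto goal and the agent position to {action: probability}."""
    name, (gx, gy) = goal
    if name != "goto":
        raise ValueError("this sketch only handles goto goals")
    x, y = agent_pos
    scores = []
    for action in ACTIONS:
        dx, dy = MOVES[action]
        # Score each action by the Manhattan distance left after taking it.
        remaining = abs(gx - (x + dx)) + abs(gy - (y + dy))
        scores.append(-remaining / temperature)
    return dict(zip(ACTIONS, softmax(scores)))
```

For the goal `("goto", (4, 2))` and an agent at `(1, 2)`, the distribution puts the most mass on `right`. Lowering `temperature` makes the distribution sharper.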

Page 25:

Result

Page 26:

Conclusion

• Specify tasks using programs
• Leverage the structure of programs with a modular framework

Page 27:

Thank You for Your Attention