Iris: A Conversational Agent for Complex Tasks

Reading time: 6 minutes

📝 Abstract

Today’s conversational agents are restricted to simple standalone commands. In this paper, we present Iris, an agent that draws on human conversational strategies to combine commands, allowing it to perform more complex tasks that it has not been explicitly designed to support: for example, composing one command to “plot a histogram” with another to first “log-transform the data”. To enable this complexity, we introduce a domain-specific language that transforms commands into automata that Iris can compose, sequence, and execute dynamically by interacting with a user through natural language, as well as a conversational type system that manages what kinds of commands can be combined. We have designed Iris to help users with data science tasks, a domain that requires support for command combination. In evaluation, we find that data scientists complete a predictive modeling task significantly faster (2.6 times speedup) with Iris than with a modern non-conversational programming environment. Iris supports the same kinds of commands as today’s agents, but empowers users to weave together these commands to accomplish complex goals.

📄 Content

Ethan Fast, Binbin Chen, Julia Mendelsohn, Jonathan Bassen, Michael Bernstein
Stanford University
{ethaen, bchen45, jmendels, jbassen}@stanford.edu, msb@cs.stanford.edu

Under review at UIST 2017

Figure 1: Iris is a conversational agent that helps users with data science tasks. (1) Users interact with Iris through natural language requests and (2) the system responds with real-time feedback on the command the request will trigger. Once a command is triggered, Iris (3) converses with users to resolve arguments, which may also be the result of a (1) new command or (4) previous conversation.

Author Keywords: conversational agents; data science

ACM Classification Keywords: H.5.2 Information Interfaces and Presentation: Natural Language

INTRODUCTION

For decades, the promise of computers that communicate with us through natural language has been depicted in works of science fiction and driven research agendas in artificial intelligence (AI) and human-computer interaction (HCI). As early as 1964, Joseph Weizenbaum demonstrated how a computer program could hold open-ended conversations using a large set of pattern matching rules [45]. Terry Winograd later developed a more sophisticated program that could act upon natural language requests within a simplified “blocks world” [46]. In recent years, virtual assistants such as Siri and Cortana have increasingly applied conversational agents to real-world problems, such as finding local restaurants and scheduling calendar appointments [24].

Speak to these conversational agents as you would to a colleague or graduate student, however, and it becomes clear that they have serious limitations. When you ask a conscientious graduate student to “check for significant differences between conditions”, they might reply, “what kind of differences?” And you would say, “statistical differences, you know, through a t-test.” And they might ask, “you’re talking about income for the high- and low-education conditions?” And you would say, “that’s right, although they are skewed, so you should log-transform them first.” The end result of this conversation is that the graduate student would run a t-test to check for differences between the log-transforms of income data for high- and low-education testing populations. Try something like this with Siri, on the other hand (assuming Siri understood t-tests and log-transforms), and it would respond with “I don’t know what you mean”.

Why does Siri fail? The root of the issue is that Siri (and similar agents) employ a simple model of conversation. While humans engage in complex conversations, interactions with today’s agents can be described by the following algorithm: (1) determine which command a user wants, (2) ask for any missing arguments, and (3) execute it. In other words, every conversation you have with an agent today is equivalent to executing a standalone command. These standalone commands are effective for simple goals such as setting a timer or finding directions, but they restrict agents to tasks that a system has been explicitly designed to handle.

Linguistic theory suggests how conversational agents can overcome this limitation. People do not restrict themselves to standalone commands: we build meaning through combinations of them [16, 28]. We combine commands, for example, by nesting conversations. When a colleague says, “should I do a t-test or Mann-Whitney U?”, we might respond with a new conversation, “well, is the data normally distributed?”. Similarly, we combine commands through references to previous ones: after asking “did you take the log-transform?”, we might say, “what is the variance of that?”

In this paper, we explore an architecture for conversational agents
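To make the contrast concrete, the three-step standalone algorithm described above and the nested alternative can be sketched in Python. This is only an illustration under stated assumptions, not the Iris DSL or its automata: the `Command` class, the `COMMANDS` registry, the request dictionaries, and the `ask` callback (a stand-in for a dialogue turn with the user) are all hypothetical names introduced here.

```python
import math
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Command:
    """Hypothetical command description: a name, argument names, and a function."""
    name: str
    arg_names: List[str]
    run: Callable[..., object]


# Hypothetical registry with two toy data science commands.
COMMANDS = {
    "log-transform": Command(
        "log-transform", ["data"], lambda data: [math.log(x) for x in data]
    ),
    "mean": Command("mean", ["data"], lambda data: sum(data) / len(data)),
}


def run_standalone(request, ask):
    """Standalone model: (1) pick the command, (2) ask for missing args, (3) execute."""
    command = COMMANDS[request["command"]]            # (1) determine the command
    args = dict(request.get("args", {}))
    for name in command.arg_names:                    # (2) ask for missing arguments
        if name not in args:
            args[name] = ask(command.name, name)
    return command.run(**args)                        # (3) execute it


def run_composed(request, ask):
    """Same loop, except that an argument may itself be a command request.

    A nested request is resolved by a recursive call, which plays the role of
    a nested conversation before the outer command executes.
    """
    command = COMMANDS[request["command"]]
    supplied = request.get("args", {})
    args = {}
    for name in command.arg_names:
        value = supplied[name] if name in supplied else ask(command.name, name)
        if isinstance(value, dict) and "command" in value:
            value = run_composed(value, ask)          # nested conversation
        args[name] = value
    return command.run(**args)


# Toy data chosen so that the log-transformed values are close to 0, 1, and 2.
incomes = [1.0, math.e, math.e ** 2]


def scripted_user(command_name, arg_name):
    """Stand-in for a real user: replies to each argument question from a script."""
    replies = {
        ("mean", "data"): {"command": "log-transform"},  # "log-transform it first"
        ("log-transform", "data"): incomes,
    }
    return replies[(command_name, arg_name)]


composed_result = run_composed({"command": "mean"}, scripted_user)
standalone_result = run_standalone(
    {"command": "mean", "args": {"data": [1, 2, 3]}}, scripted_user
)
```

In this sketch, `run_composed` answers the question "mean of what?" with another command, so the mean is taken over the log-transformed values. `run_standalone` has no way to interpret such a reply and can only consume plain values, which mirrors the limitation of standalone commands described above.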

This content is AI-processed based on ArXiv data.
