During the recent hype of ChatGPT and the Bing Search Bot I had the thought that these chat bots - if they know more languages than English - can be a nice addition to foreign language learning. This post outlines a very rough idea how such a bot can be created using the GPT API.

The Situation

I have been thinking about ways to improve my process of learning foreign languages for a long time (with some ideas, but no good achievement). According to my experience there are two difficult things:

  1. learning to actively use a language (speaking and writing)
  2. improving once you’ve reached some basic understanding (usually around level B1 or B2, I think)

One or two years ago I decided that my primary focus for learning foreign languages is reading news and books, so I do not have to bother with writing and speaking. However, reading books and news needs a certain skill level. Trying to read it too early usually leads to a lot of frustration for me. For many languages there is a lot of material for introduction to get up to speed with a passive understanding on a basic level, but the air often gets thin starting at intermediate levels.

Recently, however, I also felt that active usage of a language cannot be completely disregarded. During a social event at work a Mexican colleague started to speak some Spanish. While I was able to understand some things, I was not able to respond in even the simplest phrases - nada, nothing. I still do not want to become a perfectly fluent speaker of a language, but at least uttering some thoughts might be useful.

Using GPT for a Foreign Language Tandem Bot

I’m quite skeptical towards some ideas of the current GPT hype, but these models were made for generating text. So they should be a perfect foreign language text generation engine. And if Microsoft and Co. are throwing billions of dollars towards a software that can help me learn foreign languages I’m willing to make use of it.

Since I am a programmer, I usually want to use the programmable gateway, not an end-user application. For GPT, we can use the OpenAI playground for this. There we can find the model text-davinci-003, which seems to be the latest GPT model.

If you only know ChatGPT or the Bing Search Bot, you’ll need to understand that GPT is a one-shot text completion engine. It takes the input text you give it and completes it with a text that is probable to follow the given input text. To get some variaty in the results, you’ll usually set some randomness.

This means that chat bots are only an extension to GPT. In its raw form GPT takes only a single text as an input and give a continuation for this text. Examples in the OpenAI playground set up a schema that will guide the model into the correct continuation. E.g. there is an example converting movie titles to emojis. The goal of the example is to get a emoji representation for the movie “Star Wars”, and the model input is:

Convert movie titles into emoji.

Back to the Future: 👨👴🚗🕒
Batman: 🤵🦇
Transformers: 🚗🤖
Star Wars:

As you can see the prompt already gives GPT three examples. The language model can detect from these examples that a correct continuation of the text probably must have some emojis after the text Star Wars:.

The response according to the example is ⭐️⚔.

Implementation

So, let’s implement our chat bot. OpenAI provides a Python package called openai to work with the API. It’s quite simple to use. Since GPT works as outlined above, you only need to call a single function which takes input text and returns an output text. The chat bot needs to be programmed around this function.

For the code we’ll need the following imports:

import dataclasses
import os
import openai
import sys

We’ll start with some data structures to handle history. We use a class ChatMessage to represent a input/response pair. Since input is a reserved keyword and I did not want to write input_ all the time, I called the field human. But I really wanted to get away from the whole AI hype and thus called the response output.

@dataclasses.dataclass
class ChatMessage:
    human: str
    output: str

Since we need to convert chat history into a new ChatGPT input prompt we’ll create a class ChatHistory for this:

class ChatHistory:
    def __init__(self):
        self._context = []

    def build_prompt(self, prompt, input_):
        prompt += '\n\n'

        for chat in self._context:
            prompt += f'Input: {chat.human}\nOutput: {chat.output}\n'

        return prompt + f'Input: {input_}\nOutput:'

    def update_context(self, human, output):
        self._context.append(ChatMessage(human=human, output=output))
        
        # make sure the list does not grow infinitely
        if len(self._context) > 10:
            del self._context[0]

According to my tests an initial chat example was not needed. GPT also understood without further examples that it should write its answer directly after Output:. If this does not work, feel free to add initial chat messages to self._context.

We include some older chat messages so that GPT knows that the dialog is about at the moment. We do not want to input the whole history back into GPT, because GPT can only handle a limited amount of text and it would also become too expensive. Thus, this class just cuts off older chat entries. A more advanced solution might be to let GPT generate a summary of the current chat that is much shorter and prepend that summary to the new chat message. This approach has the advantage that important context from older messages might be retained. I might write a blog article about this somewhen in the future.

Most or all GPT examples usually work by using an introduction sentence describing the situation for the following text. In web discussion this is often called the initial prompt. The initial prompt can heavily change the output of the model, so this is where you’d want to play around to achieve other behaviours. I have no experience in prompt engineering, so I used a variation of the GPT example prompt for a chat bot that worked for my tests.

If you’re creating a bot that you want to publish to others, you should be aware that some people also try to “escape” the initial prompt to make the bot do other things. You’ll have to check how dangerous this is for your use case.

To easily switch languages I take the language name as input from the command line.

language = sys.argv[1]
initial_prompt = f"""The following is a conversation between a human and a
foreign language tandem partner AI. The AI is intelligent, funny and friendly.
It answers all questions in {language}."""

With all the setup done we can then start an infinite loop taking input and retrieving text continuations (i.e. chat responses) from GPT:

history = ChatHistory()

while True:
    input_ = input('Input: ')

    response = openai.Completion.create(
      model="text-davinci-003",
      prompt=history.build_prompt(initial_prompt, input_),
      temperature=0.9,
      max_tokens=150,
      top_p=1,
      frequency_penalty=0.0,
      presence_penalty=0.6,
      stop=["Input:", "Output:"]
    )
    
    output = response['choices'][0]['text'].strip()
    print(f'Output: {output}')

    history.update_context(input_, output)

I hope that I find some time to bring this into a more useful form. There are some ideas in my head how I could improve it and make it really useful. I’ll also need to evaluate cost a bit. 1000 tokens cost 0.02$. At first I thought that this will be extremely cheap, but since each request needs to include some history of the chat it might be more expensive than initially thought. It might make sense to evaluate whether cheaper models also can write correct sentences in foreign languages.

I do not maintain a comments section. If you have any questions or comments regarding my posts, please do not hesitate to send me an e-mail to blog@stefan-koch.name.