GPT-3 is one of the most exciting and versatile language models to date, and it is good at almost everything. OpenAI released it in June 2020, and since then it has been treated as the gold standard among language models. It can perform a huge number of tasks such as language translation, question answering, code generation, text summarization, tweet classification and many more.
Its primary speciality lies in the accuracy of its predictions, which is often nearly indistinguishable from human output. Not only that, it can also quickly adapt to new natural language tasks when given just a few examples. The question is: how is a single model equipped to perform so many tasks with such accuracy? Let's figure it out.
One of the main reasons GPT-3 performs so well is its 175 billion parameters, which made it the largest neural network to date at release. Additionally, it has been trained on a huge portion of the text available on the internet, making it extremely adaptable.
What is GPT-3?
GPT (Generative Pre-trained Transformer) language models are based on the transformer architecture and support zero-, one- and few-shot settings for multi-task problem solving. That is, they can quickly adapt to a new task given zero, one or a few examples.
The architecture of GPT-3 is an improvement over its predecessors, GPT-1 and GPT-2. The initial GPT-1 model combined unsupervised pre-training with supervised fine-tuning to suit a single task. Later, GPT-2 was introduced, which did not require supervised fine-tuning for a particular task and worked well across multiple tasks.
Finally, GPT-3 has a similar underlying architecture, but its depth has increased considerably (it is more than 100 times bigger than GPT-2). Some of the large datasets it was trained on are CommonCrawl, Wikipedia and WebText. Massive data and tremendous model depth together serve as the key to its versatility.
Now, let’s dive right into the implementation part.
OpenAI API for GPT-3
As you may know, the GPT-3 model is highly powerful; its code and weights have not been released as open source, citing safety concerns. Instead, OpenAI has released a beta API through which you can access and experiment with the different capabilities of the language model.
To get access to the API, you need to create an account and join the waitlist here. Once accepted, you can generate your own API key token and use it to play around with various examples. Here, we will go through three different examples in detail: Natural Language to SQL Query, Advanced Tweet Classification, and a ChatBot.
Let's start by understanding a few programming concepts that set the base for coding against the GPT-3 API.
1. The input and modeling part of the API is structured in terms of prompts and completions.
- The prompt is the input text passed to the model. It can be custom designed based on the kind of task you want the model to perform.
- The completion endpoint of the API returns the completion (expected outcome) of the prompt. It supports several tasks such as code generation, text summarization, etc.
2. The maximum length of the prompt is limited to 2048 tokens, or roughly 1,500 words. Tokens are either whole words or the few-character sub-word chunks that words are split into.
3. The API also provides several models to choose from: Davinci, Babbage, Ada and Curie. Of these, Davinci is the most powerful in terms of accuracy, whereas Ada is the fastest.
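Since the API rejects prompts above the token limit, it helps to sanity-check prompt length before sending a request. The sketch below uses the rough rule of thumb implied by the numbers above (2048 tokens ≈ 1,500 words, i.e. about 0.75 words per token); it is a heuristic, not the model's actual tokenizer, and the function names are illustrative.

```python
def estimate_tokens(text):
    """Rough token estimate: on average one token is about 0.75 English words."""
    words = len(text.split())
    return int(words / 0.75)

def fits_prompt_limit(text, limit=2048):
    """Heuristic check that a prompt stays under the API's token limit."""
    return estimate_tokens(text) <= limit

print(estimate_tokens("Fetch unique values of DEPARTMENT from Worker table."))
print(fits_prompt_limit("a short prompt"))
```

For anything close to the limit, the real tokenizer should be consulted instead of this approximation.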
Tasks
1. Natural Language to SQL Query:
This task converts the natural language input to a SQL query.
Step 1: Install the OpenAI API.
!pip install openai
Step 2: Import libraries and set the API key.
import openai
import uuid

openai.api_key = "Your_API_Key"
Step 3: Define the GPT3 class and Example class.
The Example class stores an input-output pair; adding such examples makes the model learn about the particular task.
class Example:
    # Stores an input-output pair and formats it to prime the model
    def __init__(self, input, output):
        self.input = input
        self.output = output
        self.id = uuid.uuid4().hex

    # To obtain the input provided for an example
    def get_input(self):
        return self.input

    # To obtain the output provided for an example
    def get_output(self):
        return self.output

    # To obtain the unique id of an example
    def get_id(self):
        return self.id
The GPT3 class handles interaction with the OpenAI Completion endpoint: it sets the request parameters and converts the examples and prompt to the format the API expects.
class GPT3:
    """
    Params:
    engine: Model to be used. Options are Davinci, Babbage, Ada and Curie.
    temperature: Amount of randomness introduced in the model's predictions.
        A higher value is useful for creative applications, whereas a lower
        value suits well-defined answers.
    max_tokens: Maximum number of tokens in the completion.
    """

    # Initialises parameters and adds default values
    def __init__(self,
                 engine='davinci',
                 temperature=0.5,
                 max_tokens=100,
                 input_prefix="input: ",
                 input_suffix="\n",
                 output_prefix="output: ",
                 output_suffix="\n\n",
                 append_output_prefix_to_query=False):
        self.examples = {}
        self.engine = engine
        self.temperature = temperature
        self.max_tokens = max_tokens
        self.input_prefix = input_prefix
        self.input_suffix = input_suffix
        self.output_prefix = output_prefix
        self.output_suffix = output_suffix
        self.append_output_prefix_to_query = append_output_prefix_to_query
        self.stop = (output_suffix + input_prefix).strip()

    # Adds an example to the model object. ex is an instance of the Example class.
    def add_example(self, ex):
        assert isinstance(ex, Example), "Please create an Example object."
        self.examples[ex.get_id()] = ex

    # Converts all the examples to a single string to prime the model
    def get_prime_text(self):
        return "".join(
            [self.format_example(ex) for ex in self.examples.values()])

    # Creates a query for the API request
    def craft_query(self, prompt):
        q = self.get_prime_text() + self.input_prefix + prompt + self.input_suffix
        if self.append_output_prefix_to_query:
            q = q + self.output_prefix
        return q

    # Calls the API using the Completion endpoint with the specified parameters
    def submit_request(self, prompt):
        response = openai.Completion.create(engine=self.engine,
                                            prompt=self.craft_query(prompt),
                                            max_tokens=self.max_tokens,
                                            temperature=self.temperature,
                                            top_p=1,
                                            n=1,
                                            stream=False,
                                            stop=self.stop)
        return response

    # Formats an input-output pair with the appropriate prefixes and suffixes
    def format_example(self, ex):
        return self.input_prefix + ex.get_input() + self.input_suffix + \
               self.output_prefix + ex.get_output() + self.output_suffix
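One detail worth noting in the constructor above: the stop sequence is derived from the output suffix and input prefix, so generation halts before the model starts inventing a new "input:" line of its own. With the default values the computation works out to:

```python
# Mirrors the class defaults shown above
output_suffix = "\n\n"
input_prefix = "input: "

# strip() removes the surrounding whitespace, leaving just the bare prefix
stop = (output_suffix + input_prefix).strip()
print(stop)  # prints "input:"
```

Passing this string as the `stop` parameter tells the Completion endpoint to cut the generated text off as soon as it would begin another example.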
Step 4: Create an object of the GPT3 class with custom values of parameters.
# Creates an object of the GPT3 class
gpt3 = GPT3(engine="davinci", temperature=0.5, max_tokens=100)
Step 5: Add examples using the object of the GPT3 class and the add_example() function.
# Adding examples of queries in natural language and their SQL counterparts
gpt3.add_example(Example('Fetch unique values of DEPARTMENT from Worker table.',
                         'Select distinct DEPARTMENT from Worker;'))
gpt3.add_example(Example('Get all details of workers who have top 5 salaries.',
                         'Select * from Worker where SALARY in (Select distinct top 5 SALARY from Worker order by SALARY desc);'))
gpt3.add_example(Example('Display the highest salary from the Worker table.',
                         'Select max(Salary) from Worker;'))
gpt3.add_example(Example('Fetch the count of employees working in the department Admin.',
                         "SELECT COUNT(*) FROM worker WHERE DEPARTMENT = 'Admin';"))
gpt3.add_example(Example('Get all details of the Workers whose SALARY lies between 5000 and 10000.',
                         'Select * from Worker where SALARY between 5000 and 10000;'))
gpt3.add_example(Example('Print the first three characters of FIRST_NAME from Worker table.',
                         'Select substring(FIRST_NAME,1,3) from Worker;'))
Step 6: Design the prompt, send the request to the API and get the output.
Example: 1
prompt = "Get count of the Workers whose SALARY is more than 5000."
response = gpt3.submit_request(prompt)
response.choices[0].text  # Chooses the topmost response if there are multiple outputs
Output:
output: Select count(*) from Worker where SALARY > 5000;\n\n
Example: 2
prompt = "Get all details of the Workers whose AGE lies between 25 and 35"
output = gpt3.submit_request(prompt)
output.choices[0].text  # Chooses the topmost output if there are multiple outputs
Output:
output: Select * from Worker where AGE between 25 and 35;\n\n
To increase the accuracy of the predictions and adapt the model to a certain task, add more examples.
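To see why more examples help, it is useful to look at what is actually sent to the API: every example is flattened into the prompt as an input/output pair, followed by the new query, so each added example gives the model another demonstration in context. Below is a standalone sketch of that priming format; the helper name `build_few_shot_prompt` is illustrative and simply mirrors what `format_example` and `craft_query` do above.

```python
def build_few_shot_prompt(examples, query,
                          input_prefix="input: ", input_suffix="\n",
                          output_prefix="output: ", output_suffix="\n\n"):
    """Flattens (input, output) pairs into a primed prompt ending with the new query."""
    prime = "".join(input_prefix + inp + input_suffix +
                    output_prefix + out + output_suffix
                    for inp, out in examples)
    return prime + input_prefix + query + input_suffix

prompt = build_few_shot_prompt(
    [("Display the highest salary from the Worker table.",
      "Select max(Salary) from Worker;")],
    "Get count of the Workers whose SALARY is more than 5000.")
print(prompt)
```

The model then continues this text with an `output:` line, which is exactly the completion the examples above show.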
2. Advanced Tweet Classification:
This task will classify tweets according to their sentiments: Positive, Negative or Neutral.
The first three steps are the same as in the first task. Next, you need to change a few parameter values and add examples relevant to this specific use case.
Step 1: Create an object of the GPT3 class with defined parameters.
gpt3_tweet = GPT3(engine="davinci", temperature=0.3, max_tokens=60)
Step 2: Add relevant examples of tweets and their sentiments.
gpt3_tweet.add_example(Example('I loved the new Batman movie!', 'Positive'))
gpt3_tweet.add_example(Example('My day has been ?', 'Positive'))
gpt3_tweet.add_example(Example('This is the link to the article', 'Neutral'))
gpt3_tweet.add_example(Example('I hate it when my phone battery dies', 'Negative'))
Step 3: Design the prompt, send the request to the API and get the output.
Example: 1
prompt = 'I love marvel movies!'
output = gpt3_tweet.submit_request(prompt)
output.choices[0].text  # Chooses the topmost output if there are multiple outputs
Output:
Sentiment: Positive\n\n
Example: 2
prompt = 'I feel sad when I do not have friends!'
output = gpt3_tweet.submit_request(prompt)
output.choices[0].text  # Chooses the topmost output if there are multiple outputs
Output:
Sentiment: Negative\n\n
3. Chat Bot:
This task builds a conversational AI chatbot.
The first two steps are the same as in the first task. To build a conversational bot, input is taken from the user as a prompt, and the functions are changed slightly to suit this purpose.
Step 1: Develop a custom function for gpt3.
# Sets the parameters and returns the answer along with the new prompt,
# formatted appropriately to continue the chat
def gpt3(prompt,
         engine='davinci',
         response_length=64,
         temperature=0.7,
         top_p=1,
         frequency_penalty=0,
         presence_penalty=0,
         start_text='',
         restart_text='',
         stop_seq=[]):
    response = openai.Completion.create(
        prompt=prompt + start_text,
        engine=engine,
        max_tokens=response_length,
        temperature=temperature,
        top_p=top_p,
        frequency_penalty=frequency_penalty,
        presence_penalty=presence_penalty,
        stop=stop_seq,
    )
    answer = response.choices[0]['text']
    new_prompt = prompt + start_text + answer + restart_text
    return answer, new_prompt
Step 2: Develop and call a chat function that continuously takes input and generates output.
# Loops through the gpt3 function to build an interactive application
# with appropriate start and restart sequences
def chat():
    prompt = """Human: Hi, how are you?
AI: I'm good! What would you like to chat about?
Human:"""
    while True:
        prompt += input('You: ')
        answer, prompt = gpt3(prompt,
                              temperature=0.9,
                              frequency_penalty=1,
                              presence_penalty=1,
                              start_text='\nAI:',
                              restart_text='\nHuman: ',
                              stop_seq=['\nHuman:', '\n'])
        print('GPT-3:' + answer)

chat()  # Calling the function
Step 3: You are ready to chat with your custom-built AI bot.
Sample output of chat:
You: Hi, How are you?
GPT-3: I'm good. Thank you for asking.
You: Tell me the famous food of South India.
GPT-3: The most common regional food is South Indian thali. It includes Rice, Sambar, Rasam, Curry, Appalams and Papad.
You: Thank you. Which cities are most famous in India?
GPT-3: Chennai, the capital city of Tamil Nadu is most popular for Music. It has large number of musical instruments manufactured in this place. Similarly, Mumbai airport is known as Queen's airport it handles more domestic and international traffic than any other Indian Airport.
You: Cool. Where is Taj Mahal located?
GPT-3: Taj Mahal is located in Agra, Uttar Pradesh.
You: How many states are there in India?
GPT-3: In India, there are 28 states.
You: Thank you for talking. See you soon. Bye!
GPT-3: Sure. See you soon. Bye!
In this blog, we introduced the GPT-3 model and experimented with three different tasks that demonstrate its versatile capabilities. To try out more examples, you can visit the official OpenAI website. Given its power, GPT-3 can perform many tasks when given instructions in natural language along with a few appropriate examples.
Thank you for reading!