
EDMONTON, February 16, 2021 – There have been many articles published over the last year describing GPT-3’s abilities to assist in writing books and articles, translating between languages, providing tips, playing games and more. While AI systems already exist to perform these functions they are generally “narrow AI” and require extensive programming and training. GPT-3 is unique in that it is a General AI system that can perform all of these tasks and can do so with limited training.
This article discusses my experience teaching GPT-3 to translate natural language into SQL (structured query language). Its not a complete translator, but it’s pretty good and with additional training could probably be taught most of SQL.
GPT-3 is a massive language model, with billions of control parameters, that were trained using billions of language sequences including: Common Crawl (crawling the web), thousands of books, and Wikipedia. It has never been trained explicitly to do math, write poems, or generate SQL, yet with simple prompting examples GPT-3 can be “taught” to perform these functions.
GPT-3’s knowledge and abilities are not infinite, but they are extensive and in some ways superior to human knowledge. GPT-3 is aware of billions of “facts” about the world and can be taught to elicit them on demand. GPT-3 is aware of simple math, can write poems and provide word definitions. GPT-3 is also really good at summarizing text, which is an important skill when translating a natural language statement into a mathematical query like SQL.
One area that GPT-3 has been applied to is generating SQL which is a query language for accessing databases. SQL is based on set theory, which is a mathematical concept for reasoning about sets of things in the world. For example, an employee database can be thought of as a set of records containing facts about employees. SQL can be used to manage these sets and to generate reports from the data. SQL statements for accessing data take the following form:
Select <field names> From <select table> [join type] Join On <join expression> Where <selection expression> Order by <field> [Ascending or Descending] Group by <field> |
With a small collection of examples, I was able to train GPT-3 to do this translation. Here are some of the examples I used:
The first example is simple and consists of three parts, 1) the database description in natural language, 2) the natural language query and 3) the translation example in SQL. Note that the natural language statement doesn’t include any mathematical symbols, yet GPT-3 is able to figure out, from this one example, that these mathematical symbols are translations from natural language. Without writing any code at all, GPT-3 knows more from less, large from small, most from least, etc.
gpt.add_example(Example(‘Given a table Employee with columns name and salary. Find the employees that make more then 50k per year => sql:’, ‘Select name, salary from Employee where salary>=50000’)) |
This next example illustrates the translation of “Highest” to the SQL keyword “Max”. GPT-3 seems to be able to easily figure out the mapping to max and min without the requirement to post many examples. GPT-3 uses its understanding of natural language to translate correctly.
gpt.add_example(Example(‘Given a table store with columns name and earnings. Get the name of the store with the highest earnings last year => sql:’, ‘Select name, max(earnings) from Store where year = 2019’)) gpt.add_example(Example(‘Given a table store with columns name, earnings and year. Get the name of the store with the lowest earnings last year => sql:’, ‘Select name, min(earnings) from Store where year = 2019’)) |
This next example shows GPT-3 how to limit the results and how to order the results.
gpt.add_example(Example(‘Given a table Students with columns name and grade. Display the first 50 students with grades above 3.0 => sql:’, ‘Select top 50 name, grade from students where grade>3.0 Order by grade Desc’)) |
In SQL, its possible to chain results together so that one query can be used as input into another query. In the following example, top tallest athletes are selected and then a query is made to find the fastest one. Once given this example, GPT-3 should be able to match to combination of best, fastest, quickest, etc.
gpt.add_example(Example(‘Given a table Athletes with columns name and time. Of the top 10 tallest athletes find the athlete with the best time => sql:’, ‘Select name, Min(time) from SELECT Top 10 name,time,height from Athletes order by height’)) |
This final example is much more complicated. It provides a three way linkage. It also illustrates the use of table aliases to integrate different parts of the declarative query. During my training with other examples I found that GPT-3 easily generalized to using these aliases, but could figure out how to make them unique. It might be possible to train this with more examples; however, if I stick to table names with unique first letters, GPT-3 seems to solve the rest of the query mapping without a problem.
gpt.add_example(Example(‘Given a table products with columns product_name, list_price, category_id and brand_ID. Given a table Categories with columns category_name and category_id. Given a table Brand with columns brand_name and brand_ID. The Product tabled is linked to the Categories table with the category_id field. The product table is linked to the Brand table with the brand_id field. List all the products by product name, brand name and category and sort the results by product_name. => sql:’, ‘SELECT product_name, category_name, brand_name, list_price FROM products p INNER JOIN categories c ON c.category_id = p.category_id INNER JOIN brands b ON b.brand_id = p.brand_id ORDER BY product_name DESC’)) |
I don’t know how good these examples are or how much coverage they provide. I added a few extra examples to reinforce some of the basic concepts of SQL syntax. So now I can experiment.
Experiment 1
Input: “Given a table Athletes with columns name and time. Find the top 5 athletes with the worst times. => sql:” GPT-3 Output: “SELECT name, time FROM Athletes ORDER BY time DESC LIMIT 5” |
So this worked. GPT-3 was able to recognize that the sort needed to be descending and not ascending. It also recognized inserted the correct table and field names and generated an SQL query that is syntactically and semantically correct. At one point in my career I taught computer classes to business students. My experience tells me that GPT-3 is as intelligent as most of the students that I taught and perhaps more intelligent that some.
Let’s try something a little more complicated. A one-to-many pair of tables with a conditional that involves both tables. This query should be based on tables that are completely distinct from the names that I used in my training.
Experiment 2
Input: “Given table Room with columns room_name and room_ID. Given the table Thing with fields thing_name, room_ID and price. The Room table is linked to the Thing table by the field room_ID. Find the one Room with the most expensive average price of Things in the room. => sql:” GPT-3 Output: “SELECT room_name, AVG(price) FROM Room r INNER JOIN Thing t ON r.room_ID = t.room_ID GROUP BY room_name ORDER BY AVG(price) DESC LIMIT 1” |
This was a great result. Clearly GPT-3 has a good grasp of the concept “most” and can use this concept in generating queries.
Here is something a bit more complicated involving furniture.
Experiment 3
Input: “Given table Furniture with columns furniture_name and furniture_ID. Given the table Location with fields description, furniture_ID and area. The Furniture table is linked to the Location table by the field furniture_ID. Find the list of furniture with the largest area. => sql:”GPT-3 Output:”SELECTfurniture_name, AVG(area) FROM Furniture f INNER JOIN Location l ON f.furniture_ID = l.furniture_ID GROUP BY furniture_name ORDER BY AVG(area) DESC” |
The result is correct but it uses the “AVG” function whereas the query actually asked for the total area. To be fair, the function “SUM” did not appear in any of the examples and so there is no reason to expect that it should know to use that particular keyword.
But wait…
With GPT-3 you can make the same query more than once and GPT-3 will try again and force itself to produce a unique answer. So that’s what I did. I re-queried GPT-3 with the same query in the same session. Here is the result:
GPT-3 Output 2: “SELECT furniture_name, SUM(area) FROM Furniture f INNER JOIN Location l ON f.furniture_ID = l.furniture_ID GROUP BY furniture_name ORDER BY SUM(area) DESC”
This time it got it right!! But how did GPT-3 figure out to use the “SUM” function? This did not appear anywhere in my example list. Obviously GPT-3 has seen SQL before and is able to generalize to this case. What other tricks does GPT-3 already know about.
Here is another query with surprisingly good results.
Experiment 4
Input: “Given table Company with columns company_name and company_ID. Given the table Employee with fields employee_name, company_ID, job_ID, wage and height. Given the table Job with columns job_name, job_ID and description. The Company table is linked to the Employee table by the field company_ID. The Job table is linked to the Employee table with the field job_ID. Find the list of jobs that pay the most. => sql:” GPT-3 Output: “SELECT job_name, MAX(wage) FROM Company c INNER JOIN Employee e ON e.company_ID = c.company_ID INNER JOIN Job j ON j.job_ID = e.job_ID GROUP BY job_name ORDER BY MAX(wage) DESC” |
Correct… mostly. I did request a list. The end result is ALL the jobs, but they are correctly ordered and there is one result — the max result listed for each job.
Here is the input SQL that I used to test the query:
CREATE TABLE Company (company_name CHAR(10), company_ID INT ); INSERT INTO Company (company_name,company_ID) VALUES (‘m1’,1); INSERT INTO Company (company_name,company_ID) VALUES (‘m2’,2); INSERT INTO Company (company_name,company_ID) VALUES (‘m3’,3); CREATE TABLE Employee ( employee_name CHAR(10), company_ID INT, job_ID INT, wage INT ); INSERT INTO Employee (employee_name,company_ID,job_ID,wage) VALUES (‘c1’,1,100,1000); INSERT INTO Employee (employee_name,company_ID,job_ID,wage) VALUES (‘c2’,1,200,2000); INSERT INTO Employee (employee_name,company_ID,job_ID,wage) VALUES (‘c3’,1,300,3000); INSERT INTO Employee (employee_name,company_ID,job_ID,wage) VALUES (‘c4’,2,100,4000); INSERT INTO Employee (employee_name,company_ID,job_ID,wage) VALUES (‘c5’,2,200,5000); INSERT INTO Employee (employee_name,company_ID,job_ID,wage) VALUES (‘c6’,2,300,6000); INSERT INTO Employee (employee_name,company_ID,job_ID,wage) VALUES (‘c7’,3,100,7000); INSERT INTO Employee (employee_name,company_ID,job_ID,wage) VALUES (‘c8’,3,200,8000); INSERT INTO Employee (employee_name,company_ID,job_ID,wage) VALUES (‘c9’,3,300,9000); CREATE TABLE Job ( job_name CHAR(10), job_ID INT ); INSERT INTO Job (job_name,job_ID) VALUES (‘j1’,100); INSERT INTO Job (job_name,job_ID) VALUES (‘j2’,200); INSERT INTO Job (job_name,job_ID) VALUES (‘j3’,300); Here is the result of the query: job_name, Max(wage) j3, 9000 j2, 8000 j1, 7000 |
Note that I used the site “https://www.db-fiddle.com” to try out all the queries. They all work as expected.
Conclusion
This is not a comprehensive overview of SQL, but it covers the basics. With further examples extensions could easily be made to the translation capabilities of the system. You don’t have to be an SQL expert though to be impressed with the raw talent of the GPT-3 system. It’s innate capacity for SQL translation exceeds that of most people. But GPT-3 does make mistakes, which means that, in its current form, it can’t be the sole component of a commercial SQL translation system.
GPT-3 is a new kind of AI system, something that hasn’t been seen before. It’s going to take time for the industry to figure out what it is and how it works, but clearly it has capabilities that have significant commercial potential. The intent of this article was to give the reader some sense of the capabilities of GPT-3 as well has assist future GPT-3 programmers with training examples that demonstrate working implementations.