SQL (Structured Query Language) is a widely-used programming language for managing and manipulating data stored in relational databases. As the need for data analysis and database management continues to grow, an increasing number of job roles demand SQL proficiency. However, grasping SQL queries can be daunting, particularly for novices.
Fortunately, the advent of cutting-edge technology, such as the ChatGPT-4 language model, can help master SQL queries. In this article, we will delve into how ChatGPT-4 can aid in creating SQL queries and assessing their performance. By using ChatGPT-4, you can not only enhance your SQL query writing skills but also leverage its capabilities to evaluate and optimize query performance.
Contents
- ChatGPT-4 vs ChatGPT-3
- Solve SQL knowledge tasks with ChatGPT-4
- Generate query solutions
- Analyze query performance
- Check the AI-generated queries with Query Profiler
- Conclusion
ChatGPT-4 vs ChatGPT-3
As an AI language model, ChatGPT is constantly evolving through new iterations and improvements. The latest version, ChatGPT-4, is more advanced than its predecessor, ChatGPT-3, in several ways. One of the key differences between the two is the model size. ChatGPT-4 is larger than ChatGPT-3, with more parameters, which enables it to understand users better and generate more complex and nuanced responses. Additionally, ChatGPT-4 has been trained on a more extensive and diverse dataset than ChatGPT-3, allowing it to have a better contextual understanding and more accurate responses to a broader range of topics. Moreover, ChatGPT-4 exhibits improved performance in terms of accuracy, response generation, and comprehension of context due to enhancements in architecture, training techniques, and optimization. It has also been fine-tuned on a more diverse set of tasks and prompts, which further enhances its ability to handle different types of queries and provide accurate and contextually relevant responses.
Although ChatGPT-4 has its own set of limitations, it is designed to address some of the shortcomings of ChatGPT-3 and may have better control over its verbosity and relevance in responses. Overall, the advancements in ChatGPT-4 make it a powerful tool for improving language skills and providing accurate and relevant responses to a wide range of queries.
Solve SQL knowledge tasks with ChatGPT-4
In this article, we will showcase how ChatGPT-4 can be a valuable tool for SQL developers, DBAs, and data analysts by demonstrating its ability to efficiently solve common SQL knowledge tests encountered in job interviews, courses, and other contexts. We will also evaluate the AI’s ability to optimize queries and compare its predictions with the actual results obtained using the industry-leading tool, dbForge Studio for Oracle.
Convert MySQL syntax to Oracle
The first SQL conversion task we have for ChatGPT-4 is to convert MySQL syntax to Oracle. Let’s see how well the AI can handle this challenge.
So, the source syntax is the following:
CREATE TABLE dept (
deptno decimal(2, 0) DEFAULT NULL COMMENT 'Department number',
dname varchar(14) DEFAULT NULL COMMENT 'Department name',
loc varchar(13) DEFAULT NULL COMMENT 'Department location'
)
COMMENT = 'Company departments';
ALTER TABLE dept
ADD UNIQUE INDEX UK_dept_deptno (deptno);
– Create table emp
CREATE TABLE emp (
empno decimal(4, 0) NOT NULL COMMENT 'Employee identification number',
ename varchar(10) DEFAULT NULL COMMENT 'Last name',
job varchar(9) DEFAULT NULL COMMENT 'Position',
mgr decimal(4, 0) DEFAULT NULL COMMENT 'Manager identification number',
hiredate date DEFAULT NULL COMMENT 'Hire date',
sal decimal(7, 2) DEFAULT NULL COMMENT 'Salary',
comm decimal(7, 2) DEFAULT NULL COMMENT 'Comment',
deptno decimal(2, 0) DEFAULT NULL COMMENT 'Department number'
)
COMMENT = 'Employees';
ALTER TABLE emp
ADD UNIQUE INDEX UK_emp_empno (empno);
– Create foreign key
ALTER TABLE emp
ADD CONSTRAINT FK_emp_deptno FOREIGN KEY (deptno)
REFERENCES dept (deptno) ON DELETE NO ACTION;
INSERT INTO dept VALUES ('10','ACCOUNTING','NEW YORK');
INSERT INTO dept VALUES ('20','RESEARCH','DALLAS');
INSERT INTO dept VALUES ('30','SALES','CHICAGO');
INSERT INTO dept VALUES ('40','OPERATIONS','BOSTON');
INSERT INTO emp VALUES ('7369','SMITH','CLERK','7902','1980-12-17','800.00',NULL,'20');
INSERT INTO emp VALUES ('7499','ALLEN','SALESMAN','7698','1981-02-20','1600.00','300.00','30');
INSERT INTO emp VALUES ('7521','WARD','SALESMAN','7698','1981-02-22','1250.00','500.00','30');
INSERT INTO emp VALUES ('7566','JONES','MANAGER','7839','1981-04-02','2975.00',NULL,'20');
INSERT INTO emp VALUES ('7654','MARTIN','SALESMAN','7698','1981-09-28','1250.00','1400.00','30');
INSERT INTO emp VALUES ('7698','BLAKE','MANAGER','7839','1981-05-01','2850.00',NULL,'30');
INSERT INTO emp VALUES ('7782','CLARK','MANAGER','7839','1981-06-09','2450.00',NULL,'10');
INSERT INTO emp VALUES ('7788','SCOTT','ANALYST','7566','1982-12-09','3000.00',NULL,'20');
INSERT INTO emp VALUES ('7839','KING','PRESIDENT',NULL,'1981-11-17','5000.00',NULL,'10');
INSERT INTO emp VALUES ('7844','TURNER','SALESMAN','7698','1981-09-08','1500.00','0.00','30');
INSERT INTO emp VALUES ('7876','ADAMS','CLERK','7788','1983-01-12','1100.00',NULL,'20');
INSERT INTO emp VALUES ('7900','JAMES','CLERK','7698','1981-12-03','950.00',NULL,'30');
INSERT INTO emp VALUES ('7902','FORD','ANALYST','7566','1981-12-03','3000.00',NULL,'20');
INSERT INTO emp VALUES ('7934','MILLER','CLERK','7782','1982-01-23','1300.00',NULL,'10');
ALTER TABLE emp
ADD CONSTRAINT FK_emp_mgr FOREIGN KEY (mgr)
REFERENCES emp (empno) ON DELETE NO ACTION;
Let’s ask ChatGPT-4 for help.
That’s the result we got.
Please note that ChatGPT-4 has made the following conversions: decimal(4,0) to NUMBER(4,0), varchar(10) to varchar2(10), ‘1981-02-20’ to TO_DATE(‘1981-02-20’, ‘YYYY-MM-DD’), and ALTER TABLE emp ADD UNIQUE INDEX UK_emp_empno(empno) to CREATE UNIQUE INDEX UK_EMP_EMPNO ON EMP(EMPNO). However, it should be noted that ChatGPT-4 did not restore comments to the table and columns, even though Oracle has a special COMMENT statement.
It’s important to mention that when using ChatGPT-4 to convert MySQL scripts to Oracle, it may not always produce the correct result on the first try. To achieve the desired output, you may need to provide additional guidance such as specifying the Oracle version or which constraints to use.
Run the generated query with dbForge Studio for Oracle
Let us now validate the query using dbForge Studio for Oracle – a comprehensive solution for developing, managing, and maintaining Oracle databases. To get a deeper understanding of the capabilities and benefits of dbForge Studio for Oracle, take a look at our recent release notes blog post. Whether you’re a developer, DBA, or data analyst, dbForge Studio for Oracle can help you streamline your workflow and optimize your Oracle database management.
The query has been executed successfully, as evidenced by the results.
Generate query solutions
Let’s now assess ChatGPT-4’s SQL proficiency by using that Oracle script to task the AI with writing SQL queries. Afterward, we will verify the accuracy and effectiveness of the provided queries by utilizing dbForge Studio for Oracle to ensure that they return the desired result.
Task 1: Return a list of middle managers
Suppose that we need to retrieve a list of mid-level employees from the database. A mid-level employee is an employee who has subordinates and is himself a subordinate. Let’s ask ChatGPT-4 to provide two variants of the query for this. The following is what it suggested.
Let us now run those queries in dbForge Studio for Oracle to see if they are valid.
Both queries have been executed successfully and returned the same result. This is truly impressive! Let’s proceed to further explore ChatGPT-4’s capabilities.
Task 2: Return the id of managers who have more than two employees
Now, let us ask ChatGPT-4 to write two variants of a query to return the identification number of managers who have more than two employees. Below is what we got.
Great! Now, let’s execute the queries in dbForge Studio for Oracle to verify if they produce the desired results, just like we did for the previous task. This will help us ensure that the queries are functional and reliable.
And again the result is perfect. The queries have been executed successfully and returned the output we expected. Shall we move forward?
Task 3: Return the department with the highest average salary
Suppose we want to find out which department has the highest average salary. Let’s request the AI to generate an SQL query that can retrieve the name of the branch with the highest average salary and the amount of that salary. As before, we will ask the AI to provide two different solutions to this task, which will help us explore different approaches to the problem. Below are the queries that the AI has created for us.
Shall we validate the queries? Let’s execute them in dbForge Studio for Oracle to verify their accuracy.
Task 4: Return the name of the branch’s head, the location of the branch, and the total salary for the branch
Let’s ask ChatGPT-4 to create an SQL query to retrieve the name of the head of a branch, the location of the branch, and the total salary of their direct subordinates. In other words, we want the output to display the name of the head, the location of the branch, and the total salary for the branch. For this task, the head is defined as an employee who reports directly to the president of the company.
Once again, let’s check the queries the AI has created in dbForge Studio, to make sure they are working properly.
Analyze query performance
Having checked ChatGPT-4’s query-building powers, let’s now test its query optimization capabilities by asking the AI to determine which of the two queries it provided will be the fastest.
As you remember, in Task 1 we asked the AI to create the two variants of the query to return a list of middle managers. The following are the queries it created.
Task 1: Variant 1
SELECT DISTINCT e1.ENAME
FROM EMP e1
WHERE e1.MGR IS NOT NULL
AND EXISTS (
SELECT 1
FROM EMP e2
WHERE e1.EMPNO = e2.MGR
AND e2.EMPNO IS NOT NULL
)
ORDER BY e1.ENAME;
Task 1: Variant 2
SELECT E1.ENAME
FROM EMP E1
JOIN EMP E2
ON E1.EMPNO = E2.MGR
WHERE E1.MGR IS NOT NULL
GROUP BY E1.EMPNO,
E1.ENAME
HAVING COUNT(*) > 0
ORDER BY e1.ENAME;
Let us ask ChatGPT-4 which query is more optimized.
In the second task, we asked the AI to create two variants of the query to return the id of managers who have more than two employees. Below are the queries it has written for us.
Task 2: Variant 1
SELECT e1.EMPNO
FROM EMP e1
WHERE e1.EMPNO IN (
SELECT e2.MGR
FROM EMP e2
WHERE e2.MGR IS NOT NULL
GROUP BY e2.MGR
HAVING COUNT(*) > 2
)
AND e1.MGR IS NOT NULL
ORDER BY e1.EMPNO;
Task 2: Variant 2
SELECT DISTINCT e1.EMPNO
FROM EMP e1
JOIN (
SELECT e2.MGR, COUNT(*) AS NUM_EMPLOYEES
FROM EMP e2
WHERE e2.MGR IS NOT NULL
GROUP BY e2.MGR
) mgr_counts ON e1.EMPNO = mgr_counts.MGR
WHERE e1.MGR IS NOT NULL
AND mgr_counts.NUM_EMPLOYEES > 2
ORDER BY e1.EMPNO;
Let us ask ChatGPT-4 which query is the fastest.
In Task 3 we wanted two solutions to the query that will return the department with the highest average salary and the amount of that salary. Below are the queries that the AI has provided us with.
Task 3: Variant 1
SELECT d.DNAME AS branch_name, AVG(e.SAL) AS avg_salary
FROM DEPT d
JOIN EMP e ON d.DEPTNO = e.DEPTNO
GROUP BY d.DNAME
HAVING AVG(e.SAL) = (
SELECT MAX(avg_salaries)
FROM (
SELECT AVG(SAL) AS avg_salaries
FROM EMP
GROUP BY DEPTNO
)
)
ORDER BY avg_salary DESC
FETCH FIRST 1 ROW ONLY;
Task 3: Variant 2
SELECT d.DNAME AS branch_name, AVG(e.SAL) AS avg_salary
FROM DEPT d
JOIN EMP e ON d.DEPTNO = e.DEPTNO
WHERE d.DEPTNO IN (
SELECT DEPTNO
FROM (
SELECT DEPTNO, AVG(SAL) AS avg_salary
FROM EMP
GROUP BY DEPTNO
ORDER BY AVG(SAL) DESC
)
WHERE ROWNUM = 1
)
GROUP BY d.DNAME;
And again, let’s ask ChatPGT-4 which query is likely to be executed faster.
In Task 4, we asked for two variants of the query that will return the name of the branch’s head, the location of the branch, and the total salary for the branch. The following is what we got.
Task 4: Variant 1
SELECT HEAD.ENAME AS HEAD_NAME,
D.LOC AS DEPARTMENT_LOCATION,
SUM(S.SAL) AS TOTAL_SALARY
FROM EMP HEAD,
DEPT D,
EMP S
WHERE HEAD.DEPTNO = D.DEPTNO
AND S.MGR = HEAD.EMPNO
AND HEAD.MGR = (SELECT EMPNO
FROM EMP
WHERE JOB = 'PRESIDENT')
GROUP BY HEAD.ENAME,
D.LOC;
Task 4: Variant 2
SELECT E.ENAME AS HEAD_NAME,
D.LOC AS DEPARTMENT_LOCATION,
SUM(S.SAL) AS TOTAL_SALARY
FROM EMP E
JOIN DEPT D
ON E.DEPTNO = D.DEPTNO
JOIN EMP S
ON S.MGR = E.EMPNO
WHERE E.MGR = (SELECT EMPNO
FROM EMP
WHERE JOB = 'PRESIDENT')
GROUP BY E.ENAME,
D.LOC;
Shall we ask the AI which one will perform better?
As you can see, ChatGPT-4 can assist database specialists by providing information about query performance and optimization. However, why don’t we check if it was correct in its predictions?
Check the AI-generated queries with Query Profiler
dbForge Studio for Oracle is equipped with a powerful query profiler that can help PL/SQL developers and data analysts optimize their SQL queries. By using this tool, they can easily identify the bottlenecks and improve the performance of their queries. With the help of this tool, we will now verify the answers generated by ChatGPT-4.
Task 1: Return a list of middle managers
Let us run the two queries in dbForge Studio for Oracle in the Query Profiling mode and compare the results.
As we can see from the results obtained using dbForge Studio, the first query performs slightly better, which contradicts what the AI suggested. JOIN and EXISTS are two techniques used to combine data from multiple tables in Oracle. In a JOIN operation, the database searches for matching rows in related tables and returns a result set with all selected columns. JOINs can be expensive, particularly with large datasets. EXISTS queries, however, check for the existence of matching rows in a subquery, and only need to verify the presence of matching rows. It is crucial to consider the specific use case and the data volume to make an informed decision on query performance. By analyzing JOIN and EXISTS row metrics in query profiling, you can determine which query is more costly and optimize it accordingly.
Task 2: Return the id of managers who have more than two employees
Let us use dbForge Studio to check the performance of the next two queries.
And again, when we run the queries through dbForge Studio’s query profiler, we find that the actual results differ from the AI’s predictions. The profiler provides us with detailed performance metrics, such as execution time and resource utilization. You may notice that the use of certain indexes or table structures may have contributed to the differences in performance between the queries.
Task 3: Return the department with the highest average salary
This time the query profiling results align with the AI’s predictions.
Task 4: Return the name of the branch’s head, the location of the branch, and the total salary for the branch
And again the results we obtained with dbForge Studio for Oracle complied with what the AI has told us.
Query performance is heavily influenced by the amount of data, as well as the use of foreign keys and indexes, table structure, etc. In this regard, query profiling is a reliable tool that provides accurate information about query execution times. And while AI predictions can be useful, they may not always be accurate due to a lack of context. Therefore, it is recommended to use performance tuning techniques and Oracle Explain Plans.
If you are interested in learning more about the application of AI in the database field, we recommend reading our articles on How to Use ChatGPT to Write SQL JOIN Queries and How ChatGPT Can Help You Retrieve MySQL Data. They will provide you with a deeper understanding of how AI can be leveraged to improve query performance and streamline the query writing process. Check them out to learn more!
Conclusion
In conclusion, we can say that ChatGPT-4 has proven to be a valuable asset in the field of database development and management, particularly in generating SQL queries. However, it is essential to understand that AI should be used in conjunction with human operators and specialized database tools such as query profilers, for example, to achieve the most reliable and accurate results. By leveraging AI’s capabilities and utilizing the right tools, developers and analysts can optimize their queries and improve their overall performance, thus enhancing their productivity and delivering more efficient solutions.
You can download a free 30-day trial of dbForge Studio for Oracle at our website and see for yourself how it can help you improve the performance of your Oracle queries.