Storing data in databases inherently requires regular updates. This process involves adding new data and replacing outdated information. While numerous challenges are associated with updating data in databases, this article will concentrate on a specific aspect: optimization of the data updating process. We will explore the UPSERT option, which effectively combines the SQL UPDATE and INSERT statements and the application of the SQL UPSERT operation in MySQL databases.
Contents:
- What is UPSERT in MySQL?
- Methods of implementing UPSERT in MySQL
- MySQL UPSERT using INSERT IGNORE
- MySQL UPSERT using REPLACE
- MySQL UPSERT using INSERT ON DUPLICATE KEY UPDATE
- UPSERT in MariaDB
- Conclusion
What is UPSERT in MySQL?
The UPSERT feature in SQL is a hybrid of the UPDATE and INSERT statements. It enables merging the functionality of updating existing data and adding new records. When a record is already in the database, UPSERT triggers an UPDATE to modify it. If the record doesn’t exist, UPSERT performs an INSERT, adding a new record. This operation is atomic and completed in a single step.
SQL programmers often use UPSERT to prevent duplicating records while updating databases. This is especially useful in scenarios like data migration or importing data from external sources where duplicates might occur. For example, UPSERT adds new customers while updating existing ones with new information. This simplifies the process, eliminating the need for complex statements with IF clauses in stored procedures.
While UPSERT is an essential feature in relational database management systems, it’s not universally available as a distinct command. Oracle supports a direct UPSERT command, but MySQL doesn’t. However, MySQL and MariaDB offer UPSERT-like functionality through specific variations of the INSERT statement, which will be discussed later in this article.
Methods of implementing UPSERT in MySQL
The UPSERT operation in MySQL, which either inserts a new record or updates an existing one, can be executed using several methods:
- INSERT IGNORE is used for inserting data into a table. However, it is not the most efficient for UPSERT operations as it lacks the UPDATE component. Essentially, it bypasses the “duplicate key violation” error and continues to insert new data, leaving existing records unaltered.
- REPLACE INTO is another technique that first deletes an existing record with the same primary key and then inserts a new record in its place.
- INSERT…ON DUPLICATE KEY UPDATE is a syntax that comes into play when a table has a unique or primary key constraint. It works by inserting a new record or updating an existing one.
Next, we will delve into each of these methods in more detail.
MySQL UPSERT using INSERT IGNORE
INSERT IGNORE is a modification of the MySQL INSERT command designed to handle situations where duplicate entries could occur in a unique primary key or index. Unlike the standard INSERT command, which would result in an error and halt the process, INSERT IGNORE proceeds with the insertion process without throwing an error or stopping. Instead, it simply skips the duplicate record and continues inserting new records.
This variant is useful in scenarios involving bulk insert operations. In such cases, the data may include records that violate uniqueness constraints.
INSERT IGNORE serves to prevent the entire operation from being interrupted. Instead of throwing an error, it will issue a warning and continue with the data update process. However, it’s important to use this option with care because ignoring errors can sometimes lead to unexpected results or data inconsistencies.
The syntax of the INSERT IGNORE statement in MySQL is:
INSERT IGNORE INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...);
Parameters:
- table_name – the name of the table where we insert data into
- column1, column2, … – the list of columns in the table where we insert data
- VALUES (value1, value2, …) – the values that should be inserted into the corresponding columns. The number of values must match the number of columns.
Let’s have a look at the example. We will showcase our examples using dbForge Studio for MySQL. This IDE is a robust and flexible tool for MySQL and MariaDB with comprehensive SQL coding support. It comes equipped with a variety of features, including code auto-completion, syntax validation, code debugging, formatting, refactoring, a code snippets library, and much more. With dbForge Studio for MySQL, you can efficiently handle all aspects of database development, management, and administration.
Assume that we want to insert new customers into the corresponding table of the sakila test database:
We execute the following SQL query:
INSERT IGNORE INTO
sakila.new_customer (customer_id, store_id, first_name, last_name, email)
VALUES
(8, 2, 'Susan', 'Wilson', '[email protected]'),
(16, 1, 'Jane', 'Harrison', '[email protected]'),
(17, 2, 'Bob', 'Johnson', '[email protected]');
The first record with customer_id 8 already exists in the table. Other records with unique customer_id values are new entries. The query will attempt to insert all records, but the first record already exists in the table. By using INSERT IGNORE, we make the system ignore that record and continue to insert the new records. As you can see, it only added two new records. Also, the Studio notified us about warnings.
MySQL UPSERT using REPLACE
Frequently, when updating data, we need to update existing records. Using the standard INSERT command can lead to an error, specifically the “Duplicate entry for PRIMARY KEY” error. To address this problem, we can employ the REPLACE command.
When there are no duplicate records or primary key conflicts, REPLACE acts like an INSERT. However, if there are duplicate records, REPLACE first deletes the existing record and then proceeds with the INSERT operation for that particular row.
In essence, the MySQL REPLACE command effectively combines the functionalities of both DELETE and INSERT.
The syntax of the REPLACE statement in MySQL is:
REPLACE INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...);
Note that REPLACE is typically used when we have a PRIMARY KEY or UNIQUE constraint on one or more columns, as it relies on these constraints to determine if a record already exists.
Have a look at the example below. We are updating the list of customers by adding some new entries and altering some of the existing records:
REPLACE INTO
new_customer (customer_id, store_id, first_name, last_name, email)
VALUES
(5, 1, 'Elizabeth', 'Brown', '[email protected]'),
(6, 2, 'Jennifer', 'Davis', '[email protected]'),
(11, 1, 'Jane', 'Harrison', '[email protected]'),
(12, 2, 'Barbara', 'Johnson', '[email protected]');
Now you can see two new records added to the table and updated data for the two existing customers. The system deleted the previously found matching records and inserted the values specified in the query instead.
MySQL UPSERT using INSERT ON DUPLICATE KEY UPDATE
After examining the INSERT IGNORE and REPLACE statements in MySQL, you may have observed limitations associated with each method. INSERT IGNORE merely overlooks duplicate errors without making any changes to the existing data. On the other hand, REPLACE identifies INSERT errors but takes the step of deleting the existing row before adding the new record. A more efficient solution is required.
MySQL provides the INSERT … ON DUPLICATE KEY UPDATE statement as a non-destructive alternative that inserts new entries without removing already present and matching records. When it identifies an existing record that matches the UNIQUE or PRIMARY KEY value, this method initiates an UPDATE operation on that already present record. Thus, INSERT ON DUPLICATE KEY UPDATE functions as an effective UPSERT operation.
The syntax of the INSERT ON DUPLICATE KEY UPDATE statement in MySQL is:
INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...)
ON DUPLICATE KEY UPDATE
column1 = value1,
column2 = value2,
...;
Note the following parameters in this syntax:
ON DUPLICATE KEY UPDATE clause indicates that if a duplicate key conflict occurs (if a row with the same unique or primary key value already exists), the query should perform an update instead of inserting a new row.
column1 = value1, column2 = value2, … are located inside the ON DUPLICATE KEY UPDATE block and define the columns to be updated and the new values that should be set for those columns. Each column and value pair must be separated by a comma.
In the below example, we want to update multiple customers in the new_customer table by changing some details for the existing customers (the active or non-active status) and adding new customers.
INSERT INTO
new_customer (customer_id, store_id, first_name, last_name, email, active, last_update)
VALUES
(2, 1, 'Patricia', 'Johnson', '[email protected]', 0, NOW()),
(5, 1, 'Elizabeth', 'Brown', '[email protected]', 0, NOW()),
(3, 1, 'Linda', 'Williams', '[email protected]', 0, NOW()),
(6, 1, 'Jane', 'Harrison', '[email protected]', 1, NOW()),
(7, 2, 'MARGARET', 'MOORE', '[email protected]', 1, NOW()),
(8, 1, 'DOROTHY', 'TAYLOR', '[email protected]', 1, NOW()),
(9, 2, 'Barbara', 'Johnson', '[email protected]', 1, NOW()),
(10, 1, 'SUSAN', 'WILSON', '[email protected]', 1, NOW())
ON DUPLICATE KEY UPDATE
first_name = VALUES(first_name),
last_name = VALUES(last_name),
email = VALUES(email),
active = VALUES(active),
last_update = VALUES(last_update);
Have a look at the output:
The new_customer table looks as follows:
UPSERT in MariaDB
In MariaDB, the UPSERT operation is also possible through the INSERT … ON DUPLICATE KEY UPDATE method, which is common for both MySQL and MariaDB.
This method enables you to add new rows to a table. In case a table already contains a row with a conflicting unique key, this statement will update that existing row with new values. The syntax of the statement is the same as shown earlier for MySQL.
However, INSERT … ON DUPLICATE KEY UPDATE has some performance specificity in MariaDB that must be considered. When updating some table, and the data includes several rows matching the unique keys, only the first row will be updated. Other rows will remain the same. That’s why it is recommended to use this option carefully in MariaDB, as it may be unsafe for data updating.
Conclusion
As previously mentioned, while MySQL doesn’t have a dedicated UPSERT command, you can still achieve this functionality through various methods, ensuring smooth and error-free database updates. MySQL provides three approaches: INSERT IGNORE, REPLACE, and INSERT … ON DUPLICATE KEY UPDATE. The choice between these methods depends on your specific work requirements.
When working with dbForge Studio for MySQL, you can efficiently create the necessary queries for data updates, achieving faster and safer results thanks to the Studio’s extensive feature set. Beyond offering coding assistance, it provides dedicated tools for data migration, too. The Studio stands out as one of the most robust IDEs for both MySQL and MariaDB, simplifying the work of all database specialists.
To experience the capabilities of dbForge Studio for MySQL, take advantage of the fully functional free trial, which grants access to all available features for 30 days. This trial period allows you to thoroughly test the solution under a full workload.