The MERGE statement first appeared as Feature F312 in the SQL:2003 ANSI/ISO standard. MERGE is useful in situations where one has to insert a set of rows but when cleaning or other ETL (Extract-Transform-Load) processing is required. MERGE permits one to implement quite sophisticated logic to process a set of tuples to be loaded in a single statement.
In this article, I’d like to briefly outline the functionality of the MERGE statement, how it has changed between the SQL:2003 and SQL:2008 standards, and the extensions to MERGE that are supported in SQL Anywhere 11. To do this, I’ll extensively borrow from a presentation on the MERGE statement and materialized views that my colleague Anil Goel presented at the Sybase Techwave conference in Las Vegas last August.
Example
When processing a set of input rows to be inserted into a base table, it is sometimes necessary to perform “INSERT-else-UPDATE” processing, which is precisely the main intent of the MERGE statement. In addition, it would be beneficial to provide:
- Provide alternate actions for
INSERT; - Conditionally skip the
INSERTand/or theUPDATEfor specific rows; DELETEan existing row; and- Permit arbitrary conditions to be specified for alternate actions.
Suppose we have an existing, non-empty table T and wish to process a set of input rows S as follows:
1 2 3 4 5 6 7 8 9 10 11 | FOR each input row in S DO IF EXISTING THEN // row of S matches an existing row in table T IF condition-1 THEN SKIP ELSE IF condition-2 THEN ERROR ELSE IF condition-3 THEN DELETE FROM T ELSE UPDATE T ELSE IF condition-4 THEN SKIP ELSE IF condition-5 THEN ERROR ELSE INSERT INTO T END |
This entire piece of logic can be executed as a single MERGE statement as follows. The example below merges existing table rows from the base table T with rows from an input file on the client machine, which are accessed using Version 11’s OPENSTRING construction:
12 13 14 15 16 17 18 19 20 21 22 23 24 | MERGE INTO T USING (SELECT * FROM OPENSTRING( FILE '/usr/paulley/techwave/2008/data1.txt') WITH( id INT, qty INT, cmt LONG VARCHAR) S) S ON PRIMARY KEY WHEN NOT MATCHED AND S.cmt = 'ignore' THEN SKIP WHEN NOT MATCHED AND S.qty < 0 THEN RAISERROR WHEN NOT MATCHED AND S.qty = 0 THEN INSERT VALUES( S.id, 1 ) WHEN NOT MATCHED THEN INSERT VALUES( S.id, S.qty ) WHEN MATCHED AND S.cmt = 'ignore' THEN SKIP WHEN MATCHED AND S.cmt = 'new' THEN RAISERROR WHEN MATCHED AND T.qty + S.qty <= 0 THEN DELETE WHEN MATCHED THEN UPDATE SET T.qty = T.qty + S.qty |
MERGE semantics – briefly
The MERGE statement defined in the SQL:2003 standard permitted at most one WHEN MATCHED and at most one WHEN NOT MATCHED clause.
The semantics of SQL:2003 MERGE are as follows.
- For each row of target-object, determine if the
MERGEsearch-condition evaluates to TRUE for each row of the source-object, to find “matching” rows from the source and target. - Generate an error if more than one matching row is found.
- Add the single matching row, if found, to the set of rows to be processed by the
WHEN MATCHEDclause, if present; - The
WHEN NOT MATCHEDclause, if present, processes rows of source-object that fail to match any target-object row.
With Feature F313, SQL:2008 permits multiple WHEN [NOT] MATCHED clauses with optional search conditions. In this case, the order of the WHEN clauses matter: it is the merge action of the first WHEN clause that satisfies its search condition that is executed. However, even with Feature F313, the standard MERGE statement isn’t that flexible: the only operation a WHEN MATCHED clause can perform is an UPDATE, and the only operation a WHEN NOT MATCHED clause can perform is an INSERT.
With SQL Anywhere 11, we extended MERGE statement functionality in several ways, some of these inherited from the older INSERT ... ON EXISTING statement that was introduced to SQL Anywhere Version 8 in 2002:
- One can use
MERGEwith an input dataset from a client machine using theOPENSTRINGconstruction; - use an explicit column list to permit column reordering from input to target table;
- use of
WITH AUTO NAMEsyntax to do column matching by name; - the
MERGEsearch-condition can specifyPRIMARY KEYas a shorthand to generate a condition based on primary key columns; - several additional
WHEN MATCHEDactions:DELETEthe existing rowRAISERROR[ error-number ]SKIPthe input rowUPDATE [ DEFAULTS { ON | OFF }the existing row
- additional
WHEN NOT MATCHEDactions:INSERTthe input rowRAISERROR[ error-number ]SKIPthe input row
A detailed description of the semantics of the MERGE statement, along with the semantics of INSERT, DELETE, and UPDATE row-level triggers on the target table, can be found in the SQL Anywhere documentation.
Here is the BNF for the MERGE statement syntax supported by SQL Anywhere 11:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 | merge ::= "MERGE" "INTO" merge_into_tabterm merge_using simple_tabterm merge_on_clause merge_op_list merge_using ::= "USING" | "USING WITH AUTO NAME" merge_into_tabterm ::= ( tabref | "(" query_expr ")" as identifier | "(" "WITH" with_list query_expr ")" as identifier ) derived_collist merge_on_clause ::= "ON" merge_search_cond merge_search_cond ::= searchcond | "PRIMARY" "KEY" | "INDEX" indexname merge_op_list ::= merge_op { "," merge_op } merge_op ::= "WHEN MATCHED" "THEN" merge_matched_op | "WHEN MATCHED" "AND" searchcond "THEN" merge_matched_op | "WHEN NOT" "MATCHED" "THEN" merge_not_matched_op | "WHEN NOT" "MATCHED" "AND" searchcond "THEN" merge_not_matched_op merge_common_op ::= "SKIP" | "RAISERROR" | "RAISERROR" integer_or_var merge_matched_op ::= merge_common_op | "UPDATE SET" setlist | "UPDATE" | "UPDATE" "DEFAULTS" "ON" | "UPDATE" "DEFAULTS" "OFF" | "DELETE" merge_not_matched_op ::= merge_common_op | "INSERT" "VALUES" "(" ins_vallist ")" | "INSERT" "(" inscollist ")" "VALUES" "(" ins_vallist ")" | "INSERT" ins_vallist ::= { ins_valitem "," } ins_valitem ins_valitem ::= "DEFAULT" | expression inscollist ::= { inscolname "," } inscolname inscolname ::= identifier [ "." identifier ] |
Another useful extension that we are considering for a future release is to support outer joins between the source- and target-objects, a construction already supported by Microsoft SQL Server 2008.
My thanks to Anil Goel for putting together most of the above material for our presentation at Techwave 2008.

Glenn Paulley is a Director of Engineering at Sybase iAnywhere.

0 responses so far ↓
There are no comments yet...Kick things off by filling out the form below.
Leave a Comment
Note that all comments are currently being moderated until I have a better handle on spam, so your comment may not appear for a couple of hours