Each binary logging format has advantages and disadvantages. For most users, the mixed-based replication format should be fine and should provide the best combination of data integrity and performance. If, however, you want to take advantage of the differences in the replication format when performing specific updates or large data inserts, then the information in this section summarizes the advantages and disadvantages of the row and statement based formats.
Advantages of statement-based replication:
Proven technology that has existed in MySQL since 3.23.
Smaller log files. When updates or deletes affect many rows, much smaller log files. Smaller log files require less storage space and are faster to back up.
Log files contain all statements that made any changes, so they can be used to audit the database.
Log files can be used for point-in-time recovery, not just for replication purposes. See Section 5.9.3, “Point-in-Time Recovery”.
You can use a slave with a higher version than that used on the master, even when there is a different row structure in the table. This can be useful if you are unable to upgrade the master but want to take advantage of features in a recent slave version, perhaps for testing and evaluation purposes.
Disadvantages of statement-based replication:
Not all UPDATE statements can be
replicated: Any non-deterministic behavior (for example,
when using random functions in an SQL statement) is hard to
replicate when using statement-based replication. For
statements that use a non-deterministic user-defined
function (UDF), it is not possible to replicate the result
using statement-based replication, whereas row-based
replication will just replicate the value returned by the
UDF.
Statements cannot be replicated properly if they use a UDF that is non-deterministic (its value depends on other factors than the given parameters).
Statements that use one of the following functions cannot be replicated properly:
LOAD_FILE()
UUID()
USER()
FOUND_ROWS()
SYSDATE() (unless the server is
started with the --sysdate-is-now
option)
All other functions are replicated correctly (including
RAND(), NOW(),
LOAD DATA INFILE, and so forth).
INSERT ... SELECT requires a greater
number of row-level locks than with row-based replication.
UPDATE statements that require a table
scan (because no index is used in the
WHERE clause) must lock a greater number
of rows than with row-based replication.
For InnoDB: An INSERT
statement that uses AUTO_INCREMENT blocks
other non-conflicting INSERT statements.
For complex queries, the statement must be evaluated and executed on the slave before the rows are updated or inserted. With row-based replication, the slave only has to run the statement to apply the differences, not the full query.
Stored functions (not stored procedures) will execute with
the same NOW() value as the calling
statement. (This may be regarded both as a bad thing and a
good thing.)
Deterministic UDFs must be applied on the slaves.
If there is an error in evaluation on the slave, particularly when executing complex queries, then using statement based replication may slowly increase the margin of error across the affected rows over time.
Tables have to be (almost) identical on master and slave.
Advantages of row-based replication:
Everything can be replicated. This is the safest form of replication.
For MySQL versions earlier than 5.1.14, DDL (data definition
language) statements such as CREATE TABLE
are replicated using statement-based replication, while DML
(data manipulation language) statements, as well as
GRANT and REVOKE
statements, are replicated using row-based-replication.
For MySQL 5.1.14 and later, the mysql
database is not replicated. The mysql
database is instead seen as a node specific database.
Row-based replication is not supported on this table.
Instead, statements that would normally update this
information (including GRANT,
REVOKE and the manipulation of triggers,
stored routines/procedures and views are all replicated to
slaves using Statement based replication.
For statements like CREATE ... SELECT, a
CREATE statement is generated from the
table definition and replicated statement-based, while the
row insertions are replicated row-based.
The technology is the same as most other database management systems; knowledge about other systems transfers to MySQL.
In many cases, it is faster to apply data on the slave for tables that have primary keys.
Fewer locks needed (and thus higher concurrency) on the master for the following types of statements:
INSERT ... SELECT
INSERT statements with
AUTO_INCREMENT
UPDATE or DELETE
statements with WHERE clauses that
don't use keys or don't change most of the examined
rows.
Fewer locks on the slave for any INSERT,
UPDATE, or DELETE
statement.
It's possible to add multiple threads to apply data on the slave in the future (works better on SMP machines).
Disadvantages of row-based replication:
Larger log files (much larger in some cases).
Binary log will contain data for large statements that were rolled back.
When using row-based replication to replicate a statement
(for example, an UPDATE or
DELETE statement), each changed row must
be written to the binary log. In contrast, when using
statement-based replication, only the statement is written
to the binary log. If the statement changes many rows,
row-based replication may write significantly more data to
the binary log. In these cases the binary log will be locked
for a longer time to write the data, which may cause
concurrency problems.
Deterministic UDFs that generate large
BLOB values will be notably slower to
replicate.
You cannot examine the logs to see what statements were executed.
You cannot see on the slave what statements were received from the master and executed.
When making a bulk operation that includes non-transactional storage engines, changes are applied as the statement executes. With row-based replication logging, this means that the binary log is written while the statement is running. On the master, this doesn't provide any problems with concurrency, because tables are locked until the bulk operation terminates. On the slave server, however, tables aren't locked while the slave applies changes, because it doesn't know that those changes are part of a bulk operation.
In that scenario, if you retrieve data from a table on the
master (for example, SELECT * FROM
table_name), the server will wait for the bulk
operation to complete before executing the
SELECT statement, because the table is
read-locked. On the slave, the server won't wait (because
there is no lock). This means that until the “bulk
operation” on the slave has completed you will get
different results for the same SELECT
query on the master and on the slave.
This behavior will eventually change, but until it does, you should probably use statement-based replication in a scenario like this.

User Comments
Add your own comment.