
Use pt-table-checksum to check your tables: User Guide

Overview

An issue that arises occasionally is the need to check that the data on a master table is consistent with the data on its replicas. pt-table-checksum is a tool that executes checksum queries on the master and on the replica tables, efficiently detecting any inconsistency between them.

Limitations

Although pt-table-checksum is a useful tool, it does come with some limitations. One of them is caused by the tool’s presumption that databases and tables are the same on the master and all its replicas, so the checksum replication will fail if, for example, the table structure differs between the master and a replica.

Another issue is that this tool requires “binlog_format = STATEMENT” on the master (note that MySQL 5.7.7 and later default to ROW), and MySQL has a limitation that can cause the checksum queries not to replicate further when an intermediate replica is itself a master for other replicas. So it could fail on a row-based replication chain such as:

master -> replica1 -> replica2

Still, by passing the “--no-check-binlog-format” flag, this is not an issue if you have a common setup such as:

master -> replica

Keep in mind that, although there are situations in which the checksum script shows that the data is the same between master and slave, that data might still differ due to a hash collision (when the hashing algorithm that produces the checksum generates the same hash value for two different pieces of data). This is highly unlikely, but possible.


In production

The pt-table-checksum script should be used carefully in a production environment. Run it outside of peak hours, when the system is quiet or out of use, preferably during a maintenance window. To be safe, you can also limit the amount of data scanned in a single execution. The key is to build a robust workflow that checks the whole table over several runs.

There are some parameters that could help decrease the amount of rows to be scanned in a single execution:

--where

Do only rows matching this WHERE clause. You can use this option to limit the checksum to only part of the table. This is particularly useful if you have append-only tables and don't want to constantly re-check all rows; you could run a daily job to just check yesterday's rows, for instance.

Example:

pt-table-checksum --where "updated_dtm >= '2021-03-18 00:00:00' and updated_dtm < '2021-03-19 00:00:00'"
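Building on the example above, here is a minimal Python sketch (the `updated_dtm` column comes from the example; everything else is illustrative) that generates one `--where` window per day, so a scheduled job can sweep an append-only table one day at a time:

```python
from datetime import date, timedelta

def daily_where(day):
    """Build a --where clause covering exactly one day of an
    append-only table, assuming an `updated_dtm` datetime column."""
    nxt = day + timedelta(days=1)
    return (f"updated_dtm >= '{day} 00:00:00' "
            f"and updated_dtm < '{nxt} 00:00:00'")

# One pt-table-checksum invocation per day of the sweep:
for day in (date(2021, 3, 18), date(2021, 3, 19)):
    print(f'pt-table-checksum --where "{daily_where(day)}"')
```

A cron job driving such a generator checks yesterday's rows every night, so the whole table is verified incrementally without one large scan.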
--tables

Checksum only this comma-separated list of tables. Table names may be qualified with the database name.

Example:

pt-table-checksum --tables 'transactions,users'

--tables-regex

Checksum only tables whose names match this Perl regex.

Example:

pt-table-checksum --tables-regex '^transactions_'

If you are looking for more information about the benefits, the disadvantages, and the details of pt-table-checksum in general, see the official Percona Toolkit documentation.

A simple case study

Depending on certain conditions, the server can dynamically choose between statement-based and row-based replication (this is what binlog_format = MIXED does). Among the conditions that influence this decision are:

  • Using a UDF (user-defined function) instead of a default one.
  • Using an INSERT command with the DELAYED clause.
  • Using temporary tables.
  • Using a statement that uses system variables.

Here is an example that creates a database “sync” with two tables, “ok” and “fail”, and then inserts some rows into them:

master [localhost:24028] {msandbox} ((none)) > create database sync;
Query OK, 1 row affected (0.000 sec)

master [localhost:24028] {msandbox} ((none)) > use sync
Database changed
master [localhost:24028] {msandbox} (sync) > create table ok(id integer auto_increment, value varchar(30), dtm datetime, primary key(id));
Query OK, 0 rows affected (0.056 sec)

master [localhost:24028] {msandbox} (sync) > create table fail(id integer auto_increment, value varchar(30), dtm datetime, primary key(id));
Query OK, 0 rows affected (0.048 sec)

#
# generate some data using the following statement in both tables: ok, fail
#
master [localhost:24028] {msandbox} (sync) > insert into ok(value, dtm) values( hex(rand()*0xFFFFFFFF), now());
Query OK, 1 row affected (0.011 sec)

Then, we are going to inject the following inconsistencies into the “fail” table on the replica:

  • delete a single row
  • update a single row
  • insert a new row

Since only the table “fail” was modified, it is safe to assume that the “ok” table has exactly the same data as the master, while the “fail” table is now different.

Here is a review of the content on the “fail” table:

master [localhost:24028] {msandbox} (sync) > select * from fail;
+----+----------+---------------------+
| id | value    | dtm                 |
+----+----------+---------------------+
|  1 | A15F0C16 | 2021-03-18 11:51:09 |
|  2 | 16C13FA0 | 2021-03-18 11:51:09 |
|  3 | 8DA91A62 | 2021-03-18 11:51:10 |
|  4 | 8009C58E | 2021-03-18 11:51:10 |
|  5 | D7358D23 | 2021-03-18 11:51:10 |
|  6 | B3EE718A | 2021-03-18 11:51:11 |
|  7 | FE0790CF | 2021-03-18 11:51:12 |
|  8 | DA5A7FF3 | 2021-03-18 11:51:12 |
+----+----------+---------------------+
8 rows in set (0.001 sec)


slave1 [localhost:24029] {msandbox} (sync) > select * from fail;
+----+----------+---------------------+
| id | value    | dtm                 |
+----+----------+---------------------+
|  1 | A15F0C16 | 2021-03-18 11:51:09 |
|  2 | 16C13FA0 | 2021-03-18 11:51:09 |
|  3 | 8DA91A62 | 2021-03-18 11:51:10 | ← the next row was deleted
|  5 | D7358D23 | 2021-03-18 11:51:10 |
|  6 | B3EE718A | 2021-03-18 11:51:11 |
|  7 | changed  | 2021-03-18 11:51:12 | ← the updated row
|  8 | DA5A7FF3 | 2021-03-18 11:51:12 |
|  9 | A44B677A | 2021-03-18 11:53:39 | ← the inserted row
+----+----------+---------------------+
8 rows in set (0.000 sec)

Now, let’s run pt-table-checksum on the master. For test purposes we set “binlog_format=ROW”:

>> pt-table-checksum --user=msandbox --password=msandbox --port 24028 --host=127.0.0.1 --databases="sync"
Checking if all tables can be checksummed ...
Starting checksum ...
Cannot connect to P=24030,h=node-3,p=...,u=msandbox
Cannot connect to P=24029,h=node-2,p=...,u=msandbox
Diffs cannot be detected because no slaves were found.  Please read the --recursion-method documentation for information.
CRC32 never needs BIT_XOR optimization at /usr/bin/pt-table-checksum line 6124.

Running pt-table-checksum produced an error: the tool auto-discovered the replicas but could not connect to them, so no slaves were found and diffs could not be detected.

There is a fix for that. Following the official documentation, we create the following DSN table (here, in a database named “percona”):

master [localhost:24028] {msandbox} (percona) > CREATE TABLE `dsns` (
    ->   `id` int(11) NOT NULL AUTO_INCREMENT,
    ->   `parent_id` int(11) DEFAULT NULL,
    ->   `dsn` varchar(255) NOT NULL,
    ->   PRIMARY KEY (`id`)
    -> );
Query OK, 0 rows affected (0.060 sec)

Make sure every replica has a DSN (Data Source Name) row, which gives pt-table-checksum the information it needs to discover them:

master [localhost:24028] {msandbox} (percona) > insert into dsns values(1, null, "h=127.0.0.1,P=24029,u=msandbox,p=msandbox");
Query OK, 1 row affected (0.008 sec)

master [localhost:24028] {msandbox} (percona) > insert into dsns values(2, null, "h=127.0.0.1,P=24030,u=msandbox,p=msandbox");
Query OK, 1 row affected (0.007 sec)
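The DSN strings above follow Percona's key=value format (h=host, P=port, u=user, p=password). A tiny Python helper (purely illustrative, not part of the toolkit) can build them consistently when you have many replicas:

```python
def make_dsn(host, port, user, password):
    """Build a Percona-style DSN string (h=host,P=port,u=user,p=password)
    as stored in the dsns table above."""
    return f"h={host},P={port},u={user},p={password}"

# DSN rows for the two replicas in this case study:
for port in (24029, 24030):
    print(make_dsn("127.0.0.1", port, "msandbox", "msandbox"))
```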

Let’s run pt-table-checksum again to see the changes:

>> pt-table-checksum --user=msandbox --password=msandbox --port 24028 --host=127.0.0.1 --databases="sync" --no-check-slave-tables --recursion-method='dsn=h=127.0.0.1,D=percona,t=dsns'
Checking if all tables can be checksummed ...
Starting checksum ...
Replica slave1 has binlog_format ROW which could cause pt-table-checksum to break replication.  Please read "Replicas using row-based replication" in the LIMITATIONS section of the tool's documentation.  If you understand the risks, specify --no-check-binlog-format to disable this check.
Replica slave2 has binlog_format MIXED which could cause pt-table-checksum to break replication.  Please read "Replicas using row-based replication" in the LIMITATIONS section of the tool's documentation.  If you understand the risks, specify --no-check-binlog-format to disable this check.

So we got another error. The utility says we should either change binlog_format or skip the check. Since we do not have a replication chain in this case study, we will skip the binlog_format check with “--no-check-binlog-format”. (On some versions of MariaDB you can get a CRC-related error, which you can work around by passing “--function MD5”.) We will also pass “--chunk-size=2” to reduce the chunk size, so we can narrow down where any inconsistency might be:

>> pt-table-checksum --user=msandbox --password=msandbox --port 24028 --host=127.0.0.1 --databases="sync" --no-check-slave-tables --recursion-method='dsn=h=127.0.0.1,D=percona,t=dsns'  --no-check-binlog-format --chunk-size=2
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
03-18T12:39:51      0      3        8       6       0   0.240 sync.fail
03-18T12:39:51      0      0        5       5       0   0.195 sync.ok

It is easy to notice that the “sync.ok” table (the table “ok” inside the “sync” database) does not show any issues, while the “sync.fail” table shows some differences. Next, let’s look at the content of the “percona.checksums” table, where the tool stores its execution results.

Analyzing the information from the “percona.checksums” table, we get the following:

slave1 [localhost:24029] {msandbox} ((none)) > select chunk, lower_boundary, upper_boundary, this_crc, master_crc, this_cnt, master_cnt from percona.checksums where tbl='fail' and (this_crc <> master_crc or this_cnt <> master_cnt)\G
*************************** 1. row ***************************
         chunk: 2
lower_boundary: 3
upper_boundary: 4
      this_crc: 6c6eee37756ea6e6e9581fe3c11218cf
    master_crc: 1c8de6ca86846fd0dbc3ca84de1e9d39
      this_cnt: 1
    master_cnt: 2
*************************** 2. row ***************************
         chunk: 4
lower_boundary: 7
upper_boundary: 8
      this_crc: 466afe5823613919b3be7a8058e97afb
    master_crc: b9e0bb85309d42c382bc4b458ea9eb7e
      this_cnt: 2
    master_cnt: 2
*************************** 3. row ***************************
         chunk: 6
lower_boundary: 8
upper_boundary: NULL
      this_crc: 0
    master_crc: 0
      this_cnt: 1
    master_cnt: 0
3 rows in set (0.000 sec)

With the previous query, we filtered the data using the following conditions:

  • this_cnt <> master_cnt

This selects chunks where the replica and the master have a different number of rows, i.e. rows were deleted or inserted.

  • this_crc <> master_crc

This selects chunks where the checksum values differ, i.e. some rows were indeed changed.

Let’s take a closer look at the output and review each chunk:

Chunk #2

We can see that the number of rows differs: while “master_cnt” shows 2 rows on the master, “this_cnt” shows only one, meaning that a row was deleted on the replica.

Chunk #4

The number of rows is the same on the master and the replica, but since “master_crc” and “this_crc” differ, we can conclude that a row was updated.

Chunk #6

Here, “master_cnt” is zero while “this_cnt” is one, so a row that does not exist on the master was added to the replica. The checksums are zero because this chunk does not exist on the master at all.
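The per-chunk reasoning above can be sketched as a small Python classifier (an illustration of the logic, not part of the toolkit), fed with the three differing chunks from the query output:

```python
def classify_chunk(this_cnt, master_cnt, this_crc, master_crc):
    """Mirror the manual analysis: compare a chunk's row count and
    checksum on the replica against the master's values."""
    if this_cnt < master_cnt:
        return "rows deleted on replica"
    if this_cnt > master_cnt:
        return "rows inserted on replica"
    if this_crc != master_crc:
        return "rows updated on replica"
    return "chunk consistent"

# The three chunks reported by percona.checksums (crc values shortened):
chunks = {
    2: (1, 2, "6c6eee37", "1c8de6ca"),
    4: (2, 2, "466afe58", "b9e0bb85"),
    6: (1, 0, "0", "0"),
}
for chunk_id, args in chunks.items():
    print(f"chunk {chunk_id}: {classify_chunk(*args)}")
```

Running this reproduces the three verdicts reached above: a deletion in chunk 2, an update in chunk 4, and an insertion in chunk 6.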

Conclusion

Since this case study was a simple one, it was easy to point out where the inconsistencies in the master-slave replication were. But in a real situation, where rows are constantly added and deleted, we might end up with many different inconsistencies. It is up to the database administrator to use tools like pt-table-checksum to understand what is going on in the system.

Still, we hope this tool makes it easier for you to check consistency between a master table and its replicas. It is a good idea to keep it under your belt.

We have plenty of other MySQL tips that can help you in other database administration aspects. Take a look around to make your life easier!
