
MySQL JSON fragmentation
Posted by: Sharvanath Pathak
Date: January 12, 2023 07:47PM

We are seeing a lot of storage fragmentation when repeatedly appending to the same JSON column. From the documentation it's not clear whether this is expected behavior, so we would like to understand what is going wrong here.

The example to reproduce is fairly simple:

CREATE TABLE IF NOT EXISTS Documents (
id VARCHAR(255) PRIMARY KEY,
custom_data JSON);

Then run the following query many times (say 1,000 times), each time with a different JSON key:

INSERT INTO Documents (id, custom_data)
VALUES ('id', JSON_OBJECT('random_key', 'val'))
ON DUPLICATE KEY UPDATE
  custom_data = IF(custom_data IS NOT NULL,
                   JSON_SET(custom_data, CONCAT('$.random_key'), 'val'),
                   JSON_OBJECT('random_key', 'val')),
  id = 'id';
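
In case it helps to reproduce, a throwaway stored procedure along these lines drives the same upsert in a loop with a distinct key on each iteration (the procedure name repro_json_growth and the key names key_0 ... key_999 are just placeholders we made up for this sketch):

DELIMITER //
CREATE PROCEDURE repro_json_growth()
BEGIN
  DECLARE i INT DEFAULT 0;
  WHILE i < 1000 DO
    -- Same upsert as above, but with a distinct JSON key per iteration
    INSERT INTO Documents (id, custom_data)
    VALUES ('id', JSON_OBJECT(CONCAT('key_', i), 'val'))
    ON DUPLICATE KEY UPDATE
      custom_data = IF(custom_data IS NOT NULL,
                       JSON_SET(custom_data, CONCAT('$.key_', i), 'val'),
                       JSON_OBJECT(CONCAT('key_', i), 'val'));
    SET i = i + 1;
  END WHILE;
END //
DELIMITER ;

CALL repro_json_growth();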

After those queries, the table ends up using about 14 MB:
mysql> select * from INFORMATION_SCHEMA.INNODB_TABLESPACES where NAME="document/Documents" limit 1;
+-------+--------------------+-------+------------+-----------+---------------+------------+---------------+-----------+----------------+-----------------+----------------+---------------+------------+--------+
| SPACE | NAME               | FLAG  | ROW_FORMAT | PAGE_SIZE | ZIP_PAGE_SIZE | SPACE_TYPE | FS_BLOCK_SIZE | FILE_SIZE | ALLOCATED_SIZE | AUTOEXTEND_SIZE | SERVER_VERSION | SPACE_VERSION | ENCRYPTION | STATE  |
+-------+--------------------+-------+------------+-----------+---------------+------------+---------------+-----------+----------------+-----------------+----------------+---------------+------------+--------+
|    17 | document/documents | 16417 | Dynamic    |     16384 |             0 | Single     |          4096 |  14680064 |       14680064 |               0 | 8.0.31         |             1 | N          | normal |
+-------+--------------------+-------+------------+-----------+---------------+------------+---------------+-----------+----------------+-----------------+----------------+---------------+------------+--------+
1 row in set (0.01 sec)

while the actual data is fairly small (around 325 KB):
mysql> select OCTET_LENGTH(custom_data) from Documents;
+---------------------------+
| OCTET_LENGTH(custom_data) |
+---------------------------+
|                    333020 |
+---------------------------+
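
The same gap also shows up in INFORMATION_SCHEMA.TABLES; DATA_FREE is space InnoDB has allocated but is not currently using (assuming the schema is named document, matching the tablespace name above):

SELECT TABLE_NAME,
       DATA_LENGTH,   -- bytes used by the clustered index (the row data)
       INDEX_LENGTH,  -- bytes used by secondary indexes
       DATA_FREE      -- allocated but currently unused bytes
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'document' AND TABLE_NAME = 'Documents';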

Running OPTIMIZE TABLE also reduces the size dramatically:

mysql> optimize table Documents;
+--------------------+----------+----------+-------------------------------------------------------------------+
| Table              | Op       | Msg_type | Msg_text                                                          |
+--------------------+----------+----------+-------------------------------------------------------------------+
| document.documents | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| document.documents | optimize | status   | OK                                                                 |
+--------------------+----------+----------+-------------------------------------------------------------------+
2 rows in set (0.06 sec)

mysql> select * from INFORMATION_SCHEMA.INNODB_TABLESPACES where NAME="document/Documents" limit 1;
+-------+--------------------+-------+------------+-----------+---------------+------------+---------------+-----------+----------------+-----------------+----------------+---------------+------------+--------+
| SPACE | NAME               | FLAG  | ROW_FORMAT | PAGE_SIZE | ZIP_PAGE_SIZE | SPACE_TYPE | FS_BLOCK_SIZE | FILE_SIZE | ALLOCATED_SIZE | AUTOEXTEND_SIZE | SERVER_VERSION | SPACE_VERSION | ENCRYPTION | STATE  |
+-------+--------------------+-------+------------+-----------+---------------+------------+---------------+-----------+----------------+-----------------+----------------+---------------+------------+--------+
|    18 | document/documents | 16417 | Dynamic    |     16384 |             0 | Single     |          4096 |    475136 |         475136 |               0 | 8.0.31         |             1 | N          | normal |
+-------+--------------------+-------+------------+-----------+---------------+------------+---------------+-----------+----------------+-----------------+----------------+---------------+------------+--------+
1 row in set (0.01 sec)
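
Given the "doing recreate + analyze instead" note above, we assume an explicit null rebuild would have roughly the same effect as OPTIMIZE TABLE here:

-- Rebuilds the table (and its tablespace), then refreshes statistics
ALTER TABLE Documents ENGINE=InnoDB;
ANALYZE TABLE Documents;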


Thanks in advance for any help.
