Re: MySQL nested JSON column search and extract sub JSON from the nested JSON
Posted by: Alfredo Kojima
Date: December 01, 2017 01:13AM

Hi

If you're using MySQL 8, you can use RECURSIVE CTEs to "shred" the contents of your JSON document into rows, which you can then normally parse and filter.
Like this:

WITH RECURSIVE author_books (id, name, one_book, rest) AS
(
SELECT authors.id, authors.name, JSON_EXTRACT(authors.published_books, CONCAT('$.', JSON_EXTRACT(JSON_KEYS(authors.published_books), '$[0]'))), JSON_REMOVE(JSON_KEYS(authors.published_books), '$[0]')
FROM authors
UNION ALL
SELECT author_books.id, author_books.name, JSON_EXTRACT(published_books, CONCAT('$.', JSON_EXTRACT(rest, '$[0]'))), JSON_REMOVE(rest, '$[0]')
FROM author_books JOIN authors ON author_books.id = authors.id WHERE JSON_LENGTH(rest) > 0
)
SELECT * FROM author_books WHERE JSON_CONTAINS(one_book->'$.tags', '["Social"]') AND one_book->'$.language' = 'English'


The WITH (...) block creates a derived table that iterates each ISBN in the published_books document and extracts the corresponding book, one row per nested book. The whole shredded table is then filtered by the outermost SELECT using the conditions you wanted.


If I run it (I changed the ISBNs a little because they were duplicated):


mysql> WITH RECURSIVE author_books (id, name, one_book, rest) AS
-> (
-> SELECT authors.id, authors.name, JSON_EXTRACT(authors.published_books, CONCAT('$.', JSON_EXTRACT(JSON_KEYS(authors.published_books), '$[0]'))), JSON_REMOVE(JSON_KEYS(authors.published_books), '$[0]')
-> FROM authors
-> UNION ALL
-> SELECT author_books.id, author_books.name, JSON_EXTRACT(published_books, CONCAT('$.', JSON_EXTRACT(rest, '$[0]'))), JSON_REMOVE(rest, '$[0]')
-> FROM author_books JOIN authors ON author_books.id = authors.id WHERE JSON_LENGTH(rest) > 0
-> )
-> SELECT * FROM author_books ;
+------+-------+----------------------------------------------------------------------------------------------------------+--------------+
| id | name | one_book | rest |
+------+-------+----------------------------------------------------------------------------------------------------------+--------------+
| 1 | Tina | {"name": "Book 1", "tags": ["self Help", "Social"], "language": "English", "release_date": "2017-05-01"} | ["2e8b2470"] |
| 3 | Keith | {"name": "Book 6", "tags": ["Social", "Life"], "language": "English", "release_date": "2017-05-01"} | [] |
+------+-------+----------------------------------------------------------------------------------------------------------+--------------+
2 rows in set (0.00 sec)


You can probably move the filter to inside the CTE as an optimization, but I left it out in the hope that it's easier to understand.
Another possible optimization would be to remove the JOIN ON authors by returning published_books as a field in the first CTE query.


You can read more about CTEs here: https://dev.mysql.com/doc/refman/8.0/en/with.html


Now, if you're using 5.7, you have a couple of options (that I can think of). The easiest is to fetch the whole JSON document from matching rows and leaving extraction of the individual books to the application.

If you really want to do the extraction at the server side, you can probably write a UDF that does something similar to the CTE, but with a more conventional FOR/WHILE loop.


Regards,
Alfredo

--
Alfredo Kojima
MySQL Developer Tools

Options: ReplyQuote


Subject
Views
Written By
Posted
Re: MySQL nested JSON column search and extract sub JSON from the nested JSON
4660
December 01, 2017 01:13AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.