Re: Multiple Databases - Search and retrieve.
Very crude estimate: A set of columns totalling an average of 200 characters/row will take about 200 bytes/row for the FULLTEXT index on that set of columns.
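As a back-of-envelope calculator for that rule of thumb -- roughly one byte of FULLTEXT index per character of indexed text -- here's a sketch; the row count is hypothetical:

```python
# Crude estimate: ~1 byte of FULLTEXT index per character
# of text in the indexed columns.
rows = 1_000_000          # hypothetical table size
avg_indexed_chars = 200   # avg chars/row across the FULLTEXT columns
est_index_bytes = rows * avg_indexed_chars
print(est_index_bytes)    # 200000000 bytes, i.e. ~200 MB of index
```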
Avg row len: 700
    400 -> compressed to 130
    200 -> add 200 for FULLTEXT
    100 -> 100 (unchanged) for other stuff, plus 20? for other index(es)
    700 -> 350
Run
SHOW TABLE STATUS LIKE 'tbl'
before and after to verify those values.
The FULLTEXT index belongs to the table, which belongs to the database, therefore, it applies toward the 1GB/db limit.
SELECT * FROM (
    ( SELECT ... FROM db1.tbl WHERE MATCH ... LIMIT 15 )
    UNION ALL
    ( SELECT ... FROM db2.tbl WHERE MATCH ... LIMIT 15 )
    UNION ALL
    ( SELECT ... FROM db3.tbl WHERE MATCH ... LIMIT 15 )
) AS u LIMIT 15
A LIMIT without an ORDER BY is asking for garbage. If you add ORDER BY to the above example, add it in all 4 places (each subquery plus the outer SELECT). If you do OFFSET and LIMIT, then do
LIMIT 60 -- inside
LIMIT 45, 15 -- outside
That will get page 4.
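Under the hood this is a merge-then-slice: each subquery returns its own top rows, and the outer query sorts the union and slices out one page. A minimal Python sketch of that logic (the shard data, scores, and row ids are made up for illustration):

```python
# Page-4 math: 15 rows/page, pages numbered from 1.
per_page = 15
page = 4
inner_limit = page * per_page    # LIMIT 60 inside each subquery
offset = (page - 1) * per_page   # LIMIT 45, 15 outside

# Three hypothetical shards of (score, row_id) pairs, standing in
# for db1.tbl, db2.tbl, db3.tbl.
shards = [
    [(s / 100.0, f"db{d}_row{s}") for s in range(200)]
    for d in (1, 2, 3)
]

# Each subquery: ORDER BY score DESC LIMIT 60.
tops = [sorted(sh, reverse=True)[:inner_limit] for sh in shards]

# Outer query: sort the union, then LIMIT 45, 15.
merged = sorted(sum(tops, []), reverse=True)
page_rows = merged[offset:offset + per_page]
print(len(page_rows))  # 15
```

Because each shard contributes its own top 60, the merged top 60 is guaranteed to equal the global top 60 -- which is why the inner LIMIT must grow with the page number.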
Pagination and counting the number of rows will both be slow because you have to run something like:
SELECT ( SELECT COUNT(*) FROM db1.tbl WHERE MATCH ... )
     + ( SELECT COUNT(*) FROM db2.tbl WHERE MATCH ... )
     + ( SELECT COUNT(*) FROM db3.tbl WHERE MATCH ... )
That is, you have to do all the heavy lifting just to get a count. I think you will quickly find that providing the count is not worth the sluggishness.
As for fixing the poor performance you will encounter for page 1234, that has to wait until you can do the query in a single database. Meanwhile, don't let users go past, say, page 5. How often have you paginated past page 5 when using a search engine (etc)? Almost never.
Provide your SHOW CREATE TABLE; I may have more space-saving suggestions. A common one is INT -> TINYINT (4 bytes -> 1 byte). A subtle one is DATETIME -> TIMESTAMP (8 -> 4). NULL costs a bit (but you may need it). (A lucky pun.)
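To see what those type changes buy at scale, a quick sketch; the row count and column mix are hypothetical, and the byte sizes are the ones quoted above:

```python
rows = 1_000_000          # hypothetical table size
savings_per_row = (
    (4 - 1)               # one INT column -> TINYINT: 4 bytes -> 1
    + (8 - 4)             # one DATETIME column -> TIMESTAMP: 8 -> 4
)
est_savings = rows * savings_per_row
print(est_savings)        # 7000000 bytes, ~7 MB off the data
```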