Re: Delete Query
Posted by:
Rick James
Date: March 01, 2010 10:02AM
Sorry, you asked a high-level question, the answer to which has some low-level prerequesites...
* Disks
-- Layout (spindles, arm motion, seek time, etc)
-- Caching (read vs write)
-- RAID (striping, mirroring)
* File systems
-- Blocking (How the disk is broken up for access)
-- inodes / vnodes / etc -- how a file knows where the pieces of the file are located
-- Elevator strageties
-- Caching by the filesystem (read; write; use of RAM)
* File integrity
-- What about a system crash? (recovery, organizing things in anticipation of needing recovery)
* Caching
-- Reads (LRU strategy)
-- Writes (write-through or not)
-- By whom? (hardware, OS, MySQL)
-- At "block" level
-- MySQL's Query cache (at statement; other radical differences)
* BTrees
-- Lookups are Order(log N) -- hence little degredation as table grows
-- Inserts are Order(log N)
-- Scans are Order(1)
-- Only one BTree on a dataset
-- How the filesystem uses BTrees
-- How MySQL uses BTrees
* MySQL InnoDB
-- Block structure for both Data and Index (BTree)
-- Transaction log
-- innodb_buffer_pool (not write-through)
-- order of writing things (to facilitate crash recovery)
-- Data blocks vs index blocks (Index is effectively a Table)
-- tablespaces -- single vs file-per-table
-- How Secondary keys differ from PRIMARY
* MySQL MyISAM
-- Data file (.MYD) structure
-- Index file (.MYI) BTree structure
-- key_buffer (index cache only)
* MySQL Indexes
-- When to use
-- Optimizations
-- Interaction with disk
-- Lookup types -- PK, secondary key, Range, IN, random, etc
-- JOINs
----
I'll gladly expound on any topic, but I need to assume you have the underlying knowledge.
Back to your original question... I recommend you ignore the question until it becomes necessary. Many small applications (1000 of anything is "tiny") get by just fine in MySQL without taking into considerations the gory details like BTrees and JOIN strategies.
If you want to talk about billion-row tables, I need you to know most of the above issues so that when I say "turn that subquery into a join and add this index and increase innodb_buffer_pool_size" [3 things I say a lot], you have some concept of why.
The one thing that helps me the most in optimizing "big" db issues is "count the disk hits". (If everything fits in innodb_buffer_pool_size, there is virtually no I/O, and CPUs are usually fast enough to mask lack of indexes, etc.) But to count the disk hits, I need to take many of the above topics into consideration.
www.mysql.com covers most topics. You can read it without too much Computer Science knowledge.