MySQL Forums
Forum List  »  Performance

Re: Question about the Query Cache and server settings
Posted by: Jay Pipes
Date: July 25, 2005 06:56PM

Erin ONeill wrote:
> I'm also suddenly getting a LOT of aborted
> connections. phpMyAdmin made it appear at first
> glance to be the server connections but on closer
> inspection that number has rarely changed (it's
> been 55 for a long while now). It's the client
> connections that has suddenly jumped and continues
> too (30,506 in 10 days). It started jumping when
> something went wrong with the hardware (but none
> of us here seem to know what it is as the colo
> isn't saying -- could be a router?). Not sure if I
> can tweak the server for that.

Yes, this more than likely is not related to MySQL but a network issue; talk to the net admins and see if something's afoul.

> I went thru the software code here and pointed out
> all the SELECT * queries. I think I've got the
> programmers here to stop doing that with their
> code. It's the forum 3rd party software. I don't
> think any database person has analyzed phpBB in a
> LONG LONG time and while this code is NOT phpBB it
> looks like the database stuff was taken from it.
> Luckily the code we're using has been
> discontinued. They have encouraged me to look at
> other forum software and make a recommendation.
> And SELECT * in queries is on my list as well as
> the schema.

SELECT * FROM some_table is just bad programming practice, though in this case, I was referring less to the * part and more about situations in which the programmer does not realize that how to use the LIMIT clause correctly, and instead retrieves boatloads of information from MySQL and then uses PHP to simply return, say, the first 10. This is awful practice and can lead to situations in which you will exceed the query_cache_limit size. Plus, it's simply inefficient; unless there is a WHERE clause, the only two access strategies MySQL can use are an index scan or a table scan. Neither is an attractive option, of course.

> In the meantime I have to speed up the queries of
> the software we have and like I susprected my
> query_cache_limit needs to be raised. I raised it
> to 16M to start and see if I see some drop off. I
> know I'll probably have to raise it again.

Actually, I made it sound like the *actual* size of the resultset returned was greater than 32MB. This may not be the case; instead, the temporary table (used in GROUP BY and ORDER BY operations) is over 32 MB. If the query is a GROUP BY, then more than likely, there is less than 32 MB result size being returned. Either way, though, you should definitely figure out which queries are causing the disk-based temp tables to be created. More than likely, this is one of the top performance degraders currently in your system. Use mysqldumpslow (a relatively unknown and little-documented script included in the mysql installation tree) to analyze the contents of your slow query log for large GROUP BY queries that are repeatedly showing up.

> I'm cautious as I don't have access to this server and
> can't see how my changes effect the System. I only
> have access to phpMyAdmin -- so it feels like I'm
> working in the dark here.

You know you have to remedy that situation!

> I'm now greping thru our code to see if I can find
> queries with functions in them. Then I'll start
> looking at the sorts of functions they're using.
> BUT most of my slow queries come down to TWO
> select statements. So first I'll go thru the
> explain with them and see how many rows are
> returned.

Cool. Looking forward to it.

> Is it possible to find out the size of
> the query?? (meaning the size of the rows returned
> - I know how to get the number of rows returned ).

Not easily, or at least not that I am aware of off the top of my head.

> I'm currently reading Joe Celko's SQL programming
> style as I've found too many fieldnames that I
> think should be reserved words but MySQL let's my
> programmers use! (like the word date). Joe has a
> LOT of info in a small book. Not sure if all of it
> is correct for MySQL.

Celko's books are indeed excellent advanced reading; they provide a lot of substance in a small amount of pages, I absolutely agree. He does have a couple "sticking points" that I somewhat disagree with, most notably his insistence that auto-incrementing keys are a poor choice as opposed to a more "naturally occurring" key. However, for the vast majority of his stuff, he's right on the money. Most of it applies to an Oracle syntax, but much of the concepts are applicable to MYSQL, especially MySQL 5.

> I think Joe comes from the
> cobol and JCL world where a lot of MySQL
> programmers are coming from the java and php
> world. Joe hates CamelHumps.

here's one of those disagreements I have with Celko. In my (very humble) opinion, there are certain things about SQL style that are simply that: a matter of style. Things that I consider to be matters of personal preference are such things as whether you use PascalCasing, camelCasing, underscore_names, etc. to name tables and/or fields. I *personally* tend to use PascalCasing for table names, and underscore_naming for fields, but I feel this is a stylistic preference.

What is *not* IMHO a stylistic preference, but rather poor coding practice, is the following:

1) Inconsistency between styles.

This bothers me to no end and is perhaps the easiest method of spotting a professional developer from a novice. Beginning programmers (or even medium-experienced programmers who haven't worked on large teams) tend to be inconsistent in the way they style and indent their SQL code. Especially on large, multi-developer projects, it is a must to have a written documentation of coding styles, indentation expectations, and so on. Consistency breeds easier code to maintain and understand, and allows fewer bugs into the SQL code.

2) CAPITALIZING EVERYTHING UNDER THE SUN

To me, this is an outdated, and extremely frustrating style preference to deal with, especially in SQL code, that is rooted in the old mainframe and COBOL days. When everything in a SQL statement is capitalized, it makes it very difficult to spot keywords, thus leading to harder to debug code. Related to this practice is not using table aliasing for ease of reading.

Example:

SELECT MYTABLE.MYFIELD, SOMEOTHERTABLE.SOMEOTHERFIELD FROM MYTABLE, SOMEOTHERTABLE
WHERE MYTABLE.MYOTHERFIELD = SOMEOTHERTABLE.MYOTHERFIELD;

compared to:

SELECT mt.my_field, sot.some_other_field FROM MyTable mt, SomeOtherTable sot
WHERE mt.my_other_field = sot.my_other_field;

In which is it easier to spot the a) keywords, b) table names, and c) field names

3) No or improper indentation and lline breaks

Again, it's something that happens with experience, both in SQL coding and in general programming in other languages. More experienced programmers understand the use of whitespace and indentation. They realize that 400 characters strings of SQL code on a single line make it very difficult for another coder to a) understand what fields are used in the query, b) add or remove fields from the SELECT or other clauses, and c) make the code look like a messy pile of binary vomit in any text editor.

Again, which would you prefer:

SELECT mt.my_field, sot.some_other_field FROM MyTable mt, SomeOtherTable sot
WHERE mt.my_other_field = sot.my_other_field;

or:

SELECT
mt.my_field
, sot.some_other_field
FROM MyTable mt, SomeOtherTable sot
WHERE mt.my_other_field = sot.my_other_field;

I personally feel the second is easier to read and maintain. Especially in complex SQL statements with multiple subqueries, derived tables, and calculations, proper indentation and line breaks are essential. There is no set rules on whether to include the ON clause in the same line as the JOIN, or whether to put the comma at the beginning or end of field lines, but the point is to apply whichever style you choose *consistently*.

4) Using Theta-style instead of ANSI style.

I'll probably get a lot of flak for bringing this up, because a lot of programmers just love their old-style Theta joins (only two of which MySQL even supports the syntax for). I personally feel that *good* SQL code is *explicit* and *clear* SQL code. Simply put: ANSI style is much, much, much clearer and more explicit than Theta-style.

For example, which is clearer (again, using the example from above):

SELECT
mt.my_field
, sot.some_other_field
FROM MyTable mt, SomeOtherTable sot
WHERE mt.my_other_field = sot.my_other_field;

or:

SELECT
mt.my_field
, sot.some_other_field
FROM MyTable mt
INNER JOIN SomeOtherTable sot
ON mt.my_other_field = sot.my_other_field;

If in the future, an outer join was needed for this query, it would be a simple task to simply change INNER to LEFT. For the top query, it would have to be rewritten to use ANSI SQL.

There are other little style things I go over in the book, but these are the big ones. Consistency is key. BTW, the book hit stores today I guess. Woo hoo! I'm in the process of putting together a SQL puzzler on my website; I'm giving free copies away to the folks who write the clearest, most efficient SQL solutions to the problem questions. I think it will be a fun and interesting way to get people thinking about style and efficiency in their coding...

Cheers,

Jay Pipes
Community Relations Manager, North America, MySQL Inc.

Got Cluster? http://www.mysql.com/cluster
Personal: http://jpipes.com

Options: ReplyQuote


Subject
Views
Written By
Posted
Re: Question about the Query Cache and server settings
1995
July 25, 2005 06:56PM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.