MySQL :: Re: Where and Case sensitivity

Re: Where and Case sensitivity

Posted by: Rick James
Date: August 05, 2014 12:21AM

I'm stuck here:

> millions of rows in the initial Extracted table (ExtractedWordsToMatch).
> However these will be inserted ~1-2,000 per time.

The table name (ExtractedWordsToMatch) implies that you have not yet matched them against the OfficialWordList words.
Wouldn't it be better to test the 1K-2K words first, then put the matched (or maybe the unmatched?) words into a table called ExtractedWordsThatMatch (or ExtractedUnmatchedWords)?

A LEFT JOIN, with a test for IS NULL on a column of the right-hand table, is how to see what is in one table but not another.
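
A minimal sketch of that LEFT JOIN pattern, run through Python's built-in sqlite3 module so it is self-contained. The thread names the tables but not their columns, so the single `word` column and the sample rows are assumptions; the SELECT itself should carry over to MySQL unchanged.

```python
import sqlite3

# Hypothetical schema: one `word` column per table (not given in the thread).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OfficialWordList (word TEXT PRIMARY KEY);
    CREATE TABLE ExtractedWordsToMatch (word TEXT);
    INSERT INTO OfficialWordList VALUES ('apple'), ('banana'), ('cherry');
    INSERT INTO ExtractedWordsToMatch VALUES
        ('apple'), ('durian'), ('banana'), ('elderberry');
""")

# Left-table rows with no partner in the right table come back with NULL
# in the right table's columns; the WHERE clause keeps only those rows.
unmatched = conn.execute("""
    SELECT e.word
    FROM ExtractedWordsToMatch AS e
    LEFT JOIN OfficialWordList AS o ON o.word = e.word
    WHERE o.word IS NULL
    ORDER BY e.word
""").fetchall()

unmatched_words = [row[0] for row in unmatched]
print(unmatched_words)  # ['durian', 'elderberry']
```

Dropping the WHERE clause (or testing IS NOT NULL instead) gives the matched words.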

If I had a million-row table with lots of dups, I would be tempted to do
SELECT DISTINCT ...
first to get rid of the dups.
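
One way that "DISTINCT first" idea might look: dedupe in a derived table, then join the much smaller result. This is only a sketch under assumptions; the `word` column, the `matched` flag, and the sample data are made up for illustration, and sqlite3 stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OfficialWordList (word TEXT PRIMARY KEY);
    CREATE TABLE ExtractedWordsToMatch (word TEXT);
    INSERT INTO OfficialWordList VALUES ('apple'), ('banana');
    INSERT INTO ExtractedWordsToMatch VALUES
        ('apple'), ('apple'), ('durian'), ('durian'), ('durian'), ('banana');
""")

# The derived table `d` collapses the duplicates before the join runs,
# so each distinct word is looked up in OfficialWordList only once.
rows = conn.execute("""
    SELECT d.word, o.word IS NOT NULL AS matched
    FROM (SELECT DISTINCT word FROM ExtractedWordsToMatch) AS d
    LEFT JOIN OfficialWordList AS o ON o.word = d.word
    ORDER BY d.word
""").fetchall()

for word, matched in rows:
    print(word, "matched" if matched else "unmatched")
```

Six input rows become three output rows, one per distinct word.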

> Would the JOIN work faster still without the distinct?

Try it both ways. It is a good exercise in SQL to formulate the queries, run them, time them, and compare the outputs.
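
A rough harness for that kind of comparison, again with sqlite3 standing in for MySQL. The schema, the synthetic data (1,000 distinct words repeated 20 times each), and the query labels are all assumptions. Timings from SQLite say nothing about how MySQL will behave on the real tables; the point is only the shape of the experiment: run both, time both, compare the outputs.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OfficialWordList (word TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE ExtractedWordsToMatch (word TEXT)")
# Synthetic data: the even-numbered words are "official".
conn.executemany("INSERT INTO OfficialWordList VALUES (?)",
                 [(f"w{i}",) for i in range(0, 1000, 2)])
# 20,000 extracted rows: each of 1,000 words appears 20 times.
conn.executemany("INSERT INTO ExtractedWordsToMatch VALUES (?)",
                 [(f"w{i % 1000}",) for i in range(20000)])

QUERIES = {
    "with_distinct": """
        SELECT d.word
        FROM (SELECT DISTINCT word FROM ExtractedWordsToMatch) AS d
        LEFT JOIN OfficialWordList AS o ON o.word = d.word
        WHERE o.word IS NULL
    """,
    "without_distinct": """
        SELECT e.word
        FROM ExtractedWordsToMatch AS e
        LEFT JOIN OfficialWordList AS o ON o.word = e.word
        WHERE o.word IS NULL
    """,
}

results = {}
for name, sql in QUERIES.items():
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    elapsed = time.perf_counter() - start
    results[name] = [r[0] for r in rows]
    print(f"{name}: {len(rows)} rows in {elapsed:.4f}s")

# Compare the outputs: same set of words, but one result still has dups.
same_words = set(results["with_distinct"]) == set(results["without_distinct"])
print("same set of unmatched words:", same_words)
```

On MySQL itself, running EXPLAIN on each query alongside the timings would show how the optimizer treats the two forms.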

> I don't want the matching to be case sensitive, so I'll use utf8_unicode_ci.

Good; that issue seems to be settled.
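
To illustrate what a case-insensitive collation does to the join, here is a small sketch. SQLite has no utf8_unicode_ci, so its built-in NOCASE collation is used as a stand-in; NOCASE only folds ASCII letters, so this is an approximation of the MySQL behaviour. The column name, the VARCHAR length in the comment, and the data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# In MySQL the column might instead be declared along the lines of:
#   word VARCHAR(64) CHARACTER SET utf8 COLLATE utf8_unicode_ci
# (the length 64 is a made-up value for illustration).
conn.executescript("""
    CREATE TABLE OfficialWordList (word TEXT COLLATE NOCASE PRIMARY KEY);
    CREATE TABLE ExtractedWordsToMatch (word TEXT COLLATE NOCASE);
    INSERT INTO OfficialWordList VALUES ('Apple'), ('banana');
    INSERT INTO ExtractedWordsToMatch VALUES ('APPLE'), ('Banana'), ('cherry');
""")

# With the collation on the columns, a plain `=` in the ON clause
# compares without regard to case; no LOWER() or UPPER() is needed.
matched = [row[0] for row in conn.execute("""
    SELECT e.word
    FROM ExtractedWordsToMatch AS e
    JOIN OfficialWordList AS o ON o.word = e.word
    ORDER BY e.word
""")]
print(matched)  # ['APPLE', 'Banana']
```

Keeping the collation on the column, rather than wrapping the column in a function inside the query, leaves an index on `word` usable for the comparison.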
