Felix John wrote:
>
> According to you the table might look something
> like this having 6250000 rows ( 2500 * 2500 ),
> 1) Will a query like find all y where x == 1 and
> Distance == 3 scan all the 6250000 rows ?????.
Definitely not, provided that the table has suitable indexes, for example one covering the x and Distance columns. Indexes are essential for performance.
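To make the point about indexes concrete, here is a minimal sketch. It uses SQLite through Python's standard sqlite3 module only because it needs no server; that is my substitution, not MySQL, and the table name dist, its column names and the index name dist_x_distance are made up for illustration. It asks the engine for its query plan with and without a composite index.

```python
import sqlite3

def plan_for_lookup(with_index: bool) -> str:
    """Return SQLite's query plan for the lookup, with or without an index."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per (x, y) pair with its distance.
    conn.execute("CREATE TABLE dist (x INTEGER, y INTEGER, distance INTEGER)")
    if with_index:
        # Composite index on the two columns used in the WHERE clause.
        conn.execute("CREATE INDEX dist_x_distance ON dist (x, distance)")
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT y FROM dist WHERE x = 1 AND distance = 3"
    ).fetchall()
    conn.close()
    # The last column of each plan row is the human-readable detail.
    return " | ".join(row[-1] for row in rows)

print(plan_for_lookup(with_index=False))
print(plan_for_lookup(with_index=True))
```

The exact wording of the plan depends on the SQLite version, but the first line should report a scan of the whole table and the second a search using the index. In MySQL the analogous check would be an EXPLAIN on the same SELECT.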
>
> 2) Is it efficient to split this (2500 * 2500 = 6250000 rows) table into 2500 separate tables of 2500 rows each ?????.
I don't know, but I don't think so. My intuition is that each table becomes a few files, so with 2500 tables, reading one of them means opening a file in a directory of about 3*2500 files, which is expensive (Ext3 directory scans are usually O(n), where n is the number of entries in the directory). The best way to be sure is to run the experiment yourself and post your results here.
> 3) What do you think on an idea like this, A table with 2500 columns and 2500 rows exactly like a distance matrix.
Try it yourself. But basically, you are implicitly expecting that retrieving one of 2500 columns is a quick operation, and I am not sure that it is.
I am not a MySQL expert, but as I understand it, the basis of an RDBMS is relations, not functions. So your function t(i,j) becomes a relation t(i,j)=u (of the 3 variables i,j,u). A good RDBMS is built to manage big relations (aka tables) efficiently, provided they have suitable indexes.
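To illustrate what I mean by turning the function t(i,j) into a relation, here is a small sketch in plain Python, with no database involved. The 3x3 matrix and the helper names matrix_to_relation and build_index are invented for the example; the dictionary only mimics, very roughly, what an index on (i,u) would give you.

```python
from collections import defaultdict

def matrix_to_relation(t):
    """Flatten a matrix t into rows (i, j, u) with u = t[i][j]."""
    return [(i, j, u) for i, row in enumerate(t) for j, u in enumerate(row)]

def build_index(relation):
    """Map each (i, u) pair to the list of j values, like an index on (i, u)."""
    index = defaultdict(list)
    for i, j, u in relation:
        index[(i, u)].append(j)
    return index

# Made-up 3x3 distance matrix, for illustration only.
t = [
    [0, 3, 5],
    [3, 0, 3],
    [5, 3, 0],
]
relation = matrix_to_relation(t)
index = build_index(relation)

# "Find all j where i == 1 and distance == 3" is now a direct lookup.
print(index[(1, 3)])  # → [0, 2]
```

The matrix shape disappears: every cell becomes one row, and the lookup goes through the index instead of walking over the rows.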
I really suggest trying it yourself, and I will be glad to read your results here. You might script your test in Ruby, Perl, OCaml, PHP, or whatever language (with MySQL bindings) you like.
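As a starting point, such a test could look like the sketch below. Again it is an assumption-laden stand-in: it runs against an in-memory SQLite database through Python's sqlite3 module rather than MySQL, the table and index names are invented, the distances are random numbers, and N is cut down from 2500 to 300 so that it finishes quickly. For a real answer you would point the same idea at MySQL with the full table.

```python
import random
import sqlite3
import time

N = 300  # reduced from 2500 to keep the run short; raise it for a real test

def timed_lookup(conn, x, distance, repeats=20):
    """Run the lookup several times; return the rows and the mean time per query."""
    start = time.perf_counter()
    for _ in range(repeats):
        rows = conn.execute(
            "SELECT y FROM dist WHERE x = ? AND distance = ? ORDER BY y",
            (x, distance),
        ).fetchall()
    return rows, (time.perf_counter() - start) / repeats

random.seed(0)  # fixed seed so repeated runs build the same table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dist (x INTEGER, y INTEGER, distance INTEGER)")
# One row per (x, y) pair, with a random stand-in distance between 0 and 9.
conn.executemany(
    "INSERT INTO dist VALUES (?, ?, ?)",
    ((x, y, random.randint(0, 9)) for x in range(N) for y in range(N)),
)

# Same query, first without any index, then with a composite index.
rows_scan, t_scan = timed_lookup(conn, 1, 3)
conn.execute("CREATE INDEX dist_x_distance ON dist (x, distance)")
rows_index, t_index = timed_lookup(conn, 1, 3)

print(f"{N * N} rows: no index {t_scan:.6f}s, with index {t_index:.6f}s per query")
```

I give no timings on purpose, since they depend on the machine and the engine. The only thing the script guarantees is that both runs return the same rows.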
--
Basile STARYNKEVITCH ::::
http://starynkevitch.net/Basile/