MySQL Forums
Forum List  »  Data Warehouse

Re: big data question
Posted by: Vincent R&D
Date: March 30, 2013 06:11AM

Dear Rick,
Thanks for your responce. Thanks in advance for further guidances...
+------------+--------------------------+------+-----+---------------------+----------------+
| Field      | Type                     | Null | Key | Default             | Extra          |
+------------+--------------------------+------+-----+---------------------+----------------+
| name       | varchar(30)              |      | MUL |                     |                |
| n1         | int(11)                  |      |     | 0                   |                |
| n2         | int(11)                  |      |     | 0                   |                |
| n3         | int(11)                  |      |     | 0                   |                |
| n4         | int(11)                  |      | MUL | 0                   |                |
| session    | bigint(20) unsigned      |      | MUL | 0                   |                |
| n5         | int(11)                  |      |     | 0                   |                |
| ID         | bigint(20) unsigned      |      | PRI | NULL                | auto_increment |
| Time       | datetime                 |      |     | 0000-00-00 00:00:00 |                |
| type       | int(6) unsigned zerofill |      | MUL | 0000                |                |
| userId     | bigint(19) unsigned      |      | MUL | 0                   |                |
| code       | int(6) unsigned          |      |     | 0                   |                |
| amount1    | bigint(20) unsigned      | YES  |     | NULL                |                |
| branch     | int(6) unsigned          |      |     | 0                   |                |
| t2         | datetime                 | YES  |     | NULL                |                |
| record     | int(11)                  |      | MUL | 0                   |                |
| record2    | int(3) unsigned          |      | MUL | 0                   |                |
| device     | int(10) unsigned         |      | MUL | 0                   |                |
| amount2    | int(11)                  |      | MUL | 0                   |                |
| additional | text                     |      |     |                     |                |
| status     | int(11)                  |      |     | N                   |                |
+------------+--------------------------+------+-----+---------------------+----------------+

Some history: A user dialin call to our systems results in populating two tables which one of them is as above. The next table is a call table which holds a couple of parameter related to the call itself and a sessionId which is the same as what we have in above table as "session". In fact we use this to left-join two tables and report incomming call data to the user.


call table:

create table if not exists %s (name varchar(30) not null,
chasis integer not null,
slot integer not null,
port integer not null,
date1 integer not null,
time1 integer not null,
state integer,
handshake integer,
connectiontime integer,
protocol integer,
mode integer,
received integer,
sent integer,
tel1 varchar(100),
tel2 varchar(100),
netId varchar(20),
txspeed integer,
rxspeed integer,
sentbytes integer,
rcvbytes integer,
`sessionId` bigint(20) UNSIGNED NOT NULL DEFAULT '0',
KEY `name` (`name`,`chasis`,`slot`,`port`), KEY `date1` (`date1`,`time1`), KEY `sessionId` (`sessionId`), KEY `tel1` (`tel1`), KEY `tel2` (`tel2`), KEY `protocol` (`protocol`,`mode`)) TYPE=MyISAM MAX_ROWS=4294967295 AVG_ROW_LENGTH=7


The problem is that this is indeed a huge huge amount of data with four millions of records in both tables each week. A select query which contains a join and group-by by date would be take a lot of time and the user is not happy.

As a novice data player I have a couple of questions:
- Would it be a help by partitioning both tables base on time (or session - guaranteeing that the sessions are advances by one increment each time?!). Also how could indexing might survive us.
- Would it be a help to create a historical table which contains normalized joint data product of both tables and the user would need to query that table instead?! And does this make us creating a staging cermony with say one day interval than the origincal table?!
And would it be better to create a second separate database instance and keep this normalaized table there.

- Also what if I need to report the user some minor analytical data such as a number of transactions base on their status base on date intervals.


Regards,
Vincent



Edited 1 time(s). Last edit at 03/30/2013 06:23AM by Vincent R&D.

Options: ReplyQuote


Subject
Views
Written By
Posted
3056
March 27, 2013 01:26AM
1596
March 28, 2013 07:52PM
Re: big data question
1680
March 30, 2013 06:11AM
1516
March 31, 2013 01:25AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.