MySQL Forums
Forum List  »  Newbie

Re: Partitioning or Not Partitioning?
Posted by: Manuel Gentili
Date: November 24, 2014 07:30AM

> Is `dati_rilevati` the "history" table? If so, what is the CREATE TABLE for the "real time" table?

Yes, dati_rilevati is the history table. The dati_realtime table is identical, and it's the sequent:

CREATE TABLE `dati_realtime`
(
`drt_id` bigint (20) NOT NULL AUTO_INCREMENT ,
`drt_dispositivo` integer (11) NOT NULL,
`drt_cliente` integer (11),
`drt_linea` integer (11),
`drt_idvariabile` integer (11),
`drt_nomevariabile` varchar (50),
`drt_valore` varchar (50),
`drt_data` datetime,
PRIMARY KEY (`drt_id`)
) TYPE=InnoDB CHARACTER SET utf8 COLLATE utf8_general_ci;

> If you can make that one row with 25 columns for the measurements, you can do the INSERT a lot faster. Also, the dataset size will be smaller -- possibly 1/3 the size.

Yes, I can do it, but now the software work with one rows for each value because I think it's more simple to make measurement average query in one month or more... Another consideration about the insert of another measurement, that in this case will be an insert of one new column in the table, but it's not a big problem, if you tell me that it's better, I can change my database and my software too.

> Perhaps that is what you are doing??

No, now I insert a rows for every value, so 25 rows per second for example. I insert it in the realtime table, and every 60 second I do an average and I insert the value in the history table, so I have an history that is more light and a realtime table that give me the realtime value in a webpage that I use like a display.

> There are two issues: dataset size (and you are willing to buy bigger disks), and ability to perform the INSERTs (which I am mostly focusing on). MySQL has a limit of how many rows it can INSERT per second

The dataset size I think that it's not a great problem, because I can use very large disks or also I can decided to make a backup every year and that restart with a clean database.
The INSERT query for every second may be a problem, so now I try to calculate the total amount of INSERT, with the actual method and with your modify in witch I have 25 value on each row. To do this I use the definitive details, that are that in each dispositives I can calculate 3 set of measure, so I can use one dispositive for 3 client. I will have about 2000 dispositives, so I can insert 6000 clients, so:

- In My actual method I have ((25 value * 3 client) * 2000 dispositives), so I have 150000 insert for second, that are divided in about 6000 dati_realtime table (dati_reatime_1, dati_realtime_2... dati_realtime_6000), so it's about 25 insert for each table.
- In your method I have ((one single row * 3 client) * 2000 dispositives), so I have 6000 rows, each one in a different table, so I have 1 row per table. I see that there is a great difference

> By "real time", do you mean the last value read?

SELECT drt_valore FROM dati_realtime_1 WHERE drt_dispositivo = 1 AND drt_linea = 2 AND drt_cliente = 123 AND drt_idvariabile = 5 AND drt_data = (SELECT MAX(drt_data) FROM dati_realtime_1 WHERE drt_dispositivo = 1 AND drt_linea = 2 AND drt_idvariabile = 5)

in witch linea is a value that is from 1 to 3, because in each dispositive that take measures I have 3 identical lines, in witch I can attach 3 different client, so in each dispositive a can take measure for 3 clients.

> Are those in a 1-to-1 relationship? If so, that pair should be in a separate table (or switch to an ENUM as I mentioned above). That VARCHAR wastes a lot of space, and hence speed. Put only the smaller field (idvariable) in the history table.

Yes, drl_idvariabile is a value that identify the kind of measure, so it's refer to another table in witch I explain some caratteristics of these measure. drl_nomevariabile is the name of the meauser, for example volt, ampere and so on, but I can take this from the attached table by use drl_idvariabile, I use this VARCHAR in the dati_rilevati only for a more simple query display result, and because I thought that it's faster to have a query in only one table...

> Some calculations...

I think you make an error, 25 rows per second per 2000 client are 50000 record, not 5000, and so on... but with the new 6000 client I have 150000 record with my actual method and 6000 record in your method

> Clearly, I am pushing you toward Plan B, especially with the by-minute summarization.

I read your example and I think that it's better your method, so I must modify the dati_realtime and dati_rilevati table. I also think that partition by date (I think months in dati_rilevati and minute, if is possible, in dati_realtime) is a good solution. But I have always 6000 INSERT for second, it's possible? it's a problem? can you explain me the insert limit for second?

Thank you so much, Manuel

Options: ReplyQuote


Subject
Written By
Posted
Re: Partitioning or Not Partitioning?
November 24, 2014 07:30AM


Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.