<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
    <channel>
        <title>MySQL Forums - Data Warehouse</title>
        <description>Forum about MySQL used as a Data Warehouse.</description>
        <link>http://forums.mysql.com/list.php?32</link>
        <lastBuildDate>Thu, 23 May 2013 17:02:55 +0000</lastBuildDate>
        <generator>Phorum 5.2.19</generator>
        <item>
            <guid>http://forums.mysql.com/read.php?32,585342,585342#msg-585342</guid>
            <title>datawarehousing (1 reply)</title>
            <link>http://forums.mysql.com/read.php?32,585342,585342#msg-585342</link>
            <description><![CDATA[ Hi everyone iam developping an application with php and mysql  but i want to use oracle 11g for the datawarehouse can someone help me about how to proceed????]]></description>
            <dc:creator>Bodelère ASSOKPE</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Fri, 03 May 2013 15:02:02 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,582433,582433#msg-582433</guid>
            <title>big data question (3 replies)</title>
            <link>http://forums.mysql.com/read.php?32,582433,582433#msg-582433</link>
            <description><![CDATA[ Hi,<br />
<br />
I don't know if this is the right place to put this question but I guess this is<br />
related to high availablity of data.<br />
<br />
The problem is that I have a OLTP mysql database which has two very large tables among many.<br />
The user is interested to get a report which is created from joining this two tables. These tables are getting bigger over time with rate of millions of records<br />
per week.<br />
<br />
Querying the tables through on-demand joins as a response to the user's request is<br />
very unhealthy becasuse the tables are very big and the whole database hangs during the report.<br />
<br />
I thought one solution would be to create a second database beside the first database. The second database has a view of joint tables of the first database which is filled with the source data from the first database in an interval. sounds like staging ....<br />
<br />
I don't know what is a standard or best practice way of doing this...<br />
What would you suggest?<br />
I have also familiar with most of big data definitions and solutions such as partitioning, data warehouse, staging, data marts etc. but not having had a practical experience so far, I'm not that confidence to pick a solution ...<br />
<br />
<br />
Your suggestions are appreciated.<br />
Thanks,<br />
Vincent.]]></description>
            <dc:creator>Vincent R&amp;D</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Sun, 31 Mar 2013 07:25:32 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,575773,575773#msg-575773</guid>
            <title>16 Table Join Benchmarks for New Fast Data Warehouse Server Published (no replies)</title>
            <link>http://forums.mysql.com/read.php?32,575773,575773#msg-575773</link>
            <description><![CDATA[ 16 table join benchmarks were ran on Parallel Universe* at Amazon Web Services.<br />
<br />
Results are available at www.paralleluniverse-inc.com/benchmarks.shtml .<br />
<br />
----------------------------------------------------------------------<br />
<br />
*Parallel Universe is the industry's only SQL server with fast parallel query engine.<br />
<br />
It is created by extending MySQL server architecture and runs on commodity server hardware.<br />
Speed is achieved by processing tables in parallel utilizing multiple core/CPU of server hardware and scalable to large joins.<br />
<br />
Because of fast query processing being available to data analysis, it is an ideal data warehouse server.<br />
With Parallel Universe, you'll also be able to deploy less costly server hardware for the same query load/task.<br />
<br />
Parallel Universe is released under the GPL license and fully compatible with MySQL and Percona servers.<br />
Also available as part of Linux OS images at Amazon Web Services and www.GoGrid.com .]]></description>
            <dc:creator>Hiromichi Watari</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Mon, 10 Dec 2012 19:21:02 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,573682,573682#msg-573682</guid>
            <title>New Fast Data Warehouse Server Now Available at Amazon Web Services (no replies)</title>
            <link>http://forums.mysql.com/read.php?32,573682,573682#msg-573682</link>
            <description><![CDATA[ Parallel Universe* is now available as part of Linux OS images at Amazon Web Services (http://www.aws.amazon.com).<br />
<br />
Launch Instance -&gt; Classic Wizard -&gt; Community AMIs -&gt; type in &quot;parallel universe&quot; (All Images)<br />
----------------------------------------------------------------------------------------------------------------------<br />
<br />
*Parallel Universe is the industry's only SQL server with fast parallel query engine.<br />
<br />
It is created by extending MySQL server architecture.<br />
Speed is achieved by processing tables in parallel, utilizing multiple core/CPU of server hardware and scalable to large joins.<br />
<br />
Because of fast query processing being available to data analysis, it is an ideal data warehouse server.<br />
With Parallel Universe, you'll also be able to deploy less costly server hardware for the same query load/task.<br />
<br />
Parallel Universe is released under the GPL license and fully compatible with MySQL and Percona servers.<br />
Also available at www.GoGrid.com and www.paralleluniverse-inc.com .]]></description>
            <dc:creator>Hiromichi Watari</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Tue, 13 Nov 2012 17:01:07 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,569386,569386#msg-569386</guid>
            <title>New Fast Data Warehouse Server Now Available at GoGrid.com (no replies)</title>
            <link>http://forums.mysql.com/read.php?32,569386,569386#msg-569386</link>
            <description><![CDATA[ Parallel Universe* is now available as part of Linux OS images at www.GoGrid.com.<br />
GoGrid.com is a dedicated/cloud server hosting site.<br />
<br />
----------------------------------------------------------------------------------------------------------------------<br />
<br />
*Parallel Universe is a new extension to MySQL server architecture, created to provide fast parallel query capability.<br />
Speed is achieved by processing tables in parallel, utilizing multiple core/CPU of server hardware.<br />
<br />
Because of fast query processing being available to data analysis, it is an ideal data warehouse server.<br />
With Parallel Universe, you'll also be able to deploy less costly server hardware for the same query load/task.<br />
<br />
Parallel Universe is released under the GPL license and fully compatible with MySQL and Percona servers.<br />
<br />
Also available at www.paralleluniverse-inc.com]]></description>
            <dc:creator>Hiromichi Watari</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Mon, 24 Sep 2012 15:32:00 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,568386,568386#msg-568386</guid>
            <title>Size of my databse (no replies)</title>
            <link>http://forums.mysql.com/read.php?32,568386,568386#msg-568386</link>
            <description><![CDATA[ Hello everyone, <br />
<br />
this is my first post on this forum. I have a tricky question  that has probably been asked here before.<br />
I would like to know what could be the size (in Octets or Bytes) of my database (using InnoDB engine with MySql) if I have to record 40 million of lines and 20 fields (attributes).<br />
<br />
Best Regards, <br />
<br />
Thomas<br />
<br />
<br />
*** Sorry if I choose the wrong topic ***]]></description>
            <dc:creator>Thomas Laguerre</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Thu, 13 Sep 2012 14:54:13 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,567942,567942#msg-567942</guid>
            <title>New Fast Data Warehouse Server (4 replies)</title>
            <link>http://forums.mysql.com/read.php?32,567942,567942#msg-567942</link>
            <description><![CDATA[ Parallel Universe is a new extension to MySQL server architecture, created to provide fast parallel query capability.<br />
Speed is achieved by processing tables in parallel, utilizing multiple core/CPU of server hardware.<br />
<br />
Because of fast query processing being available to data analysis, it is an ideal data warehouse server.<br />
With Parallel Universe, you'll also be able to deploy less costly server hardware for the same query load/task.<br />
<br />
Parallel Universe is released under the GPL license and fully compatible with MySQL and<br />
Percona servers.<br />
<br />
Available at www.paralleluniverse-inc.com]]></description>
            <dc:creator>Hiromichi Watari</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Tue, 11 Sep 2012 16:46:03 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,564142,564142#msg-564142</guid>
            <title>Loading files automation using Load_data (1 reply)</title>
            <link>http://forums.mysql.com/read.php?32,564142,564142#msg-564142</link>
            <description><![CDATA[ HI everyone,<br />
<br />
I want to load more than 100 CSV files in every week to database. For the same i am using LOAD DATA functionality of MYSQL.<br />
<br />
To automate it. I am extracting file name &amp; related path in csv file using scripting &amp; loading to the table. Now i have all csv files names &amp; path in my database table.<br />
<br />
I tried to create a prepared statement for LOAD DATA, where i was concatenating file name &amp; file path to LOAD DATA statement in procedure using cursor to dynamically execute it.<br />
<br />
But here LOAD DATA is not allowed in procedures.<br />
<br />
Can anyone please help me to do so????<br />
<br />
Please suggest.<br />
<br />
Thanking you in advance.<br />
<br />
Regards,<br />
Dhiraj]]></description>
            <dc:creator>Dhiraj Bhangale</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Wed, 08 Aug 2012 05:07:13 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,563404,563404#msg-563404</guid>
            <title>Need advice for table rotation (1 reply)</title>
            <link>http://forums.mysql.com/read.php?32,563404,563404#msg-563404</link>
            <description><![CDATA[ I plan to design system for keep access log in my customer website and keep every events that happen so data increase very quick (someday about 2M in 1 day)<br />
and cause it is a log record that no need for update or delete, just select and insert. And We wanna keep data just 1 year, older than will be removed but cleaning process will not run everyday, may 1 week, 2 week or month once.<br />
I plan to active data just only newest 3 months and archive 9 older months, is case customer wanna review and compare old data.<br />
So please suggest me.<br />
1. Which storage engine that I should use, myisam, innodb or archive engine?<br />
2. If I split data to daily table or monthly table, How is it performance?<br />
3. If you have any good idea, please tell me.<br />
<br />
Thanks for every answer.]]></description>
            <dc:creator>Pichai Iamtana-anan</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Wed, 01 Aug 2012 03:54:38 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,532653,532653#msg-532653</guid>
            <title>Problem with granularity choosen (1 reply)</title>
            <link>http://forums.mysql.com/read.php?32,532653,532653#msg-532653</link>
            <description><![CDATA[ Hi.<br />
<br />
I have a fact table with a very small granularity, which is a daily snapshot of sales process on my company.<br />
<br />
I have told my client I couldn´t handle too much detail because of the poor hardware given. But he insisted on that because he requires detailed data for some decisions.<br />
<br />
So we agreed on this deal:<br />
<br />
- To persist only the last day snapshot of each month in database;<br />
- But update every day, erasing the previous days of the same month;<br />
<br />
Example:<br />
<br />
2012-05-03 replaces 2012-05-02<br />
2012-05-04 replaces 2012-05-03<br />
2012-05-05 replaces 2012-05-04<br />
...<br />
2012-05-31 replaces 2012-05-30, maintain 2012-05-31<br />
start another month<br />
2012-06-01<br />
2012-06-02 replaces 2012-06-01<br />
...so on<br />
<br />
I realized that deleting large number of records takes too long time.<br />
So I split on 2 tables, one with the last day of the month results, and the other the partially results, so I can perform TRUNCATE TABLE on this partial result table.<br />
<br />
1st question: Is that the best solution?<br />
<br />
Continuing...<br />
Now I need to perform a UNION ALL with these 2 tables using view. Unfortunately, the performance drops considerably, because I can´t manage the indexes properly (USE INDEX, FORCE INDEX ...). I am not happy with that.<br />
<br />
What should I do?]]></description>
            <dc:creator>Hélio Hélio</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Sat, 12 May 2012 01:54:39 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,511435,511435#msg-511435</guid>
            <title>Proxy users (no replies)</title>
            <link>http://forums.mysql.com/read.php?32,511435,511435#msg-511435</link>
            <description><![CDATA[ Hi,<br />
<br />
<a href="http://dev.mysql.com/doc/refman/5.5/en/proxy-users.html"  rel="nofollow">http://dev.mysql.com/doc/refman/5.5/en/proxy-users.html</a><br />
<br />
Did anybody use the a/m feature with mysql_native_password plugin?<br />
I used the script from the site, replacing  auth_plugin with mysql_native_password.<br />
The commands have been executed succesfully however the feature &quot;proxy users&quot; seems not to work. I tried on 5.5.20 (windows) and 5.5.10-log (centos).<br />
I used the same feature on Oracle 11g r1.<br />
<br />
After logged in I received:<br />
&lt;code&gt;<br />
mysql&gt; SELECT USER(), CURRENT_USER();<br />
empl_external@localhost | empl_external@localhost<br />
&lt;/code&gt;<br />
<br />
There is not proxy at all.]]></description>
            <dc:creator>Przemys?aw Budny</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Mon, 23 Jan 2012 11:58:17 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,499432,499432#msg-499432</guid>
            <title>Memory Allocation Error Help ?? (1 reply)</title>
            <link>http://forums.mysql.com/read.php?32,499432,499432#msg-499432</link>
            <description><![CDATA[ I am having a memory problem when trying to update Cognos Query subjects in Framework manager running on MySql. I have a large MySql table that I have added a new column to and I am trying to update the table definition in Cognos Framework manager in order to pull the new version of the MySql table into Cognos. Smaller tables work fine but this larger table I am getting the error message shown below. I have 16G of memory for my Mysql server and am running the community version 5.1 of MySql... has anyone encountered this problem or have any suggestion as to what may be the cause ??? <br />
<br />
RQP-DEF-0177 An error occurred while performing operation 'sqlPrepareWithOptions' status='-9'. <br />
UDA-SQL-0107 A general exception has occurred during the operation &quot;prepare&quot;. <br />
[MySQL][ODBC 3.51 Driver][mysqld-5.1.49-community-log]MySQL client ran out of memory]]></description>
            <dc:creator>Norm Bailey</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Fri, 18 Nov 2011 16:10:29 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,431917,431917#msg-431917</guid>
            <title>sql error (no replies)</title>
            <link>http://forums.mysql.com/read.php?32,431917,431917#msg-431917</link>
            <description><![CDATA[ I got a mysql error on my site can you please tell why<br />
<br />
<a href="http://www.mobilyainsaat.com/"  rel="nofollow">http://www.mobilyainsaat.com/</a>]]></description>
            <dc:creator>hasan ozerr</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Thu, 25 Aug 2011 23:28:46 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,421494,421494#msg-421494</guid>
            <title>Optimizing it even more (8 replies)</title>
            <link>http://forums.mysql.com/read.php?32,421494,421494#msg-421494</link>
            <description><![CDATA[ Hi everyone,<br />
<br />
We have a MySQL 5.5.11 x64 runnin on a Windows 2008 R2 x64, and we use it as a data warehosue.<br />
So I optimize as max as I could but I think some more tweaking can be made.<br />
<br />
Let's say that we have a sessions table, which is an INNODB one, it has been partitioned by mounth and subpartitioned by bot_id, which is the id of our bots.<br />
It doesn't have Primary Key, but as it has an AUTO_INCREMENT column an index for that column has been set, but most of its columns are foreign key to dimensions (tables) without being set the constain, I mean, they are simple columns.<br />
<br />
On the other hand we have for example the agent dimension table, which basically has the primary key and the agent name.<br />
<br />
The SHOW TABLE STATUS for these tables are as follows:<br />
<br />
Name          Engine    Ver   Row_format Rows        AvgR   Data_lenght   Max  Index_lenght Data_free        Auto_inc   Create_time           UpT    ChT   Collation          ChSM  CO             Comment<br />
'd_agents',   'InnoDB', '10', 'Compact', '1243425',  '239', '297549824',  '0', '228589568', '2001731584',    '1262471', '2011-05-03 10:13:08', NULL, NULL, 'utf8_general_ci', NULL, '',            ''<br />
'f_sessions', 'InnoDB', '10', 'Compact', '39042058', '131', '5125947392', '0', '874774528', '1447251935232', '39233398', NULL,                 NULL, NULL, 'utf8_general_ci', NULL, 'partitioned', ''<br />
<br />
<br />
So We usually run queries like this:<br />
SELECT DISTINCT a.* FROM d_agents a INNER JOIN f_sessions s ON a.agent_id = s.agent_fk AND s.begin_timestamp &gt;= '2010-01-01 00:00:00' AND s.begin_timestamp &lt;= '2011-05-25 23:59:59' AND s.bot_fk IN (1,2,3,4,5,6,13);<br />
<br />
It takes a lot of time and what concerns me is that it takes a lot of diskspace when running the query(+400 GBytes), even though all the database weights only 20 GBytes.<br />
<br />
I changed the setting innodb_buffer_pool_size to 14G in order to help MySQL put the PrimaryKeys into memory.<br />
<br />
Is there any way to omptimize even more this kind of queries?<br />
If I set indexs for the fks in the session table, will they help the execution of the query?<br />
<br />
<br />
Thanks all of us in advance.]]></description>
            <dc:creator>Pepe Pardo</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Thu, 02 Jun 2011 08:13:12 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,419958,419958#msg-419958</guid>
            <title>Best Match from the left (1 reply)</title>
            <link>http://forums.mysql.com/read.php?32,419958,419958#msg-419958</link>
            <description><![CDATA[ Suppose i have a list of numbers<br />
2243432422<br />
5645343425<br />
4252524245<br />
6756345234<br />
7809080594<br />
<br />
<br />
And i got number 56488888888.<br />
<br />
<br />
I want the query to get 5645343425 which is the best match from the left. Is there a query that can do that ?? <br />
<br />
<br />
I tried with query below but no luck.<br />
<br />
SELECT * FROM table<br />
WHERE number LIKE CONCAT('56488888888', '%');]]></description>
            <dc:creator>goh juno</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Wed, 18 May 2011 05:40:05 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,417048,417048#msg-417048</guid>
            <title>LOAD DATA INFILE  with SET commad (1 reply)</title>
            <link>http://forums.mysql.com/read.php?32,417048,417048#msg-417048</link>
            <description><![CDATA[ hi I need to insert bb.csv file data which col1=12<br />
cat bb.csv<br />
1,1,41|<br />
19,2,40662432|<br />
12,3,47006477|<br />
19,4,46124433|<br />
<br />
table:<br />
+-----------------+----------+------+-----+---------+-------+<br />
| Field           | Type     | Null | Key | Default | Extra |<br />
+-----------------+----------+------+-----+---------+-------+<br />
| service_key     | int(6)   | YES  |     | NULL    |       |<br />
| subscriber_type | char(6)  | YES  |     | NULL    |       |<br />
| call_type       | char(15) | YES  |     | NULL    |       |<br />
+-----------------+----------+------+-----+---------+-------+<br />
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++=<br />
<br />
commad: <br />
mysql&gt;LOAD DATA INFILE '/data/PREPAID/gg.csv' INTO TABLE  test fields TERMINATED BY ',' LINES TERMINATED BY '|' (service_key,@subscriber_type,call_type) SET service_key =12;<br />
<br />
Query OK, 4 rows affected, 1 warning (0.24 sec)<br />
Records: 4  Deleted: 0  Skipped: 0  Warnings: 0<br />
<br />
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++<br />
here all the data will inserted. how to avoid this issue. i need only one record to be inserted.<br />
great help<br />
thanks<br />
great]]></description>
            <dc:creator>chiko Ratnayake</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Wed, 20 Apr 2011 20:29:25 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,416537,416537#msg-416537</guid>
            <title>Data mining in MySQL (no replies)</title>
            <link>http://forums.mysql.com/read.php?32,416537,416537#msg-416537</link>
            <description><![CDATA[ I search for data mining module tightly integrated with MySQL.<br />
(like Oracle Data Mining integrate with Oracle Database, IBM Intelligent Miner integrated with IBM DB2...)<br />
<br />
Does MySQL support data mining functionality itself?<br />
<br />
Or, is there any data mining system tightly integrate with MySQL?]]></description>
            <dc:creator>Ilyeop Yi</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Fri, 15 Apr 2011 05:45:40 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,415212,415212#msg-415212</guid>
            <title>Data Warehouse In Web Development (2 replies)</title>
            <link>http://forums.mysql.com/read.php?32,415212,415212#msg-415212</link>
            <description><![CDATA[ Hi anybody.<br />
First pleas let me know if I didn't post in correct forum.<br />
I developed a website using php and MySQL.Now I want to show the amount of each record or each attribute and value of my columns where user browse my categories or search my sit.<br />
For example please see this example <a href="http://www.amazon.com/s/ref=amb_link_83353631_23?ie=UTF8&amp;node=565108&amp;brand=lenovo&amp;pf_rd_m=ATVPDKIKX0DER&amp;pf_rd_s=left-1&amp;pf_rd_r=006DKMQ9KADV5MQ1WBB9&amp;pf_rd_t=101&amp;pf_rd_p=1292034742&amp;pf_rd_i=565108"  rel="nofollow">http://www.amazon.com/s/ref=amb_link_83353631_23?ie=UTF8&amp;node=565108&amp;brand=lenovo&amp;pf_rd_m=ATVPDKIKX0DER&amp;pf_rd_s=left-1&amp;pf_rd_r=006DKMQ9KADV5MQ1WBB9&amp;pf_rd_t=101&amp;pf_rd_p=1292034742&amp;pf_rd_i=565108</a><br />
<br />
there are another example,just brows  some category or search some things:<br />
<a href="http://cars.oodle.com/buick/dallas-area/"  rel="nofollow">http://cars.oodle.com/buick/dallas-area/</a><br />
<br />
If you watch the left panel it showed how many record founded for each attribute.<br />
I wondered how amazon or other site get this information rapidly! I think it may used Warehousing ! <br />
<br />
I wanna to know  what is this technology and could I use it in MySQL fro my website?]]></description>
            <dc:creator>saeed sp</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Sat, 09 Apr 2011 10:00:59 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,415191,415191#msg-415191</guid>
            <title>Data Warehouse (no replies)</title>
            <link>http://forums.mysql.com/read.php?32,415191,415191#msg-415191</link>
            <description><![CDATA[ I want more informations about this topic.<br />
____________<br />
Nikolas.<br />
<a href="http://www.topsyssolutions.com"  rel="nofollow">payroll solutions</a>]]></description>
            <dc:creator>nikolas jones</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Tue, 05 Apr 2011 05:30:46 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,407328,407328#msg-407328</guid>
            <title>Load Data with Binary File (1 reply)</title>
            <link>http://forums.mysql.com/read.php?32,407328,407328#msg-407328</link>
            <description><![CDATA[ Hello All,<br />
<br />
I'd like to use the speed of the load data statement to load in binary data into a table. I've done it with a character flat file (with determined/known character delimitations) however, does anyone know if it works with binary data also?<br />
Thanks!]]></description>
            <dc:creator>Curtis Pastore</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Sun, 13 Feb 2011 05:21:51 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,407180,407180#msg-407180</guid>
            <title>simple question about dimension table (1 reply)</title>
            <link>http://forums.mysql.com/read.php?32,407180,407180#msg-407180</link>
            <description><![CDATA[ Hi all ,<br />
<br />
I am new to data warehouse world and need to know that I have a oltp structure where lets say there are 3 tables brand, category and product. Now in brand there is a field say name varchar(45) and in category there is also a field name varchar(45). So for OLTP it's different table and no problem.<br />
But now when I am creating dimention table &quot;product&quot; and combining all these 3 tables in one Product dimension so how to incorporate that 'name' field as it will be duplicated and mysql wont allow so should I change the name to something meaningful or what is the dw concept for this.<br />
And one more question. say in OLTP product table it has 100 rows and in category 5 and in brand 3. so how to populate the &quot;product&quot; dimension table by taking data from all these 3 tables. <br />
Should I just run a query matching productID with category tables's product ID and then brandID with category tables brand ID and finally will get 100 rows in dimension table.<br />
<br />
Thank you. hope I make myself  clear.]]></description>
            <dc:creator>Maron Vomr@</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Sat, 12 Feb 2011 22:58:18 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,396650,396650#msg-396650</guid>
            <title>Performance problem in InnoDB (3 replies)</title>
            <link>http://forums.mysql.com/read.php?32,396650,396650#msg-396650</link>
            <description><![CDATA[ I'm currently using Mysql 5.1.49 community server for the datawarehouse.<br />
The performance of the server degrades quite dramatically for even a simple query.<br />
<br />
The fact table contains 10 million rows and the dimension tables contain only around 40k records.<br />
The query takes around 14 secs to execute.<br />
<br />
I want to show the query results in the graph and the user cannot wait for 14 secs.<br />
I want to show the graph in real time.<br />
As the where clause can vary depending on the user selection, I cannot cache all the queries in the WS layer.<br />
<br />
Is this the expected behavior ?<br />
<br />
<br />
mysql&gt;  select  date_key, time_key, sum(return_count)  <br />
from FACT_HOUR_SUBNET_TRACKER fact, B2BUnitDimension b2bDim, SubnetDimension subnetDim   <br />
where fact.b2bunit_key = b2bDim.b2bunit_key  and  <br />
fact.subnet_key = subnetDim.subnet_key  AND  <br />
subnetDim.PIDType  = 'LOCAL' and  <br />
b2bDim.customer_name = 'turktel'   <br />
group by  fact.date_key ,  fact.time_key;<br />
<br />
This query takes 13.26 secs<br />
<br />
mysql&gt; show table status like 'FACT_HOUR_SUBNET_TRACKER'\G<br />
*************************** 1. row ***************************<br />
           Name: FACT_HOUR_SUBNET_TRACKER<br />
         Engine: InnoDB<br />
        Version: 10<br />
     Row_format: Compact<br />
           Rows: 10224815<br />
 Avg_row_length: 58<br />
    Data_length: 602619904<br />
Max_data_length: 0<br />
   Index_length: 447791104<br />
      Data_free: 1024909639680<br />
 Auto_increment: 10530643<br />
    Create_time: NULL<br />
    Update_time: NULL<br />
     Check_time: NULL<br />
      Collation: latin1_swedish_ci<br />
       Checksum: NULL<br />
 Create_options: partitioned<br />
        Comment: Stores the Subnet Tracker hourly data for 1 month<br />
1 row in set (0.02 sec)<br />
<br />
<br />
mysql&gt; show create table FACT_HOUR_SUBNET_TRACKER;<br />
<br />
| FACT_HOUR_SUBNET_TRACKER | CREATE TABLE `FACT_HOUR_SUBNET_TRACKER` (<br />
  `id` bigint(20) NOT NULL AUTO_INCREMENT,<br />
  `creation_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP,<br />
  `day_of_month` tinyint(4) NOT NULL DEFAULT '1',<br />
  `time_key` smallint(5) unsigned DEFAULT NULL,<br />
  `date_key` int(10) unsigned DEFAULT NULL,<br />
  `subnet_key` mediumint(8) unsigned DEFAULT NULL,<br />
  `b2bunit_key` mediumint(8) unsigned DEFAULT NULL,<br />
  `nbr_of_local_seeders` mediumint(8) unsigned DEFAULT NULL,<br />
  `nbr_of_unique_sessions` mediumint(8) unsigned DEFAULT NULL,<br />
  `nbr_of_peers` mediumint(8) unsigned DEFAULT NULL,<br />
  `mbytes_uploaded` mediumint(8) unsigned DEFAULT NULL,<br />
  `mbytes_downloaded` mediumint(8) unsigned DEFAULT NULL,<br />
  `return_count` mediumint(8) unsigned DEFAULT NULL,<br />
  `receive_count` mediumint(8) unsigned DEFAULT NULL,<br />
  `local_down_traffic` decimal(10,2) unsigned DEFAULT NULL,<br />
  `peering_down_traffic` decimal(10,2) unsigned DEFAULT NULL,<br />
  `transit_down_traffic` decimal(10,2) unsigned DEFAULT NULL,<br />
  `local_up_traffic` decimal(10,2) unsigned DEFAULT NULL,<br />
  `peering_up_traffic` decimal(10,2) unsigned DEFAULT NULL,<br />
  `transit_up_traffic` decimal(10,2) unsigned DEFAULT NULL,<br />
  PRIMARY KEY (`id`,`day_of_month`),<br />
  KEY `idx_FACT_HOUR_SUBNET_TRACKER_lookup` (`creation_time`,`subnet_key`,`time_key`,`date_key`,`b2bunit_key`)<br />
) ENGINE=InnoDB AUTO_INCREMENT=10530643 DEFAULT CHARSET=latin1 COMMENT='Stores the Subnet Tracker hourly data for 1 month'<br />
/*!50100 PARTITION BY LIST (day_of_month)<br />
(PARTITION PDay1 VALUES IN (1) ENGINE = InnoDB,<br />
 PARTITION PDay2 VALUES IN (2) ENGINE = InnoDB,<br />
 PARTITION PDay3 VALUES IN (3) ENGINE = InnoDB,<br />
 PARTITION PDay4 VALUES IN (4) ENGINE = InnoDB,<br />
 PARTITION PDay5 VALUES IN (5) ENGINE = InnoDB,<br />
 PARTITION PDay6 VALUES IN (6) ENGINE = InnoDB,<br />
 PARTITION PDay7 VALUES IN (7) ENGINE = InnoDB,<br />
 PARTITION PDay8 VALUES IN (8) ENGINE = InnoDB,<br />
 PARTITION PDay9 VALUES IN (9) ENGINE = InnoDB,<br />
 PARTITION PDay10 VALUES IN (10) ENGINE = InnoDB,<br />
 PARTITION PDay11 VALUES IN (11) ENGINE = InnoDB,<br />
 PARTITION PDay12 VALUES IN (12) ENGINE = InnoDB,<br />
 PARTITION PDay13 VALUES IN (13) ENGINE = InnoDB,<br />
 PARTITION PDay14 VALUES IN (14) ENGINE = InnoDB,<br />
 PARTITION PDay15 VALUES IN (15) ENGINE = InnoDB,<br />
 PARTITION PDay16 VALUES IN (16) ENGINE = InnoDB,<br />
 PARTITION PDay17 VALUES IN (17) ENGINE = InnoDB,<br />
 PARTITION PDay18 VALUES IN (18) ENGINE = InnoDB,<br />
 PARTITION PDay19 VALUES IN (19) ENGINE = InnoDB,<br />
 PARTITION PDay20 VALUES IN (20) ENGINE = InnoDB,<br />
 PARTITION PDay21 VALUES IN (21) ENGINE = InnoDB,<br />
 PARTITION PDay22 VALUES IN (22) ENGINE = InnoDB,<br />
 PARTITION PDay23 VALUES IN (23) ENGINE = InnoDB,<br />
 PARTITION PDay24 VALUES IN (24) ENGINE = InnoDB,<br />
 PARTITION PDay25 VALUES IN (25) ENGINE = InnoDB,<br />
 PARTITION PDay26 VALUES IN (26) ENGINE = InnoDB,<br />
 PARTITION PDay27 VALUES IN (27) ENGINE = InnoDB,<br />
 PARTITION PDay28 VALUES IN (28) ENGINE = InnoDB,<br />
 PARTITION PDay29 VALUES IN (29) ENGINE = InnoDB,<br />
 PARTITION PDay30 VALUES IN (30) ENGINE = InnoDB,<br />
 PARTITION PDay31 VALUES IN (31) ENGINE = InnoDB) */ | <br />
 <br />
mysql&gt; explain select  date_key, time_key, sum(return_count)  <br />
from FACT_HOUR_SUBNET_TRACKER fact, B2BUnitDimension b2bDim, SubnetDimension subnetDim   <br />
where fact.b2bunit_key = b2bDim.b2bunit_key  and  <br />
fact.subnet_key = subnetDim.subnet_key  AND  <br />
subnetDim.PIDType  = 'LOCAL' and  <br />
b2bDim.customer_name = 'turktel'   <br />
group by  fact.date_key ,  fact.time_key\G<br />
*************************** 1. row ***************************<br />
           id: 1<br />
  select_type: SIMPLE<br />
        table: fact<br />
         type: ALL<br />
possible_keys: NULL<br />
          key: NULL<br />
      key_len: NULL<br />
          ref: NULL<br />
         rows: 10265643<br />
        Extra: Using temporary; Using filesort<br />
*************************** 2. row ***************************<br />
           id: 1<br />
  select_type: SIMPLE<br />
        table: subnetDim<br />
         type: eq_ref<br />
possible_keys: PRIMARY<br />
          key: PRIMARY<br />
      key_len: 8<br />
          ref: dwh.fact.subnet_key<br />
         rows: 1<br />
        Extra: Using where<br />
*************************** 3. row ***************************<br />
           id: 1<br />
  select_type: SIMPLE<br />
        table: b2bDim<br />
         type: eq_ref<br />
possible_keys: PRIMARY,idx_B2BUnitDimension_lookup<br />
          key: PRIMARY<br />
      key_len: 4<br />
          ref: dwh.fact.b2bunit_key<br />
         rows: 1<br />
        Extra: Using where<br />
3 rows in set (0.00 sec)<br />
<br />
<br />
mysql&gt; show variables like '%buffer%';<br />
+-------------------------+-------------+<br />
| Variable_name           | Value       |<br />
+-------------------------+-------------+<br />
| bulk_insert_buffer_size | 67108864    |<br />
| innodb_buffer_pool_size | 10737418240 |<br />
| innodb_log_buffer_size  | 8388608     |<br />
| join_buffer_size        | 9998336     |<br />
| key_buffer_size         | 3221225472  |<br />
| myisam_sort_buffer_size | 134217728   |<br />
| net_buffer_length       | 16384       |<br />
| preload_buffer_size     | 32768       |<br />
| read_buffer_size        | 2097152     |<br />
| read_rnd_buffer_size    | 16777216    |<br />
| sort_buffer_size        | 8388608     |<br />
| sql_buffer_result       | OFF         |<br />
+-------------------------+-------------+<br />
12 rows in set (0.00 sec)<br />
<br />
mysql&gt; show variables like '%tmp%';<br />
+-------------------+-------------+<br />
| Variable_name     | Value       |<br />
+-------------------+-------------+<br />
| max_tmp_tables    | 32          |<br />
| slave_load_tmpdir | /tmp        |<br />
| tmp_table_size    | 10737418240 |<br />
| tmpdir            | /tmp        |<br />
+-------------------+-------------+<br />
4 rows in set (0.00 sec)<br />
<br />
mysql&gt; show variables like '%heap%';<br />
+---------------------+-------------+<br />
| Variable_name       | Value       |<br />
+---------------------+-------------+<br />
| max_heap_table_size | 10737418240 |<br />
+---------------------+-------------+<br />
<br />
<br />
# uname -a<br />
Linux dataw1.kalixlab.be.alcatel-lucent.com 2.6.18-92.el5 #1 SMP Tue Jun 10 18:51:06 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux<br />
<br />
             total       used       free     shared    buffers     cached<br />
Mem:            19         19          0          0          0          6<br />
-/+ buffers/cache:         12          7<br />
Swap:            1          0          1]]></description>
            <dc:creator>Ganesh Ramasubramanian</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Wed, 08 Dec 2010 08:46:55 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,386015,386015#msg-386015</guid>
            <title>Which data warehouse for MySQL can run on Solaris 10? (3 replies)</title>
            <link>http://forums.mysql.com/read.php?32,386015,386015#msg-386015</link>
            <description><![CDATA[ Hello<br />
I'm using mysql 5.0 and 5.1 based on Solaris 10 server, and there are about 500GB data to query ,I want to try data warehouse in my application. <br />
But which data warehouse for MySQL can install and run on Solaris 10 (X86 or Space)? Please help to advice.<br />
<br />
Thanks a lot<br />
Justin]]></description>
            <dc:creator>justin ux</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Tue, 28 Sep 2010 17:54:45 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,374649,374649#msg-374649</guid>
            <title>MySQL data warehouse? Newbie (2 replies)</title>
            <link>http://forums.mysql.com/read.php?32,374649,374649#msg-374649</link>
            <description><![CDATA[ Hello,<br />
<br />
we have 2 MySQL databases of about 30 GB each (for 2 completely different products, the structure is very different). They are used as transactional databases.<br />
<br />
The thing is we also want to do complex analytical queries on this data and even though it's &quot;only&quot; 30GB some queries are simply not possible at all. For example grouping a non-indexed table by 3 different columns at once and joining it with 2 other tables, all of which have several million rows, and without using a where condition. But we really need this data and we need it within &lt; 1 second response time, so obviously we need a new solution. We didn't put indexes on the tables because it's actually a transactional DB and bad for OLTP performance (?), plus it probably wouldn't give us the sub-second response times we want either.<br />
<br />
My question is how we should go about setting up a data warehouse. Our requirements are as follows:<br />
<br />
- We would like to use a BI platform like MicroStrategy<br />
<br />
- If possible, we'd like to have real-time synchronization between the data warehouse and the operational MySQL DB. If not possible, hourly synchronization. But it has to be reliable.<br />
<br />
- The data warehouse should be synchronized with both (completely different) operational databases, plus some files that are generated by some scripts (web crawler stuff) - but those files can be updated like every day<br />
<br />
- The data from the data warehouse actually has to be DISaggregated, like a table with 3m rows has to be turned into 20m rows and to be made MORE granular that way. Just imagine you have a table &quot;slot_machine_displays&quot; and it includes a &quot;symbols&quot; string column including the ids for the symbols that showed up. Well, we want to turn that into a table &quot;symbols_that_showed_up&quot;, making it more granular. That's just an example, we're not a casino ;)<br />
<br />
- It should not be a pain to set up. For example, if I have to think about in detail what aggregate tables I want to create, that would be a pain. It would be nice to have a data warehouse where I just push a button and it loads all the data in real-time from the operational databases, and all analytical queries, even the most complex ones that analyze data for an entire year across 5 dimensions are carried out within seconds.<br />
<br />
-We're a company with 40 people and basically have no real budget to afford enterprise-level solutions. Anything &gt;5000 EUR would probably be not possible.<br />
<br />
I am COMPLETELY lost on this so maybe someone can explain to me in some detail of what we should explore in this situation.<br />
<br />
<br />
I already looked at InfoBright but I don't know how it's supposed to synchronize with the operational database? ETL tools like Talend seem to be able to do that but the enterprise edition is ridiculously expensive.<br />
<br />
Is there a really easy solution that I simply didn't realize? I mean isn't that the no.1 problem any small web company has? Having to analyze their MySQL data?]]></description>
            <dc:creator>Maximilian Seifert</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Mon, 30 Aug 2010 10:56:11 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,371064,371064#msg-371064</guid>
            <title>converting from InnoDB to MyISAM (2 replies)</title>
            <link>http://forums.mysql.com/read.php?32,371064,371064#msg-371064</link>
            <description><![CDATA[ Hi,<br />
<br />
I am converting my tables from InnoDB to MyISAM using Alter table command. one table which was 23000 records took few seconds to convert. I have other table which is approximately 220 million records, alter command is running since 5 days now and not finished yet. Is this a right behavior? Is there way to find out how much data is converted and if the script is really running and not hung somewhere?<br />
<br />
thanks,<br />
shyam]]></description>
            <dc:creator>shyam ahuja</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Wed, 25 Aug 2010 15:02:57 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,367222,367222#msg-367222</guid>
            <title>datawarehouse ETL rolls back after 5 GB of data transfer (5 replies)</title>
            <link>http://forums.mysql.com/read.php?32,367222,367222#msg-367222</link>
            <description><![CDATA[ Hi,<br />
<br />
I am running ETL which is Extracting Data from staging table and inserting into Datawarehouse DB. I have around 10 GB of data in one table. After transferring 5 GB of data it rolls back the transaction and deletes everything from table. Any feedback?<br />
<br />
 Here is ETL<br />
--Procedure: new_etl1<br />
<br />
--DROP PROCEDURE IF EXISTS new_etl1;<br />
<br />
DELIMITER $$<br />
CREATE  PROCEDURE `new_etl1`()<br />
BEGIN<br />
       DECLARE ins1 varchar(15) default '';<br />
       DECLARE ins2 varchar(15) default '';<br />
       DECLARE ins3 varchar(15) default '';<br />
       DECLARE ins4 varchar(15) default '';<br />
       DECLARE ins5 varchar(15) default '';<br />
       DECLARE ins6 varchar(15) default '';<br />
       DECLARE ins7 varchar(15) default '';<br />
       DECLARE ins8 varchar(15) default '';<br />
       DECLARE ins9 varchar(15) default '';<br />
       DECLARE ins10 varchar(15) default '';<br />
       DECLARE j int;<br />
       DECLARE dw_max_id int  ;<br />
       DECLARE stg_max_id int  ;<br />
       DECLARE dw_max_id_1 int ;<br />
       DECLARE stg_max_id_1 int ;<br />
       DECLARE log1 varchar(15) default'';<br />
       DECLARE l_log numeric ;<br />
<br />
DECLARE EXIT HANDLER FOR SQLEXCEPTION,SQLWARNING,NOT FOUND<br />
	select concat<br />
         (<br />
           'Erorr Inserting Data into DWH Table  ',<br />
           case <br />
             when ins1='' then 'Dw_log_presentation' <br />
             when ins2='' then 'Dw_log_smil_sequence' <br />
             when ins3='' then 'Dw_log_smil_context' <br />
             when ins4='' then 'Dw_theaterexhibitor '<br />
             when ins5='' then 'Dw_log_processor'<br />
             when ins6='' then 'Dw_log_day_control ' <br />
             when ins7='' then 'Dw_Files ' <br />
             when ins8='' then 'Dw_log_File'<br />
             when ins9='' then 'Dw_log_playlog' <br />
             when ins10='' then 'Dw_playlog' <br />
           END,<br />
           ' Insertion  Failed'         );<br />
SET AUTOCOMMIT=0; <br />
START TRANSACTION; <br />
select    max(job_instance_no) into j from staging_db.job_details ;<br />
 <br />
insert into datawarehouse_db.dw_log_presentation <br />
                (Presentation_Id, Name)<br />
                select Presentation_Id,Name from staging_db.stg_log_presentation<br />
where Presentation_Id not in<br />
		(select Presentation_Id from  datawarehouse_db.dw_log_presentation);<br />
SELECT ROW_COUNT() into log1;<br />
 <br />
if log1 is null or log1=0<br />
then <br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_Status) values(j,log1,'Dw_log_presentation','No Duplicate Entry Allowed');<br />
else<br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_status) values(j,log1,'Dw_log_presentation','Completed');<br />
end if;<br />
        <br />
SET ins1='Done';<br />
 <br />
insert into datawarehouse_db.dw_log_smil_sequence <br />
                (SEQUENCE_ID, NAME)<br />
                select SEQUENCE_ID, NAME from staging_db.stg_log_smil_sequence<br />
where SEQUENCE_ID not in<br />
		(select SEQUENCE_ID from  datawarehouse_db.dw_log_smil_sequence);<br />
 <br />
SELECT ROW_COUNT() into log1;<br />
 <br />
if log1 is null or log1=0<br />
then <br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_Status) values(j,log1,'Dw_log_smil_sequence ','No Duplicate Entry Allowed ');<br />
else<br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_status) values(j,log1,'Dw_log_smil_sequence','Completed');<br />
end if;<br />
        <br />
SET ins2='Done';<br />
 <br />
insert into datawarehouse_db.dw_log_smil_context <br />
                (SMIL_ID, REF_ID)<br />
                select    SMIL_ID, REF_ID from staging_db.stg_log_smil_context<br />
where SMIL_ID not in<br />
		(select SMIL_ID from  datawarehouse_db.dw_log_smil_context);<br />
 <br />
SELECT ROW_COUNT() into log1;<br />
if log1 is null or log1=0<br />
then <br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_Status) values(j,log1,'Dw_log_smil_context','No Duplicate Entry Allowed ');<br />
else<br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_status) values(j,log1,'Dw_log_smil_context','Completed');<br />
end if;<br />
SET ins3='Done';<br />
insert into datawarehouse_db.dw_theaterexhibitor <br />
                (PROCESSOR, THEATER, EXHIBITOR)<br />
select    PROCESSOR, THEATER, EXHIBITOR <br />
                from <br />
                staging_db.stg_theaterexhibitor <br />
where  PROCESSOR not in<br />
		(select  PROCESSOR from  datawarehouse_db.dw_theaterexhibitor );<br />
 <br />
SELECT ROW_COUNT() into log1;<br />
if log1 is null or log1=0<br />
then <br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_Status) values(j,log1,'Dw_theaterexhibitor','No Duplicate Entry Allowed ');<br />
else<br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_status) values(j,log1,'Dw_theaterexhibitor','Completed');<br />
end if;<br />
SET ins4='Done';<br />
insert into datawarehouse_db.dw_log_processor <br />
                (Processor_Id, Name)<br />
select    Processor_Id, Name <br />
                from <br />
                staging_db.stg_log_processor <br />
where  Processor_Id not in<br />
		(select  Processor_Id from datawarehouse_db.dw_log_processor );<br />
SELECT ROW_COUNT() into log1;<br />
if log1 is null or log1=0<br />
then <br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_Status) values(j,log1,'Dw_log_processor','No Duplicate Entry Allowed ');<br />
else<br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_status) values(j,log1,'Dw_log_processor','Completed');<br />
end if;<br />
SET ins5='Done';<br />
insert into datawarehouse_db.dw_log_day_control <br />
                (LDC_Id, Log_Date, Pending, Node_Id)<br />
select    LDC_Id, Log_Date, Pending, Node_Id <br />
                from <br />
                staging_db.stg_log_day_control <br />
where  LDC_Id not in<br />
		(select  LDC_Id from datawarehouse_db.dw_log_day_control);<br />
SELECT ROW_COUNT() into log1;<br />
if log1 is null or log1=0<br />
then <br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_Status) values(j,log1,'Dw_log_day_control','No Duplicate Entry Allowed ');<br />
else<br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_status) values(j,log1,'Dw_log_day_control','Completed');<br />
end if;<br />
SET ins6='Done';<br />
select    max(Entry_Id) into dw_max_id from datawarehouse_db.dw_log_playlog ;<br />
select    max(Entry_Id) into stg_max_id from staging_db.stg_log_playlog ;<br />
if dw_max_id is null then<br />
                select    MIN(Entry_Id) into dw_max_id from staging_db.stg_log_playlog ;<br />
End if;<br />
if stg_max_id is null then<br />
                set stg_max_id=0;<br />
End if;<br />
set log1 = 0 ;<br />
                while dw_max_id &lt; stg_max_id<br />
                DO<br />
                SET @s =CONCAT(&quot;insert into datawarehouse_db.dw_log_playlog <br />
                (Entry_Id, File_Id, Processor_Id, Sequence_Id, SMIL_Id, Play_Time)<br />
select    Entry_Id, File_Id, Processor_Id, Sequence_Id, SMIL_Id, Play_Time <br />
                from staging_db.stg_log_playlog <br />
where stg_log_playlog.Entry_Id  &gt; &quot; , dw_max_id, &quot; and stg_log_playlog.Entry_Id &lt;= &quot; ,dw_max_id+10000);<br />
                PREPARE stmt1 FROM @s; <br />
                EXECUTE stmt1; <br />
                SELECT ROW_COUNT() into l_log;<br />
                if l_log is not null then<br />
                                set log1 = log1 + l_log ;<br />
                end if ;<br />
                DEALLOCATE PREPARE stmt1; <br />
SET dw_max_id=dw_max_id+10000;<br />
                <br />
END WHILE;<br />
if log1 is null or log1=0<br />
then <br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_Status) values(j,log1,'Dw_log_playlog','No Duplicate Entry Allowed ');<br />
else<br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_status) values(j,log1,'Dw_log_playlog','Completed');<br />
end if;<br />
SET ins7='Done';<br />
insert into datawarehouse_db.dw_files <br />
                (LONG_NAME, SHORT_NAME, MOVIE, STUDIO)<br />
select LONG_NAME, SHORT_NAME, MOVIE, STUDIO from   staging_db.stg_files <br />
where LONG_NAME not in<br />
		(select LONG_NAME from datawarehouse_db.dw_files );<br />
SELECT ROW_COUNT() into log1;<br />
if log1 is null or log1=0<br />
then <br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_Status) values(j,log1,'Dw_files ','No Duplicate Entry Allowed ');<br />
else<br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_status) values(j,log1,'Dw_files ','Completed');<br />
end if;<br />
SET ins8='Done';<br />
insert ignore into datawarehouse_db.dw_log_file <br />
                (File_Id, Presentation_Id, File_Name)<br />
select    File_Id, Presentation_Id, File_Name<br />
                from <br />
                staging_db.stg_log_file <br />
where File_Id not in<br />
		(select File_Id from datawarehouse_db.dw_log_file);<br />
SELECT ROW_COUNT() into log1;<br />
if log1 is null or log1=0<br />
then <br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_Status) values(j,log1,'Dw_log_file  ','No Duplicate Entry Allowed ');<br />
else<br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_status) values(j,log1,'Dw_log_file ','Completed');<br />
end if;<br />
SET ins9='Done';<br />
 <br />
select    max(Entry_Id) into dw_max_id_1 from datawarehouse_db.dw_playlog;<br />
select    max(Entry_Id) into stg_max_id_1 from staging_db.stg_log_playlog;<br />
if dw_max_id_1 is null then<br />
                select    MIN(Entry_Id) into dw_max_id_1 from staging_db.stg_log_playlog ;<br />
End if;<br />
if stg_max_id_1 is null then<br />
                set stg_max_id_1=0;<br />
End if;<br />
set log1 = 0 ;<br />
                while dw_max_id_1 &lt; stg_max_id_1<br />
                DO<br />
                SET @s =CONCAT(&quot;insert into datawarehouse_db.dw_playlog <br />
	(Entry_Id, Play_Time, File_Name, Processor_Name, THEATER, EXHIBITOR, Presentation_Name, Presentation_Id, File_id, processor_id, short_name, movie, studio, Node_id, Log_date	)	<br />
SELECT stg_log_playlog.Entry_Id<br />
,stg_log_playlog.Play_Time<br />
 , stg_log_file.File_Name<br />
 , stg_log_processor.Name<br />
 , stg_theaterexhibitor.THEATER<br />
 , stg_theaterexhibitor.EXHIBITOR<br />
 , stg_log_presentation.Name<br />
,stg_log_presentation.Presentation_Id<br />
,stg_log_file.File_Id<br />
,stg_log_processor.Processor_Id<br />
,stg_files.SHORT_NAME<br />
,stg_files.MOVIE<br />
,stg_files.STUDIO<br />
,stg_log_day_control.Node_Id<br />
,stg_log_day_control.Log_Date<br />
FROM<br />
 ((stg_log_file stg_log_file<br />
 INNER JOIN<br />
 stg_log_presentation stg_log_presentation<br />
 ON (stg_log_file.Presentation_Id = stg_log_presentation.Presentation_Id))<br />
 INNER JOIN<br />
 stg_log_playlog stg_log_playlog<br />
 ON (stg_log_playlog.File_Id = stg_log_file.File_Id))<br />
 INNER JOIN<br />
 stg_log_processor stg_log_processor<br />
 ON (stg_log_playlog.Processor_Id = stg_log_processor.Processor_Id)<br />
left JOIN  stg_theaterexhibitor  stg_theaterexhibitor <br />
 ON (stg_log_processor.Name=stg_theaterexhibitor.PROCESSOR)<br />
left JOIN  stg_files  stg_files<br />
 ON (stg_log_file.File_Name=stg_files.LONG_NAME)<br />
left JOIN stg_log_day_control  stg_log_day_control<br />
 ON (stg_log_file.File_Id=stg_log_day_control.LDC_Id)<br />
where (stg_log_playlog.File_Id = stg_log_file.File_Id)<br />
 AND (stg_log_file.Presentation_Id = stg_log_presentation.Presentation_Id)<br />
 AND (stg_log_playlog.Processor_Id = stg_log_processor.Processor_Id)<br />
 AND  stg_log_playlog.Entry_Id  &gt; &quot; , dw_max_id_1, &quot; and stg_log_playlog.Entry_Id &lt;= &quot; ,dw_max_id_1+10000);<br />
                PREPARE stmt1 FROM @s; <br />
                EXECUTE stmt1; <br />
SELECT ROW_COUNT() into l_log;<br />
                if l_log is not null then<br />
                                set log1 = log1 + l_log ;<br />
                end if ;<br />
                DEALLOCATE PREPARE stmt1; <br />
SET dw_max_id_1=dw_max_id_1+10000;         <br />
END WHILE;<br />
 <br />
if log1 is null or log1=0<br />
then <br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_Status) values(j,log1,'Dw_playlog ','No Duplicate Entry Allowed ');<br />
else<br />
insert into staging_db.dw_logs(job_id,num_rows,Table_Name,Updated_status) values(j,log1,'Dw_playlog','Completed');<br />
end if;<br />
 <br />
SET ins10='Done';<br />
<br />
COMMIT; <br />
 <br />
END$$<br />
<br />
DELIMITER ;]]></description>
            <dc:creator>shyam ahuja</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Fri, 21 May 2010 05:05:42 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,359817,359817#msg-359817</guid>
            <title>Stored procedure and connexion to another MySQL DB (2 replies)</title>
            <link>http://forums.mysql.com/read.php?32,359817,359817#msg-359817</link>
            <description><![CDATA[ Hi,<br />
I would like to retrieve some information from another MySQL database, modify data and load it in a MySQL DB.<br />
To do it regularly, I'll use a stored procedure with an event.<br />
So, is it possible to create a stored procedure in database 1 which will connect and get information from database 2 ? And what is the syntax to do it ?<br />
<br />
With this method, I have many other questions : <br />
   * If the 2 databases are on the same server, can I have 2 versions of MySQL ? Because the database source use an older version of MySQL which does not support the Event Scheduler&quot;.<br />
   * Is my solution possible if the 2 databases are not on the same server ?<br />
<br />
Thanks all,<br />
I look forward]]></description>
            <dc:creator>discmat discmat</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Sun, 04 Apr 2010 16:05:04 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,357475,357475#msg-357475</guid>
            <title>mysql or hadoop? (no replies)</title>
            <link>http://forums.mysql.com/read.php?32,357475,357475#msg-357475</link>
            <description><![CDATA[ hi,<br />
<br />
i'm working on a data warehouse and am deciding whether to use hadoop or mysql.<br />
<br />
the dataset is currently likely to be no bigger than 40gb for the first year, then perhaps 80gb for the next year, and possibly 120gb the year after.<br />
<br />
we want to be able to query all of the data at any point in the future - we aren't interested in throwing data away since we can't envisage how we might want to use it.<br />
<br />
so, would mysql be the right choice? i don't need high availability since this will be a back-office application, and the number of different queries won't be problematic - if staff want to perform queries, they don't really need real-time results. <br />
<br />
how large can mysql scale for this kind of thing? should i perhaps look at something like hadoop instead? i'm feeling it'd be overkill today, but how would mysql deal with a dataset of ~ 300gb? <br />
<br />
any help would be appreciated.<br />
<br />
thanks]]></description>
            <dc:creator>sanjo cole</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Mon, 08 Mar 2010 15:54:54 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,353637,353637#msg-353637</guid>
            <title>Large Database Question (7 replies)</title>
            <link>http://forums.mysql.com/read.php?32,353637,353637#msg-353637</link>
            <description><![CDATA[ This is a continuation of post: <a href="http://forums.mysql.com/read.php?22,353250,353625#msg-353625"  rel="nofollow">http://forums.mysql.com/read.php?22,353250,353625#msg-353625</a> <br />
<br />
<br />
I'm sorry I wasn't clear. 3 to 4 GROUPS of inserts per minute after the database is initially populated.  There could be 100s of rows to insert per minute for some of these tables.  Also, after new rows are inserted, updates do have to be done, which if I'm interpreting correctly from your previous response would block any selects on a MyIsam table?<br />
<br />
I am using an auto increment for one of the crucial tables in this project currently so I'm glad to hear that using this is extremely cheap.<br />
<br />
Initially, when the database is first being populated, there will be thousands if not tens of thousands of rows to insert per minute, but during this process no one will need to access them except the program that initially inserted them.<br />
<br />
Again, sorry, but I'm still working on an example we can use to discuss the design of this database.]]></description>
            <dc:creator>John McGiles</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Sat, 13 Feb 2010 21:38:08 +0000</pubDate>
        </item>
        <item>
            <guid>http://forums.mysql.com/read.php?32,297910,297910#msg-297910</guid>
            <title>Importing data to MySQL database (4 replies)</title>
            <link>http://forums.mysql.com/read.php?32,297910,297910#msg-297910</link>
            <description><![CDATA[ Hello,<br />
<br />
I am working on MySQL database. My requirement is to import data from the following format to MySQL database. The data format is<br />
 &quot;----[/DHS3dyn/account/TAXAYQEK.DAT : Ticket number 8/8/100]---------------------<br />
<br />
00) TicketVersion = ED5.1 (01) CalledNumber = 00138313407<br />
<br />
(02) ChargedNumber = 32370 (03) ChargedUserName = TEST PERF 202<br />
<br />
(04) ChargedCostCenter = C42 (05) ChargedCompany =<br />
<br />
(06) ChargedPartyNode = 202 (07) Subaddress =<br />
<br />
(08) CallingNumber =<br />
<br />
(09) CallType = PrivateNetworkOutgoingCallToPublicNetwork<br />
<br />
(10) CostType = ISDNCircuitSwitchedCall (11) EndDateTime = 20090908 10:06:52<br />
<br />
(12) ChargeUnits = 1 (13) CostInfo = 100<br />
<br />
(14) Duration = 519 (15) trunkIdentity = 29<br />
<br />
(16) TrunkGroupIdentity = 2 (17) TrunkNode = 202<br />
<br />
(18) PersonalOrBusiness = Normal (19) AccessCode =<br />
<br />
(20) SpecificChargeInfo = (21) BearerCapability = 3Audio<br />
<br />
(22) HighLevelComp = Unspecified (23) DataVolume = 0<br />
<br />
(24) UserToUserVolume = 0 (25) ExternFacilities =<br />
<br />
(26) InternFacilities = BasicCall ARSService<br />
<br />
(27) CallReference = 0 (28) SegmentsRate1 = 0<br />
<br />
(29) SegmentsRate2 = 0 (30) SegmentsRate3 = 0<br />
<br />
(31) ComType = Data (32) X25IncomingFlowRate = Unspecified<br />
<br />
(33) X25OutgoingFlowRate = Unspecified (34) Carrier = 1<br />
<br />
(35) InitialDialledNumber = 014001893 (36) WaitingDuration = 2<br />
<br />
(37) EffectiveCallDuration = 7 (38) RedirectedCallIndicator = 1<br />
<br />
(39) StartDateTime = 20090908 09:58:13 (40) ActingExtensionNumber = 65579<br />
<br />
(41) CalledNumberNode = 9999 (42) CallingNumberNode = 9999<br />
<br />
(43) InitialDialledNumberNode = 9999 (44) ActingExtensionNumberNode = 501<br />
<br />
(45) TransitTrunkGroupIdentity = 43696 (46) NodeTimeOffset = 0&quot;<br />
<br />
Can you please suggest some methodology to import this kind of data.]]></description>
            <dc:creator>velayudham jayakumar</dc:creator>
            <category>Data Warehouse</category>
            <pubDate>Sat, 06 Feb 2010 11:52:28 +0000</pubDate>
        </item>
    </channel>
</rss>
