Mickael Besson wrote:
> > An alternative would be to set up a
> repository
> > that describes you application in terms of
> > abstract datatypes etc, and use templates or
> > reports to generate platform specific code.
>
> Hi Roland,
> I seam to be very interested in your repository
> system.
> Please, could you give me more informations about
> this repository, and about templates ? Do you have
> some examples ?
> Thank you
The repository would be very useful in case you want to maintain portability for an extended period of time. For a single port, a migration in the true sense, it's much easier to use the migration toolkit, and I would certainly recommend you do that if that's your case. (I think to product suggested by John Moll would fall into the same category.)
As for the repository, I hope I can illustrate it using a short example. Let's limit ourselves to porting tables only. And, let's not think about being able to maintain version history of our schemata. One other assumption is that we will build the repository in MySQL. So, here we go:
/*
Datatype
This would contain a list rdbms-independant datatypes
You could use the datatype from the sql standard, or
a list of abstract datatypes you think of yourself.
It doesnt really matter as long as you can think of an implementation
for a specific rdbms platform afterwards.
*/
create table t_datatype(
id int unsigned not null auto_increment
, name varchar(128) not null
, constraint pk_datatype primary key(id)
, constraint uk_datatype unique(name)
);
/*
Schema
Just a container for our objects
*/
create table t_schema (
id int unsigned not null auto_increment
, name varchar(128) not null
, constraint pk_schema primary key(id)
, constraint uk_schema unique(name)
);
/*
Domain
datatype + seasoning.
Generally, you will encounter several instances
of identical datatype usage throughout the database.
For instance, you might give your tables and 'id' column,
or you might give several tables a 'description' column.
These columns would normally share several atributes.
Take for example the description column; the physical datatype
would be varchar, you might want to give each description column
a maximum of 255 characters, and you might want to specify that
the character set is always utf8.
A domain is a concept that lets you reuse such a datatype usage.
Domains will generally be specific to the schema or application
*/
create table t_domain (
id int unsigned not null auto_increment
, schema_id int unsigned not null
, datatype_id int unsigned not null
, name varchar(128) not null
, character_set_name varchar(128)
, collation_name varchar(128)
, precision int unsigned
, scale int unsigned
, constraint pk_domain primary key(id)
, constraint uk_domain unique(schema_id,name)
, constraint fk_domain_schema
foreign key (schema_id)
references t_schema (id)
, constraint fk_domain_datatype
foreign key (datatype_id)
references t_datatype (id)
);
/*
A relational table
Here, I'll just pretend that the table is just a container of columns
In reality, things could get pretty complicated here, depending upon
your requirements.
One of the difficulties here is to allow the repository to store
apsects having to do with the physical storage parameters.
This is quite a challenge: since every rdbms has it's own constructs
and concepts to implement storage, it's not very easy to find a
satisfactory structure to model this in a repository.
*/
create table t_table (
id int unsigned not null auto_increment
, schema_id int unsigned not null
, name varchar(128) not null
, constraint pk_table primary key(id)
, constraint uk_table unique(schema_id,name)
, constraint fk_table_schema
foreign key (schema_id)
references t_schema (id)
);
/*
Columns
The columns appearing in the table.
*/
create table t_column (
id int unsigned not null auto_increment
, table_id int unsigned not null
, name varchar(128) not null
, domain_id int unsigned not null
, is_nullable bit not null
, ordinal_position int unsigned not null
, constraint pk_table primary key(id)
, constraint uk_table unique(schema_id,name)
, constraint fk_column_table
foreign key (table_id)
references t_table (id)
, constraint fk_column_domain
foreign key (domain_id)
references t_domain (id)
);
Ok, now it's not hard to see how you could fill this repository so that it describes one or several database schemas. It would of course be nice to be able to fill it directly from a tool like MySQL workbench or DBDesigner (and I will give some hints into that direction), but a simple data-entry application would do fine too.
Now that we can store stuff into the repository, it's time to worry about how to get data out of our repository. What we want to do is to have something (a script, a procedure whatever) generate an XML document that contains an entire schema model. Here's an example using stored procedures:
delimiter //
create procedure p_export_schema(
p_name varchar(128)
)
begin
declare v_schema_id int unsigned;
declare exit handler for not found
select concat('no such schema: ',p_name);
--
-- get the schema id
--
select id
into v_schema_id
from t_schema
where name = p_name
;
--
-- generate the XML document
--
select '<?xml version="1.0"?>'
union all
select concat(
'<schema'
, ' id="',s.id,'"'
, ' name="',s.name,'"'
, '>'
)
from t_schema s
where s.id = v_schema_id
union all
select concat(
'<datatype'
, ' id="',d.id,'"'
, ' name="',d.name,'"'
, '/>'
)
from t_datatype d
union all
select concat(
'<domain'
, ' id="',d.id,'"'
, ' name="',d.name,'"'
, ' datatype_id="',d.datatype_id,'"'
, ' character_set_name="',coalesce(d.character_set_name,''),'"'
, ' collation_name="',coalesce(d.collation_name,''),'"'
, ' precision="',coalesce(d.precision,''),'"'
, ' scale="',coalesce(d.scale,''),'"'
, '/>'
)
from t_domain d
where d.id = v_schema_id
union all
select concat(
'<table'
, ' id="',t.id,'"'
, ' name="',t.name,'"'
, '>'
, group_concat(
concat(
'<column'
, ' id="',c.id,'"'
, ' name="',c.name,'"'
, ' ordinal_position="',c.ordinal_position,'"'
, ' domain_id="',c.domain_id,'"'
, ' is_nullable="',if(c.is_nullable,'Y','N'),'"'
, '/>'
)
order by c.ordinal_position
separator ''
)
, '</table>'
)
from t_table t
inner join t_column c
on t.id = c.table_id
where t.schema_id = v_schema_id
group by
union all
select concat(
'</schema>'
)
from t_schema s
where s.id= v_schema_id
;
end;
//
(Note that this is just an illustration - doing this properly would take some more time.)
So, this would give you xml documents like this:
<?xml version="1.0"?>
<schema
id="1"
name="myschema"
>
<datatype
id="1"
name="character varying"
/>
<datatype
id="2"
name="character large object"
/>
<datatype
id="3"
name="integer"
/>
<domain
id="1"
name="name"
datatype_id="1"
character_set_name="utf8"
collation_name="utf8_general_ci"
precision="64"
scale=""
/>
<domain
id="2"
name="description"
datatype_id="2"
character_set_name="utf8"
collation_name="utf8_general_ci"
precision="255"
scale=""
/>
<domain
id="3"
name="id"
datatype_id="3"
character_set_name=""
collation_name=""
precision=""
scale=""
/>
<table
id="1"
name="customer"
>
<column
id="1"
name="id"
ordinal_position="1"
domain_id="3"
is_nullable="N"
/>
<column
id="2"
name="lastname"
ordinal_position="2"
domain_id="1"
is_nullable="N"
/>
<column
id="3"
name="firstname"
ordinal_position="3"
domain_id="1"
is_nullable="Y"
/>
</table>
<table
id="2"
name="product"
>
<column
id="1"
name="id"
ordinal_position="1"
domain_id="3"
is_nullable="N"
/>
<column
id="2"
name="name"
ordinal_position="2"
domain_id="1"
is_nullable="N"
/>
<column
id="3"
name="description"
ordinal_position="3"
domain_id="2"
is_nullable="Y"
/>
</table>
</schema>
Now, we can write XSLT (see:
http://www.w3.org/TR/xslt) templates that translate this into code. For example, we can write an xslt like this to create oracle DDL for this schema (this is gonna look awful without indentation, sorry):
<?xml version="1.0" ?>
<xsl:stylesheet
version="1.0"
xmlns:xsl="
http://www.w3.org/1999/XSL/Transform"
>
<!--====================================================
We will be outputting Oracle SQL DDL,
so we need plain text
=====================================================-->
<xsl:output
method="text"
/>
<!--====================================================
We compensate the lack of physical storage properties
in the repository by parameterizing our generator
=====================================================-->
<xsl:param name="tablespace_name">users</xsl:param>
<!--====================================================
entry point for the XSL transformation
=====================================================-->
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<!--====================================================
match schema: do a CREATE USER statement.
apply other templates from there on
=====================================================-->
<xsl:template match="schema">
create user <xsl:value-of select="@name"/>
identified by <xsl:value-of select="@name"/>
/
<xsl:apply-templates select="table"/>
</xsl:template>
<!--====================================================
match table: do a CREATE TABLE statement.
=====================================================-->
<xsl:template match="table">
create table <xsl:value-of select="../@name"/>.<xsl:value-of select="@name"/> (
<xsl:for-each select="column">
<xsl:sort
data-type="number"
select="@ordinal_position"
/>
<xsl:variable name="domain_id" select="@domain_id"/>
<xsl:choose>
<xsl:when test="position()=1"><xsl:text> </xsl:text></xsl:when>
<xsl:otherwise><xsl:text> , </xsl:text></xsl:otherwise>
</xsl:choose>
<xsl:value-of select="@name"/>
<xsl:text> </xsl:text><xsl:apply-templates
select="/schema/domain[@id=$domain_id]"
/>
<xsl:text> </xsl:text><xsl:if test="@is_nullable='N'">NOT NULL</xsl:if></xsl:for-each>
)
tablespace <xsl:value-of select="$tablespace_name"/>
/
</xsl:template>
<!--====================================================
match domain: translate to physical datatype
=====================================================-->
<xsl:template match="domain">
<xsl:variable name="datatype_id" select="@datatype_id"/>
<xsl:variable name="datatype" select="/schema/datatype[@id=$datatype_id]"/>
<xsl:variable name="datatype-name" select="$datatype/@name"/>
<xsl:variable name="physical-datatype">
<xsl:choose>
<xsl:when test="$datatype-name='character varying'">
<xsl:choose>
<xsl:when test="@character_set_name='utf8'">NVARCHAR2</xsl:when>
<xsl:otherwise>VARCHAR2</xsl:otherwise>
</xsl:choose>(<xsl:value-of select="@precision"/>)</xsl:when>
<xsl:when test="$datatype-name='character large object'">
<xsl:choose>
<xsl:when test="@character_set_name='utf8'">NCLOB</xsl:when>
<xsl:otherwise>CLOB</xsl:otherwise>
</xsl:choose>(<xsl:value-of select="@precision"/>)</xsl:when>
<xsl:when test="$datatype-name='integer'">NUMBER</xsl:when>
</xsl:choose>
</xsl:variable>
<xsl:value-of select="$physical-datatype"/></xsl:template>
</xsl:stylesheet>
This XSLT will transform (or translate if you will) the xml into this:
create user myschema
identified by myschema
/
create table myschema.customer (
id NUMBER NOT NULL
, lastname NVARCHAR2(64) NOT NULL
, firstname NVARCHAR2(64)
)
tablespace users
/
create table myschema.product (
id NUMBER NOT NULL
, name NVARCHAR2(64) NOT NULL
, description NCLOB(255)
)
tablespace users
/
Of course, a similar XSLT to generate the mysql DDL is trivial.
But the fun does not stop here! You can put about anything in your XSLT. So, you can use the same mechanism to generate the code that makes up your data access layer. You would just need to write another XSLT template for each language (say, database SP's, java, C#.NET, C++, whatever) that you need to support.
Another nice thing is that XSLT has two built-in constructs that let you reuse your XSLTs, <xsl:import> and <xsl:include>. Also, XSLT's allow parameters so you can provide the transformation process with whatever extra info should your repository not provide it for whatever reason.
One final note. If you are in the situation that you can keep considering one implementation (either Oracle or MySQL) leading, you can take a shortcut and get away with not building a repository yourself. Both Oracle and MySQL (as of 5.0.2) have a data dictionary. So you could just write the code that exports the data dictionary to XML.
Oracle as of v9 has most of the stuff on board to export the metadata to XML directly.
For MySQL, you could do it too without too much of a hassle. The easiest way to export metadata in xml format would be to write a batch file that would query the tables of the information_schema database using the command line tool. The -X or --xml option will give you the information in an XML format. This will give you a separate xml document for each table in the information schema, but you can solve that in your xslt. In xslt, you can use the document() function to import a document.
Of course, this will increase the complexity of your xslt a bit, but you won't have to write your own xml generator.
Hope this points you into a direction,
Roland