Tuesday, 13 November 2007

hibernate hell

I'm afraid I don't like "Hibernate". Hibernate, when used incorrectly, can add a massive overhead to your project in terms of runtime efficiency and maintainability.

Imagine the scenario where the architects of an application decide to create the database schema first. This isn't necessarily a bad thing until you add Hibernate into the equation. EJB3, cool! So let's generate our objects from our database schema... bad mistake.

Hibernate is an O/R mapping tool. That means it allows developers to map objects to a relational database schema. By generating objects from a database schema, what you are actually doing is creating Relational Database Objects. What I mean is, rather than creating a valid object model that works well, you are simply creating a bad object model that mirrors your database schema. This might sound great and handy BUT... what you end up with won't be a very usable object model. The Hibernate tools will add annotations to your objects and will assume all kinds of fetching strategies (the JPA spec defaults to eager fetching for to-one associations and lazy fetching for collections, but I'm not sure what the generator actually does) and may also add associations between classes that are unnecessary or undesirable. For instance, when you add new use cases, are the fetching strategies still appropriate? Possibly not! So, if you take this approach, you need to go through the generated model with a fine-tooth comb, check that it supports all your currently known use cases, and try as best you can to make sure it can support your unknown future use cases (a very difficult task!).
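
To make the fetching point a bit more concrete, here is a rough plain-Java sketch of why a fetching strategy that suits one use case can hurt another. It does not use Hibernate or JPA at all; FakeDatabase, EagerOrder and LazyOrder are names invented purely for illustration, and the "database" just counts how often it is hit.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a real database: it only counts simulated queries.
final class FakeDatabase {
    private int queries = 0;

    List<String> loadOrderLines() {
        queries++;
        return Arrays.asList("widget", "gadget");
    }

    int queryCount() {
        return queries;
    }
}

// Eager: the association is loaded as soon as the object is built,
// whether or not the current use case ever looks at it.
final class EagerOrder {
    private final List<String> lines;

    EagerOrder(FakeDatabase db) {
        this.lines = db.loadOrderLines();
    }

    List<String> getLines() {
        return lines;
    }
}

// Lazy: the association is loaded on first access and then cached.
final class LazyOrder {
    private final Supplier<List<String>> loader;
    private List<String> lines;

    LazyOrder(FakeDatabase db) {
        this.loader = db::loadOrderLines;
    }

    List<String> getLines() {
        if (lines == null) {
            lines = loader.get();
        }
        return lines;
    }
}

final class FetchingDemo {
    public static void main(String[] args) {
        FakeDatabase db = new FakeDatabase();
        new EagerOrder(db);                     // pays for the lines immediately
        LazyOrder lazy = new LazyOrder(db);     // pays nothing yet
        System.out.println("queries before access: " + db.queryCount());
        lazy.getLines();
        System.out.println("queries after access: " + db.queryCount());
    }
}
```

A use case that only lists orders is better served by the lazy version, while one that always shows the lines may prefer the eager one. Which is right depends on the use case, and that is exactly the decision a generator cannot make for you.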

Personally, I don't like the "database up" design approach. It implies a waterfall approach from the outset, as once the database schema has been settled on it is usually very hard to change. A better approach is to think only in terms of your object model and not worry about your database schema until you really need to. At that point I can see Hibernate being more useful, in that it will be able to generate your schema for you. However, I wouldn't recommend this either, as it will most likely design you a schema that is rigidly tied to your current object model. Not only will you be stuck with your schema, but your object model won't be very flexible either.

I prefer to work with my object model and leave the RDBMS until the last possible moment. Since I do a lot of modular work, my database schema design ends up being very modular as well. This doesn't suit everyone, so a suggestion would be to take the object model in small chunks and design the schema at agreed stages during the project lifecycle. You will naturally group objects together in packages where cohesion between classes is high or the classes fall into a natural domain. Why should your database schema be any different? A lot of the time, though, this is hard if not impossible because of referential integrity - and rightly so! But if you want the same flexibility with your RDBMS schema as you do with your objects grouped into packages (low coupling), you end up with a lot of join tables. I don't have a problem with this, but tools like Hibernate cannot work that out for you.

Another option, if you really must use an RDBMS, is to work with an ODBMS until later in your project, when your object model has settled down and the risk of change in designing a database schema has been reduced. If you were to use something like DB4O, for instance, it will encourage you to respect your object model. With a sensible level of abstraction, it is not hard to create a layer that lets you replace DB4O with your data access code for your RDBMS schema.
(Furthermore, consider using stored procedures to further reduce the dependency on your database schema.)
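
As a minimal sketch of the kind of abstraction layer I mean (all names here are hypothetical, and the in-memory class is only a stand-in for an object-database-backed implementation, not DB4O's actual API), the application talks to a small interface and never to the storage technology directly:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical domain class: a plain object with no persistence concerns.
final class Customer {
    private final String name;

    Customer(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// The only persistence API the rest of the application is allowed to see.
interface CustomerStore {
    void save(Customer customer);

    Optional<Customer> findByName(String name);
}

// Stand-in for the early, object-store-backed implementation.
// An RDBMS-backed class could implement the same interface later on.
final class InMemoryCustomerStore implements CustomerStore {
    private final List<Customer> customers = new ArrayList<>();

    @Override
    public void save(Customer customer) {
        customers.add(customer);
    }

    @Override
    public Optional<Customer> findByName(String name) {
        for (Customer c : customers) {
            if (c.getName().equals(name)) {
                return Optional.of(c);
            }
        }
        return Optional.empty();
    }
}

final class CustomerStoreDemo {
    public static void main(String[] args) {
        CustomerStore store = new InMemoryCustomerStore();
        store.save(new Customer("Ada"));
        System.out.println(store.findByName("Ada").isPresent());
    }
}
```

Swapping the storage then means writing one new class that implements CustomerStore, rather than touching every caller.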

So you may be thinking that it isn't fair for me to dislike Hibernate, but the reason I do is mainly that it allows developers/architects to take certain shortcuts which add complexity and a certain amount of rigidity to a project. In turn this makes it difficult to be agile, encourages a waterfall approach and makes it really hard to change your project's code at a later date, even if you just want to add something new.

It doesn't have to be that way, but given the availability of the tools, I suspect most developers (under project pressures) will take the easy way out just to satisfy their manager's continual (and understandable) desire for reduced timescales. It is a false economy.

If you're about to embark on something like this, either generating objects from a database schema or generating the schema from objects, consider NOT using the tools. Continue to use Hibernate, yes, but don't take the shortcuts, as in the end they'll only add overhead rather than save time and money.

Your turn... have you had a good experience with Hibernate generation tools? Did you have to spend a long time tweaking the objects/schema that was generated? Did you use Hibernate but manually annotate the objects/create mapping files? What approach worked best for you? What nightmares have you had?

6 comments:

German Viscuso said...

Hi!

Your explanation of the problems related to ORM (and particularly Hibernate) is excellent!
However, why replace db4o once everything is nicely set up with a cool ODBMS? =)

Best regards!

German Viscuso
db4o community manager

Chris Brind said...

Hi German!

Well, yes... I would agree, but apparently our customers would see it as a barrier as they would expect to be able to use an RDBMS (i.e. so they can run their own queries if they want to). I realise that DB4O can be used with reporting tools like Bert, but convincing management is very tricky. :(

But, I'm fully behind ODBMS and especially DB4O and intend to push it whenever I can. =)

Cheers,
Chris

Ced said...

Hi Chris,

For my past two projects, we've done it like this:
- write the hbm by hand
- generate the entities with "hbm2java"
- generate the database with "hbm2ddl"

So you are maintaining only one thing in the database layer: the HBMs.

I've integrated the generation using Maven, so on every build cycle I generate the entities (and I don't keep the entities in CVS/Subversion).

The only disadvantage of that approach is that you can't add behaviour to your entities (because of the generation cycle). To get around this (when I need to), I've used the proxy pattern, which encapsulates an entity and provides additional behaviour.
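
A rough sketch of that kind of wrapper, with made-up names (OrderEntity stands in for whatever hbm2java would regenerate on each build, and Order is the hand-written class that survives regeneration):

```java
// Hypothetical stand-in for a generated entity: data only, overwritten on every build.
class OrderEntity {
    private long quantity;
    private long unitPriceCents;

    public long getQuantity() {
        return quantity;
    }

    public void setQuantity(long quantity) {
        this.quantity = quantity;
    }

    public long getUnitPriceCents() {
        return unitPriceCents;
    }

    public void setUnitPriceCents(long unitPriceCents) {
        this.unitPriceCents = unitPriceCents;
    }
}

// Hand-written wrapper: holds the entity and adds the behaviour
// that cannot live in the generated class.
final class Order {
    private final OrderEntity entity;

    Order(OrderEntity entity) {
        this.entity = entity;
    }

    // Behaviour added on top of the generated data holder.
    long totalCents() {
        return entity.getQuantity() * entity.getUnitPriceCents();
    }

    // Exposes the underlying entity for the persistence layer.
    OrderEntity entity() {
        return entity;
    }
}

final class OrderWrapperDemo {
    public static void main(String[] args) {
        OrderEntity entity = new OrderEntity();
        entity.setQuantity(3);
        entity.setUnitPriceCents(250);
        System.out.println(new Order(entity).totalCents());
    }
}
```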

In my past experience the database is tightly coupled to only one application. Is there any reason why you talk about the inflexibility of the database structure with Hibernate if you are in that kind of scenario?

For me, designing the Hibernate entities is the same as doing the database design.

With an MDA approach (my preferred solution), you model the database/entities with UML, and behind the scenes it uses a combination of its own tools (AndroMDA) and the Hibernate tools. You don't get the problem of not being able to add behaviour to your model.

Obviously, by using an ORM library, the database layer is a black box. It is not a Java application and a separate database. You have to design/maintain the ORM layer instead of maintaining two separate things. Is that not a good thing? :-)

My 2 cents...
Ced.

Chris Brind said...

Hi Ced,

It sounds like you have a disciplined approach at least, but how do you know your database design is optimal?

What happens if you determine your database design is sub-optimal and you need to make an improvement somehow?

Also, what about other database-specific aspects like indexing and so on? Is this all done through your modelling? How do you know that what comes out the other end is valid?

Personally for those reasons alone, I think it's a bad idea to model both the database and application objects in a single place.

The best solution would be not to have a database at all, which is what using an ODBMS is like. It's just persistent storage for your app's objects. Furthermore, your queries are code: not just strings in code, but actually compilable. A good ODBMS (e.g. DB4O) will optimize the query, but the important point is that it gets checked by the compiler and not at runtime. Cool eh?
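
To illustrate what "queries are code" means, here is a small plain-Java sketch. It is not DB4O's API; Pilot and PilotStore are hypothetical names and the store is just a list in memory. The point is only that the query condition is an ordinary typed expression, so a misspelt property is a compile error rather than a runtime surprise.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical domain class.
final class Pilot {
    private final String name;
    private final int points;

    Pilot(String name, int points) {
        this.name = name;
        this.points = points;
    }

    String getName() {
        return name;
    }

    int getPoints() {
        return points;
    }
}

// Hypothetical in-memory store that accepts a query written as code.
final class PilotStore {
    private final List<Pilot> pilots = new ArrayList<>();

    void store(Pilot pilot) {
        pilots.add(pilot);
    }

    // The condition is a typed predicate, checked by the compiler.
    List<Pilot> query(Predicate<Pilot> condition) {
        List<Pilot> result = new ArrayList<>();
        for (Pilot p : pilots) {
            if (condition.test(p)) {
                result.add(p);
            }
        }
        return result;
    }
}

final class PilotQueryDemo {
    public static void main(String[] args) {
        PilotStore store = new PilotStore();
        store.store(new Pilot("Ada", 100));
        store.store(new Pilot("Bob", 80));

        // With a string query such as "from Pilot p where p.ponits > 90",
        // the typo would only show up at runtime. Here, writing getPonits()
        // would simply not compile.
        List<Pilot> best = store.query(p -> p.getPoints() > 90);
        System.out.println(best.size());
    }
}
```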

Seriously, RDBMS and ORM are out of date. On your next green-field project consider using DB4O instead of an RDBMS (you can still use reporting tools like Bert so don't let that put you off).

Cheers,
Chris

Ced said...

What do you mean by "disciplined"? eh? :-)

In terms of database optimisation (indexes, tablespaces, etc.), I would have a separate SQL script which I would run after the Hibernate hbm2ddl.

I'll have a look at db4o if I get the opportunity to ditch Oracle... So if I've understood correctly, by using DB4O you don't have a separate box? Is it all in the app server? What about scalability, load, etc.?


Ced.

Chris Brind said...

Hi Ced,

You have the option of both. DB4O is an enterprise-level database: you can either run it embedded in your app against a file, or run it standalone and use a client connection. Beyond opening the file or making the connection, the interface is identical.

You need to see their website for information on scaling, etc. And it works with .NET too (not that you'll be bothered about that, I suppose).

You'll love it! =)

Cheers,
Chris