[isf-wifidog] Thoughts on implementing an ORM in WifiDog

Benoit Gregoire bock at step.polymtl.ca
Mon May 22 15:02:12 EDT 2006


On Monday 22 May 2006 09:13, François Proulx wrote:
> I really think that in order to provide a scalable and lightweight
> solution, we will absolutely need to implement a good open source ORM
> solution in Wifidog (realistically I think we'll start doing it after
> v1.0, since it's a very critical change in the core).
>
> Why? Simply because if you analyze the flow on many pages (take
> hotspot_status.php for example), we generate dozens of SQL queries
> each time, not to mention that we could already be using persistent
> connections to PostgreSQL... (I think I'll investigate it today).
> Just to display the hotspot status there will be a ton of exactly
> identical queries (like node->getNetwork()). Since we do not have
> any cache, these eat a lot of resources. Also, managing the ORM is such a
> pain in the ass now... There are a few very, very good ORM solutions
> for PHP5 now, almost as good as Hibernate (in the J2EE community).
>
> Take these 2 examples :
>    - Doctrine : http://www.phpdoctrine.com/
>    - Propel : http://propel.phpdb.org/trac/
>
> These are the most actively developed ORM solutions now.
>
> Doctrine looks very interesting :
> http://www.phpdoctrine.com/comparison.php
>
> more food for thought...

We already use the active record pattern for most tables, so it would
be "reasonably easy" (as in only a few weeks of full-time work) to transition
to a sophisticated enough ORM (one did not exist when we started; Propel
definitely doesn't fit the bill, Doctrine fits like a glove but is very
young).

Benefits / what using Doctrine would buy us:
-Gives us essentially free object caching, across requests, and keeps the cache
in sync. How much of a real-world gain that would be remains to be seen (some
functions will actually get slower), but it's likely to be significant
overall.
-Slightly clearer cascading model for object creation and deletion.
-Easier to write setters and getters for simple properties.
-A custom content type could create its own table without disrupting the
system.
-We can be lazy about reusing object instances, as creating them isn't as
expensive.
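
To make the object caching point concrete, here is a minimal sketch of an
identity map, the general technique an ORM typically uses to avoid re-running
identical queries such as node->getNetwork(). It is written in Python only for
illustration, it is not Doctrine's API, and every name in it (Network,
NetworkRepository, fetch_row, FAKE_TABLE) is hypothetical. It also only caches
within one process; caching across requests would need a shared store, which
is not shown.

```python
class Network:
    """Plain object standing in for a row of a networks table."""

    def __init__(self, network_id, name):
        self.network_id = network_id
        self.name = name


class NetworkRepository:
    """Hypothetical loader with an identity map keyed by primary key."""

    def __init__(self, fetch_row):
        self._fetch_row = fetch_row  # callable standing in for a SQL SELECT
        self._cache = {}             # network_id -> Network instance
        self.query_count = 0         # number of simulated database hits

    def get(self, network_id):
        # Only hit the "database" the first time an id is requested.
        if network_id not in self._cache:
            self.query_count += 1
            row = self._fetch_row(network_id)
            self._cache[network_id] = Network(network_id, row["name"])
        return self._cache[network_id]

    def invalidate(self, network_id):
        # Must be called after an update so the cache stays in sync.
        self._cache.pop(network_id, None)


# Fake table contents, used instead of a real database connection.
FAKE_TABLE = {"default-network": {"name": "Ile Sans Fil"}}
repo = NetworkRepository(lambda network_id: FAKE_TABLE[network_id])

# Simulate a status page asking 90 nodes for their (shared) network.
networks = [repo.get("default-network") for _ in range(90)]
print(repo.query_count)
```

With the map in place, the 90 lookups share one object and one simulated
query. The cost is the invalidate() bookkeeping, which is the "keep the cache
in sync" part that an ORM would take over.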

What it would not buy us:
-It would NOT significantly reduce the amount of code or the number of methods
in classes.
-Free setters and getters (unless we use the ORM's accessors for database
fields, but then we lose access control, documentation and one abstraction
level).

Problems / what we would lose:
-Enforcement of referential integrity.
-Centralised updating of the database schema (we could keep it, but in
practice we would lose all the programming benefits of the ORM).  Every object
would have to handle its own versioning and updating.  That means (if we don't
want to add more SQL queries) that every row of every table has to carry a
schema version (not really a problem).  It may cause glitches for queries that
skip object instantiation for performance reasons, as the schemas wouldn't be
up to date after an update until an object of every type has been
instantiated.  Once done, it has a significant benefit:  development gets
easier since schema changes would have higher code locality.  That locality
is also a practical problem, as it makes it much more likely for someone to
assume that a field is not used anywhere else.
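
The per-object schema updating described above, and the glitch it can cause,
can be sketched as follows. This is an illustration under stated assumptions,
not how Doctrine does it: it uses Python and SQLite, and it stores one version
per table in a hypothetical schema_versions table rather than one version per
row as suggested above. All class, table and column names are made up.

```python
import sqlite3


class ContentType:
    """Hypothetical base class: each subclass owns its table and migrations."""

    table = None
    migrations = []    # migrations[i] upgrades the table from version i to i + 1
    _checked = set()   # classes whose schema was verified in this process

    def __init__(self, db):
        self.db = db
        cls = type(self)
        # The schema is only checked when the first object is instantiated.
        if cls not in ContentType._checked:
            cls._upgrade(db)
            ContentType._checked.add(cls)

    @classmethod
    def _upgrade(cls, db):
        db.execute("CREATE TABLE IF NOT EXISTS schema_versions "
                   "(table_name TEXT PRIMARY KEY, version INTEGER NOT NULL)")
        row = db.execute(
            "SELECT version FROM schema_versions WHERE table_name = ?",
            (cls.table,)).fetchone()
        current = row[0] if row else 0
        # Apply only the migrations this database has not seen yet.
        for statement in cls.migrations[current:]:
            db.execute(statement)
        db.execute("INSERT OR REPLACE INTO schema_versions VALUES (?, ?)",
                   (cls.table, len(cls.migrations)))
        db.commit()


class Picture(ContentType):
    """A custom content type whose schema changes live next to its code."""

    table = "content_picture"
    migrations = [
        "CREATE TABLE content_picture (id INTEGER PRIMARY KEY, url TEXT)",
        "ALTER TABLE content_picture ADD COLUMN width INTEGER",
    ]


def column_names(db, table):
    # Raw query that bypasses the objects, like a hand-optimized report.
    return [row[1] for row in db.execute(f"PRAGMA table_info({table})")]


db = sqlite3.connect(":memory:")
before = column_names(db, "content_picture")  # runs before any Picture exists
Picture(db)                                   # first instantiation upgrades
after = column_names(db, "content_picture")
print(before, after)
```

The raw query issued before the first Picture is created sees a schema that
has not been upgraded yet, which is the glitch mentioned above; after one
instantiation the table is current.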

So what are we trying to solve exactly?  On ISF, with 90 nodes and 18000
users, our CPU is used at about 15%, the vast majority of which is actually
OpenVPN's idle chat.  Scaling for basic operations (auth, login, counter
updates, pings) isn't much of an issue right now, and there is low-hanging
fruit to improve it if it becomes one.  UI speed is more significant, as
spikes of intense activity cause user-detectable slowdowns in certain
circumstances.  Reports are performance limited (not much we can do about
that, they already had a round of optimizations).  The content manager will
likely become the first real bottleneck, and it can definitely benefit from ORM
caching (but not as much as one may think, as the results of the most
commonly called methods are not determined only by that object's state).

So to summarize, Doctrine would have been an excellent idea if it had existed 3
years ago.  Using it now wouldn't change the signature of our current methods,
so a transition (and most importantly a progressive one) is at least
possible.  But transitioning would be a huge amount of non-functional, non
user-visible changes, most of which have nothing to do with the main expected
benefit (performance).  It would also tie up for weeks a developer who needs
to have near total understanding of every part of the system.

Do I think we should do it?  I think moving to an ORM is a logical step, but 
we must go all the way or we'll have endless problems.
Do I think we should do it soon?  I'll answer with my favorite quote (again): 

Don't use a computer to do things that can be done efficiently by hand.
Don't use hands to do things that can be done efficiently by a computer.
Get it to work first. Then make it work faster.
- Jon Bentley, Bell Labs

