[isf-wifidog] Thoughts on implementing an ORM in WifiDog
Benoit Gregoire
bock at step.polymtl.ca
Mon May 22 15:02:12 EDT 2006
On Monday 22 May 2006 09:13, François Proulx wrote:
> I really think that in order to provide a scalable and lightweight
> solution, we will absolutely need to implement a good open source ORM
> solution in Wifidog (realistically I think we'll start doing it after
> v1.0, since it's a very critical change in the core).
>
> Why? Simply because if you analyze the flow on many pages (take
> hotspot_status.php for example) we generate dozens of SQL queries
> each time, not to mention that we could as of now use persistent
> connections to PostgreSQL... (I think I'll investigate it today).
> Simply to display the hotspot status there will be a ton of exactly
> identical queries (like node->getNetwork()). Since we do not have
> any cache, these eat a lot of resources. Also, managing the ORM is such a
> pain in the ass now... There are a few very very good ORM solutions
> for PHP5 now, almost as good as Hibernate (in the J2EE community).
>
> Take these 2 examples :
> - Doctrine : http://www.phpdoctrine.com/
> - Propel : http://propel.phpdb.org/trac/
>
> These are the most actively developed ORM solutions now.
>
> Doctrine looks very interesting :
> http://www.phpdoctrine.com/comparison.php
>
> more food for thought...
We already use the active record pattern for most tables, so it would
be "reasonably easy" (as in only a few weeks of full-time work) to transition
to a sophisticated enough ORM. None existed when we started: Propel
definitely doesn't fit the bill, and Doctrine fits like a glove but is very
young.
Benefits / what using Doctrine would buy us:
-Gives us essentially free object caching across requests, and keeps the cache
in sync. How much of a real-world gain that would be remains to be seen (some
functions will actually get slower), but it's likely to be significant
overall.
-Slightly clearer cascading model for object creation and deletion.
-Easier to write setters and getters for simple properties.
-A custom content type could create its own table without disrupting the
system.
-We can be lazy about reusing object instances, as creating them isn't as
expensive.
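
To make the caching point concrete, here is a minimal sketch of the idea (an identity map: one load per object, repeated lookups served from memory). It is written in Python only to keep it short; WifiDog itself is PHP, and this is not Doctrine's API. IdentityMap, FAKE_DB, load_row and get_network are all hypothetical names, and the sketch only caches within one process, whereas the cross-request caching mentioned above would need shared storage.

```python
class IdentityMap:
    """Cache of loaded rows keyed by (kind, primary key). Hypothetical sketch."""

    def __init__(self, loader):
        self._loader = loader   # callable(kind, pk) -> row; stands in for a SQL query
        self._objects = {}
        self.queries = 0        # counts how often the loader was really called

    def get(self, kind, pk):
        key = (kind, pk)
        if key not in self._objects:
            self.queries += 1
            self._objects[key] = self._loader(kind, pk)
        return self._objects[key]

    def invalidate(self, kind, pk):
        # Must be called on every write, or the cache goes out of sync.
        self._objects.pop((kind, pk), None)


# Stand-in for the database; the contents are invented for the example.
FAKE_DB = {
    ("node", 1): {"name": "cafe", "network_id": 7},
    ("network", 7): {"name": "ISF"},
}


def load_row(kind, pk):
    return dict(FAKE_DB[(kind, pk)])


def get_network(cache, node_id):
    """Rough analogue of node->getNetwork(), going through the cache."""
    node = cache.get("node", node_id)
    return cache.get("network", node["network_id"])
```

Calling get_network() many times on the same node costs two loads in total instead of two per call, which is the kind of saving the hotspot status page would see. The invalidate() call is where the "keeps the cache in sync" work hides.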
What it would not buy us:
-A significant reduction in the amount of code or the number of methods
in classes.
-Free setters and getters (unless we use the ORM's accessors for database
fields, but then we lose access control, documentation and one abstraction
level).
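
A small sketch of why the setters and getters are not really free, again in Python for brevity and with entirely hypothetical names (Record stands in for a generic ORM record, it is not Doctrine's class). The generic accessor will happily write any column; the hand-written setter is where access control, validation and documentation live.

```python
class Record:
    """Stand-in for an ORM record with generic column accessors (hypothetical)."""

    def __init__(self, **columns):
        self._columns = dict(columns)

    def get(self, column):
        return self._columns[column]

    def set(self, column, value):
        # No access control and no validation: any caller can write any column.
        self._columns[column] = value


class User:
    """Domain object that keeps explicit accessors on top of the record."""

    def __init__(self, record):
        self._record = record

    def get_email(self):
        """Return the user's e-mail address."""
        return self._record.get("email")

    def set_email(self, value, acting_user_is_admin):
        """Change the e-mail address; only administrators may do this."""
        if not acting_user_is_admin:
            raise PermissionError("only an administrator may change the e-mail")
        if "@" not in value:
            raise ValueError("not an e-mail address: %r" % (value,))
        self._record.set("email", value)
```

Exposing Record.set() directly would save writing set_email(), at the cost of both checks and of the one place where the rule is documented.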
Problems / what we would lose:
-Enforcement of referential integrity.
-Centralised updating of the database schema (we could keep it, but we would in
practice lose all the programming benefits of the ORM). Every object would
have to have its own versioning and updating. That means (if we don't want
to add more SQL queries) that every row of every table has to carry a schema
version (not really a problem). It may cause glitches for queries that do
not instantiate objects for performance reasons, as the schemas wouldn't be
up to date after an update until an object of every type has been
instantiated. Once done, it has a significant benefit: it would make
development easier, since schema changes would have higher code locality. That
locality is also a practical problem, as it makes it much more likely for
someone to assume that a field is not used anywhere else.
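
The per-object versioning described above could look roughly like the following. This is a sketch under assumptions, not how Doctrine does it: the table is a plain dict, the two migrations are invented, and every name (NODE_SCHEMA_VERSION, upgrade_row, Node) is hypothetical. Python is used only to keep it short.

```python
NODE_SCHEMA_VERSION = 3


def _v1_to_v2(row):
    # Invented change: add a "status" column with a default value.
    row = dict(row)
    row["status"] = "up"
    return row


def _v2_to_v3(row):
    # Invented change: rename "name" to "display_name".
    row = dict(row)
    row["display_name"] = row.pop("name")
    return row


# Maps "version the row is at" to the step that brings it one version up.
NODE_MIGRATIONS = {1: _v1_to_v2, 2: _v2_to_v3}


def upgrade_row(row, migrations, target):
    """Apply migrations one step at a time until the row reaches target."""
    version = row["schema_version"]
    while version < target:
        row = migrations[version](row)
        version += 1
        row["schema_version"] = version
    return row


class Node:
    """Upgrades its own row lazily, the first time it is instantiated."""

    def __init__(self, table, pk):
        row = table[pk]
        if row["schema_version"] < NODE_SCHEMA_VERSION:
            row = upgrade_row(row, NODE_MIGRATIONS, NODE_SCHEMA_VERSION)
            table[pk] = row  # write the upgraded row back
        self.row = row
```

The schema changes for Node sit next to the Node class, which is the locality benefit. The glitch is also visible: any row that has not been instantiated yet is still in the old layout, so a raw query that bypasses Node may see either "name" or "display_name".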
So what are we trying to solve exactly? On ISF, with 90 nodes and 18000
users, our CPU is used at about 15%, the vast majority of which is actually
openvpn's idle chat. Scaling for basic operations (auth, login, counter
updates, pings) isn't much of an issue right now, and there is low-hanging
fruit to improve it if it becomes one. UI speed is more significant, as
spikes of intense activity cause user-detectable slowdowns in certain
circumstances. Reports are performance limited (not much we can do about
that, they already had a round of optimizations). The content manager will
likely become the first real bottleneck, and it can definitely benefit from ORM
caching (but not as much as one may think, as the results of the most
commonly called methods are not determined only by that object's state).
So to summarize, Doctrine would have been an excellent idea if it had existed 3
years ago. Using it now wouldn't change the signatures of our current methods,
so a transition (and most importantly a progressive one) is at least
possible. But transitioning would be a huge amount of non-functional, non
user-visible changes, most of which have nothing to do with the main expected
benefit (performance). It would also tie up for weeks a developer who needs
to have a near-total understanding of every part of the system.
Do I think we should do it? I think moving to an ORM is a logical step, but
we must go all the way or we'll have endless problems.
Do I think we should do it soon? I'll answer with my favorite quote (again):
Don't use a computer to do things that can be done efficiently by hand.
Don't use hands to do things that can be done efficiently by a computer.
Get it to work first. Then make it work faster.
- Jon Bentley, Bell Labs