Graham Fawcett wrote:
> Shalabh Chaturvedi wrote:
>
>> This is still a step up from something like Zope, where applications
>> written for ZODB cannot be (easily) ported to relational databases.
>
> I may have gone a little far when I suggested "abstracting away all
> persistence details" in my earlier rant. ZODB and any relational
> database have almost nothing in common, and abstracting their
> differences would be very tough.

I don't think it is *that* tough. If we use a 'data model' in our
application that can be mapped to both the object and relational models, I
believe it is possible. And not just at a theoretical level but practical
enough to be useful. I find the object model to be better at the
application level, which means the relational model needs to be mapped
into objects. In fact this is precisely what many object-relational
mappers do :)

>> I think that a data persistence interface by itself could be used
>> *instead* of components with pre-defined interfaces. For example,
>> instead of having a user management component that implements an
>> interface (i.e. an API consisting of a bunch of methods), there could
>> just be a user 'data schema' (i.e. consisting of a set of objectclasses
>> each consisting of a set of attributes).
>>
>> Now any application such as an issue tracker uses the standard
>> persistence interface to add and read user and group information. When
>> someone wants to deploy this application, he simply points it to the
>> correct objectclass in his environment, that may be implemented over any
>> data source. Now you have a new application working with *existing*
>> data. In fact there might be a separate application that adds and
>> deletes users so that function of the issue tracker may be turned off by
>> configuration.
>
> Hm... I was going to argue that an interface is necessary since, at the
> least, one needs lookup functions. Having a User object without a
> well-known 'UserSource.getUser(userid)' method isn't too helpful.
> But 'getUser' and the rest of the interface could be derived from the
> schema, couldn't it? SQLObject et al. use this approach, I believe.

You'd actually not derive any interface at all but just use a standard
data access interface to do the job. Let's say you have a User class:

    u = User.find_one(userid='shalabh')

would get you one User instance. Here find_one() is part of the data
access interface. To check the password, just access the attribute
u.passwd. In another application:

    issue = Issue.find_one(postid=33)

might return an Issue instance, on which you'd access and update
attributes.

Whether the data in User objects comes from a relational database or a
flat file or LDAP depends on how it has been configured at installation
(not development) time. All the application writer is interested in is
that User objects have at least a specific set of attributes such as
userid and passwd.

>> Taking the idea further, the schema doesn't even have to be pre-defined.
>> During deployment, the deployer may *create* a new objectclass from
>> existing ones simply by configuration (specifying joins, mapping
>> attribute names etc.). This new objectclass now exposes the correct
>> schema for the new application and they can be tied together.
>>
>> Sorry for going on and on,
>
> I'm eager to listen. ;-)
>
>> but I think the idea I am trying to push is to use a data-oriented
>> approach rather than an interface-oriented one. With interfaces, you
>> always need a pre-written component that implements the correct
>> interface.
>
> Yes, though not necessarily implementations. One can create an Interface
> class that specifies the required methods and attributes but does not
> implement them. Java, Zope3, PyProtocols/PEAK share this notion. Such
> Interfaces are declarative in nature, like schemas; not that I'm
> equating the two concepts.

Yes, but you're still doomed without an implementation. Let's say you have
an application using a User Management Interface.
You're happily working with relational databases and then you decide to
move to LDAP. Now you go about looking for a User Management
implementation for LDAP. Because you have not just User but also Group
Management Interface, Issue Tracker Interface and so on, the problem of
finding an implementation explodes as the number of applications grows.

In a data-oriented approach, you would have *one* configurable relational
data component, and *one* configurable LDAP data component. Let's call
these data providers. Since all an application is dependent on - whether a
User, Group or Issue Management application - is the 'schema', you
configure the data provider of your choosing to map the underlying data
into the schema required by the application.

Say, to move a User application from a relational database to LDAP:

1. You configure the LDAP provider to expose an objectclass which has two
   attributes - userid and passwd. The LDAP provider gives you the
   capability to map data in LDAP - perhaps by specifying location in LDAP
   and specific searches - into the standard data model.

2. You connect the User class to the above objectclass.

3. Now the same User code shown earlier works, as does the entire User
   application.

Central to the data-oriented approach is the notion of schema or
objectclass. By objectclass I mean a set of data attributes and their
types. For example I can say the 'user' objectclass has three attributes -
userid (a string), passwd (another string) and groups (a tuple of 'group'
objects). This is the data model that all applications use.

The success of a data-oriented approach depends on whether a standard data
model is possible which is useful to the application as well as mappable
to various data stores. I believe it is, though that is yet to be proven.

> In practice, do you conceive the data-oriented approach being sufficient
> in cases where the back-end systems are quite dissimilar?

Yes!
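
As a rough illustration of the relational-to-LDAP move described above,
here is a minimal sketch. It is not the actual implementation: the names
DictProvider, DataObject, attr_map and check_password are made up for this
example, plain lists of dicts stand in for a real SQL table and a real
LDAP directory, and the field names (login, pw, uid, userPassword) are
only illustrative. The point it tries to show is that the application code
depends on the attributes userid and passwd and nothing else.

```python
# Hypothetical sketch: none of these class or function names come from the
# design discussed in this thread; they exist only for illustration.

class DictProvider:
    """A stand-in data provider exposing records as one objectclass.

    `records` plays the role of the backing store (SQL rows, LDAP
    entries, ...). `attr_map` maps objectclass attribute names to the
    store's own field names; it is deployment-time configuration.
    """

    def __init__(self, records, attr_map):
        self.records = records
        self.attr_map = attr_map

    def find(self, **criteria):
        """Yield matching records, renamed to objectclass attributes."""
        for raw in self.records:
            obj = {attr: raw[field] for attr, field in self.attr_map.items()}
            if all(obj.get(k) == v for k, v in criteria.items()):
                yield obj


class DataObject:
    """The generic data access interface shared by all application classes."""

    provider = None  # connected at installation time, not development time

    def __init__(self, **attrs):
        self.__dict__.update(attrs)

    @classmethod
    def find_one(cls, **criteria):
        """Return the first matching object, or None if there is none."""
        for attrs in cls.provider.find(**criteria):
            return cls(**attrs)
        return None


class User(DataObject):
    """Application code relies only on the attributes userid and passwd."""


def check_password(userid, passwd):
    # Application logic: written once, unaware of the backing store.
    u = User.find_one(userid=userid)
    return u is not None and u.passwd == passwd


# Deployment 1: records shaped like rows of a relational table.
sql_rows = [{"login": "shalabh", "pw": "secret"}]
relational = DictProvider(sql_rows, {"userid": "login", "passwd": "pw"})

# Deployment 2: records shaped like LDAP entries (extra fields are ignored).
ldap_entries = [{"uid": "shalabh", "userPassword": "secret", "cn": "Shalabh"}]
ldap = DictProvider(ldap_entries, {"userid": "uid", "passwd": "userPassword"})

User.provider = relational
ok_relational = check_password("shalabh", "secret")

User.provider = ldap  # only the configuration changes
ok_ldap = check_password("shalabh", "secret")
```

Switching User.provider corresponds to steps 1 and 2 above; check_password
is untouched, which is step 3. A real provider would of course also need
to handle types, joins between objectclasses, updates and so on, none of
which is attempted here.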
> For example, it is not hard to imagine moving user management away from
> a relational database toward LDAP. What benefits might be gained from a
> data-oriented approach (cf. an interface-oriented one) in such a case?

Precisely the one described above :). No new code needs to be written.

> I may have misunderstood some of your ideas. I'm left with the notion
> that
>
> * a schema implies an interface (with create/retrieve/update/delete
>   functions),

By schema I mean one or more defined objectclasses. The term is taken from
the relational world where the schema of a table implies the list of
columns and their types. An application may use more than one objectclass
and this collection may be referred to as the schema required by the
application. This has to do with the structure of data only and does not
include any interface. There is a separate data access interface which has
methods to create/retrieve/update/delete objects of any schema.

> * a schema could be used to generate an implementation of the
>   implied interface (!) against a particular persistence system
>   (e.g. MySQL),

Well, nothing is generated. An application needs a schema and a data
provider can be configured to provide a schema. Then they can be connected
and made to work together. The 'interface' of the connection is always the
same - the data access interface.

> * but not all possible implementations of the interface could be
>   generated in this way,

But a large number of schemas can be produced by configuring a data
provider, since objectclasses can be 'joined' on attributes, attribute
names can be mapped, etc.

> * that applications using the schema are really using the
>   implementation, via the interface, not the schema itself,

Applications are using the standard data access interface but rely on the
actual data conforming to a specific schema.
This is like applications always using the DB API but requiring a table
with specific columns, except that the schema in this case is
object-oriented and not relational.

> * therefore schema are a useful tool for building implementations,
>   and implying interfaces,
> * but the interface is the truer decoupling point, since an
>   implementation that cannot be derived from the schema could be
>   substituted if necessary (as I suspect in the RDBMS/LDAP case, for
>   example).

I would say the schema is the truer decoupling point. And a schema that
cannot be built from configuring an existing data provider may need a new
provider implementation.

> I'll probably regret posting that last bit when I re-read it tonight,
> but I have to head home...
>
> I'm eager to hear your thoughts.

And I am to hear yours. Hopefully I have made some things clearer. I am
talking about specific ideas because I have worked through these issues
and have come up with a working version of some of the ideas. I just
uploaded an unfinished version of some documentation [1] which might give
you a better idea of what I am talking about.

I'd like to add that I am not rejecting the use of interfaces/protocols
for everything. For things like sending mail, for example, an object with
a specific method may be required. There is more to discuss though,
because even there a data-oriented approach may be possible. For example,
obj.send_message(message_object) might be a messaging interface (parallel
to the data interface) which has many providers - email, sms, logfile etc.
The pattern is to use the API less and the data more to specify what needs
to be done.

Cheers,
Shalabh

[1] http://www.shalabh.com/qhb.html#the-data-model
    (see also
    http://www.shalabh.com/qhb.html#the-data-model-and-relational-databases )