* Rodrigo Dias Arruda Senra wrote [2005-10-11 11:05:38 -0300]: > Some possible rules of thumb (far from an exaustive list): > > Relational - when you have a huge volume of data, sharing a > small set of stable data schemes. > - when you need ad hoc queries that could be > written as SQL+operators > > OODB - when you have a considerable diversity of data schemes, > when you have to index textual content (==Information Retrieval), > when browsing a structure is enough as a query language > (== when you can do without cross products and joins). A few comments - regarding relational, its ability to manage a huge volume of data is largely because of the facilities provided to add automatic indexes that span multiple keys. If you have a large table, unindexed, and do a sequential scan (retrieval) through every row to find matches for some criteria, its going to take more or less the same amount of time to do the same against a large set of objects in an OODB. One advantage not mentioned with Relational that can't be dismissed easily is the preponderance of tools, and the facility by which integration with other applications - most running on some sort of SQL back end - can be made. Of course the latter point ultimately becomes less and less a distinction as other integration techniques (REST, SOAP and the like) become more prevalent, and the chicken-egg scenario with OODB moves forward. Regarding OODB, is there anything truly in the OODB design that facilitates more easily textual content management? I don't think so. In my past I worked as a systems integrator implementing massive document and workflow management systems, all of them back ended with SQL. A big blob of text, or anything other big lump of information (as opposed to data), can be just as easily shoved into a SQL column as into an OODB as a python type, or, better yet in my opinion, into a darn file on the file system where it belongs. I think the major advantage of the Relational approach is its been around for a long time and is well understood. The major disadvantage of the relational approach is that its painful to map objects on to relational data at times, particularly when the data model is non trivial and the volume of data is large. Back to the trivial, for ad hoc queries, I found, quite to my surprise, that its much easier in Python than in SQL, because after all I have the full power of the object and all its methods available to me, where in SQL all I have is SQL. I'm sure no one responding to this discussion needs this example but no doubt there are some lurkers out there that might benefit. Lets say I want to query the database in SQL to find out what Monty Python's balance is, who his sales person is, and what his last shipment date was: CREATE TABLE customer ( id SEQUENCE, name varchar(200), balance numeric, FOREIGN KEY (salesperson_id) REFERENCES employee (id)) .... etc for each table. Note we'll have to at data definition time make all sorts of decisions that we would not have to make in Python, most commonly all sorts of constraints as to string lengths. SELECT c.name, balance, e.name, last_shipment_date FROM customer c, transactions t, employees e WHERE ar.customer_id = c.id AND t.customer_id = c.id AND t.completed is True AND c.salesperson_id = e.id AND c.name = "Monty Python" ORDER BY t.last_shipment_date LIMIT 1 Or, adding a shipment: insert (customer_id, date, value) into transactions values(124, '2005-10-11', 1234.59) of course you needed that customer_id first but lets not go there. Assuming you have even a simple object model behind you, doing the same in an oodb becomes trivial: monty = customer_db.get('monty python') print "Customer Summary For: %s Sales person: %s Last Shipment: %s" \ (monty.name, monty.salesperson.name, monty.get_transactions()[0]) And that's it.. a simple and perhaps more to the point understandable data access... useful work done, possible even for a casual user of the system's API, which is not always possible with complex SQL schemas. Just think back to any hierarchical data structure modeled in SQL and how mind bending (for me at least) that is, compared to a hierarchy of objects. Newcomers to oodb world seem to obsess (I did) over querying and indexing; but once you get past that, it seems that they are non issues or require only a little thinking now and then. The benefits seem to outweigh the drawbacks by a decent enough margin. Also the "query" language is greatly enhanced in Python because we aren't constrained to the language itself but to all the features of our objects; that's something we don't have with straight SQL, although granted stored procs can deliver some similar benefits. A simple object like this will support that simple python 'query' above (this is all off the top of my head and of course not tested or even proofread): class Customer(Persistent): name = None receivables_balance = 0 salesperson = None transactions = PersistentList() def add_transaction(self, trans): self.transactions.append(trans) def get_transactions(self): L = [trans for trans in self.transactions if trans.completed is True] L = attr_sort(L, 'shipment_date') L.reverse() return L class CustomerDatabase(Persistent): next_account_number = 1 def add_customer(customer): assert customer.account_number is None customer.account_number = self.next_account_number self.next_account_number += 1 class Person(Persistent): name = None class Transaction(Persistent): date = now() value = None items = PersistentList() def compute_value(self): value = 0 for item in items: value += item.value self.value = value db = Database() db.customer_db = CustomerDatabase() db.people_db = PersistentDict() monty = Customer('Monty Python', ... etc) db.customer_db.add_customer(monty) etc... I think I better do some real work now.