On Aug 15, 2011, at 10:12 AM, David Hess wrote:

> On Aug 15, 2011, at 8:50 AM, David Binger wrote:
>
>> This sounds like the offset mapping in memory is somehow out of sync with the file.
>> We've never seen this condition, which is obviously bad. I'm concerned
>> about this and trying to figure out how it could occur.
>
> Yes, if the packing implementation was an issue, I would have thought we would just have missing oids (or is that a durus_id now?) - but I also have evidence of an object whose class is BNode trying to load state of a pickle that was clearly associated with an application object.

Yes, the oid is now called "durus_id".
I just meant that the BNode class doesn't have any special state management code that could be causing the problem. I think BNode is involved only because that is where your actions happen to be.

> Based on what I did to repair the database, it appears this corruption only affected the items that were touched when the modification to the BTree was made during the pack (interior BNodes and the application objects). It's almost as if there is a race problem during packing having to do with either durus_id allocation or file writing.

I would be interested in the details of the repair.
I don't remember the details of how you are using FileStorage.
Do you have just one thread/process using the FileStorage? I hope so.

>> It seems unlikely to have anything in particular to do with BTree or BNodes: those structures
>> are implemented in the same way as other persistent objects. I think, from your
>> other message, that you are looking closely at the packing code, and that
>> seems like a good idea. There was a time when you talked about
>> overriding a '_p_' method. Does the code you are running involve
>> any such low-level modifications?
>
> That's on one particular and isolated class that was not involved in this situation. All it does is ignore the ghosting request from the cache manager (which is safe because we never have conflicts).

That does seem safe. Is there any code in the cache manager, though, that calls the ghosting function and then takes some action based on the assumption that the ghosting worked?

>> In the current code tree, File.write() ends with self.file.flush() under all conditions.
>> Is that the case in the code you are running?
>
> Here's the implementation we are running from on Durus 3.7:
>
>     def write(self, s):
>         self.obtain_lock()
>         self.file.write(s)

The new code adds self.file.flush(). Without it, certain file systems lose track of the correct end of the file, so a subsequent seek to the end of the file ends up in an incorrect place. In a pack, an offset is written into the header of the file and then there is a seek to the end of the file. It is important for this to work correctly. It could possibly be the reason for the trouble you have seen.

> I noticed that DurusWorks 1.0 has been released but the CHANGES.txt files are gone. Should we upgrade to DurusWorks 1.0 instead of 3.8? It looks like Durus is no longer distributed separately?

Yes, DurusWorks is a new distribution that includes Durus. I have not yet sent out an announcement. Sorry. I'll start a new CHANGES.txt file in the next release. I'd recommend downloading it and diffing the durus package against the one you have. This code is not changed much, so I hope that is not too much trouble. I would upgrade.

> Thanks.
>
> Dave
>
>> On Aug 14, 2011, at 12:44 PM, David Hess wrote:
>>
>>> Afterwards, the oids seem to be jumbled up in this BTree (at least - maybe elsewhere in the database too). Unpickled Persistent objects are not what they should be - interior BNodes are sometimes application classes and stored values are sometimes BNodes rather than application classes.
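
For reference, the flush-after-write point made earlier in this message can be sketched roughly as follows. This is only an illustration, not the actual Durus code: the class name FlushingFile, its constructor, the seek_end() helper, and demo() are all hypothetical, and the locking step (obtain_lock) is left out. It mimics the pack pattern described above: append data, rewrite bytes in the header, then seek to the end of the file.

```python
import os
import tempfile


class FlushingFile:
    """Hypothetical stand-in for a storage file wrapper (not the Durus File class)."""

    def __init__(self, path):
        # Open in binary read/write mode, creating the file if it does not exist.
        mode = "r+b" if os.path.exists(path) else "w+b"
        self.file = open(path, mode)

    def write(self, s):
        self.file.write(s)
        # Flush after every write, so buffered bytes are handed to the OS
        # before any later seek to the end of the file.
        self.file.flush()

    def seek_end(self):
        # Move to the end of the file and report the resulting offset.
        self.file.seek(0, os.SEEK_END)
        return self.file.tell()

    def close(self):
        self.file.close()


def demo():
    # Append two 8-byte chunks, overwrite the first 8 bytes (the "header"),
    # then seek to the end, as a pack would before appending more records.
    with tempfile.TemporaryDirectory() as d:
        f = FlushingFile(os.path.join(d, "demo.durus"))
        f.write(b"HEADER00")
        f.write(b"record-1")
        f.file.seek(0)
        f.write(b"HEADER01")
        end = f.seek_end()
        f.close()
        return end
```

With the flush in place, the offset returned by seek_end() after the header rewrite should be the full length of the data written, which is what the pack relies on.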