Michael Watkins wrote:
> * mario ruggier wrote [2005-10-12 08:40:43 +0200]:
>
> My import is still running... hours later and a 3GB durus file so far. I
> rather hoped it would be done by 2GB, but my memory remembers a 20GB
> number... It may be that the resulting ZODB was 20GB, after creating objects
> based on 600MB of fairly simple raw data.
>
> Unfortunately I put a commit in the wrong part of my loop, so every 500
> records (of 13 million) a commit is being done... silly me.

Ah, yes, commits are slow. I will be doing one commit per file, or I might
wrap the entire thing into a single transaction, and therefore a single
commit. And I'll time it so I can report how long it takes to build the
entire database from scratch.

>>Patrick, could this be the basis of a comparison between various
>>implementation scenarios? It is rather relational in nature though... if
>>we define what the test application should do, then whoever wants can
>>implement the test to those specs.
>
> This example as described so far is really so simple and trivial, except for
> the volume of data. However it could be made somewhat more complex:
>
> Symbols trade on Exchanges, Exchanges are frequently in different Countries
> with different Currencies. Symbols have different Aliases depending on who
> the data provider is (e.g. Reuters, Bloomberg, Yahoo, eSignal, DTN, etc. all
> may have a different symbol for the "Dow 30 Industrials Index").
>
> Symbols belong to Indexes - collections of similar securities, arranged
> either by Sector or by Style (large cap, mid cap, small cap etc.) or by
> Geography (emerging markets, US and other country indexes).
>
> Some Symbols might be "calculated" symbols, in that their end of day values
> are the result of some computation; indexes are a simple example of this -
> all symbols from the NYSE exchange could be summed in such a way as to
> create a composite index. Another example of a calculated index would be to
> determine how many symbols hit new 52 week highs for the trading session
> being examined, and store that data as "quotes" such that it can be plotted
> and analysed. Or calculate how many symbols are trending up, trending down,
> and not trending at all.
>
> Quote records are an indication of what the price was at the end of the
> trading session; but specific events such as Splits change the meaning of
> past data, particularly if one needs a continuous split-adjusted set of data.

Now we're talking! Those additions would be great.

-- 
Patrick K. O'Brien
Orbtech       http://www.orbtech.com
Schevo        http://www.schevo.org
Pypersyst     http://www.pypersyst.org
PyDispatcher  http://pydispatcher.sourceforge.net
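
To put a number on the commit placement discussed above: committing every 500 records over 13 million records means 26,000 commits, against a single commit if the whole import is one transaction. The sketch below only counts commits; it does not use Durus or ZODB. FakeConnection, its add() method, and import_records() are hypothetical names made up for illustration, standing in for whatever connection object the real import uses.

```python
class FakeConnection:
    """Hypothetical stand-in for a database connection.

    It stores nothing; it only counts records added and commits made.
    """

    def __init__(self):
        self.added = 0
        self.commits = 0

    def add(self, record):
        self.added += 1

    def commit(self):
        self.commits += 1


def import_records(connection, records, batch_size=None):
    """Add every record, committing every batch_size records.

    With batch_size=None the whole import is one transaction,
    so there is exactly one commit at the end (if anything was added).
    """
    pending = 0
    for record in records:
        connection.add(record)
        pending += 1
        if batch_size is not None and pending >= batch_size:
            connection.commit()
            pending = 0
    if pending:
        # Flush whatever is left after the last full batch.
        connection.commit()


# Small run: 10,000 records instead of 13 million.
batched = FakeConnection()
import_records(batched, range(10_000), batch_size=500)

single = FakeConnection()
import_records(single, range(10_000))

print(batched.commits, single.commits)
```

Whether one large transaction is actually faster depends on how much memory the uncommitted objects take up, so timing both variants on the real data is still the only reliable answer.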