Real-time is not a hard limit.
But the only way to operate faster than real time is to deal in probabilities, using as much data as possible.
The more data you have, the more confidence you can have that something should be handled in a certain way.
Real estate auctions made me think about it this way. The problem is that you don't know everything about the property until the split second when you have to decide whether to bid. That is almost impossible at scale. It's like needing to put a price on every single stock, every day, just in case it comes up for sale at the right price.
This is a symptom of a broken market, of course; the stock market could never function like this. Yet that's what I'm dealing with. You need a decent number of participants and adequate information to have a liquid market, and the housing market has neither.
Anyway, the only way to deal with this problem is to pre-calculate everything, giving a probability that the house/customer/call/problem you're dealing with falls into a given category, and to re-calculate every time you get more information. The output is an acceptable range of values for everything else. If you re-price every property in existence when you get new data, you can be adequately prepared to act when you get that last critical piece of information (such as the price).
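To make that concrete, here is a minimal sketch in Python of what "re-calculate every time you get more information" could look like. It assumes a simple Bayesian update, which is my choice for illustration and not the only way to do it. The category names, dollar values, likelihoods, and the 0.8 safety margin are all made up, and `bayes_update`, `max_bid`, and `should_bid` are hypothetical helper names.

```python
def bayes_update(prior, likelihood):
    """Re-weight category probabilities after one new piece of evidence.

    prior:      {category: P(category)} before the evidence
    likelihood: {category: P(evidence | category)} (assumed known here)
    """
    unnormalized = {c: prior[c] * likelihood[c] for c in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence is impossible under every category")
    return {c: p / total for c, p in unnormalized.items()}


def max_bid(probs, value_if, margin=0.8):
    """Pre-computed price ceiling: expected value times a safety margin."""
    expected = sum(probs[c] * value_if[c] for c in probs)
    return margin * expected


# Hypothetical starting point for one property.
probs = {"needs_rehab": 0.5, "move_in_ready": 0.5}
value_if = {"needs_rehab": 100_000, "move_in_ready": 200_000}

# New data arrives (say, a recent roofing permit), so re-price right away.
probs = bayes_update(probs, {"needs_rehab": 0.2, "move_in_ready": 0.6})
ceiling = max_bid(probs, value_if)


def should_bid(price):
    """At auction time the only missing input is the price itself."""
    return price <= ceiling
```

All of the expensive work happens before the auction. When the price finally shows up, the decision is a single comparison against a number that was already sitting there.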
Maintaining an accurate data-mined model for each entity you're interested in is quite a feat. If you're interested in Twitter users, for example, it's like monitoring every user on the service and then flagging them when they become interesting to your purpose based on everything they've done in the past.
Treating every single property, customer, prospect, or whatever as a unique case to be considered on its own, instead of as one of a group, is the only way to operate faster than real time.
However, doing things this way requires an immense data store because you're crunching everything you know about every single entity whenever you get new data related to it. Every bit of information you get triggers a process that generates more data. That means every detail record you have is no longer just a detail record; it becomes the subject of a detailed analysis, often re-run very frequently.
I've seen lots of enterprise-grade relational databases fall flat when faced with a large number of detail records, and then you have to start throwing away data very quickly. What a tragedy, no? Most importantly, throwing data away doesn't allow you to store and crunch as much data as you need in order to get to faster than real time, operating in the realm of probabilities. This goes back to the reason that I'm so excited about NoSQL databases such as Cassandra and distributed data-processing frameworks such as Hadoop: they really enable this.
I've come to realize that it doesn't really matter what kind of whiz-bang analysis you can do within seconds. It will never hold up against brute-force analysis of everything you're looking at.
I almost feel like this generated meta-data about the detail data deserves its own name. It's like an identity probability card or something: the probabilities that the customer/call/property/twitterer you're looking at falls into a variety of categories. Ultimately this is what you're interested in, not the individual bits of data that decide the probabilities. This type of aggregate information about anything is going to be much more valuable than the bits themselves.
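For what it's worth, here is roughly how I picture one of these cards as a data structure. This is only a sketch: `ProbabilityCard`, its fields, the category names, and the 0.8 threshold are all hypothetical, and a real version would live in the data store and not in memory.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class ProbabilityCard:
    """One card per entity: the category probabilities, not the raw detail records."""
    entity_id: str
    probs: Dict[str, float] = field(default_factory=dict)

    def top(self) -> Tuple[str, float]:
        """Return the most likely category and its probability."""
        return max(self.probs.items(), key=lambda kv: kv[1])

    def is_interesting(self, category: str, threshold: float = 0.8) -> bool:
        """Flag the entity once a category's probability crosses the threshold."""
        return self.probs.get(category, 0.0) >= threshold


# Made-up numbers for a single Twitter user.
card = ProbabilityCard(
    "user-123",
    {"likely_buyer": 0.85, "spam": 0.05, "other": 0.10},
)
flagged = card.is_interesting("likely_buyer")
```

The card is what downstream code would read. The detail records only matter as inputs that keep it up to date.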
Mined data? Data probabilities? Data gold? I don't know what the right name is for this meta-data, but it's very interesting to me right now.








