Is there a need for a master global schema/ontology/vocabulary?

One of the interesting debates from the Semantic Web meetup this month, which has stuck in my head ever since, is the argument over whether it’s worth it to try to compile a master, global, schema–ontology–vocabulary–whatever you want to call it.

I could tell that the academic side of the crowd was again’ it, but the business side of the crowd was ‘fer it.

Guess which side will win?

The analogy that was given (sorry don’t remember who said it) was that a pig farmer might use a different concept of "a time" than I might.  So 3PM Pig-Farming Time might equal 2PM Jason Time.  (If I got the analogy wrong I’m sure I’ll be corrected, but it was something like that.)  Or, one that kept coming up as a real-world example was that a transaction has totally different meanings across industries.

I think that thinking along those lines is misguided.  You can’t define everything in the world, but you can sure define a lot of it.  That’s what businesses end up spending a lot of their time doing: defining their world.  I am intimately familiar with this, as it is one of the biggest problems that we struggled with at Latigent (now Cisco).  Company A will define "revenue" as one thing, Company B will define it as something else.

This isn’t an either/or proposition.

You can give them both definitions; they really don’t care.  This is an instance where you really can have your cake and eat it too.  As long as both values are populated, they will take the one they need.
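
As a rough sketch of what "both values are populated" could look like, here is one record that carries revenue under two definitions, each tagged with the vocabulary it comes from. The namespace URLs, the field names, the figures, and the revenue_for helper are all made up for illustration; none of them come from a real schema.

```python
# Hypothetical vocabularies: these URLs and terms are illustrative only.
COMPANY_A = "http://example.com/company-a/terms#"
COMPANY_B = "http://example.com/company-b/terms#"

# One record, with "revenue" populated under both definitions.
record = {
    COMPANY_A + "revenue": 1200,  # e.g. counted when the invoice is sent
    COMPANY_B + "revenue": 950,   # e.g. counted when the cash arrives
}


def revenue_for(record, vocabulary):
    """Return the revenue value as defined by the given vocabulary."""
    return record[vocabulary + "revenue"]


# Each consumer asks for the definition it cares about.
print(revenue_for(record, COMPANY_A))
print(revenue_for(record, COMPANY_B))
```

Neither company has to agree with the other; each one just reads the property that lives in its own namespace.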

To me, this is one of the huge value-adds of the Semantic Web.  You can go out and see if somebody has already developed a schema for what you’re trying to build, and if they have you don’t have to write it yourself.  And your app will automatically inter-operate with theirs, in some areas.  It’s really cool.

Every company out there should really be using available schemas as much as possible–get into the Semantic Web game early and often.  All you have to do is start adopting a standard schema, taking as much as possible from what’s already publicly available.

Why NOT build a giant global schema?  Isn’t that what the ontology-building stuff is all about anyway?  It’s not like it all has to be from the same place.  If I come up with the best definition of a "location", because mine includes height above sea level and active quantum dimension, you’re free to use that in your application for "location" and develop some new schema pieces for the custom stuff you’re building.  Eventually the global schema will just kind of materialize out of the best (well, technically probably the most POPULAR) pieces of publicly-available schema out there anyway.
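
Here is a minimal sketch of that "borrow the shared pieces, add your own" idea, assuming two invented namespaces: SHARED stands in for somebody else's published "location" schema and MINE for the custom parts of an application. The property names and sample values are hypothetical too.

```python
# Hypothetical namespaces: neither is a real published schema.
SHARED = "http://example.org/shared/location#"  # somebody else's "location"
MINE = "http://example.com/my-app/terms#"       # custom pieces for my app


def describe_location(latitude, longitude, height, delivery_zone):
    """Build a location from shared terms plus one custom, app-specific term."""
    return {
        SHARED + "latitude": latitude,
        SHARED + "longitude": longitude,
        SHARED + "heightAboveSeaLevel": height,
        MINE + "deliveryZone": delivery_zone,
    }


def shared_view(location):
    """Keep only the properties that any user of the shared schema understands."""
    return {key: value for key, value in location.items() if key.startswith(SHARED)}


loc = describe_location(39.74, -104.99, 1609, "zone-7")
print(shared_view(loc))
```

Another application that only knows the shared schema can still use the latitude, longitude, and height, and simply ignores the custom property.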

Heh, boy I can’t even imagine what this is going to do to the business intelligence space eventually.  Totally demolish and rebuild it, is my guess.

  • http://www.chriscrosby.net Chris Crosby

    Business Intelligence companies have long embraced standardized schemas, e.g., XBRL. These will most likely continue to be driven from individual verticals and converge somewhere in the middle.

    The problem we had at Latigent wasn't necessarily that companies defined "Revenue" differently (GAAP takes care of that), it's that people capture it differently.

  • http://randykolb.com Randy Kolb

    The "why NOT build" question may be premature for the vision of most corporate entities. It could be viewed as:
    1) a prohibitive expense without guaranteed return
    2) speculative bleeding-edge risk–are there standards that should be followed? Do they already have some level of compliance assurance (PCI, SOx, ISO, on and on…)?
    3) potential security risk–if no one knows my data's schema then I don't have to worry too much about unauthorized extraction or abuse

    No problem in seeing value, particularly with respect to system integration and the entire web as a platform, it just needs to be standardized and secure.

  • Alex

    I don't think it's going to be useful to look for a golden european mega standard. And I don't think it will ever emerge, because domains almost never have identical requirements, and because domain knowledge is a key factor in differentiating business.

    It seems more useful and scalable to me to define isomorphisms between ontologies rather than require everyone to use equivalent concepts.

    The result being that you have a number of ways of describing and querying the same kind of data (a good thing). You and I have different, even inconsistent, concepts of transaction/location/etc, because we run different businesses. But as long as there is a mapping between them there is no reason our services can't work together.
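
The mapping idea in the comment above can be sketched roughly as follows, assuming the simplest possible case: two businesses describe a transaction with the same information under different field names. The field names, the sample record, and the translate helper are hypothetical.

```python
# Hypothetical field names for two businesses' "transaction" records.
# The mapping is one-to-one, so it can be inverted.
MINE_TO_YOURS = {
    "amount": "total",
    "occurred_at": "timestamp",
    "buyer": "customer_id",
}
YOURS_TO_MINE = {theirs: mine for mine, theirs in MINE_TO_YOURS.items()}


def translate(record, mapping):
    """Rename the keys of a record according to a mapping."""
    return {mapping[key]: value for key, value in record.items()}


my_txn = {"amount": 42.50, "occurred_at": "15:00", "buyer": "c-17"}
your_txn = translate(my_txn, MINE_TO_YOURS)

# Translating there and back returns the original record.
assert translate(your_txn, YOURS_TO_MINE) == my_txn
print(your_txn)
```

This only covers plain renaming. Concepts that genuinely differ between the two sides would need conversion logic in the mapping, and some may not map cleanly at all.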