JasonKolb.com

Business Idea: Online Data Modeling Tool

Man, if I only had the time to actually build all of the ideas I have I'd be a disgustingly rich man.  But I don't, so they end up on my blog :P

Someone really needs to build an online AJAX-driven data modeling tool (a la Embarcadero ER/Studio, or ERWin).  If you've never used one of these for your software development you don't know what you're missing, because the data model really is the foundation for the application (especially if you use code generation utilities, which are just cool as all hell).

I love using them, but it drives me up the wall that I have to use a fat Windows app to do it.  That makes collaboration even more difficult because usually to hook multiple installations together you have to purchase some kind of uber-expensive enterprise package that usually only works over a LAN anyway.  I would be in heaven if I could use some kind of Google Apps type of online application to do data modeling and collaborate with other people.

So there it is, somebody please take this idea and run with it, this is a market just begging to be disrupted.

Just Find the Best Tools for the Job

I find the Enterprise 2.0 adoption discussion very interesting, and it brings to mind some lessons I had to learn the hard way.  I come from the geek mindset, where the tool itself is what is cool, because I use them directly, I read about them, and sometimes I even build them.  I get very excited thinking about semantic Web technology, tagging, user-generated content, federated identity, etc, etc.  But it took me a long time to realize that technology is just a tool, not a solution.  I get excited thinking about all the possibilities, not necessarily about solving a problem.  It’s a problem I have, and I’m working on it :)  But really, if all you have is a problem, to a large extent you don’t really care about the tool as long as it gets the job done.  I think geeks like me tend to lose sight of that, and the result is vaporware and eventual disillusionment.

Imagine if Home Depot sold Web 2.0 tools.  You could walk in and pick up an automatic tagging tool, a user-generated video tool, a rapid AJAX-ifier, social networking connectors, an RSS attachment, and of course the handy rounded-corner router.  Well, I can’t see Home Depot going and trying to sell these tools to companies that weren’t looking for them any more than they would hit the streets to hawk the newest hammers to carpenters that had no need for them.  The customer comes to them for the tools they need, not the other way around.  You don’t shove the tool down people’s throats, you wait for them to see the value in it and ask for it.  Otherwise, you’re just a glorified traveling salesman trying to sell a better mousetrap.  No Soliciting.

So, to be honest, I really don’t care when someone looks up from their beer and shouts “Eureka!  There’s no tagging for the enterprise yet!”.  There’s no YouTube for the enterprise either, that doesn’t mean there’s a hole begging to be filled.  Yet.  Unfortunately, end users take time to digest new technology before they realize they need it--before they can put the pieces together and realize that this tool solves that problem.  That’s why the dot-com boom of the 90’s turned into a dot-com bust.  It wasn’t because the technology was bad, people just didn’t realize the value yet.  That’s how I see the relationship between Web 2.0 tools and the enterprise right now.

The true secret to new tool adoption lies in the cross-functional geek.  IT folks who are not only good at what they do, but are also familiar with the business itself.  They probably even came from the business side but moved over to IT because they saw how to apply the tools there to solve real problems.  Those guys are worth their weight in gold, and if a geek is ever able to truly be successful he needs to be able to put himself in that person’s shoes.  THAT is when Enterprise 2.0 will happen.

Recent Advances and Remaining Obstacles in Web Application Technology

I like to keep a close eye on advances in browser-as-rich-client technology, as I think it's one of the most important areas in the continuing maturation and mass adoption of the Internet.  I think there are still several hurdles to overcome before the browser can compete with the desktop in terms of rich client firepower, but the obstacles are falling almost on a daily basis. 

Recent Advances

Some exciting developments have popped up recently, here are a few of the highlights:

  • Comet:  Comet is a technology that keeps an HTTP connection alive between the browser and the server so that messages can be streamed back and forth in realtime.  This is very rarely used right now primarily because the technology is so young and there's no easy way to implement it, making for a high barrier to entry.  The stuff going on with the cometd framework is changing that, however, and I'm especially interested in what's happening with gCometd, which is a Java implementation of comet.  Once comet grows up and extends into areas like server-side event-handling it's going to have as much of an effect on the rich browser application experience as AJAX had.
  • Offline Access:  One of the big weaknesses in rich Internet applications is the lack of offline access, but there's a lot of work being done to fix that.  In particular, the Dojo Javascript framework now has strong support for browser-side storage, and there's work being done on top of that to enable entire applications to be run offline.
  • Synchronous Server Calls:  A big mental barrier to overcome when programming rich browser applications is the lack of support for synchronous calls to the server--asynchronous is really the only game in town.  It's not just: call function A, get result, call function B.  It's more like: call function A, wait for your callback function to be called, receive the result, and then call function B.  Certainly not the easiest or most straightforward way to write an application.  However, there are some interesting and creative solutions to this problem now thanks to projects like jwacs and Narrative Javascript.
  • Threading:  The one topic that's sure to give any programmer heartburn is threading, but the fact is that there's a lot you can't do without the capability to write multi-threaded applications.  To that end, there's a lot being done to facilitate multi-threaded Javascript applications in one form or another, such as wrapping server-side Java calls and using interpreted languages on top of Javascript.
  • Microformats and Live Clipboard:  Even though Live Clipboard has completely fallen off the map recently (seriously, Ray Ozzie, where did you go?), I still believe that it's a very useful and cool technology and I hope it takes off (moving structured data between sites is just too useful to let die).  Microformats, however, are very much alive, and once they're built into the browser itself I think there's no doubt that they're going to become as ubiquitous as RSS.  They will be the standard way to publish structured data, and I love having a standard format for widely-used data such as contacts and events.  Sites are already building in support, it's just a matter of time.
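To make the "Synchronous Server Calls" point above concrete, here's a minimal sketch of the callback style that asynchronous calls force on you.  Everything in it is a stand-in I made up for illustration: callServer and deliverReplies are hypothetical helpers that fake the browser delivering replies later, not part of any real library, and this is not how jwacs or Narrative Javascript work internally.

```typescript
// Hypothetical stand-in for an asynchronous server call (not a real API).
// Replies are queued and delivered later, the way a browser delivers
// XMLHttpRequest results after the calling code has already returned.
type ResultCallback<T> = (result: T) => void;

const pendingReplies: Array<() => void> = [];

function callServer<T>(compute: () => T, onResult: ResultCallback<T>): void {
  pendingReplies.push(() => onResult(compute()));
}

// Simulates the browser delivering every queued reply, including replies
// queued by callbacks that run during delivery.
function deliverReplies(): void {
  while (pendingReplies.length > 0) {
    const reply = pendingReplies.shift();
    if (reply) reply();
  }
}

// What we would like to write:  a = functionA();  b = functionB(a);
// What asynchronous calls force us to write instead:
const log: string[] = [];

callServer(() => 2, (a) => {
  log.push(`A returned ${a}`);
  // Function B can only be called from inside A's callback.
  callServer(() => a * 10, (b) => {
    log.push(`B returned ${b}`);
  });
});

log.push("caller continues before any result arrives");
deliverReplies();
```

The calling code runs to completion before either result shows up, and every step that depends on a result has to move inside a callback.  That nesting is the part that doesn't read like "call A, get result, call B".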

Remaining Obstacles

It's not all sunshine and lollipops, though.  There are still several areas that need to be addressed ASAP, especially before these technologies can be adopted in the enterprise.  A few of the problems I've noticed:

  • Lack of Synchronous Server Calls:  While the projects attempting to address this shortcoming are a start, the options I've looked at pretty much dictate that you write your application in a certain way or they won't work.  This may be something that needs to be addressed at the browser/Javascript level, but I hope somebody comes up with an elegant solution using existing technology.
  • Lack of Encryption:  Honestly, I really can't believe there's not more awareness of this.  All it's going to take is for one mischievous person with knowledge of HTTP sniffing and Javascript to start watching Internet traffic and gobbling up passwords and sensitive communication before this becomes a VERY hot button topic.  The problem is, most rich client applications use little to no encryption--there is no good way to secure AJAX calls right now.  If you're lucky, your username and password are encrypted via SSL, but I've even seen that rule broken countless times to save a browser refresh.  Did you know that all of your GMail and Google Docs and Spreadsheet traffic is flying around the Internet naked?
  • Lack of Client-to-Client Communication:  One of the big advantages of thick client apps is that each one can exist as its own node on the network and communicate directly with other nodes.  Since browser applications are disconnected (well, until technologies like Comet become standardized, anyway), all communication to and from the browser has to flow through the server delivering the app.  That means that one client cannot talk directly to another client, creating a chasm between nodes on the network.  There are a lot of dominoes that need to fall before this one can be addressed--for example it seems ideal to me that clients could talk directly to other clients' servers instead of using their own server as a middle-man, but that requires a universal client language first.  The result of this, though, is that only clients which are connected to the same server have any means to communicate with each other, a pretty severe shortcoming.
  • Lack of a Standardized Object Model:  The browser DOM is really only an object model as it applies to the user interface.  It's like the Win32 API before libraries like MFC and OWL were built on top of it--just about everything has to be coded by hand.  I believe the next big leap in rich client technology will come when a standardized schema and object model is developed as a launching point for new applications.  I have to believe that Microformats will play a role in this, as they're the closest thing to a standardized schema that we have right now.

Hopefully the problems will be addressed and the advances will continue to advance, and we can do away with fat-client applications completely sometime in the next year or two.  It's definitely an area that bears watching, as it's going to touch every site on the Internet as it continues to develop.

Perfect Programming

Over the past week I've described Network-Oriented Architecture (I'm going to refer to it as NOA in this post to save my fingers), which is my "ideal" Web-based programming model that I've been brainstorming for about a year, and given a simple example of it in action.  The reason is that I've been talking with a few other people who are thinking along the same lines--that the current programming models are limiting what Web applications are capable of, there's a better way to do things, and we'd like to figure it out sooner rather than later.  I get the feeling that the wider technology community is sensing this as well, as evidenced by Read/WriteWeb's article asking "What's next after AJAX?"

When I started brainstorming a new programming model, I started by asking myself this question:  "In a perfect world, with no limitations, how would you want to program Web applications?"  The posts you saw on NOA are a key part of that puzzle, but there's one more piece that I haven't shared yet.  And that is the actual development workflow and programming environment in which new applications are constructed.  So bear with me, one more post on this topic and then I'll get back to some easier reading ;)

What I realized when I asked myself this question is that the programming environments we had a few years ago (to build desktop applications) were actually much more advanced than what we have now (to build Web applications).  I consider this to be because we currently don't have a good programming model for the Web which blurs the line between the browser and the server like I tried to outline with NOA.

So I think the answer to my question is: "I want what we were using to build desktop applications, and then some."  The "and then some" harnesses the additional functionality we get from running applications on the Internet, namely cross-boundary connectivity and always-on remote resources.

In case you haven't read my posts on NOA, I'll save you some reading:  it essentially makes server-side objects and classes accessible on the client, aka browser.  What I realized after working with it for a while is that all you really need to use an object in the user interface is its URI, and you can get a list of objects that live on a server pretty easily.  With a list of URI's that correspond to objects you can pretty easily construct a programming environment that will allow you to drag and drop objects onto the application interface, something like this:

[Image: Ideal_ide]

Even better, you can use objects that live on more than one server and even more than one domain, because all you really need is the object's URI.  What happens when an object is dropped onto the user interface is that a new DIV tag is created in the browser DOM with the object URI as an attribute.  The parser recognizes the URI and requests the object from the server, which spits back the HTML representation of the object.  I wrote an entire post on this piece a while back--if you're interested in more details I'll refer you back to it.
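Here's a rough sketch of that parser step, just to make the flow concrete.  It is not the actual NOA code: the PlaceholderDiv shape, the objectUri attribute name, the HtmlFetcher type and the example.com-style URIs are all assumptions I've invented for illustration, and the servers are faked in memory so the snippet runs on its own.

```typescript
// Minimal stand-in for a DIV in the browser DOM; only the two pieces the
// sketch needs. The attribute name "objectUri" is an assumption.
interface PlaceholderDiv {
  objectUri: string;
  innerHTML: string;
}

// Hypothetical transport: asks the server that owns the URI for the
// object's HTML representation and hands it to the callback.
type HtmlFetcher = (uri: string, onHtml: (html: string) => void) => void;

// The "parser" step: visit every placeholder that names an object URI and
// let the owning server render it.
function renderPlaceholders(divs: PlaceholderDiv[], fetchHtml: HtmlFetcher): void {
  for (const div of divs) {
    fetchHtml(div.objectUri, (html) => {
      div.innerHTML = html;
    });
  }
}

// Fake servers on two different domains, to show that only the URI matters.
const fakeServers: Record<string, string> = {
  "http://a.example/objects/person/1": "<span>Person 1</span>",
  "http://b.example/objects/invoice/7": "<span>Invoice 7</span>",
};

const fakeFetch: HtmlFetcher = (uri, onHtml) => {
  const html = fakeServers[uri];
  onHtml(html !== undefined ? html : "<span>not found</span>");
};

const droppedObjects: PlaceholderDiv[] = [
  { objectUri: "http://a.example/objects/person/1", innerHTML: "" },
  { objectUri: "http://b.example/objects/invoice/7", innerHTML: "" },
];

renderPlaceholders(droppedObjects, fakeFetch);
```

In a real browser the fetch would be asynchronous and subject to cross-domain restrictions, which this sketch deliberately ignores.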

So now we have the capability to drop a server-side object onto the user interface and the object will even take care of rendering itself.  It's basically just drawing objects onto the UI.  When an object is used in the UI an event hook is also set up so that the object is notified whenever it changes on the server, so that it's always consistent with the state of the object everywhere else.

With the line between the server and the browser blurred like that, you can also set up event handlers on the client to handle events such as the object being updated or deleted.  Most AJAX programming environments are focused on getting the event back to the server so it can be handled there, but I have no interest in that.  The entire object model is accessible from within Javascript, so I can keep everything on the client and simply use server resources right there on the client--no need to make a round trip back to the server for anything.  I can even update an object from Javascript and any other user interface that's connected to that object will be automatically updated with the changes due to the event subscription.
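The event subscription idea can be sketched in a few lines.  Again, this is only an illustration under my own assumptions: ObjectHub is a hypothetical in-memory stand-in for the server-side event hook, and the URI and property names are made up.

```typescript
type Listener = (state: Record<string, string>) => void;

// Hypothetical stand-in for the server-side event hook: it holds the
// current state of each object, keyed by URI, plus the listeners that
// want to hear about changes to that object.
class ObjectHub {
  private state: Record<string, Record<string, string>> = {};
  private listeners: Record<string, Listener[]> = {};

  // A user interface that displays the object registers a handler here.
  subscribe(uri: string, listener: Listener): void {
    const list = this.listeners[uri] || [];
    list.push(listener);
    this.listeners[uri] = list;
  }

  // Any client script can update the object; every subscriber is told.
  update(uri: string, changes: Record<string, string>): void {
    const next = { ...(this.state[uri] || {}), ...changes };
    this.state[uri] = next;
    for (const listener of this.listeners[uri] || []) {
      listener(next);
    }
  }
}

const hub = new ObjectHub();
const personUri = "http://a.example/objects/person/1";

// Two separate user interfaces showing the same object.
const viewA: string[] = [];
const viewB: string[] = [];
hub.subscribe(personUri, (state) => { viewA.push(state["name"]); });
hub.subscribe(personUri, (state) => { viewB.push(state["name"]); });

// One update from client-side script reaches both views.
hub.update(personUri, { name: "Ada" });
```

In the model described above the hub would live on the server and push changes to each browser (over something like comet), so the in-process calls here stand in for network messages.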

What is YOUR perfect programming model?

Well, that's it, my perfect programming model.  If I had a genie in a bottle I'd wish to have it today, but I don't, and there's still some leg work to be done before this type of environment can become a reality.  No real technical hurdles, just elbow grease.  But before venturing down that path and potentially wasting time, I'm really hoping that other people can throw their ideas out there and we can come up with some kind of consensus about "what's next".  I'm not at all attached to any particular technology or methodology, I'm really only interested in what's best.

So that being said, how would you want to program Web applications if you could have anything you wanted?  I think if we can get a dialogue going about this and get a workgroup together to hammer out the final solution, we'll have the opportunity to create software capable of things that the world has never seen before.

Microcontent Viewer Source

At the request of a few different people, I've zipped up and made available the source code for the microcontent viewer/Live Clipboard proof of concept I posted a few months ago.  Everyone interested can find it here.  Enjoy.

Web 2.0 and AJAX Security Vulnerabilities

Ajaxian has a post about some sessions at the Black Hat USA 2006 conference.  I'm quite honestly surprised that this is just gaining some press now, I figured it would happen sooner than it has (but that's typical for me :)).  I posted on this a while back, and I haven't seen much improvement in this area since.

There are so many ways to break an application it's not even funny.  I wouldn't consider a Web application secure unless it (and the company that provides it) have adequate answers to the type of security scrutiny that Sarbanes-Oxley typically requires.

On top of that, however, AJAX programming techniques do a few other things that make it easy to break applications and/or intercept sensitive data:

  • LOTS of Web 2.0 applications use GET instead of POST to transmit data, and that means that any ID's or commands that are in the querystring are available in plaintext to anyone who wants them.  (POST's are vulnerable too, but not quite as easy to intercept).  If there's not a solid authentication mechanism underneath such as Digest authentication, man in the middle attacks become a piece of cake.  Somebody could easily sniff messages and pretend to be you.
  • XmlHttpRequest calls (at the core of most AJAX apps) can easily be intercepted unless they're encrypted with SSL, which almost none are.  That means that pretty much anything you input into a Web 2.0 app is fair game for somebody sniffing HTTP on the network.
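To show where the data ends up in each case, here's a small sketch that builds the same request both ways.  The HttpRequestSketch type, the helper names and the app.example URL are hypothetical; nothing is actually sent over the network.

```typescript
// A plain description of an HTTP request; nothing is sent anywhere.
interface HttpRequestSketch {
  method: "GET" | "POST";
  url: string;
  body: string | null;
}

// Form-style encoding of key/value pairs (key=value&key=value).
function encodeParams(params: Record<string, string>): string {
  return Object.keys(params)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(params[key])}`)
    .join("&");
}

// GET: the parameters travel in the URL itself (the querystring).
function asGet(base: string, params: Record<string, string>): HttpRequestSketch {
  return { method: "GET", url: `${base}?${encodeParams(params)}`, body: null };
}

// POST: the same parameters travel in the request body instead.
function asPost(base: string, params: Record<string, string>): HttpRequestSketch {
  return { method: "POST", url: base, body: encodeParams(params) };
}

const command = { id: "42", action: "delete" };
const viaGet = asGet("http://app.example/doc", command);
const viaPost = asPost("http://app.example/doc", command);
```

Here viaGet.url carries the ID and the command right in the querystring, while viaPost keeps them out of the URL.  Over plain HTTP both are still readable by anyone sniffing the wire, which is why the POST version is only slightly harder to grab and neither is safe without SSL.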

Part of the reason I've been so quiet on my blog lately is because I've been wrestling with this very problem.  I absolutely love everything AJAX has to offer, however sending naked data back and forth from the server is a pretty huge problem.  I also don't like the idea of securing the entire site with SSL as that's a huge burden on the server.  What I'm currently working on (as part of a much larger project) is a comet implementation that streams messages over SSL in an unsecured page so that only messages going back and forth from the server are encrypted.  I believe it will work, but it requires a backend server to receive and queue messages which has to be built first (similar to cometd).  I'll post more about it when there's something to show.

The Top 10 Reasons Web Desktops Are a Bad Idea

It seems like every couple of months a new Web desktop comes along and somehow grabs a whole bunch of press and blog activity (guess I'm not helping in that respect).  This week it's YouOS, developed by several MIT grads.  The difference between YouOS and the rest, it seems, is that they have more of the traditional desktop functionality such as installers, API's, settings, program groups, etc, etc.

When Web desktops first came out, I thought they were kind of cool, and I played with a few, although I didn't really like any.  I basically just wanted a fancy RSS aggregator, and a place to look at my Gmail.  I've since given up on that idea and have gone back to Google homepage and reader.

That being said, why am I starting to hate Web desktops?

  1. Why?  What's the point?  I already have several desktops that work just fine.  They meet no need I currently have.
  2. They complicate instead of simplify.  Just looking at my screen right now there are about 50 icons staring at me, do I really need 50 more on top of that?
  3. It's yet another fragmentation of my online identity.  I posted about this a while back, and it's just getting worse.
  4. They're currently very buggy, slow, and generally not ready for prime time.  This is fixable but it's a big problem.
  5. What am I supposed to do when it goes down?  As I mentioned before, I have no guarantees that the developers know how to construct a scalable application.  It's great that these guys are from MIT, but what have they done before?
  6. How am I supposed to use one of these on a Mobile device?
  7. I don't WANT to deal with installing things, managing configurations, or muck with anything that comes between me and what I'm trying to accomplish when I get on the Internet.  About the most sophisticated the average Internet user gets is saving something to their Favorites, or if they're really tech-savvy, to Delicious (screw the dots I refuse to type 'em).
  8. As Jon Udell writes, "the desktop metaphor -- with its cluttered surface and overlapping resizable windows -- is at best a distraction and at worst an impediment".  A desktop in a browser just isn't that useful to me.  TDavid said it well: "The YouOS concept at first is amusing to play around with and look at but quickly frustrates... overcoming the sardine-like limitations inside a browser window will be too great for most people."
  9. The Windows software model is broken and outdated, this is just an attempt to re-create it on the Web.  Windows will undoubtedly be the last proprietary starting point any of us ever use.  If they open-source this it might be different, but the other points still apply.  They will never make money selling them.  Maybe Google or Microsoft could make money showing ads on them, but they're probably the only ones big enough to pull it off.  At least one out there is open source, eyeOS, although I haven't played with it.
  10. Here's the real deal-killer for me:  I don't want anyone else owning my data, let alone my entire desktop.  If someone is going to hold my data, they'd better be able to give me assurances that they meet some basic security requirements and let me know how to get the data out if they ever go belly-up.

This all looks like technology for technology's sake to me.  Just because you can doesn't mean you should.

Marrying objects and services in Web apps

I've been thinking a lot lately about the best way to construct "Web 2.0"-style applications.  I posted before about how AJAX and Flash apps have almost returned to a client-server model again, with much more of the application living on the client instead of on the server.  I don't know that I've seen a satisfactory solution to this question yet.

There seem to be two main schools of thought around this right now.  One is that most of the application lives on the client in Javascript or Flash, and just uses the server for storage.  NetVibes and quite a few of the "Web 2.0" apps are examples of this type of architecture.  The other is that most of the application lives on the server, which is responsible for rendering most of the application, and the client is made up of some widgets and controls that interact with the user and send information back to the server for processing and storage.  Microsoft's Atlas framework is a prime example of this type of architecture.

This almost goes back to the whole object-oriented vs. SOA debate.  The "mainly-client" applications tend to be more object oriented, while the "mainly-server" applications tend to be service-oriented.  I don't feel that either of these are ideal.  I think we're at the point where we need to marry both of these approaches for an ideal application architecture.  Object-oriented on the client is very good, but the client isn't suited for heavy lifting like servers are.  Servers are great at heavy lifting, but they aren't so good at creating a rich user experience.

I think what's needed is a hybrid approach.  I love the idea of having a rich object model on the client, at least up to the point that Javascript allows.  With some hacks, you can simulate multiple levels of inheritance and some pretty nifty object oriented constructs.  What I've been toying with lately is the idea of creating object proxies on the client that are coupled to the "real" objects on the server using SOA.  The client objects would have all of the same properties and methods as the objects on the server, but the guts of the objects would actually live on the server.  The proxy essentially just accepts the data from the client-side Javascript and passes it to the server using SOAP or REST, gets the value, and passes it back.  Transparent to the script on the client, but it essentially offloads the real work to the server.  The server can then do whatever it needs with that data, including passing it off to a message queue, re-routing the request to another server using SOA, etc.  In a sense it's the loosest coupling you can get, it's decoupling the objects on the client itself from the server.  It's using SOA as the plumbing behind the client object.

What I'd like to do is create, say, a Person object on the client which is coupled to a person record on the server (in the database or in XML).  All the property changes would be done on the client, but when the user saves the object, the client object would then use SOAP or REST to call the Save method on the server, passing with it the properties of the client object in JSON format.  Same with loading an object: the client should just send the object ID to the server and ask for the full contents, which the server would return using JSON, and then probably cache it on the client for future use.  The client objects' base classes would also give each object the inherent ability to render itself to HTML, return its XML for Live Clipboard use, etc.
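A minimal sketch of that Person idea might look like the following.  It is an assumption-laden illustration, not an existing library: the Transport interface, the PersonProxy class, its property names and the in-memory fake server are all invented here, and a real transport would wrap an actual SOAP or REST call.

```typescript
// Hypothetical transport; in a browser this would wrap a SOAP or REST call.
interface Transport {
  load(id: string): string;              // returns the object as JSON text
  save(id: string, json: string): void;  // sends the object as JSON text
}

class PersonProxy {
  // Client-side cache of JSON by object ID, shared by all proxies.
  private static cache: Record<string, string> = {};
  fullName = "";
  email = "";

  constructor(private transport: Transport, readonly id: string) {}

  // Send only the ID to the server, get the full contents back as JSON,
  // and cache the result for future use.
  load(): void {
    let json = PersonProxy.cache[this.id];
    if (json === undefined) {
      json = this.transport.load(this.id);
      PersonProxy.cache[this.id] = json;
    }
    const data = JSON.parse(json) as { fullName: string; email: string };
    this.fullName = data.fullName;
    this.email = data.email;
  }

  // Property changes happen locally; Save pushes them to the server.
  save(): void {
    const json = JSON.stringify({ fullName: this.fullName, email: this.email });
    this.transport.save(this.id, json);
    PersonProxy.cache[this.id] = json;
  }

  // The object renders itself (no HTML escaping here, to keep it short).
  toHtml(): string {
    return `<div class="person">${this.fullName} (${this.email})</div>`;
  }
}

// In-memory fake server so the sketch runs on its own.
const serverRecords: Record<string, string> = {
  p1: JSON.stringify({ fullName: "Ada", email: "ada@example.com" }),
};
let loadCalls = 0;
const fakeTransport: Transport = {
  load: (id) => { loadCalls += 1; return serverRecords[id]; },
  save: (id, json) => { serverRecords[id] = json; },
};

const person = new PersonProxy(fakeTransport, "p1");
person.load();
person.fullName = "Ada Lovelace";
person.save();

const samePerson = new PersonProxy(fakeTransport, "p1");
samePerson.load(); // served from the client-side cache, no second round trip
```

The script using PersonProxy never sees the transport, which is the point: the plumbing could be swapped from REST to SOAP (or rerouted to another server) without touching the client code.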

Maybe somebody's already developed something like this, but I haven't been able to find it yet.  I don't want to be tied to any specific development language on the server, either, so libraries like Ajax.NET that generate the client objects from the server code aren't really what I'm looking for.  I'm interested in an actual object library for use on the client that communicates directly with the server in the appropriate places.  The Javascript libraries like Dojo have just matured enough to enable this pretty recently (in my opinion, anyway).  If anyone has seen anything along these lines I'd really appreciate a link.

This is Not an Ordinary Blog Post

This post is a proof of concept--I've embedded microformatted content into the text of this post.  If you run this page thru a microcontent viewer you should be able to see and use the microcontent.  There aren't, to my knowledge, any viewers out there yet, so I (this is my vcard) wrote a simple one that supports events and contacts (hcards and hevents).  Try viewing this post using the microcontent viewer I wrote using this URL:

Go ahead and play with the viewer a little.  Click the links, map an address, make a call with Skype, copy and paste contacts using Live Clipboard.  (Anyone who's never used Live Clipboard before should read this other post for a step-by-step.)  It's purely Javascript and CSS-based, which makes it very simple to plop on top of any AJAX application out there (including RSS readers).  It's also a small piece of a larger project I'm working on, but I wanted to throw it out there because I see a lot of misunderstanding right now about the potential of microformats.  Although I think it's very cool that search engines like Technorati are beginning to understand and aggregate microformatted content, that's only half the equation.  The other half is that we need to allow PEOPLE to use microcontent as well.  This post is an example of that capability.   Viewing this post with a compatible viewer gives the reader the ability to not only read the text, but to do things with the content as well.   (To my knowledge this is the only public text in existence right now with embedded microcontent, although I'd love to learn about some more examples!)
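For anyone who hasn't seen one, here's a small sketch of what an inline hCard amounts to: ordinary HTML with agreed-upon class names (vcard, fn, tel, adr, street-address, locality) wrapped around the pieces of a contact.  The ContactCard type, the toHCard helper and the sample contact are hypothetical; this is not the viewer's code.

```typescript
interface ContactCard {
  fullName: string;
  phone: string;
  street: string;
  city: string;
}

// Escape text so it is safe to place inside HTML element content.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap each part of the contact in a span carrying the hCard class name,
// so the text still reads normally but a viewer can pick the fields out.
function toHCard(card: ContactCard): string {
  return [
    '<span class="vcard">',
    `<span class="fn">${escapeHtml(card.fullName)}</span>, `,
    `<span class="tel">${escapeHtml(card.phone)}</span>, `,
    '<span class="adr">',
    `<span class="street-address">${escapeHtml(card.street)}</span>, `,
    `<span class="locality">${escapeHtml(card.city)}</span>`,
    "</span>",
    "</span>",
  ].join("");
}

const sampleCard = toHCard({
  fullName: "Example Person",
  phone: "555-0100",
  street: "350 Fifth Ave",
  city: "New York",
});
```

Because the markup is just spans with class names, the sentence around it stays readable to people while a viewer (or a search engine) can lift out the structured contact.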

Using Microcontent

Admittedly, there aren't many fun things to do with microcontent yet. However, it's very enlightening the first time you move data around between applications using Live Clipboard.   Try copying a contact out of this post and pasting it into Ray Ozzie's Live Clipboard demo site.  Another site that supports Live Clipboard is M. David Peterson's Global Clip demo (which is super cool because what you paste in gets stored in Amazon's S3 online storage service).  The sites that support Live Clipboard are a little rough around the edges at the moment, but I would assume that things will start coming together nicely over the next six months.

Here's an example embedded event just for kicks: Web 2.0 Conference. Just to give you an event to copy & paste using Live Clipboard.

To me, this is what microformats promise.  They enable us to turn regular old content into rich media, with little to no effort on the part of content creators.

More Examples from Around the Web

Now, let's have a little more fun ;)  You can actually use the viewer I wrote to look at things other than this blog post.  You can either hit the viewer directly using http://www.xformats.org/MicroViewer or you can append the URL to the query string to automatically load up a page like I did with the link to this post earlier.  Go find some microformatted content and plug it in, here are some links to content that I found from poking around on http://www.microformats.org:

Disclaimer:  if it doesn't work or your computer bursts into flames or you break out in a rash or something, tell me about it--but I accept no blame in perpetuity for anything :)  This isn't even beta software, this is like... whatever comes before alpha.  Also, people are doing lots of weird stuff with Microformats such as embedding <script> tags in them, so you'll often find that although the cards will render, they will choke Live Clipboard if you attempt to paste them into another site.  If you're technically minded, try pasting the contact into Notepad or something so you can fix it.

These links also aren't really examples of inline microcontent like this post is, unfortunately to my knowledge this is the first example of that on the Web.  If anyone has any other examples I'd love to know about them.

Technicalities

If anyone's interested I'll post some more technical information about all this, but I'm still refactoring it for broader use in actual products.  All of the code for this example is licensed under the Creative Commons Share-Alike license, so you're free to use and/or modify it if you wish.  I'm still adding to it and refactoring it quite a bit; however, I got it to a stable point and I figured I'd see what people thought of it as it stands now.

Oh and by the way, listen up Microsoft:  we need an editor for this stuff.  If you really want to leapfrog the competition, do us all a favor and build Live Clipboard and microformat support into the next version of Word and Outlook.

Thoughts?

Home away from home

1600 Pennsylvania Ave, Washington

Work

350 Fifth Ave, New York

The return of client-server

As I've been planning and prototyping my next major project, I realized that we really haven't figured out what the optimal architecture for Web 2.0-type applications is yet.  The big difference is that some of the application itself lives on the client now, not just the HTML that the back end spits out.  There's plenty of documentation out there on writing pre-Web 2.0 Web applications, but it's still pretty much up in the air as to how much of the application should actually live on the client versus the server.

I'd call this another application layer beyond the presentation layer, because it's not just presentation.  A lot of the new Web applications out there (e.g. NetVibes) live almost entirely on the client, and just use the server to store data.  In fact most of the new Web applications use the client not just for data presentation and manipulation, but also for input validation and even storage (the use of Flash as a small-scale storage medium is blossoming).  It's almost like adding that old client layer back in from client-server computing.

The problem is, it always takes a while to get new architectures right.  I've found that when I use services like NetVibes, once I create too much data and the client app gets too fat, it starts to break.  But from what I can tell, they use the client to render the windows, lists, feeds, pretty much everything.  I don't really think JavaScript is heavy-duty enough to handle that much of the application, and I don't really care to start worrying about system specs again in the future.  With $100 laptops coming, the client side has to stay lean and mean.  I actually just found a post by Phil Wainewright describing the service model of Employease, which I think is a great example of a good architecture in this new world.  That is, they provide a list of services that can be called from either the client or another server, depending on the business need.

On the other hand, users are getting used to rich client experiences, and there are even benefits to it.  For example, pretty soon we'll be able to use the client for identity verification and personal security.  And we'll be able to use the clipboard in our browser experience.  And lots of other cool stuff.

I don't think any of the regular big boys in software development have really figured it out yet.  Microsoft is trying to straddle the fence on this with their Atlas API, but they're getting terrible reviews and the code it generates is way too fat.  I think Dojo is by far the best-designed API out there; they have the right idea with providing rich-client widgets and offloading the heavy lifting transparently to the server with their great IO engine.

Somebody needs to do a study on rich client scalability and architecture design, or maybe it's out there and I just haven't found it yet.

Client side storage using Dojo

This is really cool.  Ajaxian ran a story about a new capability built into the Dojo framework that provides persistent client-side storage.  There's even a link to a demo app.  Although its usefulness is somewhat limited because, frankly, I want my Web apps to work on every machine I use (yes, I'm a geek; I'm in front of at least 2 machines at all times), this opens up some cool possibilities.

It even does a pretty nifty trick of deciding the most appropriate storage mechanism for the client, with multiple options including cookies, Flash, and an IE-specific mechanism.  I'm envisioning using this for Live Clipboard 2.0; in fact it could even work like the MS Office clipboard, where you're able to hold more than one object/text chunk at a time.  Groovy.
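The "pick the most appropriate storage mechanism" trick can be sketched roughly like this. This is not Dojo's actual API or code; the `DemoProvider` class, the `choose_provider` function, and the capacity numbers are all made up for illustration, and the sketch is in Python rather than JavaScript just to show the shape of the idea: try the mechanisms in preference order and use the first one that the client supports.

```python
from typing import Dict, Optional, Sequence


class DemoProvider:
    """Hypothetical stand-in for one storage mechanism (cookies, Flash, ...)."""

    def __init__(self, name: str, available: bool, max_bytes: int) -> None:
        self.name = name
        self.available = available  # would be feature-detected on a real client
        self.max_bytes = max_bytes  # made-up capacity, for illustration only
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        # Refuse values that would not fit in this mechanism.
        if len(value.encode("utf-8")) > self.max_bytes:
            raise ValueError(f"{self.name}: value too large")
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


def choose_provider(providers: Sequence[DemoProvider]) -> DemoProvider:
    """Return the first available provider; the list is in preference order."""
    for provider in providers:
        if provider.available:
            return provider
    raise RuntimeError("no client-side storage mechanism available")
```

The calling code only ever talks to whatever `choose_provider` hands back, so it doesn't need to know which mechanism ended up being used.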

IDEA:  Could we keep a security token there?

Cross-domain SOAP from the browser

You know, writing about SOA vs. Web 2.0 earlier got me thinking.  I think the real missing piece that needs to be addressed is a way to consume SOAP Web services, cross-domain, from the browser.  That'll make it possible to use SOAP services in a pure AJAX environment for things like mashups.  Right now the only way to consume SOAP services from the browser is to use a "behavior", which only works in Internet Explorer and doesn't work cross-domain (it can only call SOAP services that exist on the same server as the client code).  So in order for applications to use SOAP services on the client, the server must act as an intermediary, which introduces another layer on top of the original service itself.  Right now the only way for the browser to use a Web service on another domain is using JSON, which doesn't support all the security and transaction goodness that SOAP does.  The client needs to be able to use SOAP directly, on other domains, before this debate will quiet down.

By the way, I really believe this is doable with a little elbow grease; Dave Johnson has already figured out how to send XML cross-domain using dynamic script tags, a technique that I'm pretty excited about.  I think all it's going to take is for somebody to translate Microsoft's web service behavior to pure JavaScript using a similar technique and we'll be rockin'.
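Here's a rough sketch of what the server half of the dynamic script tag idea could look like. It is not necessarily how Dave Johnson does it; the function name `wrap_xml_for_script_tag` and the callback-name rule are assumptions for illustration. The idea is that the remote server returns JavaScript instead of raw XML: the XML is encoded as a string literal and passed to a callback function that the page defined before adding the script tag.

```python
import json
import re

# Assumed rule: allow only plain dotted identifiers as callback names, so the
# caller cannot smuggle arbitrary script into the response.
_CALLBACK_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*(\.[A-Za-z_$][A-Za-z0-9_$]*)*")


def wrap_xml_for_script_tag(xml_text: str, callback: str) -> str:
    """Return a JavaScript statement that hands xml_text to callback."""
    if not _CALLBACK_RE.fullmatch(callback):
        raise ValueError("unsafe callback name")
    # json.dumps produces a double-quoted, fully escaped literal that is also
    # a valid JavaScript string literal.
    literal = json.dumps(xml_text)
    # Escape "</" so a closing tag in the payload cannot end an inline script.
    literal = literal.replace("</", "<\\/")
    return f"{callback}({literal});"
```

On the browser side the page would define the callback, append a script element whose `src` points at the remote URL, and parse the XML string when the callback fires.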

Web 2.0 Office Software: TeamSlide

I just read an interesting post on Ajaxian about a new PowerPoint-like slideshow application called TeamSlide.  Mike Arrington also wrote about this new app.

As the online Office-killers pile up, I can't help but wonder what formats these companies are using.  If they're all using independently-developed formats, that's a bad thing.  If they keep them private, that's an even worse thing.

What they really need to do is put their formats out there in the public domain and allow people to beat on them and re-use them.  Not only will it allow people to re-use the work they've already done, but it'll allow people to provide feedback about how to improve the product and the format.  And once a standardized format is reached, adoption is going to grow exponentially.  It'll allow people to choose which application they want to use to view a particular file, which levels the playing field for the little guy.  If somebody doesn't want to spend over a grand for an Office license with PowerPoint, they can choose to use TeamSlide to view the same file.

On a similar note, I know Microsoft is opening up the Office XML format for the next release; I wonder if PowerPoint is included in this.  If it is, it would be wise for TeamSlide and other applications like it to work with that format, even if it is via an import or transform.

Sarbanes-Oxley IT Security Compliance Checklist

Web-based applications and the technology that enables them are fantastic, but they’re bringing a new set of security considerations and challenges along with them. This is destined to become a bigger and bigger issue as Web 2.0 applications gain traction, and particularly as they move into the enterprise.

This list represents an attempt to compile those considerations and the proper way to handle them. This is by no means a comprehensive list, but rather is meant to be a starting point for entrepreneurs or individuals curious about the security of a specific application.

I realize that not all companies will be able to comply with all of the requirements, especially in the back-end section. However, keep in mind that these requirements all need to be addressed to comply with the Sarbanes-Oxley Act (SoX). If a company chooses not to take them into consideration they’re most likely going to be cutting all publicly traded companies (and the companies that do business with them) out of their potential customer base.  On a side note, I hope this checklist will eat a chunk out of the SoX auditing extortion racket.  Back when I had to go through making a company SoX-compliant, I was never able to find anything like this.  Because there's nothing in the Act itself about IT, a small cottage industry has sprung up around telling companies what they need to do to become compliant and then auditing them so that their partners know that they're "SoX Certified".  Well, if you meet the bullets on this checklist you're most likely going to pass.  I had to figure these out the hard way, that is, by paying consultants boku bucks to tell me what they were ;)

Sensitive User Information

The crux of this list and SoX is protecting "sensitive user information".  Here’s a basic list of what constitutes that vague term (I probably left some out so please leave feedback if you think of something I missed):

  • Account numbers and identifiers
  • Customer numbers
  • User names
  • Credit card or bank information of any kind
  • Passwords
  • Private messages and blog posts
  • Wage information
  • Social security and driver’s license numbers
  • Birthdates

Front-End Security

The first list applies to the application front end, with special consideration given to AJAX and Web services calls:

  • The application should use Digest authentication or SSL when accepting a user’s password for authentication.
  • User passwords should be stored in the database as a salted hash of the password rather than as plain text (a bare MD5 hash is too easy to brute-force).
  • No sensitive account stored by the application should ever be rendered to the client. This includes database logins, email logins, third-party site logins, etc.
  • User sessions should not be identified using cookies or IP addresses, both of which can be easily compromised.
  • No sensitive information should be stored in cookies.
  • Strong passwords (more than 8 characters, mixed alphanumeric and special characters, mixed upper- and lower-case) should be enforced if users select their own passwords.
  • HTTP POST requests should be used instead of GET requests whenever possible. This isn’t more secure per se, but it raises the bar for potential hackers and makes it more difficult to crack your system.
  • GET and POST requests should not be vulnerable to SQL injection attacks.  All forms should check for special characters such as single or double quotes before sending the information to the database.  Preferably, stored procedures or parameterized queries should be used for all database access, since they pass user input as data rather than as part of the SQL text (as long as the procedure doesn't build SQL strings itself).
  • All input validation should be done on the server, even if it was already done on the client.  Client-side checks can be bypassed, so validating on the server closes the door on JavaScript injection attacks.
  • If sensitive user information is sent or received from Web services: The WS-Security SOAP header or SSL should be used.
  • If sensitive user information is sent or received using XmlHttpRequest (AJAX calls): The entire page utilizing XmlHttpRequest needs to be secured using SSL (which will secure the XmlHttpRequest calls from that page using SSL as well).  SSL is currently the only reliable way of protecting XmlHttpRequest calls; all other methods are vulnerable to man-in-the-middle attacks.
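To make the SQL injection item concrete, here is a minimal sketch using Python's built-in `sqlite3` module. SQLite has no stored procedures, so this swaps in a parameterized query, which gets the same benefit: the user's input is bound as a value and never becomes part of the SQL text. The `users` table and the `find_user` and `make_demo_db` helpers are hypothetical.

```python
import sqlite3
from typing import Optional, Tuple


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with one hypothetical user."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, user_name TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO users (user_name) VALUES (?)", ("alice",))
    return conn


def find_user(conn: sqlite3.Connection, user_name: str) -> Optional[Tuple[int, str]]:
    """Look a user up by name; the ? placeholder keeps the input out of the SQL."""
    cur = conn.execute(
        "SELECT id, user_name FROM users WHERE user_name = ?", (user_name,)
    )
    return cur.fetchone()
```

A classic injection string such as `alice' OR '1'='1` is simply treated as a (nonexistent) user name here, so the lookup comes back empty instead of matching every row.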

Back-End Security

This list applies primarily to the internal database and network:

  • The database administrator password should not be used for application access.  If possible it should be renamed from the default name (sa, for example), and only known by somebody outside of the IT department.
  • If the database contains sensitive customer information, the number of people who know the application database login or have unrestricted access to the database should be strictly limited.  Backups should be encrypted and stored off-site in a secured location.
  • Anyone who has access to the production database should be required to change their password at least every 90 days.
  • If you must store credit card information, it should be encrypted and preferably wiped out after processing is completed.
  • Change your network Administrator account name (so that it's not "Administrator").
  • Domain administrator accounts should rarely if ever be given out, and the list of people with that access should be documented and readily available.
  • Security violations (such as a user entering the wrong password three times in a row) should be logged to a secure location and reviewed by the company Security Officer on a regular basis.
  • The physical servers that hold sensitive customer information need to be secured and protected as well. Typically this is taken care of in a hosted environment, but if you’re hosting your application from your basement then you’d better make sure that you have the basement locked and you make the water heater repair man sign in and out.
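The security-violation item in the list above could be sketched along these lines. The `FailedLoginTracker` class and the `security` logger name are made up for illustration, and a real system would ship these entries to a secured log rather than the default handler; the threshold of three comes from the checklist item itself.

```python
import logging
from collections import defaultdict
from typing import DefaultDict

MAX_FAILURES = 3  # "wrong password three times in a row", per the checklist

security_log = logging.getLogger("security")  # hypothetical logger name


class FailedLoginTracker:
    """Counts consecutive failed logins per user and logs violations."""

    def __init__(self, max_failures: int = MAX_FAILURES) -> None:
        self._max_failures = max_failures
        self._streaks: DefaultDict[str, int] = defaultdict(int)

    def record(self, user_name: str, success: bool) -> bool:
        """Record one attempt; return True if a violation was logged."""
        if success:
            # A successful login ends the streak.
            self._streaks.pop(user_name, None)
            return False
        self._streaks[user_name] += 1
        if self._streaks[user_name] >= self._max_failures:
            security_log.warning(
                "security violation: %d consecutive failed logins for %r",
                self._streaks[user_name],
                user_name,
            )
            return True
        return False
```

Every failure at or past the threshold gets logged, so a sustained guessing attack shows up as a run of entries for the Security Officer to review.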

Perimeter (Network) Security

Here are some requirements that are just good practices for building secure Web-based applications, particularly when they’re hosted on the Internet:

  • The firewall should only allow ports 80 and 443 (HTTP and HTTPS) except when other applications such as BitTorrent or FTP need to be used.
  • The Web servers should be on a separate network, separated from the database and other internal servers by a firewall which only allows the database application port to be used.
  • The default database application port should be changed if possible.

Hope this helps everyone out there, please help me flesh this out if I missed something.

Web 2.0 Security

I think one of the topics that's really going to be hot in the 2nd half of 2006 is Web 2.0 security. Before these apps can ever live in the enterprise, there are going to be a lot of hard questions asked about how hardened these apps are and if they're really secure.

For example, are they using anything besides SSL to encrypt user passwords and sensitive information? Do the AJAX calls back to the server permit people to sniff and decrypt tokens that can be used to glean private customer information? Are the AJAX and HTTP calls subject to SQL injection attacks? Are the passwords stored in the database or are they using salted password derivatives? Are they using WSE for their Web services calls?
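For the salted password derivative question, a minimal sketch might look like the following. It uses PBKDF2 from Python's standard library; the iteration count, salt size, and function names are assumptions for illustration, not a recommendation from any particular standard.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor; tune for your hardware
SALT_BYTES = 16       # assumed salt size


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key); a fresh random salt is made if none is given."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    _, derived = hash_password(password, salt)
    return hmac.compare_digest(derived, expected)
```

Only the salt and the derived key go in the database. The password itself is never stored, and two users with the same password end up with different stored values because their salts differ.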

Big companies will and do ask these questions. Before the Web 2.0 apps can graduate from use only in mom & pop shops, they'll need to answer them.

The problem is, it's too easy to build cool applications now without a knowledge of proper software architecture. I know. I've been burned by these very questions in the past, and they're not easy to answer if you've never answered them before. The very fact that the ASP-model applications *don't* provide answers to these questions tells me they're not prepared to answer them, and are probably hoping that they don't ever get asked. Ostrich Syndrome - head in the sand.

I think there's a really big opportunity here for somebody to start a company that certifies software companies for best security practices. It should be pretty easy to compile an audit checklist that somebody can use to check their software against. In fact I might very well start one.

If this doesn't happen, once hackers catch on to AJAX techniques this industry is going to shoot itself in the foot (or maybe more relevant, the womb?). It'll never gain traction because CEOs, CTOs and CIOs of big companies will be so scared by stories of Web 2.0 applications being compromised that they won't touch them with a 10-foot pole. Remember, under Sarbanes-Oxley these guys could do hard jail time if their customers' information is compromised. Something to keep in mind, all you Basecamps out there.