I have a proposal.
I have been watching and reading about social network portability, data portability, OpenID, Facebook Beacon, Doc Searls’ vendor relationship management, Obama’s call for open formats, Google Drive, and Jon Udell’s hosted lifebits scenarios (another post just today).
And Chris Messina has been hanging around this space for a while as well and posted two solid write-ups this past week – on data portability and data brokers.
All of these things seem to scream for some integration, for a system that plays by all the rules and ‘just works’ for the simplest of use cases today, and is ready to scale up and handle the use cases of tomorrow.
I’m envisioning a wrapper – a specification that defines how data should be held and managed for an individual. At first, this should be a single human – later, perhaps organizations or groups of people.
It should be the bucket where our digital stuff lives and the vehicle through which we interact with vendors and each other.
I see three parts – at least for now.
1. Data Repository
This is the heart of the matter. A solid datastore on which to build. This can be a collection of approved document formats – ones that are found to be open and/or well understood. Archival quality stuff here. These need to last for a long time and not be rendered incompatible or unreadable in the future. Really, this is nothing more than a well-defined filesystem or collection of files.
I think it’s most useful at this point to consider the different types of data and simply list the types of formats that meet these criteria. If we don’t have such a format at this time, document the gap and hope that the next few years provide standards that match up.
- ICS/iCal – Calendar/Event Data
- MPEG – Video Data
- Ogg – Video and Audio Data
- TXT – Text, preferably UTF-8
- PDF – Portable Document Format
- ODF – Open Document Format (Word Processing, Charts, Spreadsheets, Presentations)
- vCard – People Listings, AddressBooks
- JPEG – Photographs / Images
- SVG – Scalable Vector Graphics / Images
- PNG – Portable Network Graphics / Images
- KML – Keyhole Markup Language – Mapping Data
- FOAF/XFN – Relationships between people
- OPML – Subscriptions
- GEDCOM – Genealogy Data (has limitations)
- PHR/EHR – Personal/Electronic Health Record – complicated, lots of standardization attempts
- APML – Attention / Interest Data
- Financial Data – records, transactions, balances; gets complicated quickly (OFX, GnuCash)
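To make the "list the formats, document the gaps" idea concrete, here is a minimal sketch in Python of a format registry for the repository. The category names, the choice of file extensions, and the helper names `classify` and `gaps` are all assumptions made for illustration; nothing here is part of an agreed spec.

```python
from pathlib import PurePath

# Hypothetical registry: data category -> file extensions of approved formats.
# An empty list documents a gap where no open format has been settled on yet.
APPROVED_FORMATS = {
    "calendar": [".ics"],
    "video": [".mpg", ".mpeg", ".ogv"],
    "audio": [".ogg", ".oga"],
    "text": [".txt"],
    "document": [".pdf", ".odt", ".ods", ".odp"],
    "contacts": [".vcf"],
    "image": [".jpg", ".jpeg", ".svg", ".png"],
    "mapping": [".kml"],
    "subscriptions": [".opml"],
    "genealogy": [".ged"],
    "health": [],  # gap: many competing standardization attempts
    "financial": [".ofx"],
}


def classify(filename):
    """Return the category whose approved formats include this file, or None."""
    ext = PurePath(filename).suffix.lower()
    for category, extensions in APPROVED_FORMATS.items():
        if ext in extensions:
            return category
    return None


def gaps():
    """List the categories that still have no approved format."""
    return sorted(c for c, exts in APPROVED_FORMATS.items() if not exts)
```

A file that `classify` cannot place (it returns `None`) is either in a format that has not been approved or belongs to a category nobody has listed yet, which is exactly the kind of gap worth writing down.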
2. Data Channels
This is the second piece – getting data in and out of the repository. Open protocols are the key here and it seems we have quite a number of them already being pushed around the live web. Let’s name some and find some gaps…
- XMPP – Messaging – does voice, text, images, this is the Jabber protocol
- HTTP – The web protocol – very handy
- RSS/Atom – Syndication
- OAuth – Authentication between applications
- OpenID – Authentication of Users
- Yadis – Service Discovery
- SMTP/IMAP – Mail protocols
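As one small example of a channel out of the repository, the sketch below serializes a few repository items as an Atom feed using only Python's standard library. The function name `build_atom_feed` and the shape of the `entries` dicts are assumptions for illustration, and the output is deliberately minimal: a feed meant for real consumers would also need author information and links.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_atom_feed(feed_id, title, updated, entries):
    """Serialize repository items as a minimal Atom feed string.

    `entries` is a list of dicts with "id", "title", and "updated" keys
    (a hypothetical shape chosen for this sketch).
    """
    # Emit Atom as the default namespace instead of an ns0: prefix.
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    for tag, text in (("id", feed_id), ("title", title), ("updated", updated)):
        ET.SubElement(feed, f"{{{ATOM_NS}}}{tag}").text = text
    for item in entries:
        entry = ET.SubElement(feed, f"{{{ATOM_NS}}}entry")
        for tag in ("id", "title", "updated"):
            ET.SubElement(entry, f"{{{ATOM_NS}}}{tag}").text = item[tag]
    return ET.tostring(feed, encoding="unicode")
```

The resulting string could then be served over HTTP, which is how two of the protocols in the list above would combine into a single outbound channel.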
3. Data Management
The third part of this specification would focus on managing the data in the repository and keeping things secure and logged. This is the most complicated part and what makes OpenLifeBits most different from anything we’ve already got today. Encryption should be at the heart of keeping things well secured (a brokered encryption market that manages access to secret keys is another task altogether). Additionally, the dataset should have the capability to be split/merged at will. If you don’t want your medical history stored near your financials, so be it.
- Metadata – describe the data in the repository – using open standard METS
- Permissions – access control – does an open standard exist? Unix permissions?
- Encryption – variety of standards, definitely should have good, strong defaults
- Backup – needs to be atomic, automatic, and recoverable (versioned, even)
- Logging – a full record of what has happened to the life of the dataset
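The logging item is the easiest of these to sketch. One possible approach, and only one of several, is an append-only log in which every record includes a hash of the record before it, so that later tampering with the history can be detected. The class name `DatasetLog`, the record fields, and the choice of SHA-256 are assumptions for illustration; a real log would also carry timestamps and would need to be stored somewhere durable.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(body):
    """Hash a record body deterministically (sorted keys, UTF-8 JSON)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class DatasetLog:
    """Append-only log where each record commits to the previous one."""

    def __init__(self):
        self.records = []

    def append(self, actor, action, target):
        prev_hash = self.records[-1]["hash"] if self.records else GENESIS
        body = {"actor": actor, "action": action, "target": target, "prev": prev_hash}
        record = dict(body, hash=_digest(body))
        self.records.append(record)
        return record

    def verify(self):
        """Return True only if no record has been altered, dropped, or reordered."""
        prev_hash = GENESIS
        for record in self.records:
            body = {k: v for k, v in record.items() if k != "hash"}
            if record["prev"] != prev_hash or record["hash"] != _digest(body):
                return False
            prev_hash = record["hash"]
        return True
```

Changing any earlier record breaks the chain for everything after it, which is what gives "a full record of what has happened" some teeth even when a third party holds the data.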
Interfaces
A fourth piece, really beyond the scope of any spec, is the interface(s) into this data: how a person actually interacts with the data, and with the outside world via the dataset. A comprehensive list is not possible today, but a solid look at what is available now should force enough flexibility of thinking to let future innovators do what they do best.
- Mac
- Windows
- Linux
- API
- smartphone
- star trek communicator
Brokered Digital Identity Management
The entire infrastructure/dataset should be portable. Most people will not want to worry about the intricacies of managing this kind of data. We already do it for banking – we get a broker. We outsource and allow experts to manage our stuff for us. We let them worry about the details and there is a marketplace to encourage them to behave well. If they do not, we can move our stuff. This should be the case with our lifebits as well.
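If the repository really is just a well-defined collection of files, then "moving our stuff" between brokers can at least be checked mechanically. The sketch below builds a manifest of SHA-256 checksums for a directory tree and compares the manifests before and after a move. The function names `build_manifest` and `verify_move` are hypothetical, and the sketch ignores encryption, permissions, and very large files.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def verify_move(old_root, new_root):
    """Return True if both trees hold the same files with the same contents."""
    return build_manifest(old_root) == build_manifest(new_root)
```

A broker that cannot hand back a dataset matching its manifest has either lost something or changed something, and that is the kind of behavior a marketplace should punish.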
Big Picture
There is a large amount of momentum (and cash) behind today’s corporate model (companies own data about you). Inverting the system to be beholden to me and my permission model is not something that will happen overnight. Additionally, the legal questions around ownership of data, and around the contractual obligations of those you share your information with, remain unanswered. I have a hunch though that a lot of these types of questions have precedent – just not with the specifics of personal data archives.
As we move into a more digital existence, we will need tools that begin to manage this type of stuff on the personal level. Do you think we can start small and simple and grow into the more complicated models later? Is anyone going to see any value in this OpenLifeBits model besides the geeks among us?
Next week’s IIW in Mountain View should have quite a few people willing to talk about an OpenLifeBits. Please come and find me if you want to hash some of this out further.
Tags: archives - dataportability - jonudell - lifebits - openlifebits - phr - PIM - snp - vrm