<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Terrell Russell: This Old Network &#187; PIM</title>
	<atom:link href="http://weblog.terrellrussell.com/tag/pim/feed/" rel="self" type="application/rss+xml" />
	<link>http://weblog.terrellrussell.com</link>
	<description>Ideas on interconnections, identity, and information from all sides.</description>
	<lastBuildDate>Thu, 22 Dec 2011 15:31:03 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>The Alexandrine Dilemma</title>
		<link>http://weblog.terrellrussell.com/2008/12/the-alexandrine-dilemma/</link>
		<comments>http://weblog.terrellrussell.com/2008/12/the-alexandrine-dilemma/#comments</comments>
		<pubDate>Wed, 10 Dec 2008 16:49:13 +0000</pubDate>
		<dc:creator>Terrell Russell</dc:creator>
				<category><![CDATA[Default]]></category>
		<category><![CDATA[librarians]]></category>
		<category><![CDATA[library science]]></category>
		<category><![CDATA[personal archiving]]></category>
		<category><![CDATA[pesce]]></category>
		<category><![CDATA[PIM]]></category>
		<category><![CDATA[shirky]]></category>

		<guid isPermaLink="false">http://weblog.terrellrussell.com/?p=169</guid>
		<description><![CDATA[Mark Pesce gave a keynote entitled &#8220;The Alexandrine Dilemma&#8221; at the New Librarians Symposium last Friday. He spoke about how Library Science, its skills and philosophy, are necessary for everyone to embrace and understand as we move forward in our networked world. Quite inspiring, as someone who&#8217;s selected that line of work, if I do [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.playfulworld.com/">Mark Pesce</a> gave a <a href="http://blog.futurestreetconsulting.com/?p=101">keynote entitled &#8220;The Alexandrine Dilemma&#8221;</a> at the <a href="http://conferences.alia.org.au/newlibrarian2008/Speakers.html">New Librarians Symposium</a> last Friday.  <a href="http://blog.futurestreetconsulting.com/?p=101">He spoke about how Library Science, its skills and philosophy, are necessary</a> for everyone to embrace and understand as we move forward in our networked world.</p>
<p>Quite inspiring, as someone who&#8217;s selected that line of work, if I do say so myself.</p>
<p>A few samples:</p>
<blockquote><p>In fact, because the library is universal, library science now needs to be a universal skill set, more broadly taught than at any time previous to this. We have become a data-centric culture, and are presently drowning in data.
</p></blockquote>
<blockquote><p>I could go on and on, but the basic point is this: <strong>wherever data is being created, that’s the opportunity for library science in the 21st century</strong>. Since data is being created almost absolutely everywhere, the opportunities for library science are similarly broad. It’s up to you to show us how it’s done, lest we drown in our own creations.
</p></blockquote>
<blockquote><p>The dilemma that confronts us is that for the next several years, people will be questioning the value of libraries; if books are available everywhere, why pay the upkeep on a building? Yet the value of a library is not the books inside, but the expertise in managing data. That can happen inside of a library; it has to happen somewhere. Libraries could well evolve into the resource the public uses to help manage their digital existence. Librarians will become partners in information management, indispensable and highly valued.</p>
<p>In a time of such radical and rapid change, it’s difficult to know exactly where things are headed. We know that books are headed online, and that libraries will follow. But we still don’t know the fate of librarians. I believe that the transition to a digital civilization will founder without a lot of fundamental input from librarians. We are each becoming archivists of our lives, but few of us have training in how to manage an archive.
</p></blockquote>
<blockquote><p>When you announce yourselves to the broader public as the individuals empowered to help us manage our digital lives, you’ll doubtless find yourselves overwhelmed with individuals who are seeking to benefit from your expertise. What’s more, to deal with the demand, I expect Library Science to become one of the hot subjects of university curricula of the 21st century.
</p></blockquote>
<p>One interesting note &#8211; and somewhere I think Mark misses the boat:</p>
<blockquote><p>It’s interesting to note that books.google.com uses Google’s text search-based interface. Based on my own investigations, you can’t type in a Library of Congress catalog number and get a list of books under that subject area. Google seems to have abandoned – or ignored – library science in its own book project. I can’t tell you why this is, I can only tell you that it looks very foolish and naïve. It may be that Google’s army of PhDs do not include many library scientists. Otherwise why would you have made such a beginner’s mistake? It smells of an amateur effort from a firm which is not known for amateurism.
</p></blockquote>
<p>This isn&#8217;t a shortcoming of Google &#8211; it is a liberation from a historical constraint: shelf space as the limiting factor in a physical library.  Catalog numbers have all been subjectively applied by the local librarian and rarely agree across libraries.  Additionally, these subjective assignments reflect more about the culture making the assignment than the content of the work.</p>
<p>I&#8217;m surprised Mark swung so wildly on this point.  If we&#8217;re all sharers now, and we bring our own opinions to the table, then what we need more than &#8216;search by catalog number&#8217; is a means to sift and sort the multiple readers&#8217; opinions of what a book/work is about.  This may include expert and non-expert opinion &#8211; the point is that the work can be filed in multiple places at the same time.</p>
<p><a href="http://www.shirky.com/writings/ontology_overrated.html">As Clay Shirky has said, &#8220;there is no shelf&#8221;:</a></p>
<blockquote><p>People have been freaking out about the virtuality of data for decades, and you&#8217;d think we&#8217;d have internalized the obvious truth: there is no shelf. In the digital world, there is no physical constraint that&#8217;s forcing this kind of organization on us any longer. We can do without it, and you&#8217;d think we&#8217;d have learned that lesson by now.
</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://weblog.terrellrussell.com/2008/12/the-alexandrine-dilemma/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>OpenLifeBits &#8211; For Your Digital Stuff</title>
		<link>http://weblog.terrellrussell.com/2007/11/openlifebits-for-your-digital-stuff/</link>
		<comments>http://weblog.terrellrussell.com/2007/11/openlifebits-for-your-digital-stuff/#comments</comments>
		<pubDate>Wed, 28 Nov 2007 21:14:05 +0000</pubDate>
		<dc:creator>Terrell Russell</dc:creator>
				<category><![CDATA[Default]]></category>
		<category><![CDATA[archives]]></category>
		<category><![CDATA[dataportability]]></category>
		<category><![CDATA[jonudell]]></category>
		<category><![CDATA[lifebits]]></category>
		<category><![CDATA[openlifebits]]></category>
		<category><![CDATA[phr]]></category>
		<category><![CDATA[PIM]]></category>
		<category><![CDATA[snp]]></category>
		<category><![CDATA[vrm]]></category>

		<guid isPermaLink="false">http://weblog.terrellrussell.com/2007/11/openlifebits-for-your-digital-stuff/</guid>
		<description><![CDATA[I have a proposal. I have been watching and reading about social network portability and data portability and OpenID and facebook beacon and doc searls&#8217; vendor relationship management and Obama&#8217;s call for open formats and Google Drive and Jon Udell&#8217;s hosted lifebits scenarios (another post just today). And Chris Messina has been hanging around this [...]]]></description>
			<content:encoded><![CDATA[<p>I have a proposal.</p>
<p>I have been watching and reading about <a href="http://microformats.org/wiki/social-network-portability">social network portability</a> and <a href="http://www.dataportability.org/">data portability</a> and <a href="http://openid.net">OpenID</a> and <a href="http://chimprawk.blogspot.com/2007/11/data-sharing-with-facebooks-beacon.html">facebook</a> <a href="http://blogs.law.harvard.edu/doc/2007/11/25/time-to-write-our-own-rules/">beacon</a> and <a href="http://blogs.law.harvard.edu/doc/">doc searls&#8217;</a> <a href="http://projectvrm.org/">vendor relationship management</a> and <a href="http://www.veen.com/jeff/archives/000976.html">Obama&#8217;s call for open formats</a> and <a href="http://online.wsj.com/article/SB119612660573504716.html">Google Drive</a> and <a href="http://blog.jonudell.net/">Jon Udell&#8217;s</a> <a href="http://blog.jonudell.net/2007/01/29/the-persistent-blogosphere/">hosted</a> <a href="http://blog.jonudell.net/2007/05/22/hosted-lifebits/">lifebits</a> <a href="http://blog.jonudell.net/2007/08/22/hosted-lifebits-scenarios/">scenarios</a> (<a href="http://blog.jonudell.net/2007/11/28/your-winnings-sir/">another post just today</a>).  </p>
<p>And <a href="http://factoryjoe.com/blog/">Chris Messina</a> has been hanging around this space for a while as well and posted two solid write-ups this past week &#8211; <a href="http://factoryjoe.com/blog/2007/11/26/data-portability-and-thinking-ahead-to-2008/">on data portability</a> and <a href="http://factoryjoe.com/blog/2007/11/26/data-banks-data-brokers-and-citizen-bargaining-power/">data brokers</a>.</p>
<p>All of these things seem to scream for some integration, for a system that plays by all the rules and &#8216;just works&#8217; for the simplest of use cases today, and is ready to scale up and handle the use cases of tomorrow.</p>
<p>I&#8217;m envisioning a wrapper &#8211; a specification that defines how data should be held and managed for an individual.  At first, this should be a single human &#8211; later, perhaps organizations or groups of people.</p>
<p>It should be the bucket where our digital stuff lives and the vehicle through which we interact with vendors and each other.</p>
<p>I see three parts &#8211; at least for now.</p>
<p><strong>1. Data Repository</strong></p>
<p>This is the heart of the matter.  A solid datastore on which to build.  This can be a collection of approved document formats &#8211; ones that are found to be open and/or well understood.  Archival quality stuff here.  These need to last for a long time and not be rendered incompatible or unreadable in the future.  Really, this is nothing more than a well-defined filesystem or collection of files.</p>
<p>I think it&#8217;s most useful at this point to consider the different types of data and simply list the types of formats that meet these criteria.  If we don&#8217;t have such a format at this time, document the gap and hope that the next few years provide standards that match up.</p>
<ul>
<li><a href="http://en.wikipedia.org/wiki/ICalendar">ICS/iCal</a> &#8211; Calendar/Event Data</li>
<li><a href="http://en.wikipedia.org/wiki/MPEG">MPEG</a> &#8211; Video Data</li>
<li><a href="http://en.wikipedia.org/wiki/Ogg">Ogg</a> &#8211; Video and Audio Data</li>
<li><a href="http://en.wikipedia.org/wiki/Text_file">TXT</a> &#8211; Text, preferably UTF-8</li>
<li><a href="http://en.wikipedia.org/wiki/PDF">PDF</a> &#8211; Portable Document Format</li>
<li><a href="http://en.wikipedia.org/wiki/OpenDocument">ODF</a> &#8211; Open Document Format (Word Processing, Charts, Spreadsheets, Presentations)</li>
<li><a href="http://en.wikipedia.org/wiki/Vcard">vCard</a> &#8211; People Listings, AddressBooks</li>
<li><a href="http://en.wikipedia.org/wiki/JPEG">JPEG</a> &#8211; Photographs / Images</li>
<li><a href="http://en.wikipedia.org/wiki/Scalable_Vector_Graphics">SVG</a> &#8211; Scalable Vector Graphics / Images</li>
<li><a href="http://en.wikipedia.org/wiki/Portable_Network_Graphics">PNG</a> &#8211; Portable Network Graphics / Images</li>
<li><a href="http://en.wikipedia.org/wiki/Keyhole_Markup_Language">KML</a> &#8211; Keyhole Markup Language &#8211; Mapping Data</li>
<li><a href="http://en.wikipedia.org/wiki/FOAF_%28software%29">FOAF</a>/<a href="http://en.wikipedia.org/wiki/XHTML_Friends_Network">XFN</a> &#8211; Relationships between people</li>
<li><a href="http://en.wikipedia.org/wiki/OPML">OPML</a> &#8211; Subscriptions</li>
<li><a href="http://en.wikipedia.org/wiki/GEDCOM">GEDCOM</a> &#8211; Genealogy Data (<a href="http://archiver.rootsweb.com/th/read/GENMSC/2007-07/1184785656">has limitations</a>)</li>
<li><a href="http://en.wikipedia.org/wiki/Personal_health_record">PHR</a>/<a href="http://en.wikipedia.org/wiki/Electronic_health_record">EHR</a> &#8211; Personal/Electronic Health Record &#8211; complicated, <a href="http://en.wikipedia.org/wiki/Electronic_health_record#Standards">lots of standardization attempts</a></li>
<li><a href="http://en.wikipedia.org/wiki/APML">APML</a> &#8211; Attention / Interest Data</li>
<li>Financial Data &#8211; records, transactions, balances, gets complicated quickly, <a href="http://en.wikipedia.org/wiki/Open_Financial_Exchange">OFX</a>, <a href="http://www.gnucash.org/">GnuCash</a></li>
</ul>
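<p>As a rough illustration of the repository idea above, here is a minimal Python sketch that checks files against a list of approved formats and reports the gaps.  The extension table, the <code>classify</code> function and the sample file names are all assumptions made up for this example, not part of any existing spec.</p>

```python
from pathlib import Path

# Hypothetical mapping from file extension to the kind of data it carries.
# The formats come from the list above; the exact extensions are assumptions.
APPROVED_FORMATS = {
    ".ics": "calendar/event data",
    ".txt": "text",
    ".pdf": "documents",
    ".odt": "word processing (ODF)",
    ".vcf": "people listings (vCard)",
    ".jpg": "photographs",
    ".svg": "vector graphics",
    ".png": "images",
    ".kml": "mapping data",
    ".opml": "subscriptions",
    ".ged": "genealogy data (GEDCOM)",
}


def classify(filenames):
    """Split filenames into approved items (with their data type) and gaps."""
    approved, gaps = {}, []
    for name in filenames:
        kind = APPROVED_FORMATS.get(Path(name).suffix.lower())
        if kind is None:
            gaps.append(name)  # no approved format yet: document the gap
        else:
            approved[name] = kind
    return approved, gaps


# Made-up file names for illustration only.
approved, gaps = classify(["trip.ics", "resume.PDF", "budget.xls"])
```

<p>Anything that lands in <code>gaps</code> is a candidate for the &#8220;document the gap&#8221; step: data we hold today without an open, well-understood format to keep it in.</p>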
<p><strong>2. Data Channels</strong></p>
<p>This is the second piece &#8211; getting data in and out of the repository.  Open protocols are the key here and it seems we have quite a number of them already being pushed around the live web.  Let&#8217;s name some and find some gaps&#8230;</p>
<ul>
<li><a href="http://en.wikipedia.org/wiki/XMPP">XMPP</a> &#8211; Messaging &#8211; does voice, text, images, this is the Jabber protocol</li>
<li><a href="http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol">HTTP</a> &#8211; The web protocol &#8211; very handy</li>
<li><a href="http://en.wikipedia.org/wiki/RSS">RSS</a>/<a href="http://en.wikipedia.org/wiki/Atom_%28standard%29">Atom</a> &#8211; Syndication</li>
<li><a href="http://oauth.net/">OAuth</a> &#8211; Authentication between applications</li>
<li><a href="http://openid.net">OpenID</a> &#8211; Authentication of Users</li>
<li><a href="http://en.wikipedia.org/wiki/Yadis">Yadis</a> &#8211; Service Discovery</li>
<li><a href="http://en.wikipedia.org/wiki/SMTP">SMTP</a>/<a href="http://en.wikipedia.org/wiki/IMAP">IMAP</a> &#8211; Mail protocols</li>
</ul>
<p><strong>3. Data Management</strong></p>
<p>The third part of this specification would focus on managing the data that is in the repository and keeping things secure and logged.  This is the most complicated part and what makes OpenLifeBits most different from anything we&#8217;ve already got today.  Encryption should be at the heart of keeping things well secured (a brokered encryption market that manages access to secret keys is another task altogether).  Additionally, the dataset should have the capability to be split/merged at will.  If you don&#8217;t want your medical history stored near your financials, so be it.</p>
<ul>
<li>Metadata &#8211; describe the data in the repository &#8211; using open standard <a href="http://en.wikipedia.org/wiki/METS">METS</a></li>
<li>Permissions &#8211; access control &#8211; does an open standard exist, Unix permissions?</li>
<li>Encryption &#8211; variety of standards, definitely should have good, strong defaults</li>
<li>Backup &#8211; needs to be atomic, automatic, and recoverable (versioned, even)</li>
<li>Logging &#8211; a full record of what has happened to the life of the dataset</li>
</ul>
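<p>To make the logging item a little more concrete, here is one possible sketch in Python: an append-only log where each entry stores a hash of the entry before it, so later tampering can be detected.  Hash chaining is my own assumption about how such a log might work, and the function and field names (<code>append_entry</code>, <code>verify</code>, <code>action</code>, <code>target</code>) are hypothetical.</p>

```python
import hashlib
import json


def _digest(body):
    """Hash the entry body in a stable (sorted-key) JSON form."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_entry(log, action, target):
    """Append an entry whose hash covers its content and the previous hash."""
    previous = log[-1]["hash"] if log else ""
    body = {"action": action, "target": target, "previous": previous}
    log.append({**body, "hash": _digest(body)})
    return log


def verify(log):
    """Return True if every entry's hash and back-link are intact."""
    previous = ""
    for entry in log:
        body = {key: entry[key] for key in ("action", "target", "previous")}
        if entry["previous"] != previous or entry["hash"] != _digest(body):
            return False
        previous = entry["hash"]
    return True


# Made-up events for illustration only.
log = []
append_entry(log, "added", "photos/2007/beach.jpg")
append_entry(log, "shared", "photos/2007/beach.jpg")
```

<p>This only shows the shape of the idea.  A real log would also need timestamps, identities and secure storage for the log itself.</p>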
<p><strong>Interfaces</strong></p>
<p>A fourth piece that is really beyond the scope of the definition of any spec is the interface(s) into this data and how a person actually interacts with the data and the outside world via the dataset.  A comprehensive list will not be possible to create today, but a solid look at what is available today should force a flexibility of thinking to allow future innovators to do what they do best.</p>
<ul>
<li>Mac</li>
<li>Windows</li>
<li>Linux</li>
<li>API</li>
<li>smartphone</li>
<li>star trek communicator</li>
</ul>
<p><strong>Brokered Digital Identity Management</strong></p>
<p>The entire infrastructure/dataset should be portable.  Most people will not want to worry about the intricacies of managing this kind of data.  We already do it for banking &#8211; we get a broker.  We outsource and allow experts to manage our stuff for us.  We let them worry about the details and there is a marketplace to encourage them to behave well.  If they do not, we can move our stuff.  This should be the case with our lifebits as well.</p>
<p><strong>Big Picture</strong></p>
<p>There is a large amount of momentum (and cash) behind today&#8217;s corporate model (companies own data about you).  Inverting the system to be beholden to me and my permission model is not something that will happen overnight.  Additionally, the legal questions around ownership of data and the contractual obligations of those you share your information with remain unanswered.  I have a hunch though that a lot of these types of questions have precedent &#8211; just not with the specifics of personal data archives.</p>
<p>As we move into a more digital existence, we will need tools that begin to manage this type of stuff on the personal level.  Do you think we can start small and simple and grow into the more complicated models later?  Is anyone going to see any value in this OpenLifeBits model besides the geeks among us?</p>
<p><a href="http://iiw.idcommons.net/index.php/Main_Page">Next week&#8217;s IIW in Mountain View</a> should have quite a few people willing to talk about an OpenLifeBits.  Please come and find me if you want to hash some of this out further.</p>
]]></content:encoded>
			<wfw:commentRss>http://weblog.terrellrussell.com/2007/11/openlifebits-for-your-digital-stuff/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Timelined Information Retrieval</title>
		<link>http://weblog.terrellrussell.com/2007/10/timelined-information-retrieval/</link>
		<comments>http://weblog.terrellrussell.com/2007/10/timelined-information-retrieval/#comments</comments>
		<pubDate>Wed, 31 Oct 2007 16:31:57 +0000</pubDate>
		<dc:creator>Terrell Russell</dc:creator>
				<category><![CDATA[Default]]></category>
		<category><![CDATA[email]]></category>
		<category><![CDATA[normal distribution]]></category>
		<category><![CDATA[PIM]]></category>
		<category><![CDATA[retrieval]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[timeline]]></category>

		<guid isPermaLink="false">http://weblog.terrellrussell.com/2007/10/timelined-information-retrieval/</guid>
		<description><![CDATA[I was thinking about how I search through my email this morning and worked out that sometimes I know more about *when* an email happened than what it said or who it was from. This is a rare thing, but generalizing, I quickly worked out that this would be a great addition to any/all search [...]]]></description>
			<content:encoded><![CDATA[<p>I was thinking about how I search through my email this morning and worked out that sometimes I know more about *when* an email happened than what it said or who it was from.  This is a rare thing, but generalizing, I quickly worked out that this would be a great addition to any/all search interface(s) if done well.</p>
<p>I want to be able to specify where in time I think my known item search should look.  I think it could be done fairly simply with a well-designed <a href="http://en.wikipedia.org/wiki/Normal_distribution">normal distribution curve</a>.</p>
<p><a href='http://en.wikipedia.org/wiki/Normal_distribution' title=''><img src='http://weblog.terrellrussell.com/wp-content/uploads/2007/10/800px-normal_distribution_pdf.png'  width='250' alt='' /></a></p>
<p>I want to see a timeline (aka a landscape-oriented rectangle) with a distribution curve that I can drag around.  I would be centering the curve on *when* I wanted to focus my search.</p>
<p><img src='http://weblog.terrellrussell.com/wp-content/uploads/2007/10/timeline.png' alt='' /></p>
<p>The search itself would still do fulltext matching and weighting like before, but would now scale that prior weighting by how well each result fits under my specified curve.</p>
<p>I have not done any due diligence in looking through the information retrieval literature, but I have not seen this interface before and it seems like it would be very helpful for certain types of known-item, time-based queries.</p>
<p>Things that are not within my &#8220;window of interest&#8221; would be punished with a reduced relevance score in my overall search results.  Things that matched my curve, in time, would receive a boost.  Otherwise, the search behaves as it always has.  This would simply be an additional parameter that gives more power to the searcher who knows *when* they&#8217;re looking for.</p>
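<p>Here is a minimal Python sketch of that scaling, assuming each search result already carries a fulltext score and a timestamp.  The function names (<code>time_weight</code>, <code>rerank</code>), the <code>spread_days</code> parameter and the sample data are all invented for illustration.  Note that the curve here peaks at 1, so it only ever reduces scores; the &#8220;boost&#8221; is relative to results that fall outside the window.</p>

```python
import math
from datetime import datetime


def time_weight(when, center, spread_days):
    """Return a weight between 0 and 1 from a bell curve centered on `center`.

    The curve is an unnormalized Gaussian: 1.0 exactly at the center,
    falling toward 0 as `when` moves away. `spread_days` is the standard
    deviation, i.e. how wide the window of interest is.
    """
    offset_days = (when - center).total_seconds() / 86400.0
    return math.exp(-0.5 * (offset_days / spread_days) ** 2)


def rerank(results, center, spread_days):
    """Scale each (doc_id, text_score, timestamp) by its time weight, best first."""
    scored = [
        (doc_id, text_score * time_weight(timestamp, center, spread_days))
        for doc_id, text_score, timestamp in results
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up results: "a" matches the text better, but "b" is near the chosen date.
results = [
    ("a", 0.9, datetime(2007, 1, 15)),
    ("b", 0.6, datetime(2007, 10, 30)),
]
ranked = rerank(results, center=datetime(2007, 10, 31), spread_days=14)
```

<p>Dragging the curve along the timeline would correspond to changing <code>center</code>, and widening or narrowing it to changing <code>spread_days</code>.  With the curve centered on 31 October, the weaker text match from the day before ends up ranked above the stronger match from January.</p>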
<p>Has anyone seen anything like this before?  Where?</p>
]]></content:encoded>
			<wfw:commentRss>http://weblog.terrellrussell.com/2007/10/timelined-information-retrieval/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Your Personal Data and whether Google knows all</title>
		<link>http://weblog.terrellrussell.com/2007/05/your-personal-data-and-whether-google-knows-all/</link>
		<comments>http://weblog.terrellrussell.com/2007/05/your-personal-data-and-whether-google-knows-all/#comments</comments>
		<pubDate>Thu, 24 May 2007 13:23:49 +0000</pubDate>
		<dc:creator>Terrell Russell</dc:creator>
				<category><![CDATA[Default]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[PIM]]></category>

		<guid isPermaLink="false">http://weblog.terrellrussell.com/2007/05/your-personal-data-and-whether-google-knows-all/</guid>
		<description><![CDATA[Google knows a lot about each of us. If you&#8217;re doing anything online these days, you&#8217;ll be hard-pressed to do it without Google having a hand in a part of it. Recently, James Thomas decided to not use Google&#8217;s products at all for two weeks and quickly realized it made the Internet quite hard to [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.google.com/search?q=%22google+knows%22">Google knows</a> a lot about each of us.  If you&#8217;re doing anything online these days, you&#8217;ll be hard-pressed to do it without Google having a hand in a part of it.</p>
<p>Recently, <a href="http://www.centernetworks.com/my-life-without-google">James Thomas decided to not use Google&#8217;s products at all for two weeks</a> and quickly realized it made the Internet quite hard to use.  They&#8217;re everywhere &#8211; and he had to go out of his way to force his computer to not look up or visit google.com.  Not exactly an option for the vast majority of users. (He did also note that the Internet was faster&#8230;)</p>
<p>Google is moving into the <a href="http://arstechnica.com/news.ars/post/20070518-google-to-target-isps-with-google-apps-package.html">ISP space</a> (<a href="http://googleblog.blogspot.com/2007/05/getting-it-done-with-google-apps.html">over 100,000 organizations already</a>), the <a href="http://picasa.google.com/">photo hosting space</a>, the <a href="http://gmail.com/">email space</a>, the <a href="http://www.google.com/calendar">calendar space</a>, the <a href="http://www.google.com/a/edu/">higher education space</a>, the <a href="http://www.google.com/analytics/">analytics space</a>, <a href="http://www.businessweek.com/print/technology/content/apr2007/tc20070414_675511.htm">more banner ads</a>, and now, into the RSS space itself with <a href="http://www.techcrunch.com/2007/05/23/100-million-payday-for-feedburner-this-deal-is-confirmed/">yesterday&#8217;s purchase of FeedBurner</a>, the premier RSS serving tool.   Most high-powered RSS sites I see are pushed out over FeedBurner&#8217;s network &#8211; their statistics and republishing in multiple formats of your serialized datastream are first rate.  For $100M, and a promise of a couple of years of future employment for the owners, Google now has insight into <strong>that</strong> side of our collective data behavior as well.  They&#8217;ve got the readership side figured out with <a href="http://www.google.com/reader/">Google Reader</a>, and now the serving side is known to them via this deal.  How tidy.</p>
<p>I&#8217;m starting to sense a shift in my own dealings with the King of Search.  I go out of my way to avoid Google Groups and Gmail.  I don&#8217;t use Google Docs or Google Apps for my domain, even though they&#8217;re arguably easier and more functional than most other setups available today.  I avoid having them know all the feeds I&#8217;m reading (Reader) and the things I&#8217;m searching for (I log out of my Google account before searching).  Since these services all sport a unified login now, I can only assume that Google could not plausibly deny that aggregation is possible across all their (growing) properties.</p>
<p>And I trust Google.  I do.</p>
<p>But who I don&#8217;t trust is everyone else.  I don&#8217;t trust that Google will never be driven by the government to hand over certain records, or that they can protect themselves from a data breach forever.  They are a very high value target.</p>
<p>With regards to Google knowing too much, <a href="http://chimprawk.blogspot.com/2007/05/googles-tia-strategy.html">Fred has a paragraph that&#8217;s worth quoting in his post from yesterday afternoon&#8230;</a></p>
<blockquote><p>Anonymity is the ultimate irony of the internet. The medium is so clouded in the perception of anonymity, it can fundamentally change human behavior. Of course, the reality is that the internet is the most sophisticated data mining tool ever invented. Compared to any offline action, you are less anonymous when you are using the internet. The nature of our revelations in this false anonymous context could lead a CEO to believe that they really could uncover the &#8220;true&#8221; persona of an individual, hence being able to accurately answer these very personal questions. In fact, this may be partially true; however, what we&#8217;d have to give up to get this benefit is almost always too much.</p></blockquote>
<p>All that said &#8211; I truly want my stuff to be online and available to me.  I want global access to what is mine &#8211; and to be secure in the fact that it&#8217;s redundantly backed up and &#8216;safe&#8217; from the bad guys.  I think that is the way of the future.  A personal repository of my stuff with nuanced access given to those who need it when they need it.  <a href="http://blog.jonudell.net/2007/05/22/hosted-lifebits/">Jon Udell posted something along the lines of what I want earlier this week&#8230; Hosted Lifebits&#8230;</a></p>
<blockquote><p>Grade 11</p>
<p>You’re applying to colleges. You publish your essay into your space, then syndicate it to the common application service. The essay points to supporting evidence — your e-portfolio, recommendations — which are also (to a reasonable degree of assurance) permanently recorded in your space.</p>
<p>College sophomore</p>
<p>You visit the clinic and are diagnosed with mononucleosis. You’ve authorized the clinic to store your medical records in your space. This comes in handy a couple of years later, when you’ve transferred to another school, and their clinic needs to refer to your health history.</p>
<p>Working professional</p>
<p>You use your blog to narrate the key events and accomplishments in your professional life, and to articulate your public agenda. All this is, of course, published in your space where you are confident (to the level of assurance you can reasonably afford) that it will be reliably available for your whole life, and even beyond.</p></blockquote>
<p>I think we are well on our way to giving up too much.  There will always be a wide spectrum that defines how we live our lives, but more and more, we are choosing to give up our personal information for the sake of convenience in the very short term.  This is a dangerous precedent.  I&#8217;m certainly not the first to say it, but I&#8217;d rather not be the one who jumps first.  <a href="http://www.wired.com/techbiz/people/magazine/15-06/ps_transparency">Have we really gotten to the point where giving up all our privacy is the right answer?  Posting everything online?</a></p>
<blockquote><p>So it dawned on him: If being candid about his flights could clear his name, why not be open about everything? &#8220;I&#8217;ve discovered that the best way to protect your privacy is to give it away,&#8221; he says, grinning as he sips his venti Black Eye. Elahi relishes upending the received wisdom about surveillance. The government monitors your movements, but it gets things wrong. You can monitor yourself much more accurately. Plus, no ambitious agent is going to score a big intelligence triumph by snooping into your movements when there&#8217;s a Web page broadcasting the Big Mac you ate four minutes ago in Boise, Idaho. &#8220;It&#8217;s economics,&#8221; he says. &#8220;I flood the market.&#8221;</p></blockquote>
<p>It seems so wrong&#8230;  Is this just paranoia on my part?</p>
<p><strong>Update: </strong><a href="http://chimprawk.blogspot.com/2007/05/your-private-twitters-arent.html">Fred did it again today &#8211; went and posted something relevant &#8211; Your Private Twitters Aren&#8217;t</a>:</p>
<blockquote><p>If you&#8217;ve been Twittering privately for the past few months, I&#8217;ve got some bad news.  <a href="http://meish.org/2007/05/24/theres-a-hole-in-your-twitter/">As reported by Meish</a>, the Twitter API does not enforce privacy ACL&#8217;s, meaning all of your private Twitters are available to the public. To check this out for yourself, visit <strong>http://twittervision.com/username</strong>, and you&#8217;ll be able to see private Twitter streams.</p>
<p>I must note that it appears that not all accounts are affected by this problem. It&#8217;s impossible to calculate the breadth of this breach, or what it will do to Twitter as a company, but it illustrates a greater problem with the internet. What if your Gmail, or Google History, or Facebook/Myspace account leaked? Or what if the government swept up your information in a national security letter, only to have your information posted in court documents? Think it can&#8217;t happen, or that these well-meaning companies can even control it? <a href="http://www.cs.cmu.edu/%7Eenron/">Just ask people at Enron how they feel</a>.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://weblog.terrellrussell.com/2007/05/your-personal-data-and-whether-google-knows-all/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>BibDesk, BibTeX and Subversion &#8211; An academic&#8217;s necessity</title>
		<link>http://weblog.terrellrussell.com/2007/02/bibdesk-bibtex-and-subversion-an-academics-necessity/</link>
		<comments>http://weblog.terrellrussell.com/2007/02/bibdesk-bibtex-and-subversion-an-academics-necessity/#comments</comments>
		<pubDate>Wed, 28 Feb 2007 05:57:43 +0000</pubDate>
		<dc:creator>Terrell Russell</dc:creator>
				<category><![CDATA[Default]]></category>
		<category><![CDATA[bibdesk]]></category>
		<category><![CDATA[bibtex]]></category>
		<category><![CDATA[PIM]]></category>
		<category><![CDATA[subversion]]></category>

		<guid isPermaLink="false">http://weblog.terrellrussell.com/2007/02/bibdesk-bibtex-and-subversion-an-academics-necessity/</guid>
		<description><![CDATA[When I first started this PhD program, I had a desire to keep all my references and papers in my computer. Better to search them. Better to keep them in one place. No dog-eared corners and ripped notebook paper in 3-ring binders. I wanted to minimize the stacks of paper that I knew would accumulate [...]]]></description>
			<content:encoded><![CDATA[<p>When I first started this PhD program, I had a desire to keep all my references and papers in my computer. Better to search them.  Better to keep them in one place.  No dog-eared corners and ripped notebook paper in 3-ring binders.  I wanted to minimize the stacks of paper that I knew would accumulate on the floor of my office. I wanted to preempt the piles that I&#8217;ve got a reputation for creating.</p>
<p>The fact that this was a pipe dream is fodder for another discussion at another time.  I want to talk about how I <strong>am</strong> managing the documents I <strong>do</strong> have in electronic form.</p>
<p>One of the ways I devised to attack the problem of the piles was to find a reference manager for my electronic files (mostly PDFs) and stick with it. There were a few to choose from, so I needed to make a decision. I decided I&#8217;d choose based on 1) open formats, 2) documentation, and 3) an open development model (open source). After looking through RefWorks, vanilla BibTeX files, ProCite, EndNote, Reference Manager (the application) and BibDesk, I chose BibDesk.</p>
<p><a href="http://bibdesk.sourceforge.net/">BibDesk</a> (<a href="http://bibdesk.sourceforge.net/screenshots.html">screenshots</a>) wins because of these:</p>
<ul>
<li><strong>Open data format</strong>: BibDesk is a Mac app built to manage BibTeX files.  BibTeX is an open format designed to be easily edited and easily moved from platform to platform (it&#8217;s plaintext).  It has been developed by many eyes over many years and seems to be one of the most robust and flexible formats available for keeping track of bibliographic records.</li>
<li><strong>Documentation</strong>:  Being as old as it is, <a href="http://en.wikipedia.org/wiki/BibTeX">BibTeX</a> has a great number of well-documented use cases.  Many of the people who use BibTeX have <a href="http://www.ecst.csuchico.edu/~jacobsd/bib/formats/bibtex.html">documented</a> their different situations online, where I can find them through basic web search.  This is a very big deal &#8211; as I can be confident that whatever it is I&#8217;m trying to do with the software, I&#8217;m probably not the first.  And <a href="http://bibdesk.sourceforge.net/manual/">BibDesk itself is documented pretty well too</a>.</li>
<li><strong>Open source code</strong>:  The codebase for BibDesk is open and available for download, inspection, and editing (if you are so inclined).  I can trust that the format and application with which I&#8217;m choosing to manage my papers and references will not go away tomorrow.  No company can decide tomorrow that I need to pay to upgrade or that a favorite feature is no longer necessary (and therefore no longer available).  The fact that BibDesk also happens to be under quite active development is an added bonus.  It is not uncommon for something I&#8217;ve found annoying or less than polished in BibDesk to be improved or fixed outright by the next upgrade.  This continual improvement and attention to detail goes a long way toward instilling confidence in a software selection.</li>
</ul>
<p>As an added piece of the BibTeX/BibDesk solution I chose <a href="http://subversion.tigris.org/">Subversion</a> for version control.  Version control protects me from errors and deletions of my own doing as well as those caused by &#8220;accidents&#8221; like power failures or bad disk drives.  Subversion works well with the rest of this system because the entire BibTeX file is simple text.  It also gives me seamless synchronization between different computers (home and school), so I can be sure I always have the most recent updates (read: PDFs I&#8217;ve already found once) available to me, no matter where I happen to be studying or working.</p>
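<p>To make the plaintext point concrete, here is a minimal sketch in Python. It is an illustration only, not part of the setup described in this post: the BibTeX entry is made up, and Python&#8217;s standard <code>difflib</code> module stands in for the kind of line-by-line comparison a version control system can do on a text file.</p>
<pre>
```python
import difflib

# A made-up BibTeX entry, for illustration only.
OLD_ENTRY = """@article{Doe2007ExampleTitle,
  author = {Jane Doe},
  title = {Example Title},
  year = {2007}
}
"""

# The same hypothetical entry after a keywords field has been added.
NEW_ENTRY = """@article{Doe2007ExampleTitle,
  author = {Jane Doe},
  title = {Example Title},
  year = {2007},
  keywords = {PIM}
}
"""


def changed_lines(old_text, new_text):
    """Return only the added and removed lines between two text versions."""
    diff = list(
        difflib.unified_diff(
            old_text.splitlines(), new_text.splitlines(), lineterm="", n=0
        )
    )
    # The first two lines are file headers; "@@" lines mark hunk positions.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


if __name__ == "__main__":
    for line in changed_lines(OLD_ENTRY, NEW_ENTRY):
        print(line)
```
</pre>
<p>Because each field sits on its own line, a small edit to one reference shows up as a small, readable change rather than as an opaque new copy of the whole library.</p>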
<p>Additionally, by having my subversion repository &#8220;in the sky&#8221; (my hosting provider) I&#8217;ve got yet a third place where the data lives, making it even less likely that I&#8217;ll lose it in a fat-fingered accident or because someone has stolen my laptop.</p>
<p>I&#8217;ve collected the steps for my personal SVN/BibDesk/BibTeX solution below.  I share them here since, when researching this on my own, nobody seemed to have this setup documented in one place.</p>
<p><strong>Downloading and Installing BibDesk</strong></p>
<p>This is one of the most straightforward steps.  You can <a href="http://bibdesk.sourceforge.net/">get BibDesk from the project&#8217;s homepage</a>. Once the dmg is mounted, simply drag BibDesk to your Applications directory.</p>
<p><strong>Configuring BibDesk</strong></p>
<p>I made a couple of choices as to how I wanted BibDesk to work &#8211; and I share them here because the choices work well with the plans I had for Subversion.</p>
<p>I want BibDesk to manage my PDFs like iTunes manages music files.  I do not want to have to name all the PDFs and make sure they&#8217;re in the right place and constantly worry about whether I&#8217;ve saved them correctly.  I want to tell the reference manager that I&#8217;ve got a reference, and I&#8217;ve got the electronic version of that reference, and please link them together.  But I also want to be able to dig into that file tree later and browse to a well-named PDF so I can send it to a friend or colleague.</p>
<p>BibDesk does this through two features &#8211; the CiteKey, and AutoFile.  Both are found in the BibDesk preferences panel.</p>
<p style="text-align: center"><img src="http://weblog.terrellrussell.com/wp-content/uploads/2007/02/prefs.png" alt="prefs.png" /></p>
<p>CiteKey is where you define your unique identifier for each entry across your library of references.</p>
<p style="text-align: center"><img src="http://weblog.terrellrussell.com/wp-content/uploads/2007/02/citekey.png" alt="citekey.png" /></p>
<p>I have chosen &#8220;First Author, Year, First 20 Characters of Title&#8221; as my convention.  This gives the &#8220;Format String&#8221; the value of &#8220;%a1%Y%t20&#8221;. Many others before me have chosen different format strings, but this combination seems to be a good balance between regular citation convention and the need to remain useful when simply browsing the file tree.</p>
<p style="text-align: center"><img src="http://weblog.terrellrussell.com/wp-content/uploads/2007/02/citekeypref.png" alt="citekeypref.png" /></p>
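<p>For a feel of what a &#8220;first author, year, first 20 characters of title&#8221; key looks like, here is a rough Python approximation. It is not BibDesk&#8217;s own code: the function names are hypothetical, and the cleanup rule (keep only letters and digits, then truncate the title) is an assumption, so BibDesk&#8217;s real output may differ in its details.</p>
<pre>
```python
import re


def alnum_only(text):
    """Assumed cleanup rule: keep only ASCII letters and digits."""
    return re.sub(r"[^A-Za-z0-9]", "", text)


def first_author_last_name(authors):
    """Take the first author from a BibTeX 'and'-separated author field."""
    first = re.split(r"\s+and\s+", authors.strip())[0]
    if "," in first:
        # "Last, First" form
        return first.split(",")[0].strip()
    # "First Last" form: assume the final word is the family name
    words = first.split()
    return words[-1] if words else ""


def approximate_cite_key(authors, year, title, title_chars=20):
    """Approximate the spirit of %a1%Y%t20: author + year + title prefix."""
    author_part = alnum_only(first_author_last_name(authors))
    title_part = alnum_only(title)[:title_chars]
    return author_part + str(year) + title_part


if __name__ == "__main__":
    # Made-up reference, for illustration only
    print(approximate_cite_key(
        "Doe, Jane and Roe, Richard",
        2007,
        "An Example Title About Personal Information",
    ))
```
</pre>
<p>The point of the convention is that the key stays recognizable to a human: author and year read like a normal citation, and the title fragment tells similar papers apart.</p>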
<p>The CiteKey determines how the references are kept unique within the BibTeX file.  I take this a step further and use that same format as the unique key for naming the attached PDFs as well.  This is where the AutoFile preferences come into play.</p>
<p style="text-align: center"><img src="http://weblog.terrellrussell.com/wp-content/uploads/2007/02/autofile.png" alt="autofile.png" /></p>
<p>I tell BibDesk to store my files in a set directory.  I tell BibDesk to file my papers automatically (like iTunes).  And I tell BibDesk to use the same format I had just defined for the CiteKey.  The format string for this is &#8220;%f{Cite Key}%e&#8221;.  The %e on the end tells BibDesk to use the file extension usually associated with that type of file (almost always .pdf for me).</p>
<p><strong>Update:</strong> Additionally, consider checking the second box (relative path for local-urls) if you&#8217;re using subversion to manage your BibTeX file from multiple machines with different full paths.  I missed this myself, as my machines have consistent user accounts (and therefore, full paths).</p>
<p style="text-align: center"><img src="http://weblog.terrellrussell.com/wp-content/uploads/2007/02/autofilepref.png" alt="autofilepref.png" /></p>
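<p>The two AutoFile ideas above, naming the filed document after the cite key and linking to it by relative path, can be sketched in a few lines of Python. This is an illustration under assumptions, not BibDesk&#8217;s implementation: the function names and the example user folders are hypothetical.</p>
<pre>
```python
import os


def autofile_name(cite_key, source_path):
    """Name the filed copy after the cite key, keeping the original extension."""
    extension = os.path.splitext(source_path)[1]  # for example ".pdf"
    return cite_key + extension


def relative_link(bib_file, filed_document):
    """Path to the filed document, relative to the folder holding the .bib file."""
    return os.path.relpath(filed_document, start=os.path.dirname(bib_file))


if __name__ == "__main__":
    name = autofile_name("Doe2007AnExampleTitleAboutP", "/tmp/download.pdf")
    # Two hypothetical machines with different home folders
    for home in ("/Users/alice", "/home/bob"):
        bib = os.path.join(home, "SCHOOL", "BibFileTGR.bib")
        doc = os.path.join(home, "SCHOOL", "BibDeskDocs", name)
        print(relative_link(bib, doc))
```
</pre>
<p>Both machines end up with the same relative link even though their full paths differ, which is why the relative-path option matters once the BibTeX file travels between computers.</p>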
<p>And that&#8217;s it for the BibDesk setup.  BibDesk should now store my references automatically as I find them and save them in the GUI.  When I want to get back to a linked PDF from within BibDesk, I can simply double-click on it, and the linked file will open in my PDF viewer.  No more digging around for lost files.  I could also go browse the file tree where I told BibDesk to store my files &#8211; and they&#8217;ll be there, well named and easy to scroll through.</p>
<p>Of course, BibDesk itself has keywords (tagging), annotation, and search &#8211; so it&#8217;s easy to find the reference you&#8217;re looking for from within the app in the first place as well.</p>
<p><strong>Subversion</strong></p>
<p>Subversion is a version control system that allows you to track changes across a file system.  Read more about it <a href="http://svnbook.red-bean.com/">here</a> and <a href="http://subversion.tigris.org/faq.html">here</a>.</p>
<p>The system works as follows&#8230; There is a repository that holds your files.  This repository could live on your computer or somewhere else.  Mine lives &#8220;in the sky&#8221; at my hosting provider and is a full copy of all my SCHOOL files.  When you work with subversion, you&#8217;re actually always interacting with a &#8220;working copy&#8221; on your local machine.  When you are ready to save your changes to the version control system, you &#8220;commit&#8221; your changes back to the repository.  If you are on a different machine with a different working copy of the same repository (this can be used by multiple people as well&#8230;), you simply have to update that working copy, and it will download all the changes that have been committed to the repository in the sky.  Magical stuff, no?  Multiple computers, in sync, and no hard thinking necessary.</p>
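<p>The checkout, update, and commit cycle described above maps onto a handful of standard <code>svn</code> subcommands. Here is a small Python sketch that builds those command lines. The repository URL, folder name, and helper functions are hypothetical placeholders, and the script only prints the commands unless you pass them to <code>run()</code> yourself.</p>
<pre>
```python
import subprocess

# Hypothetical repository URL and working-copy folder; substitute your own.
REPO_URL = "https://svn.example.com/repos/SCHOOL"
WORKING_COPY = "SCHOOL"


def checkout_command(repo_url, working_copy):
    """First time on a new machine: create a working copy from the repository."""
    return ["svn", "checkout", repo_url, working_copy]


def update_command(working_copy):
    """Before starting work: pull down changes committed from other machines."""
    return ["svn", "update", working_copy]


def commit_commands(working_copy, message):
    """After working: schedule new files, then send the changes to the repository."""
    return [
        ["svn", "add", "--force", working_copy],
        ["svn", "commit", "-m", message, working_copy],
    ]


def run(command):
    """Run one svn command, raising an error if it fails."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    session = [update_command(WORKING_COPY)]
    session += commit_commands(WORKING_COPY, "Add new references")
    # Print only; call run(command) instead to actually execute each step.
    for command in session:
        print(" ".join(command))
```
</pre>
<p>Updating before you start and committing when you finish is what keeps the home and school machines from drifting apart.</p>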
<p>My BibDesk files are only a part of a greater collection of documents that I have under version control &#8211; but you can see how they live under the SCHOOL folder here.</p>
<p style="text-align: center"><img src="http://weblog.terrellrussell.com/wp-content/uploads/2007/02/finder.png" alt="finder.png" /></p>
<p>My BibTeX file (&#8220;BibFileTGR.bib&#8221;) lives right alongside the &#8220;BibDeskDocs&#8221; directory that I specified above in the AutoFile preferences panel.  All my collected PDFs live in that BibDeskDocs directory.  When things change in the BibTeX file or in the Docs directory, I commit the changes to the subversion repository and never think about it again.  They&#8217;re safe.  I&#8217;ve got a remote backup, and a full history of changes, built-in.</p>
<p><strong>Configuring Subversion</strong></p>
<p>Installing <a href="http://subversion.tigris.org/">Subversion</a> and getting it functional on your system is beyond the scope of this document, but please take the time to consider using it.  It&#8217;s an integral part of what makes this system so powerful.</p>
<p>Quite a few hosting companies provide svn space.  My own stuff is hosted at <a href="http://textdrive.com/features">TextDrive</a>.</p>
<p>Here are some of the Knowledge Base articles at TextDrive about getting your subversion up and running.  If they&#8217;re not exactly the same as your setup, they shouldn&#8217;t be too far off.  Again, the power of plaintext configuration and open source documentation.</p>
<ul>
<li><a href="http://help.textdrive.com/index.php?pg=kb.page&amp;id=54">Getting Started with Subversion</a></li>
<li><a href="http://help.textdrive.com/index.php?pg=kb.page&amp;id=55">Managing Subversion users</a></li>
</ul>
<p><strong>Feedback</strong></p>
<p>I&#8217;d love to hear from others who are using something like this for their own reference management.  Do you have anything to add?  Anything I left out?</p>
<p>I had used version control for my software projects for a while, but everything clicked when I realized my BibTeX file was just plaintext as well and would fit very nicely within my existing setup.</p>
<p>I won&#8217;t be locked into a vendor &#8211; or an upgrade cycle &#8211; or an opaque &#8220;corruption&#8221; of my database file.  I&#8217;ll even be protected against my own fat fingers, since I have the ability to go &#8220;back in time&#8221; through my versions after a bad save or deleted reference (oops).</p>
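<p>Going &#8220;back in time&#8221; can be as simple as listing recent revisions and writing an older copy of the file next to the current one. The Python sketch below uses the standard <code>svn log</code> and <code>svn cat</code> subcommands. The helper names, revision number, and output filename are hypothetical, and it assumes Python 3.7 or later with <code>svn</code> installed.</p>
<pre>
```python
import subprocess


def log_command(path, limit=10):
    """List the most recent revisions that touched the file."""
    return ["svn", "log", "--limit", str(limit), path]


def cat_command(path, revision):
    """Print the file's contents as of an earlier revision."""
    return ["svn", "cat", "-r", str(revision), path]


def recover(path, revision, output_path):
    """Save an old revision to a separate file without touching the current one."""
    result = subprocess.run(
        cat_command(path, revision), check=True, capture_output=True
    )
    with open(output_path, "wb") as handle:
        handle.write(result.stdout)


if __name__ == "__main__":
    # Hypothetical revision number; pick a real one from the log output.
    print(" ".join(log_command("BibFileTGR.bib")))
    print(" ".join(cat_command("BibFileTGR.bib", 42)))
```
</pre>
<p>Writing the old revision to a separate file leaves the current library alone, so a deleted reference can be copied back in by hand.</p>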
<p>Presumably, I can continue to cultivate and build my reference library for many years and never have to worry about it disappearing or getting corrupted.</p>
]]></content:encoded>
			<wfw:commentRss>http://weblog.terrellrussell.com/2007/02/bibdesk-bibtex-and-subversion-an-academics-necessity/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
	</channel>
</rss>

