This conversation is being blogged – ASIST panel

Paul Jones has posted the writeup for tomorrow’s panel:

What: THIS CONVERSATION IS BEING BLOGGED: Our lives, online, all the time, in the trend towards lifelogging
When: 12:30 to 1:30pm March 29th, 2007
Where: Pleasants Family Room in Wilson Library at UNC-CH
Who: A Panel Discussion Led by Dr. Deborah Barreau, with panelists Paul Jones, Dr. Cal Lee, Dr. Jeffrey Pomerantz, Terrell Russell, and Chirag Shah [note – librarians have arranged the speakers in alphabetic order by last name]
Presented by: ASIS&T-UNC

We’re each going to have a short statement designed to invite discussion. I’ve included mine here…

My provocative issue revolves around the fuzzy (read: disappearing) line between what is personal and what is global. The publishing and technical tools are so easy, powerful and available today that we find it very simple to connect with those doing the same kind of work as ourselves. It shrinks the world. We can find our colleagues – or they find us.

However, it also makes it easier than ever to blur the line between personal and professional, global and local. As we continue to share and discuss and leave our breadcrumbs for all the search engines of both today and tomorrow to spider and remember forever – do we think any of it’s really private? Surely not – or we wouldn’t be putting it online, right?

Our identity consolidates to what is findable about us. We may project these different personas – in different contexts to different parties – but when push comes to shove, we’re only one person. When someone *really* knows you, they know the real you – whatever that means.

When we navigate in the physical world we’ve always known, we understand how information leaks and moves. There aren’t too many invisible audiences and we ‘get’ how gossip and white lies work. We have an innate understanding of our audience and what they’re capable of – how much we can tell them and how much of it will ‘get back’ to the other audience(s).

What’s happening today – and from here forward I’d guess – is different. There’s been a shift in the last few years – our audience is significantly more hidden and large and ‘forever’.

When everything is recorded, who is the audience?
Is it yourself? Your children? Your friends? Your boss? Your next boss?

Who should it be?

Can we reasonably believe we can have more than one, that we can separate them one from another?

Those of us who are writing online, involving ourselves in this conversation, are we thinking about five years from now? 25 years from now? 100? Is what you’re saying today going to stand up in the future? Does any of this really matter?

While I’m concerned about the privacy implications of all this self-recording, I’m more fascinated with the perceptions of those broadcasting themselves about their own projection – their own sense of what the world sees in them.


eekim, STODID podcast, and SXSW

A few days ago I was excited to find that Eugene Eric Kim had posted about a conversation we’d had (when I apparently ambushed him) at the last Internet Identity Workshop in Mountain View in December. I love it when people who write well make me sound smart.

What was he doing that I found so compelling? It was his Ph.D. research on ContextualAuthorityTagging. The basis of the idea is simple: The best way to identify an authority on a topic is not to ask people to self-identify themselves as such, but to ask others to identify the people they consider to be the authorities. We can leverage this principle to locate expertise by building tagging systems where users tag other users with information about their expertise.

Terrell has thought really deeply about this, and several of his ideas are documented at his website and on his blog. PhilWindley and DavidWeinberger have also commented on his work.

I heard more original ideas about tagging in that 20 minutes of conversation than I’ve ever heard from anyone else. The one that really struck me was the notion of tag disparities: comparing what people say about you to what you say about yourself as a way of measuring reputation. Sound familiar? It’s a real-life instantiation of the SquirmTest!
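To make the tag disparity idea a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function name, the sample tags, and especially the "agreement" score (a plain set overlap ratio), which is just one simple way to compare the two tag sets and not a measure taken from the research described above.

```python
def tag_disparity(self_tags, peer_tags):
    """Compare the tags a person applies to themselves with tags from others."""
    mine = {tag.lower() for tag in self_tags}
    theirs = {tag.lower() for tag in peer_tags}
    union = mine | theirs
    # Hypothetical score: shared tags divided by all tags seen (1.0 = full agreement).
    agreement = len(mine & theirs) / len(union) if union else 1.0
    return {
        "only_self": sorted(mine - theirs),    # claimed, but nobody else says it
        "only_peers": sorted(theirs - mine),   # said by others, never claimed
        "agreement": agreement,
    }


# Made-up tags for a made-up person.
result = tag_disparity(
    self_tags=["identity", "tagging", "python"],
    peer_tags=["tagging", "identity", "reputation", "openid"],
)
print(result)
```

The two "only" lists are the interesting part: they show where a self-description and a reputation part ways.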

And then, Aldo Castañeda at The Story of Digital Identity (STODID) contacted me to talk about my work. We spoke last week and he’s posted the podcast this morning.

Episode #55 is live and the direct link is here. It’s just over 36 minutes long.

I also wanted to share that I’ll be in Austin this next week for SXSWi – so please send me a note if you want to talk and/or get a very cool claimID button.


BibDesk, BibTeX and Subversion – An academic’s necessity

When I first started this PhD program, I had a desire to keep all my references and papers in my computer. Better to search them. Better to keep them in one place. No dog-eared corners and ripped notebook paper in 3-ring binders. I wanted to minimize the stacks of paper that I knew would accumulate on the floor of my office. I wanted to preempt the piles that I’ve got a reputation for creating.

The fact that this was a pipe dream is fodder for another discussion at another time. I want to talk about how I am managing the documents I do have in electronic form.

One of the ways I devised to attack the problem of the piles was to find a reference manager for my electronic files (mostly PDFs) and stick with it. There were a few to choose from, so I needed to make a decision. I decided I’d choose based on 1) open formats, 2) documentation, and 3) an open development model (open source). After looking through RefWorks, vanilla BibTeX files, ProCite, EndNote, Reference Manager (the application) and BibDesk, I chose BibDesk.

BibDesk (screenshots) wins because of these:

  • Open data format: BibDesk is a Mac app built to manage BibTeX files. BibTeX is an open format designed to be easily edited and easily moved from platform to platform (it’s plaintext). It has been developed by many eyes over many years and seems to be one of the most robust and flexible formats available for keeping track of bibliographic records.
  • Documentation: Being as old as it is, BibTeX has a great number of well documented use cases. Many of the people who use BibTeX have documented their different situations online, where I can find them through basic web search. This is a very big deal – as I can be confident that whatever it is I’m trying to do with the software, I’m probably not the first. And BibDesk itself is documented pretty well too.
  • Open source code: The codebase for BibDesk is open and available for download, inspection, and editing (if you are so inclined). I can trust that the format and application with which I’m choosing to manage my papers and references will not go away tomorrow. The company that owns the application cannot decide tomorrow that I need to pay to upgrade or that a particularly favorite feature is no longer necessary (and therefore no longer available). The fact that BibDesk also happens to be under quite active development is an added bonus. It is not uncommon for something I find annoying or less-than-polished in BibDesk to be improved or fixed outright by the next upgrade. This continual improvement and attention to detail cannot be overstated as a source of confidence in a software selection.

As an added piece of the BibTeX/BibDesk solution I chose Subversion for version control. Version control protects me from errors and deletions of my own doing as well as those caused by “accidents” like power failures or bad disk drives. Subversion works well with the rest of this system because the entire BibTeX file is simple text. It also gives me seamless synchronization between different computers (home and school), so I can be sure I always have the most recent updates (read: PDFs I’ve already found once) available to me, no matter where I happen to be studying/working.
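To show why plain text matters here, this small Python sketch compares two versions of a BibTeX entry line by line, which is roughly the kind of change a version control system can track. The entry, its cite key, and the file labels are all made up for the example, and Python's difflib stands in for Subversion's own diffing.

```python
import difflib

# Hypothetical BibTeX entry, before and after adding a keywords field.
before = """@article{Doe2006Example,
  author = {Doe, Jane},
  title = {An Example Paper About Tagging},
  year = {2006}
}
""".splitlines(keepends=True)

after = """@article{Doe2006Example,
  author = {Doe, Jane},
  title = {An Example Paper About Tagging},
  year = {2006},
  keywords = {tagging}
}
""".splitlines(keepends=True)

# The file labels are placeholders, not real revision names.
diff = list(difflib.unified_diff(before, after, fromfile="old/refs.bib", tofile="new/refs.bib"))
print("".join(diff))
```

Only the lines that changed show up with a leading minus or plus, so a full history of the file stays small and readable.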

Additionally, by having my subversion repository “in the sky” (my hosting provider) I’ve got yet a third place where the data lives, making it even less likely that I’ll lose it in a fat-fingered accident or because someone has stolen my laptop.

I’ve collected the steps for my personal SVN/BibDesk/BibTeX solution below. I share them here since, when researching this on my own, nobody seemed to have this setup documented in one place.

Downloading and Installing BibDesk

This is one of the most straightforward steps. You can get BibDesk from the project’s homepage. Once the dmg is mounted, simply drag BibDesk to your Applications directory.

Configuring BibDesk

I made a couple choices as to how I wanted BibDesk to work – and I share them here because the choices work well with the plans I had for Subversion.

I want BibDesk to manage my PDFs like iTunes manages music files. I do not want to have to name all the PDFs and make sure they’re in the right place and constantly worry about whether I’ve saved it correctly. I want to tell the reference manager that I’ve got a reference, and I’ve got the electronic version of that reference, and please link them together. But I also want to be able to dig into that file tree later and browse to a well-named PDF so I can send it to a friend or colleague.

BibDesk does this through two features – the CiteKey, and AutoFile. Both are found in the BibDesk preferences panel.

prefs.png

CiteKey is where you define your unique identifier for each entry across your library of references.

citekey.png

I have chosen “First Author, Year, First 20 Characters of Title” as my convention. This gives the “Format String” the value of “%a1%Y%t20”. Many others before me have chosen different format strings, but this combination seems to be a good balance between regular citation convention and the need to remain useful when simply browsing the file tree.
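For anyone wondering what that convention produces, here is a rough Python emulation of it. This is a sketch of the naming idea only, not BibDesk's implementation: the function name and the sample reference are made up, and the step that strips spaces and punctuation is my own assumption about how to keep the key safe, so BibDesk's actual handling may differ.

```python
import re


def cite_key(first_author_last_name, year, title, title_chars=20):
    """Approximate a "%a1%Y%t20" key: first author, 4-digit year, start of title."""
    raw = f"{first_author_last_name}{year:04d}{title[:title_chars]}"
    # Assumption: drop anything that is not a letter or digit so the key is
    # usable both inside a BibTeX file and as a file name.
    return re.sub(r"[^A-Za-z0-9]", "", raw)


# Made-up reference, for illustration only.
print(cite_key("Doe", 2006, "An Example Paper About Tagging"))
# prints: Doe2006AnExamplePaperAbo
```

The result reads like a normal citation (author, then year) and still carries enough of the title to recognize the paper in a file listing.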

citekeypref.png

The CiteKey determines how the references are kept unique within the BibTeX file. I take this an extra step and keep that same format as the unique key for naming the attached PDFs as well. This is where the AutoFile preferences come into play.

autofile.png

I tell BibDesk to store my files in a set directory. I tell BibDesk to file my papers automatically (like iTunes). And I tell BibDesk to use the same format I had just defined for the CiteKey. The format string for this is “%f{Cite Key}%e”. The %e on the end tells BibDesk to use the file extension usually associated with that type of file (almost always .pdf for me).

Update: Additionally, consider checking the second box (relative path for local-urls) if you’re using subversion to manage your BibTeX file from multiple machines with different full paths. I missed this myself, as my machines have consistent user accounts (and therefore, full paths).
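As a rough illustration of the two settings above, this Python sketch builds a file name from a cite key plus the original extension (the idea behind “%f{Cite Key}%e”) and then expresses the stored file relative to the folder holding the BibTeX file. The function names, the cite key, and the directory paths are hypothetical, and this is not how BibDesk itself is written.

```python
from pathlib import PurePosixPath


def autofile_path(papers_dir, cite_key, original_file):
    """Name the stored file after the cite key, keeping the original extension."""
    extension = PurePosixPath(original_file).suffix  # e.g. ".pdf"
    return PurePosixPath(papers_dir) / f"{cite_key}{extension}"


def relative_to_bib(bib_file, linked_file):
    """Express linked_file relative to the directory that holds the .bib file."""
    return PurePosixPath(linked_file).relative_to(PurePosixPath(bib_file).parent)


# Hypothetical paths; yours will differ from machine to machine.
stored = autofile_path("/Users/me/SCHOOL/BibDeskDocs", "Doe2006AnExamplePaperAbo", "download(3).pdf")
print(stored)
print(relative_to_bib("/Users/me/SCHOOL/BibFileTGR.bib", stored))
```

The relative form is the one that survives a move to a machine with a different home directory, which is why that second checkbox matters when the same BibTeX file is shared through subversion.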

autofilepref.png

And that’s it for the BibDesk setup. BibDesk should now store my references automatically as I find them and save them in the GUI. When I want to get back to a linked PDF from within BibDesk, I can simply double-click on it, and the linked file will open in my PDF viewer. No more digging around for lost files. I could also go browse the file tree where I told BibDesk to store my files – and they’ll be there named well and easy to scroll through.

Of course, BibDesk itself has keywords (tagging), annotation, and search – so it’s easy to find the reference you’re looking for from within the app in the first place as well.

Subversion

Subversion is a version control system that allows you to track changes across a file system. Read more about it here and here.

The system works as follows… There is a repository that holds your files. This repository could live on your computer or somewhere else. Mine lives “in the sky” at my hosting provider and is a full copy of all my SCHOOL files. When you work with subversion, you’re actually always interacting with a “working copy” on your local machine. When you are ready to save your changes to the version control system, you “commit” your changes back to the repository. If you are on a different machine with a different working copy of the same repository (this can be used by multiple people as well…), you simply have to resync that working copy, and it will download all the changes made to the repository in the sky. Magical stuff, no? Multiple computers, in sync, and no hard thinking necessary.

My BibDesk files are only a part of a greater collection of documents that I have under version control – but you can see how they live under the SCHOOL folder here.

finder.png

My BibTeX file (“BibFileTGR.bib”) lives right alongside the “BibDeskDocs” directory that I specified above in the AutoFile preferences panel. All my collected PDFs live in that BibDeskDocs directory. When things change in the BibTeX file or in the Docs directory, I commit the changes to the subversion repository and never think about it again. They’re safe. I’ve got a remote backup, and a full history of changes, built-in.
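The routine described above (sync first, pick up any new PDFs, then commit) can be sketched in Python as a thin wrapper around the svn command line client. This assumes svn is installed and that the folder is already a checked-out working copy. The function names and the commit message are made up, and the helper only prints the commands unless you call run_sync yourself.

```python
import subprocess


def svn_sync_commands(working_copy, message):
    """Build the svn calls for one update-then-commit round trip."""
    return [
        ["svn", "update", working_copy],                  # pull changes from the repository
        ["svn", "add", "--force", working_copy],          # schedule any new, unversioned files
        ["svn", "commit", "-m", message, working_copy],   # send local changes back
    ]


def run_sync(working_copy, message):
    """Run each command in order, stopping at the first failure."""
    for command in svn_sync_commands(working_copy, message):
        subprocess.run(command, check=True)


# Print the commands instead of running them; "SCHOOL" is the folder from this post.
for command in svn_sync_commands("SCHOOL", "Add new references"):
    print(" ".join(command))
```

Updating before committing is a habit rather than a requirement, but it keeps the two machines from drifting apart.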

Configuring Subversion

Installing Subversion and getting it functional on your system is beyond the scope of this document, but please take the time to consider using it. It’s an integral part of what makes this system so powerful.

Quite a few hosting companies provide svn space. My own stuff is hosted at TextDrive.

Here are some of the Knowledge Base articles at TextDrive about getting your subversion up and running. If they’re not exactly the same as your setup, they shouldn’t be too far off. Again, the power of plaintext configuration and open source documentation.

Feedback

I’d love to hear from others who are using something like this for their own reference management. Do you have anything to add? Anything I left out?

I had used version control for my software projects for a while, but everything clicked for me when I realized my BibTeX file was just plaintext as well and would fit very nicely within my existing setup.

I won’t be locked into a vendor – or an upgrade cycle – or an opaque “corruption” of my database file. I’ll even be protected against my own fat fingers, since I have the ability to go “back in time” through my versions after a bad save or deleted reference (oops).

Presumably, I can continue to cultivate and build my reference library for many years and never have to worry about it disappearing or getting corrupted.


Ze Frank talks about his lizard brain

In yesterday’s episode of The Show with Ze Frank, Ze talks about fear and emotion and the “reptilian part of your brain“:

It’s left over from a time when things were a little more straightforward.

Bigger than me? Check. Fangs? Check. Run like hell.

While a bit more straightforward perhaps than manipulation via mediated communication, I still think Ze knows what he’s doing here.

On an unrelated note, John Hodgman made a guest appearance yesterday. I think this is a first. Perhaps the offer of a hobo-themed episode would keep him around for more than a windy Brooklyn weather report.

hodgman.png


New Verified Page at claimID

We rolled out Verified Pages today.

OpenID is in the air, and providing services across domains will become very important very soon. I think we’re still about six months out from the Big Bang. August. I’m calling it.

Verification underlies Identity. Identity underlies claims about a person. Aggregated claims underlie the reputations we ascribe to people. With reputation, we can do really cool stuff. And it’s coming…

Cross-posted at claimID proper:

ClaimID allows real people to aggregate what is online about themselves. It allows them to bring links together, sort them, talk about them, and generally refocus their online identity on their own terms. We’ve had great success so far in getting that message out – and the feedback we’ve received has been positive. People really like the empowerment and are pleased when their claimID page begins to appear in the search results for their name.

But we also want to convey that these links are validated – verified in some way. So we introduced MicroID and OpenID to our system. Since that time, people have been pointing to their own websites, their own blogs, and their own OpenIDs hosted at other Identity Providers (AOL, Verisign, JanRain, Livejournal, etc.). And with all of those identities, it made sense for us to create a trusted place for you to aggregate them.

Verified Page

Today, we launched a special page for each person that brings these verified links into greater focus. The verified information about a person is presented all on one page, in one place – and you can be sure that these links are maintained by the person who owns the claimID account because of the math behind the scenes. MicroID and OpenID are based on strong hashing algorithms and cryptography and have been designed to validate and verify claims – just the sort of thing we’re doing at claimID.
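For a feel of the hashing involved, here is a minimal Python sketch of a MicroID as I understand the published draft: hash the identity URI, hash the page URI, then hash the two results joined together. The addresses are placeholders and the function names are my own. This is not claimID's verification code, which is not shown in this post.

```python
import hashlib


def sha1_hex(text):
    """SHA-1 of a string, as lowercase hex."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def microid(identity_uri, page_uri):
    """Hash of the two hashes: sha1(sha1(identity) + sha1(page))."""
    return sha1_hex(sha1_hex(identity_uri) + sha1_hex(page_uri))


# Placeholder identity and page, for illustration only.
claim = microid("mailto:user@example.com", "http://example.com/blog/")
print(claim)
```

Anyone who knows both the identity and the page can recompute the value and compare it with the one published on the page, and that comparison is what ties the two together.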

Terrell's verified ClaimID

Our pages are at:
http://claimid.com/terrell/verified
http://claimid.com/fred/verified

They’re very clean and very powerful.

Once you find someone’s claimID Verified Page, you can be pretty sure that who you’re reading about at claimID is the same person at all those other sites. This allows us to really begin to tap into the power of distributed identity and maybe even hint at some uses for basic reputation across disparate websites. Of course, if you don’t want to display your verified identity, you can easily turn this off in your account settings.

We’re not done with online reputation yet, but the single verified page at claimID is a very strong early step.


ClaimID, the easy to use OpenID identity provider

So Fred and I have been scheming. The recent push behind OpenID and its impending uptake by a great many people has led us to the decision to rebrand claimID just a bit.

Rebranded claimID

We retooled the documentation, made it more apparent to the new user the benefits of having and using an OpenID and generally tidied up our original copy as we prepare for that “big growth” that we keep seeing poke its head around the corner.

From the official blog post:

At ClaimID, our strength has always been translating the complex into the simple. We want to give you the best solutions, without requiring you to read a protocol or understand code. As web identity plays a greater role in all of our lives, we feel that we can really help people by enabling them with solutions simply. And as OpenID grows (and it will grow, says Bill Gates), we want to be there to help you take advantage of this amazing and useful tool.

We’ve seen lots of convergence in the last few months – and even more in the last couple days – and we want to make sure we’re helping as many people as possible follow along at home.

Scott Kveton from JanRain has a nice writeup on the latest happenings:

OpenID has always been about convergence. When Brad, David and Johannes talked about how OpenID and Yadis could work together over a year ago. When the XRI folks brought their amazing people and technology to be integrated into OpenID 2.0 last Spring. This past Summer when Sxip Identity joined the OpenID party by joining in on developing the specification and offering up their attribute exchange specification to the OpenID community. And now today, we have a commitment from Microsoft to take part in the OpenID community as well as enable the technology for their future identity products.

There are a couple of points I’d like to make outside of the above announcement to hopefully address any concerns that the OpenID community might have:

  • JanRain will never require users of our libraries or services to use Windows CardSpace ™. We offer support for this technology as another option for users much like using our Safe SignIn and Personal Icon technologies on MyOpenID.com. We’ll also continue to support the OpenID efforts going on with Mozilla and Firefox.
  • Windows CardSpace ™ is shipping with Vista today and is a well thought-out technology that helps address many of the privacy and security concerns that people have had with OpenID. OpenID helps users describe their identity across many sites in a public fashion. The two together are very complimentary products and each has its strength.
  • Microsoft did not cave in to the OpenID community and the OpenID community is giving nothing up to Microsoft. This is a collaboration on bringing the best technology to the marketplace as quickly as possible to help secure users and solve the single sign-on solution once and for all.
  • Please reserve judgment on what this all means until you see it all work together. The technology is really quite simple and the ramifications for end-users is huge. It also goes a very long way to completely addressing the phishing concerns we’ve heard so much about.


It’s hard to watch our published surface area

I’ve recently started subscribing to Jon Udell’s blog. One of his recent posts relates to our own information publishing as a cell – in the sense that it has a membrane where we detect interactions with the outside world.

A compelling visual no doubt – I think it’s a great way to describe to those who have not really thought about how their information is aggregated, redistributed and shared once they send it out. Shortly after he wrote about this illustrative analogy, he was informed that his own site was blocking crawlers via his robots.txt file. The irony was not lost on him:

A comment from Mark Middleton perfectly illustrates the point I was making the other day about visualizing your published surface area. I started this blog in December, and ever since I’ve been running with a robots.txt file that reads:

User-agent: *
Disallow: /

In other words, no search engine crawlers allowed. Of course that’s not what I intended. I’d simply assumed that the default setting was to allow rather than to block crawlers, and it never occurred to me to check. In retrospect it makes sense. If you’re running a free service like WordPress.com, you might want to restrict crawling to only the blogs whose authors explicitly request it.

WordPress.com’s policy notwithstanding, the real issue here is that these complex information membranes we’re extruding into cyberspace are really hard to see and coherently manage.
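One small way to see at least this part of the membrane is to ask a robots.txt parser what a crawler would be told. The sketch below uses Python's standard urllib.robotparser on the two-line file quoted above. The helper name, the crawler name, and the example URL are made up, and it parses a string directly instead of fetching a live site.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, url, agent="ExampleBot"):
    """Return True if the given robots.txt text lets this crawler fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


blocking = "User-agent: *\nDisallow: /\n"    # the file quoted above
permissive = "User-agent: *\nDisallow:\n"    # an empty Disallow blocks nothing

print(allowed(blocking, "http://example.com/2007/01/post/"))    # prints: False
print(allowed(permissive, "http://example.com/2007/01/post/"))  # prints: True
```

A single slash is the whole difference between blocking every crawler and blocking none, which makes it an easy setting to get wrong without noticing.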

We’re all learning and probing and figuring out this new medium, even 10-15 years on now. We’re struggling with the abundance of information, and concurrently, the distinct lack thereof. We can connect with people from anywhere, at anytime, assuming they’re connected and watching the same streams of information. And yet, we cannot see who’s watching, who’s aggregating and saving for later.

Thanks, Jon, for the nice analogy. I’ll use it myself, with a link back, of course – so you can sense it.


Wikia Search to have open algorithms

I am a big fan of open software. Meaning, Open Source and the thought and philosophy behind it. And I also think that in the long run, things based on open standards with open discussion and open algorithms are the infrastructure that allows us to move forward. Private, proprietary companies might advance the state of the art at any given time, but in the long run, it’s the open, publicly documented things that hang around.

So today’s talk by Jimbo Wales at the NYU Free Culture Club (talking openly about Wikia for the first time) raises the question: how can open algorithms square with the “better” results gained from having full search histories? How do you anonymize? Can you, in an open space?

Howard Greenstein was there and paraphrases:

Question on the open corpus of search info from Wikia – even though it will be somewhat anonymized, even some sets of data that were supposedly anonymous have been shown to be able to be analyzed (implying the AOL leak).

They want to be careful on this – very tough question for Jimmy.

Can we deal with the identity and privacy of the end users while providing the ‘best possible’ search engine? Are those two things actually opposites?

Another quote explaining Jimbo’s vision…

Wikia – all the algorithms will be open source. Free search – as in transparent, testable, researchable. All the good research has been behind the walls of commercial companies (like Google, Microsoft, etc.) No place for computer scientists to go to do such research.

Want it to be:
– Participatory – bring best elements of Wikipedia to the problem of search
– Open
– Democratic


28% of Online Americans have used tagging

The Pew Internet and American Life Project has just put out a new report on the state of tagging in America. It also features a fairly prominent interview with David Weinberger.

The takeaway numbers show that 28% of online Americans have used tagging before and that 7% are active taggers (tagged something ‘yesterday’). The survey spoke with 2,373 adults (1,623 of those being counted as ‘online’, or 68%). The data was collected via telephone throughout December 2006.

“Taggers look like classic early adopters of technology. They are more likely to be under age 40, and have higher levels of education and income.”

Q: Why do you think Internet users are drawn to tagging?

Weinberger: It’s really useful. Compare your traditional computer system to organize your digital photos to using a tagging system. Instead of having to stick a photo into a single folder — say, “trips 2006” — you can easily tag it as “Italy,” “anniversary,” “sunset,” “mountains,” and “no kids.” You can assemble instant virtual albums of all your anniversary photos, or all your photos of all your trips to Italy, etc.

There’s an altruistic appeal to tagging as well. Tagging at public sites can give you a sense that you’re adding to a shared stream of knowledge. At del.icio.us, or other such sites, tag a page “robotics” and you know that it’s automatically added to the list of pages tagged that way, so anyone else interested in that topic can find it.

In addition, as more people get used to the idea of having their information sliceable and resortable on the fly, we’ll see more intuitive interfaces bloom for otherwise very ordinary tasks. This phenomenon is still very young. The world only found del.icio.us in early 2004. Not very long ago.


The RIAA and MPAA are fronts for the Green Party

It recently occurred to me that I may have stumbled onto a very uncomfortable truth.

Could it be that the RIAA/MPAA are fronts for the Green Party here in the US?

As the number of pirates has decreased, the Earth has seen a steady increase in average global temperature (global warming).

This was first dutifully reported by Bobby in May 2005 in his Open Letter to the Kansas School Board:

Pirates are Cool

This can only be causal, and therefore, the RIAA and MPAA have joined forces, undercover of course, to save the world. The RIAA creates pirates by changing the math behind their sales numbers and the MPAA creates pirates by suing everyone, everywhere. The Green Party has the same motive – albeit through environmental awareness and policy instead of manipulation of the mercantile population’s underbelly.

This would be in stark contrast to the recent TechDirt post comparing the RIAA’s apparent behavior to that of the French Button-Makers of the 17th century. I can’t decide what to believe. The first seems so noble, and this one, not.

The Boy Scouts seem to be marginally involved as well. So conflicted.

What would the Flying Spaghetti Monster suggest?

Perhaps start the Everyone-Gets-A-Vault fund? I’d donate.

noodledoodlewall.jpg
