
MicroIDs now at WeeWar and MyBlogLog

Two more feathers in the MicroID cap today…

Both Fred and the MicroID blog have the news:

MyBlogLog has just started publishing MicroIDs (and FOAF) for member pages at http://www.mybloglog.com/buzz/members/username.

And…

WeeWar has also started publishing MicroIDs and some XFN. You can see Alexander Kohlhofer’s profile at http://weewar.com/user/alex.


MPACT at CNI – Identity in Scholarly Discourse

A week and a half ago, I was invited to speak at the CNI Workshop (Coalition for Networked Information) in Washington, DC entitled Authors, Identity Management and the Scholarly Communication System. I was there to present our work on the MPACT project at UNC-Chapel Hill and how it relates to the sticky world of name disambiguation in academia.

Attendees included representatives from Elsevier, ProQuest, OCLC, Library of Congress, Internet2, Shibboleth, Thomson Scientific (ISI), the JISC Names Project, Mellon Foundation, CrossRef, IFLA, ISO, ICPSR, MIT, NLM, NIH, NCBI, American Physical Society, Association of Research Libraries, and the Thomas Jefferson Foundation. I may have missed one or two in there…

MPACT

MPACT is looking at how mentoring as a scholarly activity can be better measured and quantified. Our larger goal is to make the argument that mentoring is not being rewarded enough when faculty members are evaluated on their productivity. Usually, at research institutions, research, teaching, and service are quantified and evaluated through a variety of metrics. These metrics are part of the cultural and institutional infrastructure and have been built up over time to reflect what the university values in their faculty members. Arguably, this leaves out mentoring as a scholarly activity – and that’s a mistake.

I presented our work to date and pointed out to the room – mostly VPs, CTOs, and CEOs – that the engineering of the MPACT project could have been greatly reduced if I hadn’t had to research and construct a system to manage ‘people’ and instead could have focused only on the mentorship connections between them (advisorships and committeeships at the dissertation level). I offered MPACT as a willing beta tester for whatever interface the large companies/organizations in the room put together for public/limited querying.

Identities Mapping

One of the most interesting developments that I was privy to last week was a potential collaboration between OCLC and Elsevier.

OCLC currently has over 100 million bibliographic records, with over 25 million of those having pages in WorldCat. This represents over 1.2 billion library holdings records across 20,000 libraries. And they’ve identified characters and authors at WorldCat Identities. These are largely book items and manuscripts as represented via MARC records. I also learned that OCLC is ingesting some Wikipedia content as source material to augment/supplement some of their records. I had no idea and hadn’t heard that anywhere else.

Elsevier currently has over 32 million identity records that they are 99%+ sure map to individual people. This represents the world of journal articles.

The interesting collaboration point would be to collide these bibliographic records and these identity records and see where the overlap occurs and how much automatic disambiguation could be realized. Being able to click on a single person and see their books and articles in one place would be a great leap forward – and that’s only the first order benefit.
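The collision itself could start as simply as joining the two record sets on a normalized name key. Here’s a toy sketch in Ruby – emphatically not OCLC’s or Elsevier’s actual method, and the record shapes are hypothetical – just to show the first-pass idea:

```ruby
# Toy sketch of 'colliding' bibliographic records with identity
# records on a normalized author-name key. Real disambiguation
# would also weigh co-authors, subjects, dates, affiliations, etc.
def normalize_name(name)
  # lowercase, strip punctuation, and sort the name parts so that
  # "Smith, John" and "John Smith" produce the same key
  name.downcase.gsub(/[^a-z\s]/, ' ').split.sort.join(' ')
end

def overlap_candidates(bib_records, identity_records)
  index = identity_records.group_by { |r| normalize_name(r[:name]) }
  bib_records.map do |bib|
    key = normalize_name(bib[:author])
    [bib, index[key]] if index.key?(key)
  end.compact
end
```

Anything this naive would produce plenty of false collisions on common names – which is exactly where the interesting (hard) disambiguation work begins.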

VIAF

Another project I had not been aware of is the Virtual International Authority File (VIAF) (also at viaf.org). This is a work in progress to virtually disambiguate and connect the bibliographic records from the major national authority files around the world.

VIAF is a joint project of the Library of Congress (LC), the Deutsche Nationalbibliothek (DNB), the Bibliothèque nationale de France (BnF), and OCLC. The project’s goal is to match and link the library authority files.

They’re doing personal names and geographic names – but not topics. It’s all going to be publicly available and dereferenceable. This is a very big deal.

Thanks to Cliff Lynch and Joan Lippincott for inviting me and bringing together the players in this area who had never before all been in the same room. It was a thrill to have some candid conversations with the people who can move these large datasets into position and continue to change the way we interact with so much information.


Information Starved in SF

All of my adult life, I’ve been information-rich. I am fairly social and keep in touch with friends over time. I know my way around a computer. I understand how the internet works. I have been able to help many people find what they were looking for and I’m usually well prepared (information-wise) when entering a new situation.

Last weekend, I found myself on the opposite end of the spectrum.

I found myself continually on the short end of the information assumptions held natively by those who live/work in the Bay Area. I don’t have an iPhone. I don’t have ubiquitous access to my email and Twitter streams and Google Maps and upcoming.org. I realized I was seeing what ‘normal’ will look like in a few years, when we all have our “People in our Pocket” and access to flows of information that seamlessly aid our movement through space. I feel hyper-connected when at my desktop or on my laptop with an internet connection, but I haven’t experienced this with a mobile phone yet. I expect I’ll have to remedy this in the next few months.

The clarity came most strikingly when I realized where I was driving in my rental car was not on my paper-based mapping solution. It seems the address I was headed towards was 3-8 blocks off the edge of my map. A sense of profound helplessness was fleeting, but present. I don’t go anywhere that’s not mapped these days. I know what things look like before I get there. I’ve done the research on the hotels in an area or items on the menu before I arrive. I’ve usually also read the reviews. I exist in a communal flow of information, and all of a sudden, I was alone, ironically, in one of the most connected cities on the planet.

Reflecting on that moment, I realized that what was happening was only as clear as it was because my destination was *just* off my map. The line was cleanly drawn between knowing and not knowing – between having the knowledge and confidence in that situation and being forced to navigate with my eyes and under-exercised landmark muscle memory. I was thrown back to ‘pre-web’ days. It felt like 100 years.

I thought about Songphan’s dissertation topic of information flows during crisis/emergency. He’s looking at how, during emergencies, the information hierarchy is inverted – the people at the center of the action have the least information. When he logged into his IM account during the recent coup in Thailand, his screen lit up with messages from home – people who had little or no information, but who knew he had CNN. He became the hub for the flow that day – and he was 7,000 miles away.

The interaction with our own physical surroundings is being outsourced. How many times have you called someone from the road and asked them to look up a phone number or the directions to where you were headed?

It was strange to experience the inversion so directly. No map. No directions. I truly expect it to be one of the last times it ever happens to me by accident. Only with intent will I find myself without the ability to ‘know’ where I am and how to get where I want to go.


Travel/Conferences this winter and spring

In case either of you reading this want to catch me while I’m in your neighborhood…

  • Social Graph Foo Camp
    Feb 1-3, 2008 – O’Reilly Campus, Sebastopol, CA
    Attending for myself and for claimID – this will be my first Foo Camp. I’m excited to meet some of the names I’ve been reading/watching for some time. I’ll be in the SF area through Tuesday evening (5th).
  • iConference 2008
    Feb 27-Mar 1, 2008 – UCLA, Los Angeles, CA
    I’m attending the doctoral colloquium the day prior to the conference, and then presenting a poster about my old friend Cloudalicio.us.
  • Wedding of two dear friends
    Mar 27-Apr 3 – Santiago, Chile
    I’ve been looking forward to this for a year now…
  • ASIS&T Social Computing Summit – IA Summit
    Apr 10-14, 2008 – Miami, FL
    I’ll be speaking on the data portability and open social networks panel led by Brian Oberkirch. More information here and here… Please come say hello.


MicroID gains another foothold or three

This week (and a little of last month) – MicroID has made some major progress.

MyOpenID

First, I see that MyOpenID has implemented MicroID. Excellent implementation. They give you a lot of control over how your information should be shared – and for your confirmed email addresses, they publish MicroIDs so others can better confirm it’s really you.

Plaxo

Second – Plaxo has rolled out their canonical myplaxo.com URLs for each user. Before, the myplaxo.com space autoforwarded to an ‘add me’ page for each user. This was less than perfect for MicroID since MicroID calculates hashes based on the displayed URL in the browser.

With a tweak to the Apache configuration at Plaxo, the URLs are now stable; Joseph got it working to spec, and things are humming.

Plaxo publishes a MicroID for each of the verified email addresses in your account.
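For the curious, the computation itself is small. Here’s a minimal Ruby sketch of my understanding of the spec – hash the identity URI and the page URI separately, concatenate the hex digests, and hash again (treat this as a sketch, not a reference implementation):

```ruby
require 'digest/sha1'

# Sketch of the MicroID computation as I understand the spec:
#   sha1( sha1("mailto:you@example.com") + sha1("http://example.com/you") )
# where the inner digests are concatenated as hex strings, and the
# result is prefixed with the two URI schemes and the hash algorithm.
def microid(email, page_url)
  inner = Digest::SHA1.hexdigest("mailto:#{email}") +
          Digest::SHA1.hexdigest(page_url)
  "mailto+http:sha1:#{Digest::SHA1.hexdigest(inner)}"
end
```

This is why the displayed URL matters so much: change the page URL (say, via an autoforward) and the hash no longer verifies.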

Good stuff.

Digg

In another strong showing for the spec – this morning, Digg rolled out MicroID on their user pages as well. You can see them on any user page at http://digg.com/users/username. Additionally, Digg has taken the interesting new approach of publishing MicroIDs in its API responses. A request for developer Steve Williams’ profile via the API produces:

<?xml version="1.0" encoding="utf-8" ?>
<users timestamp="1201200757" total="1" offset="0" count="1">
 <user name="sbwms" icon="http://digg.com/users/sbwms/l.png" registered="1135702996" profileviews="14706" fullname="Steve Williams" microid="mailto+http:sha1:e945976887f47a4ae2bc20dace1a3e4a3808143c">
  <link href="http://www.baychi.org/" description="BayCHI" date="1190263703" />
  <link href="http://www.nuqu.org/" description="Moffett Blog" date="1190263688" />
  <link href="http://www.sbw.org/" description="Home Page" date="1190263641" />
 </user>
</users>

Excellent work all around – this Data Portability thing is actually going to happen one day.


On the MySpace private photos torrent

So, here is a scenario playing out live…

– Lots of people post things to social network sites.
– Some of these things are private (friends/family type of private).
– The users understand and follow the rules, and protect themselves.
– There is a bug in the system.
– Their private stuff is now available to anyone.
– Someone grabs the content.
– Then redistributes it anonymously and efficiently.

What we’ve not seen yet…

– Talking heads blowing it out of proportion.
– Reactionary bad law passed to ‘fix’ it.

Hopefully, with more mainstream news coverage and widespread understanding and adoption of better privacy practices and controls, we won’t get to the point where we have any more bad law. But I wouldn’t bet on that happening.

As of this writing, there are 4 seeders and 340 downloaders on The Pirate Bay torrent link. 17GB of photos. 567,000 images.

Yes, I’d say this has blown up.

Word of Caution

We sometimes forget we’re in uncharted territory. We are playing with the new shiny toys of the internet and not necessarily understanding the implications. These tools provide great power across the board. Users gain abilities to connect, find, sort, and publish in ways never before available. Conversely, companies gain abilities to monitor, gather, and sell more personal information than ever before. Additionally, third party observers gain the ability to observe at a distance and in numbers never possible in the physical world.

And we don’t yet know all the rules.

With all these new powers, our nuanced understanding of how we interact and the ramifications of our various ‘digital’ actions have not kept up with our abilities. We don’t know how these things “break” yet.

I would argue that this MySpace leak (as well as the Facebook minifeeds and Beacon) are examples of how these systems can break in explosive ways – ways that were not possible, and on a scale that was not possible before we were ‘hyperconnected’ and ‘always on’.

Please pay attention to what you post. Please think through what happens when it is made public. Please consider how our systems break – because it’s rather a question of “when and how” than a question of “if”.

Update: Fred weighs in, referencing this post
Update: Michael Zimmer referencing Fred


User Merge plugin for PunBB

To close out a week of very geeky Rails and PHP posts…

I released an administrator plugin for PunBB earlier this week and Rickard has posted it on the PunBB Downloads page (about 3/4ths of the way down).

User Merge. Created by Terrell Russell. The User Merge plugin allows administrators to merge two user accounts.

This follows the Broadcast Email plugin from a couple years ago…

Broadcast Email. Created by trel1023. The Broadcast Email plugin allows administrators to broadcast e-mail to all registered users via the administration interface.


Flickr Commons adds tags to Library of Congress images

Just announced this morning: a fantastic partnership between Flickr and the Library of Congress.

Flickr Commons

The Library of Congress Pilot Project

The Library of Congress has a Prints and Photographs Online Catalog comprising over 1 million images (and growing) that have been available online for over 10 years.

Back in June of 2007, we began our first collaboration with a civic institution to facilitate giving people a voice in describing the content of a publicly-held photography collection.

The key goals of this pilot project are, firstly, to give you a taste of the hidden treasures in the huge Library of Congress collection, and secondly, to show how your input of a tag or two can make the collection even richer.

You’re invited to help describe photographs in the Library of Congress’ collection on Flickr, by adding tags or leaving comments.*

*Any Flickr member is able to add tags or comment on these collections. If you’re a dork about it, shame on you. This is for the good of humanity, dude!!

I’m very excited about this and will be participating. Just look at all that good old-fashioned well-formed library data in each photo’s description…

However, I think there’s a missed opportunity here to leverage some of the extra power in having many people tag.

At Flickr’s sister site, del.icio.us, we’ve seen wonderful growth and understanding around how communities of users tag collectively. They’re not necessarily collaborating, which is why del.icio.us holds some special properties we do not see in the tagging at Flickr. However, I think Flickr should expose the identities/usernames along with the tags associated with a photo. Most photos are only tagged by the owner – it’s a safe assumption that this will continue to occur into the future. However, when the tagger is NOT the owner/uploader of the photo, this information is currently lost and not passed along in the Flickr interface.

Please, Flickr, expose the ‘who’ part of the tagging triumvirate (see last paragraph of Vander Wal’s definition). Especially now that we’ll have such rich data around our collective history.

I still strongly believe that the three tenets of a folksonomy – 1) the tag; 2) the object being tagged; and 3) the identity of the tagger – are core to the disambiguation of tag terms and provide for a rich understanding of the object being tagged.
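In data terms, the ask is small – keep the full triple instead of flattening it to a bag of tags. A hypothetical sketch (the names and structures here are mine, not Flickr’s):

```ruby
# Hypothetical model of the full tagging triple. Keeping the 'who'
# means we can later ask which tags came from people other than the
# photo's owner -- exactly the information Flickr currently drops.
Tagging = Struct.new(:tag, :photo_id, :tagger)

def non_owner_tags(taggings, owner)
  taggings.reject { |t| t.tagger == owner }.map(&:tag).uniq
end
```

With tens of thousands of people tagging the Library of Congress photos, the non-owner slice is where the collective-description signal lives.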

Another interesting note about this pilot – this is the first time we’ve seen a distinction of ‘no known copyright’:

Can anyone use “no known copyright restrictions?”
For the time being on Flickr this new usage is being contained to the Library of Congress account. If the pilot works – or, when it works! – we’ll look to allow other interested cultural institutions the opportunity to extend the application of “no known restrictions” to their catalogues.

Hooray, Library of Congress + Flickr!


Generating a Rails and PunBB (and DokuWiki) shared cookie

This is another post about something I got working in the past week that’s been bugging me for a while. I recently wrote about how to get PHP to render correctly within a Rails app.

This post is about getting single-sign-on to work with a PunBB forum inside your Ruby on Rails application. I wanted to have a user who signs into the Rails app be ‘logged into’ the forum as well. This requires setting the cookie in the same way the PunBB code does it.

I have put this code in the bottom of my application.rb, so that it can be called from anywhere in the Rails app. I would suggest setting the cookie on login and clearing it on logout.

The forum config regex is based on the configuration parser in the PunBB SDK for Rails. It parses your existing PunBB cookie_name and cookie_seed from the PunBB install so you only need to keep that information in one place (PunBB).

The necessary php_serialize.rb file also comes from the /lib directory of the PunBB SDK plugin and is courtesy of Thomas Hurst (also available directly). It should be copied/placed into your own Rails app’s /lib directory and ‘require’d accordingly.

  def set_shared_cookie
    # this is the punbb cookie
    # should be called on login from main site
    # the wiki uses the same cookie
    # setting it here allows unified login
    require 'digest/md5'
    forumconfig = get_forum_config_data() # private method at bottom
    # get forumuser info and set cookie
    forumuser = Forumaccount.find_by_username(@current_user.login)
    cookies[forumconfig[:cookie_name]] = {
      :value => PHP.serialize([forumuser.id,
        Digest::MD5.hexdigest("#{forumconfig[:cookie_seed]}#{forumuser.password}")]),
      :expires => 1.year.from_now
    }
  end

  def clear_shared_cookie
    # should be called on logout from main site
    require 'digest/md5'
    forumconfig = get_forum_config_data() # private method at bottom
    # set cookie for Guest
    cookies[forumconfig[:cookie_name]] = {
      :value => PHP.serialize([1,Digest::MD5.hexdigest("#{forumconfig[:cookie_seed]}Guest")]),
      :expires => 1.year.from_now
    }
  end

  # Uses regex to parse the php punbb config file
  # ahgsoftware.com/punbb_sdk/
  # make sure the config file exists
  # make sure 'RewriteEngine Off' is in /forum/.htaccess and /wiki/.htaccess
  def get_forum_config_data
    config_hash = Hash.new
    c = File.read(File.join(RAILS_ROOT,'public/forum/config.php'))
    c.scan(/\$(\w*)\s*=\s*['"](.*)['"];/).each do |pair|
      config_hash[pair[0].to_sym] = pair[1]
    end
    return config_hash
  end

Of course, keeping your users in sync across the main Rails app and the Forum install is its own trick – and necessary before the above cookie injection will work. I’ve got Theforum and Forumaccount models that are wired to the PunBB database. I keep the usernames and passwords synced whenever users/passwords are created/deleted/updated.

database.yml

theforum_production:
  adapter: mysql
  database: punbb_production
  host: localhost
  username: xxxxxxxxx
  password: xxxxxxxxx

theforum.rb

class Theforum < ActiveRecord::Base
  self.abstract_class = true
  establish_connection "theforum_#{RAILS_ENV}"
end

forumaccount.rb

class Forumaccount < Theforum
  set_table_name :users

  def encrypt_and_save_new_password(password)
    write_attribute("password", self.sha1hashed(password))
    save
  end

  def sync_from_account(account)
    write_attribute("email", account.email)
    forumname = account.prefix+" "+account.first_name+" "+account.last_name
    write_attribute("realname", forumname)
    save
  end

  protected

  def sha1hashed(str)
    # SHA1 hexdigest is already exactly 40 characters
    Digest::SHA1.hexdigest(str.to_s)
  end

end

A separate trick was to make DokuWiki look for and pay attention to the PunBB cookie we created at the beginning of the post. I got that for free with the shipping auth options in DokuWiki. I simply pointed my DokuWiki install at the PunBB install and the magic was complete.

Success: A login to the Rails app also sets a cookie for PunBB which is fully honored by DokuWiki.

An additional benefit is that the entire site is now under one codebase and can be installed/developed without moving as many pieces around.


Running PHP within Rails

So this has been something I’ve been putting off because it wasn’t readily apparent how to make it happen when I first tried over a year and a half ago.

I wanted to manage a group of users and passwords for a community with an existing Ruby on Rails application – and then have both a PHP-based PunBB forum and PHP-based wiki running inside the Rails app. Of course, more interestingly, I wanted to make the sessions and cookies all align and share a single sign-on.

And this past week, I poked enough at the right pieces to make it happen.

I first found this post at macdiggs.com, but its solution lacked directory mapping – it only allowed the PHP to execute inside the Rails /public/ directory. I wanted http://example.com/forum to render and not throw a Rails error. As it stood, only http://example.com/forum/index.php and URLs like it would come up correctly.

Then, I started seeing references to turning off Apache’s RewriteEngine…

The solution I’ve now got in place has two parts:

1) Edit the Apache configuration file for your virtualhost. This tells Apache not to push any requests that start with /forum or /wiki to the Rails app (here, a mongrel cluster) – and to just handle them by itself (via the PHP processor).

RewriteEngine On
# send all /forum traffic to php
RewriteCond %{REQUEST_URI} ^/forum.*
RewriteRule .* - [L]
# send all /wiki traffic to php
RewriteCond %{REQUEST_URI} ^/wiki.*
RewriteRule .* - [L]
# Redirect all non-static requests to cluster
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
RewriteRule ^/(.*)$ balancer://mongrel_cluster%{REQUEST_URI} [P,QSA,L]
<Proxy balancer://mongrel_cluster>
BalancerMember http://127.0.0.1:6000
BalancerMember http://127.0.0.1:6001
</Proxy>

2) Add/edit the .htaccess file in both the forum and the wiki directories. This halts the rewrite engine for any requests sent the way of either the forum or the wiki, letting Apache process them directly.

RewriteEngine Off

Now, both http://example.com/forum and http://example.com/wiki render correctly.

As for getting the cookie to sync – please see my next post…

P.S. And now for all the phrases that I couldn’t find anywhere when I was looking for how to make all this magic happen – maybe someone else will find this stuff here: PHP within Rails, PHP inside Rails, PunBB in Rails app, Ruby on Rails, DokuWiki, PHP/Rails integration, PHP integration with Rails
