One of the great agonies of a human being is searching for that little kernel of knowledge that actually answers their question. Within a traditional library, one would ask the reference librarian to lead them to the documents that, hopefully, answer their question. On the web, we use search engines as if they were reference librarians, and search engines are a poor substitute, but they are all we have for now.
Within a traditional library, the information about a book (metadata) is stored within some system (e.g. MARC), and this system is linked to some library classification (e.g. Dewey Decimal) for finding it on the shelf at a particular library. A whole profession exists for making this happen. Book metadata is chosen by professionals so that said book can be delivered to the person looking for the information within. These professionals (catalogers) are the gatekeepers of the whole system. Without them, the books might as well be strewn about.
On the web, there is no central authority. Every site is responsible for its own content. Search engines like Google use complex algorithms to try to find something that answers your question. Web site owners must take it upon themselves not only to ensure that their site stays consistent and correct, but also to supply metadata that these search engines can use to find it. Though Search Engine Optimization (SEO) is largely used to ensure potential customers find businesses, it is also important in helping users find the information they are looking for.
Now that I am aware of the importance of such metadata, I have installed a plugin for WordPress on my blog that generates Dublin Core metadata elements. These metadata elements are supposed to help others find articles like this one via search engines.
This plugin takes the metadata I was already supplying for each post and places it in the head element of the page's HTML, like so:
<meta name="DC.publisher" content="the Little Projects of Shawn M. Jones" />
<meta name="DC.publisher.url" content="http://www.littleprojects.org/blog/" />
<meta name="DC.title" content="Finding a kernel on the web" />
<meta name="DC.identifier" content="http://www.littleprojects.org/blog/?p=188" />
<meta name="DC.date.created" scheme="WTN8601" content="2011-01-19T23:52:11" />
<meta name="DC.created" scheme="WTN8601" content="2011-01-19T23:52:11" />
<meta name="DC.date" scheme="WTN8601" content="2011-01-19T23:52:11" />
<meta name="DC.creator.name" content="Shawn M. Jones" />
<meta name="DC.creator" content="Shawn M. Jones" />
<meta name="DC.rights.rightsHolder" content="Shawn M. Jones" />
<meta name="DC.language" content="en-US" scheme="rfc1766" />
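To make the mechanics concrete, here is a rough sketch in Python of how tags like these could be generated from a post's existing fields. This is not the plugin's actual code (the plugin is written for WordPress); the function name and the dictionary keys (title, url, created, author, language) are hypothetical and chosen only for illustration.

```python
from html import escape


def dublin_core_meta(post):
    """Render a few Dublin Core <meta> tags from a dict of post fields.

    The keys used here (title, url, created, author, language) are
    hypothetical; a real blog engine would supply its own field names.
    """
    # (element name, value, optional scheme attribute)
    fields = [
        ("DC.title", post["title"], None),
        ("DC.identifier", post["url"], None),
        ("DC.date", post["created"], "WTN8601"),
        ("DC.creator", post["author"], None),
        ("DC.language", post["language"], "rfc1766"),
    ]
    lines = []
    for name, value, scheme in fields:
        scheme_attr = f' scheme="{scheme}"' if scheme else ""
        # Escape the value so quotes or ampersands cannot break the attribute.
        content = escape(value, quote=True)
        lines.append(f'<meta name="{name}"{scheme_attr} content="{content}" />')
    return "\n".join(lines)


example_post = {
    "title": "Finding a kernel on the web",
    "url": "http://www.littleprojects.org/blog/?p=188",
    "created": "2011-01-19T23:52:11",
    "author": "Shawn M. Jones",
    "language": "en-US",
}

print(dublin_core_meta(example_post))
```

The point is simply that nothing new has to be written by hand: the title, date, and author already exist in the blog's database, and the tags are just another rendering of them.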
I don’t really expect my search engine rankings to go up. The real win here is that I’m helping others index my site in case I’ve actually provided exactly the information someone is looking for. In a way, this is a form of SEO, but it gets back to that cataloging spirit originally found in the library. There is no common list of tags or subjects for the web that we all must adhere to, but little steps like this bring us closer to finding the information we are looking for.
Take a look at the source for some of your favorite news sites; you’ll probably see similar metadata in their headers too.
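If reading page source by eye gets tedious, a few lines of Python can pull these elements out for you. The following is only a sketch using the standard library's html.parser module; the class name, the function name, and the sample HTML are my own inventions for illustration, and it assumes the site names its elements with a "DC." prefix as the tags above do.

```python
from html.parser import HTMLParser


class DublinCoreParser(HTMLParser):
    """Collect (name, content) pairs from <meta> tags whose name starts with "DC."."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are also routed here by default.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.lower().startswith("dc."):
            self.elements.append((name, attributes.get("content") or ""))


def extract_dublin_core(html_text):
    """Return the Dublin Core meta elements found in html_text, in document order."""
    parser = DublinCoreParser()
    parser.feed(html_text)
    parser.close()
    return parser.elements


# Hypothetical sample page; a real page would be downloaded first.
sample = """<html><head>
<meta name="DC.title" content="Finding a kernel on the web" />
<meta name="DC.creator" content="Shawn M. Jones" />
<meta name="viewport" content="width=device-width" />
</head><body></body></html>"""

for name, content in extract_dublin_core(sample):
    print(f"{name}: {content}")
```

To try it against a live site, you would fetch the page's HTML first (for example with urllib.request.urlopen) and pass the decoded text to extract_dublin_core.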
For further reading:
Metadata: The Foundations of Resource Description