PataMetaData: An HTML Resume, Part II

Last time, I discussed the general structure of my semantic HTML resume & my usage of the <section> element. For part two, I am focusing on a very specific segment of my resume.

A Publications List

I was excited to work on this <section> of my resume because it is such a unique form of content. My starting point was the <cite> element &, for the first time, I became frustrated with HTML5's semantics. To me, as someone who teaches citation styles & writes academic articles, <cite> should logically wrap an entire citation: author, title, publishing information, etc. But <cite> in HTML5 is only for the title of a work. HTML5 Doctor has a great article on the drama surrounding <cite>'s semantics; basically, HTML5 has limited the scope of <cite> by forbidding the wrapping of an author's name or full citation in <cite>, which leaves us with no single element for an academic citation.

This completely destroys the utility of processing data inside of <cite> tags because a work's title is a radically insufficient identifier. Consider a journal article; if you only know the title of the journal, you're a long way from tracking down the individual article, & HTML5 has no means of telling you where a citation begins & ends. The spec even shows an example where a citation is wrapped in a <p>, meaning the citation A) is almost certainly rendered improperly by the user agent, B) isn't part of a list (HTML5 Doctor's example of <cite> in an academic citation, by the way, does show a citation as a list item within an ordered list), & C) is indistinguishable from other paragraphs. How do I tell a main body paragraph with a <cite> in it from a works cited list?

For those working with citation information (read: every web librarian out there, but many other academics as well), this is a massive void that has to be filled with a few more complex approaches. True, it is perhaps asking too much for HTML5 to be as thorough as, say, the Citation Style Language XML format that Zotero uses, but the current state of HTML citations is still disappointing.
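With no dedicated citation element, one workaround is to let a list item mark where each citation begins & ends, & to reserve <cite> for the title alone. The following is only a minimal sketch of that idea: the id & class names are hypothetical, the actual resume markup may differ, & the citation details are the ones carried in the COinS example from this post.

```html
<!-- Sketch only: the id and class names are hypothetical. -->
<section id="publications">
  <h2>Publications</h2>
  <!-- One list item per citation, so the <li> marks where a citation begins and ends. -->
  <ol class="citations">
    <li class="citation">
      Phetteplace, Eric.
      "Hardening the Browser: Protecting Patron Privacy on the Internet."
      <!-- <cite> wraps only the title of a work, here the journal title. -->
      <cite>Reference &amp; User Services Quarterly</cite>
      51.3 (2012): 210-214.
    </li>
  </ol>
</section>
```

A script could then treat every li.citation as one citation, though that relies on a class-name convention rather than on HTML5's own semantics, which is exactly the gap described above.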

However, <cite> (which, according to spec, user agents should render in italics) does correspond with what is usually italicized in a citation, such as a book or journal title. So by marking publication names with <cite> I was able to semantically render publication titles in italics, as opposed to the clearly un-semantic approach of wrapping a title in <em> tags or a-semantic approaches involving styled <span>s. But doing a hanging indent in HTML/CSS still feels very hacky: li { text-indent: -2em; margin-left: 2em; } (yes, that's right, a negative text indent offset by a positive margin). The <cite> element appears to be an ideal place for a user agent to render a hanging indent by default, something that no HTML element currently does. If the standard isn't going to do that, then at least a "hanging" flag for the text-indent property in CSS should be on the table; e.g. li { text-indent: 2em hanging; } would replace the negative text-indent hack.
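Spelled out in full, the hanging indent hack might look like the sketch below. The "publications" class name is hypothetical, & removing the list numbering is an extra assumption made only so the indent is easy to see.

```html
<!-- Sketch only: the "publications" class name is hypothetical. -->
<style>
  /* Drop the list numbering and default padding so the indent is easy to see. */
  ol.publications {
    list-style: none;
    padding-left: 0;
  }

  /* The hack: shift the whole item right by 2em, then pull the first line back by 2em. */
  ol.publications li {
    margin-left: 2em;
    text-indent: -2em;
  }
</style>

<ol class="publications">
  <li>
    Phetteplace, Eric. "Hardening the Browser: Protecting Patron Privacy on the Internet."
    <cite>Reference &amp; User Services Quarterly</cite> 51.3 (2012): 210-214.
  </li>
</ol>
```

The first line of each item starts at the left edge while wrapped lines sit 2em in, which is the hanging indent most citation styles expect.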

The second part of marking up my publication list was exposing the citations to OpenURL applications. I read up on COinS & attempted to implement the standard, which is an a-semantic approach involving empty <span>s with a string inside the title attribute that can be appended to an OpenURL resolver's base URL. Producing the code is somewhat nontrivial &, while I'm glad I read the spec & developed my inchoate understanding, I used the COinS Generator to speed up the process. Here's an example <span> from my publications list:

<span class="Z3988" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rfr_id=info%3Asid%2FoCOinS.info%3Agenerator&rft.genre=article&rft.atitle=Hardening+the+Browser%3A+Protecting+Patron+Privacy+on+the+Internet&rft.title=Reference+%26+User+Services+Quarterly&rft.issn=1094-9054&rft.date=2012&rft.volume=51&rft.issue=3&rft.spage=210&rft.epage=214&rft.ssn=spring&rft.aulast=Phetteplace&rft.aufirst=Eric&rft.auinit=EP&rft.au=Eric+EP+Phetteplace"></span>

Could that be any prettier? Adding embedded metadata like COinS is probably not a step most authors take, but it offers an instant gratification that mere semantic HTML lacks: you can instantly see your metadata consumed by applications like Zotero & Mendeley. If your library uses an OpenURL resolver, chances are that a "find full text" icon will appear next to your citations when they're viewed from within the library network. Wikipedia employs this tactic to great effect.
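To picture what a COinS-aware tool does with that title attribute, here is a rough sketch of the kind of link it could build by appending the string to a resolver address. The resolver address resolver.example.edu/openurl is hypothetical, only a subset of the keys from the <span> is used to keep the line manageable, & each & is written as &amp;amp; inside the href.

```html
<!-- Sketch only: resolver.example.edu/openurl is a hypothetical resolver address. -->
<!-- The query string reuses a subset of the key/value pairs from the COinS title attribute. -->
<a href="https://resolver.example.edu/openurl?ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&amp;rft.genre=article&amp;rft.atitle=Hardening+the+Browser%3A+Protecting+Patron+Privacy+on+the+Internet&amp;rft.title=Reference+%26+User+Services+Quarterly&amp;rft.issn=1094-9054&amp;rft.date=2012&amp;rft.volume=51&amp;rft.issue=3&amp;rft.spage=210">Find full text</a>
```

The same key/value pairs that look so ugly in the <span> become an ordinary query string once a resolver address is placed in front of them.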

[Image: Zotero offers to save two articles in my publications list]

Ahhh yeah, that's the good stuff.

So here we have a totally a-semantic approach that's still exposing metadata to lots of applications that can reuse it today. That is the promise of all this semantic markup: that machines will be able to process it with greater accuracy & that humans will think up interesting uses for the processed markup. So while COinS may look hideous, it's a pretty great example of the potential for markup to interact with APIs.

Thursday, April 26, 2012
