The Semantic Web - HTML5 Microdata

The "semantic web" as a theory has been around for ages and I remember working with people, a decade ago, who were investigating how to build a semantic web.

The semantic web, a term coined by Sir Tim Berners-Lee, is a vision that would allow automated agents and software to access the Web intelligently, via machine-readable metadata embedded within content.

A number of standards, tools, methodologies and technologies have been created to aid in the development of a semantic web, yet it is still unrealised and eludes the world.

There are a number of reasons for this including the physical size of the web, the vastness of knowledge and how to categorise it all into suitable classes, and the completeness, consistency and standardisation of information, to name just a few issues to deal with. I imagine some even question whether it is truly possible due to the sheer scale and requirements involved.

Probably the biggest outcome of working towards a semantic web is Extensible Markup Language (XML). XML is a set of rules for encoding documents in machine-readable form. However, it's not always practical to create and/or display all the information and data of a site in XML form. Luckily there are various other ways to add metadata to existing HTML docs.

Three popular (and when I say popular I mean supported by Google - http://www.google.com/support/webmasters/bin/topic.py?hl=en&topic=21997) methods for adding metadata to HTML content are:

  • RDFa
  • Microformats
  • Microdata

Adding these to HTML makes the content understandable by machines and agents such as search engine bots and spiders, and lets Google display "rich snippets" for it in search results.

RDFa (Resource Description Framework in attributes)

RDFa extends XHTML with attribute-level extensions that allow rich metadata to be embedded within Web documents. It is not limited to XHTML, though: RDFa can add metadata to any XML-based language.
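As a rough sketch of what those attribute-level extensions look like, the snippet below marks up a person with the typeof and property attributes, then pulls the values back out with Python's standard html.parser module, roughly as a bot might. The person, the "v:" vocabulary namespace and the helper names (RdfaPropertyParser, extract_rdfa) are all assumptions made for illustration, and this is nowhere near a full RDFa processor.

```python
from html.parser import HTMLParser

# Illustrative markup only: the person is made up, and the "v:" vocabulary
# namespace is an assumption rather than something taken from this post.
RDFA_HTML = """
<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Person">
  <span property="v:name">Jane Doe</span>,
  <span property="v:title">Web Developer</span>
</div>
"""


class RdfaPropertyParser(HTMLParser):
    """Collects the text of each element that has a 'property' attribute.

    Assumes flat markup: property elements hold plain text, no nested tags.
    """

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # property name of the element being read

    def handle_starttag(self, tag, attrs):
        name = dict(attrs).get("property")
        if name:
            self._current = name
            self.properties[name] = ""

    def handle_data(self, data):
        if self._current:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Any closing tag ends the current property (flat markup assumption).
        self._current = None


def extract_rdfa(html):
    """Hypothetical helper: returns {property name: text} for the snippet."""
    parser = RdfaPropertyParser()
    parser.feed(html)
    parser.close()
    return {name: text.strip() for name, text in parser.properties.items()}


print(extract_rdfa(RDFA_HTML))
# {'v:name': 'Jane Doe', 'v:title': 'Web Developer'}
```

The visible text of the page is unchanged; only the extra attributes tell a machine which piece of text is the name and which is the job title.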

http://www.w3.org/TR/xhtml-rdfa-primer/
http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=146898

Microformats

Microformats are already widely used by applications such as linkedin.com and facebook.com for user profile information. Microformats generally use the "class" attribute of elements to describe the encapsulated data, but sometimes use the id, title, rel or rev attributes as well.

Microformats continue to be developed, with standards for products, reviews, etc. being released.
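To make the "class" approach concrete, here is a minimal sketch using hCard, the microformat for people and organisations. The markup reuses ordinary class names (vcard, fn, title, org) and a small parser built on Python's standard html.parser reads them back. The person and company are made up, and HCARD_PROPERTIES, HCardParser and extract_hcard are hypothetical names that cover only the three properties used here, not the whole hCard format.

```python
from html.parser import HTMLParser

# Illustrative hCard: the person and company are made up.
HCARD_HTML = """
<div class="vcard">
  <span class="fn">Jane Doe</span>,
  <span class="title">Web Developer</span> at
  <span class="org">Example Ltd</span>
</div>
"""

# Only the hCard property names used in this sketch, not the full list.
HCARD_PROPERTIES = {"fn", "title", "org"}


class HCardParser(HTMLParser):
    """Collects the text of elements whose class names an hCard property.

    Assumes flat markup: property elements hold plain text, no nested tags.
    """

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # hCard property of the element being read

    def handle_starttag(self, tag, attrs):
        # A class attribute may hold several space-separated names.
        classes = (dict(attrs).get("class") or "").split()
        matches = [c for c in classes if c in HCARD_PROPERTIES]
        if matches:
            self._current = matches[0]
            self.properties[self._current] = ""

    def handle_data(self, data):
        if self._current:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Any closing tag ends the current property (flat markup assumption).
        self._current = None


def extract_hcard(html):
    """Hypothetical helper: returns {hCard property: text} for the snippet."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return {name: text.strip() for name, text in parser.properties.items()}


print(extract_hcard(HCARD_HTML))
# {'fn': 'Jane Doe', 'title': 'Web Developer', 'org': 'Example Ltd'}
```

Note that the parser has to know the property names in advance, because nothing in the markup separates a microformat class from a class that exists purely for styling.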

http://microformats.org/
http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=146897

Microdata

Microdata is a feature of HTML5. Much has been made of the canvas tag, video, animation and the interactive creative possibilities of HTML5; however, one area that has not received much attention is Microdata. HTML5 Microdata achieves effectively the same result as RDFa and Microformats, by semantically tagging data.
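A minimal sketch of Microdata in practice: the snippet below tags a person using the itemscope, itemtype and itemprop attributes and reads the properties back with Python's standard html.parser. The person, the vocabulary URL in itemtype and the helper names (ItempropParser, extract_microdata) are assumptions for illustration, and the parser deliberately ignores nested items and the rest of the Microdata processing rules.

```python
from html.parser import HTMLParser

# Illustrative markup only: the person is made up, and the vocabulary URL
# in itemtype is an assumption rather than something taken from this post.
MICRODATA_HTML = """
<div itemscope itemtype="http://data-vocabulary.org/Person">
  <span itemprop="name">Jane Doe</span>,
  <span itemprop="title">Web Developer</span>
</div>
"""


class ItempropParser(HTMLParser):
    """Collects the text of each element that has an 'itemprop' attribute.

    Assumes flat markup: property elements hold plain text, no nested tags.
    """

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # itemprop name of the element being read

    def handle_starttag(self, tag, attrs):
        name = dict(attrs).get("itemprop")
        if name:
            self._current = name
            self.properties[name] = ""

    def handle_data(self, data):
        if self._current:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Any closing tag ends the current property (flat markup assumption).
        self._current = None


def extract_microdata(html):
    """Hypothetical helper: returns {itemprop name: text} for the snippet."""
    parser = ItempropParser()
    parser.feed(html)
    parser.close()
    return {name: text.strip() for name, text in parser.properties.items()}


print(extract_microdata(MICRODATA_HTML))
# {'name': 'Jane Doe', 'title': 'Web Developer'}
```

Unlike the class-based approach, the itemprop attribute exists only to carry metadata, so a parser does not need to guess which values are meant for machines.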

http://dev.w3.org/html5/md/
http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=176035

And finally....

RDFa seems relatively complex to use and implement. Microformats are popular, yet currently limited in scope, and I also have a bit of an issue with using classes to describe data. Microdata seems like it will be the future for tagging content, but it has limited support and is still in the early stages. Having said that, Google uses Microdata, and where Google goes, others follow.

Comments

  1. As a fellow web developer, I completely agree with your assessment of the future of metadata on the web.

    That being said, from what I read earlier this morning, Google actually supports all three of the types of semantic markup you've been discussing, to some extent.

    http://www.google.com/support/webmasters/bin/topic.py?topic=21997

    Then again, HTML5 has various other benefits (iPhone video, simpler and more semantic markup, etc) in addition to having one of the most straightforward microdata markup schemes for SEO.

    It's my opinion that this is the direction the web is going to be taking.

  2. Hi James,

    Thanks for the comment. You're right that Google currently supports all three types of semantic markup to varying extents, but this attempt at describing data still feels a little messy to me, with loads of different companies taking loads of different approaches and no de facto standard anywhere.

    As you rightly point out though, HTML5, along with all its other features, also has some new elements like 'nav', 'details' and 'menu', to name a few, which could help pave the way for making data more structured and easier to process. It's just a shame that, as web developers, we have to accommodate legacy browsers and wait for all browsers to become compliant.
