Great piece by Jeremy Liew about how to make sense of all the user generated content. He explores four approaches prevalent right now:

Tagging is the first approach, and its use has been endemic to web 2.0. Sometimes the tagging is limited to the author of the content, and other times any user can add tags to create a folksonomy.
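To make the difference between author-only tagging and a folksonomy concrete, here is a minimal sketch in Python. The `TagStore` class, its method names and the `author_only` flag are all hypothetical and made up for illustration; no real site is claimed to work this way.

```python
from collections import defaultdict


class TagStore:
    """Toy tag store (hypothetical, for illustration only)."""

    def __init__(self, author_only=False):
        # author_only=True models sites where only the author may tag;
        # author_only=False models an open folksonomy.
        self.author_only = author_only
        self.authors = {}  # item_id -> author name
        # item_id -> tag -> set of users who applied that tag
        self.tags = defaultdict(lambda: defaultdict(set))

    def add_item(self, item_id, author):
        self.authors[item_id] = author

    def tag(self, item_id, user, tag):
        """Apply a tag; return False if this user is not allowed to tag."""
        if item_id not in self.authors:
            raise KeyError(item_id)
        if self.author_only and user != self.authors[item_id]:
            return False
        # Normalise case and whitespace so "Web2.0" and "web2.0" merge.
        self.tags[item_id][tag.strip().lower()].add(user)
        return True

    def top_tags(self, item_id, n=3):
        """Most popular tags first; ties are broken alphabetically."""
        counts = {t: len(users) for t, users in self.tags[item_id].items()}
        return sorted(counts, key=lambda t: (-counts[t], t))[:n]


store = TagStore()  # open folksonomy
store.add_item("post-1", "alice")
store.tag("post-1", "bob", "Web2.0")
store.tag("post-1", "carol", "web2.0")
store.tag("post-1", "alice", "tagging")
print(store.top_tags("post-1"))
```

In the open case the popular tags emerge from many users' choices, which is the whole point of a folksonomy; with `author_only=True` the same store rejects tags from anyone but the author.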

The second approach is to solicit structured data from users. Examples of sites that do this include wikiHow (which breaks down each how-to entry into sections such as Introduction, Steps, Tips, Warnings and Things You’ll Need), CitySearch (which asks you for Pros and Cons and for specific ratings on dimensions such as Late Night Dining, Prompt Seating, Service and Suitability for Kids) and PowerReviews (which powers product reviews at partner sites that prompt for Pros, Cons, Best Uses and User Descriptions, including both common responses as check boxes and a freeform text field with autocomplete).
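A rough sketch of what soliciting structured data can look like in code, loosely modelled on the CitySearch example above. The field names come from that description, but the `StructuredReview` class, the `validate` method and the 1 to 5 rating scale are assumptions made up for illustration.

```python
from dataclasses import dataclass, field

# Rating dimensions named in the CitySearch example above.
RATING_DIMENSIONS = (
    "Late Night Dining",
    "Prompt Seating",
    "Service",
    "Suitability for Kids",
)


@dataclass
class StructuredReview:
    """Hypothetical review form: fixed fields instead of one text blob."""

    pros: list = field(default_factory=list)
    cons: list = field(default_factory=list)
    ratings: dict = field(default_factory=dict)  # dimension -> score

    def validate(self):
        """Return a list of problems; an empty list means the form is valid."""
        errors = []
        for dim, score in self.ratings.items():
            if dim not in RATING_DIMENSIONS:
                errors.append(f"unknown dimension: {dim}")
            elif not 1 <= score <= 5:  # assumed 1-5 scale
                errors.append(f"{dim}: score must be between 1 and 5")
        return errors


review = StructuredReview(
    pros=["great patio"],
    cons=["slow on weekends"],
    ratings={"Service": 4, "Prompt Seating": 2},
)
print(review.validate())
```

Because every review shares the same fields, the site can aggregate or filter on them directly (for example, average Service score) without having to parse free text.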

The third approach to user generated data is the traditional approach to the Semantic Web. … Ideally, each web site creator would use an agreed format to mark up the meaning of each statement made on the page, in a similar way that they mark up the presentation of each element of a webpage in HTML. In a subsequent article, Iskold also notes some of the challenges with a bottom-up approach to building the Semantic Web, which can be summarized at a high level as “it’s too complicated” and “no one wants to do the work”.
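One common way to picture "marking up the meaning of each statement" is as subject, predicate, object triples. The sketch below is only that, a picture: the `Triple` type, the `match` helper and the sample statements are hypothetical, and it does not implement any actual Semantic Web format.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One machine-readable statement (hypothetical toy model)."""

    subject: str
    predicate: str
    obj: str


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every field that was given."""
    return [
        t
        for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Statements a page author might attach to a blog post (made-up data).
page = [
    Triple("post-1", "author", "alice"),
    Triple("post-1", "topic", "tagging"),
    Triple("post-2", "author", "alice"),
]

print(match(page, predicate="author", obj="alice"))
```

Even this toy version hints at the "no one wants to do the work" problem: somebody has to write every one of those statements by hand.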

The fourth approach to user generated structure is to build a central authority of meaning. Metaweb appears to be trying to do this with Freebase, a sort of “Wikipedia for structured data” which describes itself as follows:


There are clearly both advantages and disadvantages to a single authoritative source of user generated structured data, and criticisms similar to those leveled at Wikipedia (potential for systemic bias, false information, vandalism and lack of accountability, any of which could make some data unreliable) could be leveled at Freebase. Wikipedia has combated these problems largely successfully through a robust community of Wikipedians; it isn’t clear whether Freebase has yet developed a similar protective community.

I think all these approaches are important parts of the solution, and so are other technologies like natural language processing, concept identification and extraction. Overall, no single approach will address all the issues; it’s going to take a heady cocktail of approaches and technologies. One thing is for sure though: the search for meaning in the mountains of user generated content is going to be profoundly important to the evolution of web 2.0.
