From IT

Revision as of 18:48, 12 January 2011 by Angelina (Talk | contribs)


The whole document is also available in PDF form: File:Here

Tags introduction

Tags in general are one of the most recognized Web 2.0 products. Going one step further, one can say that machine tags, as a part of the semantic web, should become one of the most recognized Web 3.0 products. Before we try to understand the uses of machine tags, we should first understand what tags and tagging mean, what kinds of tags exist and how internet users can exploit them.

There is no official definition of tags and tagging, but several characteristics apply to all tags. Tags are user-contributed (user-generated) descriptive strings, such as labels and keywords, that describe a piece of content. Those strings should be relevant to the piece of content and easily associated with it. Content here means URLs, web pages, texts, images, videos, geographic maps, blog entries etc. Tags are not the same as keyword annotations. The difference is that tags are flat, disorganized, free-form strings made by users, whereas keyword annotations are usually part of a predefined vocabulary given by different authors, web systems (web sites, web directories, web platforms etc.) or librarians.

The fact that tags are made by humans according to their own understanding of the content is both an advantage and a disadvantage of tagging systems. It is an advantage in the sense that the user knows and understands the meaning of the content (data), and by adding tags he can more easily remember, retrieve, recognize, save, browse and search for content. The major disadvantage is that the same content can be tagged differently by different people. For example, images on Flickr could be tagged according to the place where they were taken (geo-tags) or according to their content. If we have an image of a mountain, we can tag it with: “winter” (the time of year when the image was taken), “Zlatibor” (place), “skiing” (the activity shown in the image). But the same image could also be tagged with: “January” (a winter month), “Obudojevica” (a ski resort on Zlatibor), “skiing”. Another problem of tagging systems is that the “system” doesn’t understand the meaning of the tags. For example, the tag “java” can describe a programming language, coffee and an island; the tag “apple” can apply to both a computer company (Apple Inc.) and a fruit. For individual tagging on a personal computer these problems are not crucial, but in collaborative / shared tagging systems (like delicious.com, flickr.com, digg.com) they are critical.

In social bookmarking web sites (collaborative tagging communities), users can share tags with one another, retrieve tagged content online, and search, browse and filter tags. Examples of such communities are Delicious, Flickr, Digg etc. We can distinguish social bookmarking communities according to the type of content they are used to tag:

Tagging for URLs (for example: del.icio.us, stumbleupon.com)
Tagging for photos (for example: flickr.com)
Tagging for videos (for example: youtube.com)
Tagging for news (digg.com, reddit.com, netscape.com)
Tagging for books (librarything.com, openlibrary.org)
Tagging for academic articles (citeulike.com)
Tagging for retail products (amazon.com)

All of these collaborative tagging systems share the problems described above. Some of them try to resolve these problems by using machine tags.

Machine tags definition

The idea of machine tags follows the basic idea of the semantic web: to give a meaning to every tag, so that it can be understood and interpreted by machines. Machine tags keep the characteristics of “ordinary” tags, but also provide a variety of new possibilities.

Machine tags are an extension of “ordinary” tags: they are made by humans according to their understanding of the content and they are descriptive, but they are written in a specific format so that a machine can read them, understand them and perform a specific action accordingly. They add extra semantic information about the tag and, indirectly, about the content. Machine tags are semi-automated (they must be added by humans, and then the machine can perform an action) and they can be understood as a link between tags and keyword annotations (at the moment, machine tags are provided by the collaborative system as part of its API; users can add their own, but those will be parsed as regular, flat tags). The table below lists the characteristics of “ordinary” tags, machine tags and keyword annotations.

Tags	Machine tags	Keyword annotations
Single word (usually)	Descriptive	List of terms (tags) with predefined meaning
Descriptive	User contributed (user generated)	Structured
User contributed (user generated)	Collaborative	Organized
Collaborative	Structured	Part of the vocabulary (usually)
Flat	Organized	
Disorganized	Semi-automated	
Free-form strings	Link between tags and keyword annotations	

Table 1: Characteristics of the tags, machine tags and keyword annotations

Machine tags are also known as triple tags. The first examples of machine tags were geo-tags (specific identifiers of a geographical location) provided by Delicious and GeoBloggers.

Machine tags structure

The structure of “normal” tags, as defined above, is flat. They are usually single-word, free-form strings given by users. Machine tags, in contrast, have a simple, well-defined structure. Note that the structure of machine tags can vary from one collaborative tagging system to another, since there is no standard for machine tags yet. In this work we focus on the structure of machine tags on Flickr.

Machine tags comprise three parts: a namespace, a predicate and a value.

Namespaces are used to distinguish between multiple meanings of the same term. Namespaces should be used when site-specific information is being encoded. [1] The namespace defines a class or a facet that a tag belongs to ('geo', 'flickr', etc.). It describes “who is going to take care of the tag”.

A predicate defines the type of value being described. It is the name of a property within a namespace ('latitude', 'user', etc.) and describes “what the tag applies to”. The value is the specific value of the tag (“which one is this”). Not all machine tags need to have a value. The example Flickr uses to explain machine tags is the “Upcoming.org” example [2] [3]. The starting assumption is that we have images from an event that is listed on upcoming.org. On Flickr, we can add the machine tag “upcoming:event=428084”, where “upcoming” is the namespace, “event” is the predicate and “428084” is the value. When this machine tag is added to a photo, the photo is automatically shown on the upcoming.org “events” page. How? When the machine tag is added, a robot squirrel on the Flickr servers calls the robot squirrels running the Upcoming API and asks for the Upcoming event with ID 428084. The Upcoming robots answer: “event 428084 is named Flickr Turns 4”, and Flickr stores that name in its database. The next time a user loads that photo, the Upcoming icon with the name of the event is shown in the sidebar.

Image 1: structure of the machine tags

Another example is recording location information by entering latitude and longitude as geo:lat=12.345678 and geo:lon=12.345678.
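The three-part structure can be illustrated with a short Python sketch. It is only an illustration under simple assumptions, not Flickr's own code: the helper names split_machine_tag and extract_location are hypothetical, and the sketch does not validate the tag syntax.

```python
def split_machine_tag(tag):
    """Split 'namespace:predicate=value' into its three parts.

    Illustrative helper (not part of any Flickr library); it performs
    no validation and returns empty strings for missing parts.
    """
    namespace, _, rest = tag.partition(":")
    predicate, _, value = rest.partition("=")
    return namespace, predicate, value


def extract_location(tags):
    """Return (lat, lon) from geo:lat / geo:lon tags, or None if incomplete."""
    coords = {}
    for tag in tags:
        namespace, predicate, value = split_machine_tag(tag)
        if namespace == "geo" and predicate in ("lat", "lon"):
            coords[predicate] = float(value)
    if "lat" in coords and "lon" in coords:
        return coords["lat"], coords["lon"]
    return None


# Tags of one hypothetical photo: a flat tag plus three machine tags.
photo_tags = ["skiing", "geo:lat=12.345678", "geo:lon=12.345678", "upcoming:event=428084"]
print(split_machine_tag("upcoming:event=428084"))
print(extract_location(photo_tags))
```

The flat tag “skiing” is simply ignored, because it has no namespace or predicate that a machine could act on.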

Simple syntax rules of the machine tags are given below:

A "namespace":

Namespaces MUST begin with any character between a - z; remaining characters MAY be a - z, 0 - 9 and underbars. Namespaces are case-insensitive.

A "predicate":

Predicates MUST begin with any character between a - z; remaining characters MAY be a - z, 0 - 9 and underbars. Predicates are case-insensitive.

A "value":

Values MAY contain any characters that "plain vanilla" tags use. Values may also contain spaces but, like regular tags, they need to be wrapped in quotes.

Namespace and predicates are separated by a colon : ":"
Predicates and values are separated by an equals symbol : "="
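The syntax rules above can be expressed as a regular expression. The following Python sketch is one possible reading of those rules, not Flickr's actual parser: the name parse_machine_tag is hypothetical, and allowing an empty value after the equals symbol is an assumption based on the remark that not all machine tags need a value.

```python
import re

# Namespace and predicate: a letter first, then letters, digits or underscores.
# Value: any characters (possibly none), optionally wrapped in double quotes.
MACHINE_TAG = re.compile(
    r'^(?P<namespace>[a-z][a-z0-9_]*):(?P<predicate>[a-z][a-z0-9_]*)=(?P<value>.*)$',
    re.IGNORECASE | re.DOTALL,
)


def parse_machine_tag(raw):
    """Return (namespace, predicate, value), or None for a regular flat tag."""
    match = MACHINE_TAG.match(raw.strip())
    if match is None:
        return None
    value = match.group("value")
    # Strip the quotes that wrap values containing spaces.
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        value = value[1:-1]
    # Namespaces and predicates are case-insensitive, so normalise them.
    return match.group("namespace").lower(), match.group("predicate").lower(), value


print(parse_machine_tag('Flickr:User=straup'))
print(parse_machine_tag('geo:quartier="plateau mont royal"'))
print(parse_machine_tag('skiing'))
```

Anything that does not match the pattern is treated as an ordinary flat tag, which mirrors the idea that a system first checks whether a tag is a machine tag and otherwise handles it as a regular one.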

Like tags, there are no rules for machine tags beyond the syntax, described above, to specify the parts of a machine tag. For example, you could tag a photo with:

flickr:user=straup
flickr:id=2436387779 – for flickr photo id
flora:tree=coniferous – user defined machine tag
medium:paint=oil – user defined machine tags
geo:quartier="plateaumontroyal" – geo tags
geo:neighbourhood=geo:quartier – geo tags

The complete list of the namespaces and predicates that currently exist on Flickr can be found in Paul Mison’s machine tags browser application, which is described in the “Machine tags search and browsing” section. Machine tags on Flickr are part of the API. Anyone can make machine tags, but in some cases they will be processed like normal, flat tags. In the case of Flickr, “machine tags are added exactly the same as any other tag whether it is done through the website or the API. When the Flickr supercomputer processes your tags, we take a moment to check whether it is a machine tag.” In addition, machine tags can also be queried through the API. How to add machine tags is described through examples in the “Examples” section, while searching and browsing of machine tags is covered in the “Machine tags search and browsing” section.
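As an illustration of querying machine tags through the API, the Python sketch below only builds a request URL for the flickr.photos.search method, which accepts a machine_tags parameter. It does not send the request; the function name build_machine_tag_search_url is hypothetical and "YOUR_API_KEY" is a placeholder that would have to be replaced by a real key.

```python
from urllib.parse import urlencode

# REST endpoint of the Flickr API.
FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_machine_tag_search_url(machine_tag, api_key):
    """Build (but do not send) a flickr.photos.search request for one machine tag."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,           # placeholder, supplied by the caller
        "machine_tags": machine_tag,  # e.g. "upcoming:event=428084"
        "format": "json",
        "nojsoncallback": "1",
    }
    # urlencode percent-encodes the ':' and '=' inside the machine tag.
    return FLICKR_REST_ENDPOINT + "?" + urlencode(params)


url = build_machine_tag_search_url("upcoming:event=428084", "YOUR_API_KEY")
print(url)
```

Fetching this URL with a valid key would return the photos carrying that machine tag; the sketch stops at building the URL so that it runs without network access.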

Machine tags extensions – Machine tag extras

Besides the normal tags that users add to pictures (called “raw” tags on Flickr), the tags that appear in URLs (called “clean” tags by Flickr) and machine tags, Flickr introduced a new term: “machine tag extras”. By “machine tag extras” [4], the developers of Flickr mean “the entire process of the machine tags as a kind of foreign key to access data stored on another website.” Take a look at the “upcoming.org” example again. Besides “upcoming:event=XXXX”, Flickr supports other web sites, such as:

Dopplr.com is a social travel web site that has launched the “Social Atlas”, where users can recommend places to stay, eat and visit in the places they know. The structure of the machine tags for this web site is: “dopplr:(eat|stay|explore)=XXXX”
Open Plaques.org is a community-run web site set up to catalogue and document the blue plaques that hang across the UK to commemorate famous persons and events. The structure of the machine tags for this web site is: “openplaques:id=XXXX”
OpenLibrary.org is an internet archive devoted to creating a web page for “every book ever published”. The structure of the machine tags for this web site is: “openlibrary:id=XXXX”
The Burning Man project is aimed at providing a digital space which encompasses the entire event and community, both in BRC and in archival form, long after the dust has settled [5]. The structure of the machine tags for this web site is: “burningman:(camp|art|event)=XXXX”
Last.fm is a music recommendation service. The structure of the machine tags for this web site is: “lastfm:event=XXXX”
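The “foreign key” idea can be sketched in Python as a small lookup table that recognises the tag structures listed above and extracts the identifier to be looked up on the partner site. This is a hypothetical illustration, not Flickr's implementation: the names EXTRA_PATTERNS and match_extra are invented for this example, and the actual call to the partner site's API is left out.

```python
import re

# Tag structures copied from the list above; the "id" group stands for XXXX,
# the identifier of the record on the partner web site.
EXTRA_PATTERNS = {
    "Upcoming": r"upcoming:event=(?P<id>.+)",
    "Dopplr": r"dopplr:(?P<kind>eat|stay|explore)=(?P<id>.+)",
    "Open Plaques": r"openplaques:id=(?P<id>.+)",
    "Open Library": r"openlibrary:id=(?P<id>.+)",
    "Burning Man": r"burningman:(?P<kind>camp|art|event)=(?P<id>.+)",
    "Last.fm": r"lastfm:event=(?P<id>.+)",
}


def match_extra(tag):
    """Return (site, captured parts) if the tag is a known extra, else None."""
    for site, pattern in EXTRA_PATTERNS.items():
        match = re.fullmatch(pattern, tag, re.IGNORECASE)
        if match:
            return site, match.groupdict()
    return None


print(match_extra("openplaques:id=1372"))
print(match_extra("dopplr:eat=XXXX"))
print(match_extra("flora:tree=coniferous"))
```

A user-defined tag such as flora:tree=coniferous is still a valid machine tag, but it matches none of the supported sites, so no external data would be fetched for it.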

The use of these machine tags is described in the examples below.

Machine tags examples

Dopplr.com example:

Each place on Dopplr.com can be tagged with 3 machine tags:

dopplr:eat= - for tagging places with good food that you would like to recommend / remember
dopplr:stay= - for tagging places that you would like to recommend / remember as good places to stay
dopplr:visit= - for tagging places that should be visited in some city

The good point about Dopplr.com is that users don’t have to be familiar with the syntax and semantic rules of machine tags; they can just copy the tags from the Dopplr.com web site and paste them onto their photos on Flickr.

An example of the dopplr:stay= and dopplr:visit= machine tags is available here.

On the Dopplr.com web site, machine tags are used to connect different kinds of practical information about a tourist destination that might be relevant to users or (potential) visitors. Dopplr.com could be considered a new, extended recommendation service: besides simple ratings, it provides photos of the rated item. If we take food as an example: instead of just providing textual feedback (or feedback in the form of ratings on a scale) about a restaurant and its “delicious” food, visitors also provide photos of the food. Based on the photos, other users (travelers, potential customers) can decide whether the restaurant is interesting to them, whether the portions are big enough etc. Accommodation can be rated similarly: we don’t have to rate the cleanliness of the rooms, we can just connect photos with a particular hotel.

OpenPlaques.org example:

Since this web site is made only for the UK market, I took images from Flickr that have been tagged with the “openplaques:id=XXXX” machine tag. Besides id, the predicates that might be used are: context (describing famous houses), todo and test. If we take a photo tagged with “openplaques:id=1372” on Flickr, we will see “additional information” at the bottom of the page, linking us to openplaques.org, where all information about Thomas Lord is given. The machine tag can be seen under “tags / show machine tags”.

Image 2: Additional info on Flickr shows the openplaques.org web site, which provides additional information about Thomas Lord.

OpenPlaques.org represents the idea of how machine tags could be used in the process of gathering information and creating a database on any topic. If we consider that streets can change names over time, and that for each of them we have an OpenPlaques.org ID, OpenPlaques.org could be used to store information about each name (person or event) and to analyse historical changes and influences in a country / region over time. Another interesting use of the OpenPlaques.org platform could be to see how “popular” or influential a person is. For example, if the same street name appears in several different cities, we could use a “link popularity” approach to measure the popularity / influence of that person in a country / region.

OpenLibrary.org example:

The process of adding machine tags for openlibrary.org is also simple. To add a book, visitors need to register and fill in a form with basic information about the book (name, author, publishing date, publisher and, optionally, id). To add the machine tag, one needs to add “openlibrary:id=XXXX”. OpenLibrary provides a virtual space for each cover, which contains: the title of the book, a description, different identifiers (openlibrary id, the ISBN needed for linking with book sellers, WorldCat ID etc.), and information and links showing where we can buy / read / borrow the book.

Examples of photos tagged with the “openlibrary” namespace are: Flickr image 1 and Flickr image 2.

Gathering books by their cover pages is an attempt to provide a virtual space for each book that has ever been published (note that one book can have several cover pages and publishers over time), but also to link all the publishers, authors, languages and countries in which the book has been published, and the reviews that exist for the book. For example, some book editions are famous because of the reviews and overviews they include, while others are interesting because of the illustrations they contain. By using machine tags, OpenLibrary.org provides all information about a single book in one place, and with it users can access the book itself and the information related to it more easily.

The phenomenon of machine tags
