The phenomenon of machine tags
Tags in general are among the most recognized Web 2.0 products. Going one step further, one can say that machine tags, as a part of the semantic web, should become one of the most recognized Web 3.0 products. Before we look at the uses of machine tags, we should first understand what tags and tagging mean, what kinds of tags exist and how internet users can exploit them.
There is no official definition of tags and tagging, but several characteristics apply to all tags. Tags are user-contributed (user-generated) descriptive strings, such as labels and keywords, that describe a piece of content. Those strings should be relevant to the piece of content and easily associated with it. By content we mean URLs, web pages, texts, images, videos, geographic maps, blog entries etc. Tags are not the same as keyword annotations. The difference is that tags are flat, disorganized, free-form strings made by users, whereas keyword annotations are usually part of a predefined vocabulary provided by authors, web systems (web sites, web directories, web platforms etc.) or librarians.
The fact that tags are made by humans according to their own understanding of the content is both an advantage and a disadvantage of tagging systems. It is an advantage in the sense that the user knows and understands the meaning of the content (data), and by adding tags he can more easily remember, retrieve, recognize, save, browse and search for content. The major disadvantage is that the same content can be tagged differently by different people. For example, images on Flickr could be tagged according to the place where they were taken (geo-tags) or according to their content. If we have an image of a mountain, we can tag it with: “winter” (time of year when the image was taken), “Zlatibor” (place), “skiing” (activity shown in the image). But the same image could also be tagged with: “January” (winter month), “Obudojevica” (skiing resort on Zlatibor), “skiing”. Another problem of tagging systems is that the “system” does not understand the meaning of the tags. For example, the tag “java” can describe a programming language, coffee or an island; the tag “apple” can apply to both a computer company (Apple Inc.) and a fruit. In the case of individual tagging on a personal computer, those problems are not crucial, but in the case of collaborative / sharing tagging systems (like: delicious.com, flickr.com, digg.com) they are critical.
In social bookmarking web sites (collaborative tagging communities), users can share tags with one another, retrieve tagged content online, and search, browse and filter tags. Examples of such communities are: Delicious, Flickr, Digg etc. We can distinguish social bookmarking communities according to the type of content they are used to tag:
- Tagging for URL (for example: del.icio.us, stumbleupon.com)
- Tagging for photos (for example: flickr.com)
- Tagging for videos (for example: youtube.com)
- Tagging for news (digg.com, reddit.com, netscape.com)
- Tagging for books (librarything.com, openlibrary.org)
- Tagging for academic articles (citeulike.com)
- Tagging for retail products (amazon.com)
All of these collaborative tagging systems share the problems described above. Some of them try to resolve those problems by using machine tags.
The idea of machine tags follows the basic idea of the semantic web: to give a meaning to every tag, so that it can be understood and interpreted by machines. Machine tags keep the characteristics of “ordinary” tags, but also provide a variety of new possibilities.
Machine tags are an extension of “ordinary” tags: they are made by humans according to their understanding of the content and they are descriptive, but they are written in a specific format so that a machine can read them, understand them and perform a specific action accordingly. They add extra semantic information about the tag and, indirectly, about the content. Machine tags are semi-automated (they must be added by humans, and then the machine can perform an action) and they can be understood as a link between tags and keyword annotations (at the moment, machine tags are provided by the collaborative system as part of its API; users can add their own, but those will be parsed as regular, flat tags). The table below lists the characteristics of “ordinary” tags, machine tags and keyword annotations.
|Tags||Machine tags||Keyword annotations|
Table 1: Characteristics of the tags, machine tags and keyword annotations
Machine tags are also known as triple tags. The first examples of machine tags are geo-tags (specific identifiers of a geographical location) provided by Delicious and GeoBloggers.
The structure of “normal” tags, as defined above, is flat. They are usually single words, free-form strings given by users. Machine tags, by contrast, have a simple and well-defined structure. Note that the structure of machine tags can vary from one collaborative tagging system to another, since there is no standard for machine tags yet. In this work we focus on the structure of machine tags on Flickr.
Machine tags comprise three parts: a namespace, a predicate and a value.
Namespaces are used to distinguish between multiple meanings of the same term. Namespaces should be used when site-specific information is being encoded. The namespace defines a class or a facet that a tag belongs to ('geo', 'flickr', etc.). It describes “who is going to take care of the tag”.
Predicates describe the type of value being defined. The predicate is the name of a property within a namespace ('latitude', 'user', etc.) and describes “what the tag applies to”. The value is the specific value of the tag (“which one is this”). Not all machine tags need to have a value. The example Flickr uses to explain machine tags is the “Upcoming.org” example. The starting assumption is that we have images from an event that has been listed on upcoming.org. On Flickr, we can add the machine tag “upcoming:event=428084”, where “upcoming” is the namespace, “event” is the predicate and “428084” is the value. When this machine tag is added to a photo, the photo is automatically shown on the upcoming.org “events” page. How? When the machine tag is added, a robot squirrel on the Flickr servers calls the robot squirrels running on the Upcoming API and asks for the Upcoming event with ID 428084. The Upcoming robots answer: “event 428084 is named Flickr Turns 4”, and Flickr stores that name in its database. The next time a user loads that photo, an Upcoming icon with the name of the event is shown in the sidebar.
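The lookup flow described above can be sketched in Python as a registry that maps each namespace to the handler that "takes care of" it. This is only an illustration under stated assumptions: the names `HANDLERS`, `resolve_upcoming` and `resolve` are hypothetical, the event name is a placeholder, and a local dictionary stands in for the remote API call, which is not reproduced here.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-in for the remote event lookup. A real system would
# call the external service here instead of reading a local dictionary.
FAKE_EVENT_NAMES = {"428084": "Example event name"}


def resolve_upcoming(predicate: str, value: str) -> Optional[str]:
    """Return the stored event name for an 'upcoming:event=<id>' tag."""
    if predicate == "event":
        return FAKE_EVENT_NAMES.get(value)
    return None


# Registry: namespace -> handler responsible for tags in that namespace.
HANDLERS: Dict[str, Callable[[str, str], Optional[str]]] = {
    "upcoming": resolve_upcoming,
}


def resolve(tag: str) -> Optional[str]:
    """Dispatch a tag to the handler registered for its namespace."""
    head, _, value = tag.partition("=")
    namespace, _, predicate = head.partition(":")
    handler = HANDLERS.get(namespace.lower())
    if handler is None:
        return None  # unknown namespace: behaves like a flat tag
    return handler(predicate.lower(), value)
```

Calling `resolve("upcoming:event=428084")` returns the placeholder name, while a flat tag such as `"skiing"` or a tag in an unregistered namespace returns `None`.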
Another example is recording location information by entering latitude and longitude as geo:lat=12.345678 and geo:lon=12.345678.
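The three-part structure can be illustrated with a small Python sketch that splits a tag string into namespace, predicate and value. The names `MachineTag` and `split_machine_tag` are hypothetical, and the choice to represent a missing value as `None` is an assumption made for this example.

```python
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    namespace: str
    predicate: str
    value: Optional[str]  # None when the tag carries no value


def split_machine_tag(tag: str) -> Optional[MachineTag]:
    """Split 'namespace:predicate=value'; return None for a flat tag."""
    head, eq, value = tag.partition("=")
    namespace, colon, predicate = head.partition(":")
    if not (colon and namespace and predicate):
        return None  # no 'namespace:predicate' part, so this is a flat tag
    # Namespace and predicate are treated as case-insensitive.
    return MachineTag(namespace.lower(), predicate.lower(),
                      value if eq else None)
```

For example, `split_machine_tag("geo:lat=12.345678")` yields the namespace `geo`, the predicate `lat` and the value `12.345678`, whereas a flat tag such as `"skiing"` yields `None`.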
Simple syntax rules of the machine tags are given below:
1. A "namespace": Namespaces MUST begin with any character between a - z; remaining characters MAY be a - z, 0 - 9 and underscores. Namespaces are case-insensitive.
2. A "predicate": Predicates MUST begin with any character between a - z; remaining characters MAY be a - z, 0 - 9 and underscores. Predicates are case-insensitive.
3. A "value": Values MAY contain any characters that "plain vanilla" tags use. Values may also contain spaces but, like regular tags, they need to be wrapped in quotes.
4. Namespaces and predicates are separated by a colon: ":"
5. Predicates and values are separated by an equals symbol: "="
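The syntax rules above can be approximated with a regular expression. The following Python sketch is one possible reading of the rules, not an official grammar: the names `MACHINE_TAG_RE` and `parse_machine_tag` are hypothetical, and it assumes that the quotes, when present, wrap the value rather than the whole tag.

```python
import re
from typing import Optional, Tuple

# Rules 1, 2, 4 and 5: a letter first, then letters, digits or underscores,
# with ':' and '=' as separators. IGNORECASE covers case-insensitivity and
# ASCII keeps the letter class restricted to a-z.
MACHINE_TAG_RE = re.compile(
    r"^(?P<namespace>[a-z][a-z0-9_]*)"
    r":(?P<predicate>[a-z][a-z0-9_]*)"
    r"=(?P<value>.*)$",
    re.IGNORECASE | re.ASCII | re.DOTALL,
)


def parse_machine_tag(raw: str) -> Optional[Tuple[str, str, str]]:
    """Return (namespace, predicate, value), or None if the rules fail."""
    match = MACHINE_TAG_RE.match(raw.strip())
    if match is None:
        return None
    value = match.group("value")
    # Rule 3 (assumed reading): strip one pair of wrapping double quotes.
    if len(value) >= 2 and value[0] == value[-1] == '"':
        value = value[1:-1]
    # Normalise the case-insensitive parts to lower case.
    return (match.group("namespace").lower(),
            match.group("predicate").lower(),
            value)
```

A tag such as `Geo:LAT=12.345678` is accepted and normalised to lower-case namespace and predicate, while `9geo:lat=1` is rejected because the namespace does not start with a letter.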