mala::home Davide “+mala” Eynard’s website


Perl Hacks: Googleicious!

At last, here's my little googleicious, my first attempt at merging Google results with del.icio.us tags. I wrote it some months ago to see if I could automatically retrieve information from del.icio.us. Well... it's possible :)

What does this script do? It takes a word (or a collection of words) from the command line and searches for it on Google; then it takes the search results, extracts the URLs from them and looks those URLs up on del.icio.us, showing the tags that have been used to classify them. Even though the script is quite simple, there are many details you can infer from its output:

  • you can see how many of the top Google results are actually interesting to people
  • you can use tags to give meaning to some obscure words (try for instance ESWC and you'll see it's a conference about the semantic web, or screener and you'll learn it's a term not used only by movie rippers)
  • starting from the most used tags returned by del.icio.us, you can search for similar URLs which haven't been returned by Google
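
The flow described above (take a results page, pull out the URLs, look up the tags for each one, count the most used tags) can be sketched roughly as follows. This is only an illustration in Python, not the actual Perl script: the sample HTML is made up, and `lookup_tags` is a hypothetical function you would have to supply yourself to fetch the tags for a URL from the bookmarking service.

```python
from collections import Counter
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect absolute http(s) links found in <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Keep only absolute links, and skip duplicates while preserving order.
        if href and href.startswith(("http://", "https://")) and href not in self.urls:
            self.urls.append(href)


def extract_urls(html):
    """Return the absolute URLs linked from a page of HTML."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.urls


def tag_results(html, lookup_tags):
    """Map each result URL to its tags and count how often each tag occurs.

    lookup_tags is a hypothetical callable (URL -> list of tags); how it
    talks to the bookmarking service is deliberately left out of this sketch.
    """
    per_url = {}
    totals = Counter()
    for url in extract_urls(html):
        tags = lookup_tags(url)
        per_url[url] = tags
        totals.update(tags)
    return per_url, totals


# Made-up data standing in for a results page and for the tag service.
sample_html = (
    '<a href="http://conf.example/eswc">ESWC</a>'
    '<a href="/preferences">Settings</a>'
    '<a href="http://wiki.example/eswc">ESWC wiki</a>'
)
sample_tags = {
    "http://conf.example/eswc": ["semanticweb", "conference"],
    "http://wiki.example/eswc": ["semanticweb"],
}

per_url, totals = tag_results(sample_html, lambda url: sample_tags.get(url, []))
for url, tags in per_url.items():
    print(url, "->", ", ".join(tags))
print("most used tags:", totals.most_common(3))
```

The most used tags in `totals` are the natural starting point for the third idea in the list: searching by tag for similar URLs that Google did not return.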

Now some notes:

  • I believe this is very close to a limit that both Google and del.icio.us don't want you to cross, so please behave politely and avoid bombarding them with requests
  • For the very same reason, this version is quite limited (it only takes the first page of results); it also lacks some controls I added in later versions, such as a sleep time to avoid request bursts. So, again, be polite please ^__^
  • I know that probably a zillion people have already used the term googleicious, and I don't want to be considered its inventor. I only wrote the script, so if you don't like the name just call it foobar or whatever you like: it will run anyway.
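
As for the sleep time mentioned in the notes above, here is a minimal sketch of the idea in Python (again, not the code of the later versions): wait a fixed delay between one request and the next, so the lookups never arrive in a burst. The name `polite_map` and the 2-second default are assumptions made up for this example.

```python
import time


def polite_map(func, items, delay=2.0, sleep=time.sleep):
    """Call func on each item, pausing `delay` seconds between calls.

    The sleep function is injectable so the pause can be replaced
    (for instance in tests) without actually waiting.
    """
    results = []
    for index, item in enumerate(items):
        if index > 0:
            # No pause before the first call, one pause before each later call.
            sleep(delay)
        results.append(func(item))
    return results


# Example with a stand-in for the real per-URL lookup and a short delay.
lengths = polite_map(len, ["http://a.example/", "http://b.example/"], delay=0.1)
print(lengths)
```

A fixed delay is the simplest option; it keeps the request rate bounded without needing to know anything about the limits of the remote service.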

Ah, yes, the code is here!
