{"id":125,"date":"2011-02-15T23:57:17","date_gmt":"2011-02-15T15:57:17","guid":{"rendered":"http:\/\/stephenwang.com\/b\/?p=125"},"modified":"2018-08-28T10:33:42","modified_gmt":"2018-08-28T02:33:42","slug":"speak-of-the-devil-build-your-own-opencalais-like-supermachine","status":"publish","type":"post","link":"https:\/\/stephenwang.com\/b\/speak-of-the-devil-build-your-own-opencalais-like-supermachine\/125\/","title":{"rendered":"Speak of the devil&#8230; build your own OpenCalais-like supermachine"},"content":{"rendered":"<p>Following up on my <a href=\"\/b\/the-man-versus-machine-jeopardy-challenge\/123\/\">recent blog post<\/a> mentioning <a href=\"http:\/\/http\/\/viewer.opencalais.com\/\" target=\"_blank\">OpenCalais<\/a>, the topic extraction tool: DBpedia, one of the awesome linked open data projects I&#8217;ve been using a bunch for <a href=\"\/b\/alive-cn-2010-now\/54\/\">Alive.cn<\/a>, just released their own topic extraction tool, <a href=\"http:\/\/wiki.dbpedia.org\/spotlight\/usersmanual?v=188\">DBpedia Spotlight<\/a>. If you are okay with downloading 9 GB of Lucene indices and setting up their scripts, you can have your own self-hosted topic extraction tool. They have basically open-sourced something that is worth a lot of money in what used to be a relatively closed space.<\/p>\n<p>What is <a href=\"http:\/\/en.wikipedia.org\/wiki\/Terminology_extraction\" target=\"_blank\">topic extraction<\/a>? Check out this demo and enter any block of text, such as a recent news article. 
The benefit of using DBpedia&#8217;s solution (besides it being free) is that it automatically ties the extracted topics back to their DBpedia entries, which already hold a huge storehouse of Wikipedia-derived linked open data.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In line with my recent blog mentioning OpenCalais, the topic extraction tool, DBpedia, one of the awesome linked open data projects I&#8217;ve been using a bunch for Alive.cn, just released their own topic extraction tool, DBpedia Spotlight.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[82,4],"tags":[109,105,110,107],"class_list":["post-125","post","type-post","status-publish","format-standard","hentry","category-alive-cn","category-internet","tag-dbpedia","tag-linked-open-data","tag-opencalais","tag-topic-extraction"],"_links":{"self":[{"href":"https:\/\/stephenwang.com\/b\/wp-json\/wp\/v2\/posts\/125","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/stephenwang.com\/b\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/stephenwang.com\/b\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/stephenwang.com\/b\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/stephenwang.com\/b\/wp-json\/wp\/v2\/comments?post=125"}],"version-history":[{"count":4,"href":"https:\/\/stephenwang.com\/b\/wp-json\/wp\/v2\/posts\/125\/revisions"}],"predecessor-version":[{"id":894,"href":"https:\/\/stephenwang.com\/b\/wp-json\/wp\/v2\/posts\/125\/revisions\/894"}],"wp:attachment":[{"href":"https:\/\/stephenwang.com\/b\/wp-json\/wp\/v2\/media?parent=125"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/stephenwang.com\/b\/wp-json\/wp\/v2\/categories?post=125"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/stephenwang.com\/b\/wp-json\/wp\/v2\/tags?post=125"}],"curies":[{"name":"wp
","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}