Speak of the devil… build your own OpenCalais-like supermachine

In line with my recent blog mentioning OpenCalais, the topic extraction tool, DBpedia, one of the awesome linked open data projects I’ve been using a bunch for Alive.cn, just released their own topic extraction tool, DBpedia Spotlight. If you are okay with downloading 9GB of Lucene indices and setting up their scripts, you can have your own self-hosted topic extraction tool. They basically open sourced something that is worth a lot of money in a previously relatively closed space.

What is topic extraction? Check this demo out and enter any block of text — say, a recent news article. The benefit of using DBpedia’s solution (besides it being free) is that it automatically ties topics back to their DBpedia topics which already have a huge storehouse of Wikipedia-derived linked open data.
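To make the idea concrete, here is a toy sketch of the "spot a phrase, link it to a DBpedia resource" step. It is not how DBpedia Spotlight or OpenCalais work internally (they use large indices and disambiguation); the tiny `GAZETTEER` dictionary, the `extract_topics` function, and the `max_words` parameter are all made up for illustration.

```python
import re

# Hypothetical mini-gazetteer: lowercase surface form -> DBpedia resource URI.
# A real tool derives millions of such entries from Wikipedia.
GAZETTEER = {
    "ibm": "http://dbpedia.org/resource/IBM",
    "jeopardy": "http://dbpedia.org/resource/Jeopardy!",
    "wikipedia": "http://dbpedia.org/resource/Wikipedia",
    "hong kong": "http://dbpedia.org/resource/Hong_Kong",
}


def extract_topics(text, gazetteer=GAZETTEER, max_words=3):
    """Return (surface form, URI) pairs found in text, in order of appearance.

    At each position the longest matching phrase (up to max_words words) wins.
    """
    tokens = re.findall(r"\w+", text.lower())
    found = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in gazetteer:
                found.append((candidate, gazetteer[candidate]))
                i += n
                break
        else:
            # No phrase starting at this token matched; move on.
            i += 1
    return found


for surface, uri in extract_topics("IBM's Watson played Jeopardy in Hong Kong."):
    print(surface, "->", uri)
```

A plain dictionary lookup like this cannot tell apart two things that share a name, which is exactly the hard part the real tools solve.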

The Man versus Machine Jeopardy Challenge

Jeopardy Feb. 14 2011 – Human vs Machine IBM Challenge Day 1 Part 1/2

It’s probably the most public test of the advances in linked open structured data and semantic text analysis, so I’m closely following this tournament pitting IBM’s supercomputer Watson against the two most successful Jeopardy champions. I suspect that they’re using the same publicly available data sets that we’re using for constructing Alive.cn.

I wonder, however, why they chose to rely only on electronically fed questions rather than going the final mile and adding a voice recognition interface on top of the system. Voice recognition accuracy has gotten so good these days, but I wonder if the last few percentage points of error would make a critical difference against the best human players.

There have been some other truly AMAZING projects in this field. Two I’d like to highlight:

  • Google Squared: This Google Labs experiment is an amazing mash-up of topic extraction and turning unstructured web data into structured data. Simply type in any category (example: “Chinese Emperors”) and it will bring up a spreadsheet of items in that category along with some of their properties. Next, you can add your own properties (“Inventions”) and it will automatically fill in the results using data searched from the web and converted back into structured data. It’s truly one of the most remarkable things to come out of Google, and with a bit more work on it (say, a voice recognition interface) it could be a mainstream breakthrough.
  • OpenCalais Topic Extraction: Another semantic analysis tool that will pull out “topics” automatically and link them against linked open data. Try out the free demo and copy-and-paste a news article. After submitting the article, you’ll see it has linked together topics on the side automatically.

Like I’ve mentioned before, I feel that we’re right at a tipping point: over the next several years, advances in knowledge extraction and interpolation will have a revolutionary effect on everything, including how we interact with computing, and will bring exponential advances in data forecasting. Projects like Wikipedia (an unstructured data source) are just the beginning.

P.S. My favorite comment about the Man versus Machine Jeopardy contest: “Why couldn’t they have programmed Watson to use the voice of Sean Connery?”

Don’t use a hammer to do a screwdriver’s job

Don’t use a hammer to do a screwdriver’s job. While their intentions are honorable, lawmakers shouldn’t be trying to compel Facebook to change its real-name rules when more anonymous, popular alternatives are available.

Facebook aims to build a platform that, as crazy as it sounds, is bigger than just a forum for political activists, and the pressures from other real-world concerns like fraud, spam, and ensuring trusted, interpersonal communication might outweigh the benefits of anonymous posting.

From the New York Times…

Facebook Officials Keep Quiet on Its Role in Revolts

Beijing mongoDB conference: I’ll be showing a shiny new Alive.cn

mongoDB

mongoDB is one of the hot new NoSQL databases that have recently come out and is the database platform for the new Alive.cn, the multilingual entertainment database that I’ve been constructing. I’ve been a MySQL user ever since we started Rotten Tomatoes over ten years ago, so I’m still relatively new to mongoDB, but I really like the philosophy of simplicity and flexibility for things like dynamic and lazy schemas, auto-sharding, on-the-fly indexes, etc. I’m dealing with a wide variety of complex data schemas across very large datasets in this new project, so it’s nice not to have to waste time stuffing everything into a “one-size-fits-all” design.

In any case, the nice folks at 10gen, the company that develops mongoDB, will be conducting a free developers conference in Beijing on Thursday, March 3, and I will be delivering one of the presentations. I hope to prepare something that shows the power and flexibility of using mongoDB with various linked open data sources (or combining this data with social media data sources like Facebook, Twitter, and Sina Weibo) or something along those lines. I’ll deliver my talk in English, but hope to have Chinese slides as well and, of course, you can come up and chat with me in Chinese.

mongoDB is increasingly being used by many notable social companies overseas like foursquare, Disqus (which I use on my own site), and Eventbrite. If you’re interested in learning about this alternative to MySQL, check out more details.

Alive Not Dead (2007-2013)

Note: This post is part of an extended auto-biography which is collected in my About page.

As our first China company, Xiaban.com, transitioned to becoming the local BBS web site, XMFish.com, my business partner Patrick Lee and I decided that we would pursue new opportunities that would allow us to return to my original passion of film and entertainment and to move to Hong Kong. We had witnessed how the social network Myspace had grown by leaps and bounds, far faster than our former acquirer IGN Entertainment, despite being acquired at the same time, for around the same amount of money, and by the same owner, News Corporation. As a consequence, we partnered with the members of the band Alive to create a new online community of artists, alivenotdead.com.

Patrick had been the primary investor and executive producer for the directorial debut of popular Hong Kong-based actor Daniel Wu (吴彦祖), The Heavenly Kings (四大天王). During college, Daniel was the co-founder of the University of Oregon Wushu Team and frequently came down to Berkeley, near his original hometown, to practice with us and the Cal Wushu Team. Daniel and another Cal classmate of ours, Terence Yin (尹子维), were now successful actors in Hong Kong and presented Patrick with the idea of creating a boy band similar to F4 or the Backstreet Boys, made up of popular Hong Kong heartthrob actors. In reality, the boy band, named “Alive” and also including actors Andrew Lin (连凯) and Conroy Chan (陈子聪), was a cover for a mockumentary that they were filming that would expose some of the hypocrisies and urgent issues in the Asian entertainment industry. For a period of a year and a half, Alive recorded and released several songs and even went out on a concert tour throughout Asia in the guise of a boy band when, in reality, they were documenting the process for their film. When finally released during the Hong Kong International Film Festival in April 2005, the film and the fake band’s secret mission landed as a media bombshell (The Standard (HK), San Francisco Chronicle), and the film eventually went on to earn Daniel the award for Best New Director at the Hong Kong Film Awards.

Alivenotdead.com was the original web site for the Alive band and, eventually, The Heavenly Kings movie. It was created by the Alive boys as a place for fans to read their updates as well as connect with other fans on the site’s message boards. It also hosted fan boards for several of the independent Hong Kong bands that were featured in the movie and had accumulated an impressive 30,000+ registered members. As the promotion for the film was coming to an end, the Alive boys presented Patrick with the idea of converting the web site, and eventually we came across the idea of building an online community similar to Myspace that would allow artists to connect with their fans. Patrick and I were primarily interested in returning to something entertainment-themed as this was my original passion; additionally, we wanted to pursue a model that could grow exponentially as Myspace had, but do it in Asia. Daniel and Terence sought to build a community that could support, and be largely run by, artists including filmmakers, musicians, and others.

As a consequence, we worked through early 2007 to launch a new alivenotdead.com in April 2007 with seven initial “official artists”: the Alive band, Daniel Wu (吴彦祖), Andrew Lin (连凯), Conroy Chan (陈子聪), Terence Yin (尹子维), world-famous Chinese action star Jet Li (李连杰), and Chinese-American actress Kelly Hu (胡凯莉). Jet and Kelly came on-board as initial artists on the site since we had been doing their official web sites for numerous years already extending back to our Design Reactor days.

The official artist membership rapidly expanded from the initial seven artists to its current roster of around 1,600 artists (as of January 2011), with primary coverage in Hong Kong, Singapore, mainland China, Taiwan, and Japan, and among Asian-Americans in the United States. Artists can publish and share blogs, photo albums, and events, and maintain their own fan forums. For a while, we experimented with artist stores that allowed artists to sell merchandise directly from their profiles. Fans can also register, create their own blogs, photo albums, etc., and connect with their favorite artists; as of January 2011 we have over 600,000 registered members.

A lot of the work we’ve done recently on Alive Not Dead has been towards connecting artists with each other as well as with advertising brands as a way to generate revenue. With the financial crisis in 2008, we pivoted to expand our efforts on working with artists and advertisers on offline events in conjunction with online advertising. At the current time, we work with many top brands (e.g. Adidas, Nokia, Esprit, Diesel) to create online marketing campaigns that draw attention to artist concerts, art exhibitions, etc. which employ Alive Not Dead artists. We also host the most popular and fun annual, costumes-mandatory Halloween party (“Dead Not Alive” Halloween 2010, 2009 (another link), and 2008) in Asia 🙂 .

Working closely with artists, we’ve also expanded our alivenotdead.com platform to help some high profile Asian artists power their official web sites. We power the official web sites for Jet Li 李连杰 (JetLi.com), Jackie Chan 成龙 (JackieChan.com), and Karen Mok 莫文蔚 (KarenMok.com).

In October 2009, I decided to move from Hong Kong to Beijing in order to accelerate our expansion in mainland China. I personally wanted to return to mainland China where I had moved originally when I first came to Asia, and especially to Beijing which is the epicenter of the unique and tremendous internet industry in China. Additionally, Alive Not Dead had recently landed a partnership with web portal, Tom.com, that would allow us to begin hosting and promoting the alivenotdead.com community within mainland China with the help of a local partner. Since then, I’ve been working to reach out to other internet entrepreneurs and engineers, improve my Mandarin Chinese, and grow an online destination for a local Chinese audience.

Update: After I departed Alive Not Dead in April 2013, the company was acquired by the Southeast Asian social networking company Migme in early 2014. Alive Not Dead continues to grow under Migme’s stewardship.

Xiaban.com (2005-2006)

Note: This post is part of an extended auto-biography which is collected in my About page.

After leaving my role as head of the recently acquired Rotten Tomatoes and a VP at the even more recently acquired IGN Entertainment, I rejoined my frequent business partner Patrick Lee in the Chinese coastal city of Xiamen, Fujian province, where he had teamed up with his original business partner from his first company, Jimmy Zhuang (庄振宁). Jimmy, a college classmate of ours, was born and raised in Xiamen before moving to California for high school and, eventually, university at Cal.

Our initial web site in China, Xiaban.com (下班网), was at first a customer loyalty platform for merchants whereby customers could swipe a loyalty card at hundreds of different participating stores and receive points which could be redeemed for prizes and discounts. Merchants could sign up to receive powerful, aggregated data about their customers including demographic data, spending statistics, and comparison data with their competitors. Furthermore, we provided a way for merchants to target SMS-based ads to their customers: every time the card was swiped, the customer would receive an SMS confirming their points along with an advertising area for merchants that could be targeted by neighborhood, customer demographic, or store category. We rolled out this powerful platform across nearly a thousand stores throughout Xiamen with plans to expand nationwide.

When I came into the company as Chief Operating Officer (COO), I was additionally tasked with redoing Xiaban.com as a Yelp-like web site that would help us rapidly expand our brand throughout China. Like Yelp, our site allowed users to find the best places to eat and shop from a comprehensive, nationwide database of merchants and share their reviews and tips with other consumers and friends. We further tied in these member services with data accumulated by using the Xiaban loyalty card so members could check and redeem points and prizes online.

Unfortunately, the site’s traffic was leapfrogged by our rapidly growing competitor, Dianping.com, and at the end of 2006 we decided to pivot away from the capital-intensive loyalty card platform. Instead, we acquired XMFish.com (厦门小鱼社区), a rapidly growing local community web site in Xiamen. XMFish.com’s traffic was on a phenomenal growth path in the local Xiamen area and was already becoming the most important online destination in Xiamen.

As part of the new company, we grew XMFish to become the most trafficked website in the province and a vital and positive community in the Xiamen area. By building online ad sales on the site, we were able to grow both the web site and company stably.

At the current time, XMFish.com has expanded to include neighboring cities and has even begun offering our loyalty card again in partnership with local banks including ICBC. The site has become the primary online platform for local advertising and has been extended to include services like group buying and online shopping from local merchants with same-day delivery.

While I departed from my full-time position in December 2006, I continue to frequently return to Xiamen.

IGN Entertainment (2004-2005)

IGN Entertainment

Note: This post is part of an extended auto-biography which is collected in my About page.

After Rotten Tomatoes’ acquisition by IGN Entertainment, my business partner and Rotten Tomatoes CEO, Patrick, departed the company and I took over as head. For the remaining eighteen months at IGN/Rotten Tomatoes, I worked to further expand the Rotten Tomatoes traffic and brand. We developed the Certified Fresh seal as a way for movie studios to take advantage of positive film ratings on Rotten Tomatoes in their marketing. Furthermore, we did a tour of all of the marketing and online departments of the major studios to further cement our relationship with the industry we were covering. Finally, I worked closely with the IGN Entertainment team to integrate and expand our ad sales efforts with their better-developed, bicoastal ad sales force as well as merge our server platform into their hosting environment. Most importantly, though, I did my best to protect the Rotten Tomatoes brand and team and to hire additional team members who could continue to grow Rotten Tomatoes after my own departure.

At IGN, I was elevated to a Vice-President position and, as part of my corporate duties, I asked our CEO for the opportunity to explore international expansion in Asia. At the time, IGN itself was preparing to go public as the largest video games content web site with the highest concentration of young male visitors online. It also had some great products such as GameSpy, the early, popular game matchmaking software, and Direct2Drive, a video games version of iTunes. I wanted to explore how we might be able to partner or form joint ventures with companies to relicense and promote these properties in Asia. My former partner, Patrick, had already departed Rotten Tomatoes in order to start up another company (Xiaban.com) in mainland China and I similarly felt that the growth opportunities in China at the time were enticing. As a consequence, starting in early 2004 I began going to China frequently and reading up on the language, culture, and the Internet. Later on, as a part of IGN, I began visiting various gaming and internet companies and investors in preparation to create a representative office in Shanghai where IGN could begin partnering on expansion projects. The frequent visits helped cement my conviction that my next step after Rotten Tomatoes would be to build something in China, where the growth opportunities seemed as exciting as the Internet Boom that I had taken part in during the late 90’s.

In August 2005, in the same month as it acquired Myspace, News Corporation acquired IGN Entertainment for $650 million. While IGN’s acquisition was a really impressive feat by our CEO, I had only held a small amount of employee shares in the company since our Rotten Tomatoes acquisition was done in cash. Furthermore, my efforts to help IGN open its own Shanghai office were sidelined as all new efforts would be done in conjunction with the new parent company. I quickly made a decision that I wanted to return to entrepreneur life rather than working for a large company such as News Corp. and that I wanted to pursue my opportunities in mainland China. As a result, in December 2005, I left News Corp./IGN/Rotten Tomatoes and moved to join my former partner at his startup in China.