It’s been a radical summer building out Alive.cn. It’s been a while since I’ve been back to the nuts and bolts of startup life doing everything from programming to hiring to sales to taking out the trash. I’ve had my partners Patrick and Raffi living in my two bedroom apartment for weeks as we rapidly build out the new service together. Guess who gets stuck on the couch…
In addition, I just did a podcast chat with CRI radio host John Artman about my entrepreneurial experiences. His own podcast is just getting started so the talk is all about “startups”. The chat is about an hour long — not sure who in the world is interested in hearing me blabber for that long, but if you wait till the end, you WILL get to hear my billion dollar new startup ideas.
Starting a company is like having a baby: it’s amazing to look back to the point when the company was just an “idea” in your head and then see it grow as more and more people join.
As a founder, you’re like the Greek god Zeus pulling Athena from his head and bringing her from nothingness into the World.
I really like this recent interview with Michael Scott, the oft-forgotten FIRST CEO of Apple. Our perception of Apple is now of this gargantuan monster of a company that has always existed, but I can guarantee you that the most treasured memories of both Steves are likely from the first several years after Apple’s founding. These are the important moments when every day’s seemingly inconsequential decisions can have an unexpected impact on the legacy and culture of the company.
BI: What was the culture that developed at the company in the early days?
MS: Well, I guess the biggest part of the culture was that Holt made our coffee in the morning. He made the coffee to suit him, and it was so strong that it would keep us all up forever. That was subsequently a big fight that we had.
Ann Bowers…. who was…. I forgot the guys name, but she was the wife of one of the founders of Intel, she was our first VP of Personnel. This was a couple of years later. She was on this kick saying that we should not supply caffeine to the employees because it was unhealthy. And I just said, “No,” because we weren’t a committee and we didn’t need a vote on it.
I would say that the challenge was, who was more stubborn, Steve or me, and I think I won.
The other argument at the meetings was would Steve take his dirty feet and sandals off the table, because he sat at one end of the conference table, and Markkula sat at the other end chain smoking. So we had to have special filters in the attic in the ceiling to keep the room filter. I had the smokers on one side and the people with dirty feet on the other.
[Laughter from us.]
It was not funny then. Everybody has their pet peeves.
It’s difficult to imagine these kinds of long-lasting memories having impact when you’re in the middle of the “fog of war” during those first few years of development. Right now, the alivenotdead team is closing in on a transition point where we will pivot. It gives us a unique opportunity similar to starting a whole new company, with new team members and new experiences, and reading this interview really makes me excited about the prospect of building new lifelong memories to reflect upon.
I wanted to invite technically-minded Beijing folks again to a presentation that I’m doing on Thursday at the mongoDB conference. While I’m still relatively new to mongoDB, I’m taking the opportunity to give some insights on building a new multi-lingual, comprehensive entertainment database using linked open data. The presentation will go through an evolution starting with the early days of Rotten Tomatoes when we assembled the movie information manually to my current efforts with Alive.cn.
I’m still not certain whether I’m going to deliver my presentation in English or in Chinese. Obviously, I’m much more comfortable speaking English, but I would like to make sure that the audience is getting the message correctly. In any case, I’ve included both English and Chinese versions of the presentation below. I decided to go with a movie theme in the visuals throughout the presentation to keep things in line with my “entertainment database” topic.
Looks like some of the presentation fonts and layout didn’t get transferred too well with the upload to SlideShare, but you can get the general gist below:
Following up on my recent blog post mentioning the topic extraction tool OpenCalais: DBpedia, one of the awesome linked open data projects I’ve been using a bunch for Alive.cn, just released its own topic extraction tool, DBpedia Spotlight. If you are okay with downloading 9GB of Lucene indices and setting up their scripts, you can have your own self-hosted topic extraction tool. They basically open sourced something that is worth a lot of money in a previously relatively closed space.
What is topic extraction? Check this demo out and enter any block of text (say, a recent news article). The benefit of using DBpedia’s solution (besides it being free) is that it automatically ties the extracted topics back to their DBpedia entries, which already have a huge storehouse of Wikipedia-derived linked open data.
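To make the idea a bit more concrete, here is a toy sketch of topic extraction in Python. It is not how DBpedia Spotlight works internally (Spotlight relies on large indices and disambiguation); it only assumes a tiny hand-made lookup table, here called `GAZETTEER`, that maps phrases to DBpedia resource URIs. The table and the `extract_topics` function are made up for illustration.

```python
import re

# Hypothetical mini lookup table: phrase -> DBpedia resource URI.
# A real tool covers millions of entries and disambiguates between them.
GAZETTEER = {
    "IBM": "http://dbpedia.org/resource/IBM",
    "Sean Connery": "http://dbpedia.org/resource/Sean_Connery",
    "Rotten Tomatoes": "http://dbpedia.org/resource/Rotten_Tomatoes",
}


def extract_topics(text):
    """Return (surface form, URI) pairs for known phrases, in order of appearance."""
    found = []
    for phrase, uri in GAZETTEER.items():
        # Whole-word, case-insensitive match of the phrase in the text.
        match = re.search(r"\b" + re.escape(phrase) + r"\b", text, re.IGNORECASE)
        if match:
            found.append((match.start(), match.group(0), uri))
    found.sort()  # order by position in the text
    return [(surface, uri) for _, surface, uri in found]


if __name__ == "__main__":
    sample = "IBM built Watson, but it does not speak like Sean Connery."
    for surface, uri in extract_topics(sample):
        print(surface, "->", uri)
```

Each topic comes back with a URI rather than a bare string, which is what lets you pull in everything else the linked data set knows about that topic.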
This is probably the most public test of the advances in linked open structured data and semantic text analysis, so I’m closely following this tournament pitting IBM’s supercomputer Watson against the two most successful Jeopardy champions. I suspect that they’re using the same publicly available data sets that we’re using for constructing Alive.cn.
I wonder, however, why they chose to rely only on electronically fed questions rather than going the final mile and adding a voice recognition interface on top of the system. Voice recognition accuracy has gotten so good these days, but I wonder if the final few percent of mistakes would make a critical difference against the best human players.
There have been some other truly AMAZING projects in this field. Two I’d like to highlight:
Google Squared: This Google Labs experiment is an amazing mash-up of topic extraction and turning unstructured web data into structured data. Simply type in any category (example: “Chinese Emperors”) and it will bring up a spreadsheet of items in that category along with some properties. Next, you can add your own properties (“Inventions”) and it will automatically fill in the results using data searched from the web and converted back into structured data. It’s truly one of the most remarkable things to come out of Google, and with a bit more work on it (say, a voice recognition interface) it could be a mainstream breakthrough.
OpenCalais Topic Extraction: Another semantic analysis tool that will pull out “topics” automatically and link them against linked open data. Try out the free demo and copy-and-paste a news article. After submitting the article, you’ll see it has linked together topics on the side automatically.
As I’ve mentioned before, I feel that we’re right at a tipping point: over the next several years, advances in knowledge extraction and interpolation will have a revolutionary effect on everything, including how we interact with computers, and will bring exponential advances in data forecasting. Projects like Wikipedia (an unstructured data source) are just the beginning.
P.S. My favorite comment about the Man versus Machine Jeopardy contest: “Why couldn’t they have programmed Watson to use the voice of Sean Connery?”
mongoDB is one of the hot new NoSQL databases that have recently come out, and it is the database platform for Alive.cn, the new multilingual entertainment database that I’ve been constructing. I’ve been a MySQL user ever since we started Rotten Tomatoes over ten years ago, so I’m still relatively new to mongoDB, but I really like the philosophy of simplicity and flexibility for things like dynamic and lazy schemas, auto-sharding, on-the-fly indexes, etc. I’m dealing with a wide variety of complex data schemas across very large datasets in this new project, so it’s nice not to have to waste time stuffing everything into a “one-size fits all” design.
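Here is a rough sketch of what a dynamic schema means in practice. To keep it runnable without a database server, it uses a plain Python list of dicts as a stand-in for a mongoDB collection. The documents, field names, and the `find` helper are all made up for illustration; they are not the actual Alive.cn schema or the mongoDB API.

```python
# Stand-in for a collection: a list of dict "documents".
# Documents in the same collection do not need to share the same fields.
collection = [
    {"type": "movie", "title": {"en": "Example Film", "zh": "示例电影"}, "year": 2010},
    {"type": "person", "name": {"en": "Example Actor"}, "roles": ["actor", "singer"]},
    {"type": "movie", "title": {"en": "Another Film"}, "cast": ["Example Actor"]},
]


def find(docs, **criteria):
    """Return documents whose top-level fields equal every given criterion."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


# Query by a shared field, even though the matching documents differ in shape.
movies = find(collection, type="movie")
titles = [m["title"]["en"] for m in movies]

if __name__ == "__main__":
    print(titles)
```

In a relational design, each of those shapes would need its own table or a pile of nullable columns. Here a new kind of record, or a new language for a title, is just another key.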
In any case, the nice folks at 10gen, the company that develops mongoDB, will be conducting a free developers conference in Beijing on Thursday, March 3, and I will be delivering one of the presentations. I hope to prepare something that shows the power and flexibility of using mongoDB with various linked open data sources (or combining this data with social media data sources like Facebook, Twitter, and Sina Weibo) or something along those lines. I’ll deliver my talk in English, but hope to have Chinese slides as well and, of course, you can come up and chat with me in Chinese.
mongoDB is increasingly being used by many notable social companies overseas like foursquare, Disqus (which I use on my own site), and Eventbrite. If you’re interested in learning about this alternative to MySQL, check out more details.