I have seen many waves of technology drive innovation and open new opportunities in the 30 years I’ve spent building software startups – as both an entrepreneur and an investor. GUI. SQL Databases. Client/Server. The Internet. Mobile. It’s an incredible feeling to encounter game-changing technologies in their earliest days – and to have that “Eureka moment” when you realize the shifts that are going to ensue.
Artificial intelligence and machine learning (AI/ML) is one of those game-changing, foundational technologies. AI/ML has been a focus of research for decades, but over the last several years it has greatly expanded its range of addressable uses thanks to advances in algorithms, compute power, and training data. It is already enabling fascinating new applications and companies throughout the tech ecosystem – industry giants like Apple, Google, Baidu, Amazon, Microsoft, and Facebook are weaving AI/ML capabilities into many of their solutions. So are a very high percentage of the startups that we see in the venture business – from enterprise SaaS to human capital management to robotics to pharma development. This technology will enable massive change in how we think about building systems and software. While much of the current fervor around the potential of AI is justified by the real-world solutions already being delivered, we are still in the early phases of realizing that potential.
I first got excited about AI as an undergrad at Stanford in the 1980s. The first course I took was with Prof. Ed Feigenbaum, “Fundamentals of AI.” One of the specific systems we worked with in class was MYCIN, which was used for medical diagnosis. For the first time, I was introduced in a hands-on sense (actual code, not a sci-fi thriller) to working AI – a piece of software that could do knowledge work at superhuman levels. It completely fascinated me – and still does. And, as a kid who loved programming (but was a little short on experience and perspective), it seemed this technology and cool AI startups like Intellicorp and Teknowledge were about to explode. Or so I thought. Little did I know that we were in the middle of an AI summer. And winter was coming.
While the AI solutions at the time provided real utility in tasks ranging from medical diagnosis to complex IT systems configuration, they were very “narrow” in their capabilities. The state-of-the-art AI technique at the time was something called Expert Systems, like MYCIN. An ES was basically a set of facts coupled with a set of rules and an inference engine to interpret how the two related (e.g., when given some data about a patient’s blood chemistry, analyze it via the known facts and rules and then determine what infectious disease the patient has). MYCIN worked, and was as good as human doctors at diagnosing infectious blood diseases, but there was no “leverage” – the system had to be entirely rebuilt for each new topic area. People realized that coding entire knowledge bases into rules and facts was really difficult to do at scale and had some critical limitations – in particular, ES were really terrible at handling uncertainty. You could get great results, but only in narrow domains where the underlying knowledge base was static and the rules were deterministic in nature.
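The facts-plus-rules-plus-inference-engine pattern can be sketched in a few lines. This is a minimal forward-chaining toy in the spirit of an Expert System, not MYCIN itself – the rules and facts below are invented placeholders, not real medical knowledge:

```python
# A minimal forward-chaining inference engine: keep applying rules
# whose conditions are satisfied by the known facts, adding each
# rule's conclusion as a new fact, until nothing changes.
def infer(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in facts and conditions <= facts:
                facts.add(conclusion)
                changed = True
    return facts

# Hypothetical rules: (set of required facts, conclusion).
rules = [
    ({"gram_negative", "rod_shaped"}, "suspect_enterobacteriaceae"),
    ({"suspect_enterobacteriaceae", "hospital_acquired"}, "suspect_e_coli"),
]

result = infer({"gram_negative", "rod_shaped", "hospital_acquired"}, rules)
print("suspect_e_coli" in result)  # True
```

Note the limitation described above: every conclusion the system can reach must be hand-coded as a rule in advance, and each rule fires deterministically – there is no notion of "probably."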
The limitations of the rules-based or “cognitive” approach to building an AI ultimately led researchers to explore new approaches with roots in statistics and probability, such as “fuzzy logic” or Bayesian methods. As tools and techniques evolved, sophisticated modelers could build hand-crafted algorithms that dealt more effectively with uncertainty and probability, but these were still incredibly difficult to scale. Neural Networks (NN) were around at the time, but were still largely limited to research tools and simple demonstrations. Model complexity, training techniques, and a lack of training data and compute power were all crippling limitations in the early days of Neural Networks. Those barriers were overcome one by one, and in today’s world Deep Learning models are capable of results that are not only astoundingly accurate but also much, much easier to develop, thanks to more powerful tools, vastly improved compute power, and massive aggregations of training data. This “statistical” branch of AI/ML is where much of the progress has been made in the last decade, and it has been incredible to witness.
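The contrast with deterministic rules is easy to see in a tiny Bayesian example: instead of a rule that fires or doesn't, the model revises a probability as evidence arrives. The numbers here are made up purely for illustration:

```python
# Bayes' theorem for a diagnostic test:
# P(disease | positive) = P(pos | disease) * P(disease) / P(pos)
def bayes_update(prior, sensitivity, false_positive_rate):
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence

# A rare condition (1% prior) with a good-but-imperfect test:
# even a positive result only raises the probability to ~16%.
posterior = bayes_update(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(round(posterior, 3))  # 0.161
```

Uncertainty is handled gracefully – a positive test shifts belief rather than asserting a conclusion – which is exactly what deterministic Expert System rules could not do.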
However, the fact is that the current state-of-the-art techniques, such as Deep Learning, have the same fundamental limitation as early AI/ML efforts: to get great results from a given model, it is necessary to constrain the problem domain and to train the model using facts or training data that are known in advance. For certain classes of problems, developers have massive data sets for training models, and they work really well. However, for very large problem domains with a near-infinite number of inputs, it can be nearly impossible to provide sufficient training examples covering every “long tail” instance. This is why, for example, achieving Level 4-5 capability in self-driving cars is such a difficult challenge. Fundamentally, whether building supervised learning models or using rules-based inference, both approaches are hampered by the need to anticipate – via codified rules or training data – everything the system will run into. This leads to the current situation: we have highly effective AIs, and the range of problems they can solve has increased dramatically, but they still have to focus narrowly.
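The long-tail problem can be demonstrated with even the simplest of supervised models. Here a toy 1-nearest-neighbor classifier (the data points are invented) answers confidently on inputs like its training set, and just as confidently on an input far outside anything it has ever seen:

```python
# A 1-nearest-neighbor "model": label any input with the label of
# the closest training example. It has no notion of "I don't know."
def nearest_label(x, training):
    return min(training, key=lambda pt: abs(pt[0] - x))[1]

# Training set covers only a narrow range of inputs.
training = [(1.0, "cat"), (1.2, "cat"), (5.0, "dog"), (5.3, "dog")]

print(nearest_label(1.1, training))    # cat -- in-distribution, sensible
print(nearest_label(100.0, training))  # dog -- far outside anything seen,
                                       # yet answered with full confidence
```

This is the preview problem in miniature: the model is only as broad as the examples (or rules) someone provided in advance.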
There is no doubt that these current AI solutions have definitively cleared the hurdle of sufficient functionality and are delivering real utility and economic value. They are not going away and are proliferating rapidly. I do not believe we are going to see another “AI winter” of the type that arrived in the 80s. While true Artificial General Intelligence (AGI) is still very far away, if or when it arrives is an academic argument. It doesn’t change the fact that the use of Narrow AI to solve specific problems has grown massively since the days of MYCIN, and that growth will continue.
At Canaan, we expect to see the scope and capability of AI systems expand rapidly, driven by two main themes. First, we are very excited by the work specifically focused on new techniques for unsupervised learning, and on tools for generating training data for models (GANs, for example). These will expand the scope of, and the ease with which, models can be built. Research into areas like transfer learning will also continue to lower the barriers to creating solutions for “adjacent” problems. The ecosystem of Narrow AIs is going to continue to explode as these techniques for circumventing the training-data limitations reach maturity. Second, we believe that new frameworks will eventually emerge to combine separate narrow AIs into larger-scale ensemble models in some sort of hybrid form – using statistical models (like Neural Nets) for certain tasks, and more cognitive, rules-based higher-level logic for others. As many more of these narrow AIs come into use, the need will grow for frameworks that combine them into ever larger, broader, more capable systems. In future posts we’ll be talking about some of the most interesting technical developments, examples of real solutions in use today, as well as current investment activity. We’re looking forward to a long summer!
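P.S. – The hybrid idea above can be sketched in miniature. In this toy, two stand-in functions play the role of narrow statistical models, and a hand-written rule layer provides the higher-level logic that combines them. The models, labels, and thresholds are all hypothetical; no real framework is implied:

```python
# Stand-in for a narrow vision model: returns (label, confidence).
def image_model(frame):
    return ("pedestrian", 0.92) if "person" in frame else ("clear", 0.97)

# Stand-in for a narrow audio model: returns (label, confidence).
def audio_model(clip):
    return ("siren", 0.88) if "siren" in clip else ("quiet", 0.90)

# Cognitive-style rule layer combining the narrow models' outputs.
def decide(frame, clip):
    vision, v_conf = image_model(frame)
    sound, s_conf = audio_model(clip)
    if vision == "pedestrian" and v_conf > 0.8:
        return "brake"
    if sound == "siren" and s_conf > 0.8:
        return "pull_over"
    return "proceed"

print(decide("person crossing", "street noise"))  # brake
print(decide("empty road", "siren approaching"))  # pull_over
```

Each narrow model stays narrow; the ensemble's breadth comes from the rules that arbitrate between them – the hybrid form described above.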