Search is hardly a new problem, but it’s one that’s getting harder to solve.
Just a few short years ago, “searching” meant typing something into a textbox. (Hello, Ask Jeeves, Lycos, and AltaVista!) The world is moving toward multimedia at a mind-boggling pace, encompassing voice, music, photos, videos, and much more. So why hasn’t “search” evolved beyond a basic textbox?
One area that is seeing a lot of interest is neural search, which at its core is a fundamentally new approach to retrieving information. Instead of the traditional method of giving a machine a set of rules to index, filter, sort, and rank items (known as “symbolic search”), neural search does the same work with pre-trained models. This means developers don’t have to write every little rule to handle every edge case (spelling issues, synonyms, abbreviations), saving them time and headaches, and the system trains itself to get better as it goes along.
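To make the contrast with hand-written rules concrete, here is a minimal sketch in Python. It assumes nothing about Jina's actual API: `embed`, `cosine`, and `search` are hypothetical names, and `embed` is a toy stand-in (character trigram counts) for the pre-trained model a real neural search system would use. The point is only the shape of the approach: turn the query and every item into vectors, then rank by similarity instead of by explicit rules.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a pre-trained encoder (an assumption for illustration).

    Maps text to a sparse vector of character-trigram counts. A real system
    would call a neural model here and get back a dense vector.
    """
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, documents: list[str], top_k: int = 3) -> list[tuple[str, float]]:
    """Rank documents by similarity to the query; no per-case rules involved."""
    query_vec = embed(query)
    scored = [(doc, cosine(query_vec, embed(doc))) for doc in documents]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical product catalog; the query contains a deliberate typo.
catalog = ["red running shoes", "blue denim jacket", "wireless headphones"]
results = search("runing shoes", catalog, top_k=2)
for doc, score in results:
    print(f"{score:.2f}  {doc}")
```

Even with the misspelled query, the running shoes rank first, and no spelling-correction rule was written to make that happen. Swapping the toy `embed` for a pre-trained text, image, or audio encoder is what would extend the same ranking loop beyond text.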
Neural search is also an area where Jina AI has quickly become a leader, with a clear vision of powering the future of search in a way that’s fast, scalable, and most importantly, works with any kind of data. We’re thrilled to lead the company’s Series A and partner with the team to bring this to fruition.
Founded only 18 months ago, Jina AI has seen massive open-source traction, already building a community of 1,000+ developers and becoming one of the fastest-growing projects on GitHub in 2021 with more than 1 million total downloads. The founding team – Han (CEO), Nan (CTO), and Bing (COO) – all hail from Tencent where they built the video recommendation search to compete with TikTok on a massive scale.
With Jina AI, they’re building an open-source, horizontal search framework that allows developers to build search applications for the unstructured data era. The core project (called Jina) is being built in the open on GitHub so that users can create cloud-native neural search applications in just a few hours. It’s designed to be accessible to an average developer (without a deep learning background, for example) while also having enough substance to enable best-in-class technologists to leverage the framework, too.
One of the most exciting facets of Jina is that it is data type-agnostic. That means it can include all kinds of data – and address the rapidly growing amount and types of unstructured data (without requiring manual tagging). Audio- and video-first companies have been the headliners this year (TikTok, Zoom, Clubhouse, Loom), but as the types of datasets companies look to unlock diversify (from product catalogs and source code to customer service logs and business documents), Jina AI can serve as the infrastructure layer for it all.
The early applications we’re seeing – from e-commerce and insurance adjudication to gaming asset search and healthcare – are remarkable. In retail, users can search for products in large catalogs by uploading a similar image. (This applies to everything from apparel to automotive parts.) In insurance, companies can search for similar images from previous claims to determine payouts for clients more accurately. In gaming, users may want to search across an expansive product or asset catalog, and similarly, game developers may want to do the same as they design games. (By the way, none of these require a traditional search box.) The potential is endless. It’s also why we believe the open-source approach is the right one for Jina AI.
Every company has a strong need to find the proverbial needles in the haystacks, and while the data of today already goes far beyond text, traditional systems haven’t kept up. We believe in this team to build a new approach that underpins the future of search and opens up new opportunities, and even entirely new businesses.