The Origin Story of DeepSet AI: A Journey of Innovation and Growth
DeepSet AI, a pioneering company in the field of Natural Language Processing (NLP), was founded in 2018 by CEO Milos Rusic and CTO Malte Pietsch. The two had met at university ten years earlier, where they worked together on machine learning, applied mathematics, stochastics, and statistics. Malte's experience in building online recommendation engines and natural language processing systems laid the foundation for DeepSet's vision: to understand language and process it, whether it arrives as text or speech.
Bootstrapping and Consulting: The Early Days
In the early days, DeepSet was bootstrapped, with the founders working as consultants for companies like Airbus, federal authorities in Europe, and software providers in Germany. They built NLP systems for a range of enterprise use cases, often by assembling solutions from open-source tools. This work financed the company in place of outside funding and taught the founders the NLP space.
Haystack: The Open-Source Orchestration Framework
One of the key projects to emerge from DeepSet's early days is Haystack, an open-source orchestration framework that connects the pieces needed to deploy an NLP system. A Haystack system consists of multiple components, including models, databases, and text preprocessing, and the framework is designed to represent complex architectures, including systems that integrate multiple models.
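The idea of an orchestration layer can be sketched in a few lines of plain Python. This is an illustrative sketch only and does not use the framework's actual API: the `Pipeline` class, the component functions, and the `state` dictionary are all hypothetical names chosen for this example, and the "model" is a stand-in that simply echoes its context.

```python
from typing import Any, Callable, Dict, List, Tuple

# A component is any callable that takes the shared state and returns it updated.
Component = Callable[[Dict[str, Any]], Dict[str, Any]]


class Pipeline:
    """Hypothetical minimal orchestrator: runs named components in order."""

    def __init__(self) -> None:
        self._steps: List[Tuple[str, Component]] = []

    def add(self, name: str, component: Component) -> "Pipeline":
        self._steps.append((name, component))
        return self  # allow chained calls

    def run(self, state: Dict[str, Any]) -> Dict[str, Any]:
        for name, component in self._steps:
            # Pass a shallow copy so a step does not modify the caller's dict.
            state = component(dict(state))
            # Record which components ran, in order.
            state["trace"] = state.get("trace", []) + [name]
        return state


def preprocess(state: Dict[str, Any]) -> Dict[str, Any]:
    # Text preprocessing: collapse whitespace and lowercase the query.
    state["query"] = " ".join(state["query"].split()).lower()
    return state


def retrieve(state: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for a database lookup: keep documents sharing a word with the query.
    words = set(state["query"].split())
    state["context"] = [
        d for d in state["documents"] if words & set(d.lower().split())
    ]
    return state


def generate(state: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for a language model call: echo the retrieved context.
    state["answer"] = " ".join(state["context"]) or "no relevant context found"
    return state


pipeline = (
    Pipeline()
    .add("preprocess", preprocess)
    .add("retrieve", retrieve)
    .add("generate", generate)
)
result = pipeline.run(
    {
        "query": "  What is  NLP ",
        "documents": ["NLP is language processing", "Airbus builds planes"],
    }
)
print(result["answer"])
print(result["trace"])
```

The design point is that each component only knows about the shared state, so a model, a database lookup, or a preprocessing step can be swapped out without touching the others.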
Transition to Building a Product
The transition from consulting to building a product was relatively smooth, thanks to the small team (around 10-12 people) and the early product versions that had already been built during the bootstrapped phase. The founders had always aspired to capture a bigger opportunity in the NLP space, which was not well understood at the time.
Deep Dive into Haystack
To illustrate what Haystack does, consider a use case in which people can ask questions about a podcast transcript. The components involved include models (e.g., GPT models), databases (e.g., a vector database), text preprocessing, and a step that passes the relevant text to the model. The orchestration framework ties these components together, making it possible to build complex NLP systems that process and generate human-like language.
DeepSet Cloud: A Cloud-Based Platform for NLP
DeepSet Cloud is a cloud-based platform that enables companies to adopt NLP and Large Language Models (LLMs). The platform is designed to support the full workflow of experimenting with, developing, and running NLP applications in production. Built on top of Haystack, DeepSet Cloud provides the infrastructure for deploying and managing NLP systems.
Combining Pre-processing, Retrieval, and Generation
At the heart of DeepSet Cloud is the ability to combine pre-processing, retrieval, and generation. In the podcast example, this means chunking transcripts into small slices, storing them in a database, using a filter to select the relevant slices, and passing them to the model. The model then generates an answer, which can be further refined using additional components such as hallucination detectors and fact-checking models.
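The flow described above (chunk, store, filter, generate, check) can be sketched as follows. This is a toy illustration under loud assumptions: word overlap stands in for vector search, `generate` is a placeholder for an LLM call, and `is_grounded` is a crude stand-in for a hallucination detector, which in practice would itself be a model. All function names and the sample transcript are hypothetical.

```python
import re
from typing import List, Set


def tokens(text: str) -> Set[str]:
    # Lowercased alphanumeric words, punctuation dropped.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def chunk(transcript: str, size: int = 8) -> List[str]:
    # Split the transcript into slices of at most `size` words.
    words = transcript.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def retrieve(question: str, store: List[str], top_k: int = 2) -> List[str]:
    # Rank slices by word overlap with the question (assumption: a real
    # system would query a vector database instead).
    q = tokens(question)
    ranked = sorted(store, key=lambda c: len(q & tokens(c)), reverse=True)
    return [c for c in ranked[:top_k] if q & tokens(c)]


def generate(question: str, context: List[str]) -> str:
    # Placeholder for an LLM call: return the best-matching slice verbatim.
    return context[0] if context else "I could not find that in the transcript."


def is_grounded(answer: str, context: List[str], threshold: float = 0.8) -> bool:
    # Crude grounding check: share of answer words that appear in the context.
    answer_words = tokens(answer)
    if not answer_words or not context:
        return False
    context_words = tokens(" ".join(context))
    return len(answer_words & context_words) / len(answer_words) >= threshold


transcript = (
    "Welcome to the show. Today we talk about search. "
    "The company was bootstrapped through consulting work. "
    "Later it built a cloud platform for language models."
)
store = chunk(transcript)  # the "database" is just a list here
question = "How was the company bootstrapped?"
context = retrieve(question, store)
answer = generate(question, context)

print(answer)
print("grounded:", is_grounded(answer, context))
print("grounded:", is_grounded("The company raised money from aliens", context))
```

Keeping the grounding check as a separate step after generation mirrors the description above: the answer is produced first, then additional components decide whether it is supported by the retrieved slices.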
DeepSet Cloud Architecture
The architecture of DeepSet Cloud lets customers bring their own database and LLM, and it integrates with popular vector databases and LLMs. The platform also includes tooling for running small-scale tests and A/B tests, making it easier to move applications to production.
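One common way to support "bring your own database and LLM" is to code against small interfaces rather than concrete vendors. The sketch below shows that pattern with Python's `typing.Protocol`; it is not DeepSet Cloud's actual design or API, and `DocumentStore`, `Generator`, `InMemoryStore`, `EchoGenerator`, and `answer` are hypothetical names invented for illustration.

```python
from typing import List, Protocol


class DocumentStore(Protocol):
    """Anything that can store texts and search them."""

    def add(self, texts: List[str]) -> None: ...

    def search(self, query: str, top_k: int) -> List[str]: ...


class Generator(Protocol):
    """Anything that can turn a prompt into a completion."""

    def complete(self, prompt: str) -> str: ...


class InMemoryStore:
    # Toy backend: keyword match over a Python list.
    def __init__(self) -> None:
        self._texts: List[str] = []

    def add(self, texts: List[str]) -> None:
        self._texts.extend(texts)

    def search(self, query: str, top_k: int) -> List[str]:
        q = set(query.lower().split())
        hits = [t for t in self._texts if q & set(t.lower().split())]
        return hits[:top_k]


class EchoGenerator:
    # Toy "LLM": returns the last line of the prompt.
    def complete(self, prompt: str) -> str:
        return prompt.splitlines()[-1]


def answer(question: str, store: DocumentStore, llm: Generator, top_k: int = 1) -> str:
    # The application logic depends only on the two interfaces above,
    # so either backend can be swapped without changing this function.
    context = store.search(question, top_k)
    prompt = "Question: " + question + "\n" + "\n".join(context)
    return llm.complete(prompt)


store = InMemoryStore()
store.add(["Pipelines combine retrieval and generation", "Pricing is consumption-based"])
reply = answer("how is pricing set", store, EchoGenerator())
print(reply)
```

Because `Protocol` uses structural typing, a wrapper around any vector database or hosted model satisfies the interface as long as it provides the same methods, without inheriting from a shared base class.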
The Lifecycle of NLP Applications
DeepSet Cloud supports the full lifecycle of NLP applications, from experimentation and evaluation to moving to production, monitoring, and iteration. The platform is designed to make it easy for NLP teams to integrate with existing infrastructure and tools, and to orchestrate teams of data scientists, software engineers, product owners, and product managers.
Maturity of the Stack
Cloud infrastructure is a mature space, and DeepSet Cloud uses best-of-breed components to ensure scalability and reliability. Even so, scaling and infrastructure setup require expertise and effort, and the company is committed to making it easy for customers to get started.
DeepSet Cloud Features and Pricing
DeepSet Cloud offers features such as semantic search, summarization, and Q&A, as well as retrieval-augmented generation with a hallucination detector. The pricing model is consumption-based, so that cost tracks the value customers get out of the LLM system. The company offers different packages based on consumption patterns, making it easy for customers to get started.
Technology Stacks and Use Cases
DeepSet Cloud has seen strong adoption among enterprises, particularly large companies. The company has built a technology stack that provides the tooling needed to customize applications. Customers include Airbus, which is working on a project to improve single-pilot operations, and Monis, a law firm that uses Haystack to build features that improve the user experience for its customers.
Global 2000 vs. Tech Startups
The company notes that Global 2000 companies and tech startups have different needs when it comes to AI adoption. Global 2000 companies tend to be more concerned with security and customization, while tech startups are more focused on ease of use and standardization.
Product Roadmap
DeepSet AI has recently raised a round of funding and will be investing in its go-to-market strategy. The company plans to double down on product maturity and expand into the US market.
Generative AI and Observability
The company is excited about generative AI, particularly in the area of observability. They believe that there are many players in the observability market, and that it is an exciting space to be in.
AI Scene in Europe
The company notes that Europe has great companies and great potential, but that it can be hard for them to reach ambitious growth levels because of a more conservative approach to adopting new technologies. It hopes to see more appetite for AI adoption in Europe.
In conclusion, DeepSet AI's origin story is one of innovation and growth, driven by a passion for NLP and a commitment to making it easy for companies to adopt AI. With its open-source orchestration framework Haystack and its cloud-based platform DeepSet Cloud, the company is well positioned to capture a bigger opportunity in the NLP space.