Accurate and Speedy Data Retrieval

Pryon Retrieval Engine excels at quickly matching user queries to your ingested content, resulting in accurate responses sent to the API endpoint of your choosing.

Vectorization of user queries and vector-matching them to trusted content

Responses delivered with best-in-class accuracy and speed

Attribution to source documents, enabling users to gather additional context

How it Works

01

Generate query embeddings

Pryon Retrieval Engine uses a series of advanced techniques developed in-house to ensure it retrieves the right information from the vector database. These techniques, such as short query prefixing, synonym expansion, query backoff, query disambiguation, and query canonicalization, help the Retrieval Engine gain a complete understanding of a user’s intent.

Pryon Retrieval Engine then converts the user’s query into a series of machine-interpretable vectors, or embeddings, which are used in the subsequent content matching process.
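To make the idea concrete, here is a minimal sketch of two of the techniques named above (query canonicalization and synonym expansion) followed by an embedding step. It is an illustration under stated assumptions, not Pryon's implementation: the `SYNONYMS` table, the function names, and the hashed bag-of-words `embed` function are all hypothetical stand-ins, and a real system would use a learned embedding model.

```python
import hashlib
import math
import re

# Hypothetical synonym table, for illustration only.
SYNONYMS = {"pto": ["vacation", "leave"], "laptop": ["notebook"]}

def canonicalize(query: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", query.lower())
    return " ".join(cleaned.split())

def expand(query: str) -> list[str]:
    """Return the canonical query plus one variant per synonym substitution."""
    base = canonicalize(query)
    tokens = base.split()
    variants = [base]
    for i, tok in enumerate(tokens):
        for syn in SYNONYMS.get(tok, []):
            variants.append(" ".join(tokens[:i] + [syn] + tokens[i + 1:]))
    return variants

def embed(text: str, dim: int = 64) -> list[float]:
    """Toy hashed bag-of-words vector, L2-normalized.

    Assumption: this stands in for a learned embedding model.
    """
    vec = [0.0] * dim
    for tok in text.split():
        idx = int(hashlib.md5(tok.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

variants = expand("What is the PTO policy?")
query_vectors = [embed(v) for v in variants]
```

Each variant gets its own vector, so the matching step can consider the query and its possible permutations together.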

02

Match content

During the content matching phase, Pryon Retrieval Engine matches the vectors that reflect the meaning behind the query (and its possible permutations) against the vectors created during the ingestion process. This enables the Retrieval Engine to rapidly find those chunks of content that most closely match the user’s query.

Pryon Retrieval Engine matches content at industry-leading scale, with the ability to match vectors efficiently across 1 million pages of content.
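The matching phase can be sketched as a nearest-neighbor search by cosine similarity. The example below is a brute-force illustration with made-up three-dimensional vectors and hypothetical chunk IDs; it is not the engine's actual index, and a search at large scale would typically rely on an approximate nearest-neighbor index in a vector database rather than a linear scan.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vectors, chunk_index, k=2):
    """Score each chunk by its best similarity to any query variant."""
    scored = []
    for chunk_id, chunk_vec in chunk_index.items():
        best = max(cosine(q, chunk_vec) for q in query_vectors)
        scored.append((chunk_id, best))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical chunk vectors, as if produced during ingestion.
chunk_index = {
    "hr-handbook#12": [0.9, 0.1, 0.0],
    "it-guide#3": [0.0, 0.2, 0.9],
    "hr-handbook#40": [0.7, 0.6, 0.1],
}
# Two vectors for one query: the original and one expanded variant.
query_vectors = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]

matches = top_k(query_vectors, chunk_index, k=2)
```

Taking the maximum over query variants is one simple way to let an expanded query surface a chunk that the literal query would have missed.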

03

Re-rank content

To ensure users receive highly useful responses, and to avoid overwhelming the LLM with too much information (which can slow response speed), Pryon Retrieval Engine re-ranks the retrieved chunks by relevance.

The Retrieval Engine's re-ranking algorithms are tuned specifically to enterprise data, as opposed to information appropriate to consumer applications.
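As a rough illustration of re-ranking, the sketch below blends each candidate's vector score with a second relevance signal and keeps only the top few chunks, which limits how much text is passed to the LLM. The term-overlap signal, the `weight` parameter, and the sample chunks are assumptions chosen for simplicity; the source does not describe the engine's actual re-ranking algorithms.

```python
def term_overlap(query: str, text: str) -> float:
    """Fraction of distinct query terms that also appear in the chunk text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0

def rerank(query, candidates, keep=2, weight=0.5):
    """Re-score (chunk_text, vector_score) pairs and keep the top `keep`.

    The blended score is weight * vector_score + (1 - weight) * term_overlap.
    """
    rescored = [
        (text, weight * vec_score + (1 - weight) * term_overlap(query, text))
        for text, vec_score in candidates
    ]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored[:keep]

# Hypothetical candidates from the matching phase, best vector score first.
candidates = [
    ("employees accrue vacation monthly", 0.80),
    ("vacation policy for full time employees", 0.78),
    ("laptop refresh policy", 0.60),
]

top = rerank("vacation policy employees", candidates, keep=2)
```

In this example the second candidate moves to first place because it covers every query term, even though its vector score was slightly lower.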

04

Collect user feedback

The best way to ensure users receive accurate, helpful, and relevant responses is to collect feedback. Pryon Retrieval Engine enables developers to let users rate a response with a simple and intuitive “thumbs up/thumbs down” reaction.

These ratings are used in conjunction with the previous steps to better understand user intent and optimize future responses.
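One simple way to picture how thumbs up/thumbs down ratings could feed back into retrieval is to aggregate them per chunk. The `Feedback` record and `approval_rates` function below are hypothetical names for illustration; they do not reflect the engine's real feedback API or how it uses the ratings internally.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Feedback:
    """One thumbs up/thumbs down rating (hypothetical record shape)."""
    query: str
    chunk_id: str
    thumbs_up: bool

def approval_rates(events):
    """Return the fraction of thumbs-up ratings for each rated chunk."""
    ups = defaultdict(int)
    totals = defaultdict(int)
    for event in events:
        totals[event.chunk_id] += 1
        ups[event.chunk_id] += int(event.thumbs_up)
    return {cid: ups[cid] / totals[cid] for cid in totals}

events = [
    Feedback("pto policy", "hr-handbook#12", True),
    Feedback("vacation days", "hr-handbook#12", True),
    Feedback("pto policy", "hr-handbook#40", False),
    Feedback("leave rules", "hr-handbook#12", False),
]

rates = approval_rates(events)
```

A rate like this could, for instance, be used as one more signal when re-ranking chunks for similar future queries.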

The Retrieval Engine has out-of-domain detection, so it recognizes when a query asks for information outside the ingested content and can respond accordingly rather than making up (hallucinating) a response.
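A common, simple approach to out-of-domain detection is to decline when even the best match scores below a cutoff. The sketch below assumes that approach; the threshold value, the function name, and the returned dictionary shape are hypothetical, and the source does not say how the engine's own detection works.

```python
OOD_THRESHOLD = 0.35  # hypothetical cutoff; would need tuning on real data

def answer_or_decline(matches, threshold=OOD_THRESHOLD):
    """Decide whether a query is in domain.

    `matches` is a list of (chunk_id, score) pairs sorted best first.
    If the best score is below the threshold, no chunks are returned,
    so the caller can tell the user that no relevant content was found
    instead of asking an LLM to answer without support.
    """
    if not matches or matches[0][1] < threshold:
        return {"in_domain": False, "chunks": []}
    kept = [cid for cid, score in matches if score >= threshold]
    return {"in_domain": True, "chunks": kept}

in_domain = answer_or_decline([("hr-handbook#12", 0.91), ("it-guide#3", 0.20)])
out_of_domain = answer_or_decline([("it-guide#3", 0.12)])
```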

FAQs

Do I need Pryon Ingestion Engine to use Pryon Retrieval Engine?

Yes. Pryon Retrieval Engine is designed to work with data that has been accessed, cleaned, chunked, and embedded by Pryon Ingestion Engine. Both of these engines (and Pryon Generative Engine) are highly flexible and configurable to meet developers’ requirements. For example, you can send responses to the API endpoint of your choice.

What kind of scale and latency can I expect from Pryon Retrieval Engine?

Pryon Retrieval Engine operates at high speed and scale, enabling users to get precise answers from millions of pages of content with sub-second latency.

What makes Pryon Retrieval Engine best in class?

Pryon Retrieval Engine uses a sophisticated query expansion model to understand not just the keywords of a query, but the query’s true intent. Pryon Retrieval Engine is also highly flexible. Developers can use whichever vector database they’d like and combine Pryon Ingestion Engine and Pryon Retrieval Engine with any LLM they choose. Additionally, Pryon Retrieval Engine's out-of-domain detection recognizes when a query is asking for irrelevant information and prevents the LLM from delivering a hallucinated response.

Ready to get started?

Request a demo.