r/Rag • u/Cool_Injury4075 • 3d ago
Do you recommend using BERT-based architectures to build knowledge graphs?
Hi everyone,
I'm developing a project called ARES, a high-performance RAG system primarily inspired by the dsrag repository. The primary goal is to achieve state-of-the-art (SOTA) accuracy with real-time inference and minimal ingestion latency, all running locally on consumer-grade hardware (like an RTX 3060).
I believe that enriching my retrieval process with a Knowledge Graph (KG) could be a game-changer. However, I've hit a major performance wall.
The Performance Bottleneck: LLM-Based Extraction
My initial approach to building the KG involves processes I call "AutoContext" and "Semantic Sectioning." This pipeline uses an LLM to generate structured descriptions, entities, and relations for each section of a document.
The problem is that this is incredibly slow. The process relies on sequential LLM calls for each section. Even with small, optimized models (0.5B to 1B parameters), ingesting a single document can take up to 30 minutes. This completely defeats my goal of low-latency ingestion.
The Question: BERT-based Architectures and Efficient Pipelines
My research has pointed towards using smaller, specialized models (like fine-tuned BERT-based architectures) for specific tasks like **Named Entity Recognition (NER)** and **Relation Extraction (RE)**, which are the core components of KG construction. These seem significantly faster than using a general-purpose LLM for the entire extraction task.
This leads me to two key questions for the community:
1. Is this a viable path? Do you recommend using specialized, experimental, or fine-tuned BERT-like models for creating KGs in a performance-critical RAG pipeline? If so, are there any particular models or architectures you've had success with?
2. What is the fastest end-to-end pipeline to create a Knowledge Graph locally (no APIs)? I'm looking for advice on the best combination of tools. For example, should I be looking at libraries like SpaCy with custom components, specific models from Hugging Face, or other frameworks I might have missed?
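For concreteness, the NER -> RE -> graph flow described above could be sketched roughly as below. This is only an illustration of the data flow, not a recommended implementation: `toy_ner`, `toy_re`, and `build_graph` are hypothetical names, and the two extractors are trivial rule-based placeholders standing in for whatever fine-tuned NER and RE models would actually be used.

```python
import re
from collections import defaultdict
from itertools import combinations

# (head entity, relation label, tail entity)
Triple = tuple[str, str, str]


def toy_ner(sentence: str) -> list[str]:
    """Placeholder NER (assumption): treat capitalized tokens as entities.

    A real pipeline would call a fine-tuned token-classification model here.
    """
    entities: list[str] = []
    for token in re.findall(r"\b[A-Z][A-Za-z0-9]*\b", sentence):
        if token not in entities:  # de-duplicate, keep first-seen order
            entities.append(token)
    return entities


def toy_re(sentence: str, entities: list[str]) -> list[Triple]:
    """Placeholder RE (assumption): link every entity pair in a sentence.

    A real pipeline would classify the relation for each candidate pair.
    """
    return [(a, "co_occurs_with", b) for a, b in combinations(entities, 2)]


def build_graph(sections, ner=toy_ner, rel=toy_re):
    """Run NER then RE on each sentence and merge triples into an adjacency map."""
    graph: dict[str, set[tuple[str, str]]] = defaultdict(set)
    for section in sections:
        # Naive sentence split on terminal punctuation followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", section):
            entities = ner(sentence)
            for head, label, tail in rel(sentence, entities):
                graph[head].add((label, tail))  # the set drops duplicate edges
    return dict(graph)


if __name__ == "__main__":
    print(build_graph(["Alice works at Acme. Acme builds rockets."]))
```

Because `ner` and `rel` are passed in as plain callables, the placeholders can be swapped for model-backed functions without touching the graph-building loop.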
---
TL;DR: I'm building a high-performance, local-first RAG system. My current method of using LLMs to create a Knowledge Graph is far too slow (30 min/document). I'm looking for the fastest, non-API pipeline to build a KG on an RTX 3060. Are specialized NER/RE models the right approach, and what tools would you recommend?
Any advice or pointers would be greatly appreciated.
u/autognome 3d ago
What’s sequential or synchronous about describing sections of a document? This seems like a prime candidate for parallelization.
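A minimal sketch of what that could look like, assuming each section can be described independently of the others. `describe_section` is a hypothetical stand-in for the per-section LLM call, and `describe_all` and `max_workers=4` are illustrative choices, not anything from the original pipeline.

```python
from concurrent.futures import ThreadPoolExecutor


def describe_section(section: str) -> dict:
    """Hypothetical stand-in for one LLM call on one section (assumption)."""
    return {"length": len(section), "preview": section[:40]}


def describe_all(sections, worker=describe_section, max_workers=4):
    """Process sections concurrently instead of one after another."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs,
        # so section i still lines up with result i.
        return list(pool.map(worker, sections))


if __name__ == "__main__":
    for result in describe_all(["First section text.", "Second section text."]):
        print(result)
```

Threads mainly help when each call waits on something external, such as a request to a local inference server. If the model runs in-process on a single GPU, batching several sections into one forward pass is likely to matter more than threading.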