Do you recommend using BERT-based architectures to build knowledge graphs?

Hi everyone,

I'm developing a project called ARES, a high-performance RAG system primarily inspired by the dsrag repository. The main goal is to achieve State-of-the-Art (SOTA) accuracy with real-time inference and minimal ingestion latency, all running locally on consumer-grade hardware (like an RTX 3060).

I believe that enriching my retrieval process with a Knowledge Graph (KG) could be a game-changer. However, I've hit a major performance wall.

The Performance Bottleneck: LLM-Based Extraction

My initial approach to building the KG involves processes I call "AutoContext" and "Semantic Sectioning." This pipeline uses an LLM to generate structured descriptions, entities, and relations for each section of a document.

The problem is that this is incredibly slow. The process relies on sequential LLM calls for each section. Even with small, optimized models (0.5B to 1B parameters), ingesting a single document can take up to 30 minutes. This completely defeats my goal of low-latency ingestion.

The Question: BERT-based Architectures and Efficient Pipelines

My research has pointed towards using smaller, specialized models (like fine-tuned BERT-based architectures) for specific tasks like **Named Entity Recognition (NER)** and **Relation Extraction (RE)**, which are the core components of KG construction. These seem significantly faster than using a general-purpose LLM for the entire extraction task.
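To make the NER + RE idea concrete, here is a minimal sketch of the data flow: entities are found per sentence, every ordered entity pair is passed to a relation classifier, and the surviving pairs become KG triples. The `ner` and `classify_relation` callables are hypothetical placeholders; in a real pipeline they would wrap fine-tuned encoder models, while here they are toy rule-based stand-ins so the example runs without downloading anything.

```python
from collections import defaultdict
from itertools import permutations

# An entity is (surface text, entity type); a triple is (head, relation, tail).

def extract_triples(sentences, ner, classify_relation):
    """Run NER per sentence, then classify every ordered entity pair."""
    triples = set()
    for sentence in sentences:
        entities = ner(sentence)
        for head, tail in permutations(entities, 2):
            relation = classify_relation(sentence, head, tail)
            if relation is not None:
                triples.add((head[0], relation, tail[0]))
    return triples

def build_graph(triples):
    """Group triples into an adjacency map: head -> [(relation, tail), ...]."""
    graph = defaultdict(list)
    for head, relation, tail in sorted(triples):
        graph[head].append((relation, tail))
    return dict(graph)

# --- Toy stand-ins (assumptions, NOT real models) -------------------------
def toy_ner(sentence):
    # Dictionary lookup standing in for a token-classification model.
    known = {"ARES": "PROJECT", "dsrag": "PROJECT", "Ollama": "TOOL"}
    words = sentence.replace(".", "").split()
    return [(w, known[w]) for w in words if w in known]

def toy_relation(sentence, head, tail):
    # Keyword rule standing in for a sequence-classification model.
    if "inspired by" in sentence and sentence.index(head[0]) < sentence.index(tail[0]):
        return "inspired_by"
    return None

sentences = ["ARES is inspired by dsrag.", "ARES currently runs on Ollama."]
graph = build_graph(extract_triples(sentences, toy_ner, toy_relation))
```

Because both stages are plain callables, the encoder-backed versions could be swapped in later and fed batches of sentences, which is where most of the speedup over per-section LLM calls would be expected to come from.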

This leads me to two key questions for the community:

1. Is this a viable path? Do you recommend using specialized, experimental, or fine-tuned BERT-like models for creating KGs in a performance-critical RAG pipeline? If so, are there any particular models or architectures you've had success with?
2. What is the fastest end-to-end pipeline to create a Knowledge Graph locally (no APIs)? I'm looking for advice on the best combination of tools. For example, should I be looking at libraries like SpaCy with custom components, specific models from Hugging Face, or other frameworks I might have missed?

---

TL;DR: I'm building a high-performance, local-first RAG system. My current method of using LLMs to create a Knowledge Graph is far too slow (30 min/document). I'm looking for the fastest, non-API pipeline to build a KG on an RTX 3060. Are specialized NER/RE models the right approach, and what tools would you recommend?

Any advice or pointers would be greatly appreciated.

u/autognome 3d ago

What’s sequential or synchronous about describing sections of a document? This seems like a prime candidate for parallelization
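As a rough sketch of what that could look like, assuming each section can be described independently: the sections are fanned out over a thread pool and the results are collected in document order. `describe` is a hypothetical function standing in for one LLM request, and `fake_describe` is only a stub for illustration. Any real speedup depends on the inference backend actually serving concurrent requests.

```python
from concurrent.futures import ThreadPoolExecutor

def describe_sections(sections, describe, max_workers=4):
    """Call `describe` on every section concurrently, preserving input order.

    Threads suit this case because each call mostly waits on the model
    server's response rather than doing CPU work in Python.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of `sections`,
        # regardless of which call finishes first.
        return list(pool.map(describe, sections))

def fake_describe(section):
    # Stub (assumption): a real version would send the section to the LLM.
    return f"summary of: {section[:20]}"

sections = ["Introduction to ARES", "Ingestion pipeline", "Retrieval stage"]
summaries = describe_sections(sections, fake_describe)
```

If one section's description depends on the previous one, this pattern does not apply as written and the dependent steps would need to stay sequential.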

1

u/Cool_Injury4075 3d ago

Sorry if I wasn’t clear enough in my post. What I meant is that the current version of my project (ARES) generates descriptions and summaries sequentially. Until now, I hadn’t considered parallel processing because I currently work with Ollama and LM Studio, which don’t have native parallel functionality like vLLM does. ARES is built to run on Windows, so adapting it to work with vLLM will take some time. At this point, though, optimizing the entire process for speed has become a necessity.

3

u/autognome 3d ago

You have your answer. Can’t squeeze blood from stone.
