Plugin: AI Chatbot
Chatbot PoC for SQL Access via LangChain
- Maintainer: Diego Ruiz - BX Service GmbH
- Status: Beta - Proof of concept
- License: GPLv2
- Sources: GitHub
Description
This repository is a proof of concept (PoC) for a chatbot system that can understand natural language questions and translate them into SQL queries against a PostgreSQL database using LangChain, Ollama, and LangGraph.
What it Does
- Accepts user questions via natural language.
- Uses a local LLM model via Ollama (e.g. llama3) to interpret the intent.
- Translates questions into PostgreSQL SQL queries with help from LangChain tools.
- Injects tenant filtering (e.g. AD_Client_ID = ...) to avoid cross tenant data access.
- Streams back answers to the user.
- Uses sqlglot to safely parse and rewrite SQL, ensuring valid syntax and injected conditions.
Technologies Used
- Python 3.12
- LangChain
- LangGraph
- Ollama (for running local LLMs like llama3)
- SQLGlot (safe SQL parsing & rewriting)
Disclaimer
This is a Proof of Concept. Do NOT use it in production. This project is meant to explore feasibility and limitations, not to provide a secure, production-ready system.
Configuration
System Setup in iDempiere
- Log in as System in iDempiere (not as a tenant).
- Navigate to the “SQL AI Configurator” window.
- Open the existing record and fill in the following fields
Required Fields:
| Field | Description |
|---|---|
| Python Path | Full path to python3.
Recommended: Use lc-env inside Conda or virtualenv for consistent dependencies. |
| Script Path | Full path to the pocSQLLangChain.py script from this repository. |
| Config JSON | JSON string with PostgreSQL connection info and tables to include (see example below). |
Example Config JSON: {
"host": "localhost", "port": 5432, "database": "idempiere", "user": "adempiere", "password": "adempiere", "include_tables": ["v_bpartner_info", "v_order_info"]
}
Recommendation:
- Do NOT use core iDempiere tables like c_order, m_product, or ad_user directly — they are too large and complex.
Instead, create custom views that expose only the necessary columns and rows.
Example View: CREATE OR REPLACE VIEW v_bpartner_info AS select
value, name, iscustomer, isvendor, taxid, totalopenbalance, c_bpartner_id, ad_client_id
FROM c_bpartner;
Security Best Practice
Create a dedicated PostgreSQL user for this integration:
- Only grant read-only access to the necessary views.
- Avoid giving write, update, or delete permissions.
- Helps reduce the risk if the model generates unexpected queries.
Which Script to Use?
Recommended: Use pocSQLLangChain.py
❌ Avoid using the Agent-based version for now
The agent-based pipeline tends to hallucinate more, lacks strict validation, and is harder to control. pocSQLLangChain.py includes:
- SQL validation (SELECT-only)
- Safe WHERE clause injection
- SQL parsing with sqlglot
How SQL Injection is Prevented
- Only allows queries starting with SELECT
- Rejects any containing dangerous keywords like INSERT, DROP, UPDATE
- Strips trailing semicolons
- Uses sqlglot to parse and inject AD_Client_ID = ... safely
Using the Chatbot
- Log in to iDempiere as a tenant
- Open the Chatbot Form
- Ask a question like: “List my top customers”
The system will:
- Translate your question to SQL
- Inject AD_Client_ID = ... for multi-tenant safety
- Log the executed query
Logging
- For each user question, the system logs a warning message with the final SQL query.
- Review these logs to ensure the correctness of the generated queries.
A dedicated iDempiere process has been created to trigger the Python chatbot script via REST call — suitable for integrations with external tools or frontends.
Limitation and Risks
- LLM Hallucination: The LLM may generate incorrect or non-existent SQL columns or logic and throw random answers.
- SQL Injection: Only partially mitigated — raw SQL generation still occurs.
- No Authentication: There's no user management or permission control (Do this with a specific potsgresql users with limited access).
- Performance: Not optimized for real-time or large-scale workloads.
- Error Handling: Limited feedback when queries fail (e.g. ambiguity, syntax, or missing fields).


