metadata

base_model: ybelkada/falcon-7b-sharded-bf16
tags:
  - generated_from_trainer
  - lora
  - falcon
model-index:
  - name: results
    results: []
datasets:
  - Clinton/Text-to-sql-v1
library_name: peft
language:
  - en
pipeline_tag: text-generation

AI2sql

AI2sql is an LLM fine-tuned to convert natural language questions into SQL queries.

Model description

AI2SQL is a specialized LLM fine-tuned from Falcon-7b-instruct using LoRA (Low-Rank Adaptation) through the PEFT library, tailored for interpreting natural language questions and generating the corresponding SQL queries.

Intended uses & limitations

AI2SQL is designed for data analysts, business intelligence professionals, and developers to facilitate the conversion of natural language questions into SQL queries. This tool aids those who are not proficient in SQL, enabling easier database querying. AI2SQL's performance is inherently tied to the characteristics of its training data. While it has been trained on a diverse and substantial dataset, it may not account for all possible SQL dialects or database structures. Careful review of the generated SQL queries is recommended.
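
One lightweight way to review a generated query before running it is to compile it against the target schema without executing it. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The `sales` table and the `check_sql` helper are hypothetical names chosen for illustration, and SQLite's dialect may differ from that of your production database.

```python
import sqlite3

# Hypothetical schema used only for illustration.
SCHEMA = "CREATE TABLE sales (id INTEGER, product TEXT, sold_at TEXT)"

def check_sql(sql, schema=SCHEMA):
    """Return (True, None) if `sql` compiles against `schema`, else (False, message)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(schema)
        # EXPLAIN compiles the statement without running it.
        conn.execute("EXPLAIN " + sql)
        return True, None
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return False, str(exc)
    finally:
        conn.close()

print(check_sql("SELECT COUNT(*) FROM sales"))
print(check_sql("SELECT COUNT(*) FROM orders"))  # table is not in the schema
```

A check like this catches references to missing tables or columns and syntax errors, but it cannot tell whether a query that compiles actually answers the question, so human review is still needed.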

Inference

Model Deployment

AI2SQL is designed for efficient real-time inference, making it suitable for interactive applications where users query databases using natural language.

Computational Requirements

Hardware Requirements: AI2SQL shows satisfactory performance on an A10 GPU. For better performance, particularly on more complex queries, a higher-end GPU is recommended.
Memory Footprint: The model requires at least 14 GB of memory for inference, consistent with roughly 7 billion parameters stored in bf16 (2 bytes each).
Latency: Response time depends on the hardware and on the complexity of the query. On an A10 GPU latency is satisfactory, and more advanced hardware is expected to reduce it further.

Usage Guidelines

To use AI2SQL for generating SQL queries, follow these steps:

Preparation: Ensure that your system meets the hardware and software requirements for running the model.
Input Formatting: Format your natural language questions clearly and concisely for best results.
Model Invocation: Call the AI2SQL model with the natural language question as input. The model returns the corresponding SQL query as output.

Example Code for Inference

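from transformers import pipeline

# Initialize the AI2SQL model. The task matches the pipeline_tag in the metadata
# ('text-to-sql' is not a transformers pipeline task). Replace 'ai2sql' with the
# model's full Hub repository id; loading a LoRA adapter also needs peft installed.
ai2sql = pipeline('text-generation', model='ai2sql')

# Example natural language question
question = "How many products were sold last month?"

# Generate the SQL query; the pipeline returns a list of dicts
outputs = ai2sql(question, max_new_tokens=128)
sql_query = outputs[0]['generated_text']
print("Generated SQL Query:", sql_query)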
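
For the input formatting and invocation steps, it often helps to wrap the question in a prompt that includes the table definition and then strip the echoed prompt from the output. The following is a minimal sketch: the prompt template, `build_prompt`, and `extract_sql` are all hypothetical, since the prompt format used during fine-tuning is not documented here.

```python
# Hypothetical prompt template; the format used during fine-tuning is not documented here.
PROMPT_TEMPLATE = (
    "### Schema:\n{schema}\n\n"
    "### Question:\n{question}\n\n"
    "### SQL:\n"
)

def build_prompt(question, schema):
    """Combine the table definition and the question into one prompt string."""
    return PROMPT_TEMPLATE.format(schema=schema.strip(), question=question.strip())

def extract_sql(generated_text, prompt):
    """Drop the echoed prompt and keep the text up to the first semicolon."""
    if generated_text.startswith(prompt):
        completion = generated_text[len(prompt):]
    else:
        completion = generated_text
    statement = completion.split(";", 1)[0].strip()
    return statement + ";" if statement else ""

prompt = build_prompt(
    "How many products were sold last month?",
    "CREATE TABLE sales (id INTEGER, product TEXT, sold_at TEXT)",
)
# Stand-in for the model output: text-generation pipelines echo the prompt by default.
fake_output = prompt + "SELECT COUNT(*) FROM sales;\n\n### Question:\nextra text"
print(extract_sql(fake_output, prompt))
```

Cutting at the first semicolon is a simplification that would break on string literals containing `;`; a SQL parser would be more robust.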

Scalability

AI2SQL is scalable and can handle concurrent requests, making it suitable for deployment in high-demand environments.

Error Handling

The model includes robust error handling for invalid inputs and provides meaningful error messages to guide users in correcting their queries.

Security Considerations

Users should be aware of security implications when using AI2SQL, especially when dealing with sensitive data or integrating the model into secure environments. Ensure all data handling complies with relevant privacy and security regulations.
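
Because generated SQL is untrusted input, one common precaution is to execute it over a connection that cannot modify data. The sketch below illustrates the idea with SQLite's `query_only` pragma. The `run_read_only` helper and the `sales` table are hypothetical, and on other database systems the equivalent safeguard would typically be a role with read-only privileges.

```python
import sqlite3

def run_read_only(conn, sql):
    """Execute `sql` with writes disabled; return rows, or None if it is rejected."""
    conn.execute("PRAGMA query_only = ON")
    try:
        return conn.execute(sql).fetchall()
    except (sqlite3.Error, sqlite3.Warning):
        return None

# Hypothetical table set up before writes are disabled.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, product TEXT)")
conn.execute("INSERT INTO sales VALUES (1, 'widget')")
conn.commit()

print(run_read_only(conn, "SELECT COUNT(*) FROM sales"))  # read is allowed
print(run_read_only(conn, "DROP TABLE sales"))            # write is rejected
print(run_read_only(conn, "SELECT COUNT(*) FROM sales"))  # table is still there
```

This limits what a bad query can change, but it does not stop a query from reading sensitive columns, so access to the underlying tables should still be scoped.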

Training and evaluation data

AI2SQL was trained on 262,000 rows of paired natural language questions and SQL queries from the Clinton/Text-to-sql-v1 dataset, covering a wide array of domains and question complexities.

Training procedure

Overview

AI2SQL was trained in a multi-stage process, starting with a pre-trained Falcon-7b-instruct model, a large transformer-based language model. This base model was then fine-tuned using a Parameter-Efficient Fine-Tuning (PEFT) approach with Low-Rank Adaptation (LoRA), specifically for the task of translating natural language to SQL queries.

Data Preparation

The training dataset, sourced from Clinton/Text-to-sql-v1, contains 262,000 rows. Each row pairs a natural language question with its corresponding SQL query, and the pairs cover a diverse range of domains and query complexities.

Fine-Tuning Process

Data Preprocessing: The dataset was preprocessed to normalize text and SQL queries, ensuring consistency in formatting and syntax.
Model Adaptation: The Falcon-7b-instruct model was adapted using PEFT-LoRA, a technique that keeps the pretrained weights frozen and trains small low-rank adapter matrices instead of retraining the full model. This approach is particularly beneficial for adapting large-scale models to specific tasks with limited computational resources.
Training Strategy: The model was trained in a supervised learning setup, where it learned to map natural language inputs to their corresponding SQL queries. Special attention was given to the model's ability to understand the semantics of the natural language questions and accurately reflect them in SQL syntax.
Validation and Testing: Throughout the training process, the model was periodically evaluated on a held-out validation set to monitor its performance and prevent overfitting. The final model was tested on an independent test set to assess its generalization capabilities.
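
To illustrate why the LoRA adaptation step above is cheap, recall that LoRA replaces the update to a weight matrix W (d_out x d_in) with a product of two small factors, B (d_out x r) and A (r x d_in), so only r * (d_out + d_in) parameters are trained per adapted matrix. The sketch below counts them. The dimensions and rank are illustrative values only, since the rank and target layers used for AI2SQL are not stated in this card.

```python
def lora_trainable_params(d_out, d_in, rank):
    """Parameters in the two LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)

def full_params(d_out, d_in):
    """Parameters updated when the whole d_out x d_in matrix is fine-tuned."""
    return d_out * d_in

# Illustrative values only; the rank and target layers used for AI2SQL are not stated.
d_out, d_in, rank = 4096, 4096, 16

full = full_params(d_out, d_in)
lora = lora_trainable_params(d_out, d_in, rank)
print(f"full update: {full:,} parameters")
print(f"LoRA update: {lora:,} parameters ({lora / full:.2%} of full)")
```

With these example numbers the adapter trains under 1% of the parameters of the matrix it modifies, which is what makes fine-tuning a 7B model feasible on modest hardware.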

Model Evaluation

The model's performance was evaluated based on its accuracy in generating correct SQL queries corresponding to the input natural language questions. Metrics such as precision, recall, and F1-score were used to quantify the model's effectiveness.

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 4
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant
lr_scheduler_warmup_ratio: 0.03
training_steps: 500
mixed_precision_training: Native AMP
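
The relationship between these values can be checked with a little arithmetic. The sketch below assumes single-device training (the device count is not stated) and compares the number of examples processed in 500 steps with the 262,000-row dataset size reported in this card.

```python
train_batch_size = 4
gradient_accumulation_steps = 4
training_steps = 500
num_devices = 1          # assumption: single-device training
dataset_rows = 262_000   # dataset size stated in this card

# Effective batch size per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
# Examples processed over the whole run.
examples_seen = total_train_batch_size * training_steps
fraction_of_dataset = examples_seen / dataset_rows

print(total_train_batch_size)
print(examples_seen)
print(f"{fraction_of_dataset:.1%}")
```

Under these assumptions the run processes 8,000 examples, a small fraction of one pass over the dataset.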

Training results

Performance Metrics

AI2SQL's performance was rigorously evaluated post-training. The key metrics used to assess the model were:

Accuracy: The percentage of queries where the model-generated SQL matched the expected SQL.
Precision: The proportion of correctly generated SQL queries out of all queries generated by the model.
Recall: The ability of the model to generate all relevant SQL queries corresponding to the input natural language questions.
F1-Score: The harmonic mean of precision and recall, providing a balance between the two.
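
As a rough illustration of the accuracy metric above, the sketch below computes exact-match accuracy after a light normalization. The normalization rules (collapsing whitespace, dropping a trailing semicolon, lowercasing) are assumptions made for this example, not the evaluation procedure actually used for AI2SQL.

```python
def normalize_sql(sql):
    """Collapse whitespace, drop a trailing semicolon, and lowercase the query."""
    return " ".join(sql.strip().rstrip(";").split()).lower()

def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(
        normalize_sql(p) == normalize_sql(r)
        for p, r in zip(predictions, references)
    )
    return matches / len(references)

predictions = ["SELECT  name FROM users;", "select id from orders"]
references = ["select name from users", "SELECT total FROM orders"]
print(exact_match_accuracy(predictions, references))
```

Exact match is strict: it penalizes queries that are semantically equivalent but written differently, and lowercasing also alters string literals. Comparing execution results on a test database is a common complement.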

Results:

Accuracy: TBD
Precision: TBD
Recall: TBD
F1-Score: TBD

Insights and Observations

Handling Complex Queries: AI2SQL demonstrated a high proficiency in handling complex queries involving multiple SQL clauses and parameters.
Contextual Understanding: The model showed a notable capability in understanding context and generating SQL queries that accurately reflect nuanced natural language instructions.
Performance on Diverse Data: AI2SQL maintained consistent performance across various domains present in the training dataset, indicating its robustness and general applicability.

Limitations Observed

Handling Ambiguous Questions: The model sometimes struggled with ambiguous natural language inputs where the intent was not clear.
Query Specificity: In cases of highly specific queries, the model occasionally generated SQL that was syntactically correct but did not completely align with the nuanced requirements of the question.

Future Improvements

Based on the training results and observed limitations, future improvements could include:

Enhanced training on ambiguous natural language inputs to improve the model's interpretative capabilities.
Further fine-tuning with a broader range of specific and complex SQL queries to enhance the model's accuracy in generating nuanced SQL statements.

Framework versions

Transformers 4.35.2
PyTorch 2.1.0+cu118
Datasets 2.15.0
Tokenizers 0.15.0