# 🖥️ RLCoder: Reinforcement Learning for Repository-Level Code Completion

📄 This repository contains the implementation for the ICSE 2025 paper, "[RLCoder: Reinforcement Learning for Repository-Level Code Completion](https://arxiv.org/abs/2407.19487)".

> In this paper, we introduce a reinforcement learning framework for repository-level code completion.
> The core module, RLRetriever, is a retriever that can disregard seemingly useful yet ultimately useless reference code snippets, focusing instead on those more likely to contribute to accurate code generation.

**🔧 Models:** The trained `RLRetriever` is available at [RLRetriever](https://huggingface.co/nov3630/RLCoder).

**📦 Datasets:** Download the training and evaluation datasets from [Data4RLCoder](https://huggingface.co/datasets/nov3630/Data4RLCoder) to the `/data` folder.

---

![Overview](asset/Overview.jpg)

---

**⚠️ If you want to reproduce the results from scratch, please follow these steps:**

**🛠️ Set-Up:** Before starting, set up your environment by installing the dependencies listed in `requirements.txt`. To do so, activate your Python virtual environment and run:

```bash
pip install -r requirements.txt
```
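It may also help to create the directories the commands below write into (`tee` does not create `log/` or `log_infer/` on its own) and to fetch the datasets listed above. Below is a minimal sketch using the `huggingface_hub` CLI; it assumes `pip install "huggingface_hub[cli]"`, and the exact directory set is inferred from the paths used in the commands rather than required by `main.py`:

```bash
# Create the output/log directories used by the training and evaluation
# commands below (this set is inferred from the paths in those commands).
mkdir -p data log log_infer result result_infer

# Fetch the training/evaluation data into data/ (the repo layout is assumed
# to match what main.py expects).
huggingface-cli download nov3630/Data4RLCoder --repo-type dataset --local-dir data

# The backbone LLMs and the trained retriever are referenced by Hub ID in the
# commands below; if main.py loads them via transformers, they are downloaded
# automatically on first use.
```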
## 🚀 Training

The training command is as follows:

```bash
# RLCoder
python main.py \
    --weighted_keywords \
    --enable_generation \
    --inference_type unixcoder_with_rl \
    --output_dir result/RLCoder \
    2>&1 | tee log/RLCoder.log
```

```bash
# Ablation: natural candidates
python main.py \
    --weighted_keywords \
    --enable_generation \
    --enable_fixed_block \
    --inference_type unixcoder_with_rl \
    --output_dir result/Ablation_candidate \
    2>&1 | tee log/Ablation_candidate.log
```

```bash
# Ablation: stop signal
python main.py \
    --weighted_keywords \
    --enable_generation \
    --disable_stop_block \
    --inference_type unixcoder_with_rl \
    --output_dir result/Ablation_stop_signal \
    2>&1 | tee log/Ablation_stop_signal.log
```

```bash
# Ablation: unweighted ppl
python main.py \
    --enable_generation \
    --inference_type unixcoder_with_rl \
    --output_dir result/Ablation_weight \
    2>&1 | tee log/Ablation_weight.log
```

```bash
# RLCoder w/ RepoCoder
# --rlcoder_model_path: path to the trained RLCoder retriever, e.g. 'result/RLCoder/retriever_cpkt/result_0'
python main.py \
    --weighted_keywords \
    --enable_generation \
    --enable_repocoder \
    --inference_type unixcoder_with_rl \
    --rlcoder_model_path 'nov3630/RLRetriever' \
    --output_dir result/RepoCoder_rl \
    2>&1 | tee log/RepoCoder_rl.log
```

```bash
# SFT
python main.py \
    --weighted_keywords \
    --enable_generation \
    --enable_sft \
    --inference_type unixcoder_with_rl \
    --epoch 1 \
    --inner_epoch 20 \
    --output_dir result/SFT \
    2>&1 | tee log/SFT.log
```

## 🔍 Evaluation

In the evaluation section of our paper, our configuration uses an `in-file context` of `512` tokens and a `cross-file context` of `1536` tokens (i.e., `--generator_max_context_length 2048` with `--generator_max_crossfile_length 1536`).

### RQ1

We evaluate the performance of `RLCoder` on 5 backbone LLMs: `CodeLlama-7B`, `StarCoder-7B`, `StarCoder2-7B`, `DeepSeekCoder-1B`, and `DeepSeekCoder-7B`. We show an inference example on `DeepSeekCoder-7B` as follows:

```bash
# deepseekcoder-7b + RawRAG
python main.py \
    --eval \
    --weighted_keywords \
    --enable_generation \
    --inference_type unixcoder \
    --generator_model_path deepseek-ai/deepseek-coder-6.7b-base \
    --generator_max_crossfile_length 1536 \
    --generator_max_context_length 2048 \
    --generator_batch_size_per_gpu 16 \
    --output_dir result_infer/RawRAG_deepseekcoder_7b_crossfile_1536_infile_512 \
    2>&1 | tee log_infer/RawRAG_deepseekcoder_7b_crossfile_1536_infile_512.log

# deepseekcoder-7b + RepoCoder
python main.py \
    --eval \
    --weighted_keywords \
    --enable_generation \
    --enable_repocoder \
    --inference_type unixcoder_with_rl \
    --generator_model_path deepseek-ai/deepseek-coder-6.7b-base \
    --rlcoder_model_path 'microsoft/unixcoder-base' \
    --generator_max_crossfile_length 1536 \
    --generator_max_context_length 2048 \
    --generator_batch_size_per_gpu 16 \
    --output_dir result_infer/RepoCoder_deepseekcoder_7b_crossfile_1536_infile_512 \
    2>&1 | tee log_infer/RepoCoder_deepseekcoder_7b_crossfile_1536_infile_512.log

# deepseekcoder-7b + RLCoder
# --retriever_model_path: path to the trained RLCoder retriever, e.g. 'result/RLCoder/retriever_cpkt/result_0'
python main.py \
    --eval \
    --weighted_keywords \
    --enable_generation \
    --inference_type unixcoder_with_rl \
    --generator_model_path deepseek-ai/deepseek-coder-6.7b-base \
    --retriever_model_path 'nov3630/RLRetriever' \
    --generator_max_crossfile_length 1536 \
    --generator_max_context_length 2048 \
    --generator_batch_size_per_gpu 16 \
    --output_dir result_infer/RLCoder_deepseekcoder_7b_crossfile_1536_infile_512 \
    2>&1 | tee log_infer/RLCoder_deepseekcoder_7b_crossfile_1536_infile_512.log
```

For other backbone LLMs, replace `deepseek-ai/deepseek-coder-6.7b-base` above with `deepseek-ai/deepseek-coder-1.3b-base`, `bigcode/starcoderbase-7b`, `bigcode/starcoder2-7b`, or `codellama/CodeLlama-7b-hf`, respectively (or sweep them all with a loop, as sketched below).
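As a convenience, the remaining backbones can be swept with a small shell loop. The sketch below reuses the RawRAG configuration from above; the `${name}`-based output and log naming is our own convention, not something `main.py` requires, so adjust it (and the flags, for the RepoCoder/RLCoder variants) as needed:

```bash
# Sweep the remaining backbone LLMs with the RawRAG configuration.
for model in deepseek-ai/deepseek-coder-1.3b-base \
             bigcode/starcoderbase-7b \
             bigcode/starcoder2-7b \
             codellama/CodeLlama-7b-hf; do
    # Derive a short run name from the Hub ID, e.g. "CodeLlama-7b-hf".
    name=$(basename "$model")
    python main.py \
        --eval \
        --weighted_keywords \
        --enable_generation \
        --inference_type unixcoder \
        --generator_model_path "$model" \
        --generator_max_crossfile_length 1536 \
        --generator_max_context_length 2048 \
        --generator_batch_size_per_gpu 16 \
        --output_dir "result_infer/RawRAG_${name}_crossfile_1536_infile_512" \
        2>&1 | tee "log_infer/RawRAG_${name}_crossfile_1536_infile_512.log"
done
```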
### RQ2

We evaluate the performance of `RLRetriever` compared with `NoRetriever`, `BM25`, `UniXcoder`, and `UniXcoder-SFT`.

```bash
# NoRetriever
python main.py \
    --eval \
    --weighted_keywords \
    --enable_generation \
    --inference_type baseline \
    --generator_model_path deepseek-ai/deepseek-coder-6.7b-base \
    --generator_max_crossfile_length 1536 \
    --generator_max_context_length 2048 \
    --generator_batch_size_per_gpu 16 \
    --output_dir result_infer/Baseline_deepseekcoder_7b_crossfile_1536_infile_512 \
    2>&1 | tee log_infer/Baseline_deepseekcoder_7b_crossfile_1536_infile_512.log
```

```bash
# BM25
python main.py \
    --eval \
    --weighted_keywords \
    --enable_generation \
    --inference_type BM25 \
    --generator_model_path deepseek-ai/deepseek-coder-6.7b-base \
    --generator_max_crossfile_length 1536 \
    --generator_max_context_length 2048 \
    --generator_batch_size_per_gpu 16 \
    --output_dir result_infer/BM25_deepseekcoder_7b_crossfile_1536_infile_512 \
    2>&1 | tee log_infer/BM25_deepseekcoder_7b_crossfile_1536_infile_512.log
```

```bash
# UniXcoder
python main.py \
    --eval \
    --weighted_keywords \
    --enable_generation \
    --inference_type UniXcoder \
    --generator_model_path deepseek-ai/deepseek-coder-6.7b-base \
    --generator_max_crossfile_length 1536 \
    --generator_max_context_length 2048 \
    --generator_batch_size_per_gpu 16 \
    --output_dir result_infer/UniXcoder_deepseekcoder_7b_crossfile_1536_infile_512 \
    2>&1 | tee log_infer/UniXcoder_deepseekcoder_7b_crossfile_1536_infile_512.log
```

```bash
# UniXcoder-SFT
# --retriever_model_path: fill in the path to the trained UniXcoder-SFT retriever, e.g. 'result/SFT/retriever_cpkt/result_0'
python main.py \
    --eval \
    --weighted_keywords \
    --enable_generation \
    --inference_type unixcoder_with_rl \
    --generator_model_path deepseek-ai/deepseek-coder-6.7b-base \
    --retriever_model_path '' \
    --generator_max_crossfile_length 1536 \
    --generator_max_context_length 2048 \
    --generator_batch_size_per_gpu 16 \
    --output_dir result_infer/UniXcoder_sft_deepseekcoder_7b_crossfile_1536_infile_512 \
    2>&1 | tee log_infer/UniXcoder_sft_deepseekcoder_7b_crossfile_1536_infile_512.log
```

```bash
# RLRetriever
# --retriever_model_path: path to the trained RLCoder retriever, e.g. 'result/RLCoder/retriever_cpkt/result_0'
python main.py \
    --eval \
    --weighted_keywords \
    --enable_generation \
    --inference_type unixcoder_with_rl \
    --generator_model_path deepseek-ai/deepseek-coder-6.7b-base \
    --retriever_model_path 'nov3630/RLRetriever' \
    --generator_max_crossfile_length 1536 \
    --generator_max_context_length 2048 \
    --generator_batch_size_per_gpu 16 \
    --output_dir result_infer/RLCoder_deepseekcoder_7b_crossfile_1536_infile_512 \
    2>&1 | tee log_infer/RLCoder_deepseekcoder_7b_crossfile_1536_infile_512.log
```

### RQ3

Ablation study.

```bash
# w/o RL
python main.py \
    --eval \
    --weighted_keywords \
    --enable_generation \
    --inference_type unixcoder \
    --generator_model_path deepseek-ai/deepseek-coder-6.7b-base \
    --generator_max_crossfile_length 1536 \
    --generator_max_context_length 2048 \
    --generator_batch_size_per_gpu 16 \
    --output_dir result_infer/Ablation_RL_deepseekcoder_7b_crossfile_1536_infile_512 \
    2>&1 | tee log_infer/Ablation_RL_deepseekcoder_7b_crossfile_1536_infile_512.log
```
```bash
# w/o WP
# --retriever_model_path: fill in the path to the retriever trained without weighted ppl, e.g. 'result/Ablation_weight/retriever_cpkt/result_0'
python main.py \
    --eval \
    --enable_generation \
    --inference_type unixcoder_with_rl \
    --generator_model_path deepseek-ai/deepseek-coder-6.7b-base \
    --retriever_model_path '' \
    --generator_max_crossfile_length 1536 \
    --generator_max_context_length 2048 \
    --generator_batch_size_per_gpu 16 \
    --output_dir result_infer/Ablation_WP_deepseekcoder_7b_crossfile_1536_infile_512 \
    2>&1 | tee log_infer/Ablation_WP_deepseekcoder_7b_crossfile_1536_infile_512.log
```

```bash
# w/o NC
# --retriever_model_path: fill in the path to the retriever trained with fixed blocks, e.g. 'result/Ablation_candidate/retriever_cpkt/result_0'
python main.py \
    --eval \
    --weighted_keywords \
    --enable_generation \
    --enable_fixed_block \
    --inference_type unixcoder_with_rl \
    --generator_model_path deepseek-ai/deepseek-coder-6.7b-base \
    --retriever_model_path '' \
    --generator_max_crossfile_length 1536 \
    --generator_max_context_length 2048 \
    --generator_batch_size_per_gpu 16 \
    --output_dir result_infer/Ablation_candidate_deepseekcoder_7b_crossfile_1536_infile_512 \
    2>&1 | tee log_infer/Ablation_candidate_deepseekcoder_7b_crossfile_1536_infile_512.log
```

```bash
# w/o SS
# --retriever_model_path: fill in the path to the retriever trained without the stop signal, e.g. 'result/Ablation_stop_signal/retriever_cpkt/result_0'
python main.py \
    --eval \
    --weighted_keywords \
    --enable_generation \
    --disable_stop_block \
    --inference_type unixcoder_with_rl \
    --generator_model_path deepseek-ai/deepseek-coder-6.7b-base \
    --retriever_model_path '' \
    --generator_max_crossfile_length 1536 \
    --generator_max_context_length 2048 \
    --generator_batch_size_per_gpu 16 \
    --output_dir result_infer/Ablation_stop_signal_deepseekcoder_7b_crossfile_1536_infile_512 \
    2>&1 | tee log_infer/Ablation_stop_signal_deepseekcoder_7b_crossfile_1536_infile_512.log
```

For w/o SS on GitHubEval, just replace the backbone LLMs above.

### RQ4

RepoCoder + RLCoder.

```bash
# RepoCoder + RLCoder
# --retriever_model_path: fill in the path to the trained RepoCoder_rl retriever, e.g. 'result/RepoCoder_rl/retriever_cpkt/result_0'
python main.py \
    --eval \
    --weighted_keywords \
    --enable_generation \
    --enable_repocoder \
    --inference_type unixcoder_with_rl \
    --generator_model_path deepseek-ai/deepseek-coder-6.7b-base \
    --retriever_model_path '' \
    --rlcoder_model_path 'nov3630/RLRetriever' \
    --generator_max_crossfile_length 1536 \
    --generator_max_context_length 2048 \
    --generator_batch_size_per_gpu 16 \
    --output_dir result_infer/RepoCoder_RLCoder_deepseekcoder_7b_crossfile_1536_infile_512 \
    2>&1 | tee log_infer/RepoCoder_RLCoder_deepseekcoder_7b_crossfile_1536_infile_512.log
```

## Case Study

![Case_Study](asset/Case_Study.jpg)

## Citation

**BibTeX:**

```bibtex
@misc{wang2024rlcoderreinforcementlearningrepositorylevel,
      title={RLCoder: Reinforcement Learning for Repository-Level Code Completion},
      author={Yanlin Wang and Yanli Wang and Daya Guo and Jiachi Chen and Ruikai Zhang and Yuchi Ma and Zibin Zheng},
      year={2024},
      eprint={2407.19487},
      archivePrefix={arXiv},
      primaryClass={cs.SE},
      url={https://arxiv.org/abs/2407.19487},
}
```