iofu728 committed on
Commit
0830175
1 Parent(s): 962818f

Feature(MInference): update NeurIPS'24

Files changed (1)
  1. app.py +5 -1
app.py CHANGED
@@ -14,7 +14,7 @@ HF_TOKEN = os.environ.get("HF_TOKEN", None)
 
 
  DESCRIPTION = """
- # [MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention](https://aka.ms/MInference) (Under Review, ES-FoMo @ ICML'24)
+ # [MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention](https://aka.ms/MInference) (NeurIPS'24 Spotlight)
 
  _Huiqiang Jiang†, Yucheng Li†, Chengruidong Zhang†, Qianhui Wu, Xufang Luo, Surin Ahn, Zhenhua Han, Amir H. Abdi, Dongsheng Li, Chin-Yew Lin, Yuqing Yang and Lili Qiu_
 
@@ -23,7 +23,11 @@ _Huiqiang Jiang†, Yucheng Li†, Chengruidong Zhang†, Qianhui Wu, Xufang Luo
  <a href="https://arxiv.org/abs/2407.02490" target="blank"> [Paper]</a></h3>
 
  ## News
+ - 🧤 [24/09/26] MInference has been accepted as **spotlight** at **NeurIPS'24**. See you in Vancouver!
+ - 👘 [24/09/16] We are pleased to announce the release of our KV cache offloading work, [RetrievalAttention](https://aka.ms/RetrievalAttention), which accelerates long-context LLM inference via vector retrieval.
+ - 🥤 [24/07/24] MInference support [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) now.
  - 🪗 [24/07/07] Thanks @AK for sponsoring. You can now use MInference online in the [HF Demo](https://huggingface.co/spaces/microsoft/MInference) with ZeroGPU.
+ - 📃 [24/07/03] Due to an issue with arXiv, the PDF is currently unavailable there. You can find the paper at this [link](https://export.arxiv.org/pdf/2407.02490).
  - 🧩 [24/07/03] We will present **MInference 1.0** at the _**Microsoft Booth**_ and _**ES-FoMo**_ at ICML'24. See you in Vienna!
 
  <font color="brown"><b>This is only a deployment demo. You can follow the code below to try MInference locally.</b></font>