---
base_model: intfloat/multilingual-e5-small
language:
- multilingual
library_name: sentence-transformers
license: apache-2.0
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:94
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: 서울여자대학교 수시모집 지원자에게 필요한 최초합격자 발표 정보는 다음과 같습니다. 최초합격자 발표는 2024년 11월
    8일부터 12월 13일까지입니다. 합격자는 본교 입학처 홈페이지에서 합격 여부를 확인하여야 하며, 등록기간 내에 등록을 마쳐야 합니다.
  sentences:
  - SWU의 SI(Social Innovation)교육에 대해 알려줘.
  - 학교생활기록부 교과성적 반영방법을 설명해 주세요.
  - 서울여자대학교 수시모집 지원자에게 필요한 최초합격자 발표 정보를 알려줘.
- source_sentence: 고등학교 졸업(예정)자의 경우 학교생활기록부 제출 방법은 다음과 같습니다. 원본 대조필 및 학교장 직인 날인 후
    제출하여야 합니다. 외국 고등학교 졸업(예정)자의 경우는 한국어나 영어로 번역 공증받은 문서를 제출하여야 합니다.
  sentences:
  - 언론영상학부-저널리즘전공의 졸업 후 진로는 무엇입니까?
  - 서울여자대학교에 있는 박물관학전공의 교육 내용을 설명해줘.
  - 고등학교 졸업(예정)자의 경우 학교생활기록부 제출 방법을 설명해줘.
- source_sentence: 심리·인지과학학부-인지학습과학전공의 졸업 후 진로는 교육프로그램 개발자, 교육기업 데이터 분석 업무, 인지학습 치료사,
    인지행동 치료사, 교육컨설턴트, 국가연구소, 이러닝 관련 산업분야 등입니다.
  sentences:
  - 서울여자대학교에 있는 예술심리치료전공의 목표를 설명해줘.
  - 서울여자대학교 수시모집 지원자에게 필요한 교과성적 산출 방법을 설명해줘.
  - 심리·인지과학학부-인지학습과학전공의 졸업 후 진로를 설명하세요.
- source_sentence: 2024학년도 서울여자대학교 수시모집 지원자에게 필요한 정보는 다음과 같습니다. 수시모집 지원기간은 2024년 9월
    10일부터 9월 13일까지입니다. 지원자는 인터넷 입학원서접수 사이트에 접속하여 원서접수를 완료해야 하며, 전형료 결제는 신용카드, 계좌이체
    등으로 가능합니다. 또한, 지원자는 제출서류를 등기우편으로 제출하여야 하며, 서류제출 마감일은 2024년 9월 13일입니다.
  sentences:
  - 박물관학전공의 교육 목표는 무엇입니까?
  - 2024학년도 서울여자대학교 수시모집 지원자에게 필요한 정보를 알려줘.
  - 학생부종합 전형으로 지원할 수 있는 전형의 유형을 모두 알려줘
- source_sentence: 학교생활기록부 교과성적 대체 점수(비교내신) 대상자는 논술(논술우수자전형), 실기/실적(실기우수자전형_체육) 지원자
    중 고등학교 졸업학력 검정고시 출신 지원자 및 교과성적 산출 불가자입니다.
  sentences:
  - 고등학교 학교생활기록부 제출 방법을 설명하세요.
  - 청소년학전공의 교육 내용은 무엇입니까?
  - 학교생활기록부 교과성적 대체 점수(비교내신) 대상자를 알려줘.
model-index:
- name: Multilingual base SWU Matryoshka
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_accuracy@1
      value: 0.6363636363636364
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9090909090909091
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.6363636363636364
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.30303030303030304
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.2
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.1
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.6363636363636364
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.9090909090909091
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8475878017079786
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7954545454545454
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.7954545454545454
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 128
      type: dim_128
    metrics:
    - type: cosine_accuracy@1
      value: 0.6363636363636364
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9090909090909091
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.6363636363636364
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.30303030303030304
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.2
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.1
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.6363636363636364
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.9090909090909091
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8475878017079786
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7954545454545454
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.7954545454545454
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 64
      type: dim_64
    metrics:
    - type: cosine_accuracy@1
      value: 0.6363636363636364
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9090909090909091
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.6363636363636364
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.30303030303030304
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.2
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.1
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.6363636363636364
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.9090909090909091
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8356850968378461
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7803030303030302
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.7803030303030302
      name: Cosine Map@100
---

# Multilingual base SWU Matryoshka

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [intfloat/multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small) on a JSON dataset of 94 anchor/positive pairs. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. It was trained with MatryoshkaLoss at 256, 128, and 64 dimensions and evaluated for retrieval at each of those truncated sizes.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [intfloat/multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small) <!-- at revision fd1525a9fd15316a2d503bf26ab031a61d056e98 -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - json
- **Language:** multilingual
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("ValentinaKim/Multilingual-base-SWU-Matryoshka")
# Run inference
sentences = [
    '학교생활기록부 교과성적 대체 점수(비교내신) 대상자는 논술(논술우수자전형), 실기/실적(실기우수자전형_체육) 지원자 중 고등학교 졸업학력 검정고시 출신 지원자 및 교과성적 산출 불가자입니다.',
    '학교생활기록부 교과성적 대체 점수(비교내신) 대상자를 알려줘.',
    '청소년학전공의 교육 내용은 무엇입니까?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
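
Because the model was trained with `MatryoshkaLoss` at 256, 128, and 64 dimensions, the leading components of each embedding are meant to stay useful on their own. The sketch below illustrates the usual recipe (keep the first `k` values, then L2-normalize again) on made-up toy vectors rather than real model output. The helper names `truncate_and_normalize` and `dot` are hypothetical. With Sentence Transformers itself, truncation can be requested by passing `truncate_dim` when constructing `SentenceTransformer`.

```python
import math


def truncate_and_normalize(vec, k):
    """Keep the first k components and rescale the result to unit length."""
    head = vec[:k]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]


def dot(a, b):
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


# Toy 8-dimensional "embeddings" (illustrative values, not model output).
doc = [0.5, 0.4, 0.3, 0.2, 0.1, 0.1, 0.0, 0.1]
query = [0.4, 0.5, 0.2, 0.3, 0.0, 0.2, 0.1, 0.0]

for k in (8, 4, 2):
    d = truncate_and_normalize(doc, k)
    q = truncate_and_normalize(query, k)
    print(k, round(dot(d, q), 4))
```

Smaller `k` reduces storage and search cost; the evaluation tables in this card report retrieval quality at each trained size.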

## Evaluation

### Metrics

#### Information Retrieval
* Dataset: `dim_256`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.6364     |
| cosine_accuracy@3   | 0.9091     |
| cosine_accuracy@5   | 1.0        |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.6364     |
| cosine_precision@3  | 0.303      |
| cosine_precision@5  | 0.2        |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.6364     |
| cosine_recall@3     | 0.9091     |
| cosine_recall@5     | 1.0        |
| cosine_recall@10    | 1.0        |
| cosine_ndcg@10      | 0.8476     |
| cosine_mrr@10       | 0.7955     |
| **cosine_map@100**  | **0.7955** |

#### Information Retrieval
* Dataset: `dim_128`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.6364     |
| cosine_accuracy@3   | 0.9091     |
| cosine_accuracy@5   | 1.0        |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.6364     |
| cosine_precision@3  | 0.303      |
| cosine_precision@5  | 0.2        |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.6364     |
| cosine_recall@3     | 0.9091     |
| cosine_recall@5     | 1.0        |
| cosine_recall@10    | 1.0        |
| cosine_ndcg@10      | 0.8476     |
| cosine_mrr@10       | 0.7955     |
| **cosine_map@100**  | **0.7955** |

#### Information Retrieval
* Dataset: `dim_64`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.6364     |
| cosine_accuracy@3   | 0.9091     |
| cosine_accuracy@5   | 1.0        |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.6364     |
| cosine_precision@3  | 0.303      |
| cosine_precision@5  | 0.2        |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.6364     |
| cosine_recall@3     | 0.9091     |
| cosine_recall@5     | 1.0        |
| cosine_recall@10    | 1.0        |
| cosine_ndcg@10      | 0.8357     |
| cosine_mrr@10       | 0.7803     |
| **cosine_map@100**  | **0.7803** |
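
The tables above can be read as functions of where the single relevant passage lands in each query's ranking. The sketch below is a worked example under stated assumptions: the card does not give the evaluation set size or per-query ranks, so the 11 queries (suggested by 0.6364 ≈ 7/11 and 0.9091 ≈ 10/11) and the rank lists are hypothetical values chosen only because they reproduce the reported numbers. `ir_metrics` is an illustrative helper, not the `InformationRetrievalEvaluator` implementation.

```python
import math


def ir_metrics(ranks, ks=(1, 3, 5, 10)):
    """Metrics for queries that each have exactly one relevant document.

    ranks[i] is the 1-based position of the relevant document for query i.
    """
    n = len(ranks)
    out = {}
    for k in ks:
        hits = sum(1 for r in ranks if r <= k)
        out[f"accuracy@{k}"] = hits / n
        out[f"recall@{k}"] = hits / n  # one relevant doc, so same as accuracy
        out[f"precision@{k}"] = hits / (n * k)
    out["mrr@10"] = sum(1 / r for r in ranks if r <= 10) / n
    # With a single relevant document the ideal DCG is 1, so NDCG equals DCG.
    out["ndcg@10"] = sum(1 / math.log2(r + 1) for r in ranks if r <= 10) / n
    return out


# Hypothetical rank lists (assumptions, not taken from the card).
ranks_256 = [1] * 7 + [2] * 3 + [4]
ranks_64 = [1] * 7 + [2] * 2 + [3, 4]

for name, ranks in (("dim_256", ranks_256), ("dim_64", ranks_64)):
    metrics = ir_metrics(ranks)
    print(name, {key: round(val, 4) for key, val in metrics.items()})
```

Under the same one-relevant-document assumption, average precision for a query is the reciprocal of its rank, which would explain why `cosine_map@100` and `cosine_mrr@10` coincide in every table.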

## Training Details

### Training Dataset

#### json

* Dataset: json
* Size: 94 training samples
* Columns: <code>positive</code> and <code>anchor</code>
* Approximate statistics based on the first 94 samples:
  |         | positive                                                                            | anchor                                                                             |
  |:--------|:------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
  | type    | string                                                                              | string                                                                             |
  | details | <ul><li>min: 24 tokens</li><li>mean: 89.93 tokens</li><li>max: 272 tokens</li></ul> | <ul><li>min: 10 tokens</li><li>mean: 19.18 tokens</li><li>max: 35 tokens</li></ul> |
* Samples:
  | positive                                                                                                                                                                               | anchor                                               |
  |:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------|
  | <code>서울여자대학교 수시모집에서 평가하는 요소는 다음과 같습니다. 1. 서류 평가(학업역량 40%, 진로역량 35%, 공동체역량 25%) 2. 면접 평가(인성 및 의사소통능력, 발전가능성) 3. 학교생활기록부에 학교폭력 관련 기재사항이 있을 경우, 정성평가로 반영합니다.</code>                      | <code>서울여자대학교 수시모집에서 평가하는 요소를 알려줘.</code>            |
  | <code>서울여자대학교 학생부종합전형 지원자에게 필요한 지원자격 정보는 다음과 같습니다. 지원자격은 기초생활수급자, 차상위계층, 한부모가족 지원대상자, 국가보훈대상자, 자립지원 대상 아동, 농어촌학생 등입니다. 각 지원자격에 따라 필요한 제출서류가 다르므로, 지원자격에 따라 필요한 제출서류를 확인하여야 합니다.</code> | <code>서울여자대학교 학생부종합전형 지원자에게 필요한 지원자격 정보를 알려줘.</code> |
  | <code>SWU의 SI(Social Innovation)교육은 사회적 가치 확산을 위해 혁신적인 방법론을 적용하여 긍정적인 사회 변화를 유도하는 서울여자대학교만의 차별화된 교육입니다. 바롬종합설계프로젝트는 유네스코한국위원회가 인증한 유네스코지속가능발전교육공식프로젝트입니다.</code>                       | <code>SWU의 SI(Social Innovation)교육에 대해 알려줘.</code>   |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          256,
          128,
          64
      ],
      "matryoshka_weights": [
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```
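
To make the configuration above concrete, here is a minimal pure-Python sketch of what this loss combination computes: an in-batch-negatives cross-entropy over scaled cosine similarities, evaluated on embeddings truncated to each Matryoshka dimension and summed with the given weights. It is a simplified illustration, not the library code; `cos_sim`, `mnr_loss`, and `matryoshka_loss` are hypothetical helper names, the vectors are toy values, and `scale=20.0` is an assumed setting.

```python
import math


def cos_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, scale=20.0):
    """In-batch-negatives loss: positives[i] is the target for anchors[i],
    and every other positive in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        log_z = math.log(sum(math.exp(z) for z in logits))
        total += log_z - logits[i]  # cross-entropy with label i
    return total / len(anchors)


def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the base loss over embeddings truncated to each dim."""
    return sum(
        w * mnr_loss([a[:d] for a in anchors], [p[:d] for p in positives])
        for d, w in zip(dims, weights)
    )


# Toy 4-dimensional batch; the card's real setting is dims [256, 128, 64].
anchors = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
positives = [[0.9, 0.1, 0.3, 0.0], [0.1, 0.8, 0.0, 0.4]]
print(matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1]))
```

Because every truncated prefix contributes its own loss term, the first 64, 128, and 256 components are each pushed to rank the correct passage highest on their own.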

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `gradient_accumulation_steps`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 4
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `tf32`: False
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates
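
With only 94 training samples, these settings imply very few optimizer updates. The arithmetic below is a rough sketch assuming a single device, the `per_device_train_batch_size` of 8 listed under All Hyperparameters, and no batches dropped by the `no_duplicates` sampler; the actual step count may differ.

```python
import math

# Values taken from the hyperparameter lists in this card.
num_samples = 94
per_device_train_batch_size = 8
gradient_accumulation_steps = 16
num_train_epochs = 4

# Assumption: one device, so the per-device batch is the whole forward batch.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps
batches_per_epoch = math.ceil(num_samples / per_device_train_batch_size)
# An epoch shorter than the accumulation window still yields one update.
updates_per_epoch = max(batches_per_epoch // gradient_accumulation_steps, 1)
total_updates = updates_per_epoch * num_train_epochs

print(effective_batch_size, batches_per_epoch, updates_per_epoch, total_updates)
# → 128 12 1 4
```

Gradient accumulation enlarges the batch used for each weight update, but `MultipleNegativesRankingLoss` still draws its negatives from within each forward batch, so every anchor is contrasted against at most 7 other passages at a time.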

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 8
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 16
- `eval_accumulation_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 4
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: False
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
| Epoch   | Step  | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_64_cosine_map@100 |
|:-------:|:-----:|:----------------------:|:----------------------:|:---------------------:|
| **1.0** | **1** | **0.7955**             | **0.7955**             | **0.7803**            |
| 2.0     | 2     | 0.7955                 | 0.7955                 | 0.7803                |
| 3.0     | 4     | 0.7955                 | 0.7955                 | 0.7803                |

* The bold row denotes the saved checkpoint.

### Framework Versions
- Python: 3.10.14
- Sentence Transformers: 3.1.1
- Transformers: 4.41.2
- PyTorch: 2.1.2+cu121
- Accelerate: 0.34.2
- Datasets: 2.19.1
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
