# multihead-siamese-nets

**Repository Path**: mirrors_opencollective/multihead-siamese-nets

## Basic Information

- **Project Name**: multihead-siamese-nets
- **Description**: Implementation of Siamese Neural Networks built upon a multihead attention mechanism for the text semantic similarity task.
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-09-25
- **Last Updated**: 2026-03-22

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Siamese Deep Neural Networks for semantic similarity

This repository contains implementations of Siamese Neural Networks in TensorFlow built on 3 different major deep learning architectures:

- Convolutional Neural Networks
- Recurrent Neural Networks
- Multihead Attention Networks

The main reason for creating this repository is to compare well-known implementations of Siamese Neural Networks available on GitHub, mainly built upon CNN and RNN architectures, with a Siamese Neural Network built on the multihead attention mechanism originally proposed in the Transformer model from the [Attention Is All You Need](https://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf) paper.
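To give an intuition for the core building block, the sketch below implements scaled dot-product multihead self-attention in plain NumPy. It is a minimal illustration, not the repository's TensorFlow code: the random projection matrices stand in for learned weights, and residual connections and dropout (both configurable in this repo) are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multihead_attention(x, num_heads, rng):
    # x: (seq_len, d_model); self-attention, so queries = keys = values = x
    seq_len, d_model = x.shape
    d_head = d_model // num_heads
    # random projections stand in for learned parameter matrices
    w_q, w_k, w_v = (rng.standard_normal((d_model, d_model)) for _ in range(3))
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # split into heads: (num_heads, seq_len, d_head)
    split = lambda t: t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    q, k, v = split(q), split(k), split(v)
    # scaled dot-product attention, applied per head in parallel
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    out = softmax(scores) @ v                # (num_heads, seq_len, d_head)
    # concatenate heads back to (seq_len, d_model)
    return out.transpose(1, 0, 2).reshape(seq_len, d_model)

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 64))             # 5 tokens, embedding_size = 64
y = multihead_attention(x, num_heads=8, rng=rng)
print(y.shape)  # (5, 64)
```

The output has the same shape as the input, which is what lets attention blocks be stacked (`num_blocks` in the model configuration below).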
# Supported datasets

The current version of the pipeline supports **3** datasets:

- [The Stanford Natural Language Inference (SNLI) Corpus](https://nlp.stanford.edu/projects/snli/)
- [Quora Question Pairs](https://www.kaggle.com/c/quora-question-pairs)
- :new: Adversarial Natural Language Inference (ANLI) benchmark: [GitHub](https://github.com/facebookresearch/anli/), [arXiv](https://arxiv.org/pdf/1910.14599.pdf)

# Installation

### Data preparation

To download the data, execute the following commands (this process can take a while depending on your network throughput):

```
cd bin
chmod a+x prepare_data.sh
./prepare_data.sh
```

As a result of executing the above script, a **corpora** directory will be created with the **QQP**, **SNLI**, and **ANLI** data.

### Dependency installation

This project was developed in and has been tested on **Python 3.6**. The package requirements are stored in the **requirements** folder.

For **GPU** usage, execute:

```
pip install -r requirements/requirements-gpu.txt
```

and for **CPU** usage:

```
pip install -r requirements/requirements-cpu.txt
```

# Training models

To train a model, run the following command:

```
python3 run.py train SELECTED_MODEL SELECTED_DATASET --experiment_name NAME --gpu GPU_NUMBER
```

where **SELECTED_MODEL** is one of:

- cnn
- rnn
- multihead

and **SELECTED_DATASET** is one of:

- SNLI
- QQP
- ANLI

**--experiment_name** is an optional argument used to indicate the experiment name. The default value is **{SELECTED_MODEL}_{EMBEDDING_SIZE}**.

**--gpu** is an optional argument; use it to indicate a specific GPU on your machine (the default value is '0').
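The actual argument handling in `run.py` is not shown here; as a hedged sketch of the command-line interface described above, the train invocation could be parsed with `argparse` roughly like this (names and defaults are assumptions based on the README, not the repo's confirmed code):

```python
import argparse

# hypothetical parser mirroring the documented train command
parser = argparse.ArgumentParser(description='Siamese nets training pipeline.')
parser.add_argument('mode', choices=['train'])
parser.add_argument('model', choices=['cnn', 'rnn', 'multihead'])
parser.add_argument('dataset', choices=['SNLI', 'QQP', 'ANLI'])
# per the README, the real default is derived as {SELECTED_MODEL}_{EMBEDDING_SIZE}
parser.add_argument('--experiment_name', default=None)
parser.add_argument('--gpu', default='0')

args = parser.parse_args(['train', 'cnn', 'SNLI', '--gpu', '1'])
print(args.model, args.dataset, args.gpu)  # cnn SNLI 1
```

Invalid choices (e.g. a misspelled model name) would be rejected by `argparse` before any training starts.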
Example (GPU usage): run the following command to train a Siamese Neural Network based on CNN on the SNLI corpus, using GPU 1:

```
python3 run.py train cnn SNLI --gpu 1
```

Example (CPU usage): run the following command to train a Siamese Neural Network based on CNN:

```
python3 run.py train cnn SNLI
```

## Training configuration

This repository contains the main training configuration file at **'config/main.ini'**:

```ini
[TRAINING]
num_epochs = 10
batch_size = 512
eval_every = 20
learning_rate = 0.001
checkpoints_to_keep = 5
save_every = 100
log_device_placement = false

[DATA]
logs_path = logs
model_dir = model_dir

[PARAMS]
embedding_size = 64
loss_function = mse
```

## Model configuration

Additionally, each model has its own configuration file in which its hyperparameters can be changed.

### Multihead Attention Network configuration file

```ini
[PARAMS]
num_blocks = 2
num_heads = 8
use_residual = False
dropout_rate = 0.0
```

### Convolutional Neural Network configuration file

```ini
[PARAMS]
num_filters = 50,50,50
filter_sizes = 2,3,4
dropout_rate = 0.0
```

### Recurrent Neural Network configuration file

```ini
[PARAMS]
hidden_size = 128
cell_type = GRU
bidirectional = True
```

## Training models with GPU support on Google Colaboratory

If you don't have access to a workstation with a GPU, you can use the example Google Colaboratory notebook below to train your models (CNN, RNN, or Multihead) on the SNLI or QQP datasets using the **NVIDIA Tesla T4 16GB GPU** available in the Google Colaboratory backend: [Multihead Siamese Nets in Google Colab](https://colab.research.google.com/drive/1FUEBV1JkQpF2iwFSDW338nAUhzPVZWAa)

# Testing models

Download the pretrained models from the following link: [pretrained Siamese Nets models](https://drive.google.com/file/d/1STgv1hIxdVpKLQ6-EZK7J3C4ZtfZgbkS/view?usp=sharing), unzip them, and put them into the **./model_dir** directory.
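INI files like these are typically loaded with Python's standard-library `configparser`; the snippet below shows how the `main.ini` values above can be read and converted to the right types. This is an illustration under that assumption — the repository's actual config-loading code may differ.

```python
import configparser

# a fragment of config/main.ini, inlined here for a self-contained example
MAIN_INI = """
[TRAINING]
num_epochs = 10
batch_size = 512
learning_rate = 0.001

[PARAMS]
embedding_size = 64
loss_function = mse
"""

config = configparser.ConfigParser()
config.read_string(MAIN_INI)          # for a real file: config.read('config/main.ini')

# typed accessors convert the raw strings to int/float
num_epochs = config.getint('TRAINING', 'num_epochs')
learning_rate = config.getfloat('TRAINING', 'learning_rate')
loss_function = config.get('PARAMS', 'loss_function')
print(num_epochs, learning_rate, loss_function)  # 10 0.001 mse
```

The model-specific files (e.g. `use_residual = False` for the multihead model) can be read the same way, using `config.getboolean` for flags.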
After that, you can test models either using the predict mode of the pipeline:

```bash
python3 run.py predict cnn
```

or using the GUI demo:

```bash
python3 gui_demo.py
```

The pictures below present the Multihead Siamese Nets GUI for:

1. Positive example:
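The README does not spell out how the two encoder outputs are turned into a similarity score. A common choice for Siamese text models (an assumption here, not this repository's confirmed method) is an exponentiated negative Manhattan distance, which maps any pair of embeddings to a score in (0, 1] and pairs naturally with the **mse** loss from `main.ini`:

```python
import numpy as np

def manhattan_similarity(a, b):
    # exp(-L1 distance): 1.0 for identical embeddings, -> 0 as they diverge
    return np.exp(-np.abs(a - b).sum())

def mse_loss(pred, label):
    # squared error between predicted similarity and the 0/1 pair label
    return (pred - label) ** 2

# toy sentence embeddings produced by the two (weight-sharing) encoder towers
a = np.array([0.1, 0.4, 0.2])
b = np.array([0.1, 0.4, 0.2])

sim = manhattan_similarity(a, b)
print(sim)                 # 1.0 for identical embeddings
print(mse_loss(sim, 1.0))  # 0.0 for a correctly scored positive pair
```

A positive pair (label 1) with a high score and a negative pair (label 0) with a low score both yield a small loss, which is what the GUI's positive/negative examples visualize.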