COTR: Correspondence Transformer for Matching Across Images

This repository contains the inference code for COTR; we plan to release the training code in the future. COTR establishes correspondences in a functional, end-to-end fashion, and solves both dense and sparse correspondence problems in the same framework.

Demos

Check out our demo video here.

1. Install environment

Our implementation is based on PyTorch. Install the conda environment by: conda env create -f environment.yml.

Activate the environment by: conda activate cotr_env.

Note that we use scipy=1.2.1.

2. Download the pretrained weights

Download the pretrained weights here. Extract them into ./out, so that the weights file is at ./out/default/checkpoint.pth.tar.

3. Single image pair demo

python demo_single_pair.py --load_weights="default"

Example sparse output:

Example dense output with triangulation:

Note: This example uses 10K valid sparse correspondences to densify.
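Densifying sparse matches can be done by linearly interpolating them over a Delaunay triangulation of the query points. The sketch below is a generic illustration, not the repository's actual implementation: the `densify` helper and its signature are hypothetical, and scipy's `griddata` (which triangulates internally) stands in for whatever the demo uses.

```python
import numpy as np
from scipy.interpolate import griddata

def densify(sparse_a, sparse_b, H, W):
    """Interpolate sparse matches into a dense correspondence map.

    sparse_a: (N, 2) query points (x, y) in image A.
    sparse_b: (N, 2) matched points (x', y') in image B.
    Returns an (H, W, 2) map giving an interpolated (x', y') for every
    pixel of A; pixels outside the convex hull of sparse_a are NaN.
    """
    ys, xs = np.mgrid[0:H, 0:W]
    grid = np.stack([xs.ravel(), ys.ravel()], axis=-1).astype(np.float64)
    # method="linear" triangulates sparse_a (Delaunay) and interpolates
    # each output channel over the triangles.
    dense_x = griddata(sparse_a, sparse_b[:, 0], grid, method="linear")
    dense_y = griddata(sparse_a, sparse_b[:, 1], grid, method="linear")
    return np.stack([dense_x, dense_y], axis=-1).reshape(H, W, 2)
```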

4. Facial landmarks demo

python demo_face.py --load_weights="default"

Example:

5. Homography demo

python demo_homography.py --load_weights="default"
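Correspondences from COTR (or any matcher) can be turned into a homography. As a minimal illustration, not the demo's code, here is a plain normalized-free DLT estimator in NumPy; in practice a RANSAC-wrapped estimator (e.g. OpenCV's findHomography) is preferable, since real matches contain outliers.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate a 3x3 homography H mapping src -> dst via the DLT.

    src, dst: (N, 2) arrays of matched (x, y) points, N >= 4,
    no three of the src points collinear.
    """
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each match contributes two linear constraints on the 9 entries of H.
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(A)
    # H is the null vector of A: the right singular vector with the
    # smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]
```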

Citation

If you use this code in your research, please cite the paper:

@article{jiang2021cotr,
  title={{COTR: Correspondence Transformer for Matching Across Images}},
  author={Wei Jiang and Eduard Trulls and Jan Hosang and Andrea Tagliasacchi and Kwang Moo Yi},
  journal={arXiv preprint arXiv:2103.14167},
  year={2021}
}
Owner
UBC Computer Vision Group, University of British Columbia
Comments
  • find the coordinates of the corresponding point (x', y') on another picture.

    Thank you for the outstanding work. I would like to ask whether it is possible to input the coordinates of a point (x, y) and obtain the coordinates of the corresponding point (x', y') in another image.

  • How is the warped image in Figure 9 generated?

    Hi, thanks for the great work! I'm curious about how you generate the warped image in Figure 9 from dense flow. If I understand correctly, you input a pixel coordinate (x, y) in img1 and get its corresponding coordinate (x', y') in img2. Then you copy the RGB at (x, y) to (x', y') in img2, and repeat this for all coordinates in img1. Am I correct? Or is there a more efficient way of doing so (like the one mentioned in #28)?
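    That copy-per-pixel approach is forward warping ("splatting"), and it vectorizes cleanly. The sketch below is a hypothetical helper, not the authors' code; it rounds target coordinates, ignores occlusions (last write wins), and leaves holes where no source pixel lands.

```python
import numpy as np

def forward_warp(img1, flow):
    """Forward-warp img1 into img2's frame using a dense correspondence map.

    img1: (H, W, 3) image; flow: (H, W, 2) giving, for each pixel (y, x)
    of img1, the matching (x', y') in img2. Unmapped targets stay zero.
    """
    H, W = img1.shape[:2]
    out = np.zeros_like(img1)
    ys, xs = np.mgrid[0:H, 0:W]
    # Round the target coordinates to the nearest pixel.
    xp = np.round(flow[..., 0]).astype(int)
    yp = np.round(flow[..., 1]).astype(int)
    # Drop correspondences that fall outside the target image.
    valid = (xp >= 0) & (xp < W) & (yp >= 0) & (yp < H)
    out[yp[valid], xp[valid]] = img1[ys[valid], xs[valid]]
    return out
```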

  • Question

    What does the dense correspondence map in Figure 1 mean? How is it obtained, and how is it represented numerically? I only know that it is the dense correspondence between the two images. What does the color-coded 'x' channel mean?

  • Possible redundancy in the code

    Hi, I notice that when constructing the Transformer, you always return the intermediate features at this line. However, after feeding them to MLP for corr regression, you only take the prediction over the last layer at this line. So I guess maybe you can set return_intermediate=False to save some memory/computation?

  • Dense optical flow as in paper Figure 1 (c)

    Hi, thanks for the great work! I wonder how I can estimate the optical flow between two images. Say img1 is of shape [H, W]; can I simply reshape the grid coordinates to [H*W, 2] and input them as queries_a, as in this demo?
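    Building that [H*W, 2] grid of queries is straightforward. Assuming the queries are pixel (x, y) coordinates in row-major order (an assumption; check how the demo actually constructs queries_a), a sketch:

```python
import numpy as np

def make_query_grid(H, W):
    """Build an (H*W, 2) array of (x, y) pixel coordinates, row-major,
    one query per pixel of img1."""
    ys, xs = np.mgrid[0:H, 0:W]
    return np.stack([xs.ravel(), ys.ravel()], axis=-1).astype(np.float64)
```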

  • TypeError: 'NoneType' object is not callable

    Thank you very much for your open-source code! When I run "python demo_single_pair.py --load_weights="default"", this error appears. Could you give me some debugging advice?

  • Question

    Hello, when running the code with the pre-trained model, I get RuntimeError: CUDA out of memory. Tried to allocate 2.00 GiB (GPU 0; 7.79 GiB total capacity; 2.90 GiB already allocated; 1.83 GiB free; 4.80 GiB reserved in total by PyTorch). Is there any solution? For example, which parameters should I adjust?

  • Matching time

    Hello, thank you for the excellent work. I have a question: how should I understand querying one point at a time while achieving 35 correspondences per second? "Our currently non-optimized prototype implementation queries one point at a time, and achieves 35 correspondences per second on a NVIDIA RTX 3090 GPU." I have recently been running your code: running demo_single_pair.py on an NVIDIA RTX 3090 GPU, matching took about 30 s. Is this normal? Thanks!

  • Rotation angle

    Hello, I would like to ask: when COTR extracts the common-view area, for some scenes with a very large rotation angle the common-view area cannot be extracted. What is the possible reason for this phenomenon?

  • Match time

    Hello, regarding COTR: if I use another feature-extraction method to obtain the feature point positions of the image and input them, can I reduce the time COTR spends on feature matching?

  • Common view image

    Hello, I had the honor of reading your article. I would like to ask how to obtain the masked common-view image shown in Figure 4 of the paper from the mask matrix.

  • About ETH3D evaluation

    Hi Wei, thanks for sharing the code.

    Would it be possible to provide the ETH3D evaluation code? I was also wondering about the data flow of the model's forward propagation.

    Looking forward to your reply. Regards

  • Sharing raw data of ETH3D and KITTI

    Hi everyone:

    I'd like to share the raw output from COTR for the ETH3D and KITTI datasets.

    ETH3D eval: https://drive.google.com/file/d/1pfAuHRK7FvB6Hc9Rru-beH6F-2lpZAk6/view?usp=sharing

    KITTI: https://drive.google.com/file/d/1SiN5UbqautqosUCInQN2WhyxbRcbWt8b/view?usp=sharing

    The format is {src_id}->{tgt_id}.npy, and I saved the results as dictionaries. There are several keys: "raw_corr", "drifting_forward", and "drifting_backward". "raw_corr" holds the raw sparse correspondences in XYXY format, and "drifting_forward" / "drifting_backward" are masks used to filter out drifted predictions.
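    For reference, pickled dicts saved with np.save can be read back as sketched below. The helper names are hypothetical, and the mask polarity (True = drifted) is an assumption; flip the logic if the masks mark valid points instead.

```python
import numpy as np

def load_cotr_result(path):
    """Load one {src_id}->{tgt_id}.npy file.

    Each file stores a pickled dict, so np.load needs allow_pickle=True,
    and .item() unwraps the 0-d object array back into a dict.
    """
    return np.load(path, allow_pickle=True).item()

def filter_drifted(result):
    """Keep only correspondences not flagged as drifted in either direction.

    ASSUMPTION: masks are True where a prediction drifted.
    """
    corr = np.asarray(result["raw_corr"])
    bad = (np.asarray(result["drifting_forward"]).astype(bool)
           | np.asarray(result["drifting_backward"]).astype(bool))
    return corr[~bad]
```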

  • About HPatches datasets

    Thanks very much for your great work! I would like to know how you test and evaluate on the HPatches dataset (in the code). Could you tell me how to get the relevant code?

  • training

    Hello, I would like to ask whether you use the complete MegaDepth dataset as training data or only a part of it, and, if convenient, could you provide the training data?

  • is demo_single_pair supposed to be slow ?

    Hi, thanks for your great work. I tried running demo_single_pair.py and it took around 350 s to get the correspondences; I'm just wondering whether that's normal, even when running on a GPU?

    thanks Cheng
