Add links to blog post (#2)
* Add links to blog post
* minor fix
* Fix links
* Fix image link

# SPOTER Embeddings

This repository contains code for the Spoter embedding model explained in [this blog post](https://blog.xmartlabs.com/blog/machine-learning-sign-language-recognition/).
The model is heavily based on [Spoter](https://github.com/matyasbohacek/spoter) which was presented in
[Sign Pose-Based Transformer for Word-Level Sign Language Recognition](https://openaccess.thecvf.com/content/WACV2022W/HADCV/html/Bohacek_Sign_Pose-Based_Transformer_for_Word-Level_Sign_Language_Recognition_WACVW_2022_paper.html) with one of the main modifications being
that this is an embedding model instead of a classification model.
This allows for several zero-shot tasks on unseen Sign Language datasets from around the world.
More details about this are shown in the blog post mentioned above.
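
To illustrate what zero-shot use of an embedding model can look like (the vectors and helper below are invented for this example and are not SPOTER outputs), signs from an unseen dataset can be matched against a few labelled reference embeddings by cosine similarity:

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def classify(query, references):
    # Return the label whose reference embedding is closest to `query`.
    return max(references, key=lambda label: cosine(query, references[label]))

# Toy embeddings standing in for model outputs on an unseen dataset.
references = {
    "hello": [0.9, 0.1, 0.0],
    "thanks": [0.1, 0.8, 0.3],
}
print(classify([0.85, 0.2, 0.05], references))  # hello
```

No retraining is needed for new signs: adding a class is just adding one more labelled reference embedding.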
## Modifications on [SPOTER](https://github.com/matyasbohacek/spoter)
Here is a list of the main modifications made on Spoter code and model architecture:

…is therefore an embedding vector that can be used for several downstream tasks.

* Some code refactoring to accommodate new classes we implemented.
* Minor code fix when using rotate augmentation to avoid exceptions.
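
For context on the rotate augmentation mentioned above, here is a minimal sketch of rotating 2D keypoints about the frame centre (illustrative only; the repository's implementation differs in details such as angle sampling and coordinate handling):

```python
import math

def rotate_keypoints(points, degrees, center=(0.5, 0.5)):
    # Rotate (x, y) keypoints around `center` by `degrees` (counter-clockwise).
    rad = math.radians(degrees)
    cos_a, sin_a = math.cos(rad), math.sin(rad)
    cx, cy = center
    out = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        out.append((cx + dx * cos_a - dy * sin_a,
                    cy + dx * sin_a + dy * cos_a))
    return out

# A 90-degree rotation about the frame centre.
print(rotate_keypoints([(1.0, 0.5)], 90))  # ≈ [(0.5, 1.0)]
```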

_(1).gif)
## Results

This is done using the model trained on the WLASL100 dataset only…


## Get Started

Install the dependencies with `pip install -r requirements.txt`.

To train the model, run `train.sh` in Docker or your virtual env.

The hyperparameters with their descriptions can be found in the [training/train_arguments.py](/training/train_arguments.py) file.
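
A training-arguments file of this kind is typically a standard `argparse` definition; the excerpt below is hypothetical and only shows the shape (the actual argument names, defaults, and descriptions are the ones in `training/train_arguments.py`):

```python
import argparse

def build_parser():
    # Hypothetical subset of training arguments; the real names and
    # defaults live in training/train_arguments.py.
    parser = argparse.ArgumentParser(description="Train the SPOTER embedding model")
    parser.add_argument("--epochs", type=int, default=100,
                        help="number of training epochs")
    parser.add_argument("--lr", type=float, default=1e-3,
                        help="optimizer learning rate")
    parser.add_argument("--batch_size", type=int, default=32,
                        help="mini-batch size")
    return parser

args = build_parser().parse_args(["--epochs", "10", "--lr", "0.0005"])
print(args.epochs, args.lr, args.batch_size)  # 10 0.0005 32
```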
## Data

This makes our model lightweight and able to run in real-time…



For ready-to-use datasets, refer to the [Spoter](https://github.com/matyasbohacek/spoter) repository.

For best results, we recommend building your own dataset by downloading a Sign Language video dataset such as [WLASL](https://dxli94.github.io/WLASL/) and then using the `extract_mediapipe_landmarks.py` and `create_wlasl_landmarks_dataset.py` scripts to create a body-keypoints dataset that can be used to train the Spoter embeddings model.
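
At its core, the dataset-building step turns per-frame landmarks into flat feature rows. The sketch below illustrates this under the assumption that each frame is a list of (x, y) landmark pairs; the actual on-disk format produced by the repository scripts may differ:

```python
def flatten_video_landmarks(frames):
    # frames: one list of (x, y) landmark pairs per video frame.
    # Returns one flat feature row per frame: [x0, y0, x1, y1, ...].
    rows = []
    for landmarks in frames:
        row = []
        for x, y in landmarks:
            row.extend((x, y))
        rows.append(row)
    return rows

frames = [[(0.1, 0.2), (0.3, 0.4)], [(0.5, 0.6), (0.7, 0.8)]]
print(flatten_video_landmarks(frames))  # [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
```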
You can run these scripts as follows:

The **code** is published under the [Apache License 2.0](./LICENSE) which allows … relevant License and copyright notice is included, our work is cited and all changes are stated.

The license for the [WLASL](https://arxiv.org/pdf/1910.11006.pdf) and [LSA64](https://core.ac.uk/download/pdf/76495887.pdf) datasets used for experiments is, however, the [Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)](https://creativecommons.org/licenses/by-nc/4.0/) license, which allows only non-commercial usage.