Tools and Frameworks
Find the best tool for your needs in this list of tools and frameworks for annotating or labeling images, videos, text (NLP), or audio.
I had trouble getting a good overview of all the data annotation tools and frameworks out there, so I created this list. I will try to keep it up to date. There are many tools, and each one has its advantages and disadvantages.
| Computer Vision | NLP | Audio | Others |
Open Source tools and frameworks | Images, Video, LiDAR, 3D | Text | Audio | Time Series, MultiDomain |
Open Source Annotation and Labeling Tools and Frameworks
Data annotation can be a very tedious task. Luckily, there are many free-to-use tools available on the web. Some of them are open source and allow for modifications.
Here you will find a list of open-source projects grouped by data type!
Computer Vision
Images
- Alturos.ImageAnnotation – A collaborative tool for labeling image data
- Anno-Mage – A semi-automatic image annotation tool that helps you annotate images by suggesting annotations for 80 object classes using a pre-trained model
- CATMAID – Collaborative Annotation Toolkit for Massive Amounts of Image Data
- CVAT – Powerful and efficient Computer Vision Annotation Tool
- deeplabel – A cross-platform image annotation tool for machine learning
- imagetagger – An open-source online platform for collaborative image labeling
- imglab – A web-based tool to label images for objects that can be used to train dlib or other object detectors
- Labelbox – Labelbox is the fastest way to annotate data to build and ship computer vision applications
- labelImg – LabelImg is a graphical image annotation tool for labeling object bounding boxes in images
- labelme – Image Polygonal Annotation with Python
- LOST – Design your own smart Image Annotation process in a web-based environment
- make-sense – makesense.ai is a free-to-use online tool for labeling photos
- MedTagger – A collaborative framework for annotating medical datasets using crowdsourcing.
- OpenLabeler – OpenLabeler is an open-source desktop application for annotating objects for AI applications
- OpenLabeling – Label images and video for Computer Vision applications
- PixelAnnotationTool – Software that allows you to manually and quickly annotate images in directories
- Pixie – Pixie is a GUI annotation tool that provides bounding box, polygon, free drawing, and semantic segmentation object labeling
- turktool – A modern React app for scalable bounding box annotation of images
- VoTT – An open-source annotation and labeling tool for image and video assets
- Yolo_mark – GUI for marking bounding boxes of objects in images for training YOLO v3 and v2 neural networks
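Many of the tools above produce bounding boxes, and Yolo_mark targets YOLO training in particular. As a rough sketch, assuming the commonly used YOLO text label layout (a class index followed by the box center, width, and height, each normalized by the image size), a pixel-space box could be converted as shown here. The function name and the example values are illustrative and are not taken from any of the listed tools.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box to one YOLO-style label line (assumed layout)."""
    # Box center, normalized to the range [0, 1] by the image size.
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    # Box width and height, normalized the same way.
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical box in a 640x480 image, class index 0.
line = to_yolo_line(0, 100, 50, 300, 250, 640, 480)
print(line)
```

Check the export format of the tool you choose before relying on this layout, since several tools use other formats (for example XML or JSON).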
Video
Find a video labeling tool that suits your needs. Some of these tools also have great Python APIs to interface with them.
- Diffgram – Training Data Software for Teams Shipping Deep Learning AI Systems. Track objects through time.
- UltimateLabeling – A multi-purpose Video Labeling GUI in Python with integrated SOTA detector and tracker
- VATIC – VATIC is an online video annotation tool for computer vision research that crowdsources work to Amazon’s Mechanical Turk.
LiDAR
Here we provide a curated list of LiDAR annotation tools.
- semantic-segmentation-editor – Web labeling tool for camera and LiDAR data
3D
- KNOSSOS – KNOSSOS is a software tool for the visualization and annotation of 3D image data and was developed for the rapid reconstruction of neural morphology and connectivity.
NLP
Text
- ML-Annotate – Label text data for machine learning purposes. ML-Annotate supports binary, multi-label and multi-class labeling.
- SMART – Smarter Manual Annotation for Resource-constrained collection of Training data
- TagEditor – Annotation tool for spaCy
- YEDDA – A Lightweight Collaborative Text Span Annotation Tool (Chunking, NER, etc.). ACL best demo nomination.
Audio
Audio
It’s a good idea to use a dedicated audio annotation tool. Such tools allow for easy playback and help you mark timestamps efficiently.
- audio-annotator – A JavaScript interface for annotating and labeling audio files.
- audio-labeler – An in-browser app for labeling audio clips at random, using Docker and Flask.
- EchoML – Play, visualize and annotate your audio files
- peaks.js – Browser-based audio waveform visualization and UI component for interacting with audio waveforms, developed by the BBC.
- wavesurfer.js – Simple annotation tool; check the example.
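Whichever audio tool you pick, its output usually boils down to labeled time ranges. The following is a minimal sketch of that idea, assuming each annotation is a start time, an end time (both in seconds), and a label. The `Segment` class and `find_overlaps` helper are hypothetical names for illustration and are not part of any tool listed above.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # start time in seconds
    end: float    # end time in seconds
    label: str


def find_overlaps(segments):
    """Return label pairs whose time ranges overlap (useful as a sanity check)."""
    ordered = sorted(segments, key=lambda s: s.start)
    overlaps = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            # Segments are sorted by start, so no later segment can overlap a.
            if b.start >= a.end:
                break
            overlaps.append((a.label, b.label))
    return overlaps


# Hypothetical annotations for a four-second clip.
segments = [
    Segment(0.0, 2.5, "speech"),
    Segment(2.0, 3.0, "music"),
    Segment(3.0, 4.0, "silence"),
]
print(find_overlaps(segments))
```

A check like this can help catch accidental double labeling when overlapping segments are not intended.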
Others
Time Series
Here we list a few popular time series annotation tools. These can be used for annotating anomalies or interesting sections in a data stream.
- Curve – Curve is an open-source tool to help label anomalies on time-series data
- TagAnomaly – Anomaly detection analysis and labeling tool, specifically for multiple time series (one time series per category)
- time-series-annotator – The CrowdCurio Time Series Annotation Library implements classification tasks for time series.
- WDK – The Wearables Development Toolkit (WDK) is a set of tools to facilitate the development of activity recognition applications with wearable devices.
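Anomaly labels on a time series are often stored as index ranges and later expanded into one flag per sample for training. Here is a minimal sketch of that expansion, assuming half-open `[start, end)` sample-index intervals. The function name and the interval convention are assumptions for illustration and do not describe the export format of any tool above.

```python
def intervals_to_mask(n_samples, intervals):
    """Mark each sample index inside any [start, end) interval as anomalous."""
    mask = [False] * n_samples
    for start, end in intervals:
        # Clip each interval to the valid index range of the series.
        for i in range(max(start, 0), min(end, n_samples)):
            mask[i] = True
    return mask


# Hypothetical labels on a 10-sample series; the second interval runs past the end.
mask = intervals_to_mask(10, [(2, 4), (7, 12)])
print([i for i, flagged in enumerate(mask) if flagged])
```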
MultiDomain
- Dataturks – Dataturks supports end-to-end tagging of data items such as video, images (classification, segmentation, and labeling), and text (full-length document annotations for PDF, Doc, Text, etc.) for ML projects.
- Label Studio – Label Studio is a configurable data annotation tool that works with different data types
If you're looking for data-labeling service providers, check out my other blog.
Do you know a tool or framework and would like me to add it to the list? Just comment below or drop me an email at isusmelj at gmail.com!
Source:
github.com/heartexlabs/awesome-data-labeling
github.com/taivop/awesome-data-annotation
github.com/jsbroks/awesome-dataset-tools