Tools and Frameworks

Find the tool that suits you best in this list of tools and frameworks for annotating or labeling images, video, text (NLP), and audio.

I had trouble getting a good overview of all the tools and frameworks available for data annotation, so I created this list. I will try to keep it up to date. There are many tools, and each one has its advantages and disadvantages.


Open Source Annotation and Labeling Tools and Frameworks

Data annotation can be a very tedious task. Luckily, there are many free-to-use tools available on the web. Some of them are open-source and allow for modifications.

Here you will find a list of open-source projects grouped by data type!

Computer Vision

Images

  • Alturos.ImageAnnotation – A collaborative tool for labeling image data
  • Anno-Mage – A semi-automatic image annotation tool that helps you annotate images by suggesting annotations for 80 object classes using a pre-trained model
  • CATMAID – Collaborative Annotation Toolkit for Massive Amounts of Image Data
  • CVAT – Powerful and efficient Computer Vision Annotation Tool
  • deeplabel – A cross-platform image annotation tool for machine learning
  • imagetagger – An open-source online platform for collaborative image labeling
  • imglab – A web-based tool to label images for objects that can be used to train dlib or other object detectors
  • Labelbox – Labelbox is the fastest way to annotate data to build and ship computer vision applications
  • labelImg – LabelImg is a graphical image annotation tool for labeling object bounding boxes in images
  • labelme – Image Polygonal Annotation with Python
  • LOST – Design your own smart Image Annotation process in a web-based environment
  • make-sense – makesense.ai is a free-to-use online tool for labeling photos
  • MedTagger – A collaborative framework for annotating medical datasets using crowdsourcing.
  • OpenLabeler – OpenLabeler is an open-source desktop application for annotating objects for AI applications
  • OpenLabeling – Label images and video for Computer Vision applications
  • PixelAnnotationTool – Software that allows you to manually and quickly annotate images in directories
  • Pixie – Pixie is a GUI annotation tool which provides the bounding box, polygon, free drawing, and semantic segmentation object labeling
  • turktool – A modern React app for scalable bounding box annotation of images
  • VoTT – An open-source annotation and labeling tool for image and video assets
  • Yolo_mark – GUI for marking bounding boxes of objects in images for training the Yolo v3 and v2 neural networks
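
Several of the tools above produce bounding boxes, and different tools store them in different coordinate conventions. As a rough illustration, here is a minimal sketch that converts a box given as pixel corners into the normalized center/size form commonly used for YOLO training labels. The function name `corners_to_yolo` and the example numbers are made up for illustration and are not taken from any of the listed tools.

```python
def corners_to_yolo(x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert pixel corner coordinates to normalized
    (x_center, y_center, width, height), each relative to the image size."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return x_center, y_center, width, height


# Hypothetical box in a 1000x800 pixel image.
box = corners_to_yolo(100, 200, 300, 400, img_w=1000, img_h=800)
print(box)  # (0.2, 0.375, 0.2, 0.25)
```

Check the export options of the tool you pick before writing a converter like this; many of them can already export in more than one format.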

Video

Find a video labeling tool that suits your needs. Some of these tools also have great Python APIs to interface with them.

  • Diffgram – Training Data Software for Teams Shipping Deep Learning AI Systems. Track objects through time.
  • UltimateLabeling – A multi-purpose Video Labeling GUI in Python with integrated SOTA detector and tracker
  • VATIC – VATIC is an online video annotation tool for computer vision research that crowdsources work to Amazon’s Mechanical Turk.

Lidar

Here we provide a curated list of lidar annotation tools.

3D

  • KNOSSOS – KNOSSOS is a software tool for the visualization and annotation of 3D image data and was developed for the rapid reconstruction of neural morphology and connectivity.

NLP

Text

  • ML-Annotate – Label text data for machine learning purposes. ML-Annotate supports binary, multi-label and multi-class labeling.
  • SMART – Smarter Manual Annotation for Resource-constrained collection of Training data
  • TagEditor – Annotation tool for spaCy
  • YEDDA – A Lightweight Collaborative Text Span Annotation Tool (Chunking, NER, etc.). ACL best demo nomination.

Audio

Audio

It’s a good idea to use a special audio annotation tool. They allow for easy playback and help you mark timestamps efficiently.

  • audio-annotator – A JavaScript interface for annotating and labeling audio files.
  • audio-labeler – An in-browser app for labeling audio clips at random, using Docker and Flask.
  • EchoML – Play, visualize and annotate your audio files
  • peaks.js – Browser-based audio waveform visualization and UI component for interacting with audio waveforms, developed by the BBC.
  • wavesurfer.js – Simple annotation tool; check the example.

Others

Time Series

Here we list a few popular time series annotation tools. These can be used for annotating anomalies or interesting sections in a data stream.

  • Curve – Curve is an open-source tool to help label anomalies on time-series data
  • TagAnomaly – Anomaly detection analysis and labeling tool, specifically for multiple time series (one time series per category)
  • time-series-annotator – The CrowdCurio Time Series Annotation Library implements classification tasks for time series.
  • WDK – The Wearables Development Toolkit (WDK) is a set of tools to facilitate the development of activity recognition applications with wearable devices.
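
To make "annotating sections in a data stream" a bit more concrete, here is a minimal sketch of one possible way to represent such annotations: labeled index intervals that are expanded into one label per sample. The `Interval` class, the `to_sample_labels` function, and the label names are assumptions made for this example and do not reflect the data model of any tool listed above.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    start: int  # first sample index covered (inclusive)
    end: int    # index just past the last covered sample (exclusive)
    label: str


def to_sample_labels(n_samples, intervals, default="normal"):
    """Expand labeled intervals into a list with one label per sample."""
    labels = [default] * n_samples
    for iv in intervals:
        # Clamp to the valid index range so out-of-range intervals are ignored.
        for i in range(max(iv.start, 0), min(iv.end, n_samples)):
            labels[i] = iv.label
    return labels


labels = to_sample_labels(6, [Interval(2, 4, "anomaly")])
print(labels)  # ['normal', 'normal', 'anomaly', 'anomaly', 'normal', 'normal']
```

A per-sample label list like this is easy to feed into a training pipeline, while the interval form is usually closer to what a person marks in an annotation UI.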

MultiDomain

  • Dataturks – Dataturks supports E2E tagging of data items like video, images (classification, segmentation, and labeling) and text (full-length document annotations for PDF, Doc, Text, etc.) for ML projects.
  • Label Studio – Label Studio is a configurable data annotation tool that works with different data types

If you're looking for data-labeling service providers, check out my other blog.

Do you know a tool or framework and would like me to add it to the list? Just comment below or drop me an email at isusmelj at gmail.com!

Sources:
github.com/heartexlabs/awesome-data-labeling
github.com/taivop/awesome-data-annotation
github.com/jsbroks/awesome-dataset-tools
