Data Collection & Creation Services

High volumes of data collection or data creation can be the hardest part of a machine learning project, especially at scale. We can help.

TELUS International will help source text, image, audio, video and/or geo-local data to train your machine learning models using both platform automation and human verification. Leveraging our AI Community of 1 million+ members, we assign the most qualified people to build your custom AI training datasets. We focus on:

  • Data collection for AI training
  • Data entry, analysis and enrichment
  • Content summarization, formatting and processing
  • Dataset creation across multiple languages

Harness the power of our AI Community

Our 1 million+ qualified contributors are ready to collect the data you need

We collect and/or create diverse, representative datasets through our large, vetted global AI Community. Harnessing human intelligence in a manner that reduces bias is key to successful machine learning.

Our AI Data crowd across the globe

Data at scale

It’s not enough to give a computer a large amount of data and expect it to learn. Rather, AI needs to be trained. Large-scale human annotation services are required to teach a machine about human judgement.

Custom datasets

Creating a custom dataset is often complicated and time-consuming, yet necessary for successful machine learning. Quick, efficient custom data is our speciality.

Secure & confidential

Our platform is designed for high security and data privacy. We use advanced quality system features such as built-in validation, spot-checking and a worker seniority system to ensure the highest quality data.

Data types for all of your machine learning needs

In order to build intelligent applications capable of understanding, machine learning models need to digest large amounts of structured training data. Gathering sufficient training data is the first step in solving any AI-based machine learning problem.

Video data

Sensor fusion

Image data

Text data

Audio data

Video data

We provide video collection, classification and annotation services, including object localization, object detection, video tracking and more. We also provide a wide range of annotation types, including 2D and 3D bounding boxes, polygons, landmark annotation and semantic segmentation. Our strict quality assurance ensures that moving objects continue to be accounted for in all video frames.
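
One way to picture the frame-by-frame check described above is the minimal Python sketch below. It assumes each video frame carries a set of object track IDs and flags frames where a track disappears between its first and last appearance. The data layout and the `find_track_gaps` helper are hypothetical illustrations, not the platform's actual quality assurance process.

```python
from typing import Dict, List, Set

# Hypothetical annotations: frame index -> track IDs labelled in that frame.
frames: Dict[int, Set[str]] = {
    0: {"runner"},
    1: {"runner", "ball"},
    2: {"ball"},  # "runner" is not labelled in this frame
    3: {"runner", "ball"},
}


def find_track_gaps(frames: Dict[int, Set[str]]) -> Dict[str, List[int]]:
    """List, per track, the frames between its first and last appearance where it is absent."""
    gaps: Dict[str, List[int]] = {}
    all_ids = set().union(*frames.values()) if frames else set()
    for track_id in sorted(all_ids):
        present = sorted(i for i, ids in frames.items() if track_id in ids)
        missing = [
            i
            for i in range(present[0], present[-1] + 1)
            if track_id not in frames.get(i, set())
        ]
        if missing:
            gaps[track_id] = missing
    return gaps


gaps = find_track_gaps(frames)
```

A flagged frame is only a candidate for review: the object may genuinely be occluded or out of shot, so a human reviewer would still decide whether an annotation is missing.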

We also provide video transcription services to convert what is spoken on video into written text for subtitling or captioning. Video transcription makes your online videos more searchable and accessible, which improves the user experience and supports SEO.

Custom data collection and creation services

The data you require may need to be created. Our extensive data creation and data collection services are designed to improve your machine learning models. Our AI Community creates the best AI training data to help build AI-based systems that make the world a better place.

Data collection services

Across all data types - text, images, audio, video and geo - we can collect vast amounts of high-quality training data. This includes handwritten data collection as well as very specific data crowdsourcing requests for chatbot training or other AI-based applications.

Data enrichment services

Our data enrichment and data entry services will transcribe any existing data type and/or dataset into a digital format that is suited to machine learning. In addition to our data collection services via our global AI Community, we focus on data enrichment, data processing and data cleansing to ensure your raw data is validated for accuracy, consistency and completeness.
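
As a rough illustration of what checking raw data for completeness and consistency can involve, the Python sketch below validates a handful of records. The field names (`text`, `language`, `label`), the allowed label set and the `validate_record` helper are all assumptions made for this example and do not describe any specific platform feature.

```python
from typing import Dict, List

# Hypothetical schema: field names and labels are assumptions for illustration.
REQUIRED_FIELDS = ("text", "language", "label")
ALLOWED_LABELS = {"address", "name", "other"}


def validate_record(record: Dict[str, str]) -> List[str]:
    """Return the problems found in one record; an empty list means it passed."""
    problems = []
    # Completeness: every required field must be present and non-blank.
    for field in REQUIRED_FIELDS:
        if not str(record.get(field, "")).strip():
            problems.append(f"missing {field}")
    # Consistency: a label, when given, must come from the agreed label set.
    label = record.get("label", "")
    if label and label not in ALLOWED_LABELS:
        problems.append(f"unknown label {label!r}")
    return problems


records = [
    {"text": "221B Baker Street", "language": "en", "label": "address"},
    {"text": "", "language": "en", "label": "address"},
    {"text": "Jalan Sudirman 1", "language": "id", "label": "street"},
]
report = {i: validate_record(r) for i, r in enumerate(records)}
```

Accuracy is harder to automate than completeness or consistency, which is one reason checks like these are usually paired with human review.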

Intent variation services

A model trained to process natural language needs to understand not only what the user is asking, but also the intent behind the question, regardless of how the user phrases it. We can capture custom intent variation datasets that cover the different ways that users from different backgrounds and age groups might express the same intent. Our services cover intent classification and intent recognition.
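
To make the idea of an intent variation dataset concrete, here is a small Python sketch in which several differently phrased utterances map to the same intent label. The utterances, the intent names (`book_flight`, `cancel_booking`) and the `group_by_intent` helper are invented for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical examples: each pair is (user utterance, intent label).
utterances: List[Tuple[str, str]] = [
    ("I want to book a flight to Jakarta", "book_flight"),
    ("Can u get me on a plane to Jakarta?", "book_flight"),
    ("Need tickets to fly to Jakarta next week", "book_flight"),
    ("Cancel my hotel reservation", "cancel_booking"),
    ("pls cancel the room I booked", "cancel_booking"),
]


def group_by_intent(pairs: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group utterances under their intent label, preserving input order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for text, intent in pairs:
        grouped[intent].append(text)
    return dict(grouped)


variations = group_by_intent(utterances)
# Number of collected phrasings per intent, useful for spotting thin coverage.
counts = {intent: len(texts) for intent, texts in variations.items()}
```

Counting phrasings per intent is a simple way to see which intents still need more variation before the data is used to train a classifier.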

Text summarization services

For your machine learning algorithms to accurately perform text summarization, they need an understanding of the language and the central message behind each text. We have the platform, contributors and project managers necessary to build these datasets - via either extractive text summarization or abstractive text summarization - in a huge range of global languages. Whether you’re looking to summarize financial reports or build a news aggregator, we can design a custom workflow to achieve your goals.
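
The difference between the two approaches can be sketched briefly. Extractive summarization selects sentences that already exist in the source, while abstractive summarization generates new wording and typically requires a trained language model, so it is not shown here. The Python sketch below is a deliberately naive extractive example that scores sentences by average word frequency. The `extractive_summary` function and its scoring rule are assumptions chosen for illustration, not a production method.

```python
import re
from collections import Counter
from typing import List


def extractive_summary(text: str, max_sentences: int = 1) -> List[str]:
    """Pick the sentences whose words are, on average, most frequent in the text."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Word frequencies over the whole text (lowercased; no stop-word removal).
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence: str) -> float:
        words = re.findall(r"[a-z']+", sentence.lower())
        # Average rather than sum, so long sentences are not favoured by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(sentences, key=score, reverse=True)[:max_sentences]
    # Return the chosen sentences in their original order.
    return [s for s in sentences if s in ranked]


sample = "Revenue rose. Revenue rose because sales rose. The weather was mild."
summary = extractive_summary(sample, max_sentences=2)
```

Human-written reference summaries are what allow a model to move beyond heuristics like this one, which is where custom summarization datasets come in.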

Success stories

Improving onsite search

Our client, a fast-growing Southeast Asian unicorn, provides a one-stop platform for a range of ticketing services including flights, accommodation and attractions. With an expanding list of core products, improving onsite search was key to their continued growth. Here's how we helped:

  • Dataset creation in both English and Bahasa Indonesia
  • 200,000+ text strings
  • 50+ professional contributors
  • 6,000+ hours of work completed

Diverse global AI Community of annotators and linguists

Data annotation languages and dialects

Locales covered across the globe

Secure onsite global delivery centers if required

The essential guide to AI training data

Discover best practices for sourcing, labeling and analyzing training data from TELUS International, a leading provider of AI data solutions.

Upgrade your AI

Partner with our AI Data Solutions experts to design a custom project that meets your machine learning needs.
