Data Collection & Creation
High volumes of data collection or data creation can be the hardest part of a machine learning project, especially at scale. We can help.
TELUS International will help source text, image, audio, video and/or geo-local data to train your machine learning models using both platform automation and human verification. Leveraging our AI Community of 1 million+ members, we assign the most qualified people to build your custom AI training datasets. We focus on:
- Data collection for AI training
- Data entry and enrichment
- Content summarization, formatting and processing
- Dataset creation across multiple languages
Harness the power of our AI Community
Our 1 million+ qualified contributors are ready to collect the data you need
We collect and/or create diverse and representative datasets via our large and vetted global AI Community. Harnessing human intelligence in a manner that reduces bias is key to successful machine learning.
Data at scale
It’s not enough to give a computer a large amount of data and expect it to learn. Rather, AI needs to be trained. Large-scale human annotation services are required to teach a machine about human judgement.
Creating a custom dataset is often complicated and time-consuming, yet it is necessary for successful machine learning. Quick, efficient custom data is our speciality.
Secure & confidential
Our platform is designed for high security and data privacy. We use advanced quality system features such as built-in validation, spot-checking and a worker seniority system to ensure the highest quality data.
Data types for all of your machine learning needs
To build intelligent applications capable of understanding, machine learning models need to digest large amounts of structured training data. Gathering sufficient training data is the first step in solving any machine learning problem.
We provide video collection, classification and annotation services, including object localization, object detection, video tracking and more. We also provide a wide range of annotation types, including 2D and 3D bounding boxes, polygons, landmark annotation and semantic segmentation. Our strict quality assurance ensures that moving objects continue to be accounted for in all video frames.
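As a rough illustration of what "accounted for in all video frames" can mean in practice, the sketch below lists the frames in which a tracked object has no annotation. The record layout (the `frame`, `track_id` and `box` keys) and the function name are assumptions made for this example, not a description of our platform. An object can also legitimately leave the shot, so a flagged frame is a prompt for human review rather than proof of an error.

```python
def frames_missing_track(annotations, track_id, first_frame, last_frame):
    """Return the frame numbers in [first_frame, last_frame] with no box for track_id.

    `annotations` is a list of dicts using the hypothetical keys
    'frame', 'track_id' and 'box' (x, y, width, height).
    """
    # Collect every frame in which this object was annotated.
    seen = {a["frame"] for a in annotations if a["track_id"] == track_id}
    # Any frame in the range that is not in `seen` is a gap to review.
    return [f for f in range(first_frame, last_frame + 1) if f not in seen]


# Hypothetical annotations: the car is boxed in frames 0, 1 and 3, but not 2.
annotations = [
    {"frame": 0, "track_id": "car-1", "box": (10, 20, 50, 30)},
    {"frame": 1, "track_id": "car-1", "box": (12, 20, 50, 30)},
    {"frame": 3, "track_id": "car-1", "box": (16, 21, 50, 30)},
]
gaps = frames_missing_track(annotations, "car-1", 0, 3)
```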
We also provide video transcription services to convert what is spoken on video into written text for subtitling or captioning. Video transcription makes your online videos more searchable and accessible, which provides a better UX and boosts SEO.
Custom data collection and creation services
The data you require may need to be created. Our extensive data creation and data collection services are designed to improve your machine learning models. Our AI Community creates the best AI training data to help build AI-based systems that make the world a better place.
Data collection services
Across all data types (text, images, audio, video and geo), we can collect vast amounts of high-quality training data. This includes handwritten data collection as well as very specific data crowdsourcing requests for chatbot training or other AI-based applications.
Data enrichment services
Our data enrichment and data entry services will transcribe any existing data type and/or dataset into a digital format that is suited to machine learning. In addition to our data collection services via our global AI Community, we focus on data enrichment, data processing and data cleansing to ensure your raw data is validated for accuracy, consistency and completeness.
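To make "completeness" and "consistency" more concrete, here is a minimal sketch of a per-record check. The schema (`id`, `text`, `language`) and the two-letter lowercase language code rule are assumptions chosen for illustration. Real projects define their own fields and rules, and checking accuracy usually needs human review on top of automated checks like these.

```python
from typing import Any

# Hypothetical schema for this example only.
REQUIRED_FIELDS = ("id", "text", "language")


def validate_record(record: dict[str, Any]) -> list[str]:
    """Return a list of problems found in one record (an empty list means it passed)."""
    problems = []

    # Completeness: every required field is present and not blank.
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            problems.append(f"missing {field}")

    # Consistency: language codes use one agreed form (two lowercase letters).
    language = record.get("language")
    if isinstance(language, str) and language.strip():
        well_formed = len(language) == 2 and language.isalpha() and language.islower()
        if not well_formed:
            problems.append("language code not in lowercase two-letter form")

    return problems


clean = validate_record({"id": 1, "text": "hello", "language": "en"})
flagged = validate_record({"id": 2, "text": "  ", "language": "EN"})
```

Records that return an empty list can move on to the next stage, while flagged records go back for correction.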
Intent variation services
A model trained to process natural language needs to understand not only what the user is asking, but also the intent of the question, regardless of how the user phrased it. We can capture custom intent variation datasets that cover all of the different ways that users from different backgrounds and age groups might express the same intent. Our services cover intent classification and intent recognition.
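An intent variation dataset can be pictured as many differently worded utterances that share one intent label. The sketch below groups utterances by intent and flags intents that have too few variations. The sample utterances, the intent names and the helper functions are all invented for this example.

```python
from collections import defaultdict

# Hypothetical examples: several phrasings that share one intent label.
samples = [
    ("I want to book a flight to Jakarta", "book_flight"),
    ("Can you get me plane tickets to Jakarta?", "book_flight"),
    ("need a flight 2 jakarta asap", "book_flight"),
    ("Cancel my hotel reservation", "cancel_booking"),
]


def group_by_intent(pairs):
    """Map each intent label to the list of utterances that express it."""
    grouped = defaultdict(list)
    for utterance, intent in pairs:
        grouped[intent].append(utterance)
    return dict(grouped)


def under_covered(grouped, minimum):
    """Return, in alphabetical order, the intents with fewer than `minimum` utterances."""
    return sorted(
        intent for intent, utterances in grouped.items() if len(utterances) < minimum
    )


grouped = group_by_intent(samples)
needs_more = under_covered(grouped, minimum=2)
```

Here `needs_more` points to the intents where more phrasings should be collected before training.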
Text summarization services
For your machine learning algorithms to accurately perform text summarization, they need an understanding of the language and the central message behind each text. We have the platform, contributors and project managers necessary to build these datasets, via either extractive text summarization or abstractive text summarization, in a huge range of global languages. Whether you’re looking to summarize financial reports or build a news aggregator, we can design a custom workflow to achieve your goals.
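The difference between the two approaches is that an extractive summary reuses sentences from the source text, while an abstractive summary is written in new words. The sketch below is a deliberately simple extractive example that scores each sentence by the average frequency of its words. The scoring rule and the function name are assumptions for illustration only; they do not describe how our contributors or any production model create summaries.

```python
import re
from collections import Counter


def extractive_summary(text: str, max_sentences: int = 1) -> str:
    """Pick the highest-scoring sentences from `text`, kept in their original order."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Count how often each word appears in the whole text.
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence: str) -> float:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Average word frequency, so long sentences are not favoured by length alone.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in chosen)


report = (
    "Revenue grew this quarter. "
    "Revenue growth came from new revenue streams. "
    "The weather was mild."
)
summary = extractive_summary(report, max_sentences=1)
```

Because the summary is built only from existing sentences, every sentence in it can be traced back to the source text. An abstractive summary does not offer that guarantee.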
Improving onsite search
Traveloka, a Southeast Asian online travel company, provides a one-stop platform for a range of ticketing services including flights, accommodation and attractions. With an expanding list of core products, improving onsite search was key to their continued growth. Here's how we helped:
- Dataset creation in both English and Bahasa Indonesia
- 200,000+ text strings
- 50+ professional contributors
- 6,000+ hours of work completed
Diverse global AI Community of annotators and linguists
Data annotation languages and dialects
Locales covered across the globe
Secure onsite global delivery centers if required
AI starts with data: Facing the challenges of data collection & annotation
Discover useful insights into the challenges of data preparation to ensure that your next artificial intelligence project is a success.
Upgrade your AI
Partner with our AI Data Solutions experts to customize the exact project to advance your machine learning needs.