Reimagining Data Annotation: The Rise of GenAI Automation

Demand for annotated data continues to grow as AI adoption expands across industries, making high-quality labeled datasets critical for training reliable models.
Automation is reshaping annotation workflows, enabling teams to process large datasets faster while reducing manual effort.
GenAI improves speed and scalability but still requires human oversight, particularly when handling complex scenarios, ambiguity, or contextual interpretation.
The most effective annotation approach combines human expertise with automation, using hybrid workflows that maintain quality while supporting scalable AI development.

The Growing Demand for Data Annotation in the AI Era

The global data annotation services sector is entering a period of exceptional expansion, growing from a valuation of approximately USD 1.30 billion in 2024 to an estimated USD 14.40 billion by 2034, supported by a powerful ~27% annual growth rate between 2025 and 2034.¹ This surge is largely driven by AI adoption in sectors such as healthcare and automotive, where large volumes of quality data are required to train models. With organizations expanding their use of GenAI, the demand for reliable data annotation continues to increase.
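As a quick sanity check on the figures above, the compound annual growth rate implied by the two endpoints can be computed directly. This is a minimal sketch that assumes ten compounding periods between the 2024 base value and the 2034 estimate; the function name `cagr` is illustrative and does not come from the cited report.

```python
def cagr(start_value: float, end_value: float, periods: int) -> float:
    """Compound annual growth rate over `periods` compounding years."""
    if start_value <= 0 or periods <= 0:
        raise ValueError("start_value and periods must be positive")
    return (end_value / start_value) ** (1 / periods) - 1


# Figures quoted in the text, in USD billions.
# Assumption: 2024 -> 2034 is treated as ten compounding periods.
implied_rate = cagr(1.30, 14.40, 10)
print(f"Implied CAGR: {implied_rate:.1%}")
```

Under that assumption the result lands at roughly 27%, in line with the growth rate quoted for the market.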

Quality data is fundamental to training and improving AI models. A model performs well only when it is trained on the right datasets. This process requires large volumes of varied data to enable ML systems to learn different patterns and operate effectively across scenarios. With effective training, models produce more reliable outputs and better user outcomes, making the quality of data annotation a critical factor in overall model performance. However, manually annotating data is complex and time-consuming. Automation, on the other hand, helps accelerate model development and allows teams to handle large-scale annotation projects more efficiently.

From data processing to strategic enablement, effective annotation is the foundation for every reliable AI model.

From Automation to Intelligence: How GenAI is Redefining Value Creation

Today, GenAI is increasingly used in annotation workflows as model and technical capabilities improve. Annotation teams now combine manual review with GenAI-assisted labeling to maintain accuracy while keeping projects on schedule.

Automation in data annotation: Manual annotation -> Assisted manual annotation -> Rule-based annotation -> Machine learning models and active learning -> GenAI-based annotation -> GenAI automated annotation.

Tech Mahindra’s annotation programs have successfully used assisted manual annotation, rule-based automation, and machine learning models. Current efforts are expanding toward active learning, GenAI-assisted annotation, and autonomous annotation workflows. While GenAI improves speed and scale, it still faces limitations. Here’s a quick rundown of the key challenges and the areas where it delivers the most value.

Data Type	GenAI Challenges	Where GenAI Excels
Text	Ambiguity, sarcasm, domain jargon, long context	Entity extraction, classification, PII redaction
Image	Hallucination or missing attributes, occlusions	Object detection, tagging, basic captioning
Video	Scene complexity, fast objects, long context	Segmentation, activity classification, keyframes
Audio	Noise, dialects, diarization drift	ASR transcripts, keyword spotting, basic intent

Automation Meets Reality: Key Learnings from a Real Annotation Implementation

Annotation automation demands successful coordination between people, processes, and technology. The following use case illustrates key challenges in real-world projects and TechM’s structured approach to addressing them.

Objective: To identify participants and objects within a given video scene and describe participant activities and object properties like size, shape, color, and build for AI model training. The project scope included data collection and data annotation.

TechM's Solution:

Data Collection
Manual data collection required participants to act out scripted activities that were recorded on video. These activities ranged from relaxing in the living room and cooking in the kitchen to cleaning the house and performing tasks within a defined area.
Data Annotation
Using our data management platform, we built a GenAI model to support automation in the annotation workflow. The model first analyzed the data to identify participants and objects within the scenes, after which an expert human annotation team reviewed the output and corrected any inaccuracies before finalizing the model.
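The review step described above can be sketched as a merge of model pre-annotations with reviewer corrections. This is an illustration only: the record layout, the `apply_review` helper, and the correction-rate metric are assumptions made for the example, not details of the platform used in the project.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """One labeled item in a scene (hypothetical record layout)."""
    item_id: str
    label: str
    attributes: dict = field(default_factory=dict)


def apply_review(pre_annotations, corrections):
    """Merge reviewer corrections over model pre-annotations.

    `corrections` maps item_id -> corrected Annotation. Returns the final
    annotations and the share of items the reviewer had to change.
    """
    final = []
    changed = 0
    for ann in pre_annotations:
        fixed = corrections.get(ann.item_id)
        if fixed is not None and fixed != ann:
            final.append(fixed)
            changed += 1
        else:
            final.append(ann)
    rate = changed / len(pre_annotations) if pre_annotations else 0.0
    return final, rate


# Model output for one frame (made-up values for illustration).
model_output = [
    Annotation("p1", "participant", {"activity": "cooking"}),
    Annotation("o1", "object", {"name": "pan", "color": "red"}),
]
# The reviewer corrects a hallucinated color.
reviewer_fixes = {
    "o1": Annotation("o1", "object", {"name": "pan", "color": "black"}),
}

final_annotations, correction_rate = apply_review(model_output, reviewer_fixes)
print(f"Correction rate: {correction_rate:.0%}")
```

Tracking a figure like the correction rate across batches is one possible way to see whether later process changes actually reduce reviewer workload.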

Limitations: The complexity of the scenes, limitations in prompting techniques, object misidentification, and inconsistent adherence to process guidelines led to GenAI hallucinations, resulting in false attributes and incorrect descriptions. In addition, human reviewers had to meet strict quality thresholds while identifying subtle model-generated inaccuracies, which increased the operational burden.

TechM’s Fix: After identifying gaps, we solved the hallucination and false attribute challenge by streamlining the process.

Incorporated client feedback into the process:
- The LLM did not follow the correct spatial order of objects. We introduced additional training and manual annotation to address this.
- The LLM generated unnecessary object attributes. We implemented prompt refinement to resolve this.
- Missing object attributes were identified using automated scripts that validated the JSON output for missing information.
Introduced changes to the existing process:
- Deployed a hybrid working model: AI + human evaluations
- Enhanced the prompts by providing sufficient data for learning
- Refined the scripts accordingly to validate missing information
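
A validation script of the kind mentioned in the list above might look like the following sketch. The required attribute names are taken from the project objective (size, shape, color, build); the JSON layout with an `objects` array and the `find_missing_attributes` function are hypothetical, since the actual schema is not described here.

```python
import json

# Attributes named in the project objective; the schema itself is assumed.
REQUIRED_ATTRIBUTES = ("size", "shape", "color", "build")


def find_missing_attributes(raw_json: str) -> dict:
    """Return {object name: [missing attribute, ...]} for incomplete objects.

    An attribute counts as missing when the key is absent or its value is
    None or an empty/whitespace-only string.
    """
    data = json.loads(raw_json)
    problems = {}
    for index, obj in enumerate(data.get("objects", [])):
        missing = [
            key for key in REQUIRED_ATTRIBUTES
            if obj.get(key) is None or not str(obj[key]).strip()
        ]
        if missing:
            problems[obj.get("name", f"object_{index}")] = missing
    return problems


# Made-up model output: the second object lacks several attributes.
sample_output = """
{
  "objects": [
    {"name": "sofa", "size": "large", "shape": "rectangular",
     "color": "grey", "build": "fabric"},
    {"name": "mug", "size": "small", "color": ""}
  ]
}
"""

report = find_missing_attributes(sample_output)
print(report)
```

Objects that appear in the report can be sent back for re-prompting or flagged for a human reviewer, so that incomplete descriptions do not reach the training set unnoticed.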

These refinements improved model performance and reduced reviewer workload, supporting a sustainable workflow.

Prompt refinement, validation scripts, and human-AI collaboration turned early challenges into an annotation workflow built to scale.

Learnings

From this experience, we learned to start complex projects with human-led manual annotation and to introduce automation later, checking the efficacy of the model as it is brought in. We also recognized that automation should be expanded only when annotation quality and delivery timelines can be maintained.

Designing for Impact: Principles Every GenAI Automation Strategy Must Follow

For GenAI automation to deliver real value, organizations must:

- Have a clear understanding of the use case.
- Have enough training data for the AI to learn from and generate appropriate outputs.
- Continuously tune prompts to enhance the quality of the outputs.
- Apply a complexity-based approach that distinguishes simple, medium, and complex use cases.
- Start with a pilot to gather the required information and identify potential challenges in the process.
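
One way to picture a complexity-based approach is as a routing rule that assigns each use case to a workflow tier. The complexity signals, thresholds, and tier-to-workflow mapping below are illustrative assumptions, not values from the project; only the idea of starting complex work with human-led annotation comes from the learnings described earlier.

```python
from enum import Enum


class Workflow(Enum):
    AUTOMATED_WITH_SAMPLING = "GenAI annotation with sampled human checks"
    ASSISTED_WITH_REVIEW = "GenAI pre-annotation with full human review"
    HUMAN_LED = "human-led manual annotation"


def choose_workflow(objects_per_scene: int, needs_context: bool) -> Workflow:
    """Pick a workflow tier from two hypothetical complexity signals."""
    # Assumed thresholds; in practice they would be tuned from pilot results.
    if needs_context or objects_per_scene > 20:
        return Workflow.HUMAN_LED
    if objects_per_scene > 5:
        return Workflow.ASSISTED_WITH_REVIEW
    return Workflow.AUTOMATED_WITH_SAMPLING


# Three made-up use cases: (objects per scene, needs contextual judgment).
for scene in [(3, False), (12, False), (8, True)]:
    print(scene, "->", choose_workflow(*scene).value)
```

Keeping the rule explicit makes it easy to revisit after a pilot, when there is real evidence about where the model struggles.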

The Bottom Line

Annotation workflows are moving toward hybrid models in which automation accelerates the process while human expertise ensures accuracy. This approach helps enterprises build reliable and scalable AI models. With GenAI now in the picture, organizations with strong process control and oversight can augment the annotation process and build better AI models efficiently.


Frequently Asked Questions

High-quality data annotation is essential for training reliable AI models. As industries accelerate AI adoption, large and diverse labeled datasets are required to help models recognize patterns and operate accurately across scenarios. The growing complexity and scale of AI applications make consistent, high-fidelity annotation a foundational requirement for model performance.

Automation accelerates data annotation by reducing manual effort and enabling large-scale processing. Techniques such as rule-based systems, machine learning models, and GenAI assistance streamline labeling tasks. While automation improves speed and consistency, human oversight remains critical for resolving ambiguity, interpreting context, and maintaining annotation quality.

GenAI enhances scalability by supporting tasks such as entity extraction, object tagging, and basic classification across text, images, audio, and video. Challenges arise when handling ambiguity, complex scenes, domain-specific nuances, and long-context interpretation. Human review is necessary to correct hallucinations, missing attributes, and misidentifications.

Real-world implementations show that hybrid workflows, which combine AI assistance with expert human review, deliver the best outcomes. Early challenges often stem from prompting limitations, complex scenes, and inconsistent process adherence. Refinements like prompt tuning, validation scripts, and structured human-AI collaboration significantly improve accuracy and reduce reviewer workload.

Effective strategies start with clearly defined use cases, sufficient training data, and continuous prompt tuning. Organizations should apply a complexity-based approach, beginning with pilot phases to identify risks and refine workflows. Strong process control, human oversight, and iterative improvements ensure that GenAI contributes responsibly and efficiently to annotation pipelines.

References

¹ https://www.zionmarketresearch.com/report/data-annotation-service-market

About the Author

Mothiraj Ramalingam

Group Practice Head, Digital Business Operations, Tech Mahindra Business Process Services

Mothiraj has over 24 years of experience in managing clients across various service lines. His expertise includes setting up delivery operations, establishing centers of excellence, solution design, and leadership in best practices. In his current role, he leads the digital data services practice and collaborates with internal and external stakeholders, including clients across different industry verticals, to provide our AI/ML data services solutions.


