
Prompt Engineer (LLM Automation for Data Labeling & Localization)
- Remote
- Toronto, Ontario, Canada
- Innodata Services LLC
Job description
About the Role
Innodata is building a team of prompt engineers to harness the power of LLMs to automate data annotation and human evaluation workflows. The goal is to facilitate accurate, localized, and culturally adapted data labeling and translation processes through effective prompt design and implementation. This team will collaborate directly with our client partner, a leading technology company, to identify opportunities for automation, design solutions, and drive measurable improvements. As a technical subject matter expert, you will work backwards from the customer problem statement to develop an efficient plan for execution.
You will collaborate with cross-functional teams, including product managers, data scientists, and client teams, to solve complex problems, reduce human effort, and ensure that AI-driven processes meet high standards for quality and reliability. Your work will directly contribute to improving our client's data annotation and evaluation processes, enabling them to scale more efficiently.
Key Responsibilities
Collaborate with data scientists, linguists, and localization experts to ensure accuracy and cultural relevance.
Prototype and validate AI models to demonstrate initial feasibility, potential impact, and overall effectiveness.
Design, develop, and implement prompts for data labeling and localization processes within software applications.
Understand the current components of the software stack, its use cases, and its problems, and iterate on solutions using a solid knowledge of data structures, data formats, and data modeling.
Conduct user testing and feedback analysis to optimize prompt design for data accuracy and linguistic consistency.
Analyze model performance using key performance indicators (KPIs) and metrics, ensuring that AI models meet customer acceptance criteria and deliver high-quality outputs.
Communicate technical findings and solution strategies to both technical and non-technical stakeholders, including presenting model performance and actionable insights in a clear, accessible manner.
Collaborate on data pipelines and workflows that integrate LLMs into automated systems, enhancing both the efficiency and effectiveness of data annotation tasks.
Create guidelines and training materials for prompt usage in data labeling and localization projects.
Stay informed on data labeling and localization industry trends and tools to enhance prompt engineering techniques.
Job requirements
Technical & Required Skills
Deep understanding of LLMs (e.g., transformer-based architectures).
Demonstrated experience programmatically using LLMs to automate data labeling, classification, localization and annotation tasks.
Strong expertise in Python for NLU, data processing and transformation, and statistical analysis. Familiarity with JSON, JavaScript, or XML.
Experience with popular frameworks and libraries, including TensorFlow, PyTorch, Jupyter, and other relevant AI/ML tools.
Familiarity with APIs and platforms for working with LLMs (e.g., OpenAI, Hugging Face).
Knowledge of localization best practices and cultural nuances for different languages and regions.
Strong understanding of LLM evaluation metrics and the ability to assess model reliability, bias, and generalizability.
Experience working with data pipelines, automation tools, and integrating models into production systems to ensure scalable, reliable solutions.
A collaborative mindset with the ability to solve complex technical challenges and work independently as needed.
Exceptional attention to detail and a commitment to delivering high-quality, reliable AI solutions.
Appreciation for issues of Diversity, Equity, and Inclusion in AI.
Preferred Skills and Experience
2 years of experience in prompt engineering, LLM fine-tuning, or related AI/ML roles.
Familiarity with tools/platforms for annotation and human-in-the-loop workflows (e.g., Labelbox).
Experience designing and automating data annotation workflows.
Knowledge of data annotation and the challenges of scaling human-in-the-loop workflows.
Familiarity with cloud platforms, containerization, and model deployment.
Knowledge of another language.
Minimum Education Requirements
Bachelor’s degree or higher in Computer Science, Artificial Intelligence, Machine Learning, Linguistics, Localization or a related field.