
Description
Under the direction of Information Services Leadership, the incumbent will be responsible for the end-to-end lifecycle management of machine learning models and artificial intelligence solutions, including their design, build, and maintenance. The MLOps Engineer will play an integral role in implementing artificial intelligence solutions across Keck Medicine of USC. The incumbent will partner with data scientists, data team members, and clinical operations to deploy, monitor, and maintain machine learning solutions that improve patient care, support operational excellence, and advance clinical research. Drawing on DevOps experience, the incumbent will ensure seamless integration, deployment, automation, and scaling of AI solutions within the existing infrastructure, and will maintain and continuously improve MLOps pipelines for monitoring, versioning, and deploying models in production environments. The MLOps Engineer will implement best practices for testing, debugging, and performance monitoring of AI systems to ensure reliability and scalability.
Essential Duties:
- Design, build and maintain production-grade machine learning models, with real-time inference, scalability, and reliability.
- Develop end-to-end scalable ML infrastructure using cloud platforms, such as Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure.
- Develop AI pipelines for various data processing needs, including data ingestion, pre-processing, and search and retrieval, ensuring solutions meet all technical and business requirements.
- Monitor model performance to detect data drift and concept drift, and automate retraining processes where necessary to maintain model accuracy and relevance.
- Collaborate with data scientists, data engineers, analytics teams, and DevOps teams to design and implement robust deployment pipelines for continuous improvement of machine learning models.
- Implement and optimize CI/CD pipelines for machine learning models, automating testing and deployment processes.
- Configure and manage monitoring and logging solutions to track model performance, system health, and anomalies, enabling timely intervention and proactive maintenance.
- Implement version control systems for machine learning models, parameters, results and associated code to track changes and facilitate collaboration.
- Ensure all machine learning systems meet security and compliance standards, including data protection and privacy regulations.
- Lead engineering efforts in creating and implementing methods and workflows for ML/GenAI model engineering, LLM advancements, and optimizing deployment frameworks while aligning with business strategic directions.
- Maintain clear and comprehensive documentation of MLOps processes and configuration.
- Communicate and collaborate cross-functionally to align on deployment strategies and technical requirements.
- Other duties as assigned.
Required Qualifications:
- Req Bachelor's Degree in computer science, engineering or closely related field
- Req Proven experience with: artificial intelligence and machine learning platforms (e.g., AWS, Azure or GCP); containerization technologies (e.g., Docker) or container orchestration platforms (e.g., Kubernetes); CI/CD tools (e.g., GitHub Actions); programming languages and frameworks (e.g., Python, R, SQL); MLOps engineering principles, agile methodologies, and DevOps lifecycle management; technical writing and documentation for AI/ML models and processes; healthcare data and machine learning use cases.
- Req Ability to solve complex problems through troubleshooting
- Req Deep understanding of coding, architecture, and deployment processes
- Req Strong analytical skills with the ability to collect, organize, analyze, and disseminate significant amounts of information with attention to detail and accuracy
- Req Excellent organizational skills and attention to detail
- Req Self-starter with the ability to develop solutions when requirements are vague or ambiguous
Preferred Qualifications:
- Pref Master's Degree in computer science, engineering or closely related field
Required Licenses/Certifications:
- Req Fire Life Safety Training (LA City). If no card upon hire, one must be obtained within 30 days of hire and maintained by renewal before expiration date. (Required within LA City only)
The annual base salary range for this position is $145,600.00 - $240,240.00. When extending an offer of employment, the University of Southern California considers factors such as (but not limited to) the scope and responsibilities of the position, the candidate's work experience, education/training, key skills, internal peer equity, federal, state, and local laws, contractual stipulations, grant funding, as well as external market and organizational considerations.