IT Infrastructure Engineer

Job ID: c2046

TÜV Rheinland

Permanent, Full-time

Shanghai

Starting immediately

We are looking for a senior AI Infrastructure Engineer to design, build, and operate robust AI/ML infrastructure on public cloud platforms. The ideal candidate has deep hands-on experience with cloud-native environments, container orchestration, Infrastructure as Code, CI/CD, and observability, and can keep AI workloads scalable, secure, and efficient.

  • Design, deploy, and operate AI/ML infrastructure on public cloud platforms (AWS/Azure/GCP or domestic clouds such as Alibaba Cloud/Tencent Cloud).
  • Build and maintain containerized environments using Docker and manage large-scale workloads with Kubernetes.
  • Use Infrastructure as Code (e.g., Terraform, Ansible) to manage and automate environment provisioning, configuration, and changes.
  • Design, implement, and optimize CI/CD pipelines to support frequent, reliable, and secure deployment of AI and backend services.
  • Implement and maintain monitoring, logging, and alerting systems to ensure high availability and quick incident response.
  • Collaborate closely with AI/ML engineers and backend teams to ensure infrastructure meets performance, security, and compliance requirements.
  • Continuously optimize the cost, performance, and reliability of infrastructure, and drive the adoption of cloud-native and DevOps best practices.
  • Cloud & Operations
    • Senior-level hands-on experience deploying and operating systems on public cloud platforms (AWS/Azure/GCP or domestic platforms such as Alibaba Cloud/Tencent Cloud), with the ability to do so independently.
  • Container & Orchestration
    • Proficient in containerization technologies (Docker) and container orchestration tools (Kubernetes), with experience in real production environments.
  • Infrastructure as Code
    • Skilled in using Infrastructure as Code tools (e.g., Terraform, Ansible) for environment management and operations automation.
  • CI/CD
    • Practical experience in building, maintaining, and optimizing CI/CD pipelines (familiar with tools such as GitHub Actions, GitLab CI, and Jenkins).
  • Monitoring & Observability
    • Familiar with monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, ELK Stack); able to independently build and optimize a monitoring setup.
  • Networking Fundamentals
    • Senior-level knowledge of computer networking fundamentals, including the principles and configuration of DNS, CDN, and related technologies.

    We only accept online applications submitted through our application system. Applications sent by email cannot be considered.

    What else you need to know

    Job ID: c2046
    Contract type: Permanent
    Employment type: Full-time
    Work model: Not specified
    Company: TÜV Rheinland

    As long as the position is published on our careers site, we are looking for suitable candidates (f/m/d). We look forward to your application!

    Your application process

    1. Online application

    You can only apply online via our careers site. The application process is very simple and takes only a few minutes.