Location: Dallas, Texas (TX)
Contract Type: C2H
Posted: 3 weeks ago
Closed Date: 05/23/2025
Skills: ETL pipeline design, development, and management
Visa Type: Any Visa

Position: Senior ETL Pipeline Engineer

Location: Dallas, TX (Onsite – Hybrid)

Job Type: Contract

Note: Must have a minimum of 10 years of IT experience. Open to H1B candidates; visa copy and passport number are mandatory.


Job Description

Overview:

We are seeking an experienced Senior ETL Pipeline Engineer with strong expertise in building scalable, cloud-agnostic data pipelines using modern data engineering tools and platforms. The role involves end-to-end ownership of ETL development, from design through deployment, in a containerized, orchestrated environment. The ideal candidate is comfortable working across multi-cloud and hybrid infrastructures, integrating diverse data sources, and supporting long-term data initiatives.


Key Responsibilities:

  • Design, develop, and manage robust ETL pipelines using Python and Apache Spark to process large-scale datasets across structured, semi-structured, and unstructured formats.
  • Containerize ETL workflows using Docker for portability and deploy them using Kubernetes for scalability and fault tolerance.
  • Leverage Apache Airflow for orchestrating and scheduling complex data workflows.
  • Build and maintain cloud-agnostic pipelines capable of running in multi-cloud or hybrid (cloud + on-premises) environments.
  • Integrate data from a variety of sources, including the Hadoop ecosystem, RDBMS, NoSQL databases, REST APIs, and third-party data providers.
  • Work with data lake architectures and technologies such as Amazon S3, Trino, Presto, and Athena to support analytics and reporting use cases.
  • Implement CI/CD practices to automate the deployment and update processes for ETL pipelines.
  • Collaborate with cross-functional teams to align pipeline design with business and data architecture goals.
  • Monitor pipeline health, performance, and cost-efficiency; troubleshoot and resolve issues proactively.
  • Document pipeline architecture, operational playbooks, and best practices.
  • (Preferred) Contribute to infrastructure automation using Infrastructure as Code (IaC) tools.


Requirements:

  • Strong proficiency in Python and hands-on experience with Apache Spark for data transformation.
  • Deep understanding of Docker and Kubernetes for containerized deployments.
  • Experience with AWS Cloud and willingness to work across other cloud platforms as needed.
  • Solid experience with Apache Airflow or equivalent orchestration tools.