Location: Dallas, TX
Contract Type: C2C
Posted: 3 weeks ago
Closed Date: 05/23/2025
Skills: ETL Pipeline Engineer
Visa Type: Any Visa

Job Title: Senior ETL Pipeline Engineer

Location: Dallas, TX

Duration: 6–12 month contract

Overview:

Seeking an experienced Senior ETL Pipeline Engineer with strong expertise in building scalable, cloud-agnostic data pipelines using modern data engineering tools and platforms. This role involves end-to-end ownership of ETL development, from design through deployment, in a containerized and orchestrated environment. The ideal candidate is comfortable working across multi-cloud and hybrid infrastructures, integrating diverse data sources, and supporting long-term data initiatives.

Key Responsibilities:

  • Design, develop, and manage robust ETL pipelines using Python and Apache Spark to process large-scale datasets across structured, semi-structured, and unstructured formats.
  • Containerize ETL workflows using Docker for portability and deploy them using Kubernetes for scalability and fault tolerance.
  • Leverage Apache Airflow for orchestrating and scheduling complex data workflows (a minimal DAG sketch follows this list).
  • Build and maintain cloud-agnostic pipelines capable of running in multi-cloud or hybrid (cloud + on-premises) environments.
  • Integrate data from a variety of sources, including Hadoop ecosystem, RDBMS, NoSQL databases, REST APIs, and third-party data providers.
  • Work with data lake architectures and technologies such as Amazon S3, Trino, Presto, and Athena to support analytics and reporting use cases (see the query sketch after this list).
  • Implement CI/CD practices to automate the deployment and update processes for ETL pipelines.
  • Collaborate with cross-functional teams to align pipeline design with business and data architecture goals.
  • Monitor pipeline health, performance, and cost-efficiency; troubleshoot and resolve issues proactively.
  • Document pipeline architecture, operational playbooks, and best practices.
  • (Preferred) Contribute to infrastructure automation using Infrastructure as Code (IaC) tools.
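
To illustrate the Spark and Airflow bullets above, here is a minimal sketch of a daily ETL DAG. It assumes Apache Airflow 2.4+ and PySpark installed on the worker; the bucket paths, schedule, and column names are hypothetical placeholders, not a prescribed design.

```python
# Minimal sketch only: assumes Apache Airflow 2.4+ and PySpark on the worker.
# Bucket names, paths, and columns are hypothetical placeholders.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2025, 1, 1), catchup=False)
def daily_etl():
    @task
    def extract_transform_load(ds=None):
        # Airflow injects the execution date ("ds") into matching kwargs.
        from pyspark.sql import SparkSession
        from pyspark.sql import functions as F

        spark = SparkSession.builder.appName("daily-etl").getOrCreate()

        # Extract: read the day's raw JSON events (path is illustrative).
        raw = spark.read.json(f"s3a://raw-bucket/events/{ds}/")

        # Transform: drop malformed rows, then aggregate per customer.
        clean = (
            raw.filter(F.col("customer_id").isNotNull())
            .groupBy("customer_id")
            .agg(F.sum("amount").alias("daily_total"))
        )

        # Load: write Parquet back to the data lake, partitioned by day.
        clean.write.mode("overwrite").parquet(f"s3a://lake/daily_totals/{ds}/")
        spark.stop()

    extract_transform_load()


daily_etl()
```

In a containerized deployment, the Spark step would more typically be dispatched via a SparkSubmitOperator or KubernetesPodOperator so the DAG itself stays a thin orchestration layer.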
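
For the data-lake bullet, engines like Trino expose lake files through standard SQL. A minimal sketch, assuming the `trino` Python client is installed; the host, catalog, and table names are hypothetical:

```python
# Minimal sketch: assumes the "trino" PyPI client; host/catalog/table are
# hypothetical placeholders, not a specific environment.
import trino

conn = trino.dbapi.connect(
    host="trino.example.internal",  # hypothetical coordinator host
    port=8080,
    user="etl-service",
    catalog="hive",      # e.g., a Hive/S3-backed data lake catalog
    schema="analytics",
)

cur = conn.cursor()
# Query Parquet files in the lake as if they were a SQL table.
cur.execute(
    "SELECT customer_id, SUM(daily_total) AS total "
    "FROM daily_totals GROUP BY customer_id LIMIT 10"
)
for customer_id, total in cur.fetchall():
    print(customer_id, total)
```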

Requirements:

  • Strong proficiency in Python and hands-on experience with Apache Spark for data transformation.
  • Deep understanding of Docker and Kubernetes for containerized deployments.
  • Experience with AWS, with willingness to work across other cloud platforms as needed.
  • Solid experience with Apache Airflow or equivalent orchestration tools.
  • Demonstrated experience developing cloud-agnostic ETL pipelines and operating in hybrid environments (see the sketch after this list).
  • Familiarity with a variety of data storage and query tools, including RDBMS, NoSQL, Hadoop-based systems, and cloud-native services.
  • Strong problem-solving skills, an ownership mindset, and the ability to execute on long-term, complex projects.
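
In practice, the cloud-agnostic requirement often comes down to keeping storage access behind a URL scheme rather than a vendor SDK. A minimal sketch, assuming fsspec plus the relevant backend driver (s3fs, gcsfs, or adlfs) is installed; the DATA_URL variable and paths are hypothetical:

```python
# Minimal sketch of cloud-agnostic I/O: assumes fsspec and the backend driver
# for your storage (s3fs, gcsfs, adlfs) are installed. DATA_URL is hypothetical.
import os

import fsspec


def read_text(url: str) -> str:
    """Open a file by URL; fsspec picks the backend from the scheme
    (s3://, gs://, abfs://, or a plain local path)."""
    with fsspec.open(url, "r") as f:
        return f.read()


if __name__ == "__main__":
    # The same code runs against S3, GCS, Azure, or on-prem storage;
    # only the configured URL changes between environments.
    source = os.environ.get("DATA_URL", "file:///tmp/sample.txt")
    print(read_text(source)[:200])
```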

Preferred Qualifications:

  • Experience working with multi-cloud environments (AWS, Azure, GCP).
  • Exposure to Infrastructure as Code tools (e.g., Terraform, CloudFormation) for provisioning and managing environments.
  • Cloud certifications in data engineering, Kubernetes, or container orchestration.