Note: This is a remote position open to candidates in the USA. i4DM is an organization that provides federal agencies with instant access to experienced and talented professionals. They are seeking an experienced PySpark & Delta Lake Developer to design, build, and maintain scalable ETL pipelines that process and analyze large-scale healthcare claims data.
Responsibilities
• Design, develop, and maintain robust ETL pipelines using PySpark and Delta Lake for large and complex healthcare data workloads
• Implement and optimize data lake solutions using Delta Lake table formats, supporting ACID transactions, schema enforcement, and time travel
• Write efficient, reusable, and well-documented PySpark scripts for data ingestion, transformation, cleansing, and aggregation
• Collaborate with data engineers, architects, and data scientists to understand business and data requirements and translate them into scalable data solutions
• Ensure data quality, consistency, lineage, and integrity across all stages of data processing
• Troubleshoot, debug, and optimize PySpark applications and Delta Lake workflows for cost, speed, and reliability within AWS
• Maintain detailed and up-to-date technical documentation of code, data pipelines, and standard operating procedures
• Stay current with the latest Delta Lake and Spark advancements, and advocate for best practices in data management and analytics
Skills
• Strong proficiency in Python and PySpark, with hands-on experience developing data pipelines
• Advanced experience with Delta Lake and its ACID transaction and schema management features
• Solid SQL skills for querying, joining, and optimizing data in distributed environments
• Hands-on experience with AWS cloud data services (e.g., S3, Glue, EMR, Athena)
• Familiarity with data lake concepts, partitioning, and performance tuning
• Excellent communication skills and a desire to continuously learn and adapt to innovative technologies
• Familiarity with CI/CD, version control (e.g., Git), and infrastructure as code
• Experience with healthcare or claims data
• Knowledge of data governance, security, data cataloging (AWS Glue Catalog), and compliance best practices
• Strong ability to prioritize and execute tasks independently and within collaborative team environments
• Previous experience working in a government or public sector setting
Company Overview
• i4DM provides a full range of information technology consulting services to government and commercial clients. It was founded in 2002 and is headquartered in Millersville, Maryland, USA, with a workforce of 51-200 employees. Its website is https://www.i4dm.com.
Company H1B Sponsorship
• i4DM has a track record of offering H1B sponsorships, with 1 in 2022 and 1 in 2021. Please note that this does not guarantee sponsorship for this specific role.