Kaztronix LLC

Lead Data Engineer

Job description


Job Summary

We are seeking a Lead Data Engineer to architect and execute high-impact data integration strategies across the organization. The primary focus is enabling the enterprise Translational Data Lake, but the role is broader and will drive scientific data engineering initiatives across the research pipeline. This is a hands-on technical role: the successful candidate must be a "player-coach" capable of designing scalable architectures and writing the production code to implement them. You will partner with IT, Computational Biology, and other Research functions to transform raw multi-modal data into actionable assets.

Key Responsibilities

  • Act as a hands-on technical lead who not only defines the architecture but also codes, deploys, and maintains scalable ETL pipelines and data structures.
  • Spearhead the technical implementation of data ingestion for the Translational Data Lake, bringing complex datasets (genomics, proteomics, imaging, lab data, etc.) into modern cloud architectures.
  • Broader Research Integration: Lead data engineering projects beyond the Data Lake, designing bespoke integration solutions for diverse scientific data sources across the Research organization.
  • Data Transformation: Design and script automated procedures to normalize unformatted data from external vendors (CROs) into a structured Common Data Model (CDM).
  • Technical Collaboration: Partner with various functions in Research and IT to align infrastructure with scientific needs, ensuring solutions are robust, FAIR-compliant, and scalable.
  • Develop and communicate the technical vision for biomarker data integration and reuse.
  • Architect and implement scalable ETL procedures, APIs and front-end tools for data access and visualization.
  • Engage stakeholders to gather requirements and incorporate feedback into design.
  • Lead user acceptance testing (UAT) and ensure high-quality deliverables.
  • Collaborate with IT and Translational leads to align infrastructure and governance processes.
  • Champion FAIR principles and interoperability across translational and clinical programs.

Minimum Qualifications

  • Education: Bachelor’s or master’s degree in Computer Science, Data Engineering, Bioinformatics, or a related field.
  • Experience: 8+ years of professional experience in data engineering or software architecture, with a focus on building production-grade data pipelines.
  • Expert-level coding proficiency in Python with specific mastery of modern data engineering libraries (Pandas, PySpark, Dask, SQLAlchemy).
  • Advanced proficiency with SQL, workflow orchestration tools (Airflow, Dagster, or Prefect), and containerization (Docker/Kubernetes).
  • Cloud Architecture: Deep experience with modern Data Lake and Lakehouse architectures (e.g., Azure Fabric, Databricks, Snowflake), with a proven track record of connecting and integrating disparate data sources.
  • Data Modeling: Solid understanding of data modeling, ETL processes, and schema design for complex datasets.
  • API Development: Experience designing and deploying APIs for data access.
  • Excellent communication and collaboration skills, with the ability to bridge the gap between IT infrastructure and scientific stakeholders.
  • Familiarity with FAIR principles and metadata standards for scientific data.

Preferred Qualifications

  • Familiarity with CDISC clinical data standards (including SDTM and ADaM) and biomarker data formats (NGS variant results, flow cytometry, serum proteomics, gene expression profiling).
  • Direct experience with Azure Fabric tools for connecting and integrating data sources.
  • Proficiency in R for interoperability with bioinformatics teams.

Kaztronix is an equal opportunity employer and does not discriminate on the basis of race, color, national origin, sex, age, religion, disability, veteran status or any other consideration made unlawful by federal, state or local laws. In addition, all human resource actions in such areas as compensation, employee benefits, transfers, layoffs, training and development are to be administered objectively, without regard to race, color, religion, age, sex, national origin, disability, veteran status or any other consideration made unlawful by federal, state or local laws.

By applying to the position, you acknowledge that your information will be used by Kaztronix in processing your application.
