Senior Data Engineer

Zinkal Thakker

Building reliable data systems for modern analytics.

Published Research Author · IEEE Conference Presenter · Real-time Data Platforms

Data Engineer with 5+ years of experience designing scalable ETL/ELT pipelines, real-time streaming workflows, and analytics-ready data models using Python, SQL, Kafka, Flink, ClickHouse, and cloud technologies.

5+ Years Experience
Kafka + Flink Streaming Pipelines
AI + Data Research Focus
Core Expertise

Data engineering across streaming, warehousing, and analytics.

Focused on dependable data pipelines, clean modeling, and decision-ready datasets.

Real-time Data Pipelines

Apache Kafka, Apache Flink, Java, CDC processing, and high-throughput streaming patterns.

Analytics Engineering

ETL/ELT workflows, dimensional modeling, star-schema design, reconciliation, and data validation.

Cloud & Databases

AWS, ClickHouse, Oracle, SQL Server, PostgreSQL, Redshift, Glue, Athena, and query optimization.

Experience

Professional journey

Feb 2024 – Present

Senior Data Engineer

Boar’s Head · Florida, USA
  • Built real-time data pipelines with Apache Kafka and Apache Flink (Java) to stream Salesforce CDC data into ClickHouse.
  • Developed scalable ELT workflows and transformation logic using SQL and Python.
  • Implemented validation and reconciliation checks across streaming and warehouse datasets.
  • Improved query performance through ClickHouse indexing and partitioning.
Jul 2023 – Jan 2024

Data Engineer

iConsult Collaborative · Syracuse, USA
  • Developed data models for a 100k+ record database, improving performance and reliability.
  • Created Excel and Tableau dashboards to track sales, campaign effectiveness, and executive KPIs.
Jun 2022 – Aug 2022

Software Engineer Intern

JetBlue Airways · New York City, NY
  • Ingested data from SQL sources and the Salesforce API using Python to build BI-ready reporting views.
  • Modeled relational datasets using star-schema principles to improve reporting performance.
2018 – 2021

Data Engineer / Software Developer

Rethinksoft LLP & Cygnet Infotech · Gujarat, India
  • Designed PL/SQL queries, stored procedures, views, indexes, and database changes for reporting and data quality.
  • Built fact and dimension datasets, visualizations, and reliable provider information workflows.
Technical Skills

Tools I work with

Python, SQL, Java, JavaScript, TypeScript, Go
Apache Kafka, Apache Flink, Apache Spark, Airflow
ClickHouse, Oracle, SQL Server, PostgreSQL, MySQL
AWS S3, AWS Glue, Athena, Redshift, Lambda
Tableau, Power BI, Docker, Kubernetes, Git, Jenkins
Research & Publications

Published work and conference presentations

Research focused on cloud cost optimization, data quality monitoring, AI-enabled workflow orchestration, and enterprise data platforms.

Published

Cost Optimization Techniques for Efficient Resource Allocation in Cloud Computing Environments

Google Scholar Publication

Research focused on cloud infrastructure optimization, efficient resource allocation, cost reduction strategies, and scalable computing environments.

Conference Presentation · ICSSCNA 2026

Self-Supervised Anomaly Detection Models for Continuous Data Quality Monitoring in Enterprise Platforms

International Conference on Signal, Systems, and Computing for Next-Gen Automation

Presented research on self-supervised anomaly detection for continuous data quality monitoring across enterprise data platforms.

Conference Paper · ICISCN 2026

Agentic AI Framework for Autonomous Multi-Step Workflow Orchestration in Enterprise Cloud-Edge Infrastructure

2nd International Conference on Intelligent Systems and Computational Networks

Paper ID 848. Research focused on agentic AI, autonomous workflow orchestration, and enterprise cloud-edge infrastructure.

Get in Touch

Let’s connect.

Open to data engineering opportunities, technical collaborations, speaking engagements, and meaningful data projects.