Data Engineer — Big Data & Cloud Infrastructure
🌍 Remote | 🕐 Full-time
We are looking for a Data Engineer for our client, a US technology company specializing in digital solutions for roadway and infrastructure management. Their platforms provide government agencies and engineering firms with integrated tools for real-time pavement performance analytics, enabling a proactive approach to roadway maintenance.
What we expect from our candidates:
- 5+ years of experience in data engineering or big data roles
- Expert knowledge of Apache Spark (Core, SQL, DataFrame APIs)
- Hands-on experience with Apache Iceberg and Apache Parquet
- Proficiency with Databricks and Amazon EMR
- Strong experience deploying and managing Spark clusters on open-source Kubernetes (required)
- Deep understanding of Docker and containerization for Spark applications
- Experience with the Spark on Kubernetes operator, resource scaling, and job orchestration
- Solid programming skills in Java and Python
- Good problem-solving skills with a focus on performance optimization
- Experience with distributed systems and cloud-native infrastructure
- English — B2+ level
Nice to have:
- Experience with Flyte workflow orchestration
- Familiarity with AWS, Azure, or GCP
- Knowledge of CI/CD pipelines for data deployments
- Understanding of monitoring/observability tools for distributed systems
What you will be doing:
- Design and implement scalable data processing pipelines with Apache Spark
- Deploy and manage Spark clusters on open-source Kubernetes infrastructure
- Optimize data storage and access with Iceberg and Parquet formats
- Collaborate with data scientists and engineers to productionize ML workflows
- Apply best practices for building and running containerized big data workloads
- Contribute to a platform used for real-time analysis of transportation data and mobility optimization
📩 Interested?
We look forward to hearing from you!