Scala/Spark Developer
28,500 - 30,000 PLN
Directio Sp. z o.o.
- Remote work
Must have:
- Strong commercial experience developing production applications in Scala, including functional programming concepts (immutability, pattern matching, higher-order functions), the Scala type system, collections, sbt, and testing frameworks such as ScalaTest or munit, with the ability to write clean, maintainable, production-quality code
- Proven hands-on experience with Apache Spark, including Spark SQL, the DataFrame/Dataset API, designing and maintaining ETL/ELT pipelines, performance optimization (partitioning, shuffle optimization, broadcast joins, caching), and troubleshooting using Spark UI
- Experience working in production environments with large-scale data processing and distributed data pipelines
- Strong understanding of software engineering best practices, code quality, performance optimization, and maintainability

Nice to have:
- Experience with Kafka and Spark Structured Streaming
- Experience with Airflow or similar workflow orchestration tools
- Experience with cloud platforms such as AWS (EMR, S3, Glue), Azure, GCP (Dataproc), or Databricks
- Knowledge of the big data ecosystem, including Hadoop, Hive, Delta Lake, Parquet, and Avro
- Experience with SQL, data modeling, data warehouses, and data lake architectures
- Experience with Docker, Kubernetes, Git, and CI/CD pipelines
- Familiarity with Scala libraries such as Cats, ZIO, Akka, or Pekko
- Knowledge of Python or Java

We are looking for a Scala/Spark Developer to join an AI-first software engineering company delivering innovative AI, data, cloud, and IoT solutions for clients across Europe and beyond. As part of the engineering team, you will design, develop, and optimize scalable data processing solutions, build high-performance distributed systems, and work with modern big data technologies on challenging international projects.

We offer:
- B2B salary: 28,500 - 30,000 PLN + VAT
- Flexible working conditions
- Private healthcare, Multisport card, and professional training opportunities
Responsibilities:
- Design, develop, and maintain scalable data processing applications using Scala and Apache Spark
- Build, optimize, and maintain ETL/ELT pipelines processing large volumes of data
- Develop efficient Spark applications using Spark SQL and the DataFrame/Dataset API
- Optimize Spark jobs by improving partitioning strategies, shuffle operations, caching, and join performance
- Troubleshoot and resolve performance issues using Spark UI and monitoring tools
- Collaborate with data engineers, software developers, and stakeholders to design scalable data processing solutions
- Participate in code reviews and contribute to software quality, performance, and engineering best practices

Requirements: Scala, Functional programming, sbt, Testing, Apache Spark, Spark SQL, DataFrame/Dataset API, ETL, Spark UI, Data pipelines, Kafka, Airflow, Cloud platform, AWS EMR, AWS S3, Glue, Azure, GCP, Databricks, Big Data, Hadoop, Hive, SQL, Data warehouses, Data Lake, Docker, Kubernetes, Git, CI/CD pipelines, Akka, Python, Java

Additionally: Sport subscription, Training budget, Private healthcare.