Collaboration Betters The World

PS | Data Engineer

Location
Ho Chi Minh City, Vietnam
Category
Consultancy
Position Type
Regular Full-Time
Working Model
Hybrid

Overview

About Our Partner

Our vision is to create a better way to trade.

Our purpose is to bring the world’s financial markets to astute, adventurous traders, powered by intuitive, feature-rich technology and unrivalled client experience.

We are an online Forex and CFD broker providing traders globally with cutting-edge technology to trade the world’s markets. Our award-winning combination of outstanding client service and low-cost pricing across all FX, CFDs, and Commodities has made Pepperstone one of the world’s fastest-growing FX brokers. Our focus is to support every one of our clients’ quests for trading mastery.

The Pepperstone story started in 2010: born digital, nimble, and global. Our team is encouraged to develop their careers and enjoy their work in our vibrant and fast-paced fintech environment. Our culture reflects our values and global positioning: proudly diverse and inclusive, adaptive, progressive, collaborative, and fixated on results and our clients.

 

Role Scope Summary

As a Data Systems Engineer, you will play a crucial role in building and maintaining data infrastructure and pipelines that are reliable, scalable, and aligned with business needs. You will work on both real-time and batch processing systems, ensuring data integrity and availability across the organisation. This role requires collaboration with cross-functional teams to implement innovative solutions that address the demands of our rapidly evolving technology landscape.

You will work closely with Pepperstone’s Data Platform and Data System Engineering teams, which are responsible for the collection, transfer, transformation, and storage of customer and trading data to support a variety of internal and external purposes.

Qualifications

  • Proficient in verbal and written English.
  • 3-5 years of experience in data engineering or a related field, with a strong understanding of data processing and distributed systems.
  • Proficiency in Python and/or Go, with experience in building data-intensive applications.
  • Experience with data streaming and batch processing technologies, particularly Kafka, Redshift, and related tools.
  • Solid experience with AWS services (e.g. S3, Redshift, RDS) and infrastructure-as-code tools (e.g. Terraform).
  • Hands-on experience with Airflow, DBT, or other workflow automation tools.
  • Knowledge of various data formats (e.g. Parquet, Avro) and table formats (e.g. Iceberg, Hudi).
  • Understanding of DBMS concepts and experience with both SQL and NoSQL databases.
  • Awareness of data security best practices and compliance requirements such as GDPR.
  • Strong problem-solving skills, with the ability to troubleshoot complex data issues and propose innovative solutions.
  • Adaptability to work in a fast-paced, evolving environment with shifting priorities.
  • A proactive approach to identifying improvements and addressing challenges.
  • Strong attention to detail to ensure data systems are accurate and reliable.
  • Excellent communication skills, both verbal and written, and the ability to collaborate effectively in a team-oriented environment.
  • Ability to live the Pepperstone values.
  • Commitment to ongoing learning and development.

As well as practical experience with most of the following languages and technologies:

Programming Languages:

  • Scala
  • Golang
  • Python

 

Data Streaming and Batch:

  • Apache Kafka and related technologies such as Kafka Streams and Kafka Connect
  • Airflow

Amazon Web Services (examples):

  • Redshift
  • RDS
  • Lake Formation
  • Glue
  • S3
  • Macie
  • EKS

 

DevOps / Infrastructure-as-code:

  • Terraform
  • Ansible
  • Docker
  • Kubernetes
  • CI systems such as Buildkite and GitHub Actions

 

Other:

  • Data formats: e.g. Apache Parquet, Avro, Arrow, and protobuf
  • Data processing and data lake technologies such as Apache Hudi
  • DBMS fundamental concepts as well as specific features of popular products (e.g. MySQL, PostgreSQL, MongoDB)
  • Modelling: ERDs, data flow diagrams, and state diagrams

 

Responsibilities

  • Maintain and Enhance Data Infrastructure: Continuously improve and maintain data infrastructure, ensuring optimal performance and reliability.
  • Develop Data Pipelines: Design, implement, and optimise data pipelines for batch and real-time data processing using Python, Go, and other relevant tools.
  • Data Systems Management: Manage and monitor data systems including Kafka, Redshift, and Kubernetes, ensuring they meet performance and reliability standards.
  • Collaboration & Integration: Work closely with stakeholders and team members to integrate new data sources and support existing ones, ensuring seamless data flow across systems.
  • Workflow Automation: Develop and manage automated workflows using Airflow and DBT to support data processing and transformation tasks.
  • Cloud Infrastructure: Leverage AWS services to build and maintain cloud-based data infrastructure, ensuring scalability and security.
  • Monitoring & Troubleshooting: Implement monitoring solutions and troubleshoot issues to maintain high availability and performance of data systems.
  • Documentation & Best Practices: Document systems, processes, and best practices, ensuring that knowledge is shared and systems are well maintained.
  • Collaboration on Development Practices: Work with the team to improve software development practices, including coding standards, test-driven development, and CI/CD processes.
  • Weekly Support Rotation: Participate in the weekly support rotation, providing first-level support for any incidents impacting the data engineering space.
