Data Engineer
Description:
We are looking for an experienced Data Engineer with a track record of building large-scale data pipelines and data lake ecosystems. Our daily work revolves around solving interesting and exciting problems to high engineering standards. Although you will be part of the backend team, you will work with cross-functional teams across the organization.
Role Profile
This role demands strong hands-on skills in different programming languages, especially Python, and knowledge of technologies like Kafka, AWS Glue, CloudFormation, ECS, etc. You will spend most of your time facilitating the seamless streaming, tracking, and collating of huge data sets. This is a back-end role, but not limited to it. You will work closely with producers and consumers of the data and build optimal solutions for the organization. We will appreciate a person with plenty of patience and a strong understanding of data. Also, we believe in extreme ownership!
What will you do…
1. Track, process, and manage huge amounts of data (100+ million records/day).
2. Design and build systems to efficiently move data across multiple systems and make it available to teams such as Data Science, Data Analytics, and Product.
3. Design, construct, test, and maintain data management systems.
4. Understand the data and business metrics required by the product, and architect systems that make that data available in a usable/queryable manner.
5. Ensure that all systems meet business and company requirements as well as industry best practices.
6. Stay abreast of new technologies in our domain.
7. Recommend ways to continually improve data reliability and quality.
Job Requirements:
What we need…
1. Bachelor's or Master's degree, preferably in Computer Science or a related technical field
2. 1-5 years of relevant experience
3. Deep knowledge of and working experience with the Kafka ecosystem
4. Good programming experience, preferably in Python, Java, or Go, and a willingness to learn more
5. Experience working with large-scale data platforms
6. Strong knowledge of microservices and of cloud data warehouse and data lake systems, especially AWS Redshift, S3, and Glue
7. Strong hands-on experience writing complex, efficient ETL jobs
8. Experience with version control systems (preferably Git)
9. Strong analytical thinking and communication skills
10. Passion for finding and sharing best practices and driving discipline for superior data quality and integrity
11. Intellectual curiosity to find new and unusual ways to solve data management issues
Brownie Points (actually, we would be delighted if you have these)
1. Knowledge of AWS Kinesis, Elasticsearch, and Snowplow
2. Knowledge of CI/CD
3. Understanding of infrastructure management and maintenance
4. A good sense of humour. It harms no one.