Key technical skills:
- Python - Pandas, OpenCV, Django, NumPy - Must have
- Any ETL tool (Ab Initio, Teradata, etc.) - Must have
- Any NoSQL database is acceptable: MongoDB (preferred) or another related NoSQL database for JSON/Java - Good to have
- Deep understanding of Apache Avro - Good to have
Job Description:
- We are looking for a senior data engineer to build our next-generation big data processing platform.
- You will be responsible for designing, implementing, and executing ETL processes with the above-mentioned technical tool chain.
- Extensive experience in microservices, REST services, JPA, and automated unit testing through tools, and strong experience designing and developing REST APIs.
- Knowledge of architecture and design concepts, object-oriented design and techniques.
- Hands-on experience building microservices with Python and Java and integrating them with Ab Initio, NoSQL databases, Microsoft Presidio, and the other tools mentioned above.
- Experience using Spring Boot, Spring Cloud, Cloud Foundry (PCF)/AWS, and Docker.
- Experience in J2EE technologies with REST APIs, JSON, NoSQL databases, Hibernate, messaging, front-end technologies (CSS, HTML, AngularJS or a similar framework), and web and application servers.
- Experience implementing CI/CD build pipelines with tools such as Git, Jenkins, Ansible, Maven, Sonar, and Artifactory.
- Good exposure to JMS environments and hands-on experience with RabbitMQ / ActiveMQ / Kafka.
- Candidate should be an expert in Python, including the NumPy and Pandas libraries, with data file processing skills using other related Python libraries.
- Candidate should be able to integrate multiple data sources and databases into one system.
- Candidate should have strong data engineering skills (data parsing, web scraping, data transformation, data integration, etc.).
- Candidate should have knowledge of data analysis tools (for example, pandas) and NLP tools (for example, spaCy, Stanford NLP, etc.).
- Strong programming experience with OOP, lists, dictionaries, multi-level (nested) dictionaries, regular expressions, PyUnit, etc.
- Experience in relational database environments such as Oracle, MySQL, and PostgreSQL, including system-related calls and API development, and experience designing database schemas that represent and support business processes.
- Candidate should have knowledge of user authentication and authorization between multiple systems, servers, and environments.
- Candidate should have good knowledge and understanding of Google Cloud data processing services such as BigQuery, Bigtable, Cloud Storage, Cloud Dataflow, Cloud Functions, and Pub/Sub.