We are looking for a Data Operations Engineer to join our growing team. The hire will be responsible for monitoring and optimizing our data architecture and for day-to-day data operations, while working on projects to enhance the functionality and reliability of the overall data infrastructure. This position is a mix of project-based and production support work, with an emphasis on building a robust data system. The Data Operations Engineer will support our data engineers, software developers, infrastructure engineers, data analysts, and data scientists on data initiatives and will ensure a reliable and efficient data architecture. They must be independent and comfortable supporting the data needs of multiple teams, systems, and products.
Responsibilities for the Data Operations Engineer
• Maintain optimal data pipeline architecture.
• Identify, design, and implement internal process improvements: automating manual processes, optimizing data usage, etc. (a minimal automation sketch follows this list).
• Work with stakeholders including the Executive, Product, Data, and Design teams to assist with data-related technical issues and support their data engineering needs.
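For illustration, a minimal sketch of the kind of process automation this role involves: replacing a manual daily load with a scheduled Airflow job (Airflow appears in the tools list below). The DAG id, task, and data source are hypothetical, and the loading logic is elided.

```python
# Minimal Airflow 2.x DAG sketch: a scheduled replacement for a manual
# daily extract load. All names here are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def load_daily_extract(ds, **_):
    # Hypothetical loader: e.g. copy the day's extract into a staging
    # table, then merge it into the target table. Implementation elided.
    print(f"loading extract for {ds}")


with DAG(
    dag_id="daily_extract_load",        # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",         # `schedule=` on newer Airflow
    catchup=False,
) as dag:
    PythonOperator(
        task_id="load_daily_extract",
        python_callable=load_daily_extract,
    )
```

In practice the callable would carry the team's own extraction, validation, and upsert steps; the point is that a manual process becomes scheduled, retryable, and monitored.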
Qualifications for the Data Operations Engineer
• Advanced SQL knowledge and experience with relational databases, including query authoring and working familiarity with a variety of database systems.
• Experience maintaining and optimizing ‘big data’ pipelines, architectures, and data sets.
• Experience building processes that support data engineering work, e.g. data transformation, dependency management, and workload management.
• Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ stores.
• Strong project management and organizational skills.
• Experience supporting and working with cross-functional teams in a dynamic environment.
• Experience in a Data Engineering role, with a graduate degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field.
• Experience with some of the following software/tools:
• Experience with big data tools: Hadoop, Spark, Kafka, etc.
• Experience with relational SQL and NoSQL databases (e.g., MySQL, CockroachDB, HBase).
• Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
• Experience with GCP cloud services: BigQuery, Dataflow, Pub/Sub, etc.
• Experience with deployment and CI/CD tools: Kubernetes, Jenkins, Cloud Build, etc.
• Experience with stream-processing systems: Storm, Spark Streaming, Beam, etc. (see the sketch after this list).
• Experience with various programming languages: Python, Java, Scala, Go, etc.
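As a rough illustration of how the GCP and stream-processing items above fit together, here is a minimal Apache Beam sketch of a Pub/Sub-to-BigQuery streaming pipeline. The project, topic, table, and schema are hypothetical placeholders, not a prescribed design.

```python
# Minimal Apache Beam sketch: read JSON events from Pub/Sub and append
# them to a BigQuery table. All resource names are hypothetical.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def run():
    options = PipelineOptions(streaming=True)  # DataflowRunner in production
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadEvents" >> beam.io.ReadFromPubSub(
                topic="projects/example-project/topics/events")
            | "ParseJson" >> beam.Map(json.loads)
            | "WriteToBQ" >> beam.io.WriteToBigQuery(
                "example-project:analytics.events",
                schema="user_id:STRING,event:STRING,ts:TIMESTAMP",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )


if __name__ == "__main__":
    run()
```

The same pipeline can be exercised locally with the DirectRunner and deployed unchanged to Dataflow as a streaming job.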