Senior Data Engineer
ChrysaLabs
We believe in taking care of the world around us. Providing the planet with the food it needs is no small task. Every day we work to make agriculture more efficient, more sustainable.
2025 will be a big year for us. We are looking for a Senior Data Engineer to join our team of nearly 50 professionals and support ChrysaLabs’ footprint in the North American market.
Halfway between agriculture and science, we are developing real-time soil analysis technology for precision agriculture.
Based on artificial intelligence, cloud computing and many other patented technologies, we allow agronomists and producers to better manage their soils. We believe that improving the agriculture of tomorrow requires a better understanding of what is happening under our feet.
If you want to help change the way we cultivate our fields and contribute to food security, all within a dynamic, technology-driven, growing company, there is a good chance we should talk.
JOB SUMMARY
The data engineer plays a key role in supporting the data science team’s efforts to enhance the performance of ChrysaLabs’ predictive models and scale large soil data processing projects. This role blends technical expertise in data engineering with a strong understanding of cloud systems, real-time data integration, and infrastructure optimization. The candidate will be responsible for creating new data pipelines while collaborating with the infrastructure team to align with existing pipelines. They will focus on ensuring efficient data storage, processing, and retrieval, while working closely with data analysts, scientists, and agronomists to turn data into actionable insights.
MAIN RESPONSIBILITIES
– Design and Develop Data Pipelines: Create and optimize data pipelines to support efficient data ingestion, processing, and transformation for model training and automated analysis.
– Collaborate with the Infrastructure Team: Work with the infrastructure team to understand the company-wide data architecture and create new structures that better align with data science needs.
– Manage Data for Analysis and Modeling: Ensure data is structured, cleaned, and stored in a way that facilitates easy access (e.g., SQL-based relational database design) and processing for analysis and predictive modeling.
– Keep the Data Model Flexible: Analyze and enhance the existing data model to create a high-quality entity-relationship structure, so that the model and schema can accommodate new sensors, soil properties, and composite types and remain robust to evolving data science requirements.
– Automate Data Processes: Build automation tools that streamline data processing workflows, enabling efficient data analysis and report generation for clients. Automate pipeline processes and schedule refreshes to keep data pipelines efficient and up to date.
– Optimize Existing Machine Learning Pipelines: Learn and work with the current pipelines used for model training, testing, and deployment. Enhance and maintain them where needed, building new ones only when necessary to improve performance and scalability, and ensure they integrate efficiently with the existing infrastructure. Monitor and tune pipeline performance regularly to keep operations smooth and scalable.
– Maintain Data Quality and Consistency: Implement and monitor quality-control measures to ensure data accuracy, integrity, and reliability across all pipelines. Where necessary, establish staging layers that only transform raw data, alongside a separate, independently quality-checked layer for querying.
– Ensure Data Traceability: Implement data-lineage tracking so that all data transformations and manipulations are traceable, and make the database resilient to historical data overwrites, preserving data integrity and continuity.
– Support Reporting and Visualization: Build the foundation for automatically generating client-facing reports and data visualizations from processed data and model outputs.
– Innovate with New Tools and Techniques: Continuously seek opportunities to improve data management and processing, incorporating new tools, frameworks, and technologies that increase efficiency and scalability.
– Facilitate Cross-Functional Collaboration: Work closely with data scientists, data analysts, and other teams to support their data needs, ensuring alignment with the company’s goals and objectives. Bring strong communication skills, a collaborative mindset, and a proactive, self-starter approach to identifying and solving challenges.
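The staging-and-quality-check pattern described in the responsibilities above could be sketched, in very simplified form, as follows. All names here (fields, record types, thresholds) are hypothetical illustrations, not ChrysaLabs’ actual schema or pipeline:

```python
# Minimal sketch of a two-layer pipeline: a staging layer that only
# transforms raw records (no filtering), and a separate layer with
# independent data quality checks for querying.
# All field names and thresholds are hypothetical illustrations.

from dataclasses import dataclass


@dataclass
class SoilReading:
    sensor_id: str
    ph: float
    organic_matter_pct: float


def stage(raw_rows: list[dict]) -> list[SoilReading]:
    """Staging layer: cast raw dicts into typed records; keep everything."""
    return [
        SoilReading(
            sensor_id=str(r["sensor_id"]),
            ph=float(r["ph"]),
            organic_matter_pct=float(r["om"]),
        )
        for r in raw_rows
    ]


def quality_check(rows: list[SoilReading]) -> list[SoilReading]:
    """Quality layer: independent validity checks applied after staging."""
    return [
        r for r in rows
        if 0.0 <= r.ph <= 14.0 and 0.0 <= r.organic_matter_pct <= 100.0
    ]


raw = [
    {"sensor_id": "probe-1", "ph": "6.4", "om": "3.2"},
    {"sensor_id": "probe-2", "ph": "19.0", "om": "2.9"},  # invalid pH
]
staged = stage(raw)            # staging keeps every record (2)
clean = quality_check(staged)  # querying layer drops the bad reading (1)
print(len(staged), len(clean))  # -> 2 1
```

Keeping the two layers separate means raw data is never silently discarded at ingestion time, while downstream consumers query only validated records.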
PROFESSIONAL REQUIREMENTS
– A university degree in computer science, information technology, software engineering, or a related field is required. A master’s degree or higher may be beneficial.
– 4+ years of experience in a data engineering role or equivalent.
KNOWLEDGE AND SKILLS
Technical skills
– Mastery of commonly used data engineering tools and experience with data pipeline frameworks and orchestration tools.
– Strong experience with Python for data pipeline automation; experience with Go (Golang) and gRPC is a plus.
– Proficiency in SQL, data modeling, and performance optimization, with a strong understanding of both relational and NoSQL databases (e.g., MongoDB, DynamoDB).
– Experience with cloud platforms (AWS, GCP, or Azure), with an emphasis on data infrastructure services like S3, Redshift, or BigQuery.
– Proficient in ETL tools and frameworks (e.g., Apache Airflow, Spark).
– Understanding of DevOps principles, containerization (Docker, Kubernetes), and CI/CD pipelines.
– Understanding of data governance, auditing, and security principles.
– Familiarity with a range of data architectures (data lakes, data warehouses, etc.).
– Proficiency in designing and optimizing data workflows for processing and storing large-scale datasets.
– Knowledge of handling georeferenced data and familiarity with soil science are assets.
– Ability to design and implement scalable data architectures, ensuring efficient data flow and storage across systems.
– Strong communication skills to explain complex data structures and processes to both technical and non-technical audiences.
Cross-functional skills
– Strong ability to communicate technical concepts and processes clearly and effectively, both in writing and verbally, particularly when preparing documentation or presenting system designs. The ability to communicate in French is also an asset.
– Skilled at understanding the requirements of internal and external stakeholders, and delivering the data infrastructure and pipelines that support their decision-making.
– Familiarity with data protection regulations and best practices in data handling and security is a plus.
– Capable of adapting to changes within the team and adjusting data engineering processes to meet the evolving needs of the company and market demands.
WE PLAY AS A TEAM, WE WIN AS A TEAM
– A startup spirit that promotes sustainable development by helping to reduce the impact of agriculture on the environment
– A stimulating and dynamic work environment. For real.
– Modern facilities, with tools to help you do your job easily
– Adjustable desks, plants everywhere, the most beautiful light north of Mount Royal.
– Fully stocked kitchen.
– An excellent corporate culture that understands that without humans, there would be no business.
– Flat, genuinely accessible management: our CEO plays on the office hockey team, and half the team goes bouldering with our CTO (he’s pretty good!)
– Group insurance and pension plan with employer participation
– Career growth mindset ingrained in our DNA. We want you to grow.
– Highly accommodating remote-work, flexible-schedule, and work-family policies
– Annual allowance for your well-being.
– Nerf gun available for foam ball fights!
– Frequent 5 à 7s (happy hours), Lunch & Learns, etc.
Does this offer appeal to you? Send us your application!