Our client is solving the hardest data problems in infrastructure while building the market & opportunity intelligence platform that enables a faster, more transparent, and more efficient infrastructure marketplace.
They have developed the largest and fastest-growing data platform on North American infrastructure, with over 1 billion public documents from more than 31,000 cities and utilities in the US & Canada, triangulated with dozens of government data sources.
They are now expanding quickly into new industries, including public transportation, roads and highways, power, gas, healthcare, waste management, and more.
About you:
You are a capable Data Platform Developer with experience designing systems for data-intensive applications. You have developed, deployed, and monitored systems at scale in the cloud. You will play a key role in the sourcing and indexing of documents coming into their data lake.
What you will do:
- Write clean, performant code to ensure their web crawler operates at scale, improving resource efficiency, ingestion performance, and metrics monitoring
- Build distributed systems and web services and deploy them with Kubernetes and/or Google Cloud Platform
- Integrate streaming/batching technologies (RabbitMQ/Pulsar/Kafka)
- Drive and contribute to platform design and requirements documentation
- Analyze logs and metrics to troubleshoot, identify bugs, and find their root cause
- Advocate for platform automation, from testing and deployment to monitoring
- Work closely with the data engineering and data visualization teams to help stakeholders leverage the big data environment
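To give a concrete (and purely hypothetical) flavor of the ingestion work described above, here is a minimal sketch of an event-driven document-ingestion worker. A stdlib `queue.Queue` stands in for a real broker such as RabbitMQ or Kafka, and the document fields and metric names are illustrative assumptions, not the client's actual schema:

```python
import json
import queue
import threading

# Stand-in for a broker queue; RabbitMQ/Pulsar/Kafka would replace this in production.
events = queue.Queue()
indexed = []
metrics = {"ingested": 0, "failed": 0}

def worker():
    """Consume document events, index the valid ones, and record basic metrics."""
    while True:
        msg = events.get()
        if msg is None:              # sentinel: shut down cleanly
            events.task_done()
            break
        try:
            doc = json.loads(msg)
            indexed.append({"id": doc["id"], "source": doc.get("source", "unknown")})
            metrics["ingested"] += 1
        except (json.JSONDecodeError, KeyError):
            metrics["failed"] += 1   # would go to a dead-letter queue in practice
        finally:
            events.task_done()

t = threading.Thread(target=worker)
t.start()
events.put(json.dumps({"id": "doc-1", "source": "city-portal"}))
events.put("not json")               # malformed message exercises the error path
events.put(None)                     # shutdown sentinel
events.join()
t.join()
```

The metrics dictionary is where a real deployment would hook in a monitoring client, and the `except` branch is where a dead-letter queue or alert would live.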
Requirements:
- 4+ years of demonstrated experience coding enterprise software with near-zero tolerance for errors and poor performance
- 4+ years of experience developing production-level systems in Python and JavaScript
- A track record of achieving technical excellence and driving quality results and customer satisfaction
- Experience designing/implementing components for an event-driven data platform:
  - RabbitMQ/Pulsar/Kafka/KEDA
  - Redis/Celery
  - ElasticSearch/Kibana
- Experience with containerized applications using Docker, Kubernetes objects, Flask, or similar
- Experience with automation/CI/CD and infrastructure provisioning as code (e.g. Helm, Terraform)
- Experience in web crawler technologies (Scrapy, selection/re-visit policies, proxy pools)
- Knowledge of networking, data and storage management principles
- Experience with API design and implementation
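As an illustration of the re-visit policies named in the crawler requirement above, here is a minimal sketch of a uniform re-visit scheduler. The fixed interval and the URLs are illustrative assumptions; a production crawler (e.g. Scrapy with custom scheduling) would typically adapt the interval per page based on observed change frequency:

```python
import heapq

class RevisitScheduler:
    """Toy uniform re-visit policy: every URL is re-crawled at a fixed interval."""

    def __init__(self, interval: float):
        self.interval = interval
        self._heap = []          # min-heap of (next_due_time, url)

    def schedule(self, url: str, now: float = 0.0) -> None:
        """Register a URL for crawling, due immediately."""
        heapq.heappush(self._heap, (now, url))

    def due(self, now: float) -> list:
        """Pop every URL whose re-visit time has arrived and reschedule it."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            _, url = heapq.heappop(self._heap)
            ready.append(url)
            heapq.heappush(self._heap, (now + self.interval, url))
        return ready

sched = RevisitScheduler(interval=60.0)
sched.schedule("https://example.gov/minutes")
sched.schedule("https://example.gov/permits")
first = sched.due(now=0.0)    # both URLs due on the first pass
nothing = sched.due(now=30.0)  # interval has not elapsed, nothing due
again = sched.due(now=61.0)    # both URLs due again after the interval
```

A proportional policy (re-visiting frequently changing pages more often) would replace the single `interval` with a per-URL value updated from crawl history.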
Nice to have:
- Experience with SQL database design, optimization and writing queries
- Strong technical written and oral communication skills
Benefits + perks:
- This is a rare opportunity to influence positive change within one of the biggest societal challenges of our generation (infrastructure – from water to transport to energy and more).
- Be part of a scale-up with no corporate bureaucracy. You will accomplish more here in a few months than you would in a few years at a large, entrenched technology company.
- Comprehensive health and dental benefits with an emphasis on mental health
- Annual Learning and Development Budget and Plans
- Flexible hours and work from wherever you are, with the option to work in our downtown office if interested!
- Competitive salary and equity incentives to give you a stake in our future
- We work hard. We play hard. From virtual trivia to exploring the food scene around our office, we like to get together!