OnFleet Integrations Fix for Reverse Logistics Company

Our client was a delivery and reverse-logistics company in the fashion industry. They had integrations with OnFleet, a last-mile delivery provider, to send and track deliveries for their clients.

After their previous CTO left, their OnFleet integrations broke, and orders stopped showing up in the dashboard.

The issue hit both revenue and customers, as they were no longer receiving orders from their clients.

After hiring another contractor who failed to resolve the issue, they brought in All-In Consulting to fix this integration.

Action

We worked through the problem step by step. First, we checked whether requests from their customers were arriving at all. When we found those requests in the database, we concluded the issue was internal.
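The first check above can be sketched as a simple count of stored orders by status. This is only an illustration: the `orders` table, its `status` and `received_at` columns, and the sample rows are hypothetical and not the client's actual schema or data. If orders are still being stored but none move past the initial status, the problem sits downstream of intake.

```python
import sqlite3


def count_orders_by_status(conn, since):
    """Return {status: count} for orders received on or after `since` (ISO date string)."""
    rows = conn.execute(
        "SELECT status, COUNT(*) FROM orders "
        "WHERE received_at >= ? GROUP BY status",
        (since,),
    ).fetchall()
    return dict(rows)


# Demo against an in-memory database with made-up rows (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, received_at TEXT)"
)
conn.executemany(
    "INSERT INTO orders (status, received_at) VALUES (?, ?)",
    [
        ("received", "2024-01-02"),
        ("received", "2024-01-03"),
        ("dispatched", "2023-12-30"),
    ],
)

# Orders keep arriving after the cutoff, but none have been dispatched.
counts = count_orders_by_status(conn, "2024-01-01")
print(counts)
```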

Next, we took a closer look at how these orders were being processed. The client was using Celery workers to process them as tasks, and the worker had been backed up with unprocessed tasks ever since the integration broke.

This pointed to the Celery worker having run out of resources, so we updated their Dockerfile, rebuilt the image, and restarted the Celery worker.
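One way to spot this kind of backlog is to check how many tasks are waiting in the queue. The sketch below assumes a Redis broker, which the case study does not confirm. With Redis, Celery keeps pending tasks in a list named after the queue, and the default queue is called `celery`. The helper names and the threshold are illustrative.

```python
def pending_task_count(redis_client, queue_name="celery"):
    """Number of tasks waiting in a Celery queue, assuming a Redis broker."""
    # With a Redis broker, each queue is a Redis list, so LLEN gives its length.
    return redis_client.llen(queue_name)


def is_backed_up(redis_client, threshold=100, queue_name="celery"):
    """True when more than `threshold` tasks are waiting (threshold is arbitrary)."""
    return pending_task_count(redis_client, queue_name) > threshold


# Usage sketch (requires the `redis` package and a reachable broker):
#   import redis
#   client = redis.Redis(host="localhost", port=6379)
#   if is_backed_up(client):
#       print("Queue is backing up:", pending_task_count(client))
```

Running a check like this on a schedule and alerting on it would flag a stalled worker well before customers notice missing orders.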
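The case study does not say which resource ran out or what changed in the Dockerfile, so the following is not the actual fix. As a hedged illustration, these Celery settings are a common way to keep a long-running worker from slowly exhausting memory. The values are placeholders to tune per workload.

```python
# celeryconfig.py: illustrative worker limits (all values are assumptions)

# Recycle each worker process after this many tasks to contain slow leaks.
worker_max_tasks_per_child = 100

# Recycle a worker process once its resident memory exceeds this (kilobytes).
worker_max_memory_per_child = 200_000

# Reserve only one task per process at a time instead of prefetching many.
worker_prefetch_multiplier = 1

# Acknowledge a task after it finishes, so a crashed worker's task is redelivered.
task_acks_late = True

# Load in the application with: app.config_from_object("celeryconfig")
```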

The backlog of orders began processing immediately, and their customer confirmed they could see orders in their dashboard as before.

Results

We completed the investigation and bug fix in just three days. Their customers verified that they were seeing orders again, and normal operations resumed.

We also fixed many of the permissions issues that the founder was facing, and helped them regain control of their tech stack.

We also delivered recommendations for improvements to their tech infrastructure going forward.

Lessons Learned

1. Importance of Technical Understanding

It's important for non-technical founders to have at least a high-level understanding of their architecture. Otherwise, they can become too reliant on their CTO for fixes, especially if there are staff changes later on.

At minimum, create a system design diagram that makes clear what the different components of the infrastructure are. This avoids last-minute scrambles to figure out what the previous engineers did when something breaks.

It also helps anyone called in to fix the issue to know where to start.

2. Take Control of Your Permissions

Part of what complicated this fix was that the founder didn't own parts of their infrastructure, as it had been built under their previous engineers' accounts.

As such, we recommend that founders create all the accounts themselves and make sure they have admin access to everything that is built, in case anyone leaves.

3. Overcomplicated Technical Design

Lastly, we noticed the architecture involved a Celery queue to handle requests.

However, their order volume was not large enough to justify a queue. Furthermore, managing their own queue infrastructure made breakages more likely, as there was simply more to maintain.

Instead of managing their own queues, we recommend using a managed queue service like Amazon SQS, which takes over this kind of infrastructure maintenance and tends to be more reliable than a self-hosted solution.
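As a rough sketch of what handing an order to a managed queue could look like, the helper below sends one message to SQS through a boto3 client. The function name, the order fields, and the queue URL are hypothetical. The client is passed in so the helper can be exercised without AWS credentials.

```python
import json


def enqueue_order(sqs_client, queue_url, order):
    """Send one order to an SQS queue and return the message ID SQS assigns."""
    response = sqs_client.send_message(
        QueueUrl=queue_url,
        MessageBody=json.dumps(order),  # SQS message bodies are strings
    )
    return response["MessageId"]


# Usage sketch (requires the `boto3` package and AWS credentials):
#   import boto3
#   sqs = boto3.client("sqs")
#   enqueue_order(sqs, "<your queue URL>", {"order_id": 123})
```

A consumer would then poll the queue, forward each order to OnFleet, and delete the message once it succeeds, with no broker or worker host to maintain for the queue itself.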

---

If you have integration or software needs, contact michael@allinengineeringconsulting.com or reach out to us at https://www.allinengineeringconsulting.com/contact for a free 30-minute intro call.