Navigating the Realities of AI/ML: Lessons Learned from the Trenches

I recently came across an insightful article on the realities of working with AI and machine learning (ML). It resonated with my own experiences, particularly the points that "ML is not software engineering" and "the biggest gains are not from choosing the right model, but from framing a problem in the right way." These nuggets of wisdom are crucial for anyone diving into the field.

My Advice to Machine Learning Newbies After 3 Years in the Game

Having spent over three years doing machine learning at a startup, I learned some hard lessons. We raised funding and built some impressive technology, but we also wasted a lot of time. Here's my advice based on that experience, which I believe is relevant to all beginners, especially those looking to solve real-world problems.

1. Stay Away from Unsupervised Learning

Unsupervised learning was a significant time-waster for us. Despite recommendations from numerous AI PhDs, our attempts at unsupervised learning yielded no value. It typically involves clustering models trained on untagged data to uncover unknown patterns, but human intuition outperformed it every time in our case. While there might be cool applications in the space, it's not the place for easy wins. Focus on supervised learning until you have more experience.
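To make the comparison concrete, here is a minimal sketch of the kind of clustering the advice refers to. KMeans is asked to recover groups from untagged data; the data and cluster count here are toy assumptions chosen so the structure is obvious, which is exactly the case where human intuition gets you there faster.

```python
import numpy as np
from sklearn.cluster import KMeans

# Untagged data: two blobs a human would spot instantly on a scatter plot.
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
              [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]])

# Clustering tries to recover the groups with no labels to guide it.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(model.labels_)  # cluster assignment for each row
```

Note that you had to hand the model `n_clusters=2` yourself; on messy real-world data, picking that number and interpreting the resulting clusters is where the time goes.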

2. Skip Neural Networks Initially

Neural networks can outperform traditional models, but the gains are often marginal compared to the effort required. They pose several challenges for beginners:

  • Slow Iteration: Neural networks take longer to train, reducing the number of iterations you can perform.
  • Data Requirements: Avoiding overfitting requires a large amount of data, which many companies lack.
  • Complexity: The infinite configurations can be overwhelming.
  • Mentorship: Finding mentors with real-world experience in neural networks is tough.

Traditional ML models, like those available in scikit-learn, often perform well and are quicker to iterate on. Use neural networks for fine-tuning rather than starting from scratch.
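As a sketch of how fast that iteration loop can be, here is a complete baseline using a scikit-learn linear model on one of its bundled datasets (the dataset choice is illustrative, not from the original article). The whole thing trains in well under a second, so you can run dozens of experiments in the time a single network takes to converge.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# A small tabular dataset and a linear model: the fast-iteration baseline.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=5000).fit(X_tr, y_tr)
print(f"test accuracy: {clf.score(X_te, y_te):.3f}")
```

A baseline like this also gives you a number that any fancier model has to beat before it earns its extra complexity.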

3. Frame Problems as Binary Classification

Simplify the learning process by framing problems as binary classifications. A binary classification model outputs a 1 or a 0, making it easier for the model to learn. I've found better results with multiple binary classifiers running in parallel compared to a single multi-class model.
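The "multiple binary classifiers in parallel" idea can be sketched with scikit-learn's `OneVsRestClassifier`, which trains one independent "is it class k?" binary model per class. The iris dataset here is just a stand-in for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

# One binary yes/no model per class, each trained independently.
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X_tr, y_tr)
print(f"accuracy: {ovr.score(X_te, y_te):.3f}")
```

A practical side benefit of this framing: each binary model can be debugged, retrained, or replaced on its own without touching the others.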

4. Tune Your Hyper-Parameters

Hyper-parameter tuning can significantly impact performance. Use automated tools like GridSearchCV or TPOT to save time. Always document your hyper-parameters and results to avoid repeating past mistakes.
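Here is a minimal `GridSearchCV` sketch; the model, dataset, and parameter grid are illustrative assumptions, but the pattern (define a grid, cross-validate every combination, read off the best) is the general one.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_breast_cancer(return_X_y=True)

# Every combination in the grid is cross-validated automatically.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=3)
search.fit(X, y)

print("best params:", search.best_params_)
print("best CV score:", round(search.best_score_, 3))
```

`search.cv_results_` holds the full record of every combination tried, which pairs naturally with the advice to document your hyper-parameters and results.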

5. Set Time Frames for Experiments

Unlike software engineering, you can't predict how long it will take to solve an ML problem, or even whether it's solvable at all. So instead of estimating the solution, time-box the experiments: decide up front how long you'll spend on an approach before re-evaluating. This keeps expectations manageable and avoids frustration.

6. Document Everything

Document your experiments meticulously. Note the model architecture, hyper-parameters, data descriptions, results, and any insights gained. This practice will save you time and help you learn from past experiments, making you a more seasoned ML practitioner over time.
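Even a tiny helper makes this habit cheap. The sketch below (a hypothetical `log_experiment` function, not from the original article) appends one JSON record per run to a log file, capturing the model, hyper-parameters, metrics, and notes mentioned above, using only the standard library.

```python
import datetime
import json

def log_experiment(path, model_name, params, metrics, notes=""):
    """Hypothetical helper: append one experiment record per line (JSONL)."""
    record = {
        "timestamp": datetime.datetime.now().isoformat(timespec="seconds"),
        "model": model_name,
        "params": params,
        "metrics": metrics,
        "notes": notes,
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record

# Example run: record a baseline result.
rec = log_experiment("experiments.jsonl", "LogisticRegression",
                     params={"C": 1.0},
                     metrics={"accuracy": 0.93},
                     notes="baseline on cleaned dataset v2")
```

Because each line is self-contained JSON, the log stays greppable and trivially loadable into pandas later when you want to compare runs.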

Conclusion

These lessons are drawn from my journey in building ML-powered applications, primarily in the NLP space. While my experience is specific, the principles apply broadly. Start with tried-and-true methods before venturing into bleeding-edge technologies. Build a solid foundation, and then push boundaries as needed.

Remember, the biggest gains often come not from the model you choose but from how you frame the problem. Get out there and start building some useful tech!