

                  The Cold Start Problem


                  By far the biggest lesson I’ve learned since getting involved in AI is that quality data is the limited resource critical for AI success.

                  One of the main reasons why large language models like ChatGPT are so successful is that they were trained on massive amounts of data (e.g., the internet) and improved with human feedback.

                  That said, one of the less obvious things I learned shortly after starting Synaptiq is that you will often hit a “cold start problem”.

                  Let’s take the example of building an AI recommendation engine for an e-commerce site.  There are typically two approaches:

                  1. Recommending similar products based on others you’ve purchased in the past
                  2. Recommending products based on your profile and what others with similar profiles purchased

                   

                  Figure: Two common approaches to building AI recommenders.

                  In the first approach, so long as you have a past purchase history and a well-defined product catalog, an AI model can start making reasonable recommendations.  In the second approach, so long as you have a profile and there are others with similar profiles who have purchased products in the past, recommendations are possible.
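A minimal Python sketch of the two approaches, using invented toy data. The catalog tags, profile fields, and function names are all hypothetical, and the similarity measures (tag overlap and field agreement) are simple stand-ins for whatever a production recommender would actually learn.

```python
from collections import Counter

# Hypothetical product catalog: each product is described by a set of tags.
catalog = {
    "tent": {"camping", "outdoor"},
    "stove": {"camping", "cooking"},
    "lantern": {"camping", "lighting"},
    "skillet": {"cooking", "kitchen"},
    "novel": {"books"},
}

# Hypothetical shopper profiles and their past purchases.
profiles = {
    "ana": {"age_band": "25-34", "region": "west"},
    "ben": {"age_band": "25-34", "region": "east"},
    "cho": {"age_band": "55-64", "region": "west"},
}
purchases = {
    "ana": ["tent", "stove"],
    "ben": ["tent", "lantern"],
    "cho": ["novel"],
}


def jaccard(a, b):
    """Overlap between two tag sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_similar_products(history, catalog, k=2):
    """Approach 1: rank unpurchased products by similarity to past purchases."""
    scores = {}
    for product, tags in catalog.items():
        if product in history:
            continue
        scores[product] = max(
            (jaccard(tags, catalog[p]) for p in history if p in catalog),
            default=0.0,
        )
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return [p for p in ranked if scores[p] > 0][:k]


def profile_similarity(a, b):
    """Fraction of shared profile fields on which two profiles agree."""
    keys = a.keys() & b.keys()
    return sum(a[key] == b[key] for key in keys) / len(keys) if keys else 0.0


def recommend_from_similar_profiles(profile, profiles, purchases, k=2):
    """Approach 2: weight other shoppers' purchases by profile similarity."""
    votes = Counter()
    for user, other in profiles.items():
        weight = profile_similarity(profile, other)
        for product in purchases.get(user, []):
            votes[product] += weight
    ranked = sorted(votes, key=lambda p: (-votes[p], p))
    return [p for p in ranked if votes[p] > 0][:k]


print(recommend_similar_products({"tent"}, catalog))
# ['lantern', 'stove']
print(recommend_from_similar_profiles(
    {"age_band": "25-34", "region": "west"}, profiles, purchases))
# ['tent', 'stove']
```

Both functions return an empty list when they are given an empty purchase history or an empty profile, because there is nothing to compare against.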

                  But, what if you haven’t purchased products in the past, you don’t have a profile yet, or there aren’t others with similar profiles that have purchased products?

                  This is a classic cold start problem, and it’s a common challenge across a wide range of AI applications like:

                  • Healthcare diagnostic tools that require diverse patient data across conditions
                  • Financial fraud detection that requires examples of legitimate and fraudulent transactions
                  • Customer service chatbots that require historical conversation logs to provide relevant answers to questions

                  There are basically three creative options to overcome a cold start problem:

                  1. Data sourcing
                  2. Product design and user experience
                  3. Go-to-market

                  Data sourcing

                  If you don’t have the data you need on hand, there are a handful of options with varying levels of effort and cost.

                  1. Acquire the data elsewhere: look for free, publicly available datasets or data you can purchase from a vendor. You may even be able to set up a data partnership in which another company provides the data you need to solve your cold start problem and you share the data you create with them.
                  2. Have your data scientists or machine learning engineers search for existing models that are pre-trained in adjacent domains.  They may be able to apply “transfer learning” to bootstrap your AI model.
                  3. Hire people to generate an initial training dataset. You may be able to augment their work with the help of large language models.
                  4. Talk to your domain experts and data scientists about generating representative synthetic data.
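As a rough illustration of the synthetic data option, here is a small Python sketch for the fraud detection case mentioned earlier. Everything in it is an assumption made for the example: the function name, the field names, the fraud rate, and the distributions are invented placeholders for rules that your domain experts would supply.

```python
import random


def make_synthetic_transactions(n, fraud_rate=0.05, seed=0):
    """Generate n labeled transactions from hand-written, hypothetical rules."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    rows = []
    for _ in range(n):
        is_fraud = rng.random() < fraud_rate
        if is_fraud:
            # Assumed rule: fraud skews toward larger amounts at night.
            amount = rng.lognormvariate(6.0, 1.0)
            hour = rng.choice([0, 1, 2, 3, 4, 23])
        else:
            # Assumed rule: legitimate purchases are smaller and daytime.
            amount = rng.lognormvariate(3.5, 0.8)
            hour = rng.randint(7, 22)
        rows.append(
            {"amount": round(amount, 2), "hour": hour, "is_fraud": is_fraud}
        )
    return rows


transactions = make_synthetic_transactions(2000)
fraud_share = sum(row["is_fraud"] for row in transactions) / len(transactions)
print(f"{len(transactions)} rows, {fraud_share:.1%} labeled as fraud")
```

A dataset like this is only as realistic as the rules behind it, so it is best treated as a way to bootstrap and test a pipeline until real examples arrive.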

                  To dive deeper into this topic, read the blog post “How Much Data Do We Need”, written by my cofounder, Dr. Tim Oates.

                  Product design & user experience

                  When you have a cold start problem, it’s important to think carefully about your product’s user experience.  There are smart ways to design an experience that helps you overcome the cold start problem while engaging your early adopters.

                  Here are a few suggestions:

                  • Set appropriate expectations on how the system will behave while your AI models are being trained
                  • Don’t present AI model outputs until enough data has been collected and, instead, ensure the experience provides value in non-AI ways (e.g., rule-based logic before AI)
                  • Incentivize users who contribute data to train your AI model
                  • Gradually and carefully expose AI model outputs as their value increases

                  The key here is to invest in the upfront product design for a cold start situation and iteratively improve the experience, or you may never get out of the “chicken and egg” problem.
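One way to picture “rule-based before AI” with gradual exposure is the Python sketch below. It is an illustration under assumed numbers, not a prescribed design: the thresholds of 5 and 25 events, the best-seller rule, and the `model` callable are all hypothetical.

```python
def best_sellers(sales_counts):
    """Rule-based fallback: products ordered by units sold, ties by name."""
    ranked = sorted(sales_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [product for product, _ in ranked]


def model_share(n_events, min_events=5, full_events=25):
    """Fraction of recommendation slots given to the model (0.0 to 1.0)."""
    if n_events < min_events:
        return 0.0
    if n_events >= full_events:
        return 1.0
    return (n_events - min_events) / (full_events - min_events)


def recommend(user_events, sales_counts, model, k=4):
    """Fill some slots from the model and the rest from the rule-based list."""
    n_model = int(k * model_share(len(user_events)))
    # The model is not consulted at all until it has earned at least one slot.
    picks = list(model(user_events))[:n_model] if n_model else []
    for product in best_sellers(sales_counts):
        if len(picks) >= k:
            break
        if product not in picks:
            picks.append(product)
    return picks


sales_counts = {"a": 10, "b": 7, "c": 5, "d": 3, "e": 1}


def toy_model(events):
    """Stand-in for a trained recommender."""
    return ["x", "y", "z", "w"]


print(recommend([], sales_counts, toy_model))
# ['a', 'b', 'c', 'd']
print(recommend(list(range(15)), sales_counts, toy_model))
# ['x', 'y', 'a', 'b']
print(recommend(list(range(25)), sales_counts, toy_model))
# ['x', 'y', 'z', 'w']
```

New users see only the rule-based list, and the model takes over more slots as each user contributes more data.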

                  Go-to-market

                  Finally, think carefully about your rollout strategy, pricing strategy, and business case expectations.  For instance, it may be best to start with a small segment of users in a pilot before generating awareness.  Likewise, your pricing may need to be low until your AI models start generating value.  And, whatever you do, don’t commit to ROI dates until your product has spent time in users’ hands.

                  If you work in an organization that handles sensitive information or relies on knowledge workers (e.g., healthcare, legal, finance, or professional services), it’s also best practice to pilot your AI models internally first.

                  Real-life examples

                  We have run into many cold start problems over the last 10 years.  But there are two that stand out and are easy to explain.

                  The first was a project we worked on as a subcontractor for the federal government early in our journey.  Back then, the federal government employed a lot of contractors to build cloud applications and struggled to manage all the costs for cloud resources.  The big cloud providers didn’t have any automated tools to help the government optimize its cloud resources.

                  We built a system that monitored cloud resource consumption and optimized it against its expected quality of service.  That meant we needed a lot of cloud consumption data to prove our approach was viable, and the government wasn’t going to give us direct access to that data.  So, we had a cold start problem.  To overcome this challenge, we generated simulated data in our isolated environment, tuned the model until it met expectations, then gave it to the government to deploy in their secure environment.
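To make the simulate-then-tune loop concrete, here is a deliberately simplified Python sketch. It is not the system described above and it does not use reinforcement learning: the demand curve, the noise level, the “last hour plus headroom” provisioning rule, and the 99% quality-of-service target are all assumptions chosen only to show the workflow.

```python
import math
import random


def simulate_demand(hours, seed=0):
    """Simulated hourly demand: an assumed daily cycle plus Gaussian noise."""
    rng = random.Random(seed)
    demand = []
    for hour in range(hours):
        daily = 100 + 30 * math.sin(2 * math.pi * (hour % 24) / 24)
        demand.append(max(0.0, daily + rng.gauss(0, 5)))
    return demand


def evaluate(headroom, demand):
    """Provision last hour's demand times (1 + headroom).

    Returns (share of hours where capacity covered demand, average capacity).
    """
    if len(demand) < 2:
        raise ValueError("need at least two hours of demand")
    met, provisioned = 0, 0.0
    for previous, current in zip(demand, demand[1:]):
        capacity = previous * (1 + headroom)
        provisioned += capacity
        met += capacity >= current
    n = len(demand) - 1
    return met / n, provisioned / n


def tune_headroom(demand, target_qos=0.99, step=0.05, max_headroom=2.0):
    """Return the smallest headroom on the grid that meets the QoS target."""
    for i in range(round(max_headroom / step) + 1):
        headroom = i * step
        qos, avg_capacity = evaluate(headroom, demand)
        if qos >= target_qos:
            return headroom, qos, avg_capacity
    raise ValueError("no headroom up to max_headroom met the target")


demand = simulate_demand(24 * 14)  # two simulated weeks
headroom, qos, avg_capacity = tune_headroom(demand)
print(f"headroom={headroom:.2f} qos={qos:.3f} avg_capacity={avg_capacity:.1f}")
```

The point of the sketch is the sequence: generate data you are allowed to use, tune against an explicit target, and only then hand the result over for deployment against real data.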

                  You can read more about it in our published research paper, Automated Cloud Provisioning on AWS using Deep Reinforcement Learning.

                  Shortly after, we worked for a company that sells custom curriculums of training courses to businesses. When we met them, they had realized that their sales and customer success team wasn't going to scale effectively if every sale required manual human curation.  So we built an AI recommender for them fueled by their historical data.

                  Everything was going great until we learned that they were selling into a wide range of customers with diverse profiles.  There was a high likelihood that a prospective customer wouldn’t be similar to any active customers.  In this cold start situation, we worked with the client to purchase company profile data so that our AI model would work if the prospective customer’s profile wasn’t already in the system.

                  This company also rolled out our model as a sales and customer success support tool initially, then expanded it into an active customer recommendation system.

                  Conclusion

                  A cold start problem is a common hurdle when launching AI solutions, especially for machine learning models that rely on robust datasets to function effectively.  Without adequate initial data, these systems often struggle with accuracy and performance issues.

                  Fortunately, a multifaceted approach can help organizations navigate this challenge.  By tapping into alternative data sources (public repositories, adjacent-domain data, or synthetically generated information), companies can build a foundational dataset.  Thoughtful product design that incorporates strategic data collection mechanisms and expert human oversight further strengthens the solution.  Creating intuitive user experiences that naturally encourage data sharing also accelerates the learning curve.

                  From a strategic perspective, targeting early adopters and focusing on applications that deliver value even with limited data helps establish momentum.  As users engage with the system, the expanding dataset fuels continuous improvement in the AI's capabilities.

                  With creativity and pragmatism, businesses can successfully implement AI solutions that evolve and mature alongside their growing data resources, ultimately delivering increasingly powerful results over time.



                   

                  About Synaptiq

                  Synaptiq is an AI and data science consultancy based in Portland, Oregon. We collaborate with our clients to develop human-centered products and solutions. We uphold a strong commitment to ethics and innovation. 

                  Contact us if you have a problem to solve, a process to refine, or a question to ask.

                  You can learn more about our story through our past projects, blog, or podcast.
