Improving Data Discovery through User Research

Situation

As the Product Manager of a platform that offered machine learning tools and resources, I noticed an increasing number of customer complaints related to data sourcing. Users were struggling to find datasets that were both legally compliant and suitable for their ML projects. Our engineering team also reported an uptick in technical support tickets, many of which pointed to frustrations with data discovery and use. The goal was to better understand the root causes of these challenges and guide the engineering team in improving our platform's functionality around dataset search, compliance, and usability.

Task

My task was to lead a user research initiative to uncover the specific pain points users faced when sourcing machine learning datasets, particularly around compliance (e.g., GDPR, licensing) and ease of use. I was responsible for collaborating with the engineering team to ensure that the findings would guide feature development and platform improvements that addressed these challenges head-on.

Action

  1. Data Collection

    I initiated the project by reviewing logs of labeling, search, and discovery activity for datasets available on the data platform. I also researched tooling and resources available in the market that machine learning practitioners might use to find, evaluate, and store datasets for their projects.

  2. User Interviews & Surveys
    Next, I conducted discovery interviews with 30 researchers, data scientists and machine learning engineers from different departments and skill levels. The interviews focused on their experiences with finding, evaluating, and using datasets for machine learning, with an emphasis on legal compliance and usability. To complement this, I deployed a survey to a broader set of users to gather quantitative data on common issues.

  3. Affinity Mapping
    After gathering the interview and survey data, I synthesized user insights using an affinity mapping exercise. I identified key themes such as data accessibility, compliance concerns, and platform usability. These were shared with the Engineering Manager to provide a clear picture of recurring issues.

  4. Collaboration with Engineering Leadership
    I translated user pain points into actionable product requirements, and collaborated with engineering leadership to align on the prioritization of new features that could be implemented within a quarter. I then worked with the Engineering Manager to ensure prioritized work was included in future design reviews, and socialized projects as OKRs. Key focus areas included:

    1. Developing better search filters for dataset discovery.

    2. Integrating compliance guidance into the dataset onboarding process, helping users quickly verify datasets' legal status.

    3. Improving dataset documentation standards to boost transparency and trustworthiness.

  5. Prototyping & Testing
    Once initial solutions were proposed, I helped prioritize these feature requests into sprints. After developing early prototypes, we tested the improvements with a smaller group of beta users, iterating based on their feedback.

Result

The research led to significant improvements in the platform:

  • A new advanced dataset search feature was implemented, which allowed users to filter datasets by compliance, industry, and size. This reduced the time users spent searching for datasets by 35%.

  • Dataset documentation standards were enforced, ensuring that each dataset had detailed compliance and licensing information, increasing user confidence.

  • Users reported a 25% reduction in their data compliance concerns when using our platform, as we built in tools that automatically flagged datasets with potential legal risks.
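To illustrate the kind of logic behind the search filters and risk flags described above, here is a minimal Python sketch. It is not the platform's actual implementation: the `Dataset` fields, the rules in `risk_flags`, the `search` parameters, and the sample catalog are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Dataset:
    # Hypothetical metadata fields; a real catalog would likely track more.
    name: str
    industry: str
    size_mb: float
    license: Optional[str] = None   # e.g. "CC-BY-4.0"; None if undocumented
    gdpr_reviewed: bool = False     # True once a GDPR review is on record


def risk_flags(ds: Dataset) -> List[str]:
    """Return human-readable warnings for potential legal risks."""
    flags = []
    if not ds.license:
        flags.append("missing license")
    if not ds.gdpr_reviewed:
        flags.append("no GDPR review on record")
    return flags


def search(
    catalog: List[Dataset],
    industry: Optional[str] = None,
    max_size_mb: Optional[float] = None,
    compliant_only: bool = False,
) -> List[Dataset]:
    """Filter the catalog by industry, size, and compliance status."""
    results = []
    for ds in catalog:
        if industry is not None and ds.industry != industry:
            continue
        if max_size_mb is not None and ds.size_mb > max_size_mb:
            continue
        if compliant_only and risk_flags(ds):
            continue  # skip any dataset that raises at least one flag
        results.append(ds)
    return results


# Made-up sample data for demonstration only.
catalog = [
    Dataset("clinical-notes", "healthcare", 120.0, "CC-BY-4.0", True),
    Dataset("scraped-reviews", "retail", 40.0),
    Dataset("claims-archive", "healthcare", 900.0, "proprietary", False),
]

for ds in search(catalog, industry="healthcare", compliant_only=True):
    print(ds.name)
```

In this sketch the compliance filter reuses the same flagging rules shown to users, so a dataset is hidden from "compliant only" results for exactly the reasons it would be flagged.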

These changes resulted in a 20% decrease in support tickets related to dataset issues and a notable increase in user satisfaction. My engineering team also had clearer guidance on how to build future improvements based on real user pain points.
