70%
Time Reduced in Data Discovery
670+
Folders Saved in Sherloq
1600+
Queries Saved in Sherloq
We're happy to use your platform every single day!
Daniel Zilberberg
Director of Data Analysis Guild @ AppsFlyer
Executive Summary
Core locations: USA, Israel, London, France, Thailand, Japan, China, India
Headquarters: San Francisco, US
Industry: Mobile Attribution and
# of Employees: 1,000+
# of Data users: 150+
Business Model: B2B2C
Data stack: BigQuery, Athena S3, Looker, Tableau, Jenkins
Main Points:
Challenge:
The main challenge was that two different data analysts would give two different answers to the same business question. Managing over 10,000 tables led to a lack of standardization and a reliance on tribal knowledge. In addition, analysts worked independently rather than together, which caused inconsistent results and time-consuming productivity problems.
Initial Solutions:
Tried an internal “Wikipedia” for data assets and external tools like GitHub and Google Docs, which proved inadequate due to maintenance burdens and poor workflow integration.
Implementation of Sherloq: Quickly integrated into AppsFlyer’s data systems, streamlining query management and encouraging the saving and reusing of queries.
Results:
Achieved a 70% reduction in time spent searching for data assets, organized 1,600+ queries into 670+ folders, and facilitated rapid adoption across various departments.
Impact:
One source of truth for data analysts’ daily SQL work, which improved productivity, created data alignment, and reduced SQL errors. Additionally, fewer new tables are being created, and analysts are increasingly using the same tables, promoting consistency and efficiency.
User Feedback:
“We are happy to use your platform every single day!” – Director of Data Analysis, AppsFlyer.
Sherloq has become a vital tool, driving efficiency and alignment across the organization.
Background
AppsFlyer helps brands make good choices for their business and their customers with its advanced measurement, data analytics, deep linking, engagement, fraud protection, data clean room and privacy-preserving technologies. Built on the idea that brands can increase customer privacy while providing exceptional experiences, AppsFlyer empowers thousands of creators and 10,000+ technology partners to create better, more meaningful customer relationships.
To learn more, visit www.appsflyer.com.
10,000+ tables, 2M queries per day
The Challenge
AppsFlyer, a data-intensive company, managed an overwhelming array of over 10,000 tables spread across multiple databases such as BigQuery and Athena S3. This complex data landscape posed several operational challenges for the company’s sizable team of more than 100 data users. Data users frequently struggled to identify the most relevant tables for their projects, and even when they found them, they lacked the context needed to use these tables effectively. “Every new project felt like starting from scratch,” said one of AppsFlyer’s data analysts.
This often resulted in users spending excessive time figuring out the business terms and logic required to manipulate SQL queries to get the data they needed.
Moreover, the absence of data ownership and proper documentation meant that routine questions turned into extensive, time-consuming discussions. AppsFlyer had established a dedicated Slack channel for quick help among data users, where typical questions included “Who owns this table?” and “How can I query this table for a specific data type without constraints?” Each question triggered multiple interactions and frequent context switches among team members. This not only slowed individual productivity but also led to a cycle of repetitive and inefficient work.
The situation was further complicated by the fact that there was no standardized method for saving queries and no single source of truth for the data. Data analysts found themselves repeatedly solving the same problems or answering similar ad-hoc queries. The lack of a centralized, easily accessible system for storing and retrieving past queries meant that even when solutions were found, they were often not recorded in a way that was easy to reuse. Consequently, precious time was lost as analysts had to recreate queries from scratch, leading to further delays and disruptions in their workflow.
“Even having worked at a smaller company before joining AppsFlyer, with just over 50 employees, we faced the same frustrating delays in our workflows,” said one of their data analysts.
This environment of inefficiency and frustration was exacerbated by a general lack of trust in the data. Without reliable documentation at the table, field, or query level, even well-designed dashboards were viewed with skepticism by stakeholders.
Last but not least, the lack of collaboration within the data team meant that many analysts were duplicating efforts. Analysts often worked in silos, unaware that the queries they were constructing had already been written by others. This led to significant productivity losses, with an estimated 30% of AppsFlyer analysts’ time spent on redundant tasks. Over a year, this added up to thousands of hours wasted, further slowing down the entire data operation.