Data Dictionary

Savvy investors may consider developer activity a predictive indicator of the success or failure of a technology company. Sherlock data includes a growing list of 100+ tickers.

Available Data Tables

 Six comprehensive datasets covering developer activity and project health
PROJECT CDI
Community Development Index score measuring developer community engagement
WEEKLY FILE TYPE CHANGES
Detailed file modification tracking by type and category showing development focus areas
CODE CHANGE AGG
Aggregated developer activity including commits, issues, pull requests with daily/weekly/monthly views
DEVELOPER AGG
Developer experience cohorts tracking engagement patterns and community growth over time
PROJECT REPOSITORIES
Repository metadata, social signals, blockchain categorization and project classification
OPEN SOURCE ENGAGEMENT
Time series dataset containing total stars, forks, and watchers in open source repositories, updated daily

Sherlock Community Development Index (CDI) Weights

Learn more about CDI at: https://chaoss.community/?p=4455

Contributor Count: 19.987%
Commit Frequency: 16.363%
Is Maintained: 13.853%
Commit to Pull Request Ratio: 12.612%
Pull Request to Issue Ratio: 11.319%
Pull Request Review Ratio: 10.113%
Pull Request Merge by Others Ratio: 10.113%
Lines of Code Frequency: 5.640%
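
As a rough illustration of how the weights above could combine into a single score, the sketch below takes a weighted sum of the eight components. It assumes each component has already been normalized to the range 0 to 1; the source does not describe the normalization or any final scaling, so the dictionary keys and the `weighted_cdi` function are hypothetical names, not Sherlock's implementation.

```python
# Weights copied from the list above, in percent. The snake_case keys are
# illustrative names, not column names from the Sherlock tables.
CDI_WEIGHTS = {
    "contributor_count": 19.987,
    "commit_frequency": 16.363,
    "is_maintained": 13.853,
    "commit_to_pull_request_ratio": 12.612,
    "pull_request_to_issue_ratio": 11.319,
    "pull_request_review_ratio": 10.113,
    "pull_request_merge_by_others_ratio": 10.113,
    "lines_of_code_frequency": 5.640,
}


def weighted_cdi(components: dict[str, float]) -> float:
    """Weighted sum of components assumed to be pre-normalized to [0, 1]."""
    missing = CDI_WEIGHTS.keys() - components.keys()
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    # Convert each percent weight to a fraction before multiplying.
    return sum(CDI_WEIGHTS[name] / 100.0 * components[name] for name in CDI_WEIGHTS)
```

Under this assumption the result also lies between 0 and 1; any mapping onto a published score range would be a further step not shown here.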

Total Engagement Index

A measure of total engagement in the project's repositories over the last 7 days, normalized between 0 and 1. It is a weighted combination of the following inputs:

STARS 7 DAY MA: 50%
FORKS 7 DAY MA: 30%
WATCHERS 7 DAY MA: 20%
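
A minimal sketch of this index, assuming the three moving averages are first combined with the 50/30/20 weights and the result is then min-max scaled across a set of projects. The source only says the value is normalized between 0 and 1, so the choice of min-max scaling and both function names are assumptions.

```python
def engagement_raw(stars_ma: float, forks_ma: float, watchers_ma: float) -> float:
    """Weighted combination of the three 7-day moving averages."""
    return 0.5 * stars_ma + 0.3 * forks_ma + 0.2 * watchers_ma


def min_max_normalize(values: list[float]) -> list[float]:
    """Scale values to [0, 1]; assumed normalization, not confirmed by the source."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All projects are identical, so there is no spread to scale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

For example, `min_max_normalize([engagement_raw(*p) for p in projects])` would yield one index value per project, where `projects` is a hypothetical list of `(stars_ma, forks_ma, watchers_ma)` tuples.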

Daily Change in Engagement

This data point is a measure of total engagement in the project's repositories in the last 24 hours. It's calculated based on the following fixed weights as inputs:

DAILY CHANGE IN STARS: 50%
DAILY CHANGE IN FORKS: 30%
DAILY CHANGE IN WATCHERS: 20%
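
The fixed weights above can be sketched as a weighted sum of three day-over-day percentage changes. The function names and the `(yesterday, today)` tuple layout are hypothetical, and the handling of a zero base is an assumption the source does not address.

```python
def pct_change(previous: float, current: float) -> float:
    """Percentage change from previous to current."""
    if previous == 0:
        raise ValueError("percentage change is undefined for a zero base")
    return (current - previous) / previous * 100.0


def daily_engagement_change(
    stars: tuple[float, float],
    forks: tuple[float, float],
    watchers: tuple[float, float],
) -> float:
    """Each argument is a (yesterday, today) pair of total counts."""
    return (
        0.5 * pct_change(*stars)
        + 0.3 * pct_change(*forks)
        + 0.2 * pct_change(*watchers)
    )
```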

CDI & Core Metrics

CDI
The predictive measure that investors may use to identify technology companies with strong developer communities
CODE CONTRIBUTOR COUNT
The active pull request creators, code reviewers, and commit authors in the past 90 days
COMMIT FREQUENCY
The average number of commits per week over the past 90 days
IS MAINTAINED
The percentage of code repositories with at least one commit in the last 90 days
CODE REVIEW RATIO
The percentage of code commits in the last 90 days with at least one reviewer other than the pull request creator
LINES OF CODE FREQUENCY
The average lines touched (added plus removed) per week in the past 90 days
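
To make the 90-day definitions above concrete, here is a sketch of COMMIT FREQUENCY and IS MAINTAINED computed from raw commit timestamps. The input shapes (a list of timestamps, and a mapping from repository name to timestamps) and the window boundary handling are assumptions for illustration.

```python
from datetime import datetime, timedelta

WINDOW_DAYS = 90


def commit_frequency(commit_times: list[datetime], as_of: datetime) -> float:
    """Average commits per week over the 90 days ending at as_of."""
    start = as_of - timedelta(days=WINDOW_DAYS)
    recent = sum(1 for t in commit_times if start < t <= as_of)
    return recent / (WINDOW_DAYS / 7)


def is_maintained(repo_commits: dict[str, list[datetime]], as_of: datetime) -> float:
    """Percentage of repositories with at least one commit in the window."""
    if not repo_commits:
        return 0.0
    start = as_of - timedelta(days=WINDOW_DAYS)
    active = sum(
        1
        for times in repo_commits.values()
        if any(start < t <= as_of for t in times)
    )
    return 100.0 * active / len(repo_commits)
```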

Developer Analytics

TOTAL DEVS CT
The count of unique developers that have made at least 1 code commit
DISTINCT DEVS COUNT 1 MONTH
The count of developers that actively committed code in exactly 1 distinct month
DISTINCT DEVS COUNT 2 12 MONTH
The count of developers that actively committed code in 2 to 12 distinct months
 
CDI
Community Development Index score measuring developer community dedication (0-5 range)
DISTINCT DEVS COUNT 13 PLUS MONTH
The count of developers that actively committed code in more than 12 distinct months
DEVELOPER COUNT
The count of unique developers that have made at least 1 code commit
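
The cohort counts in this section can be sketched by counting, for each developer, the number of distinct calendar months in which they committed. The `(developer, date)` input layout and the output key names are hypothetical.

```python
from collections import defaultdict
from datetime import date


def developer_cohorts(commits: list[tuple[str, date]]) -> dict[str, int]:
    """Bucket developers by the number of distinct months with a commit."""
    months_by_dev: dict[str, set[tuple[int, int]]] = defaultdict(set)
    for dev, day in commits:
        months_by_dev[dev].add((day.year, day.month))

    cohorts = {
        "total_devs": len(months_by_dev),
        "one_month": 0,
        "two_to_twelve_months": 0,
        "thirteen_plus_months": 0,
    }
    for months in months_by_dev.values():
        n = len(months)
        if n == 1:
            cohorts["one_month"] += 1
        elif n <= 12:
            cohorts["two_to_twelve_months"] += 1
        else:
            cohorts["thirteen_plus_months"] += 1
    return cohorts
```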

Pull Request & Issue Metrics

PULL REQUEST CREATED
The count of requests to merge code changes into the main branch
PULL REQUEST MERGED
The count of merged pull requests into the main branch
ISSUES OPENED
The count of suggested improvements, tasks or questions related to the project
ISSUES CLOSED
The count of issues closed by developers within a project
COMMIT PULL REQUEST LINKED RATIO
The percentage of new code commits that link to pull requests in the last 90 days
PULL REQUEST ISSUE LINKED RATIO
The percentage of new pull requests that link to issues in the last 90 days
CODE MERGE RATIO
The percentage of pull requests in the last 90 days that were merged by someone other than the pull request author
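
Two of the ratios above can be sketched from a list of pull request records. The `PullRequest` fields are hypothetical, and using merged pull requests as the denominator for CODE MERGE RATIO is an assumption; the 90-day filtering is presumed to have happened upstream.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PullRequest:
    author: str
    merged_by: Optional[str] = None     # None if the pull request was not merged
    linked_issue: Optional[int] = None  # None if no issue is linked


def code_merge_ratio(prs: list[PullRequest]) -> float:
    """Percentage of merged pull requests merged by someone other than the author."""
    merged = [p for p in prs if p.merged_by is not None]
    if not merged:
        return 0.0
    return 100.0 * sum(p.merged_by != p.author for p in merged) / len(merged)


def pr_issue_linked_ratio(prs: list[PullRequest]) -> float:
    """Percentage of pull requests that link to an issue."""
    if not prs:
        return 0.0
    return 100.0 * sum(p.linked_issue is not None for p in prs) / len(prs)
```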

Repository & Social Metrics

FORK COUNT
The number of times a user has duplicated the selected repository
STARGAZERS
The count of users that have starred the selected repository
WATCHERS
The number of users currently watching a selected repository, which enables notifications when the repository is updated
COUNT OF REPO
The count of active repositories maintained
COMMIT COUNT
The count of changes to a code base and its associated files within the selected repo
ORG REPO URL
The URL of the project repository
DESCRIPTION
The description of the project written by the organization administrator

Engagement Metrics

STARS 7 DAY MOVING AVERAGE
The 7 day moving average count of stars
FORKS 7 DAY MOVING AVERAGE
The 7 day moving average count of forks
WATCHERS 7 DAY MOVING AVERAGE
The 7 day moving average count of watchers
DAILY CHANGE IN STARS
The percentage change in total stars in the last 24 hours
DAILY CHANGE IN FORKS
The percentage change in total forks in the last 24 hours
DAILY CHANGE IN WATCHERS
The percentage change in total watchers in the last 24 hours
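
A 7 day moving average like the ones defined above can be sketched as a trailing window over a daily series. Whether the underlying series holds cumulative totals or daily additions is not stated in the source, so the function below is generic and its name is hypothetical.

```python
def moving_average(values: list[float], window: int = 7) -> list[float]:
    """Trailing moving average; emits one value per full window."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just left the window.
            running -= values[i - window]
        if i >= window - 1:
            averages.append(running / window)
    return averages
```

With fewer than `window` observations the result is an empty list, which avoids reporting averages over partial weeks.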

Project Classification

PROJECT
The colloquial name of the open source project
PROJECT CATEGORY
The company sector or category
ORGANIZATION
The name of the organization contributing to the open source project
WEBSITE
The website URL for each organization contributing to the open source project
LOGIN
The name of the project organization
IS CORE
The binary flag identifying if the repository is part of the primary core organization

File & Change Tracking

FILE TYPE CATEGORY
The file type category, summarizing the general purpose of the file
FILE TYPE
The format and nature of files changed in repository, based on file extension
FILE CHANGES
The count of files added, deleted, or modified within selected file type and time frame
REPOSITORY
The centralized file storage location for code files, revisions, and documentation
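
As an illustration of how FILE TYPE, FILE TYPE CATEGORY, and FILE CHANGES relate, the sketch below derives a file type from each changed path's extension and counts changes per category and type. The extension-to-category mapping is invented for the example; the actual categories used in the dataset are not listed in the source.

```python
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical mapping, for illustration only.
CATEGORY_BY_EXTENSION = {
    ".py": "source code",
    ".md": "documentation",
    ".yml": "configuration",
}


def file_changes_by_type(changed_paths: list[str]) -> Counter:
    """Count changed files keyed by (category, file extension)."""
    counts: Counter = Counter()
    for path in changed_paths:
        # Files without an extension are grouped under a placeholder type.
        extension = PurePosixPath(path).suffix.lower() or "(none)"
        category = CATEGORY_BY_EXTENSION.get(extension, "other")
        counts[(category, extension)] += 1
    return counts
```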

Temporal & Configuration

DT
The update date for developer cohorts
DATE
The date of calculation
PROJECT START DATE
The date when the first commit was made under this project
PROJECT DEVELOPMENT START DT
The date that the first repository of the project was created
FIRST COMMIT DT
The date of the first code commit within the selected repository
START DATE
The date when the metric calculation began

Sherlock Data Coverage

 

Technology

Media

Travel

Fintech

Communications

AI

Blockchain

Explore the data on Snowflake

Invest in leading technology teams

Sherlock Data helps quantitative and fundamental portfolio managers incorporate open-source software development trends into their investment strategy.