5.6.3 Database Project Analysis
Translator: Fei Teng; Reviewer: Ted Liu
6.3 Database Project Analysis
This section analyzes the growth of the database field over the past five years in terms of OpenRank, Activity, and other indicators, as well as the concentration trend of the top 10 projects. It also draws on the open source database information published by Database of Databases and DB-Engines Ranking. The field is divided into 18 categories according to database structure and purpose, including Relational, Key-Value, Document, Wide Column, Search Engine, Time Series, Vector, Graph, Object Oriented, Hierarchical, RDF, Array, Event, Spatial, Columnar, Native XML, and Content. The collaboration log data of the corresponding open source projects on GitHub are collected and analyzed.
6.3.1 Growth Trends in the Database Domain Over the Past Five Years and the Changing Trends in the Concentration of Top 10 Leading Projects
1. Analysis of Concentration Changes in Leading Projects in the Database Domain
Over the past five years, the OpenRank concentration and the Activity concentration of the Top 10 leading projects in the database domain have both remained within the range of [29%, 35%]. However, in the most recent three years (2022-2024), both have been approximately 3 percentage points lower than in 2020 and 2021, with a slight rebound observed in 2024.
This indicates that the concentration of top database projects shows a consistent change in both OpenRank and Activity metrics. Moreover, by comparing the peak and trough years and trends of the two metrics, it can be observed that OpenRank changes lag slightly behind Activity, with the time lag being roughly on a monthly to quarterly scale. This lag reflects the temporal logic between activity and influence in database top projects: changes in activity may occur earlier, while changes in influence gradually follow.
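The concentration figures discussed above can be read as the share of a domain-wide total that is held by the ten largest projects. The report does not spell out the exact formula, so the sketch below is only one plausible reading: the function name `top_n_concentration` and the sample values are hypothetical and are not taken from the report's data.

```python
def top_n_concentration(values, n=10):
    """Return the share (0.0 to 1.0) of the total held by the n largest values.

    Assumption: "concentration" means sum(top n) / sum(all projects).
    """
    total = sum(values)
    if total <= 0:
        raise ValueError("total must be positive")
    top = sorted(values, reverse=True)[:n]
    return sum(top) / total


# Hypothetical OpenRank values for 100 projects (illustrative only).
openrank = [100, 80, 60, 50, 40, 30, 30, 20, 20, 20] + [10] * 90

share = top_n_concentration(openrank)
print(f"Top 10 concentration: {share:.1%}")
```

The same function can be applied to Activity values, which makes it straightforward to compare the two concentration series year by year.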
In 2024, all concentration metrics for leading projects showed an upward trend, and the month-on-month increase in Activity concentration was greater than that of OpenRank concentration. This phenomenon indicates that the resurgence in activity among top database projects will further drive the accumulation of influence. Based on past trends, it can be predicted that the OpenRank concentration in 2025 may accelerate its recovery, and the influence of leading projects over the entire domain will also significantly strengthen as a result.
As the influence of top projects increases, an important challenge they face is how to convert this influence into higher activity levels to further consolidate their position in the field. This dynamic relationship is particularly crucial for top projects to maintain an advantage in the increasingly competitive database sector.
3. Intensified Industry Competition and Resource Allocation Challenges
Looking at the OpenRank and Activity trends over the past five years, although the indicators for top projects have rebounded in 2024, overall growth has slowed. This suggests that competition for resources in the database sector is intensifying, and the pressure among leading projects is increasing. In this context, how to leverage existing advantages and maintain a leading position will be a critical issue for the future development of top projects.
Overall, the changes in concentration among leading projects in the database domain reveal the temporal relationship between activity and the dissemination of influence, while also reflecting the intensification of competition within the field. In the future, leading projects will need to place greater emphasis on resource integration and the conversion of influence to address domain competition and further solidify their central position in the database technology ecosystem.
6.3.2 Growth Trends in Various Subdomains of Databases Over the Past Five Years
The top three database categories together account for over 70% of the total OpenRank and activity indicators in the database sector.
As a sector that has existed since the birth of computing, databases have shown a stable development trend over the past five years. It is foreseeable that relational databases will continue to lead the industry, while various types of non-relational databases will serve as important branches in the long-term future.
6.3.3 OpenRank Rankings and Activity Rankings with Proportions in Database Subdomains
From the 2024 OpenRank and activity rankings across various categories in the database sector, the following observations can be made:
6.3.4 Open Source Quadrant Charts for Projects in Various Subdomains of the Database Field
The Open Source Quadrant Chart evaluates database categories based on three key metrics: Activity, OpenRank, and CommunityVolume. The CommunityVolume metric follows the same formula as the Attention metric in the open-digger project, calculated as the weighted sum of stars and forks over a given time period: sum(1*star + 2*fork).
Methodology for Quadrant Chart Construction:
1. Select the top 10 projects from each database subfield based on Activity.
2. Plot a log(x)-log(y) scatter plot using log(openrank)-log(communityvolume), where the base of the logarithm is 2. This represents the number of half-lives required for the spatial influence (openrank) and temporal influence (communityvolume) to decay to 1.
3. Divide the plot into four quadrants using a vertical line at the mean of the horizontal coordinates (x-axis) of all points as the vertical axis, and a horizontal line at the mean of the vertical coordinates (y-axis) of all points as the horizontal axis.
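The construction steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the report's actual pipeline: the project tuples are made-up values, the helper names `community_volume` and `assign_quadrants` are hypothetical, and the quadrant numbering (1 = upper right, counted counter-clockwise) is the usual mathematical convention rather than something the report specifies.

```python
import math


def community_volume(stars, forks):
    """Weighted sum of stars and forks: 1*star + 2*fork."""
    return 1 * stars + 2 * forks


def assign_quadrants(projects):
    """Map each project to a quadrant of the log2-log2 chart.

    projects: list of (name, openrank, stars, forks); openrank and the
    resulting community volume must be positive for log2 to be defined.
    """
    points = []
    for name, openrank, stars, forks in projects:
        x = math.log2(openrank)                         # horizontal axis
        y = math.log2(community_volume(stars, forks))   # vertical axis
        points.append((name, x, y))

    # The dividing lines sit at the mean of each coordinate.
    mean_x = sum(x for _, x, _ in points) / len(points)
    mean_y = sum(y for _, _, y in points) / len(points)

    def quadrant(x, y):
        if x >= mean_x:
            return 1 if y >= mean_y else 4
        return 2 if y >= mean_y else 3

    return {name: quadrant(x, y) for name, x, y in points}


# Hypothetical projects (illustrative values only).
sample = [
    ("a", 64, 512, 256),  # high OpenRank, high CommunityVolume
    ("b", 4, 8, 4),       # low OpenRank, low CommunityVolume
    ("c", 64, 8, 4),      # high OpenRank, low CommunityVolume
    ("d", 4, 512, 256),   # low OpenRank, high CommunityVolume
]
quadrants = assign_quadrants(sample)
print(quadrants)
```

Because the dividing lines are means of the log-transformed coordinates, a single very large project shifts them less than it would on a linear scale.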
There are 18 database categories in total. For the analysis, we selected 9 categories with an activity proportion greater than 1% in 2023: Relational, Key-value, Document, Wide Column, Search Engine, Time Series, Vector, Graph, and Object Oriented. The Open Source Quadrant Chart based on these categories is shown below:
The Search Engine category exhibits significant polarization, with projects like ElasticSearch having both high OpenRank and CommunityVolume, while others like Lucene-Solr and Xapian have relatively low values in both metrics.
Insights from the First Quadrant: Relational, Document, Search Engine, Vector, and Wide Column databases exhibit strong OpenRank influence as well as high CommunityVolume engagement. In contrast, Object-Oriented and Graph databases show weaker performance in both aspects.
From the vertical distribution in the open-source quadrant chart of the top 9 subcategories by activity, it can be observed that subcategories such as key_value and search_engine, represented by projects like valkey and meilisearch, exhibit higher CommunityVolume relative to their OpenRank, indicating a stronger community presence and faster growth expectations compared to other subcategories. The vector subcategory shows a strong linear correlation between the log-log values of CommunityVolume and OpenRank for its top 10 projects, suggesting a balanced relationship between community presence and collaborative influence.
6.3.5 Analysis of Working Hours of Open Source Database Projects
From the chart, it can be observed that the peak working hours for open-source database projects are concentrated between 2:00 and 10:00 UTC from Monday to Friday, while the active hours span from 1:00 to 18:00 UTC on the same days. The weekday pattern may be related to the fact that most database-related projects have corporate backing. Within a day, activity picks up at 2:00 UTC, peaks at 6:00 UTC, and stays high until 10:00 UTC. At 11:00 UTC, activity decreases significantly, and by 18:00 UTC the projects are no longer active. The two distinct peak periods, 2:00 to 6:00 UTC and 6:00 to 10:00 UTC, correspond to working hours in Asia and Europe, respectively (assuming a typical work start time of 9:00 local time, these windows align with UTC+7 to UTC+3 and UTC+3 to UTC-1). As the overlap in working hours gradually decreases afterward, the work peak quickly diminishes. This analysis highlights the critical role of collaboration between Asia and Europe in the open-source database domain.
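The time zone mapping in the paragraph above follows from simple arithmetic: if work starts at 9:00 local time, a UTC hour h corresponds to the offset 9 - h. The sketch below only reproduces that calculation; the function name `offset_for_local_start` is hypothetical, and the 9:00 start time is the same assumption the text makes.

```python
def offset_for_local_start(utc_hour, local_start_hour=9):
    """UTC offset (in hours) at which utc_hour equals the local start hour."""
    return local_start_hour - utc_hour


# Boundaries of the two peak periods named in the text.
boundaries = [2, 6, 10]
labels = [f"UTC{offset_for_local_start(h):+d}" for h in boundaries]
print(labels)  # ['UTC+7', 'UTC+3', 'UTC-1']
```

The first two labels bound the window attributed to Asia and the last two bound the window attributed to Europe.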