Commit d4de659

Second half blog post
1 parent 54e432c commit d4de659

1 file changed

Lines changed: 98 additions & 0 deletions

File tree

  • content/blog/entries/2026-30-03- the-end-of-an-era
@@ -0,0 +1,98 @@
title: Quantifying the Commons: The end of an era
---
categories:
open-source
collaboration
community
quantifying-the-commons
---
author: Oreoluwa
---
pub_date: 2026-03-03
---
body:

Quantifying the Commons: The end of an era.

Dear gentle reader,

It is the end of an era, yet the beginning of my bloom as a young aspiring data
professional on a global stage. It feels surreal to be at the end of this
amazing journey with my mentors and to see Quantifying the Commons become a
mature project in the Creative Commons open source community. Quantifying the
Commons is also blooming, so stay tuned to experience its impact across
different teams at Creative Commons.

Looking back, I was quite nervous at my first meeting with Timid Robot and
Sara. I did not quite understand the automation part of the project: how long
did the scripts run, and why? I was fascinated by the whole process of the
system, and after further explanation by Timid I was really impressed by the
design thinking. A lot of detail and critical thinking went into implementing
the system. Big kudos to the project lead and previous contributors. I am in
love with the foundation that was put in place prior to my contribution. It is
a firm one, and it made my work easier and worthwhile.

## Day 1 was amazing, Day 90 is growth!

I went from being confused by concepts used in the codebase to suggesting
ideas for improving the automation process in the system. I constantly read
articles, tested, iterated, and improvised functions and mechanisms. I improved
my data structure and algorithm skills, because I had to cater for test cases,
limitations, and risk. Risk in the sense that the system is exposed to change,
because the data is live and dynamic from the API. This is what I did in the
first half of my internship (see my
[first half blog post](https://opensource.creativecommons.org/blog/entries/2026-01-22-My-outreachy-journey/)).
I will focus on the second half of the internship in this blog post. A big part
of the project is ensuring that the integrity of the data stays in sync with
the efficiency of the automation process.

### I worked on completing the automation process for the Smithsonian quarterly report:
The Smithsonian is one of the largest public institutions in the United States.
It had a total of 38 units (data sources such as museums, zoos, and libraries)
when I worked on it. We derived insights on the usage of the CC0 license across
media records and records without media. This prompted me to add a horizontal
stacked bar plot to the collection of visualizations in the report system. From
this, we can see the distribution of records with CC0 licenses at a glance. We
also explored the top 10 and lowest 10 units in that distribution. This tells
us how commonly the CC0 license is used in these institutions. After testing
the whole workflow a couple of times, I noticed that the unit codes seem to be
updated frequently, with codes being added or removed. I developed a function
that keeps track of these changes and gives a warning about them in the next
automation run. This was the best way available at the moment to handle sudden
unit code changes, so that our data stays predictable and up to date.
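
As an illustration only, the idea behind that tracking function can be sketched
as below. This is not the project's actual code: the function names
(`diff_unit_codes`, `warn_on_unit_changes`) and the sample unit codes are
hypothetical, and the sketch assumes the unit codes from the previous run and
the current run are both available as plain lists of strings.

```python
import warnings


def diff_unit_codes(previous_codes, current_codes):
    """Return (added, removed) unit codes between two runs, as sorted lists."""
    previous, current = set(previous_codes), set(current_codes)
    added = sorted(current - previous)
    removed = sorted(previous - current)
    return added, removed


def warn_on_unit_changes(previous_codes, current_codes):
    """Emit a warning when the unit codes differ from the previous run."""
    added, removed = diff_unit_codes(previous_codes, current_codes)
    if added or removed:
        warnings.warn(
            f"Unit codes changed since last run: added={added}, removed={removed}"
        )
    return added, removed


# Hypothetical unit codes, for illustration only
last_run = ["UNIT_A", "UNIT_B", "UNIT_C"]
this_run = ["UNIT_A", "UNIT_C", "UNIT_D"]
added, removed = warn_on_unit_changes(last_run, this_run)
```

Comparing sets keeps the check independent of the order in which the API
returns the units, and a warning (rather than an error) lets the run continue
while still flagging the change.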

### I also worked on completing the automation process for the arXiv quarterly report:

arXiv is a curated research-sharing platform with 5 million monthly active users
that hosts 2.6 million research papers. We derived quite interesting insights
from this data source. I then expanded the visualization collection in plot.py
by adding functions for a line plot and a vertical stacked bar plot. The
insights include the count of legal tools on a yearly basis and various
comparative analyses of the tools over the years. We also explored the
breakdown of the usage of these tools across different categories.
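
To give a feel for the yearly counts, here is a minimal sketch of how legal
tools could be tallied per year before plotting. It is an assumption-laden
example, not the code in the project: the function name `count_tools_by_year`
is hypothetical, the sample records are made up, and it assumes each record has
already been reduced to a `(year, legal tool)` pair.

```python
from collections import Counter, defaultdict


def count_tools_by_year(records):
    """Map each year to a Counter of legal tool -> number of records."""
    counts = defaultdict(Counter)
    for year, tool in records:
        counts[year][tool] += 1
    return dict(counts)


# Made-up (year, legal tool) pairs, for illustration only
sample = [
    (2023, "CC BY 4.0"),
    (2023, "CC0 1.0"),
    (2024, "CC BY 4.0"),
    (2024, "CC BY 4.0"),
]
yearly = count_tools_by_year(sample)
```

A per-year table like this is the kind of input a line plot or a vertical
stacked bar plot needs: one series per legal tool, one point or bar segment per
year.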

## Lessons learnt:
I learnt so much about creating a structure when solving a problem. Structure
makes it much easier to debug, and it presents a detailed workflow so future
contributors can understand what has been done previously. It literally boils
down to how you name a variable or how you use it in a function. I also learnt
the importance of asking why. Timid encouraged me to always question
assumptions and understand the reasoning behind decisions. This was the best
thing to do because it made the whole internship fun and puzzling. Things
became naturally logical and I could connect the dots quite easily.

## What Next!
I hope to continue volunteering my time on the project going forward. I am also
eager to explore other open-source projects involving research, big data, and
automation, and to further align these skill sets with my background in
actuarial science.

## Goodbye for now
I really enjoyed working with my mentors. I will miss our little chit chats
about the holidays, the weather, and even vacation trips. I look forward to
catching up again in the future.
