SDGs & Open Data at Wikirate’s Hack Day
On March 2–6, 2024, the fourteenth Open Data Day celebrations took place, with people around the world organizing 287 events to celebrate, promote, and spread the use of open data.
As part of the celebrations, on the 6th of March, Wikirate organized a Hack Day in Berlin with a primary focus on the Sustainable Development Goals (SDGs) — a framework established by the United Nations to tackle global challenges.
Event description and objectives
Ten talented, energetic and committed participants joined the event. They were given seven hours to collect, analyze, visualize, and interpret Wikirate data on the Sustainable Development Goals (SDGs).
Building on Wikirate’s repository of datasets exposing corporate impact on the SDGs, two distinct challenges were crafted to engage participants during the Hack Day.
The Challenges
*This section was collectively written by Thomas Adler, Sylvia Gozdek, and Carolyn Thompson, participants of the Hack Day.
Challenge 1: Metric alchemy, crafting your own Wikirating
This challenge invited participants to collaboratively explore their creativity and use various Wikirate metrics to devise a bespoke rating system. This rating system aimed to provide a nuanced evaluation of companies’ endeavors toward advancing Sustainable Development Goals (SDGs).
Dividing the work and approaching the challenge
Excited by the breadth and granularity of the data available on the Wikirate platform, the team had many ideas on how to leverage that data. As with any dataset, data quality and completeness are never-ending challenges. With that in mind, we split into two subgroups: the first, focusing on SDG 5: Gender Equality, took a practical approach, and the second, focusing on SDG 8: Decent Work and Economic Growth, took a more theoretical approach to the challenge.
Team SDG5: Gender Equality
Focusing on Gender Equality was a timely choice, a few days ahead of International Women’s Day. The idea was to come up with a “Gender Equality” index. First, the team qualitatively assessed relevant metrics that could be included in the index. Then, participants checked the availability of data for a sufficient number of companies. Finally, the index was calculated through the Wikirate platform.
An index is meant to be an aggregate of different metrics to provide end users with one salient number. What metrics to include and their respective weighting is important for the index to make sense. However, one must accept that an index represents a complex idea, in this case “gender parity”, and not a single, specific, concrete metric.
Furthermore, an index is most useful when used to compare the scores of multiple companies. The index score on its own doesn’t say much, but it does enable us to rank companies and compare them on a level playing field.
Finally, the end user of the index is an important consideration. The team thought of the end user as a broad group consisting of consumers wanting to make more ESG-focused and SDG-compliant consumption choices.
Outputs achieved
The first index developed is the Gender Career Growth Parity Index. It attempts to describe how equal career growth opportunities are between men and women. It includes a measure of equal pay, the hours of training available, and entitlement to parental leave. We want these three metrics to be as equal as possible between women and men: for example, the pay component reaches a perfect score when women’s pay is 100% of men’s pay.
For this Index, the best-performing company for which we have data is Aston Martin, with a score of 99 out of 100. Q would be proud! Nestlé and Moncler are two other notable companies that perform relatively well, with scores of 93 and 92, respectively. The Hershey Company performs quite badly, with a score of 59, due to sub-par parental leave allowance, possibly due to national regulations (or lack thereof).
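To make the idea of a parity index concrete, here is a minimal sketch in Python. It is not the formula configured on the Wikirate platform: the function names, the equal weighting, and the choice to score each metric as the smaller value divided by the larger one are all assumptions made for illustration.

```python
def parity_ratio(women_value: float, men_value: float) -> float:
    """Return a ratio between 0 and 1, where 1.0 means both groups are equal."""
    if women_value < 0 or men_value < 0:
        raise ValueError("values must be non-negative")
    larger = max(women_value, men_value)
    if larger == 0:
        # Both values are zero: treated as equal (a design choice of this sketch).
        return 1.0
    return min(women_value, men_value) / larger


def career_growth_parity_index(pay, training_hours, parental_leave,
                               weights=(1.0, 1.0, 1.0)):
    """Combine three (women, men) pairs into one score out of 100.

    The equal default weights are an assumption, not the team's weighting.
    """
    pairs = (pay, training_hours, parental_leave)
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(w * parity_ratio(women, men)
                   for w, (women, men) in zip(weights, pairs))
    return 100.0 * weighted / total_weight


if __name__ == "__main__":
    # Illustrative numbers only, not data from any real company.
    score = career_growth_parity_index(
        pay=(80, 100), training_hours=(20, 20), parental_leave=(6, 12))
    print(round(score, 1))
```

Using the smaller value over the larger one means a gap in either direction lowers the score, which matches the goal of parity rather than rewarding one group being ahead.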
The second index the team worked on was the Gender Balance Index, describing the gender balance in the workforce and positions of leadership in a company. It includes the share of women in the workforce, the share of women on the board, and the number of years the CEO of the company has been a woman. For all three of these metrics, we want gender balance, more or less a 50/50 split between men and women.
This Index was harder to calculate because of one interesting dilemma: the gender of the CEO. Since this is a binary variable (man or woman), it is non-trivial to assign a score. We came up with the idea of looking at past decades and counting the number of years a man or a woman was CEO of the company. The idea is that, over a longer period of time, it would be preferable to have some kind of gender balance in the top leadership position. For this metric to make sense, we would need to look back multiple decades, and that data is not yet readily available.
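One way to turn the years-as-CEO idea into a score is sketched below in Python. The scoring rule (100 at a 50/50 split, falling linearly to 0 at either extreme) and the function names are assumptions for illustration; the team did not settle on a formula during the Hack Day.

```python
def balance_score(share_women: float) -> float:
    """Score a share between 0 and 1: 100 at a 50/50 split, 0 at 0% or 100%."""
    if not 0.0 <= share_women <= 1.0:
        raise ValueError("share_women must be between 0 and 1")
    return 100.0 * (1.0 - abs(share_women - 0.5) / 0.5)


def ceo_balance_score(years_woman_ceo: int, years_observed: int) -> float:
    """Score CEO gender balance over an observation window measured in years."""
    if years_observed <= 0:
        raise ValueError("years_observed must be positive")
    if not 0 <= years_woman_ceo <= years_observed:
        raise ValueError("years_woman_ceo must lie within the observed window")
    return balance_score(years_woman_ceo / years_observed)


if __name__ == "__main__":
    # A hypothetical company observed for 30 years, 6 of them with a woman as CEO.
    print(round(ceo_balance_score(6, 30), 1))
```

The same `balance_score` helper could be applied to the share of women in the workforce and on the board, since all three metrics aim for a roughly even split.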
For both indexes, we faced the challenge of finding enough companies that had data for all the metrics we were interested in. For example, for the first index, we found around 40 companies with complete data. Note that missing data does not necessarily mean a company did not release it; it might just mean that Wikirate researchers have not collected that data (yet). Thus, we cannot penalize companies with missing data on the Wikirate platform. Inflicting a penalty for lack of transparency would only be an option if all the data was available on Wikirate.
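The "no penalty for missing data" rule amounts to scoring only companies that have a value for every metric in the index. A minimal Python sketch follows; the dictionary layout and metric names are hypothetical and do not reflect how Wikirate stores answers.

```python
def companies_with_complete_data(records, required_metrics):
    """Keep only companies that have a value for every required metric.

    Companies with gaps are left out rather than given a low score, because a
    missing value may simply not have been researched yet.
    """
    return {
        company: values
        for company, values in records.items()
        if all(values.get(metric) is not None for metric in required_metrics)
    }


if __name__ == "__main__":
    # Hypothetical records: company -> {metric name: value or None}.
    records = {
        "Company A": {"pay_ratio": 0.95, "training_ratio": 1.0, "leave_ratio": 0.9},
        "Company B": {"pay_ratio": 0.88, "training_ratio": None, "leave_ratio": 0.7},
        "Company C": {"pay_ratio": 0.91, "leave_ratio": 0.8},
    }
    required = ["pay_ratio", "training_ratio", "leave_ratio"]
    print(sorted(companies_with_complete_data(records, required)))
```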
Perspectives for needs and next steps
An index needs to be tested, and this would be the first step for further improvements. How does our Gender Balance Index fare compared to more established ones? Based on this comparison, we could think about adjusting the weightings, including more or fewer metrics, and playing around with the reporting years.
Second, it would be interesting to compare Wikirate’s Fashion Transparency Index with our own indices. The former only looks at how transparent a company is about reporting SDG-related metrics (not whether the reported figures are accurate). Our indices look at the actual performance of companies on an SDG. Can we figure out whether companies that report more actually perform better? This is useful to know because it can guide policymaking on data reporting laws and disclosure about a company’s business practices.
Of course, another question is whether the reported metrics are accurate, as they might have been manipulated to fit a narrative, or resulted from some smart accounting and data tricks designed to make the results look more rosy than the reality. We will leave that to another Hackathon team ;)
Team SDG8: Decent Work and Economic Growth
SDG8 concerns itself with the ability to “Promote sustained, inclusive and sustainable economic growth, full and productive employment and decent work for all”. Over 50 data sets on Wikirate contain metrics mapped to this SDG.
The team spent the initial planning phase looking through Wikirate’s catalog and assessing which data sets were populated enough to merit further exploration. We also broke down the categories nested within “Decent work” into 1) Health and Safety, 2) Youth Empowerment, 3) Marginalized Group Empowerment, and 4) Employment Conditions.
After exploring the data, the team decided to pursue the creation of two new metrics: the first falling under the category of Health and Safety and the second under Employment Conditions. The two metrics could be added to a future index by which an aggregated assessment of a company's participation or lack thereof in the pursuit of SDG8 could be summarized quantitatively.
Outputs achieved
The first metric we designed relates to Employment Conditions and monitors a company’s proportion of temporary work contracts vs. permanent ones. The central question for this metric is “Has a company had a decrease in permanent employees and an increase in temporary employees compared with the year before?” We called it Increased temporary employees. The thinking behind this was that as a company’s employment base skews towards temporary employees, working conditions for its employees tend to deteriorate.
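The yes/no question behind Increased temporary employees can be expressed as a simple year-over-year comparison. The Python sketch below uses a hypothetical `Headcount` record; it illustrates the logic only and is not the formula as entered on Wikirate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Headcount:
    """Employee counts for one reporting year (hypothetical structure)."""
    permanent: int
    temporary: int


def increased_temporary_employees(previous: Headcount, current: Headcount) -> bool:
    """True when permanent staff decreased AND temporary staff increased."""
    return (current.permanent < previous.permanent
            and current.temporary > previous.temporary)


if __name__ == "__main__":
    # Illustrative numbers only.
    print(increased_temporary_employees(Headcount(1000, 200), Headcount(950, 260)))
```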
The second metric was nested within the sub-category Health and Safety and was very literally a count of fatalities and injuries in comparison to the year before. The central question for this metric is “Have the numbers of fatalities and injuries both decreased or remained the same compared with the previous year?” We called it Fatalities and injuries, no change. There was a bit of nuance to work out with this approach, as we discussed that if a company had a decrease in fatalities but a rise in injuries, this could still be assessed as progress towards the promotion of safety.
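The question behind Fatalities and injuries, no change can be sketched in Python in the same way. The function name and arguments are illustrative assumptions. Note that this strict version answers "no" for the mixed case discussed above (fewer fatalities but more injuries), so capturing that nuance would need an additional rule.

```python
def fatalities_and_injuries_no_change(prev_fatalities: int, prev_injuries: int,
                                      curr_fatalities: int, curr_injuries: int) -> bool:
    """True when both counts decreased or stayed the same versus the previous year."""
    return curr_fatalities <= prev_fatalities and curr_injuries <= prev_injuries


if __name__ == "__main__":
    # Illustrative numbers only: fatalities fell, injuries rose.
    print(fatalities_and_injuries_no_change(2, 40, 1, 45))
```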
It was very exciting to be able to develop these specific metrics which offer a concrete measure of compliance, but it was challenging to consider the interpretation of such a measure on a global scale. When thinking about workplace injury for example, we discussed in great detail how useful it would be to have a comparison to the average reported rates of injury per industry. With this level of insight, a more relevant result emerges and progress can be mapped on a much wider scale.
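The per-industry comparison described here could look something like the Python sketch below, which expresses each company's injury rate relative to the average reported rate in its industry. The input layout, the industry labels, and the numbers are hypothetical; no such benchmark was built during the Hack Day.

```python
from statistics import mean


def industry_averages(rates):
    """Average injury rate per industry.

    `rates` maps company -> (industry, injury_rate); this layout is an assumption.
    """
    by_industry = {}
    for industry, rate in rates.values():
        by_industry.setdefault(industry, []).append(rate)
    return {industry: mean(values) for industry, values in by_industry.items()}


def relative_to_industry(rates):
    """Ratio of each company's rate to its industry average (1.0 = average).

    Companies in industries whose average is zero are skipped to avoid
    dividing by zero.
    """
    averages = industry_averages(rates)
    return {
        company: rate / averages[industry]
        for company, (industry, rate) in rates.items()
        if averages[industry] > 0
    }


if __name__ == "__main__":
    # Illustrative numbers only.
    rates = {
        "Company A": ("apparel", 2.0),
        "Company B": ("apparel", 4.0),
        "Company C": ("food", 1.0),
    }
    for company, ratio in relative_to_industry(rates).items():
        print(company, round(ratio, 2))
```

A ratio below 1.0 would indicate a company reporting fewer injuries than its industry peers, which gives the raw count the context the team felt was missing.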
Perspectives for needs and next steps
The team concluded that framing the index with the end user in mind would play a critical role in motivating actors to create change in these areas. By creating measurements that spotlight companies falling short of their goals and framing this as an opportunity for improvement, we imagined how third-party actors would be empowered to identify collaborators clearly.
Additionally, the indexes created for this purpose would require a greater influx of data into Wikirate. Having more data, particularly that which enhances cross-industry event comparisons, would be invaluable. While data availability is a struggle everywhere, we anticipate that the increasing political pressure for companies to become net zero by 2045 in Germany may incentivize greater transparency and publication of compliance-related information. We are optimistic that this will not only enhance but also expedite the availability of information within Wikirate.
Challenge 2: Visualizing Data with Wikirate API
The team approached the task with enthusiasm and creativity, aiming to develop a solution that could effectively showcase which Sustainable Development Goals (SDGs) each company reported on most. They dedicated considerable time to exploring the Wikirate API and writing code to extract relevant data.
The limited time available prevented them from fully realizing their prototype. This is a valuable teaching moment for the Wikirate Team and a lesson for future hackathons: it’s essential to either provide the necessary data beforehand or have a team member available to supply it on the spot to facilitate the progress of such work. Nevertheless, the experience was valuable as participants explored Wikirate’s REST API, gained insights into open data, and were intrigued by the wealth of information available through Wikirate. They appreciated the direct access to sources provided by Wikirate, which facilitated their exploration of companies’ impact on SDGs.
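Once the data has been extracted, the counting step behind such a visualization is small. The Python sketch below tallies how many answers per company relate to each SDG. It deliberately makes no API calls: the record layout (`company` and `sdgs` keys) is a hypothetical stand-in, not the actual response format of the Wikirate API.

```python
from collections import Counter, defaultdict


def count_sdg_reports(answers):
    """Count, per company, how many answers are mapped to each SDG."""
    counts = defaultdict(Counter)
    for answer in answers:
        for sdg in answer["sdgs"]:
            counts[answer["company"]][sdg] += 1
    return counts


def most_reported_sdg(counts, company):
    """Return the SDG with the most answers for a company, or None if unknown."""
    company_counts = counts.get(company)
    if not company_counts:
        return None
    return company_counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical records, one per metric answer.
    answers = [
        {"company": "Company A", "sdgs": ["SDG 5", "SDG 8"]},
        {"company": "Company A", "sdgs": ["SDG 8"]},
        {"company": "Company B", "sdgs": ["SDG 5"]},
    ]
    counts = count_sdg_reports(answers)
    print(most_reported_sdg(counts, "Company A"))
```

The resulting per-company counts could then be passed to any charting tool, for example as one bar per SDG.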
In addition to the collective efforts, one participant took a unique approach by developing a prototype to streamline data navigation on Wikirate. This ongoing endeavor involved creating a repository on GitHub to share progress and insights with Wikirate and the community. This initiative not only showcases individual dedication to advancing the accessibility of data but also underscores the collaborative spirit fostered within the challenge, where participants actively contribute to enhancing the understanding and utilization of open data resources.
Call to Action
The collective efforts of Wikirate’s Hack Day participants contribute to the enrichment of Wikirate’s data ecosystem, and we invite them to join our diverse community engaging with open data to drive positive change.
If you’re passionate about expanding data accessibility and transparency for the SDGs, we encourage you to reach out. The metrics on Wikirate, including those created during the Hackathon, are versatile tools that can shed light on various companies and industries.
Blog Contributors
- Thomas Adler
- Sylvia Gozdek
- Carolyn Thompson
- Lucía Ixtacuy (Wikirate, Coord.)
- Auréliane Froehlich, Marc McGowan, Vasiliki Gkatziaki (Wikirate, Ed.)