Jord

🌱 A GitHub Action that helps move workloads into greener time periods.

Description

The problem


Public cloud end-user spending is approaching 500 billion USD in 2022 and is forecast to land just under 600 billion USD in 2023, indicating a rapidly growing industry. A significant portion of this amount is attributable to "pre-production computing", e.g. Continuous Integration workloads running on cloud computing platforms such as GitHub Actions and Azure DevOps Pipelines.

While the different platforms might have their own green software initiatives built into the product by design, there are limited options for an end-user to further influence the carbon footprint of their workloads. Most, if not all, cloud platforms are bound by SLAs and other customer-facing obligations, which leave limited room for reallocating, delaying, or otherwise altering the delivery of the compute service according to when the workload would be greenest.

Furthermore, a portion of workloads might not be time-sensitive and could therefore tolerate a delay in execution. However, delayed execution is not natively supported in most cloud computing platforms today.

This leaves a gap:


      How can environmentally conscious end-users ensure that their workloads run during "green" time periods?


Solution


What it is

"Jord" (named after Jörð) is a GitHub Action that helps move workloads into greener time periods.


What it does

Based on available carbon emission data, Jord automatically decides when a workload should be run. 

Users can set a tolerance window that defines how long the workload may be delayed, and Jord finds the greenest time slot within this window.

If a greener time slot is available in the future and falls within the tolerance window, Jord delays the workload until that time slot is reached and reports the delay back to the user.

If not, the workload continues without interruption.

All of the above is achieved using native GitHub Actions functionality and requires no hosting or maintenance of infrastructure on the user's side.
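
The decision described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Jord's actual implementation: the `Slot` type, the `pick_start` function, and the sample ratings are hypothetical names and values introduced for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional, Sequence


@dataclass(frozen=True)
class Slot:
    """One forecast data point (hypothetical shape, not Jord's real data model)."""
    start: datetime
    rating: float  # forecasted carbon emission rating for the region


def pick_start(now: datetime, current_rating: float,
               forecast: Sequence[Slot], tolerance: timedelta) -> Optional[datetime]:
    """Return the start of the greenest slot inside the tolerance window,
    or None when no slot in the window is greener than running right now."""
    deadline = now + tolerance
    candidates = [s for s in forecast if now < s.start <= deadline]
    if not candidates:
        return None
    best = min(candidates, key=lambda s: s.rating)
    return best.start if best.rating < current_rating else None


now = datetime(2022, 10, 1, 12, 0, tzinfo=timezone.utc)
forecast = [
    Slot(now + timedelta(hours=1), 420.0),
    Slot(now + timedelta(hours=2), 310.0),
    Slot(now + timedelta(hours=5), 150.0),  # greener, but outside a 3-hour window
]
delayed_until = pick_start(now, 400.0, forecast, timedelta(hours=3))
print(delayed_until)  # 2022-10-01 14:00:00+00:00
```

A return value of `None` corresponds to the "continue without interruption" case; any other value is the time until which the workload would be delayed.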


[Figure: log output when job is delayed]

[Figure: GUI output when job is delayed]


How it does it

  1. Jord first collects data on the physical location of the machine running the workload (based on its public IP address)
  2. Since all GitHub Actions workloads that run on Linux or Windows are hosted in Azure, we're able to correlate the physical location with the Azure datacenter region
  3. The action integrates with the Carbon Aware SDK/API to collect data on:
    • the current carbon emission rating of the Azure region where the workload is running
    • the forecasted carbon emission rating of the same region
  4. Based on the data collected above, and the tolerance window set by the user, Jord decides to either:
    • continue the workload without interruption, or
    • delay the job for as long as it takes to reach the greenest time slot
  5. If the job is delayed, Jord uses a native GitHub Actions feature called "Environments" to automatically re-trigger the workload once the "green" time slot is reached
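
One possible way to perform the correlation in step 2 is a nearest-neighbour lookup by great-circle distance. The sketch below is an assumption about how such a mapping could work, not a description of Jord's code: the `AZURE_REGIONS` table holds approximate, illustrative coordinates for only four regions, and `haversine_km` and `nearest_region` are hypothetical helper names.

```python
import math

# Approximate, illustrative (latitude, longitude) pairs for a few Azure regions.
# Assumed values for this example only, not an authoritative or complete list.
AZURE_REGIONS = {
    "westeurope": (52.37, 4.90),     # Netherlands
    "northeurope": (53.35, -6.26),   # Ireland
    "eastus": (37.37, -79.82),       # Virginia
    "westus": (37.78, -122.42),      # California
}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_region(location):
    """Name of the region in AZURE_REGIONS closest to the given (lat, lon)."""
    return min(AZURE_REGIONS, key=lambda name: haversine_km(location, AZURE_REGIONS[name]))


# A runner whose IP address geolocates to Paris would map to the Dutch region.
print(nearest_region((48.85, 2.35)))  # westeurope
```

The region name returned here is what would then be passed to the carbon emission queries in step 3.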

Currently, Jord queries the hosted version of the Carbon Aware Web API, but it also accepts other host URLs (via a dedicated input parameter to the GitHub Action) in case the API is deployed to a different server.
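
A configurable host of this kind could be handled roughly as follows. This is a sketch under assumptions: `DEFAULT_HOST` is a placeholder rather than the real hosted URL, `forecast_url` is a hypothetical helper, and the `emissions/forecasts/current` path with a `location` query parameter is assumed from the Carbon Aware Web API and should be checked against the deployed version.

```python
from urllib.parse import urlencode, urljoin

# Placeholder default; the real hosted API URL is not reproduced here.
DEFAULT_HOST = "https://carbon-aware-api.example.com"


def forecast_url(region: str, host: str = DEFAULT_HOST) -> str:
    """Build the forecast request URL for a region against a given API host."""
    base = host.rstrip("/") + "/"  # normalise so urljoin keeps any path prefix
    path = urljoin(base, "emissions/forecasts/current")  # assumed endpoint path
    return path + "?" + urlencode({"location": region})


print(forecast_url("westeurope"))
# https://carbon-aware-api.example.com/emissions/forecasts/current?location=westeurope
print(forecast_url("eastus", host="http://localhost:8080/"))
# http://localhost:8080/emissions/forecasts/current?location=eastus
```

Normalising the trailing slash means a self-hosted deployment can be passed with or without one.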


Impact


                     🌱 Reach millions of developers via GitHub, the biggest developer community on the planet.


GitHub has 90+ million users across 338 million repositories, and is by far the largest and most popular collaboration platform in the world for open source software. In addition, 84% of Fortune 100 companies use GitHub Enterprise which represents a significant exposure towards the corporate & enterprise space.

According to the 2021 Octoverse report, 50% of open source projects with more than 1000 contributors are "heavy users" of GitHub Actions. For open source projects maintained by companies ("open source at work"), that number rises further to 60.16%.

The statistics above represent a significant potential for reducing the carbon footprint of workloads running on GitHub Actions. Although it is difficult to calculate exact amounts with the data available, it is reasonable to assess the potential as large in scope. For example, if the solution were adopted by one in every 10,000 repositories on GitHub, that would translate to roughly 34,000 software projects (338 million / 10,000 = 33,800) actively contributing to reducing their carbon footprint every day.


Feasibility


Jord v0.2.0 has already been released onto the GitHub Actions marketplace and is available to the community for testing.

The first iterations of Jord have revealed some limitations caused either by the implementation itself or by the native GitHub Actions functionality (for example the public GitHub APIs). For some of these it is feasible to implement fixes and/or workarounds, and for others it might be necessary to request feature updates to the GitHub Actions platform. 

Having the project open-sourced and thereby leveraging the feedback from the community will assist in finding solutions to these limitations.

Top 3 limitations identified:

  • It is currently not possible to ensure that the delayed job will run in the same data centre as the one originally assigned to the workload. This is due to how GitHub Actions allocates hosted runners.
    • Potential workaround/fix: Submit feature request to GitHub. In addition, it is possible to extend support for self-hosted runners, in which case we would be able to control which cloud and region the runner machine is initiated in. Extending support for self-hosted runners would also enhance the potential reduction of the carbon footprint, since we could query emission data across multiple regions instead of just one (where the workload was initiated).
  • Only GitHub-hosted runners are supported at the moment, not self-hosted runners
    • Potential workaround/fix: Can be built into Jord. Some limitations are due to the Carbon Aware SDK/API only supporting Azure regions at the moment, but this would still represent a significant share of the public cloud market.
  • Only Linux and Windows runners are supported at the moment, not macOS. This is due to macOS runners being hosted in GitHub's own cloud and not in Azure.
    • Potential workaround/fix: Submit feature request to GitHub to expose datacenter regions and locations. Alternatively, using the public IP of the runner machine we can still identify the physical location. If the location of the macOS runner is close to an Azure region, we could assume similar emission ratings.


Vision


                             To enable every software developer to run green workloads, without any fuss.


For a solution such as this to become sticky and have an impact, it is vital that it has a low barrier to entry and that it can be seamlessly integrated into the Software Development Lifecycle. Thus "GreenOps" might be a suitable description of the overall concept (borrowing from other X-Ops philosophies such as "DevOps", "SecOps", etc.). This is also one of the main reasons behind the current solution strategy, as Jord can simply be added to existing workflows in GitHub with minimal overhead (no deployments, no maintenance, no infrastructure to manage).

With that in mind, here are the top 3 highlights for the future:

  1. Incentivize usage
    • Cloud platforms such as GitHub could integrate this product with their licensing structure and, for example, give discounts based on "carbon savings"
  2. Interoperability
    • Instead of relying on a proprietary runtime such as GitHub Actions, the product could also be generalized into a protocol that allows for interoperability between platforms. This could be achieved by e.g. extending the existing Carbon Aware SDK/API.
  3. Beyond CI/CD
    • A "workload" can be any sort of computing process, and is not constrained to software development. To the extent that "green" scheduling algorithms can be generalized, similar functionality can also be implemented in other cloud-centric technologies such as Internet of Things, AI, data mining, and others.

