#P15273. [Google Hashcode 2021 Finals] Software Engineering at Scale
[Google Hashcode 2021 Finals] Software Engineering at Scale
Description
Architecture
In this problem, we consider features, services and binaries:
- Features are the user-facing functionality of Google products. For example: video playlists in YouTube or weather forecasts in Google Search.
- Services are programs running in Google data centers. For example: YouTube may have a service responsible for storing user video collections.
- Binaries are groups of services which are combined to run together for efficiency. For example: services running in the same binary can share resources such as a database connection.
Features
Each feature is described by:
- The services it relies on.
- The number of daily users that benefit from the feature (when it's launched).
Once a feature is implemented in all services that it relies on, it is immediately launched and users start to benefit from it.
Services and binaries
Each service is part of exactly one binary, but one binary can contain multiple services. Initially each binary contains at least one service, but it is valid for binaries to become empty later on. Services can be moved between binaries and new binaries can be created as needed.
When implementing a feature, the engineers take advantage of the fact that a binary contains multiple services - an engineer can implement a given feature in all relevant services in a given binary at once.
:::align{center}
:::
Figure 1. There are two features (foo and bar), five services (sa, sb, sc, sd, se), three binaries (0, 1 and 2), and 2 engineers (G0 and G1). Binary 0 runs one service (se), binary 1 runs two (sa and sc) and binary 2 also runs two (sb and sd). To implement feature foo engineers need to work on services sb, sc, and sd, which means that they will need to work on two binaries (1 and 2). To implement feature bar, engineers need to work on features sa and sc, thus only binary 1.
Scheduling work
Your goal is to assign work to engineers to implement features so that as many users as possible benefit from them.
Google has engineers who can implement the features. On each day an engineer can start any of the following tasks:
- Implement a feature in the services of binary .
- Move service from binary to another binary .
- Create a new empty binary.
- Wait for a number of days.
None of these tasks can be interrupted, meaning that once an engineer starts working on a task, they will continue to do so for the duration of that task.
Making changes to a set of services that are running in one binary is easier than making changes to services that are scattered across multiple binaries. On the other hand, many engineers working on services in one binary get in each other's way, slowing down the work. This is described in more detail below.
Task: Implement a feature
An engineer can choose any binary and implement a feature in all relevant services in that binary at once.
If an engineer works on a binary to implement a feature , this work will require days, where:
- is the difficulty of feature ,
- is the total number of services in (including services not relevant to ), and
- is the number of engineers already working on features in on the first day of this work.
Once the task is done, feature is implemented in all relevant services in .
Multiple engineers can implement the same feature in different binaries, but only one engineer at a time can implement a specific feature in one binary. If two engineers start working on the same binary (implementing two different features) on the same day, the one listed earlier in the submission file is the first to start and counts towards the number of engineers working on the binary for the second one.
:::align{center}
:::
Figure 2. (a) Engineer works on binary 2 to implement feature foo. In this case, as there are three services (sb, sc and sd) in binary 2, and , as no other engineer is working on that binary that day. In total, the engineer needs days of work in binary 2 for feature foo. Similarly, engineer G1 needs , where 1 is the number of services in binary 1 (service sa). (b) Both engineers work on binary 2 implementing different features. Supposing that engineer starts first on this binary, needs days: since no-one was working on binary 2, . Engineer needs days. (c) It is possible for two engineers to work on the same feature (here feature bar), but only if they work in two different binaries. (d) It is not allowed for two engineers to work on the same feature on the same binary.
Task: Move a service
An engineer can move a service from one binary to a different binary . After the move, all features that needed service will need to have engineers working on binary instead of . Features already implemented in service remain implemented.
It takes days to move service from binary to , where and are the number of services running in binaries and respectively, before the move.
:::align{center}
:::
Figure 3. Engineer moves service sc from binary 1 to binary 2. Observe that this affects both features (foo and bar).
No other engineer can start working on binaries and until the move is finished and the move can't start if any engineer is working on either binary at the time (implementing a feature or moving a service to/from one of these binaries).
Task: Create a new binary
An engineer can spend days to create a new empty binary. Its ID is the minimum positive integer that is not used yet.
:::align{center}
:::
Figure 4. Engineer creates a new empty binary (binary 3).
Task: Wait
An engineer can wait for a number of days.
:::align{center}
:::
Figure 5. An engineer takes a break and waits, for example watching cat videos.
Input Format
Input Data Full input (zipped)
File format
Each input data set is provided in a plain text file. The file contains only ASCII characters with lines ending with a single \n character (also called "UNIX-style" line endings). When multiple numbers or strings are given in one line, they are separated by a single space between each two elements.
- The first line:
- time limit in days (),
- number of Google engineers (),
- number of services (),
- number of initial binaries (),
- number of features (),
- duration in days to create a new binary ().
- The next lines describe the services and each of these contains:
- the service name (string of 1-20 lowercase letters a-z and hyphens -),
- an integer () - the ID of the binary where the service initially runs. Binaries are numbered from 0 to .
- The next blocks of lines describe the features. On each block:
- The first line contains:
- the feature name (string of 1-20 lowercase letters a-z and hyphens -),
- () - the number of services that need to be modified to support the -th feature,
- () - the difficulty of the -th feature,
- () - number of daily users that will benefit from the feature once it is launched.
- The second line contains a list of strings - the names of the services to be modified to support the -th feature.
- The first line contains:
Output Format
Your submission describes the schedule of the engineers' work.
File format
- The first line: - the number of engineers that we plan the work for ()
- The next blocks:
- The first line: () - number of tasks for the given engineer.
- The next lines contain one of:
- Literal
imp1followed by a feature name and a number - the engineer should implement the feature named in the binary with ID . - Literal
movefollowed by service name and a number - the engineer should move service from binary (in which resides at the time of the move) to a different binary . - Literal
new- the engineer should start a new (empty) binary. - Literal
waitfollowed by number () - the engineer should wait for days.
- Literal
10 2 5 3 2 5
sa 1
sb 2
sc 1
sd 2
se 0
foo 3 3 100
sc sb sd
bar 2 1 20
sc sa
2
2
move sc 2
impl foo 2
3
wait 2
impl bar 1
impl bar 2
Hint
Example
The following example input dataset matches the one shown in Figure 1.
| Input file | Description |
|---|---|
10 2 5 3 2 5 |
10 days, 2 engineers, 5 services, 3 binaries and 2 features, we need 5 days to create a new binary. |
sa 1 |
The first service is named 'sa' and runs in binary 1 |
sb 2 |
The second service is named 'sb' and runs in binary 2 |
sc 1 |
The third service is named 'sc' and runs in binary 1 |
sd 2 |
The fourth service is named 'sd' and runs in binary 2 |
se 0 |
The fifth service is named 'se' and runs in binary 0 |
foo 3 3 100 |
foo feature is implemented in 3 services, its difficulty is 3 and 100 users per day will benefit from it |
sc sb sd |
foo feature is implemented in services sc, sb and sd |
bar 2 1 20 |
bar feature is implemented in 2 services, its difficulty is 1 and 20 users per day will benefit from it |
sc sa |
bar feature is implemented in services sc and sa |
Note that the input file does not contain any blank lines. Blank lines and line wrapping in the example above are added for clarity.
| Submission file | Description |
|---|---|
2 |
Both engineers will work |
| The first engineer will do 2 tasks | |
move sc 2 |
Move service sc to binary 2 |
impl foo 2 |
Implement feature foo in binary 2 |
3 |
The second engineer will do 3 tasks |
wait 2 |
Wait for 2 days |
impl bar 1 |
Implement feature bar in binary 1 |
impl bar 2 |
Implement feature bar in binary 2 |
:::align{center}
:::
Figure 6. The timeline of the engineers' work. For engineer , moving cs from binary 1, where it is initially, to binary 2 will need , so 2 days. Implementing foo in binary 2 will require 6 days: , where , and . Similarly, for engineer we can calculate the days required to implement each of the binaries for the feature bar.
:::align{center}
:::
Figure 7. The work of each engineer at every given time: (a) between days 0 and 2, (b) between days 2 and 4, (c) between days 4 and 8. After day 8, feature foo goes live, and stays live until the time limit (day 10), thus for 2 days. Feature bar goes live after day 9, and thus stays live for 1 day.
Scoring
The engineers execute the scheduled tasks, one immediately after another, according to the submission file. If an engineer is scheduled to do a task but they can't (e.g. they cannot move a service because another engineer is still working on the same binary), the solution is considered invalid and gets 0 points.
It is valid for an engineer to finish their tasks before the time limit. It is also valid to schedule tasks that will start or finish after the time limit (such tasks are ignored).
Once a feature is implemented in all relevant services before the time limit, it is immediately launched and the users start to benefit from it. It is valid to only partially implement a feature (e.g. if the engineer working on it would finish after the time limit or not even start implementing the feature in some binaries), but it won't earn any points. Likewise, features that get fully implemented after the time limit are allowed but won't earn any points.
Each feature launched before the time limit scores a number of points equal to
where
- - number of users that benefit from the -th feature
- - time limit in days
- - the day when the -th feature was launched (number of days it took to fully implement -th feature).
The total score is a sum of the scores earned by each launched feature.
Example
For instance, in the example above, the first engineer will spend two days to move service from binary 1 to binary 2: initially there are 2 services in binary 1, and 2 services in binary 2, so . The first engineer will then spend 6 days () to implement feature foo there. So foo will be ready on day 8, meaning it will be live for two days (until day 10, which is the time limit), scoring 200 () points.
The second engineer will spend 2 days waiting and then 2 days () implementing feature bar in binary 1. They will then spend 5 days () implementing feature bar in binary 2. The feature will be live for 1 day, scoring 20 points.
The final score is 220 () points.
Note that there are multiple data sets representing separate instances of the problem. The final score for your team will be the sum of your best scores on the individual data sets.
京公网安备 11011102002149号