Final Evaluation and Ranking
We will integrate the submitted algorithm with an Android downloader program, generate the APK, and evaluate the algorithm performance using our online testing platform. We have selected 8 different Android devices to deploy all candidate APKs and run downloading tasks. For each task, all candidate APKs will be brought up sequentially for execution, and the adjusted speed score will be reported after the APK completes or fails the downloading task. We do not run more than one downloading task simultaneously in the same LAN.
We plan to run several rounds of tests and eliminate the bottom-ranking teams each round. For each round, we will run 30 to 50 downloading tasks on each Android device for all candidates. We rank all candidates based on the mean value of all collected adjusted speed scores. If two teams have close scores (within 3%), we determine their ranking based on the following rules, applied in order:
1. The team with fewer failed tasks (zero score) ranks higher;
2. For each Android device, determine the winner of the two teams based on the mean score of all tasks running on this device. The team that wins more devices ranks higher;
3. For each downloading task, determine the winner of the two teams based on the reported adjusted speed score. The team that wins more tasks ranks higher;
4. The team with the smaller standard deviation ranks higher.
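The four rules above can be read as an ordered comparator. The following is a minimal sketch, not the organizers' actual evaluation code. The data layout (a dict keyed by `(device, task_id)`), the function names, measuring "within 3%" relative to the higher mean, and the use of the population standard deviation are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev

CLOSE_THRESHOLD = 0.03  # "within 3%"; taken relative to the higher mean (assumption)


def _pick(value_a, value_b, higher_wins=True):
    """Return 'a', 'b', or None when the two values are equal."""
    if value_a == value_b:
        return None
    return "a" if (value_a > value_b) == higher_wins else "b"


def _device_means(scores):
    """Group scores by device and return {device: mean score}."""
    grouped = defaultdict(list)
    for (device, _task), score in scores.items():
        grouped[device].append(score)
    return {device: mean(values) for device, values in grouped.items()}


def rank_pair(a, b):
    """Return 'a', 'b', or 'tie' for two teams.

    `a` and `b` map (device, task_id) -> adjusted speed score and share
    the same keys (hypothetical layout); a score of 0 marks a failed task.
    """
    a_scores, b_scores = list(a.values()), list(b.values())
    mean_a, mean_b = mean(a_scores), mean(b_scores)
    high = max(mean_a, mean_b)
    # Scores that are not close are ranked by the mean alone.
    if high <= 0 or abs(mean_a - mean_b) / high >= CLOSE_THRESHOLD:
        return _pick(mean_a, mean_b) or "tie"

    dev_a, dev_b = _device_means(a), _device_means(b)
    checks = [
        # Rule 1: fewer failed (zero-score) tasks ranks higher.
        _pick(a_scores.count(0), b_scores.count(0), higher_wins=False),
        # Rule 2: more devices won, judged by per-device mean score.
        _pick(sum(dev_a[d] > dev_b[d] for d in dev_a),
              sum(dev_b[d] > dev_a[d] for d in dev_a)),
        # Rule 3: more individual tasks won.
        _pick(sum(a[k] > b[k] for k in a), sum(b[k] > a[k] for k in a)),
        # Rule 4: smaller standard deviation (population form, an assumption).
        _pick(pstdev(a_scores), pstdev(b_scores), higher_wins=False),
    ]
    # The first rule that separates the teams decides the ranking.
    return next((c for c in checks if c is not None), "tie")
```

In this sketch a later rule is consulted only when every earlier rule leaves the two teams level, which matches the order in which the rules are listed.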
Final Result
Rank | Teamname |
---|---|
1 | xiaoxuesheng |
2 | TXPlayer |
3 | INSTU |
We received 13 final submissions but could successfully compile only 12 of them. We deployed all compiled APKs to 8 Android phones for the final test. The phones are physically located in different cities across China; 4 of them use WiFi and 4 of them use 4G/LTE networks.
Round 1:
Each APK ran 148 times and our server was able to record 128 scores for most teams. We included the demo we provided in the test as well. You can download all the data here. The last three teams were eliminated in this round.
Team | Task Count | Score (>0) records | Min Score | Max Score | Avg Score |
---|---|---|---|---|---|
xiaoxuesheng | 148 | 127 | 114.412878 | 1950.167135 | 1272.623973 |
TXPlayer | 148 | 128 | 246.2465399 | 1970.585926 | 1221.077434 |
INSTU | 148 | 128 | 128.4536234 | 1675.648733 | 1171.912441 |
DEMO | 148 | 128 | 264.8107036 | 1602.687507 | 1052.34288 |
NMSL_group2 | 148 | 128 | 268.8068118 | 1471.93943 | 940.4113126 |
MMMM | 148 | 128 | 251.2427407 | 1512.458433 | 932.579818 |
MulNet | 148 | 128 | 222.9931111 | 1482.494691 | 875.4911533 |
room111 | 148 | 128 | 76.64059117 | 1350.569942 | 836.2303507 |
NMSL@NTHU | 148 | 126 | 113.5114793 | 1451.561094 | 818.2252203 |
Sky | 148 | 125 | 34.53533982 | 1423.486753 | 688.7310533 |
No.1 | 148 | 71 | 539.3448576 | 798.3221514 | 682.8517382 |
FluxPilot | 148 | 96 | 104.1978591 | 1064.382236 | 651.1503677 |
NMSL_group7 | 148 | 43 | 41.75271941 | 587.641261 | 340.510378 |
Round 2:
The remaining 10 APKs (including DEMO) ran 240 tasks each. You can download all the data here. The top three teams advanced to the final round.
Team | Task Count | Score (>0) records | Min Score | Max Score | Avg Score |
---|---|---|---|---|---|
xiaoxuesheng | 240 | 195 | 89.53202734 | 1930.681738 | 1269.490609 |
TXPlayer | 240 | 204 | 85.18879412 | 1977.663221 | 1159.407717 |
INSTU | 240 | 204 | 96.92438998 | 1681.965324 | 1153.277515 |
DEMO | 240 | 203 | 89.86292933 | 1607.893248 | 1041.513736 |
MMMM | 240 | 202 | 96.01323335 | 1517.081089 | 916.6911574 |
NMSL_group2 | 240 | 202 | 112.9813088 | 1476.552803 | 911.2285605 |
MulNet | 240 | 202 | 91.83145281 | 1456.555253 | 834.4653118 |
room111 | 240 | 202 | 84.31457545 | 1355.034146 | 825.2817214 |
NMSL@NTHU | 240 | 200 | 97.91364108 | 1441.56076 | 818.5021431 |
Sky | 240 | 197 | 24.09233687 | 1352.167074 | 715.0363252 |
Round 3:
The top three teams, together with our own product-version MPD algorithm, ran 400 tasks each. You can download all the data here.
Team | Task Count | Score (>0) records | Min Score | Max Score | Avg Score |
---|---|---|---|---|---|
xiaoxuesheng | 400 | 346 | 51.36871514 | 1916.085915 | 1268.207113 |
TXPlayer | 400 | 346 | 240.4325518 | 1993.744047 | 1233.827058 |
INSTU | 400 | 347 | 228.399949 | 1723.10045 | 1167.134425 |
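The gaps between neighbouring averages in the table above can be checked with a few lines of arithmetic. This is a small sketch that uses the Avg Score column as input; measuring the gap relative to the higher of the two averages is an assumption, since the rules only say "within 3%".

```python
# Avg Score values copied from the Round 3 table.
averages = {
    "xiaoxuesheng": 1268.207113,
    "TXPlayer": 1233.827058,
    "INSTU": 1167.134425,
}


def relative_gap(higher, lower):
    """Gap between two averages as a fraction of the higher one (assumption)."""
    return (higher - lower) / higher


# Compare each team with the one ranked directly below it.
ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
gaps = {}
for (team_hi, avg_hi), (team_lo, avg_lo) in zip(ranked, ranked[1:]):
    gaps[(team_hi, team_lo)] = relative_gap(avg_hi, avg_lo)
    print(f"{team_hi} vs {team_lo}: {gaps[(team_hi, team_lo)]:.2%}")
```

Under that assumption the first gap is roughly 2.7% and the second roughly 5.4%, so only the top two teams fall inside the 3% band.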
Since the scores of team xiaoxuesheng and team TXPlayer are too close (<3%), we ran further analysis according to the preset rules:
- The team with fewer failed tasks (zero score) ranks higher. Both teams recorded 346 nonzero scores in the final round. We did notice that team xiaoxuesheng has one zero-score record while team TXPlayer has none. However, from the available dataset we were not able to determine whether the zero score was caused by device issues or by the algorithm itself. Thus we call it a tie.
- For each Android device, determine the winner of the two teams based on the mean score of all tasks running on this device. The team that wins more devices ranks higher. Each team wins 4 devices, so this is still a tie.
ip | team | min_value | max_value | avg_value |
---|---|---|---|---|
113.127.171.13 | xiaoxuesheng | 51.36871514 | 723.4108511 | 509.2199907 |
113.127.171.13 | TXPlayer | 240.4325518 | 723.3992193 | 495.8793279 |
116.169.10.146 | xiaoxuesheng | 352.4413294 | 1916.085915 | 1818.541114 |
116.169.10.146 | TXPlayer | 1367.346432 | 1721.554671 | 1642.747679 |
122.96.92.18 | TXPlayer | 830.2975927 | 1976.606553 | 1584.820381 |
122.96.92.18 | xiaoxuesheng | 1158.014799 | 1780.223112 | 1422.207348 |
211.95.47.98 | TXPlayer | 871.7894098 | 1086.196479 | 1024.077071 |
211.95.47.98 | xiaoxuesheng | 469.309745 | 1080.654387 | 988.0524434 |
218.107.37.35 | xiaoxuesheng | 844.8520962 | 1163.002314 | 1035.053492 |
218.107.37.35 | TXPlayer | 567.3461547 | 1026.405262 | 877.7089772 |
36.152.144.178 | TXPlayer | 1011.209378 | 1993.744047 | 1607.971646 |
36.152.144.178 | xiaoxuesheng | -1 | 1785.724251 | 1278.882504 |
58.243.250.18 | TXPlayer | 1215.803739 | 1611.896298 | 1485.292093 |
58.243.250.18 | xiaoxuesheng | 1296.026994 | 1579.795892 | 1452.970355 |
61.242.134.185 | xiaoxuesheng | 1466.299325 | 1842.628048 | 1740.300587 |
61.242.134.185 | TXPlayer | 1377.054083 | 1615.514492 | 1532.490321 |
- For each downloading task, determine the winner of the two teams based on the reported adjusted speed score. The team that wins more tasks ranks higher. In this round, the execution sequence for each task was set to random. We sorted the score records by timestamp and chose adjacent scores for head-to-head comparison. Team xiaoxuesheng wins 221 of 335 head-to-head comparisons.
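The head-to-head pairing described in the last rule could look roughly like the sketch below. It is an illustration only: the record layout `(timestamp, team, score)`, the function name, and the choice to use each record in at most one pair are assumptions, and the published dataset may be organized differently.

```python
def head_to_head(records, team_a, team_b):
    """Count wins from adjacent, timestamp-sorted records of two teams.

    `records` is a list of (timestamp, team, score) tuples, for example
    all records from one device (hypothetical layout).
    """
    # Keep only the two teams of interest and order their records by time.
    ordered = sorted(
        (r for r in records if r[1] in (team_a, team_b)),
        key=lambda r: r[0],
    )
    wins = {team_a: 0, team_b: 0}
    i = 0
    while i + 1 < len(ordered):
        (_, team_1, score_1), (_, team_2, score_2) = ordered[i], ordered[i + 1]
        if team_1 != team_2:
            # Adjacent records from different teams form one comparison.
            if score_1 != score_2:
                wins[team_1 if score_1 > score_2 else team_2] += 1
            i += 2  # both records are consumed by this pair
        else:
            i += 1  # same team twice in a row: no opponent to pair with
    return wins
```

Pairing only adjacent records keeps each comparison between runs that happened close together in time, which reduces the effect of changing network conditions. For reference, 221 wins out of 335 comparisons is about 66%.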
Testing Result
Rank | Teamname | Score |
---|---|---|
1 | TXPlayer | 163 |
1 | xiaoxuesheng | 163 |
2 | No.1 | 162 |
2 | MMMM | 162 |
3 | room111 | 160 |
4 | MulNet | 151 |
4 | NMSL@NTHU | 151 |
5 | INSCTU | 111 |
6 | FluxPilot | 45 |
7 | NMSL_group2 | 0 |
7 | PandasNTHU | 0 |