Final Evaluation and Ranking

We will integrate each submitted algorithm with an Android downloader program, generate the APK, and evaluate the algorithm's performance using our online testing platform. We have selected 8 different Android devices on which to deploy all candidate APKs and run downloading tasks. For each task, all candidate APKs will be launched sequentially, and the adjusted speed score will be reported after each APK completes or fails the downloading task. We do not run more than one downloading task simultaneously on the same LAN.

We plan to run several rounds of tests and eliminate the bottom-ranking teams in each round. In each round, we will run 30 to 50 downloading tasks on each Android device for all candidates. We rank all candidates by the mean of all collected adjusted speed scores. If two teams have close scores (within 3%), we determine their ranking by applying the following rules in order:
1. The team with fewer failed tasks (zero score) ranks higher;
2. For each Android device, determine the winner of the two teams based on the mean score of all tasks run on this device. The team that wins more devices ranks higher;
3. For each downloading task, determine the winner of the two teams based on the reported adjusted speed score. The team that wins more tasks ranks higher;
4. The team with the smaller standard deviation ranks higher.
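
The ranking rules above can be read as a cascade of comparisons. The following Python sketch is one possible interpretation, not the organizers' actual tooling. The data layout (a dict mapping a hypothetical `(device, task)` key to a score), the function names, and the choice to measure the 3% gap relative to the higher mean are all assumptions made for illustration. It also assumes non-negative scores and that a failed task is recorded as a score of 0.

```python
from collections import defaultdict
from statistics import mean, pstdev

CLOSE_THRESHOLD = 0.03  # "within 3%"; measured relative to the higher mean (assumption)


def is_close(mean_a, mean_b, threshold=CLOSE_THRESHOLD):
    """True when two mean scores are within the closeness threshold."""
    high = max(mean_a, mean_b)
    if high <= 0:
        return mean_a == mean_b
    return abs(mean_a - mean_b) / high <= threshold


def device_means(scores):
    """Mean score per device for a {(device, task): score} mapping."""
    per_device = defaultdict(list)
    for (device, _task), score in scores.items():
        per_device[device].append(score)
    return {device: mean(values) for device, values in per_device.items()}


def count_wins(x, y):
    """Count, over the keys both mappings share, how often each side is strictly higher."""
    shared = x.keys() & y.keys()
    wins_x = sum(1 for k in shared if x[k] > y[k])
    wins_y = sum(1 for k in shared if y[k] > x[k])
    return wins_x, wins_y


def decide(value_a, value_b):
    """Return 'a' or 'b' for the larger value, or None when they are equal."""
    if value_a == value_b:
        return None
    return "a" if value_a > value_b else "b"


def compare_teams(a, b):
    """Rank two teams: mean score first, then rules 1 to 4 when the means are close."""
    scores_a, scores_b = list(a.values()), list(b.values())
    mean_a, mean_b = mean(scores_a), mean(scores_b)
    if not is_close(mean_a, mean_b):
        return "a" if mean_a > mean_b else "b"

    fails_a = sum(1 for s in scores_a if s == 0)
    fails_b = sum(1 for s in scores_b if s == 0)
    checks = [
        (-fails_a, -fails_b),                             # rule 1: fewer failed tasks
        count_wins(device_means(a), device_means(b)),     # rule 2: more devices won
        count_wins(a, b),                                 # rule 3: more tasks won
        (-pstdev(scores_a), -pstdev(scores_b)),           # rule 4: smaller standard deviation
    ]
    for value_a, value_b in checks:
        winner = decide(value_a, value_b)
        if winner is not None:
            return winner
    return "tie"
```

Note that when the means are close, the team with the slightly higher mean can still rank lower, because the tie-break rules take over entirely.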

Final Result

Rank Teamname
1 xiaoxuesheng
2 TXPlayer

We received 13 final submissions but could successfully compile only 12 of them. We deployed all compiled APKs to 8 Android phones for the final test. The phones are physically located in different cities across China; 4 of them use WiFi and 4 use 4G/LTE networks.

Round 1:

Each APK ran 148 times, and our server was able to record 128 scores for most teams. We also included the demo we provided in the test. You can download all the data here. The last three teams were eliminated in this round.

Team Task Count Score (>0) records Min Score Max Score Avg Score
xiaoxuesheng 148 127 114.412878 1950.167135 1272.623973
TXPlayer 148 128 246.2465399 1970.585926 1221.077434
INSTU 148 128 128.4536234 1675.648733 1171.912441
DEMO 148 128 264.8107036 1602.687507 1052.34288
NMSL_group2 148 128 268.8068118 1471.93943 940.4113126
MMMM 148 128 251.2427407 1512.458433 932.579818
MulNet 148 128 222.9931111 1482.494691 875.4911533
room111 148 128 76.64059117 1350.569942 836.2303507
NMSL@NTHU 148 126 113.5114793 1451.561094 818.2252203
Sky 148 125 34.53533982 1423.486753 688.7310533
No.1 148 71 539.3448576 798.3221514 682.8517382
FluxPilot 148 96 104.1978591 1064.382236 651.1503677
NMSL_group7 148 43 41.75271941 587.641261 340.510378

Round 2:

The remaining 10 APKs (including DEMO) ran 240 tasks each, and you can download all the data here. The top three teams advanced to the final round.

Team Task Count Score (>0) records Min Score Max Score Avg Score
xiaoxuesheng 240 195 89.53202734 1930.681738 1269.490609
TXPlayer 240 204 85.18879412 1977.663221 1159.407717
INSTU 240 204 96.92438998 1681.965324 1153.277515
DEMO 240 203 89.86292933 1607.893248 1041.513736
MMMM 240 202 96.01323335 1517.081089 916.6911574
NMSL_group2 240 202 112.9813088 1476.552803 911.2285605
MulNet 240 202 91.83145281 1456.555253 834.4653118
room111 240 202 84.31457545 1355.034146 825.2817214
NMSL@NTHU 240 200 97.91364108 1441.56076 818.5021431
Sky 240 197 24.09233687 1352.167074 715.0363252

Round 3:

The top three teams, together with the product version of our own MPD algorithm, ran 400 tasks each. You can download all the data here.

Team Task Count Score (>0) records Min Score Max Score Avg Score
xiaoxuesheng 400 346 51.36871514 1916.085915 1268.207113
TXPlayer 400 346 240.4325518 1993.744047 1233.827058
INSTU 400 347 228.399949 1723.10045 1167.134425
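
The averages in the table above can be checked against the 3% closeness threshold from the ranking rules. The sketch below is illustrative only: the source does not say which score the gap is measured against, so dividing by the higher of the two averages is an assumption, and `relative_gap` is a hypothetical helper name.

```python
def relative_gap(score_a, score_b):
    """Gap between two average scores, relative to the higher one (assumption)."""
    high = max(score_a, score_b)
    return abs(score_a - score_b) / high


# Round 3 average scores, copied from the table above.
round3_avg = {
    "xiaoxuesheng": 1268.207113,
    "TXPlayer": 1233.827058,
    "INSTU": 1167.134425,
}

gap_first_second = relative_gap(round3_avg["xiaoxuesheng"], round3_avg["TXPlayer"])
gap_second_third = relative_gap(round3_avg["TXPlayer"], round3_avg["INSTU"])

# Roughly 2.7%, which is under the 3% threshold, so the tie-break rules apply.
print(f"xiaoxuesheng vs TXPlayer: {gap_first_second:.2%}")
# Above 3%, so TXPlayer and INSTU are ranked by average score alone.
print(f"TXPlayer vs INSTU: {gap_second_third:.2%}")
```

Measuring the gap against the lower average instead would not change either conclusion for these numbers.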

Since the scores of team xiaoxuesheng and team TXPlayer are within 3% of each other, we ran further analysis according to the preset rules:

  1. The team with fewer failed tasks (zero score) ranks higher. Both teams recorded 346 non-zero scores in the final round. We did notice that team xiaoxuesheng has one zero-score record while team TXPlayer has none. However, from the available dataset we were not able to determine whether the zero score was caused by device issues or by the algorithm itself. Thus we call it a tie.
  2. For each Android device, determine the winner of the two teams based on the mean score of all tasks run on this device. The team that wins more devices ranks higher. Each team wins 4 devices, so this is still a tie.
ip team min_value max_value avg_value
xiaoxuesheng 51.36871514 723.4108511 509.2199907
TXPlayer 240.4325518 723.3992193 495.8793279
xiaoxuesheng 352.4413294 1916.085915 1818.541114
TXPlayer 1367.346432 1721.554671 1642.747679
TXPlayer 830.2975927 1976.606553 1584.820381
xiaoxuesheng 1158.014799 1780.223112 1422.207348
TXPlayer 871.7894098 1086.196479 1024.077071
xiaoxuesheng 469.309745 1080.654387 988.0524434
xiaoxuesheng 844.8520962 1163.002314 1035.053492
TXPlayer 567.3461547 1026.405262 877.7089772
TXPlayer 1011.209378 1993.744047 1607.971646
xiaoxuesheng -1 1785.724251 1278.882504
TXPlayer 1215.803739 1611.896298 1485.292093
xiaoxuesheng 1296.026994 1579.795892 1452.970355
xiaoxuesheng 1466.299325 1842.628048 1740.300587
TXPlayer 1377.054083 1615.514492 1532.490321
  3. For each downloading task, determine the winner of the two teams based on the reported adjusted speed score. The team that wins more tasks ranks higher. In this round, the execution sequence for each task was randomized. We sorted the score records by timestamp and chose adjacent scores for head-to-head comparison. Team xiaoxuesheng wins 221 of 335 head-to-head comparisons.
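
The head-to-head pairing described in rule 3 could be implemented along the following lines. This is a minimal sketch under stated assumptions, not the procedure actually used: the source only says that records were sorted by timestamp and adjacent scores were compared, so the record layout `(timestamp, team, score)`, the function name `head_to_head`, and the policy of skipping a record whose neighbor belongs to the same team are all hypothetical.

```python
def head_to_head(records):
    """Pair timestamp-adjacent records from different teams and count wins.

    records: iterable of (timestamp, team, score) tuples for exactly two teams.
    Returns a dict mapping each paired team to its number of wins.
    """
    ordered = sorted(records, key=lambda r: r[0])
    wins = {}
    i = 0
    while i + 1 < len(ordered):
        _, team_a, score_a = ordered[i]
        _, team_b, score_b = ordered[i + 1]
        if team_a == team_b:
            # No opponent directly adjacent: drop this record and slide forward.
            i += 1
            continue
        wins.setdefault(team_a, 0)
        wins.setdefault(team_b, 0)
        if score_a > score_b:
            wins[team_a] += 1
        elif score_b > score_a:
            wins[team_b] += 1
        # Both records are consumed by this comparison (equal scores count for nobody).
        i += 2
    return wins


# Made-up records for illustration only; these are not competition data.
sample = [
    (1, "A", 100.0),
    (2, "B", 90.0),
    (3, "B", 80.0),
    (4, "A", 70.0),
    (5, "A", 60.0),
]
result = head_to_head(sample)
```

In the made-up sample, the first two records form one comparison and the next two form another, while the last record has no partner and is left out, which is one way a comparison count can end up lower than the number of recorded scores.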

Testing Result

Rank Teamname Score
1 TXPlayer 163
1 xiaoxuesheng 163
2 No.1 162
2 MMMM 162
3 room111 160
4 MulNet 151
5 INSCTU 111
6 FluxPilot 45
7 NMSL_group2 0
7 PandasNTHU 0