End of Zune time

Like probably everyone else, I was amused when the news broke that most 30gb Zune players died simultaneously on the morning of December 31st, 2008. I looked like a blunder of epic proportions on Microsoft's part (Epic Fail, as it's called now). However, now that more information is available it's apparent that it's not really Microsoft fault. The only thing they can be blamed for is trusting the drivers Freescale provided for their parts. It's not unreasonable to assume that a big semiconductor maker would properly unit test their drivers. That did not happen, as a look at the source code reveals.

Someone helpfully posted the source code of the driver here. The guilty loop is located at line 259:

while (days > 365)
{
  if (IsLeapYear(year))
  {
   if (days > 366)
   {
     days -= 366;
     year += 1;
   }
  }
  else
  {
   days -= 365;
   year += 1;
  }
 }

At first, it looks like a simple off-by-one error. If it's a leap year, and the last day in the year (366), nothing is substracted and the loop never stops. However, replacing "days > 366" with "days >= 366" is incorrect as well, because that would set day to 0. Before I give the correct solution, let's look at some more problems with this driver.

The same loop, albeit slightly modified, is used in a different part of the driver:

while (day > 365)
{
    if (IsLeapYear(year))
    {
        numOfLeap++;
        if (day > 366)
        {
            OALMSG(OAL_RTC&&OAL_INFO, (TEXT("Leap Year:    %u"),year));
            day -= 366;
            year += 1;
            OALMSG(OAL_RTC&&OAL_INFO, (TEXT(", Days left: %u\r\n"),day));
        }
        else
        {
            OALMSG(OAL_ERROR, (TEXT("ERROR calculate day\r\n")));
            break;
        }
    }
    else
    {
        OALMSG(OAL_RTC&&OAL_INFO, (TEXT("Not Leap Year: %u"),year));
        day -= 365;
        year += 1;
        OALMSG(OAL_RTC&&OAL_INFO, (TEXT(", Days left: %u\r\n"),day));
    }
}

The irony here is that although it pops out an error message when days is 366, it breaks the loop and cotinues as if nothing happened, actually creating the correct results. Here, a unit test wouldn't have shown any problems, save for the debug message.

It's a bit unclear at first whether the first day in this timestamp schema is supposed to be 1 or 0. Consider this validity check for date input:

if ((lpst->wYear < ORIGINYEAR) || (lpst->wYear > MAXYEAR))
  return FALSE;
if ((lpst->wMonth < 1) ||(lpst->wMonth > 12))
  return FALSE;
if((lpst->wDay < 0) ||(lpst->wDay > month_tab[lpst->wMonth-1]))
  return FALSE;
if ((lpst->wHour > 23) ||(lpst->wMinute > 59) ||(lpst->wSecond > 59))
  return FALSE;

This code would let day 0 pass. It also doesn't have a problem with negative hours, minutes and seconds, but that's beside the point. The #define block at the top of the source file reveals that day 1 is supposed to be the first day:

#define JAN1WEEK         2                     // Jan 1 1980 is a Tuesday
 #define GetDayOfWeek(X) (((X-1)+JAN1WEEK)%7)

With that cleared up, the solution for the loop is that the boundary is variable, not constant. In a normal year, it needs to keep going until 365 days or less are left. In a leap year, the number is 366. Of course, since the actual year in unknown until the end of the (correct) loop, you can't simply decide that beforehand. The shortest fix would be just to break if 366 days are left in a leap year. Someone also provided this replacement for the loop:

while (days > 365+IsLeapYear(year))
{
  days -= 365+IsLeapYear(year);
  year++;
}

High marks for simplicity and elegance, low marks for calling the leap year check twice per loop. I propose:

while (days > (daysInYear = IsLeapYear(year) ? 366 : 365))
{
  days -= daysInYear;
  year++;
}

I'll be interested to see how they end up fixing it and if they decide to take on the other issues as well.

Leave a comment

Your comment