Monday, January 23, 2012

Slowly seeing the light at the end of the design tunnel

So the design phase for Phoenix is starting to overlap into the design verification phase. For a properly validated design, we take all of the pieces of the design, and we find ways to verify functionality using mostly-existing software. So this would be things like scripts and manual command entry, some random GUIs and occasionally reviewing results with a hex editor. The idea is to prove that when we implement the design, it will do everything we want it to do. Any features that aren't verified in this way need to be double-checked to ensure we can deliver them with the existing design.

The goal is to see the design phase completed, and not have to go back and kludge in anything.

I'll write more about this soon, but I'm tired and it's been a crazy January... Expect things to pick up soon.

Friday, January 13, 2012

A technical dive...

So recently, most of my posts have been about the passing of one of our beloved cats, and soon there will be posts about the pending passing of *another* one of our cats. Yes, that's right, the cats in our household are cursed, apparently. The good news is, the two conditions aren't related, but the bad news is, that doesn't help.

So instead of talking about my personal life, while I've got a few moments of a clear brain, I'm going to go into a bit of a technical dive into the design of Phoenix, as well as the reason for some of the architectural decisions I'm making.

To better understand the "Phoenix Recovery System" as I've been calling it, it's helpful to understand the parts of the system. The first, and most obvious portion of Phoenix is the "recovery mode". Recovery mode is a special mode in Android devices that allows you access to the main partitions of the device to apply updates or repair damage. The second portion, and less obvious to some, is the Android system itself. While some recovery engines have a UI that you can use for preparing the recovery partition to perform some tasks, we're taking it a step further. Phoenix is a complete package, not just a recovery mode with the ability to tack on a UI. The flow between the recovery mode and the application will be seamless and easy. The final portion of the system is the computer portion.

Now this needs a bit more explaining. For some devices, when things go wrong, you can just remove the battery, hold down some buttons, and start up. But some devices aren't this friendly. Most of the Samsung devices are like this. If a problem occurs with the kernel, the device won't boot to recovery, either. And this is where a computer recovery is necessary. And this is why we're investigating Windows and Linux support for bringing these devices back up and restoring a valid image. Since I'm not a Mac developer, we'll have to see what can be done to get Mac users the same full experience.

Because the most important part of the recovery system is the recovery engine itself, that is what I'm going to focus on. So the first question I will answer is, what's wrong with Google's/CWM/Amon_RA/TWRP/etc.? Well, at a high level, they are all based on some very simple designs that met the needs of Google at the time. The underlying recovery system is the same between them, with the exception of the nandroid portion. But even the nandroid mechanisms are very similar. And I disagree with their architecture, with the exception of Google's, since Google really only needed it to deliver a minimal set of functionality.

So what does the design look like? Well, for starters, the whole recovery is being written in C++. Now, I'm very pragmatic about my approach to a language, so don't expect to find a lot of templates of templates of templates. And yes, I realize that with the software not being open-source, you won't find anything at all, but come on, cut me a break here and just follow along. C++ brings to the table the ability to use namespaces and classes. And believe me, they will both be used. Nothing will exist that doesn't either belong to a namespace or a class.

I use a pretty common naming scheme: all interfaces (pure virtual classes which represent abstract models) start with I. So the generic interface for a volume is IVolume. So let me give a good example of how a format call would behave on a device...

IVolume* volume = VolumeManager::FindVolume("/system");
if (!volume)  return -1;

return volume->Format();

So what did that do? Well, it called the VolumeManager class, which has a static member of FindVolume, and retrieved the IVolume interface pointer.
But what is the IVolume interface pointer a pointer to? Well, it actually points to an instance of the GenericVolume class. And what does the format function do? Well, let's see...

int GenericVolume::Format(FSTYPE fsType /* = FST_Default */)
{
  // Format every partition backing this volume; any failure fails the whole call.
  int ret = 0;
  std::vector<IPartition*>::iterator iter;
  for (iter = mPartitions.begin(); iter != mPartitions.end(); ++iter)
    ret += (*iter)->Format(fsType);
  return (ret == 0 ? 0 : -1);
}

Well *that* was easy. But it also doesn't do anything but iterate through a vector of IPartition pointers, and call their Format member. So what does each IPartition point to? In this example case, it's an older device and the only element of the vector is an MTD partition, named MTDPartition of course.

int MTDPartition::Format(FSTYPE fsType /* = FST_Default */)
{
  // For this example, we don't support setting the fsType to anything but default.
  if (fsType != FST_Default)
    return -1;

  // To format this device, we just erase all the blocks, using the MTDBlockReaderWriter class
  return getBlockReaderWriter()->Erase();
}

Again, a very simple and straightforward function. But again, doesn't *actually* do anything. But it's getting closer...  So what does getBlockReaderWriter do? Let's look...

IBlockReaderWriter* MTDPartition::getBlockReaderWriter()
{
  if (!mBlockReaderWriter)
    mBlockReaderWriter = new MTDBlockReaderWriter(mDeviceName);
  return mBlockReaderWriter;
}

So our final dive into this design pattern is the MTDBlockReaderWriter, which implements the IBlockReaderWriter interface, which describes the erase call...

int MTDBlockReaderWriter::Erase(long int startBlock /* = 0 */, long int endBlock /* = 0xFFFFFFFF */)
{
  // For simplicity, I'm not going to describe the exact calls to erase either a single or a
  // group of MTD blocks, but the implementation of that code goes here. Very easy
  // to locate, very easy to unit test.
  return 0;  // placeholder result so the example compiles (0 means success in these examples)
}

So as you can see with the flow, formatting the system partition of the MTD device may look like it goes down a complex path, but think about how simple that path really is. If this were an EMMC device, the only places of difference would be EMMCPartition and EMMCBlockReaderWriter. (That assumes EMMCPartition used the IBlockReaderWriter to perform the format; it's just as likely that an IFileSystem would handle EMMC, so you could format to different file system types.) But IFileSystem doesn't know how to read or write to a block device. It gets the IBlockReaderWriter from the IPartition class that instantiates it.
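To make that layering a bit more concrete, here is a rough, self-contained sketch that compiles on a desktop. The interface names (IVolume, IPartition, IBlockReaderWriter) and the GenericVolume, MTDPartition and EMMCPartition classes are the ones described above, but the signatures are simplified, and CountingBlockReaderWriter and FormatDemoVolume are hypothetical helpers invented purely for illustration. None of this is actual Phoenix code.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Lowest layer: raw block access. Signature simplified for this sketch.
class IBlockReaderWriter {
public:
    virtual ~IBlockReaderWriter() {}
    virtual int Erase() = 0;  // 0 on success, -1 on failure
};

// Middle layer: one partition on some kind of storage.
class IPartition {
public:
    virtual ~IPartition() {}
    virtual int Format() = 0;
};

// Top layer: what callers such as a format command actually talk to.
class IVolume {
public:
    virtual ~IVolume() {}
    virtual int Format() = 0;
};

// Hypothetical fake: counts erase calls instead of touching hardware.
class CountingBlockReaderWriter : public IBlockReaderWriter {
public:
    explicit CountingBlockReaderWriter(int* eraseCount) : mEraseCount(eraseCount) {}
    int Erase() override { ++(*mEraseCount); return 0; }
private:
    int* mEraseCount;
};

// MTD-style partition: formatting is just erasing the blocks.
class MTDPartition : public IPartition {
public:
    explicit MTDPartition(std::unique_ptr<IBlockReaderWriter> rw) : mRW(std::move(rw)) {}
    int Format() override { return mRW->Erase(); }
private:
    std::unique_ptr<IBlockReaderWriter> mRW;
};

// EMMC-style partition: the only class that differs for an EMMC device.
class EMMCPartition : public IPartition {
public:
    explicit EMMCPartition(std::unique_ptr<IBlockReaderWriter> rw) : mRW(std::move(rw)) {}
    int Format() override {
        // A fuller version might hand off to an IFileSystem here; this sketch only erases.
        return mRW->Erase();
    }
private:
    std::unique_ptr<IBlockReaderWriter> mRW;
};

// The volume never needs to know which kind of partition it holds.
class GenericVolume : public IVolume {
public:
    void AddPartition(std::unique_ptr<IPartition> p) { mPartitions.push_back(std::move(p)); }
    int Format() override {
        int failures = 0;
        for (auto& p : mPartitions)
            if (p->Format() != 0) ++failures;
        return failures == 0 ? 0 : -1;
    }
private:
    std::vector<std::unique_ptr<IPartition>> mPartitions;
};

// Builds a one-partition volume of the requested kind and formats it via IVolume.
int FormatDemoVolume(bool useEmmc, int* eraseCount) {
    std::unique_ptr<IBlockReaderWriter> rw(new CountingBlockReaderWriter(eraseCount));
    std::unique_ptr<IPartition> part;
    if (useEmmc) part.reset(new EMMCPartition(std::move(rw)));
    else         part.reset(new MTDPartition(std::move(rw)));
    GenericVolume volume;
    volume.AddPartition(std::move(part));
    IVolume& asInterface = volume;
    return asInterface.Format();
}
```

Calling FormatDemoVolume(false, &count) and FormatDemoVolume(true, &count) goes through exactly the same GenericVolume code; only the partition class underneath changes.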

This design is meant to keep every piece of logic restricted to doing *exactly* what that piece of logic is intended to do. Nandroid isn't a piece of logic, it's a massive set of operations against numerous block devices, file systems, etc. But because each volume isolates the underlying access mechanisms, the nandroid backup/restore can focus on doing its own job, but better.

And how hard is it to support new devices? Well, since a large portion of the design is self-detecting (meaning that we don't need a bunch of config files and flags to tell us what the partition types are and block devices, etc), the code is very easy to launch on different platforms. And this model will extend through every task of Phoenix. The screen has an interface for rendering. The touch panel and buttons have an event interface. There are even entire objects dedicated to management of worker threads.
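As one example of what self-detection can look like on older MTD devices, the Linux kernel already lists the MTD partitions in /proc/mtd, so a recovery can read that table instead of a per-device config file. The sketch below parses text in that format. The MtdEntry struct and ParseProcMtd function are names made up for this illustration, and whether Phoenix detects partitions this way is an assumption on my part; treat it as one possible approach rather than the actual mechanism.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// One row of /proc/mtd (hypothetical struct for this sketch).
struct MtdEntry {
    std::string device;           // e.g. "mtd0"
    std::uint64_t size = 0;       // partition size in bytes
    std::uint64_t eraseSize = 0;  // erase block size in bytes
    std::string name;             // e.g. "system"
};

// Parses text shaped like /proc/mtd:
//   dev:    size   erasesize  name
//   mtd0: 00500000 00020000 "boot"
// Header and malformed lines are skipped. Names containing spaces are not handled.
std::vector<MtdEntry> ParseProcMtd(const std::string& contents) {
    std::vector<MtdEntry> entries;
    std::istringstream lines(contents);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string dev, name;
        MtdEntry e;
        // size and erasesize are printed as hexadecimal without a 0x prefix.
        if (!(fields >> dev >> std::hex >> e.size >> e.eraseSize >> name))
            continue;
        if (dev.size() < 4 || dev.compare(0, 3, "mtd") != 0 || dev.back() != ':')
            continue;
        dev.pop_back();  // drop the trailing ':'
        if (name.size() >= 2 && name.front() == '"' && name.back() == '"')
            name = name.substr(1, name.size() - 2);  // strip the quotes
        e.device = dev;
        e.name = name;
        entries.push_back(e);
    }
    return entries;
}
```

On a device, the input string would be the contents of /proc/mtd; here it is just a parameter so the parsing can be unit tested on any machine.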

Even the data storage format for the nandroid backup is kept in this type of interface pattern. There is an INandroidContainer interface which gives access to the contents of a nandroid backup. And to support old TWRP backups, there is a TWRPNandroidContainer. Phoenix has its own container, PhoenixNandroidContainer, which has support for incremental backups. Because the model is designed from the ground up on paper with the interfaces being laid out in advance, we'll be able to develop and test for numerous devices more easily than ever before. Unit tests can be combined into packages that people can fastboot on their device which only perform read tests, to ensure core functionality. Write tests can be written which intentionally use known-safe blocks for read/erase/write/compare tests to prove we can erase and write blocks. All these features make implementing new features and functionality easy and far safer than today's recovery systems allow.
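Here is a small sketch of what such a read/erase/write/compare test could look like when it is written against an interface instead of real hardware. The Read/Write/Erase/BlockSize signatures are simplified guesses at what an IBlockReaderWriter might expose, and InMemoryBlockDevice and RunWriteSelfTest are hypothetical names invented for this example. The fake fills an erased block with 0xFF, which is how erased flash reads back.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Block;

// Simplified, hypothetical take on the IBlockReaderWriter interface.
class IBlockReaderWriter {
public:
    virtual ~IBlockReaderWriter() {}
    virtual std::size_t BlockSize() const = 0;
    virtual int Read(std::size_t block, Block& out) = 0;
    virtual int Write(std::size_t block, const Block& data) = 0;
    virtual int Erase(std::size_t block) = 0;
};

// In-memory fake so the test logic can run on a desktop, no flash required.
class InMemoryBlockDevice : public IBlockReaderWriter {
public:
    InMemoryBlockDevice(std::size_t blocks, std::size_t blockSize, bool dropWrites = false)
        : mBlocks(blocks, Block(blockSize, 0x5A)), mBlockSize(blockSize), mDropWrites(dropWrites) {}

    std::size_t BlockSize() const override { return mBlockSize; }

    int Read(std::size_t block, Block& out) override {
        if (block >= mBlocks.size()) return -1;
        out = mBlocks[block];
        return 0;
    }
    int Write(std::size_t block, const Block& data) override {
        if (block >= mBlocks.size() || data.size() != mBlockSize) return -1;
        // With dropWrites set, the fake reports success but stores nothing (simulated fault).
        if (!mDropWrites) mBlocks[block] = data;
        return 0;
    }
    int Erase(std::size_t block) override {
        if (block >= mBlocks.size()) return -1;
        mBlocks[block].assign(mBlockSize, 0xFF);  // erased flash reads back as all 1 bits
        return 0;
    }
private:
    std::vector<Block> mBlocks;
    std::size_t mBlockSize;
    bool mDropWrites;
};

// Read/erase/write/compare on one known-safe block, restoring its contents afterwards.
bool RunWriteSelfTest(IBlockReaderWriter& dev, std::size_t safeBlock) {
    Block original, readBack;
    if (dev.Read(safeBlock, original) != 0) return false;

    // Erase, then confirm the block reads back fully erased.
    bool ok = dev.Erase(safeBlock) == 0
           && dev.Read(safeBlock, readBack) == 0
           && readBack == Block(dev.BlockSize(), 0xFF);

    // Write a recognizable pattern, then confirm it reads back unchanged.
    Block pattern(dev.BlockSize());
    for (std::size_t i = 0; i < pattern.size(); ++i)
        pattern[i] = static_cast<std::uint8_t>(i * 31 + 7);
    ok = ok && dev.Write(safeBlock, pattern) == 0
            && dev.Read(safeBlock, readBack) == 0
            && readBack == pattern;

    // Always try to put the original contents back, even if a check failed.
    bool restored = dev.Erase(safeBlock) == 0 && dev.Write(safeBlock, original) == 0;
    return ok && restored;
}
```

Because the test only sees the interface, the same RunWriteSelfTest could in principle be pointed at a real MTD or EMMC implementation on a device, or at a deliberately broken fake to check that failures are actually caught.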

Sunday, January 8, 2012

Sometimes Recovery isn't possible

The downside to following an individual developer instead of a company is that an individual developer is human. And as such, you're following all aspects of them, as all aspects of a developer affect the software. And today, this post isn't about bricked devices or lost data. Sometimes, the things we love most don't recover. This morning, my family lost our favorite cat, Boo. We rushed him to the emergency vet, and they tried their best, but whatever was wrong with him, it was his time. And with his passing, I'm reminded of the importance of my family, and how easily the things around us slip away.

Nearly four years ago, I met my wife in California. And from the start of our relationship, Boo was a part of it. One of the first things I bought for her was a scratching post for Boo. And from then on, whenever I came over, Boo would run over and lie on it to make sure I didn't take it away. We still have that scratching post. Unfortunately, it now lies empty.

Rest in Peace, Boo. We love you, and you'll always be a special part of our family.

"Each day, the things in our lives slowly pass on, but their memory remains with us for the remainder of time."


Friday, January 6, 2012

TWRP: The final chapter...

If you haven't seen the blog post from TeamWin (http://www.teamw.in/blog/15), then you should read that first.

So let me start by being happy for TeamWin. TeamWin is an innovation incubator. As Vividboarder points out, they started with WiMAX. When they were done, they delivered all the code to the main CM tree. Once in the mainline tree, anyone and everyone can improve and port it. This gave all AOSP ROMs WiMAX, not just CM7. And the team moves on to innovate in other areas. It was on the HDMI project that I joined up with TeamWin, and we proved you could have a great HDMI experience with the EVO 4G. But I'm not like the rest of the team. I'm more of a "product" guy. So while the team again moved on to bring new innovations, I stayed behind and worked on HDMI more. But ultimately, that project ran its course, and the EVO 4G is now more a historic device than an active device.

So TeamWin has announced they are discontinuing TWRP. I'm fairly certain that if I weren't already continuing the work on it, as well as working on a next-generation recovery based on what I learned, they would have continued to support it. But that's not what TeamWin excels at. It excels at bringing new innovation to the community. And it's a waste of talent to slow them down on sustaining engineering.

What does this mean for people who love TWRP? It means that all those features will still be available and supported, albeit under a new, undecided name, until Phoenix is ready. And the TWRP theme engine will be improved, and I'll make my best effort to maintain backward compatibility for everyone. I'm a product guy. I want to build products and make them solid and stable. I failed everyone with TWRP2 and its bugs and instabilities. We were feeling pressure to release it, and wanted the world to get to play with it. But the reality was, it wasn't ready for all the devices we released it for.

I am sorry.

I take full responsibility for that decision, since I was ultimately calling a lot of the shots (which again, is part of the conflict that came up between the rest of the team and myself). I've been asked not to use the original graphics, so I'm in the process of generating all-new graphics for the continued support builds. But this won't change the functionality. And even Phoenix will fully support restoring TWRP backups, so don't worry about losing your existing backups with any of the upcoming changes I'm making.

How can you help? Well, I'm going to need device testers going forward. But please don't volunteer below. When I get closer to ready, I'll be using Rootzwiki to help sign people up, and build a testing team.

And finally, I want to address a few people directly with some parting thoughts...

kevank: You run an amazing set of servers. I'm sure the team will never suffer from growing pains with you at the helm.

shift: Graphics are your thing, but you have more potential in you to own the whole user experience. Think of it as interactive art, and you'll make some truly incredible things.

toastcfh: There isn't much I can compliment you about that the community as a whole hasn't already expressed. You are a one-of-a-kind developer, a true believer in free software almost to a fault. And while this whole fiasco has caused more chaos in a less-than-ideal time, know that I'm still fighting for you, and wish you the best. But don't waste your time being a support monkey, you're so much bigger than that. Show the world the power of their devices, and let others support it while you move on to amaze the world again.

vividboarder: It was a real pleasure working with you. You're a great developer, and if you keep at it, you'll be designing complex software soon enough.

dees_troy: First, sorry for getting you caught up in the middle of the fiasco. I underestimated the collateral damage my departure could make. Second, you learn incredibly fast, and your eagerness to grow and learn set you apart from a lot of seasoned developers. Don't lose that. With 21 years of C programming, I still find myself learning from everyone I work with, including you.

onicrom: Thanks for all the help you gave. Developers are never an easy bunch to deal with, and you were very helpful. I owe you a Galaxy Nexus build of TWRP. I snubbed the device a bit, and I shouldn't have. Because that meant I snubbed you a bit, and that's not fair.

eyeballer: You and Onicrom made the Nook Color builds work. Yeah, the developers may have written the code, but you guys did a lot of the homework and testing. It was appreciated.

s0up: Didn't get to work with you for long, but your enthusiasm definitely made you stand out. I'm sure you'll continue to do great things for the team, as you already have.

assassins_lament: TWRP was initially your baby, and given the baseline you had to start with, you did a great job. Sorry we didn't get to work on much of it together, but you made a lot of users very happy.

big_biff: I know you worked more with dees_troy than me, since I couldn't get on IRC during the day. Most of the work I did was tossed over the "wall" to dees_troy to have tested by everyone, so I didn't get the direct interaction. But you were always quick with humor and fun to be around. Thanks.

dkelle4: We really didn't interact much, again because I wasn't online all that much. But you definitely livened up the chat. Even when you weren't there.  ;-)

myndwire: Near the end, I didn't really get to talk with you much. And in the beginning, it took a bit for me to realize myn and myndwire are *not* the same person. But it was a real pleasure being on the team with you and your input and help during my earlier projects.

netarchy: It was great meeting you at the BBQ, and your kernels are unparalleled. If there's ever anything you need from me to make your kernels shine, just ask.

spiicytuna: None of my work would exist without your help. Thanks to you, Windows is now my secondary OS, not my primary. I never would have developed half the software I did without your contributions. And you put up with me and my mood swings. I appreciate that. I'm sure we'll work together again.

shinzul: You started it all, so it's fitting that I end it with you. I met you trying to find a way to help with WiMAX, and you gave me the chance to work on HDMI. While you haven't been around much lately, you are still considered by most to be the "head" of TeamWin, and I thank you for the numerous opportunities you gave me. Thank you.

Thursday, January 5, 2012

The road to Phoenix...

I feel an obligation to the community to support all the users who installed TWRP2. But since I'm no longer in TeamWin, I don't have access to their repositories anymore, nor are the communications between myself and the majority of the team very productive. I extended an offer to them to work on Phoenix as a joint effort, which as expected, was rejected. They will continue to support TWRP and take it in their own direction, although they've changed their stance and will carry the GUI forward. I, on the other hand, may look to offer another team the option to be part of Phoenix (it really is that big of a project).

So in the meantime, since I still feel obligated to the community, I'm *also* still doing development work against the RC tree, but now hosted on a different git server. From this, I've already begun some of the fixes and improvements, as well as continued evolution of the XML theme engine. Unfortunately for themers, this means there will be two theme-able recovery engines out there of similar but not identical functionality.

I'm working hard to clean up the theme style for the Phoenix architecture, and making life a lot easier on themers. Hopefully, people will find my improvements both beneficial and easier to use than the original style.

The upcoming releases for devices (including the Epic Touch 4G) are *NOT* Phoenix. Believe me, Phoenix is a completely different beast, and won't be ready as quickly as people deserve bug fixes and promised products.

And fear not, Kindle Fire folks, I'm getting close to buying a Kindle Fire so I can give you the 100% support you need with a full GUI recovery engine.

And the Galaxy Nexus? Yes, I'm hoping to release a build for you guys, too. I only get sporadic access to the device, making it very difficult to find and fix the graphics bugs. But I will find and fix them.

Tuesday, January 3, 2012

... and rises from the ashes

... Phoenix Recovery

So some of you have no doubt heard that I've begun work on a new recovery. And when I say a new recovery, I don't mean another ClockworkMod or Amon_RA or TWRP. I mean a new recovery. From scratch. Yes, I'll be importing Google's edify engine, since I obviously have to support edify. But even that will be tweaked to use the new engine's interfaces. So I've started with one of the most powerful yet least used tools in a software architect's arsenal... A notebook. Not a laptop or a mobile PC, but a bunch of paper bound together with a metal spiral. And a pencil. I've got pages of interface designs, flow diagrams, feature requirements, etc. And I'm not yet ready to touch the computer.

I've learned something important in the blog comments and the twitter responses. I was doing TWRP for the wrong reason. And because of that, it didn't live up to my expectations. In my opinion, TWRP 2 was kind of a failure. And I finally understand why. I had the wrong motivation. I wanted to revolutionize recovery. But it doesn't. The only device it was revolutionary for is the Kindle Fire.

So what is my new motivation? Money, of course. I want to sell a product. I want to sell you all a recovery engine. Now many of you are probably thinking nobody would pay for recovery. And that's exactly why money is the motivation. If I had to put a price tag on TWRP 2, I don't believe many people would actually pay for it. It's got some bugs, support is sporadic, and it doesn't have an Android app. It's not a product I'd put my money out for.

So if I want you to give me your money, I need to offer you something you can't get for free. I need to be not just better, but better enough that you'd pay for the features and functionality it provides. If I can make a recovery system that is so good, you'd pay me $5 for a copy, (price has no reflection on plans, just throwing out a number), then I've made a better product. Nobody is going to pay me to add a startup animation to an otherwise run-of-the-mill recovery.

So it's back to my notebook for me. I've got a recovery engine to design.

Moving on...

I've decided to write up two blog entries tonight, since the topics are completely separate. So there have been a lot of questions about my departure from TeamWin, what happened, etc. Ultimately, it comes down to the fact that I take ownership of the software I write, as far as design and direction. For some groups and teams, this is fine. Heck, even within TeamWin, some members have "pet" projects that the rest of the team doesn't get any say in. The problem is, TWRP isn't a "pet" project, and so ownership of it is considered shared across both the developers and non-developers. It doesn't make sense for me to work on a "pet recovery project" while working on a team that has a recovery project. And if I'm going to invest as much time and money into a project as I have for TWRP 2, I need to have a certain level of ownership. And with HDMI a distant memory, it's the only project I had to work on. So I finally decided, as I was identifying how much of the code couldn't be salvaged, that it was time for me to move on to do my own things.