Friday, May 4, 2012

On hypocrites and technophobes...

The dire predictions of a post-SOPA world will undoubtedly remain speculation for years to come. What lessons did SOPA teach us, and what weaknesses did it expose? Is the internet really as "wild and dangerous" as they claimed?

To start with, let's look at who started the problem: the MPAA, and in particular, Chris Dodd. It's obvious that Mr Dodd suffers from a bit of technophobia. But is he really the root of the problem? Well, yes. But not for the obvious reasons. Anyone in a leadership position has teams of advisors researching information and gathering technical details. Mr Dodd did not make up a bunch of numbers and claim piracy was costing the MPAA some particular dollar amount. He likely assigned teams to investigate and return details favorable to the case for regulating technology.

Regulate? Enter the hypocrisy. Mr Dodd has done well in convincing Republican Senators of the need for strict government regulation of the internet. But isn't government regulation the opposite of what Republicans stand for? Looking deeper, we see the truth of the Republican agenda: it's about deregulation of business, not human rights. Corporations aren't people, as they like to claim; they're "super people". They deserve rights that normal people are not afforded. At least, that's how the Republican congressmen would like to make it.

So why isn't it obvious what Mr Dodd is doing wrong? His mistake is in not firing a large portion of his advisors over the debacle. He doesn't realize it, because he's both technophobic and without the right people advising him about technology and how it works. It's not that he's wrong to be wary; piracy does exist. But SOPA was more dangerous for the MPAA than it was for the pirates.

Mr Dodd, if you're somehow reading this, let me put it in very simple terms. The internet is much like the roadway system here in the US. If you look closely at what SOPA was actually capable of doing, it didn't stop any of the traffic. Instead, it gave the MPAA and RIAA the ability to block listings in the phone book. If you look closely at how DNS works, it's really just a simple way to look up someone's address by name. Without a phone book listing, people will have more trouble finding the businesses in question, but word of mouth and the exchanging of addresses will ultimately resolve this. Very small businesses that rely on people flipping through the phone book to find them would die out, but the majority would just be "harder to find" for a short while. Instead of going directly to a website by name, users will end up on a different site which does nothing but list the addresses of "blocked" sites. Once you know the address, it's a simple drive over.
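For the technically curious, here's just how thin that "phone book" layer really is. This is a minimal illustrative sketch using the standard POSIX resolver API (getaddrinfo), not anyone's production code: a name goes in, addresses come out, and nothing about the lookup stops you from driving straight to the address once you have it.

#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <cstdio>
#include <cstring>

int main(int argc, char** argv)
{
  const char* name = (argc > 1) ? argv[1] : "example.com";

  addrinfo hints;
  memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_INET;        // IPv4 only, to keep the example short
  hints.ai_socktype = SOCK_STREAM;

  // The "phone book" lookup: ask the resolver for the addresses behind a name.
  addrinfo* results = NULL;
  if (getaddrinfo(name, NULL, &hints, &results) != 0)
  {
    fprintf(stderr, "no listing for %s\n", name);
    return 1;
  }

  for (addrinfo* r = results; r; r = r->ai_next)
  {
    char addr[INET_ADDRSTRLEN];
    sockaddr_in* sin = (sockaddr_in*)r->ai_addr;
    inet_ntop(AF_INET, &sin->sin_addr, addr, sizeof(addr));
    printf("%s -> %s\n", name, addr);  // once you know the address, just drive over
  }

  freeaddrinfo(results);
  return 0;
}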

So what *really* would have come out of SOPA? The first, and likely the most dangerous, result of SOPA would be the loss of ICANN's power. If we use the phone book example above, it doesn't take much of a leap to realize that someone would quickly create a new "phone book" system. This system wouldn't rely on a central repository (the root DNS servers); instead, it would use peer-to-peer and off-shore servers to convey the information. Servers over which you have no control. Any site which isn't blocked on the root DNS servers could still be located over the new peer-to-peer system, because it can act as a relay. And any site you block at the root, the new system can share around, beyond your power or ability to control.

And you might be thinking to yourself, "we're smarter than that." Well, you're not. That's why your advisors should be fired. Let's take real evidence instead of your beliefs. Back in the early 2000s, a little service popped up which allowed the sharing of music. This service relied on a central server that everyone connected to, publishing the lists of songs they had on their systems. When you searched for a song, the server would tell you everyone connected who had a match, and you could download it from them. You don't need me to tell you which service this was. Your buddies at the RIAA probably still have disdain for the name alone. But what happened when it was shut down? Sharing moved to peer-to-peer systems. And suddenly, the RIAA wasn't going after big fish, but was instead being publicly vilified for going after individuals, sometimes minors, for sharing music over peer-to-peer networks. The MPAA is already in the age of peer-to-peer sharing, and it's struggling to find a way to stop the technology. The RIAA, meanwhile, has embraced technology to the point that nobody I know still pirates their music; instead, they just buy it from any number of legit online retailers straight from their mobile devices. The cost of piracy has exceeded the cost of purchase. Yes, even piracy costs time and money.

So for the rest of us, why was SOPA so dangerous? Because the loss of ICANN's control over name lookup would not be the end of the MPAA. On the contrary, in the struggle to fight a new system and protect their existing business models, they would propose ever crazier, unprecedented lockdowns of the internet infrastructure itself. Still failing to understand how SOPA could have gone wrong, and still failing to look at how the technology could and would evolve around the regulations, they would continue to lash out aimlessly at the internet, hoping to stop the one thing that would survive. Piracy is like a cockroach; it can survive almost anything. But the RIAA has learned a few things. People *want* to give you their money. I cancelled Netflix. Not because it was too expensive, but because none of the movies I wanted to watch were available. I'd pay more than they currently charge per month if a much wider selection of movies were available. I'd be willing to have a cap on the number of movies I could watch per month for a subscription tier. I'm *not* willing to pay $4 to "rent" a movie for 24 hours that oftentimes doesn't even play correctly, thanks to the layers and layers of DRM that prevent the legit viewing of the content while not stopping the pirates in the slightest.

A look at the history of DRM systems quickly shows that DRM causes more headaches for legitimate owners than for pirates. I will publicly admit to having downloaded cracks for software I legitimately purchased, just because the DRM was excessive and annoying, sometimes to the point of causing system instability or security vulnerabilities.

So what now? Well, nothing, really. I don't expect Mr Dodd to realize the problems he's facing. He's a former senator, not a technology maven. I'm also not advocating ignoring the problem. It needs to be faced, but he needs to recognize the need to hire people who understand the problem and, more importantly, understand the technology. If the best advice your advisors can give you for preventing people from driving somewhere is to take away listings from their phone book, you've really got some dumb advisors.

And for the Republican Senators? What do you *really* stand for? Stop the lies, stop the hand waving, stop the "it's someone else's fault". Do you believe in deregulation, or do you believe in corporate control of America? We know where Lamar Smith stands on the issue; he's never hidden his distaste for the American people, pirating pictures for his own campaign site while trying to pass laws to prohibit others from doing the same.

Monday, March 12, 2012

Class Diagrams FTW

So it should come as no secret by now that I'm trying to do the design phase right the first time. I'm using UML as a tool to help me put my thoughts and plans onto paper, and eventually into source code. The biggest reason is so that I can learn how to express my software designs in UML. But it also makes for an easier job of coming up with all the unit test cases that run against the individual interfaces, ensuring everything is written correctly and running smoothly.

So without further ado, I want to attach a snapshot of part of the class diagram that makes up the "Volume Manager". This is the block of Phoenix that is ultimately responsible for everything Phoenix does. If you want to read from or write to any place on the device, this is where the traffic gets routed.

[Class diagram snapshot: the Volume Manager]

It's not complete, but it's getting there.

Now, to answer some other questions:

1. Why am I working on another recovery? The #1 answer to this is that I wasn't done with recovery when TWRP 2 came out. There is so much more power available to us, but it's often overlooked. Google itself passes over recovery without so much as a cursory glance for end users. I think differently. I believe that new ROMs are a great part of the Android community, and the ability to streamline their access and use is key to the development community. Numerous tools have been written to help with this, such as Titanium Backup, ROM Manager, Kernel Manager, etc. What all of these tools lack is a single, powerful recovery capable of delivering on the promise we see in the phone but don't see available as a product. I plan to deliver on that promise.

2. Is it an extension of (insert other recovery system here)? No. It's written from scratch. And by scratch, I don't mean "from Google's Android Recovery". No, Phoenix is not based on *any* recovery, including Google's. Would I like to see it shipped from Google as the standard recovery? Of course. And for the figure listed here (http://www.moneymind.sg/2008/04/how-much-does-google-employee-earn.html?m=1), plus a little stock and relocation to Chicago, I'd be happy to give the entire design and code base to Google. I love NVIDIA, and I'm fairly loyal. I'm *not* $220,000/yr loyal. Let's be real. I've got a wife and kids I'd love to give the world to.

So if you have any other questions, feel free to ask.

Falling behind...

So I know a lot of people may be wondering about any news on Phoenix. Sadly, progress has been slow. I'm still in the learning phase on a lot of pieces of life, and sometimes those take priority over my favorite hobby, programming. Recently, I've transitioned responsibilities at work, as well as had some family visiting. The design for Phoenix flows cleanly through my head, and I've gotten some of the pieces down on paper. Soon, I should have the majority of the architecture laid out in both UML and source code, and some flow diagrams to share with everyone...

Friday, February 17, 2012

Mock me all you want...

So today is a pretty big day: the infrastructure for Phoenix is in place. The source code is kept in Git, with code reviews done through Gerrit. For code verification, Jenkins is the tool of choice, and of course, Jenkins is also handling the mainline branches on a per-changelist basis.

So now to talk a bit more about mocking. Or, in particular, mock objects. The architecture is in place, many of the design issues have been hashed over, and it's time to start putting a few pieces of the puzzle into place. Developing for a large variety of devices is difficult, and developing on the device itself is even trickier. But fortunately, because of the object-oriented nature of Phoenix, this task has been greatly simplified with a combination of mock objects and cxxtest. So what is the purpose of a mock object? Let me show you with a simple example.

Let's take the block reader/writer design as an example. Normally, on a device, the block reader/writer handles all direct read and write requests to the flash memory. But how do I test the classes which use the block reader/writer without messing up the device? What if there's a bug?

#include <cstring>  // memcpy

// A fake block device: 32 blocks of 512 bytes, held entirely in memory.
class MockBlockReaderWriter : public IBlockReaderWriter
{
public:
  int readBlock(unsigned long int blockNum, void* block)
  {
    if (blockNum > 31)  return 1;  // out of range
    memcpy(block, mBlocks[blockNum], sizeof(mBlocks[blockNum]));
    return 0;
  }

  int writeBlock(unsigned long int blockNum, void* block)
  {
    if (blockNum > 31)  return 1;  // out of range
    memcpy(mBlocks[blockNum], block, sizeof(mBlocks[blockNum]));
    return 0;
  }

  int getBlockSize(void) { return 512; }

private:
  unsigned char mBlocks[32][512];
};

This is a mock object. Now, to do this better, I'd pre-initialize the data in the blocks to something (anything) that lets me recognize unread sections. I might even add special 'test' APIs that allow my unit tests to verify that all blocks were read or written, etc.
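Here's a hedged sketch of what those additions might look like. The tracking members and coverage checks below are my own illustration, not actual Phoenix code:

// Same mock as above, plus pre-initialized data and coverage tracking.
class TrackingMockBlockReaderWriter : public IBlockReaderWriter
{
public:
  TrackingMockBlockReaderWriter()
  {
    memset(mBlocks, 0xDE, sizeof(mBlocks));  // recognizable "never written" pattern
    memset(mReads, 0, sizeof(mReads));
    memset(mWrites, 0, sizeof(mWrites));
  }

  int readBlock(unsigned long int blockNum, void* block)
  {
    if (blockNum > 31)  return 1;
    mReads[blockNum]++;
    memcpy(block, mBlocks[blockNum], sizeof(mBlocks[blockNum]));
    return 0;
  }

  int writeBlock(unsigned long int blockNum, void* block)
  {
    if (blockNum > 31)  return 1;
    mWrites[blockNum]++;
    memcpy(mBlocks[blockNum], block, sizeof(mBlocks[blockNum]));
    return 0;
  }

  int getBlockSize(void) { return 512; }

  // The 'test' APIs: did the code under test touch every block?
  bool allBlocksRead() const
  {
    for (int i = 0; i < 32; i++)
      if (!mReads[i]) return false;
    return true;
  }

  bool allBlocksWritten() const
  {
    for (int i = 0; i < 32; i++)
      if (!mWrites[i]) return false;
    return true;
  }

private:
  unsigned char mBlocks[32][512];
  unsigned int mReads[32];
  unsigned int mWrites[32];
};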

Now, I write three more blocks of code... The first: a unit test around the mock block reader/writer. Why? Because I need to know that it does exactly what I'm expecting it to do. This will be identical to the unit test I write to verify the *real* block reader/writer, so I'm really killing two birds with one stone. I'll just make the test case smart, in that it takes in the type of block reader/writer to test with and verifies full functionality (a concrete sketch follows below).

The second block of code to write is the real code for one of the classes that uses the block reader/writer. Since I have a block reader/writer that I can test with, I can write real code without even having access to a real block reader/writer.

And finally, I write unit tests for *that* class. If other classes depend on it, I'll write a mock object for it too, so that those classes can be written and tested without needing the real implementation.
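Here's a hedged sketch of what that shared test case might look like under cxxtest. The suite and helper names are mine, not Phoenix's; swapping the mock for the real implementation would reuse the exact same checks:

#include <cxxtest/TestSuite.h>
#include <cstring>

class BlockReaderWriterTests : public CxxTest::TestSuite
{
public:
  void testMockBlockReaderWriter()
  {
    MockBlockReaderWriter mock;
    verify(&mock, 32);  // the same routine later verifies the *real* implementation
  }

private:
  // Takes any IBlockReaderWriter and verifies its full read/write behavior.
  void verify(IBlockReaderWriter* rw, unsigned long int numBlocks)
  {
    unsigned char out[512];
    unsigned char in[512];
    memset(out, 0xA5, sizeof(out));

    // Every valid block should round-trip a write followed by a read.
    for (unsigned long int i = 0; i < numBlocks; i++)
    {
      TS_ASSERT_EQUALS(rw->writeBlock(i, out), 0);
      TS_ASSERT_EQUALS(rw->readBlock(i, in), 0);
      TS_ASSERT_SAME_DATA(in, out, rw->getBlockSize());
    }

    // Out-of-range accesses should fail cleanly instead of corrupting memory.
    TS_ASSERT_DIFFERS(rw->readBlock(numBlocks, in), 0);
    TS_ASSERT_DIFFERS(rw->writeBlock(numBlocks, out), 0);
  }
};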

Now, how does this affect development time? It shortens it. I'm sure that sounds backwards, but the reality is, with a good build system and a great unit test framework, the time it takes to test the code is greatly reduced, and more importantly, so is the time spent debugging strange issues that could potentially brick our limited number of test devices. Because the unit tests run under the host OS, they can be debugged using normal debugging tools, making quick work of large numbers of bugs.

So as we start to see those empty branches populating with more and more code chunks, we can sleep easily at night knowing that Jenkins is on it, making sure our trees remain green and our sodas remain caffeinated.

Monday, January 23, 2012

Slowly seeing the light at the end of the design tunnel

So the design phase for Phoenix is starting to overlap with the design verification phase. For a properly validated design, we take all of the pieces of the design and find ways to verify their functionality using mostly existing software: things like scripts and manual command entry, some quick GUIs, and occasionally reviewing results with a hex editor. The idea is to prove that when we implement the design, it will do everything we want it to do. Any features that aren't verified this way need to be double-checked to ensure we can deliver them with the existing design.

The goal is to see the design phase completed, and not have to go back and kludge in anything.

I'll write more about this soon, but I'm tired and it's been a crazy January... Expect things to pick up soon.

Friday, January 13, 2012

A technical dive...

So recently, most of my posts have been related to the passing of one of our beloved cats, and soon there will be posts about the pending passing of *another* one of our cats. Yes, that's right, the cats in our household are cursed, apparently. The good news is that the two conditions aren't related; the bad news is that it doesn't help.

So instead of talking about my personal life, while I've got a few moments of a clear brain, I'm going to go into a bit of a technical dive into the design of Phoenix, as well as the reason for some of the architectural decisions I'm making.

To better understand the "Phoenix Recovery System" as I've been calling it, it's helpful to understand the parts of the system. The first, and most obvious portion of Phoenix is the "recovery mode". Recovery mode is a special mode in Android devices that allows you access to the main partitions of the device to apply updates or repair damage. The second portion, and less obvious to some, is the Android system itself. While some recovery engines have a UI that you can use for preparing the recovery partition to perform some tasks, we're taking it a step further. Phoenix is a complete package, not just a recovery mode with the ability to tack on a UI. The flow between the recovery mode and the application will be seamless and easy. The final portion of the system is the computer portion.

Now this needs a bit more explaining. For some devices, when things go wrong, you can just remove the battery, hold down some buttons, and start up. But some devices aren't this friendly. Most of the Samsung devices are like this. If a problem occurs with the kernel, the device won't boot to recovery, either. This is where recovery from a computer becomes necessary, and why we're investigating Windows and Linux support for bringing these devices back up and restoring a valid image. Since I'm not a Mac developer, we'll have to see what can be done to get Mac users the same full experience.

Because the most important part of the recovery system is the recovery engine itself, that is what I'm going to focus on. So the first question I will answer is: what's wrong with Google's/CWM/Amon_RA/TWRP/etc.? Well, at a high level, they are all based on some very simple designs that met the needs of Google at the time. The underlying recovery system is the same between them, with the exception of the nandroid portion, and even the nandroid mechanisms are very similar. I disagree with their architecture, with the exception of Google's, which really only needed to deliver the smallest bit of functionality.

So what does the design look like? Well, for starters, the whole recovery is being written in C++. Now, I'm very pragmatic in my approach to a language, so don't expect to find a lot of templates of templates of templates. And yes, I realize that with the software not being open source, you won't find anything at all, but come on, cut me a break here and just follow along. C++ brings to the table the ability to use namespaces and classes, and believe me, they will both be used. Nothing will exist that doesn't belong to either a namespace or a class.

I use a pretty common naming scheme: all interfaces (pure virtual classes which represent abstract models) start with I. So the generic interface for a volume is IVolume. Let me give a good example of how a format call would behave on a device...

IVolume* volume = VolumeManager::FindVolume("/system");
if (!volume)  return -1;

return volume->Format();

So what did that do? Well, it called FindVolume, a static member function of the VolumeManager class, and retrieved an IVolume interface pointer.
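(As a quick aside, here's a hedged sketch of what the IVolume interface itself might declare. Only Format actually appears in this post; the rest of the interface is left out rather than guessed at.)

class IVolume
{
public:
  virtual ~IVolume() {}

  // Format the volume; FST_Default lets each implementation choose its own type.
  virtual int Format(FSTYPE fsType = FST_Default) = 0;

  // ... the rest of the volume operations go here ...
};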
But what is the IVolume interface pointer actually a pointer to? In this case, it points to an instance of the GenericVolume class. And what does the Format function do? Well, let's see...

int GenericVolume::Format(FSTYPE fsType /* = FST_Default */)
{
  int ret = 0;
  // Ask every partition in this volume to format itself.
  vector<IPartition*>::iterator iter;
  for (iter = mPartitions.begin(); iter != mPartitions.end(); ++iter)
    ret += (*iter)->Format(fsType);
  return (ret == 0 ? 0 : -1);
}

Well *that* was easy. But it doesn't do anything except iterate through a vector of IPartition pointers and call their Format members. So what does each IPartition point to? In this example case, it's an older device, and the only element of the vector is an MTD partition, named MTDPartition of course.

int MTDPartition::Format(FSTYPE fsType /* = FST_Default */)
{
  // For this example, we don't support setting the fsType to anything but default.
  if (fsType != FST_Default)
    return -1;

  // To format this device, we just erase all the blocks, using the MTDBlockReaderWriter class
  return getBlockReaderWriter()->Erase();
}

Again, a very simple and straightforward function. But again, it doesn't *actually* do anything. It's getting closer, though... So what does getBlockReaderWriter do? Let's look...

IBlockReaderWriter* MTDPartition::getBlockReaderWriter()
{
  if (!mBlockReaderWriter)
    mBlockReaderWriter = new MTDBlockReaderWriter(mDeviceName);
  return mBlockReaderWriter;
}

So our final dive into this design pattern is the MTDBlockReaderWriter, which implements the IBlockReaderWriter interface, which describes the erase call...

int MTDBlockReaderWriter::Erase(unsigned long int startBlock /* = 0 */, unsigned long int endBlock /* = 0xFFFFFFFF */)
{
  // For simplicity, I'm not going to describe the exact calls to erase either a single or a
  // group of MTD blocks, but the implementation of that code goes here. Very easy
  // to locate, very easy to unit test.
}

So as you can see from the flow, formatting the system partition of an MTD device may look like it goes down a complex path, but think about how simple that path really is. If this were an EMMC device, the only points of difference would be EMMCPartition and EMMCBlockReaderWriter (and if EMMCPartition used an IFileSystem rather than the IBlockReaderWriter to perform the format, you could format to different file system types). But IFileSystem doesn't know how to read or write to a block device; it gets the IBlockReaderWriter from the IPartition that instantiates it.
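To make that concrete, here's a hedged sketch of how an EMMCPartition might hand the job to an IFileSystem. FileSystemFactory and the IFileSystem::Format member are my own placeholders, not part of the actual design:

int EMMCPartition::Format(FSTYPE fsType /* = FST_Default */)
{
  // FileSystemFactory is hypothetical: pick a file system implementation for
  // the requested type, handing it our block access so it never needs to know
  // which kind of device sits underneath.
  IFileSystem* fs = FileSystemFactory::Create(fsType, getBlockReaderWriter());
  if (!fs)
    return -1;  // unsupported file system type for this partition

  int ret = fs->Format();
  delete fs;
  return ret;
}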

This design is meant to keep every piece of logic restricted to doing *exactly* what that piece of logic is intended to do. Nandroid isn't a single piece of logic; it's a massive set of operations against numerous block devices, file systems, etc. But because each volume isolates the underlying access mechanisms, the nandroid backup/restore can focus on doing its own job, but better.

And how hard is it to support new devices? Well, since a large portion of the design is self-detecting (meaning we don't need a bunch of config files and flags telling us what the partition types, block devices, etc. are), the code is very easy to bring up on different platforms. And this model will extend through every task of Phoenix. The screen has an interface for rendering. The touch panel and buttons have an event interface. There are even entire objects dedicated to the management of worker threads.
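As an example of what "self-detecting" can mean in practice, the kernel on older devices already publishes the MTD partition table in /proc/mtd, so no config file is needed to discover it. A hedged sketch, with simplified parsing and my own function name:

#include <cstdio>

void detectMtdPartitions()
{
  FILE* f = fopen("/proc/mtd", "r");
  if (!f)
    return;  // no MTD here; fall through to the next detection strategy

  char line[256];
  if (!fgets(line, sizeof(line), f))  // skip the "dev: size erasesize name" header
  {
    fclose(f);
    return;
  }

  // Each line looks like:  mtd3: 00480000 00020000 "system"
  unsigned int dev, size, erasesize;
  char name[64];
  while (fgets(line, sizeof(line), f))
  {
    if (sscanf(line, "mtd%u: %x %x \"%63[^\"]\"", &dev, &size, &erasesize, name) == 4)
      printf("found /dev/mtd%u: \"%s\" (%u bytes)\n", dev, name, size);
  }

  fclose(f);
}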

Even the data storage format for the nandroid backup is kept behind this type of interface pattern. There is an INandroidContainer interface which gives access to the contents of a nandroid backup. To support old TWRP backups, there is a TWRPNandroidContainer, and Phoenix has its own container, PhoenixNandroidContainer, which supports incremental backups. Because the model is designed from the ground up on paper, with the interfaces laid out in advance, we'll be able to develop and test for numerous devices more easily than ever before. Unit tests can be combined into packages that people can fastboot onto their devices to perform read-only tests of core functionality. Write tests can intentionally use known-safe blocks for read/erase/write/compare cycles to prove we can erase and write blocks. All of this makes implementing new features and functionality easier and far safer than today's recovery systems allow.
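To close the technical dive, here's a hedged sketch of the container pattern just described. The class names come straight from this post; the member functions are my own guesses at the kind of operations such an interface would need:

// One interface, many on-disk formats.
class INandroidContainer
{
public:
  virtual ~INandroidContainer() {}

  // Enumerate what's inside a backup, whatever its on-disk format.
  virtual int getImageCount() = 0;
  virtual const char* getImageName(int index) = 0;

  // Stream an image's contents back out for a restore.
  virtual int extractImage(int index, IBlockReaderWriter* dest) = 0;
};

// Old TWRP backups and Phoenix's own incremental format each implement the
// same interface, so the restore logic never cares which one it's reading.
class TWRPNandroidContainer : public INandroidContainer { /* ... */ };
class PhoenixNandroidContainer : public INandroidContainer { /* ... */ };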

Sunday, January 8, 2012

Sometimes Recovery isn't possible

The downside to following an individual developer instead of a company is that an individual developer is human. And as such, you're following all aspects of them, as all aspects of a developer's life affect the software. Today, this post isn't about bricked devices or lost data. Sometimes, the things we love most don't recover. This morning, my family lost our favorite cat, Boo. We rushed him to the emergency vet, and they tried their best, but whatever was wrong with him, it was his time. And with his passing, I'm reminded of the importance of my family, and how easily the things around us slip away.

Nearly four years ago, I met my wife in California. And from the start of our relationship, Boo was a part of it. One of the first things I bought for her was a scratching post for Boo. And from then on, whenever I came over, Boo would run over and lie on it to make sure I didn't take it away. We still have that scratching post. Unfortunately, it now lies empty.

Rest in Peace, Boo. We love you, and you'll always be a special part of our family.

"Each day, the things in our lives slowly pass on, but their memory remains with us for the remainder of time."