Storm (i)Cloud

Last year I did a series of posts here where I ran through problems I had encountered with Core Data’s iCloud integration, with various solutions and workarounds I had been able to devise. Then iOS 7 and Mac OS X 10.9 came out with numerous visible (and internal) updates and people started asking me, so, is it any better now? Can we use it?

Since then, the answer has been: I have no frickin’ idea. I work as a contractor, and for the past year Core Data with iCloud has not exactly been at the top of client requirement lists. If you’re not sure why, see my previous posts on the topic.

Back when I was setting up this site I didn’t anticipate so much iCloud-related content. But that’s how things have happened.

Storming (i)Cloud

But I mean to find out how things have changed. Lately I’ve started a spare-time project which I’ve been calling StormCloud. StormCloud is an app which puts iCloud through its paces to see how well it holds up. If nothing else, I’ve learned a few ways to really stress the system, and a few tricks about keeping an eye on iCloud to see if it’s behaving.

With that in mind, here’s some detail on what it actually does. Or will do, when it’s done.

First, StormCloud is a Mac OS X app. Most people reading are probably more interested in iOS, but at the moment this makes the most sense. On OS X it’s possible to observe iCloud a little more closely. You can watch log files, or reset the ubiquity daemon. You can browse the app’s data directly in Terminal or in Finder. While it may not be precisely the same system as on iOS, it’s pretty close. At some point I’ll probably do an iOS version as well.

Data Model

The data model in StormCloud is generic: it exists only to see whether iCloud can handle it, and it doesn’t reflect any real data. Entities have names like Entity. Still, I need something non-trivial to throw at iCloud. The model includes the following details, some of which have caused problems in the past and others of which have been useful in finding problems:

  • Entity inheritance. Specifically, abstract entities with at least two sub-entities.
  • Relationships between entities that inherit from one of those abstract parents.
  • Binary data attributes. In some cases these use Core Data’s automatic external storage, and in others the data just stays in the persistent store. Keeping a ton of binary data in the persistent store is generally a bad idea, but it’s supposed to work, and I plan on finding out if it does. Regardless, the real edge case with iCloud is (or has been, at least) automatic external storage, which I’ve had fail in interesting and complicated ways.
  • Custom unique IDs. Every entity has a string attribute that contains a UUID. In the past I’ve seen cases where this was the only way to definitively identify an instance, because the object manifested as an instance of the wrong entity. Fortunately the project was already using unique IDs for unrelated reasons.

Test Data

One more thing: there should be a lot of data. A few kB just isn’t going to cut it for a real test. A free iCloud account currently gets 5GB of storage. For a real test I’ll want to use at least 1GB of that. So I’m also including code to generate data to fill out the store file.

Numeric attributes can simply be random; I don’t care what they are, only that they don’t change unexpectedly. The same goes for dates.
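To illustrate "random, but must not change unexpectedly", here is a minimal Python sketch rather than anything from StormCloud itself. The function name, the field names, and the idea of seeding the generator so the same values can be regenerated later for comparison are all assumptions made for the example.

```python
import random
from datetime import datetime, timedelta, timezone

# Arbitrary reference date for generated timestamps (an assumption for this sketch).
EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)
TEN_YEARS_IN_SECONDS = 10 * 365 * 24 * 3600


def random_scalars(rng):
    """Return throwaway numeric and date values for one hypothetical record."""
    return {
        "count": rng.randint(0, 2**31 - 1),
        "ratio": rng.random(),
        "created": EPOCH + timedelta(seconds=rng.randint(0, TEN_YEARS_IN_SECONDS)),
    }


# The same seed reproduces the same values, so a synced copy can be
# checked against a freshly regenerated original.
first = random_scalars(random.Random(42))
again = random_scalars(random.Random(42))
```

Seeding is optional. Recording the generated values somewhere outside iCloud would serve the same purpose.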

Text and binary attributes take a little more work. In principle I could just fill these with random data. But based on past experience, I really want data that’s easy to recognize. I don’t want to have to compare two NSData blobs byte by byte to find out if they’re the same; I want to be able to look at the data and see. In one (unfortunately) memorable case, everything looked great until I started browsing images stored in two copies of a persistent store and noticed that they didn’t match up. That would have been harder to notice with random bytes.

As a result:

  • Text attributes are currently being filled up with random time zone names. You can get an array of known names via +[NSTimeZone knownTimeZoneNames]. I add those to text attributes until they reach whatever length I’m aiming for. It would be nice to get more variation but this satisfies the need for easy readability.

  • Binary attributes are all PNGs generated semi-randomly through Core Image. This works by choosing a generator filter, applying a distortion filter, and then compositing a timestamp string over the result. Both filters are semi-randomly configured with values restricted by the filter’s suggested min/max slider values. The resulting image is rendered at somewhere between 1024x1024 and 2048x2048.
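The text-filling approach can be sketched in a few lines. This is a Python illustration, not the app's code: the short hard-coded list stands in for the array returned by +[NSTimeZone knownTimeZoneNames], and the function name and the space separator are assumptions.

```python
import random

# Stand-in for the names returned by +[NSTimeZone knownTimeZoneNames].
SAMPLE_ZONE_NAMES = [
    "America/Denver",
    "Europe/Berlin",
    "Asia/Tokyo",
    "Africa/Nairobi",
    "Pacific/Auckland",
]


def fill_text(target_length, names, rng):
    """Join randomly chosen names with spaces until the text reaches target_length."""
    parts = []
    length = 0
    while length < target_length:
        name = rng.choice(names)
        # Count one separator for every name after the first.
        length += len(name) + (1 if parts else 0)
        parts.append(name)
    return " ".join(parts)


sample = fill_text(200, SAMPLE_ZONE_NAMES, random.Random(1))
```

The result overshoots the target by at most one name, which seems acceptable for test data. Every word in it is a recognizable time zone name, so a corrupted value should stand out on sight.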

Document-Based App

I’m going to want to experiment with multiple persistent stores, so the “shoebox” style app with a single persistent store won’t cut it. At the same time I don’t want any automatic management of the persistent store, so UIDocument-style convenience is right out. As of 10.9 this makes no difference on Mac OS X anyway, because NSPersistentDocument still doesn’t support iCloud. In order to test the API thoroughly I need to make all of the Core Data calls myself, even if other options exist.

Data Verification

Making sure that data syncs correctly seems to be the toughest part of the project. With iCloud up and humming I’ll end up with two or more copies of the persistent store on different Macs. But are they the same? This is not a hypothetical question. Past projects have abandoned iCloud due to data corruption, but it’s not always blindingly obvious when this has happened. If you end up with radically different data you’ll usually notice, if only because one Mac will have more data than others. But are all of the relationships correct? Is every object exactly the same on every device?

I’ll probably have to come up with something to effectively diff multiple persistent stores. I don’t know if that’s reasonable in the general case. In this case though, the fact that I’m using UUIDs everywhere will help. That should provide a reference for absolutely identifying what should be the same object in different stores, after which I can compare data and metadata to see if they’re actually the same.
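One possible shape for that diff, sketched in Python rather than against Core Data: assume each store has already been exported to a dictionary keyed by UUID, with each record a plain dictionary of its entity name, attributes, and the UUIDs of related objects. The export step, the function name, and the record layout are all hypothetical.

```python
_MISSING = object()  # distinguishes "field absent" from "field is None"


def diff_stores(store_a, store_b):
    """Compare two {uuid: record} mappings and report where they disagree."""
    only_in_a = sorted(store_a.keys() - store_b.keys())
    only_in_b = sorted(store_b.keys() - store_a.keys())
    mismatched = {}
    for uid in sorted(store_a.keys() & store_b.keys()):
        a, b = store_a[uid], store_b[uid]
        fields = sorted(
            key for key in a.keys() | b.keys()
            if a.get(key, _MISSING) != b.get(key, _MISSING)
        )
        if fields:
            mismatched[uid] = fields
    return only_in_a, only_in_b, mismatched


# Two hypothetical exports: one object is missing on the second Mac,
# and one came back as an instance of the wrong entity.
mac_one = {
    "u1": {"entity": "EntityA", "title": "Asia/Tokyo", "related": ["u2"]},
    "u2": {"entity": "EntityB", "title": "Europe/Berlin", "related": []},
    "u3": {"entity": "EntityA", "title": "Africa/Nairobi", "related": []},
}
mac_two = {
    "u1": {"entity": "EntityB", "title": "Asia/Tokyo", "related": ["u2"]},
    "u2": {"entity": "EntityB", "title": "Europe/Berlin", "related": []},
}

only_one, only_two, changed = diff_stores(mac_one, mac_two)
```

Storing relationships as lists of UUIDs means a broken relationship shows up as an ordinary field mismatch. Large binary attributes would probably be compared by a digest instead of inline.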

The sticking point is getting both persistent stores into the app at the same time without relying on iCloud to copy the data. Once I get persistent stores on multiple Macs I’ll need to get them all onto one Mac for verification. That might need to be external, out-of-band data transfer via Dropbox or even a thumb drive (the modern sneakernet).

Right now I’m aiming for detailed coverage of the following iCloud-related actions. Some of these will require a little thought about how they can be done and may require some out-of-band data transfer as discussed above.

  • Create a local persistent store and then migrate it to iCloud.
  • Migrate the same non-ubiquitous store to iCloud on multiple Macs.
  • Create a new ubiquitous persistent store and verify that it syncs.
  • Migrate a ubiquitous store back to local-only status.
  • Make both minor and major updates to existing data and verify correctness after syncing.
  • Remove local data for a persistent store and rebuild it from cloud-based data.
  • Delete app data externally (e.g. via System Preferences) while the app is running and while it’s not running.
  • Log out of iCloud and back in while the app is running.
  • Update the data model on only one Mac (which stops syncing), make changes on all Macs, then update the model on all Macs and see how syncing resumes.

Other approaches to banging on iCloud’s series of tubes may be added.

So what’s next?

I’m not sure if I’ll release StormCloud. If I do, it’ll be open source, but for some of the reasons described above it won’t ever be something that can be easily downloaded and used. Intentionally trying to trip up iCloud, and detecting if you’ve done so, is always likely to require a certain amount of out-of-band data movement and manipulation. It works as a testbench but I don’t know if it could work as a non-shitty app.

In the shorter term, while I develop it, I’ll be doing a new series of iCloud-related posts covering issues that arise along the way. As with my previous series, I’ll offer tips and workarounds where possible. By the time I’m done I hope to have a better answer for whether Core Data with iCloud can reasonably be used in a moderate to complex app.