In today's installment of my continuing series on using iCloud with Core Data I'm going to discuss how factors beyond your control may render iCloud unusable even when everything is working normally. The current API can also require complex workarounds to get decent app performance. Through this, keep in mind that, as with my previous post, I'm sticking to how the API is designed to work in the absence of bugs in the implementation.

Bringing up iCloud with Core Data

The basic approach to getting Core Data working with iCloud is something like:

  1. When you create your Core Data stack, tell it where to save data in iCloud.
  2. Listen for incoming change notifications and update your app to reflect new data.

There is no step 3. This is the same as any basic Core Data app, with the addition that the persistent store coordinator talks to iCloud on your behalf. This leads to a Core Data stack that looks something like this:

[Diagram: single-stack]

The (simplified) code to do this is something like:

- (NSPersistentStoreCoordinator *)persistentStoreCoordinator {

    if (_persistentStoreCoordinator != nil) {
        return _persistentStoreCoordinator;
    }
   
    _persistentStoreCoordinator = [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel: [self managedObjectModel]];

    dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
        NSFileManager *fileManager = [NSFileManager defaultManager];
       
        NSString *storePath = [[self applicationDocumentsDirectory] stringByAppendingPathComponent:@"MyApp.sqlite"];
        NSURL *storeUrl = [NSURL fileURLWithPath:storePath];
       
        NSURL *ubiquityURL = [fileManager URLForUbiquityContainerIdentifier:nil];
        NSString* coreDataCloudContent = [[ubiquityURL path] stringByAppendingPathComponent:@"myApp_v3"];
        NSURL *cloudURL = [NSURL fileURLWithPath:coreDataCloudContent];

        NSDictionary* options = @{ NSPersistentStoreUbiquitousContentNameKey : @"com.example.myapp.3",
            NSPersistentStoreUbiquitousContentURLKey: cloudURL };

        [_persistentStoreCoordinator lock];
        NSError *error = nil;
        if (![_persistentStoreCoordinator addPersistentStoreWithType:NSSQLiteStoreType configuration:nil URL:storeUrl options:options error:&error]) {
            // No persistent store is available, try to recover somehow.
        }    
        [_persistentStoreCoordinator unlock];

        dispatch_async(dispatch_get_main_queue(), ^{
            [[NSNotificationCenter defaultCenter] postNotificationName:@"RefetchAllDatabaseData" object:self userInfo:nil];
        });
    });
   
    return _persistentStoreCoordinator;
}

The dispatch_async call seems a little weird if you've done non-iCloud Core Data. It's necessary because setting up iCloud may take a while. In the meantime you have a persistent store coordinator but no persistent store. Fetches will return no results, because there's no store to get results from.
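Step 2 of the basic approach, listening for incoming changes, could look roughly like the following. This is a sketch rather than code from a shipping app: the helper name `ObserveCloudImports` is made up, and it assumes `context` was created with a queue-based concurrency type so that `performBlock:` is available.

```objective-c
#import <CoreData/CoreData.h>

// Hypothetical helper (not part of Core Data). Returns the observer token so
// the caller can pass it to -removeObserver: later.
static id ObserveCloudImports(NSPersistentStoreCoordinator *coordinator,
                              NSManagedObjectContext *context)
{
    NSNotificationCenter *center = [NSNotificationCenter defaultCenter];
    return [center addObserverForName:NSPersistentStoreDidImportUbiquitousContentChangesNotification
                               object:coordinator
                                queue:nil
                           usingBlock:^(NSNotification *note) {
        // The notification is not delivered on the context's queue, so hop
        // over to it before touching the context.
        [context performBlock:^{
            [context mergeChangesFromContextDidSaveNotification:note];
        }];
    }];
}
```

After the merge you'd still need to tell the UI to refresh, for example by posting something like the RefetchAllDatabaseData notification used above.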

The real stumbling block in this approach is the call to addPersistentStoreWithType:etcetera: (it's too damn long to type the whole name every time). Apple warns that it may block. In fact, it may block for a long time. This is where Core Data tries to reach the iCloud service. If it succeeds, it starts downloading any of your app's data that isn't on the local device yet. The method doesn't return until this download process is finished.

So, what if you're on an iPhone, with an unreliable cell phone network, and you have a lot of data to download? Well, do the math. That method blocks for a long time. On iOS, iCloud data only gets downloaded on demand, and addPersistentStoreWithType:etcetera: is where that demand is made. On Mac OS X, iCloud is supposed to be "greedy" and download everything as soon as it's available. In practice this is often the case, but it frequently ends up acting more like iOS, not noticing new iCloud data until you ask for it.

The upshot? I've seen addPersistentStoreWithType:etcetera: block for 30 minutes. And keep in mind, this is when iCloud is working normally. This is not an error condition, this is "working as designed".

Now, think about what happens during this time:

  • Because there's only one Core Data stack which doesn't have a persistent store yet, you can't show the user any of their data. Even data that already exists locally is inaccessible during this delay.
  • Since there's no persistent store, you also can't save changes. Of course there's no existing data to change, but you also can't create and save any new data.

The whole point of the dispatch_async call is to keep your app responsive during the iCloud startup delay. But as the delay gets longer and longer, you may well wonder what the point is. Who cares if the UI is responsive if you can't actually do anything with it?

SharedCoreData: Refining the iCloud Setup Process

Apple's SharedCoreData sample app, which I've mentioned before, refines the process somewhat. It introduces the idea of a fallback store, a separate persistent store not connected to iCloud. The fallback store is used when iCloud is not available, for example if the user doesn't have an iCloud account or is not currently logged in to their account.

The approach then becomes:

  • At launch time, check whether iCloud is available.

    • If so, use it more or less as described above. If a fallback store exists, copy every object in it to the iCloud store and then remove duplicates.
    • If not, use a local-only fallback store.
  • If iCloud availability ever changes (maybe the user created or logged in to an iCloud account while the app was running),

    • Shut down whatever store you're using
    • Repeat the initial setup, including the check for iCloud

Now, this doesn't actually solve the long delay problem, and I don't think it was intended to do so. The main purpose of the code is to demonstrate how to handle cases where the user logs in or out of iCloud. If iCloud is available, it ends up doing more or less the same thing as in the code from above, with the same potential results.

It does, however, introduce the idea of using a second persistent store and of copying changes between them. That's useful. It doesn't go nearly far enough, though.
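The availability check and the "availability changed" trigger from the list above can be sketched like this. I'm assuming iOS 6 or OS X 10.8 or later, where `ubiquityIdentityToken` and `NSUbiquityIdentityDidChangeNotification` exist. The function names are mine, and SharedCoreData itself may well do this differently.

```objective-c
#import <Foundation/Foundation.h>

// YES if the user is currently signed in to an iCloud account.
static BOOL CloudIsAvailable(void)
{
    return [[NSFileManager defaultManager] ubiquityIdentityToken] != nil;
}

// Hypothetical helper: calls `handler` on the main queue whenever the
// signed-in iCloud account changes. Returns the observer token.
static id ObserveCloudAccountChanges(void (^handler)(BOOL available))
{
    return [[NSNotificationCenter defaultCenter]
        addObserverForName:NSUbiquityIdentityDidChangeNotification
                    object:nil
                     queue:[NSOperationQueue mainQueue]
                usingBlock:^(NSNotification *note) {
        // This is where the app would shut down its current store and
        // repeat the initial setup.
        handler(CloudIsAvailable());
    }];
}
```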

Fixing the Delay with Multiple Core Data Stacks

Any solution to this problem must meet one absolute requirement: Delays from iCloud must not prevent normal use of the application, under any circumstances. One way or another, adding the persistent store must be removed from the critical path of getting from initial app launch to the point where the user can do stuff in the app. Of course, iCloud issues may well impact syncing data, because syncing is inherently dependent on the supporting sync infrastructure. But iCloud shouldn't interfere with the creation and use of local data, and shouldn't keep the user sitting there watching a progress indicator hoping that something happens.

One approach (after considering ideas suggested by SharedCoreData, the design of the late, lamented MobileMe, and other details) is to completely separate iCloud from normal app operations. Use iCloud as a transfer mechanism, but don't let it anywhere near the UI or the user.

I approached this by reworking the default iCloud Core Data setup to use two completely independent Core Data stacks, sharing only the underlying data model. The diagram from above ends up looking something like this:

[Diagram: twin-stack]

The stack on the left, in yellow, is a plain old non-iCloud Core Data stack. It doesn't know anything about iCloud. It's used for all application logic. At launch time, it starts up normally regardless of iCloud state.

The stack on the right, in blue, is the iCloud stack. It communicates with iCloud as described above, and is subject to all the delays and issues that come with iCloud. But, it never gets involved with the app logic. Nothing that happens here affects getting the app running or letting the user get to work.
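Setting up the two stacks might look something like the following. Treat it as a minimal sketch: the `TwinStack` class and the CloudStackDidBecomeReady notification name are made up for illustration, and the two store URLs are assumed to point at different files.

```objective-c
#import <CoreData/CoreData.h>

// Hypothetical container for the two independent coordinators.
@interface TwinStack : NSObject
@property (nonatomic, strong) NSPersistentStoreCoordinator *localCoordinator;
@property (nonatomic, strong) NSPersistentStoreCoordinator *cloudCoordinator;
- (instancetype)initWithModel:(NSManagedObjectModel *)model
                localStoreURL:(NSURL *)localStoreURL
                cloudStoreURL:(NSURL *)cloudStoreURL
                 cloudOptions:(NSDictionary *)cloudOptions;
@end

@implementation TwinStack

- (instancetype)initWithModel:(NSManagedObjectModel *)model
                localStoreURL:(NSURL *)localStoreURL
                cloudStoreURL:(NSURL *)cloudStoreURL
                 cloudOptions:(NSDictionary *)cloudOptions
{
    self = [super init];
    if (self) {
        // Local stack: no iCloud options, added synchronously, ready at once.
        _localCoordinator = [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model];
        NSError *error = nil;
        if (![_localCoordinator addPersistentStoreWithType:NSSQLiteStoreType
                                             configuration:nil
                                                       URL:localStoreURL
                                                   options:nil
                                                     error:&error]) {
            NSLog(@"Could not add local store: %@", error);
        }

        // iCloud stack: same model, separate coordinator and store file.
        // Adding the store may block for a long time, so do it off the main thread.
        _cloudCoordinator = [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model];
        NSPersistentStoreCoordinator *cloud = _cloudCoordinator;
        dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
            NSError *cloudError = nil;
            if (![cloud addPersistentStoreWithType:NSSQLiteStoreType
                                     configuration:nil
                                               URL:cloudStoreURL
                                           options:cloudOptions
                                             error:&cloudError]) {
                NSLog(@"Could not add iCloud store: %@", cloudError);
                return;
            }
            dispatch_async(dispatch_get_main_queue(), ^{
                // Made-up notification telling the copy logic it can start.
                [[NSNotificationCenter defaultCenter] postNotificationName:@"CloudStackDidBecomeReady"
                                                                    object:cloud];
            });
        });
    }
    return self;
}

@end
```

The point of the split is that nothing in the UI ever waits on the second `addPersistentStoreWithType:` call.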

The key of course is the mysterious "Custom Copy Logic" block linking the two stacks. It has two responsibilities:

  • Listen for NSPersistentStoreDidImportUbiquitousContentChangesNotification, indicating incoming changes from iCloud. Copy these inbound changes to the local persistent store.
  • Listen for NSManagedObjectContextDidSaveNotification, indicating local changes that need to be sent to iCloud. Copy these outbound changes to the iCloud persistent store.

Getting this to work is not trivial but also far from impossible. Each of these notifications includes a list of inserted, updated, and deleted objects. That provides enough information to update one stack based on changes made on the other. Since the whole reason for this is that iCloud might not be available (either "not yet" or "not at all"), the app shouldn't attempt to immediately transfer changes from one stack to the other. Instead, add the changes to a queue and process the queue at intervals, when iCloud is running. If the user makes changes while iCloud is unavailable, they queue up until it's ready, and are then transferred. The change queue persists from one app run to the next, for cases where the app launches, runs, and quits without iCloud coming up.
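Recording changes into a persistent queue could be sketched as below. The function names, the plist file format, and the dictionary keys are all my own choices for illustration. The sketch stores object ID URIs, and object IDs only mean something within the store they came from, so the copy code needs some other way to match records across the two stacks.

```objective-c
#import <CoreData/CoreData.h>

// Converts the contents of one change set into object ID URI strings.
// Did-save notifications carry managed objects, while the iCloud import
// notification carries object IDs, so accept either.
static NSArray *URIStringsForChangeSet(NSSet *changeSet)
{
    NSMutableArray *uris = [NSMutableArray array];
    for (id item in changeSet) {
        NSManagedObjectID *objectID =
            [item isKindOfClass:[NSManagedObjectID class]] ? item : [item objectID];
        [uris addObject:[[objectID URIRepresentation] absoluteString]];
    }
    return uris;
}

// Hypothetical helper: appends one entry per notification to a plist file at
// queueURL, so pending changes survive from one app run to the next.
static BOOL EnqueueChanges(NSNotification *note, NSURL *queueURL)
{
    NSDictionary *info = [note userInfo];
    NSDictionary *entry = @{
        @"inserted" : URIStringsForChangeSet(info[NSInsertedObjectsKey]),
        @"updated"  : URIStringsForChangeSet(info[NSUpdatedObjectsKey]),
        @"deleted"  : URIStringsForChangeSet(info[NSDeletedObjectsKey])
    };

    // Load the existing queue, or start a new one if the file isn't there yet.
    NSMutableArray *queue = [NSMutableArray arrayWithContentsOfURL:queueURL];
    if (queue == nil) {
        queue = [NSMutableArray array];
    }
    [queue addObject:entry];
    return [queue writeToURL:queueURL atomically:YES];
}
```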

An inbound change queue is not strictly necessary, but it's still a good idea. Removing iCloud from the equation means that the local-only stack is very unlikely to be unavailable. It helps to use the same scheme in both directions though, because then the change-copying code can be completely generic. Give it a source and a destination stack and a queue of changes, and it doesn't need to know which stack is which.
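A direction-agnostic copy routine could be sketched along these lines. It leans on a big assumption that isn't part of the scheme described above: every entity has a string attribute, here called `uniqueID`, with the same value in both stores. It only handles attribute values for inserted and updated objects. Relationships, deletes, and threading are left out.

```objective-c
#import <CoreData/CoreData.h>

// Finds the object in `destination` with the same (hypothetical) uniqueID as
// `source`, or inserts a new one if there is no match.
static NSManagedObject *FindOrCreateCounterpart(NSManagedObject *source,
                                                NSManagedObjectContext *destination)
{
    NSString *entityName = [[source entity] name];
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:entityName];
    request.predicate = [NSPredicate predicateWithFormat:@"uniqueID == %@",
                         [source valueForKey:@"uniqueID"]];
    request.fetchLimit = 1;

    NSManagedObject *match = [[destination executeFetchRequest:request error:NULL] firstObject];
    if (match == nil) {
        match = [NSEntityDescription insertNewObjectForEntityForName:entityName
                                              inManagedObjectContext:destination];
    }
    return match;
}

// Copies attribute values for the objects named by `uriStrings` from `source`
// to `destination`, then saves `destination`.
static BOOL CopyChanges(NSArray *uriStrings,
                        NSManagedObjectContext *source,
                        NSManagedObjectContext *destination,
                        NSError **error)
{
    NSPersistentStoreCoordinator *coordinator = [source persistentStoreCoordinator];
    for (NSString *uriString in uriStrings) {
        NSURL *uri = [NSURL URLWithString:uriString];
        if (uri == nil) continue;

        NSManagedObjectID *objectID = [coordinator managedObjectIDForURIRepresentation:uri];
        if (objectID == nil) continue;

        // Skip objects that no longer exist in the source store.
        NSManagedObject *original = [source existingObjectWithID:objectID error:NULL];
        if (original == nil) continue;

        NSArray *keys = [[[original entity] attributesByName] allKeys];
        NSManagedObject *copy = FindOrCreateCounterpart(original, destination);
        [copy setValuesForKeysWithDictionary:[original dictionaryWithValuesForKeys:keys]];
    }
    return [destination save:error];
}
```

Calling it with the local context as the source pushes changes out, and swapping the two arguments pulls them in. A real implementation would also need to avoid re-queuing the saves that the copy itself performs.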

Having said that, the white block in the middle still implies some rather involved Core Data code. It is however an eminently solvable problem.

So what are the catches?

Besides writing the copy logic? A couple of things:

  • Since the local-only stack is not directly updated from iCloud, extra launch-time work is needed to pick up any new changes available in the cloud. OK, not launch time, but whenever iCloud is ready for action. Getting these changes requires a scan of the entire data store on both sides. Figuring out which instances have been added or deleted is simple enough. Figuring out which have changed is not so obvious. Entities that have modification date fields make this easy. Entities without modification dates make this a pain in the ass. Once identified, the changes are processed with the custom copy block's change queue code.
  • Using this approach means there are two independent persistent stores. That means twice the space is necessary. How significant this is depends on how much data you're likely to have. In effect this approach trades storage space for more reliable operations. I don't like making extra copies of data without good reason, but I've yet to find a more effective way of working around iCloud.
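For entities that do have a modification date, finding what changed since the last sync is a single fetch. In this sketch the attribute name `modificationDate` and the idea of remembering a `lastSync` date are assumptions for illustration.

```objective-c
#import <CoreData/CoreData.h>

// Returns objects of the named entity changed after `lastSync`, or nil on
// error. Assumes a (hypothetical) date attribute named "modificationDate".
static NSArray *ObjectsModifiedSince(NSDate *lastSync,
                                     NSString *entityName,
                                     NSManagedObjectContext *context,
                                     NSError **error)
{
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:entityName];
    request.predicate = [NSPredicate predicateWithFormat:@"modificationDate > %@", lastSync];
    return [context executeFetchRequest:request error:error];
}
```

Run it against both stacks and feed the results into the change queue. Entities without such a field need an attribute-by-attribute comparison instead.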

Next Time: Digging in

In my next iCloud-related post I'll look into how iCloud with Core Data is currently being used by Apple (and maybe others). momdec will be involved.