Saturday, June 29, 2013

Unique objects with Core Data: Find and Create

I don't think it's a secret that I love Core Data. It's my opinion that the people who don't like it or use it are either very stupid or extremely smart.

There is one problem that has plagued me for years: how to ensure that I don't create duplicate objects when I'm performing imports. In reality the problem is fairly simple conceptually: you create a hash table, and then hash the incoming objects to test for their membership in the set. Easy.

But if you believe in DRY, the problem is then making this unique constraint checking code modular and portable across projects and different Core Data models.

The first thing we need to do is identify which property of an NSManagedObject should be unique relative to the rest of the greater set. We can achieve this by adding a class method to the NSManagedObject class:

+ (NSString *)uniqueKeypath {
  return nil;
}


However, in a more advanced model there might be multiple properties that need to be evaluated to determine if an object is unique, so:

+ (NSArray *)uniqueKeypaths {
  return nil;
}


It's important to note that since we are using key paths, we can also evaluate the values of related entities.
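As a rough illustration of how several key paths could be folded into one lookup key, here is a minimal sketch. `UniqueKeyForObject` is a hypothetical helper, not part of the posted implementation, and it assumes the values never contain the ASCII unit separator used to join them. Key-value coding resolves dotted paths, which is why a path like `author.name` reaches into a related object.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): builds one lookup key
// for `object` from several key paths. Key-value coding resolves dotted
// paths such as @"author.name", so values on related objects work too.
static NSString *UniqueKeyForObject(id object, NSArray *keypaths) {
    NSMutableArray *parts = [NSMutableArray arrayWithCapacity:[keypaths count]];
    for (NSString *keypath in keypaths) {
        id value = [object valueForKeyPath:keypath];
        // A missing value becomes an empty string so every key has the
        // same number of parts.
        [parts addObject:(value ? [value description] : @"")];
    }
    // Assumption: values never contain the ASCII unit separator (0x1F),
    // so two different tuples cannot join into the same string.
    return [parts componentsJoinedByString:@"\x1F"];
}
```

Since NSDictionary also supports key paths, the helper can be tried without a Core Data stack, for example with `@{@"title": @"Dune", @"author": @{@"name": @"Herbert"}}` and the key paths `@[@"title", @"author.name"]`.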

Finding a Needle in a Haystack

Now it's time to make the magic happen. We apply the concept: create a hash table, then look up the unique value of each new object in the table. If we get a 'hit', we don't do anything; if we 'miss', we create a new object. The pseudocode is effectively:

FOR EACH object IN newObjects:
  IF NOT object IN allObjects:
    createObject(object)


The above is simplistic, but you get the point. However, you should have noticed that if we do it exactly like this with an extremely large set, it will not only take ages, but we'll run out of memory long before we've begun doing the comparisons.
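For reference, the simple single-threaded loop could be sketched in Objective-C roughly as follows. `ObjectsNeedingCreation` is a hypothetical name, and treating an object with no value at the key path as a miss is an assumption of this sketch rather than something the post specifies.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the members of `newObjects` whose value at
// `keypath` is not already present in `existingValues`.
static NSArray *ObjectsNeedingCreation(NSArray *newObjects,
                                       NSSet *existingValues,
                                       NSString *keypath) {
    NSMutableArray *missing = [NSMutableArray array];
    for (id object in newObjects) {
        id value = [object valueForKeyPath:keypath];
        // Assumption: an object without a value at the key path counts
        // as a miss and is therefore created.
        if (value == nil || ![existingValues containsObject:value]) {
            [missing addObject:object];
        }
    }
    return missing;
}
```

Note that this only compares incoming objects against the existing set; it does not catch two incoming objects that share the same value.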

So, the solution is GCD and batched fetching. We can assume that only one import is happening at any one time, so once we have performed the initial fetch of all the existing objects, we can split the comparison operations for each of the new objects onto a concurrent GCD queue. We also need a place to store the objects that need to be created. Rather than copying them into another collection, we can keep the current collection, gather the indexes of all the new objects, and then create a subset of the superset.

// Compare new hashes against all the known hashes on multiple threads
dispatch_apply([newObjects count], concurrentQueue, ^(size_t idx) {
    // Note that everything that happens here is on a concurrent queue
    id value = [[newObjects objectAtIndex:idx] valueForKeyPath:aKeypath];
    if (![hashes member:value]) {
        // We have to synchronize access to the mutable index set
        [lock lock];                   // Lock the index set
        [uniqueIndexes addIndex:idx];  // Add the unique index
        [lock unlock];                 // Unlock the index set
    }
});


In this case I'm using dispatch_apply (which I personally think is awesome). It submits the block to the concurrent queue once per index and waits until every invocation has finished. Because the blocks run concurrently, it's important that we lock the NSMutableIndexSet to ensure that it doesn't blow up when two indexes are added at the same time. The current implementation with a simple NSLock results in poor performance on the initial import, as every single block will attempt to get the lock so it can add an index. One possible solution is to use a serial dispatch queue to handle adding the indexes, and call it via dispatch_async.
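The serial-queue idea might look something like the sketch below. It assumes ARC, and it assumes the incoming objects are plain values that are safe to read from several threads (parsed dictionaries, say, rather than managed objects). The function name and queue label are made up for illustration.

```objc
#import <Foundation/Foundation.h>
#import <dispatch/dispatch.h>

// Hypothetical helper: returns the indexes of objects in `newObjects` whose
// value at `keypath` is not in `hashes`.
static NSIndexSet *IndexesOfUnknownObjects(NSArray *newObjects,
                                           NSSet *hashes,
                                           NSString *keypath) {
    NSMutableIndexSet *uniqueIndexes = [NSMutableIndexSet indexSet];
    dispatch_queue_t concurrentQueue =
        dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    // All mutation of the index set is funnelled through this serial queue.
    dispatch_queue_t indexQueue =
        dispatch_queue_create("com.example.unique-indexes", DISPATCH_QUEUE_SERIAL);

    dispatch_apply([newObjects count], concurrentQueue, ^(size_t idx) {
        id value = [[newObjects objectAtIndex:idx] valueForKeyPath:keypath];
        if (value == nil || ![hashes containsObject:value]) {
            // Hand the write off instead of blocking on a lock.
            dispatch_async(indexQueue, ^{
                [uniqueIndexes addIndex:idx];
            });
        }
    });

    // dispatch_apply has returned, so every addIndex: block is already
    // enqueued. The serial queue is FIFO, so an empty synchronous block
    // only runs after all of them have finished.
    dispatch_sync(indexQueue, ^{});
    return [uniqueIndexes copy];
}
```

The subset to create can then be pulled out with `[newObjects objectsAtIndexes:indexes]`. Whether this actually beats the NSLock version would need measuring; processing the indexes in larger strides so that each block touches the index set once is another option.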

The next part is to split the work of fetching the objects into smaller batches, so we not only perform smaller units of work but also have a lower memory high-water mark. NSFetchRequest supports batched fetching (its fetchBatchSize property), so this is apparently handled transparently to the rest of the code using a special type of NSArray. However, I haven't tested this to ensure it behaves as I expect.

I've posted an implementation as an abstract subclass of NSManagedObject on GitHub, fork away!
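A minimal sketch of gathering the existing values with a batched fetch request might look like this. `ExistingUniqueValues` is a hypothetical helper, it must be called on the context's own queue or thread, and, like the approach described above, its memory behaviour is untested here. Batching is generally only expected to help with an SQLite-backed store.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: collects the values at `keypath` for every existing
// object of the entity named `entityName`. Returns nil if the fetch fails.
static NSSet *ExistingUniqueValues(NSManagedObjectContext *context,
                                   NSString *entityName,
                                   NSString *keypath,
                                   NSUInteger batchSize,
                                   NSError **error) {
    NSFetchRequest *request =
        [NSFetchRequest fetchRequestWithEntityName:entityName];
    // Ask Core Data to fault objects in `batchSize` at a time as the
    // returned array is walked, rather than loading everything up front.
    request.fetchBatchSize = batchSize;

    NSArray *existing = [context executeFetchRequest:request error:error];
    if (existing == nil) {
        return nil;
    }

    NSMutableSet *values = [NSMutableSet setWithCapacity:[existing count]];
    for (NSManagedObject *object in existing) {
        id value = [object valueForKeyPath:keypath];
        if (value != nil) {
            [values addObject:value];
        }
    }
    return values;
}
```

The resulting set plays the role of `hashes` in the comparison code.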


Tuesday, June 25, 2013

Vida en Venezuela - Parte 1, Colombia a Venezuela

While several services offered buses "directo" to Maracaibo, most of those arrived at night, which is crappy even in the nicest cities/countries, never mind Venezuela.

So I formulated a plan that only really worked because of geography. I would do the long jaunt from Cartagena to Riohacha, bed down there for a night and then set off early the next morning.

From the bus station in Riohacha I got a bus for 8000 COP to Maicao, the major border town. The journey took a little over an hour. As soon as I got off the bus in Maicao I was accosted by several people offering me a service to Maracaibo, wildly pointing to a bunch of beaten-up American gas guzzlers from the 70s & 80s, the "por puestos".

As I would discover later, there is no direct bus service to Maracaibo from Maicao. However, you can reach other destinations deeper in Venezuela from here, namely Caracas, a city that commonly has 20 murders a day.

I jumped in one of these beaten-up Chevys, paid USD$12 or 20000 COP, and prepared myself for the adventure. The taxi stops at the Colombian immigration office and lets out those who want to get their exit stamp. Once at the office there is usually a line that takes anywhere from an hour upwards to navigate. As a result of this constant presence of people, there are plenty of people offering cambio services for both COP and USD, the rate for the latter being better than the former. Having said that, both rates are nowhere near as good as what you would get once you're in the country proper.

After reaching my turn at the head of the queue, things started to go off script. The Colombian immigration official suggested that I go over to the Venezuelan side first and see if they would let me in before stamping me out of the country. Strange, but OK, this is the Bolivarian Republic; strange is just the beginning of normal.

10 minutes later I found the Venezuelan immigration office, complete with a long line of its own. After queuing for around 30 minutes I again reached the head of the line, and shoved my documents through the tidy window and crouched down so the socialist AC could cool my capitalist face.

It took a while for the officer to respond to my documents as he was busy receiving documents with his left hand, and pocketing cash with his right hand. However, after a minute or two he took my passport and began to analyse it. After a brief flick through he threw it back on the counter, coupled with a loud "NO STAMP". Somewhat expected, to be honest, but by this time my patience was wearing thin, and waiting in the Colombian border line again didn't seem like something I wanted to do. So I decided I had 3 choices:

* Go back to Colombia, get an exit stamp and try again.
* Bribe the Venezuelan immigration official.
* Enter Venezuela illegally, and then come back tomorrow and sort this nonsense out.

There was also a fourth option, which was entering illegally, travelling across the country without technically having left Colombia, seeing Angel Falls and then returning across the same border crossing without visiting either nation's immigration service. The problem with this is that I would need to bribe every single official I met, an expensive proposition in a country where I can't use an ATM.

I opted for option 2, the bribe.

Bribing is something I'm fascinated by but have no real understanding of in a practical sense. The thing that I really don't get is how one calculates how much is required to make someone look the other way, especially when you take into consideration that your bribe might be accepted but your wishes not carried out.

After a little deliberation I determined that USD$5 was an acceptable amount of money to lose, while also being (just) enough to encourage someone to look the other way. Now the next problem was how to deliver it. While I knew that the officer could be bought, I didn't know if he wanted to maintain a legit appearance to his colleagues, so simply sliding some money across the counter might annoy not only him, but also the people in the queue behind me who were unable to "splash out" in a similar fashion.

I wrapped the bill around my immigration form, inserted it in my passport and rejoined the queue. 45 minutes later my attempt was rebuffed with a firm NO, and a return of my money.

By this time it was about 2pm, I had been awake since 7am, barely eaten anything and had been standing in the sun for most of the day. I was dog tired.

I headed back to Colombian immigration to get my exit stamp and try again, with slightly more complete papers. Whilst waiting in line for the second time I spotted two gringos. Their pale skin and failure to adhere to the local dress code made them stick out. For the record, I was wearing jeans, shoes, and a t-shirt: the Colombian dress code. An hour later I had my exit stamp in hand, and had met the aforementioned gringos, a French girl and a Canadian boy.

Needless to say, I was rejected again. This time the officer (the same one I had tried to bribe earlier in the day) showed me a sign instructing me to have a notarised invitation or hotel reservation and land/air tickets out of the country. By this time it was around 5pm. For one reason or another the French lady was accepted without any of the aforementioned documents, but the Canadian was also rejected, with the aforementioned documents. Since we were both wallowing in rejection, Vladimir (the Canadian) and I decided to get a taxi back to Maicao, with the intention of starting bright and early the next day.

We were out of the hotel by 8 am the next day and made a beeline for the bus station on the edge of town. Our plan was simple, we would buy a return ticket to Maracaibo and fabricate a hotel reservation in the city to please the immigration officers. We couldn't just book one, as paying for anything in Venezuela with bolivars not exchanged at the black market rate is 400% more expensive.

However, at the bus station we discovered that no company offered service to Maracaibo, only Caracas, the most dangerous city in the world ... So we got a ticket that returned 3 days later for 70000 COP and jumped on a couple of moto taxis for the border.

We stamped out of Colombia again, and then headed for the Bolivarian Republic. After queuing nervously for about 30 minutes we reached the front ... only to be greeted by the same official as yesterday. Seeing as only a handful of foreigners had used that crossing in the last few days, he remembered us both. The guy said no almost immediately, but we insisted in the most broken Spanish ever that we had all the documents. Eventually he told us to move to the side and enter the immigration office using a side entrance.

Once inside we explained to the Jefe (the boss) that we had reservations along with bus tickets. It wasn't going too well until we explained that we also had return tickets to Maicao leaving the day our reservation ended. Suddenly the mood changed: "yep that's great, I'll go stamp them then." And with that we were in.