Jim Driscoll's Blog

Notes on Technology and the Web

Bacon Numbers with Java 8, part 3

leave a comment »

Continuing the work of my previous post, we’re examining the following question:

Given a large hash table whose keys are movie names and whose values 
are a list of actors in those movies, write a function to determine 
the Bacon Number of a particular actor.

We created sample data with Stream.generate, we also created a micro benchmark with functional interfaces, to measure how we were doing. Now, let’s massage the data to get it into a form we can easily search.

As I mentioned in the first post, the main problem here is that the form of the data is completely inappropriate for the kind of search we wish to perform.

Movie 1 --
  -- Actor 1
  -- Actor 2
Movie 2 --
  -- Actor 2
  -- Actor 3

Far better would be to associate the actors with each other, so we could readily relate them.

Actor 1 --
  -- Actor 2
  -- Actor 3
Actor 2 -- 
  -- Actor 1
  -- Actor 3
Actor 3 --
  -- Actor 2

Performing this transformation is a relatively simple task.

Map<Integer, Set<Integer>> actorRelations = new HashMap<>();
// get all the list of actors for each movie
for (List<Integer> movie : set) {
    // for each actor in a movie
    for (Integer actor : movie) {
        // get the list of actors
        Set<Integer> coactors = new HashSet(movie);
        // remove self
        coactors.remove(actor);
        // add to the list of relations
        actorRelations.merge(actor, coactors,
            (existing, update) -> {
                existing.addAll(update);
                    return existing;
        });
    }
}

We’re going to create a Map, with the id of the actor as a key, and containing a Set of all the actors they have shared a movie with. Nothing new or interesting here, except one thing – the Map.merge method, which takes three parameters. The first is the key to operate on, the second, the value we wish to place into the Map. The third is a Function interface, called if the value is already set, and passed the existing value along with the new one. In our case, we wish to merge the two Sets, and we do so in the usual way.

We can do much the same thing with a Stream implementation, and I think it’s rather easier to read:

Map<Integer, Set<Integer>> actorRelations = new HashMap<>();
set.stream().forEach((movie) -> {
        movie.stream().forEach((actor) -> {
            Set<Integer> coactors = new HashSet(movie);
            coactors.remove(actor);
            actorRelations.merge(actor, coactors,
                    (existing, update) -> {
                        existing.addAll(update);
                        return existing;});
        });
    });

This operates slightly faster than the previous implementation, but only by a little.

To go through this — take the Set that contains the movies, and create a Stream. For each element in the Stream (and remember, that element is a List), create another Stream. For each element on that Stream, create a Set of all actors in that film (self excluded) and add that to the Map of relationships, merging if necessary.

Of course, one of the best things that you get when using Stream is the ability to change it to parallel processing with only a couple of extra method calls:

Map<Integer, Set<Integer>> actorRelations = new ConcurrentHashMap<>();
set.parallelStream().unordered().forEach((movie) -> {
    movie.stream().forEach((actor) -> {
        Set<Integer> coactors = new HashSet(movie);
        coactors.remove(actor);
        actorRelations.merge(actor, coactors,
            (existing, update) -> {
                existing.addAll(update);
                return existing;
            });
    });
});

This is the same as above, with two changes – instead of a HashMap, use a ConcurrentHashMap, and just add .parrallel().unsorted() to the Stream code. With that simple change, the clock time for this drops by over half (on an 8 core machine).

Next up, we’ll finish working on this problem by actually running the search.

About these ads

Written by jamesgdriscoll

January 11, 2014 at 3:07 PM

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 413 other followers

%d bloggers like this: