The normal solution to the first is to have an ordering of the lattice points defined by couple of generators picked so that the decision rule which takes you from one lattice point to the next is straightforward; the normal solution to the second is to collect a series of lists of updates to sub-regions of the array smaller than the size of the level-2 cache, with the number of lists being roughly the number of lines in the L1 cache so that adding an entry to a list is generally a cache hit, and then applying the lists of updates one at a time, where each application will be a level-2 cache hit.